“Engineering Asset Lifecycle Management” Proceedings of the 4th World Congress on Engineering Asset Management (WCEAM 2009) 28-30 September 2009
Editors Dimitris Kiritsis, Christos Emmanouilidis, Andy Koronios, and Joseph Mathew
Published by Springer-Verlag London Ltd
ISBN 978-1-84996-002-1
Proceedings of the 4th World Congress on Engineering Asset Management (WCEAM 2009) Ledra Marriott Hotel, Athens 28-30 September 2009
All Rights Reserved Copyright © Springer-Verlag 2010
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the publisher.
PREFACE

The 4th World Congress on Engineering Asset Management, WCEAM 2009, held at the Ledra Marriott Hotel in Athens, Greece from 28 to 30 September 2009, represents a milestone in the history of WCEAM. It is the first to be organised under the auspices of the newly formed International Society of Engineering Asset Management (ISEAM), which will host WCEAM on an annual basis as its forum for the exchange of information on recent advances in this rapidly growing field. WCEAM 2009 was organised with the invaluable support of the newly formed Hellenic Maintenance Society (HMS) in Greece, which acted as the local host.

The inaugural WCEAM, held in July 2006 on the Gold Coast, Australia, was envisioned and initiated by the CRC for Integrated Engineering Asset Management (CIEAM) and organised in conjunction with the International Conference on Maintenance Societies (ICOMS), hosted by the Maintenance Engineering Society of Australia (MESA), and the International Maintenance Systems (IMS) Conference, hosted by the Intelligent Maintenance Systems Centre (IMS Centre), USA. The 2nd WCEAM was co-organised with the Condition Monitoring Conference hosted by the British Institute of Non-Destructive Testing (BINDT) in April 2007 in Harrogate, UK. The third congress in the series was organised in Beijing in October 2008 by a consortium comprising the Division of Mechanical and Vehicle Engineering of the Chinese Academy of Engineering, the China Association of Plant Engineering, the National Science Foundation Industry/University Cooperative Research Center on Intelligent Maintenance Systems (IMS), USA, the Diagnosis and Self Recovery Engineering Research Centre at Beijing University of Chemical Technology, and CIEAM.

The theme of WCEAM 2009 was “Engineering Asset Lifecycle Management – A 2020 Vision for a Sustainable Future”. It fits well with the lessons learnt from the recent financial and economic crises, which impacted severely on the global economy and showed that a sustainable future requires consideration of the various lifecycle aspects of business, industrial and public good activities in their technical, organisational, economic, environmental and societal dimensions. The management of engineering assets over their lifecycle, which includes maintenance at its core, is a crucial element of global business sustainability, and its importance is gradually being recognised by corporate senior management. Our seven distinguished keynote speakers at WCEAM 2009 presented developments in a number of these areas.

ISEAM envisions WCEAM as a global annual forum that promotes the interdisciplinary aspects of Engineering Asset Management (EAM). In view of this vision, WCEAM seeks to promote collaboration between organisations that share similar objectives and where particular matters of common interest are discussed. The program for this year included a number of special sessions organised by the EFNMS, EURENSEAM and the Manufacturing Technology Platform (MTP) on Maintenance for Sustainable Manufacturing (M4SM) of the Intelligent Manufacturing Systems (IMS) international program. In addition, a session on modern maintenance training, along with a dedicated e-training workshop on Maintenance Management, was organised by the EU project “iLearn2Main”.
WCEAM 2009 brought together over 170 leading academics, industry practitioners and research scientists from 29 countries to:
• Advance the body of knowledge in Engineering Asset Management (EAM),
• Strengthen the link between industry, academia and research,
• Promote the development and application of research, and
• Showcase state-of-the-art technologies.
Over 120 scientific and technical presentations reported outputs of research and development activities as well as the application of knowledge in the practical aspects of:
• Advanced Maintenance Strategies (RCM, CBM, RBI)
• Condition monitoring, diagnostics, and prognostics
• Decision support and optimization methods and tools
• Education and training in asset and maintenance management
• E-Maintenance
• Emerging technologies & embedded sensors and devices for EAM
• Human dimensions in integrated asset management
• Intelligent maintenance systems
• Lifecycle & sustainability considerations of physical assets
• Performance Monitoring and Management
• Planning and scheduling in asset and maintenance management
• Policy, Regulations, Practices and Standards for asset management
• Quality of information and knowledge management for EAM
• Risk management in EAM
• Safety, Health and Risk Management in EAM
• Self-maintenance and self-recovery engineering
• Strategic asset management for sustainable business
• Technologies for asset data management, warehousing and mining
• Wireless technologies in EAM
All full papers published in these proceedings have been refereed for technical merit by specialist members of a peer review panel.

We gratefully acknowledge the support of Bayer Technology Services GmbH as Gold Sponsor, the Intelligent Manufacturing Systems (IMS) international organisation as Satchel and Dinner Sponsor, the CRC for Integrated Engineering Asset Management (CIEAM) as Bronze Sponsor of WCEAM 2009, and the ATHENA Research and Innovation Centre as Welcome Reception Sponsor. Function sponsorship was undertaken by Gefyra SA and Atlantic Bulk Carriers Management Ltd, while the SBC business channel, the Supply Chain & Logistics and Plant Management magazines and the supply-chain.gr portal were WCEAM 2009’s publicity sponsors.

We would like to thank the Hellenic Maintenance Society (HMS) and the WCEAM Organising Committee for the enormous effort they have contributed to making this conference a success. Thanks are also due to the members of the WCEAM International Scientific Committee for their efforts both in reviewing papers and in promoting the congress within their networks.

Athens, the “mythical” city of Athena, the ancient Greek goddess of wisdom, and the Parthenon offered WCEAM 2009 participants an ideal environment for knowledge exchange and networking opportunities, which will lead to much new and fruitful collaboration. Thank you for having been a part of WCEAM 2009. We look forward to meeting you at our next event, the 5th
WCEAM in Brisbane, Australia, from 25-27 October 2010.

Dr. Dimitris Kiritsis, Congress Chair
Dr. Christos Emmanouilidis, Co-chair
Professor Joseph Mathew, Co-chair
Congress Chairs
Chair: Dr Dimitris Kiritsis, Ecole Polytechnique Fédérale de Lausanne, Switzerland
Co-Chairs: Dr Christos Emmanouilidis, C.E.T.I/R.C. Athena, Greece; Professor Joseph Mathew, CRC for Integrated Engineering Asset Management (CIEAM), Australia

International Scientific Committee
Adolfo Crespo Marquez, Spain
Ajith Parlikad, UK
Andrew Starr, UK
Andy Koronios, Australia
Andy Tan, Australia
Anthony David Hope, UK
Antonio J Marques Cardoso, Portugal
Ashraf Labib, UK
Basim Al-Najjar, Sweden
Benoit Iung, France
Birlasekaran Sivaswamy, Australia
Bo-Suk Yang, Korea
Brett Kirk, Australia
Brigitte Chebel Morello, France
Bruce Thomas, Australia
Christos Emmanouilidis, Greece
Dimitris Kiritsis, Switzerland
Erkki Jantunen, Finland
Gao Jinji, P R China
George Tagaras, Greece
Hong-Bae Jun, Korea
Ioannis Antoniadis, Greece
Ioannis Minis, Greece
Jay Lee, USA
Jayantha Liyanage, Norway
Jing Gao, Australia
Joe Amadi-Echendu, South Africa
Jong-Ho Shin, Switzerland
Joseph Mathew, Australia
Kari Komonen, Finland
Kerry Brown, Australia
Kondo Adjallah, France
Len Gelman, UK
Lin Ma, Australia
Marco Garetti, Italy
Margot Weijnen, The Netherlands
Ming Zuo, Canada
Mohd Salman Leong, Malaysia
Mohsen Jafari, USA
Nalinakash S Vyas, India
Noureddine Zerhouni, France
Peter Tse, Hong Kong, China
Rhys Jones, Australia
Roger Willett, New Zealand
Seppo Virtanen, Finland
Shozo Takata, Japan
Stanislaw Radkowski, Poland
Tadao Kawai, Japan
Yiannis Bakouros, Greece
WCEAM 2009 Organising Committee
Ian Curry, Hi Events, Australia
Jane Davis, CIEAM, Australia
Katerina Zerdeva, Zita Congress, Greece
Professor Andy Koronios, CIEAM, University of South Australia, Australia
Rhonda Hendicott, Hi Events, Australia
Zacharias Kaplanidis, Zita Congress, Greece
Keynote Speakers

Dr Claudio Boer - Intelligent Manufacturing Systems
“IMS Global Network Support for Maintenance for Manufacturing”
Claudio is Chairman of the International Steering Committee for Intelligent Manufacturing Systems (IMS). Over the past year, he has been working to initiate the Manufacturing Technology Platform (MTP) program, an innovative program designed to ease global collaboration among researchers on new and ongoing research.

Dr Panayotis Papanikolas - GEFYRA S.A.
“Maintenance Management of the Rion-Antirion Bridge”
Panayotis is Vice-Chairman & Managing Director of GEFYRA S.A., the concessionaire company of the Rion-Antirion Bridge. Holding the position of Technical Director during the construction of the longest multi-span cable-stayed bridge in the world, built in an environment that is aggressive in terms of durability, seismicity and wind, he was in charge of developing the inspection, monitoring and maintenance management plan of the bridge. Five years of operation have helped in fine-tuning what is considered today one of the most complete structural asset management plans for bridges.

Heinz Cznotka - Bayer Technology Services GmbH
“Proactive Asset Lifecycle Management”
Dr Heinz Cznotka is Director of the Competence Center Asset Management Consultancy and Senior Risk and Reliability Consultant. He has 20 years’ experience in the oil, gas & petrochemical industry as Project Manager, Reliability Manager, Production Manager, Managing Director Maintenance Services and Technical Risk Manager. Dr Cznotka’s core competences are Risk-based Maintenance (RBM), Reliability-centred Maintenance (RCM), turnaround optimization, lifecycle length and cost optimization, reliability & maintenance engineering and optimization, reliability & maintenance related EHS management, and reliability & maintenance production and in-service inspection.

Richard Edwards - EFNMS/IAM
“Why do we need asset management: trends and issues”
Richard is a member of the Council of the Institute of Asset Management and the current chairman of the Asset Management Technical and Professional Network, a joint venture between the IET and the IAM. He is also a Board member of the IAM. A Director of AMCL for ten years, Richard has extensive experience in the application and assessment of Asset Management.

Eric Luyer - IBM Corporation
“Leveraging Smart Asset Management in Engineering and Product Lifecycle Management”
Eric Luyer has global responsibility for managing Industry Marketing activities for the Industrial Manufacturing sector. With more than 25 years of experience, Eric has developed in-depth expertise in the industry, working in Financial, ERP and Enterprise Asset Management software application environments, initially in Europe and for the last seven years worldwide. He has held senior positions in Sales Management, Indirect Channels, Alliance Partner Management and Industry Marketing, working for software solution providers such as Comshare, Global SSA, Baan/Invensys and MRO Software. Currently Eric is IBM’s worldwide Manager of Product Marketing for Maximo Asset Management, positioning Asset Management in Industrial Manufacturing industries.
Jürgen Potthoff - Bayer Technology Services
“Proactive Asset Lifecycle Management”
Jürgen has 16 years’ experience with globally operating Engineering, Procurement & Construction (EPC) companies, working on projects in Europe, the Americas, Asia, MENA and Australia and servicing the petrochemical, refinery and chemical process industries. He has worked as a Lead Process Engineer and Feasibility Study Leader for world-scale petrochemical plants; as Sales & Project Manager for lump sum turnkey EPC projects, including a 4-year local assignment for the development of the Australian market; in the establishment and management of maintenance management and contracting services based on a risk based maintenance concept; and for 3 years as a Reliability Engineering Manager with a major chemical company, on the development and global implementation of risk based maintenance.
Kristian Steenstrup - Gartner Inc.
“Operational Technology and the relationship to IT Management”
Kristian is research vice president in Gartner’s Australian branch. He conducts and delivers research on Enterprise Business Systems including ERP, EAM, SCM, CRM and c-commerce. Kristian is responsible for vendor analysis in the Asia/Pacific market and is global research leader for Asset Intensive ERP II. Before joining Gartner, Kristian worked in ERP and EAM system design and delivery for over 15 years in a number of global markets. During this period he was directly involved in emerging technologies in Asset Intensive industries such as Utilities, Mining, Rail and Defence.
Panos Zachariadis - Atlantic Bulk Carriers
“Construction, operation and lifecycle cost of ships – Realising that before maintenance comes maintainability”
Panos is Technical Director of Atlantic Bulk Carriers Management Ltd. From 1984 to 1997 he was Marine Superintendent for a New York bulk carrier and oil tanker shipping company. His shipping experience spans diverse areas such as sea service in bulk carriers and oil tankers, supervision of dry dock repairs, new building specifications and supervision, ship operations and chartering. Mr. Zachariadis holds a BSc degree in Mechanical Engineering from Iowa State University and an MSE degree in Naval Architecture and Marine Engineering from the University of Michigan. He is a founding member of the Marine Technical Managers Association (MARTECMA) of Greece.
SPONSORS Gold Sponsor
Satchel & Congress Dinner Sponsor
Welcome Reception Sponsor
Bronze Sponsor
Wednesday Lunch Sponsor
Publicity Sponsors
Exhibitors
Bayer Technology Services GmbH 51368 Leverkusen, Germany E-mail:
[email protected] www.bayertechnology.com
Monday Morning Tea Sponsor
Ledra Marriott Hotel, Athens, Greece
Program The organisers reserve the right to make changes to the program.
Sunday 27 September 2009 1930
Dinner at Horizons Restaurant - Optional dinner at additional cost
Sponsored by Bayer Technology Services
Monday 28 September 2009 0700
Exhibitors and poster presenters mount displays
0830
Conference Registration
0900-0930
Welcome – Opening session
Plenary session sponsored by Bayer Technology Services
Chair: Dimitris Kiritsis
0930-1015
Keynote address 1: Heinz Cznotka & Jürgen Potthoff, Bayer Technology Services GmbH Proactive Asset Lifecycle Management
1015-1100
Keynote address 2: Kristian Steenstrup, Research VP, Gartner Inc Operational technology and its relationship to IT Management
1100-1120
Coffee break
1120-1300
Sessions
1300-1420
Session 1: EURENSEAM - 1 - Strategic Engineering Asset Management
Session 2: Transport, Building and Structural Asset Management
Session 3: Lifecycle & Sustainability Considerations of Physical Assets
Chair: J P Liyanage
Chair: Tony Hope
Chair: Andy Koronios
Ahonen, T., Collaborative development of Maintenance investment management: A case study in pulp and paper industry
Kaphle, M., Tan, ACC., Kim, E. and Thambiratnam, D., Application of acoustic emission technology in monitoring structural integrity of bridges
Frolov, V, Mengel, D, Bandara, W, Sun, Y and Ma, L., Building an ontology and process architecture for engineering asset management
Al-Najjar, B. and Ciganovic, R., A model for more accurate maintenance decisions
Nayak, R., Piyatrapoomi, N. and Weligamage, J., Application of text mining in analysing road crashes for road asset management
Karray, MH, Morello, BC and Zerhouni, N, Towards A Maintenance Semantic Architecture
Gudergan, G., The House of Maintenance - Identifying the potential for improvement in internal maintenance organizations by means of a capability maturity model
Pérez, AA, Vieira, ACV, Marques Cardoso, AJ, School Buildings Assets - Maintenance Management and Organization for Vertical Transportation Equipment
Koronios, A., Steenstrup, C. and Haider, A., Information and Operational Technologies Nexus for Asset Lifecycle Management
Parida, A. and Kumar, U., Integrated strategic asset performance assessment
Phillips, P., Diston, D., Starr, A., Payne, J., and Pandya, S., A review on the optimisation of aircraft maintenance with application to landing gears
Matsokis, A., Kiritsis, D., An advanced method for time treatment in product lifecycle management models
Rosqvist, T., Assessing the subjective added value of value nets: which network strategies are really win-win ?
Piyatrapoomi, N. and Weligamage, J. Risk-based approach for managing road surface friction of road assets
Shin, J-H, Kiritsis, D., Xirouchakis, P., Function performance evaluation and its application for design modification based on product usage data
Lunch
Program Monday 28 September 2009 ... continued 1420-1540
Sessions
Session 4: EURENSEAM - 2 - Strategic Engineering Asset Management
Session 5: Transport, Building and Structural Asset Management
Session 6: Technologies For Asset Data Management, Warehousing & Mining
Chair: Basim Al-Najjar
Chair: Joe Amadi-Echendu
Chair: Michael Purser
Al-Najjar, B., A Computerized model for assessing the return on investment in maintenance
Amadi-Echendu, J.E., Belmonte, H., von Holdt, C., and Bhagwan, J., A case study of condition assessment of water and sanitation infrastructure
Haider, A., Open Source Software Development for Asset Lifecycle Management
González Díaz, V., Fernández, JFG, Crespo Márquez, A., Case study: Warranty costs estimation according to a defined lifetime distribution of deliverables
Nastasie, D., Koronios, A., The role of standard information models in road asset management
Kans, M., Assessing maintenance management IT on the basis of IT maturity
Schuh, G. and Podratz, K., Remote service concepts for intelligent tool-machine systems
Nastasie, D., Koronios, A., The diffusion of standard information models in road asset management: - A study based on the human - technology environment
Mathew, A., Purser, M., Ma, L. and Barlow, M., Open standards-based system integration for asset management decision support
Wijnia, YC and Herder, PM, The state of Asset Management in the Netherlands
Ninikas, G., Athanasopoulos, Th., Marentakis, H., Zeimpekis, V., and Minis, I., Design and implementation of a real-time fleet management system for a courier operator
Peppard, J., Koronios, A. Gao, J., The data quality implications of the servitization - theory building
1540-1600
Coffee break
1600-1645
Keynote address 3: Richard Edwards, European Federation of National Maintenance Societies & Institute of Asset Management
Why do we need asset management: trends and issues
Chair: Christos Emmanouilidis
1650-1810
Sessions
1930-2100
Session 7: EFNMS - Engineering Asset Management in Europe
Session 8: Decision Support & Optimisation Methods & Tools
1700-1900 ISEAM AGM
Chair: Kari Komonen
Chair: Marco Macchi
Chair: Joe Mathew
Benetrix, L., Garnero, MA and Verrier, V., Asset management for fossil-fired power plants: methodology and an example of application
Cenna, AA, Pang, K, Williams, KC, Jones, MG, Micromechanics of wear and its application to predict the service life of pneumatic conveying pipelines
Olsson, C., Labib, A. and Vamvalis, C. CMMS – Investment or Disaster ? Avoid the Pitfalls
Kim, JG, Jang, YS, Jeong, HE, Lim, J and Choi, BK, Flexible coupling numerical analysis method
Ulaga, S., Jakovcic, M. and Frkovic, D., Condition monitoring supported decision processes in maintenance
Godichaud M. , Pérès F. and Tchangani A., Disassembly process planning using Bayesian network
Welcome Cocktail Function - Sponsored by ATHENA Research & Innovation Centre
Program Tuesday 29 September 2009 0830
Delegate Arrival & Registration
0900-0945
Keynote address 4: Claudio Boer, Chairman, Intelligent Manufacturing Systems (IMS)
IMS Global Network Support for Maintenance for Manufacturing
Chair: Marco Garetti
1000-1140
Sessions Session 9: Maintenance for Sustainable Manufacturing - 1
Session 10 - Strategic asset management for sustainable business
Session 11: Advanced Maintenance Strategies (RCM, CBM, RBI)
Chair: Marco Garetti
Chair: Ajith Parlikad
Chair: Lin Ma
Garetti, M. Welcome & Introduction to MTP M4SM Special Session
Furneaux, C.W., Brown, K.A., Tywoniak, S., and Gudmundsson, A., Performance of public private partnerships: an evolutionary perspective
Gorjian, N., Ma, L., Mittinty, M., Yarlagadda, P. and Sun, Y., A review on degradation models in reliability analysis
Pantelopoulos, S. (Industrial talk), Product maintenance in the ‘Internet of things’ world
Labib, A., Maintenance strategies: a systematic approach for selection of the right strategies
Gorjian, N., Ma, L., Mittinty, M., Yarlagadda, P. and Sun, Y., A review on reliability models with covariates
Gómez Fernández J F*, Álvarez Padilla F J, Fumagalli L, González Díaz V, Macchi M, Crespo Márquez A, Condition monitoring for the improvement of data center management orientated to the Green ICT
Rezvani, A., Srinivasan, R., Farhan, F., Parlikad, AK., and Jafari, M., Towards Value Based Asset Maintenance
Muhammad, M., Majid, A.A. and Ibrahim, N.A., A case study of reliability assessment for centrifugal pumps in a petrochemical plant
Liyanage J P, Badurdeen F, Strategies for integrating maintenance for sustainable manufacturing: developing integrated platforms
Yeoh, W., Koronios, A. and Gao, J., Ensuring Successful Business Intelligence Systems Implementation: Multiple Case Studies in Engineering Asset Management Organisations
Sun, Y., Ma, L., Purser, M. and Fidge, C., Optimisation of the reliability based preventive maintenance strategy
Tsutsui M, Takata S, Life Cycle Maintenance Planning System in consideration of operation and maintenance integration
Gao, J., Koronios, A., Kennett, S., Scott, H., Data quality enhanced asset management metadata model
Lee, WB, Moh, L-S, Choi, H-J, Lifecycle Engineering Asset Management
Session 12: Maintenance for Sustainable Manufacturing - 2
Session 13 - Planning, Scheduling & Performance Monitoring
Session 14 - Decision Support & Optimisation Methods & Tools
Chair: Shozo Takata
Chair: Seppo Virtanen
Chair: Colin Hoschke
Cannata A, Karnouskos S, Taisch M, Dynamic e-Maintenance in the era of SOA-ready device dominated industrial environments
Haider, A., Driving innovation through performance evaluation
Chebel-Morello, B., Haouchine, K., Zerhouni, N., Methodology to Conceive A Case Based System Of Industrial Diagnosis
Emmanouilidis, C. and Pistofidis, P. Design requirements for wireless sensor-based novelty detection in machinery condition monitoring
Kim, D., Lim, J-H., Zuo, MJ., Optimal schedules of two periodic preventive maintenance policies and their comparison
Lim, JI, Choi, BG, Kim, HJ, Kim, JG, and Park, CH., Optimum design of vertical pump for avoiding the reed frequency
Kans M, Ingwald A, Analysing IT functionality gaps for maintenance management
Lipia, TF, Zuo, MJ, and Lim, J-H., Optimal Replacement Decision Using Stochastic Filtering Process to Maximize Long Run Average Availability
Mokhtar, AA, Muhammad, M and Majid, MAA, Development of spreadsheet based decision support system for product distributions
Garetti, M - Session Wrap-Up
Seraoui R., Chevalier R. and Provost D., EDF’S plants monitoring through empirical modelling: performance assessment and optimization
Smalko Z, Woropay M, Ja, Z, The diagnostic decision in uncertain circumstances
1140-1200
Coffee break
1200-1310
Sessions
Program Tuesday 29 September 2009 ... continued 1310-1430
Lunch
1430-1610
Sessions Session 15: Maintenance for Sustainable Manufacturing - 3 - Education & Training
Session 16: Advanced Maintenance Strategies
Session 17: Condition Monitoring, Diagnostics & Prognostics
Chair: Jan Franlund
Chair: Stanislaw Radkowski
Chair: Tony Rosqvist
Bakouros, Y., Panagiotidou, S, and Vamvalis, C., Education and Training Needs in Maintenance: How you conduct a selfaudit in Maintenance Management
Bey-Temsamani, A., Engels, M., Motten, A., Vandenplas, S. and Ompusunggu, AP., Condition-based maintenance for OEM’s by application of data mining and prediction techniques
Chen T., Xu XL, Wang, SH, Deng SP. , The Construction and Application of Remote Monitoring and Diagnosis Platform for Large Flue Gas Turbine Unit
Emmanouilidis, C., Labib, A, Franlund, J., Dontsiou, M, Elina, L., Borcos, M., iLearn2Main: an e-learning system for maintenance management training
Gontarz, S. and Radkowski, S., Shape of specimen impact on interaction between earth and eigenmagnetic fields during the tension test
Gu, DS, Kim, BS, Lim, JI, Bae, YC, Lee, WR, Kim, HS, Comparison of vibration analysis with different modeling method of a rotating shaft system
Franlund, J, Training and Certification of Maintenance and Asset Management Professionals
Kiassat, C and Safaei, N., Integrating human reliability analysis into a comprehensive maintenance optimization strategy
Kim, BS, Gu, DS, Kim, JG, Kim, YC and Cho, BK, Rolling element bearing fault detection using acoustic emission signal analyzed by envelope analysis with discrete wavelet transform
Macchi, M., Ierace, S., Education in Industrial Maintenance Management: Feedback From Italian Experience
Mazhar, MI, Salman, M and Howard, I, Assessing the reliability of system modules used in multiple life cycles
Kim, H-E, Tan, ACC, Mathew, J, Kim, EYH, Cho, BK, Prognosis of Bearing Failure Based on Health State Estimation
Starr, A. Bevis, K., The role of education in industrial maintenance: the pathway to a sustainable future
Radkowski S, Gumiński R, Impact of vibroacoustic diagnostics on certainty of reliability assessment
Xu XL, Chen T, Wang SH, Research on Data-Based Nonlinear Fault Prediction Methods in Multi-Transform Domains for Electromechanical Equipment
Maintenance for Sustainable Manufacturing - 4 - M4SM Project kick-off (1st M4SM Meeting)
Session 18: Emerging Technologies in EAM
Session 19: Condition Monitoring, Diagnostics & Prognostics
Chair: Marco Garetti
Chair: Bo-Suk Yang
Chair: Andrew Starr
Espíndola, D., Pereira, CE, Pinho, M., IM:MR - A tool for integration of data from different formats
Jasiński M., Radkowski S., Use of bispectral-based fault detection method in the vibroacoustic diagnosis of the gearbox
Mikail F. Lumentut , Ian M. Howard, Theoretical study of piezoelectric bimorph beams with two input base-motion for power harvesting
Maszak, J., Local meshing plane as a source of diagnostic information for monitoring the evolution of gear faults
Shim, M-C, Yang, B-S, Kong, Y-M, Kim, WC, Wireless condition monitoring system for large vessels
Yang, SW, Widodo, A, Caesarendra, W, Oh, JS, Shim, MC, Kim, SJ, Yang, BS and Lee, WH, Support vector machine and discrete wavelet transform for strip rupture detection based on transient current signal
Smit, JJ, Djairam, D., Zuang, Q., Emerging Technologies and Embedded Intelligence in Future Power Systems
Ierace, S., Garetti, M. and Cristaldi, L., Electric Signature Analysis as a cheap diagnostic and prognostic tool
1610-1630
Coffee break
1630-1750
Sessions
1915
Departure for Gala Dinner
1945
Conference Dinner - Athens Yacht Club, sponsored by IMS
Keynote address 5: Panos Zachariadis, Atlantic Bulk Carriers
Construction, operation and lifecycle cost of ships - Realising that before maintenance comes maintainability
Chair: N. Nassiopoulos
Program Wednesday 30 September 2009 0845
Delegate Arrival
0900-1300
WORKSHOP: INTEGRATION AND INTEROPERABILITY IN ENGINEERING ASSET MANAGEMENT (EAM)
0900-1020
Sessions
Session 20: e-Maintenance
Session 21: Policy, Regulations, Practices & Standards For Asset Management
Session 22: Condition Monitoring, Diagnostics & Prognostics
Chair: Erkki Jantunen
Chair: Ashraf Labib
Chair: Andy Tan
Baglee, D., The Development of a Mobile e-maintenance system utilizing RFID and PDA Technologies
Haider, A., A Roadmap for information technology governance
Kim, EY., Tan, ACC., Mathew, J. and Yang, B-S., Development of an Online Condition Monitoring System for Slow Speed Machinery
Jantunen, E., Gilabert, E., Emmanouilidis, C. and Adgar, A., e-Maintenance: a means to high overall efficiency
Mathew, A., Purser, M. Ma, L. and Mengel, D., Creating an asset registry for railway electrical traction equipment with open standards
Rgeai, M., Gu, F., Ball, A., Elhaj, M., Ghretli, M. Gearbox Fault Detection Using Spectrum Analysis of the Drive Motor Current Signal
Oyadiji, SO, Qi, S. and Shuttleworth, R., Development of Multiple Cantilevered Piezo Fibre Composite Beams Vibration Energy Harvester for Wireless Sensors
Stapelberg, RF, Corporate Executive Development for Integrated Assets Management in a New Global Economy
Zhu, Z., Oyadiji, SO, and Mekid, S., Design and Implementation of a Dynamic Power Management System for Wireless Sensor Nodes
1020-1040
Coffee break
1040-1200
Sessions
1200-1320
Session 23: Technologies For Asset Data Management, Warehousing & Mining
Session 24: Advanced Maintenance Strategies (RCM,CBM,RBI)
Session 25: Condition Monitoring, Diagnostics & Prognostics
Chair: Matt Barlow
Chair: Ioannis Bakouros
Chair: Ioannis Antoniadis
Grossmann, G, Stumptner, M, Mayer, W, and Barlow, M, A Service oriented architecture for data integration in asset management
Bohoris, G.A. and Kostagiolas, P.A., Inferences on nonparametric methods for the estimation of the reliability function with multiply censored data
Mpinos, CA and Karakatsanis, T., Development of a dynamic maintenance system for electric motor’s failure prognosis
Natarajan, K., Li, J. and Koronios, A., Data mining techniques for data cleaning
Gilabert, E., Gorritxategi, E., Conde, E., Garcia, A., Areitioaurtena, O. and Igartua, A., An advanced maintenance system for polygeneration applications
Gryllias, K.C., Yiakopoulos, C. and Antoniadis, I., Automated diagnostic approaches for defective rolling element bearings using minimal training pattern classification methods
Natarajan, K., Li, J. and Koronios, A., Detecting mis-entered values in large data sets
Kostagiolas, P.A., and Bohoris, G.A., Finite sample behaviour of the Hollander-Proschan goodness of fit with reliability and maintenance data
Pang, K., Cenna, AA, Williams, KC, and Jones, MD, Experimental determination of cutting and deformation energy factors for wear prediction of pneumatic conveying pipeline
Apostolids, H. (industrial talk), Crisis in Maintenance and Maintenance in Crisis: Opportunities for Maintenance Re-engineering
Yachiku, H., Inoue, R., and Kawai, T., Diagnostic Support Technology by Fusion of Model and Semantic Network
Lunch - sponsored by GEFYRA SA
Program Wednesday 30 September 2009 ... continued 1320-1520
Sessions Session 26: Workshop for e-training in Maintenance Management (HMS, M4SM, iLearn2Main)
Session 27: Safety, health and risk management in EAM
Session 28: Condition Monitoring, Diagnostics & Prognostics
Chair: Christos Emmanouilidis
Chair: Pantelis Botsaris
Chair: Tadao Kawai
Papathanassiou, N., Emmanouilidis, C., e-Learning in Maintenance Management Training and Competence Assessment: Development and Demonstration
Botsaris, P.N., Naris, A.D., and Gaidajis, G., A Risk Based Inspection (RBI) preventive maintenance programme: a case study
Fumagalli, L., Jantunen, E., Garetti, M. and Macchi, M., Diagnosis for improved maintenance services: Analysis of standards related to Condition Based Maintenance
Maintenance Management: Live & Interactive e-training and e-assessment workshop
Papazoglou, IA, Aneziris, ON, Konstandinidou, M, Bellamy, LJ, Damen, M, Assessing occupational risk for contact with moving parts of machines during maintenance
Widodo, A., and Bo-Suk Yang, Machine prognostics based on survival analysis and support vector machine
Skroubelos, G., Accident causes during repair and maintenance activities and managerial measures effectiveness
Younus, AM, Widodo, A and Yang, B-S., Image Histogram Features Based Thermal Image Retrieval to Pattern Recognition of Machine Condition
Training in Maintenance Management Panel Discussion & Evaluation (Jan Franlund, Ashraf Labib, Andrew Starr, Yiannis Bakouros, Cosmas Vamvalis)
Elforjani, M. Mba, D. Acoustic emissions observed from a naturally degrading slow speed bearing and shaft
A Addali, S Al-lababidi, H Yeung, D Mba, Measurement of gas content in two-phase flow with acoustic emission
1520-1540
Coffee break
1540-1625
Congress Closing Keynote Address 6: Panayiotis Papanikolas, GEFYRA SA Maintenance Management of the Rion-Antirion Bridge
1625-1710
Congress Closing Keynote Address 7: Eric Luyer, IBM Corporation Leveraging Smart Asset Management in Engineering and Product Lifecycle Management Chair: Joe Mathew
1710-1730
Closing Remarks - End of WCEAM 2009
1830
Visit to new Acropolis Museum and dinner at Dionysos Restaurant - optional at additional cost
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
COLLABORATIVE DEVELOPMENT OF MAINTENANCE INVESTMENT MANAGEMENT – A CASE STUDY IN PULP AND PAPER INDUSTRY

Jouko Heikkilä (a), Toni Ahonen (a), Jari Neejärvi (b) and Jari Ala-Nikkola (c)

(a) VTT Technical Research Centre of Finland, P.O. Box 1300, FI-33101 Tampere, Finland; [email protected]
(b) Myllykoski Paper, Myllykoskentie 1, FI-46800 Anjalankoski, Finland; [email protected]
(c) ABB Service, Myllykoskentie 1, FI-46800 Anjalankoski, Finland; [email protected]
The interest in purchasing external maintenance services is increasing as a strategic choice, as manufacturing companies increasingly focus on their core business. In the case presented in this paper, an external service provider has taken responsibility for the maintenance functions as a strategic partner. A development project was started to support the creation of a profound and functional partnership. The focus of the project was to support practical collaboration between the partners. The development concentrated on two activities in which both organisations clearly have important roles: management of maintenance investments and operator involvement in maintenance activities. The idea was to improve collaboration practices through practical, collaborative development of these two key activities. Maintenance investment decision making requires functional collaboration between the organisations: the maintenance organisation is responsible for making investment proposals and the production organisation makes the final decisions. Originally, little collaboration was planned in this process. Furthermore, there was a demand for improving and clarifying the reasoning behind proposals. To meet this demand, a working group comprising representatives of both organisations and researchers developed commonly agreed criteria, a tool and a process for the preparation of maintenance investment proposals. In the developed investment process, the collaboration aspect was taken into account. In this paper, we present the collaborative development process, the resulting maintenance investment management development, and key findings regarding the importance of practical development efforts undertaken together in building trust and collaboration in a business partnership.
Key Words: Maintenance investment, Pulp and paper industry, Decision making, Management
1 INTRODUCTION
Demand for paper products is not expected to grow, especially in Europe. Therefore, investments in new production lines will be very rare, and the role of maintenance investment management will increase in the engineering asset management of ageing production lines. A maintenance investment here typically means the replacement of worn equipment, excluding scheduled maintenance and the repair of failures. While scheduled maintenance and repair activities also include replacements of worn equipment, maintenance investments are typically more expensive and require specific planning and decision-making that is substantially based on cost-effectiveness (which is what makes them investments). Unlike other investments, the main focus in maintenance investments is on restoring the original performance. The payback from a maintenance investment comes mainly from decreasing (corrective) maintenance and unavailability costs.

Outsourcing of services in industry is a common trend that started in the 1960s and has increased significantly in the last 20 years (as summarised by, for example, Hendry (1995), Bailey et al. (2002), and Huai and Cui (2005)). Maintenance is one of the most commonly outsourced services (Bailey et al. 2002, Tarakci et al. 2009). The outsourcing trend is still going strong, even though numerous and severe drawbacks have been reported (Hendry 1995, anon. 2005). In the Finnish paper industry, traditionally nearly all activities at the plant have been carried out by the in-house maintenance organisation. In the competition for market shares, paper companies have sought effectiveness by focusing on their core business. Thus, companies are looking into and establishing the outsourcing of functions which are not seen as their core business. The outsourcing trend started with activities such as canteen and cleaning services, and now maintenance services are
being outsourced. After such a long tradition of in-house operations, outsourcing is likely to raise fears and opposition in the organisation, as well as practical challenges in the partners’ roles, distribution of work and cooperation. Improving the performance of the service, focusing management activities on the core business and reducing costs are the most common motives for outsourcing. However, in many cases there have been difficulties in reaching these goals, because of unexpected management needs and “hidden costs” related to cultural and cooperation aspects of organisations (Hendry 1995, Bailey et al. 2002, anon. 2005, Huai & Cui 2005).

In the case presented in this paper, the maintenance services were outsourced in the beginning of 2007. This started the build-up of a new form of cooperation between a paper company and a maintenance company. To support managerial and practical cooperation, a development project was started. The development project focused on two practical issues in cooperation: 1) the maintenance investment management process and 2) development of the role of operators in maintenance and development. In this paper the first task and its results are presented.

2 HOW THE MAINTENANCE INVESTMENT MANAGEMENT WAS DEVELOPED

The initial state at the beginning of the development project was that the agreement between the paper company and the maintenance company had been signed and the agreed maintenance operations had started. Most of the maintenance staff remained, but some changes were made in maintenance management. Maintenance tasks and responsibilities were defined in the agreement. One of the agreed tasks allocated to the maintenance organisation was to prepare maintenance investment proposals. The proposals are then evaluated and the investment decisions are made by the paper company. A need to improve and clarify the reasoning for investments had already been identified, and the two organisations’ proposal-decision model emphasised this need. Too often the proposed reason for a maintenance investment was only that “it has to be done”, which makes the evaluation of the necessity and profitability of the proposal quite difficult.

Initially it was assumed, especially on the paper company side, that the maintenance company would prepare the maintenance investment proposals independently. Quite soon it became evident that at least some cooperation would be useful or even required. There were two reasons for the need for cooperation: 1) the novelty of the maintenance organisation at the plant and 2) the need to coordinate maintenance investments and development investments. Even though most of the employees in the new maintenance organisation had already been working in the old organisation, the organisation itself and many of its managers were new at the plant. In addition, the existing maintenance and equipment history documentation was not complete enough to cover all the history knowledge of the experienced personnel. An additional challenge was the change of the maintenance information system. Several activities have been undertaken to improve the maintenance documentation in order to make it a more reliable and complete basis for maintenance investment planning. Meanwhile, experienced production and maintenance personnel have an important additional role in producing and preparing information. It is hoped that in the near future the maintenance history data will be more readily usable for maintenance investment planning.
This will reduce the need for cooperation in information collection for this specific purpose in the future.

In some cases there are connections and overlaps between maintenance investments and development investments: the target of a planned maintenance investment may also be part of a planned development investment, or a planned development investment may set new requirements which are worth taking into account in maintenance investments. Since, in this case, the preparation of maintenance investments and development investments has been separated into different organisations, sufficiently frequent communication between these organisations must be specifically planned and agreed. Coordination between development investments and maintenance investments would take place later in the decision-making process anyway, but coordination already during planning saves preparation work and enables better results.

Since improvement of collaboration between the two organisations was the basic objective of the project, a collaborative process was chosen for developing the maintenance investment preparation. The collaborative development process meant that representatives of both organisations together developed and agreed the method to be used, starting from the criteria to be used in investment selection and including all important aspects, such as the information to be collected, the means for information collection, cooperation related to information handling, responsibilities, and annual schedules for the phases of preparation. External consultants from VTT guided the development process and supported it with their knowledge. The development consisted of 11 workgroup meetings during one year and additional development work between the meetings.

The development could have been done as expert work by the consultants or by either of the organisations alone. Possibly, in that case, the developed method would have been more sophisticated and the development might have been more efficient (up to the method document). However, the equality-based collaborative development process aimed to ensure an undisturbed start-up and sustainable use of the new practice, by dispelling suspicions and practically constructing collaboration between the two organisations. The developed method is currently being implemented at the plant, which still involves some learning and finalising of practices. The final success of the work remains to be seen.
3 THE DEVELOPED MAINTENANCE INVESTMENT IDENTIFICATION AND PROPOSAL PROCESS
The development work in this case focused on the first phase of a maintenance investment management process – the phase that covers the tasks from the collection of information to the preparation of investment proposals. The second phase – handling of the proposals and the final decision making – was outside the scope of this case.

The criteria to be used in investment candidate identification, candidate selection and final decision making were carefully selected. From the beginning it was clear to all parties that economic criteria – namely the profitability of an investment – would be dominant in most cases. Production loss and unavailability-related costs in particular would be important criteria, while maintenance costs could be a triggering criterion in some cases. The opportunity to improve system performance (above the original) is never a main criterion for a maintenance investment, but it may be an additional factor supporting the investment. The improvement may be related to production capacity, occupational or environmental safety, or quality. Plans for development investments should always be checked and taken into account when maintenance investments are prepared. The capability to invest may be a restricting factor in some situations; it may be restricted by financial, resource or production reasons. In practice, for a single maintenance investment, the capability to invest is not an important issue, since maintenance investments are carried out within an annual budget. When proposals for maintenance investments are prepared, it should also be checked whether other options would be possible and more profitable. The other options – instead of a maintenance investment – may be an improvement of maintenance procedures or a development investment.

One of the main objectives of the development was to improve the information collection that forms the basis of potential maintenance investment identification. It is known that collaboration and related information exchange are relevant from the integrated perspective of production and maintenance planning. For example, Sloan and Shanthikumar (2000) have studied combined production and maintenance models and concluded that combined models result in significantly greater (25%) reward compared to traditional methods. An important source of information is the maintenance management information system. However, after close examination it appeared that the failure and maintenance history data collected in the system was not complete and detailed enough to be used as the only information source for the identification of potential maintenance investments. One specific drawback was that, even though very extensive data on downtime existed in the production management system, it could not be automatically linked to the failure and maintenance data in the maintenance information system. Thus, sufficiently detailed information on how much production loss each failure caused was not easily available. Possibilities for linking the two information systems are being examined, and management activities have been carried out to improve the maintenance and failure documentation. The quality of maintenance data has already improved during the project.

In addition to the above-mentioned general improvements in maintenance data collection, an annual process was developed (Figure 1), focusing specifically on collecting and analysing maintenance investment related information. At the end of the process, the maintenance investment proposals are prepared.
The process consists of three main phases: 1) identification and selection of potential maintenance investments, 2) profitability assessment based on a cost-benefit analysis, and 3) decision making and proposal finalisation. By making the preparation of maintenance investment proposals a year-long – or actually a continuous – process, a better quality of proposals was sought.
[Figure 1 is not reproduced here. It depicts the annual preparation cycle: continuous tasks (morning meetings, Thursday meetings and root cause analyses, with the supervisor and local service manager responsible) feed the identification of potential maintenance investments on the basis of incidents causing downtime and maintenance costs, together with rough estimates of investment benefits and costs. Periodical tasks are spread over the year and comprise quarterly examinations of downtime information (local service manager), profitability assessment, proposals for major investments, selection of potential minor investments and examination of non-indicated risks (service and production managers), and preparation of investment proposals (service manager).]
Figure 1. Annual schedule of preparation of maintenance investment proposals
The basis of the process is the daily and weekly maintenance and production meetings. In these meetings it is regularly checked whether something has happened which might indicate a need for a maintenance investment. These potential targets for maintenance investments are documented in a formal common inventory (database). The documentation in the inventory should include a note on why the investment should be done (what should be improved) and a coarse cost-benefit evaluation. An additional information source for potential maintenance investments is the root cause analyses, which are carried out for any failure or disturbance that has caused at least two hours of downtime in production.

An increased number of failures and an increased need for maintenance are one kind of indication of a required maintenance investment. However, not all ageing equipment shows such clear symptoms of replacement need. Such equipment requires regular risk assessment as a means of identifying maintenance investment needs. The risk assessment is based on the criticality classification of equipment: the criticality in relation to production and safety has been defined for each device in the plant. In the risk assessment, the main factors to be examined are:
- equipment age compared to the typical age of such equipment
- changes in environment, use or functional requirements for the equipment
- changes in availability of spare parts and services.
Based on these factors, it is evaluated in which year the replacement would most probably be required. The evaluated year of failure relates to the likelihood part of the risk assessment. The consequence aspect of the risk assessment is taken into account by comparing planned and unplanned replacement cases. The difference between the costs of planned and unplanned cases affects the preferable timing of replacement: if the unplanned replacement is much more expensive, the investment in replacement is likely to be profitable even long before the evaluated replacement date. On the other hand, if the difference in costs is small, a risk can be taken by postponing the
replacement even after the evaluated date. On the basis of the risk assessment, potential targets for maintenance investment are added to the potential maintenance investment inventory. Risk assessment is carried out annually. The inventory of potential maintenance investments is examined four times a year. In this examination, the information related to these potential investments is updated and complemented; this quarterly examination serves the follow-up. In the second examination of the year, the cost-benefit analysis is carried out. On the basis of the analysis, budget proposals for major investments are prepared and forwarded to decision making, and other (minor) maintenance investments are selected for proposal finalisation. In the third annual examination, maintenance investment proposals are finalised.
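To make the planned-versus-unplanned comparison concrete, the following short Python sketch illustrates the kind of reasoning described above. It is only an illustration under stated assumptions: the cost figures, the constant annual failure probability and the simple discounting rule are hypothetical and are not the calculation used in the case.

# Illustrative sketch only (hypothetical figures and model, not the case tool).
# Compare the expected cost of replacing a component in different years, given a
# planned replacement cost, a higher unplanned replacement cost, an assumed annual
# failure probability while the old component stays in service, and a discount rate.

def expected_cost(replacement_year, current_year, planned_cost, unplanned_cost,
                  annual_failure_prob, discount_rate):
    years_postponed = replacement_year - current_year
    # Probability that the component fails before the planned replacement date.
    p_fail = 1.0 - (1.0 - annual_failure_prob) ** years_postponed
    # Deferring the planned spend has a time value (simple discounting).
    discounted_planned = planned_cost / (1.0 + discount_rate) ** years_postponed
    return p_fail * unplanned_cost + (1.0 - p_fail) * discounted_planned

if __name__ == "__main__":
    current_year, evaluated_failure_year = 2009, 2014
    planned, unplanned = 100_000.0, 400_000.0  # hypothetical euro figures
    for year in range(current_year, evaluated_failure_year + 1):
        cost = expected_cost(year, current_year, planned, unplanned,
                             annual_failure_prob=0.15, discount_rate=0.08)
        print(year, round(cost))

With a large gap between unplanned and planned costs, the minimum expected cost is reached well before the evaluated failure year, whereas a small gap makes postponement towards that year attractive, which is exactly the trade-off used in the risk assessment.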
4 PROFITABILITY ASSESSMENT BASED ON A COST-BENEFIT ANALYSIS
The practical approach and tool developed to support the profitability assessment in this context was based on an LCP (life cycle profit) calculation model. The developed model is a combination of the practices utilised at the plant earlier, cost-based failure analyses (e.g. Rhee & Ishii, 2003) and LCP principles for a dynamic environment (Ahlmann, 2002). So far the profitability assessments had mostly been made on the basis of the investment payback time, which was, however, found to be too one-dimensional a measure if applied alone. In practice, using payback time as the only criterion does not take into account the profits over the lifecycle but favours investments with a short payback time. Thus, the aim was that the developed approach and related tool should produce information on both the investment decision’s effects on the time periods for which capital is invested (payback time) and the anticipated lifecycle profits of individual investment targets. However, economic fluctuation and other key features of the dynamic business environment can affect which criteria are emphasised.

In capital-intensive industries, production downtime typically generates most of the total costs related to equipment failures. For this reason, the main driver for the implementation of a maintenance investment often comes from the resulting system downtime and related costs, as in this case. However, the more comprehensive list of cost items used in our profitability assessment model is as follows:
- unavailability costs (production downtime)
- maintenance costs
  o failure based maintenance costs
  o preventive maintenance costs
- energy consumption
By evaluating the assumed change in the above-mentioned cost items due to the considered investment, one can calculate the cost effects of the investment. Thus, our practical LCP-based calculation model is a comparative analysis tool for assessing future costs in two different scenarios, where a) no investment is done or b) the investment is done during the next year. In addition, the costs of implementing the investment are taken into consideration in the investment calculations (a minimal illustrative calculation follows the list below). Implementation costs are categorised as follows:
- investment purchase costs
- installation related unavailability costs
- other installation related costs and costs generated by the need to modify surrounding assets
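As noted above, the comparative calculation can be illustrated with a minimal Python sketch. The cost items mirror the lists above, but the figures, the ten-year horizon and the simple undiscounted payback rule are hypothetical assumptions, not the model implemented in the project.

# Minimal illustrative sketch of a comparative LCP-style calculation;
# all figures are hypothetical and not taken from the case.

def annual_cost(unavailability, failure_maint, preventive_maint, energy):
    # Sum of the annual cost items used in the profitability assessment model.
    return unavailability + failure_maint + preventive_maint + energy

def evaluate_investment(cost_no_invest, cost_with_invest, implementation_cost,
                        horizon_years):
    # Compare scenario a) no investment with scenario b) investment next year.
    annual_saving = cost_no_invest - cost_with_invest
    payback_years = (implementation_cost / annual_saving
                     if annual_saving > 0 else float("inf"))
    lifecycle_effect = annual_saving * horizon_years - implementation_cost
    return payback_years, lifecycle_effect

if __name__ == "__main__":
    no_invest = annual_cost(unavailability=180_000, failure_maint=40_000,
                            preventive_maint=15_000, energy=25_000)
    with_invest = annual_cost(unavailability=60_000, failure_maint=10_000,
                              preventive_maint=20_000, energy=22_000)
    # Implementation: purchase + installation downtime + other installation costs.
    implementation = 120_000 + 50_000 + 30_000
    payback, effect = evaluate_investment(no_invest, with_invest,
                                          implementation, horizon_years=10)
    print(f"Payback time: {payback:.1f} years, lifecycle cost effect: {effect:,.0f}")

Reporting both the payback time and the lifecycle effect reflects the point made above: payback time alone favours quick-return investments and hides profits accruing later in the lifecycle.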
The main results our practical investment evaluation tool produces are the payback time of the investment and the investment’s effects on the lifecycle costs of the considered target. Qualitative information on the target is also given to support the decision-making: the description of the target, the identified need for the investment, and additional benefits related to the investment are described qualitatively. This complements the quantitative analysis, which is purely focused on the main drivers of a typical maintenance investment: unavailability, maintenance and energy costs. Thus, the following aspects regarding the potential investment target are analysed:
- environmental and occupational safety
- the target’s capability to meet future demands (e.g. increase of capacity/speed)
- improvement of product quality
- synergy potential in maintenance.
Out of the list of potential investment targets, the most auspicious candidates are chosen for profitability assessment, based on commonly agreed preliminary criteria with qualitative and quantitative aspects. The number of candidates at this phase should be larger than the number of investment targets typically funded within the considered maintenance investment budget. The profitability assessment phase results in a list of maintenance investment proposals for the next year, with both quantitative and qualitative depictions of profitability and investment benefits, as well as other key features that affect decision-making.
5 CONCLUSIONS
In this case study, maintenance investment management was developed through the collaborative development of operations models and practices, rather than as a purely technical method and tool development task. The objective was to develop the best practical means for investment management rather than the theoretically best means. The development was based on wide knowledge of maintenance investment management methodologies. The (theoretically) optimal solutions were modified for local circumstances and requirements, of which the newly started cooperation of two companies was an important one. A practically useful and successful process and tool were striven for; how well this has succeeded remains to be seen.
6 REFERENCES
1. Ahlmann, H.R. (2002) From Traditional Practice to the New Understanding: The Significance of the Life Cycle Profit Concept in the Management of Industrial Enterprises. IFRIMmmm Conference, Växjö, Sweden, 6-7 May 2002.
2. anon. (2005) Calling a Change in the Outsourcing Market: The Realities for the World's Largest Organisations. Deloitte Development LLC.
3. Bailey, W., Masson, R. & Raeside, R. (2002) Outsourcing in Edinburgh and the Lothians. European Journal of Purchasing & Supply Management, 8, 83-95.
4. Hendry, J. (1995) Culture, Community and Networks: The Hidden Cost of Outsourcing. European Management Journal, 13(2), 193-200.
5. Huai, J. & Cui, N. (2005) Maintenance Outsourcing in Electric Power Industry in China. In: Proceedings of the International Conference on Services Systems and Services Management, 13-15 June 2005. IEEE, 2, 1340-1345.
6. Rhee, S.J. & Ishii, K. (2003) Using Cost Based FMEA to Enhance Reliability and Serviceability. Advanced Engineering Informatics, 17(3-4), 179-188.
7. Sloan, T.W. & Shanthikumar, J.G. (2000) Combined production and maintenance scheduling for a multiple-product, single-machine production system. Production and Operations Management, 9(4), Winter 2000.
8. Tarakci, H., Tang, K. & Teyarachakul, S. (2009) Learning effects on maintenance outsourcing. European Journal of Operational Research, 192, 138-150.
A MODEL FOR MORE ACCURATE MAINTENANCE DECISIONS (MMAMDEC)
Basim Al-Najjar a and Renato Ciganovic b
a Terotechnology, School for Technology and Design, Växjö University, Sweden, [email protected]
b Terotechnology, School for Technology and Design, Växjö University, Sweden, [email protected]
When using CM technology to assess the state of a component and plan maintenance actions, it is usual to apply predetermined levels for warnings and replacements. The replacement of a damaged component is usually done at a level lower or higher than the predetermined one, and both cases mean losses. This is because the probability of performing a replacement exactly at the predetermined level is negligibly small. The accuracy of the assessment of the condition of a component has a big technical and economic impact on the output of the machine, the production process and consequently company profitability and competitiveness. Higher accuracy in assessing the condition of a component yields a higher probability of avoiding failures and of planning maintenance actions at low cost. In this paper, techniques for assessing the state of a component using both mechanistic and statistical approaches are considered. The paper also applies the Cumulative Sum (CUSUM) chart for identifying the time of damage initiation and reducing false alarms. Techniques for assessing the probability of failure of a component and its residual life, and for predicting the vibration level at the next planned measuring opportunity or planned stoppage, are introduced, discussed, computerised and tested. The problem addressed is: how is it possible to increase the accuracy of assessing the condition of a component? The major result achieved is the development of a model for more accurate assessment of the condition of a component/equipment through combining different approaches. The main conclusion that can be drawn is that, by applying the model, it is possible to enhance the accuracy of the assessment of the condition of a component/equipment and consequently the maintenance decision, since the integrated model provides comprehensive and relevant information in one platform. Key Words: Integrated Approach for Maintenance Decisions, Effective Maintenance Decision, Prediction of Vibration Level, Probability of Failure, Residual Lifetime, Vibration Monitoring, Total Time on Test.
1 INTRODUCTION
Today, companies try to reduce their production costs to gain a competitive advantage in the market. Maintenance plays a key role in reducing production cost by enhancing availability and extending the life of production assets, Wu et al. (2007). Using condition monitoring (CM) technologies, production security and operating safety will increase because the probability of detecting and treating problems increases, Wang (2002), Al-Najjar (2004) and Wu et al. (2007). Furthermore, it results in much lower operational costs, Xiaodong et al. (2005). According to White (1994) it also improves safety for the workforce because the probability of a sudden breakdown of a machine decreases. On the other hand, lack of maintenance can have a huge negative impact on the surrounding environment through, for example, radioactive radiation, oil leakage, explosions, etc. This is why maintenance can reduce safety hazards for the workforce and make workers feel safer when working with the machines. Furthermore, White (1994) states that in many industries production has increased by around two to ten percent when applying condition-based maintenance (CBM). It is important to keep the machines in good condition, plan maintenance when needed and try to avoid failures and disturbances. Without these enhancements, the company may face difficulties in keeping up and improving the customer service level, production quality, personal safety and company profitability and competitiveness. When using CM technology, it is quite common to use vibration signal analysis for the detection of machine faults, Samanta and Al-Balushi (2003). Detection is made by comparing the vibration signals of a machine operating under normal conditions with those of a machine running under faulty conditions. Noise, randomness and deterioration may cause variation in the vibration level, Al-Najjar (2001) and Samanta and Al-Balushi (2003). Random fluctuation of the vibration measurements may occur due to uncontrolled factors independent of the component's condition. These can obstruct the detection of the component condition, Samanta and Al-Balushi (2003).
When using a CBM system, there are analytical tools for interpreting the signals and assessing the condition of the component, Al-Najjar (2001 and 2003). If the prediction of the vibration level in the near future utilises previous and current measurements and current and future operating conditions, the predicted value may support and enhance the accuracy of maintenance decisions. But even if it provides reliable information about the condition of a component, we do not know with certainty when the component will break, Al-Najjar (1997). Replacing a component on the basis of an unreliable assessment of its condition leads to two types of losses: either a big part of its residual life is lost, or the component fails.
2 CUSUM CHART
Variation in the vibration level can occur due to noise, randomness in the vibration and a wide spectrum of deterioration causes, Al-Najjar (2001) and Samanta and Al-Balushi (2003). The variation in the vibration level may lead to over- (or under-) estimation of the component/equipment condition. The economic losses that arise due to false signals are significant, especially when the downtime cost is high, see Al-Najjar (2003). According to Al-Najjar (1997), the cumulative sum (CUSUM) chart of the condition measurements is a better indicator of a potential failure than the vibration measurement itself when the variation in the vibration level is appreciably large. The graph of CUSUM values shows the behaviour of the cumulative sum of the vibration level with a reduced probability of false alarms. The CUSUM chart is obtained by plotting the cumulative sum of deviations from a target value along the monitoring period with respect to a predetermined level, see Al-Najjar (1997).
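As a generic illustration of the technique, the sketch below computes a one-sided CUSUM of vibration measurements against a reference value k, with a reset at zero. It is a standard textbook variant used here for illustration only, not necessarily the exact formulation implemented in Al-Najjar (1997), and the levels in the example are invented.

```python
import numpy as np

def cusum(measurements, k):
    """One-sided (upper) CUSUM of deviations from the reference value k.

    The statistic accumulates (x_i - k) and is reset to zero whenever it
    becomes negative, so a sustained upward trend produces a rising plot
    while random fluctuation around the normal level stays near zero.
    """
    s, path = 0.0, []
    for x in measurements:
        s = max(0.0, s + (x - k))
        path.append(s)
    return np.array(path)

# Hypothetical example: reference level chosen midway between x0 and xp
x0, xp = 0.8, 1.6                       # mm/s, assumed levels
k = (x0 + xp) / 2
vibration = np.array([0.8, 0.9, 0.85, 1.3, 0.9, 1.1, 1.4, 1.5, 1.7, 1.9])
print(cusum(vibration, k))
```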
3 PREDICTION OF THE VIBRATION LEVEL
Effective usage of Vibration-based Maintenance (VBM) within the TQMain concept for rotating machines provides the user with a long lead time for starting to plan cost-effective maintenance actions to avoid failures. This can be achieved by acting at an early stage in order to keep the probability of failure of a component low until a condition-based replacement is done, Al-Najjar (2000). In Al-Najjar (2001) and Al-Najjar and Alsyouf (2003) a model for predicting the vibration level in the near future is developed and tested, respectively. The tool for predicting the vibration level utilises previous and current measurements and operating conditions to predict the future vibration level. Components' mean effective life can be prolonged if the operating conditions have been assessed properly, Al-Najjar (2000). When assessing the condition of a component, deterministic, probabilistic or combined approaches can be used. The deterministic approach includes, for example, issues related to machine function, failure analysis and diagnostics, while the probabilistic approach includes assessment based on statistical tools, Al-Najjar (2007B). More specifically, this means prediction of the vibration level, assessment of the time to maintenance action and assessment of the probability of failure of the component. According to Al-Najjar (2000), the damage initiation and development phase represents more than half the total usable life of a component/equipment. The past history of a component, its current status and its operating and environmental conditions during the near future will affect the probability of failure, Ibid. During the normal component state, i.e. when no damage is initiated, the CM parameter value, e.g. the vibration level, fluctuates around its mean level xo, Al-Najjar (1997). The level that the vibration approaches during the initiation and development of damage (the potential failure level) is denoted by xp, and the replacement level is denoted by xr. When the damage under development has been confirmed, the CM parameter value, i.e. x(t), is assumed to be a non-decreasing function, whereas the CM parameter level is usually considered stationary during the interval prior to the initiation of damage, see Herraty (1993) and Al-Najjar and Alsyouf (2003). According to Al-Najjar (1997), it is often difficult to tell with certainty that damage development has begun, especially when the number of CM measurements, e.g. vibration, is very small. The model shown in Eq. (1) is used for predicting the vibration level during the period until the time point of the next measurement or a planned stoppage, see Al-Najjar (2001).
Yi+1 = Xi + a * Exp(bi * Ti+1 * Zi^ci) + Ei        (1)
Yi+1 is a dependent variable representing the predicted value of the CM level at the next planned measuring time. The model expressed in Eq. (1) consists of three independent variables (X, Z and T) and three parameters (a, b and c). Ti+1 is the elapsed time since the damage initiation was confirmed. The current CM level is denoted by Xi. Zi is the deterioration factor, which in its turn is a function of the current and anticipated load and the previous deterioration rate (Z = dx′ * Lf/Lc). The parameter a is the gradient (slope) by which the value of the CM parameter has increased since it started to deviate from its normal state xo due to damage initiation until being detected at the potential failure level xp. The parameters bi and ci are the model's non-linear constants. The model error Ei is assumed to be independently and identically normally distributed with zero mean and constant variance, N(0, s). Finally, i, for i = 1, 2, …, n, is the index of the measuring opportunities after damage initiation.
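A minimal sketch of how a model of the form of Eq. (1) could be fitted and used for a one-step-ahead prediction is given below. It estimates only b and c by least squares from the measurements recorded since confirmed damage initiation, with the baseline level, the slope a and the deterioration factors taken as given; this is an illustration of the idea under simplifying assumptions, not the PreVib implementation, and all numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def predict_vibration(t_hist, y_hist, z_hist, x_base, a, t_next, z_next):
    """Estimate b and c of a model of the form of Eq. (1) and predict the
    vibration level at the next planned measuring opportunity.

    t_hist, y_hist, z_hist : elapsed times, vibration levels and deterioration
                             factors recorded since confirmed damage initiation
    x_base                 : baseline CM level used for the fit (a simplification;
                             Eq. (1) uses the current level X_i)
    a                      : slope of CM-parameter growth from x0 towards xp
    t_next, z_next         : elapsed time and deterioration factor at the next
                             planned measurement or stoppage
    """
    def model(tz, b, c):
        t, z = tz
        return x_base + a * np.exp(b * t * z ** c)

    (b, c), _ = curve_fit(model, (t_hist, z_hist), y_hist, p0=(0.1, 1.0), maxfev=20000)
    return model((t_next, z_next), b, c)

# Hypothetical example: five measurements after damage initiation
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.65, 1.68, 1.71, 1.75, 1.80])   # mm/s
z = np.array([1.00, 1.05, 1.10, 1.20, 1.10])
print(predict_vibration(t, y, z, x_base=1.60, a=0.05, t_next=5.0, z_next=1.15))
```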
4 ASSESSMENT OF PROBABILITY OF FAILURE AND RESIDUAL LIFETIME
The second technique uses failure and condition-based data for the same or similar components using the graphical Generalised Total Time on Test plot (GTTT-plot), Al-Najjar (2003). This technique was developed to assess the probability of failure of a component and its residual lifetime on demand. Assessment of the probability of failure and residual lifetime is necessary to effectively enhance the information given to describe the condition of a component, e.g. a bearing, at any time or after each vibration measurement. This is especially important in situations when the value of the CM parameter, e.g. the vibration level, is increasing rapidly or when it is relatively high and there is a risk of faster deterioration during the time until the next measurement, see Al-Najjar (2003). The GTTT-plot can be obtained by plotting Ui = (Ti/Tn) = (Ti/n)/(Tn/n) on the y-axis versus ni/n on the x-axis. Here Ui indicates the proportion of the average exhausted life length of the bearings until their vibration levels approached or exceeded x(i), divided by the average time generated by n components until their levels equalled or exceeded x(n), for x(i) < x(n) and i = 1, …, n. Consequently, ni/n gives an estimate of the probability of failure occurrence, where ni is the number of replacements that have been done until x(i) is exceeded, n represents the total number of bearings under consideration and xr is the replacement vibration level, see Al-Najjar (2003).
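The classical scaled TTT-plot, which the GTTT-plot generalises, can be computed as in the sketch below from a set of ordered lifetimes. This is a generic illustration assuming complete (uncensored) life data; it does not reproduce the condition-based generalisation of Al-Najjar (2003).

```python
import numpy as np

def ttt_plot_points(lifetimes):
    """Coordinates of the scaled Total Time on Test plot.

    Returns (x, u) where x_i = i/n (empirical failure probability) and
    u_i = T_i / T_n with T_i = sum of the i smallest lifetimes + (n - i) * t_(i).
    """
    t = np.sort(np.asarray(lifetimes, dtype=float))
    n = len(t)
    T = np.array([t[:i].sum() + (n - i) * t[i - 1] for i in range(1, n + 1)])
    return np.arange(1, n + 1) / n, T / T[-1]

# Hypothetical bearing lifetimes (e.g. operating hours until xr was reached)
x, u = ttt_plot_points([410, 520, 560, 700, 850, 900])
print(np.round(x, 2))
print(np.round(u, 2))
```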
5 MODEL DEVELOPMENT
Prediction of the vibration level in the near future should not be done before damage initiation has been confirmed, in order to avoid unnecessary work due to false alarms that may arise because of randomness. This is where the CUSUM chart is useful, due to its ability to identify the moment when damage has been initiated, see Al-Najjar (1997). Assessment of the probability of failure of a component and its residual lifetime does not by itself consider the current and future state of the component in operation, which is the reason for combining it with the tool for prediction of the vibration level and the CUSUM chart. The probability of failure and residual life help to enhance the information about the component by considering its age in comparison with historical data about the same or similar components. The CUSUM chart is computerised and its user-interface is shown in Fig. 2. The CUSUM chart has been developed in Microsoft Excel since it is practical software for creating graphs. Thus, Fig. 2 displays the plot area intended for the accumulated sum of the deviations between each vibration measurement and the reference value (k). The x-axis represents the measurement number and the y-axis represents the CUSUM value. In this paper we apply the CUSUM chart technique in the same way as in Al-Najjar (1997). Therefore the reference level (k) is chosen midway between xo and xp. Furthermore, the reference level is controlled against the mean vibration level of the measurements, and the CUSUM chart is only plotted when the accumulated value of the vibration measurements is equal to or less than zero. On the other hand, the plotting of the CUSUM is stopped when the measurements decline below the k level, because the process is then assumed to be in control.

Development of the PreVib, ProFail and ResLife software prototypes: For easy use of the model, and in order to achieve reliable and faster prediction of the vibration level, new software based on Eq. (1), called PreVib, was developed. Also, ProFail & ResLife was developed for assessing the probability of failure of a component and its residual lifetime. The user-interface of PreVib is divided into two halves, Fig. 3: the left half represents input data for the prediction and the right half presents the result in the form of a graph. The input data comprise database and non-database data, shown in the grey and white boxes, respectively. When a prediction is to be made, the segment (machine), the asset (component, e.g. a rolling element bearing) and the location (direction of the vibration measurements) should be specified. Also, we should specify in "Limit measurement" how many measurements to consider if we focus only on part of the measurements; otherwise we should use "None" to consider all the measurements in the prediction. Then the measurements can be downloaded by pressing "Load", which uploads the data from a MIMOSA database. The "Mean vibration level (xo)", "Prediction time period" and "Near future load/present load" should all be specified before clicking "Predict". The desired future "Prediction time period" can be specified in, e.g., seconds, hours, days, weeks, months, etc. When "Predict" is clicked, the predicted vibration level and the associated date are determined and plotted on the diagram on the right hand side of the user-interface. By analogy, the same procedure can be repeated any time the user needs to predict the vibration level, for example after the next measurement of the vibration level.
The right half of the user-interface presents a graph containing two plots: the blue one representing the actual vibration level values and the red one showing the predicted vibration level values. The y-axis is the vibration level in, e.g., mm/sec and the x-axis is the calendar time of measurements and predictions. The graph also contains two dashed lines representing two levels in the life of a mechanical component, i.e. the mean vibration level (xo) and the potential failure level (xp). These levels are determined and set based on data from identical components and are automatically retrieved when the component data is uploaded. Some of the initial measurements belonging to the first phase will not be shown in the graph, even though they have been considered when estimating the model's constants b and c. The first prediction, as is evident, should be done after confirming damage initiation, i.e. when the vibration level exceeds or is close to xp, because as long as it fluctuates around xo no damage has been initiated and there is no need to predict the future level. To be able to predict the future vibration level one needs at least three measurements. This is why we need some of the measurements that belong to the first phase of the component life. Observe that sometimes the first two predictions are special cases due to the use of measurements below the potential failure level to assess the model's parameters b and c. The reason is that we want to predict the future state as soon as the initiation of damage has been confirmed, instead of waiting for at least three measurements to exceed xp; waiting could also cause uncertainty in the maintenance decisions. Also, because the deterioration process is a stochastic process, the vibration level
can change randomly in time. Therefore, some of the level values may be higher than they should be, i.e. exceed xp, in a way that may be irrelevant to the severity of the deterioration. For each prediction, all previous measurements are used for estimating the model's parameters (b and c) and thus the predicted vibration level itself. The ProFail & ResLife software module is supplementary to the PreVib module for enhancing the accuracy of maintenance decisions; the modules can be used jointly or independently. As for PreVib, the ProFail & ResLife user-interface is also divided into two halves, Fig. 4. The left half represents the input data required for the assessment and in the right half the result is shown in the form of a plot (graph). When assessing the probability of failure and residual lifetime of a component, one has to specify the "Segment" (machine), the "Asset" (component, e.g. a rolling element bearing in this case) and the "Assessment time point", i.e. the time at which the assessment should be done, followed by clicking the assess-button. Subsequently the program uploads the table of lifetime data from the MIMOSA database and presents the graph on the right half of the user-interface. The y-axis in the graph shows the proportion of the average exhausted lifetime and the x-axis shows the probability of failure for the analysed component. The "Probability of Failure" of the component in question and its "Residual lifetime" are then assessed and displayed on the left hand side of the user-interface. The values can be shown by using the cross-hair pointer in the graph. If a new assessment is performed after the current assessment time point, the cross-hair pointer will consequently move forward on the curve. Each point on the graph represents one of the component lifetimes in relation to the others. The basic idea of developing a model for more accurate maintenance decisions (MMAMDec) through integrating the CUSUM chart, PreVib, ProFail and ResLife is based on the following: applying the CUSUM chart reduces the probability of false alarms and confirms the initiation of damage, which makes the prediction of the vibration level in the near future more effective, while assessment of the probability of failure of a component and its residual lifetime increases the information underlying the maintenance decision by describing the condition of the component from additional dimensions. Fig. 1 shows how the mentioned modules are integrated and the model's working steps. Vibration measurements should firstly be analysed using the CUSUM chart; the vibration measurements can be exported to a Microsoft Excel file from the original vibration measurement database. When damage initiation is confirmed using the CUSUM chart, i.e. the vibration level has approached xp, PreVib is applied. The software modules PreVib and ProFail & ResLife are MIMOSA compatible. MMAMDec working steps, see also Fig. 1:
1. Use the CUSUM chart to identify when xp is approached.
2. Predict the vibration level in the near future using the PreVib module. From this step you can either go to step 3, in order to enhance the information underlying the maintenance decision through the ProFail & ResLife module, or jump to step 4.
3. Assess the probability of failure of the component and its residual lifetime using the ProFail & ResLife module, for the same dates as the predicted vibration levels or another date (on demand). This helps to enhance the information required to confirm the results of step 2, reduce their significance or reject them.
4. Plan the required maintenance action by means of the information from the previous steps.
Fig.1. MMAMDec for integrating CUSUM, PreVib and ProFail&ResLife
6 MODEL TEST
The model was tested using real industrial data provided by the CNC machine manufacturer Goratu for a specific machine/motor. The company also provided the reference levels for the machine type, i.e. the mean vibration, potential failure and replacement levels. All the data needed for testing the model were retrieved from a MIMOSA database. Furthermore, Goratu provided lifetime data, installation times and replacement/removal times for previous motors and the one under analysis. For the test of the model, the same types of data that the three integrated modules use individually are needed. For the CUSUM chart we need vibration measurements and the corresponding measurement dates, but we also need the different levels, i.e. xo and xp, in order to decide the reference level k. PreVib uses the same type of data and levels, which are retrieved from the MIMOSA database. ProFail & ResLife uses life data. The first step in the model test is to identify when the potential failure level xp is approached. The CUSUM chart is an appropriate tool for detecting a systematic change from a prescribed level. The changes in the mean of the vibration measurements relative to the reference value can be plotted. If the mean of the vibration measurements is equal to the reference value, then the cumulative sum fluctuates around zero. Variations in the vibration level will make the CUSUM chart slope either increase or decline. An upward slope in the CUSUM chart indicates that the accumulated sum of the deviations between the vibration measurement and the reference value has increased, and vice versa for a downward slope. With the help of the CUSUM chart it is also possible to trace back the time point when the change occurred, see Fig. 2.
Fig.2. User-interface of CUSUM chart in Microsoft Excel using real data
For example, by eye-balling the CUSUM chart in Fig. 2 one can see when the CUSUM chart slope changes its direction by observing the corresponding value on the x-axis (Xi), which corresponds to the different measurements. Then it is easy to go back to the CUSUM chart file, trace the date of the Xth measurement and start investigating the reason behind the change. CUSUM can also help us to find out whether it is a false alarm or to confirm damage initiation. A characteristic of the CUSUM chart is that it cannot easily discover a sudden deviation, because usually at least two to three measurements are needed to indicate a deviation. From Fig. 2 we can see that the decision interval is estimated at approximately 0.080 mm/sec, which means that passing this interval can be considered as the initiation of a potential failure. Furthermore, we can see that the vibration measurement that passed the decision interval (potential failure initiation) was around measurement X50. However, prior to X50, around 20 measurements had been registered that were near to or passed the potential failure level. It is by using the CUSUM chart that these false alarms have been discovered and thus reduced. We can also see that the damage development is constantly increasing after the damage initiation point (X50), see Fig. 2. Once the damage initiation has been confirmed through the CUSUM chart, we can start applying the PreVib module for predicting the future vibration level and the ProFail module for assessing the probability of failure and residual lifetime, see Figs. 3 and 4.
Fig.3. User-interface of the software program PreVib using real data
The software module PreVib was used to predict the vibration level for variable calendar time. The application of PreVib is shown in Fig. 3. One can see from Fig. 3 that the predicted values are rather accurate since they are close to the real measurements. By predicting the vibration level, the maintenance engineer will be able to identify the optimum time for the replacement of the component.
Fig.4. User-interface of the software program ProFail and ResLife using real data
Notice that the software module ProFail & ResLife shows that the probability of failure is 100% for the same dates as the predicted vibration values, see Fig. 4. This is because ProFail & ResLife makes an assessment based on historical data for similar
or the same components, whose lifetimes in this particular case have been shorter than the current one, see the life distributions in the left half of Fig. 4. Also, more historical data would be needed in order to rely on the information provided by ProFail & ResLife. In the last step of MMAMDec, Fig. 1, the operator or maintenance engineer can plan the required maintenance action by means of the information given by the CUSUM chart, PreVib and possibly ProFail & ResLife.
7 RESULTS, DISCUSSIONS AND CONCLUSIONS
In this article we have presented an approach for integrating three different tools with the purpose of increasing the accuracy of assessing the condition of a component. By using the CUSUM chart, the probability of false alarms can be reduced. After the damage initiation we started to apply the tool for predicting the future vibration levels so that an optimum time for replacement can be identified. At the same time we applied the tool for assessing the probability of failure and residual lifetime in order to enhance the information available. The main conclusion from this paper is that the developed model has the ability to integrate different types of data in one platform. The model also has a clear and systematic way of application. The model creates a link between these modules so that the decision maker can, for example, predict the vibration level and assess the failure probability and residual lifetime of the component for the same date. In order to make more accurate maintenance decisions, we should also sort out false alarms due to randomness in the vibration signals. In the final step the maintenance engineer can then plan the required maintenance actions with better accuracy.
8 REFERENCES
1 Al-Najjar, B. (1997) Condition-based maintenance: Selection and improvement of a cost-effective vibration-based policy in rolling element bearings. Doctoral thesis, ISSN 0280-722X, ISRN LUTMDN/TMIO—1006—SE, ISBN 91-628-2545X, Lund University, Inst. of Industrial Engineering, Sweden.
2 Al-Najjar, B. (2000) Accuracy, effectiveness and improvement of Vibration-based Maintenance in Paper Mills; Case Studies. Journal of Sound and Vibration, 229(2), 389-410.
3 Al-Najjar, B. (2001) Prediction of the vibration level when monitoring rolling element bearings in paper mill machines. International Journal of COMADEM, 4(2), 19-27.
4 Al-Najjar, B. (2003) Total Time on Test, TTT-plots for condition monitoring of rolling element bearings in paper mills. International Journal of COMADEM, 6(2), 27-32.
5 Al-Najjar, B. (2007A) The Lack of Maintenance and not Maintenance which Costs: A Model to Describe and Quantify the Impact of Vibration-based Maintenance on Company's Business. International Journal of Production Economics IJPPM, 55(8).
6 Al-Najjar, B. (2007B) Establishing and running a condition-based maintenance policy; Applied example of vibration-based maintenance. WCEAM 2007, 106-115, 12-14 June, Harrogate, UK.
7 Bergman, B. (1977) Some graphical methods for maintenance planning. Annual Reliability and Maintainability Symposium, 467-471.
8 Bergman, B. and Klefsjö, B. (1995) Quality from customer needs to customer satisfaction. Studentlitteratur, Lund, Sweden.
9 Herraty, A.G. (1993) Bearing vibration - Failures and diagnosis. Mining Technology, 51-53.
10 Jardine, A.K.S., Joseph, T. and Banjevic, D. (1999) Optimizing condition-based maintenance decisions for equipment subject to vibration monitoring. Journal of Quality in Maintenance Engineering, 5(3), 192-202.
11 Lin, C. and Tseng, H. (2005) A neural network application for reliability modelling and condition-based predictive maintenance. International Journal of Advanced Manufacturing Technology, 25(1), 174-179.
12 Samanta, B. and Al-Balushi, K.R. (2003) Artificial neural network based fault diagnostics of rolling element bearings using time-domain features. Mechanical Systems and Signal Processing, 17(2), 317-328.
13 Xiaodong, Z., Xu, R., Chiman, K., Liang, S.Y., Qiulin, X. and Haynes, L. (2005) An integrated approach to bearing fault diagnostics and prognostics. American Control Conference, 2005. Proceedings of the 2005, 2750-2755.
14 Wang, W. (2002) A model to predict the residual life of rolling element bearings given monitored condition information to date. IMA Journal of Management Mathematics, 13(1), 3-16.
15 Wang, W. and Zhang, W. (2007) An asset residual life prediction model based on expert judgments. European Journal of Operational Research, 2, 496-505.
16 White, Glenn (1996) Maskinvibration, Vibrationsteori och principer för tillståndskontroll. Landskrona: Diatek vibrationsteknik.
17 Wu, S., Gebraeel, N., Lawley, M.A. and Yih, Y. (2007) A Neural Network Integrated Decision Support System for Condition-Based Optimal Predictive Maintenance Policy. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 37(2), 226-236.
Acknowledgement: We would like to thank the EU for the support received within the EU-IP DYNAMITE project; this paper is part of the work done within DYNAMITE.
THE HOUSE OF MAINTENANCE - IDENTIFYING THE POTENTIAL FOR IMPROVEMENT IN INTERNAL MAINTENANCE ORGANISATIONS BY MEANS OF A CAPABILITY MATURITY MODEL
Prof. Dr. Guenther Schuh, Bert Lorenz, Cord-Philipp Winter, Dr. Gerhard Gudergan
Research Institute for Operations Management at RWTH Aachen University (FIR), Pontdriesch 14/16, 52062 Aachen, Germany
In order to guarantee the efficient and effective employment of production equipment, it is essential to identify any possible potential for improving performance, not only in the production process but also in supporting areas such as maintenance. One of the major tasks in increasing maintenance performance consists of systematically identifying the company's most significant weaknesses in the maintenance organisation and thus being able to implement improvements where they are most needed. But how is a company to tackle this important task? To answer this question, this paper describes an assessment and improvement approach based on a capability maturity model (CMM). By means of this approach, the status quo of a maintenance organisation can be analysed and its individual improvement opportunities identified. Key Words: Maintenance Management, Capability Maturity Model, Maintenance Assessment, Improvement Program, House of Maintenance Framework, Maintenance Performance
1 INTRODUCTION
Ever changing market conditions, shorter product life cycles and an increase in competitive pressure create the need for a more effective and efficient deployment of production facilities. Maintenance is a key factor for both efficiency and effectiveness in production [1]. Being an internal service provider, maintenance is essential in ensuring a high degree of equipment availability as well as meeting quality requirements, and thus being able to meet customers' needs without wasting resources. Due to its increasing significance for production, maintenance has become a major cause of cost increases. Nevertheless, it is essential to see maintenance as a strategic factor for success and not a mere cost centre. Also, maintenance departments must be improved continuously in order to enhance their performance. Improvement, however, requires the identification and understanding of unutilised potentials within maintenance. Target-oriented improvements are only possible if the object or organisation and its current situation are measurable and hence rateable. It is, however, particularly difficult to develop a suitable method for objectively measuring and assessing a complex structure like a maintenance organisation. A further prerequisite for an improvement process is that both the starting point (in this case, the status quo of maintenance) and the target strived for are known beforehand. The maturity-level-system theory offers a promising approach for solving such problems. With this in mind, the Research Institute for Operations Management at RWTH Aachen University (FIR) has undertaken a research effort aimed at projecting the maturity-level system onto the management of internal maintenance. For this purpose, the CMM was first assigned to internal maintenance as well as to all other relevant supporting departments. Finally, the assessment framework (the House of Maintenance of FIR) was validated in practice using business cases. The results of the research project are shown below for a practical case. As part of its research, the FIR has developed a diagnostic instrument, the "IH-Check" ("Maintenance-Check"), for recognising strengths and weaknesses in the maintenance departments of mainly small and medium-sized enterprises (SMEs), and for recognising potentials for improvement and methods for utilising them.
Modern management concepts, like Total Productive Maintenance (TPM), Reliability Centred Maintenance (RCM) or Risk Based Inspection (RBI), can help improve maintenance performance [2, 3, 4, 5]. Although these and other similar concepts have been used successfully by large enterprises, their application to SMEs is limited. This is validated by the results of expert surveys conducted among heads of maintenance departments [6].
Figure 1 Acceptance of existing maintenance concepts in operational service (expert survey, N = 56; source: FIR Expert Study "Trends and Development Perspectives in Maintenance", 2004; question: "According to your opinion, which concepts/methods are suitable for improving the maintenance performance of SMEs?"; concepts rated from "suitable" to "not suitable at all": Reliability oriented maintenance, Risk Based Maintenance (RBM), Total Productive Maintenance (TPM), Outsourcing)

It is essentially the first step of such improvement processes, namely the realistic evaluation of one's own strengths and weaknesses, which causes considerable problems within enterprises. Experts believe that the absence of an integrated approach is one of the causes of these problems (see fig. 1). Many existing management concepts in maintenance consist of single isolated solutions which are not part of an integrated improvement process. Additional challenges accompanying the application of modern concepts of maintenance management are described in figure 2.

Figure 2 Causes of the lack of acceptance of existing maintenance concepts in SMEs (expert survey, N = 56; source: FIR Expert Study "Trends and Development Perspectives in Maintenance", 2004; question: "According to your opinion, which are the main problems preventing the application of maintenance concepts in SMEs?"; problems cited: systematic support in identifying potentials for improvement, a systematic and integrated view of maintenance, internal analysis (estimation of potentials in maintenance), insufficient involvement of employees in the process of improvement, insufficient consideration of resources (human as well as financial) of SMEs, setting realistic goals concerning maintenance, consideration of qualitative assessment criteria)
In order to support enterprises in this situation, FIR has developed “IH-Check”, a diagnostic instrument which helps to systematically reveal organisational weaknesses in maintenance.
Figure 3 House of Maintenance (fields of action: Customer; Partnerships; Materials management; Maintenance controlling; Maintenance organisation; Maintenance object; Information and knowledge management; Maintenance policy and strategy; Maintenance staff)

The assessment is based on a framework called the "House of Maintenance" (see fig. 3), which consists of nine fields of action, describing the elements of a typical maintenance organisation on a generic level. These fields are defined by an individual set of nine assessment criteria, each of which contains a set of specific levels of maturity. The levels of maturity are developed according to the CMM [7]. The CMM is a structure of elements that describes certain aspects of maturity in an organization. It aids in defining and understanding processes within an organization and is based on a five-level process maturity continuum.
Figure 4 Elements of the maintenance assessment (analysis of each individual field of action and calculation of an individual maturity level for each criterion; example field of action "Maintenance Controlling" with the criteria Tasks, Data collection, Identification of key figures, Use of key figures, Key figure comparisons, Cost accounting, Budget, Calculation of profitability and Indirect costs; the question "Do you collect maintenance specific data?" is answered on five maturity levels ranging from "No data is collected for maintenance" to "The collected data is complete without redundancy")
Following the assessment, a company's individual maturity profile regarding maintenance management, which determines the company's potential for improvement in maintenance, is developed. In combination with a prioritisation for identifying the crucial fields of action – e.g. by means of a pair-wise comparison – specific measures can be developed to exploit the company's full potential in maintenance. This paper describes, on the one hand, the structure of the House of Maintenance, enlarging on its different fields of action and referring to dedicated maturity levels, and, on the other hand, discusses the procedure of maintenance assessment and the identification of specific measures for improvement in the maintenance organisation within a practical example. Further details under discussion are the House of Maintenance's different fields of action as well as companies' specific levels of maturity.
2 FRAMEWORK: THE "HOUSE OF MAINTENANCE"
The assessment is based on a framework called the “House of Maintenance”, which consists of nine fields of action originating from our vast experience in maintenance organisation. The House of Maintenance is oriented towards current models concerning excellence and maintenance management [2, 8, 9, and 10]. The fields of action describe the elements of a typical maintenance organisation on a generic level. Their significance has been validated in the study among maintenance experts mentioned above. The nine fields of action represent all persons and sections/departments with a significant impact on the Overall Equipment Effectiveness (OEE) – the most important key figure for measuring maintenance performance. Maintenance staff provides the basis for all maintenance activities and is thus a key to a company’s performance whilst both production uptime and production quality depend on the maintenance department’s performance. Requirements for uptime and quality within production are determined by the production department, which therefore has a major impact on the configuration of all other fields of action. The latter have to be managed accordingly.
While developing the House of Maintenance, we considered state-of-the-art developments in maintenance as well as issues specific to SMEs:
• The House of Maintenance has a practical and easy-to-communicate visualisation, readily understood by both management and shop floor workers.
• While maintenance is recognised as an internal service provider, it is essential to evaluate maintenance performance from the customer's viewpoint – the customer being the production department. It is, therefore, necessary to include aspects such as service quality and customer orientation in the House of Maintenance structure.
• A growing degree of linking between maintenance and other internal organisational units, as well as external service providers, has increased the importance of interface-management and IT-support.
• Maintenance policy is supposed to aim at satisfying the needs of production departments in order to ensure efficient and effective production.
3 DEVELOPING STEPS FOR CAPABILITY MATURITY
As is the case in quality management and software engineering, the evaluation of individual criteria is based on a CMM. Such models can be employed to systemise and structure varying processes. They initiate a stable long-term process of optimisation by pointing out the course for future developments. The progress of development is quantified and can be checked at regular intervals. The IH-Check uses a total of five maturity levels, based on the typical CMM levels [7]. The adaptation of the content to in-house maintenance, however, represents a totally new development. Each benchmark is structured according to the House of Maintenance and is based on the five parameters (characteristics) of the CMM, namely "improvisation", "orientation", "commitment", "implementation" and "optimisation" (see fig. 5, top part). The first level of the CMM characterises a chaotic condition, in which improvements in maintenance are introduced sporadically. No consistent understanding of a fully integrated maintenance management exists in the company. The second level is characterised by an awareness of the importance of internal maintenance as an internal service provider and its contribution to a company's value and operational performance. Very frequently, this development is initiated and carried out by individual employees.
Figure 5 CMM for an effective facility-oriented maintenance (modified version based on [7]) and development of the CMM (example: "Orientation") (the five steps along the time axis are 1 Improvisation, 2 Orientation, 3 Commitment, 4 Implementation and 5 Optimisation/continuous improvement)

In the third level, all measures undertaken to improve the performance of maintenance are documented, evaluated, standardised and laid down in the form of operating instructions. The fourth level may be considered to be the stage in which the most important maintenance processes are understood. At this stage, a high level has already been attained. Further improvements can only occur in small steps, demanding large inputs of effort. The fifth and highest level denotes a stage of integrated maintenance management which performs with high efficiency and effectiveness. Up to this moment, the maintenance has been reorganised using effectively coordinated steps. All employees, including external service providers, have meanwhile adopted these aims and regulations and are prepared to accept and to improve them continuously. The five levels of the CMM may each be considered to be a relatively stable condition of the maintenance organisation, based on actual and durable activities and processes. This implies that a maintenance organisation cannot be changed overnight and that no steps or levels may be skipped. The individual steps are consecutive and support each other interactively, so that any single level can be attained only after the requirements of the previous one are fulfilled.
4 PROCEDURE
Assessments are handled in multiple interactively working steps. The first step is to select those fields of action which are relevant and able to significantly improve the performance of internal maintenance. These are then compared and evaluated, thus reflecting the individual situation of the company. This is achieved by using a pair-wise comparison to give a weighting to the nine fields of action as described in the House of Maintenance [11]. In the following step, the decisive fields of action are evaluated using the assigned criteria. The criteria are depicted as open-ended questions based on five standard statements. These statements are interactively related and represent the five levels of the applied CMM. The use of previously formulated statements within the assessment generally prevents subjective evaluations (see fig. 4).
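The weighting step can be illustrated with a small sketch: each field of action is scored against every other one and the row sums are normalised into percentage weights. The 0/1/2 scoring convention and the example numbers are assumptions for illustration; this is a generic pair-wise comparison, not the FIR tool itself.

```python
import numpy as np

# Pair-wise comparison of (here only four) fields of action; entry [i, j]
# scores field i against field j: 0 = less, 1 = equally, 2 = more important.
fields = ["Customer", "Maintenance object", "Maintenance controlling", "Partnerships"]
scores = np.array([
    [0, 1, 2, 2],   # Customer
    [1, 0, 2, 2],   # Maintenance object
    [0, 0, 0, 1],   # Maintenance controlling
    [0, 0, 1, 0],   # Partnerships
])

weights = scores.sum(axis=1) / scores.sum()   # normalise row sums to 100 %
for name, w in zip(fields, weights):
    print(f"{name:25s} {w:5.1%}")
```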
As an intensive evaluation process is very costly and consumes a vast amount of a company's resources, which, especially in SMEs, are not always available in the desired quantities, it was necessary to curtail the number of evaluation criteria for each field of action to a maximum of nine, giving a total of 81 evaluation criteria for the diagnosis and requiring four to six hours for the accomplishment of the assessment. The assessment takes place as a workshop using a questionnaire-based approach and including collective discussions to guarantee a highly objective evaluation process. The questionnaire, as seen in fig. 4, is used for the identification of individual maturity levels for each evaluation criterion. An evaluation of the maturity levels for all relevant fields of action follows implicitly in accordance with the House of Maintenance. In order to consider all relevant views of maintenance as an internal service provider, the employees and the head of the maintenance department, together with employees from production, controlling and purchasing, should be included in the analysis. Hence, an integrated view of maintenance as part of the enterprise as a whole is guaranteed on the one hand. On the other hand, an evaluation including representatives from all divisions within a company ensures a high commitment of the employees involved. At the end of the questionnaire-based survey, all the information is consolidated and the individual maturity levels are calculated. For further analysis, results can be depicted for every field of action and its related criteria. This maturity evaluation is depicted in a radar chart (see fig. 6).
Figure 6 Evaluation of results in the form of radar charts (example: field of action "Maintenance Controlling"; criteria plotted: 1 Tasks, 2 Data collection, 3 Identification of key figures, 4 Use of key figures, 5 Key figure comparison, 6 Cost accounting, 7 Budget, 8 Calculation of profitability, 9 Indirect costs)

In addition, the maturity levels of all nine fields of action are condensed into a single diagram, giving a maturity profile for the company's maintenance as a whole (see fig. 7). On the basis of this profile, it is possible to judge which fields of action should be developed primarily, and which level should be strived for. The aggregated result determined by the IH-Check also delivers a collective maturity score in the form of a percentage (0 to 100 %). It shows the stage that the maintenance organisation of the company has reached, regarding a maintenance organisation aiming at excellent equipment effectiveness. This key performance indicator (KPI) can be effectively applied for internal marketing purposes within the organisation. Using the information derived in the weighting process within the House of Maintenance and in the creation of the enterprise's individual maturity profile, "IH-Check" then prioritises the fields of action, considering both the importance of each field of action to the individual company and its current level of maturity. The prioritisation's results are depicted in a prioritisation matrix shown in figure 11. Within the matrix, fields of action with a high importance as well as a low level of maturity can clearly be identified as most crucial concerning improvement. Based on these insights, measures can be developed to improve maintenance performance and thereby increase the enterprise's operational performance.
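A small sketch of how the aggregated maturity score and the prioritisation might be computed is shown below. The weighting-times-gap prioritisation and all figures are assumptions consistent with the description in the text (fields combining a high weight and a large maturity gap receive a high priority), not necessarily the exact formulas used in the IH-Check.

```python
# Hypothetical per-field results: weight from the pair-wise comparison and
# AS-IS / TO-BE maturity expressed as a share of the maximum level (0..1).
fields = {
    #  field of action                       (weight, as_is, to_be)
    "Customer":                               (0.149, 0.45, 1.00),
    "Maintenance object":                     (0.140, 0.60, 1.00),
    "Information and knowledge management":   (0.122, 0.35, 1.00),
    "Maintenance controlling":                (0.102, 0.25, 1.00),
}

# Collective maturity score: weighted average of the AS-IS maturity levels
total_weight = sum(w for w, _, _ in fields.values())
maturity_score = sum(w * as_is for w, as_is, _ in fields.values()) / total_weight
print(f"Collective maturity score: {maturity_score:.0%}")

# Prioritisation: weight multiplied by the gap between TO-BE and AS-IS maturity
priorities = {name: w * (to_be - as_is) for name, (w, as_is, to_be) in fields.items()}
for name, p in sorted(priorities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:40s} priority {p:.3f}")
```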
Figure 7 Maturity profile regarding a company's maintenance as a whole (example; AS-IS and TO-BE maturity per field of action: 1 Customer, 2 Maintenance policy and strategy, 3 Maintenance organisation, 4 Information and knowledge management, 5 Maintenance controlling, 6 Maintenance object, 7 Materials management, 8 Partnerships, 9 Maintenance staff)
5 PRACTICAL APPLICATION
Practical experience has, in a variety of cases, shown that "IH-Check" supports organisations in conducting a systematic, object-oriented and substantiated diagnosis of their internal maintenance. This applies to a first self-assessment, which a project team can conduct in a short time and which is sufficient to lay the basis for a detailed discussion of problems within individual areas. The evaluation of criteria within a team has proved to be beneficial. Firstly, this encourages an exchange among the different interest groups and ultimately promotes an improved mutual understanding of the different points of view existing within the organisation. Secondly, it is just this team evaluation which is absolutely necessary to ensure the objectivity needed for a realistic determination of the strengths and weaknesses existing at the different locations. It occurs often enough that the highly distorted views which employees have of their own maintenance performance have to be put into proper perspective. In addition, it is already in the course of these discussions that initial valuable contributions for potential improvement are suggested. The presentation of results as radar charts has likewise proved its merit. The results of the assessment, independent of the depth and scope of the survey, can be clearly and easily communicated to and interpreted by both maintenance employees and management.
6 CASE STUDY EXAMPLE
With the objective of optimising maintenance management, a company operating in the gas industry and the FIR have conducted a project for assessing the potential of the maintenance management and deriving the optimal maintenance strategy. An analysis of the actual situation of the company's maintenance management was conducted using the "IH-Check". Both the current relevance of the respective fields of action and the level of maturity within maintenance management were assessed. After defining the relevant fields of action as described in the House of Maintenance, a pair-wise comparison was used for a weighting of these fields of action, specifically adapting the House of Maintenance to the company's individual situation (see fig. 8). "Customer", "Maintenance Object" and "Information and Knowledge Management" were identified as fields of action with a high relevance for the company's maintenance management, which is reflected in their weighting scores of 14.9%, 14.0% and 12.2%.
Materials management 2 Maintenance Partnerships staff 2 1 11 22 Materials management Maintenance staff 0 0 Partnerships 12 Maintenance staff Partnerships
Maintenance staff
0
201 2
222 1
002 1
11
21
2
20
12
0
1 Information 2 2 Maintenance organisation management and knowledge 0 Partnerships 10 1 2 2 Maintenance organisation management 01 0 Materials 1 management Customer Maintenance policy and 01
02
0
2
2
1
1
1
1
0
1
2
2
1
1
1
1
0
2
0
1
1
1 Customer strategy Maintenance 1
Maintenance policy and strategy9,6 % Maintenance organisation 14,0 %
10,2 %
9,6 %
7,2 %
Partnerships
Reset
[Figure residue: interactive screenshot of the pairwise comparison used to weight the fields of action (panel titles "Structural fields", "Pair wise comparison", "Weighting of fields of action"). The fields of action compared are Customer, Maintenance policy and strategy, Maintenance organisation, Information and knowledge management, Maintenance controlling, Maintenance object/asset, Materials management, Partnerships and Maintenance staff; the resulting weights include, for example, 14.9 %, 12.2 %, 12.0 % and 10.2 %.]
Figure 8 Weighting of individual fields of action (example)

Using the CMM approach, an evaluation of specific maturity levels followed for each field of action, revealing the company's status quo regarding maintenance. The evaluation took place in a questionnaire-based workshop including collective discussions as described above (see fig. 9). The status quo in maintenance identified in this way is depicted in figures 6 and 7.
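The weights shown in Figure 8 result from pairwise comparisons of the fields of action. Purely as an illustration of the mechanics, the sketch below derives a weight vector from a pairwise comparison matrix using the simple geometric-mean approximation rather than the exact optimisation method of [11]; the matrix entries and the subset of field names are invented for the example.

```python
import numpy as np

# Fields of action from the House of Maintenance (example subset).
fields = ["Customer", "Maintenance policy and strategy",
          "Maintenance controlling", "Maintenance staff"]

# Hypothetical pairwise comparison matrix: entry [i][j] states how much
# more important field i is than field j (1 = equal, 2 = more important).
A = np.array([
    [1.0, 2.0, 1.0, 2.0],
    [0.5, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 2.0],
    [0.5, 1.0, 0.5, 1.0],
])

# Geometric-mean approximation of the weight vector, normalised to 100 %.
geo_means = A.prod(axis=1) ** (1.0 / A.shape[1])
weights = geo_means / geo_means.sum() * 100

for name, w in sorted(zip(fields, weights), key=lambda x: -x[1]):
    print(f"{name}: {w:.1f} %")
```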
Field of action "maintenance controlling" – criteria/tasks: data collection, identification of key figures, use of key figures, key figure comparisons, cost accounting, budget, calculation of profitability, indirect costs.

Example question: "Do you collect maintenance-specific data?" The answer possibilities correspond to increasing maturity levels on a scale from 0 % to 100 %; in the example shown, the middle (50 %) level is marked:
- No data is collected for maintenance.
- Little data is collected. Figures are only sporadic and cannot be verified. No evaluation of figures.
- Data is collected, but collection is not completely standardised. Evaluations are only carried out irregularly.
- A large amount of data is collected, certain data often more than once. A standard is specified for collection. Evaluations are carried out regularly.
- The collected data is complete without redundancy. It is evaluated in such a way that a forecast of the consequences is possible. All the key figures can be determined without any uncertainty.
Figure 9 Evaluation of specific maturity levels using a CMM-based approach

Considering the weighting within the House of Maintenance as well as the gap between the status quo and the desired level of maturity for each field of action, the company's current state was transformed into a prioritisation matrix (see fig. 10). The prioritisation matrix was used to identify those fields of action which combine a high weighting and a high gap – representing the potential for improvement within the respective field of action – and thus have a high priority concerning the improvement of maintenance performance.
By assessing the weighting of relevance of each field of action against the gap between the actual and achievable level of maturity (potential for improvement), the following fields of action were identified as most crucial in the company's maintenance management:
1. Maintenance Controlling and Performance Management (MC)
2. Maintenance Policy and Strategy (MPS)
3. Customer of Maintenance (C)
Within these fields of action lay the biggest opportunities for improvement. These improvements were essential for building up a maintenance organisation that could provide the maximum value and profit to the whole company and its customers.
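As a rough illustration of the prioritisation step, the sketch below combines a field's weighting with its maturity gap (desired minus actual level) into a single score by taking their product; the weights, maturity levels and the product rule are assumptions made for the example, whereas the assessment described above positions the fields in a matrix rather than computing one number.

```python
# Minimal sketch of the prioritisation step: priority is taken here as the
# product of a field's weighting and its maturity gap. All values are invented.
fields = {
    # name: (weighting in %, actual maturity level, desired maturity level)
    "Maintenance controlling":         (14.0, 1, 4),
    "Maintenance policy and strategy": (12.0, 2, 4),
    "Customer":                        (11.0, 2, 4),
    "Maintenance staff":               (10.0, 3, 4),
}

priorities = {
    name: weight * (desired - actual)
    for name, (weight, actual, desired) in fields.items()
}

for name, prio in sorted(priorities.items(), key=lambda x: -x[1]):
    print(f"{name}: priority score {prio:.1f}")
```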
Legend (fig. 10):
C: Customer
MPS: Maintenance policy and strategy
MO: Maintenance organisation
IKM: Information and knowledge management
MC: Maintenance controlling
MOA: Maintenance objects and assets
MM: Material management
P: Partnership
MS: Maintenance staff
Figure 10 Prioritisation Matrix (example)

Based on the results of the maintenance assessment, the FIR suggested an improvement of maintenance controlling (as well as other measures, which will not be discussed in this paper) by introducing a performance management system (PMS) based on balanced scorecards [12]. In detail, the FIR recommended a two-level PMS for maintenance, consisting of a strategic and an execution level and five perspectives which provide consistent monitoring of the company's maintenance, focussing on Processes, Customer, Finance, Staff and External Services. After its implementation, the PMS would work as a tool for creating transparency and provide the basis for the continuous measurement of the actual maintenance performance in terms of effectiveness and efficiency. The PMS was successfully implemented and increased the maintenance department's performance within the company.
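A minimal sketch of how such a two-level PMS could be represented as a configuration object is given below; the level and perspective names follow the text, while the example indicators are invented placeholders rather than the KPIs actually recommended by the FIR.

```python
from dataclasses import dataclass, field

@dataclass
class Perspective:
    name: str
    indicators: list[str] = field(default_factory=list)  # example KPIs (invented)

@dataclass
class PMSLevel:
    name: str                       # "strategic" or "execution"
    perspectives: list[Perspective] = field(default_factory=list)

# Two-level PMS with the five perspectives named in the text; the indicators
# are illustrative placeholders only.
pms = [
    PMSLevel("strategic", [
        Perspective("Processes", ["overall equipment effectiveness"]),
        Perspective("Customer", ["internal customer satisfaction"]),
        Perspective("Finance", ["maintenance cost ratio"]),
        Perspective("Staff", ["training hours per employee"]),
        Perspective("External Services", ["share of outsourced work"]),
    ]),
    PMSLevel("execution", [
        Perspective("Processes", ["planned vs. unplanned work orders"]),
        Perspective("Customer", ["response time to breakdowns"]),
        Perspective("Finance", ["spare parts stock value"]),
        Perspective("Staff", ["overtime hours"]),
        Perspective("External Services", ["contractor on-time delivery"]),
    ]),
]

for level in pms:
    print(level.name, [p.name for p in level.perspectives])
```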
7 CONCLUSION
The "IH-Check" assessment is a powerful tool, clearly identifying shortcomings in maintenance performance as well as potentials for improvement, and thus enabling the introduction of specific measures that ensure continuous improvement. Periodic application of this instrument ensures a successful derivation and implementation of measures aimed at improving maintenance performance and thereby ensuring effectiveness and efficiency in production. Based on the analysis of the current condition of a company's maintenance, "IH-Check" helps in efficiently introducing measures for improvement. The sequence of reorganisation projects in maintenance, until now more a result of reactions to
external causes, is converted into a process of systematic improvements. Employees are no longer confronted with the introduction of new management concepts, but are now directly involved in measures specifically derived for the company. The problem of measuring and assessing internal maintenance was solved with the introduction of the House of Maintenance framework and the associated assessment tool "IH-Check". The actual status of a company's maintenance can be measured easily, using the status quo analysis described. Applying the maturity-level model, it is also possible to identify salient fields of action and thus derive appropriate measures. Using the "IH-Check", it is possible to convert a mere reorganisation of internal maintenance into a continuous improvement process. The resulting progressive improvements in maintenance contribute to improving the operational performance of the whole enterprise.
8 REFERENCES
1 Schuh G., Berbner J., Lorenz B., Franzkoch B. & Winter C.-P. (2008) Reliability leads to a better performance – results of an international survey in continuous process industries. Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM-IMS 2008), 1366-1374. Springer-Verlag London Ltd, Beijing.
2 Shirose K. (2007) Total Productive Maintenance: New Implementation Program in Fabrication and Assembly Industries. Seventh edition. JIPM-Solutions, Tokyo.
3 Moubray J. (1997) Reliability-Centered Maintenance. Second edition. Industrial Press Inc., New York.
4 McKone K. E., Schroeder R. G. & Cua K. O. (2001) The impact of total productive maintenance practices on manufacturing performance. Journal of Operations Management, Vol. 19, 39-58.
5 Khan F. I., Haddara M. & Krishnasamy L. (2008) A new methodology for risk-based availability analysis. IEEE Transactions on Reliability, Vol. 57, No. 1, 103-112.
6 Schick E. (2004) Unpublished expert study "Trends and Development Perspectives in Maintenance", conducted by the Research Institute for Operations Management at RWTH Aachen University (FIR).
7 Paulk M. C., Weber C. V., Curtis B. & Chrissis M. B. (1994) The Capability Maturity Model: Guidelines for Improving the Software Process (SEI Series in Software Engineering). Addison-Wesley Professional, Boston.
8 Marquez A. C. (2007) The Maintenance Management Framework: Models and Methods for Complex Systems Maintenance (Springer Series in Reliability Engineering). Springer-Verlag, Berlin.
9 Moore R. (2004) Making Common Sense Common Practice: Models for Manufacturing Excellence. Third edition. Butterworth-Heinemann, Oxford.
10 Al-Najjar B. & Alsyouf I. (2000) Improving effectiveness of manufacturing systems using total quality maintenance. Integrated Manufacturing Systems, Vol. 11, No. 4, 267-276.
11 Carrizosa E. & Messine F. (2007) An exact global optimization method for deriving weights from pairwise comparison matrices. Journal of Global Optimization, Vol. 38, No. 2, 237-247.
12 Lorenz B. & Winter C. (2008) Identification of Optimal Maintenance Strategy Mixes for Small and Medium Enterprises (SME). Euromaintenance Papers. Bemas, Brussels.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
INTEGRATED STRATEGIC ASSET PERFORMANCE ASSESSMENT

Aditya Parida and Uday Kumar
Division of Operation and Maintenance Engineering, Luleå University of Technology, Luleå, Sweden
Asset performance assessment forms an integral part of the business process for heavy and capital-intensive industry to ensure performance assurance. Managing asset performance is critical for long-term economic and business viability. Assessing asset performance is a complex issue, as it involves multiple inputs and outputs and various stakeholders' dynamic requirements. Lack of integration among the various stakeholders and their changing requirements in strategic asset performance assessment is still a problem for companies. It is a challenge to integrate a whole organization so that a free flow and transparency of information is possible and each process is linked and integrated to achieve the company's business goals. In this paper, various issues associated with an integrated strategic asset performance assessment are discussed.

Key Words: Asset performance assessment (APA), asset performance indicators (API), employee involvement, maintenance process.

1 INTRODUCTION
The global, dynamic and competitive business scenario, with technological development and change during the last couple of decades as well as the prevailing economic slowdown, has come to dominate the global industrial scene, demanding effective, safe and reliable integrated strategic engineering asset management. Further, outsourcing, the separation of asset owners and asset managers, and complex accountability for asset management make the assessment of asset performance and its continuous control and evaluation more critical. Organizations operating today face several kinds of challenges, such as highly dynamic business environments, complicated intellectual work at all levels of the company, efficient use of information and communication technologies (ICT), and a fast pace of information and knowledge renewal [1]. Thus, under this scenario of technological advancement and global competition, asset owners and managers are striving to monitor, assess and follow up asset performance. Health monitoring of strategic engineering assets is an important issue and challenge for management, as it provides information on plant and system health status to achieve higher productivity with minimum cost and safety with high reliability. The advancement in computer and information technology has a significant impact on asset management information systems for determining asset health status and facilitating timely decision making. Advances in sensor technologies, automated controls and data telemetry have made possible new and innovative methods in asset health monitoring. Rapid growth in networking systems, especially through the internet, has overcome the barriers of distance, allowing real-time data transfer to occur easily between different locations [2]. The corporate strategy of an organization describes how it intends to achieve its mission and objectives and to create value for its stakeholders, such as the shareholders, customers, employees, society, regulating authorities, suppliers and alliance partners. Without a comprehensive description of strategy, executives cannot easily communicate the strategy among themselves or to their employees [3]. Therefore, it is essential that the corporate strategy and objectives of an organization are converted into specific objectives integrating the different hierarchical levels of the organization. Under the challenges of an increasingly changing technological environment, implementing an appropriate performance assessment (PA) system in an organization becomes a necessity, because without an integrated assessment of performance it is difficult to manage and verify the achievement of the desired objectives of an organization. Maintenance of an asset is considered an important support function for management and is perceived as something that "can be planned and controlled" and that "creates additional value" [4]. To know the amount of additional value created, the assessment of asset performance needs to be integrated into the business process. Assessing the asset performance of an organisation is a complex issue due to the multiple inputs and outputs, which are influenced by stakeholders and other sub-processes. Often, the contribution of maintenance to asset performance can only be assessed in terms of the losses incurred due to a lack of
maintenance activities. Moreover, since lack of maintenance has resulted in disasters and accidents with extensive losses, and the legal environment has changed, asset managers are likely to be charged with "corporate killing" for future actions or omissions in maintenance efforts [5]. These societal responsibilities to prevent loss of life and property, besides the high cost of asset maintenance, are compelling management to undertake asset PA as part of the business measurement system. An asset PA system ensures that all operational activities are aligned with the organization's corporate strategies and objectives in a balanced manner. The organization has to satisfy and meet the requirements of both the external and internal stakeholders and identify the performance indicators (PIs) from an integrated and balanced point of view. The purpose of this paper is to discuss various issues associated with integrated strategic asset performance assessment. The structure of the paper is as follows: after this introduction, strategic issues of engineering assets are discussed in section 2. Section 3 deals with the integrated issues in engineering asset performance assessment. A discussion and conclusion is provided in section 4.
2 STRATEGIC ISSUES IN ENGINEERING ASSET
Strategy is concerned with the long-term direction of the firm; it deals with the overall plan for deploying the resources that the firm possesses, entails the willingness to make trade-offs between different directions and between different ways of deploying resources, and aims at unique positioning vis-à-vis competitors, sustainable competitive advantage over rivals and lasting profitability [6]. An organization's strategy indicates how it intends to create value for its stakeholders, such as the shareholders, customers, employees and society, through effective use of its assets. For maximum impact, the measurement system should focus on the organization's strategy and on how it expects to create future and sustainable value [4]. No two organizations develop and follow strategy in the same way: some follow strategy from a financial perspective for revenue and growth; some focus on their customers through services or products; others take a marketing or quality perspective; and still others a human resource perspective. Observing different organizations and the critical analysis of various authors, strategic policies exist around shareholder value, customer satisfaction, process management, quality, innovation, human resources and information technology, amongst others. The engineering asset strategy is formulated from the corporate strategy, considering the integrated and whole life cycle of the asset. An integrated approach is essential, as asset performance management is associated with various stakeholders with conflicting needs and with multiple inputs and outputs. From the asset performance objectives, two sets of activities are undertaken. One set develops the key performance indicators for benchmarking performance against similar industries, and the other formulates the activity plan, implementation, measurement and review, as shown in Figure 1. As shown in the figure, asset performance objectives are formulated as per the stakeholders' requirements and the organization's integrated capability and capacity. In order to achieve the asset performance objectives, critical success factors are identified, from which key result areas of activities are identified. From the key result areas, key performance indicators (KPIs) are developed for measuring and assessing the asset performance. In the other set of activities, activity plans are made, based on which the implementation is carried out. After implementation, measurement and assessment of the asset performance are undertaken, so that feedback and review can be carried out to validate the asset performance objectives.
[Figure residue: flow-chart elements – corporate strategy, asset strategy, asset performance objectives, critical success factors, key result areas, key performance indicators, activity plan, implementation, performance measurement (PM), feedback & review.]
Figure 1 Strategic asset performance measurement process (Adapted from [7])
Companies are using scorecards as a strategic management system to manage their strategy over the long run and to use the measurement focus to accomplish critical management processes [8], such as:
1. Clarify and translate vision and strategy
2. Communicate and link strategic objectives and measures
3. Plan, set targets, and align strategic initiatives
4. Enhance strategic feedback and learning
In an asset management strategy, various industry forces play important roles and need to be considered in the analysis. For new entrants, it is the entry barriers of experience and culture; for suppliers, there may be many service providers; for alternative products, it is better systems and processes; for customers, it is trust and a good relationship; and for the industry, it is the competitors. The importance of strategic aspects for engineering assets cannot be overlooked, especially in the context of the present business scenario. Examples of asset performance objectives could be to achieve a higher OEE level, zero defects (zero quality complaints) and nil accidents. The KPIs translate aggregate measures from the shop-floor level to the strategic level. The real challenge lies in measuring all the KPIs, as some KPIs are difficult to measure, being intangible in nature, and cannot be quantified. Organizations need a framework to align their performance measurement system with the corporate strategic goals of the company by setting objectives and defining key performance indicators at each level [9]. The performance measurement (PM) system, which forms part of the asset performance measurement system, needs to be aligned with the organizational strategy [10]. The PIs need to be considered from the perspective of the multi-hierarchical levels of the organization. As per [11], maintenance management needs to be carried out in both strategic and operational contexts, and the organization is generally structured into three levels. The three hierarchical levels considered by most firms are the strategic or top management level, the tactical or middle management level, and the functional/operational level [12]. Two major requirements of a successful corporate strategy relevant for performance measurement are:
1. Cascading down the objectives from the strategic to the shop-floor level
2. Aggregation of performance measurements from the shop floor to the strategic level.
2.1 Cascading down the objectives from strategic to shop-floor level
The strategic objectives are formulated based on the requirements of the stakeholders, both internal and external. The plant capacity and resources are considered from the long-term objectives and matched. These corporate objectives are cascaded down the hierarchical levels of the organization through the tactical level, which considers tactical issues such as financial and non-financial aspects from both the effectiveness and the efficiency point of view. The bottom level is represented by the functional personnel and includes the shop-floor engineers and operators. The corporate or business objective at the strategic level is communicated down through the levels of the organization and translated into objective measures in a language and meaning appropriate for the tactical or functional level. This cascading down of strategy forms part of the goal deployment of the organization.
2.2 Aggregation of performance measurements from shop floor to strategic level. The performance at the shop floor level is measured and aggregated through the hierarchical levels of the organization to evaluate the achievement of the corporate objectives. The adoption of fair processes is the key to successful alignment of these goals. It helps to harness the energy and creativity of committed managers and employees to drive the desired organizational transformations [13]. This aggregation leads to empowerment of employees in the organization.
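A minimal sketch of such an aggregation is given below; the three-level hierarchy mirrors the strategic, tactical and operational levels described above, but the indicator names, weights and the weighted-average aggregation rule are illustrative assumptions, not taken from the paper.

```python
# Illustrative bottom-up aggregation of performance indicators. Each node is
# either a measured value (leaf, float) or a list of (weight, node) pairs.
operational_availability = 0.92
operational_performance = 0.88
planned_work_ratio = 0.75
schedule_compliance = 0.80

tactical_production = [(0.5, operational_availability),
                       (0.5, operational_performance)]
tactical_maintenance = [(0.6, planned_work_ratio),
                        (0.4, schedule_compliance)]

strategic_asset_performance = [(0.7, tactical_production),
                               (0.3, tactical_maintenance)]

def aggregate(node):
    """Weighted average of a node; leaves are measured values."""
    if isinstance(node, float):
        return node
    return sum(w * aggregate(child) for w, child in node) / sum(w for w, _ in node)

print("strategic asset performance index:",
      round(aggregate(strategic_asset_performance), 3))
```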
3 INTEGRATED ISSUES IN ENGINEERING ASSET PERFORMANCE ASSESSMENT
Observing different organizations and the critical analysis of various authors, strategic policies exist around shareholder value, customer satisfaction, process management, quality, innovation, human resources and information technology, amongst others. Thompson [14] listed eight clusters of organizational competencies, which are linked to integrated strategy content competencies, the need to stay aware, and strategic change competencies. The eight clusters are:
1. strategic awareness abilities
2. stakeholder satisfaction abilities
3. competitive strategic abilities
4. strategic implementation and change abilities
5. competency in quality and customer care
6. functional competencies
7. ability to avoid failure and crises
8. ability to manage ethically and with social responsibility.
Therefore, all successful organizations have to be aware, formulate a winning integrated strategy, and implement and manage it in a dynamic and competitive business environment. Companies are using the PM scorecards as a strategic management system to manage their strategy over the long run and use the measurement focus to accomplish critical management processes [8]:
• Clarify and translate vision and strategy
• Communicate and link strategic objectives and measures
• Plan, set targets, and align strategic initiatives
• Enhance strategic feedback and learning
The integrated issues in engineering asset PA are discussed below [15]:
1. Stakeholders' requirements. The external stakeholders' needs are to be assessed and responded to by matching asset and resource requirements with planning and with the internal stakeholders' capability and capacity; these formulate the corporate objectives and strategy and are translated into targets and goals at the operational level, converting a subjective vision into objective goals. While considering the external stakeholders' needs, the prevailing and future business scenarios are looked into, besides the competitors. Internal stakeholders' needs from the employee, management and organizational culture perspectives are also considered, besides the capacity and capabilities of the asset and other resources.
2. Organizational issues. The asset PM system needs to be aligned with and form an integral part of the corporate strategy. This requires commitment from top management, and all employees need to be aware of the asset PM system through effective communication and training, so that they all speak the same language and are fully involved. The involvement of the employees in the asset PM system at every stage, such as planning, implementation, monitoring and control, and at each hierarchical level, can ensure the success of achieving the asset performance and business strategies. Besides, all functional processes and areas, such as logistics, IT, human resources, marketing and finance, need to be integrated with the engineering assets.
3. Engineering asset requirements. From the stakeholders' needs, the demand analysis of the engineering asset is perceived and designed. After concept development, validation and engineering asset specifications are worked out. Besides, competitive products, cost of maintenance, risk management, correct product design, asset configuration and integration are considered from the strategic and organizational perspective. For operation and maintenance, the engineering asset may be outsourced partially or entirely.
4. How to measure? It is essential to select the right PIs for measuring asset performance from an integrated whole-life-cycle perspective for benchmarking, besides collecting the relevant data and analysing it for appropriate decision making. The asset PM reports developed after the data analysis are used for subsequent preventive and/or predictive decisions. The asset PM needs to be holistic, integrated and balanced [12].
5. Sustainability. Sustainable development is development that is consistent while contributing to a better quality of life for the stakeholders. This concept integrates and balances the social, economic and environmental factors, amongst others.
6. Linking strategy with integrated asset performance assessment criteria. The linkage of integrated Enterprise Asset Management (EAM) measuring criteria with condition monitoring, IT and the hierarchical levels for decision making is given in Figure 2. The figure describes the linkage between the external and internal stakeholders' needs and considers the concept of integrated enterprise asset management from the different hierarchical needs, while linking the performance measurement and assessment from the engineering asset's operational level to decision making at the strategic level. The external effectiveness is highlighted by stakeholders' needs such as return on investment and customer satisfaction. The internal effectiveness is highlighted through the desired organizational performance, reflected by optimized and integrated resource utilization for EAM.
For example, the availability and performance speed of the equipment and machinery form part of the internal effectiveness or back-end process. Quality is the most important aspect, which is related not only to the product quality of the back-end process but also to the customer satisfaction of external effectiveness. From the external stakeholders, the annual production level is decided, considering the customers' requirements, return on investment, and internal plant capacity and availability. From the internal stakeholders, the organization considers departmental integration, employee requirements, organizational climate and skill enhancement. After formulation of the asset PA system, the multi-criteria PIs are placed under the multi-hierarchical levels of the organization.
[Figure residue: block-diagram elements linking external stakeholders' needs, the organization's vision & objectives and the company's internal needs to decision making at the strategic, tactical/managerial and operational levels; business & marketing strategies, production/operational strategy & policy, and production and maintenance planning, scheduling & control are checked with PIs and KPIs; data and information flow from embedded sensors and the IT system into performance indicators for productivity, process, cost, environment, employee satisfaction, growth & innovation, etc., within enterprise asset management (EAM).]
Figure 2. Linkage of integrated Enterprise Asset Management (EAM) measuring criteria with condition monitoring, IT and hierarchical levels for decision making
4 DISCUSSION AND CONCLUSION
An asset cannot be managed without considering the integrated strategic issues for an appropriate PA system. This is because of the various stakeholders' conflicting needs and interests, and the associated multiple inputs and outputs, including the tangible and intangible gains from the asset. For engineering asset PA, it is essential that strategic issues are considered. Under the prevailing dynamic business scenario, asset PA is extensively used by business units and industries to assess progress against the set goals and objectives in a quantifiable way for effectiveness and efficiency. An integrated asset PA provides the required information to management for effective decision making. Research results demonstrate that companies using integrated balanced performance systems perform better than those who do not manage measurements [16]. In this paper the complexities of asset PM have been considered and discussed from the corporate strategy perspective of the organization. Since no two organizations are exactly similar, the asset PM framework and the formulation of PIs from the corporate strategy need to be specific to each organization. The concepts and strategic issues have been discussed for a successful engineering asset PM.
5 REFERENCES
1 Lönnqvist A. (2004) Business Performance Measurement for Knowledge-Intensive Organizations. http://www.pmteam.tut.fi/julkaisut/HK.pdf, visited on 22 August 2004.
2 Toran F., Ramirez D., Casan S., Navarro E. A. & Pelegri J. (2000) Instrumentation and Measurement Technology, Vol. 2 (Ed. IEEE), IMTC, pp. 652-656.
3 Kaplan R. S. & Norton D. P. (2004) Strategy Maps: Converting Intangible Assets into Tangible Outcomes. Harvard Business School Press, USA.
4 Liyanage J. P. & Kumar U. (2003) Towards a value-based view on operations and maintenance performance management. Journal of Quality in Maintenance Engineering, Vol. 9, pp. 333-350.
5 Mather D. (2005) An introduction to the maintenance scorecard. The Plant Maintenance Newsletter, edition 52, 13 April 2005.
6 Jelassi T. & Enders A. (2005) Strategies for e-Business. Prentice Hall, Essex, London.
7 Parida A., Ahren T. & Kumar U. (2003) Integrating maintenance performance with corporate balanced scorecard. COMADEM 2003, Proceedings of the 16th International Congress, 27-29 August 2003, Växjö, Sweden, pp. 53-59.
8 Kaplan R. S. & Norton D. P. (1996) The Balanced Scorecard: Translating Strategy into Action. Harvard Business School Press, pp. 322.
9 Kutucuoglu K. Y., Hamali J., Irani Z. & Sharp J. M. (2001) A framework for managing maintenance using performance measurement systems. International Journal of Operations and Production Management, Vol. 21, No. 1/2, pp. 173-194.
10 Eccles R. G. (1991) The performance measurement manifesto. Harvard Business Review, January-February, pp. 131-137.
11 Murthy D. N. P., Atrens A. & Eccleston J. A. (2002) Strategic maintenance management. Journal of Quality in Maintenance Engineering, Vol. 8, No. 4, pp. 287-305.
12 Parida A. & Chattopadhyay G. (2007) Development of a multi-criteria hierarchical framework for maintenance performance measurement (MPM). Journal of Quality in Maintenance Engineering, Vol. 13, No. 3, pp. 241-258.
13 Tsang A. H. C. (1998) A strategic approach to managing maintenance performance. Journal of Quality in Maintenance Engineering, Vol. 4, No. 2, pp. 87-94.
14 Thompson J. L. (1997) Lead with Vision: Manage the Strategic Challenge. International Thomson Business Press, London.
15 Parida A. (2006) Development of Multi-criteria Hierarchical Framework for Maintenance Performance Measurement: Concepts, Issues and Challenges. Doctoral thesis, Luleå University of Technology, Sweden. http://epubl.ltu.se/14021544/2006/37/index.html
16 Lingle J. H. & Schiemann W. A. (1996) From balanced scorecard to strategy gauge: Is measurement worth it? Management Review, March, pp. 56-62.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ASSESSING THE SUBJECTIVE ADDED VALUE OF VALUE NETS: WHICH NETWORK STRATEGIES ARE REALLY WIN-WIN?

Tony Rosqvist a, Toni Ahonen b, Ville Ojanen c, Arto Marttinen d

a VTT Technical Research Centre of Finland, PO Box 1000, FI-02044 VTT, Espoo
b VTT Technical Research Centre of Finland, PO Box 1300, FI-33101 Tampere, Finland
c Lappeenranta University of Technology, PO Box 20, FI-53851 Lappeenranta, Finland
d Metso Automation Inc, PO Box 310, FI-00811 Helsinki, Finland
As manufacturing companies increasingly focus on their core business, interest in the utilisation of external services provided by system suppliers and service companies increases. Currently an increasing number of services are purchased from service supply networks. Furthermore, globalisation, the complexity of technological innovations and the demand for integrated solutions also create a need for networking and collaboration. Establishing or improving the performance of the networked service providers, the value net, is a long-term effort requiring the build-up of trust between the partners. The necessary condition for moving from a subcontractor relationship to a strategic network or partnership is a shared view of the joint gains in a prospective value net. How do we then evaluate the added value of moving to a new partnership? Which network strategies provide the win-win network solution? This paper is a tentative effort at answering these questions based on Decision Analysis.

Keywords: value net, strategic network, network orchestrator, win-win strategy, collaborative maintenance network

1 INTRODUCTION
A current trend is to outsource operations that do not belong to the core competence of a company. The main rationale for this development is the assumption that, by subcontracting, the company can buy certain services more cheaply and receive them better managed and implemented, thus providing more added value compared to keeping the same functions in-house. Furthermore, embedded technology in the assets, the ever-increasing demand for production efficiency, and dynamic end-customer requirements increase the need for a variety of services. The main assumption here is that there exists, or will rapidly emerge, a competitive market of services that is able to provide collaborative services cost-effectively. Of course, there is always a tendency in the area of manufacturing and servicing for some services to be based on know-how that is restricted to only some service providers, resulting in the creation of an oligopolistic service market which is not cost-effective (the market effect of scarcity power). The striving for a competitive edge in a special area will naturally lead to this type of oligopolistic service market, but on the other hand companies must avoid seizing all the customer value of their related service. This is characteristic of networked environments, where the market players aim at strategic partnering, forming a value net, which can be characterised as in Fig. 1, showing the 'strategic' differences between the network partners and the role of a network orchestrator.
Figure 1. Illustration of the requirements for trust, information transparency and the roles of the value net partners.

As the figure shows, service providers have different strategic roles in the value net: the principal service providers belong to the core of the network, whereas other service providers lie at the fringe of the value net with decreasing strategic significance. Such a governance structure is typically worked out by the network orchestrator or network leader, who has the closest relationship with the customer. From the customer's perspective, there has been a clear need to actually decrease the number of closest partners in order to ease management. This favours the one-stop shops that can integrate solutions and services and act as the main partner towards the customer. The network leader establishes the values and culture of the network, developing its guiding principles (e.g. centralisation vs. decentralisation, incentive systems, accountability rules – all these issues entangled with their own tensions and trade-offs) while utilising the best practices from the network itself. In contrast to the rigid control systems used to manage production units, the network orchestrator relies not just on rewards, but also upon a combination of empowerment and trust, as well as training and certification, to manage a network that it does not own. Finally, orchestrators have a different way of creating value: value in the traditional firm comes from specialisation, refining skills in specific areas, protecting trade secrets, and keeping out rivals and even partners, whereas value nets create value by integration, bridging borders and leveraging intellectual property across the network. In other words, the social rationale of a value net is simply: i) doing 'more' than the organisation knows, and ii) knowing 'more' than the organisation does. Of course, this has implications in managerial areas such as contracting / strategic partnering, relationship management (culture, trust), knowledge management (social networks, IT systems), and change management (strategy, leadership, 'integrator') – all looking at organisational boundaries from their own perspectives: legal, physical, work/activity and knowledge. The managerial challenges of strategic networks or partnerships are best illustrated by the results from a survey made in Finland in 2007-08 on the challenges that Finnish industrial service providers meet in offering a client extended services in managing the technical and economic lifetime of fleet assets [1]. In the survey, covering 5 industrial partners (4 vendors, 1 client) and 11 interviewees, success factors for a service network were identified. The following key elements are crucial for building trust between the network partners:
- Pricing in relation to the added value created – How transparent should the price and added-value determinants of the services be?
- Service offering (coupling between service bundling and pricing) – What kind of bundling of products/services will satisfy the client's need to get everything from 'one hatch'?
- Information management (open access to reliability performance of equipment and maintenance performance and plans) – To what extent should the information systems be open, or even shared, based on a common platform?
- Mixed cultures (balance between service-oriented and product-manufacturing cultures) – How to reconcile the effectiveness of production with the flexibility of services?
- Intermediate operator / system integrator (ability to connect products and services from separate vendors) – Is an intermediate operator or system integrator needed – in the beginning of the new service development process only, or over the lifetime of the physical assets serviced?
- Knowledge management (scope of access to each other's data, IPR, business models, etc.) – What kind of knowledge-related asymmetries [2] do we need to worry about?
Good management of the key elements above can be viewed as a necessary condition for moving towards a true value net in the particular industrial area considered in the survey. It is obvious that this transition takes a lot of time and commitment from the managers of the companies in question. The remainder of the paper is organised as follows: in section 2 the different roles in a value net are defined and some demarcations are made; section 3 presents a value measurement framework based on value-tree analysis, which is a key feature of the generic assessment process presented in section 4; the paper is concluded with a discussion of directions for future research in section 5.
2 ROLES IN A VALUE NET
[Figure residue: basic roles tied together by the physical asset – owner, operator, end-user, system supplier, service providers, regulator and other stakeholders.]
In principle, the network leader has several governance choices for a value net servicing a physical asset. The choices are reflected in the management of the different roles, accountabilities and functions conducted in collaboration. The roles and functions should be defined with particular attention to customer needs, which typically derive from the asset owner's needs together with the end-users' needs. This is to ensure that the established network processes are truly value-adding for the customer. In practice, network and partnership choices have to be made based on current relationships and the roles adopted. The basic roles tied together by the physical asset are illustrated in Fig. 2.
Figure 2. The basic question of the network orchestrator is: Which network strategies provide the highest joint gains in the value net, and at the same time meet possible constraints (e.g. regulatory)?
To narrow down the decision context of developing network strategies, we will make the following demarcations:
- the owner of the physical asset is also the user and customer in the value net
- one of the suppliers or service providers is also the network orchestrator (see Fig. 1)
- the 'value' of running and maintaining the assets is also assessed by 'other stakeholders' (e.g. environmentalists) and the regulator, but these are usually not considered as principal partners in the value net
- the added value provided by the value net is basically determined by the end-user, but the ultimate rationale of the value net is to provide 'value' for all partners, i.e. a win-win or joint-gains outcome
- the service agents are not all 'equal' but have different strategic positions in the value net (Fig. 1) and, accordingly, different network strategies
Formulation of the strategic objectives of the value net aims at the maximisation of the value received by the customer, and thus the analysis of customer objectives lays the foundation for the derivation of the network objectives, together with the strategic objectives set by the individual service providers. In other words, the network strategy should be based on the customer's strategic objectives. For instance, in a situation where the customer's objectives are highly influenced by the dynamics of the business environment, the network strategy must include aspects of these dynamics as well. Or, if the customer wants the (physical) assets to have a certain performance (availability, reliability, maintainability, etc.), the value net has to manage its processes in such a way that the corresponding performance levels are met. The direct business-related value of networking for the partners can be, e.g., increasing business opportunities and profit, coping with the challenges resulting from a dynamic market, benefiting from network-level reputation and reaching economies of scale in the business. In the following, we discuss the value measurement further.
3 VALUE MEASUREMENT
From the point of view of the individual partner, his/her perceived 'value' can be assessed using the standard Balanced Scorecard (BSC) approach [3-5]. The BSC perspectives are fourfold and, for each of them, strategic objectives, goals and indicators can be defined according to the managerial plans. For instance, plant and maintenance objectives, and related Key Performance Indicators, are discussed in Rosqvist et al. [6].

The Learning & Growth Perspective
This perspective includes employee training and corporate cultural attitudes related to both individual and corporate self-improvement. In a value net, learning from each other and sharing knowledge can offer a competitive edge for the partners. The emphasis is therefore on increased understanding of openness and transparency requirements, the construction of mutual trust, and the promotion of systematised feedback and interaction mechanisms. Indicators can be developed to support managing these issues. Such indicators are usually interpreted as leading indicators, i.e. indicators that signal the future outcome of customer satisfaction and financial performance.

The Business Process or Internal Perspective
This perspective refers to internal business processes. The managerial areas are operations management, customer management, innovation management and regulatory & social issues management. Proper indicators allow the managers to know how well their business is running, and whether its products and services conform to customer requirements. Again, such indicators are usually interpreted as leading indicators. The above managerial areas may to a large extent be jointly managed by partnering. It is thus important for a value net to identify key managerial areas where joint gains can be achieved by reallocating managerial tasks and responsibilities within the network. In particular, information sharing principles need to be addressed: what is openly accessible to all partners, what is restricted to certain partners, what information is entered by whom, what ICT is used and who maintains it, what happens with IPR, etc.

The Customer Perspective
Recent management philosophy has shown an increasing realisation of the importance of customer focus and customer satisfaction in any business: if customers are not satisfied, they will eventually find other suppliers that will meet their needs. The mutual trust between the customer and the network is crucial. As customer value is something perceived by customers rather than objectively determined by the supplier, the significance of understanding and managing customer knowledge has to be emphasised. The main responsibility of the network orchestrator is to read the customer's 'signals' and 'transmit' them properly up to the furthermost partner in the network (see Fig. 1). The signals can relate to many attributes such as price, quality, service availability and selection, but also to trust, branding and the overall functionality of the network. In essence, the value propositions given by the value net need to be achieved, maintained and monitored.
The Financial Perspective
The financial perspective relates to immediate economic determinants and economic results. Financial metrics have been criticised for their emphasis on short-term performance through quarterly financial reporting. For each partner in a value net, the financial performance determines the success of the network in the eyes of the partners. How this affects the decision to continue as a partner or move elsewhere depends on the position and the role of the partner in the value net: the furthermost partners are expected to base their partnering decisions more on short-term than long-term financial performance. It is also expected that financial performance is one key driver of changes to the network structure: if revenues and risks are perceived by some partners to be distributed in an unfair way, then the cohesion of the network is clearly threatened. Kaplan and Norton augmented their performance measurement BSC system with a strategy planning tool – the strategy map – which depicts the cause-effect relationships between the four standard perspectives or objectives in the BSC system [7]. A key insight is that factors in the 'Learning & Growth', 'Business Process' and 'Customer' perspectives can be interpreted as leading indicators for financial performance and competitiveness with their respective lagging indicators [8,9]. It has to be remembered that the developments by Kaplan and Norton are connected to the strategy development of a single corporation. The strategy development of a value net can, basically, utilise the strategy map idea. Any prospective network strategy can be formulated in terms of managerial objectives that the partners share and jointly try to achieve. The network partners may value a network strategy by assessing its impact on determinants of customer value, and even further on financial determinants. In principle, the added value related to a prospective network strategy may be subjectively assessed, by each partner, by comparing it to the existing network strategy. This is illustrated in Fig. 3, which is an adaptation of the strategy map by Kaplan and Norton [7]. The strategy map illustrates the structure of a shared network strategy so that the value assessment can verify that a prospective network strategy is a win-win strategy. It is important to note that a network strategy that yields added value to the customer may, for the other network partners, produce negative added value due to complexities introduced in contracting and knowledge management, inducing extra transaction costs. In principle, the value assessment framework includes all the strategic elements relevant in the survey referred to in the Introduction, providing a direction for further application-specific refinement with respect to the concerned managerial areas. Some of these aspects are discussed next. The role of a network orchestrator in a service supply network has been emphasised in our survey as, for instance, capabilities for structured strategic planning are often lacking, particularly in smaller service companies. In the case where a network orchestrator takes full responsibility for network strategy development, smaller companies may compensate for this lack by adopting dynamic working methods of good quality in the implementation of the strategy, making them an important partner in the service provision.
Also with respect to information management, our survey has identified strategically important differences between typical players in a service supply network, especially with respect to transparency requirements, as indicated in Fig. 1. Regardless of these differences, the internal managerial perspective on information management requires integrated solutions to align the combination of strategies. Based on the survey, the surrounding business environment and the related dynamics create the most fundamental needs for partnership development, with effects on pricing and service offering as well as on leadership (culture) and knowledge management. From the customer point of view, one of the most important, but at the same time most difficult, aspects to control during the contract period seems to be the network's capability to continuously develop the services found strategically important. Openness and transparency are found to be important when finding common ground for assessing and capturing opportunities for the added value of improved services and/or business processes in the network. In the next section, a value assessment process is described that operationalises the strategy-map-based added value model in Fig. 3.
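To make the comparative, subjective assessment concrete, the sketch below scores a prospective network strategy against the current one (the zero reference) for each partner over a few value determinants and checks whether every partner gains; the partner names, determinants, weights, scores and the simple additive value model are assumptions for illustration, standing in for the multi-attribute techniques of the Decision Analysis literature [10,11].

```python
# Hedged illustration: each partner weights the value determinants differently
# and scores the current and the prospective network strategy on a 0-10 scale.
# A strategy is a win-win if every partner's weighted score improves.
determinants = ["price", "quality", "availability", "relationship"]

partners = {
    # name: (weights per determinant, current scores, prospective scores)
    "customer":             ([0.4, 0.3, 0.2, 0.1], [6, 7, 6, 7], [7, 7, 8, 7]),
    "network orchestrator": ([0.2, 0.2, 0.3, 0.3], [6, 6, 6, 5], [6, 7, 7, 7]),
    "service provider A":   ([0.3, 0.2, 0.2, 0.3], [5, 6, 6, 6], [6, 6, 6, 7]),
}

def value(weights, scores):
    return sum(w * s for w, s in zip(weights, scores))

added_value = {
    name: value(w, prospective) - value(w, current)
    for name, (w, current, prospective) in partners.items()
}

for name, delta in added_value.items():
    print(f"{name}: added value {delta:+.2f}")

print("win-win strategy" if all(d > 0 for d in added_value.values())
      else "not a win-win strategy")
```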
[Figure residue: strategy-map elements – partnership added value (dependent on the network strategy) comprising the added value of the service providers, the network orchestrator and the customer (user and/or owner of the physical asset); financial determinants (improve cost structure, improve asset utilisation, expand revenue opportunities, enhance customer value); customer value determinants (product/service attributes: price, quality, availability, selection, functionality; relationship: flexibility, brand); internal processes of the partners, some of which are the property of the network and defined in the network strategy; causal and defining relationships; each partner valuates the performances in the various managerial areas differently; the value-adding alternative network strategy is compared against the current network strategy (zero reference).]
Figure 3. A strategy map linking the internal managerial perspective (objectives) with the customer and financial perspectives (objectives) for the assessment of added value of a prospective network, formulated in terms of shared managerial objectives and goals that are expected to lead to added value for the customer and the partners, i.e. a win-win situation. The added value assessment is subjective and comparative in nature. (The reader is referred to the work of Kaplan and Norton on strategy maps).
4 THE ASSESSMENT PROCESS OUTLINE
To be able to use the added value model of the previous section, the following key issues need to be addressed, mainly by the network orchestrator:
- what managerial elements are incorporated in the current network strategy, and what changes could provide added value?
- how are the impacts of the prospective network strategy assessed?
- how are uncertainties (risks) incorporated in the assessment?
Figure 4 shows two distinct activities that are needed: the Network Strategy Formulation and the Added Value Assessment.
Figure 4. Activities to support network strategy development: the outcome is an action plan that outlines how the current partnership should be changed to create an improved win-win network strategy.

Network Strategy Formulation is expected to be a sensitive process led by the network orchestrator, with numerous mutual discussions with the existing and potential partners on wanted or possible changes in roles, accountabilities, etc. with respect to the current network strategy. The sense of trust in the leadership and orchestration is crucial. Basically, the outcome of the activity could be in the format of Table 1, which shows how the current and the prospective network strategies are formulated. The structuring is based on the BSC perspectives. In addition, risk management issues can be included in the form of real options that can be executed conditional on the occurrence of random events. For instance, if there is a sudden price increase for a certain raw material or component, or a radical breakdown of one partner's production, there are options for the affected partner to switch the material/component to another, or for the network to be temporarily supplied by some other company outside the network, respectively. Such real options should be identified and agreed upon in the network strategy.

Table 1 Network strategy formulation – generic template
Management area: financial / customer / business process / regulation & social
Current strategy: What is in place to implement the current network strategy?
Prospective strategy: What should be in place to implement the prospective improved network strategy?
The prospective new network strategy is then valuated in order to assess the added value for each partner, as well as for the customer. The valuation techniques follow the methods and techniques presented in Decision Analysis, e.g. [10,11]. The customer may or may not be included in the assessment; this depends on the character of the customer-network relationship. At the same time, it is verified that the new network strategy produces added value for each (principal) partner, i.e. that the prospective network strategy is a win-win strategy. The outcome of this activity is an action plan that indicates the next concrete measures to be performed in order to implement the new network strategy. In general, it is expected that the alternative network strategy entails only a small change in terms of strategy but nevertheless reflects a major change in attitude and trust among the partners. Any formal change in strategy is always connected with changed expectations of the outcome and the way the partners will act. If expectations are met, trust will build up, and the willingness to develop deeper partnerships will increase. Such a partnering process is incremental, evolving step by step. The formulation and valuation activities may be supported by a Group Decision Support System (GDSS), allowing effective ways of generating ideas, commenting, voting and arriving at an action plan. The use of a GDSS focuses on supporting the whole multi-phased process of group decision making with certain technologies, and it can be seen to fall under the larger umbrella of the concept of GSS (Group Support Systems), which may include any technologies used to make groups more productive [12,13].
Earlier studies have revealed several benefits of GDSS that are worth noting in coping with the problem area of this paper. These are, for example, process structuring, a goal-oriented process, parallelism (many people able to communicate at the same time), allowance of larger group sizes, automatic documentation, anonymity of group members, access to external information, and automated data analysis [13-16]. The problem area in this paper is a complex one: it needs structuring, it influences several people, and several people are needed to make decisions with regard to it. Therefore, it can be assumed that the use of GDSS would provide significant benefits. A large part of the results could be produced during, e.g., two strictly phased GDSS workshop sessions, of which the first would formulate the prospective network strategy and the second would have the principles of value assessment, metrics and action plans as outputs. However, earlier studies have also stressed the significance of detailed pre-planning: in pre-planning meetings, the purpose and goals of each GDSS session need to be carefully defined in order to make the process itself as efficient and effective as possible, and to improve the likelihood that the concrete results of the sessions will be the best possible.
5
CONCLUSIONS
In the proposed framework for assessing the added value of a network strategy, value is a measure of the subjective preferences of the network partners. Thus, a win-win strategy cannot be proved other than by monitoring the customer-network relationship and the relationships between the partners. The partners have different roles, skills, expectations, etc. that need to be aligned in a network strategy for adding value to the customer and to each other. The Balanced Scorecard perspectives and strategy maps provide a good basis for developing a value-theoretic assessment framework that supports the network orchestrator in the formulation and valuation of a prospective network strategy. In the framework, the valuation is comparative, with the current network strategy as the reference point. The framework was developed as one answer to the need to improve partnering for maintenance and production services for a customer in the fertiliser business in Finland. The presented approach needs many test cases for refinement and validation and should be viewed as a reference for further research rather than a readily implementable method to develop value nets.
6
REFERENCES
1
Ojanen V, Lanne M, Reunanen M, Kortelainen H & Kässi T. (2008) New service development: success factors from the viewpoint of fleet asset management of industrial service providers. Fifteenth International Working Seminar on Production Economics, Pre-Prints Volume 1, 369-380.
2
Cimon Y. (2004) Knowledge-related asymmetries in strategic alliances. Journal of Knowledge Management, 8(3), 17-30.
3
Kaplan RS & Norton DP. (1992) The balanced scorecard - Measures that drive performance. Harvard Business Review, Jan-Feb, 71-79.
4
Kaplan RS & Norton DP. (2001a) Transforming the Balanced Scorecard from Performance measurement to Strategic Management: Part I. Accounting Horizons, 15, 87-104.
5
Kaplan RS & Norton DP. (2001b) Transforming the Balanced Scorecard from Performance measurement to Strategic Management: Part 2. Accounting Horizons, 15, 147-160.
6
Rosqvist T, Laakso K & Reunanen M. (2009) Value-driven maintenance planning for a production plant. Reliability Engineering and System Safety, 94, 97-110.
7
Kaplan RS & Norton DP. (2004) Strategy maps: Converting intangible assets into tangible outcomes. Boston: Harvard Business School Press.
8
Neely A, Bourne M & Kennerley M. (2000) Performance measurement system design: developing and testing a process-based approach. International Journal of Operations & Production Management; 20 (10): 1119-1145.
9
Fitzgerald L, Johnston R, Brignall S, Silvestro R & Voss C. (1991). Performance measurement in service business, Chartered Institute of Management Accountants (CIMA), London.
10
Keeney L & Raiffa H. (1993) Decisions with Multiple Objectives: Preferences and Value Trade-offs. Cambridge University Press.
11
Keeney R. (1992) Value-focused thinking – a path to creative decision making. Harvard University Press.
12
Nunamaker J, Briggs R & Mittleman D. (1996) Lessons from a decade of Group Support Systems Research. Proceedings of the 29th Annual Hawaii International Conference on Systems Sciences, Jan 3-6, Maui, Hawaii.
13
Elfvengren K. (2006) Group Support System for Managing the Front End of Innovation: case applications in business-to-business enterprises. Lappeenranta University of Technology, Acta Universitatis Lappeenrantaensis 239, doctoral dissertation, Lappeenranta, Finland.
14
Jessup L & Valacich J. (1993) Group Support Systems: New Perspectives. Macmillan Publishing Company.
15
Weatherall A & Nunamaker J. (1995). Introduction to electronic meetings. Technicalgraphics.
16
Turban E, Aronson J. & Liang TP. (2004) Decision support systems and intelligent systems, 7th ed. Prentice-Hall.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
APPLICATION OF ACOUSTIC EMISSION TECHNOLOGY IN MONITORING STRUCTURAL INTEGRITY OF BRIDGES Manindra Kaphle a, Andy CC Tan a, Eric Kim a and David Thambiratnam a a
CRC for Integrated Engineering Asset Management, Faculty of Built Environment and Engineering, Queensland University of Technology, Brisbane, Australia.
Bridges are an important part of a nation’s infrastructure and reliable monitoring methods are necessary to ensure their safety and efficiency. Most bridges in use today were built decades ago and are now subjected to changes in load patterns that can cause localized distress, which can result in bridge failure if not corrected. Early detection of damage helps in prolonging lives of bridges and preventing catastrophic failures. This paper briefly reviews the various technologies currently used in health monitoring of bridge structures and in particular discusses the application and challenges of acoustic emission (AE) technology. Some of the results from laboratory experiments on a bridge model are also presented. The main objectives of these experiments are source localisation and assessment. The findings of the study can be expected to enhance the knowledge of acoustic emission process and thereby aid in the development of an effective bridge structure diagnostics system. Key Words: Structural health monitoring, acoustic emission, damage, bridge structures 1
INTRODUCTION
Bridges are an important part of a nation's infrastructure, and reliable monitoring methods are necessary to ensure their safety and structural well-being. Many bridges in use today were built decades ago and are now subjected to changes in load patterns that cause localized distress, which may result in bridge failure if not corrected. Early detection of damage and appropriate retrofitting can help in prolonging the lives of bridges and preventing failures. Bridge failures can cause huge financial losses as well as loss of lives, an example being the I-35W highway bridge collapse in Minnesota, USA in August 2007, which killed 13 people and injured 145. There are altogether 33,500 bridges in Australia, with a replacement value of about 16.4 billion dollars and about 100 million dollars in annual maintenance expenditure [1]. In the USA, out of a total of 593,416 bridges, 158,182 (around 26.7 percent) were identified as being either structurally deficient or functionally obsolete [2]. These statistics point to the need for cost-effective technology capable of monitoring the structural health of bridges, ensuring they remain operational during their intended lives. The aim of the paper is to compare various methods currently used in health monitoring of bridge structures and in particular to discuss the application and challenges of acoustic emission (AE) technology. Some of the results from laboratory experiments, which are aimed at finding source location and assessment, are also presented. The findings of the study can be expected to enhance the knowledge of acoustic emission wave propagation and signal analysis techniques and their applications in structural health monitoring.
2
LITERATURE REVIEW
2.1 Structural health monitoring techniques
Visual inspection has been the traditional tool for monitoring bridges. Bridges are inspected at regular intervals for visible defects by trained inspectors. Though simple, visual inspection results depend solely on inspectors' judgements, and small or hidden defects may go unnoticed. A range of newer techniques are available today that provide more reliable information than visual inspection. In commonly used vibration monitoring techniques, damage to a bridge is assessed by measuring changes in
the global properties (such as mass, stiffness and damping) of the whole structure and identifying shifts in natural frequencies and changes in structural mode shapes [3-5]. But some damage may cause only a negligible change in dynamic properties and therefore may go unnoticed. Additionally, these methods generally give the global picture, indicating the presence of damage in the whole structure; local methods are often necessary to find the exact location of the damage. Several non-destructive techniques are available for local health monitoring of bridge structures. The most commonly used techniques are based on the use of mechanical waves (ultrasonic and acoustic), electromagnetic waves (magnetic testing, eddy current testing, radiographic testing) and fibre optics [4, 6]. The ultrasonic technique detects the geometric shape of a defect in a specimen using an artificially generated source signal and a receiver [7]. Magnetic particle testing uses powder to detect leaks of magnetic flux [8]. It can be an economic alternative to other methods but cannot be used for non-ferrous materials. Eddy current testing is based on the principle that there is a change in the eddy-current pattern due to the presence of a flaw in a structure [4]. It can detect cracks through paint and is effective for detecting cracks in welded joints, but testing can be expensive. In radiographic methods, a suitable energy source is used to generate radiation, and a flaw is detected when the radiation is recorded on the other side of the specimen. Laboratory results of radiographic testing are promising, but the large size of the equipment hampers its use in field investigations. Fibre optics can detect various parameters; displacement and temperature are the most common. Sensing is based on intensity, wavelength and interference of the light waves [9]. Advantages include geometric conformity, the capability to sense a variety of perturbations, and immunity to electrical interference [9]. But fibre optic sensors can be costly, and placement within the structure during construction may be needed, precluding their use on already built bridges.
2.2 Acoustic emission technique
Acoustic emission (AE) waves are the stress waves that arise from the rapid release of strain energy that accompanies microstructural changes in a material [10]. Common sources of AE are initiation/growth of cracks, yielding, impacts, failure of bonds and fibre failure. AE waves generated within a material can be recorded by means of sensors placed on the surface. The AE technique involves the analysis of these recorded signals to obtain information about the source of the emission. Physically, AE waves consist of P waves (primary/longitudinal waves) and S waves (shear/transverse waves) and might further include Rayleigh (surface) waves, reflected waves, diffracted waves and others [11]. In plate-like structures, as signals travel away from the source, Lamb waves become the dominant mode of propagation [12]. Lamb waves primarily travel in two basic modes, namely symmetric (S0) or extensional and asymmetric (A0) or flexural, though higher modes such as S1 and A1 can exist [13]. These modes travel with different velocities depending on the frequency and the thickness of the plate. Dispersion curves based on solutions to Lamb's equation are used to relate velocity to the product of frequency and plate thickness [13]. Some of the advantages of the AE technique over other non-destructive techniques are its high sensitivity, its source localization capability and its ability to provide monitoring in real time, that is, damage detection as it occurs. The study of AE started in the 1950s, and the AE technique found its initial application in monitoring pressure vessels and aerospace structures. AE was first applied to bridge monitoring in the early 1970s, but the use and study of AE for monitoring bridge structures rose with the rapid increase in computing resources and the development of sensor technology.
2.3 Applications of AE technology for bridge monitoring and challenges faced
AE is well suited to the study of the integrity of bridge structures, as it is able to provide continuous in-situ monitoring and is also capable of detecting a wide range of damage mechanisms in real time [12]. A general overview of applications of AE for monitoring bridges has been given in [13]. Application of AE to steel bridges has been discussed by [14], and application to concrete has been covered by [11]. A number of previous studies have explored the use of AE technology for monitoring bridge structures made of different materials, such as steel [12, 15, 16], concrete [17-19] and composites [20, 21], as well as masonry bridges [22]. Most of the studies have combined field testing with experiments performed in the laboratory. The traditional approach in AE monitoring involves the use of parameters of the recorded AE signals, such as amplitude, energy content, rise time and duration, in characterising damage [23]. This parameter-based approach is simple but may be insufficient, as not all waveform information is used during analysis. The waveform-based approach involves recording the whole AE signal waveform and studying its features, and is often regarded as better than the parameter-based method. Though the AE technique has been successfully applied to bridge monitoring, several challenges still exist. The large size of bridges creates practical problems, for example with accessing desired areas and the need for a large number of sensors to monitor the whole structure. The solution is to identify critical areas and then to monitor these targeted areas. As a large volume of data is generated during monitoring owing to the high sampling rate, effective data management becomes important. AE signals can arise from a number of sources, so distinguishing the sources of origin is critical in understanding the nature of damage. For instance, in steel bridges likely sources of AE include crack growth, sudden joint failures, rubbing, fretting and traffic noises. The presence of noise sources that can mask the AE signals from real cracks has been identified as one of the biggest limitations of AE monitoring systems. Different suggestions have been made for noise suppression [24], but scope exists for further research. Proper analysis of recorded data to obtain reliable information about the source is another major challenge in
the AE technique. Frequency-based analysis provides important information about the nature of the source. Along with traditional Fourier-based analysis, other signal analysis techniques such as the short time Fourier transform (STFT) and wavelet analysis (WA) are gaining popularity [7].
3
EXPERIMENTS
Since three important aspects of AE monitoring have been identified as source location, source identification and severity assessment [13], laboratory experiments were carried out to address them. Two sets of experiments were carried out to determine the source location in small and large plates using the popular time of arrival (TOA) method. In the TOA method, differences in the arrival times of the signals at different sensors and the velocity of the waves are used to find the location of the source using triangulation techniques [25]. The influence of AE wave travel modes on the source location process was also studied. The next set of experiments was aimed at finding a way to determine the similarity between two different sources of AE signals. Knowing how to differentiate and classify signals from different sources can be expected to help in source identification and assessment. The AE analysing system used was the micro-disp PAC (Physical Acoustics Corporation) system, along with the AEwin software provided by the same company. The sensors used were R15a sensors (PAC), resonant at 150 kHz, and the preamplifiers used had a gain of 20 dB. Data was acquired at a sampling rate of 1 MHz for a duration of 15 ms. The system recorded a hit when signals reached a certain threshold, set at 60 dB.
3.1 Source location experiments
3.1.1 Source location in small plate
A 300 mm by 300 mm aluminium plate of 3 mm thickness was used as the test specimen. Three sensors were placed at three different locations to record AE signals. The sources of AE signals were pencil lead breaks, which involved breaking 0.5 mm pencil leads at selected locations within the plate. Signals from pencil lead break tests have been found to closely resemble crack signals. MATLAB (R2008a, The MathWorks) codes were used for iteration purposes. Calculated locations were compared with the exact locations to verify the accuracy of the TOA method.
3.1.2 Source location in larger plate
Similar experiments were conducted on a 1.8 m by 1.2 m steel plate of 3 mm thickness that acted as part of the deck of a slab-on-girder bridge model. Again, three sensors were used to record data.
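The study used MATLAB for the iterative TOA solution; as an illustration of the same idea only (not the original code), the sketch below solves for a source position from arrival-time differences at three sensors by nonlinear least squares, using Python with NumPy/SciPy. The sensor coordinates, wave velocity and time differences are hypothetical:

import numpy as np
from scipy.optimize import least_squares

# Hypothetical sensor layout on a 300 mm x 300 mm plate (coordinates in metres)
sensors = np.array([[0.05, 0.05], [0.25, 0.05], [0.15, 0.25]])
c = 5128.0  # assumed wave velocity in m/s

def residuals(xy, sensors, dt, c):
    """Difference between measured and predicted arrival-time differences
    (relative to the first sensor) for a trial source position xy."""
    d = np.linalg.norm(sensors - xy, axis=1)   # source-to-sensor distances
    t = d / c                                  # predicted absolute travel times
    return (t - t[0]) - dt                     # compare the time differences

# Hypothetical measured arrival-time differences relative to sensor 0 (seconds)
dt = np.array([0.0, 8.0e-6, 15.0e-6])

sol = least_squares(residuals, x0=[0.15, 0.15], args=(sensors, dt, c))
print("estimated source position:", sol.x)

With more than three sensors the same formulation becomes an over-determined fit, which also gives an indication of the location error.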
3.2 Signal similarity experiments
In a long steel beam (3 m long, 0.15 m wide, 75 mm thick), two sources of AE signals were generated: pencil lead breaks and steel ball drops (6 mm diameter balls dropped from a vertical height of 15 cm). Ten sets of each test were carried out. The signals were recorded by a sensor placed at a distance of 1.5 m from the source. They were then analysed to determine similarity, with the aim of finding a way to classify signals from various sources. A parameter called magnitude squared coherence (MSC) was used to measure similarity. The magnitude squared coherence estimate is a function of frequency, with values between 0 and 1, that indicates how well two signals correspond to each other at each frequency, with a value of 1 indicating an exact match (MATLAB help guide R2008a, The MathWorks). MSC is calculated using the power spectral densities and the cross power spectral density of the signals [7].
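The study computed MSC with MATLAB's mscohere; the sketch below is a SciPy analog, shown for illustration only. The two waveforms are synthetic placeholders for recorded AE signals, and the sampling rate follows the 1 MHz acquisition described above:

import numpy as np
from scipy.signal import coherence

fs = 1_000_000          # sampling rate used in the experiments (1 MHz)

# Placeholder signals standing in for two recorded 15 ms AE waveforms;
# in practice these would be the digitised sensor outputs.
t = np.arange(0, 0.015, 1 / fs)
sig_a = np.sin(2 * np.pi * 150e3 * t) + 0.1 * np.random.randn(t.size)
sig_b = np.sin(2 * np.pi * 150e3 * t) + 0.1 * np.random.randn(t.size)

# Magnitude squared coherence: Cxy = |Pxy|^2 / (Pxx * Pyy), per frequency bin
f, Cxy = coherence(sig_a, sig_b, fs=fs, nperseg=1024)
print("mean MSC up to 400 kHz:", Cxy[f <= 400e3].mean())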
4
RESULTS
4.1 Source location experiments
For the small plate source location experiment, the longitudinal wave velocity in aluminium (c_L = √(E/ρ) = 5128 m/s, where E is Young's modulus and ρ the density) was used to calculate the source location. Fig. 1 shows the calculated positions and the exact locations of the pencil lead breaks, along with the sensor locations. A good correlation between the calculated and exact values is seen.
The result of the larger plate source location experiment using the longitudinal wave velocity in steel (c_L = 5188 m/s) is shown in Figure 2a. The exact and calculated values do not show a good match. The result using c = 3000 m/s, a value close to the transverse wave velocity (c_T = √(E/(2ρ(1+ν))), where ν is Poisson's ratio), is shown in Figure 2b, where a much better correlation is obtained. The results indicate that the waves recorded by the sensors are not longitudinal waves, unlike in the first experiment.
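For reference, the two velocities used above follow directly from the quoted formulas; a small sketch with nominal material constants (assumed values, not taken from the paper) is given below:

from math import sqrt

def wave_velocities(E, rho, nu):
    """Longitudinal (rod) velocity c_L = sqrt(E/rho) and
    transverse (shear) velocity c_T = sqrt(E / (2*rho*(1+nu)))."""
    c_L = sqrt(E / rho)
    c_T = sqrt(E / (2 * rho * (1 + nu)))
    return c_L, c_T

# Nominal material constants (assumed, not from the paper)
print(wave_velocities(E=69e9,  rho=2700, nu=0.33))   # aluminium: roughly 5050 and 3100 m/s
print(wave_velocities(E=207e9, rho=7850, nu=0.30))   # steel:     roughly 5140 and 3180 m/s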
Figure 1 Source location in small plate
(a) Using c = 5188 m/s
(b) Using c = 3000 m/s
Figure 2 Source location in larger plate with two different velocities
Initial parts of sample signals from the two source location experiments are shown in Figure 3, along with the threshold value (dotted line), demonstrating how the threshold is crossed by the recorded signals.
(a)
(b)
Figure 3 Record of hits in small plate (a) and larger plate (b)
In Fig. 3a, the first arriving wave crosses the threshold and records a hit. On the other hand, in Fig. 3b, the initial portion consists of low amplitude signals that do not cross the threshold and therefore do not trigger a hit. It can also be seen that the initial component arrives about 90 µs before the triggering wave component. Using the velocity of the triggering wave (c = 3000 m/s) and the distance between the source and the sensor (in this case signals were recorded by sensor S3 for a source at position (0.3, 1.2), so the distance was calculated to be 0.67 m), the velocity of the initial arriving wave can be calculated to be around 5000 m/s. This value is close to the longitudinal velocity of waves in steel. An initial conclusion can be drawn that though longitudinal waves are present, they have attenuated to a level below the threshold, and the waves that record a hit by crossing the threshold are the transverse waves. More investigation is needed to check whether the waves seen are Lamb wave modes, as Lamb waves are common in large plate-like structures. Detailed frequency analysis of the signals by means of Fourier analysis and the short time Fourier transform (STFT) is expected to be useful in identifying the modes. Therefore, Fourier analysis was carried out for two parts of the sample signal from the large plate source experiment: the initial 90 µs portion and the next 230 µs portion. The frequency response diagrams are shown in Figure 4.
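The velocity estimate for the early-arriving component follows from the quoted distance and the 90 µs lead; a short check of the arithmetic is sketched below:

d = 0.67          # source-to-sensor distance in metres (sensor S3, source at (0.3, 1.2))
c_trigger = 3000  # velocity of the wave component that crossed the threshold, m/s
lead = 90e-6      # the earlier component arrives about 90 microseconds sooner

t_trigger = d / c_trigger      # about 223 microseconds
t_early = t_trigger - lead     # about 133 microseconds
print(d / t_early)             # about 5000 m/s, close to the longitudinal velocity in steel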
(a) Initial 90 µs
(b) Next 230 µs
Figure 4 FFT of two portions of signal from large plate source location experiment
The major difference observed between Figures 4a and 4b is that frequency peaks around 47, 70 and 90 kHz appear in Fig. 4b, indicating that these lower frequency wave modes arrive late and trigger a hit. To obtain more information, short time Fourier transform (STFT) analysis was carried out using the time-frequency toolbox [26]; the results are shown in Fig. 5.
Figure 5 STFT analysis of the signal
From the STFT plot in Fig. 5, it is clear that waves with frequencies around 100 to 180 kHz arrive at the beginning. Starting at around 90 µs, waves with a large variation in frequencies arrive. The frequencies gradually decrease from around 350 kHz to 30 kHz, with a peak value at around 45 kHz and 250 µs. Another similar wave pattern with decreasing frequency values emerges after 200 µs; these could be the reflected waves. For further insight into the Lamb wave phenomenon, study of the dispersion curve for steel is useful. It is given in Figure 6 and shows the variation of the velocities of modes S0, A0, S1 and A1 with frequency and plate thickness. Using frequencies f of 100 kHz and 180 kHz and a plate thickness t of 3 mm (f·t = 0.3 or 0.54 MHz·mm), a group velocity of around 5000 m/s is seen for the S0 mode in Fig. 6. This value matches the calculation made before for the initial fast arriving component. But for the triggering slow arriving wave (of velocity 3000 m/s) to be the flexural mode (A0), a frequency of around 333 kHz or more is required (f·t = 1 MHz·mm). This high frequency component is not conspicuous in the FFT analysis in Fig. 4b, though it can be seen in the STFT analysis in Fig. 5. But since signals with a wide range of frequencies are present, it is hard to ascertain that the 333 kHz component crosses the threshold first. It has to be added that, due to sensor sensitivity, frequencies in the higher range are not recorded properly, as sensors tend to be sensitive to signals near their resonant frequency, in this case 150 kHz.
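The time-frequency analysis above was performed with the Time-Frequency Toolbox for MATLAB [26]; as an illustration only, an equivalent spectrogram of a recorded hit could be produced with SciPy as sketched below, with a synthetic placeholder standing in for the recorded waveform:

import numpy as np
from scipy.signal import stft

fs = 1_000_000                      # 1 MHz sampling rate, as in the experiments

# Placeholder waveform standing in for one recorded hit (15 ms of samples)
t = np.arange(0, 0.015, 1 / fs)
waveform = np.random.randn(t.size)

# Short windows give the time resolution needed to separate wave-mode arrivals
f, seg_t, Z = stft(waveform, fs=fs, nperseg=256, noverlap=192)
magnitude = np.abs(Z)               # |STFT|, the quantity plotted as a spectrogram
print(magnitude.shape)              # (frequency bins, time segments)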
Figure 6 Dispersion curves in steel [13]
4.2 Experiment 3
To judge the similarity of signals, the magnitude squared coherence (MSC) is used. MSC values between signals from pencil lead break experiments have a mean of 0.78, while MSC values between pencil lead break and ball drop signals have a mean of 0.38. These differences are significant, showing that magnitude squared coherence is worth further exploration for signal classification purposes. Sample MSC values between two pencil lead break signals (a) and between a pencil lead break and a ball drop signal (b), computed using the MATLAB function mscohere (R2008a, The MathWorks), are shown in Fig. 7. Fig. 7a indicates a close match between the signals, especially up to 400 kHz, whereas Fig. 7b indicates less coherence.
(a)
(b)
Figure 7 MSC values versus frequencies
5
DISCUSSIONS
Experiments in this study have explored the use of frequency analysis for better interpretation of recorded AE data. The ability to determine the location of the source is an important advantage of the acoustic emission technique. AE propagation in solids is a complex phenomenon, as signals travel in various modes and mode conversions occur. For accurate source location, identification of the modes is necessary. It has been seen that progressive attenuation of longitudinal waves precludes their use in locating the source of AE events at larger scales [15]. The results from the source location experiments confirm this. For signal analysis, STFT is found to be more informative than the Fourier transform, as both the frequencies and the times of occurrence of different wave modes are seen in STFT analysis. More rigorous analysis is required to ascertain whether the wave modes recorded are Lamb wave modes, transverse waves or other waves. The use of finite element analysis and other techniques such as wavelet analysis is expected to be beneficial and will be carried out in the next stage. Similar sources have been found to give similar waveforms. Magnitude squared coherence (MSC), based on the power spectral analysis of the signals, provides a simple way of judging signal similarity, as verified by the experimental results. Signal similarity can be an effective tool for signal classification and thus for source identification and assessment, which are important aspects of the AE monitoring method. A crack waveform obtained from a laboratory experiment can act as a template for distinguishing a similar signal obtained in field testing from other noise sources. The waveform recorded by a sensor is influenced by the path travelled by the signals and by the sensor characteristics; the influence of these parameters needs further consideration.
6
CONCLUSIONS
The study of the acoustic emission technique for monitoring bridge structural integrity is growing continually. Though the AE technique has several distinct advantages over other non-destructive methods of monitoring, it has not yet become the preferred choice in bridge monitoring. The existence of noise sources that can mask AE signals from real damage has been identified as a major hindrance to the use of the AE technique in monitoring bridges. Signal processing techniques, including the frequency analysis tools used in this study, can be effective in distinguishing and removing noise from real signals. Proper analysis of recorded AE signals to deduce useful information about the nature of the source is another challenge. This study has aimed to address some of these issues by analysing the signals for source location and source assessment purposes. Though only laboratory tests have been carried out so far, knowledge from these tests is valuable in interpreting the results from actual
field tests. Plate-like structures and beams are common in bridges, and hence make ideal experimental specimens. As AE is generally used as a local monitoring technique, critical areas where damage is likely are identified and specifically monitored. In the experiments in this study, signals were recorded over distances up to slightly more than 1 m for plates and 1.5 m for beams. Studies on the attenuation of AE signals (not shown in this paper) have shown the possibility of recording AE signals effectively over larger distances (5-7 m); hence a fairly wide area could be monitored using strategically placed AE sensors. The approach in this study was to record experimental data first and then transfer and analyse the data later. Real-time analysis is an attractive option and, especially with the advanced computing resources available today, should be feasible to implement. To conclude, it can be said that monitoring the structural integrity of bridges by AE provides insight into their current state and helps determine whether further steps are necessary to extend the lives of bridges and to ensure they perform safely and reliably.
7
REFERENCES
1
Austroads, (2004) Guidelines for Bridge management - Structure Information. Austroads Inc: Sydney, Australia.
2
USDoT, (2006) 2006 Status of the Nation's Highways, Bridges, and Transit: Condition and Performance. U.S. Department of Transportation Federal Highway Administration Federal Transit Administration.
3
Shih, H.W., D.P. Thambiratnam, and T.H.T. Chan, (2009) Vibration based structural damage detection in flexural members using multi-criteria approach. Journal of sound and vibration. 323, 645-661.
4
Chang, P.C. and S.C. Liu, (2003) Recent research in nondestructive evaluation of civil infrastructures. Journal of materials in civil engineering. p. 298-304.
5
Chang, P.C., A. Flatau, and S.C. Liu, (2003) Review paper: Health monitoring of civil infrastructure. Structural health monitoring. 2, 257-267.
6
Chong, K.P., N.J. Carino, and G. Washer, (2003) Health monitoring of civil infrastructures. Smart Materials and structures. 12, 483-493.
7
Grosse, C.U., et al., (2004) Improvements of AE technique using wavelet algorithms, coherence functions and automatic data analysis. Construction and building Materials. 18, 203-213.
8
Rens, K.L., T.J. Wipf, and F.W. Klaiber, (1997) Review of non-destructive evaluation techniques of civil infrastructure. Journal of performance of constructed facilities. 11(2), 152-160.
9
Ansari, F., (2007) Practical implementation of optical fiber sensors in civil structural health monitoring. Journal of intelligent material systems and structures. 18, 879-889.
10 Vahaviolos, S.J., (1996) Acoustic emission: A new but sound NDE technique and not a panacea, in Non destructive testing, D. Van Hemelrijck and A. Anastassopoulos, Editors. Balkema: Rotterdam.
11 Ohtsu, M., (1996) The history and development of acoustic emission in concrete engineering. Magazine of concrete research. 48(177), 321-330.
12 Holford, K.M., et al., (2001) Damage location in steel bridges by acoustic emission. Journal of intelligent material systems and structures. 12, 567-576.
13 Holford, K.M. and R.J. Lark, (2005) Acoustic emission testing of bridges, in Inspection and monitoring techniques for bridges and civil structures, G. Fu, Editor. Woodhead Publishing Limited and CRC. p. 183-215.
14 Lozev, M.G., et al., (1997) Acoustic emission monitoring of steel bridge members. Virginia transportation research council.
15 Maji, A.K., D. Satpathi, and T. Kratochvil, (1997) Acoustic emission source location using lamb wave modes. Journal of engineering mechanics. p. 154-161.
16 Sison, M., et al., (1998) Analysis of acoustic emissions from a steel bridge hanger. Research in Nondestructive Analysis. 10, 123-145.
17 Colombo, S., et al., (2005) AE energy analysis on concrete bridge beams. Materials and structures. 38, 851-856.
18 Shigeshi, M., et al., (2001) Acoustic emission to assess and monitor the integrity of bridges. Construction and building materials. 15, 35-49.
19 Yuyama, S., et al., (2007) Detection and evaluation of failures in high-strength tendon of prestressed concrete bridges by acoustic emission. Construction and building materials. 21, 491-500.
20 Rizzo, P. and F.L. di Scalea, (2001) Acoustic emission monitoring of carbon-fiber-reinforced-polymer bridge stay cables in large-scale testing. Experimental mechanics. 41(3), 282-290.
21 Gostautas, R.S., et al., (2005) Acoustic emission monitoring and analysis of glass fiber-reinforced composites bridge decks. Journal of bridge engineering. 10(6), 713-721.
22 Melbourne, C. and A.K. Tomor, (2006) Application of acoustic emission for masonry arch bridges. Strain - International Journal for strain measurement. 42, 165-172.
23 Vallen, H., (2002) AE testing fundamentals, equipment, applications. NDT.net. 7(09).
24 Daniel, I.M., et al., (1998) Acoustic emission monitoring of fatigue damage in metals. Nondestructive testing evaluation. 14, 71-87.
25 Nivesrangsan, P., J.A. Steel, and R.L. Reuben, (2007) Source location of acoustic emission in diesel engines. Mechanical systems and signal processing. 21, 1103-1114.
26 Auger, F., et al., (1996) Time-Frequency Toolbox - For use with MATLAB. CNRS (France) and Rice University (USA).
Acknowledgments The authors gratefully acknowledge the financial support from the QUT Faculty of Built Environment & Engineering and the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
APPLICATION OF TEXT MINING IN ANALYSING ROAD CRASHES FOR ROAD ASSET MANAGEMENT Richi Nayak1, Noppadol Piyatrapoomi2 and Justin Weligamage2 1
Faculty of Science and Technology, Queensland University of Technology, Brisbane, QLD 4001, Australia
2
Road Asset Management Branch, Queensland Government Department of Main Roads Brisbane, Queensland, Australia
Traffic safety is a major concern world-wide. It is in both the sociological and economic interests of society that attempts should be made to identify the major and multiple contributory factors to road crashes. This paper presents a text mining based method to better understand the contextual relationships inherent in road crashes. By examining and analyzing the crash report data in Queensland from the years 2004 and 2005, this paper identifies and reports the major and multiple contributory factors to those crashes. The outcome of this study will support road asset management in reducing road crashes. Key Words: Text Mining, Road Crashes, Data Analysis 1
INTRODUCTION
Traffic safety is a major concern in many states around Australia, including the state of Queensland. Since the year 2000, there have been 296 fatalities on average per year in Queensland, as recorded by the Office of Economic and Statistical Research [1]. Data obtained for analysis in this paper shows that during the years 2004 and 2005, over 20,000 traffic crash investigation reports were recorded involving Queensland motorists. The annual economic cost of road crashes in Australia is enormous - conservatively estimated at $18 billion per annum - and the social impacts are devastating [2]. The cost to the community through these crashes is very high. They also have a devastating impact on the emergency services and a range of other groups. In addition, it is inevitable that insurance companies will have to increase premiums to cover the ongoing cost of insuring those motorists and their vehicles. It is therefore in both the sociological and economic interests of society that attempts are made to identify the major and multiple contributory factors to those crashes. Statistical analysis of road crashes is not a new realm of research by any means. For many years, road safety engineers and researchers have attempted to deal with large volumes of information in order to gain an understanding of the economic and social impacts of car crashes. The hope is that, with this understanding, more efficient safety measures can be put into place to decrease the number of future road crashes [3]. Various data mining and statistical techniques have been used in the past in this domain. Researchers have attempted to investigate crash analysis through ordinary statistical tables and charting techniques [4, 5]. The issue with these techniques is that they limit human involvement in the exploration and knowledge discovery tasks. Researchers have also applied advanced data mining methods, including clustering, neural networks and decision trees, to reveal relationships between distractions and motor vehicle crashes. Major focuses of research on road crashes are the use of data mining to analyse freeway or highway accident frequency, the development of models to predict highway incident durations, and the use of data mining in the classification of accident reports [6, 7, 8]. Other studies include the use of data mining and situation-awareness for improving road safety; a comparison of driving performance and behaviour in 4WDs versus sedans through data mining of crash databases [5]; and a study of the safety performance of intersections [9]. These studies revealed some interesting results; however, they are unable to properly analyse the cognitive aspects of the causes of the crashes. They often opt to leave out significant qualitative and textual information from data sets, as it is difficult to create meaningful observations from it. The consequence of ignoring this textual information is a limited analysis in which less substantial conclusions are made. Text mining methods attempt to bridge this gap. Text mining is the discovery of new, previously unknown information by automatically extracting it from different written (text) resources. Text mining methods are able to extract important concepts and emerging themes from a collection of text sources. Used in a practical situation, the possibilities for
knowledge discovery through the use of text mining are immense. To our knowledge, there are limited or no reputable studies that have utilised text mining in this data domain; however, earlier studies in the field indicate a real need for text mining in order to better understand the contextual relationships in road crash data. This paper presents a text mining based method to better understand the contextual relationships inherent in road crashes. By examining and analyzing the crash report data in Queensland from the years 2004 and 2005, this paper identifies and reports the major and multiple contributory factors to those crashes. Analysis is performed to identify links between common factors recorded in crash reports. Of key concern are the causes of crashes, rather than the consequences. The outcome of this study will support road asset management in reducing road crashes. With these findings on hand, we hope they can be useful for reviewing the limitations of existing road facilities as well as for planning better public safety measures. Most important is implementing and continuing long term public education on road safety issues, especially among younger drivers and male drivers, who historically are involved in a high proportion of road crashes each year.
2
TEXT MINING METHOD
Text mining is the discovery of new and previously unknown information automatically from different text resources, using natural language processing and computational linguistics, machine learning and information science methods [10]. The key element is the linking of the discovered information together to form new facts or new hypotheses to be explored further by more conventional means of experimentation [10]. Text mining methods include the steps of processing the input text data, deriving rules and patterns within the newly processed data, and finally the evaluation and interpretation of the output rules and patterns.
2.1
Objectives
The focus of this paper is to determine the most common causes of road crashes so that appropriate measures can be taken by road asset management in the future to prevent these accidents from occurring. The objective is to investigate the nature of the crashes, and of the roads involved, reported by Queensland traffic accident investigators within the period 2004 to 2005. Performing text mining on the crash descriptions gives information about the causes of a crash that cannot necessarily be categorized into any particular field within a database. The crash reports, when pre-processed and grouped into clusters, can reveal insights that may have formerly been unrecognisable. The identified unusual and hidden relationships may be useful to government, businesses (insurance organisations, motoring associations) and individuals in better road asset management. As compared with a purely quantitative data set, textual information enables conclusions to be drawn about the circumstances that caused the accidents, as opposed to simply looking at what the accidents were.
2.2 Dataset
The data used for this analysis was collected from files containing information related to road crashes in the state of Queensland. The two files supplied are actual reports produced by traffic accident investigators within the period 2004 to 2005. They contain data for each road accident reported in this period according to 29 attributes, including the date, time, location and road conditions of the crash. More specifically, they include: Atmospheric, Carriageway, Crash_Description, Crash_Nature, Crash_Area, Crash_Date, Crash_Day_of_Week, Crash_Distance, Crash_Divided_Road, Crash_Landmark, Crash_Number, Crash_Speed_Limit, Crash_Time, District, Horizontal_Allignment, Lighting, Owner_ID, Roadway_Feature, Road_Section, Road_Surface, Traffic_Control and Vertical_Allignment. A preliminary review of the dataset helped us in determining the most interesting and important attributes to be used in our text mining analysis. For the purposes of text mining, the crash description was of highest significance. This is a character attribute containing values of up to 403 characters. The attribute "Street 2", with values such as "Warrego highway", which had over 241 reported crashes, is also an interesting piece of information to look into. It is noted that over 80% of crashes occurred in clear conditions, as shown in the attribute "Atmospheric", and over 65% of cases occurred during daylight hours. Also, a large number of reported crashes occurred on sealed and dry road surfaces, as shown in the attribute "Road_Surface"; only a small number of crashes occurred on wet surfaces. Another interesting attribute that came out of our observations was "Owner_ID". It represents the gender of the person involved in each of the described crashes, and it appears that almost all of them are male. The assumption is that this is due to a limitation of this particular crash report data; it may or may not be a true reflection of the actual distribution of gender involved in road crashes in a broader sense.
2.3 Process: Step 1 - Pre-processing
Pre-processing of textual information is a time-consuming task but is essential in order to achieve results that are of value to the users of the information. An initial scan through the data set identified a number of potential problems that needed to be addressed before any text mining could take place. These problems are mainly due to noise and various inconsistencies between the different records, possibly because different forensic experts wrote the notes.
Punctuation: Punctuation was often omitted or used extraneously, and no consistent information was apparent in its use. Therefore, to simplify the text mining, all punctuation was removed and replaced with spaces. Specifically, the following characters were replaced: ~`!@#$%^&*()_+-={}[]|\;':'<>?/,.
Broken words: The previous step resulted in some words with gaps, and there were also many gaps (spaces) within words that needed to be removed in order to obtain any value from the words. This is because fragments of broken words are not actually "new" words, and during the text mining process they have a low frequency, meaning that they are unlikely to be used. These gaps were consequently removed in order to obtain a more accurate result. Some examples are "trave lling" and "ro ad".
Inconsistency due to the use of abbreviations and different cases: Another problem encountered with the data set is that there are many inconsistencies between different records. An example of this would be "unit 1", where variants including "u1" and "unit one" were used throughout the data set. This presents a problem in the context of road crash text mining because names of roads and highways could be abbreviated in a multitude of ways. In order to provide any meaningful recommendations, abbreviations also had to be standardised. It was agreed that most value can be provided to the end user if the full word is used. Another example of inconsistency was the use of lowercase and uppercase to represent the same word. Consequently, all text was converted to lowercase to prevent the text mining tools from separating identical words that started with either upper or lower case. If this action were not taken, the same words would not be grouped, misleading the results and compromising the integrity of the analysis. Converting all the text to lower case also meant there were fewer combinations to code for when transforming abbreviations to their full descriptions.
Spelling mistakes: Spelling mistakes were another common problem encountered during the pre-processing phase. They were removed by filtering through each record and correcting mistakes throughout the data set. If spelling mistakes are not corrected, text mining tools do not recognise the words, nor do they group identical words. Some examples are "uint", "unti" and "utni", used frequently in place of the word "unit".
Common phrases: Finally, as part of the formatting functions, common phrases that comprise more than one word were combined into a single token to assist the text analysis. The data was processed for words in close proximity to each other to create common combinations. This was important, as combinations such as green light and police station do not have the same meaning if they are not combined; for instance, the car could have been green and crashed into a light pole, rather than the car going through a green light and being hit in the middle of an intersection. Table 1 presents examples of the phrases replaced with concatenated words; a small code sketch of these cleaning steps is given after the table.
Table 1: Example replacements for common phrases
Original Text: Traffic light, red light, turning lane, stop sign, green light, give way, lost control, police station, parking bay, failed to stop, road side, bruce highway, round about, towing a trailer
Replace With: red-light, stop-sign, turning-lane, stop-sign, green-light, give-way, lost-control, police-station, parking-bay, failed-to-stop, road-side, bruce-highway, roundabout, towing-a-trailer
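As an illustration only (not part of the original study, which did not publish its code), the cleaning steps described above can be sketched in Python; the replacement dictionaries below are small hypothetical subsets of the lists actually used:

import re

# Illustrative subsets of the replacement dictionaries used in the study
SPELLING = {"uint": "unit", "unti": "unit", "utni": "unit", "trave lling": "travelling"}
ABBREVIATIONS = {"u1": "unit 1", "unit one": "unit 1"}
PHRASES = {"red light": "red-light", "stop sign": "stop-sign", "green light": "green-light",
           "give way": "give-way", "lost control": "lost-control",
           "bruce highway": "bruce-highway", "towing a trailer": "towing-a-trailer"}

def preprocess(description: str) -> str:
    text = description.lower()
    text = re.sub(r"[~`!@#$%^&*()_+\-={}\[\]|\\;:'<>?/,.]", " ", text)  # punctuation -> spaces
    for wrong, right in {**SPELLING, **ABBREVIATIONS}.items():          # spelling and abbreviations
        text = text.replace(wrong, right)
    for phrase, joined in PHRASES.items():                              # multi-word phrases -> single tokens
        text = text.replace(phrase, joined)
    return re.sub(r"\s+", " ", text).strip()                            # collapse extra whitespace

print(preprocess("U1 lost control at the stop sign on Bruce Highway."))
# -> "unit 1 lost-control at the stop-sign on bruce-highway"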
2.4 Process: Step 2 – Text Mining
The process of text mining includes converting the unstructured text data into structured data, clustering the crash reports to identify links between common factors reported in the crash reports, and viewing the concept links. We employ the Leximancer tool [13], based on Bayesian theory [14], to assess each word in the dataset and predict the concepts being discussed. It learns which word predicts which concept (or cluster) and forms concepts (or clusters) based on associated terms. It thus positions clusters based on the terms that they share with other clusters. It constructs a conceptual graphical map by measuring the frequency of occurrence of the main concepts and how often they occur close together within the text. A concept is treated as a cluster. Each term appearing in the text data is analyzed to form a concept, allowing black-box discovery of patterns that may not otherwise be known. Concepts that are similar are merged and edited. For example, the concept list included concepts such as turn, turning and turned; direction, north, south, east and west; light and lights; approached and approaching; road and street; lane and lanes. Each of these combinations of concepts relates to the same thing, one word merely being a stem or variant of the other. As a result, these similar concepts are merged into one and renamed to reflect the true meaning of the concept; for example, Day, Time, Years and Week are merged into the single concept 'Time'. Some concepts that are not pertinent to the crash scenario being analysed, for example preceded and occurred, are removed. Many concepts are then put together to form a theme.
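Leximancer is a proprietary tool, so the following is only a rough, illustrative analog (not the method used in the study): term frequencies from the pre-processed reports are vectorised and grouped with a general-purpose clustering library, here scikit-learn, to suggest how reports might fall into themes. The report texts and parameter choices are hypothetical:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy pre-processed crash descriptions standing in for the real reports
reports = [
    "unit 1 travelling north lost-control and hit a tree",
    "unit 1 failed-to-stop at the red-light and collided with unit 2",
    "unit 1 slowed for traffic and unit 2 collided with its rear",
    "unit 1 towing-a-trailer lost-control on the bend",
]

# Keep hyphenated tokens such as lost-control produced during pre-processing
vectorizer = TfidfVectorizer(stop_words="english", token_pattern=r"(?u)\b[\w-]+\b")
X = vectorizer.fit_transform(reports)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)                      # theme assigned to each report

# The highest-weighted terms per cluster centre give a rough concept list per theme
terms = vectorizer.get_feature_names_out()
for i, centre in enumerate(kmeans.cluster_centers_):
    top = centre.argsort()[-3:][::-1]
    print(i, [terms[j] for j in top])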
Table 2: Example stop words excluded from the standard stop-word list (word - rationale)
Bald - "Bald" tyres may be a cause of an accident.
Hit - Hit may imply a collision.
Look, looking - Look and Looking may be referring to where a driver was looking when the accident occurred.
Fast - Speed, which may be the cause of an accident.
Indicate, Indicated - Whether a driver indicated left or right.
Right - Turning and merging right as opposed to left may have more of an impact in collisions (i.e. turning across traffic).
Following - Car could be following too closely to another vehicle.
Two - Could refer to Unit two or the number of vehicles involved in the accident.
3
ANALYSIS AND RESULTS
3.1 Dataset examination
The distribution of the data set is displayed in Figures 1 to 5, showing some significant correlations as well as disassociations between various attributes. The atmosphere was usually clear when the crash occurred, indicating that weather conditions were not a big factor in this dataset. The distribution of crash times is concentrated in the afternoon, especially between 3pm and 5pm. This is during afternoon peak times, when drivers are tired from working all day. Areas with a speed limit of 60 km/h were where most crashes occurred, followed by areas with a speed limit of 100 km/h. This is expected, as the majority of roads have a speed limit of either 60 km/h or 100 km/h. The absence of traffic control appears as the most important contributory factor for the crashes in this dataset. The three most significant crash natures were angle, hit fixed obstruction/temporary object, and rear-end. Figure 6 shows that over a quarter of the accidents (28%) are classified as "rear-end", which is a high proportion of the data given that there are 14 categories for this attribute. The data in Figure 7 displays the count of accidents grouped by the characteristic of the road where the accident occurred, according to the "Roadway Feature" attribute. The proportion of accidents that occurred at some form of intersection (i.e. cross, interchange, multiple road, roundabout, T junction and Y junction) is 95% (excluding Not Applicable). This would indicate that there might not be enough controls in place around intersections to avoid an accident.
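The attribute breakdowns summarised in Figures 1 to 7 amount to simple frequency counts over the report file; a minimal pandas sketch is given below. The file name is assumed for illustration, and the column names follow the attribute list given in Section 2.2:

import pandas as pd

# File name assumed for illustration; columns follow the attribute list given above
crashes = pd.read_csv("crash_reports_2004_2005.csv")

for column in ["Atmospheric", "Crash_Speed_Limit", "Traffic_Control",
               "Crash_Nature", "Roadway_Feature"]:
    counts = crashes[column].value_counts()
    print(counts / counts.sum() * 100)   # percentage breakdown, as summarised in Figures 1-7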
Figure 1 - ATMOSPHERIC attribute
Figure 2 - CRAS_TIME attribute
Figure 3 - CRAS_SPEED_LIMIT attribute
Figure 4 - TRAFFIC_CONTROL
Figure 5 - CRASH_NATURE attribute
Figure 7: Roadway Feature Summary
Figure 6 - Crash Nature Summary
Figure 8 – The cluster map
3.2 Cluster Analysis
The cluster map in Figure 8 shows the different clusters that were produced by the Leximancer text mining tool after the pre-processing had been performed. Several clusters are immediately obvious, communicating possible causes for road accidents including intersections, rear-ending and loss of control. The list of concept terms generated by the Leximancer tool includes roundabout, intersection, traffic, bend, lane, injuries, left and right, give-way, rear and speed, among others. These terms alone give a good indication of possible causes of road accidents, as they are the most frequently appearing terms in the sample text once stop words have been removed. The concepts travelling, Unit 1, Unit 2, road, right, vehicle and intersection have the highest relative counts. The two highest frequency concept terms are Unit 1 and Unit 2. Unit 1 occurs 12,774 times, whilst Unit 2 occurs only 7,286 times. Similarly, the concept terms 'left' and 'right' appear 3,111 and 6,446 times respectively. An analysis into why 'right' might appear more than twice as many times as 'left' revealed that perhaps more accidents occur in right-hand lanes or whilst performing right-hand turns. Indeed, the relationship between 'right', 'left' and 'intersection' showed that the concept term 'intersection' is accompanied by the concept term 'right' 73.5% of the time, whilst it is accompanied by 'left' only 18% of the time. An immediate assumption that could be made regarding the reason for Unit 1 appearing nearly twice as many times as Unit 2 is that single vehicle accidents occur more frequently than multi-vehicle accidents. However, this assumption may not necessarily be true; for example, Unit 1 may simply be repeated more times than Unit 2 within the same passage. Figure 9 indicates the strength of the relationship between Unit 1 and all other concept words, whilst Figure 10 indicates the strength of
the relationship between Unit 2 and all other concept words. The relationship from Unit 1 to Unit 2 is moderately strong, whilst the relationship from Unit 2 to Unit 1 is significantly stronger. The relative count of the first relationship shows that the term 'Unit 2' (7,286 occurrences) is closely related to the term 'Unit 1' 100% of the time. However, the second relationship shows that the term 'Unit 1' (12,774 occurrences) is closely related to the term 'Unit 2' only 57% of the time. This information indicates that 43% of the time a second vehicle is not involved. It is therefore possible to conclude that nearly half of all road crashes in this case study are single vehicle accidents.
Figure 9: Unit 1 Concept
Figure 10: Unit 2 Concept
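The relative counts quoted above (for example, 'Unit 1' being accompanied by 'Unit 2' 57% of the time) are, in essence, conditional co-occurrence percentages over the reports. A minimal sketch of one way to compute such a percentage is given below; the report texts are toy examples, and this is not the internal calculation performed by Leximancer:

def cooccurrence_percent(reports, term_a, term_b):
    """Of the reports mentioning term_a, the percentage that also mention term_b."""
    with_a = [r for r in reports if term_a in r]
    with_both = [r for r in with_a if term_b in r]
    return 100.0 * len(with_both) / len(with_a)

# Toy pre-processed reports, for illustration only
reports = [
    "unit 1 lost-control on the bend",
    "unit 1 collided with unit 2 at the intersection",
    "unit 1 failed-to-stop and hit unit 2 in the rear",
]

print(cooccurrence_percent(reports, "unit 2", "unit 1"))   # 100.0 on this toy data
print(cooccurrence_percent(reports, "unit 1", "unit 2"))   # about 66.7 here; 57% in the study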
With the above discoveries in mind, the clusters can now be analysed to identify meaning in the grouping of concept words and their relative locations. The first meaningful clusters are the ‘vehicle’ and ‘lost-control’ clusters (as shown in Figure 11). These clusters appear in close proximity to each other, in fact overlapping, indicating a strong relationship between the two clusters. These two clusters include key words such as ‘towing’, ‘trailer’, ‘speed’ and ‘lost-control’. One possible conclusion that could be drawn from these concept words is that drivers can often lose control of their vehicles when speeding. Another is that drivers can easily lose control of their vehicles when towing a trailer. The ‘vehicle’ cluster may also indicate a relationship between these conclusions and ‘single driver’ accidents or ‘single vehicle’ accidents.
Figure 11: Driver Control Cluster
Figure 12: Rear-ending Cluster
The second meaningful cluster (as shown in Figure 12) is the 'rear' cluster, which overlaps with the 'Unit 2' cluster. The concept words of the cluster include 'slowed', 'stop', 'time', 'collided' and 'rear'. This combination of words could indicate a scenario of rear-ending, a common form of car crash in suburban areas. The concept terms 'collided' and 'rear' alone would suggest this to be the case. However, this is also supported by the terms 'stop' and 'time', indicating that perhaps a vehicle could not 'stop in time' and as a result collided with the vehicle in front.
Figure 13: Intersection Cluster
Figure 14: Speed Concept
The 'intersection' cluster (as shown in Figure 13) also seems to convey quite meaningful information. An immediate observation is that this cluster overlaps with both the 'Unit 1' and 'Unit 2' clusters, indicating that accidents at intersections often involve two vehicles. The 'intersection' cluster includes interesting concept words such as 'red light', 'green light', 'intersection', 'intending' and 'give way'. These key words might indicate that 'giving way' (or the lack thereof) at intersections is perhaps a common cause of road accidents. This cluster may also suggest that traffic lights are often involved in crashes at intersections. Although this data alone does not tell us exactly how the traffic lights might be related to the accidents, one conclusion is that perhaps people are not stopping for red lights or simply do not see them. However, the relationship between traffic lights and causes of crashes is an area for further investigation. Although these three clusters were perhaps the most meaningful, further conclusions could be drawn from the remaining clusters with further analysis. These three clusters in particular were chosen for analysis as they indicated possible causes for road accidents. The implications of these findings are discussed later. One last area of interest is how road accidents are influenced by speeding. An analysis of the relationship between the concept word 'speed' and all other concept words indicates that speeding is perhaps a cause of more accidents in 'low speed' areas such as roads or streets than in 'high speed' areas such as highways. As can be seen in Figure 14, the 'speed' concept is correlated with 'road' or 'street' (these two words were grouped in pre-processing) 63.6% of the time, whereas highway and motorway appear with speed only 17.6% and 13.2% of the time respectively. Whilst the word 'speed' alone does not necessarily indicate speeding, it can be assumed that if the word appears in a crash report, speed was an influencing factor.
4
DISCUSSION AND CONCLUSION
Several conclusions were drawn from the analysis conducted above. These conclusions involved: (1) the likelihood of a second vehicle being involved in an accident; (2) the likelihood of an accident when turning right as opposed to turning left; (3) the influence of towing a trailer on losing control of a vehicle; (4) the influence of speed on losing control of a vehicle; (5) a driver's inability to stop resulting in a rear-ending accident; (6) the likelihood of more than one vehicle being involved in an intersection accident; and (7) the influence of speed zone category on speeding accidents. From these conclusions, various recommendations can be made. Our proposed recommendations are as follows: (a) greater awareness should be raised regarding following another vehicle too closely, better known as tail-gating; such awareness may help reduce the number of incidents related to rear-ending. (b) determine new, and improve existing, controls to prevent rear-ending accidents through signalling by the immobile vehicle; this would involve developing and improving methods of showing approaching vehicles, on either side, that a vehicle is immobile, both for the vehicle itself and for any trailer attached to it. (c) to further enhance future analysis of accident data, information capture can be improved by recording the presence/absence of right-hand turning lanes at the intersection (for those accidents occurring at an intersection). (d) determine new, and improve existing, roadway features; as mentioned in point (c), turning lanes are used to improve safety at intersections, but if these cannot be installed at certain intersections there may be a need to develop alternative controls. Another consideration is whether turning lanes actually reduce accidents at intersections, which could be the subject of additional research. (e) drivers purchasing trailers should be made aware of the difficulty of controlling such vehicles and the implications associated with this. (f) speeding campaigns should target low speed zone areas rather than high speed zone areas, and speed cameras should be utilized in low speed areas more often to discourage speeding in these problem areas. (g) furthermore, drivers should be reminded of 'give way' rules, and there should perhaps be a greater focus on these rules during driving exams, particularly focusing on right-hand turns. Table 4 lists recommendations for improvement in road assets according to the features that were highlighted during the text mining analysis of the crash data set. Finally, this paper has focused on the causes of road accidents and has not considered the consequences of such accidents, but recognises that this is an equally significant area of concern. Whilst there is some information regarding injuries and damage to vehicles, it is recommended that further research be conducted into the consequences of road accidents.
Table 4: Recommendations according to the features of interest from the crash data set
Other features: Losing control of vehicle or rolling on embankment. Motorcyclist accidents. Speeding through intersections. Failure to obey signs. Collision with inanimate objects. Blood samples taken. Accidents at traffic lights. Police did not attend scene/minor accidents. Collisions due to right-hand turns. Towing trailers. Serious accidents requiring hospitalisation. Rear-end collisions.
Recommendations for improvement (if possible): Increased signage of accident-prone areas. Install road barriers if feasible. Increased regulations for gaining a motorcycle licence. Encourage motorcyclists to be more careful on roads at all times. This figure is concerning. More driver education is needed. Fixed speed cameras could be considered. Public awareness campaign might be required. Decrease speed limits if warranted. Stricter consequences for violation of traffic regulations. Reflector strips on guardrails, other inanimate objects. Install parking lane for stopped vehicles. Increase visibility of traffic lights or install signage on approach to lights. Improve traffic lights at Southport and Nerang. Consider installing a right-hand turn lane. Create signalised intersections. Educate drivers with long/heavy loads how to manoeuvre vehicle properly. Investigate each of these accidents separately. Remind drivers to keep a safe distance behind other vehicles at all times.
5
REFERENCES
1 Australian Government, Department of Infrastructure, Transport, Regional Government and Road Safety. (2008) Road Safety, http://www.infrastructure.gov.au/roads/safety/, Retrieved October 2008.
2 Queensland Fire and Rescue. (18/09/2002) Firefighters called to record number of road crashes, http://www.fire.qld.gov.au/news/view.asp?id=207, Retrieved October 2008.
3 Abugessaisa, I. (2008) Knowledge discovery in road accidents database – Integration of visual and automatic data mining methods. International Journal of Public Information Systems, 2008 (1), 59-85. Retrieved October 20, 2008, from Emerald Insight database.
4 Gitelman, V. & Hakkert, A. S. (1997) The evaluation of road-rail crossing safety with limited accident statistics. International Journal of Accident Analysis & Prevention, 29 (2), 171-179. Retrieved October 20, 2008, from Emerald Insight database.
5 Gurubhagavatula, I., Nkwuo, J. E., Maislin, G., & Pack, A. I. (2008) Estimated cost of crashes in commercial drivers supports screening and treatment of obstructive sleep apnea. International Journal of Accident Analysis & Prevention, 40 (1), 104-115. Retrieved October 20, 2008, from Emerald Insight database.
6 Chatterjee, S. (1998) A connectionist approach for classifying accident narratives. Purdue University.
7 Li-Yen, C. & Wen-Chieh, C. (2005) Data mining of tree-based models to analyze freeway accident frequency. Journal of Safety Research.
8 Tseng, W., Nguyen, H., Liebowitz, J., & Agresti, W. (2005) Distractions and motor vehicle accidents: Data mining application on fatality analysis reporting system (FARS) data files. Industrial Management and Data Systems, 109 (9), 1188-1205. Retrieved October 20, 2008, from Emerald Insight database.
9 Queensland University of Technology. (2008) QUT Centre for Accident Research and Road Safety, http://www.carrsq.qut.edu.au, Retrieved October 22, 2008.
10 Hearst, M. A. (1999) Untangling Text Data Mining. The 37th Annual Meeting of the Association for Computational Linguistics, Maryland, June 20-26, (invited paper). http://www.ischool.berkeley.edu/~hearst/text-mining.html, Accessed 20 April 2007.
11 Jain, A. K., Murty, M. N., & Flynn, P. J. (1999) Data Clustering: A Review. ACM Computing Surveys (CSUR), 31 (3), 264-323.
12 Grossman, D. & Frieder, O. (2004) Information Retrieval: Algorithms and Heuristics. 2nd edn., Springer.
13 Smith, A. E. & Humphreys, M. S. (2006) Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping. http://www.leximancer.com/documents/B144.pdf, Accessed 28 May 2007.
14 Han, J. & Kamber, M. (2001) Data Mining: Concepts and Techniques. San Diego, USA: Morgan Kaufmann.
Acknowledgments We would like to thank the CRC for Integrated Engineering Asset Management (CIEAM) for providing us with the opportunity to conduct this case study. We would also like to thank the students of ITB239 and ITN239: Enterprise Data Mining for conducting some of the experiments, and Dan Emerson for assisting us in reformatting the figures.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
SCHOOL BUILDINGS ASSETS – MAINTENANCE MANAGEMENT AND ORGANIZATION FOR VERTICAL TRANSPORTATION EQUIPMENTS Andrea Alonso Pérez a,b, Ana C. V. Vieira b,c and A. J. Marques Cardoso b
a Universidade de Vigo, ETSII, Campus Universitario Lagoas-Marcosende, C.P. 36310 Vigo, España.
b Universidade de Coimbra, FCTUC/IT, Departamento de Engenharia Electrotécnica e de Computadores, Pólo II – Pinhal de Marrocos, P – 3030-290 Coimbra, Portugal.
c Instituto Politécnico de Tomar, Escola Superior de Tecnologia de Tomar, Departamento de Engenharia Electrotécnica, Estrada da Serra – Quinta do Contador, P – 2300-313 Tomar, Portugal.
Maintenance of educational building assets is an important tool not only for the wellbeing of students and other users, but also as an economic instrument for maximizing the life cycle of items and minimizing maintenance costs. The law in force regarding the Abolition of Architectural Barriers was conceived to facilitate access to buildings for people with physical disabilities, but even after its endorsement there are still Portuguese schools without easy access, or even none at all, for such individuals. The vertical transportation of persons helps to avoid this discrimination, as it provides access to all the building floors, with the highest possible comfort, for people whose mobility is reduced. This paper addresses the maintenance management and organization of vertical transportation equipment. A standard example of a Maintenance Plan for Portuguese schools is proposed, focusing on the elements and infrastructures related to the vertical transportation of people with physical disabilities. A cost analysis of the maintenance and operation of these infrastructures is also presented, in order to reach the optimum point that allows the reduction of these costs without reducing the reliability of components, thus improving their useful life and safeguarding the wellbeing of users. Key Words: Costs evaluation, assets maintenance management, maintenance planning and scheduling
1
INTRODUCTION
For the particular case of school buildings, a great number of maintenance actions may be organized in a systematic way, with foreseeable costs and controlled funds. It is considered adequate to adopt preventive maintenance strategies, either condition based or planned and scheduled in time. Although this might be an effective method for preserving schools, preventive maintenance is highly dependent on the availability of human resources and budgets. Preventive maintenance efforts range from visual inspection only, to performance testing and analysis; from minor adjustments, cleaning and/or lubrication, to complete overhauls; from reconditioning, to complete replacement. One must identify the adequate maintenance strategy to follow. Simultaneously, a specific maintenance plan must be developed for each item. The maintenance program may be divided into several lists of planned activities, allowing for unplanned reactive activities and deferred ones. The elevator life cycle is longer than that of other transport systems, which is why their design, operation, safety and accessibility can lag behind new technologies, making access difficult for people with disabilities. For example, in the European Union there are more than four million elevators, 50% of which are more than twenty-five years old and do not have an appropriate level of safety and accessibility by today's standards [1].
The Portuguese Law 123/97 of the 22nd of May of 1997 enumerates the technical standards that regulate building accessibility for people with disabilities [2]. A study developed in the framework of the Portuguese secondary school level of education, the CARMAEE study [3], indicates that only 64.87% of the schools participating in the study report the amount of money spent on outsourcing, for example on the maintenance of elevators and climate control systems. According to the same study, Portuguese schools have adapted their buildings for access by people with disabilities only with the following elements and percentages [3]:
- Elevators: 40.24%
- Ramps: 71.60%
- Wheelchair elevator platforms: 2.96%
To accomplish the maintenance plan, a period of time has to be considered that may, according to the literature, guarantee the reliability of the elevator and the safety of its users [2-8]. To settle on the maintenance strategy to apply, one must consider the complexity of the procedures, the process time, and the technicians required. According to the literature, maintenance operations regarding the vertical transportation of persons may be grouped into two different types. The first group includes maintenance activities that can be accomplished by an employee of the school, due to their low incidence on the reliability of the equipment and on user safety. This group of maintenance activities includes ground cleaning and button illumination, for example. The second group of vertical transportation maintenance activities includes those that, according to current law, should be accomplished by certified companies, Elevators Maintenance Companies (EMC). These activities have a technical character and, since they have a high incidence on the reliability and safety of the element, they should therefore be part of the maintenance objectives. When developing the maintenance program, two different types of maintenance procedures were considered: inspections and maintenance action procedures. By inspections, one means the procedures regarding maintenance activities aiming to verify the global state of operation of the elevator; they can be partial inspections or tests. The inspections will be made on preset dates and, according to their technical complexity, can be carried out by a school employee or by a technician from the EMC. Maintenance action procedures include operations carried out according to the inspection results, or due to bad operation or damage of the equipment. As with the inspections, they will be accomplished by a technician or an employee, depending on their technical complexity. The operations to be accomplished by the technicians of the EMC shall be performed according to a preventive maintenance plan, including procedures differentiated by month, quarter, etc. On the other hand, the maintenance activities to be accomplished by school employees shall result from the following events:
- At the beginning of the academic period: preventive maintenance.
- During the academic period: preventive maintenance.
- At the end of the academic period: preventive maintenance.
- When there is damage or bad operation: corrective maintenance.
It is considered opportune to use preventive maintenance as the general model, with the aim of accomplishing the defined objectives, resorting to corrective maintenance when necessary.
The maintenance plan is a very important tool since its periodicity regulates the vertical transport function, directly affecting reliability and safety. There are several manuals, legislation, standards and technical documents according to which tests and inspections of the equipment should be accomplished at a stipulated periodicity, in order to increase equipment reliability and safety. The elevator is a fully automatic piece of equipment. The departure, acceleration and stop functions are accomplished automatically in response to the calls, without the need for a human command. Therefore, the accomplishment of regular inspections is indispensable to guarantee users safe and comfortable trips. According to elevator manufacturers, elevators must be checked every month to guarantee user safety, the reliability of the equipment and its performance [4]. All maintenance reports must be kept with the equipment documents. These records should be kept up to date whenever there are modifications in the equipment characteristics, and they should be available in the School Unit for those in charge of the maintenance as well as for the bodies responsible for the accomplishment of tests and inspections.
2
MAINTENANCE PLAN OF VERTICAL TRANSPORT ELEMENTS
2.1 Maintenance activities to be accomplished by school employees
Schools that have among their staff employees prepared to carry out maintenance work may organize these persons' work to accomplish the proposed activities. This will allow them not only to optimize their productivity but also to reduce costs, which is especially important nowadays since schools usually struggle with reduced budgets. Table 1 shows the activities to accomplish in each period of the year. Moreover, employees can also accomplish other daily recommendable inspections such as [6]:
- Test if there are abnormal noises when the elevator is in movement.
- Test that the platform moves without any interference.
- Test that the bridge plate moves without any interference when it is folding or unfolding.
- Test that the front barrier closes and blocks immediately after the platform ascends from the ground.
Table 1 Maintenance activities to be accomplished by school employees [5]
Before the academic period: Turn on the equipment according to the manufacturer's orientations. Travel inside the elevator stopping on every floor. Test the alarm button. Test the door opening button. Verify the telephone operation. Turn on the lights of the floors and platforms of the elevators, also verifying if the access doors to the corridors are locked. Verify the equipment integrity.
During the academic period: Coordinate the cabin and corridors cleaning. Verify that there are no objects obstructing the accesses and circulations. Travel inside the elevator stopping on every floor. Test the alarm button. Test the door opening button. Verify the telephone operation. Verify the equipment integrity.
After the academic period: Send the elevator to the first floor. Turn off the elevator in the command board, according to the manufacturer's orientation, and lock the board. Verify that the corridor elevator doors are locked, and turn off the corridor lights.
In case of damages or bad operation: Contact the EMC, requiring assistance. Place a poster indicating “Elevator in Maintenance”.
Reprinted from: School Elevators Usage and Maintenance (in Portuguese), 2005.
Elevator damages and/or bad operation may usually be associated with the following scenarios [5]:
- The minimum time the cabin door remains open is less than 3 seconds.
- The unevenness between the cabin floor and the landing floor does not guarantee safe and perfect accessibility.
- There are buttons without light, buttons that do not obey the command, or the elevator does not answer the call at the landing.
- There is alarm information in the cabin or landing panels.
- There is cabin swinging, or excessive vibration or noise, during the transport.
Cleaning deserves a special mention since it is usually outsourced, despite its importance in avoiding accidents and ensuring user safety. Water infiltration in the facilities is harmful to the equipment, therefore the cabin, ground and walls of the elevator must always be kept dry. This procedure is also important in order to avoid slipping and possible injuries to users. Special care should be taken in cleaning the cabin interior, following the orientation of the manual supplied by the elevator manufacturer. For both stainless steel and laminated finishes, soap or neutral detergent and water should be used, applied with a cloth or sponge. Aggressive chemical products or abrasive materials, such as steel wool, should never be used [6].
2.2 Maintenance Plan to be accomplished by certified Elevators Maintenance Companies (EMC)
The legislation that regulates elevator maintenance in Europe is scarce. With regard to school facilities, for example, it only states the necessity for an inspection every two years by competent entities. On the other hand, manufacturers and other entities offer several recommendations. Although the legislation on elevator safety inspections may differ according to locality, country, etc., technicians usually employ similar procedures [7]. In the hoist way, in the pit, or on top of the cabin, the technician must organize the inspection in a safe and orderly way to avoid injuries not only to himself but also to people occupying the immediately adjacent areas. To accomplish this work, loose clothing must be avoided. The technician must be equipped with safety or protective glasses in the hoist way, since airborne particles there might cause eye injury. Minimum tools include a flashlight, a 183 cm wooden rule, a magnifying glass, a small mirror, and a multimeter to measure voltages and grounds [7]. The minimum specific maintenance for all traction elevators may be accomplished according to the periodicity and procedures presented in Table 2:
Table 2 Maintenance plan to be accomplished by the EMC [7, 8]
Periodicity
Procedures to accomplish
Actions according to inspection’s result
Weekly
Perform general inspection of machinery, sheaves, worm and gear motor, brake and selector of floor controllers.
Lubricate as necessary.
Empty drip pans, discard oil in an approved manner and check reservoir oil level. Observe brake operation.
Adjust or repair if required.
Inspect machinery, contacts, linkage and gearing.
Lubricate.
Inspect brushes and switches.
Clean and repair.
Inspect controllers, selectors, relays, connectors, contacts, etc.
Clean and repair.
Ride car and observe operation of doors, levelling, reopening devices, pushbuttons, lights, etc. If rails are lubricated, check conditions and lubrication service lubricators. Verify lamps in elevator cars, machine room, pit, hall lanterns, etc.
Replace all burned out.
Inspect all machine room equipment.
Remove garbage, dust, oil, etc.
Clean trash from pit and empty drip pans. Check condition of car switch handle.
Replace emergency release glass.
Check governor and tape tension sheave lubrication.
Lubricate.
Verify lamps in all lanterns, push buttons, car and corridor position indicators, direction indicators and in other signal fixtures.
Replace all burned out.
Bi-monthly
Observe operation of elevator throughout its full range of all floors. This procedure serves to test controls, safety devices, levelling, relieving, and other devices.
Quarterly
Check door operation: brakes, checks, linkages, gears, wiring motors, check keys, set screw, contacts, chains, cams and door closer.
Clean, adjust and lubricate.
Check selector, brushes, dashpots, travelling cables, chain, pawl magnets, wiring, contacts, relays, tape drive and broken tape switch.
Clean, adjust and lubricate.
Check car: car door and gate tracks, pivots, hangers, car grill, side and top exits.
Clean, adjust and lubricate.
Inspect interior of cab: test intercommunication system, normal and emergency lights, fan, emergency call system or alarm, car station.
Repair.
Visually inspect controller, contacts and relays.
Adjust or replace.
Observe operation of signal and dispatching system.
Repair.
Inspect compensating hitches, buffers, rope clamps, slack cable switch, couplings, keyways and pulleys. Check load weighing device and dispatching time settings.
Clean, adjust, repair, lubricate and replace.
Check oil level in car and counterweight oil buffers.
Add oil as required.
Check brushes and commutators. Inspect switches for finish, grooving, eccentricity and mica level.
Clean, adjust, repair, replace or refinish to provide proper commutation.
Inspect brushes for tension seating and wear.
Replace or adjust.
Check car ventilation system, car position indicators, direction indicators, hall lanterns and car and hall buttons.
Replace or adjust.
Check levelling operation: levelling switches, hoist way vanes, magnets, and inductors.
Clean, adjust and repair.
Check hoist way doors: car door or gate tracks, hangers, and up thrust eccentrics, linkages jibs and interlocks.
Clean and lubricate.
Check car door or gate tracks, pivots, hangers. On hoist ways doors: tracks, hangers and eccentrics, linkages jibs and interlocks.
Clean, adjust and lubricate.
Inspect all fastening and ropes for wear and lubrication: governors and hoist ropes. Inspect all rope hitches and shackles and equalize rope tension.
Clean, lubricate, balance the tension rope, repair.
Inspect hoist reduction gear brake and brake drum, drive sheave and motor, and any bearing wear. In the car, test alarm bell system: fixtures, retiring cam devices, chain, dashpots, commutators, brushes, cam pivots, fastenings.
Clean and adjust.
Test emergency switch: Inspect safety parts, pivots, setscrew, switches, adjustment of car and counterweight jibs, shoe or roller guides.
Replace, lubricate and repair.
In the pit, verify compensating sheave and inspect hitches.
Lubricate.
Inspect governor and tension sheave fastenings.
Tape tension sheave fastenings
Verify oil drip pans.
Clean and empty.
Verify all parts of safeties and moving parts.
Clean and lubricate.
Check clearance between safety jaws and guide rails.
Adjust.
Visually inspect all safety parts.
Semi-Annually
Annually
Examine governor rope.
Clean and replace.
Check controller, alignment of switches, relays, timers, contacts, hinge pins, etc.
Clean with blower, adjust and lubricate.
Check all resistance tubes and grids, oil in overload relays, settings and operation of overloads.
Lubricate and adjust.
Inspect fuses and holders and all controller connections. In hoist way examine guide rails, cams and fastenings.
Clean.
Inspect and test limit and terminals switches.
Replace or repair.
Check car shoes, jibs or roller guides.
Adjust or replace.
Check all overhead cams, sheaves, sills, bottom of platform, car tops, counterweights and hoist way walls.
Clean.
Inspect sheaves to ensure they are tight on shafts. Sound spokes and rim with hammer for cracks.
Adjust or repair.
Examine all hoist ropes for wear, lubrication, length and tension.
Replace, lubricate and adjust.
On tape drives, check hitches and broken tape switch.
Replace or repair.
Check car stile channels for bends or cracks; also car frame, cams, supports and car steadying plates.
Replace or repair.
Examine moving parts of vertical rising or collapsible car gates. Check pivot points, sheaves, guides and track wear.
Lubricate and replace.
Inspect guide shoe stems.
Lubricate and replace.
Check governor and tape tension sheave fastenings.
Tape tension sheave fastenings
For bi-parting doors, check: chains, tracks and sheaves, door contacts.
Clean, lubricate, repair or replace.
Clean car and counterweight guide rails using a non-flammable or high flash point solvent to remove lint dust and excess lubricant. Examine brake cores on brakes, linings, and inspect for wear.
Remove, clean, and lubricate. Correct excess wear and adjust.
Examine reservoirs of each hoisting motor and motor generator. Drain, flush and refill. Check all brushes for neutral settings, proper quartering and spacing on commentators.
Restore.
Group supervisory control systems installed must be checked out. The systems, dispatching scheduling and emergency servicing must be tested and adjusted in accordance with the manufacturer’s literature.
Reprinted from: Procurement Services Group: Elevator Maintenance, 2000 and Maintenance Engineering Handbook, 2002.
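As an illustration of how a periodicity-based plan such as Table 2 can be operationalised, the following is a minimal sketch that turns an abridged, illustrative subset of the tasks into a weekly checklist; the task wording, the week-based periodicities and the scheduling rule are assumptions made for illustration, not part of the cited sources:

```python
# Abridged, illustrative subset of Table 2, with periodicities expressed in weeks.
PLAN = [
    ("Inspect machinery, sheaves, brake and floor selector; lubricate as necessary", 1),   # weekly
    ("Observe elevator operation throughout its full range of floors",               8),   # bi-monthly
    ("Check door operation, selector, car interior and signal systems",             13),   # quarterly
    ("Examine governor rope; check controller alignment",                           26),   # semi-annually
    ("Clean guide rails; examine brake cores and hoist ropes",                      52),   # annually
]

def tasks_due(week_number):
    """Return the tasks whose periodicity divides the given maintenance-calendar week."""
    return [task for task, every_n_weeks in PLAN if week_number % every_n_weeks == 0]

# Example: checklist handed to the EMC technician for week 26 of the maintenance calendar.
for task in tasks_due(26):
    print("-", task)
```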
Elevator maintenance must consider three different areas: the hoist way, the pit and the machine room [7].
- Hoist way:
• Mechanical equipment: sheaves, buffers, door closers, floor selectors, limit switches, hoist way door hangers, closers and door gibs, and interlocks, to be sure they are in the right condition.
• Hoist and governor ropes and their fastenings, to avoid wear and rust.
• Travelling cables, to verify their state, and cables, for vibration and wear or tear.
• Rails, for correct alignment.
• Steadying plates, for excessive fluctuations.
- Pit: oil level in the buffer, rope stretch, debris or water leaks, and safety shoes.
- Machine room:
• Motors and generators should be clean, undercut and conditioned correctly, and brushes verified.
• Bearings must be inspected for wear or tear.
• Electric equipment must be grounded.
• Brakes and brake belts, to avoid defective safety.
• Shafts into the pulley, for correct alignment.
• Gears and bolts must be inspected to ensure that they are not loose or broken.
• Controllers, for correct operation.
• Switches.
• Safety equipment, for blocked or shorted contacts.
• Governors, to certify that the rope is appropriately placed in the sheave and correctly lubricated, and that rust does not affect its operation.
• Landing equipment, to verify whether broken buttons exist and whether the illumination is correct.
The maintenance operations to accomplish in the case of a hydraulic elevator would be the same as those for a traction elevator, but in addition the following must be considered [7]:
- Packing of the upper part of the cylinder and piston, to make sure that the oil is not draining excessively or returning to the tank through a deviation.
- Hydraulic machines, to be sure that there is enough oil so that the car still reaches the top landing with oil left in the tank.
- Load oil average, for cleaning.
It is recommended that safety tests be conducted by the elevator manufacturer. In addition to ensuring compliance with safety regulations, inspections contribute to the extension of the equipment's life through preventive care and adjustment. A comprehensive elevator maintenance program may also include a full-load safety test, to be carried out every 5 years, and annual no-load tests [7].
3
MAINTENANCE CONTRACT
As previously stated, school organizations must decide how to organize the elevator maintenance: whether to carry out internal maintenance, to hire the maintenance services of an independent company, or to use those offered by the company that manufactured and installed the elevators [5, 7]. EN 13269:2006 provides guidelines for the preparation of maintenance contracts. Maintenance contracts may be differentiated by the coverage offered, namely total maintenance or partial/conservative maintenance [9]. Partial or conservative maintenance is the simplest form of contractual service. This type of maintenance contract covers lubrication and minor cleaning of elevator equipment and usually does not include parts replaced during maintenance activities or overtime emergency calls. The maintenance agreement limits the responsibility of the contractor and, while protecting the elevator contractor, such contracts are potentially costly to the owner. Contract prices shall be adjusted every year according to changes in labour costs [7]. Total maintenance contracts usually cover both the parts and the work required to maintain the elevator system. Emergency call service may or may not be included, depending on individual needs. The exclusions in total maintenance contracts are normally fewer than in partial/conservative maintenance contracts, and the contractor's responsibility is greater since it is responsible for the preventive maintenance. Whenever a total maintenance contract excludes mandatory safety tests, the school administration must provide trained employees to shut down the equipment in case unsafe conditions develop. The omission of these items by the contractor increases the maintenance cost supported by the owner [5, 7]. Contract prices and services vary greatly from company to company, depending on the contract type and the services required. Table 3 presents some considerations about procedures to be followed before making a maintenance contract or when revising it [5].
Table 3 Advice on how to choose a maintenance contractor and procedures to adopt during the contract's life [5]
Terms to consider when choosing a maintenance contractor:
- Demand the company's certificate of register in the regulating agency, as well as the document corroborating that the responsible engineer is properly registered.
- Verify if the company has vehicle, telephone, workshop, and parts of the installed equipment.
- Verify if the contract includes 24 hours attendance.
- Verify the references and services given to other customers.
Procedures to adopt during the contract's life:
- Demand the technician's functional identity.
- Control frequency and visiting hours.
- Demand a copy of the service record filled out and signed by the person responsible.
- Demand a receipt, a guarantee of the use of original parts, and a record of the executed services.
Reprinted from: Foundation for the Education Development, 2005.
Together with the maintenance contract, the following documents and information have to be provided so that they can be easily accessed when necessary:
- The date when the equipment was put into service.
- Basic characteristics.
- Traction handles’ characteristics.
- Characteristics of the parts for which an inspection certificate was requested.
- Designs of the installation.
- Schematic diagrams of the electric circuits.
- Licenses of installation and operation (or the documentation that substitutes them, according to legislation).
- Technical guarantee of the equipment.
- Maintenance contract.
- Technical dossier of the equipment (supplied by the manufacturer and/or constructor).
- Manuals of the equipment (from the manufacturer).
- Maintenance manual of the equipment.
- Inspections, tests and verifications.
4
MAINTENANCE COSTS
Nowadays the maintenance costs depend on several factors, such as:
- Load capacity
- Velocity
- Number of floors
- Automatic/semi-automatic doors
- Kind of work: universal, with memory, etc.
- Type of maintenance
- 24 hours service
The following is an example of the annual maintenance cost for a standard elevator, provided by an elevator company:
- Total maintenance: € 2 400.00
- Partial maintenance: € 1 056.00
- 24 hours service: € 180.00
There is a big difference between the total and the partial maintenance cost. The school management executive board has to consider both options and choose the one that better fits their maintenance objectives.
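As a simple worked comparison, using only the example figures quoted above (the combinations shown are illustrative and not a pricing model offered by any company), the annual cost of each contracting option can be tabulated as follows:

```python
# Example annual prices quoted above (euros).
TOTAL_MAINTENANCE = 2400.00
PARTIAL_MAINTENANCE = 1056.00
SERVICE_24H = 180.00

options = {
    "Total maintenance": TOTAL_MAINTENANCE,
    "Total maintenance + 24 hours service": TOTAL_MAINTENANCE + SERVICE_24H,
    "Partial maintenance": PARTIAL_MAINTENANCE,
    "Partial maintenance + 24 hours service": PARTIAL_MAINTENANCE + SERVICE_24H,
}

for name, annual_cost in sorted(options.items(), key=lambda item: item[1]):
    print(f"{name}: EUR {annual_cost:,.2f} per year")
```

Note that the partial option usually excludes replaced parts and emergency calls, so its apparent saving may be partly offset by extra invoices during the contract's life.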
By comparing this information with that from the CARMAEE project, it can be concluded that the average cost of € 1 679.05 for elevator maintenance in Portuguese schools is within the limits provided above.
5
CONCLUSIONS
There are more than four million elevators in the E.U., 50% of which are more than twenty-five years old and do not have an appropriate level of safety and accessibility by today's standards. The elevator life cycle is longer than that of other transport systems, which is why their design, operation, safety and accessibility can lag behind new technologies, making access difficult for people with disabilities. This paper has proposed a Maintenance Plan for the vertical transport equipment of Portuguese schools, with the aim of providing some general guidelines for school management executive boards.
6
REFERENCES
1 Technical Specification CEN/TS 81-82:2008 (2008) Safety rules for the construction and installation of lifts. Existing lifts. Part 82: Improvement of the accessibility of existing lifts for persons including persons with disability, AENOR, Madrid.
2 Ministry of Solidarity and Social Security (1997) Law nº 123/97 of the 22nd of May of 1997 (in Portuguese), Diary of the Republic nº 118, 22nd of May of 1997, Portugal.
3 Cação, C., Silva, F. & Ferreira, H. (2004) CARMAEE: Characterization of the Maintenance in School Buildings (in Portuguese), Final Year Project Dissertation, Department of Electrical and Computer Engineering, University of Coimbra, Portugal.
4 ThyssenKrupp Elevators (2005) Your Elevator: Preventive Maintenance (in Portuguese), 12th edition.
5 Foundation for the Education Development (2005) School Elevators Usage and Maintenance (in Portuguese), São Paulo.
6 Phantom, Phantom Elevator Maintenance Guidelines (in Spanish).
7 Robertson, J. (2002) Maintenance of Elevators and Special Lifts. In: L. R. Higgins and R. K. Mobley (eds.) Maintenance Engineering Handbook, 6th edn., McGraw-Hill, United States of America.
8 New York State Office of General Services (2000) Procurement Services Group: Elevator Maintenance, New York.
9 European Committee for Standardization CEN/TC 319 (2006) EN 13269: Maintenance. Guidelines on Preparation of Maintenance Contracts, Brussels.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A REVIEW ON THE OPTIMISATION OF AIRCRAFT MAINTENANCE WITH APPLICATION TO LANDING GEARS P. Phillips¹, D. Diston¹, A. Starr², J. Payne³ and S. Pandya³ ¹School of Mechanical, Aerospace and Civil Engineering, University of Manchester, Manchester, UK ²School of Aerospace, Automotive and Design Engineering, University of Hertfordshire, Hatfield, UK ³Messier Dowty Ltd, Gloucestershire, UK
Current maintenance programmes for key aircraft systems such as the landing gears are made up of several activities based around preventive and corrective maintenance scheduling. Within today's competitive aerospace market, innovative maintenance solutions are required to optimise aircraft maintenance, for both single aircraft and the entire fleet, ensuring that operators obtain the maximum availability from their aircraft. This has led to a move away from traditional preventive maintenance measures to a more predictive maintenance approach, supported by new health monitoring technologies. Future aircraft life will be underpinned by health monitoring, with the ability to quantify the health of aerospace systems and structures offering competitive decision-making advantages that are now vital for retaining customers and attracting new business. One such aerospace system is the actuator mechanism used for extension, retraction and locking of the landing gears, the future of which will see the introduction of electromechanical replacements for the hydraulic systems present on the majority of civil aircraft. These actuators can be regarded as mission critical systems that must be guaranteed to operate at both take-off and landing. The health monitoring of these actuation systems can guarantee reliability, reduce maintenance costs and increase their operational life span. Aerospace legislation dictates that any decisions regarding maintenance, safety and flight worthiness must be justified and strict procedures followed. This has inevitably led to difficulties in health monitoring solutions meeting the necessary requirements for aerospace integration. This paper provides the motivation for the research area by reviewing current aircraft maintenance practices and how health monitoring is likely to play a future strategic role in maintenance operations. This is achieved with reference to current research work into developing a health monitoring system to support novel electromechanical actuators for use in aircraft landing gears. The difficulties associated with integrating new health monitoring technology into an aircraft are also reviewed, with perspectives given on the reasons for the current slow integration of health monitoring systems into aerospace. Keywords: Maintenance management, health monitoring, actuators, landing gear
1
INTRODUCTION
The airline industry is considered one of the most distinctive businesses in the world. The business involves a variety of complex operations, including moving aircraft loaded with passengers and cargo over large distances and the scheduling of flights, crews and maintenance. These all add up to substantial operating and maintenance costs measured in time and money. Aircraft maintenance forms an essential part of airworthiness, with its main objective being to ensure a fully serviced, operational and safe aircraft. If an aircraft is not maintained to the required level then this inevitably risks passenger and crew safety. Table 1 lists examples of incidents that have occurred with the probable cause attributed to insufficient maintenance [1]. There is also the risk that the aircraft may be unable to take off, leading to passenger dissatisfaction; likewise it is plausible that the aircraft may be forced to land in undesirable locations, where spare parts or maintenance expertise are unavailable. Maintenance actions therefore have to be carried out at regular scheduled intervals, but ideally be performed at minimum cost to the operator.
Table 1: Aircraft maintenance related accidents
Airline | Location | Year | Incident
Aloha Airlines 737 | Hawaii | 1988 | Inspection failure led to fuselage failure
BM AirTours 737 | Manchester | 1989 | Wrong bolts led to windshield blowout
United Airlines DC10 | Iowa | 1989 | Engine inspection failure led to loss of systems
Continental Express | Texas | 1991 | Tail failure as task not completed before flight
Northwest Airlines | Tokyo | 1994 | Incomplete assembly led to engine separation
ValueJet | Florida | 1996 | Fire in hold due to incendiary cargo
One of the key systems which has to be maintained and kept fully operational is the aircraft landing gear. Landing gears are an essential part of any aircraft, even though they remain redundant for most of the flight. Their main task is to absorb the horizontal and vertical energy of the aircraft as it touches down on the runway. During flight most modern aircraft have their landing gears retracted and stowed, and only extend them during the approach to landing. Aircraft extend and retract their landing gears using a variety of methods, which include pneumatic, hydraulic or electrical motor-driven drives, with the majority of retraction mechanisms being hydraulically powered. Most landing gears contain three actuators; the largest of these is the retraction actuator, which generates a force about a pivotal axis in order to raise the landing gear against weight and aerodynamic loads. The other two actuators are the lock-stay actuator, which locks the landing gear in place once extended, and the door actuator, which ensures that the bay doors are successfully opened and closed for landing gear deployment. Figure 1 shows a typical arrangement of the down lock and main retraction actuator positions.
Figure 1: Airbus A320 main gear
Hydraulic actuation systems have found popular use in aerospace due to their reliability and relative simplicity, and their widespread use has generated engineering experience and familiarity. They are also ideally suited for landing gear operation as the hydraulic fluid provides constant lubrication and natural damping. There are also disadvantages: when used in aircraft they are heavy, require large volumes of space, operate noisily and require the correct disposal of hydraulic fluids in accordance with environmental legislation. There is currently a move within the aerospace industry towards replacing hydraulic drives with electrical counterparts as part of the ‘more electric aircraft’ concept [2]. The motivation for the use of Electro-Mechanical
Actuators (EMA) is driven by the desire to reduce aircraft weight, arising from a combination of increasing fuel costs and environmental concerns. For example, the environmental damage associated with air traffic has created the need to reduce aircraft fuel consumption and polluting emissions, and a key factor in achieving this is the reduction of weight. Landing gears contribute a significant amount of mass to the aircraft: on average they contribute 4% on civil aircraft and 3% on military aircraft [3]. The European drive for the replacement of hydraulic systems on landing gears is supported in part by a large DTI funded project known as ELGEAR [4], the aim of which is to demonstrate the potential to reduce operating noise, increase operating efficiency, reduce installation volumes and, most significantly, reduce the landing gear mass by up to 12%. A further motivation for utilising electrically powered actuators is the real possibility that, with the move towards optimising engine efficiency, future aircraft engines will not produce hydraulic power. Innovative maintenance solutions, such as health monitoring systems, are required to support the introduction of the new electrical actuator technology and to provide reliability assurances. Health monitoring systems will enable decisions to be taken regarding aircraft flight worthiness. Importantly for aircraft operators, they can aid in providing uniquely optimised maintenance scheduling, allowing maintenance decisions to play a key part in driving forward operations strategies and providing a business-winning advantage. This paper will provide the motivation and justification for the current research area of electromechanical actuator health monitoring for landing gears. This is done by reviewing the current maintenance practice for aircraft with reference to landing gears, and the path future advanced maintenance solutions are likely to follow. How health monitoring is likely to play a strategic operations role, offering benefits to operators, manufacturers and maintenance service providers, is also highlighted. The difficulties associated with integrating new health monitoring technology into an aircraft are reviewed, with perspectives given on the reasons for the current slow integration of health monitoring systems into aerospace.
2
ELECTROMECHANICAL RETRACTION ACTUATOR DESIGN
The design for the retraction actuator is based around that of a roller screw. A roller screw converts the rotary torque produced by the motor into a linear motion. Roller screws consist of a multiple arrangement of threaded helical rollers assembled in a planetary arrangement around a screw captured in place by a nut. Rotation of the nut with respect to the shaft enables axial translation of the nut; likewise rotation of the screw also can enable translation of the nut. A primary duplex motor connected to a gearbox linearly displaces the nut by rotation of the screw, which moves a lever arm about a pivot achieving retraction/extension of the landing gears. If the primary motor fails there is an emergency control that will ensure successful displacement of the actuator. Figure 2 provides a schematic of the actuator arrangement.
[Figure 2 schematic components: duplex motor, primary gear box, primary roller screw, brake, emergency motor, emergency gear box, emergency roller screw.]
Figure 2: The EMA retraction actuator
Hydraulic actuation has been used in aerospace successfully for many decades, proving to be reliable and robust, and gaining the confidence of aircraft operators. Any replacement drive will therefore need to provide assurances that it is of equal robustness and reliability to the preceding system [5]. The replacement of a key aircraft system, such as the actuators, will inevitably require changes in the way in which maintenance actions are performed. As an example, it is easy to visually inspect a hydraulic actuator for faults such as fluid leaks or corrosion. An EMA, such as the one shown schematically in Figure 2, is a complex mechanical system and it is not so easy to inspect individual subsystems and components. Most of these key components (i.e. gears, electrical wiring) are often sealed within housing units, making access exceptionally difficult. The use of traditional visual inspections would therefore require a certain degree of landing gear dismantling. The move towards ‘more electric aircraft’ could therefore increase the time an aircraft spends in the maintenance hangar, increasing aircraft downtime costs. This helps to justify the need for additional automated fault detection and diagnostic health monitoring incorporated into the design.
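To make the rotary-to-linear conversion of the roller screw concrete, the following is a minimal sketch of the basic kinematics; the motor speed, gearbox ratio, screw lead and stroke values are illustrative assumptions and do not describe the actual ELGEAR actuator:

```python
# Illustrative parameters only (assumed values, not data from the actual EMA design).
MOTOR_SPEED_RPM = 6000.0   # primary duplex motor speed
GEARBOX_RATIO = 20.0       # motor revolutions per screw revolution
SCREW_LEAD_MM = 10.0       # axial nut travel per screw revolution
STROKE_MM = 450.0          # actuator stroke needed for full retraction

# Rotary-to-linear conversion of the roller screw.
screw_speed_rpm = MOTOR_SPEED_RPM / GEARBOX_RATIO
linear_speed_mm_per_s = screw_speed_rpm * SCREW_LEAD_MM / 60.0
retraction_time_s = STROKE_MM / linear_speed_mm_per_s

print(f"Nut linear speed: {linear_speed_mm_per_s:.1f} mm/s")
print(f"Time to complete a {STROKE_MM:.0f} mm stroke: {retraction_time_s:.1f} s")
```

With these assumed values the nut advances at 50 mm/s and a full retraction stroke takes about nine seconds, which illustrates how lead, gear ratio and motor speed trade off against retraction time.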
3
CURRENT MAINTENANCE PRACTICE
Maintenance programmes for aircraft, which include key systems such as the engines and landing gears, are made up of several activities based around preventive, corrective, on-condition and redesign maintenance. Preventive actions are taken at pre-determined intervals based upon the number of operating hours or, often in the case of landing gears, the number of landings. This is supported by regularly scheduled inspections and tests in which on-condition maintenance is performed based upon observations and test results. Each of these activities is finally supported by corrective maintenance conducted in response to discrepancies or failures within the aircraft during service. The final action type, redesign maintenance, takes the form of engineering modifications that are made in order to address arising safety or reliability issues which were unanticipated in the original design. It is essential that aircraft maintenance be performed at appropriate times and to the highest standard to ensure system reliability and to guarantee passenger safety. So that any potential unsafe conditions can be identified and addressed, the country of aircraft registration and the civil aviation authority of the manufacturing country generate a set of mandatory guidelines known as airworthiness directives. These directives notify the aircraft operators that their aircraft may not conform to the appropriate standards and state whether there are any actions (i.e. maintenance) that must be taken. It is a legal requirement that operators follow the airworthiness directives, and country specific authorities closely regulate them. Such authorities include the Federal Aviation Administration (USA), the Civil Aviation Safety Authority (Australia) and the Joint Aviation Authorities (Europe). Much of the major maintenance and repair work performed on aircraft is provided through service providers who carry out Maintenance, Repair and Overhaul (MRO) operations for the aircraft operators. The landing gear is a critical assembly and a major key to maintaining the overall aircraft value. Operators cannot afford, nor are they willing, to risk compromising their landing gear MRO activities and will look for the best combination of affordability, expertise, flexibility and the ability to offer customised solutions when faced with the choice of MRO provider. An example of how maintenance support of landing gears operates would be as follows. In the event of a series of incidents such as ‘hard landings’ reported by the operators, major repair operations or complete gear overhauls will be conducted at an MRO provider’s maintenance site. The operators themselves can carry out minor repairs and on-wing maintenance, also at predetermined intervals. Once landing gears have been received at the MRO maintenance facility, they will be dismantled and individual parts will be put through a series of non-destructive tests. This testing will identify any developing failures, such as structural fatigue or internal corrosion, the results of which will determine whether the parts are repaired, replaced, scrapped or recycled [6]. There are a vast number of parts on a typical landing gear which need to be maintained and inspected. An example of key inspection areas, along with typical timescales, would be:
1. After 300 hours or after 1 year in service:
- Shock absorber nitrogen pressure check
2. After 600 hours:
- Landing gear hinge point visual inspections
- Leak inspection (oil, hydraulic fluid, etc.)
- Inspection of torque link play
3. After 7 years or 5000 cycles: landing gear overhaul
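As an illustration of how such usage-based inspection triggers might be tracked, the following is a minimal sketch; the counter names, the grouping of items and the simple threshold logic are assumptions made for illustration, not part of any airworthiness directive:

```python
# Inspection thresholds from the list above: hours in service, months in service, landing cycles.
INSPECTIONS = [
    ("Shock absorber nitrogen pressure check",    300,   12, None),
    ("Hinge points, leaks and torque link play",  600, None, None),
    ("Landing gear overhaul",                    None,   84, 5000),  # 7 years is roughly 84 months
]

def inspections_due(hours, months, cycles):
    """Return the inspections whose hours, calendar or cycle limit has been reached."""
    usage = (hours, months, cycles)
    due = []
    for name, *limits in INSPECTIONS:
        if any(lim is not None and used >= lim for used, lim in zip(usage, limits)):
            due.append(name)
    return due

# Example: a gear with 620 flight hours, 14 months in service and 480 landings since last check.
print(inspections_due(hours=620, months=14, cycles=480))
```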
To understand maintenance costs it is necessary to look at the elements of maintenance in terms of time. Figure 3 gives a breakdown of the time elements covering maintenance actions. A breakdown such as this can show designers the areas in which they can influence the related activity times. In corrective maintenance much of the time is spent on locating a defect, which often requires a sequence of disassembly and reassembly. Being able to predict fault location times is extremely difficult using traditional inspection techniques. The ability to automate this fault diagnosis, with advanced technologies and techniques, can help accurately predict the downtime required [7].
[Figure 3 breaks total time into up time (flying time, available-for-flying time) and down time (flight preparation, turnaround and maintenance time); maintenance time is divided into preventive maintenance, corrective maintenance and modification time, with corrective maintenance further broken down into preparation, inspection, access, defect location and defect rectification times (rectify by adjustment, in-situ repair, remove-repair-and-refit, remove-and-replace), and defect location supported by BITE effectiveness, fault diagnostic aids, equipment test/read-out capability and technician skill, experience and training.]
Figure 3: Civil aircraft maintenance time relationships
4
CHANGING MAINTENANCE STRATEGIES
Currently the European market holds a 26% share of the worldwide MRO business, compared to 39% held by North America, and the business is expected to experience dramatic worldwide growth during the next 10 years [8]. There are, however, several hurdles which must be overcome by these MRO providers in order to maintain their leading global market shares [9]. Examples include:
Growing competition from the Middle East
Greater competition from Original Equipment Manufacturers (OEM)
Continuing pressure from airlines to reduce costs
These hurdles coupled with increased demand for airline MRO are forcing changes in the global aviation maintenance industries, including:
MRO providers are expanding their geographical reach and capabilities in a bid to become regional and global full service providers.
Spending on MRO is expected to universally increase
Airlines are now seeking how to make the next level of savings, which has raised the demand for more predictive maintenance strategies, with more reliability and material solutions to complement outsourced maintenance repair work.
To drive further cost reductions, airlines are seeking to incorporate sophisticated maintenance management solutions into their aircraft, reducing investment in inventory and aiding improvements in airline operations and reliability.
Such factors have begun to dictate a change in maintenance strategy for operators and in the service solutions that the MRO suppliers can provide. Changing economic climates have also led operators to begin seeking innovative technology solutions for maintenance management. These will aid in reducing the levels of scheduled maintenance and hence optimising maintenance on aircraft fleets. In terms of landing gear, much of the current business offered to customers is contracted in the form of ‘time and materials’, which can be an expensive option for operators. The changing face of the aviation industry requires that maintenance management become increasingly tailored towards individual customers’ needs, with cost-effective solutions being found that offer compromises between customer involvement and the level of commitment required from the providers. Figure 4 shows a matrix of different maintenance solutions and the level of commitment and partnership required by the operators and MRO providers.
[Figure 4 is a matrix relating aircraft operator involvement (high, medium, low) to the level of MRO support (high, medium, low), positioning maintenance support concepts such as all-inclusive overhauls, through-life support, predictive maintenance, customised payment schemes, preventive maintenance, and time and materials.]
Figure 4: Maintenance support concepts
5
PREDICTIVE MAINTENANCE
In order to remain competitive and meet the demands and challenges facing operators and suppliers, new maintenance support concepts should offer several gains. For the operators these should be reductions in unscheduled maintenance activity, a lower total cost of ownership, reductions in administrative burdens and overall optimisation of maintenance activities. This can be achieved by moving away from scheduled preventive maintenance actions and introducing new systems that can provide details on the in-service operation and condition of landing gear mechanisms, such as brakes, shock absorbers and actuators. Such systems, known as health monitoring systems [10], utilise a variety of data gained from on-board sensors in order to extract meaningful information. This information, when combined with expert knowledge such as component reliabilities, failure mechanisms and service/maintenance history, will provide a quantification of system, subsystem and component health. Based upon this information, future corrective maintenance actions can be predicted, allowing the optimisation of aircraft maintenance. Incorporating health monitoring systems into aircraft landing gears in order to employ a predictive maintenance strategy [11] in place of preventive maintenance offers benefits to the operators, MRO providers and landing gear manufacturers, as described in Table 2.
Table 2: Benefits of a predictive maintenance strategy
Operator: Optimised maintenance scheduling. Reductions in maintenance costs. Reduced risk of in-service failures. Increased aircraft availability.
MRO provider: Optimisation of spare parts stockpiling. Minimisation of scrap. Elimination of bottlenecks in machine usage during MRO operations. Reduction in turnaround times.
Landing gear manufacturer: Information available from on-board health monitoring sensors can be used as a marketing tool. Evaluation of in-service performance of landing gear systems. Extensive knowledge of in-service performance can be incorporated into re-designs. Aids in increasing operator confidence in incorporating new replacement technologies.
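As an illustration of the kind of data fusion described above, the following is a minimal sketch of combining sensor-derived condition indicators with expert weighting into a single component health score that can trigger a predictive maintenance action; the feature names, weights and threshold are invented for illustration and do not represent the monitoring system under development:

```python
# Hypothetical normalised condition indicators for an EMA (0 = nominal, 1 = worst expected).
features = {
    "motor_current_ripple": 0.35,
    "vibration_rms": 0.20,
    "temperature_rise": 0.10,
    "retraction_time_drift": 0.55,
}

# Weights reflecting assumed failure-mode criticality (stand-in for reliability knowledge).
weights = {
    "motor_current_ripple": 0.3,
    "vibration_rms": 0.2,
    "temperature_rise": 0.1,
    "retraction_time_drift": 0.4,
}

# Health score: 1.0 means fully healthy, lower values indicate accumulated degradation.
degradation = sum(weights[name] * value for name, value in features.items())
health = 1.0 - degradation

MAINTENANCE_THRESHOLD = 0.7  # assumed decision threshold
action = "schedule corrective maintenance" if health < MAINTENANCE_THRESHOLD else "continue in service"
print(f"Component health: {health:.2f} -> {action}")
```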
However, it should be noted that innovative predictive maintenance solutions supported by health monitoring can only provide each of the key players with the necessary benefits if the described commitments are made. A smooth flow of information is required between the operators, maintenance providers and the manufacturers. It is also questionable whether operators would really want to commit to a long-term innovative maintenance solution, due to the added commitment requirements on their behalf. They may be hesitant to take up the offer of health monitoring systems if the manufacturers have not listened to the specific requirements for their aircraft, most notably component reliability and minimal effects on weight and complexity. The operators will also be wary of the probable need to handle vast quantities of extra data and information generated by the health monitoring systems. Support with this should therefore be offered within any innovative maintenance service, or systems that can provide automatic health-related decisions are essential if health monitoring is to be accepted. Operators must also be willing to follow a long-term commitment as a support partner and be willing to exchange failure data with the manufacturers in order to increase the reliability of future designs.
6
CHALLENGES TO INTEGRATING HEALTH MONITORING
Health monitoring is a disruptive technology, in that large-scale integration will cause disruptive changes within well defined and established working practices. But once established it can quickly go on to become a fully performance-competitive system. Health monitoring systems are aimed at improving the performance of the aircraft, which will be achieved along the lines of ‘evolutionary’ changes whilst demonstrating reliability, validated cost benefits and reduced operational risks. The integration of new technologies inevitably faces difficulties, and a number of challenges face the community of engineers and technical specialists as they seek to utilise health monitoring for aerospace usage [6]. A non-exhaustive list of these difficulties includes:
1. The technology and frameworks are available but under-utilised.
2. Performance characteristics are usually untested, leading to a lack of confidence.
3. There is often a wealth of data available from the end users, but access to this data can be limited and much is yet to be converted to ‘meaningful information’.
Health monitoring systems for aerospace applications differ from those for other applications, such as industrial machine monitoring or the monitoring of civil structures, in that there are often hardware restrictions, usually based upon weight, complexity and the difficulties associated with certification. Also, in many areas of aerospace health monitoring system development, the state-of-the-art monitoring techniques being developed are often restricted by a variety of limitations. This affects their use in a real operational situation; for example, many of the sensor based methods under development for the monitoring of fuselage structures, based upon such methods as acoustics or vibration patterns, require vast sensor arrays. Much of the information gained requires high levels of signal processing, with the results being very subjective, and consequently they may not be applicable for an on-line real time aerospace monitoring system, even though the fundamentals of the techniques work well in other applications. This will potentially lead to a case where the state of the art has difficulties in matching the necessary requirements for aerospace integration. This, the author believes, is the reason for the current slow integration of
health monitoring on civil aircraft, despite the vast wealth of academic research detailing monitoring methods, industry drive and potential areas for application. Figure 5 illustrates this hypothesis; it demonstrates how the current health monitoring state-of-the-art trend is progressing with respect to the capability requirements for health monitoring for aerospace usage. The hypothesis indicates that the current state of the art is advanced enough for most industry uses, offering leaps in performance and capabilities, but is far below what is required for aerospace applications and will require further innovations in, amongst other areas, hardware minimisation, data reduction techniques and the use of fusion to merge multiple techniques so as to reduce individual limitations and maximise advantages.
Figure 5 (capability plotted against time) compares the current HM state-of-the-art trend and the desired HM state-of-the-art trend for aerospace applications with the HM system requirements for aerospace applications and for an 'enabling' technology.
Figure 5: Aerospace health monitoring requirements as compared to the state-of-the-art
7
CONCLUSION
With rising operating and maintenance costs, airline operators are being forced to seek out new and innovative solutions for the maintenance of their aircraft fleets. Growing competition, and operators demanding reductions in the time aircraft spend in the maintenance hangars, have led aircraft MRO providers to begin looking to future maintenance management solutions increasingly tailored towards their customers' needs. In addition, the incorporation of new, 'unfamiliar aerospace' technologies, such as electromechanical replacements for hydraulic actuation, requires additional health monitoring systems to ensure their reliability and robustness. The use of aerospace health monitoring, however, must be complemented by a change and modernisation in future maintenance management. The value of incorporating a health monitoring system is most likely to arise in savings in maintenance costs by providing reductions in the downtime of the aircraft. The use of health monitoring systems for future landing gear electrical retraction mechanisms, or other aircraft systems, will offer a competitive advantage in maintenance decision-making, which is crucial for both military and commercial aerospace users. This will help manufacturers retain customers and attract new business; these aspects mean that monitoring solutions are now becoming a key part of formulating future maintenance strategies. The application of sensors to provide information regarding actuator health status, which can then be converted into decisions regarding maintenance, safety and flightworthiness for landing gears, is part of a long-term future maintenance strategy. Currently there is no monitoring solution in place on landing gears, but it is envisioned that, as part of this long-term strategy, further new technologies will be incorporated into the landing gear design, supported by health monitoring based maintenance solutions. This paper has introduced the reasons for the introduction of electromechanical actuation technology into future aircraft landing gears, and how this and the changing requirements for aircraft maintenance have led to the current research into an actuator health monitoring system, the aim of which is a fully validated diagnostics system.
8
REFERENCES
1 Gramopadhye, A.K. & Drury, C.G. (2000) Human Factors in Aviation Maintenance: How we got to where we are. International Journal of Industrial Ergonomics, 26, 125-131.
2 Jones, R.I. (2002) The more electric aircraft - Assessing the benefits. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 216(5), 259-269.
3 Greenbank, S.J. (1991) Landing gears - the aircraft requirement. Proceedings of the Institution of Mechanical Engineers, 205.
4 DTI (2007) Report on progress with the national aerospace technology strategy.
5 Phillips, P., Diston, D., Payne, J., Pandya, S. & Starr, A. (2008) The application of condition monitoring methodologies for certification of reliability in electric landing gear actuators. The 5th International Conference on Condition Monitoring and Machine Failure Technologies, Edinburgh, UK.
6 Patkai, B., Theodorou, L., McFarlane, D. & Schmidt, K. (2007) Requirements for RFID-based Sensor Integration in Landing Gear Monitoring - A Case Study. Auto-ID Lab, University of Cambridge.
7 Knotts, R.M. (1999) Civil Aircraft Maintenance and Support Fault Diagnosis from a Business Perspective. Journal of Quality in Maintenance Engineering, 5(4), 335-348.
8 Jenson, D. (2008) Europe's Challenges In a Dynamic MRO Market. April 2008 [cited 4 April 2009]; available from: http://www.aviationtoday.com/.
9 Fitzsimons, B. (2007) The BIG Picture: Airline MRO in a Global Context. Airline Fleet & Network Management, 52, 46-54.
10 Kothamasu, R., Huang, S.H. & VerDuin, W.H. (2006) System health monitoring and prognostics - a review of current paradigms and practices. International Journal of Advanced Manufacturing Technology, Springer-Verlag, 1012-1024.
11 Mobley, R.K. (2002) An Introduction to Predictive Maintenance. Elsevier Butterworth-Heinemann.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
RISK-BASED APPROACH FOR MANAGING ROAD SURFACE FRICTION OF ROAD ASSETS
Noppadol Piyatrapoomi (Ph.D.) a and Justin Weligamage (M.Eng.Sc, MBA) a
a Road Asset Management Branch, Queensland Department of Transport and Main Roads, Brisbane, Queensland, Australia.
In Australia, road crash trauma costs the nation approximately A$18 billion annually, whilst the United States estimates an economic impact of around US$230 billion on its network. Worldwide, the economic cost of road crashes is estimated to be around US$518 billion each year. It is therefore in both the sociological and economic interests of society that attempts are made to reduce, as much as possible, the level and severity of crashes. There are many factors that contribute to road crashes on a road network. The complex manner in which human behaviour, environmental factors and vehicle failure can interact in many critical driving situations makes the task of identifying and managing road crash risk within a road network quite difficult. While road authorities have limited control over external factors such as driver behaviour and vehicle related issues, some environmental factors can be managed, such as road surface friction (or skid resistance of the road surface). A risk-based method for managing road surface friction (i.e. skid resistance) is presented in this paper. The risk-based method incorporates 'risk' into the analysis of skid resistance and crash rate by linking the statistical properties of skid resistance to the risk of a crash. By examining the variation of skid resistance values throughout the network, the proposed methodology can establish an optimal 'investigatory level' for a given level of crash risk, along with a set of statistical 'tolerance' bounds within which measured skid resistance values can be expected to fall. The investigatory level is a threshold level for triggering a detailed investigation of a road site to identify whether a remedial treatment should be made. A road category of normal demand, spray sealed surface, speed zone greater than 80 km/h and annual average daily traffic less than 5,000 vehicles was used in demonstrating the application of the method.
Key Words: skid resistance, wet crashes, risk-based approach, investigatory level, road surface
1
INTRODUCTION
The skid resistance of a road surface is a condition parameter which quantifies the road's contribution to friction between the surface and a vehicle tyre. Technically speaking, it is the retarding force that is generated by the interaction between the road surface and the tyre under a locked (non-rotating) wheel. A wheel reaches such a state when the frictional demand exceeds the available friction force at the interface of tyre and road. Therefore, skid resistance is an important factor during events in which these phenomena are likely to occur, such as the high "demand activities" of accelerating, decelerating and cornering. While it is generally accepted that adequate skid resistance levels are maintained in dry conditions, skid resistance decreases substantially in wet driving conditions. It has been noted in US studies that around 20% of road crashes occur in wet driving conditions, a number which is increasing [1]. It is therefore the skid resistance level in wet conditions that is of interest when looking at crash occurrence. A common method applied in practice for managing skid resistance is to set what are known as investigatory levels for the various road categories. The investigatory levels are set as an intermediate form of surface friction management, in that they do not automatically signal that maintenance work is required. If, through normal roadway friction investigation, a particular road site is measured to have a skid resistance value below the relevant investigatory level, a more thorough site investigation and test are performed to determine if additional remedial action is needed. The use of an investigatory level allows the roadway to be assessed taking all factors, including its skid resistance, into account. This provides an extra layer of control in the management process by checking that only those sites which are in most need of maintenance are targeted first, optimising both the materials and budget available to the road authority. The selection of suitable investigatory levels has been the focus of much research, and is also the subject of this project.
Most investigatory levels adopted by road authorities in Australia were based on UK studies and were adjusted to suit Australia with some modifications [2, 3]. However, the risks of road crashes associated with these adopted investigatory levels are unknown. Many research studies have attempted to assess the risk of road crashes in relation to skid resistance from historical data using regression or correlation analysis; however, they reported no conclusive results [1, 4, 5, 6, 7]. A joint research project between the Queensland Department of Transport and Main Roads (QDTMR), the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM) and the Queensland University of Technology was established to explore other methods of assessing the risk of road crashes and skid resistance, and potentially of establishing investigatory levels explicitly associated with the risk of road crashes. The final output of this project is a result of the application of the methodology to actual roadway system data. Several informed recommendations have been developed which aim to improve the methodology for managing surface friction based on the risk associated with wet road crashes. The measured skid resistance value of a section of road does not stay constant throughout the life of the road surface. Indeed, skid resistance is affected by a variety of different factors, with the eventual result usually being a decrease in the available skid resistance of the road over time. External factors such as the speed, volume and type of traffic using a particular section of road, the local climatic conditions, and the various types of product used in the road all contribute to the period of time for which the skid resistance remains suitable for the safe manoeuvring of vehicles. Selecting and maintaining the skid resistance of the road surface to such levels is an important issue for road authorities, both as a quality issue in the daily driving requirements of the public, and for safety maximisation during high demand incidents such as road crashes. The hypothesis of this research is that current investigatory levels for each particular road category must be linked to an associated risk of road crash for that category, thereby incorporating risk into the management process. Given the wide range of conditions of a roadway, even for those sites within the same category, it is proposed to be more appropriate to derive a range of values within which a measured value may lie. This idea, effectively giving a certain 'tolerance' to the investigatory level, also incorporates one of the fundamental properties of the skid resistance across a site, category or indeed the entire network: the variability in the measured skid resistance value. Furthermore, the linkage between skid resistance and crash rate can only ever be described in a probabilistic sense, and as such, defining a range in which the skid resistance can fall is more appropriate for the purposes of effective decision making.
2
SITE CATEGORIES OR DEMAND CATEGORIES FOR SKID RESISTANCE
The primary aim in managing skid resistance for a road network is to provide skid resistance for vehicles to manoeuvre safely in any roadway condition. It would seem desirable to simply maintain all parts of a road network to a high level, thus ensuring adequate skid resistance. However, to gain a high skid resistance level for all sections of a road network would in most cases be overly expensive. Thus, the purpose of the management process is to equalise crash risk across the road network, rather than simply to provide a high skid resistance level for the whole road network. Higher skid resistance is provided to those road sites which require increased levels of friction, such as corners, intersections or roundabouts. The term "site or demand categories" was established to categorise road sites that require different skid resistance demands for safe manoeuvring. Three levels of demand categories were adopted in Queensland, namely normal, intermediate and high demand categories [2], as shown in Table 1. Table 1 shows the typical demand categories adopted by the Queensland Department of Transport and Main Roads. Different skid resistance investigatory levels have been given for these three demand categories. The investigatory levels given in Table 1 are international friction indices (IFI). The table separates the road network across so-called 'demand' categories, as well as across various speed ranges. As can be seen from the table, manoeuvres in high demand areas would be expected to require more friction support than manoeuvres in normal demand areas. As mentioned, the investigatory levels are set as an intermediate form of roadway management, in that they do not automatically signal that maintenance work is required. If, through normal roadway testing, a particular road site is measured to have a skid resistance value below the relevant investigatory level, a more thorough site investigation and test are performed to determine if additional remedial action is needed.
Table 1: Current Queensland Department of Transport and Main Roads investigatory levels for skid resistance

Demand category | Description of site | F60 investigatory level (40-50 km/h / 60-80 km/h / 100-110 km/h)
High | Curves with radius < 100 m. Roundabouts. Traffic light controlled intersections. Pedestrian/school crossings. Railway level crossings. | 0.3 / 0.35 / N/A
Intermediate | Roundabout approaches. Curves with radius < 250 m. Gradients > 5% and > 50 m long. Freeway and highway on/off ramps. Intersections. Curves with advisory speed > 15 km/h below speed limit. | 0.25 / 0.3 / 0.35
Normal | Manoeuvre-free areas. | 0.2 / 0.25 / 0.3
3
METHODOLOGY
This section presents a proposed method based on the application of probability theory in assessing the risk of road crashes associated with skid resistance and in establishing investigatory levels. Information in relation to this methodology is also given in references 6, 7 and 8. The proposed method allows the risk of road crashes to be explicitly incorporated into decision-making when establishing skid resistance investigatory levels. The steps in the analysis are as follows.
1. Categorise the road network
Given an initially large road network to study, it is desirable to separate the network into a large number of smaller roadway sections. This is done not only to make the data more manageable to analyse, but also to allow the roadway to be split up according to a particular set of characteristics. There are a variety of different environmental and structural properties which can affect the skid resistance of a roadway, and to be able to test whether these have any relationships with skid resistance and crash rates, the various road sections must be grouped according to these characteristics. This process is also important when recommendations must be made for the various 'demand' categories.
2. Obtain historical road data under the particular categorisation
Once the particular category and road surface condition variables are selected, all available data from the road network must be collected. An important point of this analysis is that it is based on historical data, in such a way that the information provided by the analysis also increases and improves as the amount of data increases. The future intention is that the analysis is incorporated into the information management system of the QDTMR.
3. Divide the road network sample into small sections suitable for the analysis
Once the data has been separated and collected according to the category of interest, the relevant road sections are divided into small segments of equal length. This allows the variability of skid resistance to be examined over a variety of distances. For example, there may be a particular category for which the variation over small segments is of interest, whereas for another category larger segments may be sufficient. Once each segment is defined, a cumulative probability distribution (Fx) of skid resistance can be formed for each segment. Figure 1 shows the cumulative probability distributions of skid resistance for all segments of a category.
A cumulative probability distribution of recorded skid resistance for a road segment gives, for each skid resistance value, the probability that the skid resistance occurring in that segment is at or below that value.
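A minimal sketch of this construction (step 3 of the methodology) is shown below. It is an illustration only: the column names and readings are hypothetical placeholders, not fields of the QDTMR data set.

```python
# Minimal sketch: building the empirical cumulative probability distribution Fx of
# recorded skid resistance (F60) for each road segment. Data are hypothetical.
import numpy as np
import pandas as pd

def empirical_cdf(values):
    """Return sorted F60 values and their cumulative probabilities Fx."""
    x = np.sort(np.asarray(values, dtype=float))
    fx = np.arange(1, len(x) + 1) / len(x)
    return x, fx

readings = pd.DataFrame({
    "segment_id": ["S01"] * 5 + ["S02"] * 5,
    "f60":        [0.31, 0.28, 0.35, 0.40, 0.33, 0.22, 0.27, 0.25, 0.30, 0.24],
})

segment_cdfs = {seg: empirical_cdf(group["f60"])
                for seg, group in readings.groupby("segment_id")}
for seg, (x, fx) in segment_cdfs.items():
    print(seg, list(zip(x.round(2), fx.round(2))))
```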
A comparison of the recorded skid resistance cumulative probability distributions gives information relating to how the different segments compare in terms of variation in skid resistance. Apart from their use in this analysis, these plots can also provide other information that may be of interest to road engineers, e.g. a visual indication of particular road sections that may have a faulty road surface.
Figure 1: The cumulative probability distributions of skid resistance for all road segments of a certain category
4. Count the number of road crashes and identify the segments where crashes occur
5. Map the related crash data to these distributions
After all cumulative probability distributions are created for the road segments within the category, those segments on which crashes occurred are identified.
6. Divide and categorise crash rates across all distributions and road sections
This step is fundamental to the methodology in that it involves the selection of suitable envelope distributions of skid resistance (the terms envelope and investigatory curve will be used interchangeably when referring to the derived distribution). These distributions essentially split the crash sample in such a way that a certain percentage of crashes occur on segments of road whose skid resistance curves fall below the envelope. Alternatively, the split can be measured in 'risk' related terms. Figure 2 indicates that 15% of crashes occur on road surfaces that have a cumulative probability distribution of skid resistance greater than the boundary F(x2). Seventy per cent of crashes occurred within the boundary of the two cumulative probability distributions F(x1) and F(x2). Eighty-five per cent of crashes occurred on road surfaces having a cumulative probability distribution of skid resistance below the boundary F(x2). A simple numerical sketch of this envelope selection is given after Figure 2.
Figure 2: Envelope distribution functions are fitted to the data points, such that a certain percentage of crash related sections are found below the particular curve
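The envelope selection in step 6 can be made concrete with a small sketch. In the example below the segment readings and the normal-shaped candidate boundary are invented, and each crash segment's curve is classified by the sign of its mean deviation from the envelope on a common F60 grid; the paper itself fits the envelopes by visual inspection or goodness-of-fit tests rather than by this particular rule.

```python
# Sketch of step 6 under a simplifying assumption: a crash segment's empirical CDF
# is classified as above or below a candidate envelope by the sign of its mean
# deviation from that envelope on a common F60 grid. Envelope and data are toy values.
import numpy as np
from scipy.stats import norm

grid = np.linspace(0.2, 0.8, 61)                   # common F60 grid
envelope = norm.cdf(grid, loc=0.35, scale=0.05)    # candidate boundary, e.g. F(x2)

def empirical_cdf_on_grid(readings, grid):
    """Empirical cumulative probability of the readings evaluated on the grid."""
    readings = np.sort(np.asarray(readings, dtype=float))
    return np.searchsorted(readings, grid, side="right") / len(readings)

# Hypothetical F60 readings for three 3 km segments on which wet crashes occurred.
crash_segments = {
    "S01": [0.28, 0.31, 0.33, 0.35, 0.40],
    "S02": [0.22, 0.24, 0.25, 0.27, 0.30],
    "S03": [0.41, 0.44, 0.46, 0.50, 0.55],
}

below = sum(
    np.mean(empirical_cdf_on_grid(vals, grid) - envelope) < 0
    for vals in crash_segments.values()
)
print(f"{below / len(crash_segments):.0%} of crash segments lie below the envelope")
```

In practice the envelope parameters would be adjusted until the desired percentage of crash segments (for example 85%) falls below the boundary.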
7. Establish the distributional characteristics for the appropriate crash rates selected as per the management policy
The method by which the distributional properties of the envelope distribution are obtained depends on the way in which the curve is produced. The simplest method is to fit the curves by visual inspection (either empirically or parametrically, e.g. as a normal distribution) or to use a probability-based goodness-of-fit test. This is the method suggested by Piyatrapoomi (2008) as a preliminary method to obtain curves that provide useful information to decision makers [6].
8. Assess and establish investigatory levels for each variable
Given a particular envelope distribution, the investigatory curve, along with a range or interval, can then be selected. The mean value of the distribution would normally be stated as the base investigatory level, with the interval set at a certain number of standard deviations either side of this mean value (a small sketch of this step follows the list).
9. Repeat the analysis for the remaining road condition variables
10. Develop a management framework that incorporates the relationships established in Step 8
Once a suitable distribution has been selected to guide investigatory decisions, it can be applied by management and practitioners in the maintenance regime of the road network. Once management has selected a suitable 'risk' value which meets departmental policy, the associated investigatory distribution can be evaluated and applied when examining road sections in the network.
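As a toy illustration of step 8 only (the numbers are invented, not taken from the analysis), the base investigatory level and tolerance bounds follow directly from the mean and standard deviation of the fitted envelope distribution:

```python
# Illustrative step 8: base investigatory level = envelope mean, tolerance bounds =
# mean +/- k standard deviations. All values below are hypothetical.
envelope_mean = 0.35   # hypothetical mean F60 of the fitted envelope distribution
envelope_std = 0.05    # hypothetical standard deviation
k = 2                  # number of standard deviations chosen by policy

base_investigatory_level = envelope_mean
tolerance_bounds = (envelope_mean - k * envelope_std, envelope_mean + k * envelope_std)
print(base_investigatory_level, tolerance_bounds)
```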
4
ANALYSIS AND RESULTS
The goals of the research project were based around two specific research problems. The first of these was to examine the relationships that exist between crash rate, or risk of crash, and skid resistance. The analysis of these relationships was then expected to be linked in with a wider research problem: that of producing a methodology which allows management to determine appropriate skid resistance investigatory levels that take the inherent risk of crashes into consideration. The analysis of the skid resistance - crash relationship was also expected to contribute to decisions relating to the current demand category split for investigatory levels, a separation which until now had been based primarily on studies from other countries. The analysis process began with the selection of appropriate analysis categories. Several different categorisations were initially used, and these changed over time as various road condition variables were tested. The two main variables that were used throughout the analysis were seal type and speed zone. The methodology developed for this project involves the extraction, manipulation and analysis of very large data sets. A calculation tool was developed specifically for the purpose of the project, which allows efficient and timely extraction and analysis to be performed. The software has a dual purpose: firstly, it allows the relationships between skid resistance and crash rate to be examined via the method outlined above; secondly, it provides a beta software model on which future implementations may be based. Figure 3 shows the calculation tool that was used for the analysis.
Figure 3: Analysis tool
For demonstration purposes, a category of normal demand, spray seal surface, speed zone greater than 80 km/h and annual average daily traffic less than 5000 vehicles was presented. The skid data and crash data were recorded in 2004; the total road length in this category was approximately 2667 kilometres, and the number of wet crashes was 27. The total road length was divided into small equal segments of 3 kilometres. Figure 4 shows the result of the analysis. Each cumulative probability distribution shown in the figure represents the variability of skid resistance within a 3 kilometre segment. The figure shows only the cumulative probability distributions of skid resistance of the road segments where wet crashes occurred. The figure shows the boundaries that divide road crashes into different crash risks or crash rates, expressed in terms of the number of crashes per 10 million vehicle kilometres travelled or the percentage of crashes. The essence of this method is to select a boundary curve, which becomes the investigatory level or 'investigatory curve', for an acceptable crash risk. For example, decision-makers may accept a risk of having 11 wet crashes per 10 million vehicle kilometres travelled, i.e. in this case the 85 per cent boundary. However, this final step involves many more inputs than those produced by the analysis presented. Management, before deciding on the investigatory curves, must combine these results with other economic and logistical information. For example, while the results may suggest a high investigatory curve for a large proportion of the network, it may simply not be economically feasible to maintain the entire section to such a high degree. Alternatively, it may not be logistically possible to obtain the required volume of aggregate needed to achieve such a skid resistance value. In these situations, management must combine all the information at hand to produce an optimal solution to the maintenance process.
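The crash-rate unit used above (crashes per 10 million vehicle kilometres travelled) can be made concrete with a purely illustrative calculation; the AADT and observation period below are hypothetical placeholders, and the result is not intended to reproduce the figures reported in this paper.

```python
# Purely illustrative: converting a crash count into crashes per 10 million vehicle
# kilometres travelled (VKT). AADT and observation period are hypothetical.
crashes = 27                 # wet crashes observed in the category
road_length_km = 2667.0      # total road length in the category
aadt = 1500                  # hypothetical annual average daily traffic (< 5000)
days = 365                   # hypothetical one-year observation period

vkt = aadt * road_length_km * days            # vehicle kilometres travelled
rate_per_10m_vkt = crashes / (vkt / 1e7)      # crashes per 10 million VKT
print(f"{rate_per_10m_vkt:.2f} crashes per 10 million VKT")
```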
Figure 4: Investigatory distributions and associated crash risks for the SNa80b5k category with wet crashes only
5
PROPOSED METHOD OF MANAGING SKID RESISTANCE
In the proposed method, the management of skid resistance on the road network is based on monitoring skid resistance and comparing the measured distributions with an established investigatory curve, as shown in Figure 5, rather than comparing the measured skid resistance with a single investigatory level value as given in Table 1. Figure 6 demonstrates an example of a comparison between an investigatory curve and a cumulative probability distribution of measured skid resistance for a road section. Figure 6 demonstrates that even though some measured skid resistance values are less than the investigatory curve, a site investigation may not be triggered, since the measured skid resistance values that fall below the investigatory curve represent only a small percentage. In this method, a site investigation will not be triggered for every value of skid resistance that falls below the investigatory level, and in practice it may not be feasible to investigate every place where skid resistance falls below the investigatory level. Figure 7 demonstrates a comparison between the investigatory curve and a cumulative probability distribution of measured skid resistance for a road section that may require site investigation. In this example, the percentage of measured skid resistance below the curve is considered significant and also exhibits very low skid resistance values. The probability-based method allows asset managers to identify more clearly the severity of road sections and to better prioritise site investigations than the current practice, which compares the measured skid resistance values with a single investigatory level. However, as mentioned, economic implications, societal expectations, government policies in relation to the tolerable crash rate and other logistical information, such as the availability of local material, must be combined with the results of this analysis as the basis of input information in establishing appropriate investigatory curves. Validation of
the decision-making must be carried out with crash data that occur after the proposed methodology has been implemented. The investigatory curves can be refined, and improvements to the skid resistance management process developed, through this validation process. The calculation tool developed in this project will be able to facilitate further improvement and development of the recommended skid resistance management process.
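One possible way to quantify the comparison illustrated in Figures 6 and 7 is sketched below. It assumes the investigatory curve is supplied as a set of (F60, cumulative probability) points and measures the share of probability levels at which the section's measured skid resistance quantile falls below the investigatory-curve quantile; the 20% trigger threshold and all data values are hypothetical and would in practice be set by departmental policy.

```python
# Hypothetical sketch of the Section 5 comparison: trigger a site investigation only
# when a significant share of a section's measured skid resistance sits below the
# investigatory curve. All thresholds and data are illustrative assumptions.
import numpy as np

def fraction_below_investigatory(measured_f60, invest_x, invest_fx):
    """Fraction of probability levels at which the measured skid resistance quantile
    is lower than the investigatory-curve quantile."""
    p = np.linspace(0.01, 0.99, 99)
    measured_q = np.quantile(np.asarray(measured_f60, dtype=float), p)
    invest_q = np.interp(p, invest_fx, invest_x)   # invert the investigatory curve
    return float(np.mean(measured_q < invest_q))

invest_x = np.array([0.20, 0.30, 0.40, 0.50, 0.60])   # hypothetical investigatory curve
invest_fx = np.array([0.02, 0.20, 0.60, 0.90, 1.00])
measured = [0.28, 0.33, 0.35, 0.38, 0.41, 0.44, 0.47]  # hypothetical section readings

share = fraction_below_investigatory(measured, invest_x, invest_fx)
if share > 0.20:  # hypothetical significance threshold
    print(f"Trigger site investigation ({share:.0%} of levels below the curve)")
else:
    print(f"No investigation triggered ({share:.0%})")
```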
Figure 5: Example of an investigatory curve
Figure 6: Example of a comparison between an investigatory curve and a cumulative probability distribution of skid resistance (that does not trigger site investigation)
Figure 7: Example of a comparison between an investigatory curve and a cumulative probability distribution of skid resistance (that triggers site investigation)
6
CONCLUSIONS
The paper outlined the need to manage the skid resistance of a road network for road safety and proposed a methodology for establishing investigatory levels, which are used as an intermediate form of skid resistance management. In managing skid resistance, if a particular road site is measured to have a skid resistance value below the relevant investigatory level, a more thorough site investigation and test are performed to determine if additional remedial action is needed. The paper presented a step-by-step methodology for assessing crash risk and skid resistance using a probability-based approach and for establishing investigatory levels. The investigatory level suggested in this paper is in the form of a probability curve rather than a single investigatory value. The paper also presented how the investigatory curve would be used for managing skid resistance on the road network, so that the management of skid resistance is based on probability theory. A road category of normal demand, spray sealed surface, speed zone greater than 80 km/h and annual average daily traffic less than 5000 vehicles was used in demonstrating the application of the method.
7
REFERENCES
1 Kuttesh, J.S. (2004) Quantifying the Relationship between Skid Resistance and Wet Weather Accidents for Virginia Data. Master's Thesis, Virginia Polytechnic Institute and State University, Virginia, USA.
2 Weligamage, J. (2006) Skid Resistance Management Plan. Road Asset Management Branch, Queensland Department of Main Roads, Queensland, Australia.
3 Austroads (2005) Guidelines for the Management of Road Surface Skid Resistance. AP-G83/05, Austroads, Sydney, Australia.
4 Seiler-Scherer, L. (2004) Is the Correlation Between Pavement Skid Resistance and Accident Frequency Significant? Conference paper, Swiss Transport Research Conference (STRC), Switzerland.
5 Viner, H.E., Sinhal, R. & Parry, A.R. (2005) Linking Road Traffic Accidents with Skid Resistance - Recent UK Developments. Proceedings of the International Conference on Surface Friction, Christchurch, New Zealand.
6 Piyatrapoomi, N., Weligamage, J., Bunker, J. & Kumar, A. (2008) Identifying relationship between skid resistance, road characteristics and crashes using probability-based risk approach. The International Conference on Managing Road and Runway Surfaces to Improve Safety, 11-14 May 2008, Cheltenham, England.
7 Piyatrapoomi, N., Weligamage, J. & Kumar, A. (2008) Probability-based method for analysing relationship between skid resistance and road crashes. 10th International Conference on Application of Advanced Technologies in Transportation, 27-31 May 2008, Athens, Greece.
8 Piyatrapoomi, N., Weligamage, J. & Bunker, J. (2007) Establishing a Risk-based Approach for Managing Road Skid Resistance. Australasian Road Safety Research, Policing and Education Conference 2007 'The Way Ahead', 17-19 October 2007, Crown Promenade, Melbourne, Australia.
Acknowledgments
The authors wish to acknowledge the Queensland Department of Transport and Main Roads and the Australian Cooperative Research Centre (CRC) for Integrated Engineering Asset Management for their financial support. The authors also wish to thank staff at the Asset Management Branch of the Department of Main Roads, Queensland, Australia for providing technical data and support. The views expressed in this paper are those of the authors and do not represent the views of the organisations.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
BUILDING AN ONTOLOGY AND PROCESS ARCHITECTURE FOR ENGINEERING ASSET MANAGEMENT
Vladimir Frolov a, David Mengel b, Wasana Bandara c, Yong Sun a, Lin Ma a
a Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), Brisbane, Australia
b QR Network, QR Limited, Brisbane, Queensland 4000, Australia
c Business Process Management Cluster, Faculty of Information Technology, Queensland University of Technology (QUT), Brisbane, Queensland 4000, Australia
Historically, asset management focused primarily on the reliability and maintainability of assets; organisations have since then accepted the notion that a much larger array of processes govern the life and use of an asset. With this, asset management's new paradigm seeks a holistic, multi-disciplinary approach to the management of physical assets. A growing number of organisations now seek to develop integrated asset management frameworks and bodies of knowledge. This research seeks to complement existing outputs of the mentioned organisations through the development of an asset management ontology. Ontologies define a common vocabulary for both researchers and practitioners who need to share information in a chosen domain. A byproduct of ontology development is the realisation of a process architecture, of which there is also no evidence in published literature. To develop the ontology and subsequent asset management process architecture, a standard knowledge-engineering methodology is followed. This involves text analysis, definition and classification of terms and visualisation through an appropriate tool (in this case, the Protégé application was used). The result of this research is the first attempt at developing an asset management ontology and process architecture.
Key Words: asset management, ontology development, process architecture, text mining, classification.
1
INTRODUCTION
The proper management of physical assets remains the single largest business improvement opportunity in the 21st century [1]. Organisations from all around the world now collectively spend trillions of dollars in managing their respective portfolios of assets. Historically, asset management (AM) focused primarily on the reliability and maintainability of assets; organisations have since then accepted the notion that a much larger array of processes govern the life and use of an asset, leading to a significant increase in the amount of asset management literature being published (particularly since 2000) [2-5]. This can be attributed to the modern context of asset management - one that encompasses elements of: strategy; economic accountability; risk management; safety and compliance; environment and human resource management; and stakeholder and service level requirements [6-8]. These elements have previously existed as disparate departments (or silos) within an organisation and in many cases continue to do so; asset management's new paradigm seeks a holistic, multi-disciplinary approach to the management of physical assets - the foundation for the overall success of an organisation [9]. Although most relevant articles acknowledge that asset management requires a multi-disciplinary approach, their content continues to mostly focus on individual elements of asset management, thus essentially missing the objective of what an ideal asset management definition strives for. A growing number of organisations, however, have understood the definition and are now developing asset management bodies of knowledge and asset management frameworks, i.e., high-level conceptual building blocks of asset management that bring together several disciplines into one overall process [7, 10]. Examples of such organisations include CIEAM [11], IAM [8, 12], AM Council [13] and IPWEA [14]. Such organisations are driving the development of new and extended asset management knowledge, incorporating the idea that asset management must be considered as a multi-disciplinary domain, i.e., one that governs and streamlines many different areas of an organisation whilst giving managerial personnel the necessary know-how to successfully implement and sustain asset management initiatives. This research seeks to complement existing outputs of the mentioned organisations through the development of a fundamental, conceptual asset management ontology. Ontologies are content theories about the sorts of objects, properties of
objects, and relations between objects that are possible in a specified domain of knowledge and provide potential terms for describing our knowledge about the domain of interest [15]. Ontologies define a common vocabulary for both researchers and practitioners who need to share information in a chosen domain [16]. As more and more information is published in the asset management domain, the importance of knowledge-based systems and consistent representation and vocabulary of such information is increased [17], thus supporting the argument for the building of an asset management ontology. To the best of the authors’ knowledge, a holistic asset management ontology, i.e., one that encapsulates the ideal definition of asset management, has not yet been published. By developing an asset management ontology, one can also realise the basic structure of an asset management process architecture. The architecture of the processes of an organisation is defined as the type of processes it contains and supports, as well as the relationships among them [18]. It has already been stated in literature that it is desirable to decompose asset management into a set of processes [19-27]. An asset management process is a set of linked activities and the sequence of these activities that are necessary for collectively realising asset management goals, normally within the context of an organisational structure and resource constraints [28]. No consistent asset management process architecture has yet been published. To develop the ontology and subsequent asset management process architecture, a standard knowledge-engineering methodology is followed. This involves text analysis, classification of terms and visualisation through an appropriate tool (in this case, the Protégé application was selected). The result of this research is the first attempt at developing a fundamental, conceptual asset management ontology and process architecture. The developed ontology can be used to: share and annotate asset management information; identify gaps in current asset management thinking; visualise the holistic nature of asset management; classify asset management knowledge; and develop a relational asset management knowledge-based system. This paper is structured as follows: background information on various topics is presented in Section 2; the methodology followed is presented in Section 3; the asset management ontology and process architecture is detailed in Section 4; analysis of results is shown in Section 5; and conclusions and directions for future research are given in Section 6. 2
BACKGROUND
Several research topics form the context of this research. This section presents a brief introduction to each topic.
2.1 The Evolution and Importance of Asset Management
Engineering asset management is a process of organising, planning and controlling the acquisition, use, care, refurbishment, and/or disposal of physical assets in order to optimise their service delivery potential and to minimise related risks and costs over their entire life. This is achieved through the use of intangible assets such as knowledge-based decision-making applications and business processes [29, 30]. Previously, asset management was often a practice not dissimilar to pure reliability and maintenance, following the simplistic doctrine of cost saving. Now, however, many organisations have shifted views on asset management. The result is a new appreciation of the processes governing an asset, especially the integration of lifecycle costing into asset decisions. An asset typically progresses through four main life stages: create, establish, exploit and divest [2]. These four stages can be thought of as the value chain of an asset, and all must be optimised to deliver a better return on asset investment. Thus, engineering asset management is more than just a maintenance approach, as it should influence all aspects of an asset's life [31]. It encompasses a broader range of activities extending beyond reliability and maintenance [32]. The prevalent view today is that properly executed engineering asset management can bring great value to a business [7]. It has been stated that asset management is ultimately accountable to the triple bottom line of a business [2], namely economic, environmental and social. It is also an increasingly important governance issue, as the scope of engineering assets expands. Asset management is continually being developed to become an integrated discipline for managing a portfolio of assets within an organisation. Much, however, will still need to be achieved before it can become a standard process for a business [33, 34]. Godau et al. [35] sum it up well by saying: asset management needs to deal with a range of complexities born out of the increasing technological, economic, environmental, political, market and human resources challenges facing this generation and our future generations. A holistic approach must be undertaken in which all roles involved with the management of assets come together in a practical framework and organisational structure to achieve the desired results and performance. Strategic thinking into the future is critical to ensure that future generations receive adequate levels of service across all industries, disciplines and applications [36].
2.2 Text Mining as a Form of Information Analysis
Text is the predominant medium for information exchange among experts and is also the most natural form of storing information [37-39]. The knowledge stored in text-based media is now recognized as a driver of productivity and economic growth in organisations [40]. With this, text mining is at the forefront of research in knowledge analysis and discovery.
Text mining is a multidisciplinary field that encompasses a variety of research areas: text analysis; information extraction and retrieval; clustering and classification; categorization; visualization; question-answering (QA) database technology; machine learning; and data mining [39, 40]. In almost all cases, text mining initiatives rely on a computer due to the massive text processing required [37, 41]. However, it is difficult for a computer to find the meaning of texts because they often have different possible meanings [42, 43]. Other ambiguities that occur when analysing text are: lexical ambiguities (words having more than one class - verb and noun); syntactic ambiguities (parsing of sentences); and semantic ambiguities (meaning of a sentence). Humans can generally resolve these ambiguities using contextual or general knowledge about the subject matter, as well as a thorough understanding of the English language. Much research and many methodologies have been developed to increase the efficiency and correctness of text mining applications.
2.3 Using Ontologies to Organise Knowledge
In philosophy, ontology is the study of the kinds of things that exist in the world, including their relationships with other things and their properties [15, 44]. An ontology defines a common vocabulary for researchers and practitioners who need to share information in a domain in a consistent and agreed manner [16]. Although more prominent in Artificial Intelligence (AI) and Information Systems (IS) applications, many disciplines now develop standardized ontologies, e.g. the SNOMED ontology in the medical field [45]. As more and more information is published on a particular domain, the need for ontological analysis as a way to structure such knowledge becomes increasingly important. One of the more commonly referenced definitions of an ontology is that of Gruber [46], which states that ontologies are explicit formal specifications of the terms in a domain and the relations among them. Noy and McGuinness expand on this by referring to an ontology as a formal explicit description of concepts in a domain of discourse (classes), properties of each concept describing various features and attributes of the concepts (slots), and restrictions on slots (facets) [16]. Ontologies are used in many applications mentioned in the literature, for example:
Sharing and annotation of information [15-17, 47, 48]
Reuse of domain knowledge [16, 17, 47]
Facilitating communication [48, 49]
Natural language understanding and knowledge-based systems design [15, 17]
Business process re-engineering [49]
Artificial Intelligence (AI) and Information Systems (IS) [15, 47, 49]
Despite these applications, ontology development is still a challenging task [50], and it suffers from two main limitations: use of ontology and construction difference. Use of ontology refers to the notion that an ontology is unlikely to cover all potential uses [15]. Construction difference refers to the notion that building an ontology is more akin to an art than a science, and that there is no single, correct methodology for building an ontology [16, 17, 48, 51]. A variety of methodologies do, however, exist in the literature, such as TOVE, Ontolingua and IDEF5 [17, 49, 51].
2.4 Process Architecture
The architecture of the processes of an enterprise is defined as the type of processes it contains and supports, as well as the relationships among them [18]. It can be defined for the whole of an enterprise or for some portion thereof and is generally presented as a high-level diagram [52]. Several whole-of-enterprise process architectures currently exist (e.g. the APQC Process Classification Framework [53] and the Zachman Framework [54]); however, they do not cover the scope of asset management at a sufficient level. A process architecture is a schematic that shows the ways in which the business processes of an enterprise are grouped and inter-linked. Developing a process architecture is generally seen as an important step in any process management initiative, as it lays the framework for existing business processes, including the relationships among them. Interested personnel can therefore view these business processes at varying levels of detail and scope, depending on their needs. In many cases, developing a process architecture becomes an iterative process as organisations understand more and more about their operations. Nevertheless, it is generally more appropriate to define the process architecture at an early stage of process management. Process architectures generally consist of several tiers (or levels) in a hierarchical orientation, with each tier describing more process detail than the tier before it. The first tier generally describes the overall, high-level, abstract activities that an organisation performs. The second tier generally describes the key processes that define an organisation and provide the mechanism for the implementation of the first-tier elements. The third tier (and possible sub-tiers) generally describes individual, well-defined processes that are implemented in order to achieve the goals of an organisation. This tier carries much detail, with many meta-models available to increase the capability of an organisation in modelling it (e.g. ARIS [55]). The fourth tier (and possible sub-tiers) generally describes the individual, segmented activities that an organisation performs. These activities link together to make up processes. A small structural sketch of this tier idea is given below.
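The sketch below illustrates the tiered structure only; the process names are invented for the example and are not taken from PAS 55 or from the ontology developed in this paper.

```python
# Minimal, hypothetical sketch of a four-tier process architecture: each tier refines
# the one above it, from abstract activity groups down to individual activities.
process_architecture = {
    "Manage assets": {                                   # tier 1: abstract activity
        "Asset lifecycle management": {                  # tier 2: key process
            "Plan maintenance": [                        # tier 3: defined process
                "Collect condition data",                # tier 4: activities
                "Prioritise work orders",
                "Schedule maintenance tasks",
            ],
        },
    },
}

def print_tiers(node, depth=1):
    items = node.items() if isinstance(node, dict) else [(leaf, None) for leaf in node]
    for name, children in items:
        print("  " * (depth - 1) + f"Tier {depth}: {name}")
        if children is not None:
            print_tiers(children, depth + 1)

print_tiers(process_architecture)
```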
3
METHODOLOGY
This section details the methodology followed in developing the fundamental and conceptual asset management ontology and process architecture. The overall methodology is shown in Figure 1, followed by the details of each phase.
Figure 1: Flowchart depicting the overall methodology for AM ontology and process architecture development
3.1 Document Selection
In order to conduct any text mining initiative, unstructured text (usually in the form of documents) must first be sourced. As the goal of this research was to develop a fundamental engineering asset management ontology, documents describing engineering asset management were first analysed. In total, over 100 articles (including journal articles, conference proceedings, books and practitioner publications) were scanned in order to find a suitable source, so as to establish a solid base for an asset management ontology and process architecture. The article that was ultimately chosen was PAS 55 (Part 2) [12]. In 2004, the Institute of Asset Management (IAM) [8, 12] published, in two parts, PAS 55 - a publicly available specification document. It was developed in response to demand from industry for a standard for carrying out asset management. The first part details the specification, whereas the second part details the guidelines for applying the first part. PAS 55 centres on a core concept that an asset management system consists of five stages/phases: policy and strategy; asset management information, risk assessment and planning; implementation and operation; checking and corrective action; and management review. The specification then details what an organisation should have in its current asset management practice. Currently, the document is being used to certify organisations that prove their effective asset management practices through gap analyses. PAS 55 can be thought of as a checklist of asset management elements that an organisation needs to adopt to improve its management of physical assets. The specification was developed by a large body of agencies, and in some ways is considered to be a quasi-standard (a BSI standard) in asset management. The manual is not meant to be prescriptive in the sense of direct instructions, thus making it open to interpretation. There are also no individual quality weightings for the elements discussed, so it is not easy to gauge exactly how best to apply PAS 55. It does, however, give a very good high-level (holistic) view of asset management and can be of great benefit to organisations looking to improve their asset management processes. There are several reasons for choosing the PAS 55-2 document. Firstly, the document is itself a summarised snapshot of engineering asset management, describing the essential elements of an effective and suitable asset management system. This means that the text contained within the document is more focused compared to some of the other texts. Secondly, PAS 55 was developed by a large consortium of practitioners practising asset management, and has gone through extensive review and update phases. PAS 55 is now being used to benchmark an organisation's asset management initiatives, to see whether the organisation is implementing the required elements of asset management. PAS 55 has received mostly positive feedback and uptake by industry, and is the first step towards a more rigid standard in engineering asset management.
3.2 Tool Selection for Ontology Development
Protégé is a free, open-source platform that provides a growing user community with a suite of tools to construct domain models and knowledge-based applications with ontologies.
At its core, Protégé implements a rich set of knowledge-modeling structures and actions that support the creation, visualization, and manipulation of ontologies in various representation formats. In particular, the Protégé-Frames editor was used as it enables users to build and populate ontologies that are frame-based, in accordance with the Open Knowledge Base Connectivity protocol (OKBC). In this model, an ontology consists of a set of classes organized in a hierarchy to represent a domain's concepts (in this case asset management), a set of slots associated to classes to describe their properties and relationships, and a set of instances of those classes - individual exemplars of the concepts that hold specific values for their properties. Protégé is one of the most common tools available to build ontologies, and as it was not the goal of this research to evaluate various ontology development tools, Protégé was selected due to its broad support base and ease of use [56].
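The frame-based model just described can be illustrated with a small, hypothetical Python sketch that mimics the class/slot/instance structure in plain data classes rather than through the Protégé API; the class and slot names are invented examples, not entries from the ontology developed in this paper.

```python
# Hypothetical, minimal mimic of a frame-based ontology (classes, slots, instances).
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AMClass:
    name: str
    parent: Optional["AMClass"] = None            # superclass in the hierarchy
    slots: dict = field(default_factory=dict)     # slot name -> allowed value type

@dataclass
class AMInstance:
    of_class: AMClass
    values: dict                                  # slot name -> concrete value

am_process = AMClass("AM PROCESS", slots={"performed_by": "AM ORGANIZATIONAL ENTITY"})
risk_assessment = AMClass("Risk assessment", parent=am_process)

annual_review = AMInstance(of_class=risk_assessment,
                           values={"performed_by": "Asset management team"})
print(annual_review.of_class.name, "is a kind of", annual_review.of_class.parent.name)
```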
3.3 Manual Text Mining
The PAS 55-2 document's main content is 36 pages in length (Sections 4.2-4.6). A manual text mining/analysis approach (as opposed to using a computer text mining application) was utilised due to this relatively small document length. The second reason for choosing a manual approach was so that contextual information could also be captured. As mentioned previously, computers are unable to interpret contextual information as well as humans reading the same passage of text. Both contextual information and experience in using the English language are positive arguments towards using a manual text mining approach. This, however, only applies if one has a small amount of text to analyse. Most text mining applications use many source documents; in these cases, a manual text mining approach cannot realistically be utilised. When text mining a source document, an analyst essentially scans for three open word classes, namely: nouns, verbs and adjectives. Other word classes can also be utilised, but as the intention was to extract only the key terms of asset management (for the ontology), these three word classes were sufficient. To recall general knowledge: nouns give names to persons, places, things and concepts in general; verbs are the class of words used for denoting actions; and adjectives are words used to modify nouns [57]. Extracted terms were placed into the following format: noun (adjective1…adjectivex). As expected, at the start of the text mining activity, many terms were continually being added to the list (implemented in Excel 2007). As the activity continued, it was found that fewer and fewer new terms were added, as they already existed in the list; rather, adjectives were added to the list where the text described a particular concept from another contextual point of view. Verbs, in this case, were used purely for realising and supporting the context of any particular passage(s) of text. From the 36 pages of text scanned, a total of 1193 individual terms were manually extracted (a small automated approximation of this extraction is sketched below, after the EPC element list).
3.4 Classification of Terms
The terms extracted from the previous step were then classified into several categories of terms, following the ARIS architecture methodology, i.e., the ARIS house concept [55], and in particular, the EPC modelling convention [55]. The EPC (event-driven process chain) notation is a process modelling notation that is composed of the following rudimentary elements:
Event – passive trigger points for a process or function (or activities)
Function – fundamental activity as performed by an agent
Organizational unit – agent performing the activity (e.g. person)
Resource object - physical objects that exist in the world which are utilized by a function and/or an organizational unit
Information system object – information systems-related objects as utilized by a function and/or an organizational unit
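The extraction described in Section 3.3 was performed manually. Purely as an illustration of how the noun (adjective1…adjectivex) format could be approximated automatically, the sketch below uses NLTK's part-of-speech tagger; the example sentence and the simple adjective-grouping rule are assumptions for demonstration, not part of the paper's method.

```python
# Illustrative only: the paper's extraction was manual. This approximates the
# noun (adjective1...adjectivex) format with NLTK's part-of-speech tagger
# (the 'punkt' and 'averaged_perceptron_tagger' resources must be downloaded first,
# e.g. via nltk.download()).
import nltk

def extract_terms(sentence):
    """Return noun terms, each annotated with the adjectives that precede it."""
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    terms, pending_adjectives = [], []
    for word, tag in tagged:
        if tag.startswith("JJ"):          # adjective: remember it for the next noun
            pending_adjectives.append(word.lower())
        elif tag.startswith("NN"):        # noun: emit a term with any stored adjectives
            suffix = f" ({', '.join(pending_adjectives)})" if pending_adjectives else ""
            terms.append(word.lower() + suffix)
            pending_adjectives = []
        else:                             # any other word class resets the context
            pending_adjectives = []
    return terms

# Hypothetical sentence in the style of PAS 55-2, not a quotation from it.
print(extract_terms("The organisation shall establish a documented asset management policy."))
```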
As there was no distinct separation between functions and processes (functions being the constructs of a process), "process" was used as the category to describe a procedure the organization performed. The final categories used were as follows: AM EVENT, AM ORGANIZATIONAL ENTITY, AM RESOURCE ENTITY, AM INFORMATION SYSTEM ENTITY and AM PROCESS. These categories form the uppermost elements of the ontology, which equate to the objects that exist in the asset management domain. The selection of these elements also aids in creating actual process chains in the EPC notation (a commonly adopted notation in the process management domain). In this case, the ontology represents the process architecture as part of its composition.
3.5 Ontology and Process Architecture Development
An ontology is an explicit account or representation of some part of a conceptualisation, a collection of terms and definitions relevant to business enterprises [58, 59]. Ontologies are generally created for specific applications, and in some cases domains; however, their creation is still generally considered to be an art rather than a science [17]. Several methodologies for ontology development currently exist in the literature, such as in [16, 17, 46, 47, 49, 51, 56, 59-62]. In most of this literature, a generic skeletal methodology for ontology development is proposed, as follows:
Identify a purpose for the ontology (determines the level of formality at which the ontology should first be described).
Identify scope (a specification is produced which fully outlines the range of information that the ontology must characterise).
Formalisation (create the code, formal definitions and axioms of terms in the specification).
Formal evaluation (generally includes the checking against purpose or competency questions specific to a particular ontology).
In [59], these generic steps are discussed in further detail. For example, formality refers to an ontology being either: highly informal (expressed loosely in natural language); structured informal (expressed in a restricted and structured form of natural language, greatly increasing clarity by reducing ambiguity); semi-formal (expressed in an artificial, formally defined language); or rigorously formal (meticulously defined terms with formal semantics, theorems and proofs of such properties as soundness and completeness). There are also several purposes for an ontology (mentioned briefly in an earlier section): communication (between people); inter-operability (among systems, achieved by translating between different modelling methods, paradigms, languages and software tools); and systems engineering (including re-usability, knowledge acquisition, reliability and specification). An ontology can also be generic, that is, reusable in a range of different situations. For asset management, as per this application, the ontology developed is unambiguous but informal. This is because the focus of this research is not the inter-operability of information systems, but rather a systematic and consistent approach to developing asset management process patterns. The subject matter is the third element of an ontology. Three widely accepted categories are:
Whole subjects (e.g. medicine, geology, finance)
Subjects of problem solving
Subjects of knowledge representation languages
The first category is generally the most popular and is frequently referred to as a domain ontology. Overlap between these categories is commonly encountered due to the difficulty of scoping an ontology perfectly. In this paper, the developed ontology is an asset management domain ontology. The methodology implemented for developing the initial asset management ontology is presented below, followed by more specific details of each step:
Figure 2: Ontology development methodology (using [16])
Defining the domain and scope of the ontology: As mentioned earlier, the ontology developed as part of this research is for the asset management domain, with the scope defined by the PAS 55-2 document. There is a perceived lack of clear understanding in the literature of what processes and elements make up the modern context and understanding of asset management.
Selecting important terms in the ontology: The extraction and classification of terms outlined in Sections 3.3 and 3.4 ensured that the most important terms were selected from the document. As the document itself is a summary of asset management, a high percentage of terms were, as expected, considered important to the ontology.
Defining the class and class hierarchy of the ontology: A combination development process was used to develop the class hierarchy of important terms. The uppermost classes were chosen as: AM EVENT, AM ORGANIZATIONAL ENTITY, AM RESOURCE ENTITY, AM INFORMATION SYSTEM ENTITY and AM PROCESS. A combination process is one where several top-level concepts are first selected, followed by the recursive process of placing both lower-level and middle-level elements into the class hierarchy. Thus, a combination approach combines a top-down approach (high-level concepts first, then lower-level concepts) with a bottom-up approach (group the most specific elements first, then generalise into more abstract constructs). When developing the class hierarchy, the following rule was applied to ensure consistency among classes: if a class A is a superclass of class B, then every instance of B is also an instance of A.
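A minimal sketch of this consistency rule using Python classes; the Pump class and its placement below Asset are hypothetical and serve only to illustrate the superclass/instance rule.

```python
# Upper-level classes chosen top-down; the specific subclasses are hypothetical
# examples added bottom-up and then generalised into the hierarchy.
class AMResourceEntity: ...
class Asset(AMResourceEntity): ...
class AssetSystem(AMResourceEntity): ...
class Pump(Asset): ...  # hypothetical lower-level class

# Consistency rule: if A is a superclass of B, every instance of B is also an instance of A.
pump = Pump()
assert isinstance(pump, Asset)             # Pump -> Asset
assert isinstance(pump, AMResourceEntity)  # Asset -> AM RESOURCE ENTITY
```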
Defining the slots of the classes: Slots define the internal structure of the concepts (classes); that is, slots are the internal properties of individual classes (relations). For this research, although slots were entered into Protégé, they were not given values or ranges; they were simply described as strings with no values set.
4 ASSET MANAGEMENT ONTOLOGY AND PROCESS ARCHITECTURE
Due to the size and layout constraints of this paper, a full presentation of the ontology and process architecture is not feasible, as hundreds of classes and slots were identified from a single (short) document. Particular extracts of the ontology and process architecture are presented with accompanying details.
Figure 3: Upper-level AM ontology elements (part A)
The diagram in Figure 3 shows part of the upper-level AM ontology (one superclass): AM RESOURCE ENTITY. This class describes a physical object that is used by (as input or output of) an activity/function/process to enable that activity/function/process to complete; in many cases the object is modified (e.g. an asset is repaired). There are four sub-classes in this case: asset, asset system, asset-related resource (e.g. spares/inventory), and AM document. Each class (both super and sub) can have a slot associated with it. As mentioned previously, a slot describes certain properties of a class. As an example from the diagram, an asset can have the property “performance target” with an associated value or value range (here simplified to a string value, although quantitative values could also be used).
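A minimal sketch, with an invented asset name, of how a sub-class of AM RESOURCE ENTITY might carry the “performance target” slot as a plain string, mirroring how slots were left untyped in Protégé.

```python
from dataclasses import dataclass

@dataclass
class AMResourceEntity:
    name: str

@dataclass
class Asset(AMResourceEntity):
    # Slot kept as a plain string, as in the Protégé model; a quantitative
    # value or range could be substituted here.
    performance_target: str = ""

# Hypothetical instance for illustration only.
compressor = Asset(name="compressor C-101", performance_target="availability >= 98%")
print(compressor)
```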
Figure 4: Upper-level AM ontology elements (part B)
Continuing from Figure 3, Figure 4 shows the remaining superclasses of the developed ontology, namely: AM ORGANIZATIONAL ENTITY, AM EVENT, AM INFORMATION SYSTEM ENTITY and AM PROCESS. It can be seen that the more slots are developed, the more defined a class can become. By putting exact values into the slots and slot ranges, instances of classes can be created. For example, a specific PERSON in the organization will have a particular set of values for the slots competence, expertise, qualifications and so on. By becoming an instance of a class, the element becomes more concrete and less abstract. The same goes for the ROLE class: a specific role will have its slot properties filled in. A major benefit of a detailed and comprehensive ontology is the ability to make relational statements. For example, if the ROLE class had a slot called required qualification and an instance of the PERSON class had a specific qualification value filled in, the following statement could be made: if the qualification (property) of an instance of the PERSON class is equal to or greater than the required qualification (property), then instance X is suitable for role Y. The figure below shows how a particular instance is represented within an ontology (in this case a specialist asset designer is chosen for illustrative purposes only).
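The relational statement above can be sketched as follows; the qualification scale and the instance values are invented for illustration.

```python
from dataclasses import dataclass

# Qualification levels are assumed to be ordered; this scale is invented.
QUALIFICATION_RANK = {"trade certificate": 1, "diploma": 2, "bachelor": 3, "masters": 4}

@dataclass
class Role:                      # instance of the ROLE class
    name: str
    required_qualification: str  # slot: required qualification

@dataclass
class Person:                    # instance of the PERSON class
    name: str
    qualification: str           # slot: qualification

def suitable(person: Person, role: Role) -> bool:
    """Relational statement: person X is suitable for role Y if their qualification
    is equal to or greater than the role's required qualification."""
    return QUALIFICATION_RANK[person.qualification] >= QUALIFICATION_RANK[role.required_qualification]

designer = Role("specialist asset designer", required_qualification="bachelor")
candidate = Person("J. Smith", qualification="masters")
print(suitable(candidate, designer))  # True
```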
Figure 5: Example of instance of class representation
The arbitrary values of high, management and internal are chosen only to illustrate how an instance is represented within an ontology and its relationship with its parent class. As an ontology is filled in to a more defined (deeper) level, more and more instances would be created, with the super/parent classes remaining in an abstract form. The diagram in Figure 6 shows an extract of the process architecture (as per the source document). It shows how the levels discussed in Section 2.4 are actually realised. As this is only a minimal extract, many processes are necessarily omitted.
Figure 6: Extract of AM process architecture
Each element describes a particular process, which can then be further subdivided into sub-processes, and so on, which is precisely what a process architecture is. Each process element therefore follows the same principles as those discussed earlier with regard to instances of classes.
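A brief sketch of this recursive decomposition; the process names are hypothetical and do not reproduce the actual PAS 55-2 derived architecture.

```python
from dataclasses import dataclass, field

@dataclass
class AMProcess:
    name: str
    sub_processes: list = field(default_factory=list)

    def walk(self, depth: int = 0) -> None:
        # Print the process tree with indentation showing the decomposition level.
        print("  " * depth + self.name)
        for sub in self.sub_processes:
            sub.walk(depth + 1)

# Hypothetical extract: each process decomposes into sub-processes, and so on.
architecture = AMProcess("asset management", [
    AMProcess("AM planning", [
        AMProcess("develop AM objectives"),
        AMProcess("develop AM plan(s)"),
    ]),
    AMProcess("implementation and operation", [
        AMProcess("manage AM information"),
    ]),
])
architecture.walk()
```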
5 ANALYSIS OF RESULTS
No existing ontologies were found that encapsulate the real-world objects of the asset management domain. Asset management processes, despite being regarded as important by the asset management community, have received limited attention in research. Asset management is composed of many processes which organisations implement, manage and reuse constantly in real-world operations. It is logical to identify these processes and present them in a systematic and efficient manner that supports reuse in industry. Given the lack of explicit asset management ontologies and process architectures in the current literature (in many cases processes are only implied), a direct comparison with an existing ontology and process architecture was not possible. The ontology literature suggests reusing existing ontologies where possible (or at least modifying them). This paper set out to develop a first-draft, fundamental asset management ontology, as well as to show that this is in fact possible and beneficial. Using only one source document, however, limits the scope and rigour of the results. Although PAS 55-2 was found to be a solid summary of asset management, other sources cover elements that are absent here. Therefore, more source documents should be analysed to capture a broader scope of asset management. Analysis of the existing asset management literature (beyond the PAS 55-2 document) makes clear that asset management suffers from inconsistent terminology, possibly stemming from its multi-disciplinary origin and general complexity in application. Thus, it is envisioned that a manual text mining methodology with several source documents should be utilised, rather than computer-aided text mining, because contextual information that can be captured in slots and instances of classes may be overlooked when a computer analyses the text. An iterative process should also be used to enable the addition and elimination of terms/classes/instances/slots where necessary. In its current form, the developed ontology (and process architecture) provides a solid base for future additions and modifications, including the incorporation of feedback from industry. By building a more rigorous ontology, relational statements can be utilised, which will lead to the development of an asset management knowledge system/base.
6 CONCLUSION
This research presents the methodology and development of an initial, fundamental asset management ontology and, subsequently, an asset management process architecture. The results show that an asset management ontology and process architecture can help support an organisation’s asset management initiatives through consistent knowledge representation, knowledge-based systems development, process representation and improvement, process benchmarking and process compliance checking. The developed ontology consists of hundreds of classes and slots, extracted and classified from a single document (PAS 55-2). This research illustrates how an ontology can benefit the asset management community through common representation of key terms and their relationships to each other. Future work in this area should see the inclusion of additional terms so as to build a more comprehensive asset management ontology.
REFERENCES
1 H. W. Penrose, Physical asset management for the executive. Old Saybrook, CT, USA: Success by Design Publishing, 2008.
2 J. E. Amadi-Echendu, "Managing physical assets is a paradigm shift from maintenance," presented at 2004 IEEE International Engineering Management Conference, 2004.
3 C. A. Schuman and A. C. Brent, "Asset life cycle management: towards improving physical asset performance in the process industry," International Journal of Operations and Production Management, vol. 25, pp. 566-579, 2005.
4 R. Moore, "Many facets to an effective asset management strategy," Plant Engineering, pp. 35-36, 2006.
5 P. Narman, M. Gammelgard, and L. Nordstrom, "A functional reference model for asset management applications based on IEC 61968-1," Department of Industrial Information and Control Systems, Royal Institute of Technology, KTH 2006.
6 R. Lutchman, Sustainable asset management: linking assets, people, and processes for results: DEStech Publications, Inc., 2006.
7 J. E. Amadi-Echendu, R. Willett, K. Brown, J. Lee, J. Mathew, N. Vyas, and B. S. Yang, "What is engineering asset management?," presented at 2nd World Congress on Engineering Asset Management (EAM) and the 4th International Conference on Condition Monitoring, Harrogate, United Kingdom, 2007.
8 The Institute of Asset Management, PAS 55-1 (Publicly Available Specification - Part 1: specification for the optimized management of physical infrastructure assets), 2004.
9 D. G. Woodward, "Life cycle costing - theory, information acquisition and application," International Journal of Project Management, vol. 15, pp. 335-344, 1997.
10 E. Wittwer, J. Bittner, and A. Switzer, "The fourth national transportation asset management workshop," International Journal of Transport Management, vol. 1, pp. 87-99, 2002.
11 J. Mathew, "Engineering asset management - trends, drivers, challenges and advances," presented at 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM-IMS 2008), Beijing, China, 2008.
12 The Institute of Asset Management, PAS 55-2 (Publicly Available Specification - Part 2: guidelines for the application of PAS 55-1), 2004.
13 D. Anderson, P. Kohler, and P. Kennedy, "A certification program for asset management professionals," presented at 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM-IMS 2008), Beijing, China, 2008.
14 IPWEA, International Infrastructure Management Manual (Version 3.0): Institute of Public Works Engineers, 2006.
15 B. Chandrasekaran, J. R. Josephson, and V. R. Benjamins, "What are ontologies, and why do we need them?," IEEE Intelligent Systems and their Applications, vol. 14, pp. 20-26, 1999.
16 N. F. Noy and D. L. McGuinness, "Ontology development 101: a guide to creating your first ontology," Stanford KSL Technical Report KSL, 2009.
17 D. M. Jones, T. J. M. Bench-Capon, and P. R. S. Visser, "Methodologies for ontology development," in 15th IFIP World Computer Congress - IT & KNOWS Conference. Budapest: Chapman-Hall, 1998.
18 O. Barros, "Business processes architecture and design," BPTrends, 2007.
19 I. Moorhouse, "Asset management of irrigation infrastructure – the approach of Goulburn-Murray Water, Australia," Irrigation and Drainage Systems, vol. 13, pp. 165-187, 1999.
20 C. Spires, "Asset and maintenance management – becoming a boardroom issue," Managing Service Quality, vol. 6, pp. 13-15, 1996.
21 M. Hodkiewicz, "Education in engineering asset management (Paper 064)," presented at ICOMS Asset Management Conference, Melbourne, Australia, 2007.
22 R. E. Brown and B. G. Humphrey, "Asset management for transmission and distribution," IEEE Power and Energy Magazine, vol. 3, pp. 39-45, 2005.
23 M. Mohseni, "What does asset management mean to you?," presented at 2003 IEEE PES Transmission and Distribution Conference and Exposition, 2003.
24 C. Palombo, "Eight steps to optimize your strategic assets," IEEE Power and Energy Magazine, vol. 3, pp. 46-54, 2005.
25 C. P. Holland, D. R. Shaw, and P. Kawalek, "BP's multi-enterprise asset management system," Information and Software Technology, vol. 47, pp. 999-1007, 2005.
26 Y. Mansour, L. Haffner, V. Vankayala, and E. Vaahedi, "One asset, one view - integrated asset management at British Columbia Transmission Corporation," IEEE Power and Energy Magazine, vol. 3, pp. 55-61, 2005.
27 Y. Sun, L. Ma, and J. Mathew, "Asset management processes: modelling, evaluation and integration," in Second World Congress on Engineering Asset Management. Harrogate, UK, 2007.
28 L. Ma, Y. Sun, and J. Mathew, "Asset management processes and their representation," presented at 2nd World Congress on Engineering Asset Management, Harrogate, UK, 2007.
29 CIEAM, "EAM 2020 roadmap: report of a workshop facilitated by the Institute of Manufacturing, UK for the CRC for Integrated Engineering Asset Management, Australia," 2008.
30 R. F. Stapelberg, Risk based decision making (RBDM) in integrated asset management. Brisbane, Australia: CIEAM, 2006.
31 D. L. Dornan, "Asset management: remedy for addressing the fiscal challenges facing highway infrastructure," International Journal of Transport Management, vol. 1, pp. 41-54, 2002.
32 I. B. Hipkin, "A new look at world class physical asset management strategies," South African Journal of Business Management, vol. 29, pp. 158-163, 1998.
33 G. O'Loghlin, "Asset management - has there been any reform?," Canberra Bulletin of Public Administration, vol. 99, pp. 40-45, 2001.
34 L. A. Newton and J. Christian, "Challenges in asset management - a case study," presented at CIB 2004 Triennial Congress, Toronto, ON, 2004.
35 R. I. Godau, "The changing face of infrastructure management," Systems Engineering, vol. 2, pp. 226-236, 1999.
36 Z. Okonski and E. Parker, "Enterprise transforming initiatives," IEEE Power and Energy Magazine, vol. 1, pp. 32-35, 2003.
37 M. Rajman and R. Besancon, "Text mining: natural language techniques and text mining applications," presented at 7th IFIP Working Conference on Database Semantics (DS-7), 1997.
38 I. Spasic, S. Ananiadou, J. McNaught, and A. Kumar, "Text mining and ontologies in biomedicine: making sense of raw text," Briefings In Bioinformatics, vol. 6, pp. 239-251, 2005.
39 A.-H. Tan, "Text mining: the state of the art and the challenges," presented at PAKDD 1999 Workshop on Knowledge Discovery from Advanced Databases, 1999.
40 R. A.-A. Erhardt, R. Schneider, and C. Blaschke, "Status of text-mining techniques applied to biomedical text," Drug Discovery Today, vol. 11, pp. 315-325, 2006.
41 M. Hearst, "What is text mining?," http://www.jaist.ac.jp/~bao/MOT-Ishikawa/FurtherReadingNo1.pdf, 2003.
42 J. Allen, Natural language understanding, 2nd ed: Benjamin/Cummings Publishing, 1995.
43 S. Russell and P. Norvig, Artificial intelligence: a modern approach: Prentice Hall, 1995.
44 D. W. Embley, D. M. Campbell, and R. D. Smith, "Ontology-based extraction and structuring of information from data-rich unstructured documents," presented at International Conference On Information And Knowledge Management, Bethesda, Maryland, USA, 1998.
45 C. Price and K. Spackman, "SNOMED clinical terms," British Journal of Healthcare Computing & Information Management, vol. 17, pp. 27-31, 2000.
46 T. R. Gruber, "Towards principles for the design of ontologies used for knowledge sharing," International Journal of Human Computer Studies, vol. 43, pp. 907-928, 1993.
47 M. Uschold, "Building ontologies: towards a unified methodology," Technical Report - University of Edinburgh Artificial Intelligence Applications Institute AIAI TR, 1996.
48 M. Cristani and R. Cuel, "A comprehensive guideline for building a domain ontology from scratch," in International Conference on Knowledge Management (I-KNOW'04). Graz, Austria, 2004, pp. 205-212.
49 P. Bertolazzi, C. Krusich, and M. Missikoff, "An approach to the definition of a core enterprise ontology: CEO," in OESSEO 2001 - International Workshop on Open Enterprise Solutions: Systems, Experiences, and Organizations. Rome, 2001.
50 P. Velardi, P. Fabriani, and M. Missikoff, "Using text processing techniques to automatically enrich a domain ontology," presented at 18th International Conference on Formal Ontology in Information Systems, Ogunquit, Maine, USA, 2001.
51 M. Gruninger, K. Atefi, and M. S. Fox, "Ontologies to support process integration in enterprise engineering," Computational and Mathematical Organization Theory, vol. 6, pp. 381-394, 2000.
52 P. Harmon, "Business process architecture and the process-centric company," BPTrends, vol. 1, 2003.
53 APQC, "Process Classification Framework," 2009.
54 J. Zachman, "Concise Definition of the Enterprise Framework," 2009.
55 M. Nuttgens, T. Feld, and V. Zimmermann, "Business Process Modeling with EPC and UML: transformation or integration?," presented at Proceedings of The Unified Modeling Language - Technical Aspects and Applications, Mannheim, Germany, 1998.
56 O. Corcho, M. Fernandez-Lopez, and A. Gomez-Perez, "Methodologies, tools and languages for building ontologies. Where is their meeting point?," Data and Knowledge Engineering, vol. 46, pp. 41-64, 2003.
57 T. Amble, The understanding computer - natural language understanding in practice, 2008.
58 N. Guarino and P. Giaretta, Ontologies and knowledge bases: towards a terminological clarification. Amsterdam: IOS Press, 1995.
59 M. Uschold, M. King, S. Moralee, and Y. Zorgios, "The enterprise ontology," The Knowledge Engineering Review, vol. 13, pp. 31-89, 1998.
60 N. Guarino, "Understanding, building, and using ontologies," LADSEC-CNR, 1996.
61 M. Uschold and M. King, "Towards a methodology for building ontologies," in International Joint Conference on Artificial Intelligence - Workshop on Basic Ontological Issues in Knowledge Sharing, 1995.
62 M. Uschold and M. Gruninger, "Ontologies: principles, methods and applications," The Knowledge Engineering Review, vol. 11, pp. 93-136, 1996.
Acknowledgments
This research was conducted within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government’s Cooperative Research Centres Programme. This research is also sponsored by QR Limited. The authors are grateful for both the financial support and the opportunity of working with these organisations.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
TOWARDS A MAINTENANCE SEMANTIC ARCHITECTURE
Mohamed Hedi Karray a, Brigitte Chebel Morello a and Noureddine Zerhouni a
a Automatic Control and Micro-Mechatronic Systems Department, 24, Rue Alain Savary, 25000 Besançon, France
Technological and software progress, together with the evolution of processes within companies, has highlighted the need for maintenance systems to evolve from autonomous systems to cooperative, information-sharing systems based on software platforms. This need has given rise to various maintenance platforms. The first part of this study investigates the different types of existing industrial platforms and characterises them against two criteria, namely information exchange and relationship intensity. This allowed the e-maintenance architecture to be identified as the most efficient current architecture. Despite its effectiveness, the latter can only guarantee technical interoperability between its various components. Therefore, the second part of this study proposes a semantic, knowledge-based architecture, thereby ensuring a higher level of semantic interoperability. To this end, a specific maintenance ontology has been developed.
Key words: maintenance systems, e-maintenance, interoperability, semantic maintenance, ontology.
1 INTRODUCTION
Today’s enterprises must respond to increasing demands in terms of quality and quantity of products and services, responsiveness and cost reduction. To deal with these demands, a company must have a reliable production system, well maintained by an efficient and inexpensive maintenance system. A high-performing and well-organised maintenance service contributes to production system consistency; it extends the life of industrial equipment and thus improves the overall performance of the company. This need for maintenance concerns any type of enterprise, whether industrial or a service provider. Since the 1980s, maintenance services have been progressively structured and standardised. Market evolution, globalisation and the emphasis on profit and competitiveness have driven the development of new concepts of production organisation as well as of maintenance organisation. At the same time, quality began to play an important role, as did dependability and, specifically, the maintenance function in companies. Information and communication technologies (ICT) have helped to establish and evolve these roles. Thanks to ICT, the emergence of the Web and the Internet, maintenance and monitoring services can be delivered automatically, remotely and through various distributed information systems. Hence the emergence of services offered through maintenance architectures, ranging from autonomous systems to integrated systems where cooperation and collaboration are vital to any operation. On the other hand, setting up these fundamental aspects is a complex task. We are therefore particularly interested in the type of information exchanged and in the complex relationships between the different systems and applications in these architectures. At this level, we are confronted with a classical problem in information systems: interoperability, that is, the ability of two or more systems or components to exchange information and to use the information that has been exchanged [1]. In this paper we focus on the semantic interoperability of this exchanged information, on how it can guarantee an understandable exchange, and on how it can evolve the existing architecture from a static one to an intelligent, knowledge-based one. To do this, we build a maintenance ontology, which will be shared between the different systems of the architecture. The objective of this paper is twofold: (i) to review existing maintenance architectures and (ii) to propose a new generation of maintenance architecture that is semantically interoperable. The rest of the paper is organised as follows. Section 2 presents the complex characteristics of systems and their relations. Sections 3 and 4 recount the history of maintenance computing systems and the various existing maintenance architectures. Section 5 presents the semantic interoperability problem and its importance for setting up the s-maintenance architecture.
Section 6 builds a domain ontology of maintenance based on an analysis of the maintenance processes. Future work on the use and evaluation of the ontology, and the conclusion, are developed in sections 7 and 8.
2 COMPLEX SYSTEMS CHARACTERISTICS
In this section we develop two classification criteria for characterising software architectures from a macroscopic view, leaving the details (protocols, etc.) to be studied when these architectures, including e-maintenance, are to be improved.
2.1 Information evolution
The information used in the different applications of the maintenance field has changed in the light of information technology developments and with the growing complexity of the industrial environment. In the past, this information was entered manually on paper (drawings, diagrams, manuals) and was exchanged verbally between operators in an informal way. Today, in contrast, the information has become formalised and structured so that it can be manipulated by information systems. At the same time, the enterprise environment has become increasingly complex and production systems more dynamic, which makes the context of the information’s use more variable and unstable. Information is uncertain; it evolves with the changing context. The way to reduce this uncertainty is to place the information in a context that gives it meaning and direction, by turning it into knowledge for a given objective. This knowledge then becomes, along with other information and knowledge, a source for acquiring skills. Today’s information systems handle this knowledge to provide decision support for their users in problem solving and to improve their skills in this field.
2.2 Relations between systems
Thanks to technological and informatics evolution, information systems which were independent and autonomous began to cooperate by exchanging and sharing information. More recently, new information and communication technologies (ICT) have enabled the migration of these different systems into an integrated system where cooperation and collaboration are essential to any operation. There are different types of relationships between the systems under review, and these will be the basis for the classification of the different maintenance architectures (see Figure 1).
Figure 1. Relationship intensity between systems.
- Autonomy is a regime under which a system has maximum power of management and is independent of all other systems and components. There is no communication with other systems, so the system must be self-sufficient in terms of the necessary information.
- A communication relationship is a link between two or more systems that allows transfers or exchanges. The information transmitted is no longer limited to alphanumeric characters but also includes images, sound and video clips. In this context, the term communication is often used as a synonym for telecommunications.
- A cooperative relationship describes cooperative work carried out through a division of labour in which each actor is responsible for part of the resolution of the problem. In our context it is mainly technological and industrial cooperation, that is, a cooperative agreement between independent systems committed to the joint production of maintenance services.
- A collaborative relationship is a strategic partnership to achieve excellence through a combination of skills, suppliers or products. Collaboration involves a mutual commitment of stakeholders in a coordinated effort to resolve the problem by pooling resources, information and skills so that the organisations can better adapt to their environment.
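A minimal sketch of this classification as an ordered scale; the numeric values are arbitrary and only encode the increasing intensity from autonomy to collaboration.

```python
from enum import IntEnum

class RelationshipIntensity(IntEnum):
    """The four relationship types of Figure 1, ordered by increasing intensity."""
    AUTONOMY = 1       # no exchange; the system is self-sufficient
    COMMUNICATION = 2  # transfer/exchange of data (text, images, sound, video)
    COOPERATION = 3    # division of labour across independent systems
    COLLABORATION = 4  # pooled resources, information and skills

# The ordering allows architectures to be compared by relationship intensity.
assert RelationshipIntensity.COLLABORATION > RelationshipIntensity.COMMUNICATION
```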
3 HISTORY OF MAINTENANCE COMPUTING SYSTEMS
The development of computer systems in the field of industrial maintenance began when maintenance was recognised as a fundamental function in the company and particular stress was laid on the study and development of the procedures of this function. Information used in maintenance has changed according to the evolution of information technologies and the growing complexity of the enterprise environment. The information structure has changed so that it can be handled by information systems. We can identify various aspects in the evolution of maintenance computing systems [2, 3]:
- Computerisation of maintenance procedures: The automation of business management allowed several maintenance procedures to be computerised. Computer files of equipment, interventions, stocks, plans, diagrams, etc. were thus created. The integration of these files and the automation of maintenance activities were made possible by CMMS packages (Computerised Maintenance Management Systems). Daily maintenance events were handled: breakdowns, preventive maintenance execution and stock management.
- Interfacing with software packages: Thereafter, these software packages had to interface with other enterprise software, such as purchasing and accounting, which were already computerised. Large ERP (Enterprise Resource Planning) systems were the next step in streamlining business processes and integrating maintenance with other corporate functions.
- Evolution of the technical field: Informatics has also made progress in the technical field of maintenance. Modern maintenance analysis and control techniques emerged in parallel with computing: vibration analysis, oil analysis, IR thermography, ultrasound, etc. We can distinguish two main groups among these systems: analysis systems, and acquisition and control systems.
* Analysis systems, sometimes coupled with expert systems, have been developed; they are intended to provide decision support to equipment operators in diagnosis, prognosis, repair, etc.
* Among the acquisition and control systems we can cite SCADA (supervisory control and data acquisition), command and control of equipment, technical data and documentation management systems, etc.
- Integration of intelligent modules in the maintenance architecture: The presence of these various intelligent maintenance modules leads us to make them communicate and collaborate. The construction of intelligent modules or building blocks must help to provide indicators for making the right strategic decisions and maintenance policy.
- Development of ICT: The development of new information and communication technologies, the extension of the Internet into the enterprise, application integration, and the emergence of new maintenance policies mark a new stage in the computerisation of maintenance, which some call “intelligent maintenance”. This leads to cooperative and distributed architectures of maintenance systems communicating with each other over networks. These maintenance architectures can be implemented using maintenance platforms whose main idea is to offer a maintenance service via the Internet. The maintenance platforms proposed in the Proteus or OSA/CBM projects can serve as examples.
4 DEFINITIONS OF VARIOUS ARCHITECTURES
We propose a terminology characterising the various maintenance computing systems and classify them along two dimensions: the type of information used in the system, and the intensity of possible relationships with other systems (see Figure 2). The more intense the relation, the more connected and integrated the systems are, and the more we speak of common architectures to be implemented across platforms. The volume of automatically managed information is represented by the surface of each system’s square and increases with the intensity of collaboration and with the complexity of the shared information. We note that there is a parallel between our classification of systems and the classification of enterprises presented in several works [4].
Figure 2. Maintenance architectures classification
- A maintenance system is a single computer system installed and used on the maintenance site. This system is autonomous, with no data exchange with other systems. In parallel with the classification of companies, this corresponds to the traditional company; we therefore speak of a traditional information system architecture.
- A remote maintenance system consists of at least two computer systems, a transmitter and a receiver of data and information exchanged at a distance. According to the AFNOR definition, remote maintenance is “the maintenance of an asset executed without physical access of the staff to the equipment”. We speak of a distributed architecture, based on the concept of distance, that can transfer data by radio, telephone line or over a local network.
- With the extension of the Internet, remote maintenance systems evolve towards the concept of e-maintenance. An e-maintenance system is implemented on a platform integrating various cooperative distributed systems and maintenance applications. This platform relies on the global Internet (hence the term e-maintenance) and Web technology, which allows data and information to be exchanged, shared and distributed and common knowledge to be created. Here the concept of intelligent maintenance can be exploited, and proactive and cooperative maintenance strategies are put in place.
- Finally, we propose an architecture intended to improve on the e-maintenance architecture at the level of communication and data exchange between systems, and which takes account of the semantics of the data processed in the applications: s-maintenance (where “s” means semantic) [5]. We describe in Section 5 this concept, which relies on semantic knowledge via a maintenance ontology.
4.1 Maintenance
This is the basic notion, where the system is completely autonomous. AFNOR defines maintenance as the “combination of all technical, administrative and managerial actions during the life cycle of an item intended to retain it in, or restore it to, a state in which it can perform the required function”. A maintenance system is represented by an application supporting the maintenance or reliability activities of the maintenance function, such as logistics, intervention planning and inventory (managed by the CMMS or ERP), diagnosis and repair (expert systems, databases), and equipment monitoring (SCADA, digitally controlled equipment). The architecture of these systems can vary according to the objectives. We therefore propose to describe the architectures of these systems by a generic scheme valid for any enterprise system. This scheme consists of two main parts, namely the physical system and the management system. The latter produces the results or decisions based on information coming from the physical system [6]. The acquisition of information is manual, or rather limited in its automation, and decisions are thus made through an information system.
4.2 Remote maintenance
The remote maintenance architecture consists of two or more systems or subsystems that are apart from each other and exchange data between them. One of the systems can function as a data acquisition system, representing the issuer of structured data. The second system is the receiver, functioning as a data processing system.
The transmitter can send data automatically or in response to a request from the receiving system. The results of data processing (the output) are used by human actors or may be returned to the acquisition system to adjust the data acquisition. For data to be exchanged and accepted by both systems, they must be structured. Still keeping the aspect of distance, remote maintenance can be installed on a single production site, or it may be distributed among different production or maintenance sites and/or a maintenance centre.
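A rough sketch of such a structured exchange between a transmitter (acquisition system) and a receiver (processing system); the record fields, threshold and equipment identifier are invented for illustration.

```python
import json

# Structured record emitted by the acquisition system (transmitter); the schema is invented.
reading = {
    "equipment_id": "pump-07",
    "sensor": "bearing-temperature",
    "value": 81.4,
    "unit": "degC",
    "timestamp": "2009-06-15T10:42:00Z",
}

def process(message: str) -> str:
    """Receiver side: parse the shared structure and decide whether to raise an intervention request."""
    data = json.loads(message)
    if data["sensor"] == "bearing-temperature" and data["value"] > 80.0:
        return f"intervention request for {data['equipment_id']}"
    return "no action"

# Both systems can exchange the record only because they agree on its structure.
print(process(json.dumps(reading)))
```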
Figure 3. Example of remote maintenance architecture
An example of a remote maintenance architecture (cf. Figure 3) was created in the TEMIC project (Industrial Cooperative Remote Maintenance), which enabled cooperative remote maintenance: not only can maintenance staff perform work at a distance (remote maintenance), but they can do it in collaboration with other experts (cooperative work). Emphasis was placed on the mobility of the cooperating members on several levels:
- Distant level: the remote maintenance actors are reachable wherever they are via the mobile network (GSM/GPRS).
- Local level (nomadism): detection of the presence of remote maintenance actors within a preset perimeter (about 100 m) of the company that manages the maintenance, in order to reach the most experienced technician for a particular problem.
4.3 E-maintenance
The e-maintenance architecture operates via the Internet, which allows information to be cooperatively exchanged, shared and distributed to the various partner systems of the network (see Figure 4). The principle consists in integrating the whole set of maintenance systems into a single information system [7]. The systems offer different information formats that are not always compatible for sharing; this requires coordination and cooperation between systems to make them interoperable. According to [8], interoperability is “the ability of two communication systems to communicate in an unambiguous way, whether such systems are similar or different. One can say that making interoperable is creating compatibility”. The e-maintenance architecture must ensure interoperability between each of these different systems. The MIMOSA project (Machinery Information Management Open Systems Alliance) was the first, in the 1990s in the United States, to develop a complex information system for maintenance management [9]. The project aimed to develop a collaborative maintenance network by providing an open standard EAI (Enterprise Application Integration) protocol. The organisation recommends and develops specifications for information integration to allow the management and control of added value through open, integrated and industry-oriented solutions. Information blocks developed in this project have been proposed for creating an e-maintenance platform [10].
Figure 4. E-maintenance architecture.
A functional architecture, OSA/CBM (Open System Architecture for Condition-Based Maintenance), dedicated to the development of condition-based or predictive maintenance strategies [11], was developed from the MIMOSA CRIS relational schema. It contains seven flexible modules whose contents (methodology and algorithms) are configurable by the user (cf. Figure 5). It can be simplified and adapted to each industrial requirement by reducing the number of modules.
Figure 5. OSA/CBM project
An e-maintenance architecture was presented in the European project Proteus (cf. Figure 6). The project was designed to provide a cooperative distributed e-maintenance platform including the existing systems for data acquisition, control, maintenance management, diagnosis assistance, documentation management, etc. The concept of this platform is defined by a single and coherent description of the installation to be maintained, through a generic architecture based on Web service concepts, and by proposing integration models and technological solutions. These techniques help to guarantee the interoperability of heterogeneous systems so as to ensure the exchange and sharing of information, data and knowledge. The aim of the platform is not only to integrate existing tools, but also to anticipate their evolution through the introduction of new services.
Figure 6. Proteus e-maintenance platform [www.proteus-iteaproject.com]
Web services were conceived to guarantee interoperability between the various applications of the platform. However, this interface interconnection protocol does not treat the semantics of the input and output data. XML, used as the basis for data exchange, manages flat structures and must be used with the RDF standard to guarantee the links between these entities. This architecture guarantees technical interoperability (the link between IT systems and the services they provide) but does not take account of semantic interoperability, which consists in giving “meaning” (semantics) to exchanged information and in making sure that this meaning is shared across all the interconnected systems. Taking this semantics into account makes it possible for these systems to combine the information received with other local information and to treat it in a way appropriate to this semantics [12]. The European project PROMISE (Product Lifecycle Management and Information Tracking Using Smart Embedded Systems) [13] proposes a closed-loop design and lifecycle management system. The objective of PROMISE is to allow information flow management to go beyond the customer, to close the PLC (Product Lifecycle) information loops, and to enable the seamless e-transformation of PLC information into knowledge [14]. This project focused on three working areas related to e-maintenance issues [15]:
- Area 1: E-maintenance and e-service architecture design tools (design of the e-maintenance architecture as well as its platform for e-service applications).
- Area 2: Development of watchdog computing for prognostics (development of advanced hashing algorithms for embedded product behaviour assessment and prognostics).
- Area 3: Web-based and tether-free monitoring systems (development of “interface technologies” between the product e-service system platform and Web-enabled e-business software tools).
DYNAMITE (Dynamic Decisions in Maintenance) is a European project which aims to create an infrastructure for mobile monitoring technology and to create new devices that will bring major advances in the capability of maintenance decision systems incorporating sensors and algorithms [16]. The key features include wireless telemetry, intelligent local history in smart tags, and on-line instrumentation [15]. In [15], Iung et al. outline most e-maintenance platforms in order to evaluate their capabilities from different points of view, such as collaboration, process formalisation, knowledge management, knowledge capitalisation, interoperability, etc. In terms of knowledge capitalisation, the major contributions have been made by the OSA-CBM and PROMISE project platforms; the latter also has the major contribution in the context of knowledge management. Regarding interoperability, the MIMOSA and OSA-CBM standards have made the most relevant contributions. On the other hand, Iung et al. did not address semantic interoperability and, to our knowledge, existing platforms do not focus on this issue. Hence, in this work we stress this problem and present the s-maintenance (“s” for semantic) architecture, which guarantees a high level of semantic interoperability between the various systems of the maintenance platform.
5 SEMANTIC INTEROPERABILITY IN MAINTENANCE ARCHITECTURES
We seek to set up an architecture that addresses the semantic interoperability of data.
5.1 Semantic interoperability
The IEEE Standard Computer Dictionary defines interoperability as the “ability of two or more systems or components to exchange information and to use the information that has been exchanged” [1]. From this definition it is possible to decompose interoperability into two distinct components: the ability to exchange information, and the ability to use the information once it has been received. The former is denoted ‘syntactic interoperability’ and the latter ‘semantic interoperability’. A small example suffices to demonstrate the importance of solving both problems. Consider two persons who do not share a common language. They can speak to one another and both individuals will recognise that data has been transferred (they can also probably parse out individual words, recognise the beginning and end of message units, etc.). Nevertheless, the meaning of the message will be mostly incomprehensible; they are syntactically but not semantically interoperable. Similarly, consider a person who is blind and one who is deaf, but who both use the same language. They can attempt to exchange information, one by speaking and one by writing, but since they are incapable of receiving the messages, they are semantically but not syntactically interoperable [17]. In other words, semantic interoperability ensures that these exchanges make sense, that is, that the requester and the provider have a common understanding of the “meanings” of the requested services and data [18]. Achieving semantic interoperability among different information systems is very laborious, tedious and error-prone in a distributed and heterogeneous environment [19]. It is currently the subject of various works, which Park and Ram [20] classify into three broad approaches:
1. Mapping-based interoperability. This aims to build mappings between data or model elements that are semantically connected [21]. A set of transformation rules is installed to translate or federate local schemas into a global schema; one therefore studies semantic interoperability via transformation [22, 23].
2. Interoperability through interoperable languages. These query languages take into account data and metadata to solve semantic conflicts when querying several databases [24].
3. Interoperability through intermediate mechanisms such as mediators or agents. These mechanisms must have specific knowledge of the area to coordinate different data sources, generally via ontologies [25, 26] or via middleware such as the Common Object Request Broker Architecture (CORBA), which relies on metadata messaging to facilitate interoperability at each level [27].
Other promising directions are presented by Chen et al. in [28], who propose two new approaches to semantic interoperability: a model-driven interoperability architecture and a service-oriented architecture for interoperability. The model-driven interoperability (MDI) architecture is based on MDA and enterprise interoperability concepts; its objective is to allow the models designed at the various abstraction levels of the MDA structure to be transformed automatically [29]. Service-oriented interoperability is based on service-oriented architectures adopting a federated approach [30], i.e. allowing interoperability of services ‘on the fly’ through dynamic accommodation and adaptation. Among the approaches classified above, we choose the approach of intermediate mechanisms using ontology engineering.
Indeed, Heiler, Mao et al., Yang et al. and other researchers agree that ontology engineering is the key technology for dealing with the semantic interoperability problem [18, 19, 31]. Ontologies specify the semantics of terminology systems in a well-defined and unambiguous manner [32], by formally and explicitly representing a shared understanding of domain concepts and of the relationships between concepts. In the ontology-based approach, the intended meanings of terminologies and the logical properties of relations are specified through ontological definitions and axioms in a formal language, such as OWL (Web Ontology Language) [33] or UML (Unified Modelling Language) [34]. This agreement seems promising to us and supports the knowledge management work that we implemented in a repair and diagnosis module applied to an e-maintenance platform [5]. One of the problems arising from this approach is the definition of a common ontology. In our case, which concerns a business approach to maintenance, an ontology relating to the equipment was built during the development of the e-maintenance platform within the framework of the European project PROTEUS. The ontology conceived, oriented towards a business approach for the maintenance of industrial equipment, is a common denominator between the various applications implemented in an e-maintenance platform. However, this ontology was not exploited by all the assistance modules of the platform, but only by our diagnosis and repair assistance module; this did not guarantee the semantic interoperability of the platform. We propose to generalise the use of common ontologies to the other maintenance assistance applications in order to guarantee this semantic interoperability. An obstacle to such use is the need for a knowledge management approach during the development of the maintenance assistance systems.
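As a rough sketch of the intermediate-mechanism (ontology-based) approach chosen here, the snippet below shows two applications resolving their local vocabularies against a small shared concept set; the concept and term names are invented and do not reproduce the PROTEUS ontology.

```python
# Shared domain ontology concepts (hypothetical fragment).
SHARED_CONCEPTS = {"CorrectiveMaintenance", "PreventiveMaintenance", "FailureState"}

# Each application keeps its own vocabulary and a mapping onto the shared ontology.
CMMS_MAPPING = {"repair work order": "CorrectiveMaintenance", "planned service": "PreventiveMaintenance"}
SCADA_MAPPING = {"fault": "FailureState", "curative action": "CorrectiveMaintenance"}

def translate(term: str, mapping: dict) -> str:
    """Mediator: resolve a local term to the shared concept both systems understand."""
    concept = mapping[term]
    assert concept in SHARED_CONCEPTS
    return concept

# The CMMS and the SCADA system refer to the same concept through different local terms.
assert translate("repair work order", CMMS_MAPPING) == translate("curative action", SCADA_MAPPING)
```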
5.2 S-maintenance architecture
The architecture of the s-maintenance platform builds on the e-maintenance architecture, in which the interoperability of the various systems integrated in the platform is guaranteed by an exchange of knowledge represented by an ontology. So that information sharing in the e-maintenance cooperative network is without difficulty, we are required to formalise this information in a way that allows it to be exploited by the various systems belonging to the network. We extend coordination between network partners and develop an ontological base of the shared information. Systems share the semantics created for the common architecture of the e-maintenance platform (cf. Figure 7). This terminological and ontological base models the whole of the domain knowledge. It plays the role of a memory for setting up a knowledge management and capitalisation system, thus exploiting experience feedback to improve the functioning of the maintenance system. This system uses knowledge engineering as well as knowledge management tools. The software tool must play the role of a service integrator able to connect to the other, company-specific systems. This knowledge system makes it possible to identify, capitalise and restore the knowledge necessary for control, using a support environment [35]. The semantics has three levels, namely the general concepts of maintenance, application domain concepts, and concepts specific to each company.
Figure 7. S-maintenance architecture
This system builds on the e-maintenance concept, with information exchanged over Web services but subject to additional constraints based on the “OKC” standard resulting from the semantic Web. The semantics of the exchanged information requires the creation of a domain ontology common to the various systems. It allows knowledge and skills to be used and created, which leads to the use of knowledge management techniques and allows acquired knowledge to be capitalised. Systems collaborate, which requires a coordinated effort to solve problems.
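A minimal sketch of the three semantic levels described above; the example concepts at each level are invented.

```python
# The three semantic levels of the shared ontology; the example concepts are invented.
semantic_levels = {
    "general maintenance concepts": ["Equipment", "Intervention", "Failure"],
    "application domain concepts": ["RotatingMachine", "VibrationMeasure"],
    "company-specific concepts": ["PackagingLine3", "NightShiftOperator"],
}

# Concepts at the more general levels are shared by all systems on the platform,
# while company-specific concepts extend the common base for one partner only.
for level, concepts in semantic_levels.items():
    print(f"{level}: {', '.join(concepts)}")
```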
6 DOMAIN ONTOLOGY OF MAINTENANCE
Several research works have tried to build a maintenance ontology. In the software maintenance area, Kitchenham et al. in [36] suggest that empirical studies of maintenance are difficult to understand unless the context of the study is fully defined. They developed a preliminary ontology to identify a number of factors that influence maintenance; the purpose of the ontology was to identify factors that would affect the results of empirical studies, and it is presented in the form of a UML model. Ruiz et al. in [37] developed a semi-formal ontology in which the main concepts, according to the literature related to software maintenance, are described. This ontology, besides representing static aspects, also represents dynamic issues related to the management of software maintenance projects. REFSENO (A Representation Formalism for Software Engineering Ontologies) [38] was the methodology used in this work. Matsokis and Kiritsis in [39] propose an ontology-based approach for product lifecycle management, as an extension of the ontology proposed in the PROMISE project [40]. The latter provides a Semantic Object Model (SOM) for Product Data and Knowledge Management; the SOM provides a commonly accepted schema to support interoperability when adopted by different industrial partners. These works can be analysed from two points of view: what does each present, and what is it intended for? The first two works try to conceptualise the entire maintenance domain, whereas the latter two focus only on the product lifecycle, and essentially the middle-of-life phase [41]. From the second point of view, the latter works aim to ensure interoperability between industrial partners, whereas the first two especially aim to ensure the best management of software maintenance and reuse activities. Thus, we develop a general product maintenance ontology which covers the whole of the maintenance domain, with the goal of ensuring semantic interoperability among the different systems of the maintenance platform. We take advantage of the classification made by Rasovska et al. in the study of the maintenance process [42] to set up our ontology. In fact, as shown in Figure 8, the authors define four fundamental technical and business fields identified in general maintenance: (i) equipment analysis, which consists of functional analysis and failure analysis; (ii) fault diagnosis and expertise, which aims to help the operator, during an intervention, to diagnose the problem and, through prognosis, to anticipate the
breakdown and to solve it without recourse to an expert; (iii) resource management, which deals with resource planning for all maintenance interventions; and (iv) maintenance strategy management, which represents a decision support concept for maintenance managers.
Figure 8. Maintenance process concepts
Based on the study of the maintenance process, dependability concepts and maintenance expert practice, we developed this ontology of maintenance expertise, including a maintained equipment model associated with the maintenance system components, as a UML class diagram. The choice of UML as the language of our ontology is based on its graphical expressiveness and semantic power, as recommended in various research works. Cranefield et al. in [34] focus on the benefits of using UML as an ontology language, and Bézivin et al. in [43] stress that meta-models (e.g. UML), in the sense in which they are used by the OMG (Object Management Group), address the concept of representation and more specifically the ontology definition presented in [44]. We have built our own framework so that, from its conception, it takes into account the different scopes of the maintenance process. This ontology was developed as a tool for sharing semantics between the different actors of the e-maintenance platform. Although established independently of the reasoning methods, the domain ontology has a structure which depends on how the acquired knowledge will be used for reasoning, because experts deliver knowledge adapted to their reasoning. The domain model consists of twelve parts (i.e. packages) corresponding both to the structure of the enterprise memory and to the maintenance process (see Figure 9): the monitoring management system, site management system, equipment expertise management system, resource management system, intervention management system, maintenance strategy management model, maintenance management system, equipment states system, historic management system, document management system, functional management system and dysfunctional management system.
The equipment expertise management system is characterised by the equipment components and sub-components in a tree form (Component).
Site management system: defined as a unit characterised by its location. A site can be a production site, which contains operating equipment, or a maintenance centre, which is the central location from which equipment is operated and maintained.
Equipment states system: during its operation, equipment may be in one of the following states: normal state, degraded state, failure state or programmed stop state. We include in programmed stop any stopping of the equipment concerned by authorised personnel; among scheduled stops, we are interested only in maintenance.
Maintenance management system: this package is related to the programmed stop included in the equipment states system. It manages the different types of maintenance, which are corrective maintenance, conditional maintenance and preventive maintenance.
The monitoring management system consists of sensors (Sensor) installed on the equipment and the various measurements (Measure) coming from these sensors. A data acquisition model (Data acquisition model) manages the acquisition and exploitation of these measures.
This model can trigger an intervention request procedure according to measurement thresholds and is therefore connected with the intervention management model. The intervention management system focuses on the maintenance intervention: an intervention remedies the equipment failure and is described by an intervention report and characterised by a maintenance type. The maintenance strategies management system is based on technical indicators (Technical indicator) and financial indicators (Financial indicator) for each piece of equipment in a maintenance contract. The resources management system describes the resources used in the maintenance system, namely human, material and document resources, and their subclasses: operators (Operator), experts (Expert) and managers (Manager) are subclasses of human resource; tools (Tool), consumables (Consumable) and spare parts (Spare part) are subclasses of material resource. The document resource and its subclasses are presented in a separate package. Document management system: this package presents the documentation resources which are indispensable in maintenance, such as the equipment plan, which contains the design and the model of the equipment and its components; the technical documentation, in which all technical information on a piece of equipment and its user guide are defined; the contract, which presents the maintenance contract; and finally the intervention report, which is composed of observations, the work order and technical comments.
Functional equipment management system: the functional analysis and its associated model (Functional equipment model) characterise the equipment operation through the MainFunction and SecondFunction classes, which represent the equipment's main function and the secondary functions that ensure the smooth running of the main function. Dysfunctional equipment management system: each piece of equipment can suffer from breakdowns and failures, described in the Failure class and analysed in the failure analysis (Failure equipment model). A failure is identified by symptoms (Symptom), caused by origins (Origin) and remedied by a remedial action (Action). It also has characteristics (Characteristics) such as criticality, appearance frequency, non-detection and gravity, which are evaluated in the FMECA (Failure Mode, Effects and Criticality Analysis). Historic management system: contains the life history class, which stores the life history of a piece of equipment; it is composed of the equipment states, the interventions and the different measurements of the monitoring system.
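To make the structure of the dysfunctional and equipment expertise packages more concrete, the following minimal sketch renders a fragment of them as plain Python data structures. Class and attribute names (Component, Failure, Symptom, Origin, Action, Characteristics) follow the terminology used above; the exact associations and the example values are illustrative assumptions, not the authors' definitive model.

```python
# Illustrative fragment of the maintenance domain ontology as Python dataclasses.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Characteristics:
    criticality: int       # FMECA criticality
    frequency: int         # appearance frequency
    non_detection: int     # likelihood of non-detection
    gravity: int           # severity of the effect

@dataclass
class Symptom:
    description: str       # observable sign identifying the failure

@dataclass
class Origin:
    description: str       # root cause of the failure

@dataclass
class Action:
    description: str       # remedial action applied to the failure

@dataclass
class Failure:
    name: str
    symptoms: List[Symptom]
    origins: List[Origin]
    remedies: List[Action]
    characteristics: Characteristics

@dataclass
class Component:
    name: str
    sub_components: List["Component"] = field(default_factory=list)
    failures: List[Failure] = field(default_factory=list)

# Example: a pump component with one FMECA-analysed failure mode
pump = Component(
    name="Pump P-101",
    failures=[Failure(
        name="Bearing seizure",
        symptoms=[Symptom("abnormal vibration")],
        origins=[Origin("lubrication loss")],
        remedies=[Action("replace bearing and restore lubrication")],
        characteristics=Characteristics(criticality=8, frequency=2,
                                        non_detection=3, gravity=7),
    )],
)
```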
Figure 9. Domain ontology of maintenance
7 FUTURE WORK: ONTOLOGY USE AND EVALUATION
Modelling the domain ontology is very beneficial, but the question that must be answered is how this benefit can be exploited. Presenting the ontology via a UML class diagram is very useful in terms of clarity and comprehensibility, but it allows neither the evaluation of the ontology nor its use in e-maintenance platforms. In other words, we cannot validate the ontology's reasoning, soundness and completeness [45], and we cannot navigate a UML class diagram. The ontology must be translated into an ontology language that allows reasoning and that is understandable [46] by the technical components of the platform which use the ontology. We are currently working to evolve this ontology by translating it into a description logic language, to allow reasoning on the ontology, and to implement it through an interpreted or compiled language. This will permit us to study the capacity and quality of our ontology and will guide us towards further directions and areas of ontology development. On the other hand, we aim to enrich this domain ontology model by adding more concepts and more information to cover all the domain areas which can be used to evolve the e-maintenance platform. At the same time, we aim to relate this domain ontology to a task ontology providing the dynamic activities in the maintenance system, such as diagnosis, prognosis, detection, acquisition, etc.
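As a hint of what such a translation might look like, the following minimal sketch expresses a few concepts of the maintenance ontology as OWL classes and properties so that a DL reasoner could later check consistency and classify them. rdflib is used here only as one possible tooling choice; the ontology IRI and the selected class names are illustrative assumptions, not a published vocabulary of the authors.

```python
# Sketch: exporting a few maintenance-ontology concepts to OWL for DL reasoning.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

MAINT = Namespace("http://example.org/maintenance-ontology#")  # hypothetical IRI

g = Graph()
g.bind("maint", MAINT)

# Declare a few classes from the domain model as OWL classes
for cls in ("Component", "Failure", "Symptom", "Intervention", "Resource"):
    g.add((MAINT[cls], RDF.type, OWL.Class))

# Subclass axiom mirroring one UML generalisation described in the paper
g.add((MAINT.Operator, RDF.type, OWL.Class))
g.add((MAINT.Operator, RDFS.subClassOf, MAINT.Resource))

# Object property linking failures to their symptoms
g.add((MAINT.hasSymptom, RDF.type, OWL.ObjectProperty))
g.add((MAINT.hasSymptom, RDFS.domain, MAINT.Failure))
g.add((MAINT.hasSymptom, RDFS.range, MAINT.Symptom))

print(g.serialize(format="turtle"))
```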
8 CONCLUSION
To improve system availability and safety as well as product quality, industries are convinced of the important role of the maintenance function, and consequently various works aim to evolve it. Taking advantage of new information technologies, which allow the integration of various assistance systems via platforms, these works make it possible to expand and develop maintenance systems. In this paper we proposed a classification of various existing maintenance architectures in order to infer a support system architecture for maintenance services. This classification is made according to the intensity of the relations between the systems (autonomy, communication, cooperation, collaboration) in a particular architecture. A collaborative or cooperative relation raises an interoperability problem between the architecture's systems. Semantic interoperability is considered one of the most complex interoperability problems, which is why we focused on it in this paper. We thus highlighted the semantic maintenance architecture (s-maintenance), which is based on a common ontology for the various systems; the developed ontology provides the semantic interoperability level. This ontology is related to the maintained equipment and is common to the platform in order to guarantee interoperability between the integrated systems and applications. It is based on the maintenance process, dependability concepts and maintenance experts' practice, and includes a maintained equipment model associated with the maintenance system components, expressed as a UML class diagram. The choice of UML as the language of our ontology is based on the graphical expressiveness and the semantics of this language. To become operational, this class diagram will be translated into an ontology language that allows reasoning, such as OWL DL or LOOM.
9 REFERENCES
1 Institute of Electrical and Electronics Engineers (1990) IEEE Computer Dictionary: Compilation of IEEE Standard Computer Glossaries 610-1990.
2 Francastel J.C. (2003) Externalisation de la maintenance : Stratégies, méthodes et contrats. Paris: Dunod.
3 Boucly F. (1998) Le management de la maintenance : Evolution et mutation. Paris: Afnor Editions.
4 Dedun I & Seville M. (2005) Les systèmes d'information interorganisationnels comme médiateurs de la construction de la collaboration au sein des chaînes logistiques : Du partage d'information aux processus d'apprentissages collectifs. Proceedings of the 6th International Congress on Industrial Engineering, Besançon.
5 Rasovska I, Chebel-Morello B & Zerhouni N. (2005) Process of s-maintenance: decision support system for maintenance intervention. Proceedings of the 10th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA'05), Italy.
6 Kaffel H. (2001) La maintenance distribuée : concept, évaluation et mise en oeuvre. PhD thesis, Université Laval, Quebec.
7 Muller A. (2005) Contribution à la maintenance prévisionnelle des systèmes de production par la formalisation d'un processus de pronostic. PhD thesis, Université Henri Poincaré, Nancy.
8 Spadoni M. (2004) Système d'information centré sur le modèle CIMOSA dans un contexte d'entreprise étendue. JESA, vol. 38, n° 5, pp. 497-525.
9 Kahn J. (2003) Overview of MIMOSA and the Open System Architecture for Enterprise Application Integration. Proceedings of COMADEM'03, pp. 661-670. Växjö University, Sweden.
10 Mitchell J, Bond T, Bever K & Manning N. (1998) MIMOSA – Four Years Later. Sound and Vibration, pp. 12-2.
11 Lebold M & Thurston M. (2001) Open standards for Condition-Based Maintenance and Prognostic Systems. Proceedings of the 5th Annual Maintenance and Reliability Conference (MARCON 2001), Gatlinburg, USA.
12 Wikipedia (2009) Available at: http://fr.wikipedia.org.
13 Lee J & Ni J. (2004) Infotronics-based intelligent maintenance system and its impacts to closed-loop product life cycle systems. Invited keynote paper, IMS'2004, International Conference on Intelligent Maintenance Systems, Arles, France.
14 Kiritsis D. (2004) Ubiquitous product lifecycle management using product embedded information devices. Invited keynote paper, IMS'2004, International Conference on Intelligent Maintenance Systems, Arles, France.
15 Muller A, Marquez C & Iung B. (2008) On the concept of e-maintenance: Review and current research. Reliability Engineering and System Safety, 93, pp. 1165-1187. Amsterdam: Elsevier.
16 Holmberg K, Helle A & Halme J. (2005) Prognostics for industrial machinery availability. POHTO 2005, International Seminar on Maintenance, Condition Monitoring and Diagnostics, Oulu, Finland.
17 Komatsoulis GA, Warzel DB, Hartel FW, Shanbhag K, Chilukuri R, Fragoso G, de Coronado S, Reeves DM, Hadfield JB, Ludet C & Covitz PA. (2008) caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability. Journal of Biomedical Informatics.
18 Heiler S. (1995) Semantic Interoperability. ACM Computing Surveys (CSUR).
19 Mao M. (2008) Ontology mapping: Towards semantic interoperability in distributed and heterogeneous environments. PhD thesis, University of Pittsburgh.
20 Park J & Ram S. (2004) Information System Interoperability: What Lies Beneath? ACM Transactions on Information Systems, vol. 22, n° 4.
21 Baïna S, Panetto H & Benali K. (2006) Apport de l'approche MDA pour une interopérabilité sémantique. Interopérabilité des systèmes d'information d'entreprise, Processus d'entreprise et SI, RSTI-ISI, pp. 11-29.
22 Rahm E & Bernstein PA. (2001) A survey of approaches to automatic schema matching. The VLDB Journal, vol. 10, n° 4, pp. 334-350.
23 Halevy A & Madhavan J. (2003) Composing mappings among data sources. Proceedings of the Conference on Very Large Data Bases, pp. 572-583, Berlin, Germany.
24 Fauvet MC & Baina S. (2001) Evaluation coopérative de requêtes sur des données semi-structurées distribuées. Proceedings of Information Systems Engineering.
25 Maedche A & Staab S. (2000) Semi-automatic engineering of ontologies from texts. Proceedings of the 12th International Conference on Software Engineering and Knowledge Engineering (SEKE 2000), pp. 231-239, USA.
26 Halevy A, Suciu D & Tatarinov I. (2005) Schema mediation for large-scale semantic data sharing. The VLDB Journal, vol. 14, n° 1.
27 Tannenbaum A. (1994) Repositories: potential to reshape development environment. Application Development Trends.
28 Chen D, Doumeingts G & Vernadat F. (2008) Architectures for enterprise integration and interoperability: Past, present and future. Computers in Industry, 59, pp. 647-659.
29 Mellor SJ, Scott K, Uhl A & Weise D. (2002) Lecture Notes in Computer Science.
30 ISO 14258 (1999) Concepts and Rules for Enterprise Models. Industrial Automation Systems, ISO TC184/SC5/WG1.
31 Yang QZ & Zhang Y. (2006) Semantic interoperability in building design: Methods and tools. Computer-Aided Design, 38, pp. 1099-1112. Amsterdam: Elsevier.
32 Guarino N. (1998) Formal ontology and information systems. IOS Press.
33 W3C (2003) OWL Web Ontology Language Overview. http://www.w3.org/TR/2003/PR-owl-features-20031215/ [last accessed 10.05].
34 Cranefield SJS & Purvis MK. (1999) UML as an ontology modelling language. Proceedings of the Workshop on Intelligent Information Integration, 16th International Joint Conference on Artificial Intelligence (IJCAI-99).
35 Kramer I. (2003) Proteus: Modélisation terminologique. Technical report, INRIA, France.
36 Kitchenham B, Travassos G, Von Mayrhauser A, Niessink F, Schneidewind N, Singer J, Takada S, Vehvilainen R & Yang H. (1999) Towards an Ontology of Software Maintenance. Journal of Software Maintenance: Research and Practice, 11.
37 Ruiz F, Vizcaino A, Piattini M & García F. (2004) An ontology for the management of software maintenance projects. International Journal of Software Engineering and Knowledge Engineering.
38 Tautz C & von Wangenheim CG. (1998) A Representation Formalism for Software Engineering Ontologies. Report, Fraunhofer Institute for Experimental Software Engineering.
39 Matsokis A & Kiritsis D. (2009) An Ontology-based Approach for Product Lifecycle Management. Computers in Industry, Special Issue: Semantic Web Computing in Industry. In press.
40 PROMISE (2008) FP6 project. www.promise.no.
41 Kiritsis D, Bufardi A & Xirouchakis P. (2003) Research issues on product lifecycle management and information tracking using smart embedded systems. Advanced Engineering Informatics, 17 (3-4), pp. 189-202.
42 Rasovska I, Chebel-Morello B & Zerhouni N. (2004) A conceptual model of maintenance process in unified modeling language. Proceedings of the 11th IFAC Symposium on Information Control Problems in Manufacturing (INCOM 2004).
43 Bézivin J. (2000) De la programmation par objets à la modélisation par ontologie. Ingénierie des connaissances.
44 Charlet J, Bachimont B, Bouaud J & Zweigenbaum P. (1996) Ontologie et réutilisabilité : expérience et discussion. In Aussenac-Gilles N, Laublet P & Reynaud C (Eds.), Acquisition et ingénierie des connaissances : tendances actuelles, chapter 4, pp. 69-87. Cépaduès-Éditions.
45 Uschold M & Gruninger M. (1996) Ontologies: Principles, methods and applications. Knowledge Engineering Review.
46 Uschold M. (1998) Knowledge Level Modelling: Concepts and Terminology. The Knowledge Engineering Review, vol. 13, n° 1.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
INFORMATION AND OPERATIONAL TECHNOLOGIES NEXUS FOR ASSET LIFECYCLE MANAGEMENT Andy Koronios a, b, Abrar Haider a, b, Kristian Steenstrup a, c a b
CRC for Integrated Engineering Asset Management, Brisbane, Australia
School of Computer and Information Science, University of South Australia, Mawson Lakes Campus, SA 5095, Australia. c
Gartner Inc.
Contemporary enterprises rely on accurate and complete information to make optimal decisions. In order to do so, they must have the ability to harvest information from every repository that will provide the information necessary for good decisions to be made. Asset managing organisations have in recent times moved towards integrating many of their information systems but have, in most cases, focused on the business process enabling properties of information technologies, and have tended to overlook the role of these technologies in informing the strategic business orientation. In the same vein, asset managing organisations consider IT as a business support area that is there to support business processes or operational technologies to ensure the smooth functioning of an asset. However, even these operational technologies are embedded in IT, and they generate information that is fed to various other operational systems and administrative legacy systems. The intertwined nature of operational and information technologies suggests that this information provides for the control of asset management tasks and also acts as an instrument for informed and quality decision support. This exposes the active and dynamic link between IT and corporate governance. The scope of IT governance should, thus, be extended to include the operational technologies so as to develop a unified view of information and operational technologies. This paper attempts to uncover the peculiarities and variances of the relationship between industry specific operational technologies used for asset management and organisational use of mainstream IT applications for business execution. It puts forward the proposition that in order to achieve a high degree of data driven decision making, particularly at the strategic level of the organisation, a confluence of information technology (IT), operational technology (OT) and information management technology (IM) needs to occur. Key Words: Information technology, governance, operational technologies.
1 INTRODUCTION
During the last two decades, significant change has occurred in the way enterprises manage information to execute business processes, communicate and make better decisions. Indeed, many organisations have used information technologies to transform themselves and to create new business models and new business value. Information technologies (IT) for asset management are required to translate strategic objectives into action; align organisational infrastructure and resources with IT; provide integration of lifecycle processes; and inform asset and business strategy through value added decision support. However, the fundamental element in achieving these objectives is the quality of alignment of the technological capabilities of IT with the organisational infrastructure, as well as their fit with the operational technologies (OT) used in the lifecycle management of assets. IT and OT are becoming inextricably intertwined, where OT facilitate the running of the assets and are used to ensure system integrity and to meet the technical constraints of the system. OT includes control as well as management or supervisory systems, such as SCADA, EMS, or AGC. These systems not only provide the control of asset lifecycle tasks, but also contribute to the overall advice on effective asset management through the critical role that they have in decision making. However, even though OT owe a lot to IT for their smooth functioning, due to their specialised nature these technologies are not considered as IT infrastructure. This paper, therefore,
attempts to uncover the relationship between industry specific OT used for asset management and organisational use of mainstream IT applications for asset lifecycle management. It starts with an analysis of the IT utilised for asset management, which is followed by a discussion of their relationship with OT in asset lifecycle management. The paper then presents a framework for the IT-OT nexus.
2 ASSET MANAGEMENT
2.1 Scope of Asset Management
The scope of asset management activities extends from the establishment of an asset management policy and the identification of service level targets according to the expectations of stakeholders and regulatory/legal requirements, to the daily operation of assets aimed at meeting the defined levels of service. Asset managing organisations, therefore, are required to cope with a wide range of changes in the business environment; continuously reconfigure manufacturing resources so as to perform at accepted levels of service; and be able to adjust to change with modest consequences on time, effort, cost, and performance. Asset management can be classified into three levels, i.e. strategic, tactical, and operational (Figure 1). The strategic level is concerned with understanding the needs of stakeholders and market trends, and linking the requirements thus generated to the optimum tactical and operational activities. The operational and tactical levels are underpinned by planning, decision support, monitoring, and review of each lifecycle stage to ensure availability, quality, and longevity of the asset's service provision. The identification, assessment, and control of risk is a key focus at all levels of planning, with the results from this process providing inputs into the asset management strategy, policies, objectives, processes, plans, controls, and resource management.
Figure 1: Scope of Asset Management (Source [1])
2.2 Strategic Asset Management Planning

Asset management has evolved from the humble beginnings of maintaining plant machinery to executing a host of related functions, into an approach that is equally as important and essential as quality, reliability, and organisational efficiency [2]. Strategic asset planning typically has a 10-25 year horizon for financial planning purposes, although organisations may look well beyond this period in order to fully assess optimum lifecycle strategies [1]. Strategic asset planning translates legal and stakeholder requirements and expectations into service outcomes, thereby allowing for an overall long term strategy to manage assets. The main constituents of the strategic planning process are:
a. the development of vision, mission and values statements which describe the long-term desired position of the organisation and the manner in which the organisation will conduct itself in achieving the same [3];
b. review of the operating environment, to ensure that all elements that affect the organisation's activities have been considered; such elements include corporate, community, environmental, financial, legislative, institutional and regulatory factors [4];
c. identification and evaluation of strategic options to achieve strategic goals arising from the vision and mission statements [5]; and
d. a clear statement of strategic direction, policies, risk management and desired outcomes [6].
Public sector organisations may give more weighting to environmental, social and economic factors in determining strategic goals, whereas private sector asset owners will typically place most emphasis on economic factors. However, the agreement on levels of service in terms of criteria such as quality, quantity, timeliness and cost provides the link between the strategic and tactical plans.
2.3 Tactical Asset Management Planning

Tactical planning involves the application of detailed asset management processes, procedures and standards to develop separate sub-plans that allocate resources (natural, physical, financial, etc.) to achieve strategic goals through meeting defined levels of service. Depending on an organisation's purpose, tactical plans may have varying priorities; for example, owners of infrastructure assets are usually directly concerned with asset management plans and customer service plans, which then become an input into other tactical plans, such as the resource management plan. The fundamental aim of tactical asset management processes and procedures is to cost-effectively achieve the organisation's strategic goals in the long term. These processes, procedures, and standards cover asset management activities such as:
a. setting asset management objectives, including technical and customer service levels, and regulatory and financial requirements [7];
b. operational controls, plans, and procedures [8];
c. managing asset management information systems and the information contained in them, such as asset attributes, condition, performance, capacity, lifecycle costs, maintenance history, etc. (a minimal illustration of such an asset record follows this list) [9];
d. risk management [10];
e. decision making for optimisation of asset lifecycle management [10]; and
f. asset performance and condition assessments [11].
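The sketch below is a minimal illustration, not taken from the paper, of the kind of asset register record implied by item (c): asset attributes, condition, capacity and maintenance history held together so that tactical plans can draw on a single record. All names and fields are assumptions for illustration.

```python
# Hypothetical asset register record supporting tactical asset management planning.
from dataclasses import dataclass, field
from typing import List

@dataclass
class MaintenanceEvent:
    date: str            # ISO date of the intervention
    kind: str            # e.g. "preventive", "corrective"
    cost: float

@dataclass
class AssetRecord:
    asset_id: str
    attributes: dict                 # static attributes (make, model, location)
    condition: str                   # latest assessed condition
    capacity: float                  # rated capacity in asset-specific units
    history: List[MaintenanceEvent] = field(default_factory=list)

    def lifecycle_maintenance_cost(self) -> float:
        """Total maintenance spend recorded against this asset."""
        return sum(e.cost for e in self.history)
```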
2.4 Asset Management Operational Planning

Operational plans generally comprise detailed implementation plans and information with a 1-3 year outlook. These plans typically provide the organisational direction on an annual or biannual basis and are concerned with practical rather than visionary elements. Operational plans work as practical translations of the priorities arising from tactical plans in order to deliver cost effective levels of service. According to the IIMM [1], operational plans typically include aspects such as:
a. operational controls to ensure delivery of asset management policy, strategy, legal requirements, objectives and plans;
b. structure, authority and responsibilities for asset management;
c. staffing issues - training, awareness and competence;
d. consultation, communication and documentation to/from stakeholders and employees;
e. information and data control; and
f. emergency preparedness and response.
Asset lifecycle management involves a significant amount of acquisition, processing, and analysis of information that enables planning, design, construction, maintenance, rehabilitation, and disposal/refurbishment or replacement of assets. The complexity and increasingly entwined nature of asset management calls for the integration of cross functional information. IT in asset management, therefore, has to provide for the control of asset lifecycle management tasks, as well as act as an instrument for decision support, for example in the trade-offs between deferred maintenance and preventive maintenance, or between short-term fixes and long-term solutions. Thus, the most important function of IT in asset management is bringing together the
various lifecycle management functions, thereby allowing for an integrated view of asset lifecycle. However, realisation of an integrated view of asset lifecycle through IT requires appropriate hardware and software applications; quality, standardised, and interoperable information; appropriate skill set of employees to process information; and the strategic fit between IT and asset lifecycle management processes.
3 SCOPE OF IT IN ASSET MANAGEMENT
In theory, IT in asset management has three major roles: firstly, IT is utilised in the collection, storage, and analysis of information spanning asset lifecycle processes; secondly, IT provides decision support capabilities through the analytic conclusions arrived at from the analysis of data; and thirdly, IT provides an integrated view of asset management through processing and communication of information, thereby forming the basis of asset management functional integration. According to Haider [12], the minimum requirement for asset management at the operational and tactical levels is to provide functionality that facilitates:
a. knowing what and where are the assets that the organisation owns and is responsible for;
b. knowing the condition of the assets;
c. establishing suitable maintenance, operational and renewal regimes to suit the assets and the level of service required of them by present and future customers;
d. reviewing maintenance practices;
e. implementing job/resources management;
f. improving risk management techniques;
g. identifying the true cost of operations and maintenance; and
h. optimising operational procedures.
In engineering enterprises, asset management strategy is often built around two principles, i.e., competitive concerns and decision concerns [13]. Competitive concerns set manufacturing/production goals, whereas decision concerns deal with the way these goals are to be met. IT provides for these concerns through support for value added asset management, in terms of choices such as the selection of assets, their demand management, the support infrastructure to ensure smooth asset service provision, and process efficiency. Furthermore, these choices are also concerned with in-house or outsourcing preferences, so as to draw upon the expertise of third parties. IT not only aids in decision support for the outsourcing of lifecycle processes to third parties, but also provides for the integration of extra-organisational processes with intra-organisational processes. Nevertheless, the primary expectation from IT at the strategic level is that of an integrated view of the asset lifecycle, such that informed choices can be made in terms of economic tradeoffs and/or alternatives for the asset lifecycle in line with asset management goals, objectives, and the long term profitability outlook of the organisation. However, according to the IIMM [1], the minimum requirements for asset management at the strategic level are to aid senior management in:
a. predicting the future capital investments required to minimise failures by determining replacement costs;
b. assessing the financial viability of the organisation to meet costs through estimated revenue;
c. predicting the future capital investments required to prevent asset failure;
d. predicting the decay, mode of failure or reduction in the level of service of assets or their components, and the necessary rehabilitation/replacement programmes to maintain an acceptable level of service;
e. assessing the ability of the organisation to meet costs (renewal, maintenance, operations, administration and profits) through predicted revenue;
f. modelling what-if scenarios (a minimal sketch of such a calculation follows this list), such as:
(i) technology change/obsolescence,
(ii) changing failure rates and the risks these pose to the organisation, and
(iii) alterations to renewal programmes and the likely effect on levels of service;
g. alterations to maintenance programmes and the likely effect on renewal costs; and
h. impacts of environmental (both physical and business) changes.
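The following hypothetical what-if sketch, not from the paper, shows in the simplest possible terms how a changed failure rate (item f.ii above) propagates to expected failures and renewal spend for an asset group. The figures and the constant-rate assumption are illustrative only.

```python
# Minimal what-if scenario: effect of a rising failure rate on renewal spend.
def expected_failures(asset_count: int, annual_failure_rate: float) -> float:
    """Expected failures per year assuming a constant per-asset failure rate."""
    return asset_count * annual_failure_rate

def renewal_budget(asset_count: int, annual_failure_rate: float,
                   renewal_cost_per_failure: float) -> float:
    """Expected annual renewal spend driven by failures."""
    return expected_failures(asset_count, annual_failure_rate) * renewal_cost_per_failure

baseline = renewal_budget(asset_count=500, annual_failure_rate=0.02,
                          renewal_cost_per_failure=12_000.0)
scenario = renewal_budget(asset_count=500, annual_failure_rate=0.03,   # rate rises
                          renewal_cost_per_failure=12_000.0)
print(f"baseline: {baseline:,.0f}  scenario: {scenario:,.0f}")
# baseline: 120,000  scenario: 180,000
```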
IT for asset management seeks to enhance the outputs of asset management processes through a bottom-up approach. This approach gathers and processes operational data for individual assets at the base level, and at a higher level provides a consolidated view of the entire asset base (Figure 2).
Strategic level. IT implementation concern: how IT must be implemented to provide an integrated view of the asset lifecycle. Desired asset management output: providing an integrated view of asset lifecycle management information to facilitate strategic decision making at the executive level.
Tactical level. IT implementation concern: how IT must be implemented to meet the planning and control of asset lifecycle management. Desired asset management output: fulfilling asset lifecycle planning and control requirements aimed at continuous asset availability, through performance analysis based on various dimensions of asset information such as design, operation, maintenance, financial, and risk assessment and management.
Operational level. IT implementation concern: how IT must be implemented to meet the operational requirements of assets. Desired asset management output: aiding in and/or ensuring asset design, operation, condition monitoring, failure notifications, maintenance execution and resource allocation, and enabling other activities required for smooth asset operation.
Figure 2: Scope of IT for asset management (source [14])

At the operational and tactical levels, IT systems are required to provide the necessary support for planning and execution of core asset lifecycle processes. For example, at the design stage designers need to capture and process information such as asset configuration; asset and/or site layout design and schematic diagrams/drawings; asset bill of materials; analysis of maintainability and reliability design requirements; and failure modes, effects and criticality identification for each asset. Planning choices at this stage drive future asset behaviour, therefore the minimum requirement laid on IT at this stage is to provide the right and timely information, such that informed choices can be made to ensure availability, reliability and quality of asset operation. An important aspect of the asset design stage is the supportability design that governs most of the later asset lifecycle stages. The crucial factor in carrying out these analyses is the availability and integration of information, such that the supportability of all facets of asset design and development, operation, maintenance, and retirement is fully recognised and defined. Nevertheless, effective asset management requires the lifecycle decision makers to identify the financial and non-financial risks posed to asset operation, their impact, and ways to mitigate those risks. IT for asset management not only has to provide standardised quality information but also has to provide for the control of asset lifecycle processes. For example, the design of an asset has a direct impact on its operation. Operation, itself, is concerned with minimising the disturbances relating to production or service provision of an asset. At this level, it is important that IT systems are capable of providing feedback to maintenance and design functions regarding factors such as asset performance; detection of manufacturing or production process defects; design defects; asset condition; and asset failure notifications. Numerous IT systems are employed at this stage that capture data from sensors and other field devices and feed diagnostic/prognostic systems, such as Supervisory Control and Data Acquisition (SCADA) systems, Computerized Maintenance Management Systems (CMMS), and Enterprise Asset Management systems. These systems further provide inputs to maintenance planning and execution. However, effective maintenance not only requires effective planning but also requires the availability of spares, maintenance expertise, work order generation, and other financial and non-financial support. This requires integration of the technical, administrative, and operational information of the asset lifecycle, such that timely, informed, and cost effective choices can be made about the maintenance of an asset. For example, a typical water pump station in Australia is located away from major infrastructure and has a considerable length of pipeline assets that bring water from the source to the destination. The demand for water supply is continuous, twenty four hours a day, seven days a week. Although the station may have an early warning system installed, maintenance labour at the water stations and along the pipeline is limited and spares inventory is generally not held at each station.
Therefore, it is important to continuously monitor asset operation (which in this case constitutes equipment on the water station as well as the pipeline) in order to sense asset failures as soon as possible and preferably in their development stage. However, early fault detection is not of much use if it is not backed up with
the ready availability of spares and maintenance expertise. The expectations placed on the water station by its stakeholders are not just of continuous availability of operational assets, but also of the efficiency and reliability of support processes. IT systems, therefore, need to enable maintenance workflow execution as well as decision support by enabling information manipulation on factors such as asset failure and wear patterns; maintenance work plan generation; maintenance scheduling and follow up actions; asset shutdown scheduling; maintenance simulation; spares acquisition; testing after servicing/repair treatment; identification of asset design weaknesses; and asset operation cost benefit analysis. An important measure of the effectiveness of IT, therefore, is the level of integration that it provides in bringing together the different functions of asset lifecycle management, as well as stakeholders such as business partners, customers, and regulatory agencies like environmental and government organisations.
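The sketch below illustrates, in hedged form, the kind of check the water station example implies: an early fault warning is only useful if spares and expertise can be confirmed before a crew is dispatched. All names, data structures and values are hypothetical and serve illustration only.

```python
# Hypothetical decision step: turn an early fault warning into a follow-up action
# only when spares and maintenance expertise are actually available.
from dataclasses import dataclass

@dataclass
class FaultWarning:
    asset_id: str
    failure_mode: str
    required_spare: str

def plan_intervention(warning: FaultWarning,
                      spare_stock: dict,        # spare part -> units in stock
                      available_technicians: int) -> str:
    """Decide the follow-up action for an early fault warning."""
    if spare_stock.get(warning.required_spare, 0) == 0:
        return f"order spare '{warning.required_spare}' before scheduling work"
    if available_technicians == 0:
        return "queue work order until a technician is available"
    return f"schedule maintenance on {warning.asset_id} for '{warning.failure_mode}'"

warning = FaultWarning("pump-station-7", "impeller wear", "impeller kit")
print(plan_intervention(warning, spare_stock={"impeller kit": 2},
                        available_technicians=1))
```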
4 ISLANDS OF INFORMATION ARE NO LONGER AN OPTION IN AM ORGANISATIONS
For too long, business units in organisations have been allowed to create pools of data that were at best not easily available to the rest of the organisation and at worst the organisation was not even aware that such potentially valuable resources existed. This was promoted by many in the organisation as a means of exercising power and control. Operational technologies have been good candidates for creating pools of data: as their nature is to gather, generally in a real-time setting, performance data from various technological systems, the need for these data to be integrated with other business systems has not been evident. Yet such a situation prevents or hinders optimised management of the assets. IT departments have not helped matters, having in the past created significant barriers between themselves and the rest of the enterprise. Furthermore, IT has been viewed, at least until recently, as an enabler and infrastructure provider to the business function rather than as a strategic and indeed transformational resource. Only recently have many organisations considered an enterprise-wide view of IT and its strategic impact. Much of the effort in enterprises is devoted to managing physical assets and human resources. Yet another valuable resource, the data captured and stored in organisational repositories, is generally not given the attention that it deserves as a strategic component of the enterprise. In addition to information technology and operational technologies there is therefore a third discipline within many organisations with its own language, technologies and folksonomies: the area of records, content and knowledge management. Although it would be reasonable to assume that these organisational functions would also integrate seamlessly with IT, this is not often the case. In the past such functions were responsible for the management of information in physical form such as paper records, maps and design diagrams, microfilm and photographs, as well as, more recently, multimedia resources. Our research has shown very little integration between this function and information and operational technologies. Thus a holistic enterprise-wide information lifecycle management and governance scheme is not typical in most asset management organisations. Consider the concept of 'autonomic logistics' as proposed by Hess [15]. Autonomic logistics applied to airborne vehicles refers to the harmonisation of the advanced technology embedded on board the jet fighter with the automatic transmission of information regarding the condition of systems and components to the related logistics suppliers as well as trained technicians, so as to ensure that the necessary parts and technical capability are available to fix the problems as soon as the jet fighter arrives at base, thereby minimising the time the asset is on the ground. This is a very natural objective. However, how can it be realised? If the engineering systems do not integrate easily with the logistics information systems and the work management systems, such a vision cannot be realised. Equally, strategic decisions about maximising the life of an asset, minimising its total cost of ownership and extracting the most value from an asset are all difficult, if not impossible, unless all the available information is integrated to maximise the value of the data upon which such decisions can be made.
The corporate accounting scandals of Enron, Peregrine Systems, WorldCom, and others provided businesses worldwide with valuable lessons about governance, and government legislation such as the Sarbanes-Oxley Act of 2002 in the United States and Basel II in Europe has ensured that the minds of CEOs, senior managers and the boards of enterprises are focused on the need for good governance and accounting practices. These and related laws hold office holders personally responsible for the accuracy of information in financial and other reporting. IT governance has in recent years gained significant status as an issue which CEOs and CIOs have been motivated to address in order to achieve high levels of organisational governance. Apart from regulatory compliance and risk reduction responsibilities, however, good IT governance delivers significant benefits to the enterprise; indeed, Weill and Ross [16] argue that significant IT business value results directly from effective IT governance. It is thus critical for asset management enterprises to take a holistic, enterprise-wide view of data, its capture, storage, processing, and flow within the enterprise. Steenstrup [17] suggests that the separation of IT and OT is still a major issue in engineering asset management organisations and recommends that starting small with "self-contained initial projects" is a good way forward. This is good advice; however, greater guidance by senior management is required to bridge the islands of information that exist not only in enterprise IT and engineering (OT) but also in the content and records management functions where unstructured data are usually captured, archived and forgotten.
An IT governance model, shown in Figure 3 below, and an architected information lifecycle management strategy must be applied over the top of both enterprise IT and engineering IT to ensure that full integration of information takes place in the asset management organisation.
Figure 3: An IT Governance Framework for EAM Information Integration
5 CONCLUSION
Information can deliver incredible value to the engineering asset management enterprise. For this to occur, however, greater cognisance needs to be given to how information is captured and harvested, and how it is stored and managed, so that the right information finds the right users and is integrated in a way that allows business insights and strategic decisions to be made on the basis of all the information available at the operational, tactical and strategic levels of management. Such a vision cannot be realised if there exists a chasm between the enterprise information technologies and the engineering, operational technologies. Furthermore, the enterprise content and other unstructured intellectual assets held by their custodians must also be integrated for easy access within the enterprise. This paper suggests that the most important initiative by senior management to go a long way towards achieving this vision would be to introduce effective IT governance mechanisms throughout the organisation. The associated committees, change control boards, budgeting processes and so on must include all the elements of managing the information and knowledge assets in the enterprise: the enterprise information technologies, the engineering operational technologies and the information/content management systems.
6 REFERENCES
1 IIMM 2006, International Infrastructure Management Manual, Association of Local Government Engineering NZ Inc, National Asset Management Steering Group, Thames, New Zealand, ISBN 0-473-10685-X.
2 Narain, R, Yadav, R, Sarkis, J & Cordeiro, J 2000, 'The strategic implications of flexibility in manufacturing systems', International Journal of Agile Management Systems, Vol. 2, No. 3, pp. 202-213.
3 Alexander, K 2003, 'A strategy for facilities management', Facilities, Vol. 21, No. 11/12, pp. 269-274.
4 Inman, RA 2002, 'Implications of environmental management for operations management', Production Planning and Control, Vol. 13, No. 1, pp. 47-55.
5 Boyle, TA 2006, 'Towards best management practices for implementing manufacturing flexibility', Journal of Manufacturing Technology Management, Vol. 17, No. 1, pp. 6-21.
6 Balch, WF 1994, 'An Integrated Approach to Property and Facilities Management', Facilities, Vol. 12, No. 1, pp. 17-22.
7 El Hayek, M, Voorthuysen, EV & Kelly, DW 2005, 'Optimizing life cycle cost of complex machinery with rotable modules using simulation', Journal of Quality in Maintenance Engineering, Vol. 11, No. 4, pp. 333-347.
8 Taskinen, T & Smeds, R 1999, 'Measuring change project management in manufacturing', International Journal of Operations and Production Management, Vol. 19, No. 11, pp. 1168-1187.
9 Gottschalk, P 2006, 'Information systems in value configurations', Industrial Management and Data Systems, Vol. 106, No. 7, pp. 1060-1070.
10 Murthy, DNP, Atrens, A & Eccleston, JA 2002, 'Strategic maintenance management', Journal of Quality in Maintenance Engineering, Vol. 8, No. 4, pp. 287-305.
11 Sherwin, D 2000, 'A review of overall models for maintenance management', Journal of Quality in Maintenance Engineering, Vol. 6, No. 3, pp. 138-164.
12 Haider, A 2007, Information Systems Based Engineering Asset Management Evaluation: Operational Interpretations, PhD Thesis, University of South Australia, Adelaide, Australia.
13 Rudberg, M 2002, Manufacturing Strategy: Linking Competitive Priorities, Decision Categories and Manufacturing Networks, PROFIL 17, Linkoping Institute of Technology, Linkoping, Sweden.
14 Haider, A 2009, 'Value Maximisation from Information Technology in Asset Management – A Cultural Study', 2009 International Conference of Maintenance Societies (ICOMS), 2-4 June, Sydney, Australia.
15 Hess, A 2007, 'Presentation to the CIEAM CRC Conference', June 2007, Australia.
16 Weill, P & Ross, JW 2000, IT Governance: How Top Performers Manage IT Decision Rights for Superior Results, Harvard Business School Publishing, USA.
17 Steenstrup, K 2008, 'IT and OT: Intersection & Collaboration', Gartner Industry Research, ID No G00161537, USA.
Acknowledgement The authors acknowledge the support of the CRC for Integrated Engineering Asset Management in conducting this research project.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
AN ADVANCED METHOD FOR TIME TREATMENT IN PRODUCT LIFECYCLE MANAGEMENT MODELS Matsokis A a and Kiritsis D a a
Swiss Federal Institute of Technology in Lausanne (EPFL), STI-IGM-LICP, ME A1 380, Station 9, Lausanne, 1015, Switzerland.
Time is the only fundamental dimension which exists along the entire life of an artefact, and it affects all artefacts and their qualities. Most commonly in PLM models, time is an attribute in parts such as "activities" and "events", or is a separate part of the model ("four dimensional models") to which other parts are associated through relationships. In this work a generic idea has been developed on how to treat time better in PLM models. The concept is that time should not be one part of the model; it should be the basis of the model, and all other elements should be parts of it. Thus, we introduce the "Duration of Time concept". According to this concept, all aspects and elements of a model are parts of time. A case study demonstrates the applicability and the advantages of the concept in comparison to existing methodologies. Key Words: Product Lifecycle Management (PLM), Asset Lifecycle Management (ALM), Middle of Life (MOL), Interoperability
1 INTRODUCTION
The aim of this work is to introduce a new methodology for improving today's ALM and PLM systems in the aspects of data handling (visibility and integration) as well as system interoperability. Visibility of information between the different levels of abstraction in different information and data management systems is not always available and, if achieved, it requires a lot of effort due to the complexity of the systems (for the sake of simplicity, in this document when we use the term "systems" we mean "information and data systems"). All these systems either are different from each other or sit under the same commercial "ALM" system. In both cases it is very difficult to retrieve and synchronise the data of all phases (Beginning of Life (BOL), Middle of Life (MOL) and End of Life (EOL)) after the product exits its BOL phase (design and production). Furthermore, data is collected only for some pre-defined products/components. However, experience has shown that the requirements for the types of collected data change depending on the use of each part of the model; hence, data are missing and are impossible to recover when needed in later stages. This leads to stored data which, when used as input to decision making, are incomplete, and therefore decision support is unsatisfactory. Time is the only fundamental dimension which exists along the entire life of an individual (including materials and physical products) and it affects all individuals and their qualities. Individuals existed in the past and will exist in the future even if they only currently exist in our model. Therefore, we introduce a method for system modelling which utilises this unique advantage of time. Time in this context is used with its generic meaning. Time is considered as the fourth dimension in several sciences, and Sider in his work "Four Dimensionalism" [1] provides a good description of the 4D paradigm: individuals exist in a manifold of four dimensions, three of space and one of time, and therefore they have both temporal parts and spatial parts. The notion of time is not considered with the appropriate level of quality and is underestimated in today's methodologies. This is a key issue which makes systems more complex and causes a significant loss of valuable data/information about products, processes, etc. when attempting to re-use this data/information for decision support in the different phases of the lifecycle. Methods and ideas for loading time data into parts of the models have led to solutions such as the "time stamp" and the "time interval". Time data are stored only in the parts of the model they were designed for. Most commonly, time is an attribute of these parts, such as in "activities" (starting and finishing time) and in "events" (points in time), or is a separate part of the model ("four dimensional models") to which other parts are associated through relationships. Thus, time data do not cover the whole system for the whole life cycle, which leads to many complex problems when it comes to information visibility. This is because there are many different systems which are at a different
level of abstraction regarding the individual target asset. This has led to models which are incomplete, complicated to manage and application specific. Innovative ideas on time treatment are necessary to change the philosophy of the models and simplify them, especially in the ever more competitive global market. The structure of the paper is as follows: Section 2 briefly describes previous works dealing with time in engineering; Section 3 describes the proposed methodology for future models; Section 4 demonstrates a case study on a maintenance department for locomotives.
2 BACKGROUND LITERATURE
The importance of time in the field of engineering has been noted in several works. In Part 2 of ISO 15926 [2] time is used as the fourth dimension. It is used to describe actual individuals (including physical objects) which actually exist or have actually existed in the past, possible individuals which possibly have existed in the past and may possibly exist in the future, and individuals which are hypothetical, having no existence in the past or future. West [3] describes the need for tracking the state and status of an individual along time (including to which physical product the individual belongs or is part of). The author also describes how this need inspired the development of ISO 15926-2 and recommends as a solution the use of International Standards combined with ontologies. Batres et al. [4] describe their effort to develop an ontology based on ISO 15926, analyse Part 2 and briefly show how time is used to demonstrate the continuity of functionality of the parts. Roddick et al. [5] discuss the significance of time in spatio-temporal data mining systems and describe the future research that needs to be carried out. Zhang et al. [6] suggest a model for the lifecycle of the infrastructure system facilitating spatio-temporal data. Roddick et al. [7] in their bibliographic research point out the value of investigating temporal, spatial and spatio-temporal data for future knowledge generation. In the PROMISE [8] semantic object model the continuity of the parts over time is also considered important, and it is stored in the "part of" class. Jun et al. [9] developed a time-centric ontology model for product lifecycle meta-data supporting the concept of Closed-Loop PLM. In the "four dimensional models", time attributes are included in a separate part of the model (Date_Time class) to which other parts (not necessarily all parts) are associated through relationships, as shown in Figure 1. Such systems become complex due to the large number of relationships between the Date_Time class and the other parts of the model. Furthermore, time data is not collected about the whole system for the whole life cycle; this occurs either in cases where not all parts are connected to the Date_Time class or in cases where the architecture of the system changes along the life cycle and the relationships to the Date_Time class are altered/affected.
Figure 1. Schematic representation of a four dimensional model.
In a significant number of models which do not claim to be four dimensional, time attributes exist in the parts of the model where time was considered necessary by the model designer. Most commonly, time attributes are in the parts of the model describing the "process", the "activity" (having starting time, finishing time and duration) and the "event" (having points in time or time stamps). An example is shown in Figure 2. These types of models face data integration and interoperability issues and are mostly developed to describe specific applications. Moreover, time data do not cover the whole system, which has consequences in later stages, when time elements are required (i.e. feedback from maintenance to design) but were not collected and are therefore not available.
Figure 2. Schematic representation of a model with time/date attributes distributed in various classes.
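The short sketch below is an assumed rendering, not taken from the paper, of the conventional modelling style of Figure 2: time attributes live only inside the classes designed to carry them (Activity, Event), while other parts of the model hold no time data at all.

```python
# Conventional style: time is an attribute of selected classes only.
from dataclasses import dataclass

@dataclass
class Activity:
    name: str
    start_time: str       # e.g. "2009-09-28T09:00"
    end_time: str

@dataclass
class Event:
    name: str
    time_stamp: str       # a single point in time

@dataclass
class PhysicalProduct:
    serial_number: str    # no time attributes here: the limitation discussed above
                          # when later lifecycle phases need time data for this part
```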
3 PROPOSED METHODOLOGY
The aim of the proposed methodology is to improve today's ALM and PLM systems by changing the use of time in these systems. The importance of time in ALM and PLM has been noted in the previous section. However, time has some qualities which make it special among all attributes. Time is the only fundamental dimension which objectively exists along the entire life cycle of all individuals (including materials and physical products). Time exists in our everyday life at different levels: the duration of accomplishing a task, the duration of a coffee break, the duration of a phone call, the duration of studies, the age of a human, the Roman era, the duration of a trip, the duration of a maintenance activity, the working hours of a machine, etc. Time also has granularity, which makes it more easily comprehensible by humans depending on the application; i.e. it is easier to understand that I signed a five year contract than that I signed a 43800 hour contract. In this way time affects all aspects of individuals and their qualities: people get older (changes in character due to experience, in health, etc.) and objects wear out, and all have the need for some type of maintenance. Furthermore, time is simple, comprehensive and objective and therefore application independent. For instance, a duration of 5 years is understood by all systems and humans, although it might have a different meaning and importance when referring to the age of a human or of a machine. For instance, if one is employed by company A for a duration of 5 years, it is not really important for him to know that the company has a history of 150 years; from the company's point of view the individual exists only for a small fraction of its life, whereas for the individual 5 years is an important part of his 35 years of work. Regarding assets, time has the meaning of useful life, working hours, maintenance intervals, etc. Similarly, a used component of a machine has its time in the previous machine and now has a life in the current machine. Its lifetime history would be the following: duration of MOL A in machine A (during which it performs task A1, task A2, etc. with durations A1, A2, etc.), duration of re-manufacture, and duration of MOL B in machine B (during which it performs task B1, task B2, etc. with durations B1, B2, etc.). Of course the component might have an unlimited number of future uses. In this way time describes the continuity of the component's functionality. In today's systems, although time attributes exist in various parts of the systems, there are no systems which are based on time. These qualities of time were the motivation to select time as the basis for our methodology for model development, the "Duration of Time concept". It introduces the idea of seeing all aspects and elements of a model as parts of time, and it provides flexibility, application independence and simplicity. In this way time exists naturally in everything, but sometimes we do not really perceive it since our view is too "narrow" to see the big picture and we focus only on the small part which affects us directly, considering time in its generic meaning as stable. This work introduces the "Duration of Time concept" for improving today's ALM and PLM systems in the domains of data visibility, data integration and system interoperability. The main element of the concept, used for improving systems performance, is time.
The concept is that time should not be one part of the model; it should be the basis of the model, and all other elements should be parts of it. The "Duration of Time concept" has unique advantages over existing concepts, which stem from the qualities of time described above. Time is objective and it may be used as a guiding basis for achieving data integration and system interoperability. Therefore, systems built on this concept take advantage of the characteristics of time and, combined with semantics, provide data visibility, data integration and system interoperability. Time is used as a basis to provide a first step towards system-to-system visibility and common understanding: two different time based systems will certainly have their time attributes in common and are therefore synchronised, even though they might have been extended and used differently. The method is easy to apply to existing models by making a "duration of time" class a super-class of all classes of the model. This class provides the unified time framework for the entire system. A schema of this model is shown in Figure 3. The concept is protected by a patent provisional application.
Figure 3. Schematic Duration of Time representation example.
4 CASE STUDY
This case study demonstrates an application of the Duration of Time concept on an ALM/PLM ontology model, highlighting the capabilities of the final model. The model used is based on the Product Data and Knowledge Management Semantic Object Model (SOM) developed by Matsokis et al. [10], on which the Duration of Time concept has been implemented. The SOM has been made a subclass of the duration of time class and has been extended to facilitate the case study. It describes the maintenance activities of locomotives and also includes parts of the model, such as documents, which engineers are not used to treating (seeing) from the time point of view. The case study describes the application of the model by an authorised locomotive maintenance provider (MP). The MP is specialised in one model/type of locomotive. The MP has two maintenance platforms, Platform A and Platform B; each has one machine to aid maintenance, Machine A and Machine B, and one mechanic who performs the maintenance on the platform, Mechanic A and Mechanic B; each mechanic uses one tool-box, Tool-Box A and Tool-Box B. There are also 5 documents: Document 1, Document 2, Document 3, Document 4 and Document 5. Document 1 contains the field data from the locomotive and is updated each time the locomotive visits an authorised MP (one per locomotive, which is why we have Document 1a, Document 1b, etc.). Document 2 contains the maintenance history of the locomotive and is updated each time the locomotive enters maintenance (one document per locomotive, with a, b, c and d, similarly to Document 1). Document 3 contains the manufacturer's guidelines for performing maintenance according to the working hours of the locomotive or to the period of time passed since the last maintenance. Document 4 contains the manufacturer's instructions, with schemas, for removing and replacing parts. Document 5 contains the information about the stock of spare parts. To facilitate this application and to categorise the data better, the model was extended accordingly. The development process was:
• The class Duration of Time was made the superclass of the model (see the sketch after this list).
• A time framework for the existing PLM ontology was developed. This framework holds the only "time" properties of the ontology (start_date_time, end_date_time, duration). Thus, all classes and subclasses of the ontology share the same "time" framework.
• A central reference time, CET, was chosen. In this way, misunderstandings concerning time in communication between different agents around the globe are avoided.
• The model was extended to facilitate the case study.
• Instances are stored for every physical product, activity, event, process, resource, etc. necessary.
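These steps can be illustrated with a minimal Python sketch; the class and attribute names below are illustrative assumptions and do not reproduce the actual SOM implementation. A single duration-of-time superclass carries the time framework (start_date_time, end_date_time, duration) that every other model element inherits, with CET as the common reference time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

CET = ZoneInfo("Europe/Zurich")  # single central reference time, as chosen in the case study

@dataclass
class DurationOfTime:
    """Superclass providing the unified time framework for every model element."""
    start_date_time: datetime
    end_date_time: datetime

    @property
    def duration(self) -> timedelta:
        return self.end_date_time - self.start_date_time

@dataclass
class Resource(DurationOfTime):
    """Any resource (machine, mechanic, tool-box, document) seen as a duration of time."""
    name: str = ""

@dataclass
class MaintenanceActivity(DurationOfTime):
    """A maintenance activity, also inheriting the same time framework."""
    performed_by: str = ""
    on_locomotive: str = ""

# Example instance: Mechanic A maintaining Locomotive No1, expressed in the CET reference time
activity = MaintenanceActivity(
    start_date_time=datetime(2009, 6, 1, 8, 0, tzinfo=CET),
    end_date_time=datetime(2009, 6, 1, 9, 30, tzinfo=CET),
    performed_by="Mechanic A",
    on_locomotive="Locomotive No1",
)
print(activity.duration)  # 1:30:00
```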
The final model is shown in Figure 4. Only three locomotives are involved in the case study: Locomotive No1, No2 and No3.
Figure 4. Ontology model extended with necessary classes
4.1 System Analysis and Functionality
In this scenario locomotives visit the MP with an appointment. The duration of time for each resource, activity, etc. is shown in Figure 5. The colours blue, red and green in the rows refer to Locomotive No1, No2 and No3 respectively and show for which locomotive, and for how long, each resource is used. This could refer to the future (a daily/weekly/monthly schedule according to appointments). All the uncoloured cells of a row represent the time that the resource related to this row is in idle status. Each column represents 5 minutes; these time periods could have been of any required granularity, such as years, months, days, hours, minutes, seconds or milliseconds. In Figure 5, Locomotive No1 arrives at the service department and Mechanic A is responsible for it. He updates Document 1a with field data from the locomotive's on-board computer unit and checks Document 2a, which contains its maintenance history. Then, according to the status of the locomotive, he reads the manufacturer's guidelines for this type of locomotive to see the maintenance activities to be performed and decides to replace some parts. He checks Document 5 to see whether any are in the local stock and Document 4 for the replacement instructions. The activities for Locomotives No2 and No3, shown in Figure 5, are similar (for Locomotive No2 there is no need to remove/replace parts, and Locomotive No3 arranges an appointment out of schedule). If the MP provided multiple maintenance sites, Locomotive No3 would have chosen the closest, soonest available site. Documents, like all resources, are seen as duration of time elements which appear in the system when they are used.
Figure 5. MOL locomotives case study as seen from the "duration of time" point of view, with queries.
Using the duration of time approach provides engineers with all the necessary information on the state of each resource at every moment. Engineers can obtain information through Which-queries such as "Which machines are available at this time slot?", which is equivalent to "Who is in standby status at this time?" and returns all the non-active values at that duration of time, or through Availability-queries such as "Is Mechanic A available at a certain time?" or "When and for how long is a certain resource (mechanic, machine or document) available?", which return instances showing availability. This information is used for the best management of the resources. Moreover, the system also provides the duration of time for which a locomotive uses each resource. Several examples of such queries are shown in Figure 5. Firstly, a Which-query applied on the model about the machines is shown, describing "Which machine(s) is (are) available right now (now = 8:40 AM) and for how long?". It returns the idle instance(s) of the available resources, or nothing if the resources are not available. Secondly, the query "Is Mechanic B available right now (now = 6:30 AM)?" is shown. This query applies only to a certain resource instance (the query could be more generic, e.g. "Who is available at this time?") and returns either the idle instance, if the resource is available, or nothing if it is not. Furthermore, Figure 5 shows an example of the query "When and for how long is Machine A available until 11 AM?", which applies to all instances of Machine A and returns its idle instances. Finally, an example of the query "When and for how long is Document 3 used?" is shown, returning all the time slots during which Document 3 is being used.
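Assuming a simple in-memory list of duration-of-time instances (the resource names and times below are hypothetical), a Which-query and an Availability-query could be answered along the following lines: a resource is available at a given moment if none of its duration-of-time instances covers that moment.

```python
from datetime import datetime

# Hypothetical duration-of-time instances: (resource, locomotive, start, end)
usage = [
    ("Machine A",  "Locomotive No1", datetime(2009, 6, 1, 8, 0),  datetime(2009, 6, 1, 8, 35)),
    ("Machine B",  "Locomotive No2", datetime(2009, 6, 1, 8, 30), datetime(2009, 6, 1, 9, 0)),
    ("Mechanic B", "Locomotive No2", datetime(2009, 6, 1, 6, 0),  datetime(2009, 6, 1, 7, 0)),
]

def busy(resource: str, at: datetime) -> bool:
    """True if some duration-of-time instance of the resource covers the given moment."""
    return any(r == resource and start <= at < end for r, _, start, end in usage)

def which_available(resources, at: datetime):
    """Which-query: return the resources that are in idle status at the given moment."""
    return [r for r in resources if not busy(r, at)]

# "Which machine(s) is (are) available right now (now = 8:40 AM)?"
print(which_available(["Machine A", "Machine B"], datetime(2009, 6, 1, 8, 40)))  # ['Machine A']
# "Is Mechanic B available right now (now = 6:30 AM)?"
print(not busy("Mechanic B", datetime(2009, 6, 1, 6, 30)))                        # False
```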
4.2 Outcome of the case study
This case study has demonstrated that the initial model becomes simpler with the implementation of the Duration of Time concept, since the time attributes are unified and, in case of model extension, these attributes are inherited. A number of applications have shown that the system provides complete data visibility and therefore supports inter-OEM/supplier co-operation for better exploitation of resources. Under this perspective one can have an overview of all documents, resources, etc. of all systems. Using queries such as those above, engineers are provided with a complete overview of the time slots and are supported in decision making for the optimal management of resources, activities, agents and processes. Moreover, the entire model is described by the Duration of Time concept while keeping its previous functionalities. Finally, through time it is very simple to track system or data changes and thus keep track of all the past states of all the parts of the system.
5 CONCLUSION
Time has the characteristic of being objective and of existing in all living beings and materials. The Duration of Time concept has unique advantages over existing concepts exactly because it takes advantage of these characteristics. According to this concept, time is used as the basis of the model and allows all the parts of the system to be seen from the time point of view. A system which implements the Duration of Time concept provides flexibility, application independence and simplicity, since the characteristics of time are objective and time can be used as a guiding basis for achieving data integration and system interoperability. The case study has shown that the concept is easy to apply to existing systems, that it makes systems simpler, that the system is described through duration of time and that system or data changes are tracked through time. Future work includes research on the extent to which time data on all system parts supports vertical visibility across the different systems of the different levels, and therefore system interoperability and data integration under multi-system circumstances, application of the model to more existing ALM/PLM systems, and further use of the concept in combination with semantics to provide benefits for industry.
6 REFERENCES
1
Sider T. (2001) Four-dimensionalism: An Ontology of Persistence and Time. Oxford University Press.
2
ISO 15926-2:2003 Integration of lifecycle data for process plant including oil and gas production facilities: Part 2 – Data model: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=29557 (April 2009).
3
West M. (2004) Some industrial experiences in the development and use of ontologies. EKAW 2004 Workshop on Core Ontologies in Ontology Engineering, pp. 1-14
4
Batres R, West M, Leal D, Price D, Masaki K, Shimada Y, Fuchino T, Naka Y. (2007) An upper ontology based on ISO 15926. Computers and Chemical Engineering, 31 (5-6), pp. 519-534
5
Roddick JF, Egenhofer MJ, Hoel E, Papadias D, Salzberg B. (2004) Spatial, temporal and spatio-temporal databases Hot issues and directions for PhD research. SIGMOD Record, 33 (2), pp. 126-131
6
Zhang C, Hammad A. (2005) Spatio-temporal issues in infrastructure lifecycle management systems. Proceedings, 1st Annual Conference - Canadian Society for Civil Engineering Toronto, pp. FR-131-1-FR-131-10
7
Roddick JF, Hornsby K and Spiliopoulou M. (2000) An Updated Bibliography of Temporal, Spatial, and Spatio-temporal Data Mining Research. Temporal, spatial, and spatio-temporal data mining: first international workshop, TSDM 2000, Lyon, France, pp. 147–163. Heidelberg/Berlin: Springer Verlag.
8
PROMISE Research Deliverable 9.2: http://www.promise.no/index.php?c=77&kat=Research&p=13|, (April 2009).
9
Jun H-B, Kiritsis D, and Xirouchakis P. (2007) A primitive ontology model for product lifecycle meta data in the closed-loop PLM. In: Gonçalves RJ, Müller JP, Mertins K, and Zelm M, editors. Enterprise Interoperability II: New Challenges and Approaches, pp. 729-740. London: Springer Verlag.
10
Matsokis A, Kiritsis D. (2009) An Ontology-based Approach for Product Lifecycle Management. Computers in Industry. Special Issue: Semantic Web Computing in Industry. In press.
Acknowledgments This work was carried out in the framework of SMAC project (Semantic-maintenance and life cycle), supported by Interreg IV programme between France and Switzerland.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
FUNCTION PERFORMANCE EVALUATION AND ITS APPLICATION FOR DESIGN MODIFICATION BASED ON PRODUCT USAGE DATA Jong-Ho Shin a, Dimitris Kiritsis a, and Paul Xirouchakis a a
Institut de Génie Mécanique, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland.
In recent years, companies have been able to gather more data from their products thanks to new technologies such as product embedded information devices (PEIDs, Kiritsis et al. (2004)), advanced sensors, the internet, wireless telecommunication, and so on. Using them, companies can access working products directly, monitor and handle products remotely, and transfer the generated data back to appropriate company repositories wirelessly. However, the application of the newly gathered data is still primitive, since it had been difficult to obtain this kind of data before these technologies were developed. The newly gathered data becomes applicable to product improvement once it is transformed into appropriate information and knowledge. To this end, we propose a new method to manage the newly gathered data to complete closed-loop PLM. The usage data gathered in the MOL phase is transferred to the BOL phase for design modification so as to improve the product. To do this, we define new terms regarding function performance which take into account the historical change of function performance. The proposed definitions are developed to be used in design modification, so that they help engineers to understand the working status of components/parts during the usage period of a product. Based on the evaluation of the working status of components/parts, the critical components/parts are discriminated. For the critical components/parts found, their working status is examined and correlated with field data, which consists of operational and environmental data. The correlation provides engineers with the critical field data which have an important effect on the degraded working status. Hence, the proposed method provides the transformation from usage data gathered in the MOL phase to information for design improvement in the BOL phase. To verify our method, we use a locomotive case study. Key Words: Performance evaluation, design improvement, multi linear regression model, degradation
1 INTRODUCTION
Thanks to recently developed technologies such as product embedded information devices (PEID, Kiritsis (2004)), various sensors, wireless telecommunication, internet, and so on, a company is able to access and monitor its products remotely, to gather product data continuously, and even more to handle products directly. Through these new technologies, the information flow between company and products extends to the whole product lifecycle, which is called closed-loop product lifecycle management (PLM). Until now, the main interest of closed-loop PLM was the beginning of life (BOL) phase such as supply chain management and production management. The product usage data generated during the middle of life (MOL) phase is less handled since the technological environment to gather them was premature. Nowadays, these limitations are overcome by new technologies so that company is able to gather and use various kinds of product usage data in a ubiquitous way. However, even though the technological infrastructure to gather product usage data from the MOL phase is well established, an application method to use them is still immature and in its infancy. Product usage data can provide a company with more opportunities to understand its products and to improve them. Furthermore, in case product usage data is transformed into appropriate information or knowledge, they are applicable for various objectives such as design improvement, production optimization, advanced maintenance, effective recycle, and so on. For example, the accurate understanding of product status helps design engineers to find and modify false design parameters. The failures of the improved product will be reduced and its reliability will be increased, which enhances customer satisfaction so that it assures companies to survive in the harsh market environment. Therefore, it is necessary to develop a method to apply product usage data into product improvement.
To this end, in this study, we propose a design improvement support method based on product usage data gathered during the MOL phase. With the proposed method, product usage data is transformed into information to support design improvement. The transformation procedure consists of several steps (see Figure 1). In the first step, we decompose product functions. Using the decomposed functions, degradation scenarios are defined. Degradation scenarios show degradation relationships among the decomposed functions. By degradation scenarios, the importance rates of functions are calculated. Then, the working status of functions during the usage period is calculated through function performance measure. The calculated working status combined with degradation scenarios is used to find the critical time instances when functions show poor working status. By the critical time instances, the field data that consists of operational and environmental data are classified as normal and abnormal. Then, the classified field data are compared with each other using clustering technique. The comparison between normal field data and abnormal field data make it possible to find the critical field data. The critical field data are causable field data which affect poor working status of functions. At the last step, using a relation matrix between the critical field data and design parameters, the importance rates of design parameters are calculated. The design parameters having high importance rates should be checked and modified so as for the related functions not to be affected by harsh operation and environment. The proposed method is described using a locomotive case study to show validity and to help understanding. This paper is organized as follows. In section 2, we introduce previous research works on product usage data application, design improvement using historical data, performance, and performance degradation. In section 3, we explain the overall procedure of our proposed method to transform product usage data into information to support design improvement with case study. In the last section, we will conclude our research and provide some discussions.
2 STATE-OF-THE-ART
There are many ways to explore and exploit product usage data. Among them, the maintenance area is the most active application field. In this application, the product usage data is applied into the enhanced maintenance decision such as predictive maintenance, condition based maintenance, and so on. Xiaoling et al. (2007) proposed a method to use historical time series data in predictive maintenance. In their work, the historical time series data is adopted for auto-regressive moving average (ARMA) model which is usually used for the prediction of a trend and future behaviour. In the ARMA model, residual series are calculated and they are used to predict a machine failure. Based on the machine failure prediction, the maintenance policy is decided. Chang-Ching and Hsien-Yu (2005) proposed a method to estimate machine reliability based on its status. The status is monitored by the product usage data. In their work, the vibration signal is monitored and the basic information for predictive maintenance such as hazards model, reliability, and mean time between failures (MTBF) is calculated using cerebellar model articulation controller neural network-based machine performance estimation model (CMAC-PEM). Bansal et al. (2005) proposed a real-time predictive maintenance based on product characteristics during operation. As product characteristics, the motion current signature of DC motor is monitored and the distinct motor loads is classified using neural network. A real-time prediction responds to product characteristics concurrently. In these applications, the product usage data is used for the prediction of future product status. They do not focus on understanding how the historical data changes and affect product status from the viewpoint of design modification. In another field where the product usage data is used for design improvement, Delaney and Phelan (2009) proposed a method to use historical process data for robust design improvement. In their work, the historical data obtained during production is referred to a new product design. Through the application of the processing data, a performance variation of a new product can be estimated early in the design phase so that more robust design tolerance can be defined. The functionfailure design method (FFDM) (Stone, Tumer et al. 2005) is another method to perform failure analysis in conceptual design. The FFDM offers substantial improvements to the design process since it enhances failure analysis so that it reduces necessary redesigns. In many research works, the failures are well studied to find failure cause and to improve designs based on this analysis; fault tree analysis (Zampino 2001), hazard analysis (Giese and Tichy 2006), failure mode and effect analysis (FMEA) (Chin, Chan et al. 2008), and so on. However, there is still lack of methods that consider product status for design improvement. To consider the product usage data for product improvement, an evaluation method of the historical data for product improvement is required so as to understand and use it appropriately. The failure rate, performance, and performance degradation are widely used evaluation measures of product status. Among them the performance and its degradation are a good reference to understand product status. In general, it is useful to measure product performance during product usage period in many industrial fields. 
If companies know the actual status of products in the context of performance, they can provide much effective services such as predictive maintenance. Furthermore, they can improve product design to fix the reason of performance degradation. However, it is difficult to define product performance because of the ambiguity of performance definition. Osteras et al. (2006) showed some definitions of performance and suggested their own definition of performance as a vector of performance variables. In spite of these efforts to explain performance, the definition of performance is still hard to use in a verbal form. To use product performance in the company, it is required to convert product performance into a numerical value. Many previous research works tried to define product performance as measurable values. For example, Lee et al. (2006) showed some examples of performance measure. In the first example of their work, the
vibration signal waveform monitored by an accelerometer was used as the performance measure of a bearing. Another example is about controller area network (CAN). In this example, overshoot at which the signal goes beyond the steady state value was used as the performance value of CAN. Based on performance definitions, performance degradation can be formulated depending on the purpose of applications such as design improvement, maintenance, remanufacturing policy, and so on. There have been some relevant research works on these purposes. For example, Djurdjanovic et al. (2003) suggested various methods for performance assessment such as statistical overlap between performance related signatures, feature map pattern matching, logistic regression, cerebellar model arithmetic computer (CMAC) neural network pattern matching, hidden Markov model based performance assessment, particle filter based, and so on. Bucchianico et al. (2004) used the maximum amplitude of the first peak of the current signal as a performance feature. With current signal, they applied wavelet analysis to simplify the description of signal. Then, they used it as features to perform the analysis of variance (ANOVA). Furthermore, Tang et al. (2004) used the intensity of LED as degradation measure. The temperature with time interval is changed and the degradation data of the light emitting diodes (LED) is modelled as degradation path. Using degradation path, they proposed optimal test plan for accelerated degradation test (ADT). Recently, Jayaram and Girish (2005) proposed a degradation data model called generalized estimating equation (GEE). Using this model, they predicted the characteristics of Poisson distribution of degradation data set whose marginal distribution is Poisson, from which they could estimate reliability. Even though there have been much effort on performance degradation definition and modelling, there are few applications which focus on design improvement. Usually, the product degradation is related with maintenance or product reliability issues. Moreover, there is a lack of research works which deal with the connection between degradation and field data to improve design parameters.
3 PROCEDURE EXPLANATION WITH CASE STUDY
To apply product usage data for design improvement, we propose the following procedure (see Figure 1). The proposed procedure consists of four parts: 1) function evaluation, 2) function performance degradation evaluation, 3) field data evaluation, and 4) design parameter evaluation. In the first part, the product functions are evaluated from the viewpoint of performance degradation. In the second part, the functional working status of the functions is calculated. In the third part, the field data is evaluated based on the functional working status. In the last part, the design parameters are evaluated.
< function evaluation >: 1. Decompose product functions; 2. Define degradation scenarios; 3. Calculate function importance rate
< function performance degradation evaluation >: 4. Calculate function performance degradation of functions at each time instance
< field data evaluation >: 5. Find the critical time instances; 6. Make field data clusters; 7. Find field data out of range from clusters
< design parameter evaluation >: 8. Build relation matrix between design parameters and field data
Figure 1. Overall procedure.
The proposed procedure is explained in detail using a case study 'locomotive components'. A locomotive is a complex system consisting of millions of components/parts. Among them, we extract several components and parts that are related to the locomotive braking system for this case study. Figure 2 shows the general architecture of the locomotive components that are considered in the case study. Based on the targeted components, the functions are defined and the performance measure is selected as the failure rate for all functions. For the case study, we simplify the functions and their relationships. Also, we make some assumptions:
• The component working status is measured by a performance measure.
• Field data affects the functional working status.
• Performance degradation by operational and environmental effects is related to design parameters.
Figure 2. Selected components for the case study (source: http://www.knorr-bremse.co.uk/).
3.1 Function evaluation
In this part, functions are evaluated from the viewpoint of performance degradation. To do this, first of all, the product functions are decomposed. The objective of function decomposition is to show how functions are structured and how they are connected with each other. A product consists of several functions depending on its objective. These functions are correlated with each other to fulfil objectives: some functions work together to perform another objective, while others accomplish their objectives without any function combination. To clarify these relationships, we decompose product functions into detailed sub-functions depending on their roles and their levels. According to the function relationships, functions can be layered at several levels. A product can have several functions at the first level; each function at the first level can be decomposed into several sub-functions at the second level; and a sub-function can again be divided into sub-functions at a lower level. The depth of the function decomposition can be extended depending on the complexity of the functions. The decomposed functions are usually connected with each other, and these complex relationships among the decomposed functions are used to define degradation scenarios. The relationship among functions is defined as the energy/material/information flow, and degradation scenarios follow these flows: an abnormal energy/material/information flow causes abnormal function performance degradation of the related functions. For example, in Figure 3, if the material input of the function 'F11' decreases, the output material of the function 'F11' decreases; the reduced material from the function 'F11' makes the function 'F12' work improperly. Hence, the material flow and the energy flow are good references for defining degradation scenarios. Considering all possible function relationships, all existing degradation scenarios should be defined in this step.
Figure 3. Degradation scenarios
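As a minimal sketch of this idea, a degradation scenario can be collected by following the energy/material/information flows from an initiating degraded function; the flow graph below is a hypothetical fragment, not the actual decomposition of the braking system.

```python
# Hypothetical flow graph: an edge F -> G means that an output (material/energy/information)
# of function F is an input of function G, so degradation of F can propagate to G.
flows = {
    "F11": ["F12"],
    "F12": ["F13"],
    "F21": ["F22"],
}

def degradation_scenario(initiating_function: str) -> list[str]:
    """Collect the functions reachable from an initiating degraded function (one scenario)."""
    scenario, stack = [], [initiating_function]
    while stack:
        f = stack.pop()
        if f not in scenario:
            scenario.append(f)
            stack.extend(flows.get(f, []))
    return scenario

print(degradation_scenario("F11"))  # ['F11', 'F12', 'F13']
```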
Using the degradation scenarios, we identify important degraded functions which lead to other, consecutive function performance degradations. For example, among consecutively connected functions, the first function to degrade can be the initiating degradation of a degradation scenario. In general, some functions are involved in multiple degradation scenarios; if these functions do not work properly, several degradation scenarios are affected. Hence, these kinds of functions are more important than less correlated functions. These multiple correlations between functions and degradation scenarios are described in the form of a correlation matrix (see Table 1).
Table 1. Correlation matrix between the functions F11-F44 and the degradation scenarios D1-D10, with each correlation rated as S(5), M(3) or W(1). The resulting function importance rates (column sums) are: F11 = 19, F12 = 5, F13 = 4, F21 = 1, F22 = 10, F23 = 9, F24 = 6, F31 = 12, F32 = 6, F33 = 3, F34 = 5, F41 = 6, F42 = 13, F43 = 8, F44 = 3.
The correlation between the functions and the degradation scenarios is rated with a usual rating scale such as 1-3-5, 1-5-9, and so on. For example, the function 'F11' in Table 1 is related to the degradation scenarios D1 to D5 (multiple correlations). After rating the correlations, the function importance rate is calculated as the sum of the correlations. According to Table 1, a function which has strong correlations with several degradation scenarios has a high function importance rate. For example, the functions 'F11', 'F31' and 'F42' are highly correlated with several degradation scenarios and therefore have high function importance rates. The function importance rate will be used in the relation matrix between design parameters and field data in the last step.
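In essence, the function importance rate is a column sum over the correlation matrix. The sketch below uses a small hypothetical excerpt of the ratings on the S(5)/M(3)/W(1) scale rather than the full Table 1.

```python
# Hypothetical excerpt of the correlation matrix: rating of each function in each scenario
ratings = {
    "D1": {"F11": 5, "F12": 1, "F13": 1, "F21": 1},
    "D2": {"F11": 5, "F12": 1},
    "D3": {"F11": 3, "F12": 1, "F13": 3},
}

def function_importance(ratings: dict) -> dict:
    """Importance rate of a function = sum of its ratings over all degradation scenarios."""
    importance = {}
    for scenario in ratings.values():
        for function, weight in scenario.items():
            importance[function] = importance.get(function, 0) + weight
    return importance

print(function_importance(ratings))  # {'F11': 13, 'F12': 3, 'F13': 4, 'F21': 1}
```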
3.2 Function performance degradation evaluation
The function performance degradation can be monitored and calculated through the function performance measure. The function performance measure is decided by the function objective, such as voltage, vibration, current, speed or noise. Using the function performance measure, the performance degradation of each function at each time instance is calculated by equation (1):
Dij = Pie - Pij, if Pij ≤ Pie; Dij = 0, if Pij > Pie (1)
where
i : index of function
j : index of the monitoring time instance of the function performance measure, 1 ≤ j
Dij : function performance degradation of function i at the j-th time instance
Pie : expected amount of the function performance measure of function i by design specification
Pij : monitored amount of the function performance measure of function i at the j-th time instance
For each decomposed function in this case study, the failure rate is selected as the function performance measure. Hence, 'Pie' and 'Pij' in equation (1) are substituted with the expected failure rate and the monitored failure rate, respectively.
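A small sketch of equation (1), applied to hypothetical failure-rate readings:

```python
def degradation(expected: float, monitored: float) -> float:
    """Equation (1): D_ij = P_ie - P_ij, set to 0 when the monitored value exceeds the expected one."""
    return expected - monitored if monitored <= expected else 0.0

expected_rate = 0.10                       # P_ie, from the design specification (hypothetical)
monitored = [0.08, 0.05, 0.12, 0.02]       # P_ij at four time instances (hypothetical)
print([round(degradation(expected_rate, m), 3) for m in monitored])  # [0.02, 0.05, 0.0, 0.08]
```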
3.3 Field data evaluation
The function performance degradation has been calculated over the usage period in the previous part. Hence, it represents the working status of the functional ability of each function at each time instance. High function performance degradation means that a function has seriously lost its working status and is therefore much degraded. A time instance having high function performance degradation is defined as a critical time instance. To find critical time instances, we set a threshold value for the function performance degradation. The threshold value is set as a reliable limit of the function performance degradation that still assures an acceptable functional ability. This value can be defined by empirical tests in the laboratory, previous data from similar products/components/parts, knowledge from engineers, design specification for the reliability, and so on. In case the function performance degradation is higher than the threshold value at a certain time instance, this time instance is classified as a critical time instance. The critical time instances of each function form a set tci (see equation (2)). Table 2 shows the critical time instances identified by equation (2) in the case study; according to Table 2, 'F22' shows a lot of critical time instances, 78 in total.
tci = {tij | Dij ≥ Dim, 0 ≤ tij ≤ EOL of product} (2)
where
tci : set of critical time instances of function i
Dim : threshold value of Dij of function i
tij : monitoring time of function i at the j-th time instance
In the function performance degradation analysis, the field data should be considered concurrently, because it can be closely correlated with the behaviour of the function performance degradation. When high function performance degradation occurs, it is necessary to find the field data causing it. In this step, we use a clustering method to find the causing field data. In general, the field data consists of a large amount of data gathered throughout the usage period. Since the data covers various ranges due to the usage environment, we use a clustering technique to classify the field data into a few groups. For example, the locomotive used in the case study works in various environments such as cold regions, hot regions, high-altitude regions, dry regions, and so on. In this case the range of the field data becomes diverse, and the normal range of field data in which the locomotive works properly should be identified. Since clustering is a useful method to group data in a condensed form, it is used to classify the field data in this part. However, the usual clustering methods do not consider the quality of the data to be clustered: there is no discrimination between good and bad field data. To overcome this limitation, we separate the field data according to the function performance degradation. The field data gathered at the critical time instances (tci) is regarded as abnormal and the rest as normal, since the field data at the critical time instances corresponds to high function performance degradation. With the normal field data we create clusters; hence, the field data clusters made from normal field data assure good function performance. Then, we calculate the mean and variance of each cluster. The calculated means and variances are used as reference values representing the normal field data range, which corresponds to a normal functional working status. Table 3 shows the result of the clustering for the function 'F22', which shows many critical time instances, i.e. the normal field data range for a good working status of 'F22'. During the product usage period, various kinds of field data such as outer temperature, voltage, current, speed and global position are recorded by PEIDs. The field data usually covers various ranges, since the product is used in different environments and by different users; for example, a locomotive records different field data depending on in which country and by which company it is used. Therefore, the clusters created in the previous step show the field data range in which the function performs its objective well. From this, we know in which range of the field data the product works without any problem. These clusters become the reference values of the field data which assure a normal status of the functional ability. Then, we compare the field data gathered at the critical time instances with the reference values obtained from the clusters of normal field data. From this comparison, we can separate the field data which is out of range of the clusters. The identified field data which is mostly out of range can be one of the possible causes of the function performance degradation, in case the function performance has been degraded by the abnormal field data.
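The sketch below compresses this step into a single-cluster simplification (the paper forms several clusters per field data): critical time instances are selected with equation (2), the readings at the remaining instances define the normal range, and field data at critical instances lying outside that range is counted as abnormal. The series and the threshold are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical series for one function: degradation D_ij and one field-data channel
degradation_series = [0.01, 0.02, 0.15, 0.01, 0.20, 0.03]   # per time instance j
catenary_current   = [60.0, 62.0, 95.0, 61.0, 110.0, 63.0]   # field data at the same instances
threshold = 0.10                                             # D_im, reliability limit

# Equation (2): critical time instances are those where the degradation reaches the threshold
critical = [j for j, d in enumerate(degradation_series) if d >= threshold]       # [2, 4]

# Field data at the non-critical instances is treated as normal and summarised (one "cluster")
normal = [v for j, v in enumerate(catenary_current) if j not in critical]
m, s = mean(normal), stdev(normal)

# Field data at the critical instances lying outside the normal range is counted as abnormal
abnormal = [catenary_current[j] for j in critical if abs(catenary_current[j] - m) > 3 * s]
print(critical, abnormal)   # [2, 4] [95.0, 110.0]
```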
Table 3 shows the result of the clustering. These clusters exclude the field data gathered around the critical time instances of the function 'F22'. The number of clusters created differs for each field data. Based on the clusters, the critical field data which are out of the normal field data range are counted in Table 4.
Table 2. Degradation scenarios and critical time instances: for each degradation scenario D1-D10, the functions involved and their critical time instances (tci). 'F22' (78 times) and 'F34' (18 times) are the functions showing critical time instances (N/A: there is no failure occurrence during the usage period).
Table 3. Field data clustering result for the function 'F22': mean and variance of each cluster of normal field data (up to seven clusters, depending on the field data) for Mileage, Catenary voltage, Catenary current, Filter current, Heating circuit current, Locomotive acceleration and Brake cylinder pressure. Although the field data is the same for all functions, the normal field data will depend on each function. Here, we consider that all field data affect all functions; however, with a more detailed description of the sensor allocation scheme of the locomotive components, we could differentiate the field data that are applicable to each function.
Table 4 Number of abnormal field data
Field data | F22 (78 times) | F34 (18 times)
Mileage | 0 | 0
Catenary voltage | 45 | 45
Catenary current | 67 | 67
Filter current | 25 | 25
Heating circuit current | 77 | 77
Locomotive acceleration | 18 | 18
Brake cylinder pressure | 56 | 58
According to Table 4, the field data 'Catenary current', 'Heating circuit current' and 'Brake cylinder pressure' show a high abnormal occurrence, which means that these field data are highly correlated with the degradation of the functions 'F22' and 'F34'. Hence, we select these three field data as the critical field data which have an effect on the functions 'F22' and 'F34'. Then, we build a relation matrix between the field data and the design parameters of the components/parts related to the functions 'F22' and 'F34'.
3.4 Design parameter evaluation
In the last step, we combine the critical field data found in the previous step with the design parameters of the components/parts. The abnormal field data found in the previous step is regarded as strongly correlated with the function performance degradation. The design parameters are defined by the component/part design specifications. The components/parts are classified and connected through the function decomposition, since each function requires components/parts to perform its objectives. To build and evaluate the relationship between design parameters and field data, we use a matrix form (see Table 5) similar to the house of quality (HOQ) used in the quality function deployment (QFD) method. Using the relation matrix, the design parameters and the field data are correlated. The degree of relationship is decided by engineers using a usual rating scale (as generally used in QFD) such as 1-3-5, 1-5-9, low-medium-high, and so on. Then, using a calculation procedure similar to the summation used in QFD, we calculate the importance rate of each design parameter. In Table 5, the importance rate of a design parameter is calculated as the sum of the degrees of relationship for that design parameter multiplied by the function importance rate.
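A minimal sketch of this calculation, reproducing the two rated cells of Table 5 (the data structures themselves are only illustrative assumptions):

```python
# Function importance rates and degrees of relationship (1-3-5 scale), as in Table 5
function_importance = {"F22": 10, "F34": 5}
relations = {
    ("F22", "Power module"):      {"Catenary current": 5},
    ("F34", "Cylinder diameter"): {"Brake cylinder pressure": 5},
}

def design_parameter_importance(relations, function_importance):
    """Importance rate of a design parameter = sum of its relationship degrees
    multiplied by the importance rate of the function it belongs to."""
    rates = {}
    for (function, parameter), degrees in relations.items():
        rates[parameter] = sum(degrees.values()) * function_importance[function]
    return rates

print(design_parameter_importance(relations, function_importance))
# {'Power module': 50, 'Cylinder diameter': 25}
```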
Table 5 Relation matrix between design parameters and field data
Function | Function importance rate | Component/part | Design parameter | Catenary current | Heating circuit current | Brake cylinder pressure | Importance rate of design parameter
F22 | 10 | CPU board | Power module | High (5) | - | - | 50
F34 | 5 | Brake caliper | Material | - | - | - | -
F34 | 5 | Brake caliper | Diameter | - | - | - | -
F34 | 5 | Brake cylinder | Cylinder diameter | - | - | High (5) | 25
Table 5 is built for the case study. According to Table 5, the power module of the CPU board is highly affected by the field data 'Catenary current'. Hence, the power module of the CPU board should be checked and modified. For the function 'F34', the cylinder diameter of the component 'Brake cylinder' is highly related to the field data 'Brake cylinder pressure'. Engineers should reconsider the cylinder diameter to improve the brake cylinder pressure.
4 CONCLUSION
In this study, we propose a new design improvement support method based on product usage data. Our method follows several steps to transform product usage data into information that helps a product design engineer with design improvement. The product usage data gathered by PEIDs is used for monitoring the product status and for calculating the working status of components/parts. Then, the evaluated function performance degradation at each time instance is used to assess the field data as normal or abnormal. The normal field data is clustered so as to be used as reference values assuring a good working status. By comparing the clusters of normal field data with the field data gathered at critical time instances, the critical field data which have an effect on the function performance degradation are found. Since the found field data are correlated with the function performance degradation, the critical field data should be considered in product design improvement. Hence, the relation between design parameters and the critical field data is captured by the relation matrix. The relation matrix suggests the importance rates of the design parameters, and design parameters with a high importance rate should be considered in design modification so as to reduce the function performance degradation due to critical field data. To show the feasibility of our method, we have applied our procedure to a case study based on a locomotive provided by a real company. This approach can be a reference model for applying product usage data to product design improvement. In spite of the usefulness of this procedure, our method still needs some improvement. First of all, we consider only one performance measure for the calculation of the functional working status. In case several combined performance measures are considered in the evaluation of the functional working status, a new method to deal with this complex relationship among functions should be developed. The frequently used relation matrix calculation requires input from engineers, so that it depends on the knowledge and experience of engineers. Even though we provided some guidelines, some steps of the proposed method are not supported by our methodology. Moreover, the procedure is not automated. A sophisticated method to use product usage data for automatically providing design improvement suggestions is needed.
5 REFERENCES
1
Bansal, D., D. J. Evans, et al. (2005) A Real-Time Predictive Maintenance System for Machine Systems - An Alternative to Expensive Motion Sensing Technology. Sensors for Industry Conference, 2005.
2
Bucchianico, A. et al. (2004) A multi-scale approach to functional signature analysis for product end-of-life management. Quality and Reliability Engineering International, 20(5), 457-467.
3
Chang-Ching, L. & T. Hsien-Yu (2005) A neural network application for reliability modelling and condition-based predictive maintenance. International Journal of Advanced Manufacturing Technology, 25(1-2), 174-179.
4
Chin, K.-S., A. Chan, et al. (2008) Development of a fuzzy FMEA based product design system. International Journal of Advanced Manufacturing Technology, 36(7-8), 633-649.
5
Delaney, K. D. & P. Phelan (2009) Design improvement using process capability data. Journal of Materials Processing Technology, 209(1), 619-624.
6
Djurdjanovic, D. et al. (2003) Watchdog Agent - an infotronics-based prognostics approach for product performance degradation assessment and prediction. Advanced Engineering Informatics, 17(3-4), 109-125.
7
Giese, H. and M. Tichy (2006) Component-based hazard analysis: Optimal designs, product lines, and online reconfiguration. Gdansk, Poland, Springer Verlag.
8
Jayaram, J. S. R. & Girish, T. (2005) Reliability prediction through degradation data modeling using a quasi-likelihood approach. Annual Reliability and Maintainability Symposium, 2005 Proceedings, pp.193--199.
9
Kiritsis, D. (2004) Ubiquitous product lifecycle management using product embedded information devices. Intelligent Maintenance System (IMS) 2004 International Conference, Arles.
10
Lee, J. et al. (2006) Intelligent prognostics tools and e-maintenance. Computers in Industry, 57(6), 476-489.
11
Osteras, T. et al. (2006) Product performance and specification in new product development. Journal of Engineering Design, 17(2), 177-192.
12
Stone, R., I. Tumer, et al. (2005) Linking product functionality to historic failures to improve failure analysis in design. Research in Engineering Design, 16(1), 96-108.
13
Tang, L. C. et al. (2004) Planning of step-stress accelerated degradation test. Annual Reliability and Maintainability Symposium, 2004 Proceedings, pp.287-292.
14
Xiaoling, B., X. Quanzhi, et al. (2007) Equipment fault forecasting based on a two-level hierarchical model. Piscataway, NJ, USA, IEEE.
15
Zampino, E. J. (2001) Application of fault-tree analysis to troubleshooting the NASA GRC icing research tunnel. Piscataway, NJ, USA, IEEE.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A COMPUTERIZED MODEL FOR ASSESSING THE RETURN ON INVESTMENT IN MAINTENANCE; FOLLOWING UP MAINTENANCE CONTRIBUTION IN COMPANY PROFIT Basim Al-Najjar a a
Terotechnology, School for Technology and Design, Växjö University, Sweden,
[email protected]
In order to reduce as much as possible the economic losses that are generated due to a lack of maintenance or to inefficient maintenance, it is necessary to map, analyse and judge maintenance performance and act on deviations before it is too late. It is always necessary for a company to act to increase profit and consequently enhance its competitiveness. In this paper, a software model (MainSave) has been developed for mapping, monitoring, analysing, following up and assessing the cost-effectiveness of maintenance (and maintenance investments). MainSave can be used for assessing savings and profit/losses due to maintenance performance, identifying problem areas and, primarily, planning new beneficial investments in maintenance. The model has been tested at Fiat/CRF in Italy. The major conclusion is that, by applying MainSave, it is possible to identify, assess and follow up the maintenance contribution to company business. Keywords: Maintenance Savings, Maintenance Profit, Maintenance Risk Capital Investment, Return on Investment in Maintenance
1 INTRODUCTION
Manufacturing industries realize the importance of monitoring and following up the performance of production and maintenance processes by simultaneously using economic and technical key performance indicators (KPIs). These indicators establish a bridge between the operational level in terms of, e.g. productivity, performance efficiency, quality rate, availability and production cost, and the strategic level expressed by company profit and competitiveness. Also, these key indicators are important to follow up the maintenance role in a sustainable manufacturing, [1]. In the past, the survival of manufacturing companies was mainly connected to how much a company was able to push into the market. This situation has changed and today's strategies imply cost minimization and differentiation and the ability to use available resources in a cost-effective way with reduced pollution to the surroundings. The focus on customer needs puts great demands on the production and maintenance systems to meet the goals of high product quality, production safety and delivery on time at a competitive price, [2,3]. Properly identified KPIs are required for following up the work done to achieve company strategic objectives and daily competition survival. Also, integration of the KPIs with the knowledge and database can provide a manager the required information, knowledge and ability to monitor and interpret the performance measures for making cost effective decisions, [4]. Furthermore, such KPIs can be utilised for benchmarking, which is one of the tools for never-ending improvements, [1,5]. 2
THEORETICAL BACKGROUND
Traditionally, and incorrectly, maintenance costs are divided into direct and indirect costs. Direct costs, i.e. the costs that can easily be related directly to maintenance, consist of direct maintenance labor, consumable maintenance material, outsourcing in maintenance and overheads covering expenses such as tools, instruments, training, administration and other maintenance-related expenses. Indirect costs, i.e. the costs that can be related indirectly to maintenance inefficiency, cannot all be related to maintenance as easily as the losses in production due to machine failures can. For example, the indirect cost/profit related to losing/gaining customers and market shares is not that easily related to maintenance inefficiency/efficiency, respectively. Also, it would not be easy (or is sometimes impossible) to find these costs in current accountancy systems without confusing them with other costs, [6]. In order to assess the economic importance of an investment in maintenance, it is often necessary to find the Life Cycle Income (LCI) of a machine/equipment, which is usually not an easy
task either. It is easier to assess the savings that have been achieved by more efficient maintenance, such as reduced downtime, number of rejected items, capital tied up in inventories and operating costs [4,6]. To be able to monitor, assess and improve the outcome of different maintenance actions, it is necessary to use a model for identifying and localizing/retrieving both technical and economic data from company databases. In order to make the process of data gathering and analysis even easier and more cost-effective, the model should be computerized, [7,8]. Using the MIMOSA database reduces the technical difficulties and disturbances that may be induced in the current IT-systems of a company, [9]. This allows maintenance KPIs to be followed up more frequently and easily, and thereby quicker reaction to disturbances and avoidance of unnecessary costs. It also becomes easier to identify and trace the causes behind deviations. The model should also help in interpreting the measurements of relevant basic variables and KPIs in order to achieve cost-effective decisions in planning and executing maintenance actions, and to identify where an investment in maintenance may have the best financial payoff, [1,6,10]. In order to evaluate the economic importance of maintenance activities, and consequently the Return on Investment in Maintenance (ROIIM), it is necessary to assess the savings achieved by a more efficient maintenance policy. This can be done by analyzing the life cycle cost (LCC) and the transactions between maintenance and other disciplines within the plant, such as production/operation, quality and inventory expenses, using system theory. Analysis and assessment of the transactions between maintenance and other working areas can be used to highlight the real maintenance role in the internal effectiveness of a producing company. Maintenance savings are usually achieved through reducing downtime, the number of rejected items, operating/production costs, the expenses of different fees/penalties, such as those due to failure-related accidents or failure-related environment violations, and the cost of tied-up capital, i.e. fewer unnecessary components and equipment in inventories, [2,4]. Assessment of the savings achieved by more efficient maintenance is less influenced by irrelevant factors than the assessment of LCI, where company profit is generally considered for the assessment, [4]. In the latter case, several external factors, such as the amount of product sold, currency rates, wars, crises and product price, are irrelevant to the maintenance role but have an appreciable effect on the assessment of the company's LCI. Discussing solely direct and indirect maintenance costs implies that maintenance is a cost centre. Therefore, during recessions, companies generally reduce the maintenance budget/costs regardless of the benefits that maintenance activities may generate, even though the investments in maintenance during these periods can be among the best investments in the company, see [6]. The economic benefits that could be gained by more efficient maintenance can be found as enhancements in the results of other working areas, such as production, quality and investments, through reducing losses of profit due to:
a) Losing production time (and production),
b) More tied-up capital and expenses,
c) Losses of customers,
d) Loss of reputation, and consequently
e) Loss of market share.
These losses are usually generated mainly due to a lack of (or inefficient) maintenance. In general, the majority of the indirect costs listed above are due to failures and short stoppages resulting from maintenance performance deficiencies, as discussed in [11]. In this paper, the maintenance-related economic factors considered when evaluating the economic role of maintenance are:
1. Maintenance direct cost,
2. Economic losses (which can be considered as Potential Savings or Maintenance Income when using more efficient maintenance),
3. Maintenance savings,
4. Risk capital investments in maintenance for enhancing its performance and achieving better accuracy in maintenance decisions, and
5. Maintenance results (maintenance profit/losses).
The part of the economic losses (potential savings) that is due to unavailability and to the expenses of delivery delays can be recovered by implementing a more efficient maintenance policy, [4,6,12]. This is why we label the economic losses as potential savings or maintenance income. The latter represents the resource for savings, and consequently for the maintenance profit, that can be generated by more efficient maintenance.
3 MODELING COST-EFFECTIVENESS WITH RESPECT TO MAINTENANCE
A maintenance policy is considered cost-effective if and only if the returns it generates are greater than the capital invested in maintenance. However, the benefits of improvements in maintenance are usually collected in other working areas but hardly
in maintenance itself, as long as its accountancy system shows only costs. For example, identifying and relating the benefits generated by more efficient vibration-based maintenance (VBM) is not an easy task if the mechanisms by which maintenance impacts are transferred, and the technical and economic KPIs, are not well identified [4]. In order to justify investments in maintenance, the cost-effectiveness (Ce) of each investment in improving maintenance performance can be examined using the proportion of the difference between the average cost per high-quality product before and after the improvement to that before, i.e. Ce = (Cb - Ca) / Cb, where Cb and Ca denote the average cost per high-quality product before and after the improvement. This means that all the savings (and possible increments) in the expenses of production, tied-up capital, insurance premiums, etc., including the maintenance cost resulting from a more efficient maintenance policy, should be assessed. At the beginning, the cost-effectiveness may be low or even negative due to the extra expenses incurred during the learning period. This period can be defined on the basis of the nature of each improvement; beyond the learning period, Ce should be greater than zero. Ce indicates the percentage reduction in the total production cost due to the maintenance impact and can thus be used as a measure of the cost-effectiveness of the improvements [4]. A model that shows the links between maintenance actions and their economic results has been developed at Växjö University [4]. In order to make the model industrially applicable, we tried to make it transparent by avoiding the black-box approach, in which the end user feeds the required data into the software and gets the results by pushing particular buttons without knowing anything about what happens inside the software. Changes and improvements in the production conditions and in the production and maintenance processes usually lead to appreciable changes in the performance of the production and/or maintenance processes. Therefore, the formulas that have been developed, see [4], are used for assessing the impact of maintenance on the company economics during two different periods to highlight changes in production and maintenance performance and results, i.e. savings or losses achieved due to better or worse usage of the available maintenance technologies [4]. These formulas can be applied independently of the maintenance technique being used. Denote the five most common sources generating savings/losses by Si, for i = 1, 2, …, 5. These sources are changes in: the number of failures, the number of short stoppages, stoppage time, bad-quality production due to inefficient maintenance, and additional expenses that can be defined by the user. The formulas derived for these sources underlie an inference engine that constitutes a new software tool named MainSave, as shown in Fig.5. The total savings or losses can then simply be expressed as
Total saving = Σi Si = S1 + S2 + S3 + S4 + S5, for i = 1, …, 5
where S1, S2, …, S5 are assessed using the following formulas:
I. Failures: the saving or loss in the production cost generated due to fewer or more failures (S1) can be expressed by
S1 = number of failures avoided * average stoppage time * production rate * profit margin (PM)
S1 = [(Y - y) * L1] * Pr * PM
where Y and y are the numbers of failures during the previous and current period, respectively, L1 is the average stoppage time per failure and Pr is the production rate.
II. Average stoppage time: the saving or loss generated due to shorter or longer stoppages (S2), i.e. longer/shorter production time, is expressed as
S2 = difference in average stoppage time per failure * number of failures * production rate * profit margin
S2 = [(L1 - l1) * y] * Pr * PM
where L1 and l1 are the average stoppage times per failure during the previous and current period, respectively.
III. Short stoppages: the saving or loss in the production cost generated by fewer short stoppages (S3) can be expressed by
S3 = [short stoppages in previous period (B) - short stoppages in current period (b)] * average stoppage time (L2) * production rate * profit margin
S3 = [(B - b) * L2] * Pr * PM
IV. Quality production: the saving or loss generated due to higher production quality (S4) is expressed by
S4 = [current period high-quality production per hour - previous period high-quality production per hour] * number of production hours per day (Ph) * number of production days per period (Pd) * profit margin
S4 = (p - P) * Ph * Pd * PM
where P and p are the amounts (in tons, meters, etc.) of high-quality product produced per hour in the previous and current period, respectively.
V. User-defined expenses paid by the company to cover, for instance, personnel compensation due to accidents, environmental damage penalties, insurance premiums, direct maintenance costs (including labour, spare parts and overheads), tied-up capital in spare parts and equipment, and penalty expenses of delivery delay. Denote the expenses before and after the improvement, i.e. in the previous and current period, by Eb and Ea, respectively. Then, the sum of the reduction or increment in these expenses can be expressed by
S5 = Σj (Eb - Ea)j
where j = 1, 2, …, n denotes the n types of expenses that can be defined by the user.
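As a concrete illustration of how the formulas above combine, the following minimal sketch (Python, with purely hypothetical period data; the function and variable names mirror the symbols above but are ours, not part of MainSave) evaluates S1–S5 and the total saving:

```python
# Minimal sketch of the savings formulas S1-S5 (illustrative values only).

def total_saving(Y, y, L1, l1, B, b, L2, P, p, Ph, Pd, Pr, PM, expenses):
    """Return ((S1..S5), total) for two periods, following the formulas above.

    Y, y     : failures in the previous / current period
    L1, l1   : average stoppage time per failure (h), previous / current
    B, b     : short stoppages, previous / current
    L2       : average stoppage time per short stoppage (h)
    P, p     : high-quality output per hour, previous / current
    Ph, Pd   : production hours per day, production days per period
    Pr, PM   : production rate (units/h), profit margin per unit
    expenses : list of (Eb, Ea) pairs of user-defined expenses
    """
    S1 = (Y - y) * L1 * Pr * PM               # fewer (or more) failures
    S2 = (L1 - l1) * y * Pr * PM              # shorter (or longer) stoppages
    S3 = (B - b) * L2 * Pr * PM               # fewer short stoppages
    S4 = (p - P) * Ph * Pd * PM               # better quality production
    S5 = sum(Eb - Ea for Eb, Ea in expenses)  # user-defined expenses
    return (S1, S2, S3, S4, S5), S1 + S2 + S3 + S4 + S5

# Hypothetical example: fewer failures and shorter stoppages in the current period.
parts, total = total_saving(Y=12, y=9, L1=2.5, l1=2.0, B=40, b=30, L2=0.25,
                            P=95, p=98, Ph=16, Pd=120, Pr=100, PM=10,
                            expenses=[(5000, 4200), (1200, 1500)])
print(parts, total)
```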
4
SOFTWARE PROTOTYPE FOR INDUSTRIAL APPLICATION
One of the major reasons behind the lack of techniques for controlling and assessing the economic impact of maintenance on company profitability and competitiveness is the lack of a clear and robust theory and of the methods and tools required for performing that task easily and properly [4]. Another reason is the difficulty of finding and processing the data required for mapping, monitoring, controlling, following up, analyzing and assessing the economic impact of maintenance. This is why a software tool may make it possible to perform this task easily and cost-effectively. The software module aims to sum all the economic losses (potential savings, which represent future maintenance income) that are generated due to a lack of (or inefficient) maintenance. It also assesses the savings and consequently the profit generated by applying more efficient maintenance. Furthermore, the KPIs, such as total savings, potential savings and profit, and ratios, such as maintenance savings to potential savings, savings to investments, investments to potential savings, investment per period, etc., are automatically assessed by the model using the above-mentioned equations. The investment is assessed over the period of time that has passed until the analysis is done, instead of over the whole depreciation period. The same can be said about the Overall Equipment Effectiveness (OEE), which is assessed using the traditional equation, i.e.
OEE = Availability × Performance efficiency × Quality rate
The above-mentioned ratios and measures are considered in this study as part of the important KPIs required for mapping, monitoring, analyzing and controlling maintenance performance and its economic impact. The main objective of using MainSave is to enable the user to assess and control, easily and on demand, the economic impact of maintenance as well as the potential for further improvements. In other words, it can be utilized to assess the current situation, identify problem areas, assess technical and economic losses, and motivate investments in maintenance. The latter is important for providing the objective evidence demanded to convince the company's executives of the necessity of these investments for enhancing the productivity and effectiveness of a production process. None of these results can be achieved without high-quality data with relevant coverage [4,6]. Also, the data required for applying MainSave should be easily retrieved by the system. An investment in maintenance often has a relatively short payoff time compared to other investments if it has been made using the right information and knowledge [13]. Usually, it is hard to show the economic advantages of such investments because the savings are spread out over many working areas in a company and cannot easily be found in current accountancy systems.
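The ratios listed above follow mechanically once the savings, potential savings and investment figures are available. The sketch below (the function name and the proration of the investment over the elapsed months are our assumptions, not MainSave's actual implementation) illustrates them together with the traditional OEE equation:

```python
# Sketch of the KPIs and ratios mentioned above (not the actual MainSave code).

def kpis(savings, potential_savings, investment, depreciation_years, months_elapsed,
         availability, performance_efficiency, quality_rate):
    # Assumption: the investment is prorated over the months elapsed so far,
    # rather than over the whole depreciation period.
    investment_per_period = investment * (months_elapsed / (depreciation_years * 12.0))
    return {
        "savings_to_potential": savings / potential_savings,
        "savings_to_investment": savings / investment_per_period,
        "investment_to_potential": investment_per_period / potential_savings,
        "investment_per_period": investment_per_period,
        # Traditional OEE equation used in the paper:
        "OEE": availability * performance_efficiency * quality_rate,
    }

print(kpis(savings=12000, potential_savings=40000, investment=30000,
           depreciation_years=4, months_elapsed=6,
           availability=0.90, performance_efficiency=0.85, quality_rate=0.98))
```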
5
DATA DEFINITION AND GATHERING
The data required for applying and running MainSave can be divided into two major categories: database datasets and non-database datasets. In Fig.5, it is easy to distinguish the non-database datasets from the database datasets: the former have white boxes while the latter have grey boxes. The data described by the non-database datasets are defined below:
o Profit margin per high-quality item, ton, meter or cubic meter, etc.
o Total investment in maintenance for improving its performance
o Depreciation period, i.e. the period that was decided as the investment life length
The data described by the database datasets are defined below and in Figs.1-4:
o The data gathered concern one production machine and product.
o The data cover two production periods, i.e. before and after an improvement in the maintenance or production process, or any two well-distinguished periods.
o Numbers of unplanned stoppages, such as failures and short stoppages, and their causes.
o Average time of the stoppages
o Production rate, production time, and theoretical and actual cycle time.
o Quality rate, i.e. the share of high-quality product out of the total production during the periods of analysis.
The rest of the information shown in Fig.5, such as savings, investment per period, ratios and OEE, is assessed by MainSave.
Fig.1. Production theoretical cycle time.
Fig.2. Planned production time.
Fig.3. Production follow up.
6
MAINSAVE TEST
The tests were done using industrial data gathered from FIAT/CRF in Italy. Technical and economic data from the production and maintenance processes and from the economy department were collected and fed into the MIMOSA database located at IBK, Tallinn. A CNC machine at FIAT/CRF was considered for the MDSS test. It produces engine heads; the operation performed by the machine is milling. This machine is considered to be a bottleneck in the production line, which makes it critical for the whole production process. The data were collected over two periods (previous/first and current/second period) of 6 months each (8 Jan. 2007 – 7 June 2007 and 7 June 2007 – 8 Jan. 2008). The selected periods are considered long enough to include several events, such as production speed changes, short stoppages, failures, disturbances or any other stoppages generated by the organization, people or the environment. In general, the periods can be selected as:
a) two periods planned for producing two orders of the same product on the same machine, or
b) two periods selected from the machine register representing the time before and after a particular improvement was made to the maintenance policy.
The non-database data, i.e. the profit margin, the investment and the depreciation period, are given in the white boxes in Fig.5: 10 units, 30000 units and 4 years, respectively. The savings (or additional losses) that maintenance has generated due to its original performance, or due to a particular performance improvement, are assessed using different categories of losses. The losses are classified into two groups of categories:
o defined categories of losses, and
o user-defined categories of losses, see Fig.5.
In general, companies do not necessarily have the same types of losses. Therefore, in order to accommodate MainSave to each particular application, machine and company, any of the defined losses can be included in or excluded from MainSave. In MainSave, we tried to use the most common categories of losses and left a wide window for user-defined expenses. MainSave converts all technical data, such as numbers of failures and downtimes, to a measure on the economic scale, i.e. money. In this test, all the defined categories of losses expressed by MainSave are used and no user-defined expenses were found. In Fig.5, it is clear that there is an obvious increase in the losses of production time in the current (second) period compared with the previous (first) period. The major part of this increase was due to more failures and short stoppages. According to the company, this increase happened because spare parts from a supplier other than the usual one were used. The major conclusion that can be drawn from this test is that, using MainSave, it is possible to map, identify, analyze and assess the economic losses and savings in the production process and identify the causes behind them, which eases the localization of the next investments required for improving maintenance performance to reduce losses. The number of failures increased and the stoppage times were also prolonged. This resulted in more economic losses (15372.9 units) despite the investment (30000 units) that was made. These losses are distributed among the major areas as follows: more failures (-4142.7 units), longer stoppage time (-10723.5 units) and more bad-quality expenses (-506.6 units). The biggest part of the losses is due to the longer stoppage time, which represents about 70% of the total losses. Assessing the losses belonging to the different categories helps to estimate and judge the size of the risk capital that should be invested to solve the problem. Notice that a saving with a (-) sign means a loss. Also, the Overall Equipment Effectiveness (OEE) increased, due to unknown reasons.
Fig.4a. Production events.
Fig.4b. Production events.
Fig.5. Test results of MainSave.
7
RESULTS, DISCUSSIONS AND CONCLUSIONS
When the profit margin of a plant decreases, the need for a reliable and efficient maintenance policy becomes more important, because it becomes more important to reduce the economic losses, i.e. to press down the production cost per high-quality item, ton or meter, and consequently increase the profit. The major result of this study is the development of a new model and software prototype (MainSave) for enhancing maintenance cost-effectiveness, performance and economic impact on company business. The test of the model has clearly shown the potential and the benefits of its application. Using MainSave, it is possible to monitor, analyze and assess maintenance activities and to act at an early stage at both tactical and strategic levels to fulfil the company's strategic goals of continuously improving its profitability and competitiveness, which is hard to achieve using available tools and techniques even if the required data are available. The development of relevant and traceable technical and economic KPIs has made maintenance performance control more feasible. It also makes it possible to handle real-time data gathering, analysis and decision making. Further necessary information about maintenance and other working areas is also provided to the decision maker. MainSave is developed using a new and flexible model that can be accommodated to each production process without great difficulty, enhancing its accuracy. It provides better data coverage and quality, which are essential for improving knowledge and experience in maintenance and thereby help to increase the competitiveness and profitability of a company.
8
REFERENCES
1
Al-Najjar, B., Hansson, M-O and Sunnegårdh, P. (2004) Benchmarking of Maintenance Performance: A Case Study in two manufacturers of furniture. IMA Journal of Management Mathematics, 15, 253-270.
2
Al-Najjar, B. (1997) Condition-based maintenance: Selection and improvement of a cost-effective vibration-based policy in rolling element bearings. Doctoral thesis, ISSN 0280-722X, ISRN LUTMDN/TMIO—1006—SE, ISBN 91-628-2545X, Lund University, Inst. of Industrial Engineering, Sweden.
144
3
Al-Najjar, B. (1998) Improved Effectiveness of Vibration Monitoring of Rolling Element Bearings in Paper Mills. Journal of Engineering Tribology, IMechE 1998, Proc Instn Mech Engrs, 212 part J, 111-120.
4
Al-Najjar, Basim (2007) The Lack of Maintenance and not Maintenance which Costs: A Model to Describe and Quantify the Impact of Vibration-based Maintenance on Company's Business. International Journal of Production Economics IJPPM 55(8).
5
Pintelon, Liliane, (1997) Maintenance performance reporting systems: some experiences. Journal of Quality in Maintenance Engineering. 3(1), 4-15.
6
Al-Najjar, B., Alsyouf, I., Salgado, E., Khosaba, S., Faaborg, K., (2001) Economic Importance of Maintenance Planning when using vibration-based maintenance policy, project report, Växjö University.
7
Al-Najjar, B. and Kans, M. (2006) A Model to Identify Relevant Data for Accurate Problem Tracing and Localisation, and Cost-effective Decisions: A Case Study. International Journal of Productivity and Performance Management (IJPPM), 55(8).
8
Kans, Mirka (2008) On the utilisation of information technology for the management of profitable maintenance. PhD thesis, 2008, Department of Terotechnology, Växjö University, Sweden
9
MIMOSA (2006) "Common Relational Information Schema (CRIS) Version 3.1 Specification", http://www.mimosa.org/.
10
Al-Najjar, B., Kans, M., Ingwald, A. Samadi, R. (2003) Förstudierapport - Implementering av prototyp För Ekonomisk och Teknisk UnderhållsStyrning, BETUS. Växjö University.
11
Al-Najjar, B. (2000) Accuracy, effectiveness and improvement of Vibration-based Maintenance in Paper Mills; Case Studies. Journal of Sound and Vibration, 229(2), 389-410.
12
Al-Najjar, Basim, (1999), Economic criteria to select a cost-effective maintenance policy, Journal of Quality in Maintenance Engineering, 5(3).
13
Al-Najjar, B. and Alsyouf, I. (2004) Enhancing a Company's Profitability and Competitiveness using Integrated Vibration-based Maintenance: A Case Study. European Journal of Operational Research, 157, 643-657.
Acknowledgements This paper is part of the work done within DYNAMITE and the author would like to thank EU for supporting EU-IP DYNAMITE.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CASE STUDY: WARRANTY COSTS ESTIMATION ACCORDING TO A DEFINED LIFETIME DISTRIBUTION OF DELIVERABLES
Vicente González Díaz a, Juan Francisco Gómez Fernández a and Adolfo Crespo Márquez b
a PhD student of the Industrial Management PhD Program at the School of Engineering, University of Seville, Spain.
b Associate Professor of Industrial Management, School of Engineering, University of Seville, Spain.
This paper describes a real case of warranty assistance, analyzing its management in the framework of a manufacturing company which provides deliverables during a specific period of time and following a scheduled distribution. With the sale of a product, the manufacturer is nowadays contractually obliged to provide warranty assistance to the buyer. Decreasing the incurred costs is clearly not the only objective, since the decision has to be global and strategic inside the company in order to purchase a reliable and robust product, while also offering an appropriate after-sales service to the user. Therefore, key aspects are presented along this study in order to estimate costs and, consequently, to take proper decisions for leading the company correctly towards a successful goal. For that purpose, not only managers and responsible staff in a well-established and controlled organization must take part; it is also important to consider the experience contributed by the technical staff for maintenance and warranty. As a result, this paper shows basically how, by analyzing past performance, it is possible to foresee and control the future. In our case, it is possible to observe how the evolution of costs during the lifetime of a warranty assistance program can help to correct and foresee with more accuracy the expected total cost of the activity estimated at the beginning of the program. The paper is based on a usual procedure in special supplies for the public sector (for instance, a fleet of customized vehicles), between companies inside the supply chain or directly to the final user, where this final user is, for example, a public entity and the budget for the complete warranty assistance is already known from the beginning of the project. Key Words: Warranty management, after sales, cost estimation, e-warranty, spare parts procurement
1
INTRODUCTION
Case studies have normally been used to support and illustrate theoretical subjects in engineering and other research fields. In developing these cases, such an amount of information is usually found that it can either trivialize the study or complicate it beyond a reasonable level. Therefore, the intention here is to synthesize a practical case which conveys easily how proper management of warranty assistance helps to reduce costs, enables suitable decisions to be taken, and improves the image of the company in front of the client. The case exposed here starts by mentioning the antecedents related to warranty cost models. This brief state of the art will show how important a warranty cost management system is. Later on, the scenario is described where the contributions given in the mentioned state of the art will be applied. Once the problem is defined, and along the development of this particular case, a procedure is also proposed related to the way of working among different sections inside a generic company. This procedure will be exposed succinctly using a workflow chart following the BPMN (Business Process Modelling Notation) standard. Finally, conclusions to this case study are expressed at the end of the paper.
2
ANTECEDENTS
Although many manufacturing companies spend great amounts of money just on their service warranties, in most cases this area does not receive much attention. In spite of this, it is possible to find studies in the literature related to warranty cost modelling with very interesting contributions [1]. The authors of these contributions usually try to
identify the processes, actions, stages, tools, methods or support techniques necessary to manage warranty costs properly. Regarding processes, in order to apply effective warranty management, it is critical to collect the proper data and to exchange adequately the different types of information between the modules into which the management system can be divided [2]. In our case study, a warranty management system based on an organization in several modules will be proposed. In the literature review, one can also observe different interactions between warranty and other disciplines, and how they are dealt with by the different models and authors. In particular, and summarizing, three important interactions must be considered:
1. Warranty and Maintenance: In many cases, the warranty period is the time when the manufacturer still has strong control over its product and its behaviour. Additionally, the expected warranty costs normally depend not only on the warranty requirements, but also on the associated maintenance schedule of the product [3].
2. Warranty and Outsourcing: The warranty service or, in general, the after-sales department of a company is usually one of those most likely to be outsourced, due to its low risk and also to the fact that, among other features, outsourcing provides legal insurance for such assistance services [4].
3. Warranty and Quality: Improving the reliability and quality of the product not only has an advantageous and favourable impact in front of the client; this improvement also greatly reduces the expected warranty cost [5].
Figure 1. Warranty management system in four modules (adapted from [2])
Figure 2. The decision of outsourcing (from [1])
In reference to cost estimation, and apart from warranty issues, there are nowadays several methods to estimate accurately the final cost of a specific acquisition contract. In our case study, the method applied, in a simplified way, is denominated "Estimate at Completion" (EAC). In a few words, EAC is a management technique used in a project for the control of cost progress. The manager foresees the total cost of the project at completion, combining measurements related to the scope of supply, the delivery schedule and the costs, using for that purpose a single integrated system:
EAC = Cost to Date + Estimated Cost of Remaining Work
Figure 3. EAC formula (adapted from [6])
Finally, taking into account the above-mentioned antecedents, one can see that by re-engineering the management processes and by applying a correct warranty cost model, it is possible to:
Increase sales of extended warranties and additional related products.
Increase quality by improving the information flow about product defects and their sources.
Improve customer relationships.
Reduce expenses related to warranty claims and processing.
Better management and control over the warranty costs.
Reduce invalid-claim-related expenses and other warranty costs.
Therefore, a well-established warranty management system will basically help the company to perform its warranty services successfully.
3
SCENARIO OF STUDY
The case company is a large manufacturer in the metal industry that operates worldwide. The company designs, manufactures and purchases a wide range of industrial vehicles (such as forest machines, hydraulic excavators or track loaders) for industrial customers, as well as other related products like spare parts. In addition to the purchase of standard vehicles, the customization of machines is nowadays also frequently requested. In our case, the company must supply to a client a specific amount of customized vehicles following a defined schedule. The contract includes warranty assistance for the vehicles of the fleet during a certain period, starting when each vehicle is delivered to the customer. To provide the after-sales service in a satisfactory way, the company is required to fulfil some conditions:
1. Teams formed by personnel with appropriate training.
2. Tools for maintenance / warranty tasks.
3. Materials and spare parts to carry out the repairs.
The first two conditions are considered fulfilled. Regarding the third condition, the necessary materials for warranty operations are obtained from the same warehouse as the assembly line. In this way, there are two possibilities for giving back the material:
When the piece is repairable, a spare part is taken from the warehouse and the stock is later replenished after the repair of the disassembled piece.
When the piece is not repairable, a spare is also taken from the warehouse, but the material must be restored by purchasing.
This situation is possible because the stock for manufacturing allows the loan of material for warranty without risk to the needs of the assembly line. The problem in this scenario is defined as follows: because manufacturing and warranty assistance share the same warehouse, there will be a moment when manufacturing is very advanced and, simultaneously, there are many vehicles under warranty. From this moment onwards, every decision must be taken prioritizing one of the two activities. Apart from the above-described context, the study takes place during the lifetime distribution of deliverables. That means that historical data regarding costs, failed items, etc. are available for the research. In reference to the failed items, a classification tree with several levels has been used, following a hierarchical structure based mainly on their functionality and reaching a sufficient level of detail in terms of procurement aspects.
Figure 4. Classification tree of components (Level 0: customized vehicle; Level 1: electrical, hydraulic, mechanical and auxiliary systems; Level n: procurable components such as disjunctor, cable, valve, pump, gear, brake, intercom and navigator)
In figures, the described scenario and the delivery schedule are as shown in table 1. Our case study will be developed considering also the following hypotheses:
Every vehicle has the same reliability (they have the same failure probability).
The warranty cost is constant with the time.
The warranty period does not stop at any moment.
Table 1 Data of the described scenario
Total amount of customized vehicles to be delivered: 350 units.
Warranty period for each vehicle: 2 years.
Warranty expiration for last vehicle: March 2015.
Time point of the case study (t1): April 2009 (150 units already delivered).
Date          Accumulated amount of vehicles
March 2006    Roll-out
April 2007    45 units
April 2008    100 units
April 2009    150 units
April 2010    200 units
April 2011    260 units
April 2012    315 units
April 2013    350 units
Regarding the EAC for warranty, it depends on the company policy. Usually, the budget for warranty is determined as a percentage of the project total cost. In our case study, the manufacturing plus indirect costs for each vehicle are assumed to amount to ca. 375.000,00 €, and the percentage for warranty attendance will be 2 % of the total cost budget. That yields around 2.625.000,00 € for the attendance of warranties during the whole project.
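As a quick arithmetic check (ours, not from the paper), this figure follows directly from the fleet size and the per-vehicle cost:
EACw = 350 vehicles × 375.000,00 €/vehicle × 0,02 = 2.625.000,00 €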
4
ANALYSIS, DEVELOPMENT AND RESULTS OF THE CASE STUDY
4.1 Costs analysis of the warranty assistances
As mentioned, the study takes place at a moment when the company has already delivered 150 vehicles. At this time, there are 105 vehicles under warranty. Some preliminary data are shown in table 2. Together with this, there is also a sample of the amount of vehicles under warranty according to the defined delivery schedule. Some figures here have been rounded off in order to simplify their use during the study. Table 2 (as commented) is only a sample extracted from the complete delivery schedule. From this complete schedule, it is possible to notice that the warranty expiration of the first vehicles obviously takes place in March 2008 and, also, that the most critical moment (t2) will happen in September 2011. The graphic in figure 5 helps to illustrate it. In September 2011 (t2), the already delivered fleet (285 units) will reach a maximum in the amount of vehicles simultaneously under warranty (128 units; see the yellow graphic line). At that moment, we can observe how close the end of the deliveries is (April 2013). Consequently, the manufacturing of the last vehicles is much closer (and more critical).
Figure 5. Warranty evolution graphic, in terms of delivered vehicles (curves: monthly delivery, delivered vehicles accumulated, vehicles in warranty)
Table 2 Extract of the delivery schedule
Date          Ac. amount of vehicles   Monthly delivery   Vehicles in warranty
March 2006    Roll-Out                 5 units            5 units
April 2007    45 units                 2 units            45 units
April 2008    100 units                3 units            91 units
April 2009    150 units                2 units            105 units
April 2010    200 units                7 units            100 units
April 2011    260 units                2 units            110 units
April 2012    315 units                2 units            115 units
April 2013    350 units                0 units            90 units
April 2014    350 units                0 units            35 units
Preliminary data at t1:
No. of delivered vehicles in t1: V1 = 150 units
No. of reclamations in t1: R1 = 1.200 reclamations
Warranty incurred cost in t1: C1 = 1.000.000,00 €
EAC for warranty: EACw = 2.625.000,00 €
No. of vehicles to be delivered: V(t) = [according to delivery schedule]
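The "vehicles in warranty" column of the schedule can be reproduced by counting, for each month, the vehicles delivered within the preceding 24 months (the 2-year warranty). The sketch below assumes a hypothetical flat delivery profile, so its peak differs from the paper's figure of 128 vehicles in September 2011; it is meant only to show the counting logic:

```python
# Sketch: vehicles simultaneously under warranty, given monthly deliveries and a
# 2-year (24-month) warranty per vehicle. The delivery profile here is hypothetical.

def in_warranty(monthly_deliveries, warranty_months=24):
    """monthly_deliveries[i] = vehicles delivered in month i (i = 0 is roll-out)."""
    counts = []
    for m in range(len(monthly_deliveries)):
        # A vehicle delivered in month d is under warranty during months d .. d+23.
        window = monthly_deliveries[max(0, m - warranty_months + 1): m + 1]
        counts.append(sum(window))
    return counts

# Hypothetical flat profile of ~4 vehicles/month over 85 months (~340 units total).
deliveries = [4] * 85
curve = in_warranty(deliveries)
peak = max(curve)
print(peak, curve.index(peak))   # peak count and the month it is first reached (t2)
```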
At t2, our teams of maintenance / warranty technicians will have to attend to a high number of vehicles, which will demand a huge amount of spare parts. At the same time, the operators of the assembly line will be requesting pieces for the production of the last vehicles. The shared warehouse will then have in store enough pieces for manufacturing but no more, so the loan of any spare part demanded by the after-sales personnel must be decided taking into consideration the importance of the material, the time to repair the disassembled piece, and / or the time to restore it by purchasing. Every piece in the classification tree (see figure 4) belonging to the lowest level (the level at which materials can be procured) will have a weight (or criticality) which changes with time. Every piece will be considered more critical the closer the end of manufacturing is. Therefore, and also taking a cost analysis into account, it will be necessary to consider the investment in a minimum strategic stock in order not to leave warranty claims unattended.
Considering the above data, it is possible to carry out a simple cost analysis obtaining some average values:
Warranty cost per vehicle: CV = C1 / V1 = 1.000.000,00 / 150 = 6.666,67 €
Warranty cost per reclamation: CR = C1 / R1 = 1.000.000,00 / 1.200 = 833,33 €
Reclamations per vehicle: RV = R1 / V1 = 1.200 / 150 = 8 reclamations
With these values, and in order to make it more illustrative, a graphic (figure 6) with the warranty evolution in terms of costs is included. One can see that the lines in this graphic follow the same behaviour or track as the ones in figure 5.
Figure 6. Warranty evolution graphic, in terms of warranty costs (curves: monthly increase of warranty cost, warranty cost accumulated)
That is because the total incurred warranty cost of every vehicle has been considered in a conservative way. That means it has been treated as already incurred as soon as each vehicle is delivered to the customer. Therefore, the accumulated warranty cost does not increase after the delivery of the last vehicle. In further studies, it will be possible to add also the consideration of several destinations for the vehicles, where local maxima can happen at different moments of the defined lifetime and costs must include the movement of warranty teams to different locations. Comparing the above results with the foreseen costs indicated in the EAC, a graphic is obtained as exposed in figure 7. The EAC is formed by a first part already known (pink line), which refers to the Incurred to Date (ITD), plus a second foreseen part (blue straight line), which refers to the Estimate to Completion (ETC). Apart from the EAC line, the warranty cost line obtained from the cost average at t1 is also plotted (green line). As a result of this graphic comparison, one sees that the cost at the end (ca. 2.335.000,00 €) is slightly lower than the budget considered at the beginning of the project (2.625.000,00 €).
Figure 7. EAC vs. average cost line (curves: Incurred to Date (ITD), Estimate to Completion (ETC), warranty according to cost average)
That means there is a budgetary buffer of ca. 290.000,00 €, which can be used for the investment in a strategic stock of spare parts. This amount would correspond to the budget for attending the warranty of around 43 vehicles, or, equivalently, as if the warranty assistance could take advantage of ca. 15 months before the end of manufacturing. Other interesting average values that can be obtained from this exercise are, for example, the estimated total amount of warranty claims, which shall be around 2.800 reclamations. Anyhow, and as the main conclusion of this analysis, the procurement of these strategic spare parts should avoid the use of the stock shared with the assembly line, thus offering an appropriate service to the client. That is due to the possibility of assisting warranties independently of the manufacturing department and, consequently, not affecting the final goal of the project.
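The average values and the budgetary buffer discussed above can be reproduced directly from the figures given in the text; the following sketch (our code, not the company's tool) performs the arithmetic:

```python
# Sketch reproducing the average values and the budgetary buffer from the text.

V1, R1, C1 = 150, 1200, 1_000_000.00      # delivered vehicles, claims, cost at t1
EACw       = 2_625_000.00                 # warranty budget (EAC) for the project
FLEET      = 350                          # total vehicles to be delivered

CV = C1 / V1          # warranty cost per vehicle      -> 6,666.67 EUR
CR = C1 / R1          # warranty cost per reclamation  ->   833.33 EUR
RV = R1 / V1          # reclamations per vehicle       ->        8

estimated_final_cost = CV * FLEET         # ~2,333,333 EUR (ca. 2,335,000 in the text)
buffer = EACw - estimated_final_cost      # ~291,667 EUR   (ca. 290,000 in the text)
equivalent_vehicles = buffer / CV         # ~43.8 vehicles (ca. 43 in the text)
total_claims = RV * FLEET                 # 2,800 reclamations

print(CV, CR, RV, estimated_final_cost, buffer, equivalent_vehicles, total_claims)
```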
4.2 Quantitative analysis of the claims
Data for a huge variety of items have been compiled from the customer's complaints. These items are classified according to their functionality and divided also into components that can be procured (see figure 4: classification tree of components). Figure 8 exposes a sample of the gathered data as an example for this case study. This kind of analysis usually helps not only the Quality department, but also Manufacturing, in order to pay much more attention to those components that have many incidents during the warranty period. By improving the manufacturing process or taking care during the component assembly, it is possible to reduce the complaints regarding a specific item. Due to the huge amount of components in such complex systems as an industrial customized vehicle, a selection of items is suggested in order to make all the gathered information easily manageable. The criteria to select a group of items can be not only in terms of failure quantity; the cost of such components, the delivery time to procure them, etc. are also important.
Figure 8. No. of complaints per component
In general, it is important to know how critical each component is for the company and for the fulfilment of the production line. All these features will be conditions to keep in mind when the time comes to take a decision. In other words, these features will be turned into factors which will give a specific weight to each component. This weight will finally help the manager to take the proper decision. Taking this into account, and referring again to the former figures, the data included in the graphic can be transformed in terms of relative frequency. This relative frequency refers to the number (ni) of times that an event (i) takes place (in our case, failures), divided by the total number of events (Σni). Considering therefore statistical concepts (together with other factors), it is possible further on to weight, as mentioned, the value of each component in order to prioritize between the loan to warranty assistance and keeping the piece available for manufacturing.
Table 3 Relative frequencies
Component        Claims Nº   fi = ni / Σni
Pump             95          0,1827
Engine           70          0,1346
Battery          62          0,1192
Brake            54          0,1038
Valve            46          0,0885
Alarm            38          0,0731
Gear             30          0,0577
Disjunctor       26          0,0500
Regulator        22          0,0423
Lights           18          0,0346
Intercom         14          0,0269
Antenna          12          0,0231
Seats            10          0,0192
Heater           8           0,0154
Navigator        6           0,0115
Horn             4           0,0077
Steering wheel   3           0,0058
Cable            2           0,0038
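A minimal sketch of how the relative frequencies in Table 3 are obtained, and how they might be combined with other factors (cost, lead time) into a component weight; the weighting scheme and the factor values are assumptions for illustration, not taken from the paper:

```python
# Sketch: relative claim frequencies (Table 3) and a simple criticality ranking.

claims = {
    "Pump": 95, "Engine": 70, "Battery": 62, "Brake": 54, "Valve": 46,
    "Alarm": 38, "Gear": 30, "Disjunctor": 26, "Regulator": 22, "Lights": 18,
    "Intercom": 14, "Antenna": 12, "Seats": 10, "Heater": 8, "Navigator": 6,
    "Horn": 4, "Steering wheel": 3, "Cable": 2,
}

total = sum(claims.values())                      # 520 claims in this sample
freq = {c: n / total for c, n in claims.items()}  # fi = ni / sum(ni)

# Illustrative weighting: combine claim frequency with hypothetical unit-cost and
# lead-time factors (both normalised to [0, 1]) to rank components for the
# strategic spare-parts list. The weights below are assumptions, not from the paper.
def weight(component, cost_factor, leadtime_factor, w=(0.5, 0.3, 0.2)):
    return (w[0] * freq[component] / max(freq.values())
            + w[1] * cost_factor + w[2] * leadtime_factor)

print(round(freq["Pump"], 4))                                      # 0.1827, as in Table 3
print(round(weight("Pump", cost_factor=0.8, leadtime_factor=0.6), 3))
```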
The rest of the components are basically not considered because:
They have been affected by a very small number of failures.
They have been delivered fast enough and mostly on time.
There is extra stock in the warehouse due to the purchasing of minimum quantities higher than the real necessity.
Or they are simply not of interest from the project managers' point of view, due to other reasons.
Summarizing, with the tasks explained before for obtaining a set of chosen components (those acknowledged as critical), what we are really composing is a list of strategic spare parts. This means that, in case the company approves the use of the budgetary buffer for supporting the warranty service, the purchasing process can be launched quickly. All these actions will finally lead the company to positive returns:
by reducing the probability of paying penalties due to a global delay in the project delivery, and
by improving the confidence of the client due to the fulfilment of contractual terms such as the warranty assistance.
It is necessary to remark that every failure referred to here was an incidence considered under warranty. For further research in this field, the inclusion of those incidences not considered under warranty is also proposed. The analysis of such events must take into account the reasons why these situations happen (bad training of the user? poor information for maintenance? clients accustomed to another family of products with different behaviour?). Anyway, in each case, and even when the failure is not attributed to the manufacturer, the company must be interested in the possible causes.
4.3 Spare parts management for warranty assistances
The change in the utilization of pieces from the assembly line to warranty assistance has a negative effect on the cost of the whole project. The extra costs associated with the spare parts are of course due to the price difference between the acquisition of a piece at the beginning of the project for the whole fleet and the acquisition of a piece punctually during the lifetime of the project for a specific incidence. Therefore, the accounts management must apply a compensation between the different values in order not to leave this increment in the total costs of manufacturing, but to incur it in the total warranty costs. Taking this into consideration, it is possible to calculate properly the costs of the loans. Consequently, the percentage increment in the final acquisition price can also be a factor to keep in mind when estimating the weight of each component.
Figure 9. Typical feedback of analysis from collected reliability and maintenance data (from [7])
In order to ensure correct warranty attention, the proposed action is basically to acquire a lot of reserves that allows repairs without delays in the vehicle manufacturing and, simultaneously during this process, to supply spares to the warranty service from the assembly line in a reasonable way. According to the mentioned considerations, and together with the collected data, the experience of the warranty / maintenance technicians, the knowledge of the engineering department and, of course, the techniques already developed in maintenance (figure 9), it is possible not only to elaborate a spare parts purchase plan for warranty, but also to improve the business process of decision-making as well as to contribute improvement actions for engineering and manufacturing. This purchase plan means an adequate list of essential pieces to assure the proper handling of a high amount of reclamations. At the end of the warranty period, the remaining spare parts can be negotiated with the client for their use in later maintenance tasks. This fact forces a proper control of all these materials so that, at the conclusion of the project, they are available to be supplied to the customer. At the same time, this action will provide an opportunity to recover, in the future, part of the incurred cost. In general, the decision-making will be the result of a process focused on a final choice among several alternatives. In our case, in order to lead the company to fast and adequate decision-making, every department should know very clearly what they have to do and what the scope of their responsibility is. For our company case, we have adapted the idea of a warranty management system divided into modules (see figure 1), proposing furthermore certain interactions among different departments inside the company, which share the information, take suitable decisions according to their responsibilities, and coordinate activities towards a common and profitable goal for the whole company. In order to illustrate such interactions, activities, etc., a workflow (figures 10 and 11) has been used, following the BPMN (Business Process Modelling Notation) methodology as a graphical representation of this specific business process, making it easily understandable. The departments considered here (including the client) are:
Logistics Department (LD)
Quality Department (QD)
Manufacturing Department (MD)
Purchasing Department (PD)
Management Board (MB)
Engineering Department (ED)
Aftersales Department (AD)
Customer (C)
The process starts when the customer detects a failure in a vehicle and consequently informs the company. The communications can be addressed to different sections of the company, but the most appropriate way is to focus them on only
one communicator, for example the Management Board. Anyway, the Aftersales Department can also detect failures in the course of its maintenance activities.
Figure 10. Workflow of the proposed warranty management process (part 1 of 2)
Once the information reaches the Aftersales Department, it analyzes the provided information. In case the incidence is considered not to be a matter of repair under warranty (for example, when the cause of the failure has been a wrong or bad utilization), it informs the Management Board, who finally decides whether, in spite of this, the incidence is repaired as warranty. If the incidence is discarded as a warranty repair, the Management Board should inform the customer about it. The customer can of course disagree with such a consideration. Therefore, a list of interventions (those not firstly considered as warranty) must be negotiated between the parties. If the incidence is considered under warranty conditions, the Aftersales Department must carry out a diagnosis of the incidence, detecting the problem, analyzing its solution, and determining the resources (staff and materials) and the time necessary for its repair. In reference to the material, the warranty technicians must distinguish between repairable and non-repairable / consumable materials.
Figure 11. Workflow of the proposed warranty management process (part 2 of 2)
The necessities in general are communicated to the Management Board, who addresses the actions to the corresponding department (Logistics, Manufacturing and / or Purchasing Department), in order finally to facilitate the material to the Aftersales Department. This is the moment when the Management Board must take the most important decisions in terms of costs and manufacturing prevision. Once the Aftersales Department has the material (either by a loan from the warehouse, a loan by cannibalization, or acquisition by purchasing), it communicates its action plan to the Management Board (and afterwards to the client).
The damaged material is sent to the company, where the Quality Department (together in some cases with the Engineering Department) analyzes the failure. If the repair has been made by replacement and the material is identified as repairable, the Quality Department manages the repair, taking into account the appropriate certification. The material, once repaired and certified, will be stored again in the warehouse for its use in the assembly line. In this process, all the data about the incidence, damaged material, repair, etc. gathered by the Aftersales, Quality and Engineering Departments are introduced into a database which is followed up and reviewed by the Quality Department. Once the incidence is solved, the Aftersales Department communicates the closure of the assistance to the Management Board, who transmits this to the client. It is important to receive from the customer a document with the approval of the performed tasks and the acceptance of the service closure. The database associated with these incidences and necessary for their follow-up should include not only those incidences considered under warranty, but also the data about preventive and corrective maintenance performed on every vehicle, in order to enable the analysis of, for example, repetitive or systematic failures, among other studies.
5
CONCLUSIONS
With the help of a case study, this paper summarizes a business process inside a specific framework: warranty management. The analysis observes how information related to warranty and maintenance, gathered during the lifetime of the project, can be profitably used to take decisions that reduce unnecessary expenses and improve the quality, the service and, essentially, the image of the company in front of the client. The data compilation enables the weighting of those parameters needed to choose properly among alternatives. Such decisions are expressed in the workflow as gateways. Nowadays, the use of computing tools can be helpful not only to make choices automatically, but also to model and simulate business processes in order to detect, for example, their weak points. Particularizing to our case, future studies can consider other conditions such as:
Final products with different reliability.
Local maximums at different times and places.
Warranty cost depending on time.
When the budgetary buffer is negative.
Diverse degrees of inoperability.
Etc.
In general, e-technologies are nowadays being applied in many different fields. In our case, further research can be focused on e-warranty in the same way as e-maintenance. The concept of e-warranty will then be defined as the warranty support which includes the resources, services and management needed to enable proactive decisions in the process execution. E-technologies such as e-monitoring or e-diagnosis will consequently be key factors to reach high levels of quality, reliability, effectiveness and, of course, confidence in front of the client.
6
REFERENCES
1
V. González Díaz, J.F. Gómez, M. López, A. Crespo, P. Moreu de León. (2009) Warranty cost models State-of-Art: A practical review to the framework of warranty cost management. ESREL 2009, Prague.
2
K. Lyons, D.N.P. Murthy. (2001) Warranty and manufacturing, integrated optimal modelling. Production Planning, Inventory, Quality and Maintenance, Kluwer Academic Publishers, New York. Pp. 287–322.
3
Boyan Dimitrov, Stefanka Chukova and Zohel Khalil. (2004) Warranty Costs: An Age-Dependent Failure/Repair Model. Wiley InterScience, Wiley Periodicals, Inc.
4
J. Gómez, A. Crespo, P. Moreu, C. Parra, V. González Díaz. (2009) Outsourcing maintenance in services providers. Taylor & Francis Group, London. Pp. 829-837. ISBN 978-0-415-48513-5.
5
Stefanka Chukova and Yu Hayakawa. (2004) Warranty cost analysis: non-renewing warranty with repair time. John Wiley & Sons, Ltd. Appl. Stochastic Models Bus. Ind. 20, Pp. 59–71
6
D. Christensen. (1993) Determining an accurate Estimate At Completion. National Contract Management Journal 25. Pp. 17-25.
7
ISO/DIS 14224, ISO TC 67/SC /WG 4. (2004) Petroleum, petrochemical and natural gas industries - Collection and exchange of reliability and maintenance data for equipment, Standards Norway.
Acknowledgments The author would like to thank the reviewers of the paper for their contribution to the quality of this work.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
REMOTE SERVICE CONCEPTS FOR INTELLIGENT TOOL-MACHINE SYSTEMS Guenther Schuh and Kevin Podratz Research Institute for Operations Management at RWTH Aachen University (FIR), Pontdriesch 14/16, 52062 Aachen, Germany In the near future, tooling companies will offer their customers not just maintenance services, but complex remote service packages for their engineering asset management, which is the total management of physical – not financial – assets [1]. The overall goal is to enhance the efficiency of the engineering asset, e.g. to reduce TCO, on the customers´ site by means of value creating partnerships [14]. These partnerships may be, e.g. the classical output or reliability partnership, but also process optimizing partnerships or lifecycle partnerships [4, 8]. The process optimizing partnership offers, e.g. the optimization of the system’s performance or the output quality, an optimized ramp-up and restart procedure or optimization of the production process parameters. The lifecycle partnership, on the other hand, accompanies the intelligent tool-machine-system throughout the whole lifecycle, which includes, e.g. provision of spare parts during the entire usage phase, storing, refurbishment, recycling and even the support of relocation of production facilities. Intelligent remote services have great potential for realizing all these partnerships. To realize such engineering asset-related partnerships, two major tasks have to be done. First, there has to be the intelligent tool-machine system, which delivers the information that is required for these services. And furthermore, this information has to be integrated into the maintenance processes, so that it is delivered at the right place and time and in the required form. Second, the activities and processes that are combined to the engineering asset-related partnerships have to be configured out of standardized service and process modules. Therefore configuration logic is essential. This paper is based on the results of a research project in which an intelligent tool-machine system is developed and forms the foundation for the development of such asset-related partnerships. So this paper describes the intelligent asset and the different types of partnerships and presents the logic to efficiently configure the required maintenance and business processes. Key Words: service engineering, service systems, configuration logic, configuration system, remote service, asset-related partnerships, tooling industry 1
CHALLENGES FOR TODAY’S TOOLING COMPANIES
The situation on the European tooling market is still satisfactory and even the present financial crisis has so far had no serious effects on the respective companies. Nevertheless, the competition from upcoming tooling companies from Eastern Europe and Asia imposes pressure upon established companies. Thus these companies have to develop and defend their competitive advantage. Because these emerging tool constructors have a much lower cost structure, the situation becomes more and more difficult for their competitors. A statistical analysis comparing the situation of Chinese and German companies, for example, shows that tool manufacturing in China has about 91% lower personnel costs than in Germany (i.e. 4.801€ compared to 58.407€ in Germany). Thus established tooling companies have to find new solutions and strategies to be competitive on the global market [9, 12]. According to Porter, one opportunity for assuring one's competitive position on the international market is to use differentiation strategies [2]. They are enabled by product or service developments which offer unique attributes to the customer. The quality of the manufactured product is often easily copied and cheap tool plagiarisms regularly appear on the market. Services, instead, are not that easily imitated. To make use of this advantage, companies should stop being just a 'producer' and become a 'solution provider' [3, 11] that aims to solve their customers' problems by offering complex service systems [5, 6, 7]. Thus, integrating products, parts, after-sales services and value-added services into solution systems allows a successful differentiation from the worldwide competition [10].
2 INTELLIGENT TOOL-MACHINE SYSTEMS AS ENABLER FOR REMOTE SERVICE-BASED PARTNERSHIPS
Technological innovations such as RFID, transponder and sensor technologies, together with high-capacity communication technologies, are absolutely necessary to build and install so-called 'intelligent tool-machine systems' as enablers for remote services. By means of remote services, which can be characterized as advancements of tele-services, individual service systems that lead to lasting customer-provider relationships [13] and the desired resulting partnerships can be effectively realised. They offer the possibility of providing technical services not locally at the machine but from a distant location using communication networks. This is a way to create and establish entirely new services and, additionally, to make them much more efficient. Intelligent tool-machine systems (Figure 1) are equipped with sensors which measure, for example, pressure or temperature. Depending on the machine itself and what it is used for, other devices (e.g. counting units) can be installed to gain more valuable information. Using a transponder and an RFID unit, specific software, such as an electronic tool log, can be provided with this information. An electronic tool log, stored on a transponder and connected to the tool, administrates and processes the information from the intelligent tool-machine system. To detect failures and errors early enough to avoid long downtime, the electronic tool log is also able to determine critical situations that may lead to failures and to give warnings. If a repair cannot be averted, it still helps by narrowing down the field of possible causes. To be capable of this, the system has to be supplied with 'failure patterns', i.e. known combinations of data and the failures resulting from them.
Figure 1. The intelligent tool-machine system (machine equipped with antenna, charge amplifier, transponder, RFID read/write unit and counting unit)
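The failure-pattern mechanism described above can be thought of as a simple rule base matched against the sensor data held in the electronic tool log. The following sketch illustrates one possible realisation; it is not taken from the TecPro system, and all names (SensorReading, FailurePattern, check_patterns) and thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SensorReading:
    pressure_bar: float      # cavity pressure measured by the tool sensor
    temperature_c: float     # mould temperature
    cycle_count: int         # value of the counting unit

@dataclass
class FailurePattern:
    name: str                # failure the pattern is known to precede
    condition: Callable[[SensorReading], bool]

# Hypothetical patterns: combinations of data known to precede failures.
PATTERNS: List[FailurePattern] = [
    FailurePattern("cooling channel blockage",
                   lambda r: r.temperature_c > 95 and r.pressure_bar > 400),
    FailurePattern("wear-related leakage",
                   lambda r: r.cycle_count > 500_000 and r.pressure_bar < 250),
]

def check_patterns(reading: SensorReading) -> List[str]:
    """Return warnings for every failure pattern matched by the current reading."""
    return [p.name for p in PATTERNS if p.condition(reading)]

if __name__ == "__main__":
    warnings = check_patterns(SensorReading(pressure_bar=420.0, temperature_c=97.5, cycle_count=120_000))
    print(warnings)  # ['cooling channel blockage']
```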
By means of the intelligent tool-machine system and the connected electronic tool log, services such as tool-data management, documentation, a service manual as well as engineering drawings and cooling plans can be provided directly where they are required. The processed data from the tool sensor technology supports the process of tool editing and path relocation, and revision and inventory data can thereby be managed automatically. The process data is then used to improve the initial sample inspection processes, to realize fast ramp-up times and to reduce set-up times. Additionally, personnel training can be provided or supported by means of remote connection systems. On the basis of the intelligent tool-machine system it is also possible to provide maintenance-related remote services such as condition monitoring, remote diagnosis and remote repair as well as process supervision and optimization (Figure 2). Besides this, planning, scheduling and dispatching can be supported by automated processes.
Figure 2. Potential remote services for solution systems based on the intelligent tool-machine system [12] (services range from documentation, drawings and cooling plans, spare parts, hotline, consulting and training to remote diagnosis, condition based maintenance, maintenance planning, tool treatment and storing, and availability backup)
By combining these new services, the following service systems, in the form of partnerships, may be offered in order to support the customer in managing its engineering assets more effectively [4, 8]:
• Process optimization partnership: optimization of the tool-machine system in the installation and start-up phase and optimization of current processes,
• Lifecycle partnership: support for crucial tasks across the tool's life cycle, such as continuous maintenance service, guaranteed spare part provision or end-of-use services like storage, refurbishment or even recycling,
• Condition-monitoring partnership: tele-service based maintenance as well as support of preventive and reactive maintenance activities,
• Availability backup partnership: support in case of machinery breakdown and effective production through knowledge of process parameters,
• Coordination partnership: authorization of the customer to monitor its supplier, and
• Output-guarantee partnership: acceptance of responsibility towards the customer for a defined output quantity and process quality.
With the new technology of an intelligent tool-machine system, remote services and the respective partnerships between producer and customer, it is possible for the producer to establish a differentiation strategy that makes it a service provider as well. Furthermore, the customer can profit from all the advantages that come with remote services. 3
REMOTE SERVICES INTEGRATION
To successfully offer such remote service systems, it is necessary not only to provide the enabling intelligent tool-machine systems and to develop the services themselves, but also to match the company's technological standards and to integrate the service systems into the company's organization. Thus, organizational and technological integration, as described in Figure 3, have to be achieved.
Figure 3. Aspects of realization and integration of service systems [8]
Within the scope of integrating the respective service systems into the company's technological environment, the interfaces to the existing system landscape have to be defined, and it has to be specified which information will be transmitted to and from the relevant systems. Thus the following tasks have to be fulfilled:
• Selection and adaptation or development of suitable sensor and transponder technologies (if necessary), taking into account the special requirements of the application surroundings, e.g. injection moulds (pollution, high pressure, temperature, etc.),
• Identification of the data to be measured, such as temperature and pressure, and of ways to achieve acceptable values,
• Establishment of a connection between the tool's transponder and the machine's control system as well as integration of these elements into the complete system, and
• Integration of technologies to edit and transmit data for the realization of the remote interface.
The organizational integration deals with the following aspects:
• The presentation to the customer of configured service systems, for example in terms of partnerships, and of the opportunity for customizing packages. Besides the presentation, it is necessary to integrate the sales department and to consider the existing product and service portfolio as well as its structure.
• The relevant resources have to be managed. This means the resources have to be described and regularly updated by means of their essential characteristics, and it has to be defined in a general way for which services the resources are required.
• The integration of processes for the service delivery into the existing business process structure and of relevant information for the asset management has to be considered. 4
CONFIGURATION OF REMOTE SERVICE SYSTEMS 4.1 DEVELOPMENT OF CONFIGURATION LOGIC
For an efficient realization of the organizational integration, a configuration logic (Figure 6) has to be developed according to the process displayed in Figure 4. For this purpose, first of all, the desired offerings, i.e. the service systems and partnerships, have to be identified and described.
Figure 4. Development process of configuration logic: (1) identification and description of the desired service systems, (2) definition of service modules, (3) definition of combination rules, (4) definition of processes for delivery of the service modules, (5) definition of the required resources for the processes, (6) definition of the required information in the processes, with revision of the rules as a feedback step.
Next, these service systems or partnership services should be modularized into standardized service modules (Figure 5), which can be divided into:
• Basic services: These are central services without which the service system would not be possible and without which no other additional services could be offered. These basic services can be specified, e.g. by accounting types or contractual agreements, which also modify the respective processes for the service module.
• Additional services: These services are extensions of the basic services which one cannot (or does not want to) offer independently. In combination with a basic service, additional services offer an extra benefit for the customer.
Figure 5. Service modules (basic services with service specifications and additional services incl. specifications)
After that, all possible combinations for customized packages (service systems) have to be marked out. It has to be determined which service modules have to be linked to achieve a desired function and which service modules can or must not be connected. These rules define which combinations of modules are:
• Technically impossible, because the required processes or resources do not fit together or do not exist in the required version for this particular combination,
• Illogical or economically nonsensical, or
• Counterproductive, so that the required processes or resources would interfere with the functional capability of others.
This demarcation has to be made not only between the service modules within a single class (basic services or additional services) but also between the classes themselves, i.e. combination rules for basic services as well as for additional services have to be established. These rules help the user during the actual configuration to create a valid service system by giving guidelines and setting boundaries to the configuration possibilities. However, the service modules alone are not sufficient to realize the service systems; in effect, they are only variables or customer-related terms for the processes realizing them. One also requires resources and information (concerning the machine control, the intelligent tool and the management systems) as well as integration into the existing business processes. For this purpose, the processes should be described and assigned to the respective service modules. The next step is the assignment of the required resources as well as the relevant information of the machine control, tool-machine system, IT systems and management systems to those processes. This information has to be integrated into these processes to ensure its delivery at the right place and time and in the required form.
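Such combination rules can be made explicit in a configuration system. The following sketch shows one possible representation of forbidden pairs and prerequisites; the module names and rules are purely illustrative assumptions and are not taken from the project.

```python
# A minimal sketch of how combination rules for service modules might be
# represented and checked. All module names and rules are hypothetical.

BASIC = {"condition_monitoring", "maintenance_planning", "spare_parts"}
ADDITIONAL = {"remote_diagnosis", "hotline", "availability_backup"}

# Pairs that must not be combined (technically impossible, economically
# nonsensical or counterproductive), and required prerequisites.
FORBIDDEN = {frozenset({"hotline", "availability_backup"})}
REQUIRES = {"remote_diagnosis": {"condition_monitoring"},
            "availability_backup": {"maintenance_planning"}}

def validate(selection: set[str]) -> list[str]:
    """Return a list of rule violations for a selected service system."""
    errors = []
    if not selection & BASIC:
        errors.append("at least one basic service is required")
    for pair in FORBIDDEN:
        if pair <= selection:
            errors.append(f"modules {sorted(pair)} must not be combined")
    for module, prereqs in REQUIRES.items():
        if module in selection and not prereqs <= selection:
            errors.append(f"{module} requires {sorted(prereqs)}")
    return errors

print(validate({"remote_diagnosis", "hotline"}))
# ['at least one basic service is required', "remote_diagnosis requires ['condition_monitoring']"]
```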
Figure 6. The configuration concept [8]
4.2 THE CONFIGURATION PROCESS Now the actual configuration, i.e. the allocation of the service modules to the developed service systems, can be done. For this, it has to be determined which modules are part of the respective service system. The assignment of resources to service modules and the definition of the combination rules have already been done in the development phase. But before starting the first actual configuration, the company's resources have to be inspected and introduced into the configuration system in a once-only administrative process. It is necessary to find out which resources (tools, means of transportation, employees and their qualifications) are available in the company, because they are the foundation for the service delivery processes and therefore determine which service modules, and consequently which service systems, can be offered at all (Figure 6). In this inspection, only the general availability of resources with their essential attributes is listed. The reason for dealing only with the general resource availability at first is that the resources are the basis for realizing the service systems: a company can only offer those service systems and service modules which can be delivered with its given resources. A stock check of all available resources at the very beginning ensures that no service systems and customized packages will be configured that cannot be realized afterwards. The actual configuration phase then starts with the configuration, or combination, of the service modules guided by the predefined rules. After that, and an optional further description of the service system, the manual configuration is already finished. The further configuration of the processes and the information integration were already structured in the configuration logic. The actual gathering and processing of the relevant information as well as the actual execution is also planned, coordinated and partly even carried out by the configuration system itself in collaboration with the associated software systems. So the manual configuration actually consists of one single step.
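The once-only resource inspection can be thought of as a filter that restricts the configurable module set to what the company's resources can actually deliver. The sketch below extends the hypothetical example above; the resource names and the module-to-resource mapping are again assumptions for illustration only.

```python
# Hypothetical mapping of service modules to the resources needed to deliver them.
MODULE_RESOURCES = {
    "condition_monitoring": {"rfid_reader", "service_engineer"},
    "remote_diagnosis": {"remote_access_server", "service_engineer"},
    "availability_backup": {"spare_tool_stock", "field_technician"},
}

def offerable_modules(available_resources: set[str]) -> set[str]:
    """Modules whose required resources are all generally available."""
    return {m for m, needed in MODULE_RESOURCES.items()
            if needed <= available_resources}

# Once-only inspection of the company's resources, then filtering:
company = {"rfid_reader", "service_engineer", "remote_access_server"}
print(offerable_modules(company))
# e.g. {'condition_monitoring', 'remote_diagnosis'}
```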
Figure 7. The one-step configuration (once-only manual preparation: inspection and introduction of the company's resources; one-step manual configuration of the service systems; automatic configuration and execution of the order fulfilment process)
By means of the configuration logic presented here, the configuration process (Figure 7) is simplified and the number of possible error sources is reduced. The configuration of service modules can therefore be done by less specialized persons, such as a sales employee, or even by the customer. With the help of the configuration logic and specific case-oriented rules, service modules can be assembled easily and efficiently. Thus the realization of the new service systems and the integration of information into them can be assured and the project target can be accomplished. 5
CONCLUSION
In times of the present financial crisis and rising competition on the tool-construction market, traditional tool producers are forced to restore their competitive advantage. For that, new solutions and production strategies have to be developed, and expanding the already existing offerings with engineering services becomes more and more important. On the one hand, entirely new services can be established, thereby enlarging the company's portfolio. On the other hand, already existing services can be improved and made much more efficient. These kinds of services should, of course, be customized and therefore have to be realized in a configuration system that allows the customer to compile exactly the needed service level. Such engineering asset-related partnerships can be facilitated by achieving organizational and technical integration into the customer's company. Via the configuration logic, information is assigned to processes, and these processes and resources are assigned to service modules. The compilation is simplified in such a way that the customer can realize a complete service system by combining only the service modules. 6
REFERENCES
1
Amadi-Echendu, J. et al. (2007) What is engineering asset management? In: Len Gelman, Joe Mathew, Jim Kennedy, Jay Lee, Jun Ni (Editors): Proceedings of the 2nd World Congress on Engineering Asset Management (EAM) and the 4th International Conference on Condition Monitoring. Springer, London, p. 116-129.
2
Belz, C.; Bircher, B.; Büsser, M.; Hillen, H.; Schlegel, H. J.; Willée, C. (1991) Erfolgreiche Leistungssysteme – Anleitungen und Beispiele. Schäffer-Poeschel, Stuttgart.
3
DIHK (Editors) (2002) Industrie- und Dienstleistungsstandort Deutschland: Ergebnisse einer Unternehmensbefragung durch die IHK-Organisation. DIHK, Berlin.
4
Hofmann, G. (2004) Intelligente Spritzgießwerkzeuge erschließen Servicepotenziale im Werkzeugbau. In: Wachstumspotenziale – Integration von Sachgütern und Dienstleistungen. Konferenzband, Esslingen, 2009.
5
Ittner, T.; Wüllenweber, J.: Tough times for toolmakers. In: The McKinsey Quarterly, Nr. 2.
6
Kersten, W.; Zink, T.; Kern, E. M. (2006) Wertschöpfungsnetzwerke zur Entwicklung und Produktion hybrider Produkte: Ansatzpunkte und Forschungsbedarf. In: Blecker, T.; Gemünden, H. (Hrsg.): Wertschöpfungsnetzwerke. Schmidt, Berlin.
7
Kuster, J. (2004) Systembündelung technischer Dienstleistungen. Shaker, Aachen.
8
Podratz, K. (2009) Ein Ass im Ärmel: Effizientes Handling von Remote Service basierten Leistungssystemen im Werkzeugbau. In: UdZ – Unternehmen der Zukunft Nr. 2, p. 53-59.
9
Porter, M. E. (2008) Wettbewerbsstrategie. Campus, Frankfurt.
10
Ramaswami, R. (1996) Design and Management of Service Processes – Keeping Customers for Life. Addison-Wesley, Old Tappan, NJ, USA.
11
Schuh, G.; Friedli, T.; Gebauer, H. (2004) Fit for Service: Industrie als Dienstleister. Hanser, München.
12
Schuh, G. et al. (2008) Technologiebasierte Geschäftsmodelle für Produkt-Service-Systeme im Werkzeugbau. In: Seeliger, A.; Burgwinkel, P. (Editors): Tagungsband zum 7. Aachener Kolloquium für Instandhaltung, Diagnose und Anlagenüberwachung, Verlag Zillekens, Aachen, p. 325 – 335.
13
Schuh, G.; Gudergan, G. (2009) Service Engineering as an Approach to Designing Industrial Product Service Systems. In: Roy, R.; Shehab, E. (Editors): Industrial Product-Service Systems (IPS2) – Proceedings of the 1st CIRP IPS2 Conference. Cranfield University Press, Cranfield, UK, p. 1-7.
14
Ulepic, S. (2009) Value-Added-Partnership-Model. In: Beschaffung aktuell, Nr. 1, p. 50-51.
Acknowledgments The data for this paper is a result of a research project called 'TecPro'. TecPro was established to work on the research topic "service systems for technology and production-based services of the tool and mold production". The project is funded by the German Federal Ministry of Education and Research (BMBF) under the project reference 02PG1095 and within the research and development programme "Forschung für die Produktion von morgen". It is supervised by the department of production and manufacturing technologies of the project management agency Forschungszentrum Karlsruhe (PTKA-PFT). It started in September 2006 and will be finished in February 2010. The project's aims were
• the development and technical realization of an "intelligent tool",
• the development of business concepts and processes for the intelligent tool, and
• the development of a model for the configuration of service systems and its integration into the business processes.
Figure 8. The TecPro project consortium
Within the project's framework, an intelligent tool system has been developed. It is the foundation for the development of asset-related partnerships. To successfully implement such partnerships, the data from the tool sensors and machine control have to be interpreted and integrated into the customer's business processes. Consequently, the companies can offer service systems and thereby successfully secure their strategic position on the global market. We would hereby like to thank all project partners (Figure 8) for their cooperation.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE STATE OF ASSET MANAGEMENT IN THE NETHERLANDS Y.C. Wijnia a, P.M. Herder a,c
a Delft University of Technology, Faculty of Technology, Policy and Management, Jaffalaan 5, 2628BS Delft, The Netherlands
b D-Cision bv, PO box 44, 8000 AA Zwolle, The Netherlands
c Next Generation Infrastructure Foundation, Delft, The Netherlands
The world of infrastructures for energy, transportation, water and communication is constantly changing. Not only is their use ever increasing; recent years have also demonstrated a move from the public domain to a more privatised setting. On the one side, customers become more critical and demand better quality and better service, whereas on the other side a need arises to limit the (public) costs of infrastructures. To cope with these conflicting demands, many infrastructure organisations have introduced asset management, an integrated approach to balancing costs, performance and risks. However, in discussions at the Asset Management Platform (a group of over 30 asset managers from all infrastructures in the Netherlands, established in 2007 by the Next Generation Infrastructures Foundation) it appeared that many infrastructure asset managers had great difficulty in getting beyond the first promising step. As a first step in the formulation of a new research programme, an interview round was conducted with a number of infrastructure asset managers to get a better feel for what the precise barriers and challenges were. In this paper the results of the interviews are presented. Key findings were that asset management has acquired a strong foothold, but that it was a bottom-up process which did not reach the strategic level. There were still difficulties in convincing top management of the strategic value of asset management and in aligning organisational goals with technical and operational standards. Furthermore, it was widely recognised that asset management required a change in maintenance paradigms and that asset management should focus on life cycle costing. Nevertheless, asset management was generally considered to be an engineering discipline, even though much effort was spent on creating support for the initiative. Some asset managers explicitly recognised the social and cultural issues and acted on this by training their asset managers in social skills like empathy and persuasiveness; however, this was not common. Neither was the extensive use of asset performance models for decision support, as most asset managers preferred to rely on historical data. At the same time, however, many asset managers reported problems in accessing those very historical data. Based on these results, a research programme was defined in which academics as well as practitioners will participate. The outlines of this programme are introduced in the paper. Key Words: infrastructures, asset management practice, asset governance, research agenda 1
INTRODUCTION
Infrastructures have always been a vital part of society. In the past, most cities developed at a crossroad of different modes of transportation, as this was a great opportunity for trade. The first infrastructure planners were the Romans, with their extensive road network and water distribution system. Over the years, the concept of infrastructure has expanded from road and water to incorporate gas (1812), rail (1829), telecommunication (1843), electricity (1879) and airlines (1914)1. It appears society has become more and more dependent on the delivery of goods and services from/to far away, and thus on the infrastructure that supports delivery. The use of energy in our personal lives is ever increasing, and people got used to the idea that someone at the other end of the world is not far away. Most infrastructures have seen a continuous development of technology since their establishment. The 8 lane highways cannot be compared to the single track carriage ways of the Romans and the internet is miles away from the first telegraph.
1 Years according to Wikipedia
However, the institutional setting has developed less dramatically. Virtually all infrastructures started as private, commercial enterprises. As governments recognised the importance of well-functioning infrastructures for society, many of those infrastructures were put under government control, either by institutionalising them (the infrastructure provider becomes a government agency), by acquiring the shares of the private companies or by strict regulation. Recent years have demonstrated a reversal of this movement, as many sectors have been liberalised and deregulated and many infrastructures are being (re)privatised. This happened in the Netherlands, for instance, with respect to rail (1993), electricity (1999), and gas (1998). The general idea behind all those initiatives is that markets would provide better services at lower costs than governmental bodies could. Coinciding with this focus on commercial quality, customers became aware of the non-commercial qualities of the infrastructure. It has become more difficult and more expensive to plan new infrastructures (at least, the visible ones), as people claim a right to the view they have always had and demand that the landscape should not be disturbed by the new infrastructure. Incidents that do happen (for example substantial power outages) are treated as news items, and the availability of information on those incidents is immense2. Infrastructures might therefore appear more risky, and new regulations are put in place to safeguard the public against the perceived "greediness" of the commercial infrastructure operators3. Infrastructures are under pressure, as one could say. This is shown in the figure below.
Figure 1: The pressures on infrastructure systems (increasing performance requirements, limited budget, less public acceptance and higher legal requirements acting on the infrastructure system)
It is up to the infrastructure operators to resolve this issue. In this light, it is no surprise that asset management has gained the attention of many infrastructure operators. Asset management, after all, is the profession of balancing cost, performance and risk over the lifecycle of an asset. For example, the PAS 55 specification defines asset management as: "Systematic and coordinated activities and practices through which an organization optimally manages its physical assets, and their associated performances, risks and expenditures over their lifecycle for the purpose of achieving its organisational strategic plan." However, when getting into the details of the infrastructure system, it becomes clear that the total operation of an infrastructure depends not only on the physical assets, but also on other elements, like information systems, data, standards and procedures, employees, capabilities and culture4. These elements are only to a certain extent independent of each other, and over time they might influence each other. It is therefore probably better to speak of elements that are loosely coupled, instead of independent. This is shown in the diagram below, in which the black-box infrastructure system of Figure 1 is replaced by a diagram in which the constituting elements are represented as masses connected by springs. The metaphor of a mass-spring system is explicitly chosen, as it provokes the image of objects reverberating and exerting forces on other elements. The metaphor also demonstrates another characteristic many asset managers are familiar with: you can increase the strain on the system to achieve a higher (financial) performance, but it will either come back at you in the future (for example, postponing maintenance and having to replace the asset in a few years because of irreparable damage) or have a (delayed) impact on the performance. Short-term financial gains can be achieved, but it is much more difficult to sustain them.
2 Supported by the telecommunications infrastructure. In the recent Schiphol plane crash (25 February 2009, 10.40 am), reports were on the internet no later than 10.42 am (Twitter).
3 In this debate, people seem to miss that the people working in those operators are to a large extent the same as when it was a state-owned enterprise, preserving the culture and values of the old days.
4 This is recognised by PAS 55, but not in the definition.
Figure 2: The mass-spring metaphor of asset management (the infrastructure system represented as loosely coupled elements: assets, systems, data, standards, procedures, practices, personnel, capability and culture, grouped as hardware, software and bioware, subject to the external pressures of Figure 1)
In the Netherlands, many infrastructure managers have embraced asset management. As the Netherlands is a small country, most asset managers regularly bump into each other, and several initiatives to share knowledge on the profession were deployed. In those initial contacts it became clear that knowledge sharing alone was not good enough; some new knowledge had to be developed as well. None of the participants was really certain where the next steps of asset management could lead them. The Next Generation Infrastructure Foundation (NGInfra) took up the challenge of knowledge development and established the Asset Management Platform in spring 2007. In the start-up meeting of this platform, corporate (high-level) infrastructure managers were very well represented, as over 90% of the invited people attended. Represented infrastructures included road, rail, electricity, gas, water, sewage, telecom, and airports. However, in the first meeting it became apparent that asset management had very different connotations in the diverse infrastructures. As this was a serious barrier for developing a shared research agenda, the first step of the platform was to establish the state of asset management in the Netherlands. In this paper, the questions, procedures and key findings of this research are presented. Based on those results, a research agenda and programme was formulated and kicked off. The outlines of this programme will be introduced in the paper. 2
THE SETUP OF THE RESEARCH
The research aimed at arriving at a representative list of issues in asset management that would cross-cut all infrastructure sectors. Therefore, asset managers in 7 different infrastructure and related sectors were interviewed. These were:
1. Gas and oil
2. Electricity
3. Rail and road
4. Drinking water
5. Waterways and water protection
6. Telecom
7. Asset service providers
The focus of NGInfra lies with the long-lived infrastructures of sectors 1 to 5. To establish whether the technology was a determining factor, the 6th sector was added: in telecom, technology and demand develop much faster than in the other sectors, resulting in short-lived assets. As outsourcing of services is a hot topic in the field of asset management, the 7th sector was added to provide insight into the specific issues related to outsourcing from both sides. The core objective of the research was to establish the key issues in asset management for which NGInfra could develop knowledge development and dissemination programmes. To achieve this, 17 interviews were conducted, which provided a wealth of insights. It was not the purpose of the research to judge asset management in the Netherlands, nor to provide a benchmark. The interviews were conducted by Joost Jongerius, Tom Meijerink and Jan van Miltenburg of Evers & Manders Consult, commissioned by the Next Generation Infrastructures Foundation5. To give the interviewees as much room as possible to talk about the issues close to their heart, open interviews were chosen instead of standardized interviews with structured questions. To facilitate comparison of the different interviews, a checklist of topics to be addressed was prepared. This checklist is presented in Table 1.
Table 1. Overview of interview protocol.
Main topic: The concept
• How do you define asset management?
• Which assets?
• Which performance criteria?
• How do you determine the criteria?
Main topic: Organisation
• Development of asset management within the organisation
• Asset owner or service provider
• Position of asset management within the organisation
• Reporting line to board of directors
• Number of disciplines/employees involved
• Balancing and prioritizing between disciplines
• Who determines budgets for investment and maintenance?
Main topic: Implementation
• Are there asset registers?
• How is the condition of assets established and monitored?
• Are the data sufficient for analysis?
• Are methods for risk analysis used? In which phases of the lifecycle (design, operation, decommissioning) are those employed?
• Are there maintenance and investment plans?
Main topic: Knowledge acquisition
• How is knowledge acquired and managed?
• Most important knowledge gaps?
• Expected benefit of extra knowledge?
• Use of benchmarks?
Main topic: Room for improvement – what would be a real breakthrough?
• Organization
• Culture
• Tools/knowledge
As mentioned earlier, the objective was to establish the key issues in asset management across all sectors. However, this does not mean that only the average knowledge gap will be presented. The purpose was both to illuminate where participants of the platform can learn from each other, possibly facilitated by a knowledge dissemination programme of NGInfra, and to establish where (university-based) research is needed to advance the profession. For this purpose, the differences between and within sectors provide a much richer insight than the average value. 3
DIRECT RESULTS 3.1 The concept of Asset Management 3.1.1 Asset management definition
All interviewees agreed that the concept of asset management is about the efficient delivery of desired performance. But this is an empty statement, as it can apply to many forms of management. It only becomes meaningful if the considerations which fall within the realm of asset management are further specified. With respect to this, three streams could be recognized:
5 Jongerius, J., Meijerink, T., Miltenburg, J. van, Added-Value Performance, Infrastructure asset management in the Netherlands, a study in seven sectors, Evers & Manders Consult BV, June 2007.
1. Asset management as the professionalization of maintenance and operation: In many organisations (of asset owners), asset management was limited to the operational part of the asset life cycle. In those organizations, the production function (in cooperation with the maintenance function) was responsible for running the assets. The key objective was availability, supplemented with pressure on operational costs. In the 1990s condition based maintenance was introduced, together with tools like FMEA and FMECA. Even though life cycle management was often mentioned in that period, the focus was still on the reliability of current assets and not on total cost of ownership as the concept of life cycle management promotes. Questions on the need for a high reliability were often not asked. Asset management was often not involved in decisions on investment, or sometimes not even in determining the maintenance budget. Only recently has risk analysis been introduced, and then often limited to individual assets and components. Thinking about the asset base and the organization as a holistic entity is still far in the future. This vision of asset management can be branded as bottom-up asset management.
2. Asset management as part of an organizational performance strategy: A completely different approach is to take the organizational strategy as the starting point for asset management. Asset owners have assets for a reason, and it is up to the asset manager to help the asset owner achieve these objectives as efficiently as possible. This means determining maintenance for current assets, but also determining the right investment strategy to arrive at a better portfolio of assets, or sometimes even challenging the objectives, as they might be very expensive to realize. Some organizations might call this performance or service management; others refer to the term risk-based asset management. The reason for this is that in many infrastructures no financial gains can be achieved by extra investments, just improvements in performance. All activities and investments are thus basically risk mitigations, hence the name.
3. Asset management as a concept of service delivery: The service providers active in the field of infrastructure assets see asset management as a means of extending their service offering. Traditionally, asset owners commissioned the construction of new assets, either turnkey or only as subcontractor for an internal project leader. The trend over time has been towards a broader service offering, including design, operation and maintenance. These DCOM6 contracts often have long running times. Problems in contracting often concern the over-specification of requirements as a means of risk management by the asset owner. Therefore, service providers would like to take up the role of asset manager, where they can discuss the end performance targets with the asset owner and think of the best option to deliver this value. It is like the transition that can be witnessed in telecom: first, backups needed to be made on the computer or server itself, but with high-speed internet access an online backup service is now achievable. The asset (the backup tape unit) is thus replaced by a service.
As can be concluded from this impression, asset management is mainly limited to the operational level but is developing into the tactical domain. Few organizations have moved asset management into the strategic domain. 3.1.2 Performance criteria A number of organizations have to deal with performance criteria that are imposed by external bodies. Road and rail infrastructure has to comply with certain levels of availability, water has an extensive body of quality standards, and the high-pressure gas transport infrastructure has strict safety risk limits. However, many of those obligations are at a high level and not directly applicable in day-to-day operation. Asset managers therefore have a role in translating the infrastructure performance criteria into asset performance criteria and internal targets. But the alignment between organizational goals and the technical performance criteria is weak. One of the reasons for this is the lack of representation of asset management at the strategic level. The top level sees the added value of asset management, but has a limited view on what asset management can mean (often limited to improving maintenance and operation). Only in a few cases does the top level act as a real asset owner and try to think about what can be achieved with current assets instead of how the cost of the current operation can be reduced. As a consequence, the asset management professionals try to formulate meaningful objectives for themselves and then try to gain support for them. The resulting paradox is that asset managers, as they are not involved in the strategic debate, can only suggest improvements to the current situation. These improvements are expressed as lower costs, thus reinforcing the (wrong) image of asset management as a tool for cost reduction and thus reducing the probability of being invited to the strategic sessions. Only in a few organizations does the asset manager present the asset owner with different scenarios for cost and performance development.
6 Design, Construct, Operate and Maintain
3.2 Organization As has become clear in the previous section, asset management is in different stages of development in different sectors and companies in the Netherlands. This is reflected in the organizational position of asset management. In many cases asset management is part of the production or maintenance function. In some cases asset management is a separate department with direct representation on the board of directors of the infrastructure manager. However, most infrastructure managers are part of larger conglomerates, and in none of the organizations did asset management hold a position at the corporate level. If it had a board representation, it was on the board of the business unit. An issue addressed in the interviews was whether asset management was a line or a staff function. In none of the organizations was asset management a true staff function. The interviewees viewed this as the right way, as they feared that asset management as a staff function would be perceived as top-down and thus provoke resistance. Building the asset management capabilities bottom-up was perceived to be the better option. The result is that many asset managers indicate that a large number of employees is involved in asset management. It also means that asset management is on the map and has acquired a strong foothold. Another observation is that asset managers tend to have a technical background, even though many recognize the need for a wider view on the world than only the technical. But they also seem to agree "that it is easier to train an engineer in economics than to train an economist in engineering". However, despite the wide recognition of the importance of the social side of asset management, only a few organizations actively encourage their staff to develop themselves in that direction by offering training courses. 3.3 Implementation 3.3.1 Availability of data "The data exist but are not accessible". This was the central theme of all interviews. Some claimed this was a legacy problem7, but in other cases the data simply did not exist. Assets have been constructed over multiple decades, and data might not have been recorded at all at construction, or badly maintained, or lost over time. Some asset owners could not even tell for certain how many assets they owned. This might be hard to imagine for an asset manager working in a production plant, where one sees the assets one owns. But in infrastructures, assets are distributed, so you do not necessarily know where to look. Furthermore, underground infrastructures cannot be seen at all. At best, their existence can be inferred from above-ground connections, but what they are and where they run is not detectable. Another issue mentioned in many interviews was the poor quality of fault restoration data. Often the precise cause of the failure was not recorded, which is a vital piece of information for the asset manager. Despite these shortcomings, asset managers tend to be pragmatic: even with only 80% of the data available they can make the right decisions. Nevertheless, a drive still exists to get more and better data. In this respect online monitoring systems gained much attention. A concern that many asset managers expressed was that much of the data exists as experience in the minds of the employees and not as factual recordings. Getting the real data out, instead of history coloured by rules of thumb, was regarded as a big problem. This was further amplified by the uneven age profile of the organization.
7 Data only available on paper or in a singular database
Employees of asset managers tend to be in the second half of their working life, with many not very far away from retirement. 3.3.2 Methods and tools One of the themes in the interviews was the use of tools and methods for asset management. In this respect, a division could be made between tools and methods for maintenance and operation on the one side, and tools and methods for investment decisions on the other. For the first set, the focus is on registering assets, classifying criticality and optimizing the maintenance strategy. Investment decisions rely more on the prognosis of future behaviour and need for maintenance.
• Tools for maintenance: It was found that a number of (IT) tools are used. Every organization seems to have a tailor-made solution for its case. These are often adaptations of commercially available suites like SAP, IBM Maximo or D7i, often supplemented with Excel for the registration of assets and data. In itself, the choice of tool is not exciting: all tools deliver comparable features. However, the lack of standardization within organizations forms a barrier to the centralization of asset management. What can be observed is that a paradigm shift is happening in the field of maintenance. In the late 80s, preventive maintenance was seen as the way to reduce the costs of the installations, as it limited the share of unplanned maintenance (outages) in the total costs. Nowadays, however, preventive maintenance is only applied when and where it is really needed. Maintenance is further diversified into use based maintenance, condition based maintenance, and corrective maintenance. The latter means accepting the risk of asset failure (a simple illustration follows after this list).
• Tools for investment decisions: In many cases, the asset managers were not involved in the strategic investment plans. The initiative is often taken by the commercial departments. However, the need for maintenance of the new investment seems to have acquired a foothold in the business case models, and therefore asset managers tend to get involved in the decision-making process at a later phase. Predicting the need for maintenance, however, seems to be a key (and difficult) issue. The history of existing assets can provide some clues for investment decisions, but it is uncertain whether the future asset will behave comparably. In this respect, two factions seem to exist: one group of asset managers is building models to predict future behaviour, whereas the other sees them as too theoretical and tends to focus on gathering historical data.
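The diversification of maintenance strategies described in the first bullet is often driven by a criticality classification of the assets. The sketch below illustrates one possible mapping from criticality and monitorability to a strategy; the categories and decision rules are illustrative assumptions, not findings from the interviews.

```python
def maintenance_strategy(criticality: str, condition_monitorable: bool) -> str:
    """Illustrative mapping of asset criticality to a maintenance strategy.

    criticality: 'high', 'medium' or 'low' (e.g. taken from a risk matrix).
    condition_monitorable: whether the asset condition can be measured in service.
    """
    if criticality == "high":
        # Critical assets: monitor condition if possible, otherwise service on a fixed interval.
        return "condition based maintenance" if condition_monitorable else "use/time based preventive maintenance"
    if criticality == "medium":
        return "condition based maintenance" if condition_monitorable else "corrective maintenance"
    # Non-critical assets: run to failure and accept the risk.
    return "corrective maintenance"

print(maintenance_strategy("high", condition_monitorable=True))   # condition based maintenance
print(maintenance_strategy("low", condition_monitorable=False))   # corrective maintenance
```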
One conclusion can be drawn from these interviews: most asset managers used to think in terms of fixed budgets, which were spent according to best knowledge, but recent years have shown that asset managers are forced to build a business case for their cash needs. 3.4 Knowledge acquisition In the interviews, knowledge gaps and knowledge acquisition were addressed. A clear differentiation between knowledge on assets and knowledge on asset management could be observed.
• Knowledge gaps: The major knowledge gap concerns asset behaviour, especially the long-term behaviour and prediction of the long-term need for maintenance and (re)investment. Those long-term predictions are not only important for the asset owners, but also for the asset service providers. In general, asset managers think that they have good knowledge of asset management itself, and they do not indicate a knowledge gap in this field. They do see a need for more knowledge on contracting, as the trend is towards outsourcing of services.
• Knowledge acquisition: Although asset managers have (external) technical courses in their training portfolio, external knowledge acquisition is not important with respect to building knowledge on asset behaviour. As mentioned earlier, the knowledge exists within the internal organization, and efforts focus on documenting that internal knowledge. External knowledge acquisition tends to focus on asset management in a broader sense, and especially on (asset) risk management. Besides courses, many asset management conferences are visited. These provide new ideas, but it tends to be very difficult to apply the acquired ideas in the home environment. A final element is sharing knowledge directly with other organizations in the sector, but the extent of this varies widely over the sectors. Some multi-sector initiatives exist.
3.5 Opportunities The opportunity for improving the profession of asset management was addressed as a final theme in the interviews. A number of opinions with regard to the functions and objectives of the Platform are shown below:
Don'ts
• "Discussion group"
• Lobbying
• Combined meetings of asset owners and service providers
Do's
• "Task force"
• Lobbying
• Promoting asset management to top management
• Asset management hub
• Modular courses
• Knowledge centre on tools and methods
• Demonstration of generally applicable methods and tools
• Linking academic knowledge to practice
• Facilitator for standardization of data
Note that lobbying is mentioned both as a do and as a don't. A trend in the opinions is the interest in knowledge dissemination: providing master classes and training courses, and being a knowledge hub. A number of interviewees would welcome an independent body that could help them in judging the added value of "new and improved" tools and techniques.
4
KEY FINDINGS
1. Asset management is not yet at the strategic level
Even though asset management is still a juvenile "science", significant progress has been achieved. Almost all interviewed organizations implemented their first asset management initiatives around 2002, followed by establishing separate asset management groups or departments around 2004-2006. Most of the time has been spent on acquiring support for asset management, working bottom-up. In many organizations asset management originated in the functions of maintenance and operation, and it is still regarded as the professionalization of those functions. However, over the years asset management has moved up from purely maintenance decisions to judging investments on their maintainability, and has thus moved to the tactical level. Nevertheless, asset management is still regarded as a cost centre, and its efforts are expected to save costs. Many investments only result in risk reduction, not in extra profits. Business cases therefore often show a negative result, with the rare exception that reduced maintenance costs justify the investment. Analyses based on total life cycle cost and organizational risk are better suited to convince top management. Asset management then turns into a profit centre, a necessity for reaching the strategic level. To achieve this, alignment is needed between organizational goals and technical norms and standards, but this is lacking in many organizations. Translating the organizational goals into asset performance criteria could facilitate alignment and could lift asset management to the strategic level.
2. Paradigm shift
Where up until a few years ago mainly use or time based preventive maintenance was applied, condition based maintenance now seems to be the norm, supplemented with corrective maintenance for non-critical assets. Use based preventive maintenance is only applied if it is really necessary. This is strongly linked to the concept of lifecycle costing. Trying to minimize maintenance costs for existing assets can be unwise if assets with much lower operational costs are available on the market, for example because they have higher energy efficiencies; the asset should then be replaced and not maintained. The concept of lifecycle costing is gaining interest, but it is certainly not widely applied (a worked illustration is given after these findings).
3. Performance models and data
A key element of asset management is predicting the asset performance and the maintenance need of the asset. However, only a few asset managers relied on models to predict this, as many regarded the models as too theoretical. Most tended to rely on extrapolations of historical data. Paradoxically, at the same time most asset managers "complain" about the quality and availability of historical data.
4. Soft issues are important
Even though asset management was generally considered to be an engineering discipline, most asset managers recognized the importance of non-technical issues like economics and social elements. Many spent much effort on creating support in the organization. Some even developed specific training modules for social skills like empathy and persuasiveness, although this was not common.
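To make the lifecycle costing argument of finding 2 concrete, the sketch below compares keeping an existing asset against replacing it with a more energy-efficient one over a fixed horizon. All figures (maintenance costs, energy costs, purchase price, discount rate) are invented for illustration only.

```python
def present_value(annual_cost: float, years: int, rate: float) -> float:
    """Discounted sum of a constant annual cost over the given horizon."""
    return sum(annual_cost / (1 + rate) ** t for t in range(1, years + 1))

YEARS, RATE = 10, 0.05

# Keep the existing asset: no purchase, but high maintenance and energy costs.
keep = present_value(annual_cost=40_000 + 25_000, years=YEARS, rate=RATE)

# Replace: purchase price up front, lower maintenance and energy costs afterwards.
replace = 150_000 + present_value(annual_cost=10_000 + 12_000, years=YEARS, rate=RATE)

print(f"keep:    {keep:,.0f}")     # approx. 501,913
print(f"replace: {replace:,.0f}")  # approx. 319,878
```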
5 RESEARCH AGENDA: TOWARDS ASSET GOVERNANCE
Based on the interview results, a research program was defined in which academics as well as practitioners are participating. The outline of this program is introduced in this section. Much of the research published in the area of asset management seems to deal with maintenance and operation. This is a good thing, as many of the issues the practitioners face today are related to maintenance and operation. Asset management without attention for maintenance and operation would be empty. Nevertheless, the interview results suggest that it is not the biggest issue asset managers face. That is more in the area of showing the added value of asset management to the whole organization, even if the costs of managing the assets would go up. Currently, asset managers tend to be involved in cost cutting in maintenance, in order to prove the value of asset management. However, we strongly feel that asset management should be more about discussing what the assets could and should deliver in relation to what the company needs them to do. In some cases, it might be wiser to increase the asset management budget, as it would reduce the need to build new assets. In other cases it might be better to build new assets, as the cost of increasing the output of the existing assets could be much higher. For new assets, proper attention should be given to the operational costs over the full asset life, as these can outweigh the purchase price many times over. As long as asset management is confined to the maintenance and operation function, companies will not get the full benefit of an integrated approach. They might still build or acquire new assets that are hard to manage, or that are not needed if the current assets are managed properly8. We strongly feel that this strategic part of asset management should be addressed in our research program. As a start, we made a rough division of the field of asset management into four groups:
8 At an AIChE meeting in spring 2003 in the Kurhaus (Scheveningen), BP presented the concept of the phantom plant, which was the sum of all production losses. This phantom plant proved to be the biggest of all plants.
- Institutional embedding: This deals with how public values are embedded, how interaction with the regulatory bodies is organized, which strategic goals are allocated to asset management, how they are measured and so on. It is about the mission and values of asset management.
- Internal organization: This deals with the business processes, operating models, authorization, change management, capability development and so on. The key theme is how to guarantee that the organization structure fits the mission and values.
- Operational excellence: This is about the fine-tuning of what you are doing. In our view, many of the maintenance management initiatives fall within this category.
- Contracting: This is both about the make-or-buy decision (in relation to outsourcing) and about the type of contract and the process of contracting in case the decision is to buy. This can be at all levels, from financing (PPS) to the contract itself (only services or DCOM). The theme is how to assure that you get what you want.
Crosscutting these four topics are a number of general themes that need to be researched. These are:
- Maintenance and replacement: Many infrastructures are old and might need replacement or extra maintenance. How to determine what is best, and how to prepare the organization for the significant increase of work if the assets were to be replaced?
- Human factors: In the end, it is people who determine the success of an infrastructure. How to address this properly, and what skills and competences are needed?
- Innovation and transition: The use of many infrastructures is changing over time. How to integrate this into asset management?
- IT systems and support: A key issue addressed was the availability of data. How to make certain the right data gets to the right people at the right time?
Together, these can be grouped into a matrix with the four groups (institutional embedding, internal organisation, operational excellence, contracting) as rows and the crosscutting themes (maintenance and replacement, human factors, innovation and transition, IT systems and support) as columns. This is a rough structuring of the asset management field. Based on the interviews, we feel that the most challenging and needed topics are institutional embedding and internal organization. One could brand this as strategic asset management, but we feel asset governance is a better name. After all, it is about the structure of asset management, and not about the strategic decisions. 6
CONCLUSION
Asset management as a profession has made significant progress in the Netherlands in the past years, and has acquired a strong foothold with many infrastructure managers. It has moved up from operational decisions regarding maintenance to a more tactical level regarding the maintainability of future assets, but has not (a few exceptions aside) reached the strategic level. To reach this, alignment between organizational goals and technical asset performance criteria is needed. However, almost no practitioner has really mastered this challenge so far. It is at this point that academia could provide support. Our current research with regard to asset management therefore focuses on linking asset management goals to organizational goals. The authors prefer to brand this asset governance, to differentiate it from the (maintenance and operation based) operational and tactical asset management. Acknowledgment This research was sponsored by the Next Generation Infrastructures Foundation, www.nginfra.nl.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A CASE STUDY ON CONDITION ASSESSMENT OF WATER AND SANITATION INFRASTRUCTURE Joe E. Amadi-Echendu a, Hal Belmonte b, Chris von Holdt b, and Jay Bhagwan c
a Graduate School of Technology Management, University of Pretoria, Republic of South Africa
b Aurecon, Republic of South Africa
c Water Research Commission, Republic of South Africa
The management of physical assets covers a wide scope and range of processes that include acquisition, control, use, disposal and recycling of built environment structures in a manner that satisfies the constraints imposed by business performance, environment, ergonomics, and sustainability requirements. Technologies applicable to the management of infrastructure assets for water and sanitation services are advancing rapidly, apparently influenced by advances in condition monitoring, information and communication technologies. This paper discusses condition and risk assessment of water and sanitation assets. Although inferences are drawn from available public domain literature and a non-probabilistic survey of representatives of organisations engaged in water and sanitation services, the findings reiterate that the most rapid trends are in technologies for the collection and transfer of data. We also find that the understanding and practice of asset management in water and sanitation service providers is still in its infancy, which calls into question some of the purported benefits of technology applications for such organisations. Key Words: Engineering Asset Management, Water and Sanitation Infrastructure, Technology Trends. 1
INTRODUCTION
Technologies applicable to the management of physical assets have advanced rapidly, and asset-intensive businesses can take advantage of these developments to increase operational efficiency and to provide improved products and services. Noting the considerable impact of water and sanitation on health, the economy, the environment, and society at large, a core issue for service providers is to determine the condition of extensive infrastructure that includes buried pipes, dams, pumping stations, reservoirs, reticulation, treatment and transport systems. Technology can, and should, be deployed to monitor the quality of potable water and effluents to ensure compliance with applicable health regulations. In societies with significant socio-economic disparity, there is the added imperative to establish adequate capacity for water and sanitation services, both in terms of new and existing infrastructure. For example, infrastructure planners and operators need to determine the risks and interventions required in the creation, acquisition, maintenance, operation, decommissioning, disposal and/or rehabilitation of water and sanitation assets. Capital investment, operations and maintenance, and rehabilitation of water and sanitation infrastructure have traditionally been in the realm of massive public funding, and this is placing an increasingly unbearable fiscal burden on government departments. The combined challenges of social cohesion, technological advancement and economic growth have provided incentives for increased participation by private sector investors and managers in water and sanitation services. This paper extrapolates from our review of methods, tools and techniques that are available for use in infrastructure condition assessment and risk management. Based on observed cases of water and sanitation providers in South Africa, we then summarise the extent to which available condition monitoring, information and communication technologies influence asset management activities such as condition assessment, risk analysis and predictive modelling.
Challenges
As illustrated in figure 1, for the water and sanitation sector, technology embedded in physical assets, information systems and business processes can be exploited to address wide-ranging socio-economic challenges: satisfying healthy service delivery requirements whilst concurrently minimising the environmental footprint of energy consumption, water extraction and effluent discharge, all within highly constrained capital and operational expenditure programmes. Data, information systems and communication technologies provide the means for linking the infrastructure components to the asset management processes, and thereby for resolving these challenges and achieving the business objectives of the owner/operator of the asset base.
Fig 1. Water and Sanitation Services Asset Management Model highlighting ICT Applications. (Schematic linking planning objectives, namely provision of healthy water and sanitation services and minimised environmental impact of service provision; asset management processes of acquisition, operations and maintenance, and decommissioning and rehabilitation, supported by condition monitoring, risk analysis and predictive modelling; and the assets themselves, comprising civil structures, pipe networks, mechanical equipment and electrical equipment built on materials and sensor technologies; all connected through data, information systems and communications technologies.)
2
RESEARCH
Effective decision making regarding long term planning, risk management, maintenance, operations and other asset management activities depends on the availability of appropriate data and information. Sensors, computerised systems and communication technologies provide tools for collecting condition and transactional data against asset records; these data can be processed into useful categories of information which, in turn, inform decision making. Asset management practice entails the use of information to make value-adding decisions regarding asset condition, performance and risk. A systematic, consistent and relevant technical assessment should provide condition information that enables infrastructure planners and operators to determine the risks and interventions required in the management of water and sanitation assets. The collection of pertinent data is a major task [1] and the assessment should at least:
• provide a rating of the asset condition “as found”;
• indicate the risks associated with allowing the asset to remain in the “as found” condition; and
• identify the scope of work that may be necessary to restore the asset to, and/or sustain it at, the desired condition.
Marlow et al (2007) [2] provide a comprehensive breakdown of condition monitoring tools and techniques that can be applied to equipment and structures deployed in water and wastewater services. Their study produced a set of inclusive tables that break down the various inspection tools and techniques, environmental surveys and condition monitoring techniques. Our literature review (cf., for example, Andrews (1998) [3], Randall-Smith et al (1992) [4], Billington et al (1998) [5], Snyder et al (2007) [6], Ferguson et al (2004) [7] and Stone et al (2002) [8]) reveals a myriad of techniques for sensing the desired physical parameters, as well as a number of computational models that can be applied to the prediction of asset condition and risk profile. Whereas Watson et al [9], [10] and [11] may be useful references on practice guidelines, a key gap observed in our literature review is the apparent lack of specific sets of condition indices for each category of water and sanitation infrastructure assets.
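As a rough illustration only (not drawn from the paper, and with field names and the 1-5 rating scale chosen by us), the three assessment outputs listed above could be captured in a simple record structure:

from dataclasses import dataclass, field
from typing import List

@dataclass
class ConditionAssessment:
    """Minimal record of a single 'as found' condition assessment (illustrative)."""
    asset_id: str
    condition_rating: int          # e.g. 1 (very good) .. 5 (very poor); scale is an assumption
    risks_if_unaddressed: List[str] = field(default_factory=list)
    remedial_scope: List[str] = field(default_factory=list)

    def needs_intervention(self, threshold: int = 3) -> bool:
        # Flag assets whose 'as found' rating is at or beyond the illustrative threshold
        return self.condition_rating >= threshold

# Example usage with made-up values
pump = ConditionAssessment(
    asset_id="PS-014",
    condition_rating=4,
    risks_if_unaddressed=["unplanned outage", "effluent spill"],
    remedial_scope=["replace impeller", "refurbish motor"],
)
print(pump.needs_intervention())  # True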
Following our literature review of condition and risk monitoring techniques, we focused our study on the application of these technologies by owners/operators of water and sanitation infrastructure. We developed a questionnaire to study how these techniques were applied by water and sanitation service providers in South Africa, and targeted a judgemental sample of people that included representatives of service providers, technology vendors and consultants. The service providers included 145 municipal agencies, some of which are responsible for water distribution, bulk transfer and sanitation, plus 5 companies primarily engaged in the extraction, treatment and bulk transfer of water. The range of infrastructure owned/operated by the respondents’ organisations typically included boreholes, dams, reservoirs, pump stations, treatment plants, and pipeline transfer systems. Despite concerted efforts at persuading representatives of the organisations in our geographical delineation, only 23 respondents, almost exclusively representing local municipalities, completed our questionnaire. It is worth noting that the responding municipalities serve less than 16% of households in a geographical population comprising more than 45 million people. The study was also conducted against the background of recent legislation that more or less requires government departments and public agencies to adopt and implement asset management principles and practices. The bar graphs in figure 2 show the respondents’ feedback on how often they carried out condition assessments of the infrastructure assets and what technologies were used. The respondents claim that their organisations carry out daily, monthly and yearly inspections of their assets, particularly of pump stations, pipelines and reservoirs. It was revealing that some organisations seldom carried out condition assessment of their facilities, even if only limited to visual inspections, especially given the wide range of technologies seemingly available. We were also perplexed to observe that some respondents indicated that condition assessments were “outsourced to consultants”, giving the impression that those organisations did not really pay attention to which technologies were applied.
Fig 2a. Frequency of inspections for condition assessment of infrastructure. (Bar chart; reported inspection frequencies range from daily to seldom across boreholes, dams, reservoirs, spring protection, water and sewage pump stations, water and sewage treatment works, water and sewage pipelines, and valves.)
Fig 2b. Inspection technologies for condition assessment of infrastructure. (Bar chart; reported technologies include visual and physical inspection, CCTV, SCADA, vibration analysis, motor current analysis, acoustic emissions, pipeline inspection gauges, sewer scanning and evaluation, pipe inspection real-time inspection technique (PITAT), water meters, dam safety inspections, and outsourcing to consultants.)
With regard to risk management, we approached the issue by asking the municipal organisations whether or not they measured reliability, based on the assumption that our respondents understood our definition of reliability as “the chance of pre-defined failure occurring under given conditions within a stipulated time period”. The bar graph in figure 3 suggests that fewer than half of the municipal organisations measured the reliability of the respective assets under their care. Of more concern is that the majority of respondents indicated ‘direct assessment’ as a method for measuring reliability and ‘monetary value’ as the method for risk ranking of assets. Such feedback more or less supported our a priori impression that the majority of respondents did not understand how to measure reliability or risk. In fact, less than a third of our respondents indicated that their respective organisations maintained a risk register.
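For readers unfamiliar with the distinction drawn here, a textbook way to rank risk combines the likelihood of failure with its consequence, rather than relying on monetary value alone. The sketch below is purely illustrative, not the method used by any of the surveyed municipalities, and the input scales are assumptions:

def risk_score(probability_of_failure: float, consequence_rating: float) -> float:
    """Conventional risk index: likelihood of failure times consequence of failure.

    probability_of_failure is assumed to lie in [0, 1] over a stipulated period,
    consequence_rating on an arbitrary 1-5 scale. Illustrative only.
    """
    return probability_of_failure * consequence_rating

# A low-value asset with a high chance of failing can outrank a costly but reliable one
print(risk_score(0.60, 4))   # 2.4  - e.g. a cheap borehole pump that is likely to fail
print(risk_score(0.05, 5))   # 0.25 - e.g. an expensive but well maintained treatment works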
Fig 3. ‘Direct assessment’ of reliability as a measure of risk. (Bar chart comparing, for each asset category, the number of municipalities responsible for water assets with the number that measure the reliability of those assets.)
3
DISCUSSION
The respondents’ feedback suggests visual inspection as the most common method for condition assessment; however, visual inspection can encompass a rather broad range of activities, from cursory inspections to highly detailed technical examinations utilising sophisticated instrumentation. The same applies to ‘direct assessment’ as the measure of reliability and the use of ‘monetary value’ as the basis for risk ranking. All the municipal organisations in the geographical delineation used for our case study are under pressure to prepare asset registers, especially to demonstrate financial compliance with the relevant legislation. The apparent lack of sector asset management guidelines, over and above vendor equipment standards, complicates the conduct of condition and risk assessments of water and sanitation infrastructure assets, and hence the valuation of such assets. Although the technology exists and there are examples of the application of some of the methods for condition and risk assessment, the need for an enabling environment is exacerbated by the requirement to develop new skills, and this is further compounded by weak organisational commitment to the principles and practice of engineering asset management. The overall impression from our non-probabilistic survey is that the understanding of engineering asset management is in its infancy among the water and sanitation service providers that participated in the study. With this in mind, we propose the following data progression structure to facilitate the journey in engineering asset management for such organisations.
Data level | Data type | Key Data Management Needs
Primary data | Inventory | Classification guidelines; basic attributes guidelines; data storage software (where most Water Service Providers are now)
Secondary data | Basic condition attributes | Assessment guidelines; reporting guidelines; advanced condition technology; maintenance management software; business processes
Tertiary data | Performance data/modelling | Predictive modelling methods; optimised decision making methods; benchmarking (movement in the future)
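A hypothetical encoding of the proposed progression (the enum and function names are ours, not the paper’s) might look as follows; it simply captures the idea that each level builds on the previous one:

from enum import IntEnum

class DataLevel(IntEnum):
    PRIMARY = 1    # inventory data - where most providers are now
    SECONDARY = 2  # basic condition attributes
    TERTIARY = 3   # performance data and predictive modelling

def next_step(current: DataLevel) -> DataLevel:
    """Suggest the next maturity step, capping at the tertiary level."""
    return DataLevel(min(current + 1, DataLevel.TERTIARY))

print(next_step(DataLevel.PRIMARY))  # DataLevel.SECONDARY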
4
REFERENCES
1
Strategic Asset Management, Condition Assessment. www.build.qld.gov.au/sam/sam_web/content/76_cont.htm
2
Marlow, D., Heart, S., Burn, S., Urquhart, A., Gould, S., Anderson, M., Cook, S., Ambrose, M., Madin, B. and Fitzgerald, A. (2007) Condition Assessment Strategies and Protocols for Water and Wastewater Utility Assets. WERF & AWWA Research Foundation, Report 03-CTS-20CO.
3
Andrews, M.E. (1998) Large diameter sewer condition assessment using combined sonar and CCTV equipment. APWA International Public Works Congress, NRCC/CPWA Seminar series “Innovations in Urban Infrastructure”, Las Vegas, Nevada, National Research Council of Canada, Sept 14-17 1998.
4
Randall-Smith, M., Russell, A. and Oliphant, R. (1992) Guidance Manual for the Structural Condition Assessment of Trunk Mains. WRc, UK.
5
Billington, E.D., Sack, D.A. and Olson, L.D. (1998) Sonic Pulse Velocity Testing to Assess Condition of a Concrete Dam. November 1998.
6
Snyder, G., McEwen, D., Parker, B., Donnelly, R., and Murray, R. (2007) Assessing the reliability of existing anchor installation at Loch Alva and Log Falls dams. CDA 2007 Annual Conference St. John’s, NL, Canada. September 22-27, 2007
7
Ferguson, P., Shou, S. and Vickridge, I. (2004) Condition Assessment of Water Pipes in Hong Kong. Trenchless Asia Conference, Shanghai, April 2004.
8
Stone, S., Dzuray, E.J., Meisegeier, D., Dahlborg, A., and Erickson, M. (2002) Decision-Support Tools for Predicting the Performance of Water Distribution and Wastewater Collection Systems. National Risk Management Research Laboratory, U.S. Environmental Protection Agency, Cincinnati, OH, USA.
9
Watson, T.G., Christian, C.D., Mason, A.J. and Smith, M.H. (2001) Maintenance of Water Distribution Systems. The University of Auckland, Auckland, New Zealand.
10
Guidelines for Infrastructure Asset Management in Local Government 2006-2009. Department: Provincial and Local Government, Pretoria, South Africa.
11
International Infrastructure Management Manual, International Edition. (2006) Association of Local Government Engineering NZ Inc, Institute of Public Works Engineering of Australia, Thames, New Zealand.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE ROLE OF STANDARD INFORMATION MODELS IN ROAD ASSET MANAGEMENT Daniela L. Nastasie a, Andy Koronios a
a Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM) - Systems Integration and IT, University of South Australia, Mawson Lakes SA 5095, Australia.
Business activities rely on people’s understanding and interpretation of information. Meaning (semantics) is incorporated in the way information is defined and structured. From a semantics point of view, information models range from low level semantics, such as taxonomies and data dictionaries, to high level semantics, such as formal ontologies. Low level semantics information models help humans add meaning to information in a structured way, while high level semantics information models are essential for computer-aided activities and the automation of processes. This paper discusses standard information models of relevance to the Road Asset Management sector, based on topics discussed on the IPWEA Asset Mates Forum and interviews with practitioners in Australian government agencies. Current taxonomies, guidelines and open information standards with potential use for Road Asset Management were analysed. The findings suggest that information models used in the Road Asset Management industry are mainly at the low end of the semantics scale and vary in consistency across the industry. At this stage there are no XML based industry standards specifically designed for Road Asset Management. It is recommended that the Road Asset Management sector consider designing an XML based information standard with terms and concepts specific to this industry. Existing XML standards from other sectors could be used as examples or adapted to this industry’s particular needs for overlapping areas such as finance or business reporting. Key Words: Road Asset Management, Standard Information Models, XML Information Standards
1
INTRODUCTION
A 2008 research study conducted by the Commonwealth Grants Commission identified a lack of information and data related to road assets managed by local government and emphasised the need for consistency and accuracy of local government data collections [1]. Consequently the Australian Local Government Association (ALGA) proposed funding of $20 million over the next four years to develop a national data collection framework ($7 million) and to establish and/or upgrade asset management at local government level ($13 million). One of the three main initiatives required to improve local government services is the clarification of data types and standards, consistent with the national framework developed by ALGA together with the Australian Bureau of Statistics (ABS), the Commonwealth Grants Commission, and Local Government Grants Commissions [2]. These figures demonstrate the importance of road asset management and its reliance on data and information, with a special emphasis on the local government agencies that manage more than 80% of the approximately 812,000 km Australian road network. Presently a variety of information systems are used to support the collection and analysis of data for road asset management tasks, yet even though information systems have proliferated in the road asset management arena, the expectation that smart information technologies would solve the problem of information has not been met. A survey conducted in South Australia in 1999-2000 found that an overwhelming majority of councils had invested in better “data systems” as the main measure to improve their asset management activities. At the same time, the report indicates that what council staff lacked most was “better data” [3]. These findings show that even though information technology is an essential part of managing data and information, in itself it does not provide benefits. In order to use information technology effectively and achieve the expected benefits, more attention needs to be paid to data and information. Traditionally data has been seen as secondary to the processing of data, which led to the famous GIGO (garbage in, garbage out) problem. The importance of data per se started to become clear when computer scientists realised that software applications were entirely dependent on the data processed by those applications, and they recommended a paradigm shift from applications to data [4]. This shift of power has been enabled by the maturity of the Web technologies that allow data to
become smarter. Daconta, Obrst and Smith envisaged the smart data continuum (Figure 1) as a four-stage progression from data that is proprietary to an application to data that becomes increasingly independent of applications across the Web. The first stage of the continuum is the pre-XML stage, represented by documents and data records stored in formats other than XML. The second stage corresponds to the first level of data independence, in which data related to an individual domain of practice is described using individual vocabularies represented in XML. In the third stage data is composed and classified in hierarchical taxonomies using mixed XML vocabularies from multiple domains, and in the final stage new data can be inferred from existing data across the Web following logical rules embedded in XML ontologies [4].
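To make the second stage of the continuum concrete, the sketch below (our illustration; the element names are hypothetical and not taken from any published standard) shows a small road-asset record expressed with a domain vocabulary in XML, so the data no longer depends on any one application’s proprietary format:

# Illustrative only: a road-asset record described with a domain vocabulary in XML.
import xml.etree.ElementTree as ET

segment = ET.Element("RoadSegment", id="RS-1042")
ET.SubElement(segment, "FunctionalClass").text = "Collector"
ET.SubElement(segment, "SurfaceType").text = "Sealed"
ET.SubElement(segment, "LengthKm").text = "2.35"
ET.SubElement(segment, "ConditionIndex").text = "3"

# Any application that understands the shared vocabulary can read this record
print(ET.tostring(segment, encoding="unicode"))
# <RoadSegment id="RS-1042"><FunctionalClass>Collector</FunctionalClass>...</RoadSegment>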
Figure 1- The smart data continuum [[4], Figure 1.2, pg. 3]
XML therefore plays a vital role in creating data independent of applications in the Web environment. While XML is the format that allows data to become independent of applications, the content represented in XML format is structured according to particular information models. Information models play an important role in attaching meaning (semantics) to data, as seen in Figure 2, which presents information models on a semantics spectrum, from low level semantics information models, such as taxonomies and data dictionaries, to high level semantics information models, such as formal ontologies [5]. For humans, the information models easiest to understand are the ones with the lowest level of formality, using natural language and located at the weak semantics end of the Ontology Spectrum. Computers need information models at the high end of the Ontology Spectrum in order to perform automated tasks. One way to create information models with more explicit semantics is to develop stronger semantics information models based on existing low level semantics information models, such as taxonomies and industrial categorisation standards. The main advantages of this approach are that industrial information standards contain a multitude of hierarchically organised concept definitions and reflect a degree of community consensus, which makes their adoption and diffusion easier [6, 7].
Figure 2- Ontology Spectrum [[5], Fig. 1, pg. 367]
The following sections discuss the findings from an exploratory study into Road Asset Management information models and emphasise the role of XML information standards in the new Web enabled information environment.
2
RESEARCH METHODOLOGY AND FINDINGS
This paper is based on an exploratory study into the information models of relevance to Road Asset Management in Australia. Data triangulation was used in order to enhance the understanding of the issues related to the role of information models in Road Asset Management. Empirical evidence was collected from three different types of sources: the Asset Mates Discussion Forum on the IPWEA (Institute of Public Works Engineering Australia) website, government publications related to information standards of relevance to Road Asset Management, and interviews with 14 Road Asset Management practitioners from local and state level government agencies in South Australia. The interviews were conducted using a semi-structured questionnaire based on the HTE (Human, Technology, Environment) model [8]. A summary of findings follows.
2.1 Findings from the IPWEA NAMS.AU AssetMates Forum
The IPWEA National Asset Management Strategy Committee (NAMS.AU) provides national leadership in community infrastructure management and supplies resources to assist asset management practitioners. AssetMates is an IPWEA NAMS.AU web forum that covers all aspects of asset management, such as asset management planning issues, discussion of the various asset classes, accounting for assets, condition assessment, information systems, levels of service, and a general discussion area. This paper analysed topics related to Asset Management Terminology posted on the IPWEA Asset Mates Forum between July 2004 and March 2009 to discover issues related to data and information in Road Asset Management. Three main topics (Asset Classes, Information Systems and General AM Issues) included 8 threads related to road classification hierarchies, asset data structures, asset data collection and third party data collection, containing a total of 118 messages, as shown in Table 1.
Table 1 IPWEA NAMS.AU Asset Mates forum information related topics as at 14 April 2009 [Source- Authors]
Topic | Threads | First message | Last message | Brief description | Number of messages
Asset Classes | Road classification hierarchy | 1 May 2008 | 23 March 2009 | Discusses the methodology used for determining road classification hierarchy | 8
Asset Classes | Asset class definitions | 23 December 2008 | 01 April 2009 | Discusses the meaning of ‘asset classes’ | 7
Information Systems | Asset Data collection methods | 13 July 2004 | 28 October 2008 | Discusses data collection hardware and software | 34
Information Systems | Asset Data Structure | 14 September 2006 | 19 December 2006 | Discusses Asset Management data structures in relation to business processes and differences in localisation or fields (Asset Register in Finance as opposed to Asset Register in Asset Management Systems) | 6
Information Systems | Spatial Representation of Assets in the GIS | 29 November 2006 | 30 November 2006 | Discusses linear vs. spatial representation of roads and road assets in connection to GIS | 7
Information Systems | Road Inventory Data Collection | 31 January 2007 | 01 February 2007 | Discusses tenders for the videoing and inventory data collection of road networks and associated infrastructure (signs, footpaths, etc.) | 4
General AM Issues | Definitions and Terminology | 04 December 2008 | 15 December 2008 | Discusses definitions of basic terms related to Asset Management | 35
General AM Issues | As Constructed Asset Information from developers | 28 October 2008 | 28 October 2008 | Discusses issues that Local Governments have in getting as-constructed asset information from developers at appropriate times and how they get around this issue; ADAC and D-Spec are the recommended information standards | 17
The IPWEA NAMS.AU supports the application of a nationally consistent framework for infrastructure asset management planning and encourages all entities responsible for managing service delivery from infrastructure assets to adopt the structure and framework for asset management planning in accordance with the International Infrastructure Management Manual (IIMM).
2.2 Findings from interviews with Road Asset Management practitioners
Interviews with 14 practitioners in South Australia (10 from local councils and 4 from state road authorities) revealed that practitioners have a strong view on the role of information models in Road Asset Management. As shown in Table 2, the respondents have a good understanding of the benefits provided by standard information models, and at the same time they are aware of some of the issues that information standardisation might bring along. The general view is that standard information models would assist with benchmarking, data management, shared knowledge, transfer of skills and many other areas, as presented in Table 2. On the other hand, issues related to agreement on terminology, resistance to change, organisational politics and lack of expertise are among the concerns that need to be addressed before these standards can be created and implemented at industry level. Another finding is that, due to the plethora of information standards developed at various administrative levels, practitioners find it very difficult to know which information standards are available and which are better suited to particular business needs. This has led to the creation of several information models serving the same purpose, such as the various functional road classifications for different states, or the ADAC (As Designed as Constructed) and D-Spec (Developer Specifications for the Delivery of Digital Data to Local Government) standards for the common specification to supply digital data from designers to local councils.
Table 2 Drivers and inhibitors for the diffusion of consistent information structures [Source- Authors]
Benefits: access control, accountability, benchmarking, collaboration, data access, data analysis, data collection, data consistency, data control, data independence, data integration, data maintenance, data structure, decision making, efficiency, gaining grants, improved services, information management, more useful data, reduced costs, reporting, shared knowledge, strategic planning, third party information exchange, transfer of skills, transparency
Issues: agreement on terminology, budget, business changes, configuration issues, data representation, data re-structuring, definition issues, different interpretations, lack of expertise, local politics perspective, reduced flexibility, resistance to change, technology requirements, wasted knowledge
2.3 Findings from the Road Asset Management Information Standards
Standards adoption has been considered a sign of industry maturation [9]. Road Asset Management, as part of a holistic view of Asset Management, is a relatively new industry [10]; therefore the adoption of standards is still at an early stage. Austroads, the association of Australian and New Zealand road transport and traffic authorities, first introduced the concept of Total Asset Management to the management of road networks in 1994, when the Austroads Road Asset Management
Guidelines was published. This study examined Austroads publications related to Road Asset Management, as well as road classifications such as NAASRA and government initiatives such as the Queensland Road Alliance, the Australian National Transport Data Framework and the Australian Government Information Interoperability Framework. A brief description of the most relevant government initiatives and industry standards is presented in Table 3. The findings suggest that information models used in the Road Asset Management industry are mainly at the low end of the semantics scale and vary in consistency across the industry. At this stage there are no XML based industry standards specifically designed for Road Asset Management.
3
DISCUSSION
In the Road Asset Management sector the Web is currently used for sharing information at industry level from the top down, with Austroads and IPWEA publishing documents online that local and state road authorities can access on a regular basis. Some organisations use their Intranet to share data created in a local knowledge base that can be accessed by local staff. The IPWEA Asset Mates forum also uses the Web for the dynamic exchange of information over the Internet, fostering knowledge sharing at industry level through web forums that respond to users’ needs. The Web is therefore already present in Road Asset Management activities, but it is not currently used for automatic dynamic interaction, such as publishing and sharing real-time information of relevance to internal or third party agencies. One reason for this lack of real-time information sharing is the absence of a consistent information model for collecting and storing this type of information at industry level. The vast majority of road authorities use information systems to store their data, but they all use different information models to do so. Sharing and re-using information over the Internet will be much more difficult to achieve if the same type of information is stored in different information models. An increasing number of information standards, models, frameworks and guidelines have been created in recent years to deal with this issue (Table 3), but this makes it very difficult for practitioners to select the particular one they need, and there is at present no systematic approach to defining these standard information models. From the review of the literature and of other industries’ information models, it can be concluded that XML is the format required for information standards that are to be widely adopted, as this format can be used over the Internet, the largest of the available communication networks. XML based industry information standards are therefore the first step in creating smart data independent of applications, providing benefits in terms of data interoperability as well as benefits derived from direct and indirect network effects (direct network effects coming from an increasingly larger communication network, and indirect network effects [14] coming from the low price of Internet and Web hardware). In order to take advantage of the full potential of the Web, industry specific standards have been developed and implemented in various industries: ACORD in insurance, CIDX for chemicals, ebXML to exchange business messages, XBRL in business reporting, PIDX for petroleum and global energy, FIX in financial services (security transactions), FpML for financial derivatives instruments, MISMO for the mortgage industry, and RosettaNet in electronics and high tech are all examples of XML based industry standards. XML has become the de facto standard for writing industry standards in many areas of practice. An increasing number of information standards originally written in natural language have been translated into XML or are created directly in XML format to increase interoperability at industry level [15]. Information models at industry level require considerable effort to create, and one of the most tedious tasks is agreeing on definitions of terms and concepts. This idea is demonstrated by the findings from analysing the IPWEA Asset Mates forum.
As presented in Table 1, it can be noted that the threads Definitions and Terminology and Asset Data Collection methods attracted the most interest from the forum participants. The thread Definitions and Terminology is of particular relevance to this study and was further analysed to get a better understanding of the issues. One of the main problems is the interpretation of the definition of various terms, such as: Asset Hierarchy, Asset Class, Asset Category, Asset Group, Asset type, Asset component, Asset attribute, Asset inventory, Asset register, Road inventory, Road register. To exemplify the issues with only two terms, Asset Register and Asset Inventory, there are several views on the data each of these storage devices contains. The International Infrastructure Management Manual (IIMM) defines an Asset Register as "a record of asset information considered worthy of separate identification including inventory, historical, financial, condition, construction, technical, and financial information about each....." [11]. This definition can be interpreted in various ways, as follows:
Forum Participant 1: ‘…An Asset Inventory would record the number of assets of a given type/capacity etc, owned by the entity, while the Register would record each of these assets separately, and include information specific to that individual asset, such as location, condition and so on’
Forum Participant 2: ‘… The Asset Register is a financial instrument that identifies all of the assets under the control of the organisation.... The detailed data is what you would hold in your asset inventory.’
Forum Participant 3: ‘An asset register contains assets above a threshold while an inventory contains assets (all assets or just those below the threshold) which are worth tracking individually (such as mobile phones) despite their lower value.’
Forum Participant 4: ‘Asset Register is developed using the Asset Inventory data. Asset inventory is recording data and continuously updating by collecting more and more data to increase the quality of the asset register. I also think that asset register has to include all the assets identified by the inventory.’
Table 3 Road Asset Management standards and guidelines [Source- Authors]

International Infrastructure Management Manual (IIMM), 3rd ed., 2006 [11]. Issuing body: Association of Local Government Engineering NZ Inc (INGENIUM) and the Institute of Public Works Engineering of Australia (IPWEA). Description: IIMM includes a glossary of 113 definitions related to asset management, from operational and maintenance terms to management and finance activities. Asset hierarchies are provided as examples in appendixes. The terms and relationships between them are clearly defined, but only as examples and at a very high level of semantics, leaving organisations the freedom to add sub-components according to their needs. Physical assets are grouped by functionality (service area) and type (components). The Road Assets hierarchy consists of roads and structures, with service areas (land carriageway, footpaths, cycleways, etc.) classified under roads, and bridge or retaining structure under structures. The components differ between roads and structures.

BSI PAS 55, 2nd ed., 2008 [12], [13]. Issuing body: Published by BSI British Standards and distributed through The Institute of Asset Management (IAM), UK. Description: PAS 55 has been designed as a specification providing guidance on good practice in all aspects of managing physical assets, including acquiring, owning and disposing of physical assets. PAS 55 is not a standard per se, describing mainly the processes and steps involved in managing physical assets. It does not contain much detail about definitions of terms or relationships between them. The first edition was published in 2004 in 2 parts: Part 1: specification for the optimized management of physical infrastructure assets, and Part 2: guidelines for the application of PAS 55-1.

Australian Infrastructure Financial Management Guidelines (work in progress). Issuing body: The IPWEA National Asset Management Strategy Committee (NAMS.AU). Description: 8 position papers have been prepared and posted on the IPWEA website for comments, as part of the background to the development of the new national guidelines for financial management of infrastructure.

New South Wales Road Classification. Issuing body: New South Wales Roads and Traffic Authority (RTA). Description: The NSW road classification divides roads into State Roads (Freeways and Primary Arterials), Regional Roads (Secondary or Sub Arterials) and Local Roads (Collector and Local Access Roads). A review of road classification in NSW was done in 2004-2005. A total of 2,249 km of roads were considered for reclassification, but the main roads hierarchy remained the same.

Queensland Road Classification. Issuing body: Queensland Department of Main Roads (DMRQ). Description: The QLD road classification consists of five main categories: National Highways, State Strategic Roads, Regional Roads, District Roads and Local Government Roads. The first 4 categories are State Controlled Roads and their management is regulated by the Transport Infrastructure Act 1994.

Victoria Road Classification. Issuing body: VicRoads. Description: Road classification in Victoria is based on the Road Management Act 2004. It consists of Freeways and Arterial Roads managed by State Government (VicRoads) and Municipal Roads managed by Local Government agencies.

NAASRA Classification for Roads Management (used in ACT, NT, SA, TAS and WA). Issuing body: The National Association of Australian State Road Authorities (NAASRA); the Association of Australian and New Zealand Road transport and traffic authorities (Austroads). Description: The NAASRA classification separates roads by functionality, replacing the State classifications based on legislated definitions. It is used by road management authorities to define the road types currently eligible for the Commonwealth Grants Commission (CGC). Variations of the NAASRA classification for road management are currently used in ACT, NT, SA, TAS and WA. The NAASRA classification consists of 9 classes separated into 2 groups: Rural Roads (classes 1-5) and Urban Roads (classes 6-9).

The National Transport Data Framework (Australia). Issuing body: The Australian Transport Council (ATC). Description: This framework was recommended by the National Transport Data Working Group (NTD-WG) in 2004 in order to facilitate land transport planning at the national, state, territory and local government level. At the heart of the NTDF will be a website designed as a central portal with an open public interface allowing access to individual data collections according to various access restrictions. The data holdings should include Foundation data (Category ‘A’), New structured data (Category ‘B’) and Research data (Category ‘C’). Each Foundation data collection would have at least three layers: a public layer, a subscriber layer, and a private layer.

National Local Roads Data System. Issuing body: Australian Local Government Association (ALGA). Description: The National Local Roads Data System (NLRDS) was designed by the Australian Local Government Association to aggregate existing sources of local road information in order to provide a simple, consolidated, national local roads data system. NLRDS uses the following performance measures: sealing of gravel roads; state of the asset; expenditure on roads and bridges; expenditure on roads and bridges per km for unsealed roads; lengths of unsealed roads, data used in performance measure; road asset consumption; road asset sustainability; road safety.

Qld Road Alliance. Issuing body: Queensland Department of Main Roads (DMRQ) and LGAQ. Description: The Queensland Road Alliance was established in 2002 as a partnership between Queensland's state and local governments to jointly manage about 32,000 km of Queensland roads. The Road Alliance promotes sound asset management practices that will require a minimum set of road data inputs and outputs that are consistent statewide, for the primary purpose of providing a relative ranking score for each road segment. The Road Alliance aims to provide a comparison, at a strategic network level, of road conditions across the state.

AP-R202/02 Austroads Integrated Asset Management Guidelines for Road Networks. Issuing body: The Association of Australian and New Zealand Road transport and traffic authorities (Austroads). Description: These guidelines refer to the development and implementation of an Integrated Asset Management framework for managing road networks. The document regards a road network as a major asset that needs to be managed from 3 perspectives: financial, performance, and deterioration. It is suggested that the asset management of a road network should focus on formations (cuttings and embankments including the subgrade), drainage, pavements (the road surfacing and structural layers that support the traffic loading), bridges, traffic control equipment such as signals, and roadside ITS installations, etc.

AP-R204/02 Austroads Road Network Asset Management: International Benchmarking Study. Issuing body: Austroads. Description: The report describes a 1999 benchmarking study regarding the road asset management decision processes in 12 international road agencies, including Australia. It compares practices in strategic planning against the generic business processes presented in the Integrated Asset Management Guidelines for Road Networks. Among the opportunities for improvement: integration of processes and systems and documentation of Integrated Asset Management processes.

AP-G84/04 Best practices in road use data collection, analysis and reporting. Issuing body: Austroads. Description: It discusses topics of vehicle detection and classification. The Austroads 12 bin classification by axle configuration is considered stable and well received by road authorities. The issue of vehicle classification by lengths into 3, 4 or 5 bins is also discussed and current practices are presented but not harmonised. Data integration and accessibility of road use data from multiple sources is considered, as many stakeholders are involved.

AGAM01/06 Guide to Asset Management Part 1: Introduction to Asset Management. Issuing body: Austroads. Description: This guide focuses on the management of the physical road assets, including a range of coordinated activities such as transport planning, design, implementation and operations. The Guide complements the Austroads publication AP-R202/02 Integrated Asset Management Guidelines for Road Networks.

Austroads Guide to traffic engineering practice - Traffic studies. Issuing body: Austroads. Description: This guide is a collection of traffic studies related to collecting and analysing data related to traffic. It includes a classification of Parking Data types into on-street and off-street. The Guide is a reference for the Glossary of Austroads Terms.

AP-R292/06 A Review of Road Use Data Integration and Management Models. Issuing body: Austroads. Description: Some of the types of road use data considered for analysis are: traffic volume, traffic flow, traffic composition, weigh-in-motion, and adjustment factors. Recommendations: no new data framework is required; there is a need to link road use data with network performance data; road authorities need to remain abreast of the developments at the NTDF (National Transport Data Framework) and NDN (National Data Network); road authorities should work together with the private sector on the development of industry standard data protocols.

AP-R293/06 A Review of Road Use Data Pricing, Partnerships and Accessibility. Issuing body: Austroads. Description: It is recommended that road authorities become partners in aggregators established to meet the requirements outside the scope of road authorities. Likely partners in the aggregator would include map data providers, telematics service providers, media organisations, motorist organisations, toll road operators, current commercial traffic information providers and telecommunication providers.

AP-C87/08 Glossary of Austroads Terms. Issuing body: Austroads. Description: This Glossary includes terms and definitions relevant to Austroads members and road and transport industry practitioners, as well as a list of organisational acronyms. The Glossary has 142 pages of entries. The Austroads Glossary is planned to be continually checked and new terms, or definitions, included as deemed necessary.

Australian Government Information Interoperability Framework. Issuing body: Australian Government Information Management Office (AGIMO). Description: The main goal of the Australian Government Interoperability Framework is that ‘information that is generated and held by government will be valued and managed as a national strategic asset’. The foundations of Information Interoperability are based on Information Management principles. The Technical Interoperability Framework specifies a conceptual model and the agreed technical standards that support collaboration between Australian government agencies, such as: XML, UNICODE, AGLS, RDF, GIF, XML schema (DTD), BPEL4WS.

Australian Government Locator Service (AGLS) Metadata Standard - Australian Standard AS 5044. Issuing body: National Archives of Australia. Description: The AGLS Metadata Standard is a set of 19 descriptive elements based on the Dublin Core Metadata Element Set (DCMES). The AGLS Metadata Element Set is intended to be used by Australian government departments and agencies to improve the visibility and accessibility of their web services.
The meaning of these two terms, Asset Register and Asset Inventory, therefore has important consequences for what type of data is collected and how it is stored, and consequently for what data is available for use. For instance, if the asset register contains only assets above a [financial] threshold, as proposed by Forum Participant 3, then it cannot be developed using the Asset Inventory data, as proposed by Forum Participant 4, because the two storage devices will contain different types of data. As important as information standards are, one thing needs to be remembered: increasing the number of information standards makes it very difficult for practitioners to select the particular one they need. This study concludes that information standards are better created and adopted at industry level rather than at agency or road authority level, as this can reduce the number of information standards required and also the number of mappings required between agencies. With the increase in sharing information over the Internet, it becomes important to create industry information standards for common business tasks and processes, and to limit the number of local information models to unique circumstances that cannot fit in the general industry model.
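To make the consequence concrete, the sketch below encodes only one of the interpretations quoted above (Forum Participant 1's); the class and field names are ours. Under this reading the inventory can be derived from the register, whereas under Forum Participant 3's reading the two structures would hold different populations of assets:

from collections import Counter
from dataclasses import dataclass

@dataclass
class RegisteredAsset:
    """One record per individual asset, as in Forum Participant 1's reading (illustrative)."""
    asset_id: str
    asset_type: str
    location: str
    condition: str

asset_register = [
    RegisteredAsset("BH-001", "borehole", "Ward 3", "fair"),
    RegisteredAsset("BH-002", "borehole", "Ward 7", "poor"),
    RegisteredAsset("RES-01", "reservoir", "Ward 3", "good"),
]

# Under this reading, the inventory simply aggregates counts by asset type ...
asset_inventory = Counter(a.asset_type for a in asset_register)
print(asset_inventory)  # Counter({'borehole': 2, 'reservoir': 1})
# ... so it is derivable from the register; with a financial-threshold register
# (Forum Participant 3's reading) this derivation would no longer hold.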
4
CONCLUSION AND RECOMMENDATIONS
Standard information models are expected to play an important role in the information environment based on Web technologies. XML has become the de facto standard for writing industry information standards in many areas of practice. Mature industries such as finance, business reporting and electronics have developed international standards based on XML. Road Asset Management is a relatively young industry, and therefore its information models are found mainly at the weak semantics end of the spectrum, in the form of asset classifications, road hierarchies or industry standards. These information models have been created by various user groups to meet local needs, and so far there is no consistency of information models across the industry at the national level. There are different taxonomies serving the same purpose (e.g. road classifications) that are built on different concepts and are used by various local or state road authorities. Currently there are no XML based Road Asset Management information models at industry level. Standard information models are welcomed by practitioners, as there are many benefits associated with them, in addition to their usefulness for the Web environment. At the same time, the findings suggest that the industry faces many issues related to creating and implementing such information models at industry level. The effort of creating information models for this industry requires agreement on definitions and concepts, which is considered to be an issue. Recommendations emerging from the findings of this study are: there is a recognised need for standard information models at industry level; preferably these information standards would be created in XML format, to allow for interoperability at industry and inter-industry level using Web technologies; the information models should define tasks and business processes specific to Road Asset Management, while for overlapping areas such as finance and business reporting, existing XML information standards should be carefully considered at national level and adapted to the RAM industry's needs. New information standards should be created only for the specific industry needs not covered by any existing information standards.
5
REFERENCES
1
Australian Government Productivity Commission (2008) Assessing Local Government Revenue Raising Capacity, Research Report: Canberra.
2
Australian Local Government Association (ALGA) (2009) 2009-2010 Budget Submission: Securing Australia’s Economic and Social Future: Canberra.
3
Burns, P., Roorda, J. & Hope, D. (2001) A Wealth of Opportunities: A Report on the Potential from Infrastructure Asset Management in South Australian Local Government, Contents and Glossary. Local Government Infrastructure Management Group.
4
Daconta, M.C., Obrst, L.J. & Smith, K.T. (2003) The Semantic Web: a guide to the future of XML, Web services, and knowledge management. 1st ed. Indianapolis, Ind. : Wiley Publishing, Inc.
5
Obrst, L. (2003) Ontologies for semantically interoperable systems in Proceedings of the twelfth international conference on Information and knowledge management. New Orleans, LA, USA: ACM.
6
Hepp, M. & de Bruijn, J. (2007) GenTax: A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications, Thesauri, and Inconsistent Taxonomies, in The Semantic Web: Research and Applications. p. 129-144.
7
Hepp, M. (2006) Products and Services Ontologies: A Methodology for Deriving OWL Ontologies from Industrial Categorization Standards. International Journal on Semantic Web & Information Systems (IJSWIS) 2, 72-99
8
Nastasie, D.L., Koronios, A. & Sandhu, K. (2008) Factors Influencing the Diffusion of Ontologies in Road Asset Management- A Preliminary Conceptual Model in Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM-IMS). Beijing, China: Springer-Verlag London Ltd.
9
Bloomberg, J. & Schmelzer, R. (2006) Service Orient or Be Doomed: How Service Orientation Will Change Your Business. John Wiley & Sons, Inc.
10
Mihai, F., Binning, N. & Dowling, L. (2000) Road Network Asset Management as a Business Process, in REAAA Conference: 4-9 September, Japan.
11
INGENIUM & IPWEA (2006) International Infrastructure Management Manual. 3rd ed. Thames, New Zealand: Association of Local Government Engineering NZ Inc (INGENIUM).
12
Institute of Asset Management (IAM) (2004) PAS 55-1 Asset Management Part 1: specification for the optimized management of physical infrastructure assets. British Standards Institution (BSI): London, UK.
13
Institute of Asset Management (IAM) (2004) PAS 55-2 Asset Management Part 2: guidelines for the application of PAS 55-1. British Standards Institution (BSI): London, UK.
14
Katz, M.L. & Shapiro, C. (1985) Network Externalities, Competition, and Compatibility. The American Economic Review, 75(3), 424-440.
15
Koronios, A., Nastasie, D., Chanana, V. & Haider, A. (2007) Integration Through Standards – an Overview of International Standards for Engineering Asset Management, in Second World Congress on Engineering Asset Management, 11-14 June 2007: Harrogate, United Kingdom.
Acknowledgements The authors are very grateful to the Road Asset Management experts who agreed to be interviewed for this study. Gaining insight into the issues of information standardisation from the practitioners who work in this industry has been extremely valuable. This paper was developed within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government's Cooperative Research Centre Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE DIFFUSION OF STANDARD INFORMATION MODELS IN ROAD ASSET MANAGEMENT - A STUDY BASED ON THE HUMAN-TECHNOLOGY-ENVIRONMENT MODEL Daniela L. Nastasie a, b, Andy Koronios a, b
a Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), Brisbane, Australia
b Systems Integration and IT, University of South Australia, Mawson Lakes SA 5095, Australia.
This paper reports on findings from the first stage of an exploratory study into the factors that influence the diffusion of ontologies in the Road Asset Management sector in Australia. The study investigates issues related to the diffusion of standard information models (taxonomies, road classifications and hierarchies, as well as various information systems conceptual schemas) in Road Asset Management. Individual and group interviews were conducted with 14 industry experts in four South Australian road authorities at state and local government level. The qualitative analysis of the findings is based on the preliminary HTE (human, technology, environment) conceptual model [1]. The findings suggest that the diffusion of standard information models at industry level is a complex process that combines human characteristics with technology and environment characteristics. Key Words: Road Asset Management, Diffusion of Information Standards, HTE model
1
INTRODUCTION
The information environment of the world has changed rapidly in the last few decades due to the proliferation of the Internet. Forty years after the creation of the ARPANET (the predecessor of the Internet), the Internet has become one of the most common information exchange environments around the globe. Digital content has revolutionised the way information is used, and the need for faster and wider information communication channels has led to an increasing demand for broadband Internet access. The OECD considers broadband to have a similar impact on economic activities as electricity and the internal combustion engine [2]. The potential of the new information environment is still to be fully exploited, but one particular application, the Web, has become as widely spread as the Internet itself. Among the Web technologies, XML (Extensible Markup Language) enables data to become independent of the applications that create or store it, changing the focus of attention from software to the information itself for the first time in the history of information management [3]. According to Sir Tim Berners-Lee, the creator of the Web, the future information environment will evolve around the semantics (meaning) of data and information and will be based on high level information models such as ontologies. Standard information models (taxonomies, classifications, hierarchies) and information standards in general are expected to play an important role in the semantic information environment. The standard information models supporting the collection and storage of data are expected to provide the first step towards building ontologies [4], which would support semantic interaction across the Web. A more detailed analysis of the importance of standard information models in the Web environment, discussing XML and information models described in XML, can be found in [1, 5]. Effective Road Asset Management requires large quantities of data, from road inventory and condition data to road usage and financial data. Various information systems are used to support the collection and analysis of data to perform road asset management tasks [6]. In 2008 the Australian Local Government Association proposed funding of $20 million over the next four years to develop a national data collection framework ($7 million) and to establish and/or upgrade asset management at local government level ($13 million) [7]. These figures demonstrate the importance assigned by the Australian government to data and information in relation to asset management. While road and asset related digital information is gathered in increasingly large quantities, these collections of data are in many instances very difficult to analyse. The new Web environment presents opportunities for Road Asset Management in terms of dealing with large quantities of data, information management and automation of tasks. Information standardisation is one way of dealing with large amounts of data. Standard information models adopted at industry level would support the
exchange of information and sharing of knowledge between practitioners inside the Road Asset Management sector and with third party organisations. At the same time, standard information models adopted at industry level would support the development of more formal logical structures, such as ontologies, that could allow semantic interaction over the Internet. This study investigated the factors that influence the adoption and implementation of standard information models in four government agencies in South Australia. 2
RELATED WORKS
Road Asset Management is heavily dependent on data and information. Information standards are discussed in the Information Systems literature as two separate groups: horizontal IT standards and vertical IS standards [8-10]. Markus and collaborators [8] differentiate the two groups as follows: 'In contrast to horizontal IT standards, which concern the characteristics of IT products and apply to users in many industries, vertical IS standards focus on data structures and definitions, document formats, and business processes and address business problems unique to particular industries.' This paper uses the term standard information model to refer to potential vertical IS information models used by the various Asset Management Information Systems, as well as information models that support classifications and hierarchies used in Road Asset Management. This approach is necessary because the Road Asset Management industry is too young to have developed vertical IS standards yet. Recent research covering Engineering Asset Management information standards [11] discusses the plethora of information standards that could be developed into vertical IS standards for this industry, with the most promising candidates being the information standards written in XML format. In the new Web-based information environment, standard information models are expected to play an important role. Current Information Systems research analyses the information models used in the Road Asset Management industry in order to get a better understanding of their current status [5]. Once these information models are created, their adoption and implementation becomes very important because, according to the Network Effects theory [12], the benefits they provide increase with the scale of the network that adopts them. Research is underway to study the adoption and implementation of standard information models in order to predict the adoption of ontologies in Road Asset Management [1]. A three-dimensional conceptual model looking at Human, Technology and Environment characteristics (HTE) was designed to support the exploratory study in its first stage. The HTE model is based on a synthesis of constructs from Information Systems theories such as the Diffusion of Innovations (DOI) theory [13], the Technology Organization Environment (TOE) framework [14], the Technology Acceptance Model (TAM) [15, 16] and related theories, Network Effects (NE) [12], Collective Action Theory (CAT) [17] and the Public Goods Theory [18]. This model has been used to structure the data collection for the South Australian study. 3
RESEARCH METHODOLOGY AND FINDINGS
This paper is based on findings from an exploratory study conducted in South Australia between October 2008 and March 2009. In order to understand how standard information models (taxonomies, road classifications and hierarchies, information systems conceptual schemas, etc.) are adopted and implemented in the Road Asset Management industry, researchers [1] proposed a three-dimensional conceptual model based on human (H), technology (T) and environment (E) characteristics, the HTE model, as shown in Figure 1. A semi-structured interview questionnaire containing nine questions derived from the HTE model (with the ninth question open, asking for any other comments) was designed, as presented in Appendix 1. The questionnaire was used in 5 individual and 3 group interviews conducted with 14 industry experts in 4 South Australian government agencies at state and local government level. The transcripts from the interviews were analysed using the NVivo 8 software and the findings were used to revise the HTE model. The findings show that the three stages of the diffusion of consistent information structures (adoption, implementation and dissemination) are supported by HTE drivers and held back by HTE inhibitors, as shown in Figure 2.
3.1 RESEARCH FINDINGS
Findings from this study show that:
• Complex information systems pass the adoption stage raising high expectations, but they create big problems or even fail to deliver at the implementation and diffusion stages if not enough resources exist to manage their complexity.
• Organisations tend to be more cautious in adopting new information technologies. Early adopters of Information Systems now prefer to wait longer before they introduce new information technologies, because most of their experiences as early adopters were negative. If the implementation is perceived as difficult, the findings show that people are reluctant to become early adopters.
• External influences can be direct (as in government policies or influences from other councils) or indirect (through the people who move from one organisation to another). People with prior experience in the same area in other organisations bring a new way of looking at things and they can influence the new environment. While the impact of
people on organisations is obvious through the decisions they make day by day, the environment (organisation) influences people as well through the work practices in place, which remain imprinted in people's minds long after they leave the organisation. This is an example of prior experience influencing an environment characteristic (business process).
Figure 1 - HTE conceptual model [[1], fig. 4, p. 1173]
Figure 2 - HTE model with drivers and inhibitors [Source: Authors]
• One of the main issues is how to get agreement. Some of the factors that hinder the process of agreement on shared terminologies in Road Asset Management are:
  o the plethora of Information Systems that have different conceptual schemas behind them, and hence different concepts and terms;
  o the historical development of information management in each organisation, with different business processes responding to the same needs.
In relation to ontologies, the issue of agreement has been raised in the past by researchers trying to reach agreement at the design stage of ontologies [19] or investigating the automation of agreements between agents using different ontologies [20]. So far there have been no studies on reaching agreement on taxonomies at industry level.
• Agreement on taxonomies. The majority of respondents pointed out that agreement on terminology is very much needed, but very difficult to achieve. Getting agreement on terminology is an essential step in building ontologies at industry level, and more research is needed to understand how agreement can be achieved.
• Interoperability at industry level. Practitioners tend to connect more with peers who work on similar tasks and have similar issues at industry level than with staff from other areas in the same organisation. This leads to the idea that while information interoperability at enterprise level is important from the enterprise point of view, communication at industry level is very important for performing business tasks effectively and efficiently. A number of interviewees reported that using different taxonomies for similar tasks creates real problems with benchmarking and communication at industry level. If one organisation uses Confirm and the other uses Hansen as their Asset Management Information System, then the way they collect and store information about their assets is different. While it is not imperative that these two systems 'talk' to each other, it is very important that people discuss their issues based on similar concepts.
• Taxonomies and metadata. Taxonomies provide relevant metadata to be associated with the information contained in documents. This allows for a consistent and systematic description of all the information contained in documents, and hence for a reliable exchange of information at industry level. A variety of taxonomies are required to cover the different needs of the industry. Consistent taxonomies at industry level are the basis of formal ontologies.
• Human control over information decreases as the semantic level of the information structures increases. Once people do not understand 'how the system works' they stop using it. This can make the new technology useless or used at a lower capacity, as people need to have control over the data and information at all times.
• Politics have a big influence on the adoption and implementation of information standards. Besides the technical issues that have been investigated in the Computer Science arena, practitioners report social, economic and legal difficulties, a view supported by recent research in developing relevant ontologies [21].
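To make the idea of a shared industry-level taxonomy concrete, the following minimal sketch (a hypothetical illustration, not part of the reported study) shows how one shared concept and the differing local terms used by individual asset management systems could be recorded in a machine-readable form as a first step towards an ontology. It assumes the Python rdflib library, and the concept and labels are invented for illustration.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

# Hypothetical namespace for an industry-level Road Asset Management taxonomy
RAM = Namespace("http://example.org/ram-taxonomy#")

g = Graph()
g.bind("skos", SKOS)
g.bind("ram", RAM)

# One shared concept with a preferred industry-wide label
g.add((RAM.KerbAndChannel, RDF.type, SKOS.Concept))
g.add((RAM.KerbAndChannel, SKOS.prefLabel, Literal("Kerb and channel", lang="en")))

# Local terms used by different asset management systems are kept as
# alternative labels, so benchmarking queries can resolve either term to
# the same shared concept (both local labels are hypothetical examples).
g.add((RAM.KerbAndChannel, SKOS.altLabel, Literal("Kerb & gutter")))
g.add((RAM.KerbAndChannel, SKOS.altLabel, Literal("Edge restraint")))

# A broader term places the concept in the hierarchy of the taxonomy
g.add((RAM.KerbAndChannel, SKOS.broader, RAM.RoadsideAsset))

print(g.serialize(format="turtle"))

A representation of this kind keeps the agreement problem explicit: the shared preferred label is the term that must be negotiated at industry level, while the alternative labels record the local vocabularies that already exist.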
3.2 DATA ANALYSIS
Each of the three stages of the diffusion of standard information models (adoption, implementation and dissemination) is under the influence of two groups of factors: the HTE drivers, which motivate the diffusion of information models, and the HTE inhibitors, which can hinder or stall the diffusion process, as presented in Figure 2. The drivers and inhibitors from the findings of this study have then been classified according to the HTE model [1], as shown in Figure 3 (HTE drivers) and Figure 4 (HTE inhibitors). From a visual analysis of the two graphics representing the drivers and inhibitors, it can be noticed that the HTE drivers outnumber the HTE inhibitors. Even though the number of factors does not necessarily reflect the importance of the factors themselves, it provides an indication of the importance that standard information models are considered to play in Road Asset Management.
Figure 3 - HTE drivers [Source: the authors]
Some of these factors influence the diffusion of standard information models throughout all three of its stages. As an example, legislation to support the diffusion of a particular information model influences all three stages of the diffusion, while the benefits of data access and data maintenance are more important during the implementation and dissemination stages. Similarly, some inhibitors, such as the budget, influence all three stages of the diffusion process, while configuration issues
influence mainly the implementation of the information model. It is therefore necessary to distinguish between the factors according to their role in the process of the diffusion of information models. Considering the drivers presented in Figure 3, it can be noticed that some of them, such as external influences (e.g. the Web environment, which could force an agency into adopting and implementing a particular information model in order to be able to communicate with partners) or legislation (either mandatory or of an incentive nature), are motivators in themselves. These types of drivers will be called motivators. Another class of drivers (such as improved efficiency, transfer of skills, shared knowledge, etc.) can be grouped under the name of expected benefits, where expected benefits are one type of motivator. Similarly, the HTE inhibitors from Figure 4 can be grouped into factors that can block the diffusion process, such as an inadequate budget or the lack of agreement on the standard information model, named blockers, and other issues that could impact negatively on the diffusion process if not addressed properly (such as the definition of terms, resistance to change, etc.) and can be considered blockers for at least part of the diffusion process. The HTE drivers and inhibitors grouped by their role in the diffusion process are presented in Table 1. Expected benefits are listed under motivators as a group and are detailed in a separate column. One particular group of benefits was related to improved data quality; these were separated in the list to draw attention to the large number of data quality benefits that the practitioners mentioned during the interviews. The HTE inhibitors were also separated into blockers and other issues, with other issues listed in a separate column. Interviewees were concerned that information standardisation would reduce flexibility in terms of how processes are defined and would lead to a waste of knowledge in organisations that would have to change their business processes to fit the new information model.
Figure 4 - HTE inhibitors [Source: the authors]
A third category of factors that influence the diffusion of information standards are factors that could act as either drivers or inhibitors, depending on the context in which they apply. These are factors such as prior experience (which could be a driver where prior experience supports standard information models, or an inhibitor where prior experience with standard information models was negative) or the amount of information required in particular areas of practice (it was noticed during the interviews that respondents working in operational areas considered information standards almost irrelevant for their line of business, while areas such as finance, decision making or systems considered that standardisation of information would greatly help their work). These factors influence personal motivation, for example when staff are required to perform activities such as the collection of data required by certain information models. As this work is important for the quality of the data collected and stored in the system, if personal motivation acts as a blocker it needs to be acknowledged and dealt with appropriately in order for the diffusion process of information standards to be successful. For the time being, personal motivation has been included in both the HTE motivators and the HTE blockers. It should be noted that a factor like configuration issues, even though it influences mainly the implementation stage, can have devastating effects on the whole diffusion process, as some of the interviewees reported. If the configuration of the system is not done properly, sometimes due to lack of time and sometimes due to lack of expertise, the system cannot work efficiently and therefore does not provide the expected benefits, which creates the false impression that the system is not appropriate for the job, when in reality it is simply not configured properly. Therefore the interaction between the factors is very important at any point in time. The IS literature discusses a similar type of interaction in relation to organisational resistance to the implementation of Information Systems [22]. Standard information models are at the core of information
systems, so from this point of view the implementation of a new information model would be very similar to the implementation of a new information system. Even though in the HTE model resistance to change is one of the issues that need to be addressed in the process of implementation and dissemination, the findings of this study are in agreement with the interaction theory presented by Markus [22], who considers that people resist change because of the interaction between people characteristics and system characteristics. The HTE model adds to this view the interaction with environment characteristics, including influences external to the organisation, such as legislation, which is an important driver that can change the balance of power in the diffusion of information standards process.

Table 1 Drivers and inhibitors by their role in the diffusion of standard information models [Source: Authors]

HTE Drivers
Motivators: external influences (e.g. Web environment, government incentives); legislation (mandatory or incentive); expected benefits; personal motivation
Expected benefits: assist accountability; assist benchmarking; assist transparency; collaboration; consistent business processes; gaining grants; improved data quality (data access, data analysis, data collection, data consistency, data control, data independence, data integration, data maintenance, data structure, more useful data); improved access control; improved communication; improved services; increased efficiency; information management; legislation; reduced costs; reporting; shared knowledge; strategic planning; support for decision making; third party information exchange; transfer of skills

HTE Inhibitors
Blockers: agreement; budget; other issues; personal motivation
Other issues: business change; configuration issues; data representation; data re-structuring; definition issues; different interpretations; lack of expertise; local politics; perspective; reduced flexibility; resistance to change; size of organisation; technology requirements; wasted knowledge
4
CONCLUSION
The proliferation of the Internet and the associated Web technologies presents the Road Asset Management industry with an information environment suitable for managing large amounts of data in a very efficient way. In order to take advantage of the Web information environment, industry specific information standards need to be created. The standardisation of industry specific information can enhance human communication through a common terminology and can assist further automation of tasks. Semantic information retrieval can be achieved by increasing the level of semantics of the standard information models and designing ontologies specific to the Road Asset Management sector. These types of information models require communication at industry level, which has different requirements from communication inside an organisation. This paper is based on an exploratory study in four government agencies in South Australia investigating issues related to the adoption and implementation of standard information models in the Road Asset Management industry. According to the findings of this study, Road Asset Management practitioners can identify many benefits that could be derived from adopting and implementing industry level information standards: improved accountability and transparency, clearer benchmarking and reporting, reduced costs and transfer of skills, improved communication at industry level and shared knowledge, etc. The
findings also show that the interviewees consider the issues associated with adopting industry information standards, such as reduced flexibility, lack of expertise or wasted knowledge if business processes have to be changed in order to implement the new information standards. These pro and contra factors have been classified according to the HTE (Human, Technology, Environment) model, and some major motivators as well as blockers of the implementation process have been defined. Among the motivators, external influences (such as the existence of the Web environment or government incentives) and legislation (mandatory or incentive) add to the expected benefits to support the implementation of standard information models. The main blockers, according to the findings of this study, are agreement at industry level on such standard information models and budget constraints, as well as several other issues that do not have the intensity of the blockers but could still have a negative impact on the implementation process. It has also been noted that certain factors, such as prior experience or the amount of information required in particular areas of business, can act as either motivators or blockers. These factors have been classified under the personal motivation category and listed under both headings (motivators and blockers). 5
LIMITATIONS AND FURTHER WORK
This exploratory study is the first in a number of studies planned to be conducted on this topic. The study has been limited in the number of participants (14 experts), location (the Road Asset Management industry in South Australia) and time (October 2008 to March 2009). These restrictions were imposed by the time limits of this project and the number of volunteers among the industry experts. The findings from this study need to be validated against findings from road authorities in other states to get a deeper understanding of the issues related to the adoption and implementation of standard information models in the Australian Road Asset Management sector. Further analysis is required to establish the factors that have a positive or negative influence on each of the three stages of the diffusion of standard information models (adoption, implementation and dissemination) in order to refine the HTE model. This analysis needs to take into account the interaction between the HTE factors, as the process of diffusion of standard information models is a dynamic process in which the factors influence each other. Therefore, even though an analysis of each individual stage of the diffusion is necessary to find out which of the factors have a greater influence on which stage, the fact that the factors influence each other has to be considered as well. A further revision of the HTE model will be done once the findings from other states are analysed. Recommendations for a best practice approach to the diffusion of standard information models in Road Asset Management will follow. 6
REFERENCES
1 Nastasie, D.L., A. Koronios, and K. Sandhu, (2008) Factors Influencing the Diffusion of Ontologies in Road Asset Management - A Preliminary Conceptual Model, in Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM-IMS). Beijing, China: Springer-Verlag London Ltd.
2 OECD (Organization for Economic Co-operation and Development), (2008) OECD Information Technology Outlook 2008: Highlights.
3 Daconta, M.C., L.J. Obrst, and K.T. Smith, (2003) The Semantic Web: a guide to the future of XML, Web services, and knowledge management. 1st ed. Indianapolis, Ind.: Wiley Publishing, Inc.
4 Hepp, M. and J. de Bruijn, (2007) GenTax: A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications, Thesauri, and Inconsistent Taxonomies, in The Semantic Web: Research and Applications, 129-144.
5 Nastasie, D.L. and A. Koronios, (2009) The Role of Standard Information Models in Road Asset Management, in Fourth World Congress on Engineering Asset Management (WCEAM), Springer-Verlag London Ltd.: Athens, Greece.
6 Austroads, (2009) Asset Management - FAQ. [cited 2009 01 April]; Available from: http://www.austroads.com.au/asset/faq.html.
7 Australian Government - Productivity Commission, (2008) Assessing Local Government Revenue Raising Capacity - Research Report. Canberra.
8 Markus, M.L., C.W. Steinfield, and R.T. Wigand, (2003) The Evolution of Vertical IS Standards: Electronic Interchange Standards in the US Home Mortgage Industry, in ICIS 2003 MISQ Special Issue Workshop on Standards.
9 Markus, M.L., et al., (2006) Industry-Wide Information Systems Standardization as Collective Action: The Case of the U.S. Residential Mortgage Industry. MIS Quarterly, 30, 439-465.
10 Wigand, R.T., C.W. Steinfield, and M.L. Markus, (2005) Information Technology Standards Choices and Industry Structure Outcomes: The Case of the U.S. Home Mortgage Industry. Journal of Management Information Systems, 22(2), 165-191.
11 Koronios, A., et al., (2007) Integration Through Standards - an Overview of International Standards for Engineering Asset Management, in Second World Congress on Engineering Asset Management, 11-14 June 2007, Harrogate, United Kingdom.
12 Katz, M.L. and C. Shapiro, (1986) Technology Adoption in the Presence of Network Externalities. Journal of Political Economy, 94(4), 822.
13 Rogers, E.M., (2003) Diffusion of Innovations. 5th ed. New York: Free Press.
14 Tornatzky, L.G. and M. Fleischer, (1990) The Processes of Technological Innovation. Lexington, MA: Lexington Books.
15 Davis, F.D., (1989) Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, 13(3), 318-340.
16 Davis, F.D., R.P. Bagozzi, and P.R. Warshaw, (1989) User Acceptance of Computer Technology: A Comparison of Two Theoretical Models. Management Science, 35(8), 982-1003.
17 Weiss, M. and C. Cargill, (1992) Consortia in the Standards Development Process. Journal of the American Society for Information Science, 43(8), 559-565.
18 Kindleberger, C.P., (1983) Standards as Public, Collective and Private Goods. Kyklos, 36(3), 377.
19 Skuce, D., (1997) How We Might Reach Agreement on Shared Ontologies: A Fundamental Approach, in AAAI Technical Report SS-97-06. University of Ottawa.
20 Laera, L., et al., (2006) Reaching Agreement over Ontology Alignments, in The Semantic Web - ISWC 2006, 371-384.
21 Hepp, M., (2007) Possible Ontologies: How Reality Constrains the Development of Relevant Ontologies. Internet Computing, IEEE, 11(1), 90-96.
22 Markus, M.L., (2002) Power, Politics and MIS Implementation, in Qualitative Research in Information Systems: A Reader, M.D. Myers and D. Avison, Editors. SAGE Publications: London, 19-48.
Acknowledgements
The authors are very grateful to the Road Asset Management experts who agreed to be interviewed for this study. Gaining practical insight into the issues of information standardisation has been extremely valuable for this study. This paper was developed within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government's Cooperative Research Centre Programme.
Appendix 1 - Semi-structured interview questionnaire based on the HTE model
1. What is your role in this organization and what is your background?
   • professional experience
   • qualifications
2. How would you describe your organization's environment in regard to adopting information technologies (early, late adopter)? Who is usually involved at the adoption stage vs. the implementation stage? (inter-departmental relations, composition of groups)
3. In regard to the adoption and/or implementation of the [specific standard/information system], could you provide some details about the whole process?
4. What can you say about the qualities and shortcomings of the [specific standard/information system]?
5. How did the already existing technologies, or the absence of some technology, influence the process of adoption and implementation of the [specific standard/information system]?
6. What can you say about the human resources expertise and skills involved in the process of adoption and implementation of the [specific standard/information system]?
7. What communication channels were used to implement this [specific standard/information system]?
   • role of the intra- and inter-organizational networks
   • industry and government networks
   • facilitating conditions
   • coordination among these networks
8. What external influences do you believe were related to the decision to adopt and implement this [specific standard/information system]?
   1. the [specific standard/information system] promotion agent
   2. partner organizations had already adopted it
   3. it was imposed by an authoritative body
   4. other
9. Any other comments?
DESIGN AND IMPLEMENTATION OF A REAL-TIME FLEET MANAGEMENT SYSTEM FOR A COURIER OPERATOR
G. Ninikas a, Th. Athanasopoulos a, H. Marentakis b, V. Zeimpekis a, I. Minis a
a Department of Financial Management Engineering, University of the Aegean, 31 Fostini Str., 82 100, Chios, email: {g.ninikas; t.athanasopoulos; vzeimp; i.minis}@fme.aegean.gr
b ELTA Courier S.A., 40 D. Gounari Str., 153 43, Agia Paraskevi, Athens, email: [email protected]
The need for higher customer service and minimization of operational costs has led many courier operators to seek innovative information systems for the efficient handling of customer requests that occur either during the planning or the execution of daily deliveries. These systems address a series of operational issues that occur in the courier sector, such as fleet and HR management, vehicle routing and monitoring, proof of delivery, track-and-trace services, and so forth. However, the demanding environment of the courier industry generates further operational needs that have not been fully addressed by existing systems. This paper presents the architecture of an innovative fleet management system that has been developed and implemented for a Hellenic courier operator in order to address daily challenges and provide an integrated framework that effectively supports dispatchers during the planning and execution of delivery schedules. The proposed system manages and allocates in real time the dynamic requests that occur during service execution, as well as the bulk deliveries that need to be serviced over a multiple-period (days) time horizon upon their receipt. The system has been evaluated through simulation tests and field experiments so as to ensure the robustness and interoperability of its components and assess the potential of adopting such a system in the courier industry. Key Words: Intelligent Information Systems, Dynamic Vehicle Routing, Telematic services, System Design 1
INTRODUCTION
In a courier service environment, efficient delivery is the key issue for customer satisfaction. The use of manual or empirical techniques for customer allocation to delivery vehicles, although necessary, is by no means sufficient to address the customer requests that are likely to occur during delivery planning or execution due to randomness and complexity. A non-optimized vehicle allocation plan may have negative effects on delivery performance, thus leading to higher costs and inferior customer service. With the advances in telecommunication and information systems such as Global Positioning Systems (GPS), Geographical Information Systems (GIS), and Intelligent Transportation Systems (ITS), it has become realistic to monitor the execution of routes in real time, to control operational costs and, at the same time, to fulfil certain customer requests in real time, such as proof-of-delivery (POD). However, requests that arise during delivery execution and route planning over a multi-day framework cannot be addressed by current systems. The main aim of this paper is to present an innovative fleet management system that incorporates all the necessary modules to cope with the aforementioned operational needs. The paper is organized as follows. Section 2 presents the characteristics and the main operational challenges of the courier business sector. Section 3 describes the problems/challenges addressed by the proposed system, along with their theoretical background. Section 4 analyses the system architecture, in parallel with its basic modules, while Section 5 describes the actions taken (pilot testing) to evaluate the efficiency of the proposed system. Finally, Section 6 presents the main conclusions of the system implementation. 2
THE COURIER SERVICES ENVIRONMENT
The core business of a courier company incorporates the delivery and pick-up of parcels and envelopes in a short period of time. A typical model of this type of delivery involves six major stages: a) the pick-up of the item, b) the initial processing of
197
the item in the local service point (LSP), c) the further processing in the main Hub, d) the long-haul transportation to the destination area Hub, e) the processing in the destination Hub and f) the delivery of the item from the destination LSP. This process is depicted in Fig. 1. In a typical courier delivery schedule, a large number of requests are known to the dispatcher in advance and concern predefined deliveries to customers or scheduled pick-ups. The nature of courier problems, however, is moderately dynamic [1], as a moderate number of requests typically appear dynamically over time while the delivery plan is executed. As a result, vehicle routing is a challenging task, as the locations to be served vary greatly from day to day. Apart from their usual work, most courier companies also deal with bulk deliveries. These items can be informational/advertising material, packages ordered through other sales networks such as the Internet, etc. The courier company contracts with a client, based on a service level agreement (SLA), in order to provide distribution services to several customers.
Figure 1: Typical model of courier services distribution procedures
Dynamic requests and bulk deliveries contribute significantly to the revenues of a courier company. On the other hand, the workload that they generate demands intelligent management in order to cope with the additional operational problems, such as: (a) the efficient allocation of dynamic requests to routes, and (b) the management of bulk deliveries over a multiple-period (days) time horizon. On top of that, there is a series of dynamic parameters that affect the execution of a delivery and/or pick-up schedule. These are:
• Distribution area (e.g. area size)
• Type and volume capacity of vehicle (scooter, van)
• Pick-up/delivery item (letter, parcel)
• Type of distribution work (pick-up or delivery)
• Distribution times (travel/service/waiting times)
• Available resources (e.g. delivery vehicles and labour)
• Unexpected incidents (e.g. traffic congestion)
Due to the aforementioned dynamic parameters, dispatchers face various difficulties during delivery scheduling and execution. Table 1 shows typical challenges that arise during route planning and execution, along with the main consequences resulting from them.
Table 1 Courier management challenges

Delivery Planning problems: manual scheduling and planning of the initial routes; massive planned deliveries/pick-ups
Delivery Execution problems: manual assignment of dynamic requests to vehicles; late Proof of Delivery (PoD); lack of fleet surveillance; deviations from the planned pick-up/delivery time windows
Main consequences: increase of distribution cost; limited fleet control and delivery quality; customer complaints; inability to handle dynamic incidents (e.g. dynamic requests); increase of pick-up/delivery completion time; reduced customer service and quality of services
In order to define the main requirements for the design of the proposed intelligent fleet management system, an extensive literature review in the area of express logistics, coupled with in-depth interviews with a leading Greek courier services company, has been performed. The resulting main requirements are presented in Table 2.

Table 2 Main requirements of courier services

Requirement                                              Addressed by current systems
Automation in the initial routing-scheduling
Fleet monitoring/surveillance and vehicle performance
Proof of Delivery (PoD)
Bulk deliveries allocation in a multi-day framework      partially
Dynamic request handling
Existing fleet management systems can satisfy only a subset of the requirements of Table 2. In particular, most systems can neither handle in a dynamic manner the requests that arise during delivery execution nor route effectively deliveries that are planned over a multi-day framework. The aforementioned requirements constitute the main problems/challenges to be confronted by the proposed system and are described analytically in the following section.
3
PROBLEM DESCRIPTION AND THEORETICAL BACKGROUND
Several research studies have addressed issues similar to those addressed by the proposed system. These problems/challenges are described below, together with selected references on related problem settings. 3.1
Dynamic request handling
While a delivery plan is being executed, a fleet of vehicles is en route to service customer requests known in advance, while new requests may arise dynamically over time as the working plan unfolds. Many different factors must be considered when a decision about the allocation and scheduling of a new dynamic request is taken, such as the current location of each vehicle, its current planned route and schedule, the characteristics of the new request, the travel times between the service points, the characteristics of the underlying road network, the service policy of the company and other related constraints. Dynamic request handling is described mainly by the Dynamic Vehicle Routing Problem [2,3,21]. These problems exhibit special features [4]. Extensive research has been carried out on related models, focusing on the Vehicle Routing Problem with Time Windows [5], the Pickup and Delivery Problem with Time Windows [6,7,8] and other variations of them. In the literature, one can find three main approaches to cope with newly occurring requests. Typically, some static algorithm is first applied over the requests that are known at the start of the day to construct an initial set of routes. In the first approach, as the work day unfolds, a fast local update procedure is used to integrate each new request into these routes; insertion-based heuristics derived from local-search approaches are one way to solve these types of routing problems [9,10,11]. Re-optimization of the total VRP is a second approach and is used in order to improve the initial static solution every time an event occurs; a typical example is tabu search [12]. More recently, a third approach has been proposed which, instead of reacting to problem changes, uses waiting strategies [13,14], i.e. it anticipates these requests by positioning the vehicles in strategic locations or by exploiting information about future demand. 3.2
Routing of bulk deliveries in a multi-day framework
Bulk deliveries are given an N-day horizon, counted from the receipt of the request, within which each customer must be served. Customers are thus provided with a specific service level (e.g. delivery within N days). The problem consists of deciding the best allocation of the requests to the next N daily schedules in order a) to avoid certain bottlenecks that may occur, b) to balance the additional workload between periods (days), and c) to allow a proportion of the network availability to be allocated to dynamic requests, e.g. through time buffers of phantom customers. Related multi-period problems in the literature include the Inventory Routing Problem (IRP) [17, 18] and the Periodic Vehicle Routing Problem (PVRP) [19, 20]. For these problems it is clear that a solution which results from applying a single-period approach repeatedly will not provide the efficiency of a multi-period solution. Thus, approaches used to solve such problems may sacrifice local optimality within a single period (day) in order to obtain global optimality over the entire horizon.
Apart from the above problem settings, rolling horizon problems have been studied in [15, 16]. The basic difference between the current problem and the PVRP and IRP is that here the frequency of customer requests is neither known in advance (as it is in the PVRP) nor is there any a priori knowledge from which to determine this frequency (as in the IRP, in which the inventory and the consumption rate per customer are known). 4
SYSTEM DESIGN
In order to deal with the aforementioned operational inefficiencies and problems, an integrated system was designed and implemented. The system comprises several modules and components which allow continuous communication and information sharing. Initially, the core system components and modules are presented, followed by the system's architecture, which incorporates all the modules along with the required inputs, outputs and interfaces. Additionally, the necessary user interfaces were designed and implemented in order to support the end users of the system (dispatchers, courier managers). 4.1
System Components and Modules
The core parts of the system include two (2) components which incorporate the algorithmic modules (dynamic request handling, bulk deliveries allocation), as well as two (2) systemic tools (fleet monitoring tool, initial routing module), which support the integrated fleet management system.
Dynamic Request Handling
We consider a fleet of vehicles on their route to customers and a subset of requests that are dynamically revealed over time. The service is realized under various operational constraints, such as time windows, limited vehicle capacity and route length, and, therefore, decisions and changes must be made in a highly dynamic environment. Fig. 2 describes the proposed dynamic request handler. As can be seen, the Allocate Module assigns these requests to vehicles. The inputs for this module are the initial day routes for all vehicles and the feedback given by the fleet surveillance system depicting the current location and availability of the vehicles.
Figure 2: Dynamic request handling
A fast local update procedure is used to integrate the new requests into the routes. The routing algorithm that solves this problem should be computationally efficient, since the solutions are provided in real time. The key feature of the system is that it can allocate a large number of dynamic requests simultaneously and it can provide good solutions for moderately large customer sets in small computational time, taking into consideration all the operational constraints (time windows, available capacity of the vehicles, total length of the route, etc.). After solving the allocation of the dynamic requests, the system provides the vehicles with the updated delivery plan via the telematics system. A graphic user interface (GUI) was developed for the dynamic request handling module in order to support the end users of the system, as depicted in Fig. 3.
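To give a feel for how such a fast local update can work, the following minimal sketch (an illustrative simplification, not the actual code of the implemented module) evaluates every insertion position of a newly arrived request across the current routes and returns the cheapest feasible one; travel times are approximated by Euclidean distances and only a route-duration limit is checked, whereas the real module also considers time windows and vehicle capacity.

from math import hypot

def travel_time(a, b):
    # Euclidean surrogate for the travel time between two (x, y) points
    return hypot(a[0] - b[0], a[1] - b[1])

def route_duration(route):
    return sum(travel_time(route[i], route[i + 1]) for i in range(len(route) - 1))

def cheapest_insertion(routes, new_request, max_duration):
    """Return (route_index, position, extra_cost) for the cheapest feasible
    insertion of new_request, or None if no feasible position exists.
    Each route is a list of points starting and ending at the depot."""
    best = None
    for r_idx, route in enumerate(routes):
        for pos in range(1, len(route)):           # never insert before the depot start
            prev_pt, next_pt = route[pos - 1], route[pos]
            extra = (travel_time(prev_pt, new_request)
                     + travel_time(new_request, next_pt)
                     - travel_time(prev_pt, next_pt))
            if route_duration(route) + extra > max_duration:
                continue                            # would violate the route-length limit
            if best is None or extra < best[2]:
                best = (r_idx, pos, extra)
    return best

# Hypothetical example: two vehicles already en route, one new dynamic request
depot = (0.0, 0.0)
routes = [[depot, (2.0, 1.0), (4.0, 0.5), depot],
          [depot, (1.0, 3.0), depot]]
print(cheapest_insertion(routes, new_request=(3.0, 2.0), max_duration=12.0))

In practice the same evaluation loop would simply be repeated whenever the fleet monitoring system reports new requests, re-using only the not-yet-served portions of the vehicles' current routes.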
Figure 3: Dynamic Requests Handling Module
Routing of bulk deliveries in a multi-day framework
The scope of this module is to allocate the bulk deliveries over a multi-day horizon in order to rationalize the extra routing workload. The module consists of three main parts: a) the historical data analysis, b) the routing component, and c) the allocation component. Fig. 4 presents the module's procedures and the operational timings, and Fig. 5 presents the user interface designed and implemented for the allocation module.
Figure 4: Allocation of Requests
Initially, historical data collected over a specified time period (e.g. one month) are analyzed to provide the phantom (expected) customers. These customers represent urban areas with high request density, which are scheduled/serviced on each day of operation. The expected customers are then routed in order to create the expected routes. Expected routes represent the typical paths that are most likely to be traversed on almost every day. The typical routes obtained provide the basis for the allocation procedure over a specified time horizon. The allocation procedure is as follows. During each planning day, the following N days are scheduled. The customers considered for allocation include the customers received during the last time period, as well as customers that have not been scheduled yet. Customers are allocated to the next N periods (days) based on an overall cost minimization procedure for the N-period horizon. Customers that must be serviced on day 1 (i.e. requests for which the N-day horizon expires) are scheduled with priority. In order to maintain a certain customer service level, any candidate request is forced to be scheduled no later than N days after request acceptance. The allocation procedure terminates when all customers have been assigned to one route of the N periods. Customers assigned to the first period's routes are considered for implementation only, while the remaining customers (allocated to periods 2 to N) form the unallocated customers of the next period's allocation procedure. The operation is repeated each day by removing the first day of the N-day horizon and adding day N+1. Thus, the problem is solved in a rolling horizon framework.
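A deliberately simplified sketch of this rolling-horizon allocation is given below (an illustration under stated assumptions, not the system's actual algorithm): pending bulk requests are characterised only by the days left in their service-level horizon, a fixed number of deliveries per day stands in for the expected-route workload, and cost minimization is reduced to filling the earliest feasible day.

def allocate_bulk(requests, n_days, daily_capacity):
    """requests: list of (request_id, days_left), where days_left is the number
    of days remaining in the N-day service-level horizon (1 = must go today).
    Returns a dict mapping day (1..n_days) to the list of assigned request ids."""
    plan = {day: [] for day in range(1, n_days + 1)}
    # Most urgent requests first, so expiring ones claim day-1 slots
    for req_id, days_left in sorted(requests, key=lambda r: r[1]):
        latest_day = min(days_left, n_days)
        # Place the request in the earliest day with spare capacity,
        # but never later than its deadline
        for day in range(1, latest_day + 1):
            if len(plan[day]) < daily_capacity:
                plan[day].append(req_id)
                break
        else:
            # No capacity before the deadline: force it into the deadline day
            # (in practice this would trigger an extra route or overtime)
            plan[latest_day].append(req_id)
    return plan

# Hypothetical example: five pending requests, a 3-day horizon, 2 deliveries/day
pending = [("r1", 1), ("r2", 3), ("r3", 2), ("r4", 3), ("r5", 1)]
plan = allocate_bulk(pending, n_days=3, daily_capacity=2)
print(plan)                                               # day-1 assignments are released for execution
carried_over = [r for day in (2, 3) for r in plan[day]]   # re-planned on the next planning day

Rolling the horizon then amounts to re-running the procedure each day with the carried-over requests plus the newly received ones, after decreasing every remaining days_left by one.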
Figure 5: Bulk Deliveries Allocation Module User Interface (Allocation Component)
Fleet Monitoring Module
This module provides all the necessary real-time information on the state of the fleet (location, availability). The scope of the fleet monitoring module is to provide (a) a user-friendly environment for fleet surveillance during the execution of the scheduled routes, (b) the interfaces needed for the collection of historical data, and (c) the main interface between the dynamic request handling module and the information on the routes actually being implemented, in real time. The module comprises an XML client (field information database), several vehicle GPS devices (transmission of vehicle position information) and the fleet surveillance software (VIEWER). Fig. 6 presents the user interface developed and implemented for the fleet monitoring module.
Figure 6: Fleet Monitoring System User Interface (vehicle/route information, analytical route information for a single vehicle, actual vs. planned schedule with vehicle delays, and map/route imaging)
Initial Routing Module
The initial routing module is based on commercial routing software. The main purpose of this module is to provide the initial customer visiting order for each vehicle. The output of the routing procedure is provided to the couriers as their initial job assignments and to the fleet monitoring module in order to monitor the execution of the assigned jobs and to provide the interconnection with the dynamic request handling module. 4.2
System Architecture
The aforementioned modules are interconnected in an integrated way in order to guarantee the interoperability of the proposed system. Fig. 7 presents the logical architecture of the proposed system, including the components and modules (purple
boxes), separated into algorithmic and systemic tools, and the main input/output data. Additionally, an illustration of the timing at which each operation occurs is presented.
Figure 7: Integrated System Diagram
Initially, on day i, the bulk deliveries are scheduled based on the historical data (expected routes) and the pending bulk requests, and the initial routing module designs the routes to be implemented based on the existing information (including normal courier requests and bulk deliveries). On day i+1, the dynamic request handling module, in collaboration with the fleet monitoring system, allocates the dynamic requests and collects all the needed historical data. The dynamic request handler connects with the fleet monitoring system via a TCP/IP connection to the server in order to receive the necessary data for the re-routing process (pending customers) and to send back the updated routes that include the newly occurred requests; a minimal sketch of the kind of data exchanged is given after the list below. The fleet monitoring module also supports the dispatchers by providing all the necessary real-time vehicle information. The physical architecture of the fleet management system that addresses the aforementioned requirements is organised in three essential pillars, as described below (see Fig. 8):
1. Back-end sub-system: This sub-system consists of an integrated enterprise resource planning (ERP) communication system (communications server, database server, map server), which communicates bi-directionally with the front-end sub-system.
2. Communications sub-system: It provides the wireless communication between the back-end and front-end systems. It consists of a terrestrial mobile network (GPRS) and satellite positioning technology (GPS).
3. Front-end sub-system: It consists of a GPS receiver and a GPRS modem which communicates (through the wireless network) with the communication server in the back-end sub-system.
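The sketch below illustrates one possible shape of the messages travelling over that TCP/IP link; the field names and structure are assumptions made purely for illustration and do not reproduce the actual interface of the implemented system.

import json
from dataclasses import dataclass, asdict
from typing import List, Tuple

@dataclass
class VehicleStatus:
    vehicle_id: str
    position: Tuple[float, float]     # last GPS fix (lat, lon)
    pending_stops: List[str]          # customer ids not yet served
    available: bool

@dataclass
class RouteUpdate:
    vehicle_id: str
    stops: List[str]                  # new visiting order, including inserted dynamic requests

# Message sent by the fleet monitoring server to the dynamic request handler
status_msg = json.dumps([asdict(VehicleStatus("van-07", (37.98, 23.73),
                                              ["c112", "c097"], True))])

# Reply sent back after the re-routing step
update_msg = json.dumps(asdict(RouteUpdate("van-07", ["c112", "c205", "c097"])))

print(status_msg)
print(update_msg)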
Figure 8: Architecture of the fleet management system
5
SYSTEM IMPLEMENTATION
The proposed system was implemented in a Hellenic courier company (Tachimetafores ELTA SA) in order to assess its functionality and efficiency. A medium-scale service area was chosen that services approximately 500 static and 70 dynamic customers daily with a heterogeneous fleet of 8 vans and 11 scooters. The implementation of the proposed system under real conditions proved to be extremely difficult due to various operational constraints and complexities related to the lack of an integrated information system. Due to the risk of causing significant disruption to the user's operations, the system implementation took place in two main phases: a) online testing (field experiments) and b) offline testing (simulation testing). The online testing focused on the assessment of the robustness of the proposed system modules, and on the interoperability and functionality of the integrated system. Additionally, this phase evaluated the potential adoption of such a system in the courier company's operations. The offline testing took place in order to assess the efficiency of the routing systems (initial routing and dynamic request handler) compared with the current routing procedures of the user. It is worth mentioning that the bulk deliveries module was not evaluated in this phase due to insufficient customer data (bulk deliveries) during the pilot testing period. The design of the aforementioned evaluation tests, coupled with indicative results, is described below. 5.1
Simulation testing set-up
The simulation testing took place over a three-day period in order to assess and compare the routing results executed by the user with those proposed by the system. The offline tests were designed carefully in order to simulate the real operational conditions. A precise methodology was used to gather and match the customer and route data in order to regenerate the conditions addressed by the company. This methodology provided the test data, with approximately 500 customers, 70 dynamic requests and 13 routes per day (5 vans and 8 scooters). The offline tests were implemented in three stages that concern a) the assessment of the initial/planned routing, b) the dynamic request handling, and c) the total routing cost. In the first stage, the current planned routes designed by the user were compared with the initial routing module's results, while the second stage provides a comparison between the final routes executed by the user and the ones produced by the dynamic request handler module. Finally, the third stage provides comparative results on the effectiveness of the integrated system, concerning both initial routing and dynamic request handling. The dynamic request handling stage was compared using as initial routes both the ones planned by the user and the ones produced by the initial routing module. The results of the above stages are presented in the following section. 5.2
Results from simulation testing
Table 3 presents the first-stage results, consisting of the total routing cost (hrs) per day for the current planned routes and the total routing cost produced by the initial routing module. The efficiency of the initial routing module is evident, as it achieves a 14% improvement of the total routing cost (hrs) on average compared to the current route planning procedures of the user.

Table 3 Stage 1 results (Offline Testing) [hrs]
                          Day 1    Day 2    Day 3
Current Planned Routes    59,05    60,61    47,79
Initial Routing Module    49,93    51,26    41,36
% Difference              15%      15%      13%
Table 4 presents the second-stage results, which depict the excess cost (hrs) incurred due to the insertion of the dynamic requests into the current routes, as executed by the user and as proposed by the dynamic request handler module. The results demonstrate the efficiency of this module, as it reduces the total excess cost by 40% on average compared to the current dynamic request allocation by the user.

Table 4 Stage 2 results (Offline Testing) [hrs]
                                 Day 1    Day 2    Day 3
Current Routing Procedures       6,97     6,63     8,12
Dynamic Request Handler Module   4,11     4,45     4,38
% Difference                     41,0%    32,9%    46,1%
Table 5 presents the third-stage results, which depict the total routing cost (hrs) of the current routes as executed by the user compared to the solution provided by the integrated system (initial routing module and dynamic request handler module). The integrated system results in a reduction of 16,5% of the total routing cost on average.

Table 5 Stage 3 results (Offline Testing) [hrs]
                                                         Day 1    Day 2    Day 3
Current Routing Procedures                               66,02    67,24    55,91
Initial Routing + Dynamic Handler Modules (integrated)   54,60    56,74    46,71
% Difference                                             17,3%    15,6%    16,5%
6
CONCLUSIONS
The courier business sector is an industry characterized by various operational and technical complexities. Several of these complexities can be addressed by the use of information systems such as the one proposed in this paper. The technical and operational principles underlying the design of the proposed fleet management system have been presented, followed by a validation of the technical design through an empirical assessment of the benefits enabled by the use of the proposed system. The proposed system proved to be effective in the routing procedures, delivering significant operational and cost reductions. Additionally, one of the basic advantages of the proposed system is that it eliminates the human-factor inefficiencies caused by empirical routing and scheduling procedures and provides thorough results and robust business operations. The effectiveness of the proposed system, as far as the operational cost (time) reduction is concerned, was validated through: a) the reduction of the initial routing cost and b) the reduction of the dynamic request allocation cost. The overall system resulted in a significant reduction of 16,5% on average of the total routing cost, while the standalone operation of the initial routing module resulted in a 14% reduction of the initial routing cost. Finally, the dynamic request handling module reduced the excess cost on top of the initial routing cost by 40% on average. 7
REFERENCES
1 Larsen, (2000) The dynamic vehicle routing problem. Ph.D dissertation, Technical University of Denmark, Lyngby, Denmark.
2 M. Gendreau, F. Guertin, J-Y. Potvin, R. Seguin, (2006) Neighborhood search heuristics for a dynamic vehicle dispatching problem with pick-ups and deliveries. Transportation Research Part C, 157-174.
3 Febri, P. Recht, (2006) On dynamic pickup and delivery vehicle routing with several time windows and waiting times. Transportation Research Part C, 40, 335-350.
4 H.N. Psaraftis, (1995) Dynamic vehicle routing: status and prospects. Annals of Operational Research, 61, 143-164.
5 J.F. Cordeau, G. Desaulniers, J. Desrosiers, M.M. Solomon, F. Soumis, (2002) The VRP with Time Windows. In P. Toth and D. Vigo, The Vehicle Routing Problem, SIAM Monographs on Discrete Mathematics and Applications, Philadelphia, 157-193.
6 G. Desaulniers, J. Desrosiers, A. Erdmann, M.M. Solomon, F. Soumis, (2000) The VRP with Pickup and Delivery. Les Cahiers du GERAD.
7 Y. Dumas, J. Desrosiers, F. Soumis, (1991) The Pickup and Delivery Problem with Time Windows: A Survey. European Journal of Operational Research, 54, 7-22.
8 S. Mitrovic-Minic, (1998) Pickup and Delivery Problem with Time Windows: A Survey. SFU CMPT TR.
9 Q. Lu, M.M. Dessouky, (2006) A new insertion-based construction heuristic for solving the pickup and delivery problem with time windows. European Journal of Operational Research, 175, 672-687.
10 A.M. Campbell, M. Savelsbergh, (2004) Efficient insertion heuristics for vehicle routing and scheduling problems. Transportation Science, 38(3), 369-378.
11 S. Mitrovic-Minic, R. Krishnamurti, G. Laporte, (2004) Double-horizon based heuristics for the dynamic pickup and delivery problem with time windows. Transportation Research Part B, 38, 669-685.
12 M. Gendreau, F. Guertin, J-Y. Potvin, E. Taillard, (1999) Parallel Tabu Search for Real-Time Vehicle Routing and Dispatching. Transportation Science, 33(4), 381-390.
13 S. Ichoua, M. Gendreau, J-Y. Potvin, (2006) Exploiting Knowledge About Future Demands for Real-Time Vehicle Dispatching. Transportation Science, 40(2), 211-225.
14 S. Mitrovic-Minic, G. Laporte, (2004) Waiting strategies for the dynamic pickup and delivery problem with time windows. Transportation Research Part B, 38, 635-655.
15 P. Jaillet, J. Bard, L. Huang, M. Dror, (2002) Delivery Cost Approximations for Inventory Routing Problems in a Rolling Horizon Framework. Transportation Science, 36(3), 292-300.
16 H.N. Psaraftis, (1988) Dynamic vehicle routing problems. In B.L. Golden and A.A. Assad, Vehicle Routing: Methods and Studies, North-Holland, Amsterdam, 223-248.
17 M. Dror, M. Ball, B. Golden, (1985) Computational comparison of algorithms for the inventory routing problem. Annals of Operations Research, 4, 3-23.
18 M. Campbell, M.W.P. Savelsbergh, (2004) A decomposition approach for the Inventory-Routing Problem. Transportation Science, 38(4), 488-502.
19 M. Newman, C.A. Yano, P.M. Kaminsky, (2005) Third Party Logistics planning with routing and inventory costs. In J. Geunes and P.M. Pardalos, Supply Chain Optimization, Kluwer.
20 N. Christofides, J. Beasley, (1984) The Period Routing Problem. Networks, 14, 237-256.
21 V. Zeimpekis, C. Tarantilis, G. Giaglis, I. Minis, (2007) Dynamic Fleet Management: Concepts, Systems, Algorithms & Case Studies. Operations Research/Computer Science Interfaces Series, Springer-Verlag, Vol. 38.
Acknowledgments This work is partially funded by the “Regional Operational Programme of Attica” (ROP of Attica) [project “Management of Dynamic Requests in Logistics (MaDReL)”] and the “Reinforcement Program of Human Research Manpower” (PENED) [cofinanced by National and Community Funds (25% from the Greek Ministry of Development-General Secretariat of Research and Technology and 75% from EU-European Social Fund) – 03ED067 research project].
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
OPEN SOURCE SOFTWARE: LESSONS FOR ASSET LIFECYCLE MANAGEMENT Dr. Abrar Haider a, b
a CRC for Integrated Engineering Asset Management, Brisbane, Australia
b School of Computer and Information Science, University of South Australia, Mawson Lakes Campus, SA 5095, Australia.
Generally, the software used by individuals and organisations is proprietary software. Proprietary software usually comes with a hidden source code, is available at a specific cost, and offers limited flexibility in its copyright licences. Open source software, on the other hand, offers features that are competitive with proprietary software. The unique part of open source software is a development concept that harnesses open development by a wide community and decentralised peer review. This development process is effective in terms of software quality and lowers software production cost. This paper presents a review of the literature to provide insights into the strengths of the open source software development concept and to develop a case for its application in collaborative research environments to enhance and further develop the technical infrastructure supporting the asset lifecycle.
Key Words: Open source, Asset management, Software.
1 INTRODUCTION
Economies worldwide are acknowledging the potential of open source software (OSS), and research into the viability, usability, maintainability, and supportability of OSS is gaining momentum. The European Union, for example, has been a forerunner in the adoption, development, and research of OSS in different areas of the economy. However, interest in OSS is not limited to Europe: it has been successfully implemented in many public sector organisations in countries such as Brazil, Italy, Malaysia, Germany, the Netherlands, the United Kingdom, France, the USA, Denmark, Sweden and South Africa. It is not just the economic advantages that are attractive to economies around the globe; OSS also presents itself as a collaborative, reliable, robust, and flexible alternative to proprietary software systems. Proprietary software is tightly controlled and leads to usage dependencies. At the same time, proprietary software is developed to standardised user requirements and thus does not exactly meet all of the demands of any particular organisation. In addition, proprietary software carries ongoing support and upgrade constraints, as well as contractual restrictions. OSS, on the other hand, provides a participatory forum that engages communities of interest to decrease dependence on commercial software vendors in terms of source code, functionality, and contractual commitments. There are significant hidden costs involved in the support, maintenance, training, re-engineering, and installation of software applications. These costs are generally not considered at the time of software procurement; however, they are significant in the total cost of ownership, and even software like office suites or packages developed or tailored for a specific organisation can be dauntingly expensive. In addition, the licences available for proprietary software are not flexible and are sold for a specific number of seats, so that expansion beyond the available licences is difficult and expensive. There are also legal costs and risks involved in checking and signing licences, ensuring that licence conditions are adhered to, and ensuring that all relevant licences have been purchased and are up to date [11]. OSS, on the other hand, adheres to simple, non-cumbersome licensing and free distribution. The concept of OSS has significant relevance for research organisations like the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM). CIEAM brings together researchers and practitioners from industry verticals such as electricity, gas, water, and transport. These researchers and practitioners have expertise in various areas of asset management, such as design, operation, maintenance, lifecycle support, IT support development for lifecycle management, human resource development for asset management, and asset accounting. CIEAM thus resembles an extensive community of interest for asset lifecycle management. The concept of OSS could be applied to CIEAM to bridge the gap between software developers and software users through continuous software audit and requirements refinement. There is
enormous potential for asset management researchers, practitioners, and lifecycle IT support developers to get together and participate in an open source initiative to develop and mature software applications specific to asset lifecycle management. This will not only help them create a robust set of applications but will also reduce their dependence on software vendors, as well as the need to re-engineer applications to customise them to organisational needs. Nevertheless, the success of OSS is contingent upon critical aspects such as its implementation enabling technical and economic value, its maintainability, and the availability of adequate support to sustain its utilisation. It is, therefore, essential to establish the potential and value profile of OSS for engineering enterprises. This paper presents an overview of the strengths and weaknesses of OSS to develop a case for its utilisation in asset lifecycle management. It begins with an overview of the OSS development culture and adoption framework, followed by the OSS governance modus operandi. The paper concludes with a discussion of the relevance of OSS to asset management.
2
THEORETICAL FOUNDATIONS OF OPEN SOURCE SOFTWARE
OSS is developed by the community at large through free access to the source code. The advantage of providing the source code within an OSS distribution is that it enables end users to learn more about the program [1]; programmers can thus improve the software and redistribute it again to society. According to Perens [2], open source does not simply mean open access to the source code; the definition itself must comply with several criteria, listed below:
a. Free Redistribution: the software licence allows everyone to create multiple copies of the software and to sell or give away the program without any fee.
b. Source Code: the software must include the source code within its distribution or provide an accessible website with a free, downloadable source code. OSS thus enables programmers to modify or repair the program.
c. Derived Works: the software can be freely modified, though derived works need to be redistributed acknowledging the original licensing scheme.
d. Integrity of the author's source code: this rule requires the separation of modification code from original code to respect the integrity of the author's original source code.
e. No discrimination against persons or groups: the software must be available to any person or group without exclusion.
f. No discrimination against fields of endeavour: the software must be equally usable in any field of knowledge without exclusion.
g. Distribution of licence: no first party's signature is required for the distribution of the licence from a second party to a third party.
h. Licence must not be specific to a product: the open source licence must always be attached to derived works and particular parts of program distributions.
i. Licence must not contaminate other software: the licence must not require that other programs distributed along with the open source software also be under an open source licence.
The meaning of free in OSS is not the same as in freeware or shareware applications. According to Hamel [3], the term Free/Libre Open Source Software (FLOSS) is used to distinguish free as in freedom and free speech from free as in free beer. Shareware, by definition, is software that is available to share, though the user needs to buy it if the software is to be used for a longer period of time. Freeware, on the other hand, is available for free download and free use at no charge to end users. However, free in OSS does not only mean free of cost (gratis); it also includes freedom and liberty in the development, usage, and distribution of the software. Richard Stallman [4], who devised the copyleft mechanism as an alternative to copyright, explained the meaning of free in OSS through his Four Freedoms concept in the GNU (GNU's Not Unix) General Public License:
a. Freedom 0 – "Freedom to run the program for any purpose".
b. Freedom 1 – "Freedom to study how the program works and adapt it to your needs".
c. Freedom 2 – "Freedom to redistribute copies so that you can help your neighbour".
d. Freedom 3 – "Freedom to improve the program and release your improvements to the public for whole community benefits".
3
OSS DEVELOPMENT CULTURE
A unique quality of OSS is a development culture that harnesses open development from a wide community and decentralised peer review; the development process is thus effective in lowering software production cost and improving software quality [5]. In his book "The Cathedral and the Bazaar", Raymond [5] draws an analogy between "the Cathedral" (proprietary software) and "the Bazaar" (OSS). "Cathedral" development is carefully crafted by individual "wizards" in an isolated workplace, and there are no beta releases in this development style since everything is fixed to a single plan, a single point of focus, or even a single mind [6]. The "Bazaar" style, on the other hand, resembles a common meeting place where everybody adds something different to the interaction, like-minded community members congregate and talk, and important information is disseminated within the development community [5, 6].
[Figure 1: block diagram – core developers maintain a stable, modular core implementation and distribute it to a community of users and developers, who submit feature requests, bug reports (with details and steps to reproduce), bug fixes and source code modifications for incorporation into the main implementation.]
Figure 1: OSS Development Cycle [7]
Figure 1 illustrates the OSS development cycle and highlights developers' motivations and interactions. In the OSS development process, developer(s) develop the core of an application and make its source code available to the general public via the internet. Like-minded individuals use the application or go through the source code and add their enhancements to the core application, remove bugs, point out errors, report desired enhancements, and provide software quality assurance through testing. When this feedback is available to the developers, they are in a better position to improve the software. Thus, the software goes through a process of continuous improvement. However, OSS is available for commercial use as soon as the original developer makes it available on the internet. OSS applications have a modular design, which makes it easier for the community of interest to understand the software and apply enhancements. It should be noted that this process is entirely voluntary and is driven by knowledge and challenge rather than economic considerations alone. In proprietary software development, software development houses are commonly motivated by money as an incentive for their efforts. In the OSS development concept, however, money is not the incentive behind contributors' motivation. Lerner & Tirole [8] define two types of motivation: immediate pay-offs and delayed benefits. Immediate pay-offs, such as recognition, innovation, and idea generation, are the most common motivation for any software development. Delayed benefits, on the other hand, are indirect economic benefits that software developers expect to gain in the future. According to Woods & Guliani [4], the three most significant strengths of the "Bazaar"-like software development model are: faster development at lower production cost, due to the large number of developers available; flexibility and closeness to user requirements, since software developed by a wide community serves several common needs; and improved developer skills through interactions between developers of varied experience.
OSS projects are developed through communities of interest that evolve a governance structure around the project lifecycle. This governance structure in an open source community starts from individual motivations that interact within a social control mechanism [9]. This social control creates conformity to certain moral and cultural rules within the development community [10]. There are thus two types of social control activities within an open source project: direct governance and indirect governance. Direct governance is a social control that ensures the quality of the project through direct inspection or monitoring tasks. Indirect governance, on the other hand, is based on the output of the development. Table 1 further elaborates on the governance structure of OSS projects.
Stage | Introduction | Growth | Maturity | Decline or Revival
Focus | Idea generation | Expansion | Stability | Adaption
Structure | Completely informal | More centralized, formal | Somewhat decentralized, formal | Highly decentralized
Division of labor | Generalists | Some specification | – | Less specialized
Coordination | Informal, one on one | Formal, technology intensive | Formal, but less technology adherence | Slightly formal but less adherence
Examples | Dam, HTM Larena plus accessibility | Eclipse, Typo3 | Linux, Apache, Mozilla | Gnutella
Table 1: Governance details within each lifecycle stage [11]
In open source development, the author of the original code mostly bequeaths his or her intellectual property to society without expecting any return from the code. The open source development culture has also been likened to the relationship between parents and their children [6]: parents give everything to their children without expecting anything in return. Researchers [7, 11] attribute various other benefits to OSS, such as:
a. The acquisition cost of OSS is generally lower than that of proprietary software and may even be zero, which can eliminate the financial burden of proprietary licensing schemes.
b. Reduced vendor lock-in, since OSS is freely available for download. Support is generally free, and even when it is commercially provided it is available at a lower cost. Proprietary software is generally bundled with additional update and maintenance costs, so the vendor may charge more for updates available within specific software support. At the same time, the closure of a vendor may leave the software without support. In open source, the power of the developer is spread across the community; even if a vendor closes someday, the knowledge is retained within that community.
c. Open source development enables a global effort that maximises the potential of the software by integrating the programming skills of all contributors, and thus enhances the reliability of the software.
d. The availability of the source code enables end users to improve functionality or modify the software to fit their own needs. Personal customisation also fosters a sense of belonging and supports full performance measurement by end users.
e. The wide development culture enables OSS to grow and allows programmers to work together to produce secure software. Some contributors act as bug finders and communicate bugs through forums so that other programmers can fix the problems.
f. Most OSS follow open standards, which means open interoperability with other systems. Open standards enable simple interoperability without the need for additional integration software or system modifications.
g. The capability and flexibility of customisation improve the IT value delivered to the business. OSS is developed by various people with different expectations and thus delivers a sophisticated result for society.
However, these benefits demand certain prerequisites. These include an understanding of the technology requirements of the business and of the capabilities of OSS in meeting these requirements, or the improvement potential of OSS in incorporating them. Another important factor is the development and maintenance of the skills required to install and configure open source software; since open source may differ from common proprietary software designs, specific skills are needed to develop and maintain a particular OSS. It is equally important to be able to evaluate the maturity of open source software. Proprietary software vendors ensure that software is launched fully tested and ready to use, whereas OSS development produces software in relatively shorter development cycles. It is, therefore, essential to monitor and measure the maturity of a version before implementation. Although OSS is generally licence free, this is not always the case. There are various licensing schemes available within open source software, and it is therefore important to understand the licensing mechanism and choose the licence that is most suited to the needs of the user or business.
4
OPEN SOURCE SOFTWARE – A CASE FOR ASSET MANAGEMENT
Information technologies utilised in asset management not only have to provide for the decentralised control of asset management tasks but also have to act as instruments for decision support. These technologies, therefore, are required to provide an integrated view of lifecycle information such that informed choices about the asset lifecycle can be made. An integrated view of asset management, however, requires appropriate hardware and software applications; quality, standardised, and interoperable information; an appropriate skill set of employees to process information; a strategic fit between the asset management processes and the chosen information technologies; and a conducive organisational environment. Current information systems in operation within engineering enterprises have paid for themselves, as the methodologies employed to design these systems define, acquire and build systems of the past, not of the future [12]. For example, the maintenance software systems that have attracted considerable attention in research and practice are far from optimal. While maintenance activities have been carried out ever since the advent of manufacturing, the modelling of an all-inclusive and efficient maintenance system has yet to come to fruition [13, 14]. This is mainly due to continuously changing maintenance requirements and the increasing complexity of asset equipment. In response to increased competitive pressures, maintenance strategies that once were run-to-failure are now fast changing to being condition based, thereby necessitating the integration of asset management decision systems and computerised maintenance management systems in order to provide support for maintenance scheduling, maintenance workflow management, inventory management, and purchasing [15]. In practice, however, data is captured both electronically and manually, in a variety of formats; shared among an assortment of off-the-shelf and customised operational and administrative systems; and communicated through a range of sources and to an array of business partners and subcontractors. Consequently, inconsistencies in the completeness, timeliness, and accuracy of information lead to an inability to provide quality decision support for asset lifecycle management [16]. In these circumstances, existing asset management information systems could best be described as pools of isolated data that are not being put to effective use to create value for the organisation. Perhaps the major issues in this regard relate to the way organisations utilise technology; however, there are certain technological constraints that also constrict their improvement potential. These constraints take the form of the variety of disparate commercial off-the-shelf systems used by the industry, which not only have limited functionality but also contribute to issues of interoperability and integration. In these circumstances, OSS, with its features and development culture, appears to be a viable option for asset managing engineering enterprises. From a technological perspective, OSS provides a global development approach and software quality testing, tailored solutions, better security than proprietary solutions, open standard architecture, and a degree of independence from vendor control. From a financial perspective, OSS is available at lower acquisition cost, avoids vendor lock-in and hidden costs, and entails lower training and software integration costs. Open source development projects involve a large number of developers on a global scale.
Thus the developer community is able to harness the competencies and capabilities of the wider developer community. OSS products do not follow any specific project plans or schedules, which means that these projects are emergent in nature with no end in sight, unless the developer community ceases to see any value in the project. Thus, the product being developed is continuously peer reviewed, updated, and enhanced. In addition, since work is not assigned to specific persons, developers contribute on the basis of their expertise. There is no central project structure and the project follows a decentralised governance structure; the developers thus work without any organisational or deadline pressures. Since there is no explicit system-level design in OSS development, projects start on an innovation, need, or opportunity basis. For example, a practitioner sees an opportunity to resolve an issue and an attempt is made to resolve it. Since this attempt is made available to a wider community of interest, others join in and enhance the original effort. OSS allows access to the source code, which enables developers to enhance, customise, and integrate the software according to organisational or personal preferences. The most common misconception about OSS is its reputation for being less secure because of its freely available code [17]. The global development concept not only enables a wide-ranging development community but also allows for global quality testing of OSS. OSS development enables software solutions to be fully customised according to the functionality needs of the organisation. Proprietary software, on the other hand, is designed according to the vendor's development planning and follows common designs and needs, which lack the depth and breadth allowed by OSS [4]. In proprietary software, quality testing is limited to a controlled environment and specific scenarios [8]. OSS development, however, involves much more elaborate testing, as OSS solutions are tested in various environments, by programmers of varying skills and experience, and in various geographic locations around the world [18]. As the main financial benefit, the acquisition cost of OSS is generally lower than that of proprietary software, or the software may even be free of charge [11]. In addition, more flexible licence coverage can be gained from OSS, enabling redistribution and software modification to comply with the specific needs of the organisation [4]. Since OSS is developed in the public domain and is freely available, there is no dearth of skills and knowledge in any market. The cost differences between proprietary software and OSS can be used for better staff training, customisation tasks, or enhancements to the existing IT infrastructure [11]. The most significant benefit of OSS is its appeal in terms of software integration and interoperability. Asset managing engineering organisations produce, store, and manage enormous amounts of data on a daily basis. This information becomes
highly valuable when it is readily accessible to asset lifecycle participants, such as planners, maintainers, designers, accountants, and asset operators. Since asset managing organisations utilise a plethora of software systems, it is highly unlikely that all of these systems conform to a single data format. At the same time, since most of these systems are closed source, it is not possible to re-engineer the software. With OSS, however, the source code is available to the organisation and the software can thus be re-engineered to make it consistent with the information architecture of the organisation.
5
CONCLUSION
Although the financial appeal of OSS appears to be the major reason for its popularity in times of economic distress, it offers much more than economic benefits. The concept of OSS has significant relevance for research organisations like CIEAM. The open source development concept fosters a sense of belonging and a community-of-practice approach. Open source projects foster a collaborative concept that helps develop the technical infrastructure by drawing on the expertise of different participants. This goes a long way towards the growth and development of the asset management industry. It allows asset managing organisations to participate in collaborative forums that engage in development efforts to create efficient, quality software solutions for managing the asset lifecycle, thus reducing dependence on commercial software. In doing so, they enhance and mature software applications specific to their asset management operations. Open source software itself also has an edge over proprietary software in various respects, such as the technical benefits that flow from the global open source development concept and interoperability among software solutions.
6
REFERENCES
1. von Krogh, G, & von Hippel, E (2003) Special issue on open source software development. Research Policy, 32, 1149-1157.
2. Perens, B (1998) Open source definition. http://ldp.dvo.ru/LDP/LGNET/issue26/perens.html
3. Hamel, MP (2007) Open source collaboration in the public sector: the need for leadership and value. National Centre for Digital Government working paper, vol. 7, no. 4. Accessed online 9 May 2009, at http://www.umass.edu/digitalcenter/research/working_papers/07_004HamelOSCollaboration.pdf
4. Woods, D, and Guliani, G (2005) Open source for the enterprise. 1st edition, Sebastopol, CA: O'Reilly Media Inc.
5. Raymond, ES (2001) The cathedral and the bazaar. Revised edition, Sebastopol, CA: O'Reilly Media, Inc.
6. Zeitlyn, D (2003) Gift economies in the development of open source software: anthropological reflections. Research Policy, 32, 1287-1291.
7. Alexy, O, and Henkel, J (2006) Promoting the penguin: who is advocating open source software in commercial settings? München, Germany: Technische Universität München.
8. Lerner, J, and Tirole, J (2002) Some simple economics of open source. Journal of Industrial Economics, 50(2), 197-234.
9. Vujovic, S, and Ulhoi, JP (2008) Online innovation: the case of open source software development. European Journal of Innovation Management, 11(1), 142-156.
10. Latteman, C, and Stieglitz, S (2005) Framework for governance in open source community. Paper presented at the 38th Hawaii International Conference on System Sciences, Hawaii.
11. Kovacs, GL, Drozdik, S, Zuliani, P, and Succi, G (2004) Open source software and open data standards in public administration. Paper presented at the Second IEEE International Conference on Computational Cybernetics (ICCC 2004). Accessed online on 14 April 2009, at www.ieee.com
12. Haider, A, & Koronios, A (2004) Converging monitoring information for integrated predictive maintenance. In Proceedings of the 3rd International Conference on Vibration Engineering & Technology of Machinery (VETOMAC-3) & 4th Asia-Pacific Conference on System Integrity and Maintenance (ACSIM 2004), New Delhi, India, December 6-9, paper No. 40.
13. Duffuaa, SO, Ben-Daya, M, Al-Sultan, K, & Andijani, A (2001) A Generic Conceptual Simulation Model for Maintenance Systems. Journal of Quality in Maintenance Engineering, 7(3), 207-219.
14. Yamashina, H (2000) Challenge to world class manufacturing. International Journal of Quality & Reliability Management, 17(2), 132-143.
15. Bever, K (2000) Understanding Plant Asset Management Systems. Maintenance Technology, July/August, 20-25. Accessed online on 27 May 2009.
16. Haider, A, & Koronios, A (2005) ICT Based Asset Management Framework. In Proceedings of the 8th International Conference on Enterprise Information Systems, ICEIS, Paphos, Cyprus, 3, 312-322.
17. Taylor, PW (2004) Open source open government. Centre for Digital Government, accessed online 11 May 2009, at http://www.govtech.com/gt/91970.
18. Mockus, A, Fielding, RT, and Herbsleb, JD (2002) Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), 309-346.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ASSESSING MAINTENANCE MANAGEMENT IT ON THE BASIS OF IT MATURITY Kans, M. a
a School of Technology and Design, Växjö University, Luckligs plats 1, S-351 95 Växjö, Sweden.
Research shows that investments in IT have a positive correlation with company profitability and competitiveness. This is also the case for maintenance management IT (MMIT), i.e. applications used for maintenance management purposes such as computerised maintenance management systems (CMMS) and maintenance management or asset management modules in enterprise resource planning (ERP) systems. However, models and methods for evaluating maintenance IT needs and IT systems are not well developed. This paper shows how the IT maturity of the maintenance organisation could be considered in the IT procurement process. If we are able to define functionality for different levels of IT maturity, the assessment and selection of the relatively best IT application for the maintenance organisation would be supported. A model describing three phases of IT maturity within maintenance (IT beginners, medium IT users and IT mature organisations) forms the theoretical basis. The applicability of the approach is tested by evaluating 24 CMMS and ERP systems.
Key Words: Computerised maintenance management, maintenance IT functionality, IT maturity
1 INTRODUCTION
To reach success in the utilisation of information technology (IT) for maintenance management, we must be able to choose the relatively best alternative from a set of possible IT solutions. This requires an ability to understand the maintenance IT needs, as well as ways to assess different alternative IT solutions. Although Computerised Maintenance Management Systems (CMMS) have been in use for several decades, models and methods for evaluating maintenance IT needs are not well developed. This paper will address how the maintenance management information technology (MMIT) procurement process could be supported by taking into account the IT maturity of the maintenance organisation. IT maturity denotes the extent to which an organisation or a human can benefit from the technology [1]. If we are able to define functionality for different levels of IT maturity, the assessment and selection of the relatively best IT application for the maintenance organisation would be supported. A model for determining the IT maturity of maintenance was developed in [2] and validated in [3]. In this paper, this model will be used as a basis for defining IT functionality requirements for the evaluation of different MMIT systems alternatives.
2
METHODS SUPPORTING THE PROCUREMENT OF MMIT
To cover past research on the assessment of MMIT, a literature survey was conducted in the full text database ELIN (Electronic Library Information Navigator), which integrates a vast number of databases and providers, such as Blackwell, Cambridge Journals, Emerald, IEEE, Science Direct and Wiley. Key words were chosen to cover the area of computerised maintenance management, combined with the terms benefits, needs, requirements, purchasing, procurement, selection and evaluation. In all, 40 hits addressed MMIT, representing 22 unique papers published between 1987 and 2008. Three papers were found that address the problem area of MMIT procurement, especially focusing on identifying and selecting IT systems. The first, [4], presents an evaluation model based on multi-criteria decision-making (MCDM) and the analytic hierarchy process (AHP). The model comprises seven levels, where level two consists of different scenario alternatives classifying future users of CMMS. This classification is mainly based on maintenance organisation size, but also on maintenance practices and technology utilisation. Levels three to six represent the criteria to be considered, 52 criteria in all.
Four generic CMMS alternatives are represented in the seventh level. By utilising this model, the needs of the company, identified as belonging to one of the scenarios, can be connected to one generic CMMS alternative, thereby reducing the decision complexity before the selection among software application alternatives is made. This model is similar to the work of [5], which also proposes an MCDM/AHP approach for the evaluation of CMMS. That model consists of five levels, of which four represent the criteria for evaluation based on the ISO/IEC 9126 classification. The CMMS alternatives are classified according to size characteristics, while the needs are exemplified by judgements from administration, production and maintenance in paper industries. These two publications provide well structured and objective methods for evaluating CMMS, and the MCDM process is commonly used in software selection, providing the possibility to compare different alternatives with respect to a large number of criteria. MCDM is, however, a tedious process that requires much data and the ability to process the data appropriately. Developing criteria for the MCDM thus requires considerable effort, and it is hard to align the criteria with the demands of the organisation. Moreover, the criteria selected for analysis in [4] and [5] are mainly non-functional requirements. In the third paper, [6], a method for determining IT requirements in maintenance was developed, which could be useful in requirements determination and CMMS selection. It does not, however, address any specific methods for the actual selection, such as those in [4] and [5]. It is suggested in [6] that IT maturity plays a role in the IT requirements determination process. Therefore, determining the IT maturity of the organisation is, in [6], one step in defining the current state of maintenance. IT maturity could be one way to reduce decision complexity in an MCDM analysis if it is used to classify IT solutions, thereby reducing the number of candidate IT systems to consider in the further analysis.
3
IT MATURITY OF MAINTENANCE
IT maturity denotes the extent to which an organisation or a human can benefit from the technology. An organisation with a low level of IT maturity uses IT mainly for the automation of daily activities and for data storage, while an IT mature organisation uses IT for collecting and combining vast amounts of data for advanced strategic decision making [7]. This section will discuss the term IT maturity with respect to maintenance management.
3.1
The Maintenance IT Maturity Model
A model for determining information technology maturity within maintenance, which is an application of the IT maturity and growth model by Nolan [7], was developed in [2] for positioning the maintenance organisation relative to its IT maturity. The model was validated in a study presented in [3]. Three distinct groups of companies with different levels of IT maturity were found in the study by means of cluster analysis. The groups and their characteristics are accounted for in Table 1 in the next section. In the maintenance IT maturity model three main phases of maintenance IT maturity have been defined: Introduction, Coordination and Integration. The history of industrial computerisation shows that these phases are natural steps towards an integrated IT solution, see [8]. The author also shows that the development of maintenance management IT in general has followed the development of industrial IT. The demands on MMIT have shifted over the years from being a tool to automate preventive maintenance management, such as task scheduling, plant inventory and stock control or cost and budgeting, to supporting predictive and proactive maintenance by providing real time data processing, effective communication channels and business function integration [9], [10]. The three phases of IT maturity are described briefly as:
1) Introduction (efficiency reached by using IT): IT is introduced into the maintenance organisation in the form of traditional CMMS, which support mainly reactive and preventive maintenance. The procurement of IT is mainly technology-oriented and IT is used for operational purposes. Goals for the IT use mainly concern efficient management of work orders, spare parts inventory, purchasing and cost control. This results in good control of available resources and cost reduction of the maintenance carried out.
2) Coordination (effectiveness reached by using IT): maintenance IT systems and other corporate IT systems are coordinated, and the use of IT is stressed more than the technology itself. These more advanced CMMS and ERP systems support mainly preventive and, to some extent, predictive strategies. IT is used for operational and tactical purposes, such as the follow-up of carried-out activities and failure frequencies. Schedules can be optimised. This results in good control of, and the capability to use, resources in the best way, and investment in maintenance will likely give positive returns.
3) Integration (cost-effectiveness reached by using IT): maintenance is an integrated part of the corporate IT system, enabling predictive and proactive maintenance strategies. Investments in IT are connected to actual needs. IT is used for operational, tactical and strategic purposes. Automatic monitoring of damage development and rapid rescheduling of activities are enabled. Based on failure history, measures can be judged and the best maintenance alternative chosen, giving the highest return on investment.
3.2
IT Systems Functionality Connected to IT Maturity
The IT maturity model could serve as an input for developing appropriate benchmarks for different levels of IT maturity within maintenance. It could also be a tool used by the procurer of MMIT to understand the prerequisites of the organisation before assessing the different MMIT solutions available. As such, knowing what functionality to utilise depending on the level of maturity is important. In [6], IT utilisation within maintenance management was studied with respect to Swedish industry. Table 1 lists the functionality studied and the results of the cluster analysis. Of 71 companies in total, 17 were characterised as belonging to the group Introduction, 35 to the group Coordination, and 19 to the group Integration. The type of functions and the extent to which they are utilised is accounted for in the three rightmost columns. In the Introduction phase, Preventive maintenance planning and scheduling, Work order planning and scheduling, Equipment parts list and Equipment repair history were utilised; these form the core functionality for the first phase. In the Coordination phase the following additional functionality was added: Inventory control, Spare parts purchasing, Maintenance budgeting and Key performance measures. For the Integration phase, four more functionalities were included: Equipment failure diagnosis, Manpower planning and scheduling, Condition monitoring parameter analysis and Spare parts requirement planning.
Table 1 Utilisation of IT in Swedish industry, Kans and Ingwald (2008)
Function | Introduction (17) | Coordination (35) | Integration (19)
WO planning and scheduling | 2 | 3 | 4
Equipment parts list | 2 | 3 | 4
Inventory control | 0 | 3 | 4
PM planning and scheduling | 4 | 4 | 5
Spare parts requirements planning | 1 | 3 | 4
Equipment failure diagnosis | 1 | 1 | 3
Equipment repair history | 2 | 3 | 4
Spare parts purchasing | 0 | 4 | 4
Manpower planning and scheduling | 0 | 1 | 3
Maintenance budgeting | 1 | 2 | 3
CM parameter analysis | 1 | 1 | 3
Key performance measures | 1 | 2 | 4
0 = Do not have this functionality, 1 = Used minimally, 5 = Used extensively
4
SELECTING MMIT BASED ON IT MATURITY
This section will illustrate how maintenance IT maturity could be utilised as a means to compare different MMIT, for instance when selecting which IT systems to investigate further (for example by means of MCDM) in an MMIT procurement situation. We will look at two different scenarios depending on the level of IT maturity of the organisation:
1) The procurement of MMIT is made for the first time, i.e. computerisation of manual routines. This scenario is characterised by a maturity level corresponding to the Introduction phase.
2) The procurement of MMIT is mainly an upgrade from simple computerised support, for instance an obsolete CMMS or a simple spreadsheet solution, to a standard CMMS or ERP system. This scenario is characterised by a maturity level corresponding to the Coordination phase.
A third scenario, not addressed in this paper, would be the procurement of MMIT connected to a major investment in IT support due to highly automated or complex production. This scenario is characterised by a maturity level corresponding to the Integration phase, and in general these kinds of investments involve the procurement of more IT systems than MMIT alone.
4.1
Sample of MMIT Systems to be Evaluated in the Scenarios
A study of commercial MMIT systems' functionality conducted by the author is utilised as a real-world illustration of the problem of selecting appropriate MMIT. The study covered seventeen CMMS and seven ERP systems that included a maintenance or asset management module, in total 24 IT systems. The basis for the data collection was information from the constructors or vendors of the IT systems in the form of folders, brochures, demonstrations, web pages, demonstration software and telephone contact with vendors. Some results connected to this data set were presented in [1]. The information was collected during 2004-2005. The aim was to study the most commonly used off-the-shelf systems for maintenance management in Swedish industry. The study objects were therefore determined using three sources:
1) A survey about maintenance management covering 118 Swedish companies, conducted by the Department of Terotechnology at Växjö University in 2004, see [11]. One question in the questionnaire covered which commercial CMMS the company used for maintenance management. A total of 87 answers were given, and some companies used more than one system. From these, systems that were used by two or more companies were chosen, in all thirteen systems (10 CMMS and 3 ERP systems).
2) An Internet survey of Enterprise Resource Planning (ERP) systems used in Sweden, based upon information provided by Data Research DPU AB [12]. Of 100 ERP systems, only seven contained a maintenance or asset management module. These systems were all included in the study (3 of the systems were already included based on source 1).
3) A list of commonly used maintenance IT systems provided by the Swedish Center for Maintenance Management (Underhållsföretagen), published in [13]. The list contained twenty-one CMMS/ERP systems, from which systems that were pure decision support systems or had fewer than 30 users worldwide were excluded. The list of study objects was complemented with an additional seven systems using this source.
This data set is highly suitable as it represents a real-life decision to be made, and the latter two sources of information could directly be utilised for this purpose. The first source represents the actual choices that procurers of MMIT in Swedish industry have made.
4.2 Functionality Selected for Comparison
The functionality included in the survey accounted for in Table 1 did not completely match the functionality included in the study described in 4.1. Therefore a complete mapping of functionality for each IT maturity phase was not possible. The functionality Spare parts requirement planning, connected to the Integration phase, is therefore not considered. Table 2 lists the functionality that will be considered for each IT maturity phase.
Table 2 IT functionality needs for different phases of IT maturity
IT maturity | IT systems functionality
Introduction | Work order planning and scheduling; Preventive maintenance planning and scheduling; Equipment parts list; Equipment repair history
Coordination | In addition to the functionality connected to phase 1: Inventory control; Spare parts purchasing; Maintenance budgeting; Key performance measures
Integration | In addition to the functionality connected to phase 2: Equipment failure diagnosis; Manpower planning and scheduling; Condition monitoring parameter analysis
5
ANALYSIS AND RESULTS
In the following, the 24 IT systems will be evaluated with respect to their ability to provide the maintenance organisation with IT support, depending on the IT maturity level and the decision-making scenario. The aim is to reduce the number of possible candidates from 24 to around five before a further in-depth evaluation. The functionality coverage for the system alternatives is shown in Figures 1 and 2. Figure 1 describes the functionality coverage of the CMMS, while Figure 2 describes the functionality coverage of the ERP systems. The total number of functions for each phase is given in the legend. For the Introduction phase, we can see that all CMMS contain three or four of the four functions in total. For the Coordination phase the number of functions varies between four and eight of eight in total. For the Integration phase the coverage is between four and ten of the eleven functions.
[Figure 1: bar chart – number of functions (0-11) covered by each of the seventeen CMMS (CMMS_1 to CMMS_17) for the Introduction (4 functions), Coordination (8 functions) and Integration (11 functions) phases.]
Figure 1. Functionality coverage for CMMS.
Figure 2 is read in the same way as Figure 1; the total number of functions connected to a certain phase is found in the legend. The functionality coverage for the Introduction phase varies between two and four of the four functions in total for the ERP systems. For the Coordination phase the number of functions varies between three and eight of eight in total. For the Integration phase the coverage is between four and eleven of the eleven functions.
[Figure 2: bar chart – number of functions (0-11) covered by each of the seven ERP systems (ERP_1 to ERP_7) for the Introduction (4 functions), Coordination (8 functions) and Integration (11 functions) phases.]
Figure 2. Functionality coverage for ERP systems.
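To make the comparison concrete, the sketch below (Python; the system profile shown is hypothetical, as the actual functionality data for the 24 systems is not reproduced here) encodes the cumulative functionality sets of Table 2 and counts, for a given system, how many functions of each phase it covers – the quantities plotted in Figures 1 and 2.

```python
# Cumulative functionality sets per IT maturity phase, taken from Table 2.
INTRODUCTION = {
    "Work order planning and scheduling",
    "Preventive maintenance planning and scheduling",
    "Equipment parts list",
    "Equipment repair history",
}
COORDINATION = INTRODUCTION | {
    "Inventory control",
    "Spare parts purchasing",
    "Maintenance budgeting",
    "Key performance measures",
}
INTEGRATION = COORDINATION | {
    "Equipment failure diagnosis",
    "Manpower planning and scheduling",
    "Condition monitoring parameter analysis",
}
PHASES = {"Introduction": INTRODUCTION,   # 4 functions
          "Coordination": COORDINATION,   # 8 functions
          "Integration": INTEGRATION}     # 11 functions


def coverage(system_functions: set[str]) -> dict[str, int]:
    """Count how many functions of each maturity phase a system provides."""
    return {phase: len(system_functions & funcs) for phase, funcs in PHASES.items()}


# Hypothetical functionality profile of one CMMS, for illustration only.
cmms_x = {
    "Work order planning and scheduling",
    "Preventive maintenance planning and scheduling",
    "Equipment parts list",
    "Equipment repair history",
    "Inventory control",
    "Maintenance budgeting",
}
print(coverage(cmms_x))  # {'Introduction': 4, 'Coordination': 6, 'Integration': 6}
```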
5.1
Computerisation of Manual Routines
The decision is to find the most suitable candidates for a maintenance organisation that today relies on manual work. The four functions connected to the Introduction phase are considered important and should be mandatory in the requirements specification. The four functions belonging to the next phase are desirable, while the functions connected to the Integration phase are seen as undesirable, as they indicate a solution that is too complex for the level of maturity. From Figure 1 we find nine CMMS meeting the mandatory requirements: CMMS_1, CMMS_9-13 and CMMS_15-17. These are the candidates to select from. To further delimit the number of candidates we compare the functionality coverage for the desirable requirements. It is found that four of the nine CMMS show a high level of coverage. Therefore, we select systems CMMS_1, CMMS_12, CMMS_13 and CMMS_14 for further evaluation. We note, though, that CMMS_13 seems to be a highly complex system and we might disregard it for this reason. An ERP system is in general a bit too complex for an IT beginner, but depending on factors such as the existence of an ERP system within the company, this option could also be of interest in this scenario. Four ERP systems provide full coverage of the mandatory functions (ERP_1-3, ERP_6), and these also show high coverage of the desirable functions. All of these systems seem rather advanced, however, and for this reason we will likely not take them further in the analysis.
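A minimal sketch of this shortlisting rule, reusing the phase sets from the sketch after Figure 2 (the rule is a plain reading of the paragraph above; per-system candidate data would be supplied separately and is not reproduced here):

```python
# INTRODUCTION, COORDINATION and INTEGRATION are the phase sets defined in the
# previous sketch (Table 2).
DESIRABLE = COORDINATION - INTRODUCTION    # Coordination-only functions
UNDESIRABLE = INTEGRATION - COORDINATION   # Integration-only functions: complexity signal


def shortlist(candidates: dict[str, set[str]]) -> list[tuple[str, int, int]]:
    """Keep systems covering all mandatory (Introduction) functions, then rank
    them by desirable coverage (more is better) and by the number of
    undesirable, overly advanced functions (fewer is better)."""
    passing = {name: funcs for name, funcs in candidates.items()
               if INTRODUCTION <= funcs}   # all four mandatory functions present
    return sorted(((name, len(funcs & DESIRABLE), len(funcs & UNDESIRABLE))
                   for name, funcs in passing.items()),
                  key=lambda t: (-t[1], t[2]))
```

For the upgrade scenario in Section 5.2, the same sketch applies with the Coordination set as the mandatory requirement and the Integration-only functions treated as desirable rather than undesirable.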
5.2
Upgrading of Existing IT System
This scenario regards the upgrading of an old, incomplete or simple IT solution for maintenance management. It is assumed that the maintenance organisation as well as the maintenance personnel are IT mature to some extent; they are, for instance, used to storing data in digital form and to planning activities with computerised support. Functions connected to the first and second maturity phases are therefore to be seen as mandatory, while the functions connected to the last phase are desirable. Both a CMMS and an ERP solution are of interest. Only one CMMS (CMMS_13) and two ERP systems (ERP_1-2) fulfil the mandatory requirements and will be selected for further analysis. However, the system ERP_3 seems to contain a high number of functions in total and is therefore included in the list of candidates.
6
CONCLUSIONS
This paper proposes taking IT maturity into account in the procurement of IT for maintenance management. IT maturity has been used here as a means to compare functionality in different MMIT in order to select the most suitable candidate systems for more detailed analysis. As such, it could for instance be utilised as a first step before a further MCDM analysis. This paper only addresses the functional requirements, whereas the non-functional requirements have to be considered in the detailed analysis. IT maturity is not a static condition. The history of IT has shown how companies have gradually moved from lower to higher levels of IT maturity, resulting in shifting demands on IT applications to be able to meet more advanced business goals. In maintenance, a similar development has been noted. This implies that the assessment of IT applications is not an activity to be carried out only when new software is to be purchased, but should be made on a regular basis to determine whether the IT supports current maintenance practices to the full extent or not. For this purpose, the IT maturity model could serve as a simple yet powerful tool. This paper has shown that the different maturity phases can be translated into IT functionality, which enables the assessment of IT support at both an overall and a detailed level.
7
REFERENCES
1. Kans, M. (2008) On the Utilisation of Information Technology for the Management of Profitable Maintenance. PhD Thesis. Växjö: Växjö University Press.
2. Kans, M. (2007a) The Development of Computerized Maintenance Management Support. Proceedings of the International MultiConference of Engineers and Computer Scientists 2007, Hong Kong, 21-23 March, 2113-2118.
3. Kans, M. and Ingwald, A. (2008) Exploring the information technology maturity within maintenance management. COMADEM 2008, Energy and environmental issues: Proceedings of the 21st International Congress and Exhibition, Prague, Czech Republic, 11-13 June, 243-252.
4. Carnero, M.C. and Novés, J.L. (2006) Selection of computerised maintenance management system by means of multicriteria methods. Production Planning and Control, 17(4), 335-355.
5. Braglia, M., Carmignani, G., Frosolini, M. and Grassi, A. (2006) AHP-based evaluation of CMMS software. Journal of Manufacturing Technology Management, 17(5), 585-602.
6. Kans, M. (2008) An approach for determining the requirements of computerised maintenance management systems. Computers in Industry, 59(1), 32-40.
7. Nolan, R.L. (1979) Managing the crises in data processing. Harvard Business Review, 57(2), 115-126.
8. Kans, M. (2009) The advancement of maintenance information technology: A literature review. Journal of Quality in Maintenance Engineering, 15(1), 5-16.
9. Labib, A.W. (2004) A decision analysis model for maintenance policy selection using a CMMS. Journal of Quality in Maintenance Engineering, 10(3), 191-202.
10. Pintelon, L., Preez, N.D. and Van Puyvelde, F. (1999) Information technology: opportunities for maintenance management. Journal of Quality in Maintenance Engineering, 5(1), 9-24.
11. Alsyouf, I. (2004) Cost effective maintenance for competitive advantages. PhD thesis, Växjö University, School of Industrial Engineering.
12. Data Research DPU AB, [WWW document] URL http://www.dpu.se/listmeny.html [2004-02-02].
13. Swedish Center for Maintenance Management (2003) U&D's UH-system översikt. Underhåll & Driftsäkerhet, 7-8, 28-29.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
OPEN STANDARDS-BASED SYSTEM INTEGRATION FOR ASSET MANAGEMENT DECISION SUPPORT Avin Mathew a, Michael Purser a, Lin Ma a, Matthew Barlow b
a CRC for Integrated Engineering Asset Management, Queensland University of Technology, Brisbane, Australia
b Australian Nuclear Science and Technology Organisation, Lucas Heights, Australia
Over the last decade, system integration has grown in popularity as it allows organisations to streamline business processes. Traditionally, system integration has been conducted through point-to-point solutions – as a new integration scenario requirement arises, a custom solution is built between the relevant systems. Bus-based solutions are now preferred, whereby all systems communicate via an intermediary system such as an enterprise service bus, using a common data exchange model. This research investigates the use of a common data exchange model based on open standards, specifically MIMOSA OSA-EAI, for asset management system integration. A case study is conducted that involves the integration of processes between a SCADA, a maintenance decision support and a work management system. A diverse range of software platforms is employed in developing the final solution, all tied together through MIMOSA OSA-EAI-based XML web services. The lessons learned from the exercise are presented throughout the paper.
Key Words: system integration; asset management; enterprise service bus; MIMOSA OSA-EAI; web services; service oriented architecture
1 INTRODUCTION
Over the last decade, system integration has grown in popularity as it allows organisations to streamline business processes. Many companies are now automating their asset management workflows such that stock levels can be reordered based on RFID-scanned remaining quantities; work notifications can be triggered from condition monitoring prognoses; and work details and asset documents are automatically uploaded to PDAs for maintenance teams before they depart. System integration also supports business intelligence and data mining, where data sets can be combined in non-traditional ways. This leads to scenarios such as visualising scheduled maintenance geographically; seeing failure times overlaid on charts of operation or condition parameters; or predicting future asset capacity based on reliability block diagrams, asset throughput specifications, and predicted availability. Traditionally, system integration has been conducted through point-to-point solutions – as a new integration scenario requirement arises, a custom solution is built between the relevant systems. It is now known that while point-to-point solutions offer good performance and relatively short development times, they are sorely lacking in scalability and ease of management. Bus-based solutions are now preferred, whereby all systems communicate via an intermediary system through adapters. The adapters convert data between a system's native format and the bus' format and back again. The enterprise service bus (ESB) is the result of this paradigm, with many of the larger IT vendors now offering competing products in this space. To streamline the transfer of data between a work management system, a process control system, and an asset health management system at ANSTO, a nuclear research facility in Australia, a case study was conducted into developing a service bus approach using open standards. As opposed to an ETL (extraction, transformation, and loading) process conducted on a batch transfer basis, the service bus approach sends messages in real time once they are collected or computed by the respective systems. The messages internally use the format of the MIMOSA (Machinery Information Management Open Systems Alliance) OSA-EAI (Open Systems Architecture for Enterprise Application Integration) [1] XML Schema, and are then converted to the native system formats through specially designed adapters. As the goal is to move towards a service-oriented architecture (SOA), all developed components (in particular, native data model to OSA-EAI data model mappings) are componentised for reuse.
2
BACKGROUND INFORMATION
2.1 Enterprise Service Bus
The enterprise service bus sits between information systems and facilitates the communications among them. Thus all messages pass through the bus, forming a hub-and-spoke architecture (see Figure 1) as opposed to a point-to-point architecture. The change in architecture leads to the number of potential connections reducing from n(n - 1)/2 to n (where n is the number of integrated systems, each requiring a single connection to the ESB), decreasing complexity and increasing scalability.
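A minimal sketch of the arithmetic behind this claim (illustrative only; the figures are not taken from the case study):

```python
# Illustrative count of integration links: a full point-to-point mesh versus a
# bus topology where each system only needs one adapter connection to the ESB.
def point_to_point_links(n_systems: int) -> int:
    return n_systems * (n_systems - 1) // 2

def bus_links(n_systems: int) -> int:
    return n_systems  # one connection per system to the ESB

for n in (4, 8, 16):
    print(f"{n} systems: {point_to_point_links(n)} point-to-point links "
          f"vs {bus_links(n)} bus connections")
```

For 16 systems this gives 120 potential point-to-point links against 16 bus connections, which is the scalability argument made above.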
A bus system also leads to a decoupling between systems, whereby systems request data and receive an answer without knowing where it came from. It also leads to a greater reliance on contract-based services, as systems only know of the ESB and the services or interfaces it makes visible to a particular system. To assist the exchange of messages with consistent semantics, a canonical data model [2] (sometimes known as a canonical message model) is used to provide a common, singular data format. In this case study, the OSA-EAI forms the canonical data model.
Figure 1. Enterprise service bus approach mediating asset management information systems
2.1.1 Adapters
Adapters provide a mediation mechanism for data formats and accessibility between two systems. The adapter translates data models and reference data between the ESB's canonical data model and the native system, typically through primary key mappings. For example, the asset with ID 504493 in the work management system could map to the asset with primary key "0000000100000001","1" in the ESB model. As systems might be set up to use the same primary key, the mapping can be exposed to adapters through an ESB service (also following the SOA paradigm). Mappings can become more complex if entities in the native data model map other than 1:1 with the canonical data model (e.g. a person's full name needing to be split into a first and last name). There are two options as to where a mapping can be stored: either in the adapter itself, or in a service outside of the adapter. Storing it in the adapter is efficient when the data type is not used in any other integration scenario and there is an n:m mapping between the native and canonical data model. If the data type is used in multiple integration scenarios, and maps easily between the native and canonical format, then storing the mapping in an accessible service is preferred to enhance componentisation. Adapters can also assist in increasing the accessibility of systems in that data can be exposed through web services. Thus, the adapter takes a request, transforms the request into a form suitable for a file reader, database SQL query, or an API call, receives a result and returns the result to the original requester.
2.2 MIMOSA OSA-EAI
The MIMOSA OSA-EAI1 provides open data exchange standards for operations and maintenance data and comprises several layers, including a conceptual model, physical model, reference data, and XML schema definition.
As MIMOSA OSA-EAI is a continually improving standard, this discussion is in reference to version 3.2.1 (the latest version at the time of writing).
The OSA-EAI has been expounded in previous work and only the relevant sections are discussed below. These sections are the Application, Service and XML Definitions, and the Reference Data Library. The OSA-EAI supplies three methods of transferring data via XML: through the Tech-DOC, Tech-CDE, and Tech-XML specifications. Each of these approaches has its merits, and while at times they can appear to be substitutes, it is important to select the right approach on a case-by-case basis. The Tech-DOC specification is a single XML Schema that represents all parts of the CRIS. Multiple CRIS entities, as well as multiple rows of each entity, can be transmitted in a single XML document. No connection metadata is stored in the file, and as such, it is binding independent, unlike Tech-CDE and Tech-XML. The Tech-CDE specification comprises three XML Schemas: a query schema, a write schema, and a common schema that contains CRIS and supporting structural message elements. The operations, set by parameters in the messages, closely align with the CRUD (Create, Read, Update, and Delete) operations for databases. It covers all parts of the OSA-EAI, and multiple CRIS entities as well as multiple rows can be transmitted in a single XML document. Tech-CDE follows a request and acknowledgement model and contains a SOAP-based specification. The Tech-XML specification comprises numerous XML Schemas split over ten different package areas (including a package for CRIS and supporting structural message elements). The package-based classification results in duplication of certain schemas (e.g. the CreateAsEvent schema exists in seven of the packages); however, these messages incorporate different element types, resulting in non-interchangeable XML documents. The schemas are lightweight and specific to a certain operation and CRIS area. The semantics of the schemas are restricted to queries and inserts; neither edit nor delete schemas exist. Create messages are usually limited to a single row at a time – multiple rows are created by sending multiple messages. While all sections of the CRIS are covered by Tech-XML, not all entities in the CRIS are covered. As with Tech-CDE, Tech-XML also follows a request and acknowledgement model and contains a SOAP-based specification.
Figure 2. MIMOSA OSA-EAI 3.2.1 layers
With three different options for message formats, this research prioritises Tech-XML, followed by Tech-CDE, and then Tech-DOC. Tech-XML's operation-based schemas with communication metadata lend themselves well to the SOA and ESB approach. Tech-CDE's schemas also contain communication metadata and allow for edits, deletes, and multiple rows to be sent, but are not as specific as Tech-XML. Tech-DOC's schema is a fallback if the other two do not fit technically or semantically.
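Relating to the adapter discussion in Section 2.1.1, the sketch below shows one way a reference data mapping could be exposed as a reusable service. It is a hypothetical Python illustration rather than the case study's actual .NET/Java implementation, and the class names and identifier layout are assumptions; only the example mapping values are taken from the text above.

```python
# Hypothetical sketch of a reference data mapping service: it resolves a native
# system key (e.g. a work management system asset ID) to a canonical identifier
# used on the bus. Names and identifier formats are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class CanonicalKey:
    site_id: str      # registry/site part of the canonical primary key
    asset_id: str     # identifier within that registry

class ReferenceDataMappingService:
    def __init__(self) -> None:
        # (system name, native key) -> canonical key
        self._mappings: dict[tuple[str, str], CanonicalKey] = {}

    def register(self, system: str, native_key: str, canonical: CanonicalKey) -> None:
        self._mappings[(system, native_key)] = canonical

    def to_canonical(self, system: str, native_key: str) -> CanonicalKey:
        return self._mappings[(system, native_key)]

    def to_native(self, system: str, canonical: CanonicalKey) -> str:
        for (sys_name, native_key), value in self._mappings.items():
            if sys_name == system and value == canonical:
                return native_key
        raise KeyError(f"No {system} key for {canonical}")

# Example echoing the mapping quoted in Section 2.1.1
service = ReferenceDataMappingService()
service.register("work_management", "504493",
                 CanonicalKey(site_id="0000000100000001", asset_id="1"))
print(service.to_canonical("work_management", "504493"))
```

Keeping the mapping behind a service like this, rather than inside each adapter, matches the componentisation argument made above for data types shared across integration scenarios.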
3 REQUIREMENTS AND DESIGN
Four requirements influenced the design of the integration service. The first three were functional requirements, while the fourth was a technological limitation. These requirements were:
1. Storing monthly-aggregated SCADA measurement data in the work management system for reporting purposes.
2. Using a maintenance decision support system to process SCADA measurement data and work management system failure/maintenance data to predict failures.
3. Triggering potential work notifications from the maintenance decision support system to the work management system.
4. The SCADA system cannot be directly accessed from any network due to a security requirement.
From the requirements, the work management system and maintenance decision support system would require two-way communications, while the SCADA system would only require one-way (output) communication. The SCADA system data would be downloaded as a CSV (Comma Separated Value) file, which could then be distributed onto the company intranet. Three triggers were identified that would initiate integration processes; they are shown in Table 1. As the maintenance decision support system is primarily condition data-driven, it was decided that once new SCADA measurement data was available (with the CSV collection interval determined by the organisation's business rules), any new maintenance data would be acquired in real-time. Thus, while the arrival of new SCADA data was triggered by human action (bringing a CSV file to the system), the decision support system would invoke the process to acquire failure/maintenance data. Once a failure prediction and maintenance schedule was calculated by the maintenance decision support system, human action would be required to send the schedule to the work management system.
Table 1 Integration process triggers

Trigger: New SCADA measurement data was received as a CSV file. Initiator: Human
Trigger: The decision support system started a failure prediction process which required failure/maintenance data. Initiator: System
Trigger: A new maintenance schedule was calculated by the maintenance decision support system. Initiator: Human
3.1 Sequence Diagrams
The requirements were translated into sequence diagrams to describe the processes that occur after a trigger has occurred. The sequence diagrams are presented in Figures 3, 4, and 5 and show the systems involved in the integration process, the message order, and the message format as mapped to the MIMOSA OSA-EAI. The first sequence diagram, Figure 3, involves the three required systems as well as a Data Aggregation Service, which performs the function of aggregating the SCADA data (recorded at hourly intervals) to monthly averages (a minimal sketch of this aggregation step follows the list below). Data is transferred using the Tech-DOC XML Schema2 because:
• Tech-CDE's queries and writes did not semantically fit with the design
• The closest matching Tech-XML schema, CreateScalarDataAndAlarms from the TREND package, is designed for sending a single measurement point and becomes inefficient for larger datasets (increased CPU usage and latency due to the number of messages required – see Section 5 for details)
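The aggregation step referred to above can be sketched as follows; the CSV layout (one timestamp column and one value column) is an assumption for illustration, not the actual file format used in the case study.

```python
# Hypothetical sketch: aggregate hourly SCADA measurements to monthly averages.
# Assumes rows like "2009-03-14T09:00:00,12.7"; the real CSV layout is not
# described in the paper, so this format is an assumption.
import csv
from collections import defaultdict
from datetime import datetime

def monthly_averages(csv_path: str) -> dict[str, float]:
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    with open(csv_path, newline="") as f:
        for timestamp, value in csv.reader(f):
            month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
            sums[month] += float(value)
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}
```

The monthly averages returned by such a service would then be packaged into the Tech-DOC document for transfer.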
The second sequence diagram, Figure 4, sees the querying of failure data by the decision support system. Such data is stored as notification records in the work management system, and the ESB is only used to forward the request and return the result from and to the decision support system. Data is transferred using the Tech-XML QuerySgCompWork schema from the WORK package as failure data is stored within the maintenance work order records. All completed work orders for each segment are returned.
2 While using OPC's standards would be the ideal means for acquiring data from the SCADA system, the physical network separation meant that this could not occur.
Figure 3. New SCADA CSV data event triggered process
Figure 4. Failure prediction process start event triggered process
Figure 5. Maintenance schedule calculated event triggered process
Figure 6. Overall solution architecture mapped to technologies
The third sequence diagram, Figure 5, shows the integration process started by the calculation of a new maintenance schedule. As with the second sequence diagram, the ESB simply forwards the create request and returns the result. Data is transferred using the Tech-XML CreateAsRFWandWR schema from the WORK package. As the Tech-XML schema only allows one row to be sent per message, multiple messages may need to be marshalled and sent. Unlike the first sequence diagram, which transfers SCADA data, the number of messages in this scenario is extremely small and does not warrant the Tech-CDE or Tech-DOC capabilities of sending multiple rows. 4
IMPLEMENTATION
Both the SOA and ESB paradigms are platform-agnostic, although most software has standardised on using SOAP-based web services. While MIMOSA OSA-EAI ultimately uses XML-based schemas, both Tech-CDE and Tech-XML are inclined towards SOAP bindings and thus this is the only platform restriction. A mixture of different platforms was selected in developing the integration scenarios, including Microsoft .NET, Sun Microsystems Java EE, and VisualWorks Smalltalk. Figure 6 shows the platforms, the hosting mechanism, layers, and specific technologies used in implementing the solution. Due to the relatively small processing resources required, the entire system was deployed on a standard desktop machine. The SCADA CSV File Adapter is the only component of the system visible to users in the organisation, as it contains a user interface; the other components of the system, once installed and set up, run in the background. As per good architectural design, components are designed with clearly delineated layers in mind, promoting reusability and maintainability. As OSA-EAI business objects (automatically generated from XSD or WSDL documents) are used for all business layers, the reference data mapping service3 returns the appropriate OSA-EAI objects for a particular business object. The reference data mapping service forms a pseudo asset registry, as it contains all object and data mappings for all systems that wish to connect to the ESB. OpenESB was selected for the ESB component as it provided a free open-source platform that is also commercially supported. Internally, it uses Java Business Integration (JBI) standards for the implementation of its binding and service components such that it can easily interoperate with JBI components from other platforms. The three sequence diagrams were translated into BPEL (Business Process Execution Language) orchestrations using the graphical design tools within OpenESB. Access to the work management system was governed by a web service wrapper written using the Smalltalk Domain Modelling Environment (DoME). While the services provided by this component could have been replicated in the .NET or Java EE environments, the selection of the platform was outside the authority of this case study.
3 Note that the data aggregation service is distinct from the reference data mapping service, but contains the same layered architecture, and is hence combined on the figure.
5
DISCUSSION
The relatively small size of the case study might raise the question of why such a complicated design and architecture were used for a somewhat simple problem. A point-to-point solution could have been used that eliminated the ESB and MIMOSA OSA-EAI formats, such that all components would communicate directly with other components and transform data from their own format to the target format. While this method would be quicker to develop and would most likely offer better performance, it cannot compete with the illustrated approach in terms of scalability, extensibility, and maintainability. The implementation of the design involves a number of platforms, not because of technical restrictions but for logistical and financial reasons. As two universities were engaged to develop the solution, the .NET platform and the Smalltalk platform were deemed to be the optimal choices given the different preferences and experience at the universities. Financial restrictions played into the selection of a free ESB platform, with OpenESB selected after a cost-benefit analysis was conducted. Nevertheless, the mixture of platforms highlights the benefits of working with a standard communication layer in that all components can interoperate despite using different underlying technologies. For the first integration sequence, a Tech-DOC schema was used for the transfer of SCADA measurement data, rather than a Tech-XML one. The reason was performance: Table 2 shows that the Tech-DOC schema required half the time of the Tech-XML one for a particular dataset with 21168 measurement events. The difference is due to the time spent creating new objects in memory, marshalling/unmarshalling parameters into/from XML documents, and sending messages over the network, ceteris paribus. The difference in the amount of data sent as XML documents was not substantial, with the extra data composed of structural tags required by the XML Schema. As mentioned in previous literature involving MIMOSA OSA-EAI [3], documentation on the standard and best practices is sparse. While revisions have seen the documentation improve, with almost all CRIS fields and XML Schema documents now documented, best practice in using the XML Schema documents remains a matter of trial and error. Efforts are being made to create a software development kit, which should alleviate certain implementation issues and provide guidance on how the standard should be interpreted.
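Using the figures reported in Table 2 below, the structural overhead per message can be roughly estimated; the derived value is back-of-the-envelope arithmetic for illustration, not a number reported by the authors.

```python
# Rough, illustrative estimate of the per-message structural overhead implied by
# Table 2 (derived arithmetic, not a figure reported in the paper).
techdoc_mb, techxml_mb = 54.510, 87.953
techdoc_msgs, techxml_msgs = 3, 21168

extra_mb = techxml_mb - techdoc_mb        # ~33.4 MB of additional XML structure
extra_msgs = techxml_msgs - techdoc_msgs  # ~21,165 additional messages
print(f"~{extra_mb * 1024 / extra_msgs:.2f} KiB of extra structure per message")
```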
Table 2 Transferring SCADA measurement data to the Maintenance Decision Support System

                               Tech-DOC                  Tech-XML
Number of measurement events   21168                     21168
Number of messages sent        3                         21168
Total size of messages         54.510 MB                 87.953 MB
Elapsed time                   22 mins, 45.372 secs      44 mins, 51.738 secs

6 CONCLUSION
The ability to automatically transfer data seamlessly between systems leads to a raft of possibilities for business process optimisation. Standards in data exchange for asset management, while not yet fully mature, are making headway in allowing organisations to develop flexible and reusable integration scenarios. MIMOSA OSA-EAI is a contending standard and, despite some minor issues regarding documentation, its support for a large range of asset management data types allows it to be used in numerous asset management integration processes. While the benefits of standards-based interoperability for asset management organisations are clear, they can only be achieved through collaboration amongst software vendors and the standards community. 7
REFERENCES
1 MIMOSA. (2008) Open System Architecture for Enterprise Application Integration V3.2.1, from www.mimosa.org/downloads/44/specifications/index.aspx
2 Hohpe G & Woolf B. (2003) Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Boston, USA: Addison-Wesley Professional.
3 Mathew A, Zhang L, Zhang S & Ma L. (2006) A review of the MIMOSA OSA-EAI database for condition monitoring systems. Mathew J, Ma L, Tan A & Anderson D (Eds.). World Congress on Engineering Asset Management, Gold Coast, Queensland, Australia. Springer.
Acknowledgements This research was conducted within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government’s Cooperative Research Centres Programme. The authors would like to acknowledge the assistance of Dr. Georg Grossmann from the University of South Australia and Dr. Ken Bever from Machinery Information Management Open Systems Alliance.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE DATA QUALITY IMPLICATIONS OF THE SERVITIZATION – THEORY BUILDING
Joe Peppard a, Andy Koronios b and Jing Gao c
a School of Management, Cranfield University, UK
b School of Computer and Information Science, University of South Australia
c School of Computer and Information Science, University of South Australia
Servitization is now widely recognised as the process of creating value by adding services to products. A cornerstone of any servitization strategy is that ownership of the product or asset does not transfer to the customer. Rather, the customer purchases a service or capability, with the asset being used to deliver that service or capability. The bulk of the research to date seeks to understand how traditional manufacturers might deliver integrated products and services with greater efficiency and effectiveness. One area that has not been addressed is the implications for customers of high-quality data no longer being available to them. This paper will explore the data quality issues emerging through the servitization transformation. Implications for customers will be highlighted using a proposed framework developed from the data quality literature.
Key Words: Data Quality and Servitization 1
SERVITIZATION
Servitization1 is now widely recognised as the process of creating value by adding services to products (Vandermerwe & Rada, [23]). Since this term was first coined in the late 1980s it has been studied by scholars to understand the methods and implications of service-led competitive strategies for traditional product manufacturers (e.g. Wise & Baumgartner, [27]; Oliva & Kallenberg, [16]; Slack, [20]). During this same period there has been a growth in research on related topics such as Product-Service Systems (PSS) (Goedkoop, [6]; Mont, [15]; Meijkamp, [14]; Manzini & Vezolli, [13]), services operations, services science (Chesborough & Spohrer, [4]) and engineering asset management (Steenstrup, [21]). A cornerstone of any servitization strategy is that ownership of the product or asset does not transfer to the customer. Rather, the customer purchases a service or capability, with the asset being used to deliver that service or capability. Thus, the proposition essentially represents an integrated product and service offering that delivers value-in-use (Baines et al., [1]). For example, in the aerospace sector, engine manufacturers such as Rolls-Royce, General Electric and Pratt & Whitney all offer some form of performance-based contract to commercial airlines, tied to product availability and the capability it delivers (e.g., hours flown). Rolls-Royce (R-R) has now registered trademarks for both 'Power by the Hour' and the more inclusive 'TotalCare'. Such contracts provide the airline operator with fixed engine maintenance costs over an extended period of time (e.g. ten years). In developing TotalCare, R-R is just one example of a manufacturer that has adopted a product-centric servitization strategy. Today, many other western companies, especially those in industry sectors with large installed product bases and high value assets (e.g., locomotives, elevators, machine tools, business machines, printing machinery, construction equipment and agricultural machinery), are also following such strategies and inevitably face similar challenges.
1 Servitization is often referred to as servicizing, particularly in the United States. See White et al. (1999) and Rothenberg (2007).
2
DATA QUALITY PERSPECTIVE
Dimensions of data quality typically include accuracy, reliability, importance, consistency, precision, timeliness, fineness, understandability, conciseness, and usefulness (Ballou & Pazer [2]; Wand & Wang [24]). Although the many dimensions associated with data quality have now been identified, it is still difficult to obtain rigorous definitions for each dimension so that measurements may be taken and compared over time. From the literature, it appears that there is no general agreement as to the most suitable definition of data quality or to a parsimonious set of its dimensions (Klein [10]). Some researchers have placed a special focus on criteria specified by the users as the basis for high quality information (Strong [22]; English [5]; Salaun & Flores [19]). Orr [17] suggests that the issue of data quality is intertwined with how users actually use the data in the system, since the users are the ultimate judges of the quality of the data produced for them. As a result, Wang & Strong's ([22]) widely-accepted definition of data quality, "quality data are data that are fit for use by the data consumer", is adopted in this paper. The bulk of the research on servitization to date seeks to understand how traditional manufacturers might deliver integrated products and services with greater efficiency and effectiveness. One area that has not been addressed is the implications for customers of data and knowledge no longer being available to them. Thus, from a commercial perspective, servitization may be an attractive proposition for customers; however, there may be consequences for data quality and knowledge accumulation which could have longer-term implications for their competitiveness. At a fundamental level, who owns the data generated in the operation of the asset? Does servitization result in data quality problems, and in knowledge that could potentially be of significant value being lost to the organisation? This paper will explore the data quality issues emerging from the servitization transformation. Data quality implications for customers will be highlighted using a proposed framework developed from the data quality literature.
3 SERVITIZATION IMPACT ON DATA QUALITY
The study by Lindberg and Nordin ([12]) shows that more and more firms are moving from manufacturing goods to providing services, or to integrating products and services into solutions or functions. This concept of "servitization" suggests that all organisations, markets and societies are fundamentally centred on the exchange of services – specifically, the exchange of intangible resources. Kapletia and Probert [9] point out that industries are currently focusing only on the core capabilities of servitization solutions, such as systems integration, operational services, and business consulting, while paying less attention to other capabilities such as data, information and knowledge accumulation. Traditionally, manufacturers are responsible for delivering products and supplying manuals (e.g. operational manuals and specifications) to end-users; only to a lesser extent are they responsible for collecting operational data over a long period of time. In the majority of cases, the manufacturers are not owners of the operational data and may have no right to request these data in sensitive environments. In many organisations, the analysis of accumulated operational data is a key enabler of effective asset maintenance and business process improvement. However, such analysis may become impossible under servitization solutions. For example, instead of selling engines directly, Rolls-Royce provides a total solution (based on flying hours) to airline companies. Rolls-Royce is responsible for monitoring and collecting all operational data during and after each flight. In many cases, a proportion of this data is visible to the airlines. However, sensitive data (e.g. performance and benchmark ratios in comparison to competing products) is unlikely to be disclosed. In practice, the case study by Johnson and Mena [7] reveals that the supplier (manufacturer) relies on sensor instruments installed on the product to monitor and collect operational data through real-time transmission. Once the product reaches the end of its lifespan, a warning message is issued from the supplier's (manufacturer's) system to the customer to replace the equipment or re-negotiate the contract. Inevitably, this causes a great deal of concern to customers: if analysing data enables the manufacturers to improve the performance of the product, will any savings be passed on to the customer? If, for example, an engine manufacturer detects that fuel consumption could be improved by operating the engine differently, do they pass this knowledge to customers (assuming fuel consumption is the responsibility of the customer)? How does the customer's lack of access to operational data affect them when re-negotiating contracts? Failure or deficiency in supplying and accumulating data for the customer often results in data quality problems and may lead to less-informed decisions. In order to address data quality impacts in servitization and allow customer organisations to better negotiate their servitization contracts, a preliminary theoretical framework consisting of four stages is built from the data quality literature.
1. Understand the Role of Data in Servitization – As a product and as a part of the service
Data should be treated as both a product and a service. The data quality literature draws distinctions between the product quality and service quality of data (Zeithaml, Berry & Parasuraman [28]). Product quality of data includes product features that
involve the tangible measures of information quality, such as accuracy, completeness, and freedom from errors. Service quality of data includes dimensions related to the service delivery process, and intangible measures such as ease of manipulation, security, and added value of the data and information to consumers (Kahn, Strong & Wang [8]). With respect to the data quality definition discussed previously – "fitness for use" – the collected data indeed serves the organisation in its business operations. Thus, in any servitization solution, data must firstly be considered as a product which needs to be delivered to the customer organisation along with the physical products on which the service is based. As with the physical product, the customer organisation does not necessarily need to own the generated data; however, full access to the data must be granted. Secondly, data must also be supplied in a form that the customer organisation can analyse within their existing information systems. For example, if Rolls-Royce chose to supply the engine operational data to airlines in Excel spreadsheets instead of feeding it in real time into the airline systems, this data could become meaningless and out of date.
2. Develop Data Requirements and Prioritise Quality Dimensions
Modern organisations, both public and private, are continually generating large volumes of data. According to Steenstrup from Gartner Research ([21]), each person on the planet generates an average of 250 Mbytes of data per annum, and this volume is doubling each year. At an organisational level, there are incredibly large amounts of data, including structured and unstructured, enduring and temporal, content data, and an increasing amount of structural and discovery metadata. Most organisations have far more data than they can possibly use; yet, at the same time, they do not have the quality data they really need (Levitin and Redman, [11]). It is unlikely that a servitization solution provider will supply all data to the customer organisation, and it can also become expensive for a customer organisation to acquire a large amount of data (as reflected in the servitization contract). Thus, it is essential for customer organisations to understand how the operational data were used for their business purposes in the past, as the adoption of servitization solutions will reduce the organisation's capability to collect data itself. As discussed previously, data may be considered both as a service and a product in servitization; there is therefore a need to list the data quality dimensions in order to ensure that the customer organisation is able to receive quality data and analyse the data for informed and reliable decision-making. The PSP/IQ model developed by Kahn, Strong & Wang [8] can be regarded as a basis for organisations to understand their data quality requirements to be used in negotiating the servitization contract with suppliers. The PSP/IQ model (Table 1) views data as a product, and suggests that the value of a product or service depends on the quality of the data associated with the products and services.
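Before turning to the PSP/IQ mapping in Table 1, the sketch below illustrates what checking a couple of prioritised dimensions against contractually agreed thresholds could look like in practice; the chosen dimensions, field names and thresholds are illustrative assumptions, not part of the PSP/IQ model itself.

```python
# Hypothetical sketch: score two data quality dimensions (completeness and
# timeliness) for a batch of operational records supplied by a provider.
from datetime import datetime, timedelta

def completeness(records: list[dict], required_fields: list[str]) -> float:
    """Fraction of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(all(r.get(f) not in (None, "") for f in required_fields) for r in records)
    return complete / len(records)

def timeliness(records: list[dict], now: datetime, max_age: timedelta) -> float:
    """Fraction of records delivered within the agreed maximum age."""
    if not records:
        return 0.0
    fresh = sum((now - r["delivered_at"]) <= max_age for r in records)
    return fresh / len(records)

# Illustrative data and thresholds only
now = datetime(2009, 6, 1)
records = [
    {"engine_id": "E1", "fuel_flow": 0.82, "delivered_at": now - timedelta(hours=2)},
    {"engine_id": "E2", "fuel_flow": None, "delivered_at": now - timedelta(days=3)},
]
print(completeness(records, ["engine_id", "fuel_flow"]))    # 0.5
print(timeliness(records, now, max_age=timedelta(days=1)))  # 0.5
```

Scores of this kind give the customer organisation a measurable basis for the negotiation step discussed above.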
Table 1: Mapping the IQ dimensions into the PSP/IQ model

Dimensions        Conforms to Essential DQ Dimensions      DQ Dimensions that Meets or Exceeds Consumer Expectations
Product Quality   Accuracy                                 Useful Data
                  Concise Representation                   Appropriate Amount
                  Completeness                             Relevancy
                  Consistent Representation                Understandability
Service Quality   Reliability                              Usable Information
                  Timeliness                               Believability
                  Security                                 Accessibility
                                                           Ease of Manipulation
                                                           Reputation
                                                           Value-Added

Adapted from Kahn, Strong & Wang ([8])
3. Establish a data quality maturity model for comparing solutions and evaluating the supplier performance
More and more companies are becoming servitization solution providers. As a result, the customer organisation needs to perform a comprehensive review of available choices based on a complex range of criteria. Additionally, the customer organisation also needs to evaluate solution providers' performance during each contracting period. When it comes to data quality performance, a data quality maturity model becomes critical. Caballero, Gómez & Piattini ([3]) report on research aiming to optimise the data management process (DMP) in order to assure data quality. For this, they developed a framework based on maturity levels with specific and generic data quality goals, which can be achieved by executing the corresponding activities. Their data quality model provides guidance for evaluating and improving the DMP so that organisations become able to manage all related data processes – such as data acquisition, data product manufacturing and data maintenance – more and more efficiently, by addressing several issues in the main process areas such as process management, project management, support and engineering processes. The work by Caballero, Gómez and Piattini ([3]) addresses the main issues in drawing a DMP (Table 2), identifying all components and established relationships, and highlighting quality aspects for processes and for data governed by data quality policies. Applying their DQ model, organisations can learn about and formally model their data quality management, so that major data problems and sources can be identified. Once identified, initiatives for avoiding them or for improving efficiency can be arranged.
Table 2: The DQ model based on maturity levels (DQ maturity level and associated management activity)

Initial: No managed and coordinated efforts are made in order to assure data quality.

Definition: Efforts are made in order to draw the entire process, identifying and defining all components (both active and passive), their relationships and the way in which these are developed according to a previous project. The associated activities are: DMP project management; data requirements management; data quality dimensions and metrics management; data sources and data targets management; database or data warehouse development or acquisition project management. For climbing from the Initial level to the Definition level, a plan for developing the DMP must be drawn up and followed.

Integration: Many efforts are made in order to develop and execute according to organisational data quality policies. This implies the existence of several standardised data quality issues. The associated activities are: data quality team management; data quality product verification and validation; risk and poor data quality impact management; data quality standardisation management; organisational processes management.

Quantitative Management: A DMP is integrated into an organisation's data quality culture and many efforts are made in order to take several measures related to the DMP and its components. The associated activity is: DMP measurements management.

Optimising: Quantitative measurements taken at previous levels are used in efforts to detect defect sources or to identify ways to optimise entire processes. The associated activities are: causal analysis for defect prevention; organisational development and innovation.

Source: Caballero, Gómez & Piattini [3]
4. Adopt Total Data Quality Management
It is believed that the total data quality management (TDQM) cycle developed by Wang ([25]) is still very relevant in servitization. Wang ([25]) identified four important roles in the data and information supply chain – data producers (suppliers), data custodians (manufacturers), data consumers, and data managers. Data producers are those who create or collect data for producing information products. Data custodians are those who design, develop, or maintain the data and system infrastructure to produce the required information products. Data consumers are those who use the information products in their work. Data managers are those who are responsible for managing the entire manufacturing process throughout the information product life cycle. The TDQM cycle (Figure 1) suggests that total data quality management is an ongoing and iterative process consisting of four phases: define data and information quality requirements, measure quality dimensions, analyse the performance, and improve on the past. In particular, it is an ongoing process that requires collaboration among the four roles above (data producers, etc.). In servitization, data producers, like the data custodians, become parties external to the organisation; ensuring information flow between the external and internal parties thus emerges as a new challenge.
Figure 1: TDQM Cycle – Wang ([25])
4
CONCLUSION
Servitization is now becoming a widely known concept. More and more companies are transforming from supplying products to providing solutions. Central to these solutions, the solution providers are responsible for monitoring and collecting data on behalf of the customer organisation and for using these data to ensure service quality. As addressed in this paper, the long-term effects associated with the issues of data ownership, data transformation, data storage and data analysis are often overlooked in the servitization contract. This often results in data quality problems which lead to less informed decision-making in customer organisations. Based on the existing data quality literature, this paper explores a theoretical framework consisting of four stages to guide both servitization suppliers and customers in understanding the data quality impacts, and provides a guideline to manage this process effectively. This theoretical framework will be verified and refined in future case studies.
5
REFERENCES
1 Baines, T. et al. (2007) "State-of-the-art in Product Service-Systems", Proceedings of the Institute of Mechanical Engineers, Vol. 221, Part B, Journal of Engineering Manufacture, pp. 1543-1552.
2 Ballou, DP & Pazer, HL (1995), 'Designing Information Systems to Optimize the Accuracy-timeliness Tradeoff', Information Systems Research, vol. 6, no. 1, pp. 51-72.
3 Caballero, I, Gómez, Ó & Piattini, M (2004), 'Getting Better Information Quality by Assessing and Improving Information Quality Management', paper presented at the 9th International Conference on Information Quality, Cambridge, MA, November 2004.
4 Chesborough, H. and Spohrer, J. (2006) "A research manifesto for services science", Communications of the ACM, Vol. 49, No. 7, p. 35.
5 English, LP (1999), Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits, John Wiley & Sons, New York.
6 Goedkoop, M. et al. (1999), "Product Service-Systems, Ecological and Economic Basics", Report for Dutch Ministries of Environment (VROM) and Economic Affairs (EZ).
7 Johnson, M. and Mena, C. (2008), Supply chain management for servitised products: A multi-industry case study, Int. J. Production Economics 114 (2008) 27-39.
8 Kahn, BK, Strong, DM & Wang, RY (2002), 'Information Quality Benchmarks: Product and Service Performance', Communications of the ACM, vol. 45, no. 4, pp. 184-192.
9 Kapletia, D. and Probert, D. (2009), Migrating from products to solutions: An exploration of system support in the UK defense industry, Industrial Marketing Management, forthcoming.
10 Klein, BD (1998), 'Data Quality in the Practice of Consumer Product Management: Evidence from the Field', Data Quality, vol. 4, no. 1, pp. 19-40.
11 Levitin, AV & Redman, TC (1998), 'Data as a resource: Properties, implications, and prescriptions', MIT Sloan Management Review, vol. 40, no. 1, pp. 89-102.
12 Lindberg, N. and Nordin, F. (2007), From products to services and back again: Towards a new service procurement logic, Industrial Marketing Management 37 (2008) 292-300.
13 Manzini, E. and Vezolli, C. (2003) "A strategic design approach to develop sustainable product service systems: examples taken from the 'environmentally friendly innovation' Italian prize", Journal of Cleaner Production, Vol. 11, pp. 851-857.
14 Meijkamp, R. (2000) "Changing consumer behaviour through eco-efficient services. An empirical study of car sharing in the Netherlands", Delft University of Technology.
15 Mont, O. (2000) "Product Service-Systems", Final Report for IIIEE, Lund University.
16 Oliva, R. and Kallenberg, R. (2003) "Managing the Transition from Products to Services", International Journal of Service Industry Management, Vol. 14, No. 2, pp. 1-10.
17 Orr, K (1998), 'Data Quality and Systems Theory', Communications of the ACM, vol. 41, no. 2, pp. 66-71.
18 Rothenberg, S. (2007) "Sustainability through Servicizing", MIT Sloan Management Review, 48(2).
19 Salaun, Y & Flores, K (2001), 'Information Quality: Meeting the Needs of the Consumer', International Journal of Information Management, vol. 21, no. 1, pp. 21-37.
20 Slack, N. (2005) "Patterns of 'servitization': Beyond products and services", Institute for Manufacturing, Cambridge, London (CUEA).
21 Steenstrup, K. (2005) "Enterprise Asset Management Thrives on Data Consistency", Gartner Research, Research Article.
22 Strong, DM (1997), 'IT Process designs for Improving Information Quality and reducing Exception Handling: A Simulation Experiment', Information and Management, vol. 31, pp. 251-263.
23 Vandermerwe, S. and Rada, J. (1988), "Servitization of Business: Adding Value by Adding Services", European Management Journal, Vol. 6, No. 4.
24 Wand, Y & Wang, RY (1996), 'Anchoring Data Quality Dimensions in Ontological Foundations', Communications of the ACM, vol. 39, no. 11, pp. 86-95.
25 Wang, RY (1998), 'A Product Perspective on Total Data Quality Management', Communications of the ACM, vol. 41, no. 2, pp. 58-65.
26 White, A. et al. (1999) Servicizing: The Quiet Transition to Extended Product Responsibility, report from Tellus Institute.
27 Wise, R. & Baumgartner, P. (1999) "Go downstream: The New Profit Imperative in Manufacturing", Harvard Business Review, Sept/Oct., pp. 133-141.
28 Zeithaml, VA, Berry, LL & Parasuraman, A (1990), Delivering Quality Service: Balancing Customer Perceptions and Expectations, Free Press, New York, NY.
Acknowledgments Cooperative Research Centre of Integrated Engineering Asset Management (CIEAM), Australia
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ASSET MANAGEMENT FOR FOSSIL-FIRED POWER PLANTS: METHODOLOGY AND AN EXAMPLE OF APPLICATION
Ludovic BENETRIX a, Marie-Agnès GARNERO a, Véronique VERRIER a
a Électricité de France (EDF), Research and Development Department, 6 quai Watier, 78400 Chatou, France.
The current industrial context (deregulation of the utility market, constant evolution of air emission standards) creates new requirements for electric utilities in terms of asset management and quantitative valuation. This is why EDF has for several years been developing risk-informed asset valuation methodologies and associated decision support tools. One of them – called the "Durability method" – is based on a probabilistic approach and makes it possible to deal with both local (i.e. at a component level) and overall (i.e. at a plant or fleet level) issues. This paper aims at presenting the principles of this method and describing an example of application for coal-fired power plants. Key Words: asset management, valuation, probabilistic, fossil-fired, power plants. 1
OVERALL INDUSTRIAL CONTEXT
1.1 Role of fossil-fired power plants in the French energy mix
Following the oil price shocks of the early 1970s, France decided to rely on nuclear power plants for electricity generation to protect itself from possibly large oscillations of raw material prices (coal, oil and gas) and thus to reduce the variability of its generation costs. This has led to a particular energy mix which mainly depends on nuclear power plants. Indeed, nuclear power plants represent 65% of the installed capacity for electricity generation and more than 80% of the effective electricity generation every year [1]. In 2008 fossil-fired generation accounted for 3.3% of total electricity generation, as shown by Figure 1.
Figure 1. Electricity generation distribution in France in 2008
Yet the role of fossil-fired generation is essential in the French energy mix. As fossil-fired power plants have a high degree of flexibility which allows quick start-up and power modulation, they are used to ensure the balance between generation and
consumption in real time. Figure 2 shows that fossil-fired plants are used for semi-base load operation (less than 5000 hours per year), peak load operation (less than 1500 hours per year) and extreme peak load operation (less than 200 hours per year). This is why their reliability and availability are significant issues for EDF.
Figure 2. Use of the various generation means to ensure balance between generation and consumption.
1.2 Deregulation of the electric utility market
A European directive dated June 26th 2003 (refer to [4]) led to the deregulation of the electric utility market. Since 2007 (and since 2004 for non-household customers), electricity generation and commercialization have been opened to competition. EDF's status changed from a public entity to a public limited company with public service obligations. The main objective of EDF is now to provide its current and future customers with a still efficient public service while participating in a European electricity market whose prices are highly volatile. When fossil-fired power plants are needed to generate electricity (particularly for peak and extreme peak loads), electricity is rare and consequently expensive on the electricity market. This accounts for a strong need for long term optimization and relevant asset management of EDF's fossil-fired fleet.
1.3 Air emission standard evolution
Apart from the high volatility of fuel prices, the main drawback of fossil-fired electricity generation is the emission of air pollutants (nitrogen oxides NOx, sulphur dioxide SO2, dust) and greenhouse gases (carbon dioxide CO2). This is why the air emission standards are constantly evolving, driven by European directives. In practical terms this consists in defining maximum allowable values for air pollutant emissions. As an example, EDF has to comply with the following specifications from 2008 to 2015 for its 600 and 700 MW fuel- and coal-fired plants (11 plants): 11338 t/year maximum emission for SO2, 13749 t/year maximum emission for NOx, and 1417 t/year maximum emission for dust. These maximum allowable values will probably be modified from 2016. Thus the following issue has to be tackled as far as fossil-fired fleet management is concerned: what kinds of investments are profitable given the current and possible future air emission standards?
1.4 New requirements for electric utilities The overall industrial context as described in the previous sections leads EDF to handle new requirements as other European electric utilities. 1.
1. First, the strategy to manage the entire fossil-fired fleet in terms of whole-life cost has to be optimized with respect to the current and future emission standards. As an example, one of the questions EDF may have to answer is: is it better to develop new pollution control systems for its existing power plants or to build new plants that directly comply with the future air emission standards?
2. In the meantime, the long term reliability and availability of the main components of fossil-fired power plants have to be optimized too. Steam turbines, steam generators, generators, condensers and pollution control systems are all concerned by these issues. Consequently the strategy to manage the ageing of the main components also has to be optimized to ensure a satisfactory level of security and performance for fossil-fired power plants.
These two requirements are strongly correlated: indeed, the decision to extend fossil-fired power plants' lifetime (e.g. thanks to the implementation of pollution control systems) is strongly related to the good condition of the plants' components; conversely, the investments to handle component ageing have to be optimized with respect to the lifetime target for the plants under study. In addition, many uncertainties usually make the decision making process more difficult: e.g. on future air emission standard evolutions or on the precise lifespan of the major components. As a conclusion, we can see that the decision making process related to the new requirements EDF has to face is quite a delicate issue; this is why EDF had to develop new methodologies and associated decision support tools. These methodologies and tools have to be based on the following principles:
1. A multi-level approach that makes it possible to deal with both plant/fleet level and component level issues.
2. A probabilistic approach to enable the modelling and computation of the various uncertainties related to the fossil-fired fleet.
3. An approach that makes it possible to cope with cross-correlated issues.

2 A SOLUTION FOR THE ELECTRIC UTILITY'S NEW REQUIREMENTS: THE "DURABILITY METHOD"
This section aims at presenting the EDF Durability methodology and its associated tools, as also described in [2].
2.1 Analysis at a component level
We focus first on the component level, which is the first phase of the overall analysis. Let us consider a component for which we have to optimize lifecycle management: what kind of investments have to be made to ensure a satisfactory level of security, availability and performance for the component under study, and when is it preferable to make these investments? The component level analysis can be applied to any major "System, Structure or Component" (SSC). For fossil-fired power plants the question can be asked for major components such as the steam turbine, generator, steam generator or condenser.
Figure 3. The key phases for component level management.
Figure 3 sums up the principles of the component level analysis. It breaks up into the following three main phases:
1. SSC-file elaboration: identification of events (main risks) and mitigation actions related to the SSC.
2. SSC-scenario building: quantification of the events' probability distributions and the mitigation actions' costs, and elaboration of scenarios.
3. SSC-valuation: computation of the probabilistic technical and economic indicators used to compare scenarios.
The SSC-file elaboration (step 1) aims at retrieving and aggregating the technical knowledge for the SSC being studied. It is a fundamental step of the component level analysis because the quality of the information gathered during this step strongly determines the quality and applicability of the results to be obtained in the next steps. A compromise between completeness of the analysis and the time constraints for achieving the study has to be found for each SSC. This step consists in identifying the main risks (events) that could occur until the end of the SSC's life and the preventive and/or corrective mitigation actions EDF would be able to carry out to cancel or significantly reduce the impact of the possible occurrence of these events (steps 1.4 and 1.5). The events can relate to many issues: performance, obsolescence, ageing, regulatory evolutions, etc. As shown in Figure 3, the identification of events and mitigation actions has to be preceded by the identification of experts and existing data (degradation, failure or operation histories) (step 1.1), the precise definition of the SSC in terms of material and functional scope (step 1.2), and the evaluation of its current state and identification of the main risks (step 1.3). Scenarios then have to be built for the SSC (step 2). The dates of occurrence of the identified events are in general unknown because they depend on many different phenomena that are difficult to predict with precision: ageing of components, future utilization of the fleet, regulatory evolutions. This is why the analysis is based on a probabilistic approach that makes it possible to take the various sources of uncertainty into account. Thus every event has to be associated with its probability distribution of occurrence over time. The mitigation actions then have to be quantified too: material and labour costs, as well as the impact of applying the mitigation action on the event's probability of occurrence over time. Finally the scenarios have to be built: the relevant events and the strategies of application for mitigation actions (preventive or corrective, date(s) or strategy of application) are defined. The last step of the component analysis (step 3) is the SSC valuation. Every scenario defined in step 2 is compared to a reference scenario (also defined in step 2) so as to help the decision maker choose the optimal strategy. The comparison is based on several technical and economic indicators, among which the Net Present Value (NPV) is the most relevant. This indicator is computed as follows:
1. The financial flows for the reference strategy are subtracted from the financial flows for the strategy to be valuated over the whole life period. The financial flows are discounted using the EDF official discount rate. A relative NPV is then computed.
2. The financial flows for the different strategies are directly linked to the occurrence of the events of the SSC in question. As the events' dates of occurrence are modelled as random variables (refer to step 2), the financial flows and thus the computed NPV also become random variables.
3. Usually the probability density function of the NPV cannot be computed analytically because the models are often complex. This is why EDF has developed a dedicated tool (VME: refer to [2] and [3]) to enable the modelling of the scenarios and the approximate calculation of the NPV distribution. This tool is based on Monte-Carlo simulation.
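The sketch below illustrates the Monte-Carlo principle of steps 1–3; it is not the VME tool, and the failure-time model, discount rate and cost figures are purely illustrative assumptions, not EDF data.

```python
# Hypothetical sketch of a Monte-Carlo relative-NPV computation: draw random
# event dates, turn each strategy's mitigation policy into yearly costs,
# discount them, and keep the cost saving of the candidate versus the reference.
import random

def present_value(costs_by_year: dict[int, float], rate: float) -> float:
    """Discounted sum of costs keyed by year offset (year 0 = today)."""
    return sum(c / (1 + rate) ** year for year, c in costs_by_year.items())

def reference_costs(failure_year: int) -> dict[int, float]:
    # Corrective strategy: repair cost plus lost generation when the failure occurs.
    return {failure_year: 12.0}            # M€, illustrative

def candidate_costs(failure_year: int) -> dict[int, float]:
    # Alternative strategy: buy the spare component now, cheaper intervention later.
    costs = {0: 4.0}                       # early purchase, illustrative
    costs[failure_year] = costs.get(failure_year, 0.0) + 1.0
    return costs

def relative_npv_samples(n_runs: int = 10_000, rate: float = 0.08) -> list[float]:
    samples = []
    for _ in range(n_runs):
        # Illustrative failure-time model: Weibull-distributed year of occurrence.
        year = min(19, int(random.weibullvariate(12.0, 2.5)))
        saving = (present_value(reference_costs(year), rate)
                  - present_value(candidate_costs(year), rate))
        samples.append(saving)             # relative NPV of the candidate strategy
    return samples

npvs = relative_npv_samples()
mean_npv = sum(npvs) / len(npvs)
p_not_profitable = sum(v < 0 for v in npvs) / len(npvs)
print(f"mean relative NPV = {mean_npv:.2f} M€, P(not profitable) = {p_not_profitable:.2%}")
```

In practice the event distributions and cost figures would come from the SSC-file elaborated in step 1 rather than from assumed parameters.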
At the end of step 3, the decision maker has obtained the relative NPV distribution of each scenario with respect to a reference scenario (the scenario that would be applied if the SSC analysis were not performed). The examination of the main parameters of the NPV distribution (mean NPV, extreme values of the NPV, probability that the scenario is not profitable) allows the decision maker to make the optimal decision with respect to his or her goals for the SSC currently under study.
2.2 Extension to the plant and fleet levels
The method described in the previous section makes it possible to optimize decision making for a given System, Structure or Component. Yet it does not enable a decision to be made at a plant or fleet level, because many other topics then have to be taken into account: issues related to other SSCs and possible correlations between them, overall assumptions about the regulatory background, and valuation of the overall economic performance of the fleet. As a consequence it is useful for EDF to have an extension of the component level analysis that benefits from the SSC studies that may already have been performed. This is the objective of the plant/fleet level analysis which is described hereafter. Figure 4 describes the main steps of the plant/fleet level analysis.
Figure 4. The key phases for plant & fleet level management This analysis breaks up into 4 steps : 1.
SSC selection (step 1) : this step consists in selecting which SSCs will have to be processed so as to perform a relevant plant or fleet wide analysis. The goal of this phase is to find a compromise between completeness (that would lead to select a large number of SSCs) and time constraints for the achievement of the study (that would tend to reduce the number of SSCs). Usually only the major components or issues related to a plant are chosen as SSCs (steam turbine, steam generator, generator, condenser, environment).
2.
SSC elaboration (step 2) : this step aims at elaborating the SSC-file for each SSC selected during step 1. The method for this step is identical to the component level analysis (refer to section 2.1) without the final valuation because it will be done at a plant or fleet level : identification and quantification of events and mitigation actions, building of scenarios to be valuated.
3.
Plant/fleet evaluation (step 3) : after having elaborated every SSC-file the SSCs have to be aggregated to perform the plant or fleet evaluation. This is done by :
4.
a.
First identifying and taking into account the cross-correlations between the SSCs.
b.
Then building plant and fleet wide scenarios by aggregating the SSC level scenarios.
c.
Finally elaborating a last SSC file that allows to take into account the overall economic performances of the plant or fleet (valuation of the electricity generated, consideration of taxes, charges, maintenance and operation common costs) and also the financial flows related to secondary components or issues which were not studied as SSCs.
Decision making (step 4) : the decision maker obtains at the end of the analysis the NPV probability distribution of the plant or fleet that is under study for each scenario (as defined in step 3). The NPV probability distributions are computed using the dedicated EDF software tool EDEN [2]. The examination of the relevant parameters of the NPV distributions (mean value, standard deviation, extreme values, probability for the scenario not to be profitable…) enables the decision maker to compare the various scenarios and to make a decision in the light of an objective and quantitative valuation of them.
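A very small sketch of the aggregation idea behind step 3, under simplifying assumptions: each SSC study is supposed to have produced one NPV sample per Monte-Carlo draw, a shared "market" factor is used as a crude stand-in for the cross-correlations between SSCs, and a plant-level term represents the value of the electricity generated minus the common costs. All figures are invented; this is not the EDEN tool.

```python
import random

N_DRAWS = 10_000
rng = random.Random(0)

# Shared factor so that the SSC-level NPV samples stay correlated across draws.
market = [rng.gauss(0.0, 1.0) for _ in range(N_DRAWS)]

# One list of NPV samples (M EUR, invented) per selected SSC, indexed by draw.
ssc_npv = {
    "steam turbine":   [5.0 + 2.0 * m + rng.gauss(0.0, 1.0) for m in market],
    "steam generator": [3.0 + 1.5 * m + rng.gauss(0.0, 0.8) for m in market],
    "environment":     [-1.0 + 0.5 * m + rng.gauss(0.0, 0.3) for m in market],
}

# Plant-level common term: electricity generated minus common operation and maintenance costs.
plant_common = [20.0 + 4.0 * m for m in market]

# Plant-level NPV distribution: per-draw sum of the SSC contributions and the common term.
plant_npv = sorted(
    plant_common[i] + sum(samples[i] for samples in ssc_npv.values())
    for i in range(N_DRAWS)
)
print("mean plant NPV:", sum(plant_npv) / N_DRAWS)
print("5% / 95%      :", plant_npv[int(0.05 * N_DRAWS)], plant_npv[int(0.95 * N_DRAWS)])
```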
3 EXAMPLE OF APPLICATION: QUANTITATIVE VALUATION OF MAINTENANCE AND OPERATION STRATEGIES FOR COAL-FIRED POWER PLANTS' STEAM TURBINES
3.1 Events, mitigation actions and scenarios
This section describes a real example of application of the "Durability method" to coal-fired power plants. We focus on the component level analysis as described in section 2.1. The scope of this example is:
Fleet: a 600 MW coal-fired power plant fleet made up of 3 power plants. This fleet is used for semi-base load operation (around 5000 hours of operation a year) and has a significant role in the balance between production and consumption, especially in Brittany where there is no nuclear power plant.
SSC: "Steam turbine" was studied first because it is one of the major components of a power plant.
Main assumption: EDF has recently implemented efficient air pollution control systems for these power plants, for more than 300 M€. The fleet is therefore assumed to comply with future air emission standards and the aim is to operate it up to around 2030.
The "Durability method" has been used to help EDF decision makers optimize the long-term management of the steam turbine. As an illustrative example, let us consider the following model based on the "Steam turbine" SSC elaboration. The event considered here is "Failure of component C" (C is a part of the steam turbine, not precisely identified for confidentiality reasons). The probability of this event over time for one of the 3 studied power plants is summarized in Figure 5. The distributions for the other 2 power plants are slightly different because the operating conditions, and therefore the probability of failure, are not exactly the same for the 3 power plants.
Figure 5. Probability distribution of the event over time (cumulative probability of the event "Failure of component C" versus year, 2008 to 2020).
The reference strategy for this event consists in fixing the component C failure at each occurrence of the event, for each power plant. This operation implies a cost of X M€ (confidential) and a plant unavailability of Y weeks (confidential). The probability distribution over time once the failure has been fixed (probability that the event occurs again after fixing) is described by Figure 6 (with time 0 corresponding to the date when the failure is fixed).
Figure 6. Probability distribution of the event over time after fixing of the failure (cumulative probability versus time from the date of fixing, 0 to 12).
The decision maker wants to assess the relevance of an alternative strategy which would consist in buying a new C component in advance. The cost of this operation is much larger than the fixing cost (about 4 times more). However, this cost is expected to be counterbalanced by the fact that it allows failed components to be fixed without plant unavailability, since the repair can then be carried out concurrently with operation. The two scenarios to be compared are summarized in Tables 1 and 2.
Table 1. Summary of the reference scenario (reference strategy)
Cause: any occurrence of event "Failure of C component" | Mitigation action: fixing of the failure on C component (X M€; Y weeks unavailability) | Probability after action: refer to Figure 6
Table 2. Summary of the alternative scenario (alternative strategy)
Date or occurrence: 2010 | Mitigation action: preventive acquisition of a new C component (4X M€; 6Y weeks delay) | Probability after action: not changed
Date or occurrence: first occurrence of event "Failure of C component" | Mitigation action: replacement of the C component by the new one (0.7X M€; 0.35Y weeks unavailability) | Probability after action: 0 until end of life of the plant
Date or occurrence: first occurrence of event "Failure of C component" | Mitigation action: fixing of the failed C component, carried out concurrently with operation (X M€; Y weeks delay but no plant unavailability) | Probability after action: not changed
Date or occurrence: other possible occurrences of event "Failure of C component" on other plants | Mitigation action: replacement of the C component by the fixed C component (0.7X M€; 0.35Y weeks unavailability) | Probability after action: refer to Figure 6
3.2 Valuation results and decision making
The NPV probability distribution of the alternative strategy compared to the reference strategy is shown in Figure 7. The alternative strategy is always profitable, that is, the probability for the NPV to be greater than 0 equals 1. The mean value of the final NPV is NPVmean (confidential). The NPV distribution goes from 0.4 x NPVmean to 2.1 x NPVmean.
Figure 7. Relative NPV distribution for the alternative scenario compared to the reference scenario (probability versus normalized NPV, NPV / NPVmean, from 0 to 2.5).
Figure 8 shows the evolution over time of the mean value of the relative NPV: we can notice that the preventive acquisition of component C is not profitable at the beginning, because the new C component is quite expensive. But the investment becomes profitable from 2015 onwards, as more and more failures occur in the reference scenario, implying sizeable costs due to fixing operations and unavailability periods. These costs are avoided in the alternative scenario thanks to the earlier acquisition of a spare C component. To conclude, this study shows the long-term profitability of the preventive supply of a spare component, whereas the investment could have been considered too expensive compared to the corrective mitigation actions. Based on this probabilistic study, the EDF decision maker will decide in the next few months about the acquisition of component C.
Figure 8. NPV mean value evolution over time (NPVmean / final NPVmean versus year, 2005 to 2035).
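The break-even behaviour visible in Figure 8 can be reproduced in principle by averaging, over the Monte-Carlo draws, the discounted cash-flow difference between the two strategies year by year and accumulating it. The sketch below does this with invented placeholder figures (it is not the actual EDF calculation); the cumulative mean starts negative because of the up-front spare purchase and turns positive once enough avoided failure costs have accumulated.

```python
import random

REF_YEAR, DISCOUNT_RATE = 2009, 0.08
YEARS = list(range(2009, 2036))
SPARE_COST, SPARE_YEAR = 40.0, 2010                     # preventive purchase (stand-in for "4X")
FIX_COST, OUTAGE_COST = 10.0, 8.0                       # corrective repair and lost output (stand-ins)
CUM_FAILURE_PROB = {y: min(0.9, 0.06 * (y - 2008)) for y in YEARS}

def sample_failure_year(rng):
    u = rng.random()
    return next((y for y in YEARS if u <= CUM_FAILURE_PROB[y]), None)

def yearly_difference(rng):
    """Discounted cash-flow difference (alternative minus reference) per year, for one draw."""
    diff = dict.fromkeys(YEARS, 0.0)
    diff[SPARE_YEAR] -= SPARE_COST / (1 + DISCOUNT_RATE) ** (SPARE_YEAR - REF_YEAR)
    fail = sample_failure_year(rng)
    if fail is not None:
        d = (1 + DISCOUNT_RATE) ** (fail - REF_YEAR)
        diff[fail] += (FIX_COST + OUTAGE_COST) / d      # cost that the reference scenario would pay
        diff[fail] -= 0.7 * FIX_COST / d                # cost of replacing with the spare, no outage
    return diff

rng, n_draws = random.Random(1), 20_000
mean_by_year = dict.fromkeys(YEARS, 0.0)
for _ in range(n_draws):
    draw = yearly_difference(rng)
    for y in YEARS:
        mean_by_year[y] += draw[y] / n_draws

cumulative = 0.0
for y in YEARS:
    cumulative += mean_by_year[y]
    print(y, round(cumulative, 2))                      # mean relative NPV up to year y
```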
4 CONCLUSION AND PERSPECTIVES
The overall European and French industrial contexts raise new needs for EDF in terms of long-term management of its fossil-fired assets. The "Durability method", developed by EDF since 2001, is one possible answer to these needs: this fully integrated method provides a probabilistic asset management approach dealing with the component, plant and fleet levels. The example of long-term management optimization for one of the major components of coal-fired power plants, the steam turbine, showed how the method can be applied in practice at the component level. It particularly underlined the benefit that can be gained from supplying in advance a spare part for a steam turbine's major component, in spite of its high initial cost. Developments are still going on, mainly in the following directions:
1. Coal-fired power plants: other SSCs are currently under study or will be elaborated in the coming months (steam generator, environmental issues, generator) so as to make the overall long-term analysis and optimization possible.
2. The "Durability method" itself: the current method relies on the detailed elaboration of several SSC-files for the approach to be relevant. But processing an SSC-file can be quite a difficult and lengthy task. As a consequence it can be delicate to process all the needed SSC-files and to perform a plant or fleet optimization, whereas the issue is fundamental for EDF life cycle management. Thus the EDF Research & Development Department is currently working on a simplified approach to deal with plant or fleet level issues within time and human resources constraints.
5 REFERENCES
1. EDF group (2008) Document de référence – http://shareholders.edf.com/the-edf-group/shareholders-97231.html
2. ICONE16-48911 – Overview of EDF life cycle management & nuclear asset management methodology and tools – P. HAÏK, K. FESSART, E. REMY, J. LONCHAMPT, EDF Research & Development.
3. Lambda Mu 16 – October 7th to 9th 2008 – Relative valorisation of major maintenance investments taking into account risk – J. LONCHAMPT, K. FESSART, EDF Research & Development.
4. Directive 2003/54/CE du Parlement européen et du Conseil, du 26 juin 2003.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CMMS – INVESTMENT OR DISASTER? AVOID THE PITFALLS
Christer Olsson (a), Ashraf Labib (b), and Cosmas Vamvalis (c)
(a) Senior Consultant Maintenance Management, Midroc Engineering AB, Member of Board, the Swedish Maintenance Society, UTEK. E-mail: [email protected]
(b) Professor and Associate Dean (Research), University of Portsmouth, Portsmouth Business School, Richmond Building, Portland Street, Portsmouth PO1 3DE, United Kingdom. E-mail: [email protected]
(c) Vice Chairman, Hellenic Maintenance Society, Managing Director, ATLANTIS Engineering Ltd, nt.Tritsi 21 Pilea, Thessaloniki, Greece. E-mail: [email protected]
It has been reported that a high percentage of Computerised Maintenance Management Systems (CMMS) implementations have not been successful in the past (Labib, 2008). Even when CMMS applications are implemented, it is noticed that only a small share of the software options is actually used. This may be attributed to the low organisation level of many maintenance departments. The investment in modern software for maintenance management is an expensive project, especially when one considers the money invested in acquisition, implementation and maintenance licence costs. If we also consider the time and effort that needs to be invested in training, setting up the structure of the system and data collection, then the investment grows to at least double or even triple the initial cost. One might then think that the return on this investment is closely followed and that the use of the new system is heavily supported by management. However, in real life it is noticed that those systems often do not provide the support wanted or needed by the users, resulting in poor trust of the maintenance organisations in their own tool. The use of standards to secure proper understanding and data quality is very often not in place. Using standards and proper structures gives a platform for more effective use, but there is also a need to get the commitment of the people involved in order to get the right return on the investment. We argue that this situation can be improved by the use of audits and benchmarking. One example of how this can be done is the Automated Tool of Maintenance Auditing, Benchmarking for Continuous Improvement (AMABI). AMABI consists of three modules, namely auditing, benchmarking and recommendations. The first module is an auditing module which directly evaluates the CMMS data in order to assess the current organisation level and performance of the maintenance department. The second module is benchmarking: within the benchmarking module, auditing results from different companies are automatically compared using different groupings (e.g. company sector and/or size). The third module is related to recommendations: points of further exploitation regarding the organisation of the maintenance department are identified and proposals for improvement are provided in a prioritised manner. The proposed tool assists companies in sustaining a continuous improvement process. The main innovation of the tool is the automatic auditing and benchmarking, which makes it very cost effective, so it can be broadly used. Moreover, it ensures that results are not biased, as human judgment is minimised. Actual results of the tool in 50 Greek and Cypriot companies (e.g. auditing criteria, 2009 benchmarking results and indicative proposals of improvement) will be presented. Finally, the benefits obtained by companies using the tool will be reported.
1 STRATEGY – CULTURE
When one decides to invest in a new CMMS, this is normally part of an overall strategy to improve one's maintenance performance. So, the first thing to think about is: how do I measure whether improvement really takes place? How do I use the new tool in the best way and how do I get the best out of the system? First of all, be open minded; try to understand the CMMS designers, how they have set up the system and the work processes they have worked out. Secondly, be ready to change one's own processes if this is possible – presumably you have not bought a new system to support the old way of working? Be aware that most of the problems with a new CMMS come from the fact that we try to convert the system to the way we are used to working:
• "I am not prepared to change my way of working because the system says so."
  o "We have always worked like this!"
  o "We have never worked like that!"
As soon as one starts to change part of the system, there is also a risk that some of the features built into the system will not work because the logistic chain has been broken. This is also very important from a cost perspective, as all changes are probably done by consultants, not only once but every time the system is updated or a new feature is introduced. Overcoming the cultural rules and routines takes a lot of patience and hard work. Leadership and guidance are important issues, as no system is better than the quality of the data in its database, and data quality is created by motivated people. This means that they must know the quality parameters of the data, what it will be used for and, maybe most important, what's in it for them. The use of industrial standards like CEN or ISO is a good strategy and gives one a safer road to success (EN 13306 Maintenance Terminology, etc.).
1.1 Asset Hierarchy – History and Knowledge Base
The backbone of every CMMS is the asset hierarchy; it is one of the crucial features of a CMMS. This is where one lists all the assets, decides the relations between them and connects all the information that is needed for the work. The asset hierarchy is different from the asset register found in accounting departments, as the latter is usually developed from a financial perspective rather than an engineering one. The hierarchy is preferably set up from a responsibility perspective, so that it also shows which function in the organization is responsible for the care of each entity/equipment in the plant. Take your time before you decide on the hierarchy design, because changing it at a later date is very demanding: you start to build history from the first moment. If you need guidance, a standard like ISO 14224 may be of help. That standard was developed by the oil and gas industry, but it is especially valuable for those who plan to do benchmarking, as it gives structure and advice on what belongs to a certain type of equipment and the structures you need to perform benchmarking. In other words, it provides guidance on the taxonomy (classification) of different machines and their components.
1.2 Work Process
One of the most important issues you will deal with in your CMMS is the process of generating work/service requests and history records for your assets. Keep the Deming planning circle (PDCA) in mind and be sure to set up the process to support all parts of the circle (Plan, Do, Check, Act). To guarantee the quality of work and historical data you must have the documentation and decision-making process in place and a proper training program for users. A good guide is the EN 13460 standard (Maintenance – Documents for Maintenance), which also proposes which documents should be used in the work process.
1.3 Monitoring Progress
An important task for the CMMS is to provide data for analysis of where there are possibilities for improving the performance of your maintenance operations. Whether you want to analyse your data to follow only your own indicators over time, or decide to start benchmarking or comparing key performance indicators (KPIs) with other companies, it is of great importance that you define your indicators very well. It is also important that you can measure the results from many perspectives, in a balanced approach, to find the right answer to your questions when formulating your strategy plan for the coming period.
Again, you will find help in the standards, e.g. EN 15341 (Maintenance – Maintenance Key Performance Indicators). This standard gives you the opportunity to select your set of indicators from the 71 KPIs, grouped under three levels in three parts: economic, technical and organizational indicators. More help is to be found in the following offerings of the EFNMS Benchmarking Committee:
• "A user requirement specification for the calculation and presentation of Maintenance Key Performance Indicators according to EN 15341". This product is free to use for the members of EFNMS (the European Federation of National Maintenance Societies).
• "Global Maintenance and Reliability Indicators", a publication of EFNMS and SMRP (the Society of Maintenance and Reliability Professionals, USA).
• "The EFNMS Benchmarking Workshop", which provides training and practice in the calculation and understanding of the indicators of EN 15341.
1.4 Interfaces
Make sure that your CMMS software is compatible with the other software packages involved, or that you plan to involve, in your maintenance and operations. Very important are of course the financial, documentation and scheduling software packages such as ERP (Enterprise Resource Planning). They must be integrated when it comes to setting rules for resource structures, rates and calendars, as well as equipment and project breakdown structures. They must be able to supply you with the correct data for your indicators and your strategy planning. There is a very big risk that this will be very expensive if you have to put software consultants to work on solutions, and it will be so every time you update any of your systems. Again, you can use the standards for the right understanding, between the financial, project and maintenance organizations, of what is needed to provide the data quality for your operations. The Terminology standard tells you, among other definitions, what is included in maintenance costs; the KPI standard, what special data might be needed from the financial system; and the document standard, what needs to be provided from your documentation system.
2 EXAMPLE OF INCREASING CMMS EXPLOITATION
Limited CMMS exploitation can be noticed in most implementations. This observation can be attributed to two fundamental reasons: the first is that only a limited set of CMMS options is used, and the second is that, even for the options being used, the data recorded are commonly not analysed in order to take decisions. Several efforts have been made to increase CMMS exploitation. One of these efforts is described by the following procedure.
2.1 General description - Main components
This procedure consists of automatic CMMS auditing, benchmarking and recommendations modules. These three modules are presented in the following paragraphs. The recommendations module is presented first, so that the presentation of the auditing module will be more efficient and transparent later on.
2.1.1 Recommendations
The third module, which is presented first, is the recommendations. In this module, points of further exploitation regarding the organisation of the maintenance department are identified, based on the auditing results, and recommendations are provided in a prioritised manner. These recommendations are related to each company's particularities (e.g. in large maintenance departments 'maintenance tasks planning' is more important than in smaller departments). Recommendations are made for the four main CMMS sections, "Corrective maintenance", "Preventive maintenance", "Spare parts" and "Miscellaneous", and are presented below.
Figure 1. Recommendations concerning 'Corrective maintenance'
In Figure 1, the recommendations for corrective maintenance [1] (1) are presented in a prioritised way: First, companies should record and issue work orders for the immediate actions [1] (11). The next step is to record and issue work orders for the deferred actions (111). In parallel, immediate-action data should be analysed (112). When data are analysed, specific decisions should be taken (e.g. enhancement of PM schedules) (1121). It is important that the technical department, apart from doing their job right, also presents this to the company's management (11211). In parallel, KPI targets should be set and their results monitored; the most important KPI concerning corrective maintenance is considered to be equipment availability [2, T1] (1122). It is also important to monitor and analyse maintenance cost (1123). When maintenance costs are known, a 'replace or repair' model can be run, especially for the machines having the highest cost (11231). Finally, technicians' performance can be analysed; this is a sensitive issue and management must be very careful in the way it is handled (1124). One more step is to collect maintenance requests directly from production (into the CMMS) and not orally or in any other way (113).
Figure 2. Recommendations concerning 'Preventive maintenance'
In Figure 2, the recommendations for preventive maintenance (PM) (2) are presented: First, PM schedules must be generated (21). Afterwards, PM tasks should be monitored (e.g. weekly or monthly). Next, PM tasks should be executed and recorded in the CMMS (211). After this, it is recommended to monitor KPIs, like the technicians' time spent on PM compared to total maintenance man-hours [2, O16 & O18] (2111). Finally, the optimum inspection time model (based on recorded immediate actions) can be run (2112).
In Figure 3, the recommendations for spare parts (SP) (3) are presented: The most important is to record spare parts consumption. Through this, a critical component of maintenance cost is captured and each machine's bill of materials is built up (31). The next step is to record spare parts deliveries (311). When consumption and deliveries are accurately recorded, a physical inventory should be performed so that the stock level is known (3111). The next step is to record the SP location in the stores, specifically the row and bin number (31111). After this, labels can be stuck on the shelves (311111) and RF PDAs can be used to record consumption and returns of the consumed SP (3111111). Also, an optimum stock level model can be run (31112) and KPIs regarding the spare parts can be set, like SP value / asset replacement value [2, E7] (31113). In parallel, orders to suppliers (3112) and offers from the suppliers (31121) can be recorded.
Figure 3. Recommendations concerning ‘Spare parts’
Figure 4. Recommendations concerning ‘Miscellaneous’
In Figure 4, the recommendations for the miscellaneous section (4) are presented: There are two main miscellaneous issues. One is to incorporate maintenance procedures into the company's ISO system (41). The other is the data bridge with the company's ERP; this step is important in order to save human resources from recording the same data twice (42).
Some general comments about the recommendations: a) The order of the recommendations depends on each company's priorities. b) The proposed order refers to an existing installation; thus the steps should be small, in order to obtain the maximum result with the minimum effort. In a new installation, the order of the recommendations could be different and procedures could be followed in their logical sequence (e.g. first orders and afterwards deliveries).
In the process of increasing CMMS exploitation, there are two options: a) The first option is to use as many features of the CMMS as possible. b) The second option is to increase the exploitation of the features already being used. This is done in four ways: b1) increase the percentage of data being recorded in the CMMS (e.g. increase the percentage of deferred actions being recorded, Figure 1, 111); b2) increase the procedure execution percentage (e.g. increase PM task execution, Figure 2, 211); b3) increase the data analysis frequency (e.g. analysis of immediate actions history, Figure 1, 112); b4) try to make procedures more effective (e.g. work orders given orally should be given through a printout; spare parts orders placed through printout and fax should be placed electronically).
2.1.2 Auditing
The first module, which is presented second, is the auditing module, which directly evaluates CMMS data in order to assess the current organisation level and performance of the maintenance department. There are two kinds of auditing criteria. The first is Key Performance Indicators (KPIs); the common problem with KPIs is the reliability of their results, which sometimes makes their use for decision making difficult. The second kind of auditing criteria is related to whether simple procedures are executed or not. In Table 1, indicative automatically audited results from four companies, on 13 auditing criteria, are presented.
Table 1. Indicative auditing criteria from 4 companies
Criterion | Company 1 | Company 2 | Company 3 | Company 4
1. Number of machines | 496 | 166 | 99 | 474
2. Number of immediate actions (per machine) | 1070 (2,1) | 113 (0,68) | 452 (4,5) | 2217 (4,6)
3. Average words / immediate action | 6,0 | 6,8 | 2,6 | 8,5
4. Hours recorded per technician & day | 5,5 | 5,0 | 1,6 | 7,8
5. Number of PM schedules (per machine) | 39 (0,07) | 89 (0,53) | 60 (0,60) | 527 (1,21)
6. PM schedules having at least one execution in the period (year) | 35 (89%) | 64 (71%) | 58 (97%) | 245 (46%)
7. Machines availability | - | - | 95,47% | 98,60%
8. Time spent for preventive compared to corrective maintenance | 0,30 | 0,93 | 2,42 | 0,45
9. Suppliers offers | 5 | 7 | 0 | 15
10. Spare parts location in the stores | 0 | 451 (27%) | 0 | 229 (52%)
11. Spare parts minimum stock level | 101 (17%) | 415 (25%) | 27 (1%) | 0
12. Machines spare parts (spare parts per machine) | 306 (0,62) | 1.078 (6,49) | 631 (6,37) | 430 (0,91)
13. CMMS reports used in company's ISO | - | 45,0% | - | 62,5%
From the audited results, the following points can be highlighted:
• On the 1st audited criterion, the number of machines in each company is presented.
• On the 2nd criterion, the number of immediate actions being recorded and, in parentheses, the number of immediate actions per machine are presented. It can be seen that Company 2 has a figure below 1 (0,68), which most probably means that not all immediate actions are recorded.
• On the 3rd criterion, the average number of words per immediate action is presented. It can be seen that Company 3 has only 2,6 words per immediate action, which means that it will be very difficult to analyse machine history afterwards.
• On the 4th criterion, the hours recorded per technician & day are presented, as it is important for every company to know how effectively the engineers are using their time. It can be seen that Company 3 has recorded only 1,6 hours per technician & day, which should be further analysed.
• On the 5th criterion, the number of existing PM schedules is presented. It can be noticed that Company 1 has only 39 PM schedules for 496 machines, which should be increased.
• On the 6th criterion, PM executions are presented. It can be noticed that Company 4 has only 46% of the PM schedules being executed. After analysing this, it has been found that this company has a lot of daily PM schedules whose execution it is not recording in the CMMS.
• On the 7th criterion, it can be seen that only Companies 3 & 4 are monitoring their machines' availability.
• On the 8th criterion, it can be seen that Company 3 is doing much more PM compared to CM, so it appears to have the best result. Nevertheless, this conclusion is not correct, as Company 3 is recording only 1,6 hours per technician & day (4th criterion).
• On the 9th criterion, it can be noticed that very few offers from the suppliers are recorded.
• On the 10th criterion, it can be noticed that only Companies 2 & 4 are recording the spare parts location in the stores. Company 4 has just started to monitor its spare parts, which is the reason it has so few spare parts.
• On the 11th criterion, it can be seen that, in fact, only Company 2 is monitoring the spare parts minimum stock level.
• On the 12th criterion, it can be seen that Company 2 has the most updated machine bills of materials.
• On the 13th criterion, it can be seen that only Companies 2 & 4 are using ISO codes for their CMMS printouts.
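The flavour of such automatic auditing can be illustrated with a short sketch that derives a few of the criteria above directly from raw CMMS records. The record layout and the figures are invented for the example; the actual criteria and their marking ranges belong to the AMABI tool and are not reproduced here.

```python
from datetime import date

# Invented CMMS extracts: work orders for immediate actions and technician time bookings.
work_orders = [
    {"machine": "pump-01", "type": "immediate", "description": "bearing noise, replaced bearing"},
    {"machine": "press-07", "type": "immediate", "description": "leak"},
    {"machine": "pump-01", "type": "deferred",  "description": "repaint guard at next stop"},
]
time_bookings = [  # (technician, date, hours)
    ("tech-A", date(2009, 3, 2), 6.5),
    ("tech-B", date(2009, 3, 2), 4.0),
    ("tech-A", date(2009, 3, 3), 7.0),
]
n_machines = 150

immediate = [wo for wo in work_orders if wo["type"] == "immediate"]

# Criterion 2: number of immediate actions and immediate actions per machine.
print("immediate actions:", len(immediate), "per machine:", len(immediate) / n_machines)

# Criterion 3: average words per immediate-action description (a proxy for history quality).
avg_words = sum(len(wo["description"].split()) for wo in immediate) / len(immediate)
print("average words per immediate action:", avg_words)

# Criterion 4: hours recorded per technician & day.
per_tech_day = {}
for tech, day, hours in time_bookings:
    per_tech_day[(tech, day)] = per_tech_day.get((tech, day), 0.0) + hours
print("hours per technician & day:", sum(per_tech_day.values()) / len(per_tech_day))
```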
2.1.3 Benchmarking
The second module, which is presented third, is the benchmarking. In this module, auditing results from several companies are automatically compared, using different groupings, in order to assess a maintenance department's performance relative to other maintenance departments. The grouping typically used is the CMMS sections presented in §2.1.1. Other groupings are the company's sector and size. Finally, an interesting comparison is with the same company's results in the previous year.
Table 2. Benchmarking results for Company 2
Section | Maximum | Average | Company 2 | Position
Corrective maintenance | 32 | 16,4 | 24,4 | 3
Preventive maintenance | 28 | 15,0 | 18,2 | 4
Spare parts | 30 | 12,1 | 24,9 | 1
Miscellaneous | 10 | 3,4 | 7,3 | 2
Total | 100 | 46,9 | 74,8 | 2
In the example of Table 2 are presented: a) the maximum marks in each section (100 in total); b) the average marks of all participating companies (in the specific period); c) the marks of Company 2 for each section and its total marks; d) Company 2's position in each section and in total. From Table 2 it can be noticed that the strongest section of Company 2 is "Spare parts" and the section with the most room for improvement is "Preventive Maintenance".
2.2 Companies feedback
After running the process, companies get feedback in three ways: a report sent to each company, a follow-up meeting, and the awards given.
2.2.1 Report
The report sent to each company consists of three parts:
a) Results of the audited criteria and equivalent marks (depending on the range of the results). Examples of results have been presented in Table 1.
b) Benchmarking, per CMMS section, comparing the maintenance department with other maintenance departments. An example has been presented in Table 2.
c) Recommendations. Examples for the companies being audited (§2.1.2, Table 1) could be: based on criterion 5, Company 1 is recommended to increase its PM schedules; based on criterion 7, Companies 1 & 2 are recommended to monitor their availability; based on criterion 11, Companies 3 & 4 are recommended to set and monitor a minimum stock level for their spare parts.
2.2.2 Review meeting
Usually, a review meeting takes place to discuss the report with each company's top management. The main purpose of this meeting is to identify the most significant points of potential improvement in the company and to decide on the procedures to be followed to achieve the improvements [1, p.6]. Usually, different points of potential improvement will be identified the following year.
2.2.3 Awards
Having the results on the audited criteria from all companies, and by assigning marks to the ranges of the auditing results, a total mark for each maintenance department's performance is extracted. The top performing companies are identified and three awards are given, which is an incentive and reward for the efforts of the maintenance departments.
2.3 Conclusion
2.3.1 Characteristics – Innovation
The common alternative procedure is through the physical presence of an auditor, who afterwards provides benchmarking and recommendations. The main characteristic and innovation of this process is the fact that it is automatic. This fact has two main effects: a) it minimises the cost, as it eliminates the need for the physical presence of an auditor; a lower cost increases the number of participating companies, so benchmarking results are more accurate and the process can be run more frequently; b) it ensures that the results are not biased, as human judgment is not involved. A second characteristic is that the auditing information is detailed, as thousands of data records are processed in order to obtain the results.
2.3.2 Benefits
The benefits of the procedure are as follows:
• Noble competition develops among the companies (especially the top performing ones). This competition results in the maintenance departments' motivation for improvement.
• Maintenance departments, as they know that they will be audited at the end of the year, continuously do their best throughout the year.
• More active top management participation, which is achieved due to the report received and the review meeting. Thus, specific targets are set and top management monitors them.
2.3.3 Future
The procedure described is planned to have the following three improvements:
• To further automate the process with the use of the Internet.
• To be adopted by more CMMSs. This will make the benchmarking results more reliable.
• To enhance the algorithm generating the recommendations. Universities' contribution to the algorithm improvement will be significant.
3 REFERENCES
1. EN 13306 Maintenance terminology.
2. EN 15341 Maintenance Key Performance Indicators.
3. Labib, A.W. (2008) Computerised Maintenance Management Systems, in "Complex Systems Maintenance Handbook", Edited by: K.A.H. Kobbacy and D.N.P. Murthy, Springer, ISBN 978-1-84800-010-0.
Acknowledgements
The second and third authors would like to acknowledge the British Leonardo for partial funding of this work under the project titled iLearn2main.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CONDITION MONITORING SUPPORTED DECISION PROCESS IN MAINTENANCE
Samo Ulaga (a), Mladen Jakovcic (b), Drago Frkovic (c)
(a) University of Maribor, Mechanical Engineering Department, Smetanova 17, 2000 Maribor, Slovenia.
(b) Croatian Metrology Society, Berislavićeva 8, 10 000 Zagreb, Croatia.
(c) INA d.d., Investment Management Sector, Lovinčićeva b.b., 10 000 Zagreb, Croatia.
In today's world of global competition, high reliability of technical systems and low life cycle cost are crucial for business profitability. A systematic rise in inherent equipment reliability usually reduces operational cost but strongly influences acquisition cost. Consequently companies are looking for other, financially more satisfactory solutions. The introduction of measures to improve the reliability of technical systems should be handled thoughtfully, to prevent the situation where the cost of changes exceeds the savings due to those changes. In accordance with the above, the introduction of any form of industrial diagnostics as part of preventive maintenance activities must also be an integral part of the general asset management strategy. All activities should be carefully planned and systematically performed, and a quality assurance mechanism must be introduced. A proposal for the sequence of activities necessary to achieve the expected goals is presented in this work.
Key Words: maintenance, reliability, condition monitoring, P-F interval
1 INTRODUCTION
Maintenance is often the largest controllable operating cost in many industries. It is also a critical business function that impacts plant output, product quality, production cost, safety and environmental performance. For these reasons maintenance should be regarded in best practice organisations not simply as a cost to be avoided but, together with reliability engineering, as a high leverage business function! A challenge for maintenance departments some years ago was to move out of a mostly reactive maintenance cycle. Their goal was to increase the productivity or effectiveness of existing personnel and make better use of the time allocated to maintenance. Today's competitive environment demands an increasing level of equipment reliability in most industries. Industry has made great strides in recent years in improving operational reliability. In addition to increasing acquisition costs to improve inherent equipment reliability, lowering operational and total life cycle costs is clearly a recommended practice. However, investments in increasing operational reliability must be well considered, to avoid exceeding the optimal point beyond which total life cycle costs (LCC) begin to increase with further attempts to improve reliability. A systematic approach to integrating different condition monitoring techniques, as part of predictive maintenance activities and also as an important source of information for setting a successful asset management policy in enterprises, is studied in this paper. Predictive maintenance aims to identify problems in equipment. By identifying problems in their initial stages, the predictive maintenance system gives notice of impending failure, so downtime can be scheduled for the most convenient and inexpensive time. Predictive maintenance therefore minimizes the probability of unexpected failures, which would result in lost production. Attempts to apply different condition monitoring techniques as a regular part of maintenance activities are often carried out in an unsystematic and ill-considered way. This commonly results in dissatisfaction and a poor cost-to-benefit ratio. When condition monitoring is introduced systematically, the proactive approach has many advantages:
- Equipment is only repaired when needed, so the costs of maintaining the machinery are reduced as resources are only used when required.
- Potential failures are identified in advance, and the severity of these failures can be substantially diminished by reducing or preventing secondary damage.
- Inventory costs are reduced because a substantial warning of impending failures is provided. Parts can be ordered when needed rather than being kept in large stock.
- Using a predictive maintenance programme, machines are only dismantled when necessary, and the probability of 'infant mortality' is reduced.
- Predictive maintenance requires data from the plant to be collected, stored and analysed. Consequently the efficiency of the plant equipment is observed constantly and weak spots are detected.
The above list presents just a part of the advantages provided by the systematic integration of machinery condition monitoring. It can be concluded that the maintenance decision-making process can benefit substantially from the information provided by the predictive maintenance approach.
2 THEORY SURVEY
Corrective maintenance activities are performed when action is taken to restore the functional capabilities of failed or malfunctioning equipment. These actions are triggered by the unscheduled event of an equipment failure. With this kind of maintenance, the maintenance related costs are usually high for reasons such as: secondary damage and safety hazards inflicted by the failure, the high cost of lost production, restoring equipment under crisis conditions … Preventive maintenance is the approach developed to avoid this kind of cost. Traditionally it is performed in the form of actions at fixed intervals, irrespective of the actual state of the maintained component.
Starting from new, a properly built and installed part will operate at a particular level of performance. As its operating life progresses, degradation occurs. Regardless of the reasons for the degradation, the item can no longer meet its original service requirements and its level of performance falls. In [1], 30 identical ball bearings were exposed to the same loading conditions. Figure 1 shows the heterogeneous distribution of the lifecycles they exhibited.
Figure 1. Random bearing failure (lifecycle in 10^6 revolutions for 30 identical ball bearings).
When analysing the behaviour of modern industrial machinery it has been proved that the majority of failures are not age-related. According to different studies and recognised textbooks like [2], there are six failure probability patterns, as shown in Figure 2.
Figure 2. Failure patterns (age-related patterns account for less than 25% of failures; random patterns for more than 75%).
The probability of failure then does not depend on the length of use, and there is no reliable procedure to predict the expected equipment life cycle. Consequently, time-based preventive maintenance is often pointless, or can even introduce failures while the preventive measures are being performed. On the other hand, by detecting the loss in condition of the item, advance information that degradation has started can be obtained. If one can detect this change in performance level, it provides a means to forecast a coming failure. A condition based approach is an appropriate option when the following conditions apply: the failure cannot be prevented by redesign or change of use; the events leading to failure occur in a random manner; measurable parameters which correlate with the onset of failures have been identified; and the selected cost-effective condition monitoring method is technically feasible.
2.1 Determination of the condition monitoring interval
The introduction of condition monitoring into the daily maintenance routine is always exposed to high expectations and must meet different requirements. The CM analyst usually aims for a higher monitoring frequency or even continuous monitoring. However, an increased number of monitoring tasks directly influences the cost effectiveness of the applied method. In [2] the author introduces the term P-F interval. The degradation process of equipment is represented by a curve, where time is shown on the abscissa and resistance to failure on the ordinate (Figure 3). At least two prerequisites should be fulfilled to apply CM to the machinery under consideration: a clear indication of decreased failure resistance, measurable by means of some CM method, and a consistent warning period prior to functional failure. The point in time of use at which the equipment has experienced a measurable decrease in resistance to failure is labelled P – the potential failure. The point of functional failure of the monitored equipment is labelled F. The warning period provided by a particular CM method is known as the P-F interval.
Figure 3. P-F interval (resistance to failure versus time; P marks the point where a change becomes detectable, i.e. the potential failure, and F the functional failure; inspections at interval t leave a net P-F warning period once the potential failure is detected).
Constraints that one should account for when deciding on the condition monitoring methods to be used and the inspection intervals to be prescribed are illustrated by the graph shown in Figure 3. A particular CM method must provide a sufficient net P-F interval, adequate for the maintenance organization to react from the moment that a potential failure is detected. For example, in the worst case the previous inspection may have been done just before the P point. The remaining warning period can then be calculated as:
net P-F = (P-F) - t        (1)
and it should still be long enough to prepare and perform the maintenance action before functional failure. Let us consider vibration monitoring of an industrial fan, as schematically shown in Figure 4. It represents critical equipment in the 24-hour production process of a steelworks, and it was decided to introduce a cost-effective condition monitoring programme for it. Vibration monitoring has been recognised as a technically feasible method for the purpose.
Figure 4. Industrial fan (with positions 1 to 4 marked).
Empirical warning times for an average bearing show that, depending on the load, it takes several months for damage to the outer ring to develop and some weeks to months for damage to the inner ring to develop. The most critical is failure of the rolling elements: it can take as little as a couple of weeks for a bearing with a damaged rolling element to fail! Consequently the monitoring interval should be set shorter than this for CM to be effective! In some cases a statistical approach to determining the condition monitoring task frequency can be applied. It requires precise and comprehensive maintenance records to be available for the equipment under consideration. For example, [3] states that for random failures the optimal CM frequency can be calculated using the following formula:
n = ln[ (-MTBF · Ci) / (T · (Cnpm - Cpf) · ln(1 - S)) ] / ln(1 - S)        (2)
where:
n = number of inspections during the P-F interval
T = P-F interval of the particular CM method
MTBF = mean time between failures
Ci = cost of one inspection task
Cpf = cost of correcting one potential failure
Cnpm = cost of not doing preventive maintenance, including the cost of lost production
S = probability of detecting the failure in one inspection
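A direct transcription of formula (2) as reconstructed above, with purely illustrative cost and reliability figures; since the formula gives the number of inspections per P-F interval, the corresponding inspection interval is simply T/n.

```python
import math

def optimal_inspections(T, mtbf, C_i, C_pf, C_npm, S):
    """Number of inspections n during the P-F interval T, per formula (2)."""
    return math.log(-mtbf * C_i / (T * (C_npm - C_pf) * math.log(1.0 - S))) / math.log(1.0 - S)

# Illustrative figures only: P-F interval 12 weeks, MTBF 200 weeks, inspection cost 200,
# cost of correcting a potential failure 5,000, cost of not doing preventive maintenance
# 60,000 (all in arbitrary money units), and an 80 % detection probability per inspection.
T, mtbf = 12.0, 200.0
n = optimal_inspections(T, mtbf, C_i=200.0, C_pf=5_000.0, C_npm=60_000.0, S=0.8)
print("inspections per P-F interval:", round(n, 2))
print("inspection interval [weeks] :", round(T / n, 2))
```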
To use such a formula it is crucial to know the MTBF of the equipment under consideration. It can be calculated from empirical formulae such as those suggested in [4]. A typical failure rate calculation considers the material properties, operating environment and critical failure modes at the component part level to evaluate the expected failure rate. Failure rate values can also be obtained from commercial reliability databases such as [5]. To make sure that the condition monitoring task frequency reflects the actual needs of the particular piece of equipment, it is advisable to use one's own maintenance records for the calculation of MTBF. The following formula can be used to calculate the failure rate of a particular failure mode:
T = Σ t_i (summed over i = 1 to n),   λ = x / T,   MTBF = 1 / λ        (3)
where:
n = number of identical machines under consideration
t_i = working time of the individual machine i
x = number of registered failures for the treated failure mode
λ = failure rate
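Applied to a company's own maintenance records, formula (3) reduces to a few lines; the working times and the failure count below are invented for the illustration.

```python
# Failure rate and MTBF from own maintenance records, per formula (3).
working_time = {            # cumulative working time per identical machine [h] (invented)
    "fan-1": 41_000.0,
    "fan-2": 38_500.0,
    "fan-3": 36_200.0,
}
failures = 4                 # registered failures for the treated failure mode (invented)

T = sum(working_time.values())       # total accumulated working time
lam = failures / T                   # failure rate (lambda)
mtbf = 1.0 / lam

print("total working time T [h]:", T)
print("failure rate [1/h]      :", lam)
print("MTBF [h]                :", round(mtbf))
```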
2.2 Selection of a suitable condition monitoring method
Analysis of similar industrial equipment has shown that in most cases there is no adverse age-reliability relationship. This is because the ages at failure are distributed in such a way that no consistent time can be expected between successive failures. Imposing an arbitrary preventive task at fixed intervals, regardless of the actual equipment condition, can even increase the average failure rate through "infant mortality". A variety of methods are available to assess the condition of machinery and to determine the most effective time to schedule and perform maintenance. These techniques should also be used to assess the quality of newly installed or rebuilt equipment. The overall equipment condition can be determined by using intrusive or non-intrusive methods. Process parameters like temperature, pressure, flow, rotational speed, power consumption etc. can also provide valuable information. Vibration monitoring is widely used to assess the condition of rotating components such as fans, gearboxes, pumps and motors. Lubricant analysis and wear particle analysis are used to identify problems with the lubricants and to detect increased component wear. Thermography is used to check electrical installations, electrical motors, hydraulic systems and other machinery where failures can be detected by a change in the surface temperature distribution. Ultrasonic leak detection can be used to monitor the condition of the compressed air infrastructure, etc. The type of equipment to be monitored and the failure modes to be detected must be defined in order to select a suitable CM method. It also has to be checked whether the equipment to be used is suitable for the actual conditions of application (environmental conditions, accessibility, safety requirements etc.). A deep understanding of the monitored equipment and of the condition monitoring techniques to be used is required to provide a sound basis for carrying out the monitoring activities safely and correctly. Misunderstanding or overvaluation of the applied method can lead to unexpected failures, disappointment and aimless wasting of money! A case of an unsuitable CM technique will be presented. A mixer in the rubber industry is powered by two electrical motors (1 MW and 1,6 MW respectively) through a four-stage gearbox (Figure 5). RMS velocity sensors and temperature sensors were installed to monitor the condition of the gearbox.
Figure 5. Four-stage gearbox.
Temperature and vibration readings are presented in Table 1.
Table 1. CM readings
Position (Fig. 5) | RMS_1 [mm/s] | RMS_2 [mm/s] | T_1 [°C] | T_2 [°C]
1 | 0.64 | 0.68 | 38 | 37
2 | 0.79 | 0.65 | 40 | 39
3 | 0.60 | 0.75 | 54 | 47
4 | 0.64 | 0.67 | 51 | 50
5 | 0.55 | 0.66 | 53 | 52
6 | 0.58 | 0.57 | 54 | 52
7 | 0.50 | 0.73 | 45 | 44
8 | 0.39 | 0.38 | 38 | 39
9 | 0.51 | 0.41 | 37 | 37
Detailed vibration measurements revealed a rather advanced failure of the NU2240 bearing (position 4)! The acceleration envelope spectra and the state of the bearing are shown in Figure 6.
Figure 6. Damaged bearing – position 4.
Readings in Table 1 denoted with index 1 (RMS_1, T_1) correspond to the time when the bearing failure was detected by detailed vibration measurement; readings denoted with index 2 (RMS_2, T_2) were recorded 1 week later. Within this week the acceleration increased by a factor of 4.4, while the temperature and velocity RMS readings show no sign of impending failure! It is obvious that the chosen CM technique is not adequate for the intended purpose.
3 SYSTEMATIC APPROACH TO CONDITION BASED MAINTENANCE
The basic principle of condition based maintenance is to perform measurements that enable the maintenance department to predict which machinery will need a maintenance action, and when. The introduction of new condition monitoring techniques into the daily maintenance routine is often an underestimated, and therefore not very successful, project. To avoid aimlessly wasting money and time, it is necessary to undertake the introduction and integration of different condition monitoring techniques into the general asset maintenance effort systematically! For the successful integration of any condition monitoring method it is very important to make sure that the benefits of applying such measurements are well explained to the staff. It has to be accepted as a powerful tool for improving maintenance efficiency, and not as an additional workload. Introducing changes into traditional practices is always a demanding and time-consuming task. Staff are often not cooperative and are difficult to motivate. It is necessary to clearly define the goals, advantages, workloads, tasks and responsibilities before the process can be initiated. It is an evolutionary process; no step-change in attitude should be expected! An activity flowchart for the systematic introduction of condition monitoring techniques is suggested in Figure 7. Based on the experience of the authors, the principal sequence of such a process should be as follows:
- Condition based maintenance is recognised as a vital part of the global plant strategy. There are clearly defined expectations regarding equipment reliability and LCC. The financial constraints of the project are defined.
- A thorough analysis of equipment failure criticality (with regard to safety, environmental impact and lost production cost) is performed. A list of critical equipment to be considered for condition monitoring is defined.
- For each piece of equipment to be monitored and each failure mode to be detected, a suitable CM method must be defined. Technical feasibility, effectiveness in detecting the target failure mode, cost effectiveness and availability of human resources for performing it should be the measures for selecting the appropriate CM technique.
- Production and maintenance staff should be acquainted with the purpose and importance of CM.
- A quality assurance system should be established. Measurements must be carefully planned. Detected irregularities must be reported to production management and to the people responsible for corrective actions. Control measurements must be performed after the repair.
- Transparent plant performance monitoring at different levels should be developed.
Figure 7. Systematic CM flowchart.
To enable impartial judgement and to follow the effectiveness of the applied maintenance programme, it is necessary to provide the information required to calculate the corresponding performance indicators. The equipment under consideration has to be catalogued. All maintenance events should be systematically registered. Criteria for the availability calculation should be defined. A tool enabling transparent plant performance monitoring at different levels should be developed. An example is shown in Figure 8.
Figure 8. Equipment status list (example columns: Nr., Plant, Process line, Location, Position, Item, Condition, Suggested measure, Deadline, Taskholder, Technologist, Responsible; rows 298 to 304 show anonymised entries with condition marked "OK" or "bad").
Such a status list provides valuable information and does not require sophisticated and expensive software tools. It can be realised, for example, in a simple Office environment and handled with in-house human resources. Information such as availability, down time, failure type and the applied corrective measure should also be regularly monitored and systematically registered.
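As a sketch of how little tooling such a status list actually requires, the snippet below writes a minimal list to a CSV file and computes availability over a reporting period from registered down-time events. The column names merely mirror the example in Figure 8 and the records are invented; they are not a prescribed format.

```python
import csv

# Invented status-list rows mirroring the columns of the example in Figure 8.
status_rows = [
    {"Nr": 298, "Plant": "plant-A", "Line": "line-1", "Item": "fan motor",
     "Condition": "OK",  "Suggested measure": "-",               "Deadline": "-",       "Responsible": "N.N."},
    {"Nr": 299, "Plant": "plant-A", "Line": "line-1", "Item": "gearbox",
     "Condition": "bad", "Suggested measure": "replace bearing", "Deadline": "week 40", "Responsible": "N.N."},
]
with open("status_list.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(status_rows[0]))
    writer.writeheader()
    writer.writerows(status_rows)

# Availability of the production line over a reporting period, from registered down-time events.
PERIOD_HOURS = 30 * 24
downtime_events = [("gearbox", 6.5), ("fan motor", 2.0)]   # (item, hours of line down time)
downtime = sum(hours for _, hours in downtime_events)
availability = (PERIOD_HOURS - downtime) / PERIOD_HOURS
print("availability over the period:", round(100 * availability, 2), "%")
```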
4 CONCLUSION
Predictive maintenance can increase the efficiency of the plant production process and improves safe and continued plant operation. By reducing the likelihood of unexpected equipment breakdowns, the safety of the employees is improved and possible environmental impacts are reduced. When introducing different condition monitoring techniques into daily maintenance routines, it is of crucial importance to do it systematically, as part of a well prepared asset maintenance strategy with clearly defined goals, time schedule and task holders. In the presented work it is shown how, in a real industrial environment, different condition monitoring techniques can be successfully implemented and integrated into daily maintenance efforts to assure a continuous and predictable production process. A systematic approach to integrating different condition monitoring techniques, as part of predictive maintenance activities and also as an important source of information for setting a successful asset management policy in enterprises, is studied in the paper. Such an approach has proven to be very efficient. The handling of measurements and the processing of measurement results are very transparent. It provides exact information regarding the status of the equipment under consideration to maintenance and production staff, and it is beneficial in establishing a proper image of the importance and usefulness of condition based maintenance to both production and maintenance!
5 REFERENCES
1. Eschmann et al. (1985) Ball and Roller Bearings: Theory, Design & Application. John Wiley & Sons.
2. Moubray, J. (1995) Reliability-centred Maintenance. Butterworth-Heinemann Ltd.
3. U.S. Department of Defense, MIL-STD-2173: Reliability Centered Maintenance Requirements for Naval Aircraft, Weapons Systems and Support Equipment.
4. NSWC (1998) Handbook of Reliability Prediction Procedures for Mechanical Equipment. Carderock Division.
5. Reliability Analysis Centre (1995) Nonelectronic Parts Reliability Data 1995. RAC.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
MICROMECHANICS OF WEAR AND ITS APPLICATION TO PREDICT THE SERVICE LIFE OF PNEUMATIC CONVEYING PIPELINES
A.A. Cenna (a), Kim Pang (a), K.C. Williams (b) and M.G. Jones (b)
(a) Mechanical Engineering, Faculty of Engineering and Built Environment, University of Newcastle, Callaghan, NSW 2308, Australia.
(b) Centre for Bulk Solids & Particulate Technologies, Faculty of Engineering and Built Environment, University of Newcastle, Callaghan, NSW 2308, Australia.
Pneumatic conveying involves the transportation of a wide variety of dry powdered and granular solids through pipelines and bends using high pressure gas. It is a frequently used method of material transport, particularly for in-plant transport over relatively short distances, primarily to exploit the degree of flexibility it offers in terms of pipeline routing as well as dust minimization. Approximately 80% of industrial systems are traditionally dilute phase systems, which use a relatively large amount of air to achieve high particle velocities in order to stay away from trouble such as blocking the pipeline. However, for many applications higher velocities lead to excessive levels of particle attrition or wear of pipelines, bends and fittings. To combat these problems, there are systems designed to operate at relatively low velocity regimes. Yet one problem remains a major issue with these conveying systems, which is wear. In pneumatic conveying, service life is dictated by wear in critical areas of the pipelines and bends due to higher interaction between the particles and the surface. Depending on the conveying conditions or modes of flow, the wear mechanism can be abrasive or erosive or a combination of both. Recent developments in predictive models of wear of materials showed that, by using the particle energy dissipated to the surface and the surface material properties, it is possible to predict the overall material loss from the surface. The material loss from the surface can then be converted to the pipeline thickness loss, which can be used to indicate the service life of the pipeline. In this paper, wear mechanisms in the critical wear areas of a pneumatic conveying pipeline have been analysed. Based on the wear mechanisms, predictive models have been selected from the literature. A number of factors have been incorporated to apply the model to pneumatic conveying processes. Conveying tests were performed in the laboratory to determine the time to failure as well as the gradual thickness loss in the bend. Finally, the experimental results have been compared with the model output and the variations have been analysed for further improvement of the models.
Key Words: Wear, abrasive, erosive, pneumatic conveying.
1 INTRODUCTION
Pneumatic conveying involves the transportation of a wide variety of dry powdered and granular solids through pipelines and bends using high pressure gas. It is a frequently used method of material transport, particularly for in-plant transport over relatively short distances, primarily to exploit the flexibility it offers in pipeline routing as well as dust minimisation. Approximately 80% of industrial systems are traditionally dilute phase systems, which use a relatively large amount of air to achieve high particle velocities and so avoid problems such as pipeline blockage. However, for many applications higher velocities lead to excessive levels of particle attrition and wear of pipelines, bends and fittings. To combat these problems, systems have been designed to operate in relatively low velocity regimes. Yet one problem remains a major issue with these conveying systems: wear. Wear is surface damage that generally involves progressive material loss due to relative motion between the surface and the contacting substance or substances. The broad classification is based on the primary interaction between the surface and the contacting substance(s): abrasive wear and erosive wear. Based on the interactions between the surface and the erodent, the material removal mechanism can be defined as cutting or deformation [1]. Although material can be removed from the surface by a single impact through cutting, material removal through deformation may involve multiple impacts as well as a secondary process such as fatigue.
1.1 Abrasive and Erosive Wear

Abrasive wear occurs when the abrasive material stays in contact with the surface during the wear event. It is further categorised according to the type of contact as well as the contact environment. The contact can be two-body (Figure 1(a)), where the abrasive slides along the surface and acts like a cutting tool. In the case of three-body abrasion (Figure 1(b)), the abrasives are trapped between two surfaces and are free to roll or slide. In both cases, it is possible to remove material through cutting and deformation.
Figure 1: Illustration of basic wear mechanisms, a) two-body abrasive wear, b) three-body abrasive wear and c) erosive wear.

The removal of material from a solid surface by the action of impinging solid or liquid particles is known as erosion. The primary difference between erosion and abrasion is the contact duration of the particles with the surface. In the case of erosion, particles impact the surface with a specific velocity and at a certain angle to the surface. Depending on the particle impact velocity and impact angle, three things can happen: the particle can leave the surface with a residual velocity, the particle can stop while cutting (and become embedded in the surface), or the particle can deform the surface without any material removal. As a result, the material removal mechanism can vary significantly depending on the particle and surface characteristics as well as the impact parameters. Similar to abrasive wear, material can be removed from the surface through cutting and deformation mechanisms. As mentioned earlier, removal of material occurs mainly by two different mechanisms: (a) the cutting action of the free moving particles and (b) deformation due to repeated collisions of particles with the surface, eventually breaking loose pieces of material. In practice, these two types of material degradation occur simultaneously. In the case of hard and brittle materials, cutting wear is negligibly small compared to deformation wear, whereas for soft and ductile materials cutting is the primary mechanism of material removal. Micro-cracking is the primary mechanism in brittle materials; material is removed by subsurface lateral cracks spreading parallel to the surface, meeting the longitudinal cracks or cracks rising to the surface. Plastic deformation of the surface due to particle interactions generates a scale-like topography on the surface of ductile materials which is harder than the substrate. Due to the fluctuation of pressure on the surface, this harder layer can be delaminated and removed through cracks and crack propagation [2]; micro-fatigue is a major contributor in this mechanism. If the particles are free to roll and slide on the surface, the surface can be subjected to a loading/unloading cycle due to the rolling contact of the particles. This causes micro-fatigue of the material, which can subsequently generate randomly shaped wear areas similar to brittle fracture of the wall material. Material removal by these mechanisms can be considerably greater than by cutting and deformation [3]. In pneumatic conveying, granular materials are transported through pipelines and bends. Particle interactions in the bends and in the re-acceleration zone of the pipeline depend primarily on the solids loading ratio (the ratio of the mass of solids to the mass of air). A detailed study of the flow structures in the bends as well as in the pipeline after the bends can be found in [4]. It was observed that for higher solids loading ratios, particles tend to accumulate in the bends and three-body abrasive wear becomes predominant after the impact. On the other hand, for lower solids loading ratios, erosive wear is the dominating wear mechanism.
Mills and Mason [5] have conducted numerous investigations of the wear behaviour in pneumatic conveying systems and have found that in severe wear situations, where a hole is formed in the pipeline, the majority of material loss is concentrated in the area where the hole has formed. Mills also found that the highest wear rates, defined by total material loss, did not coincide with the tests in which holing of the pipe occurred. This suggests that in the case of severe wear, the flow profile is not uniform across the pipe cross-section.

1.2 Analysis of Wear Mechanisms in an Industrial Pipeline

Understanding the wear mechanisms responsible for material removal in these areas is essential for the development of a predictive model for wear in pneumatic conveying. For a better understanding of the wear mechanisms in an industrial pneumatic conveying pipeline, worn sections have been analysed visually as well as using the Scanning Electron Microscope (SEM). Visual observations of the samples from the pneumatic conveying of alumina showed that there were specific areas where severe wear and holing of the pipe occurred. These are primarily the bends and the straight sections immediately after the bends. The critical wear areas of a pneumatic conveying pipeline have been discussed in detail in [4]. Critical wear areas after the bends are usually characterised by longitudinal channelling on the surface consistent with sliding wear. The continuous flow of material over these surfaces clearly produces three-body abrasive wear patterns.
Figure 2. a) Representative wear section from the pneumatic conveying pipeline. A backing plate had been used after the first appearance of a hole; the image shows the pipeline without the backing plate to reveal the extent of wear. b) Samples for surface analysis of wear mechanisms using SEM.

The wear section in Figure 2a is representative of the severe wear areas of the pipeline after a bend. This section was chosen for a detailed analysis to determine the dominant wear mechanisms in the pneumatic conveying pipeline. Samples were cut from the pipe sections for further analysis using SEM, as shown in Figure 2b; the backs of the samples were machined flat for mounting purposes. Observations indicated that a continuous flow of concentrated particles created long wear grooves on the pipe wall. These grooves narrowed with increasing material loss from the surface, eventually creating holes in the pipe wall. Surface analysis using SEM showed wear patterns consistent with mechanisms such as cutting and deformation (Figure 3a). In addition to cutting and deformation, lateral and longitudinal cracks were also revealed by the SEM surface analysis. Although crack formation and material removal through cracking are primary mechanisms in brittle materials, cracks formed in these samples because of the severe alteration of the surface characteristics. The formation of cracks and material removal through brittle behaviour has been discussed in detail in [2].
Figure 3: Surface characteristics in the high wear areas of the pipeline: a) surface ripples, characteristic of deformation wear; b) lateral and longitudinal cracks, presumably due to the pressure fluctuation in the pipeline.
Ripple formation on the wear surface is a well known characteristic of wear surfaces in ductile materials, especially for spherical erodents. The formation of ripples has been well documented by many researchers [6]. Ripples are formed when the rate of material removal due to cutting is less than the rate of material removal through deformation on the surface. Talia et al. [6] showed that angular particles are more efficient in removing material from the surface and, as a result, may not produce ripples as is the case with spherical particles. The formation of ripples with highly angular particles like alumina and ilmenite has been demonstrated by Kim Pang et al. [7]. Figure 3b) shows the lateral and longitudinal cracks on the mild steel surface from the conveying of alumina. This is one of the very specific wear characteristics of mild steel pipelines conveying alumina. In the process of conveying, ultra-fine alumina particles can be embedded into the surface of the mild steel around the impact areas. Due to the sintering capability of alumina, more alumina particles are sintered onto the surface to generate a thin hard layer, a so-called alumina transfer film. Due to the fluctuation of pressure as well as the fluctuation of the material impinging on the surface, this hard layer becomes delaminated from the substrate [8]. When the traction force of the sliding material becomes larger than the adhesion of the surface layer to the substrate, the layer starts to peel off the surface in segments. Although the hard coating tends to protect the softer substrate from cutting and deformation wear, delamination and subsequent cracking of the surface layer increase the material loss dramatically, by almost a factor of 2-4 [9].
2 DEVELOPMENT OF PREDICTIVE MODEL FOR WEAR IN PNEUMATIC CONVEYING PIPELINES
From the analysis of the material removal processes it is clear that one material removal mechanism cannot apply to all materials. In ductile materials, erosion occurs by a process of plastic deformation in which material is removed by the displacing or cutting action of the eroding particles. In brittle materials, material is removed by the intersection of cracks which radiate out from the point of impact of the eroding particle. Finnie [10] divided the erosion problem into two major parts. The first part involves the determination of the number, direction and velocity of the particles striking the surface from the fluid flow conditions; this is basically a problem of fluid mechanics. The second part is the calculation of the material removed from the surface. During erosion of a ductile material, a large number of abrasive particles strike the surface. Some of these particles land on flat faces and do no cutting, while others cut into the surface and remove material. Finnie developed a model for material removal by the particles which displace or cut away material from the surface. An idealised picture of the particle interaction with a material surface is presented in Figure 4.
Figure 4: Idealised picture of an abrasive particle striking a surface and removing material. The initial velocity of the particle's centre of gravity makes an angle α with the surface.

Finnie [10] derived and solved the equations of motion of the idealised particle and compared the predicted material loss with experimental results. To solve the equations of motion of the particle the following assumptions were made:

1) the ratio of the vertical to the horizontal component of the force is a constant (K); this is reasonable if a geometrically similar configuration is maintained throughout the period of cutting;
2) the ratio of the depth of contact (l) to the depth of cut has a constant value (ψ);
3) the particle cutting face is of uniform width, which is large compared to the cutting depth; and
4) a constant plastic flow stress (p) is reached immediately upon impact.

Based on these assumptions, the first micro-cutting model was developed from the deformation caused by an individual particle. The volume of material W removed by a single abrasive grain of mass m, velocity V and impact angle α is given by

$$W = \frac{m V^{2}}{p\,\psi\,K}\left[\sin 2\alpha - \frac{6}{K}\sin^{2}\alpha\right] \qquad \text{for } \tan\alpha \le \frac{K}{6} \qquad (1)$$

$$W = \frac{m V^{2}}{p\,\psi\,K}\cdot\frac{K\cos^{2}\alpha}{6} \qquad \text{for } \tan\alpha \ge \frac{K}{6} \qquad (2)$$
These two expressions predict the same weight loss when tan α = K/6, and the maximum erosion occurs at a slightly lower angle given by tan 2α = K/3. The first equation applies to lower impact angles, for which the particle leaves the surface while still cutting. The second equation applies to higher impact angles, for which the horizontal component of the particle motion ceases while it is still cutting. The critical angle α_c is the impact angle at which the horizontal velocity component has just become zero when the particle leaves the body, i.e. the impact angle above which the residual tangential speed of the particle equals zero. Based on this understanding of the material removal processes in erosion, Neilson and Gilchrist [11] proposed a simplified model for the erosion of material. They assumed a cutting wear factor f (the kinetic energy needed to release unit mass of material from the surface through cutting) and a deformation wear factor e (the kinetic energy needed to release unit mass of material from the surface through deformation) and proposed the following relationships for erosive wear loss based on the material and process parameters:
$$W = \frac{\tfrac{1}{2}M\left(V^{2}\cos^{2}\alpha - v_{p}^{2}\right)}{f} + \frac{\tfrac{1}{2}M\left(V\sin\alpha - K\right)^{2}}{e} \qquad \text{for } \alpha < \alpha_{0} \qquad (3)$$

$$W = \frac{\tfrac{1}{2}M V^{2}\cos^{2}\alpha}{f} + \frac{\tfrac{1}{2}M\left(V\sin\alpha - K\right)^{2}}{e} \qquad \text{for } \alpha > \alpha_{0} \qquad (4)$$

where W is the erosion value and M is the mass of particles striking at angle α with velocity V. K is the velocity component normal to the surface below which no erosion takes place in certain materials, and v_p is the residual parallel component of the particle velocity at small angles of attack. The first (cutting) term of equation (3) is denoted A, the first (cutting) term of equation (4) is denoted C, and the common second term is denoted B: part B accounts for deformation wear, while parts A and C account for cutting wear at small and large angles of attack respectively. α_0 is the angle at which v_p is zero, so that at this angle both equations predict the same erosion. Tests on ductile materials with constant particle velocity show that, as the angle of attack is increased from zero, the erosion initially increases at a rapid rate, but at larger angles the rate decreases. With this observation, the wear equation for small angles of attack was modified to

$$W = \frac{\tfrac{1}{2}M V^{2}\cos^{2}\alpha\,\sin(n\alpha)}{f} + \frac{\tfrac{1}{2}M\left(V\sin\alpha - K\right)^{2}}{e} \qquad \text{for } \alpha < \alpha_{0} \qquad (5)$$
The values of the parameters f, e and n can be obtained experimentally. To apply this model to predict the wear loss in pneumatic conveying pipeline systems, a number of factors have been incorporated into the model, because numerous factors related to the particles, the conveying parameters, and the pipeline material and layout affect the wear process. Some of these factors are combined into a system factor, while others have been considered separately. Finally, for practical purposes the model is simplified as follows:

Effective cutting energy:
$$W_{c} = E\,\cos^{2}\alpha\,\sin(n\alpha)\, f_{a}\, f_{c}\, \lambda \qquad (6)$$

Effective deformation energy:
$$W_{d} = E\,\sin^{2}\alpha\, f_{a}\, f_{c}\, \lambda \qquad (7)$$

where E is the particle impact energy, f_a is the energy absorption factor, f_c is the particle concentration factor and λ is the particle angularity factor. The total material loss is then calculated from the total energy and the cutting and deformation energy factors determined elsewhere [7]. The cutting and deformation energy factors have been experimentally determined for different surface and particle combinations. It was observed that the factors are velocity dependent; however, within the velocity range of pneumatic conveying, these factors can be considered constant. The final equation of the model is

$$W = \frac{W_{c}}{f} + \frac{W_{d}}{e} \qquad (8)$$
The above equation gives the total loss of material due to particles impinging on a unit area. Once the total mass loss is calculated, it is converted to a volume loss per unit area using the density of the surface material. This volume loss per unit area is then converted to a thickness loss for any duration and any flow conditions. One of the major factors in the model is the particle velocity. Models developed in the literature use the superficial gas velocity for the energy calculations. In this paper, the particle velocity has been used instead of the superficial gas velocity. Particle velocities have been measured using a high speed video camera for different conveying parameters, and a parametric equation has been developed to determine the particle velocity for any conveying parameters. Using the particle velocity in conjunction with other particle parameters, the particle energy before and after the impact can be calculated, and hence the energy dissipated into the surface.
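To make the chain from equations (6)-(8) to a thickness-loss estimate concrete, the following minimal sketch wires the steps together. All numerical values (impact energy flux, energy factors, correction factors) are placeholder assumptions for illustration only, not the experimentally determined values referred to in the paper.

```python
import math

def thickness_loss_rate(E_area, alpha, n, f, e, fa, fc, lam, rho):
    """Thickness loss per unit time from the simplified wear model.

    E_area : particle impact energy reaching the wall per unit area per hour (J/m^2/h, illustrative)
    alpha  : impact angle (rad)
    n      : exponent in the cutting term sin(n*alpha)
    f, e   : cutting and deformation energy factors (J per kg of material removed)
    fa, fc : energy absorption and particle concentration factors
    lam    : particle angularity factor
    rho    : density of the pipe wall material (kg/m^3)
    """
    Wc = E_area * math.cos(alpha) ** 2 * math.sin(n * alpha) * fa * fc * lam   # eq. (6)
    Wd = E_area * math.sin(alpha) ** 2 * fa * fc * lam                          # eq. (7)
    mass_loss = Wc / f + Wd / e        # eq. (8): mass removed per unit area per hour
    return mass_loss / rho             # thickness loss (m per hour)

# Placeholder numbers only: 25 degree impact on mild steel (7850 kg/m^3).
rate = thickness_loss_rate(E_area=5.0e4, alpha=math.radians(25), n=2.0,
                           f=4.0e6, e=9.0e6, fa=0.8, fc=1.2, lam=1.5, rho=7850.0)
print(f"predicted thickness loss: {rate * 1000:.4f} mm/hour")
```

The same routine, evaluated over the test duration with the measured particle velocities feeding E_area, is the kind of calculation compared against the bend measurements later in the paper.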
3 EXPERIMENTAL TEST RIG
For the verification of the model, a pneumatic conveying test rig was constructed at the University of Newcastle (Figure 5a). The conveying pipeline consists of a 25 m long, 50 mm diameter mild steel pipeline with horizontal-horizontal, horizontal-vertical and vertical-horizontal sections. The pipeline is also fitted with one short radius bend and two tee bends. Conveying tests were conducted using sand and alumina particles at different conveying conditions. A horizontal-horizontal short radius bend (Figure 5b) was taken as the test bend. Thickness losses were measured in the critical wear areas at different time intervals, and the model output was compared with the measured thickness loss for the verification of the model.
Figure 5: a) Pipe layout for the pneumatic conveying tests, b) a short radius test bend.

A rotary valve was used as the feeding device for the pneumatic conveying system. The main advantage of the rotary valve is that the solids flow rate can be measured and controlled from this device. The particle mass flow rate was maintained at 0.4 kg/s for sand. The air mass flow rate was maintained at 0.063 kg/s using a set of sonic nozzles. This equates to a solids loading ratio of about 6, which is within the definition of lean phase conveying. Although for lean phase conveying the particles were expected to be suspended in the conveying air (possibly uniformly distributed), high speed video evidence showed that the sand particles were concentrated along the bottom section of the pipeline. As a result, the critical wear area was expected to be below the horizontal centreline of the pipeline.
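The solids loading ratio quoted above follows directly from the two mass flow rates given in the test description; a one-line check:

```python
m_solids = 0.4    # kg/s, sand mass flow rate
m_air    = 0.063  # kg/s, air mass flow rate
print(f"solids loading ratio = {m_solids / m_air:.1f}")   # about 6.3, i.e. dilute (lean) phase
```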
4 EXPERIMENTAL RESULTS AND DISCUSSION
Two short radius bends were tested in this study to validate the predictive model, as shown in Figure 5b. For both bends, the gradual thickness losses were measured using an ultrasonic thickness gauge with an accuracy of 0.01 mm. For the convenience of the conveying process, the particles were separated from the air flow and recirculated into the process. Although the sand particles degraded considerably due to the recirculation, it was not practical to use single pass conveying tests. The conveying sand was maintained at an acceptable degree of degradation through monitoring of the particle size distribution. The degradation of the particles is reflected in the measured thickness loss data when compared with the model output, as discussed later.
From the measured thickness loss in the critical wear area of the bend, a wear profile was generated as shown in Figure 6. It was observed that the maximum wear point is below the median level of the pipeline. This is due to the fact that the sand particles were not totally suspended and uniformly distributed through the cross-section of the pipeline. The profile also revealed shallow wear grooves that represent the sliding wear after the initial impact of the particles, and showed that the wear is more localised in certain areas depending on the conveying parameters. Generation of the wear profile demonstrated that monitoring of thickness loss is important, especially to determine the critical wear point in the pipelines.
Figure 6: Wear profile from the critical wear area of a short radius bend worn by the pneumatic conveying of sand.
Figure 7 shows the inner surfaces of the two short radius bends worn during these conveying tests. Particle flow was from left to right. Both bends were sectioned on a vertical plane so that the location of the primary impact point as well as the wear profile on the surface could be observed unaltered. The lower part of the pictures shows the bottom of the pipeline. It was observed that the particle flow stream deflected from the primary impact point in both cases. For the low velocity impact (Figure 7a), the particles deflected upward moderately after the impact and the particle flow followed almost along the bend curvature towards the exit. This made the surface wear of the pipeline nearly uniform along the groove. For the higher velocity flow (Figure 7b), the particle deflection was considerably higher after the impact. The deflected particles met the incoming particles, generating a turbulent flow that creates a complex wear situation of sliding wear superimposed by impact wear. This wear situation is more localised compared to the surrounding areas of the impact zone, and it increases the wear loss dramatically compared to both the sliding and the erosive wear processes. For higher particle velocity the particle deflection angle is higher, and the wear tracks clearly showed a spiral flow of particles after the impact; this is a well documented feature of the pneumatic conveying process.

Figure 7: Observation of wear scars on the internal bend surfaces at two different conveying parameters (the original figure annotations mark the point of maximum thickness loss and, for the higher velocity case, the higher particle deflection angle and the spiral flow after impact).

With material loss from the initial wear point, the impact angle changes, which further increases the wear rate. This is another reason for the more localised wear in pneumatic conveying processes. It is reflected in the measured thickness loss towards the end of the tests, where the rate of thickness loss was seen to increase.
From the analysis of the wear surfaces at two different conveying parameters, it became clear why an average measure of wear, such as the mass loss from a particular bend, cannot predict the failure of the bend through holes: the critical wear point has a unique wear situation that progressively worsens. At lower velocities, wear can be uniformly distributed over certain areas due to the particle stream flowing along the pipeline surface (Figure 7a); as the mass loss is distributed over a wider area, the bend can sustain a greater mass loss before failure. On the other hand, at higher velocities, wear is more localised and the pipeline can fail at a lower mass loss compared to lower velocity conveying. Figure 8 presents the comparison of the experimentally measured thickness loss data and the thickness loss from the predictive model. Figure 8(a) shows the rate of thickness loss (mm/hr) as well as the cumulative thickness loss for the test duration. It can be seen that the rate of thickness loss gradually increased towards the later part of the test. This is expected, as discussed in the earlier paragraphs: with increasing loss of material, the surface profile changes in a way that favours increased energy dissipation to the surface, and the increased energy dissipated to the surface increases the material loss through cutting and deformation.
Figure 8: a) Rate of thickness loss and cumulative thickness loss from the bend wear; b) incremental thickness loss and cumulative thickness loss from the bend compared with the model output.
The incremental and cumulative thickness loss from the critical wear area of the short radius bend is presented in Figure 8(b), together with the model output for the cumulative thickness loss for the same pipeline configuration and conveying parameters. The model output has a similar trend to the experimental wear loss, but it consistently overestimated the measured thickness loss. This can be explained by considering the conveying of the sand and the measurement of the energy factors. The energy factors were measured using a single pass of the sand particles, which provided the maximum abrasivity of the particles in determining the energy factors; this is, in fact, the situation for pneumatic conveying in industry. For the conveying tests in the laboratory it was not practical to use single pass tests; instead, the particles were recycled for several hours. Although the particles degraded due to the recycling, the particle degradation was monitored through the particle size distribution throughout the tests for uniformity. Due to the degradation of the conveying material, lower wear rates were expected compared to the model output.
5 CONCLUSION
Wear mechanisms in the critical wear areas of a pneumatic conveying pipeline have been analysed with respect to the conveying parameters and pipeline configurations. Wear profiles in short radius bends have been observed for different conveying conditions. The well known spiral flow has been recognised from the wear profiles on the pipeline inner surfaces. Localised wear, particularly in higher velocity conveying, has also been observed and is believed to be the main cause of the irregular relationship between mass loss and failure of the pipeline. Predictive models from the literature have been discussed, and a practical way of implementing the model for predicting the wear loss in pneumatic conveying pipelines has been presented. As wear has serious consequences for the service life of pneumatic conveying pipelines, these models can be useful in predicting that service life.
6 REFERENCES

1 Bitter, J.G. (1963) Wear, 6, 5-21.
2 Cenna, A.A., Page, N.W., Williams, K.C. and Jones, M.G. (2008) Wear, 264, 905-913.
3 Evans, A.G. and Marshall, D.B. (1981) in Rigney, D.A. (ed.), Fundamentals of Friction and Wear of Materials, ASM, 439-452.
4 Cenna, A.A., Williams, K.C., Jones, M.G. and Page, N.W. (2006) Flow Visualisation in Dense Phase Pneumatic Conveying of Alumina, presented at the Inaugural World Congress on Engineering Asset Management, Gold Coast, Australia, 11-14 July 2006.
5 Mills, D. and Mason, J.S. (1976) Powder Technology, 17, 37-53.
6 Talia, J.E., Ballout, Y.A. and Scattergood, R.O. (1996) Wear, 196, 285-294.
7 Kim Pang, Cenna, A.A., Shengming, T. and Jones, M.G. (2009) To be presented at WCEAM 2009, Athens, Greece.
8 Meneve, J., Vercammen, K., Dekempeneer, E. and Smeets, J. (1997) Surface and Coatings Technology, 94-95, 476-482.
9 Levy, A. (1995) Solid Particle Erosion and Erosion-Corrosion of Materials, Ch. 5, ASM International.
10 Finnie, I. (1960) Wear, 3, 87-103.
11 Neilson, J.H. and Gilchrist, A. (1968) Wear, 11, 111-122.

Acknowledgement
This work forms part of the research program of the CRC for Integrated Engineering Asset Management Ltd.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
FLEXIBLE DISK COUPLING NUMERICAL ANALYSIS MODEL METHOD

Jae Gu Kim a, Young Seok Jang a, Han Eol Jeong b, JangIk Lim a, Byeong Keun Choi a

a Department of Precision Mechanical Engineering, Gyeongsang National University, 445 Inpyeong-dong, Tongyeong city, Gyeongsangnam-do, 650-160, Republic of Korea
b Turbolink Co. Ltd, 41-3 Palyong-dong, Changwon city, Gyeongsangnam-do, 641-847, Republic of Korea
A flexible disk coupling is a device that transmits rotational torque while accommodating misalignment. It consists of a body, disks and bolt-nut connections. The characteristics of a flexible disk coupling, such as transmitted torque, rotating speed and weight, are determined by the shape and the number of disks. In addition, the contact condition between the disks is the most influential factor for the characteristics of the coupling. In this paper, we focus on developing a vibration analysis and design tool for flexible disk couplings. First, a numerical analysis model is developed and modal analysis is performed using this model; second, the modal analysis results are compared with impact hammering test results to validate the numerical analysis model. The commercial program Ansys Workbench is used, and five cases of different contact conditions between the disks, as supported by Ansys Workbench, are considered in this paper.

Key Words: Numerical Analysis, Impact Hammering Test, Mode Shape, Flexible Disk Coupling, Contact Method
1 INTRODUCTION
The performance and quality of equipment have improved rapidly with recent industrial development, mechanical equipment has become more diverse, and its related parts have become increasingly specialised. For basic power transmission systems, the vibration characteristics [1] of each part are very important, so product development requires considerable effort. In the past, vibration characteristics were analysed using prototypes and provisional experiments. Since CAE (Computer Aided Engineering) [2] techniques were developed, however, an optimised prototype can be produced through design optimisation in simulations using CAE, and the experiment is then performed to evaluate the product using the optimised prototype. In this paper, to develop a simulation tool for flexible disk couplings, natural frequencies calculated by numerical analysis using Ansys Workbench [3] are compared with impact hammering tests on a prototype. The numerical analysis considered five different connection methods based on the pure, augmented and MPC (Multi Point Constraint) contact formulations provided in Ansys Workbench.
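For readers less familiar with the CAE side, the natural frequencies reported later come from the undamped generalized eigenvalue problem K φ = ω² M φ that such tools solve internally. The sketch below illustrates this on a toy two-degree-of-freedom system; the matrices are arbitrary illustrations, not the coupling finite element model.

```python
import numpy as np
from scipy.linalg import eigh

# Toy 2-DOF system (illustrative values only, not the coupling FE model)
K = np.array([[2.0e7, -1.0e7],
              [-1.0e7,  1.0e7]])   # stiffness matrix (N/m)
M = np.diag([10.0, 5.0])           # mass matrix (kg)

# Undamped free vibration: K @ phi = omega^2 * M @ phi
eigvals, eigvecs = eigh(K, M)                 # generalized symmetric eigenproblem
freqs_hz = np.sqrt(eigvals) / (2 * np.pi)     # convert rad/s to Hz
print("natural frequencies [Hz]:", np.round(freqs_hz, 1))
```

In the full FE model the matrices are assembled from the meshed structure, disk and bolts, and the contact formulation chosen for the disk interfaces changes the effective stiffness, which is exactly why the five contact cases below give different frequencies.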
2 MODELLING AND BOUNDARY CONDITION

2.1 Prototype flexible disk coupling

Figure 1 shows the flexible disk coupling studied in this paper, and Table 1 gives the specification of the model.
Figure 1. E4-04 model

Table 1. Specification of the E4-04 model
  Torque (kg·m)                500
  Max speed (rpm)              15,000
  Weight (kg)                  35.1
  Allowable end float (±mm)    4.5
2.2 Modelling
Figure 2. FEM model

The flexible disk coupling model was designed using CATIA V5, a 3D modelling program, and consists of the structure, the disk and the bolts. The model was imported into Ansys Workbench, where a mesh with 11,689 elements and 45,678 nodes was created. Figure 2 shows the FEM model in ANSYS and Table 2 gives the material properties. The structure consists of left, middle and right sides of the same material, connected by the disk and bolts. The real disk has 20 plates, but in the simulation it was defined as one plate. The six bolts connect the structures and the disk from two sides: three bolts are joined from the left side and the other three from the right side, and the contact condition between the bolts and the other parts is of the bonded type.
Table 2. Material properties
                          Structure   Disk    Bolt
  Young's modulus (GPa)   200         200     206
  Poisson's ratio         0.29        0.29    0.29
  Density (kg/m³)         7,850       7,850   7,850
2.3 Boundary condition

To obtain the natural frequencies of the flexible modes, the flexible coupling was fixed [4] as shown in Figure 3, because the coupling is connected to shafts and the power is transmitted through a key in the axial direction.
Figure 3. Boundary condition

The pure, augmented and MPC (Multi Point Constraint) contact formulations provided in Ansys Workbench were used. In addition, the modal analysis was divided into five cases by assigning contact stiffness [5] to the pure and augmented formulations, and the disk was divided into 1, 3, 5, 10 and 20 plates.

2.4 Modal analysis results

In the analysis results, the 1st bending mode shapes in the radial and axial directions are as shown in Figure 4.
Figure 4. 1st bending mode (left) and axial 1st mode (right)

The results are summarised in Table 3. In cases 1 and 2 the natural frequencies increased with the number of divided disks, while in case 3 they decreased. In cases 4 and 5, on the other hand, the results decreased up to 5 disks but increased from 10 disks.
Table 3. Analysis results (Unit: Hz)

  No. Disk   Mode   Case 1   Case 2   Case 3   Case 4   Case 5
  1          1st    148      148      149.33   131.99   131.99
             2nd    341      341      345      303      303
  3          1st    160      160      145      126      126.16
             2nd    379      379      334      289      289
  5          1st    171      171      144.48   122.26   122.26
             2nd    406      406      333      280      280
  10         1st    639      639      129.94   212.98   212.98
             2nd    933      933      305      300      300
  20         1st    673      673      133      232.66   390
             2nd    1,040    1,040    272      321      570
3 EXPERIMENT

3.1 Method

An impact hammering test was performed to obtain the natural frequencies using an acceleration sensor and an impact hammer. The experimental equipment consisted of a B&K Pulse 3560C analyser and amplifier. The frequency range was set to 500 Hz, the frequency resolution was 1.25 Hz, and three averages were taken.
Figure 5. Experimental system for the impact hammering test

Figure 5 shows the experimental system and Figure 6 shows the positions of the accelerometers and the impact point.
Figure 6. Modal test (X direction)
3.2 Impact hammering test result

Figure 7 shows the result of the impact hammering test. In this figure, the 1st mode is at 110 Hz and the 2nd at 270 Hz; the lower frequencies were caused by the foundation of the test holder.
Figure 7. Impact hammering test result
4 DISCUSSION
Figure 8. Comparison of the analysis results and the impact hammering test
Cases 1 and 2 showed larger errors than the other cases when compared with the experimental results as the number of disks increased. The results of the five partial cases, namely case 3 with 10 and 20 disks and cases 4 and 5 with 1, 3 and 5 disks, were the most similar to the impact hammering test.
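A simple way to quantify the comparison in Figure 8 is the relative error of each predicted natural frequency against the measured 110 Hz and 270 Hz. The short sketch below does this for the 10-disk values of cases 3 and 4 taken from Table 3, purely as an illustration of how the errors can be tabulated.

```python
measured = {"1st": 110.0, "2nd": 270.0}            # impact hammering test results (Hz)
predicted_10_disk = {                               # values from Table 3, 10 divided disks (Hz)
    "Case 3": {"1st": 129.94, "2nd": 305.0},
    "Case 4": {"1st": 212.98, "2nd": 300.0},
}
for case, freqs in predicted_10_disk.items():
    for mode, f_pred in freqs.items():
        err = 100.0 * (f_pred - measured[mode]) / measured[mode]
        print(f"{case}, {mode} mode: {f_pred:7.2f} Hz  ({err:+.1f} % vs. test)")
```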
5 CONCLUSION
In this paper, the natural frequencies from experiment and from numerical analysis were compared in order to develop a numerical analysis tool for flexible disk couplings. The objective was to find the appropriate connection method (pure, augmented, MPC) for each contact, so five cases of connection method were considered and compared. According to the results: (1) in cases 4 and 5, the disk does not need to be divided into many plates; (2) the best result was obtained with case 3 when the number of divided disks was 10. Therefore, to obtain a good result, the numerical analysis of a flexible disk coupling should follow case 3 with 10 disks.
6 REFERENCES
1 Harris, Cyril M. and Crede, C.E. (1995) Shock and Vibration Handbook. McGraw-Hill, New York.
2 Raphael, B. and Smith, I.F.C. (2003) Fundamentals of Computer Aided Engineering. John Wiley.
3 Lawrence, Kent (2007) ANSYS Tutorial Release 11. SDC: Schroff Development Corporation Publications, Kansas.
4 Zheng, Y., Hou, Z. and Rong, Y. (2008) "The study of fixture stiffness - Part II: contact stiffness identification between fixture components", Int J Adv Manuf Technol, DOI 10.1007/s00170-007-1077-x.
5 Yang, Fuqian (2006) "Effect of adhesion energy on the contact stiffness in nanoindentation", J Mater Res, Vol. 21, No. 10, Oct 2006.
6 Yang, Bo Suk (2006) Diagnostics and Vibration Condition Monitoring of Equipment. INTER VISION.
Acknowledgments
This research was supported by the Second Phase of the BK21 (Brain Korea 21) project, and the authors thank all concerned.
DISASSEMBLY PROCESS PLANNING USING BAYESIAN NETWORKS

Godichaud M. a, Pérès F. a and Tchangani A. a

a Laboratoire Génie de Production, Ecole Nationale d'Ingénieur de Tarbes, 47 Avenue d'Azereix, BP 1629, Tarbes Cedex, France
[email protected], [email protected], [email protected]

The management of end-of-life systems is becoming more and more important due to the awareness of their environmental impact. In this context, the disassembly process requires more attention, with the ultimate goal of making a profit. In this paper, we propose a new approach to determine the optimal disassembly plan of an end-of-life system by using Bayesian networks. To take advantage of existing approaches that use Petri nets to model such a process, a Petri net model is first established and then translated into a Bayesian network in order to take into account the inevitable uncertainties associated with such a process.

Key Words: Disassembly, modelling, Bayesian networks, uncertainties.
1 INTRODUCTION
The end-of-life phase of the system life-cycle has become more and more important over the last several years. This is firstly due to the reinforcement of government legislation on environmental protection, which forces system manufacturers to take care of disposal in an environmentally conscious way. Different activities are required to achieve this goal. They bring the components of the end-of-life system back to conditions that enable their reintroduction into the life-cycle of other systems. The principal activities are material recycling, remanufacturing and reusing. In this paper, we classify these activities into two generic classes:
– material recycling, which consists in material recovery of the system elements;
– functional recycling, which consists in functional recovery of the system elements.
These activities can generate profits for the actors of the end-of-life phase who manage the disassembly of the system. Today, the awareness of economic profit perspectives has become one of the principal motivations for system manufacturers to set up and develop disassembly processes. The objective of the disassembly process is to generate parts and subassemblies that respect the specifications and conditions of the recovery activities (recycling, reuse, …) to which they are assigned. The disassembly process includes sequences of separation actions that go from the whole end-of-life system to the valuable recycling products that the decision-maker has selected. Different separation actions are possible to obtain each product (destructive, non-destructive, shredding, sorting, …) and different recycling options can be in conflict in one disassembly strategy. The definition of a disassembly strategy begins with an analysis phase, which we decompose into three main tasks:
– identification of the components and subassemblies of the end-of-life system: the purpose here is to represent the product topology and mating relationships;
– identification of the recycling actors who can evaluate the different components from recycling viewpoints;
– analysis of the separation actions and resources in order to determine the precedence constraints between separation actions.
In the next step, the decision-maker has to determine the optimal solution according to some criteria. The solution must establish, for each component, the best recycling option, the disassembly level (i.e. for each subassembly, what is the best choice between recovering it and disassembling it further?) and, for each subassembly, the best type of separation action. For a system of significant size, these tasks become complex since the number of solutions can be very large.
Furthermore, the decision-maker has to manage the uncertainty of the disassembly process. This is an important characteristic of the process, but there are only few works that deal with it in the disassembly literature. The main uncertainties in the disassembly process are related to the states of the systems and components as well as to the demand for the recovered products. There are two approaches to cope with this problem: predictive or reactive. The first consists in selecting one solution that integrates the uncertain parameters; these can be taken into account in the decision model by introducing probabilities of success of disassembly operations [16] or by specifying parameter values by intervals [7]. The second approach consists in keeping alternative sequences in the disassembly process model in order to change the principal sequence if an operation fails [6] [10] [3]. The research issue we address in this work concerns the modelling of the decision problem in disassembly planning with uncertainty. The remainder of the paper is organised as follows: in the second section we present the necessary steps to model the disassembly planning problem; in the third section we present our approach for solving this problem using Bayesian networks; and in the last section we illustrate the approach on an example.
2 DISASSEMBLY PLANNING PROBLEM

2.1 Problem modelling framework
In this section, we present the different steps in modelling the disassembly planning problem that lead to optimal disassembly sequences. In a deterministic context, there are many works in the literature that address these problems (see for instance [2][7]). We present a way of linking these different approaches. Generally, the modelling of the disassembly planning problem requires three main steps (see Figure 1). The first step concerns the structure modelling of the end-of-life system. The goal of these models is to represent the valuable parts and subassemblies and the connections between them. The input data can be generated from CAD (Computer Aided Design) systems as well as MRP (Material Requirement Planning) systems, which facilitates the identification of the components that can be reused. Several models are proposed in the literature ([1][9][12]). The second step (denoted by Process models in Figure 1) concerns the modelling of the disassembly process. The obtained model represents the different operations that can be performed on the system to obtain the valuable parts and subassemblies. After each disassembly operation, the state (or structure) of the end-of-life system is modified. If necessary, the process model integrates these intermediate structures. This is of interest when the decision-maker wants to take into account the uncertainty of operation success: if one operation fails, one has to manage the intermediate component generated previously. The problem is to generate all the sequences of disassembly operations that respect the precedence constraints identified in the structure model. The main difference between the sequences is due to the changes of tools and disassembly axis as well as the varying accessibility of parts; the consequence is a variation of operation times and costs (see [7][15] for instance). The third step (Decision models in Figure 1) concerns the search for the optimal sequence among those identified in the process model. The purpose is to jointly determine the disassembly level and sequence. The model must take into account the preferences of the different stakeholders involved in the end-of-life phase of the system. Classic approaches model the decision problem as a linear program and solve it with existing algorithms ([7][8]).
Figure 1. Modelling approach
2.2 Variables of the disassembly process
As we can see, solving the disassembly planning problem involves different models and algorithms. Most of the works on this subject that we encounter in the literature propose their own method and modelling language. In most cases, the considered approaches do not facilitate the integration of uncertainties. Our goal is not to propose one more approach to solve this problem, but to integrate the uncertainties of the disassembly process on the basis of existing models in order to determine an optimal and robust solution. To achieve this objective, we use concepts and entities utilised by different approaches and then add uncertainties using a modelling language that copes with uncertainties. The main entities involved in the resolution of the disassembly planning problem are the following: (i) components: they represent the composition of the end-of-life system and can correspond to parts, subassemblies and/or intermediate disassembly states; (ii) connections: in relation with the components, they complete the structural point of view by representing the joints and/or contact connections as well as relationships between components and subassemblies; (iii) end-of-life variables: linked with each component, they describe the different recovery actions, which can be recycling actions or disassembly operations; (iv) contextual variables: attached to component and end-of-life variables, they model the recycling actors and other constraints on the recovery actions; (v) decision variables: related to component and end-of-life variables, they represent the different actions of the decision-maker and give a framework to the decision process; (vi) performance parameters: attached to end-of-life variables, they describe the consequences of the decision-maker's actions. On the basis of these entities, we propose in Figure 2 the generic framework for the modelling of the disassembly planning problem (UML class diagram).
Figure 2. Structure of disassembly variables

At this stage, we have outlined the steps in solving the disassembly planning problem and the different variables that decision-makers have to manage. We now introduce Bayesian networks to cope with uncertainties on some of these variables.
3 BAYESIAN NETWORKS FOR DISASSEMBLY PLANNING

3.1 Bayesian networks and influence diagrams
We propose to use Bayesian networks as the mathematical modelling tool to solve the following decision problem: for a given end-of-life system, determine the disassembly levels and sequences on the basis of the process model while taking into account the uncertainties of the disassembly process. We use Bayesian networks and their extension to influence diagrams because the problems we want to solve have the following features [4]: (i) they can be represented graphically, (ii) uncertainties need to be integrated and managed, and (iii) an optimisation problem under uncertainty needs to be solved.
The first reason (i) is important in a multi-actor context such as the problem considered here. Indeed, Bayesian networks and influence diagrams facilitate the understanding of the problem by all the actors by means of a simple and natural graphical representation. Furthermore, they enable interaction between these actors and the sharing of knowledge in a unique representation. A Bayesian network is a graph model in which knowledge is modelled as variables and each variable corresponds to a node in the graph; the directed arcs represent dependence relationships between the variables. The first step in developing a Bayesian network model consists in the elicitation of the variables of interest. The second point (ii) corresponds to the purpose of our approach. Bayesian networks enable inference, which consists in the determination of probabilities for hidden variables of the problem given evidence. When decision-makers think there are uncertainties on some variables, they can evaluate them by a probability formulation. Given the knowledge of the stakeholders, these probabilities can be conditional (they depend on some other variables) or marginal. Once the uncertainties of the disassembly process have been evaluated, decision-makers have to determine the optimal solution according to several criteria. Decision and utility nodes are then added to the Bayesian network, which becomes an influence diagram. It models the selection problem of end-of-life options for each component, including the utility of these options. In [11] [5], the authors propose inference algorithms that determine the optimal solution.
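As a concrete illustration of these ideas (nodes, conditional probability tables and inference given evidence), the sketch below builds a two-node network with the pgmpy Python library. The network, its states and the numerical probabilities are illustrative assumptions in the spirit of the paper's place/transition variables, not the model used by the authors (who work with the BNT toolbox for MATLAB).

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# One disassembly operation t1 producing a part P2 (states: na = not activated, s = success, f = failure).
model = BayesianNetwork([("t1", "P2")])

cpd_t1 = TabularCPD("t1", 3, [[0.2], [0.7], [0.1]],
                    state_names={"t1": ["na", "s", "f"]})
cpd_p2 = TabularCPD("P2", 3,
                    # columns: t1 = na, s, f   /   rows: P2 = na, s, f
                    [[1.0, 0.0, 1.0],
                     [0.0, 0.9, 0.0],
                     [0.0, 0.1, 0.0]],
                    evidence=["t1"], evidence_card=[3],
                    state_names={"P2": ["na", "s", "f"], "t1": ["na", "s", "f"]})
model.add_cpds(cpd_t1, cpd_p2)

# Inference given evidence: distribution of P2 once we observe that t1 was attempted and succeeded.
print(VariableElimination(model).query(["P2"], evidence={"t1": "s"}))
```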
3.2 Modelling the disassembly decision problem using influence diagrams
In this work, we propose to use influence diagrams (ID) as a decision tool to model the disassembly problem, and we suppose that a process model is given. The ID could be used directly to model the process ([4]), but we want to focus in this paper on the decision modelling. We use Disassembly Petri Nets (DPN) to model the disassembly process. The advantages of using Petri nets in this context are highlighted in [16]. Briefly, the DPN clearly describes the precedence constraints between operations in the disassembly process: the places represent the system, components and subassemblies, and the transitions represent joints and disassembly actions. The purpose is to represent all the possible sequences of operations. The decision problem is to determine the best sequence according to one or more criteria. Firstly, our approach consists in translating the DPN model into a disassembly influence diagram. We illustrate the procedure in Figure 3. The following rules have to be applied: 1) each place or transition in the DPN becomes a chance node in the ID (displayed as circles) (Figure 3(a)); 2) the variables are linked in the same way in both models; 3) decision variables are created in the ID whenever a firing conflict is identified in the DPN (Figure 3(b)); 4) utility nodes are created for each transition and end-of-life node; 5) end-of-life nodes are created for each product to represent the end-of-life options (they are noted Ox_P1 with x = 1, 2, … in Figure 3(c)). The previous example represents a simple disassembly decision problem that is finally modelled as in Figure 3(c): we have to select the best option between recycling P1 and disassembling P1 (firing t1); the second option implies the recycling of P2 and P3, and there is only one end-of-life option for each product. The proposed model looks more complex than the DPN because it takes into account more information in order to solve the decision problem (a small code sketch of these translation rules is given at the end of this subsection). Once the graph model has been generated, the variable domains and conditional probability tables (CPT) must be specified. We propose the following states for the place and transition variables:
– non-activated (na): the disassembly option corresponding to the variable is not selected;
– success (s): the disassembly option is selected and succeeds (i.e. the output products are obtained);
– failure (f): the disassembly option is selected but fails (i.e. the output products are not obtained or the product is stored waiting for a demand).
The consequences of the state 'f' are presented in the next part. For the decision nodes, the states correspond to all the disassembly options of the level at which the decision node is placed. At this stage, contextual variables can be integrated into the model to represent the causes of uncertainties: it could be the state of a joint that causes the failure of an operation, or the demand for a product that causes the failure of a recovery action.
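The translation rules above can be read as a simple graph transformation. The following minimal sketch (our own illustration, with a hypothetical dpn_to_id helper and a toy net in the spirit of Figure 3) derives the chance, decision and utility node lists from a DPN description in which a firing conflict is any place with more than one possible output option.

```python
def dpn_to_id(places, transitions, arcs):
    """Sketch of the translation rules: every DPN place/transition becomes a chance
    node; a decision node D_<place> is created wherever a place has more than one
    output option (a firing conflict); a utility node is attached to each transition.
    `arcs` maps each place to the list of its output transitions / end-of-life options.
    """
    chance_nodes = list(places) + list(transitions)
    decision_nodes = [f"D_{p}" for p, outs in arcs.items() if len(outs) > 1]
    utility_nodes = [f"U_{t}" for t in transitions]
    return chance_nodes, decision_nodes, utility_nodes

# Toy example in the spirit of Figure 3: P1 can either be recycled (O1_P1) or disassembled by t1 into P2 and P3.
print(dpn_to_id(places=["P1", "P2", "P3"], transitions=["t1"],
                arcs={"P1": ["t1", "O1_P1"], "P2": ["O1_P2"], "P3": ["O1_P3"]}))
```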
[Figure 3 (caption below) is built up in four panels: (a) translation of the DPN into an influence diagram, (b) integration of decision nodes, (c) integration of end-of-life options, and (d) the final model, with place nodes P1, P2, P3, transition node t1, decision node D_P1 and end-of-life option nodes O1_P1, O2_P1, O1_P2.]
Figure 3. Disassembly process model using influence diagram

The CPT must translate the firing rules of the DPN and integrate the uncertainties. For a given place/transition variable, if the uncertainty is directly integrated in the variable (i.e. no contextual variable is used), the CPT is as presented in Tableau 1, where p corresponds to the success rate of the recovery action or disassembly operation. The table should be adapted according to the context and the graph model.

Tableau 1. CPT modelling
            T = na   T = s   T = f
  P = s       0        p       0
  P = na      1        0       1
  P = f       0       1-p      0
This CPT can be associated with the place variable P2 in Figure 3 if P corresponds to variable P2 and T corresponds to variable t1. The translation of the DPN firing mechanisms into the CPT is as follows:
1) for transition variables ti in the ID, which correspond to the transition nodes ti in the DPN:
   a) state na (in the ID) means that the transition (in the DPN) cannot be fired, or can be fired but we decide not to fire it (conflict resolution);
   b) state s means that ti can be fired and is fired (tokens are moved from the input places to the output places);
   c) state f means that ti can be fired but fails (it is not fired);
2) for place variables Pi in the ID, which correspond to the place nodes Pi in the DPN:
   a) state na means that there is no token in the place;
   b) states s and f mean that there is a token in the place and the output transitions are not fired.
Once we have translated the DPN behaviour mechanisms into the CPTs of the ID variables (i.e. the process model into the decision model), we have to determine the states (retrieved components, subassemblies, joint states, …) of the disassembly process according to the different decision configurations.
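For completeness, Tableau 1 can be written as a small helper that returns the CPT of a place variable as a function of the success rate p. This is our own illustrative encoding of the table, not code from the paper.

```python
def place_cpt(p):
    """Conditional probability table P(place | transition) from Tableau 1.

    p is the success rate of the recovery action / disassembly operation.
    The outer key is the transition state, the inner dict is the distribution of the place state.
    """
    return {
        "na": {"na": 1.0, "s": 0.0, "f": 0.0},   # option not selected: no output product
        "s":  {"na": 0.0, "s": p,   "f": 1.0 - p},
        "f":  {"na": 1.0, "s": 0.0, "f": 0.0},   # transition fails: the token stays upstream
    }

print(place_cpt(0.9)["s"])   # distribution of the place variable when the operation is attempted
```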
3.3 Evaluation of disassembly solutions
Disassembly solutions are evaluated by means of utility nodes in the ID. They model the economic performance of the different recovery actions and disassembly operations. ID models enable the integration of utilities in table form, as presented in Tableau 2. There are three types of utility nodes:
– disassembly cost nodes: linked to a disassembly operation node, their value is a function of the realisation of the disassembly operation;
– recycling cost nodes: they evaluate each realisation mode of the recycling node to which they are linked;
– recycling revenue nodes: they model the economic flow that is generated when a recycling option is validated.
Tableau 2. Performance parameters

  Pi        na   s    f            tj        na   s    f
  Utility   0    ri   cfi          Utility   0    cj   cfj

  ri: profit of the recovery action associated with Pi
  cfi: cost of the failure of the recovery or disassembly action
  cj: cost of the disassembly action
An example of the evaluation of an operation is given in Figure 4. A cost parameter is associated with each realisation mode of the operation (t1 and t2 correspond to different durations and ar corresponds to a stoppage). Given the uncertainty on the realisation of the operation, the operation is evaluated through an expected utility calculation.
Expected utility of the operation: EU(Operation) = Pr(t1)·c1 + Pr(t2)·c2 + Pr(ar)·c3

Figure 4. Evaluation of a disassembly operation

These different utility nodes allow the optimisation of a criterion for selecting a disassembly plan. This criterion is decomposed at each product in order to select the option or operation for each of them. To achieve this goal, the decision node of each product indicates the evaluation of each option. The decision rule consists in selecting the option that maximises the expected utility of the product; it is called the end-of-life policy. The set of all policies forms the strategy: it gives the products that have to be generated from the end-of-life system, the recycling options for these products and the disassembly operations that are needed to generate them. The purpose of the optimisation method is to determine the strategy for a given end-of-life system. There are different methods to solve decision graphs (see [5] for a presentation). In [11], the authors propose an algorithm to solve multistage decision problems represented as influence diagrams. We use this algorithm as implemented in BNT, a Bayesian network toolbox for MATLAB (see [13] for a presentation of the algorithms used in this toolbox).
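The decision rule just described (pick, for each product, the option with the maximum expected utility) can be illustrated with a few lines of code. The options, probabilities and utilities below are invented placeholders in the spirit of Figure 4, not values from the telephone example.

```python
def expected_utility(option):
    """EU of one end-of-life option: sum over realisation modes of Pr(mode) * utility(mode)."""
    return sum(prob * utility for prob, utility in option["modes"])

# Illustrative numbers only (costs negative, revenues positive).
options = {
    "recycle P1 directly": {"modes": [(1.0, 4.0)]},
    "disassemble P1 (t1)": {"modes": [(0.9, -2.0 + 5.0),    # operation succeeds: cost plus downstream revenue
                                      (0.1, -2.0 - 1.0)]},  # operation fails: cost plus failure penalty
}
best = max(options, key=lambda name: expected_utility(options[name]))
for name, opt in options.items():
    print(f"{name}: EU = {expected_utility(opt):+.2f}")
print("end-of-life policy for P1:", best)
```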
3.4 Predictive or reactive disassembly strategy
As we mentioned before, the disassembly strategies with uncertainties can be predictive or reactive. The advantage of disassembly ID is that it can manage the both cases. This is due to the fact that we can enter observation on the state of variables in the network as evidence. In a predictive context, no observation is made on variables and the optimal sequence is determined before the beginning of the process. The reactive case is encountered when the process has begun and a disassembly operation fails. The user may want to search a new sequence from the operations that have already been made. He can enter the following observations on transition variables: 1) ti = s for the operations that succeed, 2) ti = f for the operations that failed. The searching for an alternative sequence takes into account these observations. Others observations could be integrated as contextual variables such as the state of a joint that might cause the failure of an operation. In both predictive and reactive disassembly strategy, the user must determine the success rates of disassembly operations and the probability of demand for recovery products. For the case of disassembly operations, the success rate can be evaluated
by the record of the total number of successes, N′t, divided by the total number of executions, Nt, as presented in [16]: p(ti) = N′t / Nt. Parameter learning algorithms for Bayesian networks could also be used; some of them are presented in [14][13], for instance.
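As a small illustration of this subsection (hypothetical records, not the BNT toolbox interface), the sketch below estimates success rates as p(ti) = N′t/Nt from execution records and uses evidence entered on already-executed operations to discard sequences that rely on an operation observed to have failed, as in the reactive case.

# Sketch with hypothetical records (not the BNT toolbox interface): success rates
# estimated as p(ti) = N't / Nt, and evidence on executed operations used to
# discard sequences that rely on an operation observed to have failed.

def success_rate(n_success, n_total):
    return n_success / n_total if n_total else None

records = {"t1": (45, 50), "t2": (38, 40), "t3": (40, 50)}        # (N't, Nt)
p = {t: success_rate(ns, nt) for t, (ns, nt) in records.items()}  # {'t1': 0.9, ...}

evidence = {"t1": "s", "t3": "f"}    # reactive case: t1 succeeded, t3 failed

def consistent(sequence, evidence):
    """A sequence remains admissible only if no observed-failed operation is used."""
    return all(evidence.get(op, "s") == "s" for op in sequence)

candidates = [["t1", "t3"], ["t1", "t2", "t4"]]
print([seq for seq in candidates if consistent(seq, evidence)])   # [['t1', 't2', 't4']]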
4 APPLICATION EXAMPLE
We use a telephone example to illustrate our approach to disassembly planning. It is extracted from [2] or [16], where the authors propose a DPN to model the disassembly process, presented in Figure 5. The telephone product consists of four parts A, B, C, D. The place P1 represents the first assembly state and P2, P3, P4 represent subassemblies. The parts are represented by P5, P6, P7, and P8. The transitions represent operations, which in this example consist of removing joints. The end-of-life values and operation costs are denoted by ri and ci (see Figure 5 for numerical values). As explained before, the DPN displays the precedence constraints between operations.
Figure 5. Disassembly models of a telephone

We apply the transformation rules presented in this paper to obtain the ID model illustrated in Figure 5. The names of the variables correspond to the node names of the DPN. We can see that there are four decision nodes and each configuration of the decision node values is a disassembly solution. In fact, some configurations are the same, due to the fact that when we stop the disassembly process at an intermediate stage, the successor decision nodes are not necessary. Indeed, we want to determine the disassembly level and sequence, so the Pi variables mean that the associated component is recovered (selected by means of the D_Pi variable) and the disassembly process can stop there. We apply the solving algorithm for the case with non-uncertain data (p=1 in the CPT of the disassembly variables) and we obtain the following result: 1) disassembly sequence: t1 - t3; 2) disassembly level: {P4, P6, P8}; 3) MEU = -2.5. This result is not the same as in [12] since these authors add a constraint: parts A and D must be reused at the material level. We add this constraint in the ID by entering evidence to forbid the activation of the subassemblies containing A and D. We then enter P1 = NA, P2 = NA and P4 = NA. We obtain the same result as in [12]: 1) disassembly sequence: t1 - t2 - t4; 2) disassembly level: {P5, P6, P7, P8}; 3) MEU = -3. In the same way, we can manage a reactive disassembly strategy. If operation t3 fails, we determine an alternative sequence after entering t3 = na. The solution is then the same as the latter.
Table 3. Success rate of operations

  Operation   p
  t1          0.9
  t2          0.95
  t3          0.8
  t4          0.9
  t5          0.95
When applying a predictive disassembly strategy, the parameters pk have to be learnt for each disassembly variable (Pk or tk). To test our approach, we arbitrarily set the parameters of the CPTs corresponding to transition nodes and we only enter uncertainties on the disassembly operations; they are displayed in Table 3. For the cost parameters, we consider here that the failure of an operation tj implies stopping the process at the predecessor Pi variable: cfj = ri + cj. Indeed, in this example, all the utilities of the intermediate sub-assemblies are negative.
The resulting disassembly sequence and level are the same as in the first case, but with MEU = -3.29. Although the success rate of t2 is higher than that of t3, the optimal strategy takes t3. This is due to the large negative utility of P3 (BC) with regard to P4 (BD). If we had used the failure risk of the disassembly sequence as the criterion, it could have been more interesting to take t2 instead of t3. Other criteria can be used to determine the optimal disassembly strategy, such as inventory cost or demand for recovered products.
5 CONCLUSION
In this paper, we present an approach to cope with uncertainty in the disassembly planning process. We propose to use Bayesian networks and their extension to influence diagrams as the underlying tool. The influence diagram model can either be constructed directly from the problem specification or be a translation of a Petri net model. Our method consists in translating a Disassembly Petri Net (DPN) into an influence diagram to determine the disassembly strategy. We have tested this approach on an example from the literature. Future research includes the integration of multi-criteria approaches to solve the decision problem. Furthermore, we want to automate the translation from DPN to influence diagrams to decrease the modelling effort.
6 REFERENCES
[1] Chen S., Oliver J., Chou S. (1997) Parallel disassembly by onion peeling, Journal of Mechanical Design, 119(2), 267-274.
[2] Dutta L. (2006) Contribution à l'étude de la conduite des systèmes de désassemblage, PhD dissertation, Université de Franche-Comté, France.
[3] Geiger D., Zussman E. (1997) Probabilistic Reactive Disassembly Planning, Annals of the CIRP, 45(1), 49-52.
[4] Godichaud M. (2009) Outils d'aide à la décision pour la sélection des filières de valorisation des produits de la déconstruction des systèmes en fin de vie : application au domaine aéronautique, PhD dissertation, Université de Toulouse, France.
[5] Jensen F.V., Nielsen T.D. (2007) Bayesian Networks and Decision Graphs, Springer.
[6] Kanai S., Sasaki R., Kishinami T. (1999) Representation of Product and Processes for Planning Disassembly, Shredding, and Material Sorting based on Graphs, Proceedings of the IEEE International Symposium on Assembly and Task Planning, Porto, Portugal, 123-128.
[7] Kang J-G. (2005) Non-linear disassembly planning for increasing the end-of-life value of products, PhD dissertation, Ecole Polytechnique de Lausanne.
[8] Lambert A.J.D. (1999) Linear Programming in Disassembly/Clustering Sequence Generation, Computers and Industrial Engineering, 36(4), 723-738.
[9] Laperrière L., Elmaraghy H. (1992) Planning of products assembly and disassembly, Annals of the CIRP, 41(1), 5-9.
[10] Martinez M., Pham V.H., Favrel J. (1997) Dynamic Generation of Disassembly Sequences, Proceedings of the 6th International Conference on Emerging Technologies and Factory Automation, 177-182.
[11] Lauritzen S., Nilsson D. (2001) Representing and solving decision problems with limited information, Management Science, 47(9), 1235-1251.
[12] Moore K., Gungor A., Gupta S. (1998) Disassembly Petri net generation in the presence of XOR precedence relationships, Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, San Diego, California, 13-18.
[13] Murphy K.P. (2002) Dynamic Bayesian Networks: Representation, Inference and Learning, PhD thesis, University of California, Berkeley.
[14] Naïm P., Wuillemin P-H., Leray P., Pourret O., Becker A. (1999) Réseaux Bayésiens, Eyrolles.
[15] Sanchoy K.D., Sandeep N. (2002) Process planning for product disassembly, International Journal of Production Research, 40(6), 1335-1355.
[16] Zussman E., Zhou M. (1999) A methodology for modeling and adaptive planning of disassembly processes, IEEE Transactions on Robotics and Automation, 15(1), 190-194.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DIAGNOSIS FOR IMPROVED MAINTENANCE SERVICES: ANALYSIS OF STANDARDS RELATED TO CONDITION BASED MAINTENANCE
Luca Fumagalli a,b, Erkki Jantunen c, Marco Garetti a,d, Marco Macchi a,d
a Politecnico di Milano, Department of Management, Economics and Industrial Engineering, P.zza Leonardo Da Vinci 23, 20133, Milano, Italy
b Corresponding author: Email: [email protected], d Email: [email protected], [email protected]
c VTT Technical Research Centre of Finland, P.O. Box 1702, FIN-02044 VTT, Finland, Email: [email protected]
Depending on the maintenance strategy there can be enormous differences in how much energy machinery uses and how much waste it produces. It has become common practice to study the efficiency of production machinery together with the quality of production and the availability of this machinery, i.e. the overall effectiveness is studied. In order to reach high efficiency, high availability and good quality, the production machinery has to be in a condition to fulfil these goals. In principle there are two questions that need to be answered when maintenance is planned for tackling the above-described situation: 1) What do we have to do? 2) When do we need to take action? The maintenance strategy that has been developed to answer these questions in an optimal way is Condition Based Maintenance (CBM), i.e. the maintenance actions are based on the needs of the machinery. Up to this level everything is very logical and clear, but unfortunately the current reality in industry is far from optimal. It is not an easy task to define the condition of production machinery, and it is not easy to say what needs to be done and when. This paper is oriented to help answer the What question, i.e. diagnosis of the condition of machinery; the When question, i.e. prognosis of wear development, is not discussed in detail. However, the What question as such is already very demanding. The reason for this is that automatic diagnosis should be based on measurements of the condition, and this becomes very difficult in practice due to the differences in production machinery and the difficulty of separating the condition information from information that is related to the production parameters. The paper provides an ontology of diagnosis in order to support the building of diagnostic tools. The motivation for relying on a defined ontology is that it takes a lot of work to define a reliable diagnosis tool, and it is also very demanding to keep the system working when changes in the machinery or the software environment take place. The ontology is built in such a way that diagnosis of similar machinery can be identified, and of new types of machinery as well, based on the similarity of components. One of the most important findings in the development of this process has been that, in order to make the development of the ontology possible in practice and to be able to keep the system up to date, it is necessary to rely on available definitions in the form of standards and practices that other developers are prepared to support and update. The main focus of this paper is on defining the environment that supports the development of the ontology of diagnosis of the condition of rotating machinery. To build the ontology, both references related to standards for the exchange of information in a CBM system [1] and upper ontologies are analysed. Upper ontologies define top-level classes such as physical objects and activities from which more specific classes and relations can be defined. For example, following the scope of this paper, ISO 15926-2 [2] is analysed. Using upper ontologies to develop a more specific domain ontology makes it possible to define concepts based on the more general concepts provided by the upper ontology, avoiding reinventing the wheel while achieving better integration [3] and standardization.
1 INTRODUCTION
Maintenance is becoming a crucial competence factor in the management of assets, also due to the growing attention to sustainability [4] issues. Maintenance is the most efficient way to keep the functional level of a product above the required level, also from the viewpoint of environmental impact. This concept is not new, since, e.g., sustainability issues date back to the 1980s. What is new, instead, is that new technologies now enable maintenance activities to be carried out in such an efficient way that corporate profits can also benefit. The concept of CBM was proposed based on the development of machine diagnostic techniques in the 1970s. In the case of CBM, preventive actions are taken when symptoms of failures are recognized through monitoring or diagnosis. Only later did authors start to investigate diagnostic systems in depth, mainly thanks to the availability of IT solutions. Works of this decade about condition monitoring and diagnosis are e.g. [5], [6], [7], [8]. Equipment must be maintained according to its characteristics. To this end, different decision-making frameworks are available in the literature to explain which maintenance policy must be adopted in different industrial environments. To provide an example, it is possible to refer to [9], where a decision-making framework is provided that also enables finding when a CBM approach is suitable. Some questions are formulated to the decision maker in order to decide upon the maintenance policy. In particular, to arrive at CBM, the following combination of questions and answers is necessary, according to [9]:
• Is the equipment critical for the system dependability? YES
• Is the fault hidden, not detectable for operating crew in normal operation? NO
• Does the condition degrade noticeably before functional failure? YES
• Is the degraded condition detectable during normal operation? NO
• Are there reliable means to measure the condition or performance? YES
Result: Condition monitoring: i) automatic or human-based sensing; ii) process parameter monitoring. Once the CBM approach is identified as suitable, the decision maker should then decide upon the best diagnostic technique to adopt. However, this is not a trivial task since different techniques are available. The aim of this paper is to clarify which information is to be taken into account to perform diagnosis on industrial equipment. This paper represents a short literature review of: i) monitoring techniques, ii) standards for CBM. Standards for CBM are presented in order to understand which information is available to the maintenance decision maker who tries to face the problem presented here. The paper is structured in the following way. Paragraph 2 briefly describes the main monitoring techniques, according to [10]. Paragraph 3 summarises the concept of ontology as a means to help carry out this research. Paragraph 4 deals with the analysis of standards. Paragraph 5 summarises the analysis of the standards, integrating it with other references and proposing an ontology. Paragraph 6 ends the paper with conclusions and issues related to further development of the present work.
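Purely as an illustration, the following minimal sketch encodes the question/answer combination from [9] as a simple check; the function name and boolean encoding are assumptions for this example, not part of the cited framework.

# Sketch of the question/answer combination from [9] leading to condition monitoring;
# the function name and boolean encoding are illustrative, not part of the framework.

def cbm_suitable(critical, fault_hidden, degrades_noticeably,
                 detectable_in_normal_operation, reliable_measurement):
    return (critical and not fault_hidden and degrades_noticeably
            and not detectable_in_normal_operation and reliable_measurement)

# Answers YES, NO, YES, NO, YES as in the list above
print(cbm_suitable(True, False, True, False, True))   # -> True: condition monitoring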
2 MONITORING TECHNIQUES – INDUSTRIAL APPLICATION
A variety of techniques can be used as part of a CBM program, and a good CBM program needs the adoption of several techniques that can be used together in a plant system. In [10] a survey is presented, carried out in 2004 in 15 different countries, including the Americas, Europe, Japan, Australia, South East Asia, the Middle East and Africa, on a sample of 157 companies. According to this survey, some of the most adopted diagnostic techniques are:
• Vibration analysis, adopted by 94% of the interviewed companies (148 of 157)
• Oil analysis, adopted by 72% of the companies
• Infra-red thermography, adopted by 65% of the companies
Here some monitoring techniques are described, i.e. some of the most adopted as shown by [10]. As a general remark, before going ahead, it is worth saying that two stages are mainly covered by the solutions available on the market: data acquisition and data processing with analysis tools [1].

Vibration monitoring. This is a technique that can be used for electromechanical systems such as pumps, fans and rotating machinery in general, both in continuous processes and in manufacturing systems, and it is a primary predictive maintenance tool. A machine is subjected to several sources of vibration, and so it has a composite vibration profile. This profile can be acquired by accelerometers and analysed in the frequency domain thanks to the Fast Fourier Transform (FFT), in order to separate the different sources of vibration and focus on the abnormal component indicating anomalous behaviour of the equipment (e.g. deriving from worn bearings). According to aspects such as working temperature, presence of electromagnetic fields, signal quality and vibration frequency band, different sensors might be chosen in order to implement a vibration monitoring program [11]; for this reason, sensor producers often offer a wide range of solutions.
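As a purely illustrative complement (not part of the surveyed solutions), the following NumPy sketch shows the kind of FFT-based processing described above: it computes a single-sided amplitude spectrum of a simulated accelerometer signal and flags components exceeding an assumed baseline threshold; the sampling rate, frequencies and threshold are arbitrary assumptions.

# Illustrative NumPy sketch of FFT-based vibration analysis: single-sided amplitude
# spectrum of a simulated accelerometer signal, flagging components above an assumed
# baseline threshold. Sampling rate, frequencies and threshold are arbitrary choices.
import numpy as np

fs = 5000.0                                   # sampling frequency [Hz]
t = np.arange(0, 1.0, 1.0 / fs)               # 1 s of signal
signal = np.sin(2 * np.pi * 50 * t)           # shaft rotation component at 50 Hz
signal += 0.4 * np.sin(2 * np.pi * 157 * t)   # hypothetical fault component at 157 Hz
signal += 0.05 * np.random.randn(t.size)      # measurement noise

amplitude = np.abs(np.fft.rfft(signal)) * 2.0 / t.size     # single-sided amplitude
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)

baseline = 0.2                                              # assumed "normal" level
abnormal = freqs[(amplitude > baseline) & (freqs > 60.0)]   # ignore the 50 Hz fundamental
print("Components above baseline [Hz]:", abnormal)          # ~157 Hz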
Oil analysis. Three oil analysis techniques are often used in condition-based maintenance: i) lubricating oil analysis, ii) wear particle analysis and iii) ferrography (for further details see [12]). These techniques are relatively slow and expensive because the analysis requires the use of laboratory facilities such as a spectrometer and a scanning electron microscope. In lubricating oil analysis, samples of lubricating, hydraulic, and dielectric oils are analysed at regular intervals to determine if they can still meet the lubricating requirements of their application. Lubricating oil analysis involves the use of spectrographic techniques to analyse the elements contained in the oil sample. However, it must be supplemented with other diagnostic procedures in order to identify the specific failure mode which may have caused the observed degradation of the oil condition. The limitations of oil analysis in a condition-based maintenance programme are: high equipment costs, being a laboratory-based procedure, reliance on the acquisition of accurate oil samples, and the skills needed for proper interpretation of data. However, in recent years some solutions for oil analysis have become available at a lower cost: portable devices enabling oil analysis through visual analysis are, in fact, available nowadays, e.g. online oil sensors [13].

Thermography. Thermal non-destructive methods involve the measurement or mapping of surface temperatures as heat flows to, from and/or through an object; by detecting thermal anomalies, incipient problems can be located and defined. Only emitted energy is important to predict and prevent failures, so the other energy forms (reflected and transmitted) must be filtered out in order to have a good analysis. Infrared imaging is the most used technique to capture thermal data, because it can provide it faster than others (such as line scanners or infrared thermometers) can. Thermography is a relatively inexpensive technique and has a wide application range, so its use is very frequent in CBM programs (e.g. see [14]). Thermal cameras can be used for acquiring images that can be analysed either manually or automatically (thanks to supporting software). Thermal imaging is a fast and cost-effective way to perform detailed thermal analysis.

Temperature/pressure monitoring. In order to measure the parameters that indicate the actual operating conditions of plant systems, sensors measuring temperature, pressure and other quantities can be used. They aim at defining the conditions at which the output of the production process is obtained, and they are usually used in process control and automation: temperature and pressure sensors can be part of feedback control systems, which aim at maintaining constant conditions in a production process. They can also be used as part of a CBM program: the measurement of process parameters can be introduced in a CBM program with the purpose of monitoring the production capacity in a plant and discovering process inefficiencies [15]. In this case, instrumentation must be installed to measure the parameters that indicate the actual operating condition of plant systems; the so-obtained data can be periodically recorded and analysed.
As for acceleration sensors, there is a wide range of possible solutions according to measurement range, response time, environmental conditions, accuracy, etc. [16].
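A small, hypothetical sketch of process-parameter monitoring follows: alarm limits depend on the operating condition, so the same reading can be acceptable at full load and abnormal when idle (a situation discussed again in paragraph 5 with the lube-oil pressure example). All parameter names and limit values are invented for illustration.

# Hypothetical sketch of process-parameter monitoring with operating-condition-dependent
# alarm limits: the same lube-oil pressure reading can be normal when idle and abnormal
# at full load. Parameter names and limit values are invented for illustration.

LIMITS = {                                   # (low, high) limits per operating condition
    "idle":      {"lube_oil_pressure_bar": (1.0, 2.5)},
    "full_load": {"lube_oil_pressure_bar": (3.0, 5.5)},
}

def check(parameter, value, operating_condition):
    low, high = LIMITS[operating_condition][parameter]
    return "OK" if low <= value <= high else "ALARM"

print(check("lube_oil_pressure_bar", 2.0, "idle"))        # OK
print(check("lube_oil_pressure_bar", 2.0, "full_load"))   # ALARM: too low at full load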
3 DIAGNOSIS ONTOLOGY
In recent years a number of ontologies have been constructed and applied in selected domains. Here we avoid speaking about ontologies for software interoperability, which are not relevant for the present discussion. In medicine and biology, instead, Semantic Web technologies and web mining have been exploited in new intelligent applications. However, these disciplines are generally influenced by government support and are not as commercially fragmented as the manufacturing and process industries. Creating an industry-wide standard in a fragmented field is a task that must be handled carefully. There are some important standards for industries in the maintenance context, but there are also smaller standards, and many companies use their own internal terminologies for particular areas. This does not help to establish a common terminology nor a common structure of concepts. To this end, the research approach followed in this paper keeps the analysis of industrial standards at a basic level. Then the development of an ontology can be clarified. Ontology can be defined as: a taxonomy of concepts [17], a list of constraints and interrelations among the concepts [18], or a hierarchically ordered collection of classes, instances, relations, functions and axioms [19]. In the scope of the present paper the last definition will be adopted. A list of classes will be presented to describe the "environment" where the machine operates, see Figure 1. These classes will be linked to each other through the establishment of relations. These relations will allow classifying the machine environment in an ordered way. Classes describing machine components will be included. Once machine components and operating environment are established, these can be matched with the monitoring technique to adopt. This last issue will be presented in a further development of the present work, while in this paper an ontology supporting the description of the "environment" is presented.
Figure 1. Processes in defining the classes of a diagnosis system.

In Figure 1 an example of classification through classes is presented. Machines can be classified according to, e.g., their failure mode or type of inspection interval. The inspection interval, for instance, can be determined by rules concerning safety. Machines can be classified according to these sets. More precisely, the item that can be classified is one component of a machine. How to manage the composition of components into a machine is explicitly described by ISO 15926 [2]. This furthermore underlines the importance and role of standards in approaching this kind of work.
4 STANDARDS AND REFERENCES
Standards are of great importance in building the diagnosis ontology, keeping in mind that the diagnosis ontology is defined in order to make it easy to define automatic diagnosis systems and also to keep them up to date. Here the proper use of standards is the main technology that decreases the amount of definition work dramatically and guarantees that the diagnosis system can be kept alive and working for a long period even if there are changes in the operating system and programming environment. The paper provides a brief overview of the current status of standards that support the establishment of a CBM system and thus are useful to build the diagnosis ontology that will be presented in paragraph 5. The standards analysed here are designed for different purposes. The benefit of the ontology described in this paper will be to utilise part of the standards in order to describe the context where a machine works. The rationale behind these standards is that when developers of CBM systems start to follow standards and standardization proposals, it is easier to direct development towards new innovative ways of predicting remaining useful life. The CBM community would achieve interchangeable hardware and software components, more technological choices for users, more rapid development of technology, reduced prices and improved ease of upgrading of system components. Interface standards that enhance the ability to integrate different vendor products in a system have several positive effects: overall system costs will be reduced, performance will be optimized, and the ability to implement new CBM approaches will be enhanced by the adoption of standards [20]. In the scope of the present work these standards must be considered to ensure that the result of this work can be adopted in an environment where these standards are applied, through the adoption of a common terminology. However, one main benefit of this approach is to synthesize the available material concerning CBM and try to integrate it with the material available in the literature.

OSA-CBM. OSA-CBM is an abbreviation for Open System Architecture for CBM. As declared by the OSA-CBM organization, the standard proposal shall cover the whole range of functions of a CBM system, for both hardware and software components. The proposed OSA-CBM standard divides a CBM system into different processes: Data Acquisition, Data Manipulation, State Detection, Health Assessment, Prognostic Assessment and Advisory Generation. In the scope of the present work, the OSA-CBM documentation does not provide information to build the diagnosis ontology describing the industrial environment condition. However, OSA-CBM must be kept in mind since it represents a reference for the above-mentioned processes.

MIMOSA. The Machinery Information Management Open System Alliance, MIMOSA, was founded in 1994 and introduced in the September 1995 issue of Sound and Vibration. The purpose and goal of MIMOSA is to develop open conventions for information exchange between plant and machinery maintenance information systems. The development of MIMOSA CRIS
(Common Relational Information Schema) has been openly published on their website (www.mimosa.org). The CRIS provides coverage of the information (data) that will be managed within a CBM system. This is done by a relational database schema for machinery maintenance information. The typical information that needs to be handled is: a) a list of specific assets being tracked, b) a description of system functions, failure modes, and failure mode effects, c) a description of the monitoring system and characteristics of the monitoring components, d) a record of alarm limits and triggered alarms, e) resources describing degradation in a system as well as prognostics of system health trends, f) a record of recommended actions, g) a complete record of work requests. The adoption of the CRIS specification in the ontology makes it possible to consider the above-listed information as already enumerated and defined. MIMOSA in particular defines the terminology to adopt and the way the data can then be managed, also in the practical implementation of diagnosis systems, which is however out of the scope of this paper. It can be concluded that MIMOSA as such is of the highest importance, as it defines the format and relations for most of the data needed in a system to diagnose the condition of rotating machinery.

ISO 15926. ISO 15926 [2] is a standard for integrating life-cycle data across phases (e.g. concept, design, construction, operation, decommissioning) and across disciplines (e.g. geology, reservoir, process, automation). It consists of 7 parts, of which parts 2 and 4 are the most relevant for the present work. In fact, part 2 specifies a meta-model or top-level ontology [3] for defining application-specific terminologies. Part 2 includes about 200 entities. It is intended to provide the basic types necessary for defining any kind of industrial data. Part 4 of ISO 15926 includes application- or discipline-specific terminologies, and it is usually referred to as the Reference Data Library. These terminologies are instances of the data types from part 2. Part 4 contains around 50,000 general concepts. Standards for geometry and topology (Part 3), procedures for adding and maintaining reference data (Parts 5 and 6), and methods for integrating distributed systems (Part 7) are the other parts of the norm, out of the scope of this work. The scope of the information model (Part 2) is to provide: generic concepts associated with set theory and functions, concepts and relationships that describe changes to physical objects during time periods, and generic relationships relevant to engineering such as connection, composition, containment and involvement (in an activity). ISO 15926 defines a format for the representation of information about a process plant. The basis for ISO 15926 is a record of: a) the physical objects that exist within a process plant, b) identifications of the physical objects, c) properties of the physical objects, d) classifications of the physical objects, e) how the physical objects are assembled, f) how the physical objects are connected. ISO 15926 does not attempt to standardize all these classes, but instead provides a small set of basic engineering classes which can be specialized by reference to a dictionary. The reference is made by instantiating a proxy for the class defined in a dictionary and by associating information with this proxy, such as:
• the name of the source dictionary, which defines a namespace for the identification of the class;
• the identifier of the class within the source dictionary.
ISO 15926 does not only record the process plant as it exists at an instant, but also how the process plant changes as a result of maintenance and refurbishment activities, as well as the requirements for a process plant and the design for a process plant, which may not directly correspond to the process plant as it exists. The reference data library (Part 4), instead, contains a dictionary of basic classes and properties used within the process industries. The dictionary specializes the generic concepts within the information model. The first release of the reference data library contains about 10,000 classes and properties. It is intended that the reference data library will be subject to continual revision and extension as an ISO register. ISO 15926 part 4 standardizes an initial set of a few thousand generic classes. In the scope of the present work, from the above description, partially coming directly from the ISO documentation, it seems that ISO 15926 is almost ready to describe all industrial systems and machines as an aggregation of classes. However, the way ISO 15926 can be accessed through the web (www.15926.org) is not so user-friendly, and some difficulties are present when surfing the website and trying to find specific classes defined by the standard. From a theoretical point of view it can be concluded that our ontology can rely on this standard as far as the description of components is concerned.

ISO 17359. This International Standard sets out guidelines for the general procedures to be considered when setting up a condition monitoring program for machines and includes references to associated standards required in this process. It is applicable to all machines. ISO 17359 [21] also provides a condition monitoring procedure flow chart, where the procedures from selecting the equipment to monitor to determining the required maintenance actions are enumerated, going through processes that fit the OSA-CBM layers. In the scope of the present work, the part where measurement locations are discussed seems very useful. Measurement locations should be chosen to give the best possibility of fault detection. Measurement points should be identified uniquely; the use of a permanent label or identification mark is recommended. Factors to take into consideration are:
• safety,
• high sensitivity to change in fault condition,
• reduced sensitivity to other influences,
• repeatability of measurements,
• attenuation or loss of signal,
• accessibility,
• environment, and
• costs.
It can be concluded that for vibration condition monitoring, information on measurement locations is contained in ISO 13373-1, and for tribology-based condition monitoring, information on measurement locations is contained in ISO 14830-1; these references are provided within the norm ISO 17359. ISO 17359 also provides an informative table where examples of condition monitoring parameters to monitor according to machine type are presented. This is similar to what the ontology presented in this paper aims at reaching, namely the matching between machines and proper diagnostic techniques.

ISO 13373-1. This International Standard [22] is dedicated to condition monitoring. It describes in detail transducers and measurement locations, according to the different problems to diagnose. A vibration condition monitoring flowchart is provided within the documentation of the norm. This flowchart, however, differs from the one provided by ISO 17359 since it is very specific to the practical application of vibration measurement sensors. Without discussing the measurement locations, it seems that only the first two steps of that flowchart are worth noticing in this context. In the scope of the ontology presented here, instead, Table A.1 of the documentation of the standard completely fits the needs of the ontology that this work aims at defining. In particular, within that table, types and locations of measurement are provided as a function of the machines and parameters to monitor. To this end it is envisioned that the machines enumerated there are to be included in the ontology as types of machines on which vibration analysis can be performed. Annex C of the standard helps to evaluate which causes can be derived from the symptoms identified through vibration analysis on different equipment. Basic but useful information is provided concerning shafts, gears and bearings. The most common causes of turbomachinery torsional vibration and the resulting vibration characteristics are also provided; they mainly refer to electrical problems. It can be concluded that ISO 17359 provides suggestions coming from the MIMOSA standard for what concerns the definition of measurement locations; this is reported in Annex D of the ISO.

ISO 13379. This standard [23] provides the generic steps of the diagnostics study, which include:
a. analyse the machine availability, maintainability and criticality with respect to the whole process;
b. list the major components and their functions;
c. analyse the failure modes and their causes as component faults;
d. express the criticality, taking into account the gravity (safety, availability, maintenance costs, production quality) and the occurrence;
e. decide accordingly which faults should be covered by diagnostics ("diagnosable");
f. analyse under which operating conditions the different faults can be best observed and define reference conditions;
g. express the symptoms that can serve in assessing the condition of the machine, and that will be used for diagnostics;
h. list the descriptors that will be used to evaluate (recognize) the different symptoms;
i. identify the necessary measurements and transducers from which the descriptors will be derived or computed.
According to the research scope, this paper aims at covering some of the points mentioned above. Point "b" is partially covered by the ontology that will be proposed here, obviously when it is customized to the specific industrial case. Point "c" is only partially covered, since in the ontology the failure modes and causes as component faults are enumerated, namely reserving room in the ontology for their definition. Point "e" is covered in the sense that monitoring techniques to make the fault diagnosable are suggested. Point "f" is satisfied through the definition of the operating condition of the machine. Norm ISO 13379 interestingly provides a definition of diagnostic methods, which are there classified into two main approaches:
• Numerical methods (neural network, pattern recognition, statistical, histographic Pareto approach, or other numerical approaches). These methods are generally automatic and do not need deep knowledge of the mechanism of fault initiation and propagation, but they require a learning period with a large set of observed fault data.
• Knowledge-based methods, which rely on the use of fault models, correct behaviour models or case descriptions. This partially fits Jardine (2006)'s definition, where diagnostic methods are classified as:
  o Statistical approach,
  o Artificial Intelligence (AI) approach,
  o Physical model based approach.
These features are not included in the ontology presented here, which aims at clarifying concepts in the scope of classifying the machine environment for diagnostic purposes. However, this information will be kept in mind for the further development of this research.

ISO 18436-2. This standard [24] is not useful in the scope of this paper, namely in the definition of the ontology. The authors would, however, like to point out that this norm helps to define the amount of time required to train personnel in monitoring techniques. It thus seems that this standard is relevant in practice, above all in an initial phase of implementation of a CBM approach.

ISO 13380. This norm [25] provides an important point of view on machine operating condition, namely the norm states that measurements of different parameters should be taken wherever possible at the same time, or under the same operating conditions. For variable duty or variable speed machines, it may be possible to achieve similar measurement conditions by varying speed, load or some other control parameters. Monitoring should be performed where possible when the machine has reached a predetermined set of operating conditions (e.g. normal operating temperature) or, for transients, a predetermined start and finish condition and operating profile (e.g. coast down). These are also conditions which may be used for a specific machine configuration to establish baselines. Subsequent measurements are compared to the baseline values to detect changes. The trending of measurements is useful in highlighting the development of faults. This is important information to take into account when evaluating which diagnostic techniques must be selected. The norm is quite restrictive in this sense, since, following its instructions, it seems that no other diagnostic approaches are possible if the conditions are not stable as explained above. However, this is not the case in all situations, and thus the case mentioned above represents only one of the possible industrial environments and conditions in which a machine is called to operate. It can be concluded that Annex A and Annex C of this norm contain examples similar to those of ISO 13373-1. These examples can be taken into account to feed the ontology at its early stage. In particular, Annex C provides faults and symptoms or parameter changes for several types of machines, such as industrial gas turbines, pumps, compressors, electric generators and fans.
5 PROPOSAL OF THE DIAGNOSIS ONTOLOGY
Standards must be integrated with other references in order to approach this kind of problem in a way that also considers very practical issues. Therefore some other references were consulted, mainly coming from industrial presentations. It must be clear to the reader that this part of the paper does not aim at being a complete literature review. On the other hand, it aims at presenting an ontology that will of course be influenced by the analysed references. Information coming from the above-described i) monitoring techniques, ii) standards, and iii) review of practical cases is synthesized in the following picture (Figure 2), describing the ontology as a result of the analysis presented in this paper. The ontology presents the classes that classify the environment, the components and the related faults, according to some rules. The rules come from the review presented in the paper. Figure 2 shows the classes, while the model developed in Protégé is also able to provide relations between the classes in order to check consistency in the classification. The classes are then presented. In [26] the authors point out that inspection for CBM can be performed based on a plan or upon a call due to an emergency situation. Moreover, in that reference thermography is indicated as a suitable method when the availability of the equipment during the inspection must be guaranteed. It is then important to consider whether the equipment to be monitored must be maintained through regular inspections required by laws and regulations. This led to defining the class "Inspection interval", in order to consider this aspect. In [27], the authors point out that the different failure modes experienced by rotating machinery result from random shocks to the system or deterioration mechanisms such as wear and fatigue. Component deterioration rates depend on different factors like operational loading, quality of maintenance and other external effects. It is possible to state that components are generally
designed to operate satisfactorily, even if stressed, as long as they are under the forecasted operating conditions. This reference led to defining the class describing the capability to handle stress, which is called "Attitude to handle stress" in the ontology. Large variations of the failure rate, with increased rates of failure, may result as a consequence of other factors such as reduced safety margins, poor quality of maintenance, hostile operating conditions or extreme environmental conditions. These increased rates may develop as a function of time due to component degradation. From these considerations arises the need to consider the operating condition and the type of design in the analysis.
Figure 2. Ontology describing the "environment" where the machine operates.

Here a few examples from the analysed documents are presented. In [28] it is stated that complete monitoring techniques, like vibration monitoring and thermography analysis, require special tools and skilled personnel. Moreover, a critical point is that the analysis of the signals, when performed manually, requires a certain time frame. Reference [28] points out that when there is physical separation between the monitored equipment and the analysis centre, the situation could be critical. This driver must then be taken into account. This reference led to defining the class describing the skills of the operators using the machines; it is "Driver of the machine". Then, the CBM system can associate multiple metered equipment values to calculate an overall equipment health factor. For instance, a feed water pump might require the lube-oil pressure to be at a different reading when it is running idle than when it is at full capacity (www.matrikon.com). Consequently, placing high and low alarm points on the lube-oil pressure may not be accurate at different operating conditions. Process analysis software determines the relationship between the various metered equipment values and their interactions. A normal relationship between the values indicates a good health factor. Likewise, an abnormal relationship indicates a poor health factor, and is a warning that maintenance is required. Furthermore, it is also pointed out that it is important to understand the behaviour of the failure mode; this is done by identifying the corresponding class within the macro-class "Failure mode". The class Component contains subclasses defining all the possible components of a production plant. This class is not detailed by our ontology, since the information to build it is contained in ISO 15926. The definition of the "Failure rate value" comes from the traditional statistical maintenance approach where the failure behaviour is modelled through an exponential function describing a constant failure rate. Conversely, with a Weibull function an increasing (or possibly decreasing) failure rate can be modelled. In our ontology the case of a failure rate that is a function of the type of production campaign is also considered. The "Industrial situation" class deals with the classification of the operating environment and the managerial situation: maintenance can be done properly or poor maintenance can be performed (e.g. in locations where the maintenance culture is not high).
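The ontology itself was modelled in Protégé; purely as an illustration of how the environment-describing classes listed above could be used to classify a machine component, the following sketch encodes them as a plain Python structure. The member values of each class are assumptions made for the example, not taken from the actual model.

# Illustrative sketch only: the actual ontology was modelled in Protégé. Here the
# environment-describing classes named in the text are encoded as a plain Python
# structure; the member values of each class are assumptions made for the example.

ENVIRONMENT_CLASSES = {
    "inspection_interval":       ["required_by_law", "plan_based", "on_emergency_call"],
    "attitude_to_handle_stress": ["within_design_conditions", "beyond_design_conditions"],
    "driver_of_the_machine":     ["skilled_operator", "unskilled_operator", "remote_analysis_centre"],
    "failure_mode":              ["random_shock", "wear", "fatigue"],
    "failure_rate_value":        ["constant", "increasing", "campaign_dependent"],
    "industrial_situation":      ["proper_maintenance", "poor_maintenance"],
}

def classify(component, **attributes):
    """Check that each attribute value is a member of the corresponding class."""
    for cls, value in attributes.items():
        assert value in ENVIRONMENT_CLASSES[cls], f"{value} is not a member of {cls}"
    return {"component": component, **attributes}

print(classify("feed_water_pump",
               inspection_interval="plan_based",
               failure_mode="wear",
               industrial_situation="proper_maintenance"))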
6 CONCLUSION
This paper presented an ontology in order to synthesize part of the material available in the literature and, above all, the standards related to CBM. The focus of the ontology is on describing the "environment" where a machine is called to
operate, as well as describing the machine itself through the listing of its components. It is envisioned that the ontology represents a good synthesis of the standards and can be adopted to further develop other tools for machine diagnosis. In particular, some co-authors of this paper are carrying out a project related to the development of a tool aimed at making the definition of an automatic diagnostic system a semi-automatic process. In that project the ontology presented here will be adopted as a basis to build a decision-making tool that can decide which diagnostic technique is the most suitable approach in a certain industrial environment and situation. The present work shows how standards must be adopted in order to classify the "environment" where a machine is specified to work. This helps to answer the question "What", concerning the maintenance action to perform on the machine, as stated at the beginning of the paper. The analysis presented here should help, in the scope of machine diagnosis, to improve the maintenance services related to CBM. It is envisioned that, through the better development of the CBM approach, maintenance actions will be more effective and will also allow keeping machines in better working condition, saving resources through reduced energy and spare parts consumption.
7 REFERENCES
1 MIMOSA, OSA-CBM Primer, August 2006, www.mimosa.org
2 ISO 15926, www.iso.org
3 Batres R., West M., Leal D., Price D., Katsube M., Shimada Y., Fuchino T. (2007) An upper ontology based on ISO 15926, Computers & Chemical Engineering, 31(5-6), 519-534.
4 Brundtland Commission (1987) Our Common Future, Report of the World Commission on Environment and Development, published as Annex to General Assembly document A/42/427, Development and International Co-operation: Environment, August 2.
5 Grall A., Dieulle L., Berenguer C., Roussignol M. (2002) Continuous time predictive maintenance scheduling for a deteriorating system, IEEE Transactions on Reliability, 51(2), 141-150.
6 Chen D., Trivedi K.S. (2002) Closed-form analytical results for condition-based maintenance, Reliability Engineering and System Safety, 76(1), 43-51.
7 Marseguerra M., Zio E., Podofillini L. (2002) Condition based maintenance optimization by means of genetic algorithms and Monte Carlo simulation, Reliability Engineering and System Safety, 77(2), 151-165.
8 Jamali M.A., Ait-Kadi D., Cléroux R., Artiba A. (2005) Joint optimal periodic and conditional maintenance strategy, Journal of Quality in Maintenance Engineering, 11(2), 107-114.
9 Rosqvist T., Laakso K., Reunanen M. (2009) Value-driven maintenance planning for a production plant, Reliability Engineering and System Safety, 94, 97-110.
10 Higgs A., Parkin R., Jackson M., Al-Habaibeh A., Zorriassatine F., Coy J. (2004) A survey on condition monitoring systems in industry, Proceedings of ESDA 2004, 7th Biennial ASME Conference on Engineering Systems Design and Analysis, July 19-22, Manchester, UK.
11 www.skf.com
12 Nunnari J.J., Dalley R.J. (1991) An overview of ferrography and its use in maintenance, Tappi Journal, 74(8), 85-94.
13 Jantunen E., Adgar A., Arnaiz A. (2008) Actors and roles in e-maintenance, Proceedings of the 5th International Conference on Condition Monitoring and Machine Failure Prevention Technologies.
14 Ierace S., Carminati V. (2007) Application of thermography to Condition Based Maintenance: a case study in a manufacturing company, Proceedings of the 3rd International Conference on Maintenance and Facility Management, Roma, Italy, September 27-28, 159-166.
15 Mobley R.K. (2002) An introduction to predictive maintenance, Butterworth-Heinemann.
16 Garetti M., Taisch M. (1997) Automatic production systems (in Italian, original title: Sistemi di produzione automatizzati), 2nd ed., CUSL, Milan.
17 Alberts (1994) Ymir: A sharable ontology for the formal representation of engineering design knowledge, Design Methods for CAD, http://citeseer.ist.psu.edu/alberts94ymir.html, 3-32.
18 Storey V.C., Ullrich H., Sundaresan S. (1997) An ontology for database design automation, Proceedings of the 16th International Conference on Conceptual Modelling, 2-15.
19 Visser P.R.S., Bench-Capon T.J.M. (1998) A comparison of four ontologies for the design of legal knowledge systems, Artificial Intelligence and Law, 6(1), 25-57.
20 Tassey G. (1992) Technology Infrastructure and Competitive Position, Kluwer, Norwell, MA.
21 ISO 17359, www.iso.org
22 ISO 13373-1, www.iso.org
23 ISO 13379, www.iso.org
24 ISO 18436-2, www.iso.org
25 ISO 13380, www.iso.org
26 Thermography analysis on gas turbine systems, Inprotec MCM, Milano, April 2007.
27 Moss T.R., Andrews J.D. (1996) Factors influencing rotating machinery reliability, Proceedings of the European Safety and Reliability Data Association Conference, 10th ESReDA, Chamonix, France, 149-171.
28 Neapolitan Engineering Association, Navy Commission, Remote monitoring of ships and maintenance strategies, Eng. Fabio Spetrini.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CONDITION MONITORING FOR THE IMPROVEMENT OF DATA CENTER MANAGEMENT ORIENTATED TO THE GREEN ICT
J. F. Gómez Fernández a, F.J. Álvarez Padilla a, L. Fumagalli b, V. González Díaz a, M. Macchi b, A. Crespo Márquez a
a Department of Industrial Management, School of Engineering, University of Seville, Camino de los Descubrimientos s/n, 41092 Seville, Spain
b Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Piazza Leonardo da Vinci 23, 20133 Milano
Nowadays society is worried about environmental conservation. Therefore, it is crucial to take advantage of Information and Communication Technologies (ICT) and maintenance knowledge to guide the evolution of maintenance management toward energy efficiency. Much research focuses on the savings concerning steam consumption or oil consumption in huge plants of the process industries. Nevertheless, even if in that kind of environment the savings are large, in other industrial contexts the savings are related to other kinds of equipment, not concerned with steam production, oil consumption, movement of pressurised air, etc. In particular, looking at the exponential progress of Information and Communication Technologies, it is worth noticing that, unfortunately, the power consumption of electronic devices has increased as well. Even if more efficient electronic devices are available, they are used in larger numbers than in the past, and so the total energy consumption related to this equipment has increased. To improve the energy efficiency of industrial electronic devices it is necessary to monitor not only the main equipment under analysis, but also the entire infrastructure that surrounds and supports the main equipment. In an industrial environment, one of the main centres of power consumption can be the Data Centres (DC) for data elaboration. To this end, some standards and best practices on power consumption have recently been established. Systems such as SCADA and different sensors can provide information to optimize maintenance, avoiding short circuits and reducing consumption [1], [2]. This paper presents an overview of these standards and best practices and then proposes a practical case to demonstrate the possibility of saving energy in this application. The case represents an action research activity. In the action research, first of all quick wins were achieved, based on energy savings in the infrastructure that surrounds and supports the main equipment. This represents a first phase of improvement of the system. A second phase of improvement is then represented by the implementation of a SCADA system to monitor the status of the equipment in the data center.

Key Words: Green ICT, Data Centers, Energy efficiency, SCADA system
1 BACKGROUND
In the ICT (Information and Communication Technology) field, recent studies [3] [4] have demonstrated the important increase of energy consumption due to the use of ICT. The development of ICT equipment and infrastructure follows a quick evolution, increasingly facilitating the management and storage of information with large capacity and speed [5], and also reducing their cost [6]. That is to say, companies are able to employ ICT in more complex tasks with a smaller budget. So the number of servers increases exponentially; in particular, if the level of power per server remains constant, power consumption should increase by about 40% by 2010 [3]. Then, although the cost of ICT devices decreases, the cost of power consumption will increase, not only in the servers themselves, but also in the facilities that maintain them in adequate conditions of operation, such as air conditioning, fire control,
generators, lights, access control, etc. Simultaneously, the cost of electricity generation is increasing: this situation causes an important increase in the costs of operation of ICT in general, and of servers in particular. The U.S. Environmental Protection Agency (EPA) reports that data centres consume about 1.5% of all electric energy consumed in the U.S. [4]; besides, the contribution of data centres to CO2 emissions is significant too. The following picture (Figure 1) provides information about the prediction of future energy consumption for DCs.
Figure 1. Prediction of energy consumption in the future [4].

Data Centers (DC) can be considered good examples to discuss, since they are typical ICT equipment related to the industrial sector. In the scope of this paper, the following definition is considered: a Data Centre is a restricted area containing ICT devices as a centralized repository for the storage and management of data and information. The DC includes the building that contains the main devices, the facilities, the area where the servers are installed, the communications equipment, the air conditioning, the electricity distribution, access control, fire detection and extinction, and the SCADA system for maintenance and safety. In other words, the equipment in a DC can be classified into two groups [7]:
• IT equipment: servers, telecommunication equipment, storage, etc.
• Auxiliary infrastructure and facilities (mechanical and electrical systems): transformer, engine generator plant, UPS (Uninterruptible Power Supply, to maintain power in the event of a power outage), battery, Power Distribution Unit, CRACs (Computer Room Air Conditioners), Computer Room Air Handlers, direct expansion coolers, etc.
The efficiency analysis of a DC has to be carried out from a holistic point of view; thus, it is necessary to manage the DC as an entity composed of different elements and the relations among them. According to McKinsey & Company [8], there are currently four main causes of power inefficiency in a DC: i) poor design and planning of equipment, including power and cooling equipment; ii) poor management of IT equipment capacity; iii) the absence of supervision during the implementation and expansion of the DC; iv) outdated designs and technologies from the point of view of power consumption. Moreover, the power consumption of a DC represents 25% of the total cost of an IT company (see Fig. 2) [8].
Figure 2. Costs in a typical IT company [8].
Figure 3. Consumption in the DC of an IT company [9].
Focusing more in depth on the DC, The Green Grid indicates that the power consumed by IT equipment is about 30% of the total consumed by a DC, and that 67% is converted into heat inside the DC (The Green Grid, 2007) (Fig. 3).
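One widely used indicator promoted by The Green Grid is the Power Usage Effectiveness (PUE), the ratio of total facility power to IT equipment power; the following sketch computes it under the assumption of an illustrative total load and the roughly 30% IT share reported above.

# Sketch: Power Usage Effectiveness (PUE), one of the indicators promoted by The Green
# Grid, is the ratio of total facility power to IT equipment power. The figures below
# are illustrative and use the roughly 30% IT share reported in the text.

def pue(total_facility_power_kw, it_equipment_power_kw):
    return total_facility_power_kw / it_equipment_power_kw

total_kw = 1000.0            # assumed total data-centre load
it_kw = 0.30 * total_kw      # IT equipment share of about 30%
print(f"PUE = {pue(total_kw, it_kw):.2f}")   # -> PUE = 3.33 (1.0 would be ideal)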
2 STANDARDS AND BEST PRACTICES IN DATA CENTER MANAGEMENT
Actions to improve power efficiency have to be implemented both in the design phase and in the installation phase. The operation and maintenance phase of the DC is important as well and needs the application of known best practices and recommendations. Currently, it is possible to find norms and practices applicable to each part of the DC: building, ICT, UPS, air conditioning and security. There are several organizations producing specific norms about DCs. An example is represented by the recommendations of ASHRAE (American Society of Heating, Refrigerating, and Air Conditioning Engineers), which are a set of seven books [2] dedicated to different aspects of the design and management of a DC: (i) Thermal Guidelines for Data Processing Environments (2004); (ii) Datacom Equipment Power Trends and Cooling Applications (2005); (iii) Design Considerations for Datacom Equipment Centers (2005); (iv) Liquid Cooling Guidelines for Datacom Equipment Centers (2006); (v) Best Practices for Datacom Facility Energy Efficiency (2008); (vi) Structural and Vibration Guidelines for Datacom Equipment Centers (2008); (vii) High Density Data Centers - Case Studies and Best Practices (2008). TIA (Telecommunications Industries Association), founded in 1988, is a trade association representing the global information and communications technology (ICT) industries in the US, and it is composed of more than 600 members. In April 2005 TIA published a norm with the objective of standardizing the DC from an ICT point of view. The norm TIA-942 [10] offers standard specifications for the ICT infrastructure, the use of the available space, equipment distribution, reliability and environmental considerations. This recommendation, ANSI/TIA-942, then presents a classification of DCs into four different levels according to the established and published concepts of the Uptime Institute, where it can be seen that as the TIER [27] level increases, the availability increases too, and consequently the redundant paths and parallel equipment increase at the same time, producing an increase in power consumption. Therefore, it can be stated that TIER levels provide a classification for identifying data centers according to how stringent their infrastructure design topologies are, i.e. according to the partially or fully redundant subsystems, compartmentalized and controlled security zones, as well as environmental considerations. Besides these organizations, the Lawrence Berkeley National Laboratory, managed by the University of California and a member of the national laboratory system supported by the U.S. Department of Energy through its Science Office, has developed its own recommendations and tools, where the possible power savings according to the best practices are also presented in a specific table. This table [11] was made together with the U.S. Department of Energy (EYP Mission Critical Facilities in the Save Energy Now Program) and the expert recommendations of Emerson [12], the Electric Power Research Institute (EPRI) [13], and the Syska Hennessey Group, a leading global consulting, engineering, technology and construction firm [14]. EPRI is an independent, non-profit organization that conducts research and development relating to the generation, delivery and use of electricity; it represents more than 90% of the electricity generated and delivered in the U.S.
Another organization, The Green Grid, has also published documents about efficiency and productivity in DCs and how to achieve them, especially oriented towards the management of indicators. The Green Grid is a non-profit global consortium dedicated to developing and promoting energy efficiency for data centers and information services; Dell, HP, Microsoft, Intel, AMD, APC, Sun, EMC and IBM sit on its board [15]. In Europe, research and recommendations have been based on the references mentioned above and on the work of the Renewable Energies Unit of the Institute for Energy of the European Commission, which published a Code of Conduct on Data Centres Energy Efficiency [7] defining three work groups: i) best practices, ii) energy efficiency metrics and measurements, iii) data collection and analysis. In these three work groups, general principles are proposed, including practical actions aimed at a reasonable use and cost of power without compromising the reliability and continuity of operations in a DC. The activities needed to keep a DC in an energy-efficient state can be grouped into i) mechanical activities, ii) activities related to the IT equipment and iii) activities related to the electric infrastructure and lighting. The mechanical activities include heat management, management of the air flow, management of the air-conditioning systems (fans, fan coils, etc.) and management of the air quality, mainly related to the level of humidity. [16] describes how, as temperature increases, servers become the critical elements, because the power consumed by their CPUs depends on temperature, which therefore becomes a critical design factor. There thus exists an optimal operating temperature for a DC, which depends on its characteristics: IT equipment, cooling system architecture, location, etc. Consequently, heat management is a key aspect of reaching efficiency. The cold and hot air flows require a dynamic study, measuring pressure, humidity and temperature at several points in the DC, not only in the implementation phase but also during operation, because the level of use of the servers changes continuously. The cooling air has to be conducted to the IT equipment with as few losses as possible, avoiding return air flows, and the hot air has to be removed appropriately away from the IT equipment. These two principles have to be fulfilled dynamically, depending on the IT workload and the environmental conditions, in order to reduce power consumption. It is possible to design [17] a DC that is nearly ideal in the management of air flows, almost without negative pressure, air recirculation or bypass flows. Condition monitoring may also help to keep decisions aligned with the real-time condition. The simultaneous use of intelligent sensors and IR (infrared) cameras [18] demonstrated that the cooling system can be optimized over time.
Intelligent sensors are useful for monitoring the environmental variables, saving up to 25% of the power consumed by the cooling equipment [19]. Last but not least, the activities on the IT equipment are dedicated to optimizing productivity. It is known that servers are typically utilized at only 10% to 15% of their information-processing capability. The main issue is therefore to make the equipment work at the optimal level of consumption, using distributed control techniques from both a software and a hardware point of view. Among the best practices in the IT field, it is important to make correct use of the hibernation capability of the hardware. To sum up, according to the above, the following pillars can be defined to reduce power consumption:
• Appropriate design and planning of the auxiliary infrastructure, aligned with suitable IT equipment;
• Flexibility in the use of IT capacity and of the auxiliary infrastructure, adapting them to the real needs of operation;
• Condition monitoring of all the equipment in a DC, in order to conserve and manage it in the most efficient way.
Concerning this last point, several standardization bodies are currently working on norms covering the concepts and measurements involved in managing energy efficiency in DCs.
3
MEASURING EFFICIENCY IN A DATA CENTER
Efficiency is in general related to cost reduction and to the output obtained from the applied resources; the production of a DC is the amount of information processed with respect to the resources consumed. Two organizations have put forward metric proposals in industrial norms: The Green Grid and The Uptime Institute. Indicators of efficiency are thus available for the two types of resources [9]: the auxiliary infrastructure and the IT equipment. This paper focuses on two specific sets of metrics, proposed by The Green Grid and by The Uptime Institute, to improve power efficiency and production in a DC.
3.1 The Green Grid
The Green Grid proposes metrics that relate the IT equipment to the power consumption of the whole DC [7], as follows. Power Use Efficiency (PUE) is the ratio between the total power consumption of a DC and the power destined to the IT equipment:

PUE = Total Facility Power / IT Equipment Power    (1)

Data Centre Infrastructure Efficiency (DCiE) is the inverse of PUE:

DCiE = 1 / PUE = IT Equipment Power / Total Facility Power    (2)

Computer Power Efficiency (CPE) is the relation between the utilization of the IT equipment and PUE:

CPE = IT Equipment Utilization / PUE    (3)
The Green Grid introduces the variable useful work to indicate the result of the activities in a Data Centre; the term utilization is the measurement of the useful work in the IT equipment. This term is not easy to measure, since the operating state of the IT equipment changes continuously and the measurement has to be performed during real operation without disturbing the production processes. [20] proposes to employ in the metrics the variable energy (kWh) instead of power (kW), because the IT equipment consumes more power than is strictly necessary to process information; that is, the IT equipment consumes power even when it is not processing information. According to [20], the measurement therefore has to be time-based, over a defined evaluation period. To this end, the measurements have to be retrieved equipment by equipment, so they cannot be used to compare arbitrary DCs, but only DCs with the same characteristics and services. The Green Grid [21] also indicates a set of eight proxies to measure and compare power efficiency in DCs (Table 1).
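Before turning to the proxies of Table 1, the following minimal sketch illustrates the time-based calculation of the metrics just defined. It is illustrative only: the sampling scheme, variable names and numerical values are assumptions and are not taken from the original study.

```python
def energy_kwh(power_kw_samples, interval_hours):
    """Approximate energy (kWh) by summing power samples taken at a fixed interval."""
    return sum(power_kw_samples) * interval_hours


def green_grid_metrics(facility_kw, it_kw, utilization, interval_hours=0.25):
    """Compute energy-based PUE, DCiE and CPE over one evaluation period.

    facility_kw, it_kw : power readings (kW) for the whole facility and for the
                         IT equipment, sampled every interval_hours.
    utilization        : average IT equipment utilization over the period (0..1).
    """
    e_facility = energy_kwh(facility_kw, interval_hours)  # total facility energy
    e_it = energy_kwh(it_kw, interval_hours)              # IT equipment energy
    pue = e_facility / e_it                               # Eq. (1), on energy instead of power
    dcie = 1.0 / pue                                      # Eq. (2)
    cpe = utilization / pue                               # Eq. (3)
    return pue, dcie, cpe


# Synthetic 15-minute samples over one hour (illustrative values only)
pue, dcie, cpe = green_grid_metrics(
    facility_kw=[420.0, 425.0, 418.0, 422.0],
    it_kw=[250.0, 252.0, 249.0, 251.0],
    utilization=0.12,
)
print(f"PUE={pue:.2f}  DCiE={dcie:.1%}  CPE={cpe:.3f}")
```

Because energy rather than instantaneous power is used, the result reflects the whole evaluation period, in line with the time-based measurement recommended above.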
Table 1 The Green Grid set of eight proxies to measure and compare the power efficiency in DC [21]
1 Useful Work Self-Assessment and Reporting: All applications self-report; the utility function is uniform and equal to one; the user must decide how to normalize each task
2 DCeP Subset by Productivity Link: Sum all outbound bit-streams from the data centre, divide by the energy used by the data centre
3 DCeP Subset by Sample Workload: An instrumented subset of servers is measured running a sample workload; results are scaled to represent the whole data centre
4 Bits per Kilowatt-hour: Sum all outbound bit-streams from the data centre, divide by the energy used by the data centre
5 Weighted CPU Utilization - SPECint_rate: Uses CPU clock speed, SPECint_rate benchmarks, and CPU utilization to determine the amount of work being done
6 Weighted CPU Utilization - SPECpower: Uses CPU clock speed, SPECpower benchmarks, and CPU utilization to determine the amount of work being done [22]
7 Compute Units per Second Trend Curve: Uses a trend curve based on Moore's law, the age of the server, and CPU utilization to determine the work done [6]
8 OS Workload Efficiency: Estimates efficiency by calculating the number of operating system instances per watt
3.2 The Uptime Institute
In the same way, The Uptime Institute [23] recommends a set of metrics based on the consumption of electricity, the heat generated by the functioning of the equipment, the losses produced in power distribution, and the conversion of electricity from alternating current to direct current. The Uptime Institute also uses the term useful work, proposes to measure consumption at full load, and estimates the potential savings obtained by using the hibernation mode whenever possible. The resulting indicators are listed in Table 2.
Table 2 General indicators of the Uptime Institute
1 Data Centre Consumption (kWAC): The total power consumption of the data centre as measured "at the meter"
2 Hardware Load at the Plug (kWAC or kWDC): For a single piece of IT equipment, the AC (or DC, in data centres with direct-current power distribution) power consumption measured at the hardware power plug
3 Hardware Compute Load (kWDC): The number of watts of direct-current power consumed by the computing components within the IT equipment
4 Site Infrastructure Power Overhead Multiplier (SI-POM): The amount of power that a data centre's site consumes in overhead (equivalent to the numerator in PUE)
5 IT Hardware Power Overhead Multiplier (H-POM): Tells how much of the power input to a piece of hardware is wasted in power supply conversion losses or diverted to internal fans
6 DC Hardware Compute Load per Unit of Computing Work Done (Qualitative): Describes how power-hungry a particular platform is
7 Deployed Hardware (server and storage equipment) Utilization Ratio (DH-UR): Measures the fraction of deployed servers and storage that is actually running live applications or holding frequently accessed data
8 Deployed Hardware Utilization Efficiency (DH-UE): Quantifies the opportunity for servers and storage to increase their utilization through virtualization
9 Free Cooling (kWh): Estimates the amount of money that could be saved by using cold outside air to cool the computer room directly or indirectly
10 Enable Energy Saving Features (kWh): Estimates the amount of money, energy and carbon that could be saved by letting IT equipment hibernate when it is not being used
As mentioned, these measurements have to be obtained equipment by equipment over a specific period of time. The consumption figures are obtained according to the following scheme, which models the infrastructure of a DC:
Figure 4. Measurement places for the energy consumption monitoring (adapted from [23]).
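As an illustration of the equipment-by-equipment measurement scheme of Table 2 and Figure 4, the sketch below aggregates metered energy per piece of equipment into site consumption, hardware load at the plug and hardware compute load, and then derives overhead ratios. The ratio forms used for SI-POM and H-POM (site consumption over IT load at the plug, and load at the plug over compute load) follow one common reading of the Uptime Institute metrics and should be treated as an assumption here, as should all the figures.

```python
# Hypothetical metered data: equipment id -> (category,
#   energy measured at the plug in kWh, energy used by the computing components in kWh).
# For non-IT infrastructure the compute-load field is simply 0.
meter_readings = {
    "server-01": ("it", 310.0, 265.0),
    "server-02": ("it", 295.0, 250.0),
    "crac-01":   ("infrastructure", 520.0, 0.0),
    "ups-01":    ("infrastructure", 75.0, 0.0),
    "lighting":  ("infrastructure", 12.0, 0.0),
}


def uptime_style_overheads(readings):
    """Aggregate per-equipment readings into site consumption, hardware load at the
    plug and hardware compute load, then derive the overhead multipliers."""
    site_total = sum(plug for _, plug, _ in readings.values())             # data centre consumption
    hw_plug = sum(plug for cat, plug, _ in readings.values() if cat == "it")
    hw_compute = sum(comp for cat, _, comp in readings.values() if cat == "it")
    si_pom = site_total / hw_plug    # site overhead relative to IT load at the plug (assumed ratio)
    h_pom = hw_plug / hw_compute     # plug power relative to compute load (assumed ratio)
    return si_pom, h_pom


si_pom, h_pom = uptime_style_overheads(meter_readings)
print(f"SI-POM={si_pom:.2f}  H-POM={h_pom:.2f}")
```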
4
CONDITION MONITORING IN A DATA CENTER – ACTION RESEARCH
Measuring power consumption is a way to assess the level of greenness of a DC. To this end, a monitoring system has to be implemented to measure the power efficiency metrics while supervising, at the same time, the real state of all the elements in a DC, servers and facilities. An action research study [24], carried out on the DC of a company offering telecommunication services in Spain, is now reported, showing two implementation steps towards building such a monitoring system. The company requires high data-analysis capability as well as a large number of concurrently running ICT services. The company is an SME (small or medium enterprise) whose customers require SLAs (Service Level Agreements) based on the quality of the service delivered to maintain the telecommunication networks. In this action research, the proposal from The Green Grid has been adopted: hence, two efficiency metrics have been monitored, i.e. Power Use Efficiency (PUE) and Data Centre Infrastructure Efficiency (DCiE). Accordingly, two groups of elements have their own monitoring system: the IT equipment and the infrastructure of the DC.
4.1
Initial situation
The DC is located on the ground floor of a two-storey building, mainly dedicated to offices, in an area whose average temperature over the last 6 years has ranged between 10,8°C and 26,8°C. The DC takes up half of the ground floor. The DC is 6 years old and, in June 2008, a revamping required the re-building of the DC infrastructure. The initial characteristics of the yearly power consumption are presented in Table 3.
Table 3 Initial characteristics of power consumption per year
Power consumption in DC (MWh/year): 1.846,00
Number of racks with IT equipment: 42
Technical space (m²): 261,50
IT equipment consumption (MWh/year): 748,00
Lighting consumption (MWh/year): 19,00
Air conditioning equipment consumption (MWh/year): 156,00
PUE: 2,47
DCiE (%): 40,49
Two stages of improvement have been carried out starting from this initial situation: the former aimed at achieving quick wins in energy consumption; the latter was planned to move towards optimisation through condition monitoring.
4.2 The first stage of improvements
While revamping the DC, a series of improvements based on best practices was carried out. Among the best practices applied are the management of the air flow, the optimization of the layout, and other activities related to the electric infrastructure, lighting and IT equipment. Concerning the management of the air flow, the hot and cold air flow lines were isolated from each other by filling the empty spaces between IT equipment, in order to improve the air flow [17]. As a result: (i) the hot air is removed appropriately away from the IT equipment; (ii) thanks to the isolation, losses from the cold air flow are reduced and mixing of cold and hot air is avoided. The improvement concerning lighting was obtained through electronic controls and presence sensors. The main actions were: i) the installation of electronic ballasts in the lighting fixtures; ii) the placement of presence sensors so that the lights are activated only in the rows accessed by people, in addition to the bypass corridor side. Lighting consumption has been reduced by about 34%, thanks to the simultaneous use of presence sensors and electronic ballasts. In addition, some activities have been automated, reducing the DC staff required during operation; this improvement is mainly reflected in the reduction of the number of hours of lighting required. Two further actions were taken, concerning both operations management and real-time control of the room: i) shutting down, on holidays and weekends, the servers whose applications are used exclusively during business hours; ii) keeping the average temperature inside the DC between 23°C and 25°C by using free cooling wherever possible (outdoor temperature at least 7°C lower than the interior); a minimal sketch of this decision rule is given after Table 4. The application of these best practices helped to decrease the energy consumption and to improve the PUE of the DC. At this point, the parameters presented in Table 4 were achieved.
Table 4 Characteristics achieved after the application of the best practices
Power consumption in DC (MWh/year): 3.619,00
Number of racks with IT equipment: 91
Technical space (m²): 482,85
IT equipment consumption (MWh/year): 1.791,60
Lighting consumption (MWh/year): 12,42
Air conditioning equipment consumption (MWh/year): 1.742,60
PUE: 2,02
DCiE (%): 49,50
The PUE of the DC thus improved by 18,03%.
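The operational rules applied in this first stage (weekend and holiday shutdown of business-hour servers, and free cooling within the 23-25°C band) can be summarised in a small decision routine. The sketch below is illustrative only: the setpoints come from the text above, while the scheduling details and the control interface are assumptions.

```python
from datetime import datetime

SETPOINT_LOW_C, SETPOINT_HIGH_C = 23.0, 25.0   # target room temperature band
FREE_COOLING_MARGIN_C = 7.0                     # outdoor must be this much colder than indoors


def cooling_mode(indoor_c, outdoor_c):
    """Choose free cooling when the outdoor air is cold enough, otherwise mechanical cooling."""
    if indoor_c <= SETPOINT_HIGH_C:
        return "hold"                            # already inside the band, no extra cooling
    if outdoor_c <= indoor_c - FREE_COOLING_MARGIN_C:
        return "free-cooling"
    return "mechanical-cooling"


def business_hours_servers_on(now: datetime, holidays=frozenset()):
    """Servers running business-hour-only applications stay off on weekends and holidays."""
    return now.weekday() < 5 and now.date() not in holidays


print(cooling_mode(indoor_c=26.0, outdoor_c=15.0))               # -> free-cooling
print(business_hours_servers_on(datetime(2009, 5, 3, 10, 0)))    # Sunday -> False
```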
4.3 The second stage of improvements
In the second stage of improvements, a SCADA system was first implemented in order to monitor the efficiency of the DC infrastructure. This system is capable of reporting the environmental conditions inside the DC and the consumption in accordance with the measurement scheme for a DC proposed by the Uptime Institute. One of the monitoring actions is to provide regular temperature reports along the rows of racks, in order to improve efficiency and to correct deviations in the heat flows through the racks. The architecture of the SCADA monitoring is shown in Figure 5. This system also allows the use of smart sensors and OPC technologies (i.e., sensors interconnected through standardized buses such as Modbus, Fieldbus, Ethernet, etc., and OPC interfaces, i.e. OLE for Process Control).
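A minimal sketch of the rack-row temperature reporting described above is given below; the sensor-reading function is a hypothetical stub standing in for the actual SCADA/Modbus/OPC interface, which is not specified in the paper, and the readings are synthetic.

```python
import statistics


def read_row_temperatures(row_id):
    """Hypothetical stub: in the real system this would query the SCADA layer
    (e.g. Modbus or OPC tags) for the sensors installed along one row of racks."""
    fake_data = {"row-A": [24.1, 24.6, 27.9, 24.3], "row-B": [23.8, 24.0, 24.2, 23.9]}
    return fake_data[row_id]


def temperature_report(row_ids, high_limit_c=25.0):
    """Flag rack positions whose temperature exceeds the 23-25 degC band,
    so that air-flow problems through the racks can be corrected."""
    report = {}
    for row in row_ids:
        temps = read_row_temperatures(row)
        report[row] = {
            "mean": round(statistics.mean(temps), 1),
            "hot_spots": [i for i, t in enumerate(temps) if t > high_limit_c],
        }
    return report


print(temperature_report(["row-A", "row-B"]))
# {'row-A': {'mean': 25.2, 'hot_spots': [2]}, 'row-B': {'mean': 24.0, 'hot_spots': []}}
```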
Figure 5. Architecture of the SCADA system
Besides the SCADA system, the efficiency of the DC infrastructure has also been monitored according to a prior planning of inspections: IR (infra-red) thermographies for the analysis of the air flow through the rows of racks are carried out by means of walk-around inspections supported by IR cameras. Regarding the functioning of the IT equipment, the monitoring elements are usually based on the SNMP protocol (Simple Network Management Protocol) or on XML (Extensible Markup Language). This monitoring mechanism leverages what the suppliers of IT equipment offer: some suppliers include monitoring systems that manage and control production through APIs (Application Program Interfaces, i.e. sets of routines, protocols and tools) in the IT equipment, and this is the mechanism used to measure the power consumption of this equipment. In this way the servers themselves, through APIs and their own programs, provide data on their consumption and internal temperature; these services are not yet fully available and their implementation is still ongoing in the action research. Summarizing, at the moment the monitoring system is mainly focused on the infrastructure equipment, in order to monitor the room temperature, improve the efficiency of the air-conditioning equipment, control the air flows and improve the efficiency of the UPS systems. The use of the SCADA system has allowed a second stage of improvement: the data reported in Table 5 refer to the mean values of energy consumption and efficiency obtained over the last three months.
Table 5 Mean values obtained in the last three months for energy consumption and efficiency
Average consumption of the DC (kW): 421,36
IT equipment consumption (kW): 250,53
PUE: 1,68
DCiE (%): 59,46
5
CONCLUSIONS
The references recommend monitoring the DC activities in order to analyse and improve its efficiency, where efficiency must take into account both the internal production and environmental conservation. At the same time, to urge the internal
personnel towards continuous improvement, the evolution has to be quantified, easing the decision process oriented to power efficiency. This study showed the necessity of reducing power consumption in any installation, and mainly in Data Centres, adapting them to the new technical and social demands. To this end, the standard indicators mentioned here have to be employed to manage power efficiency and internal productivity. Of course, this effort has to be carried out jointly with the IT suppliers, integrating their systems (APIs) with the infrastructure monitoring system (SCADA). After the first stage of the action research, the achieved improvement (18% on the PUE) can be considered a good result, but according to other reference values of the sector a PUE around 2 is not yet optimal. Compared with the reference of the Lawrence Berkeley National Laboratory, which reports PUE = 1,83 (an average value obtained from a study of 25 DCs) [25], the result obtained in the action research can be considered worthwhile, even though there is room for further improvement. In order to improve this situation, the next task was to improve the monitoring of the infrastructure equipment, e.g. providing more sensors next to each rack. With the SCADA system installed and working, the data center studied in this paper has reached a PUE of 1,68 (average value for 2009, from January to May); this value is very close to the 1,69 that is the EPA objective for 2009 [26]. As future improvements, the monitoring system should be extended to all the equipment, making it possible to know exactly how each item behaves in order to keep a high level of power efficiency. On the other hand, the maintenance activity must be planned properly, keeping in mind the relation between maintenance and operation (in this case the delivery of the service) and mainly considering the Service Level Agreement contracts. The monitoring system installed goes in the direction of performing maintenance with respect to energy consumption and thus sustainability. Future actions must address the optimization of the IT part and the improvement of the PUE index at the same time. The final objective in the management of a DC is in fact to make IT green and efficient, applying, when needed, condition based maintenance (CBM) techniques in the scope of improving maintenance management, towards the principle of better life cycle cost management.
6
REFERENCES
1 ASHRAE (2004) Thermal Guidelines for Data Processing Environments. ASHRAE. ISBN 1-931862-43-5.
2 Patterson M., Bean J., Jones R., Jones R., Wallerich J., Bednar R., Vinson W., Morris P. (2009) ASHRAE Datacom Book Series. The Green Grid, White Paper 18.
3 Koomey J.G. (2007) Estimating Total Power Consumption by Servers in the U.S. and the World. Lawrence Berkeley National Laboratory.
4 Energy Star and other climate protection partnerships (2007) Annual Report. EPA Report. www.energystar.gov
5 Short J., Williams E. and Christie B. (1976) The Social Psychology of Telecommunications. John Wiley and Sons, England.
6 Moore G.E. (1995) Electronics. www.intel.com
7 European Commission, Directorate General JRC, Institute for Energy, Renewable Energies Unit (2008) Code of Conduct on Data Centres Energy Efficiency.
8 McKinsey & Company (2008) Revolutionizing Data Center Efficiency. Uptime Institute Symposium.
9 The Green Grid (2007) Guidelines for Energy Efficient Datacenters. The Green Grid, White Paper 16.
10 TIA (2005) ANSI/TIA-942 Telecommunications Infrastructure Standard for Data Centers. http://www.tiaonline.org/index.cfm
11 LBNL (2008) Data Center Assessment Tools. http://hightech.lbl.gov/dcassessmenttools.html
12 Emerson Network Power (2008) Energy Logic: Calculating and Prioritizing your Data Center IT Efficiency Actions. White Paper.
13 Electric Power Research Institute. www.epri.com
14 Syska Hennessey Group (2009) www.syska.com
15 The Green Grid (2009) www.thegreengrid.org
16 Patterson M.K. (2008) The effect of data center temperature on energy efficiency. In: Thermal and Thermomechanical Phenomena in Electronic Systems (ITHERM 2008), 11th Intersociety Conference on, pp. 1167-1174.
17 Tozer R. (2008) Data Center Energy Strategy. EYP MCF White Paper.
18 Karlsson J.F. and Moshfegh B. (2005) Investigation of indoor climate and power usage in a data center. Energy and Buildings.
19 Bash C.B., Patel C.D., Sharma R.K. (2006) Dynamic thermal management of air cooled data centers. In: Thermal and Thermomechanical Phenomena in Electronics Systems (ITHERM '06), The Tenth Intersociety Conference on.
20 Belady C., Rawson A., Pfleuger J., Cader T. (2008) Green Grid Data Center Power Efficiency Metrics: PUE and DCiE. The Green Grid, White Paper 6.
21 Hass J., Monroe M., Pflueger J., Pouchet J., Snelling P., Rawson A., Rawson F. (2009) Proxy Proposals for Measuring Data Center Productivity. The Green Grid, White Paper 18.
22 SPEC (2008) SPECpower_ssj2008. www.spec.org
23 Stanley J.R., Brill K.G., Koomey J. (2007) Four Metrics Define Data Center Greenness. Uptime Institute, White Paper.
24 Blum F. (1955) Action research—a scientific approach? Philosophy of Science, 22, 1-7.
25 LBNL. http://hightech.lbl.gov/datacenters
26 Energy Star and other climate protection partnerships (2007) EPA Datacenter Report Final (Appendices). www.energystar.gov
27 Turner W.P. IV, Seader J.H., Renaud V., Brill K.G. (2008) Tier Classifications Define Site Infrastructure Performance. The Uptime Institute, White Paper.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
STRATEGIES FOR INTEGRATING MAINTENANCE FOR SUSTAINABLE MANUFACTURING
Jayantha P. Liyanage a and Fazleena Badurdeen b
a Centre for Industrial Asset Management (CIAM), University of Stavanger, Norway.
b Centre for Manufacturing and Department of Mechanical Engineering, College of Engineering, University of Kentucky, KY, USA.
With the changes in the economic, social, political, and environmental conditions around industrial activities, sustainable manufacturing is gradually emerging as the way to do business in the immediate future. Many global manufacturing giants appear to have embraced the idea, yet they are in need of novel solutions and methods that can be capitalized on. So far, attention has largely been paid to technological innovation. However, much can be done with internal processes such as operations & maintenance to achieve sustainability in the manufacturing cluster. Maintenance in particular has not been at the centre of focus, which can mostly be attributed to the conventional wisdom of the industry. The manufacturing industry is in need of an understanding of the principal strategies exploitable for maintenance integration at different scales. This paper discusses three such strategies from three different perspectives, namely technical, life-cycle, and organizational. It argues that in order to truly integrate maintenance, the classical reliability-based approaches are inadequate to make the business case, and that sustainable manufacturing is in immediate need of novel risk-based applications. Key Words: Maintenance, Sustainable manufacturing, Asset management, Risk.
1
INTRODUCTION
Sustainability can be misunderstood as something entirely negative, an approach that forces costly changes to production and distribution, eventually making products more expensive for the consumer. Sustainability is also often misunderstood as just another approach to environmental protection. In fact, sustainability is a broad and very challenging approach that strives to balance and satisfy the demands of three areas often seen only in isolation and in conflict: economic benefits, environmental protection, and societal well-being, the so-called Triple Bottom-Line (TBL) (Elkington, 1998). While the United Nations' Brundtland Commission report (UNWCED, 1987) defined sustainable development rather vaguely as action "….to meet the needs of the present without compromising the ability of future generations to meet their own needs," subsequent publications and reports, including Agenda 21 (UNCED, 1992), clarified the much broader scope of sustainability. In essence, sustainability is an enlarged framework through which we must view the making of products: sustainable manufacturing. Sustainable manufacturing requires an emphasis on products, processes and the entire system; it also calls for a more holistic approach that covers the total life-cycle of the product (or manufacturing asset, if viewed from a maintenance perspective) from pre-manufacturing through the post-use stages (discussed later). One process that is often overlooked, conventionally viewed as a 'necessary evil' (Paz and Leigh, 1994) and considered as having no value-adding capability, but that is very important for sustainable manufacturing, is the management and maintenance of the assets and equipment used in making, storing and delivering the products. Most conventional maintenance strategies have a narrow focus on the operational stage (or the post-design operations and maintenance) of the asset's life-cycle. Maintenance for sustainable manufacturing, however, requires a much broader emphasis on the entire life-cycle of the asset (Liyanage, 2007, Liyanage, Badurdeen et al. 2009). Moreover, the practice of sustainable manufacturing does not begin in a manufacturing asset and end there after delivery of a product (Liyanage, 2003, Liyanage & Kumar 2003). Companies that own manufacturing assets have a form of responsibility to cultivate a sustainable cluster around the manufacturing asset, actively involving external business partners from the pre-manufacturing to the post-manufacturing stages.
This paper sets out to examine how to approach the integration of maintenance strategies in pursuit of sustainable manufacturing from three different perspectives. To do so, the evolution of manufacturing strategies up to the current state of sustainable manufacturing, together with the characteristics and main foci of the earlier manufacturing strategies, is examined in Section 2. Section 3 then examines principal strategies for integrating maintenance to pursue sustainable manufacturing as characterized in Section 2. In Section 4, the drivers and pre-requisites for integrating maintenance for sustainable manufacturing are discussed. Finally, concluding remarks are presented in Section 5.
2
EVOLUTION OF SUSTAINABLE MANUFACTURING: STATUS AND EMERGING SCENARIOS
The activities involved in the life-cycle of a product can be assigned to four different stages: (1) pre-manufacturing, (2) manufacturing, (3) use, and (4) post-use. To ensure sustainability in the supply chain, and everything within it, all four product life-cycle stages and the economic, environmental and societal impacts at all these stages must be explicitly integrated. Because manufacturing is the core operation in a supply chain (limiting the focus to physical products), designing the system and promoting sustainability in its operations must center on a sustainable manufacturing approach. Such an approach requires two important considerations: (1) a total life-cycle emphasis and (2) a multi-life-cycle emphasis. The former is needed to ensure that manufacturing is pursued while explicitly considering activities across all four life-cycle stages and the impacts thereof. The latter is needed to ensure closed-loop material flow from the post-use stage of one life-cycle to the pre-manufacturing stage of the next, which is mandatory for sustainable manufacturing. In addition, current more advanced thinking also emphasises the need for the active integration of business partners who contribute to the sustainability stakes of a product. The evolution of manufacturing strategies over the years and their impact on stakeholder value (much broader than shareholder value, which is the only consideration in conventional manufacturing) is shown in Figure 1. As can be seen, traditional manufacturing was substitution-based and relied upon relentless resource consumption to deliver value to customers; the value addition to the wider group of stakeholders was very limited. The subsequent practice of lean manufacturing, as Toyota's Production System is commonly referred to in the Western hemisphere, focused on waste reduction (one 'R') and was able to deliver more value to customers while also appreciating the role of team members. Green manufacturing practices, which gained significant popularity over the last several years, advocate environmentally-benign practices through the 3R approach of Reduce, Reuse and Recycle.
Figure 1: Evolution of Manufacturing Strategies (Jawahir, 2008). The figure plots stakeholder value against time (1980-2050) for Traditional Manufacturing (substitution-based), Lean Manufacturing (waste reduction-based), Green Manufacturing (environmentally-benign, 3R-based) and Sustainable Manufacturing (innovative, 6R-based), with the innovation elements Reduce, Reuse, Recycle, Recover, Redesign and Remanufacture accumulating along the curve.
Each of the 3 'R's promoted through Green Manufacturing is as follows:
• Reduce: primarily focuses on the first three stages of the product life-cycle and refers to the reduced use of resources or 'source reduction' in pre-manufacturing, the reduced use of energy and materials during manufacturing, and the reduction of waste during the use stage (USEPA, 2008).
• Reuse: refers to the reuse of the product or its components, after usage in its first life-cycle, in subsequent life-cycles, to reduce the usage of new (virgin) raw materials to produce such products and components (USEPA, 2008).
• Recycle: involves the process of converting material (for example glass, metal and paper) that would otherwise be considered waste into new materials or products (USEPA, 2008).
However, true sustainable manufacturing requires a much broader focus, with innovative techniques that address the total product life-cycle, particularly the post-use stage, which previous manufacturing practices cover only scarcely. Such an approach is presented through the 6R methodology for sustainable manufacturing, which focuses on Reduce, Reuse and Recycle as well as on Recover, Redesign, and Remanufacture of products over multiple product life-cycles. Such practices could enable exponential increases in stakeholder value by adopting a closed-loop, cradle-to-cradle (McDonough & Braungart, 2002) material flow to achieve the triple bottom-line.
Each of the additional three 'R's included in the 6R approach is as follows:
• Recover: involves collecting products at the end of the use stage, disassembling them into components, and sorting and cleaning them for utilization in subsequent life-cycles of the product (Joshi et al., 2006).
• Redesign: is the act of redesigning products for better resource utilization during manufacture and use, and to simplify future post-use processes, through the application of techniques such as Design for Environment (DfE), in order to make the product more sustainable.
• Remanufacture: involves the re-processing of already used products to restore them to their original state or a like-new form, reusing as many components and parts as possible without loss of functionality (Joshi et al., 2006).
Together, the 6R's provide the framework for implementing sustainable manufacturing, where the performance in all life-cycle stages must be considered when making decisions at any life-cycle stage. The application of the 6R's across the four stages over multiple life-cycles, where the frustum depicts the reduced resource footprint in subsequent life-cycles, is illustrated in Figure 2.
Figure 2: Application of 6R’s Across Product Life-cycle Stages (Badurdeen et al., 2009)
3
MAINTENANCE INTEGRATION STRATEGIES FOR ‘SUSTAINABLE’ MANUFACTURING
The concept of 'sustainable' manufacturing is likely to be the basis for competitive advantage for companies in times to come, given the emerging social and environmental conditions. It promotes a manufacturing practice where the underlying processes and products adopt criteria that are more conscious of environmental impact, energy conservation, resource consumption, health & safety, reuse and waste management, etc., as well as of financial gains (Liyanage, 2006, Liyanage, 2007, Jovane et al. 2008). On the other hand, manufacturing processes are challenged in terms of:
• Unnecessary consumption of raw materials, technical and energy resources;
• Reduction of environmental emissions, waste generation and disposal;
• Reuse / recycling of equipment and other technical resources;
• Enhancement of process efficiency with less buffer, less non-productive time, and better technical capacity utilization;
• Implementation of a 'sustainable' policy in material flow management, equipment selection and operation, systems design, conversion into products, and even in post-manufacturing tasks.
Obviously, all the processes in a manufacturing asset have significant roles in the assurance of a 'sustainable' status all the way through, from pre-manufacturing to post-use. Maintenance, as the process dedicated to ensuring technical health, is in particular expected to make a major contribution in this respect. Ill-defined practices with respect to the maintenance of manufacturing assets lead to numerous problems such as hazardous emissions, production waste due to system malfunctions, health & safety incidents, inefficient energy usage, ineffective resource consumption, wastage of stored materials, and financial losses due to capacity losses or downtime. However, many companies appear to have developed the tunnel vision that plant modernization is the principal route to being 'sustainable'. The lack of progress in achieving a 'sustainable' status can in fact be attributed to this same reason, because the scale of modernization required in most plants/facilities may demand massive investments, which in most cases may not be worthwhile environmentally or socially, even if justified in economic terms. Yet not many companies have understood that a more professional approach to developing and implementing
effective maintenance practices for existing plant/facilities can help quite significantly in their efforts at becoming sustainable. In this context, this section discusses 3 principal strategies that can be used to integrate maintenance from different viewpoints to promote a more sustainable approach to manufacturing.
3.1 Risk-based Top-down Systems Integration
It is well known that industrial sectors lack a professional approach to defining their maintenance policies and strategies under dynamic conditions. The existing strategies and policies are mostly outdated or ill-defined, and not strategically carved to suit novel business risks. The manufacturing sector appears to be rather conservative in this regard: automation trends dominate and maintenance is often not considered a critical business process. At the technical systems level, maintenance often comes in after the technology implementation and commissioning process, mostly as a function for restoring and/or retaining the designed technical capacities. The lack of a strategic connection between policies and technical equipment performance is often revealed in technical plant status reviews, as well as during various audits and investigations. The 'top-down systems integration' approach aims at ensuring that the 'sustainable' policies defined and adopted at the strategic level of the plant/facility are systematically followed up through the technical systems down to the critical equipment performance level. Figure 3 illustrates the principles of top-down systems integration.
Figure 3. Principles for 'Top-down systems integration'
Identification of the business case of the plant and definition of the risk factors are key in the development of the plant/facility policies. This takes into consideration general as well as unique business conditions, from an overall sustainability perspective, arising from the local operational environment that impacts the facility. A process follows in which the systems' capacities and capabilities are analysed in detail after a functions analysis with respect to the given sustainability criteria, whereby modernization and upgrading requirements are identified if necessary. This may provide the basis, in particular, for the application of health monitoring techniques, the development of the necessary preventive maintenance programs, and instrumentation solutions. Maintenance is integrated to enable the systems to deliver the designated functions in compliance with the company-specific sustainability policies for the given asset. More specific consideration here should be given not only to the reliability of equipment performance through criticality analysis, but to various other important factors that truly matter for the sustainable exploitation of the engineering capabilities of the manufacturing asset. One may argue that conventional practice does not differ much from the above. In fact, the convention is mostly based on classical reliability-based approaches built on downtime losses and/or production impact. The integration required here, with respect to 'sustainable' manufacturing, extends beyond classical reliability. We argue that it should be based on a comprehensive risk analysis at the systems function and equipment performance level, considering for instance:
• hazardous emissions,
• production waste due to systems malfunctions,
• health & safety incidents,
• inefficient energy usage,
• ineffective resource consumption,
• wastage of stored materials,
• financial losses due to capacity losses or downtime, etc.
This demands a different analysis platform from the ones supported by the classical reliability theories that have mostly been used within various industrial sectors over the last few decades.
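As a rough illustration of what such an analysis platform could compute, the sketch below forms a composite sustainability-risk score for one equipment item as the sum of probability times consequence over the factors listed above. The probabilities, consequence scores and the scoring scheme itself are entirely hypothetical and are shown only to indicate the shape of an analysis that goes beyond failure modes and frequencies.

```python
# Hypothetical sustainability risk factors for one equipment item:
# each entry is (probability of occurrence per year, consequence score 0-10).
risk_profile = {
    "hazardous_emissions":      (0.05, 9),
    "production_waste":         (0.20, 5),
    "health_safety_incident":   (0.02, 10),
    "inefficient_energy_usage": (0.60, 3),
    "resource_wastage":         (0.30, 4),
    "capacity_loss_downtime":   (0.15, 7),
}


def sustainability_risk_score(profile):
    """Composite risk = sum of probability x consequence over all factors,
    i.e. an expected-consequence index rather than a pure reliability figure."""
    return sum(p * c for p, c in profile.values())


score = sustainability_risk_score(risk_profile)
print(f"composite sustainability risk: {score:.2f}")   # 0.45+1.0+0.2+1.8+1.2+1.05 = 5.70
```

Ranking equipment by such a score, rather than by downtime-based criticality alone, would prioritise items whose poor condition harms the sustainability profile even when production impact is modest.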
3.2 Plant / Systems Life-cycle Based Integration
Every plant and technical system has a given commercial life. Technical systems that lack proper care or maintenance intervention strategies will have a life profile completely different from those with well-defined maintenance programs. In fact, from a life-cycle perspective, the aim of maintenance is to extend the useful life (i.e. the use stage discussed in Section 2) of systems and equipment to its extreme end, enhancing the overall productivity factor of those systems and equipment. Apart from the financial benefits, this also has other major implications, for instance in terms of resource utilization, reuse of technical capacity, etc. However, it is not common to see maintenance integrated throughout the plant/systems life-cycle, from engineering design to removal. Maintenance interventions often appear only in the operational or use phase, after some early recommendations by original equipment manufacturers on specific, periodically defined inspection and replacement tasks. This has been the formal practice of design engineers for decades, without much lateral thinking on the operational and removal implications of design decisions, or even on the post-design operational costs during the rest of the life (that is, the use and post-use stages). In fact, maintenance is a process that commences with the very early conceptual engineering tasks (pre-manufacturing stage) and ends with the fall of technical worthiness (post-use stage). Life-cycle based integration intends to cover this entire commercial life of equipment or systems in such a way that more lateral thinking is adopted in the early equipment selection and full exploitation process, involving a few ownership transfers throughout its technical life. This is briefly illustrated in Figure 4.
Figure 4. Maintenance integration process from a systems/equipment life-cycle perspective.
From a life-cycle based integration perspective, maintenance interventions can be divided into three main stages, namely Pre-operational, Operational, and Post-operational. The Pre-operational phase represents the early engineering phases. Formally, maintenance intervention at this stage is limited to the consideration of equipment reliability data in the equipment selection process and to the recommendation of periodic maintenance programs for the selected equipment. The major criteria are thus mostly technical and largely limited to downtime, failure modes, and failure frequencies. Some industry databases, such as OREDA (Offshore Reliability Data), provide the basis for this task. However, from a 'sustainable' manufacturing point of view, we argue that this phase requires a more comprehensive analysis of the technical condition as well as of specific functional characteristics. This needs to include, for instance, energy consumption profiles, CO2 and other greenhouse gas emissions, chemical consumption patterns, contingency requirements, the level of resources tied up, human error potential and safety risk, health exposures, etc., all of which have important effects on the 'sustainable' profile. The interventions in the Operational phase are to be based on a risk-based strategy, rather than pure reliability analysis, initiated in the earlier phase where equipment is classified based on its overall criticality ranking. The risk-based approach adopts a more comprehensive operational impact analysis, for instance in economic, social, and environmental terms, and a
classification of sustainability-based influence factors that help achieve the sustainability objectives of the manufacturing facility. Such a classification provides the basis for the implementation of early technology application solutions and instrumentation needs, for systematic measurements and performance tracking, as well as for other specific organizational measures that are necessary to achieve a 'sustainable manufacturing' profile. The Operational phase thus involves a number of active intervention tasks, based on a calculation of the sustainability risks during operation, and can involve a range of operational measures other than conventional reliability-focused maintenance tasks. In fact, the level of resources actually tied up and consumed for maintenance work execution at this stage is also largely conditioned by the level and nature of the Pre-operational interventions. Maintenance interventions in the Operational phase have a serious effect on the Post-operational condition of the systems / equipment, and subsequently on their reuse capability. The traditional practice often found in industry at this stage is that equipment / systems are minimally maintained, solely for regulatory purposes. There is no formal recycling / post-use thinking embraced by the industrial sectors, purely for financial reasons. This strategy rests on the practice of discarding equipment / systems, which ignores an important principle of sustainable practice: the re-consumption of valuable resources. In this context, the Post-operational maintenance interventions, which are largely concentrated on retaining technical worthiness and assuring reusability after the transfer of ownership, are of great importance. The cycle thus continues until the system / equipment lacks technical worthiness, at which point it can be subjected to a material recycling process. As mentioned above, whole life-cycle maintenance integration is practised in the manufacturing sector today only at a very marginal scale. Maintenance has not yet gained recognition as an important business function, and the level of consideration given to it has been limited to pure reliability analysis. While reliability provides a strong technical logic for the maintenance integration task, it does not provide the comprehensive basis necessary for the full-scale application of sustainability practices in a manufacturing environment. This is where novel risk-based criteria that challenge the conventional practices are needed.
3.3 External organizational integration
The manufacturing sector represents a large body of commercial transactions in various capacities, and these transactions in most cases connect various other organizations and regions. The concept of 'sustainable' manufacturing, ideally, does not begin with the feeding of raw materials to a manufacturing floor and does not end with a manufactured product ready to be sent to market. It involves a number of well-coordinated operations that extend beyond a company's manufacturing floor. This is so even for the maintenance process, in which various external organizations are involved explicitly or implicitly, including, for instance, spare part vendors, logistics chains, diagnostic specialists, equipment manufacturers, instrumentation vendors, and even sales agents. The question here is to what extent a manufacturer is capable of better controlling the 'sustainability assurance' of the external business partners that have any form of commercial or legal influence on the manufacturing of a product. The challenges are particularly evident, for instance, when certain commercial tasks of manufacturing processes are transferred to developing or sensitive regions in Asia, Africa, and South America. This requires the introduction of suitable corporate strategies to streamline the supply chain as well as the product delivery chain (see Figure 5).
Figure 5. Integrating external organizations at the Pre-manufacturing, Manufacturing and Post-manufacturing levels is central to creating a sustainable manufacturing cluster around the operations & maintenance process of a manufacturing asset.
The complexity of this integration process varies depending on the product size and variety, the type (discrete or continuous) and manufacturing volume/conditions of the asset, and, importantly, the scale of geographical diversification of the manufacturing tasks. The aim here is that not only is the manufacturing organization committed to adopting sustainable
business practices, but also external business partners are strategically aligned towards sustainable manufacturing around a manufacturing process and the resulting product. This also in principle calls for other specific organizational measures such as communication, reporting, compliance assurance, and various other measures that can lead to better integration.
4
DRIVERS AND PRE-CONDITIONS OF SUCCESS
Notably, sustainable manufacturing is gaining momentum as the basis for competitive advantage in present and emerging industrial settings. Even though some leading organizations have openly expressed their commitment and have taken early steps indicating their strategic directions, much remains to be done both internally and externally to create truly sustainable manufacturing clusters around the globe. With respect to the operations and maintenance process of manufacturing assets, progress is still far from satisfactory. In an environment where plant modernization and upgrading is embraced as the only way to achieve sustainability, the major challenge, as mentioned above, is to break the conventional reliability-based barriers and to develop more risk-based approaches to cultivate sustainable practices in the manufacturing sector. This needs to involve critical sustainability factors other than classical reliability parameters such as pure failure modes and frequencies. On the other hand, there are various other organizational challenges to the active integration of maintenance as a critical component of sustainable manufacturing, particularly when the manufacturing tasks are diversified and globalized. Based on continuous engagement and experience with industry, these often appear to include:
• Development and introduction of a convincing sustainable business case for maintenance as a critical contributing process for sustainable manufacturing. This typically involves the identification of critical parameters of sustainable manufacturing in the plant and equipment performance setting that are affected by maintenance.
• Senior managers' willingness and commitment to implement more localized solutions, and the provision of time and resources to encourage novel thinking and practice.
• Communicating the importance of sustainability across the operations & maintenance organization and the external business partners involved, and specific measures to enhance the awareness and knowledge of the technical personnel. The latter can include sustainability campaigns, seminars, educational programs, etc.
• Performance tracking and compliance assurance measures, such as periodic reporting, performance measures, assessment techniques, etc.
• Promoting a culture of ownership of sustainability practice within operations & maintenance of the manufacturing cluster, and the introduction of incentive programs to boost achievements.
Obviously, technological innovation is a major contributor to a sustainable future. However, the industry is confronted with various bottlenecks and challenges when resorting to technology alone. Success is in fact achieved through a good blend of technology with softer organizational issues, and by allowing new strategies and improvement measures within existing manufacturing plants/facilities.
5
CONCLUSION
As sustainable manufacturing is systematically being embraced by industry as the future direction for successful business, many manufacturing organizations are in need of novel thinking as to what solutions they can adopt. While new technology has so far attracted most of the attention of the manufacturing sector, there are various other measures that can be taken locally and cost-effectively to contribute to the cause. Among others, these include the operations & maintenance process, which has mostly been undervalued owing to the conventional wisdom of the industry. However, in the pursuit of sustainable manufacturing, maintenance has a much larger business potential than most industrialists are aware of. Capitalizing on this potential requires forward thinking that reaches beyond classical reliability theories and practices. Sustainable manufacturing raises the need for more risk-based applications and lateral thinking in the development of maintenance solutions.
6
REFERENCES
1 Badurdeen F, Iyengar D, Goldsby TJ, Jawahir IS, Metta H & Gupta S. (2009) Extending Total Lifecycle Thinking to Sustainable Supply Chain Design. International Journal of Product Lifecycle Management. Under review.
2 Elkington J. (1998) Cannibals with Forks: The Triple Bottom Line of the 21st Century Business. USA: New Society Publishers.
3 Jawahir IS. (2008) Beyond the 3R's: 6R Sustainability Concepts for Next Generation Manufacturing. Keynote paper presented at the IIT MMAE Symposium on Sustainability and Product Development, August 7-8, 2008.
4 Joshi K, Venkatachalam A, Jaafar IH & Jawahir IS. (2006) A New Methodology for Transforming the 3R Concept into a 6R Concept for Improved Product Sustainability. Global Conference on Sustainable Product Development and Life Cycle Engineering, October 2006, Sao Paulo, Brazil.
5 Jovane F, Yoshikawa H, et al. (2008) The incoming global technological and industrial revolution towards competitive sustainable manufacturing. CIRP Annals – Manufacturing Technology, 57, pp 641-659.
6 Liyanage JP. (2003) Operations and maintenance performance in oil and gas production assets: Theoretical architecture and capital value theory in perspective. PhD Thesis, Faculty of Engineering Science and Technology, Norwegian University of Science & Technology, Norway.
7 Liyanage JP, Kumar U. (2003) Towards a value-based view on operations and maintenance performance management. Journal of Quality in Maintenance Engineering, vol. 9, no. 4, pp 333-350.
8 Liyanage JP. (2006) Sustainability risk management: Managing business risk in complex environments through systematic operations and maintenance performance specifications. 18th European Maintenance Congress: Euromaintenance 2006, Basel, Switzerland, pp 677-682.
9 Liyanage JP. (2007) Operations and maintenance performance in production and manufacturing assets: The sustainability perspective. Journal of Manufacturing Technology Management, Emerald, 18(3&4), pp 304-314.
10 Liyanage JP, Badurdeen F, Ratnayake RMC. (2009) Asset maintenance and sustainability risk. In: Duffuaa SO (ed.), Handbook on Maintenance Engineering and Management, Springer. (accepted)
11 McDonough W & Braungart M. (2002) Cradle to Cradle: Remaking the Way We Make Things. North Point Press.
12 Paz NM and Leigh W. (1994) Maintenance scheduling: issues, results and research needs. International Journal of Operations and Production Management, Vol. 14, No. 8, pp 47-69.
13 UNCED (1992) Agenda 21. United Nations Conference on Environment and Development. (http://www.un.org/esa/sustdev/documents/agenda21/english/agenda21toc.htm).
14 UNWCED (1987) Our Common Future. United Nations World Commission on Environment and Development. UK: Oxford Publishers.
15 USEPA (2008) Municipal Solid Waste (MSW) – Reduce, Reuse, and Recycle. U.S. Environmental Protection Agency, http://www.epa.gov/msw/reduce.htm.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
LIFE CYCLE MAINTENANCE PLANNING SYSTEM IN CONSIDERATION OF OPERATION AND MAINTENANCE INTEGRATION
M. Tsutsui a and S. Takata a
a Department of Industrial and Management Systems Engineering, Waseda University, Okubo, Shinjuku-ku, Tokyo, 169-8555, Japan.
In this paper, we propose a maintenance planning method that takes into account the interrelation between maintenance and operations. In the proposed system, the maintenance plan is evaluated in terms of losses related to operations as well as maintenance. Operational losses include production losses due to both failure and maintenance operations. Maintenance losses include inspection costs, repair costs, and other losses such as those incurred due to operator injury. In addition, the system evaluates the risk of fatal failure. We have developed a prototype system that evaluates these losses using Monte Carlo simulation. The effectiveness of the system is demonstrated by applying the prototype to a heating furnace used in the direct desulfurization facility of an oil refinery plant. Key Words: Maintenance planning, O&M integration, Monte Carlo simulation, Oil refinery plant, Creep rupture 1
INTRODUCTION
Since manufacturing has become increasingly dependent on manufacturing facilities, facility life cycle management has become essential for executing a sustainable manufacturing business by which companies can ensure continuing profit while reducing their environmental load [1]. In facility life cycle management, proper maintenance planning is indispensable. Since the execution of maintenance procedures affects production operations, and operation affects the life span and condition of the facilities, we should manage operation and maintenance in an integrated way. Maintenance planning should, therefore, be performed taking the interactions between operations and maintenance into account. Existing maintenance planning methods such as RCM (Reliability Centered Maintenance) and RBM (Risk Based Maintenance), however, do not pay much attention to the interrelations between operation and maintenance. In this paper, we propose a maintenance planning system that takes the interrelation between operations and maintenance into account. For integrating operation and maintenance planning, we evaluate losses as a common index. Losses related to operations include production losses due to failure occurrences and maintenance operations, excess energy consumption due to performance degradation of facilities, and so on. Losses related to maintenance include the costs of inspection and repair, and other losses such as those caused by injuries to operators. In addition to such losses, the system needs to evaluate the risk of fatal failure. For evaluating losses and risks, we propose a life cycle maintenance planning support system. The system evaluates losses and risk by means of Monte Carlo simulation [2] and assists the planner in selecting a proper maintenance plan. We have applied the prototype system to a heating furnace of a direct desulfurization facility of an oil refinery plant, and have demonstrated the effectiveness of the system by evaluating the losses induced by major deterioration and failure modes, including creep of the heating tubes.
2 OPERATION AND MAINTENANCE INTEGRATION
With increasing dependence of manufacturing activities on plant facilities, it is essential to maintain facilities in proper condition for achieving efficient and stable production. For this purpose, the operation and maintenance planning processes should account for how each affects the other: the execution of maintenance affects production operations, and operating conditions cause changes in the facility condition. In many plants, however, operation and maintenance management is carried out with different criteria. In many cases, priority is placed on safety and reliability in maintenance management, while emphasis is put on efficient and stable production in operation management.

Figure 1 shows a concept for the integration of operation and maintenance management, referring to the ADID (Activity Domain Integration Diagram) in ISO 18435 [3]. In this figure, activities related to production operations are indicated on the left and those related to maintenance on the right, while the activities related to operation and maintenance integration are represented in the middle. The figure shows how the operations affect the facility conditions; this forms the basic information required for maintenance planning. Level 3 in the figure also shows that operation and maintenance plans should be made based on the evaluation of the losses and risks related to both operations and maintenance.

Figure 1. Integration of operation and maintenance (activity levels: 1.1 execution of production operation, 1.2 condition monitoring and inspection, 1.3 execution of maintenance; 2.1 operation data acquisition, 2.2 likelihood evaluation of failure occurrence, 2.3 maintenance data acquisition; 3.1 production operation planning, 3.2 evaluation of expected losses and risk, 3.3 maintenance planning; 4.1 determination of manufacturing and sales strategy)

We assess the effectiveness of operation and maintenance plans, in terms of losses and risk, as defined by the following equations.
L=
Lm + Lo , T T
R=
(1)
I
∑∑ e t =1 i =1
i ,t
T
Pri ,t ,
(2)
where L is the losses per unit term, L_m is the losses related to maintenance activities, L_o is the losses related to production operation, T is the number of evaluation terms, R is the risk per unit term, I is the total number of failure modes, i is the failure mode identification number, t is the term number, e_{i,t} is the amount of emergency maintenance losses induced by failure mode i in term t, and Pr_{i,t} is the failure probability of failure mode i in term t. In general, operation and maintenance plans should be optimized so as to minimize the total loss and maximize the profit. At the same time, we should manage the manufacturing facilities in a way that keeps the risk of fatal damage below a certain level. Although the risk can be evaluated in terms of its financial cost together with the other losses, it is preferable to manage the risk independently, because we need to pay special attention to failures that would cause tremendous losses but occur only rarely. In this study, therefore, we set an allowable maximum probability for the fatal failure modes. The selection of the operation and maintenance plan is executed by adopting L as the objective function and R as a constraint.
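As a concrete illustration of Eqs. (1) and (2), the following minimal Python sketch accumulates per-term loss and risk figures; all numerical values are hypothetical placeholders rather than data from the case study.

```python
# Minimal sketch of Eqs. (1) and (2); every numerical value is a hypothetical placeholder.
T = 12                                   # number of evaluation terms (months)
maint_losses = [120.0] * T               # per-term contributions to L_m (k-yen)
oper_losses = [300.0] * T                # per-term contributions to L_o (k-yen)

# e[i][t]: emergency maintenance loss if failure mode i occurs in term t (k-yen)
# p[i][t]: probability that failure mode i occurs in term t
e = [[5000.0] * T, [800.0] * T]          # two hypothetical failure modes
p = [[0.001] * T, [0.010] * T]

L = sum(maint_losses) / T + sum(oper_losses) / T                          # Eq. (1)
R = sum(e[i][t] * p[i][t] for i in range(len(e)) for t in range(T)) / T   # Eq. (2)

allowable_risk = 10.0                    # hypothetical constraint on R (k-yen/term)
print(f"L = {L:.1f} k-yen/term, R = {R:.2f} k-yen/term, feasible = {R <= allowable_risk}")
```

A candidate plan would then be selected by minimizing L over the plans whose R satisfies the constraint.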
3 LIFE CYCLE MAINTENANCE PLANNING
The procedure of life cycle maintenance planning is shown in Figure 2. The procedure consists of 9 steps. The details of each step are described below.
(1) Information gathering for target facility
First, we should collect all the necessary information associated with the target facility, such as the configuration and function of the facility, historical data for operation and maintenance, and constraints on operating conditions. Based on this information, we execute the structural and functional deployment in the next step.
(2) Structural and functional deployment
The structure and function of the facility are analyzed. First, we identify the facility's components and their connectivity. Then, we analyze the functional relations of the components that are necessary for the failure analysis performed in the next step.

(3) Deterioration and failure analysis
First, we identify the potential deterioration modes that can occur in the facility on the basis of the structural deployment. Potential deterioration is identified on the basis of component geometry, material properties, and connectivity characteristics. We should also identify the deterioration patterns that characterize the progress of deterioration. Since failure modes missed in this step will not be considered in maintenance planning later on, it is important to enumerate all possible deterioration modes. In the failure analysis, we identify the failure modes induced by deterioration. Based on the deterioration patterns, we select the applicable maintenance policies, which include time based maintenance (TBM), condition based maintenance (CBM) and breakdown maintenance (BM). For instance, TBM makes no sense for random failures, and CBM cannot be applied to sudden failures.
Figure 2. Procedure of lifecycle maintenance planning
Figure 3. Classification based on accumulated data (decision flow: if a detectable variable expresses the deterioration and failure state and enough knowledge is available, the deterioration and failure model is constructed and the failure probability is evaluated from it; otherwise, the failure distribution is identified from test and field data and the failure probability is evaluated from that distribution; the accumulation of operating conditions and maintenance data can later enable model construction)
(4) Evaluation of likelihood of failure occurrences
For evaluating the likelihood of failure occurrences, we take one of two approaches depending on the characteristics of the deterioration and failure modes and our knowledge about them, as shown in Figure 3. When we have enough knowledge about the deterioration mechanism, we construct a model of deterioration progress, by which the component life can be estimated based on the operating history. If we do not have enough knowledge to construct the deterioration model, we identify the failure distribution functions based on failure data. For this purpose, we often use the cumulative hazard method, assuming a Weibull distribution for failure. Although a statistical approach must be taken due to the lack of knowledge of the deterioration modes in the early stages of operation, we may be able to construct a deterioration model later based on accumulated operation and maintenance data, as indicated in Figure 3.

(5) Evaluation of operation and maintenance losses
For evaluating operation and maintenance losses, we need to define the loss items. They are identified from the viewpoint of the objects in which losses occur and the type of losses. Then, we calculate the losses by gathering the necessary information. They are, for example, production losses due to maintenance operations, maintenance losses such as inspection and repair costs, and other losses such as injury to operators.

(6) Set an acceptable risk
An acceptable risk during operations is set by considering the features of the facility and the products.

(7) Set constraints on operation and maintenance planning
In this step, we set constraints on operation and maintenance planning. On the maintenance side, for example, we set the range of the TBM cycle and a threshold value for executing CBM. On the operations side, we set the acceptable range of production volume. Operation and maintenance plans are made within the constraints determined in this step.

(8) Evaluation of expected losses [4]
For evaluating the expected losses, we consider the facility failure probability and accumulate the possible future losses incurred due to operation and maintenance, which include preventive maintenance activities, decreased production, and so on. Then we develop an operation and maintenance plan that would entail the minimum loss. For this purpose, we use life cycle maintenance simulation.
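For the statistical branch of step (4), a minimal sketch of the cumulative hazard method is given below; the failure times are hypothetical and the estimator is a simple least-squares fit, not necessarily the exact procedure used in the prototype system.

```python
# Minimal sketch of step (4) when no deterioration model is available:
# estimate Weibull parameters from failure data with the cumulative hazard
# (Nelson-Aalen) method.  The failure times below are hypothetical.
import math

failure_times = sorted([14.0, 22.0, 29.0, 35.0, 41.0, 52.0])  # months to failure
n = len(failure_times)

ln_t, ln_H = [], []
H = 0.0
for rank, t in enumerate(failure_times):
    at_risk = n - rank          # units still operating just before this failure
    H += 1.0 / at_risk          # cumulative hazard estimate
    ln_t.append(math.log(t))
    ln_H.append(math.log(H))

# For a Weibull distribution, H(t) = (t / eta)**m, i.e. ln H = m*ln t - m*ln eta,
# so a least-squares line through (ln t, ln H) gives the shape m and scale eta.
mean_x = sum(ln_t) / n
mean_y = sum(ln_H) / n
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(ln_t, ln_H)) / \
    sum((x - mean_x) ** 2 for x in ln_t)
eta = math.exp(mean_x - mean_y / m)

# Failure probability within the first t months: F(t) = 1 - exp(-(t/eta)**m)
t = 24.0
F = 1.0 - math.exp(-(t / eta) ** m)
print(f"m = {m:.2f}, eta = {eta:.1f} months, F({t:.0f} months) = {F:.3f}")
```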
4 APPLICATION TO OIL REFINERY PLANT
4.1 Deterioration and failure analysis of the direct desulfurization facility

We have applied the life cycle maintenance planning support system to a heating furnace of a direct desulfurization facility of an oil refinery plant. In the plant, heavy oil is heated by the furnace and desulfurized by a decomposition reaction at high pressure and high temperature in the presence of a catalyst. First, we conducted the structural and functional deployment of the heating furnace of the direct desulfurization facility and identified potential deterioration and failure occurrences. From them, we selected seven major deterioration modes, as shown in Figure 4, for demonstrating maintenance planning using the proposed system. Table 1 shows the models for estimating the likelihood of failures due to these deterioration modes, the maintenance policies, control variables, types of treatment, and timing of inspection and treatment. Among these deterioration modes, creep rupture, brick separation, breakage of tiles, and nozzle clogging are the modes that cause failures. The wastage due to metal dusting does not cause tube failure, because we assume that the tube is exchanged before the amount of wastage exceeds a threshold value.
Figure 4. Structural and functional deployment
Table 1 Deterioration modes and their maintenance policies

| deterioration | deterioration model | maintenance policy | control variable | treatment | inspection timing | treatment timing |
| degradation of catalyst | proportional to production | TBM | production volume | replace | shutdown | shutdown |
| carbon deposit on inner wall | proportional to production | CBM | amount of carbon deposit | repair | shutdown | shutdown |
| creep | Manson-Haferd model | TBM | life consumption rate | replace | - | shutdown |
| wastage due to metal dusting corrosion | proportional to production | CBM | wall thickness | replace | shutdown | shutdown |
| brick separation | Weibull | CBM | inclination of bricks | repair | shutdown | shutdown |
| tile breakage | Weibull | TBM | total operation time | repair | - | shutdown |
| nozzle clogging | Weibull | TBM | total operation time | replace | - | anytime |
Before explaining the details of the deterioration models in the next section, the interrelationships among the deterioration modes are explained below. The first three deterioration modes are related to the temperatures of the heating tubes, as shown in Figure 5. To maintain the target production volume, the effect of the degradation of the catalyst should be compensated for by increasing the oil temperature. To keep the oil temperature at the required value, the external surface temperatures of the heating tubes should be increased, depending on the amount of carbon deposit on the inner walls of the tubes. The increase in the external temperatures of the heating tubes, however, accelerates the progress of creep and reduces the life of the tubes. The required surface temperature is formulated as

T_s = T_i + T_{co} + T_{ca} + T_h ,

where T_s is the required surface temperature of the heating tube, T_i is the inner fluid temperature, T_{co} is the increase in temperature induced by the carbon deposit on the inner wall, T_{ca} is the increase in temperature induced by catalytic degradation, and T_h is the increase in temperature induced by the insulation of the heating tube.

Figure 5. Formulation of required surface temperature
4.2 Modelling of deterioration and failure modes

For evaluating expected losses quantitatively, we construct mathematical models of the deterioration modes as follows.

(1) Degradation of catalyst effect
The catalyst effect decreases with increasing operation time. To compensate for this effect, it is necessary to increase the oil temperature by using more fuel. The catalyst degradation, therefore, is represented by the increase in oil temperature required to maintain the production volume. The effect of the degradation of the catalyst progresses non-linearly with operation time, and we apply a piecewise linear approximation to represent this trend. For the first 300 days, the degradation rate is assumed to be 5.15 °C per month, after which the rate increases to 7 °C per month. The degradation of the catalyst leads to an increase in fuel costs, which is considered in the calculation of expected losses.
(2) Carbon deposit on inner walls of heating tubes
Due to the carbon deposition reaction, carbon adheres to the inner walls of the heating tubes. The thickness of the deposit is assumed to increase in proportion to the cumulative production volume. Since the carbon deposit has an adiabatic effect, the temperature of the external wall of the tubes should be increased in order to maintain the oil temperature in the tube at the required level; the increase in surface temperature required because of the carbon deposit is assumed to be proportional to the thickness of the deposit.
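To make the coupling of these two models with the Figure 5 formulation concrete, a minimal sketch follows. Only the catalyst degradation rates (5.15 °C per month for the first 300 days, then 7 °C per month) come from the text; the inner fluid temperature, the insulation term, and both proportionality constants are hypothetical placeholders.

```python
# Minimal sketch combining deterioration models (1) and (2) with
# T_s = T_i + T_co + T_ca + T_h; constants marked "hypothetical" are placeholders.

def catalyst_temperature_increase(days_in_operation: float) -> float:
    """T_ca: oil-temperature increase compensating catalyst degradation [degC]."""
    months = days_in_operation / 30.0
    if days_in_operation <= 300.0:
        return 5.15 * months
    return 5.15 * 10.0 + 7.0 * (months - 10.0)

def carbon_deposit_thickness(cumulative_production_kl: float) -> float:
    """Deposit thickness [mm], assumed proportional to cumulative production."""
    mm_per_kl = 1.0e-6                  # hypothetical proportionality constant
    return mm_per_kl * cumulative_production_kl

def required_surface_temperature(days, cumulative_production_kl,
                                 t_inner=380.0,      # hypothetical T_i [degC]
                                 t_insulation=25.0,  # hypothetical T_h [degC]
                                 degc_per_mm=8.0):   # hypothetical T_co constant
    """T_s = T_i + T_co + T_ca + T_h (all terms in degC)."""
    t_ca = catalyst_temperature_increase(days)
    t_co = degc_per_mm * carbon_deposit_thickness(cumulative_production_kl)
    return t_inner + t_co + t_ca + t_insulation

print(required_surface_temperature(days=450, cumulative_production_kl=450 * 6400))
```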
Figure 6. Creep life distribution (the creep life is assumed to follow a normal distribution N(μ, σ) of the life consumption rate, with μ = 100% and σ = 5.035%; the area of this distribution below the life consumption rate LC(k) reached in term k gives the creep rupture probability)
(3) Creep of heating tubes
For evaluating creep life, we use the Manson-Haferd equation:

\log t_R = \log t_a + \left[ (T + 273.15) - T_a \right] \cdot \sum_{k=0}^{5} b_k (\log S)^k ,    (3)

where t_R is the time to rupture (h); T is the temperature (°C); t_a and T_a are optimized constants; S is the stress (MPa); b_0, b_1, ..., b_5 are regression coefficients estimated by the least-squares method; and k is the degree of the regression equation. We use NIMS (National Institute for Materials Science) data for the values of the model parameters, assuming that the tubes are made of SUS347 [5]. From this equation, we estimate the life consumption rate of the tubes. Provided that the tubes are used under temperature T_op(i) for the period of time Δt_i (i = 1, 2, ..., k), the life consumption rate of the tube at the end of term k is represented by the following equation, where t_{R,T=T_op(i)} indicates the creep life under the temperature T_op(i):

LC(k) = \sum_{i=1}^{k} \frac{\Delta t_i}{t_{R,\,T=T_{op}(i)}} .    (4)
In making a decision on the basis of such models, we should take the uncertainty of the estimated life into account. Therefore, we assume that the distribution of the creep life follows the normal distribution shown in Figure 6. The shaded area in the figure shows the failure probability of creep rupture when the life consumption rate is LC(k).

(4) Wastage due to metal dusting corrosion
Although metal dusting corrosion is known to be one of the deterioration modes in heating furnaces, a model of its progress has not been established yet [6]. We assume, therefore, that the metal dusting corrosion progresses in proportion to the production volume. If the wastage exceeds a threshold, the heating tubes are exchanged. We assume the corrosion wastage rate is 0.001 mm/month.

(5) Brick separation
Bricks are used to protect the walls of the combustion chamber from the high-temperature combustion gas. The bricks are bonded by filling the space between them with mortar, as shown in Figure 7. The process of brick separation is as follows. First, the high-temperature combustion gas causes mortar corrosion. Then, as the gas penetrates the furnace wall behind the bricks, it generates scales which gradually push the bricks out. If the inclination of a brick becomes large, the brick drops. When a brick falls, we need to repair it by shutting down the furnace. We assume that occurrences of bricks falling follow the Weibull distribution.

(6) Burner tile breakage and nozzle clogging
The furnace uses burners to burn heavy oil and to heat the tubes. The structure of the burner units is shown in Figure 8. The burner tiles and the nozzle are important for maintaining the flame shape that supplies a stable heat flux. Thus, if a tile or nozzle fails, the shape of the burner flame becomes unstable and the heating efficiency decreases. In the case of burner tile breakage, it is necessary to increase the heat output of the other burners in order to maintain the required surface temperature of the heating tubes until the next shutdown for maintenance; in this case, we need to take the additional fuel costs into account. Nozzle replacement can be performed immediately after a nozzle fails, because it is possible to remove it from outside the combustion chamber without shutting down the furnace.
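Bringing the creep model together, a minimal sketch of Eqs. (3) and (4) and of the rupture probability implied by the Figure 6 distribution is given below. The Manson-Haferd constants and the operating history are hypothetical placeholders rather than the NIMS values for SUS347; only μ = 100% and σ = 5.035% follow Figure 6.

```python
# Minimal sketch of Eqs. (3)-(4) and the Figure 6 rupture probability.
# LOG_T_A, T_A, B and the operating history are hypothetical placeholders.
import math

LOG_T_A = 20.0                                # hypothetical optimized constant log t_a
T_A = 0.0                                     # hypothetical optimized constant T_a
B = [-1.4e-2, -2.0e-3, 0.0, 0.0, 0.0, 0.0]    # hypothetical regression coefficients b_0..b_5

def creep_life_hours(temp_c, stress_mpa):
    """Time to rupture t_R [h] from the Manson-Haferd equation, Eq. (3)."""
    poly = sum(b * math.log10(stress_mpa) ** k for k, b in enumerate(B))
    log_t_r = LOG_T_A + ((temp_c + 273.15) - T_A) * poly
    return 10.0 ** log_t_r

def life_consumption_rate(history):
    """Eq. (4): sum of dt_i / t_R at each term's operating temperature.
    history: list of (hours_in_term, tube_temperature_degC, stress_MPa)."""
    return sum(dt / creep_life_hours(temp, s) for dt, temp, s in history)

def rupture_probability(lc, mu=1.0, sigma=0.05035):
    """Probability that the normally distributed creep life (Figure 6) is already consumed."""
    return 0.5 * (1.0 + math.erf((lc - mu) / (sigma * math.sqrt(2.0))))

history = [(720.0, 600.0 + 1.0 * k, 30.0) for k in range(24)]   # hypothetical 24 months
lc = life_consumption_rate(history)
print(f"LC = {lc:.3f}, creep rupture probability = {rupture_probability(lc):.2e}")
```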
Figure 7. Structure of furnace wall and brick

Figure 8. Structure of burner units
4.3 Life cycle maintenance simulation

(1) Evaluation of operation and maintenance activity losses
We developed the life cycle maintenance simulation system for evaluating operation and maintenance losses, taking into account the deterioration and failure modes shown in Table 1. Before executing the simulation, we define the loss items. They are identified according to the objects in which the losses occur and according to the types of these losses. We consider three objects (man, machine, and the environment) and three types of losses (quality, cost, and delivery). For machines, we identified the losses caused by operation and maintenance activities. For maintenance, the labor cost, the cost of parts, and expenses for maintenance operations such as inspection, repair and replacement are considered. For operations, production losses due to maintenance operations and excess fuel costs are considered. Life shortening is considered as a loss associated with both maintenance and operations. The loss item deployment explained above is illustrated in Figure 9. The data used for the evaluation of these losses are shown in Tables 2 and 3. SDM (shutdown maintenance) requires 60 days including shutdown and start-up time; if decoking is executed during SDM, 7 more days are needed. The loss values derived from the data are indicated in Table 4.
Figure 9. Operation and maintenance losses (deployment of the loss items by object: man, machine and environment, and by activity: maintenance and operation; the items include injury compensation, repair/replace costs, labor costs, parts expenses, breakdown time, facility life depreciation, production losses, operation costs, and fuel cost losses)
Table 2 Loss evaluation parameters

| parameter | value |
| target production volume per day (kl) | 6,400 |
| marginal profit per production (yen/kl) | 8,440 |
| fuel cost at basal condition (yen/kl) | 1,000 |
| labor cost per hour (yen) | 20,000 |
| necessary days of SDM inspection (days) | 10 |
| fuel cost losses at nozzle replacement (yen/kl) | 30 |
| fuel cost losses until tile repair (yen/kl) | 20 |

Table 3 Number of days required for execution

| event | required days |
| decoking | 12 |
| SDM | 35 |
| SDM including decoking | 42 |
| tube exchanging | 65 |
| emergency tube exchanging | 65 |
| brick falling | 5 |
| shutting down the plant | 10 |
| starting up the plant | 15 |
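The production-loss entries of Table 4 below appear to follow from Tables 2 and 3 as the number of days out of production multiplied by the daily production volume and the marginal profit. A minimal sketch of that arithmetic is given here; adding the 10-day shutdown and 15-day start-up to each event is an inference from the reported figures rather than an explicit statement in the paper.

```python
# Minimal check of how the production-loss column of Table 4 can be derived
# from Tables 2 and 3 (units: kl/day, yen/kl, result in k-yen).
DAILY_VOLUME_KL = 6400                  # target production volume per day (Table 2)
MARGINAL_PROFIT_YEN_PER_KL = 8440       # marginal profit per production (Table 2)
SHUTDOWN_DAYS, STARTUP_DAYS = 10, 15    # assumed overhead added to each event (Table 3)

event_days = {                          # Table 3
    "SDM": 35,
    "SDM including decoking": 42,
    "decoking": 12,
    "tube exchanging": 65,
    "brick falling": 5,
}

for event, days in event_days.items():
    outage = days + SHUTDOWN_DAYS + STARTUP_DAYS
    loss_kyen = outage * DAILY_VOLUME_KL * MARGINAL_PROFIT_YEN_PER_KL / 1000
    print(f"{event:25s} {outage:3d} days  {loss_kyen:12,.0f} k-yen")
# SDM -> 3,240,960; SDM incl. decoking -> 3,619,072; decoking -> 1,998,592;
# tube exchanging -> 4,861,440; brick falling -> 1,620,480 (matching Table 4)
```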
Table 4 Evaluation of operation and maintenance losses (k-yen)

| event | production losses | fuel cost losses | maintenance treatment costs | inspection costs | compensation | sum of losses |
| SDM | 3,240,960 | 0 | 10,000 | *2 | 0 | 3,250,960 |
| SDM (including decoking) | 3,619,072 | 0 | 220,000 | *2 | 0 | 3,839,072 |
| decoking | 1,998,592 | 0 | 210,000 | 0 | 0 | 2,208,592 |
| creep rupture: replace | 4,861,440 | 0 | 2,000,000 | 0 | 0 | 6,861,440 |
| creep rupture: EM | 4,861,440 | 0 | 2,000,000 | 0 | 0 | 6,861,440 |
| corrosion wastage: replace | 4,861,440 | 0 | 2,000,000 | 0 | 0 | 6,861,440 |
| corrosion wastage: EM | 4,861,440 | 0 | 2,000,000 | 0 | 0 | 6,861,440 |
| furnace wall brick falling: replace | 0 | 0 | 2,003 | 0 | 0 | 2,003 |
| furnace wall brick falling: EM | 1,620,480 | 0 | 2,003 | 0 | 0 | 1,622,483 |
| burner tile chipage: repair | 0 | 0 | 1,002 | 0 | 0 | 1,002 |
| burner tile chipage: EM | 0 | *1 | 1,002 | 0 | 0 | 1,002 |
| burner nozzle clogging: replace | 0 | 0 | 1,004 | 0 | 0 | 1,004 |
| burner nozzle clogging: EM | 0 | *1 | 1,004 | 0 | 0 | 1,004 |
*1: The fuel cost loss is dictated by the time to find the failure and the number of days until the next SDM.
*2: The inspection cost is dictated by the SDM cycle.
*3: EM means emergency maintenance.

(2) Setting the acceptable risk
In addition to loss evaluation, we also set the maximum acceptable risk in order to manage failures which may result in catastrophic damage. For the direct desulfurization plant, creep rupture of the heating tubes can be considered fatal damage. The allowable rupture probability, therefore, is set at 0.0001 (1/month), which corresponds to 81.27% of the life consumption rate of the creep life. In our simulation, preventive maintenance, that is, replacement of the tubes, is executed before the life consumption rate reaches this limit.

(3) Life cycle maintenance simulation
The loss and risk evaluation is performed by means of Monte Carlo simulation based on the deterioration models and failure distribution functions of the deterioration modes listed in Table 1. In the simulation, the expected losses from operations, maintenance, and failures are evaluated in each term and accumulated over the whole simulation period. The procedure of the simulation is illustrated in Figure 10. After setting the simulation parameters, the system starts to execute the simulation loop for each term (one term equals one month in this simulation). First, the production volume in the term is determined according to the production plan and the condition of the facility. Then, the system checks whether any failure occurs during the term by means of the Monte Carlo method. Here, four kinds of deterioration modes are considered as causes of failure: creep rupture, brick separation, breakage of tiles, and nozzle clogging. If a failure occurs, the corresponding failure loss is added to the expected losses in the term. For the deterioration modes that do not lead to failures during the term, the system checks whether any preventive maintenance is needed. If so, the losses for preventive maintenance are added to the expected losses. The system repeats the above procedure until the last term is processed. Since Monte Carlo simulation relies on random sampling, it is necessary to repeat the simulation a certain number of times to obtain a stable result. Finally, we select the proper operation and maintenance plan in terms of the profit and risk calculated by the simulation. With this simulation, we can evaluate various operation and maintenance scenarios by changing simulation parameters such as the product price, labor costs, and parts expenses.
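A minimal sketch of the simulation loop of Figure 10 is given below (one term = one month). The per-term failure probabilities and the shutdown-maintenance policy shown are hypothetical placeholders rather than the prototype's actual deterioration models; the loss figures merely echo Table 4.

```python
# Minimal sketch of the Monte Carlo loop of Figure 10; the failure probabilities
# and the SDM policy are hypothetical placeholders, loss values echo Table 4.
import random

TERMS = 800            # evaluation period (months), as in the case study
TRIALS = 1000          # Monte Carlo repetitions, as in the case study

failure_modes = {      # per-month failure probability and emergency loss (k-yen)
    "creep rupture":    (0.0001, 6_861_440),
    "brick separation": (0.002,  1_622_483),
    "tile breakage":    (0.004,  1_002),
    "nozzle clogging":  (0.003,  1_004),
}
SDM_CYCLE = 15                    # candidate shutdown-maintenance cycle (months)
SDM_LOSS = 3_250_960              # planned SDM loss per execution (k-yen)

def one_trial(rng: random.Random) -> float:
    total = 0.0
    for month in range(1, TERMS + 1):
        for prob, loss in failure_modes.values():     # random failure check
            if rng.random() < prob:
                total += loss
        if month % SDM_CYCLE == 0:                    # preventive maintenance
            total += SDM_LOSS
    return total / TERMS                              # losses per unit term

rng = random.Random(0)
avg_loss = sum(one_trial(rng) for _ in range(TRIALS)) / TRIALS
print(f"average loss: {avg_loss:,.0f} k-yen/month for an SDM cycle of {SDM_CYCLE} months")
```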
Figure 10. Procedure of life cycle maintenance simulation (after setting the simulation parameters, each term involves determining the production volume, evaluating the time to failure, determining the preventive maintenance operations, and calculating the cumulative losses and risk; the loop is repeated over the evaluation period, over the Monte Carlo trials, and over all candidate maintenance plans, after which the trial averages are calculated)

(4) Setting the operation and maintenance scenario
To examine the effects of the SDM cycle on the relationship among the various losses, such as production loss, maintenance loss, and shortening of the facility life, we execute the simulation while changing the SDM cycle from 10 to 21 months. The maximum maintenance cycle of 21 months is selected because catalyst degradation has a huge impact on the life expectancy of the heating tubes when the cycle becomes longer than 21 months. In this simulation, we adopt the maintenance policy for each deterioration and failure mode shown in Table 1. In addition, we change the TBM cycle from 8 to 35 months; this range is determined by the results of a preliminary simulation. The Weibull parameters used in the simulation are listed in Table 5, and the criteria for applying the treatments are shown in Table 6. We adopt decoking as a maintenance alternative only when the amount of carbon deposit exceeds 3 mm.
As for the operation, we set the allowable range of production volume at 4,800 kl to 6,400 kl per day, and the operating pressure at 13.2 MPa. Since carbon deposition is accelerated in the case of low production volumes due to the characteristics of the facility, we set the lower limit of production volume. We performed the simulation for a period of 800 months, and the number of repetitions of the Monte Carlo simulation was 1,000.

Table 5 Weibull parameters

| deterioration | m | η (month) |
| brick separation | 12 | 40 |
| tile breakage | 10 | 24 |
| nozzle clogging | 14 | 30 |

Table 6 Criteria for applying the treatment

| deterioration | criteria value |
| wastage due to metal dusting corrosion | 10.0 (%) |
| brick separation | 2.0 (degree) |
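As one way to use the Table 5 parameters inside the monthly simulation loop, a minimal sketch converts each Weibull distribution into a conditional per-month failure probability; this discretization is an illustrative assumption, not the paper's stated implementation.

```python
# Minimal sketch: the probability of failing in month t, given survival so far,
# is 1 - exp(H(t-1) - H(t)) with the Weibull cumulative hazard H(t) = (t/eta)**m.
import math

weibull_params = {                # Table 5: shape m, scale eta (months)
    "brick separation": (12, 40),
    "tile breakage":    (10, 24),
    "nozzle clogging":  (14, 30),
}

def monthly_failure_probability(mode: str, month: int) -> float:
    m, eta = weibull_params[mode]
    h_prev = ((month - 1) / eta) ** m
    h_now = (month / eta) ** m
    return 1.0 - math.exp(h_prev - h_now)

for mode in weibull_params:
    p12 = monthly_failure_probability(mode, 12)
    p24 = monthly_failure_probability(mode, 24)
    print(f"{mode:18s} P(fail in month 12) = {p12:.4f}, month 24 = {p24:.4f}")
```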
4.4 Simulation results and discussion

The results of the simulation are shown in Figures 11 (a) and (b), in which the average losses per unit term, the average marginal profit per unit term, the lifetime of the heating tubes, and the holding risk per unit term are indicated. Figure 11 (a) shows that most losses in this setting are induced by production losses and the depletion of the heating tubes. The total losses have their minimum value at an SDM cycle of 15 months, while the marginal profit has its maximum value for this cycle. On the other hand, Figure 11 (b) shows that the holding risk increases monotonically as the SDM cycle increases, while the lifetime of the tubes decreases. The amount of risk shown in Figure 11 (b) corresponds to the emergency maintenance losses. Therefore, applying better maintenance policies to reduce the emergency maintenance cost contributes to an increase in the marginal profit. These results show that the selection of a proper SDM cycle can increase the total profit. At the same time, we need to pay attention to the increase in fatal risks at the facilities.
Figure 11. Simulation results: (a) average total losses and marginal profit [100 million yen/month] versus the SDM cycle [month], with the losses broken down into production losses, depletion of heating tube, fuel cost losses, preventive maintenance costs, and emergency maintenance losses; (b) average lifetime of the heating tube [month] and total holding risk [100 million yen/month] versus the SDM cycle, with the risk broken down into creep rupture, brick separation, burner nozzle clogging, and burner tile breakage.
5 CONCLUSIONS
In this paper, we developed a prototype life cycle maintenance planning support system that evaluates the expected total loss by accumulating operation and maintenance losses, as well as the risk of fatal failures. By applying the proposed system, we can determine proper operation and maintenance plans that consider both losses and risks. The system can also take into account various changes, such as changes in the product price, labor costs, and parts expenses; it is therefore especially effective for facilities whose key parameters change frequently. In future investigations, we also need to evaluate operation and maintenance plans in terms of the environmental load, for realizing environmentally sustainable manufacturing. In addition, we have to construct a module that assists in improving the accuracy of the deterioration models by accumulating operation and maintenance field data.
6 REFERENCES
1. S. Takata, F. Kimura, F.J.A.M. van Houten, E. Westkämper, M. Shpitalni, D. Ceglarek & J. Lee. (2004) Maintenance: Changing Role in Life Cycle Management. Annals of the CIRP, vol 53/2, 643-655.
2. Evans JR, Olson DL. (1998) Introduction to Simulation and Risk Analysis. Prentice Hall, Inc.
3. ISO/PRF 18435-1. (2009) Industrial automation systems and integration -- Diagnostics, capability assessment and maintenance applications integration -- Part 1: Overview and general requirements.
4. Y. Katsuta. (2007) Evaluation of maintenance plan based on the expected maintenance effects estimation method. Proc. of JSPE spring meeting, 1171-1172.
5. NIMS Microstructure Database for Crept Materials, National Institute for Materials Science (NIMS): http://creptimg.nims.go.jp/index_eng.html (accessed 2009/5/30).
6. Yoshitaka N. (2007) Metal dusting of steels and alloys in carbon-bearing gases. Zairyo-to-Kankyo, 56(3), 84-90.
Acknowledgement
My heartfelt appreciation goes to H. Ishimaru, M. Iwata, and I. Kawamura, whose comments and suggestions were invaluable throughout the course of our study.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
PERFORMANCE OF PUBLIC PRIVATE PARTNERSHIPS: AN EVOLUTIONARY PERSPECTIVE
Furneaux, C.W. a,b, Brown, K.A. c, Tywoniak, S. a,b, and Gudmundsson, A. b
a CRC for Integrated Engineering Asset Management, Brisbane, Australia
b School of Management, Faculty of Business, Queensland University of Technology, Brisbane, Australia
c School of Tourism and Hospitality Management, Southern Cross University, Australia
PPPs are held to be a powerful way of mobilising private finance and resources to deliver public infrastructure. Theoretically, research into procurement has begun to acknowledge difficulties with the classification and assessment of different types of procurement, particularly those which do not sufficiently acknowledge variety within specific types of procurement methods. This paper advances a theoretical framework based on an evolutionary economic conceptualisation of a routine, which can accommodate the variety evident in procurement projects, in particular PPPs. The paper tests how the various elements of a PPP, as advanced in the theoretical framework, affect performance across 10 case studies. It concludes that a limited number of elements of a PPP affect their performance, and provides strong evidence for the theoretical model advanced in this paper.

Key Words: procurement, asset management, organisational routines, variation, performance

1 INTRODUCTION
Public Private Partnerships (PPPs) are a way of mobilising private finance and resources to deliver public infrastructure. According to [1], PPPs are "a contracting arrangement in which a private party, normally a consortium structured around a Special Purpose Vehicle (SPV), that takes responsibility for financing and long term maintenance or operation of a facility to provide long term service outcomes". According to [2], growth in the use of PPPs "is due in a large part to the scope to bring in private sector management skills, the opportunity that bundling design, construction and operation, or parts thereof, provide to improve efficiency and the ability to bring forward the provision of the infrastructure service." In summary, PPPs "work best where government has considerable skill in contract negotiation and management, and where there is adequate competition for the projects. The costs of tendering, negotiating and managing contracts can be considerable – with tendering costs alone estimated at up to 3 per cent of the project cost. And while risks may be transferred to private partners, the cost of risk will be factored into the cost of finance. The main advantage of PPPs comes from the scope for lowering the total cost of the project through improving project risk management. And while contract negotiation can be lengthy, PPPs provide a more flexible, and potentially more timely source of finance for important infrastructure investments that might otherwise be constrained by public debt pressures" [2].

While some authors hold very positive views on the performance of PPPs [1], other authors point to significant differences in the performance of PPPs [3], particularly as a result of the different types of risk and cost. Theoretically, research into procurement in general has begun to acknowledge difficulties with the classification of procurement routes, which do not sufficiently acknowledge the varieties which exist in specific approaches to procurement. According to [4], "The procurement routes selected for the project were virtually impossible to classify according to the commonly accepted labels … the contracts had been 'heavily amended' to a point that they no longer adhered to the type of structure that the industry would normally perceive to belong to any particular procurement route. Procurement is significantly more complex than construction academics would wish it to be – the variability is such, it is argued, that it is virtually impossible to classify procurement by any sort of rational positivistic approach". This finding also holds for PPPs: once the underlying characteristics of PPPs are taken into consideration, wide variations can be seen to exist when they are compared to each other [3]. Traditional approaches, which attempt to link different types of infrastructure procurement (such as PPPs or alliances) to performance, do not take into account the complexity and variation possible between projects, and this failure to take into account all of the variables involved in projects results in considerable difficulties in assessing and comparing the performance of different projects.

There is considerable variety within the procurement models relating to infrastructure, which has led to confusion, and a realisation that simple high-level correlation techniques between PPPs and success do not hold [3], particularly when success is viewed as more than on time and on budget. New analytical approaches need to be developed which can cope with the variability evident in procurement projects, and yet which can still provide some framework for analysing performance. This paper advances a new theoretical framework, developed from organisational theory, evolutionary economics and complexity theory, which can take into account the variety evident in procurement projects, in particular PPPs, and provide a framework for performance assessment. This theory is outlined in the section below.
2 LITERATURE REVIEW
In their seminal work, [5] highlight the importance of 'routines' as a unit of analysis in organisations, as they provide the basis of an evolutionary theory of organisations: "The first is the idea of organizational routine. At any time, organizations have built into them a set of ways of doing things and ways of determining what to do". Routines are patterns of behaviour [5] which structure the specific ways organisations may do specific things. Empirical examples of routines include such aspects as budgeting and hiring of staff [6, 7]. As PPPs are recurrent patterns used by governments as a mechanism to procure infrastructure, this research holds that PPPs are a particular example of a routine which is implemented by government as a set way of achieving certain infrastructure projects.

While early discussion of routines focussed on viewing routines as undifferentiated singular entities, recent research has noted that there can be different elements in a routine, and it is helpful to discuss these elements in some detail. Identifying the various elements of a routine can assist in providing conceptual clarity about the nature of the routine, and [8] suggests that routines in organisations can be viewed as being similar to genes in organisms. Thus routines are structured ways of undertaking specific activities and consist of various specific elements. It is this detailed view of routines, examining the various elements of PPPs, which goes beyond traditional approaches to analysing the performance of PPPs. Despite early theorists [5] arguing from an evolutionary perspective of economics that routines are the analogue of genes in organisations, attempts to specify how routines can be seen to be similar to genes are still very few [8, p.662]. While well established in the routine literature, there is little in the organisational or economic literature to assist in explicating and examining the genes of routines. To understand routines as genes, by way of analogy, it is necessary to turn to biological theorists, particularly those who can theoretically inform the organisational theory of routines and how the genes of routines affect the performance of the routines.

One theorist who has provided a cogent explanation of complex adaptive systems and relates this to the performance of organisms is Kauffman [9, 10]. The work of Kauffman [9, 10] has not been engaged with significantly in the routines literature to date (although some mention his work in passing [see 11]). However, the wider organisational theory literature has begun to engage with Kauffman's work [12], as it provides a pragmatic approach to using biological metaphors which can be readily applied to the problems of business and government. Such application has focussed on search and stability in organisations [13], and on designing organisations in order to respond to changes in their landscapes or business environment [14, 15]. Kauffman [10] focuses on biological entities in his work, which examines how genes affect the fitness of an organism in a given landscape. Kauffman's [10] approach can readily be understood: for animals, certain traits (derived from their genes) result in improved performance on particular landscapes. For example, thick fur or blubber helps animals to survive in cold climates like the Antarctic, whereas long legs and short hair are better on savannahs. The sum of the genes of animals contributes to their fitness, or performance, on various landscapes [16].
Fitness is the essential element of adaptation [17], and is therefore an essential element of adaptive systems. The most appropriate rules (i.e. the ones that 'work') are the ones that tend to be reinforced, as they have a high level of fitness for organisms in relation to the 'landscape'. "Each gene has several alternative forms called alleles … and these forms usually have distinct observable effects on the organism. The objective in genetics, as it is for rules, is to determine the effects of different alternatives at different positions. In mathematical genetics there is a classical approach to determining these effects. It is to assume that each allele contributes something positive or negative to the overall fitness of the organism" [16]. Both [10] and [16] argue that the fitness of a particular animal on a given landscape can be determined by identifying the genes of the animal and assessing how these contribute to the overall performance of the animal on that landscape. In a similar fashion, [18] has argued that routines can be conceptualised as gene sequences of DNA. As [5, p.106] argue, "organizations respond routinely with a wide variety of specialised routine performances, each 'customised' for a particular configuration of the environment". In other words, there is not a single routine, but rather a number of different ways in which the routine can be performed, and the effectiveness of the routine needs to match the environment in which it is situated. Thus routines have been conceptualised as similar to DNA sequences in the routines literature. In both the biological literature and the routines literature, there is an understanding that there are various elements of a routine (or gene) which contribute in various ways to performance. This suggests the following research question:

Research Question 1: How can PPPs be conceptualised as a gene?
2.1 Conceptualisation of PPPs as routines

In the previous section a theoretical proposition was put forward: that PPPs, as routine procurement arrangements, contain various elements or options which can be conceptualised as analogous to the genes of an organism, and the structure of these elements can be related to performance. While this has been advocated theoretically, operationalisation of the concept in specific situations remains to be achieved. This section outlines how a PPP project might be reconceptualised as something similar to a gene. It should be noted that this is a metaphorical approach, and there is no intent to argue that a PPP is literally and actually a gene. In order to determine the various elements of the routine, the various elements or sections of the PPP 'entity' first need to be determined. [19], [20] and [3] have already identified several key elements of any type of procurement project, which will be used here to analyse PPPs. These elements of PPPs are set out in Table 1 below, together with the various options available in each 'gene', and an explanation of each element follows. Importantly, this approach does not rely on the traditional labelling system, which has been soundly criticised [21] because it does not allow for the sheer level of variation evident in PPP projects.

Table 1 Genetic elements of a PPP routine

| Element | Options |
| Asset specificity | High or low |
| Construction cost | Actual cost ($) |
| Cost risk transferred | Yes or no |
| Construction complexity | High, medium, or low |
| Revenue uncertainty | High, medium, or low |
| Revenue risk transferred | Yes or no |
| Govt contract management skills | Poor, fair, good |
| Externalities or other negative events | Actual events |
| Transaction costs | High, moderate, or low |
Each of these elements is discussed below.

2.1.1 Asset specificity: Asset specificity can relate to the variety of uses of an asset, or how specific the asset is to a particular location [19]. Dams exhibit high location specificity and can only be used for a single purpose in a single location, making them effectively one-of-a-kind assets with very high specificity. Manufactured goods such as cars are not location specific, although cars are built for different purposes. Assets which can be used for multiple purposes, or are not location specific, have low asset specificity. The specificity of an asset can significantly impact its performance, so it is a key element, or gene, of PPPs. An asset may have high or low specificity.
2.1.2 Construction Cost: Cost is the simple value of the PPP contract and is a key performance element of PPPs.
2.1.3 Cost risk transferred: With any major project, there are risks that costs could increase beyond the contract price. Under some PPP arrangements costs can be transferred from government to the private sector as part of the contract [3]. Shifting of cost risk may significantly affect the performance of a PPP so is likely to be a key driver of performance. Cost risk is either transferred in the PPP arrangement or not.
2.1.4 Complexity: Task complexity refers to the degree of difficulty in specifying the terms, conditions and outcomes of the procurement activity [20]. "Complex goods involve uncertainty in the nature and costs of the production process itself. They also face more environmental uncertainty because complex goods are more likely to be affected by unforeseen changes in the environment" [20]. Complexity is therefore a key element which can affect the performance of PPPs. Projects can evidence high, medium, or low complexity.
2.1.5 Revenue uncertainty: For some PPPs there is a level of uncertainty about how much revenue a particular asset will generate. As PPPs require investment by the private sector in public infrastructure, the revenue, or return, from such investment is an important element of the performance of the PPP [3]. Consequently it is a very important element which might affect the performance of PPPs. PPPs can have high, medium or low revenue uncertainty.
2.1.6 Revenue risk transferred: While revenue is a key risk, in some cases government underwrites or guarantees the income for a period of time, whereas in other cases the risk is transferred to the private sector [3]. Consequently, the transfer of revenue risk is an important element which can affect the performance of PPPs, as in situations of high revenue uncertainty the private sector may not be willing to accept the full risk for the project, leaving government with ongoing financial obligations for the project. Revenue risk is either transferred or not in PPPs.
2.1.7 Government contract management skills: PPPs are quite complex organisational and contractual arrangements, so the skills of the government agency in handling these contracts are critical to the successful negotiation and completion of the project [3]. Government skill levels might be poor, fair or good.
2.1.8 Externalities or other negative events: Occasionally, irrespective of the financial performance of a project, considerable negative external events can be generated, such as poor service levels, obstruction of other projects or interference with other infrastructure [21]. These can affect an otherwise successful project and lead to a perception that the project is poor, even if it is delivered on time and on budget. Externalities need to be noted on a case by case basis, as they are idiosyncratic to the specific project.
2.1.9 Transaction costs: While a project has direct costs, a key logic in the economic literature, which is often understated in the contracting-out literature, is that the total cost of the project involves not just the costs of delivering the project, but also the costs of managing the contractual arrangements [22]. While a project may be delivered within budget, high costs in managing the contract may result in the project being too expensive overall, particularly if legal action is involved. Consequently, transaction costs are a critical element which may affect the performance of the project. Transaction costs can either be high, medium or low.

Kauffman [9, 10] suggests that the various elements of the routine can be related positively or negatively to performance. As has been noted above, each of the elements of the routine has the potential at least to affect the outcome of a particular PPP. Taken together, these elements of a PPP include critical aspects which can affect project performance, allow for detailed analysis of the elements and how each of these might affect performance, and suggest a second research question:

Research Question 2: How do the elements of a PPP routine affect performance?

Finally, as alluded to in the introduction, other researchers have found that there is considerable variety in procurement approaches [4]. By viewing procurement approaches such as PPPs as a routine, it is proposed that the variability in procurement approaches can be explored. This suggests the final research question:

Research Question 3: How variable are PPPs as routines?

The elements of PPPs and how these elements affect performance are analysed in the results section below. Firstly, a brief discussion of the research methods is appropriate.
3 RESEARCH METHODS
One useful way of undertaking in-depth analysis of a particular issue or technology as it impacts an organisation or industry is by undertaking case studies, as such a strategy can provide strong recommendations for improvements in theory, technology or policy [23, 24]. As noted, there is a dearth of existing work undertaken in the area of routines [8, 25]. In such cases, exploratory research using a case study methodology is seen to be appropriate [26]. This is because a series of case studies can develop the application of theory to an area [27], and test the applicability of that theory for utility in explaining specific phenomena. In particular, undertaking multiple case studies of the same phenomena allows for comparison between the cases and extension of theory. In multiple case studies, a logic of replication is followed in order to compare data and improve the generalisability of the data [28]. Multiple case studies enable researchers to clarify whether a finding is peculiar to a single case study, or whether it is consistently replicated across several cases [29]. Multiple case studies provide a much stronger case for generalisability than single case studies [29]. This is because examining a phenomenon in multiple situations leads to better understanding, and perhaps better theorising, about an even wider selection of cases [30]. By examining the elements of multiple PPPs and comparing these with the perceived performance of the PPPs, the underlying contribution of specific elements of the routine to performance becomes clear. The analysis of multiple PPPs enables both the individual elements and how these elements contribute to performance to become clear. Existing research has already been undertaken into the performance of PPPs [3]; this paper takes this existing research and treats each PPP reported as a separate case.

Case studies are best suited to real life phenomena under conditions where researchers have little control [31], and are appropriate for examining events or activities [32]. As this paper is examining how routines can be conceptualised as a gene, a case study methodology is appropriate as an overarching strategy. In particular, the use of multiple case studies enables the comparison of different arrangements in order to determine the underlying contribution of specific elements to performance. The findings of the study are reported below.
4 FINDINGS
In attempting to develop a new theoretical framework, this paper uses existing reports of the performance of PPPs, as the focus is on testing the validity of the theoretical framework, not on examining a new situation per se. Data have been made available on the performance of PPPs against the elements identified above [3]. The performance of 10 PPPs thus provides 10 case studies with which to address Research Question 1 (how can a routine be conceptualised as a gene?). It also provides a useful set of comparative studies with which to answer Research Question 2 (how does the structure of the routine affect performance?). While the data are drawn from existing work, the theoretical framework used to assess the performance and the structure of the PPP as a gene is new. Consequently, these findings serve to extend and critique the existing work of [3].
4.1 Structure of Canadian PPPs

In the theory section, we advanced the proposition that, as routines, PPPs could be considered metaphorically to be like a gene, and a number of different project genes, or elements, were identified from the existing literature. Ten case studies were reviewed, and each of these elements was identified for each case study. As can be seen from Table 2, PPPs exist in a wide variety of configurations. Once viewed from the perspective of routines, there is no single form of PPP, but rather a wide variety of projects and structures which happen to all bear the same name. Given the high variability between the PPPs reviewed, this finding supports the concerns of [21], who argued that the sheer variability of procurement projects makes it highly difficult to treat each one as the same thing. Instead, examining the various elements of each PPP allows the differences and similarities between them to become apparent, and, as will be discussed in the next section, the elements which affect performance to become clear. Those elements which do not relate clearly to performance are not shown in the table. Thus, in response to Research Question 1: by viewing PPPs as a set of elements, PPPs can be viewed in a similar way to a gene. Each PPP can be viewed as comprising a set of characteristics, or variables, which in turn might relate to performance. This issue of performance is a critical aspect of the use of the metaphor of routines as genes and is discussed in the next section. Additionally, viewing PPPs as routines demonstrates the sheer variability of routines. Thus, in answer to Research Question 3, there is a very wide variety of procurement routines.
4.2 The performance of PPP genetic structure

As outlined in section 4.1, the analysis of 10 PPPs according to their elements allows for a finer grained analysis than simply examining cost or quality, which are often used to assess the outcome of projects. As suggested by Kauffman [9], by adopting a more fine-grained approach the performance of each of the PPPs, and how this might relate to the different arrangements, also becomes clear. As Table 2 suggests, projects 1 to 5 were considered not to be successes, while projects 6 to 10 were considered to be successes. Looking across the various elements, the relative strength of each of these in relation to successful and unsuccessful projects can be ascertained; these are discussed below. Kauffman [9] argued that for certain animals, certain genetic structures result in poor performance (e.g. polar bears in the desert), whereas other genetic structures enable animals to perform well (e.g. cheetahs in the savannahs of Africa). Each of the various elements and how these affect performance are discussed below.

4.2.1 Asset specificity: All the assets reviewed had high levels of asset specificity [3], particularly location specificity, so this in effect formed a constant and has been removed from the further analysis in this paper of the contribution of specific elements to performance.
4.2.2 Construction complexity: All the projects were of low to medium complexity, and there was no correlation between low or medium complexity and the outcome of a specific project [3]. Consequently, while an important element in projects, this variable does not relate to performance in the PPPs examined. There was a weak relationship between complexity and cost, as the more complex projects tended to have higher costs, but this did not directly relate to performance overall.

4.2.3 Construction cost: While the cost varied significantly, there was no direct correlation between cost and performance. Indirectly, however, the greater the value of the project, the more difficult it appeared to be to shift cost risk to the private sector [3]. Less complex projects tend to be less costly, and therefore less risky for the private sector to take on. However, there is some dynamic interaction between variables here, so further research is needed to unpack the relationships between complexity, cost and the transferral of cost risk. Whereas some evaluations see cost as one of the key performance measures of PPPs [1] (a dependent variable), in this framework there are multiple costs that are variables which influence the outcomes (independent variables).
Table 2: Structure and performance of 10 Canadian PPPs (derived and adapted from [3])

| Project | Type | Start date | Cost risk transferred | Revenue uncertainty | Revenue risk transferred | Government contract management skills | Negative externalities | Contract transaction costs | Success |
| 1 | Waste management | 1987 | Partially | Moderate | Not first ten years | Poor | Chemical leaks, under-utilisation | High | No |
| 2 | Sports multiplex | 1999 | Not effectively | Moderate | Partially | Fair-Poor | None evident | High | No |
| 3 | Highway | 1999 | Yes | High | No | Poor | 6 toll increases, roads congested | High | No |
| 4 | School | 1997 | Yes, but costs high | Low | No | Poor | Negative political outcomes | High | No |
| 5 | Highway | 1995 | Yes | High-Moderate | No | Fair | Toll-level problems | Moderate | No (qualified) |
| 6 | Bridge | 1997 | Yes | High | Revenue guarantees | Fair | None evident | Moderate | Yes (qualified) |
| 7 | School | 1995 | Yes | Low | Yes | Fair | None evident | Moderate | Yes |
| 8 | Gas power | 1998 | Yes | Low | Yes | Good | None evident | Average | Yes |
| 9 | Water treatment | 2005 | Yes | Moderate | Yes | Good | None evident | Low | Yes |
| 10 |  | 2005 | Yes | Low | Partially | Good | None evident | Low | Yes |
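To make the element-by-element comparison that follows concrete, a minimal sketch tallies how often each value of three of the elements coincides with a reported success; the encoding is illustrative only, with values simplified from Table 2 above.

```python
# Minimal sketch: tally element values from Table 2 against reported success.
# The encoding is illustrative only; "success" counts qualified cases as yes.
from collections import defaultdict

cases = [  # (cost risk transferred, revenue uncertainty, transaction costs, success)
    ("partially",       "moderate",      "high",     False),
    ("not effectively", "moderate",      "high",     False),
    ("yes",             "high",          "high",     False),
    ("yes",             "low",           "high",     False),
    ("yes",             "high-moderate", "moderate", False),
    ("yes",             "high",          "moderate", True),
    ("yes",             "low",           "moderate", True),
    ("yes",             "low",           "average",  True),
    ("yes",             "moderate",      "low",      True),
    ("yes",             "low",           "low",      True),
]

elements = ["cost risk transferred", "revenue uncertainty", "transaction costs"]
tally = defaultdict(lambda: [0, 0])            # (element, value) -> [successes, total]
for case in cases:
    *values, success = case
    for name, value in zip(elements, values):
        tally[(name, value)][1] += 1
        tally[(name, value)][0] += int(success)

for (name, value), (won, total) in sorted(tally.items()):
    print(f"{name:22s} = {value:13s}: {won}/{total} successful")
```

Run on the ten cases, such a tally reproduces the qualitative pattern discussed below, for example that every successful project transferred cost risk.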
4.2.4 Cost risk transferred: In every project which was successful, the cost risk was transferred to the private sector. The results are more mixed for the unsuccessful projects: only two of those PPPs successfully transferred cost risk, and another one did so with high costs associated with the transfer [3]. From a government perspective, the logic of involving the private sector in a PPP involves financing a project, which means logically that there needs to be a shift of costs to the private firms.
4.2.5 Revenue uncertainty and revenue risk transferred: Under the framework adopted in this paper, different types of risk are acknowledged. While cost risk involves the possibility of the construction costs increasing beyond budget, revenue risk relates to the operational life of the PPP. Revenue uncertainty is fairly complex and does not seem to operate on its own. Instead, as [3] argues, as revenue uncertainty increased, the likelihood of revenue risk being transferred decreased. In other words, the more uncertain the operational profit was for the private partner, the less likely the private partner was to accept this type of risk, although this was not a clear-cut relationship. While cost risk may be offset by expertise in the private sector, which can generate better estimates of cost than government [3], revenue risk is far more difficult for the private sector to ameliorate, as it is beyond their power to increase the revenue generated by infrastructure assets. In case 6, the project was a success, even with otherwise negative elements, due to reasonable management skills and a limited revenue guarantee [3].
Some critical public policy guidelines assert that revenue risk can be transferred to the private sector under PPPs [33], while some economists [3] and financial advisers [34] argue that this risk should not be transferred. This paper takes a more nuanced approach, by acknowledging both risk and uncertainty. The more certain a private partner can be about the return on investment in the operational cycle, the more likely a project is to succeed. This differentiation between cost risk (in the construction phase) and revenue risk (in the operation phase of a PPP) is a useful distinction which warrants further exploration, and one which has begun to appear in detailed public policy documentation [35]. However, the assumption evident in some public policy advice [33] that revenue risk can automatically be transferred in PPPs is not supported by the evidence found by [3] and reported here.
4.2.6
Government contract management skills: [3] does mention government contract management skills, although this is not given the same level of focus in their paper as risk. And yet, once the data are interpreted and reordered as per Table 2, the relationship between government contracting skills and successful completion of projects is arguably as strong as, or even stronger than, that of risk. Such a finding also supports other work by the authors, which found that the capability of clients to affect procurement is a critical driver of successful procurement arrangements [36].
4.2.7
Externalities or other negative events: Externalities were identified from the economic literature as a likely element of PPPs which could affect performance. As can be seen from Table 2, negative externalities are strongly related to poor performance of the PPP in nearly every case (80% of cases) [3].
4.2.8
Transaction costs: As Table 2 demonstrates, high and moderate transaction costs (the costs involved with managing the contract) were also related to the poor performance of the PPP. However, moderate transaction costs, combined with low revenue uncertainty seemed to result in reasonable project performance [3].
Summary of performance: These findings answer the second research question as to how the elements of PPP routines affect performance. Kauffman [9] argues that the genes of an animal can determine its performance on a specific landscape. What would the metaphorical genes of a successful PPP be? According to the studies examined in this paper, a successful PPP is one with the elements set out in Table 3:
Table 3 Genes of successful PPPs
Cost risk transferred: Yes
Revenue uncertainty: Low - moderate
Revenue risk transferred: Yes
Government contract management skills: Fair to Good
Negative externalities: None
Contract transaction costs: Moderate to Low
A PPP likely to succeed is one where cost risk is transferred to the private sector, but where there is low revenue uncertainty, medium to high contract management skills, low transaction costs and an absence of negative externalities [3]. Indeed this finding raises the question of the utility of PPPs as a procurement vehicle: “Our analysis suggests that, in some sense, effective PPPs are not PPPs! Private sector project participation makes the most sense when it bears cost risks, but not revenue risks. In such circumstances, there is not that much difference from what has been traditionally described as a ‘build-operate-transfer’ contract” [3]. This finding also relates to Research Question 3, which examines the variability of PPP routines. Viewed from the perspective advanced in this paper, while all of these projects were labelled PPPs, not all of them performed well, or contained the key aspect (financial risk transfer) which is a central component of PPPs. Consequently, while there are multiple varieties of PPPs, only some varieties can be considered successes.
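Purely as an illustration, Table 3 can be read as a simple screening checklist. The sketch below encodes the "genes" programmatically; the attribute names and rating labels are a simplification introduced here for demonstration and are not categories defined in [3].

```python
# Illustrative only: Table 3's "genes" of successful PPPs as a screening checklist.
# Attribute names and rating labels are assumptions made for this sketch.
SUCCESS_GENES = {
    "cost_risk_transferred": lambda v: v is True,
    "revenue_uncertainty": lambda v: v in ("low", "moderate"),
    "revenue_risk_transferred": lambda v: v is True,
    "contract_management_skills": lambda v: v in ("fair", "good"),
    "negative_externalities": lambda v: v == "none",
    "transaction_costs": lambda v: v in ("low", "moderate"),
}

def missing_genes(project):
    """Return the Table 3 'genes' that a proposed PPP does not exhibit."""
    return [g for g, ok in SUCCESS_GENES.items() if not ok(project.get(g))]

# A hypothetical project profile, not one of the cases in Table 2.
proposal = {
    "cost_risk_transferred": True,
    "revenue_uncertainty": "high",
    "revenue_risk_transferred": False,
    "contract_management_skills": "good",
    "negative_externalities": "none",
    "transaction_costs": "moderate",
}
print(missing_genes(proposal))  # -> ['revenue_uncertainty', 'revenue_risk_transferred']
```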
5
LIMITATIONS
The analysis undertaken in this paper is based on existing research and amounts to a re-examination and an extension of previous studies. One of the issues with the existing assessment of performance is that it appears to be largely subjective. For example, “negative externalities and high transaction costs lead us to conclude that this project was poor from a social perspective” [3]. Thus poor performance is a subjective assessment, based not on objective measures but on theoretically driven criteria. In fact, performance is a notoriously difficult area in projects, as there are multiple criteria for claiming success, and otherwise successful projects can be deemed unsuccessful by other authors or stakeholders. Nevertheless, other authors focus on the financial performance of PPPs [1], which is certainly one valid measure. By extending the assessment of performance to include the social issues associated with PPPs [3], a wider array of actors becomes involved. As PPPs are often created to deliver public services such as waste management, water and electricity, it is important that the social impact of
various PPPs is considered, as a project which is a financial success may well not be delivering the services it was meant to provide in the first place. Future research would need to consider other performance criteria of PPPs in order to explore the issue of performance more closely. A second limitation is that in the reported outcomes of 10 PPPs [3], the conditions of the market are not taken into consideration. Other frameworks consider the issue of the contestability of the market in assessing performance [20], as this can seriously affect the performance of procurement arrangements. Indeed, low contestability undermines the logic underpinning contracting out in the first place, since it is competition that drives costs down and makes procurement from the private sector viable in the first instance [19]. Future research should also consider the impact of market forces on projects, which is not currently included in this model.
6
CONCLUSIONS
This paper set out to test the validity of a new theoretical framework for PPPs, using existing PPP research. In so doing, a set of findings enabled the research questions to be answered. Firstly, it is suggested that it is possible to examine the structure of procurement projects such as PPPs by identifying the various elements of the routine’s ‘gene’. Indeed, such a process underscores the importance of undertaking such analysis at a detailed level, as there is a high level of individual variation between projects. A second outcome was that the re-analysis of the existing data enables a clear and concise picture of the influence of specific genes on the performance of PPPs. This finding addresses the second research question, which sought to identify whether specific ‘genes’ can be shown to relate to the performance of PPP routines. It is evident that there are a number of elements which affect performance, and a few which do not – at least in the case studies reviewed here. This study has demonstrated that viewing PPPs through the metaphor of genes enables the variation between the various elements of PPPs to become apparent, and allows an assessment of how these elements relate to performance. While a larger number of projects would be needed to test and generalise these findings, strong support for the theoretical validity of the framework advanced earlier in this paper has been demonstrated. Consequently, the model advanced in this paper offers a novel theoretical framework which holds promise for providing a robust and accurate means to assess the elements and performance of procurement arrangements such as PPPs. Aside from the theoretical value of such a framework, there is considerable practical utility in enabling government and firms to assess under what circumstances to use PPPs as a procurement method.
7
REFERENCES
1
Duffield, C., P. Raisbeck, and M. Xu, Benchmarking Study, Phase II: Report on the performance of PPP projects in Australia when compared with a representative sample of traditionally procured infrastructure projects. 2008, National PPP Forum: Melbourne.
2
Chan, C., et al., Public Infrastructure Financing: An International Perspective, A.G.P. Commission, Editor. 2009, Australian Government Publishing Service: Canberra
3
Vining, A.R. and A.E. Boardman, Public-private partnerships in Canada: Theory and evidence. Canadian Public Administration / Administration Publique du Canada, 2008. 51(1): p. 9-44.
4
Tookey, J.E., et al., Construction procurement routes: re-defining the contours of construction procurement. Engineering, Construction and Architectural Management, 2001. 8(1): p. 20-30.
5
Nelson, R.R. and S.G. Winter, An evolutionary theory of economic change. 1982, Cambridge, MA: Harvard University Press.
6
Feldman, M.S., Organizational routines as a source of continuous change. Organization Science, 2000. 11(6): p. 611-629.
7
Feldman, M.S., A performative perspective on stability and change in organizational routines. Industrial and Corporate Change, 2003. 12(4): p. 727-752.
8
Becker, M.C., Organizational routines: a review of the literature. Industrial and Corporate Change, 2004. 13(4): p. 643-677.
9
Kauffman, S.A., The Origins of Order: Self-Organisation and Selection in Evolution 1993, New York: Oxford University Press.
10 Kauffman, S.A., At home in the universe: The search for the laws of Self-Organization and Complexity 1995, New York: Oxford University Press.
11 Becker, M.C., T. Knudsen, and J.G. March, Schumpeter, Winter, and the sources of novelty. Industrial and Corporate Change, 2006. 15(2): p. 353–371.
12 Winter, S., Optimization and evolution in the theory of the firm, in Adaptive Economic Models, R.H. Day and T. Groves, Editors. 1975, Academic Press: New York. p. 73-118.
13 Rivkin, J.W. and N. Siggelkow, Balancing search and stability: Interdependencies among elements of organizational design. Management Science, 2003. 49(3): p. 290-311.
14 Levinthal, D.A., Adaptation on rugged landscapes. Management Science, 1997. 43: p. 319-340.
15 Levinthal, D.A. and M. Warglien, Landscape design: Designing for local action in complex worlds. Organization Science, 1999. 10(3): p. 342-357.
16 Holland, J.H., Hidden Order: How adaptation builds complexity. 1995, New York: Basic Books.
17 Gell-Mann, M., Complex Adaptive Systems, in The Mind, The Brain, and Complex Adaptive Systems, H. Morowitz and J.L. Singer, Editors. 1994, Westview Press: Santa Fe. p. 11-24.
18 Pentland, B.T., Conceptualising and measuring variety in the execution of organizational work processes. Management Science, 2003. 49(7): p. 857-870.
19 Globerman, S. and A. Vining, A framework for evaluating the government contracting-out decision with an application to information technology. Public Administration Review, 1996. 56(6): p. 557-586.
20 Vining, A. and S. Globerman, A conceptual framework for understanding the outsourcing decision. European Management Journal, 1999. 17(6): p. 645-654.
21 Chan, A.P.C. and A.P.L. Chan, Key performance indicators for measuring construction success. Benchmarking: An International Journal, 2004. 11(2): p. 203-221.
22 Coase, R.H., The Nature of the Firm. Economica, 1937. 4(16): p. 386-405.
23 Osbourne, S.P. and K.A. Brown, Managing change and innovation in public service organizations. 2005, New York: Routledge.
24 Stake, R.E., Qualitative case studies, in The Sage Handbook of Qualitative Research Methods, N. Denzin and Y.S. Lincoln, Editors. 2005, Sage: Thousand Oaks. p. 443-466.
25 Becker, M.C., The concept of routines: some clarifications. Cambridge Journal of Economics, 2005. 29: p. 249-262.
26 Babbie, E., The practice of social research. 10th ed. 2004, Belmont, CA: Wadsworth/Thompson Learning.
27 Eisenhardt, K.M., Building theories from case study research, in The Qualitative Researcher’s Companion, A.M. Huberman and M.B. Miles, Editors. 2002, Sage Publications: Thousand Oaks. p. 5–31.
28 Yin, R.K., Case Study Research: Design and Methods. 3rd ed. 2003, Thousand Oaks: Sage Publications.
29 Eisenhardt, K.M., Better stories and better constructs: The case for rigor and comparative logic. Academy of Management Review, 1991. 16(3): p. 620-627.
30 Stake, R.E., Case studies, in Strategies of Qualitative Enquiry, N. Denzin and Y.S. Lincoln, Editors. 2003, Thousand Oaks: Sage. p. 134-164.
31 Lee, T.W., Using qualitative methods in organizational research. 1999, Thousand Oaks: Sage.
32 Cresswell, J.W., Qualitative Inquiry and Research Design: Choosing among five approaches. 2nd ed. 2007, Thousand Oaks, CA: Sage Publications.
33 NSW Government, Working with Government: Risk Allocation and Commercial Principles, NSW Treasury, Editor. 2007, NSW Government: Sydney. http://www.treasury.nsw.gov.au/__data/assets/pdf_file/0012/3135/risk_allocation.pdf
34 Deloitte, Closing the Infrastructure Gap: The Role of Public-Private Partnerships. 2006, Deloitte Research: NPD. http://www.infrastructureaustralia.gov.au/files/Closing_the_Infrastructure_GapThe_role_of_PPPs_Deloitte_2006.pdf
35 Infrastructure Australia, National Public Private Partnership Guidelines. 2008, Australian Government: Canberra. http://www.infrastructureaustralia.gov.au/files/National_PPP_Guidelines-Vol_4_PSC_Guidance_Dec_08.pdf
36 Furneaux, C.W. and K.A. Brown, Capabilities, institutions and markets: A cross jurisdictional analysis of embedded public values in public works procurement in Australia, in Infrastructure Policies and Public Value panel track, International Research Symposium on Public Management XI. 2007: Potsdam (Germany).
Acknowledgments This paper was developed within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government’s Cooperative Research Centres Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
MAINTENANCE STRATEGIES: A SYSTEMATIC APPROACH FOR SELECTION OF THE RIGHT STRATEGIES Ashraf Labib a
Professor and Associate Dean (Research), University of Portsmouth, Portsmouth Business School, Richmond Building, Portland Street, Portsmouth PO1 3DE, United Kingdom E-mail: [email protected]
In this paper we use Computerised Maintenance Management Systems (CMMSs) to propose a method of selecting the best maintenance strategies. There are fundamental questions that need to be asked with regard to existing CMMSs, such as: What do users really want from a CMMS? Does it support what is happening on the shop floor? Or is it a rather expensive calendar to remind one of when to perform a maintenance schedule? Is it really worth spending so much effort, time and money on a system that is just an electronic calendar or a monitoring device? Do existing CMMSs really contribute to the bottom-line benefits of the company and support the reduction of breakdowns, or are they just a beast to be served by an army of IT specialists? Companies seem to spend a vast amount of resources in acquiring systems to perform data collection (clever databases) and data analysis (clever charts), and the added value to the business is often questionable. They then hope that somehow someone will make sense of the data and eventually things will get better. The key message here is that the aspect of decision support is missing in these systems. In this paper we propose a cohesive model that uses data in a CMMS to help in the selection of the best maintenance strategies. It is an attempt to develop an intelligent system that can support decisions in maintenance.
1
INTRODUCTION
Computerised Maintenance Management Systems (CMMSs) are vital for the co-ordination of all activities related to the availability, productivity and maintainability of complex systems. Modern computational facilities have offered a dramatic scope for improved effectiveness and efficiency in, for example, maintenance. Computerised maintenance management systems (CMMSs) have existed, in one form or another, for several decades. The software has evolved from relatively simple mainframe planning of maintenance activity to Windows-based, multi-user systems that cover a multitude of maintenance functions. The capacity of CMMSs to handle vast quantities of data purposefully and rapidly has opened new opportunities for maintenance, facilitating a more deliberate and considered approach to managing assets. Some of the benefits that can result from the application of a CMMS are:
a) resource control – tighter control of resources;
b) cost management – better cost management and auditability;
c) scheduling – ability to schedule complex, fast-moving workloads;
d) integration – integration with other business systems; and
e) reduction of breakdowns – improved reliability of physical assets through the application of an effective maintenance programme.
The most important factor may be reduction of breakdowns. This is the aim of the maintenance function and the rest are ‘nice’ objectives (or by-products). This is a fundamental issue, as some system developers and vendors, as well as some users, lose focus and compromise reduction of breakdowns in order to maintain standardisation and integration objectives, thus confusing aim with objectives. As a result, the majority of CMMSs on the market suffer from serious drawbacks, as will be shown in the following section.
The term maintenance has many definitions. One comprehensive definition is provided by the UK Department of Trade and Industry (DTI): “the management, control, execution and quality of those activities which will ensure that optimum levels of availability and overall performance of plant are achieved, in order to meet business objectives”. It is worth noting that the definition implies that maintenance is a managerial and strategic activity; today, the term ‘asset management’ is often used instead. It is also worth noting that the word ‘optimum’ was used rather than ‘maximum’, which implies that maintenance is an optimisation case, where both over-maintenance and under-maintenance should be avoided. In this paper an investigation of the characteristics of Computerised Maintenance Management Systems (CMMSs) is carried out in order to highlight the need for them in industry and identify their current deficiencies. This is achieved through the assessment of the state-of-the-art of existing CMMSs. A proposed model is then presented to provide a decision analysis capability that is often missing in existing CMMSs. The effect of such a model is to contribute towards the optimisation of the functionality and scope of CMMSs for enhanced decision analysis support. The system is highly adaptive and has been successfully applied in industry. The proposed model employs a hybrid of intelligent approaches. In this paper, we also demonstrate the use of AI techniques in CMMSs. The paper is organized as follows. Section 2 provides evidence of the existence of ‘black holes’ in the CMMS market. An alternative is provided in Section 3, where a model for decision analysis called the Decision Making Grid (DMG) is introduced. Section 4 describes maintenance policies that are covered by the DMG. This is then followed by a demonstration of incorporating the DMG into a CMMS through a case study in Section 5, with a discussion of the results. The final two sections (Sections 6 and 7) deal with the unmet needs in CMMSs and a discussion of future directions for research.
2
EVIDENCE OF ‘BLACK HOLES’ Most existing off-the-shelf software packages, especially CMMSs and enterprise resource planning (ERP) systems, tend to be ‘black holes’. This term has been coined by the author as a description of systems that are greedy for data input but that seldom provide any output in terms of decision support. In astronomical terms, ‘black holes’ used to be stars at some time in the past and now possess such a high gravitational force that they absorb everything that comes within their field and do not emit anything at all, including light. This is analogous to systems that, at worst, are hungry for data and resources and, at best, provide the decision-maker with information that he/she already knows. Companies consume a significant amount of management and supervisory time compiling, interpreting and analysing the data captured within the CMMS. Companies then encounter difficulties analysing equipment performance trends and their causes as a result of inconsistency in the form of the data captured and the historical nature of certain elements of it. In short, companies tend to spend a vast amount of capital in acquisition of off-the-shelf systems for data collection, but their added value to the business is questionable.
Few books have been published about the subject of CMMSs (Bagadia, 2006), (Mather, 2002), (Cato and Mobley, 2001), and (Wireman, 1994). However, they tend to highlight their advantages rather than their drawbacks. All CMMSs offer data collection facilities; more expensive systems offer formalised modules for the analysis of maintenance data, and the market leaders allow real time data logging and networked data sharing (see Table 1). Yet, despite the observations made above regarding the need for information to aid maintenance management, a ‘black hole’ exists in the row titled ‘Decision analysis’ in Table 1, because virtually no CMMS offers decision support.1 This is a definite problem, because the key to systematic and effective maintenance is managerial decision-making that is appropriate to the particular circumstances of the machine, plant or organisation. This decision-making process is made all the more difficult if the CMMS package can only offer an analysis of recorded data. As an example, when a certain preventive maintenance (PM) schedule is input into a CMMS, for example to change the oil filter every month, the system will simply produce a monthly instruction to change the oil filter and is thus no more than a diary.
Table 1: Facilities Offered by Commercially Available CMMS Packages
A step towards decision support is to vary the frequency of PM depending on the combination of failure frequency and severity. A more intelligent feature would be to generate and prioritise PM according to modes of failure in a dynamic real-time environment. A PM is usually static and theoretical in that it does not reflect shop floor realities. In addition, the PM that is copied from machine manuals is usually inapplicable because: a) all machines work in different environments and would therefore need different PMs; b) machine designers often have a different experience of machine failures and means of prevention from those who operate and maintain them; and c) machine vendors may have a hidden agenda of maximising spare parts replacements through frequent PMs. The use of CMMSs for decision support lags significantly behind the more traditional applications of data acquisition, scheduling and work order issuing. While many packages offer inventory tracking and some form of stock level monitoring, the reordering and inventory holding policies remain relatively simplistic and inefficient. See the work of Exton and Labib (2002) and Labib and Exton (2001). Also, there is no mechanism to support managerial decision-making with regard to inventory policy, diagnostics or setting of adaptive and appropriate preventive maintenance schedules. A noticeable problem with current CMMS packages regards provision of decision support. Figure 1 illustrates how the use of CMMS for decision support lags significantly behind the more traditional applications of data acquisition, scheduling and work-order issuing.
[Figure 1: bar chart of the percentage of systems incorporating each CMMS module – maintenance budgeting, predictive maintenance data analysis, equipment failure diagnosis, inventory control, spare parts requirements planning, material and spare parts purchasing, manpower planning and scheduling, work-order planning and scheduling, equipment parts list, equipment repair history, preventive maintenance planning and scheduling – on a 70–100% scale.]
Figure 1 Extent of CMMS module usage (from Swanson, 1997)
According to Boznos (1998): “The primary uses of CMMS appear to be as a storehouse for equipment information, as well as a planned maintenance and a work maintenance planning tool.” The same author suggests that CMMS appears to be used less often as a device for analysis and co-ordination and that “existing CMMS in manufacturing plants are still far from being regarded as successful in providing team based functions”. He surveyed CMMSs as well as total productive maintenance (TPM) and reliability-centred maintenance (RCM) concepts and the extent to which the two concepts are embedded in existing marketed CMMSs. He concluded that: “It is worrying the fact that almost half of the companies are either in some degree dissatisfied or neutral with their CMMS and that the responses indicated that manufacturing plants demand more user-friendly systems.” This is further proof of the existence of a ‘black hole’. To make matters worse, it appears that there is a new breed of CMMSs that are complicated and lack basic aspects of user-friendliness. Although they emphasise integration and logistics capabilities, they tend to ignore the fact that the fundamental reason for implementing CMMSs is to reduce breakdowns. These systems are difficult to handle for both production operators and maintenance engineers; they are accounting- and/or IT-orientated rather than engineering-orientated. Results of an investigation (EPSRC – GM/M35291) show that managers’ lack of commitment to maintenance models has been attributed to a number of reasons:
a) Managers are unaware of the various types of maintenance models. b) A full understanding of the various models and the appropriateness of these systems to companies is not available. c) Managers do not have confidence in mathematical models due to their complexities and the number of unrealistic assumptions they contain. This correlates with surveys of existing maintenance models and optimisation techniques. Ben-Daya et al. (2001) and Sherwin (2000) have also noticed that the models presented in their work have not been widely used in industry, for several reasons, such as: a) unavailability of data; b) lack of awareness about these models; and c) restrictive assumptions of some of these models.
Finally, here is an extract from Professor Nigel Slack’s (Warwick University) textbook on operations management, containing a critical commentary on ERP implementations (which may equally apply to CMMSs, as many of them nowadays tend to be classified as specialised ERP systems): “Far from being the magic ingredient which allows operations to fully integrate all their information, ERP is regarded by some as one of the most expensive ways of getting zero or even negative return on investment. For example, the American chemicals giants, Dow Chemical, spent almost half-a-billion dollars and seven years implementing an ERP system which became outdated almost as it was implemented. One company, FoxMeyer Drug, claimed that the expense and problems which it encountered in implementing ERP eventually drove it to bankruptcy. One problem is that ERP implementation is expensive. This is partly because of the need to customise the system, understand its implications for the organisation, and train staff to use it. Spending on what some call the ERP ecosystem (consulting, hardware, networking and complementary applications) has been estimated as being twice the spending on the software itself. But it is not only the expense which has disillusioned many companies, it is also the returns they have had for their investment. Some studies show that the vast majority of companies implementing ERP are disappointed with the effect it has had on their businesses. Certainly many companies find that they have to (sometimes fundamentally) change the way they organise their operations in order to fit in with ERP systems. This organisational impact of ERP (which has been described as the corporate equivalent of dental root canal work) can have a significantly disruptive effect on the organisation’s operations.”
Hence, theory and implementation of existing maintenance models are, to a large extent, disconnected. It is concluded that there is a need to bridge the gap between theory and practice through intelligent optimisation systems (e.g. rule-based systems). It is also argued that the success of this type of research should be measured by its relevance to practical situations and its impact on the solution of real maintenance problems. The developed theory must be made accessible to practitioners through IT tools. Efforts need to be made in the data capturing area to provide the necessary data for such models. Obtaining useful reliability information from collected maintenance data requires effort. In the past, this has been referred to as ‘data mining’, as if data can be extracted in its desired form if only it can be found. In the next section we introduce a decision analysis model. We then show how such a model has been implemented for decision support in maintenance systems.
3
APPLICATION OF DECISION ANALYSIS IN MAINTENANCE
The proposed maintenance model is based on the concepts of effectiveness and adaptability. Mathematical models have been formulated for many typical situations. These models can be useful in answering questions such as “how much maintenance should be done on this machine? How frequently should this part be replaced? How many spares should be kept in stock? How should the shutdown be scheduled?” It is generally accepted that the vast majority of maintenance models are aimed at answering efficiency questions, that is questions of the form “how can this particular machine be operated more efficiently?”, and not at effectiveness questions, like “which machine should we improve and how?”. The latter question is often the one in which practitioners are interested. From this perspective it is not surprising that practitioners are often dissatisfied if a model is directly applied to an isolated problem. This is precisely why, in the integrated approach proposed by the author, efficiency analysis (do the things right) is preceded by effectiveness analysis (do the right thing). Hence, two techniques were employed to illustrate the above-mentioned concepts, namely the fuzzy logic rule-based Decision Making Grid (DMG) and the Analytic Hierarchy Process (AHP), as proposed by Labib et al. (1998). The proposed model is illustrated in Figure 2. The Decision-Making Grid (DMG) acts as a map where the performances of the worst machines are placed based on multiple criteria. The objective is to implement appropriate actions that will lead to the movement of machines towards an improved state with respect to multiple criteria. These criteria are determined through prioritisation based on the Analytic Hierarchy Process (AHP) approach. The AHP is also used to prioritise failure modes and fault details of components of critical machines within the scope of the actions recommended by the DMG. The model is based on identification of criteria of importance such as downtime and frequency of failures. The DMG then proposes different maintenance policies based on the
state in the grid. Each system in the grid is further analyzed in terms of prioritisations and characterisation of different failure types and main contributing components.
[Figure 2 depicts the overall model: a Decision Making Grid (strategic overall map) with downtime (Low, Medium, High) and frequency (Low, Medium, High) axes containing the policies OTF (Operate To Failure), FTM (Fixed Time Maintenance), SLU (Skill Level Upgrade), CBM (Condition Based Monitoring) and DOM (Design Out Maintenance). Each machine/system placed on the grid is then analysed against multiple criteria (downtime, frequency, spare parts, bottleneck) and its failure categories (electrical, mechanical, hydraulic, pneumatic) and fault details (motor faults, panel faults, switch faults, no power faults), giving prioritised, focused actions under fixed rules and flexible strategies.]
Figure 2 Decision Analysis Maintenance System
4
MAINTENANCE POLICIES
Maintenance policies can be broadly categorised into technology- or systems-oriented, human-factors-oriented, and monitoring- and inspection-oriented approaches. RCM is a technology-based concept in which the reliability of machines is emphasised. RCM is a method for defining the maintenance strategy in a coherent, systematic and logical manner. It is a structured methodology for determining the maintenance requirements of any physical asset in its operating context. The primary objective of RCM is to preserve system function. The RCM process consists of looking at the way equipment fails, assessing the consequences of each failure (for production, safety, etc.), and choosing the correct maintenance action to ensure that the desired overall level of plant performance (i.e. availability, reliability) is met. The term RCM was originally coined by Nolan and Heap (1979). For more details on RCM see Moubray (1991, 2001), and Netherton (2000). TPM is a human-based technique in which maintainability is emphasised. TPM is a tried and tested way of cutting waste, saving money, and making factories better places to work. TPM gives operators the knowledge and confidence to manage their own machines. Instead of waiting for a breakdown, then calling the maintenance engineer, they deal directly with small problems, before they become big ones. Operators investigate and then eliminate the root causes of machine errors. Also, they work in small teams to achieve continuous improvements to the production lines. For more details on TPM see Nakajima (1988), Hartmann (1992), and Willmott (1994). Condition Based Maintenance (CBM) – not Condition Based Monitoring – is a sensing technique in which availability based on inspection and follow-up is emphasised. In the British Standards, CBM is defined as “the preventive maintenance initiated as a result of knowledge of the condition of an item from routine or continuous monitoring” (BS 3811, 1984). It is the means whereby sensors, sampling of lubricant products, and visual inspection are utilised to permit continued operation of critical machinery and avoid catastrophic damage to vital components. The integral components for the successful application of condition monitoring of machinery are: reliable detection, correct diagnosis, and dependable decision-making. For more details on CBM, see Brashaw (1998), and Holroyd (2000). The proposed approach in this paper is different from the above-mentioned ones in that it offers a decision map, adaptive to the collected data, which suggests the appropriate use of RCM, TPM, and CBM.
5
THE DMG THROUGH AN INDUSTRIAL CASE STUDY
This case study demonstrates the application of the proposed model and its effect on asset management performance. The application of the model is shown through the experience of a company seeking to achieve World-Class status in asset management. The company has implemented the proposed model, which has had the effect of reducing total downtime from an average of 800 hours per month to less than 100 hours per month, as shown in Figure 3. 5.1 COMPANY BACKGROUND AND METHODOLOGY In this particular company there are 130 machines, varying from robots and machine centres to manually operated assembly tables. Notice that in this case study only two criteria are used (frequency and downtime). However, if more criteria are included, such as spare parts cost and scrap rate, the model becomes multi-dimensional, with low, medium and high ranges for each identified criterion. The methodology implemented in this case was to follow three steps: i. Criteria Analysis, ii. Decision Mapping, and iii. Decision Support.
[Figure 3: bar chart of breakdown trends in hours per month (0–1,200 hrs), November to November.]
Figure 3 Total breakdown trends per month
5.2 STEP 1: CRITERIA ANALYSIS As indicated earlier, the aim of this phase is to establish a Pareto analysis of two important criteria: Downtime, the main concern of production, and Frequency of Calls, the main concern of asset management. The objective of this phase is to assess how bad the worst performing machines are over a certain period of time, say one month. The worst performers on both criteria are sorted and grouped into High, Medium and Low sub-groups. These ranges are selected so that machines are distributed evenly within each criterion. This is presented in Figure 4. In this particular case, the total number of machines is 120. Machines include CNCs, robots, and machine centres.
[Figure 4 lists the ten worst-performing machines on each criterion:
Downtime (hrs): Machine [A] 30, [B] 20, [C] 20, [D] 17 (High); [E] 16, [F] 12, [G] 7 (Medium); [H] 6, [I] 6, [J] 4 (Low). Sum of top 10 = 138; sum of all = 155 (89%).
Frequency (No. of calls): Machine [G] 27, [C] 16, [D] 12 (High); [A] 9, [I] 8, [E] 8, [K] 8 (Medium); [F] 4, [B] 3, [H] 2 (Low). Sum of top 10 = 97; sum of all = 120 (81%).]
Figure 4 Step 1: Criteria Analysis
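A minimal sketch of this criteria-analysis step is given below, using a subset of the downtime and frequency values reported in Figure 4; the Low/Medium/High cut-off values are illustrative placeholders, not the values used by the company.

```python
# Sketch of Step 1 (criteria analysis): rank machines on each criterion and
# attach a Low/Medium/High band. Cut-off values are illustrative placeholders.

def band(value, low_max, med_max):
    """Bin a criterion value into 'Low', 'Medium' or 'High'."""
    if value <= low_max:
        return "Low"
    return "Medium" if value <= med_max else "High"

def criteria_analysis(downtime_hrs, call_frequency):
    """Return a per-machine report plus Pareto rankings (worst first)."""
    report = {}
    for name in downtime_hrs:
        report[name] = {
            "downtime_hrs": downtime_hrs[name],
            "downtime_band": band(downtime_hrs[name], low_max=10, med_max=20),
            "frequency": call_frequency.get(name, 0),
            "frequency_band": band(call_frequency.get(name, 0), low_max=5, med_max=10),
        }
    by_downtime = sorted(report, key=lambda m: report[m]["downtime_hrs"], reverse=True)
    by_frequency = sorted(report, key=lambda m: report[m]["frequency"], reverse=True)
    return report, by_downtime, by_frequency

# A subset of the machines from Figure 4 (hours and number of calls per month).
downtime_hrs = {"A": 30, "B": 20, "C": 20, "D": 17, "E": 16, "F": 12, "G": 7}
call_frequency = {"A": 9, "B": 3, "C": 16, "D": 12, "E": 8, "F": 4, "G": 27}

report, by_downtime, by_frequency = criteria_analysis(downtime_hrs, call_frequency)
print(by_downtime[:3])    # worst three by downtime:  ['A', 'B', 'C']
print(by_frequency[:3])   # worst three by frequency: ['G', 'C', 'D']
```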
5.3 STEP 2: DECISION MAPPING The aim of this step is twofold: it scales the High, Medium and Low groups so that the genuinely worst machines on both criteria can be monitored on the grid, and it monitors the performance of different machines and suggests appropriate actions. The next step is to place the machines in the "Decision Making Grid" shown in Figure 5 and, accordingly, to recommend asset management decisions to management. This grid acts as a map on which the performances of the worst machines are placed based on multiple criteria. The objective is to implement appropriate actions that will lead to the movement of machines towards the north-west section of low downtime and low frequency. In the top-left region, the action to implement, or the rule that applies, is OTF (operate to failure). The rule that applies for the bottom-left region is SLU (skill level upgrade), because data collected from breakdowns – attended by maintenance engineers – indicates that machine [G] has been visited many times (high frequency) for limited periods (low downtime). In other words, maintaining this machine is a relatively easy task that can be passed to operators after upgrading their skill levels. Machines located in the top-right region, such as machine [B], are problematic – in maintenance terms, "killers". Such a machine does not break down frequently (low frequency), but when it stops it is usually a big problem that lasts for a long time (high downtime). In this case the appropriate action to take is to analyse the breakdown events and closely monitor its condition, i.e. condition based monitoring (CBM). A machine that enters the bottom-right region is considered to be one of the worst performing machines based on both criteria. It is a machine that maintenance engineers are used to seeing not working rather than performing its normal operating duty. A machine of this category, such as machine [C], will need to be structurally modified and major design-out projects need to be considered; hence the appropriate rule to implement will be design out maintenance (DOM). If one of the antecedents is a medium downtime or a medium frequency, then the rule to apply is to carry on with the preventive maintenance schedules. However, not all of the mediums are the same. There are some regions near the top-left corner where the FTM (Fixed Time Maintenance) is "easy", because the machine is near the OTF region and it is a matter of re-addressing who will perform the instruction or when the instruction will be implemented. For example, machines [I] and [J] are situated in the region between OTF and SLU, and the question is about who will carry out the instruction – operator, maintenance engineer, or subcontractor. Also, a machine such as machine [F] has been shifted from the OTF region due to its relatively higher downtime, and hence the timing of instructions needs to be addressed. Other preventive maintenance schedules need to be addressed in a different manner. The "difficult" FTM issues are the ones related to the contents of the instruction itself. It might be the case that the wrong problem is being solved or the right one is not being solved adequately. In this case, machines such as [A] and [D] need to be investigated in terms of the contents of their preventive instructions, and expert advice is needed.
[Figure 5: the Decision Making Grid with downtime (Low, Medium, High) columns and frequency (Low, Medium, High) rows. The corner cells hold O.T.F. (low frequency, low downtime), C.B.M. (low frequency, high downtime), S.L.U. (high frequency, low downtime) and D.O.M. (high frequency, high downtime); the intermediate cells hold F.T.M., qualified by the questions When?, Who?, How? and What?. The machines [A]–[J] from Figure 4 are placed in the grid accordingly, e.g. [G] in S.L.U., [B] in C.B.M. and [C] in D.O.M.]
Figure 5 Step 2: Decision Mapping
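A minimal sketch of this decision-mapping step is given below. The cell assignments follow the description above, while the numerical Low/Medium/High boundaries are illustrative assumptions rather than the company's actual values.

```python
# Sketch of Step 2: map a machine's (frequency, downtime) to a DMG policy.
# Threshold values are illustrative; cell assignments follow the text above.

def band(value, low_max, med_max):
    """Bin a criterion value into 'Low', 'Medium' or 'High'."""
    if value <= low_max:
        return "Low"
    return "Medium" if value <= med_max else "High"

DMG_POLICY = {
    ("Low", "Low"): "OTF (operate to failure)",
    ("Low", "Medium"): "FTM (When?)",
    ("Low", "High"): "CBM (condition based monitoring)",
    ("Medium", "Low"): "FTM (Who?)",
    ("Medium", "Medium"): "FTM",
    ("Medium", "High"): "FTM (What?)",
    ("High", "Low"): "SLU (skill level upgrade)",
    ("High", "Medium"): "FTM (How?)",
    ("High", "High"): "DOM (design out maintenance)",
}

def dmg_policy(frequency, downtime_hrs,
               freq_bounds=(5, 10), downtime_bounds=(10, 20)):
    freq_band = band(frequency, *freq_bounds)
    down_band = band(downtime_hrs, *downtime_bounds)
    return DMG_POLICY[(freq_band, down_band)]

print(dmg_policy(frequency=27, downtime_hrs=6))   # frequent, short stoppages -> SLU
print(dmg_policy(frequency=2, downtime_hrs=40))   # rare but long stoppages   -> CBM
```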
5.4 STEP 3: MULTILEVELED DECISION SUPPORT Once the worst performing machines are identified and the appropriate action is suggested, it is now a case of identifying a focused action to be implemented. In other words, we need to move from the strategic systems level to the operational component level. Using the Analytic Hierarchy Process (AHP), one can model a hierarchy of levels related to objectives, criteria, failure categories, failure details and failed components. For more details on the AHP readers can consult Saaty (1988). This step is shown in Figure 6.
Figure 6 Step 3: Decision Support
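As a minimal sketch of this prioritisation step (using the common geometric-mean approximation of the principal-eigenvector weights rather than any specific implementation used in the case study; the comparison values below are hypothetical), priorities for one level of the hierarchy in Figure 6 can be derived from a pairwise comparison matrix:

```python
# Minimal AHP sketch: derive priority weights for one level of the hierarchy
# from a reciprocal pairwise comparison matrix (geometric-mean approximation).
import numpy as np

def ahp_priorities(pairwise):
    """Return normalised priority weights for a reciprocal comparison matrix."""
    pairwise = np.asarray(pairwise, dtype=float)
    geo_means = np.prod(pairwise, axis=1) ** (1.0 / pairwise.shape[0])
    return geo_means / geo_means.sum()

# Hypothetical pairwise comparison of four failure categories of one critical
# machine (electrical, mechanical, hydraulic, pneumatic) on Saaty's 1-9 scale.
A = [[1,   3,   5,   7],
     [1/3, 1,   3,   5],
     [1/5, 1/3, 1,   3],
     [1/7, 1/5, 1/3, 1]]

print(ahp_priorities(A).round(3))  # weights sum to 1; 'electrical' dominates here
```

The same calculation can be repeated level by level so that, as described in the next paragraph, a global priority is obtained for every element in the lowest level of the hierarchy.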
The AHP is a mathematical model developed by Saaty (1980) that prioritises every element in the hierarchy relative to other elements in the same level. The prioritisation of each element is achieved with respect to all elements in the level above. Therefore, we obtain a global prioritised value for every element in the lowest level. In doing that we can then compare the prioritised Fault Details (Level 4 in Figure 6) with PM signatures (keywords) related to the same machine. PMs can then be varied accordingly in an adaptive manner to shop floor realities. The proposed decision analysis maintenance model, as shown previously in Figure 2, combines both fixed rules and flexible strategies, since machines are compared on a relative scale. The scale itself is adaptive to machine performance with respect to identified criteria of importance. Hence the flexibility concept is embedded in the proposed model.
Fuzzy Logic Rule based Decision Making Grid
In practice, however, there are two cases where one needs to refine the model. The first case is when two machines are located near to each other but on different sides of a boundary between two policies: two different policies are applied despite a minor performance difference between the machines. The second case is when two machines are on extreme opposite sides of the quadrant of a certain policy: the same policy is applied despite the fact that they are not near each other. Both cases are illustrated in Figure 7. For both cases we can apply the concept of fuzzy logic, where boundaries are smoothed and rules are applied simultaneously with varying weights.
Figure 7 Special cases for the DMG model
In fuzzy logic, one needs to identify membership functions for each controlling factor, in this case frequency and downtime, as shown in Figures 8 (a) and (b). A membership function defines a fuzzy set by mapping crisp inputs from its domain to degrees of membership (0,1). The scope/domain of the membership function is the range over which the membership function is mapped. Here the domain of the fuzzy set Medium Frequency is from 10 to 40 and its scope is 30 (40-10), whereas the domain of the fuzzy set High Downtime is from 300 to 500 and its scope is 200 (500-300), and so on. The output strategies also have a membership function, and we have assumed a cost (or benefit) function that is linear and follows the relationship (DOM > CBM > SLU > FTM > OTF), as shown in Figure 9 (a). The rules are then constructed based on the DMG grid, giving nine rules in total, for example:
a) If Frequency is High and Downtime is Low Then Maintenance Strategy is SLU (Skill Level Upgrade).
b) If Frequency is Low and Downtime is High Then Maintenance Strategy is CBM (Condition Based Maintenance).
The rules are shown in Figure 9 (b).
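The sketch below is a minimal, zero-order Sugeno-style approximation of this fuzzy refinement rather than the exact controller of Labib et al. (1998); the membership breakpoints and the cost value attached to each strategy are assumptions read approximately from Figures 8 and 9.

```python
# A minimal, zero-order Sugeno-style sketch of the fuzzy Decision Making Grid.
# Membership breakpoints and strategy "cost" values are illustrative assumptions.

def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def shoulder_low(x, b, c):
    """Full membership up to b, falling linearly to 0 at c."""
    return 1.0 if x <= b else max(0.0, (c - x) / (c - b))

def shoulder_high(x, a, b):
    """Zero membership up to a, rising linearly to 1 at b."""
    return 1.0 if x >= b else max(0.0, (x - a) / (b - a))

def fuzzify_frequency(f):
    return {"Low": shoulder_low(f, 10, 25),
            "Medium": tri(f, 10, 25, 40),
            "High": shoulder_high(f, 25, 40)}

def fuzzify_downtime(d):
    return {"Low": shoulder_low(d, 100, 300),
            "Medium": tri(d, 100, 300, 500),
            "High": shoulder_high(d, 300, 410)}

# Assumed output costs preserving the ordering OTF < FTM < SLU < CBM < DOM.
STRATEGY_COST = {"OTF": 10, "FTM": 20, "SLU": 30, "CBM": 40, "DOM": 50}

# The nine rules of the DMG: (frequency label, downtime label) -> strategy.
RULES = {("Low", "Low"): "OTF",    ("Low", "Medium"): "FTM",    ("Low", "High"): "CBM",
         ("Medium", "Low"): "FTM", ("Medium", "Medium"): "FTM", ("Medium", "High"): "FTM",
         ("High", "Low"): "SLU",   ("High", "Medium"): "FTM",   ("High", "High"): "DOM"}

def fuzzy_dmg(frequency, downtime):
    """Return (dominant strategy, crisp cost output) for a machine."""
    mu_f, mu_d = fuzzify_frequency(frequency), fuzzify_downtime(downtime)
    strength = {s: 0.0 for s in STRATEGY_COST}
    for (f_lab, d_lab), strategy in RULES.items():
        strength[strategy] = max(strength[strategy], min(mu_f[f_lab], mu_d[d_lab]))
    crisp = (sum(w * STRATEGY_COST[s] for s, w in strength.items())
             / sum(strength.values()))
    return max(strength, key=strength.get), crisp

print(fuzzy_dmg(frequency=12, downtime=380))  # dominant strategy 'CBM'
```

With these assumed breakpoints, an input of 380 hrs of downtime and a frequency of 12 fires the (Low frequency, High downtime) rule most strongly and returns CBM, in line with the worked example discussed with Figure 11 below.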
[Figure 8: Low, Medium and High membership functions for frequency (0–50 occurrences) and for downtime (0–500 hrs).]
Figure 8 (a) Membership function of Frequency, (b) Membership function of Downtime
[Figure 9(a): output membership functions for the five strategies on a cost axis of 0–50 (× £1,000 per unit), in the order OTF, FTM, SLU, CBM, DOM.]
Figure 9 (a) Output (strategies) membership function, (b) The nine rules of the DMG. The fuzzy decision surface is shown in Figure (10). In this figure, given any combination of frequency (x-axis) and downtime (y-axis) one can determine the most appropriate strategy to follow (z axis).
Figure 10 The Fuzzy Decision Surface.
It can be noticed from Figure 11 that the relationship (DOM > CBM > SLU > FTM > OTF) is maintained. As illustrated in Figure 11, given 380 hrs of downtime and a frequency of 12, the suggested strategy to follow is CBM.
Figure 11 The fuzzy decision surface showing the regions of different strategies
5.5 DISCUSSION The concept of the DMG was originally proposed by Labib (1996). It was then implemented in a company that has achieved World-Class status in maintenance (Labib, 1998(a)). The DMG model has also been extended for use as a technique to deal with crisis management in an award-winning paper (Labib, 1998(b)), where it was presented in the context of crisis management rather than maintenance management. The DMG can be used as a practical continuous improvement process because, when the machines in the top ten have been addressed, they will – if, and only if, appropriate action has been taken – move down the list of the top ten worst machines. When they move down the list, other machines reveal that they need improvement, and resources can then be directed towards the new offenders. If this practice is used continuously, then eventually all machines will be running optimally. If problems are chronic, i.e. regular, minor and usually neglected, some of these could be due to the incompetence of the user, and thus skill level upgrading would be an appropriate solution. However, if machines tend towards RCM then the problems are more sporadic and, when they occur, could be catastrophic. Use of maintenance schemes such as FMEA and FTA can help determine the cause and may help predict failures, thus allowing a prevention scheme to be devised. Figure 12 shows when to apply TPM and RCM. TPM is appropriate in the SLU range, since skill level upgrade of machine tool operators is a fundamental concept of TPM, whereas RCM is applicable to machines exhibiting severe failures (high downtime and low frequency). CBM and FMEA will also be ideal for this kind of machine, and hence an RCM policy will be most applicable. The significance of this approach is that in one model we have RCM and TPM unified, rather than as two competing concepts.
Figure 12 when to apply RCM and TPM in the DMG
Figure 13 Parts of PM schedules that need to be addressed in the DMG
Generally, the easy Preventive Maintenance (PM), Fixed Time Maintenance (FTM) questions are Who? and When? (efficiency questions). The more difficult ones are What? and How? (effectiveness questions), as indicated in Figure 13.
6
UNMET NEEDS IN RESPONSIVE MAINTENANCE
According to Professor Jay Lee, of the National Science Foundation (NSF) Industry/University Cooperative Research Centre on Intelligent Maintenance Systems (IMS) at the University of Cincinnati, unmet needs in responsive maintenance can be categorised as follows: A. machine intelligence – intelligent monitoring, prediction, prevention and compensation and reconfiguration for sustainability (self-maintenance); B. operations intelligence – prioritisation, optimisation and responsive maintenance scheduling for reconfiguration needs; and C. synchronisation intelligence – autonomous information flow from market demand to factory asset utilisation. It can be concluded that the challenges, and research questions, facing research and development (R&D) concerning next generation maintenance systems are:
a. how to adapt PM schedules to cope dynamically with shop-floor reality;
b. how to feed back information and knowledge gathered in maintenance to the designers;
c. how to link maintenance policies to corporate strategy and objectives; and
d. how to synchronise production scheduling based on maintenance performance.
7
FUTURE DIRECTIONS AND CONCLUSION
Training and educational programmes should be designed to address the considerable gap between the skills that are essential to maximise the potential benefits from these advanced systems and technologies in the area of maintenance and asset management and the skills that currently exist in the maintenance sections of most industries. Existing ERP and CMMS systems tend to put much emphasis on data collection and analysis rather than on decision analysis. Although existing teaching programmes already address some of the issues related to next-generation maintenance systems, there is still room for considering other issues, such as:
a. Emphasis on CMMS and ERP systems in the market, as well as their use and limitations.
b. Design awareness in maintenance and design for maintainability.
c. Learning from failures across different industries and disciplines.
d. Emphasis on prognostics rather than diagnostics.
e. e-maintenance and remote maintenance, including self-powered sensors.
f. Modelling and simulation using OR tools and techniques.
g. AI applications in maintenance.
As the success of systems implementation is based on two factors, human and systems, it is important to develop and nurture skills as well as to use advanced technologies. In this paper, we have investigated the characteristics of Computerised Maintenance Management Systems (CMMSs), highlighted the need for them in industry and identified their current deficiencies. A proposed model was then presented to provide a decision analysis capability that is often missing in existing CMMSs. The effect of such a model was to contribute towards the optimisation of the functionality and scope of CMMSs for enhanced decision analysis support. We have also demonstrated the use of AI techniques in CMMSs. 8
REFERENCES
1
Bagadia, K. (2006) Computerized Maintenance Management Systems Made Easy, McGraw-Hill.
2
Brashaw, C. (1980) Characteristics of acoustic emission (AE) signals from ill fitted copper split bearings, Proc 2nd Int. Conf on Planned Maintenance, Reliability and Quality, ISBN 086339 7867.
3
Ben-Daya, M., Duffuaa, S. O. and Raouf, A. (eds) (2001) Maintenance Modelling and Optimisation, Kluwer Academic Publishers, London.
4
Bongaerts, L., Monostori, L., McFarlane, D. and Kadar, B. (2000) Hierarchy in distributed shop floor control, Computers in Industry, 43, 123-137.
5
Boznos D. (1998) The Use of CMMSs to Support Team-Based Maintenance, MPhil thesis, Cranfield University.
6
Brashaw, C. (1998) Characteristics of acoustic emission (AE) signals from ill fitted copper split bearings, Proc 2nd Int. Conf on Planned Maintenance, Reliability and Quality, ISBN 086339 7867.
7
Cato, W., and Mobley, K. (2001) Computer-Managed Maintenance Systems:A Step-by-Step Guide to Effective Management of Maintenance, Labor, and Inventory, Butterworth Heinemann, Oxford.
8
Exton, T. and Labib, A. W. (2002) Spare parts decision analysis – The missing link in CMMSs (Part II), Journal of Maintenance & Asset Management, 17,14–21.
9
Fernandez, O., Labib, A. W. Walmsley, R. and Petty, D. J. (2003) A decision support maintenance management system: Development and implementation, International Journal of Quality and Reliability Management, 20, 965–979.
10 Hartmann, E. H. (1992) Successfully Installing TPM in a Non-Japanese Plant, TPM Press, Inc., New York.
11 Holroyd, T. (2000) Acoustic Emission & Ultrasonics, Coxamoor Publishing Company, Oxford.
12 Labib, A. W. (2003) Computerised Maintenance Management Systems (CMMSs): A black hole or a black box?, Journal of Maintenance & Asset Management, 18, 16–21.
13 Labib, A. W. and Exton, T. (2001) Spare parts decision analysis – The missing link in CMMSs (Part I), Journal of Maintenance & Asset Management, 16(3), 10–17.
14 Labib, A. W., Williams, G. B. and O’Connor, R. F. (1998) An intelligent maintenance model (system): An application of the analytic hierarchy process and a fuzzy logic rule-based controller, Journal of the Operational Research Society, 49, 745–757.
15 Labib, A. W. (1996) An integrated appropriate productive maintenance, PhD Thesis, University of Birmingham.
16 Labib, A. W. (1998) World-class maintenance using a computerised maintenance management system, Journal of Quality in Maintenance Engineering, 4, 66–75.
17 Labib, A. W. (1998) 1 A Logistic Approach to Managing the Millennium Information Systems Problem, Journal of Logistics Information Management (MCB Press), Vol 11, No 5, pp 285-384, ISSN: 0957-6053.
18 Labib, A. W., Cutting, M. C. and Williams, G. B. (1997) Towards a world class maintenance programme, Proceedings of the CIRP International Symposium on Advanced Design and Manufacture in the Global Manufacturing Era, Hong Kong, 82–88.
1 Received the “Highly Commended Award 1999” from the Literati Club, MCB Press (a publisher of 140 journals), for a paper [Labib, 1998], Journal of Logistics Information Management, MCB Press, 1998.
19 Mather, D. (2002) CMMS: A Timesaving Implementation Process, CRC Press, New York.
20 Moubray, J. (1991) Reliability Centred Maintenance, Butterworth-Heinemann Ltd.
21 Moubray, J. (2001) The case against streamlined RCM, Maintenance & Asset Management, 16, 15-27.
22 Nakajima, S. (1988) Total Productive Maintenance, Productivity Press, Illinois.
23 Netherton, D. (2000) RCM Standard, Maintenance & Asset Management, 15, 12-20.
24 Newport, R. (2000) Infrared thermography: A valuable weapon in the condition monitoring armoury, Journal of Maintenance & Asset Management, 15, 21-28.
25 Nolan, F. and Heap, H. (1979) Reliability Centred Maintenance, National Technical Information Service Report, # A066579.
26 Moubray, J. (1991) Reliability Centred Maintenance, Butterworth-Heinemann Ltd, Oxford.
27 Saaty, T. L. (1980) The Analytic Hierarchy Process: Planning, Priority Setting - Resource Allocation, McGraw-Hill, New York.
28 Saaty, T. L. (1988) The Analytic Hierarchy Process, Pergamon Press, New York.
29 Sherwin, D. (2000) A review of overall models for maintenance management, Journal of Quality in Maintenance Engineering, 6, 138-164.
30 Shorrocks, P. and Labib, A. W. (2000) Towards a multimedia-based decision support system for world class maintenance, Proceedings of the 14th ARTS (Advances in Reliability Technology Symposium), IMechE, University of Manchester.
31 Slack, N., Chambers, S. and Johnston, R. (2004) Operations Management, 4th edition, Prentice Hall.
32 Swanson, L. (1997) Computerized Maintenance Management Systems: A study of system design and use, Production and Inventory Management Journal, Second Quarter: 11-14.
33 Willmott, P. (1994) Total Productive Maintenance: The Western Way, Butterworth Heinemann Ltd., Oxford.
34 Wireman, T. (1994) Computerized Maintenance Management Systems, 2nd edition, Industrial Press Inc., New York.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
TOWARDS VALUE-BASED ASSET MAINTENANCE Ali Rezvani b, Rengarajan Srinivasan a, Farnaz Farhan b, Ajith Kumar Parlikad a and Mohsen Jafari b
a Cambridge University Engineering Department, Cambridge, UK CB2 1RX.
b Dept. of Industrial & Systems Engineering, Rutgers University, New Brunswick, NJ 08854, USA.
The management of assets such as equipment and infrastructure can be a challenging task, and optimizing their usage is critical. Consequently, the importance of the maintenance function has increased because of its role in ensuring and improving asset performance and safety. Over the past few decades, there has been increasing interest in the area of maintenance modelling and optimization. More recently, the focus of research has been to take a whole-life perspective of the asset, and optimise maintenance decisions across the complete asset lifecycle. Most of the current research in this area takes cost (e.g., Life Cycle Cost) as the primary objective for optimisation (minimisation). These approaches do not effectively represent the role of maintenance because they do not consider the performance improvement organisations can expect to gain by proper maintenance – and this, we feel, is a key limitation. In order to use maintenance as a “value driver” for the organisation, one must move away from cost-based thinking to value-based thinking. An important step in this direction is to consider net present value/utility of the decisions as the objective function, which will be discussed in detail in this paper. Nevertheless, the key parameters that are involved in an NPV or MVA based optimisation are still related to cost or “money” in general. As a result, we would miss a number of other value-drivers that would be affected by maintenance. Examples of these are quality of products/service, customer satisfaction, environmental impact, etc. In this paper, we examine the possible elements of value provided by assets to the organisation owning those assets, and discuss how these value-drivers are affected by maintenance decisions. In this direction, we propose a measure - Value of Ownership (VOO) - to assess the value of maintenance and the performance of maintenance decisions throughout an asset's lifecycle. This measure will consider the different value-drivers of a decision and make it possible to consider the impact of maintenance decisions on a broader value space. Key Words: Asset Management, Maintenance, Value of Ownership 1
1 INTRODUCTION
Asset management is a strategic approach to the building, preservation and operation of facilities with improved asset performance, encompassing multiple business processes and relying on good information and analytic capabilities [NCHRP 20-22(11)]. By looking at the whole lifecycle of the asset, the asset management system is capable of taking a more logical approach to decision-making and provides a framework for both short- and long-term planning. Maintenance and operation is the next stage in the lifecycle of every asset after its build/procurement. The role of maintenance has been gaining importance in recent years as it has a major influence on organisational performance. Apart from increasing the service life of the asset, maintenance also impacts production, quality, safety, and social and environmental aspects, which emphasises the need for efficient and timely decisions. More recently, the focus of research has been to take a holistic life cycle approach, with decisions made from this perspective. However, these approaches are predominantly cost centric and mainly focussed on cost reduction. These models either have the objective function of minimizing the maintenance cost with constraints on service availability, or of maximizing the availability with constraints
on maintenance cost. Most performance models used in the literature focus on minimizing the system’s maintenance cost rate while ignoring other dimensions of the system such as reliability performance (Wang, 2002). Cost based asset management fails to identify the value in owning and maintaining the functionality provided by the asset. This can lead to a sub-optimal asset management strategy and to premature and costly decisions. Moreover, the existing holistic approaches to understanding the impact of maintenance on the asset life cycle and the organisation’s performance are cost based. For example, proper maintenance has implications for functionality as well as economic, energy, safety and environmental impacts, and a purely cost based approach ignores these important dimensions. In this paper, we propose a value-based maintenance approach in order to understand the value of maintenance actions over the asset’s life cycle. We also explore the various dimensions of value based and cost based asset maintenance approaches. This paper is structured as follows: Section 2 analyses the existing literature on value based maintenance approaches. Section 3 describes the various value dimensions and maps them to maintenance actions. Section 4 proposes a new value based approach for maintenance and Section 5 illustrates its application using a case study. Section 6 concludes with reference to future work.
2 LITERATURE REVIEW
In this section we describe the existing literature on value based maintenance. Liyanage and Kumar (Liyanage & Kumar, 2003) developed a value based conceptual framework to measure the performance of oil and gas companies. Their argument was for adopting new decision making criteria based not exclusively on economics, but also on social and environmental considerations. They used four classes of values: resource and competence based values, capability based values, plant-process condition based values and delivery based values. The research primarily focussed on the value of the operating and maintenance process for the overall performance of the organisation, and hence does not focus on the value of maintenance for the asset. Rosqvist et al have developed value-driven maintenance planning for a production plant (Rosqvist, Laakso, & Reunanen, 2009). The framework involved defining objectives, classifying equipment locations with associated functional requirements and then selecting maintenance strategies and tasks. It is vital to define strategic and functional objectives, and classifying assets based on functional requirements then identifies value creating segments. When the maintenance plan is drawn up in this way, the authors noted, the objectives set will be achieved, thereby creating value. Dwight (Dwight, 1995; Dwight, 1999) used a system audit approach to understand the value based performance of maintenance with respect to organisational success. The concept of value arises from the organisation’s goal of increasing future wealth and values. The value was based on future cash flow measurement using the formula
Performance, $P_i = \frac{V_r - V_l}{V^*}$
where V* is the best known value that can be realised in a given period. The numerator signifies the achieved value, which is the difference between the value realised Vr and the future value lost Vl. Marais and Saleh (Marais & Saleh, 2009) have proposed a metric for capturing and quantifying the value of a maintenance activity by considering a system that exhibits multi-state failures and setting a price for the service that it provides per unit of time. Using this method they have priced interruptions in service due to failures. They have also calculated the value of maintenance as the difference between the post- and pre-maintenance value of the system, minus the cost of the maintenance. In another paper, Macke and Higuchi (Macke & Higuchi, 2007) discussed optimal maintenance planning that maximizes the net present benefit rate over the lifetime by finding the optimal sequence of intervention times and rehabilitation levels. Most of the value based approaches are oriented towards finding organisational values; however, it is vital to understand the value created by maintenance on an asset and its contribution to organisational success. The key to this is to understand the value creating elements due to maintenance. For example, maintenance adds value through improvements in operational efficiency, reductions in maintenance cost, better production performance and increased safety. In the next section we describe the various value dimensions of maintenance.
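As a purely illustrative sketch of the Marais and Saleh style calculation just described, the short Python fragment below (with invented numbers and a function name of our own choosing, not taken from the paper) computes the net value of a maintenance activity as the post-maintenance value of the system minus its pre-maintenance value, less the cost of the intervention.

```python
def net_maintenance_value(value_after, value_before, maintenance_cost):
    """Net value of a maintenance activity, in the spirit of Marais & Saleh:
    the change in system value the activity produces, minus its cost."""
    return (value_after - value_before) - maintenance_cost

# Illustrative numbers only: the system is worth 120 (in discounted service
# revenue) if maintained and 95 if left alone; the intervention costs 10.
print(net_maintenance_value(value_after=120.0, value_before=95.0,
                            maintenance_cost=10.0))  # prints 15.0
```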
3 VALUE DIMENSIONS VS MAINTENANCE DIMENSIONS
Value can be defined as the relationship between the satisfaction of many different needs and the resources used in doing so (Institute of Value Management United Kingdom); for example, infrastructure assets create value by satisfying the need for distribution of resources and essential services to the public. Any decision made towards addressing a community’s needs through maintenance or the creation of new infrastructure affects its sustainability in three different dimensions:
• Economic Sustainability: profitability through efficient use of resources
• Social Sustainability: responding to the needs of society
• Environmental Sustainability: preventing harmful and irreversible effects on the environment by efficient use of natural resources
These sustainability dimensions are later used for measuring the value of different actions. In a general maintenance problem the objective can be maximization of the level of service given certain cost constraints, or minimization of maintenance cost given certain constraints on the level of service. There are many different models that consider the cost of maintenance either in the objective function or in the constraints; however, few of them consider the value of maintenance. Maintenance/improvement actions have positive or negative effects on the three value dimensions mentioned above. In a cost based framework, maintenance actions are viewed as cost centres, whereas in a value based approach the goal is to come up with a maintenance strategy that maximizes the difference between the future expected return and the maintenance expenditure. If such a strategy does not exist and maintenance costs cannot be covered by the future expected return, the asset is outdated/obsolete and a replacement strategy should be considered (Marais & Saleh, 2009).
Figure 1: Value Dimensions
Depending on the context of the problem, different value functions should be used to show the importance and desirability of achieving different performance levels in the value dimensions. For an engineering asset, different metrics can be defined at the equipment and organizational levels. At the equipment level the performance measures are reliability, availability, maintainability, cost and safety; at the organizational level the performance measures are technical measures, economic, safety/environmental, process/task-related efficiency, customer satisfaction, employee satisfaction and learning/growth/skill. These metrics can be mapped onto the main value dimensions to create the value generation vector for an asset, $\tilde{V}_r = f(P_{m1}, P_{m2}, \ldots, P_{mn})$, which can later be used for calculating the asset value.
As assets are operated and maintained over their life cycle and their values are realised, it is important to measure the value created. In the next section, we will describe a new value based approach to quantify the value of maintenance.
4 VALUE BASED APPROACH
As most of the existing performance measurements are oriented towards performance measurement systems and linking maintenance objectives with corporate goals, we will take an asset-focussed approach to identify the value the asset offers to the organisation. The challenge is therefore to identify a global performance indicator of the kind defined by Pintelon (Pintelon & Puyvelde, 1997), which should express the life cycle value of the asset due to the impact of maintenance. Current performance indicators for maintenance are predominantly focussed on technical and financial measures. Some common measures include mean time between failures, downtime, cost of maintenance, life cycle cost (LCC) and total cost of ownership. The LCC approach has been widely used to understand the impact of decisions made in different stages of an asset’s life cycle. However, this approach provides more insight into investment decisions and is less suitable for understanding the impact of maintenance during the operational lifecycle stage. The LCC approach uses acquisition costs, operating costs and loss costs, which do not provide an indication of areas of maintenance such as reliability and efficiency. It predominantly focuses on cost reduction and does not provide an indication of the value or the revenue generated by the asset. Concepts such as Total Cost of Ownership (TCO) and Value of Ownership (VOO) are more suitable for understanding the outcome of maintenance actions over the life cycle of the asset. Overall Equipment Effectiveness (OEE) has been used as a measure for Total Productive Maintenance (TPM) (Nakajima, 1989); OEE is the product of availability, performance and quality. The value of an asset can be quantified by measuring its operational effectiveness, performance, affordability and safety, and determining how these can be retained over the asset’s life cycle. Moreover, with increasing demands for sustainability and attention to environmental issues, it is vital for organisations to include these factors in determining the value. So we can define the value of owning an asset as a linear combination of all its value creating elements, discounted for future values, as given by the following equation:
$\mathrm{Value\ of\ Asset} = \sum_{t} \frac{1}{(1+r)^t} \cdot C_r \cdot \sum_{i} \gamma_i \cdot \left( V_i(t) - C_i(t) \right)$
Here t is the lifetime of the asset, r is the discount factor and Cr is the criticality of the asset. Criticality is the risk factor which captures the probability of failure and the consequence of failure. The index i runs over the value creating elements, which can include technical, safety and environmental values. Vi(t) signifies the value created by each segment during the period t and Ci(t) is the cost incurred in achieving that value. For example, OEE will be one of the technical values and the corresponding cost will be the cost of maintenance, labour, etc. The factor γi is the importance factor associated with each value element. This can signify which value element is vital for decision making. For example, if the asset is non-critical, the γi factor will be lower for the safety and environmental segments; the cost factor, in turn, will help organisations to control the economic aspects of non-critical assets. If the asset is safety-critical, then the decision maker can make sure the safety level is maintained at an optimal cost to monitor the performance. The γi factors can be found using historical operating information and existing knowledge. They can also be estimated using statistical methods such as linear regression, given the historical information. Most of the units of measurement of performance in maintenance are either time based or cost based. For example, Mean Time Between Failures (MTBF) and Mean Time To Repair (MTTR) are time based units, whereas cost based metrics are maintenance, labour and spare costs. In the above equation, the units will be "utils" from utility theory, which is a real valued number. As an example, if we just take the technical value of the asset, we can define the value of owning the asset using the cost of ownership model (Sohn & Moon, 2003):
$VOO = \sum_{t} \frac{1}{(1+r)^t} \cdot C_r \cdot \frac{TPT(t) + Y(t) + U(t)}{C_F(t) + C_V(t) + C_Y(t)}$
where CF(t) is the fixed cost, CV(t) is the variable cost, CY(t) is the yield loss cost, TPT(t) is the throughput, Y(t) is the composite yield and U(t) is the utilisation. CY(t), the opportunity loss cost, can be taken as the number of shutdowns times the average loss per shutdown. Y(t) and U(t) can be replaced by the overall equipment effectiveness or the ratio of operating time to available time. These factors provide an indication of the functionality provided by the asset. In a production plant they will signify the amount of production and the revenue generated. However, this depends on the demand and on how much one is willing to increase spending to achieve the desired performance. Instead of a purely cost centric approach to asset management, a value metric will have wider implications as a performance measure, such as the inclusion of social and environmental factors. As the value provided by an asset differs across industrial contexts, it is vital to understand the value creation elements and evaluate the value added by maintenance and information management. In an asset management problem, the objective of maintenance is to keep the asset at a service level that meets the required service level. This target service level can be calculated using the availability and capacity of the asset. In such a setting, deviating from the required service level will cause additional cost to the asset owner: excess capacity shows itself as additional maintenance cost, while under-availability of the system leads to lost sales. A model focused on minimizing the operational cost will keep the availability of the system at the required availability. However, such a method fails to realise the value of the excess capacity that results from maintaining the asset at a higher availability. In a value based approach a maintenance action is taken only if it has a positive effect on the VOO of an asset. This effect is measured using the value rate vector introduced earlier in the study. To make a maintenance decision based on the defined performance measures and value metrics, the pre- and post-maintenance values of the asset are calculated to select the proper maintenance action:
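As a purely illustrative sketch of how the Value of Asset expression above might be evaluated, the Python fragment below uses hypothetical function and variable names of our own choosing (not the paper's) to form the discounted, criticality-weighted sum of (value - cost) over the value-creating elements.

```python
def value_of_asset(horizon, r, criticality, gamma, value, cost):
    """Discounted, criticality-weighted sum of (value - cost) over all
    value-creating elements i and periods t (illustrative only)."""
    total = 0.0
    for t in range(1, horizon + 1):
        discount = 1.0 / (1.0 + r) ** t
        period_net = sum(g * (value[i][t - 1] - cost[i][t - 1])
                         for i, g in enumerate(gamma))
        total += discount * criticality * period_net
    return total

# Two value elements (say, technical and safety) over a three-year horizon;
# the importance weights, values and costs are invented for illustration.
gamma = [0.7, 0.3]
value = [[100.0, 105.0, 110.0], [20.0, 20.0, 20.0]]
cost = [[40.0, 42.0, 45.0], [5.0, 5.0, 5.0]]
print(value_of_asset(horizon=3, r=0.05, criticality=0.9,
                     gamma=gamma, value=value, cost=cost))
```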
$V_{\mathrm{maintenance}} = V(\mathrm{asset} \,/\, \mathrm{no\ maintenance}) - V(\mathrm{asset} \,/\, \mathrm{maintenance\ action(s)})$
The above formula can be used to understand the implications of a particular maintenance strategy. This allows the decision maker to view the maintenance function as a value adding activity rather than purely as a cost. The formula compares reactive versus proactive maintenance, and thus provides the savings or value achieved through maintenance. In the next section, we illustrate the advantages of value based maintenance through a case study.
5 CASE STUDY
For a deteriorating asset, the optimal maintenance strategy can be found using stochastic programming. Consider a deteriorating asset with 4 states, which deteriorates according to a Markov process with transition probability matrix P, and for which the demand over the planning horizon is known.
Figure 2: Asset's state transition graph
If the objective of the maintenance is to maximize the value that the asset creates over the planning horizon, then the problem can be modelled as follows:
• $s_t$: the state variable. It identifies the condition of the asset at each decision making period. The asset is assumed to have a lifetime of 10 years and to deteriorate according to the transition probability matrix P.
• $a_t$: the maintenance/improvement action taken to preserve/improve the condition of the asset. Four different maintenance actions (including do nothing) are defined for this asset.
• $\tilde{S}_{t+1} = g(\tilde{S}_t, a_t)$: the transition function from state $s_t$ to state $\tilde{S}_{t+1}$ when maintenance/improvement action $a_t$ is taken.
• $V_t = V(\tilde{S}_t, a_t)$: in value based optimization the contribution function is defined as the difference between the cash inflow from the service sold and the cash outflow of the production and maintenance costs.
• $\max \sum_{t} \frac{1}{(1+r)^t} \cdot C_r \cdot V_t$: as mentioned earlier, the objective of the value based optimization is to maximize the value of ownership of the asset. The objective function here is the discounted sum of the contribution functions over the lifetime of the asset.
If $\tilde{A}_t = [1, 0.8, 0.4, 0]$ gives the availability of the system in states 1 to 4, and the maintenance actions are denoted $a_0$, $a_1$, $a_2$ and $a_3$, in which $a_0$ is do nothing, $a_1$ is a one-state improvement, $a_2$ is a two-state improvement and $a_3$ is complete rehabilitation, then given a demand profile $\tilde{D}$ the optimal maintenance policy and availability curve of the asset can be obtained by solving this MDP. For the case where the demand for operation of the asset is 100% through all 10 years of the planning horizon, the optimal availability of the asset is as follows:
Figure 3: Demand-Availability for Value-Based Method (changing Demand)
The optimal availability of the system depends on the demand profile. Changing the demand profile leads to different optimal maintenance policies and consequently different optimal availability of the asset. Figure 4 shows the new optimal availability curve when the demand profile is changed from 100%.
Figure 4: Demand-Availability for Value-Based Method (Demand = 100%)
In the cost based method the objective of the optimization changes from maximizing the value generated to minimizing the operational cost. The operational cost consists of production costs, maintenance cost and the cost of lost sales. The new objective function for the model can be shown as follows:
$\min(Z) = \sum_{t} \frac{1}{(1+r)^t} \cdot C_r \cdot \sum_{i} \gamma_i \cdot C'_{it}$
where $C'_{it}$ is the operational cost of the asset. The optimal solution of this problem depends on a proper estimation of the cost of lost sales (for the same demand profile, different costs of lost sales lead to different optimal maintenance policies).
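To make the case-study formulation concrete, the sketch below solves a small finite-horizon Markov decision problem of the kind described above by backward induction. All of the numbers (transition matrices, revenue, maintenance costs) are invented for illustration and are not the data behind Figures 3 to 6; the cost-based variant would simply replace the contribution with the negated operational cost, including an assumed cost of lost sales.

```python
import numpy as np

# Illustrative data only: 4 condition states, the availability in each state,
# revenue per unit of availability, maintenance cost per action, and one
# 4x4 transition matrix per action (rows = current state).
avail = np.array([1.0, 0.8, 0.4, 0.0])
revenue_per_avail = 100.0
maint_cost = {0: 0.0, 1: 15.0, 2: 30.0, 3: 60.0}      # a0 .. a3
P = {
    0: np.array([[0.7, 0.2, 0.1, 0.0],                 # a0: do nothing
                 [0.0, 0.6, 0.3, 0.1],
                 [0.0, 0.0, 0.5, 0.5],
                 [0.0, 0.0, 0.0, 1.0]]),
    1: np.array([[0.9, 0.1, 0.0, 0.0],                 # a1: one-state improvement
                 [0.7, 0.2, 0.1, 0.0],
                 [0.0, 0.7, 0.2, 0.1],
                 [0.0, 0.0, 0.7, 0.3]]),
    2: np.array([[0.9, 0.1, 0.0, 0.0],                 # a2: two-state improvement
                 [0.8, 0.2, 0.0, 0.0],
                 [0.7, 0.2, 0.1, 0.0],
                 [0.0, 0.7, 0.2, 0.1]]),
    3: np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (4, 1)),  # a3: full rehabilitation
}
r, horizon, criticality = 0.05, 10, 1.0
demand = np.ones(horizon)                               # 100% demand every year

def contribution(state, action, t):
    """Cash inflow from the service sold minus maintenance cost (value based)."""
    return demand[t] * avail[state] * revenue_per_avail - maint_cost[action]

# Backward induction: V[t, s] is the discounted value-to-go and policy[t, s]
# the value-maximizing action in state s at year t.
V = np.zeros((horizon + 1, 4))
policy = np.zeros((horizon, 4), dtype=int)
for t in reversed(range(horizon)):
    for s in range(4):
        q = [criticality * contribution(s, a, t) / (1 + r) ** t + P[a][s] @ V[t + 1]
             for a in sorted(P)]
        policy[t, s] = int(np.argmax(q))
        V[t, s] = max(q)

print("Optimal first-year action per state:", policy[0])
print("Objective value starting from the best state:", round(V[0, 0], 1))
```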
Figure 5: Optimal Availability - Cost Based Method (Demand = 100%)
Similar to the value based optimization, the optimal performance curve of the asset depends on the demand profile for the asset. Changing the demand profile from 100% to a new demand profile will result in a different optimal availability for the asset. Figures 5 and 6 illustrate the demand-availability relationship based on cost centric decisions.
Figure 6: Optimal Availability - Cost Based Method (Changing Demand)
Looking at maintenance as a cost centre in the lifecycle of an asset leads to different optimal maintenance policies for the asset. Cost-based maintenance policies are designed to minimize the operational cost of the asset. If the objective of the owner is to maximize the value created by the asset rather than to minimize the operational cost, the solutions generated by cost-centred algorithms are not the best policies for the asset owner. Proper estimation of the cost of lost sales plays an important role in the credibility of the outcome of the cost-based model.
6 CONCLUSIONS AND FUTURE WORK
Maintenance is one of the important tasks required to achieve the functionality and the value provided by the asset. It is vital to understand the value of maintenance in maintaining the functionality of the asset. Using a value based approach instead of cost based solutions will enable maximum returns from the asset. One of the key challenges in this approach is to identify and quantify the value creation elements due to maintenance. Hence, the value based asset maintenance objectives can be summarised as:
• Identifying the value creating elements
• Quantifying the realised value
• Optimising or maximising the value
As maintenance impacts different areas of the organisation such as production, quality, safety, and social and environmental factors, it is important to consider these parameters while making maintenance decisions. In the future, we will develop a methodology to quantify value and a maintenance decision framework using a value-based approach. In asset intensive industries, the problem of managing a large number of assets compounds the problem of identifying the value. The value of maintenance can also be derived by taking different perspectives, such as considering a group of assets or a process, for example an assembly line in manufacturing. In such a case the value based decision will need to consider the whole process. These kinds of problems will be addressed in future work to identify, quantify and optimise value-based maintenance decisions. Industrial case studies will be conducted to understand the implications of value based maintenance optimisation.
7 REFERENCES
1 Institute of Value Management United Kingdom (n.d.) Retrieved May 28, 2009, from http://www.ivm.org.uk/whatisivm.php
2 Dwight, R. (1995) New developments in maintenance: An international review. IFRIM.
3 Dwight, R. (1999) Searching for real maintenance performance measures. Journal of Quality in Maintenance Engineering, 5(3), 258-275.
4 Liyanage, J. & Kumar, U. (2003) Towards a value-based view on operations and maintenance performance management. Journal of Quality in Maintenance Engineering, 9, 333-350.
5 Macke, M. & Higuchi, S. (2007) Optimizing maintenance interventions for deteriorating structures using cost-benefit criteria. Journal of Structural Engineering, 925-934.
6 Marais, K. B. & Saleh, J. H. (2009) Beyond its cost, the value of maintenance: An analytical framework for capturing its net present value. Reliability Engineering and System Safety, 94, 644-657.
7 Nakajima, S. (1989) TPM Development Program. Productivity Press.
8 Pintelon, L. & Puyvelde, F. V. (1997) Maintenance performance reporting systems: some experiences. Journal of Quality in Maintenance Engineering, 3, 4-15.
9 Rosqvist, T., Laakso, K. & Reunanen, M. (2009) Value-driven maintenance planning for a production plant. Reliability Engineering and System Safety, 94, 97-110.
10 Sohn, S. Y. & Moon, H. U. (2003) Cost of ownership model for inspection of multiple quality attributes. IEEE Transactions on Semiconductor Manufacturing, 16, 565-571.
11 Wang, H. (2002) A survey of maintenance policies of deteriorating systems. European Journal of Operational Research, 139, 469-489.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ENSURING SUCCESSFUL BUSINESS INTELLIGENCE SYSTEMS IMPLEMENTATION: MULTIPLE CASE STUDIES IN ENGINEERING ASSET MANAGEMENT ORGANISATIONS
William Yeoh a, b, Andy Koronios a, b and Jing Gao a, b
a CRC for Integrated Engineering Asset Management, Brisbane, Australia
b School of Computer and Information Science, University of South Australia, Mawson Lakes 5095, Australia
Recently heightened competition resulting from market deregulation and increased regulatory compliance requirements has demanded greater accountability for decision making in engineering asset management (EAM) organisations. However, the siloed information structures and fragmented business functions of conventional EAM organisations do not support the effective extraction, analysis, and provision of actionable information to improve the decision-making process. In response to this, many EAM organisations have turned their efforts to implementing complex Business Intelligence (BI) systems. But how can the likelihood of BI systems implementation success be increased for diverse EAM organisations with their traditionally strong and fragmented cultures? This paper investigates and discusses the critical success factors (CSFs) influencing BI systems implementation in EAM organisations. Seven in-depth case studies were conducted in EAM organisations. The empirical findings show a clear trend towards multidimensional challenges involved in such a resource-intensive and complex undertaking. The CSFs exist in various dimensions composed of organisation, process, and technology perspectives. Nevertheless, the study reveals that a more fundamental issue concerning the business needs of BI systems may, in the end, impede BI systems success. Therefore, BI stakeholders of EAM organisations are urged to apply a business-orientation approach in tackling implementation challenges and ensuring buy-in from business stakeholders. Key Words: Business intelligence system, Critical success factors, Engineering asset management organisations
1 INTRODUCTION
Like most organisations, many business functions and segments of engineering asset management (EAM) organisations, such as utilities and transportation enterprises, have been computerised and so have acquired immense volumes of transactional data [1, 2]. Indeed, Gartner Research found that EAM organisations spend more than half of their operation budget (and considerable management attention) in managing asset lifecycle information [3]. Several studies indicated that some EAM-related systems (including both operational and administrative systems) work in isolation, yet they control and monitor asset operations and provide administrative support throughout the asset lifecycle [4, 5]. This situation occurs because various software/system vendors offer an assortment of EAM-related modules with different levels of sophistication [1, 2]. For example, there are more than 70 vendors of asset information systems in the UK market alone [6]. In practice, data and information are collected both manually and electronically, in an assortment of formats (even stand-alone databases and spreadsheets) dispersed throughout EAM organisations, resulting in multiple versions of the truth [1, 2]. There has always been a degree of difficulty in accessing critical business information appropriately for a variety of functions and analyses. Although computerised and digitised initiatives are intended to enhance efficiency, issues such as the lack of an interoperable data architecture, disparate asset lifecycle management systems, and inappropriate organisational and staffing arrangements may prevent EAM stakeholders from using the data to full advantage [5]. In other words, these organisations suffer from the inability to properly exploit the managerial information from their pool of data. Such EAM organisations have been described as “drowning in data and starving for information” [6]. Furthermore, there exist significant data quality and integrity issues among these disparate systems [4, 5], and accessing cross-functional and cross-departmental data is a challenging task involving manual intervention and human effort. The plethora of incompatible systems and dissimilar data structures makes it extremely difficult for a system in one department to communicate with systems in other business units, and to trust the quality
of the data. The problems of incompatibility severely limit the decision-support capability and thus the bottom line of an EAM organisation [1, 2]. The traditional ‘stovepipe’ approach that is driven by the goals of individual functional units can be integrated and coordinated to allow for both horizontal and vertical communication [7]. Vertical communication assists senior executives to understand the factors that dictate decisions at the operational level through slice-dice, roll-up, or drill-down applications. Horizontal communication, on the other hand, implies organisational integration for cross-functional information needs, because an effective EAM lifecycle requires inputs ranging from asset-oriented information to finance and human resources. However, the existing ‘stovepipe’ IT infrastructure in many EAM organisations may be suitable for functional efficiency but discourages cross-boundary communication and cooperation [7]. The siloed data and systems could not provide levels of reliability for strategic decision making equal to those they provide for operational business [8]. Although significant efforts have been made to deploy these so-called transactional systems, little has been done to bring the data from these disparate systems to a common decision-making platform. Obviously this fragmented practice does not support the effective extraction, analysis, and provision of actionable information that is critical to improve the bottom line of EAM organisations. Yet EAM organisations are being held more accountable for their management reporting and business performance in order to comply with new laws and regulations. In view of these issues, a comprehensive, objective, fact-based business intelligence (BI) system is needed to weave together those disparate systems and silo databases. The technological capabilities of a BI system in finding, retrieving, accumulating, organising, storing, processing, analysing, and communicating large quantities of data have made the enterprise BI initiative a feasible goal [9]. Nevertheless, a BI system and its associated infrastructure will not replace an existing operational system but will build upon it, and the legacy system can continue to provide essential ‘bread and butter’ functions, such as work order management and asset maintenance [7]. With this backdrop setting the context for BI systems implementation in EAM organisations, empirical research was necessary to understand the key factors impacting such systems initiatives. So the implied research question for this study is: what are the critical success factors (CSFs) and associated contextual issues influencing the implementation of BI systems, especially in EAM organisations? The remainder of this article is structured as follows. The next section briefly describes BI systems before outlining the CSFs framework used in this research. Section four then describes the research methodology and case backgrounds. The subsequent section presents and discusses the research findings from multiple EAM organisations. In the final section, the authors present the overall findings and conclude the study.
2 BUSINESS INTELLIGENCE SYSTEMS
According to Moss and Atre [10], a BI system is “an architecture and a collection of integrated operational as well as decision-support applications and databases that provide the business community easy access to business data”, whilst Reinschmidt and Francoise [11] describe a BI system as “an integrated set of tools, technologies and programmed products that are used to collect, integrate, analyse and make data available”. Stated simply, the main tasks of a BI system include “intelligent exploration, integration, aggregation and a multidimensional analysis of data originating from various information resources” [12]. Implicit in this definition, data is treated as a highly valuable corporate resource, and transformed from quantity to quality [13]. As a result, critical information from many different sources of a large EAM enterprise can be integrated into a coherent body for strategic planning and effective allocation of assets. Hence, meaningful information can be delivered at the right time, at the right location, and in the right form [9] to assist individuals, departments, divisions or even larger units to facilitate improved decision-making [14]. Recently BI applications have been dominating the technology priority lists of many chief information officers (CIOs) [15, 16, 17]. Gartner Research predicts that the BI market will be in strong growth until 2011 [18]. While the BI market appears vibrant, the implementation of a BI system is a financially large and complex undertaking [19]. The implementation of an enterprise-wide information system (such as a BI system) is a major event and is likely to cause organisational perturbations [20]. This is even more so in the case of a BI system because the implementation of a BI system is significantly different from that of a traditional operational system. It is an infrastructure project, which is defined as a set of shared, tangible IT resources that provide a foundation to enable present and future business applications [21]. It entails a complex array of software and hardware components with highly specialised capabilities [22]. The BI project team needs to address issues foreign to operational systems implementation, including cross-functional needs; poor data quality derived from source systems, which can often go unnoticed until cross-systems analysis is conducted; technical complexities such as multidimensional data modelling; organisational politics; and broader enterprise integration and consistency challenges [23]. Consequently, it requires considerable resources and involves various stakeholders over several months to initially develop, and possibly years to become fully enterprise-wide [24]. Typical expenditure on these systems, including all BI infrastructure, packaged software, licenses, training and implementation costs, may demand a seven-digit outlay [24]. The complexity of BI systems is exemplified by Gartner’s recent study, which predicted that more than half of the systems that had been implemented would face only limited acceptance [25].
To date, much IS literature suggests that various factors play pivotal roles in the implementation of an information system. However, despite the increasing interest in, and importance of, BI systems, there has been little empirical research about the CSFs impacting the implementation of such systems. Although there has been a plethora of BI system studies from the IT industry, most rely on anecdotal reports or quotations based on hearsay [14]. This is because the study of BI systems is a relatively new area that has primarily been driven by the IT industry and vendors, and thus there is limited rigorous and systematic research into identifying the CSFs of BI system implementation. Therefore, the increased rate of adoption of BI systems, the complexities of implementing a BI system, and their far-reaching business implications justify a more focused look at the CSFs required for implementing BI systems.
3 CRITICAL SUCCESS FACTORS IN BI SYSTEMS IMPLEMENTATION
Given the motivation for this research and drawing on Yeoh et al’s [26] CSFs framework, the authors used case studies to examine the CSFs that influence the implementation of BI systems in EAM organisations. Yeoh et al [26] proposed a research framework for examining the CSFs affecting the implementation of BI systems. As shown in Table 1, the framework postulates that there is a set of CSFs which contribute to the success of a BI system implementation. These CSFs exist in various dimensions composed of organisation, process, and technology perspectives. In brief, this framework treats the proposed factors as necessary factors for implementation success, whereas the absence of the CSFs would lead to failure of the system. The CSFs, and their related contextual elements, are the foci for the data collection and analysis of this research.
Table 1 CSFs and the contextual elements by dimension [26]
Dimension | CSFs | Contextual elements
Organisation | Committed management support & sponsorship | Top management commitment in overcoming cross-functional challenges; business-side sponsorship
Organisation | Clear vision & well-established business case | Aligning the BI project with the organisational business vision; well-established BI business case
Process | Business-centric championship & balanced team composition | Existence of a business-centric champion; the team is cross-functional; committed expertise from the business domain; use of external consultants at the early phase
Process | Business-driven & iterative development approach | Project scope is clearly defined; adoption of an incremental delivery approach; project starts off on high impact areas
Process | User-oriented change management | Formal user involvement throughout the lifecycle; consistent education, training and support are in place
Technology | Business-driven, scalable and flexible technical framework | Stable source systems are in place; establishment of a strategic, business-driven, extensible technical infrastructure; prototype is used as proof of concept
Technology | Sustainable data quality & integrity | High quality of data at source systems; business-led establishment of common measures and classifications; sustainable dimensional and metadata model; existence of a data governance initiative
4 RESEARCH METHODOLOGY
Due to the limited academic literature, a case study methodology was used in this research. The case study methodology provides better explanations, insights, and understandings of the examined phenomenon and allows for rich descriptions which would otherwise be lost in other, quantitative designs [27]. Darke et al [28] assert that a case study “is well-suited to study the interactions between information technology-related innovations and organisational context”. Yin [29] emphasises the use of a case study as part of a comprehensive research methodology that provides a description of the actual situation. All the above definitions contain characteristics of the case study methodology that are applicable to this project. It should be noted that, unlike sampling logic, replication logic involves purposive selection - it does not aim to generalise the findings to an entire population [30]. Thus the multiple case studies in this research should be regarded as multiple experiments and not multiple respondents in a survey [29]. That is, relevance rather than representativeness is prioritised in case selection. Within each case, and guided by the research questions, the subject of this study was the in-depth investigation and richer understanding of CSFs in the EAM organisation’s real-life context. Given that the objective of this study was to build theory, a case
study process with a multiple-case design was the appropriate approach, and the use of the case study methodology is justified on these grounds. Data collection for this study entailed semi-structured interviews with key stakeholders of BI projects; those interviewed included project managers, end users, key project stakeholders (who had been involved directly in either business or IT functions), and in two instances, external consultants/contractors. To facilitate data triangulation, data were also gathered from a number of sources including relevant documents, training documents, presentation slides and publicly available reports (such as annual reports and financial statements). In order to maintain consistency of questioning, the authors used the above CSF framework as a basis for the interviews, inviting participants to comment on the CSF framework. In order to maintain homogeneity, and thus lessen the potential for confounding effects of different industries and IT environments, it was decided that all case organisations should come from the same industry. Engineering asset management organisations, such as electric, gas and water utilities and railway companies, were selected for two reasons. First, the types of information systems in these asset-intensive organisations are identical. Typically, they are composed of asset operational systems, maintenance systems, condition monitoring, work management and contract management. Secondly, due to fierce competition resulting from deregulation, increased regulatory compliance, and governance requirements, many EAM organisations are on the verge of implementing large-scale BI systems, yet there is very limited literature to guide such organisations. For instance, a recent BI project introduced by an Australian water utility cost several million dollars [31]. To assess the importance of the seven previously-identified CSFs, the authors studied seven EAM organisations that had implemented BI systems. These case companies are illustrated in Table 2.
Table 2 Case coding and BI system implementation success level
Case | Type of EAM industry | Implementation success level
R1 | Rail transport and network access | Successful
R2 | Rail transport and network access | Successful
E1 | Electricity and gas utilities | Successful
S1 | Ship builder and maintainer | Partially successful
W1 | Water, sewage, recycled water utilities | Successful
W2 | Water, sewage, recycled water utilities | Successful
W3 | Water, sewage, recycled water utilities | Unsuccessful
A cross-case analysis approach was used in this study to gain better understanding and increase the generalisability of the findings [27]. In searching for patterns, the authors examined similarities and differences in relationships within the data [32]. Varying the order in which case data are arrayed enables patterns to become more obvious [32]. Moreover, this research did not produce quantitative data. In all cases, the authors were examining the presence or absence of a particular CSF (e.g., were adequate resources provided?), while at the same time ascertaining whether that characteristic was fulfilled in a meaningful way.
5 ANALYSIS AND DISCUSSION OF INDIVIDUAL CRITICAL SUCCESS FACTORS
The following section analyses and discusses the results for each of the seven CSFs as they relate to the case studies. To better facilitate discussion, rather than giving an account of the experiences of all case organisations in relation to each CSF, this section discusses the critical factors with salient data from various cases in order to explain the underlying research issues.
5.1 CSF 1: Committed management support and sponsorship
The first CSF is the need for commitment and sponsorship by top management. In each case, the research participants were asked to describe the extent to which top management supported the introduction of a BI system. Moreover, ‘management commitment’ was not rigidly defined, thus enabling interviewees to independently identify issues that they associated with management support. The interviewees were then asked to describe how that management support and sponsorship, or lack of it, influenced their implementation effort. According to the interviewees, top management commitment and sponsorship was the most critical factor for successful implementation. This particular factor appeared in various forms across the cases. Indeed, two additional contextual elements emerged during the data analysis process, namely: management involvement in steering committees to oversee high-level architecture design, and management involvement in amending organisational structure and/or roles and responsibilities. The involvement of senior management in the information steering committee. The participation of executives in the information steering committee (as demonstrated in cases R1, R2, E1 and W1) had the benefits of providing overall
direction and support to the BI initiative and of facilitating architectural design and cross-functional data ownership issues at strategic levels. The typical membership of steering committees included the CIO, general managers, functional managers (who are also data owners), IT/IS managers and project manager(s). Hence, senior managers together with the committees could determine a strategic BI governance direction and ensure that the process for establishing and maintaining BI-business alignment would be ongoing. In fact, this assertion was best demonstrated in cases R1, E1 and W1 where the steering committee was responsible for system acceptance, for signing-off deliverables at the relevant milestones, and for recommending continuation to the next development phase. Furthermore, the involvement of senior business executives in steering committees addresses the project prioritisation and cross-departmental scope definition issues, and assures appropriate allocation of relevant resources for such a complex undertaking. Therefore, having such an information steering committee composed of a group of senior managers boosts the implementation process, leading to a standardised, business-aligned BI system.
Business-side sponsorship. Business executives of the successful cases were also committed to fully sponsoring and resourcing the implementation process. The requisite operating resources included financial commitment, adequate staffing, and the allocation of sufficient time to get the job done. The instance of failure, case W3, enjoyed little commitment or sponsorship by senior management, who were strongly focussed on cost. As a result, the BI initiative in case W3 was curtailed due to a lack of commitment and sponsorship from business stakeholders. Indeed, this failure reinforces the significance of the business-orientation approach.
The commitment of top management was also shown through amendment of organisational structure and/or roles and responsibilities, as happened in the successful cases of R2, E1, S1 and W1. Enterprise R2 recently underwent a comprehensive exercise aimed at better managing its asset-oriented information and thus improving its decision-making process. Consequently, a second-phase enterprise-wide BI initiative was also introduced to meet its essential asset information management needs. Along these lines, a new organisational structure and new role definitions were established to provide a better pathway for the associated BI initiative. As a result, the former individual functioning units of asset operation and maintenance were amalgamated, and a new position (known in R2 as the BI coordinator) from the business sector of the firm was appointed to explore the core business needs and information requirements for enhanced management-reporting and decision-support purposes. Similarly, in business E1 the new position of project manager was created to coordinate the implementation of both the BI and enterprise resource planning (ERP) systems. In W1 the manager of an existing business information system was asked to lead the BI project for a period of three years, and in case S1 the team leader of business analysts assumed an additional role as project manager as well as system owner at the enterprise level. In this latter case, a shortage of skilled personnel resulted in their technical work being contracted to third parties, while small numbers of internal staff were responsible for on-going maintenance support.
The experiences of these businesses demonstrate the value of amending organisational structure and/or roles and responsibilities to ensure that the system can be implemented successfully. In brief, the factor of committed management support and sponsorship cannot be taken for granted to ensure a successful system implementation. In addition to business-side sponsorship, these commitments were best demonstrated by management's involvement in the steering committee and by their willingness to amend their organisational structure and/or roles and responsibilities.
5.2 CSF 2: Clear vision and well-established business case
All participants were asked how they aligned their BI system implementation project with their business vision and with the business case they had established. Aligning the implementation effort with the business vision required a clearly-defined business case, and project decisions were made in accordance with that case. Responses by participants indicated that there were many similarities between the seven organisations. The five most successful cases (R1, R2, E1, W1 and W2) formally aligned their business cases for introducing a BI system with their corporate business vision. In these cases, the project teams focused on their current business requirements while also including expected directions and organisational growth in their project decisions. This approach appeared to have a long-term focus, and these cases reported enjoying significant benefits from their BI systems. This indicates that a formal strategic vision and aligned business case can strongly contribute to implementation success. Aligning the business case with the organisational vision requires a strategic vision to exist in the first place. In business W3 (the failure case), an organisational vision to integrate BI did not exist, while in case S1 (the partially successful case) the BI initiative was not primarily constituted at the organisational and strategic level, but mainly driven by its key defence client. This situation prevented the implementation projects from being strategically aligned to core business needs, and thus neglected the corresponding development of a solid business case. In these two instances the project teams concentrated on either technology or their client’s requirements, and in so doing they overlooked their strategic and core business requirements. Nonetheless, for S1 the implementation project had been completed and the BI system was still in use, largely to meet the stringent safety standards and regulatory reporting requirements of its defence client. In the instance of failure (case W3) the company neglected to define how the adoption of a BI system would accomplish their main business goal, and this neglect contributed to the early termination of the project. In fact, their BI team attempted to include all possible scenarios when defining the project’s benefits, yet the various business units failed to specify their particular need for the BI system. In the end, this nebulous scenario blurred their motivation for, and vision of, the proposed
new system. Based on these cases, it is evident that formally aligning a business case in support of a BI system with the business’ strategic vision greatly enhances the likelihood of success of a BI system implementation.
5.3 CSF 3: Business-centric championship and balanced team composition
In all cases the champion played a major role in the development of BI systems. Their leadership and strong business acumen gave the project more of a business focus rather than a technical focus, except for the instance of failure (W3). In fact, based on the face-to-face interviews, most champions (i.e. project managers) seemed to possess in-depth understanding of the business as well as sufficient technological knowledge. However, the respective approaches which they adopted could lead to different outcomes. For example, in the case of W3 the champion had stressed the technological features but overlooked the compelling business needs. Apparently, he did not identify the core business requirements and instead invited BI vendors to ‘sell’ the advantages to management while emphasising the technological perspectives. A BI initiative usually spans many functional units and requires resources and data from these groups. In the successful cases of R1, R2 and W1, the role of the champions involved managing political and organisational issues that arose during the course of the project. In R1, the champion was faced with the challenge of convincing management and various functional groups of the strategic value of integrating the BI system of the parent company with that of the acquired firm. The champion in R2 was engaged in the political issue of how the operation data would interact with maintenance data and other data, currently and in the future.
Cross-functional team composition. Managing a large-scale BI initiative goes beyond typical IT project management. It requires a change in the composition of the BI team into a cross-discipline and cross-functional team [33]. In this enquiry, all organisations being studied recognised the importance of involving the business stakeholders, particularly at the outset of the project. In fact, the participants reported that the involvement of the business sector of the company was deemed critical to provide the mixture of skills needed for a BI project. Domain expertise of the business stakeholders provides valuable insights and guidance from the perspectives of the end-users and business units. For the five successful cases, the team’s skill profile encompassed both technical and business expertise. More importantly, the teams were also responsible for fostering an enterprise-wide culture of collaboration in implementing the new BI system, because committed support is required from across the organisation. Without this commitment, disappointment or failure can come at a huge cost to the organisation [34]. Thus, the successful cases demonstrated that commitment must come not just from management but also from a competent BI team comprising appropriate business and technical skills. This enables the high-level design to be driven by the business personnel, and ensures that core business needs are a driver of the logical data architecture.
Use of external consultants. All of the businesses represented in this study reported that the use of external consultants greatly enhanced the success of the system implementation.
Cases R1, E1, S1 and W1 had their BI systems technically implemented by external consulting contractors, and cases R2 and W2 sought some limited assistance from external contractors. The use of consultants occurred mainly because internal staff had minimal ability to design the system architecture (especially for the integration of source systems) and limited time to master the complex ETL and data modelling tools. Indeed, the use of experienced external consultants is so significant that theoretical replication occurred, as evidenced by the successful cases and even the partially successful case. The failure of W3 may, in part, be blamed on the decision to avoid using experienced external consultants. This left the project champion and his internal team (which possessed little experience in implementing BI systems) with a complex task within an already complicated engineering asset management company. The importance of experienced consultants, especially in the early phase of the project, should be considered by those companies that are planning to implement BI systems, to avoid costly and unnecessary pitfalls. In summary, a balanced BI team should comprise a quality external consultant, a dedicated champion who possesses adequate business knowledge, and an internal project team that consists of both business and technical personnel.
5.4 CSF 4: Business-driven and iterative development approach
The participants were asked to describe their development approach regarding the system implementation. In some cases, organisations were also asked to provide copies of documents that illustrated their approach to project management, such as project timelines and project scope. These documents were used in conjunction with interviewee responses, enabling a triangulated analysis of this factor.
Business-driven project scoping. All of the organisations that participated in this survey put considerable effort into managing their implementation projects; however, the scale and scope of their respective projects had a significant impact on the outcome. This was particularly apparent in W1, where the large scale, the great cost, and the complex legacy systems and data integrity issues obliged the BI team to scope the project into several key areas in terms of business functions, information types, and underlying technology. These major areas were inter-related, and it can be seen that the necessary technological infrastructure was developed so as to correspond to the types of information requirements for meeting its objectives of enhanced decision support, management reporting, and analytical capabilities. Using an incremental delivery (‘iterative’) approach, the BI team of W1 divided the project into several major phases which would be introduced over three years. The business-centric project scoping established a solid foundation to determine the associated time frame and related resources. In addition, the team prepared project documentation which specified the expected milestones, the responsibilities of the respective business units, and the target dates for completion. Hence, all business parties were kept informed of the proposed changes and all were kept involved. As a result, the project team could
focus on the prioritised business areas without becoming trapped in one ‘big bang’ complex project implementation. This phased approach is well understood because the complexity and duration of such a large BI system implementation project can quickly become overwhelming and out of control. The iterative development approach helped to ensure that project tasks were not lost in the frequently chaotic project environments, especially where some team members remained responsible for their regular duties (as in the case of W1). In summary, it can be stated that a business-driven and iterative development approach appeared to positively support the success of a BI system implementation.
5.5 CSF 5: User-oriented change management
All participating organisations were also asked to describe their efforts regarding user-related change management issues which arose during the implementation process. In most cases, it appears that the majority of key users did not object to the system implementation initiative because many were already familiar with the analytical and management reporting tools. They realised the potential benefits which a standardised, enterprise-wide BI system could offer to them. In fact, the key users who were interviewed expressed enthusiastic interest in their new BI system applications, showing the researcher how it had helped improve their work by delivering timely and consistent information, thus saving significant time in resolving conflicting data. Hence, based on the experiences of the successful companies, it is evident that neither an education nor a ‘marketing’ campaign is needed to win the support of informed, knowledgeable users who are already aware of the benefits which can come from a BI system.
Interactive user involvement. These case studies show that key users’ interactive involvement throughout the various phases of the project is critical to implementation success. Evidence from the cross-case analysis suggests that greater user participation can contribute to better appreciation of their needs. As demonstrated in cases R1, R2, E1, W1 and W2, regular workshops and meetings between users and project teams are an efficient way to achieve this objective. One interviewee pointed out that a workshop also facilitates an efficient communication platform where business stakeholders of different functional groups can discuss a particular issue from multiple angles at the same time. The case studies also show that key users must be consulted on the choice of user interface and query tools. In fact, this view is congruent with the findings reported in the literature, which state that users often perceive the usefulness of a system from the ‘first impression’ of the user interface design [35].
Consistent maintenance support. Most interviewees stated that although the training provided to the general users helped with the adoption of the new system, constant support after the system was implemented was even more critical. According to key users from the successful companies, training using generic data rather than their own data was of little value because ad hoc reports and analytics queries are generated ‘on the fly’. Furthermore, the key users had adequate knowledge of the application tools, which were considered to be ‘intuitive’, and most were experienced with the functions of BI systems. This specific finding indicates that a lack of rigorous training did not seem to bring any significant disadvantages in regard to user-oriented change management.
It appears that providing consistent maintenance support upon request could better aid the business users than standard training sessions. As a result, committed and consistent support from the BI team is influential in system use, particularly during the period when the new system is being adopted. This finding is in line with the evolutionary development requirements of a BI system, as opposed to the routine maintenance of classical operational systems [36]. In regard to the CSF of user-oriented change management, interactive user participation throughout the implementation cycle can help meet users' critical information needs and format requirements. Also, business users will better appreciate the BI applications if formal and constant maintenance support is in place throughout the adaptive system lifecycle.

5.6 CSF 6: Business-driven, scalable and flexible technical framework
Turning now to technological issues, the successes of the R1, R2, E1, W1 and W2 cases and the partial success of the S1 case confirm the importance of a business-driven, scalable and flexible technical framework. In these cases the technical frameworks were designed and standardised centrally by the BI competency team, and driven by core business needs. The required source systems were stable and compatible for BI system implementation, despite some data integrity issues across diverse back-end systems (discussed below). Also, their respective high-level system architectural designs were deemed flexible, scalable and extensible across other functional areas, meeting dynamic management reporting and analytical requirements while making provision for possible future applications. Moreover, prototypes using real corporate data proved to be helpful in obtaining acceptance from business stakeholders. According to most project managers, prototypes facilitate better communication among staff of different functions, especially in showing them the benefits, advantages and value an individual unit may contribute to the overall organisation. The case studies reveal that organisations that have implemented ERP infrastructure and/or integrated asset management systems are more likely to achieve success, as shown in cases R1, R2, E1 and S1. According to participants, this is mainly because existing integrated information infrastructures (such as ERP) provided an adequate foundation for the large-scale BI system implementation. Moreover, senior management and key business stakeholders of these enterprises were
more aware of the benefits of BI applications, and in fact they adopted the systems quite readily. This was particularly so in businesses R1 and S2, for in each case an integrated BI solution (together with an ERP system) had been planned and included at the outset of the project. Therefore, they managed to optimise the large investment in enterprise-wide information infrastructure while ensuring greater effectiveness of operational, reporting, and analytical processing. By way of contrast, the failure of W3 was in some measure the result of incompatible legacy IS applications and siloed ERP modules, which deterred executives and functional managers from looking further into BI applications. This observation was also borne out in the case of R2, which faced a similar problem in the early phase of its BI applications. Integrating the siloed BI applications from individual functional units posed complex technical issues for the centralised project team. Such challenges included function-focused data models, incompatible vendor tools, and multiple versions of the truth. In response to these issues, and to better serve overall business needs, R2 has recently restructured so that two key divisions (asset management and maintenance management) have been amalgamated. The information environments of these units have also been altered so as to provide a standardised and integrated BI system for improved decision support. It can be seen, then, that companies that adopt a strategic and business-focused view in planning their adaptive BI systems are more likely to succeed with their BI initiatives.

5.7 CSF 7: Sustainable data quality and integrity
In regard to the CSF of sustainable data quality and integrity, the case studies show that most corporations were committed to achieving and maintaining high data quality and integrity, albeit with varying degrees of priority. Businesses R1, E1, S1 and W2 clearly stressed the importance of this specific factor. Their efforts to establish common definitions and measures were evidenced by data dictionary frameworks, corporate glossaries, business-centric dimensional and metadata models, data quality working groups, data governance committees, corporate coding policies, and corporate data models (which included candidate information entities and baselines). In doing so, business domain experts were actively involved in validating and verifying quality attributes to address the issue of multiple versions of the truth. Moreover, a range of regular workshops with business decision-support personnel, user reference groups, and high-level steering committees were convened to solve the problems arising from business issues. Furthermore, most cases indicated that business users were pro-active in reporting data quality issues. While all participants reaffirmed the importance of high-quality data derived from source systems, in some instances corporate transactional systems were reverse-engineered and aligned to facilitate a common hierarchy for management reporting and communication. Initially, the quality of data was adequate at the unit level, but not at the cross-functional level, as each unit developed its own measurements and definitions. To illustrate, case R1 was faced with inconsistent interpretations of 'track corridors' across its operational, financial, and asset management units, despite accurate data being captured in the respective operational systems.
In response, the BI team of R1 conducted cross-system profiling and then aligned these transactional systems via the rail referencing system to create a common reference line and thus better support a management reporting hierarchy. The individual operational systems can still record data in their original measures without massive re-engineering work, while standardised reference understandings are ensured at the level of the BI system. As a result, these standardised, business-centric measures, facts, and associated metadata ensure the data integrity and sustainable dimensional model development required for a BI system. Also, it is noted that the project team in W2 acted as data custodian for data quality issues that arose within the company. According to the project champion of W2, the sources and repercussions of data quality problems can best be demonstrated and resolved through their 'neutral' role. This is because the organisation of the cross-functional BI team and the characteristics of the information generated by BI systems make them a suitable unit (and one without internal contradictions) for assuming the role of data custodian for the enterprise. Acknowledging the possibility of data quality problems, cases E1, S1 and W1 also made use of various data cleansing tools. The automated data quality 'watchdog' mechanism embedded in the BI system of company S1 enabled efficient tracking of operational data quality issues, alerting data providers to issues to be solved. Sustainable data quality and integrity ensures a single version of the truth and thus the quality of information provided by BI systems. This factor also reinforces the significance of cross-system analysis and business-side participation in overcoming data quality problems inherent in siloed functional units.
6 OVERALL REMARKS
The evidence from these case studies reaffirms the applicability of the CSFs framework proposed by Yeoh et al. [26]. More importantly, the studies further reveal the significance of addressing those CSFs through a business orientation approach. That is, without a specific business purpose, BI initiatives rarely produce a substantial impact on business. It is evident that a BI system will prove far more useful to an enterprise when its business needs are identified at the outset and used as the driver behind the implementation effort. Thus, the introduction of a new BI system must be business-driven and organisation-focussed. It should also involve interactive business participation, and be adapted to meet evolving business requirements throughout its lifetime. A 'build it and they will come' approach which overlooks business-focused strategies proves to be unsatisfactory and very expensive [37]. In other words, this particular meta-factor (i.e. a business orientation approach) dictates
the CSFs, particularly within the following important aspects: business case formation, management commitment, championship, team composition, scoping and development methodology, organisational change management, technical framework development, and data model and data quality issues. The five successful cases (R1, R2, E1, W1 and W2) emphasised the importance of the business-oriented approach when addressing the CSFs, while the partially successful case (S1) appeared to comprise a mixture of business-centric and customer-centric approaches. The instance of failure (W3) was not totally business-driven but instead was technology-oriented. The five successful cases shifted their focus from the technological view and instead adopted an approach that put their respective business needs first. On the basis of these case studies, it is apparent that the manner in which an organisation addresses those CSFs, whether through a business-oriented, technology-oriented, or customer-oriented approach, will have a substantial impact on the implementation outcome. Having a clearly defined set of CSFs is important, but it is even more critical to address the CSFs with the right approach. The triangulated data derived from the case studies clearly demonstrate that by placing business needs ahead of other issues an enterprise has a higher likelihood of achieving a useful BI system. In summary, the five successful cases clearly demonstrated that addressing the CSFs from a business perspective was the cornerstone on which they successfully based the implementation of their BI systems. Conversely, the unsuccessful case failed because it focused primarily on the technology and neglected the core requirements of its business. In order to better address the CSFs it is essential for an organisation to emphasise the business orientation approach, and in so doing it will achieve a higher probability of implementation success. Indeed, this view was supported by Gartner Research, who stated that "best in class organisations focus on business objectives and use a business-driven approach to define and scope their people, process, application, technology and/or services strategy" [38].
7 CONCLUSIONS
Understanding CSFs is key to the successful implementation of a BI system. This study examined the CSFs and the associated contextual issues impacting BI system implementation in EAM organisations. The findings of the multiple case studies substantiate the construct and applicability of the multi-dimensional CSFs. More importantly, this research suggests that organisations are in a better position to successfully address those CSFs through the business-orientation approach. That is, without a clear business-driven objective, BI initiatives rarely produce a substantial impact on business. As a result, the implementation of a BI system has a much greater likelihood of success when specific business needs are identified at the outset, and when those needs are used to direct the nature and scope of the implementation effort. Therefore, this business orientation meta-CSF should be regarded as the most critical factor in determining the implementation success of BI systems. This study has made several contributions on how the implementation of BI systems can be improved. First, large and complex enterprises, such as EAM organisations, that are planning to implement BI systems will be better able to identify the factors that will enhance the likelihood of success. The findings will help them understand that implementation involves many factors - organisational, process and technological - occurring simultaneously. They will also help them to determine the factors to which they should give particular attention to ensure that they receive continuous management scrutiny. For senior management, these research findings can assist in focusing scarce resources on the key areas that will improve the implementation process. Moreover, management can concentrate on monitoring, controlling and supporting only those critical areas. The findings with regard to the CSFs represent best practices for firms that have successfully implemented BI systems. The evidence that was revealed provides insights for BI stakeholders that can increase the chances of implementation success.
8 REFERENCES
1 IIMM (2006) International Infrastructure Management Manual. Australia New Zealand 2nd Ed, National Asset Management Steering Group, Thames.
2 IIMM (2002) International Infrastructure Management Manual. Australia New Zealand 1st Ed, National Asset Management Steering Group, Thames.
3 Steenstrup K (2004) Asset-Intensive ERP II and EAM/CMMS MQ Criteria: Gartner Research.
4 Lin S, Gao J, Koronios A & Chanana V (2007) Developing a data quality framework for asset management in engineering organisations. International Journal of Information Quality, 1(1), 100-126.
5 Haider A (2007) Information Systems Based Engineering Asset Management Evaluation: Operational Interpretations. Thesis. University of South Australia, Adelaide.
6 Woodhouse J (2000) Key Performance Indicators, John Woodhouse Partnership, viewed 12 Feb 2008,
7 USDOT-FHWA (December 1999) Asset Management Primer, U.S. Department of Transportation, Federal Highway Administration, Office of Asset Management, FHWA Pub. No. FHWA-IF-00-10.
8 Amadi-Echendu J, Willett R, Brown K, Lee J, Mathew J, Vyas N & Yang BS (2007) 'What Is Engineering Asset Management?', paper presented at the 2nd World Congress on Engineering Asset Management, Harrogate, UK.
9 Negash S (2004) 'Business Intelligence', Communications of the Association for Information Systems, 13, 177-195.
10 Moss L & Atre S (2003) Business Intelligence Roadmap: The Complete Lifecycle for Decision-Support Applications. Boston, MA: Addison-Wesley.
11 Reinschmidt J & Francoise A (2000) Business Intelligence Certification Guide, IBM, International Technical Support Organization, San Jose, CA.
12 Olszak C & Ziemba E (2007) 'Approach to Building and Implementing Business Intelligence Systems', Interdisciplinary Journal of Information, Knowledge, and Management, 2, 135-148.
13 Gangadharan GR & Swami SN (2004) 'Business Intelligence Systems: Design and Implementation Strategies', paper presented at the 26th International Conference on Information Technology Interfaces (ITI).
14 Jagielska I, Darke P & Zagari G (2003) 'Business Intelligence Systems for Decision Support: Concepts, Processes and Practice', paper presented at the 7th International Conference of the International Society for Decision Support Systems.
15 Gartner (2007) Gartner EXP Survey of More than 1,400 CIOs Shows CIOs Must Create Leverage to Remain Relevant to the Business. Retrieved May 1, 2009, from: http://www.gartner.com/it/page.jsp?id=501189
16 Gartner (2008) Gartner EXP Worldwide Survey of 1,500 CIOs Shows 85 Percent of CIOs Expect "Significant Change" Over Next Three Years. Retrieved May 1, 2009, from: http://www.gartner.com/it/page.jsp?id=587309
17 Gartner. Gartner EXP Worldwide Survey of More than 1,500 CIOs Shows IT Spending to Be Flat in 2009. Retrieved May 1, 2009, from: http://www.gartner.com/it/page.jsp?id=855612
18 Richardson J & Schlegel K (2008) Magic Quadrant for Business Intelligence Platforms, Gartner Research.
19 Watson HJ, Fuller C & Ariyachandra T (2004) 'Data Warehouse Governance: Best Practices at Blue Cross and Blue Shield of North Carolina', Decision Support Systems, 38(3), 435-450.
20 Ang J & Teo TSH (2000) 'Management Issues in Data Warehousing: Insights from the Housing and Development Board', Decision Support Systems, 29(1), 11-20.
21 Duncan NB (1995) 'Capturing Flexibility of Information Technology Infrastructure: A Study of Resource Characteristics and Their Measure', Management Information Systems, 12(2), 37-57.
22 Watson H & Haley B (1998) 'Managerial Considerations', Communications of the ACM, 41(9), 32-37.
23 Shin B (2003) 'An Exploratory Investigation of System Success Factors in Data Warehousing', Journal of the Association for Information Systems, 14(1), 141-170.
24 Watson HJ & Haley BJ (1997) 'Data Warehousing: A Framework and Survey of Current Practices', Journal of Data Warehousing, 2(1), 10-17.
25 Friedman T (2005) Gartner Says More Than 50 Percent of Data Warehouse Projects Will Have Limited Acceptance or Will Be Failures through 2007, Gartner Research, viewed 21 Feb 2007.
26 Yeoh W, Koronios A & Gao J (2008) Managing the Implementation of Business Intelligence Systems: A Critical Success Factors Framework, International Journal of Enterprise Information Systems, 4(3), 79-94.
27 Miles M & Huberman AM (1994) Qualitative Data Analysis: An Expanded Sourcebook, Thousand Oaks, CA: Sage.
28 Darke P, Shanks G & Broadbent M (1998) 'Successfully Completing Case Study Research: Combining Rigour, Relevance and Pragmatism', Information Systems Journal, 8(4), 273-289.
29 Yin R (1994) Case Study Research, Design, and Methods, 2nd edn, Newbury Park, CA: Sage.
30 Firestone W (1993) 'Alternative Arguments for Generalizing from Data as Applied to Qualitative Research', Educational Researcher, 22(4), 16-27.
31 LeMay R (2006) Sydney Water signs business intelligence vendor. Retrieved 11 July 2008, from http://www.zdnet.com.au/news/software/soa/Sydney-Water-signs-business-intelligencevendor/0,130061733,139268270,00.htm
32 Stuart I, McCutcheon D, Handfield R, McLachlin R & Samson D (2002) 'Effective Case Research in Operations Management: A Process Perspective', Operations Management, 20(5), 419-433.
33 Hostmann B & Buytendijk F (2004) Management Update: Effective BI Approaches for Today's Business World: Gartner Research.
34 Dresner H, Linden A, Buytendijk F & Friedman T (2002) The Business Intelligence Competency Center: An Essential Business Strategy: Gartner Research.
35 Stumpf R & Teague LC (2005) Object-Oriented Systems Analysis and Design with UML, Upper Saddle River, NJ: Prentice Hall.
36 Fuchs G (2006) 'The Vital BI Maintenance Process', in Business Intelligence Implementation: Issues and Perspectives, B. Sujatha (Ed.), pp. 116-123. Hyderabad, India: ICFAI University Press.
37 Bates AW (2000) 'Chapter Ten: Avoiding the Faustian Contract and Meeting the Technology Challenge', in T. Bates (Ed.), Managing Technological Change. San Francisco: Jossey-Bass.
38 Burton B, Geishecker L & Hostmann B (2006) Organizational Structure: Business Intelligence and Information Management: Gartner Research.

Acknowledgement
The research reported in this paper was conducted through the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government's Cooperative Research Centres Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A REVIEW ON DEGRADATION MODELS IN RELIABILITY ANALYSIS
Nima Gorjian a,b, Lin Ma a,b, Murthy Mittinty c, Prasad Yarlagadda b, Yong Sun a,b
a Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), Brisbane, Australia
b School of Engineering Systems, Queensland University of Technology (QUT), Brisbane, Australia
c School of Mathematical Sciences, Queensland University of Technology (QUT), Brisbane, Australia
With increasingly complex engineering assets and tight economic requirements, asset reliability becomes more crucial in Engineering Asset Management (EAM). Improving the reliability of systems has always been a major aim of EAM. Reliability assessment using degradation data has become a significant approach for evaluating the reliability and safety of critical systems. Degradation data often provide more information than failure time data for assessing reliability and predicting the remnant life of systems. In general, degradation is the reduction in performance, reliability, and life span of assets. Many failure mechanisms can be traced to an underlying degradation process. Degradation is a stochastic phenomenon and can therefore be modelled using several approaches. Degradation modelling techniques have generated a great amount of research in the reliability field. Although degradation models play a significant role in reliability analysis, there are few review papers on the topic. This paper presents a review of the existing literature on commonly used degradation models in reliability analysis. The current research and developments in degradation models are reviewed and summarised, and the models are synthesised and classified into groups. Additionally, the paper attempts to identify the merits, limitations, and applications of each model, and indicates potential applications of these degradation models in asset health and reliability prediction.
Key Words: Degradation model, Reliability analysis, Asset health, Life prediction
1 INTRODUCTION
In recent years, research on prognostics and asset life prediction has expanded in the field of Engineering Asset Management (EAM). One of the essential tasks in EAM is the development of mathematical models that are capable of predicting time-to-failure and the probability of failure. In practical applications, an important requirement for estimating the remaining useful life of assets is establishing their current state of degradation. Research shows that degradation measures often provide more information than failure time data for assessing and predicting the reliability of systems [1, 2]. In addition, degradation is a stochastic process and can therefore be modelled using several approaches; hence, many prediction models have been developed around the concept of degradation. In general, degradation is the reduction in performance, reliability, and life span of assets. Most assets degrade as they age or deteriorate due to factors termed covariates. Hence, reliability declines as assets degrade or deteriorate. Assets fail when their level of degradation reaches a specified failure threshold. However, what the threshold should be and how it should be specified have not yet been made clear [3]. There is a considerable body of literature on degradation modelling and reliability assessment using degradation data; however, few summary reviews of degradation models exist. Singpurwalla [4] reviews degradation models with covariates when the environment is dynamic. Van Noortwijk [5] surveys the application of Gamma processes in maintenance, as well as reviewing some degradation models in reliability. Meeker and Escobar [1] review several aspects of modelling for degradation data and the connections and differences between degradation models and failure time models. Ma [6] discusses the requirement for a new paradigm shift in condition monitoring research for modern EAM and reviews some degradation modelling techniques. This paper is a review of degradation models in reliability analysis. Degradation models are classified into groups, comments on their merits and limitations are provided, and applications of each model are presented. This review also discusses the potential applications of these models in asset health and reliability prediction. The remainder of this paper is organised as follows. Section 2 classifies degradation models into two major groups and explains them in more detail.
It then discusses the merits and limitations of each model. Section 3 provides comments on the potential applications of these models. Section 4 presents the conclusion of this paper.
2 DEGRADATION MODELS IN RELIABILITY ANALYSIS
A variety of classification schemes for failure modes have been published. A failure can be produced by different causes that can be classified either as internal or external [7-12]. Internal failures occur due to the inner structure of systems (e.g. ageing and quality of materials). External failures often occur due to the environmental conditions in which systems operate (e.g. vibrations, humidity, and pollution). Generally, failures can be divided into two groups:
• The failure that may be predicted by one or several condition monitoring indicators. This type is referred to as a gradual failure; it is also called a soft (or degradation) failure.
• The failure whose occurrence is completely random. This failure cannot be predicted by condition monitoring indicators or by measuring the age of the asset; the asset ceases to function without any indication. This type is referred to as a sudden failure; it is also termed a hard failure.
Many failure mechanisms can be traced to an underlying degradation process [13]. In general, a degradation process is the reduction in performance and reliability of assets. Figure 1 illustrates the basic notion behind degradation models: an asset fails when its level of degradation W(t) hits a specified failure threshold D at failure time T [14]. There are two types of degradation: natural and forced degradation [15, 16]. Natural degradation is age- or time-dependent; the term ageing refers to an internal process in which gradual degradation occurs, bringing the system closer to failure. Forced degradation, in contrast, is external to the system: its loading gradually increases in response to increased demand until a point is reached beyond which the system can no longer safely carry the load.

Figure 1: The degradation process
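To make the threshold notion of Figure 1 concrete, the following minimal Python sketch simulates a gradually increasing degradation path W(t) and records the first time T at which it crosses a threshold D. All parameter values are assumptions chosen purely for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 10.0      # assumed failure threshold
dt = 0.1      # time step
drift = 0.05  # assumed mean degradation increment per step
noise = 0.02  # assumed random fluctuation per step

t, w = 0.0, 0.0
while w < D:
    # degradation accumulates with a small random component
    w += drift + noise * rng.standard_normal()
    t += dt

print(f"Simulated failure time T = {t:.1f} (degradation first reached D = {D})")
```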
Degradation models underpin prognostics. There are different classifications of prognostic approaches in the literature [6, 17-27]. In general, these approaches can be classified into four main groups: experience-based approaches, model-based approaches, knowledge-based approaches, and data-driven approaches. This paper does not attempt to review these approaches in detail as they are not its main focus. Amongst them, model-based approaches and data-driven approaches are two typical approaches that can use degradation data for reliability assessment. Knowledge-based approaches can also be used in prognostics when combined with other approaches, e.g. with data-driven approaches. Table 1 provides a discussion of the merits, limitations, and applications of some typical models amongst these three approaches. Experience-based approaches are the simplest form of fault prognostics, as they require less detailed information than other prognostic approaches. These approaches are based on the distribution of event records of a population of identical items. Many traditional reliability approaches, such as the Exponential, Weibull, and Log-Normal distributions, have been used to model asset reliability. The most popular amongst them is the Weibull distribution, due to its ability to represent different types of behaviour, including infant mortality and wear-out in the bathtub curve [28]. In practical applications, experience-based approaches can be implemented when historical repair and failure data are available. These approaches do not consider the failure indication (degradation) of an asset when predicting asset life. Model-based approaches usually use mathematical dynamic models of the asset being monitored. These approaches can be divided into physics-based models and statistical models [19, 20, 29-31]. Crack growth modelling is a common physics-based model. In such critical systems as aircraft and industrial and manufacturing processes, defect (e.g. crack and anomaly) initiation and propagation must be estimated for effective fault prognostics [32]. Statistical models are developed from collected input/output data and as such might not account for conditions that have not been recorded and thus not included in the models. A common statistical approach is Kalman/particle filtering, which employs a state dynamic model and a measurement model, using Bayesian estimation to predict the posterior probability density function of the state, that is, the time evolution of a fault or fatigue damage. Particle filtering avoids the linearity and Gaussian-noise assumptions of Kalman filtering and provides a robust framework for long-term prognostics while accounting effectively for uncertainties. Knowledge-based approaches are suitable for solving problems usually solved by human specialists. Compared with model-based approaches, they require no models, which makes them appear promising [20]. They are employed where accurate mathematical models are difficult to build in the real world, or where the limitations of model-based approaches become significant (refer to Table 1). Two typical examples of knowledge-based approaches are expert systems and fuzzy logic systems [20, 25]. Expert systems are one of the major fields of artificial intelligence, and they have traditionally been used for
fault diagnostics. Currently, expert systems are beginning to be used in the area of fault prognostics. The process of building expert systems includes knowledge acquisition, knowledge representation, and verification and validation of prototypes. Rule-based expert systems are useful in encapsulating explicit knowledge from experts [25, 33]. Generally, rules are expressed in the form IF condition THEN consequence [34]. Fuzzy logic provides a robust mathematical framework for dealing with real-world imprecision and non-statistical uncertainty. Fuzzy logic can model system behaviour in the continuum mathematics of fuzzy sets rather than with traditional discrete values; coupled with extensive simulation, it offers a reasonable compromise between rigorous analytical modelling and purely qualitative simulation. Generally, the fuzzy logic approach is applied in prognostics in combination with other techniques such as expert systems or neural networks. Data-driven approaches are based upon statistical and learning techniques which come from the theory of pattern recognition. These range from multivariate statistical methods (e.g. static and dynamic principal components analysis, linear and quadratic discriminant analysis, partial least squares, and canonical variate analysis) to black-box methods based on neural networks (e.g. probability neural networks, decision trees, multi-layer perceptrons, radial basis functions and learning vector quantization, graphical models (e.g. Bayesian networks, hidden Markov models), self-organising feature maps, filters, and autoregressive models) [20, 26]. Amongst them, Neural Networks (NNs) and Hidden Markov Models (HMMs) are two typical approaches which are widely applied in prognostics [35, 36].

Table 1: Merits and limitations of prognostic approaches
Model-based approaches (i.e. physics-based models and statistical models)

Merits:
• Model-based approaches apply to prognostics in different ways (e.g. deriving the explicit relationship between the condition variables and the lifetimes via mechanistic modelling)
• These approaches provide a technically comprehensive method that has traditionally been used to understand component failure mode progression
• These approaches provide a means to calculate the damage to critical components as a function of operating conditions
• These approaches generally require less data than data-driven approaches
• By integrating physical and stochastic modelling techniques, the output model can be used to evaluate the distribution of remaining useful component life as a function of uncertainties in component strength/stress properties and loading
• Physics-based models may be the most suitable approach for cost-justified applications in which accuracy outweighs most other factors

Limitations:
• These approaches require specific mechanistic knowledge and theory relevant to the monitored asset
• Model-based approaches need many assumptions about the system and its operating conditions
• Physics-based models require the estimation of various physical parameters
• Physics-based models might not be the most practical solution, since the fault type in question is often unique from component to component and is hard to identify without interrupting operation

Knowledge-based approaches (e.g. expert systems and fuzzy logic systems)

Merits:
• These approaches are suitable for solving problems usually solved by human specialists
• These approaches are used where accurate mathematical models are difficult to build, or where the limitations of model-based approaches become significant
• Compared with model-based approaches, these approaches require no models
• Expert systems have been successfully applied to fault prognostic applications
• Expert systems can be applied in diagnosing and monitoring problems, selecting facility configurations, planning for predictive maintenance and refurbishment, and capturing and transferring expertise
• Expert systems are able to continuously monitor the condition of a system and make expert decisions
• Expert systems require less programming and training than neural networks
• Expert systems are automated built-in test systems which make use of IF-THEN rules to make a decision; thus they can replace a human expert's decision-making responsibility
• Rule-based expert systems are useful in encapsulating explicit knowledge from experts
• Expert systems can be built using not only hard-and-fast IF-THEN rules but also fuzzy logic (uncertain/unclear IF-THEN rules)
• Fuzzy logic provides a very human-like and intuitive way of representing and reasoning with incomplete and inaccurate information
• Fuzzy logic provides a robust mathematical framework for dealing with real-world imprecision and non-statistical uncertainty
• Fuzzy logic can model system behaviour in the continuum mathematics of fuzzy sets rather than with traditional discrete values; coupled with extensive simulation, this offers a reasonable compromise between rigorous analytical modelling and purely qualitative simulation

Limitations:
• In the expert systems technique, both obtaining domain knowledge and converting it to rules are difficult and need certain skills
• Expert systems cannot handle new situations not covered explicitly in their knowledge bases
• In the expert systems technique, computational problems increase dramatically as the number of rules increases
• In the expert systems technique, it is simple to make changes to the knowledge base; as a result, it is easy to introduce errors into the expert system
• Since expert systems are modelled after actual human experts, there might be inherent flaws in the knowledge base if the expert's logic is flawed
• For an expert system to be built, the situation must already have been dealt with and encountered by human experts
• Expert systems normally do not incorporate economic analysis in their decisions
• Fuzzy logic lacks learning capabilities and has no memory
• Determining good membership functions and fuzzy rules is not always easy
• Fuzzy logic cannot be applied in prognostics without being incorporated with other techniques such as expert systems or neural networks

Data-driven approaches (e.g. Neural Networks (NNs) and Hidden Markov Models (HMMs))

Merits:
• NNs have made significant progress in the field of prognostics
• Both static and dynamic NN approaches are available
• NNs make far fewer assumptions about the system and its operating conditions
• NNs are processors that have the ability to acquire knowledge through a learning process and then store this knowledge in connectors or synaptic links
• NNs are useful in condition monitoring because they can learn the system's normal operating conditions and determine whether incoming signals are significantly different
• NNs are a useful approach when enough training data are available
• NNs provide the desired outputs directly if well-established algorithms are used
• NNs are a useful approach when the condition monitoring process has some notable imprecision
• NNs are adaptable and dynamic
• NNs are highly nonlinear and in some cases are capable of producing better approximations than multiple regression
• NNs are a useful approach when hard-and-fast rules (such as those applied in expert systems) cannot easily be applied; such a network is a Fuzzy Neural Network (FNN)
• NNs perform at least as well as the best traditional statistical methods without requiring untenable distributional assumptions
• NNs capture complex phenomena without a priori knowledge
• HMMs have some distinct characteristics that are not possessed by some traditional methods
• HMMs reflect both the randomness of asset behaviour and reveal its hidden state-change processes
• HMMs have a strong, well-constructed theoretical basis and are easy to realise in software

Limitations:
• It is difficult for developers to fit domain knowledge into NNs in practical applications
• The main limitation of NNs is the lack of transparency, or rather the lack of documentation on how decisions are reached in a trained network
• NNs are black-box methods, so it is very difficult for developers to give physical explanations of NN outputs
• NN approaches usually need simulation
• There are no methods for training NNs that can magically create information that is not contained in the training data
• Training data must be representative of all conditions of an asset in order for NNs to be used successfully in smart condition monitoring
• NNs have longer training times compared with expert systems
• An NN decision engine has the ability to adapt to incoming signal inputs, so if the incoming signals are drifting, the NN may adapt and view this drift as normal even when the drift is a sign of an out-of-control process; this is referred to as over-fitting
• In NN approaches, when a drift is introduced into one variable, it impacts the estimate of a different variable; this can make it seem as if many signals are drifting when just one signal is actually drifting
• HMMs assume that successive system behaviour observations are independent
• In HMMs, the Markov assumption itself (that the probability of being in a given state at one time step depends only on the state at the previous time step) is clearly untenable in many practical applications
• HMMs have difficulty in relating the defined health-state change point to the actual defect progression, since it is often impractical to physically observe a defect in an operating unit
• HMMs do not represent temporal structure adequately, since their state durations follow an exponential distribution
• An HMM generates a single observation for each state

The fundamental research question about the life of an asset is to predict how much time is left before a failure occurs, given the current asset condition and past operation profile. The time left before observing a failure is usually termed the Remaining Useful Life (RUL) [17]. In some industry applications, especially when a failure is catastrophic (e.g. in nuclear power plants, airplanes, bridges and dams), it is more imperative to predict the chance that an asset operates without failure up to some future time (e.g. the next inspection time) or up to a specified failure threshold, given the current asset condition and past operation profile. Degradation models are one of the suitable approaches for dealing with this type of prediction.

Reliability prediction based on degradation modelling can be an efficient method for evaluating the reliability of systems when observations of failures are rare. Current research shows that there has been increasing interest in the application of degradation models to reliability prediction, and that significant progress has been achieved in applying degradation models in various industrial areas. Degradation models in reliability analysis can be grouped as per Figure 2. Each model is discussed in the following sub-sections.

Figure 2: Classification of degradation models in reliability analysis

2.1 Normal Degradation Models
In general, normal degradation models are utilised to estimate reliability from degradation data obtained under normal operating conditions. Normal degradation models can be classified into two major groups: degradation models with and without stress factors. Degradation models with stress factors (e.g. the stress-strength interference model, the cumulative damage/shock model, and the diffusion process model) are those in which the degradation measure is a function of a defined stress. Degradation models without stress factors (e.g. the general degradation path model, the random process model, linear/nonlinear regression models, the mixture model, and the time series model) are those in which the degradation measure is not a function of a defined stress, and the related reliability is estimated at a fixed level of stress.

2.1.1 General Degradation Path Model
The fundamental notion underlying the general degradation path models is to limit the sample space of the degradation process and assume that all sample functions admit the same functional form but with different parameters [37]. The general degradation path model fits the degradation observations by a regression model with random coefficients. Jiang and Jardine [38] and Zuo et al. [9] present simple general degradation path models. Liao [37] asserts that both simple linear regression and nonlinear regression models are generally employed in degradation path modelling. Linear degradation is utilised in some simple wear processes such as automobile tire wear. However, degradation paths are often nonlinear functions of time, and sometimes linearisation is infeasible. Lu and Meeker [39] introduce a general nonlinear mixed-effects model and a two-stage approach to estimate the model parameters, which are assumed to be multivariate normally distributed. In addition, Lu and Meeker [39] develop a Monte Carlo simulation procedure to calculate an estimate of the distribution function of the time-to-failure, and propose a parametric bootstrap method to set confidence intervals [39, 40]. Lu and Meeker [1, 39] make the following three assumptions about the manner in which the test and measurement should be conducted:
1. Sample assets are randomly selected from a population or production process, and random measurement errors are independent across time and assets.
2. Sample assets are tested in a particular homogeneous environment, such as the same constant temperature.
3. Measurement (or inspection) times are pre-specified, the same across all the test assets, and may or may not be equally spaced in time. This assumption is used for constructing confidence intervals for the time-to-failure distribution via the bootstrap simulation technique.
In the general degradation path model, the observed degradation path $y(t)$ is the asset's actual degradation path $\eta(t)$, a non-decreasing function of time that cannot be observed directly, plus measurement error $\varepsilon$. $D$ is called the threshold and denotes the critical level for the degradation path above which failure is assumed to have occurred. Time $t$ is real time, while the failure time $T$ is defined as the time when the actual path $\eta(t)$ crosses the threshold level $D$. In addition, $t_E$ denotes the planned time to stop the experiment. For each asset in a random sample of size $n$, it is assumed that degradation measurements are available at pre-specified times, generally until $\eta(t)$ crosses the pre-specified critical level $D$ or until time $t_E$, whichever comes first. Based on [9, 10, 38, 39], a general degradation path model can be expressed as:

$$y_{ij} = \eta_{ij} + \varepsilon_{ij} = \eta(t_{ij}; \phi, \theta_i) + \varepsilon_{ij}, \qquad i = 1, \dots, n; \; j = 1, \dots, m_i \qquad (1)$$

where $t_{ij}$ is the time of the $j$th measurement or inspection on the $i$th asset; $\varepsilon_{ij}$ is the measurement error with constant variance $\sigma_\varepsilon^2$; $\eta(t_{ij}; \phi, \theta_i)$ is the actual path of the $i$th asset at time $t_{ij}$ with unknown parameters as listed below; $\phi$ is the vector of fixed-effect parameters, common to all assets; $\theta_i$ is the vector of the $i$th asset's random-effect parameters, representing individual asset characteristics; $\theta_i$ and $\varepsilon_{ij}$ are independent of each other; $m$ is the total number of possible inspections in the experiment; and $m_i$ is the total number of inspections on the $i$th asset, a function of $\theta_i$. It is assumed that $\theta_i$ follows a multivariate distribution function $G(\theta)$, which may depend on some unknown parameters that must be estimated from the data. The distribution function of the failure time $T$ can then be written as:

$$F_T(t) = P(T \le t) = P\big(\eta(t; \phi, \theta) \ge D\big) \qquad (2)$$

Key merits
• The general degradation path model is the simplest degradation model
• The general degradation path model is directly related to statistical analysis of degradation data
• All of the model parameters are randomised to model the random effects across samples [41]
• Parameter estimation of the general nonlinear mixed-effects model is computationally simple compared with the maximum likelihood estimation method
• When a closed-form expression of $F_T(t)$ cannot be obtained easily, the Monte Carlo simulation method can be utilised in the general nonlinear mixed-effects model
Key limitations
• The fundamental assumption of the general degradation path models about the sample space and sample functions of the degradation process is restrictive when the patterns of some sample degradation paths are inconsistent with the others due to slight or intensive variations in the environment in which an individual asset operates [37]
• The assumptions of the general nonlinear mixed-effects model about test and measurement and about the degradation path are quite restrictive
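As a rough illustration of Equations (1) and (2), the sketch below simulates a linear random-coefficients degradation path for a population of assets and estimates the time-to-failure distribution by Monte Carlo simulation. The linear path form, the parameter values and the threshold are assumptions made purely for illustration; they are not taken from any of the cited studies.

```python
import numpy as np

rng = np.random.default_rng(1)

D = 5.0            # assumed failure threshold
phi = 0.8          # assumed fixed-effect degradation rate, common to all assets
theta_sd = 0.2     # assumed spread of the random-effect rates across assets
n_mc = 10_000      # Monte Carlo sample size

# Assumed path form: eta(t; phi, theta_i) = (phi + theta_i) * t
theta = theta_sd * rng.standard_normal(n_mc)
rate = np.clip(phi + theta, 1e-6, None)

# Failure time T solves eta(T) = D for each simulated path (cf. Equation (2))
T = D / rate

# Empirical time-to-failure distribution function
for t in np.linspace(4.0, 9.0, 6):
    print(f"F_T({t:4.1f}) ~ {(T <= t).mean():.3f}")
```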
2.1.2 Random Process Model
The random process model fits the degradation measures at each observation time by a specific distribution with time-dependent parameters [11]. The time-dependent parameter distribution model arises naturally because the degradation measure is a random variable whose distribution is a function of time. In this method, multiple degradation data at a certain time have to be collected and treated as scattered points without orientation [37]. The observations at time $t_i$ are assumed to follow a normal distribution with mean $\mu(t_i)$ and standard deviation $\sigma(t_i)$; linear regression is then used to find the equations for $\mu(t)$ and $\sigma(t)$. If the degradation measure $W(t)$ at time $t$ follows a two-parameter Weibull distribution with constant shape parameter $\beta$ and time-dependent scale parameter $\alpha(t)$, which can be expressed as [9]:

$$F_{W(t)}(w) = 1 - \exp\!\big[-\big(w/\alpha(t)\big)^{\beta}\big] \qquad (3)$$

where $a$ and $b$ are constants specifying the time-dependent scale parameter $\alpha(t)$ and $w$ is the degradation level at time $t$, then for a given threshold level $D$ the reliability function can be described as:

$$R(t) = P\big(W(t) < D\big) = 1 - \exp\!\big[-\big(D/\alpha(t)\big)^{\beta}\big] \qquad (4)$$
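A minimal numerical sketch of the reliability evaluation in Equation (4) is given below. The threshold, the shape parameter and the value of the scale parameter at the time of interest are illustrative assumptions only.

```python
import math

def reliability_weibull_degradation(D, alpha_t, beta):
    """Equation (4): R(t) = P(W(t) < D) when W(t) ~ Weibull(scale alpha(t), shape beta)."""
    return 1.0 - math.exp(-((D / alpha_t) ** beta))

# Assumed values: threshold D, scale alpha(t) already evaluated at the time of interest, shape beta
print(f"R = {reliability_weibull_degradation(D=10.0, alpha_t=6.0, beta=2.5):.4f}")
```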
Yang and Xue [11] model degradation data based on a normal distribution with time-dependent parameters and use regression analysis to estimate the model parameters. Zuo et al. [9] extend the idea to processes with general distributions. Similar to the Yang and Xue approach for the normal random process, the reliability function is evaluated by:

$$R(t) = P\big(W(t) < D\big) = \Phi\!\left(\frac{D - \mu(t)}{\sigma(t)}\right) \qquad (5)$$

Key merit
• This model is suitable for reliability estimation with no assumption about degradation paths

Key limitations
• This model often does not work in practical applications, since there may not be multiple degradation observations at certain time points
• In this model, multiple degradation data at a certain time have to be collected and treated as scattered points without orientation [9]
• Because this approach ignores the orientation of the degradation data and requires multiple observations, it might not be an appropriate model for predicting the remaining useful life of an individual asset

The random process model requires sufficient data points to estimate the parameters of the distribution of the degradation variable at a fixed time point; i.e. one has to acquire several degradation values at the same time point. However, in most real-life situations there may not be multiple degradation observations at some time points. Therefore, in these situations the random process model is not applicable. To overcome this limitation, the linear regression model was introduced as an approach in which each observation can be obtained at a different time point [9]. Some advantages of this model are as follows:
• This model overcomes the limitations of the random process model
• It is a more flexible approach compared with the random process model, because there is no requirement for multiple observations at each fixed time point
• The requirement for sample size is removed [9]
• The mathematical treatment is simple and straightforward

Crk [42, 43] introduces the nonlinear regression model, in which a straight-line regression model can be generalised to:

$$y = f(\mathbf{x}; \boldsymbol{\theta}) + \varepsilon \qquad (6)$$

where $f(\mathbf{x}; \boldsymbol{\theta})$ is a function of the vector of regressor variables $\mathbf{x}$ and the vector of model parameters $\boldsymbol{\theta}$. A nonlinear regression model is one in which at least one of its parameters appears nonlinearly. In other words, a nonlinear relationship occurs if at least one of the derivatives of $f$ with respect to the parameters is a function of at least one of those parameters.

2.1.3 Mixture Model for Hard and Soft Failures
Zuo et al. [9] propose the mixture model for both hard (catastrophic) and soft (degradation) failures. After a degradation test, there are two observation samples: (1) catastrophic failures, and (2) degradation observations. From the catastrophic failure sample, an appropriate cumulative distribution function $F_c(t)$ and a proportion function $p$ can be found. Similarly, from the degradation measurement sample, one can find another appropriate CDF, $F_d(t)$, to describe the soft failures. Therefore, system failures involving both potential catastrophic failures and degradation failures can be modelled by:

$$F(t) = p\,F_c(t) + (1 - p)\,F_d(t) \qquad (7)$$

where $F_c(t)$ and $F_d(t)$ are respectively the CDFs of the catastrophic failure time and the degradation failure time (lifetime), and $p$ is the proportion of components that failed catastrophically within the specified observed degradation value. However, further research is required to analyse the properties of this mixture model for modelling both catastrophic and degradation failures.

Key merit
• It can be used to model both catastrophic and degradation failures

Key limitation
• Extremely limited testing
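A minimal numerical sketch of the mixture idea in Equation (7): the overall failure-time CDF is a weighted combination of a catastrophic-failure CDF and a degradation-failure CDF. Both distributions and the mixing proportion below are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def weibull_cdf(t, scale, shape):
    """CDF of a two-parameter Weibull distribution."""
    return 1.0 - np.exp(-((t / scale) ** shape))

p = 0.3                       # assumed proportion of catastrophic (hard) failures
t = np.linspace(0.0, 20.0, 5)

F_c = weibull_cdf(t, scale=8.0, shape=1.2)    # assumed catastrophic-failure CDF
F_d = weibull_cdf(t, scale=12.0, shape=3.0)   # assumed degradation-failure CDF

# Equation (7): mixture of hard- and soft-failure distributions
F_mix = p * F_c + (1.0 - p) * F_d
print(np.round(F_mix, 3))
```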
2.1.4 Time Series Model
Lu et al. [44] propose a technique to predict individual system performance reliability in real time, considering multiple failure modes. Unlike conventional reliability modelling approaches, which yield statistical results that reflect the reliability characteristics of a population, this technique includes on-line multivariate monitoring and forecasting of selected performance measures and conditional performance reliability estimates. The performance measures across time are treated as a multivariate time series, and the state-space approach is used to model it. The predicted mean vectors and covariance matrix of the performance measures are used to assess system reliability with respect to the conditional performance reliability. Recursive forecasting is performed by adopting the Kalman filtering method. Lu et al. [44] thereby develop a means to forecast and estimate the performance of an individual system in a dynamic environment in real time. This model is practical in applications where critical operational conditions are required, such as system maintenance, tool replacement, and human/machine performance assessment. The concept of system performance reliability prediction with multiple performance measures and multiple failure modes can be briefly explained as follows. If $f(\mathbf{y}; t)$ represents the joint probability density function of the performance variables at time $t$, the overall system reliability considering all failure modes can be evaluated as [44]:

$$R(t) = \int_{\Omega} f(\mathbf{y}; t)\, d\mathbf{y} \qquad (8)$$

where $\Omega$ is the space determined by the failure thresholds of the performance measures.

Key merits
• This model can be applied to predict individual system performance reliability in a dynamic environment
• This model includes on-line multivariate monitoring and forecasting of selected performance measures and conditional performance reliability estimates
• This model is practical in applications where critical operational conditions are required, such as system maintenance, tool replacement, and human/machine performance assessment

Key limitation
• In this model, recursive forecasting is performed by adopting the Kalman filtering technique; however, Kalman filtering has poor performance with high-dimensional data [44]

Most of the existing prognostic techniques use degradation observations to represent the health of the monitored asset and then use approaches such as time series prediction or regression to estimate the asset's future health [18]. For many application areas, it is becoming important to include elements of nonlinearity and non-Gaussianity so as to model the underlying dynamics of physical systems accurately. The state-space approach is convenient for handling multivariate data and nonlinear/non-Gaussian processes, and it supplies an important advantage over traditional time series techniques [45]. Arulampalam et al. [46] review both optimal and suboptimal Bayesian algorithms for nonlinear/non-Gaussian tracking problems, with a focus on particle filters. Particle filters are sequential Monte Carlo methods based on point-mass representations of probability densities, which can be applied to any state-space model; they generalise the traditional Kalman filtering approaches.

2.1.5 Stress-Strength Interference Model
The Stress-Strength Interference (SSI) model is an early and still popular representation of asset reliability. In this model, there is random dispersion in the stress $s$, which results from applied loads; the dispersion in the realised stress can be modelled by a distribution function $F_s(\cdot)$. There is also random dispersion in the inherent asset strength $S$, with density $f_S(\cdot)$. Asset reliability therefore corresponds to the event that strength exceeds stress, and can be described as [47]:

$$R = P(S > s) = \int_{0}^{\infty} f_S(x)\, F_s(x)\, dx \qquad (9)$$

Xue and Yang [48] classify the SSI models into the following three groups:
I. Deterministic Degradation: The degradation measure at time $t$ is described by a deterministic strength degradation function $r(t)$. Thus, the SSI reliability model with strength degradation can be expressed as:

$$R(t) = \exp\!\Big\{-\lambda \int_{0}^{t} \big[1 - F_l\big(r(\tau)\big)\big]\, d\tau\Big\} \qquad (10)$$
where $\lambda$ is the intensity parameter of the Poisson process governing the appearance of loading forces, $F_l(\cdot)$ is the CDF of the loading force, and $r(t)$ is the strength at time $t$.
II. Random Strength Degradation Process: Here the strength $r(t)$ is a random process, and the SSI reliability is given by Equation (11), obtained by averaging the expression above over the strength process.
III. Upper and Lower Bounds: Suppose the loading force and the strength have normal distributions. Then the upper and lower SSI reliability bounds are given by Equations (12) and (13), where $\Phi$ is the CDF of the standard normal (Gaussian) distribution, $\mu_r$ and $\sigma_r$ are the mean and standard deviation of the strength, and $\mu_l$ and $\sigma_l$ are the mean and standard deviation of the loading force [49].

Xue and Yang's model [48] presents an improved SSI reliability model involving both stochastic loading and strength aging degradation; however, their model assumes a homogeneous Poisson loading process with normally distributed load amplitudes. Huang and Askin [50] assert that there is no comprehensive research involving both stochastic loading and stochastic strength aging degradation in the existing literature. As a result, Huang and Askin [50] propose a generalised SSI reliability model, which is classified into the following two groups:
A. For deterministic strength aging degradation, the generalised SSI reliability model can be expressed as Equation (14), where the strength aging degradation model is a deterministic function of time, the intensity function of the loading process and the CDF of the loading force are given, and the failure probability given that a stochastic load appears at time $t$ follows from these. Clearly, the derived SSI reliability model for deterministic strength aging degradation belongs to the generalised exponential distribution family.
B. For stochastic strength aging degradation, the vector in the strength aging degradation model, which contains multiple random variables, represents a random variable vector with a multivariate joint probability density function. The generalised SSI reliability model under both stochastic loading and stochastic strength aging degradation is established from Equation (15).

Key merits
• The SSI model is popular for randomly dispersed stress (loads)
• The SSI model can be applied to reliability estimation in wear-out, fatigue, and crack growth with static or dynamic loading forces
• The SSI model can be applied with any kind of strength aging degradation model
• The SSI model provides sensitivity information during the reliability analysis
• The SSI model is traditionally used in structural engineering; however, with a better understanding of strength and stress, it can be applied in many other engineering disciplines for reliability analysis

Key limitations
• The SSI model is only applicable to situations in which the external loading can exceed the item strength (or capacity)
• The SSI model gives only the reliability at one point in time and fails to provide an explicit interpretation of the reliability profile over time [41]
• There is no comprehensive research involving the SSI model for dynamic applications
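For the common special case in which both stress and strength are normally distributed (the setting of the bounds in Equations (12) and (13)), the interference reliability has a well-known closed form. The sketch below evaluates it for assumed means and standard deviations; the numbers are illustrative only.

```python
import math

def ssi_reliability(mu_strength, sd_strength, mu_stress, sd_stress):
    """R = P(strength > stress) for independent, normally distributed strength and stress."""
    z = (mu_strength - mu_stress) / math.sqrt(sd_strength**2 + sd_stress**2)
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed values for illustration only
print(f"R = {ssi_reliability(mu_strength=50.0, sd_strength=5.0, mu_stress=35.0, sd_stress=4.0):.4f}")
```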
2.1.6 Cumulative Damage/Shock Model

The conceptual similarity between the cumulative damage/shock model and the SSI model is quite noticeable. In the SSI model, stress is treated as constant and strength as variable. However, in the cumulative damage/shock model, the strength (damage threshold) is a constant quantity and the stress (damage) is variable. The cumulative damage/shock model is based on the cumulative damage theory for a degradation process exposed to discrete stresses (e.g. temperature cycling and random shocks), and the state of the process is also assumed to be discrete. Under the assumptions of the cumulative damage/shock model, an asset is subjected to shocks that occur randomly in time. Each shock imparts a random quantity of damage, $X_i$, to the asset, which fails when a capacity or endurance threshold is exceeded. The most common assumption of this model is that the shocks occur according to a Poisson process with intensity $\lambda$, and that the amounts of damage per shock are independently and identically distributed according to some arbitrarily selected common distribution $F$.
If $R(t)$ denotes the reliability over time, $N(t)$ is the number of shocks that occur over the interval $(0, t]$, and $x^{*}$ is the pre-specified endurance threshold, the reliability function is [47, 51]:

$R(t) = \sum_{k=0}^{\infty} e^{-\lambda t}\, \frac{(\lambda t)^{k}}{k!}\, F^{(k)}(x^{*})$   (16)

Note that the sum is taken over all possible numbers of shocks, and the notation $F^{(k)}$ denotes the $k$-fold convolution of $F$, and thus the distribution of the sum of $k$ shock magnitudes. By convention, $F^{(0)}(x) = 1$ for all values of $x \ge 0$ [47].
Key merits
The cumulative damage/shock model is widely applied in the field of asset life prediction, for example to fatigue failures in aircraft fuselages
The cumulative damage/shock model is applied for a degradation process exposed to discrete stress
A generalisation of the cumulative damage/shock model is available, termed the diffusion process model [47, 52]
Key limitation
This model applies only to discrete sample paths
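As a concrete reading of equation (16), the following sketch evaluates the compound-Poisson shock reliability for the common special case of exponentially distributed shock magnitudes, where the k-fold convolution is a Gamma distribution. The intensity, mean shock size, and threshold are hypothetical values chosen only for illustration.

```python
# Minimal sketch of the cumulative damage/shock model (equation (16)) assuming
# Poisson shock arrivals and exponentially distributed damage per shock, so that
# the k-fold convolution F^(k) is a Gamma(k) distribution. All parameters are hypothetical.
import numpy as np
from scipy.stats import poisson, gamma

lam = 0.5          # shock intensity (shocks per unit time)
mean_damage = 2.0  # mean damage per shock (exponential)
threshold = 15.0   # endurance threshold x*

def reliability(t, k_max=200):
    k = np.arange(k_max + 1)
    p_k = poisson.pmf(k, lam * t)                  # probability of k shocks in (0, t]
    # F^(k)(x*): P(sum of k exponential damages <= x*); F^(0)(x*) = 1 by convention
    conv = np.where(k == 0, 1.0,
                    gamma.cdf(threshold, a=np.maximum(k, 1), scale=mean_damage))
    return float(np.sum(p_k * conv))

for t in (5.0, 10.0, 20.0):
    print(f"R({t}) = {reliability(t):.4f}")
```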
2.1.7 Other Commonly-Used Degradation Models

In contrast to the cumulative damage/shock model, continuous-time models are appropriate for modelling continuous degradation processes. The Brownian motion (or Wiener process) and Gamma process models are examples of such models. Wiener process and Gamma process models are employed to describe continuous degradation because each degradation path is governed by an underlying continuous-time stochastic process. Current research shows that these two models, as well as Markov models, are widely applied as degradation modelling techniques. These three models are discussed in this section.

Markov models play a significant role in reliability estimation. In 1907, the Russian mathematician A. A. Markov proposed a special sort of stochastic process whose future probability behaviour is uniquely determined by its present state; that is, its behaviour is non-hereditary, or memory-less. The Markov chain model is a stochastic process with a discrete state space and discrete time space. When the time (index parameter) space is continuous, it is referred to as the Markov process model [53]. The Markov process model is a stochastic process with the property that, given the value of $X(s)$, the values of $X(t)$ for $t > s$ are independent of the values of $X(u)$ for $u < s$. That is, the conditional distribution of the future $X(t)$, given the present $X(s)$ and the past $X(u)$, $u < s$, is independent of the past. In the Markov process model a system may stay in one of a number of states, forming a so-called Markov chain [12]. These states are, for example, failed and not failed, or they may be defined by stages of degradation [54]. The sojourn time in state $i$ is exponentially distributed with parameter $\lambda_i$, and the transition probabilities $p_{ij}$ to make a jump to state $j$ when leaving state $i$ are specified in a probability matrix and are independent of the history of the process [54, 55]. This property is called the Markov property and, apart from the fact that the sojourn times are exponentially distributed, it is a second kind of lack-of-memory property [54]. The conventional Markov process model has been extended to the semi-Markov process model and the hidden Markov (chain or process) model to tackle more general reliability analysis problems [5, 6, 8, 56].
The semi-Markov process is a model which joins together the theory of renewal processes and Markov chains. In this model, an $n$-state Markov chain with the transition matrix $P$ is considered [54, 57]. However, if the time spent in state $i$ is followed by a jump to state $j$, then this sojourn time has the probability density function $f_{ij}(t)$. The sojourn times are mutually independent [54].
Key merits
These models are able to model numerous system designs and failure scenarios
They are suitable models for incomplete data sets
They are computationally efficient once developed
Key limitations
These models need a large amount of data for training
These models assume a single monotonic, non-temporal failure degradation pattern
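To make the Markov chain idea concrete, the sketch below uses a small hypothetical four-state chain (three degradation stages plus an absorbing failed state): reliability after n steps is simply the probability of not having reached the failed state, obtained by propagating the state-probability vector through the transition matrix. The transition probabilities are assumptions made only for this example.

```python
# Minimal sketch of a discrete-time Markov chain degradation model.
# States: 0 = good, 1 = degraded, 2 = badly degraded, 3 = failed (absorbing).
# The transition probabilities below are hypothetical.
import numpy as np

P = np.array([
    [0.90, 0.08, 0.02, 0.00],
    [0.00, 0.85, 0.10, 0.05],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
])

state = np.array([1.0, 0.0, 0.0, 0.0])  # start in the good state
for n in range(1, 11):
    state = state @ P                    # propagate one time step
    reliability = 1.0 - state[3]         # probability of not being failed
    print(f"step {n:2d}: R = {reliability:.4f}")
```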
An example of a continuous-time Markov process with independent increments is the Brownian motion with drift, also termed the Gaussian or Wiener process [5, 52]. The standard Brownian motion model is defined as a stochastic process $\{B(t),\, t \ge 0\}$ with the following properties [3, 37, 58-60]:
a) $B(0) = 0$;
b) $\{B(t)\}$ is a continuous process having stationary and independent increments; that is, the increments over non-overlapping time intervals are independent, normally distributed random variables;
c) for all $t > 0$, $B(t)$ follows a normal distribution with mean $0$ and variance $\sigma^2 t$.
The Brownian motion model has an additive effect on the degradation process, in the form of the equation below [37]:

$X(t) = x_0 + \mu(t) + \sigma B(t)$   (17)

where $x_0$ is the initial degradation value, $\mu(t)$ is the trend, and $\sigma$ is a constant diffusion parameter. Since each increment is not necessarily positive, the reliability function is the same as that defined in the equation of the reliability function in the random process model. The linear form of the preceding equation, assuming $\mu(t) = \mu t$, has been researched widely in the fields of finance and reliability prediction. In the context of structural reliability, a characteristic feature of this process is that a structure’s resistance alternately increases and decreases, similar to the exchange value of a share [5].

Key merits
Explicit expressions are available when stress and strength are assumed to be independent Brownian motions with drift
Maximum likelihood and Bayesian estimation of the parameters of the Brownian stress-strength model are available
Key limitation
This model is inadequate for modelling degradation that is monotone [61]
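A small simulation sketch of the degradation path in equation (17), with a linear trend, illustrates how first-passage times of a hypothetical failure threshold can be estimated empirically. The drift, diffusion, and threshold values are assumptions made only for the example.

```python
# Minimal sketch: Wiener (Brownian motion with drift) degradation paths, equation (17)
# with a linear trend mu*t, and an empirical first-passage-time estimate of reliability.
# Drift, diffusion and threshold values are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
x0, mu, sigma = 0.0, 0.4, 0.6      # initial value, trend slope, diffusion parameter
threshold, dt, t_end = 10.0, 0.1, 60.0
n_paths, n_steps = 5_000, int(t_end / dt)

increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
paths = x0 + np.cumsum(increments, axis=1)            # X(t) sampled on a time grid

crossed = paths >= threshold
first_cross = np.where(crossed.any(axis=1),
                       (crossed.argmax(axis=1) + 1) * dt, np.inf)

t_check = 30.0
print(f"estimated R({t_check}) = {np.mean(first_cross > t_check):.4f}")
```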
The Gamma process model was introduced into the area of reliability in 1975. It has been increasingly used as a degradation process in maintenance optimisation models [62]. As degradation is generally an uncertain and non-decreasing process, it can be regarded as a Gamma process [63]. The Gamma process is a stochastic process with independent, non-negative increments having a Gamma distribution with an identical scale parameter [64]. The Gamma process is a proper model for degradation occurring randomly in time and is suitable for describing gradual damage accumulated through continuous use. The Gamma process is expressed in terms of the following Gamma density [5, 33, 63, 65, 66]:

$\mathrm{Ga}(x \mid v, u) = \frac{u^{v}}{\Gamma(v)}\, x^{\,v-1} e^{-u x}\, I_{(0,\infty)}(x)$   (18)
where $\Gamma(v) = \int_0^{\infty} z^{\,v-1} e^{-z}\, dz$ is the Gamma function for $v > 0$, and $I_{(0,\infty)}(x) = 1$ for $x \in (0,\infty)$ and $I_{(0,\infty)}(x) = 0$ otherwise. Let $v(t)$ be a non-decreasing, right-continuous, real-valued function for $t \ge 0$ with $v(0) \equiv 0$. The Gamma process with shape function $v(t) > 0$ and scale parameter $u > 0$ is a continuous-time stochastic process $\{X(t),\, t \ge 0\}$ with the following properties [5]:
i. $X(0) = 0$ with probability one;
ii. $X(\tau) - X(t) \sim \mathrm{Ga}(v(\tau) - v(t),\, u)$ for all $\tau > t \ge 0$;
iii. $X(t)$ has independent increments.
Here $X(t)$ denotes the degradation at time $t$, $t \ge 0$, and a component is said to fail when its deteriorating resistance, denoted by $R_0 - X(t)$, drops below the stress $s$; let $y = R_0 - s$ be the corresponding failure threshold for the degradation, and let $T$ be the time at which failure occurs. Thus, the cumulative distribution function of time to failure is [63]:

$F(t) = \Pr\{T \le t\} = \Pr\{X(t) \ge y\} = \frac{\Gamma\!\big(v(t),\, y u\big)}{\Gamma\!\big(v(t)\big)}$   (19)

where $\Gamma(a, x) = \int_x^{\infty} z^{\,a-1} e^{-z}\, dz$ is the incomplete Gamma function for $x \ge 0$ and $a > 0$. The maximum likelihood method applies in order to estimate the parameters.
Key merits
The mathematical calculations for modelling degradation through Gamma processes are relatively straightforward
The Gamma process is suited to modelling stochastic degradation for optimal maintenance (e.g. time-based preventive maintenance and condition-based preventive maintenance)
The Gamma process is also suitable for modelling the temporal variability of degradation
The Gamma process is suited to the stochastic modelling of monotonic and gradual degradation
The Gamma process is suitable for modelling gradual damage that accumulates monotonically over time in a sequence of tiny increments, such as wear, fatigue, corrosion, crack growth, creep, swell, and a degrading health index
Maximum likelihood, method of moments, Bayesian updating, and expert judgement estimation methods for the Gamma process model are available
Both the special case of the Gamma process termed the Lévy process and the generalised Gamma process are available [4, 66]
An extension of the Gamma process model is available, termed the weighted Gamma process [5]
Key limitations
This model is not suitable for modelling usage such as damage due to sporadic shocks [5, 63]
The Gamma process model is mainly applied to maintenance decision problems for single components rather than for systems
The Gamma process is not a suitable model for long-term prediction; it is a suitable model for a component’s life within each maintenance cycle
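Following the definition and equation (19) above, this sketch simulates a stationary Gamma process (linear shape function v(t) = a·t) and compares the empirical lifetime distribution with the incomplete-Gamma expression for the failure-time cdf. The shape rate, scale, and failure threshold are hypothetical.

```python
# Minimal sketch of a stationary Gamma process degradation model with shape function
# v(t) = a*t and scale parameter u (all parameter values hypothetical), compared with
# the closed-form failure-time cdf F(t) = Gamma(v(t), y*u) / Gamma(v(t)) of equation (19).
import numpy as np
from scipy.special import gammaincc

rng = np.random.default_rng(2)
a, u = 0.8, 2.0            # shape rate and scale parameter
y = 5.0                    # failure threshold on the degradation scale
dt, t_end = 0.25, 30.0
times = np.arange(dt, t_end + dt, dt)

# Simulate independent Gamma-distributed increments: X(t+dt) - X(t) ~ Ga(a*dt, u)
incr = rng.gamma(shape=a * dt, scale=1.0 / u, size=(20_000, times.size))
paths = np.cumsum(incr, axis=1)

t_check = 10.0
idx = np.searchsorted(times, t_check)
r_sim = np.mean(paths[:, idx] < y)                   # empirical P(X(t) < y)
r_closed = 1.0 - gammaincc(a * t_check, y * u)       # regularised upper incomplete Gamma
print(f"simulated R({t_check}) = {r_sim:.4f}, closed form = {r_closed:.4f}")
```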
2.2 Accelerated Degradation Models

Accelerated degradation models make inferences about reliability at normal conditions using degradation data obtained at accelerated time or stress conditions. In real-life situations and industry applications, a degradation process may be very slow at normal stress levels, and the time-to-failure correspondingly long [14, 67]. Estimating the failure time distribution or the long-term performance of components of high-reliability products is particularly difficult [10, 13, 68]. Therefore, to obtain data quickly from a degradation test, it is often possible to employ an accelerated life test [69, 70]. This test is applied by increasing the level of acceleration variables such as vibration amplitude, temperature, corrosive media, load, voltage, and pressure [70, 71]. However, the accelerated life test is a costly approach. Accelerated degradation models consist of physics-based models and statistics-based models. The physics-based models are the Arrhenius model, the Eyring model, and the Inverse Power model. The Arrhenius model is used when the damage mechanism is driven by temperature (especially for dielectrics, semi-conductors, battery cells, lubricants, plastics, etc.). The Eyring model is used for accelerated life tests with respect to both thermal and non-thermal variables. The Inverse Power model is widely used to analyse accelerated life test data of many electronic and mechanical components, such as insulating fluids, capacitors, bearings, and spindles, in order to estimate their service lives when the accelerating operating parameters are non-thermal (e.g. speed, load, corrosive medium, and vibration amplitude); this model describes the damage rate under a constant stress. This paper does not attempt to review the accelerated degradation models in detail as they are already covered by the existing literature. Nelson [72] extensively describes both the physics-based models and the statistics-based models. Furthermore, the statistical models with covariates (e.g. the accelerated failure time model) are reviewed in greater detail by Gorjian et al. [73].
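As a simple illustration of the physics-based class, the sketch below computes the standard Arrhenius acceleration factor between an elevated test temperature and the use temperature, AF = exp[(Ea/k)(1/T_use − 1/T_test)]. The activation energy and temperatures are hypothetical values for illustration only.

```python
# Minimal sketch: Arrhenius acceleration factor between a use temperature and an
# elevated test temperature. Activation energy and temperatures are hypothetical.
import math

BOLTZMANN_EV = 8.617e-5      # Boltzmann constant in eV/K
ea = 0.7                     # activation energy in eV (hypothetical)
t_use_c, t_test_c = 40.0, 125.0

t_use = t_use_c + 273.15     # convert to kelvin
t_test = t_test_c + 273.15

af = math.exp((ea / BOLTZMANN_EV) * (1.0 / t_use - 1.0 / t_test))
print(f"acceleration factor ~ {af:.1f}")
print(f"1000 h at {t_test_c:.0f} C corresponds to roughly {1000 * af:.0f} h at {t_use_c:.0f} C")
```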
3 POTENTIAL APPLICATIONS
In Section 2, the degradation models used in reliability analysis were presented, and the merits and limitations of each model were discussed on the basis of its underlying assumptions and data requirements. The aim of this section is to tabulate this information in order to show potential applications of these degradation models. Table 2 presents key notes on the circumstances and conditions for choosing these models in asset health and reliability analysis.

Table 2: Potential applications for the degradation models
General degradation path model
• This model is suitable for fitting the degradation observations by both linear and nonlinear regression models
• This model is suitable for sample assets which are tested in a particular homogeneous environment

Random process model
• This model is suitable for reliability estimation with no assumption about degradation paths

Linear and nonlinear regression models
• This model is appropriate for multiple observations at certain time points
• This model is more flexible than the above two models and is applied when observations are obtained at different time points

Mixture model
• This model has had extremely limited testing; thus further research is needed to analyse its properties for modelling both soft and hard failures

Time series model
• This model is suitable for predicting individual system performance reliability with multiple performance measures in a dynamic environment

SSI model
• This model is appropriate for reliability estimation under random dispersion of stress
• This model is applied in situations where the external loading is higher than the item strength
• This model can be applied with any kind of strength aging degradation model

Cumulative damage/shock model
• This model is applied for a degradation process exposed to discrete stress
• The generalisation of this model can be applied to continuous sample paths

Markov models
• These models are the initial degradation models and are applied widely in reliability analysis situations

Wiener process model
• This model is not suitable for modelling degradation which is monotone increasing; however, it can be effective for modelling the degradation process considering maintenance effects

Gamma process model
• This model is suitable for the stochastic modelling of monotonic and gradual degradation
• This model can be applied to degradation processes in maintenance optimisation models

4 CONCLUSIONS
Degradation is the reduction in performance, reliability, and life span of assets. Most assets degrade as they age or deteriorate as a result of factors that are termed covariates. Many failure mechanisms can be traced to an underlying degradation process. Assets fail when their level of degradation reaches a specified failure threshold; however, what the threshold should be and how it should be defined has not been made clear. In real-life situations and industry applications, an important requirement for estimating the remaining useful life of assets is to establish their current state of degradation. Degradation measures often provide more information than failure time data for assessing and predicting the reliability of assets. Moreover, degradation is a type of stochastic process and, as a result, it can be modelled using several approaches. This has spawned a great amount of literature in the field of asset life and reliability prediction. Some of these degradation models are reviewed in Section 2, where the merits, limitations, and applications of each model in reliability analysis are discussed. By aggregating the information on the merits and limitations of each model, this review paper provides key notes about the circumstances and conditions for choosing suitable degradation models in asset life and reliability analysis. This paper illustrates that preliminary studies on degradation modelling adopted simple probabilistic models, such as the general degradation path model or the random process model, which focus more on the statistics of cross-sectional degradation data. Subsequently, more sophisticated stochastic models such as the cumulative damage/shock model, the diffusion process model, the Brownian motion model, and Markov models were applied in degradation modelling. In recent years, stochastic models with a rich probabilistic structure and simple methods for statistical inference, such as the Gamma process model, have been employed to model the degradation process.
5 REFERENCES
1
Meeker WQ & Escobar LA. (1998) Statistical methods for reliability data: J. Wiley.
2
Meeker WQ & Escobar LA. (1993) A review of recent research and current issues in accelerated testing. International Statistical Review, 61(1), 147-168.
3
Singpurwalla ND. (2006) The hazard potential: introduction and overview. Journal of the American Statistical Association, 101(476), 1705-1717.
4
Singpurwalla ND. (1995) Survival in dynamic environments. Statistical Science 10(1), 86–103.
5
Van Noortwijk JM. (2007) A survey of the application of gamma processes in maintenance. Reliability Engineering & System Safety, In Press, Corrected Proof, 20.
6
Ma L. (2007) Condition monitoring in engineering asset management. APVC. p. 16.
7
Rausand M. (1998) Reliability centered maintenance. Reliability Engineering & System Safety, 60(2), 121-132.
8
Blischke WR & Murthy DNP. (2000) Reliability : modeling, prediction, and optimization. New York: Wiley.
9
Zuo MJ, Renyan J & Yam RCM. (1999) Approaches for reliability modeling of continuous-state devices. IEEE Transactions on Reliability, 48(1), 9-18.
10
Meeker WQ, L. A. Escobar & Lu CJ. (1998) Accelerated degradation tests: Modeling and analysis. Technometrics, 40(2), 89.
11
Yang K & Xue J. (1996) Continuous state reliability analysis. Annual Reliability and Maintainability Symposium. pp. 251-257.
12
Montoro-Cazorla D & Perez-Ocon R. (2006) Reliability of a system under two types of failures using a Markovian arrival process. Operations Research Letters, 34(5), 525-530.
13
Yang G. (2002) Environmental-stress-screening using degradation measurements. IEEE Transactions on Reliability, 51(3), 288-293.
14
Yang K & Yang G. (1998) Degradation reliability assessment using severe critical values. International Journal of Reliability, Quality and Safety Engineering, 5(1), 85-95.
15
Borris S. (2006) Total productive maintenance. New York: McGraw-Hill.
16
Endrenyi J & Anders GJ. (2006) Aging, maintenance, and reliability - approaches to preserving equipment health and extending equipment life. Power and Energy Magazine, IEEE, 4(3), 59-67.
17
Jardine AKS, Lin D & Banjevic D. (2006) A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, 20(7), 1483-1510.
18
Heng A, Zhang S, Tan ACC & Mathew J. (2008) Rotating machinery prognostics: State of the art, challenges and opportunities. Mechanical Systems and Signal Processing, In Press, Corrected Proof.
19
Vachtsevanos GJ, Lewis FL, Roemer M, Hess A & Wu B. (2006) Intelligent fault diagnosis and prognosis for engineering systems. Hoboken, NJ: Wiley.
20
Zhang L, Li X & Yu J. (2006) A review of fault prognostics in condition based maintenance. Sixth International Symposium on Instrumentation and Control Technology: Signal Analysis, Measurement Theory, Photo-Electronic Technology, and Artificial Intelligence China. pp. 6357521-6. SPIE.
21
Sikorska J. (2008) Prognostic modelling options for remaining useful life estimation: CASWA Pty Ltd & University of Western Australia.
22
Jiang R & Yan X. (2007) Condition monitoring on diesel engines. 25.
23
Kothamasu R, Huang S & VerDuin W. (2006) System health monitoring and prognostics — a review of current paradigms and practices. The International Journal of Advanced Manufacturing Technology, 28(9), 1012-1024.
24
Katipamula S & Brambley MR. (2005) Methods for fault detection, diagnostics, and prognostics for building systems—a review, part I. International Journal of HVAC&R Research, 11(1), 3-25.
25
Goh KM, Tjahjono B, Baines TS & Subramaniam S. (2006) A review of research in manufacturing prognostics. IEEE International Conference on Industrial Informatics. pp. 1-6.
26
Ma Z & Krings AW. (2008) Survival analysis approach to reliability, survivability and prognostics and health management. IEEE Aerospace Conference. pp. 1-20.
27
Pusey HC & Roemer MJ. (1999) An assessment of turbomachinery condition monitoring and failure prognosis technology. The Shock and Vibration Digest, 31(5), 365-371.
28
Weibull W. (1951) A statistical distribution function of wide applicability. Journal of Applied Mechanics, 18(3), 293-297.
29
Chelidze D & Cusumano JP. (2004) A dynamical systems approach to failure prognosis. Transactions of the ASME, 126, 2.
30
Chen A & Wu GS. (2007) Real-time health prognosis and dynamic preventive maintenance policy for equipment under aging Markovian deterioration. International Journal of Production Research, 45(15), 3351.
31
Luo J, Bixby A, Pattipati K, Liu Q, Kawamoto M & Chigusa S. (2003) An interacting multiple model approach to model-based prognostics. Bixby A (Ed.). IEEE International Conference on Systems, Man and Cybernetics. pp. 189-19.
32
Kulkarni SS & Achenbach JD. (2008) Structural health monitoring and damage prognosis in fatigue. Structural Health Monitoring, 7(1), 37-49.
33
Wang W & Zhang W. (2008) An asset residual life prediction model based on expert judgments. European Journal of Operational Research, 188(2), 496-505.
34
Jardine AKS. (2002) Optimizing condition based maintenance decisions. Annual Reliability and Maintainability Symposium. pp. 90-97. IEEE.
35
Eleuteri A, Tagliaferri R, Milano L, De Placido S & De Laurentiis M. (2003) A novel neural network-based survival analysis model. Neural Networks, 16(5-6), 855-864.
36
Li C, Tao L & Yongsheng B. (2007) Condition residual life evaluation by support vector machine. 8th International Conference on Electronic Measurement and Instruments. pp. 441-445.
37
Liao H. (2004) Degradation models and design of accelerated degradation testing plans. United States -- New Jersey: Rutgers The State University of New Jersey - New Brunswick.
38
Jiang R & Jardine AKS. (2008) Health state evaluation of an item: A general framework and graphical representation. Reliability Engineering & System Safety, 93(1), 89-99.
39
Lu CJ & Meeker WQ. (1993) Using degradation measures to estimate a time-to-failure distribution. Technometrics, 35(2), 161-174.
40
Engel SJ, Gilmartin BJ, Bongort K & Hess A. (2000) Prognostics, the real issues involved with predicting life remaining. IEEE Aerospace Conference Proceedings pp. 457-469.
41
Yuan X. (2007) Stochastic modeling of deterioration in nuclear power plant components. Ontario -- Canada: University of Waterloo.
42
Crk V. (2000) Reliability assessment from degradation data. Annual Reliability and Maintainability Symposium. pp. 155-161.
43
Crk V. (1998) Component and system reliability assessment from degradation data. United States -- Arizona: The University of Arizona.
44
Lu S, Lu H & Kolarik WJ. (2001) Multivariate performance reliability prediction in real-time. Reliability Engineering & System Safety, 72(1), 39-45.
45
Gordon NJ, Salmond DJ & Smith AFM. (1993) Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings "F" Radar and Signal Processing. pp. 107-113.
46
Arulampalam MS, Maskell S, Gordon N & Clapp T. (2002) A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2), 174-188.
47
Nachlas JA. (2005) Reliability engineering : probabilistic models and maintenance methods. Boca Raton: Taylor & Francis.
48
Xue J & Yang K. (1997) Upper and lower bounds of stress-strength interference reliability with random strength degradation. IEEE Transactions on Reliability, 46(1), 142-145.
49
Sweet AL. (1990) On the hazard rate of the lognormal distribution. IEEE Transactions on Reliability, 39(3), 325-328.
50
Huang W & Askin RG. (2004) A generalized SSI reliability model considering stochastic loading and strength aging degradation. IEEE Transactions on Reliability, 53(1), 77-82.
51
Esary JD & Marshall AW. (1973) Shock models and wear processes. The Annals of Probability, 1(4), 627-649.
52
Lemoine AJ & Wenocur ML. (1985) On failure modeling. Naval Research Logistics, 32(3), 497-508.
53
Li N, Xie W-C & Haas R. (1996) Reliability-based processing of Markov chains for modeling pavement network deterioration. Transportation Research Record, 1524(-1), 203-213.
54
Pijnenburg M. (1991) Additive hazards models in repairable systems reliability. Reliability Engineering and System Safety, 31(3), 369-390.
55
Kallen MJ & van Noortwijk JM. (2006) Optimal periodic inspection of a deterioration process with sequential condition states. International Journal of Pressure Vessels and Piping, 83(4), 249-255.
56
Welte TM, Vatn J & Heggest J. (2006) Markov state model for optimization of maintenance and renewal of hydro power components. International Conference on Probabilistic Methods Applied to Power Systems pp. 1-7.
57
Papazoglou IA. (2000) Semi-Markovian reliability models for systems with testable components and general test/outage times. Reliability Engineering & System Safety, 68(2), 121-133.
58
Ross SM. (1996) Stochastic processes (2nd ed.). New York: Wiley.
59
Whitmore G & Schenkelberg F. (1997) Modelling accelerated degradation data using wiener diffusion with a time scale transformation. Lifetime Data Analysis, 3(1), 27-45.
60
Bagdonavicius V & Nikulin MS. (2001) Estimation in degradation models with explanatory variables. Lifetime Data Analysis, 7(1), 85-103.
61
Doksum KA. (1991) Degradation rate models for failure time and survival data. CWI Quarterly, 4, 195-203.
62
Singpurwalla ND. (2006) Reliability and risk : a Bayesian perspective. New York: J. Wiley & Sons.
63
Van Noortwijk JM, Van der Weide JAM, Kallen MJ & Pandey MD. (2007) Gamma processes and peaks-over-threshold distributions for time-dependent reliability. Reliability Engineering & System Safety, 92(12), 1651-1658.
64
Van Noortwijk JM & Frangopol DM. (2004) Two probabilistic life-cycle maintenance models for deteriorating civil infrastructures. Probabilistic Engineering Mechanics, 19(4), 345-359.
65
Lawless J & Martin C. (2004) Covariates and random effects in a Gamma process model with application to degradation and failure. Lifetime Data Analysis, 10(3), 213.
66
Singpurwalla N. (1997) Gamma processes and their generalizations: an overview. Engineering Probabilistic Design and Maintenance for Flood Protection, 67–75.
67
Tang LC & Shang CD. (1995) Reliability prediction using nondestructive accelerated-degradation data: case study on power supplies. IEEE Transactions on Reliability 44(4), 562-566.
68
Meeker WQ & LuValle MJ. (1995) An accelerated life test model based on reliability kinetics. Technometrics, 37(2), 133-146.
69
Zhang C, Chuckpaiwong I, Liang SY & Seth BB. (2002) Mechanical component lifetime estimation based on accelerated life testing with singularity extrapolation. Mechanical Systems and Signal Processing, 16(4), 705-718.
70
Shiau J-JH & Lin H-H. (1999) Analyzing accelerated degradation data by nonparametric regression. IEEE Transactions on Reliability, 48(2), 149-158.
71
Pham H. (2006) Reliability modeling, analysis and optimization. Singapore: World Scientific.
72
Nelson W. (1990) Accelerated testing: statistical models, test plans, and data analyses. New York: John Wiley & Sons.
73
Gorjian N, Ma L, Mittinty M, Yarlagadda P & Sun Y. (2009) A review on reliability models with covariates. The 4th World Congress on Engineering Asset Management, Athens, Greece. Springer.
Acknowledgement The authors gratefully acknowledge the financial support provided by both the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), established and supported under the Australian Government’s Cooperative Research Centres Programme, and the School of Engineering Systems of Queensland University of Technology (QUT).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A REVIEW ON RELIABILITY MODELS WITH COVARIATES
Nima Gorjian a, b, Lin Ma a, b, Murthy Mittinty c, Prasad Yarlagadda b, Yong Sun a, b
a Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), Brisbane, Australia
b School of Engineering Systems, Queensland University of Technology (QUT), Brisbane, Australia
c School of Mathematical Sciences, Queensland University of Technology (QUT), Brisbane, Australia
Modern Engineering Asset Management (EAM) requires the accurate assessment of current and the prediction of future asset health condition. Suitable mathematical models that are capable of predicting Time-to-Failure (TTF) and the probability of failure in future time are essential. In traditional reliability models, the lifetime of assets is estimated using failure time data. However, in most real-life situations and industry applications, the lifetime of assets is influenced by different risk factors, which are called covariates. The fundamental notion in reliability theory is the failure time of a system and its covariates. These covariates change stochastically and may influence and/or indicate the failure time. Research shows that many statistical models have been developed to estimate the hazard of assets or individuals with covariates. An extensive amount of literature on hazard models with covariates (also termed covariate models), including theory and practical applications, has emerged. This paper is a state-of-the-art review of the existing literature on these covariate models in both the reliability and biomedical fields. One of the major purposes of this expository paper is to synthesise these models from both industrial reliability and biomedical fields and then contextually group them into non-parametric and semiparametric models. Comments on their merits and limitations are also presented. Another main purpose of this paper is to comprehensively review and summarise the current research on the development of the covariate models so as to facilitate the application of more covariate modelling techniques into prognostics and asset health management. Key Words: Covariate model, Hazard, Reliability analysis, Survival analysis, Asset health, Life prediction 1
INTRODUCTION
In recent years, the emphasis on prognostics and asset life prediction has increased in the area of Engineering Asset Management (EAM) due to longer-term planning and budgeting requirements. One essential scientific research problem in EAM is the development of mathematical models that are capable of predicting Time-To-Failure (TTF) and the probability of failure in future time. In most real-life situations and industry applications, the hazard (failure rate) of assets is influenced and/or indicated by different risk factors, which are often termed covariates. Probabilistic modelling of asset lifetimes using covariates (i.e. diagnostic factors and operating environment factors) is one of the indispensable scientific research problems for prognostics and asset life prediction. Until now, a number of statistical models have been developed to estimate the hazard of an asset/individual with covariates in both the reliability and biomedical fields. Most of these models are developed based on the Proportional Hazard Model (PHM) theory, which was proposed by Cox in 1972 [1]. The basic theory of these covariate models is to build the baseline hazard function using historical failure data and the covariate function using covariate data. A few review papers on covariate models have been reported in the literature. Kumar and Klefsjo [2] reviews the existing literature on the proportional hazard model. Kumar and Westberg [3] provides a review of some reliability models for analysing the effect of operating conditions on equipment lifetime. Ma [4] discusses new research directions for Condition Monitoring (CM) and reviews some prognostic models in EAM. Almost all existing covariate models have been applied in the biomedical field; however, some of them have also been applied in the reliability area. This expository paper is a collective review of the existing literature on covariate models in both the reliability and biomedical fields. In this paper, each individual covariate model has been contextually grouped into non-parametric and semi-parametric models. Moreover, comments on their merits and limitations are discussed. Applications of
each model in both the biomedical and reliability fields are also presented. The purpose of this study is to facilitate the application of more covariate modelling techniques in prognostics and asset life prediction. The remainder of this paper is organised as follows. Section 2 classifies these covariate models into two groups and then explains them in greater detail; in this section, the merits, limitations, and applications of each model are discussed. Section 3 provides the conclusions of this paper.

2 SURVIVAL / RELIABILITY MODELS WITH COVARIATES
Survival / reliability analysis (also called failure time analysis) is a specific field of statistics that studies failure times and their probability in a group or groups of assets/individuals. The failure time is the time to a defined point event, often called a failure, occurring after a length of time. Some examples of failure times include the lifetimes of machine components in the reliability field, the survival times of patients in a clinical trial, and the durations of economic recessions in economics. Survival analysis was advanced at UC Berkeley in the 1960s to provide a better analysis method for Life Table data [5]. The development of statistical procedures and models for survival analysis exploded in the 1970s. In the 1980s and early 1990s, survival models with covariates were widely applied in both reliability and biomedical research. In general, the survival / reliability models with covariates can be classified into two groups, non-parametric and semi-parametric models, which are discussed in the following sub-sections.

2.1 Non-Parametric Models

In non-parametric models, the form of the degradation paths or the distribution of the degradation measure is unspecified [6]. When the failure time data involve complex distributions that are largely unknown, or when the number of observations is small, it is difficult to accurately fit a known failure time distribution. In other words, non-parametric models are used to avoid making unrealistic assumptions that would be difficult to test [7]. Such models to be reviewed include:
Proportional hazard model
Stratified proportional hazard model
Two-step regression model
Additive hazard model
Mixed (additive-multiplicative) model
Accelerated failure time model
Extended hazard regression model
Proportional intensity model
Proportional odds model
Proportional covariate model
2.1.1 Proportional Hazard Model

The Proportional Hazard Model (PHM), which is a multivariate regression analysis, was first proposed by Cox [1] in 1972. This model estimates the effects of different covariates influencing the TTF of a system and has been employed for different applications in lifetime data analysis. Due to its generality and flexibility, PHM was quickly and widely adopted in the biomedical, reliability, and economics fields from the 1970s to the early 1990s. Almost all covariate models are based on PHM theory. Cox’s PHM for static explanatory variables is expressed as [1, 8]:

$h(t; \mathbf{z}) = h_0(t)\, \psi(\mathbf{z}) = h_0(t)\, \exp(\mathbf{z}\boldsymbol{\gamma})$   (1)

where $h_0(t)$ is the unspecified baseline hazard function, which is dependent on time only and without influence of covariates. The positive functional term, $\psi(\mathbf{z})$, is dependent on the effects of different factors, which have a multiplicative effect (rather than additive) on the baseline hazard function. The proportionality assumption in PHM is that the hazards at different covariate values $\mathbf{z}$ are in constant proportion for all $t$, hence the name PHM [9-11]. Cox’s PHM for dynamic explanatory variables is [8, 12]:

$h(t; \mathbf{z}(t)) = h_0(t)\, \exp(\mathbf{z}(t)\boldsymbol{\gamma})$   (2)
Three parameterisations of $\psi(\mathbf{z})$ may be considered: log-linear, linear, and logistic forms [8]. The log-linear form has become the most popular for good reasons. Covariates are represented by a row vector $\mathbf{z}$ consisting of the covariates, and $\boldsymbol{\gamma}$ is a column vector consisting of the regression coefficients. The covariate vector $\mathbf{z}$ is associated with the system, and $\boldsymbol{\gamma}$ is the unknown parameter of the model, defining the effects of the covariates. Cox [1, 13] proposes the conditional likelihood, later called the partial likelihood, to estimate the regression coefficients $\boldsymbol{\gamma}$. Kalbfleisch and Prentice [12, 14] propose the marginal likelihood, which is identical to the partial likelihood. Their likelihood can deal with tied data (assets/individuals which failed simultaneously), censored data, and uncensored (observed) data. Regression coefficients are estimated by maximising the partial likelihood. A number of graphical, goodness-of-fit, and confidence-interval techniques can be employed to examine the appropriateness and fit of the PHM [15-17]. Since 1972, a number of applications of PHM in the reliability [10, 11, 15, 18-43] and biomedical fields have been developed.

Key merits
PHM is an influential technique which can be used to investigate the effects of various explanatory variables on the hazard of assets/individuals
This approach is essentially distribution free, thus it does not have to assume a specific form for the baseline hazard function
Regression coefficients are estimated using the partial likelihood without the need to specify the baseline hazard function
This model is available for both static and dynamic explanatory variables
Explanatory variables have a multiplicative effect (rather than an additive effect) on the baseline hazard function, which is a more realistic and reasonable assumption
This model handles truncated data, non-truncated data, and tied values
Many goodness-of-fit tests and graphical methods are available for this model [17]
Key limitations
PHM is a vulnerable approach when covariates are deleted or the precision of covariate measurements is changed. Therefore, if one pertinent covariate is omitted, even if it is independent of the other covariates in the model, averaging over the omitted covariate gives a new model which leads to biased estimates of the regression coefficients [10]
The estimated value of the regression coefficient is biased in the case of a small sample size
Mixing different types of covariates in one model may cause some problems
The main assumption of this model is that an asset/individual life is assumed to be terminated at the first failure time; in other words, this model depends only on the time elapsed between the starting event (e.g. diagnosis) and the terminal event (e.g. failure) and not on the chronological time
The influence of a covariate in PHM is assumed to be time-independent [44]
Due to the proportionality assumption, a common baseline hazard for all assets/individuals is assumed even in cases in which the assets/individuals should be stratified according to baseline [12, 17]
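To show how the regression coefficient of equation (1) is estimated from data, the sketch below evaluates and maximises the Cox partial likelihood for a tiny hypothetical data set with a single covariate. In practice a dedicated survival-analysis package would be used; this is only a toy illustration of the estimating principle (ties are assumed absent, and censoring enters only through the event flag).

```python
# Minimal sketch: negative log partial likelihood for Cox's PHM with one covariate,
# maximised numerically. Data values are hypothetical; ties are assumed absent.
import numpy as np
from scipy.optimize import minimize_scalar

# (failure/censoring time, event indicator, covariate value) -- hypothetical data
times = np.array([5.0, 8.0, 12.0, 14.0, 20.0, 25.0])
events = np.array([1, 1, 0, 1, 1, 0])          # 1 = failure observed, 0 = censored
z = np.array([0.8, 0.5, 0.3, 0.9, 0.1, 0.2])   # covariate (e.g. a condition indicator)

def neg_log_partial_likelihood(gamma):
    ll = 0.0
    for i in range(len(times)):
        if events[i] == 1:
            at_risk = times >= times[i]        # risk set at the i-th observed failure
            ll += gamma * z[i] - np.log(np.sum(np.exp(gamma * z[at_risk])))
    return -ll

res = minimize_scalar(neg_log_partial_likelihood)
print(f"estimated regression coefficient: {res.x:.3f}")
```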
2.1.2 Stratified Proportional Hazard Model

In the Stratified Proportional Hazard Model (SPHM), it is assumed that the population can be divided into strata (or levels), based on the discrete values of a single covariate or a combination of discrete values of a set of covariates [3]. For example, an asset may operate at three different temperature levels, say low, medium, and high. In SPHM, it is assumed that the hazard is proportional within the same stratum (or level) but not necessarily across different strata. The hazard of a system in the $j$th stratum can accordingly be expressed as [12]:

$h_j(t; \mathbf{z}) = h_{0j}(t)\, \exp(\mathbf{z}\boldsymbol{\gamma})$   (3)

A likelihood method similar to that of Cox’s PHM is used to estimate the regression coefficients $\boldsymbol{\gamma}$. The model is applied in the biomedical field by Kay [45] and in the reliability field by Kumar [31].
Key merits
The arbitrary baseline hazard function is allowed to be different for each stratum, whereas the regression coefficients are the same across strata
SPHM is one of the simplest and most useful extensions of the PHM for its application in different situations
Similar to PHM, this approach is essentially distribution free, thus it does not have to assume a specific form for the baseline hazard function
Regression coefficients are estimated using the partial likelihood without the need to specify the baseline hazard function
Explanatory variables have a multiplicative effect (rather than an additive effect) on the baseline hazard function, which is a more realistic and reasonable assumption
Similar to PHM, many goodness-of-fit tests and graphical methods are available for SPHM
Key limitations
Due to multicollinearity, estimated values of regression coefficients are sensitive to omission, misclassification, and time dependence of explanatory variables [10]
The estimated value of the regression coefficient is biased in the case of a small sample size
The main assumption of this model is that an asset/individual life is assumed to be terminated at the first failure time; in other words, this model depends only on the time elapsed between the starting event (e.g. diagnosis) and the terminal event (e.g. failure) and not on the chronological time
2.1.3 Two-Step Regression Model

Anderson and Senthilselvan [46] extends Cox’s PHM to allow (approximately) for covariate effects that change in time and presents a method of estimation. This model is applied and tested in the biomedical field. This extension is called the two-step regression model. Anderson and Senthilselvan [46] also introduces a method for smooth estimates of the hazard. In general, the predictive effect of a covariate measured at a particular point of time (the beginning of a study) becomes progressively less important as time goes by. If there is a good deal of information about a particular process, it may be possible to model this directly and express the regression coefficient $\boldsymbol{\gamma}(t)$ as a specific function of time. The two-step regression model allows a very simple form of time dependence for $\boldsymbol{\gamma}(t)$. It assumes that [46]:

$\boldsymbol{\gamma}(t) = \boldsymbol{\gamma}_1$ for $t < \tau$ and $\boldsymbol{\gamma}(t) = \boldsymbol{\gamma}_2$ for $t \ge \tau$   (4)

for some breakpoint $\tau$, which is not assumed known a priori. In most situations this step function must be regarded as a first approximation to the true form of $\boldsymbol{\gamma}(t)$ as a function of time. The regression coefficients $\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_2$ refer to the short-term and long-term dependence of the hazard on the covariates [46]. It is noticeable that the approximation implies roughly equal rates of change for the covariate effects. The hazard of the two-step regression model is [46, 47]:

$h(t; \mathbf{z}) = h_0(t)\, \exp(\mathbf{z}\boldsymbol{\gamma}_1)$ for $t < \tau$, and $h(t; \mathbf{z}) = h_0(t)\, \exp(\mathbf{z}\boldsymbol{\gamma}_2)$ for $t \ge \tau$   (5)

Anderson and Senthilselvan [46] proposes the conditional log-likelihood of Peto in Cox’s discussion paper [1] to estimate the regression coefficients, which are obtained by maximising this log-likelihood.

Key merits
Literature shows this model fits the data much better than Cox’s PHM [46, 47]
This model is more appropriate for dealing with time-dependent covariates than Cox’s PHM
This model handles heavy censoring circumstances better than Cox’s PHM [46, 47]
The approximation in this model replaces a time-dependent coefficient by a step function, which represents short-term and long-term effects
This model can be extended to a three-step regression function

Key limitations
This model has difficulty estimating regression parameters due to large values of the breakpoint $\tau$ [46]
This model has a common breakpoint for all covariates [46, 47]
2.1.4 Additive Hazard Model

The Additive Hazard Model (AHM) can be described as [48]:

$h(t; \mathbf{z}) = h_0(t) + \psi(\mathbf{z})$   (6)

where $h_0(t)$ is the unspecified baseline hazard function, which is dependent on time only and without influence of covariates. The positive or negative functional term, $\psi(\mathbf{z})$, is dependent on the effects of different factors, which have an additive effect (rather than multiplicative) on the baseline hazard function. This model provides the means for modelling a circumstance in which the hazard is not zero at time zero. Maximum likelihood procedures can be used to estimate this model’s parameters [49]. Pijnenburg [48] tests this model on the reliability of aircraft air-conditioning systems. Newby [50] asserts that AHM applications are restricted by an identifiability problem. Theoretical limitations lead to identification problems
while estimating the parameters of the model [51]. Because this model is not identifiable, the observation of explanatory variables does not add anything to the knowledge obtained from the event data [50].

Key merits
AHM is intuitively an attractive model when a system after repair is better than it was just before the repair, but not as good as new
This model allows a hazard that is not zero at time zero
Key limitations
The additive assumption often leads to an estimated hazard that is less than zero; as a result, it is not a realistic and reasonable assumption
This model cannot handle tied values (which have likelihood zero under the continuous random variable assumption)
The model cannot handle failure times equal to zero
This model can only be used in a phenomenological way to measure the magnitude of the jump in the hazard, and models which make use of explanatory variables are unlikely to produce satisfactory estimators of the parameters [51]
2.1.5 Mixed (Additive-Multiplicative) Model

To enhance the modelling capability of covariates, the mixed model considers the hazard of an asset/individual as containing both a multiplicative and an additive component [52, 53]. The additive-multiplicative hazard model specifies the hazard for the counting process associated with a multidimensional covariate process $\mathbf{z}(t)$. Therefore, a general additive-multiplicative hazard model takes the form [54]:

$h(t; \mathbf{z}) = g\!\left(\boldsymbol{\beta}'\mathbf{z}_1(t)\right) h_0(t) + q\!\left(\boldsymbol{\gamma}'\mathbf{z}_2(t)\right)$   (7)

where $(\boldsymbol{\beta}, \boldsymbol{\gamma})$ is a vector of unknown regression coefficients, $g$ and $q$ are known link functions, and $h_0(t)$ is an unspecified baseline hazard function. Lin and Ying [54] develops a class of simple estimating functions for $(\boldsymbol{\beta}, \boldsymbol{\gamma})$, which contains the partial likelihood score function in the special case of the proportional hazard model. The mixed model is applied in the biomedical field [52].
This model sometimes gives a better fit to the data than PHM [54]
This model allows covariates to have both additive and multiplicative effects
Key limitation
Extremely limited testing
2.1.6 Accelerated Failure Time Model

The Accelerated Failure Time Model (AFTM) is one of the most common approaches used to obtain reliability and failure rate estimates of devices and components in a much shorter time [55-58]. AFTM assumes that the log lifetime, given an applied stress vector $\mathbf{z}$, has a distribution with a location parameter $\mu(\mathbf{z})$ and a constant scale parameter $\sigma$. AFTM can be expressed as [55, 58]:

$\log T = \mu(\mathbf{z}) + \sigma \varepsilon$   (8)

where $\varepsilon$ is a random variable whose distribution does not depend on $\mathbf{z}$. The hazard of AFTM can be written as [8, 12]:

$h(t; \mathbf{z}) = h_0\!\left(t\, e^{\mathbf{z}\boldsymbol{\gamma}}\right) e^{\mathbf{z}\boldsymbol{\gamma}}$   (9)

where $h_0(\cdot)$ denotes the baseline hazard function, and $\boldsymbol{\gamma}$ is a vector of regression coefficients. The effect of a covariate is to accelerate or decelerate the failure time relative to the baseline hazard function according to $e^{\mathbf{z}\boldsymbol{\gamma}}$ or $e^{-\mathbf{z}\boldsymbol{\gamma}}$ [56]. Maximum likelihood estimation can be used to estimate the parameters of the AFTM.

Key merit
AFTM is used to obtain reliability and failure rate estimates of assets and components in a much shorter time
Key limitation
Setting up an AFTM test is a time-consuming as well as costly process
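A short sketch of the AFTM relationship in equations (8)-(9): with a log-linear location and a common error distribution, changing the covariate simply rescales time, as the simulated medians below show. The coefficient and baseline parameters are hypothetical, and the log-normal error is just one common choice.

```python
# Minimal sketch of the accelerated failure time relationship log T = mu + z*gamma + sigma*eps.
# Changing the covariate rescales every lifetime by exp(delta_z * gamma).
# All parameter values are hypothetical; a normal error (log-normal AFT) is one common choice.
import numpy as np

rng = np.random.default_rng(3)
mu, gamma_coef, sigma = 3.0, -0.8, 0.5     # location, covariate coefficient, scale

eps = rng.standard_normal(10_000)
for z in (0.0, 1.0):
    lifetimes = np.exp(mu + gamma_coef * z + sigma * eps)
    print(f"z = {z:.0f}: median lifetime ~ {np.median(lifetimes):.1f}")

# The ratio of medians is exp(gamma_coef), i.e. the time-scale (acceleration) factor
print(f"time-scale factor exp(gamma) = {np.exp(gamma_coef):.3f}")
```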
2.1.7 Extended Hazard Regression Model

The Extended Hazard Regression Model (EHRM), which includes PHM and AFTM as special cases, was developed in 1985 [59, 60]. This model assumes that a covariate vector $\mathbf{z}$ changes the baseline hazard function, $h_0(t)$, according to [59, 60]:

$h(t; \mathbf{z}) = \psi_1(\mathbf{z}\boldsymbol{\gamma}_1)\, h_0\!\left(t\, \psi_2(\mathbf{z}\boldsymbol{\gamma}_2)\right)$   (10)

where $\psi_1$ and $\psi_2$ are positive functions equal to $1$ at zero, $h_0(\cdot)$ denotes the baseline hazard function, and $\boldsymbol{\gamma}_1$ and $\boldsymbol{\gamma}_2$ are vectors of regression coefficients. For simplicity, it is assumed that $\psi_1(x) = \psi_2(x) = e^{x}$. The general case describes a situation in which the covariates influence survival by changing both the time scale, by the factor $\psi_2(\mathbf{z}\boldsymbol{\gamma}_2)$, and the scale in which the hazard is measured, by the factor $\psi_1(\mathbf{z}\boldsymbol{\gamma}_1)$. This model reduces to the PHM for $\boldsymbol{\gamma}_2 = 0$ and to AFTM for $\boldsymbol{\gamma}_1 = \boldsymbol{\gamma}_2$. A maximum likelihood function based on polynomial spline approximation has been developed to estimate all of the parameters of the model. This model is applied in both the biomedical [59, 60] and reliability fields [55].

Key merits
Whether PHM or AFTM is the appropriate special case of this model depends on the failure data analysis and the types of failure mechanisms
EHRM is a general model for the hazard which includes both PHM and AFTM
Key limitations
Maximum likelihood estimation has a restrictive assumption due to the choice of quadratic splines in order to keep the number of parameters small
Maximum likelihood estimation for this model is based on approximation of the baseline hazard by splines; however, a spline function cannot guarantee that the estimated hazard is always positive
2.1.8 Proportional Intensity Model

The Proportional Intensity Model (PIM) was first introduced by Cox [1]. PIM is similar to PHM, but with the underlying failure mechanism following a stochastic point process rather than a probabilistic distribution [61]. PIM is used to model the intensity process of failures and repairs of a repairable system, incorporating explanatory variables [49, 62]. Volk et al. [63] introduces PIM for both non-repairable and repairable systems utilising historic failure data and corresponding diagnostic measurements. PIM assumes that a system enters stratum 1 at $t = 0$ and that it enters stratum $j$ immediately following the $(j-1)$th failure [3]. The classes are based on two time scales, namely the global time $t$ and the time from the immediately preceding failure, $x$, respectively [64, 65]:

$\lambda_j\!\left(t, x \mid N(t), \mathbf{z}(t)\right) = \lambda_{0j}(t, x)\, \exp\!\left(\mathbf{z}(t)\boldsymbol{\gamma}_j\right)$   (11)

where $N(t)$ represents a random variable for the number of failures in $(0, t]$, and $\mathbf{z}(t)$ denotes the covariate process up to time $t$. $\lambda_j$ and $\lambda_{0j}$ are the intensity function and the baseline intensity function, respectively, and $\boldsymbol{\gamma}_j$ is the regression coefficient for the $j$th stratum. The baseline intensity function can have three different forms: constant intensity, log-linear intensity, and power-law intensity [66]. Maximum likelihood estimation is used to estimate the regression coefficients. This model is applied in the reliability field [31, 34, 62, 64].

Key merits
PHM assumes a system is renewed at failure, while PIM does not necessarily make this assumption
This model is suitable for optimising maintenance and repair policy in a cost-effective manner
This model is simple, directly related to the two most common models (PHM and AFTM) already in use, and effective in discriminating between them by means of statistics based on only a few degrees of freedom

Key limitations
If covariates are deleted from the model or measured with a different level of precision, the proportionality is in general destroyed
2.1.9 Proportional Odds Model

In 1980, McCullagh [67] generalises the idea of a constant odds ratio to more than two samples by means of a regression model, which is termed the Proportional Odds Model (POM). The theoretical basis for the model assumes that prognostic factors have a multiplicative effect on the odds against survival beyond any given time [68]. POM can be expressed as [12, 67, 69]:

$\frac{F_j(t)}{1 - F_j(t)} = \kappa_j\, \frac{F_0(t)}{1 - F_0(t)}$   (12)

where $F_j(t)$ is the cumulative distribution function of the occurrence of events in group $j$ and $F_0(t)$ is an underlying unknown cumulative distribution function. The term $\kappa_j$ is called the proportional odds ratio. The full maximum likelihood function of this model is derived by Bennett and McCullagh [67, 69]. POM is developed [67] and applied [68] in the biomedical field.

Key merits
The hazards for separate groups of assets/individuals converge with time
Because the hazards converge with time, it is a useful model when the effects of covariates diminish or disappear as time increases

Key limitation
It is necessary to use some time transformation of the failure times to estimate the parameters of the model
2.1.10 Proportional Covariate Model

The Proportional Covariate Model (PCM) was developed by Sun et al. [70] in 2006. PCM assumes that the covariates of a system, or a function of those covariates, are proportional to the hazard of the system. Sun et al. [70] and Sun and Ma [71] claim that PCM was developed to address some shortcomings of PHM. The generic form of PCM is expressed as [70]:

$\mathbf{z}_c(t) = C(t)\, h(t)$   (13)

where $\mathbf{z}_c(t)$ is the covariate function, which is usually time dependent, $C(t)$ is the baseline covariate function, which is also usually time dependent, and $h(t)$ is the hazard of the system. Maximum likelihood estimation is used to estimate the parameters. Due to the novelty of PCM, this model has only been tested with laboratory data in the reliability field.

Key merits
The baseline covariate function can take into account both historical failure data and historical condition monitoring data
The baseline covariate function can be updated according to newly observed failure data and covariates
PCM is used to update the hazard of a system
Key limitations
This model does not consider operating environment data; thus, it is not sensitive to severe environments (e.g. high ambient temperature)
Contrary to the authors’ claims of this as an advantage of the model, PCM requires historical failure data to establish the covariate baseline
PCM has not been applied widely in the literature as it is a relatively new model
2.2 Semi-Parametric Models

In semi-parametric models, the form of the degradation paths or the distribution of the degradation measure is specified and/or partially specified [6, 72]. These models can be classified as:
Weibull proportional hazard model
Logistic regression model
Log-Logistic regression model
Aalen’s regression model
2.2.1 Weibull Proportional Hazard Model
The Weibull Proportional Hazard Model (WPHM) is a special case of PHM in which the Weibull distribution is assumed for the failure times; thus, in this model the baseline hazard function is the Weibull hazard [33, 73, 74]. This model, as employed by Jardine [21, 26], introduces a new concept in reliability of utilising diagnostic factors (condition monitoring data) as explanatory variables. This model is expressed as [21, 75]:

$h(t; \mathbf{z}) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1} \exp(\mathbf{z}\boldsymbol{\gamma})$   (14)

Jardine et al. [21] derives a new likelihood function. By maximising this likelihood function, all parameters (i.e. the regression coefficients and the parameters of the Weibull distribution in the baseline hazard function) are estimated. Moreover, the EXAKT software has been developed based on WPHM by Jardine et al. [76, 77] at the University of Toronto.

Key merits
WPHM is an influential technique which can be used to investigate the effects of various explanatory variables on the life length of assets/individuals
Explanatory variables have a multiplicative effect (rather than an additive effect) on the baseline hazard function, which is a more realistic and reasonable assumption
The model handles truncated or non-truncated data
Key limitations
Due to multicollinearity, estimated values of regression coefficients are sensitive to omission, misclassification, and time dependence of explanatory variables [10]
The estimated value of the regression coefficient is biased in the case of a small sample size
Mixing different types of covariates in one model may cause some problems
The main assumption of this model is that an asset/individual life is assumed to be terminated at the first failure time; in other words, this model depends only on the time elapsed between the starting event (e.g. diagnosis) and the terminal event (e.g. death) and not on the chronological time
Unlike PHM, this model does assume a specified form for the baseline hazard function
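To illustrate equation (14), the sketch below evaluates the WPHM hazard and the corresponding reliability (via the closed-form cumulative hazard) for a constant covariate value. The Weibull shape and scale and the covariate weight are hypothetical, not values taken from the EXAKT software or any cited study.

```python
# Minimal sketch: Weibull PHM hazard h(t|z) = (beta/eta) * (t/eta)**(beta-1) * exp(gamma*z)
# and the reliability R(t|z) = exp(-(t/eta)**beta * exp(gamma*z)) for a constant covariate.
# Shape, scale and covariate weight are hypothetical.
import numpy as np

beta, eta = 2.2, 1500.0      # Weibull shape and scale (hypothetical, e.g. hours)
gamma_coef = 0.04            # covariate weight (hypothetical)
z = 12.0                     # a constant condition-monitoring reading (hypothetical)

def hazard(t):
    return (beta / eta) * (t / eta) ** (beta - 1) * np.exp(gamma_coef * z)

def reliability(t):
    # For a constant covariate the cumulative hazard integrates in closed form
    return np.exp(-((t / eta) ** beta) * np.exp(gamma_coef * z))

for t in (250.0, 500.0, 1000.0):
    print(f"t = {t:6.0f}: h = {hazard(t):.2e} per hour, R = {reliability(t):.4f}")
```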
2.2.2 Logistic Regression Model

The Logistic regression model is a special case of POM. The Logistic regression model is usually adopted to relate the probability of an event to a set of covariates [78], and this concept can be used in degradation analysis. If the current degradation features are $\mathbf{z}$, Liao [78] defines the odds ratio between the reliability function $R(t)$ and the cumulative distribution function $1 - R(t)$ as:

$\frac{R(t)}{1 - R(t)} = \exp(\gamma_0 + \mathbf{z}\boldsymbol{\gamma})$   (15)

where $\gamma_0$ and $\boldsymbol{\gamma}$ are the model parameters to be estimated. Therefore, the reliability function can be expressed as:

$R(t) = \frac{\exp(\gamma_0 + \mathbf{z}\boldsymbol{\gamma})}{1 + \exp(\gamma_0 + \mathbf{z}\boldsymbol{\gamma})}$   (16)

Liao [78] asserts that the maximum likelihood estimates of the model parameters can be obtained by maximising the log-likelihood function using the Nelder-Mead algorithm. This model is applied in reliability analysis [78].
Requires less computational effort to estimate parameters than PHM [78]
Key limitation
Unlike POM, this model assumes a specified distribution
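A small sketch of equations (15)-(16): given hypothetical fitted parameters, the reliability implied by the logistic odds model is just the logistic function of the linear predictor built from the current degradation features. The intercept, weights, and feature values below are assumptions for illustration only.

```python
# Minimal sketch of the logistic odds model of equations (15)-(16):
# R/(1-R) = exp(g0 + z.g)  =>  R = 1 / (1 + exp(-(g0 + z.g))).
# The fitted parameter values and feature vector below are hypothetical.
import numpy as np

g0 = 4.0                          # intercept (hypothetical)
g = np.array([-0.9, -1.4])        # weights on two degradation features (hypothetical)
z = np.array([1.2, 0.8])          # current degradation features (hypothetical)

linear_predictor = g0 + z @ g
reliability = 1.0 / (1.0 + np.exp(-linear_predictor))
print(f"estimated reliability given current features: {reliability:.3f}")
```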
2.2.3 Log-Logistic Regression Model

The Log-Logistic regression model is a special case of POM in which a Log-Logistic distribution is assumed for the failure times [3]. The Log-Logistic regression model describes a situation in which the hazards for separate samples converge with time; therefore, it provides a linear model for the log odds on survival by any chosen time. This model was developed to overcome some shortcomings of the Weibull distribution in the modelling of failure time data.
The distribution used most frequently in the modelling of survival and failure time data is the Weibull distribution. However, its application is limited by the fact that its hazard, while it may be increasing or decreasing, must be monotonic, whatever the values of its parameters. Bennett [69] claims that the Weibull distribution may be inappropriate where the course of the failure (e.g. disease in individuals) is such that mortality reaches a peak after some finite period and then slowly declines. The hazard of the Log-Logistic regression model is [69]:

$h(t) = \frac{\kappa\, e^{\theta}\, t^{\kappa - 1}}{1 + e^{\theta}\, t^{\kappa}}$   (17)

which has its maximum value at $t = \left((\kappa - 1)\, e^{-\theta}\right)^{1/\kappa}$. Here $\kappa$ is a measure of precision and $\theta$ is a measure of location. The hazard is assumed to increase first and then decrease, with the change occurring at that time. The parameters of this model are estimated by maximising the likelihood function [69]. This model is tested and applied in the biomedical field [69]. With $\theta = \mathbf{z}\boldsymbol{\gamma}$, the ratio of the hazards for a covariate $z$ taking two values $z_1$ and $z_2$, which converges to unity as $t$ increases, is given by [69]:

$\frac{h(t; z_1)}{h(t; z_2)} = e^{(z_1 - z_2)\gamma}\, \frac{1 + e^{z_2 \gamma}\, t^{\kappa}}{1 + e^{z_1 \gamma}\, t^{\kappa}}$   (18)
Key merits It is more suitable to apply in the analysis of survival data rather than Log-Normal distribution It is suitable model where hazard reaches a peak after some finite period, and the slowly declines It has mathematical tractability when dealing with the censored observations The hazard for different samples is not proportional through time as in the Weibull model, but that their ratio trends to unity. Thus, this property is desirable when the initial effects of covariates (e.g. treatment) trend to diminish with time, and the survival probabilities of different groups of asset/ individual become more similar Key limitation
Unlike POM, this model assumes a specified distribution
2.2.4 Aalen’s Regression Model The Aalen’s regression model is introduced by Aalen in 1980 [79]. This model is based on Aalen’s multiplicative intensity model for counting processes [80] to assess additive time-dependent covariate effects in possibly right-censored survival data. At first glance, this model seems to be non-parametric due to the absence of specified distribution. In this model linearity represents a kind of distributional assumption; however, no finite-dimensional parameter is introduced in the model [80]. Therefore, in this study, it is classified as a semi-parametric model. Aalen develops this model in order to improve some restrictions of Cox’s PHM. In his model, asset/individual, (
denotes the intensity of the event happening at time
is the probability that the event occurs in some small time interval between
that it has not happened before).
is the number of assets/individuals and
Aalen [80] considers the following linear model for the vector,
and
for the given
is the number of covariates in the analysis.
, of intensities
: (19)
The
matrix
then the
row of
is constructed as follows: if the is the vector
asset/individual is a member of the risk set at time , where
, are time-
dependent covariate values. If the asset/individual is not in the risk set at time , thus the corresponding row of contains only zeros. The first element of the vector , is interpreted as a baseline parameter function, while the remaining elements, , are called regression coefficients, which measure the influence of the respective covariates. Aalen’s regression model explicitly allows for contributions of the covariates that change over time, since the regression functions may vary arbitrary with time [47]. Consequently, if the effect of covariates is zero, the slope will also be zero. If the covariate has a constant influence over time, the plot will be approximately a straight line. If the slope is positive (or negative), it shows that the effect of covariates is to increase (or decrease) the hazard. If the plot is a curve with an increasing (or decreasing) slope this indicates an increase (or decrease) in the magnitude of influence of covariates [3]. Aalen’s regression model has been applied in biomedical [47, 80, 81] and reliability [44] fields. The merits and limitations of the model are illustrated in the following table. Key merits
The main advantage of this model is its linearity
393
It is less vulnerable approach than Cox’s PHM to problem of inconsistency when covariates are deleted or the precision of covariate measurements is changed [80] If a covariate which is independent of the other covariates is removed from the model, then the new model is still linear with unchanged regression coefficients for the remaining covariates, only the baseline parameter function is affected This model shows the influence of time-dependent covariates better than Cox’s PHM [80]
Key limitations
3
If a covariate which is not independent of the other covariates is removed from the model, then it is necessary to assume a multivariate normal distribution for the covariates in order to the new model stay linear, thus normal assumption cannot be taken literally, since that might give positive probability to negative intensities The obvious weakness of this model compared to Cox’s PHM is the expression for does not restrict naturally to non-negative values The consequence of the lack of restriction to non-negative values is that the estimated survival function may not be monotonically decreasing throughout, but can have occasional lapses where it increases slightly Suitable methods of parameter estimation and its goodness-of-fit require to be investigated CONCLUSIONS
The hazard model with covariates is one of the most common statistical models in reliability and survival analysis. This expository paper reviews the existing literature on hazard models with covariates (termed covariate models) in both the industrial and biomedical fields. This paper synthesises these models from both industrial reliability and biomedical fields and then contextually groups them into: non-parametric and semi-parametric models. The merits and limitations of these models have been discussed so as to establish suitable potential applications, especially for the information of fellow researchers. This review paper demonstrates that all of these covariate models have been developed and then tested in the biomedical field to estimate the survival time of patients or individuals. Only a few of these models appear in reliability literature and have been applied in industrial cases. Most covariate models have not yet been effectively applied in reliability analysis. One reason for this situation may be the lack of awareness of these covariate models by reliability engineers. Moreover, due to the prominence of some models (e.g. PHM and WPHM), attempts to develop and apply alternative covariate models in the reliability field have been somehow stifled. It may be beneficial to introduce and evaluate more covariate models to reliability analysis for industrial cases. 4
REFERENCES
1
Cox DR. (1972) Regression models and life-tables. Journal of Royal Statistical Society, 34(2), 187-220.
2
Kumar D & KlefsjÖ B. (1994) Proportional hazards model: a review. Reliability Engineering & System Safety, 44(2), 177-188.
3
Kumar D & Westberg U. (1996) Some reliability models for analysing the effects of operating conditions. International Journal of Reliability, Quality and Safety Engineering, 4(2), 133-148.
4
Ma L. (2007) Condition monitoring in engineering asset management. APVC. p. 16.
5
Ma Z & Krings AW. (2008) Survival analysis approach to reliability, survivability and prognostics and health management. IEEE Aerospace Conference. pp. 1-20.
6
Liao H. (2004) Degradation models and design of accelerated degradation testing plans. United States -- New Jersey: Rutgers The State University of New Jersey - New Brunswick.
7
Luxhoj JT & Shyur H-J. (1997) Comparison of proportional hazards models and neural networks for reliability estimation. Intelligent Manufacturing, 8(3), 227-234.
8
Cox DR & Oakes D. (1984) Analysis of survival data. London ; ; New York: Chapman and Hall.
9
Crowder MJ. (1991) Statistical analysis of reliability data (1st ed.). London ; ; New York: Chapman & Hall.
10
Kumar D & Klefsjoe B. (1994) Proportional hazards model-an application to power supply cables of electric mine loaders. International Journal of Reliability, Quality and Safety Engineering, 1(3), 337-352.
11
Landers TL & Kolarik WJ. (1986) Proportional hazards models and MIL-HDBK-217. Microelectronics and Reliability, 26(4), 763-772.
12
Kalbfleisch JD & Prentice RL. (2002) The statistical analysis of failure time data (Second ed.). New Jersey: Wiley.
13
Cox DR. (1975) Partial likelihood. Biometrika, 62(2), 269-276.
394
14
Kalbfleisch JD & Prentice RL. (1973) Marginal likelihoods based on Cox's regression and life model. Biometrika, 60(2), 267-278.
15
Booker J, Campbell K, Goldman AG, Johnson ME & Bryson MC. (1981) Applications of Cox's proportional hazards model to light water reactor component failure data (No. LA-8834-SR; Other: ON: DE81023991).
16
Breslow N. (1974) Covariance analysis of censored survival data. Biometrics, 30(1), 89-99.
17
Arjas E. (1988) A graphical method for assessing goodness of fit in Cox's proportional hazards model. American Statistical Association, 83(401), 204-212.
18
Bendell A, Walley M, Wightman DW & Wood LM. (1986) Proportional hazards modelling in reliability analysis - an application to brake discs on high speed trains. Quality and Reliability Engineering International, 2(1), 45-52.
19
Dale CJ. (1985) Application of the proportional hazards model in the reliability field. Reliability Engineering, 10(1), 114.
20
Ansell RO & Ansell JI. (1987) Modelling the reliability of sodium sulphur cells. Reliability engineering, 17(2), 127-137.
21
Jardine AKS, Anderson PM & Mann DS. (1987) Application of the Weibull proportional hazads model to aircraft and marine engine failure data. Quality and Reliability Engineering International, 3(2), 77-82.
22
Landers TL & Kolarik WJ. (1987) Proportional hazards analysis of field warranty data. Reliability engineering, 18(2), 131-139.
23
Baxter MJ, Bendell A, Manning PT & Ryan SG. (1988) Proportional hazards modelling of transmission equipment failures. Reliability Engineering and System Safety, 21, 129–144.
24
Chan CK. (1990) A proportional hazards approach to correlate SiO2-breakdown voltage and time distributions. IEEE Transactions on Reliability, 39(2), 147-150.
25
Drury MR, Walker EV, Wightman DW & Bendell A. (1988) Proportional hazards modelling in the analysis of computer systems reliability. Reliability Engineering and System Safety, 21, 197-214.
26
Jardine AKS, Ralston P, Reid N & Stafford J. (1989) Proportional hazards analysis of diesel engine failure data. Quality and Reliability Engineering International 5(3), 207-16.
27
Leitao ALF & Newton DW. (1989) Proportional hazards modelling of aircraft cargo door complaints. Quality and Reliability Engineering International, 5(3), 229-238.
28
Mazzuchi TA & Soyer R. (1989) Assessment of machine tool reliability using a proportional hazards model. Naval Research Logistics, 36(6), 765-777.
29
Mazzuchi TA, Soyer R & Spring RV. (1989) The proportional hazards model in reliability. Reliability and Maintainability Symposium, 1989. Proceedings., Annual. pp. 252-256.
30
Elsayed EA & Chan CK. (1990) Estimation of thin-oxide reliability using proportional hazards models. IEEE Transactions on Reliability 39(3), 329-335.
31
Kumar D. (1995) Proportional hazards modelling of repairable systems. Quality and Reliability Engineering International, 11(5), 361-369.
32
Kumar D, KlefsjÖ B & Kumar U. (1992) Reliability analysis of power transmission cables of electric mine loaders using the proportional hazards model. Reliability Engineering & System Safety, 37(3), 217-222.
33
Love CE & Guo R. (1991) Using proportional hazard modelling in plant maintenance. Quality and Reliability Engineering International, 7(1), 7-17.
34
Love CE & Guo R. (1991) Application of weibull proportional hazards modelling to bad-as-old failure data. Quality and Reliability Engineering International, 7(3), 149-157.
35
Pettitt AN & Daud IB. (1990) Investigating time dependence in Cox's proportional hazards model. Applied Statistics, 39(3), 313-329.
36
Gasmi S, Love CE & Kahle W. (2003) A general repair, proportional-hazards, framework to model complex repairable systems. IEEE Transactions on Reliability, 52(1), 26-32.
37
Jardine AKS, Banjevic D, Wiseman M, Buck S & Joseph T. (2001) Optimizing a mine haul truck wheel motors' condition monitoring program: Use of proportional hazards modeling. Quality in Maintenance Engineering, 7(4), 286.
38
Krivtsov VV, Tananko DE & Davis TP. (2002) Regression approach to tire reliability analysis. Reliability Engineering & System Safety, 78(3), 267-273.
395
39
Martorell S, Sanchez A & Serradell V. (1999) Age-dependent reliability model considering effects of maintenance and working conditions. Reliability Engineering & System Safety, 64(1), 19-31.
40
Park S. (2004) Identifying the hazard characteristics of pipes in water distribution systems by using the proportional hazards model: 2. Applications. KSCE Journal of Civil Engineering, 8(6), 669-677.
41
Prasad PVN & Rao KRM. (2002) Reliability models of repairable systems considering the effect of operating conditions. Annual Reliability and Maintainability Symposium pp. 503-510.
42
Prasad PVN & Rao KRM. (2003) Failure analysis and replacement strategies of distribution transformers using proportional hazard modeling. Annual Reliability and Maintainability Symposium pp. 523-527.
43
Wallace JM, Mavris DN & Schrage DP. (2004) System reliability assessment using covariate theory. Annual Symposium Reliability and Maintainability pp. 18-24.
44
Kumar D & Westberg U. (1996) Proportional hazards modeling of time-dependent covariates using linear regression: a case study. IEEE Transactions on Reliability, 45(3), 386-392.
45
Kay R. (1977) Proportional hazard regression models and the analysis of censored survival data. Applied Statistics, 26(3), 227-237.
46
Anderson JA & Senthilselvan A. (1982) A two-step regression model for hazard functions. Applied Statistics, 31(1), 4451.
47
Mau J. (1986) On a graphical method for the detection of time-dependent effects of covariates in survival data. Applied Statistics, 35(3), 245-255.
48
Pijnenburg M. (1991) Additive hazards models in repairable systems reliability. Reliability Engineering and System Safety, 31(3), 369-390.
49
Wightman D & Bendell T. (1995) Comparison of proportional hazards modeling, additive hazards modeling and proportional intensity modeling when applied to repairable system reliability. International Journal of Reliability, Quality and Safety Engineering, 2(1), 23-34.
50
Newby M. (1994) Why no additive hazards models? IEEE Transactions on Reliability, 43(3), 484-488.
51
Newby M. (1992) A critical look at some point-process models for repairable systems. IMA Journal of Management Mathematics, 4(4), 375-394.
52
Andersen PK & Væth M. (1989) Simple parametric and nonparametric models for excess and relative mortality. Biometrics, 45(2), 523-535.
53
Badía FG, Berrade MD & Campos CA. (2002) Aging properties of the additive and proportional hazard mixing models. Reliability Engineering & System Safety, 78(2), 165-172.
54
Lin DY & Ying Z. (1995) Semiparametric analysis of general additive-multiplicative hazard models for counting processes. The Annals of Statistics, 23(5), 1712-1734.
55
Shyur H-J, Elsayed EA & Luxhøj JT. (1999) A general model for accelerated life testing with time-dependent covariates. Naval Research Logistics, 46(3), 303-321.
56
Lin Z & Fei H. (1991) A nonparametric approach to progressive stress accelerated life testing. IEEE Transactions on Reliability 40(2), 173-176.
57
Mettas A. (2000) Modeling and analysis for multiple stress-type accelerated life data. Annual Reliability and Maintainability Symposium. pp. 138-143.
58
Newby M. (1988) Accelerated failure time models for reliability data analysis. Reliability Engineering & System Safety, 20(3), 187-197.
59
Ciampi A & Etezadi-Amoli J. (1985) A general model for testing the proportional hazards and the accelerated failure time hypotheses in the analysis of censored survival data with covariates. Communications in Statistics - Theory and Methods, 14(3), 651 - 667.
60
Etezadi-Amoli J & Ciampi A. (1987) Extended Hazard Regression for Censored Survival Data with Covariates: A Spline Approximation for the Baseline Hazard Function. Biometrics, 43(1), 181-192.
61
Landers TL & Soroudi HE. (1991) Robustness of a semi-parametric proportional intensity model. IEEE Transactions on Reliability, 40(2), 161-164.
62
Lugtigheid D, Jardine AKS & Jiang X. (2007) Optimizing the performance of a repairable system under a maintenance and repair contract. Quality and Reliability Engineering International, 23(8), 943-960.
396
63
Vlok P-J, Wnek M & Zygmunt M. (2004) Utilising statistical residual life estimates of bearings to quantify the influence of preventive maintenance actions. Mechanical Systems and Signal Processing, 18(4), 833-847.
64
Jiang S-T, Landers TL & Rhoads TR. (2006) Proportional intensity models robustness with overhaul intervals. Quality and Reliability Engineering International, 22(3), 251-263.
65
Prentice RL, Williams BJ & Peterson AV. (1981) On the regression analysis of multivariate failure time data. Biometrika, 68(2), 373-379.
66
Percy DF, Kobbacy KAH & Ascher HE. (1998) Using proportional-intensities models to schedule preventivemaintenance intervals. IMA J Management Math, 9(3), 289-302.
67
McCullagh P. (1980) Regression models for ordinal data. Journal of the Royal Statistical Society. Series B (Methodological), 42(2), 109-142.
68
Bennett S. (1983) Analysis of survival data by the proportional odds model. Statistics in Medicine, 2(2), 273-277.
69
Bennett S. (1983) Log-logistic regression models for survival data. Applied Statistics, 32(2), 165-171.
70
Sun Y, Ma L, Mathew J, Wang W & Zhang S. (2006) Mechanical systems hazard estimation using condition monitoring. Mechanical Systems and Signal Processing, 20(5), 1189-1201.
71
Sun Y & Ma L. (2007) Notes on "mechanical systems hazard estimation using condition monitoring"--Response to the letter to the editor by Daming Lin and Murray Wiseman. Mechanical Systems and Signal Processing, 21(7), 2950-2955.
72
Kobbacy KAH, Fawzi BB, Percy DF & Ascher HE. (1997) A full history proportional hazards model for preventive maintenance scheduling. Quality and Reliability Engineering International, 13(4), 187-198.
73
Jardine AKS. (1983) Component and system replacement decisions. J.K S (Ed.). Image Sequence Processing and Dynamic Scene Analysis. pp. 647-654.
74
Jozwiak IJ. (1997) An introduction to the studies of reliability of systems using the Weibull proportional hazards model. Microelectronics and Reliability, 37(6), 915-918.
75
Jardine AKS & Anders M. (1985) Use of concomitant variables for reliability estimation. Maintenance Management International, 5, 135-140.
76
Jardine AKS, Banjevic D & Makis V. (1997) Optimal replacement policy and the structure of software for conditionbased maintenance. Quality in Maintenance Engineering, 3(2), 1355-2511.
77
Jardine AKS, Joseph T & Banjevic D. (1999) Optimizing condition-based maintenance decisions for equipment subject to vibration monitoring. Quality in Maintenance Engineering, 5(3), 192-202.
78
Liao H, Zhao W & Guo H. (2006) Predicting remaining useful life of an individual unit using proportional hazards model and logistic regression model. Annual Reliability and Maintainability Symposium pp. 127-132.
79
McKeague IW. (1986) Estimation for a semimartingale regression model using the method of sieves. The Annals of Statistics, 14(2), 579-589.
80
Aalen OO. (1989) A linear regression model for the analysis of life times. Statistics in Medicine 8(8), 907-925.
81
Mau J. (1988) A comparison of counting process models for complicated life histories. Applied Stochastic Models and Data Analysis, 4, 283–298.
Acknowledgement The authors gratefully acknowledge the financial support provided by both the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), established and supported under the Australian Government’s Cooperative Research Centres Programme, and the School of Engineering Systems of Queensland University of Technology (QUT).
397
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A CASE STUDY OF RELIABILITY ASSESSMENT FOR CENTRIFUGAL PUMPS IN A PETROCHEMICAL PLANT Masdi Muhammada, M Amin Abd Majida and Nurul Akma Ibrahima a
University Technology Petronas, Bandar Seri Iskandar, 31750 Perak, Malaysia
Centrifugal pumps are widely used in petrochemical industry and in some instances, the number of pumps used could easily amount to hundreds of pumps in a typical petrochemical plant. Consequently, the reliability of these pumps essentially translates into stable and reliable plant operation as the pumps performances are critical in ensuring continuous plant productivity. Reliability assessment for repairable equipment, which in this case centrifugal pumps, is highly dependent upon the assumption of the state after each repair. The post repair states can be categorized into three different states namely, ‘as good as new’, ‘as bad as old’ and the states in between. In practice, however, the usual state of equipment after repair follows the state of ‘better than old but worse than new’ which lies somewhere in between the two extremes. This paper focuses on the reliability assessment of the centrifugal pumps at a refinery plant that has been in operation for more than 10 years using a more robust process called generalized renewal process (GRP). This process has been proposed to model not only the ‘inbetween’ states but also the two extreme post repair states. A case study utilizing centrifugal pump failure data is used as a comparative appraisal of reliability assessment between GRP, perfect renewal process (PRP) and non-homogenous Poisson process (NHPP). The underlying distribution for time to first failure for these pumps is assumed to follow the two-parameter Weibull distribution and the parameters for the models are estimated using maximum likelihood estimation (MLE). The GRP solution based on the case study showed better description of the failure distribution even with limited available failure data in contrast with other assumptions as indicated by the likelihood values. Key words: Repairable system, generalized renewal process, maximum likelihood estimation 1
INTRODUCTION
Modelling and analysis of repairable systems have received a great deal of attention lately due to the increased focus on reliability and maintenance as part of cost reduction program. A repairable system is defined as “a system which, after failing to perform one or more of its functions satisfactorily, can be restored to fully satisfactory performance by any method other than replacement of the entire system”[1]. Understandably, most systems in petrochemical industries can be categorized as repairable which includes pumps, compressors and turbines. This study focuses on centrifugal pumps which could easily amount to hundreds of pumps in a typical petrochemical plant all of which undergo some form of maintenance to ensure continuous uninterrupted performance. There are three possible states that the pumps may end up after maintenance or repair [2]; - As good as new: A perfect repair where the system is restored to be as good as new condition. - As bad as old: A minimal repair where the repair action only restored the system to a state just before failure - Better than old but worse than new: An imperfect where a repair action will restore the system condition to be in between old and new . Earlier studies and modelling in reliability for repairable systems have been focusing on the first two assumptions with failure occurrences are modelled by a stochastic point process in time where the repair time is assumed to be negligibly small compared to its time to failure [3]. A commonly used model for perfect repair assumption is perfect renewal process (PRP). In PRP, it is assumed that the repair action or replacement of failed component will bring back the system to be as good as new and it is also assumed that the time between failures of the system to be independent and identically distributed (IID) random variables [3]. A homogenous Poisson process (HPP) with constant failure intensity is normally used to model this process. HPP implies that the system does not age nor deteriorate and independent of the previous pattern of failures i.e. it
398
is a memory-less process. On the other extreme end, the system is assumed to be in the same state as just right before failure after each repair. This assumption is based on the fact that for a complex system (with hundreds of different components) with many failure modes, replacing or repairing of the failed component will not significantly effect the age or state of the other components. As such, the occurrence of failures is neither independent nor identically distributed. In other words, the system was subjected to minimal repair and thus, the system age will not change or improve. The common model applied for as bad as old assumption is called non-homogeneous Poisson process (NHPP) [4]. Out of NHPP, an overwhelming majority of publication including Ascher and Feingold [1], Krivtsov [4] and Yanez [5], considers the two methods to estimate of rate of occurrence of failure (ROCOF) namely Power Law model as discussed by Cox and Lewis [4] and log-linear model as mentioned by Cox [4]. These two methods assumed that the time to first failure (TTFF) follows Weibull distribution. According to Yanes et al. [5], the main reason as to why the ‘in between post repair states’ have not received much attention is due to the difficulty in developing a mathematically robust and efficient approach to represent them. While much focus has been given to the first two assumptions, the proposed models (PRP and NHPP) have a limited practical application as most systems exhibit post repair conditions which are generally in between the two extremes. Most systems are seldom fully rejuvenated after repair, yet they are better than old as some of the components are newly replaced. As such a new probabilistic model called ‘generalized renewal process (GRP)’ has been proposed by Kijima and Sumita [6] to encompass all possible states after repair. The proposed model has been studied by Admantios et al. [7] where he compared the accuracy of results among PRP, NHPP and GRP models to estimates models’ parameters. The results showed that GRP offered the best fit for two different case studies data gathered from automotive industry due to large number of available data. Krivtsov [8] proposed approximate solution to Kijima Model or GRP using the Monte Carlo (MC) simulation technique where he has also indicated that PRP and NHPP are in fact specific cases of GRP model. However, MC techniques of estimating GRP solution requires large data set which is uncommonly available in petrochemical industries. Yanez et al. [5], on the other hand, offered classical method using maximum likelihood and Bayesian approaches to estimate GRP parameters. They have shown that GRP solutions can accurately represent the failure data even with reasonably small data set. The purpose of this paper is to investigate and provide a comparative analysis on the application of GRP as a more accurate reliability assessment of centrifugal pumps in petrochemical plant of which parameters are estimated using maximum likelihood due to limited number of available data. The results will be compared against prediction using PRP and NHPP assumptions. A brief definition and description of failure repair process and basic concept of GRP as well as methods of GRP parameter estimation are discussed in Section 2. Section 3 discusses the specific case study and the application of GRP in modeling the failure data. This is followed by a discussion and conclusion in Sections 4 and Section 5, respectively. 2
PROBABILISTIC MODELS 2.1. Perfect Renewal Process
PRP describes the situation where a repairable system is restored to as good as new condition, and the time between failures of a component or a system is considered independent and identically distributed [2]. This process assumes that the system will be restored to its original condition or renew itself upon completion of the repair action or part replacement. 2.2. Non-homogeneous Poisson process NHPP is a Poisson process with a simple parametric model used to represent events with a non-constant failure recurrence rate. This type of model is often used to model failure process with certain trends, namely the reliability growth and the reliability of repairable units [2]. NHPP describes the cumulative number of failures up to time t , N (t ) and it follows a
Poisson distribution with parameter l (t ) for a counting process {N (t ), t ‡ 0} . NHPP exists when the occurrence rate is timedependent and no more requirements of stationary increments. 2.3. Generalized renewal process Kijima [6] introduced the concept of virtual age to be the basis for GRP. If a system has virtual age V n -1= y immediately
after (n - 1) th repair, the n th failure time X is distributed according to the following cumulative distribution function (cdf) :
399
G ( x ) = F ( X | Vn -1 = y ) =
F ( X + y) - F ( y) 1 - F ( y)
(2)
R( X + y ) =1R( y )
where F (t ) is a cdf of the time to first failure (TTFF) of a new system and R(t ) = 1 - F (t ) is the reliability at the respective time. As such, the summation of n
Tn =
∑X
for n = 1, 2, 3...
i
(3)
i =1
With T0 = 0 is called the real age of the system. In GRP framework, there are two different assumptions as proposed by Kijima [6] with respect to the repair action. The first assumption (Type I) is that the nth repair would only compensate for the damage accumulated during the time between the (n - 1) th and the n th failure. Thus, the virtual age for the system after the nth repair is
Vn = Vn -1 + qX n
(4)
where q is defined as the repair effectiveness and the virtual age of a new system V0=0. Thus,
Vn = q( X1 + X 2 + ... + X n )
(5)
For second assumption (Type II), it is assumed that the repair will not only remove damages during the time between the n th failure but removing all damages accumulated up to n th failure. Accordingly, the virtual age can be represented by
(n - 1) th and the
Vn = q(Vn -1 + X n )
(6)
where similarly q is defined as the repair effectiveness and thus with V0 = 0 ,
Vn = q(q n -1 X1 + q n - 2 X 2 + q n -3 X 3 + ... + X n )
(7)
As can be observed, when the effectiveness of repair, q=0, the virtual age does not change with time which corresponds to a perfect repair assumption (as good as new). On the other hand, when q=1, this leads to a minimal repair assumption (as bad as q old). In the case of 0 < < 1, the corresponding repair assumption is somewhere in between i.e. better -than-old-but-worseq than-new. All the values of is applicable for both Type I and Type II assumption.
2.4. Parameter estimation The occurrence of failure in repairable systems which is defined by GRP parameters can be estimated in terms of the distribution of TTFF. Depending on the data sample available, different parameter approaches can be carried out. Kivtsov [8] proposed using Monte Carlo approach for the statistical estimation. However, this particular approach requires a large amount of data to estimate the distribution of TTFF. In the case which there is a reasonably enough data available, maximum likelihood estimation is preferable to be used to estimate the GRP parameters which will be the focus in this particular study. Failure-terminated likelihood function is defined by [2]
L = f ( x1 | b ,h )
n
g ( xi | b ,h , q )
i =2
(8)
400
where f (x ) and g (x ) are the probability density function (pdf) of the TTFF and the conditional pdf for the subsequent time between failures respectively. With the assumption that the pdf of TTFF follows a two-parameter Weibull distribution:
b f (t1 | b ,h ) = h
t1 h
b -1 - t1 h
b
e
(9)
where b and h are the shape and scale parameters respectively, the subsequent failure is given as
b qt + xi qt i -1 G ( xi | b ,h , q) = 1 - exp - i -1 h h
b
(10)
where xi is the period of time between the (i - 1) th and i th failure and ti -1 is the cumulative operating time up to the (i - 1) th failure. The ML estimation based on equations (8), (9) and (10) will lead to three equations that can be solved simultaneously for β, η and q as discussed by [5]. A numerical algorithm, such as Newton-Raphson can be used to find the solution for the three parameters.
3
METHODOLOGY
This study is based on actual historical pump failure data collected from June 2000 until September 2008 which is approximately 3000 days of operation (Note: data has been masked due to proprietary reason). The database includes key information such as equipment number, operating condition, failure date and time and causes of failures. The data is screened to ensure that only failure data is taken into account and to neglect all the false alarms and assist data (failures that require only assist instead of repair). The screened data is then grouped based on pump operating conditions ensuring homogenous data for the distribution fitting. A total of 71 centrifugal pumps’ failure data were selected. The failure data for each group is analyzed and fit into statistical distribution (PDF) after which the reliability analysis based on the different model assumptions is performed. Lastly, likelihood values are numerically calculated and compared to determine which of the model can better describe the failure data.
4
RESULTS AND DISCUSSION
Table 1 shows the failure data of the selected pumps. It is noted that the first failure is recorded after 25 days of commissioning and the last failure is registered at 2952 days. The failures for PRP or the time to first failures for NHPP and GRP models are assumed to follow a 2-parameter Weibull distribution. The differentiation among the three model assumptions are based on the value of the repair effectiveness, q, where q=0 indicates PRP, q=1 indicates NHPP and calculated q is for GRP model.
Table 1: Time to Failure Data for Pumps in Days 25
138
265
470
656
1078
1896
2461
2736
38
158
356
473
661
1182
1922
2510
2776
46
167
360
488
697
1294
1925
2562
2882
55
171
360
493
747
1336
1939
2568
2930
58
171
364
509
811
1383
2008
2583
2939
59
186
368
527
830
1462
2033
2648
2952
60
187
419
557
921
1536
2051
2698
66
191
427
568
963
1550
2097
2707
68
206
448
571
964
1550
2140
2718
112
231
452
600
985
1615
2195
2720
401
The calculation results for β and η for the different model assumption is shown in Table 2. It can be seen that the value of the shape factor, β <1 for all cases indicating reducing rate of occurrence of failure. This implies improvement of the equipment reliability possibly due to improvement in maintenance actions. Figure 1 and Figure 2 show a comparison plot of cumulative number of failures vs time for GRP type I and PRP respectively. The plots show a better fit of failure data with GRP type I compare with PRP as also indicated by the likelihood values in Table 2.
Cumulative Number of Failures vs Time 3.000
Cum. Number of Failures Data Points Function Line
Cumulative Number of Failures
2.402
1.804
1.206
0.608
0.010 10.000
608.000
1206.000
1804.000
2402.000
3000.000
Time, (t) Beta=0.6451, Lambda=0.0125, q=0.5837
Figure 1: Cumulative number of Failure vs time plot for GRP type 1
Cumulative Number of Failures vs Time 3.000
Cum. Number of Failures Data Points Function Line
Cumulative Number of Failures
2.402
1.804
1.206
0.608
0.010 10.000
608.000
1206.000
1804.000
2402.000
3000.000
Time, (t) Beta=0.7441, Lambda=0.0049, q=0
Figure 2: Cumulative number of failure vs time for PRP
402
Conditional Reliability vs Time 1.000
Cond. Reliability vs Time Function Line
Conditional Reliability
0.800
0.600
0.400
0.200
0.000
0.000
1600.000
3200.000
4800.000
6400.000
8000.000
Time, (t) Beta=0.6451, Lambda=0.0125, q=0.5837
Figure 3: Conditional reliability vs time for GRP Type I
Figure 3 shows the conditional reliability versus time plot for the pumps from which the reliability of the pumps can be predicted. For comparison, similar analysis was done on the two other model assumptions. The summary of model parameters is as depicted in Table 2. Based on likelihood values, where the higher the value the better fit the model, it shows that GRP Type I is the best fit model for the failure data with the repair effectiveness value at 0.5837. This indicates that the repair actions are in fact somewhere in between perfect repair and minimal repair.
Table 2: Analysis Results Comparison
Probabilistic Model GRP Type I GRP Type II PRP NHPP
5
β (beta) 0.6450 0.6487 0.7440 0.6579
q 0.5837 0.7793 0 1
η 891.39 869.11 1271.03 840.82
LV -822.4856 -822.5402 -822.5076 -822.5655
CONCLUSION
In this paper, the method of general renewal process was explored as an alternative for modeling of centrifugal pump failure times. By taking the consideration of repair effectiveness, the GRP model is shown to have a better fit to the failure data compared with PRP and NHPP assumptions and can be concluded to be better reliability assessment tool. The results also proved that GRP model can be used in petrochemical industries provided there is enough failure data to represent the time to first failure for the equipment. With GRP model, we will be able to predict the expected number of failure at a given time as well as the expect time to the next failure in order to have a more accurate preventive maintenance schedule. However, the GRP parameter estimates using maximum likelihood is only limited to application where there is a reasonably enough failure data available as in this case. Otherwise, other estimation method can be applied such as Bayesian formulation should the data is extremely limited but past experience or engineering judgement may be available to help describing the systems’ failure behaviour.
403
6
REFERENCES
1
Ascher, H. & Feingold, H. (1984) Repairable systems reliability: Modeling, inference, misconceptions and their causes. Marcel Dekker, New York.
2
Veber, B., Nagode, M. & Fajdiga, M. (2008) Generalized renewal process for repairable systems based on finite Weibull mixture. Reliability Engineering and System Safety, 93, 1461-1472.
3
Krivtsov, V. V. (2007) Recent advances in theory and applications of stochastic point process models in reliability engineering. Reliability Engineering and System Safety, 92, 549-551.
4
Krivtsov, V. V. (2007) Practical extensions to NHPP application in repairable system reliability analysis. Reliability Engineering and System Safety, 92, 560-562.
5
Yanez, M., Joglar, F. & Modarres, M. (2002) Generalized renewal process for analysis of repairable system with limited failure experience. Reliability Engineering and System Safety, 77, 167-180.
6
Kijima, M. & Sumita, N. (1986) A useful generalization of renewal theory: counting process governed by non-negative Markovian increments. Journal Applied Probability, 23, 71-88.
7
Mettas, A. & Zhao, W. (2005) Modeling and analysis of repairable system with general repair. IEEE 2005 Proceedings annual Reliability and Maintainability Symposium,Alexandaria, Virginia USA, January 24-27.
8
Krivtsov, V. (2000) Monte Carlo approach to modeling and estimation of the generalized renewal process in repairable system reliability analysis. Dissertation for the degree of Doctor of Philosophy, University of Maryland.
9
Weckman, G. R., Shell, R. L. & Marvel, J. H. (2001) Modeling the reliability of repairable systems in aviation industry. Computers and Industrial Engineering, 40, 51-63.
Acknowledgement The authors are grateful to Universiti Teknologi PETRONAS for providing facilities and support for the research and thankful to those who provided invaluable inputs in completing this paper.
404
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
OPTIMISATION OF THE RELIABILITY BASED PREVENTIVE MAINTENANCE STRATEGY Yong Sun,a* Lin Ma,a Michael Pursera and Colin Fidgeb a
CRC for Integrated Engineering Asset Management, School of Engineering Systems, Queensland University of Technology, Brisbane, QLD 4001, Australia. b
Faculty of Information Technology, Queensland University of Technology, Brisbane, QLD 4001, Australia. * Corresponding author: [email protected], Tel: (61 7) 3138 2442, Fax: (61 7) 3138 1469
The Reliability Based Preventive Maintenance (RBPM) strategy is commonly used in industry to improve the reliability of engineering assets. In RBPM, a reliability threshold is predefined for a particular engineering asset. Whenever the reliability of the asset falls to this level, a preventive maintenance action is conducted to improve the asset’s reliability. As preventive maintenance is costly, finding optimal RBPM strategies for engineering assets, especially over a long term with multiple maintenance cycles, is of strategic importance to their owners, so as to increase their market competitiveness. Selecting an optimal RBPM strategy usually involves finding an optimal reliability threshold which enables the total expected cost, including repair cost, preventive maintenance cost and production loss, to be minimised. A number of factors such as required minimal mission time, customer satisfaction, human resources and acceptable risk levels can limit the ability of an organisation to achieve this objective. These factors are usually termed as constraints and have different influences on decision making. However, an effective tool which enables industries to make optimal RBPM decisions with consideration of the effects of these factors is still lacking. To address this issue, here we investigate these factors and identify critical constraints. Furthermore we develop an effective approach for determining the optimal RBPM strategy within the identified multiple constraints. Key Words: Engineering asset management, Optimisation, Preventive maintenance, Reliability based preventive maintenance, Multiple criteria decision making 1
INTRODUCTION
Preventive Maintenance (PM) has been widely used to improve engineering asset reliability in industry. As conducting PM is often costly, the asset’s owners need to optimise their PM decisions. This need is becoming even more pressing with the ever increasing complexity of machines and competitive market pressures. However, making optimal PM decisions in reality is often a difficult task. The objective of optimisation is normally to determine optimal PM times (or frequencies) which can minimise the total cost under different business constraints. These constraints include required minimal mission time, customer satisfaction level, available human resources and budget, as well as acceptable risk levels. Asset managers need an effective methodology or tool to assist them in optimising their PM decisions. Research on optimising Preventive Maintenance decisions has attracted considerable attention from engineering asset management practitioners and academic researchers [1-7] . Existing models and methods have largely focused on different assets or technical aspects of engineering asset management. For example, Sun et al. conducted a case study for pipeline PM decision optimisation [1] , and developed a Split System Approach (SSA) [8] based model for determining optimal PM strategies for production lines [3] . Ebeling’s analysis [5] indicated that an optimal maintenance frequency exists, but this analysis was conducted qualitatively rather than quantitatively. Kim and Makis developed a semi-Markov process based maintenance decision model under an assumption that the degradation of a system has multiple stages and the system is subject
405
to two types of failures, major and minor [9] . Chareonsuk et al.’s optimisation model [6] considered expected total costs per unit time and reliability, but it did not take other constraints into account. To deal with a long term PM schedule for new production lines, Percy et al. [7] postulated a new Bayesian method based approach but did not develop an applicable algorithm. An effective model for determining the optimal PM strategy for complex engineering assets with multiple imperfect repairs over its life-span is thus yet to be developed. Imperfect repairs indicate that the state of a system after a PM action can be anywhere between “as good as new” and “as bad as old”. This paper addresses this issue by developing a generic split decision-making process model: one that addresses both the basic AM decision-making process and the specific needs of the AM decision context. Our model is based on an analysis of the characteristics of typical industrial AM decisions while taking into account the NAMS Group’s decision process model and Rhodes’ five-step process model. There are a number of process modelling techniques applicable to modelling AM decisionmaking processes. We favour simple flowcharts in this paper because they are well-established, and familiar to most engineers and business managers. This paper addresses this issue and presents an advanced approach to optimising PM strategies under their two most critical constraints – minimum mission time and maximum acceptable failure risk. The rest of the paper is organised as follows: Section 2 describes the research scope in the paper, defines decision optimisation problems and formulates our corresponding mathematical model for solving the optimisation problem. Some numerical examples are presented to demonstrate the applications of the model in Section 3. Conclusions are given in Section 4.
2
MODEL FORMULATION Before mathematical formulation, the research of the paper needs to be scoped.
2.1 Research scope Currently, two types of Preventive Maintenance policies are commonly applied in industry. One is Time Based Preventive Maintenance (TBPM) and the other is Reliability Based Preventive Maintenance (RBPM). In the TBPM policy, the system is maintained based on scheduled PM times. The intervals between two PM actions may or may not be the same. In the RBPM policy, a control limit of reliability R0 is defined in advance. Whenever the reliability of a system falls to this predefined control limit, the system is maintained (see Figure 1). In Figure 1, T is a given timespan considered in T decisions. Rsc(τ)0 Parameters t 0 and Rsc(τ)1 Rsc(τ)n-1 t n+1 are the start Rsc(τ)n and end points of t the time-span. R0 Parameters ti Dt n Dt n+1 Dt3 …. Dt 2 Dt1 ( i = 1,2,K ,n ) are t 2 , …. the times for tn t0 t1 t n +1 t Preventive Maintenance actions. Duration Figure 1. Changes of the conditional reliability of an imperfectly repaired asset Dti is the time with a RBPM strategy interval between the i th and ( i - 1 )th PM actions ( i = 1,2,K ,n ). Parameter t is a relative time which changes from 0 (zero) to Dti ( i = 1,2,K ,n ). Limit R0 is the reliability threshold for controlling PM times. Function Rsc (t ) i ( i = 0,1,K ,n ) is the conditional reliability of the system in every PM cycle. The conditional reliability represents the system’s reliability under the condition that the preventively repaired components successfully survive until their individual PM times. For a complex engineering system, PM activities often belong to imperfect repairs because in most cases only some of the components instead of the entire system will be preventively repaired. As a result, the conditional reliability of a system just after a PM action cannot be restored to one Rsc(t)
The RBPM sometimes also stands for the Risk Based PM because of the following relationship,
406
Risk = Consequenceof failures· F (t ) ,
(1)
where, F (t ) is the failure probability and can be calculated based on reliability R(t ) as,
F (t ) = 1 - R(t ) .
(2)
For a specific engineering asset with certain failure modes, the consequences of failures are normally fixed. In this case, controlling risk can be converted to controlling reliability. Therefore, in this paper, we use reliability rather than risk for modelling development.
2.2 Optimisation problem Two major issues need to be addressed when making an optimal decision using a Preventive Maintenance strategy for engineering assets: (1) the changes in the reliability of engineering assets due to PM; and (2) maintenance related costs. Conflicting interests exist between these two issues. More frequent PM activities often need to be conducted and more resources need to be consumed if one wishes to maintain an engineering asset at a higher reliability level. As a result, maintenance related costs increase. On the other hand, lowering reliability requirements can reduce maintenance related costs. However, a lower reliability of an engineering asset usually means that this asset is prone to more breakdowns and greater losses in production. An effective maintenance strategy must balance both reliability and maintenance costs, i.e., find an optimal reliability threshold which enables the total expected cost, including repair cost, preventive maintenance cost and production loss, to be minimised. However, as indicated in Section 1, achieving this objective is subject to a number of constraints. These constraints can be roughly classified into two categories: (1) Limitations in the resources and capabilities that are available to an organisation, for example, human resources, financial resources, spare parts, maintenance performance, and available technologies. (2) Requirements to the business of an organisation, for example, minimum mission time (which is the required minimum time for an asset to implement a particular task without interruptions), minimum mission reliability (which is the minimum reliability required by a particular asset to implement a particular task), maximum acceptable risk level, and minimum customer satisfaction levels. Some constraints may cross the two categories, for example, registration or legal requirements. The constraints in the first category are “softer”, i.e., organisations have a higher ability to remove them. The constraints in the second category are “harder”, i.e., they usually cannot be relaxed. Therefore, when developing a PM optimisation model, the second category of constraints are those that have to be modelled. As risk and reliability are closely related, and customer satisfaction is usually associated with mission time and reliability, the minimum mission time and minimum mission reliability become the two most critical constraints. Therefore, the optimisation problem considered in this paper is defined as follows,
Ct = min[C r ( R0 ) + C m ( R0 )] , s.t. R0 ‡ Rmin ,
(3)
Dt i ‡ Tmin , i = 1,2,K ,n , where, Rmin is the minimum acceptable mission reliability and Tmin is the minimum possible mission time. Term Ct is the total expected cost. Terms Cr ( R0 ) and C m ( R0 ) are the risk related cost caused by the failure of assets and the maintenance related cost, respectively. Our prior analysis has revealed that maintenance costs will increase with increasing maintenance frequency whereas the costs due to breakdown of a production line decreases with increasing maintenance frequency [3] ,
C r ( R0 ) = k r [1 - R(T , R0 )] ,
(4)
Cm ( R0 ) = k m N (T , R0 ) ,
(5)
407
where, parameters k r and k m are two scale constants. Term N (T , R0 ) is the required number of PM actions over the period of time T . Function R (T , R0 ) is the overall reliability of the asset at time T . Both N (T , R0 ) and R(T , R0 ) are not only dependent on time T and the reliability threshold R0 , but are also dependent on the system configuration and the maintenance strategy. They can be calculated using the SSA [10] . If m components are repaired in j PM cycles (actions) and Lk indicates that component k ( k £ m ) receives its last repair in the Lk th PM action ( Lk £ j ), then the conditional reliability function of a system after the j th PM cycle is given by
Rs (t +
j
∑
m
Dt i ) 0
i =1
Rsc (t ) j =
m
∑ Dt )
i = Lk +1
k =1
Rk (t +
j
Rk (t + j
∑ Dt )
i Lk
, ( 0 £ t £ Dt j +1 , j = 0,1,K ,n ).
(6)
i 0
i =1
k =1 n
In Equation (6), let
∑ Dt
i
= 0 when Lk +1 > j . The reliability of the system can be calculated using a heuristic approach.
i = Lk +1
Function Rsc (t ) j is the conditional reliability of the system after the j th PM action. Function Rs (•) 0 is the overall reliability of the system before any repairs. Rk (•) 0 and Rk (•) Lk are the reliability of Component k before any repairs and after the Lkth PM action. Equation (6) can be used to calculate N (T , R0 ) , and also Dti ( i = 1,2,K ,n ). However, as Equation (6) is recursive, a computer program is normally needed to support the calculation. Term R(T , R0 ) can be calculated based on the relationship between the conditional reliability and overall reliability,
R (t ) j = Rk ( where, Rk (
∑ Dt )
∑ Dt )
j -1
j -1 Rsc (t ) j ,
( 0 £ t £ Dt j +1 , j = 0,1,K ,n ),
(7)
is the reliability of the component that is repaired at the j th PM action. Function R (t ) j is the
overall reliability of the system after the j th PM action. Using a recursive approach, R(T , R0 ) can be calculated.
3
A NUMERICAL EXAMPLE A serial system is assumed to be composed of ten identical components. The reliability of each component, Rcom (t ) is
t Rcom (t ) = exp - 539.4
5.59
.
(8)
An RBPM strategy is assumed to be employed to improve the system’s reliability. In this strategy, only one component will be preventively replaced by an identical one in any PM action. The ten components will sequentially receive their preventive replacement. The required minimum mission time is assumed to be 30 days. The required minimum mission reliability is assumed to be 0.85. The decision horizon is assumed to be 5000 days. Figure 2 shows the changes of the conditional reliability (left) and the overall reliability (right) of an imperfectly repaired asset with the RBPM strategy when the reliability threshold was set to 0.85. Figure 3 provides the same information when the reliability threshold is set to 0.9. From these two figures, it can be seen that both strategies can improve the reliability of the system. In both cases, the requirements for minimum mission reliability and the minimum mission time can be met. The minimum interval between two PM actions for the first case is 51.8 days and 47.8 days for the second case. Both are greater than 30 days. In addition, the required PM actions in both cases are the same (ten). However, the overall reliabilities of the system at 5000 days in these two cases are different: 0.599 for the first case and 0.671 for the second case. This result indicates that the second case is better than the first one.
408
Figure 2. The conditional reliability (left) and the overall reliability (right) of an imperfectly repaired asset with a RBPM strategy – the reliability threshold is 0.85
Figure 3. The conditional reliability (left) and the overall reliability (right) of an imperfectly repaired asset with a RBPM strategy – the reliability threshold is 0.9
409
Overall cost ($)
Reliability threshold
Figure 4. The changes of the total expected cost over different reliability thresholds
4
CONCLUSIONS Reliability (or Risk) Based Preventive Maintenance (RBPM) is a good strategy for improving the reliability of engineering assets. However, determining an optimal RBPM strategy usually involves finding an optimal reliability threshold which enables the total expected cost, including repair costs, preventive maintenance costs and production losses, to be minimised. The optimisation of RBPM is limited by many constraints. This paper has developed an effective model for optimising the RBPM strategy under two critical constraints – minimum mission time and minimum mission reliability. When conducting RBPM, the lowest reliability in every PM cycle is the same, equal to the predefined reliability threshold. However, the intervals between two PM actions in different PM cycles are often different and need to be calculated recursively. The newly developed model can effectively assist asset managers to choose an optimal RBPM strategy for their complex assets over multiple PM cycles. Although the mathematical model is developed based on a special case, the modelling approach can be adapted to more general cases.
5 REFERENCES
1 Sun Y, Ma L & Morris J. (2009) A practical approach for reliability prediction of pipeline systems. European Journal of Operational Research, 198(1), 210-214.
2 Ma L, Sun Y & Mathew J. (2007) Effects of Preventive Maintenance on the Reliability of Production Lines. In M. Xie (Eds) Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, Singapore. pp. 631-635. IEEE.
3 Sun Y, Ma L & Mathew J. (2006) Determination of optimal preventive maintenance strategy for serial production lines. In J. Mathew (Eds) Proceedings of The 1st World Congress on Engineering Asset Management, Gold Coast, Australia. Paper 22. Springer-Verlag.
4 Yeh RH, Kao K-C & Chang WL. (2009) Optimal preventive maintenance policy for leased equipment using failure rate reduction. Computers & Industrial Engineering, 57(1), 304-309.
5 Ebeling CE. (1997) An Introduction to Reliability and Maintainability Engineering. New York: The McGraw-Hill Company, Inc.
6 Chareonsuk C, Nagarur N & Tabucanon MT. (1997) A multicriteria approach to the selection of preventive maintenance intervals. Int. J. of Production Economics, 49(1), 55-64.
7 Percy DF, Kobbacy KAH & Fawzi BB. (1997) Setting preventive maintenance schedules when data are sparse. Int. J. of Production Economics, 51(2), 223-234.
8 Sun Y, Ma L & Mathew J. (2004) Reliability prediction of repairable systems for single component repair. In J. Lee (Eds) Proceedings of the International Conference on Intelligent Maintenance Systems, Arles, France. pp. S3-D. IMS.
9 Kim MJ & Makis V. (2009) Optimal maintenance policy for a multi-state deteriorating system with two types of failures under general repair. Computers & Industrial Engineering, 57(1), 298-303.
10 Sun Y, Ma L & Mathew J. (2007) Prediction of system reliability for multiple component repairs. In M. Helander, M. Xie, R. Jiao & K. C. Tan (Eds) Proceedings of The 2007 IEEE International Conference on Industrial Engineering and Engineering Management, Singapore. pp. 1186-1190. IEEE.
11 Holloway CA. (1979) Decision Making under Uncertainty: Models and Choices. Englewood Cliffs, NJ: Prentice-Hall, Inc.
Acknowledgments
This research was conducted within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government’s Cooperative Research Centres Program.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DYNAMIC E-MAINTENANCE IN THE ERA OF SOA-READY DEVICE DOMINATED INDUSTRIAL ENVIRONMENTS
Alessandro Cannata a,1, Stamatis Karnouskos b and Marco Taisch a,2
a Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Milano, Italy. 1 e-mail: [email protected]; 2 e-mail: [email protected]
b SAP Research, Karlsruhe, Germany. e-mail: [email protected]
The factory of the future will be heavily based on service oriented architecture (SOA) approaches. Business continuity will need to be guaranteed as interactions in the shop-floor become more complex and demanding. In parallel, sustainable manufacturing is emerging as a new approach to address economic, environmental and societal issues. In this context, maintenance will play a major role. In particular, an e-maintenance platform will have to deal with the emerging challenges and take advantage not only of the latest technologies but also of new collaborative concepts that will be possible in the future factory. We discuss here dynamic e-Maintenance in the era of SOA-ready device dominated industrial environments. Furthermore, we demonstrate how existing efforts in cross-layer SOA-based enterprises, in conjunction with an e-maintenance platform, can greatly enhance existing decision-making processes, supporting the transition towards sustainable manufacturing.
Key Words: service-oriented architecture, asset lifecycle management, open ICT cross-layer platform, e-Maintenance, web services, DPWS, dynamic asset discovery.
1 MOTIVATION
Industrial maintenance is gaining significance [7] within both the academic and the industrial community, as it develops from being considered a minor activity into a strategic task in operations management [14], increasingly referred to as asset lifecycle management. Moreover, maintenance is considered a major lever for moving towards sustainable manufacturing, in which economic, societal and environmental issues are properly considered [17]. Recently, the focus has been on e-Maintenance, that is, “maintenance support which includes the resources, services and management necessary to enable proactive decision process execution” [13], which is considered the major pillar of e-Manufacturing. However, effectively implementing e-Maintenance applications for asset lifecycle management requires several conditions to be met, and this is a challenging task. One major barrier is the absence of an open ICT platform that can fully support e-Maintenance practices [13] while also taking into consideration the latest concepts and technology trends, such as Service Oriented Architecture (SOA) approaches at device level. The future factory will be dominated by SOA [4], which provides new capabilities and enables the realization of sophisticated approaches based on the collaboration of devices and networked services within a single enterprise and among enterprises. This is a key issue especially for maintenance, as devices are no longer considered simple passive black boxes, but active entities that can monitor themselves, proactively inform third-party services about their status or maintenance needs, and therefore greatly enhance existing efforts towards remote and autonomous maintenance. Within the project SOCRADES (www.socrades.eu) we have developed a Service-Oriented Cross-layer infRAstructure for Distributed smart Embedded devices, an open approach for enabling, among others, effective interaction and collaboration among all entities in the future industrial domain, ranging from devices and engineering systems to enterprise systems. SOCRADES is a platform for next-generation industrial automation systems that exploits the Service Oriented Architecture (SOA) paradigm in a cross-layer way, i.e. at the device, network and business application level. In our case the SOA paradigm is implemented through Web Services technologies down to the device level, which enables the adoption of a unifying technology for all levels of the future enterprise, from sensors and actuators to enterprise business processes. In this way, different entities
(whether they are services or devices) can subscribe to and obtain the necessary information (an event-based infrastructure) while remaining agnostic to the actual implementation details. The SOCRADES Integration Architecture (SIA) [10, 16] demonstrates how the close coupling of devices that host web services locally, e.g. via the Devices Profile for Web Services (DPWS [2]) and the OPC Unified Architecture (OPC UA [8]), can offer several advantages and ease close collaboration with enterprise applications. These SOA-ready devices can host intelligence and offer their functionality in a service-oriented way. That, in conjunction with the overall platform functionality, has the potential to drastically change the way we design and deploy services within industry, and has a significant impact on asset management and e-maintenance. In this paper, we highlight how future industrial environments empowered by SOA platforms such as SOCRADES can seamlessly and effectively support e-Maintenance applications. In particular, since the SOCRADES platform bridges the communication gap between business applications (e.g. ERP) and shop floor applications (e.g. MES), it represents a useful support for implementing the e-Maintenance applications envisioned in the literature [13], e.g. on-line maintenance, collaborative maintenance, etc. In particular, we stress how the features of the integrating platform that we propose [10, 15] (such as Service Discovery, the Cross-layer Service Catalogue and Service Lifecycle Management) can be fully exploited in order to implement e-Maintenance practices such as real-time fault diagnosis/localization, predictive maintenance, intelligent support for maintenance decision making, etc. Some scenarios related to asset lifecycle management are analysed in order to better highlight the functionalities enabled by the platform in the future sustainable manufacturing environment.
2 THE E-MAINTENANCE CONCEPT
E-maintenance is a multifaceted concept, which has been studied from different perspectives and with different aims [13]. Muller et al. [13] define it as: “Maintenance support which includes the resources, services and management necessary to enable proactive decision process execution. This support includes e-technologies (i.e. ICT, Web-based, tether-free, wireless, infotronics technologies) but also, e-maintenance activities (operations or processes) such as e-monitoring, e-diagnosis, e-prognosis, etc.” E-maintenance enables four main maintenance strategies [13]:
• Remote maintenance;
• Predictive maintenance;
• Real-time maintenance;
• Collaborative maintenance.
Remote maintenance refers to the capability, enabled by ICT developments, to provide maintenance practices from anywhere, e.g. by third-party entities outside the enterprise borders. Through remote maintenance applications, an operator may complete his/her task without having to be physically present where the asset is located. This aspect of e-maintenance has a significant impact in terms of cost, downtime, speed of reaction and effectiveness of maintenance interventions, since experts in a specific field can be consulted from anywhere without needing to be physically present. Moreover, it dramatically affects the business models that can be applied to provide customers with maintenance services. Predictive maintenance (or condition-based maintenance) considers the adoption of models and methodologies to analyse real-time data coming from the monitored assets in order to provide optimized maintenance interventions. Predictive maintenance is the latest evolution of maintenance and reliability engineering, which aims to minimize failures in order to guarantee appropriate asset operation. This aspect of e-maintenance needs hardware and software components available at shop floor level (for example a watchdog agent [5]). These components collect data, analyse them, and provide bottom-up alerts that trigger maintenance interventions. Real-time maintenance focuses on reducing the time delay between the moment an event occurs on the shop floor (e.g. the failure of a machine) and the moment that information is transmitted to the responsible operator or manager. This increases the reactiveness of the enterprise in terms of maintenance activities. This maintenance strategy is part of the more comprehensive real-time enterprise concept, which addresses the reduction of information time delays in a more general scope, by looking at all the information related to the shop floor status, e.g. work-in-process, machine utilization, etc. Finally, collaborative maintenance refers to the capability, enabled by e-maintenance concepts, to allow collaboration among different areas of the enterprise (intra-enterprise collaboration) and among different enterprises (inter-enterprise). This aspect is closely connected with another stream of research that addresses collaboration in the industrial automation domain, the so-called Collaborative Automation Paradigm [3]. Indeed, the automation and maintenance domains share similar collaboration issues, such as the reduction of information interfaces, seamless communication, security, etc. Moreover, since e-maintenance is closely connected to e-manufacturing [11], there is a need for effective coordination among separate facets of manufacturing management, such as production, maintenance and
business activities. Nowadays, this coordination is still underdeveloped, and one of the main reasons is the lack of open tools that could support it. One of the most relevant challenges when implementing e-Maintenance concepts is the lack of an integrating information and communication infrastructure [1, 11–13]. A seamless infrastructure that can support the management and control of industrial operations (including maintenance) and connect them in real time to the higher-level layers (i.e. the business layer) is therefore the main issue to be addressed on the way towards e-maintenance applications.
Figure 1. e-Maintenance: Basis and main pillars
Hence, we consider the four maintenance strategies highlighted above as the main pillars of the e-maintenance approach. An appropriate information and communication infrastructure represents the basis that supports the e-maintenance strategies. This concept is represented in Fig. 1. Due to the relevance of the topic and its impact on the overall e-maintenance concept, in this paper we address the issue of the information and communication infrastructure. In the next section, we define the requirements that such an infrastructure should meet in order to be exploitable in the e-maintenance context.
3 INFORMATION AND COMMUNICATION INFRASTRUCTURE REQUIREMENTS
We selected the following requirements starting from previous research conducted within the SOCRADES project [10, 15]. These requirements are considered critical for future industrial environments and are directly coupled with the e-maintenance concepts:
• Interoperability: several e-maintenance platforms are based on proprietary technologies [13], which implies higher implementation costs and time, and hence slower market adoption. In order to effectively support Collaborative, Real-time and Remote Maintenance, the infrastructure should support open communication among different actors (i.e. collaborative), across different layers (i.e. interoperable), in a timely manner within the enterprise information system (i.e. real-time), and through the adoption of different, globally distributed tools and technologies (i.e. remote). One way could be to adopt a common language that reduces the need for interfaces among different systems: the aim is to have a common basis for the seamless operation of standard functions such as discovery, description, addressing, invocation, etc.
• Scalability and flexibility: due to rapidly changing markets and the on-going trend towards flexible and adaptive factories [9], scalable platforms are needed to effectively support all the pillars even under changing conditions, such as the number and/or type of assets monitored. Indeed, it is expected [6] that future factories will be composed of reconfigurable machines that will increase the level of dynamic behaviour at shop floor level. This dynamic behaviour also needs to be supported in terms of maintenance, through an appropriately scalable and flexible e-maintenance platform.
• Security: an open infrastructure in which rapidly changing business processes and collaboration among companies occur at several layers in an e-maintenance context (in particular considering Collaborative, Remote and Real-time Maintenance) needs to support security. The openness and heterogeneity of such systems require different security approaches from those of traditional systems and architectures. Security architectures must be flexible enough to be tailored to application-specific security requirements, but also customizable for policy and compliance purposes.
• Device to Business Integration (D2B Integration): device manufacturers are increasing the amount of embedded software in, and the sophistication of, their products. Therefore, new capabilities emerge on the shop floor that enable devices to actively participate in business applications by providing information from their domains and/or acquiring information from the enterprise level (e.g. devices can directly trigger an event in a business process and affect its execution [15]). An e-maintenance platform should consider this requirement and adopt a common approach to represent information among heterogeneous systems. This requirement is particularly important for supporting predictive maintenance and real-time maintenance.
• Distributed management: there is a trend towards shifting intelligence and processing tasks to field-level devices. In order to fully exploit remote and collaborative maintenance, a distributed infrastructure is required that is not limited to hierarchical control, as this constrains the opportunities enabled by heterogeneous, distributed and autonomous interaction among individual elements (e.g. operators, devices, watchdog agents [5], etc.). Moreover, this requirement can inherently add flexibility and scalability to the system by reducing the number of centralized points.
• Semantics: due to the relevance of ontologies in the e-maintenance context [13], the information infrastructure should inherently support knowledge processing. This is essential for implementing effective collaboration among heterogeneous actors, as it enables collaboration through a formal description of the elements and relationships in the industrial maintenance domain.
On the basis of these requirements, in the next section we describe the technologies that we propose to adopt in order to implement an effective e-maintenance platform. Moreover, in section 5 a description of the platform and of its components is given, together with an evaluation based on the identified requirements.
4 SOA-READY DEVICES
Web services are used mainly in enterprise environments to support interoperable, e.g. machine-to-machine (M2M), interaction while hiding the details of the implementation at each end-point. Enterprise applications use web services as basic blocks to create more sophisticated services, e.g. to glue together cross-organizational functionality. Several standards exist, but most of them do not assume embedded systems as an implementation platform. In the past, there have been efforts (e.g. Jini, UPnP) to integrate devices into the networking world and make their functionality available in an interoperable way. The latest one, coming from UPnP and attempting to fully integrate with the web-service world, is DPWS [2], which defines a minimal set of implementation constraints to enable secure web service messaging, discovery, description and eventing on resource-constrained devices. DPWS is an effort to bring web services to the embedded world, taking into consideration its constrained resources. Several implementations of it exist in Java and C (www.ws4d.org, www.soa4d.org), while Microsoft has also included a DPWS implementation (WSDAPI) by default in Windows Vista and Windows Embedded CE.
Figure 2. Asset management: dynamic device discovery and information, e.g. status, hosted services, serial number, etc.
Emerging standards like DPWS and OPC UA [8] assume (web) services running within the devices. The key idea is to provide interoperability and ease of integration of devices, focusing exclusively on the functionality they offer at the shop floor
and not on the device-specific implementation as such. As the device can now provide a variety of information about itself, as well as dynamic information about its status and the services it hosts, new approaches can be applied, e.g. in asset management. We can clearly see in Fig. 2 that any device with the DPWS stack can already be dynamically discovered, and information such as serial number, MAC address, IP address, model number, Unique ID (UUID), etc. can be obtained. Furthermore, there is a standard way to access the functionality on the device and, e.g., control it or obtain its health status. Since the device can now provide this information in a standard way via web services, other devices or services in an e-maintenance platform can subscribe to the events it creates. In case of a failure, the e-maintenance platform is notified. This is clearly a paradigm shift towards an event-based infrastructure, where information can be dynamically discovered and fed to the interested parties only.
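As a rough illustration of this dynamic discovery step, the sketch below broadcasts a WS-Discovery Probe as SOAP-over-UDP and prints any ProbeMatch replies. It is a simplified, hand-rolled example: a real deployment would use a DPWS toolkit (e.g. the Java or C stacks mentioned above) rather than building the message manually, and the XML namespaces shown follow the 2005 WS-Discovery draft, so they may differ depending on the DPWS version in use.

```java
// Minimal sketch of dynamic discovery: multicast a WS-Discovery Probe and
// print whatever ProbeMatch answers arrive within a short time window.
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class DpwsProbeSketch {
    public static void main(String[] args) throws Exception {
        String probe =
            "<s:Envelope xmlns:s=\"http://www.w3.org/2003/05/soap-envelope\""
          + " xmlns:a=\"http://schemas.xmlsoap.org/ws/2004/08/addressing\""
          + " xmlns:d=\"http://schemas.xmlsoap.org/ws/2005/04/discovery\">"
          + "<s:Header>"
          + "<a:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</a:Action>"
          + "<a:MessageID>urn:uuid:" + java.util.UUID.randomUUID() + "</a:MessageID>"
          + "<a:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</a:To>"
          + "</s:Header>"
          + "<s:Body><d:Probe/></s:Body></s:Envelope>";   // empty Probe: match any device

        byte[] payload = probe.getBytes(StandardCharsets.UTF_8);
        InetAddress group = InetAddress.getByName("239.255.255.250"); // WS-Discovery multicast group
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length, group, 3702));
            socket.setSoTimeout(3000);
            byte[] buf = new byte[8192];
            while (true) {   // collect ProbeMatch answers until the timeout fires
                DatagramPacket reply = new DatagramPacket(buf, buf.length);
                socket.receive(reply);
                System.out.println("Device answered from " + reply.getAddress());
                System.out.println(new String(reply.getData(), 0, reply.getLength(),
                        StandardCharsets.UTF_8));
            }
        } catch (java.net.SocketTimeoutException done) {
            // discovery window closed
        }
    }
}
```

The replies carry the devices' endpoint references, from which metadata such as model, serial number and hosted services can then be requested in the standard way described above.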
5 E-MAINTENANCE PLATFORM
The purpose of an e-maintenance platform is to coordinate maintenance information shared among different actors (devices, plant managers, external partners, business managers, decision support systems, etc.) and to provide the basic tools for decisions to be made. In the context of sustainable manufacturing, e-maintenance platforms are needed in order to make proper decisions (i.e. with accurate and near real-time information), taking into consideration all the relevant components of the production process and their respective impact. Since, in the maintenance domain, the impact on sustainability often depends on timely reaction to unexpected events (e.g. the timely identification of a failure that causes higher CO2 emissions may reduce the environmental impact), the e-maintenance platform emerges as a core support for sustainable manufacturing as well. As depicted in Fig. 3, the e-maintenance platform provides functionalities that in our case are easily realizable thanks to the SOCRADES Integration Architecture (SIA) [16]. SIA allows seamless interaction among devices and services hosted at different layers (e.g. at the device, network or enterprise level).
Figure 3. The e-Maintenance platform in the future shop-floor empowered by cross-layer SOA
Real-time Monitoring is possible and, as an event-based infrastructure is used, it is performed only when needed. Furthermore, via SIA and the web services on devices, management functions (soft control) can be executed directly at device level. These two fundamental functionalities (i.e. monitoring and management) are used by the e-maintenance platform and combined into more sophisticated service behaviour. As an example, the device status can be monitored, or an event can be raised by the device itself. As partial business logic can now be hosted on the device, a direct mapping can be made and the e-maintenance platform knows which parts of the business process are affected. Automated tickets can be issued immediately, e.g. a remote evaluation of the health status, an exchange of the device, or even an order to the ERP system for a repair task assigned to the nearest worker in the field. The e-maintenance platform can provide timely information that can be analysed by a Decision Support System (DSS), which can therefore predict or take sophisticated actions on the shop-floor with the goal of maintaining business continuity.
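The sketch below illustrates, in deliberately simplified form, the kind of mapping just described: a fault event published by a device is translated into the business-process steps it affects and into a maintenance ticket. All class and method names are hypothetical and are not part of SIA or any specific product.

```java
// Illustrative sketch only: map a device fault event to affected process steps
// and raise a prioritised maintenance ticket. Names are hypothetical.
import java.util.List;
import java.util.Map;

public class FaultEventHandlerSketch {

    record DeviceEvent(String deviceId, String type, double severity) {}
    record Ticket(String deviceId, String action, int priority) {}

    // Hypothetical mapping from devices to the process steps that depend on them.
    private final Map<String, List<String>> affectedProcessSteps;

    FaultEventHandlerSketch(Map<String, List<String>> affectedProcessSteps) {
        this.affectedProcessSteps = affectedProcessSteps;
    }

    // Called by the event-based infrastructure whenever a subscribed device
    // publishes a status change.
    Ticket onEvent(DeviceEvent event) {
        if (!"FAILURE".equals(event.type())) {
            return null;                        // ignore routine status updates
        }
        List<String> steps = affectedProcessSteps.getOrDefault(event.deviceId(), List.of());
        int priority = steps.isEmpty() ? 3 : (event.severity() > 0.8 ? 1 : 2);
        System.out.println("Process steps at risk: " + steps);
        return new Ticket(event.deviceId(), "dispatch nearest field worker", priority);
    }

    public static void main(String[] args) {
        FaultEventHandlerSketch handler = new FaultEventHandlerSketch(
                Map.of("robot-7", List.of("assembly", "packaging")));
        Ticket t = handler.onEvent(new DeviceEvent("robot-7", "FAILURE", 0.9));
        System.out.println("Ticket: " + t);
    }
}
```

In a real platform such a handler would be registered as an event subscriber, and the resulting ticket would be forwarded to the ERP or work-order system rather than printed.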
Peer-to-peer communication among the devices also leads to increased flexibility, as a malfunctioning device, with the help of the maintenance platform, can identify devices that host a similar set of services and delegate to them part of the functionality necessary to realize the tasks pending on it. It is clear that the e-maintenance platform can be empowered with new capabilities and deal with dynamically arising situations by using discovery and an event-based infrastructure. Furthermore, as the communication is done in a standardized way, e.g. via Web Services, interoperability is enhanced, while integration costs are lower than in legacy systems.
5.1 Real-time Monitoring
In Real-time Monitoring, SOA-ready devices provide information on their status to the higher-level systems through a bottom-up communication approach. Information is communicated on an event basis (and not on a pull schedule) and can be propagated across several layers that support SOA interaction. Raw data (e.g. the temperature in an industrial oven) or processed information (e.g. the expected time to failure derived from condition-based analysis) can be generated and delivered by the device itself, as it now hosts logic and computational capabilities locally. Through the e-maintenance platform, information can be conveyed to heterogeneous actors, e.g. the plant manager responsible for shop floor operation or a business manager, who need fine-grained information from the shop floor level. Since seamless and real-time information can be obtained through cross-layer SOA, new knowledge can potentially be created anywhere by composing, in a Lego-like way, the services offered at device, network and enterprise level. Real-time Monitoring is particularly relevant in the context of sustainable manufacturing, where quick reactions to unexpected events need to be provided in order to minimize economic, environmental and societal security risks. In the context of Real-time Monitoring, the e-maintenance platform is responsible for conveying information even beyond the company borders, e.g. when device health monitoring is outsourced. Heterogeneous actors (operators, managers, etc.) may subscribe to or unsubscribe from the Real-time Monitoring of specific SOA-ready devices; extreme flexibility can therefore be obtained, both in terms of the devices monitored and of the actors interested in monitoring them.
5.2 Real-time Management
Real-time Management traditionally focuses on top-down approaches to control and manage shop floor devices. Maintenance operators or plant managers may implement their decisions directly on the device through the e-maintenance platform. For example, a plant manager may directly switch off a specific machine, ask for a maintenance intervention and meanwhile reroute the production flow to other machines; all this can be done in a seamless and transparent way thanks to the adoption of cross-layer SOA for the e-maintenance platform. In order to implement this, a repository of devices (i.e. a service repository) needs to be provided at the e-maintenance level, so that higher-level systems (owned by plant managers, business managers, etc.) may retrieve the requested devices and perform a specific action through service invocation. This functionality is already provided in our case via the SIA, and the developers of services for e-maintenance do not have to deal with the complexity or the specifics of the infrastructure; they rather use SOA techniques to request the necessary information from the SIA platform. A minimal sketch of such a repository lookup and service invocation is given below.
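The following sketch shows the repository idea in miniature: devices register the services they host, higher-level applications look them up by capability rather than by network address, and invoke an operation (soft control). The interfaces are hypothetical illustrations, not the actual SIA API.

```java
// Hedged sketch of a service repository: lookup by capability, then "soft control".
import java.util.ArrayList;
import java.util.List;

public class ServiceRepositorySketch {

    interface DeviceService {
        String deviceId();
        List<String> capabilities();            // e.g. "switch-off", "report-health"
        String invoke(String operation);        // stands in for a web-service call
    }

    private final List<DeviceService> registry = new ArrayList<>();

    void register(DeviceService service) { registry.add(service); }

    // Plant or business applications query by capability, not by network address.
    List<DeviceService> find(String capability) {
        return registry.stream()
                .filter(s -> s.capabilities().contains(capability))
                .toList();
    }

    public static void main(String[] args) {
        ServiceRepositorySketch repo = new ServiceRepositorySketch();
        repo.register(new DeviceService() {
            public String deviceId() { return "press-12"; }
            public List<String> capabilities() { return List.of("switch-off", "report-health"); }
            public String invoke(String op) { return "press-12 executed " + op; }
        });
        // A plant manager's decision, e.g. take the machine out of service:
        for (DeviceService s : repo.find("switch-off")) {
            System.out.println(s.invoke("switch-off"));
        }
    }
}
```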
It is clear that the e-maintenance platform now goes beyond specific network borders, or even geographical ones, and therefore a real-time view on a global scale can be achieved. Real-time Management in combination with real-time analytics may provide a better overview of the assets and their status, and make it possible to start preventive measures or react in a timely manner to emerging problems. Furthermore, the timely generated information can be fed to Decision Support Systems that may help managers with their respective decisions, also by presenting alternatives and simulating possible scenarios. Finally, through real-time management, decision makers are provided with an overall picture of the manufacturing plant that could support sustainable directions. For example, the reduction of emissions through proper maintenance interventions, and the minimization of dangerous failures through appropriate analysis of fine-grained data coming from the shop-floor, could be exploited.
5.3 Dynamic Discovery
Dynamic Discovery is a specific feature enabled by SOA-ready devices. Indeed, since devices implement SOA specifications such as WS-Discovery (which is part of DPWS), they can now be seamlessly discovered without the need for explicit registration. Furthermore, their services can also be dynamically discovered and used, which really brings the benefits of service-oriented approaches to the lowest levels of industrial environments. In practice, this means that if a new device is added to the production system, it is automatically recognized by the e-maintenance platform, registered and monitored. As such, we avoid the possibility of mistakes and always keep the mapping between the real world and the business world in the asset management systems up to date. Being able to obtain dynamically accurate information about the devices and their services can help prevent, or identify in a timely manner, conflicts that would otherwise be discovered only after a problem arises, e.g. a production halt.
5.4 P2P Communication
As depicted in Fig. 3, another functionality that can now be realized is peer-to-peer communication among heterogeneous devices. Devices using web services (e.g. DPWS) are able to communicate directly with each other. This allows the decentralization of management and enhances collaborative scenarios where several devices cooperate, e.g. for decision making. For example, consider a redundant production system composed of two identical machines: a machine that has just entered a failure mode may verify the actual operation of the other identical machine in order to decide the priority level of the maintenance intervention request to be sent to the e-maintenance platform. As such, a partial failure of a device may result in it using the same service offered by a nearby device and continuing its operation. This enables a more reliable shop-floor with increased uptime, thus reinforcing the business continuity goal.
5.5 Cross-company communication
Cross-company communication is already a reality, but constrained to the enterprise level only. However, the real-time connection to the devices will now enable them to interact with, or inform, actors above the company level about their status. As such, the malfunctioning of a device that may result in a production slowdown may have an impact on the performance of a production line in another company which expects input from the first one. Communication can now be done directly, e.g. via common trusted third-party service providers that may simply couple the two companies for the specific business case. As this can now be communicated directly, we avoid costly communication links propagating the information through all the above-enterprise layers in both companies. Synergies can be identified, and information that was too costly to obtain in a timely manner can now flow into cross-company applications and services. This approach is very well suited for dynamic and short-lived interactions that can be set up, exploited and removed as easily as a simple composite service.
Figure 4. Remote continuous outsourced cross-company e-maintenance
Cross-company collaboration allows us to realize new functionality and innovate in the services offered. Especially in the case of outsourced maintenance, specialized partners can now bring in their expertise, monitor the devices at the shop-floor remotely and maintain them. Assets that the company operates may in the future not be owned by the company as such, but instead be provided to it under specific service level agreements (SLAs), e.g. a production line with 99% uptime; how this is achieved, and its maintenance, is the responsibility of the service provider. As a result, companies can focus more on their core business, while service level agreements regulate shop-floor performance so that it better matches the business process goals, but not how this is achieved, which is the responsibility of the e-maintenance partner. This can facilitate the development of new business models based on remote maintenance service delivery through the e-maintenance platform.
E-MAINTENANCE EFFECT IN DECISION MAKING PROCESS
As we have discussed so far, there are several benefits that can be brought with the introduction of a SOA based emaintenance platform. The most important one is that business continuity can be enhanced. Business continuity describes a mentality or methodology of conducting day-to-day business. This assumes that critical business functions must be available to
business partners and suppliers, and a way to do that relates to minimizing downtime, which is part of the goals of an e-maintenance platform. By being able to directly access information, propagate it across different layers, and endorse predictive and collaborative maintenance approaches, significant contributions can be made to the several steps of the business continuity process.
Figure 5. Timely reporting with e-maintenance
Fig. 5 depicts in the centre the decision making hierarchy according to the ISA-95 standard. Today the biggest problem is the interaction among the different levels and the integration of the information generated. As depicted on the left side of Fig. 5, reporting is currently segmented and hierarchical. Furthermore, the information flow is too slow, which can even translate into days until information reaches the business level. The adoption of an e-maintenance platform can have a significant impact here. Apart from the other benefits of SOA and the explicit ones discussed in previous sections, we can now realize cross-layer integration and information flow. This leads to collaboration among the different layers in a peer-to-peer way, without necessarily having to go through the whole hierarchy. In practice, this means that an enterprise service can be informed directly by a device or, e.g., an MES system about possible problems that directly affect a business process. This results in reduced reporting time and the timely dissemination of information to the appropriate and interested parties only, via the event-based infrastructure.
6.1 Conclusions
We have investigated the benefits (in terms of e-Maintenance applications) of adopting a SOA-based platform that integrates the business and shop floor levels through web services technologies. The whole approach demonstrates that SOA-ready devices can further empower e-Maintenance capabilities and pave the way for better business continuity and more sustainable manufacturing. Since e-Maintenance will gain importance in the factory of the future, this reinforces the relevance and value of the proposed platform for next-generation industrial systems. It is clear that the SOA-empowered e-maintenance platform can provide a significant business advantage with respect to the timely delivery of information to the interested parties. Furthermore, new service-driven business models can be realized, employing outsourced expertise and predictive maintenance. If implemented properly, enterprises and their collaborators will benefit from improved asset management, optimal performance and seamless integration.
7 REFERENCES
1 Campos, J. (2009) Development in the application of ICT in condition monitoring and maintenance. Computers in Industry, 60, 1–20.
2 Chan, S., Kaler, C., Kuehnel, T., Regnier, A., Roe, B., Sather, D., Schlimmer, J., Sekine, H., Walter, D., Weast, J., Whitehead, D., Wright, D. (2005) Devices profile for web services. Microsoft Developers Network Library.
3 Colombo, A.W., Jammes, F. (2009) Integration of cross-layer web-based service-oriented architecture and collaborative automation technologies: The SOCRADES approach. In: Proc. of the 7th IEEE International Conference on Industrial Informatics (INDIN 2009).
4 Colombo, A.W., Karnouskos, S. (2009) Towards the factory of the future - a service-oriented cross-layer infrastructure. In: ICT Shaping the World, A Scientific View, ETSI, John Wiley and Sons Ltd. ISBN: 9780470741306.
5 Djurdjanovic, D., Lee, J., Ni, J. (2003) Watchdog agent - an infotronics-based prognostics approach for product performance degradation assessment and prediction. Advanced Engineering Informatics, 17, 109–125.
6 European Commission (2004) Manufuture: A vision for 2020. Report of the High Level Group, 1 (92-894-8322-9), November 2004.
7 Garg, A., Deshmukh, S.G. (2006) Maintenance management: literature review and directions. Journal of Quality in Maintenance Engineering, 12(3), 205–238. DOI 10.1108/13552510610685075.
8 Hannelius, T., Salmenpera, M., Kuikka, S. (2008) Roadmap to adopting OPC UA. In: Proc. 6th IEEE International Conference on Industrial Informatics (INDIN 2008), pp. 756–761. DOI 10.1109/INDIN.2008.4618203.
9 Jammes, F., Smit, H. (2005) Service-oriented paradigms in industrial automation. IEEE Transactions on Industrial Informatics, pp. 62–70.
10 Karnouskos, S., Baecker, O., de Souza, L.M.S., Spiess, P. (2007) Integration of SOA-ready networked embedded devices in enterprise systems via a cross-layered web service infrastructure. In: Proc. of the IEEE Conference on Emerging Technologies & Factory Automation (ETFA), pp. 293–300. DOI 10.1109/EFTA.2007.4416781.
11 Koc, M., Ni, J., Lee, J., P., B. (2003) Introduction of e-manufacturing. In: Proceedings of the 31st North American Manufacturing Research Conference (NAMRC), Hamilton, Canada.
12 Kumar, U. (2008) System maintenance: Trends in management and technology. URL http://dx.doi.org/10.1007/978-184800-131-2_47.
13 Muller, A., Crespo Marquez, A., Iung, B. (2008) On the concept of e-maintenance: Review and current research. Reliability Engineering & System Safety, 93(8), 1165–1187. DOI 10.1016/j.ress.2007.08.006.
14 Pinjala, S.K., Pintelon, L., Vereecke, A. (2006) An empirical investigation on the relationship between business and maintenance strategies. International Journal of Production Economics, 104(1), 214–229. URL http://ideas.repec.org/a/eee/proeco/v104y2006i1p214-229.html.
15 de Souza, L.M.S., Spiess, P., Guinard, D., Koehler, M., Karnouskos, S., Savio, D. (2008) SOCRADES: A web service based shop floor integration infrastructure. In: Internet of Things 2008 Conference, Zurich, Switzerland, March 26-28, 2008.
16 Spiess, P., Karnouskos, S., Guinard, D., Savio, D., Baecker, O., de Souza, L.M.S., Trifa, V. (2009) SOA-based integration of the internet of things in enterprise services. In: Proceedings of ICWS 2009 (IEEE International Conference on Web Services), Los Angeles, California, USA.
17 Seliger, G., Kim, H-J., Kernbaum, S., Zettl, M. (2008) Approaches to sustainable manufacturing. Int. J. Sustainable Manufacturing, 1(1/2).
Acknowledgment
The authors would like to thank the European Commission and the partners of the European IST FP6 project “Service-Oriented Cross-layer infRAstructure for Distributed smart Embedded devices” (SOCRADES - www.socrades.eu) for their support.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DESIGN REQUIREMENTS FOR WIRELESS SENSOR-BASED NOVELTY DETECTION IN MACHINERY CONDITION MONITORING
Christos Emmanouilidis and Petros Pistofidis
ATHENA Research & Innovation Centre, CETI, Comp. Sys. & Applications Department, 58 Tsimiski St., Xanthi, Greece.
Wireless sensor networks are increasingly employed in a range of applications. Condition monitoring in particular can benefit from the introduction of distributed wireless sensing solutions operating with a high degree of autonomy. Wireless condition monitoring can extend the toolset available for the lifecycle management of engineering assets, offering ease of installation, flexibility, portability and accessibility. A significant hurdle for the adoption of wireless condition monitoring solutions in industry is the extent to which such solutions can operate over long time periods while providing adequate monitoring. Wireless sensor nodes extend the sensor functionality by providing on-board CPU, memory, power management and communications capabilities. Yet these are inherently limited due to the small form factor of the devices. Even in the case of sensor nodes with power harvesting capabilities, minimizing the node's energy consumption remains at a premium. Apart from making the hardware design more energy efficient, sensor nodes can operate more efficiently if they manage to minimize their energy-consuming activities while meeting condition monitoring performance requirements. Low-level power management should be dealt with at the level of the sensor operating system. At the application end, a sensor node must feature some form of smart behaviour, enabling it to recognize events that deserve further attention. This is of profound importance, as engineering assets equipped with embedded novelty detection capabilities would lend themselves to enhanced and sustainable operation. In this paper we study the design requirements for developing novelty detection techniques as middleware components embedded on a single sensor board. Such smart components would enable the detection of events that signal the presence of unusual behaviour in the monitored equipment. On the basis of the identified design requirements, a conceptual architecture for the development of wireless sensor board-level novelty detection is discussed.
Key Words: Condition Monitoring, Wireless Sensor Networks, Novelty Detection, Smart Sensors
1 INTRODUCTION
Wireless sensor networks constitute one of the most promising technologies for providing a sophisticated infrastructure for condition monitoring processes. Sensor nodes are increasingly implemented as complex sensor board architectures, driving the emergence of a new breed of monitoring devices. In a typical setting, they are able to host automated computational and data storage operations. The fact that processing power and low-latency memory have become increasingly available and affordable for many types of small-scale designs has led to significant sensor board advancements in terms of both hardware components and supporting software. Recent examples of cutting-edge wireless sensor modules are powered by a potent 32-bit processor/controller accompanied by several megabytes of flash memory, while support for external memory can also be available. Research utilizing these architectures has made possible the transition of sensor board logic from simple protocols and algorithms running on bare metal to complex middleware platforms sitting on top of dedicated sensor operating systems and providing interfaces to application components. Recent research on sensor platforms has produced implementations that include programmable APIs and versatile toolkits, able to provide a rich base for developers to utilize and extend the sensor module's functionality. Energy-aware transmission protocols and sampling algorithms were the first to upgrade their computational complexity in order to trade the significant cost of data transmission for cheap processing cycles. This trade-off was made possible through the development of dedicated software components utilizing advanced methods and approaches for data acquisition, data modeling and data interpretation. Currently, the level of computational complexity that characterizes the operations embedded on a sensor board is limited only by the processing and storage capabilities of the board's implementation. In wireless condition monitoring, another
concern is to determine how much data processing should take place at the sensor level and how much data should be transmitted. A high level of processing sophistication at the sensor level may eventually lower the need for RF transmission. Clearly, a wireless condition monitoring application needs to strike a balance between the two. One solution is to design a sensor node that features some form of smart behaviour, enabling it to recognize events that deserve further attention. Often termed novelty detection, this functionality involves modelling the normal system function and detecting any deviations from it. Thus, condition monitoring greatly benefits from the integration of smart novelty detection capabilities. Firstly, data transmission can be initiated only when novelty is detected, thus reducing the energy spent on transmissions; secondly, novelty detection at the sensor level can be easier to adjust to serve diverse application needs. A significant functionality of any integrated condition monitoring solution is to detect when the machinery operation deviates from known patterns of normal operation. This functionality is usually termed novelty detection. Although novelty detection has been extensively studied in the literature, the recent surge in wireless sensor network application solutions enables the development of novelty detection and condition monitoring solutions for wireless sensor boards and networks. This potential is largely unexplored in the literature. This paper looks into how novelty detection and condition monitoring functionalities can be embedded in wireless sensor nodes and networks. It defines a reference architecture for sensor-embedded novelty detection. This architecture supports basic and advanced novelty detection functionalities in a modular approach, defining individual components and services that can be embedded at the sensor board level. Focusing on design considerations, the paper is structured as follows: section 2 discusses wireless sensor network architectures and functional requirements for wireless smart sensors; section 3 deals with adaptation issues in novelty detection; section 4 outlines the key design considerations for sensor-embedded novelty detection and introduces a novel reference architecture for sensor-embedded novelty detection; and section 5 concludes. Instead of seeking an exhaustive discussion of the above issues, the aim is to focus on techniques either employed or having the potential to be employed in condition monitoring applications.
2 WIRELESS SENSOR NETWORKS
Condition monitoring was from the outset an early adopter of emerging sensor technology, since sensing operations populate its core functionality and provide the means for identifying machinery condition. Soon after wireless technologies started to mature in modular designs and implementations, advances in network hardware integration and energy-aware protocols led to the concept of wireless sensor networks [1]. In addition, the mass spread of small wireless devices enables information access anytime and anywhere. With wireless sensor networks, scalable monitoring environments can be created, as node addition is straightforward. The main advantages brought by wireless sensors include:
• Ease of installation, as sensor positioning is freed from the constraints of cabled installations;
• Accessibility: every point of measurement becomes accessible;
• Simplified network design: dynamic topologies for fault-tolerant networks supported by flexible protocols;
• Scalable, large and yet maintainable infrastructures, as new nodes can easily be added to the network;
• Fault tolerance: redundancy in the sensor network enables it to tolerate faulty nodes.
A smart sensor is a term advocating the presence of sophisticated sensor functionality. Smart transducers are defined in the IEEE 1451 family of standards. These outline a set of protocols for wired and wireless distributed monitoring and control applications. IEEE 1451 smart transducers are expected to have capabilities for self-identification, self-description, self-diagnosis, self-calibration, location-awareness, time-awareness, data processing, reasoning, data fusion, alert notification (report signal), standard-based data formats and communication protocols, supported by the so-called TEDS (Transducer Electronic Data Sheets) [2]. Smart sensor behaviour may vary from simple signal amplification to advanced data modeling techniques for condition monitoring [3]. The characteristics and functionalities of a smart sensor are listed below [4]:
• They include the processing capacity and the proper software routines to process data locally.
• They can make efficient use of the network infrastructure through complex protocols and distributed communication patterns. Smart sensors are able to implement policies that enhance network robustness and flexibility, and lessen the burden on centralized nodes.
• They can support the execution of advanced distributed processes. These may include collective decisions, node task allocation and workflow management for the entire network.
• They should be able to classify data according to its criticality, in order to avoid unnecessary data processing during a critical stage of the monitored item. Smart sensors can evaluate situations and configure the sensing frequency, enabling better monitoring performance when a critical state is identified.
• They should be capable of self-diagnosis and self-calibration, and be able to periodically prompt coordinating sensors to collect and process network statistics. Such processing can result in network self-customizations that balance topology and upgrade sensing performance. Faulty sensors can be easily identified, while the deployment of new nodes can be estimated based on sensing coverage maps and algorithms.
• They can be re-programmed, allowing the network to receive remote software updates. Additional processing techniques can be downloaded to a smart sensor. This feature eliminates network down-time for updates.
Sensor research & technology has been effectively serving a multitude of diverse applications. The result of this research is the development of open implementations that include fully featured operating systems (TinyOS [5], MantisOS [6], SOS [7]), flexible tiny-scale databases, and even service-based platforms (Fig. 1). A taxonomy and evaluation of these implementations provides valuable feedback that supports standardization initiatives and software architecture research [8] [9] [10] [11]. From a software platform perspective, sensor nodes have made significant leaps towards improved embedded logic. The software side of sensor nodes has, in most wireless sensor implementations, been rather primitive. Most approaches seek to optimize hardware rather than software design: • Many pioneering concepts acting as cornerstones in the evolution of sensor technology, such as the ‘Smart Dust’ Project [12], focused their novelty on hardware design optimizations. Integrating sensing elements in small or tiny-scale architectures is a design decision that usually sacrifices the functionality potential and the complexity of the embedded logic for an energy-saving and low-cost hardware implementation. When wireless sensor networks are employed to monitor vast industrial environments or track mobile assets, the selected sensor design is one that offers small, durable nodes. Such systems comprise networks of hundreds or even thousands of tiny sensors. When designing these systems, low cost and long energy life are high-priority features due to the large number of nodes. These features are associated with small-footprint hardware architectures with elementary functionality, both of which ensure significant battery life. A large number of research projects, while attempting to address the monitoring needs of specific applications (e.g. condition monitoring, tracking, surveillance), adopted sensor technology at its very first steps. Their research quickly matured to offer the first examples of sensor network systems. These systems were, and still are, based on the strict design specifications of an isolated standalone system built to provide application-focused functionality. Porting to a more interoperable and scalable software logic is a design decision that requires extensive re-engineering and adds processing overheads. This decision trades the performance of a kernel-based function set (minimized logic) for the flexibility of a scalable multitier platform (advanced logic). Most application-oriented research projects are not willing to accept this trade-off.
Figure 1. Smart Sensor Operating System & Middleware •
•
A key question is whether research and development effort can swift the above trends and decisions to focus more on sensor-embedded logic. A significant amount of research in early sensor technology was conducted through projects, whose requirements and characteristics were defined by various applications. Surveillance and monitoring systems constitute active research fields with far greater history than the history of sensor technology. Sensor nodes where introduced in research labs as the means for advanced sensing serving their applications. The fast adoption of sensors’ early implementations and their design-scale upgrade to “small devices” level, were the key factors that focused and ignited research on advancing sensor networking technology itself. Smart nodes in such networks can have significant capabilities: •
•
•
They can offer advanced embedded logic in cost-effective and tiny-sized board architectures. Most of the latest sensor architectures include enough processing power and memory that can feed the development of small to medium scale intelligence. Both research and market acknowledge sensor modules as “devices” with separate software driving their customizable functionality. Board optimizations have become as important as the development of software upgrades, since the potentials of the second can greatly benefit every functional aspect of a sensor module. Application requirements can easily be translated into processes and in turn, implemented in software routines. Provided with adequate software, sensor networks can utilize sophisticated protocols to ensure reliable network presence even in the most geographically distributed placement. Advanced energy-saving algorithms can synchronize sleep states and calibrate sampling techniques, thus extending the sensor’s life time to its maximum potentials A very important feature that needs to be decided, early in the design phase of a new smart sensor, is the level of interoperability that will be provided by its embedded software. The feasibility of a cross-vendor, cross-platform and cross-implementation sensor network is essentially defined by the presence of [13]: 1. Design specifications and functional requirements for sensor modules. 2. Standardized specifications for the format of sensor data inside and outside of the sensor’s logic. 3. Widely adopted and standardized interfaces for embedded processes and tasks. 4. Widely adopted and standardized protocols for inter-process and node-to-node communication and data exchange.
3 NOVELTY DETECTION AND CONDITION MONITORING
Detecting novel events is a first and key step in condition monitoring. The problem of efficient and reliable novelty identification in a series of maintenance data has to be addressed at several levels of a condition monitoring system. From the top level, involving off-line data processing, down to the sensor level of on-line sample collection, novelty detection is defined as the identification of deviations from known machinery behaviour patterns. A key difficulty is related to the fact that monitored quantities may differ for different machinery, while they may also vary even for the same machinery, due to a number of reasons, including the exact sensor positioning and the presence of noise and interference from neighbouring equipment [14]. This implies that any successful novelty detection implementation critically depends on the ability to capture the individual equipment behaviour, expressed in patterns observed in the measured signals. In order to achieve this, novelty detection must employ empirical models that can be calibrated to serve individual application needs, while equipped with adaptation mechanisms, so as to enable them to capture the different observed patterns of operating state for each application case [15]. Novelty detection classification sometimes distinguishes between statistical and neural network approaches [16, 17], but as many neural network approaches can be shown to bear statistical relevance, it is preferable to focus on the type of learning mechanism employed, which may be similar even for different families of models. Depending on the nature of the employed learning mechanism, novelty detection is usually implemented with supervised or unsupervised learning, with other forms of learning, such as semi-supervised or reinforcement learning, not being frequently pursued in this application domain. The key difference between supervised and unsupervised learning is that the former employs the known operating state (normal/abnormal) of training patterns to drive model training, while the latter attempts to cluster training data into categories based on input space similarity. Both are valid approaches with advantages and risks. Unsupervised learning can exploit a larger amount of data, as it does not depend on a priori knowledge of the operating state that the data belong to. However, it carries the risk that input space similarity may not be transferable to output space similarity. In other words, similar data may sometimes correspond to different operating conditions. On the other hand, supervised learning can drive the empirical models to adequately map input data to the operating condition, but requires that this condition is known for all employed training data. In all cases, critical to the success prospects of any empirical modelling technique for novelty detection is its ability to adapt on the basis of accumulated evidence offered by newly acquired data. Substantial research efforts have been devoted to condition monitoring and fault detection. Each machinery class, along with its special attributes, has driven the exploration of numerous variations of widely adopted intelligent modeling approaches, such as knowledge-based systems, fuzzy logic, machine learning, neural networks and genetic algorithms.
This paper will not attempt to provide a more comprehensive or up-to-date summary of these techniques; rather, it proposes placing their implementation on a different level (the sensor board) of their supporting infrastructure (the sensor network). As the novelty detection functionality will be specified as a modular component in a reference sensor-based novelty detection architecture, any novelty detection algorithm can be plugged into it. When adaptive learning methods are employed to support novelty detection in condition monitoring systems, a range of issues needs to be addressed [18]:
• Fault detection and/or identification of the symptoms that indicate an imminent failure.
• Identifying the fault’s characteristics. This can be achieved through approximations that model a fault as a function of the control input and state variables.
• Effective fault accommodation strategies, through self-correction based on the feedback from the control algorithm.
• Upgrading the ability of the detection scheme to avoid false alarms in the presence of modeling uncertainties. Such a goal can be achieved by on-line learning of the modeling uncertainties.
However, the majority of published research on intelligent condition monitoring executes the analysis and modeling tasks away from the actual sensor, on portable or desktop computing devices or data collection servers. The computing may be performed on-line or off-line. When considering porting novelty detection techniques to wireless sensor networks, a key limitation is that many of these techniques have significant resource requirements, both in terms of computational power and memory storage. Furthermore, the limited power autonomy of wireless sensor nodes makes it imperative to design a sophisticated strategy that determines when, and what quantity of, data should be transmitted, so as to avoid excessive power-consuming RF transmission. Sensor-level novelty detection can be implemented as a set of functions and routines embodied in the sensor’s software platform. Attempting to embed adaptive learning techniques in sensor software components is a very challenging task. The sensor board’s processing power and memory capacity are generally low compared to the minimum requirements for executing such methods. As mentioned before, non-smart sensors are equipped with fixed routines that lack processing complexity, configurability and performance. On the other hand, smart programmable sensors constitute an emerging device technology that can embody and effectively execute small-scale learning techniques. Their advantages in terms of hardware architecture and software platform have been analyzed in section 2.
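As a concrete, deliberately small-scale example of what such a sensor-embedded routine might look like, the sketch below keeps a running mean and variance of the monitored quantity (Welford's online update) and flags, and only then transmits, samples that deviate from the learned baseline by more than k standard deviations. This is an illustrative baseline only, not one of the published techniques surveyed above, and the threshold and training-window values are arbitrary.

```java
// Minimal sketch of a resource-light, sensor-level novelty check:
// learn a running mean/variance of "normal" data and flag large deviations.
public class SensorNoveltySketch {

    private long n = 0;
    private double mean = 0.0, m2 = 0.0;   // running statistics of normal behaviour
    private final double k;                // novelty threshold in standard deviations

    SensorNoveltySketch(double k) { this.k = k; }

    /** Returns true if the sample deviates from the learned normal behaviour. */
    boolean isNovel(double sample) {
        if (n > 30) {                       // require a minimum training window
            double sd = Math.sqrt(m2 / (n - 1));
            if (sd > 0 && Math.abs(sample - mean) > k * sd) {
                return true;                // do not fold novel samples into the model
            }
        }
        // Welford's online update of mean and variance
        n++;
        double delta = sample - mean;
        mean += delta / n;
        m2 += delta * (sample - mean);
        return false;
    }

    public static void main(String[] args) {
        SensorNoveltySketch detector = new SensorNoveltySketch(4.0);
        java.util.Random rng = new java.util.Random(42);
        for (int t = 0; t < 200; t++) {
            // Simulated vibration level: normal noise, with a step change at t = 150
            double sample = 1.0 + 0.05 * rng.nextGaussian() + (t >= 150 ? 0.5 : 0.0);
            if (detector.isNovel(sample)) {
                System.out.println("t=" + t + ": novelty -> transmit sample " + sample);
            }
            // otherwise stay silent and save the cost of an RF transmission
        }
    }
}
```

Keeping only three running values (count, mean and squared-deviation sum) makes the memory footprint negligible, which is the kind of trade-off a resource-limited node has to make.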
We thus seek to define a modular architecture for sensor-based novelty detection that takes into account the specific advantages and limitations of employing wireless sensor nodes. Increasing the granularity and the distributed nature of a computationally intensive process is a paradigm followed in many disciplines and has proved to offer reliability, performance and scalability benefits. Until recently, most of the proposed novelty detection systems followed a centralized approach, where the aggregation of data and their exploitation occurred on a single central computing point. An alternative to this approach could involve a collective data model. This model can take advantage of the distributed deployment of a wireless sensor network, utilizing sensor nodes as a multitude of small interconnected repositories and/or working nodes. Such an approach can offer advanced solutions for data availability (data migration) and fault-tolerant storage (data replication). The former implies that results from novelty detection processing are instantly available locally in the sensor's memory, while the latter ensures that high priority data are replicated and reside in multiple locations, thus enabling robust data legacy and rollback options. The processing in this model could follow various distributed patterns: multi-agent systems [19] and sensor services (Sens-ation [20], Atlas [21]). Investing in sensor intelligence will result in a constantly upgrading sensing infrastructure. Rather than utilizing a static sensing infrastructure to calibrate the training of a central model, smart sensors offer the option of calibrating the training of a dynamic sensing infrastructure to process a distributed model. Decentralization of novelty detection results in decentralization of intelligence, which in turn translates into a much more capable sensing network. Population-based computing paradigms such as evolutionary computing [22] or swarm intelligence [23] are strong candidates to implement the adaptive and distributed nature of this emergent learning behaviour. Sensor-based novelty detection also upgrades the condition monitoring system's responsiveness to faults. Sensor nodes constitute the part of the system that resides closest to the source of the problem: upon detecting a fault, a sensor can directly signal an alarm and schedule further maintenance actions. In the case of centralized novelty detection, the minimum time between a fault's occurrence and its detection includes data transmission delays along the path from sensor to data collection server and the processing overhead of the data model. Thus, sensor-embedded novelty detection offers significant advantages, whether implemented at a single sensor node or via collective novelty detection.
4 DESIGN REQUIREMENTS FOR SENSOR-EMBEDDED NOVELTY DETECTION
In order to study the design issues of embedding learning methods within the sensor logic, a reference, component-based architecture is proposed. The reference architecture is an advancement of the concept of constructive modeling for novelty detection, redesigned for embedding within a sensor board [15]. In this constructive modeling setting, empirical process models are built and expanded to exploit newly acquired data. Each subtask of the novelty detection process is associated with an application or a service module according to its role and functionality. Inter-process communication and synchronization aspects are also addressed. The proposed architecture is modular by design and is based on a multi-tier sensor platform. It supports component interaction via appropriate interfaces, and the subsequent analysis points to the need for standardizing these interfaces. Possible communication patterns and synchronization paradigms need to be studied. In our modular design, we assume that the components implement their intended functionality. Naturally, alternative algorithm versions can be employed for the same functionality, so the architecture can benefit from algorithmic improvements in terms of performance, computational efficiency and memory usage. For every subtask of the learning process, a balanced decision must be made to prioritize and define the trade-off between performance and resource allocation. Sensor-level novelty detection aims at identifying unusual or unforeseen sensed behaviour based on features extracted from the monitored equipment. Any measurements which significantly deviate from the sensor-embedded process model are judged to be novel. It is important to make this clarification, as in practice a measurement collection is marked as novel not when an expert user would consider it so, but insofar as the built-in model recognizes it to be so. By the very nature of the approach, it is possible to expand the sensor-embedded model to accommodate additional data and therefore expand its 'receptive' field, a process consistent with the concept of constructive model building. In the context of a smart sensor's logic, this process can be implemented as a Novelty Detection (ND) subsystem constituting a part of its software middleware. In order to design a proposed architecture for this subsystem, we first describe its step-by-step functionality and then assign execution to software components that can be included inside the sensor's resource-limited logic. The functionality of the novelty detection system should include the following steps (Fig. 2):
1. Initial setup. This step involves the initialization of the novelty detection subsystem. Parameters and processes are initialized and reset. During this step, the ND database can be initialized to default data, which enables the initial sensor-logic initialization to utilize previously built knowledge.
2. Initial data acquisition. A first set of samples is acquired to capture the monitored machine's normal behaviour. This information is utilized to build the initial model that reflects the normal operation state. If the initialization process did not discard previously stored knowledge in the database, this step can be omitted.
3. Set-up refinement. The initialization step is executed again. This second initialization is configured by the normal-state knowledge residing in memory (step 2). The goal of this repetition is to achieve the best initial calibration of the real-time monitoring and processing stage that follows. The benefit of utilizing any pre-existing data during initialization can prove far more important than hundreds of processing cycles that try to compensate for a badly configured training process.
4. Real-time monitoring, data acquisition and processing. This step can be characterized as the ND subsystem's core phase. It includes real-time knowledge building through empirical model learning. Data acquisition should be carefully synchronized with processing and signaling operations. The processing includes the execution of the employed learning algorithm, while the signaling operations define and drive the sensor's response and reaction to the ND subsystem's findings (sensor single mode).
5. Collective ND functionality (sensor collective mode). This step is also associated with the ND subsystem's core phase. In order for the sensor's logic to participate in the execution of distributed processing, it must switch to a different operation mode. This mode is characterized by communication based on preconfigured patterns and by processing based on special components of the ND subsystem's architecture. During this stage/mode the sensors' subsystems communicate to synchronize their models and collectively train them, either as individual models or as one synthesis deriving from their fusion. Collective ND is a significant feature of the proposed architecture. Many processes need multiple-source data acquisition, involving measuring the same physical quantities from different locations, different quantities from the same location, or different quantities from different locations. The ability to perform novelty detection based on the fusion of all these quantities is termed collective novelty detection. Novelty detection based on single-sensor or single sensor-node readings is a trivial case of collective ND, when additional data sources are nullified.
6. Asynchronous independent functionality. This step is essentially executed in parallel to the previous core phases. It includes externally invoked processes that support the ND subsystem's tasks. Their goal is to allow network users to execute administrative and customization actions, which may include database management, labeling novel data or configuring the ND subsystem.
A minimal code sketch of this step-by-step flow is given below.
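As an illustration only, the following Python sketch outlines how steps 1-6 could be organized in a sensor-side ND subsystem. All class and method names are hypothetical; the actual components, interfaces and operating-system services are those described in the remainder of this section.

class NDSubsystem:
    """Skeleton of the sensor-embedded Novelty Detection (ND) subsystem.

    The detector, database and radio objects are placeholders for the
    middleware components, DB tables and report services discussed below.
    """

    def __init__(self, detector, database, radio):
        self.detector = detector    # e.g. a RunningNoveltyDetector instance
        self.db = database          # sensor-level storage (see the DB schema sketch)
        self.radio = radio          # RF link, used sparingly to save energy

    def initial_setup(self):                       # step 1
        self.db.load_defaults()

    def initial_acquisition(self, samples):        # step 2 (skipped if knowledge exists)
        if not self.db.has_model():
            for x in samples:
                self.detector.step(x)

    def refine_setup(self):                        # step 3
        self.detector.z_threshold = self.db.calibrated_threshold()

    def monitor(self, sample):                     # step 4 (sensor single mode)
        if self.detector.step(sample):
            self.db.store_novel(sample)
            self.radio.report_alarm(sample)        # signaling operation

    def collective_round(self, neighbour_scores):  # step 5 (sensor collective mode)
        self.radio.exchange_model(self.detector)
        return sum(neighbour_scores) / max(len(neighbour_scores), 1)

    def handle_admin_request(self, request):       # step 6 (asynchronous services)
        self.db.apply(request)                     # e.g. label novel data, change settings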
Figure 2. ND Subsystem modes and steps
The overall functionality of the process described above is that of progressive model building [15]. The ND subsystem design should be versatile and customizable enough to serve the knowledge-building needs of various monitoring applications. The main advantage of this approach is the full utilization of the energy-cheap processing and storage abilities provided by recent smart sensor modules. Thus, investing in the development of advanced sensor logic better exploits the sensor network's infrastructure. Such an architecture has the additional advantage of bringing sensor software one step closer to standardization. In Figure 3 we illustrate the proposed ND subsystem's software architecture. Deriving from the previously mentioned generic functionality steps and stages, this conceptual architecture is based on design decisions that involve component role designation, component implementation characteristics, and component interaction and interfacing. To assign roles and properly place each component in our architecture, we divided each step and stage into a number of subtasks that constitute the components' core functionality. A component's implementation can be classified as follows.
A Service Component – Service components follow publish-and-subscribe paradigms to register their functionality and their point of contact. Administrative and supervision tasks that involve network users' interaction with sensor-level information are implemented as services. Tasks that include data exchange between sensor nodes (distributed processes) or between a sensor node and the data gateway (reporting tasks) constitute sensor-interconnecting behaviour and should also be implemented as services. Open and widely adopted standards should be followed to assure scalability and interoperability.
Figure 3. Sensor-based ND Subsystem.
A Middleware Component – The middleware layer is mainly populated by processes that need to utilize stored data and interface with the underlying operating system. Rather than handling outgoing communication links, these components focus on core functionality and aim for performance and efficiency. Functions that handle and process significant amounts of data (considering the sensor-level memory) naturally reside in the middleware tier, implemented as components that can directly connect to organized memory structures without the need for complicated interfaces. Middleware components should be implemented with board-specific SDKs or toolkits in order to maximize performance and make optimal use of resources.
The ND System's Database
A learning-based novelty detection system should be supported by a model that undergoes successive training cycles. This model has to reside in some kind of organized memory space, supported by the proper access mechanisms of the underlying sensor operating system. Various projects currently offer advanced data handling techniques as sensor-level middleware components (TinyDB [24], Cougar [25], DsWare [26]); SQL-like languages support these tiny-scale Database Management Systems (DBMS) and allow efficient filtering of sensor data. In our architecture, the database component plays a critical role in every step of the desired functionality. Implementing an ND subsystem that trains a model residing in a DBMS middleware component creates a strong dependency between the subsystem's and the component's performance. The database (DB) schema must include proper data records to assist the model training. It should also include detailed settings to allow extensive subsystem customization. In order to efficiently configure the ND subsystem and model sampled data, the DB includes:
• The "Configuration Data" table, where various operation profiles for the ND system can be stored as sets of configuration parameters and settings.
• A "Definition Data" table that lists and describes the application-specific parameters and factors that can be monitored and used for the monitoring task at hand. A subset of these parameters will be selected to define the record structure of the rest of the DB tables.
• "Test Data", a set of predefined data used to test the model's ability to effectively identify a number of important condition states.
• "Novel Data", a table where the system stores sensed data whose processing could not associate them with a previously known state.
• The "Approved Data" table, which includes previously identified novel data that have been cross-examined and revised by an expert and, in turn, have been labeled as indicators of a certain state (through administrative actions). Data populating this table compose the model that drives the training process.
• Finally, "History Data" acts as a legacy data repository for backup purposes and rollback self-correcting operations. Its size and type should be carefully configured in order not to waste valuable memory space.
Advanced DB management actions may include setting up "Definition Data" for the monitoring application; inserting, deleting and revising "Approved Data"; and examining, re-classifying and labeling past data, as well as appending them to the pool of "Approved Data". A sketch of such a schema is given below.
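As an illustrative assumption only (actual sensor-level DBMSs such as TinyDB, Cougar or DsWare expose their own SQL-like dialects, and the exact record structure is application specific), the following sketch lays out the six tables using plain SQLite DDL for readability; all column names are hypothetical.

import sqlite3

# In-memory stand-in for the sensor-level DBMS used only to illustrate the schema.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE configuration_data (profile TEXT, parameter TEXT, value TEXT);
CREATE TABLE definition_data   (parameter TEXT, description TEXT, selected INTEGER);
CREATE TABLE test_data         (state TEXT, sample BLOB);
CREATE TABLE novel_data        (timestamp INTEGER, sample BLOB);
CREATE TABLE approved_data     (state TEXT, sample BLOB, labeled_by TEXT);
CREATE TABLE history_data      (timestamp INTEGER, sample BLOB);
""")

# Example administrative action (step 6): an expert labels a novel record as a known state.
con.execute(
    "INSERT INTO approved_data (state, sample, labeled_by) VALUES (?, ?, ?)",
    ("hypothetical state label", b"\x00\x01", "expert"),
)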
The ND System's Middleware Components
A design decision is made to implement the monitoring process as a middleware component. The essential role of the learning process is to compare new incoming sampled data against the data model's profiled clusters and evaluate their association with any previously profiled condition state. Depending on the employed learning algorithm, data representing thresholds, central points and other cluster information are stored inside the DB. When a new sample cannot be associated with a previously recorded or known state, an interrupt flags novelty and instantiates a reporting service to alert the network administrator and/or provide additional information. This component essentially executes the training procedure, and thus requires fast and synchronized DB access for the periodic processing and updating of the data model. In order to achieve such balanced synchronization, the operating system's data handling mechanisms can be utilized to execute advanced memory actions (DMA). The monitoring process calibrates the data processing cycles according to the sampling frequency and the DB access delays. Any charted time overhead is available in the DB and assists in refinements aiming to achieve an optimal training rate.
Figure 4. ND Subsystem's Flowchart.
The ND System's Middleware Services
The Service layer of the proposed architecture includes four groups of services, each delivering a different type of sensor node behaviour and connectivity:
Initialization Services – A set of services for booting the ND subsystem by initializing settings values and table records in the DB (definition data, test data). If the initialization service detects the absence of a stored knowledge model, it invokes the monitoring process and executes one cycle of data acquisition and processing (Fig. 4). The resulting instance of the data model is processed by a refinement service that calculates enhanced values for a second subsystem initialization.
Report Services – This group of services allows the ND subsystem to alert the network user to the identification of novel data and to prompt for their classification through data labeling. Simple trending functions are also available as services, providing mini trending reports on parameter escalation and history of changes. Such reports, along with streams of the corresponding data history, accompany alert messages to support the technician's diagnosis for qualitative classification (Fig. 4). The examination of this feedback may lead to the identification of a 'hidden' pattern of abnormal behaviour that causes a slow-paced degradation.
Setup Services – The ND subsystem can be reconfigured online through a set of setup services. These services enable network users to access the data stored in the DB and modify critical settings that define the learning process. Database management services allow the user to revise and tailor the approved or history data tables. The user can translate, label and classify data previously identified as novel. If this classification is not possible due to the user's uncertainty, further processing can be initiated (comparison against an updated data model) to refine the results and support a new decision.
Swarm AI Service – The presence of these services can effectively turn a single ND subsystem into a smart agent operating in a network of collective intelligence. The learning process is encapsulated in a distributed service envelope, allowing connection and coupling with other similar services.
As displayed in Figure 4, these services implement DB synchronization techniques (replication, migration) and execute algorithms for collective decisions (collective novelty detection). Streaming variables' history, the data model's characteristics (clusters' central points, thresholds and dimensions), or even profiled states allows data fusion functions and provides advanced reports on trends and data classification. Any type of collective decision algorithm can be embedded in this service, with a prime choice being swarm intelligence [23].
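As an illustrative sketch only (the architecture does not prescribe a particular fusion rule), the snippet below shows one simple way a node could combine its own novelty score with scores received from neighbouring nodes to reach a collective decision; the function names, weighting scheme and threshold are hypothetical.

import math

def local_novelty_score(sample, clusters):
    """Distance of a sample to the closest profiled cluster, scaled by its threshold.

    clusters is a list of (centroid, threshold) pairs kept in the ND database;
    a score above 1.0 means the sample lies outside every profiled state.
    """
    return min(math.dist(sample, centroid) / threshold
               for centroid, threshold in clusters)

def collective_decision(own_score, neighbour_scores, weight_own=0.5):
    """Fuse the local score with neighbour scores; True means collectively novel."""
    if not neighbour_scores:
        return own_score > 1.0
    fused = weight_own * own_score + (1 - weight_own) * (
        sum(neighbour_scores) / len(neighbour_scores))
    return fused > 1.0

In a swarm-intelligence setting the weights themselves could be adapted collectively rather than fixed as in this sketch.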
5 DISCUSSION AND CONCLUSION
Condition Monitoring can greatly benefit from the introduction of distributed wireless sensing solutions. Wireless condition monitoring solutions have the potential to become a powerful toolset that can be exploited to enhance the lifecycle management of engineering assets. Key advantages include ease of installation, flexibility, portability and data accessibility. A significant functionality of any integrated condition monitoring solution is to detect when the machinery operation deviates from known patterns of normal operation. This functionality is usually termed novelty detection. This paper introduces key
features of a modular design pattern capable of serving the development of smart sensor middleware. Rather than designing the structure of the complete sensor middleware layer, we identify the need for proper module containers inside a sensor's logic. Containers host services that respond to network requests, or application components that respond to internal calls and invocations. Consensus on a unified and standardized module envelope is required for both hosting environments, in order to support the efficient co-existence of multiple third-party services and components in a single sensor node. Such a software paradigm can have a significant impact on various aspects:
1. It supports an emerging new level of software development: sensor-device development. A design pattern disconnected from datasheets will ignite projects and developer communities that will produce the required frameworks and libraries for interoperable implementations. Sensor device software is a promising market that needs specifications and tools to drive its solutions. Sensor services and sensor applications are very close to becoming product units. A sensor-logic component that employs a learning technique for novelty detection is no longer a simple function performing an elementary sensor task; it constitutes an advanced application executed in a capable runtime environment.
2. Condition monitoring systems serving industrial plants will significantly upgrade their performance and cost savings by utilizing sensor-level novelty detection:
a. There is no need for a multi-core collection server when there is a multi-agent collective network. Sensor-level resource allocation and task allocation services will ensure data availability, network robustness and efficient processing. One step further from sensor-embedded novelty is sensor collective novelty, scaling down and porting the benefits of a grid architecture to a wireless sensor network. The practice of grid or clustered processing offers solid proof of the significant reliability and performance upgrades obtained when moving to a distributed model. A collective of smart sensors can more than adequately fulfill the processing needs of most condition monitoring processes.
b. The need for pre-scheduled visual inspections, maintenance actions and data readings is reduced when a group of smart sensors is able to identify critical events from fused novel data and automatically produce reports that can drive the scheduling of maintenance tasks. On-demand visual inspections mean fewer personnel on the shop floor, which in turn means less risk in the case of personnel working in harsh environments. Sensor-embedded novelty can detect faults in the sensing infrastructure while at the same time monitoring the state of the application environment. The same detection method can be utilized to model, monitor and self-configure the sensor network itself, serving as the basis for multiple network-targeted functions: (a) scheduling sleep, (b) tuning sampling, and (c) detecting environmental resources (e.g. light) for energy harvesting. Self-calibration and energy savings translate directly into less time spent by personnel to check, configure and maintain the engineering assets infrastructure.
c. Finally, there is no need for top-to-bottom condition monitoring with overloaded software suites and black-box sensors when programmable sensors with open software can be purchased and subsequently populated with services and applications able to drive and compose the desired monitoring process. Instead of full-scale suites, offerings of sensor services and sensor components will emerge, providing a much greater variety, customization level and control over the features and the potential of the final condition monitoring system. Thus engineering assets can be monitored by carefully tailored sensors equipped with the proper software modules to power their detection capabilities.
This paper has looked into how novelty detection and condition monitoring functionalities can be embedded in wireless sensor nodes and networks. A reference architecture for sensor-embedded novelty detection has been defined. This architecture supports basic and advanced novelty detection functionalities in a modular approach, defining individual components and services that can be embedded at the sensor board level. Promising though it may be, the concept of sensor-embedded novelty has significant limitations and constraints. Implementing model training tasks using the resources of a sensor board is challenging and bounded by the processing and memory capacity of small devices. Novelty detection is a process that often occupies clustered and grid infrastructures; a smart sensor's logic can only support the execution of simpler and less sophisticated approaches, and thus the less efficient ones. The sensor's small memory size poses serious constraints on the complexity and the magnitude of the dynamic data model. Low CPU resources will bound the amount of sampled data that can be processed on-line, thus keeping the training rate low and limiting the quality of the produced model. From a collective perspective, network-level intelligence can provide a more parallel-capable processing capacity with enough distributed memory to execute recent and more advanced novelty detection techniques. The performance of these implementations, ported to sensor middleware, will suffer from: (a) the overheads of non-standardized wireless and inter-process communication links, (b) the low-speed CPU-memory bus of the clustered sensor nodes, and (c) fixed-to-OS implementations, bounded by their specific features for distributed functionality and lacking cross-OS portability. Currently, an increasing number of sensor middleware projects are utilizing a multi-agent paradigm for their implementations. These implementations share nothing in terms of underlying OS, middleware interfaces, data formatting, process definition or agent architecture, making their fusion and collaboration to form scalable sensor collectives almost impossible. The next steps in our research involve the detailed definition and development of individual modules and services. Our aim is to abstract the wireless sensor hardware and operating system as far as possible, so as to make the developed modules platform-agnostic. Work is currently underway to port the initial version of our architecture to the PrismaSense sensor network development platform [27].
6 REFERENCES
1 Feng, J., F. Koushanfar, and M. Potkonjak, (2002) System-architectures for sensor networks issues, alternatives, and directions. in Proceedings of IEEE International Conference on Computer Design: VLSI in Computers and Processors, Freiburg.
2 Song, E.Y. and K. Lee, (2008) Understanding IEEE 1451—Networked Smart Transducer Interface Standard. IEEE Instrumentation & Measurement Magazine, 11(2), 11-17.
3 Boltryk, P., C. Harris, and N. White, (2005) Intelligent sensors—a generic software approach. Journal of Physics: Conference Series, 15, 155-160.
4 Vadde, S., S. Kamarthi, and S. Gupta, (2003) Modeling smart sensor integrated manufacturing systems. in Proceedings of the SPIE International Conference on Intelligent Manufacturing. Providence, Rhode Island: SPIE.
5 Han, C., et al., (2005) A dynamic operating system for sensor nodes. in Proceedings of the 3rd international conference on Mobile systems, applications, and services. Seattle, Washington.
6 Bhatti, S., et al., (2005) MANTIS OS: An embedded multithreaded operating system for wireless micro sensor platforms. Mobile Networks and Applications, 10(4), 563-579.
7 Han, C., et al., (2005) SOS: A dynamic operating system for sensor networks. in Third International Conference on Mobile Systems, Applications, And Services (Mobisys).
8 Akyildiz, I., et al., (2002) Wireless sensor networks: a survey. Computer Networks, 38(4), 393-422.
9 Tilak, S., N. Abu-Ghazaleh, and W. Heinzelman, (2002) A taxonomy of wireless micro-sensor network models. ACM SIGMOBILE Mobile Computing and Communications Review, 6(2), 28-36.
10 Sugihara, R. and R. Gupta, (2008) Programming models for sensor networks: A survey. ACM Trans. Sen. Netw., 4(2).
11 Heinzelman, W., et al., (2004) Middleware to support sensor network applications. IEEE Network, 18(1), 6-14.
12 Kahn, J., R. Katz, and K. Pister, (1999) Next century challenges: mobile networking for 'Smart Dust'. in Proceedings of the 5th annual ACM/IEEE Int. Conf. on Mobile computing and networking. Seattle, Washington, USA, ACM.
13 Hu, P., R. Robinson, and J. Indulska, (2007) Sensor standards: Overview and experiences. in Proceedings of the International Conference on Intelligent Sensors, Sensor Networks and Information, ISSNIP 2007. Melbourne, Qld.
14 Emmanouilidis, C., C. Cox, and J. MacIntyre, (1998) Neurofuzzy Computing Aided Machine Fault Diagnosis. in Proceedings of JCIS'98, The Fourth Joint Conference on Information Sciences. Research Triangle Park, NC, USA.
15 Emmanouilidis, C., E. Jantunen, and J. MacIntyre, (2006) Flexible software for condition monitoring, incorporating novelty detection and diagnostics. Computers in Industry, 57(6), 516-527.
16 Markou, M. and S. Singh, (2003) Novelty detection: a review—part 1: statistical approaches. Signal Processing, 83(12), 2481-2497.
17 Markou, M. and S. Singh, (2003) Novelty detection: a review—part 2: neural network based approaches. Signal Processing, 83(12), 2499-2521.
18 Polycarpou, M. and A. Trunov, (2000) Learning approach to nonlinear fault diagnosis: detectability analysis. IEEE Transactions on Automatic Control, 45(4), 806-812.
19 Lesser, V., C. Ortiz, and M. Tambe, (2003) Distributed sensor networks: A multiagent perspective. 1st ed. Vol. 9. Springer.
20 Gross, T., T. Egla, and N. Marquardt, (2006) Sens-ation: a service-oriented platform for developing sensor-based infrastructures. International Journal of Internet Protocol Technology, 1(3), 159-167.
21 King, J., et al., (2006) Atlas: a service-oriented sensor platform. Proceedings of SenseApp.
22 Emmanouilidis, C., (2002) Evolutionary Multiobjective Feature Selection and ROC Analysis with Application to Industrial Machinery Fault Diagnosis. Evolutionary Methods for Design Optimisation and Control.
23 Engelbrecht, A.P., (2006) Fundamentals of Computational Swarm Intelligence. John Wiley & Sons.
24 Madden, S., et al., (2005) TinyDB: an acquisitional query processing system for sensor networks. ACM Transactions on Database Systems (TODS), 30(1), 122-173.
25 Yao, Y. and J. Gehrke, (2002) The cougar approach to in-network query processing in sensor networks. ACM SIGMOD Record, 31(3), 9-18.
26 Li, S., et al., (2004) Event detection services using data service middleware in distributed sensor networks. Telecommunication Systems, 26(2), 351-368.
27 Emmanouilidis, C., S. Katsikas, and C. Giordamlis, (2008) Wireless Condition Monitoring and Maintenance Management: A Review and a Novel Application Development Platform. in Proceedings of the 3rd WCEAM-IMS 2008 Congress. Springer: Beijing, China, 2030-2041.
Acknowledgements
This research has been partially supported by the research contract 'U-Sense' between Prisma Electronics SA and the ATHENA RC. The authors acknowledge the technical support received from Prisma Electronics, and in particular from Mr. Serafim Katsikas, in relation to the PrismaSense development kit for wireless sensor networks.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ANALYSING IT FUNCTIONALITY GAPS FOR MAINTENANCE MANAGEMENT
Kans, Mirka a and Ingwald, Anders b
a School of Technology and Design, Växjö University, Luckligs plats 1, S-351 95 Växjö, Sweden.
b School of Technology and Design, Växjö University, Luckligs plats 1, S-351 95 Växjö, Sweden.
Several studies have been carried out to describe the functionality and use of computerised maintenance management systems. A major drawback of these studies is that they do not reveal the actual support for maintenance management. To describe the full situation, the gaps between required support and actual support have to be determined. The gaps are of two kinds: 1) between the functionality included in the IT system and the functionality required, and 2) between the functionality included in the IT system and the functionality actually used. To reach a better understanding of the utilisation of IT in maintenance management, the existence of these gaps must be further explored. In this paper, we study the existence of functionality gaps in maintenance management IT applications using data from a web-based questionnaire survey conducted in Swedish industry. Results show that the IT systems in general provide good support for maintenance management, and thus a low degree of functionality gaps. However, the most commonly unused function was failure cause and consequence analysis, and the most commonly unused information was maintenance improvement suggestions. When comparing the results with respect to the type of IT system used, some significant differences were revealed, indicating that ERP systems and production systems do not include all required information for maintenance management. The next step will be to further study the reasons behind the differences in gaps by conducting additional interviews.
Key Words: Maintenance Management, Information Technology, Software Functionality Gap, Survey
1 INTRODUCTION
The importance of enterprise information technology (IT) systems has increased in order to meet the demands set by users. It has, for instance, been shown that investments in IT have a positive correlation with company profitability and competitiveness [1]. This is also the case for maintenance management IT (MMIT), i.e. applications used for maintenance management purposes such as computerised maintenance management systems (CMMS) and maintenance management or asset management modules in enterprise resource planning (ERP) systems. The literature reports positive correlations between high use of IT in maintenance and high maintenance performance [2-3]. Success stories of MMIT implementation are found in, for instance, [4-7]. An important aspect to consider for being able to take full advantage of IT is the matching of IT functionality to IT requirements. Appropriate mapping of IT requirements for achieving the business goals that are set would avoid both over-fitting and under-fitting of IT resources compared to actual needs. In maintenance, several studies have been carried out to describe the use and functionality of computerised systems; see e.g. [7-10]. A major drawback of these studies is their inability to reveal the actual support for maintenance management. To describe the full situation, the gaps between required support and actual support have to be determined. In this paper, we study the existence of functionality gaps in maintenance management IT. The disposition of the paper is the following: in section 2, theory concerning IT functionality gaps is presented and four research questions concerning functionality gaps are formulated. A questionnaire survey of MMIT use is presented in section 3; this survey serves as the basis for analysing the existence of functionality gaps. Results and implications are presented in section 4.
2 IT FUNCTIONALITY GAPS
A common way to describe the dependencies between an enterprise or business and its IT systems is in the form of the socio-technical system. Three subsystems comprise the socio-technical system: the human activity system (HAS), the information system (IS) and the information technology system (ITS) [11]. The HAS consists of people performing a collection of activities and is synonymous with the term business process. The IS supports the HAS with information processing, as it is a system of communication between people. Communication is enabled by technology, represented as the ITS.
Figure 1. Misalignments recognised between different systems (ITS: functionality; IS: usability; HAS: utility; system quality and information quality influence use, user satisfaction and business impact; misalignment between system and use, and between system and requirements).
For an IS to be successful, it has to be aligned with the HAS. In this context, the properties of functionality, usability and utility can be considered. Beynon-Davies [11] connects the three properties with the socio-technical system and applies this as a framework to understand the Information Systems Success Model by DeLone and McLean [12]; see Figure 1. This model comprises two issues of a technical nature: system quality and information quality. These influence the use of and user satisfaction with the information system, and finally impact the individual and the organisation as a whole. The two latter issues are collapsed into one in Figure 1: the business impact.
The functionality is connected to the quality of the ITS and the information it handles, while usability is connected to the use of and user satisfaction with the ITS. The utility describes the impact of the ITS on the HAS in the form of business impact. We can recognise misalignments as gaps between the three systems. The gaps can be of two kinds: 1) between the functionality included in the IT system and the functionality required, and 2) between the functionality included in the IT system and the functionality actually used. How the gaps are interconnected is illustrated in Figure 2. The true utilisation of the IT system is found in the intersection of the factors to consider, marked grey in the figure. Obviously, we strive for a situation where the three circles overlap as much as possible. Knowing the type and distance of the gaps, and how the gaps vary with, for instance, type of application, is important in order to understand the true utilisation of IT within a business. In this paper different types of maintenance management IT systems are investigated.
Figure 2. Gaps between required IT support and actual IT support
For maintenance management IT, two questions regarding the system quality are formulated:
Research question 1a: Does a gap between IT system functionality coverage and IT system functionality demands exist? (RQ1a).
Research question 1b: Does a gap between IT system functionality coverage and IT system functionality use exist? (RQ1b).
In addition, two questions regarding the information quality are formulated:
Research question 2a: Does a gap between information coverage and information demands exist? (RQ2a).
Research question 2b: Does a gap between information coverage and information use exist? (RQ2b).
3 DATA GATHERING AND ANALYSIS
In this section, data from a web-based questionnaire survey are presented and analysed. The survey was conducted during fall 2008 and spring 2009. The questionnaire was designed in a web-based questionnaire software tool in 2008. It was tested internally within the department as well as on 5 industrial parties (a total of 15 test questionnaires was sent, of which 5 were answered). After minor changes in the design, the questionnaire was sent to the study participants from September 2008 to February 2009. A telephone call to the respondent preceded the mailing, to ensure that the correct person within the company received the questionnaire. The questionnaire consisted of 29 questions in total: company descriptives (Questions 1-6), maintenance-related descriptives (Questions 7-18), IT use related questions (Questions 19-23) and IT procurement related questions (Questions 24-29). Two additional questions were included regarding the wish to be contacted for further interviews.
3.1 Survey participants
For this cross-sectional survey, Swedish production plants were selected using information from the Swedish Centre for Maintenance Management containing contact information regarding production plants in Sweden. The population for our study consists of plants where the maintenance is performed in-house and where contact information regarding maintenance managers was available. The size of the population was 381. Of these, 175 respondents could not be reached by telephone and were removed from the study, and an additional 8 respondents stated that they could not participate for various reasons. This resulted in 198 respondents. The total number of respondents was 71, representing a response rate of 34%.
Figure 3. Respondents
The respondents represented the following industries: Chemical industry, Pulp and Paper, Wood and Timber, Steel and Metalwork, Automotive industry, Food industry, Energy and Other; see Figure 3.
3.2 Functionality gaps in MMIT
From the questionnaire, two questions are of special interest for this paper: Q23 and Q24. The first describes different activities within maintenance management, and the respondents were asked to state whether these were performed manually, with IT support, or not at all. In addition, they were asked if the functionality was present in the current IT system but not utilised, and if the functionality was not present in the IT system but needed. Table 1 presents the results from the two latter
parts of the question. Q24 is similar to the first, but focuses on the information needed for maintenance management. Table 2 presents the results of the parts of the question asking whether the information was present in the current IT system used for maintenance management but not used, and whether the information was not present in the IT system but needed.
Table 1 Functionality gaps in percentages

Functionality | Support is not present in the current IT system but is needed (%) | Support is present in the current IT system but not used (%)
Work order handling | 0,0 | 12,7
Preventive maintenance planning | 1,4 | 11,3
Spare parts planning | 2,8 | 8,5
Personnel planning | 4,2 | 15,5
Budgeting | 5,6 | 5,6
Failure reporting | 1,4 | 11,3
Unplanned maintenance execution | 0,0 | 11,3
Planned maintenance execution | 0,0 | 8,5
Measurement and controlling | 2,8 | 11,3
Plant register handling | 0,0 | 5,6
Spare parts management | 0,0 | 7,0
Material purchasing | 0,0 | 5,6
Cost control | 1,4 | 14,1
Analysis of conducted maintenance | 2,8 | 12,7
Failure cause and conseq. analysis | 7,0 | 22,5
Analysis of CM data | 4,2 | 15,5
Improvement analysis of maintenance | 7,0 | 16,9
Technical imp. analysis of production | 5,6 | 8,5
Economic imp. analysis of production | 4,2 | 9,9
The leftmost column in Table 1 lists various functionalities of MMIT. The second column accounts for the results concerning RQ1a (Does a gap between IT system functionality coverage and IT system functionality demands exist?), and the third column gives the results for RQ1b (Does a gap between IT system functionality coverage and IT system functionality use exist?). From Table 1 we can see that the required functionality is in general present in the IT systems used. Only in the area of analysis is some lack of functionality to be noted (Failure cause and consequence analysis and Improvement analysis of maintenance), but it is still quite modest. The gap between the functionality provided by the IT systems and the functionality actually used is larger: many respondents recognised that their IT systems contain more functionality than they actually use. The functionalities that were most commonly unused, in descending order, are:
• Failure cause and consequence analysis (22,5%),
• Improvement analysis of maintenance (16,9%),
• Analysis of condition monitoring data and Personnel planning (15,5%),
• Cost control (14,1%), and
• Work order handling and Analysis of conducted maintenance (12,7%).
Table 2 Information gaps in percentages

Information | Information is not present in the current IT system but is needed (%) | Information is present in the current IT system but not used (%)
Plant register | 1,4 | 4,2
Spare parts list | 0,0 | 2,8
Tools list | 5,6 | 8,5
Personnel | 5,6 | 8,5
Maintenance policy and objectives | 2,8 | 4,2
Maintenance instructions and procedures | 4,2 | 5,6
Maintenance follow up procedures | 7,0 | 8,5
OEM recommendations | 5,6 | 9,9
Technical specifications and drawings | 2,8 | 4,2
Maintenance budget | 5,6 | 0,0
Production plan | 2,8 | 0,0
Planned work orders | 0,0 | 1,4
Work order schedule | 1,4 | 4,2
Spare parts in inventory | 0,0 | 2,8
Work order history | 0,0 | 2,8
Maintenance costs | 1,4 | 7,0
Failure history | 2,8 | 9,9
Measurement history | 4,2 | 7,0
Failure cause and conseq. analysis | 5,6 | 9,9
Maintenance improvement suggestions | 9,9 | 12,7
Table 2 contains a summary similar to that in Table 1, but regarding information contents. The leftmost column in Table 2 lists information held by MMIT. The second column accounts for the results concerning RQ2a (Does a gap between information coverage and information demands exist?), and the third column gives the results for RQ2b (Does a gap between information coverage and information use exist?). In general, the information coverage of the IT systems is high. Missing information is mainly found in the area of follow-up and improvement (Maintenance improvement suggestions and Maintenance follow up procedures). Nor do the IT systems contain unnecessary information to any high extent. The information that most respondents found unused was:
• Maintenance improvement suggestions (12,7%),
• OEM recommendations, Failure history and Failure cause and consequence analysis information (9,9%), and
• Tools list, Personnel and Maintenance follow up procedures (8,5%).
3.3 Functionality gaps depending on type of IT system
In the next step, the results from Q23 and Q24 were analysed with respect to which type of IT system the company uses. For this purpose Q20 was utilised. This question asked about the type of IT system the respondent uses for maintenance management (the respondent could pick one or more types). The IT system types included were ERP system, production system, CMMS, specially designed system and others; in the further analysis the type "others" is not included. A cross-tabular analysis was made to determine whether there were significant differences in functionality and information gaps depending on which IT system the respondent uses, compared to the other IT system types. Differences were regarded as significant at the p < 0,05 level. Table 3 lists all significant differences found; a minimal example of such a test is sketched below.
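For illustration, the snippet below shows how one such cross-tabulation test could be run in Python with SciPy. The counts are hypothetical placeholders rather than the survey data, and whether the original analysis applied a continuity correction is an assumption.

from scipy.stats import chi2_contingency

# Hypothetical 2x2 cross-tabulation: rows = ERP system not used / used,
# columns = "work order handling present but unused" answered no / yes.
observed = [
    [27, 12],   # ERP not used (n=39)
    [29, 3],    # ERP used (n=32)
]

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(f"chi2={chi2:.3f}, p={p:.3f}, dof={dof}")
if p < 0.05:
    print("Gap depends significantly on IT system type")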
Table 3 Significant differences depending on IT system type

Support is present in the current IT system but not used (group sizes: ERP system not used n=39 / used n=32; production system not used n=50 / used n=21; CMMS not used n=12 / used n=59):
Work order handling: + for respondents not using an ERP system (χ²=4,801, p=0,028)
Personnel planning: + for respondents using a production system (χ²=7,249, p=0,007)
Failure cause and consequence analysis: + for respondents using a CMMS (χ²=4,201, p=0,040)

Information is not present in the current IT system but is needed (same group sizes):
Maintenance improvement suggestions: + for respondents using an ERP system (χ²=5,182, p=0,023)
OEM recommendations: + for respondents using a production system (χ²=4,199, p=0,040) *

A + sign indicates that significantly more respondents than expected selected the alternative; a − sign indicates that significantly fewer respondents than expected did so. For example, respondents not using ERP systems answered to a significantly higher degree, compared to respondents using ERP systems, that support for work order handling is present in the current IT system but not used.
*) During the cross-tab analysis, 2 cells had an expected count of less than 5.
From Table 3 we find that, among those not using ERP systems, functionality for work order handling was present but not used to a significantly higher degree than among those using ERP systems. Furthermore, those using ERP systems state that there is a lack of information regarding maintenance improvement suggestions in their system. Those using production systems have stated that there is support for personnel planning in their systems but that it is not used (to a higher degree compared to those not using production systems). Additionally, those using production systems state that there is a lack of information regarding OEM recommendations to a higher degree than those not using production systems. Those using CMMS state that support for failure cause and consequence analysis is present but not used, to a higher degree than those not using CMMS.
4 RESULTS AND CONCLUSIONS
In section two, four questions regarding functionality gaps in maintenance management IT were formulated:
RQ1a: Does a gap between IT system functionality coverage and IT system functionality demands exist?
RQ1b: Does a gap between IT system functionality coverage and IT system functionality use exist?
RQ2a: Does a gap between information coverage and information demands exist?
RQ2b: Does a gap between information coverage and information use exist?
The analysis showed that all types of functionality gaps do exist. In general, the functionality gaps connected to RQ1a and RQ2a were modest, while the gaps connected to RQ1b and RQ2b existed to a slightly higher degree. The most commonly missing functions were failure cause and consequence analysis and maintenance improvement analysis, and the most commonly missing information was maintenance improvement suggestions. Notably, the most commonly unused function was failure cause and consequence analysis and the most commonly unused information was maintenance improvement suggestions; thus the same items appear as both missing and unused. The functionality gaps for all MMIT studied are summarised in Figure 4.
Figure 4. Functionality gaps found
Functionality – Missing: Failure cause and consequence analysis; Improvement analysis of maintenance.
Functionality – Unused: Failure cause and consequence analysis; Improvement analysis of maintenance; Analysis of condition monitoring data; Personnel planning; Cost control; Work order handling.
Information – Missing: Maintenance improvement suggestions; Maintenance follow up procedures.
Information – Unused: Maintenance improvement suggestions; OEM recommendations; Failure history; Failure cause and consequence analysis information; Tools list; Personnel; Maintenance follow up procedures.
The gaps were furthermore analysed with respect to the type of IT system used. Figure 5 summarises the findings with respect to IT system type. The results indicate that ERP systems and production systems do not include all required information for maintenance management and that all IT systems except the ERP systems contain some unused functionality. The lack of information on maintenance improvement suggestions included in Figure 4 might be explained by the ERP users: they perceived, to a higher degree than those using other types of IT, that they lacked information regarding maintenance improvement suggestions. Similarly, the functionalities personnel planning, work order handling and failure cause and consequence analysis, which were listed as unused in Figure 4, could be explained by IT system type. Not all general gaps in Figure 4 could be explained by IT system type, though. Moreover, one piece of information was perceived as missing by those using a production system, compared to the MMIT in general.
Figure 5. Functionality gaps for specific IT system
ERP – Missing information: Maintenance improvement suggestions.
Production – Unused functionality: Personnel planning; Work order handling. Missing information: OEM recommendations.
CMMS – Unused functionality: Failure cause and consequence analysis; Work order handling.
Special – Unused functionality: Work order handling.
The analysis presented in this paper has revealed some interesting results. The research questions have to some extent been answered, but the problem is complex in nature, and therefore additional research is required. One possibility is that the gaps depend on factors other than IT system type. Therefore, further analysis of other factors, such as industry type, production type or maintenance organisation, will be made. The statistical analysis will also be combined with qualitative methods in the form of interviews. The results of this paper will therefore be utilised when designing the interview template.
5 REFERENCES
1 Dedrick, J., Gurbaxani, V. and Kraemer, K. L. (2003) Information Technology and Economic Performance: A Critical Review of the Empirical Evidence. ACM Computing Surveys, 35(1), 1-29.
2 Jonsson, P. (2000) Towards an holistic understanding of disruptions in Operations Management. Journal of Operations Management, 18, 701-718.
3 Pintelon, L., Pinjala, S. K. and Vereecke, A. (2006) Evaluating the effectiveness of maintenance strategies. Journal of Quality in Maintenance Engineering, 12(1), 7-20.
4 Liptrot, D. and Palarchio, E. (2000) Utilizing advanced maintenance practices and information technology to achieve maximum equipment reliability. International Journal of Quality & Reliability Management, 17(8), 919-928.
5 Mjema, E. A. M. and Mweta, A. M. (2003) An analysis of economics of investing in IT in the maintenance department: An empirical study in a cement factory in Tanzania. Journal of Quality in Maintenance Engineering, 9(4), 411-435.
6 O'Donoghue, C. D. and Prendergast, J. G. (2004) Implementation and benefits of introducing a computerised maintenance management system into a textile manufacturing company. Journal of Materials Processing Technology, 153-154, 226-232.
7 Labib, A. W. (2004) A decision analysis model for maintenance policy selection using a CMMS. Journal of Quality in Maintenance Engineering, 10(3), 191-202.
8 Swanson, L. (1997) Computerized maintenance management systems: a study of system design and use. Production and Inventory Management Journal, 3, 11-14.
9 Jonsson, P. (1997) The Status of Maintenance Management in Swedish Manufacturing Firms. Journal of Quality in Maintenance Engineering, 3(4), 233-258.
10 Alsyouf, I. (2004) Cost effective maintenance for competitive advantages. PhD thesis. Växjö: Växjö University Press.
11 Beynon-Davies, P. (2002) Information Systems: An introduction to Informatics in Organisations. Bath: Palgrave.
12 DeLone, W. H. and McLean, E. R. (1992) Information Systems Success: The Quest for the Dependent Variable. Information Systems Research, 3(1), 60-96.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DRIVING INNOVATION THROUGH PERFORMANCE EVALUATION
Dr. Abrar Haider a, b
a CRC for Integrated Engineering Asset Management, Brisbane, Australia
b School of Computer and Information Science, University of South Australia, Mawson Lakes Campus, SA 5095, Australia.
Over the past two decades there has been increased activity in the development of performance management systems aimed at various organisational levels, covering a multitude of dimensions. Since the focus of performance management is on enabling actionable learning aimed at business improvement, these systems should lead to innovation in management processes. In contemporary organizations, the pervasiveness of information and communication technologies underscores the importance of measuring their performance to drive a culture of continuous improvement. This is particularly relevant for asset managing engineering organisations, which are increasingly becoming information technology intensive by utilising a multitude of operational and administrative technologies to execute their business. Performance management of information technologies utilised in asset lifecycle management, therefore, should not only be aimed at reporting on the fit of existing asset management processes with these technologies, but also on how to enhance the effectiveness of asset lifecycle management strategies enabled by various technologies. It is therefore important to assess the impact of performance management systems on business improvement, so as to enable assessment and establish the credibility of the performance management system itself. However, the literature is relatively silent on this issue. The lack of empirical research on this important issue has been attributed to the relatively immature theoretical nature of the field of performance management. This paper develops a theoretical framework for performance management research to guide empirical examination of the impact of performance management systems on business and management process innovation.
Key Words: Information technology, IT governance, Asset management.
1 INTRODUCTION
Asset managing organisations utilise a variety of information and operational technologies to execute and manage asset management processes. Engineering enterprises traditionally adopt a technology-centred approach to asset management, where technical aspects command most resources and are considered first in the planning and design stage. However, most engineering enterprises mature technologically along the continuum from standalone technologies to integrated systems, and in so doing aim to achieve the maturity of processes enabled by these technologies [1]. Haider [2] further asserts that, as a result of this approach, the asset lifecycle is managed by isolated, stand-alone, and fragmented technologies; consequently, there is little integration and connectivity among various lifecycle processes and activities. It is, therefore, important that the performance of information technology (IT) investments is measured and managed by accounting for their impact on related areas such as the overall IT infrastructure, process maturity, the skill set available in the organisation, and other organisational factors such as structure and culture. IT evaluation calls for ascertaining both hard and soft benefits to the organisation, using quantitative as well as qualitative means, and their connection to organizational development. This can only be attained if IT evaluation becomes a strategic advisory mechanism that supports planning, decision making, and management processes, and facilitates organizational learning. This feedback indicates the fundamental reasons, factors, and causes of IT investments. However, evaluation of IT investments is by nature unique and different from other evaluations, due mainly to the tangible and intangible impacts of IT. IT systems are social systems, and their interpretation is influenced by the use and meaning that organisational communities associate with them within the socio-technical environment of the organisation. Evaluation, therefore, is subject to the principles, assumptions, and concepts that the evaluators employ in carrying out the evaluation
exercise. In a social setting, human interpretation is continuously evolving, and thus the interpretation of IT utilisation also reshapes with changes in the business environment and information requirements. Evaluation, thus, represents the current meanings and interests that individuals or communities associate with the use of IT within the organisation. In essence, IT evaluation spans a variety of strategic, organisational, economic, and social dimensions, and involves external as well as internal customers. It must also enable effective management action, such that the results of the evaluation are put into practice and the learnings generated are properly followed up. However, while there are countless performance evaluation systems and methodologies available to businesses of all types, research and practice are largely silent on their impact. Such an impact statement is necessary to assess the suitability of the chosen performance management methodology for the area(s) or dimension(s) of the business to which it is applied, as well as to ensure that it provides actionable learning such that the organisation takes corrective action and engages in continuous improvement. This paper investigates the impact of performance evaluation methodologies and aims to propose a research agenda for the impact of IT based performance management systems on process innovation and continuous improvement. It starts with a discussion of the role of IT in asset lifecycle management, followed by a detailed discussion of the characteristics and limitations of evaluation of IT for asset management. Having set these foundations, the paper then presents a research agenda to drive process innovation through performance evaluation of IT systems utilised for asset lifecycle management.
2 ASSET MANAGEMENT AND INFORMATION TECHNOLOGY
The principles of asset management are based on the alignment, or strategic fit, of the organisation's resources with stakeholders' needs. Asset management, therefore, represents an ongoing process to strategically align the organisation's objectives and market demands through effective utilisation of assets [2]. According to Hastings [3], asset management is derived from business objectives and represents the set of activities associated with asset need identification, acquisition, support and maintenance, and disposal or renewal, in order to meet the desired objectives effectively and efficiently. The fundamental aim of asset management is the continuous availability of the value that assets provide to stakeholders through service, production, or manufacturing provision. Core asset management comprises asset lifecycle processes such as asset design, acquisition, construction, and commissioning; operation; maintenance; refurbishment; decommissioning; and replacement. An asset lifecycle management process actually consists of three cycles, i.e. the primary asset management cycle, the learning and change cycle, and the renewal cycle (Figure 1). The learning, optimisation, and change cycle is aimed at changing an asset solution in response to factors such as asset need redefinition, technology refresh, environmental and regulatory concerns, and economic trade-offs. However, the crucial factor in this cycle is the ability of the organisation to evaluate the primary asset lifecycle and compare its outputs with the business objectives. The gap analysis provides learnings on the effectiveness of the existing asset solution in meeting the stakeholders' needs. The aim of this exercise is to highlight the optimum ways of managing assets with the existing resource base. Its objectives are, firstly, to identify enhancements in asset solution design and, secondly, if the first is not possible, to provide alternatives for asset renewal. Subsequently, the learning, optimisation, and change cycle informs and calls for redefinition of the asset strategy, whereas the renewal cycle informs and necessitates adjustment of the asset management plan.
Figure 1: Asset Management Cycles [2], showing the primary asset life cycle (AM strategy, plan, construct/acquire, operate, maintain, retire), the learning, optimisation and change cycle (monitor, review need, re-evaluate asset solution, change), and the renewal cycle.
The learning, optimisation, and change cycle reviews the internal operating environment of the asset managing organisation, as well as the community, environmental, financial, legislative, institutional, and regulatory constraints within which it operates. This review of performance evaluation provides the strategic directions that shape or reshape the asset management strategy to meet its objectives, by taking into account the strategic priorities, organisational competencies, and how these competencies could mitigate the financial and non-financial risks posed to the organisation. According to IIMM [4], asset lifecycle performance evaluation is aimed at: a. evaluation and justification of planned levels of service; b. compliance with monitoring and reporting requirements; c. compliance with the planned techniques and methodologies that enable cost-effective asset lifecycle treatment options, such as risk management, predictive modelling, and optimised decision support; d. evaluation and identification of task priorities and resource requirements; e. evaluation and justification of the roles and responsibilities of various organisational units in relation to asset management activities; f. evaluation of the information requirements of the asset lifecycle; and g. continuous improvement of the asset management plan. Asset lifecycle performance evaluation includes review and assessment of asset lifecycle management plans, processes, technologies, and support mechanisms to measure their effectiveness in satisfying business needs. It involves audit and assessment of the implementation and execution of existing measures and activities against actual documented standards, objectives, strategies, and stakeholder requirements, aimed at continuous improvement of the asset management regime. The scope of asset management spans engineering as well as business activities, where most of these activities are cross-functional and even cross-enterprise. For example, maintenance processes influence many areas, such as quality of operations, a safe workplace and environment, and plant availability. The outputs from maintenance are further used to predict remnant asset life, to inform asset redesign/rehabilitation, and to plan the management of maintenance support resources. Asset managing organisations are increasingly implementing IT systems to automate and bind these activities together. However, close collaboration and openness of information exchange on asset management processes hold the key to effective asset lifecycle management. Thus, the scope of IT in asset management extends well beyond the usual data processing and reaches out to business value chain integration, enhancing competitiveness, and transformation of patterns of business relationships [5]. In theory, IT utilised in asset management has three major roles: firstly, IT systems capture, store, and exchange information spanning asset lifecycle processes; secondly, IT provides decision support capabilities through the analytic conclusions arrived at from analysis of data; and thirdly, IT enables an integrated view of asset management through integration and interoperability of asset lifecycle information. IT systems thus help in translating asset management strategy into action by enabling asset lifecycle processes, and also inform the asset management strategy through their ability to analyse the lifecycle information that asset managers use in lifecycle planning and decisions.
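To make these three roles concrete, the following minimal Python sketch (an illustration only; the class names, fields and example values are assumptions, not drawn from the paper) shows how design, condition and maintenance information for an asset might be consolidated into one record so that integrated, decision-support views can be produced across lifecycle processes.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ConditionReading:          # information captured during operation
    timestamp: str
    parameter: str               # e.g. "bearing_temperature"
    value: float

@dataclass
class MaintenanceEvent:          # information captured during maintenance
    timestamp: str
    action: str                  # e.g. "replaced seal"
    cost: float

@dataclass
class AssetRecord:               # integrated view spanning lifecycle processes
    asset_id: str
    design_life_years: float     # design-stage information
    readings: List[ConditionReading] = field(default_factory=list)
    maintenance: List[MaintenanceEvent] = field(default_factory=list)

    def maintenance_cost(self) -> float:
        # a simple analytic conclusion supporting lifecycle decisions
        return sum(e.cost for e in self.maintenance)

pump = AssetRecord("PUMP-001", design_life_years=20)
pump.readings.append(ConditionReading("2009-06-01T10:00", "bearing_temperature", 78.5))
pump.maintenance.append(MaintenanceEvent("2009-06-15T09:00", "replaced seal", 420.0))
print(pump.maintenance_cost())   # 420.0
```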
IT for asset management, thus, seeks to enhance the outputs of asset management processes through a bottom-up approach. This approach gathers and processes operational data for individual assets at the base level and, at a higher level, provides a consolidated view of the entire asset base. At the operational and tactical levels, IT systems are required to provide the necessary support for planning and execution of core asset lifecycle processes. For example, at the design stage designers need to capture and process information such as asset configuration; asset and/or site layout design and schematic diagrams/drawings; asset bill of materials; analysis of maintainability and reliability design requirements; and failure modes, effects and criticality identification for each asset. Planning choices at this stage drive future asset behaviour; therefore, the minimum requirement laid on IT at this stage is to provide the right information at the right time, such that informed choices can be made to ensure availability, reliability and quality of asset operation. An important aspect of the asset design stage is the supportability design that governs most of the later asset lifecycle stages. The crucial factor in carrying out these analyses is the availability and integration of information, such that the supportability of all facets of asset design and development, operation, maintenance, and retirement is fully recognised and defined. Effective asset management also requires the lifecycle decision makers to identify the financial and non-financial risks posed to asset operation, their impact, and ways to mitigate those risks. Owing to a deterministic view of technology, managerial expectations of IT investments are those of increased quality and quantity of output, as well as substitution of human effort through automation. These expectations also contribute to the underlying assumption of IT investment that the benefits of adoption will outweigh the related costs. These cost-benefit advantages are often translated into gains in production/manufacturing/service provision output through operational efficiency. However, the effectiveness of IT depends on how IT is implemented, since it cannot be detached from the human action and understanding, social context, and cultural environment within which it is implemented. IT implementation, in doing so, becomes a strategic advisory mechanism that supports planning, decision making, and management of the IT infrastructure and facilitates organisational learning. It is, therefore, important to justify investments in IT and measure their performance, so as to identify whether IT is achieving the desired results and to enable actionable learning in case the desired results are not achieved. It is equally important to understand how performance evaluation contributes to the learning that facilitates organisational improvement. The following section, therefore, explains the concept and characteristics of evaluation.
3 CONCEPT OF EVALUATION
Since the early 1990s there has been increased research activity in the development of performance measurement systems aimed at various organisational levels and covering a multitude of dimensions. This increase has been fuelled by business development theories that promote the use of performance evaluation as a means for performance improvement, such as the theory of constraints, lean enterprise, and six sigma. The activity thus generated has resulted in the development of numerous models, frameworks, techniques, and methods applied in industry with varying levels of acceptance and success. However, the discussion on the effectiveness of performance evaluation has centred on three views. The first view suggests that businesses do well if there are integrated and well structured performance evaluation methods in place that inform and provide management with improvement indicators [6]. In contrast, there are researchers who have questioned the role of performance evaluation in general and of individual performance evaluation methods in particular. For example, researchers [7, 8] suggest that employing console-style performance evaluation methods, such as the balanced scorecard, makes little or no contribution to business performance improvement. There are, however, other researchers who suggest that performance evaluation is a business management activity and that its success is highly dependent upon the approach used to implement it. Performance evaluation methodologies have various aims and values, involve a range of stakeholders, and are aimed at various stages of the system, product, or organisational lifecycle. The intent of a performance evaluation exercise is to provide management with a progress report on the performance of the area under investigation, so as to prompt action to address the gaps thus identified. The major objective of performance evaluation, therefore, is proactive rather than reactive management. According to Atkinson et al. [9], performance measurement serves three basic functions, i.e. to co-ordinate, to monitor and to diagnose. Working through these functions, performance evaluation provides a roadmap for proactive organisational improvement. Thus the character of evaluation changes with changes in the business environment. Meekings [10] summarises the character of the evaluation exercise by arguing that: a. evaluation exercises should provide progressive forecasting and insights into business performance; b. instead of being a tool for management control, performance measures should be aimed at providing feedback, instituting understanding, and promoting motivation for performance improvement; c. the focus of performance evaluation needs to be based on systems thinking centred on structured change and organisational learning, rather than on setting targets aimed at fire fighting or allocation of blame; and d. performance evaluation and measures need to be aligned with the organisational objectives, such that all levels of the organisation understand these measures and collaborate and contribute towards continuous business improvement.
4 UNIQUENESS OF IT EVALUATION
IT evaluation is a subjective activity that is highly influenced by the context within which the IT systems are employed. Furthermore, it involves a variety of organisational stakeholders and a range of activities, processes, and conditions, which underscores the complexity of IT systems evaluation. Tangen [11] contends that performance evaluation represents the set of metrics used to quantify the efficiency and effectiveness of the organisational actions taken towards achieving its objectives. In terms of IT investments, this efficiency and effectiveness constitutes the value profile that organisational stakeholders attach to their use in the organisation. This value profile could be financial, functional, individual, organisational, or strategic advantage. However, it is important to note that this value profile differs at different stages of an IT system's lifecycle; for example, an ex ante or pre-implementation evaluation is aimed at ascertaining the cause and effect of technology, whereas an ex post or post-implementation evaluation may be aimed at evaluating how well the IT systems are enabling organisational strategy as well as how good they are at advising business strategy. In light of this discussion, IT systems evaluation could be termed 'an assessment of the value profile of IT systems to an organisation using appropriate measures, at a specific stage of the IT systems lifecycle, towards continuous improvement aimed at achieving the overall organisational objectives'. This definition embodies the objectives, measures, and process of evaluation. However, formulation of an effective methodology requires a complete understanding of the how, why, what, who, where, and when of performance evaluation. Edvardsson et al. [12] and Oakland [13] elaborate on these and explain that, in order to formulate an effective performance evaluation methodology, an organisation must resolve: a. why measurement is required, by being explicit about the purpose of evaluation; b. what should be measured, by carefully working out evaluation criteria that cover all relevant dimensions of the phenomena under evaluation; c. how it should be measured, by devising an appropriate framework for evaluation such that the evaluation provides an integrated rather than disjoint view of the phenomena under investigation; d. when it should be measured, by deciding the timeframe and frequency of carrying out the evaluation exercise; e. who should measure it, by identifying the stakeholders who are responsible for development of the evaluation methodology, as well as those who will carry out the evaluation exercise; and f. how the results should be used, by making clear to the internal organisation the follow-up actions, so as to motivate and involve internal staff in the process of continuous improvement.
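Purely as an illustration (the field names, completeness check and example values are assumptions, not part of the paper), these six questions can be captured as a simple evaluation-plan structure that is checked for completeness before an evaluation exercise is launched.

```python
from dataclasses import dataclass, fields
from typing import List

@dataclass
class EvaluationPlan:
    purpose: str             # why measurement is required
    criteria: List[str]      # what should be measured
    framework: str           # how it should be measured
    schedule: str            # when it should be measured
    stakeholders: List[str]  # who should measure it
    follow_up: str           # how the results should be used

    def is_complete(self) -> bool:
        # every question must be answered before the exercise starts
        return all(bool(getattr(self, f.name)) for f in fields(self))

plan = EvaluationPlan(
    purpose="Assess fit of the CMMS with maintenance planning processes",
    criteria=["user satisfaction", "data quality", "cost of ownership"],
    framework="multidimensional scorecard, ex post",
    schedule="annually, after the maintenance planning cycle",
    stakeholders=["maintenance planners", "IT department", "finance"],
    follow_up="gap report reviewed by the asset management steering group",
)
print(plan.is_complete())    # True
```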
4.1 The What and Why of Evaluation
The fundamental question in an evaluation exercise is not only ascertaining the subject of evaluation but also the dimensions of the subject that need evaluation. For example, in asset management there could be different permutations of IT evaluation, such as IT-enabled asset management processes, system design, a particular system (for example ERP), quality of information, or the quality of decision support that the IT systems provide. IT systems evaluations have a broad focus and, according to Teubner [14], may involve evaluation of: a. components of information technology, b. software applications, c. the way IT and software are applied in information systems, d. business processes supported or enabled by IT systems, e. IT systems related development processes, f. techniques, methods and tools used in development, g. artefacts and models built during the development process, and h. IT systems related management and service processes. The purpose of IT systems evaluation largely depends upon how the organisation views technology. For example, when IT is viewed as a process automation tool, evaluation of IT systems may be aimed at assessing the way it supports business processes, which may be measured by the speed of process execution. Although the focus of research into IT systems evaluation has traditionally been on software applications or standalone IT systems, the trend is shifting and broadening in scope and now involves even cross-enterprise IT systems.
4.2 The How of Evaluation
Every evaluation is based on some intrinsic measurement variables. These criteria are chiefly drawn from the subject, or the dimension of the subject, of evaluation. Brown [15] identifies six generic types of performance measures that are widely employed in evaluation exercises, i.e. customer satisfaction measures; financial measures; product/service quality measures; employee satisfaction measures; operational measures; and public responsibility measures. In IT systems evaluation the generally applied generic performance measures are financial measures, such as costs of implementation; technical measures, such as response time; system usefulness attributes, such as user satisfaction; and quality of the information. IT systems, however, are social systems embedded within the organisational context, and choosing criteria that encompass evaluation of all IT systems benefits is a difficult task. Teubner [14] points out that these difficulties are due to a range of factors, such as: a. Technical Embedding. Individual IT system components are often embedded in the overall technological infrastructure, which makes it difficult to assess the performance of these individual components. For example, while evaluating the effectiveness of a condition monitoring system, it is difficult to quantify the contribution of individual sensors. b. Organisational Embedding. The IT systems infrastructure is an integral part of an organisation, and influences and is influenced by a number of organisational factors, such as the culture and structure of the organisation. Consequently, it has progressively become difficult to separate the impact of IT systems from these organisational aspects. For example, the utility of an IT system is not just restricted to the business process or processes that it enables, but is also reflected in the ambiance of the organisation. It is therefore essential to develop evaluation criteria that truly investigate the performance of IT systems. c.
Social Construction. The social impact of IT systems is well documented, which makes them much more than just technical solutions. The changes that IT systems implementation brings affect work practices as well as the intellect and working habits of employees. However, the impact of IT systems on staff, the social life of the organisation, and collective sense-making is intangible and difficult to measure. d. Social Adoption. IT systems adoption is a social process, since their use evolves over time and depends heavily upon the skills of employees and the culture of the organisation. It also means that IT systems may not start delivering the desired results straight after their implementation. Evaluation criteria, therefore, need to account for the time frame of the IT systems lifecycle within which evaluation is to be carried out.
4.3 The When and Where of Evaluation
IT systems evaluation can be carried out at various stages of the systems' lifecycle. However, the most common evaluations are ex ante, ex post, and during operation. Depending upon the stage, such evaluations have different aims and objectives. In ex ante or pre-implementation evaluations, performance measurement criteria are generally based on cost benefits and the perceived value that the investment may bring to the organisation. This investigation is usually carried out by functional teams, who evaluate different technology choices and then make a decision. Nevertheless, the measurement criteria are often not clear and are largely based on assumptions about the future use of the technology, as conceived by the evaluators. On the contrary,
during a post-implementation evaluation a report card on the investment in IT systems is developed. This type of evaluation is generally not conducted by the people who conduct the ex ante evaluations, and therefore the sensitivities of the technology in terms of purpose and effectiveness of use are not considered. These two factors also change with time, mainly due to technological innovation and changes in the business environment. On the other hand, 'during operation' evaluation is often expected to produce learnings and feedback that can be used for strategic reorientation. However, this form of evaluation requires long-term involvement and experience, such that the purpose, use, and fit of technology within the organisation are understood by all the evaluators. This makes the success or failure of IT systems open to interpretation according to the judgements and experiences of the evaluators. It must be pointed out that this form of evaluation is the least common in research and practice.
4.4 The Who of Evaluation
The subjectivity and social nature of IT systems evaluation makes the choice of evaluation stakeholders an important issue. Evaluation stakeholders are not just the people who carry out the evaluation exercise, but also those who are affected by it in any way. For example, senior management may not be directly involved in or affected by the evaluation, yet they have significant influence on the evaluation exercise. The choice of stakeholders is also affected by the stage of the IT systems lifecycle at which the evaluation is carried out. For example, for ex ante evaluations the evaluators comprise a diverse group such as system developers, users, project managers, finance staff, and customers, whereas ex post evaluation is generally carried out by functional teams, the IT systems department, external agencies, or teams with organisation-wide representation.
4.5 Methodologies and Tools Used
Enacting appropriate methodologies, techniques, and tools for evaluation provides the rational underpinning between the evaluation measures and the effectiveness of evaluation. Due consideration of this relationship is important, because IT implementation has a direct relationship with organisational context, human behaviour, and other structures developed around IT systems. The choice of evaluation methods and tools needs to be comprehensive enough to encompass all these issues. De Toni and Tonchia [16] state that there are five types of performance evaluation models found in the literature: a. vertical or hierarchical models with an economic outlook, typified by financial and non-financial evaluations connecting ROI and productivity; b. balanced scorecards, where several dimensions (such as financial, learning and growth, internal business processes, and customers) are evaluated separately and linked together in a general way; c. 'frustum' models, where there is a synthesis of low-level measures into more aggregated indicators, but without the scope of translating non-cost performance into financial performance; d. models that distinguish between internal and external performances; and e. models that are related to the value chain. These models have been applied to the IT paradigm extensively, though with varying success.
IT evaluation methodologies are quantitative as well as qualitative in nature; are aimed at a single IT system or at organisational IT systems collectively; are based on single as well as multidimensional evaluation criteria; and may be expanded to consider contextual and organisational factors.
4.6 Using the Results of Evaluation
Using the results of evaluation depends upon the type of feedback that performance evaluation systems enable. Generally, evaluations provide feedback in terms of adaptive or generative learning. However, this feedback needs to enable actionable change; for example, evaluation of IT systems could provide feedback in terms of a gap analysis of the desired versus actual state of their usefulness or levels of performance. In doing so, the underperforming areas are highlighted so that corrective action can be taken, for example system upgrade, information integration, or training of users. This type of learning is adaptive learning. When the corrective actions taken are evaluated again in the following evaluation cycle, the assessment leads to further refinement or actionable learning to achieve the optimum level of service from IT. Such learnings are generative and provide the basis for continuous improvement in the organisation.
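As a minimal sketch of this adaptive feedback loop (the measure names, target values and tolerance below are assumptions, not taken from the paper), a gap analysis can compare desired and actual values of IT performance measures and flag the areas where corrective action is needed; re-running it in the next evaluation cycle shows whether the actions closed the gaps.

```python
def gap_analysis(desired: dict, actual: dict, tolerance: float = 0.05) -> dict:
    """Return the relative gap per measure and flag those needing corrective action."""
    report = {}
    for measure, target in desired.items():
        value = actual.get(measure, 0.0)
        gap = (target - value) / target          # positive gap means underperformance
        report[measure] = {"gap": round(gap, 3), "action_needed": gap > tolerance}
    return report

desired = {"user_satisfaction": 0.80, "data_completeness": 0.95, "system_availability": 0.99}
cycle_1 = {"user_satisfaction": 0.62, "data_completeness": 0.90, "system_availability": 0.99}
print(gap_analysis(desired, cycle_1))
# user_satisfaction is flagged; corrective action (e.g. training) is then re-evaluated next cycle
```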
5 LEARNING AS THE VALUE PROFILE FROM IT FOR ASSET MANAGEMENT
A review of the literature on performance evaluation efforts in engineering enterprises from 2002 to 2007 [2] reveals that holistic evaluation of asset management processes, or of the IT utilised for asset management, has seldom occurred. The scope of performance evaluations has primarily centred on the efficiency of manufacturing strategy or production systems. Within the realm of asset management, asset maintenance and design appear to be the major concerns; however, the methodologies employed to measure performance have a function-specific focus that further restricts the scope of follow-up actions. In addition, these evaluations have generally been carried out to justify managerial actions or investment decisions. In terms of IT, evaluations
have been aimed at measuring the impact of standalone as well as integrated systems on the cost, quality, and throughput of business processes. Nevertheless, most of these evaluations have been carried out at the ex ante or ex post stages, with little research on the measurement of benefits from IT during routine day-to-day operations. During ex ante evaluation, engineering enterprises need to take a broad view of the technology implementation and consider the areas that may influence or be influenced by technology adoption. Researchers have generally focused on the justification of technology and have therefore stressed the relationship between justification methodologies and technology adoption, while ignoring the relationship between the justification of actions and the performance of technology [2]. Consequently, attention is focused on the technical profile of technology rather than on how the organisation might accommodate it. Although financial measures are important at this stage, the choice of technology is equally dependent on the fit of the technology with the strategic and organisational environment of the organisation. It is, therefore, important to consider a broad range of performance variables spanning different organisational dimensions, so as to enable an effective and comprehensive follow-up. Haider [1] argues that in order to develop and implement a performance evaluation methodology, organisations need to ask three fundamental questions: what specific strategic, organisational, and technological issues should be considered when investing in IT; how are the strategic, organisational, and technological issues interrelated, and how does each relate to the implementation of IT; and when should specific strategic, organisational, and technological issues be addressed during IT execution. Ex post and during-operation evaluations, however, have a much broader focus and are aimed at facilitating strategic action so as to enable continuous improvement. The effectiveness of these actions is greatest when the aims and targets of the evaluation exercise are understood and well articulated. Evaluation is therefore a learning exercise, and this learning occurs at the individual, group and organisational level. The notion of backing learnings from evaluation exercises with appropriate action has some limitations. For example, a routine medical examination measures factors such as body mass index and blood pressure. Even if these factors fall in the ideal category, this does not mean that the person has no illness. On the other hand, if these two factors fall outside the safe limits, that does not necessarily signify disease. In order to arrive at a fitness or illness decision, certain follow-up actions are required, which may include further examination or employing independent measures, such as blood tests. This has particular significance for asset management, since the fundamental information input into asset health management is the condition information that originates from sensors or manual inspection. Evaluation of this information may reveal inconsistencies, which may not necessarily be indicators of ill health. For example, during extreme heat a vehicle's temperature may reach its high limit; however, that does not mean that the vehicle will no longer deliver at the same level. All it signifies is that if the vehicle keeps operating in the same context, a failure may be inevitable.
Similarly, the IT employed at the condition monitoring level only communicates and acts upon the information that is collected through the sensors. The temperature or humidity sensors may report readings that have been influenced by the weather, and these readings will certainly not remain the same. This therefore requires a follow-up action to ascertain whether the problem really exists. The passive nature of IT as decision support tools, rather than decision making tools, necessitates human intervention to justify the follow-up action. The effectiveness of the follow-up action is itself heavily dependent upon the perspective of the decision maker, both in terms of the breadth of complete information available and his/her experience. IT evaluation enables a learning organisation as well as organisational learning. Both of these concepts, though distinct, are interrelated. A learning organisation is one that creates, acquires, and transfers knowledge, and transforms itself to reflect new knowledge and insights [17], whereas organisational learning signifies the processes through which organisations take stock of their capabilities and through which they can be, and are, changed [18]. IT for asset management, therefore, has a critical role in enabling a learning organisation as well as facilitating organisational learning. Evaluation of IT utilised for asset management needs to provide insights into the effectiveness of asset lifecycle management processes through IT utilisation, and also to enable feedback on the relevance and fit of the existing IT infrastructure with operational technologies, such as SCADA and CMMS. Evaluation is thus a learning activity, which facilitates organisational learning by revealing explicit or implicit dimensions of IT for asset management. The learnings thus gained provide indicators for improving, as well as sustaining, the existing level of asset lifecycle management. In doing so, evaluation works as a means of feedback on the management actions taken and their impact on the organisation, and develops into an instrument of learning that can reduce future uncertainty of emergent asset management approaches and decisions. However, while there are countless performance measurement and management methodologies, frameworks, and systems available, there is hardly anything credible in research or practice that looks into the impact of performance evaluation itself. The following section presents a framework to guide a holistic performance evaluation agenda by providing theoretical support for the adoption, implementation, and impact of performance evaluation.
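The follow-up logic described above can be sketched in a few lines of Python (an illustration only; the sensor values, limit and confirmation window are assumptions): an out-of-limit condition reading is treated as a candidate problem and only escalated for human review after it persists over several consecutive samples, reflecting the role of IT as decision support rather than decision making.

```python
from typing import List

def needs_follow_up(readings: List[float], high_limit: float, persistence: int = 3) -> bool:
    """Escalate for human review only if the last `persistence` readings all exceed the limit."""
    if len(readings) < persistence:
        return False
    return all(r > high_limit for r in readings[-persistence:])

# Engine temperature readings (deg C); a single hot-weather spike is not escalated.
temps = [92.0, 94.5, 107.2, 95.1, 108.3, 109.0, 110.4]
print(needs_follow_up(temps, high_limit=105.0))  # True: the last three readings persist above the limit
```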
6 PERFORMANCE EVALUATION OF IT FOR ASSET MANAGEMENT
Organisations' expectations associated with the adoption of IT are quite diverse, such as operational efficiency, reduction in operating expenses, or enhanced competitiveness. However, divergent views are held about the value creation of IT investment. Although recent studies have concluded that IT investments provide positive economic returns [19], the impact of IT investments varies within organisations [20]. Evidence found in the literature, both industry and academic, sustains the argument of success (see, for example, [21]) and of failure (see, for example, [22]). The reason for this polarisation is the propensity to neglect the active interaction and shared shaping between technology and people [2]. It is also argued that
when organisations attempt to evaluate IT, the managerial emphasis is mostly on improving the cost-benefit management of IT adoption. The majority of IS evaluation exercises are carried out using capital investment appraisal techniques, such as cost-benefit analysis, payback and return on investment [23]. These evaluations only give a slice of the total impact of IT investments and disregard the human and organisational aspects of IT adoption; they therefore not only keep the softer benefits hidden, but the costs of managing these benefits also remain uncovered [24]. Furthermore, these unobserved benefits prevent the systems from delivering at their full potential [25]. Consequently, such evaluations fail to measure the total impact of IT and contribute to the failure of IT investments to achieve the desired objectives [26]. Liyanage and Kumar [27] argue that the changing competitive environment of asset managing organisations, along with stricter regulatory requirements, is forcing asset managing organisations to put in place effective performance management mechanisms for their asset management processes. This trend is gaining ground in capital-intensive industries, such as petroleum [28]. As these industries become increasingly aware of the shortcomings of classic, financially oriented measurement techniques, asset managing organisations such as BP and Shell are broadening the scope of their evaluation exercises to include soft as well as hard determinants of asset management [27]. Figure 2 addresses this change in scope and presents an overarching performance evaluation research and practice agenda. At present, research and practice support is available in certain areas of this framework; however, there are important areas in implementation and impact which are untouched by research and business practice.
Figure 2 (captioned below) spans three dimensions: adoption of performance evaluation, supported by complex adaptive systems theory and institutional theory; implementation of performance evaluation, supported by task technology fit, the theory of reasoned action and principal agent theory, addressing how IT must be implemented at the strategic level to provide an integrated view of asset lifecycle management information for executive decision making, at the tactical level to fulfil asset lifecycle planning and control requirements aimed at continuous asset availability through analysis of design, operation, maintenance, financial and risk information, and at the operational level to aid asset design, operation, condition monitoring, failure notification, maintenance execution and resource allocation; and impact of performance evaluation, supported by structuration theory, with IT implementation enabling, and IT usage informing, the asset management strategy and the desired asset management outputs.
Figure 2: Performance evaluation research dimensions of IT for Asset Management
This framework provides a comprehensive methodology for end-to-end performance evaluation and management of IT utilised for asset lifecycle management. Starting from how to adopt a performance evaluation methodology, it goes all the way to how to measure the impact or suitability of the chosen methodology. In doing so, it not only measures the usefulness of IT but also the effectiveness of the methodology itself. It has the strategy translation and strategy enablement role of IT at its core and attempts to evolve and establish performance evaluation and management around it.
6.1 Adoption of Performance Evaluation
At the elementary stage it is important for asset managing organisations to understand three key issues: firstly, why it is important to measure the performance of IT utilised for asset management; secondly, what type of performance evaluation methodology suits the subject of evaluation; and thirdly, how to adopt the chosen performance evaluation methodology. Two theories, i.e. complex adaptive systems theory and institutional theory, provide theoretical support here. Asset management processes are not well defined or static; in fact, these processes are emergent in nature. Consequently, the information requirements from the IT systems employed to enable these processes are also developing and continuously changing. Complex adaptive systems theory (a multi-agent theory in which order is not predetermined but emergent, and in which the agents themselves structure activities within the system through their actions) provides the requisite understanding of why organisations would adopt a performance evaluation system. Viewed through this lens, performance evaluation enables organisations to better manage the balance between order (such as control mechanisms like SCADA or CMMS) and uncertainty (such as changing maintenance demands). In doing so, performance evaluation systems link operations to strategic objectives while at the same time providing decision makers at all levels with the information needed to adjust to changing organisational conditions. On the other hand, institutional theory suggests that organisational behaviour is embedded in the socio-politico-economic climate in which an organisation operates. It focuses on the processes by which structures, including business processes, rules, and routines, become established as guidelines for organisational behaviour, and explains how these elements are made, disseminated, accepted, and adapted in organisational life. It therefore helps in identifying and understanding how organisations adopt certain business practices or technologies, which facilitates the adoption of a specific performance evaluation methodology and its expansion to all areas of asset lifecycle management.
6.2 Performance Evaluation Implementation
At the implementation stage the framework utilises task technology fit, the theory of reasoned action, and principal agent theory to establish the relative success, adaptability, and effectiveness of performance management systems or methodologies. The theory of reasoned action argues that the behaviour of individuals is directly determined by intention, and intention is formed by attitudes towards the object in question. Utilising this theory at this stage helps in establishing the IT requirements that enable the desired asset management outputs. In addition, it helps in establishing the reason for implementing the performance evaluation system, which further helps in acting upon the learnings from evaluation and in change management. Task technology fit theory states that the use of IT is expected to have a positive effect on people's performance if the capabilities of the technology match the task that people have to perform. It means that technology will only be useful if it matches the requirements of the task for which it has been implemented. Therefore, this theory helps in establishing realistic hard and soft expectations of IT in the areas of its application and influence. Principal agent theory argues for aligning the interests of principal and agent.
Application of this theory will help in linking the strategic outcomes of the organisation to operational level activities, i.e. the activities of the agents (asset management oriented IT systems and the people using these systems) will be aligned with the needs of the principal (the overall business strategy, shareholders, or regulatory agencies).
6.3 Performance Evaluation Impact
Structuration theory does not focus on any individual actor; rather, it focuses on the broader organisational order and practice. It therefore suggests that the interactions of managers with the IT infrastructure as a whole will influence how performance evaluation evolves. If managers view IT as a core element of asset lifecycle management, then performance evaluation will evolve as a generative learning methodology; otherwise it will be restricted to adaptive learning. Evaluation, thus, represents the existing meanings and interests which managers, individuals, or communities within the organisation associate with the use of technology within the socio-technical environment of the organisation. It describes the dynamic relationship between technology, the context within which it is employed, and the organisational actors who interact with it. When technology is physically adopted and socially constructed, there is generally a consensus or accepted reality about what the technology is supposed to accomplish and how it is to be utilised. This temporary interpretation of technology is institutionalised and becomes associated with the actors who constructed the technology and gave it its current significance, until it is questioned again for reinterpretation. The need for reinterpretation may arise owing to changes in the context, or to learnings that render the current interpretation obsolete. It is, therefore, essential to measure the outcomes of performance evaluation against the assumptions or postulates that necessitated the performance evaluation. In doing so, the performance evaluation exercise highlights gaps in performance and provides actionable learnings, becoming a tool for process innovation and continuous improvement.
7 CONCLUSION
Performance management, conceptually, is subjective and cannot be detached from the human understanding, social context, and cultural environment within which it takes place. It is influenced by the actors who carry out the exercise, and by the
principles and assumptions that they employ to evaluate performance. Considering that human interpretation shapes and reshapes over time, the nature of evaluation also changes from time to time. When IT evaluation is employed, it is expected to expose a number of different dimensions of IT implementation, such as the financial, technical, behavioural, social, and management aspects of information systems. Furthermore, these endeavours may be aimed at stakeholder satisfaction, the role of IT, and the IT lifecycle. These expectations change during the lifecycle of an information system. Operationally, IT evaluation has different objectives and aims ex ante and ex post. An ex ante or pre-implementation evaluation is aimed at ascertaining the cause and effect of technology, whereas an ex post or post-implementation evaluation may be aimed at evaluating the strategy translation as well as the strategic advisory role of information systems. Each of these dimensions and their related objectives and aims have their own theories, postulates, and evaluation criteria, which makes information systems evaluation complicated and difficult. However, the pervasiveness of information technologies has led to the development of various organisational performance evaluation tools and processes. Since the focus of performance evaluation is on organisational performance, these systems must lead to innovation in management processes, much as production technology is thought to lead to enhanced operational productivity.
8 REFERENCES
1. Haider, A 2009, 'Value Maximisation from Information Technology in Asset Management – A Cultural Study', 2009 International Conference of Maintenance Societies (ICOMS), 2-4 June, Sydney, Australia.
2. Haider, A 2007, Information Systems Based Engineering Asset Management Evaluation: Operational Interpretations, PhD Thesis, University of South Australia, Adelaide, Australia.
3. Hastings, NAJ 2000, 'Asset management and maintenance', Queensland University of Technology, Brisbane, Queensland.
4. IIMM 2006, 'International Infrastructure Management Manual', Association of Local Government Engineering NZ Inc, National Asset Management Steering Group, Thames, New Zealand, ISBN 0-473-10685-X.
5. Haider, A, Koronios, A, & Quirchmayr, G 2006, 'You Cannot Manage What You Cannot Measure: An Information Systems Based Asset Management Perspective', in Proceedings of the Inaugural World Congress on Engineering Asset Management, eds J Mathew, L Ma, A Tan & D Anderson, 11-14 July 2006, Gold Coast, Australia.
6. Davis, S, & Albright, T 2004, 'An investigation of the effect of balanced scorecard implementation on financial performance', Management Accounting Research, Vol. 15, No. 2, pp. 135-153.
7. Ittner, CD, Larcker, DF, & Randall, T 2003, 'Performance implications of strategic performance measurement in financial services firms', Accounting, Organizations and Society, Vol. 28, No. 7/8, pp. 715-741.
8. Neely, A, Kennerley, M, & Martinez, V 2004, 'Does the balanced scorecard work: an empirical investigation', in Proceedings of the Performance Measurement Association Conference, Edinburgh, July.
9. Atkinson, AA, Waterhouse, JH, & Wells, RB 1997, 'A stakeholder approach to strategic performance measurement', Sloan Management Review, Vol. 38, No. 3, pp. 25-37.
10. Meekings, A 1995, 'Unlocking the potential of performance measurement: A practical implementation guide', Public Money and Management, Vol. 15, No. 4, pp. 5-12.
11. Tangen, S 2004, 'Performance measurement: from philosophy to practice', International Journal of Productivity and Performance Management, Vol. 53, No. 8, pp. 726-737.
12. Edvardsson, B, Thomasson, B, & Ovretveit, J 1994, Quality of Service, McGraw-Hill, London.
13. Oakland, JS 1995, Total Quality Management: Text with Cases, Butterworth-Heinemann, New York, NY.
14. Teubner, RA 2005, 'The IT21 Checkup for IT Fitness: Experiences and Empirical Evidence from 4 Years of Evaluation Practice', Working Papers, European Research Center for Information Systems, No. 2, eds J Becker, K Backhaus, HL Grob, T Hoeren, S Klein, H Kuchen, U Muller-Funk, UW Thonemann & G Vossen, Munster, ISSN 1614-7448.
15. Brown, MG 1996, Keeping Score: Using the Right Metrics to Drive World-class Performance, Quality Resources, New York, NY.
16. De Toni, A, & Tonchia, S 1998, 'Manufacturing flexibility: a literature review', International Journal of Production Research, Vol. 36, No. 6, pp. 1587-1617.
17. Garvin, D 1993, 'Building a learning organization', Harvard Business Review, Vol. 71, No. 4, pp. 78-92.
18. Chan, CCA, & Scott-Ladd, B 2004, 'Organisational learning: Some considerations for human resource practitioners', Asia Pacific Journal of Human Resources, Vol. 42, No. 3, pp. 336-347.
19. Anderson, M, Banker, RD, & Hu, N 2002, 'Estimating the business value of investments in information technology', in Proceedings of the Eighth Americas Conference on Information Systems (AMCIS 2002), Dallas, TX, pp. 1195-1197.
20. Leibs, S 2002, 'A step ahead: Economist Erik Brynjolfsson leads the charge toward a greater appreciation of IT', CFO Magazine, NY, pp. 38-41.
21. Devaraj, S, & Kohli, R 2002, Measuring the Business Value of Information Technology Investments, 1st edn, Financial Times Prentice Hall, New York, NY.
22. Ehrhart, T 2002, 'All Wound Up: Avoiding Broken Promises in Technology Projects', Risk Management, Vol. 49, No. 4, pp. 12-16.
23. Serafeimidis, V, & Smithson, S 2000, 'Information Systems Evaluation in Practice: a case study of organisational change', Journal of Information Technology, Vol. 15, No. 2, pp. 93-105.
24. Khalifa, G, Irani, Z, Baldwin, LP, & Jones, S 2001, 'Evaluating Information Technology with You in Mind', Electronic Journal of Information Systems Evaluation (EJISE), Vol. 4, Issue 1.
25. Pennington, D, & Wheeler, F 1998, 'The Role of Governance in IT Projects: Integrating the Management of IT Benefits', in Proceedings of the Fifth European Conference on IT Investment Evaluation, pp. 25-34.
26. Pouloudi, A, & Whitley, A 1997, 'Stakeholder identification in interorganizational systems: gaining insights for drug use management systems', European Journal of Information Systems, Vol. 6, No. 1, pp. 1-14.
27. Liyanage, JP, & Kumar, U 2003, 'Towards a value-based view on operations and maintenance performance management', Journal of Quality in Maintenance Engineering, Vol. 9, No. 4, pp. 333-350.
28. Liyanage, JP, & Kumar, U 2000, 'Utility of maintenance performance indicators in consolidating technical and operational health beyond the regulatory compliance', in Proceedings of Safety Engineering and Risk Analysis: The International Mechanical Engineering Congress and Exposition 2000, pp. 153-160.
29. Ballantine, J, & Stray, SJ 1998, 'Financial appraisal and the IS/IT investment decision making process', Journal of Information Technology, Vol. 13, No. 1, pp. 3-14.
Acknowledgement: Financial support from the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM) for this work is gratefully acknowledged.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
OPTIMAL SCHEDULES OF TWO PERIODIC PREVENTIVE MAINTENANCE POLICIES AND THEIR COMPARISON
Dohhon Kim a, Jae-Hak Lim b and Ming J. Zuo c
a Graduate School, Kyonggi University, Suwon, Gyenggi-do 443-760, Korea
b Department of Accounting, Hanbat National University, Yusong-gu, Daejon 305-719, Korea
c Department of Mechanical Engineering, University of Alberta, Edmonton, Alberta, T6G 2G8, Canada
In this paper, we propose two types of PM action: one acts on the relative wear-out accumulated since the last PM action (the local PM action), while the other is effective in restoring the global wear-out accumulated since the equipment started operating (the global PM action). Based on the proposed local and global PM actions, we develop two periodic PM policies, called the type I PM policy and the type II PM policy, respectively. For each PM policy, we derive formulas to compute the expected cost rate of the system during its life cycle and investigate the optimal PM schedules that minimize the expected cost rates. We also compare the local PM policy and the global PM policy under the assumption that the cost of the global PM action is higher than or equal to the cost of the local PM action. To illustrate our results, we investigate numerically the sensitivity of the optimal schedules. Keywords: Periodic preventive maintenance, Expected cost rate per unit time, Minimal repair, Preventive maintenance, Hazard rate, Hazard rate reduction factor, Optimal schedule
1 INTRODUCTION
As most industrial systems become more complex and multi-function oriented, it is extremely important to avoid catastrophic failure during actual operation as well as to slow down the degradation process of the system. One way of achieving these goals is to perform preventive maintenance while the system is still functional. Although more frequent preventive maintenance (PM) actions certainly keep the system less likely to fail during its operation, such a PM policy inevitably requires a higher cost of maintaining the system. Since two types of PM policies were proposed by Barlow and Hunter [1], many authors have addressed the problem of the optimal schedule for a PM policy by determining either the length of the time interval between PM actions or the number of PM actions before replacement, each of which minimizes the expected cost rate. Appropriate PM actions not only prolong the usable life of the system but also slow down the degradation process. A number of PM models which reflect various effects of PM actions have been proposed. The different types of PM models studied in the literature are summarized well in [2] and [3]. The earliest PM models assume that the system undergoes a PM action at specified times and is restored to as good as new after each PM action. However, although a PM action improves the system and slows down the degradation process, it is very unlikely that a PM action restores a practical system in use to an as-good-as-new condition. This introduces the concept of the imperfect PM model, which has attracted many researchers' attention. Although imperfect maintenance covers various categories of repair and maintenance actions, it is assumed in this paper that imperfect maintenance restores the system operating state to somewhere between as good as new and as bad as old. This is known as the improvement factor model [2, 3]. Malik [4] introduces the concept of the improvement factor, in which the hazard rate after a PM action lies between 'as good as new' and 'as bad as old', and proposes an algorithm to determine successive PM intervals in the sequential PM policy. Lie and Chun [5] present a general expression to determine these PM intervals in the PM policy considered in [4]. Canfield [6] suggests an imperfect PM model in which a PM action does not reduce the hazard rate but slows down the wear-out speed. In Jayabalan and Chaudhuri [7], after each PM action, the system is in a state between as good as new and as bad as old in such a way that
the effective age of the system after a PM action is reduced to a certain age proportional to an improvement factor. Chan and Shaw [8] consider two types of hazard rate reduction models after each PM: hazard rate with fixed reduction and hazard rate with proportional reduction. Doyen and Gaudoin [9] propose two types of repair models, the arithmetic reduction of intensity (ARI) model and the arithmetic reduction of age (ARA) model. The ARI model can be further classified into the ARI1 and ARI∞ models. In the ARI1 model, a repair reduces only the relative wear-out since the last repair, while in the ARI∞ model a repair reduces the global wear-out since the system started operating. Lim and Park [10] propose a periodic PM policy in which a PM action reduces the hazard rate of the system, but the effect of PM diminishes as the number of PM actions increases. Recently, Bartholomew-Biggs, Zuo and Li [11] consider two sequential imperfect PM models: in one, each PM action reduces the effective age gained in the PM period just prior to the PM action, and in the other, each PM action reduces the effective age gained since t = 0. In practice, decision makers in a maintenance organization often encounter the problem of selecting the more effective of the following two PM actions. One type of PM action acts on the relative wear-out since the last PM action; this is called the local PM action. The other type is effective in restoring the global wear-out since the equipment started operating; this is called the global PM action. For example, trucks in a transportation company are regularly maintained in two possible ways. One maintenance action is simply to change the engine oil or to replace the belts or gaskets that were changed or replaced at the last PM epoch; the other is to inspect the truck overall and to maintain all faulty parts detected by the inspection. The first maintenance action would reduce the wear-out of the trucks since the last maintenance epoch and can be considered a local PM action, while the second would reduce the global wear-out of the vehicle since the trucks started operating and can therefore be considered a global PM action. If the costs of the two maintenance actions are equal, it is intuitively clear that the global PM action outperforms the local PM action. However, if the global PM action is naturally more costly than the local PM action, which PM action should the decision makers in the maintenance organization select in order to maintain the trucks more economically during their life cycles? In this paper, we address the above question by developing two periodic PM policies which accommodate the two types of PM actions described above. For each PM policy, we derive the expected cost rate per unit time and investigate the optimal PM schedule which minimizes the expected cost rate per unit time. We also compare these two PM policies in terms of their optimal PM schedules, both analytically and numerically. Section 2 describes the periodic PM policies and their assumptions. The expression for the expected cost rate per unit time is obtained for each PM policy. Section 3 discusses the optimal period and the optimal number of PM actions for these two PM policies. In Section 4, we make a comparison of the optimal schedules of these two periodic PM policies. In Section 5, a numerical example is given. The following notation is adopted throughout this paper.
Notation
h(t)         hazard rate without PM action
h_pm(t)      hazard rate with PM action
x            time interval between two successive PM actions
N - 1        number of PM actions before replacement (the system is replaced at the N-th PM)
p            hazard rate reduction factor due to a PM action, 0 ≤ p ≤ 1
C_mr         cost of a minimal repair at failure (both PM policies)
C_pm1        PM cost of the type I PM policy
C_pm2        PM cost of the type II PM policy
C_re         replacement cost (both PM policies)
C_1(x, N)    expected cost rate (per unit time) of the type I PM policy
C_2(x, N)    expected cost rate (per unit time) of the type II PM policy

2
THE PROPOSED PM POLICIES AND THE EXPECTED COST RATES
In this section, we describe two new periodic PM policies and obtain the expected cost rate per unit time for each policy. The two PM policies and the related assumptions are as follows.
1. The system begins to operate at time t = 0.
2. For both the type I and type II PM policies, PM is performed at the periodic times kx (k = 1, 2, ..., N; x ≥ 0) at costs C_pm1 and C_pm2, respectively, and the system is replaced by a new one at the N-th PM. In the type I PM policy, local PM actions are taken such that the hazard rate h(kx+) right after the k-th PM action is reduced to h(kx-) - p[h(kx-) - h((k-1)x+)], where h(kx-) is the hazard rate just prior to the k-th PM action, h((k-1)x+) is the hazard rate right after the (k-1)-st PM action and 0 ≤ p ≤ 1. In the type II PM policy, global PM actions are taken such that the hazard rate h(kx+) right after the k-th PM action is reduced to h(kx-) - p h(kx-).
3. The system undergoes minimal repair at failures between PM actions.
4. The PM cost of the type II PM policy, C_pm2, is greater than or equal to that of the type I PM policy, C_pm1.
5. Repairs and PM actions take negligible time.
6. h(t) is a differentiable, strictly increasing and convex function.
7. h(0) = 0.
It is noted that a PM action affects the relative wear-out since the last PM action in the type I PM policy, while a PM action reduces the hazard rate by an amount proportional to the current hazard rate in the type II PM policy. It is also noted that the wear-out speed after each PM action is the same as that just before the PM action. The local PM action used in the type I PM policy has been investigated by Doyen and Gaudoin [9], while the global PM action used in the type II PM policy has been investigated by Chan and Shaw [8] and Doyen and Gaudoin [9]. More explicitly, the hazard rates under the proposed periodic PM policies are as follows.
1. Type I PM policy:
$$h_{pm}(t) = \begin{cases} h(t), & 0 \le t \le x, \\ h(t) - p\,h(kx), & kx < t \le (k+1)x, \end{cases} \tag{1}$$
2. Type II PM policy:
$$h_{pm}(t) = \begin{cases} h(t), & 0 \le t \le x, \\ h(t) - p\sum_{j=0}^{k-1}(1-p)^{j}\,h((k-j)x), & kx < t \le (k+1)x, \end{cases} \tag{2}$$
for 0 ≤ p ≤ 1, h_pm(0) = h(0) and k = 1, 2, ..., N - 1.
Figure 1 shows the hazard rates of systems under the two PM policies. Since a PM action under the type II PM policy has a greater effect on the wear-out of the system than one under the type I PM policy, the system under the type II PM policy naturally deteriorates more slowly than the system under the type I PM policy.
Figure 1. Hazard rates of systems with the type I PM policy and the type II PM policy

To derive the formula for the expected cost rate, we use the well-known fact that the number of minimal repairs between the (k-1)-st and the k-th PM action follows a nonhomogeneous Poisson process (NHPP) with intensity function h_pm(t) (Fontenot and Proschan [12]). Since the life cycle of the system equals Nx and the total cost of maintaining the system is the sum of the costs for PM actions, minimal repairs and replacement, the expected cost rate per unit time during the life cycle is obtained as follows.
1. Type I PM policy:
$$C_1(x, N) = \frac{1}{Nx}\left\{ C_{mr}\left[ H(Nx) - px\sum_{k=0}^{N-1} h(kx) \right] + (N-1)C_{pm1} + C_{re} \right\} \tag{3}$$
2. Type II PM policy:
$$C_2(x, N) = \frac{1}{Nx}\left\{ C_{mr}\left[ H(Nx) - px\sum_{k=0}^{N-1}\sum_{j=0}^{k-1}(1-p)^{j} h((k-j)x) \right] + (N-1)C_{pm2} + C_{re} \right\} \tag{4}$$
where $H(t) = \int_0^t h(u)\,du$.
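As a rough numerical illustration of equations (3) and (4), the sketch below evaluates both cost rates for a Weibull-type hazard with λ = 1 (so h(t) = βt^(β-1)), the form used in the numerical example of Section 5; the function and parameter names are illustrative only. With x = 0.8, N = 13 and p = 0.1 it returns a type I cost rate close to the corresponding entry of Table 1.

```python
# Minimal sketch of equations (3) and (4); hazard and cost values follow the
# illustrative example of Section 5 (lambda = 1), all names are hypothetical.
import numpy as np

def expected_cost_rates(x, N, p, beta=3.0, C_mr=1.0, C_pm1=100.0, C_pm2=100.0, C_re=2000.0):
    """Return (C1, C2), the expected cost rates per unit time of the two policies."""
    h = lambda t: beta * t ** (beta - 1)      # hazard rate without PM
    H = lambda t: t ** beta                   # cumulative hazard (integral of h)
    k = np.arange(N)                          # k = 0, 1, ..., N-1

    # Type I (local PM): reduction p*x*sum_k h(kx), cf. equation (3); h(0) = 0
    red1 = p * x * np.sum(h(k * x)[1:])
    C1 = (C_mr * (H(N * x) - red1) + (N - 1) * C_pm1 + C_re) / (N * x)

    # Type II (global PM): geometric carry-over of earlier reductions, cf. equation (4)
    red2 = 0.0
    for kk in range(1, N):
        j = np.arange(kk)
        red2 += np.sum((1 - p) ** j * h((kk - j) * x))
    red2 *= p * x
    C2 = (C_mr * (H(N * x) - red2) + (N - 1) * C_pm2 + C_re) / (N * x)
    return C1, C2

print(expected_cost_rates(x=0.8, N=13, p=0.1))
```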
3
OPTIMAL SCHEDULES FOR THE PERIODIC PM POLICIES
We use the conditions provided by Nakagawa [13] to investigate the optimal number of PM actions, N*, which minimizes the expected cost rate per unit time.
3.1 Type I PM policy

In order to show the existence and uniqueness of the optimal N* which minimizes C_1(x, N), we rewrite the expected cost rate as follows:
$$C_1(x,N) = \frac{1}{Nx}\left[ C_{mr}\int_0^{Nx} h_{pm}(t)\,dt + (N-1)C_{pm1} + C_{re} \right]
= \frac{1}{Nx}\left[ C_{mr}\sum_{k=1}^{N}\int_{(k-1)x}^{kx}\bigl(h(t) - p\,h((k-1)x)\bigr)dt + (N-1)C_{pm1} + C_{re} \right]
= \frac{1}{Nx}\left[ C_{mr}\sum_{k=1}^{N}\int_{0}^{x}\bigl[h(u+(k-1)x) - p\,h((k-1)x)\bigr]du + (N-1)C_{pm1} + C_{re} \right]. \tag{5}$$
Let $r_k(t) = h(t + (k-1)x) - p\,h((k-1)x)$. Then
$$C_1(x,N) = \frac{1}{Nx}\left[ C_{mr}\sum_{k=1}^{N}\int_0^x r_k(t)\,dt + (N-1)C_{pm1} + C_{re} \right]. \tag{6}$$
Nakagawa [13] shows that sufficient conditions for equation (6) to have an optimal N* are (i) r_k(t) is increasing in k, and (ii) r_N(t) → ∞ as N → ∞ for all t ∈ (0, x). The following theorem shows that r_k(t) in equation (6) satisfies these conditions.
Theorem 1. Suppose that 0 ≤ p < 1. If h(t) is strictly increasing and convex in t ≥ 0, then there exists a finite and unique N_1* which minimizes the expected cost rate in equation (3) for a given x > 0.
Proof. For any t ∈ (0, x) and k,
$$r_{k+1}(t) - r_k(t) = h(t+kx) - p\,h(kx) - [h(t+(k-1)x) - p\,h((k-1)x)]
= [h(t+kx) - h(t+(k-1)x)] - p[h(kx) - h((k-1)x)] > (1-p)[h(kx) - h((k-1)x)] \ge 0.$$
The inequality holds since h(t) is strictly increasing and convex in t ≥ 0. Moreover,
$$r_N(t) = h(t+(N-1)x) - p\,h((N-1)x) > (1-p)\,h((N-1)x)$$
tends to infinity as N → ∞, since h(t) is convex and strictly increasing to infinity as t goes to infinity.
Hence, according to Nakagawa [13], there exists a finite and unique N_1* which minimizes C_1(x, N) for a given x > 0.
3.2 Type II PM policy

By the same technique as in Section 3.1, we can show the existence and uniqueness of the optimal number of PM actions, N_2*, which minimizes C_2(x, N). This is summarized in the following theorem.
Theorem 2. If h(t) is strictly increasing in t ≥ 0, then there exists a finite and unique N_2* which minimizes the expected cost rate in equation (4) for any x > 0.
4
COMPARISON OF THE OPTIMAL NUMBER OF PM ACTIONS OF TWO PROPOSED PM POLICIES
As mentioned earlier, if the PM costs of the two policies are the same, it is intuitively clear that the type II PM policy outperforms the type I PM policy; this can also be shown analytically. When the two PM costs are not equal, it cannot be guaranteed that the type II PM policy outperforms the type I PM policy. In this section, we investigate the relationship between the optimal numbers of PM actions of the two policies, considering only the case in which the PM cost of the type II PM policy is higher than that of the type I PM policy. Let N_1* and N_2* be the optimal numbers of PM actions of the type I and type II PM policies, respectively. Then N_1* and N_2* are the values of N which satisfy the following inequalities, respectively (see Lim and Park [10] for more details):
$$C_1(x, N+1) - C_1(x, N) \ge 0 \quad\text{and}\quad C_1(x, N-1) - C_1(x, N) > 0, \tag{7}$$
and
$$C_2(x, N+1) - C_2(x, N) \ge 0 \quad\text{and}\quad C_2(x, N-1) - C_2(x, N) > 0. \tag{8}$$
It can be shown algebraically that $C_1(x, N+1) - C_1(x, N) \ge 0$ and $C_1(x, N-1) - C_1(x, N) > 0$ imply
$$N\int_0^x r_{N+1}(t)\,dt - \sum_{k=1}^{N}\int_0^x r_k(t)\,dt \ge \frac{C_{re} - C_{pm1}}{C_{mr}} \tag{9}$$
and
$$(N-1)\int_0^x r_{N}(t)\,dt - \sum_{k=1}^{N-1}\int_0^x r_k(t)\,dt < \frac{C_{re} - C_{pm1}}{C_{mr}}, \tag{10}$$
respectively. Let
$$L_1(x,N) = N\int_0^x r_{N+1}(t)\,dt - \sum_{k=1}^{N}\int_0^x r_k(t)\,dt = \sum_{k=1}^{N}\int_0^x \bigl[r_{N+1}(t) - r_k(t)\bigr]dt, \tag{11}$$
where $r_k(t) = h(t + (k-1)x) - p\,h((k-1)x)$. Then N_1* is the value of N satisfying
$$L_1(x, N) \ge \frac{C_{re} - C_{pm1}}{C_{mr}} \quad\text{and}\quad L_1(x, N-1) < \frac{C_{re} - C_{pm1}}{C_{mr}}. \tag{12}$$
Analogously, $C_2(x, N+1) - C_2(x, N) \ge 0$ and $C_2(x, N-1) - C_2(x, N) > 0$ imply
$$L_2(x, N) \ge \frac{C_{re} - C_{pm2}}{C_{mr}} \quad\text{and}\quad L_2(x, N-1) < \frac{C_{re} - C_{pm2}}{C_{mr}}, \tag{13}$$
where
$$L_2(x,N) = N\int_0^x \bar r_{N+1}(t)\,dt - \sum_{k=1}^{N}\int_0^x \bar r_k(t)\,dt = \sum_{k=1}^{N}\int_0^x \bigl[\bar r_{N+1}(t) - \bar r_k(t)\bigr]dt \tag{14}$$
and
$$\bar r_k(t) = h(t + (k-1)x) - p\sum_{j=0}^{k-2}(1-p)^{j} h((k-j-1)x). \tag{15}$$
Then N_2* is the value of N satisfying equation (13).
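The sketch below illustrates how N_1* and N_2* can be obtained from the threshold conditions (12) and (13), again using the illustrative Weibull hazard h(t) = βt^(β-1) (λ = 1) of Section 5; the helper names are hypothetical and not part of the paper.

```python
# Search for the optimal numbers of PM actions via the thresholds of (12) and (13),
# assuming the illustrative hazard h(t) = beta * t**(beta-1).
from scipy.integrate import quad

def h(t, beta=3.0):
    return beta * t ** (beta - 1)

def r(k, t, x, p):                       # r_k(t), type I policy
    return h(t + (k - 1) * x) - p * h((k - 1) * x)

def r_bar(k, t, x, p):                   # r-bar_k(t), type II policy, equation (15)
    red = p * sum((1 - p) ** j * h((k - j - 1) * x) for j in range(k - 1))
    return h(t + (k - 1) * x) - red

def L(x, N, p, rk):                      # L(x, N) = sum_k int_0^x [r_{N+1}(t) - r_k(t)] dt
    return sum(quad(lambda t: rk(N + 1, t, x, p) - rk(k, t, x, p), 0.0, x)[0]
               for k in range(1, N + 1))

def optimal_N(x, p, C_re, C_pm, C_mr, rk, N_max=200):
    threshold = (C_re - C_pm) / C_mr
    for N in range(1, N_max):
        if L(x, N, p, rk) >= threshold:  # first crossing of the threshold, cf. (12)/(13)
            return N
    return None

x, p = 0.8, 0.5
print(optimal_N(x, p, 2000.0, 100.0, 1.0, r))        # N1*
print(optimal_N(x, p, 2000.0, 100.0, 1.0, r_bar))    # N2* with equal PM costs (m = 1)
```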
Figure 2. Optimal numbers of PM actions in the type I and II PM policies and their comparison
Figure 2 shows the typical pattern of L_1(x, N) and L_2(x, N) when the initial hazard rate of the system is strictly increasing. It is clear from Figure 2 that if the costs for PM actions in the type I and type II PM policies are the same, N_2* is greater than N_1* for any given x > 0. That is, the system under the type II PM policy can be used longer than the system under the type I PM policy before it has to be replaced by a new one. When the cost of a PM action in the type II PM policy is higher than that in the type I PM policy, it is not guaranteed that N_2* is greater than or equal to N_1*; in some cases, N_2* can be smaller than N_1* for a given x > 0, as shown in Figure 2. These results are summarized as follows.

Remark 1. Suppose that h(t) is strictly increasing in t ≥ 0 and x > 0 is given. Then
(a) if C_pm1 = C_pm2, then N_2* is greater than or equal to N_1*;
(b) when C_pm1 < C_pm2,
  (b-1) if L_2(x, N_1*) < (C_re - C_pm2)/C_mr, then N_2* is greater than or equal to N_1*,
  (b-2) if L_2(x, N_1* - 1) ≥ (C_re - C_pm2)/C_mr, then N_2* is smaller than or equal to N_1*, and
  (b-3) if L_2(x, N_1* - 1) < (C_re - C_pm2)/C_mr and L_2(x, N_1*) ≥ (C_re - C_pm2)/C_mr, then N_2* is equal to N_1*.
5
QUANTITATIVE ANALYSIS
In order to perform a quantitative investigation of the PM schedules under the two PM policies, we consider a Weibull distribution with scale parameter λ and shape parameter β, whose hazard rate is h(t) = βλ^β t^(β-1) for λ > 0, β > 0 and t ≥ 0. We assume that β > 2 and λ = 1, so that the hazard rate h(t) is strictly increasing and convex for t ≥ 0. The cost structure is assumed to be as follows.
• Cost of replacement: C_re = 2000
• Cost of a minimal repair: C_mr = 1.0
• Cost of a PM action in the type I PM policy: C_pm1 = 100
• Cost of a PM action in the type II PM policy: C_pm2 = m C_pm1, m ≥ 1.
Table 1 shows the optimal number of PM actions and the corresponding expected cost rate for both PM policies, for various combinations of C_pm2 and p with x = 0.8. It is interesting to note that as p, which represents the effect of a PM action, increases, the optimal number of PM actions increases and the expected cost rate decreases; in other words, the better the PM effect, the greater the optimal number of PM actions before replacement. Table 1 also shows that when the PM costs of both policies are equal, N_1* is smaller than N_2*. This is natural, since a PM action affects only the relative wear-out since the last PM action in the type I PM policy, whereas it affects the global wear-out in the type II PM policy. It is also observed that as the PM cost of the type II PM policy increases, N_2* decreases and C_2(x, N_2*) increases; hence it is not beneficial to adopt the type II PM policy when its PM cost is very high.
Table 1. Optimal number of PM actions and corresponding expected cost rate when b = 3 and x = 0.8.

         type I PM               type II PM, m = 1       type II PM, m = 5       type II PM, m = 9
 p       N1*   C1(x,N1*)         N2*   C2(x,N2*)         N2*   C2(x,N2*)         N2*   C2(x,N2*)
0.1      13    406.252           15    384.828           13    848.789           12    1308.87
0.2      13    396.652           18    354.199           16    824.300           14    1290.65
0.3      14    386.387           21    326.176           19    801.306           16    1273.06
0.4      14    375.155           25    302.432           22    781.296           19    1257.12
0.5      15    362.373           29    282.814           26    764.438           22    1243.38
0.6      16    347.998           33    266.454           29    750.200           25    1231.61
0.7      18    331.024           37    252.543           32    738.042           28    1221.43
0.8      20    310.054           41    240.469           36    727.428           31    1212.50
0.9      24    281.270           45    229.790           40    718.022           34    1204.55
The value of p in both policies represents the effectiveness of a PM action: the larger p is, the more effective the PM action. Figure 3 shows the optimal number of PM actions and the corresponding expected cost rate for several values of p when the PM period is 0.8. The optimal number of PM actions increases and the expected cost rate decreases as p increases. In terms of the time to replacement, Figure 3 also shows that under both PM policies the system operates for a longer time when p is larger. It should be noted that an increase in p has more impact on the type II PM policy than on the type I PM policy, in terms of both the optimal PM period and the optimal number of PM actions.
Figure 3. Optimal number of PM actions and the expected cost rate for given x = 0.8: (a) type I PM policy; (b) type II PM policy with m = 5.
6
SUMMARY & CONCLUSION
A large number of PM models have been proposed and studied in the literature. Based on these models, PM policies have been developed to maintain the system preventively while it is operating and thus to prolong its lifetime by reducing the hazard rate. In this paper, we consider two periodic PM policies, the type I PM policy and the type II PM policy. Each PM action in the type II PM policy affects the global deterioration of the system, whereas each PM action in the type I PM policy reduces the hazard rate that has accumulated since the last PM. Under each policy, the system is preventively maintained at the periodic times x, 2x, ..., Nx and is replaced by a new system at the N-th PM action; after the k-th PM action the hazard rate is reduced according to the corresponding PM model. The proposed PM policies are thus PM policies with an improvement factor. For given cost structures for PM actions, minimal repair and replacement, we derive formulas for the expected cost rate per unit time over the life cycle of each policy and determine the optimal PM schedules that minimize the expected cost rates. We show that the optimal schedules of both PM policies exist and are unique. We also compare the optimal numbers of PM actions of the two policies analytically and show that the type II PM policy outperforms the type I PM policy when the PM costs are equal. It is noted that the optimal numbers of PM actions of both policies coincide when p = 1 and when p = 0: when p = 1, every PM action restores the system to as good as new, so the effects of the PM actions in both policies are the same; when p = 0, the system after a PM action returns to the state just prior to it, so the effects are again the same. Numerical studies show that the type II PM policy outperforms the type I PM policy when the PM costs are equal. When the PM cost of the type I PM policy is less than that of the type II PM policy, the type I PM policy is more cost effective, while the type II PM policy results in a longer replacement cycle. For both policies, it is observed that as the hazard rate reduction factor p increases, the system can operate over a longer replacement cycle and the expected cost rate decreases.
7
REFERENCES
1 Barlow R & Hunter L. (1960) Optimum preventive maintenance policies. Operations Research, 8, 90-100.
2 Pham H & Wang H. (1996) Imperfect maintenance. European Journal of Operational Research, 94, 425-438.
3 Wang H. (2002) A survey of maintenance policies of deteriorating systems. European Journal of Operational Research, 139, 469-489.
4 Malik M. (1979) Reliable preventive maintenance policy. AIIE Transactions, 11, 221-228.
5 Lie CH & Chun YH. (1986) An algorithm for preventive maintenance policy. IEEE Transactions on Reliability, 35, 71-75.
6 Canfield RV. (1986) Cost optimization of periodic preventive maintenance. IEEE Transactions on Reliability, 35, 78-81.
7 Jayabalan V & Chaudhuri D. (1992) Cost optimization of maintenance scheduling for a system with assured reliability. IEEE Transactions on Reliability, 41, 21-26.
8 Chan J & Shaw L. (1993) Modeling repairable systems with failure rates that depend on age and maintenance. IEEE Transactions on Reliability, 42, 566-571.
9 Doyen L & Gaudoin O. (2004) Classes of imperfect repair models based on reduction of failure intensity or virtual age. Reliability Engineering & System Safety, 84, 45-56.
10 Lim JH & Park D. (2007) Optimal periodic preventive maintenance schedules with improvement factors depending on number of preventive maintenances. Asia-Pacific Journal of Operational Research, 24, 111-124.
11 Bartholomew-Biggs M, Zuo MJ & Li XM. (2009) Modelling and optimizing sequential imperfect preventive maintenance. Reliability Engineering & System Safety, 94, 53-62.
12 Fontenot RA & Proschan F. (1984) Some imperfect maintenance models. In Abdel-Hameed MS, Cinlar E & Quinn J (Eds) Reliability Theory and Models, Academic Press, San Diego.
13 Nakagawa T. (1986) Periodic and sequential preventive maintenance policies. Journal of Applied Probability, 23, 536-542.
Acknowledgment This work was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
OPTIMAL REPLACEMENT DECISION MAKING USING STOCHASTIC FILTERING PROCESS TO MAXIMIZE LONG RUN AVERAGE AVAILABILITY
Tahmina F. Lipi a, Ming J. Zuo a,1 and Jai-Hak Lim b
a University of Alberta, Edmonton, Alberta, T6G 2G8, Canada
b Hanbat National University
Optimal replacement decision making based on condition information is important for maintenance managers. In this study, we address the age replacement decision problem using the history of condition information to maximize the long run average availability. A model is developed to determine the optimum replacement time by maximizing the long run average availability using the residual life time distribution, which is estimated from the history of condition information by a stochastic filtering process. Using this availability-based age replacement model, a relationship between the optimum long run average availability and the level of maintenance efficiency is developed, which can help maintenance managers select the appropriate maintenance crew to ensure a target level of long run average availability.
Key Words: optimal age replacement time, availability, stochastic filtering process (SFP), maintenance efficiency
1
INTRODUCTION
To reduce unplanned shutdown time (i.e. production loss) and the cost of corrective maintenance, the recent focus has been on condition based maintenance, in which maintenance actions are taken on the basis of condition information rather than traditional failure-based or time-based maintenance. In condition based maintenance decision making, selecting the optimal replacement time is a crucial task for maintenance managers. This decision involves multiple criteria, among which minimization of the long run average cost, maximization of the long run average availability, and minimization of risk are the most important. Among the prior research on this problem, Sheu et al. (1999) derived an expression for the long run average cost under the age replacement policy originally studied by Barlow and Hunter (1960), using the initial life time distribution without considering the current health condition of the machine. As research has shifted towards predicting the life time distribution from condition information, it is preferable to use the updated life time distribution to determine the optimum replacement time. Jardine (2007), Jardine et al. (2006) and Makis et al. (1992) used the proportional hazards model (PHM) to predict the hazard rate function from condition information and determined the optimum hazard level for replacement decisions. Li et al. (2007) used a proportional hazards model based life time distribution to find the optimum replacement time, with the objective of maximizing the long run average availability of the component. Age replacement with the proportional hazards model is an extended version of the age replacement model in which the hazard is expressed as a function of condition information and age (Scarf, 2007). However, according to Wang et al. (2005), a problem with the proportional hazards model is that it uses only the current observation without considering the history of the health condition. Wang (2002) used a stochastic filtering process (SFP) to predict the residual life distribution from the condition information history and proposed a method to find the optimum replacement time that minimizes the long run average cost. Age replacement decisions based on the history of condition information to maximize the long run average availability have not yet been addressed. This research aims to develop a model to find the optimum age replacement time by maximizing the long run average availability with the residual life time distribution. To incorporate the history of condition information, a stochastic filtering process (SFP) is adopted to generate the life time distribution. Finally, for a specific health condition of a machine, a relationship between the optimum long run average availability and the level of maintenance efficiency is developed. This relationship may be used to help maintenance managers select the appropriate maintenance crew to ensure a target level of long run average availability for a machine.
1 The corresponding author: [email protected].
2
REPLACEMENT DECISION MODEL
To develop the replacement decision model, the age replacement policy is integrated with the residual life time distribution to maximize the long run average availability. In the following two sections we discuss the age replacement availability model and the stochastic filtering process used to estimate the residual life time distribution from the history of condition information.

2.1 The age replacement availability model

In age replacement, equipment is replaced preventively when it reaches age T, or correctively if it fails before T, given that preventive replacement is less expensive than failure replacement (see Figure 1). The equipment is assumed to return to the "as good as new" state after either preventive or failure replacement.
Figure 1. Age replacement policy

The age replacement availability model tries to find the right time to replace the equipment in order to maximize the long run average availability. To incorporate the history of condition information into the traditional age replacement model, an age replacement availability model based on the history of condition information is established; it can improve maintenance decision making when the availability of the component is the primary concern.
$$\text{Long run average availability} = \frac{\text{Expected total uptime}}{\text{Expected total uptime} + \text{Expected total downtime}} \tag{1}$$
To calculate the expected total uptime and expected total downtime, the residual life time distribution is used in this model; to integrate the history of condition information into its estimation, the stochastic filtering process (SFP) of Wang (2002) is adopted.

Notation used in the model:
A(T)          long run average availability
y_i           condition information obtained at time t_i, a scalar
x_i           residual (delay) time at time t_i
Y_i = {y_i, y_{i-1}, ..., y_1}   condition information history at monitoring point i
t_i           time at monitoring point i
T*            optimum replacement time
T_p           average time needed to complete a preventive replacement
T_f           average time needed to complete a failure replacement
f_i(x_i | Y_i)   residual life time distribution at monitoring point i

According to Figure 2, the expected total uptime and the expected total downtime can be calculated with the following equations:
$$\text{Expected total uptime} = t_i + \int_0^{T-t_i} z\, f_i(z \mid Y_i)\,dz + (T - t_i)\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right)$$
$$\text{Expected total downtime} = T_p\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right) + T_f \int_0^{T-t_i} f_i(z \mid Y_i)\,dz$$

Figure 2. Age replacement policy with condition based life time distribution

Plugging the expressions for the expected total uptime and downtime into equation (1), the long run average availability model based on the residual life distribution becomes
$$A(T) = \frac{t_i + \int_0^{T-t_i} z\, f_i(z \mid Y_i)\,dz + (T - t_i)\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right)}{t_i + \int_0^{T-t_i} z\, f_i(z \mid Y_i)\,dz + (T - t_i)\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right) + T_p\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right) + T_f\int_0^{T-t_i} f_i(z \mid Y_i)\,dz}$$
$$= \frac{t_i + \int_0^{T-t_i} z\, f_i(z \mid Y_i)\,dz + (T - t_i)\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right)}{t_i + \int_0^{T-t_i} (z + T_f)\, f_i(z \mid Y_i)\,dz + (T - t_i + T_p)\left(1 - \int_0^{T-t_i} f_i(z \mid Y_i)\,dz\right)} \tag{2}$$
2.2 Stochastic Filtering Process (SFP)

To estimate the residual life time distribution f_i(x_i | Y_i) from the history of condition information, this paper adopts the stochastic filtering process of Wang (2002). The stochastic filtering process establishes a relationship between the history of condition information and the residual life time distribution through conditional probability. The relationship can be expressed with the following equation:
$$f_i(x_i \mid Y_i) = \frac{(x_i + t_i)^{\beta-1}\, e^{-(\alpha(x_i + t_i))^{\beta}} \prod_{k=1}^{i} e^{-\left(y_k\left(A + B e^{-C(x_i + t_i - t_k)}\right)^{-1}\right)^{\eta}}}{\int_0^{\infty} (z + t_i)^{\beta-1}\, e^{-(\alpha(z + t_i))^{\beta}} \prod_{k=1}^{i} e^{-\left(y_k\left(A + B e^{-C(z + t_i - t_k)}\right)^{-1}\right)^{\eta}}\,dz} \tag{3}$$
where A, B, C, α and β are model parameters that can be estimated from the condition information data. For details on this method, see Wang (2002).
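The sketch below illustrates the filtering idea numerically: a Weibull-type prior of the total life is weighted by the likelihood of the observed condition readings and normalised on a grid. The prior and likelihood forms, and all parameter values, are illustrative assumptions rather than the fitted model of Wang (2002).

```python
# Schematic residual-life density by prior-times-likelihood weighting on a grid;
# all forms and parameter values below are assumptions for illustration only.
import numpy as np

def residual_life_pdf(t_i, t_obs, y_obs, alpha=0.002, beta=2.5, A=1.0, B=20.0, C=0.01, eta=2.0):
    z = np.linspace(0.0, 2000.0, 4001)                   # candidate residual lives (hours)
    life = z + t_i
    prior = life ** (beta - 1) * np.exp(-(alpha * life) ** beta)
    like = np.ones_like(z)
    for tk, yk in zip(t_obs, y_obs):                     # condition-monitoring history
        scale = A + B * np.exp(-C * (life - tk))         # assumed signal / residual-life link
        like *= np.exp(-(yk / scale) ** eta)
    post = prior * like
    post /= np.trapz(post, z)                            # normalise numerically
    return z, post

z, f = residual_life_pdf(t_i=940.0, t_obs=[800.0, 900.0, 940.0], y_obs=[5.0, 9.0, 14.0])
print(z[np.argmax(f)])                                   # mode of the residual-life density
```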
3
CASE STUDY
The developed model is applied to the vibration data of the six bearings reported in Wang (2002) (see Figure 3) to determine the optimum replacement time T* that maximizes the long run average availability. With the model parameters and condition information history given in Wang (2002), the long run average availability can be calculated with equation (2), assuming that the average time needed to complete a preventive replacement (T_p) is 10 hours and the average time needed to complete a failure replacement (T_f) is 50 hours. For the monitoring time point of 940 hours, the long run average availability of bearing 6 is plotted against the replacement time in Figure 4, from which the optimum replacement time is 945 hours, achieving a maximum availability of 0.9895.
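A minimal sketch of this optimisation, assuming an arbitrary placeholder residual-life density rather than the bearing data: it evaluates equation (2) over a grid of candidate replacement times and returns the maximiser.

```python
# Availability-based age replacement, cf. equation (2); density and numbers are placeholders.
import numpy as np

def availability(T, t_i, z, f, Tp=10.0, Tf=50.0):
    mask = z <= (T - t_i)
    F = np.trapz(f[mask], z[mask]) if mask.any() else 0.0       # P(residual life <= T - t_i)
    up = t_i + np.trapz(z[mask] * f[mask], z[mask]) + (T - t_i) * (1.0 - F)
    down = Tp * (1.0 - F) + Tf * F
    return up / (up + down)

t_i = 940.0
z = np.linspace(0.0, 300.0, 3001)                               # residual-life grid (hours)
f = np.exp(-0.5 * ((z - 20.0) / 8.0) ** 2); f /= np.trapz(f, z) # placeholder density
Ts = np.linspace(t_i + 0.5, t_i + 100.0, 400)
A = np.array([availability(T, t_i, z, f) for T in Ts])
print(Ts[np.argmax(A)], A.max())                                # T* and A(T*)
```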
Figure 3. Vibration data of six bearings
Figure 4. Optimum replacement time by maximizing long run average availability
4
MAINTENANCE EFFICIENCY AND OPTIMUM AVAILABILITY

In the residual life time based availability model, the optimum long run average availability depends on the average time needed to complete a preventive replacement (T_p) and the average time needed to complete a failure replacement (T_f) for a given level of health condition. Both T_p and T_f depend on the level of maintenance efficiency deployed: a highly efficient maintenance crew takes less time to perform the maintenance task, but hiring highly efficient maintenance personnel costs more. To make this decision, the maintenance manager needs a framework for selecting the maintenance personnel that achieve a target level of availability.
According to Jacobs et al. (2008), efficiency is defined as the ratio of actual output to standard output, which leads to the following replacement efficiency:
$$\text{Replacement efficiency} = \frac{\text{Standard time taken to perform the specific replacement operation}}{\text{Actual time taken to perform the specific replacement operation}}$$
Figure 5 shows the variation of the long run average availability with the replacement time for different levels of replacement efficiency. For a specific level of maintenance efficiency, it is clearly seen from Figure 5 that there is an optimum replacement time for which the maximum long run average availability is achieved.
Figure 5. Variation of long run average availability for different maintenance efficiency.
It is also observed from Figure 5 that a higher long run average availability can be achieved for a higher maintenance efficiency. To establish this relationship, the optimum long run average availability is plotted against the level of replacement efficiency in Figure 6.
Figure 6. Long run average optimum availability for different level of maintenance efficiency.
Figure 6 helps the maintenance manager decide which level of maintenance efficiency needs to be deployed to achieve a certain level of availability for a machine with a specific health condition. Suppose that, at the current monitoring time of 940 hours, the maintenance manager plans to ensure at least 98.50% availability to meet the production target; then a maintenance crew with at least 70% efficiency needs to be deployed.
5
CONCLUSIONS
The model reported in this paper helps the maintenance manager select the optimum replacement time that maximizes the long run average availability. When machine availability is important for meeting a target level of production, the model supports a realistic replacement decision because it incorporates the history of condition information. It also helps the maintenance manager select the appropriate level of maintenance efficiency to ensure a given level of long run average availability. Further work could address the selection of the optimum replacement time under multiple criteria, such as maximizing the long run average availability and minimizing the long run average cost simultaneously.
6
REFERENCES
1 Barlow, R. E., Hunter, L. C. (1960) Optimum preventive maintenance policies. Operations Research, 8, 90-100.
2 Jardine, A. K. S. (2007) Optimizing condition monitoring decisions for maintenance planning. IEEE Tutorial on Asset Management - Maintenance and Replacement Strategies, 24-28 June 2007, Tampa, USA.
3 Jardine, A. K. S., Tsang, A. H. C. (2006) Maintenance, Replacement, and Reliability: Theory and Applications. CRC Press, Taylor and Francis.
4 Li, C., Xuhua, C., Yongsheng, B., Zhonghua, C. (2007) Age replacement model based on condition information. The Eighth International Conference on Electronic Measurement and Instruments (ICEMI), 2, 468-471.
5 Makis, V., Jardine, A. K. S. (1992) Optimal replacement in the proportional hazards model. INFOR, 30, 172-183.
6 Scarf, P. A. (2007) A framework for condition monitoring and condition based maintenance. Quality Technology & Quantitative Management, 4(2), 301-312.
7 Sheu, S.-H., Yeh, R. H., Lin, Y.-B., Juang, M.-G. (1999) A Bayesian perspective on age replacement with minimal repair. Reliability Engineering and System Safety, 65, 55-64.
8 Wang, W. (2002) A model to predict residual life of rolling element bearings given monitored condition information to date. IMA Journal of Management Mathematics, 13, 3-16.
9 Wang, W., Zhang, W. (2005) A model to predict the residual life of aircraft engines based on oil analysis data. Naval Research Logistics, 52, 276-284.
10 Jacobs, F. R., Chase, R. B., Aquilano, N. J. (2008) Operations & Supply Management. 12th Edition, McGraw-Hill, Boston.
Acknowledgment This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
EDF'S PLANTS MONITORING THROUGH EMPIRICAL MODELLING: PERFORMANCE ASSESSMENT AND OPTIMIZATION
Seraoui R., Chevalier R. and Provost D.
Electricité de France (EDF R&D), 6 quai Watier, BP 49, 78401 Chatou, France. Contact: [email protected]
EDF Group (Electricité de France) is one of Europe's leading energy players. With an installed capacity of 127 GW (mainly nuclear, fossil and hydraulic), EDF originally operated in France but is nowadays also a major company in neighbouring countries such as Germany, Italy and Great Britain. Most EDF power plant components are either permanently or periodically monitored by specific techniques such as vibration analysis, acoustic analysis, thermal imaging or oil analysis, as a form of Equipment Condition Monitoring. Today, condition monitoring based on statistical modelling algorithms using process data is becoming more prevalent in the power, chemical and aerospace industries. The main interest for EDF is to complement the specific techniques and to improve fault detectability so that there is time to plan maintenance actions. All these statistical methods use the same principle for fault detection: observed data are compared to estimated data, and a difference reveals the presence of an anomaly, which can be related either to equipment or to instrumentation. The EDF R&D division has evaluated off-the-shelf monitoring tools that embed such methods for its internal clients, and has implemented auto associative kernel regression (AAKR) and the evolving clustering method (ECM) on a Matlab platform. Up to now, the design of industrial monitoring tools has centred on ease of use, allowing the user to access complex analysis methods without great effort. However, for proper engineering modelling, greater attention must be paid to the issue of parameter tuning. For that purpose, the R&D division is writing a guideline addressing the main issues entailed by learning-based modelling in a fleet-wide monitoring context. The paper first gives an overview of the theoretical principles behind AAKR and ECM, then presents the usual criteria for quantifying three important aspects of model performance: accuracy, robustness and the ability to isolate faults, called spill-over. The performance of the two methods is compared for simulated faults on real plant data. Secondly, the sensitivity of these methods is investigated, particularly with respect to the type of kernel, the maximum cluster radius, the number of samples in the training set, the sensor weighting in the distance function and the number of signals in the model. Finally, an approach to model performance optimization based on these sensitivity indicators is assessed. This procedure helps to identify the faulty sensors 1 and to adjust their weights in the overall distance, and thus to avoid spill-over and increase the model robustness.
Key Words: Empirical Modelling, Equipment Condition Monitoring, Power Plants, Fault Isolation, Robustness.
1 Faulty sensor: a sensor on which a deviation is observed; the deviation can be related either to equipment or to instrumentation.
1
INTRODUCTION
Monitoring complex structures and processes is necessary for fatigue prevention, aided control and condition-based maintenance. The ultimate goal for EDF is to plan appropriate maintenance actions for its power plant components in a timely manner and to cope optimally with the production availability targets and safety requirements of such plants. Many industrial processes rely on physical principles that can be written as differential equations and thus as dynamical models, and the use of physical parameters is mandatory for fault isolation and diagnosis [1]. However, owing to the complexity of the phenomena involved and the highly non-linear interrelationships between the representative variables in a power plant, it is usually difficult to develop analytical models. An attractive alternative is then to resort to empirical models built through a training process based on a set of examples of component behaviour patterns. Condition monitoring of a component can be addressed by developing an empirical model of its behaviour in normal conditions. During operation, the state actually observed in the plant is compared with the state predicted by the model representing the component in normal conditions: a deviation between the measured and predicted values of the system parameters reveals the presence of an abnormal condition, e.g. caused by component or instrumentation faults [2]. The classical methods for condition monitoring, based on the observation of trends of individual signals, are increasingly being replaced by advanced statistical and pattern recognition methods that consider several plant signals simultaneously. In this context, the current activity of EDF R&D is focused on the assessment of the most frequently used monitoring methods embedded in commercial off-the-shelf online monitoring tools. To this end, EDF R&D implemented the auto associative kernel regression method (AAKR) and the evolving clustering method (ECM) on a Matlab platform. The purpose of this paper is to describe the key performance criteria associated with these two pattern recognition methods and to explain the main factors in a performance optimization procedure, so that these tools can be used for fault detection, isolation and diagnosis. The paper is organized as follows: the two implemented methods are introduced from a theoretical point of view in Section 2. Section 3 is devoted to the major performance metrics used to assess an empirical model. In Section 4, model complexity issues are addressed and asymptotic behaviour on test data is used to choose an appropriate model structure. An algorithm to simultaneously optimize a model's robustness and spill-over is described in Section 5. Finally, discussions and conclusions are drawn in Section 6.
2
THEORETICAL OVERVIEW OF THE IMPLEMENTED METHODS
The purpose of this section is to present the theoretical principles of a locally weighted estimation technique, AAKR, and a clustering method, ECM. All the monitoring methods described below use the same principle to detect a fault, namely comparing observed data to estimated data. Prior to the monitoring phase, a model of the normal equipment behaviour must be built during a training phase. Faults can then often be detected as deviations between observed and expected values of the system parameter vector. Two crucial issues are the nature of the deviation, since a difference can be related either to equipment or to instrumentation, and the significance of the observed changes compared with additive signal noise, model uncertainties and changes in the environment of the monitored process [3]. Several detection methods based on the analysis of residuals are commonly used; they can be based on individual sensor thresholds or on an overall deviation indicator.

2.1 Auto Associative Kernel Regression (AAKR)

The principle of the auto associative kernel regression method [4] is to estimate a sample as a weighted sum of the training data. Weighting the data can be viewed as replicating relevant instances and discarding irrelevant ones. Relevance is measured by calculating a distance d(x, x_i) between the query point x and each training data vector x_i. A typical distance function is the Mahalanobis distance:
$$d_M(x_i, x) = \sqrt{\sum_{j=1}^{P} m_j (x_{ij} - x_j)^2} = \sqrt{(x_i - x)^{t} M (x_i - x)} \tag{1}$$
where P is the number of parameters selected in the model and M is the diagonal matrix
$$M = \mathrm{diag}(m_1, \ldots, m_j, \ldots, m_P). \tag{2}$$
This diagonal matrix M takes into account the weighting of each sensor in the overall distance. If the factors m_j are all non-zero, the input parameter space is distorted, which can lead to more accurate predictions. If some of the factors are set to zero, those dimensions are ignored by the distance function, making the local model global in those directions. This point of view will be exploited in Section 5 to optimize the model robustness and spill-over. A weighting function, or kernel K(.), is used to calculate a similarity for each data point from its distance. Figure 1 shows a great variety of possible kernels. The common point between all these functions is that they reach their maximum at zero distance and decay as the distance increases. The Gaussian kernel K(d) = exp(-d²), a typically used weighting function, remains an attractive compromise since it has no discontinuity in its derivative and decreases very quickly away from 0, which makes it behave like a finite-extent kernel.
Figure 1: Possible kernel shapes
Finally, the prediction x̂ is obtained as a weighted average of all the training vectors:
$$\hat{x} = \frac{\sum_i K(d(x_i, x))\, x_i}{\sum_i K(d(x_i, x))} \tag{3}$$
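As a rough illustration of the estimator in equation (3), a minimal sketch with a Gaussian kernel; the data, bandwidth h and sensor weights are made up for illustration and are not part of the paper.

```python
# AAKR sketch: kernel weights from a weighted distance, then a weighted average.
import numpy as np

def aakr_predict(X_train, x_query, m_weights, h=1.0):
    diff = X_train - x_query                       # (n_train, P)
    d2 = (diff ** 2 * m_weights).sum(axis=1)       # squared weighted distance, cf. equation (1)
    w = np.exp(-d2 / h ** 2)                       # Gaussian kernel with assumed bandwidth h
    return (w[:, None] * X_train).sum(axis=0) / w.sum()

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))                # 200 training samples, 3 sensors
x_query = np.array([0.1, -0.2, 0.05])
print(aakr_predict(X_train, x_query, m_weights=np.ones(3), h=0.5))
```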
The training sample weights are directly linked to the degree of similarity between the new sample (for which the output is to be estimated) and the training samples: the estimator emphasizes the points near the query point.

2.2 Evolving Clustering Method (ECM)

This section describes the individual steps of the ECM algorithm presented in references [4] and [5]. The only parameter controlling the clustering process is the maximum cluster radius, defined as a threshold value Dthr. Figure 2 illustrates the influence of the cluster radius threshold in a simple two-dimensional case; note the cluster overlapping that is characteristic of ECM.
Figure 2: Two dimensional clustering with different cluster radius
2.2.1 Training procedure

The ECM algorithm partitions the input space by a distance-based clustering method. Every cluster Cl is described by its centre cl and its elongation (or radius) rl. The largest possible elongation of a cluster is specified by a threshold value Dthr; this radius is the maximum distance from a cluster centre to an example point belonging to that cluster. This means that the actual number of clusters Nc is determined by the threshold Dthr and is not specified by the user. Starting with an empty set of clusters, the monitoring points to be clustered are passed one by one to the ECM algorithm. The first point always gives a cluster with its centre on that point and radius 0. As more points are presented, some existing clusters are updated through new centres and increased radii, and new clusters are created, always with an initial radius of 0. The updating of the clusters is based only on the distances between the current point x and the current cluster centres and radii, meaning that a point simply passes through the model, which is suitable for on-line learning and time series prediction:
- Step 0: Create the first cluster C1 by taking the position of the first sample of the input data stream as the first cluster centre c1, and set its cluster radius r1 to 0.
- Step 1: If all samples of the data stream have been processed, the algorithm is finished. Otherwise, the current sample x is taken and the distances between this sample and all the m already created cluster centres cl are calculated for l = 1, ..., m:
$$D_l = \lVert x - c_l \rVert. \tag{4}$$
- Step 2: If there is a cluster centre c_{l0} such that the distance D_{l0} is less than or equal to the radius r_{l0}, the current sample x is assumed to belong to the cluster with the minimum of these distances,
$$l_0 = \arg\min_{l} \lVert x - c_l \rVert. \tag{5}$$
In this case, neither a new cluster is created nor any existing cluster updated, and the algorithm returns to Step 1; otherwise it goes to the next step.
- Step 3: Find the cluster Ca (with centre ca and radius ra) among all m existing clusters by calculating the values Sl = Dl + rl for l = 1 to m and selecting the cluster centre ca with the minimum value Sa = min Sl.
- Step 4: If Sa is greater than 2Dthr, the sample does not belong to any existing cluster. A new cluster indexed m+1 is created in the same way as in Step 0, and the algorithm returns to Step 1.
- Step 5: If Sa is less than or equal to 2Dthr, the cluster Ca is updated by moving its centre ca and increasing its radius ra. The updated radius is set to Sa/2, and the new centre is located on the line connecting the new input vector and the old centre ca, so that the distance from the new centre to the point x equals the new radius. The algorithm returns to Step 1. In this way, the maximum distance from any cluster centre to the farthest sample belonging to that cluster is kept below the threshold Dthr, although the algorithm does not keep any information about past samples.
2.2.2 Estimation procedure

During the monitoring phase, once the clusters are defined, the expected value chosen for any monitoring sample x is the coordinate of the closest cluster centre. The distance measure used is the normalized Mahalanobis distance defined in the following equation, where the superscript n indicates a normalized vector:
$$d_M^{n}(x^n, y^n) = \sqrt{\sum_j m_j (x_j^n - y_j^n)^2} = \sqrt{(x^n - y^n)^{t} M (x^n - y^n)} \tag{6}$$
If Nc is the total number of clusters, we define
$$\hat{x}^n = c_{k_1} \tag{7}$$
with k_1 the cluster index such that
$$k_1 = \arg\min_{l = 1,\ldots,N_c} d_M^{n}(x^n, c_l). \tag{8}$$
Note: the matrix M represents the influence of each sensor in the overall distance. A sketch combining the training and estimation procedures is given below.
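The following sketch implements the training steps 0-5 and the nearest-centre estimation of equations (7)-(8); for simplicity it uses a plain Euclidean distance and synthetic data, so it is an illustration of the algorithm rather than EDF's tool.

```python
# ECM sketch: distance-based clustering (Steps 0-5) and nearest-centre estimation.
import numpy as np

def ecm_train(X, d_thr):
    centers, radii = [X[0].copy()], [0.0]                      # Step 0
    for x in X[1:]:                                            # Step 1
        D = np.array([np.linalg.norm(x - c) for c in centers])
        if (D <= np.array(radii)).any():                       # Step 2: inside an existing cluster
            continue
        S = D + np.array(radii)                                # Step 3
        a = int(np.argmin(S))
        if S[a] > 2.0 * d_thr:                                 # Step 4: create a new cluster
            centers.append(x.copy()); radii.append(0.0)
        else:                                                  # Step 5: enlarge and move cluster a
            r_new = S[a] / 2.0
            direction = (x - centers[a]) / max(D[a], 1e-12)
            centers[a] = x - r_new * direction                 # new centre at distance r_new from x
            radii[a] = r_new
    return np.array(centers), np.array(radii)

def ecm_estimate(x, centers):
    k1 = int(np.argmin(np.linalg.norm(centers - x, axis=1)))   # equation (8)
    return centers[k1]                                         # equation (7)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
centers, radii = ecm_train(X, d_thr=0.8)
print(len(centers), ecm_estimate(np.array([0.2, -0.1]), centers))
```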
3
EMPIRICAL MODELLING PERFORMANCE CRITERIA
The performance of on-line monitoring models has traditionally been measured in terms of three global criteria: accuracy, robustness and spill-over [6], [7]. Accuracy measures the ability of a model to predict sensor values and is normally related to the mean squared error (MSE) between sensor predictions and measured sensor values. Robustness measures the model's ability to make correct sensor predictions when some sensor values are corrupted by some sort of fault. Finally, spill-over measures the effect a faulty sensor input has on the other sensor predictions. The data used for the assessment of the modelling performance were obtained from a combustion turbine; the measured data are the active and reactive powers and seven different vibration signals. The generalization capability of a training-based algorithm has to be assessed on test data that were not used for the training but are drawn from the same distribution as the training data, i.e. data corresponding globally to the same operating conditions as those seen during training. In our example, 3000 samples represent a period of normal behaviour; a training file containing Nt = 1500 samples was obtained by selecting one measurement out of every two, and the other Nm = 1500 samples are used as validation data for assessing accuracy, robustness and spill-over.

3.1 Accuracy

The accuracy metric we define is the mean squared error (MSE) between the model's predictions and the target values; it compares only the un-faulted predictions with the raw process data. For each sensor j, the criterion A_j is simply the MSE on parameter j:
$$A_j = \frac{1}{N_m + N_t}\sum_{i=1}^{N_m + N_t} \frac{\left(x_{ij} - \hat{x}_{ij}\right)^2}{s_j^{2}} \tag{9}$$
where $\hat{x}_{ij}$ is the model prediction for the i-th sample of the j-th parameter, $x_{ij}$ is the i-th observation of the j-th parameter in the validation data and $s_j$ is the standard deviation of the j-th parameter in the training phase. If a global measure is desired, the global accuracy metric can be defined as
$$A = \frac{1}{P}\sum_{j=1}^{P} A_j. \tag{10}$$
Since this accuracy criterion is actually a measure of error, the lower it is, the more accurate the model. In a more general framework, it can be shown for an estimation technique that MSE = bias² + variance, meaning that the modelling errors have two main origins: the bias, which measures how far the model response is from the true value, and the variance, which measures the sensitivity of the model to a particular sample. For a given model, the bias-variance dilemma states that a model with low bias has large variance and a model with low variance has large bias. The accuracy metric A assesses a model's performance under non-faulty input data. However, since the purpose of empirical modelling is to identify sensor and equipment faults, performance under faulted conditions must also be quantified, which justifies the robustness and spill-over metrics.

3.2 Robustness and Spill-over

The robustness criterion is generally defined as follows: a robust model is one in which a small disturbance on some sensors does not affect the model's parameter predictions; in other words, small changes in the observed signals must lead to small changes in the predicted ones. In the literature, many authors have developed criteria to quantify empirical modelling performance [6], [7]. The general testing process used to calculate these performance indicators is as follows. First, the model's response on un-faulted monitoring data is calculated. Next, each of the parameters is artificially drifted in turn, the empirical models are run and their outputs are stored. The predictions obtained from the faulty monitoring data are compared with the un-faulted ones to determine the model's robustness and spill-over metrics. These metrics therefore involve the following values: the un-faulted prediction x̂, the drifted prediction ŷ, the un-faulted input x, the drifted input y, and the index k of the artificially drifted variable. Robustness can be interpreted as a measure of an empirical model's ability to make correct sensor predictions when the respective sensor value is incorrect due to some sort of fault. Using the definitions below, the robustness for the drifted sensor k is given by:
$$R_k = \frac{1}{N_m}\sum_{i=N_t+1}^{N_m+N_t} \frac{\hat{y}_{ik} - \hat{x}_{ik}}{y_{ik} - x_{ik}} \tag{11}$$
A value of R_k near 0 is desirable because it means that a faulty sensor will not affect the model's parameter prediction. For the artificially drifted sensor k, it is a measure of the model's ability to detect a fault, since the residual between sensor k and its estimate is largest for a robustness of zero. If a model's robustness is near 1, the prediction follows the fault, the residual is zero and the fault cannot be detected. If the robustness is greater than 1, the prediction overestimates or amplifies the size of the sensor fault, and the monitoring system alarm values may need to be adjusted to reflect this fact. The next performance metric is spill-over, which measures the effect a faulty sensor input has on the other sensor predictions. It is expressed by the following equation, in which j is the index of an un-faulted parameter whose spill-over metric is being calculated:
$$S_j = \frac{1}{N_m}\sum_{i=N_t+1}^{N_m+N_t} \frac{\hat{y}_{ij} - \hat{x}_{ij}}{y_{ik} - x_{ik}} \tag{12}$$
This metric reflects a very important aspect of a training-based model: a high level of spill-over entails a possible diagnosis error, since some sensors can appear faulty when the deviations are actually due to model spill-over. Figure 3 illustrates these phenomena in the case of the active and reactive powers.
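The sketch below computes the metrics of equations (9)-(12) for a single artificially drifted sensor k; the arrays (rows = samples, columns = sensors) and the drift values are placeholders, not the turbine data.

```python
# Accuracy, robustness and spill-over metrics, cf. equations (9)-(12).
import numpy as np

def accuracy(X, X_hat, s):                          # per-sensor A_j and global A
    A_j = np.mean((X - X_hat) ** 2 / s ** 2, axis=0)
    return A_j, A_j.mean()

def robustness_spillover(Y, Y_hat, X, X_hat, k):    # R_k and S_j for all j != k
    denom = Y[:, k] - X[:, k]                       # size of the injected drift
    ratios = (Y_hat - X_hat) / denom[:, None]
    R_k = np.mean(ratios[:, k])
    S = np.mean(np.delete(ratios, k, axis=1), axis=0)
    return R_k, S

rng = np.random.default_rng(2)
X = rng.normal(size=(1500, 9)); X_hat = X + 0.05 * rng.normal(size=X.shape)
Y = X.copy(); Y[:, 0] += 2.0                        # drift sensor k = 0
Y_hat = X_hat.copy(); Y_hat[:, 0] += 0.4            # a model that partly follows the drift
print(accuracy(X, X_hat, s=X.std(axis=0))[1])
print(robustness_spillover(Y, Y_hat, X, X_hat, k=0))
```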
Figure 3: Drifted active and reactive power signals with associated predictions.
4
PERFORMANCE ASSESSMENT
4.1
Model complexity
Once a monitoring method has been selected, the issue of parameter tuning arises. The AAKR and ECM architectures both have parameters, such as the maximum cluster radius, the kernel bandwidth, the distance function or the number of training points, that completely modify the model response once tuned. Each particular model has its own variance, bias and MSE; as a consequence, parameter tuning entails a model selection problem. We consider here that each model has a certain number of parameters reflecting its complexity. For each method, parameters such as the number of clusters and the number of training points in the case of ECM, or the kernel bandwidth for AAKR, help to rank the models by increasing complexity. The general bias-variance dilemma can be stated as follows:
- low complexity models have low variance and large bias: the model cannot capture the structure of the data;
- high complexity models have low bias but large variance: the model learns part of the signal noise as well as the true problem structure.
As a consequence, the best model always has a number of parameters that is neither too large nor too small: a compromise has to be found. When testing the model on cross-validation data, we observe that:
- for a highly biased (low complexity) model, the accuracy on both the training and the validation data is very poor;
- for a large variance model (over-parameterization, too high complexity), the training accuracy is excellent but the validation accuracy is poor. This phenomenon is called overfitting.
4.2 Accuracy vs complexity

Figure 4(a) shows the previously described phenomenon applied to AAKR. However, there is a slight difference with the ECM algorithm, since the bias does not compensate for the low variance when the model is too complex (too many clusters): when R decreases towards 0, the number of clusters reaches the number of training points and the accuracy metric stabilizes (Figure 4(b)). A compromise in the model complexity is still needed, since a higher complexity does not provide better accuracy but costs more computation time. We have also assessed the accuracy metric as a function of the number of points in the training period. In Figure 5, one can observe for both methods that the accuracy metric decreases with the number of training samples Nt. Obviously, the accuracy metric depends not only on the number of samples Nt but above all on the quality of those samples; the steps observed in the decrease of A indicate that some points carry much more information than others. It is therefore possible to optimize the choice of training points fed to the model. We decided to keep all Nt data as training data; however, the literature gives other possible ways of selecting a subset of representative memory vectors that span each variable's operating region [8]. Many settings can influence the model's behaviour, such as the kernel type, the bandwidth or maximum radius (smoothing parameters), the distance operator and the number of vectors in the training set. The sensitivity of the empirical models is discussed in the next section, based on their performance assessment under faulty conditions.

4.3 Robustness and spill-over

4.3.1 Assessment

Figure 6(a) shows the dependence of the AAKR robustness metric on the magnitude of the disturbance. The robustness metric grows with the disturbance magnitude, since the larger the drift, the farther the signals are from those of the training period. The same figure highlights the effect of the model complexity (kernel bandwidth h) on the robustness. In this figure the robustness seems to decrease with the kernel bandwidth (growing model complexity), but further tests show that this cannot be considered a general rule. For example, when the AAKR input signals are highly correlated and h is near 0, the estimator may overfit the data and amplify the input noise, producing huge weight coefficients and resulting in very poor predictions and poor robustness.
Figure 4: Evolution of the accuracy metric with the kernel bandwidth and the cluster maximum radius (model complexity).
Figure 5: Evolution of the log-accuracy metric with the number of samples in the training for AAKR and ECM.

In contrast to the robustness metric, the spill-over metric varies only weakly with the magnitude of the input disturbance (Figure 6(b)).
Figure 6: Evolution of the robustness/spill-over metrics with the magnitude of disturbance.

4.3.2 Optimization

In this section we present the first results of a robustness/spill-over optimization strategy. For each empirical model, we propose to fit an appropriate weighting matrix M: the final objective is to obtain a single matrix per model that reflects the inner relationships between the parameters. To this end, we calculate, for a given disturbance magnitude, a matrix S containing the following elements:
$$S = \begin{pmatrix} R_1 & \cdots & S_{1P} \\ \vdots & R_j & \vdots \\ S_{P1} & \cdots & R_P \end{pmatrix}$$
S 1P R P
This matrix stores in its diagonal elements the robustnesses Ri related to each parameter and on the other elements the spillover metrics Sij of a non-drifted parameter i when the parameter j is drifted. We propose to minimize a criterion
J ( M ) = S 1 which represents the sum of all robustness/spill-over sub-metrics (for a given level of drift) in order to obtain ˆ for which the overall robustness and spill-over metrics of the model parameters are the lowest. the weighting matrix M To handle the optimization problem, [3] gives a heuristic recursive procedure: -
Step 1: Start with a matrix M=IP where all the sensors are weighted equally with a weight of 1.
471
-
Step 2: Calculate the convergence criterion J and the accuracies {A1,A2,...,AP} associated to each parameter of the model. Step 3: For each next iteration, calculate the new matrix Mnew=diag ({A1,A2,...,AP}) and take the best obtained J.
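The sketch below is a minimal illustration of Steps 1 to 3, assuming a user-supplied routine eval_model(M) that returns the criterion J and the per-parameter accuracies for a given weighting matrix M; this routine, and all names, are assumptions of ours rather than part of the paper.

```python
import numpy as np

def optimize_weights(eval_model, P, n_iter=20):
    """Sketch of the weighting-matrix heuristic of section 4.3.2.

    eval_model(M) is assumed to return (J, accuracies), where J is the
    sum of robustness/spill-over sub-metrics obtained with weighting
    matrix M and accuracies is the length-P vector {A1, ..., AP}.
    """
    M = np.eye(P)                       # Step 1: equal weights
    best_J, best_M = np.inf, M.copy()
    for _ in range(n_iter):
        J, acc = eval_model(M)          # Step 2: criterion and accuracies
        if J < best_J:                  # keep the best J seen so far
            best_J, best_M = J, M.copy()
        M = np.diag(np.asarray(acc))    # Step 3: re-weight by accuracy
    return best_M, best_J
```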
Figure 7: Evolution of the optimization algorithm with the iterations.

This procedure takes advantage of the fact that parameters with poor accuracy should be given less weight in the overall distance in order to improve the model robustness and spill-over. The results of this optimization procedure are presented in figure 7. The convergence is quite slow and oscillatory but still effective. Tables 1 and 2 show the initial matrix S and the matrix S after the convergence process: most of the sensors exhibit a better result in terms of robustness and spill-over after the optimization.

Table 1: Initial matrix S
(diagonal entries are the robustness metrics Ri of each parameter, off-diagonal entries the spill-over metrics Sij)

0.1390  0.1374  0.0686  0.0338  0.0204  0.0285  0.0074  0.0094  0.0078
0.0082  0.0491  0.0059  0.0026  0.0046  0.0072  0.0010  0.0057  0.0044
0.2823  0.2034  0.1835  0.1038  0.0595  0.0714  0.0791  0.0715  0.0574
0.2396  0.2002  0.1401  0.1101  0.0532  0.0649  0.0540  0.0468  0.0469
0.3572  0.4452  0.1843  0.1160  0.1131  0.1180  0.1003  0.0886  0.0931
0.2904  0.2557  0.1531  0.1011  0.0620  0.1130  0.0763  0.0783  0.0738
0.5058  0.4234  0.2492  0.1627  0.1225  0.1347  0.2170  0.1205  0.1533
0.2130  0.2308  0.1124  0.0617  0.0510  0.0821  0.0655  0.0976  0.0570
0.4558  0.4546  0.1761  0.1310  0.1149  0.1308  0.1533  0.1258  0.1973
Table 2: Matrix S after convergence
0.0477  0.0723  0.0589  0.009   0.0088  0.0216  0.0139  0.0069  0.0050
0.0382  0.0963  0.0071  0.0115  0.0076  0.0225  0.0014  0.0071  0.0089
0.3060  0.4015  0.2369  0.0958  0.0696  0.0980  0.0911  0.0760  0.0854
0.0782  0.0493  0.0198  0.0266  0.0109  0.0193  0.0059  0.0146  0.0230
0.1278  0.2568  0.0874  0.0645  0.0665  0.0590  0.0447  0.0540  0.0668
0.0833  0.0617  0.0258  0.0107  0.0139  0.0398  0.0206  0.0242  0.0251
0.3098  0.3177  0.1957  0.1168  0.1044  0.1650  0.1573  0.1565  0.1272
0.0585  0.1552  0.0449  0.0290  0.0272  0.0351  0.0243  0.0405  0.0271
0.2743  0.3237  0.1384  0.0999  0.0912  0.1446  0.1051  0.1346  0.1718
5 CONCLUSION
Two empirical monitoring methods have been assessed through their application to real power plant data. The results show that these methods need to be assessed in order to be tuned appropriately. The accuracy/complexity dilemma emphasized in this article indicates that overly complex empirical models can lead to poor generalization capabilities, while, on the contrary, overly simple models cannot represent the variability of the sensor data. This article also gives an overview of the possible performance metrics used to evaluate a model under faulty conditions. It is shown that those metrics depend not only on the magnitude of the applied drift but also on the models' complexities. Finally, an optimization procedure has been developed, based on sensor weighting: after a few iterations it provides a weighting matrix that fits the relations between the sensor data for a given level of disturbance. These first results have to be confirmed, and future work will be carried out to help the end-user optimize his models.
6 REFERENCES
1 Patelli J.P., Fiorelli P. & Chevalier R. (2005) Condition Based Maintenance experience in EDF's Nuclear Power Plants (NPP). Technical Meeting on "Strategy and techniques on predictive maintenance and condition monitoring", IAEA, Vienna, June 2005.
2 EPRI Final Report 1008416, On-Line Monitoring for Equipment Condition Assessment.
3 Provost D. (2008) Online monitoring of wind turbines using statistical and pattern recognition approaches. EWEC.
4 Chevalier R., Provost D. & Seraoui R. (2009) Assessment of statistical and classification models for monitoring EDF's assets. NPIC & HMIT 2009, Knoxville, Tennessee, April 5-9, 2009.
5 Kasabov N. & Song Q. (2002) DENFIS: Dynamic evolving neural-fuzzy inference system and its application for time-series prediction. IEEE Transactions on Fuzzy Systems, 10(2), 144-154.
6 Hines J.W. & Garvey D. (2006) Sensor Fault Detectability Measures for Autoassociative Empirical Models. 49th ISA POWID Symposium, San Jose, California.
7 Hines J.W. & Garvey D. (2006) Development and Application of Fault Detectability Performance Metrics for Instrument Calibration Verification and Anomaly Detection. Journal of Pattern Recognition Research, 1, 2-15.
8 Hines J.W. & Garvey D. Traditional and Robust Vector Selection Methods for use with Similarity Based Models.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A METHODOLOGY TO CONCEIVE A CASE BASED SYSTEM OF INDUSTRIAL DIAGNOSIS
Brigitte Chebel-Morello a, Karim Haouchine a, and Noureddine Zerhouni a
a Automatic Control and Micro-Mechatronic Systems Department, FEMTO-ST Institute, UMR CNRS 6174, 24 rue Alain Savary, 25000 Besançon, France.
The objective of this paper is to address diagnosis knowledge-oriented systems in terms of artificial intelligence, in particular through the Case-Based Reasoning (CBR) approach. Indeed, the use of CBR, which is an approach to problem solving and learning, in diagnosis goes back a long way, with the appearance of diagnostic support systems based on CBR. A diagnostic system by CBR implements an expertise base composed of past experiences, through which the origins of failure and the maintenance strategy are given according to the description of a specific diagnostic situation. A study is made of the different diagnostic systems based on CBR. This study showed that there is no common methodology for building a CBR system: the design depends primarily on the case representation and on the knowledge models of the application domain. Consequently, this paper proposes a general design approach for a diagnostic system based on the CBR approach.

Key Words: Diagnostics, Case-based reasoning, Knowledge management, Functional safety, Hierarchical and contextual models, Adaptation-guided retrieval
1 INTRODUCTION

In the field of diagnosis, the different studies can be classified into three approaches [7]:
The first is the analytical approach, which needs mathematical models; the second is the data-driven approach, such as neural networks, rule-based systems, Bayesian networks, neuro-fuzzy networks, etc.; and the third is the knowledge-based approach, based on causal analysis or expert knowledge. The choice of approach depends on the type of available models or information, such as data or knowledge. Our choice is directed towards the knowledge-oriented methods and particularly case-based reasoning (CBR). Most of the data-driven techniques require a complete learning set, whereas CBR allows reasoning from a minimum number of cases: the reasoning can be done with a few cases, and it is less costly for a company than other methods which require a complete learning set to be effective.
1.1 Domain resources

[4] considers CBR to be the technology of choice for implementing a knowledge-based system. Richter [18] defines four knowledge containers involved in a CBR process:

• The vocabulary container, which describes the vocabulary of the domain. This vocabulary is represented by an ontology, rules, etc.
• The case base, which presents practical problem-solving knowledge coming from experience.
• The similarity container, identifying indexes and similarity metrics.
• The adaptation knowledge, developing the solution transformation rules, adaptation operators and adaptation cases.
Knowledge is involved in the CBR cycle. The first "classical" CBR cycle was proposed by Aamodt and Plaza in 1994 [2]. The reasoning is composed of four steps, known as the four "Re"s: Retrieve, Reuse, Revise and Retain. The retrieve phase finds the most similar case or cases according to the similarity between the request and previously experienced cases stored in the case base. The reuse phase uses the solutions of the similar cases in order to solve the new problem: the differences between the recalled case and the new case are taken into account and the old solution is adapted to the new situation. The revise phase evaluates the proposed solution in the real world. And the retain phase stores a new case in the case base. Figure 1 presents an evolution of this cycle, obtained by integrating an elaboration step; this step allows a request to be formulated on a new problem case. In the core of the cycle we find the four knowledge containers.
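The control flow of this extended cycle can be sketched in a few lines. The snippet below is only a schematic illustration: the five step functions are placeholders for the domain-specific knowledge described in the rest of the paper, and it is not an implementation of any particular CBR tool.

```python
def cbr_cycle(new_problem, case_base, elaborate, retrieve, reuse, revise):
    """Schematic five-step CBR cycle (Elaborate, Retrieve, Reuse, Revise, Retain).

    The five callables are placeholders for the domain-specific steps;
    only the control flow is shown here.
    """
    query = elaborate(new_problem)            # 1. Elaborate the target case
    retrieved = retrieve(query, case_base)    # 2. Retrieve similar cases
    suggested = reuse(query, retrieved)       # 3. Reuse/adapt their solutions
    confirmed = revise(query, suggested)      # 4. Revise in the real world
    case_base.append((query, confirmed))      # 5. Retain the learned case
    return confirmed
```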
Figure 1. The CBR cycle (Mille, 1999).

Cordier [8] summarises in Table 1 the main forms that the knowledge containers take and the roles played by the knowledge units in the different steps of the CBR process. The containers allow the organisation of the knowledge units but are considered together as a single knowledge base. The typology of CBR knowledge is as follows (see Table 1).

Table 1. Typology of CBR knowledge

Knowledge containers     | Knowledge units                                                              | Knowledge roles
Vocabulary of the domain | Ontology, rules, etc.                                                        | Guidance of elaboration; control of the inference during retrieval and adaptation
Case base                | Vectors of attribute-value pairs, structured representations, textual cases | Support in all steps
Similarity knowledge     | Similarity metrics, indexes                                                  | Retrieve a new case
Adaptation knowledge     | Adaptation rules, operators, cases, ...                                      | Support of the elaboration; guidance of the retrieval; realisation of the adaptation
Presenting expert knowledge as past, concrete experiences makes its comprehension by human users easy. Lamontagne & Lapalme [16] represent a generic model of case-based reasoning by two complementary processes (see figure 2): an off-line process, which generates the knowledge containers through a phase of case authoring, and an on-line process composed of the different steps of the CBR cycle.
Figure 2. Generic model of a CBR system.

To conceive a knowledge-oriented system, we must establish the two processes of this model: the off-line process, with the different knowledge containers built by case authoring and knowledge acquisition, and the on-line process, with the different algorithms corresponding to the five steps of the CBR cycle.
1.2 Case based reasoning in diagnosis systems

Problems in the diagnosis field are recurrent, and previously documented solutions can be used again. In the retrieval of similar cases, algorithms can be used even if problems are not completely understood. Case-based reasoning is initially a cycle of four main activities. The retrieval phase determines the most similar cases. The reuse phase solves the new problem by using information and knowledge in the retrieved cases. The revise phase evaluates the applicability of the proposed solution in the real world. The retain phase updates the case base with the new case for future problem solving. According to [4], two different types of diagnostic systems based on CBR exist:
• In "help desk" applications, CBR is exploited as decision support and is used by non-specialists in domains with a lot of technical equipment.
• In "general diagnosis and repair" applications, CBR is used as a decision-help system for the diagnosis of complex equipment or for medical diagnosis, and focuses on failure cause research and problem exploitation.
The object of the CBR application varies from a car, studied in the diagnostic technique system named CREEK [1], to a locomotive in a remote diagnostics system [20], a Boeing 747 aircraft in CaseLine [21], used as a demonstrator by British Airways, and the Boeing 737 in CASSIOPEE [5], owned by CFM International. There are also an industrial printer studied by Domino UK Ltd in CHEKMATE, presented in [12], and gas turbines studied in a fault diagnosis system of General Electric Energy in Atlanta [10]. Current research in case-based reasoning focuses on detailed knowledge representation and on the adaptation phase.
Indeed, some systems, such as NodalCBR [9] and the Gas Turbine Diagnostics system [10], do not develop an adaptation phase, while others, such as FormTool [6], apply a transformational adaptation but have no knowledge model. The Creek system [1], composed of a network of semantic knowledge, is handled by three sub-tasks, namely activation, explanation and focus. There is a wide variety of methods, ranging from classification problems when there is just a weak domain theory, to knowledge-based systems. In most systems, a case characterizes a diagnostic experience without reference to any kind of model. The analysis of these different systems pointed out the lack of a proven methodology to register a typical diagnostic case and to build a case-based diagnostic system. The objective of this paper is to propose a methodology to conceive a diagnosis and repair help system based on case-based reasoning for human operators. Three particular challenges are addressed in this project. The first is the proposal of a general representation of the case, based on the definition of diagnosis. The second is to define the knowledge models of the case-based reasoning system, models adapted to the retrieval and reuse phases of technical diagnosis. The third is to propose the three phases of the on-line process: the elaboration, retrieval and reuse models. After presenting previous work on the elaboration of a methodology to conceive a diagnosis system, we propose a specific formalization of the diagnosis case, based on two descriptor types. Our case formalisation is linked to the dysfunctional mode and to the decomposition analysis of the equipment [17]. We then propose the different knowledge containers of the system, and we develop a retrieval phase based on the k-nearest-neighbour algorithm with two measures (a similarity measure and an adaptation measure), depending on the dysfunctional mode and on the information relevant to the case. Finally, we apply our methodology to an industrial piece of equipment, namely a diesel motor of the Renault company.
2 THE DESIGN APPROACH

Our design methodology for a diagnostic system by CBR takes as a starting point the generic model of Lamontagne and Lapalme [16]. We develop, in each process, the various containers and steps of the CBR cycle.
2.1 Previous works and methodology followed

In the literature on diagnosis, systems are described without defining the methodology used to obtain the components of the system. We are interested in previous work on this issue. We have proposed a mixed method of knowledge capitalization and case-based reasoning in order to develop a diagnosis system, with the knowledge capitalization cycle adopted as the underlying principle (Rasovska et al. [17]). This method integrates a representation model and a reasoning model that complete each other and are suitable for representing and manipulating the domain knowledge. Our system, like Creek for example, is based on operating safety tools [17] and proposes functional, dysfunctional and causality models. These models can be aggregated into a network of semantic knowledge, as in Creek, or be proposed, as in the case of our study, in the form of two models: hierarchical and contextual.
2.2 Off-line process

For the diagnostic system design we first define the off-line process (see Figure 2).

Documents
We propose to use the documents produced for a given maintenance process (alarm handling, diagnosis, etc.) as the basic building blocks of the case model. Because all this information can be retrieved from a given intervention report, we
choose the intervention reports database as our case-base model. These two models will be completed and improved by experience feedback.

Authoring
In [17] we were interested in case formalisation, with the proposal of an ontology. In this paper we go further and define the case as a set of descriptors based on the reference definition of diagnosis: "these are the actions carried out for the fault detection, its localization and the identification of the cause" [3]. Consequently, the diagnostic is composed of three parts: the fault detection part, the part concerning the localization of the zone in which the failure occurred, and the identification part of the failing component. We develop these three phases in the case design; moreover, we base them on the knowledge models of the equipment to be diagnosed. We summarize the three parts as follows:
Detection: we determine on the equipment the failure class, based on the FMEA (Failure Mode and Effects Analysis). This class belongs to the solution part of our case.
Localization: this part is described in the first problem part of the case. It allows the equipment failure zone to be determined thanks to a context model.
Identification: this concerns, on the one hand, the second part of the case problem, which we name the "functional part", and, on the other hand, the solution part, while allowing the failing component to be identified. The functional part is based on a components hierarchical model in which the classes of this model are found in the functional descriptors in order to obtain a generic case.
We represent the industrial diagnostic problem in the most general possible way. A case has an object formalization and allows defining a hierarchy of descriptors containing the problem descriptors as well as the solution descriptors. The descriptors are represented by three attributes. We associate with each attribute the modal values forming a partition contiguous to the considered attribute. One specificity of our case study is to determine a state and an operating mode for each component to be diagnosed [13]. Therefore, we assign to a given descriptor:
• an attribute on its value itself (the descriptor is an electrical or mechanical component, for example);
• an attribute on its state;
• and an attribute relating to the operating mode, which reflects the normal and/or abnormal state of the equipment components.
The problem descriptors of the functional part will consequently have three attributes relating to the value of the component, its state and its operating mode: $ds_i = (ds_i^{value}, ds_i^{state}, ds_i^{FM})$.
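For illustration, such a three-attribute descriptor could be represented as follows; this is a hypothetical sketch in Python, and the class and field names are ours, not part of the paper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Descriptor:
    """Hypothetical container for one functional-part descriptor
    ds_i = (value, state, functional mode), as defined above."""
    value: Optional[str] = None   # the component itself (e.g. "injection pump")
    state: Optional[str] = None   # its observed state
    fm: Optional[str] = None      # operating mode: "normal", "abnormal" or "nor/ab"

    @property
    def indicated(self) -> bool:
        """True when the descriptor is filled in (used by the presence similarity)."""
        return self.value is not None
```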
By taking into account all these specificities, we can schematize the case structure as shown in Figure 3: the problem part contains a localization part (ds1, ..., dsl) and a functional part (dsl+1, ..., dsm, each descriptor with its value, state and FM attributes), while the solution part contains the failure class (Ds1), the failing component (Ds2) and the other solution descriptors.
Figure 3. Generic structure of the diagnosis case.

We identify the relevant descriptors for the case representation according to the studied field, and a suitable structure of the associated cases with the general knowledge models. This identification allows the installation of the two phases of retrieval and adaptation, and the exploitation of the comparison between these two phases through the creation of appropriate measures.

The knowledge containers
The case-base container contains all the previously defined cases. Hereafter we develop the three remaining containers.
The vocabulary container
The vocabulary container contains the knowledge models of the CBR system. The knowledge representation of our CBR system dedicated to diagnosis is shown in Figure 4. It consists of two models supplementing the dysfunctional part of our system, which is the case base, and of decision rules determining the component operating mode within its operating context. The equipment model results from an equipment decomposition of the studied system, which determines the functions performed by the equipment and its components. The components taxonomy model is determined from their functional analysis: each set of components is gathered by functional class.
Figure 4. Knowledge representation of a CBR system dedicated to failure diagnosis.

The knowledge representation is based on two models, namely the context model and the components taxonomy model. The context model is based on the flows of specific magnitudes in the system and on the equipment decomposition analysis of the studied system. This decomposition determines the functions provided by the equipment and its components. As for the components taxonomy, it determines the equipment sets starting from their functional analysis; each set of components is grouped according to operating similarity or to the functions it ensures. We associate with these two models and with the case base another module concerning the decision rules used to determine the operating mode of the system to be diagnosed.

The similarity measures container
Before the nineties, the retrieval and adaptation phases were exploited in a completely independent way, until Smyth and Keane [19] brought a new perspective and suggested the unification of these two phases. This unification means that the case chosen in the retrieval phase is the one most easily adaptable, in order to optimize the results of the adaptation phase. Our recent work goes in this direction, and we have proposed in [13] two measures, one of similarity and one of adaptation. In our study, the retrieval phase is based on the k-nearest-neighbours algorithm, on the components taxonomy and on the measures contained in the similarity measures container. In order to select, among the retrieved cases, the case most favourable to adaptation, we must first evaluate the similarity of the source cases to the case to solve by applying a global similarity measure, which is composed of a set of local similarity measures. Then, we take into account the operating mode of the descriptors of the functional part via a λi weight in a second measure, called the Adaptation Measure (AM).
• Retrieval Measure
We introduced the Retrieval Measure (RM) in [13]. We recall that this measure is composed of four local similarities:
- For the value $ds_i^{value}$, which belongs to the hierarchical model of descriptors, a local similarity $j^{value}$ is developed: if $ds_i^{value} = dt_i^{value}$ then $j^{value} = 1$, and if $ds_i^{value} \neq dt_i^{value}$ then $j^{value} = 0.8$, $0.6$, ... or $0$.
- For the descriptor state $ds_i^{state}$, $j^{state}$ is developed: if $ds_i^{state} = dt_i^{state}$ then $j^{state} = 1$, and if $ds_i^{state} \neq dt_i^{state}$ then $j^{state} = 0$.
- For the functional mode defined in $ds_i^{FM}$, $j^{FM}$ is developed: if $ds_i^{FM} = dt_i^{FM}$ then $j^{FM} = 1$, and if $ds_i^{FM} \neq dt_i^{FM}$ then $j^{FM} = 0$.
- To take into account the information available in the descriptors, a local similarity $j^{presence}$ is developed: $j^{presence} = 1$ when the descriptor is indicated in the source case, and $j^{presence} = 0$ if not.
The global similarity measure (1) is obtained by aggregation of these functions over the whole set of descriptors. From this measure, a set of cases can be selected:

$$RM = \frac{\sum_{i=1}^{m} j_i^{value} \cdot j_i^{state} \cdot j_i^{presence} \cdot j_i^{FM}}{\sum_{i=1}^{m} j_i^{presence}} \qquad (1)$$

where m represents the number of problem descriptors. This measure provides the set of cases most similar to the case to solve. In order to select the most adaptable retrieved case, we have introduced a measure called the Adaptation Measure (AM).
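Read this way, equation (1) can be computed directly from the local similarities. The sketch below is a minimal illustration, assuming descriptors given as (value, state, FM) tuples and a user-supplied value_sim function for the hierarchical similarity; both of these assumptions are ours.

```python
def retrieval_measure(source, target, value_sim):
    """Sketch of the global Retrieval Measure RM of equation (1).

    source, target : lists of (value, state, fm) descriptor tuples;
                     None marks a descriptor that is not indicated.
    value_sim(a, b): local similarity on the hierarchical model,
                     returning 1, 0.8, 0.6, ... or 0 (assumed supplied).
    """
    num, den = 0.0, 0.0
    for ds, dt in zip(source, target):
        j_presence = 1.0 if ds[0] is not None else 0.0
        if j_presence == 0.0:
            continue                                  # descriptor not indicated
        j_value = value_sim(ds[0], dt[0])
        j_state = 1.0 if ds[1] == dt[1] else 0.0
        j_fm = 1.0 if ds[2] == dt[2] else 0.0
        num += j_value * j_state * j_presence * j_fm
        den += j_presence
    return num / den if den else 0.0
```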
• Adaptation Measure
The Adaptation Measure (AM) takes account of the components' operating mode by attaching importance to the abnormal operating mode in the source case descriptors (dsi) and the target case descriptors (dti), providing them with a greater weight. This importance is characterized by the λi weight. The adaptation measure (2) takes into account the source case descriptors which differ from the target case and which are linked to the class and to the functional mode of the solution descriptors:

$$AM = \frac{\sum_{i=1}^{n} j_i^{Class} \cdot \lambda_i}{\sum_{i=1}^{n} (j_i^{FM} + j_i^{state}) \cdot j_i^{value}} \qquad (2)$$

where λi is the weight associated with the functional mode: if FM = normal, λi = 2⁰; if FM = abnormal, λi = 2²; if FM = nor/ab, λi = 2¹. A weight is associated with the functional mode because it is considered important for determining the failing component. The number of differing descriptors is determined by the denominator of equation (2).
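Under the reconstruction of equation (2) given above, the adaptation measure could be computed as in the following sketch. The λ weights and the exact combination of the local terms follow our reading of equation (2) and of [13]; they are assumptions rather than the authors' exact definition.

```python
LAMBDA = {"normal": 2**0, "nor/ab": 2**1, "abnormal": 2**2}  # assumed weights

def adaptation_measure(source, target, value_sim):
    """Sketch of the Adaptation Measure AM of equation (2), as reconstructed.

    source, target : lists of (value, state, fm) descriptor tuples.
    value_sim(a, b): local similarity on the hierarchical model (assumed supplied).
    """
    num, den = 0.0, 0.0
    for ds, dt in zip(source, target):
        j_class = 1.0 if ds[0] == dt[0] else 0.0   # same operating class assumed
        j_state = 1.0 if ds[1] == dt[1] else 0.0
        j_fm = 1.0 if ds[2] == dt[2] else 0.0
        lam = LAMBDA.get(ds[2], 1)                 # weight of the functional mode
        num += j_class * lam
        den += (j_fm + j_state) * value_sim(ds[0], dt[0])
    return num / den if den else 0.0
```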
The retrieved source case having the greatest adaptation measure value among the retrieved source cases will be the candidate chosen for the adaptation step.

The adaptation container
We set up, in an adaptation algorithm, the dependency relations between the various problem and solution descriptors. These relations express the influence of a problem descriptor "ds" on the solution descriptors "Ds" in the various source cases of the case base. They help to define the set of relevant descriptors which can be used in the adaptation phase; their implementation is explained in [13]. Three values are assigned to the various source case descriptors according to the link between the problem and solution spaces: DR = {high, low, none}. The method has been validated in our previous work [13].
2.3 On-line process

The on-line process contains the various phases of the cycle. We develop three phases, namely the elaboration of the target case, retrieval and adaptation.

Elaboration phase
At the time of failure, the user requests the diagnostic help system, which proceeds in three steps. First, a failure localization step is carried out thanks to the equipment model: this model provides a set of components that could potentially be failing, given their geographical proximity to the concerned zone. The elaboration phase is thus initialized by filling in the descriptors associated with the "failure localization". Second, a consultation of the SCADA (Supervisory Control And Data Acquisition) supervisor gives the actual state of the various components. Third, a decision step, starting from the actual state of these components, applies the decision rules giving the operating mode of the implied components. This allows the problem part of the target case, namely its localization part and its operating mode part (normal/abnormal), to be filled in (Figure 5). It should be noted that the cause-and-effect relations are taken into account during the creation of the case base: the various components are linked by their state, which results in the case-base descriptor values and therefore in the definition of their operating mode. Indeed, a component can be in an abnormal operating mode without being failing; this can happen because of the influence of another failing component.

Figure 5. The different steps of target case elaboration: identification of the implied (potentially failing) components with the equipment model, reading of their state via SCADA, definition of their degraded mode with the decision rules, and filling of the problem part of the case to solve.
Adaptation-guided retrieval
The retrieval phase is based on the two measures (RM) and (AM) in order to guide the adaptation. A first retrieval step, using the k-nearest-neighbours algorithm, defines the set of cases most similar to the case to be solved. The (AM) measure is then used to select, among all the retrieved cases, the most easily adaptable case.

Adaptation
The adaptation phase is carried out thanks to an algorithm relying on the context model, on the hierarchical model of descriptors and on the dependency relations. The dependency relations are inspired by Fuchs' work in [11]. The adaptation algorithm is presented in [14]; it adapts descriptor by descriptor. Three standard adaptation cases have been identified, namely:
• DR high, involving at least one solution descriptor with a problem descriptor belonging to the same operating class;
• DR high, involving at least one solution descriptor with a problem descriptor belonging to different operating classes;
• DR low.
The adaptation-guided retrieval was validated on a supervised industrial pallet-transfer system (SISTRE) [15]. Thus, we have proved the feasibility of our diagnostic help system. This allowed a design approach for diagnostic help systems to be set up and applied to another piece of industrial equipment, in this case a diesel motor 1.5 dCi K9K 105ch of the Renault company, available at the following link: http://v3.renault.com/cfm/module-K9K/fr/index.html.
3 APPLICATION

A functional and dysfunctional study of a diesel motor was carried out for the construction of the case base and of the knowledge models of the diagnostic help system. In what follows we develop the installation of the documents used, the case base and the knowledge models.
3.1 The documents

The documents used result from the diesel motor technical documentation, which is available at: http://v3.renault.com/cfm/module-K9K/fr/index.html. This documentation enabled us to determine the various flows existing in the motor and its components. Moreover, we called upon an expert, in particular a mechanic, who helped us to define the components' operating modes and to analyze the causal relations.
3.2 The motor case base

Through the functional and dysfunctional study of the motor, we built a case base containing 130 cases. The case problem part is composed of 12 descriptors. The first four descriptors determine the failure zone of the motor equipment. This zone is determined by "ds1: the motor state", "ds2: the glow plugs state", "ds3: temperature of the motor" and "ds4: failure sub-zone". Figure 6 shows an overview of the problem and solution parts of seven cases of the diesel motor case base.
Figure 6. A part of the case base of the Renault diesel motor.

Let us take the example of case 1, which reflects a failure on the injection pump. The problem part determines, in its "localization" part, the place of the failing component, which is defined by the context model. According to the first four descriptors, the motor is running, the glow plugs are in a bad condition and the two remaining descriptors are not indicated. When referring to the context model, we find a number of components which are potentially failing in the specified zone. The case functional part then indicates the state of the components in this zone. The "ds5" descriptor, which reflects the "injection pump" component belonging to the "injection" class, is in a "no trained" state, generating an "abnormal" operating mode. The "ds7" descriptor reflects the glow plug state, which is in a "normal" operating mode because it produces sparks. The "ds9" descriptor describes the filter component, through which the fuel circulates normally. The "ds12" descriptor indicates that the crankshaft has a continuous motion and is in a "normal" operating mode. The other descriptors are not indicated. As for the solution part, the "Ds1" descriptor indicates that the failure class of the concerned component is "coupling mobile". "Ds2" specifies that the failing component is the "injection pump", with the remark that this pump does not deliver fuel. Thus, the associated repair action, described by the "Ds3" descriptor, recommends a change of the injection pump. Finally, the "Ds4" descriptor indicates that the failure occurred on the fuel passage flow.
3.3 The knowledge models

The knowledge representation is based on two models associated with the case base, namely the context model and the components taxonomy model. The equipment analysis determines sets of components; each set includes components which provide the same functions. This functional analysis gives a hierarchical modelling of the equipment. Figure 7 shows an example of the components hierarchical model of a diesel motor.
Figure 7. Overview of the hierarchical model of the motor components.

As the motor has a multiplicity of components, we show only part of the components hierarchical model. This model includes four classes: Ancillary, Coupling mobile, Basic elements and Timing. Each class is composed of subclasses, down to the level of components. The context model is based on the flows of the system entities and on the equipment decomposition analysis of the studied system. This decomposition determines the functions provided by the equipment and its components. These components constitute the context in which the failing component is identified. The context reflects the cause-and-effect relations, permitting components with problems to be located and the good descriptors to be selected from the set. Figure 8 shows an overview of the diesel motor context model.
Figure 8. Overview of the context model of a combustion engine.

The context model is set up according to the different natures of flow in the motor. The flows which we identified are: exhaust, air, lubrication, cooling, fuel and electric. Each flow implies a number of components; each class can contain several flows and each flow can contain several classes. The retrieval phase takes into account the three attribute types of the case functional part. The adaptation phase exploits the dependency relations by traversing the context model, which defines the cause-and-effect relations between descriptors. To evaluate our adaptation-guided retrieval method and the adaptation algorithm, we set up a specific protocol. This protocol divides the diesel motor case base into two subsets: the first subset contains 40% of the cases, drawn randomly from the case base, and forms the training set; the 60% remaining cases constitute the test set. The goal of this protocol is to calculate the accuracy on the training case base. We specify that the accuracy relates to the determination of the correct failure class.
We found that our retrieval and adaptation method ensured a very high accuracy. We can therefore conclude that CBR can reason from a limited number of cases, provided it is based on knowledge models and on an adaptation phase adjusted to the considered application. Moreover, the retrieval phase, which uses both the retrieval and the adaptation measures, takes account of the adaptation effort and selects the most easily adaptable cases and, amongst other things, the cases which have the same classes.
4 CONCLUSION

Within the framework of the study made on industrial diagnostic and repair help systems, we proposed a diagnostic system based on the case-based reasoning (CBR) approach, with the deployment of its different phases. The designed system includes an object formalization of the case, associated with a hierarchical model common to the problem and solution descriptors of the case-base cases and with a model related to the application context. All CBR phases depend on the case formalization associated with the knowledge models. Recently, in [13], we formalized a case for the SISTRE equipment; this formalization is adapted to our method. The selected modelling influences the proposed retrieval and adaptation measures, the latter being directly related to the operating mode of the monitored components (a specific attribute of a descriptor). The developed retrieval phase is guided by the adaptation, exploiting the relations between the two measures, namely the Retrieval Measure (RM) and the Adaptation Measure (AM). These relations enable the selection, among the retrieved cases, of those which are most easily adaptable. The adaptation phase exploits the dependency relations between the problem and solution descriptors. These dependency relations are determined through relevant descriptors or by using the context model reflecting the cause-and-effect relations between the equipment components. We have identified three standard cases in the adaptation algorithm, associated with the different values of the dependency relations. In order to prove the feasibility of our approach, we have applied our CBR diagnostic system in this paper to a diesel motor 1.5 dCi K9K 105ch of the "Renault" company. In the near future we plan to apply a maintenance method to the installed CBR system. Indeed, the system evolves through time with the introduction of new cases following the appearance of new failures. This evolution may deteriorate the case-base quality, and the two retrieval and adaptation measures may become obsolete. This maintenance method would allow the evolution of these two measures (for example by inserting weights) and of the case base according to the system evolution.
5 REFERENCES
1 Aamodt A. (2004) Knowledge-Intensive Case-Based Reasoning and Sustained Learning. Proceedings of the European Conference on Case-Based Reasoning, ECCBR'04, pp. 1-15.
2 Aamodt A. & Plaza E. (1994) Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. AI Communications, 7(1), pp. 39-59.
3 AFNOR (2001) Maintenance terminology. European standard, NF EN 13306.
4 Althoff K.D. (2001) Case-Based Reasoning. In S.K. Chang (Ed.), Handbook on Software Engineering and Knowledge Management, pp. 549-588.
5 Bergmann R., Althoff K.D., Breen S., Göker M., Manago M., Traphöner R. & Weiss S. (2003) Developing Industrial Case-Based Reasoning Applications: The INRECA Methodology. Lecture Notes in Artificial Intelligence, LNAI 1612, Springer-Verlag, Berlin.
6 Cheetham W. & Graf J. (1997) Case-Based Reasoning in Color Matching. 2nd International Conference on CBR Research and Development, vol. 1266, pp. 1-12.
7 Chiang L., Russell E. & Braatz R. (2001) Fault Detection and Diagnosis in Industrial Systems. London, Great Britain: Springer.
8 Cordier A. (2008) Interactive and Opportunistic Knowledge Acquisition in Case-Based Reasoning. Thesis, Laboratoire d'InfoRmatique en Images et Systèmes d'information, Université de Lyon I.
9 Cunningham P. & Smyth B. (1994) A Comparison of Model-Based and Incremental Case-Based Approaches to Electronic Fault Diagnosis. AAAI, Seattle, USA.
10 Devaney M. & Cheetham B. (2005) Case-Based Reasoning for Gas Turbine Diagnostics. 18th International FLAIRS Conference (FLAIRS-05).
11 Fuchs B., Lieber J., Mille A. & Napoli A. (2000) An Algorithm for Adaptation in Case-Based Reasoning. Proceedings of the 14th ECAI, Berlin, Germany, pp. 45-49.
12 Grant P.W., Harris P.M. & Moseley L.G. (1996) Fault Diagnosis for Industrial Printers Using Case-Based Reasoning. Engineering Applications of Artificial Intelligence, 9(2), pp. 163-173.
13 Haouchine M.K., Chebel-Morello B. & Zerhouni N. (2008a) Adaptation-Guided Retrieval for a Diagnostic and Repair Help System Dedicated to a Pallets Transfer. 3rd European Workshop on Case-Based Reasoning and Context-Awareness, 9th European Conference on Case-Based Reasoning, ECCBR 2008, Trier, Germany.
14 Haouchine M.K., Chebel-Morello B. & Zerhouni N. (2008b) Conception d'un Système de Diagnostic Industriel par Raisonnement à Partir de Cas. Actes du 16ème Atelier de Raisonnement à Partir de Cas, RàPC'2008, Nancy, France.
15 Haouchine M.K., Chebel-Morello B. & Zerhouni N. (2009) Algorithme d'Adaptation pour le Diagnostic Technique. Actes du 17ème Atelier de Raisonnement à Partir de Cas, RàPC'2009, Paris, France.
16 Lamontagne L. & Lapalme G. (2002) Raisonnement à base de cas textuels – état de l'art et perspectives. Revue d'Intelligence Artificielle, Hermes, Paris, 16(3), pp. 339-366.
17 Rasovska I., Chebel-Morello B. & Zerhouni N. (2007) A Case Elaboration Methodology for a Diagnostic and Repair Help. FLAIRS-20, Key West, Florida, May 7-9.
18 Richter M.M. (1995) The Knowledge Contained in Similarity Measures. Invited talk, First International Conference on Case-Based Reasoning (ICCBR'95), Sesimbra, Portugal.
19 Smyth B. & Keane M.T. (1993) Retrieving Adaptable Cases: The role of adaptation knowledge in case retrieval. Proceedings of EWCBR'93, LNAI 837, Springer.
20 Varma A. & Roddy N. (1999) ICARUS: A Case-Based System for Locomotive Diagnostics. Applications of Artificial Intelligence Journal.
21 Watson I. & Marir F. (1994) Case-Based Reasoning: A Review. The Knowledge Engineering Review.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
OPTIMUM DESIGN OF VERTICAL PUMP FOR AVOIDING THE REED FREQUENCY
J. I. Lim a, B. G. Choi b, H. J. Kim c and C. H. Park d
a, b, c
GyeongSang National University, Inpyeong-Dong, Tongyeong-city, Gyeongnam, South Korea
d
Doosan Heavy Industries &Construction, 555 Guigok-Dong, Changwon-city, Gyeongnam, South Korea.
The reed frequency that occurs during motor operation in a vertical cargo pump is an important factor limiting the operating speed. If the reed frequency of the motor is not considered in the design of a vertical cargo pump, it may be impossible to achieve the intended operating design of the pump. Therefore, in this paper, an optimization of the pump design is carried out with common analysis programs to avoid reed-frequency resonance, and the results are compared and analyzed.

Key Words: Vertical pump, constraint, reed resonance
1 INTRODUCTION
The large vertical pump has characteristics such as a small set-up area, a large capacity and suitability for a low water level, so it is used in all sorts of industrial plants. The resonance problem of the vertical pump occurs in industrial plants more often than for any other machinery. The natural frequency of the structure is generally low, because a heavy motor is located at the top of a long pump, and vibration tends to occur easily when it is excited by an external force. The vertical pump is usually founded on a concrete slab, which is weaker than the ground, and its centre of gravity is higher than that of a horizontal pump because the heavy motor sits on top of a cantilever-shaped pump. These reasons make the natural frequency lower. The resonance problem often arises from unbalance of the shaft, because the rotating frequency of a large pump is low. Since most large vertical pumps are important parts of facility operation, the resonance affects the stability of the facility operation. If resonance occurs when the motor operates on top of the pump, strong vibration appears, no matter how well the motor is balanced and even if the installation satisfies the requirements of the pump and motor manufacturer. In this situation an operator may judge that the problem lies with the motor, because the high vibration occurs in the solo-driving situation where there is no load on the motor. In fact the cause of the vibration is not the motor itself, but the reed resonance between the motor system and the motor stool. To solve this reed resonance problem, both the structural part of the pump and the fluid in the pump should be taken into consideration; in this paper the structural part is considered first. The main purpose of this research is to model the optimum pump design by avoiding the reed frequency and changing some parameters for the optimization.
1.1 Background

(1) Spectrum analysis
The structure is fixed firmly on the foundation, as shown in Figure 1. When a static deformation is applied to the vertical motor and released, the frequency at which it vibrates as a single-degree-of-freedom system is the reed frequency. The closer the exciting frequency approaches this value, the more the vibration increases; this is the reed resonance of the motor.
Figure 1. Outline and Reed Vibration Mode.

(2) Single Degree of Freedom System
Every machine structure is a dynamic system of masses, springs and damping. The simplest example is the system shown in Figure 2. It is made up of a mass, a spring and a piston-cylinder damper. The mass is restrained to move in one direction and therefore has a "single degree of freedom".
$$F(t) = \frac{w}{g}\ddot{x} + C\dot{x} + kx \qquad (1)$$
This equation states that the driving force F(t) is comprised of an inertial force m·x″ (with m = w/g), a damping force C·x′, and a stiffness force k·x.

Figure 2. Single Degree of Freedom System.

These three forces control the dynamics of the mass. The vibration response of the mass is highly dependent on the frequency of the driving force.

(3) Amplitude curve for reed resonance
$$F_r = 0.159\left(\frac{K\,g}{W}\right)^{1/2} \qquad (2)$$
where K is the equivalent spring stiffness (N/m), g the gravitational acceleration (m/s²) and W the motor weight (N).

Figure 3 shows the typical curve of the reed amplitude passing through the reed resonance point as a function of the rotating speed. At rotating speed S1 the motor amplitude takes a given value; if the rotating speed goes up to S2 the amplitude doubles, and at S3 it is more than three times larger.

Figure 3. Amplitude curve for reed resonance.

If the rotating speed is fixed at S3, this resonance curve must be moved to the left or to the right of this speed in order to decrease the vibration.
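As a quick numerical illustration of equation (2), the snippet below computes the reed frequency for an assumed stiffness and motor weight; the figures are purely illustrative and are not taken from the pump studied in this paper.

```python
import math

def reed_frequency(K, W, g=9.81):
    """Reed (cantilever) natural frequency from equation (2):
    Fr = 0.159 * sqrt(K * g / W)  [Hz]
    K : equivalent spring stiffness (N/m)
    W : motor weight (N)
    g : gravitational acceleration (m/s^2)
    """
    return 0.159 * math.sqrt(K * g / W)

# Purely illustrative values (not the pump of this study):
# K = 1.7e8 N/m and W = 20000 N give about 46 Hz.
print(reed_frequency(K=1.7e8, W=2.0e4))
```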
(4) The theory of structure optimization
Working out a design is like giving a definition of a system. Optimum design is a method by which the design can be worked out quickly and in the direction we desire, and the concept has become practicable thanks to the development of computers. In general, optimization means looking for a minimum value of an objective function, and structural optimization is one field of general optimization. It is called optimum structural design: structures are designed on the basis of structural analysis. The optimization of a structural design must satisfy some boundary conditions and then determine the variables that make the objective function minimum. This is expressed mathematically as follows:

minimize the objective function $F(X)$

subject to the constraints
$$g_j(X) \le 0, \quad j = 1, \ldots, m$$
$$h_k(X) = 0, \quad k = 1, \ldots, l$$
$$X_{\min} \le X \le X_{\max}$$

Here X is the variable, an N-dimensional vector. The objective function F(X) is commonly the tare weight when designing a structure. The g_j and h_k are called the behaviour constraints (an inequality and an equality constraint, respectively). The third constraint defines the boundaries of each variable and is called the side constraint. Constraints apply to the design when seeking the minimum value; for example, keeping the stress of a reinforcement within the allowable stress could be a constraint. The case with constraints is more complicated than the one without. In this paper, the optimization is carried out using a search method: if F(X) is the objective function to be optimized and X is the variable, the search method determines a search direction S_q satisfying the convergence condition, where q is the iteration number and α_q is the optimum step length. A first-order search method finds S_q using the first derivative, and there are many methods of this kind. The algorithm used in this paper is the BFGS (Broyden-Fletcher-Goldfarb-Shanno) method. BFGS is a variable-metric method that transmits the values of the previous calculation; that is, $S_q = -H\,\nabla F(X_q)$, where the initial H matrix is the unit matrix. The update is then defined as $H_{q+1} = H_q + D_q$, with

$$D_q = \frac{\sigma + \theta\tau}{\sigma^2}\, p\,p^{T} + \frac{\theta - 1}{\tau}\, H_q y\,(H_q y)^{T} - \frac{\theta}{\sigma}\left[\,H_q y\,p^{T} + p\,(H_q y)^{T}\right]$$

where the vectors p and y and the scalars σ and τ are

$$p = X_q - X_{q-1}, \quad y = \nabla F(X_q) - \nabla F(X_{q-1}), \quad \sigma = p^{T} y, \quad \tau = y^{T} H_q y.$$

If θ is zero, this is the DFP (Davidon-Fletcher-Powell) method; if θ is one, it is the BFGS method.
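The Broyden-family update written above translates directly into code. The following sketch is a generic illustration (θ = 0 gives DFP, θ = 1 gives BFGS) and is not the implementation used by the analysis programs in this study.

```python
import numpy as np

def broyden_update(H, x_new, x_old, g_new, g_old, theta=1.0):
    """Broyden-family variable-metric update H_{q+1} = H_q + D_q
    as written above (theta = 0 -> DFP, theta = 1 -> BFGS).

    H           : current approximation of the inverse Hessian
    x_new, x_old: successive design points X_q, X_{q-1}
    g_new, g_old: gradients of F at those points
    """
    p = x_new - x_old                 # p = X_q - X_{q-1}
    y = g_new - g_old                 # y = grad F(X_q) - grad F(X_{q-1})
    sigma = float(p @ y)              # sigma = p^T y
    tau = float(y @ H @ y)            # tau   = y^T H y
    Hy = H @ y
    D = ((sigma + theta * tau) / sigma**2) * np.outer(p, p) \
        + ((theta - 1.0) / tau) * np.outer(Hy, Hy) \
        - (theta / sigma) * (np.outer(Hy, p) + np.outer(p, Hy))
    return H + D

# The search direction is then S_q = -H @ grad_F(X_q), followed by a
# line search for the optimum step length alpha_q.
```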
2 MAIN SUBJECT

2.1 The object and its specification
The pump design was converted from 2D to 3D with the commercial program CATIA V5R18.
Figure 4. The vertical pump modeling
Table 1. The specification

PUMP            CAPACITY (GPM / m³/min): 240 / 0.912    DISCH. HEAD (ft / m): 175 / 53.34    REVOLUTION (RPM): 3474
MOTOR           POWER (HP/kW): 3                        VOLTAGE (V): -                       POLES (P): -
WEIGHT (lb/kg)  PUMP: 4237 / 1922    MOTOR: 518 / 235    BASEPLATE: 410 / 186    TOTAL: 5165 / 2343
2.2 The FEM model with MSC.PATRAN

Figure 5. The part above ground.
Figure 6. Impeller and its case.
Figure 7. Foundation and foot.
2.3 Material for each part

Table 2. The material properties

Model       Part Name             Material   Young's Modulus (GPa)   Poisson's Ratio   Density (g/cm³)
Pump parts  Casing & Impeller     SSC 13     210                     0.3               7.95
            Column Pipe & Shaft   STS 304    210                     0.3               7.95
            The others            GC 200     100                     0.3               7.51
Motor       Frame                 GC 150     100                     0.3               7.51
2.4 The cases and their analysis

Table 3. Change of thickness of the upper casing only

Case       Upper casing   Lower casing
Case_U_1   T10            T10
Case_U_2   T9             T10
Case_U_3   T8             T10
Case_U_4   T7             T10
Case_U_5   T6             T10
Case_U_6   T5             T10
Case_U_7   T4             T10
Case_U_8   T3             T10
Case_U_9   T2             T10
Figure 8. Case_U_1 mode shape.

Table 4. The natural frequency of the cases

Case       Frequency (Hz)
Case_U_1   46.193
Case_U_2   46.155
Case_U_3   46.101
Case_U_4   46.018
Case_U_5   45.885
Case_U_6   45.661
Case_U_7   45.192
Case_U_8   44.22
Case_U_9   42.086
The cases are then combined and analysed. The variables are the thickness of the upper casing and of the lower casing, plus the addition of ribs on the casing. For the casing, a thickness of 10 for both the upper and lower casing is the optimum, so ribs were then added to the casing, but they produce no dominant change.
Figure 9. The safe range for avoiding reed resonance.

The final target is to raise the pump's natural frequency up to 70 Hz, so other parameters that can be changed for the optimization need to be set up.
3 CONCLUSION

The operating frequency of the pump is 58 Hz (3474 rpm). The analysis result for the 1st mode is 46.193 Hz. This 1st mode should actually lie above the operating frequency of the pump: if it stays under the operating frequency, reed resonance can occur while the pump speeds up.
More parameters should therefore be set up; this is only the first step, and more parameter changes and analyses are needed. After the optimization of this structural part, the fluid part will be considered together with the structural parts.
Acknowledgments
This study was supported by the Second Phase of the BK21 Project (Eco-Friendly Heat and Cold Energy Mechanical Research Team).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DEVELOPMENT OF SPREADSHEET BASED DECISION SUPPORT SYSTEM FOR PRODUCT DISTRIBUTIONS
Ainul Akmar Mokhtar a, Masdi Muhammad a, Mohd Amin Abdul Majid a
a
Universiti Teknologi PETRONAS, Bandar Seri Iskandar,31750 Tronoh, Perak, Malaysia.
Product distribution is a complex process, as it involves meeting requirements from several stakeholders, including the distributors and the customers. The primary objective of the product distribution process is to meet the customers' demand while minimizing the cost incurred by the distributor. For distributors that support a large number of customers, commercial software packages for optimizing and scheduling product distribution are typically used. However, these systems are complex and costly and require long processing times on a dedicated computer system. Thus, such commercial packages are not practical for distributors that support a small number of customers, and for them the optimization and scheduling activities are usually done manually, based on rules of thumb; this process is time consuming and the results may not be optimal. This paper presents a decision support system employing a two-step sequential approach for product deliveries. The first step determines the optimum carriers required to meet customer demand using linear programming, with the objective function of minimizing the total distribution cost; the Premium Solver Platform (PSP) is utilized to model the optimization problem. The second step uses a multi-criteria decision-making approach, applying various physical and logistic rules to generate the carrier assignment and scheduling. Both approaches are developed using a spreadsheet because of its ease of implementation and low cost of ownership. The outcome indicates that this decision support system gives a better result than the manual assignment of carriers while minimizing the distribution cost. Furthermore, the system requires only a few minutes to generate the results and can thus be applied in practice. It is also shown that the system could be used as a viable planning tool for strategic decisions concerning investment in the number of carriers required to meet future demand.

Keywords: Multi-criteria decision making, spreadsheet model, product distributions
1 INTRODUCTION

As in almost all business environments with dynamic markets, information technology has become absolutely necessary to keep operations going. One area where the application of information technology is crucial is systems for product distribution. Product distribution is viewed as a complicated process, as it involves meeting several requirements from different stakeholders, including the distributors and the customers. An important and complex challenge in product distribution is to optimally assign the available carriers to deliver the products according to the customers' demand while minimizing the cost incurred by the distributors. In addition, the delivery plan also has to satisfy a number of practical constraints that include the distribution point time windows, the carrier capacity, and the compatibility between carrier and customer delivery points due to carrier size or carrier type, to mention just a few. For distributors that support a large number of customers, commercial software packages for optimizing and scheduling product distribution are typically employed. Nevertheless, these systems are complex and expensive and require long processing times on a dedicated and expensive computer system. Thus, these commercial packages are impractical for distributors that support a small number of customers, and for them the optimization and scheduling activities are usually done manually, based on rules of thumb and experience, with a pen and a piece of paper and often no special tool to visualize the schedule. This process is time consuming and the results may not be optimal. Therefore, there is a need to develop an optimization-based decision
support system for proper product distribution which is cheaper in terms of total cost of ownership, to cater for this group of distributors. In this paper, a decision support system employing a two-step sequential approach for product deliveries is presented. The first step determines the optimum carriers required to meet customer demand; this is done through linear programming with the objective function of minimizing the total distribution cost, and the Premium Solver Platform (PSP) is used to model the optimization problem. The second step uses the multi-criteria decision-making method, applying various physical and logistic rules to generate the carrier assignment and scheduling. Both approaches are developed using a spreadsheet because of its ease of implementation and low cost of ownership. The remainder of this paper is organized as follows. Section 2 gives an overview of the decision support system together with the two-step sequential approach, namely linear programming and multi-criteria decision making. Section 3 describes the design of the decision support system and its algorithms. Section 4 gives conclusions and some directions for future research and development of the DSS.

2 DECISION SUPPORT SYSTEM
A decision support system (DSS) assists management decision making by combining data, sophisticated analytical tools and user-friendly software into a single powerful system that can made semi-structured or unstructured decision making [1]. DSS provides users with a flexible set of tools and capabilities for analyzing important block of data. In this study, a two-step sequential approach is used to develop the DSS for scheduling of product distribution. In the first step, linear programming is utilized with the objective of determining the optimum carrier required to meet customers demand while minimizing the total distribution cost. The second approach is to use multi-criteria decision making approach to generate the carrier assignment and scheduling. 2.1 Linear Programming Linear programming (LP) is a widely used mathematical programming technique designed to assist operations managers in planning and making the decisions necessary to optimally allocate available resources. LP is designed to optimize a (single) linear objective function subject to a linear set of constraints where all model parameters are assumed to be known with certainty [2]. Over the past several decades, numerous applications for LP have been proposed for improving the efficiency of business operations. In this section, LP formulation is presented for the problem of distributing continuous products due to the fact that the carrier is assumed to be always full during the delivery. The following assumptions are being applied to the LP model: •
• Customers with the shortest distance will be prioritized.
• The total daily customers' demands cannot exceed the total available carriers' capacity.
• The demand of each customer must be met, i.e. half-full delivery is not allowed.
Let the following variables be defined as:
i = 1, 2, …, n - the available customers served by one distribution point
j = 1, 2, …, m - the carrier owners contracted to deliver the products
k = 15000L, 25000L - the two available carrier sizes
p = 1, 2, …, P - the product types
R_ip = rate of delivering product p to customer i
X_ijk = decision variables

The mathematical formulation of the problem of distributing the continuous products as an LP problem can be written as:

Minimize the total distribution cost

min Σ_{i=1..n} Σ_{j=1..m} Σ_{p=1..P} ( X_{ij,15000} R_{ip,15000} + X_{ij,25000} R_{ip,25000} )
subject to the following constraints:
• The daily demand of each customer must be met.
• The decision variables X_ijk must be binary, because orders cannot be split.
• The total number of selected carriers must be less than or equal to the number of available carriers.

2.2 Analytical Hierarchy Process
The analytical hierarchy process (AHP), developed by Thomas L. Saaty in the 1970s, is a structured method for dealing with complex decisions [3]. It is a multiple criteria decision-making technique that allows subjective as well as objective factors to be considered in the decision-making process. AHP is based on the following three principles: decomposition, comparative judgment and synthesis of priorities. It is applied around the world in many decision making situations such as in government, business, industry, healthcare and education. The steps in using the AHP are outlined as follows [4] (a numerical sketch is given after the list):
1. Model the problem as a hierarchy containing the decision goal, the alternatives for reaching the goal, and the criteria for evaluating the alternatives, as shown in Figure 1.
2. Establish priorities among the elements of the hierarchy by making a series of judgments based on pairwise comparisons of the elements. For example, when comparing the carrier size to deliver the products, the distributors might say they prefer the distribution cost over the location of the customer and the timing.
3. Synthesize these judgments to yield a set of overall priorities for the hierarchy. This would combine the distributor's judgments about location, price and timing for carriers A, B, C and D into overall priorities for each carrier.
4. Check the consistency of the judgments.
5. Come to a final decision based on the results of this process.
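A minimal numerical sketch of steps 2-4 follows. The three criteria, the pairwise judgments and the principal-eigenvector method are illustrative assumptions, not values from the paper; the consistency check uses Saaty's usual rule of thumb (CR below 0.1 is acceptable).

import numpy as np

# Hypothetical pairwise comparison of three criteria (cost, location, timing)
# on Saaty's 1-9 scale; the judgment values are invented for illustration.
A = np.array([
    [1.0, 3.0, 5.0],   # cost compared with (cost, location, timing)
    [1/3, 1.0, 2.0],   # location
    [1/5, 1/2, 1.0],   # timing
])

# Step 3: the principal eigenvector of A gives the priority weights.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()

# Step 4: consistency ratio from the maximum eigenvalue.
n = A.shape[0]
lambda_max = eigvals.real[k]
ci = (lambda_max - n) / (n - 1)
ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]   # Saaty's random index
cr = ci / ri

print("priorities:", dict(zip(["cost", "location", "timing"], w.round(3))))
print("consistency ratio:", round(cr, 3))

The same computation can be repeated one level down for the alternatives (carriers A, B, C, D) under each criterion, and the resulting local priorities weighted by the criterion weights to obtain the overall priorities of step 3.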
Figure 1: A simple AHP hierarchy [4]

2.3 Excel as a programming tool
In the past, mathematical programming problems were usually solved using special-purpose optimization software packages such as MATLAB. Nowadays, however, a general-purpose spreadsheet optimization modelling system, Premium Solver, is available as an add-in to Microsoft Excel. The widespread availability of Premium Solver in Excel has generated many optimization applications in both the private and public sectors, and Excel has become a commonly used tool in the mathematical programming field for solving LP problems [5]. Excel provides an ideal software platform for solving various types of multiple objective LP problems. Because Excel is a spreadsheet and data analysis tool, it has many useful built-in statistical and graphical capabilities. These features greatly facilitate the interpretation of scenario-based optimization results without having to export output data to other packages and
without having to learn new, specialized software packages. Excel also provides a visual development language, Visual Basic for Applications (VBA), in the same package as the spreadsheet. VBA provides a high degree of flexibility and control in creating a DSS, giving developers easy access to Excel's extensive collection of data analysis objects and tools for creating graphical user interfaces.
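For readers who prefer to see the Section 2.1 model expressed outside a spreadsheet, the sketch below formulates a toy instance with the open-source PuLP library rather than the Premium Solver Platform used by the DSS. All data (customers, owners, rates, demands) are invented, the product index is omitted for brevity, and only the demand, integrality and availability constraints are shown.

from pulp import LpProblem, LpMinimize, LpVariable, lpSum, value

# Hypothetical data: 3 customers, 2 carrier owners, 2 carrier sizes (litres).
customers = [0, 1, 2]
owners = [0, 1]
sizes = [15000, 25000]
rate = {(i, k): 100 + 10 * i + k / 1000 for i in customers for k in sizes}  # delivery cost per trip
available = {(j, k): 2 for j in owners for k in sizes}                      # carriers per owner and size
demand = {0: 15000, 1: 25000, 2: 15000}                                     # one full load per customer

prob = LpProblem("carrier_selection", LpMinimize)
x = {(i, j, k): LpVariable(f"x_{i}_{j}_{k}", cat="Binary")
     for i in customers for j in owners for k in sizes}

# Objective: minimise the total distribution cost.
prob += lpSum(rate[i, k] * x[i, j, k] for i in customers for j in owners for k in sizes)

# Each customer's demand is met by exactly one full carrier of sufficient size (orders cannot be split).
for i in customers:
    prob += lpSum(x[i, j, k] for j in owners for k in sizes if k >= demand[i]) == 1

# Selected carriers cannot exceed what each owner has available.
for j in owners:
    for k in sizes:
        prob += lpSum(x[i, j, k] for i in customers) <= available[j, k]

prob.solve()
print("total cost:", value(prob.objective))

In the actual DSS the same model is entered as spreadsheet cells and solved with Premium Solver; the sketch only illustrates the structure of the objective and constraints.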
3 DESIGN OF THE DECISION SUPPORT SYSTEM
This section illustrates how the DSS, employing a two-step sequential approach for product distribution, may be implemented in Excel.

3.1 Combining several products
Combining several products in one delivery is crucial for optimizing carrier scheduling as well as maximizing carrier utilization. Therefore, an algorithm was developed for automatically determining which products can be combined. The algorithm is based on the order numbers from each customer. An order number represents the demand size of only one product; in other words, one customer may have several order numbers in one day for the same product or for several products. The algorithm then determines whether several order numbers for the same customer can be combined, based on the maximum carrier size, and combines several order numbers in one carrier. In other words, the algorithm identifies the proposed carrier size for delivering products to one customer at minimum cost (a simplified sketch of this combination step is given at the end of Section 3.3).

3.2 Selection of optimal carrier size
For some distributors, as in the case at hand, several carrier companies help distribute the products. Usually, the contract given to these companies is based on a certain percentage depending on the number of available carriers. Therefore, an optimization model is developed to find the optimal number of carriers to deliver the products, meet the customer demand on that particular day and also meet the contractual percentages.

3.3 Carrier assignment and scheduling
The optimization model proposes the number of carriers to be used from each carrier owner. However, which specific carrier carries which order number(s) is not yet known. Therefore, several algorithms have been developed to produce the final output, which is the carrier schedule for that particular day. The algorithms are based on AHP. The first algorithm was developed for carrier assignment, where a carrier registration number is selected for a group of jobs based on several factors such as carrier type, single or multiple product capability and travelling time to each customer. The result is a sequence of jobs for each carrier. The job sequence is based on the travelling time, serving the shortest distance first. Figures 2 and 3 show samples of the algorithm.
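The combination step of Section 3.1 can be pictured with a small greedy sketch. The order volumes, the use of the largest carrier as the packing limit and the "smallest carrier that fits" rule are assumptions for illustration only; the actual DSS applies further physical and logistic rules before assignment.

# Greedy combination of one customer's order numbers into carrier loads.
# Order volumes and carrier sizes are hypothetical.
CARRIER_SIZES = [15000, 25000]  # litres; the largest size is used as the packing limit

def combine_orders(order_volumes):
    """Group order volumes so that each group fits the largest carrier."""
    limit = max(CARRIER_SIZES)
    groups, current, load = [], [], 0
    for vol in sorted(order_volumes, reverse=True):
        if load + vol <= limit:
            current.append(vol)
            load += vol
        else:
            groups.append(current)
            current, load = [vol], vol
    if current:
        groups.append(current)
    # Propose the smallest carrier that covers each combined group.
    return [(grp, min(s for s in CARRIER_SIZES if s >= sum(grp))) for grp in groups]

print(combine_orders([9000, 6000, 12000, 4000]))
# -> [([12000, 9000], 25000), ([6000, 4000], 15000)]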
Figure 2: Algorithm for selection of carrier registration number
Figure 3: Job assignment for each carrier

The second algorithm determines the estimated refilling time at the distribution point, the estimated arrival time at the destination and the estimated time to complete the job. The final algorithm checks the proposed times against the constraints (refilling time at the distribution point and opening and closing times at the destination). If all the constraints are satisfied, the job goes into the final schedule; if a job cannot meet the constraints, it is listed in the exception report. Figure 4 presents the schedule list, where all the jobs to be done by each carrier are shown. The user can also view additional information for each job, such as the refilling and arrival times, the product information and the expected distribution costs. Other 'views' that have been developed to help the planner understand and evaluate the schedule include:
• A Gantt chart, where the timing aspect is visualized. Using this view, the carriers' utilization as well as the idle time in the plan can easily be seen.
• Graphs showing the time utilization of each carrier, the total distribution costs, the breakdown of the cost for each distributor and the percentage of volume as per contract.
Figure 4: Schedule generated for each carrier on one day.

4 CONCLUSIONS
This paper presents a DSS for the distribution of continuous products employing a two-step sequential approach for product deliveries. Both steps are developed in a spreadsheet due to its ease of implementation and low cost of ownership. The outcome indicates that the DSS gives a better result than the manual assignment of carriers while minimizing the distribution cost. Furthermore, the system requires only a few minutes to generate the results and is thus very practical.
The DSS can also be used for strategic planning issues such as the evaluation of fleet size, concerning investment in the number of carriers required to meet future demand. To summarize, the DSS has the characteristics of an effective DSS:
• It supports, but does not replace, the decision-makers.
• It supports semi-structured decisions, where parts of the analysis can be calculated by the computer, but where the decision-maker's insight, experience and judgment are needed to control the process.
• It emphasizes ease of use, user friendliness, user control, flexibility and adaptability.
5 REFERENCES
1 Dey PK. (2004) Decision support system for inspection and maintenance: A case study of oil pipelines, IEEE Transactions on Engineering Management, 51(1), 47-56.
2 Cunha CB & Mutarelli F. (2007) A spreadsheet-based optimization model for the integrated problem of producing and distributing a major weekly newsmagazine, European Journal of Operational Research, 176, 925-940.
3 Saaty TL. (1980) The Analytic Hierarchy Process. New York, USA: McGraw-Hill Publications.
4 Analytical Hierarchy Process. http://en.wikipedia.org/wiki/Analytic_Hierarchy_Process
5 Novak DC & Ragsdale CT. (2003) A decision support methodology for stochastic multi-criteria linear programming using spreadsheets, Decision Support Systems, 36, 99-116.
Acknowledgements The authors gratefully acknowledge all reviewers for their valuable suggestions for enriching the quality of the paper. The support of Universiti Teknologi PETRONAS is greatly acknowledged.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE DIAGNOSTIC DECISION IN UNCERTAINTY CIRCUMSTANCES
Smalko Z., Woropay M., Żurek J. a
a Air Force Institute of Technology, Księcia Bolesława 6, Warsaw 01-485, Poland.
In the paper an attempt is made to describe selected dependability characteristics of technical objects. The creation of dependability-related states of an object is considered. A triangular probability distribution is used to describe in detail characteristics such as the inefficiency recognition threshold and the limiting value of the diagnostic parameter.
Key Words: Diagnostics, Dependability, Man-machine system

1 INTRODUCTION
Technical objects are referred to as man-machine systems, which are able to receive and process information [2,3,4]. Technical systems aim at maintaining the required state by working against and preventing the creation of disturbances. The systems receive signals and process them, producing specific reactions. A significant part of such operation is the process of recognizing the machine state after receiving information on the intensity of diagnostic parameters. The subject of this consideration is a mechanism for recognizing the dependability-related state of an object on the basis of the observed values of a diagnostic parameter. Formally, this parameter is a random variable, whilst physically it is a response time of the system to specific stimuli. Based on experimental results, damaged technical objects are known to produce a longer response time than undamaged ones, but over a certain range the response times coincide. Hence, parameter values within the joint range can be uniquely assigned neither to damaged nor to undamaged objects. To this end an inefficiency recognition threshold is determined. The recognition sensitivity of the object inefficiency state is influenced by the expected benefits of the diagnosis as well as by its consequences. The complex performance characteristic of the system for specific conditions is reflected by the relation between the characteristics for justified and unjustified warnings against efficiency loss [4,5].

2 ASSESSMENT OF MAN-MACHINE SYSTEM SENSITIVITY TO STIMULI
It is assumed for the method under consideration that the conditional probability density distributions of the diagnostic parameter for damaged and undamaged machines are known. The conditional probability density functions of the diagnostic parameter can be expressed, for example, as follows [7]:
f(x|z) = 0,                     if x ≤ a_z or x > b_z,
f(x|z) = A_z^(1) (x - a_z),     if a_z < x ≤ m_oz,          (1)
f(x|z) = A_z^(2) (b_z - x),     if m_oz < x ≤ b_z;

A^(1) = 2 / [ (m_o - a)(b - a) ],    A^(2) = 2 / [ (b - m_o)(b - a) ]          (2)

f(x|u) = 0,                     if x ≤ a_u or x ≥ b_u,
f(x|u) = A_u^(1) (x - a_u),     if a_u < x < m_ou,          (3)
f(x|u) = A_u^(2) (b_u - x),     if m_ou < x < b_u;

where:
f(x|z) - triangular probability density of the diagnostic parameter (Figure 1),
a_u, a_z - lower limits of the parameter ranges for damaged and undamaged objects respectively,
b_u, b_z - upper limits of the parameter ranges for damaged and undamaged objects respectively,
m_ou, m_oz - modal values of the parameter ranges for damaged and undamaged objects respectively.

In order to determine the equivalent limiting value of the diagnostic parameter x_g in the case of the triangular probability distribution we can use the following formula [7]:

x_g = ( A_u^(1) a_u + A_z^(2) b_z ) / ( A_u^(1) + A_z^(2) ),    m_oz < m_ou;    m_oz < x_g < m_ou          (4)
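Expressions (1)-(4) can be checked numerically. In the sketch below all range limits and modal values are invented; it evaluates the two triangular densities and the limiting value x_g at which the ascending branch for damaged objects intersects the descending branch for undamaged objects, so that f(x_g|u) = f(x_g|z).

def triangular_pdf(x, a, m, b):
    """Triangular density with lower limit a, modal value m, upper limit b."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return 2.0 * (x - a) / ((m - a) * (b - a))   # A^(1) (x - a)
    return 2.0 * (b - x) / ((b - m) * (b - a))       # A^(2) (b - x)

# Hypothetical response-time parameters: z = undamaged, u = damaged objects.
a_z, m_oz, b_z = 0.0, 2.0, 5.0
a_u, m_ou, b_u = 1.5, 4.0, 7.0

# Limiting value x_g from equation (4): intersection of A_u^(1)(x - a_u) with A_z^(2)(b_z - x).
A1_u = 2.0 / ((m_ou - a_u) * (b_u - a_u))
A2_z = 2.0 / ((b_z - m_oz) * (b_z - a_z))
x_g = (A1_u * a_u + A2_z * b_z) / (A1_u + A2_z)

print("x_g =", round(x_g, 3))
print("f(x_g|z) =", round(triangular_pdf(x_g, a_z, m_oz, b_z), 4))
print("f(x_g|u) =", round(triangular_pdf(x_g, a_u, m_ou, b_u), 4))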
Figure 1. An example of the determination of the limiting value of the diagnostic parameter x_g [7]. a_z, b_z - lower and upper limits of the parameter variation range for undamaged objects; a_u, b_u - lower and upper limits of the parameter variation range for damaged objects; m_ou, m_oz - modal values of the diagnostic parameter for damaged and undamaged objects respectively.

3 ASSESSMENT OF CONSEQUENCES RESULTING FROM RESPONSE OF MAN-MACHINE SYSTEM TO STIMULI

With regard to a specific diagnosis made in the method under consideration, the following outcomes are possible:
• accepting the true hypothesis, which is equivalent to a desirable warning about a hazard, resulting in the imposition of a justified ban on further operation of the object;
• accepting a false hypothesis, which is equivalent to a false alarm (a first type error), resulting in an unjustified ban on further operation of the object;
• rejecting the true hypothesis (a second type error), which is equivalent to inflicting damage on the object and the environment due to a false diagnosis being the basis of an unjustified permit for the operation of an unreliable object;
• rejecting a false hypothesis, which is equivalent to a right diagnosis of the lack of hazard being the basis of a justified permit for the operation of a reliable object.
In accordance with the rules of the statistical theory of decision making, a play-with-nature model can be used under specific circumstances. The play with nature is reflected by the process of random variations of the machine state. When the machine becomes inefficient, such a situation is identified as a failure [2,3,6]. The pay-off matrix (Table 1) is created with elements (numbers) assigned to usefulness (profit) and uselessness (losses) resulting from making specific decisions. The operational strategies of the man-machine system are referred to as decisions, and nature's "strategies" are referred to as reliability-related states. These are estimated by using a loss function L or an effectiveness function W.
Table 1. Pay-off matrix

                              State of object
Decision                      damaged       undamaged
Recognized damage             W_TU          -L_TZ
Not recognized damage         -L_NU         W_NZ
where:
W_TU - usefulness (profit) from accepting the true hypothesis, expressed as the savings resulting from preventing failure less the additional expenditures to reinstate the machine reliability,
W_NZ - usefulness (profit) from rejecting a false hypothesis, expressed as the revenues from the machine operation,
L_NU - uselessness (loss) from rejecting the true hypothesis, expressed as the damages paid after failure, including possible renovation of the damaged machine,
L_TZ - uselessness (loss) from accepting a false hypothesis, expressed as opportunities not taken and unnecessary expenditures to maintain the machine reliability.
In the method under consideration, decisions are expected to be made by using the Laplace-Bayes criterion. The operator can use either a conservative or a progressive strategy. The conservative strategy consists in accepting the hypothesis that damage and losses occur, regardless of whether or not a hazard appears. This strategy can be used when the expected losses are significantly larger than the profit.
E(T|x) = W_TU P(u|x) - L_TZ P(z|x),          (5)
where:
P(u|x) - conditional probability of the state that the machine is damaged when the diagnostic parameter is equal to x,
P(z|x) - conditional probability of the state that the machine is undamaged when the diagnostic parameter is equal to x,
E(T|x) - expected effectiveness of using the conservative strategy when the diagnostic parameter is equal to x.
The progressive strategy consists in rejecting the hypothesis that damage and losses may occur, regardless of whether or not a relevant hazard appears. This strategy can only be used when the expected profit is significantly larger than the losses.
E(N|x) = W_NZ P(z|x) - L_NU P(u|x),          (6)
where:
E(N|x) - expected effectiveness of using the progressive strategy when the diagnostic parameter is equal to x.

4 IMPACT OF MOTIVATION STRUCTURE ON STRATEGY CHOICE

It is easy to notice that the most useful recognition threshold b [2,7] is obtained when:
E(T|x) - E(N|x) = 0          (7)
Hence, after transformation [2,7], we obtain the following expression for the calculation of the most useful relative recognition threshold b in the play with nature:
v(x_g) = f(x_g|u) / f(x_g|z) = [ R(t_d) (W_NZ + L_TZ) ] / [ F(t_d) (W_TU + L_NU) ] = b          (8)
where:
b - relative recognition threshold, the motivation structure index,
v(x) - likelihood ratio of the object state recognition,
R(t_d), F(t_d) - values of the reliability and unreliability functions at the instant t_d when the object state is recognised, i.e. the probabilities that after the period of time t_d the objects are, respectively, undamaged or damaged,
t_d - the instant when the diagnosis is conducted.
The index b is a number representing the motivation structure, i.e. the relations between the usefulness and uselessness of the two strategies that can be used [2,6]. In the general case of the triangular probability distribution we cannot derive the limiting value of the parameter x_g in an algebraic way. Based on an overall consideration of the changes in the machine unreliability and the harmfulness of its responses, the following conclusions can be drawn. An increase in unreliability and harmfulness is followed by a rise in the effectiveness of the conservative strategy and, hence, a drop in the recognition threshold value. Conversely, a decrease in unreliability and harmfulness is accompanied by a drop in the effectiveness of the conservative strategy and, hence, an increase in the recognition threshold value. The permissible risk of rejecting a true failure hypothesis increases when there is a rise in the motivation index. The permissible risk of rejecting a false failure hypothesis increases when there is a drop in the motivation index below its equivalent value [2,7].
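A small numerical illustration of how equation (8) can drive the choice between the two strategies follows; the pay-off values, reliability figures and density parameters are assumptions of this sketch only.

def tri_pdf(x, a, m, b):
    """Triangular density with lower limit a, mode m, upper limit b."""
    if x <= a or x >= b:
        return 0.0
    return 2.0 * (x - a) / ((m - a) * (b - a)) if x <= m else 2.0 * (b - x) / ((b - m) * (b - a))

# Hypothetical inputs: density parameters for undamaged (z) and damaged (u) objects,
# pay-offs from Table 1 and reliability/unreliability at the diagnosis instant t_d.
a_z, m_oz, b_z = 0.0, 2.0, 5.0
a_u, m_ou, b_u = 1.5, 4.0, 7.0
W_TU, W_NZ, L_NU, L_TZ = 10.0, 4.0, 20.0, 2.0
R_td, F_td = 0.9, 0.1

# Relative recognition threshold b from equation (8).
b_threshold = (R_td * (W_NZ + L_TZ)) / (F_td * (W_TU + L_NU))

def warn(x):
    """True = ban operation (conservative strategy is at least as effective at this observation)."""
    f_u, f_z = tri_pdf(x, a_u, m_ou, b_u), tri_pdf(x, a_z, m_oz, b_z)
    if f_z == 0.0:
        return f_u > 0.0  # observation outside the undamaged range
    return f_u / f_z >= b_threshold

print("threshold =", round(b_threshold, 2))
print("warn(3.5):", warn(3.5), " warn(5.5):", warn(5.5))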
[Figure 2 plots F(x_g|u) against F(t) for recognition thresholds b = 1/2, b = 1 and b = 2.]
Figure 2. Performance characteristic of the machine system [7]

Rational state recognition consists in the choice of the most favourable relation between not making errors of one type and making errors of the other type [2,6]. We can determine the following expressions for the probabilities of:
• not making the second type error and making the right decision to ban operation and, respectively, making the first type error and making the wrong decision to ban operation:

R(x_g|u) = ∫_{x_g}^{b_u} f(x|u) dx,          R(x_g|z) = ∫_{x_g}^{b_z} f(x|z) dx.          (9)

• making the second type error and making the wrong decision to allow operation and, respectively, not making the first type error and making the right decision to allow operation:

F(x_g|u) = ∫_{a_u}^{x_g} f(x|u) dx,          F(x_g|z) = ∫_{a_z}^{x_g} f(x|z) dx.          (10)
Figure 3a illustrates the relation between the probability of not making the second type error and the probability of making the first type error. Figure 3b illustrates the relation between the probability of making the second type error and the probability of not making the first type error. The graph shown in Figure 3 is referred to as the Receiver Operating Characteristic (ROC) curve. For each fixed set of experimental conditions (e.g. a specific motivation structure and a hazard level) there is a specific recognition threshold b with the corresponding limiting value x_g of the diagnostic parameter. Therefore, an object receiving signals while operating under specific conditions has its corresponding point on the ROC curve. The curve illustrates the characteristic of correct warning accompanied by false alarming [1,2,3]. The complement of this characteristic illustrates the risk of failing to recognise the actual state of the object, accompanied by right diagnosis; it can be referred to as the reverse ROC curve. Both graphs present the object's sensitivity to the stimuli received and to the consequences resulting from responses to those stimuli.
R(x g |u) 1,0
1,0
0,8
0,8
0,6
0,6
0,4
0,4
0,2
0,2
0,0 0,0
0,2
0,4
0,6
0,8
0,0 0,0
1,0
R(x g |z)
0,2
Figure 3a
0,4
0,6
0,8
1,0
F(x g |z)
Figure 3b
Figure 3. Performance characteristic of the object: (a) probability of accepting the true failure hypothesis versus probability of accepting a false failure hypothesis, (b) probability of rejecting the true failure hypothesis versus probability of rejecting a false failure hypothesis.

5 CONCLUSIONS
The method under consideration is characterised by a distinction of the man-machine system's sensitivity to stimuli and by creating a basis for decision making under the influence of the expected benefits resulting from responses to the stimuli received. This complies with the specific nature of man-machine systems. The introduction of safety- and reliability-related aspects to the model is a new feature compared with the models developed so far under recognition theory.

6 REFERENCES
1 Coombs C.H., Dawes R.M., Tversky A. (1970) Mathematical Psychology. An Elementary Introduction. Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
2 Drobiszewski J., Smalko Z. (2004) Risk in the Aspect of Safety and Reliability of Autonomous System. Proceedings of PSAM/ESREL'04, Berlin.
3 Helstrom C.W. (1960) Statistical Theory of Signal Detection. Pergamon Press Ltd.
4 Mazur M. (1966) Cybernetic theory of autonomous systems. PWN, Warszawa.
5 Mothes J. (1967) Incertitudes et decisions industrielles. Dunod, Paris.
6 Smalko Z. (1995) Risk analysis of transport systems. Archives of Transport, vol 7, issue 1-4.
7 Smalko Z. (1997) Recognition of independent objects stimuli. Archives of Transport, vol 27, issue 1-2.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
EDUCATION AND TRAINING NEEDS IN MAINTENANCE
HOW YOU CONDUCT A SELF AUDIT IN MAINTENANCE MANAGEMENT
Dr. Yiannis Bakouros, Associate Professor∗, Dr. Sofia Panagiotidou, Visiting Lecturer∗, Cosmas Vamvalis, Managing Director∗∗
ABSTRACT
Effective maintenance strategies that meet the needs of specific business environments are needed in today's competitive business for SMEs aiming to achieve increased availability and performance. Maintenance managers and technicians can operate more effectively if they are adequately trained, resulting in a dramatic reduction in breakdowns, maintenance costs and defects, whilst realising an increase in resource availability and productivity. Unfortunately, training in maintenance has so far been fragmentary and superficial, relying mainly on theoretical approaches and therefore providing methodologies that are unfeasible to implement. The overall aim of this paper is to analyze the status of education and training on key maintenance principles, to present current needs and trends in this area, and to describe the specifications that should be followed to properly conduct a self-audit in Maintenance Management. The maintenance audit constitutes one of the first steps a company has to follow before launching a Maintenance Management policy. Furthermore, it is occasionally essential for the maintenance manager to take a step back and look at the overall progress being made; in short, to carry out a maintenance audit. An audit is a more questioning activity than monitoring performance indicators. The specifications given in this text constitute a general frame and sequence of steps that should be followed so that an audit in Maintenance Management can be performed, rather than a set of explicit and constraining guidelines.
∗ University of Western Macedonia, Department of Mechanical Engineering, Bakola & Sialvera, Kozani, 50100, Greece, [email protected], [email protected]
∗∗ Atlantis Engineering, 21, Antoni Tritsi, Pilea, Thessaloniki, Greece, [email protected]
KEYWORDS
Maintenance, Training, SMEs.

1. INTRODUCTION
Business performance drivers are similar for every organization: customers demand ever-increasing reliability, with faster lead and delivery times. In the absence of an improvement culture, difficulties arise because phrases such as 'world-class' and 'best practice' are not part of the SME vocabulary (Nelder et al, 2000). The demand for improvements in productivity, reductions in cost and shorter life cycles during the introduction of new products becomes more imperative day by day. European SMEs are currently facing very hard competition, and the need to improve their production systems in order to survive has become vital. It is against this competitive backdrop that the performance and availability of key resources and assets have started to be identified as an important factor contributing to improved productivity. Related to performance and availability, the UK Department of Trade and Industry (DTI) has given a comprehensive definition of the term maintenance: the management, control, execution and quality of those activities which will ensure that optimum levels of availability and overall performance of plant are achieved, in order to meet business objectives (Labib A.W., 2006). Therefore, an effective maintenance strategy, which meets the needs of specific business environments, is required for all European enterprises, and especially for SMEs, to achieve increased availability and performance. In today's global economy and fierce competition, quality has been recognized as the major edge for competitiveness and long-term profitability. The role of maintenance in this endeavour cannot be neglected. In general terms, equipment which is not well maintained and fails periodically experiences speed losses and/or a lack of precision and, hence, tends to produce defects. More often than not such equipment drives manufacturing processes out of control. A process that is out of control produces defective products and therefore increases the production cost, which reduces profitability and endangers the survival of the organization (Ben-Daya M., Duffuaa S.O., 1995). The focus of this paper is to embed the key principles of maintenance successfully within small and medium enterprises (SMEs), and it is expected that
when this is achieved, the impact upon their competitive position will be considerable. Maintenance of equipment does not mean fixing things that break. Maintenance means performing work regularly to keep machines, equipment and buildings in good condition and in working order. For example, planned preventive maintenance is less costly than reactive repair maintenance by a factor of 10 or more (Williamson, 2000). SME production workers and managers who have followed specifically designed training courses in maintenance will be able to operate more effectively in their specific arena, because they will see a dramatic reduction in breakdowns, maintenance costs and defects, whilst realising an increase in resource availability and productivity. In addition, companies implementing a maintenance strategy contribute indirectly to environmental protection by eliminating pollution and waste. These characteristics will ensure that the operating and financial performance of the organisation is improved, and these benefits lead to increased customer satisfaction, greater profit margins and repeat orders. Examples of the benefits that derive from exploiting maintenance strategies include: reduction in machine downtime (35-50%), reduction of process errors (35-50%), reduction of defectives (35-50%), reduction in maintenance cost (30-40%), machine availability increased by 5-10%, reduction of repair times (35-50%), reduction of work accidents (50-60%), reduction of pollution (40%), increased participation in new ideas (2-3 times) (Nakajima, 1988). Maintenance is inevitably a strongly dynamic process, and maintenance personnel require extensive training to be fully qualified. Equipment technology continuously evolves and new deficiencies arise, calling for more sophisticated maintenance techniques and multi-skilled workers. The efforts made so far on training in maintenance are fragmentary and superficial, using mainly a theoretical approach and therefore providing methodologies that are unfeasible to implement. This is exactly the gap identified and analyzed in this paper, by utilizing the theoretical knowledge of academic institutions, enriched with the valuable experience of consultants and company maintenance people, and by developing an integrated training model adjusted to the special needs and weaknesses of European companies in order to make them more competitive. The realization of the benefits arising from the implementation of maintenance strategies, through the use of strategic, financial and recording tools and of the way all these
should be adjusted according to each company's own needs (self-audit tools), will be the means to this achievement.
2. MAINTENANCE TRAINING PRACTICES
As mentioned previously, there is a big gap in the European vocational training area regarding maintenance, despite the fact that maintenance services are a key horizontal factor for the competitiveness of all manufacturing SMEs, which face great challenges, mainly from bigger multinational companies, in terms of productivity, quality, machine downtime, etc. According to many European studies, SMEs represent the main manufacturing capacity in Europe. They are an important element in most countries' economic development and employment base. In the UK, for example, there are 3.7 million businesses and of these only 24,000 were medium sized and less than 7,000 were large (Department of Trade and Industry, DTI, UK, 2000). Small businesses, including those without employees, accounted for over 99% of businesses, 38% of turnover and employment levels of over 55%. Much the same applies to almost every European country. That is why all E.U. members should design and provide the necessary training in areas supporting SME productivity. A recent report published by the European Network of Excellence (I*PROMS) - Project No. NMP2-CT-2004-500273 - related to taxonomy and state-of-the-art research in production and organization management (POM) emphasizes this argument and proposes the integration of vocational training activities for SMEs. National as well as EU policies focus on raising the human capital of European citizens and European SMEs, to support innovation and the creation of new knowledge. The emphasis on strengthening the maintenance and productivity capacity of organizations clearly reflects the Lisbon Objectives: through the development of maintenance management skills, this project will contribute to Europe becoming the most competitive and dynamic knowledge-based economy in the world; it will promote capabilities for SMEs and other organizations for achieving sustainable economic growth, create more and better jobs and establish social cohesion. SMEs are the backbone of the European economy. They are a key source of jobs and a breeding ground for business ideas. SMEs are considered a
main driver of innovation, employment as well as social and local integration in Europe. Therefore, the best possible environment for SMEs needs to be created. Action line 4, "Availability of skills", of the European Charter for Small Enterprises aims at ensuring that training institutions, complemented by in-house training schemes, deliver an adequate supply of skills adapted to the needs of small businesses, and provide lifelong training. It is argued that the success of research on intelligent maintenance-centred optimization systems should be measured by its relevance to practical situations and its impact on the solution of real maintenance problems (Labib A.W., 2006). Therefore, all national vocational training systems will benefit from the results of projects which provide a new, innovative, certified and integrated training framework (handbook, self-audit tools, pilot courses, case studies and a web site). The main contribution of one such project, TRAIN IN MAIN, an EU Leonardo financed project, to vocational training systems originates from its approach to the pre-training phase, during which companies learn how to implement the self-audit tools in maintenance. Even if there are programmes to foster vocational training in SMEs in Europe, there is a lack of training methods that take their specific needs into consideration prior to training. Training needs in most companies following traditional training courses are not analyzed and addressed sufficiently. To provide managers, engineers and technical personnel of European manufacturing firms with self-assessed, flexible and easily accessible tools for the analysis of their own and their staff's training needs, and to propose models to address specific conditions (knowledge gaps in maintenance, dysfunctional production processes due to bad maintenance policy, etc.), the Train in Main project introduces new, innovative tools and content into the vocational training systems. This project not only represents an independent contribution to established vocational training systems, but also directly addresses the common European goal of making SMEs and other organizations more competitive, developing people's skills and fostering lifelong learning. Although the project is organised in 8 work packages, the main work carried out within the project can be divided and presented in brief into the following 5 thematic actions:
1. Research and business literature review: The review records the practices, tools and methodologies at international level towards maintenance implementation, thus covering the state of the art in this area. This ensures the quality of the training material, which is developed at a later stage.
2. Training needs analysis and training model conception: The identification and determination of the implementation difficulties of training in maintenance gives the information needed (national reports and a synthesis report for all countries) for the configuration of the training model specifications and the methodology used later during the pilot phase. Attempts at maintenance implementation, experience and information from surveys, projects and other activities in this area are used, among others, as input towards this scope.
3. Training material and self audit tools development: The training model and the maintenance handbook delivered at this stage are the blending of the research and literature review, the training needs conception and the auditing of the existing performance in various companies for all countries. Easily performed self-audit tools are developed in order to examine the organization's level in the maintenance area (e.g. human resources sufficiency, organization schemes, existing procedures etc.) and the problems each company is dealing with, thus securing the feasibility and practicality of the training material. The training model developed for training workers and executives of the maintenance sector covers substantial needs and gives the opportunity of self-auditing the performance and progress of maintenance departments and maintenance staff (technicians, workers, managers). The experience of consultancy firms and academic organizations is valuable for the success of this phase. According to their know-how and expertise, all these partners contribute to the training syllabus in maintenance.
4. Pilot testing and amendment of the training material: Pilot seminars and workshops take place in each participating country, with content reformed from time to time, utilizing the information provided by the workshop participants and e-users (web site visitors). Discussion forums between executives, academics and consultants, even via the internet, can also turn out to be a valuable source of information. All these facilities are embedded in a web site and presented in a SWOT analysis report.
5. Dissemination and exploitation actions: All results and tools developed during the project are available to the public, thus securing their dissemination and the provision of continuous feedback from web site visitors. Here it should be stated that the implementation of a well-planned valorisation plan should be foreseen, in order to reach the best possible coverage, at both national and international level, of all the target groups as well as the potential beneficiaries of the project results.
Training within companies may take several forms, from rather informal, on-the-job training to more structured and formal approaches (Gillum, 1990). When training is unstructured, there is no written documentation of the training procedures, nor objective means to measure performance. Research shows that unstructured training results in decreased training efficiency and increased failures due to human errors (Jacobs and Jones, 1995). Thus, the emphasis here is placed on structured training approaches with stated objectives and learning activities. Adopting another point of view, training can be of one of the following three types:
• in-house training,
• training from outside institutions, and
• a combination of the previous two types.
Given the limitations of small and medium enterprises, training from outside institutions, which provide training packages with detailed instructions, constitutes the most appropriate option. Auditing should always be the first phase of a training program initiative since it provides valuable information concerning the current status within the organization. For the purposes of the "Train in Main" project an extensive questionnaire is provided for self-auditing, aiming to evaluate the organization's level in the maintenance area (human resources sufficiency, organization schemes, existing procedures and methodologies etc.) as well as the problems each company faces; at the same time it investigates the skill and knowledge level of maintenance workers, their ability to respond in critical conditions and their team spirit.
3. A STEP BY STEP DEVELOPMENT OF SELF-AUDIT TOOLS
Before the development and implementation of the self audit, and having identified the training needs of European SMEs through the training needs analysis, the training material could be developed. A maintenance training handbook should be composed of the following:
• Definition of Maintenance Management, main types of maintenance, etc.
• Types of Maintenance Implementation process and ways of its management
• Separate training chapters for each Maintenance Management Tool and Methodology, covering issues such as:
  o "what it is..",
  o "where it applies..",
  o "how it is applied..",
  o "benefits to gain / obstacles to overcome"
• Tests, summarising notes in presentation format, glossary, relevant references, links as well as real case studies from the literature.
• Horizontal issues for each chapter such as:
  o Success / failure factors
  o Glossary for each chapter
  o Advantages / disadvantages of Maintenance Management activity for SMEs, Public organisations, R&D organisations
To cover the whole range of training needs in maintenance fundamentals, tools and methodologies, the training material has been structured in four main sessions: 1. Maintenance fundamentals; 2. Work and material; 3. Measurement and improvement; 4. Computerized maintenance management system (CMMS). The training material of each session includes theoretical aspects as well as case studies and exercises to improve comprehension. Each session is further divided into several topics. More specifically:
• The first session deals with introductory subjects and tries to outline the basics of maintenance. Maintenance terminology, policies, goals and strategies are analytically presented in this session.
• The second session includes thematic areas such as work planning and scheduling, work execution and safety (performance of corrective or preventive maintenance, quality, work permits, reporting, safety), spare parts management (criticality analysis, physical locations, logistics) and condition based maintenance.
• The third session presents measurement indicators and improvement techniques. It covers availability performance (reliability, maintainability and supportability), key performance indicators, measuring and analyzing results, improvement techniques and learning from failures.
• Finally, the fourth session explores computerized maintenance management systems.

3.1. SELF AUDIT TOOLS
A Maintenance Management audit is the first major phase or step of a maintenance management initiative and it is used to provide a sound investigation into the company or organization's maintenance «health». The audit is a fact-finding analysis, interpretation and reporting activity, which includes a study of the company's information and maintenance strategies and its maintenance department structure. One or two extensive questionnaires, along with the necessary guidelines that will allow each interested company or organization to continue using them even after the end of the project, have to be developed. The first questionnaire should record the maintenance organization level of the company in general, including thematic areas such as:
  o Breakdowns recording,
  o Spare parts management,
  o Preventive Maintenance,
  o Total Productive Maintenance,
  o Indicators Measuring-Reporting etc.
The second questionnaire would examine how skilled and trained a worker is, how well he responds in critical conditions and whether he has team spirit, including
questions about: equipment cleaning, lubrication, alignment, filtration, operating and installation procedures, etc. It is also proposed that the maintenance handbook should not be too long, in order to provide a quick reference for everyone. However, special attention is given so that the training package does not lose its "educational" character due to the limitation in length. This can be achieved through the training in maintenance web site (www.traininmain.eu), which essentially complements the theoretical dimension of the handbook with more extensive reference information (i.e. 'who's who' directories for leading researchers and practitioners, web sites, research papers as well as practical presentations, case studies, discussion forums for Maintenance Management etc). Therefore, a maintenance audit aims:
• To provide a reference to training on the Maintenance Management Techniques needed
• To support the pilot training of target groups in Maintenance Management Techniques
• To stress the importance of being able to identify their company problems (Maintenance Management barriers) and make them able to understand what exactly to expect from Maintenance implementation in their company
• To increase awareness of workers as well as executives from manufacturing companies about Maintenance Management and the specific techniques used
• To assist them in "speaking the same language" about maintenance management
• To help a company set a Maintenance Management Strategy.

3.2. PREPARING FOR AN AUDIT ASSESSMENT
A questionnaire, presented in Annex A, was developed in the framework of the TRAININMAIN project. The questionnaire is proposed as a general model for the implementation of a maintenance audit process. As stated earlier, a Maintenance Audit cannot be done quickly or without difficulty. Also, to effectively finalise and customise it, managers and M.A. experts have to spend some time to
openly and honestly discuss all the company's important business issues, the environment it operates in, etc. In some parts of the questionnaire, some questions must be customized so as to properly address the company's current situation in terms of maintenance processes, policies, etc. In addition, in some cases the audit questionnaire can be changed (self-audit concept) depending on the priority areas identified in the initial preparatory steps (M.A. planning). Organizations interested in being self-audited have been identified during actions such as:
• the training needs analysis survey,
• the pilot workshop organized in each country,
• other project dissemination actions.
It is important that special care is taken to ensure that the right questions are asked of the right personnel in the company. A matching figure is proposed by A. Wilson (2002), where it is demonstrated how discussions could typically be held with people such as the Maintenance Manager, the Operations Manager, engineers and others. The outputs gained from this process are shown in Figure 2.

3.3. IMPLEMENTATION STEPS OF A MAINTENANCE SELF-AUDIT
The following steps, developed in the course of the "Train in Main" project, are proposed as a model for developing and implementing a maintenance audit.
Step 1: Planning the maintenance audit. Auditors should discuss the general concept of M.A. with SME managers during a briefing (visit or telephone call) to clarify the whole method, its aims and possible benefits for the SME. Before this briefing, the translated M.A. questionnaire should have been sent to the managers for a better understanding and a more fruitful discussion.
Step 2: Data collection. To reduce the time spent by the company's staff, it is suggested that the questionnaire responses (data collection) be given by the employees themselves and not through interviews. Once the questionnaires have been answered, they should be sent back to the M.A. team.
Step 3: Preparation of the report. The maintenance audit report is prepared. The report should present the current status of the company regarding maintenance issues and areas for improvement, according to the results of the audit. The structure of such a report is given in Annex B.
Step 4: Presentation of the M.A. report. The auditor submits the audit report to the company managers and asks for further feedback/comments. Any feedback (positive or negative) given here can be used to improve the self-audit tool.

3.5 RESULTS AND CONCLUSIONS
As a main outcome of the above described exercise, a closing meeting should always be held between the appropriate manager(s) and the auditor. During the meeting the audit findings are presented and discussed. The aim of the meeting is to agree the outcome and determine what action is proposed to correct any discrepancies. It is important to remember that the auditor's brief may not include the determination of future actions (unless he or she is a consultant who has been asked to advise on recommended actions). The implications of any development or improvement projects should be evaluated at this stage, and the scope, benefits, resource needs and costs for making the changes should be thought through and evaluated. Everyone involved should then be properly informed of the outcome. Having established the auditing procedure, the recommendations made should be followed up; the adherence to the new methods should be noted; and their success in removing the non-compliance and improving performance should be re-audited. These follow-up audits are only intended to check on those areas where non-compliance or developments needed to be addressed. Some companies carry out as many as three full audits each year. Moreover, the Train in Main exercise demonstrated clearly that the training model and material are a blend of the theoretical knowledge of academic institutions and the valuable experience and practical expertise of consultants and companies. Thus, the provided training reflects the state-of-the-art research findings in
maintenance management, complies with the well-established standards for training and suggests specific tools and methodologies that can be easily implemented within companies. The training model developed for training workers and executives of the maintenance sector covers substantial needs and gives the opportunity of self-auditing the performance and progress of maintenance departments and maintenance staff (technicians, workers, managers). More specifically, the integrated training package consists of several tools enabling the trainees as well as the companies to carry out a complete training program, and this program is primarily based on the results of the self-audit tool presented in this paper.
Figures

Figure 1: Content of the training material
1. Maintenance fundamentals: Terminology. Policies, Goals, Strategies.
2. Work and Material: Work planning and scheduling. Work execution and safety. Condition based maintenance. Spare parts management.
3. Measurement and Improvement: Availability performance. Key performance indicators. Measurement and analysis. Fault finding techniques. Improvement techniques. Learning from failures.
4. CMMS: Computerized maintenance management systems.
Figure 2: Understanding the maintenance operations and targets (Wilson, 2002)
References
1. Ben-Daya, M., Duffuaa, S. O., 1995. "Maintenance and quality: The missing link". Journal of Quality in Maintenance Engineering, 1 (1), 20-26.
2. Department of Trade and Industry (DTI), 2000, UK, Quarterly report on small business statistics, Bank of England.
3. Gillum, D. R., 1990. "Training trends and practices". ISA Transactions, 29 (3), 1-8.
4. Jacobs, R. L., Jones, M. J., 1995, "Structured on-the-job training: Unleashing employee expertise in the workplace". Berrett-Koehler Publishers, San Francisco.
5. Labib, A.W., 2006, Enterprise Resource Planning. Maintenance and Asset Management Systems - Are they a blessing or a curse?, Plant Maintenance, Business Briefing: Oil & Gas Processing Review.
6. Nakajima, S., 1988, "Introduction to TPM: Total Productive Maintenance". Productivity Press Inc.
7. Nelder, G., S. Childe, 2000, "The Impact of research on manufacturing SMEs." Industry and Higher Education.
8. Williamson, R.M., 2000, "Facing a Famine in the workforce: How employers and business leaders can overcome the shortage of skilled manufacturing and maintenance employees".
9. Wilson, A., "Asset Maintenance Management. A guide to developing Strategy and Improving Performance", Chapter 35: Maintenance Audits, Industrial Press Inc, New York, 2002.
ANNEX A: TRAININMAIN - Self Audit Tool in Maintenance Management
Dear Sirs,
The current audit tool has been developed in the framework of TRAIN IN MAIN, a European pilot Leonardo project funded by the European Commission, D.G. Education & Culture. The TRAIN IN MAIN consortium consists of 8 partners from 6 European countries: Greece, Sweden, U.K., Lithuania, Latvia and Bulgaria. The project aims to develop training material and training methodologies on Maintenance Management using e-learning technologies. The Maintenance Audit is a methodology in which many characteristics and parameters related to maintenance are recorded for each factory. A Maintenance Management audit is the first major phase or step of a maintenance management initiative and it is used to provide a sound investigation into the company or organisation's maintenance «health». The audit is a fact-finding analysis, interpretation and reporting activity, which includes a study of the company's information and maintenance strategies and its maintenance department structure.

Session A: General Data
Company's data:
A1. Country: ………………………………
A2. Company name *: ………………………………
A3. Sector (Metals, Textiles, Food, etc): ………………………………
A4. Number of Employees: ………….
A5. Number of Technicians: ………...
Respondent's data:
A6. Respondent name *: ………………………………
A7. E-mail *: ………………………………
A8. Job position: Production Manager / Maintenance Manager / Other (please explain: ………………………………)
* Optional
Session B: Audit
There are 4 main categories about maintenance (1. Maintenance Fundamentals, 2. Work and Material, 3. Measurement & Improvement, 4. Computerised Maintenance Management Systems). The aim of the next questions is to identify and characterise in which of the above categories your company's main gaps are.

1. MAINTENANCE FUNDAMENTALS
1.1 TERMINOLOGY
Do you know what Predetermined Maintenance is according to the European standard (EN 13306)? [Yes / No]
Do you know what Deferred Maintenance is according to the European standard (EN 13306)? [Yes / No]
1.2 POLICIES, GOALS, STRATEGIES
Do you have written policies for maintenance? [No / Moderate / Detailed]
Do you have a written agreement with the production about the goals to achieve for the maintenance department? [No / Moderate / Detailed]

2. WORK AND MATERIAL
2.1 WORK PLANNING & SCHEDULING
Do you use a work order system? [Never / Sometimes / Always]
What percentage of the work tasks, in maintenance, is prepared before the task is carried out? [0% / 50% / 100%]
2.2 WORK EXECUTION & SAFETY
What percentage of your employees in the workforce is competent for the tasks they are asked to perform? [40% / 70% / 100%]
Have you special instructions for safety assurance regarding the work force and any third parties? [No / Moderate / Detailed]
2.3 CONDITION BASED MAINTENANCE
How much of the preventive maintenance activities are Condition Based? [0% / 30% / 60%]
Have you purchased any Condition Monitoring Equipment (that you are using)? [None / Few / Many]
2.4 SPARE PARTS MANAGEMENT
Do you classify the spare parts with respect to importance and/or criticality? [None / Some / All]
Do you have rules for the ordering points? [No / Moderate / Detailed]

3. MEASUREMENT & IMPROVEMENT
3.1 AVAILABILITY PERFORMANCE
Do you monitor the three elements that have an influence upon the availability performance (reliability, maintainability, supportability)? [No / Moderate / Detailed]
If yes, please specify the formula and the actual (average) result.
3.2 KEY PERFORMANCE INDICATORS
Do you use key performance indicators? [None / Few / Many]
How often are you analysing the results? [Never / Rarely / Frequently]
3.3 MEASURE & ANALYSE RESULTS
What are you measuring to control the performed activities in maintenance?
How often are you analysing these results? [Never / Rarely / Frequently]
3.4 FAILURE LOSS PREVENTION & FAULT FINDING TECHNIQUES
Do you perform failure loss prevention? [Never / Rarely / Frequently]
Are you using a step-by-step rule for fault finding? [Never / Rarely / Frequently]
3.5 IMPROVEMENT TECHNIQUES
How many improvements in maintenance and availability performance are approved per year? [None / Few / Many]
Are you using specified rules in the preparation of improvements? [Never / Rarely / Always]

4. COMPUTERISED MAINTENANCE MANAGEMENT SYSTEMS
Do you have a computerised maintenance management system? [No / Yes]
For which type of activities are you using the CMMS (e.g. record maintenance actions, preventive maintenance, spare parts)?
What are the major maintenance problems that you have encountered throughout your experience?
What would you do to improve your maintenance management?

The TRAIN IN MAIN project team
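Purely as an illustration of how Session B answers could be condensed into the "gaps per category" picture the questionnaire aims at, the sketch below maps each ordinal answer to a 0-2 score and averages per category. The scoring scale and the sample answers are assumptions of this sketch, not part of the TRAIN IN MAIN tool.

# Hypothetical scoring of Session B answers: 0 = weakest option, 2 = strongest.
SCALE = {"No": 0, "Never": 0, "None": 0, "0%": 0, "40%": 0,
         "Moderate": 1, "Sometimes": 1, "Rarely": 1, "Few": 1, "Some": 1, "50%": 1, "30%": 1, "70%": 1,
         "Detailed": 2, "Always": 2, "Frequently": 2, "Many": 2, "All": 2, "100%": 2, "60%": 2, "Yes": 2}

# Sample answers keyed by (category, question label); values invented for illustration.
answers = {
    ("1. Maintenance fundamentals", "written policies"): "Moderate",
    ("1. Maintenance fundamentals", "goal agreement"): "No",
    ("2. Work and material", "work order system"): "Sometimes",
    ("2. Work and material", "spare part criticality"): "None",
    ("3. Measurement & improvement", "KPIs"): "Few",
    ("4. CMMS", "has CMMS"): "No",
}

def category_scores(answers):
    """Average the 0-2 scores per category; lower averages point to larger gaps."""
    totals, counts = {}, {}
    for (category, _), answer in answers.items():
        totals[category] = totals.get(category, 0) + SCALE[answer]
        counts[category] = counts.get(category, 0) + 1
    return {c: round(totals[c] / counts[c], 2) for c in totals}

for category, score in sorted(category_scores(answers).items(), key=lambda kv: kv[1]):
    print(f"{score:.2f}  {category}")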
ANNEX B:
Maintenance Management Self Audit report structure
Introduction (1st part)
Very brief description of all Maintenance Management Audit actions, including: organizing the audit (which company, why it was interested in a further Maintenance Management action, how it was contacted, what its problem symptoms were, when the audit took place, duration, general opinion (positive or negative)). Please include some photos from your visit to the company. Text length: 1-2 paragraphs (0.5 page).
Data Analysis (2nd part)
Follow the structure of the Maintenance Management Audit questionnaire and provide some comments for each question cluster. Example: if both 3.4 questions (there are 2 questions in 3.4) were answered "Never", then a reasonable conclusion would be: "According to the current maintenance status of company X, it seems that the company has problems preventing failure loss and there is no detailed fault finding technique." Text length: 1-1.5 pages.
Proposals (3rd part)
Conclusions based on: comments from the audited maintenance manager (given orally or in writing), data analysis, general audit experience, problems that occurred, acceptance of the results by managers, etc. Provide some proposals as next steps, based on the conclusions given previously, for example: "Start a Maintenance Management Policy, on which the company has to focus and which the company can test." Text length: 0.5-1 pages.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ILEARN2MAIN – AN ELEARNING SYSTEM FOR MAINTENANCE MANAGEMENT TRAINING
Christos Emmanouilidis a, Ashraf Labib b, Jan Franlund c, Maria Dontsiou d, Laila Elina e and Mirabella Borcos f
a ATHENA Research & Innovation Centre, CETI, Comp. Sys. & Applications Department, 58 Tsimiski St., Xanthi, Greece.
b Portsmouth Business School (PBS), University of Portsmouth, UK.
c UTEK AB, the Swedish Maintenance Society, Sweden.
d Atlantis Engineering SA, Greece.
e Latvia Technology Park, Latvia.
f CNIPMMR, The Romanian SME Association, Romania.
Maintenance training and educational activities are increasingly exploiting technological innovations. Desktop and web-based e-learning applications offer academics and industrialists new tools to raise maintenance-related knowledge and competence. One recent initiative in this direction is the iLearn2Main project. The project employs e-learning technologies, offering customized maintenance management training, while facilitating the standardisation of competencies assessment and learning evaluation. The maintenance curriculum was designed by taking into account competence requirements for maintenance management professionals and a user survey across several EU countries. This paper provides an overview of the iLearn2Main development and system. Key Words: Maintenance Management, e-Learning, Competence Assessment, Learning Management Systems 1
INTRODUCTION
Maintenance Engineering and Maintenance Management competences are necessary for modern enterprises [1]. Although maintenance has been researched and taught as a subject for a long time, ever more demanding market needs and rapid technological advances require a thorough treatment of Education and Training in Maintenance. Enabling technologies in the form of e-Learning, mobile learning and virtual training expand the toolset available to Maintenance training. They also have the potential to provide a structured and objective way of assessing competences, facilitating skills and competences recognition [2]. One recent initiative in this direction is the iLearn2Main project (www.ilearn2main.eu). The project employs e-learning technologies, offering customized maintenance management training, while facilitating the standardisation of competencies assessment and learning evaluation. The Maintenance Management training curriculum was designed on the basis of the EFNMS Specification about the Competence Requirements for Maintenance Management. A user survey was conducted in 5 EU countries to identify learner and trainer needs for an eLearning system in Maintenance Management. The findings of the survey analysis have been taken into account in guiding the curriculum structure design. The curriculum largely comprises training courses related to the Basic Activities of the Maintenance function, as well as subjects related to the Improvement of the Maintenance function [1]. These are structured in three main course themes, namely ‘Asset Care’, ‘Asset Performance Evaluation’ and ‘Management/Economy of Assets’. e-Learning modules are being built on all the aforementioned course themes. These are uniformly structured, comprising fundamental course identification information, course theory, practical tips and case studies. The eLearning system
is built on the Open Source Learning Management System platform, Moodle, and the learning courses are being packaged as individual and reusable learning objects. A key requirement is related to formalizing Maintenance Management qualifications in such a way as to facilitate the standardization of competence assessment of personnel. In the EU, the EFNMS has produced specific guidance to this purpose in the form of Requirements and Rules to achieve a Certificate as a European Expert in Maintenance Management, as well as Regulations for the EFNMS Certificate as a European Maintenance Specialist. Building e-tools for Maintenance Management competence assessment can offer significant help towards satisfying such requirements. The e-Competence Assessment tool currently under development in iLearn2Main may function as a standardized means of carrying out the assessment of Maintenance Management qualifications. Taking advantage of the Learning Management System features, the e-Competence Assessment tool is blended with the Learning Objects in the sense that it can be used to provide feedback and guide the eLearning process. On the basis of the iLearn2Main Maintenance Management e-Learning platform, specific examples of the course structure and content delivery are provided, along with examples of the competence assessment functionalities. The evaluation and piloting of the iLearn2Main tools are currently under way in several EU countries.
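By way of illustration, the following minimal sketch shows what packaging a course as a reusable, SCORM-style learning object can look like. The module titles, file names and identifiers are hypothetical placeholders; this is not the project's actual packaging code, only a sketch of the general idea of a manifest listing course items and their resources.

```python
# Minimal, illustrative sketch: generating a SCORM-style imsmanifest.xml
# for a course packaged as a reusable learning object. Titles, file names
# and identifiers are hypothetical placeholders.
import xml.etree.ElementTree as ET

def build_manifest(course_id, lesson_titles):
    manifest = ET.Element("manifest", identifier=course_id)
    organizations = ET.SubElement(manifest, "organizations", default="org1")
    organization = ET.SubElement(organizations, "organization", identifier="org1")
    resources = ET.SubElement(manifest, "resources")
    for i, title in enumerate(lesson_titles, start=1):
        # Each item in the organization points to a web-content resource.
        item = ET.SubElement(organization, "item",
                             identifier=f"item_{i}", identifierref=f"res_{i}")
        ET.SubElement(item, "title").text = title
        resource = ET.SubElement(resources, "resource", identifier=f"res_{i}",
                                 type="webcontent", href=f"lesson_{i}.html")
        ET.SubElement(resource, "file", href=f"lesson_{i}.html")
    return ET.ElementTree(manifest)

tree = build_manifest("asset_care_demo",
                      ["Preventive and inspection activities",
                       "Repair techniques and methods"])
tree.write("imsmanifest.xml", encoding="utf-8", xml_declaration=True)
```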
2
IDENTIFYING THE TRAINING OBJECTIVES
2.1 Target Groups & VET objectives The iLearn2Main project aims to deliver Maintenance Management training using e-Training tools. Specifically the training target groups comprise both trainers and trainees: Teachers/Trainers
Personnel involved in maintenance-related training
Learners
Managerial personnel: their needs are determined by the necessity to make rational decisions related to maintenance of industrial equipment and human resources allocation.
Senior engineering personnel: their needs are related to the need to make appropriate choices to ensure adequate maintenance policies and technological solutions, as well as to devise, plan and oversee the implementation of appropriate maintenance policies and actions.
Other technical personnel: their needs refer to ensuring adequate technical knowledge and skills to efficiently carry out planned maintenance tasks, or to perform rapid maintenance-related audits on industrial machinery.
Training Objectives
In Maintenance Management training, the trainers should be able to bring enough knowledge and experience to the learners, so that they will be able to fulfil their basic function in maintenance satisfactorily. The fundamental areas of knowledge should cover the management capabilities of having the basic activities in place and in proper use, as well as the usage of analytical procedures for improvement of the maintenance function.
Figure 1. Maintenance Management Competences (hierarchical levels range from having the right knowledge for running a basic Maintenance Function, through having the right knowledge for improvement of a basic Maintenance Function, to having the knowledge and experience for developing the Maintenance Function to World Class)
The knowledge and competences can be illustrated in a hierarchical structure as seen in Figure 1 [1].
2.2 Survey and Analysis of Training Needs The iLearn2Main project opted to engage relevant stakeholders early on in the project. This was performed in a number of ways:
Meetings with business clientele and research/academic liaisons
Participation in info-days, workshops, conferences
User survey in UK, Sweden, Latvia, Romania and Greece
Specifically, the user survey involved discussions and interviews with relevant stakeholders, distribution of project brochures and completion of a lengthy user survey questionnaire. Two versions of the questionnaire have been developed:
one for the teachers and trainers in maintenance-related training
one for managerial personnel, senior engineering personnel and other technical personnel
The questionnaires comprised questions on the following:
the respondent’s background, working situation and experience
the respondent’s knowledge in the field of maintenance
the respondent’s wishes for areas in the field of maintenance to learn more about
basic computer use and literacy questions, as well as questions aimed at identifying the likely adoption prospects of an e-learning system.
The project collected 70 complete survey responses from managerial and maintenance engineering personnel (teachers & learners), as seen in Figure 2. It is worth focusing on some statistics from the survey. Specifically:
49% of those who completed the questionnaire are Senior Engineers
31% of maintenance learners are Managerial Staff
12,7% of learners are Technicians
Regarding the learner interviewees' experience in maintenance issues, the following can be observed:
67% of learners have worked in maintenance for less than three years and may not have sufficient expertise.
33% have been in the field of maintenance for more than three years and are consequently more experienced.
Figure 2. Survey respondents' profile
From the collected responses, it is particularly interesting to note that the teachers/trainers have between little and moderate knowledge in:
Procurement, selling of service
Laws and regulations
On average, they seem to have at least moderate or much knowledge in the other areas. The learners have little knowledge in:
Economical control, LCC, LCP
Laws and regulations
On average, they seem to have at least moderate knowledge in the other areas. Although one might expect that managers would rank their training in maintenance lower than technical staff would, both managers and engineers (including technicians) ranked their training in maintenance at the same level. A significant proportion (25%) believe that their training is inadequate (low), which should be regarded as acceptance on their part that they need further training in maintenance issues. As this is a snapshot of the respondents' own view, the actual level of inadequate training among maintenance-related personnel might be even higher.

It is of particular interest to note that the managers' answers on 'economical control' were surprising: only 13% have "very much" knowledge on economical issues and another 13% know "much" on the same subject. In fact, a very large proportion (44%) know no more than "little" on the subject, something that is not consistent with their function and highlights a very clear training need. It is therefore suggested that courses on economic issues should not only cover basic issues (for technical staff) but also more specific issues (for managers).

The fact that engineering and technical personnel chiefly do not consider that they have at least "much" knowledge on maintenance subjects is interesting. Perhaps learners with a technical background consider that they have much practical knowledge on the subjects but lack the necessary theoretical background. There may be an alternative explanation too: as they are directly involved with the technical difficulties of maintenance, they understand the considerable challenges involved and so provide more 'reserved' responses. A further interesting finding is that 41% of engineers and technicians do not know more than "little" about laws and regulations (this proportion for managers is just 13%). Nonetheless, knowledge on this subject is mandatory to become a Maintenance Specialist according to the EFNMS. Yet only 13% of those who responded identified 'Laws and Regulations' as a preferred topic for training. This is a clear indication that learner preferences should be interpreted with great caution and should be looked at together with the respondents' stated knowledge on the subject and the objective requirements to become a Maintenance Specialist. In general, it can be said that engineers and technical personnel consider that they have much room for improvement and training on maintenance issues, compared to managerial personnel.

It was of particular interest to focus not only on the requirements for Maintenance Management training, but also on the adoption prospects of an e-Training system for Maintenance Management. An especially high proportion (94,5%) use a computer on a daily basis and believe themselves to be "very much" familiar with computers (81,8%). Furthermore, 100% of the interviewees responded that they expect to benefit "much" (40%) or "too much" (60%) from a computer-based automated learning platform. There were no negative or indifferent replies to this question. These responses bode well for the potential future use of e-training for Maintenance Management. The survey responses were taken into account in designing the Maintenance Management Training curriculum, described in the next section.
3
MAINTENANCE CURRICULUM AND COURSES DEVELOPMENT
The iLearn2Main maintenance curriculum was structured on the basis of the specified VET objectives and the survey findings (Table 1), as well as by taking into account previous related activities in this area. The training content development process had as a key objective to provide both the theoretical background that trainees should possess in Maintenance Management, as well as the practical skills needed to implement the maintenance function. It is very important for a Maintenance Manager to have knowledge about the maintenance terms and their definitions, in order to avoid misunderstandings in the oral and written communication within the maintenance function – internally in the company as well as with external suppliers or customers. Correct and formal definitions are required to understand the maintenance terms used in requirements, specifications, instructions, contracts and associated maintenance standards in the maintenance function in general. Therefore, it has been necessary to produce a comprehensive structured generic maintenance glossary containing the main terms and their definitions.
Table 1. iLearn2Main Maintenance Management Learning Curriculum

Category 1 – Performed activities on the assets (Asset Care):
1.1 Maintenance involvement in design, procurement and operation of assets
1.2 Preventive and inspection activities
1.3 Repair techniques and methods
1.4 Goal, strategies, results
1.5 Work execution

Category 2 – Asset Performance Evaluation:
2.1 Analysis of the technical performance of the assets
2.2 Condition monitoring
2.3 Measurements
2.4 Information systems

Category 3 – Management/Economy of Assets:
3.1 Maintenance concepts (Dependability / Availability Performance)
3.2 Analysis of the economical results
3.3 Documentation
3.4 Laws and regulations
3.5 Determination of human & material resources
The iLearn2Main Glossary was produced using several sources so as to be as objective, comprehensive and globally acceptable as possible. Maintenance terms included in the standard "EN 13306: Maintenance Terminology" were used, as well as the "Terminology" module from the TrainInMain European project and other published papers, studies and scientific internet portals relating to maintenance terminology. The maintenance terms included in the Glossary were then enriched during the training content development for each course, a process that resulted in an extended maintenance glossary.

The main training content is structured in three basic course categories, each of which is composed of individual courses, following the curriculum structure. The courses were first assigned for development to the maintenance experts of the iLearn2Main partnership. The assignment procedure was a combination of the partners' preferences and the assignment proposal by Atlantis Engineering, who led the content development effort. In order to achieve a uniform format for the courses, a template was produced that specified the structure, the content as well as the style of writing. The course template follows the basic structure seen next:
1. Introduction (1.1 Objectives, 1.2 Learning Outcome, 1.3 Summary, 1.4 Prerequisites / Related Topics, 1.5 Keywords)
2. Theoretical Background (2.1 Prerequisites, 2.2 Main part, 2.3 Review Questions)
3. Implementation (3.1 Action plan, 3.2 Success factors, 3.3 Review Questions)
4. Case Studies
5. Assessment Questions
6. Glossary
7. List of References
The course template foresees both review / comprehension and independent assessment questions. The review questions are placed in between the training content and are used for testing comprehension while learning. Some review questions are placed at the end of the training course in order to provide an overall test of how effective the learning was. The assessment questions belong to a different batch of questions and are used for the Assessment test (e-Assessment tool).
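As a rough illustration of how the uniform course template above could be represented programmatically, the sketch below models it as a simple data structure; the field names and the example values are assumptions made for illustration, not the project's actual schema.

```python
# Illustrative sketch only: one possible representation of the course
# template, so that every course carries the same section layout.
# Field names and example content are assumptions, not the iLearn2Main schema.
from dataclasses import dataclass, field

@dataclass
class CourseTemplate:
    title: str
    objectives: str = ""
    learning_outcome: str = ""
    summary: str = ""
    prerequisites: list = field(default_factory=list)
    keywords: list = field(default_factory=list)
    theory: str = ""                                          # 2.2 Main part
    review_questions: list = field(default_factory=list)      # in-lesson comprehension checks
    action_plan: str = ""                                      # 3.1 Action plan
    success_factors: list = field(default_factory=list)
    case_studies: list = field(default_factory=list)
    assessment_questions: list = field(default_factory=list)   # separate batch for the e-Assessment tool
    glossary_terms: list = field(default_factory=list)
    references: list = field(default_factory=list)

# Hypothetical usage:
course = CourseTemplate(title="Condition monitoring",
                        keywords=["vibration", "diagnostics"])
print(course.title, course.keywords)
```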
Finally, a quality control / review process for the training content was included. The quality control procedure that was performed for each course can be described as follows: for each training module made available by the author, a quality check was performed by an appointed reviewer. Then the final training module content was produced, after taking into account and acting upon the review comments.
4
THE LEARNING MANAGEMENT SYSTEM
4.1 The Learning Management System
E-learning has redefined the way education is provided in schools, academia and industry. It is defined as a technological, organizational and management system that enables and facilitates web-based learning. E-learning users, both trainers and learners, are offered integrated solutions which facilitate authoring, structuring and delivering educational content, as well as assessing the educational outcome. Such solutions are termed Learning Management Systems (LMSs). Most current LMSs include functionality to handle lesson content, often in the form of Learning Objects (LOs). A Learning Object is an entity that can be used, reused or referenced in e-learning.

There are many obvious advantages in web-based learning compared with conventional training. Typically, e-learning enables anyone with authorised access to participate in the learning process anytime and anywhere. Apart from this flexibility, e-learning is usually associated with lower costs, compared to engaging a qualified teacher. On the other hand, e-learning involves extra costs in producing the actual educational content, as well as in customising it for the e-learning environment, while effort is needed for the setup and customisation of the learning environment itself.

When considering the choice of an LMS to be employed for developing a Maintenance Training Toolkit, there are several options available. Comparisons have been made against a single criterion, such as adaptation capacity [4], functional assessment or SCORM conformance [5], or against multiple criteria [6-8]. Evaluations consider both the viewpoint of those concerned with producing an LMS solution for a specific application, as well as that of specific targeted teacher and learner user groups. LMS developers are concerned with the features an LMS offers for customising, extending, deploying, upgrading or migrating the content from one platform to another. Users, on the other hand, are more concerned with the features offered to support the learning process, as well as with the system usability.

The iLearn2Main e-training toolkit employs the open source Moodle LMS platform. Moodle is an acronym for "Modular Object-Oriented Dynamic Learning Environment". Most evaluations cited above consider Moodle to be among the top LMSs in terms of reusability, accessibility, adaptability, affordability, durability and interoperability. Furthermore, it is used and supported by a wide user community, providing new versions, FAQs and tutorials, offering SCORM support, while it remains a free and open source solution. Among other features, Moodle enables us to:
Design lessons with text, graphics, animations and video
Incorporate comprehension and final assessment questions
Define custom learning paths and pre-requisites for lessons
Define meta-courses, which are aggregations of courses for specific subjects
Include dynamic Glossaries with terminology
The iLearn2Main courses have been developed and deployed in the Moodle platform that was set up and customised at CETI/R.C.Athena to fit Maintenance Management training needs. The Learning system resides inside the project site, which is accessible at www.ilearn2main.eu. When a user first arrives at the project site, he is presented with a list of the offered learning material, as shown in Figure 3. Users can access the course summaries and check for other users online. If the user selects a course, or clicks on the login hyperlink, he is presented with a login form where he has to enter his credentials. Inside a course, under the course title, there are the different parts that comprise an iLearn2Main course (Figure 4):
Course modules in the centre of the screen, including Lessons, Glossaries and References. This is the training content.
Links to other participants in the course to facilitate communication, if needed.
Links to activity types in the course for easier navigation.
Direct access to the student history.
List of all other courses the student has enrolled to.
Latest news and events relevant to the course, i.e. uploading of a new module.
Figure 3. ILearn2Main training home page
Figure 4. Example of iLearn2Main course page
The course content includes pointers to references so that trainees can seek additional information or resources, should they wish to do so. References can be accessed as a separate web page, which is convenient for direct linking from the courses and also serves as a collective reference for external material. Adequate training requires familiarisation with the typical terms relevant to the course content. This is supported by the use of an e-glossary. The e-glossary provides links to definitions for all the terms that have been used inside the course. The Glossary is integrated with the training content so as to provide direct and easy access to any of its terms. These terms are automatically linked everywhere they occur in the lessons, and comprise a full and analytic reference guide of all maintenance terms.
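To make the idea of automatic glossary linking concrete, the sketch below wraps every occurrence of a glossary term in lesson HTML with a link to its definition. The URL pattern and the term-to-slug mapping are assumptions for illustration; in the actual system this linking is provided by Moodle's own glossary auto-linking, not by custom code like this.

```python
# A minimal sketch of automatic glossary linking, assuming a hypothetical
# /glossary/<slug> URL scheme. Moodle performs equivalent linking through
# its built-in glossary auto-link filter.
import re

def link_glossary_terms(html, glossary):
    # glossary: term -> slug. Longest terms first, so multi-word terms win.
    for term in sorted(glossary, key=len, reverse=True):
        pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
        html = pattern.sub(
            lambda m, t=term: f'<a href="/glossary/{glossary[t]}">{m.group(0)}</a>',
            html)
    return html

glossary = {"preventive maintenance": "preventive-maintenance",
            "maintainability": "maintainability"}
print(link_glossary_terms(
    "Maintainability drives preventive maintenance planning.", glossary))
```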
4.2 Maintenance Management Competence Assessment In modern industry it is of fundamental importance that personnel involved in the maintenance function have the right knowledge and skills to perform their intended function. This is necessary in order to attain excellence in productivity and cost effectiveness for the benefit of the company. It is of particular interest to note that training is important at different stages for all professionals, before and during their working life, as seen in Figure 5 [1].
Figure 5. Maintenance Competence Assessment

The training content, ported in the form of Learning Objects and delivered to the trainees through the LMS, is an essential tool for delivering Maintenance Management e-training. One of the key advantages of e-training, as opposed to conventional training, is the ability to customise training according to the circumstances and the progress achieved by each learner. In that way it is possible to personalise e-training on the basis of user roles, performance and the recognition of knowledge and competence gaps. This is not to say that conventional training cannot aim at achieving personalisation. Nonetheless, the added value offered by the ability to automatically process training history data and the trainee's interaction with the LMS is that e-training can make a systematic, automated and independent assessment of each individual training case and thus seek to customise the offered training accordingly. Through the use of comprehension and review tests, it is possible to identify knowledge and skills gaps. Furthermore, it is possible to divert the learning sequence to better address the identified gaps, by encouraging trainees to focus their training on areas where they seem to need additional training. In this way different trainees can follow distinct training paths, making the whole training procedure more efficient and tailored to individual needs. It is worth noting that conventional training requires much more substantial effort by the trainers to achieve a similar level of customisation; thus the LMS, equipped with knowledge testing, becomes a powerful tool for trainers too, in their function of providing adequate training. Maintenance Management training involves knowledge and skills which are multi-disciplinary by the very nature of the Maintenance Management function. Therefore, the ability to offer this level of customisation through e-training and knowledge assessment tools is particularly beneficial for the delivery of Maintenance Management training. A typical example of testing during the e-training is shown in Figure 6.

While knowledge testing constitutes an essential function that is blended with the e-training delivery and facilitates efficient e-training and personalisation of the way it is delivered to each trainee, there is a clear need for an independent assessment of Maintenance Management competences, in order to support an objective assessment of these competences. In iLearn2Main, this independent assessment is designed to be offered by a separate, stand-alone tool. The underlying design consideration is that a number of tests are created to cover the breadth of the iLearn2Main curriculum courses. These tests are placed in a competence assessment test pool and the e-Assessment tool randomly picks a subset of them each time, so as to offer a different competence assessment test. Although the choice is random, care is taken so that the chosen tests cover the range of topics and courses that are deemed essential to successfully perform the Maintenance Management function (Figure 7).
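The selection idea described above, random draws from a pool constrained so that essential topics are always covered, can be sketched as follows. The pool contents, topic names and the number of tests per assessment are illustrative assumptions, not the actual iLearn2Main implementation.

```python
# Minimal sketch: draw tests at random from a pool, but guarantee at least
# one test per essential topic so the assessment covers the curriculum breadth.
import random

def pick_assessment(pool, essential_topics, n_tests, seed=None):
    rng = random.Random(seed)
    chosen = []
    # First guarantee coverage: one randomly chosen test per essential topic.
    for topic in essential_topics:
        candidates = [t for t in pool if t["topic"] == topic and t not in chosen]
        if candidates:
            chosen.append(rng.choice(candidates))
    # Then fill the remaining slots at random from the rest of the pool.
    remaining = [t for t in pool if t not in chosen]
    rng.shuffle(remaining)
    chosen.extend(remaining[:max(0, n_tests - len(chosen))])
    return chosen

pool = [{"id": i, "topic": topic}
        for i, topic in enumerate(["Asset Care", "Asset Performance Evaluation",
                                   "Management/Economy of Assets"] * 4)]
print(pick_assessment(pool, ["Asset Care", "Management/Economy of Assets"],
                      n_tests=5, seed=1))
```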
Figure 6. Example of training review testing.
Figure 7. The e-Assessment Tool
5
FUTURE STEPS The next steps in the development of the iLearn2Main e-training for Maintenance Management are:
Enriching the training content with additional training modules
Completing the development of the e-Assessment tool
Piloting and Evaluation
Training courses. All training courses specified in the iLearn2Main Maintenance Management Training Curriculum will be included in the iLearn2Main e-training platform. The curriculum can expand to incorporate additional training content, case studies and tests. Additional courses can also be included. In this way iLearn2Main may offer courses covering both the essential knowledge for basic Maintenance Management training and the improvement of the Maintenance Function. In the future it may be beneficial to employ as much multimedia training content as possible, in the form of images, audio, video or animation. Such content can offer an enhanced training experience and greater user engagement during the training process.

e-Assessment Tool. The e-Assessment Tool is offered separately from the e-training toolkit. It comprises hundreds of questions and tests on the taught subjects. A subset of tests is chosen in random order each time, so that different competence assessment tests can be offered. As tests are linked to specific taught module subjects, the e-Assessment tool is employed not only for overall competence assessment, but also to identify specific weaknesses of the assessed trainee, so as to guide future training on specific Maintenance Management subjects. The e-Assessment tool will be extended to include tests from all courses included in the iLearn2Main curriculum.

Evaluation and piloting. iLearn2Main employs ex-ante and ex-post evaluation, so as to engage stakeholders at different stages. The ex-ante evaluation focused mainly on performing a user survey, while the ex-post evaluation comprises different piloting activities, such as training workshops. The piloting events seek participation and feedback from both academia and industry. Participants have the opportunity to use the Learning and the e-Assessment Toolkit. Evaluation questionnaires are being designed to assemble feedback from participating stakeholders. This feedback will be particularly useful in driving improvements and next development steps for the e-Training and e-Assessment toolkits. Piloting activities organised as workshops are intended to include presentations, interactive training, case study analysis and tests.
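Since tests are linked to module subjects, identifying a trainee's weaknesses amounts to aggregating results per module and flagging modules below a threshold. The sketch below illustrates this; the threshold value and data layout are assumptions, not the actual e-Assessment tool logic.

```python
# Illustrative sketch: aggregate assessment answers per course module and
# flag modules whose score falls below a (hypothetical) 60% threshold,
# so that further training can be targeted.
from collections import defaultdict

def weak_modules(answers, threshold=0.6):
    # answers: list of (module, is_correct) tuples from one assessment run
    correct = defaultdict(int)
    total = defaultdict(int)
    for module, ok in answers:
        total[module] += 1
        correct[module] += 1 if ok else 0
    scores = {m: correct[m] / total[m] for m in total}
    return {m: s for m, s in scores.items() if s < threshold}

answers = [("Condition monitoring", True), ("Condition monitoring", False),
           ("Laws and regulations", False), ("Laws and regulations", False),
           ("Work execution", True)]
print(weak_modules(answers))   # modules to revisit, with their scores
```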
6
REFERENCES
1. Franlund, J. (2008) Some European Initiatives in Requirements of Competence in Maintenance, Proc. of CM-MFPT 2008, 5th Int. Conf. on Condition Monitoring & Machinery Failure Prevention Technologies, 15-18 July 2008, Edinburgh, UK.
2. Emmanouilidis, C., Papathanassiou, N. and Papakonstantinou, A. (2008) Current trends in e-training and prospects for maintenance vocational training, Proc. of CM-MFPT 2008, 5th Int. Conf. on Condition Monitoring & Machinery Failure Prevention Technologies, 15-18 July 2008, Edinburgh, UK.
3. Bakouros, Y. and Panagiotidou, S. (2008) An analysis of maintenance education and training needs in European SMEs and an IT platform enriching maintenance curricula with industrial expertise, Proc. of CM-MFPT 2008, 5th Int. Conf. on Condition Monitoring & Machinery Failure Prevention Technologies, 15-18 July 2008, Edinburgh, UK.
4. Graf, S. and List, B. (2005) An Evaluation of Open Source e-Learning Platforms Stressing Adaptation Issues, 5th IEEE International Conference on Advanced Learning Technologies (ICALT 2005), Kaohsiung, Taiwan, IEEE: 163-165.
5. Garcia, F. B. and Jorge, A. H. (2006) Evaluating e-learning platforms through SCORM specifications, IADIS Virtual Multi Conference on Computer Science and Information Systems (MCCSIS 2006), IADIS.
6. Itmazi, J. A. and Gea, M. M. (2005) Survey: Comparison and Evaluation Studies of Learning Content Management Systems, MICROLEARNING 2005: Learning & Working in New Media Environments, International Conference, Innsbruck, Austria.
7. Kljun, M., Vicic, J., et al. (2007) Evaluating Comparisons and Evaluations of Learning Management Systems, Proceedings of the ITI 2007 29th Int. Conf. on Information Technology Interfaces, Cavtat, Croatia, IEEE.
8. Itmazi, J. A. and Gea, M. M. (2005) Survey: Comparison and Evaluation Studies of Learning Content Management Systems, MICROLEARNING 2005: Learning & Working in New Media Environments, International Conference, Innsbruck, Austria.
Acknowledgments The authors wish to acknowledge the financial support received through the UK/07/LLP-LdV/TOI-004 project iLearn2Main, which is a collaboration of the Univ. of Portsmouth, the Athena Research & Innovation Centre, ATLANTIS Engineering, the Swedish Maintenance Society UTEK, the Latvia Technology Park and the CNIPMMR National Council of Small and Medium Sized Private Enterprises. Particular thanks are due to Nikos Papathanassiou at CETI/R.C.Athena for his work on the Learning Management System implementation.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
TRAINING AND CERTIFICATION OF MAINTENANCE AND ASSET MANAGEMENT PROFESSIONALS Jan Franlund Chairman of the EFNMSvzw European Certification Committee, ECC, Stockholm, Sweden
This paper discusses in particular the requirements of knowledge for Maintenance Management and the certifications connected to them. However, the aspects of training and validation will also be covered. The connection with Asset Management will be mentioned, as well as the present international interest in certification all over the world. Key Words: Maintenance Management, Asset Management, Certification, Validation, Competence Requirement 1
BACKGROUND
The European Federation of National Maintenance Societies – the EFNMSvzw – consists at the moment of 21 members. The EFNMSvzw is a non-profit organization. The objective of the EFNMS is the improvement of maintenance for the benefit of the peoples of Europe. In order to pursue its goals, the EFNMS shall be an umbrella organization for the non-profit National Maintenance Societies in Europe. The EFNMS shall, amongst other activities: co-ordinate maintenance matters between European National Maintenance Societies and establish contacts with Maintenance Societies outside of Europe; study and introduce good maintenance management and promote and initiate maintenance techniques and processes; provide direction in the development of maintenance; and draft maintenance guidelines. Some of these activities are related to Training, Validation and Certification in maintenance, something that is at the moment attracting increased interest in Europe and, as it seems, all over the world.
2
COMPETENCE REQUIREMENTS The EFNMS has specified the competence requirements for Maintenance Management. The development of that specification has been made by maintenance experts from the Member Societies of the EFNMS.
Figure 1. The overall Competence Requirements for Maintenance Management. The figure groups the requirements into four areas: Management and Organisation (goal, strategies, results; organisation, competence; procurement, selling of service; guiding, control, analysis; economical control, LCC, LCP; material handling, logistics), Reliability Performance of Production Plants (definitions; measurements, mathematical formulas; requirements, control, analysis; design, procurement, operation; laws, regulations), Maintenance Information Systems (planning, ordering, analysis; documentation; information systems; technical/economical analysis) and Maintenance Methods and Techniques (remote control; condition monitoring; preventive activities; repair techniques and methods).
Management and Organisation
Very good knowledge in:
• How to set up a company management policy in order to be able to participate in its definition as far as maintenance is concerned
• How to formulate the maintenance policy within a company
• How to formulate the maintenance goals
• Different maintenance strategies and how to choose the right strategy
• How to specify the requirements for the maintenance activities
• How to organize the maintenance activities, how to choose a suitable organization and assure the right competence within the organization
• How to determine the human and material resources in order to implement the organization
• How to assure (by maintenance activities) the health and safety and the right environment conditions (inside and outside the company)
• How to guide, control and analyse the maintenance activities
• How to develop and use key-figures for the economical control
• LCC/LCP techniques/methods
• Logistics support, material and store handling, methods for spare part calculations
• How to measure and analyse the results of the maintenance activities, e.g. efficiency and economy
• The maintenance activities in the development and procurement of new production equipment
• How to define the future maintenance needs of a company
Good knowledge in:
• How to define and implement human resources development policy

Figure 2. Competence Requirements in Management and Organisation
Availability Performance of Production Plants
Very good knowledge in:
• Reliability
• Maintainability
• Supportability
• Availability
• Improvements of availability performance
Good knowledge in:
• The mathematical and statistical formulas to be used in the specifications and for verifications
• Human reliability
• Production safety
• Risk analysis

Maintenance Information Systems
Very good knowledge in:
• Maintenance Management Information Systems (key-figures, guidance tables and so on)
Good knowledge in:
• Maintenance Information Systems (for planning, work-order, technical/economical analysis, and so on)
• Technical documentation/information systems
• Technical process control systems

Maintenance Methods and Techniques
Good knowledge in:
• The theory of the failure patterns
• Types of wear and tear
• Improvement techniques (aiming at reducing failure rates and down times)
• Preventive techniques
• Inspection techniques
• Condition monitoring techniques
• Methods of life extensions
• Measurement methods
• Control systems

Figure 3. Competence Requirements in three areas
Management and Organisation
• How to formulate the maintenance goals:
  - to describe the general requirements for maintenance goals
  - to describe the process of the development of maintenance goals
  - to give examples of maintenance goals
  - to describe the relationship between goals and policy
• How to assure (by maintenance activities) the health and safety and the right environment conditions (inside and outside the company):
  - to describe different conditions in the production equipment that may cause risks for health, safety and the environment (inside and outside the company)
  - to describe the possibility to prevent such incidents by maintenance activities, including co-operation with other departments in the company and external parties

Availability Performance of Production Plants
• Maintainability:
  - to understand that this has to do with active time for maintenance
  - to be able to define maintainability
  - to describe some different measures of maintainability (e.g. MTTR, M, etc.)
  - to be able to calculate the maintainability
  - to describe which time elements are included and not included in the calculation (e.g. preparation time, functional check-out, waiting for resources)
  - to be able to analyse what causes the length of active maintenance times

Figure 4. Detailed Competence Requirements in some areas
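As a worked illustration of the maintainability items listed above, the sketch below computes MTTR from a set of active repair times and then a maintainability value M(t). The repair records are hypothetical, and the exponential repair-time distribution is only an illustrative assumption, not a requirement of the competence specification.

```python
# Worked sketch: MTTR as the mean of active repair times (waiting and
# preparation time excluded), and M(t) as the probability of completing
# a repair within time t, assuming exponentially distributed repair times.
import math

active_repair_hours = [2.0, 3.5, 1.0, 4.5, 2.5]   # hypothetical records
mttr = sum(active_repair_hours) / len(active_repair_hours)

def maintainability(t_hours, mttr):
    # M(t) = 1 - exp(-t / MTTR) under the exponential assumption
    return 1.0 - math.exp(-t_hours / mttr)

print(f"MTTR = {mttr:.2f} h")
print(f"Probability of restoring within 4 h: {maintainability(4.0, mttr):.2%}")
```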
The specification can be used in a number of ways:
• as a base for the syllabuses and the training in the school systems and in post-graduate training;
• as a guideline for an analysis of the present competence among the maintenance professionals in a company, thereby identifying competence gaps to be filled by additional training;
• as the base for the EFNMS certification of maintenance professionals.
Figure 5. The usage of the Competence Requirement Specification
3
CERTIFICATION
A certification of competence consists of a set of steps to be performed. In summary, these steps are:
1) The Competence Requirement Specification;
2) The Rules for the Certification of the Competence, including the minimum test results which have to be achieved in order to be in conformity with the Competence Requirement Specification;
3) The Development of the Questions to be answered in the test;
4) The Evaluation of the Results for each candidate;
5) The Issuing of the Certificates.
The different possibilities:

1. Exam with diploma issued by the National Society:
   1) "Management and Organisation" and "Maintenance Information Systems" (40/55 p), 4 hours
   2) "Reliability Performance of Production Plants" and "Methods and Techniques" (30/45 p), 3,5 hours
   An EFNMS controller is present during the exam and the markings.
   Possible result: PASSED EXAM

2. National Certificate issued by the National Society:
   3) Passed exam for items 1 and 2 above
   4) Five years of practical experience within maintenance, of which two years in a managing position. At least one year of this should have occurred during the last 18 months.
   Possible result: "NATIONAL EXPERT IN MAINTENANCE MANAGEMENT"

3. European Certificate issued by the EFNMS:
   5) Passed 1 – 4 above
   6) Exam in maintenance terms in the English language, 1 hour (performed the same day as 1 and 2 above, and with the same controller)
   Possible result: "EUROPEAN EXPERT IN MAINTENANCE MANAGEMENT"

Figure 6. The Management Exam and the different result achievements
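A small sketch of the pass/fail logic implied by Figure 6 is given below, assuming that the "40/55 p" and "30/45 p" notation means a minimum score out of a maximum for each exam part and that both parts must be passed; the candidate scores are hypothetical.

```python
# Illustrative sketch only: checking exam results against the minimum
# scores of the two written parts (assumed reading of "40/55 p", "30/45 p").
PARTS = {
    "Management and Organisation + Maintenance Information Systems": (40, 55),
    "Reliability Performance of Production Plants + Methods and Techniques": (30, 45),
}

def passed_exam(scores):
    # scores: mapping of part name -> points achieved
    return all(scores.get(part, 0) >= minimum
               for part, (minimum, _maximum) in PARTS.items())

candidate = {
    "Management and Organisation + Maintenance Information Systems": 42,
    "Reliability Performance of Production Plants + Methods and Techniques": 31,
}
print("PASSED EXAM" if passed_exam(candidate) else "NOT PASSED")
```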
Part 1: "Management and Organisation" + "Maintenance Information Systems"
1. (4 points)
   a) Formulate the maintenance goals for a company and explain the ideas behind the chosen wordings.
   b) Give at least five examples of maintenance strategies that will support the maintenance goals.
2. (8 points) The way the maintenance activities are organised is essential for the need of, and the possibility to, measure the results.
   a) Describe 4 different possibilities to organise the maintenance activities.
   b) How will you develop and assure the right competence within each of these 4 organisations?
   c) For each of the 4 organisations, describe:
      - the need to measure certain parameters
      - the possibilities to measure these parameters
      - the possibilities to measure problems, and give examples of the two most essential parameters for each of the organisational alternatives

Figure 7. Sample questions from earlier exams
4
VALIDATION, EXAMINATION AND CERTIFICATION OF MAINTENANCE TECHNICIAN SPECIALISTS
Since 2001 the EFNMSvzw has also specified the Competence Requirements regarding Maintenance Technician Specialists. These requirements have, among other things, also been used for the EFNMSvzw examination since then.
The five parts of the examination:
Part 1 (60 questions in 1 hour): 1.1 Work Planning; 1.2 Team Working and Communication; 1.4 Information Technology; 1.5 Training and Instructions; 1.6 Quality Assurance (Systems); 1.7 Environment; 1.8 Automation
Part 2 (60 questions in 1 hour): 2.1 Maintenance Objectives and Policies; 2.2 Maintenance Concepts; 2.3 Restoration Techniques; 2.4 Maintenance Terminology; 2.5 Contracts; 2.6 Laws and Regulations; 2.7 Condition Monitoring; 2.8 Fault Finding Techniques; 2.9 Improvement Techniques; 2.10 Documentation; 2.11 Spare Part Management; 2.12 Materials Technology
Part 3 (30 questions in ½ hour): 1.3 English language
Part 4 (1 hour): 3.1 Practical Computer Handling
Part 5 (1 hour): 3.2 Practical Fault Finding

Figure 8. The content of the exams for Maintenance Technician Specialists

A computer-based validation tool has also been developed for the Maintenance Technician category. This tool is used not only for examination, but also for validation of the existing competence of a Maintenance Technician. Thereby it is possible to discover any gaps in the competence, which can provide information for tailored, individual training.
5
RESTRICTIONS
One important circumstance for the certification activity is the ISO standard ISO/IEC 17024:2003, which states that the certification body shall not offer or provide training.
EN ISO/IEC 17024:2003 The certification body shall not offer or provide training, or aid others in the preparation of such services, unless it demonstrates how training is independent of the evaluation and certification of persons to ensure that confidentiality and impartiality are not compromised.
Figure 9: The ISO rules
6
FUTURE TRENDS
There is an ongoing discussion between Europe, Australia and South America about common certification processes in Maintenance and Asset Management. At the same time, it has been noted that the EFNMSvzw certification has the best reputation worldwide. Asset Management certification currently has two main directions: one concerns a certification possibility for a company, and the other concerns functions/individuals within the company. At a meeting in May 2009 the EFNMSvzw developed the following proposal for a definition of Asset Management: "The optimal life cycle management of physical assets to sustainable achieve the stated business objectives."
Figure 10. The possibilities of certification in Asset Management (certification of companies versus certification of functions and individuals within the company)
7
REFERENCES:
1. The EFNMSvzw Competence Specifications at www.efnms.org / publications
2. The EFNMS European Certification Committee, c/o Mr. Jan Franlund, Box 10231, 10055 Stockholm, Sweden; phone: +46-8-664 09 25; e-mail: [email protected]
3. The ISO Standard ISO/IEC 17024:2003
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
EDUCATION IN INDUSTRIAL MAINTENANCE MANAGEMENT: FEEDBACK FROM AN ITALIAN EXPERIENCE
Marco Macchi a, Stefano Ierace b
a Department of Management, Economics and Industrial Engineering – Politecnico di Milano, Piazza L. da Vinci 32 – 20133 Milano, Italy (E-mail: [email protected])
b University of Bergamo, CELS - Research Center on Logistics and After Sales Service, Department of Industrial Engineering, Viale Marconi, 5 - I - 24044 Dalmine (BG), Italy (E-mail: [email protected])
Corresponding author: Marco Macchi: [email protected]
Maintenance of industrial assets is widely recognized as a key element in keeping and improving the value and competitiveness of an enterprise. This perception has been growing in recent years, in line with the new interest in sustainable manufacturing and energy efficiency. In such a business context, the competences and know-how for maintenance should be developed not only at the technical level but also at the management level, in order to keep pace with the complexity and changes of business requirements. Preparation at the maintenance management level has been one of the main interests of a Master course on Industrial Maintenance Management, jointly delivered, during the last five years, by MIP, the business school of Politecnico di Milano, and Università degli Studi di Bergamo. This Master course was in fact proposed in order to form maintenance managers: in accordance with its mission, the course aims at educating maintenance managers that, "besides having proper technical competences, would be capable to manage the maintenance processes in view of their organizational and management aims and constraints, governing the impact of maintenance on other parts of the organization, on business objectives and subsequent continuous improvement processes". To this end, the course has been offered through a program developed in three formative areas, which offer a balanced set of management and technical topics. Besides lectures, the students have to attend different types of activities focused on specific topics. These focused activities have been organised in different forms, such as: (i) daily workshops, where industrial experts, academics and representatives of institutions (e.g. governmental agencies) are invited to discuss specific themes related to maintenance and operations; (ii) training sessions on industrial case studies, proposed in order to stimulate problem solving in a team-work fashion; (iii) an industrial project work, carried out at the end of each of the two years of the course program. The present paper reviews the feedback gained from the Master course: the feedback, collected throughout the course, can be considered meaningful thanks to the large sample of participating employees and industries; it is used to draw a statistical analysis of the different needs of the enterprises involved in the education of their maintenance personnel. The results provided in the present paper might thus be considered a snapshot of the Italian experience in educating professionals in Industrial Maintenance Management. Key Words: Education, Industrial Maintenance Management. 1
INTRODUCTION
Crespo and Gupta [1] defined a basic supporting structure to enable and ease the maintenance management process in an organization. This structure is based on three "pillars": the IT pillar, the maintenance engineering methods pillar and the organizational (or behavioural) pillar. Tsang [2] identifies four strategic "dimensions" of a maintenance management system for achieving business success: what he calls service-delivery options (this dimension is focused on the resources adopted in the system as input factors: spares, labour, tools, external parties, …), organisational design (this dimension deals with
workforce location, specialisation, …), maintenance methodologies (such as RCM and TPM, …) and support infrastructures (e-maintenance, …). Going back to the early 1990s, Pintelon and Gelders [3], for example, pointed out three main "aspects" or "levels" to consider in a maintenance management system: it is advised to integrate the operations management and maintenance management systems; decision making, resource management and performance reporting should be considered the main part of the system activities; and the availability of a managerial tool kit is essential for enumerating the tools needed to carry out the activities within each of the three levels of the maintenance management system. These references from the literature are only examples. They help remind us of the importance of implementing a maintenance management system based on well-grounded issues: whether they are referred to as "pillars", "dimensions", "aspects", etc., a maintenance manager should be concerned with these issues of a maintenance management system. In this regard, such models can be adopted as a reference and a source of inspiration for fixing the primary targets for the education of a maintenance manager: it is advisable that the preparation of a maintenance manager be grounded on such elements or pillars; put in terms of education, the topics related to the management of resources (such as IT and human resources) and methods (for maintenance engineering, etc.) available to maintenance should be considered when preparing a maintenance manager.

Starting from this general principle, the Master course (Master meGMI [4]) has been offered through a program developed in three formative areas: the pillars of maintenance management are presented throughout two of the three areas (the so-called management area and the electives); furthermore, some general topics are included in a first area (the so-called general area), proposed as an introduction each year in order to broaden the scope beyond maintenance management to enterprise and operations management. In particular: (i)
the general area aims at providing a common background on the themes of enterprise management, organization, operations management, quality, safety, environmental regulations and ICT, and on the fundamentals of methodologies for quantitative analysis and decision support (e.g. statistics);
(ii) the management area aims at developing knowledge of the processes, methods, tools and information systems supporting the strategic planning, organization and management of the maintenance operations of an industrial plant; (iii) the third area includes electives on management and technical issues; the electives on technical issues are delivered in the first year of the course program and aim at consolidating, at a first level, knowledge of technologies for the diagnostics of industrial plants and the technical know-how required for the maintenance of industrial services (in particular energy), production technologies, equipment for environmental protection, etc.; the electives on management issues, offered in the second year, develop the specialized knowledge required by the processes and advanced methods for maintenance management in different industrial sectors (e.g. in the transport sector or for the process industries).

The course program has been structured in order to achieve two subsequent results. A diploma for maintenance operations is awarded at the end of the first year: this can be considered either an intermediate or a final outcome. It is the final outcome for all the people who do not go on to the second year: in this respect, when designing this learning path (in the remainder, the "1 year" learning path), it was decided first to clarify which educational targets were required for the first year; subsequently, a structure of the course program sufficient to achieve these targets was deployed for that year. In a few words, the educational targets are to provide the maintenance manager with a balanced set of technical and management topics, enabling him or her to plan, schedule and control maintenance activities, re-engineer activities based on failure analysis and condition based maintenance, and exploit the possibilities resulting from ICT and diagnostics/prognostics tools. Some flavour of a technically oriented profile is still behind these educational targets, even if a sufficient set of management topics is included.

The Master course graduation is instead achieved at the end of the second year of the program (in the remainder, the "2 years" learning path): this learning path is a continuation of the first one discussed above and is intended to provide more depth on management topics, taking a broader perspective both of maintenance management, e.g. by considering maintenance economics, and of enterprise operations, e.g. by considering enterprise strategic planning. In this respect, looking at the Master course as a whole, the learning paths it includes are the outcome of a concept of progressive competence growth, leading from the management capabilities needed for operational decisions (achieved at the end of the first year) to further management capabilities required for strategic analysis and decisions at the enterprise level (achieved at the end of the second year).

The present review provides feedback on the learning formula adopted during the Master course. To this end, the second section provides more detail on the course program, analysing in greater depth the educational contents it includes. Thereafter, the third section reports statistics on the people attending the course. Some considerations are provided in the conclusions concerning the educational contents with respect to the qualification / certification of competences.
2
THE COURSE PROGRAM
Table 1 summarises the number of hours offered in the course program for each learning topic. In addition, the two yearly learning paths, i.e. the "1 year" versus the "2 years" path, are associated with the learning topics (and hence with their hours): in each of the yearly learning paths some topics are mandatory, while other topics are offered optionally among the electives (in Table 1, "*" marks a topic that is optional in the learning path, subject to the constraint that 3 options have to be chosen out of the total offer; "**" marks a topic that is optional in the learning path, subject to the constraint that 2 options have to be chosen out of the total offer).
Table 1. Structure of the course program and yearly learning paths offered to form maintenance managers (formative area / learning topic / hours; elective markers "*" and "**" reproduced as in the original table)

General area:
- Statistics – 20 hours
- Enterprise organisation – 44 hours
- Operations – 66 hours
- Information systems – 8 hours
- Decision support systems – 12 hours
- Enterprise management – 24 hours
- Human resources management – 24 hours
- Project management – 18 hours

Management area:
- Maintenance organisation (1) – 12 hours
- Maintenance engineering (1) – 36 hours
- Maintenance planning – 16 hours
- Maintenance information systems – 20 hours
- Maintenance organisation (2) – 24 hours
- Maintenance engineering (2) – 60 hours
- Maintenance economics – 18 hours

Electives (18 hours each):
- Methods for diagnostics of industrial plants (* / *)
- Maintenance of process plants (* / *)
- Maintenance of industrial services (* / *)
- Maintenance of equipments for machining (* / *)
- Maintenance of equipments for environmental protection (*)
- Maintenance management for transport systems (**)
- Maintenance management for networked systems (**)
- After sales service management (**)
- Facility Management (** / *)
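As a quick arithmetic cross-check of the hour totals reported in Table 1, the sketch below sums the hours per formative area; the values are copied from the table above, while the split of mandatory topics between the "1 year" and "2 years" paths is not reproduced, since the corresponding tick marks of the original table did not survive extraction.

```python
# Cross-check of the hour totals in Table 1 (values copied from the table).
general_area = {"Statistics": 20, "Enterprise organisation": 44, "Operations": 66,
                "Information systems": 8, "Decision support systems": 12,
                "Enterprise management": 24, "Human resources management": 24,
                "Project management": 18}
management_area = {"Maintenance organisation (1)": 12, "Maintenance engineering (1)": 36,
                   "Maintenance planning": 16, "Maintenance information systems": 20,
                   "Maintenance organisation (2)": 24, "Maintenance engineering (2)": 60,
                   "Maintenance economics": 18}
elective_hours = 18   # every elective is 18 hours

print("General area total:", sum(general_area.values()), "hours")        # 216
print("Management area total:", sum(management_area.values()), "hours")  # 186
print("Three chosen electives:", 3 * elective_hours, "hours")            # 54
```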
Due to the modularity of the structure provided in the course program, other learning paths, shorter than the "1 year" and "2 years" paths, can be, and were, also offered to enterprises. These learning paths were offered independently as short courses, focused on specific topics and with a format ranging between a minimum of 8 hours (equivalent to a "1 day" path) and a maximum of 36 hours (a "6 days" path). Nonetheless, they were integrated into the same framework of learning topics (as shown in Table 1) in order to achieve economic synergies, i.e. synergies obtained by holding the same class for participants coming both from a yearly path and from a short-term, daily path. Table 2 summarises the specific learning paths offered for the short term: it shows the hours of the learning topics available in the framework of the course program and exploited for delivering short-term, daily paths ("full delivery" in the table indicates that a learning topic available in the course program is fully delivered also as a short-term path; "partial delivery" indicates that only a part of the hours available for a learning topic is delivered as a short-term path).
Table 2 Structure of the short term paths offered by exploiting learning topics concurrently delivered to yearly learning paths

Short term paths (daily paths) | Hours and source
Integrated management of QHSE | Partial delivery: 36 hours out of 66 hours. Source: learning topic “Operations”.
Project management | Full delivery: 18 hours out of 18 hours. Source: learning topic “Project management”.
Failure analysis | Partial delivery: 8 hours out of 36 hours. Source: learning topic “Maintenance engineering (1)”.
Condition monitoring and diagnostics | Partial delivery: 12 hours out of 36 hours. Source: learning topic “Maintenance engineering (1)”.
Maintenance planning | Partial delivery: 12 hours out of 16 hours. Source: learning topic “Maintenance planning”.
Spare parts management | Partial delivery: 9 hours out of 60 hours. Source: learning topic “Maintenance engineering (2)”.
TPM / Total productive maintenance | Partial delivery: 9 hours out of 60 hours. Source: learning topic “Maintenance engineering (2)”.
Maintenance contracts | Partial delivery: 8 hours out of 16 hours. Source: learning topic “Maintenance economics”.
Methods for diagnostics of industrial plants | Full delivery: 18 hours out of 18 hours. Source: learning topic “Methods for diag. of ind. plants”.
Maintenance of equipments for environmental protection | Full delivery: 18 hours out of 18 hours. Source: learning topic “Maint. of eq.s for env. protect.”.
After sales service management | Full delivery: 18 hours out of 18 hours. Source: learning topic “After sales service management”.
It is important to note that the learning paths have not been defined once and never changed. Indeed, the judgments of the people attending the Master course have been very useful for gathering feedback and activating continuous improvement. Within the course, trainees complete a customer satisfaction questionnaire. This is a tool to detect, through a scoring method and textual comments, the weaknesses and strengths of the learning topics, in order to plan corrective actions. After this general survey, direct interviews are also held in order to gather closer feedback on tacit issues that do not emerge from the questionnaire. Both the questionnaire and the interviews have influenced the delivery of the learning topics in successive years. From an organisational point of view, the questionnaire is collected during the year, just after the conclusion of each learning topic; the interviews are normally organised by the end of the year; based on the yearly assessment, the review of the learning paths is then deployed for the next year. From now on, the statistics will focus on the yearly learning paths, due to their meaningful numbers; the shorter paths were initiated only recently (in 2007) and have not yet accumulated sufficient data for statistical analysis.
3 STATISTICAL ANALYSIS
3.1 Industrial roles and sectors
The following statistics are based on the sample given by the 95 people who attended, or are currently attending, the Master course. Before analysing the statistics in detail, it is worth summarising two general issues normally referred to when discussing maintenance management.
• Maintenance is a complex process operating in an organisation; as such, its challenges can be better dealt with if maintenance is strongly connected with other enterprise functions in order to gain synergies, e.g. with production or engineering, in accordance with lean production principles and, in particular, TPM.
• Maintenance needs may be seen from a dual point of view. The “user point of view” concerns those situations in which maintenance is carried out by the company owning and operating its own machinery. In this perspective, maintenance is connected mainly with the production process and with people from production/logistics. The “vendor/service point of view”, instead, relates to the offering of maintenance (possibly including spare parts management) from one company to another as a service (e.g. as a complementary set of maintenance services) or from a vendor to its customers (e.g. to honour a warranty). In this perspective, maintenance can be considered a service and is also often connected with R&D.
Based on these premises, the statistics shown in figure 1 are meaningful: they relate to the industrial role of the people who attended the Master course. Since these people were identified as candidates for the Master course by their own enterprises, they are also an expression of enterprise management objectives.
Figure 1: Industrial role of the people attending the Master Course.
It is evident that the Master course, even though it is centred on maintenance-related competence and know-how, was deemed interesting for management objectives not only for maintenance people but also for employees from other industrial processes. A motivation can be argued to stand behind this choice: at least, the intention of their enterprises to spread maintenance competence and know-how beyond the maintenance function, so as to favour business integration between enterprise functions (e.g. maintenance and production); at most, the intention to create the general conditions for a job rotation of managers from other industrial roles (production / logistics or project management / engineering, for instance) into maintenance. Other interesting statistics are shown in figure 2, which compares the industrial field affiliation of people attending the Master course with the industrial field affiliation of Italian employees in general (according to the source www.istat.it).
Figure 2: Comparing industrial field affiliations. Figure 2a (people attending the Master course): Manufacturing 53%, Process Industry 34%, Service 13%. Figure 2b (Italian employees): Manufacturing 33%, Process Industry 18%, Service 49%.
These statistics are quite significant: the shares of companies sending people to attend the Master course (figure 2a) are clearly different from the Italian panorama (figure 2b [5]). The historical origin of the Master course is worth mentioning to explain this difference: the stimulus for the start-up of the formative course came from large process companies operating in Italy and worldwide (i.e. companies in the steel making and cement industries). As is generally known, maintenance assumes the utmost importance in this kind of industry, due to both direct and hidden/indirect costs. Hence, their practices have to some extent shaped the formative needs, so that the course program (explained in Section 2) is strongly influenced by these companies. Just as an example, the electives include a specific learning topic wholly dedicated to the maintenance of process plants.
Summarising, it can be stated from the statistical evidence of the last five years that the Master course is oriented more towards the process industries (a 34 % share of the pie, compared with 18 % in figure 2), but is also widely accepted by the manufacturing world (a 53 % share, compared with 33 %). Moreover, the “user point of view” is clearly privileged (an 87 % share, i.e. 34 % plus 53 %, clearly higher than 51 %, i.e. 18 % plus 33 %); conversely, the “vendor point of view” is not much represented (only 13 %, against the 49 % of potential service-company candidates for the course resulting from the ISTAT statistics).
3.2 Topics in the industrial project works
The next statistics provide an overview directly from the perspective of the people and their tasks in the course: they are derived from the sample of industrial project works that course attendants carried out at the end of each year of their learning path. From these statistics it can be argued that a certain set of competences and know-how is required of the course attendants by their enterprises in the short / mid-term, i.e. when they have to decide on and deploy their industrial project work. In this respect, the choice of the subject of a project work can be considered symptomatic of the requirements that emerged in the enterprise where the attendant is employed. Figure 3 shows that maintenance engineering is the learning topic ranked highest within the industrial project works. Quite surprisingly, maintenance economics, dealing mostly with the maintenance budget and contractual issues, is also a requested topic, even overtaking arguments typically associated with maintenance, such as diagnostics. This orientation may be considered aligned with the educational target: to form maintenance managers concerned not only with the technical, but also with the management level. Last but not least, the operations function also has a significant percentage share: this again confirms that people attending the course come not only from the maintenance department but also from others (production, etc.).
Figure 3 – Topics addressed in the industrial project works
For a more in-depth analysis, figure 4 provides details of the techniques and methodologies adopted in the industrial project works for the preferred topic of maintenance engineering. RAM (Reliability, Availability, Maintainability) analysis has been quite a popular choice in the industrial project works; it is followed by the classical FMECA or other similar techniques for the criticality analysis of industrial plants, then by TPM and related problem-solving techniques and, last but not least, with a significant percentage share, by spare parts management. It is worth noting that spare parts management attracts considerable interest in the industrial project works: this might be motivated by the high share of people from the process industries (as noted in Section 3.1), where the cost of materials is far from irrelevant. Likewise, the prevalence of RAM analysis at the top of the preferences also seems to be correlated with the type of enterprises and industrial sectors involved in the Master course.
Due to the high number of project works in the sample, i.e. 99 works, it can be concluded that, in the short / mid-term, a course attendant is required to cover a wide range of activities, from maintenance engineering and planning to maintenance re-organisation and, finally, economics.
Figure 4 – Techniques / methodologies applied in the industrial project works focused on Maintenance Engineering
Finally, it is worth presenting some recommendations on how to read the data in the figures above with respect to the data already presented in Section 3.1. It has to be mentioned that the sample in these statistics is not the same as the sample used in Figures 1 and 2. First of all, the sample is somewhat larger: the number of project works differs from the number of students who attended the Master course because, according to the rules established for the “2 years” path, each participant has to develop two project works (one for the first year and one for the second year). Moreover, the statistics shown in Section 3.1 also include people who are currently attending the Master course and who have not yet discussed any project work. This second factor is just a time delay and is not expected to have a meaningful effect on the statistics. It would also be interesting to focus separately on the project works delivered in the first and in the second year, so as to evaluate any statistically interesting difference between the behaviour of the attendants of the two years, possibly with regard also to the profile of the people attending the “1 year” and the “2 years” paths. Since the majority of the project works regard people attending the “2 years” path, the sample is not yet sufficient for a robust analysis across the years. The coming years will help to reach this end.
4 CONCLUSIONS
The present paper has presented an Italian experience of education in maintenance, describing an initiative carried out jointly by MIP Politecnico di Milano and Università degli Studi di Bergamo: a Master course on Industrial Maintenance Management. The course program has been discussed and some statistics, related to the people attending the course and their enterprises, have been provided. The statistics may be useful for building a general understanding of the education needs in maintenance management emerging from industry. Nonetheless, one of the main topics nowadays on the agenda is not only how a formative course for education in industrial maintenance management should be offered; there is also a strong push towards certifying competence and know-how in the maintenance area. Interviews carried out within @megmi, the Alumni association of the Master course [6], showed that there is a strong industrial need for certifying maintenance competences. This is well aligned with the European context: indeed, the European Committee for Standardisation is moving in this direction by trying to define the minimum competences in the maintenance area. In this context the EFNMS (European Federation of National Maintenance Societies) is also moving in the same direction. Debates have dealt with two different levels of competences in the maintenance area: management (maintenance manager) and technicians, but the discussion is still open in different arenas. The same is happening in Italy. Indeed, the Italian Committee for Standardisation is now launching an initiative to define competences at three different levels: “Maintenance Specialist” (Level 1), characteristic of people involved in operational
activities such as the repair of equipment and / or inspection / preventive interventions; “Maintenance Technician” (Level 2, similar to the EFNMS classification), targeting people who have responsibility for the maintenance of a part of a plant and for a group of Level 1 personnel; and “Maintenance Manager” (Level 3, also in accordance with the EFNMS classification), who is responsible for the maintenance of a plant and needs not only technical but also economic competence, being responsible for the economic impact of maintenance in the organisation. Within this panorama, the Master course presented here is actively involved in the definition of maintenance competence and know-how, with a specific focus on Level 3 certification. Therefore, the next aim for the future of this Master course will be to pay particular attention to what happens on the certification side: if the requirements on this side are effectively supported by the industrial stakeholders (that is, vendors/service providers and plant users), compliance with certification will be one strategic aim of the next course editions.
5 REFERENCES
1. Crespo and Gupta (2006) Contemporary maintenance management: process, framework and supporting pillars. Omega, The International Journal of Management Science, 34, 313-326.
2. Tsang, A. (2002) Strategic Dimensions of Maintenance Management. Journal of Quality in Maintenance Engineering (JQME), 8(1), 7.
3. Pintelon, L. M. and Gelders, L. F. (1992) Maintenance Management Decision Making. European Journal of Operational Research, 58(3), 301-317.
4. Master meGMI (executive in Gestione della Manutenzione Industriale / executive in Industrial Maintenance Management). URL: http://www.mip.polimi.it/go/Home/Italiano/MASTER-E-CORSI/Master-Universitari/meGMIMaster-in-Gestione-della-Manutenzione-Industriale.
5. ISTAT, National Institute of Statistics. URL: http://www.istat.it/
6. @meGMI alumni association. URL: http://www.amegmi.org/
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE ROLE OF EDUCATION IN INDUSTRIAL MAINTENANCE: THE PATHWAY TO A SUSTAINABLE FUTURE
Professor Andrew Starr a and Dr Keith Bevis a
a University of Hertfordshire, College Lane, Hatfield, UK
Sustainability is critical in the maintenance of machinery and other high capital assets. Increasing regulation emphasises social responsibility, increasing the costs incurred during asset life, such as waste disposal, efficiency, emissions, and end-of-life. Our responses to these will become the differentiators for survival. Maintenance practitioners have a wide range of needs for updating and up-skilling. Some of this is achieved with new young staff, and some by training and education of employed staff. Education in maintenance has an important contribution to make to sustainability. The maintenance professional has special needs for accessible training, some very specific to new technologies, for example, but some in broader education leading to a wider understanding of his or her contribution to the sustainability agenda.
Key Words: sustainability, training, education, learning
1 INTRODUCTION
Sustainability is the key to our collective futures: it will encompass the three key words of our age: green, healthy and digital. Maintenance professionals are entirely familiar with the concepts of the whole life of machinery and other high capital assets, but they are finding their lives increasingly regulated. Their social responsibility, underpinned by law as well as moral pressure, takes engineers and managers into untracked territory. The legacy of our industrial past is a strong knowledge economy, with a vibrant manufacturing sector, but with costly new values which affect competitiveness. Those additional costs take into account issues like waste disposal, efficiency, emissions, and end-of-life. In a densely populated world, all these issues are critical, and will become the differentiators for short term and long term survival. But businesses around the world have the same issues, and developing as well as developed countries experience the same pressures. This paper examines the need for education in maintenance, its contribution to sustainability, and the “products” available to achieve such education. It goes on to identify the nature of sustainability for asset management, before considering the role of the maintenance professional, and how training and education can contribute to the sustainability agenda.
2 EDUCATIONAL NEED AND PRODUCTS
The needs for education and training in the sector span a wide range of ages and prior learning. Qualifications are taken inside and outside the workplace, often in preparation for later employment. Young people prepare themselves with flexible qualifications, to give a general education; but persons employed in maintenance need more specific up-skilling and qualifications. Those in employment are also interested in modes of delivery which complement their working practices: short periods, distance learning, and flexible timing. Electronic delivery of blocks of training may range from bite-sized quantities of factual update or refresher material up to major learning resources guiding study for postgraduate qualifications. Internet and mobile telephone delivery offer direct access at the point of consumption. A key distinction is that training is strongly guided, with most material provided, instructing the student; education leading to qualifications tends to put the responsibility on the student to learn, within a framework, building his/her whole portfolio. Maintenance engineering education products cover a wide remit. Courses exist for HND/HNC and foundation degrees in offshore operations and maintenance, aerospace maintenance engineering, computer systems maintenance, building
maintenance & management, and many other specialisms. Many senior posts in industrial maintenance advertise the requirement for HND plus extensive experience, which reflects the industry’s faith in vocational training. This level of education leads towards Incorporated Engineer registration. There is a strong demand for shorter, additional qualifications in the maintenance sector. These include technical topics such as condition monitoring techniques, and management topics including finance, systems and organisation. Proprietary courses are available across a huge range of topics, but without the requirement for specific entry qualifications or assessment. The principal providers are management training organisations and product vendors. These courses address the specific need for short vocational top-up training. Sixty-one percent of employees have received some form of training in the last twelve months [1]. On the other hand, the indications are that 60% of large organisations embark on training without any contemplation of measures of effectiveness, driven more by the accepted wisdom that “training is beneficial” [2,3]. The difficulty with all short courses provided for employees is how to ensure that the learning itself is sustainable, forming an efficient contribution to the business. Some of this training has become standardised, e.g. the certification of competence of practitioners in condition monitoring [4,5]. These courses fulfil a need for specific monitored training and qualifications in an accredited framework. ISO 18436 standardises the levels on such courses [5]. Bachelor of Engineering (BEng) degrees are typically 3-year courses (not including any industrial experience), while UK Master of Engineering (MEng) courses typically last four years. European masters programmes in accordance with the Bologna Agreement [6] typically last 2 years after a bachelor’s degree. Course content in the UK is specified by UKSPEC [7] under four main headings: knowledge and understanding, intellectual abilities, practical skills and general transferable skills. The detailed syllabus is not specified. There is only the specific requirement “ensure fitness for purpose for all aspects of the problem including production, operation, maintenance and disposal”. Interpretation of this is left to the course designers. Specialist optional modules in years 3 and 4 tend to reflect the research interests of the university, rather than a core set of subjects on a national syllabus. Typical modules include vibration, reliability, maintenance management, condition based maintenance, and tribology. Resource constraints mean that not all universities will offer these. It could be argued that there is no place for a specific maintenance-oriented bachelor’s degree. Undergraduate degrees attract people who have yet to choose a specialist vocation, and are attracted by marketable highlights. But it has been shown above that there is a wide range of HND and Foundation qualifications with a focus on maintenance, which reflects some early specialisation, particularly in the workplace. From the viewpoint of maintenance, new graduates are often inexperienced, but are bright young people with an excellent grounding in engineering. Graduates from different universities will have quite a range of final-year specialisms, and not all will have had access to maintenance as a concentrated focus of study.
The final year project is an area where extra emphasis can be placed on a chosen subject, and maintenance is a popular topic. The typical graduate undertakes further training in the workplace, often specifically aimed at the route to chartership. MEng graduates are considered to have sufficient educational basis for chartership, but BEng graduates are required to undertake Further Learning, usually an additional year’s study or equivalent to 1200 hours study. The route to chartership provides an environment where the motivation to learn is stimulated by the support of more knowledgeable and experienced colleagues. “There is an awareness that increased experience and competence brings with it higher occupational status. This encourages continued learning. A thirst to progress.” [8] Engineering as a profession is regulated in the UK by the Engineering Council. In 2004 the United Kingdom Standard for Professional Engineering Competence (UKSPEC) was published to identify the requirements for Chartered Engineers and Incorporated Engineers [7]. The UKSPEC helps implement international accords which establish the “tradability” of engineering and technology degrees. Accredited degrees appear in the Engineering Council UK index of accredited academic programmes, and also normally in the Fédération Européenne d'Associations Nationales d'Ingénieurs (FEANI) index of recognised European qualifications [9,10]. International recognition is achieved under the Washington Accord or the Sydney Accord [11]. Accreditation examines outcomes achieved, including the process of teaching and learning, assessment strategy employed, human and material resources involved, quality assurance arrangements, entry to the programme and how the cohort entry extremes will be supported. Programmes are accredited either as leading towards Chartered or Incorporated Engineer registration.
3 SUSTAINABILITY FOR ASSET MANAGEMENT
Sustainability reaches into every corner of our lives: indeed in some parts of the world it is life and death. However in asset management we are already familiar with many aspects of sustainability, not least of which is the survival of the company. We want equipment to last as long as economically possible, but on a wider scale the enterprise wishes to use as little energy as
possible, with a good health and safety record, and demonstrate a corporate social responsibility to the environment and the local community. For engineering assets, the process starts with design to meet engineering specifications, while meeting or exceeding standards for a range of sensitive issues, such as power, materials, environment, risk, health and safety of personnel, and quality. For some assets, the manufacturing cost, in terms of energy and subsequent carbon “footprint”, is a major concern, for example in the large-scale use of cement in concrete. In other areas it is the disposal costs which will be high: for example the end of life costs of electronic products, automobiles and aircraft are increasingly regulated, and owners must strip out a wide range of materials from the chassis before recycling, or commission such work to be done. Maintenance takes place between the initial manufacture and the end of life – such concerns are listed in Table 1. It may therefore be observed that sustainability is not a new concept in maintenance: by extending asset life in an economical manner, maintenance professionals may have been the original green practitioners, albeit in some less “clean” industries. Maintenance aims to keep the assets healthy.
Table 1 The contrasts of sustainability related concerns in maintenance

minimising | maximising
down time | availability
failures and their consequences | reliability
usage and storage of spare parts | cost effectiveness
unnecessary maintenance | performance and efficiency

4 EDUCATING FOR SUSTAINABILITY, AND SUSTAINABLE EDUCATION: THE ROLE OF THE MAINTENANCE PROFESSIONAL
For learning that takes place within the working situation to be sustainable, the significant changes in the behaviour of an organisation’s staff, or the increases in their knowledge, that are the evidence of learning must continue to promote the organisation’s growth after any period of training [12]. It is worthwhile to highlight four indicators of such sustainable learning. These are the readiness of the company to engage in training, the readiness of the learner to engage, the measurable outputs of training and finally the organisational impact of the investment in training. For training to be effective and learning to be sustained, the first measure is whether the company is ready. One measure would be the investment in training. In the manufacturing industry the average number of training days per person per year is two [13]. In the aerospace industry this rises to three. These national statistics provide a useful benchmark for the quantitative measure of how seriously an employer takes training. The trainees too must be ready. New graduates have already been described as bright and intelligent employees. However, motivation for further learning is as important. A good track record of personal development within the company, evidenced by informal participative learning either as learner or “coach”, provides an indication that a graduate is prepared to move on and absorb more learning. These provide two indicators that learning could be sustained. The output of the learning may be evidenced by qualifications and operational effectiveness. The clear indicator will be in actual performance. Reliability is just one of the hallmarks of a World Class operation as set out by Hayes and Wheelwright [14]. The company’s performance indicators can provide an indication of whether operational reliability is being maintained. This and the number of graduates achieving chartered status provide an indication of the organisational impact of the learning. It is the organisational impact of the training that gives an indication that the learning is being sustained.
5 DISCUSSION
There are two distinct angles on sustainability in education: the learning of the impact of sustainability on the business and its stakeholders; and the sustainability of the learning process itself. Both processes need continuous updating: they are not in steady state. Responsibilities of businesses towards their stakeholder communities are growing, and all developed countries experience the same additional costs. The rapidly-developing economies quickly find that they also acquire the same responsibilities: they are global concerns. Maintenance practitioners can expect to have an on-going need for learning, most of which will be related to sustainability. The process of learning itself requires a sustainable approach, because it is deeply entrenched in an organisation’s effectiveness. Planning of learning, and the measurement of the impact of learning, is important to maintain sustainability.
6 CONCLUSION AND FURTHER WORK
Sustainability is not an option: it will be the business differentiator which ensures survival. It is critical in maintenance because the contribution to whole life health of assets is core to success. Maintenance professionals are increasingly taken into new territory in the law, including health and safety, and the environment. End of life directives are major costs for some sectors. Education and training are vital for the updating and up-skilling of maintenance professionals. A range of products exist, but new developments in distance learning and accessible short course arrangements are necessary to help working people. More specific educational products are likely to arise; but sustainability is not a technical skill – it is a thinking process. Access to the long-term support, for learning in the workplace, will be pivotal for practitioners to achieve their goals. Qualifications will continue to be important – training is not enough. But the traditional providers of high-level qualifications need to become more flexible in accessibility, principally in timing and delivery, to meet the requirements of learning partners.
7 REFERENCES
1. Learning and Skills Council (2008) Skills in England 2007.
2. Reid, M. A., Barrington, H. and Brown, M. (2004) Human Resource Development - Beyond Training Interventions, 7th edn. Chartered Institute of Personnel and Development.
3. Kearns, P. (2005) Evaluating the ROI from Learning. Chartered Institute of Personnel and Development.
4. ISO 18436-1 (2004) Condition monitoring and diagnostics of machines - Requirements for training and certification of personnel - Part 1: Requirements for certifying bodies and the certification process.
5. Roe, S. (2003) Condition monitoring certification of personnel in the UK. Insight, 45(11), 764-765. ISSN 1354-2575.
6. http://www.ond.vlaanderen.be/hogeronderwijs/bologna/ (last accessed June 2009)
7. Engineering Council UK (2004) UK Standard for Professional Engineering Competence - The Accreditation of Higher Education Programmes. ISBN 1-898126-631.
8. Hayes, J. (2006) Towards a virtuous circle of learning. In NIACE's Big Conversation, National Institute of Adult Continuing Education, Leicester, UK.
9. http://www.engc.org.uk/accredsql/public/index.asp (last accessed June 2006)
10. http://www.feani.org (last accessed June 2006)
11. http://www.engc.org.uk/international/index.asp (last accessed June 2006)
12. Bevis, K. (2009) Sustainable Learning in the Workplace. VDM. ISBN 978-3-639-14437-6.
13. EEF and SEMTA (2004) 2003-2004 People Skills Scoreboard for the Engineering Industry. EEF and SEMTA, London.
14. Hayes, R. H. and Wheelwright, S. C. (1984) Restoring Our Competitive Edge: Competing Through Manufacturing. John Wiley, New York.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CONDITION-BASED MAINTENANCE FOR OEM’S BY APPLICATION OF DATA MINING AND PREDICTION TECHNIQUES
Abdellatif Bey-Temsamani a, Marc Engels a, Andy Motten a, Steve Vandenplas a and Agusmian P. Ompusunggu a
a Flanders’ Mechatronics Technology Centre, Celestijnenlaan 300D, B-3001 Leuven, Belgium
Increasing products’ service life and reducing the number of service visits are becoming top priorities for Original Equipment Manufacturer companies (OEMs). Condition-based maintenance is often proposed as a solution to reach this goal. However, it is often hampered by the lack of the right information giving a good indication of the health of the equipment. Furthermore, the processing power needed to compute this information is often not available on the machine’s processor. In this paper, a remote platform which connects the OEM to the customer’s premises is described, thus allowing the locally available information to be processed. Two approaches are then combined to compute the optimal maintenance time. First, data mining techniques and reliability estimation are applied to historical databases of machines running in the field in order to extract the relevant features together with their associated thresholds. Second, a prediction algorithm is applied to the selected features in order to estimate the optimal time to preventively perform a maintenance action. The proposed method has been applied to a database of more than 2000 copy machines running in the field and proved to identify the relevant features to be forecast easily and to offer an accurate prediction of the maintenance action.
Key Words: Condition-based maintenance, Predictive maintenance, Data mining, Prognostics
1 INTRODUCTION
Condition based maintenance (CBM), also called predictive maintenance (PdM), has evident benefits for OEMs, including reduced maintenance cost, increased machine reliability and operation safety, and improved time and resources management [1]. A side-by-side comparison between PdM and traditional corrective or preventive maintenance programs is made in [2]. From this survey, it was concluded that major improvements can be achieved in maintenance cost, unscheduled machine failures, repair downtime, spare parts inventory, and both direct and indirect overtime premiums, by using PdM. Although these benefits are well illustrated, two major problems hamper the implementation of predictive maintenance in industrial applications: first, the lack of knowledge about the right features to be monitored and, second, the processing power required for predicting the future evolution of the features, which is often not available on the machine’s processor. For the latter problem, the present work fits in an architecture where the machines at the customer side are connected to a central office. The live measurement data are processed by a server at this central office. This allows the use of standard computers with off-the-shelf software tools, with virtually unlimited processing power and storage capacity. Next to condition monitoring of the machine and predictive maintenance scheduling, the central server can also be used to provide other services to the customer. For the former problem, data mining techniques proved to be useful for relevant feature extraction. It has been shown [3,4] that the application of data mining techniques to data, such as acoustic emission for the monitoring of corrosion processes, is very useful for extracting relevant features which can be used as parameters for machine diagnosis and/or prognostics. However, in many other industrial applications no clear physical understanding of the process is available, and therefore a clear methodology is required to retrieve these relevant features. The main contribution of this paper is such a practical methodology, which combines data mining and prediction techniques used consecutively in order to perform an accurate predictive maintenance scheduling. This methodology is called the IRIS-PdM approach (see the Acknowledgment section). Unlike standards such as the Cross Industry Standard Process for Data Mining (CRISP-DM) [5], which indicates, at a high level, the different steps to be followed in order to apply data mining to industrial data, the IRIS-PdM approach described in this paper proposes specific algorithms which can be applied directly in different industrial applications. Furthermore, up till now prognostics has been tackled as a step independent from data mining,
assuming the prediction of a completely known parameter [6]. On the contrary, in the IRIS-PdM approach, prognostics is an integral part of the flowchart and makes use of the relevant features extracted in the data mining step. In this way, the IRIS-PdM approach makes it possible to compare the evolution of different features and to combine them to improve the accuracy of the remaining lifetime forecast. The IRIS-PdM approach consists mainly of: (i) a data mining step on historical data, where data preparation, data reduction and relevant feature extraction are performed; (ii) a prognostics step, where optimal thresholds are retrieved using reliability estimation methods and prediction algorithms are applied to live data to estimate the remaining time for the relevant features to reach the thresholds. This remaining lifetime can be used to establish an optimal predictive maintenance schedule. This paper is organized as follows. In Section 2, general descriptions of the remote connectivity platform and the IRIS-PdM approach are given. In Section 3, the application of the IRIS-PdM approach to an industrial dataset is illustrated, with a review of the data preparation steps, the data reduction techniques and the most important data mining modelling techniques used to optimally identify relevant features. In Section 4, a description of the prognostics step is given, with a review of reliability estimation techniques to determine optimal thresholds and prediction algorithms to forecast the optimal time to schedule a maintenance action. Finally, conclusions are given in Section 5.
2 GENERAL DESCRIPTIONS
2.1 Remote connectivity platform
Remote connectivity to customer premises offers the OEMs continuous monitoring of their products, which results in a longer service life of the products. This is thanks to the recommendations that OEMs can give to their customers on the optimal usage of the machines, as well as to the optimal maintenance scheduling before the machine, or a sub-part of the machine, approaches its end of life. Furthermore, an improvement of maintenance logistics planning may be achieved. The work presented in this paper fits within the framework of such a platform, schematically shown in figure 1. A human expert in a remote assistance centre at the machine builder side connects remotely to the customer premises, through a secure internet connection, and collects the data continuously or periodically from the machine. Using different local software tools, including data mining software, different intelligent services can be provided, such as Predictive Maintenance (PdM).
Figure 1. Schematic of the remote connectivity platform
Figure 2. IRIS-PdM approach steps (case study: historical and live data; data mining block: data preparation into a unified data format, data reduction and features selection yielding the relevant features; prognostics block: optimal thresholds via reliability estimation, advanced prediction and predictive maintenance estimation).
2.2 IRIS-PdM approach In order to optimally forecast the predictive maintenance actions, a practical approach has been developed where different steps are followed starting from the available historical database of machines running in the field to the PdM scheduling. The different steps of the IRIS-PdM approach are summarized in figure 2. The IRIS-PdM approach proved to be easy to transfer from one industrial application to another. It was mainly tested for predictive maintenance of copy machines, but it also proved to be feasible for predictive maintenance of high-end microscopes. The starting point is the historical database of the machines. In general, such a database exists for almost all machine manufacturers, but only a limited amount of information is currently used. The next step consists of data preparation, including data transformation from the original to unified data formats and data cleaning such as removing outliers or calculating missing
values. Note that the data preparation step may be time consuming, since the unified data format should be compatible with the data mining software tool while still retaining a physical interpretation of the data. The data modelling step consists of two sub-steps. Firstly, the data reduction sub-step, where the Independence Significance Feature method is used to reduce significantly the number of attributes; this method is described in Section 3.2.1. Secondly, the features selection sub-step, where the relevant features are extracted from the selected attributes using a decision tree (DT) method; this sub-step is described in Section 3.2.2. The choice of these methods was based on requirements of processing speed, accuracy, and interpretability of the results. The final step consists of prognostics, itself divided into two sub-steps. First, reliability estimation methods are applied to the historical database in order to identify the optimal thresholds of the selected features; these techniques are described in Section 4.1. Secondly, a prediction algorithm is applied to live data in order to forecast the time to schedule the optimal PdM. The prediction model used in this paper is based on slope calculation and is called the Weighted Mean Slope (WMS) model; this model is described in Section 4.2. In the next sections, the different steps of the IRIS-PdM approach are described in detail and illustrated for an industrial dataset.
3 APPLICATION OF IRIS-PDM APPROACH TO A DATASET EXAMPLE
In this section, we illustrate the different steps of the IRIS-PdM approach, described in the previous section, on an industrial maintenance database. The dataset used in this section is a large historical maintenance database of more than 2000 copiers. Every copier has one or more data files corresponding to the history of service visits. One data file contains measurements of the sensors/counters that the copiers are equipped with, and information added by the service technician describing the type of maintenance action performed during the visit. This maintenance type can be corrective, in case a replacement of a broken part is done, or preventive, in case the part is changed before it actually breaks. As mentioned previously, the preventive maintenance is decided by the service technician and is not always based on a physical understanding of the degradation process.
3.1 Data preparation step
The data preparation consists of transforming the information contained in the original data files of the different machines into a unified format. The original data is structured in different files of different formats. The unified format is a matrix with columns corresponding to the different attributes, which represent sensors/counters, and rows corresponding to observations at a service visit. Two extra columns with the component replacement information and the maintenance action type are added. The different data files are concatenated into one single file. Since enough data was available, the missing values were simply discarded from the dataset. A schematic of such a data set format is given in table 1. Based on the dataset, a matrix of more than 1000 attributes (features) and 1 million objects was generated.
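A minimal sketch of what this preparation step could look like in Python is given below, assuming per-copier CSV exports; the file pattern and the column names part_replaced and maintenance_type are hypothetical illustrations, not the actual database schema used in the paper.

```python
import glob
import pandas as pd

def build_unified_matrix(pattern="copier_*.csv"):
    # Concatenate the per-machine service-visit files into one matrix:
    # one row per observation (service visit), one column per sensor/counter
    # attribute, plus the two technician-supplied output columns.
    frames = []
    for path in glob.glob(pattern):
        df = pd.read_csv(path)
        df["machine_id"] = path          # keep the provenance of each observation
        frames.append(df)
    data = pd.concat(frames, ignore_index=True)
    # Output1 = replacement information, Output2 = maintenance action type
    # (hypothetical source column names).
    data = data.rename(columns={"part_replaced": "Output1",
                                "maintenance_type": "Output2"})
    # Enough data is available, so observations with missing values are dropped.
    return data.dropna()

if __name__ == "__main__":
    unified = build_unified_matrix()
    print(unified.shape)   # roughly: ~1e6 observations x (>1000 attributes + outputs)
```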
3.2 Data modelling step
Data mining algorithms are generally evaluated according to three different criteria [7]:
• Interpretability: how well the model helps to understand the data
• Predictive accuracy: how well the model can predict unseen situations
• Computational efficiency: how fast the algorithm is and how well it scales to very large databases
The importance of these criteria differs from one application to another. For our use, interpretability and scaling to large databases were essential. A summary of the techniques implemented in popular data mining software is given in table 2. The data modelling step is divided into two sub-steps: (i) data reduction, and (ii) features selection.
3.2.1 Data reduction
The method chosen for data reduction is the Independence Significance Feature (ISF) technique. This method, initially described in [8], is meant to quickly and inexpensively discard features which seem obviously useless for dividing the data into multiple classes. Unlike the Principal Components Analysis (PCA) method, the ISF method does not generate new features and therefore retains the physical meaning of the original features. The ISF method also proved to be much quicker than other methods such as correlation or entropy reduction. For example, the processing time taken by the ISF method to calculate the significance of the top 100 attributes of the database described in Section 3.1 is approximately 2.5 s, while the Spearman correlation method took 20 s and the entropy reduction method around 130 s. The ISF method consists of measuring the mean of a feature for all classes without worrying about the relationship to other features. The larger the difference between the means, the better the separation between classes (the better the significance). In our case, this dramatically reduces the number of candidate predictors to be considered in the final selection process.
The ISF method reduces the data from more than 1000 features to ~100 features, using as output the two classes corresponding to the replacement information (Output1 in table 1). The mathematical formula to calculate the significance, for every attribute, is given as:

Sig = \frac{\bar{X}_1 - \bar{X}_2}{S_{\bar{X}_1 - \bar{X}_2}}    (1)
Table 1 Schematic of a unified data format

Attribute1 | Attribute2 | … | AttributeN | Output1 | Output2
ObjectF11 | ObjectF21 | … | ObjectFN1 | Part not replaced | No maintenance
ObjectF12 | ObjectF22 | … | ObjectFN2 | Part replaced | Corrective maintenance
ObjectF13 | ObjectF23 | … | ObjectFN3 | Part not replaced | No maintenance
… | … | … | … | … | …
ObjectF1M | ObjectF2M | … | ObjectFNM | Part replaced | Preventive maintenance
Table 2 Important data mining methods and associated algorithms

Technique | Description | Method(s)
Clustering | Unsupervised machine learning to group objects in different classes | K-Means, AutoClass
Classification | Assign unknown objects to well established classes | Neural Networks, K-NN, SVM
Conceptual clustering | Qualitative language to describe the knowledge used for clustering | Decision Tree
Dependency modeling | Describes the dependency between variables | PCA, Dendrogram, ISF
Summarization | Provides a compact description of a subset of data | Statistical reporting, Visualization
Regression | Determines functions that link a given continuous variable to others | ANN, Regression Tree, ML Regression
Rules based modeling | Generates rules that describe tendencies of the data | Association rules
with

S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}    (2)

where
Sig : significance
\bar{X}_i : mean value of the input objects in class i
s_i^2 : variance of the input objects in class i
n_i : number of samples in class i
Only the features with a high significance measure are retained for further processing. When the attributes are ordered according to decreasing significance, this results in the graph of figure 3. As can be seen in this graph, only the first 100 attributes have a significance higher than 0.2. This threshold was chosen since the attributes most related to the classification output have a significance higher than this value. These attributes are selected to be further processed in the next step.
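The sketch below illustrates the ISF computation of equations (1) and (2), assuming the unified data frame of the earlier (hypothetical) data preparation sketch; the 0.2 cut-off follows the text, while taking the absolute mean difference is an assumption made so that the ranking does not depend on sign.

```python
import numpy as np
import pandas as pd

def isf_significance(data: pd.DataFrame, output_col: str = "Output1") -> pd.Series:
    # Two classes defined by the replacement information (Output1).
    classes = data[output_col].unique()
    assert len(classes) == 2, "ISF as used here assumes a two-class output"
    g1 = data[data[output_col] == classes[0]]
    g2 = data[data[output_col] == classes[1]]
    # Only numeric sensor/counter attributes are scored.
    attrs = [c for c in data.select_dtypes("number").columns
             if c not in (output_col, "Output2")]
    sig = {}
    for a in attrs:
        x1, x2 = g1[a].astype(float), g2[a].astype(float)
        # Equation (2): standard error of the difference of the class means.
        se = np.sqrt(x1.var(ddof=1) / len(x1) + x2.var(ddof=1) / len(x2))
        # Equation (1); absolute value is an assumption so the ranking ignores sign.
        sig[a] = abs(x1.mean() - x2.mean()) / se if se > 0 else 0.0
    return pd.Series(sig).sort_values(ascending=False)

# Keep only the attributes whose significance exceeds the 0.2 threshold from the text:
# significance = isf_significance(unified)
# selected = significance[significance > 0.2].index.tolist()
```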
Figure 3. Significance measure versus attributes
Figure 4. Decision tree model for relevant feature extraction
3.2.2 Features selection
Once the data reduction is done as explained in the previous section, the relevant feature extraction method is applied. In this paper the Decision Tree (DT) method has been chosen. The main advantage of this method is the ease of interpreting the results and transforming them into if-then rules, which can easily be used by an industrial user in order to schedule a predictive maintenance action. A description of a decision tree method is given in [9]. It is a top-down tree structure consisting of internal nodes, leaf nodes, and branches. Each internal node represents a decision on a data attribute, and each outgoing branch corresponds to a possible outcome. Each leaf node represents a class. Inducing a decision tree is done by looking for the optimal attribute to be used at every node of the tree. In order to do that, the following steps are followed [10]:
(1) A quantity known as entropy (H) is calculated for every attribute:

H(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2(p_i)    (3)
with s_i the number of samples belonging to the class C_i (m possible classes) for the calculated attribute; p_i is the ratio of the number of samples in each class to the total number of samples (it can be understood as a probability).
(2) The expected information E(A) is computed for each attribute:

E(A) = \sum_{j=1}^{v} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} \, H(s_{1j}, s_{2j}, \ldots, s_{mj})    (4)

where v is the number of distinct values in the attribute A and s_{ij} is the number of samples of class C_i in a subset S_j obtained by partitioning on the attribute A.
(3) The information gain G(A) is computed using

G(A) = H(s_1, s_2, \ldots, s_m) - E(A)    (5)

(4) The attribute having the highest information gain is selected as the test attribute.
(5) The node is split and steps (1) to (4) are repeated until all values belong to the same class C_i or all attributes have been used.
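As an illustration, the following sketch computes the entropy, expected information and information gain of equations (3) to (5) for a discrete-valued attribute, with the split choice of steps (4) and (5) indicated in a comment; it is not the CART/Matlab implementation actually used in the paper.

```python
import numpy as np
import pandas as pd

def entropy(labels: pd.Series) -> float:
    # Equation (3): entropy of the class distribution.
    p = labels.value_counts(normalize=True).to_numpy()
    return float(-(p * np.log2(p)).sum())

def expected_information(attr: pd.Series, labels: pd.Series) -> float:
    # Equation (4): entropy of each subset S_j, weighted by the subset size.
    total = len(labels)
    e = 0.0
    for value in attr.unique():
        subset = labels[attr == value]
        e += len(subset) / total * entropy(subset)
    return e

def information_gain(attr: pd.Series, labels: pd.Series) -> float:
    # Equation (5): gain of splitting on this attribute.
    return entropy(labels) - expected_information(attr, labels)

# Steps (4)-(5): the attribute with the highest gain is used to split the node,
# and the procedure is repeated on each branch until the leaves are pure, e.g.
# best = max(candidate_columns, key=lambda c: information_gain(data[c], data["Output2"]))
```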
In this work, the CART decision tree implementation in the Matlab statistics toolbox was used for the relevant feature selection. Figure 4 shows a decision tree used to retrieve the relevant features from the list of selected features obtained in Section 3.2.1. The purpose of this classification problem is to identify, from the list of features, which features are the most relevant to predict a maintenance action. The classification output used in this step is the maintenance type information (Output2 in table 1). Note that, in order to reduce the size of the tree, a pruning method is used. This pruning method calculates the statistical significance of a further split of the data and stops the process when this significance becomes too low. By keeping the size of the tree under control, it remains feasible to physically interpret the classification. In the resulting decision tree for our dataset, more than 95% of preventive maintenances were performed when the feature numbered x72 is higher than the value ~2800 (right hand branch). The pattern proposed by the decision tree can also be visualized by the conditional histograms in figure 5. This figure shows a clear separation, at the threshold of ~2800, between the values of feature x72 for the preventive maintenance class and the rest of the classes. By looking back at the physical meaning of attribute x72, a meaningful relationship to preventive maintenance can be established. In order to check the accuracy of the decision tree (DT) method against other well-known data mining methods, such as k-NN (Nearest Neighbour) [11], a 5-fold cross-validation check is carried out with both methods and visualized using confusion matrices. The corresponding results are shown in table 3 and table 4, respectively for the k-NN and the DT method.
Figure 5. Classification using the relevant feature extracted by DT
Table 3 Confusion matrix for k-NN method

 | Actual NM | Actual CM | Actual PM
Prediction NM | 67.6% | 57.3% | 49%
Prediction CM | 9.9% | 15.9% | 11.6%
Prediction PM | 21.7% | 26.7% | 39.3%
Total | 100% | 100% | 100%
Table 4 Confusion matrix for DT method

 | Actual NM | Actual CM | Actual PM
Prediction NM | 92.9% | 73.3% | 20.1%
Prediction CM | 4.6% | 21.3% | 1.9%
Prediction PM | 2.4% | 5.3% | 77.8%
Total | 100% | 100% | 100%
where NM, CM and PM stand, respectively, for no maintenance, corrective maintenance and preventive maintenance. The tables clearly show that misclassification using the DT method is much lower than for the k-NN method. As an example, 77.8% of preventive maintenances are correctly classified using the DT method, versus only 39.3% using the k-NN method. Note that the low percentage of correct classification with both methods for corrective maintenance is mainly due to the correspondingly low number of samples in the complete data set. The value 2800 shown in figure 5 can be used as a threshold for the monitoring of the selected feature. Note that, in order to fine tune this threshold in an optimal way, statistical methods such as reliability estimation could be used. The latter, together with prediction algorithms, is discussed in the next section.
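A minimal sketch of such a 5-fold cross-validation comparison is shown below, assuming a feature matrix X of ISF-selected attributes and labels y holding the maintenance type (NM/CM/PM); the pruning parameter is illustrative, not the setting of the CART model used in the paper.

```python
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

def compare_classifiers(X, y):
    # X: ISF-selected features; y: maintenance type labels ("NM", "CM", "PM").
    models = {
        "DT": DecisionTreeClassifier(ccp_alpha=1e-3, random_state=0),  # pruned tree (illustrative setting)
        "k-NN": KNeighborsClassifier(n_neighbors=5),
    }
    for name, model in models.items():
        pred = cross_val_predict(model, X, y, cv=5)   # 5-fold cross-validated predictions
        cm = confusion_matrix(y, pred, labels=["NM", "CM", "PM"], normalize="true").T
        # After transposing, rows are predictions and columns are actual classes,
        # so each column sums to 1, matching the layout of Tables 3 and 4.
        print(name)
        print(cm.round(3))
```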
4 PROGNOSTICS
Prognostics in the IRIS-PdM approach consists of two steps: (i) the reliability estimation step, where optimal thresholds are calculated from the historical database for the selected features, and (ii) the prediction step, where a prediction algorithm is applied to live data to forecast the time for scheduling a maintenance action. The former step is described in Section 4.1, while the latter is described in Section 4.2.
4.1 Reliability Estimation
In the IRIS-PdM approach, reliability estimation is applied to the historical database in order to estimate the optimal thresholds for the selected features. These thresholds will be used for forecasting the remaining time until the next maintenance action. This estimation consists of fitting a lifetime distribution model to the data and identifying the failure rate at a given feature value [12]. For our dataset, a Weibull distribution model fits the data quite well. Figure 6 shows the Weibull distribution model compared to the life measurements of the machines. This model can be used to determine the optimal threshold by looking at an acceptable failure rate of the studied components. In our case we retain the threshold of ~2800 for the feature x72, which corresponds to a ~25% failure rate (right side graph in figure 6).
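One way such a threshold could be derived is sketched below with a Weibull fit in scipy, using synthetic placeholder data rather than the copier database; the ~25% acceptable failure rate follows the text.

```python
import numpy as np
from scipy import stats

def weibull_threshold(life_values, acceptable_failure_rate=0.25):
    # Fit a two-parameter Weibull model (location fixed at 0) to the observed
    # values of the selected feature at part replacement.
    shape, loc, scale = stats.weibull_min.fit(life_values, floc=0)
    # Threshold = feature value at which the cumulative failure probability
    # reaches the acceptable failure rate.
    return stats.weibull_min.ppf(acceptable_failure_rate, shape, loc=loc, scale=scale)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synthetic_lives = rng.weibull(3.0, size=500) * 3500   # placeholder data, not real measurements
    print(round(weibull_threshold(synthetic_lives)))       # threshold for the synthetic sample
```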
Figure 6. Weibull distribution model fitted to the data
Figure 7. Forecast of remaining time to maintenance action
4.2 Prediction Algorithm
In this step, time series of live data are analyzed. The main goal is to obtain a model of the evolution of the selected feature based on past observations, in order to predict its future evolution. Combining this prediction with the optimal threshold, as defined in the previous section, allows estimating the remaining time before replacement of the part. The accuracy of this remaining time depends strongly on the choice of the model. Two classes of models can be identified in the literature: models which allow a non-linear trend prediction and models which allow a linear trend prediction of time series. For the former case, the neural network is a good example, extensively used in stock market prediction [13]. It uses at least two parameters: the number of hidden nodes and the weight decay. These two settings are dynamically changed to adapt to the observations. For linear trend prediction, exponential smoothing techniques have been used for a long time [14,15]. In these techniques some model parameters need to be optimized in order to minimize the error between model and data. This can be problematic in the presence of steep changes in the data evolution versus time. Therefore, a model based on weighted mean slope (WMS) estimation has been developed. This model works robustly, even when the data change abruptly. In this model, the prediction is performed by looking at the recent observations and fitting an optimal model to them. Suppose the primary data Y = {y_1, y_2, ..., y_n} is the live time sequence of the selected feature at times T = {t_1, t_2, ..., t_n}. The weighted mean slope (WMS) is calculated as:
S_k = \frac{y_k - y_{k-1}}{t_k - t_{k-1}}, \quad k = 2, \ldots, n; \qquad WMS = \frac{\sum_k k \cdot S_k}{\sum_k k}    (6)

The prediction value at time t + m is given by

y_{t+m} = y_t + WMS \cdot m    (7)
An example of the WMS prediction model applied to live data of one copier is shown in figure 7. In this figure, three zones corresponding to low, medium and high risk are used. For each zone a different threshold can be set, depending on the criticality of an unscheduled repair for the customer. The remaining days before a maintenance action has to be performed are displayed with a confidence interval for each zone.
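A minimal sketch of the WMS predictor of equations (6) and (7), together with the implied remaining-time estimate, is given below; the sample series and threshold are synthetic placeholders, not data from the copier fleet.

```python
import numpy as np

def weighted_mean_slope(t, y):
    # Equation (6): slopes between consecutive observations, weighted by their index
    # so that the most recent slopes count more.
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    k = np.arange(2, len(y) + 1)                  # k = 2, ..., n
    slopes = np.diff(y) / np.diff(t)              # S_k
    return float(np.sum(k * slopes) / np.sum(k))

def remaining_time(t, y, threshold):
    # Equation (7): y_{t+m} = y_t + WMS * m, solved for the m at which the
    # selected feature reaches its threshold.
    wms = weighted_mean_slope(t, y)
    if wms <= 0:
        return np.inf                             # no upward trend towards the threshold
    return (threshold - np.asarray(y, dtype=float)[-1]) / wms

if __name__ == "__main__":
    t = np.arange(1, 11)                          # e.g. days since the last service visit
    y = [2100, 2150, 2230, 2260, 2340, 2390, 2430, 2500, 2540, 2600]
    print(remaining_time(t, y, threshold=2800))   # estimated time units until the threshold
```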
5
CONCLUSIONS
In this paper, a practical methodology called the IRIS-PdM approach has been presented and discussed. The approach consists of several steps making use of data mining, reliability estimation techniques and prediction algorithms in order to extract the relevant feature and use it in prognostics to predict the remaining time until a preventive maintenance action is required. Independent Significance Feature (ISF) analysis was successfully applied to an industrial data set for data reduction. Next, the most relevant features were extracted by means of the Decision Tree classification method. A comparison between the k-NN and the DT methods shows that DT is an accurate classification method. A Weighted Mean Slopes (WMS) model was applied to predict the remaining time before a maintenance action needs to be scheduled. This model works robustly, even for abruptly changing data. The methods described in this approach can be broadly and robustly applied to different industrial data sets and maintenance databases. The results can also be easily interpreted and transformed into if-then rules, allowing insight and easy interaction with the results.
6
REFERENCES
1. J. Blair & A. Shirkhodaie, (2001) Diagnosis and prognosis of bearings using data mining and numerical visualization techniques, Proceedings of the 33rd Southeastern Symposium on System Theory, 395-399.
2. R. K. Mobley, (1990) An introduction to predictive maintenance, Van Nostrand Reinhold.
3. G. Van Dijck, (2008) Information theoretic approach to feature selection and redundancy assessment, PhD thesis, Katholieke Universiteit Leuven.
4. R. Isermann, (2006) Fault-diagnosis systems, Springer.
5. C. Shearer, (2000) The CRISP-DM model: the new blueprint for data mining, Journal of Data Warehousing, 5(4), 13-22.
6. K. M. Goh, T. Tjahjono & S. Subramaniam, (2006) A review of research in manufacturing prognostics, IEEE International Conference on Industrial Informatics, 417-422.
7. R. Duda, P. Hart & D. Stork, (2001) Pattern classification, John Wiley & Sons, Inc.
8. W. Sholom, I. Nitin, (1998) Predictive data mining: a practical guide, Morgan Kaufmann.
9. P. Geurts, (2002) Contributions to decision tree induction, PhD thesis, University of Liège.
10. J. You, L. & S. Olafsson, (2006) Multi-attribute decision trees and decision rules, Chap. 10, Springer, Heidelberg, Germany, 327-358.
11. Y. Zhan, H. Chen, G. Zhang, (2006) An optimization algorithm of K-NN classification, Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 2246-2251.
12. P. Yadav, N. Choudhary, C. Bilen, (2008) Complex system reliability estimation methodology in the absence of failure data, Quality and Reliability Engineering International, 24, 745-764.
13. D. Komo, C. J. Cheng & H. Ko, (1994) Neural network technology for stock market index prediction, International Symposium on Speech, Image Processing and Neural Networks, Hong Kong, 534-546.
14. R. G. Brown, F. R. Meyer, (1961) The fundamental theorem of exponential smoothing, A. D. Little, Cambridge, 673-685.
15. E. S. Gardner, (2006) Exponential smoothing: the state of the art – Part II, International Journal of Forecasting, 22, 637-666.
Acknowledgment This research fits in the framework of the IRIS (Intelligent Remote Industrial Services) project, which is financially supported by the Dutch governmental agencies: the Ministry of Economic Affairs, the Province of Noord-Brabant, the Province of Limburg and the 'Samenwerkingsverband Regio Eindhoven', in the framework of the Pieken in de Delta (PID) program (Project No. EZ 1/5262). IRIS is a cooperation of the project partners Sioux Remote Solutions, Assembléon, Océ Technologies, FEI Company, FMTC (Belgium) and the Technical University of Eindhoven. The IRIS consortium was created in cooperation with the BOM (Brabantse Ontwikkelings Maatschappij) and BO4. The authors wish to thank all the IRIS project partners for their valuable advice and review of this research.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
SHAPE OF SPECIMEN IMPACT ON INTERACTION BETWEEN EARTH AND EIGENMAGNETIC FIELDS DURING THE TENSION TEST Szymon Gontarz and Stanislaw Radkowski Warsaw University of Technology, Institute of Automotive Engineering, Warsaw 02-524, Narbutta 84, Poland.
Many materials that could pose a real threat of catastrophe, caused by fatigue wear, exceeded stress limits or emerging plastic deformation, have magnetic properties that can affect the local magnetic field. While active magnetic methods for condition monitoring are quite well known and widely applied, passive techniques, which rely only on the existence of the Earth's natural magnetic field, still require research and improvement. Every physical object enclosed in the magnetosphere interacts with the Earth's magnetic field according to physical laws: such objects can focus or deflect the magnetic field lines around their matter. The object's own magnetic field, H = -∇w, where 'w' is the magnetic potential, is a function of the magnetization gradient, w = w(div M). The measured magnetic field of an object therefore depends on the object's magnetization and on its distribution in the surrounding medium (space). Considering magnetoelastic effects (the Villari effect, magnetostriction), additional stress causes a transformation of the magnetic state of the material, which is reflected in the magnetization of the object. This magnetization depends on many factors. In this paper, the impact of the specimen shape on the interaction between the Earth's field and the eigenmagnetic field during a tension test is considered. Following a simple model analysis, a laboratory experiment was proposed and performed. By controlling the plastic and elastic ranges of specimen deformation, it was shown that a relation exists between stress and the degree of magnetization which is strictly connected with the deformation and effort state. Magnetic anomalies generated by the magneto-mechanical effect were collected with a three-axial fluxgate magnetometer, which allows the component of the object's own magnetic field that is least sensitive to real-world disturbances to be identified. The paper shows that a dependence exists between stress and the degree of magnetization, but it is very complex, because it additionally depends on magnetization, magnetization history, deformation and the shape of the object. The measurement position relative to the shape of the object also influences the results. Further directions and comments about the development of techniques for evaluating the technical state of objects which could exploit the presented effects are included.
Key Words: magnetoelastic effects, passive magnetic methods, eigenmagnetic field, diagnostic of constructions 1
INTRODUCTION
Non-destructive testing for diagnosing the state of technical objects plays an increasingly important role in contemporary diagnostics. Owing to continuous evolution, new techniques based on innovative ideas keep appearing. Among the many different diagnostic techniques, magnetic methods attract particular attention. Additionally, considering the consequences of unforeseen failures of architectural structures and breakdowns of technical objects, it is necessary for science to progress towards technology for detecting the early phases of fault development. Many materials that could pose a real threat of catastrophe, caused by fatigue wear, exceeded stress limits or emerging plastic deformation, have magnetic properties that can affect the local magnetic field. While active magnetic methods for condition monitoring are quite well known and widely applied, passive techniques, which rely only on the existence of the Earth's natural magnetic field, still require research and improvement. The basis of the passive magnetic method is the existence of the Earth's magnetic field. The Earth can be considered a homogeneously magnetized globe, with its magnetic axis placing the magnetic south pole in the northern geographic hemisphere and the magnetic north pole in the southern hemisphere. Every physical substance staying within the magnetosphere will influence
the local magnetic field of the Earth, but the influence differs depending on the material from which the object is made. Considering the materials of interest, different types of steel can be both magnetic and non-magnetic (magnetic metals: cobalt alloys, iron, nickel alloys, steel except stainless steel; non-magnetic metals: aluminium, brass, copper, gold, silver, titanium, stainless steel). An outer magnetic field acting on the material aligns the domains with the field, which increases the magnetic induction in the sample until saturation is reached, i.e. complete alignment of the domains. Any further increase of the magnetic induction in the sample is caused only by an increase of the field H, because all domains are already aligned. Additionally, when the stress changes in a material with magnetic properties, a transformation of its magnetic state takes place; this is exploited in metal magnetic memory (MMM) and residual magnetic field (RMF) methods. In general, a relation exists between stresses and the degree of magnetization, but it is complex because it additionally depends on the type of magnetization, the magnetization history, strain and temperature. It is also suspected that it is connected with the shape of the object, which may be of significant importance. In this paper a real-world experiment was performed: using steel samples in a tensile test, without additional sources of magnetic field, the changes of local magnetic field intensity generated by the mechano-magnetic effect were registered. In the measurements a fluxgate magnetometer was used, capable of registering the magnetic field intensity simultaneously in three perpendicular directions. The aim was to identify the component of the magnetic field that is least sensitive to the interference prevalent in the real world. Three different specimen shapes were investigated as a significant factor that could influence the magnetoelastic behaviour. Additionally, an extensometer technique was used to explain and confirm the magnetic results. The targets of this paper are determined by a cognitive aim as well as a utilitarian intention. The cognitive target is the most important here; it is the background for future work and consists in testing and describing the magnetoelastic effects which could be useful in a non-destructive magnetic method for assessing the stress level of steel elements in industry and infrastructure.
2
PASSIVE MAGNETIC METHODS
The evaluation of both applied and residual stresses in engineering structures, to provide early identification of eventual failure, is a fast growing area of non-destructive testing. There is now a wide range of such stress detection methods, among which magnetic techniques are included. Magnetic methods can be divided into two groups: active and passive techniques. In active magnetic techniques [1,2], a magnetic field is applied to the material and variations in field parameters such as permeability, hysteresis and magnetic Barkhausen emission are used to draw inferences about the material stresses. Active magnetic techniques usually use high-strength, low-frequency fields to drive the material into saturation and thus offer fairly good penetration. In passive magnetic techniques, the magnetic field strength at the surface of the material is measured without prior application of a magnetic field. Passive techniques such as metal magnetic memory (MMM) make use of variations in the self magnetic leakage field (SMFL) of a ferromagnetic material due to geometrical discontinuities, such as cracks and high-density dislocations, formed in the presence of ambient magnetic fields such as the Earth's field. These variations reflect the stress history of the material. One of the popular techniques is magneto-acoustic emission for diagnosing the microstructural state of ferromagnetic steels in service. The results so far show that for ferritic-pearlitic and pearlitic-bainitic steels, the method based on the magneto-acoustic effect (EMA) has very good properties for non-destructive investigations [3, 4, 5, 6]. Another well-known passive magnetic non-destructive technique was developed in Russia. It is a diagnostic method for products and machines based on the magnetic memory effect in metal. The conditions for the formation of residual magnetization in metal were determined; they reflect the structural memory and the state of strain intensity in the object. This technology was named Metal Magnetic Memory (MPM) and is based on the physical phenomena of magneto-elasticity and magnetostriction, on their relation with the creation and location of magnetic domain boundaries on dislocation walls in stress concentration zones, and on the phenomenon of magnetic dissipation caused by structural and mechanical heterogeneities under the natural magnetization produced by the load. In general, the group of passive magnetic methods is considered to comprise very promising and modern techniques of the 21st century. Compared to active methods they have many advantages: they do not need any artificial source of magnetic field, so they can be used not only for diagnostic tests but also continuously, for example in condition monitoring. This also makes it possible to use them in places where an artificial source of magnetic field could be dangerous. Passive methods do not require any preparation of the object and can be useful even in places that are difficult to access. For these and other reasons it is worth developing this branch of diagnostic methods.
3
EXPERIMENT DESCRIPTION
For the description of the magnetoelastic effects which could be useful in a non-destructive magnetic method for assessing the stress level of steel elements in industry, the following laboratory experiment was prepared. The objective of the test was to demonstrate and confirm the magnetic material properties which can be useful in the diagnosis of objects made of these materials. The investigations were prepared and carried out as follows. The shape and dimensions of the samples were made in accordance with Figure 1. Although the shapes of the specimens differ, the cross-section area is similar. Only one type of steel, steel 45, was chosen for the specimens, as we wished to focus on magnetoelastic effects independent of the chemical composition of the material. Typical mechanical properties of the selected material are given in Table 1.

Table 1. Mechanical properties of the specimen material
Rm [MPa], min: 600
Re [MPa], min: 355
Hardness HB, max: 241
Figure 1. Three different shapes of specimens (Shape A, Shape B, Shape C). The designed samples were tested on a WP 300 Universal Material Tester made by the German company GUNT. The load is introduced to the tested element by a manually controlled hydraulic pump. The construction of the machine enables monitoring of and direct access to the value of the force generated by the pump and, through a linear encoder, access to a signal proportional to the elongation of the sample.
Figure 2. Complete test stand. The measurement equipment comprises two three-axial fluxgate magnetic sensors: magnetometers made by Applied Physics Systems, model APS 536, with a sensitivity of 5.0 V/Gauss. They were fed by a stabilized power supply GPS4303 made by Good Will Instrument. The voltage signals from the magnetic sensors were transmitted to an acquisition card from
National Instruments, working together with the NI PXI measurement computer. Another acquisition card in the computer was responsible for the extensometer measurements; these sensors worked in half-bridge mode, two per specimen. The signals carrying information about the applied force and the sample elongation were fed into the same computer through a USB connection. The magnetic sensors were fixed on tripods, which enabled arbitrary spacing with respect to the sample and to each other. The investigations were divided into three stages. The first measurement concerned the eigenmagnetic field of the specimen in the elastic and plastic deformation ranges, in order to reveal the magnetoelastic effects. The second stage considered the impact of the specimen shape on the magnitude of the eigenmagnetic field during the tension test in the elastic range. The last stage consisted of an alternative stress measurement made with extensometers; the results from these strain sensors provided a reliable comparison for the magnetic measurements. All three tests allow the possibility of stress assessment using the passive magnetic method, and its spatial properties, to be approached and estimated.
4
EXPERIMENT RESULTS
During the first stage of the test, the designed specimen was used and its eigenmagnetic field was analysed during the tension test. The magnetic anomalies generated by the magneto-mechanical effect were collected with the three-axial fluxgate magnetometer. The first configuration investigated four different distances (25, 50, 100 and 150 mm) of one of the sensors (the second one was kept fixed) as a function of magnetization (Figure 3). During the test, the load was varied from 0 to about 40% of the force corresponding to Re.
Figure 3. Magnetometer configuration strategy. From the results (Figure 4) it is easy to observe the change of magnetic field intensity, which varies with the distance between the sensor and the specimen. Importantly, the range of change differs depending on the measurement direction. The measurements with the three-axial fluxgate magnetometer thus reveal the component of the object's own magnetic field (the vertical direction) that is least sensitive to the disturbances (the magnetic field of the environment) present in the real world, and identify this direction as potentially the most informative for stress assessment.
Figure 4. Change of magnetization as a function of force for different measurement distances. The next step was to go beyond the elastic state and load the specimen to fracture. During this test, one sensor was placed 1 mm from the specimen and the second one at a distance of 25 mm. The changes of magnetization in the vertical direction registered by the two sensors (SN873 and SN874), together with the change of force, are shown in Figure 5 (field B [mT] and force F [kN]; the yield area is marked).
Figure 5. Change of magnetization during the fracture test, vertical direction. The fracture test confirms the magnetoelastic effects observed during the previous tests. In addition, after the elastic limit was exceeded, a qualitative change of the eigenmagnetic field could be observed in the yield area. The present analysis shows that the vertical direction of the magnetic field should be measured if we want to monitor the elastic range of the material, and that this direction is also better for discovering plastic deformation as early as possible. This suggests that innovative passive magnetic techniques, exploiting quantitative and qualitative analyses of the magnetic field changes while the yield area is being crossed, can bring good results in the diagnostics of even complex technical objects. A wider overview of this first step of the experiment is given in [8].
Another factor which should be considered is the shape of the specimen and its impact on the eigenmagnetic field during tension. This allows the influence of the shape on the pure magnetoelastic energy, registered by the magnetometers as the eigenmagnetic field of the specimen, to be estimated. For this reason the three designed specimens (Figure 1) were tested. Every shape was represented by six specimens, which were stretched in our test stand (Figure 2). Tension was applied in the elastic range of specimen deformation and the experiment was repeated four times for every specimen. From the collected data, the changes of magnetization between the unloaded and loaded states were calculated and the corresponding average values were considered in the further analysis. Based on the results of all these tests, normal distribution parameters N(µ,σ) were estimated. From the estimated parameters, the corresponding probability density functions were created. This allows the average eigenmagnetic field change in every measurement direction to be represented with a statistical description, for example standard deviation and dispersion. Figures 6a to 6c present the probability density functions of the eigenmagnetic field changes in the three perpendicular directions for every investigated specimen shape. The fitted normal parameters (mean m, standard deviation s, correlation r, obtained with ReliaSoft Weibull++ 7) were:
x direction: Shape A m=0.0183, s=0.0053, r=0.9644; Shape B m=0.0331, s=0.0040, r=0.9722; Shape C m=0.3854, s=0.0677, r=0.9945.
y direction: Shape A m=0.0054, s=0.0015, r=0.9546; Shape B m=0.0088, s=0.0027, r=0.9489; Shape C m=0.0357, s=0.0163, r=0.9669.
z direction: Shape A m=0.4283, s=0.0249, r=0.9578; Shape B m=0.2889, s=0.0400, r=0.9323; Shape C m=0.1871, s=0.0771, r=0.8881.
Figure 6. Probability density functions for three different specimen shapes: a) x (horizontal) direction, b) y (horizontal) direction, c) z (vertical) direction. One can observe that the results for the three measurement directions have different probability density functions for each specimen shape. In our case the most interesting parameter is the mean value, which differs significantly in the Z direction. In the other directions there is no clear variation, excluding shape C, whose dispersion is so large that it is difficult to compare. In any case, as we already consider the vertical direction the most interesting because of the correlation between stress and the eigenmagnetic field, the dispersion of shape C remains very high. The change of the mean value for the different specimen shapes can be understood, but the change of dispersion with shape is not clear and needs verification. This unexpected effect drew our attention and led us to the hypothesis that, for example, inaccuracies of workmanship could introduce a non-uniform stress distribution in the cross-section of the investigated specimen (see Figure 7), which could be the reason for the large dispersion of results when the magnetometer was fixed facing the wide side of the specimen. Because of this new circumstance, a new magnetic test was prepared. During a tension test similar to the previous one, the magnetometer was moved along the specimen width (along the longer side). In this way we were able to obtain the distribution of the eigenmagnetic field on one side of the sample. The results of this experiment are illustrated in Figure 8.
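The statistical treatment described above, fitting N(µ,σ) to the averaged field changes of each shape, can be reproduced with a few lines of code; the sketch below uses invented measurement values and standard maximum-likelihood fits, not the ReliaSoft analysis used by the authors.

```python
# Sketch of the per-shape statistical analysis: fit a normal distribution
# N(mu, sigma) to the averaged eigenmagnetic-field changes of each shape.
# The measurement vectors below are invented placeholders, not the test data.
import numpy as np
from scipy import stats

delta_B_z = {                      # field change in the vertical (z) direction, per specimen
    "Shape A": [0.41, 0.45, 0.40, 0.44, 0.43, 0.42],
    "Shape B": [0.25, 0.31, 0.27, 0.33, 0.29, 0.28],
    "Shape C": [0.10, 0.25, 0.15, 0.22, 0.28, 0.12],
}

for shape, values in delta_B_z.items():
    mu, sigma = stats.norm.fit(values)     # maximum-likelihood estimates of N(mu, sigma)
    print(f"{shape}: mu={mu:.3f}, sigma={sigma:.3f}")
```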
Figure 7. Inaccuracy in the workmanship of the specimen.
Figure 8. Distribution of the eigenmagnetic field across the specimen width (delta H [Gauss] at the minimal measurement distance versus the measurement point across the width, for specimens C1 to C6). This clearly shows that differences exist in the distribution of the eigenmagnetic field of the tensioned sample between its two extreme sides. The boundary values are dissimilar, so a new question arises: are these values correlated with different values of stress? To verify this hypothesis, two extensometers were applied at the extreme left and right of the wider side of the specimen. Each worked in half-bridge mode, so with a two-channel acquisition card we were able to obtain the strain simultaneously from both sides. Before the key experiment, every extensometer was calibrated. Similar tests were performed for all six specimens: the sample was tensioned to the same stress as in the magnetic measurements and then released. During such a loop the signal from the extensometers was collected and subsequently recalculated to estimate the stress value in pascals. The results show that four specimens exhibited a non-uniform stress distribution between the measurement points of around 10 MPa, which is a very small difference with respect to the applied stress of about 300 MPa. The remaining two specimens, however, showed a difference in stress distribution of about 13% of the maximal stress value, i.e. about 40 MPa. Every specimen was marked, so we were able to compare the extensometer and magnetic measurements for these two samples. The extensometers confirmed the non-uniform stress distribution which was clearly noticeable for samples number 3 and 4, as observed in Figure 8. At the same time, this comparison also confirms the explanation of the dispersion discovered and investigated during the magnetic measurements for the widest specimen (shape C): in that case the magnetic method was affected not by the shape itself but by the non-uniform stress distribution across the cross-section of the specimen.
5
CONCLUSIONS
The fact that, without additional outer sources of magnetic field, it is possible to estimate the present mechanical stress condition from the intensity changes of the generated magnetic field, with a correlation that depends on the measurement direction, creates the possibility of a new universal and efficient diagnostic method. This work concludes that, in view of all the presented matters, innovative passive magnetic techniques exploiting the analysis of magnetic field changes are feasible and could bring good results in the diagnostics of complex technical objects. By controlling the plastic and elastic ranges of specimen deformation, it was shown that a relation exists between stress and the degree of magnetization which is strictly connected with the deformation and effort state. In our test, three perpendicular directions were used, which clearly shows that the changes of magnetization caused by mechanical stress are correlated differently with the measurement direction. Additionally, the examination of the distribution of the eigenmagnetic field of a tensioned steel object gives hope for stress assessment from a distance, but for this the measurement settings should be properly chosen. In this situation the properties of the technical object, such as material, shape, construction and dimensions, should be taken into consideration. The material influence was confirmed and investigated in [8]. Another very important factor appears to be the shape impact: this factor influences the magnitude of the change of the eigenmagnetic field of a tensioned object, which was demonstrated in this paper. A suitable laboratory experiment was performed and additionally revealed the sensitivity of the magnetic measurement to a non-uniform stress distribution in the widest samples. This unusual distribution was unintentional and was probably caused by inaccuracy in the production of the specimens or in their fixing to the test machine. This phenomenon, discovered by the magnetic
experiment and further confirmed by the extensometer technique, gives grounds for detailed magnetic stress-distribution analysis. In general, the paper distinguishes a quantitative change describing the relation between stresses and the degree of magnetization; moreover, consideration of the object shape indicates the possibility of obtaining unusually valuable diagnostic information on the level of mechanical effort of the object. However, magnetic methods have many shortcomings which in practice limit their range of application [7]. Limiting factors for the use of magnetic methods include the unstable character of the global magnetic field and artificial sources of magnetic field that can disturb the measurement, but a smart application of the method can still bring good results. In view of the obtained results on the interaction between the Earth's field and the eigenmagnetic field during the tension tests, further work on the development of the passive magnetic technique, allowing early detection of fault states of technical objects made of ferromagnetic materials, should be carried on.
6
REFERENCES
1. S.P. Sagar, B.R. Kumar, G. Dobmann, D.K. Bhattacharya (2005) Magnetic characterization of cold rolled and aged AISI 304 stainless steel, NDT & E International, 38, 674-681.
2. R.L. Hu, A.K. Soh, G.P. Zheng, Y. Ni (2005) Micromagnetic modeling studies on the effects of stress on magnetization reversal and dynamic hysteresis, J. Magn. Magn. Mater.
3. Janusz Łukaszewicz, Zbigniew Łapiński (2001) Określanie parametrów akustycznych materiałów konstrukcyjnych; KBN, WITU Zielonka.
4. Sprawozdanie z pracy nt. „Przeprowadzenie badań zjawiska emisji akustycznej towarzyszącej rozciąganiu próbek stalowych”; IPPT 2001.
5. Sprawozdanie z pracy nt. (2001) „Badanie zjawiska emisji akustycznej w trakcie prób rozciągania przygotowanych próbek ze stali 30HGS”; PW Wydział Inżynierii Materiałowej.
6. Dubov A.A. (2002) Diagnostyka wytrzymałości oprzyrządowania i konstrukcji z wykorzystaniem Magnetycznej Pamięci Metalu; Dozor Techniczny 2, 2002, 14-18 i Dozor Techniczny 1, 37-40.
7. Gontarz S., Radkowski S. (2007) Own magnetic field as a source of diagnostic information. VI International Technical Systems Degradation Seminar 2009, Liptovský Mikuláš.
8. Gontarz Sz., Radkowski S. (2008) Use of passive magnetic method for condition monitoring. Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems Conference (WCEAM-IMS 2008), 27-30 October 2008, Beijing, China, 543-552 (CD-ROM).
Acknowledgments This work has been supported by the European Union in the framework of the European Social Fund through the Warsaw University of Technology Development Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
INTEGRATING HUMAN RELIABILITY ANALYSIS INTO A COMPREHENSIVE MAINTENANCE OPTIMIZATION STRATEGY Corey Kiassat and Nima Safaei C-MORE, Department of Mechanical and Industrial Engineering, University of Toronto, Ontario, Canada Email: [email protected] and [email protected] This paper demonstrates the significance of considering human-related characteristics within a system in the overall predictive failure analysis. No maintenance strategy is complete unless it integrates the human operator and the inherent uncertainty into the overall optimization scheme. The inclusion of Human Reliability Analysis (HRA) will expand the traditional focus of condition-based maintenance, thus making it an all-encompassing approach. HRA can capture the uncertainty associated with operator variability for more comprehensive predictive activities within the field of Maintenance Optimization. A well-known model used in failure prediction and reliability analysis is the Proportional Hazards Model (PHM). In many contexts, it makes sense for characteristics related to Human Reliability to be included as new covariates in the PHM. Two systems may start out as identical; but if one utilizes a fully trained crew and the other a completely unskilled crew, the two systems are bound to perform differently with varying reliabilities. The inclusion of Human Reliability in the PHM will make the analysis more complete, resulting in more accurate predictions of failure. This makes a positive financial impact as better maintenance decisions are made. A case study is presented to show the effects of operator skill level on system reliability as a function of time. Key Words: Human Reliability Analysis (HRA), Human-related Factors, Proportional Hazard Model (PHM), Maintenance. 1
INTRODUCTION
A reliability expert who disregards the role of the human in the overall failure risk of a system exaggerates the role of machine-related failure modes in the overall unreliability of the system. This is quite common and, consequently, identical machines used across various sites may exhibit varying reliabilities. Skill level, motivation, workplace politics, and many other intangible factors play a role that should be taken into account when devising reliability estimates and maintenance strategies. The main goal of this paper is to investigate the effects of human characteristics associated with skilled manpower on the time-to-failure of manufacturing equipment. One approach to estimating the failure time of any given equipment is the proportional hazards model (PHM). The PHM, also called the Cox model, is a method that relates the time of an event, such as failure or breakdown, to a number of explanatory variables known as covariates (Vlok, Coetzee, Banjevic, Jardine, and Makis, 2002). From a maintenance point of view, the idea behind the PHM is that obvious and/or hypothetical factors, including the equipment age, may act as reliability criteria that influence the hazard rate of the equipment. The hazard rate is the rate of transition out of the current healthy state. When the hazard rate exceeds a pre-determined threshold, this signifies a high probability of a functional failure, or of one that is in the process of occurring. The hazard rate can also be affected by employing unskilled operators in the system. This is the essence of Human Reliability Analysis (HRA), where the knowledge of the human involved with the system may be the deciding factor between the survival and the failure of two otherwise identical machines. This may be especially true in the case of new machinery, where operator unfamiliarity plays a larger role in failure frequency than machine degradation. Figure 1 illustrates the concept that human-related (HR) factors are the major contributor to the total frequency of failures in the early phases of a machine's life. As time goes on and operators gain more skill, machine-related (MR) factors start to contribute more. If HR failures are not taken into account in the early stages, the major contributor to the total failure frequency is mistakenly assumed to be MR. This may lead to a waste of resources in a maintenance optimization strategy.
561
Figure 1. Effect of HR vs. MR factors on the frequency of failures
The human characteristics that may normally be considered as intangible, or qualitative, factors can be captured through HRA methods. They can then be turned into quantitative data and integrated with the overall PHM for the purpose of predicting the risk of failure. If asset managers can make more informed maintenance-related decisions, the operating cost of the organization can be directly affected. The PHM is an appropriate tool to predict the failure time of the system under the condition-based maintenance (CBM) policy (Jardine and Banjevic, 2005). The general form of the PHM is as follows:

$$h(t) = h_0(t)\,\exp\!\Big(\sum_i \gamma_i Z_i(t)\Big) \qquad (1)$$

The first part of the equation is a baseline hazard function, sensitive to the age of the equipment. The individual $Z_i$'s are covariates that are supposed to affect the overall hazard. As pointed out earlier, in a complete analysis, some $Z_i$'s will be machine-related, while others are assumed to be human-related. We will investigate how human-related covariates affect the hazard rate of a given system. A case study of a precision gear manufacturing company, Alpha, is considered to explore the supposed relationship. When a system hazard rate decreases as a function of time, the frequency of inspections, preventive and reactive spare part replacements, as well as machine downtime, also decreases. For instance, in a company that has a periodic vibration analysis program, a single complex machine may experience x bearing replacements in one year of operation with an untrained operator. This same machine may experience fewer than x bearing replacements if the operator is fully trained upon starting his job. This can result from the operator being more attuned to the needs of the machine, being cognizant of the ideal sequence of operations, and committing fewer mistakes. Less frequent replacement of bearings results in savings of material resources and skilled human resources, as well as increased machine uptime. When this is considered across the entire system in the long term, it can be quite significant financially. It is this type of economic incentive that will initiate the industry-wide adoption of HRA in Manufacturing and Service. Nuclear (Bubb, 2005), Aviation (Latorella and Prabhu, 2000), and to some extent the Oil & Gas industries have embraced the use of HRA in recent years due to the overriding policy of safety. However, others, most notably Manufacturing and Service, will start to adopt HRA more rapidly as economic benefits are realized by reducing hazard through greater human reliability. Research has been conducted on the role of Human Reliability in maintenance activities and various approaches for incorporating it into maintenance strategies. Some of the related research, such as Barroso and Wilson (1999), has been conducted in manufacturing environments where the context is similar in focusing on estimating the overall effect of human unreliability in a manufacturing environment. However, the approach used in that work was not based on PHM. Other works, such as Blanks (2007), discuss the need for improving reliability prediction, with special attention to human error causes and prevention. However, there is no mention of PHM or any predictive techniques for human reliability. There are also other research works, such as Zimolong and Trimpop (1994), and Dhillon and Liu (2006), that focus on the maintenance workforce performing repair work while the machine is not being used for production purposes. There is a main difference between the aforementioned works and the discussions of this paper: the use of production volume as a covariate to incorporate the effect of HR error into a PHM distinguishes our paper from previous research. 2
PROBLEM DESCRIPTION
Company Alpha has four assembly lines with similar equipment and identical production quotas. Only the most complicated of the four is considered here. The tolerances on the final product were extremely tight and, as a result, the equipment used was quite complex. The machines initially went into production and through a quick ramp-up period before going into full production mode. Six months after power-up, the machinery was being used around the clock. Production volume was expected to be 1,000 gears per day, for a total of 5,000 gears per week. Due to equipment novelty and the lack
of sufficient operator experience and skill, the target volume was never achieved for the first year-and-a-half. Consequently, overtime shifts were added. This put an even higher strain on the system as a whole in terms of the equipment, maintenance activities, and human resources. Production volume data was collected for every shift after the six-month ramp-up, for a period of 20 months. Weekly production counts have been measured as pieces-per-shift to even-out the effect of additional overtime shifts in some weeks. These production totals were compared for the beginning, middle, and end of the 20-month period. A 25% production increase was observed. In addition, as shown in Figure 2, an eight-week moving average chart was compiled and a linear trend line showed a similar production volume increase over the entire period.
Figure 2. 8-week moving average chart of pieces produced per shift
As the hazard rate of a machine increases, it becomes less reliable, and subsequently breaks down more often. Higher downtime means the machine is less available for production purposes. Consequently, fewer parts are produced, ultimately resulting in lost revenue. Therefore, in the case of company Alpha, in order to tie production numbers to the PHM, we considered the shortage of parts produced compared to the weekly target of 5,000 gears, or 333 gears per shift. This shortage in reaching the target volume is considered as a covariate. As production volumes increased, the shortage to target decreased. During the period of data collection, there was no personnel turn-over. All operators were assigned to a particular machine and learned the specifics of that machine gradually as time went on. There were no formal training programs offered by the company. Operators learned by experimenting and by sharing their trouble-shooting experiences. Given the fact that some of the equipment was purchased for over $10 million, this was not a wise strategy by the company. Unskilled operators could not operate the machines at all. They were first partnered with a skilled operator to follow around and observe, after which they were considered a novice. After this initial training stage, novice operators were left on their own to run the machines they were responsible for. Over time, operators gained additional knowledge the longer they worked on the equipment. The following are the assumptions we are making throughout the discussions in this paper: • There is a heavy load on the machines right from the start. The machines are used for production around the clock and sometimes seven-days-a-week. Degradation starts to occur quickly. • Operators can never be perfect, no matter how much training they receive. The learning curve increases with a decreasing slope and a ‘perfect’ line acts as an upper-bound asymptote the learning curve will never reach. There is always the possibility of operator error. This possibility merely decreases with training and time. • We only consider scenarios where the operators are unfamiliar with the machinery. The equipment is relatively new and the operators’ learning curve is changing significantly. The learning curve has not reached a plateau yet. • Operators stay with the equipment throughout the entire planning horizon. Employee turn-over is negligible. 3
SKILL LEVEL CLASSIFICATION
A significant portion of the 25% capacity increase observed in the system can be attributed to the increased skill level of the operators and the maintenance crew, as well as the improved decision-making of the managers. These can all be grouped together as human-related factors. An extreme case scenario may attribute the entire 25% capacity increase to the additional operator skills. This would not be accurate because, in the case of Alpha, some equipment modifications were implemented during the study period. However, attributing this entire increase to operator skill can provide us with an upper bound
on the effect of human reliability. Another extreme case scenario may be to completely ignore human reliability and attribute all downtime to machine breakdowns. This would be incomplete and misleading for resource allocation. However, it can provide us with a lower bound for our eventual considerations of the human aspect within the overall risk. Section 4 discusses the correct level between the two bounds to accurately portray the contribution of human reliability to the overall system hazard. Knowledge in the form of machine operating skill is a function of other factors, the most important of which is time. It is an increasing function that has a decreasing effect on the system hazard, meaning that as the operator gets more skilled on the machine, there will be fewer mistakes, quicker recoveries, and more efficient operations. If we add human reliability as a new covariate, Zi(t), we can initially calculate the corresponding coefficient γi and then evaluate whether or not this coefficient is significant. If so, the addition of human reliability as a covariate can be expected to have a significant effect on the overall system hazard. The skill categories chosen for this analysis and the respective production totals expected at each skill level are given in Table 1. Based on our experience, these categories are novice, acceptable, good, and excellent. One factor to be cognizant of is the attribution of the various skill levels to the operators. As an operator goes from "novice" to "acceptable", the positive impact on the operation of the machine should be equivalent to that of the operator going from "acceptable" to "good". An unskilled operator is not considered in the model, as they will not be operating the machinery on their own and, as a result, cannot theoretically affect the system. Table 2 shows the time that it normally took a typical operator to go from one skill level to the next. It also shows the duration it takes for an expert to formally train the operator to elevate their knowledge to the next level. Whether through training or the passage of time, the operator gains more skill in operating the particular machinery and progresses through the various skill categories. However, the rate of change of these steps is different, with more time required for the higher skill levels. Eventually, there comes a point where skill enhancement is minimal and reaches a plateau. An operator can never be perfect. As can be seen in Figure 3, operators go through the various skill levels but there will always be a possibility of human error. However, this probability is reduced with training and time.
Table 1. Skill level categorization
Skill Level      Per-shift production level
Novice           100
Acceptable       165
Good             235
Excellent        300
Table 2. Skill level enhancement
Skill level change       Duration with no training   Training duration by an expert
Novice to Acceptable     1 month                     2 shifts
Acceptable to Good       4 months                    4 shifts
Good to Excellent        12 months                   4 shifts
Figure 3. Skill level improvement over time (skill level rising from Novice through Acceptable, Good and Excellent, approaching but never reaching Perfect).
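Purely as an illustration of Tables 1 and 2 (this helper is not part of the paper), the untrained skill progression and the associated per-shift production can be encoded as a simple lookup:

```python
# Illustrative helper (not from the paper): map elapsed on-the-job time to the
# skill categories of Tables 1 and 2, assuming no formal training is given.
def skill_level(months_on_machine: float) -> str:
    # Cumulative durations from Table 2: 1 month to Acceptable,
    # +4 months to Good, +12 months to Excellent.
    if months_on_machine < 1:
        return "Novice"
    if months_on_machine < 1 + 4:
        return "Acceptable"
    if months_on_machine < 1 + 4 + 12:
        return "Good"
    return "Excellent"

PER_SHIFT_PRODUCTION = {"Novice": 100, "Acceptable": 165, "Good": 235, "Excellent": 300}

for m in (0.5, 3, 10, 20):
    level = skill_level(m)
    print(f"{m:>4} months -> {level:10s} (~{PER_SHIFT_PRODUCTION[level]} pieces/shift)")
```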
4
PROPOSED IDEA
As mentioned earlier, the PHM for a system includes the baseline hazard that accounts for the age of the equipment. As a machine ages, its failure risk increases. However, even for a PHM that includes HR covariates, the baseline hazard's higher unreliability over time may not be strictly machine-related. An inadequately trained operator who mistakenly causes a few machine crashes may leave behind a machine that will never be the same even after repairs. Some components may be replaced or put back together, but the ultimate functionality or alignment may never be restored to the original factory settings. Consequently, as the machine ages, its rate of reliability degradation will be more rapid than that of a machine that did not experience such drastic incidents. Referring back to the case of company Alpha, a comprehensive way to analyze the system through a PHM would be to initially identify all failure modes and then group them into two categories, machine-related and human-related. Various metal contents in oil analysis, vibration levels of various axes, or certain trends in gear measurements are examples of covariates related to failure modes associated with the machines. On the other hand, performing the wrong sequence of commands, wrong set-up of distance or cutting angle, or inaccurate levels of various lubes are examples of covariates related to failure modes associated with the operator. In a sensible PHM analysis of this system, appropriate information about the baseline hazard can be obtained by consulting the Original Equipment Manufacturer (OEM). The OEM has its own estimates of machine characteristics such as MTTR and MTBF, and it can also obtain accurate estimates by collaborating with other customers using similar equipment in more specialized manufacturing sites. Client sites with highly specialized operators and maintenance crews, trained by the OEM representatives, will have equipment that is well maintained and properly operated from the beginning. Failure data and best preventive practices across various sites can be used to determine machine MTTR and MTBF, as well as inspection frequencies. In these environments where the workforce is highly trained, most machine failures will be truly machine-related, and this leads to a more accurate estimate of the MTBF. In addition, the maintenance crew is well trained and the MTTR is the minimum time it takes to repair the machine without wasting any time in the diagnosis or the repair. We can also collect failure and repair data to form Alpha's estimates of the MTBF and MTTR. Once the mean times of the specialized manufacturing sites are superimposed on the mean times of Alpha, the gap between the curves represents the role of human reliability, or the effect of lack of training. This gap is shown graphically in Figure 4, where machinery used at a site with fully trained operators experiences a higher MTBF than identical machines used at a site with untrained operators. In the case of the operators at Alpha, this effect on the hazard rate falls between the lower and the upper bounds discussed in section 3. Once we have the overall effect on the hazard and a precise estimate of the baseline hazard, we can get a more accurate estimate of the human-related covariate, which includes all failure modes attributed to the human operator.
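As a hedged illustration of how Equation (1) could combine machine-related and human-related covariates, the sketch below assumes a Weibull baseline hazard and uses invented coefficient values; the actual coefficients for Alpha were still to be estimated from the data described below.

```python
# Sketch of the PHM of Eq. (1) with a Weibull baseline and one human-related
# covariate (shortage to the production target). All numbers are hypothetical;
# the paper's coefficients had not yet been estimated at the time of writing.
import math

def phm_hazard(t, covariates, coeffs, beta=2.0, eta=5000.0):
    """h(t) = h0(t) * exp(sum_i gamma_i * Z_i(t)) with a Weibull baseline h0."""
    h0 = (beta / eta) * (t / eta) ** (beta - 1)          # baseline hazard (equipment age)
    return h0 * math.exp(sum(g * z for g, z in zip(coeffs, covariates)))

# Z1: vibration level (machine-related); Z2: shortage to the 333 gears/shift target
# (human-related proxy for operator skill). The gamma values are illustrative only.
gammas = [0.8, 0.004]
print("untrained crew:", phm_hazard(t=2000, covariates=[1.2, 230], coeffs=gammas))
print("trained crew  :", phm_hazard(t=2000, covariates=[1.2, 30],  coeffs=gammas))
```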
Figure 4. MTBF comparison of two sites, with and without workforce training (MTBF versus time for a skilled and an unskilled workforce; the gap between the curves represents the effect of training).
We are currently in the process of collecting more data to enable us to perform some calculations based on the aforementioned ideas. We have received some machine failure data, including the frequency and duration of failures, which we will use to calculate the MTBF and MTTR of the machines used at Alpha. Furthermore, we await the receipt of data from the OEM on MTBF and MTTR of machines identical to those at Alpha but used at locations with a highly skilled workforce. We can then compare the two MTBF curves, one based on data from Alpha and the other based on the OEM data. If we use a PHM with the baseline hazard derived from the OEM data, the gap can be used to calculate the effect of HR factors as a covariate on the overall hazard rate. The coefficient of this covariate will be calculated from the Alpha data. A hypothetical scenario can then be set up where the effects of training can be assessed by hypothetically taking the operators through the various skill levels quicker by training programs. The cost of the training sessions can then be subtracted from the total cost of the overtime
shifts and the additional material and personnel resources. The difference between these two costs will demonstrate the importance of training and its value in the overall maintenance optimization strategy. 5
CONCLUSION
In this paper, the impact of human reliability is assessed on systems where the machinery is new and there is a heavy load on them at the start. If training is not done initially, operators will be unskilled and will commit many errors. Consequently, the system experiences extensive downtime, due more to HR errors since MR failures in new equipment are expected to be infrequent. If the maintenance manager ignores the important role of the human unreliability in this context, the majority of efforts and resources towards the equipment are misdirected. Resources are wasted and the desired results of increasing uptime and raising production levels are not achieved. By recognizing the significance of human reliability, one can dedicate the necessary resources to train the workforce in order to achieve the ultimate goal of increased revenue generation through higher production volumes. A widely used tool in condition-based maintenance, PHM, can become more effective and act as an all-encompassing tool with the inclusion of human reliability as a covariate. This paper does not discuss the various aspects that can be covered under human reliability. Rather, it groups all human-related factors together and considers them all as “operator skill”. Once the overall effect of the HR factors on the hazard rate is accurately estimated, the maintenance decision maker can use this estimate to assess the effect of training programs and justify their cost. The work in this paper requires the skill classification of the operators and their enhancement from one skill level to the next over time. In future research, fuzzy logic will play a role in determining the skill level of the operators. Otherwise, it would be difficult to differentiate between where one level would end and the next level would begin. In addition, we have only considered the case of new machinery. In the long term, after the machine has run for a while and the operators are all experts, the significance of the role of human reliability should be further investigated. 6
REFERENCES
1. Vlok P.J., Coetzee J.L., Banjevic D., Jardine A.K.S. & Makis V. (2002) Optimal component replacement decisions using vibration monitoring and the PHM, Journal of the Operational Research Society, 53, 193-202.
2. Jardine A.K.S. & Banjevic D. (2005) Interpretation of inspection data emanating from equipment condition monitoring tools: method and software. In Armijo Y.M., Ed. (2005) Mathematical and Statistical Methods in Reliability, Singapore: World Scientific Publishing Company.
3. Bubb H. (2005) Human reliability: a key to improved quality in manufacturing, Human Factors and Ergonomics in Manufacturing, 15(4), 353-368.
4. Latorella K. & Prabhu P. (2000) A review of human error in aviation maintenance and inspection, International Journal of Industrial Ergonomics, 26(2), 133-161.
5. Barroso M. & Wilson J. (1999) HEDOMS – Human error and disturbance occurrence in manufacturing systems: towards development of an analytical framework, Human Factors and Ergonomics in Manufacturing, 9(1), 87-104.
6. Blanks H. (2007) Quality and reliability into the next century, Quality and Reliability Engineering International, 10(3), 179-184.
7. Zimolong B. & Trimpop R. (1994) Managing human reliability in advanced manufacturing systems. In Salvendy G. & Karwowski W., Eds. (1994) Design of Work and Development of Personnel in Advanced Manufacturing, Ch. 15, John Wiley & Sons.
8. Dhillon B. & Liu Y. (2006) Human error in maintenance: a review, Journal of Quality in Maintenance Engineering, 12(1), 21-36.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ASSESSING THE RELIABILITY OF SYSTEM MODULES USED IN MULTIPLE LIFE CYCLES Muhammad Ilyas Mazhar, Muhammad Salman and Ian Howard Department of Mechanical Engineering, Curtin University of Technology, Perth, Australia This paper presents a reliability assessment model for maintenance and reliability engineers to determine the reusability potential of system modules based on their operating lives and, the pre-determined lifecycle of the parent product. In the first phase, the paper considers the time-to-failure of system modules to determine their overall operating life under normal operating conditions. Then, it determines the maximum number of lifecycles of the modules under consideration, which is a function of the modules’ total functional life and the product lifecycle. The functional potential of the modules is then discussed with reference to the probability of their failure. The study employs Weibull analysis for carrying out this analysis. The methodology was validated by using lifecycle data from a consumer product. The findings show that some of the components/modules of the product have a remarkable amount of residual life, which can be utilized by reusing these components in the next generation of the product. The results demonstrate the effectiveness and practicality of this multiple-use strategy. The study provides more perspectives to future research in the field of reliability, and decision-making on maintenance management and value recovery through multiple uses of system modules. Key Words: Reliability, Multiple-use, life cycle 1
INTRODUCTION
Environmental awareness, depletion of available non-renewable resources, and international and national legislation on industrial production and waste management are demanding remarkable changes in the manufacturing and asset management culture. Optimizing equipment utilization and minimizing harmful impacts on the environment by maximizing material recovery from discarded/returned products are also promoting this shift of paradigms. This change in production and asset management culture can be attributed to the growing recognition among leaders in the global business community that profitability alone is an inadequate measure of success, and that many of the intangible concerns associated with sustainability are fundamental drivers for long-term shareholder value [1]. Conventional products are designed to be suitable for a single lifecycle, after which they are sent to landfill. Such products are based on the cradle-to-grave concept, which is no longer a preferred approach in the modern world. Many countries are introducing environmental protection legislation and policies, such as WEEE, EuP and RoHS, to facilitate and ensure this transformation [2-4]. Some of these legislations, for example WEEE, make producers responsible for take-back and end-of-life treatment of their products. As a result, the volume of used products and their associated components continues to grow internationally. Increased awareness and customer service considerations are other contributing factors. Consequently, manufacturing and maintenance engineers are striving hard to optimize plant and equipment utilization by promoting multiple uses of parts/modules of used products. One such example is Multi-Life Products, described as products spanning several market cycles. A Multi-Life Product is defined as a product model representing several subsequent generations of a certain product. Multi-Life Products increase the sustainability of physical products as well as product models [5]. Another approach is product recovery, which has traditionally been viewed by customers as an economically and environmentally beneficial alternative to ordering new products. Product recovery is an overarching concept that covers reuse, remanufacturing and recycling. The aim of product recovery is to retrieve a product's inherent value when the product no longer fulfils the user's desired needs [6]. Research indicates [7] that reusing parts/products is one of the best ways of transforming the existing open-loop practices into a manufacturing and maintenance management culture that is more efficient, technologically feasible, economically competitive and environmentally friendly. Furthermore, reusing components from old products is technologically feasible and does not compromise product quality [8-10].
The reuse approach becomes particularly important for products that have shorter lifetime but longer technological life, as shown in Figure 1. For example, washing machines and refrigerators could be good candidates for parts recovery and reuse.
Figure 1: Product Lifetime and Technological Life of Products [11, 12] Once the product category has been decided, the next stage is the identification of product components or modules (subassemblies) that can be reused. As shown in Figure 2, some of the components/modules of a product have greater life than the life of their parent product. Making use of this residual life would result in an enormous amount of savings in resources.
Figure 2 : Reliability bathtub curve – product and its modules [13] However, this approach is not easy to be applied in reality. There are several uncertainties associated with the reuse of modules/components of old equipment; the most common is the uncertainty of the parts’ reliability after use [14]. This paper contributes to the product recovery and asset management by promoting multiple-use of components/modules of products and systems. The objective is to provide a strategy that helps reduce material consumption without compromising the product quality. The study investigates the behaviour of system modules in their first, second, and third lifecycles. The Weibull analysis has been employed to determine the reliability and mean life of system components. Then several hypothetical scenarios have been considered to estimate the maximum number of lifecycles of the components/modules under consideration. This gives rise to the concept of multiple-use of the system modules. The procedure is repeated for three consecutive lifecycles. The focal point of this study is to emphasise the fact that multiple-use of system modules is technologically feasible. It also highlights the importance of the lifecycle data in order to assess the reliability of used system modules with required levels of
confidence. The actual time-to-failure data pertaining to the subassemblies (modules) of a washing machine was used to demonstrate the application of the proposed multiple-use strategy.
2 METHODOLOGY
A three-step methodology is proposed for determining the reliability of system modules before they are reused in the next generation of a product. This includes: a) Estimating the product lifecycle, b) Estimating the residual life of the components/modules, c) Determining the number of lifecycles of components/modules. 2.1 Product lifecycle The actual life span of a product depends on a number of factors, including its design life, operating environment and maintenance etc. The design data combined with statistical and condition monitoring information could be used to estimate the product lifecycle. This type of decision-making is relatively easy in maintenance systems because of the availability of condition monitoring data. In the case of consumer products, empirical evidence could be gathered to determine the product lifecycle. The condition monitoring data can also be used for estimating the actual (used) life of components of a product [15]. 2.2 Residual life of components/modules The residual life is a function of two parameters – i.e. product lifecycle and total functional life of the components/modules under consideration. Mathematically,
LRS = LTF - LP
(1)
where:
LRS = residual life of components at the end of the first lifecycle,
LTF = total functional life of a component/module,
LP = product lifecycle.
Estimating the total functional life of individual modules/components of a product operating under the given conditions is mainly based on time-to-failure information. Gathering this type of data is common practice in the majority of maintenance management systems. However, collecting this information on consumer products may not be an easy task. In-house testing is an alternative to fill this requirement. In the preceding studies [10, 12, 16], time-to-failure data was collected for the gearbox and electric motor of a washing machine. As the reported data belongs to the same category as the selected components, the reliability of those components was recalculated by collating the entire data, as shown in Table 1.
Table 1 Time-to-failure data

Component        Failures   Suspensions*
Gearbox          9          79
Electric motor   2          86

*It refers to a situation where the component was still functioning properly when the experiment was terminated.
The two-parameter Weibull distribution [17-19] can be employed to determine the mean life of the selected components. The following equations describe the basic form of Weibull analysis:
F(t) = 1 − exp[−(t/η)^β]    (2)

and

MTTF = η·Γ((β + 1)/β)    (3)
where: F(t) = fraction of units failing, β = shape parameter, η = scale parameter, MTTF = mean time to failure (total functional life).
Once the product lifecycle (LP) and total functional life (LTF) of product modules are known, the residual life (LRS) can be calculated using equation (1). This value will help make the decision whether or not it is worth considering the component for reuse. It has already been argued [20] that a product intended for reuse should have an operating life equal to at least two lifecycles (functional lives). Therefore:
IF LRS ≥ 2 LP, the component can be reused.
IF LRS < 2 LP, the component is rejected.
The maximum number of lifecycles of the component is estimated in the following section. 2.3 Component lifecycles Mathematically, it appears to be the easiest of the three steps. Dividing the total functional life of components/modules by the product lifecycle will give the maximum number of lifecycles.
NLC = LTF / LP    (4)

where NLC refers to the total number of lifecycles. It relates to the number of times a component can be used (reused) in the entire technological life of the category of products under consideration.
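Equations (2) to (4) are straightforward to script. The following minimal Python sketch is not part of the original paper; the shape and scale values used in the example are those reported later in Table 2 for the gearbox. It computes the mean life from equation (3) and the number of lifecycles from equation (4).

```python
import math

def weibull_mttf(beta, eta):
    """Mean time to failure of a two-parameter Weibull distribution, equation (3)."""
    return eta * math.gamma((beta + 1.0) / beta)

def number_of_lifecycles(l_tf, l_p):
    """Maximum number of product lifecycles a module can serve, equation (4)."""
    return l_tf / l_p

# Example with the gearbox parameters reported later in Table 2.
l_tf = weibull_mttf(beta=2.6, eta=45.9)                 # total functional life, roughly 40.7 years
print(round(l_tf, 2), round(number_of_lifecycles(l_tf, l_p=13.0), 2))
```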
3 RESULTS
As the available time-to-failure data pertain to washing machines, the proposed methodology requires information about the lifecycle of a washing machine. Empirical evidence indicates that the functional life of a washing machine falls between 10 and 15 years [21]. For this analysis, three scenarios were considered using lifecycles of 10, 13 and 15 years. Applying the Weibull procedure to the time-to-failure data (Table 1) yielded the results shown in Table 2.
Table 2 Functional lives of gearbox and electric motor

Component        Shape parameter   Scale parameter (years)   Mean life (years)
Electric motor   5.6               42.5                      39.42
Gearbox          2.6               45.9                      40.72
When compared to the preceding study [13], the results showed a remarkable improvement in the mean life of the gearbox, which is attributed to the additional five data sets. All of the five data sets were suspended values that resulted in reducing the probability of failure (Figure 3). On the other hand, there was a slight decrease in the mean life estimation for the electric motor. This is due to another failed unit in the additional five data sets. The data used by Mazhar et al. [13] contained just one failure of the electric motor, whereas inclusion of the five additional data sets, used in the current analysis, doubled the number of failures in the overall data set.
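The mean-life estimates discussed above come from fitting the two-parameter Weibull distribution to a mixture of failures and suspensions (right-censored observations). As a hedged illustration of how such a fit can be carried out, the sketch below uses the third-party lifelines package; the duration values are invented placeholders, since the individual times-to-failure behind Table 1 are not listed in the paper.

```python
# Sketch only: the durations below are illustrative placeholders, not the paper's data.
from lifelines import WeibullFitter

durations = [12.0, 18.5, 22.0, 30.0, 35.0, 15.0, 20.0, 25.0, 40.0, 45.0]  # years in service
observed  = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]   # 1 = failure, 0 = suspension (still working)

wf = WeibullFitter()
wf.fit(durations, event_observed=observed)    # maximum-likelihood fit with censoring
print(wf.rho_, wf.lambda_)                    # shape (beta) and scale (eta) estimates
```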
Figure 3: Weibull probability plot – gearbox

Both the gearbox and electric motor were found to qualify for reuse as the total functional life (LTF) of each was greater than double the maximum lifecycle of the product. Considering washing machine lifecycles of 10, 13 and 15 years produced the results shown in Table 3.
Table 3 Component lifecycles

Component                            LP = 10 years   LP = 13 years   LP = 15 years
Gearbox (LTF = 40.72 years)          4               3               2.7
Electric motor (LTF = 39.42 years)   ≈4              3               2.6
It can be seen that both of the selected modules possess immense potential for reuse. The first scenario provides four lifecycles for both of the modules. A close examination of the reliability bathtub curve indicates that two of the component lifecycles are in the useful life zone that provides the minimum rate of failure, whereas each of the remaining two lifecycles contains one of the reliability phases that contain a higher rate of failure, as shown in Figure 4.
Figure 4: Functional life of product modules (four lifecycles) The second scenario that provides three lifecycles is described in Figure 5. The first lifecycle contains the regular burn-in (infant mortality) phase, which starts off with higher rate of failure and then settles down to useful life (stabilised) phase. As shown, this phase is virtually free of failures due to deterioration. The second lifecycle is the best one because the entire lifecycle is in the useful life (stabilised) zone, which is free of both the early life and wear-out life phases that are prone to failures. This highlights that reusing system modules, in fact, reduces the probability of failures.
Figure 5: Functional life of product modules (three lifecycles)

The third and final scenario provides at least two lifecycles, each of which shares one of the life phases prone to higher failures. Therefore, the total number of failures was divided between two lifecycles, thus reducing the probability of failure. It is evident from the results that the multiple-use strategy adds more certainty by reducing the probability of failure. This is primarily due to the fact that the total number of failures was shared by the component lifecycles when the multiple-use strategy was implemented. On the other hand, both the burn-in and deterioration failures occur in the same lifecycle if the components’ operation is based on a single lifecycle.
4 CONCLUSION
This paper presents a reliability assessment strategy for the reuse of components/modules of products and systems. It has been shown that reusing system modules doesn’t compromise product quality. The proposed three-step multiple-use methodology determines the eligibility and reliability of components intended for reuse. The suggested procedure improves the reliability estimates by reducing the probability of failures. The proposed approach is more realistic and reliable as it has been validated by real data. The results indicate that the gearbox and electric motor of a washing machine possessed immense potential that could be realised by reusing them for two to three more lifecycles.
5
REFERENCES
1
Américo Guelere Filho, Aldo Roberto Ometto, and D.C.A. Pigosso, (2008) A Proposal for a Framework for Life Cycle Engineering, in 15th CIRP International Conference on Life Cycle Engineering. Sydney.
2
Allen H Hu, Chia-Wei Hsu, and W.-C. Wu, (2008) Risk Management of Green Components Using Failure Modes and Effects Analysis, in 15th CIRP International Conference on Life Cycle Engineering. Sydney.
3
Kara, S., M.I. Mazhar, and H. Kaebernick, (2004) Lifetime prediction of components for reuse: an overview. International Journal of Environmental Technology and Management, 4(4), 323 - 348.
4
Fatida Rugrungruang, Sami Kara, and H. Kaebernick, (2008) Technological Forecasting for Component Reuse. In 15th CIRP International Conference on Life Cycle Engineering. Sydney.
5
Jörg Feldhusen, Frederik Bungert, and M. Löwer, (2008) A Methodical Approach to Increase the Sustainability of Physical Products and Product Models, in 15th CIRP International Conference on Life Cycle Engineering. Sydney.
6
Johan Östlin, Erik Sundin, and M. Björkman, (2008) Business Drivers for Remanufacturing, in 15th CIRP International Conference on Life Cycle Engineering. Sydney.
7
Seliger, G., A. Buchholz, and W. Grudzein, (2002) Multiple Usage Phases by Component Adaptation. in The 9th CIRP International Seminar on Life Cycle Engineering. Erlangen, Germany.
8
Klausner, M., W. Grimm, and C. Hendrickson, (1998) Reuse of Electric Motors in Consumer Products: Design and Analysis of an Electronic Data Log. Journal of Industry Ecology, 2(2), 89 - 102.
9
Scheidt, L. and Z. Shuqiang, (1994) An Approach to Achieve Reusability of Electronic Modules. In The IEEE International Symposium on Electronics and the Environment.
10
Mazhar, M.I., S. Kara, and H. Kaebernick, (2004) Reuse Potential of Used Parts in Consumer Products: Assessment with Weibull Analysis. International Journal of Production Engineering and Computers, 6(7), 113 -118.
11
Fujimoto, H.A., (2001) Planning for product take-back and component life under uncertainty in technological evolution. In the Second International Symposium on Environmentally Conscious Design and Inverse Manufacturing.
12
Mazhar, M., I., (2006) Lifetime monitoring of appliances for reuse, PhD Thesis, School of Mechanical and Manufacturing Engineering. University of New South Wales: Sydney.
13
Mazhar, M.I., S. Kara, and H. Kaebernick, (2004) Reuse Potential of Used Parts in Consumer Products: Assessment with Weibull Analysis. in The 11th International CIRP Life Cycle Engineering Seminar on Product Life Cycle - Quality Management. Belgrade, Serbia.
14
Kaebernick, H., M. Anityasari, and S. Kara, (2002) A Technical and Economic Model for End-of-Life Options of Industrial Products. International Journal of Environment and Sustainable Development, 1(2), 171 -183.
15
Mazhar, M.I., S. Kara, and H. Kaebernick, (2007) Remaining life estimation of used components in consumer products: Life cycle data analysis by Weibull and artificial neural networks. Journal of Operations Management, 25(6), 1184-1193.
16
Jiří Vass, et al., (2008) Vibration-based Approach to Lifetime Prediction of Washing Machines. In 15th CIRP International Conference on Life Cycle Engineering. Sydney.
17
Abernethy, R.B., (1993) The New Weibull handbook. The Author, North Palm Beach Florida.
18
Cole, G.K., (1998) Practical Issues Relating to Statistical Failure Analysis of Aero Gas Turbines. Proceedings of the I MECH E Part G Journal of Aerospace Engineering, 212(3), 167 - 176.
19
Harris, J., (2001) Piecewise linear reliability data analysis with fuzzy sets. Proceedings of the I MECH E Part C Journal of Mechanical Engineering Science, 215(9), 1075 -1082.
20
Susumu Okumura, T. Morikuni, and N. Okino, (2003) Environmental effects of physical life span of a reusable unit following functional and physical failures in a remanufacturing system. International Journal of Production Research, 41(16), 3667 - 3687.
21
Moore, P.R., et al., (2000) Life Cycle Data Acquisition Unit - Design, Implementation, Economics and Environmental Benefits. In The IEEE International Symposium on Electronics and the Environment. San Francisco.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
IMPACT OF VIBROACOUSTIC DIAGNOSTICS ON CERTAINTY OF RELIABILITY ASSESSMENT Radkowski S, Gumiński R Institute of Automotive Engineering, Warsaw University of Technology, Narbutta 84, 02-524 Warsaw, Poland.
A relevantly defined operational strategy has a decisive influence on the ability to maintain and improve reliability and safety, as well as on maintaining manufacturing quality. The paper presents a proposed approach to implementing proactive operation. Particular attention is paid to the selection and adaptation of methods for diagnosing the low-energy phases of defect development, as well as to the use of a posteriori diagnostic information and the application of the Proportional Hazards Model (PHM) in these tasks. Attention is also paid to the importance of technical risk analysis. Key Words: vibroacoustic diagnostics, maintenance, Proportional Hazards Model, Bayesian Updating
1
INTRODUCTION
A reduction of the uncertainty of reliability estimations becomes a critical issue in the process of making the decisions which are intended to ensure technical safety of the system and minimize the costs. One of the essential methods of reducing the epistemological uncertainty is to develop models and diagnose the degradation and wear and tear processes, thus reducing the variance of evaluations of the residual period until the occurrence of a catastrophic defect. From this point of view it is the implementation of proactive operational strategy that becomes particularly important. As is presented by Figure 1, the essence of such an approach boils down to anticipation of preventive actions, in equal degree prior to defect emergence as well as during the period of development of low-energy phases of defects. This calls for developing and adapting relevant methods of diagnosis which are supported by relevant diagnostic models.
Figure 1. Comparison of technical diagnostics goals in proactive maintenance versus reactive maintenance [1] (symptom value of damage versus operating time; proactive maintenance targets pre-nucleation and early-stage failure detection, whereas reactive maintenance acts at failure and repair)
Let us note that the essence of thus defined a strategy is the extensive use of monitoring, diagnosis, forecasting and decision-making models for creating the possibilities of taking maintenance-and-repair actions while anticipating problems [2]. This denotes the need for developing and applying advanced monitoring, presentation of information on emergency states and values, selection of methods and means enabling monitoring and on-line inference in a way enabling early detection of growing disturbances and extracting from general signals the information on anomalies in operation which are characteristic of defects; controlling the defects and taking corrective actions by the operator in order to minimize and in particular to avoid undesirable developments leading to serious consequences; development of a forecast of future events based on current observations and registered permanent changes of parameters which have been detected by analyzing the results and the measurements collected in the database. The last item is particularly important when monitoring the condition of mechanical elements and units as well as the remaining components which are subject to degradation and wear and tear for which the detection of early phases of defect development may help prevent the occurrence of the catastrophic phase of defect development, including destruction of the whole system. Realization of this goal calls for assessment of structural reliability of the system while accounting for detection and analysis of degradation processes affecting all the components during the previous and current period of use. This requires development of a relevant database containing information on potential defects of the system’s components, knowledge gathered based on the experience acquired by relevantly trained personnel as well as procedures which account for the feedback and adaptation changes occurring in the system. In the process of estimating the probability of defect occurrence, the above enables us to account for the influence of operational conditions on the possibility of defect occurrence, the influence of earlier defects, quality, scope and intervals between inspections, the probability of defect occurrence in specific time in the future. Let us note that estimation and modelling of the degradation process is one of the most effective methods of defect development anticipation and maintenance of system operation in terms of nominal parameters. In reality such an approach denotes compilation of several conventional methods of forecasting – probabilistic behavioral models and event models in particular. Probabilistic behavior and degradation models enable analysis of the type and extent of uncertainty which conditions forecasting reliability. Event models are a kind of a combination between the contemplated models and the actual system and they make up the basis for constructing and analyzing causal models which enable assessment of degradation and determination of the optimum scenario of maintenance-and-repair work. An additional problem is how to determine the impact of degradation processes on realization of tasks related to classification, determination of regressive relations and evaluation of probability distribution evolution. In the to-date contemplated models the probability of defect occurrence was being defined on the assumption of invariability of examined distributions during operation of the object. 
In reality, as a result of wear and tear processes and the associated changes of the conditions of mating elements and kinematic pairs, we observed conditional probability distributions; the relationship demonstrated itself both in quantitative terms (change of the parameters of the probability density function) and in qualitative terms (change of the function describing the distribution). In addition, the degradation processes accompanying the performance of functional tasks can cause similar variability of the distributions of the probability which describes load capacity. In this case one can expect that the location of the separating line and the probability of defect occurrence will depend not only on the time of operation of an object but also on the new dynamic feedbacks in the system, associated in particular with the development of non-linear relations and non-stationary disturbance.
2 DIAGNOSTIC MODEL IN DETECTION OF LOW-ENERGY DEFECTS
While attempting to develop a model oriented on such defects one should, on the one hand, consider the issue of examining the signal's parameters from the point of view of their sensitivity to low-energy changes of the signal and, on the other, the issue of quantification of the energetic disturbances occurring in the case of defect initiation [3]. Let us assume that the degree of damage D is the dissipated variable that covers the changes of the structure’s condition due to wear and tear:
dE_d(Θ, D₀) = [∂E_d(Θ, D₀)/∂D]·dD + [∂E_d(Θ, D₀)/∂Θ]·dΘ    (1)

where:
dE_d = df(D, Θ, γ(Θ))/dΘ,
γ(Θ) – the parameter describing how big a part of the dissipated energy dE_d is responsible for structural changes,
Θ – operating time.

Bearing in mind the possibility of diagnosing the origin and the development of low-energy phases of defect formation, when the extent of the original defect can be different in each case, let us analyse this issue more precisely. To examine this problem let us recall here the two-parameter isothermal energy dissipation model proposed by Najar [4]:
dE_d^s = dE_d − dE_d^q = T·ds = σ_Θ·dD    (2)

where:
dE_d^q – energy transformed into heat,
dE_d^s – energy responsible for internal structural changes,
T – temperature,
ds – growth of entropy.

The expression (2) shows that the growth of the dissipated variable D is attributable to the dE_d^s part of energy, which is the dissipated part of the dE_d energy that causes the growth of entropy ds. The role of the multiplier determining the relation between the increments of the dissipated variable and the entropy is played by the dissipation stress σ_Θ. The assumption of T = const results in independence of the dissipation-related loss, dE_d^s = dE_Θ^s; thus, following integration, the expression (2) takes the following form:

E_d^s = T·Δs    (3)
The derivative of the defect development energy with respect to D, when E_f(D₀) ≤ ½Eε² (the boundary value of deformation energy), takes the following form:

dE_d^s/dD = [E_f(D₀)·(1 − D_f)·(1 − k)·D^(−k)] / [D_f^(1−k) − D₀^(1−k)]    (4)
For a defined initial defect D₀ and for a defect leading to damage D_f, relationship (4) takes the following form:

dE_d^s(D)/dD = (1 − k)·E_{D₀,f}(k)·D^(−k)    (5)
Let us note that the parameter E_{D₀,f} is an exponential function of the power k, similarly as the whole derivative. While referring to the second rule of thermodynamics for irreversible processes we will assume the following in the contemplated model:

dE_d^s(D)/dD ≥ 0    (6)
Thus, for the assumed model to be able to fulfill condition (6), the exponent must meet the requirement k ≤ 1. In addition, while referring to the rule of minimization of dissipated energy, the conditions of the permissible wear process show that a change of the exponent k is possible as the defect develops. To examine this problem let us assume that the exponent shows a straight-line dependence on the extent of damage:

k(D) = a + bD    (7)
For damage of small magnitude the linear approximation seems to be sufficient and enables description of defects whose emergence is characterized by small growth of defect energy (see Figure 2).
Figure 2. Change of energy of defect development for small D (D – normalized degree of failure)

Thus, while defining the set of diagnostic parameters, we should pay attention to the need for selecting a criterion such that it will be possible to identify defects whose emergence is characterized by a small growth of defect-related energy. While contemplating this issue let us assume that the vibroacoustic signal is real and meets the cause-and-effect requirement, which means that it can be the basis for creating an analytical signal. In accordance with the theory of analytical functions, the real and the imaginary components are functions of two variables and meet the Cauchy-Riemann requirements. The analysis of the run of the analytical signal will be conducted while relying on the observation of changes of the length of the vector A and the phase angle φ:
z(x, y) + j·v(x, y) = A(cos φ + j sin φ)    (8)

Thus,

z(x(t), y(t)) = A(t)·cos φ(t),    v(x(t), y(t)) = A(t)·sin φ(t)    (9)

which means that the signal measured is the orthogonal projection of vector A on the real axis. Ultimately, while exploiting the Cauchy-Riemann conditions for the variables A(t) and φ(t), we obtain:
dz/dt = (dA/dt)·cos φ − A·sin φ·(dφ/dt)    (10)
As we expected, the obtained relationship presents an equation that enables the analysis of the measured signal while observing A and φ. At the same time it should be noted that for low-energy processes, when we can neglect the changes of the vector’s length and assume A ≈ const, the whole information on the changes of the measured signal is contained in the phase angle, or more precisely in the run of the momentary angular velocity. While accounting for the obtained results of the analysis of the process of low-energy defect emergence and the detection of diagnostic information associated with the changes of momentary values of amplitude and angular velocity, let us analyse the conditions that must be fulfilled by a diagnostic model intended to enable observation of the influence of such disturbance on the form of the system’s dynamic response.
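In practice, A(t) and φ(t) in relationship (10) are usually obtained by forming the analytic signal of the measured vibroacoustic signal. The paper does not give an implementation; the sketch below, using the Hilbert transform from scipy on a synthetic phase-modulated tone, is one plausible way to extract the instantaneous amplitude and the momentary angular velocity (all signal parameters here are assumptions made for illustration).

```python
import numpy as np
from scipy.signal import hilbert

fs = 10_000.0                                   # sampling frequency [Hz], assumed
t = np.arange(0, 1.0, 1.0 / fs)
# Synthetic tone with a weak phase modulation standing in for a low-energy disturbance
x = np.cos(2 * np.pi * 500 * t + 0.05 * np.sin(2 * np.pi * 20 * t))

z = hilbert(x)                                  # analytic signal z(t) = x(t) + j*H{x}(t)
A = np.abs(z)                                   # instantaneous amplitude A(t)
phi = np.unwrap(np.angle(z))                    # instantaneous phase phi(t)
inst_freq = np.diff(phi) * fs / (2 * np.pi)     # momentary angular velocity / (2*pi)

print(A.mean(), inst_freq.mean())               # A roughly constant, mean frequency near 500 Hz
```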
3
BAYESIAN UPDATING PROCEDURE
The essence of this approach involves updating the estimated parameters of the probabilistic model so as to achieve better alignment between the results of modeling and the observations. In accordance with the above-presented assumptions, it is assumed that unknown or uncertain parameters of the distribution are random variables. Uncertainty of the estimation of results can be linked to the variability of random variables by means of Bayes' theorem [5]. Let us note that the Bayes formula can be simplified by accounting for the proportionality of the a posteriori and a priori distributions only:
f(a|D) = f(D|a)·f(a) / f(D) ∝ f(D|a)·f(a)    (11)
f(a|D) = K_B·L[D|a]·f(a)    (12)

where:
K_B – standardizing constant,
L[D|a] – likelihood function.

To be able to determine the probability of a defect in the analyzed timeframe, the information contained in the observations should account for both the occurrence of a defect and the non-occurrence of a defect. For the exponential form of the function describing the distribution, the likelihood function will be noted in the following form:

L[D|a] = ∏_{i=1}^{n} p(θ_f | a) · ∏_{j=1}^{m} [1 − P(θ_f | a)]    (13)

where:
n – denotes the set of detected defects,
m – denotes the set of events defining non-existence of a defect.
In a similar way the probability density is proportional to the square root of Fisher’s information matrix factor [6]:
f(a) ∝ (det I(a))^(1/2)    (14)

where:
I(a) = −E[∂² ln f(D|a) / ∂a²] – the matrix of average second derivatives of the likelihood-function logarithm, calculated based on the results of the experiment.

Thus formula (11) is finally written in the following way:

f(a|D) ∝ L(D|a)·(det I(a))^(1/2)    (15)
Figure 3. Probability density function of scale parameter.
Figure 4. Probability density function of scale parameter estimated using additional information

Figure 5. Reliability function with 90% confidence bounds (reliability versus time, with and without additional information)

Figures 3 and 4 present the scale parameter for the amplitude distribution of the 5th harmonic of the meshing frequency, obtained without taking into account additional information and with taking into account the information extracted during the diagnostic observation. Figure 5 presents the reliability functions with 90% confidence bounds, with and without the additional diagnostic information. The significant reduction of the dispersion of the results deserves special attention.
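The updating scheme of equations (11), (12) and (15) can be illustrated with a simple grid computation: a prior over the uncertain parameter is multiplied by the likelihood of the diagnostic observations and renormalised, and the reliability function is then averaged over the posterior. The sketch below is not the authors' implementation; it assumes an exponential lifetime model, a flat prior and invented observation values purely for illustration.

```python
import numpy as np

# Grid over the unknown scale parameter eta of an exponential lifetime model (assumed).
eta = np.linspace(5.0, 80.0, 400)
prior = np.ones_like(eta)                      # flat prior; a Jeffreys prior could be used instead
prior /= np.trapz(prior, eta)

# Invented diagnostic observations: two failures and one suspension (censored at 30).
failures, suspension = np.array([12.0, 25.0]), 30.0
lik = np.prod(np.exp(-failures[:, None] / eta) / eta, axis=0) * np.exp(-suspension / eta)

posterior = prior * lik
posterior /= np.trapz(posterior, eta)          # equation (11): f(a|D) proportional to f(D|a) f(a)

# Posterior-mean reliability function R(t) = E[exp(-t/eta)]
t = np.linspace(0.0, 70.0, 8)
R = np.trapz(np.exp(-t[:, None] / eta[None, :]) * posterior[None, :], eta, axis=1)
print(np.round(R, 3))
```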
4 USE OF THE PROPORTIONAL HAZARDS MODEL IN THE UNCERTAINTY ANALYSIS OF THE RELIABILITY FUNCTION
Through condition monitoring it may be possible to obtain a better understanding of the health of the item and to choose the most appropriate maintenance action just before catastrophic failure development. The most common method is visual inspection, but for expensive and critical items vibroacoustic signal analysis is used for condition monitoring. Taking the costs into consideration, it is clearly worthwhile to focus attention on the optimization of the condition monitoring procedure. In this part of the paper we will
present an approach for estimating the hazard function that combines the aging effects of equipment and condition monitoring data using a Proportional Hazards Model (PHM). There are various forms that the PHM can take to combine a baseline hazard function with a component that takes into account covariates used to improve the prediction of the level of reliability. We start with the form known as the Weibull PHM [7]:
λ(t, z(t)) = (β/η)·(t/η)^(β−1)·exp( Σ_{i=1}^{m} γ_i·z_i(t) )    (16)
where: λ(t, z(t)) – the instantaneous conditional probability of failure at time t given the values of z₁(t) ... z_m(t), which represent the data of the monitored variables (covariates). The first part of the equation presents the impact of the aging processes of the equipment at the time of inspection; the second part takes into account the covariates and their associated weights.
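Equation (16) can be transcribed directly into code. In the sketch below the shape, scale and covariate weights are placeholder values, since the fitted coefficients are not reported in the paper.

```python
import numpy as np

def weibull_phm_hazard(t, z, beta, eta, gamma):
    """Weibull proportional hazards model, equation (16).

    t     : time since last renewal
    z     : covariate values z_1(t)...z_m(t), e.g. vibration symptom levels
    beta  : shape parameter of the baseline Weibull hazard
    eta   : scale parameter of the baseline Weibull hazard
    gamma : covariate weights gamma_1...gamma_m
    """
    baseline = (beta / eta) * (t / eta) ** (beta - 1.0)
    return baseline * np.exp(np.dot(gamma, z))

# Placeholder parameter values, for illustration only.
print(weibull_phm_hazard(t=2000.0, z=[0.71], beta=1.8, eta=9000.0, gamma=[1.2]))
```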
Figure 5. Presentation of the result of PHM with covariates dependent on time (curves λ_PHM(t, z(t)), λ(t) and λ₀(t) versus t)

One can observe that such a shape follows from the assumptions which must be fulfilled when the Proportional Hazards Model is adopted:
1. The ratio of the intensity of damage for two different values of a systemic variable is independent of time;
2. The intensities of defects for various values of the systemic variable are described by this distribution.
Based on the above assumptions we can write the following equation:
λ(t, z, β) = λ₀(t)·r(z, β)    (17)
where:
t – time,
z – systemic variable,
β – unknown parameter accounting for the influence of the systemic variable,
λ₀(t) – intensity of defects for the value of the systemic variable adopted as the reference level.
If we assume, according to the Cox model (1997), that:
r(z, β) = e^(zβ)    (18)
we obtain:

λ(t, z, β) = λ₀(t)·e^(zβ)    (19)
The model that could be expressed by means of equation (17) is called the Proportional Hazards Model and we can generalize it to cover any number of systemic variables:
λ(t, z₁, ..., z_n, β₁, ..., β_n) = λ₀(t)·r(z₁, ..., z_n, β₁, ..., β_n)    (20)
after incorporating the Cox model we will get:
λ(t, z₁, ..., z_n, β₁, ..., β_n) = λ₀(t)·e^(z₁β₁ + ... + z_nβ_n)    (21)
The exponential form of the function r(z, β) guarantees that the intensity function assumes non-negative values irrespective of the value of the coefficients. The distribution used most frequently when analyzing reliability is the exponential distribution, for which the intensity of defects is constant; thus the ratio of intensities for two groups of data related to defined defects will also be constant, which meets the requirements of the Proportional Hazards Model. Assuming the exponential distribution, however, we rule out the possibility of accounting for the influence of time. For that reason we should consider the possibility of using the Weibull distribution, and thus let us analyze the conditions that must be fulfilled by the distribution's parameters to meet the requirements of a Proportional Hazards Model. The intensity of defects for the Weibull distribution has the following form:
λ = (α/η^α)·t^(α−1)    (22)
while the relation of intensity is:
λ₁/λ₂ = [(α₁/η₁^α₁)·t^(α₁−1)] / [(α₂/η₂^α₂)·t^(α₂−1)] = (α₁·η₂^α₂)/(α₂·η₁^α₁)·t^(α₁−α₂)    (23)
The above relationship shows that the assumptions of the Proportional Hazards Model will be fulfilled for the Weibull model only if the shape parameter stays constant for both groups of data.

Figure 5. Presentation of the data in the probabilistic grid of the Weibull distribution for various vibration speeds (0.71 mm/s and 0.28 mm/s; fitted parameters β₁ = 1.0000, η₁ = 1.0077·10⁴ and β₂ = 1.0000, η₂ = 3937.2)
Figure 6. Confidence fields of the Weibull distribution for various vibration speeds (contours of shape parameter versus scale parameter at the 85%, 90% and 95% confidence levels)

Thus, owing to the use of the covariates, equation (16) allows tracking of the changes of the distribution parameters while retaining the assumptions mentioned before. Let us notice that this is in accordance with Bayesian updating, where uncertainty of the parameters is assumed while the form of the distribution is kept.
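The consequence of equation (23), namely that the PHM assumption holds for Weibull data only when the shape parameter is common to both groups, is easy to check numerically. The sketch below uses the parameters fitted in Figure 5 (β = 1.0 for both groups, η₁ ≈ 1.0077·10⁴, η₂ ≈ 3937.2) and, for contrast, a hypothetical case with unequal shapes.

```python
import numpy as np

def weibull_hazard(t, alpha, eta):
    """Weibull intensity of defects, equation (22)."""
    return (alpha / eta**alpha) * t**(alpha - 1.0)

t = np.array([500.0, 2000.0, 8000.0])
# Parameters fitted in Figure 5: equal shapes, so the ratio is constant over time.
print(weibull_hazard(t, 1.0, 1.0077e4) / weibull_hazard(t, 1.0, 3937.2184))
# Hypothetical unequal shapes: the ratio now varies with t, violating the PHM assumption.
print(weibull_hazard(t, 1.3, 1.0077e4) / weibull_hazard(t, 1.0, 3937.2184))
```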
5
CONCLUSIONS
While summing up the methodology of a pro-active system of operations based on risk evaluation, attention should be drawn to the necessity of tackling the problem of taking into consideration events, incidents and defects in the system in a way enabling inclusion of the results in the decision-making processes
6
REFERENCES
1
Radkowski S, (2008) Vibroacoustic Monitoring Of Mechanical Systems For Proactive Maintenance, Diagnostyka, 3 (47), 157–164.
2
Muller A, Suhner MC & Iung B, (2008) Formalisation of a new prognosis model for supporting proactive maintenance implementation on industrial system, Reliability Engineering & System Safety, 93, 234÷253.
3
Radkowski S, (2008) Non-linearity and intermodulation phenomena tracking as a method for detecting early stages of gear failures, Insight, 50 (8), 419÷422;
4
Najar, J, (1991) Continuous damage, in: Continuum damage mechanics theory and applications, Editors: Krajcinovic D., Lemaitre J. International Centre for Mechanical Sciences Courses and Lectures, 295, 234÷293;
5
Gumiński R & Radkowski S, (2006) Diagnostics and uncertainty decreasing of technical risk evaluation, Materiały Szkoły Niezawodności, Szczyrk, 133÷140 (in Polish).
6
Jeffreys HJ, (1961) Theory of Probability, Oxford Clarendon Press;
7
Jardine AKS & Tsang AHC, (2005) Maintenance, Replacement, and Reliability: Theory and Applications, New York, CRC Taylor & Francis
Acknowledgments Scientific research project financed from the scientific research for years 2008-2011.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE CONSTRUCTION AND APPLICATION OF REMOTE MONITORING AND DIAGNOSIS PLATFORM FOR LARGE FLUE GAS TURBINE UNIT Chen Taoa Xu Xiao-lia,b Wang Shao-honga,b Deng San-penga,c a
b
Beijing Institute of Technology, Beijing, 100081,China
Beijing Information Science & Technology University, Beijing, 100192,China c
Tianjin University of Technology and Education, Tianjin, 300222, China
The large flue gas turbine unit is the key equipment in the Catalytic Cracking unit of an oil-refining plant and it plays an important role in energy saving. As it operates in variable conditions and a high-temperature harsh environment, the fault rate of the unit is relatively high. Once a fault happens, enormous economic loss will be caused, so condition monitoring and diagnosis are very important. Remote monitoring and diagnosis technology is a new fault diagnosis mode combining computer technology, communication technology and fault diagnosis technology. Taking the large flue gas turbine unit as the research object, this paper introduces different modes of condition monitoring and diagnosis systems, then elaborates the overall structure design of the remote monitoring and diagnosis platform constructed, and analyses the application of the platform for the unit in detail. The platform can take full advantage of technical support and data sharing to perform remote monitoring, fault diagnosis and prediction effectively, greatly improve the success rate of fault diagnosis for the unit, and provide a technical means to achieve predictive maintenance for large units. Key Words: remote monitoring, fault diagnosis, trend prediction, predictive maintenance, large flue gas turbine unit
1 INTRODUCTION
The flue gas turbine is a typical rotary mechanical system playing an important role in energy saving in a petroleum refinery. It withstands mechanical, electrical and thermodynamic changes, and its operating conditions vary with production requirements [1,2]. Accidental shutdown will result in great economic loss. Therefore, safe, stable, long-period operation of the unit is the greatest target for enterprise managers. It is of vital importance to perform condition monitoring and fault diagnosis, and to implement predictive maintenance on the unit. Condition monitoring and fault diagnosis now mainly work in the following modes: 1) offline mode; 2) single-equipment online mode; 3) centralized online mode. In offline mode, operating information of the unit is monitored by various sensors and transferred to a computer through a data acquisition device, and then analysis and diagnosis are carried out. This mode is economic and convenient but only applicable for regular detection. In single-equipment online mode, a set of condition monitoring and fault analysis systems is installed for each piece of equipment. This mode enjoys advantages such as good real-time performance and high reliability, but it is not economic, and it is hard to share information among different monitoring and diagnosis systems. Although poor economic performance and the information-sharing difficulty are overcome, the centralized online mode is limited by territorial restrictions and fails to carry out remote diagnosis. With the development of the Internet, remote monitoring and fault diagnosis technology based on industrial Ethernet has become a hot research topic. The openness of remote monitoring and diagnosis technology can break through the original frame of traditional monitoring and fault diagnosis [3]. The combination of fault diagnosis technology with network technology can make full use of more technology support and data resource sharing, and thus improve the success rate of equipment diagnosis. Taking the flue gas turbine as the research object, a remote monitoring and diagnosis platform is constructed to perform monitoring and diagnosis analysis.
2
OVERALL DESIGN OF REMOTE MONITORING AND DIAGNOSIS PLATFORM CONSTRUCTED
The remote monitoring and diagnosis platform is an open, multi-layer and distributed system [4]; the structure of the platform constructed is shown in Figure 1. With this platform the remote diagnosis, service and management for the unit are truly realized.
Figure 1. Structure of Remote Monitoring and Diagnosis platform
As shown in Figure 1 remote monitoring and diagnosis platform is divided into 4 levels, namely, onsite level, enterprise level, remote diagnosis level and decision-making level, users of different levels can handle different services according to their demands. 1) Site level is the basis and core of the remote monitoring and diagnosis platform system, mainly for data acquisition. In this level, data acquisition module mainly takes charge of online real-time acquisition of various operating parameters. It directly faces individual equipment in the field, and implements maximum condition monitoring tasks, not only monitoring vibration data, but also routine process variable data such as pressure etc. Different kinds of data monitored are stored into operating parameter database of this level. 2) Enterprise level connects client computer to server and coordinates remote requests among client computers, application servers and database servers. Data transmission is an important part of the platform. Through this level, data are transferred from operating parameter database in site level to remote diagnosis level by means of B/S mode. B/S mode is an advanced network distributed data management mode, consisting of browser, web server and data management server. The advantages of B/S mode are easy to operate and upgrade with great reduction in maintenance costs. With data transmission technology based on B/S, a great many of fault samples are obtained easily from monitoring subsystem via internet. Then diagnosis is performed rapidly, thus improving diagnosis capacity. 3) Remote diagnosis level consists of remote expert teams, individual remote expert and data warehouse, etc. Diagnosis expert situated in different places are employed for difficult fault diagnosis, which enterprise level experts fail to diagnose. In this level various fault diagnosis and prediction methods are implemented on basis of maximum comprehensive
utilization of information from the site level. Characteristic parameters are extracted from related database, then diagnosis and prediction module are constructed by applying suitable methods and strategies with scientific analysis graphic to perform comprehensive analysis according to proper steps, to determine the nature, position, extent, type and cause so as to provide basis for fault prediction, control and maintenance [5]. 4) Decision-making level is actually an output module which provides reference data for EAM (enterprise apparatus management) to conduct maintenance etc. to enable enterprise realize system integration.
3 APPLICATION OF THE PLATFORM FOR FLUE GAS TURBINE UNIT
The following is the application of the remote monitoring and diagnosis platform for the large flue gas turbine unit.
3.1
Condition monitoring
The flue gas turbine unit uses a modularized and net-based multi-channel monitoring system to perform field monitoring. Meanwhile it performs online long-term monitoring for some important equipment, and the monitoring channels are configured flexibly depending on actual conditions [6]. The monitoring system simultaneously processes up to 4 channels of key-phase signals, 24 channels of vibration signals and 12 channels of static analog signals, as well as up to 256 process variable data. Signals such as the key-phase/speed signal, shaft and bearing vibrations, axial displacement, differential expansion and eccentricity come directly from the buffer output of BENTLY 3300/3500, ENTEK and PHILIP monitoring systems, or from various sensors via a special port. Figure 2 shows an overview of the turbine with the monitored data.
Figure 2. Overview of flue gas turbine unit
The acquired operating data of the unit include vibration waveform, frequency spectrum, axial displacement, differential expansion, eccentricity and fault frequency characteristic value as well as process parameters of the unit (speed, load, power, temperature, pressure and flow, etc.). Moreover, operating data of auxiliary machines, analysis data of related base departments input by man-machine interaction, and fault symptom information felt by equipment users, and signal descriptive information of monitoring parameters are input to construct operating parameter database of flue gas turbine and perform uniform data management.
3.2
Data transmission
With large-scale, high-fidelity data compression technology, one set of data per second is transmitted from the operating parameter database by the B/S mode. According to the requirements of diagnosis and prediction, the B/S mode performs immediate distribution and sending of the data. Its purpose is to reduce the burden of network transmission to the utmost extent, and achieve effective
utilization of network resources. Figure 3 shows the monitored vibration parameters transferred. Combined with the real-time monitoring data and history data from the operating parameter database, the general condition of the flue gas turbine is obtained, which is helpful for carrying out further fault diagnosis and prediction analysis.
Figure 3. Vibration parameters transferred
3.3
Fault diagnosis and prediction
Fault diagnosis and prediction are implemented based on deep study of various fault mechanisms. In rotary machinery, the fault information of most rotor systems can be reflected by the vibration signal, as it contains rich fault information and reflects the running condition more directly, rapidly and accurately than other parameters [7]. Vibration signal analysis is used to perform fault diagnosis and prediction for the turbine unit. Moreover, the vibration intensity is calculated as a characteristic parameter. Various kinds of faults occur in the flue gas turbine, and different faults correspond to different characteristic information in the signals. This is the objective basis for identifying the equipment's condition or diagnosing faults. Faults such as rotor unbalance, rotor misalignment, rotor impact-rub, oil whirl, shaft bending, bearing loosening and poor lubrication of the unit can be diagnosed. Figure 4 shows the waveform and frequency spectrum used to diagnose rotor unbalance, where the waveform is similar to a sine wave and the power-frequency component is the highest in the frequency domain.
Figure 4. Rotor unbalance waveform and frequency spectrum
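Vibration intensity (the RMS of the vibration velocity) and the dominance of the rotating-frequency (1X) spectral line are the kind of characteristic parameters referred to above. The platform's own code is not given in the paper; the sketch below, with a synthetic signal standing in for a monitored channel and an assumed sampling rate and shaft speed, shows how such indicators can be computed.

```python
import numpy as np

fs, rpm = 2048.0, 5700.0                       # sampling rate and shaft speed, assumed values
f_rot = rpm / 60.0                             # rotating (power) frequency, about 95 Hz
t = np.arange(0, 2.0, 1.0 / fs)
v = 4.0 * np.sin(2 * np.pi * f_rot * t) + 0.5 * np.random.randn(t.size)  # velocity [mm/s]

intensity = np.sqrt(np.mean(v ** 2))           # vibration intensity = RMS velocity

spec = np.abs(np.fft.rfft(v)) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
dominant = freqs[np.argmax(spec[1:]) + 1]      # strongest spectral line (DC excluded)

# A spectrum dominated by the 1X line is the classic signature of rotor unbalance.
print(round(intensity, 2), round(dominant, 1), abs(dominant - f_rot) < 1.0)
```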
According to actually monitored data, suitable prediction methods are selected to make fault diagnosis and prediction. Vibration characteristic parameters are extracted from monitored data, and these characteristic parameters are tracked to make prediction, so as to provide analysis results to industrial field for online actual verification and improvement of the used prediction parameters and methods [8,9]. This way is in accordance with actual condition, and helps to obtain ideal results. Figure 5 shows vibration intensity prediction of the flue gas turbine using two different prediction methods with data monitored from site level.
Figure 5. Vibration intensity prediction by Lyapunov and BP methods
4
CONCLUSIONS
Remote monitoring and diagnosis platform is a new mode combining computer and communication technologies with fault diagnosis technology. The platform makes use of remote network, integrates distributed resources, and breaks through disadvantage of offline information decentralization. Making large flue gas turbine unit as research object, research on fault diagnosis and prediction technology are performed on basis of comprehensive utilization of monitored condition information. The platform can perform effective remote fault diagnosis and trend prediction for operating unit, which are of vital importance to ensure safe operation of the unit, save maintenance costs, and improve utilization rate and management level. Moreover, the platform can provide technical means for implementation of predictive maintenance on the unit.
5
REFERENCES
1
WANG Xiao-sheng, QU Liang-sheng, ZHAO Bo, LI Yi-pu.(1997) Research on the impact of bearing stability on the vibration from overhung flue gas turbine.Chemical Engineering & Machinery, 24(5): 293-297
2
FEI Guo-qin. (2003) The factors that impact long period, safe operation of flue gas turbine and its analysis. Petrochemical equipment technology, 24 (5): 24-28
3
XIE Zhi-jiang, GUO Yu-jing.(2006) Research on the distributed remote fault diagnosis platform. Modern Manufacturing Engineering, (7):103-105
4
CHEN Xiao-ming, WU jia-ming, KONG Qing-fu, YU Guang-fu, YANG Yong-hong. (2006) Design on building the modern monitoring and diagnosis platform for warship power plant. Ship & Ocean Engineering, (3): 90-93
5
G.K. Singh and Sad Ahmed Saleh.AI Kazzaz.(2003)Induction machine drive condition monitoring and diagnostic researcha survey. Electric Power Systems Research, 64(2): 145-158
6
Min-Chun Pan, Po-Ching Li, Yong-Ren Cheng. (2008) Remote online machine condition monitoring system. Measurement, (41):912-921
7
Zhong Binglin, Huangren. (2007) Mechanical Fault Diagnostic Theory (3rd Edition), Beijing: China Machine Press
8
Eduardo Gilabert, Aitor Arnaiz. (2006) Intelligent automation systems for predictive maintenance: A case study. Robotics and Computer-Integrated Manufacturing, (22):543-549
9
Jay Lee, Jun Ni, Dragan Djurdjanovic, Hai Qiu, Haitao Liao.(2006) Intelligent prognostics tools and e-maintenance. Computers in Industry, (57):476-489
Acknowledgments The research has been supported by Scientific Research Key Program (KZ200910772001) of Beijing Municipal Commission of Education, Funding Project (PHR20090518) for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
COMPARISON OF VIBRATION ANALYSIS WITH DIFFERENT MODELING METHODS OF A ROTATING SHAFT SYSTEM Dong Sik Gu a, Byeong Su Kim a, Jang Ik Lim a, Yong Chae Bae b, Wook Ryun Lee b, Hee Su Kim b and Byeong Keun Choi c a
b
c
Department of precision and mechanical engineering, Graduate School of Gyeongsang National University, 445, Inpyeong-dong, Tongyeong City, GyeongNam-do, 650-160, South Korea
Hydro & Fossil Power Generation laboratory, Korea Electric Power Research Institute, 65, Munji-Ri, Yuseong-Gu, Daejeon, 305-380, South Korea
Department of precision and mechanical engineering, Institute of Marine Industry, Gyeongsang National University, 445, Inpyeong-dong, Tongyeong City, GyeongNam-do, 650-160, South Korea
Numerical analysis is the basic step of machine design, and its results are fundamental data for the asset management of a machine/plant. Recently, many kinds of software based on different methods, like the Finite Element Method (FEM) and the Transfer Matrix Method (TMM), were developed for machine design, especially for simulating rotating components such as gas/steam/wind turbine shafts, high-speed shafts, the shafts of large vessels and shafts with blades, flywheels and gears. In order to reduce the error, the number of elements always needs to be increased, sometimes to over 100,000, but this affects the input matrix size and the calculation time. In this paper, a 2D modelling method is proposed to reduce the element count and solving time. The paper compares the simulation results from MSC Patran/Nastran and ANSYS Workbench using a 3D model with the results obtained by solving a MATLAB code based on FEM using a 2D model. Key Words: Rotating Shaft System, Numerical Simulation, Finite Element Method (FEM), Add Mass, Modelling Method
1 INTRODUCTION
To perform asset management, many kinds of data, like operating conditions, design data, vibration trends, temperatures, maintenance history, etc., are acquired. The data consist of given data, i.e. information included in the manual books, and taken data, which is extracted from transducers such as accelerometers and thermometers after the machine has been installed. The second is very useful when the machine has been operated for a long time, but if a fault happens early in the machine's life it does not help maintenance. So, the given data is what supports repair of newly installed machines. The given data is written in the manual books: specifications, how to use the machine, machine or software installation methods, shop operating data and some tips for simple defects. The numerical simulation result is also included in the manual and functions as a basic database for checking the machine. The numerical simulation is performed using solving software based on the Finite Element Method (FEM) or the Transfer Matrix Method (TMM). Nowadays, the FEM is used for the simulation of statics, dynamics, thermodynamics or hydrokinetics. To carry out the numerical simulation, the simulation conditions, like the operating environment parameters, external forces, temperatures and so on, have to be decided for the machine. Also, model design for the simulation is the most important step in numerical analysis, because the solving time and the error rate of the result depend on the element size. In common software such as ANSYS Workbench [1] and MSC Patran/Nastran [2], the number of elements depends on the model, so software users prefer increasing the number of elements.
In addition, the rotating shaft system is a simple machine system, but it is not easy to obtain an exact result from its dynamic analysis because the result changes with the modelling method. Especially if the rotor system has added masses heavier than the shaft, such as the rotor core of a motor or the example in Figure 1, the simulation result depends on how the position and/or number of the added masses are modelled. A variation of the stiffness diameter of the added mass must also be considered in the model.
Figure 1. Example shaft including an added mass [3]

So, in this paper, a 2D modelling method is proposed that represents the same rotor system as the 3D models used in the numerical analysis software. It was put into a solving code based on FEM developed in MATLAB [4]. The 3D models were designed in CATIA_v15 and then exported/imported into ANSYS Workbench and MSC Patran/Nastran. In the following chapters, three cases of a rotor with different stiffness diameters of the Hub are presented, and this study compares the three sets of results: from the 2D model using MATLAB and from the two 3D models using ANSYS Workbench and MSC Patran/Nastran. The stiffness diameter of the Hub was considered in each case because the stiffness of the Hub affects the shaft stiffness through the connection method between the shaft and the added mass.
2
FINITE ELEMENT METHOD
2.1 Definition The FEM (sometimes referred to as finite element analysis) is a numerical technique for finding approximate solutions of partial differential equations (PDE) as well as of integral equations. The solution approach is based either on eliminating the differential equation completely (steady state problems), or rendering the PDE into an approximating system of ordinary differential equations, which are then numerically integrated using standard techniques such as Euler's method, Runge-Kutta, etc. In solving partial differential equations, the primary challenge is to create an equation that approximates the equation to be studied, but is numerically stable, meaning that errors in the input data and intermediate calculations do not accumulate and cause the resulting output to be meaningless. There are many ways of doing this, all with advantages and disadvantages. The Finite Element Method is a good choice for solving partial differential equations over complex domains (like cars and oil pipelines), when the domain changes (as during a solid state reaction with a moving boundary), when the desired precision varies over the entire domain, or when the solution lacks smoothness. For instance, in a frontal crash simulation it is possible to increase prediction accuracy in "important" areas like the front of the car and reduce it in its rear (thus reducing cost of the simulation); Another example would be the simulation of the weather pattern on Earth, where it is more important to have accurate predictions over land than over the wide-open sea.
2.2 Application A variety of specializations under the umbrella of the mechanical engineering discipline (such as aeronautical, biomechanical, and automotive industries) commonly use integrated FEM in design and development of their products. Several
modern FEM packages include specific components such as thermal, electromagnetic, fluid, and structural working environments. In a structural simulation, FEM helps tremendously in producing stiffness and strength visualizations and also in minimizing weight, materials, and costs. FEM allows detailed visualization of where structures bend or twist, and indicates the distribution of stresses and displacements. FEM software provides a wide range of simulation options for controlling the complexity of both modelling and analysis of a system. Similarly, the desired level of accuracy required and associated computational time requirements can be managed simultaneously to address most engineering applications. FEM allows entire designs to be constructed, refined, and optimized before the design is manufactured. This powerful design tool has significantly improved both the standard of engineering designs and the methodology of the design process in many industrial applications.[5] The introduction of FEM has substantially decreased the time to take products from concept to the production line.[5] It is primarily through improved initial prototype designs using FEM that testing and development have been accelerated.[6] In summary, benefits of FEM include increased accuracy, enhanced design and better insight into critical design parameters, virtual prototyping, fewer hardware prototypes, a faster and less expensive design cycle, increased productivity, and increased revenue.[5]
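As a concrete illustration of the method described in this section, the sketch below assembles and solves a one-dimensional finite element model of an axially loaded bar using two-node elements. The material follows the SUS304 modulus of Table 1, while the cross-section, length and load are arbitrary assumptions; the result is compared with the analytical solution.

```python
import numpy as np

E, A, L, P = 193e9, 1e-4, 1.0, 1000.0   # Young's modulus [Pa], area [m^2], length [m], tip load [N]
n_el = 10
le = L / n_el
n_nodes = n_el + 1

K = np.zeros((n_nodes, n_nodes))
ke = (E * A / le) * np.array([[1.0, -1.0], [-1.0, 1.0]])   # two-node bar element stiffness
for e in range(n_el):                                       # assembly of the global stiffness matrix
    K[e:e + 2, e:e + 2] += ke

F = np.zeros(n_nodes)
F[-1] = P                                   # axial load at the free end
u = np.zeros(n_nodes)
u[1:] = np.linalg.solve(K[1:, 1:], F[1:])   # node 0 clamped (boundary condition)

print(u[-1], P * L / (E * A))               # FE tip displacement vs. analytical solution
```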
3
MODELLING
3.1 Basic structure of models and materials The models for this research were divided into three types, and their basic structure consists of a Shaft, a Hub and an added mass, as shown in Figure 2. It was assumed that the shaft and Hub were assembled by an expansion fit, that the Hub and added mass were bonded, and that the mass centres of the three components coincided. The material properties are given in Table 1: SUS304 (stainless steel) was applied to the shaft, and Al7075 (aluminium) was used for the Hub and the added mass.
Figure 2. The basic components of models
Table 1. Material properties of the models

Material | Part             | Modulus of elasticity [GPa] | Shear modulus of elasticity [GPa] | Poisson's ratio | Density [kg/m3]
SUS304   | Shaft            | 193                         | 86                                | 0.28            | 8,000
Al7075   | Hub & added mass | 72                          | 27                                | 0.33            | 2,800
3.2 Modelling and analysis process In this chapter, the three models with different Hub shapes are explained. The research followed the process shown in Figure 3: a) 3D modelling, carried out in the 3D modelling software CATIA V5; b) import for solving, in which the 3D model was saved as a *.stp file for ANSYS Workbench and as a *.igs file for MSC Patran/Nastran and then imported into each package; c) solving and obtaining the results; d) comparing the two sets of 3D results to confirm that they match; e) 2D modelling; f) inputting the 2D model into MATLAB, solving and obtaining a result; g) comparing the MATLAB result with the others. In this last step the stiffness diameter was modified by trial and error because, in the 2D model, the added mass was converted into two point masses, so the stiffness contribution of the added mass had been lost. Figure 3. Flow chart of the analysis and model modification. For example, the first 2D model (No. 1), shown in Figure 4, has a 'ㅗ'-type Hub, two bearings with a spring stiffness of 50,000 N/m, a shaft length of 1,000 mm, a shaft diameter of 100 mm, and an added mass with a length of 700 mm and a thickness of 60 mm. The stiffness diameter of the No. 1 model is indicated by the red arrows on the Hub in Figure 4 (left) and was determined by matching the results with those of ANSYS Workbench and MSC Patran/Nastran. Where no arrows are drawn on part of the Hub, that part contributes no stiffness to the shaft. The wings of the added mass were converted into two point masses, each carrying the axial and radial mass moments of inertia, connected to the two end points of the Hub. The 2D model has 86 elements, whereas the 3D models have 23,068 elements in MSC Patran/Nastran and 47,086 in ANSYS Workbench. Figure 4 (right) shows only the Shaft and Hub. The No. 2 and No. 3 models are shown in Figure 5 and Figure 6. They have the same shaft and added mass as the No. 1 model, but the Hub type is changed to a 'ㅛ' type, and the gap between the two connections of the shaft and the added mass is 90 mm in the No. 2 model and 250 mm in the No. 3 model. The stiffness diameters of the Hub are again indicated by red arrows in each figure. The number of elements of the 2D model is the same in all cases; for the 3D models it is 39,047 (No. 2) and 39,182 (No. 3) in ANSYS Workbench, and 32,102 (No. 2) and 24,542 (No. 3) in MSC Patran/Nastran. From these element counts, the relative solving effort of each model can be estimated, and the 2D modelling was easier than the 3D modelling. In the next chapter, the solving times and the differences in the final results are compared.
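As an illustration of the 2D approach described above, the following MATLAB sketch assembles a plain Euler-Bernoulli beam model of a uniform shaft on two bearing springs, adds two lumped point masses, and solves the generalized eigenvalue problem for the lateral natural frequencies. The element count, the point mass value and the node positions are placeholder assumptions rather than the data of the No. 1 to No. 3 models, and the Hub stiffness-diameter correction discussed above is omitted.

% Minimal 2D (beam) rotor model: uniform shaft, two bearing springs,
% two lumped point masses; lateral bending vibration only (sketch).
E   = 193e9;             % Young's modulus of SUS304 [Pa]
rho = 8000;              % density [kg/m^3]
L   = 1.0;  d = 0.1;     % shaft length and diameter [m]
A   = pi*d^2/4;  I = pi*d^4/64;
ne  = 20;  le = L/ne;    % number of elements (assumed) and element length
ndof = 2*(ne+1);         % 2 DOF per node: deflection and slope

ke = E*I/le^3 * [ 12     6*le   -12     6*le;
                   6*le  4*le^2  -6*le  2*le^2;
                 -12    -6*le    12    -6*le;
                   6*le  2*le^2  -6*le  4*le^2];
me = rho*A*le/420 * [156    22*le   54    -13*le;
                      22*le  4*le^2  13*le  -3*le^2;
                      54    13*le   156   -22*le;
                     -13*le -3*le^2 -22*le  4*le^2];

K = zeros(ndof);  M = zeros(ndof);
for e = 1:ne                                  % assemble global matrices
    idx = 2*e-1 : 2*e+2;
    K(idx,idx) = K(idx,idx) + ke;
    M(idx,idx) = M(idx,idx) + me;
end

kb = 5e4;                                     % bearing stiffness 50,000 N/m
K(1,1) = K(1,1) + kb;  K(ndof-1,ndof-1) = K(ndof-1,ndof-1) + kb;

mp = 10;                                      % lumped point mass [kg] (assumed)
for node = [round(ne/2), round(ne/2)+1]       % added mass as two point masses
    M(2*node-1,2*node-1) = M(2*node-1,2*node-1) + mp;
end

[~, D] = eig(K, M);                           % K*phi = w^2*M*phi
f = sort(sqrt(diag(D)))/(2*pi);
disp(f(1:4))                                  % first four natural frequencies [Hz]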
Figure 4. The No. 1 model in 2D (left) and 3D (right; the added mass is hidden)
Figure 5. The No. 2 model in 2D (left) and 3D (right; the added mass is hidden)
Figure 6. The No. 3 model in 2D (left) and 3D (right; the added mass is hidden)
4
ANALYSIS RESULT AND DISCUSSION
4.1 Mode shape In this paper, the modes of interest were the first to the fourth, so the mode shapes of the three solutions were compared for matching; they are presented in Figure 7. The mode shapes matched exactly among the three solutions in all cases. The first mode is a cylindrical mode, the second a conical mode, the third the 1st bending mode and the fourth the 2nd bending mode. The first and second modes are rigid-body modes because the bearing stiffness is of the journal bearing type, which is lower than that of rolling element bearings. In Figure 7, the MATLAB results show only the shaft mode shapes.
Figure 7. Comparison of mode shape
4.2 Analysis result and solving time Table 2 presents the solving results and times. The three packages produced similar natural frequencies. In the first and second (rigid) modes the results almost coincide; the maximum difference is 0.45 Hz, in the second mode of the No. 3 model. In the third and fourth modes, the results from MSC Patran/Nastran and ANSYS Workbench agree to within 6 Hz, whereas the differences between these and MATLAB are larger. The maximum difference between MATLAB and the others is 32 Hz, in the fourth mode of the No. 2 model, which is still below 10% of that natural frequency. Every solver carries some inaccuracy, because all the numerical analyses idealise the real machine as a linear system, so a difference below 10% is regarded as a permissible error. The solving times, however, differ greatly: ANSYS Workbench took the longest, followed by MSC Patran/Nastran, and MATLAB was the fastest. The MATLAB analysis time includes the calculation of the natural frequencies for the Campbell diagram and the critical speed map, whereas the other two do not; if a Campbell diagram or critical speed map were produced with the other packages, the calculation would take even longer than the times given in Table 2.
Table 2. Comparison of the results and solving time

Model | Mode | MSC Patran/Nastran (3D) | ANSYS Workbench (3D) | MATLAB code (2D)
No. 1 | 1    | 3.72 Hz                 | 3.68 Hz              | 3.68 Hz
No. 1 | 2    | 7.64 Hz                 | 7.55 Hz              | 7.4 Hz
No. 1 | 3    | 560 Hz                  | 557 Hz               | 542 Hz
No. 1 | 4    | 627 Hz                  | 624 Hz               | 631 Hz
No. 2 | 1    | 3.78 Hz                 | 3.7 Hz               | 3.7 Hz
No. 2 | 2    | 7.65 Hz                 | 7.56 Hz              | 7.4 Hz
No. 2 | 3    | 540 Hz                  | 537 Hz               | 522 Hz
No. 2 | 4    | 581 Hz                  | 575 Hz               | 607 Hz
No. 3 | 1    | 3.79 Hz                 | 3.75 Hz              | 3.75 Hz
No. 3 | 2    | 7.64 Hz                 | 7.55 Hz              | 7.19 Hz
No. 3 | 3    | 632 Hz                  | 628 Hz               | 617 Hz
No. 3 | 4    | 750 Hz                  | 747 Hz               | 774 Hz
Solving time |                           150 s |              2,891 s |              40 s
5
CONCLUSION
The numerical simulation result is a datum recorded in the manual and needed for maintenance when faults appear early in a machine. To obtain a good simulation result, the number of elements, which depends on the modelling method, has to be increased; but this affects the solving time and the error of the result, because the size of the input matrices grows with it. In this paper, therefore, a 2D modelling method was proposed and compared with the common 3D modelling approach. The 2D model had 86 elements, while the 3D models had more than 20,000. The results show that the number of elements strongly affects the solving time and the result. The result obtained with the 2D model was within the permissible error of 10% when compared with the results of the other packages, and the analysis times differed greatly: the 2D model took 40 s including the calculation of the Campbell diagram and the critical speed map, whereas the others took 150 s and 2,891 s for the normal mode analysis alone. Therefore, the 2D model proposed in this paper works well for the numerical analysis; its advantages are that (1) its results are similar to those of the 3D models and (2) answers are obtained sooner than with a 3D model. To use the 2D model, however, a confirmatory method for determining the stiffness diameter still has to be developed.
6
REFERENCES
1
Kent Lawrence. (2007) ANSYS Tutorial Release 11. Kansas, SDC: Schroff Development Corporation Publications.
2
Louis Komzsik. (1998) MSC/NASTRAN Numerical methods User’s Guide. Santa Ana, MSC Software Corporations Publications.
3
Thelen R.F., Gattozzi A., Wardell D., Williams A. (2007) A 2-MW Motor and ARCP Drive for High-Speed Flywheel. Applied Power Electronics Conference (APEC2007) Twenty Second Annual IEEE, 1690-1694.
4
Schilling R.J., Harris S.L. (2000) Applied Numerical Methods for Engineers: Using MATLAB and C. Stamford, Thomson Corporation Publications.
5
Hastings J.K., Juds M.A., Brauer J.R. (1985) Accuracy and Economy of Finite Element Magnetic Analysis. 33rd Annual National Relay Conference.
6
McLaren-Mercedes (2006) Vodafone McLaren-Mercedes: Feature - Stress to impress. http://www.mclaren.com/features/technical/stress_to_impress.php. Retrieved on 2006-10-03.
Acknowledgments This research was supported by The Eco-Friendly Heat and Cold Energy Mechanical Research Team of the Second-Phase of BK (Brain Korea 21) Project and Hydro & Fossil Power Generation Laboratory of Korea Electric Power Research Institute (KEPRI).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ROLLING ELEMENT BEARING FAULT DETECTION USING ACOUSTIC EMISSION SIGNAL ANALYZED BY ENVELOPE ANALYSIS WITH DISCRETE WAVELET TRANSFORM Byeong Su Kim a, Dong Sik Gu a, Jae Gu Kim a, Young Chan Kim b and Byeong Keun Choi a a
Gyeongsang National University, 445 Inpyeong-dong, Tongyoung City, Gyeongnam-do, 650-160, Korea, b
Doosan heavy industries, 555 Guigok-dong, Changwon City, Gyeongnam-do, 641-792 ,Korea
Acoustic Emission (AE), a non-destructive testing technique, is now widely used for the early detection of faults in rotating machines, because its sensitivity is higher than that of normal accelerometers and it can detect low-energy vibration signals. Faults in rotating machines generally occur at the bearings and/or gearboxes, which are among the principal parts of the machines. To detect bearing faults, envelope analysis has been studied and presented for several decades, and this research has shown that AE can be applied in condition monitoring systems using envelope analysis for rolling element bearings. The peak ratio (PR) was developed to express the bearing condition in AE-based condition monitoring. The noise level must be reduced to obtain an accurate PR value, because PR is calculated from the total root mean square (RMS) and the harmonics of the defect frequencies. Therefore, in this paper, the discrete wavelet transform (DWT) is added to envelope analysis to reduce the noise level in AE signals. The PR is then calculated, and the result of general envelope analysis is compared with that of envelope analysis with DWT. Key Words: Acoustic Emission, Envelope Analysis, Discrete Wavelet Transform, Peak Ratio, Rolling Element Bearing 1
INTRODUCTION
Application of the high-frequency Acoustic Emission (AE) technique in condition monitoring of rotating machinery has been growing over recent years. This is particularly true for bearing defect diagnosis and seal rubbing. The main drawback of the AE technique is the attenuation of the signal, so the AE sensor has to be close to its source. However, it is often only practical to place the AE sensor on a non-rotating member of the machine, such as the bearing or gear casing, so the AE signal originating from the defective component suffers severe attenuation before reaching the sensor. Typical frequencies associated with AE activity range from 20 kHz to 1 MHz. The AE method is a high-frequency analysis technique which was initially developed as a non-destructive testing (NDT) tool to detect crack growth in materials and structures. AE has increasingly been used as a condition monitoring tool for engineering assets such as structures and industrial machines, with applications in a wide range of areas such as structural health monitoring, machine tool monitoring, tribological and wear process monitoring, gear defect monitoring and bearing monitoring. Therefore, in this study, envelope analysis combined with the wavelet transform, which has proved useful for gear fault detection, is applied to the detection of bearing defects in the signal processing.
2
EXPERIMENT AND METHOD
2.1
Experiment installation
In order to diagnose rolling element bearing faults with an AE transducer, the test rig shown in Figure 1 was installed. At the driving end, the shaft was attached to a reduction gearbox (10.1:1) through a coupling. A constant radial load can be applied close to the driven-end support for a prolonged period of loading and is measured by a load cell. An AE sensor with a frequency range of 25-530 kHz was attached to the top of the bearing housing using a magnetic holder, as shown in Figure 1. The rotating speeds were 20, 50, 80, 110 and 140 rpm through the gearbox, and loads of 500 N, 2 kN and 5 kN were applied on the bearing through the load cell. The bearing used in this study was a cylindrical roller bearing, SKF NF307, with separable inner and outer rings. The test bearing, shown in Figure 2, allows easy access to the raceway for seeding defects and for observing the surface condition. The seeded bearing defects used in this study were: inner-race crack (IFC), inner-race spall (IF), outer-race crack (OFC), outer-race spall (OF), ball fault 1 (BF1) and ball fault 2 (BF2). In total, six classes of faulty bearing and one normal condition were obtained for the fault diagnosis simulation.
Figure 1. Experiment installation
Figure 2. Seed defects on the bearing: (a) IFC (b) IF (c) OFC (d) OF (e) BF1 (f) BF2
2.2 Signal and defect frequency A bearing fault produces impacts as the bearing rotates, and these impacts generate vibration. A shock wave is emitted each time a rolling element passes over the defective part. The BPFO (Ball Pass Frequency of the Outer race) and BPFI (Ball Pass Frequency of the Inner race) are the bearing defect frequencies and are found from formula (1).
(1)
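Formula (1) is not reproduced here; as a point of reference, the standard ball-pass frequency expressions can be evaluated as in the following sketch, in which the number of rollers, the roller diameter, the pitch diameter and the contact angle are placeholder assumptions rather than the data of the SKF NF307 bearing.

% Standard ball-pass defect frequencies of a rolling element bearing (sketch).
% The geometry values below are placeholders, not the SKF NF307 data.
fr  = 140/60;      % shaft rotating frequency [Hz] (140 rpm case)
n   = 11;          % number of rolling elements (assumed)
d   = 0.012;       % rolling element diameter [m] (assumed)
D   = 0.055;       % pitch diameter [m] (assumed)
phi = 0;           % contact angle [rad] (assumed)

BPFO = n/2 * fr * (1 - (d/D)*cos(phi));   % outer race defect frequency [Hz]
BPFI = n/2 * fr * (1 + (d/D)*cos(phi));   % inner race defect frequency [Hz]
fprintf('BPFO = %.2f Hz, BPFI = %.2f Hz\n', BPFO, BPFI);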
2.3 Signal acquisition The data acquisition system was used to record the continuous AE signal. A laptop computer was connected to a PCI board to form the Micro-DSP system, which is capable of 18-bit, 10 MHz A/D conversion with on-board processing. A total of 15 waveforms were captured for each bearing condition for spectral averaging.
Table 1. Specification of the AE system
2-channel AE system on PCI board; AE sensor (wideband type); preamplifier with 20/40/60 dB gain
18-bit A/D conversion; 10 MSamples/s rate on one channel (5 MSamples/s on two AE channels)
Peak sensitivity V/(m/s) [V/µbar]: 55 [-62] dB
Resonant frequency V/(m/s) [V/µbar]: 125 [650] kHz
Wide dynamic range < 90 dB
Single power/signal BNC or optional separate power/signal BNC
To remove unnecessary signals and noise, signal pre-processing was carried out in MATLAB.
2.4 Assessment methods In this paper, the PR (Peak Ratio) value is the primary assessment method. The PR value represents the sum of the peak values at the defect frequency and its harmonics; it is obtained by dividing this sum by the average of the total spectrum value and expressing the ratio in dB, as given in (2).
(2)
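The exact expression (2) is not reproduced above; the following MATLAB sketch follows the verbal definition (sum of the spectral peaks at the defect frequency and its harmonics, divided by the average level of the spectrum, expressed in dB). The number of harmonics, the one-bin peak search window and the dB convention (20 log10) are assumptions.

% peak_ratio.m - Peak Ratio from an envelope spectrum (sketch).
% P       : one-sided envelope spectrum values
% f       : frequency vector [Hz] corresponding to P
% fDefect : bearing defect frequency [Hz] (e.g. BPFI or BPFO)
% nHarm   : number of harmonics to include (assumed, e.g. 4)
function pr = peak_ratio(P, f, fDefect, nHarm)
    peakSum = 0;
    for k = 1:nHarm
        [~, idx] = min(abs(f - k*fDefect));        % nearest bin to the k-th harmonic
        lo = max(idx-1, 1);  hi = min(idx+1, numel(P));
        peakSum = peakSum + max(P(lo:hi));         % local peak around the harmonic
    end
    pr = 20*log10(peakSum / mean(P));              % ratio in dB (convention assumed)
end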
3
SIGNAL PROCESSING
3.1 Envelope analysis Envelope analysis typically refers to the sequence of procedures shown in Figure 3: (1) band-pass filtering, (2) wave rectification, (3) Hilbert transform or low-pass filtering and (4) power spectrum. The purpose of the band-pass filtering is to reject the low-frequency, high-amplitude signals associated with mechanical vibration components and to eliminate random noise outside the pass-band. Theoretically, in envelope analysis, the best band-pass range is the one that includes the resonances of the bearing components. These frequencies can be found through impact tests or theoretical calculations involving the dimensions and material properties of the bearing. However, it is very difficult to predict or specify which resonant modes of neighbouring structures will be excited, and it would be costly and unrealistic in practice to find the resonant modes through experiments on rotating machinery, which may also alter under different operational conditions. In addition, it is difficult to estimate how these resonant modes are affected by the assembly of a complete bearing and its mounting in a specific housing, even if the resonant frequencies of the individual bearing elements can be tested or calculated theoretically. For this reason, commercial analysers and data loggers that offer envelope analysis generally provide user-selectable band-pass settings, such as 5-15 kHz, 15-25 kHz, 25-35 kHz, 35-55 kHz, 55-75 kHz and 75-100 kHz for acoustic emission signals.
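A minimal MATLAB sketch of the procedure described above is given below; the filter type and order and the sampling rate are assumptions, x stands for a recorded AE waveform, and the 75-100 kHz band is one of the settings mentioned in the text.

% Envelope analysis of an AE waveform x sampled at fs (sketch).
fs   = 2e6;                            % sampling rate [Hz] (assumed)
x    = randn(fs, 1);                   % placeholder for a recorded AE signal
band = [75e3 100e3];                   % band-pass range [Hz]

[b, a] = butter(4, band/(fs/2), 'bandpass');   % (1) band-pass filter (4th order assumed)
xb  = filtfilt(b, a, x);
xr  = abs(xb);                         % (2) wave rectification
env = abs(hilbert(xr));                % (3) Hilbert-transform envelope
N   = numel(env);
P   = abs(fft(env - mean(env))).^2/N;  % (4) power spectrum of the envelope
f   = (0:N-1)'*fs/N;
P   = P(1:floor(N/2));  f = f(1:floor(N/2));

The peak_ratio sketch given earlier can then be applied to P and f at the BPFI or BPFO of interest.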
3.2 Wavelet transform The wavelet transform is commonly used for detecting gear defects and for finding effective shock waves in a signal. In this study, the wavelet transform was performed at level 4 with a Daubechies wavelet in MATLAB; the procedure, carried out with the MATLAB wavelet toolbox, is shown in Figure 4.
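A sketch of the denoising step is given below. The specific Daubechies wavelet (db4) and the threshold rule are assumptions, since the text only states that a level-4 Daubechies decomposition was used in the MATLAB wavelet toolbox.

% Level-4 Daubechies wavelet denoising of the band-passed AE signal (sketch).
% Requires the Wavelet Toolbox; 'db4' and the 'sqtwolog' threshold rule are assumed.
xb = randn(2e6, 1);                              % placeholder band-passed AE signal
xd = wden(xb, 'sqtwolog', 's', 'mln', 4, 'db4'); % wavelet-threshold denoising, level 4

% The denoised signal xd is then passed through the same envelope procedure
% as above before the PR value is computed.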
Figure 3. Procedure of envelope analysis
Figure 4. Wavelet transform procedure
4
EXPERIMENT RESULT
Figure 5 and Figure 6 show the signal processing results for an inner race bearing fault at a rotating speed of 140 rpm, a load of 2 kN and a band-pass range of 75-100 kHz. The vertical dotted lines indicate the bearing defect frequency (BPFI). Figure 5 shows the result of envelope analysis alone, in which the defect frequency and its harmonics are not detected. Figure 6, in which the wavelet transform is added to the envelope analysis, shows the harmonics, the defect frequency and a better PR value than Figure 5. The same trend is seen in Figure 7 and Figure 8 for an outer race fault: the sidebands caused by noise and by the ball rotating frequency are removed after the wavelet transform, so the defect frequency (11.21 Hz) and its harmonics (2X, 3X, 4X) are detected well, and the PR values are 30.55 dB before the wavelet transform and 46.19 dB after it. Figure 9 and Figure 10 show the results for a ball fault at 140 rpm, 2 kN load and a band-pass range of 33-55 kHz. The result shows the same trend as the inner race and outer race faults: the defect frequency is detected well and the PR value increases after the wavelet transform.
Figure 5. Result of inner race fault bearing at 140 rpm and 2 kN load (envelope only)
Figure 6. Result of inner race fault bearing at 140 rpm and 2 kN load (envelope with wavelet)
Figure 7. Result of outer race fault bearing at 140 rpm and 2 kN load (envelope only)
Figure 8. Result of outer race fault bearing at 140 rpm and 2 kN load (envelope with wavelet)
Figure 9. Result of ball fault bearing at 140 rpm and 2 kN load (envelope only)
Figure 10. Result of ball fault bearing at 140 rpm and 2 kN load (envelope with wavelet)
Table 2. Comparison of the PR values [dB] of the ball fault bearing: envelope analysis only (Env) and envelope analysis with wavelet transform (Env+WT), for each band-pass filter (BPF) range

      |          | 25-35 kHz       | 35-55 kHz       | 55-75 kHz       | 75-100 kHz
rpm   | Load     | Env    | Env+WT | Env    | Env+WT | Env    | Env+WT | Env    | Env+WT
50    | 500 N    | 36.23  | 38.25  | 38.12  | 38.72  | 39.66  | 38.83  | 29.79  | 37.31
50    | 2 kN     | 28.86  | 32.56  | 32.26  | 38.09  | 32.12  | 32.83  | 26.16  | 30.06
50    | 5 kN     | 28.27  | 32.56  | 25.77  | 30.71  | 30.69  | 31.48  | 27.75  | 41.42
80    | 500 N    | 28.54  | 31.40  | 31.14  | 37.97  | 39.21  | 38.80  | 33.73  | 38.30
80    | 2 kN     | 29.18  | 31.76  | 28.44  | 30.02  | 29.81  | 30.29  | 25.65  | 29.78
80    | 5 kN     | 25.13  | 28.05  | 25.13  | 31.71  | 31.81  | 33.17  | 28.32  | 38.74
110   | 500 N    | 29.11  | 29.72  | 27.46  | 29.50  | 29.95  | 34.74  | 32.37  | 34.74
110   | 2 kN     | 28.54  | 28.76  | 28.32  | 35.48  | 35.49  | 35.33  | 36.01  | 37.22
110   | 5 kN     | 30.84  | 30.60  | 29.96  | 35.67  | 36.45  | 38.60  | 37.18  | 38.23
140   | 500 N    | 28.58  | 30.39  | 27.50  | 30.57  | 31.37  | 33.91  | 31.28  | 34.82
140   | 2 kN     | 31.02  | 32.25  | 30.84  | 38.47  | 39.01  | 38.31  | 40.56  | 41.55
140   | 5 kN     | 31.65  | 31.79  | 32.08  | 38.11  | 39.28  | 39.67  | 32.22  | 39.59
Table 2 compares the PR values of the ball fault bearing obtained with the envelope process alone and with the envelope process plus the wavelet transform.
5
CONCLUSION
In this study, the wavelet transform, which is useful for gear fault detection, was applied to bearing fault detection in the signal processing. In the experimental results for the inner race fault, the defect frequency was difficult to find with envelope analysis alone, but it was easy to find after the wavelet transform was added. In all cases the wavelet transform improved the detection of the defect frequency and raised the PR value. Therefore, envelope analysis with the wavelet transform is a useful method for the early detection of defects in signal processing.
6
REFERENCES
1 Jardine, A. K. S., Lin, D. and Banjevic, D. (2006) A review on machinery diagnostics and prognostics implementing condition based maintenance. Mechanical Systems and Signal Processing.
2 Kim, Y. H., Tan, A. C. C., Mathew, J. and Yang, B. S. (2007) Experimental Study on Incipient Fault Detection of Low Speed Rolling Element Bearings: Time Domain Statistical Parameters. 12th Asia-Pacific Vibration Conference (APVC2007), August 6-9, Hokkaido University, Sapporo, Japan.
3 Shiroishi, J., Li, Y., Lian, S., Danyluk, S. and Kurfess, T. (1999) Vibration Analysis for Bearing Outer Race Condition Diagnostics. Journal of the Brazilian Society of Mechanical Sciences, Rio de Janeiro.
4 McInerny, S. A. and Dai, Y. (2003) Basic Vibration Signal Processing for Bearing Fault Detection. IEEE.
5 Kim, Y. H., Tan, A. C. C., Mathew, J., Vladis, K. and Yang, B. S. (2007) A Comparative Study on the Application of Acoustic Emission Technique and Acceleration Measurements for Low Speed Condition Monitoring. Proceedings of the 12th Asia-Pacific Vibration Conference (APVC2007).
6 Signal Processing Toolbox User's Guide, The MathWorks, Inc., 1999.
7 Randall, R. B. (1987) Frequency Analysis. Bruel & Kjaer.
8 Kim, H. J., Gu, D. S., Jung, H. E., Andy Tan, Kim, E. and Choi, B. K. (2007) The comparison of AE and acceleration transducer for the early detection on the low-speed bearing. Transactions of the Korean Society for Noise and Vibration Engineering Annual Spring Conference, pp. 324-328.
9 Jung, H. E., Gu, D. S., Kim, H. J., Andy Tan, Kim, Y. H. and Choi, B. K. (2007) The application of AE transducer for the bearing condition monitoring of low-speed machine. Transactions of the Korean Society for Noise and Vibration Engineering Annual Spring Conference, pp. 319-323.
10 Achmad Widodo, Eric Y. Kim, Son, J. D., Yang, B. S., Andy Tan, Gu, D. S., Choi, B. K. and Joseph Mathew (2009) Fault diagnosis of low speed bearing based on relevance vector machine and support vector machine. Expert Systems with Applications, Volume 36, Issue 3, Part 2, April 2009, Pages 7252-7261.
Acknowledgments This study was supported by the Second-Phase of the BK21 Project (Eco-Friendly Heat and Cold Energy Mechanical Research Team).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
PROGNOSIS OF BEARING FAILURE BASED ON HEALTH STATE ESTIMATION Hack-Eun Kim a, Andy C.C. Tan a, Joseph Mathew a, Eric Y. H. Kim a and Byeong-Keun Choi b a
CRC for Integrated Engineering Asset Management, School of Engineering Systems, Queensland University of Technology, GPO Box 2434, Brisbane, QLD 4001, Australia. b
School of Mechanical and Aerospace Engineering, Gyeongsang National Univ., Tongyoung, Kyongnam, Korea..
This paper proposes a new prognosis model based on the technique for health state estimation of machines for accurate assessment of the remnant life. For the evaluation of health stages of machines, the Support Vector Machine (SVM) classifier was employed to obtain the probability of each health state. Two case studies involving bearing failures were used to validate the proposed model. Simulated bearing failure data and experimental data from an accelerated bearing test rig were used to train and test the model. The result obtained is very encouraging and shows that the proposed prognostic model produces promising results and has the potential to be used as an estimation tool for machine remnant life prediction.
Key Words: Prognosis, Bearing degradation state, Support vector machines (SVMs), Remaining useful life (RUL)
1
INTRODUCTION
The ability to accurately predict the remaining useful life of a machine system is critical for its operation and can also be used to improve productivity and enhance system safety. In condition-based maintenance, maintenance is usually performed based on an assessment or prediction of the machine health instead of its service time, which leads to intended usage of the machine, reduced down time and enhanced operation safety. An effective prognostics program will provide ample time for maintenance engineers to schedule a repair and to acquire replacement components before catastrophic failures occur. Although today’s expert diagnostic engineers have significant information and experience about machine failure and health states by continuously monitoring and analysing of machine condition in industry, well understood systematic methodologies and support systems on how to predict machine remnant life are still not available. The task still relies on human expert knowledge and experience. Therefore, there is an urgent need to continuously develop and improve prognostic models which can be implemented in intelligent maintenance systems with minimum human involvement. An effective prognosis requires performance assessment, development of degradation models, failure analysis, health management and prediction, feature extraction and historical knowledge of faults [1]. For an accurate prognosis, it is essential to conduct a prior analysis of the system’s degradation process, failure patterns and event history of the machine as well as obtain quality machine condition data. In addition, to accurately predict remaining useful life, an ability to provide long-term prediction is one of the challenges in implementing predictive maintenance strategies in real application. Liu et al. [2] suggested the similarity based method for manufacturing process performance prediction and diagnosis. In their paper, similarities with historical data were used to predict the probabilities of failure over time by evaluating the overlaps between predicted feature distributions and feature distributions related to unacceptable equipment behaviour for long-term prediction of process performance. However, they only considered two degradation processes namely, a normal process behaviour and a faulty process behaviour. For accurate assessment of machine health, a significant amount of a priori knowledge of the assessed machine is required because the corresponding failure modes must be known in advance and well-described in order to assess the current machine performance [3]. In general, each machine system has its inherent characteristics that could be used to identify the source of failure. Therefore, prior analysis of machine and knowledge of failure pattern could lead to more accurate prediction of remnant life. Li et al. [4] suggested that a reliable diagnostic model is essential for the overall
performance of a prognostics system. Diagnostics also provides information for obtaining reliable event data and acquiring feedback for system redesign. In this paper, for long-term prediction of the remnant life of a machine, the authors propose a machine prognostic model based on health state estimation using a modified SVM classifier. In this model, prior historical knowledge is embedded in the closed-loop prognostic system together with the classification of faults and the health state estimation. The historical knowledge includes prior knowledge of the machinery degradation process, failure patterns and maintenance history. The model proposes the integration of a diagnosis module into prognostics. By using an integrated system of diagnosis and prognosis, the pre-determined dominant fault obtained in the diagnostic process can be used to improve the accuracy of the prognostics in estimating the remnant life; an integrated system can also deal with different types of faults in a machine system. The health state estimation is carried out by exploring the full failure degradation process of the machine through optimal selection of the various health states over time, from new to final failure stages. In terms of historical knowledge, historical failure data and events are applied to identify the failure patterns. This approach produces effective feature extraction and the construction of fault degradation steps for impending faults. In this research, the authors aim to develop a practical prognostics model which could be used in on-line condition monitoring to predict the remnant life of a failing component. Fault diagnosis and the estimation of the different health states are performed using the classification ability of SVM, with subsequent machine prognostics being conducted based on the probabilities of each health state. To validate the feasibility of the proposed model, the authors simulated progressive bearing degradation data and conducted experimental bearing run-to-failure tests. To select effective features for the classification of health states, the effectiveness of the features was examined and calculated. Then, the probabilities of each health state were obtained using the SVM classifier to estimate the machine's remaining useful life (RUL). The results show that the proposed prognosis model has the potential to be used as an estimation tool for machine remnant life prediction. The remaining part of the paper is organised as follows. Section 2 presents the proposed prognosis model based on health state estimation with embedded historical knowledge. Section 3 describes the basic principle of the SVM employed in this research. Section 4 presents the results of the first case study using simulated progressive bearing fault data. Section 5 presents the second case study results from the experimental bearing failure data, which is followed by the conclusions in Section 6.
2
PROGNOSTICS MODEL BASED ON HEALTH STATE ESTIMATION
In this research, an innovative prognostics model based on health state estimation with embedded historical knowledge is proposed.
Figure 1. A closed loop prognostic system. Figure 1 illustrates the closed loop prognostic model with an embedded historical knowledge in the centre of the closed loop circle. Basically, this model integrates condition monitoring data and historical knowledge with prognosis. The entire sequence includes condition monitoring, fault identification (diagnosis), health state estimation and prognosis by linking them to case based historical knowledge (data). The historical data and knowledge are used to provide useful information for the selection of suitable condition monitoring techniques such as sensor (data) type and signal processing techniques. In this system, the feature extraction and selection techniques in the diagnosis module are linked with the historical knowledge in the
system. The pre-determined failure degradation stage of machine embedded in the historical knowledge module can be used to estimate the health state of the machine. The final output of the prognosis module on certain impending fault can also be accumulated in the case based data as historical knowledge. This accumulated historical knowledge can then be used for system updating and improving for the prognosis model.
Figure 2. Framework of the Prognostics System Based on Health State Estimation. Figure 2 presents the framework of the prognostics model based on health state estimation using SVM. The proposed system consists of three sub-systems, namely, historical knowledge, diagnostics and prognostics. Through failure pattern analysis of the historical data and events, failure degradation stages can be determined to estimate the health state of the machine. In this model, prior historical knowledge is related to signal processing, feature extraction and selection in the diagnosis and prognosis sub-systems as depicted in Figure 2. First, historical condition monitoring data are used in failure pattern analysis in the historical knowledge sub-system. With this prior analysis, major failure patterns that affect the entire life of the machine are identified for diagnostics and prognostics. The failure degradation stages are also determined in historical data analysis. This historical knowledge to be used in diagnostics and prognostics will provide key information on the organisation of this system. In the diagnostics sub-system, the condition monitoring data of machine are collected where significant features of machine faults can be extracted. In general, raw data acquired from sensors require signal processing to obtain appropriate features. A range of features is calculated to cover the preliminary impending faults of the machine system. The effective selection of features is required in order to avoid the problem of dimensionality and high training error value which may cause computer overload and overfit of data training in the pattern recognition techniques. The goal of dimensionality reduction is to reduce high-dimensional data samples in a low-dimensional space while most of the intrinsic information contained in the data is preserved. Once dimensionality reduction is carried out appropriately, compact representation of the data for various succeeding tasks such as visualization and classification can be utilised. An effective feature selection can be used to provide a better performance of predictor, cost-effective predictors and a better understanding of the underlying process that generated the data [5]. After feature extraction (feature selection), predetermined major fault data are trained using SVM multi classifier. Through this training of major faults of machine system, current impending faults can be isolated and identified in the diagnostic system. However, this diagnostic system does not provide the severity of fault. After identifying the impending fault, the failure degradation stages determined in prior historical knowledge module are employed in health state estimation module as depicted in Figure 2. In this step, predetermined failure stages were trained before testing the current health state. Through prior training of each failure degradation stage, current health condition can be obtained in terms of the probabilities of each health state of the machine. The remaining useful life (RUL) obtained according to the probabilities of each health states and historical operation time can be expressed as (1) accordingly.
(1) where P is the probability of a health state, S is a health state, N is the number of states and T is the operation time in hours.
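Equation (1) itself is not reproduced above; the following sketch illustrates one plausible reading of the verbal description (the expected life as the probability-weighted combination of the operation times associated with each trained health state) and should be taken as an assumption for illustration, not the authors' exact formula.

% Illustrative RUL estimate from health-state probabilities (assumed form).
% P      : probabilities of the N health states for the current sample
% Tstate : operation hours at which each state began in the training history
% Ttotal : total operation hours of the training (run-to-failure) history
P      = [0.05 0.10 0.60 0.20 0.05 0.00];   % placeholder values
Tstate = [0 100 250 400 500 560];           % placeholder values
Ttotal = 600;
RUL = sum(P .* (Ttotal - Tstate));          % probability-weighted remaining hours
fprintf('Estimated RUL: %.1f hours\n', RUL);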
In this paper, the prognostic sub-system is used to estimate the RUL since the feasibilities of SVM for the fault classification have been introduced in the recent literature.
3
SUPPORT VECTOR MACHINES
Support vector machines (SVMs) have been employed to conduct fault diagnosis and prognosis of machines because of their excellent ability in classification and regression. As an intelligent technique, SVM can train a given data set, save the result as weights, and then use the weights to perform classification. Traditionally, SVM is used for the classification of linearly separable data into two classes; by using kernel mapping, however, SVM can also be used to train and classify nonlinear data, and by optimizing the hyperplane it can solve both classification and regression problems. Namburu et al. [6] presented the possibility of fault severity estimation via SVM for the mode-invariant fault diagnosis of automotive engines. This section provides a brief summary of the standard SVM for pattern recognition. SVM is based on the statistical learning theory introduced by Vapnik and his co-workers [7,8]. SVM is also known as a maximum margin classifier, with the ability to simultaneously minimize the empirical classification error and maximize the geometric margin. Given input data x_i (i = 1, 2, ..., M), where M is the number of samples with corresponding labels y_i \in \{-1, 1\}, in the linearly separable case it is possible to determine the hyperplane f(x) = 0 that separates the given input data:

f(x) = w^T x + b = \sum_{j=1}^{M} w_j x_j + b = 0    (2)

This leads to the following optimization problem with respect to the primal variables w and b:

minimize \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{M} \xi_i    (3)

subject to \quad y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, ..., M    (4)

where \xi_i are the slack variables accounting for noise and C is the penalty parameter of the error term. This problem can be reduced to the following dual quadratic optimization problem, which is the one solved in practice:

maximize \quad L(\alpha) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{M} \alpha_i \alpha_j y_i y_j x_i^T x_j    (5)

subject to \quad \alpha_i \ge 0, \; i = 1, ..., M, \quad \sum_{i=1}^{M} \alpha_i y_i = 0    (6)

Thus, solving the dual optimization problem leads to the decision function

f(x) = \mathrm{sign}\left( \sum_{i,j=1}^{M} \alpha_i y_i (x_i^T x_j) + b \right)    (7)

SVM can also be used in nonlinear classification tasks through the application of kernel functions. The data to be classified are mapped onto a high-dimensional feature space where linear classification is possible. Using the nonlinear vector function \Phi(x) = (\Phi_1(x), ..., \Phi_l(x)) to map the n-dimensional input vector x onto the l-dimensional feature space, the linear decision function in dual form is given by

f(x) = \mathrm{sign}\left( \sum_{i,j=1}^{M} \alpha_i y_i (\Phi^T(x_i)\Phi(x_j)) + b \right)    (8)

Working in the high-dimensional feature space enables the expression of complex functions, but it also generates other problems: computational problems can occur due to the large vectors, and over-fitting can also exist due to the high dimensionality. The latter problem can be solved by using a kernel function. The kernel is a function that returns a dot product of the feature space mappings of the original data points, written K(x_i, x_j) = (\Phi^T(x_i)\Phi(x_j)). When applying a kernel function, learning in the feature space does not require explicit evaluation of \Phi, and the decision function becomes

f(x) = \mathrm{sign}\left( \sum_{i,j=1}^{M} \alpha_i y_i K(x_i, x_j) + b \right)    (9)
Any function that satisfies Mercer's theorem [9] can be used as a kernel function to compute a dot product in feature space. Different kernel functions are used in SVM, such as the linear, polynomial and Gaussian RBF kernels. The kernel defines the feature space in which the training set examples will be classified. In this research, the polynomial kernel ((\gamma x_i^T x_j + r)^d, \gamma > 0) was employed for the classification of health states. SVMs were originally designed for binary classification, and several methods have been proposed for multiclass classification, such as "one-against-one", "one-against-all" and the directed acyclic graph (DAG). Hsu and Lin [10]
presented a comparison of these methods and pointed out that the ‘‘one-against-one’’ method is more suitable for practical use than other methods. Consequently, in this study, the authors have adopted the ‘‘one-against-one’’ method to classify the six failure degradation stages.
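A minimal sketch of this classification step is given below, assuming MATLAB's Statistics and Machine Learning Toolbox is available; the feature matrix, the labels and the kernel parameter values (C and d) are placeholder assumptions, and the authors' own SMO-based implementation is not reproduced.

% One-against-one multiclass SVM with a polynomial kernel (sketch).
X = rand(600, 4);                          % placeholder N-by-4 matrix of selected features
y = repelem((1:6)', 100);                  % placeholder labels for six health states

t   = templateSVM('KernelFunction', 'polynomial', 'PolynomialOrder', 3, ...
                  'BoxConstraint', 1);     % d and C are assumed values
mdl = fitcecoc(X, y, 'Learners', t, 'Coding', 'onevsone', 'FitPosteriors', true);

cv = crossval(mdl, 'KFold', 5);            % cross-validation for parameter checking
fprintf('Cross-validation error: %.3f\n', kfoldLoss(cv));

[label, ~, ~, posterior] = predict(mdl, X); % posterior: probability of each health state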
4
CASE STUDY USING SIMULATED BEARING FAULT DATA
4.1 Simulation of progressive bearing fault data In general, a prognostic model requires numerous sets of failure data for training and testing. Unfortunately it takes a long time to fail a bearing, even in accelerated run-to-failure tests. To resolve this dilemma, simulation of progressive bearing degradation data was developed as a substitute of real life testing data derived in our previous work [11]. This simulated data provides numerous sets of data and truncations were randomly imposed on a portion of the datasets for the validation of prognostic model. In this work, a vibration waveform generated by a rolling element bearing under constant radial load with a single point defect is first modelled using the MATLAB software and then repeatedly generated while increasing the defect severity exponentially with some added discontinuities. To describe the waveform generated by a rolling element bearing under constant radial load with a single localised defect, the vibration signature can be expressed as
y(t) = y_d(t) \, y_q(t) \, y_r(t) \, y_e(t) \, y_n(t)
(10)
where y_d(t) is a series of impulses at the bearing fault frequency, y_q(t) the bearing radial load distribution, y_r(t) the bearing-induced resonant frequency and y_e(t) the exponential decay due to damping [12,13]. The last component, y_n(t), represents the noise added to corrupt the signal. According to this method, an outer race fault, an inner race fault, a ball fault and a combination of multiple faults were simulated. To simulate random degradation data, the simulated signals had defect impulses that increase at different rates and contain discontinuities. Figure 3 shows the simulated time domain signal of a bearing with an outer race, an inner race and a ball defect, with the shaft frequency set at 600 rpm. For the training and testing of the proposed prognostic model, two random progressive degradation data sets were simulated, as shown in Table 1.
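As an illustration of this signal model, the following sketch generates a crude defect waveform: an impulse train at an assumed defect frequency, modulated by a load distribution, filtered by a decaying resonance and corrupted by noise. The resonance frequency, the decay constant, the defect-frequency ratio and the noise level are placeholder assumptions, and the additive treatment of the noise term follows the verbal description rather than the product form of equation (10).

% Crude simulated bearing-defect waveform following the model above (sketch).
fs   = 20000;                 % sampling frequency [Hz] (from Table 1)
fr   = 600/60;                % shaft frequency [Hz] (600 rpm)
fd   = 3.5*fr;                % defect frequency [Hz] (assumed ratio to shaft speed)
fres = 3000;                  % excited resonance frequency [Hz] (assumed)
beta = 800;                   % exponential decay constant [1/s] (assumed)
t    = (0:1/fs:1-1/fs)';

yd = zeros(size(t));  yd(1:round(fs/fd):end) = 1;   % impulse train at the defect frequency
yq = 1 + 0.3*cos(2*pi*fr*t);                        % radial load distribution modulation
h  = exp(-beta*t).*sin(2*pi*fres*t);                % resonance with exponential decay
y  = filter(h, 1, yd.*yq) + 0.05*randn(size(t));    % defect signal plus additive noise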
Figure 3. Simulated time domain signal with increasing defect impulse.
Table 1. Simulated progressive bearing degradation data sets

Data No | Number of samples | RPM | Sampling frequency | Applied bearing faults
1       | 100               | 600 | 20,000             | BPFI, BPFO, BSF
2       | 100               | 600 | 20,000             | BPFI, BPFO, BSF
4.2 Feature calculation and selection In this paper, the authors calculated 10 statistical parameters from the time domain data: mean, RMS, shape factor, skewness, kurtosis, crest factor, entropy estimation, entropy estimation error, histogram lower and histogram upper. In addition to these parameters, four parameters (RMS frequency, frequency centre, root variance frequency and peak) were calculated in the frequency domain. A total of 14 features were thus calculated, as shown in Table 2.
Table 2. Statistical feature parameters and attributed labels
Time domain parameters: Mean [1], RMS [2], Shape factor [3], Skewness [4], Kurtosis [5], Crest factor [6], Entropy estimation value [7], Entropy estimation error [8], Histogram upper [9], Histogram lower [10]
Frequency domain parameters: RMS frequency value [11], Frequency centre value [12], Root variance frequency [13], Peak value [14]
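A sketch of a few of the listed features is given below. The exact definitions used by the authors are not given in the paper, so the frequency-domain expressions (RMS frequency, frequency centre and root variance frequency) follow common textbook forms and should be treated as assumptions.

% A few of the 14 statistical features for one vibration sample x (sketch).
fs = 20000;  x = randn(20000, 1);             % placeholder signal
X  = abs(fft(x));  X = X(1:floor(end/2));     % one-sided amplitude spectrum
f  = (0:numel(X)-1)'*fs/numel(x);             % frequency axis [Hz]

rmsVal  = sqrt(mean(x.^2));                   % RMS [2]
shapeF  = rmsVal/mean(abs(x));                % shape factor [3]
skewVal = skewness(x);                        % skewness [4] (Statistics Toolbox)
kurtVal = kurtosis(x);                        % kurtosis [5]
crestF  = max(abs(x))/rmsVal;                 % crest factor [6]
fc      = sum(f.*X)/sum(X);                   % frequency centre [12] (assumed form)
rmsFreq = sqrt(sum((f.^2).*X)/sum(X));        % RMS frequency [11] (assumed form)
rvf     = sqrt(sum(((f - fc).^2).*X)/sum(X)); % root variance frequency [13] (assumed form)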
In this paper, to improve the performance of the SVM and reduce the computational effort, effective features were selected using the evaluation method of feature effectiveness introduced by Knerr et al. [14-16], as depicted below.
Step 1: calculation of the relative distance between data belonging to the same state (11)
Step 2: calculation of the relative distance between data belonging to different states (12)
Step 3: calculation of the effectiveness factor (relative distance ratio) of each feature (13)
where m, n = 1, 2, ..., N, m ≠ n; Pi,j is the feature (eigen) value, i the data index, j the state index, N the number of features and M the number of states.
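Equations (11)-(13) are not reproduced above. The sketch below implements one plausible reading of the three steps (the average distance between samples of the same state, the average distance between samples of different states, and their ratio as the effectiveness factor); it is an assumption for illustration, not necessarily the authors' exact definition.

% effectiveness_factor.m - one plausible reading of steps 1-3 (assumption).
% F: N-by-1 values of a single feature, s: N-by-1 state labels.
function lambda = effectiveness_factor(F, s)
    states = unique(s);
    dIn = 0;  dOut = 0;  nIn = 0;  nOut = 0;
    for a = 1:numel(states)
        Fa  = F(s == states(a));
        dIn = dIn + mean(pdist(Fa));              % step 1: distance within a state
        nIn = nIn + 1;
        for b = a+1:numel(states)
            Fb = F(s == states(b));
            D  = pdist2(Fa, Fb);                  % step 2: distance between states
            dOut = dOut + mean(D(:));
            nOut = nOut + 1;
        end
    end
    lambda = (dOut/nOut) / (dIn/nIn);             % step 3: effectiveness factor
end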
Table 3 shows the effectiveness factor values calculated for the 14 features and the ten states.

Table 3. The effectiveness factor of features
Feature label:        1     2     3     4     5     6     7     8     9     10    11    12    13    14
Effectiveness value:  3.58  4.33  0.482 6.47  6.47  5.22  5.12  5.82  4.35  4.34  6.14  6.07  6.53  4.37
The authors selected four features (Skewness[4], Kurtosis[5], RMS frequency value[11] and Root variance frequency value[13]) which had high effectiveness factor values as compared with other features. High effectiveness value relates to those features which have low dispersibility in the same state and high dispersibility among different states. Therefore, it could minimise the classification training error of each bearing degradation stage.
4.3 Health state estimation and prediction of remnant life
As the basic kernel function of the SVM, a polynomial was used. Multi-class classification with the one-against-one method was applied to classify the bearing degradation. Sequential minimal optimization (SMO), proposed by Platt [17], was used to solve the SVM classification problem. For the selection of the optimal kernel parameters (C, γ, d), the authors used the cross-validation technique suggested by Hsu et al. [18] in order to obtain effective classification performance and to avoid over-fitting or under-fitting. In this paper, the simulated degradation data were divided into ten degradation stages for the estimation of the health state and the prediction of the remnant life using the four selected features. Once the ten states had been trained using the four selected features of Data 1, the full data set of Data 1 (100 samples) was tested to obtain the probabilities of the ten degradation states. Figure 4 shows the probabilities of each stage of simulated Data 1, which was also used for training the ten degradation states. The first-stage probability started at 100% and decreased as the next-stage probability increased. Although there are some overlaps in the middle zone of the display, the probabilities of the health states explain well the sequence of the ten degradation states over the entire sample; in particular, the initial and final states are distinctly separated.
Figure 4. Probabilities of each state [Simulated data1]
Figure 5. Comparison of real remaining life and estimated life [Simulated data1] For the estimation of the remaining useful life (RUL), the expected life was calculated using the time of each training data set and the probabilities of each health state, as expressed in (1). Figure 5 shows the estimated remnant life and the comparison between the real remaining life and the estimated life. The overall trend of the estimated life follows the real remaining life of the bearing failure, and the average prediction value was 95.05% over the entire data set. The average prediction value was calculated using the following equation.
(14) where N is the number of data samples, and the remaining symbols denote the real remaining life and the expected life, respectively.
Using identical training data (Data 1), a second simulated bearing failure data set (Data 2), which consisted of 100 samples, was also tested to verify the proposed model. Figure 6 shows the probabilities of each state of Data 2. Compared with Data 1, the first state lasts longer and the final state does not reach as high a probability as in the former case. Figure 7 shows the estimated life, which was again calculated from the time of each training data set and the probabilities of each health state. Although there are some margins in the initial states, the estimated life in the latter half of the samples closely matches the real remaining life of the bearing failure, and the average prediction value was 92.5% over the entire data set.
Figure 6. Probabilities of each state [Simulated data2]
Figure 7. Comparison of real remaining life and estimated life [Simulated data2] 5
CASE STUDY USING EXPERIMENTAL BEARING FAILURE DATA
5.1 Experimental setup and acquisition of accelerated bearing failure data In order to validate the proposed prognostic model, bearing run-to-failure tests were performed under controlled load conditions on a specially designed test rig which facilitates accelerated bearing life tests. The test rig simultaneously hosts four test bearings on a shaft driven by an AC motor. A coupling is used so that, when a bearing fails, it can be extracted and replaced easily without having to move the other bearings on the shaft. A spring mechanism loads the two middle bearings, and the load can be adjusted by tightening or loosening a screw on the spring mechanism. The schematic of the test rig is depicted in Figure 8.
Figure 8. Schematic of the bearing test rig (motor, bearings 1-4, radial load, thermocouple, accelerometers and AE sensors)
The two bearings at each shaft end undergo the same amount of load as the middle bearings due to the reaction force at the supports. An accelerometer, an acoustic emission sensor and a thermocouple were attached to the middle bearing housing for measurement. SMT 61806 single row deep groove ball bearings were used for the run-to-failure test at a constant rotation speed of 1300 rpm. The data sampling rate was 250 kHz, and data collection was conducted by a National Instruments LabVIEW program. Two bearing failure data sets were collected under identical conditions for the validation of the proposed model; Table 4 summarizes the collected vibration data sets. In this work, the authors used only the vibration signals from this bearing test.
Table 4. Experimental bearing failure data sets
Test No | Number of samples | Bearing position | RPM  | Sampling frequency | Total operation time
1       | 912               | 3                | 1300 | 250 kHz            | 683 min
2       | 810               | 3                | 1300 | 250 kHz            | 579 min
5.2 Feature calculation and selection Using the experimental test data, the authors also calculated 14 features from the time domain and frequency domain data, and the same feature evaluation method as described in Section 4 was used to select the features effective for the classification of the health state. As a result of this test, four features (RMS, entropy estimation value, histogram upper value and peak value) were selected to test the proposed model. 5.3 Health state estimation and prediction of remnant life In this work, the bearing run-to-failure data were divided into six degradation stages for the estimation of the health state, and the prediction test of the RUL was performed using the four selected features. Figure 9 shows the probabilities of each stage of experimental Data 1, which was also used for training the health states. The probability variation of each state occurred after 278 samples, because an abnormal bearing condition was detected at this point in time. The probabilities of each stage explain well the sequence of the six degradation states after the start of the abnormal condition, and the states are distinctly separated, as shown in Figure 9. The training error was about 1.7% for the six health states. The expected life was again calculated from the time of each training data set and the probabilities of each health state, as expressed in (1).
Figure 9. Probabilities of each state [Experimental test1]
Figure 10. Comparison of real remaining life and estimated life [Experimental test1]
Figure 10 shows the comparison between the real remaining life and the estimated life. Because the normal condition lasted a long time, there were large margins between the real remaining life and the estimated life in the initial state; however, the estimated life closely followed the real remaining life after 278 samples, as shown in Figure 10. The average prediction value was again calculated using equation (14). Although the average prediction value was 86.32% over the entire data set, it was 97.67% after the start of the abnormal condition. The second experimental test data set, which consisted of 810 samples, was also employed for the validation of the model using identical training data (experimental Data 1). Figures 11 and 12 show the test results for the probabilities of each state and the comparison between the real remaining life and the predicted life. As shown in Figure 11, the probability variations began after around 600 samples, because the abnormal condition started at that time in the second test data. Compared with the former result (Test 1), the probability of the fifth state is relatively low and hard to see in the display. The estimated life of the second test data therefore also started to follow the real remaining life after the beginning of the abnormal bearing condition. In this case, the average prediction value was 38.93% over the entire data set and 95.5% after the start of the abnormal condition. Furthermore, the difference in starting time between the real remaining life and the estimated life in the initial degradation state originates from the different life times of the training data (Test 1, 683 minutes) and the test data (Test 2, 579 minutes).
Figure 11. Probabilities of each state [Experimental test 2]
Figure 12. Comparison of real remaining life and estimated life [Experimental test 2]
6
CONCLUSION
This paper addresses the prognosis of bearing failure based on health state estimation using an SVM classifier. The proposed model is based on historical data, in terms of historical knowledge, to determine the failure degradation states used in estimating the machine health state. Progressive bearing fault data were first simulated, and experimental bearing run-to-failure tests were then conducted, to validate the proposed model. To increase the performance of the SVM classifier, effective features were selected using an evaluation method of feature effectiveness. The results from the two case studies indicate that accurate estimation of health states is achievable, which would provide long-term prediction of machine remnant life and early warning of abnormal machine conditions. However, knowledge of failure patterns and physical degradation from historical data for diverse types of faults still needs further investigation.
7
REFERENCES
1
J. Lee, J. Ni, D. Djurdjanovic, H. Qiu and H. Liao. (2006) Intelligent prognostics tools and e-maintenance. Computers in Industry, 57, 476-489.
2
J. Liu, D. Djurdjanovic, J. Ni, N. Casoetto and J. Lee. (2007) Similarity based method for manufacturing process performance prediction and diagnosis. Computers in Industry, 58, 558-566.
3
A. K. S. Jardine, D. Lin and D. Banjevic, (2006) A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical System and Signal Processing, 20, 1483-1510.
4
Y. Li, S. Billington, C. Zhang, T. Kurfess, S. Danyluk, and S. Liang. (1999) Adaptive Prognostics for Rolling Element Bearing Condition. Mechanical Systems and Signal Processing, vol. 13, pp. 103-113.
5
I. Guyon and A. Elisseeff. (2003) An introduction to variable and feature selection, Journal of Machine Learning Research 3. 1157-1182.
6
S. M. Namburu, S. Chigusa, D Prokhorov, L. Qiao, K. Choi and K. Pattipati (2006) Application of an effective datadriven approach to real-time fault diagnosis in automotive engines. IEEE Transaction on Automatic Control, 1646(7).
7
V.N. Vapnik. (1995) The Nature of Statistical Learning Theory, Springer-Verlag, New York.
8
V.N. Vapnik. (1999) An overview of statistical learning theory, 1999. IEEE Transactions on Neural Networks, 10 (5) 988-999.
9
N. Cristianini and N.J. Shawe-Taylor. (2000) An Introduction to Support Vector Machines, Cambridge: Cambridge University Press.
10 C.W. Hsu and C.J. Lin. (2002) A comparison of methods for multiclass support vector machines, IEEE Transaction on Neural Network 13(2), 415-425. 11 A. Heng, A. C. C. Tan, J. Mathew and B. S. Yang. (2007) Machine prognosis with full utilization of truncated lifetime data, in Proceedings 2nd World Congress on Engineering Asset Management and the 4th International Conference on Condition Monitoring, pp. 775-784, Harrogate, UK. 12 P. D. McFadden and J. D. Smith. (1984) Model for the vibration produced by a single point defect in a rolling element bearing, Journal of Sound and Vibration, vol. 96, pp. 69-82. 13
S. Braun and B. Datner. (1979) Analysis of roller/ball bearing vibrations, Transactions of the ASME, vol. 101, pp. 118-125.
14 J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio and V. Vapnik. (2000) Feature selection for SVMs, In Proceedings of Advances in Neural Information Processing Systems 12. MIT Press, 526-532. 15 S. Knerr, L. Personnaz, and G. Dreyfus. (1990) Single-layer learning revisited: a stepwise procedure for building and training a neural network, in Neuro-computing: Algorithms, Architectures and Applications, J. Fogelman. Ed. SpringerVerlag, New York. 16 W. W. Hwang and B. S. Yang (2004) Fault diagnosis of rotating machinery using multi-class support vector machines, in Korea Society for Noise and Vibration Engineering. Vol. 14, No. 12, pp. 1233-1240. 17 J. Platt. (1999) Fast training of support vector machines using sequential minimal optimization, in: B. Scholkopf, et al. Advances in Kernel Methods-Support Vector Learning, MIT Press, Cambridge. 18 C. W. Hsu, C. C. Chang and C. J. Lin. (2005) A practical guide to support vector classification, in Technical Report, Department of Computer Science and Information Engineering, National Taiwan University, Available at: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf.
Acknowledgments This research was concluded within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government’s Cooperative Research Centres Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
RESEARCH ON DATA-BASED NONLINEAR FAULT PREDICTION METHODS IN MULTI-TRANSFORM DOMAINS FOR ELECTROMECHANICAL EQUIPMENT XU Xiao-li a,b, CHEN Tao a, WANG Shao-hong a,b
a Beijing Institute of Technology, Beijing, 100081, China
b Beijing Information Science & Technology University, Beijing, 100192, China
Safety of equipment has a significant impact on production and human resources as well as on the environment, and ensuring safe operation of equipment is an important problem. One of the important and difficult key technologies for guaranteeing equipment operation is fault prediction. In this paper, research on fault prediction methods based mainly on field data is carried out to achieve predictive maintenance for large rotating electromechanical equipment, as most of its faults are trending faults with long-course characteristics. The paper studies a new way of performing fault prediction in multi-transform domains: it performs feature frequency band decomposition based on the wavelet packet transform or the HHT, explores nonlinear dimension reduction methods to extract fault-sensitive characteristics, and applies Elman neural network methods to perform nonlinear associative intelligent prediction based on historical and present fault-sensitive characteristics, so as to realise long-course fault prediction. The research is important for large electromechanical equipment in achieving early fault prediction, guaranteeing safe operation, saving maintenance costs, improving utilization and implementing scientific maintenance. Key Words: Predictive Maintenance, Electromechanical Equipment, Data-Based, Multi-Transform Domains, Fault Prediction 1
INTRODUCTION
The safety of equipment has a significant impact on production, resources and the environment, and one major issue is to ensure safe operation of equipment. Traditionally, the maintenance approach for large electromechanical equipment is time-based preventive maintenance, also called periodic maintenance. Predictive maintenance, in contrast, is a dynamic, condition-based maintenance approach [1,2]. Replacing traditional preventive maintenance with predictive maintenance is the developing trend for key and large equipment; in this way, the previous maintenance mechanism of equipment can be changed substantially. To ensure that equipment operates reliably and effectively, the relevant study has gone through three stages: (1) monitoring the operating condition of equipment; (2) fault diagnosis, which is usually conducted when a malfunction occurs; (3) predicting the trend of operating condition development at an early stage to implement predictive maintenance, which is usually carried out before a malfunction. Fault prediction technology is a key technology for ensuring safe operation of equipment and a key topic in electromechanical fault diagnosis theory. Fault prediction technology can not only improve the safety of complicated equipment, secure the reliability of its tasks and lower life cycle cost, but also plays a key role in conducting dynamic predictive maintenance. Moreover, fault prediction technology has brought enormous impetus to the improvement of industrial technology and even comprehensive national strength. As seen from a series of published literature [3,4], fault prediction technology has become an important topic of scientific and technological study.
2 THE CLASSIFICATION OF FAULT PREDICTION APPROACHES
Currently there is no unified classification for fault prediction approaches. According to the range of theory, approaches and technology widely adopted in actual studies, they can be roughly classified into three categories [5]: model-based approaches, knowledge-based approaches and data-based approaches.
(1) Model-based approaches. These approaches can capture object characteristics in depth on the premise that the mathematical model of the studied object is known. However, it is hard to set up exact mathematical models for complex dynamic systems in the engineering field, so the actual application and effect of this kind of approach is largely limited.
(2) Knowledge-based approaches. These approaches do not require exact mathematical models, and their greatest advantage is the use of expert knowledge and experience in the related fields. Representative applications are expert systems and fuzzy logic, but this kind is only fit for qualitative reasoning rather than quantitative calculation, so its actual application is limited to some extent.
(3) Data-based approaches. In actual fault prediction it is uneconomical, or even impossible, to establish mathematical models for the operating conditions of complex equipment, and at the same time expert experience and knowledge in the field cannot be expressed effectively. Data-based fault prediction approaches, however, work directly from the monitored data, and prediction can be carried out by mining the implicit information, avoiding the shortcomings of model-based and knowledge-based approaches. This kind has become applicable for fault prediction; representatives are neural networks and hidden Markov models. Nevertheless, in actual applications it is hard to obtain typical data, and the uncertainty and incompleteness of the data obtained also bring processing difficulties.
The following difficulties are involved in fault prediction technology:
(1) Uncertainty of prediction. Uncertainty mainly derives from the randomness of the fault mechanism and the errors that occur in the prediction process. In a long prediction course, disturbances from non-fault factors, such as changes of working condition and load, are major causes of fault prediction uncertainty.
(2) Nonlinear prediction. An electromechanical system is a complex, nonlinear power system. When predicting with traditional approaches, nonlinear factors are usually neglected; as a rule, solutions derived from the Fourier transform fall far from the facts. With increasing requirements on prediction accuracy, nonlinear problems have been highlighted, and nonlinear prediction approaches suitable for nonlinear systems need to be explored further.
(3) Difficulties of data acquisition. The development and verification of fault prediction algorithms call for the support of large volumes of data, of which there are three kinds. The first is actual working condition data; this kind of data can cover the various working conditions, loads and environmental factors of the equipment system, and the data are true and reliable, but a data acquisition platform needs to be built. The second is fault data injected on experimental test rigs; this kind can guarantee data authenticity to some extent, but the inadequacy is that these data cannot describe the actual fault evolution process completely. The third is model simulation data; this kind can be customised according to the requirements of algorithm development and verification, but its authenticity usually cannot be guaranteed, and it is also very difficult to build a reliable simulation model.
(4) Difficulties of prediction verification. Verification is needed to ensure that the fault prediction approaches adopted meet the desired accuracy. As simulation verification cannot generate convincing results and verification on experimental test rigs also departs far from the actual application environment, an effective way is to build an online data exchange platform and perform verification on the industrial spot.
3 RESEARCH ON DATA-BASED NONLINEAR FAULT PREDICTION APPROACHES IN MULTI-TRANSFORM DOMAINS FOR ELECTROMECHANICAL EQUIPMENT
With the safety of large electromechanical equipment in key petrochemical enterprises as the application background, and the large rotating electromechanical equipment widely used on the industrial spot as the research object, data-based nonlinear fault prediction approaches in multi-transform domains are investigated, in consideration of the fault prediction difficulties encountered, to shorten the distance between theoretical study and actual application. The research route is shown in Figure 1 and its detailed steps are as follows: (1) obtain online actual data of the equipment through a remote online monitoring and diagnosis centre; (2) perform feature frequency band decomposition in the time-frequency domain based on the wavelet packet transform or HHT for the data obtained in (1); (3) in the decomposed frequency bands, explore a nonlinear dimension reduction method which maps fault development features to geometric shapes in the topology domain, and reveal the relationship between the geometric shape information and the fault trend representation so as to extract fault-sensitive characteristics; (4) with the fault-sensitive characteristics of (3), build a nonlinear fault prediction model which is dynamic and self-adaptive, such as an Elman neural network, and conduct intelligent trend prediction in the time domain.
Taking a large flue gas turbine unit with a high fault rate in petrochemical enterprises as the specific research object, fault prediction research is conducted with the information provided by the remote online monitoring and diagnosis centre, and electromechanical experimental test rigs are also utilised to study simulated mechanical dynamic features and fault prediction under various application backgrounds.
Figure 1. Research route
3.1. Extraction of feature frequency band
Correct analysis of the operating condition of equipment is necessary in fault diagnosis and trend prediction. In recent years, wavelet technology in the time-frequency domain has been widely used in the diagnosis of electromechanical equipment and has proved effective. However, the wavelet transform does not decompose the high-frequency part, so the frequency resolution in the high-frequency range is poor. As a new time-frequency analysis method, the Hilbert-Huang Transform (HHT) has gradually been introduced into the analysis of nonlinear and non-stationary signals, and can capture local features which cannot be obtained through other methods [6]. The wavelet packet transform (WPT) is an improvement of the wavelet transform and provides refined analysis for non-stationary signal processing and feature extraction [7,8]. In the wavelet packet transform, the details of both the low- and high-frequency parts are decomposed continuously, so the analysis capability of the wavelet packet transform is more powerful. Feature frequency bands are therefore extracted from the various frequency bands obtained through HHT or WPT. Figure 2 is the decomposition tree of a flue gas turbine vibration signal based on WPT.
Figure 2. Decomposition tree of wavelet packet
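As a rough illustration of the band-energy extraction described in this subsection (not part of the original study), the following Python sketch decomposes a surrogate vibration signal with a three-level wavelet packet using the PyWavelets library and picks the most energetic terminal band; the sampling rate, wavelet choice and signal are assumed placeholders.

```python
import numpy as np
import pywt

fs = 2000                                   # assumed sampling rate [Hz]
t = np.arange(0, 1.0, 1.0 / fs)
# surrogate vibration signal standing in for the monitored turbine data
signal = np.sin(2 * np.pi * 50 * t) + 0.3 * np.random.randn(t.size)

# three-level wavelet packet decomposition (full binary tree of frequency bands)
wp = pywt.WaveletPacket(data=signal, wavelet='db4', mode='symmetric', maxlevel=3)

# energy of each terminal node, ordered by frequency band
bands = [node.path for node in wp.get_level(3, order='freq')]
energies = {path: float(np.sum(np.square(wp[path].data))) for path in bands}

# the feature frequency band is taken here as the band with the largest energy
feature_band = max(energies, key=energies.get)
print(feature_band, energies[feature_band])
```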
3.2. Fault sensitive characteristics extraction
The flue gas turbine is a complex electromechanical system and its operating condition is nonlinear. Although the high-dimensional monitored data provide rich detailed information about the operating conditions, they also bring great difficulties for the extraction of fault-sensitive characteristics. Here, lessons are drawn from facial image analysis under pose variations, where the corresponding feature change is regarded as a low-dimensional nonlinear manifold. Based on comprehensive analysis of the operating data of the equipment, the Isometric Feature Mapping (ISOMAP) algorithm [9,10] from nonlinear manifold learning is introduced to reduce the high-dimensional data, explore the distribution change of the low-dimensional manifold within the high-dimensional data, and obtain fault-sensitive characteristics for long-course fault prediction. Nonlinear manifold learning projects high-dimensional inputs to a low-dimensional space while keeping the local structure of the data, identifies the inherent geometric structure and rules hidden in the data, and better addresses the "curse of dimensionality" in data processing. In manifold learning it is supposed that the processed data are sampled from an underlying manifold, in other words, that there is an underlying manifold for this set of data, which is then re-expressed in low dimension. The ISOMAP algorithm is used to reveal the relationship between geometric shape information and fault development features in order to extract the sensitive characteristics.
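A minimal sketch, assuming a matrix of high-dimensional condition-monitoring features, of how ISOMAP can be applied for this dimension reduction step using scikit-learn; the data and the parameter values below are placeholders, not the settings used in the paper.

```python
import numpy as np
from sklearn.manifold import Isomap

# X: assumed matrix of high-dimensional condition features,
# one row per monitoring sample (e.g. band energies, statistical indicators)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))          # placeholder for real monitoring data

embedding = Isomap(n_neighbors=10, n_components=2)
Y = embedding.fit_transform(X)          # low-dimensional fault-sensitive coordinates
print(Y.shape)                          # (200, 2)
```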
3.3. Fault prediction
With the increasing complexity of electromechanical systems, nonlinear fault prediction issues are highlighted; the prediction results of traditional approaches, which neglect nonlinear factors, mostly fall far from the facts, while analysis and processing approaches for dynamic systems of a nonlinear nature have promising application prospects. A neural network can approximate an arbitrary continuous nonlinear function and all its derivatives with arbitrary precision by appropriate selection of the number of network layers and hidden-layer cells, and is therefore widely used [11]. The Elman neural network, put forward by Elman in 1990 [12], is a kind of dynamic recurrent neural network. It usually has four layers: an input layer, a middle (hidden) layer, an acceptor (context) layer and an output layer, as shown in Figure 3. In this model, an acceptor layer is added to the hidden layer of the feed-forward network as a one-step time delay operator so as to provide memory; the system can therefore adapt to time variation and reflect the dynamic process. Based on historical and current information, fault prediction is conducted by applying the Elman neural network to judge the fault trend of the flue gas turbine [13]. Figure 4 is the prediction error of the vibration amplitude on the front bearing of the turbine, with the dimension-reduced data as input.
Figure 3. Elman Neural Network Structure
Figure 4. Prediction error of the Elman neural network
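As an illustrative sketch only (the paper does not give its network configuration), an Elman-style one-step-ahead predictor can be set up in PyTorch, whose simple recurrent layer with a tanh nonlinearity has exactly the Elman structure described above; the layer sizes and training setup are assumptions.

```python
import torch
import torch.nn as nn

class ElmanPredictor(nn.Module):
    """One-step-ahead predictor built on an Elman (simple recurrent) layer."""
    def __init__(self, n_features: int, n_hidden: int = 16):
        super().__init__()
        # nn.RNN with tanh is the classic Elman network: the hidden state acts
        # as the context ("acceptor") layer fed back at the next time step.
        self.rnn = nn.RNN(n_features, n_hidden, nonlinearity='tanh', batch_first=True)
        self.out = nn.Linear(n_hidden, 1)

    def forward(self, x):                   # x: (batch, time, n_features)
        h_seq, _ = self.rnn(x)
        return self.out(h_seq[:, -1])       # predicted next value

# assumed usage: sliding windows of fault-sensitive features -> next vibration amplitude
model = ElmanPredictor(n_features=2)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```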
3.4. Features of the research
(1) In terms of data sources and results output, the data monitored on the industrial spot can be transmitted to the laboratory through the remote online monitoring and diagnosis centre. Most of the data used are online actual data from large electromechanical equipment (different from the usual simulation data), and the results are verified directly and delivered to the users by the remote centre in time. (2) In terms of prediction, the characteristics are extracted from actual monitored data and the models are set up based on the online data monitored from the actual electromechanical system. Compared with traditional purely mathematical model methods, this fits actual applications well and thus obtains more effective prediction results.
4 CONCLUSION
For large rotating equipment, in order to carry out intelligent predictive maintenance, fault prediction research based on spot data is conducted and nonlinear data-based prediction methods in multi-transform domains are explored. The research realises feature band capture in the time-frequency domain based on WPT/HHT, fault-sensitive characteristic extraction based on the ISOMAP algorithm in the topology domain, and nonlinear intelligent prediction by applying an Elman neural network to the historical and current characteristic information, so as to implement long-course fault prediction. The research has significant implications for realising early fault prediction for large electromechanical equipment, guaranteeing safe operation, saving maintenance costs and improving utilisation, as well as achieving scientific maintenance.
5 REFERENCES
1 Jay Lee, Jun Ni, Dragan Djurdjanovic, Hai Qiu, Haitao Liao. (2006) Intelligent Prognostics Tools and E-Maintenance. Computers in Industry, (57): 476-489
2 Andrew K.S. Jardine, Daming Lin, Dragan Banjevic. (2006) A Review on Machinery Diagnostics and Prognostics Implementing Condition-Based Maintenance. Mechanical Systems and Signal Processing, (20): 1483-1510
3 P. Caselitz, J. Giebhardt. Condition Monitoring and Fault Prediction for Marine Current Turbine. http://www.iset.unikassel.de/abt/FB-E/papers/Paper_CM4MCT_dl.pdf
4 Ammar Iqbal, Rakesh Tanange, Shafqat Virk. (2006) Vehicle Fault Prediction Analysis. Sweden
5 Liang Xu, Li Xingshan, Zhang Lei, Yu Jinsong. (2007) Survey of Fault Prediction Supporting Condition Based Maintenance, Measurement & Control Technology, 26(6): 5-8, 14
6 P. Frank Pai, Anthony N. Palazotto. (2008) HHT-based Nonlinear Signal Processing Method for Parametric and Non-Parametric Identification of Dynamical Systems. International Journal of Mechanical Sciences, (50): 1619-1635
7 Suleyman Bilgin, Omer H. Colak, Etem Koklukaya, Niyazi Ari. (2008) Efficient Solution for Frequency Band Decomposition Problem Using Wavelet Packet in HRV. Digital Signal Processing, 18: 892-899
8 Hu Qiao, He Zhengjia, Zhang Zhousuo, Zi Yanyang. (2007) Fault Diagnosis of Rotating Machinery Based on Improved Wavelet Package Transform and SVMs Ensemble. Mechanical Systems and Signal Processing, 21(2): 688-705
9 Tenenbaum J.B., Silva V., Langford J.C. (2000) A global geometric framework for nonlinear dimensionality reduction. Science, (290): 2319-2323
10 Yin Junsong, Xiao Jian, Zhou Zongtan, Hu Dewen. (2007) Analysis and Application of Nonlinear Manifold Learning Method, Progress in Natural Science, 17(8): 1015-1025
11 Abhinav Saxena, Ashraf Saad. (2007) Evolving an Artificial Neural Network Classifier for Condition Monitoring of Rotating Mechanical Systems. Applied Soft Computing, (7): 441-454
12 J.L. Elman. (1990) Finding structure in time. Cognitive Science, (14): 179-211
13 Meng Lingqi, Meng Meng. (2008) Application of Elman Neural Network on Wide Spread Prediction in Medium Plate Mill, Journal of Jilin University (Engineering and Technology Edition), 38(1): 193-196
Acknowledgments The research has been supported by the Scientific Research Key Program (KZ200910772001) of the Beijing Municipal Commission of Education and the Funding Project (PHR20090518) for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
IM:MR - A TOOL FOR INTEGRATION OF DATA FROM DIFFERENT FORMATS
Danúbia Espíndola a, Carlos Eduardo Pereira b and Marcio Pinho c
a,b Federal University of Rio Grande do Sul, Osvaldo Aranha, 103, Porto Alegre, RS, Brazil
c Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, RS, Brazil
The IM:MR tool is a system for the integration of data originating from Intelligent Maintenance (IM) and Mixed Reality (MR) systems. The motivation for developing this solution is to facilitate the visualization of predictive data in order to contribute to downtime reduction for critical equipment in industry. Using the IM:MR tool, the operator can be guided during maintenance tasks. To this end, CAD data, virtual components (MR data) and several signals from IM systems are integrated. This study presents an introduction, contextualization and related works that support the understanding of this proposal; finally, the tool, the experiments, conclusions and future perspectives are presented.
Keywords: Mixed and Augmented Reality, Intelligent Maintenance Systems.
1 INTRODUCTION
The complexity of industrial processes demands the use of support tools in automation systems. Nowadays, supervisory systems are used to provide real-time information about the state of the equipment/plant. With the advent of embedded systems, Intelligent Maintenance Systems (IMS), also known as predictive systems, have arisen. These systems focus on the use of software and sensors in machines and equipment in order to allow an evolution from the traditional systems of corrective and preventive maintenance to a predictive system. The focus of IMS is the understanding of the component degradation process, based on the state and conditions of equipment use. In this context, Mixed/Augmented Reality (MR/AR) techniques arise as a solution for 3D visualization capable of helping in system supervision. The possibility of superimposing computer-generated virtual objects on the real environment (i.e. the monitored equipment) in real time represents a powerful support tool for industrial production processes. Thus, in order to facilitate the users' understanding (usually maintenance technicians), it is intended to develop a visualization tool that presents the information from the predictive system. In other words, through visualization devices such as Tablet-PCs, PDAs (Personal Digital Assistants), HMDs (Head Mounted Displays) and so forth, the user will be able to receive information from the intelligent maintenance system, such as 2D graphs, 3D models and maintenance text guides during the maintenance task. Besides, interaction by voice commands, specifying the component of the machine for which one wants to visualize information, will be possible. Since maintenance tasks require safety both in the maintenance procedure and for the user's physical integrity, Mixed Reality, originating from Virtual Reality (VR), represents a way to provide safe interaction for the user. Traditional maintenance techniques, such as replacement and repair of components accomplished only when a break happens, are no longer enough. The search for effective improvement of the production process has become a constant in the race for the fundamental requirements of cost, time and product quality [12][22]. This search can be observed in the growing discussion about the use of VR/AR/MR in industrial applications [17][24]. Therefore, tools that add Mixed/Augmented Reality resources to Intelligent Maintenance Systems give a significant contribution both to the state of the art and to the industrial context.
This paper is divided into six sections: section 2 describes the context of the subject; in section 3 the related works on the application of Virtual Reality in industry are presented; section 4 proposes a conceptual model for the integration of both systems; finally, section 5 shows some experiments and section 6 describes the conclusions and future perspectives.
2 CONTEXT
Intelligent Maintenance systems intend to allow an evolution from the traditional corrective/preventive maintenance systems to a predictive maintenance system. Preventive maintenance, the most used in industry nowadays, seeks to avoid failure by executing maintenance at fixed intervals of time; however, it may cause constant and unnecessary breaks in the production line, as well as replacement of components in good operational condition. Corrective maintenance consists of replacing or repairing the component at the moment of failure, which causes risks to employees and unexpected breaks. Predictive maintenance, on the other hand, intends to communicate beforehand where and when a failure will happen. The first step in the implementation of an intelligent maintenance system is identifying the critical component of the system (machine or equipment). After the equipment definition, it is necessary to insert sensors for signal acquisition; for this, a previous study of the system is important, as well as a precise definition of what will be monitored. Data acquisition by sensors allows the creation of a machine operation report that can be used as a diagnosis of the equipment for the purpose of analysing its life cycle. It is important to highlight that the term diagnosis refers to past behaviour; when there is a suggestion of future behaviour, the term used is prognosis (the focus of predictive maintenance). Having the condition data, and using instrumentation concepts, signal processing techniques and degradation models are applied to predict where and when the component will fail, reducing downtime and allowing the optimization of maintenance processes. These techniques and models are described in section 4. Regarding Mixed Reality, it has the capacity to mix real and virtual elements in an environment presented on the output device in real time. Augmented Reality (AR) and Augmented Virtuality (AV) are inserted in the context of Mixed Reality; however, these terms are often used indiscriminately, with the term Augmented Reality prevailing and sometimes being applied in a wrong way. According to [15], Augmented Reality occurs when virtual objects are brought into the real environment; Augmented Virtuality is the opposite, where real objects are brought into the virtual environment. Hence, as this work focuses on industrial maintenance, aspects of Augmented Reality will be explored as a way of mixing virtual information with the real environment (factory/equipment). The use of MR/AR will allow analysis, interaction with graphs and exploration of cognitive aspects, making the understanding of information easier and assisting operators in decision making during maintenance tasks. Maintenance routines and failure diagnosis can be improved through the use of these technologies. In this way, this study intends to develop a tool for the implementation of a system that can indicate and provide maintenance instructions, presenting diagnosis results to the user through MR visualization techniques [21]. Among the challenges in the implementation of MR techniques is the difficulty of synchronizing and aligning the camera movement, the operator environment and the virtual objects inserted in the same coordinate system; in other words, the alignment and synchronization between the real and the virtual. Another important aspect is the presentation device; this decision should take into consideration system mobility, noise and illumination in the industrial environment. Among the possibilities for this work, the use of an HMD, Tablet-PC and/or PDA is considered. Having described the context, the next section presents works found in the literature. More specific research concerning each of the research areas can be found in [10][9][14][1].
3 RELATED WORKS
Both Intelligent Maintenance and Mixed/Augmented Reality systems in the industrial context [22] represent precursory solutions in their respective research areas [21][1]. For this reason, some events on Mixed/Augmented Reality treat the use of MR/VR in industry as a "potential application" [6].
Among the initiatives found are the ARVIKA [7] and AMIRE [8] projects and STARMATE [23]. ARVIKA (Augmented Reality für Entwicklung, Produktion und Service) is a German AR project for development, production and services, applied in manufacturing industries and led by Siemens. AMIRE (Authoring Mixed Reality) is a European AR project that aims to enable researchers who are not experts in Mixed Reality, which differentiates it from most of the solutions according to [8]. Among the solutions developed in this project there is a petroleum industry application that uses the ARToolkit library to superimpose virtual objects upon the real scene [24]. The main areas where Mixed/Augmented Reality technologies are applied in industry are collaborative projects [2], training [16], maintenance [26] and design [20]. The increase in product complexity demands support systems and databases that are ever more powerful, to organize information and facilitate access for development teams, so that the right information can be brought to the right place at the right moment. In the works regarding MR/AR application in maintenance, most use HMDs (Head Mounted Displays) to visualize the real equipment, supported by the superimposition of virtual elements. Maintenance information can be requested through several kinds of interaction; among the solutions found, voice command is an interesting alternative, since the hands of the maintenance operator remain free to manipulate the equipment [3]. However, solutions for diagnosis and prognosis in maintenance that use AR techniques are rare in the literature. This work proposes the use of markers¹ for tracking the scene and superimposing virtual information. Besides, the development of repair instructions and maintenance guides through texts and 3D objects will be possible; videos of equipment assembly, 3D models with animation sequences and audio techniques for interaction will also be available. As a result of this study, the IM:MR tool for the integration of different information can shorten maintenance downtime and allow collaboration among several professionals. The IM:MR tool presented in the next section targets the integration of predictive system information with mixed environment visualization.
4 IM:MR TOOL
For the design of this tool, a conceptual model was first developed that presents the stages and modules necessary for its implementation. In Figure 1 the predictive and mixed reality modules can be observed, that is, the IM and MR systems. These modules supply the input for the MDI (Integration Descriptive Model). The MDI is responsible for generating a descriptive model in XML that integrates the different information supplied by the systems and presents this information in the visualization interface. The red rectangle indicates the focus of the implementations of this tool. The following services will be supplied by the interface: simulation, task guide, internal visualization of the component and 3D video of assembly and disassembly. The researched intelligent maintenance solutions can be classified into systems, platforms and standards; these solutions were developed for industry in order to facilitate and provide the user with the necessary resources for the implementation of intelligent maintenance systems [19][18]. The IMS system was chosen as the solution for the predictive system module of this tool; further details can be found in [11]. This system has a core component, called the Watchdog Agent [5], that evaluates the working state of machines and equipment. Figure 1. Conceptual model. The IMS system consists of prognostic algorithms embedded with a group of software tools to predict component and system failure. This degradation calculation is based on multiple sensor readings which measure the critical properties of the process/machinery.
¹ Markers are standard patterns printed on paper and placed in the scene for position tracking through the image processing techniques used in the ARToolkit library.
This system has routines developed for each part of the predictive system module proposed in the conceptual model. Once the system data have been acquired, the algorithms of each stage can be used and changed. For the implementation of the MR system and of the interface, the ARToolkit library and the C++ language on the Visual Studio platform were used. For the 3D modelling, VRML was chosen, since current CAD software has CAD-VR conversion resources; this facilitates the task of creating the 3D virtual model. In the simulation interface it is possible to select the components that are being monitored and to visualize the respective virtual component. For instance, when the temperature value of a component increases, this is highlighted in the virtual model through colour change and indicative arrows. The component health visualization interface allows the user to visualize the signals from intelligent maintenance for the component selected in the interface or flagged by the IM system. The task guide presents textual information based on specialist knowledge according to the maintenance tasks informed by the operator. The internal visualization of the component seeks to present the virtual model aligned with the real model using the X-ray technique. The 3D video service is provided by CAD tools and presents assembly and disassembly videos that are integrated into the solution, i.e., these videos are not generated by the IM:MR tool, but integrated.
Figure 2 presents the interfaces of the IM:MR tool for the implementation of component health visualization. In Figure 2 it is possible to observe the confidence value of node 2 of temperature (node: valve component) supplied by the intelligent maintenance system; in other words, the user selects the component (nodes list), the signal being monitored (temperature, vibration or pressure) and the type of information available from the IM system (confidence value, statistical pattern recognition or logistic regression).
Figure 2. Visualization interface of component health. The XML description generates a hierarchical model of nodes corresponding to the equipment components. Each node in the XML model contains the virtual information, the CAD information and the information supplied by the IM system. With that, the interface can organize the information from the several systems in an intuitive and explanatory way, helping the operator during maintenance tasks. The next section presents the results of the case study as well as the tools used for the IM and MR modules in this case.
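The paper does not reproduce its XML schema, so the following Python sketch only illustrates, with assumed element and attribute names, how such a node hierarchy combining CAD, MR and IM information could be generated for the MDI:

```python
import xml.etree.ElementTree as ET

# Illustrative node hierarchy; element and attribute names are assumptions,
# not the exact schema used by the IM:MR tool.
equipment = ET.Element('equipment', name='valve_station')
node = ET.SubElement(equipment, 'node', id='2', component='valve')
ET.SubElement(node, 'cad', file='valve.wrl')            # VRML model converted from CAD
ET.SubElement(node, 'virtual', marker='marker_02')      # MR marker bound to this component
im = ET.SubElement(node, 'im')
ET.SubElement(im, 'signal', type='temperature',
              method='confidence_value', value='0.83')  # information from the IM system

ET.ElementTree(equipment).write('immr_model.xml', encoding='utf-8', xml_declaration=True)
```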
5 RESULTS
Since the proposal is an integration between Intelligent Maintenance and Mixed Reality, solutions found in the literature that contemplate both themes are rare. However, individual solutions for each theme are numerous, which makes evident the relevance of a tool that joins both technologies. Figure 3 presents the physical implementation of the IM system module proposed in the conceptual model. This experiment is part of a case study with industrial valves.
Figure 3. Physical implementation of the IM system module.
In this experiment a rubber pipe was introduced at the valve inlet to simulate a fault in the opening and closing of the valve. Along with the Watchdog Agent, a Matlab toolbox (Figure 4) is supplied for the use of prediction methods, signal processing, performance assessment, diagnosis and prognosis. With this system, some graphs were generated for results analysis and integrated into the developed IM:MR tool; among these results, the degradation of the opening and closing behaviour of the valve was clearly noticeable in the confidence value graph. Mixed Reality immersion devices depend on the application; for industrial cases it is important to consider the illumination and the noise of the plant. Another fundamental characteristic to be considered is the mobility of the system, which should be easily carried by the maintenance operator. In this study the tests were accomplished with a desktop device; later, a video-based HMD with a coupled camera will be used, and in future tests it is also intended to explore visualization aspects with PDAs and Tablet-PCs. For the processing of the maintenance and visualization data, it is intended to use a mobile device (notebook or Tablet-PC). Figure 4. Matlab toolbox of the IMS system. Having presented the modules, tools and devices considered for this application, the immersive visualization module is briefly described. At first, each sensor should correspond to a marker type registered in ARToolkit; this way the user can access the different information of the components by activating the corresponding marker. Besides, other markers should be inserted for the visualization of the machine virtual model and the maintenance guide. The ARToolkit library works as a mixer of virtual and real elements; however, the virtual elements should not appear automatically (at the moment of marker visualization), as is the library default, but on request through voice commands or the activation of some input device by the user. First it is necessary to capture the temperature and pressure data (monitored data) processed by the IM system; after this, the MR system superimposes this information in order to generate the mixed environment, with 2D graphs that represent the signal behaviour and virtual elements that indicate the data state. In Figure 5 the virtual elements are the smileys. The predictive system saves text and graph files at a standard place from time to time, so it is only necessary to fetch this information.
Figure 5. Markers for immersive visualization and data integration. The MR system acquires the intelligence of the predictive system at the moment a fault occurs, and the system warns about the danger without the user's intervention. Thus, when there is danger of failure, the visualization system is warned by the predictive system. The use of markers for immersive visualization and data integration is illustrated by Figure 5, where each marker is related to one virtual component, one graph and the real component.
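As a purely illustrative sketch (this is not the ARToolkit API, only the integration logic around it, with assumed names), the marker-to-content association and the IM warning described above could be organised as follows:

```python
# Mapping between tracked markers and the content overlaid for the operator;
# marker IDs, file names and fields are hypothetical.
MARKER_CONTENT = {
    'marker_02': {'component': 'valve',
                  'graph': 'confidence_value.png',
                  'model': 'valve.wrl'},
}

def content_for(marker_id: str, im_alarm: bool):
    """Return what the MR layer should overlay for a detected marker."""
    entry = MARKER_CONTENT.get(marker_id)
    if entry is None:
        return None
    overlay = dict(entry)
    overlay['warning'] = im_alarm   # highlight the component when the IM system flags danger
    return overlay

print(content_for('marker_02', im_alarm=True))
```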
6 CONCLUSIONS
This paper briefly described a tool for the integration of MR/AR resources with an intelligent maintenance system. In the first phase, a conceptual model was developed as a solution for this proposal, in which the predictive system used (Watchdog Agent) was capable of diagnosing and interpreting the equipment use conditions.
In the second stage, the diagnosis data were transmitted to the MR interface developed with the ARToolkit library, where the maintenance instructions are presented as well as the 3D virtual information. These data will help in the process of virtual assembly/disassembly and, at the same time, in maintenance tasks. This tool is indicated for critical equipment that must not stop and that, when it does stop, causes great financial damage for each minute of downtime. Therefore, it is necessary to define which components of the plant are critical, in order to establish the viability in terms of the implementation cost of this solution; it is not always economically viable to implement the predictive system module, and in many cases it is preferable to replace the component at the moment of failure. However, the aid of the MR resources represented by the MR system module is a low-cost alternative whose gains represent a competitive advantage whatever maintenance system is used. Finally, it is important to point out that this tool was applied in a case study of industrial valves; however, once the XML model (the MDI) is described, the tool can be applied to any application that needs to integrate data of different formats.
7 REFERENCES
1 Appel M. and Navab N., (2003). Industrial Augmented Reality (IAR): Challenges in Design and Commercialization of Killer Apps, In Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR '03).
2 Baratoff G. and Regenbrecht H., (2004). Developing and Applying AR Technology in Design, Production, Service, and Training, In Virtual and Augmented Reality Applications in Manufacturing, S.K. Ong and A.Y.C. Nee, eds., Springer, 2004, pp. 207-236.
3 Behringer R. et al., (1999). A distributed device diagnostics system utilizing augmented reality and 3D audio, ACM Computers & Graphics, 23, pp. 821-825.
4 Comport A.I., Marchand E., Chaumette F. (2003) A real-time tracker for markerless augmented reality. In Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), pp. 36-45.
5 Djurdjanovic D., Lee J. and Ni J. (2003). Watchdog Agent, an infotronics-based prognostics approach for product performance degradation assessment and prediction. In Advanced Engineering Informatics, 2003, Elsevier: pp. 109-125.
6 Fiorentino M., De Amicis R., Monno G., Stork A. (2002). Spacedesign: A Mixed Reality Workspace for Aesthetic Industrial Design, ISMAR 2002, pp. 86-94.
7 Friedrich W., Jahn D., Schmidt L., (2002). ARVIKA - Augmented Reality for Development, Production and Service, In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR'02), pp. 3-4.
8 Grimm P., Haller M., Paelke V., Reinhold S., Reimann C., Zauner J., (2002). AMIRE - Authoring Mixed Reality, In The First IEEE International Augmented Reality Toolkit Workshop, 29 September 2002, Darmstadt, Germany.
9 Haringer M. and Regenbrecht H., (2002). A pragmatic approach to Augmented Reality Authoring, In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR'02), pp. 237-245.
10 Léger J-B., (2004). A case study of remote diagnosis and e-maintenance information system. Invited keynote paper for IMS'2004, International conference on intelligent maintenance systems, Arles, France, 2004.
11 Lee J. and Ni J. (2004). Infotronics-based intelligent maintenance system and its impacts to closed-loop product life cycle systems. Invited keynote paper for IMS'2004, International conference on intelligent maintenance systems, Arles, France, 2004.
12 Lee W. and Park J., (2005). Augmented Foam: A Tangible Augmented Reality for Product Design, In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR'05), pp. 106-109.
13 Macchiarella M. and Vincenzi D., (2004). Augmented Reality in a Learning Paradigm for Flight and Aerospace Maintenance Training, in: Digital Avionics Systems Conference, 2004 (DASC 04), The 23rd IEEE, pp. 5.D.1-5.19.
14 Mathew et al., (2006) Reducing maintenance cost through effective prediction analysis and process integration. In: Advances in Vibration Engineering 2006 5(2): pp. 97-96.
15 Milgram P. and Kishino F. (1994). A Taxonomy of Mixed Reality Visual Displays, In IEICE Transactions on Information Systems, Vol. E77-D, no. 12, Dec. 1994.
16 Nakajima C. and Itho N., (2003). A Support System for Maintenance Training by Augmented Reality, In Proceedings of the 12th International Conference on Image Analysis and Processing (ICIAP'03), pp. 158-163.
17 Navab N., (2004). Developing Killer Apps for Industrial Augmented Reality. In IEEE Computer Graphics and Applications, May/June 2004, pp. 16-20.
18 Peysson F. et al. (2007). New Approach to Prognostic Systems Failures. In: Proceedings of the 17th IFAC World Congress.
19 Provan G., (2003). Prognosis and condition-based monitoring: an open systems architecture. In: Proceedings of the fifth IFAC symposium on fault detection, supervision and safety of technical processes, pp. 57-62.
20 Regenbrecht H., Baratoff G. and Wilke W., (2005). Augmented Reality Projects in the Automotive and Aerospace Industries, Published by the IEEE Computer Society, IEEE.
21 Roemer M., Byington C., Kacprzynski G., (2005). An overview of selected prognostic technologies with reference to an integrated PHM architecture. In: International Forum on Integrated System Health Engineering, United States, 2005.
22 Schwald B. and Laval B., (2003). An Augmented Reality System for Training and Assistance to Maintenance in the Industrial Context, In The 11th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision 2003, Plzen, Czech Republic.
23 Schwald B., Figue J., Chauvineau E., et al., (2001). STARMATE: Using Augmented Reality Technology for Computer Guided Maintenance of Complex Mechanical Elements. In E-work and E-Commerce, vol. 1, pp. 196-202, IOS Press, 2001.
24 Träskbäck M. and Haller M., (2004). Mixed reality training application for an oil refinery: user requirements, In ACM SIGGRAPH International Conference on Virtual Reality Continuum and its Applications in Industry (VRCAI 2004), pp. 324-327, Singapore (2004).
25 Weidenhausen J., Knoepfle C., Stricker D., (2003). Lessons learned on the way to industrial augmented reality applications, a retrospective on ARVIKA, In Computers & Graphics, 27, pp. 887-891.
26 Zenati N. et al., (2004). Assistance to Maintenance in Industry Process Using an Augmented Reality System, In International Conference on Industrial Technology (ICIT), IEEE.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THEORETICAL STUDY OF PIEZOELECTRIC BIMORPH BEAMS WITH TWO INPUT BASE-MOTION FOR POWER HARVESTING
Mikail F. Lumentut, Ian M. Howard
Theoretical Mechanics, Department of Mechanical Engineering, Curtin University of Technology, Australia
This paper presents a dynamic model of a piezoelectric bimorph beam with a tip mass for low level power harvesting. The piezoelectric bimorph beam is modelled as an Euler-Bernoulli beam with two input transversal and longitudinal base excitations. The strain field due to the longitudinal base input excitation can affect the piezoelectric response parameters although the transverse bending field has most often been considered in the use of the cantilevered piezoelectric bimorph in stimulating polarity and electric field for the energy harvester. The piezoelectric bimorph beam with centre brass shim can be analysed using series and parallel connections depending on the piezoelectric coupling and electric field parameters. The extracted power from the piezoelectric bimorph beam can be used for the powering of electronic storage devices, electronic media and wireless sensors. In this paper, we propose analytical methods for developing constitutive energy field differential equations using virtual work concepts (Weak form) from the interlayer elements of the piezoelectric bimorph beam. Analytical solutions of the constitutive dynamic equations from longitudinal extension, transverse bending and electrostatic fields are solved using Laplace transforms to obtain transfer functions between their relationships. Keywords : piezoelectric bimorph, power, base excitations, Euler-Bernoulli’s beam, Laplace transforms 1. INTRODUCTION The study of power harvesters utilising piezoelectric effects under dynamic motions has been of great interest for many researchers over the past fifteen years. The piezoelectric system is not a new field with most developments of embedded piezoelectric structures focusing on applications of health condition monitoring for rotating machinery and control systems. However, the piezoelectric effect can benefit the conversion of ambient vibration into low electrical power. The energy extracted from the vibrating environments can be utilised through an electronic circuit capable of supplying current into a rechargeable battery or electrical power storage device, for powering wireless communication, Sodano et al [1], Anton and Sodano [2]. The concept of power harvesting from the bio-mechanics field was first discussed by Starner [3] in which the generated power through piezoelectric components could be captured from the human body such as during the walking operation or chest breathing motion. Further research of the application of power harvesting from human walking was discussed by Kymissis et al [4]. The application of piezoelectric PVDF and PZT components mounted between the insole and rubber sole of shoes can generate power from footsteps during walking. Starner and Paradiso [5] further discussed the scavenging power generation from the human body for mobile computing applications. The trends in recovering power harvested from the human body utilise the body motion as a continuous power generator due to the mechanical contraction, Paradiso [6]. Similar work of scavenging power from human walking was done by Mateu et al [7] using capacitors as storage media with control and regulation for charging load. Recent work on implanting bio-medical devices to capture power from blood pressure fluctuations within the human body has been developed by Clark and Ramsay [8]. 
Some recent applications of power harvesting using cantilevered piezoelectric beams under dynamic input motion coupling with power electronic components have been an attractive field to be investigated both mathematically and experimentally by researchers. Sodano et al [9] experimentally proposed three piezoelectric devices (Quick Pack, MFC and Quick Pack IDE) bounded onto a cantilevered aluminium beam. As a result, Quick Pack had the highest power harvesting, then followed by Quick Pack IDE and the flexible MFC. Elvin and Elvin [10] discussed the cantilevered unimorph piezoelectric beam under base-excitation using the Rayleigh-Ritz method, coupled with an AC-DC rectifier and storage capacitor.
Jeon et al [11] discussed the micro power harvester from the cantilever piezoelectric model using the interdigitated electrode with the 3-3 piezoelectric operation mode. The generated AC current from the cantilever piezoelectric beam was rectified using a bridge rectifier to obtain DC current where it was then stored using a capacitor. Similar techniques were given by Song et al [12], in which cantilever piezoelectric beam models were made from macro-fiber composite (MFC) using 3-1 (d31) and 3-3 (d33) poling directions. The MFC with 3-3 poling direction gave very high power harvesting and electric voltage compared with the 3-1 poling direction. On the other hand, MFC with 3-1 poling gave relatively high current, which was very profitable for power storage. Dutoit et al [13] investigated the strain effect from mechanical vibration of the cantilever piezoelectric beam structure under transverse-base motion. In the analytical solution, the short and open circuit models with the 3-1 and 3-3 piezoelectric poling modes were optimised to obtain power harvesting based on the frequency responses under various load impedances. This situation depended on the series and parallel connection systems of the bimorph piezoelectric beam as discussed by Dutoit et al [14]. Several investigations of power harvesting based on varying load impedances using cantilever piezoelectric models under base motion have been discussed by Ferrari et al [15]. Erturk & Inman [16, 17] discussed an analytical model for optimising power vibration energy using the unimorph and bimorph cantilever piezoelectric beam under two base excitations, transverse displacement and small rotation. The trends of power harvesting versus the frequency response for both analytical and experimental results were investigated according to the various load impedances under base input transverse excitation. Kuehne et at [18] discussed a MEMS scale circular piezoelectric plate diaphragm under dynamic motion with an inertia mass attached at the middle surface of the piezoelectric element. The Ritz method was used to establish the dynamic equation of the clamped circular piezoelectric plate to determine the charge and voltage. Renaud et al [19] discussed a unimorph piezoelectric beam under input impact load to generate the electric voltage. The impact load was a slider to shock the tip of the piezoelectric element. The electric voltage obtained from this situation depended on the properties of the piezoelectric conversion factor and the thickness of the structure where the dielectric constant was not included. Shu and Lien [20] discussed a cantilevered bimorph piezoelectric beam coupled with an electronic circuit under dynamic input force. They derived the analytical method to obtain the non-dimensional normalized parameters of displacement, voltage and electrical power where the formulations were used to obtain the optimal parameter functions. This paper presents the study of analytical dynamic behaviour of strain-polarity-electric field effects on piezoelectric bimorphs under two input base-motion. The piezoelectric bimorph beam with a tip mass includes not only the pure bending transverse effect like previous published papers, but also the presence of a longitudinal extension effect on the interlayer elements. The piezoelectric bimorph was subject to both input base transversal and longitudinal excitations. 
The input base transverse excitation gave the strongest influence to affect the strain field from transverse bending in creating polarity-electric field to generate power harvesting for the cantilevered piezoelectric bimorph beam. However the presence of the input longitudinal excitation should not just be ignored. This paper also established the mathematical concept of the constitutive matrix dynamic equations. The parameters of polarity, electric field and mechanical strain of the piezoelectric bimorph also considered the series and parallel connections. In this paper, the numerical results for electromechanical vibration analysis of a piezoelectric bimorph are presented from calculations iterated using MATLAB.
2. MATHEMATICAL FOUNDATION
A piezoelectric bimorph beam with a centre brass shim was modelled with input base transverse and longitudinal excitations. In this case, the strain energy from the coupled mechanical and electric fields (converse effect), the electrical energy from the electrostatic energy of the piezoelectric material (direct effect) and the kinetic energy were used to formulate the constitutive dynamic equations of the piezoelectric bimorph by applying the weak form of the Hamiltonian theorems. After simplifying the complex equations, the constitutive equation of the piezoelectric bimorph can be stated as,

$$
\int_0^T \Bigg\{ \int_0^L \Big[ C_{11}^{(D,k)}\,\varepsilon_{xx}^{(0)}\,\frac{\partial\,\delta u_{rel}}{\partial x} + R_{31}^{(H,k)}\,v(t)\,\frac{\partial^2 \delta w_{rel}}{\partial x^2} + C_{11}^{(F,k)}\,\varepsilon_{xx}^{(1)}\,\frac{\partial^2 \delta w_{rel}}{\partial x^2} - R_{31}^{(G,k)}\,v(t)\,\frac{\partial\,\delta u_{rel}}{\partial x} + I^{(A,k)}\,\ddot{u}_{rel}\,\delta u_{rel} + I^{(A,k)}\,\ddot{w}_{rel}\,\delta w_{rel} \Big]\,dx \\
+ \int_{-L_o/2}^{L_o/2} \Big[ I_{tip}^{(A)}\,\ddot{u}_{rel}(L)\,\delta u_{rel}(L) + I_{tip}^{(C)}\,\frac{\partial \ddot{w}_{rel}(L)}{\partial x}\,\frac{\partial\,\delta w_{rel}(L)}{\partial x} + I_{tip}^{(A)}\,\ddot{w}_{rel}(L)\,\delta w_{rel}(L) \Big]\,dx \\
- \int_0^L \Big[ R_{31}^{(G,k)}\,\frac{\partial u_{rel}}{\partial x} + R_{31}^{(H,k)}\,\frac{\partial^2 w_{rel}}{\partial x^2} \Big]\,\delta v(t)\,dx \; - \; S_{33}^{(k)}\,v(t)\,\delta v(t) \\
+ \int_{-L_o/2}^{L_o/2} \Big[ I_{tip}^{(A)}\,\ddot{u}_{base}(L)\,\delta u_{base}(L) + I_{tip}^{(C)}\,\frac{\partial \ddot{w}_{base}(L)}{\partial x}\,\frac{\partial\,\delta w_{base}(L)}{\partial x} + I_{tip}^{(A)}\,\ddot{w}_{base}(L)\,\delta w_{base}(L) \Big]\,dx \\
+ \int_0^L \Big[ I^{(A,k)}\,\ddot{u}_{base}\,\delta u_{base} + I^{(A,k)}\,\ddot{w}_{base}\,\delta w_{base} \Big]\,dx \Bigg\}\,dt = 0. \tag{1}
$$
Solutions of equation (1) can be obtained using eigenfunction series forms and the solutions must meet continuity and also boundary conditions of the piezoelectric bimorph beam under longitudinal extension and transverse bending effects. The solution forms can be prescribed as,
$$
w_{rel}(x,t) = \sum_{rel=1}^{m} w_{rel}(t)\,\Psi_{rel}(x)\,; \qquad u_{rel}(x,t) = \sum_{rel=1}^{m} u_{rel}(t)\,\Theta_{rel}(x). \tag{2}
$$
The parameters $\Psi(x)$ and $\Theta(x)$ indicate the mode shapes, or normal modes, of the eigenfunction series, which can be determined using analytical solution forms for the cantilevered piezoelectric beam with a tip mass. It should be noted that $\Psi_{base}(x) = \Psi_{rel}(x)$ and $\Theta_{base}(x) = \Theta_{rel}(x)$, because the subscripts "rel" and "base" define relative motion and base motion for the bimorph beam with the same characteristics of the dynamic equations, in order to meet the continuity of the mechanical form or strain field. It should also be noted that $\Psi(x)$ and $\Theta(x)$ are independent continuous functions derived from the mechanical forms. Corresponding with equation (2), equation (1) can be formulated according to the eigenfunction series forms by setting the virtual displacement forms $\delta u_{rel}(t)$, $\delta w_{rel}(t)$ and $\delta v(t)$ separately to obtain three independent dynamic equations. The virtual displacements must satisfy the du Bois-Reymond lemma so that only the dynamic equations have solutions. At this point, three dynamic equations can be formulated for the piezoelectric bimorph beam. The constitutive dynamic equations can be reformulated in matrix form, including the damping coefficients, as,

$$
\begin{bmatrix} M_{AA} & 0 & 0\\ 0 & M_{BB} & 0\\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} \ddot{u}_{rel}(t)\\ \ddot{w}_{rel}(t)\\ \ddot{v}(t) \end{bmatrix}
+
\begin{bmatrix} C_{AA} & 0 & 0\\ 0 & C_{BB} & 0\\ P_U & P_W & P_D \end{bmatrix}
\begin{bmatrix} \dot{u}_{rel}(t)\\ \dot{w}_{rel}(t)\\ \dot{v}(t) \end{bmatrix}
+
\begin{bmatrix} K_{AA} & 0 & P_U\\ 0 & K_{BB} & P_W\\ 0 & 0 & R_L \end{bmatrix}
\begin{bmatrix} u_{rel}(t)\\ w_{rel}(t)\\ v(t) \end{bmatrix}
=
\begin{bmatrix} -Q_U\,\ddot{u}_{base}(t)\\ -Q_W\,\ddot{w}_{base}(t)\\ 0 \end{bmatrix}, \tag{3}
$$
where:
$$ M_{AA} = \int_0^L I^{(A,k)}\,\Theta_m(x)\,\Theta_n(x)\,dx + \int_{-L_o/2}^{L_o/2} I_{tip}^{(A)}\,\Theta_m(L)\,\Theta_n(L)\,dx\,, \tag{4} $$
$$ M_{BB} = \int_0^L I^{(A,k)}\,\Psi_m(x)\,\Psi_n(x)\,dx + \int_{-L_o/2}^{L_o/2} \Big[ I_{tip}^{(A)}\,\Psi_m(L)\,\Psi_n(L) + I_{tip}^{(C)}\,\frac{d\Psi_m(L)}{dx}\,\frac{d\Psi_n(L)}{dx} \Big]\,dx\,, \tag{5} $$
$$ K_{AA} = \int_0^L C_{11}^{(D,k)}\,\frac{d\Theta_m(x)}{dx}\,\frac{d\Theta_n(x)}{dx}\,dx\,; \qquad K_{BB} = \int_0^L C_{11}^{(F,k)}\,\frac{d^2\Psi_m(x)}{dx^2}\,\frac{d^2\Psi_n(x)}{dx^2}\,dx\,, \tag{6} $$
$$ P_U = -\int_0^L R_{31}^{(G,k)}\,\frac{d\Theta_n(x)}{dx}\,dx\,; \qquad P_W = \int_0^L R_{31}^{(H,k)}\,\frac{d^2\Psi_n(x)}{dx^2}\,dx\,, \tag{7} $$
$$ R_L = \frac{1}{R_{load}}\,, \tag{8} $$
$$ Q_U = \int_0^L I^{(A,k)}\,\Theta_n(x)\,dx + \int_{-L_o/2}^{L_o/2} I_{tip}^{(A)}\,\Theta_n(L)\,dx\,, \tag{9} $$
$$ Q_W = \int_0^L I^{(A,k)}\,\Psi_n(x)\,dx + \int_{-L_o/2}^{L_o/2} \Big[ I_{tip}^{(A)}\,\Psi_n(L) + I_{tip}^{(C)}\,\frac{d\Psi_n(L)}{dx} \Big]\,dx\,, \tag{10} $$
$$ P_D = -S_{33}^{(k)}\,; \qquad C_{AA} = \alpha_A M_{AA} + \beta_A K_{AA}\,; \qquad C_{BB} = \alpha_B M_{BB} + \beta_B K_{BB}\,. \tag{11} $$
The piezoelectric bimorph has symmetrical geometry and the same material in the upper and lower layers with the brass centre shim. Corresponding with equations (4) to (10), each coefficient term can be formulated. Mass moment of inertia of the piezoelectric bimorph was given as, I ( A , k ) = bh
p
r ( A ,1 ) + bh s r ( A , 2 ) + bh
p
r (A ,3 ) .
(12)
Mass moment of inertia of the proof mass can also be formulated as, (A ) I tip =
.
A sh tip r tip
(13)
By considering rotary inertia at the centre of the proof mass, we have, h tip s 2 (C ) I tip = 3
3 C r tip
h tip s 2 + 3
3
3 C r C = s h tip r tip tip 12
.
(14)
Superscripts A and C indicate properties of mass moment of inertia for first and third terms (rotary inertia), respectively. The extensional stiffness coefficient of the piezoelectric bimorph can be stated as, C 11( D , k ) = 2 bh
p
Q 11( D ,1 ) + bh s Q 11( D , 2 )
.
(15)
The transverse modulus of elastic constants can be formulated as, C 11( F
,k
) = b 2 h 3
p
+
hs 2
3
-
hs3 12
Q
F ,1 11
+
3
bh
s
4
Q
F ,2 11
.
(16)
Superscripts D and F indicate the properties of stiffness coefficients for longitudinal extension and transverse bending, respectively. The piezoelectric bimorph can be arranged into series and parallel connections. When the piezoelectric bimorph beam is arranged for series connection, two poling vectors cross in the piezoelectric material (upper and lower layers), which is X-poled (opposite polarizations between upper and lower layers) due to the transverse bending term and Y-poled (same polarizations between upper and lower layers) due to the longitudinal extensional term. On the other hand, the parallel connection with the same material can also have two poling effects given by the Y-poling due to the transverse bending term and X-poling due to the longitudinal extension term. In this case, the cantilever piezoelectric bimorph with two input excitations was taken into consideration for two connection types. Case I. Series connection. a) Piezoelectric coupling for X-poling due to transverse bending form can be formulated as, (H ,k ) R 31 = -
2 h p hs b h p + 2h p 2 2
e ( H ,1 ) - b 31 2h p
h p 2 h p hs + 2 2
2 e (H ,3 ) = - b h p + h p h s 31 hp 2 2
e 31
.
(17)
b) Piezoelectric coupling for Y-poling due to longitudinal extension can also be stated as, (G , k ) R 31 =
(G ,1 ) bh p e 31
2h p
+
(G , 3 ) bh p e 31
2h p
= be 31
.
(18)
The capacitance of the piezoelectric element was considered as,

S_{33}^{(k)} = \frac{b L h_p \beta_{33}^{(1)}}{4 h_p^2} + \frac{b L h_p \beta_{33}^{(3)}}{4 h_p^2} = \frac{b L \beta_{33}}{2 h_p} .   (19)

It should be noted that the upper and lower layers of the piezoelectric bimorph have the same material and geometrical structure, thus the permittivity of the piezoelectric element will be \beta_{33}^{(1)} = \beta_{33}^{(3)} = \beta_{33}.

Case II. Parallel connection.

a) The piezoelectric coupling for X-poling due to longitudinal extension can be formulated as,

R_{31}^{(G,k)} = b e_{31}^{(G,1)} + b e_{31}^{(G,3)} = 2 b\, e_{31} .   (20)
b) The piezoelectric coupling for Y-poling due to the transverse bending form can be formulated as,

R_{31}^{(H,k)} = -\frac{b}{h_p}\left(\frac{h_p^2}{2} + \frac{h_p h_s}{2}\right) e_{31}^{(H,1)} - \frac{b}{h_p}\left(\frac{h_p^2}{2} + \frac{h_p h_s}{2}\right) e_{31}^{(H,3)} = -\frac{2b}{h_p}\left(\frac{h_p^2}{2} + \frac{h_p h_s}{2}\right) e_{31} .   (21)

The capacitance of the piezoelectric element for the parallel connection was given by,

S_{33}^{(k)} = \frac{b L h_p \beta_{33}^{(1)}}{h_p^2} + \frac{b h_p L \beta_{33}^{(3)}}{h_p^2} = \frac{2 b L \beta_{33}}{h_p} .   (22)
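As a quick numerical illustration of equations (17) to (22), the sketch below evaluates the coupling and capacitance coefficients for the series and parallel connections using the bimorph geometry listed later in Table 1. The piezoelectric stress constant e31 is not quoted in the paper; approximating it as d31·Q is an assumption made only for this example.

```python
import numpy as np

# Bimorph geometry and material data from Table 1
b, L = 6.4e-3, 29e-3          # width and length (m)
h_p, h_s = 0.19e-3, 0.13e-3   # piezoelectric layer and brass shim thicknesses (m)
d31, Q11 = -190e-12, 66e9     # strain constant (m/V) and elastic constant (Pa)
beta33 = 1800 * 8.854e-12     # permittivity (F/m)

e31 = d31 * Q11               # assumed plane-stress approximation of the stress constant

# Transverse-bending coupling, Eqs. (17) and (21)
R31_H_series = -(b / h_p) * (h_p**2 / 2 + h_p * h_s / 2) * e31
R31_H_parallel = 2 * R31_H_series

# Longitudinal-extension coupling, Eqs. (18) and (20)
R31_G_series, R31_G_parallel = b * e31, 2 * b * e31

# Internal capacitance, Eqs. (19) and (22)
S33_series = b * L * beta33 / (2 * h_p)
S33_parallel = 2 * b * L * beta33 / h_p

print(f"series:   S33 = {S33_series:.2e} F, R31(H) = {R31_H_series:.2e}, R31(G) = {R31_G_series:.2e}")
print(f"parallel: S33 = {S33_parallel:.2e} F, R31(H) = {R31_H_parallel:.2e}, R31(G) = {R31_G_parallel:.2e}")
```

Note that, per these expressions, the parallel connection doubles the coupling terms and quadruples the internal capacitance relative to the series connection.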
Superscripts G and H represent the piezoelectric coupling properties for longitudinal extension and transverse bending, respectively. Superscript k indicates the layers of the piezoelectric bimorph. Equation (3) can be solved using Laplace transforms (LT) to obtain the transfer function (TF), H(s), and the frequency response function (FRF), H(jω). In this case, the dynamic equations of the piezoelectric bimorph can be formulated in the form of a FRF matrix as,
H(j\omega) = \begin{bmatrix} H(j\omega)_{11} & H(j\omega)_{12} \\ H(j\omega)_{21} & H(j\omega)_{22} \\ H(j\omega)_{31} & H(j\omega)_{32} \end{bmatrix} .   (23)

The first term of the FRF represents longitudinal motion with respect to input longitudinal acceleration. In this case, the input transverse function is set to zero and the FRF for base-input longitudinal motion is given by,

H(j\omega)_{11} = \left.\frac{U(j\omega)}{\ddot{u}_{base}}\right|_{s=j\omega} = \frac{\left[(M_{BB} R_L \omega^2 + \omega^2 P_D C_{BB} - K_{BB} R_L) + j(P_D M_{BB}\omega^3 - \omega C_{BB} R_L - \omega P_D K_{BB} + \omega P_W^2)\right] Q_U(j\omega)}{Z(j\omega)} .   (24)
Equation (24) can be modified to obtain the FRF as a function of the position of the piezoelectric element, (x), and frequency, jω, by transforming it back into the eigenfunction. This then yields,

H(x, j\omega)_{11} = \frac{\left[(M_{BB} R_L \omega^2 + \omega^2 P_D C_{BB} - K_{BB} R_L) + j(P_D M_{BB}\omega^3 - \omega C_{BB} R_L - \omega P_D K_{BB} + \omega P_W^2)\right] Q_U(j\omega)\, Q_{rel}(x)}{Z(j\omega)} .   (25)

If base-input longitudinal acceleration is not applied, the FRF of base-input transverse motion can be obtained as,

H(j\omega)_{12} = \left.\frac{U(j\omega)}{\ddot{w}_{base}}\right|_{s=j\omega} = \frac{j\omega P_W P_U Q_W(j\omega)}{Z(j\omega)} .   (26)

Equation (26) can be modified in terms of the FRF as a function of position in the piezoelectric element, (x), and frequency, jω, as,

H(x, j\omega)_{12} = -\frac{j\omega P_W P_U Q_W(j\omega)\, Q_{rel}(x)}{Z(j\omega)} .   (27)
The second FRF is the transverse motion with respect to input longitudinal acceleration. If base-input transverse motion is ignored, the FRF of base-input longitudinal motion can be obtained as,

H(j\omega)_{21} = \left.\frac{W(j\omega)}{\ddot{u}_{base}}\right|_{s=j\omega} = -\frac{j\omega P_U P_W Q_U(j\omega)}{Z(j\omega)} .   (28)

Figure 1. Cantilevered piezoelectric bimorph beam with two input base longitudinal and transverse excitations under series connection

Equation (28) can be modified in terms of the FRF as a function of piezoelectric position, (x), and frequency, (jω), as,

H(x, j\omega)_{21} = -\frac{j\omega P_U P_W Q_U(j\omega)\, Y_{rel}(x)}{Z(j\omega)} .   (29)
The FRF of transverse displacement with respect to base-input transverse acceleration can be obtained as,

H(j\omega)_{22} = \left.\frac{W(j\omega)}{\ddot{w}_{base}}\right|_{s=j\omega} = \frac{\left[(M_{AA} R_L \omega^2 + \omega^2 P_D C_{AA} - K_{AA} R_L) + j(P_D M_{AA}\omega^3 - \omega C_{AA} R_L - \omega P_D K_{AA} + \omega P_U^2)\right] Q_W(j\omega)}{Z(j\omega)} .   (30)

By using the corresponding eigenfunction from equation (2) and the FRF from equation (30), the FRF as a function of piezoelectric element position, (x), and frequency, (jω), can be formulated as,

H(x, j\omega)_{22} = \frac{\left[(M_{AA} R_L \omega^2 + \omega^2 P_D C_{AA} - K_{AA} R_L) + j(P_D M_{AA}\omega^3 - \omega C_{AA} R_L - \omega P_D K_{AA} + \omega P_U^2)\right] Q_W(j\omega)\, Y_{rel}(x)}{Z(j\omega)} .   (31)
The FRF based on the output electric voltage and the input base-longitudinal acceleration can be obtained as,

H(j\omega)_{31} = \left.\frac{V(j\omega)}{\ddot{u}_{base}}\right|_{s=j\omega} = \frac{(-j\omega^3 M_{BB} P_U - \omega^2 C_{BB} P_U + j\omega K_{BB} P_U)\, Q_U(j\omega)}{Z(j\omega)} .   (32)

The FRF of the output electric voltage in relation to the input base-transverse acceleration can be calculated by ignoring the base-input longitudinal acceleration to give,

H(j\omega)_{32} = \left.\frac{V(j\omega)}{\ddot{w}_{base}}\right|_{s=j\omega} = \frac{(-j\omega^3 M_{AA} P_W - \omega^2 C_{AA} P_W + j\omega K_{AA} P_W)\, Q_W(j\omega)}{Z(j\omega)} .   (33)
Corresponding with equation (33), the FRF of power harvesting related to the transverse acceleration can be calculated as,

\left.\frac{P(j\omega)}{\ddot{w}_{base}^2}\right|_{\ddot{u}_{base}=0} = \frac{(-j\omega^3 M_{AA} P_W - \omega^2 C_{AA} P_W + j\omega K_{AA} P_W)^2\, Q_W(j\omega)^2}{R_{load}\, Z(j\omega)^2} .   (34)
The FRF of power harvesting related to the longitudinal acceleration can be calculated as,

\left.\frac{P(j\omega)}{\ddot{u}_{base}^2}\right|_{\ddot{w}_{base}=0} = \frac{(-j\omega^3 M_{BB} P_U - \omega^2 C_{BB} P_U + j\omega K_{BB} P_U)^2\, Q_U(j\omega)^2}{R_{load}\, Z(j\omega)^2} ,   (35)

where:

Z(j\omega) = \omega^4 (P_D M_{AA} C_{BB} + P_D C_{AA} M_{BB} + R_L M_{AA} M_{BB}) - \omega^2 (R_L C_{AA} C_{BB} - P_U^2 C_{BB} - P_W^2 C_{AA} + K_{BB} P_D C_{AA} + K_{BB} R_L M_{AA} + K_{AA} P_D C_{BB} + K_{AA} R_L M_{BB} + K_{AA} P_D C_{BB}) + K_{AA} K_{BB} R_L + j\left\{\omega^5 P_D M_{AA} M_{BB} - \omega^3 (P_D C_{AA} C_{BB} + R_L M_{AA} C_{BB} + R_L C_{AA} M_{BB} + P_U^2 M_{BB} - P_W^2 M_{BB} + K_{BB} P_D M_{AA} + K_{AA} P_D M_{BB}) + \omega (K_{BB} R_L C_{AA} - K_{BB} P_U^2 + K_{AA} R_L C_{BB} - K_{AA} P_W^2 + K_{AA} K_{BB} P_D)\right\} .
It should be noted that equation (3) can be used to model series and parallel connections of piezoelectric bimorph by choosing the proper coefficients of piezoelectric coupling and internal capacitance from equations (17) to (22).
3. NUMERICAL RESULTS

This section provides the numerical results based on the suggested formulations. In this paper, a single-mode eigenfunction solution was taken into consideration in the formulations. The numerical results are based on the series connection, which was considered mathematically using coupling superposition of the elastic-polarity field as shown in figure (1). The FRF can be used to model and analyse the frequency, displacement and power harvesting behaviour for varying load resistances. In this case, the piezoelectric bimorph has the characteristic properties shown in table 1.

Table 1. Characteristic properties of the piezoelectric bimorph
Item          Piezoelectric     Brass
Q (GPa)       66                105
ρ (kg/m³)     7800              9000
d31 (pm/V)    -190              –
β33 (F/m)     1800 εo           –
L (m)         29e-3             29e-3 *
h (m)         0.19e-3 (each)    0.13e-3
b (m)         6.4e-3            6.4e-3
εo (F/m)      8.854e-12         –
* Lower and upper layers of piezoelectric
The piezoelectric bimorph was considered as a cantilevered model element with two input base excitations. The proof mass was attached at the end of the piezoelectric element. The proof mass has dimensions 8 mm x 6.4 mm x 5.5 mm (Lo x s x htip) and a mass density of 7800 kg/m3. The power generated by the piezoelectric bimorph was affected by the input base excitations, since the piezoelectric component is sensitive to mechanical contractions and electrical effects. The input acceleration was assumed to be equivalent to gravitational acceleration, 1g (9.81 m/s2). However, in an experimental study, the range of input accelerations into the piezoelectric element should be determined properly so that the measurements of voltage, power, displacement and acceleration under dynamic motion can be obtained accurately. Also, the piezoelectric element must be kept from failure due to excessive dynamic motion by respecting the safety-factor limit on the static bending stress. It should be noted that in practice the machine vibration environment is very suitable for using piezoelectric elements to generate power, even at low amplitudes of vibration.
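To illustrate how the load resistance shifts the resonance of the power FRF (the trend discussed around figures 2 and 3), the following sketch sweeps a generic single-mode, base-excited piezoelectric harvester over frequency for several of the load resistances used in the paper. The modal mass, damping, coupling and capacitance values are assumed placeholders chosen only to reproduce the qualitative behaviour, not the coefficients of the present bimorph.

```python
import numpy as np

# Assumed single-mode parameters (illustrative only)
m, zeta, fn = 1.0e-3, 0.016, 80.0           # modal mass (kg), damping ratio, short-circuit resonance (Hz)
k = m * (2 * np.pi * fn) ** 2               # modal stiffness (N/m)
c = 2 * zeta * np.sqrt(k * m)               # modal damping (N s/m)
theta, Cp = 7.5e-4, 8.0e-9                  # coupling (N/V) and internal capacitance (F), assumed

f = np.linspace(50, 120, 2001)
w = 2 * np.pi * f

for R in [35e3, 80e3, 500e3, 1100e3]:       # load resistances as in figure 2
    Ye = 1j * w * Cp + 1.0 / R                                        # admittance of capacitance + load
    W = -m / (-w**2 * m + 1j * w * c + k + 1j * w * theta**2 / Ye)    # displacement per unit base acceleration
    V = -1j * w * theta * W / Ye                                      # voltage per unit base acceleration
    P = np.abs(V) ** 2 / R                                            # power FRF per unit acceleration squared
    print(f"R = {R/1e3:6.0f} kOhm: peak at {f[np.argmax(P)]:6.2f} Hz, {P.max():.2e} W/(m/s^2)^2")
```

With these assumed values the peak moves from about 80 Hz at low load resistance towards roughly 90 Hz at high load resistance, the same kind of shift between the short- and open-circuit resonances that the paper reports.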
[Figure 2: power harvesting FRF per unit input transverse acceleration (W/g²), plotted on a logarithmic scale against frequency (50-120 Hz) for load resistances of 35, 45, 80, 500, 700 and 1100 kΩ; marked points at 80.04 Hz (0.00172 W/g²) and 91.15 Hz (0.003063 W/g²).]
Figure 2. FRF of power harvesting per unit input transverse acceleration vs. frequency for varying resistance Figure (2) shows the single mode FRF of power harvesting per unit of input transverse acceleration, scaled in terms of ‘g’, with varying resistance. The resonance can be seen to shift frequency as the load resistance changes. This indicates that the load resistance can shift the power harvesting domain and characteristic behaviour with frequency. Furthermore the predicted frequency depends on the chosen load resistances to give optimum points of power harvesting. For example, resonance at 80.98 Hz with load resistance at 80 kΩ shifted to a resonance at 89.94 Hz with load resistance at 500 kΩ where these points indicated the lowest bound of power harvesting at resonance domains based on the chosen resistances. But the highest bound indicates the resonance at 80 Hz with load resistance at 35 kΩ where this resonance has shifted to 91.15 Hz at 1100 kΩ. The highest power harvesting for load resistances of 35 kΩ and 1100 kΩ indicated below 3 mW/g2 with frequency difference around 12.2% due to the change of load resistances as displayed in first mode. As mentioned previously, g represents the unit of gravitational acceleration (9.81 m/s2) where it was used to dimension the power harvesting of mW per unit g2. To visualize the power harvesting with respect to frequency and load resistance, figure (3) was plotted to show the trend of power harvesting with respect to changes of load resistances and frequencies. For example, increasing load resistances reduces the changes of resonance frequency and the results of power harvesting analysis shows the optimum points at load resistances between 35 kΩ and 1100 kΩ as indicated similarly with figure (2). The overall results obtained in this situation were based on the chosen load resistances.
Figure 3. FRF of power harvesting per unit input transverse acceleration vs. Frequency and varying load resistances
As shown in figure (4), the dynamic transverse displacement FRF at the free end of the cantilevered piezoelectric beam per unit input transverse acceleration also indicated results in the frequency domain similar to those in figure (2), for the same chosen load resistances. The transverse displacement FRF at the free end of the cantilevered piezoelectric bimorph beam at the resonances of 80 Hz and 91.15 Hz indicated 0.67 mm with a load resistance of 35 kΩ and 0.76 mm with a load resistance of 1100 kΩ per unit acceleration. In this numerical result, a chosen mechanical damping was used to predict power harvesting, dynamic displacement and velocity. In this case, we used Rayleigh's proportional damping constants: the mass multiplier, alpha (α, rad/s), and the stiffness multiplier, beta (β, s/rad). The chosen α and β values were 6x10-5 rad/s and 6.4x10-5 s/rad, respectively. In practice, the damping constants can be estimated from an experimental study.
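For reference, the quoted Rayleigh constants translate into a modal damping ratio through the standard relation ζ = α/(2ω) + βω/2; a minimal check at the two resonances reported above:

```python
import numpy as np

alpha, beta = 6e-5, 6.4e-5          # Rayleigh mass and stiffness multipliers from the paper

for f in (80.04, 91.15):            # resonance frequencies reported in figures 2 and 4
    w = 2 * np.pi * f               # angular frequency (rad/s)
    zeta = alpha / (2 * w) + beta * w / 2
    print(f"f = {f:6.2f} Hz -> modal damping ratio zeta ~ {zeta:.4f}")
```

At these frequencies the stiffness-proportional term dominates, giving roughly 1.6-1.8 % of critical damping.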
[Figure 4: transverse displacement FRF per unit input transverse acceleration (m/g), plotted on a logarithmic scale against frequency (50-120 Hz) for the same load resistances; marked points at 80.04 Hz (0.0006695 m/g) and 91.15 Hz (0.0007602 m/g).]
Figure 4. FRF of transverse displacement per unit input transverse acceleration vs. frequency with varying load resistances
4. CONCLUSION This paper has presented a theoretical study of a coupled electromechanical piezoelectric bimorph beam under input base transverse and longitudinal excitations. The resulting matrix differential equation was obtained using the weak form from Hamiltonian’s theorem. This included modelling the series or parallel connections of the piezoelectric bimorph. In this case, we presented only the series connection. Solution of the matrix differential equations of motion using Laplace Transforms was used to obtain transfer functions and corresponding frequency response functions between the input motions, output voltage and power expressions. The results obtained from the frequency response function for a case study showed the influence of varying load resistance on resonance frequency behaviour. The power harvesting per unit input transverse acceleration FRF showed maximum points at resonances of 80 Hz with resistance of 35 kΩ and 91.15 Hz with resistance of 1100 kΩ, respectively. This shows that the piezoelectric bimorph beam was modelled as a coupled mechanical and electrical system using the Laplace transform and FRFs derived from the constitutive dynamic equations. Further investigations will be conducted experimentally including the analysis of a cantilevered piezoelectric bimorph beam with two input base excitations to be compared with this numerical study.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
WIRELESS CONDITION MONITORING SYSTEM FOR LARGE VESSELS: DEVELOPMENT AND APPLICATION
Min-chan Shim a, Bo-suk Yang a, Young-mo Kong b, Won-cheol Kim c
a School of Mechanical Engineering, Pukyong National University, San 100, Yongdang-dong, Nam-gu, Busan 608-739, Republic of Korea.
b Vibration & Noise R&D Team, Daewoo Shipbuilding & Marine Engineering Co., Ltd, 1, Aju-dong, Geoje-si, Gyeongsangnam-do, Republic of Korea.
c School of Mechanical and Aerospace Engineering, Institute of Marine Industry, Gyeongsang National University, 445 Inpyeong-dong, Tongyeong 650-160, South Korea.
The wireless measurement system (Wi-Measys) has been developed for effective vibration monitoring on large vessels. Global vibration measurements are mainly executed on newly constructed vessels during trial runs. This application was performed as a solution for replacing the existing wired instrumentation by a wireless one. The installation of wired sensors requires considerable manpower, is cost prohibitive, and is very susceptible to noise picked up through the coaxial cables in a complex structure. However, these costs can be significantly reduced by utilizing a wireless system. Wi-Measys was applied on a containership, where vibration testing was conducted in the engine room, deckhouse and upper deck. This paper presents how to apply a wireless measurement system and wireless local area network to a large ship with a complex structure, and how to develop a data acquisition board to fit the environment of the vessel.
Key Words: Wireless Local Networks (WLAN), Transmitter, Base Station, Global Measurement, Vessel

1 INTRODUCTION
Different applications require the measurement process to acquire a relevant number of quantities from spatially separated locations. An example is the management of production processes, where a great number of devices need to be monitored to assure the desired product quality [1]. To yield high immunity to noise, sensors are generally placed close to the phenomena being monitored. But the connectivity of the various families of sensors and transducers to the central processing unit of the measurement system becomes a primary problem, especially if a large number of long wires has to be installed in inaccessible or harsh and hazardous environments [2]. In the case of a large ship in particular, a large number of sensors and coaxial cables is required for a central, time-synchronous measurement process. When the wired networking of distributed sensors cannot be easily achieved, other approaches should be adopted. Help comes from the availability of embedded processors and radios, which is enabling the use of wireless sensing in a wide range of applications. Wireless communication is seen to be one of the fastest growing technological sectors today, and in recent years several connectivity problems have found optimal solutions through the definition of low-power Radio Frequency (RF) Wireless Local Area Networks (WLAN) [3]. A particular advantage of wireless sensing is that it is not indispensable that the device transmitting the signal of interest is fixed at a precise location. This makes it possible to implement new applications, such as traffic management, detection of the presence or absence of certain devices, and so on. Distributed wireless sensor networks greatly extend the ability to monitor and control the physical process from remote locations, also by means of high-level user interfaces. Our aim in this paper is to propose a wireless measurement system (Wi-Measys) for wireless sensing. The application of the Wi-Measys concept to the specific environment of a large ship has produced the idea of the WLAN. Quick installation, modularity and expandability are the main advantages of the proposed system.
In the following sections we present some issues associated with the proposed hardware/software specification and with how to set up the WLAN in a large ship. Moreover, we report the results of performance tests, which were carried out during both simulation and experimental activities, with particular reference to some functional signals of the prototype under design [4].
2
IMPLEMENTATION AND SPECIFICATION
Wi-Measys consists of a base station and multiple transmitter units communicating through a WLAN for measuring data. The diagram of the system with the associated principal idea is depicted in Fig. 1. Particular attention was paid to the realization of the prototype [5, 8].

Figure 1 Schematics of the implemented system

The indispensable conditions for application in large vessels are as follows:
• Real-time processing up to 6.4 kHz
• Wireless LAN construction on the vessels
• Secure data throughput at each measuring point
• Removal of data loss through digital communication
Data are acquired from individual 8-channel ICP type accelerometer inputs, internally conditioned in analog form, and wirelessly sent to the access point (AP) for uploading and storage on the base station by a digital recorder. Up to a maximum of 10 transmitter units can be installed. Unlike other wireless sensor networks (WSN), Wi-Measys is based on real-time processing over a wide bandwidth of up to 6.4 kHz. The number of useful channels is decided by a TCP/IP test supporting the WLAN communication, while retaining the accuracy, bandwidth and robustness of traditional sensors.

Figure 2 Wi-Measys prototype

2.1 Data acquisition board (DAQ)

The transmitter is a small and lightweight box that can be installed wherever measurement takes place. In application, the external antenna and the ICP type accelerometers are connected to the box. The specifications of the DAQ are shown in Table 1 and are controlled by the base station settings.
Table 1. Specification of the transmitter
Item                      Specification
Bandwidth                 6.4 kHz
A/D resolution            16 bit/channel
Data transfer             16/24 bit selectable
Dynamic range             12, 7.07, 2.236, 0.707 mVp-p
Ground coupling           Single-ended
Input type                ICP, Charge
Absolute linearity        1 kHz, 1 V input: ±0.6 dB
Absolute AMP precision    1 kHz, 1 V input: ±0.5 dB, typical: ±0.6 dB
Figure 3 Response curve of 3rd order high pass filter (HPF) and 8th order low pass filter (LPF)

Considering the effects of roll, pitch and yaw motions, which occupy the low frequency range of a ship's signature, the high pass filter (HPF) has to be set with a steep slope below 0.7 Hz or 7 Hz. Fig. 3 shows the roll-off of each partial filter versus the normalized frequency axis. An elliptic LPF is used as an anti-aliasing filter with an adaptable cut-off frequency. The elliptic filter characteristic exhibits ripple in the pass band, generated by its poles and zeros. This results in a cut-off which is sharper than that of most other filters.

Table 2. Specification of the filters
Setting         Linearity    -3 dB       Slope
0.7 Hz          -0.1 dB      0.07 Hz     -18 dB/oct
7 Hz            -0.1 dB      0.7 Hz      -18 dB/oct
22.4 Hz         -0.1 dB      12.6 Hz     -18 dB/oct
Anti-aliasing   -0.1 dB      Adaptable   -64 dB/oct
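Filters of the orders named above can be reproduced with standard IIR designs; the sketch below, using scipy.signal, is only indicative: the Butterworth characteristic for the HPF, the ripple and stop-band figures of the elliptic anti-aliasing LPF, and the sampling rate are all assumptions for illustration.

```python
import numpy as np
from scipy import signal

fs = 16384.0                                  # sampling rate (Hz), assumed for the 6.4 kHz bandwidth

# 3rd-order high-pass filter at the 0.7 Hz setting (Butterworth characteristic assumed)
b_hp, a_hp = signal.butter(3, 0.7, btype='highpass', fs=fs)

# 8th-order elliptic anti-aliasing low-pass filter; 0.1 dB pass-band ripple and
# 80 dB stop-band attenuation are assumed values, cut-off placed at 6.4 kHz
b_lp, a_lp = signal.ellip(8, 0.1, 80, 6400, btype='lowpass', fs=fs)

# check the LPF attenuation at a few frequencies
f_check = np.array([1000.0, 6400.0, 7500.0])
_, h = signal.freqz(b_lp, a_lp, worN=f_check, fs=fs)
for fc, hv in zip(f_check, h):
    print(f"{fc:7.1f} Hz: {20 * np.log10(abs(hv)):7.2f} dB")
```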
2.2 Wireless Data Throughput

The 'ad hoc' WLAN concept represents a natural solution for distributed measurement applications. It provides many benefits, e.g. quick installation, modularity and expandability. With regard to the measurement, it is essential to ensure good quality of data transmission over the WLAN, as the system has to perform high-speed data sampling and data transfer. The measurement may take place either outdoors or indoors, so where, and what type of, wireless LAN antenna is installed needs to be carefully considered. For this purpose, directive and non-directive antennas should be used to make sure the area is covered for transmission prior to the measurement. When the transmitters are arranged in various directions, the WLAN of the base station uses a non-directive antenna [6].

Data throughput = Fs · B · CN   (1)

where Fs = maximum sampling frequency, B = analogue-to-digital conversion resolution, and CN = number of channels.
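A minimal application of Eq. (1) to the configuration described above (8 ICP channels and 16-bit samples per transmitter, up to 10 transmitters); the sampling frequency, taken here as twice the 6.4 kHz bandwidth, is an assumption.

```python
def data_throughput(fs_hz: float, bits: int, channels: int) -> float:
    """Required data throughput in bit/s according to Eq. (1): Fs * B * CN."""
    return fs_hz * bits * channels

fs = 12800                                       # assumed sampling frequency for a 6.4 kHz bandwidth
per_transmitter = data_throughput(fs, 16, 8)     # one 8-channel, 16-bit transmitter
print(f"per transmitter: {per_transmitter / 1e6:.2f} Mbit/s")
print(f"10 transmitters: {10 * per_transmitter / 1e6:.2f} Mbit/s")
```

Under these assumptions the full ten-transmitter network needs roughly 16 Mbit/s, which is consistent with the minimum data rate discussed below and well within the 108 Mbps offered by the access points.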
Wi-Measys is designed for real-time processing, which requires high speed and wide bandwidth. To ensure stable network communication, a data transfer rate of at least 16 Mbps is usually needed. In this application, the AP installed for the wireless network is a Cisco 1242AG series IEEE 802.11a/b/g access point, chosen for its versatility, high capacity, security and enterprise-class features. With simultaneous support of the 802.11a and 802.11g standards, the 1242AG series delivers up to a 108 Mbps data rate in the 5 GHz and 2.4 GHz bands. The series supports 3 non-overlapping and 11 overlapping channels [7].

Figure 4 Minimum data throughput at measuring points

The wireless data mining system needs to be extremely reliable even in industrial environments where WLAN interference due to motors, lighting and other wireless systems is typical. Handheld radios common in industrial environments operate at lower frequencies than most sensor networks and cause little interference. It is also necessary to guarantee that any errors in the message will be detected and corrected. Any lost or corrupt data will show up as distortion and could produce false alarms. Many possible solutions exist for wireless communication; however, a design must carefully balance performance, transfer rate and transfer range [8].
3
APPLICATION FOR LARGE VESSELS
The developed prototype of Wi-Measys was deployed on the trial run of a newly constructed ship for measuring the ship's vibration. Typically, the ship's vibration tests are executed at the specific locations shown in Figure 5.

Figure 5 Measuring points in vessels

Usually, a number of wired sensors are mounted at each specific location and all of the coaxial cables are wired to the B deck for central processing of the measurement. Wi-Measys, however, allows data to be transmitted wirelessly from transmitters at each specific location to network access points on the B or C deck. To do that, a stable radio frequency bandwidth is required. When negotiating AP locations, configuration attributes such as the signal method (i.e., 802.11a or 802.11b), transmit power, inter-access interference and antenna type have to be considered. These configurations affect range; therefore, they should be used to advantage when tweaking the positioning of the AP. 802.11b access points generally offer greater range than 802.11a, mainly because 802.11b operates using lower frequencies (the 2.4 GHz instead of the 5 GHz band). As a result, the use of an 802.11a network requires access points to be closer together (e.g., 100 feet) as compared to 802.11b (e.g., 500 feet). This should be kept in mind when positioning the access points. To construct the WLAN in the ship, two kinds of wireless frequency bands were used. The 2.4 GHz band communicates between the transmitters and the base station, while the 5 GHz band was used for the connection between access points. Before connecting to the WLAN's SSID, the interference environment within the 2.4 and 5 GHz bands has to be checked. In Fig. 7, there are some unknown noise sources within channels #1, #6 and #11 (the non-overlapping channels), but no noise within the 5 GHz band.

Figure 6 RF environment test and wireless LAN installation

In this application, especially where support for high throughput is required, a large number of APs are needed in a relatively small area. In order to accomplish this and avoid inter-access interference, a lower transmit power should be tried. It is difficult for those without radio frequency experience to determine the optimum location for access points. Through testing on the containership, we confirmed these optimum configurations. As the engine room is enclosed with complex plates, it has a problem communicating with the base station. Fig. 8 shows one of the ways of constructing the AP arrangement on the ship. Similar to transmit power, the antenna type affects the positioning of access points. The antenna affects range, and it also affects the pattern of the propagated signal. For example, an omni-directional antenna broadcasts horizontally in all directions. The use of omni-directional antennas provides widespread coverage, which is best for applications in office complexes, warehouses, homes, etc. The access points in these cases need to have fairly equal spacing, assuming the construction and layout of the facility are fairly uniform. A directional antenna focuses the RF signal more in one direction than others (thus increasing the range more in that direction). To cover relatively long, narrow areas, such as the containership, a directional antenna may make the most sense in order to minimize the number of access points. In these environments, the access point should be located near one end of the long corridor to focus most of the signal in the right direction. This technique can make use of only one access point instead of several access points having omni-directional antennas. In order to verify the prototype, a transmitter installed in the engine room (the most severe location) measured a functional signal from an ICP accelerometer excited by a B&K calibrator. To examine the accuracy of the system, the measured frequencies were compared, and the resulting values correlate well. This test was executed on a ship under construction at DSME, so it was difficult to obtain an excitation source allowing direct comparison between wired and wireless measurements. In Fig. 10, to show the span between the two signals, a time shift to the right was enforced. To examine the accuracy between wired and wireless signals, suitable assessment methods for the wireless measurement system are required. Also, it is imperative to develop a time synchronization method for multiple Wi-Measys units.
4
CONCLUSION AND FUTURE WORKS

The proposed system is conceived for wireless data transfer in the ISM frequency band (WLAN) on the containership. The system has the potential for easy installation, which makes it an ideal tool for replacement of the wired measurement system in the containership. This project is continuing together with the noise and vibration R&D team of DSME. This paper presented how to apply a wireless measurement system and wireless local area network to a large ship with a complex structure and how to develop a data acquisition board to fit the environment of the vessel.
5 REFERENCES

1 G. Bucci, C. De Capua, C. Landi (2002) Industrial Measurement and Control, Wiley Encyclopedia of Electrical and Electronics Engineering Online, John Wiley & Sons Inc.
2 R. Schneiderman (1997) Future Talk: The Changing Wireless Game, IEEE Press, ISBN 0-7803-3407-8.
3 R. Min, M. Bhardwaj, S.H. Cho, N. Ickes, E. Shih, A. Sinha, A. Wang, A. Chandrakasan (2002) Energy-centric enabling technologies for wireless sensor networks, IEEE Wireless Communications 9 (4), 28-39.
4 G. Bucci, E. Fiorucci, C. Landi, G. Ocera (2004) Architecture of a digital wireless data communication network for distributed sensor applications, Measurement 35, 33-45.
5 C. McLean, D. Wolfe (2002) Wireless data gathering system for condition based maintenance, Intelligent Wireless Condition-Based Maintenance, 19 (6), 14-26.
6 K. Okada, M. Saruta, A. Fukukita, T. Hayakawa, T. Nishimura, M. Ishino (2004) Wireless LAN Vibration Measurement System, INSS 2004: Second International Workshop on Networked Sensing Systems.
7 M.C. Shim, B.S. Yang (2007) Condition monitoring system: high performance wireless measurement system, KSPSE, 11 (1), 28-32.
8 CISCO Systems technical document, Cisco Aironet 1240AG Series 802.11A/B/G access point data sheet, www.cisco.com
Acknowledgement This work is supported by DSME and the BK21 Project.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
EMERGING TECHNOLOGIES AND EMBEDDED INTELLIGENCE IN FUTURE POWER SYSTEMS
Johan J. Smit a, Dhiradj Djairam b, Qikai Zhuang b
a Delft University of Technology, P.O. Box 5031, 2600 GA Delft, the Netherlands, [email protected]
b Delft University of Technology, Mekelweg 4, 2628 CD Delft, the Netherlands, [email protected]; [email protected]

The replacement wave around 2030 will create a hybrid power system of old and new technologies, of which in particular the latter part will provide eminent opportunities for the implementation of embedded intelligence. However, the investment in smart grids is a difficult decision because it concerns a composition of primary and secondary equipment which have different lifetimes and different levels of robustness. Integration of sensor technology, on/off-line diagnostic systems and advanced ICT solutions enables the monitoring of the health index of the grid and its components, provided a physical model can be devised. From an economic and environmental point of view, there is much to gain from smarter electrical power networks, because in principle they enable us to extend the useful lifetime and to delay large replacement investments. However, the emerging technologies for sensors specifically aimed at high voltage equipment performance, and the interpretation tools and aging models needed for such smart power networks, are still at a premature stage. A few emerging technologies have achieved robustness to some extent. Dedicated techniques for partial discharge detection in high voltage cables and gas-insulated switchgear can predict failures on the basis of incipient dielectric faults. Similarly, dissolved gas monitoring of power transformers has also been relatively successful. In this paper, the expectations of power equipment monitoring will be discussed.
Key Words: asset management, smart grids, condition based maintenance, emerging diagnostics, electrical power systems

1 INTRODUCTION
Currently, the important economic driver is changing its valuation regarding high voltage network assets. Moreover, global warming and other customer concerns are increasing at such a pace that sooner or later the “real” price of the environment has to be taken into account. In that regard, preventing function loss of high voltage equipment is at stake at many utilities for a more sustainable power system. Many power system operators are more and more considering the introduction of an architecture of subsystems containing intelligent system interface technologies. These subsystems can be designed for independent operation and self-healing dynamics. The amount of data transfer at both local and general level will increase so rapidly that e.g. agent technologies should provide adequate solutions for processing the growing data streams in the autonomous grid sections. Clearly, minimizing the required manpower input and automating maintenance schedules will be essential to the future asset management policy. Sensors, computers and fast telecommunication through wired and wireless networks should enable remotely controlled operation and data processing for maintenance and control of the future electrical infrastructure mix, thereby preventing upcoming faults more efficiently.
2
DRIVING FORCES
From the historic and recent developments in various countries, we have learned that there are three prominent driving E-factors that are considered important: Engineering, Economics and Environment [1], see figure 1.
[Figure 1 shows three triangular Engineering-Economics-Environment diagrams, for the Electrification, Mature and Future situations.]

Fig. 1. E-E-E drivers in power system development.

Depending on the growth situation of the country or region, the ranking of these drivers will be different. In fast developing countries, where new power infrastructures are sometimes introduced for the very first time, the most important driver will be "Engineering". In many industrialized countries, the focus will have changed to "Economics", as the primary object will be the economical utilization and improved delivery performance. Power trading, upgrading and lifetime extension of systems are examples of challenges that have induced a revolution in managing and operating the aged assets of the system. However, in the long run, the drivers "Economics" and "Engineering" are expected to interchange when the real pricing of the environment is taken into account. Extension and replacement of the existing subsystems will result in a hybrid electrical infrastructure of old and new technologies. Future technology should include more sustainable solutions for power processing, energy storage and social/societal demands. Environmental concerns are expected to impose higher requirements from society as well as from international treaties. Environmental advantages may be found in small/mid-scale decentralized power generation growing into the distribution system. Therefore, auxiliary diagnostics are needed to gain more control over power flows and over automatic transmission and distribution equipment, improving the self-restoring capability of the system, the so-called "energy internet" concept [2].
3
SENSORS AND MODELS
In order to get to the concept of “energy internet”, the actual health index of relevant components in the grid should be monitored which can be achieved by the integration of sensors for high voltage performance. Using the fixed and measured data and aging models, the actual condition or health state of the component can be assessed as input for advanced preventive maintenance methodologies like CBM, RCM, RBM which anticipate effectively on repair and replacement schedules. Also, by taking into account operational parameters and environmental conditions, the future health state can be predicted by using a predictive health model (PHM), which will be discussed in the next section. Two examples of components that are nowadays monitored closely are oil-filled cable systems and power transformers:
Fig. 2. Two examples of monitoring, an oil-filled cable system and a power transformer
The stress factors that can act on a high voltage component will in the long run result in certain internal integrity changes in the component. These changes can be detected with certain sensors. Some changes can be pinpointed locally in the component, while other changes can only be assessed for the component in general. This is summarized in Table 1.

Table 1. Assessment of high voltage equipment performance
Stress factors: Thermal, Electrical, Mechanical, Ambient
Typical changes during aging | Routine test / sensor, e.g. | General condition | Local condition
Physical, intrinsic | dielectric absorption | + | -
Chemical reactions, corrosion, byproducts | dissolved gas analysis | + | -
Thermal distribution, expansion | spot temperature | - | +
Electrical losses and treeing | partial discharges | + | +
Mechanical interface formation | space charges | - | +
Functional property changes, e.g. withstand/inception voltages, vibration level, contact speed, resistance | contact performance | - | +
It must be noted that not all of these diagnostic methods are commercially available for all components; some are still being developed in the laboratory, see Table 2. For example, for gas/air insulated switchgear systems (GIS and AIS), the only sensors dedicated to high voltage performance available on the market are based on partial discharges and contact performance.

Table 2. Availability of market ready sensors. GIS = Gas Insulated Switchgear, AIS = Air Insulated Switchgear and HVDC = High Voltage DC equipment. DIEL = Dielectric absorption, DGA = Dissolved Gas Analysis, Tspot = Spot temperature, PDloc = Partial discharge location, SC = Space charges and CP = Contact performance.
Development status of diagnostic sensors for integration in autonomous substations: available on the market.
(Columns: DIEL, DGA, Tspot, PDloc, SC, CP; rows: TRANSFORMERS, CABLE, GIS, AIS, HVDC.)

However, merely having the availability of market ready sensors is not sufficient for adequate decision making. For this, we also require models for interpretation, the composed availability of which is shown in Table 3. We can see that this means our options are severely reduced and that for AIS and HVDC no options exist yet.
Table 3. Availability of models for interpretation and decision making. TR = Transformers, CAB = Cable, GIS = Gas Insulated Switchgear, AIS = Air Insulated Switchgear and HVDC = High Voltage DC equipment. DIEL = Dielectric absorption, DGA = Dissolved Gas Analysis, Tspot = Spot temperature, PDloc = Partial discharge location, SC = Space charges and CP = Contact performance.
Development status of diagnostic sensors for integration in autonomous substations: models for interpretation and decision making.
(Columns: DIEL, DGA, Tspot, PDloc, SC, CP; rows: TR, CAB, GIS, AIS, HVDC.)
If we take all of this together, then we can see that in terms of overall readiness and robustness there are only three components for which there is exactly one suitable diagnostic method remaining, as shown in Table 4. For transformers, only dissolved gas analysis (DGA) is available. For cables, only spot temperature is available, and for GIS, only partial discharge location detection is available.

Table 4. Overall readiness and robustness of the diagnostic sensors for integration in autonomous substations.
(Columns: DIEL, DGA, Tspot, PDloc, SC, CP; rows: TR, CAB, GIS, AIS, HVDC.)
Therefore, it should be stressed that more research should be conducted to develop robust sensors and associated models that enable monitoring of the actual aging state of the network components. This is even more necessary if an "energy internet" is to be developed by means of e.g. a predictive health model concept.

4 WIRELESS COMMUNICATION

In order to achieve autonomous operation of parts of the grid, the information from the sensors needs to be collected for processing. As wiring up all these sensors on and in all the components that might be present in e.g. a substation is not practical, the preferred method would be to use wireless technology, see figure 3. Currently, research is being conducted on whether disturbances that frequently occur in substations, such as switching actions, corona and reflections, affect wireless communication. All these disturbances cause high frequency electromagnetic interference which can decrease the reliability, and thus the applicability, of wireless technology.

Fig. 3. Implementation of wireless technology

At this moment, the protocol that has been tested for transmitting and receiving in substation conditions is ZigBee, which operates at 2.4 GHz. Using this protocol, it has been possible to reliably transmit information such as temperature, humidity and light intensity while reflections and corona were present [3]. Among the advantages of ZigBee are the low cost and the long battery life of the transmitters/receivers. Its main drawback is the low data rate, which is nevertheless sufficient to transmit e.g. one read-out of the temperature, humidity and light intensity per second. However, as soon as voltages, currents, other operating parameters and environmental conditions need to be transmitted on the order of 50 to 100 times per second, the ZigBee protocol will fall short. This is because of an effect called Inter Symbol Interference (ISI), which occurs when the signal bandwidth is significantly higher than the coherence bandwidth of the transmission channel. Therefore, research is also being conducted towards achieving higher data transfer rates, and one method of accomplishing that is by using Orthogonal Frequency Division Multiplexing (OFDM). This method basically divides a wideband signal into many parallel narrowband signals. If the bandwidths of these narrowband signals are smaller than the coherence bandwidth of the transmission channel, then ISI can be eliminated. This way, the path can be cleared to reliably transmit various parameters of many components in a substation to a central processing point, after which the health states of the components can be determined. Subsequently, using predictive health modeling, decisions can be made autonomously.
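The OFDM idea mentioned above — splitting one wideband signal into many parallel narrowband subcarriers so that each subcarrier stays below the coherence bandwidth — can be sketched in a few lines of numpy; the subcarrier count, cyclic-prefix length and QPSK mapping are arbitrary illustrative choices, not the parameters of the system under development.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sub, cp_len = 64, 16                       # subcarriers and cyclic-prefix length (assumed)

# one QPSK symbol per narrowband subcarrier
bits = rng.integers(0, 2, size=(n_sub, 2))
qpsk = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

# the IFFT maps the parallel narrowband symbols onto one wideband time-domain OFDM symbol
time_symbol = np.fft.ifft(qpsk) * np.sqrt(n_sub)

# the cyclic prefix absorbs the multipath delay spread, which is what suppresses ISI
tx_symbol = np.concatenate([time_symbol[-cp_len:], time_symbol])

# receiver: drop the prefix and FFT back to the narrowband subcarriers
rx = np.fft.fft(tx_symbol[cp_len:]) / np.sqrt(n_sub)
print("maximum recovery error:", np.max(np.abs(rx - qpsk)))
```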
5
PREDICTIVE HEALTH MODELLING
Due to all these advancements in measurement techniques and sensor technologies, a significant amount of technical information about the assets in the grid will become available. This information can be used for the optimization and the maintenance of power system equipment and thus become a valuable tool for asset management. In [4], a framework has been proposed for the modeling of the health state of power system equipment. This framework can be used to predict the effects and outcomes of different operating profiles, usage patterns and maintenance actions. This predictive health model is shown schematically in figure 4.
[Figure 4 block diagram: usage actions and cumulative stresses (x) act on the physical equipment; condition parameters from monitoring systems feed an estimation of the cumulative stresses (x̂), which drives a dynamic stress model (f) and a failure model (g) producing the estimated failure rate (ŷ).]
Fig. 4. Predictive health model (PHM) Using this model, the failure rate can be determined and therefore, steps can be taken to minimize the chance of a failure. In [5], this model has been applied in a simulation of the thermal effects in a power transformer, see figure 5. Using various models such as e.g. top-oil thermal model and hot-spot thermal model, it was possible to apply the predictive health model in such a way that the loading profile of the transformer was optimized. In this way, the temperature of the transformers could be maintained below the allowed limit.
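The load-limiting behaviour described here can be mimicked with a strongly simplified exponential top-oil/hot-spot model: at each step the load is cut back until the predicted hot-spot temperature stays below 100 °C. All time constants and temperature rises below are assumed round numbers and are not the thermal models used in [5].

```python
import numpy as np

T_AMB, DT_OIL_R, DT_HS_R = 25.0, 50.0, 25.0    # ambient, rated top-oil rise, rated hot-spot rise (C), assumed
TAU_OIL, TAU_HS, LIMIT = 180.0, 7.0, 100.0     # time constants (min) and allowed hot-spot temperature (C)

dt = 1.0                                       # minutes
t = np.arange(0, 720, dt)
load = 0.6 + 0.7 * ((t > 240) & (t < 480))     # requested load profile with an overload step (pu)

theta_oil, rise_hs = T_AMB + DT_OIL_R * 0.6**2, DT_HS_R * 0.6**2
applied = []
for K in load:
    # predictive step: back the load off until the next-step hot-spot stays below the limit
    while True:
        oil_next = theta_oil + dt / TAU_OIL * (T_AMB + DT_OIL_R * K**2 - theta_oil)
        rise_next = rise_hs + dt / TAU_HS * (DT_HS_R * K**2 - rise_hs)
        if oil_next + rise_next <= LIMIT or K <= 0.0:
            break
        K -= 0.01
    theta_oil, rise_hs = oil_next, rise_next
    applied.append(K)

applied = np.array(applied)
print(f"requested peak load: {load.max():.2f} pu, "
      f"load actually applied during the overload: {applied[241:480].min():.2f}-{applied[241:480].max():.2f} pu")
```

Under these assumptions the controller lets the brief overload through while the windings are still cool and then settles near the steady-state load that keeps the hot-spot exactly at the limit, the same qualitative behaviour as the optimized load control in figure 5.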
[Figure 5: two panels of temperature (°C) and load [pu · 10] versus time (0-720 minutes), showing the hot-spot and top-oil temperatures without and with the optimizing algorithm.]
Fig. 5. Optimized load control of a transformer using predictive health modeling. On the left side, the temperature of the transformers exceeds the maximum temperature 100 °C. On the right side, the optimizing algorithm decreases the load thereby reducing the temperature.
The method optimizes the utilization of the transformer by recommending load changes when required, thereby keeping the temperature within safe limits. Of course, in practice, the effectiveness of this method depends on the accuracy of the sensors and the reliability of the transmitted information. Also, it depends on the ability of the autonomous system to control the load patterns of the transformers.
6
CONCLUSIONS
Because of the renewal wave, the electricity grid will become a mix of old and new technology. The focus of the newer technology will be more on "Economics" and "Environment", which ultimately means that the grids will need to be more sustainable. In order to achieve long-term performance, the health state of the assets in the grid needs to be determined for advanced preventive maintenance, and this can be done by the integration of sensors and the use of models for interpretation. It can be concluded that suitable sensors and/or models still do not exist for a number of assets and diagnostic methods. Even in the case that all these sensors and models did exist, this wealth of information could not be transmitted using current wireless protocols such as ZigBee because of the limits on data rate. Nevertheless, this protocol has proven to be reliable in substation conditions with various forms of interference. In order to improve the data transfer rate, other protocols are being investigated, of which Orthogonal Frequency Division Multiplexing looks promising. A predictive health model is currently being developed to collect and analyze the health state of the grid and its assets. Successful simulations have been carried out to maintain the temperature of a power transformer by optimizing the load profiles.
7 REFERENCES

1 J. J. Smit, B. M. Pryor et al. (2003) Cigré Brochure 224, "Emerging Technologies and Material Challenges".
2 J. J. Smit, E. Gulski (2007) Integral Decision Support System for Condition Based Asset Management of Electrical Infrastructures. Proc. World Congress on Engineering Asset Management, Harrogate.
3 A. Lou, D. Djairam, H. Nikookar, J. J. Smit (2009) Interference in the Wireless Channel. Internal graduation report, Delft University of Technology, Delft.
4 G. Bajracharya, T. Koltunowicz, R. R. Negenborn, Z. Papp, D. Djairam, B. De Schutter, J. J. Smit (2009) Optimization of maintenance for power system equipment using a predictive health model. In Proceedings of the 2009 IEEE Bucharest Power Tech Conference, Bucharest, Romania.
5 G. Bajracharya, T. Koltunowicz, R. R. Negenborn, Z. Papp, D. Djairam, B. de Schutter, J. J. Smit (2009) Optimization of Condition-Based Asset Management using Predictive Health Model. International Symposium on High Voltage 2009, Cape Town, South Africa.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
USE OF BISPECTRAL-BASED FAULT DETECTION METHOD IN THE VIBROACOUSTIC DIAGNOSIS OF THE GEARBOX Jasiński M., Radkowski S. Institute of Automotive Engineering, Warsaw University of Technology, Narbutta 84, 02-524 Warsaw, Poland.
The central issue is to extract the relevant diagnostic information from the vibroacoustic signal of mating gears and use it in the forecasting process. Thus the research focuses in particular on the methods of analyzing the relations between various frequency bands and their links to various types of defects or phases of their development. The value of the information contained in the bispectrum lies, among others, in the fact that it enables examination of the statistical relations between individual components of the spectrum as well as detection of the components generated as a result of non-linear effects and the additional feedback associated with emerging defects. This results from the fact that, in contrast with the power spectrum, which is positive and real, the bispectrum is a complex-valued function which retains the information on both the distribution of power among individual components of the spectrum and the changes of phase. Let us note that the bispectrum enables one to determine the relations between essential frequencies of the examined dynamic system. A high value of the bispectrum for defined pairs of frequencies and combinations of their sums or differences will point to the existence of frequency coupling between them. This may mean that the contemplated frequencies, being the components of the sums, have a common generator, which in the presence of non-linearity of higher order may lead to synthesizing the aforementioned new frequency components. This means that bispectral measures such as the diagonal bispectrum, row bispectrum and max bispectrum can be useful in the detection of a gear's fatigue crack.
Key Words: Vibroacoustic diagnostics, Gearbox, Bispectral analysis

1 INTRODUCTION
The development of maintenance, which has continued developing since the beginning of 1980’s, is determined by development of technologies on the one hand and putting stress on relevant security of operated systems and the need for reducing the threat to the environment on the other. In such circumstances it is natural to adopt Condition Based Maintenance (CBM), which means introduction of a system of evaluating the technical condition based on the collected data related to the parameters of a machine’s operation and the parameters of residual processes as well as performance of preventive maintenance based on the forecasted damage (failure) occurrence, which we could term as “just – in – time” maintenance. Now the goal is to maximize the long-term effects. This means adoption of an operational strategy whose integral elements include technical condition diagnosis as well as predictive models of functional tasks’ realization and principles of pro-active machine maintenance and operation. Thus the defined strategy accounts for a whole series of aspects of harmonious development, starting from economic analysis of individual lifecycles, ecological requirements, ergonomic requirements and cultural requirements, with the technical component of the management system being distinguished by its predictability, holistic approach and openness. The predictive nature of the system means its ability to forecast the technical condition and the quality of realization of functional tasks by the system and by its individual elements. In practice various scopes of analysis are applied, depending on the method of defining the control, accounting for or not accounting for the consequences of failures or accidents [3]. From this point of view it is the implementation of proactive operational strategy that becomes particularly important. As is presented by Fig. 1, the essence of such an approach boils down to anticipation of preventive actions, in equal degree prior to
defect emergence as well as during the period of development of low-energy phases of defects. This calls for developing and adapting relevant methods of diagnosis which are supported by relevant diagnostic models.
[Figure 1: symptom value of damage versus operating time, contrasting proactive maintenance (prenucleation and early-stage failure detection, condition monitoring and maintenance mode) with reactive maintenance (development of failure, failure and repair).]
Figure 1. Comparison of technical diagnostics goals in proactive maintenance versus reactive maintenance [3]
2
CHARACTERISTICS OF A PROACTIVE MAINTENANCE SYSTEM

The idea of a proactive maintenance system algorithm was presented in the literature [1, 2, 5] (Fig. 2).
[Figure 2 block diagram: functioning and monitoring data feed a monitoring and diagnosis process (current state of the process, current technical state, current functioning conditions) and, via the object state evolution (past process parameters, past functioning conditions), a prognosis process (future state of the process, mean time before/between failure, remaining life of the object); together with operating knowledge (expected functioning conditions, direct/indirect maintenance costs) and the level of technical risk, these drive an aided decision-making process that selects the maintenance activity.]
Figure 2. Architecture of the proactive maintenance system Let us note that estimation and modelling of the degradation process is one of the most effective methods of defect development anticipation and maintenance of system operation in terms of nominal parameters. In reality such an approach denotes compilation of several conventional methods of forecasting – probabilistic behavioural models and event models in particular. Probabilistic behaviour and degradation models enable analysis of the type and extent of uncertainty which conditions forecasting reliability. Event models are a kind of a combination between the contemplated models and the actual system and they make up the basis for constructing and analyzing causal models which enable assessment of degradation and determination of the optimum scenario of maintenance-and-repair work. In the to-date contemplated models the probability of defect occurrence was being defined on the assumption of invariability of examined distributions during operation of the object. In reality, as a result of wear and tear processes and
associated changes of conditions of mating elements and kinematic pairs, we observed conditional probability distributions, however the relationship demonstrated itself both in quantitative terms (change of the parameters of probability density function) and in qualitative terms (change of the function describing the distribution). In addition the degradation processes accompanying the performance of functional tasks can cause similar variability of distributions of the probability which describes load capacity. In this case one can expect that the location of the separating line and the probability of defect occurrence will not only depend on the time of operation of an object but on the new dynamic feedbacks in the system, associated in particular with the development of non-linear relations and non-stationary disturbance.
3
VIBROACOUSTIC SIGNAL AS THE SOURCE OF DIAGNOSTIC INFORMATION
The central issue is how to extract the relevant diagnostic information and use it in the forecasting process. Let us note that the measured vibroacoustic signal is a real signal which fulfils the requirement of causality. Thus, by using the measured signal z(t) and a defined formalism, we are able, by means of the addition of an imaginary part v(t), to form an analytical signal:
a(t) = z(t) + jv(t)   (1)
In accordance with the theory of analytical functions, the real and the imaginary components are functions of the two variables x and y. Let us assume that the analysis of the analytical signal is conducted on the basis of observation of the changes of the length of the vector A and of the phase angle φ:

z(x, y) + jv(x, y) = A(\cos\varphi + j\sin\varphi)   (2)

Thus,

z = A\cos\varphi ,\quad v = A\sin\varphi   (3)

which means that the measured signal is an orthogonal projection of the vector A on the real axis. Based on the Cauchy-Riemann conditions, we finally get:

\frac{dz}{dt} = \frac{dA}{dt}\cos\varphi - A\sin\varphi\,\frac{d\varphi}{dt}   (4)

The obtained relationship, in accordance with our expectations, presents an equation which enables the analysis of the measured signal on the basis of observation of A and φ. What simultaneously captures our attention is the fact that for low-energy processes, when we can disregard the changes of the vector length and assume that A ≈ const, the whole information about the changes in the measured signal is contained in the phase angle:

\frac{dz}{dt} = -A\sin\varphi\,\frac{d\varphi}{dt}   (5)
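In practice the analytic signal of Eq. (1) is obtained with the Hilbert transform; the short sketch below extracts the envelope A(t) and the instantaneous phase of a simulated, amplitude- and phase-modulated meshing component. The carrier and modulation frequencies are illustrative values only (roughly a 27-tooth wheel on a 1460 rpm shaft).

```python
import numpy as np
from scipy.signal import hilbert

fs = 20000.0
t = np.arange(0, 1.0, 1 / fs)

# simulated meshing component with slow amplitude and phase modulation (illustrative values)
z = (1 + 0.2 * np.sin(2 * np.pi * 24.3 * t)) * np.cos(2 * np.pi * 657 * t + 0.3 * np.sin(2 * np.pi * 24.3 * t))

a = hilbert(z)                    # analytic signal a(t) = z(t) + j v(t), Eq. (1)
A = np.abs(a)                     # envelope, the vector length A of Eq. (2)
phi = np.unwrap(np.angle(a))      # instantaneous phase angle

inst_freq = np.diff(phi) / (2 * np.pi) * fs
print(f"mean instantaneous frequency: {inst_freq.mean():.1f} Hz")
print(f"relative envelope modulation: {(A.max() - A.min()) / A.mean():.2f}")
```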
It is common knows that the power spectrum based methods cannot detect the phase relationship between different frequency components and additionally suppresses the phase information. It is therefore necessary to explore spectral measures of higher order, like the bispectral measures, to detect various forms of phase coupling between frequency components. Investigating this possibility we try to write, the bispectrum in form [4, 6]:
$B(f_x, f_y) = E\left[S(f_x)\,S(f_y)\,S^{*}(f_x + f_y)\right]$  (6)
It is easy to see that the bispectrum is complex and that the bispectral values depend on the two frequencies f_x and f_y. Writing Eq. (6) in terms of amplitude and phase quantities, one obtains:

$B(f_x, f_y) = \left|S(f_x)\right|\left|S(f_y)\right|\left|S(f_x + f_y)\right|\, e^{\,jQ_b(f_x, f_y)}$  (7)

where $Q_b(f_x, f_y) = Q(f_x) + Q(f_y) - Q(f_x + f_y)$ and is called the biphase.
Using the fast Fourier transform (FFT) algorithm it is possible to calculate the raw bispectrum:

$B_i(f_x, f_y) = S_i(f_x)\,S_i(f_y)\,S_i^{*}(f_x + f_y)$  (8)
The raw bispectrum can be estimated over the inner triangular region $0 \leq f_y \leq f_x$, $f_x + f_y \leq f_m/2$. This is sufficient for a complete description of the bispectrum since, due to symmetry in the f_x – f_y plane of the bispectrum, all of the significant information is contained in the principal domain that consists of the inner and outer triangles [6]. In addition to the basic bispectrum, the bispectrum diagonal slice is defined as:
$B(f, f) = E\left[S(f)\,S(f)\,S^{*}(2f)\right]$  (9)
with f_x = f_y = f. The bispectrum diagonal slice is especially useful in the detection of nonlinear effects.
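The following sketch shows one way such an estimate could be implemented (an assumption for illustration, not the authors' code): raw bispectra of windowed FFT segments are averaged as in Eqs. (6)-(8), and the diagonal slice of Eq. (9) is then read off the result; the segment length and window are arbitrary choices.

```python
import numpy as np

def bispectrum(x, nfft=256):
    """Average the raw bispectrum Bi(fx, fy) = Si(fx) Si(fy) Si*(fx + fy) over segments of x."""
    segs = [x[i:i + nfft] for i in range(0, len(x) - nfft + 1, nfft)]
    m = nfft // 2                                  # analyse frequencies up to fm/2
    fx, fy = np.meshgrid(np.arange(m), np.arange(m), indexing='ij')
    B = np.zeros((m, m), dtype=complex)
    for s in segs:
        S = np.fft.fft(s * np.hanning(nfft))       # windowed spectrum of one segment
        B += S[fx] * S[fy] * np.conj(S[fx + fy])   # raw bispectrum of the segment, Eq. (8)
    return B / len(segs)

def diagonal_slice(B):
    """Bispectrum diagonal slice B(f, f) with fx = fy = f, Eq. (9)."""
    return np.diagonal(B)

# biphase of Eq. (7): Qb(fx, fy) = angle of the complex bispectral estimate
# biphase = np.angle(bispectrum(signal))
```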
4 RESULTS OF LABORATORY EXPERIMENTS
The experiment was conducted at the FZG back-to-back test-bed. The test-bed consists of two toothed gears operating in a revolving power setup and it enables examination of both toothed wheels and gear lubricants. The diagram showing the test-bed is presented in Fig. 3. The shaft connecting the pinions is divided, which enables rotating one of its sections versus the other and thus introducing relevant meshing forces. Strain gauges are affixed to the shaft and they enable measuring the torque. Wheels with straight teeth are installed in the examined gear, while wheels with helical teeth are installed in the closing gear. Thanks to such a set-up it was the examined toothed gear that was subject to defect development during the experiment. Parameters of the test-bed:
- Maximum tensioning torque: 1200 Nm (or 1500 Nm for shafts with bigger torsional rigidity);
- Motor speed: 1460 rpm;
- Gear ratio in both toothed gears: 1.296;
- Module of test specimen wheels and counter-test specimen wheels: 4 mm;
- Number of teeth in test specimen wheels: 27;
- Number of teeth in counter-test specimen wheels: 35;
- Axle base for wheels: 125 mm.
Figure 3. Test-bed diagram: 1 - motor, 2 - clutch, 3 - closing gear, 4 - coupling shaft, 5 - prestressed clutch, 6 - examined toothed wheels, 7 - examined gear, 8 - shaft
Toothed wheels made of 20H2N4A carburized steel, hardened to 60 HRC, were used for the research. They were subjected to an accelerated fatigue test. Figs. 4 and 5 present the spectra of a vibroacoustic signal registered on the toothed gear's casing during the initial phase of the experiment.
Figure 4. Frequency structure of vibroacoustic signal of wheel no. 7
Figure 5. Frequency structure of vibroacoustic signal of wheel no. 4
One can note substantial differences in the spectra which are not justified by reaching the subsequent phases of defect evolution but result from characteristic features of the contact of various pairs of teeth.
This illustrates the problem of the effectiveness of diagnostic observation results in the task of diagnosing early stages of defect development. The paper presents an example of the evolution of the vibroacoustic signal envelope, precisely of the fifth harmonic of the meshing frequency, calculated for a frequency band width corresponding to twice the frequency of the input shaft. Fig. 6 presents the values of the shape parameter corresponding to these changes. Let us note that this parameter depends on defect development, while the value of the shape factor in particular does not change monotonically. In order to achieve higher efficiency in applying the results of vibroacoustic diagnosis, we should take seriously into account the individual vibroacoustic characteristics which were defined during preliminary measurements and analysis.
Figure 6. Shape parameter as a function of measurement number for wheel no. 7
Figure 7. Bispectrum diagonal slice at the beginning of the investigation of wheel no. 7
Figure 8. Bispectral maximum at the beginning of the investigation of wheel no. 7
Figure 9. Bispectral triangular maximum at the beginning of the investigation of wheel no. 7
The changes which accompany the subsequent phases of development of fatigue-related defects are observable in the bispectrum. Particularly interesting results have been obtained for the diagonal bispectral measure (Fig. 7 and Fig. 10), for the maximum bispectral measure (Fig. 8 and Fig. 11) and for the measure created from the vector of maximum values of the triangular matrix separated from the bispectrum matrix by removing its main diagonal (Fig. 9 and Fig. 12). Figs. 10-12 present the results of bispectral analysis up to the time when a fatigue-related crack emerges. As a result, the phase reactions defined by the dominant non-linear effect become blurred. The results point to the high sensitivity of bispectral measures to changes of the signal's frequency structure and to the possibility of using these relations while constructing models of the development of degradation-and-fatigue-related processes, which are required while creating the procedures of proactive maintenance strategies.
Figure 10. Bispectrum diagonal slice at the end of the investigation of wheel no. 7
Figure 11. Bispectral maximum at the end of the investigation of wheel no. 7
Figure 12. Bispectral triangular maximum at the end of the investigation of wheel no. 7
The next step was to create a new measure able to predict the moment of the fatigue tooth crack. The integral of bispectral noise from the bispectral maximum diagrams (Fig. 13) and the integral of bispectral noise from the bispectral triangular maximum diagrams (Fig. 14) were calculated, both with a maximum level of 0.5E8 [m/s2] (everything higher than this level was equalised to the maximum level), for the full lifetime of this wheel. Figures 13 and 14 show that by calculating the derivative of these diagrams (after applying curve smoothing) we can build an effective and sensitive diagnostic parameter of the qualitative changes of the fatigue process of toothed wheel damage; a sketch of this step is given below.
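A hedged sketch of this clipping-and-integration step follows; only the 0.5E8 clipping level is taken from the text, while the smoothing window, function name and use of a moving average are assumptions for illustration.

```python
import numpy as np

def bispectral_noise_trend(noise, clip_level=0.5e8, smooth_window=25):
    """Clip the bispectral-noise values at clip_level, integrate them over the test history,
    and return a smoothed derivative of that integral as a trend (diagnostic) parameter."""
    clipped = np.minimum(noise, clip_level)           # values above the maximum level are equalised to it
    integral = np.cumsum(clipped)                     # running integral over consecutive measurements
    kernel = np.ones(smooth_window) / smooth_window   # simple moving-average smoothing of the integral
    smoothed = np.convolve(integral, kernel, mode='same')
    return np.gradient(smoothed)                      # derivative used as the diagnostic parameter
```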
Figure 13. Integral of bispectral noise from bispectral maximum diagrams, full investigation of wheel no. 7
Figure 14. Integral of bispectral noise from bispectral triangular maximum diagrams, full investigation of wheel no. 7
5 CONCLUSIONS
The work shows that it is possible to diagnose changes of the condition of objects by means of vibroacoustic techniques under the assumption of significantly small energy dissipation. The presented approach not only correctly explains and defines the phenomena from the qualitative point of view but also enables their quantitative evaluation, while maintaining, for defined conditions, a satisfactory consistency. A significant practical advantage of the presented approach is that it allows starting the proactive maintenance strategy without having to develop detailed deterioration models of the objects. One can start managing one's facilities with a set of phase-coupling-based bispectral measures, which are not only very sensitive to changes of the frequency structure of the vibroacoustic signal but are also sensitive to changes of the kind of nonlinearity connected with the analysed phenomena. The analysis of bispectral noise changes can be an effective and sensitive diagnostic parameter of the qualitative changes of the fatigue process of toothed wheel damage.
6 REFERENCES
1 Muller A., Suhner M.C., Iung B., (2008) Formalisation of a new prognosis model for supporting pro-active maintenance implementation on industrial system, Reliability Engineering & System Safety, 93, 234–253.
2 Muller A., Suhner M.C., Iung B., (2007) Maintenance alternative integration to prognosis process engineering, Journal of Quality in Maintenance Engineering (JQME), Vol. 2.
3 Radkowski S., (2008) Vibroacoustic monitoring of mechanical systems for proactive maintenance, Diagnostyka, 3(47), 157–164.
4 Radkowski S., Smalko Z., Piętak A., Woropay M., (2006) Use of bispectral analysis in condition monitoring of machinery, Proceedings of Third European Workshop Structural Health Monitoring, 627–634.
5 Han T., Yang B.S., (2006) Development of an e-maintenance system integrating advanced techniques, Computers in Industry, 53, 569–580.
6 Yang W., (2001) Towards Dynamic Model-Based Prognostics for Transmission Gears, SPIE Conference Proceedings, 4733, 157–167.
Acknowledgments: Scientific research project financed from the funds for scientific research for the years 2008-2011.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
LOCAL MESHING PLANE AS A SOURCE OF DIAGNOSTIC INFORMATION FOR MONITORING THE EVOLUTION OF GEAR FAULTS Mączak J. Institute of Automotive Engineering, Warsaw University of Technology, Narbutta 84, 02-524 Warszawa, Poland.
In the paper a new method of assessing the evolution of gear failures is proposed. The method is based on the calculation of the instantaneous energy density of the vibroacoustic signal's envelope for consecutive meshes and allows for acquiring information about the mesh quality and the disturbances of the meshing process for particular teeth pairs. For this purpose a local meshing plane "pinion tooth – gear tooth" concept is proposed, permitting the observation of the energy density changes for the consecutive teeth (or teeth pairs) during normal exploitation of the gearbox. The local meshing plane allows finding in the signal the local disturbances caused by meshing errors or developing fatigue-related failures like pitting or gear cracks, and linking them to the meshes of particular teeth pairs. In the paper the theoretical background of the method as well as the possibilities of different ways of signal analysis are presented, along with experimental results taken on the back-to-back test stand during a fatigue test. The method is relatively easy to implement in an on-line monitoring diagnostic system as it is neither based on spectral analysis nor requires synchronous averaging of the signal. It could be used for assessing the manufacturing quality of gears and the assembly quality, as well as for gear failure evaluation during normal exploitation. Key Words: Gear diagnostics, signal envelope, instantaneous energy density, local plane
1 INTRODUCTION
Diagnosing the early stages of defect development is the area of machine diagnostics that attracts increasingly wide interest and expectations of broadening the range of applications. This refers to studying the process of generation and transmission of the diagnostic information, thus opening the window of opportunity of controlling the damage risk level and applying a more reliable prognosis of the assembly's proper functioning period. From this point of view the most important is the early detection of damage of the kinematic nodes of the diagnosed objects. The need for selecting such a strategy is confirmed by statistical data. For example, according to [1], in industrial stationary gearboxes 60% of all the damage involves the gear teeth and 19% involves bearings. Additionally, 50% of teeth damage is caused by fatigue degradation processes of the contact surfaces and the subsurface layer. The need for diagnosing defect development, and particularly the development of fatigue damage like cracks and pitting in rotating machines, forced the development of new methods of evaluating the durability of critical machines. Typical methods used for assessing the technical state of mating kinematic pairs with locally damaged surfaces of the machine elements are based on spectral analysis, typically on detecting changes in the averaged Fourier Transform of the acceleration signals recorded during machine exploitation. The research conducted so far on gearbox durability [2,3,4,5] indicates that the developing damage of the gears is accompanied by measurable changes of the skewness and flattening of the signal's probability density function. From the diagnostic point of view this leads to the possibility of finding in the signal the disturbances caused by the developing fatigue defect. However, in their pre-nucleus period, these damages have very low energy and cause impulse disturbances of the signal, resulting in a wideband signal response of low amplitude that is very hard to detect. As a result these methods are usually unable to locate the defects in their early phase and to tell which parts of the elements are subjected to progressive degradation. Other, rather simple, methods are based on trend analysis of signal discriminants like the RMS, crest factor etc. The averaged manner of these calculations and, in the case of spectral analysis, the requirement for stationarity of the waveform cause problems with exact localisation of the developing failure.
There exists another group of methods allowing more precise distinction of the place and the size of the damage. These methods are either based on analytical signal analysis, especially on the analysis of its momentary amplitude (envelope) and phase calculated with use of the Hilbert Transform [6,7,8], or on time-frequency analysis [9,10]. In contrast to the methods based on the Fourier Transform, which analyse the spectral components, the methods based on the Hilbert Transform allow more precise distinction of the place and the size of the damage. In the case of the gearbox these methods allowed for precisely locating the damaged tooth and determining the influence of the manufacturing quality on the gearbox behaviour [11,12,13]. The method described in this paper belongs to the latter group. It allows for acquiring information about the developing damage and is based on the concept of the local plane. The local plane shows the changes in the signal parameters as a function of the place of their origin in the signal with respect to the machine's kinematics. It means that the location of the disturbances of the signal parameters is well correlated with the machine kinematics and allows for precise localisation of the damaged element. Additionally, the dependency on worktime allows for observing the trends of the signal changes and prognosing the development of the failure.
2 LOCAL MESHING PLANE CONCEPT
The concept of the local meshing plane was proposed in [12,13] and the theoretical background was presented in [14]. It is an extension into the second dimension of the method of synchronous averaging widely used in signal processing [15]. As synchronous averaging allows us to observe the averaged time signal in one dimension (time) during a single shaft rotation, the local plane extends this possibility into the second dimension. For the gearbox this means that we can observe the signal simultaneously in the domain of two shaft rotation angles and link the phenomena occurring in the signal to the rotation of each shaft. As stated above, the method is an extension of synchronous averaging. The basis of the method is to link the parts of the signal to the kinematics of the machine. Let us observe the vibroacoustic signal emitted by a two-shaft gearbox, interpreted as a representation of the energy emitted by the set of cooperating teeth of the pinion and the gear. Assuming that we have N pinion teeth and M gear teeth, we have N x M different pairs of mating teeth. The teeth mating is the main source of the gear signal and as such the gear signal contains the information about the quality of the mating teeth. In the case of the conducted experiment (described below), N=27 and M=35, giving 945 different teeth meshing pairs. Applying a trigger probe to the gearbox, we can assume that the starting position of the gearbox shafts is known and thus it is possible to observe the history of the mating of each teeth pair. The time signal Xij(t) emitted by a selected teeth pair can be treated as a source of information about the mating quality of this pair and, if compared to the historical data, will reveal the trend of changes of the signal parameters, allowing to forecast the durability of particular pinion/gear teeth. This leads to the concept of a local plane of all the pairs (ni,mj) that will show the changes of the signal parameters as a function of their occurrence in the signal, allowing easy comparison with the measurement data taken during the earlier exploitation of the machine. The placement of the disturbances is well correlated with the kinematics of the machine and allows localisation of the damage. Simultaneously, the dependency on worktime allows for observation of trends of changes in the signal and prognosing the further defect development.
Figure 1. Gearbox time signal presented on the plane (ni,mj). Straight arrows show consecutive parts of the signal for the following teeth pairs. Graphs denote modulo N and modulo M transitions on the plane [14].
Looking at the signal that is created by the consecutive teeth pairs (ni,mj) of a gearbox with theoretical geometry not disturbed by manufacturing processes, material stiffness changes or load changes, one can discover that the observed signal and, in consequence, its statistical measures are functions of a single real variable – time (Fig. 1). However, observation of the space introduced here, the local plane of all the pairs (ni,mj), shows that these measures behave like a continuous function of two independent variables (t1,t2) parametrising this plane (with the identifications $n_i = n_{i+N}$ and $m_j = m_{j+M}$). As a result, the values characterizing the condition of the single-stage gearbox are equally dependent on the closest point of coordinates (ni,mj), in the sense of time interval, but also on possibly further ones (ni±1,mj) and (ni,mj±1). In practice that means that we should observe the mating of several teeth close to each other in their position on the gears but located in different parts of the generated time signal. The creation of the local plane (ni,mj) from the vibroacoustic signal is relatively simple. The only requirement is the simultaneous registration of a trigger signal containing one impulse per revolution of the pinion shaft (or alternatively of the gear shaft). Using triggers on both shafts allows for non-continuous registration of the signals and later finding the same initial pinion/gear positions. The recorded signal has to be preprocessed in the following manner (a minimal code sketch of the resulting tabulation is given after this list):
• Resampling (simple spline interpolation will suffice) so that every revolution contains the same number of samples, thus eliminating the effect of rotational frequency changes and allowing simple tabularisation and comparison of single shaft rotation signals. The number of samples per pinion shaft revolution should be divisible by the number of pinion teeth.
• Division of the resampled signal into sections of the same length K (in number of samples) corresponding to the times of entering pinion-tooth contact. This is the time equivalent of the base pitch and corresponds to the period of one revolution divided by the number of pinion teeth:
$T = \frac{1}{f_{o1}\,N}\ \ \mathrm{[s]}$  (1)
The length of the selected sections will correspond to the time between entrances of successive teeth pairs into contact. Alternatively, one can choose it with respect to the duration of the expected theoretical path of contact (time between entering and leaving contact). In this case the local plane will contain redundant data because the contact ratio is greater than one, which means that the same parts of the signal will be connected to several teeth.
• Selected signal sections corresponding to the same tooth-contact pairs (ni,mj) can be averaged to eliminate randomness from the signal and then presented on a graph in coordinates ni × mj (Fig. 2b). Please note that the local plane constructed this way is always continuous in one direction (i.e. time) and discrete in the other, as it is impossible to obtain all the positions of two teeth against each other (i.e. during proper operation two tooth tips cannot meet).
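As referenced above, the following is a minimal sketch (an assumed implementation, not the author's code) of how a resampled, trigger-aligned signal could be tabularised onto the local plane (ni,mj); the function name and interface are assumptions.

```python
import numpy as np

def local_meshing_plane(signal, N, M, K):
    """Arrange a resampled, trigger-aligned signal on the (pinion tooth, gear tooth) plane.

    signal : 1-D array, resampled so that each base pitch contains exactly K samples
    N, M   : numbers of pinion and gear teeth (N=27, M=35 in the experiment described below)
    K      : samples per tooth contact
    Returns an (N, M, K) array; plane[i, j] holds the averaged section for teeth pair (i, j).
    """
    n_sections = len(signal) // K
    plane = np.zeros((N, M, K))
    counts = np.zeros((N, M))
    for s in range(n_sections):
        i, j = s % N, s % M                        # modulo-N / modulo-M transitions on the plane
        plane[i, j] += signal[s * K:(s + 1) * K]   # accumulate the section for pair (i, j)
        counts[i, j] += 1
    counts[counts == 0] = 1                        # pairs not yet observed stay zero
    return plane / counts[..., None]               # average over repeated matings of each pair
```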
Looking at Fig. 1, one can note the apparent discontinuities in the signal caused by the modulo N and modulo M transitions. Moreover, after N × M contacts the signal returns to the first teeth pair (1,1). This leads to the interesting observation that the signal Xij(t) from Fig. 1 could be analysed on a T2 torus. In this case the grid points (ni, mj) ∈ T2 are a discrete subset of the two-dimensional torus (Fig. 3). The geometry of the torus grid T2 is determined by the cyclicity of the process (with a period proportional to N × M).
Figure 2. Diagram showing mating of teeth in the single stage gearbox (a) and a concept of pinion/gear local plane (b) [14]
The signal recorded during the work of the gearbox is marked on Fig. 3 as points representing time samples. In Fig. 3 the white points represent the theoretical moments of entering of a new teeth pair into contact (a total of N x M contacts), while the dark points are the signal data samples (K samples for each base pitch). This representation shows the possibility of analysing the same time points of the signal for one tooth (e.g. of the pinion) and different teeth of the second gear, or of analysing the same points for a specific teeth pair in time, thus limiting the effect of signal averaging occurring during typical order analysis (order analysis averages the signal over several revolutions of the same shaft) (Fig. 4).
Figure 3. Representation of a signal on a T2 torus. White points represent teeth meshing points, dark points represent signal time samples (K samples for every resampled base pitch).
Figure 4. Signal (simulated) plotted on a torus
3 EXPERIMENTAL VERIFICATION OF THE METHOD ON BACK-TO-BACK TESTER
3.1 Experimental setup
The proposed method was verified during an experiment on the back-to-back tester used for examining the durability of toothed wheels. The test-bed was equipped with a two-channel telemetric system which enables transmission of time data from rotating elements. During the experiment the following signals were recorded with simultaneous sampling:
• accelerations of the housing vibration in two directions,
• torque variations on the pinion shaft,
• stress at the base of one of the pinion's teeth,
• synchronizing pulses from the induction sensor placed on the pinion shaft.
The stress at the tooth base was measured with the use of a strain gauge placed at the side of a tooth base. The shaft torque was measured by a strain gauge installed on the pinion shaft. The examined gearbox was equipped with spur gears, with a 4 mm module, having 27 teeth in the pinion and 35 teeth in the gear. During the accelerated test the toothed wheel was loaded with a torque of ~1300 Nm. The experiment was conducted until full breakage of a pinion tooth. The whole experiment lasted 72 min. The registration and on-line analysis of the signals was performed continuously in 6-second data blocks with use of National Instruments equipment consisting of an NI-PXI 8186 measuring computer equipped with an NI-PXI 4472B DSA measuring card. A total of 720 data blocks were recorded and analysed on-line. The sampling frequency was set to 20 kHz and the rotational frequency of the pinion shaft was ~25 Hz. The system was controlled by a LabVIEW 8.6 programme. Upon completing the experiment it was discovered that pinion tooth no. 11 (the fourth tooth, counting from the synchronizing pulse) became fractured. Fortunately, as the gear was not specially prepared, the broken tooth was the one with the glued strain gauge!
3.2 Experimental results
The registration of the signals was performed continuously (i.e. without any breaks) in data blocks of 6 s in length. Each data block was pre-processed according to the following rules:
• spline resampling to obtain an equal number of samples for each pinion revolution (810 samples per pinion revolution, K=30 samples between consecutive meshes) (Fig. 6);
• dividing the resampled signal into parts of length K and tabularizing them according to the numbers of the pinion/gear teeth (ni,mj).
Closer observation of the results shows that the pinion shaft torque (Fig. 5) as well as the averaged (in 6 s blocks) peak and RMS acceleration values (Fig. 6 and Fig. 7) were kept at a stable level during the second, crucial part of the experiment. The results presented show no significant value increase except for the last few minutes before the complete tooth break. It is interesting that the tooth strain gauge (unscaled exemplary waveform in Fig. 9) shows a 10% decrease (Fig. 8), probably due to tooth wear during the experiment. This observation is contrary to the shaft torque recordings, which show no change during the second part of the experiment. In Fig. 10 the averaged one-revolution waveforms of the housing accelerations (signal and its envelope) are shown, with the first visible sign of the tooth crack initiation after 3900 s (out of 4300) of the experiment. Although the changes in the signal are visible, their energy is so small that the overall level remains unchanged nearly to the end of the experiment.
Figure 5. Averaged peak pinion shaft torque changes during the experiment. Time axis is scaled in 6 s blocks. Averaging time: 6s
Figure 6. Averaged peak acceleration values of the gear housing
Figure 7. Peak acceleration values of the gear housing
Figure 8. Peak strain (unscaled) in the tooth base
Fig. 9. Waveform of tooth base strain (unscaled) during three consecutive revolutions (also visible are the trigger impulses).
Fig. 10. Acceleration waveforms (data blocks 600 and 650 out of 720) showing initiation of the crack.
4 LOCAL MESHING PLANE ANALYSIS
The local meshing plane (ni,mj) analysis described above allows transforming the signal (or its statistical parameters) into a new "pinion teeth" – "gear teeth" coordinate system. If the signal on the local plane shows some regularities connected to the kinematics of the analysed gear, then they can be linked to gear eccentricity, pitch variation, runout or teeth fatigue. It is therefore possible to tabularize the waveforms for each mating pair; this is only limited by the storage memory. For example, in Fig. 11 the pinion shaft squared torque envelope variations on the local plane are presented, showing which sections of the pinion and gear are loaded more than others. This local plane was prepared for the #1000 mating of the teeth (out of ~3050 recorded matings of each pair). It can be compared with the same local plane view (Fig. 12) taken just before the complete tooth break (#3020 mating). Visible are torque variations linked to the 4th pinion tooth (cracked).
Figure 11. Squared envelope of pinion shaft torque waveforms on the local plane for the mating #1000 of pinion/gear pair.
Figure 12. Squared envelope of pinion shaft torque waveforms on the local plane for the mating #3020 of pinion/gear pair.
To better observe the small changes in the signal caused by the early stage of the crack in the tooth base, visible in Fig. 10, let us adopt the procedure of the Envelope Contact Factor (ECF) described in [11,12,13,14]. ECF enables comparison of subsequent contacts of teeth in a toothed gear with each other. It is calculated as the difference of the squared values of the envelope waveforms for adjacent contacts at consecutive moments of time, that is
$ECF = H(n_i, m_j)^2 - H(n_{i+1}, m_{j+1})^2$  (2)
where H denotes the Hilbert Transform. The squared envelope is used for its lower frequency content and the physical interpretation as signal energy. ECF enhances the differences between the mating of teeth (meshing force) in adjacent contacts caused by pitch errors, differences in tooth rigidity, fatigue-related defects and all the other shortcomings related to imprecise manufacturing. In particular, changes of tooth stiffness are well visible, as they cause a smoother contact on one tooth pair that results in a bigger impact on the next pair. A minimal sketch of this calculation is given below.
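The sketch below is an assumed implementation of the ECF of Eq. (2), not the author's code; it reuses the (N, M, K) local-plane array from the earlier sketch, and summing the difference over the base pitch is an illustrative choice.

```python
import numpy as np
from scipy.signal import hilbert

def envelope_contact_factor(plane):
    """ECF between each contact (ni, mj) and the next contact (ni+1, mj+1) in the meshing sequence.

    plane : (N, M, K) array of signal sections on the local meshing plane (see earlier sketch).
    """
    N, M, _ = plane.shape
    env_sq = np.abs(hilbert(plane, axis=-1)) ** 2       # squared envelope of every K-sample section
    ecf = np.zeros((N, M))
    for i in range(N):
        for j in range(M):
            nxt = env_sq[(i + 1) % N, (j + 1) % M]      # adjacent contact, with modulo wrap-around
            ecf[i, j] = np.sum(env_sq[i, j] - nxt)      # aggregate difference over the base pitch
    return ecf
```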
Figs. 13 and 14 show the (not averaged) signals from Fig. 10 post-processed according to (2). As is clearly seen, the changes introduced by the early stage of crack growth are well visible. Moreover, they can be linked to the particular pinion tooth. One of the advantages of the local plane analysis is the ability to present the trends of the parts of the signal linked to a particular mesh contact. This means we are able to present the historical changes of the waveforms of a particular teeth pair, e.g. (1,1), or of several consecutive pairs: (1,1), (2,2), (3,3), etc. This allows finding in the signal the changes that would otherwise be masked in the averaged signal by the mating of other teeth. An example of the squared envelope of the acceleration signal with a visible trend of changes is shown in Fig. 15 for the contacts (3,1)->(6,4). The rising trend in the signal can be found around point no. 72. The trend is presented in Fig. 16.
Fig. 13. ECF on the local plane after 600 data blocks (3600s).
Fig. 14. ECF on the local plane after 650 data blocks (3900s).
Fig. 15. Time history of contacts (3,1)->(6,4) with visible rising trend marking cracked pinion tooth.
5 CONCLUSIONS
The local meshing plane (ni,mj) allows for determining the quality of contact of single teeth pairs in the gearbox, thus allowing for precise location of gear faults. Representing the base-pitch part of the path of contact as a K-element time vector allows for observation of a family of K-dimensional random variables indexed with the numbers of the mating teeth (i,j). For such pre-processed signals different types of analysis can be adopted, including trend analysis of the instantaneous energy density for each teeth pair, analysis of trend changes of the probability density functions for each teeth pair, and methods of chaos analysis.
Moreover, the calculations can be done on-line provided that there is enough time between acquiring the data blocks. For the chosen 6 s data blocks the analyses were performed on-line on a typical industrial DAQ PXI computer. The described method of the local plane could be a useful complement to currently used machinery quality acceptance procedures. The proposed method is non-invasive and requires relatively simple equipment. It allows investigating an assembled gear working in its natural conditions. It is possible to detect manufacturing and assembly errors (such as gear-shaft misalignment, misalignment of bearing mountings, horizontal misalignment, pitch error, distance error, vertical misalignment) and to investigate the growth of their effects during exploitation. Observation on a local plane allows observation of tooth contact during normal work and selection of the worst pinion – gear tooth contact in terms of dynamic overload, the factor that is critical for determining the durability of the gear. Additionally, the local plane allows for detecting fatigue damage that occurs to the gears during exploitation.
Figure 16. Rising acceleration squared envelope trend for point 12 of contact (5,3) for all 3000 matings of this pair.
6 REFERENCES
1 Allianz. (1984) Handbuch der Schadensverhütung. Allianz Versicherungs AG.
2 Radkowski S., Zawisza M. (2004) Use of vibroacoustic signal for evaluation of fatigue-related damage of toothed gears. The 17th International Congress & Exhibition on Condition Monitoring And Diagnostic Engineering Management, COMADEM 2004.
3 Mączak J. (1998) The use of modulation phenomena of vibroacoustic signal in helical gear diagnosis. PhD Dissertation, Warsaw University of Technology (in Polish).
4 Zakrajsek J.J., Townsend D.P., Lewicki D.G., Decker H.J., Handschuh R.F. (1995) Transmission diagnostic research at NASA Lewis Research Center. NASA Technical Memorandum 106901.
5 Decker H.J. (2002) Crack detection for aerospace quality spur gears. NASA TM-2002-211492, ARL-TR-2682.
6 Randall R.B. (1982) A new method of modelling gear faults. Journal of Mechanical Design, 104, 259-267.
7 McFadden P.D. (1988) Determining the location of a fatigue crack in a gear from the phase of the change in the meshing vibration. Mechanical Systems and Signal Processing, 2(4), 403-409.
8 McFadden P.D., Smith J.D. (1985) A signal processing technique for detecting local defects in a gear from the signal average of vibration. Proceedings of the Institution of Mechanical Engineers, 199(C4), 287-292.
9 Loutridis C. (2004) A local energy density methodology for monitoring the evolution of gear faults. NDT and E International, 6(37), 447-453.
10 Loutridis C. (2006) Instantaneous energy density as a feature for gear fault detection. Mechanical Systems and Signal Processing, 20, 1239-1253.
11 Mączak J., Radkowski S. (2002) Use of envelope contact factor in fatigue crack diagnosis of helical gears. Machine Dynamics Problems, 26, 115-122.
12 Mączak J. (2003) On a certain method of using local measures of fatigue-related damage of teeth in a toothed gear. COMADEM.
13 Mączak J. (2005) A method of detection of local disturbances in dynamic response of diagnosed machine element. Condition Monitoring, Cambridge.
14 Mączak J. (2009) Evolution of the instantaneous distribution of energy density on a local meshing plane as the measure of gear failures. The 8th International Conference on Reliability, Maintainability and Safety, Chengdu, China (in print).
15 Bonnardot F., El Badaoui M., Randall R.B., Daniere J., Guillet F. (2005) Use of the acceleration signal of a gearbox in order to perform angular resampling (with limited speed fluctuation). Mechanical Systems and Signal Processing, 19, 766–785.
Acknowledgments The author gratefully acknowledges the financial support from the Polish Ministry of Science and Higher Education, scientific project for the years 2008-2010.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
SUPPORT VECTOR MACHINE AND DISCRETE WAVELET TRANSFORM METHOD FOR STRIP RUPTURE DETECTION BASED ON TRANSIENT CURRENT SIGNAL S.W. Yang a, A. Widodo b, W. Caesarendra a, J.S. Oh a, M.C. Shim a, S.J. Kim a, B.S. Yang *a and W.H. Lee c
a School of Mechanical Engineering, Pukyong National University, san 100, Yongdang-dong, Nam-gu, Busan 608-739, South Korea.
b Mechanical Engineering Department, Diponegoro University, Tembalang, Semarang 50275, Indonesia.
c POSCO Technical Research Laboratories, No. 1, goidong-dong, Nam-gu, Pohang, Kyeongsangbuk-do.
This paper proposes a fault diagnosis method for a 6-high cold rolling mill, which consists of 5 stands, to assess the normal and fault conditions. The proposed method concerns strip rupture fault diagnosis based on the transient current signal. Firstly, a signal smoothing technique is performed to highlight the fundamental of the transient signal in the normal and fault conditions. Then the smoothed signal is subtracted from the original signal in order to transform the original data into useful data for further analysis. Next, the discrete wavelet transform (DWT) method is performed to obtain the detail signal. Moreover, features are calculated from the detail signal of the DWT and then extracted using principal component analysis (PCA) and kernel principal component analysis (KPCA) for dimensionality reduction purposes. Finally, using a support vector machine (SVM) for classification, the results of stand 5 show clearer classification compared with the other stands. Key Words: Cold rolling mill, Strip rupture, Transient analysis, Wavelet transform, Support vector machine
1 INTRODUCTION
To increase productivity at low maintenance cost in the cold rolling mill industry, the quality of the steel strip is taken into account as a first priority. The quality of rolling mill products can be determined by the uniformity of the movement of the work rolls in contact with the strip. As the roll speed increases, the transient current signal becomes the signal of interest to be analysed rather than the steady-state signal. It carries information related to speed, vibration, force and thickness deviation. There are several types of damage that occur in a cold rolling mill due to the high roll speed as the steel is flattened to a desired thickness [1]. In this paper, strip rupture is considered as one of the steel strip damages which frequently occurs.
Figure 1 6 high cold rolling mill
The fault diagnosis method for the 6-high cold rolling mill presented in Figure 1, using SVM and DWT based on the transient current signal, is presented in this study. Previous work has discussed the utilization of the wavelet transform and SVM for induction machines based on the transient signal [2]. Other articles that study the application of wavelets to transient signals have also been published [3, 4]. In this work, a signal smoothing technique is performed initially to highlight the non-stationary fundamental of the transient signal in the normal and fault conditions. Then the smoothed signal is subtracted from the original signal in order to transform the original signal into a useful signal for further analysis. Next, the DWT method is performed to obtain the detail signal. Moreover, nine features are calculated using time domain, frequency domain and entropy domain feature calculation formulas. To reduce the dimensionality and extract the useful features, PCA and kernel PCA are utilized. Finally, using SVM for classification, the result of stand 5 is clearly classified compared with the other stands.
2 BACKGROUND KNOWLEDGE
2.1 Wavelet transform
The wavelet is known as a good tool for the analysis of non-stationary signals having transient conditions [5]. The wavelet transform decomposes a concerned signal into a linear combination of time-scale units. The decomposition process is performed according to the translation of the mother wavelet (or wavelet basis function), which changes the scales and shows the transition of each frequency component [6]. There are many basic wavelet functions and each has its own characteristics. The most commonly used for the discrete wavelet transform is the Daubechies wavelet. Figure 2 shows the Daubechies basis function. The basis functions of a wavelet system are the scaling function φ(t) and the wavelet function ψ(t), which can be derived from a single scaling or wavelet function by scaling and translation. A scaling function φj,k(t), obtained by scaling and translating φ(t), is defined by the following equation:
$\phi_{j,k}(t) = 2^{j/2}\,\phi(2^{j}t - k)$  (1)
where j is the log2 of the scale and $2^{-j}k$ represents the translation in time. The wavelet function is then given by
$\psi_{j,k}(t) = 2^{j/2}\,\psi(2^{j}t - k)$  (2)
Figure 2 Daubechies basic function
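A hedged sketch of such a discrete wavelet decomposition with a Daubechies basis, using the PyWavelets package, is given below; the wavelet order, decomposition level and signal length are assumptions chosen for illustration, not the settings used in the study.

```python
import numpy as np
import pywt

# stand-in for the one-minute residual current signal (sampling rate assumed)
residual = np.random.randn(60 * 20_000)

# multilevel DWT with a Daubechies wavelet: returns [cA7, cD7, cD6, ..., cD1]
coeffs = pywt.wavedec(residual, 'db4', level=7)
approx, details = coeffs[0], coeffs[1:]   # the detail signals are used for feature calculation
```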
Feature calculation
A feature, in this study, can be defined as representative information about the machine condition [7]. In rotating machinery, the information about the current condition can be obtained by applying appropriate feature calculation formulas. Time domain, frequency domain and entropy domain feature calculations are considered in this study. Some of the feature calculation formulas are listed as follows:

Mean: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$, Root mean square: $x_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$, Kurtosis: $\beta_2 = \frac{\frac{1}{n}\sum_{i=1}^{n} x_i^4}{x_s^4}$, Crest factor: $CF = \frac{x_p}{x_{rms}}$  (3)
where xi is the ith time-history data point, n is the number of data points, xp is the peak value and xrms the root mean square value.
Frequency center: $FC = \frac{\int_0^{+\infty} f\,s(f)\,df}{\int_0^{+\infty} s(f)\,df}$, Root variance frequency: $RVF = \sqrt{\frac{\int_0^{\infty} (f - FC)^2\,s(f)\,df}{\int_0^{\infty} s(f)\,df}}$  (4)
where s(f) is the signal power spectrum.

Entropy estimation: $H(x) = -\int p(x)\,\ln p(x)\,dx$  (5)
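The following sketch shows how the features of Eqs. (3)-(5) could be computed; it is an illustrative implementation, and the FFT-based spectrum estimate and the histogram-based entropy are assumptions rather than the authors' choices.

```python
import numpy as np

def time_features(x):
    rms = np.sqrt(np.mean(x ** 2))
    xs = np.std(x)
    kurtosis = np.mean(x ** 4) / xs ** 4           # fourth moment normalised as in Eq. (3)
    crest = np.max(np.abs(x)) / rms                # peak value over RMS
    return np.mean(x), rms, kurtosis, crest

def frequency_features(x, fs):
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    s = np.abs(np.fft.rfft(x)) ** 2                # power spectrum s(f)
    fc = np.sum(f * s) / np.sum(s)                 # frequency center
    rvf = np.sqrt(np.sum((f - fc) ** 2 * s) / np.sum(s))   # root variance frequency
    rmsf = np.sqrt(np.sum(f ** 2 * s) / np.sum(s))          # root mean square frequency
    return fc, rvf, rmsf

def entropy_estimate(x, bins=64):
    p, _ = np.histogram(x, bins=bins, density=True)
    p = p[p > 0]
    width = (x.max() - x.min()) / bins
    return -np.sum(p * np.log(p)) * width          # discretised form of Eq. (5)
```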
Feature extraction
Due to the high dimensionality of the features, the accuracy of the classification process for fault diagnosis can decrease. Therefore, a feature extraction method is necessary to reduce the dimensionality (Figure 3). Feature extraction is a method to obtain new features based on transformation and combination of the feature calculation results [8]. In this paper, linear and nonlinear feature extraction methods are performed using PCA and KPCA, respectively.
Figure 3 Feature extraction
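A minimal sketch of the PCA/KPCA extraction step described above, using scikit-learn; the number of components, the kernel settings and the random stand-in data are assumptions, not the values used in the study.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

features = np.random.randn(22, 9)   # stand-in: 22 samples (11 normal + 11 fault) x 9 calculated features

pca_features = PCA(n_components=3).fit_transform(features)                      # linear extraction
kpca_features = KernelPCA(n_components=3, kernel='rbf', gamma=1.0).fit_transform(features)  # nonlinear extraction
```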
2.3 Support vector machine
SVM is a supervised classification method based on statistical learning theory. In SVM, the original input space is mapped onto a high-dimensional dot product space called a feature space, and in the feature space the optimal hyperplane is determined to maximize the generalization ability of the classifier. The maximal hyperplane is found by exploiting optimization theory and respecting the insight provided by statistical learning theory. Given input data xi (i = 1, 2, ..., M), M is the number of samples. The samples are assumed to have two classes, namely a positive class and a negative class. Each class is associated with labels yi = 1 for the positive class and yi = -1 for the negative class,
respectively. In the case of linearly separable data, it is possible to determine the hyperplane f(x) = 0 that separates the given data:

$f(x) = w^{T}x + b = \sum_{i=1}^{M} w_i x_i + b = 0$  (6)
where w is an M-dimensional vector and b is a scalar. The vector w and scalar b are used to define the position of the separating hyperplane. The decision function is made using sign f(x) to create a separating hyperplane that classifies input data into either the positive class or the negative class. A distinctly separating hyperplane should satisfy the constraints
$f(x) \geq 1 \ \ \text{if}\ \ y_i = 1, \qquad f(x) \leq -1 \ \ \text{if}\ \ y_i = -1$  (7)
or, in a single equation,
$y_i f(x_i) = y_i\left(w^{T}x_i + b\right) \geq 1 \quad \text{for } i = 1, 2, \ldots, M$  (8)
A detailed presentation of SVM theory can be found in Ref. [9].
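A minimal sketch of the classification step with an RBF-kernel SVM in scikit-learn is given below; the 14/8 train/test split follows the text, while the random stand-in data and the particular C and γ values (taken from the ranges reported in Tables 2-4) are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# stand-in feature vectors: 14 training and 8 testing samples, labels +1 (normal) / -1 (fault)
X_train, y_train = np.random.randn(14, 3), np.r_[np.ones(7), -np.ones(7)]
X_test, y_test = np.random.randn(8, 3), np.r_[np.ones(4), -np.ones(4)]

clf = SVC(kernel='rbf', C=128, gamma=8).fit(X_train, y_train)   # RBF kernel with assumed C, gamma
accuracy = clf.score(X_test, y_test)                            # classification accuracy on test data
n_support_vectors = clf.support_vectors_.shape[0]               # number of SVs, as reported in the tables
```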
3 RESULTS AND DISCUSSION
3.1 Signal preparation for wavelet
The flowchart of the proposed method is illustrated in Figure 4. Figure 5 shows the original transient current signal. The motor current signal was acquired during a one-minute sampling time. A signal processing step using a signal smoothing technique is initially used to process the original motor current signal, to reduce the effect of the line frequency and to pick up the transient signal easily. In order to obtain the useful signal, also called the "residual signal", that is required in the wavelet transform process, the smoothed signal is subtracted from the original signal (Figure 6); a minimal sketch of this step is given below. The results of the wavelet transform db1~db7 are shown in Figure 7. From the wavelet result it is still difficult to identify the normal and fault conditions, even though details up to level 7 have been applied.
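A hedged sketch of the smoothing-and-subtraction step referenced above; the use of a moving-average smoother and its window length are assumptions for illustration, not the smoothing technique reported by the authors.

```python
import numpy as np

def residual_signal(current, window=501):
    """Subtract a smoothed (moving-average) version of the motor current from the original
    signal, leaving the transient 'residual signal' used for the wavelet analysis."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(current, kernel, mode='same')   # low-frequency trend of the current
    return current - smoothed                               # residual carrying the transient content
```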
Figure 4 Flowchart of proposed method
Figure 5 Original transient current signal: (a) normal condition, (b) faulty condition
Figure 6 Residual current signal: (a) normal condition, (b) faulty condition
Figure 7 Wavelet transform for residual signal: (a) normal condition, (b) faulty condition
3.2 Feature calculation and extraction
To obtain useful information from the wavelet result, nine features are calculated from the original data as presented in Table 1. The three best features, namely RMS, crest factor and root variance frequency, are selected manually based on the maximum separation distance of each feature. Figure 8 shows the result of the feature calculation using the three best features. Blue circles and red triangles mean normal and fault data, respectively. The total number of data is 22, consisting of 11 normal data and 11 fault data. The result shows that stand 5 forms better clusters than the other stands. As shown in Figure 8, the clustering of the calculated features is not yet satisfactory because some normal features overlap the fault feature region. Therefore, feature extraction using PCA and KPCA is employed in order to obtain the best features, called the best principal components. The feature extraction results are plotted in Figure 9. Blue circles and red stars mean normal and fault data, respectively. As shown in Figure 9, using the feature extraction techniques, PCA and KPCA have similar performance and the normal and fault features are not identified clearly.

Table 1 Selected features

Domain      Features
Time        Mean; RMS; Kurtosis; Crest factor
Frequency   Frequency center; Root variance frequency; Root mean square frequency
Entropy     Entropy estimation; Entropy estimation error
3.3 Feature classification
The classification results of SVM are presented in Tables 2-4. The total number of training data is 14 and of testing data is 8. The classification test is executed ten times and the top three results with the best accuracy are indicated. The classification result using SVM and the calculated features is presented in Table 2. Using the calculated feature data, the best performance is accomplished in stand 5 and the accuracy is 87.5%. It means that stand 5 carries information for the classification of normal and fault conditions. Table 3 shows the detailed classification results using SVM and PCA feature extraction. The best classification is again achieved in stand 5 and the accuracy is 87.5%, identical to that obtained using SVM and the calculated features; compared with the previous result, this method performs similarly. The last method for classification uses SVM and KPCA feature extraction. Table 4 shows the classification results using this method. The best result of SVM and KPCA feature extraction is also obtained for stand 5, but it appeared only once in ten runs. Considering the frequency of this accuracy, the result is worse than the previous two. Compared with the classification using SVM and PCA, the classification using SVM and KPCA achieved worse performance due to an improper kernel parameter; in this work, the kernel parameter γ = 1 is used. In Table 4, the proper parameters for the RBF kernel function are 128 and 1 for C and γ, respectively.
Figure 8 Feature calculation (Stand 1~5)
Figure 9 Feature extraction using PCA and KPCA

Table 2 Classification results using SVM and feature calculation

No. of stand   Accuracy (%)   Number of SVs   CPU times (s)   RBF kernel C   RBF kernel γ
Stand 1        50.0           12              0.00177         512            2
               50.0           11              0.00151         4096           4
               50.0           11              0.00140         512            2
Stand 2        62.5           12              0.00145         64             0.25
               62.5           12              0.00162         64             0.25
               62.5           12              0.00147         64             2
Stand 3        75.0           10              0.00230         8192           16
               75.0           11              0.00164         64             8
               75.0           12              0.00119         1024           8
Stand 4        62.5           12              0.00173         4096           8
               62.5           12              0.00161         256            128
               50.0           11              0.00164         512            2
Stand 5        87.5           9               0.00156         512            16
               87.5           12              0.00172         256            4
               87.5           12              0.00167         128            8
Table 3 Classification results using SVM and PCA feature extraction

No. of stand   Accuracy (%)   Number of SVs   CPU times (s)   RBF kernel C   RBF kernel γ
Stand 1        50.0           11              0.00148         1024           2
               50.0           11              0.00238         1024           2
               50.0           11              0.00154         4096           8
Stand 2        62.5           12              0.00103         64             1
               62.5           11              0.00131         64             0.25
               62.5           12              0.00142         64             0.25
Stand 3        75.0           10              0.00177         64             8
               75.0           12              0.00155         64             2
               75.0           10              0.00186         64             0.5
Stand 4        62.5           12              0.00153         256            32
               62.5           12              0.00114         512            4
               50.0           12              0.00193         256            64
Stand 5        87.5           10              0.00144         512            16
               87.5           9               0.00150         64             2
               75.0           12              0.00166         64             2
Table 4 Classification results using SVM and KPCA feature extraction

No. of stand   Accuracy (%)   Number of SVs   CPU times (s)   RBF kernel C   RBF kernel γ
Stand 1        50.0           11              0.00161         128            0.5
               50.0           12              0.00177         128            4
               50.0           12              0.00183         1024           8
Stand 2        62.5           12              0.00150         64             2
               62.5           12              0.00264         64             2
               62.5           12              0.00160         256            2
Stand 3        75.0           12              0.00128         128            8
               75.0           12              0.00195         64             0.5
               50.0           11              0.00156         512            2
Stand 4        50.0           12              0.00134         512            2
               50.0           12              0.00164         512            2
               50.0           12              0.00183         256            128
Stand 5        87.5           12              0.00190         128            1
               75.0           12              0.00271         64             4
               62.5           12              0.00135         64             4
4 CONCLUSIONS
The fault diagnosis method applied to a 6-high cold rolling mill consisting of 5 stands has been studied. Since the transient current signal contains a non-stationary fundamental, it needs to be removed before classification. In this work, a smoothing process and the DWT are used to highlight the fundamental of the transient signal in the normal and fault conditions. Feature calculation and extraction using component analysis is performed, and salient classification of normal and fault is more clearly identified in stand 5 compared with the other stands. It means that faults can be detected at an early stage if stand 5 is monitored initially. The detailed classification of normal and fault conditions needs to be improved in order to achieve acceptable results for various rolling conditions. To obtain the best classification between normal and fault conditions, appropriate preprocessing and classification methods for the transient signal are needed.
5 REFERENCES
1 Mackel J. (1999) Condition monitoring and diagnostic engineering for rolling mills. International congress of COMADEM.
2 Widodo A and Yang BS. (2008) Wavelet support vector machine for induction machine fault diagnosis based on transient current signal. Expert Systems with Applications, 35(1-2), 307-316.
3 Douglas H, Pillay P and Ziarani A. (2004) A new algorithm for transient motor signature analysis using wavelet. IEEE Transactions on Industry Applications, 40(5), 1361-1368.
4 Douglas H, Pillay P and Ziarani A. (2004) The impact of wavelet selection on transient motor current signature analysis. IEEE International Conference on Electric Machines and Drives, 1361-1368.
5 Burrus CS, Gopinath RA and Guo H. (1998) Introduction to wavelets and wavelet transforms, a primer. Englewood Cliffs, NJ: Prentice-Hall.
6 Daubechies I. (1992) Ten lectures on wavelets. SIAM, Pennsylvania, USA.
7 Hwang WW. (2004) Condition classification and fault diagnosis of rotating machine using support vector machine. Master's thesis, Pukyong National University.
8 Han T. (2005) Development of a feature-based fault diagnostics system and its application to induction motors. Doctoral thesis, Pukyong National University.
9 Vapnik VN. (1995) The nature of statistical learning theory. New York: Springer.
Acknowledgments This work was supported by the Brain Korea 21 Project.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
THE DEVELOPMENT OF A MOBILE E-MAINTENANCE SYSTEM UTILIZING RFID AND PDA TECHNOLOGIES Chi-Yung Yau and David Baglee University of Sunderland, Faculty of Applied Sciences, Department of Computing, Engineering and Technology, St. Peter’s Way, Sunderland, SR6 0DD, U.K.
Technological developments in e-maintenance systems, radio-frequency identification (RFID) and personal digital assistants (PDA) have proven to satisfy the increasing demand for improved machinery reliability, efficiency and safety. RFID technology is used to store and remotely retrieve electronic maintenance data in order to provide instant access to up-to-date, accurate and detailed information. PDA technology supports the transfer of data between the user and central maintenance database systems. The DYNAMITE (Dynamic Decisions in Maintenance) project intends to support maintenance decisions by developing and applying a blend of leading-edge communications and sensor technology, including RFID and PDA, to enhance diagnostic and prognostic capabilities. The paper presents the development and implementation of an innovative system using newly developed RFID and PDA technology which is capable of storing and analyzing pertinent maintenance data that can be accessed by both mobile and static computing devices. Key Words: RFID, PDA, E-maintenance
1 INTRODUCTION
A problem most industries face is how to manage their maintenance resources in a more efficient and cost-effective way. To support maintenance decisions, new tools and technologies are required. The Dynamic Decisions in Maintenance (DYNAMITE) project intends to develop and apply a blend of leading-edge communications and sensor technologies to enhance diagnostic and prognostic capabilities. In particular, radio-frequency identification (RFID) smart tags and personal digital assistants (PDA) are used to support maintenance task development. RFID wireless technology provides simple, convenient and reliable automatic detection of a unique smart tag's serial number. In addition, small smart tags are used to store a variety of data, which allows the user to store such information as the location, type and function of the asset and previous maintenance tasks. Thus, RFID is incorporated with the smart tag to provide automatic identification of maintenance equipment and information via radio waves. Besides, PDA technology provides a powerful mobile computing device for accessing different Information Technology (IT) networks. When used together, the RFID smart tag and PDA become a new multi-purpose mobile maintenance tool by interlinking different and separate technologies to support the transfer of data between user, machine, RFID smart tag and database. However, in order to achieve a unified system it is necessary to examine the different technologies and propose a system which combines the latest technologies via an adaptable and appropriate e-maintenance system. The aim of this research is to examine the use of stored and transmitted electronic information systems using RFID smart tags in order to ensure instant access to up-to-date, accurate and detailed information. In addition, the PDA will provide the "operator tool" via a high-resolution screen and wireless connectivity to remote, and often off-site, web-based services. A review of the current standard of radio-frequency identification (RFID) technologies and a discussion of the use of RFID smart tag technology in a computerized maintenance management system (CMMS) have been presented previously [1]. Several targeted services in maintenance, including remote asset identification and query; spare part and tool inventory control; and mobile asset and agent tracking, will be discussed. Finally, the design and deployment of an integrated prototype system will be presented to support the new vision of mobile e-maintenance.
2 RFID AND PDA FOR MOBILE E-MAINTENANCE
Along with the growth of wireless sensor technology, the ease of access to the internet, the increasing effectiveness of PDA technology and the decreasing cost of such technologies have allowed maintenance managers to store and analyse large amounts of maintenance data. The PDA can be connected, for example, to a vibration sensor, which in turn is connected using a Universal Serial Bus (USB) sensor to a particular mobile device for a single measurement, or to a Zigbee sensor for larger amounts of sensory data. Zigbee is a wireless communication technology especially suitable for use in wireless sensor networks due to its long battery life, security and support of more wireless nodes in a single network [2]. Afterwards these measurement data are analysed and processed by the mobile devices and uploaded to the database directly via a wireless computing network. The role of RFID technology in maintenance is to establish a wireless information connection between assets and the management system, as illustrated in Figure 1.
Figure 1. Concept diagram of using RFID in maintenance
An RFID smart tag, also called a transponder or RFID tag, is a compact silicon chip containing memory, a modulator and an antenna [3,4]. The memory is an electrically erasable programmable read-only memory (EEPROM), which is a type of non-volatile memory, containing a serial number and a user-defined area. The modulator is for modulating and demodulating a radio-frequency (RF) signal to convey messages. The antenna is for receiving and transmitting the RF signal. In order to detect and receive the RF signal emitted from RFID tags, a standard-compatible RFID reader is needed. In practice, RFID tags will be placed on different machines or assets, and on certain key items such as replacement parts and specific tools. When the RFID smart tag is scanned by the reader, the smart tag's identification number and memory can be retrieved automatically. Based on that, users can identify the attached asset either by matching the smart tag's identity number with the asset record in the database or by reading the asset code and other relevant information directly from the memory. There are three common types of RFID technologies on the market: active, passive and semi-passive [5-8]. An active RFID smart tag is fully powered, so it supports longer-range communication and it can continuously emit a signal for asset/personnel tracking and security purposes. It can also support the use of different condition-based techniques, such as a temperature sensor for monitoring different temperatures at different locations and a vibration sensor to detect movement. In contrast, a passive RFID smart tag is powered by the reader. When a reader sends a 'read request' RF signal to a smart tag, the radio wave provides the necessary power for the smart tag to send a 'reply' RF signal carrying the RFID identity code and the memory content back to the reader. Typically, the read range is limited to a few inches for most portable readers and up to a few metres for some large-antenna readers, and the memory space is limited to about 2 kbit. Embedding a temperature sensor in a passive RFID smart tag is also supported, but it is not suitable for a continuous logging task. However, compared to active RFID, it is low cost, well standardized and durable, thereby making it suitable for inventory control and logistics.
The PDA can be used to plot and display sensory data, read asset information, search for components and access the internet or intranet. Additionally, most connection interfaces, such as compact flash, USB, Bluetooth and WiFi, are already supported by PDAs. The role of the PDA is to act as a universal mobile maintenance computing device running different user-friendly maintenance system interfaces for dealing with different maintenance activities. RFID technology is often regarded simply as a wireless barcode technology, unable to store large amounts of data [3]. However, the technology has improved rapidly and substantial amounts of data can now be stored within the internal memory of smart tags no larger than a small coin. Engineers often have limited or no access to a network, wired or wireless; the internal memory of the smart tag therefore removes the need for a network connection, and maintenance tasks can be stored on the smart tag for retrieval at a later date. The current memory size of passive RFID smart tags is limited (from 1 kbit to 32 Kbytes [9]); however, it is capable of storing the asset code together with the status and progress of previous maintenance work. Within a couple of years it can be anticipated that RFID smart tag technology will improve sufficiently to allow small tags to contain more text, diagrams and maintenance schedules. Unlike most other maintenance technologies, the use of RFID and PDAs for maintenance is built on the concept of supporting engineers in performing their maintenance tasks faster, more easily and more accurately in order to avoid improper maintenance activities and indirectly improve security and safety, rather than focusing solely on increasing the intelligence of machines.
3 DESCRIPTION OF ARCHITECTURE
In order to demonstrate the use of RFID and PDA technology to contribute to different maintenance work, an RFID-supported maintenance system has been developed. The system architecture, shown in Figure 2, comprises five core components: 1) the database; 2) the web service server; 3) the mobile assets tracking system; 4) the machinery parts and tools inventory control system; and 5) the assets identification and query system.
Figure 2. A complete active and passive RFID asset management system architecture
3.1 Database and Web-Service Server The top two components in the architecture are the database server and the web service server. A central maintenance database called MIMOSA is used in the system. The Machinery Information Management Open Systems Alliance (MIMOSA) [10] is an enterprise-level open maintenance database system which is used within the Dynamite project for bringing together all maintenance services, sensory data and information in one place for machine diagnostics, prognostics and decision support. A web service, involving a service provider and a service requester, is a software module designed to support interoperable machine-to-machine interaction over a network. A service provider is a web service server hosting different web services that service requesters can execute remotely via the internet. In the architecture, a basic service-oriented architecture has been applied in which a web service server is located at a remote site and directly connected to the MIMOSA database.
The web server hosts different database manipulation services and asset maintenance services. 'Client-side' devices act as the service requesters, requesting different web services in order to interact with the database and execute different maintenance functions. The main advantage of using a web service instead of connecting directly to the database is that input data and information can be validated before database records are stored or updated. All services can be performed centrally on the server side in order to reduce the problem of data inconsistency. The architecture also supports distributing function-oriented web services across different web service providers in order to keep service execution performance high. Four function-oriented groups of web services are included. The first is the asset and agent position tracking group, a collection of web services related to position tracking, such as downloading a map of a room, collecting all possible locations within the room and querying the positions of all assets located in that room. The second group is for inventory control of tools and machinery parts and includes web services such as issuing tools and querying asset quantities. The last two groups of web services are for querying and updating segment and asset records.
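A service requester interacting with these function-oriented groups might look like the following sketch (Python; the base URL, operation names and parameters are hypothetical stand-ins and do not reproduce the actual Dynamite web service definitions).

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://maintenance-server/ws"   # hypothetical web service host

def query_asset(asset_code):
    """Call a (hypothetical) asset-query web service rather than the database
    directly, so the server can validate input and keep data consistent."""
    qs = urllib.parse.urlencode({"assetCode": asset_code})
    with urllib.request.urlopen(f"{BASE_URL}/QueryAsset?{qs}") as resp:
        return json.load(resp)

def issue_tool(tool_id, employee_id):
    """Record a tool check-out through the inventory-control service group."""
    payload = json.dumps({"toolId": tool_id, "employeeId": employee_id}).encode()
    req = urllib.request.Request(f"{BASE_URL}/IssueTool", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200
```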
3.2 Assets Identification System Within a small company it would not pose a problem for engineers to identify assets and replacement parts; however, within a large company with many assets in different locations it may prove more difficult to locate a particular asset. Therefore, the asset identification system is designed to fully utilise both the tag identity code (RFID) and the content stored in the smart tag's internal memory for identification and query purposes. It can be considered a middleware system able to interlink users, smart tags and web services. Using the system, users can primarily identify an asset based on the information stored inside the smart tag.
3.3 Inventory Control System
The electronic inventory control system was developed and installed on a Personal Computer (PC). However, this limits the 'portability' of the platform. Therefore, multiple systems were installed at different self-service checking points within the plant, i.e. store rooms and the entrances of working areas. This allows maintenance engineers to check in and check out the spare parts and tools they require for maintenance. If an asset is misplaced, the system can query its most recent movement and identify the last person to use it and its present location.
3.4 Mobile Assets Tracking System The mobile assets tracking system is a real-time locating system (RTLS) utilising active RFID technology. The tracking system can be used to track and monitor the position of valuable movable assets. As with passive smart tags, active smart tags are attached to movable assets and personnel. Because the active RFID smart tag operates in active mode, it continuously sends new data to the monitoring system. To ensure data reliability, the position of a smart tag is identified using multiple readers, which also ensures that the received signal strength is maximised. Using multiple readers allows the position of the smart tag to be calculated by a positioning engine, and the result is used to update the record in the database. The PDA can query the updated asset positions via the corresponding web service and display them on the screen.
4 IMPLEMENTATION
In the demonstration system, the database server is located at a remote site and connected via the internet. The web service server is hosted on a local server containing all the necessary web services and web pages to support different maintenance activities and database record manipulation. Sample asset information has been pre-created in the database, collected from real machines, equipment and systems in use at a university and an industrial plant. The following sections present the implementation of the two RFID subsystems: the PDA-based asset identification system and the mobile assets tracking system.
4.1 Equipment For passive RFID, the 13.56 MHz I.CODE SLI tag was selected for the demonstration system. A compact flash RFID reader is connected to the PDA and a USB reader is connected to the PC for reading and writing information to passive smart tags. The memory of the I.CODE SLI passive smart tag is 1024 bits, organised in 32 blocks of 4 bytes each. Some metal-on-mount smart tags are also considered since they are suitable for mounting on the metallic surfaces of machines. Figure 3 shows equipment currently utilised in the development of the passive RFID system.
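To give a feel for how little space is available, the sketch below (Python) packs an asset code and a short status message into the 32 blocks of 4 bytes offered by the I.CODE SLI tag; the field layout is an illustrative assumption, not the layout used in the demonstration system.

```python
BLOCK_SIZE = 4      # bytes per block on the I.CODE SLI tag
NUM_BLOCKS = 32     # 32 blocks x 4 bytes = 128 bytes (1024 bits) of user memory

def pack_tag(asset_code, status):
    """Pack text fields into fixed-size blocks, padding the rest with zeros."""
    data = (asset_code + "|" + status).encode("ascii")
    if len(data) > BLOCK_SIZE * NUM_BLOCKS:
        raise ValueError("content does not fit in the tag's 128-byte user memory")
    data = data.ljust(BLOCK_SIZE * NUM_BLOCKS, b"\x00")
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def unpack_tag(blocks):
    return b"".join(blocks).rstrip(b"\x00").decode("ascii").split("|", 1)

blocks = pack_tag("CNC-12", "bearing replaced 2009-06-01")
print(len(blocks), unpack_tag(blocks))   # 32 ['CNC-12', 'bearing replaced 2009-06-01']
```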
Figure 3. Example of equipment including (left) a USB PC-based passive RFID reader and (right) several metal-on-mount RFID smart tags, a compact flash passive RFID reader and a PDA
For active RFID, 433 MHz Wavetrend products are used. Up to 255 L-RX201 active RFID readers can be connected by network cable to form a Wavetrend reader network. The first reader is connected to a PC via the serial RS232 protocol to transfer the reader data to the PC. Typically, the life of a smart tag is estimated at 5 years and the transmission interval is approximately 1.5 seconds. If the reader uses an L-AN100 (whip) antenna, the read range can reach up to 35 meters.
Figure 4. Equipment of the active RFID system: (left) L-RX201 RFID reader and (right) TG800 asset smart tag
4.2 Assets Identification and Query System The demonstration system for asset identification and query has been successfully created, using passive RFID to identify different maintenance information. In order to better organise and manage the maintenance information, six types of fundamental information template have been designed, for machines, machinery parts, tools, facilities, locations and personnel. Their role is to provide a convenient and easy way to categorise and standardise the format for saving information in an RFID smart tag and displaying it on a PDA screen. An example template for a machine is shown in Figure 5, including the template identity code, the template format, the template record and the database query statement. All this information is stored in the MIMOSA database, so users can easily specify their own information templates for particular purposes.
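The idea of an information template, an identity code, a field format, a record and the query statement used to fill it, can be expressed as a simple data structure. The sketch below (Python) is an illustrative rendering only; the actual template fields and query syntax of the demonstration system are not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class InfoTemplate:
    """A simplified information template for one asset type (illustrative only)."""
    template_id: str                      # template identity code
    fields: list                          # template format: ordered field names
    query: str                            # statement used to fill the record from the database
    record: dict = field(default_factory=dict)

    def render(self, row):
        """Fill the template record from a database row and return the tag text."""
        self.record = {name: row.get(name, "") for name in self.fields}
        return ";".join(f"{k}={v}" for k, v in self.record.items())

machine_template = InfoTemplate(
    template_id="TPL-MACHINE-01",
    fields=["asset_code", "model", "location", "last_maintenance"],
    query="SELECT asset_code, model, location, last_maintenance FROM assets WHERE asset_code = ?",
)
print(machine_template.render({"asset_code": "CNC-12", "model": "VMC-850",
                               "location": "Cell B", "last_maintenance": "2009-05-20"}))
```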
Figure 5. The content of a machine template
Two platforms, web-based and PDA-based, have been developed, as shown in Figure 6. The web-based platform is a group of ASP.NET web pages mainly for dealing with asset identification and querying asset components and subcomponents. Each page is designed for a different purpose, including displaying basic asset information, displaying images, schematics and diagrams, and querying related asset information such as spare parts and maintenance instructions. The PDA-based platform is a Windows Mobile application. It is designed to bridge an RFID device and the web services in order to manage the content stored inside smart tags. It also supports exporting the information carried on a smart tag for use by other maintenance software. Through predefined information templates, the software can download all the necessary information from the database in order to auto-generate the content to store in the memory of a smart tag. The main role of both platforms is to allow users to manage the information effectively for display on screen and storage in smart tags, and to utilise the information to auto-fill different online forms for querying detailed information, reporting failures and recording the latest maintenance status.
Figure 6. Example screenshots of performing asset identification via RFID: (left) web-based interface for querying asset information and (right) PDA-based interface for managing information in a smart tag
4.3 Mobile Assets Tracking System Typically, there are three deployment options when using active RFID for tracking: (1) mobile smart tags with readers at fixed locations, (2) smart tags at fixed locations with mobile readers, and (3) a hybrid in which both are mobile. The first is a centralised processing scenario, in which one positioning system is used to track and manage all movable smart tags. The second is much simpler because each reader contains basic computing power to determine its nearest position by identifying fixed active smart tags nearby. In the final, hybrid scenario, each movable reader must have full positioning capability to continuously calculate the distance between itself and at least three detected movable active tags in order to estimate its position. In the experimental setup, we only consider the first deployment because our current active RFID equipment does not allow a PDA to receive active RFID signals. Four active RFID readers are connected as shown in the map (see Figure 7). These readers continuously receive RF messages transmitted from active smart tags. All information, such as the RSSI value, sensor value and stored information, is bound into a single message. All messages from different smart tags are collected by the tracking application, as shown in the table (see Figure 7). Each row in the table represents the set of RSSI values for one smart tag from the four readers. The positioning engine then reads the table row by row to estimate the position of the smart tag. Currently, a non-linear artificial neural network, the self-organising map (SOM), is used as the positioning algorithm. When the four RSSI values for a smart tag are fed into the SOM positioning engine, a corresponding colour output is activated to represent a position on the map. Finally, the results are displayed on the map and the corresponding record in the database is updated. PDA users can in turn call specific asset tracking web services to retrieve the asset tracking records and display them on the screen.
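A minimal sketch of the SOM-based positioning step is given below (Python with NumPy). It trains a small self-organising map on four-reader RSSI vectors and maps each trained node to a position label; the training data, labels and map size are invented for illustration and are not the values used in the experiments.

```python
import numpy as np

class SimpleSOM:
    """A tiny self-organising map over 4-dimensional RSSI vectors (illustration only)."""
    def __init__(self, rows=3, cols=3, dim=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.random((rows * cols, dim))            # node weight vectors
        self.coords = np.array([(r, c) for r in range(rows)
                                for c in range(cols)], dtype=float)

    def winner(self, x):
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def train(self, data, epochs=200, lr=0.3, sigma=1.0):
        for _ in range(epochs):
            for x in data:
                b = self.winner(x)
                # Gaussian neighbourhood: nodes near the winner on the grid move more.
                d = np.linalg.norm(self.coords - self.coords[b], axis=1)
                h = np.exp(-(d ** 2) / (2 * sigma ** 2))
                self.w += lr * h[:, None] * (x - self.w)
            lr *= 0.98
            sigma *= 0.99

# Illustrative RSSI fingerprints (one row per known zone, one value per reader).
fingerprints = np.array([[60.0, 20.0, 15.0, 10.0],   # strongest at reader 1
                         [18.0, 62.0, 22.0, 12.0],   # strongest at reader 2
                         [12.0, 20.0, 58.0, 25.0]])  # strongest at reader 3
zones = ["Zone A", "Zone B", "Zone C"]

som = SimpleSOM()
som.train(fingerprints)
node_to_zone = {som.winner(v): z for v, z in zip(fingerprints, zones)}

reading = np.array([55.0, 25.0, 18.0, 11.0])             # new tag reading from the four readers
print(node_to_zone.get(som.winner(reading), "unknown"))  # expected: Zone A for this data
```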
Figure 7. Overview diagram of the mobile assets tracking system
5 EVALUATION AND DISCUSSION
In order to evaluate the demonstration system, two separate tests were carried out, one at a university laboratory and one at an industrial plant. The scenario is that an engineer receives a maintenance task to replace a component inside a machine. The component is located in the store room and the required tools are located on a mobile truck, which is shared across the whole plant. Based on this scenario, the overall analysis focuses on the following areas.
5.1 Convenience and Time Saving The wireless connectivity of the PDA allows the site engineer to check their work orders anywhere via a wireless network. Current PDAs support high-resolution screens, which allow the information to be read easily and a substantial amount of documentation, such as maintenance manuals and maintenance instructions, to be provided directly on the PDA. Through the asset query system, the engineer can check the stock of spare parts and obtain their exact storage location to reduce searching time. When the spare parts are found, they have to be checked out via the RFID inventory system. Afterwards, using the active mobile asset tracking system, the current location of the tools can be tracked and the tools reserved. The overall procedure for searching, booking and locating the spare parts and movable tools is simple and convenient. This saves time and effort and allows the maintenance engineer to spend more time on maintenance planning and task execution.
5.2 Prevention of Improper Maintenance Activities During maintenance planning the engineer would use passive RFID and the PDA together to accurately identify the correct machine before performing any maintenance activities. After identifying the machine, an inspection of the asset would identify the part requiring replacement or maintenance; however, this is dependent upon the availability of a wireless network. If a wireless network connection is not available, the engineer can also scan the location smart tags on the machine to identify the correct location. The engineer then runs the spare part replacement program and follows the on-screen instructions: first scanning the measurement location smart tag to confirm the target location, and then scanning the spare part's smart tag to confirm installation of the spare part. When this is completed, a 'replacement completed' maintenance message,
which contains the installation time, the location, the spare parts used and the employee, can be written into another smart tag for subsequent engineers to refer to. This is an advanced use of passive RFID in maintenance: it not only goes beyond storing only asset and location identification information, but also makes extensive use of the tags to record the completion of maintenance procedures in order to support work-in-progress tracking [11,12] and reduce improper maintenance work.
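The replacement workflow described above, scan the location tag, scan the spare part tag, then write a completion record back to a tag, can be outlined as follows (Python; the scan and write functions stand in for the real RFID reader calls and the record format is illustrative).

```python
from datetime import datetime

def replace_part(scan_tag, write_tag, expected_location, employee_id):
    """Guide a spare-part replacement and leave a completion record on a tag.

    scan_tag / write_tag are stand-ins for the PDA's RFID reader functions.
    """
    location = scan_tag("location")                 # step 1: confirm the target location
    if location != expected_location:
        raise ValueError(f"wrong location: expected {expected_location}, got {location}")

    spare_part = scan_tag("spare part")             # step 2: confirm the part being installed

    record = "|".join([                             # step 3: completion message for later engineers
        "replacement completed",
        datetime.now().isoformat(timespec="minutes"),
        location, spare_part, employee_id,
    ])
    write_tag(record)
    return record

# Usage with dummy reader functions standing in for real hardware:
print(replace_part(lambda what: {"location": "BEARING-3", "spare part": "SKF-6205"}[what],
                   lambda text: None, "BEARING-3", "EMP-042"))
```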
5.3 Effectiveness of Using Active RFID for Locating Assets Unlike passive RFID technology, active RFID is relatively expensive. It is used particularly for long-lived, continuous tracking and locating of movable and valuable assets. Since not all assets are both movable and expensive (spare parts, for example), replacing passive RFID and barcodes entirely with active RFID is not recommended; a hybrid use of the two is suggested and demonstrated in the system. Using active RFID in an industrial environment does present one or two problems. When machines are running or suddenly stop, they influence the radio waves to different degrees. This influence, or noise, increases the difficulty of calculating the correct position of a movable smart tag. To address this, a non-linear neural network approach has been investigated to solve the problem adaptively. Several laboratory and industrial experiments have allowed data to be collected using a multi-layer hierarchical structure. With a three-layer structure, the first layer identifies the zone, the second layer the smaller sub-zone and the final layer the exact position. Under this approach, 80-90% accuracy has been achieved in a noisy and highly reflective industrial environment. Although the scale of the experiment was not very large, active RFID can be shown to be suitable for use as a locating sensor in an industrial environment, with more readers and more layers added to increase the accuracy.
5.4 Security RFID can effectively store different maintenance information, such as machine information and maintenance status. Data security is important because the memory in the smart tag can be read, changed or erased, and this may have an effect on the type of maintenance activity required [13,14]. If the user does not trust, or is uncertain about, the information retrieved from the smart tag, they can ultimately verify the data against the database stored within the network; in this case, however, the original idea and advantage of using RFID is lost and it can no longer effectively support maintenance activities. Currently, a backup of the smart tag content is saved in the MIMOSA database, and the generation of content and its saving to a smart tag are automatic. However, this does not protect the tag from unauthorised access. Another solution is to lock the content in a passive RFID smart tag permanently. However, this approach only secures the content while sacrificing the reusability of the tag, indirectly increasing the maintenance cost. For fraud protection, an encryption and password protection approach [15], such as that used in the MIFARE series of RFID smart tags, is more appropriate for securing the data on the tag.
6 CONCLUSION
The RFID demonstration system has been successfully developed. By utilising RFID and PDA technologies together, the user can transfer different maintenance data, including measurement data and various reports, from a PDA to a central maintenance database system. Conversely, they can also identify different smart tags, attached to different assets and measurement locations, to query relevant information and to save the work-in-progress status to the smart tag. Therefore, the use of RFID in maintenance is not limited to asset and location identification, but also extends to keeping up-to-date and important maintenance messages on the smart tag. Based on this concept, passive RFID technology has been successfully demonstrated and used to store and remotely retrieve electronic maintenance data in order to provide instant access to up-to-date, accurate and detailed information. The main idea of using RFID and a PDA together as a powerful multi-purpose mobile maintenance tool has been successfully demonstrated. It has been shown not only to reduce the time needed to query asset information and fill in online machine failure reporting forms, but also to reduce improper asset and location identification and thus prevent a series of inappropriate maintenance activities. Besides the investigation of passive RFID in maintenance, the use of active RFID has also been investigated, specifically for real-time location tracking (RTLS) of mobile assets. It can be used for security purposes to detect any unauthorised person entering a protected area. Currently a non-linear neural network approach is being investigated as the positioning algorithm. The current accuracy of position detection is up to 90% within a distance of one foot. Thus, using active RFID for tracking and positioning of movable assets and personnel in an industrial environment is workable. However, active RFID is not suitable as a replacement for passive RFID because the cost of each active smart tag and its maintenance cost are much higher than for a passive one. For many maintenance tasks, such as tracking machinery parts and recording the last maintenance processes, the use of active RFID for continuous monitoring is not necessary. In addition, not all active
RFID equipment supports the PDA platform. Therefore, restricting the use of active RFID to tracking high-value movable assets is suggested. Although both passive and active RFID technologies provide many advantages in maintenance, the security and protection of maintenance information in RFID smart tags remain a concern, and the use of high-security, password-protected RFID smart tags is suggested. The solution described here is especially suitable for industries that use many separate tools and maintenance systems during maintenance.
7 REFERENCES
1 Adgar, A., Addison, D., and Yau, C.Y. (2007) Applications of RFID Technology in Maintenance Systems, Proceedings of the second World Congress on Engineering Asset Management (WCEAM), Harrogate, UK.
2 Labiod, H., Afifi, H. and Santis, C.D. (2006) Wi-Fi, Bluetooth, Zigbee and Wimax, Springer.
3 Finkenzeller, K. (2003) RFID Handbook: Fundamentals and Applications in Contactless Smart Cards and Identification, 2nd Edition, John Wiley and Sons Ltd.
4 Glover, B. and Bhatt, H. (2006) RFID Essentials - Theory in Practice, O'Reilly Media, Inc.
5 Gerst, M., Bunduchi, R. and Graham, I. (2005) Current issues in RFID standardisation, University of Edinburgh.
6 RFID Journal (2006) A Summary of RFID Standards, RFID Journal. [Online] Available: http://www.rfidjournal.com/article/articleview/1335/1/129 [Accessed: April 15, 2009]
7 Hodges, S. and Harrison, M. (2004) Demystifying RFID: Principles & Practicalities, Auto-ID Centre, Institute for Manufacturing, University of Cambridge.
8 Ward, M., van Kranenburg, R. and Backhouse, G. (2006) RFID: Frequency, standards, adoption and innovation, JISC Technology and Standards Watch.
9 Business Wire (2009) Tego, Inc. Demonstrates World's First High-Memory, Passive RFID, RFID Solutions Online, Apr. 2009. [Online] Available: http://www.rfidsolutionsonline.com/article.mvc/Tego-Inc-High-Memory-Passive-RFIDTagging-0001 [Accessed: June 05, 2009]
10 MIMOSA. What is MIMOSA, MIMOSA. [Online] Available: http://www.mimosa.org/main/about.aspx [Accessed: June 07, 2009]
11 Bacheldor, B. (2007) OAT Launches RFID Software for Tracking Work-in-Process, RFID Journal, 2007. [Online] Available: http://www.rfidjournal.com/article/print/3619 [Accessed: May 14, 2009]
12 Sirico, L. Using RFID for Work-In-Progress (WIP) Management, Industrywizards. [Online] Available: http://rfidwizards.com/index.php?option=com_content&task=view&id=33&Itemid=427 [Accessed: May 14, 2009]
13 Thornton, F. and Lanthem, C. (2006) RFID Security, Syngress.
14 Packaging Gateway (2005) Is RFID a danger? Packaging Gateway. [Online] Available: http://www.packaging-gateway.com/features/feature74/ [Accessed: June 07, 2009]
15 O'Connor, M.C. (2006) New Philips Chip Adds Security Option, RFID Journal. [Online] Available: http://www.rfidjournal.com/article/view/2459/1/1 [Accessed: June 07, 2009]
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
E-MAINTENANCE, A MEANS TO HIGH OVERALL EFFICIENCY
Erkki Jantunen a, Eduardo Gilabert b, Christos Emmanoulidis c and Adam Adgar e
a VTT Technical Research Centre of Finland, P.O. Box 1000, 02044 VTT, Finland
b Fundación Tekniker, Avda. Otaola 20, 20600 Eibar, Spain
c CETI/R.C. Athena, 58 Tsimiski st., Xanthi, 67100, Greece
e School of Science and Technology, University of Teesside, Borough Road, Middlesbrough, TS1 3BA, UK
Today the number of industries putting emphasis on maximising Overall Equipment Effectiveness (OEE) is rapidly increasing. Instead of looking only at one aspect of production, modern enterprises seek to plan a series of joint activities, aimed at minimising losses, by improving Performance, Quality and Availability at the same time. Emphasising these three factors simultaneously leads to the introduction of efficient maintenance, including sound strategies such as Condition Based Maintenance (CBM). The incorporation of key enabling technologies within a Condition Based Maintenance strategy, such as wireless networking, internet and mobile computing, miniature sensing devices and location awareness, has paved the way to the introduction of e-Maintenance. The paper looks at the key features pertaining to the successful implementation of e-Maintenance in modern industry. It then highlights steps taken towards the design and implementation of an e-Maintenance architecture within the EU Integrated Project 'Dynamite'. The paper concludes with a discussion on current challenges and future prospects of e-Maintenance. Key Words: E-maintenance, wireless networking, web computing, mobile computing, RFID
1 INTRODUCTION
A number of challenges must be solved when the aim is to carry out maintenance in such a way that actions are performed when required, following a Condition Based Maintenance (CBM) strategy [1]. A key concern is to schedule maintenance actions taking into account condition monitoring measurements. The toolset available to implement CBM is now stronger than ever, taking advantage of the confluence of enabling technologies such as smart sensors, wireless technologies and service-oriented computing. At the same time the price of sensors and analysis equipment has decreased, making CBM even more cost-efficient. With the current ever-growing demand for improvements in system productivity, availability and safety, product quality and customer satisfaction, and taking into account the trend of decreasing profit margins, the importance of implementing efficient maintenance strategies becomes unquestionable. In this setting the maintenance function plays a critical role in a company's ability to compete on the basis of cost, quality and delivery performance. Maintenance is increasingly taken into account in production requirements. Concepts such as lean production, six sigma, just-in-time or LCC are increasingly used as indicators of operational efficiency, with maintenance playing an important role in keeping these indicators at profitable levels. Importantly, the costs of gathering, processing and acting on information are decreasing, whilst the cost of making incorrect decisions is increasing. The expanding use of mobile communication technologies and the internet brings new ways of solving many business and maintenance problems. Hence both condition based maintenance and predictive maintenance strategies may benefit from mobile and internet technologies, allowing them to become cost-effective for all types, sizes and complexities of machinery.
2 E-MAINTENANCE CONCEPT
The increasing demands on system availability, performance and product quality have contributed to pushing the importance of the maintenance function to higher levels. Many issues regarding production safety, customer satisfaction and stabilisation of profit margins are becoming targets for maintenance activities. Several studies over the past years have indicated that the indirect cost of maintenance around Europe is a significant percentage of total sales turnover [2]. Maintenance strategies have undergone many major developments, from the traditional 'fail and fix' practices to 'predict and prevent' strategies. One of the latest developments, the 'e-maintenance' methodology, as described by Lee [3], is a derivative of the concepts of e-manufacturing and e-business. A key objective of e-maintenance is to support the next generation of manufacturing processes by making maintenance-related information ubiquitously available. E-maintenance is a relatively recent development in the field of engineering asset management. It is an emerging maintenance management concept whereby assets can be monitored and managed over the Internet. Many definitions of this concept have been proposed in the literature, and research is still evolving new tools and enabling technologies for e-maintenance. The core ideas of e-maintenance are that the system gives the ability to monitor plant floor assets, link the production and maintenance operations systems, collect feedback from remote customer sites and integrate it into upper-level enterprise applications. The maintenance management concept relies heavily on the Internet to monitor and manage the assets. Another important feature of e-maintenance is that it attempts to integrate existing maintenance principles with telecommunications, web services, mobile, wireless and portable devices and other means of electronic collaboration. The collection and exchange of dynamic, real-time maintenance information provides a powerful framework that enables detailed knowledge of the assets or production systems being monitored to be built, processed and exploited. This provides great opportunities for cost-effective decisions to be made, since the relevant data can be made available in a timely fashion.
3 DYNAMITE PROJECT
Dynamic Decisions in Maintenance, DYNAMITE, an Integrated EU Project on e-Maintenance, started in the autumn of 2005 and ended at the end of February 2009. Altogether there were 16 partners in the project: University of Sunderland (UK), University of Manchester (UK), Nancy University (France), Växjö University (Sweden), Volvo (Sweden), Fundación Tekniker (Spain), Goratu (Spain), Centro Ricerche Fiat (Italy), Zenon (Greece), Prisma (Greece), Martechnic (Germany), Diagnostic Solutions (UK), Engineering Statistical Services (UK), IB Krates (Estonia), Wyselec (Finland) and the Technical Research Centre of Finland (Finland), who also acted as coordinator of the project. Figure 1 shows the work implementation structure of the Dynamite project. The work was structured around three subprojects: SubProject 1 targeted the development of condition monitoring sensors and smart tags; SubProject 2 aimed at introducing mobile and wireless devices and technologies, such as PDAs (Personal Digital Assistants), to support maintenance, as well as the introduction of diagnosis and prognosis based on Web Services; SubProject 3 developed training and economic studies related to maintenance strategy. The detailed reporting of the project results is included on the project site (dynamite.vtt.fi), in the internal reports and in the conference and journal publications that the project has produced, while a book on e-Maintenance is due to be published. In the following sections some of the most important results of Dynamite are discussed with reference to project deliverables (DWC). More in-depth information can be found in the articles referenced in this paper.
3.1 Sensors for condition monitoring There are quite a number of challenges when maintenance is optimised through a condition based maintenance strategy. The greatest among them is to be able to define when maintenance needs to be carried out. This can effectively be determined on the basis of the actual machinery condition, identified via condition monitoring. In Dynamite a number of new sensors were developed, including both vibration and on-line oil analysis sensors. For vibration monitoring supporting diagnosis and prognosis, a new platform was developed (DWC28, Wyselec). Also for vibration monitoring, a new sensor was developed that is capable of carrying out the necessary signal analysis and passing the analysed results through USB to the PDA (DWC12, Diagnostic Solutions). Based on MEMS technology, a multi-measurand sensor for monitoring vibration, pressure and temperature was developed (DWC3, Manchester & DWC17, Prisma). For wireless transfer of measurement information, a wireless sensor network system, including ZigBee sensing nodes and ZigBee-to-WiFi gateways, supported by a dedicated operating system and hardware, was developed (DWC15 & DWC18, Prisma) [17]. On-line monitoring of lubricating oil enables monitoring of the condition of the lubricant and also the condition of the machinery. Five different types of sensors were developed for lubricant oil monitoring (DWC4 & DWC5, VTT; DWC6, DWC7 & DWC8, Tekniker). Most of the developed sensors are based on optical methods. Altogether the sensors were tested in three different environments: the testing laboratory of Martechnic, the production environment of Volvo and, lastly, the final demonstration tests of all Dynamite products at Fiat. The performance of the various sensors in the tests has been reported
e.g. in references [4, 5]. Figure 2 shows the user interface of a device that monitors the oxidation of the lubricant. The interface is very simple, giving the maintenance engineer the needed information in a user-friendly format (Figure 2).
Figure 1. The structure of the Dynamite project: SP1 (WP1 Requirements, WP2 Smart Tags, WP3 Micro sensors, WP4 Lube sensor, WP5 Demonstration); SP2 (WP6 Requirements, WP7 Smart PDAs, WP8 Wireless Comms, WP9 e-Maintenance, WP10 Demonstration); SP3 (WP11 Scenario Analysis, WP12 Strategies & Cost effectiveness, WP13 Training, WP14 Global Application demonstration).
Figure 2. The user interface of a device that monitors the oxidation of lubricant oil.
In maintenance there are also other, simpler challenges besides the measurement information that still need to be tackled. In the first place, the machine to be maintained needs to be identified. In Dynamite the use of RFID (Radio Frequency Identification) tags was studied for the above-described purposes (DWC1 & DWC10, Sunderland; DWC2 & DWC11, Zenon) [6]. Smart tags offer an easy way to identify machines and their components, which ensures that the right components are maintained in the correct way. The use of tags also supports condition monitoring, i.e. again the correct machine can be identified. When the identification is carried out with a PDA equipped with an RFID reader, it is possible to use the same PDA for the condition
monitoring measurement. The PDA accesses the accelerometer readings, making it easy to offer a diagnosis based on acceptable limits which can be automatically read from the database. Naturally the PDA enables the sending of measured data and the diagnosis result to the server, which can then issue a work order if the allowable limits have been exceeded. With the RFID technology available today, it became apparent that the identification of movable objects is very challenging. In the Dynamite project it was possible to improve the methodology and results with the introduction of a neural network model which could compensate for the partially contradictory measurement information. However, the results could not be considered satisfactory for commercial use, and additional research must be carried out in order to fully understand and handle this task.
3.2 DynaWeb The basic idea in the Dynamite project was to be able to produce and take advantage of maintenance-related information in different types of environments [7]. As explained in section 2 of this paper, the term e-Maintenance became popular in the English literature in the last decade. The term really emphasises the use of the Internet to support maintenance [7]. The term e-Maintenance also puts emphasis on how proactive action can identify developing faults, so that problems are tackled beforehand instead of repairing broken-down machinery components, mitigating health and safety risks and especially avoiding the high costs that go hand in hand with a passive maintenance strategy [8]. However, it has to be admitted that a proactive strategy can only be successful insofar as wear detection and fault prognosis are available. Just as Lee et al. [9] put it: "We simply do not know how to measure wear of machinery components. We do not have the necessary models that would tell us what kind of message the changing parameters are telling us". The Dynamite project has sought to build knowledge to support prognosis, and some modules have been designed and developed for this purpose (DWC21, UHP). In general, e-Maintenance is used to provide the necessary maintenance information whenever and wherever it is needed. Consequently PDAs are a good way to get the necessary information in the field and also a means to report to the Computerised Maintenance Management System (CMMS) what has been done [10]. The key elements in the development of the e-Maintenance system were the use of the PDA as a mobile user interface, the building of the necessary services as Web Services that can be called through the internet, and the use of a common database for the integration of data from the large number of modules developed in the system. In this context, the DynaWeb platform [6] refers to the ICT architecture, comprising software web services and the communication architecture, that supports the new maintenance concept. It is best described as the information and communication platform that provides operational interaction, offered by means of web services, between different technologies and actors in the framework of a distributed maintenance management application scenario (Figure 3). These application services are considered as plug-ins and may be customised to better serve the needs of the end user. The software architecture of DynaWeb is based on software components offering intelligent distributed web services. They offer interoperability between independent software applications over the Internet by means of the SOAP protocol, which enables the communication [9]. The advantages of web services are the central issue in DynaWeb. Different software modules are able to communicate among themselves in order to perform a specific task. In this context, to provide the most convenient analysis flow, information processing is understood as a distributed and collaborative system, where there are different levels of entities that can undertake intelligent tasks. Given this, a system architecture has been defined to identify the interactions between actors and the required functions, including four layers that correspond to the central information processing layers of the OSA-CBM standard [9][10][11]: condition monitoring (DWC19, Tekniker), health assessment (DWC20, Tekniker), prognostics (DWC21, UHP) and decision support (DWC23, Zenon; DWC24, Växjö).
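To illustrate how one module might invoke another over SOAP, the following sketch builds and posts a minimal SOAP envelope with the Python standard library; the endpoint, namespace and operation name are hypothetical and do not correspond to the actual DynaWeb service definitions.

```python
import urllib.request

SOAP_ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetHealthAssessment xmlns="http://example.org/dynaweb">
      <assetId>{asset_id}</assetId>
    </GetHealthAssessment>
  </soap:Body>
</soap:Envelope>"""

def get_health_assessment(asset_id, endpoint="http://localhost/dynaweb/ha.asmx"):
    """POST a SOAP request to a (hypothetical) health-assessment web service."""
    body = SOAP_ENVELOPE.format(asset_id=asset_id).encode("utf-8")
    req = urllib.request.Request(
        endpoint, data=body,
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": "http://example.org/dynaweb/GetHealthAssessment"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")    # raw SOAP response XML
```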
3.3 Mimosa The Dynamite project recognised from the early stages the need to integrate the individual tools developed by the various partners. There was an internal Software Team which agreed upon the programming platforms for the project components. While services integration was conveniently offered by the common adoption of web services, data integration was still a challenge and the consortium opted to adopt a common database. In practice the obvious choice was the MIMOSA database format [www.mimosa.org]. MIMOSA follows the ISO 13374 standard: Machine Condition Assessment Data Processing Information Flow Blocks. As such, MIMOSA is very large, consisting of some 400 tables, a subset of which appeared to be adequate for the Dynamite objectives. The common database was located on the server of IB Krates in Tallinn. IB Krates also programmed a simple user interface to support the use of MIMOSA (DWC25 & DWC26, IB Krates). At the same time VTT developed an easy-to-use user interface for PDA communication with MIMOSA (DWC9, VTT). Afterwards it became apparent that the decision to rely on MIMOSA had been a very beneficial one. Figure 4 shows the simplified dataflow between the various modules in DynaWeb.
Figure 3. Communication architecture among the HMI and web services, through the use of an agent storing data in the MIMOSA database.
Figure 4. Simplified dataflow between various modules within DynaWeb.
3.4 Personal digital assistant PDAs provide a means to interface with the CMMS, the condition monitoring and diagnosis system, and other possible systems that support maintenance, in whichever environment a wireless internet connection is available. In the Dynamite project the PDA plays a central role. The PDA can be used for accessing measurement data, studying the measurement results, communicating with the diagnosis and prognosis modules, handling work orders, studying instructions for maintenance work and so on. Since Dynamite was an R&D project, the goal was to develop new methods and tools to
support maintenance. Motivated by this, two separate solutions were developed for the PDA. In one of them (DWC9, VTT) the basic assumption was that it must be possible to use the PDA to support maintenance even if no wireless connection is available. Following this assumption, a small version of the MIMOSA database is copied to the PDA whenever an internet connection is available. At the same time, new definitions and measurement results are uploaded from the PDA to the common server-side database. In the other solution, emphasis was given to the quality of information that is given to the maintenance engineer, i.e. clear guidance for carrying out maintenance work with a well-designed user interface (DWC14, Prisma). In addition to the two aforementioned solutions, special attention was given to maintenance scheduling (DWC13 & DWC23, Zenon).
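The offline-capable PDA solution amounts to a simple two-way synchronisation: pull a working subset of the database onto the device when a connection is available, and push locally recorded measurements back. The sketch below (Python, using SQLite as a stand-in for the on-device store) illustrates the idea; the table and column names are invented and do not reflect the MIMOSA schema.

```python
import sqlite3

def sync(local_db_path, fetch_reference_data, push_measurements):
    """Two-way sync between the PDA's local store and the server-side database.

    fetch_reference_data() and push_measurements(rows) stand in for the
    web-service calls used when an internet connection is available.
    """
    con = sqlite3.connect(local_db_path)
    con.execute("CREATE TABLE IF NOT EXISTS assets(code TEXT PRIMARY KEY, name TEXT)")
    con.execute("CREATE TABLE IF NOT EXISTS measurements("
                "asset_code TEXT, value REAL, uploaded INTEGER DEFAULT 0)")

    # Pull: refresh the local copy of the reference data (asset definitions etc.).
    con.executemany("INSERT OR REPLACE INTO assets VALUES (?, ?)", fetch_reference_data())

    # Push: upload measurements recorded while offline, then mark them as uploaded.
    rows = con.execute("SELECT rowid, asset_code, value FROM measurements "
                       "WHERE uploaded = 0").fetchall()
    if rows:
        push_measurements([(code, value) for _, code, value in rows])
        con.executemany("UPDATE measurements SET uploaded = 1 WHERE rowid = ?",
                        [(rid,) for rid, _, _ in rows])
    con.commit()
    con.close()
```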
3.5 Diagnosis and prognosis Carrying out condition monitoring is challenging, but it lays down the foundation for making decisions on what maintenance needs to be carried out. One of the important issues in Dynamite was how to make it easier to analyse condition monitoring signals and how to automate diagnosis and prognosis. The diagnosis and prognosis modules have been developed as Web Services (DWC19, DWC20 & DWC22, Tekniker; DWC21, UHP). DWC19 is related to state detection. This web service receives measurements from sensors and their signal processing software. These measurements have to be compared to expected values, and an alert is generated in case of anomaly detection, due to values outside preset limits or changes in the usual trend. The main function of DWC20 is to assess the current health of the asset when an anomaly is detected by the condition monitoring modules. It generates health records and offers fault identification based on health history, operational status and maintenance task history. One of the existing diagnosis processes is based on previously developed systems [12] using Bayesian Networks to provide a model that can work with uncertainty and can also be adapted with feedback information. DWC21 takes into account the results from DWC19 & DWC20. The primary focus of this prognostic module is to calculate the future health of an asset, taking into account future usage profiles. The module reports the health status at a specified future time or the Remaining Useful Life (RUL).
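The state-detection behaviour attributed to DWC19, comparing incoming measurements against preset limits and against the usual trend, can be summarised as follows (Python). The limits, window size and trend test are illustrative assumptions; the actual Dynamite module is not reproduced here.

```python
from statistics import mean

def detect_state(history, new_value, low=0.0, high=10.0, window=5, trend_factor=1.5):
    """Return an alert string if the new measurement is anomalous, else None.

    Anomalies are flagged when the value falls outside preset limits, or when it
    departs strongly from the recent trend (here: the mean of the last `window` values).
    """
    if not (low <= new_value <= high):
        return f"ALERT: value {new_value} outside limits [{low}, {high}]"
    recent = history[-window:]
    if len(recent) == window:
        baseline = mean(recent)
        if baseline and abs(new_value - baseline) > trend_factor * baseline:
            return f"ALERT: value {new_value} departs from recent trend ({baseline:.2f})"
    history.append(new_value)
    return None

vibration_rms = [1.0, 1.1, 0.9, 1.2, 1.0]
print(detect_state(vibration_rms, 1.1))    # None: within limits and trend
print(detect_state(vibration_rms, 4.5))    # trend alert
print(detect_state(vibration_rms, 12.0))   # limit alert
```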
3.6 Economy of maintenance Companies are often forced to think about how efficient they are. One very popular subject in such discussions is maintenance and how it should be organised. Unfortunately, it is typical that the discussion then mainly concentrates on costs, i.e. how large the costs related to maintenance are and how these costs could be reduced. It can be claimed that the further from the maintenance organisation the discussion takes place, the more it concentrates on costs and the less emphasis is given to what maintenance provides and what its basic purpose and benefits are. In the Dynamite project considerable effort has been placed on studying maintenance from a cost-efficiency perspective. A set of tools/modules was developed for enabling economic analysis related to maintenance, as well as for assessing the influence of different types of maintenance strategies and investments, in order to enable decision making that relies on facts [14] (DWC24, Växjö).
3.7 Training Although Maintenance Engineering has been in demand for a very long time, maintenance training has yet to fully exploit the recent wave of technological advances in information and communication technologies. Besides formal education and theoretical knowledge, on-the-job training and informal education are recognised to have great importance in developing Maintenance Engineering. Furthermore, desktop and web-based e-learning applications offer academics and industrialists new tools to raise maintenance-related knowledge and competence. On the other hand, maintenance services are increasingly equipped with innovative technological solutions in order to support technical and managerial personnel [16]. The Dynamite Project has implemented a vertical introduction of a stream of novel technologies for maintenance operations, using wireless sensing devices, RFIDs, handheld computers and decision support tools, as well as back-office computing infrastructure, in order to streamline the maintenance engineering process and make maintenance data transparently available at multiple levels of operation. Usage of such technologies is likely to be unfamiliar to maintenance personnel, and adequate training and training tools are needed to support their introduction and adoption. Steps in this direction are taken by developing a training strategy supported by e-learning technologies within the context of the Dynamite Project. The Dynamite Training Platform (DynaTrain) is built on a popular Open Source Learning Management System platform, Moodle. It comprises a set of lessons with practical and hands-on tutorials on how to use the Dynamite technology and tools. Lessons include modules for the Dynamite 'Inventory Tracking System', the 'MIMOSA Translator', a User Guide for the RFID-PDA subsystems, the 'Maintenance data acquisition system', the Prognosis Web Services and the USB Vibration sensor (Figure 5). Other lessons can easily be integrated into the system.
Figure 5. An example of DynaTrain with instructions on performing diagnosis with the USB vibration sensor.
4 CHALLENGES AND FUTURE PROSPECTS
E-maintenance seeks to implement ubiquitous maintenance management, wherein maintenance operations, planning and decision data, and the tools to process and act upon them, become available anytime, anywhere and to anyone authorised to access them at multiple levels of operation [11][12]. At the operational level, e-maintenance provides enabling technologies and tools to integrate functions related to monitoring component degradation status and availability state, to support personnel decisions with diagnostic and prognostic information, and to support the estimation of performance indicators [12]. At the tactical level, e-Maintenance provides enabling tools and information mediation to facilitate the implementation of the maintenance policy selected at the strategic level. It provides seamless interfaces with CMMS data and the central ERP system in order to ensure that the resources, services and management means necessary for maintenance intervention execution are made available. At the strategic level, e-Maintenance deals with making available the IT tools needed to support decisions about the maintenance policy to be adopted, as well as with defining this policy and assigning its execution to lower hierarchical maintenance layers. It is at this level that interfaces with internal and external logistics are needed. A key to the success of current e-Maintenance solutions is the degree and efficiency of integration at all levels of operation. Integration refers to information integration and interoperability, services integration and interoperability, as well as physical communications integration and interoperability. Information integration has been pursued by adopting interoperable XML-based data exchange formats, operational-level data exchange protocols such as OPC, and maintenance-oriented data format definitions such as MIMOSA. Services integration is most successfully implemented by web services, initially defined by the WSDL, UDDI and SOAP triplet, with UDDI being gradually neglected, and more recently with resource-oriented or RESTful services, which do not even employ SOAP envelopes. Finally, networking integration has been pursued through the combined use of wired and wireless communications. In a similar way that different wired communication protocols have competed in the past for industrial communications, such as 'bus' protocols for industrial automation and Ethernet-based protocols for higher-level communications, different wireless protocols are currently competing for wireless industrial networking. Wireless personal area network protocols, such as the 802.15 family of protocols, including ZigBee and Bluetooth, are more applicable at the operational level of functions, while Ethernet-based protocols, such as the 802.11 family of protocols (WiFi), are employed at higher levels. While significant strides have been made on all fronts, there are still gaps between existing research-based solutions, which take advantage of current enabling technologies, and everyday industrial practice. Performance degradation due to wireless networking interference and scatter severely hampers performance in industrial installations. Services integration
between e-Maintenance, CMMS and ERP solutions is still not fully addressed. Furthermore, while individual software solutions may adopt standardised data representations, data interoperability across platforms requires that all such platforms employ interoperable data formats. This does not currently hold for many CMMS and ERP solutions, and consequently e-Maintenance providers need to consider further steps to bridge this gap. Finally, at the operational level, smart wireless sensing devices employed for condition monitoring need to be further optimised to encapsulate more resources, in terms of CPU, memory, I/O handling and power, in a small form factor. Power efficiency itself needs to be addressed at multiple levels, including hardware, networking and operating system and middleware software. At the hardware level, the design effort addresses the minimisation of energy consumption by reducing sensor board energy consumption, while solutions for energy autonomy based on energy harvesting are continuously advancing. At the networking level, energy-efficient protocols are sought, with the ZigBee protocol currently claiming to offer competitive energy efficiency. The operating system and middleware employed in wireless sensor networks need to be energy efficient themselves, by enabling minimal use of CPU- or RF-intensive operations without compromising application requirements. RF energy consumption savings can also be achieved if more advanced and smart embedded software solutions are provided, enabling wireless nodes to operate with a higher degree of autonomy, adequately balancing internal processing and RF transmissions [15]. One can anticipate that foreseen cost reductions in all enabling technologies, devices and tools related to e-Maintenance are likely to encourage the greater adoption of e-Maintenance concepts and technologies by modern enterprises. It is therefore natural to anticipate that, for e-Maintenance to become everyday practice in industry, the offered solutions need not only be seen as reasonably priced, offering cost-efficient investment turnover benefits, but must also be seen to integrate well with the current and prospective future status of industrial practice and state-of-the-art technologies. This integration is likely to present additional challenges: in terms of technological integration, e-Maintenance solutions need to interface with legacy and other enterprise systems, offering data, activity and service interoperability; in terms of integration with human resources, they should offer simple and easy-to-operate interfaces, adequately designed and customised for the different roles and profiles of operating personnel; and in terms of integration with the overall enterprise strategies, the business case must be made for how efficiently e-Maintenance supports the organisation's objectives.
To confront many of these challenges, future e-Maintenance R&D is likely to focus on: user-driven design and development issues, placing the user within the design, development and testing loop; offering many of the often sophisticated maintenance management functionalities as black-box results to non-maintenance personnel, abstracting the inherent complexity of condition monitoring and maintenance management functions from higher organisational levels; and monitoring and integrating technological advances in an efficient way, offering simple and cost-efficient maintenance-related tools, services and results that ultimately facilitate maintaining sustainable business and production activities.
5 CONCLUSION
Satisfying industrial needs with the assistance of new technologies has been demonstrated in many of the Dynamite project results. Elements of asset identification, condition monitoring parameter measurement and condition assessment have been combined with cost-benefit analysis, semantic web capabilities and wireless communications. The link between all of these elements was supported by on-line MIMOSA databases and mobile computing devices, namely PDAs. It was shown that the combined technology approach was able to provide up-to-date, easy-to-access and comprehensive maintenance data. Timely, well-informed and cost-effective maintenance decisions may be made under such conditions. The project has been formulated in a modular manner, so as to offer a methodology by which technology upgrades to existing maintenance systems may be made. The DynaWeb concept and structure for flexible e-maintenance using the DYNAMITE technologies follows the OSA-CBM architecture and MIMOSA data structure. This is thought to provide great opportunities to extend the work in many different areas using a plug-and-play approach. Developments are expected in the fields of intelligent web services, smart mobile devices and wireless communications.
6 REFERENCES
1
Jantunen E., Arnaiz A., Emmanoulidis C., Iung B. & Adgar A. (2009) Räjäyttääkö Dynamiitti kunnossapidon?, ProMaint No. 3 (in Finnish)
2
Pinjala, S.K., Pintelon, L. & Vereecke, A. (2006) An empirical investigation on the relationship between business and maintenance strategies. International Journal of Production Economics, Volume 104, Issue 1, pp 214-229.
3
Lee, J. (2004) Infotronics-based intelligent maintenance system and its impact on closed-loop product life cycle systems. Proc. Int. Conf. on Intelligent Maintenance Systems, Arles, France, 15–17 July.
4
Halme J., (2008) Online lube sensing technologies for condition monitoring, BRIDGE-DYNAMITE-PROMISE Training workshop, Lausanne, Switzerland, 18th-20th February, 2008
695
5
Gorritxategi E., Terradillos J., Aranzabe A., Arnaiz A. & Aranzabe E., (2008) Novel Method for Lube Quality Status Assessment Based on Visible Spectrometric Analysis, Lubrication Management and Technology (Lubmat). San Sebastian ISBN 978-84-932064-5-1
6
Arnaiz A. Iung B., Jantunen E., Levat E., Gilabert E. (2007) DYNAWeb. A web platform for flexible provision of emaintenance services. Harrogate.
7
Adgar A., Addison J.F.D. & Yau, C-Y , (2007) Applications of RFID Technology in Maintenance Systems, Proceedings of the second World Congress on Engineering Asset Management (WCEAM) June 2007 Harrogate, UK.
8
Gilabert E., Ferreiro S. & Arnaiz A.,(2007) Web Services System for Distributed Technology Upgrade Within an eMaintenance Framework" OTM 2007 Ws. Part I, R. Meersman, Z. Tari, P. Herrero et al. (Eds.), Lecture Notes in Computing Science (LNCS) 4805, pp. 14
9
Arnaiz A., Jantunen E., Adgar A. & Gilabert E., (2009) Ubiquitous computing for dynamic condition based maintenance, Journal of Quality in Maintenance Engineering (JQME), Special issue on Condition Monitoring and ICT application - volume 15, issue 2 In print
10
Bengtsson M. (2003) Standardization issues in condition based maintenance, Department of Innovation, Design and Product Development, Mälardalen University, Sweden.
11
Lebold M., Reichard K., Byington C.S. & Orsagh R. (2002) “OSA-CBM Architecture Development with Emphasis on XML Implementations” MARCON 2002.
12
Gilabert E., Arnaiz A. (2006) Intelligent automation systems for predictive maintenance. A case study. Robotics and Computer Integrated Manufacturing (RCIM). Vol 22. (5-6), 543-549.
13
Levrat E., Iung B. & Crespo Marquez A. (2008) e-Maintenance: review and conceptual framework. Production Planning & Control, 19( 4):408–429
14
Arnaiz A., Jantunen E., Emmanoulidis C. & Iung B., (2006) Mobile Maintenance Management, Journal of International Technology and Information Management – JITIM 154, 11-22. ISBN:1063-519X
15
Campos J., Jantunen E. & Prakash O., Mobile Maintenance Decision Support System, Maintenance and Asset Management Journal, 23(2), 42-48
16
Emmanouilidis, C., (2008) Current trends in maintenance e-training and m-training, CM2008 & MFPT2008, The Fifth International Conference on Condition Monitoring and Machinery Failure Prevention Technologies, 15-18 July 2008, Edinburgh, UK.
17
Emmanouilidis, C, Katsikas, S and Giordamlis, C., (2008) Wireless Condition Monitoring and Maintenance Management: A Review and a Novel Application Development Platform, Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems Conference (WCEAM-IMS 2008) 27 – 30 October 2008, Beijing International Convention Center, Beijing, China, pp. 2030-2041.
Acknowledgments This paper summarises work performed as part of FP6 Integrated Project IP017498 DYNAMITE "Dynamic Decisions in Maintenance". The authors gratefully acknowledge the support of the European Commission, as well as the collaboration of all project partners.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DEVELOPMENT OF MULTIPLE CANTILEVERED PIEZO FIBRE COMPOSITE BEAMS VIBRATION ENERGY HARVESTER FOR WIRELESS SENSORS S. Olutunde Oyadijia, Shaofan Qib and Roger Shuttleworthb a
School of Mechanical, Aerospace and Civil Engineering, University of Manchester, Manchester M60 1QD, UK. b
School of Electrical and Electronic Engineering, University of Manchester, Manchester M60 1QD, UK.
There is considerable interest in the development of battery-free mobile electronics systems such as wireless sensors which are used for the condition monitoring of engineering assets, some of which are located in hostile environments. The main focus has been on the development of techniques for the harvesting of energy from ambient sources. One of the main sources of ambient energy is mechanical vibration. A number of vibration energy harvesting devices have been developed using electromagnetic, electrostatic or piezoelectric principles. A currently favoured approach is the use of piezoelectric fibre composite (PFC) materials in the form of a cantilevered beam with a tip mass. This enables significant energy to be harvested at the resonance peak of the PFC-mass vibration system. However, off the resonance peak, the harvested vibration energy is relatively small and, consequently, the bandwidth of reasonable energy harvesting is too small. To overcome this problem, this paper presents the use of a vibration energy harvesting device consisting of four PFC beams with tip masses. Each beam is tuned to a slightly different resonance frequency. Thus, the bandwidth of significant vibration energy harvesting is considerably extended. It is shown that the multiple cantilevered PFC beams vibration energy harvester can harvest energy from ambient vibrations more effectively than a single cantilevered PFC beam vibration energy harvester. Key Words: Vibration Energy Harvesting, Wireless Sensor, Cantilevered Beam, Piezoelectric Fibre Composite. 1
INTRODUCTION
Currently, most wireless sensor nodes are battery-powered and, therefore, require regular checks of the charge level of the batteries and of replacements of the batteries when they are no longer adequate. This considerably limits the autonomy of the sensor nodes and adds costs to any condition maintenance programme. Ideally, the sensor nodes should be self-powered and battery-less. This has led to an interest in designing electronics systems with the capability of deriving their electrical power needs from surroundings. The technique for doing this has come to be known as energy or power harvesting. It is essentially the conversion of low level ambient energy, such as vibration, heat or light energy, into usable but small amount of electrical power [1-5]. Mechanical vibration is one of the main sources of ambient energy. The harvesting of mechanical vibration energy has been achieved using electromagnetic [6-8], electrostatic [9-11] and piezoelectric [12-14] devices. But the development of piezoelectric vibration energy harvesters has dominated research effort in the last 5 to 10 years. In order to convert the ambient kinetic or vibration energy into electric power, piezoelectric materials are commonly incorporated in energy harvesting devices. A typical piezoelectric vibration energy harvesting device is a cantilevered beam consisting of one or two piezoelectric material layers bonded to the top and bottom surfaces of usually a metallic base structure. The configuration consisting of one piezoelectric layer is known as a unimorph while the configuration consisting of two piezoelectric layers is known as a bimorph. The vibration energy harvesting cantilevered beam is fixed to a vibrating structure from which vibration energy is to be extracted. The dynamic strain induced in the piezoelectric layers generates electric power across the electrodes of the piezoelectric layers. One of the most commonly used piezoelectric materials is piezoelectric fibre composite (PFC). It enables realistic applications which require highly distributed actuation and sensing, maintains the majority of the stiffness and bandwidth of
monolithic piezoelectric ceramics. A popular mode of application of PFC for vibration energy harvesting is as a cantilevered beam with a tip mass. The magnitude of the tip mass and its location along the length of the beam can be adjusted in order to obtain a resonance frequency which coincides with the dominant frequency of the ambient vibration energy. This enables significant amounts of energy to be harvested at the resonance peak of the PFC-mass vibration system. However, off the resonance peak, the harvested vibration energy is relatively too small and, consequently, the bandwidth of reasonable energy harvesting is too small. To overcome this problem, this paper presents the use of a vibration energy harvesting device consisting of four PFC beams with tip masses. Each beam is tuned to a slightly different resonance frequency. Thus, the bandwidth of significant vibration energy harvesting is considerably extended. The results of these investigations show that the multiple cantilevered PFC beams vibration energy harvester has the potential to harvest energy from ambient vibrations more effectively and efficiently and over a wider frequency band than a single cantilevered PFC beam vibration energy harvester.
2
THEORY
A cantilevered PFC beam is excited at the base. This induces dynamic strain in the beam which, therefore, produces a voltage across its electrodes. The theoretical simulation of the induced strain can be achieved by analysis of the beam as a continuous vibrating system with distributed parameters using the classical Bernoulli-Euler beam theory. The result of this analysis will give a multimodal behaviour. Alternatively, a cantilevered PFC beam can be represented by an equivalent lumped-parameter, single degree-of-freedom (SDOF) system. This latter approach will yield a unimodal behaviour but it is much simpler to employ in studying the voltage generating behaviour of the beam.
Figure 1. Cantilevered piezoelectric unimorph beam subjected to base excitation
2.1 Transverse Vibration of Cantilevered Beam
Figure 1 shows a cantilevered piezoelectric unimorph beam which is subjected to base excitation. The absolute transverse displacement of any point on the beam is given by,
$v(x,t) = v_i(x,t) + v_o(x,t)$    (1)
where vo(x,t) is the transverse displacement of any point on the beam relative to the clamped base of the beam, and vi(x,t) is the base excitation of the beam. Assuming that the beam is undamped, the equation of motion for free transverse vibrations in terms of the absolute displacement v(x,t) is given by the Bernoulli-Euler beam equation as,
$EI \frac{\partial^4 v}{\partial x^4} + \rho A \frac{\partial^2 v}{\partial t^2} = 0$    (2)
in which v = v(x,t), E is Young’s modulus of elasticity, I is second moment of area, ρ is mass density and A is cross-sectional area. PFC beams are usually quite thin and flexible such that their motion will be affected by air damping. They also have some internal structural damping. Including these damping effects, the equation of motion becomes [15,16],
$EI \frac{\partial^4 v}{\partial x^4} + c_s I \frac{\partial^5 v}{\partial x^4 \partial t} + c_a \frac{\partial v}{\partial t} + \rho A \frac{\partial^2 v}{\partial t^2} = 0$    (3)
where cs denotes structural damping coefficient and ca denotes air damping coefficient. Substituting Equation (1) in Equation (3) gives,
$EI \frac{\partial^4 v_o}{\partial x^4} + c_s I \frac{\partial^5 v_o}{\partial x^4 \partial t} + c_a \frac{\partial v_o}{\partial t} + \rho A \frac{\partial^2 v_o}{\partial t^2} = -\rho A \frac{\partial^2 v_i}{\partial t^2} - c_a \frac{\partial v_i}{\partial t}$    (4)
Equation (4) can be solved to obtain the transmissibility expression which relates the output transverse displacement response at the tip of the beam to the base excitation of the beam. In [15,16], Equation (4) has been solved to obtain the relative displacement transmissibility of the tip of the beam in the absence [15] or presence [16] of a tip mass. However, the focus in this paper is to use an equivalent single degree of freedom model to analyse the response of the beam in order to obtain the absolute displacement transmissibility of the tip of the beam.
2.2 Equivalent Single Degree of Freedom Model of Cantilevered Beam
Figure 2 shows the equivalent single degree-of-freedom (SDOF) model of the cantilevered piezoelectric unimorph beam subjected to base excitation. The absolute transverse displacement of the tip of the beam is given by vo(t) while vi(t) is the base excitation of the beam. The equivalent mass of the cantilevered beam is denoted by me while m denotes the added tip mass of the beam. Also, ke and ce are the equivalent stiffness and damping coefficient of the beam. The equivalent mass, equivalent stiffness and equivalent damping coefficient are related to the mass density ρ, Young’s modulus E, second moment of area I, cross-sectional area A and length L of the beam as follows [17],
$m_e = 0.24\,\rho A L; \qquad k_e = \frac{3EI}{L^3}; \qquad c_e = 2\zeta \sqrt{k_e (m_e + m)}$    (5)
Figure 2. SDOF model of cantilevered piezoelectric unimorph beam subjected to base excitation
where the damping ratio ζ includes the internal structural damping of the beam as well as the external air damping. Using Newton’s laws, the equation of motion of the SDOF system can be derived as,
$M \frac{d^2 v_o}{dt^2} + c_e \frac{d(v_o - v_i)}{dt} + k_e (v_o - v_i) = 0$    (6)
where M = me + m. If the base excitation is of the form,
$v_i = V_i e^{j\omega t}$    (7)
then, the tip response will be of the form,
$v_o = V_o e^{j\omega t}$    (8)
where $V_i$ is the amplitude of the excitation, $V_o$ is the complex amplitude of the response and $j = \sqrt{-1}$. Substituting Equations (7) and (8) in Equation (6) gives,
$(-M\omega^2 + j\omega c_e + k_e) V_o = (j\omega c_e + k_e) V_i$    (9)
which gives the complex transmissibility as,
$T = \frac{V_o}{V_i} = \frac{j\omega c_e + k_e}{-M\omega^2 + j\omega c_e + k_e}$    (10)
Dividing the numerator and denominator of the right-hand side of Equation (10) by $k_e$ gives,
$T = \frac{1 + j\,\omega c_e / k_e}{1 - M\omega^2 / k_e + j\,\omega c_e / k_e}$    (11)
Defining the undamped angular natural frequency and the viscous damping ratio, respectively, as:
$\omega_n = \sqrt{\frac{k_e}{M}} \quad \text{and} \quad \zeta = \frac{c_e}{2\sqrt{k_e M}}$    (12)
Then, Equation (11) becomes,
$T = \frac{1 + j 2\zeta\beta}{1 - \beta^2 + j 2\zeta\beta}$    (13)
where the frequency ratio $\beta = \omega / \omega_n = f / f_n$, and $\omega$, $\omega_n$ are measured in rad/s while $f$, $f_n$ are measured in Hz. Thus, the amplitude of the complex transmissibility is given by,
$|T| = \left[ \frac{1 + (2\zeta\beta)^2}{(1 - \beta^2)^2 + (2\zeta\beta)^2} \right]^{1/2}$    (14)
3
DISCUSSION OF PREDICTED RESPONSES
The transmissibility amplitudes for four cantilevered beams are predicted using Equation (14). Four cases, which involve different fundamental natural frequencies of the beams as shown in Table 1, are studied. For Case 1, the fundamental natural frequencies are 10, 11, 12 and 13 Hz, and for Case 2 they are 10, 12, 14 and 16 Hz. Similarly, for Case 3 the fundamental natural frequencies are 10, 15, 20 and 25 Hz, and for Case 4 they are 10, 20, 30 and 40 Hz. In all cases, equivalent viscous damping ratios of the beams of 0.01 and 0.02 are used in the predictions. The results are shown in Figures 3 to 6.
Table 1 Fundamental natural frequencies of the four cantilevered beams
Case ID    Fundamental natural frequency (Hz)
           Beam 1    Beam 2    Beam 3    Beam 4
Case 1     10        11        12        13
Case 2     10        12        14        16
Case 3     10        15        20        25
Case 4     10        20        30        40
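As a numerical illustration of Equation (14) and the cases in Table 1, the individual and summed transmissibility curves plotted in Figures 3 to 6 can be regenerated with a few lines of code. The Python sketch below is added here for illustration only; the frequency grid and the printed summary are assumptions, not part of the original analysis.

```python
import numpy as np

def transmissibility(f, fn, zeta):
    """Amplitude of the complex transmissibility, Equation (14)."""
    beta = f / fn                                   # frequency ratio beta = f / fn
    num = 1.0 + (2.0 * zeta * beta) ** 2
    den = (1.0 - beta ** 2) ** 2 + (2.0 * zeta * beta) ** 2
    return np.sqrt(num / den)

# Table 1: fundamental natural frequencies (Hz) of the four beams for each case
cases = {
    "Case 1": [10, 11, 12, 13],
    "Case 2": [10, 12, 14, 16],
    "Case 3": [10, 15, 20, 25],
    "Case 4": [10, 20, 30, 40],
}

f = np.linspace(0.5, 70.0, 4000)                    # excitation frequency axis (Hz), as in Figures 3-6

for zeta in (0.01, 0.02):                           # the two damping ratios used in the predictions
    for name, beam_freqs in cases.items():
        individual = [transmissibility(f, fn, zeta) for fn in beam_freqs]
        t_sum = np.sum(individual, axis=0)          # theoretical sum of the 4 transmissibility curves
        print(f"{name}, zeta={zeta}: peak of summed curve = {t_sum.max():.1f}")
```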
Figure 3: Transmissibility of SDOF equivalents of 4 cantilevered beams of fundamental resonance frequencies 10, 11, 12 and 13 Hz for damping ratios of (a) 0.01, (b) 0.02
Figure 3 shows a comparison of the transmissibility amplitudes of the 4 cantilevered beams for Case 1, for which the fundamental natural frequencies of the beams are 10, 11, 12 and 13 Hz. Figure 3(a) applies to a viscous damping ratio of 0.01 while Figure 3(b) is for a viscous damping ratio of 0.02. The thin curves are the transmissibility response curves for each beam while the thick curve is the theoretical sum of the 4 transmissibility curves. It is seen that an individual transmissibility curve has a sharp peak and a very narrow frequency band over which the transmissibility is greater than a factor of, say, 10. However, by using a combination of the 4 beams simultaneously and summing their amplitudes, the frequency range over which the
transmissibility is greater than a factor of 10 is significantly extended. Increasing the damping ratio from 0.01 to 0.02 causes the peak amplitudes of the individual and sum transmissibility curves to reduce.
Figure 4: Transmissibility of SDOF equivalents of 4 cantilevered beams of fundamental resonance frequencies 10, 12, 14 and 16 Hz for damping ratios of (a) 0.01, (b) 0.02
The individual and sum transmissibility curves for Case 2, for which the fundamental natural frequencies of the beams are 10, 12, 14 and 16 Hz, are shown in Figure 4(a) for a viscous damping ratio of 0.01, and in Figure 4(b) for a viscous damping ratio of 0.02. It is seen that by increasing the maximum difference of the fundamental natural frequencies from 3 Hz in Case 1 to 6 Hz in Case 2, the frequency range over which the transmissibility is greater than a factor of 10 has been increased from about 6 Hz to about 9 Hz. Similarly to the observation made for Case 1 (Figure 3), increasing the damping ratio from 0.01 to 0.02 causes the peak amplitudes of the individual and sum transmissibility curves to reduce. Also, the frequency range over which the transmissibility is greater than a factor of 10 is not affected by the increase in the viscous damping ratio.
Figure 5: Transmissibility of SDOF equivalents of 4 cantilevered beams of fundamental resonance frequencies 10, 15, 20 and 25 Hz for damping ratios of (a) 0.01, (b) 0.02
The predicted transmissibility curves for Case 3, when the fundamental natural frequencies of the beams are 10, 15, 20 and 25 Hz, are shown in Figures 5(a) and 5(b) for viscous damping ratios of 0.01 and 0.02 respectively. Comparing Figures 5(a) and 5(b) with Figures 3(a) and 3(b) and with Figures 4(a) and 4(b), respectively, it can be seen that the peak levels of the individual and sum transmissibility curves are approximately the same. However, the troughs of the sum transmissibility curves decrease as the difference between the individual natural frequencies of the beams increases. It can be observed from Figure 5(a) that the first trough has a transmissibility amplitude of about 8 (i.e. less than a factor of 10). However, the transmissibility amplitude is greater than a factor of 8 over a frequency range of about 17 Hz compared to a range of about 10 Hz in Case 2. A detailed comparison of Figures 5(a) and 5(b) shows that by increasing the viscous damping ratio from 0.01 to
0.02, the peak transmissibility amplitudes are reduced while the transmissibility amplitudes of the troughs are increased. These results show that increasing the difference between the individual fundamental natural frequencies of the beams can limit the frequency range over which the transmissibility amplitude of the sum curve is continuously above a specified value.
Figures 6(a) and 6(b) show the comparisons between the individual and sum transmissibility curves of the 4 cantilevered beams for Case 4, for which the fundamental natural frequencies of the beams are 10, 20, 30 and 40 Hz, for damping ratios of 0.01 and 0.02 respectively. The figures show that the individual transmissibility curves have variable frequency bands over which the transmissibility is greater than a factor of 10. The frequency band ranges from about 1 Hz for Beam 1 (fn = 10 Hz) to about 4 Hz for Beam 4 (fn = 40 Hz). Also, both figures show that the transmissibility amplitudes of the three troughs are less than 10. For a damping ratio of 0.01, Figure 6(a) shows that the transmissibility amplitudes of troughs 1, 2 and 3 are about 5.5, 6.0 and 8.0 respectively. But when the damping ratio is increased to 0.02, Figure 6(b) shows that the transmissibility amplitudes of troughs 1, 2 and 3 are slightly increased to about 6.0, 7.3 and 8.3 respectively. From these results, it can be concluded that increasing the difference between the individual fundamental natural frequencies of the beams can severely limit the frequency range over which the transmissibility amplitude of the sum curve is continuously above a specified value.
Figure 6: Transmissibility of SDOF equivalents of 4 cantilevered beams of fundamental resonance frequencies 10, 20, 30 and 40 Hz for damping ratios of (a) 0.01, (b) 0.02
Comparing Figures 3 to 6, it can be seen that the difference in the fundamental natural frequencies of the cantilevered beams has a more pronounced effect on the frequency range over which the transmissibility amplitude of the sum curve is continuously above a specified value. The value of the viscous damping ratio has a relatively less pronounced effect. To maximise the frequency range over which the transmissibility amplitude of the sum curve is continuously above a threshold value Tsum, the difference in fundamental natural frequency (Δf) should be large when Tsum is less than 10 but small when Tsum is equal to or greater than 10. For example, when Tsum = 5, Figure 6 shows that Δf = 10 Hz gives a frequency range of about 40 Hz compared to 17 Hz for Δf = 5 Hz (Figure 5), 8 Hz for Δf = 2 Hz (Figure 4) and 6 Hz for Δf = 1 Hz (Figure 3). But when Tsum = 10, Figure 6 shows that Δf = 10 Hz gives a maximum continuous frequency range of about 5 Hz compared to 13 Hz for Δf = 5 Hz (Figure 5), 8 Hz for Δf = 2 Hz (Figure 4) and 6 Hz for Δf = 1 Hz (Figure 3).
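The "maximum continuous frequency range above a threshold" used in this comparison can be computed directly from the summed transmissibility curve. The helper below is an illustrative sketch only; the function names, frequency grid and threshold are assumptions and not taken from the paper.

```python
import numpy as np

def transmissibility(f, fn, zeta):
    """Amplitude of the complex transmissibility, Equation (14)."""
    beta = f / fn
    return np.sqrt((1 + (2 * zeta * beta) ** 2) / ((1 - beta ** 2) ** 2 + (2 * zeta * beta) ** 2))

def max_continuous_band(f, t_sum, threshold):
    """Longest continuous frequency span (Hz) over which t_sum stays above threshold."""
    best, start = 0.0, None
    for freq, above in zip(f, t_sum > threshold):
        if above and start is None:
            start = freq                       # a continuous run begins
        elif not above and start is not None:
            best = max(best, freq - start)     # run ended; record its width
            start = None
    if start is not None:                      # run extends to the end of the grid
        best = max(best, f[-1] - start)
    return best

# Example: Case 4 (fn = 10, 20, 30, 40 Hz), damping ratio 0.01, threshold Tsum = 5
f = np.linspace(0.5, 70.0, 4000)
t_sum = sum(transmissibility(f, fn, 0.01) for fn in (10, 20, 30, 40))
print(max_continuous_band(f, t_sum, threshold=5.0))   # roughly 40 Hz, cf. the comparison above
```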
4
INITIAL EXPERIMENTAL TEST RESULTS
Initial transmissibility tests are carried out on 4 piezoelectric fibre composite (PFC) cantilevered beams. Each PFC beam has a total length of 135 mm, width of 13 mm and a total thickness (including thickness of steel shim and top and bottom piezoelectric fibre electrodes) of about 1.25 mm. One end of each beam is clamped in a box while a tip mass is attached to its free end as shown in Figure 7. The magnitudes of the tip masses used are 8.5, 15, 21 and 30 grams. The box containing the 4 PFC cantilevered beams is mounted on top of an electromagnetic vibrator which is driven from a power amplifier that is connected to a frequency response analyser. An accelerometer is mounted on the base of the box while a very small accelerometer is attached to the tip of each PFC cantilevered beam. The beams are excited sinusoidally over a frequency range from 0 to 50 Hz. The frequency response analyser acquired the tip responses of the PFC beams as well as the base excitation applied. From these measurements, the transmissibility amplitude data for each beam is produced in the frequency domain as the ratio of the output response to the input excitation.
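The last step, forming the transmissibility amplitude as the ratio of the tip response to the base excitation at each stepped-sine excitation frequency, can be illustrated as follows. This Python fragment is a hedged sketch of that ratio calculation, not the frequency response analyser's actual processing; the signal arrays, sampling rate fs and excitation frequency f_exc are assumed inputs.

```python
import numpy as np

def amplitude_at(signal, fs, f_exc):
    """Amplitude of the component of `signal` at the excitation frequency f_exc (Hz)."""
    t = np.arange(len(signal)) / fs
    # single-frequency Fourier coefficients (correlation with cosine and sine at f_exc)
    a = 2.0 * np.mean(signal * np.cos(2 * np.pi * f_exc * t))
    b = 2.0 * np.mean(signal * np.sin(2 * np.pi * f_exc * t))
    return np.hypot(a, b)

def transmissibility_point(base_accel, tip_accel, fs, f_exc):
    """Transmissibility amplitude at one stepped-sine excitation frequency."""
    return amplitude_at(tip_accel, fs, f_exc) / amplitude_at(base_accel, fs, f_exc)
```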
Figure 7. Multiple cantilevered PFC beams vibration energy harvester
Figure 8 shows the plots of the transmissibility amplitudes versus excitation frequency for the 4 cantilevered PFC beams. It can be deduced that the beams have different resonance frequencies of 10.9, 11.8, 13.2 and 15.9 Hz. For each beam, Figure 8 shows that the maximum vibration energy occurs at its peak amplitude which is at its resonance frequency. Away from resonance, it is seen that the amplitude and, hence, the vibration energy rapidly decreases. Thus, for a single cantilevered PFC vibration energy harvesting beam with tip mass, the frequency band of significant energy harvesting is small. However, when the 4 beams are used simultaneously, the resultant response amplitude can potentially be the sum of the individual amplitudes. Consequently, the magnitude and the frequency band of the vibration energy that can be harvested in the resonant region of the 4 beams will be greater than that of a single beam, as was demonstrated in Figures 3 to 6.
Figure 8. Transmissibility response of multiple cantilevered PFC beams
Comparing the measured and predicted transmissibility responses, it can be seen that the measured resonance frequencies of 10.9, 11.8, 13.2 and 15.9 Hz for the 4 PFC beams are close to those of Case 2 of the theoretical predictions, namely: 10, 12, 14 and 16 Hz. It can be seen that the measured individual transmissibility curves shown in Figure 8 compare fairly well with those predicted for Case 2, which are shown in Figure 4. Consequently, it would be expected that the experimental sum transmissibility curve will also compare fairly well with the predicted sum transmissibility curve shown in Figure 4. The challenge is to build an electronic circuit that will enable this to be done effectively with minimal power losses. The experimental transmissibility measurements need to be improved in quality, especially around the resonance frequencies, by decreasing the frequency step of the excitation around the resonance frequencies.
5
CONCLUSIONS
A theoretical basis for analysing the vibration transmissibility responses of a single cantilevered piezoelectric beam and of its single degree-of-freedom idealisation has been presented. This model has been used to predict the transmissibility responses of cantilevered PFC beams subjected to sinusoidal vibration. It has been shown that the difference in the fundamental natural frequencies of the cantilevered beams has a more pronounced effect on the frequency range over which the transmissibility amplitude of the sum curve is continuously above a specified value, and that the value of the viscous damping ratio has a relatively less pronounced effect. It has also been shown that the multiple cantilevered PFC beams vibration energy harvester can harvest energy from ambient vibrations more effectively than a single cantilevered PFC beam vibration energy harvester, as it provides a wider frequency range over which the transmissibility level and, hence, the strain levels induced in the beams, continuously exceed a threshold level for effective vibration energy harvesting.
6
REFERENCES
1 Anton SR & Sodano HA. (2007) A review of power harvesting using piezoelectric materials (2003–2006), Smart Materials and Structures 16, R1–R21
2 Sodano HA, Inman DJ & Park G. (2004) A review of power harvesting from vibration using piezoelectric materials, The Shock and Vibration Digest 36:197–205
3 Beeby SP, Tudor MJ & White NM. (2006) Energy harvesting vibration sources for microsystems applications, Measurement Science and Technology 13, 175–195
4 Cook-Chennault KA, Thambi N & Sastry AM. (2008) Powering MEMS portable devices – a review of non-regenerative and regenerative power supply systems with emphasis on piezoelectric energy harvesting systems, Smart Materials and Structures 17, 043001, 1–33
5 Priya S. (2007) Advances in energy harvesting using low profile piezoelectric transducers, Journal of Electroceramics 19, 167–184
6 Arnold D. (2007) Review of microscale magnetic power generation, IEEE Transactions on Magnetics 43, 3940–3951
7 Glynne-Jones P, Tudor MJ, Beeby SP & White NM. (2004) An electromagnetic, vibration powered generator for intelligent sensor systems, Sensors and Actuators A 110, 344–349
8 Williams CB & Yates RB. (1996) Analysis of a micro-electric generator for microsystems, Sensors and Actuators A 52, 8–11
9 Mitcheson P, Miao P, Start B, Yeatman E, Holmes A & Green T. (2004) MEMS electrostatic micro-power generator for low frequency operation, Sensors and Actuators A 115, 523–529
10 Roundy S, Wright PK & Rabaey J. (2003) A study of low level vibrations as a power source for wireless sensor nodes, Computer Communications 26:1131–1144
11 Roundy S, Wright PK & Rabaey J. (2002) Micro-electrostatic vibration-to-electricity converters, Proceedings of the ASME 2002 International Mechanical Engineering Congress and Exposition
12 Sodano H, Inman D & Park G. (2005) Generation and storage of electricity from power harvesting devices, Journal of Intelligent Material Systems and Structures 16, 67–75
13 Sodano HA, Park G & Inman DJ. (2004) Estimation of electric charge output for piezoelectric energy harvesting, Strain 40, 49–58
14 Jeon YB, Sood R, Jeong JH & Kim S. (2005) MEMS power generator with transverse mode thin film PZT, Sensors & Actuators A 122, 16–22
15 Erturk A & Inman DJ. (2008) On mechanical modeling of cantilevered piezoelectric vibration energy harvesters, Journal of Intelligent Material Systems and Structures 19:1311–1325
16 Erturk A & Inman DJ. (2007) Mechanical Considerations for Modeling of Vibration-Based Energy Harvesters, Proceedings of the ASME IDETC 21st Biennial Conference on Mechanical Vibration and Noise
17 Blevins RD. (1979) Formulas for Natural Frequency and Mode Shape, Van Nostrand Reinhold, New York
Acknowledgments The funding received for this work from the EU under the FP6 programme for the DYNAMITE project is gratefully acknowledged.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DESIGN AND IMPLEMENTATION OF A DYNAMIC POWER MANAGEMENT SYSTEM FOR WIRELESS SENSOR NODES Zhenhuan Zhua, S Olutunde Oyadijia, Samir Mekidb a
School of Mechanical, Aerospace and Civil Engineering, University of Manchester, Manchester, M60 1QD, U.K. b
Mechanical Engineering Dept. KFUPM, 31261 Dhahran, Saudi Arabia
For wireless sensor nodes to be fully autonomous and to have a long lifespan, they need to be made self-powering. This requires that energy be harvested from ambient sources, such as vibration, light and heat, in order to supply the electrical power requirements. Thus, there will be no need to replace the batteries that are currently used to power wireless sensor nodes. The long-term goal is to make wireless sensor nodes truly battery-free. However, the energy harvested from the indoor environment is not matched to the power requirement of wireless sensor nodes, as it is very small and random. In order to solve this problem, a middleware for the dynamic power management of a sensor node is proposed, the hardware structure of the middleware and the system operation flow are described, and the performance of the system is evaluated. It is shown that the proposed middleware is an effective way of solving the challenging problem of providing a sensor node with an extended lifespan. Key Words: energy harvesting system, power management, wireless sensor node 1
INTRODUCTION
The lifespan of wireless sensor networks (WSNs) is a bottleneck for their applications. This is because sensor nodes are powered by batteries, and the energy stored in the batteries is normally limited by their volume, as an entire node is expected to have a small size. A sensor node often consumes considerable energy for frequent wireless communications and a large computing workload. Battery replacement is not an option for networks with thousands of physically embedded nodes that are distributed randomly in a harsh environment that is difficult to access. So the maximisation of the network lifespan is one of the research highlights for wireless sensor networks. There is a lot of research on prolonging the lifespan of WSNs using techniques such as energy scavenging from the ambient environment, wireless recharging technology [1], and reducing the power consumption of wireless sensor networks. Normally the energy harvesting technique seems to provide a longer lifespan of WSNs than the other two methods do, as long as energy harvesting sources always exist and the components of the wireless sensor nodes work smoothly. An energy harvesting system normally consists of energy transducers, a circuit for energy adjustment, and components for energy storage. Energy transducers can convert mechanical vibration, thermal differences, light emissions, changes in electromagnetic field, air and liquid flow, and chemical energy from naturally recurring or biological processes [2] into electrical energy. The circuit for energy adjustment can convert the different inputs from the energy transducers into a suitable energy state for storing and releasing, and can dynamically implement power management. In general, energy storage components are supercapacitors or rechargeable batteries, and the selection considerations are mainly based on energy density, lifespan, current leakage, size, and the characteristics of the wireless sensor node being powered. This paper proposes a middleware bridging a wireless sensor node and energy transducers, for the power management of a wireless sensor node. The structure of this paper is as follows: Section 2 provides a survey of energy transducers; Section 3 analyses the power requirement of a wireless sensor node; Section 4 presents a scheme of an energy harvesting system for wireless sensor nodes that work indoors; Section 5 describes the operation flow of the middleware; Section 6 evaluates the performance of the system; finally, conclusions are given in Section 7. 2
SURVEY FOR ENERGY TRANSDUCERS
Ambient energy indoors can be harvested from mechanical vibration, fluorescent light, thermal differences and other sources. In this survey, the main purpose is to evaluate the output power of different transducers such as piezoelectric materials, solar panels, electromagnetic generators, and thermoelectric transducers. FS-2513P is a piezoelectric film, provided by PROWAVE. The polymer film is composed of polyvinylidene fluoride (PVF2). Its strain constant is 10–20 times larger than that of normal piezo ceramics and it is therefore ideal for converting mechanical to electrical energy. Its resonance frequency is 80 Hz [3]. TFM43, a piezoelectric buzzer provided by UNBRANDED, has a resonance frequency of 4.5 kHz [4]. In experimental tests, a maximum power output is obtained when the shaker works at 700 Hz. Piezoelectric fibre composites (PFCs) are provided by ACI [5], and their resonance frequencies are lower than those of other piezoelectric materials. PMG17 is an electromagnetic generator, a product developed by Perpetuum [6]. It is cylindrical in shape and is 5.5 cm in height and 5.5 cm in diameter. Thermo Life is a thermoelectric transducer developed by Thermo Life Energy Corp. When the temperature difference is 5 °C, at matched load it can generate 3 V of voltage, 10 µA of current and 30 µW of power; a 10 °C temperature difference generates 5.5 V of voltage, 25 µA of current and 135 µW of power at matched load. The maximum temperature difference is up to 100 °C, according to the datasheet of the Thermo Life transducer [7]. SA-064 is a solar panel that converts sunlight or light from fluorescent and incandescent lamps into electrical energy. The size of the panel, which is provided by SOLAREX, is 15 cm x 5.5 cm [8]. Experimental testing is carried out to evaluate the output power of the solar panel when it works indoors. The test conditions are that the light source is a fluorescent lamp with a power of 54 W [9], and the vertical distance between the solar panel and the light source is 270 cm. Data from the six energy transducers are summarised in Table 1; they are mostly from experiments based on an indoor environment, and only the data of PMG17 and Thermo Life are from the datasheets of the corresponding products. The experiments clearly show that the energy harvested from an indoor environment is small.
Table 1 Output of energy transducers
Transducers      Type             Short-circuit current   Open-circuit voltage   Conditions
FS-2513P         Piezoelectric    8.1 µA                  1.635 VAC              83 Hz
TFM43            Piezoelectric    2.18 mA                 4.12 VAC               700 Hz
PFCs             Piezoelectric    0.3 mA                  122 VAC                35 Hz
PMG17-120        Electromagnetic  10 mA                   4.5 VAC                60 Hz
Thermo Life      Thermoelectric   25 µA                   5.5 VDC                ∆T = 10 °C
SA-064           Solar            0.24 mA                 4.90 VDC               2.7 m from light source of 54 W
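For a rough comparison of the transducers in Table 1, the maximum power available to a matched resistive load can be estimated from the short-circuit current and open-circuit voltage, assuming a simple linear (Thevenin-like) source model. The following Python sketch is an illustration added here, not a calculation from the paper, and the AC entries are treated as magnitudes only.

```python
# Rough matched-load power estimate P ~ Voc * Isc / 4, assuming a linear source model.
# Values are the Table 1 measurements; the results are indicative rather than exact.
transducers = {
    "FS-2513P":    (1.635, 8.1e-6),    # (open-circuit voltage V, short-circuit current A)
    "TFM43":       (4.12, 2.18e-3),
    "PFCs":        (122.0, 0.3e-3),
    "PMG17-120":   (4.5, 10e-3),
    "Thermo Life": (5.5, 25e-6),
    "SA-064":      (4.90, 0.24e-3),
}

for name, (voc, isc) in transducers.items():
    p_max = voc * isc / 4.0            # W, matched-load approximation
    print(f"{name}: about {p_max * 1e6:.0f} µW available at matched load")
```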
3
REQUIREMENT OF A WIRELESS SENSOR NODE
The canonical architecture of a wireless sensor node is comprised of four subsystems for communicating, computing and controlling, sensing, and powering. Communicating subsystem normally consists of transceiver, antenna, and peripheral components. Its power consumption is evaluated by such parameters as voltage supply, transmitting current, receiving current, and current at power down mode. Table 2 provides power parameters of radio frequency modules based on Zigbee/IEEE802.15.4 protocol specification. In general, communication distance depends on transmitting power, i.e. the larger the transmitting power, the longer the
communication distance. Transmitting power can be controlled by user programming.
Table 2 Power parameters of typical RF modules
RF module          Supply Voltage (V)   RX mode current (mA)   TX mode current (mA)   Power down mode current (µA)
CC2420 [10]        2.1~3.6              18.8                   17.4                   0.9
MC13192 [11]       2.0~3.4              42                     35                     1
UZ2400 [12]        2.7~3.6              18                     22                     2
xBee [13]          2.8~3.4              50                     45                     <10
xBee-PRO [14]      2.8~3.4              55                     270                    <10
NanoPAN5360 [15]   2.4~3.6              35                     78                     1.5
NanoPAN5361 [15]   2.4~3.6              35                     78                     1.5
The computing and controlling subsystem is normally based on a microprocessor. For general purposes, many components are integrated in microprocessors, for example, ADC, DAC, RAM, timers, and so on. Because some components are not necessary for a particular application, and resident components still consume some energy even when they are not being used, a microprocessor is selected according to the requirements of the specific application during the design of a wireless sensor node. Table 3 lists power parameters of some microprocessors.
Table 3 Power parameters of some microprocessors
Microprocessor        Supply Current (mA)   Supply Voltage (V)   Run Frequency (MHz)   Power down mode current (µA)
C8051F930 [16]        4.25                  0.9                  25                    0.05
PIC18F4620 [17]       16                    4.2                  40                    0.1
MC9s08GT [18]         6.5                   3                    16                    2.5
ATMEGA128L [19]       5.5                   3                    4                     <5
MSP430CG4618 [20]     0.4                   2.2                  1                     0.35
ML610Q431 [21]        0.65                  1.1                  4                     0.25
Because supply voltage and clock frequency significantly affect the power consumption and performance of a microprocessor, several working modes are set for dynamic power management.
The sensing subsystem generally includes one or more sensors, which may be homogeneous or heterogeneous. In general, the power of the sensors is lower than that of the other subsystems in a sensor node, but the energy consumed by the sensors is still significant because their operation time is very long over the lifespan of a sensor node. The powering subsystem normally uses a DC-DC device to convert the single voltage of the power source into different voltage values to satisfy the requirements of different components. For example, the voltage supply of a microprocessor is around 3 V; the voltage supply of some sensors may be up to 5 V or so; and it is also possible that a negative voltage is required for a special sensor subsystem. Power is also lost in the conversion efficiency and in the DC-DC device itself. The power consumption distribution for each subsystem mentioned above is described in Figure 1, which has been produced from data provided in Ref. [22]. It is clear that the transceiver consumes a large amount of energy, while the sensors consume comparatively little.
[Bar chart: Power (mW) for the modes Sensors, CPU, TX, RX, Idle and Sleep]
Figure 1. Power consumption distribution in a wireless sensor node [22]
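To translate figures such as those in Tables 2 and 3 and Figure 1 into a supply requirement, the average current of a duty-cycled node can be estimated as a weighted sum of the mode currents. The sketch below is illustrative only; the mode currents and duty cycles are assumed example values, not measurements from the paper.

```python
# Illustrative duty-cycle budget for a node built from a CC2420-class radio and a
# low-power microcontroller (example currents in mA, fractions of time per mode).
modes = {
    #          current_mA, fraction_of_time
    "tx":      (17.4,      0.01),
    "rx":      (18.8,      0.04),
    "cpu":     (4.0,       0.05),
    "sensing": (1.0,       0.10),
    "sleep":   (0.01,      0.80),
}

avg_current_mA = sum(i * d for i, d in modes.values())
peak_current_mA = 17.4 + 4.0           # radio transmitting while the MCU is active
supply_voltage = 3.3                   # V, the lower bound discussed in the text below

print(f"average current ~ {avg_current_mA:.2f} mA, "
      f"average power ~ {avg_current_mA * supply_voltage:.2f} mW, "
      f"peak current ~ {peak_current_mA:.1f} mA")
```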
According to the energy consumption analysis presented above, the power supply subsystem should provide a supply voltage whose lower bound is 3.3V, as this matches the requirement of typical low-power sensors, and a supply current whose lower bound is 20mA, i.e. the lower bound should be larger than the sum of the current consumption of the radio transceiver and the microprocessor. Because some unexpected factors exist, a certain margin should be allowed on these parameters. 4
MIDDLEWARE FOR DYNAMIC POWER MANAGEMENT
By combining the survey of energy transducers and the power consumption analysis of a wireless sensor node, the conclusion is clear that energy scavenged from an indoor environment cannot directly power a wireless sensor node. In order to solve this problem, a middleware, shown in Figure 2, is proposed to bridge the energy transducers and a wireless sensor node and to implement dynamic power management of the sensor node. The middleware is a mechanism that can store and release energy. Its design is based on two assumptions: the wireless sensor node can enter a sleep state for energy saving, as very little energy is consumed during the sleep state; and the energy transducers must work continuously so that more energy can be scavenged. The middleware can be used to power a wireless sensor node of any size of power consumption by re-configuration of the rechargeable batteries. Two rechargeable battery sets are used to power the wireless sensor node in turn. When the voltage supply of the battery set that is powering the node declines to the lower bound of the voltage supply for the node, the other battery set is switched in, and the battery set losing energy is charged by a supercapacitor storing the energy harvested from energy transducers such as solar panels or vibration generators.
4.1 Hardware description The schematic diagram of the middleware is shown in Figure 4. Here, R7 R8 and Q2 form a switch circuit to control the output of the battery set1, and R13, R14 and Q4 control the output of set2; the two battery set outputs power a wireless sensor node in turn. R5 R6 and R11 R12 form two sampling circuits that detect the voltage states of battery set1 and set2 respectively. R3 R4 and Q1 form a switch circuit to control the charge of battery set1, and R9, R10 and Q3 control the charge of set2. R1 R2 form a sampling circuit for detecting the voltage states of a super capacitor storing the energy harvested from a solar panel or an electromagnetic generator respectively. As electromagnetic generator produces an alternating output, a full bridge rectifier is used to convert AC to DC. A ZENER diode is used to protect the supercapacitor and rechargeable batteries. The whole system is controlled by a PIC microprocessor and there are four LEDs to indicate the states of four switches. The circuit of R16 R19, S1 and C3 starts the system, and controls the PIC processor to turn LEDs on or off in order to save power consumption on LEDs. In other words, a double-function button S1 is used to start the system after the system is powered, and it is used to control LEDs on or off when the system works. The schematic circuit diagram of the middleware is shown in Figure 4. The prototype of the middleware is shown in Figure 5.
Figure 2. Hardware structure of middleware
Figure 3. Prototype of the middleware
4.2 Working process
In order to clearly express the working process of the system, the dynamic behaviour of the system is divided into five states, and the conversion between states is described by a finite state machine in Figure 4. The parameters and states are defined as follows:
VLOW: lower bound of the system voltage supply, comprising VLOWL and VLOWH. VLOWL is a threshold value; the wireless sensor node stops working if its voltage supply is less than it. VLOWH is a second threshold value between VLOWL and VUP, set in order to avoid frequent switching between states.
VUP: upper bound of the system voltage supply
VSET1: voltage of battery set1
VSET2: voltage of battery set2
Q1-Q4: four MOSFET transistors. Q1on indicates that Q1 is turned on, and Q1off indicates that Q1 is turned off.
S0: the initial state of the system when it is powered. In this state, the ports and internal components of the PIC processor are initialized.
S1: the state in which battery set1 powers the wireless sensor node (Q2 on) and set2 stops powering (Q4 off). The microprocessor checks VSET2. Q3 is turned on if VSET2 is less than VUP, and Q3 is turned off if VSET2 is larger than VUP.
S2: the state in which battery set2 powers the wireless sensor node (Q4 on) and battery set1 stops powering (Q2 off). The processor checks VSET1. Q1 is turned on if VSET1 is less than VUP, and Q1 is turned off if VSET1 is larger than VUP.
S3: the state in which battery set1 and set2 both stop powering the sensor node. S3 is entered from S1, so Q3 is kept on so that battery set2 continues to be charged by the energy stored in the supercapacitor.
S4: the state in which battery set1 and set2 both stop powering the sensor node. S4 is entered from S2, so Q1 is kept on so that battery set1 continues to be charged by the energy stored in the supercapacitor.
Figure 4 State transfers
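The transition logic of Figure 4 can be expressed compactly in software. The following Python sketch is an illustrative model only; the threshold values and function names are assumptions, and it is not the firmware that runs on the PIC microprocessor.

```python
# Illustrative model of the Figure 4 state machine; thresholds are example values
# in the spirit of VLOWL, VLOWH and VUP (volts), not the prototype's exact settings.
V_LOWL, V_LOWH, V_UP = 3.5, 3.65, 4.2

def next_state(state, v_set1, v_set2):
    """Return the next state given the voltages of the two battery sets."""
    if state == "S0":
        return "S1"                                  # assume set1 powers the node first
    if state == "S1":                                # set1 powers the node
        if v_set1 < V_LOWL:
            return "S2" if v_set2 > V_LOWH else "S3"
        return "S1"
    if state == "S2":                                # set2 powers the node
        if v_set2 < V_LOWL:
            return "S1" if v_set1 > V_LOWH else "S4"
        return "S2"
    if state == "S3":                                # neither set powers the node; set2 is charging
        return "S2" if v_set2 > V_LOWH else "S3"
    if state == "S4":                                # neither set powers the node; set1 is charging
        return "S1" if v_set1 > V_LOWH else "S4"
    raise ValueError(f"unknown state {state}")

# Example: set1 discharges below VLOWL while set2 has recovered above VLOWH
print(next_state("S1", v_set1=3.45, v_set2=3.9))     # -> S2
```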
5
EVALUATION
The evaluation is mainly to test the correctness of the middleware functions and to calculate the power consumption of the middleware circuit itself.
5.1 Test of middleware functions
In the experiment, the voltage supply of the Imote2 wireless sensor node used is between 3.5V and 4.2V, and the current supply is 200mA or so. Here, two cycles of the charging and powering process of the middleware are observed. Figure 5 shows two cycles of the voltage waveforms on battery set1, Figure 6 shows two cycles of the voltage waveforms on battery set2, and Figure 7 is for the supercapacitor. It can be seen that the middleware works as intended. Battery set1 is charged while
battery set2 is powering, and the switching occurs when the voltage on battery set2 is less than 3.65V; when the voltage on battery set1 declines to 3.55V, the switching occurs again; the supercapacitor charges set1 and set2 in turn. Because of the chemical characteristics of the battery sets, the lower and upper bounds of the voltage supplies for the two battery sets show a minor difference.
5.2 Power Analysis of Middleware Circuit
The entire system shown in Figure 2 consists of such sub-circuits as the switch and control, voltage sampling, processor, LED and driver, starting circuit, and a full bridge rectifier. It is necessary to extract the formula for calculating the power consumption of each sub-circuit, so that the entire system can be correctly simulated. In this section, the first task is to describe the calculation formulas of power consumption for each sub-circuit, and the second task is to discuss the power distribution in different operation modes.
5.2.1 Power consumption calculation of sub-circuits
In order to obtain accurate simulation results with high performance, it is necessary to extract mathematical formulas from each sub-circuit, which are used to calculate energy consumption in the simulation system.
5.2.1.1 Switch and control sub-circuit
As shown in Figure 4, there are four switch and control sub-circuits, each of which consists of two resistors RSL, RSS and a MOSFET transistor. The four control pins are IN1, IN2, OUT1, and OUT2 respectively. When the output of any control pin is at a low electrical level, the corresponding transistor is turned on, and the power consumption is expressed in formula (1). When the output is at a high level, the transistor is off, and the consumption is expressed in (2).
PSC(on) = RDS(on)*I²OUT + (VDD – VPINL)²/(RSL+RSS)    (1)
PSC(off) = (VDD – VPINH)²/(RSL+RSS)    (2)
Figure 5 Voltage waveform of battery set1
Figure 6 Voltage waveform of battery set2
Figure 7 Voltage waveform on supercapacitor
5.2.1.2 Voltage sampling unit
There are three voltage sampling units in the middleware. Each unit is made up of two resistors RADC. The voltage at the check point indicates only half of the sampled voltage value. This is to prevent going beyond the measurement range, as the voltage values on the two rechargeable battery sets and the supercapacitor are higher than VDD. The power consumption is expressed in formula (3).
PVSU = V²DD/(2RADC)    (3)
5.2.1.3 Processor unit
According to the datasheet of the PIC16F506, when it works at 2V and 4MHz, the current supply is 170µA. The power consumption of the processor unit is comprised of the power consumption of the diode and the processor [23]. The power consumption calculation of the processor unit is formulated in (4).
PCPU = 0.17²*RDIODE + (VSET – 0.17*RDIODE)*0.17    (4)
5.2.1.4 LED and driver
The middleware includes four LEDs and corresponding drivers, which are controlled by the pins RB5, RB4, RC5, and RC4. When one of these pins is at a low electrical level, the corresponding LED glows, and the power consumption can be calculated according to formula (5). When the level on the pin is high, the consumption is expressed in (6).
PLED(lo) = (VDD – VPINL)²/RLED    (5)
PLED(hi) = (VDD – VPINH)²/RLED    (6)
5.2.1.5 Starting sub-circuit
The starting sub-circuit consists of a switch button, a capacitor, and two resistors R19 and R20. The switch button has two functions: one is to start the system working after the system is powered, and the other is to show the states of the switch and control sub-circuit when the button is pressed down. The energy consumptions are formulated in (7) when the button is released, and in (8) when the button is pressed down.
PSS(br) = (VDD – VPINH)²/(R19 + R20)    (7)
PSS(bp) = (VDD – VPINL)²/R19    (8)
5.2.1.6 Full bridge rectifier
Normally the full bridge rectifier is a single chip – a special product – or is made up of four Schottky diodes. The power consumption for the former can be checked in the corresponding datasheet. The consumption for the latter can be calculated by formula (9).
PFBR = 4*I²ACinput*RDIODE    (9)
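The sub-circuit formulas (1)-(9) lend themselves to a simple simulation helper. The following Python functions are an illustrative transcription of the formulas above; component values are placeholders rather than the values used in the prototype, and consistent SI units are assumed throughout.

```python
def p_switch(on, i_out, v_dd, v_pin_low, v_pin_high, r_ds_on, r_sl, r_ss):
    """Switch and control sub-circuit, formulas (1) and (2)."""
    if on:
        return r_ds_on * i_out**2 + (v_dd - v_pin_low)**2 / (r_sl + r_ss)
    return (v_dd - v_pin_high)**2 / (r_sl + r_ss)

def p_sampling(v_dd, r_adc):
    """Voltage sampling unit, formula (3)."""
    return v_dd**2 / (2 * r_adc)

def p_processor(v_set, r_diode, i_cpu=170e-6):
    """Processor unit, formula (4): diode loss plus processor supply at 170 uA."""
    return i_cpu**2 * r_diode + (v_set - i_cpu * r_diode) * i_cpu

def p_led(lit, v_dd, v_pin_low, v_pin_high, r_led):
    """LED and driver, formulas (5) and (6)."""
    v_pin = v_pin_low if lit else v_pin_high
    return (v_dd - v_pin)**2 / r_led

def p_rectifier(i_ac_input, r_diode):
    """Full bridge rectifier made of four Schottky diodes, formula (9)."""
    return 4 * i_ac_input**2 * r_diode
```

Summing such terms over the active sub-circuits in each state gives the middleware consumption figure reported in Section 5.2.3.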
5.2.2 Power distribution in different operation modes
The power distribution of the system is a time function and depends dynamically on the operation modes. The energy states for the different modes are analysed and calculated as follows:
During S0 state, the microprocessor unit, starting unit, full bridge rectifier, and three voltage sampling units are in working mode. The four switch and control sub-circuits are turned off, and the four pins that control the LEDs and drivers are at a high electrical level, i.e. none of the LEDs light up. The power consumption can be calculated according to the formulas defined in section 5.2.1.
During S1 state, the microprocessor unit, full bridge rectifier, and three voltage sampling units are in working mode. The starting unit does not work, two switch and control sub-circuits are turned off, and the other two sub-circuits are turned on. The states of the four LEDs and drivers are the same as those in S0 state. Apart from the power consumption of the middleware itself described above, the energy consumption also includes the wireless sensor node power consumption, which can be calculated from such parameters as PTR, PW, PS, TTR, TW, and TS, according to the states Oi, i ∈ {0, 2}. Here, the energy provider is rechargeable battery-set1. In S1 state, the supercapacitor charges rechargeable battery-set2 if VSET2 is less than VUP and VSC is larger than VSET2. The power change in S2 state is very similar to that in S1 state. The difference is only that the energy provider is rechargeable battery-set2, and the supercapacitor charges rechargeable battery-set1 if VSET1 is less than VUP and VSC is larger than VSET1. S3 state is converted from S1, and the middleware stops powering the wireless sensor node as the energy stored in both rechargeable battery-sets is too small to do so. In general, rechargeable battery-set2 is charged by the supercapacitor in S1 state, so battery-set2 continues to be charged in S3 state. When the voltage on set2 is larger than VLOWH, the system state is switched into S2 state. During S3 state, the microprocessor unit, full bridge rectifier, and three voltage sampling units are in working mode. The starting unit does not work, three switch and control sub-circuits are turned off, and only one sub-circuit is turned on, for charging battery-set2. The four LEDs are turned off. S4 state is converted from S2, and the power change in S4 state is very similar to that in S3 state. The difference is that the supercapacitor charges battery-set1 rather than set2. When the voltage on set1 is larger than VLOWH, the system state is switched into S1 state.
5.2.3 Evaluation of energy consumption
According to the parameters of the electronic components, the power consumption of the switch and control circuit is 11.32µW when the switch is turned on and used to power a wireless sensor node with a current supply of 30mA. The power consumption of charging a rechargeable battery set by the same switch circuit is 0.004µW. In the design, the power consumption of the sampling circuit is 0.036mW, 0.676mW is for the processor unit, 0.027mW is for the four LEDs and their drivers when the double-function button is not pressed down, 0.0017mW is for the starting circuit, and 2.12µW is for the full bridge rectifier. The power consumption of the middleware is the sum of the energy consumption of all sub-circuits, and is calculated as 0.9185mW.
6 CONCLUSION AND FUTURE WORK
The paper has proposed a middleware, a dynamic power management system for wireless sensor nodes. The proposed middleware can effectively extend the lifespan of a wireless sensor node by collecting small amounts of energy from the environment and accumulating that energy to dynamically charge rechargeable batteries or supercapacitors. This is a feasible solution to the challenging problem of how to harvest the small energy available from an indoor environment to power a wireless sensor node with high energy consumption. A simulation model of the dynamic power management system will be investigated in the future.
An optimal energy harvesting system should be small in size and should be truly battery-free.
7 REFERENCES
1
http://web.mit.edu/newsoffice/2007/wireless-0607.html
2
http://www.aldinc.com/pdf/EH300Brochure.pdf, 02/2009.
3
http://www.farnell.com/datasheets/81206.pdf , March 2009
4
http://video-equipment.globalspec.com/datasheets/1876/MynTahl/92ABCC5E-029F-4E63-90F4-AA249E44C1FC March 2009
5
http://www.advancedcerametrics.com/pages/energy_harvesting_components
6
http://www.perpetuum.co.uk/resource/PMG17%20- %20Technical%20Specification%20Rev%202%200.pdf , June 2008
7
http://www.poweredbythermolife.com/thermolife.htm, June 2008
8
http://www.farnell.com/datasheets/45065.pdf , March 2009
9
http://www.gelighting.com/eu/resources/literature_library/product_brochures/downloads/t5_brochure_en.pdf, 02/2009
10 http://enaweb.eng.yale.edu/drupal/system/files/CC2420_Data_Sheet_1_4.pdf , March 2009 11 http://www.freescale.com/files/rf_if/doc/data_sheet/MC13192.pdf , March 2009
12 http://www.coretk.com/CataLog/cata_img/FILE/183366731/UBEC/168/168_176_1146034480.pdf , March 2009
13 http://www.sparkfun.com/datasheets/Wireless/Zigbee/XBee-Datasheet.pdf , March 2009
14 http://www.rev-ed.co.uk/docs/XBE001.pdf , March 2009
15 ftp://ftp.efo.ru/pub/nanotron/nanoPAN%205360%20-%205361.pdf , March 2009
16 https://www.silabs.com/Support%20Documents/TechnicalDocs/C8051F930_short.pdf , March 2009
17 http://www.datasheetcatalog.com/datasheets_pdf/P/I/C/1/PIC18F4620.shtml , March 2009
18 http://www.datasheetpro.com/6979_view_MC9S08GT_datasheet.html , March 2009
19 http://www.datasheetcatalog.com/datasheets_pdf/A/T/M/E/ATMEGA128L.shtml , March 2009
20 http://datasheet.emcelettronica.com/ti/MSP430CG4618 , March 2009
21 http://www.okisemi.eu/docbox/PEPL610Q431_Q432-02.pdf , March 2009
22 Deborah Estrin. Wireless Sensor Networks Tutorial Part IV: Sensor Network Protocols. Mobicom, Sep. 23-28, 2002. Westin Peachtree Plaza, Atlanta, Georgia, USA.
23 http://melabs.picbasic.com/devicedata/41268A.pdf , May 2009
Acknowledgments The funding received for this work from the EU under the FP6 programme for the DYNAMITE project is gratefully acknowledged.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A ROADMAP FOR INFORMATION TECHNOLOGY GOVERNANCE Dr. Abrar Haider a, b a
b
CRC for Integrated Engineering Asset Management, Brisbane, Australia
School of Computer and Information Science, University of South Australia, Mawson Lakes Campus, SA 5095, Australia.
Engineering enterprises traditionally adopt a technology-centred approach to asset lifecycle management, where technical aspects of asset configuration, design, and operation command most resources and are considered first in their planning and design stage. On the other hand, asset lifecycle support infrastructure like information technologies to support the asset lifecycle, lifecycle process maturity, and other organisational factors are considered relatively late in the process and sometimes only after the assets are operational. In this way, these enterprises mature technologically along the continuum of standalone technologies to integrated systems, and in so doing aim to achieve maturity of processes enabled by these technologies, and the skills associated with their operation. However, information technologies influence and are influenced by the context of their implementation and have a direct relationship with organisational evolution. Therefore, the success of information technologies implementation for asset lifecycle management depends on their alignment with business processes. This paper provides a case for governance of information technologies utilised for asset lifecycle management. It is of particular interest to the organisations that are currently using Enterprise Resource Planning systems like SAP as the core technology for asset lifecycle management or have adopted a Service Oriented Architecture, or those that have adopted both. It concludes that information technologies should not be taken as mere technical constructs; they are at the core of strategic alignment, value delivery, resource management, and risk management, which calls for a need to understand and govern the overall information technology architecture of the organisation. Key Words: Information technology, IT governance, Asset management. 1
INTRODUCTION
Asset lifecycle management is a set of information intensive processes. However, information requirements of these processes are prone to change, due mainly to the changes in the strategic, business, operational, and tactical environment of asset lifecycle management. The ability of an organisation to understand those changes not only contributes to its responsiveness, but also improves its capacity to enhance reliability of asset operations and to deliver optimised level of asset management. On the other hand, this ability is directly influenced by the way an organisation implements information technology (IT), which consequently acquires, processes, and presents information to enable asset managing organisations to understand these changes. IT in asset management, therefore, not only act as strategy translators but also act as strategy enablers, whereby the real value of IT relies upon how effectively these technologies are mapped to the asset lifecycle management processes and how effectively they are synchronised with other IT and operational technologies in the organisation. This highlights the value and necessity of IT governance, which can be succinctly put as the set of procedures that formalize IT and business alignment. Scope of IT governance in these circumstances spans from IT investment management to accountability of its choice and operation, as well as of its management coordination and control. This paper addresses specifics of IT governance relevant to some of the most common IT scenarios for asset management. It starts with a discussion of the role of IT for asset lifecycle management, followed by importance of IT governance and potential benefits of its implementation. Having set these foundations, the paper then presents a set of challenges and difficulties associated with IT governance. Subsequently, an overview of the main areas of interest for IT governance and available frameworks is included. The paper then describes a model for determining IT governance priorities, which is complemented by a set of key decisions for an effective governance strategy.
2 IT IMPLEMENTATION FOR ASSET MANAGEMENT
In theory, IT utilised in asset management has three major roles: firstly, IT systems capture, store, and exchange information spanning asset lifecycle processes; secondly, IT provides decision support capabilities through the analytical conclusions drawn from analysis of data; and thirdly, IT enables an integrated view of asset management through integration and interoperability of asset lifecycle information. IT systems thus help to translate asset management strategy into action by enabling asset lifecycle processes, and also inform the asset management strategy through their ability to analyse the lifecycle information that asset managers use in lifecycle planning and decisions. IT for asset management thus seeks to enhance the outputs of asset management processes through a bottom-up approach. This approach gathers and processes operational data for individual assets at the base level and, at a higher level, provides a consolidated view of the entire asset base (Figure 1).
Figure 1 contrasts the desired asset management outputs at each level with the corresponding IT implementation concern:
• Strategic level: providing an integrated view of asset lifecycle management information to facilitate strategic decision making at the executive level; IT concern – how must IT be implemented to provide an integrated view of the asset lifecycle?
• Tactical level: fulfilling asset lifecycle planning and control requirements aimed at continuous asset availability, through performance analysis based on various dimensions of asset information such as design, operation, maintenance, financial, and risk assessment and management; IT concern – how must IT be implemented to meet the planning and control requirements of asset lifecycle management?
• Operational level: aiding in and/or ensuring asset design, operation, condition monitoring, failure notification, maintenance execution and resource allocation, and enabling other activities required for smooth asset operation; IT concern – how must IT be implemented to meet the operational requirements of assets?
Figure 1: Scope of IT for Asset Management [1]
At the operational and tactical levels, IT systems are required to provide the necessary support for planning and execution of core asset lifecycle processes. For example, at the design stage designers need to capture and process information such as asset configuration; asset and/or site layout design and schematic diagrams/drawings; asset bills of materials; analysis of maintainability and reliability design requirements; and failure modes, effects and criticality identification for each asset. Planning choices at this stage drive future asset behaviour; the minimum requirement placed on IT at this stage is therefore to provide the right information at the right time, so that informed choices can be made to ensure availability, reliability and quality of asset operation. An important aspect of the asset design stage is supportability design, which governs most of the later asset lifecycle stages. The crucial factor in carrying out these analyses is the availability and integration of information, such that the supportability of all facets of asset design and development, operation, maintenance, and retirement is fully recognised and defined. In addition, effective asset management requires lifecycle decision makers to identify the financial and non-financial risks posed to asset operation, their impact, and ways to mitigate those risks. IT for asset management not only has to provide standardised, quality information but also has to support the control of asset lifecycle processes. For example, the design of an asset has a direct impact on its operation. Operation, in turn, is concerned with minimising disturbances to the production or service provision of an asset. At this level, it is important that IT systems are capable of providing feedback to maintenance and design functions regarding factors such as asset performance; detection of manufacturing or production process defects; design defects; asset condition; and asset failure notifications. Numerous IT systems are employed at this stage to capture data from sensors and other field devices and feed diagnostic/prognostic systems, such as Supervisory Control and Data Acquisition (SCADA) systems, Computerized
Maintenance Management Systems (CMMS), and Enterprise Asset Management systems. These systems further provide inputs to maintenance planning and execution. However, effective maintenance requires not only effective planning but also the availability of spares, maintenance expertise, work order generation, and other financial and non-financial support. This requires integration of the technical, administrative, and operational information of the asset lifecycle, such that timely, informed, and cost-effective choices can be made about the maintenance of an asset. For example, a typical water pump station in Australia is located away from major infrastructure and has a considerable length of pipeline assets that bring water from the source to the destination. The demand for water supply is continuous, twenty-four hours a day, seven days a week. Although the station may have an early warning system installed, maintenance labour at the water stations and along the pipeline is limited and spares inventory is generally not held at each station. Therefore, it is important to continuously monitor asset operation (which in this case includes equipment at the water station as well as the pipeline) in order to sense asset failures as soon as possible, preferably while they are still developing. However, early fault detection is not of much use if it is not backed up by the ready availability of spares and maintenance expertise. The expectations placed on the water station by its stakeholders are not just of continuous availability of operational assets, but also of the efficiency and reliability of support processes. IT systems, therefore, need to enable maintenance workflow execution as well as decision support by enabling information manipulation on factors such as asset failure and wear patterns; maintenance work plan generation; maintenance scheduling and follow-up actions; asset shutdown scheduling; maintenance simulation; spares acquisition; testing after servicing/repair treatment; identification of asset design weaknesses; and asset operation cost-benefit analysis. An important measure of the effectiveness of IT, therefore, is the level of integration it provides in bringing together the different functions of asset lifecycle management, as well as stakeholders such as business partners, customers, and regulatory agencies like environmental and government organisations. However, the effectiveness of IT depends on how it is implemented, since IT cannot be detached from the human action and understanding, social context, and cultural environment within which it is implemented. IT implementation is thus a strategic advisory mechanism that supports planning, decision making, and management of the implementation process, and facilitates organisational learning.
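To make the kind of decision support described above concrete, the following is a minimal sketch, using entirely hypothetical asset names, rules and helper types (none of which are drawn from the paper), of how an early-warning alert at a remote pump station might be triaged against spares and crew availability.

```python
from dataclasses import dataclass

@dataclass
class ConditionAlert:
    asset_id: str            # e.g. a pump or a pipeline segment
    severity: str            # "developing" or "urgent"
    detected_hours_ago: float

@dataclass
class SupportStatus:
    spares_on_hand: bool
    crew_available_in_hours: float

def plan_response(alert: ConditionAlert, support: SupportStatus) -> str:
    """Very simplified triage rule: early detection is only useful if
    spares and expertise can actually be mobilised in time."""
    if alert.severity == "urgent" and not support.spares_on_hand:
        return "expedite spares shipment and schedule emergency crew"
    if alert.severity == "developing" and support.spares_on_hand:
        return "raise planned work order for next maintenance window"
    return "monitor condition and pre-order spares"

# Example: a developing fault at an unmanned station holding no local spares
print(plan_response(ConditionAlert("PUMP-07", "developing", 6.0),
                    SupportStatus(spares_on_hand=False, crew_available_in_hours=18.0)))
```

In practice such rules would be driven by the integrated lifecycle information discussed above rather than hard-coded conditions.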
3 PERSPECTIVES ON IT IMPLEMENTATION
In computer science, implementation is considered an activity concerned with installation of the IT system and applications, and is centred entirely on the technical aspects of the IT development process. In the information systems paradigm, on the other hand, implementation is a process that deals with how to make use of hardware, software and information to fulfil specific organisational needs. This perspective of IT implementation is generally governed by two quite opposing views. In a technology-driven view, humans are considered passive entities whose behaviour is determined by technology. It is argued that technology development follows a causal logic between humans and technology, and is therefore independent of its designers and users. This mechanistic view assumes that human behaviour can be predicted, and therefore that technology can be developed and produced perfectly with an intended purpose. This view may hold true for deterministic machines, such as microcontrollers, which have a determined behaviour; for IT, however, it has inherent limitations due to its disregard of human and contextual elements. A corollary of this objective view is the managerial assumption that IT implementation increases productivity and profitability. This view works on the assumption that social and organisational transformation is measurable and can therefore be predicted. Consequently, management decisions are governed by the expectations from technology rather than by the means that enable technology to deliver those expectations. Although it is clear that these approaches have inherent limitations, these views dictate the majority of contemporary research and practice. The opposing stance to the traditional technical view is much more liberating and critically scrutinises the deterministic technological and managerial views of the relationship of technology with human, organisational, and social aspects. This view holds that technology has an active relationship with humans, in the sense that humans are considered constructors and shapers of technology as well as of reality. In this stance, technology users are active rather than passive, and their social behaviour, interaction, and learning evolve continuously towards improving the overall context of the organisation. The organisational change that results from IT implementation is not a linear process; it represents intertwined, multifaceted relations between people amid a variety of opposing forces, which makes human and organisational behaviour highly unpredictable. This unpredictability is attracting the attention of researchers seeking to uncover the relationship between humans and technology, and to develop emancipatory, human-centred technology [2]. As a consequence, IT implementation is increasingly being considered as strategic translation through the accomplishment of social action, and technological maturity in an organisation is viewed as an outcome of strategic choices and social action. These two views provide divergent perspectives on technology implementation and use, one considering it as structure and the other as process. The structure view holds that technology determines the business processes, whereas the process view argues that technology alone cannot determine the outcomes of business processes and is in fact open to intentional purpose. Schienstock et al. [3] summarise various perceptions of technology implementation using different descriptions (see Table 1).
When these descriptions are viewed in the light of the two views described here, the first three metaphors, i.e. tool, automation and control instrument, conform to the technical view. The process metaphor matches the emancipatory view, whereas the organisation technology and medium metaphors are debatable and can conform to either view.
Metaphor | Function | Aim
Tool | Support business process | Increase quality, speed up work process, cope with increased complexity
Automation technology | Elimination of human labour | Cost cutting
Control instrument | Monitoring and steering business process | Adjustment to changes, avoiding defects
Organisation technology | Co-ordination of business processes | Transparency, organisational flexibility
Medium | Setting up of technical connections for communication | Quick and intensive exchange of information and knowledge
Process | Improve information system | Continuous learning
Table 1: Perceptions on Technology Implementation [3]
However, a review of the literature on IT adoption and implementation reveals that researchers have attempted to address these issues from many different perspectives. At the same time, it also reveals that the value profile that organisations attach to IT implementation spans from simple process automation to strategic competitiveness. These theories have originated from a diversified range of disciplines, such as business management, organisational behaviour, computer science, mathematics, engineering, sociology, and cognitive science, and therefore represent a variety of views on IT adoption and use. They can, however, be classified into three broad categories: technology deterministic (such as information processing, task-technology fit, and agency theory); socio-technical interaction (such as actor network theory, socio-technical theory, and contingency theory); and organisational imperative (such as strategic competitiveness, resource-based view theory, and dynamic capabilities theory). Technology deterministic theories adopt a mechanistic view of organisations in which technology is applied to bring about a predicted, desired effect through appropriate methodologies and apparatus. Socio-technical theories focus on the interaction of technology with the social context of the organisation, and on the impact of these interactions in producing the desired objectives of technology implementation. Organisational imperative theories focus on the relationships between the environment the business operates in, business strategies and strategic orientation, and technology management strategies, in order to produce the required goals of the organisation. There is no right or wrong choice here, as the success of any of these strategies depends upon the type of organisation, the industry it is in, and the way it views technology. The fundamental building block of success, however, is the way an organisation aligns IT with the requirements of its business processes and finds the strategic fit between technology and organisational infrastructure and architecture. In essence, this calls for meeting IT goals and avoiding risks; alignment of IT and business; development of business opportunities through IT; and accountability for IT resource usage. The point is that although IT is the focus here, its governance should be the focus of management.
4 BUSINESS CASE FOR GOVERNANCE
Although there is no clear agreement on what constitutes IT governance, it can be succinctly put as the set of procedures that formalise IT and business alignment [4]. However, the term also relates to accurate reporting, IT investment management, accountability, management coordination and control [5]. Simonsson and Johnson recently took on the task of creating a new definition of IT governance; based on a study of over 60 articles, they concluded that "IT governance is basically about IT decision-making: The preparation for, making of and implementations of decisions regarding goals, processes, people and technology on a tactical and strategic level" [6]. IT governance is concerned with measuring IT outcomes and ensuring that every stakeholder is considered in IT strategic decisions, so that organisations can work towards their ultimate goals more effectively. These activities are important for every organisation regardless of size, type or sector, since all can benefit from accurately aligning IT initiatives with organisational goals [4]. In the past few years, governance has consistently been identified as a top management issue by Chief Information Officers [7]. There are many reasons behind the increased interest in governance and its appearance on most organisations' agendas. It is estimated that companies with effective means of governance enjoy much higher returns on assets than their counterparts [8]. Another report states that companies that effectively manage IT spending obtain higher earnings than over-spenders [9]. CIOs claim that governance promotes focused IT spending, increased quality control of business planning and visibility into the progress of IT projects [7]. Most companies will find a compelling case for governance given its role in ensuring compliance with the many regulations found in today's corporate environment (e.g. the Sarbanes-Oxley Act of 2002) [4]. There is a common
perception of governance as a constraining effort, far removed from resourcefulness. However, a clear case exists that effective governance ultimately promotes resourceful thinking by providing a framework that demands not only alignment but also cost-effective IT projects that deliver significant value to organisations [10].
5 IMPLEMENTING GOVERNANCE
5.1 Governance Challenges
CIOs report several organisational issues undermining IT governance implementations. Lack of commitment, awareness, recognition, engagement, accountability and education, together with avoidance, resistance and misalignment, are some of the main roadblocks CIOs face in their governance efforts [7]. Although there is a clear business case for governance, several organisations have failed in their efforts to achieve it. Some of these organisations report failure to achieve competitive advantage through IT, others have tried to benchmark a framework from another company only to realise it did not suit their specific needs, while others have simply decided to disregard IT governance altogether [9]. In some cases governance frameworks have been implemented mainly as a set of supervisory boards and mechanisms that prevent the organisation from running smoothly, as they increase bureaucracy and effort. While in some instances strict controls are necessary, in other cases a more systemic and organic approach is advisable; the idea is to give people the information and resources necessary to take the right decisions with minimal intervention. This approach promises to empower people and provide flexibility at significantly lower cost than highly structured and regulated environments. There are several approaches to implementing governance in organisations, but to reap the benefits it is critical to choose the right one for the organisation [5].
5.2 Key Areas of Governance and Related Frameworks
Strategic alignment, value delivery, resource management, risk management and performance measures are the key components of an IT governance framework. Many standards are available to organisations looking to improve governance practices; these vary in complexity and focus but essentially work towards the same goals of compliance and control. Amongst the most widely accepted frameworks are COBIT, ITIL, COSO and CMMI. There is no one-size-fits-all solution, and the choice of governance framework should take into consideration organisational culture, objectives and line of business [4]. Choosing the right framework can be a science of its own; researchers have identified that most literature concerning governance deals either with the decision-making structures used to support it or with the question of which governance frameworks and implementation strategies work better in which organisations. That study asserts that subject matter experts all agree there are no universal IT governance structures [11]. There are, however, mechanisms to facilitate this process and spare IT executives the need to scrutinise hundreds of governance models [9].
5.3 Identifying Governance Priorities
It is clear that one of the main challenges of governance is identifying the best-fit approach for each organisation. Given the difficulty of the task, some models have been created to assist with it. An example is Accenture's IT Governance Model, which classifies organisations into four categories. The model presents four quadrants, one for each category. Organisations can be located in the model according to their rate of change, operational efficiency and product/service differentiation level [9]. The proposed model is as follows:
Figure 2 plots the industry/company rate of change (slow to fast) against the company's basis for competitive advantage (operational efficiency to product/service differentiation), yielding four quadrants: Efficient, Predictable Operators; Responsive Solution Providers; Information Integrators; and New Capabilities Enablers.
Figure 2: Accenture IT Governance Model [9]
Once an organisation's position in the quadrant has been determined, the expectations it should place on IT can be determined with the assistance of Table 2, which lists the C-level expectations of information technology for each company category.
Efficient, Predictable Operators:
• Meet business needs while supporting a low-cost orientation
• Keep costs low – minimise changes to and maximise the lifecycle of information technology assets; leverage cost-saving devices, e.g. shared services, co-sourcing, outsourcing
Responsive Solution Providers:
• Work jointly with businesses to develop prioritised investment plans and longer-term capability road maps
• Deliver the planned new capabilities to meet time-to-market windows
• Proactively manage lifecycles of information technology assets in accordance with the capability road maps
Information Integrators:
• Drive and enable businesses to leverage information for improved decision making and new product/service offerings
• Develop an information technology platform that enables rapid development of shorter-lifecycle, information-based business capabilities and offerings
• Offset any increases to information technology spending by generating higher revenues
New Capability Enablers:
• Develop an information technology organisation that is flexible enough to accommodate rapidly changing business strategies and requirements
• Foster an innovation culture to create innovative information technology-enabled business models and business capabilities through a combination of existing and emerging technologies
• Deliver the innovative capabilities to capture first-mover advantages
Table 2: Accenture IT Governance Model Description [9]
While the model will not point to a specific framework, it provides an important starting point for identifying the specific circumstances surrounding an organisation. Furthermore, it creates a reference point for classifying case studies and facilitating benchmarking across similar organisations.
5.4 Key Governance Decisions
Once IT priorities and expectations have been identified, it is important to take some critical decisions that shape an organisation's governance model. Accenture has identified several of them: organisational model, investment, architecture, standards and resources [9]. First it is important to decide whether a centralised, decentralised or hybrid model will be adopted for the IT function within the company. Then it is critical to know the key IT investment priorities and determine their resource allocation. An appropriate architecture model should then be identified, taking into account the requirements in terms of flexibility and stability of IT services. Standards can be used to simplify the integration and maintenance of IT, so it is important to identify which should be adopted by the organisation [9]. Once these decisions have been taken, the organisation will be able to create a clear IT strategic road map and begin its governance efforts.
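As a simple illustration only, the sketch below maps the two axes of Figure 2 to the four categories; the axis-to-quadrant assignment follows the reconstruction of the figure given above, and the function and argument names are assumptions rather than part of the Accenture model.

```python
def classify(rate_of_change: str, competitive_basis: str) -> str:
    """Map the two Figure 2 axes to a quadrant.

    rate_of_change: "slow" or "fast"
    competitive_basis: "efficiency" or "differentiation"
    """
    quadrants = {
        ("slow", "efficiency"): "Efficient, Predictable Operators",
        ("slow", "differentiation"): "Information Integrators",
        ("fast", "efficiency"): "Responsive Solution Providers",
        ("fast", "differentiation"): "New Capabilities Enablers",
    }
    return quadrants[(rate_of_change, competitive_basis)]

print(classify("fast", "efficiency"))  # -> Responsive Solution Providers
```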
6 SPECIFIC IT GOVERNANCE IMPLEMENTATION SCENARIOS
6.1 Implementing Governance in SOA
Effective governance is regarded as a key success factor in implementing and maintaining a Service Oriented Architecture (SOA). Governance efforts in such an approach will usually take the form of data repositories containing specifications and configuration details of integrated components, allowing for their effective monitoring and manipulation. These records contain relationship and dependency information and are typically kept not only on physical entities but also on services, processes, workflows and any other objects that may exist [12]. Registries can become the governance core of SOA, providing means to formalise ownership, configuration management and compliance efforts [13]; a minimal sketch of such a registry record is given at the end of this section.
6.2 Enhancing Governance in ERP Systems
In his article "IT Governance: Maximizing the Business Investment", Stolovitsky [14] identifies the importance of IT governance in organisations not only as a way to achieve better direction and control but also as a means of assisting compliance with SOX and other governance standards (e.g. OPM3, CMMI, ITIL). The author recognises that although major ERPs may offer some capability in these areas, their strength lies in billing services and project accounting functionality. In response, some project portfolio management (PPM) vendors have taken the opportunity to differentiate their products by offering best-of-breed functionality for achieving IT governance and assisting in compliance efforts. Niche vendors that focus their positioning strategy on IT governance capability include Computer Associates' Niku, ProSight, Pacific Edge, Augeo Software, PlanView, and Mercury Interactive, among others. These products offer a solution for companies whose priorities lie in IT governance and compliance rather than in improving the billing services and project accounting that are fully addressed by major ERP vendors [14].
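Relating to the SOA registries described in Section 6.1, the following is a minimal sketch of the kind of record such a repository might hold for a governed object; all field names and values are illustrative assumptions, not taken from any particular registry product.

```python
from dataclasses import dataclass, field
from typing import List, Dict

@dataclass
class RegistryEntry:
    """One governed object in an SOA registry: a service, process, or workflow."""
    name: str
    object_type: str                      # "service", "process", "workflow", ...
    owner: str                            # formalised ownership for accountability
    version: str
    configuration: Dict[str, str] = field(default_factory=dict)
    depends_on: List[str] = field(default_factory=list)   # names of other registry entries

# Example: a work-order service that depends on an asset registry service
entry = RegistryEntry(
    name="WorkOrderService",
    object_type="service",
    owner="Maintenance Planning",
    version="1.3",
    configuration={"endpoint": "https://example.invalid/workorders", "sla": "99.5%"},
    depends_on=["AssetRegistryService"],
)
print(entry.depends_on)
```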
7 IMPROVING GOVERNANCE
7.1 Effective Governance Mechanisms
In 2004 a study was conducted to determine the most effective governance mechanisms; the results are based on the perceptions of more than 250 Chief Information Officers. The mechanisms were grouped into decision-making structures, alignment processes and communication approaches [8]. CIOs identified having specific roles to address the IT-business relationship as the most effective decision-making structure to support governance. The most highly regarded alignment process was effective tracking of IT project progress and resource consumption. Having an Office of the CIO or an IT governance officer obtained the highest score as the best governance communication approach [15].
7.2 Defining Roles and Relationships in Governance
Even when a governance model has been successfully implemented, issues will naturally appear and changes will be required as the organisation evolves. A key factor in addressing such issues is to create a shared understanding of what constitutes governance. In addition, the related responsibilities and the way in which organisational entities relate to governance must be
clear. To assist in this process, Gartner has created an 'IT Governance Relationship Model' (Gerrard, 2006). The specifics of the model are outside the scope of this paper, but its purpose is to assist in communicating these relationships and responsibilities so as to strengthen communication and accountability. Gartner makes a clear case for the need to model, communicate and act upon the roles and relationships involved in an organisation's IT governance model.
8 CONCLUSION
IT governance is a very broad concept and is hard to describe, since its meaning in terms of actions depends on each organisation. Despite the significant benefits associated with it, it is important to be prepared for equally significant challenges; being aware of the complexities of IT governance beforehand is therefore critical. The broad array of available frameworks is very useful; however, determining the right one, or the right combination, for an organisation remains a daunting task. Significant planning and analysis are required for a successful implementation, but it is possible to leverage existing models in order to fast-track this process and avoid common mistakes. It is also possible to leverage existing investments when deploying an IT governance strategy. ERPs, for example, can provide some capabilities; however, it is important to fine-tune or complement them with additional tools in order to guarantee efficient IT governance processes. In other words, IT governance does not come off the shelf. The IT strategy currently followed by an organisation must be taken into consideration in any effective implementation; this paper has presented some specific considerations for using the SOA paradigm together with IT governance practices. Post-implementation activities are a must in any IT governance strategy. Putting the framework in place is a great milestone, but efforts must then be made to allow the model to change and grow organically. Some key recommendations in this area, based on Gartner consultancy reports (i.e. role and relationship modelling), have been included. At a time when the broader asset management community is concerned with the nexus between IT and operational technologies, IT governance is an area that should be given its due. IT governance is a project that, paradoxically, benefits from applying its own principles to itself: understand what IT governance is all about, commit to it, and put it into practice from the first stages of the IT governance implementation strategy through to its maintenance and continuous improvement processes.
9 REFERENCES
1 Haider, A 2009, 'Value Maximisation from Information Technology in Asset Management – A Cultural Study', 2009 International Conference of Maintenance Societies (ICOMS), 2-4 June, Sydney, Australia.
2 Walsham, G 1995, 'Interpretive Case Studies in IS Research: Nature and Method', European Journal of Information Systems, Vol. 4, No. 2, pp. 74-83.
3 Schienstock, G 1999, 'Information society, work and the generation of new forms of social exclusion' (SOWING): First Interim Report (Literature Review), Tampere, Finland, accessed online on May 30, 2009, at http://www.uta.fi/laitokset/tyoelama/sowing/frontpage.html
4 Schwartz, KD 2007, 'ABC: An Introduction to IT Governance', accessed online on April 16, 2009, at http://www.cio.com/article/111700/IT_governance_Definition_and_Solutions
5 CIO 2005, 'What does governance mean?', accessed online on April 14, 2009, at http://www.cio.com/article/2687/What_Does_governance_Mean_
6 Simonsson, M and Johnson, P 2005, 'Defining IT governance – a consolidation of literature', EARP working paper, accessed online on April 14, 2009, at http://www.ics.kth.se/Publikationer/Working%20Papers/EARP-WP-2005-MS-04.pdf
7 Gerrard, M 2005, 'CIOs reveal their issues with IT governance', Gartner, accessed online on April 16, 2009, at http://www.gartner.com/DisplayDocument?id=486308&ref=g_sitelink
8 Ross, PWJ 2004, 'Recipe for Good Governance', accessed online on April 16, 2009, at http://www.cio.com/article/29162/Recipe_for_Good_governance
9 Melnicoff, RM, Shearer, SG, and Goyal, DK 2005, 'Is there a smarter way to approach IT governance?', Outlook Journal, accessed online on May 21, 2009, at http://www.accenture.com/Global/Research_and_Insights/Outlook/By_Alphabet/IsGovernance.htm
10 Dragoon, A 2003, 'Governance: Deciding Factors', CIO, accessed online on April 14, 2009, at www.cio.com
11 Brown, AE, and Grant, GG 2005, 'Framing the frameworks: a review of IT governance research', Communications of the Association for Information Systems, Vol. 15, pp. 696-712.
12 Kahimbaara, E 2009, 'Key Elements of an SOA Governance Strategy', accessed online on May 21, 2009, at http://www.cio.com/article/487511/Key_Elements_of_an_SOA_governance_Strategy
13 Knorr, E, and Rist, O 2008, 'Steps to SOA No. 6: Tackling Governance', accessed online on May 21, 2009, at http://www.cio.com/article/431213/Steps_to_SOA_No._Start_Tackling_governance_
14 Stolovitsky, N 2005, 'IT Governance: Maximizing the Business Investment', accessed online on May 29, 2009, at http://evaluation.cio.com/search/for/it-governance.html
15 CIO 2004, 'Effective IT Governance Mechanisms', accessed online on May 29, 2009, at http://www.cio.com/article/32337/Effective_IT_governance_Mechanisms
Acknowledgement
Financial support from the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM) for this work is gratefully acknowledged.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CREATING AN ASSET REGISTRY FOR RAILWAY ELECTRICAL TRACTION EQUIPMENT WITH OPEN STANDARDS Avin Mathew a, Michael Purser a, Lin Ma a, David Mengel b
a CRC for Integrated Engineering Asset Management, Queensland University of Technology, Brisbane, Australia
b QR Network, Queensland Rail, Brisbane, Australia
An asset registry arguably forms the core system that needs to be in place before other systems can operate or interoperate. Most systems have rudimentary asset registry functionality that stores assets, relationships, or characteristics, and this leads to different asset management systems storing similar sets of data in multiple locations in an organisation. As organisations have been slowly moving their information architecture toward a service-oriented architecture, they have also been consolidating their multiple data stores to form a "single point of truth". As part of a strategy to integrate several asset management systems in an Australian railway organisation, a case study on developing a consolidated asset registry was conducted. A decision was made to use the MIMOSA OSA-EAI CRIS data model as well as the OSA-EAI Reference Data in building the platform, due to the standard's relative maturity and completeness. A pilot study of electrical traction equipment was selected, and the data sources feeding into the asset registry were primarily diagram-based. This paper presents the pitfalls encountered, approaches taken, and lessons learned during the development of the asset registry. Key Words: asset registry; asset management; railway electrical traction equipment; MIMOSA OSA-EAI; structured data
1 INTRODUCTION
An asset registry is typically one of the first data sets constructed by an organisation that conducts any form of asset management. An asset registry stores fundamental data about assets, including their models and makes, specifications and commissioning, location history, and relationships with other assets. An asset registry arguably forms the core system that needs to be in place before other systems can operate or interoperate. Most information systems have rudimentary asset registry functionality that allows an identifier to be allocated to an asset for cross-referencing the data collected by the system. For example, work management systems allow asset relationship structures to be captured so that maintenance costs can be "rolled up" from individual components to an overall asset or system. Condition monitoring systems use OEM (original equipment manufacturer) model specifications, such as component natural frequencies, as part of their analytical procedures. Consequently, an organisation has asset registry data scattered across multiple locations, with different users and maintainers. If synchronisation procedures and triggers are not correctly configured, these multiple occurrences of data become a point of concern when one occurrence changes but not another: different versions of the data now exist, confusing the organisation and wasting resources as users try to locate the most correct data. As organisations have been slowly moving their information architecture toward a service-oriented architecture (SOA), where possible, they have also been consolidating their multiple data stores. A "single point of truth" can then be formed, allowing data sets to be queried as a service and, in turn, supporting data replication and synchronisation procedures. As part of a strategy to integrate several asset management systems in Queensland Rail, an Australian railway organisation, a case study on developing a consolidated asset registry was conducted. As the consolidated asset registry would not be tied to any particular system, a decision was made to use the MIMOSA (Machinery Information Management Open Systems Alliance) OSA-EAI (Open Systems Architecture for Enterprise Application Integration) [1] data model, as well as the OSA-EAI Reference Data, in building the platform, due to its relative maturity and completeness. A pilot study of electrical traction equipment was selected, and the data sources feeding into the asset registry were primarily diagram-based.
2 BACKGROUND INFORMATION
2.1 Asset Registry
An asset registry forms the core master data for asset management information systems in that it describes assets, relationships between assets, models, locations, and specifications. This is distinct from asset management reference data, which includes asset types, relationship types, model types, location types and specification types. While there is little direction on what constitutes an asset registry (e.g. should it include measurement locations or common work instructions?), consensus remains on basic asset data (names, types, relationships, and specifications). A subset of an asset registry is encapsulated in almost all asset management systems due to its fundamental nature, although the scope of the encapsulated registry is partial toward the functionality of the system. For example, a SCADA system would typically not store model-related data, as it is not relevant to any of its processes. The duplication of data among disparate systems can lead to different versions of the truth if updates to registry data are not synchronised.
2.2 MIMOSA OSA-EAI
The OSA-EAI model covers five asset management areas: registry, condition, maintenance, reliability, and capability forecast management. As seen in Figure 1, the registry forms the core for the other four areas. It contains a logical breakdown of an enterprise into sites, segments, and assets, as well as specifications, networks, agents, and resources. The terminology used is defined by the MIMOSA OSA-EAI Terminology Dictionary.
Figure 1. MIMOSA OSA-EAI areas and Open Object Registry Management breakdown
The MIMOSA OSA-EAI Common Relational Information Schema (CRIS) is a data model that originates from a relational model and is thereby readily implementable through a relational database management system. SQL scripts are provided to create the tables, attributes, and relationships for a CRIS-based database, in addition to inserting the OSA-EAI Reference Data. As these scripts are not segmented per area (i.e. there is no script designated for registry information alone), removal of unneeded tables and data is necessary if a lightweight database is intended. Modifications such as adding sequence tables, indexes, statistics, and stored procedures should also be considered before the database can be used for production-level purposes.
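As an illustration of the trimming described above, the sketch below creates a registry-only SQLite database containing a handful of simplified, CRIS-like tables and one production-style index. The column definitions are assumptions made for illustration and are not the actual DDL shipped with the standard.

```python
import sqlite3

# Minimal, registry-only subset; real CRIS tables have many more columns and keys.
DDL = """
CREATE TABLE segment_type (segment_type_code INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE segment (segment_id INTEGER PRIMARY KEY, name TEXT,
                      segment_type_code INTEGER REFERENCES segment_type);
CREATE TABLE asset (asset_id INTEGER PRIMARY KEY, name TEXT, serial_no TEXT);
CREATE TABLE asset_on_segment (asset_id INTEGER REFERENCES asset,
                               segment_id INTEGER REFERENCES segment,
                               placed_utc TEXT, removed_utc TEXT);
CREATE INDEX idx_aos_segment ON asset_on_segment (segment_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO segment_type VALUES (1, 'Isolator, Double-Break, 132kV')")
conn.commit()
print(conn.execute("SELECT name FROM segment_type").fetchall())
```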
3 REQUIREMENTS AND DESIGN
The initial requirements of the case study were minimal in that Queensland Rail had few expectations of the asset registry apart from forming a single source of truth for asset information. The registry needed to contain all available information from the diagrammatic sets of data (described in Section 3.2) as well as from a work management system. It also needed the capability to connect via web services to other systems in the organisation (the details of which are not discussed in this paper). A standards-based data model was also preferred, with the two choices being MIMOSA OSA-EAI and ISO 15926 (as MIMOSA OSA-EAI is a continually improving standard, this discussion refers to version 3.2.1, the latest version at the time of writing). The decision to go with the OSA-EAI was based on implementation maturity, successful case studies, knowledge, and capability.
3.1 OSA-EAI data model
As the OSA-EAI contains a large number of entities, many unrelated to asset registry data, identification of the relevant entities began with a study of the organisation's related data sources and business processes. A mapping between the organisation's requirements and the functionality offered by the OSA-EAI was developed, and Table 1 lists the CRIS entities selected for use within the case study. Reference type entities (those that contain an enumeration of values, typically from the OSA-EAI Reference Data) are distinguished from data entities (those that contain instances and use the reference types). Reference type entities often have slowly changing data, in contrast to data entities, which have relatively faster changing data (changes include additions, updates, and removals).
Table 1: Entities within the OSA-EAI used in the data model
Entity Name | Descriptive Name | Type
as_chr_dat_type | Asset Character Data Type | Reference Type Entity
as_num_dat_type | Asset Numeric Data Type | Reference Type Entity
asset_type | Asset Type | Reference Type Entity
blob_content_type | BLOB Content Type | Reference Type Entity
blob_data_type | BLOB Data Type | Reference Type Entity
network_conn_type | Network Connection Type | Reference Type Entity
network_type | Network Type | Reference Type Entity
segment_type | Segment Type | Reference Type Entity
enterprise | Enterprise | Data Entity
site | Site | Data Entity
site_database | Site Database | Data Entity
asset | Asset | Data Entity
asset_num_data | Asset Numeric Data | Data Entity
asset_chr_data | Asset Character Data | Data Entity
asset_blob_data | Asset BLOB Data | Data Entity
asset_child | Asset Child (Sub-Components) | Data Entity
segment | Segment | Data Entity
segment_child | Segment Child (Digraph Edge) | Data Entity
asset_on_segment | Asset on Segment History | Data Entity
network | Network | Data Entity
as_network_connect | Asset Network Connection | Data Entity
sg_network_connect | Segment Network Connection | Data Entity
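To give a concrete feel for how a few of the entities in Table 1 fit together, the following sketch registers a nominal asset, places it on a segment, and records child and network relationships. The class and attribute names are simplified assumptions rather than the full CRIS definitions, the instance names are invented, and the distinction between the relationship types is discussed in the paragraph that follows.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Segment:            # a location/container where an asset performs a function
    name: str
    segment_type: str
    children: List["Segment"] = field(default_factory=list)       # composite (segment_child)
    network_links: List["Segment"] = field(default_factory=list)  # aggregate (sg_network_connect)

@dataclass
class Asset:              # a physical, serialised instance of a model
    name: str
    serial_no: Optional[str] = None

@dataclass
class AssetOnSegment:     # location history (asset_on_segment)
    asset: Asset
    segment: Segment
    placed: str           # ISO date; removal date omitted for brevity

mast = Segment("Mast 131-045", "Structure, Mast")
isolator_bay = Segment("Isolator 131-045-S1", "Isolator, Double-Break, 132kV")
mast.children.append(isolator_bay)        # removing the mast removes the isolator bay
feeder = Segment("Feeder 131E-12", "Wire, Feeder")
mast.network_links.append(feeder)         # removing the mast does not remove the wire

history = [AssetOnSegment(Asset("ISO-0987", serial_no="SN-1234"), isolator_bay, "2009-05-01")]
print(history[0].segment.name)
```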
It is important to understand the distinction between assets and segments. Assets are physical instances of a model and can be tagged with a serial number, while segments are a location or container where an asset can perform a function. Placing an asset on a segment through the asset_on_segment entity (for it to perform its designated function) captures the location history of the asset and segment. When modelling the asset system from diagrams, segments are the primary instances identified and, in the absence of any historical records of when the placement occurred, a nominal asset can be placed on segments (for linking other data such as specifications). It is also important to understand the two types of relationships that can occur between asset and segment instances. Child-type relationships are composite relationships – removing the parent instance will remove associated child instances. Network-type relationships are aggregate relationships – removing the 'parent' instance will not remove any associated network instances. While the OSA-EAI CRIS covers the majority of the information structures required for most scenarios, there are certain data that cannot be stored due to the structure, or a convoluted process is required to store such data (where universally applicable, additional data items can be submitted to MIMOSA for consideration in future releases of the standard). Provision is made to allow additional user-defined "local" attributes to be passed between systems when using the OSA-EAI XML formats for exchange of non-globally recognised data; however, no similar directives are given for CRIS databases. Thus, additions to the data model were made to overcome these difficulties (shown in Table 2).
Table 2: Entities not within the OSA-EAI used in the data model
Entity Name | Descriptive Name | Type
segment_blob_coordinate | Segment BLOB Coordinates | Data Entity
coordinate_system_type | Coordinate system | Reference Type Entity
blob_data_coordinate_system | Coordinate system used for binary data types | Reference Type Entity
The segment_blob_coordinate entity was added to allow the storage of asset positioning data within drawing documents. As there can be many assets within a single drawing document, the entity can have a one-to-many relationship with the segment_blob_data (asset drawing representation) entity. Despite all electrical traction drawings being 2D, the entity allows for three dimensions (for 3D CAD drawings), with the option of setting the z-coordinate to null. A coordinate system is relative and is hence based on the document type. For example, PDF documents have an origin of (0, 0) at the top-left-most point of the document, with coordinates increasing towards the bottom-right-most point; other documents have the origin at the centre of the document. To allow for mapping between document types (blob_data_type) and coordinate system types, two entities were added to provide this functionality (coordinate_system_type and blob_data_coordinate_system); a simple illustration of such a mapping appears after Table 4.
3.2 Sources of data
Two data source types were identified as relevant for electrical traction asset registry data: diagrammatic information and existing asset registries found in information systems. As diagrams represent the as-designed/as-built status of the assets, diagrammatic data would always dominate any information systems registry data, particularly as information systems data could be stale if updates from diagrams had not been propagated.
3.2.1 Diagrams
Two diagram types, isolation diagrams and wiring layout diagrams, provide the main source of data for the case study. While other diagrams describing the overhead electrification system are available, the data required by the asset registry are fully contained within these two diagram types. Isolation diagrams are a not-to-scale logical representation of the electrical network and are used to indicate the electrical components that are affected upon switching an isolator on or off.
The types of data available from an isolation diagram are shown in Table 3. Wiring layout diagrams show the overhead wire composition and their connection to the various masts located along the railway track. The types of data available from a wiring layout diagram are shown in Table 4. Depending on the original source of the diagram, data can range from unstructured (scanned hand-drawn documents) to structured (CAD-designed). Extracting information such as the asset instance, name, type, location, and drawing coordinates is relatively simple from structured diagram sources. Depending on the design principles instituted by the organisation, relationships and meterages may also be ascertained from these diagrams. In contrast, some seemingly structured diagrams may be less than suitable when the CAD system has been used as a drawing tool as opposed to an asset-based, object-oriented design tool. For example, when inserting a template shape representing an asset graphic and an asset text label onto the canvas, not linking the two objects can lead to difficulties in establishing a relationship between the graphic and textual data if other assets and labels are nearby on the canvas. Moving toward structured diagrams decreases the amount of manual effort required to subsequently extract data from the document. While image recognition techniques can be used to automate data extraction from diagrams, the process is still laborious due to the immaturity of the recognition technology. Further problems arise when configuring recognition rules for diagrams whose conventions have evolved over time (e.g. changing positions and content of a title box).
Table 3: Data available from isolation diagrams
Information type | Corresponding data model entities
Asset instance and name | asset, segment
Asset type | asset, segment, asset_type, segment_type
Asset relationships | segment, segment_child, sg_network_connect
Asset location | asset_on_segment
Kilometerage and meterage specifications | sg_num_dat_type, segment_num_data
Asset associated drawing and coordinates | segment_blob_data, segment_blob_coordinate
Table 4: Data available from wiring layout diagrams
Information type | Corresponding data model entities
Asset instance and name | asset, segment
Asset type | asset, segment, asset_type, segment_type
Asset relationships | segment, segment_child, sg_network_connect
Asset location | asset_on_segment
Kilometerage, meterage, and wire-structure connection height specifications | segment_num_data
Asset associated drawing and coordinates | segment_blob_data, cieam_sg_blob_coord
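Relating to the drawing coordinates captured from these diagrams (segment_blob_coordinate, Section 3.1), the following sketch converts a point from a top-left-origin document coordinate system, as described for PDF documents above, to a centre-origin system. This is the kind of mapping the coordinate_system_type and blob_data_coordinate_system entities are intended to support; the function name and page dimensions are illustrative assumptions.

```python
def top_left_to_centre(x: float, y: float, width: float, height: float) -> tuple:
    """Convert a point from a top-left-origin system (y increasing downwards)
    to a centre-origin system (y increasing upwards)."""
    return (x - width / 2.0, height / 2.0 - y)

# Example: an isolator symbol located at (100, 50) on an 842 x 595 point page
print(top_left_to_centre(100.0, 50.0, 842.0, 595.0))
```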
3.2.2 Existing registries
There is a plethora of information systems that can store asset registry-related data, such as financial, work management, process control, and reliability systems. Each can contain a different aspect of the asset registry information required. For example, financial and work management systems might contain manufacturer, model, make, and cost data; process control systems might contain operational measurement thresholds; and reliability systems might contain a detailed reliability block diagram of the assets. Queensland Rail had implemented a work management system that contained a list of assets (instances and names), meterage, manufacturers and models. The data could be output to a flat Microsoft Excel file for analysis and comparison with the diagrammatic information.
3.3 Naming
Naming is fundamental to an asset registry as it allows for identification of elements in the registry. The identification process using names is more important for users than for computers (the latter relying on primary keys), and it is important to note that the behaviour of primary keys and names in the OSA-EAI can differ. For example, the OSA-EAI specifies that the primary key should never change throughout the lifetime of the asset, while the name is permitted to change over time. This scenario could occur where a business rule forces an asset's name to be based on its segment hierarchy; consequently, the name will change once the asset is moved. However, due to the immutable primary key, the history of the asset can remain intact.
3.3.1 Asset instance naming
Asset naming is an opinionated topic as there is no single correct method of naming. Names often include asset type, function, and location information (e.g. B4-C-P3 might represent condensation pump 3 at building 4). Naming should be standardised across assets and flexible enough to incorporate new scenarios such as new asset types or hierarchies. Some assets, due to their type, have names that are derived from other assets, and for consistent naming, conventions must be enforced. For example, neutral sections derive their names from their surrounding lines; with lines named 131E and 141A, many conventions could be used (e.g. smallest numbered track first, position based, or direction of travel). For this case study, the direction of travel is used as the convention, such that if a train travels on line 131E before proceeding to 141A, the neutral section is named 131E-141A. It is often the case that segments are allocated a name that concatenates the parent segment name with the segment's own name, showing the full segment hierarchy (e.g. B4-C-P3 in the example above). While having the full hierarchy in the name allows quick identification of the type and location of the asset, this should be provided as a function of the asset registry software by traversing the asset/segment child relationships, as opposed to storing the hierarchy in the data itself, which introduces data redundancy (a small illustration of deriving such names appears at the end of this section). MIMOSA OSA-EAI follows the same thought pattern, specifying two locations for names: the user_tag_ident and name fields, both of which are intended for naming the asset or segment itself, rather than its hierarchy.
3.3.2 Asset type naming
Asset type names are usually found in the legend section of diagrams. For the case study, all of the asset types from the legends were inserted directly into segment_type and asset_type.
New records were created for all asset types from the legends as none had equivalent asset types in the OSA-EAI Reference Data.
Figure 2. Isolation diagram legend extract
The naming schema used for the asset types follows the OSA-EAI pattern of separating major characteristics with commas, with the most common characteristic on the left. From the legend extract in Figure 2, 132kV Double-Break Isolator becomes
Isolator, Double-Break, 132kV; Circuit Breaker remains as Circuit Breaker; and GIS Circuit Breaker becomes Circuit Breaker, GIS. (For the 132kV double-break isolator, it can be argued that 132kV is a specification of the asset rather than part of the asset type; it was decided to include the voltage in the type name itself as the asset type was such a common occurrence. However, any submissions to MIMOSA for inclusion of new asset type reference data would be made under the universal case, i.e. without the voltage rating in the name.)
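To illustrate the point in Section 3.3.1 that the full hierarchical name should be derived by traversing child relationships rather than stored, here is a minimal sketch; the Segment class and parent links are simplifying assumptions standing in for the OSA-EAI segment_child traversal.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    name: str
    parent: Optional["Segment"] = None

def display_name(segment: Segment, sep: str = "-") -> str:
    """Derive the full hierarchy name (e.g. B4-C-P3) at query time,
    so that only the segment's own name needs to be stored."""
    parts = []
    node: Optional[Segment] = segment
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return sep.join(reversed(parts))

building = Segment("B4")
condensation = Segment("C", parent=building)
pump3 = Segment("P3", parent=condensation)
print(display_name(pump3))  # B4-C-P3
```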
3.4 Specifications
Specifications (otherwise known as characteristics, attributes, or configuration) are the relatively static properties of assets and segments. In the case of assets, these properties might include dimensions, weight, and colour, with more technical properties including speed, flow, or voltage. Specifications can vary across a range (e.g. a dimension tolerance), or can be a single figure (e.g. a minimum RPM).
3.4.1 Meterage specifications for discrete and linear assets
Meterage specifications indicate the distance between an asset and a defined point (e.g. a station). Such specifications are used for identification and field location purposes, although using meterage for the latter can be problematic and can introduce data maintenance complexities when significant modifications, such as a detour, have been made to a railway track. The use of a meterage value alone for locating assets is a simplification when applied to a linear network such as a railway. An addition to the data model could include entities and information types for referencing asset locations against location referencing systems (e.g. km-post reference markers and meterage along the line), thereby enabling utilisation of the dynamic segmentation capabilities of a GIS (Geographic Information System). Assets can have multiple references to their location: their physical location at a point in time, and their location reported against defined network configurations using a specified location referencing method, e.g. an LRS (linear referencing system). The transformation from a physical location to a reported location (and between linear referencing systems) would likely be performed via a GIS web service. For non-linear, discrete assets, meterage is recorded as a single point, while linear, continuous assets use start and end meterage specifications. Thus three new segment numeric data types (sg_num_dat_type) were employed: "Meterage", "Meterage, Start", and "Meterage, End". Despite the name of the numeric data type, kilometres are used as the unit type for the segment numeric data (segment_num_data) entries in order to align the meterage values with the diagrams. As with derived asset naming, linear assets with a start and end meterage need a convention that indicates which part of the asset signifies the start and the end. For example, with wire assets, the start specification was taken from the first connecting structure (taking into account track direction) and the end specification from the second connecting structure.
3.4.2 Inheritance of meterage specifications
Meterage specifications (in addition to other specifications) have the property that they can be applicable to child assets. A switch connected to a structure will often be allocated the same meterage specifications because it is "attached" to the structure. Although this is more a characteristic of the asset registry software than of the data model, these "attached" assets should inherit the meterage specifications rather than store them, to avoid data redundancy.
3.4.3 Asset relationship specifications
For specifications that result from a relationship between assets, there are multiple potential locations to store the data: against either of the assets or against the relationship between the assets.
For example, the connection heights of a single wire held up by two structures can be associated either with the wire segment (distinguishing between the connection height at the first and second structures, the order depending on track direction), with the structure, or with the relationship between the structure and the wire. While all three result in the same data being captured, the choice depends on the usage of the data. The OSA-EAI currently does not allow the last method to be used, as numeric data can only be stored against an asset and not against its relationship with another asset. In this particular example, a decision was made to store such height specifications against the structure asset.
3.5 Relationships
Asset relationships are the physical or conceptual connections between two or more assets. They are often used for grouping in asset navigation and for system-based calculations (e.g. system reliability or health). Within the OSA-EAI, assets and segments can be related through two means: parent/child relationships through the asset/segment_child entities, and network relationships through the as/sg_network_connect entities.
Parent/child relationships are considered compositional in nature; that is, child assets constitute the parent asset, and physically moving the parent asset would also move the child assets. Network relationships are considered aggregate in nature; that is, physically moving one asset would not physically move a networked asset. While networks can also represent parent/child relationships by specifying the appropriate network type, the dedicated entities for parent/child relationships simplify the required data. For equipment mounted on mast structures, including autotransformers, switches, etc., asset/segment child hierarchies are used, as moving the structure would entail moving the supported equipment. Mast structures can connect to the main network both via their subcomponents and directly, when used as a support mechanism.
3.5.1 Wire relationships
Wires form the largest number of assets for the electrical traction system due to the discretisation of the linear assets. The electrical section is broken into different lines, and a line contains wires of different types, including catenary, contact, feeder, earth, balance weight anchor, and mid-point anchor wires. These wires are separated across sections, with two grounding anchors at either end and the wire strung between mast structures. This type of discretisation, while greatly increasing the number of assets, allows data (e.g. maintenance actions) to be recorded against the individual wire segments, rather than against the whole line or section. Translating this to an OSA-EAI structure results in line assets/segments, which are composed of section assets/segments, which are in turn composed of section children assets/segments (see Figure 3). The relationships between wires and structures (both mast and anchor) are of network types. The track direction determines the 'output asset': a structure is connected to a wire, that wire is connected to another structure, that structure is connected to another wire, and so on.
Figure 3. Composition of wires
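To make the wire composition and connection pattern concrete, the sketch below shows one way the line, section and wire-segment hierarchy and the alternating structure-to-wire network connections could be represented. It is only an illustrative sketch: the table and column names (asset, asset_child, network_connect, input_id/output_id) are simplified, hypothetical stand-ins for the corresponding OSA-EAI entities, whose real schemas carry many more keys and reference columns.

```python
import sqlite3

# Simplified, hypothetical stand-ins for the OSA-EAI asset, asset_child
# and network_connect entities (the real schemas have many more columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset           (asset_id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE asset_child     (parent_id INTEGER, child_id INTEGER);               -- compositional
CREATE TABLE network_connect (network TEXT, input_id INTEGER, output_id INTEGER); -- aggregate
""")

def add_asset(name, type_):
    """Insert an asset row and return its generated primary key."""
    return conn.execute("INSERT INTO asset (name, type) VALUES (?, ?)",
                        (name, type_)).lastrowid

# Compositional hierarchy: a line is composed of sections, and each
# section is composed of discretised wire segments.
line    = add_asset("Line 1", "Line")
section = add_asset("Line 1 / Section A", "Section")
wire1   = add_asset("Contact wire A-01", "Contact wire")
wire2   = add_asset("Contact wire A-02", "Contact wire")
conn.executemany("INSERT INTO asset_child VALUES (?, ?)",
                 [(line, section), (section, wire1), (section, wire2)])

# Network (aggregate) relationships alternate along the track direction:
# structure -> wire -> structure -> wire ...
mast1 = add_asset("Mast 101", "Mast structure")
mast2 = add_asset("Mast 102", "Mast structure")
conn.executemany("INSERT INTO network_connect VALUES ('overhead', ?, ?)",
                 [(mast1, wire1), (wire1, mast2), (mast2, wire2)])
```

Maintenance actions can then be recorded against an individual wire segment, while a query over the child table still rolls the data up to the section or line level.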
3.5.2 Conceptual relationships
Asset relationships need not be limited to physical connections between assets; they can also consist of conceptual groupings. This is enabled through the OSA-EAI network entities, which allow an unlimited number of networks to be created. Conceptual relationships are useful for referring to all assets in a particular location, where an adjacent location might contain some of the assets from the first location. In the case study, this functionality was used to create isolation networks, which indicate which assets are contained between two neutral sections, and business-defined railway line groupings.
4 IMPLEMENTATION AND VALIDATION
An OSA-EAI database was created from the SQL scripts provided by the standard, and all identified assets and related data were inserted into it. Because the data are spread across relational structures, they have little meaning on visual inspection and require manipulation into a human-readable form. Therefore, a graphical user interface (GUI) was developed to provide a view of the asset registry data. Names were matched to primary keys, lists were created from type tables, and hierarchical structures (e.g. asset children) could be easily viewed and traversed. The tool also allows searching and filtering of assets by name, type, location, and specification, and visualises asset relationships by rendering the underlying graph structure.
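As an illustration of the kind of lookup such a registry browser performs (a minimal sketch, not the GUI described above), the functions below resolve primary keys to names and walk the parent/child hierarchy recursively, reusing the simplified asset and asset_child tables assumed in the earlier sketch.

```python
def children_of(conn, asset_id):
    """Return (asset_id, name, type) rows for the direct children of an asset."""
    return conn.execute(
        "SELECT a.asset_id, a.name, a.type "
        "FROM asset_child c JOIN asset a ON a.asset_id = c.child_id "
        "WHERE c.parent_id = ?", (asset_id,)).fetchall()

def print_tree(conn, asset_id, indent=0):
    """Recursively print the compositional hierarchy rooted at asset_id."""
    name, type_ = conn.execute(
        "SELECT name, type FROM asset WHERE asset_id = ?", (asset_id,)).fetchone()
    print("  " * indent + f"{name} [{type_}]")
    for child_id, _name, _type in children_of(conn, asset_id):
        print_tree(conn, child_id, indent + 1)

# Example: render the 'Line 1' composition as an indented tree,
# using the conn and line objects created in the previous sketch.
print_tree(conn, line)
```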
The GUI provides a read-only view of the data; all data insertion was performed via external programs. Data validation was largely visual: easier perusal of the data made validation easier, particularly for non-technical personnel. Extending the GUI to allow data insertion would permit further validation rules to be implemented, as sketched below.
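One such rule, shown below as a hypothetical example (it is not a rule specified in the paper), checks that every wire segment in the overhead network is connected to exactly two structures, again against the simplified tables assumed in the earlier sketches.

```python
def wires_with_bad_connections(conn):
    """Flag wire segments not connected to exactly two assets in the
    'overhead' network (a hypothetical consistency rule)."""
    return conn.execute(
        "SELECT a.asset_id, a.name, COUNT(n.network) AS n_links "
        "FROM asset a "
        "LEFT JOIN network_connect n "
        "       ON n.network = 'overhead' "
        "      AND a.asset_id IN (n.input_id, n.output_id) "
        "WHERE a.type LIKE '%wire%' "
        "GROUP BY a.asset_id, a.name "
        "HAVING COUNT(n.network) != 2").fetchall()

# Report any wire segments that fail the rule.
for asset_id, name, n_links in wires_with_bad_connections(conn):
    print(f"Wire '{name}' (id {asset_id}) has {n_links} structure connection(s)")
```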
5 CONCLUSION
The idea of an asset registry is certainly not a new one; however, the move towards SOA has highlighted the need for a consolidated asset registry that provides a single source of truth. The MIMOSA OSA-EAI is a strong candidate for the asset registry's data model and reference data, although some challenges remain to be addressed by future versions. Combined with the standards-based interoperability XML specifications, the standard forms a viable platform on which asset management information systems and interoperability can be based.
6 REFERENCES
1 MIMOSA (2008) Open System Architecture for Enterprise Application Integration V3.2.1, from www.mimosa.org/downloads/44/specifications/index.aspx

Acknowledgements
This research was conducted within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government's Cooperative Research Centres Programme. The authors would like to acknowledge the assistance of Deborah Hirst from QR Network, Queensland Rail, and Dr Ken Bever from the Machinery Information Management Open Systems Alliance.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
CORPORATE EXECUTIVE DEVELOPMENT FOR INTEGRATED ASSETS MANAGEMENT IN A NEW GLOBAL ECONOMY Dr Rudolph Frederick Stapelberg BSc.Eng, MBA(Exec), PhD (Eng), DBA, Pr.Eng. Director of the Academy for Professional Education and Training, Australia.
The sudden shift from a worldwide boom economy to a recession in just under two years has caused a dramatic change in the direction corporate executives need to drive their organisations. The rapid slide from economic growth into recession on a global scale has been dramatic. This has prompted the need for a radical re-orientation of corporate strategic, tactical and operational planning and decision-making priorities to deal with the shift to a new global economy. However, a survey conducted in December 2008 by Booz and Company found that corporate executives are struggling to make the right moves in the current economic environment, with many wavering in their confidence of fulfilling the expected decision-making and leadership capabilities essential for navigating their companies through the crisis. The accepted past wisdom has been that economic downturns are the great incubators of innovation-driven strategies. But in today’s reality, innovation usually takes a backseat to survival. What happens in this situation is that innovative strategic vision and decision-making often tends to weaken, with a general shift towards more survivalist operational initiatives. In this regard, corporate assets, both current and physical, are significantly affected. Physical assets management is primarily about capital assets, the productive assets in which an organisation has invested, particularly during times of economic growth. This has become blurred however, and Integrated Asset Management has expanded to include strategic, tactical and operational planning as well as decision-making of a company’s total wealth. It has become a prime responsibility of corporate executives to get more from existing assets and to understand how these assets are performing through adopting key innovation and leadership initiatives in sustainable practices. This paper considers the growing interest world-wide, by corporate organisations as well as universities, of professional executive management development in general, and executive management of corporate assets in particular, in a new global economy that requires the essential capabilities of: establishing corporate strategic planning through innovation-driven, though risk-informed, decision-making; leadership in the establishment and implementation of organisational tactical and operational planning through knowledge-based decision-making; implementation of sustainable practices in preparation for a future competitive edge; as well as business knowledge transfer through the appropriate organisational training of a company’s human resource assets. Key Words: Global financial crisis, executive development, integrated asset management, corporate strategic planning, tactical planning, operational planning, decision-making, innovation, leadership, sustainable practice, human resources training. 1
INTRODUCTION
In 2008-2009 much of the industrialised world entered into a deep recession sparked by a financial crisis that had its origins in reckless lending practices involving the generation and distribution of mortgage debt and its related securities in the U.S.A. The years immediately before the 2008-2009 financial crisis were characterised by a combination of easy credit conditions, low risk premiums, aggressive lending practices and less disciplined risk management. The easy money conditions encouraged financial institutions and investors to adopt more leveraged strategies, leaving their capital bases more exposed to adverse risk. After house prices in the U.S.A. drastically started falling, and interest rates began rising, the least creditworthy (sub-prime) borrowers began defaulting. By March 2009, 13.6 million home owners (17%) in the U.S.A. were making monthly payments on houses worth less than they owed on their mortgages. In the fourth quarter of 2008, 15% of all homes and apartments were left unoccupied, causing losses to investors in mortgage-backed securities. The nature of the securities, such as collateralised
debt obligations (CDOs), made it difficult for investors, institutions and authorities to know the scale and dispersion of critical risk exposures and potential losses throughout the financial system (Access Economics, 2008). As losses mounted and trust in counterparties deteriorated, credit markets ceased functioning properly. The ensuing financial turmoil resulted in widespread financial slowdown and eventually in a global financial crisis — the biggest shock to global financial markets since the 1930s. Prices for many assets fell a long way. In the year to October 2008, world stock markets lost 40% of their value. The resulting loss of wealth hurt investors and caused many companies to fail, including a number of key financial intermediaries. U.S. government bailouts of large financial corporations such as the insurance giant, AIG, cost American taxpayers well over US$170 billion, much of which went to international banks on the other side of AIG’s financial trade. Following the collapse of Bear Stearns in March 2008, and Lehman Brothers in September 2008, a generalised loss of confidence between financial institutions triggered reactions akin to a ‘blackout’ in global financial markets, with a further carry-on effect to the manufacturing industry, such as the bankruptcies of auto manufacturing giants Chrysler and General Motors. Since October 2008 the global financial crisis led to the bankruptcy of many more financial institutions in the U.S.A. and in Europe, threatening the entire global financial system (Access Economics, 2008; TIME, March 2009). The financial turmoil that erupted in the United States had broadened to include non-bank financial institutions, and rapidly spread to the rest of the world. Asian countries, and in particular the manufacturing economies relying heavily on exports such as Hong Kong, South Korea, Singapore, Taiwan, as well as China, became export economies precariously perched on the expectation that western countries, and the U.S.A. in particular, would continue to buy more domestic products, television sets, computers and cars. However, plunging exports started affecting their growth rates. Year-on-year percentage changes in total exports caused a drop in their growth rates to negative values, such as –12% for Hong Kong, -33% for South Korea, -35% for Singapore, and –44% for Taiwan. China’s huge export driven economy also started to feel the impact of the economic slowdown in the U.S.A. and Europe, whereby the Chinese government cut key interest rates three times in less than two months in a bid to spur economic expansion. In November 2008 a national stimulus package equivalent to US$586 billion was announced by the government of the People's Republic of China in its biggest move to stop the global financial crisis from hitting the world's third largest economy. The announcement of the stimulus package temporarily sent financial markets up around the world, however economic analysts claimed that China was actually in recession (The Times, January 2009). The Organisation for Economic Co-operation and Development (OECD) is an international organisation of 30 countries (excluding the Russian Federation and most Asian countries, except Japan) that accept the principles of representative democracy and free-market economy. Most OECD members are high-income economies and are regarded as developed countries. By late 2008, most OECD economies were on the verge of a protracted recession of a magnitude not experienced since the early 1980s. 
As a result, the number of unemployed in OECD countries could rise by 8 million over the next two years. The International Labour Organisation has forecast that the number of unemployed people worldwide could increase by more than 50 million in 2009 as the global recession intensifies. This outlook represents a downward revision from the first quarter of 2009, and many of the downside risks previously identified had materialised, placing greater pressure on corporate executives to reconsider their planning, decision-making, and leadership skills (OECD, 2008; The Times, January 2009). A survey conducted in December 2008 by Booz and Company of over 800 corporate executives around the world, found that companies, whether financially weak or strong, were struggling to make the right moves in the 2008-2009 global economic crisis. The survey captured responses from executives in 65 countries, representing companies from many major industries. The survey explored how well corporate executives were handling the global economic crisis, the actions they were taking, and the resulting impact their companies had on social responsibility agendas, with the finding that many executives wavered in their confidence of leadership capabilities to navigate their companies through the crisis. According to the survey, 40% doubt that their leadership has a credible plan to address the economic crisis, while 46% are not sure that their leadership could carry out any survival plan, credible or not. Additionally, one-third of all corporate executive level respondents do not have confidence in the plans that they presumably wrote themselves. Further, a remarkably high number of hard-hit companies (65%) are not doing enough to ensure their own survival, such as accelerating efforts to dispose of non-core business assets, or secure external funding. Among companies that are financially strong, one quarter are not taking advantage of opportunities to improve their position in the crisis (Reuters, 2009).
2 IMPACT OF THE GLOBAL FINANCIAL CRISIS ON INVESTMENT IN CAPITAL ASSETS
The global financial crisis has resulted in a significant fall in commodity prices, the source of much of the manufacturing and mining industry’s income and growth over recent years through further investment in capital assets. While most commodities are usually locked into short-term contracts, as these contracts came up for renegotiation, prices fell significantly. In the face of this, investment in mining-related capital assets was already being scaled back. Lower commodity prices also mean lower profits, particularly in the mining sector, and any reduction in investment spending inevitably flows through to other sectors of the economy, as well as reducing future employment in the manufacturing and mining-related industries. Industry surveys indicated that investment in the coal mining sector would drop by 40% in 2009 compared to 2008. Nonetheless, this drop is from very high levels reached in 2007, which were exceptionally profitable. Coal companies used free cash flows to sharply increase their investments, as well as paying out large dividends to shareholders. Expected reductions in
capital asset spending in 2009 were most marked among high-cost coal producers, especially those supplying export markets, such as in the U.S.A., Russia and Australia (Access Economics, 2008; OECD/IEA, 2009). Capital asset investment in new projects for those industries where demand is expected to remain at a commensurate level, such as in the energy industry, the outlook for renewable-based power projects is mixed, but is generally falling proportionately more than that in other types of power generating capacity. Estimates for 2009 predicted investment in renewable energy as a whole would drop by as much as 38%, although stimulus provided by most government fiscal packages probably offset a small proportion of this decline. Investment in renewable energy assets surged in recent years, recording year-on-year growth of 85% in 2007. But activity slowed in 2008 as sources of finance contracted and lower fossil-fuel prices reduced the economic incentive for new investment, particularly in the last few months of the year. Data for the first quarter of 2009 indicated that the slump in investment had accelerated, with spending 42% lower than in the previous quarter. In most regions, investment in bio-refineries has dried up due to lower ethanol prices and scarce finance. In the oil and gas sector, there has been a steady stream of announcements of cutbacks in capital asset spending and project delays and cancellations, mainly as a result of lower prices and cash flow. Global upstream oil and gas investment budgets for 2009 have already been cut by around 21% compared with 2008, a reduction of almost US$100 billion. Between October 2008 and the end of April 2009, over 20 planned large-scale upstream oil and gas projects, valued at a total of more than US$170 billion and involving around 2 million barrels per day of oil production capacity and 1 billion cubic feet (110 million cubic metres) per day of gas capacity, were deferred indefinitely or cancelled (OECD/IEA, 2009). Capital asset investment in the power sector has been severely affected by financing difficulties, as well as by weak demand. Global electricity consumption could eventually drop by as much as 3.5% by the end of 2009, the first annual contraction since the end of the Second World War. In the OECD, electricity demand in the first quarter of 2009 fell by 4.9% on a year-on-year basis. Non-OECD countries have also seen weaker demand. In China, demand fell by 7.1% in the fourth quarter of 2008 and by a further 4% in the first quarter of 2009. Weak demand has reduced the immediate need for additional capacity and resulting new investment in capital assets, thus affecting the manufacturing and mining industries world-wide. At the same time, commercial borrowing has become more difficult, with the cost of capital rising markedly, and venture capital and private equity investment falling sharply. In the event financial recovery takes longer than expected, capital-intensive projects will decline, although this depends on the policies and support funding that countries have in place for investment in infrastructure to sustain certain levels of employment (OECD/IEA, 2009). Governments worldwide face two major capital asset investment challenges. First, low-income countries suffer from limited fiscal potential. Many already have large current account deficits; a third have deficits that exceed 10% of their GDP. Foreign reserves are limited. Developing countries thus lack the means to stimulate their own economies. 
Second, in order to raise money, countries must issue bonds or raise taxes. The latter will lead to less available private money in the short term; and the former will require increased savings in the short term in order to pay off debt in the long term. Unless any government stimulus pays for itself, such a stimulus will be ineffective. The solution to both problems is to follow China’s example and invest in infrastructure capital assets of developing countries. Infrastructure already tends to be adequate in developed countries, so capital asset investments will likely not provide a high return. However, infrastructure “bottlenecks” to growth in developing countries are relatively easy to identify and profitable to exploit. The return on these investments would then spread throughout the globe through various international trade linkages. A case in point is the Chinese state-owned industrial giant, Chinalco’s US$19.5 billion bid for the world’s third largest mining company, Rio Tinto. However, the Rio Tinto Group has rejected Chinalco’s bid, opting rather to raise US$15.2 billion through a share offering with its former rival, the world’s largest mining corporation, BHP Billiton, for a 50/50 joint venture that will consolidate all their iron ore assets in Western Australia. The decision allows Rio Tinto to pay towards its US$38.9 billion debt without selling stakes in its largest mines to Chinalco. Nevertheless, China is the first major economy to engineer an upturn in growth, but China on its own is not able turn the global economy around. China still expects economic growth of 8% in 2009 and has announced investments in several infrastructure capital assets intended to support it. However, legitimate questions can be raised about the viability of such plans when economies around the world have faltered. China’s capital asset investments had worked during the Asian financial crisis of 1997-1998, but then the problem was limited to one continent, and was not global. In the aftermath of the 2008-2009 global financial crisis, financial systems will be subject to much more stringent regulation with less freely available credit, and hence much more subdued recoveries in capital asset investment. (ANZ, 2009; World Bank, 2009). 3
IMPACT OF THE GLOBAL FINANCIAL CRISIS ON INTERNATIONAL CORPORATIONS
The 2008-2009 financial crisis brought about a considerable decline in business all over the world, marked by reduced consumer spending and diminished production output, particularly in Britain, France, Germany and Japan. As global economic activity declined, macroeconomic risks heightened, though not uniformly compared with credit risks. Uncertainty about losses in financial institutions and record low values of troubled assets continued to plague the financial systems of most advanced countries, leaving them unable to attract private capital and, in several cases, necessitating government infusions amounting to billions of dollars, which in most cases pushed national deficits to record heights. Financial systems in these economies thus remained under severe stress. The deteriorating outlook for corporate industries, including financial institutions, took a severe toll on their balance sheets. Corporate industries that were severely impacted included major
industrial and infrastructure assets such as the automotive, airline, building and construction industries. Automotive companies such as GM, Ford and Toyota reported in October 2008 declines in sales of 45%, 30% and 23% respectively. GM's filing for bankruptcy in June 2009 was one of the largest debt bankruptcies in U.S.A. history (Grail Research, 2008; IMF, 2009). There was little doubt that the global financial crisis required far-reaching changes in the shape and functioning of financial markets in a new global economy, in which financial systems world-wide were to be characterised by lower levels of leverage, reduced funding mismatches, lower counter-party finance risks, and more transparent and straight-forward financial instruments than in the pre-crisis period. The global financial crisis demonstrated that greater emphasis should have been placed on systemically focused surveillance and regulation. The emphasis should have been on how to detect and mitigate systemic risks through better regulation, since neither market discipline nor government oversight was sufficient to properly assess and contain the build-up of such risks. However, attempting to eliminate all systemic risk would not only be impossible, it would also slow credit growth and constrain economic creativity and business innovation. Restoring credit growth is essential to sustaining economic activity. Government fiscal stimulus to support economic activity and limit the rapid degradation of asset values was therefore intended to improve the creditworthiness of borrowers through collateral underpinning of loans and, combined with financial policies, to bolster banks' balance sheets to enable sound credit extensions. Seed funds for private-public partnerships for infrastructure projects were also intended to raise demand for loans (IMF, 2009). The private sector had a central responsibility in such fiscal stimulus by contributing to the new global economy through greatly improved corporate strategic planning and risk management, demonstrated leadership in organisational tactical and operational planning, implementation of sustainable business practices including attention to governance and remuneration policies, and the transfer of new economic-business knowledge to the company's human resource assets. However, the Booz and Company survey concluded that, in many cases, corporations did not follow the course that was best suited to them. Based on an analysis of responses, the survey results indicated the following:
• Corporations were categorised as being strong (characterised by both financial and competitive strength), stable (strong financially but weak competitively), struggling (weak financially but strong competitively), or failing (weak both financially and competitively).
• Stable and strong corporations were more focussed on cutting costs across the board and conserving cash than on opportunities to strengthen their competitive positions.
• While stable corporations would be expected to capitalise on the global financial crisis by buying companies with compelling capital assets and products - but weak finances, or pursuing other growth initiatives, 21% pulled back on mergers and acquisitions. One in five stable corporations also invested less in new capital assets and products, or slowed down moves into emerging markets.
• While struggling and failing corporations would be expected to accelerate efforts to improve working capital positions, slash overhead, drive process improvements, and renegotiate deals with suppliers, surprisingly many were not doing this. Between a quarter and a third of the survey respondents said they were pursuing such strategies no more aggressively than they were before the crisis.
In the case of restructuring corporate debt, two broad approaches have been suggested. The first approach is a voluntary private sector debt workout between banks and borrowers. In this case, debtors negotiate with a consortium of creditors to establish a mutually agreeable level of debt service and loan maturities. The second approach is that governments take a central role in the restructuring process. Such a role will vary from case to case, but is essential in a systemic crisis where insolvencies are large and private coordination difficult. The global nature of the financial crisis has however made any effort towards restructuring difficult for at least two key reasons. First, corporations have borrowed from cross-border banks that operate in a wide range of jurisdictions where corporate law and in-court settlement frameworks differ, making coordination of debtors and cross-border creditors more difficult. Second, the holders of corporate debt are much more dispersed than in the past because corporations have financed their activities by issuing bonds in international markets, and many corporate loans have been acquired by securitisation structures, with each structure holding a small share of any single corporation’s debt (IMF, 2009).
4 IMPACT OF THE GLOBAL FINANCIAL CRISIS ON CORPORATE MANAGEMENT
Despite the depth of challenges most corporations faced in the global financial crisis, 54% of all respondents to the Booz and Company survey believed that the crisis would ultimately have a positive impact on their companies' competitive position. This sense of optimism was even higher among executives in emerging markets (59%), compared with corporate executives in the U.S.A. (53%) and in Western Europe (52%). Furthermore, 75% of executives described their companies' financial strength during the crisis as positive, and only 13% said they worked with companies that were financially weak. Scepticism, however, was much more visible farther down the management chain. Among senior managers below the CEO and corporate executive levels, 51% thought executive leadership lacked the capabilities to carry out their crisis plans, a point that seemed at odds with the optimism expressed by many executive respondents. Many top executives were still reacting to, rather than anticipating, the possible impacts of the financial crisis, and had yet to formulate appropriate recovery plans. They were still operating with cumbersome processes and ineffective lines of communication. This tended to slow down any form of approach to the right
strategic, tactical and operational decision-making. Fundamentally, corporate executives were not getting the right homework fast enough, nor were they able to enact decisions quick enough or to the extent they expected. Corporate executives were struggling to make the right moves in the current economic environment, with many wavering in their confidence of fulfilling the expected decision-making and leadership capabilities essential for navigating their companies through the financial crisis. The accepted past wisdom has been that economic downturns are the great incubators of innovation-driven strategies. But in today’s reality, innovation usually takes a backseat to survival. What happens in this situation is that innovative strategic vision and decision-making often tends to weaken, with a general shift towards more survivalist operational initiatives. Financial industry executives were alone in praising collaborative efforts to resolve the crisis. Forty-three percent of financial industry executive respondents believed business, government and union leaders were working together effectively to stabilize their industry. However, scepticism about stakeholder collaboration was highest in healthcare and pharmaceuticals (56%); telecommunications and media (42%); and transportation and commercial services (41%). Essentially, the global financial crisis called for a new, more direct executive leadership approach. Critical to addressing the global financial crisis for their companies, senior leadership needed to take stock of their world view through three steps of crisis restructuring, according to Booz and Company (Reuters, 2009):
• Get an accurate assessment of the economic environment and their corporations' position in it. An accurate and organisation-wide diagnosis is critical to end the cycle of inappropriate strategic planning actions.
• Design a good strategy plan that does enough, but not too much, when time is short and resources may be diminished in a crisis. Identify a limited set of straightforward initiatives that have the potential to make a difference quickly.
• Execute and communicate the right strategic, tactical and operational decision-making, which is vital to regaining the confidence of all stakeholders, from sceptical managers to risk-averse shareholders.

5 INTEGRATED ASSET MANAGEMENT IN A NEW GLOBAL ECONOMY
Asset management is a systematic process for programming and integrating efficient and equitable allocations of resources for cost-effectively installing, operating, maintaining, upgrading, and replacing physical capital assets to fulfil established business performance and service delivery objectives, and thus takes both a life cycle and systemic perspective of both industrial and infrastructure capital assets. Integrated Asset Management (IAM), and more specifically Integrated Engineering Asset Management, involves integrated processes of managing physical assets during their useful lives, and requires a certain level of management insight and expertise from diverse organisational disciplines. IAM is a systematic, structured process covering the whole life of physical assets whereby the underlying assumption is that an organisation's assets exist to support the organisation's delivery strategies. The principal objective of IAM is thus to achieve the best possible match of assets with an organisation's service delivery strategies. Furthermore, corporate objectives are translated into delivery strategies, outputs and outcomes. These delivery strategies thus combine with business information systems and human and financial resources. In addition, IAM provides a framework for handling both short-term and long-range planning to facilitate a more organised and logical approach to decision-making. This decision process is scenario-driven and founded upon principles of strategic management, risk management, engineering economy, performance measure, information technology, and asset usage life cycle engineering. IAM begins with an organisation’s strategic planning of service delivery objectives, investment risk decisions, budgeting and costing, and continues through tactical and operational planning of assets installation, operation, condition assessment, maintenance and eventually disposal of the physical asset. Integrated Asset Management in a new global economy would inevitably focus on those asset management issues that are most affected in a post-recession recovery. These issues would predominantly include the installation, operation, maintenance, upgrading or replacement of new as well as existing capital assets to fulfil set performance and service delivery objectives. This has become blurred however, and Integrated Asset Management has expanded to include strategic, tactical and operational planning as well as decision-making of a company’s total wealth. It has become a prime responsibility of corporate executives to get more from existing assets and to understand how these assets are performing through adopting key innovation and leadership initiatives in sustainable practices.
6 CORPORATE EXECUTIVE DEVELOPMENT IN A NEW GLOBAL ECONOMY
Corporate executive development has been a topic of major interest in most prominent university business schools prior to the 2008-2009 global financial crisis and came under focus in the corporate world with the collapse of Enron and its aftermath. Enron Corporation, founded in Houston, Texas, in July 1985, was one of the largest energy, services, and commodities corporations in the world. Growth for Enron was rapid. In 2000, the company's annual revenue reached US$100 billion and its stock price peaked at US$90. It ranked as the seventh largest corporation on the Fortune 500 and the sixth largest energy company in the world. However, in October 2001, Enron reported a loss of US$618 million, its first quarterly loss in four years. By late November 2001, the company's stock was down to less than US$1. Investors had lost billions of dollars. Enron filed for bankruptcy protection in December 2001 in the biggest case of bankruptcy in U.S. history at that time. Roughly 5,600
Enron employees subsequently lost their jobs. What caused the collapse of Enron? Many business analysts and academics have offered various reasons for Enron’s failure. Many of these reasons relate directly to corporate executive mismanagement of huge proportions that included, among others, a breakdown of corporate organisational structure. Organisational structure equates with corporate strategic decision-making capability, and is essential in reaching corporate objectives. Although strategy plans were reviewed and approved by Enron's board of directors, key decision-making was left to the chief financial officer and the chief operating officer, for whom the scope of broader corporate responsibility did not feature in their designations. Their span of control was non-existent and their basic interests were financially motivated. Enron's chief financial officer was also the managing director of corporate partnerships financed by Enron. Enron financed partnerships made ‘off the books’ and buried these partnerships, and any resulting losses, in elaborate financial statements. Billions of dollars were lost in these partnerships, and kept off Enron’s books through overly complex accounting practices involving the accounting firm, Arthur Andersen. Furthermore, Enron's executive leadership could not encompass all of the organisation’s goals, nor could it provide the necessary advice and guidance needed for operational decision-making. The way a business and its people conduct themselves is a direct reflection of its leadership. Responsible leadership and responsible decision-making were lacking in Enron. There was no firm standard of conduct for Enron’s business practices, and the top executives broke management leadership down by turning over their accountabilities to unqualified and inexperienced staff in the guise of delegating responsibilities. Enron’s collapse was largely brought on by a lack of respect and concern for sound, sustainable business practices. Because Enron's corporate culture was established by executives who lauded the premise of ‘profits at all costs’, the people recruited by Enron were to be of a mindset type that was reflected in the required organisational training of the company’s human resource assets. So much so that Enron created a ‘cheat sheet’ of sorts to aid in recruiting managers. They basically looked for a particular type of person who had no reservations about closing deals (Fowler, 2002). The 2008-2009 financial crisis has resulted in a new and growing interest by corporate organisations world-wide, as well as by universities, of professional executive management development in general, and executive management of corporate assets in particular, in a new global economy. Such corporate executive development required the essential capabilities of:
• Establishing corporate strategic planning through innovation-driven, though risk-informed, decision-making.
• Leadership in the establishment and implementation of organisational tactical and operational planning through knowledge-based decision-making.
• Implementation of sustainable practices in preparation for a future competitive edge.
• Business knowledge transfer through appropriate organisational training of a company's human resource assets.
6.1 Establishing corporate strategic planning through innovation-driven, though risk-informed, decision-making. Strategic planning can be defined as; ‘a continuous and systematic process where the guiding members of an organisation make decisions about its future, develop the necessary procedures and operations to achieve that future, and determine how success is to be measured’ (Goodstein et al. 1993). To fully understand strategic planning, it is necessary to look at a few key words in the definition of strategic planning (U.S. Federal Benchmarking Consortium, 1997); continuous refers to the view that strategic planning is an ongoing process, not merely an event to produce a plan; systematic recognises that strategic planning is a structured and deliberate effort, that does not happen on its own; process recognises that a benefit of strategic planning is to think strategically about the future and how to get there; guiding members identifies not only senior corporate and business unit executives but stakeholders and customers; procedures and operations means the full spectrum of actions and activities from aligning the organisation behind long-term goals, to establishing organisational incentives, allocating resources, and developing human resources; how success is to be measured recognises that strategic planning must use appropriate measures to determine organisational achievement.
Most importantly, strategic planning is an opportunity to unify an organisation’s management, employees, stakeholders and customers through a common understanding of what the organisation’s core business is, and how everyone involved can work to that common purpose, and how progress and levels of achievement will be measured. In a post-recession recovery economy, the pivotal point in a company’s strategic planning process is the point at which strategic direction (i.e., the organisation’s goals, objectives and activities by which it plans to achieve its future vision, mission and values) is set. It is at this point that an organisation’s knowledge and insights about its past, present, and future converges, and a path is chosen around which the organisation aligns its activities and its resources. Without strategic direction, an organisation risks both internal misalignment and the likelihood that it will fail to respond to the vagaries of a changing world – particularly in a new global economy. Organisations today use decentralised business units that focus on intangible knowledge, capabilities, and relationships created by employees whereby organisational strategy becomes a continual and participative process. Regardless of such a revised structure of the strategic planning process, its timing or its participants, most organisations emphasise the central role that guidance from corporate executives play in ensuring success. The change from financial measures that come from past actions can no longer measure the objectives that need to be addressed. However, strategy must be measured, and the best tool to
implement such a practice is the Balanced Scorecard. The Balanced Scorecard allows organisations to build a management system that manages strategy - a strategy-focused organisation. Strategy in this context means communicating in a way that everyone can understand the organisation’s plan for success; focused means navigation in the organisation to align strategy; and organisation means to mobilize all employees to act in different ways that will link together across the business. The Balanced Scorecard provides a framework to look at organisational strategy from four different perspectives. It gives corporate executives the information necessary to make important decisions that affect everyone in the organisation, using the five principles of a strategy-focused organisation, specifically, make strategy a continual process; align the strategy; translate the strategy to operational terms; make strategy everyone’s everyday job; and mobilize through executive leadership. These five principles help organisations achieve the required focus and alignment (Kaplan and Norton, 2001). There are however three clear threads that run through most corporations’ strategic planning process. The first is that the top executives must always be personally involved in establishing guidance at the corporate level. The second is that guidance cascades from top to bottom in the organisation. This is not to say that strategic issues are not raised from lower organisational levels, or that strategic thinking does not occur outside the executive boardroom. On the contrary, strategic planning in a recovery economy dictates that management throughout all organisations are expected to think and act strategically. This means that strategic guidance created at successively lower levels in the organisational management hierarchy is built upon and aligned with the guidance at higher levels. Managers at each level take their cue from the guidance of the managers above them. The third thread is that strategic guidance is not developed as a staff output, but it is the result of some of the most important thinking and decision-making by the leaders of the organisations (U.S. Federal Benchmarking Consortium, 1997). Strategic planning in a post-recession recovery economy will typically include the following:
• Setting targets – to achieve enough, but not too much, when time is short and resources may be diminished in a crisis.
• Corporate objectives – initially identify short-term initiatives that have the potential to make a difference quickly.
• Gap analysis – extrapolate current performance, and gauge the gap between target and forecast performance.
• Strategy Formulation – generate and evaluate options for achieving targets, and select best options.
• Strategy Appraisal – look at competitive advantages and weaknesses and refine targets.
• Strategy Implementation – draw up plans, monitor and control achievement.
Setting targets is often driven by past performance, or performance expectations based on past performances. However, in a post-recession recovery economy, targets should be set sufficiently close to reach, with enough to achieve, but not too much. Corporate objectives should be short-term with the potential to make a difference quickly, followed by longer-term initiatives. Gap analysis is usually conducted by finance and marketing staff looking at what adjustments are necessary to achieve these targets. This is an essential but difficult quantitative analysis during post-recession recovery. Most strategic plans are to do with addressing and improving performance, but occasionally a progressive short-term to long-term approach to strategic planning is necessary through concise strategic appraisals of changes in the economic environment. Strategic appraisals should examine the organisation and its economic environment from inside out and from the outside in, considering the strengths and weaknesses of the organisation in order to formulate, appraise, implement, and evaluate strategic options. Assets strategic planning can be simply defined as; ‘the planned alignment of physical assets with service delivery’. It is achieved by the systematic management of all decision-making processes taken throughout the life of the asset. Assets strategic planning realises what the organisation’s key delivery of services are from its strategic assets base. Assets strategic planning commences with the identification and analysis of community needs and expectations of service in the case of public sector assets, and the identification and analysis of corporate or business demands in the case of private industry assets. The primary focus of assets strategic planning is to achieve optimal service delivery through effective asset utilisation and efficient asset management. Traditional performance has been measured in terms of inputs and the minimisation of costs at the input end of the resource equation. Assets strategic planning however, requires a focus on delivery and outcomes at the output end of the resource equation. The real measure of success is enhanced assets service delivery. In a new global economy, corporate executives need to get more from existing business assets and to understand how these assets are performing. Performance measures and benchmarks in assets strategic planning should therefore be primarily directed at measuring these outcomes. 6.2 Leadership in the establishment and implementation of organisational tactical and operational planning through knowledge-based decision-making. Major economic slowdowns and recessions undergo highly predictable patterns. Just before the slowdown occurs, the market is at a peak and there is a great deal of complacency amongst companies, shareholders and investors. Money is relatively easy to get and much is therefore wasted. Corporate behaviour in financial matters are relatively lax and investment quality declines across the board, not just at the corporate executive level but also with shareholders. As a result, companies globally are striving to identify, develop, and put in place outstanding leaders to face the challenges of a highly unpredictable world - commercially, economically and socially. Yet, in the commercial world at least, there is a widespread feeling that current leadership approaches are not working. Most of these approaches derive from theories that are almost a century old.
Approaches based on these theories have a psychoanalytic, anthropological and social basis rather than a financial basis. A basic requirement for executive development in a new global economy is to transform leadership capabilities from a static theoretical approach to one that can meet and effectively deal with a post-recession economy (Perth Leadership Institute 2008). An important organisational component in a new global economy is therefore executive leadership development. When conducted effectively, executive leadership development can greatly influence organisational culture and success. Armitage, Brooks and Schulz (2006) suggest that leadership development comprises more than training individual executive leaders, and that there is a much more complex relationship between individual and organisational leadership development, as shown in Figure 1, the Leader Maturity Model Life Cycle. Such a relationship is particularly significant in organisations entering a post-recession recovery economy, where ability and capability gaps, organisational limitations, significant global trends, corporate culture and available resources all affect leader performance.
Figure 1. Leader Maturity Model Life Cycle (Armitage et al. 2006) The model shows that there are numerous other processes in place for a leader to be successful. For example, in a global changing economy, if a leader is trying to transform an organisation in an organisational culture that is resistant to change, no leadership development process will make the organisation successful. Leadership development must therefore be focused on systemic change, especially in organisational planning, as well as individual leadership development (Armitage et al. 2006). The organisational planning process is rational and open to the scientific approach to problem solving and decision-making. Organisational planning consists of a logical and orderly series of steps that include three fundamental stages in the planning process namely, strategic planning, tactical planning, and operational planning. Strategic planning sets policies and guidelines for the rest of the organisation’s planning. Traditionally, strategic planning has been done annually. However, in a new global economy, many organisations are doing away with annual business plans altogether and moving to a system of continuous planning, to permit quicker response to changing economic conditions. Such a plan involves adapting the organisation to take advantage of opportunities in a constantly changing economic environment. Tactical planning deals primarily with the preparatory phase of the planning process through systems and models. The tactical plan demonstrates how the strategic plan will be executed. The tactical plan must integrate into the strategic plan and accomplish the goals of the strategic plan. It must do this by developing specific objectives related to strategic goals, and by setting targets that are modelled to simulate typical real-world scenarios of the strategic plan as closely as possible. Tactical planning turns strategy into reality, and is usually tightly integrated with the annual budgeting process. Operational planning is based on the implementation of organisational plans and includes procedures and methods. Operational plans support tactical plans. They are the tools for executing daily, weekly, and monthly activities. One example is an operating budget, which is a plan that allows for, and places restrictions on, money to be spent over a certain period of time. Another example of operational planning includes scheduling the work of employees and identifying needs for staff and resources to meet short-term changes in business, finance and the economy. A systematic method of monitoring these changes must be adopted to establish process knowledge criteria in order to determine whether the strategy plan is unfolding as intended and to decide whether any plan must be changed in the drive to continuously improve the tactical and operational planning process through decision-making based on business knowledge.
6.3 Implementation of sustainable practices in preparation for a future competitive edge. Corporate sustainability can be defined as a business planning approach to creating shareholder value by embracing opportunities and managing risks derived from changing economic, environmental and social trends. While most corporations have a number of business planning processes in existence, few have appropriate systems to effectively co-ordinate them. Developing co-ordination systems will assist in integrating the components of sustainable practices into mainstream planning processes, typically assets strategic, tactical and operational planning. Key questions to be addressed are (SAM, 2003):
• Which corporations will profit most from future sustainability trends?
• Which corporations will benefit the least from such trends?
• Where are the investment opportunities of tomorrow?
Sustainable practices address these questions, and analyse sustainability trends, through (SAM, 2003):
• focusing on future-oriented trends and technologies;
• assessing not only 'hard' but also 'soft' economic factors;
• complementing existing financial valuation methodologies;
• considering financial crisis and environmental management;
• identifying successful capital asset investment opportunities.
Sustainability trends also set the framework for sustainable investments and their impact on a company‘s business approach. Figure 2 illustrates sustainability trends concerning social, environment, and economic factors.
Figure 2. Social, Environment, and Economic Sustainability Trends (SAM, 2003)
Sustainable practices in asset management set the framework for capital asset investments and their impact on a company's business approach for a future competitive edge. Sustainable practices are based on the hypothesis that successful business depends on identifying capital asset investment opportunities that can provide a good return. This depends on a vigorous population of business enterprises and investment targets, which in turn depends on a healthy and vibrant macro economy.

6.4 Business knowledge transfer through appropriate organisational training of a company's human resource assets.
Business knowledge is the collective term used to describe the knowledge underpinning an organisation, allowing it to perform its functions and potentially giving it its competitive advantage. It covers a wide spectrum of knowledge, expertise, skills, and know-how that define an organisation. Business knowledge represents the most important corporate asset and covers the following:
• Documented business policies and guidelines, business systems and models, and operating procedures and methods. • Compliance with external rules, regulations and legislation (e.g. tax, statutory and environmental regulations). • Employees’ know-how and expertise relating to an organisation’s customers, products, services, resources, processes, operations and risks.
The concept of developing business planning processes from business knowledge is not new. However, many businesses do not have sufficient information that can be used for management planning, simply because they have not collected the data. Concerted attempts to develop business planning processes through an accumulation of appropriate business knowledge should become a prime focus for executives needing to develop corporate strategy in a post-recession recovery economy. Still, there are many situations where the appropriate information does not exist because the conditions they reflect seldom occur. This is typical of organisations caught in a financial crisis where executives focus more on survival and less on business knowledge. The transfer of essential business knowledge to those levels of organisational management that can best apply it in the business planning process, can deliver significant competitive advantages to an organisation. This would include preserving the expertise of key employees; the ability to respond fast to economic changes made possible by the use of captured knowledge; and applying business knowledge widely, accurately, consistently and rapidly – especially in a financial crisis. Such transfer of business knowledge is feasible only through appropriate organisational training of a company’s human resource (HR) assets. However, HR development and training are amongst the first areas to suffer in an economic downturn. In economic downturns, CEOs and boards focus on immediate cost-cutting exercises that are designed to reflect in the financial statements of the organisation. Typically little or no thought is given to changing employee and managerial behaviours. The immediate focus is to get to numbers that will show significant changes have been made that will result in earnings improvement. In such an environment, behaviours are indeed changed, but only in the short-term. The fundamental financial behavioural attributers remain unchanged. The focus is on short-term financial metrics rather than long-term financial behaviours. This leaves the company vulnerable to precisely the same financial vulnerabilities, namely to lower quality business planning and related decision-making, once the global economy improves (Perth Leadership Institute, 2008).
7 REFERENCES
1
Access Economics, (2008) The impact of the global financial crisis on social services in Australia. Issues paper prepared by Access Economics, Australia, for the Australian social services sector, November 2008.
2
ANZ, (2009) The global financial crisis and its impact on the Australian economy. Presentation to Singapore banking sector by Eslake, S., Chief Economist, ANZ, Australia, March 2009.
3
Armitage, J.W., Brooks, N.A. and Schulz, S.P. (2006) Remodelling Leadership. Performance Improvement, 45, 40-49.
4
Economic Insight, Inc. (2008) The 2008 Global Financial Crisis. plenary presentation for the International Association for Energy Economics by Van Vactor S.A., Perth, Australia, November 6, 2008.
5
Fowler, T., (2002) Enron Adds Up Four Years of Errors. Houston Chronicle, January 2002.
6
Goodstein, L. D., Nolan, T. M., and Pfeiffer, J. W. (1993) Applied Strategic Planning: A Comprehensive Guide, Amsterdam.
7
Grail Research, (2008) Global Financial Crisis: Summary of the media’s coverage of the timeline, causes, implications, impact and recommended path forward. Presentation by Grail Research, LLC, a member of the Monitor Group.
8
IMF, (2009) Global Financial Stability Report: Responding to the Financial Crisis and Measuring Systemic Risks. World Economic and Financial Surveys, International Monetary Fund, Washington DC, April 2009.
9
Kaplan R. and Norton, D. P., (2001) The Strategy-Focused Organisation: How Balanced Scorecard Companies Thrive in the New Business Environment. Harvard Business School Press. Boston, MA, U.S.A.
10
OECD, (2008) Managing the Global Financial Crisis and Economic Downturn. Economic Outlook by Schmidt-Hebbel, K., Chief Economist, OECD, Paris, November 2008.
11
OECD/IEA, (2009) The Impact of the Financial and Economic Crisis on Global Energy Investment. Office of the Chief Economist (OCE) of the International Energy Agency (IEA), OECD, Paris, May 2009.
12
Perth Leadership Institute (2008) A Recession’s Role in Transforming Leadership Development. White paper by Perth Leadership Institute, U.S.A., February 2008.
13
Reuters, (2009) Senior Managers Lack Confidence in Corporate Leadership’s Plans to Counter Economic Crisis, Finds Booz and Company Survey. Reuters Top News, Jan 20, 2009.
14
SAM, (2003) Sustainable Asset Management. Sustainable Asset Management (SAM), Zollikon-Zürich, Switzerland, presentation by Tucker E., to the 2nd Biennial ISIN Meeting, March 13-16, 2003, Toronto, Canada.
15
The Times, (2009) Global unemployment heads towards 50 million. The Times, New York, U.S.A., January 29, 2009.
16
U.S. Federal Benchmarking Consortium, (1997) Best Practices in Customer-Driven Strategic Planning. U.S. Federal Benchmarking Consortium Study Report, U.S. Federal Government, February 1997.
17
World Bank, (2009) The Causes and Impact of the Global Financial Crisis: Implications for Developing Countries. White paper by Lin, J., Chief Economist of the World Bank, February 9, 2009.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DEVELOPMENT OF AN ONLINE CONDITION MONITORING SYSTEM FOR SLOW SPEED MACHINERY
Eric Y Kim a, Andy C. C. Tan a, Joseph Mathew a and Bo-suk Yang b
a CRC for Integrated Engineering Asset Management, Queensland University of Technology, 2 George St, Brisbane, QLD 4001, Australia
b School of Mechanical Engineering, Pukyong National University, San 100, Yongdang-dong, Nam-gu, Busan, 608-739, South Korea.
One of the main challenges of slow speed machinery condition monitoring is that the energy generated from an incipient defect is too weak to be detected by traditional vibration measurements due to its low impact energy. Acoustic emission (AE) measurement is an alternative, as it has the ability to detect crack initiation or rubbing between moving surfaces. However, AE measurement requires a high sampling frequency, and consequently a huge amount of data is obtained that must be processed. It also requires expensive hardware for data capture and storage, and signal processing techniques to retrieve valuable information on the state of the machine. AE signals have been utilised for early detection of defects in bearings and gears. This paper presents an online condition monitoring (CM) system for slow speed machinery which attempts to overcome these challenges. The system incorporates signal processing techniques relevant to slow speed CM, including noise removal to enhance the signal-to-noise ratio and peak-hold down sampling to reduce the burden of massive data handling. The analysis software works in the LabVIEW environment, which enables online remote control of data acquisition, real-time analysis, offline analysis and diagnostic trending. The system has been fully implemented on a site machine and is contributing significantly to improved maintenance efficiency and safer, more reliable operation.
Key Words: Condition Monitoring, Low Speed Machinery, Rolling Element Bearing, Acoustic Emission
1 INTRODUCTION
Condition Monitoring (CM) is the process used to determine the operational state and health of a machine for the purpose of detecting potential failures before they turn into functional failures. Condition monitoring is an integral part of Predictive Maintenance (PdM) or Condition-Based Maintenance (CBM). Typical CM techniques include vibration analysis, oil analysis, wear particle analysis, ultrasonic analysis, thermographic analysis and motor current signature analysis. Low speed machines (LSM) are usually large and have high rotating inertia. Bearings are arguably the most critical components to be monitored in most LSMs. Extensive research has been done on diagnosing rolling element bearing defects using vibration analysis [1-8]. At low speeds, the impact energy between the rotating elements and the defect is generally low and is consequently only weakly detected by a vibration transducer. Theoretically, it is possible to extract a low energy signal using traditional signal processing methods, but in practice this is problematic. The use of wear particle analysis is impractical because low speed units are usually grease lubricated. Conventional techniques based on acceleration vibration may not be able to detect a growing fault due to the low impact energy generated by the relatively moving components. This has led to an increasing use of the Acoustic Emission (AE) technique for fault diagnosis of rotating machinery in low speed machinery condition monitoring [9-13]. The AE technique has been shown to be more sensitive than vibration and ultrasound measurements for the detection of incipient faults, especially in low speed situations [14]. However, due to the high frequency nature of AE signals, a high sampling rate in the region of a few MHz is required. This is one of the main shortcomings of AE application, as it requires a highly specialised data acquisition system and experts to interpret the results. An additional problem with AE
application in low speed machines is that it requires not only a high sampling frequency but also a longer recording time in order to cover enough complete shaft revolutions. A longer recording is also necessary to provide sufficient resolution to observe sidebands of defect frequencies in the frequency spectrum. Hence, AE data are normally analysed using hit-based or continuous time-based methods and not in the frequency domain. This also restricts the application of many advanced signal processing techniques to AE signal analysis. Kim et al [15, 16] have revealed the effectiveness of AE signals in the frequency range up to 100 kHz for low speed bearing CM/D. This paper introduces an online CM system which uses the AE-based technique for LSMs. An online data acquisition system has been constructed because LSMs are often exposed to harsh environments, and hence manual data acquisition has to be minimised in practice. For this reason, the sensors and data acquisition system are installed on the selected machine and the data are collected remotely via the web. This enables detection of damage as soon as it appears and minimises the need for regular human intervention. Software has been developed using LabVIEW, which enables online remote control of the data acquisition setup, online real-time analysis, offline analysis and diagnostic trending.
2 SIGNAL PROCESSING AND FEATURE EXTRACTION
Most of the research on bearing diagnosis based on vibration measurement can be categorised into time domain and frequency domain approaches. The RMS, crest factor and probability density moments (skewness, kurtosis) are the most popular statistical time domain parameters for bearing defect detection [3]. In the frequency domain, the enveloping method, also known as demodulation or HFRT (High Frequency Resonance Technique), is the most popular technique for detecting localised defects [4,5]. Wavelet analysis has also been successfully applied to bearing defect detection [6-7]. Recently, AE measurement has been widely applied to assessing the integrity of machines as well as structures. Most AE applications have focused on high frequency events (100 kHz to 1 MHz) of AE signals using highly specialised data acquisition and analysis systems. Hit-based features (counts, energy, amplitude, rise time and duration) are typical monitoring features in many applications. However, in rotating machinery the high frequency AE signals are easily overwhelmed by background noise coming from various moving machine parts. A drawback of using the hit-based AE features is the lack of frequency information about the AE events. The frequency of occurrence of burst-type hit events is by far the most relevant feature for identifying bearing defect frequencies, such as the ball passing frequency on the outer race (BPFO) or the ball passing frequency on the inner race (BPFI). It is impractical to obtain the frequency of AE events in low speed situations because a much longer data record is required, and consequently a huge amount of data is generated for each revolution of the shaft. Kim et al [15, 16] have successfully applied low frequency (up to 100 kHz) AE signal analysis for early detection of low speed bearing faults. This research utilises those low frequency AE signals with proven processing algorithms, such as time domain features and envelope analysis. Fig. 1 shows the signal processing and feature extraction flow used in the software. Wavelet packet analysis is employed as a band-pass filter, and an adaptive line enhancer (ALE) is applied as an optional preprocessing step to enhance the signal-to-noise ratio.
Fig. 1 Flow of signal processing and feature extraction
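To make the enveloping step of the flow in Fig. 1 concrete, the short Python sketch below is an illustrative stand-in for the LabVIEW implementation, not the authors' code; the filter band, sampling rate and defect frequency are placeholders for the example. It band-pass filters a record, takes the Hilbert-transform envelope and reads the envelope-spectrum amplitude near a nominal BPFO.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def envelope_spectrum(x, fs, band=(32e3, 64e3)):
    """Band-pass filter, take the Hilbert envelope and return its spectrum."""
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    xf = filtfilt(b, a, x)                     # band-pass around the AE resonance
    env = np.abs(hilbert(xf))                  # envelope (demodulated signal)
    env -= env.mean()                          # remove DC before the FFT
    spec = np.abs(np.fft.rfft(env)) / len(env)
    freq = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return freq, spec

# Example: flag energy at a nominal outer-race defect frequency (hypothetical value)
fs = 250e3                                     # assumed sampling rate
x = np.random.randn(int(4 * fs))               # placeholder for a measured AE record
freq, spec = envelope_spectrum(x, fs)
bpfo = 7.3                                     # hypothetical BPFO in Hz
idx = np.argmin(np.abs(freq - bpfo))
print("Envelope amplitude near BPFO:", spec[idx])
```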
2.1 Peak-hold down sampling (PHDS)
The main challenge in applying AE to low speed machinery is the huge amount of data to be processed: the AE signals must be sampled at a high frequency, and long time records are needed in order to obtain a full range of frequency information. Traditional AE-based techniques for slow speed bearings therefore do not involve spectrum analysis in bearing diagnosis. To overcome this issue, this research proposes the peak-hold down sampling (PHDS) technique, which maintains the high frequency information of the burst AE signals at a significantly reduced sampling frequency. PHDS works like an analog peak-hold circuit in the digital domain. In PHDS, the analog signal is sampled at the original high frequency for a number of complete revolutions and temporarily stored in the buffer memory of the DAQ board. Then the peak amplitudes detected from the rectified signal are saved to a data file. As this process is carried out in the memory of the DAQ board and controlled by the LabVIEW software, the original high-rate data do not need to be saved to the hard disk. Hence, it overcomes the problem of massive data storage caused by high sampling rates and low rotational speeds. In Fig. 2, the PHDS signal is compared with a conventionally down-sampled signal in the time domain and in the envelope spectrum. The signal was originally from a defective bearing of a gearbox. The result clearly shows that the conventional 1 kHz sampled signal does not reveal the bearing defect frequency (BPFO) in the envelope spectrum, while the PHDS 1 kHz signal maintains a spectrum similar to that of the original 250 kHz sampled signal. In this case the data size was reduced by a factor of 250 while maintaining similar detectability. This is a significant advantage for storing, processing and analysing the measured AE data, especially from low speed machines, given the long time span required for fine resolution in frequency spectra.
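A rough way to picture PHDS (a minimal sketch only, not the buffer-level DAQ-board implementation described above) is block-wise peak holding of the rectified record:

```python
import numpy as np

def peak_hold_downsample(x, factor):
    """Keep the peak of |x| in each block of `factor` samples (peak-hold down sampling)."""
    n = (len(x) // factor) * factor            # trim to a whole number of blocks
    blocks = np.abs(x[:n]).reshape(-1, factor)
    return blocks.max(axis=1)

# Example: a 250 kHz record reduced to an effective 1 kHz rate (factor 250)
fs = 250_000
x = np.random.randn(4 * fs)                    # placeholder for 4 s of raw AE data
y = peak_hold_downsample(x, factor=250)        # 250-fold smaller, burst peaks preserved
print(len(x), "->", len(y), "samples")
```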
2.2 Noise removal
In low speed machinery, the impact energy generated by the rolling elements on the defective components is insufficient to produce a detectable vibration response. This is further aggravated by the inability of general measuring instruments to detect and process the weak signals accurately. Furthermore, the weak incipient signals are often corrupted by background noise along their transmission path to the sensor located on the machine surface. Kim et al [17] compared two noise removal techniques for enhancing slow speed bearing signals so as to increase the detectability of incipient defects. Blind deconvolution (BD) and the adaptive line enhancer (ALE) were used as wide-band notch filters to remove sinusoidal-type noise. The optimum filter length was chosen to maximise the kurtosis and the modified peak ratio. The performance of the two algorithms was compared and validated using simulated bearing signals and signals from a test rig with a defective bearing rotating at low speed. The adaptive line enhancer was incorporated in this application owing to its better performance and computational efficiency.
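The adaptive line enhancer can be sketched as a delayed-input LMS predictor; the fragment below is illustrative only, and the filter length, delay and step size are arbitrary choices rather than the optimised values of [17]. The predictor locks onto the sinusoidal interference, so the prediction error retains the broadband burst content of interest.

```python
import numpy as np

def ale(x, order=32, delay=1, mu=0.01):
    """LMS adaptive line enhancer: returns the error signal with sinusoidal noise suppressed."""
    w = np.zeros(order)                        # adaptive FIR weights
    e = np.zeros_like(x)
    for n in range(order + delay, len(x)):
        u = x[n - delay - order:n - delay][::-1]   # delayed reference vector
        y = np.dot(w, u)                       # prediction of the narrow-band part
        e[n] = x[n] - y                        # error keeps the burst (defect) content
        w += 2 * mu * e[n] * u                 # LMS weight update
    return e

# Example: a 50 Hz sinusoid plus weak random noise
fs = 1_000
t = np.arange(0, 4.0, 1.0 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.05 * np.random.randn(t.size)
cleaned = ale(x)
```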
[Figure 2: envelope analysis of the AE signal in the 32-64 kHz band, comparing the original 250 kHz sampled signal, the PHDS 1 kHz signal and a conventionally down-sampled 1 kHz signal. Panel (a) shows the time waveforms (amplitude in mV versus time in sec); panel (b) shows the envelope spectra (amplitude versus frequency in Hz), with the BPFO component visible in the original and PHDS spectra only.]
Fig. 2 Comparison of PHDS with normal down sampling.
3 HARDWARE IMPLEMENTATION
3.1 Target machine
The mixer gearbox in the green carbon production line of an aluminium smelter was chosen as the target machine for the developed technique. Continuous processing of carbon anodes is a key factor in the successful operation of the smelter. Effective planning and organisation of the maintenance strategy prevents production backlogs, keeps equipment in top operating condition and ensures that productivity remains high. Integrated information and technology systems are vital for successful operation of the smelter. The role of the gearbox is to produce both reciprocating and rotating motion at around 50 rpm for the paste mixer. The lateral reciprocating motion causes two distinctive impacts when the shaft reaches its maximum stroke, due to the significant inertia of the mixer shaft and the mixing material. These impacts make the analysis difficult in both the time and the frequency domain. In the time domain, the high-level impact signal overwhelms the minute useful signals from defects. In the frequency domain, the impulsive signals induce a series of harmonics and their sidebands in the spectrum.
3.2 Data acquisition system
Low speed machines are often exposed to harsh environments. In the green carbon production line, crushed and graded fractions of calcined petroleum coke and recycled anode butts are heated and mixed with molten pitch. A respirator is compulsory for access to the site as toxic coke powder is suspended in the air, so manual inspection and data acquisition are restricted. Accordingly, online remote monitoring is highly desirable for this machine. For these reasons, a decision was made to construct an online condition monitoring system to implement the AE-based monitoring technique. Suitable hardware systems were explored; the key requirements for the data acquisition hardware are online monitoring capability, compatibility, expandability and cost. National Instruments' PXI system (www.ni.com) was chosen as a rugged PC-based platform which offers a high performance, low-cost deployment solution for measurement and automation systems. PXI also adds mechanical, electrical and software features that define complete systems for test and measurement, data acquisition and manufacturing applications. These systems are used in other applications such as manufacturing test, military and aerospace, machine monitoring, automotive and industrial test. The main advantage of the NI PXI system over other similar products is the flexible control software: LabVIEW provides fully functional measurement applications with analysis and custom user interfaces across a variety of hardware.
3.3 AE sensor and installation
There are two types of AE sensors: resonance-type and wideband-type. Resonance-type AE sensors maximise the sensitivity in a specific frequency range near the resonance frequency, while wideband-type sensors cover a wide frequency range, normally from 100 kHz to 1 MHz. For monitoring of rotating machinery, resonance-type AE sensors are often used because high frequency (over 100 kHz) signals attenuate over short distances and are easily buried by background noise. The authors demonstrated the feasibility of using AE signals in the range of 50-100 kHz for slow speed bearing monitoring [16]. Therefore, a resonance-type AE sensor with a resonance frequency of 60 kHz was chosen. A preamplifier is normally used with the AE sensor, as the signal detected by the sensor is normally too weak and can easily be buried by background noise. An integrated-type AE sensor with a built-in preamplifier does not require a separate amplifier and is therefore suitable for industrial application; it also allows long cables to be driven without noise interference. Physical Acoustic Corporation (PAC)'s CH6I was selected as it meets these requirements. AE sensors are normally attached to the machine surface using magnetic holders. This type of mounting is not suitable in this application because the mixer gearbox has high vibration due to the reciprocating motion. Furthermore, the sensors need to be attached to the lower side of the bearing housing, close to the loading zone of the bearing. Therefore, the sensors were attached securely and permanently to the surface of the gearbox using epoxy glue, which can be detached using acetone. The schematic of the data acquisition system and its installation are shown in Figure 3. Six AE sensors, one accelerometer and one tachometer were installed permanently on the target gearbox. A signal conditioning box decouples the AE signals from the 28 VDC supply on which they are superimposed. The signals are digitised by the DAQ card and stored in the PXI system. The PXI is connected to the internet via a secured local network, allowing access from anywhere in the world.
Fig. 3 Schematics of the hardware system and its installation beside the machine
4 SOFTWARE IMPLEMENTATION
The CM software has been developed in the LabVIEW environment, which directly controls the data acquisition hardware. The software enables real-time observation and off-line analysis, and consists of data acquisition control, real-time analysis, off-line analysis, feature extraction and trending of features. Fig. 4 shows some screen displays of the software. The software has an option of extracting all relevant diagnostic features for trending. The features include (i) the mean Peak Ratio for the outer race, inner race, ball and cage train, obtained from the envelope spectrum; (ii) RMS, crest factor, skewness and kurtosis from four levels of wavelet packet analysis of the time waveform; and (iii) the amplitudes of harmonics of the rotating speed and gear mesh frequencies from the raw spectrum. Fig. 5 shows a trend of the mean Peak Ratio (mPR) [16] of the outer race response for five bearings. The mPR of bearing 2 is distinctly higher than that of the other bearings, which indicates that its condition is worse than the others. It is also noted from this trend that although a spall has initiated on the outer race, its deterioration rate is slow at the moment, and continued close monitoring of this bearing is required.
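The statistical features listed under (ii) can be illustrated with a short sketch (illustrative Python, not the LabVIEW code); one such dictionary of values per acquisition would then be appended to the trend.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def time_domain_features(x):
    """RMS, crest factor, skewness and kurtosis of one measurement record."""
    rms = np.sqrt(np.mean(x ** 2))
    crest = np.max(np.abs(x)) / rms
    return {"rms": rms,
            "crest": crest,
            "skewness": skew(x),
            "kurtosis": kurtosis(x, fisher=False)}  # about 3.0 for a Gaussian signal

# Example record; in the monitoring system one dictionary per acquisition is trended
x = np.random.randn(10_000)
print(time_domain_features(x))
```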
5 CONCLUSIONS
This paper introduced an AE-based condition monitoring system for low speed machinery. To overcome the challenges of the traditional AE-based technique in CM of low speed bearings, a peak-hold down sampling technique was proposed. CM software has been developed in the LabVIEW environment, which enables online remote control of data acquisition, online real-time analysis, offline analysis and diagnostic trending of defective components. The system has been successfully implemented on a site machine and has detected a possible incipient defect on a bearing, indicating an early stage of deterioration. The superiority of the developed AE-based method over traditional vibration-based methods lies in its ability to detect defects earlier and with increased sensitivity, which not only helps prevent catastrophic failures but also allows the maintenance schedule to be optimised by providing significant lead time before an actual failure occurs. Further research is ongoing to develop a robust gear diagnostic algorithm based on the data acquired from the developed system.
(a) Main window
(b) Bearing and gear data input window
(c) Analysis window
(d) Feature trending window
Fig. 4 Screen shots of the CM software
Fig. 5 Example of mPR trending for the bearings
6 REFERENCES
1
N. Tandon and B. C. Nakra, (1992) Vibration and acoustic monitoring techniques for the detection of defects in rolling element bearings - a review. Shock and Vibration Digest, 24, 3-11.
2
Tandon N & Nakra BC. (1992) Vibration and acoustic monitoring techniques for the detection of defects in rolling element bearings - a review. Shock and Vibration Digest, 24, 3-11.
3
Tandon N & Choudhury A. (1999) A review of vibration and acoustic measurement methods for the detection of defects in rolling element bearings. Tribology International, 32, 469-480.
4
Alfredson AC & Mathew J. (1985) Time domain methods for monitoring the condition of rolling element bearings. Mechanical Engineering Transactions.
5
McFadden PD & Smith JD. (1984) Vibration monitoring of rolling element bearings by the high frequency resonance technique - a review. Tribology International, 17, 3-10.
6
Liu TI & Mengel JM. (1992) Intelligent monitoring of ball bearing conditions, Mechanical Systems and Signal Processing, 6, 419-431.
7
Paya BA, Esat II & Badi MNM. (1997) Artificial neural network based fault diagnostics of rotating machinery using wavelet transforms as a pre-processor, Mechanical Systems and Signal Processing, 11, 751-765.
8
Lin J, Zuo MJ & Fyfe KR. (2004) Mechanical fault detection based on the wavelet de-noising technique, Journal of Vibration and Acoustics, Transactions of the ASME, 126, 9-16.
9
Rao RBKN. Advances in acoustic emission technology (AET) in COMADEM: A state-of-the-art review, COMADEM2003, 1-18.
10
Yoshioka T, Korenaga A, Mano H & Yamamoto T. (1999) Diagnosis of rolling bearing by measuring time interval of AE generation, Journal of Tribology, Transaction of ASME, 121, 468-472.
11
Tandon N & Nakra BC. (1999) Defect detection in rolling element bearings by acoustic emission method, Journal of Acoustic Emission, 9(1), 25-28.
12
Shiroishi J, Li Y, Liang S, Kurfess T & Danyluk S (1997) Bearing condition diagnostics via vibration and acoustic emission measurements, Mechanical Systems and Signal Processing, 11(5), 693-705.
13
Jamuludin N, Mba D & Bannister RH. (2001) Condition monitoring of slow-speed rolling element bearings using stress waves, In Proceedings of the IMechE, 215(4), pp. 245-271, ProQuest Science Journal.
14
Shi DF, Wang WJ & Qu LS (2004) Defect detection for bearings using envelope spectra of wavelet transform, Journal of Vibration and Acoustics, Trans. ASME, 126(4), 567-573.
15
Kim Y-H, Tan ACC, Mathew J & Yang B-S. (2006) Condition monitoring of low speed bearings: a comparative study of the ultrasound technique versus vibration measurement, In WCEAM’06, World congress on engineering asset management, paper no. 26, Gold coast, Springer Publisher.
16
Kim Y-H, Tan ACC, Mathew J, Kosse V & Yang B-S. (2007) A comparative study on the application of acoustic emission technique and acceleration measurements for low speed condition monitoring, In APVC’07, Asiapacific vibration conference, Sapporo.
17
Kim EY, Tan A & Yang B-S (2008) Fault detection of slow speed rolling element bearing with noise removal techniques, In 15th ICSV, International Congress on Sound and Vibration, pp. 1981-1988, Daejeon,
Acknowledgments The work is supported through a grant from the CRC for Integrated Engineering Asset Management (CIEAM).
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ELECTRIC SIGNATURE ANALYSIS AS A CHEAP DIAGNOSTIC AND PROGNOSTIC TOOL
Stefano Ierace a, Marco Garetti b and Loredana Cristaldi c
a University of Bergamo, CELS - Research Center on Logistics and After Sales Service, Department of Industrial Engineering, Viale Marconi 5, I-24044 Dalmine (BG), Italy (Tel: +39 035 2052384, E-mail: [email protected])
b Department of Management, Economics and Industrial Engineering, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy (Tel: +39 02 23994760, E-mail: [email protected])
c Department of Electric Engineering, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy (Tel: +39 02 23993751, E-mail: [email protected])
In the modern industrial context, given the increasing requirements for quality and efficiency, maintenance is becoming more and more important, while its strategy is changing rapidly, moving towards the development of diagnostic and prognostic techniques which aim at minimising, and theoretically eliminating, the impact of failures. However, due to the need for cost reduction, it is necessary not only to prevent failures, but also to implement the required tools in a cheap way (i.e. without affecting product/equipment cost). This paper introduces ESA (Electric Signature Analysis), a novel technique which can act as a cheap and reliable tool for diagnostics and prognostics. The effectiveness of the technique is demonstrated in a real case study, carried out for the diagnostics of a vending machine. Finally, a first attempt to integrate the ESA tool in a general maintenance architecture is also provided.
Key Words: Electric Signature Analysis, Condition Based Maintenance, Diagnostics, Prognostics.
1 INTRODUCTION
In the modern industrial world, the increasing demand for efficiency and quality has influenced production technology and maintenance management strategies. In this context, the approach to maintenance has changed substantially, moving its focus from fault Repair Maintenance (RM) to Preventive Maintenance (PM). Within PM, the cyclic maintenance approach is being replaced by Condition Based Maintenance (CBM), which directly evaluates the equipment's health condition by performing periodic or continuous (on-line) condition monitoring of the equipment itself. As a further evolution of CBM, Predictive Maintenance (PdM) stems from the goal of predicting the future trend of the equipment's health condition. Since faults in complex systems are usually preceded by progressive degradation phenomena, the availability of suitable monitoring techniques allows the equipment degradation level and its evolution to be identified, thus permitting CBM and PdM policies to be implemented. From an application perspective, many research works carried out so far on this topic ([1], [2], [3], [4], [5], [6], [7]) predict a future in which critical components, equipment and plants will be fitted with microelectronic systems able to acquire, in real time, measurements useful for evaluating their "health state". MEMS (Micro Electro Mechanical Systems), Smart Sensors, RFIDs (Radio Frequency Identification), wireless networks, etc. are just some of the instruments/techniques that can be used for this purpose. Moving to the equipment side, it should be considered that the evolution of technology is leading to an ever increasing dependence on electrical apparatus [8] for every kind of industrial or service machinery. This means that, in the future, condition monitoring should be applied more and more not only to mechanical but also to electrical apparatus. A further issue relates to energy saving: recent requirements on energy resource control, on the quality of electric energy and on the monitoring of all the electric components of an industrial system call for effective tools to diagnose system efficiency and effectiveness (i.e. to identify performance degradation phenomena as soon as possible). However, the implementation cost of monitoring technologies is an issue not to be forgotten. Even if many of the aforementioned instruments are available today, their adoption by companies for monitoring is still limited, due to hardware and software costs, which obviously influence the
economic balance of CBM and PdM implementation. Therefore, the development of cheap monitoring techniques may have the potential to greatly influence and enhance the application of CBM and PdM practices. This paper moves in this direction by presenting the principles, and some experimental results, of the use of Electrical Signature Analysis (ESA) as an approach to condition monitoring. ESA is the procedure of capturing the equipment's supply signals (current and/or voltage) and analysing them to detect malfunctions (not only electrical ones) or incipient faults. ESA provides diagnostic and prognostic information comparable to conventional prognostic techniques, but requires only access to the electric supply lines rather than to the equipment itself. In fact, since a machine's electrical terminals are always accessible, diagnostic and prognostic analyses can be conducted without operating inside the tested equipment. ESA is an online (no stoppage required) and truly non-intrusive method. It is invisible to the equipment being monitored; therefore it does not interfere with the equipment's operation and can be used to perform a one-time test or a periodic test to track and trend equipment performance. It is important to observe that the ESA technique offers the opportunity to develop very cheap diagnostic and prognostic systems. The approach suggests a modus operandi that optimises resource consumption and utilises in an intelligent way the power of software for signal analysis, without the need to put in place expensive and complicated hardware (i.e. sensors, cables, converters, etc.). The paper describes the design principles of a monitoring and diagnostic system based on the ESA technique and shows how an empirical application of the methodology to a vending machine has demonstrated the possibility of diagnosing the degradation of a number of critical components. The remainder of the paper is organised as follows: section 2 provides a literature review of the most acknowledged diagnostic and prognostic approaches. Section 3 describes the main characteristics of the Electric Signature Analysis technique. The application of the technique in an industrial case study on a vending machine is reported in Section 4, while section 5 provides an overview of a general software architecture for the integration of the ESA tool in a maintenance management system. Finally, conclusions and managerial implications are drawn in section 6.
2 STATE OF THE ART
Under the name of Maintenance Engineering [9] or Reliability Engineering, several approaches, tools and techniques are being developed in order to provide a scientific base to maintenance activities. In particular, given the evolution of maintenance concepts in the last decades, a lot of work has been done in diagnostics and prognostics, both in the literature and in the industrial field. Jardine et al. [1] provided a detailed review on how CBM can be implemented by machinery diagnostics and prognostics. Diagnostics and prognostics are described as two techniques for maintenance decision support in a CBM program: fault diagnostics focuses on detection, isolation and identification of faults when they occur, while prognostics attempts to predict faults or failures before they occur. Some issues of the two approaches are discussed hereafter.
2.1 Diagnostics
Machine fault diagnostics is the procedure of mapping the information obtained in the measurement space and/or the features in the feature space. In equivalent terms, this mapping process can be defined as a pattern recognition process, obtained by classifying signals based on the information and/or features extracted from machine signals. Practical issues generally related to machine diagnostics were discussed by Williams [10]. Machine fault diagnostic approaches can be classified from the statistical, artificial intelligence and model-based points of view:
• Statistical approach: in this context, fault diagnostics relies on detecting whether a specific fault type is present or not in the equipment, based on the analysis of data coming from a registered condition monitoring activity. The fault detection problem can be described as a hypothesis test with null hypothesis H0: Fault A is present, against the alternative hypothesis H1: Fault A is not present. Several techniques have been used to solve diagnostic problems in this way: (i) statistical tests and conventional Statistical Process Control ([11]; [12]; [13]), (ii) multivariate statistical analysis [14] and (iii) Hidden Markov Models [15] (a toy sketch of such a threshold test is given after this list).
• Artificial Intelligence (AI) approach: AI techniques have been increasingly applied in recent years to machine diagnosis and have shown improved performance over conventional approaches. Two popular AI techniques for machine diagnosis are Artificial Neural Networks (ANNs) and Expert Systems (ESs). An ANN is a computational model that mimics the human brain structure. It consists of simple processing elements connected in a complex layer structure which enables the model to approximate complex non-linear functions with multi-input and multi-output structure (for more information see [16]). The feed forward neural network (FFNN) structure is the most widely used neural network structure in machine fault diagnosis ([17]; [18]; [19]). A special FFNN has also been used for pattern recognition and classification in machine fault diagnostics [20]. In contrast to ANNs, which extract knowledge by training on known input and output data of the observed phenomenon, ESs rely on knowledge which must be extracted (i.e. provided) by a domain expert and stored in a computer program, where an automated inference engine uses it to perform reasoning for problem solving. In the area of machinery diagnostics, three main reasoning methods for ESs are used: i) rule-based reasoning [21], ii) case-based reasoning [22] and iii) model-based reasoning [23]. However, the practical implementation of AI techniques has its weak point precisely in the lack of efficient procedures to obtain training data for the AI model (in the case of ANNs) or to obtain specific domain knowledge from human experts to be implemented into the model (in the case of ESs). So far, most of the applications reported in the literature used ANNs with experimental data for model training.
• Model-based approach: another class of machine fault diagnostic technique is the model-based solution, which utilises explicit mathematical modelling of the monitored machine, based on the laws of physics. The analytical expressions of the model are used to obtain signals, called residuals, which are indicative of fault presence in the machine. Various model-based diagnostic approaches have been applied to fault diagnosis of a variety of mechanical systems such as gearboxes [24], bearings [25] and cutting tools [26]; however, a lot of work remains to be done in pursuing this potentially very interesting approach.
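As a toy illustration of the statistical approach above (a sketch under an assumed Gaussian baseline, not a method from the cited works), a monitored feature such as kurtosis can be tested against a threshold derived from healthy-condition data:

```python
import numpy as np

def detect_fault(feature_value, healthy_samples, k=3.0):
    """Flag a fault when the monitored feature exceeds mean + k*std of the healthy baseline."""
    mu, sigma = np.mean(healthy_samples), np.std(healthy_samples)
    threshold = mu + k * sigma
    return feature_value > threshold, threshold

healthy_kurtosis = np.array([3.0, 3.1, 2.9, 3.2, 3.0, 2.8])   # hypothetical baseline values
faulty, thr = detect_fault(6.5, healthy_kurtosis)
print("fault detected:", faulty, "threshold:", round(thr, 2))
```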
2.2 Prognostics
Compared to diagnostics, the literature on prognostics is much smaller. Two main approaches to prediction are followed (a minimal sketch of the RUL idea is given after this list):
• Remaining Useful Life: the most obvious and widely used prognostic approach is to predict how much time is left before one or more faults occur, given the current machine condition and the past operation profile. The time left before observing a failure is usually called the Remaining Useful Life (RUL) ([27], [28], [29]). Similarly to diagnosis, the approaches to prognosis fall into three main categories [1]:
  o Statistical approach, in which the RUL is evaluated by predicting the future statistical behaviour of the considered entity, calculated for example with the Weibull distribution [30], logistic regression [31] or the Hidden Markov Model [32];
  o Artificial intelligence approach, in which the RUL is evaluated based on the distance between the prediction of a machine parameter given by an ANN (simulating the "as good as new" situation) and the current real value of the parameter ([33]; [34]);
  o Model-based approach, in which, thanks to a model implementing specific mechanistic knowledge and theory of the monitored machine, the future behaviour of the machine is predicted [35].
• Reliability approach: in some situations, especially when a failure can be catastrophic (e.g. in a chemical plant), it is more desirable to predict the chance that a machine will operate without a fault up to some future time (e.g. the next inspection interval), given the current machine condition and the past operation profile. The probability that a machine will operate without a fault until the next inspection (or condition monitoring) interval can be a good reference for maintenance personnel to determine whether the inspection interval is appropriate ([36]; [37]; [38]).
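A very simple way to picture the RUL idea (a sketch only, not one of the cited statistical, AI or model-based methods) is to fit a trend to a monitored degradation indicator and extrapolate it to an assumed failure threshold:

```python
import numpy as np

def estimate_rul(times, indicator, threshold):
    """Fit a linear trend to a degradation indicator and extrapolate to the failure threshold."""
    slope, intercept = np.polyfit(times, indicator, 1)
    if slope <= 0:
        return np.inf                          # no degradation trend yet
    t_fail = (threshold - intercept) / slope   # time at which the trend crosses the threshold
    return max(t_fail - times[-1], 0.0)

t = np.array([0, 10, 20, 30, 40], dtype=float)          # hours
indicator = np.array([1.0, 1.2, 1.5, 1.9, 2.4])          # hypothetical condition indicator
print("estimated RUL (h):", round(estimate_rul(t, indicator, threshold=5.0), 1))
```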
2.3 Electric Signature Analysis
Within the above framework on diagnostics and prognostics, ESA can be considered an efficient tool supporting both diagnostic and prognostic decisions. Since it is a rather novel technique in the maintenance area, only a few contributions on its use as a diagnostic and prognostic tool can be found in the literature so far. Basak et al. [39] reported an algorithm for induction machine diagnosis based on the analysis of the trend of the negative sequence impedance, obtained from the stator current and voltage vectors. The method has the capability of discriminating faults in the machine under investigation from the effects of an unbalanced supply system. Rodriguez and Arkkio [40] presented a method for the detection of stator winding faults based on monitoring the line/terminal current amplitudes and the use of fuzzy logic for decision making about motor stator conditions. Joksimovic and Penman [41] proposed an approach based on the analysis of frequency components of the line current spectra. Ponci et al. [42] presented a method for early fault detection in inverter-fed induction motor drives. The method relies on a simple statistical algorithm which analyses the line current signal acquired upstream of the rectifier-inverter and produces an ad hoc index related to the health state of the motor. Ponci et al. [42] also proposed another approach for identifying faults in the stator phase resistance of AC induction motor drives. In this case the method relies on the correlation between wavelet decomposition coefficients of the electric current in the healthy and in the faulty conditions. From this brief review it appears evident that the ESA approach may require, in the most difficult cases, the detection of frequency components or transient disturbances related to the potential malfunction. Therefore, frequency-domain analysis algorithms, such as FFT algorithms, and time-frequency transformations, such as wavelet transformations, appear to be the most promising tools for extracting the fault-related information from the signature. AI techniques, such as artificial neural networks (ANNs), fuzzy or neuro-fuzzy systems, are extensively used these days to correlate the obtained fault-related information with the actual faulty condition whenever only limited knowledge is available about the failure modes of the monitored device. In this regard, Kim et al. [43] described a neural-network-based fault prediction scheme that does not require any machine parameter or speed information. Speed is estimated from the measured terminal voltage and
current signals by the method itself; induction machines of different power ratings can be monitored in this way, requiring only minimal tuning of the neural network.
3 ESA AS A DIAGNOSTIC AND PROGNOSTIC TOOL
The basic principle of Signature Analysis (SA) is that an effect is the result of a cause; SA is only applicable to cases in which this principle is verified. In general, the methodology requires a mapping process that defines the "signature" as a cluster of object features. The mapping process (well known as the pattern recognition approach) is based on the notion of similarity between two different objects, or between an object and a reference object (the target or prototype object). Starting from this assumption, Electrical Signature Analysis (ESA) is the application of this principle to define the signature of an electric signal acquired from the field. ESA was first proposed for predictive maintenance by the Oak Ridge National Laboratory, as part of a study on the effects of aging and service degradation of nuclear power plant components. The idea is that any variation in an electromechanical system generally produces correlated variations or disturbances in the current and voltage supply line of the electric motor powering the system. ESA can be used to analyse these small perturbations and to match them to their source. The resulting time and frequency signatures calculated with ESA reflect loads, stresses and wear throughout the system and allow an extensive range of mechanical diagnostic information to be extracted. ESA can be used for:
- Diagnostic purposes, since it allows identification of which equipment causes the change in the signature and why;
- Prognostic purposes, since it allows the Remaining Useful Life to be calculated, relying on the gap between the measured signal and the "reference signal" obtained in "as good as new" conditions.
ESA can be considered a "black box" technique since it does not require any knowledge of the analysed equipment; in fact, a practical way to use ESA is to build a database of malfunctions. By comparing the features of the currently measured signal with the features of the signals contained in the database, it should be possible to recognise the type of malfunction and/or deterioration whose "signature" is "registered" in the monitored signal. It is important to underline that ESA can pinpoint not only electrical but also mechanical problems. Recent studies by Nandi et al. [44] demonstrated that ESA is a reliable technique, achieving a detection effectiveness of 93% or more. ESA can also be integrated with statistical and/or artificial intelligence techniques to improve its effectiveness in diagnosis and prognosis. Finally, Annoni et al. [45] reported that the use of ESA can target maintenance on an as-needed basis, thereby increasing equipment reliability and maintenance efficiency and minimising unexpected downtime. One of the main advantages of ESA is its cheaper architecture in comparison with other traditional diagnostic/prognostic techniques, which require complex hardware. ESA, in contrast to other diagnostic techniques that require expensive sensors and hardware systems, relies only on the power of sophisticated software algorithms for data analysis of the equipment's electrical supply line. The hardware requirements of ESA are very limited: the line voltage and current signals can be acquired with a data acquisition board (DAQ) or by the use of an analog-to-digital conversion board (ADC). The architecture of a generic tool for ESA can be divided into two sections:
• a data acquisition and measurement section;
• a data management and analysis section.
In a laboratory implementation of ESA, the data acquisition and measurement section can be built by hooking up three clamp-on current transformers and three alligator-type voltage clips onto the cables carrying power to the equipment's electric driving motor. The transducer arrangement must guarantee an adequate isolation level among channels, and between the supply and the measuring devices, over a wide band. The measurement system is usually developed in a Virtual Instrument environment (e.g. LabVIEW or Matlab) on a PC hosting the ADC board. The philosophy on which virtual instrument systems are based is to associate the measurement not with a specific piece of hardware, but with the algorithm that implements the required measurement [45]. To put it less formally, traditional hardware instrumentation systems are made up of pre-defined hardware components that are completely specific to their stimulus, analysis or measurement function. Because of their hard-coded function, these systems are more limited in their versatility than virtual instrumentation systems. Within the VI approach, software enables complex and expensive hardware to be replaced by already-purchased computer hardware; e.g. an analog-to-digital converter can act as the hardware complement of a virtual oscilloscope. Therefore, VIs are a very powerful support for the implementation of diagnostic/prognostic techniques that require continuous monitoring. The capabilities of VIs to manage measurements, control processes, store data and query databases autonomously are strategic in these kinds of applications. Moreover, especially in modern machinery, components such as current/voltage sensors or digital processors are already installed for non-diagnostic purposes (e.g. control, protection). The presence in most modern machines of a digital processor for control purposes, or of current sensors for current/torque control and over-current protection, opens the possibility of current measurement and digital signal processing at an even lower cost. Current and voltage data can also be acquired directly from the equipment's Motor Control Center (MCC), thus providing prognostic capabilities on a much larger
portion of the installed base. As a consequence, it is evident that ESA can be considered an innovative and cheap tool for the diagnostic and prognostic field. The following section shows the effectiveness of this technique in an industrial case study carried out on a vending machine.
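To give a feel for how little processing such a tool may need, the sketch below computes two elementary signature indicators (RMS current and a crude total harmonic distortion) from one acquired current record; the sampling rate and the synthetic test signal are assumptions for illustration, not values from the authors' setup.

```python
import numpy as np

def current_signature_indicators(i, fs, f_line=50.0, n_harmonics=10):
    """RMS current and total harmonic distortion from one acquired current record."""
    spec = np.abs(np.fft.rfft(i)) / len(i)
    freq = np.fft.rfftfreq(len(i), d=1.0 / fs)
    def amp(f):                                # spectral amplitude nearest to frequency f
        return spec[np.argmin(np.abs(freq - f))]
    fund = amp(f_line)
    harm = np.sqrt(sum(amp(k * f_line) ** 2 for k in range(2, n_harmonics + 1)))
    return {"rms": np.sqrt(np.mean(i ** 2)), "thd": harm / fund}

fs = 10_000                                    # assumed sampling rate of the acquisition board
t = np.arange(0, 1.0, 1.0 / fs)
i = 5 * np.sin(2 * np.pi * 50 * t) + 0.2 * np.sin(2 * np.pi * 150 * t)  # synthetic line current
print(current_signature_indicators(i, fs))
```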
4 CASE STUDY
The ESA approach has been tested on a vending machine (in particular, a coffee machine). In this case the purpose of the mapping process is the definition of a set of target signatures (one signature for every product, i.e. short or long coffee, coffee with a dash of milk, chocolate, etc.) in order to verify the similarity between the target and the actual signature. Fig. 1 reports an example of a signature (related to the short coffee case). Four LEDs convert the similarity into a product quality level.
Figure 1a – Target signature of the short coffee delivery
Figure 1b – Actual signature of the short coffee delivery
The approach of target signature definition can be considered a "black box" approach: the user may ignore the model of the process, but can use the principle that the process is repeatable. For instance, Figure 1a shows the signature of the machine working in healthy conditions, while Figure 1b shows the signature of a machine with a defective coffee mill.
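The target-versus-actual comparison can be pictured with a small sketch (illustrative only: the normalised cross-correlation measure and the four quality bands are assumptions, not the algorithm of the actual machine); the similarity score is mapped to one of four levels, mirroring the four LEDs.

```python
import numpy as np

def similarity(actual, target):
    """Normalised cross-correlation between the actual and the target signature."""
    a = (actual - actual.mean()) / actual.std()
    b = (target - target.mean()) / target.std()
    return float(np.dot(a, b) / len(a))        # 1.0 = identical shape

def quality_level(score, bands=(0.95, 0.85, 0.70)):
    """Map the similarity score to one of four quality levels (the four LEDs)."""
    for level, limit in enumerate(bands, start=1):
        if score >= limit:
            return level
    return 4                                   # worst level: likely malfunction

target = np.sin(np.linspace(0, 10, 500))       # stored "good delivery" signature
actual = target + 0.05 * np.random.randn(500)  # newly acquired signature
print(quality_level(similarity(actual, target)))
```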
5 OVERVIEW OF A SOFTWARE ARCHITECTURE FOR THE INTEGRATION OF ESA TOOL
The integration of the ESA tool into a comprehensive software architecture for prognostics and maintenance optimization is a prerequisite for its effective implementation in real applications. To this end, a reference architecture was developed within the scope of the current research activity on ESA. The architecture is depicted schematically in Figure 2, divided into different hierarchical levels (from L1 to L4), and is explained briefly in the following (a minimal code sketch of how the levels chain together is given after this list):
- L1 is the Electric Signature Toolbox (EST): it is made up of a set of software algorithms implementing the electric signature analysis using various techniques (i.e. distortion, transient, power analysis, etc.); the outputs of the EST are quantitative/qualitative indicators of the signature features, which are the inputs to the second level L2 of the architecture;
- L2 is the Diagnostic Analysis Toolbox (DAT): it is responsible for executing the diagnostic analysis to define the health status of the monitored equipment; to this end, the inputs coming from L1 are converted by the DAT into diagnostic information related to the functional elements of the equipment (i.e. bearing wear, friction, unbalance, etc.); the DAT logic can be based on the "white box approach" (i.e. on a mathematical model featuring the physical behaviour of the equipment), on the "black box approach" (i.e. comparing "proxies" of the normal functioning of the equipment in "as good as new" condition with the current equipment signature), or on an intermediate solution ("grey box approach") in which the non-accurate theoretical modelling is integrated and complemented by the comparison strategy (or vice versa);
- L3 is the Prognostic Signature Toolbox (PST): it transforms the diagnostic valuations coming from L2 into a probabilistic forecast of the equipment health state (methods already mentioned, such as fuzzy logic, ANN, fault tree analysis, etc., can be used in the PST);
- L4 is the Maintenance Policy Optimization System (MPOS): it is in charge of suggesting the most convenient maintenance policy considering technical and economic aspects, relying on the information coming from all previous levels of the hierarchy.
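The chaining of the four levels can be pictured with a minimal structural sketch; the function names, placeholder indicator values and decision rules below are assumptions for illustration, not the actual toolbox interfaces.

```python
from typing import Dict

def electric_signature_toolbox(raw_signal) -> Dict[str, float]:
    """L1 (EST): turn the raw supply signals into signature indicators."""
    return {"distortion": 0.03, "transient_count": 2}           # placeholder values

def diagnostic_analysis_toolbox(features: Dict[str, float]) -> Dict[str, str]:
    """L2 (DAT): map signature indicators to the health status of functional elements."""
    return {"bearing": "wear suspected" if features["distortion"] > 0.02 else "ok"}

def prognostic_signature_toolbox(diagnosis: Dict[str, str]) -> Dict[str, float]:
    """L3 (PST): turn the diagnosis into a probabilistic forecast of the health state."""
    return {"p_failure_30d": 0.15 if "suspected" in diagnosis["bearing"] else 0.01}

def maintenance_policy_optimization(forecast: Dict[str, float]) -> str:
    """L4 (MPOS): suggest a maintenance policy from the forecast and cost considerations."""
    return "schedule inspection" if forecast["p_failure_30d"] > 0.1 else "run to next cycle"

# The four levels chained together for one acquisition
raw = None                                      # placeholder for an acquired record
print(maintenance_policy_optimization(
    prognostic_signature_toolbox(
        diagnostic_analysis_toolbox(
            electric_signature_toolbox(raw)))))
```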
Figure 2: Software architecture for ESA tool integration in a maintenance optimization system
6 CONCLUSIONS AND MANAGERIAL IMPLICATIONS
The use of ESA as a diagnostic and prognostic tool gives the ability to detect and quantify mechanical/electrical malfunctions and degradation phenomena in electromechanical equipment. ESA allows the extraction of information about healthy, faulty and incipiently faulty operating conditions from the electric supply line of the equipment in a cheap way. The main advantages this method can offer are the following:
- substantial reduction of inspection activities on the equipment, thanks to the automatic ESA monitoring;
- increase in safety, thanks to the possibility of diagnosing and foreseeing the future health state of the equipment, and so of acting before a situation dangerous for people and the environment is reached;
- increase in equipment availability, thanks to the elimination of unexpected breakdowns;
- possibility to plan maintenance actions on time and in an efficient way in order to minimise all related costs (such as spare parts, personnel, etc.);
- contribution to the development of sustainable processes and products, thanks to a better utilisation of the goods' useful life, by substituting the cyclic maintenance approach (which risks changing components that are still performing correctly) with the CBM approach.
Moreover, as the case study demonstrated, ESA also supports quality management since, based on the electric signature, it is possible to monitor the quality of the delivered product. However, further research efforts and improvements are still needed for:
- developing powerful, reliable and scalable algorithms for signature analysis and diagnostic/prognostic evaluation;
- gaining experience in the development and use of databases of malfunction and degradation conditions during the life cycle of different kinds of industrial equipment;
- implementing and applying the above-mentioned software architecture for condition monitoring and maintenance policy optimization in collaboration with industry for experimentation in real cases.
7 REFERENCES
1
Jardine A.K.S., Lin D., Banjevic D. (2006) A review on machinery diagnostics and prognostics implementing condition based maintenance. Mechanical Systems and Signal Processing, 20, 1483–1510.
2
Korbicz J., Koscielny J.M., Kowalczuk Z., Cholewa W. (2004) Fault Diagnosis: models, artificial intelligence, applications. Springer Berlin.
3
Lee J., Ni J., Djurdjanovic D., Qiu H., Liao H. (2006) Intelligent prognostics tools and e-maintenance. Computers in Industry, 57, 476 – 489.
4
Bangemann T., Reboul D., Scymanski J., Thomesse J.P., Zerhouni N. (2006) PROTEUS – an integration platform for distribuited maintenance systems. Computers in Industry, Special Issue on E-Maintenance, 57 (6), 539 – 551.
5
Iung B. (2003) From remote maintenance to MAS – based E – maintenance of an industrial process. Journal of Intelligent Manufacturing, 14(1), 59 – 82.
6
Garetti M., Macchi M., Terzi S., Fumagalli L. (2007) Investigating the organizational business models of maintenance when adopting self diagnosing and self healing ICT systems. Proceedings of IFAC – CEA07 International Congress. October 2- 5, 2007.
7
Kiritsis D. (2004) Ubiquitous product lifecycle management using product embedded information devices. Invited keynote paper of IMS’2004—International conference on intelligent maintenance systems, Arles, France, 2004.
8
Tavner P.J. (2008) Review of Condition Monitoring of rotating electrical machines. IET Electric Power Applications, 2(4), 215–247.
9
Furlanetto L., Garetti M., Macchi M. (2007). Ingegneria della Manutenzione. Milano, FrancoAngeli (Italian).
10
Williams J.H., Davies A., Drake P.R.. (1994) Condition-Based Maintenance and Machine Diagnostics, Chapman & Hall, London, 1994.
11
Ma J., Li C.J.. (1995) Detection of localized defects in rolling element bearings via composite hypothesis test, Mechanical Systems and Signal Processing, 9, 63–75.
12
Sohn H., Worden K., Farrar C.R. (2002) Statistical damage classification under changing environmental and operational conditions, Journal of Intelligent Material Systems and Structures, 13, 561–574.
13
Fugate M.L., Sohn H., Farrar C.R. (2001) Vibration-based damage detection using statistical process control, Mechanical Systems and Signal Processing, 15, 707–721.
14
Artes M., Del Castillo L., Perez J. (2003) Failure prevention and diagnosis in machine elements using cluster, in: Proceedings of the Tenth International Congress on Sound and Vibration, Stockholm, Sweden, pp. 1197–1203.
15
Ge M., Du R., Xu Y. (2004) Hidden Markov model based fault diagnosis for stamping processes, Mechanical Systems and Signal Processing, 18, 391–408.
16
Chester J., (1993), Neural Networks. A Tutorial Prentice Hall, New York.
17
Power, Y., Bahri, A.P. (2004) A two-step supervisory fault diagnosis framework, Computers & Chemical Engineering 28, 2131–2140.
18
Li B., Chow M.-Y., Tipsuwan Y., Hung J.C. (2000) Neural-network-based motor rolling bearing fault diagnosis, IEEE Transactions on Industrial Electronics, 47, 1060–1069.
19
Fan Y., Li C.J. (2002) Diagnostic rule extraction from trained feedforward neural networks, Mechanical Systems and Signal Processing, 16, 1073–1081.
20
Samanta B., Al-Balushi K.R. (2003) Artificial neural network based fault diagnostics of rolling element bearings using time-domain features, Mechanical Systems and Signal Processing, 17, 317–328.
21
Hansen C.H., Autar R.K., Pickles J.M. (1994) Expert systems for machine fault diagnosis, Acoustics Australia, 22, 85– 90.
22
Bengtsson M., Olsson E., Funk P., Jackson M. (2004) Technical design of condition based maintenance system—A case study using sound analysis and case-based reasoning, in: Maintenance and Reliability Conference—Proceedings of the Eighth Congress, Knoxville, USA, 2004.
23
Araiza M.L., Kent R., Espinosa R. (2002) Real-time, embedded diagnostics and prognostics in advanced artillery systems, in: 2002 IEEE Autotestcon Proceedings, Systems Readiness Technology Conference, New York, 818–841.
24
Wang W.Y. (2002) Towards dynamic model-based prognostics for transmission gears, in: Component and Systems Diagnostics, Prognostics, and Health Management II, 4733, 157–167.
25
Baillie D.C., Mathew J. (1994) Nonlinear model-based fault diagnosis of bearings, in: Proceedings of an International Conference on Condition Monitoring, Swansea, UK, 1994, 241–252.
26
Choi G.H., Choi G.S. (1996) Application of minimum cross entropy to model-based monitoring in diamond turning, Mechanical Systems and Signal Processing, 10, 615–631.
27
Chinnam R.B., Baruah P. (2003) Autonomous diagnostics and prognostics through competitive learning driven HMMbased clustering, in: Proceedings of the International Joint Conference on Neural Networks 2003, vols. 1–4, New York, 2003, pp. 466–2471.
28
Kwan C., Zhan R. Xu L, Haynes L. (2003) A novel approach to fault diagnostics and prognostics, in: Proceedings of the 2003 IEEE International Conference on Robotics and Automation, vols. 1–3, New York, 2003, pp. 604–609.
29
Lin D., Makis V. (2004) Filters and parameter estimation for a partially observable system subject to random failure with continuous-range observations, Advances in Applied Probability, 36, 1212–1230.
30
Goode K.B., Moore J., Roylance B.J. (2000) Plant machinery working life prediction method utilizing reliability and condition-monitoring data, Proceedings of the Institution of Mechanical Engineers Part E—Journal of Process Mechanical Engineering, 214, 109–122.
31
Yan J., Koc M., Lee J. (2004) A prognostic algorithm for machine performance assessment and its application, Production Planning and Control, 15, 796–801.
32
Chinnam R.B., Baruah P. (2004) A neuro-fuzzy approach for estimating mean residual life in condition-based maintenance systems, International Journal of Materials and Product Technology, 20, 166–179.
33
Wang W.Q., Golnaraghi M.F., Ismail F., (2004) Prognosis of machine health condition using neuro-fuzzy systems, Mechanical Systems and Signal Processing, 18, 813–831.
34
Ierace S., Pinto R., Cavalieri S. (2007) Application of Neural Networks to condition based maintenance: a case study in the textile industry. In: Proceedings of the International Manufacturing Systems, Alicante, May 23 -25, 2007.
35
Chelidze D., Cusumano J.P. (2004) A dynamical systems approach to failure prognosis, Journal of Vibration and Acoustics, 126, 2–8.
36
Scarf P.A. (1997) On the application of mathematical models in maintenance, European Journal of Operational Research, 99, 493–506.
37
Lugtigheid D., Banjevic D., Jardine A.K.S. (2004) Modelling repairable system reliability with explanatory variables and repair and maintenance actions, IMA Journal Management Mathematics, 15, 89–110.
38
Campbell J.E., Thompson B.M., Swiler L.P. (2002) Consequence analysis in predictive health monitoring systems, in: Proceedings of Probabilistic Safety Assessment and Management, vols. I and II, Amsterdam, 2002, pp. 1353–1358.
39
Basak D., Tiwari A., Das S. P. (2006) Fault diagnosis and condition monitoring of electrical machines - A Review, IEEE Xplore.
40
Rodríguez J. P., Arkkio A. (2008) Detection of stator winding fault in induction motor using fuzzy logic. Applied Soft Computing, 8(2), 1112–1120.
41
Joksimovic G., Penman J., (2000) The detection of inter-turn short circuits in the stator windings of operating motors," IEEE Transactions on Industrial Electronics (ISSN:0278-0046), Volume 47, Issue 5, Oct. 2000, 1078-1084.
42
Ponci F., Cristaldi L., Faifer M., Lazzaroni M. (2007) Innovative approach to early fault detection for induction motors / In: IEEE international symposium on diagnostics for electric machines, power electronics & drives, 283-288.
43
Kim K., Parlos A. G., Bharadwaj R. M. (2003) Sensorless fault diagnosis of induction motors, IEEE Trans. Ind. Electron., 50 (5), 1038–1051.
44
Nandi S., Toliyat H.A., Li X. (2005) Condition monitoring and fault diagnosis of electrical motors – a review, IEEE Trans. Energy Convers., 20, (4), pp. 719–729.
45
Annoni M., Cristaldi L., Measurements, Analysis, and Interpretation of the Signals From a High-Pressure Waterjet Pump, IEEE Transactions on Instrumentation and Measurement, 57( 1), 34 – 47.
757
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
GEARBOX FAULT DETECTION USING SPECTRUM ANALYSIS OF THE DRIVE MOTOR CURRENT SIGNAL
Mohamed Rgeai a, Fengshou Gu b, Andrew Ball b, Mohamed Elhaj c and Mohamed Ghretli d
a The Higher Institute of Electronic Profession, Al-Jrab Street, Tripoli, Libya
b School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, U.K.
c 7th April University, Zawia, Libya
d The Higher Institute of Computer Technology, Gargaesh Road, Tripoli, Libya
This paper investigates the application of spectrum analysis of the motor current signal to the detection of mechanical faults in a two-stage helical gearbox driven by an 11kW induction motor. The benefits of using the current signal of the drive motor to monitor downstream mechanical components include a non-intrusive approach, potentially applicable remotely from the machine, likely less costly to apply than more conventional approaches such as vibration monitoring, and with scope for improved health and safety. Comparison of the spectral content of the motor current signal against a baseline is used to detect and assess the severity of pinion gear faults in a multi-stage gearbox, and a method is established that quantifies spectral components and provides a basis for assessment of gearbox condition. The motor current spectrum is dominated by the 50Hz mains frequency component, and families of sidebands are revealed which correlate with the shaft rotational frequencies (RF), both around the mains component and around the gear meshing frequency (GMF). The number and amplitude of the sidebands rise when a local tooth fault is introduced into the gear, and clear differences can be observed between the faulty and healthy spectra. The work in this paper therefore confirms the ability of the motor current signal to detect faults in downstream machines.

1 INTRODUCTION
Motor Current Signal Analysis (MCSA) has been used as a powerful tool for the detection and diagnosis of different kinds of faults within prime movers: motors and generators. Early work in the field of MCSA for condition monitoring of motors was conducted by Oak Ridge National Laboratory (ORNL) in the 1980s [1]. In recent years many researchers [2, 3, 4, 5, 6, 7, 8, 9, and 10] have focused their efforts on the condition monitoring of 3-phase induction motors, and the results achieved using the motor current for this purpose have been significant. However, the use of MCSA to detect and diagnose faults in downstream machines, such as a gearbox, has not yet been fully investigated, although there is a strong correlation between the motor and the gearbox in system operation. Because the input current to the motor depends on the torque and speed, it is possible to monitor current changes when the gearbox operates abnormally and thus to detect faults in the gearbox using the current measurement. The authors in [1] showed laboratory results using MCSA as a diagnostic indicator to reveal defects in motor-operated valves. The investigations in [11-12] have shown further interesting results using MCSA to detect gearbox faults based on changes in the low frequency range. However, the authors believe that insufficient research has been performed on using induction motors as reliable transducers to reveal mechanical faults beyond the immediate motor vicinity. The main objective of this paper is to detect downstream mechanical faults in a two-stage helical gearbox using spectrum analysis of motor current data in order to assess the condition of the gear. In order to explore the possibility of using MCSA for gearbox diagnosis, healthy and faulty motor current data for a pinion gear under full load were analysed through a careful examination of a wide frequency range in the frequency domain, to identify and track individual frequency components and to identify the occurrence of frequency and amplitude modulation. This permits close correlation with the physical characteristics of the gearbox [13].
2
FREQUENCY CHARACTERISTICS OF GEARBOX
The dynamics of a gearbox change the motor current signals through the oscillatory motion of the gearbox shafts, both rotationally and radially. The motion transfers to the motor rotor and causes a corresponding oscillation of the magnetic flux, which in turn induces an additional current component. This fundamental mechanism for current signal changes shows that mechanical vibrations from the gearbox can also be observed in the current signals. Based on vibration characteristics, the spectral locations of the fundamental frequencies of a two-stage gearbox can be quantified using Equations (1)-(3). The first, second and third shaft rotational speeds in Hertz (fundamental frequencies f_r1, f_r2, f_r3 respectively) can be determined using:

f_r1 = rotor mechanical speed / 60    (1)

f_r2 = (n_p / n_g) f_r1    (2)

f_r3 = (n_p / n_g)(N_p / N_g) f_r1    (3)

where n_p, n_g, N_p and N_g are the number of teeth on the pinion gear at the first stage, the number of teeth on the driven gear at the first stage, the number of teeth on the pinion gear at the second stage, and the number of teeth on the driven gear at the second stage, respectively. The oscillation of the magnetic flux due to vibration occurs around the fundamental flux component at 50Hz. This means that the oscillation will result in various sidebands around the fundamental component. Based on modulation mechanisms, the frequency components at which the sidebands appear in the spectrum, f_SBn (Hz), can be determined using Equation (4):

f_SBn = f_0 ± m G f_rn    (4)

where f_0, G, f_rn and m are the fundamental frequency (50Hz), the gear reduction ratio, the rotational speed of the nth shaft, and the harmonic index 1, 2, 3, ..., respectively. The rotational speed of the motor shaft, f_r1, can also be expressed in relation to the fundamental frequency and the motor slip as:

f_r1 = f_0 (1 − s) / P = (f_0 − s f_0) / P    (5)
Table 1 Spectrum frequency components and their harmonics as order m increases

First shaft frequency f_SB1 (Hz), G = 1
m | Lower side | Upper side | Spaced from 50Hz
1 | 25.57 | 74.43 | ±24.43
2 | 01.21 | 98.86 | ±48.86
3 | 23.29 | 123.3 | ±73.30
4 | 47.72 | 147.7 | ±97.70

Second shaft frequency f_SB2 (Hz), G12 = 0.486
m | Lower side | Upper side | Spaced from 50Hz
1 | 38.14 | 61.84 | ±11.84
2 | 26.28 | 73.72 | ±23.72
3 | 14.42 | 85.58 | ±35.58
4 | 2.56 | 97.44 | ±47.44

Third shaft frequency f_SB3 (Hz), G23 = 0.2817
m | Lower side | Upper side | Spaced from 50Hz
1 | 43.39 | 56.61 | ±06.61
2 | 36.78 | 63.22 | ±13.22
3 | 30.17 | 69.83 | ±19.83
4 | 23.56 | 76.44 | ±26.44
where s is the per-unit slip and P is the number of pole-pairs of the induction motor. The rotor rotational speed, f_r1 (Hz), can be expressed in the stator reference frame, giving a relation which directly relates the sideband frequencies f_SBn to the line frequency f_0 as:
f_SBn = f_0 [1 ± m G (1 − s) / P]    (6)
The frequency components of the three gear shafts' rotational speeds for the healthy condition of the two-stage gearbox are summarised in Table 1, which lists these frequencies and their harmonics as the order m changes from 1 to 4. Another significant source of vibration in the gearbox is the tooth meshing frequencies (TMF) and their sidebands. Similarly, the spectral locations of this type of vibration can be expressed in the form of Equation (6). However, because the TMFs lie in a much higher frequency range, their amplitudes are very low, and close attention must be paid when investigating their characteristics.
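As a worked illustration of the sideband bookkeeping in Equation (4), the short Python sketch below (illustrative only, not part of the original study) takes the shaft rotational frequencies reported in Table 2 and lists the sideband families around the 50Hz line frequency; the printed values reproduce the entries of Table 1 to within the rounding of the measured data.

```python
# Illustrative sketch of Equation (4): sideband families f0 +/- m*f_rn around
# the 50 Hz line frequency. The shaft frequencies are the measured values
# reported in Table 2 (assumed here for illustration).
F0 = 50.0                      # line frequency (Hz)
shaft_freqs = {"f_r1": 24.43,  # motor (first) shaft
               "f_r2": 11.84,  # second shaft
               "f_r3": 6.61}   # third shaft

def sidebands(f0, f_rn, orders=(1, 2, 3, 4)):
    """Return (lower, upper) sideband locations for each harmonic order m."""
    return [(f0 - m * f_rn, f0 + m * f_rn) for m in orders]

for name, f_rn in shaft_freqs.items():
    for m, (lower, upper) in enumerate(sidebands(F0, f_rn), start=1):
        print(f"{name}, m={m}: lower {lower:7.2f} Hz, upper {upper:7.2f} Hz")
```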
3 IMPLEMENTATION OF FREQUENCY ANALYSIS
A main purpose of the experimental testing is to obtain a baseline pattern of the stator current spectrum for the healthy condition of the helical gearbox, against which the spectral content of the motor current signal can be compared for gear fault detection and diagnosis. Frequency domain analysis of the stator current can be used to reveal the presence of sidebands around particular components in the motor current spectrum, and any variation in the amplitude of these sidebands could indicate the presence of mechanical problems in the gearbox. Ideally, the current drawn by a healthy ideal motor should exhibit only a 50Hz frequency component in its spectrum [14]. However, the experimental results in Figure 4 and Figure 5 clearly show that the current spectrum is rich in frequency components and exhibits many different sidebands around the main 50Hz component. This strongly suggests that the motor current reflects the effect of different modulation processes caused by both inherent motor dynamics and gear characteristics.

In order to gain a better understanding of the motor current spectrum for helical gear fault detection, a fault (complete tooth breakage) was seeded on the pinion of the first stage of the two-stage helical gearbox. The authors have considered only the spectra for the healthy and faulty gear conditions at full load, because in general practice fault symptoms appear to be more pronounced at higher load than at lower load.

The FFT is based on the assumption that the time signal is periodic and repeated. In addition, the FFT algorithm assumes that the sampled time data has no signal amplitude at the ends of the time record; otherwise the FFT spectrum would reflect errors as inaccuracies in both amplitude and frequency [15]. The frequency spectrum of a periodic sine wave captured over an infinite period of time exhibits a single-line spectrum whose width can be determined using fine resolution. However, for a non-periodic, finite time record of the sine wave the power spreads over the spectrum. This smearing of energy throughout the frequency domain is a phenomenon known as leakage (i.e. energy leaks out of one resolution line of the FFT into all the other lines) [16]. The leakage of energy can be severe enough to entirely mask small signals. For clarity in the analysis of a spectrum, leakage can be much reduced by employing a time-domain window function, such as the Hanning window [13]. This process multiplies each signal data point by the window function. The result is a signal shaped by an amplitude modulation contour with the maximum amplitude in the centre of the time record and zeros at the record edges [15]; the effect of discontinuities (incomplete periods) is thus reduced.

Frequency domain plots are shown later, plotted on linear and logarithmic scales. The vertical axes of all spectral figures are represented in Volts and calculated using a Hanning window, unless otherwise stated. No calibrated amplitude scale is necessary because the figures are only ever used for comparative purposes. To gain a better understanding of the stator current spectrum under normal and abnormal operating conditions of the helical gearbox, 2^15 (32,768) data points were collected at a sample rate of 800Hz, so that a good frequency resolution of 0.024Hz was achieved. The motor current signal was measured using the 3-phase measurement unit. The stator and rotor windings are assumed to be in healthy condition.
All results were obtained with the motor operating at the rated voltage on a 50Hz supply, at the rated speed of 1465rpm at full load, giving a per-unit slip of 0.023. The motor current spectrum was determined at full load (i.e. 100% of output power or torque) for the pinion in both healthy and faulty conditions. Only the low frequency region of the spectrum was investigated initially, because the motor current is likely to be affected by high frequencies generated from different parts of the machine.
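The spectral estimation described above can be summarised in a short sketch. The routine below is an illustrative NumPy implementation under the stated acquisition settings (2^15 samples at 800Hz, Hanning window); the variable names and the synthetic demonstration signal are assumptions, not the authors' code.

```python
import numpy as np

FS = 800.0        # sampling rate (Hz)
N = 2 ** 15       # record length (32768 samples), bin spacing FS/N ~= 0.024 Hz

def current_spectrum(current):
    """Hanning-windowed amplitude spectrum of one phase-current record."""
    window = np.hanning(N)
    spectrum = np.fft.rfft(current[:N] * window)
    freqs = np.fft.rfftfreq(N, d=1.0 / FS)
    # scale by the window sum so sinusoid amplitudes remain comparable
    amplitude = 2.0 * np.abs(spectrum) / window.sum()
    return freqs, amplitude

# synthetic check: a 50 Hz line component with a small 74.43 Hz sideband
t = np.arange(N) / FS
demo = np.sin(2 * np.pi * 50.0 * t) + 0.05 * np.sin(2 * np.pi * 74.43 * t)
freqs, amp = current_spectrum(demo)
print(f"frequency resolution = {freqs[1]:.4f} Hz")
```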
4 ANALYSIS OF MOTOR CURRENT SPECTRUM

4.1 Characteristics of Shaft Frequencies
The motor current spectrum is very rich in frequency components, as shown in Figure 1. At first sight the spectrum might look confusing, so it is important at this stage to identify and quantify each component in the frequency spectrum, because such a method of analysis for the detection of gearbox faults using MCSA has not previously been attempted.
Figure 1. Motor current spectrum (healthy gear)

Figure 1 shows the phase current spectrum for the healthy gear condition. The spectrum clearly exhibits the fundamental line frequency of 50Hz in addition to many sidebands which are found to be related to the gear rotational speeds. The presence of these sidebands in the spectrum of a healthy gear shows how sensitive the motor current signal is to changes in the gear transmission condition, and demonstrates the frequency modulation caused by the gear transmission. The modulation frequencies can be determined by measuring the spacing between these sidebands and the 50Hz supply frequency component. Table 2 summarises the spectral locations of the sidebands, their separation from the line frequency (50Hz) and the generating mechanism.

Table 2 Fundamental sidebands and their harmonics

Spectral location | Lower sideband | Hz | Upper sideband | Hz | Generating mechanism
1 | f0 − 2fs | 47.70 | f0 + 2fs | 52.30 | twice slip frequency
2 | f0 − fr3 | 43.39 | f0 + fr3 | 56.61 | 1st harmonic of 3rd shaft speed
3 | f0 − fr2 | 38.16 | f0 + fr2 | 61.84 | 1st harmonic of 2nd shaft speed
4 | f0 − 2fr3 | 36.78 | f0 + 2fr3 | 63.22 | 2nd harmonic of 3rd shaft speed
5 | f0 − 3fr3 | 30.17 | f0 + 3fr3 | 69.83 | 3rd harmonic of 3rd shaft speed
6 | f0 − 2fr2 | 26.28 | f0 + 2fr2 | 73.72 | 2nd harmonic of 2nd shaft speed
7 | f0 − fr1 | 25.57 | f0 + fr1 | 74.43 | 1st harmonic of rotor shaft speed
8 | f0 − 4fr3 | 23.56 | f0 + 4fr3 | 76.44 | 4th harmonic of 3rd shaft speed
9 | f0 − 3fr2 | 14.42 | f0 + 3fr2 | 85.58 | 3rd harmonic of 2nd shaft speed
10 | f0 − 2fr1 | 01.21 | f0 + 2fr1 | 98.80 | 2nd harmonic of rotor shaft speed
11 | does not appear | – | 2 × f0 | 100 | twice line frequency
The presence of a symmetrical series of sidebands spaced around the running speed reflects shaft speed variation; it is as if the motor were acting as an amplitude modulator, with the line frequency as the carrier and the shaft speed as the modulating signal [17]. Figure 1 also shows that the predominant set of sidebands is located at spectral location 7 (25.57Hz and 74.43Hz), which represents the sum and difference of the main line frequency and the motor shaft rotational speed fr1. The figure also shows a second set of dominant sidebands at spectral location 3 (38.16Hz and 61.84Hz); these frequency components are spaced at G12 fr1 Hz from the line frequency, where G12 is the gear reduction ratio at the first stage of the gearbox. These sidebands are therefore related to the second shaft rotational speed fr2. Figure 1 shows a third set of sidebands with lower amplitude, at spectral location 2 (43.40Hz and 56.60Hz); both are spaced by G23 fr1 Hz from the line frequency, where G23 is the gear reduction ratio at the second stage of the gearbox. This spacing corresponds to the mechanical rotation of the third shaft of the gearbox.

The presence of these sidebands could be related to inherent static and/or dynamic eccentricity arising from unavoidable manufacturing non-symmetry or gear imperfections, since it has been shown that such imperfections also produce similar sidebands in the spectrum, with amplitudes which depend on the degree and type of imperfection [18], in particular the degree of static and dynamic eccentricity [19]. It is clear that the appearance of these sidebands in the healthy spectrum is due to operating conditions in the driving system which give rise to stator current modulation [20] but, as will be shown in Figure 4, some of these sidebands have lower amplitude when compared to their counterparts in the spectrum of a faulty gear.

The current spectrum in Figure 1 also exhibits two other symmetrical sidebands around the line frequency at spectral location 1 (52.30Hz and 47.70Hz); these are spaced at twice the slip frequency, (f0 ± 2fs), where fs is equal to sf0. It is well known that the presence of these sidebands in the motor current spectrum is due to rotor bar imperfection and/or faults such as a broken rotor bar or rotor asymmetry. The amplitude of these particular sidebands is sufficiently low as not to raise concern about the health of this induction motor, because all induction motors show an inherent level of abnormality which cannot be avoided at the manufacturing stage. The stator current spectrum signature shown in Figure 1 is used in this work as the baseline spectrum for fault detection in the helical gearbox.
Figure 2a Motor current sidebands pattern at 0% load
Figure 2b Motor current sidebands pattern at 100% load
Figure 2 (a, b) shows the stator current spectrum for the healthy gear condition at 0% and 100% load, respectively. The spectrum at 0% load exhibits additional frequency components and sidebands which are not observed under the higher loading condition (100%). The generation of these additional components is due to the backlash effect between the pinion teeth and the mating (driven) gear during the meshing period. At lighter load, the static force exerted on the pinion teeth is not as strong as at higher load; this allows the mating teeth to vibrate more, resulting in a different level of vibration which is transmitted along many paths, inducing the additional frequency components seen in the motor current spectrum in Figure 2a. However, as the load increases, more force is exerted on the pinion teeth, reducing the backlash effect, which leads to a reduction in the number of frequency components, as shown in the spectrum in Figure 2b. It is also clear from the figures that some frequency components appear only at higher load; the twice-slip-frequency component, for example, was not present at 0% load. The amplitude of the slip frequency component is load dependent and increases as the motor load increases.

The presence of a gear fault (tooth breakage) will cause a pulse or impact each time the damaged tooth goes into mesh, and this sudden impact results in changes in the amplitude of the vibration signal during the meshing period. The duration and amplitude of a local variation can provide useful information regarding the severity of a fault. The amplitude of the vibration signal is sensitive to the tooth loading; thus, if the load fluctuates, the amplitude of the vibration would be expected to vary. The source of this type of modulation could be a gear fault and/or manufacturing errors. For instance, tooth breakage or eccentricity of one of the gears could cause a similar amplitude modulation. However, a localised fault would tend to give further modulation by a short pulse repeated every revolution [21].
Figure 3. Spectrum of motor current with faulty pinion at full load

A faulty gear tooth generates two different vibration signals: one is a radial vibration between the rotor and the stator of the induction motor, arising from the instantaneous changes in the dynamic force as the damaged tooth enters the mesh; the second is the torsional vibration caused by twisting of the rotating shaft. Figure 3 shows the spectrum of the motor current when a fault (tooth breakage) is present in the gearbox under full load. It is evident from Figure 3 that there is a significant increase in the amplitude of the motor rotational speed sidebands, fr1, compared to the healthy spectrum shown in Figure 1. This increase of about 30% in amplitude clearly suggests that an abnormality has occurred in the driving system. Hence, a change in the amplitude of the running speed sidebands can be used as an indicator of the presence of a fault in the gearbox. The figure also shows additional frequency components (13.03Hz, 23.50Hz, 71.98Hz, 86.93Hz and 96.28Hz) which are of unknown origin and were not observed in the spectrum with the healthy gear, see Figure 1.

Figures 4 and 5 show the zoomed spectrum (35Hz-65Hz) of the motor current when the gear is healthy and faulty, respectively. A quick comparison between the two spectra shows the presence of additional sidebands at spectral location (f0 ± fc), where fc is the oscillating frequency due to speed oscillation as a result of the fault. The experimental results in Figure 5 show that this oscillating frequency has a value of 0.38Hz [20]. Moreover, Figure 5 exhibits symmetrical sidebands with relatively low amplitudes spaced at approximately 0.38Hz, and its multiples, from the main line frequency. The spectral locations of these frequencies are 47.70Hz, 48.07Hz, 48.45Hz, 48.83Hz, 49.21Hz, 50.79Hz, 51.17Hz, 51.55Hz, 51.93Hz and 52.32Hz.
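The roughly 30% amplitude rise reported above suggests a simple numerical indicator. The following sketch is illustrative only; `freqs`, `amp_healthy` and `amp_faulty` are assumed outputs of the spectrum routine sketched in Section 3. It reads the amplitude of the f0 ± fr1 sidebands in the baseline and faulty spectra and reports the percentage change.

```python
import numpy as np

def sideband_amplitude(freqs, amplitude, target_hz, tol=0.1):
    """Peak amplitude within +/- tol Hz of an expected sideband frequency."""
    mask = np.abs(freqs - target_hz) <= tol
    return amplitude[mask].max()

def sideband_increase(freqs, amp_healthy, amp_faulty, f0=50.0, f_r1=24.43):
    """Percentage change of the f0 +/- f_r1 sidebands relative to the baseline."""
    changes = {}
    for target in (f0 - f_r1, f0 + f_r1):
        base = sideband_amplitude(freqs, amp_healthy, target)
        fault = sideband_amplitude(freqs, amp_faulty, target)
        changes[round(target, 2)] = 100.0 * (fault - base) / base
    return changes  # roughly +30% at 25.57 Hz and 74.43 Hz for the seeded fault
```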
Figure 4. Magnified motor current spectrum with healthy pinion at full load
Figure 5. Magnified motor current spectrum with faulty pinion at full load, showing sidebands spaced at 0.38Hz and its multiples

Figures 6 and 7 show the motor current spectra for both a healthy gear and a gear with a damaged tooth, under the same load, over the frequency ranges 0-25Hz and 82-92Hz respectively. The spectra in Figure 6 show that the presence of a fault introduces additional frequency components at 13.03Hz and 13.75Hz, whereas Figure 7 shows components at 86.27Hz and 86.85Hz. These components are considered to be gear fault related. It is worth mentioning that the 13.75Hz and 86.27Hz sidebands were found to be related to a combination of the first harmonic of the driving shaft and the first harmonic of the second shaft. Such spacing clearly indicates the source of the modulating frequency as the damaged tooth entering the mesh. These components can be quantified using Equations (7) and (8).
f_SBn = f_0 ± m G f_rn    (7)

f_SBn = f_0 ± (f_r1 + f_r2)    (8)

f_upper = 50 + (24.43 + 11.84) = 86.27 Hz
f_lower = 50 − (24.43 + 11.84) = 13.75 Hz

The amplitudes of these sidebands are either very low or obscured in the healthy case. However, in the faulty condition these sidebands show a significant increase in amplitude, up to approximately 400%.
Figure 6 Zoomed motor current spectrum with healthy and faulty gear at full load, showing the f0 − (fr1 + fr2) component
Figure 7 Zoomed motor current spectrum with healthy and faulty pinion at full load, showing the f0 + (fr1 + fr2) component

Figures 8 and 9 show the motor current spectrum with healthy and faulty pinion respectively, at full load, over the frequency range 120-175Hz. Study of these spectra reveals higher order harmonics (3rd, 4th and 5th) of the motor rotational speed sideband components for both healthy and faulty conditions. These sidebands occur at 123.25Hz, 147.72Hz and 172.15Hz respectively. The amplitudes of these components are very low in the healthy case, but they show noticeable amplitudes in the faulty case, especially the 172.15Hz component. Because these components are related to the motor shaft frequency, it is believed that their increase is caused by the damaged tooth; therefore, they can be used for the detection of such a fault. Moreover, the spectrum shows additional higher order frequency components at 120.65Hz, 125.56Hz, 145.03Hz and 174.19Hz, but of unknown origin. In general, the test results show that the presence of a fault in a gear changes the current spectrum by increasing the amplitudes of sidebands already present which correlate with the drive shaft frequency. In addition, the fault also creates further sidebands relating to higher harmonics of the shaft frequency.
Figure 8 Motor current spectrum with healthy pinion at full load (components of unknown origin indicated)
Figure 9 Motor current spectrum with higher harmonics, faulty pinion (components of unknown origin indicated)
4.2 Characteristics of Tooth Meshing Frequency

Further analysis was carried out on the motor current data in the high frequency range 800-900Hz to investigate the change of the spectrum around the tooth meshing frequency (TMF) at 830Hz. The TMF is defined as the rate at which tooth pairs enter the mesh; it is equal to the product of the pinion gear tooth count (34) and the rotational speed of the shaft on which the gear is mounted (24.43Hz). As shown in Figure 10, the TMF vibration and the modulation process that occurs within the meshing process due to a broken tooth result in a significant increase in the amplitude of the TMF and of the sidebands around the mesh frequency [22], providing another significant feature for detecting the fault. In addition, the TMF-related sidebands also show large changes. The spectrum in Figure 10 is for both healthy and faulty gear conditions under 100% load. This magnified stator current spectrum exhibits the gear tooth meshing frequency surrounded by several pairs of sidebands at (830 ± 24.43)Hz and (830 ± 11.84)Hz. This is a significant change in the signature of the motor current spectrum due to the gear fault.
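For reference, the TMF and the sideband pairs discussed above follow directly from the tooth count and shaft speeds; the short sketch below is only a worked restatement of those numbers, not part of the original analysis.

```python
# Tooth meshing frequency and its first sideband pairs (values from the paper).
PINION_TEETH = 34
F_R1, F_R2 = 24.43, 11.84          # motor and second shaft frequencies (Hz)

tmf = PINION_TEETH * F_R1          # ~830 Hz
sideband_pairs = {
    "motor shaft":  (tmf - F_R1, tmf + F_R1),
    "second shaft": (tmf - F_R2, tmf + F_R2),
}
print(f"TMF = {tmf:.1f} Hz", sideband_pairs)
```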
Figure 10 Motor current spectrum showing TMF with healthy and faulty pinion

The spectra in Figures 11 and 12 show a series of sidebands surrounding the TMF, spaced at the motor shaft and second shaft speeds respectively. Close examination of Figure 12 (faulty case) shows the same sidebands, but with much higher amplitudes. This increase in amplitude can be considered a direct result of the gear tooth fault.
Figure 11 Motor current spectrum showing TMF and sidebands (healthy case)
Figure 12 Motor current spectrum showing TMF and sidebands (faulty case)
5
CONCLUSIONS
Based on the fundamentals of the motor electromagnetic process and the vibration characteristics of the gearbox, the spectrum of the motor current has been studied and correlated with various vibration components. The changes found in the current signals due to gear faults can be observed not only in the low frequency range, where the shaft rotational frequencies are located, but also in the high frequency range, where the tooth meshing frequencies are located. These changes can be relied upon for fault localisation and quantification. However, more advanced signal processing techniques need to be investigated in order to extract these changes more accurately for on-line fault diagnosis.
6
REFERENCES
1 Haynes, H. D. and Kryter, R. C. (1989) How to Monitor Motor Drive Machinery by Analyzing Motor Current, Power Engineering, ORNL, pp. 35-39.
2 Schoen, R. R., Habetler, T. G., Kamran, F. and Bartheld, R. G. (1995) Motor Bearing Damage Detection Using Stator Current Monitoring, IEEE Transactions on Industry Applications, Vol. 31, No. 6, Nov./Dec.
3 Thomson, W. T. and Fenger, M. (2000) Motor Current Signature Analysis to Diagnose Faults in Induction Motor Drives, IEEE Industry Applications Magazine.
4 Thomson, W. T. (1992) On-line Current Monitoring of the Influence of Mechanical Loads or a Unique Rotor on the Diagnosis of Broken Rotor Bars in Induction Motors, Proceedings of ICEM, UMIST.
5 Schoen, R. R., Lin, B. K., Habetler, T. G., Schlag, J. H. and Farag, S. (1995) An Unsupervised, On-line System for Induction Motor Fault Detection Using Stator Current Monitoring, IEEE Transactions on Industry Applications, Vol. 31, No. 6, Nov./Dec., pp. 1280-1286.
6 Fenger, M. and Lloyd, B. A. (2003) Case Histories of Current Signature Analysis to Detect Faults in Induction Motor Drives, IEEE International Electric Machines and Drives Conference (IEMDC'03), Vol. 3, pp. 1459-1465.
7 Kliman, G. B. and Stein, J. (1990) Induction Motor Fault Detection Via Passive Motor Current - A Brief Survey, Proc. 44th Meeting of the Mechanical Failures Prevention Group, Virginia Beach, Virginia, pp. 49-65.
8 Thomson, W. T., Chalmers, S. J. and Rankin, D. (1987) An On-line Computer Based Current Monitoring System for Rotor Fault Diagnosis in 3-Phase Induction Motors, Turbomachinery International, pp. 17-24.
9 Li, W. and Mechefske, C. (2004) Induction Motor Fault Detection Using Vibration and Stator Current Methods, Insight - Non-Destructive Testing and Condition Monitoring, Vol. 46, Issue 8, pp. 473-478, DOI 10.1784/insi.46.8.473.39379.
10 Fenger, M. and Lloyd, B. A., Case Histories of Current Signature Analysis to Detect Faults in Induction Motor Drives, Iris Power Engineering Inc.
11 Kar, C. and Mohanty, A. R. (2006) Monitoring Gear Vibrations Through Motor Current Signature Analysis and Wavelet Transform, Mechanical Systems and Signal Processing, Vol. 20, Issue 1, pp. 158-187.
12 Saadaoui, W. and Jelassi, K., Gearbox-Induction Machine Bearing Fault Diagnosis Using Spectral Analysis, Electrical Systems Laboratory (LSE), National School of Engineering of Tunis (ENIT), Tunis.
13 Payne, B. S. (2003) Condition Monitoring of Electrical Motors for Improved Asset Management, thesis submitted to the University of Manchester for the degree of Doctor of Engineering in the Faculty of Science and Engineering.
14 Tavner, P. and Penman, J. (1987) Condition Monitoring of Electrical Machines, John Wiley & Sons Inc.
15 SKF Reliability Systems (1999) Overlapping and Windows, CM3012, SKF Reliability Systems, 4141 Ruffin Road, San Diego, California 92123, USA.
16 Agilent Technologies (2000) The Fundamentals of Signal Analysis, Application Note 243, Agilent Technologies Inc., 5301 Stevens Creek Blvd, Santa Clara, CA 95051, United States.
17 Kliman, G. B. and Stein, J. (1995) Induction Motor Fault Detection via Passive Current Monitoring, pp. 13-17.
18 Rgeai, M., Gu, F. and Ball, A. (2004) Downstream Mechanical Fault Detection Using Motor Current Signature Analysis (MCSA), The 9th Mechatronics Forum International Conference, Culture & Convention Centre, METU, Ankara, Turkey.
19 Hurst, K. D. and Habetler, T. G. (1996) Sensorless Speed Measurement Using Current Harmonic Spectral Estimation in Induction Machine Drives, IEEE Transactions on Power Electronics, Vol. 11, No. 1.
DATA QUALITY ENHANCED ASSET MANAGEMENT METADATA MODEL
Jing Gao a, Andy Koronios b, Steve Kennett c and Halina Scott d
a School of Computer and Information Science, University of South Australia, GPO Box 2471, SA 5001, Australia.
b School of Computer and Information Science, University of South Australia, GPO Box 2471, SA 5001, Australia.
c Maritime Platforms Division, Defence Science and Technology Organisation, Melbourne, Australia.
d Strategy & Research, Logistics Management Group, Defence Materiel Organisation, Defence Plaza Sydney, Australia.
Researchers have indicated that maintaining the quality of data is often acknowledged as problematic, but is also seen as critical to effective decision-making in engineering asset management (AM). The development of metadata standards is considered an effective approach to addressing various data quality issues. Our literature review shows that there has been little study on the development of metadata standards for engineering asset management. Thus, this research proposes a preliminary EAM metadata model as a result of a study of various related mature metadata standards, with a strong focus on data quality assurance. It is believed that this model will provide useful contributions to generic or organisation-specific metadata standard development in engineering asset management organisations.
Key Words: metadata, data quality and engineering asset
1
INTRODUCTION
The development of modern Enterprise Asset Management (EAM) systems significantly advances the capability of asset lifecycle management. With the widespread implementation of ICT infrastructure, engineering asset management organisations have shifted their focus from handling text-based data records to a variety of multimedia data, including audio data (such as machine acoustic readings) and visual data (such as remote monitoring through video surveillance). However, the vast amount of data and information generated and stored in current asset management systems presents critical challenges to ongoing data quality assurance requirements.
Researchers have indicated that maintaining the quality of data is often acknowledged as problematic, but is also seen as critical to effective decision-making in engineering asset management (AM). As a response to this issue, the development of metadata standards has become increasingly prevalent. According to Taylor (2003), metadata is defined as structured data which describes, explains, locates, or makes it easier to retrieve, use, or manage an information resource. In other words, metadata is the “data about data” or “information about information” (NISO 2004). A metadata record contains a certain number of pre-defined elements that stand for particular attributes of a resource, and those elements can have single or multiple values, as shown in Table 1.
Element name | Value
Title | Web catalogue
Creator | Dagnija McAuliffe
Publisher | University of Queensland Library
Identifier | http://www.library.uq.edu.au/iad/mainmenu.html
Format | Text/html
Relation | Library Web site

Table 1: An example of a simple metadata record. Source: Taylor (2003)
Our literature review shows that, while the study of metadata standards is a vibrant research area, there has been little study on the development of metadata standards for engineering asset management. Perhaps the closest finding is the MIMOSA standard, which provides metadata reference libraries and a series of information exchange standards using XML and SQL (MIMOSA 2009). However, it must be noted that this metadata reference library is specifically designed for the implementation of the MIMOSA standard.
This research therefore tries to develop a preliminary generic metadata model for engineering asset management with a specific focus on data quality assurance. This proposed metadata model will lead to the development of customer-tailored engineering asset management standards, which would enable engineers to understand the value and completeness of the information and to make the best engineering decisions with confidence.
2 METADATA MODELS / STANDARDS REVIEW

The researchers have studied six categories of metadata standards which have strong influences on, or overlaps with, engineering asset management, including:

• Digital asset / library metadata
• Geospatial metadata
• Document and File system metadata
• Database and Data Warehouse metadata
• Multimedia metadata
• Other minor metadata standards
Based on the accumulated knowledge, a preliminary engineering asset management metadata model is proposed. This proposed metadata model has highlighted the data quality assurance requirements for engineering asset management.
2.1 Digital asset / library metadata

The digital asset/library metadata standard is the most popular and earliest metadata standard, developed to manage text records in libraries. Through decades of development, digital asset management metadata standards have come to cover all the basic metadata elements, especially the navigation function for object hierarchies. According to the Oxford Digital Library (2008), digital library metadata can be classified in three categories:

• Descriptive digital library metadata - Information describing the intellectual content of the object, such as MARC cataloguing records, finding aids or similar schemes. It is typically used for bibliographic purposes and for search and retrieval.
• Structural digital library metadata - Information that ties each object to others to make up logical units (e.g., information that relates individual images of pages from a book to the others that make up the book).
• Administrative digital library metadata - Information used to manage the object or control access to it. This may include information on how it was scanned, its storage format, copyright and licensing information, and information necessary for the long-term preservation of the digital objects.
The most famous and widely used schema in this domain is the Dublin Core metadata set (DC) (IFLANET 2008). The Dublin Core metadata element set was developed from 1995 as a response to the need to improve retrieval of information resources, especially on the World Wide Web (WWW). It is sponsored by OCLC and the National Center for Supercomputing Applications (NCSA) and is being developed as a generic metadata standard for use by libraries, archives, government and other publishers of online information (NISO 2004). The DC standard comprises fifteen elements (as shown in Table 2).
Content & about the Resource | Intellectual Property | Electronic or Physical manifestation
Title | Author or Creator | Date
Subject | Publisher | Type
Description | Contributor | Format
Source | Rights | Identifier
Language | |
Relation | |
Coverage | |

Table 2: The fundamental elements of Dublin Core. Source: The University of Queensland Library (2008)
Figure 1 shows an abstract model for Dublin Core metadata; it describes the components and constructs used in Dublin Core metadata, defines their nature, and describes how they are combined to create information structures.
Figure 1: DCMI resource model Source: Dublin Core Metadata Initiative (2008)
As the most widespread and accepted standard amongst the resource discovery community and the de facto Internet metadata standard, the Dublin Core metadata schema offers the advantages of being flexible and easy to understand; in particular, it can be extended to meet the demands of more specialised communities by allowing the addition of elements for site-specific purposes or specialised fields.
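To make the element set concrete, the short sketch below expresses the simple record of Table 1 using Dublin Core element names. It is only an illustration: the helper function and the XML serialisation are assumptions, while the namespace URI is the standard DC 1.1 element namespace.

```python
import xml.etree.ElementTree as ET

# Dublin Core 1.1 element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

def build_dc_record(fields: dict) -> str:
    """Serialise a dictionary of Dublin Core elements as a small XML record."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")

# Values taken from the record shown in Table 1.
print(build_dc_record({
    "title": "Web catalogue",
    "creator": "Dagnija McAuliffe",
    "publisher": "University of Queensland Library",
    "identifier": "http://www.library.uq.edu.au/iad/mainmenu.html",
    "format": "Text/html",
    "relation": "Library Web site",
}))
```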
2.2 Geospatial metadata

The Geographic Information System (GIS) plays a vital role in modern AM organisations, such as utilities and transportation companies. Geospatial metadata standards are used to document geographic digital resources such as GIS map files, geospatial databases, and earth imagery. They consistently describe datasets - collections of spatial data - and thereby enable the effective management and availability of that data.
Government authorities are heavily involved in metadata standard development in this area. For example, the U.S. Federal Geographic Data Committee (2008) indicates that a geospatial metadata record must include core library catalogue elements such as Title, Abstract, and Publication Date; geographic elements such as Geographic Extent and Projection Information; and database elements such as Attribute Label Definitions and Attribute Domain Values. Generally speaking, geospatial dataset descriptions are composed of well-defined metadata elements which have a specific structure and order. The dataset descriptions are maintained as documents in the Standard Generalized Markup Language (SGML) (Batcheller 2007).
A number of popular metadata standards are found in this area, including the U.S. Federal Geographic Data Committee (FGDC) standard, the Global Change Master Directory (GCMD) and the Directory Interchange Format (DIF). Among these geographic metadata catalogues, the most widely adopted standard is the FGDC. It must also be pointed out that the FGDC standard explicitly addresses data quality requirements, as shown in Figure 2:
Figure 2: CSDGM standard sections Source: Federal Geographic Data Committee (2008)
2.3 Document and File system metadata

Like all information systems, EAM systems constantly interact with different database management systems and create documents as one of their describable outputs. In practice, the most popular office tools, including Microsoft SharePoint, Microsoft Word and other Office products, all save metadata with document files to record some fundamental information about the documents. This metadata can contain the name of the person who created the file, the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made to the file, and so on. Other saved material, such as deleted text saved in case of an undelete command, document comments and the like, is also commonly known as document metadata (Metadata 2008).
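As an illustration of the document metadata described above, the fragment below reads a few core properties from a Word file. It assumes the third-party python-docx package and a hypothetical file name; it is not part of the proposed model.

```python
# Minimal sketch: inspect the core document metadata saved with an Office file.
# Assumes the third-party python-docx package; the file name is hypothetical.
from docx import Document

doc = Document("pump_overhaul_report.docx")
props = doc.core_properties
print("created by:      ", props.author)
print("last modified by:", props.last_modified_by)
print("revision count:  ", props.revision)
print("last modified:   ", props.modified)
```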
Documents are stored in directories through a function managed by the file system (usually an integrated part of the operating system). Some file systems keep metadata in directory entries; others keep it in specialised structures or even in the name of a file (Metadata 2008). Generally, the scope of file system metadata covers the size, age, type and frequency of files; the size of directories; the structure of the file-system namespace; and various characteristics of file systems including file and directory population, storage capacity, storage consumption, and degree of file modification (Agrawal et al. 2007). Since file systems are usually integrated parts of operating systems, the related metadata standards are not discussed here.
The Text Encoding Initiative (TEI) (NISO 2004) is considered as a major metadata standard in digital documentation. The Text Encoding Initiative is an international project to develop guidelines for marking up electronic texts such as novels, plays, and poetry, primarily to support research in the humanities. In addition to specifying how to encode the text of a work, the TEI Guidelines for Electronic Text Encoding and Interchange also specify a header portion, embedded in the resource, which consists of metadata about the work. The TEI header, like the rest of the TEI, is defined as an SGML DTD - a set of tags and rules defined in SGML syntax that describe the structure and elements of a document (NISO 2004).
2.4 Database and Data Warehouse metadata

The data warehouse has become an important initiative in the majority of EAM organisations. A data warehouse is a collection of integrated data originating from many data sources with various data models and a heterogeneous structure; it aims at providing and managing a set of integrated data for business decision support within an organisation (Vaduva & Dittrich 2001). The complexity of a data warehouse environment grows with the number of data sources, their heterogeneity, the diversity of programs and tools for loading the data warehouse, and the number of applications using it. As pointed out by Vaduva & Dittrich (2001), an appropriate metadata standard helps users overcome this complexity.
The Meta Content Framework (MCF) is an example of a metadata schema used with structured databases and found to be popular in many data warehouse initiatives. It provides a system for representing a wide range of information about content. The MCF is not intended to be an extension of markup languages such as HTML, which can be used to hold embedded metadata. Instead, it provides a format for holding the metadata externally to the content described. It is possible that metadata embedded in content will be extracted automatically by robots that use the MCF to represent the results of their activities. The MCF should be able to represent the metadata that proposals such as the Dublin Core aim to cover (IFLANET 2008).
2.5 Multimedia metadata

As mentioned in the introduction, modern AM organisations are dealing with a large volume of data and document objects of various types. Especially in engineering asset management, audio (e.g. acoustic readings) and visual (e.g. engineering drawings) data are important sources of information for asset maintenance. However, effective methods for managing these multimedia data objects are still under research. The study by Smith & Schirling (2006) introduces the concept of multimedia life-cycle management (Figure 3). The authors suggest that multimedia resources typically have an extended life cycle, covering a series of distinct processes and workflows, which builds throughout each object's lifetime. Each process normally generates extensive information covering content acquisition and creation, production, management, delivery, search, etc. This information can be captured as metadata that is either stored in database systems in records associated with the multimedia resources, or carried along with the multimedia resources by integrating the metadata within the multimedia data objects.
Figure 3: Multimedia life cycle and the central role of metadata Source: Smith & Schirling (2006)
With respect to the individual multimedia data object, a number of mature metadata standards can be identified, including DMS-1, P/Meta and IPTC. Among these standards, MPEG has been a major player for streaming video data. The ISO/IEC Moving Picture Experts Group (MPEG) has developed a suite of standards for coded representation of digital audio and video. According to NISO (2004), there are two standards which address metadata: MPEG-7 and MPEG-21.
MPEG-7 defines the metadata elements, structure, and relationships that are used to describe audiovisual objects including still pictures, graphics, 3D models, music, audio, speech, video, or multimedia collections. MPEG-21 was developed to address the need for an overarching framework to ensure interoperability of digital multimedia objects, with additional elements such as Digital Item Declaration, Digital Item Identification, and Intellectual Property Management and Protection.
2.6 Other minor metadata

In addition to the above five main categories, there are many other relevant metadata standards that can be adopted for the development of EAM metadata standards. For example, business intelligence is a key concept and plays a crucial role throughout enterprise business processes, especially for AM organisations. Business intelligence metadata standards, such as the Microsoft SQL Server (2005) BI metadata, describe how data can be queried, filtered, analysed, and displayed in business intelligence software tools such as reporting tools, OLAP tools, data mining tools and so on.
3 DATA QUALITY ENHANCED ASSET MANAGEMENT METADATA MODEL
The development of metadata standards is an expensive but valuable task in AM organisations, and the development process requires rigorous consideration of many aspects. With an understanding of the above metadata standards, useful experience can be borrowed to assist the development of an engineering asset management metadata standard. With respect to the data quality dimensions highlighted by Wang and Strong (1996) and Shanks and Darke (1998), this paper proposes an engineering asset management metadata model (as shown in Table 3), which provides guidance for actual standard development. This model lists a set of metadata elements, including both descriptive and administrative metadata, that can be used to manage corporate engineering assets in practice.

This metadata proposal includes eight main categories - Core Data, Data Subject, Data Record, Data Status, Data Access, Data Quality, Data Contact and Optional Metadata. The proposed structure is mainly based on multiple current mature metadata schemas, such as the Dublin Core Metadata Initiative (DCMI), the Content Standard for Digital Geospatial Metadata (CSDGM), the Global Change Master Directory (GCMD), and the IAFA/WHOIS++ templates. Each of these developed metadata standards is designed to focus on a particular metadata area and uses further techniques to support it in providing better services in its corresponding domain.
Category: Core Data
Element | Description | Importance
Title | A meaningful name of the created data. | A meaningful name allows easy identification and avoids confusion.
Creator | The people, machine or device primarily responsible for creating data. | –
Source | – | Due to the magnitude of components in the asset management system, it is important to identify which entity is the source of, or is responsible for, the created data.
Jurisdiction | The authority and authentication of the entity which is responsible for data creation. | Refers to the privacy of the asset management system; an element should reflect the entity which can access or edit the created data.

Category: Data Subject
Element | Description | Importance
Abstract | A brief description of the contents of the created data. | It is important to give a clear explanation of every created data item, to promote better understanding.
Keyword | The word that is likely to be used by people to look for data. | Because of the large amount of data created by the asset management system, it is vital to set keywords for each data item to increase search efficiency.
Location | The common location or geospatial coordinates that can be used to detect the data generating entity. | Because the asset management system has a wide geospatial distribution, recording data locations is significant, so people can detect, maintain or upgrade the related data generating devices.

Category: Data Record
Element | Description | Importance
Date Stamp | The date that records for the data were created. | Enables logging and tracking.
Beginning date | The earliest date of data in the dataset. | As created data normally form a dataset, tracing the dataset start and end times can effectively record asset management system status changes.
Ending date | The last date of information in the dataset. | (as above)

Category: Data Status
Element | Description | Importance
Progress | The status of the process for the data, such as normal, maintenance, error or malfunction, etc. | Identifying data status for the asset management system helps adopt corresponding disposals.

Category: Data Access
Element | Description | Importance
Access restriction | Any constraints or legal prerequisites applying to the use of the data, such as licensing information. | It is important to set diverse data access privileges for different data, and to alert people to any data retrieval conditions.
Available data types | The formats in which data are available to be stored. | Listing all available data types in the asset management system database provides valuable information for people to identify the database application range for particular data.
Stored data type | The format of the data which is stored in the database by entities. | Helps people identify the data format in the asset management system database.
Update frequency | The frequency of data creation or changes. | Contributes to people judging the real-time level of the data.

Category: Data Quality
Element | Description | Importance
Data accuracy | A brief evaluation of how accurately the created data reflect the real situation, such as the closeness of the location of geospatial objects in the dataset to their true position on the Earth, or the reliability assigned to features in the dataset in relation to their real-world values. | It is critical for the asset management system that the data generated by devices closely reflect the real situation. A data accuracy element can evaluate and display the accuracy of created data, so that people can make the right decision for their current circumstances.
Data perfectibility | A brief assessment of the completeness of coverage, classification and verification. | Giving detailed data assessments in the metadata enables a better reflection of the structural development and maturity of an asset management system.
Data lineage | A brief history of the source and processing steps used to produce the data. | Some large asset management systems may generate data through complicated steps; it is therefore essential to describe this data process to help people understand more about the system.

Category: Data Contact
Element | Description | Importance
Data responsibility party | The name of the departments or organisations which are responsible for analysing or treating the generated data. | It is important to place the entity contact information in the asset management metadata for further contact with the entity responsible for creating the data.
Contact address | The relevant address information for contact. | (as above)
Telephone No. | The telephone number for contact. | (as above)
E-mail | The email address for contact. | (as above)
Facsimile No. | The facsimile number for contact. | (as above)

Category: External Metadata
Element | Description | Importance
Additional metadata | Reference to other directories or systems containing further information about the data. | To reflect the data more comprehensively, an asset management system may refer to related metadata; it is therefore important to add an element to record the related external metadata used.

The elements draw on several source schemas, including DCMI, FGDC, DIF, MCF and MPEG.

Table 3: Asset management metadata schema
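A minimal sketch of how a subset of the proposed model could be carried in practice is shown below; the class, field names and the completeness measure are illustrative assumptions rather than part of the proposed standard, but they indicate how the Data Quality category can be made operational.

```python
from dataclasses import dataclass, field, fields

# Illustrative subset of the Table 3 elements; field names are assumptions.
@dataclass
class EAMMetadataRecord:
    # Core Data
    title: str = ""
    creator: str = ""
    jurisdiction: str = ""
    # Data Subject
    abstract: str = ""
    keywords: list[str] = field(default_factory=list)
    location: str = ""
    # Data Record / Data Status
    date_stamp: str = ""
    progress: str = ""
    # Data Quality
    data_accuracy: str = ""
    data_lineage: str = ""

    def completeness(self) -> float:
        """Fraction of elements populated - a simple data quality indicator."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(1 for v in values if v) / len(values)

record = EAMMetadataRecord(title="Pump P-101 vibration dataset",
                           creator="Condition monitoring system",
                           date_stamp="2009-06-01",
                           keywords=["vibration", "pump"])
print(f"completeness = {record.completeness():.0%}")
```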
4
CONCLUSION
The development and application of metadata standards supports various major initiatives in EAM organizations such as Data warehousing, Enterprise Application Integration and so on. The success of these initiatives and the business operations relies on the ability to extract value from the existing data repository with confidence. Due to the lack of engineering asset management specific metadata standards, EAM organizations are facing critical challenges in their information management processes and are suffering from ongoing data quality problems.
As a response to the above problem, this paper has provided a preliminary EAM metadata model as a result of a study of various related mature metadata standards. Although this model does not provide low-level details or technical elements (such as XML schemas), it addresses all the important data elements with a strong focus on data quality assurance. It is believed that this model will provide a useful contribution to actual generic or organisation-specific metadata standard development in engineering asset management organisations. Additionally, the model provides essential criteria for assessing data risks in these organisations.
5
REFERENCES
1.
Agrawal, N, Bolosky, WJ, Douceur, JR & Lorch, JR 2007, ‘A five-year study of file-system metadata’, ACM Transactions on Storage (TOS), Vol. 3, Issue 3, Article 9.
2.
Bretherton, FP & Singley, PT 1994, ‘Metadata: A User's View’, Proceedings of the International Conference on Very Large Data Bases (VLDB), pp. 1091-1094.
3.
Batcheller, JK 2007, ‘Automating geospatial metadata generation—An integrated data management and documentation approach’, Computers & Geosciences, Vol. 34, pp. 387-398.
4.
Cathro, W 1997, ‘Metadata: an overview’, Applied Services to Libraries Division at Standards Australia Seminar “Matching Discovery and Recovery”, August, National Library of Australia, viewed 28 February 2008, .
5.
Dublin Core Metadata Initiative 2008, ‘DCMI Abstract
6.
Federal Geographic Data Committee 2008, Federal Geographic Data Committee, Reston, Virginia, USA, viewed 10 March 2008,
7.
Foshay, N, Mukherjee, A & Taylor, A 2007, ‘Does data warehouse end-user metadata add value?’, Communications of the ACM, Vol. 50, Issue 11, pp. 70-77.
8.
Hay, DC 2006, Data Model Patterns: A Meta Map, Morgan Kaufmam, San Fransisco, USA.
9.
IFLANET 2008, Digital Libraries: .
Metadata
Model’,
Resources,
viewed
Viewed
2
5
March
March
2008,
2008,
URL:
URL:
10. Jagadish, HV, Chapman, A, Elkiss, A, Jayapandian, M, Li, YY, Nandi, A & Yu, C 2007, ‘Making database systems usable’, in Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, Beijing, China.
11. Kimball, R 1998, The Data Warehouse Lifecycle Toolkit, Wiley, New York, USA.
12. Lagoze, C 1996, ‘The Warwick Framework: A Container Architecture for Diverse Sets of Metadata’, D-Lib Magazine 7/8.
13. Metadata 2008, Wikipedia, viewed 22 February 2008, URL: http://en.wikipedia.org/wiki/Metadata#cite_ref-3
14. Natu, S & Mendonca, J 2003, ‘Digital asset management using a native XML database implementation’, in Proceedings of the 4th Conference on Information Technology Curriculum, Lafayette, Indiana, USA, pp. 237-24.
15. NISO 2004, ‘Understanding Metadata’, NISO Press, viewed 1 March 2008.
16. Oxford Digital Library 2008, Metadata in the Oxford Digital Library, viewed 2 March 2008.
17. Parsian, M 2006, JDBC Metadata, MySQL, and Oracle Recipes: A Problem-Solution Approach, Apress, Berkeley, USA.
18. Smith, JR & Schirling, P 2006, ‘Metadata standards roundup’, IEEE Multimedia, Vol. 13, Issue 2, pp. 84-88.
19. Stock, I, Weber, M & Steinmeier, E 2005, ‘Metadata based authoring for technical documentation’, in Proceedings of the 23rd Annual International Conference on Design of Communication: Documenting & Designing for Pervasive Information, Coventry, UK, pp. 60-67.
20. The University of Queensland Library 2008, An Introduction of Metadata, viewed 25 February 2008.
21. Tsunakawa, M, Konishi, F & Nakanishi, T 2004, ‘Media asset management (MAM) system for efficient content management using metadata’, NTT Technical Review, Vol. 2, Issue 9, pp. 62-67.
22. Vaduva, A & Dittrich, KR 2001, ‘Metadata management for data warehousing: between vision and reality’, in Proceedings of the 2001 International Symposium on Database Engineering & Applications, Grenoble, France, pp. 129-135.
23. Wang, XH, Wang, S & Wei, W 2005, ‘Study on remote sensing image metadata management and issue’, in Proceedings of the IEEE 2005 International Geoscience and Remote Sensing Symposium, Vol. 1, pp. 612-615.
24. Wang, RY & Strong, DM 1996, ‘Beyond Accuracy: What Data Quality Means to Data Consumers’, Journal of Management Information Systems, 12(4), pp. 5-33.
25. Shanks, G & Darke, P 1998, ‘Understanding data quality in a data warehouse’, Australian Computer Journal, 30(4), pp. 122-128.
Acknowledgments Cooperative Research Centre of Integrated Engineering Asset Management (CIEAM), Australia
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A SERVICE ORIENTED ARCHITECTURE FOR DATA INTEGRATION IN ASSET MANAGEMENT
Georg Grossmann a, Markus Stumptner a, Wolfgang Mayer a, and Matt Barlow b
a University of South Australia, Advanced Computing Research Centre, Mawson Lakes, SA, 5095, Australia.
b Australian Nuclear Science and Technology Organisation (ANSTO), PMB 1 Menai NSW 2234
The integration of data plays a crucial role in condition monitoring and active data warehousing. It can be classified into horizontal and vertical integration, which capture different integration scenarios. Horizontal scenarios deal with applications providing complementary functionality, whereas vertical scenarios deal with the integration of applications on different abstraction levels. Processing data for effective decision support in condition monitoring is usually performed by different software applications that are integrated in a common business process. In order to execute the business process, data must be exported from one application and imported into another. However, due to heterogeneous underlying data models, the data export and import is not straightforward and a translation of data from one representation into another is required. We propose a Service Oriented Architecture (SOA) based on Web services that allows seamless integration of asset management tools by providing a common architecture. It supports vertical and horizontal integration and enables new tools to be plugged in without interfering with a running environment.
Key Words: Integration, Service Oriented Architecture, Condition Monitoring, Asset Health Management

1 INTRODUCTION
Condition monitoring plays an important role in asset health management. In order to maintain assets cost-effectively and to minimize the interruption caused by maintenance, monitoring the operation of assets with respect to operational limits and conditions is crucial [1]. Trend analysis techniques applied to recorded values make it possible to predict failures and to determine when to conduct a maintenance process. Decision Support Systems (DSS) implement prediction techniques that need to consider various factors, such as the relation between the maintenance costs of an asset in various conditions and the costs caused by non-operational times. Such systems must therefore be integrated with other systems to gain access to relevant data. As pointed out by Mathew and Gregory [5,10], information systems, their integration and data management are among the key challenges in asset management. In general, there is a need for integrating systems in two dimensions. Asset health management usually involves multiple asset management tools and DSSs specialized in a particular area, such as resource planning or risk management, that need to be integrated in a decision making process at a high level. This means that results from various DSS systems are aligned without considering the underlying process which has led to a specific result. On the other hand, each DSS requires integration with asset management tools located on a lower representation level, such as sensor reading systems. Integration on the same level and on different levels is known as horizontal and vertical integration, respectively, and is discussed in Section 3 [2,6]. The rapid development of new applications demands a highly dynamic integration component that takes care of adding systems to and removing systems from an environment. Recently, the IT industry has recognized asset management tools as a growing market, which is reflected by the comprehensive list of currently available plant maintenance applications (see also http://www.plant-maintenance.com). In the nineties, Enterprise Resource Planning (ERP) systems emerged to overcome the limits of isolated systems. They provide an integrated database and functionality that spans large parts of an organization. With the growth of enterprises and new market requirements, driven by new customer needs around the year 2000, the demand for additional functionality arose, and new types of software systems entered the market. These systems, such as risk-, supply chain-, and customer relationship management systems, were established in addition to the existing ERP systems and use their
own local databases. The integration usually takes place on a high abstraction level because data from sub-systems needs to be adapted before being transferred to another system. For example, noise needs to be filtered from sensor reading values [7,18]. This has led to the problems of how to access information from existing external systems and how to integrate the data of old and new systems. IT vendors of ERP systems such as SAP have started to target the information access problem by providing an interface to their products using Web service technology, but the integration remains a challenging problem. Web services in combination with a Service Oriented Architecture (SOA) offer a standard way of accessing data and are discussed in Section 4. In the past, companies invested considerable resources in the integration of so-called legacy applications [8]. Such applications had been developed at a time when future exchange of data was often not considered and no interface for accessing information was implemented. In order to solve this problem, data access and its integration with other systems was implemented at a later stage without separating data access from integration. Although communication between systems could be established, this solution is not optimal: it represents a hard-coded, static solution buried in the code of the systems. As a result, the applied integration techniques could not be re-used for other scenarios because each directed communication was realized individually, as shown on the left-hand side of Figure 1. If changes were applied to one system, then all individual implementations had to be adapted as well.
Figure 1: Comparison of individual integration versus standards-based integration.
A state-of-the-art integration solution must be flexible in allowing changes within systems and within the asset health management environment without adapting the implementation. Adding or removing systems should be supported by plug-in-like functionality provided by an independent component. Changes within a system should either be shielded from the environment or propagated to already integrated systems so they can be adapted almost automatically. Standards play a supportive role in achieving such flexibility [3,4]. On the communication level, SOA and the often mentioned Enterprise Service Bus (ESB) are increasingly replacing middleware standards such as CORBA and are well accepted in industry and research. On the data level, a common structure for data representation is required which is expressive enough to capture all sorts of data used in an environment. Standards for data exchange in asset management already exist, for example MIMOSA and ISO 15926 [3,4,9], which can serve as a central integration component acting between systems, as shown on the right-hand side of Figure 1. In this paper, we focus on the communication level and present an architecture for a service oriented integration of asset management tools. The following section presents a use case scenario of a reactor plant management environment which we are currently investigating. Section 3 discusses horizontal and vertical integration as applied in the use case, Section 4 discusses middleware technologies such as SOA and EDA, and Section 5 proposes an architecture based on Web services for the use case.
2 USE CASE
In this paper we address a use case scenario of a power plant management environment. It consists of five systems which include (1) an embedded sensor reading system, (2) a data filtering system, (3) a field data collection system using personal digital assistants (PDAs), (4) an enterprise resource planning system (ERP), and (5) a decision support system (DSS). All systems except the DSS communicate with each other as shown in Figure 2. The DSS was introduced recently because it
provides functionality that the other systems do not include, for example, prediction of asset health and decision support in asset maintenance. However, the DSS is not yet integrated with the remaining systems, which is the main challenge of the project.
Sensor reading system: Many sensors are installed on site and collect measurements from various assets, such as temperature and vibration of pumps in cooling systems. The values are stored in a text file in the form of comma-separated values (CSV) that can be read by the data filtering and conversion system.
Data filtering and conversion system: A data filtering and conversion system reads values from a CSV file, selects specific values and normalizes the data. This makes it possible to select specific assets and to reduce noise. The output of the system is passed on as a separate file to an ERP system.
Enterprise resource planning system: An ERP system administers operational limits, work orders, human resources, and assets. It can be regarded as a central storage component for asset management data because it contains most of the data required for condition monitoring and maintenance. Therefore, all systems are integrated with the ERP system, where the integration was implemented on the ERP side, as shown by the components within the ERP in Figure 2. Hence, whenever a new ERP interface for a system was required, it was implemented manually as a new component within the ERP system. Currently, the ERP system provides an import interface for the data filtering and conversion and the field data collection systems, as well as an export interface to the condition monitoring and analysis tool.
Field data collection: Apart from sensor reading, data is collected manually from the assets during maintenance. Engineers on site record information such as hour meter readings or fuel levels of diesel generators on PDAs. Data is transferred from the PDAs to the ERP system after the maintenance process using the import interface in the ERP.
Condition monitoring and analysis tool: A condition monitoring and analysis tool provides functionality for the visualisation of data. It reads data that is exported from the ERP system and plots the information in the form of graphs. Information about the different assets, operational limits, and work order completion can be overlaid in the same graph, which allows visual comparison.
Decision support system: A decision support system was developed that provides functionality the other systems do not. Its functions include prediction of maintenance costs in comparison to operational costs and decision support for conducting asset maintenance so that the costs involved in the maintenance process and non-operative times are minimized. For calculating predictions, data from different sources is required, for example, sensor data for deriving current asset conditions, and maintenance costs and human resource availability from the ERP. Therefore, the DSS needs to be integrated with the remaining systems in the environment.
A desired integration solution for the DSS aims at the following goals:
• The main goal is the integration of the DSS with the remaining systems, i.e., accessing data from sensor reading, the ERP, and the condition monitoring and analysis tool.
• An integration solution should be re-usable for future systems introduced to the environment.
• A solution should cover the integration of the existing systems and the ERP and replace the integration components within the ERP system. This offers the advantage that the integration is centralized and the environment becomes flexible to changes.
• A solution needs to support horizontal and vertical integration. Both can be found in the use case and are discussed in the next section.
Figure 2: Use case showing an asset health management environment for a power plant.
3 INTEGRATION SCENARIOS
The integration of software applications can be classified as Enterprise Application Integration (EAI) and business-to-business integration (B2B integration). EAI is concerned with the integration of software applications within an organization, whereas B2B integration deals with the exchange of electronic documents between organizations [11,12,13]. Both share some capabilities:
• Business processes are used for modeling the sequence of activity execution [30,31].
• Routing rules are applied for defining the data exchange between two systems.
• System interfaces provide the basis for data exchange.
However, EAI and B2B differ in their focus and requirements. EAI software provides the infrastructure to rapidly connect and interface between an organization’s internal applications. B2B can be regarded as an extension of EAI that integrates an organization’s applications with the applications of its partners. In this paper we focus on EAI because asset management tools usually exchange data within an organization. Integrating systems is usually driven by a business goal; three possible goals can be observed [14,15]:
1. Systems with similar functionality may be merged.
2. Complementary systems may be composed to gain new functionality.
3. Existing systems may be customized with new features.
Merging systems with similar functionality is an important issue in preserving data quality. If redundant information is distributed over several systems and their integration is not considered, then there is a high risk that the information becomes inconsistent over time; for example, redundant data may be changed in one system but not in the other. The second goal, composing systems to gain new functionality, is the main reason for integrating existing systems in our use case. A new decision support system is introduced that requires data from sensors and the ERP system. Without such an integration, prediction of asset health conditions cannot be achieved accurately.
The third goal, customizing existing systems, is similar to the second goal. New functionality is introduced to the environment, but in this case it affects only one system. A major reason for customization is to ease the integration with other systems. For example, data extracted from sensor readings is filtered first and then transferred to an ERP system. Customization may be implemented within a system as an extension or as a separate component that can be re-used in combination with other systems. In our use case, such an extension is implemented as a separate component. An integration of two systems can be established in two dimensions, depending on the relationship of the systems. The dimensions are referred to as horizontal and vertical integration, known from organizational integration [2].
3.1 Horizontal Integration
Horizontal integration incorporates different systems that provide complementary functionality required to reach a certain business goal. Horizontal integration can be established by implementing a distributed business process that orchestrates activities executed in different locations. It defines the specific order in which activities are executed to achieve a business goal.
Example: The Decision Support System (DSS) shown in Figure 2 provides complementary functionality to the ERP system and the condition monitoring and analysis tool. It requires data from both systems and therefore needs to be integrated with both. A distributed business process establishing this integration has to consider the data dependency between the ERP and the analysis tool. If an activity within the decision support system is called that requires data from the analysis tool, then activities in the ERP may need to be executed to transfer data from the ERP to the analysis tool for further processing before the data is transferred to the DSS.
3.2 Vertical Integration
Vertical integration handles the integration of systems located on different abstraction levels. Usually, systems consist of sub-systems, and these sub-systems may in turn consist of sub-sub-systems, and so on. Vertical integration makes it possible to abstract information from a low level and to lift it to a higher level that is appropriate for further processing. By this means unnecessary information is hidden and an overall view of the underlying data is obtained [17].
Example: Sensor systems usually send bulk data in millisecond intervals to a monitoring system. An engineer working with the monitoring system receives only an overview of the asset condition. The monitoring system analyses the trend of values and indicates the presence of values that are close to operational limits, but hides the specific sensor readings.
An important feature of vertical integration is direct access to information on different abstraction levels. A high-level view has the advantage of identifying abnormal conditions quickly, but to find the reasons why a condition is abnormal, an approach is required that allows navigation to the sub-systems in order to localize the cause.
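As a rough illustration of this abstraction step, the sketch below lifts raw sensor readings to a coarse condition indicator; the field names, threshold fraction and limit value are assumptions made for illustration, not taken from the monitoring system described here.

```python
# Minimal sketch: lifting bulk sensor readings (low level) to an asset
# condition summary (high level). Field names and thresholds are assumed
# for illustration only.

def summarise_condition(readings, operational_limit, warn_fraction=0.9):
    """Aggregate raw sensor values into a coarse condition indicator."""
    if not readings:
        return {"condition": "unknown", "samples": 0}
    peak = max(readings)
    mean = sum(readings) / len(readings)
    if peak >= operational_limit:
        condition = "limit exceeded"
    elif peak >= warn_fraction * operational_limit:
        condition = "close to limit"
    else:
        condition = "normal"
    # Only the abstraction is exposed upwards; the raw values stay hidden.
    return {"condition": condition, "mean": round(mean, 2), "samples": len(readings)}

# Example: vibration readings (mm/s) from a cooling pump against an assumed limit of 7.1
print(summarise_condition([2.4, 2.6, 6.9, 3.1], operational_limit=7.1))
```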
3.3 Combining Vertical and Horizontal Integration
The use case shown in Figure 2 contains an integration scenario that requires integration in both dimensions. On the left-hand side of Figure 2, sensor readings and data filtering and conversion are integrated vertically because they are located on different abstraction levels: data from sensors is filtered and pre-processed for other tools. An example of horizontal integration can be found between the remaining systems, where the integration is implemented as import and export interfaces within the ERP system. Horizontal integration is also required for the DSS so that additional functionality can be added to the asset health management system. When looking at the order in which the dimensions are integrated, it can be observed that vertical integration must be considered prior to horizontal integration. The reason is that systems in the vertical dimension are usually tightly coupled, whereas systems in the horizontal dimension are usually loosely coupled.
An essential property of integration is consistency. It is important to ensure that two integrated systems are in a consistent state during run-time. This means that each state of one system always refers to the same set of states of the other system. Schrefl and Stumptner identified consistency criteria for business processes that model the life cycle behavior of systems and defined modeling rules that ensure consistency [16]. This allows the development of consistent software models that can be transformed into executable code. In vertical integration, each system state must be consistent with the sub-system states. The same must hold for the converse view: each sub-system state must be consistent with the system state it belongs to. Consistency is mainly enforced by state change propagation: if a state change occurs, then all related systems need to be informed, which may in turn result in a state change.
Example: If a sensor reading system enters the state “failure”, then the monitoring system it belongs to needs to enter a similar state (e.g., “failure in sensor reading”) in order to inform an engineer about the malfunction.
Consistency plays a similar role in the horizontal dimension. In this case, consistency is defined between a distributed business process, also called a global business process, and the underlying local business processes that model the behavior of the integrated systems, whereas in vertical scenarios consistency is defined directly between the local business processes.
Example: If a distributed business process includes the transfer of data from the ERP system to the analysis tool, and the export from the ERP system fails, then the failure needs to be handled in the distributed business process by a compensation activity.
At run-time, consistency is enforced by informing integrated systems of any state changes. This is enabled by middleware technology, which is discussed in the following section.
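The following minimal sketch illustrates state-change propagation between a sub-system and its parent system in the spirit of the sensor failure example; the state names and the mapping between them are assumptions for illustration only.

```python
# Minimal sketch of state-change propagation between a sub-system and the
# system it belongs to. State names and the state mapping are assumptions
# made for illustration, not part of the architecture described here.

class MonitoringSystem:
    # Maps sub-system states to the corresponding system-level states.
    STATE_MAP = {"failure": "failure in sensor reading", "ok": "operational"}

    def __init__(self):
        self.state = "operational"

    def on_subsystem_state_change(self, subsystem_name, new_state):
        self.state = self.STATE_MAP.get(new_state, self.state)
        print(f"{subsystem_name} -> {new_state}; monitoring system -> {self.state}")


class SensorReadingSystem:
    def __init__(self, name, parent):
        self.name, self.parent, self.state = name, parent, "ok"

    def set_state(self, new_state):
        self.state = new_state
        # Inform the parent system so that both stay in consistent states.
        self.parent.on_subsystem_state_change(self.name, new_state)


monitor = MonitoringSystem()
sensor = SensorReadingSystem("pump vibration sensor", monitor)
sensor.set_state("failure")
```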
4 MIDDLEWARE TECHNOLOGIES
Alonso et al. pointed out that “We are not yet at the point where you can have plug and play application integration [...]” and that the reason for this is a lack of standardization, both at the middleware and at the component level [18]. This has led to research efforts in the area of Service Oriented and Event-Driven Architectures.
4.1 Service Oriented Architecture
Web service standards provide the most promising communication infrastructure for software application integration and enable dynamic interaction. Their advantage lies in making bottom-up designs more efficient, cost-effective, and simpler to design and maintain. Software services are self-contained, platform-agnostic computational elements that support rapid, low-cost and easy composition of loosely-coupled distributed software applications. They are made available by service providers together with a service description that can be accessed and understood by potential service requestors. If services use the Internet as the communication medium and use open Internet-based standards, they are called Web services. Web services constitute a distributed computing infrastructure made up of many different interacting application modules communicating over private or public networks to virtually form a single logical system. The definition of a Web service by the W3C Web Services Architecture Working Group is more specific about the underlying technology: “A Web service is a software application identified by a URI, whose interfaces and bindings are capable of being defined, described, and discovered as XML artifacts. A Web service supports direct interactions with other software agents using XML-based messages exchanged via Internet-based protocols.” Service-oriented architectures (SOAs) are software architectures that provide an environment for describing and finding software services, and for binding to services. Descriptions of software services provide a level of detail that enables service requestors to bind to and invoke them. Currently there are two visions on how to use Web services and SOA for future integration technologies: the Semantic Web and the Service Oriented Computing paradigm.
4.1.1 The Semantic Web
The vision of the Semantic Web (http://www.w3.org/2001/sw) was first articulated by Tim Berners-Lee at the first World Wide Web Conference in 1994. The vision sees the evolution of the Web towards software agents that are able to understand the meaning of data and create connections between data automatically to gain new information [19]. This vision proposes the use of ontologies to capture the semantics of data, similar to resolving semantic heterogeneities, a technique which also receives more and more attention in asset management [20,21]. The National Institute of Standards and Technology (NIST) started the program “Manufacturing Enterprise Integration” with the goal of realizing Tim Berners-Lee’s vision applied to manufacturing businesses. NIST sees the critical success of the Semantic Web in the ability of applications to find one another, to establish a dialogue, to exchange information, and to understand that information, more or less automatically [22]. NIST calls this ability self-integrating and aims to develop standards that enable self-integrating applications. However, current Semantic Web Service technologies like DAML-S and OWL-S are not yet up to this challenge, struggling to address even relatively straightforward matching situations [23,24]. Although the Semantic Web remains largely unrealized in today’s Web, it is believed that it is attainable.
4.1.2 The Service Oriented Computing Paradigm
The Service Oriented Computing (SOC) paradigm as seen by Papazoglou et al. is a more pragmatic approach compared to the Semantic Web.
SOC promotes the idea of assembling application components into a network of services that can be loosely-coupled to create flexible, dynamic business processes [25]. In contrast to the Semantic Web, the SOC paradigm does
not immediately address the automatic integration of services but proposes to tackle integration problems on different abstraction layers and to support integration by combining the solutions developed on each layer. The SOC research road map of Papazoglou et al. outlines an extended SOA and consists of the layers service foundations, service composition, and service management and monitoring [25]. The service foundations comprise basic service capabilities provided by conventional SOA. The Web service and SOA implementation is currently supported by the concept of the enterprise service bus (ESB). The ESB is an open, standards-based message backbone designed to enable the implementation, deployment, and management of SOA-based solutions. Each service is attached to the ESB via a standardized interface. The ESB acts as a centralized component that integrates the applications and hides their heterogeneities. The service composition layer provides advanced service functionality for dynamic service composition. Service composition is realized by service aggregators that become service providers by publishing the service descriptions of the composite services they create. This layer can be used by service aggregators to define a global business process that integrates the behavior of existing systems. The process can be translated into service descriptions and then published at a service registry. The service management and monitoring layer concerns the operational level of a SOA-based application and spans a range of activities, such as installation and configuration, to ensure responsive service execution.
4.2 Event-Driven Architecture
Over the past few years, event-based systems have appeared in many application domains, such as enterprise management systems, large-scale data dissemination, Internet applications and autonomic computing. Event-based computing takes a contrasting approach to the conventional request/reply mode of interaction and inherently de-couples system components. In an event-based mode of interaction, components communicate by generating and receiving event notifications, where an event is any occurrence of interest, typically a state change in some component. The affected component issues a notification describing the observed event. An event notification service or publish/subscribe middleware mediates between the components of an event-based system (EBS) and conveys notifications from producers (or publishers) to consumers (or subscribers) that have registered their interest. The power of an event-driven architecture (EDA) is that neither the published notifications nor the subscriptions are directed toward specific components. In contrast to standard SOA applications, components are fully de-coupled and not just loosely coupled: the event notification service broadcasts an event to all subscribers, which are unknown to the publisher [26]. In view of the above arguments, the use of events is superior to request/reply in some information-driven scenarios. In the past, event-based approaches have been applied in the database community, which led to active databases that follow an event-based style using database triggers expressed in the form of event-condition-action (ECA) rules. The application of ECA rules also found its way into workflow systems, where event notification services are used as building blocks for distributed activity services. The event-driven architecture does not stand in contrast to the service-oriented architecture; several efforts reported in the literature have combined both approaches successfully. The Web services notification (WS-Notification) family of specifications defines a framework for event notification in a Web service environment [27]. In [28,29], we introduce a new concept that combines business process modeling with event modeling. We propose a set of extensions to currently used business process languages for the design of event-based communication between business processes. These extensions abstract from the underlying ECA rules and introduce a modeling primitive that can be used in combination with well-known business process modeling languages and allows the coordination of business events.
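A minimal in-memory sketch of the publish/subscribe interaction described above is given below; a production EDA would use an event notification service or a WS-Notification implementation rather than this toy broker, and the topic name and handlers are invented.

```python
# Minimal in-memory publish/subscribe broker illustrating the de-coupled,
# event-driven mode of interaction. Topic names and handlers are assumptions
# made for illustration only.

from collections import defaultdict

class EventBroker:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, notification):
        # The publisher does not know who the subscribers are.
        for handler in self._subscribers[topic]:
            handler(notification)

broker = EventBroker()
broker.subscribe("asset/condition", lambda n: print("DSS received:", n))
broker.subscribe("asset/condition", lambda n: print("Monitoring UI received:", n))
broker.publish("asset/condition", {"asset": "pump-12", "state": "close to limit"})
```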
4.3 Discussion of SOA and EDA
Web services are becoming the de-facto technology standard for integration. The service oriented architecture provides an environment for Web services and brings many benefits to the integration of software applications. A set of standards for the specification of service interfaces, service registries, and orchestration that can be used over the Internet is available, which facilitates the interoperability of systems. Therefore, a SOA-based environment can be used to implement horizontal and vertical integration in EAI integration scenarios. The conventional SOA also has some disadvantages that can be overcome by extensions. The conventional SOA relies on the request/reply paradigm, which requires static service interfaces to be implemented. This might not be desired in two cases. First, small and medium enterprises cannot invest in building interface and registration specifications for each service that is added, because such an effort would be too costly; they need a light-weight, straightforward set of technologies to build and maintain the service abstraction for client applications. Second, request and reply need to be de-coupled to reflect the real
world’s event-driven nature. In some cases it is more appropriate if a state change is broadcast to interested parties instead of sending a request to each individual system. For example, a monitoring system involves systems that have to react to state changes, especially if a state change indicates an emergency. An event-driven architecture allows flexible addition and removal of systems to and from the monitoring system. An event-driven architecture (EDA) is suitable for horizontal and vertical integration in EAI scenarios where a light-weight solution and the broadcasting of state changes are required. The application of EDA is part of current research, since some challenges such as scalability, performance, and security issues remain unresolved. Another recent effort is the combination of SOA and EDA. Some XML-based standards such as WS-Event, WS-Eventing, and WS-Notification have emerged which allow event-based notification in a Web service environment. Monsieur et al. investigated these standards and found that they still lack support for business event coordination [27]. So far, different integration types and scenarios and their possible implementations have been discussed. The implementation of an integration leads to a specific result that can appear in different forms. In the following section we propose an architecture for the use case scenario that incorporates Web service calls.
5 ARCHITECTURE
We propose a service oriented architecture for the integration of asset management tools and demonstrate it with the use case introduced in Section 2. For simplicity, only horizontal integration is considered here, based on Web service calls. The architecture can easily be extended with application logic for vertical integration and an event publish/subscribe component. For the use case considered here, a Web service environment based on the request/reply paradigm is sufficient. Figure 3 shows the integration of the five components found in the horizontal dimension of the use case shown in Figure 2.
Figure 3: Proposed architecture for the use case shown in Figure 2.
The main difference between the proposed architecture and the current situation shown in Figure 2 is the Integration Component. It provides a generic communication interface for all systems and is responsible for all data transfers. Instead of integrating systems individually, where the implementation disappears inside a system (as shown by the import and export components within the ERP system in Figure 2), each system is connected to the environment by connecting it to the Integration Component. By this means the implementation of the Integration Component can be re-used for newly introduced systems. It offers the advantage of the standards-based integration explained in Section 1, with the difference that it does not rely on a specific standard for data representation. This is enabled by the sub-components of the Integration Component shown in Figure 4. They can be divided into components that are used prior to run-time for developing a data translation and components used during run-time for executing a translation. A translation consists of a mapping between the data structures of a source and a target interface.
Figure 4: Sub-components of the integration component.
5.1 Developing a Mapping Specification
A mapping needs to be created before a system can transfer data to another system. It is defined only once, prior to execution, and includes a specification for the data translation from a source to a target interface, where an interface can be either a Web service description written in an XML-based standard or a database schema of a relational database. In many cases a single mapping can be specified that translates a data structure in both directions between source and target interface. In some cases this is not possible, for example, if the mapping depends on the content of a data structure. If this occurs and a transfer is required in the opposite direction, then a second mapping needs to be created. The two components Interface Import and Mapping provide functionality for developing a mapping. The first component imports an interface either from a file written in the Web Service Description Language (WSDL) or from a database schema of a relational database and saves the interface in the internal storage Interfaces. The Mapping component consumes two interfaces, a source and a target interface, and provides simple and complex mapping specifications that can be set between the two interfaces. For example, a simple mapping specification is a direct mapping from one field to another using the equal operator. A complex mapping specification consists of a set of different operators, such as splitting and merging data fields. All specified mappings are stored in the internal storage component Mappings.
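As a rough sketch of how such a stored mapping specification might look, the snippet below expresses a simple (equal) mapping and complex split/merge mappings as plain data and applies them to a record; the field names and the operator encoding are assumptions, not the format used by the Integration Component.

```python
# Minimal sketch of a mapping specification between a source and a target
# interface, supporting a simple "equal" mapping and complex "split"/"merge"
# operators. Field names are invented for illustration.

mapping_spec = [
    {"op": "equal", "source": "assetId", "target": "asset_id"},
    {"op": "split", "source": "fullName", "sep": " ",
     "targets": ["first_name", "last_name"]},
    {"op": "merge", "sources": ["zip", "suburb"], "sep": ", ",
     "target": "location"},
]

def apply_mapping(record, spec):
    """Translate a source record into the target structure."""
    out = {}
    for rule in spec:
        if rule["op"] == "equal":
            out[rule["target"]] = record[rule["source"]]
        elif rule["op"] == "split":
            parts = record[rule["source"]].split(rule["sep"])
            out.update(dict(zip(rule["targets"], parts)))
        elif rule["op"] == "merge":
            out[rule["target"]] = rule["sep"].join(str(record[s]) for s in rule["sources"])
    return out

source_record = {"assetId": "P-12", "fullName": "William Robert", "zip": "5108", "suburb": "Parahills"}
print(apply_mapping(source_record, mapping_spec))
```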
5.2 Executing a Mapping
The components Service Interface and Service Call are used for executing a translation. The former contains the Web service interface that can be accessed by all systems. It obtains two references to Web services as input parameters and initiates an execution. The input parameters must hold a reference to a source and a target Web service interface that have been previously imported into the internal storage Interfaces. Furthermore, a mapping must exist in the internal storage Mappings which specifies a translation between the source and the target Web service. The Service Call component executes three steps: (1) the source Web service is called to get the source data; (2) the mapping between source and target Web service is executed, translating the source data into the structure required by the target Web service; (3) the target Web service is called with the translated source data as input.
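The three-step execution can be sketched as follows; the source and target services are stand-in functions, since the actual Web service interfaces are not given in the paper.

```python
# Minimal sketch of the Service Call component's three steps:
# (1) call the source service, (2) translate the data, (3) call the target
# service. The services are stand-in functions; real calls would go through
# WSDL-described Web service interfaces.

def source_service():                      # step 1: fetch source data
    return {"assetId": "P-12", "temperature": 81.5}

def target_service(payload):               # step 3: deliver translated data
    print("Target service received:", payload)

def translate(record):                     # step 2: execute the stored mapping
    return {"asset_id": record["assetId"], "temp_c": record["temperature"]}

def service_call(source, mapping, target):
    data = source()                # (1) get source data
    translated = mapping(data)     # (2) translate to the target structure
    target(translated)             # (3) invoke the target service

service_call(source_service, translate, target_service)
```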
6 CONCLUSION
State-of-the-art asset health management requires a dynamic environment which allows the introduction of new software applications and their integration with existing systems. We have pointed out that integration needs to be considered as a separate component to optimize the re-usability of integration knowledge and to provide flexibility to the environment. Integration can be classified into horizontal and vertical integration, which can be realized using Service Oriented or Event-Driven Architectures as the underlying middleware. We have proposed an integration architecture that supports both integration dimensions based on a SOA using Web services and have demonstrated its application in a use case of a power plant management system.
7 REFERENCES
1 Basim Al-Najjar. (2007) Establishing and Running a Condition-Based Maintenance Policy; Applied Example for Vibration-Based Maintenance. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.106-115. Coxmoor Publishing.
2 Aitor Arnaiz, Benoit Iung, Erkki Jantunen, Eric Levrat, and Eduardo Gilabert. (2007) DYNAWEB. A Web Platform for Flexible Provision of E-Maintenance Services. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.106-115. Coxmoor Publishing.
3 Andy Koronios, Daniela Nastasie, Vivek Chanana, and Abrar Haider. (2007) Integration Through Standards - An Overview of International Standards Relevant to the Integration of Engineering Asset Management. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.106-115. Coxmoor Publishing.
4 Stephen Roe. (2007) The Benefits of Standards and Certification. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.106-115. Coxmoor Publishing.
5 Joseph Mathew. (2008) Engineering Asset Management - Trends, Drivers, Challenges and Advances. In Gao Jinji, Jay Lee, Jun Ni, Lin Ma and Joseph Mathew (Eds) WCEAM-IMS 2008 Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems Conference, Beijing, China. pp.59-74: Springer-Verlag London Ltd.
6 Georg Grossmann. (2008) Horizontal and Vertical Integration of Object Oriented Information Systems Behaviour. PhD thesis, University of South Australia.
7 Matthias Weske. (2007) Business Process Management – Concepts, Languages, Architectures: Springer Verlag.
8 Michael L. Brodie and Michael Stonebraker. (1995) Migrating Legacy Systems: Gateways, Interfaces, and the Incremental Approach: Morgan Kaufmann.
9 Avin Mathew and Lin Ma. (2007) Multidimensional Schemas for Engineering Asset Management. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.1387-1397. Coxmoor Publishing.
10 Neil Gregory. (2007) Excellence in Asset Management. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.682-691. Coxmoor Publishing.
11 Christoph Bussler. (2003) B2B Integration: Springer Verlag.
12 Marlon Dumas, W.M.P. van der Aalst and A.H.M. ter Hofstede. (2005) Process-Aware Information Systems: John Wiley & Sons.
13 Jeff Pinkston. (2001) The Ins and Outs of Integration - How EAI differs from B2B Integration. In eAI Journal, no. 8, August 2001. pp.48-52.
14 Michael P. Papazoglou and Willem-Jan van den Heuvel. (2007) Business Process Development Life Cycle Methodology. In Communications of the ACM, 50(10). pp.79-85.
15 Anat Eyal and Tova Milo. (2001) Integrating and customizing heterogeneous e-commerce applications. In VLDB Journal, issue 10. pp.16-38: Springer Verlag.
16 Michael Schrefl and Markus Stumptner. (2002) Behavior-consistent Specialization of Object Life Cycles. In ACM Transactions on Software Engineering and Methodology, 11(1). pp.92-148.
17 Takashi Kobayashi, Masato Tamaki, and Norihisa Komoda. (2003) Business Process Integration as a Solution to the Implementation of Supply Chain Management Systems. In Information and Management, 40(8). pp.769-780: Elsevier.
18 Gustavo Alonso, Fabio Casati, Harumi Kuno and Vijay Machiraju. (2003) Web Services - Data Centric Systems and Applications: Springer-Verlag.
19 Tim Berners-Lee and Mark Fischetti. (1999) Weaving the Web - The Original Design and Ultimate Destiny of the World Wide Web: Harper San Francisco.
20 Daniela Nastasie, Andy Koronios and Kamaljeet Sandhu. (2008) Factors Influence the Diffusion of Ontologies in Road Asset Management - A Preliminary Conceptual Model. In Gao Jinji, Jay Lee, Jun Ni, Lin Ma and Joseph Mathew (Eds) WCEAM-IMS 2008 Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems Conference, Beijing, China. pp.1162-1176. Springer Verlag London Ltd.
21 D Platt, G Jordan, A Kumar and A Koronios. (2008) Towards a Service Ontology to Drive Requirements for Public Infrastructure and Assets. In Gao Jinji, Jay Lee, Jun Ni, Lin Ma and Joseph Mathew (Eds) WCEAM-IMS 2008 Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems Conference, Beijing, China. pp.1321-1331: Springer-Verlag London Ltd.
22 Albert Jones. (2003) Program "Manufacturing Enterprise Integration": NIST. http://www.mel.nist.gov/msid/mee.htm.
23 Rajesh Thiagarajan and Markus Stumptner. (2006) A Native Ontology Approach for Semantic Service Description. In Proc. of the Australasian Ontology Workshop (AOW), volume 72 of Series of Conferences in Research and Practice in Information Technology (CRPIT). pp.85-90: Australian Computer Society.
24 Rajesh Thiagarajan and Markus Stumptner. (2007) Service Composition With Consistency Based Matchmaking: A CSP-based Approach. In Proceedings of the IEEE European Conference on Web Services (ECOWS), pp.23-32.
25 Michael P. Papazoglou, Paolo Traverso, Schahram Dustdar and Frank Leymann. (2007) Service-Oriented Computing: State of the Art and Research Challenges. In IEEE Computer, 40(11). pp.38-45.
26 Gero Muehl, Ludger Fiege and Peter R. Pietzuch. (2006) Distributed Event-Based Systems: Springer-Verlag.
27 Geert Monsieur, Monique Snoeck, and Wilfried Lemahieu. Coordinated Web Services Orchestration. In Proc. of the IEEE International Conference on Web Services (ICWS). pp.775-783.
28 Georg Grossmann, Michael Schrefl and Markus Stumptner. (2008) Modelling Inter-Process Dependencies with High-Level Business Process Modelling Languages. In Proc. of the Fifth Asia-Pacific Conference on Conceptual Modelling (APCCM 2008). CRPIT, Vol. 79. pp.89-102: Australian Computer Society.
29 Georg Grossmann, Michael Schrefl and Markus Stumptner. (2009) Modelling and Enforcement of Inter-Process Dependencies With Business Process Modelling Languages. To appear in Journal of Research and Practice in Information Technology: Australian Computer Society.
30 Lin Ma, Yong Sun and Joseph Mathew. (2007) Asset Management Processes and Their Representation. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.1354-1363. Coxmoor Publishing.
31 Yong Sun, Lin Ma, and Joseph Mathew. (2007) Asset Management Processes: Modelling, Evaluation and Integration. In WCEAM CM 2007 Proceedings of the Second World Congress on Engineering Asset Management, Harrogate, UK. pp.1847-1856. Coxmoor Publishing.
Acknowledgements This research was supported by the CIEAM CRC Project SI-302 “Improved OPAL Monitoring and Management System”.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DATA MINING TECHNIQUES FOR DATA CLEANING
Kalaivany Natarajan a, b, Jiuyong Li a, b, Andy Koronios a, b
a CRC for Integrated Engineering Asset Management, Brisbane, Australia
b School of Computer and Information Science, University of South Australia, Mawson Lakes - 5095, Australia
Data quality is a main issue in quality information management. Data quality problems can occur anywhere in information systems. These problems are addressed by data cleaning. Data cleaning is a process used to determine inaccurate, incomplete or unreasonable data and then to improve the quality through the correction of detected errors and omissions. Generally, data cleaning reduces errors and improves data quality. Correcting errors in data and eliminating bad records can be a time-consuming and tedious process, but it cannot be ignored. Data mining is a key technique for data cleaning. Data mining is a technique for discovering interesting information in data. Data quality mining is a recent approach that applies data mining techniques to identify and recover from data quality problems in large databases. Data mining automatically extracts hidden and intrinsic information from collections of data. Data mining has various techniques that are suitable for data cleaning. In this paper we discuss three major data mining methods for data cleaning, namely functional dependency mining, association rule mining and bagging SVMs. We discuss the strengths and weaknesses of these data mining methods for data cleaning.
Key Words: Data Mining, Data Cleaning, Functional dependency, Association rule, Bagging, SVMs

1 INTRODUCTION
Data quality is a main issue for information-oriented organizations, where the quality of data depends on its actual use: the state of completeness, validity, consistency, timeliness and accuracy that makes data appropriate for a specific use. Data is collected from a variety of sources and stored in databases. In a relational database, data quality problems arise during query execution; for example, when data is used in a merge operation, a faulty query result affects the entire transaction. Data quality problems can occur anywhere in an information system (36). These problems are addressed by data cleaning. Data cleaning is a process used to determine inaccurate, incomplete or unreasonable data and then to improve the quality through the correction of detected errors and omissions. Generally, data cleaning reduces errors and improves data quality. Correcting errors in data and eliminating bad records can be a time-consuming and tedious process (13), but it cannot be ignored. It is important, however, that errors are not simply deleted; corrections should be documented and changes traced. It is also good practice to record corrections in separate fields so that the original information can still be consulted. Data cleaning is required when we integrate more than one database, since the structures of the databases differ from one another. Data mining is a key technique for data cleaning. Data quality mining is a recent approach that applies data mining techniques to identify and recover from data quality problems in large databases. Data mining automatically extracts hidden information from collections of data (34). Data mining has various techniques that are suitable for data cleaning; here we describe some commonly used ones. Rule mining works like an algorithmic process: it takes an input and induces rules as output, where the outputs can be association rules or functional dependencies. Association rules describe relationships among large data sets and the co-occurrence of items. A functional dependency shows the connection and association between attributes, that is, how one specific combination of values on one set of attributes determines one specific combination of values on another set (39). Bagging is a meta-method for model building: it generates multiple training sets by sampling with replacement from the available training data, and is also called bootstrap aggregating (38). SVMs are a set of related supervised learning methods used for classification and regression (25).
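As a small illustration of the functional dependency idea mentioned above, the following sketch checks whether a candidate dependency Zip → City holds in a set of records; the attribute names and records are invented for the example.

```python
# Minimal sketch: checking whether a candidate functional dependency
# (here Zip -> City) holds in a set of records. Attribute names and the
# records are illustrative only.

from collections import defaultdict

def fd_violations(records, lhs, rhs):
    """Return the lhs values that map to more than one rhs value."""
    seen = defaultdict(set)
    for r in records:
        seen[r[lhs]].add(r[rhs])
    return {k: v for k, v in seen.items() if len(v) > 1}

records = [
    {"Zip": "5108", "City": "Salisbury"},
    {"Zip": "5108", "City": "Sydney"},     # violates Zip -> City
    {"Zip": "5031", "City": "Mile End"},
]
print(fd_violations(records, "Zip", "City"))   # {'5108': {'Salisbury', 'Sydney'}}
```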
In this paper we provide an overview of data quality problems. We discuss data quality mining problems and three different methods for cleaning a database with data mining techniques. The first method identifies deficiencies in the data quality dimensions and uses bagging SVMs to predict values and clean the database. The second method detects and solves data quality problems with association rules. The third method is data cleaning using functional dependencies, in which a selectivity rule is used to select functional dependencies and the database is cleaned according to their ranking. In Section 5 we discuss the strengths and weaknesses of these data cleaning techniques; challenges are also discussed there. The final section is the conclusion.
Table 1. Examples of basic data quality problems

Sl. No | Dirty data | DQ Problem
1 | Cus1 = (name=”William Robert”...), Cus2 = (name=”W.Robert”..) | Duplicated Records
2 | Country = ”South Australia”... | Misfielded Values
3 | Age = 00 | Missing Values
4 | Zip = 7777, City = Sydney | Violated Attribute Dependencies
5 | Gender = Q | Illegal Values
6 | name = ”William 22-09-81” | Multiple values in a single column

2 DATA QUALITY PROBLEMS
Data quality is a fundamental issue in many areas, particularly in pattern discovery (19). If data satisfies the quality criteria, it is treated as high-quality data. Data quality criteria (29) are accuracy, integrity, completeness, validity, consistency, schema conformance, uniformity, density and uniqueness. Data quality problems arise in both industry and academia (8). Collecting data from a variety of sources and analysing that data for further usage, i.e. data integration, is a main source of data quality problems. Data quality problems can be classified into single-source and multiple-source problems. Single-source problems occur within a single database, whereas multiple-source problems occur whenever data is integrated from two or more sources, e.g. overlapped data and differences in entity names. Metadata is a pivotal idea on which both depend: the function of metadata descriptions is to abstract and capture the essential information in the underlying data, independent of representational details (21). Metadata helps to find data quality problems and contributes to identifying attribute correspondences between source schemas, based on which automatic data transformations can be derived (22; 9). In Table 1 we show some examples of dirty data in a customer database (31).
3 DATA CLEANING
Data cleaning is applied, with varying scope and demands, in different areas of data processing and maintenance. Data profiling examines the data available in an existing source and collects statistical information about that data. It is an application of data analysis techniques and determines the actual content, structure and quality of the data (10). Data profiling gives an overall picture of the database, which helps data cleaning to be performed efficiently. Data cleaning is essential to maintain a data warehouse; it deals with detecting and removing errors and inconsistencies from data in order to improve the quality of data (31). It is applied in the field of data warehousing when several databases are merged. The importance of data cleaning in data warehousing is described in (5). Records referring to the same entity may be represented in different formats in different data sets, or may be represented erroneously; thus, duplicate records appear in the merged database. The issue is to identify and eliminate these duplicate records. This is called the merge/purge problem (12; 14; 26). Generally, data cleaning means updating a record with cleaned data, but serious cleaning involves decomposing and reassembling the data (24). Data transformation is essential for extracting data from legacy data formats and for business-to-business enterprise data integration (35). Data cleaning is often performed by a domain expert, because domain knowledge is valuable in identifying and eliminating anomalies. An anomaly is a property of data values; it may be caused by errors in measurements, lazy input habits, omission of data, or redundancies. Anomalies are basically classified into three types: syntactic anomalies, which concern characteristic values and formats; semantic anomalies, which prevent the data collection from being a comprehensive and non-redundant representation; and coverage anomalies, which reduce the number of entities and properties represented (27).
3.1 Types of Anomalies
1. Syntactic
   (a) Lexical errors
   (b) Domain format errors
   (c) Irregularities
2. Semantic
   (a) Integrity constraint violations
   (b) Contradictions
   (c) Duplicates
   (d) Invalid tuples
3. Coverage anomalies
   (a) Missing values
   (b) Missing tuples
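As a small illustration of the duplicate (merge/purge) problem discussed above, the sketch below groups records on a normalised name key and collapses each group; the normalisation rule is a simplification chosen for illustration.

```python
# Minimal sketch of merge/purge-style duplicate detection: records are
# grouped on a normalised name key and each group is collapsed. The
# normalisation rule is a simplification for illustration only.

from collections import defaultdict

def normalise(name):
    # "William Robert" and "W.Robert" both reduce to a comparable key.
    parts = name.replace(".", " ").lower().split()
    return (parts[0][0], parts[-1]) if parts else ("", "")

def merge_duplicates(records):
    groups = defaultdict(list)
    for r in records:
        groups[normalise(r["name"])].append(r)
    # Keep one representative per group (a real system would merge fields).
    return [group[0] for group in groups.values()]

customers = [{"name": "William Robert"}, {"name": "W.Robert"}, {"name": "Mark"}]
print(merge_duplicates(customers))
```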
4 DATA QUALITY MINING
Data quality mining is a deliberate application of data mining techniques for the purpose of data quality measurement and improvement (15). We assume that if the data quality dimensions are achieved to a high degree, the database has data of sufficient quality for further transactions. Data mining offers a range of techniques, as discussed in the introduction. It faces various kinds of challenges (6) in the development of techniques:
• Difficulties in handling different types of data
• Mining information from multiple data sources
• Providing efficient data mining algorithms and outputs
• Providing privacy and security for data
Data quality mining techniques perform two main tasks:
1. Identification of quality problems
2. Fixing the identified quality problems
In our survey we consider three methods for performing these tasks. The ensemble approach to mining with quality matrices and data cleaning using functional dependencies are used to fix quality problems, while association rule mining is used for quality problem identification in large datasets.
4.1 Data Mining for Data Cleaning
Data cleaning is a key area in data mining. The term data mining has mostly been used by statisticians, data analysts and the management information systems (MIS) communities, and it has also gained popularity in the database field (11). Data mining is an interdisciplinary field generally concerned with predicting outcomes and uncovering relationships in data (33). In data mining, information from different sources is analysed and stored; when data from dissimilar sources is combined there is a chance of introducing errors, and in this situation data cleaning detects and removes those errors. The ETL (Extraction/Transformation/Loading) process is applied to detectable errors; ETL tools are widely available but time consuming. For example, if a name field in a record set is filled with numeric values, this error is easily detectable and can be rectified by ETL tools. Tools such as Data Junction or Ascential Software's DataStage transform the data, fix errors and convert it to the format needed for analysis (32). The data mining process involves data collection, cleaning the data, building a model and monitoring the models (2).
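As a toy version of the detectable-error example above (a name field filled with numeric values), the following sketch flags the offending records; real ETL tools such as DataStage provide far richer checks, and the field name here is an assumption.

```python
# Minimal sketch of an ETL-style check for an easily detectable error:
# a name field containing numeric values. The field name and the policy
# of only flagging (not correcting) are assumptions for illustration.

import re

def flag_bad_names(records):
    """Return the records whose 'name' field contains digits."""
    return [r for r in records if re.search(r"\d", r.get("name", ""))]

records = [{"name": "Tom"}, {"name": "William 22-09-81"}, {"name": "Bob"}]
print(flag_bad_names(records))   # [{'name': 'William 22-09-81'}]
```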
Table 2. Example quality matrix for accuracy

        | Single | Married
Single  | 0.79   | 0.21
Married | 0.08   | 0.92
4.1.1 Ensemble Approach to Mining with Quality Matrices (EQPD)
Objective: It explores the use of readily available data quality matrices in the data mining classification task to improve predictive accuracy.
Method: Bagging SVMs.
The ensemble approach to mining with quality matrices is proposed by (7). Quality matrices are a key means of measuring data quality along dimensions such as consistency, timeliness, accuracy and accessibility. Data quality matrices are a concise way to represent errors and deficiencies in the data along these dimensions: the columns of a matrix represent data quality problems and the rows represent quality checks or corrective processes that prevent, detect or correct these problems (30). Data quality matrices are created for accuracy, contextual quality and semantic interpretability by quality assurance systems. In Table 2 we give a sample quality matrix for accuracy, in which marital status is recorded from one of two options. If we consider single as positive and married as negative, the matrix can be read as 21% false positives and 8% false negatives. A database contains a group of records and each record has a range of values; a value may be correct or incorrect. A quality matrix model is constructed from perturbed versions of the data and instances are tested according to the resulting predictions. Data perturbation is one of a general class of techniques that mask original values in a data set to prevent disclosure (23). Perturbation methods are capable of providing high data utility and low disclosure risk (37; 28). In this method data perturbation involves modifying confidential variables using random noise; users are given access only to the modified values of the confidential variables, thereby guaranteeing that the original values of the confidential data are not revealed. Before entering into the ensemble approach we outline the basic idea used to build a model (a minimal sketch of this procedure follows the list below):
• The given data set is divided into equal parts.
• Each part is used in turn for testing while the remaining parts are used for training.
• The procedure is applied to all parts to find the errors.
• The overall error rate is calculated as the average error rate over the parts.
• The final classifier is generated by learning from all of the data.
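The following is a minimal sketch of the steps listed above (essentially k-fold cross-validation followed by training a final classifier on all of the data); the synthetic data, the choice of five parts and the use of scikit-learn SVMs are illustrative assumptions, not details given in the paper.

```python
# Split the data into equal parts, estimate the error on each held-out part,
# average the error rates, then learn the final classifier from all of the data.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # synthetic records
y = (X[:, 0] + X[:, 1] > 0).astype(int)             # synthetic class labels

errors = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = SVC().fit(X[train_idx], y[train_idx])               # train on the remaining parts
    errors.append(1.0 - model.score(X[test_idx], y[test_idx]))  # error on the held-out part

print("estimated overall error rate:", np.mean(errors))
final_model = SVC().fit(X, y)                       # final classifier learned from all data
```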
The above steps are applied to the data sets to construct the models. The database consists of versions Dv1 to Dvm whose values are independent of one another. Each individual version is considered a random perturbation of the true values and is modelled, together with the quality matrices, by machine learning methods. The mining approach uses a version of the data in place of the true values, since the true values are unknown. If all m versions of a record set are available, the approach builds a model from them and makes predictions among them. The predictor analyses the characteristics of users and discovers relationships between records; moreover, it summarises the relationships in the data set in the form of rules. For example: if a customer is male and he purchases sports clothes, he may also purchase tennis equipment. In Table 3 we have versions Dv1 to Dvm, in which personal information of the employees is stored in separate records. In version Dv1 the gender field is missing; in version Dvm a zip code is entered but we do not know whether it is correct. According to EQPD, the first field of version Dv1 is compared with the other fields, i.e. Dv1,1 to Dv1,n, to predict the correct values. Every record is one perturbed version of the true values, and from this we can generate approximations for the other versions. Finally, a model is built from the approximate versions and the predictions are aggregated by voting.

Database:               $D_{v1} = F_Q(R_1), \; \dots, \; D_{vn} = F_Q(R_n)$            (1)

Approximate versions:   $AD_{v1,1} = F_Q(D_{v1}), \; \dots, \; AD_{v1,n} = F_Q(D_{vn})$   (2)

The basic idea of EQPD is that the database holds flawed versions of the entities; we construct new versions and build a model from them, and finally combine both the new and the flawed versions to improve accuracy.
Table 3. Data set for quality matrix model

Version   E-id   E-name   Gender   Zip    Suburb      State   Salary
DV1       E110   Tom      ?        5108   Parahills   SA      2000
DV2       E108   Van      M        5031   Modbury     SA      3500
..        ..     ..       ..       ..     ..          ..      ..
DVm       E111   Bob      M        6000   Adelaide    SA      2200
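Below is a hedged sketch of the ensemble idea just described: one classifier is trained on each perturbed version of the records and the predictions are aggregated by majority vote. The synthetic data, the Gaussian perturbation scale and the scikit-learn SVMs are assumptions made for illustration, not the authors' implementation.

```python
# Build one SVM per perturbed version Dv1..Dvm and aggregate the predictions by voting.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))                       # stand-in for the true records
y = (X.sum(axis=1) > 0).astype(int)

m = 10                                              # number of perturbed versions
models = []
for _ in range(m):
    Dv = X + rng.normal(scale=0.3, size=X.shape)    # one perturbed version of the records
    models.append(SVC().fit(Dv, y))                 # one classifier per version

votes = np.array([clf.predict(X) for clf in models])
majority = (votes.mean(axis=0) >= 0.5).astype(int)  # aggregate by majority vote
print("ensemble agreement with labels:", (majority == y).mean())
```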
Strengths and Limitations
• It reduces noise, bias and variance in the dependent and independent variables.
• It improves accuracy by perturbing the data according to the quality matrices.
• It builds a model from predicted values whose correctness cannot be verified; this is the major weakness of the method.
4.1.2 Data Quality Mining with Association Rules
Objective: Data quality mining is used here to detect, quantify, explain and correct data quality deficiencies in very large databases.
Method: Association rules.
Data quality mining is combined with association rules to find relationships among the items in a huge database and, in addition, to improve the data quality. Data quality mining with association rules is proposed by (15). In a dataset each transaction has a group of elements, and we find links among them with association rules. A rule is normally written A → B: whenever a transaction contains all items a ∈ A, it also contains the items b ∈ B with some probability value (1). Association rules are generated for all the transactions and are checked by their confidence level. For example, association rules from a patients' medical database might be:

Medical: Patient-id → MedicareNo - 5136049263       (confidence 95%)
Medical: Patient-name → Pname - Mark                (confidence 78%)
Medical: Patient-name → City - South Australia      (confidence 76%)
Medical: Patient-name → Zip code - 5108             (confidence 75%)
In the example above the confidence level of each rule is listed: the patient id identifies the Medicare number with a high confidence level of 95%, while the remaining rules have confidence levels below 80%. According to their confidence levels and the bagging predictor (3), we can find the strength of all rules by the following steps:
• Determine the transaction type.
• Generate the association rules; if possible, the algorithm directly accesses the tables of the relational database.
• Assign a score to each transaction based on the generated rules. The score captures the consistency of a single transaction: we assign a score to each transaction by summing the confidence values of the rules it violates (a small sketch of this scoring step appears at the end of this subsection). A rule violation occurs when a tuple satisfies the rule body but not its consequent (18). The idea behind assigning high scores is to flag suspected deficiencies. A tuning parameter is available to adjust the confidences depending on their value; we assume the tuning parameter may take a value such as 3 or 5. Since contradicting a rule is not by itself a sign of incorrectness, and assuming that the whole data set is not overwhelmed with noise, a minimal threshold of y = 75% or higher for the confidence is suggested to restrict the rule set AR in order to improve the results.
• Sort and store the transactions according to their score values. Users can then easily understand and decide about the trustworthiness of single transactions or of the entire data set. Based on the score, the system decides whether to accept or reject the data, or else to issue a warning.
Strengths and Limitations
• It reduces the number of rules that must be generated for a transaction.
• It avoids a severe pitfall of association rule mining.
• Its main limitation is that it is difficult to generate association rules for every transaction.
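As a concrete illustration of the scoring step referred to above, the sketch below sums the confidences of the rules a record violates (rule body satisfied, consequent not). The rule set, the attribute names and the example record are hypothetical.

```python
# Each rule is (body predicate, consequent predicate, confidence); a record's score
# is the sum of the confidences of the rules it violates. All names are hypothetical.
rules = [
    (lambda r: r["patient_id"] == "P01", lambda r: r["medicare_no"] == "5136049263", 0.95),
    (lambda r: r["name"] == "Mark",      lambda r: r["city"] == "South Australia",   0.78),
    (lambda r: r["name"] == "Mark",      lambda r: r["zip"] == "5108",               0.75),
]

def score(record):
    # violated: the body holds but the consequent fails
    return sum(conf for body, head, conf in rules if body(record) and not head(record))

record = {"patient_id": "P01", "medicare_no": "0000000000",
          "name": "Mark", "city": "Adelaide", "zip": "5108"}
print(score(record))   # 0.95 + 0.78 = 1.73 -> a high score marks a suspicious transaction
```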
4.1.3 Data Cleaning using Functional Dependency
Objective: It identifies duplicates and anomalies with high recall and a low false positive rate.
Method: Functional dependency combined with a query optimisation technique called the selectivity rule.
Functional dependency shows the relationships between entities in a data model and has the ability to clean the data (4). Data cleaning using functional dependency is proposed by (20). Here functional dependency discovery is automated by combining an FD discovery technique with a data cleaning technique. The combined solution is sensitive to data size: as the data grows the speed of the discovery algorithm decreases, and as the number of attributes grows the discovery creates more FD candidates and generates too many FDs, including noisy ones. To decrease the number of generated FDs a query optimisation technique, the selectivity rule, is used to prune unlikely FDs. The FD discovery algorithm for identifying errors and the cleaning algorithm together produce the FD cleaning tool, which has four modules.
Data Collector: retrieves data from the relational database and prepares it for the next (FD Engine) module.
FD Engine: identifies duplicates (non-candidate keys) and inconsistency errors (candidate keys) and ranks the candidates based on the selectivity rule. This output is the input of the data cleaning module.
Cleaning Engine: assigns weights to the data; high-error tuples receive high weights and are cleaned against low-weight tuples using a cost-based algorithm, and the cleaned data are then sent back to the relational database.
Relational Database: the fourth module, from which the other modules store and retrieve data.
Selecting Functional Dependencies by Selectivity Value
Selectivity values are used for ranking the candidates in order to find the appropriate FDs (17). The selectivity value is calculated as |X1||Y1| / |X1, Y1|, where |X1| is the number of classes in the partition induced by X1, |Y1| is the number of classes in the partition induced by Y1, and |X1, Y1| is the number of classes in the combined partition (a small sketch of this computation is given after the cleaning steps below).
Candidate Ranking
Selectivity values determine the ranking of the candidates. A pruning point is set as a threshold between low and high ranking for good candidates. A high-ranking candidate has a high selectivity value and is named a candidate key; a low-ranking candidate has a low selectivity value and is grouped into the non-candidate keys. Functional dependency with a repairing technique is then applied to clean the database.
Improving the Pruning Step
The pruning step generates the candidate set by computing the candidates from level 1 to level 4 (16). The pruning steps from level 0 to level 4 are shown in Fig. 1, in which level 0 starts empty, level 1 has candidates {W, X, Y, Z}, level 2 has candidates {WX, WY, WZ, XY, XZ, YZ}, level 3 has candidates {WXY, WXZ, WYZ, XYZ} and the final level has {WXYZ}.
Steps to clean the database:
1. Get the set of candidates.
2. Set the threshold values for low and high ranking.
3. Check the candidates that are not yet in FD and fall in either the low or the high accepted ranking.
4. Store a new candidate formed from candidates A and B of the current level.
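The sketch below illustrates the selectivity computation mentioned above, using the numbers of equivalence classes induced by the attributes; the toy relation and the attribute names are assumptions made purely for illustration.

```python
# Selectivity of a candidate FD X -> Y, computed as |X| * |Y| / |X, Y|, where |.| is
# the number of equivalence classes (distinct value combinations) in the relation.
rows = [
    {"e_id": "E110", "zip": 5108, "suburb": "Parahills"},
    {"e_id": "E108", "zip": 5031, "suburb": "Modbury"},
    {"e_id": "E109", "zip": 5108, "suburb": "Parahills"},
    {"e_id": "E111", "zip": 6000, "suburb": "Adelaide"},
]

def classes(attrs):
    return len({tuple(r[a] for a in attrs) for r in rows})

def selectivity(x, y):
    return classes(x) * classes(y) / classes(x + y)

print(selectivity(["zip"], ["suburb"]))   # 3 * 3 / 3 = 3.0 -> ranks highly as a candidate
print(selectivity(["suburb"], ["e_id"]))  # 3 * 4 / 4 = 3.0
```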
Figure 1: FD pruning steps
Functional dependency with the ranking and repairing technique reduces the number of functional dependencies and identifies suspicious tuples for cleaning. In addition, it reduces the attributes to be sorted and decreases the workload.
Strengths and Limitations
• It easily identifies suspicious tuples for cleaning.
• It decreases the number of functional dependencies discovered.
• The method is not suitable for very large databases, because sorting all the records becomes difficult.

5
DISCUSSIONS AND CHALLENGES
Data mining provides the capacity to process huge data sets with high dimensions. Under the heading of data quality mining we have discussed different approaches to data cleaning, together with the strengths and weaknesses of those approaches. The ensemble approach to mining with quality matrices improves predictive accuracy by perturbing the data according to the quality matrices; at the same time, it builds a model from predicted values which may be correct or incorrect. The data quality mining with association rules method uses association rules to clean the database and reduces the number of rules that must be generated for a transaction. It also has some problems to overcome: it calculates a score for each transaction in order to flag suspected deficiencies, which may not be suitable in every situation. In particular, when a transaction does not violate any rule its score and violation values are 0, so under the sorting and storing scheme it would be placed in the last position of the database. Another problem is that we cannot generate association rules for all transactions. The final method is functional dependency, in which a ranking technique using the selectivity value prevents data inconsistency. However, it is difficult to apply to larger databases, since it sorts all the records in the database in order to repair it, and the output of one functional dependency produces a larger number of functional dependencies. Data cleaning has many challenges to overcome. The first challenge is insufficient values when making predictions among incorrect data: from time to time we fill missing values with predicted values, and in that case we are not sure whether the filled value is correct. Maintaining a cleaned database is another big challenge; we must keep track of changes in the database to avoid immediate data quality problems.
6
CONCLUSIONS
Data cleaning is a primary task for database transactions. In this paper we give an overview of data quality problems and data quality techniques. Data mining offers many techniques for data cleaning; we have reviewed three major ones, namely association rules, bagging SVMs and functional dependency. Each technique deals with identifying data quality problems and cleaning them. The strengths and limitations of all the techniques are also discussed. Finally, we discuss the challenges faced in data cleaning.
7
REFERENCES
1
Rakesh Agrawal and Ramakrishnan Srikant, (1994) Fast algorithms for mining association rules in large databases, VLDB (Jorge B. Bocca, Matthias Jarke, and Carlo Zaniolo, eds.), Morgan Kaufmann. pp. 487–499.
2
M. Berry and G. Linoff, (1999) Mastering data mining, New York: Wiley.
3
Leo Breiman, (1996) Bagging predictors, Machine Learning 24, no. 2, 123–140.
4
Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini, (2001) Identification constraints and functional depen- dencies in description logics, IJCAI (Bernhard Nebel, ed.), Morgan Kaufmann, pp. 155–160.
5
Surajit Chaudhuri and Umeshwar Dayal, (1997) An overview of data warehousing and OLAP technology, SIGMOD Record 26, no. 1, 65–74.
6
Ming-Syan Chen, Jiawei Han, and Philip S. Yu, (1996) Data mining: An overview from a database perspective, IEEE Trans. Knowl. Data Eng. 8, no. 6, 866–883.
7
Ian Davidson, Ashish Grover, Ashwin Satyanarayana, and Giri Kumar Tayi, ( 2004) A general approach to incorporate data quality matrices into data mining algorithms, KDD (Won Kim, Ron Kohavi, Johannes Gehrke, and William DuMouchel, eds.), ACM, pp. 794–798.
8
Anne M. Disney and Philip M. Johnson, (1998) Investigation data quality problems in the psp, SIGSOFT FSE, pp. 143–152.
9
AnHai Doan, Pedro Domingos, and Alon Y. Levy, ( 2000) Learning source description for data integration, WebDB (Informal Proceedings), pp. 81–86.
10
Jack E.Olson, (2003) Data quality: The accuracy dimension, Morgan Kaufman , ISBN: 1558608915.
11
Usama M. Fayyad and Ramasamy Uthurusamy, (1996) Data mining and knowledge discovery in databases (introduction to the special section), Commun. ACM 39, no. 11, 24–26.
12
Galhardas.H, D. Florescu, D. Shasha, and Simon.E, ( 1999) An extensible framework for data cleaning, Tech. report, Institute National de Recherche en Informatique et en Automatique.
13
WILLIAMS P. H., MARGULES C. R., and HILBERT D. W, (2002) Data requirements and data sources for biodiversity priority area selection, Journal of biosciences ISSN 0250-5991 vol. 27, no. no 4, pp. 327– 338.
14
M.A Hernandez and J.S Stolfo, (1998) Real-world data is dirty: Data cleansing and the merge/purge problem, Data Mining and knowledge Discovery 2, 9–37.
15
Jochen Hipp, Ulrich Güntzer, and Udo Grimmer, (2001) Data quality mining - making a virtue of necessity, DMKD.
16
Ykä Huhtala, Juha Kärkkäinen, Pasi Porkka, and Hannu Toivonen, (1999) Tane: An efficient algorithm for discovering functional and approximate dependencies, Comput. J. 42, no. 2, 100–111.
17
Ihab F. Ilyas, Volker Markl, Peter J. Haas, Paul Brown, and Ashraf Aboulnaga, Cords: (2004) Automatic discovery of correlations and soft functional dependencies, SIGMOD Conference (Gerhard Weikum, Arnd Christian Konig, and Stefan Deßloch, eds.), ACM, pp. 647–658.
18
Tomasz Imielinski and Aashu Virmani, (1998) Association rules... and what’s next? towards second generation data mining systems, ADBIS (Witold Litwin, Tadeusz Morzy, and Gottfried Vossen, eds.), Lecture Notes in Computer Science, vol. 1475, Springer, pp. 6–25.
19
David J.Hand, ( 2007) Principles of data mining, Drug Safety, pp. 30,621–622.
20
Kollayut Kaewbuadee, Yae Temtanapat, and Ratchata Peachavanish, (2006) Data cleaning using functional dependency from data mining process, International Journal on Computer Science and Information System (IADIS) V1 , no. 2, 117–131 ,ISBN: ISSN : 1646 – 3692.
21
Vipul Kashyap and Amit Sheth, (1996) Semantic heterogeneity in global information systems:the role of metadata context and ontologies, Tech. report, Department of Computer Science University of Georgia,Athens.
22
Wen-Syan Li and Chris Clifton, Semint, (2000) A tool for identifying attribute correspondences in heterogeneous databases using neural networks, Data Knowl. Eng. 33, no. 1, 49–84.
23
Chong K. Liew, Uinam J. Choi, and Chung J. Liew, (1985) A data distortion by probability distribution, ACM Trans. Database Syst. 10, no. 3, 395–411.
24
Jonathan I. Maletic and Andrian Marcus, (2000) Data cleansing: Beyond integrity analysis, IQ (Barbara D. Klein and Donald F. Rossin, eds.), MIT, pp. 200–209.
25
David Meyer, Friedrich Leisch, and Kurt Hornik, (2003) The support vector machine under test, Neurocomputing 55, no. 1-2, 169–186.
26
L. Moss, (1998) Data cleansing: A dichotomy of data warehousing, Tech. report, DM Review, February 1998.
27
Heiko Muller and Johann-Christoph Freytag, ( 2003) Problems, methods and challenges in comprehensive data cleansing, Tech. Report HUB-1B-164, Humboldt University,Berlin.
28
Krishnamurty Muralidhar and Rathindra Sarathy, (2003) A theoretical basis for perturbation methods, Statistics and Com- puting 13, no. 4, 329–335.
29
Felix Naumann, Johann Christoph Freytag, and Myra Spiliopoulou, ( 1998) Quality driven source selection using data envelope analysis, IQ (InduShobha N. Chengalur-Smith and Leo Pipino, eds.), MIT, pp. 137–152.
30
Elizabeth M. Pierce, Assessing data quality with control matrices, Commun. ACM 47 (2004), no. 2, 82–86.
31
Erhard Rahm and Hong Hai Do, (2000) Data cleaning: Problems and current approaches, IEEE Data Eng. Bull. 23, no. 4, 3–13.
32
Vijayshankar Raman and Joseph M. Hellerstein, (2001) Potter’s wheel: An interactive data cleaning system, VLDB (Peter M. G. Apers, Paolo Atzeni, Stefano Ceri, Stefano Paraboschi, Kotagiri Ramamohanarao, and Richard T. Snodgrass, eds.), Morgan Kaufmann, pp. 381–390.
33
Ronald K. Pearson, (2005) Mining imperfect data: Dealing with contamination and incomplete records, SIAM, Society for Industrial and Applied Mathematics, ISBN-10: 0898715828, ISBN-13: 978-0898715828, April 1, 2005.
34
S.L.Kendal and M.Creen, (2007) An introduction to knowledge engineering, London :Springer 2007,x,287 p:ill 24cm.
35
Michael Stonebraker and Joseph M. Hellerstein, ( 2001) Content integration for e-business, SIGMOD ’01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data (New York, NY, USA), ACM, pp. 552–560.
36
Diane M. Strong, Yang W. Lee, and Richard Y. Wang, (1997) Data quality in context, Commun. ACM 40, no. 5, 103–110.
37
Dalenius T., (1977) Towards a methodology for statistical disclosure control, Statistisk Tidskrift 5.
38
Graham Williams, Data mining desktop survival guide, Togaware Pty.Ltd, 24,May 2009.
39
Yi Yu Yao and Ning Zhong, (2000) On association, similarity and dependency of attributes, PAKDD, Springer Verlag Berlin Heidelberg, pp. 138–141.
Acknowledgments This paper was developed within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government's Cooperative Research Centre Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DETECTING MIS-ENTERED VALUES IN LARGE DATA SETS
Kalaivany Natarajan, Jiuyong Li, Andy Koronios
Cooperative Research Centre for Integrated Engineering Asset Management - System Integration and IT
School of Computer and Information Science, University of South Australia, Mawson Lakes - 5095, Australia
Data is a valuable asset of business organisations and companies, and quality data is essential for business intelligence and intelligent decision-making. Data quality is a main issue in quality information management, and data quality control is now recognised by most large business organisations. Various mechanisms have been employed to ensure that quality data are obtained, for example the use of electronic forms for data collection. With the popularity of collecting data from electronic forms, mis-entered values have become a major source of dirty values in a database. Mis-entered values can be caused by randomly ticking choices from drop-down selection lists. These dirty values are more inconspicuous than traditional data entry errors and misspellings, since mis-entered values have correct spelling and normally do not cause integrity violations. In this paper, we discuss some data mining methods used for detecting mis-entered values in large data sets and present a framework for detecting mis-entered values using association rules.
Key Words: Data Cleaning, Mis-entry, Data Mining, Association rule
1
INTRODUCTION
Maintaining a database of good quality is one of the most important issues in an information-oriented organisation, and identifying and correcting errors is a challenge. Missing and mis-entered values are common in large databases and degrade the quality of the data. Mis-entered values normally do not look like erroneous data, because they hold a correct value for the specified attribute field. For example, in an online credit card application form the customer needs to fill in both personal and account details. The account type attribute has the choices Award saver, Cheque and Business account, and the system may set one of them as the default value. If the customer forgets to select the account type, the default is taken as the account type. In this case we are unable to find the exact mis-entered values, since some customers may genuinely have the default account type; it is hard to differentiate between correct values and mis-entered values. In another case, a selection list contains a number of options and the cursor points to a particular value to be selected. The customer may not notice whether the selected value appears correctly, and the selected value is not the one the customer intended to choose. This is not a missing value but a mis-entered value. Mis-entered values are difficult to detect in many circumstances. Firstly, they are not misspellings: values selected by customers have correct spellings and are relevant to the corresponding attribute. Secondly, they do not cause violations of integrity constraints, since the data type and length of the values are accurate. Cleaning mis-entries manually is also practically impossible unless the person concerned is known. Data cleaning is an application of data mining. Data mining is the process of analysing and summarising data into useful information; it discovers patterns and relationships in large data sets (12) and helps to detect and correct dirty values in large data sets. There are many methods to clean a database, such as ETL (Extract, Transform and Load); the ETL process is applied for detectable errors. The Potter's Wheel approach is an interactive system for data transformation and cleaning. It allows users to gradually build a transformation to clean the data by adding transforms through graphical operations or through examples (13). The Potter's Wheel approach focuses on data cleaning with discrepancy detection by
graphical and example transformation. It does not deal with interactive query processing, and in this case finding mis-entries is difficult. Another approach is probabilistic noise identification and data cleaning (8). This method identifies and removes corrupted records with the LENS (Learning Explicit Noise System) algorithm, and it facilitates data cleaning and the collection of future records through new explicit models. The LENS algorithm mostly concentrates on the noise detection and data collection process, since corruption occurring in a single field affects the entire record; such methods do not work well for mis-entry detection. This paper aims to explore data mining methods for the detection of mis-entered values, since detecting and cleaning mis-entries is difficult, as discussed above. We study different data cleaning methods to identify and correct mis-entered data. The paper is organised as follows. Section 2 gives a basic idea of mis-entered data and the different kinds of sources that cause this problem, and we discuss the consequences in various applications such as tree construction, correlation and regression, and statistical hypothesis testing. In Section 3 we discuss related work on detecting mis-entered values in large databases; the main work is a heuristic approach that finds disguised missing values and cleans the database, and in addition we discuss partial domain knowledge, univariate methods and the Q-Q plot for detecting mis-entered data. In Section 4 we present a framework for detecting mis-entered values using association rules, suitable for both unsupervised and supervised models. In the final section we conclude the paper.
2
MIS-ENTERED DATA
Sometimes unknown, inapplicable or unspecified responses are encoded as valid data values; this is one situation in which mis-entered values occur. In other cases, many applications have missing values that may not be explicitly represented as such, but instead appear as potentially valid data values (11). Beyond this there are many sources of mis-entered data. In the following sections we show various reasons for mis-entered values in large data sets.
2.1 Basis of Mis-entered Data
Online data entry and user registration forms are constructed with validation software. For example, an online ticket reservation form designed in HTML and validated by JavaScript will not let the user proceed if the phone number is entered as a string value; a dialogue box with the message "Enter Numeric Value" is shown and the value is re-entered. The user may think that what they entered was correct because it was accepted by the validation software: the values fit the given field, but they may not be correct. This is one situation in which mis-entered values occur. In another situation, users do not want to expose their private information to others and hide information such as DOB, phone number and address. Mis-entered values are also produced by deliberate fraud (11) and mischievous activities, for example railway tickets being reserved with the details of deceased people.
2.2 Consequences of Mis-entered Data
Mis-entered values are not a small error; they produce big problems in different parts of a database. In the following sections we see how the consequences arise in various areas.
2.2.1 Problems in Classification Tree Construction
Classification trees are used to predict the classes of objects from their measurements on one or more predictor variables (14). Each node is constructed by a prediction in a top-down approach: a tree-constructing algorithm analyses database values and builds a tree from the corresponding fields. Mis-entered values may affect the tree structure greatly. For example, in an employer table the allowances (Dearness Allowance, Rent Allowance, Travelling Allowance and Superannuation) are calculated from the employee's basic salary; if the basic salary contains a mis-entry, the total payment calculation will be incorrect, and this may affect the entire database. Classification trees are very sensitive to small changes in the dataset from which they are built (2; 3), so a small number of mis-entered values will produce different trees.
2.2.2 Simple Statistics and Correlations
In some situations mis-entered values appear as anomalous values in large datasets; the mean values are then shifted toward the anomalous values, which increases or decreases the sample standard deviation (16). Correlation measures the association between variables, which are designated as dependent or independent. Correlation coefficients are calculated for ordinal data, with values varying from -1 to +1. A value of -1 indicates a negative correlation and +1 a positive correlation between the variables; if the value is 0 there is no association. If the value -1 is entered instead of +1, a positive correlation is represented as a negative one. In a
negative correlation, as the value of one variable increases the value of the other decreases, whereas in a positive correlation both variables increase together (17). If values are mis-entered, it is difficult to find the exact association between the variables.
2.2.3 Hypothesis Tests
A hypothesis test is a method of making statistical decisions using experimental data; it is a procedure that uses the data to choose between alternatives (5). Mis-entered values can therefore influence the statistical decisions made by hypothesis tests. A test statistic is calculated from the sample data, and this value is used to decide whether the null hypothesis should be rejected or not; the test statistic depends on the probability model (18). If there is a mis-entered value in the experimental data, it changes the probability model and the statistical decision changes automatically.
Table 1. Medical dataset with mis-entered values

Rec.no   Med.No   Name    Gender   DOB        Age        Address            Suburb       ZIP    Disease
1        6043     Lisa    M        6-11-61    48         1,Connell Street   Salisbury    5109   Cervical Cancer
2        5061     Ana     F        2-9-30     79         6,Jersey Ave       Parafield    5108   Pregnant
3        5019     Judy    F        15-06-73   36         2,Tearnby Dr       Adelaide     5096   Breast Cancer
4        5018     James   M        1-1-2009   5 Months   6,Wicklow street   GoldenGrve   5125   Toothache

3
RELATED WORK
Mis-entered values occur in real datasets and can be responsible for substantial bias in the analysis of results. Detecting mis-entered values is not an easy task, but some methods exist for detecting this sort of problem, particularly those arising from electronic data entry systems and intentional fraudulent entries. In the following sections we review some techniques for detecting mis-entered values in large database systems.
3.1 Cleaning Mis-Entered Values using an Embedded Unbiased Sample Heuristic
A heuristic approach to cleaning mis-entered data is designed by (7). It uses a heuristic technique, the Embedded Unbiased Sample (EUS), to identify mis-entered values in large databases. In some situations a user picks a value for an attribute from a set of values that may not be correct for their needs; such errors are cleaned by the heuristic approach. Values in the domain are stored in a table, and if an entry value is missing then a disguised missing value is stored in the table. In this method two tables are maintained to detect the mis-entered values: the truth table TT and the recorded table RT. The truth table contains the values that should have been stored, and there is a one-to-one mapping between tuples in TT and tuples in RT that identifies the relationships among them. There are two possible states for identifying mis-entered values: 1. if the value of an attribute is not missing in TT, then it is collected correctly in RT; 2. if the value of an attribute in TT is missing, then the value in the recorded table is either explicitly missing or some legal value. The recorded table RT is what is given for data cleaning, since the truth table is not available. The main problem is to find the frequently used disguised missing and mis-entered values; only a small set of values is repeatedly mis-entered, and those values are identified via the recorded table. Some fields face mis-entry problems frequently when a selection list is used to pick the values, e.g. account type, gender and DOB.
Embedded Unbiased Sample Heuristic (EUS)
If a value v1 is a frequently used mis-entered value on attribute AT, then RT contains a large subset that is
an unbiased sample of RT except on attribute AT. EUS is thus a heuristic way to find frequently used mis-entered values. For a value v1 of attribute AT, the set of tuples in RT carrying that value on the attribute is called the projected database. The assumption of this method is that only a small number of values are frequently used as mis-entered values. The projected databases show which small number of attribute values contain large subsets that are unbiased samples of the whole table; such attribute values are expected to be the frequently used mis-entered values.
Table 2. Training dataset for medical database

Rec.no   Med.No   Name    Gender   DOB        Age   Address            Suburb       ZIP    Disease
1        6043     Lisa    F        6-11-61    48    1,Connell Street   Salisbury    5109   Cervical Cancer
2        5061     Ana     F        2-9-80     29    6,Jersey Ave       Parafield    5108   Pregnant
3        5019     Judy    F        15-06-73   36    2,Tearnby Dr       Adelaide     5006   Breast Cancer
4        5018     James   M        1-1-55     54    6,Wicklow street   GoldenGrve   5125   Hypertension
5        5091     Paul    M        18-4-84    25    3,Observation Dr   Para hills   5096   Chicken pox
6        5016     Kelly   F        9-3-65     44    2,Park valley      Highbury     5122   Menopause
7        5129     Joe     F        2-9-70     39    18,Cataline Ave    Woodville    5107   Ovarian Cancer
Finding the Maximal Embedded Unbiased Sample (Mv)
Mv, the maximal subset of a projected database that is an unbiased sample of the whole data set D, is called the maximal embedded unbiased sample. The size and quality of Mv indicate how likely it is that the value is a mis-entered value. Maximal embedded unbiased samples are derived from the projected databases with a greedy approach, which is applied to subsets of the projected database to find approximate maximal unbiased samples. The method first tests whether the projected databases of the frequent values in the attribute are unbiased samples of the whole database; the heuristic is applicable when most of the projected databases are not unbiased samples. Mining frequent mis-entered values takes place in two phases.
Phase 1: check whether the projected databases of the values on attribute AT are unbiased samples of the truth table TT; if so, derive the maximal unbiased sample for each value on attribute AT, and finally find the largest maximal embedded unbiased sample values, which indicate the frequent mis-entered values.
Phase 2: the frequent mis-entered values are verified by domain experts or by existing cleaning methods.
Some other techniques are available to detect mis-entered values (16) (4), but they are designed from particular points of view and apply only to specific applications. Existing approaches depend on deep background knowledge, and domain knowledge is mostly incomplete or even unavailable for many database tasks. The Embedded Unbiased Sample heuristic approach finds mis-entered values without any background knowledge and is suitable for generic applications.
3.2 Partial Domain Knowledge
Domain knowledge consists of information about the data that is already available either through some other discovery process or from a domain expert, and it provides an indication of the sources and quality of the data (1). Domain-specific knowledge is used in the detection of mis-entered values: syntactic checks are performed to verify that each element present in the output matches its expected data type.
Figure 1. Detecting mis-entered values by an unsupervised model
Subsequently, a semantic check spots the incorrect values (9). Partial domain knowledge is also useful for identifying mis-entered data (11). For example, in some situations we do not have high and low bounds for identifying anomalies, as with blood pressures or stock prices, so zero is used, and positive and negative values are used to detect anomalies.
3.3 Detectable Outliers
An outlier is an entry in a dataset that is irregular with respect to the behaviour seen in the majority of the other entries in the dataset (19). Data values are checked across the different levels of a categorical variable to identify their distribution; this is a basic step in data analysis and exposes the most obvious outliers. Univariate checks detect the presence of outliers but make distributional assumptions that are often not relevant (6). Outliers are detected from the database to identify mis-entered data, with three considerations. First, detect the outliers but do not treat all mis-entries as outliers; generally, most outliers are treated as potential mis-entered values. For example, if the number of males is larger than the number of females in a census database, we may find that females have been mis-entered as males, a possible situation in which outliers should be treated as mis-entered values. Second, determine additional outliers in the same database other than those found in the first step; these outliers are examined to find further mis-entered values. Third, use various univariate procedures to identify other groups of outliers in the same dataset (16).
3.4 Distributional Anomalies
The univariate method detects only particular types of anomalies. A single value may occur with anomalous frequency; such values are not considered outliers but distributional anomalies. Distributional anomalies are identified with graphical tools such as the quantile-quantile (Q-Q) plot. Quantiles are points taken at regular intervals from the cumulative distribution function (CDF) of a random variable; the q-quantiles divide the data sequence into q equal-sized subsets and mark the boundaries between consecutive subsets. If the distribution of the data sequence is normal, the plot is approximately a straight line. Q-Q plots tend to highlight repeated-value distributional anomalies like those frequently associated with mis-entered values (15; 16).
4
A FRAMEWORK FOR MIS-ENTERED VALUE DETECTION USING ASSOCIATION RULES
Association rule mining finds relationships among large data sets and represents the results as rules of the form IF-THEN. Association rules can be used for checking the consistency of records in a database; for example, the rule Country code: 0061 → Country: Australia holds with 100% confidence. Likewise, each record can be characterised by rules derived from the set of transactions in a large database. In this paper mis-entered values are detected in large data sets by association rules. We present a framework for detecting mis-entered values and show how association rules can be used to discover those values, using as an example a medical database that stores information about hospital employees and patient details.
4.1 Detecting Mis-entered Values by an Unsupervised Model
Unsupervised models do not predict a target value but focus on the intrinsic structure, relations and interconnections of the data (10). Association rules are one type of unsupervised function. Detecting mis-entered values by an unsupervised method uses association rules: the rules do not know the exact value for a given attribute, but they predict the mis-entered values from the relations and interconnections of the data. Table 1 contains medical records with mis-entered patient values. In record 1, a male patient is recorded as suffering from cervical cancer; cervical cancer affects only females, not males, so either the gender or the disease value was
mis-entered, and we can determine that a value here is wrong. In record 2 a woman consults a doctor for a pregnancy issue, but her DOB is given as 1930; if she was born in 1930 her age is 79, and it is impossible for her to be pregnant at the age of 79, so either the age or the DOB was mis-entered. Record 3 holds the patient's address: the suburb field has the value Adelaide, but the postcode of the suburb is mis-entered. Finally, in record 4 the patient James is recorded as having a toothache at the age of 5 months; babies do not yet have teeth at 5 months, so the age or the disease value is mis-entered.

Table 3. Test dataset for medical database

Rec.no   Med.No   Name     Gender   DOB       Age   Address           Suburb           ZIP    Disease
6        6123     Jan      F        7-12-90   19    2,Ann Street      Prospect         5104   Cervical Cancer
7        6214     Phil     M        23-1-80   29    12,John St        Greenfield       5093   Breast Cancer
8        6134     Michel   M        3-6-66    45    21,Park Terrace   Salisbury        5109   Ovarian Cancer
9        6512     Sally    F        12-9-60   35    4,Brown Tce       Woodville Park   5065   Allergy
Figure 2: Detecting mis-entered values by a supervised model
A dataset should support a set of association rules such as the following:
1. If disease = Cervical cancer Then gender = Female (Confidence = 100%)
2. If disease = Pregnant Then gender = Female (Confidence = 100%)
3. If suburb = Adelaide Then postcode = 5006 (Confidence = 100%)
4. If disease = Toothache Then age = between 1 and 90 (Confidence = 100%)
In a medical database containing some mis-entered values, association rule mining finds the records for which such rules hold with less than 100% confidence, which indicates that some values are mis-entered; the small number of records that prevent a rule from being perfect are suspected to be mis-entered. Fig. 1 shows the structure of the unsupervised model: the real-world data consist of a set of records, association rules are generated for the given record set, and mis-entered values are discovered from those records.
4.2 Detecting Mis-entered Values by a Supervised Model
In this model we divide the database into a training dataset and a test dataset; the training dataset is shown in Table 2 and the test dataset in Table 3. The training data are used to check incoming data for mis-entries. Rules are generated from the training data set and are monitored and modified by domain experts. This set of rules is applied to the real dataset to check
consistency of the real-world data, which are entered by the end user. Fig. 2 shows how the supervised model works. Record 8 contains the values 6134, Michel, M, 03-06-66, 45, 21,Park Terrace, Salisbury, 5109, Ovarian Cancer. This record is tested against the training set, which finds a mis-entered value related to the disease field: the rule If disease = Ovarian cancer Then gender = F holds, but in record 8 the gender field has the value M, so the record contains a mis-entered value.
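A minimal sketch of this supervised check is given below: rules obtained from the training data (written by hand here) are applied to an incoming record, and a record that satisfies a rule body but contradicts its consequent is flagged. The dictionary keys and the hard-coded rules are illustrative assumptions.

```python
# Flag a record as mis-entered when it matches a rule body but not the consequent.
rules = [
    ("disease", "Ovarian Cancer",  "gender", "F"),
    ("disease", "Cervical Cancer", "gender", "F"),
    ("suburb",  "Adelaide",        "zip",    "5006"),
]

record8 = {"med_no": "6134", "name": "Michel", "gender": "M",
           "suburb": "Salisbury", "zip": "5109", "disease": "Ovarian Cancer"}

for attr, value, target, expected in rules:
    if record8.get(attr) == value and record8.get(target) != expected:
        print(f"possible mis-entry in {target!r}: found {record8[target]!r}, "
              f"rule says {attr}={value} -> {target}={expected}")
```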
5
CONCLUSIONS
In this paper we have discussed mis-entered values in large data sets. Mis-entered data arise most of the time from electronic forms, intentional fraud and information deliberately hidden by the user. Mis-entered data create more serious issues than data entry errors and misspellings, because mis-entered values always have correct spellings and never violate integrity constraints. We have discussed what a mis-entry is, how mis-entries occur and what problems arise due to mis-entered values, as well as the data cleaning methods used for mis-entry detection and correction, such as the heuristic approach, partial domain knowledge and univariate methods. These methods give an idea of how to identify mis-entered values and produce quality data for intelligent decision-making in business organisations and companies. Finally, we have presented a framework for detecting mis-entered values using association rules.
6
REFERENCES
1
Sarabjot S. Anand, David A. Bell, and John G. Hughes (1995) The role of domain knowledge in data mining, CIKM ’95: Proceedings of the fourth international conference on Information and knowledge management (New York, NY, USA), ACM, 37–43.
2
L. Breiman, J.H. Friedman, R.A. Olshen and C.J. Stone (1984) Classification and regression trees, Wadsworth Interna- tional, Canada.
3
Leo Breiman (1996) Bagging predictors, Machine Learning, Springer Netherlands, 24, 123–140.
4
D. DesJardins (2001) Outliers, inliers and just plain liars - new graphical EDA+ (EDA plus) techniques for understanding data, In Proc. SAS User's Group International Conference (SUGI 26) (Long Beach, CA).
5
Dennis Howitt and Duncan Cramer (2004) The Sage dictionary of statistics, p. 76.
6
Robin High (2000) Dealing with 'outliers': How to maintain your data's integrity, Computing News, Tech. Report.
7
Ming Hua and Jian Pei (2007) Cleaning disguised missing data: a heuristic approach, KDD (Pavel Berkhin, Rich Caruana, and Xindong Wu, eds.), ACM, 950–958.
8
Jeremy Kubica and Andrew W. Moore (2003) Probabilistic noise identification and data cleaning, ICDM, IEEE Computer Society, 131–138.
9
Jussi Myllymaki (2001) Effective web data extraction with standard xml technologies, ACM 1-58113-348, Hong Kong.
10
Oracle database documentation Library (2005) Oracle: Data mining concepts, 10g Release 2(10.2) ed..
11
Ronald K. Pearson (2006) The problem of disguised missing data, SIGKDD Explorations 8(1), 83–92.
12
Erhard Rahm and Hong Hai Do, Data cleaning (2000) Problems and current approaches, IEEE Data Eng. Bull. 23(4), 3–13.
13
Vijayshankar Raman and Joseph M. Hellerstein (2001) Potter’s wheel: An interactive data cleaning system, VLDB (Peter M. G. Apers, Paolo Atzeni, Stefano Ceri, Stefano Paraboschi, Kotagiri Ramamohanarao, and Richard T. Snodgrass, eds.), Morgan Kaufmann, 381–390.
14
Brian D. Ripley (1996) Pattern recognition and neural networks, Cambridge University Press.
15
R.J. Serfling (1980) Approximation theorems of mathematical statistics, John Wiley and Sons.
16
Ronald.K.Pearson (2005) Mining imperfect data: Dealing with contamination and incomplete records, SIAM, Society for Industrial and Applied Mathematics, ISBN-10:0898715828, ISBN-13:978-0898715828.
17
William M.K. Trochim (2006) Research methods and knowledge base, Atomic dog publishing.
811
18
Valerie J. Easton and John H. McColl (1997) Statistics glossary, Tech. Report.
19
V.Barnett and T.Lewis (1994) Outliers in statistical data, 3rd ed., Wiley.
Acknowledgments This paper was developed within the CRC for Integrated Engineering Asset Management, established and supported under the Australian Government's Cooperative Research Centre Programme.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
INFERENCES ON NON-PARAMETRIC METHODS FOR THE ESTIMATION OF THE RELIABILITY FUNCTION WITH MULTIPLY CENSORED DATA
G.A. Bohoris a & P.A. Kostagiolas b
a
Department of Business Administration, University of Piraeus, Karaoli & Dimitriou 80, GR-185 34 Piraeus, Greece, Tel : +30 210 4142253 & Email: [email protected]
b
Department of Archive and Library Science, Ionian University, Ioannou Theotokis 72, GR-491 00 Corfu, Greece Tel: +30 26610 87402 & Email: [email protected].
In today's highly competitive business environment, the role of reliability management for improving product performance becomes crucial. Indeed, reliability considerations should be included as vital elements within modern management and business practices. Reliability management is complex because it involves a number of different activities and responsibilities that take place throughout the life-cycle of products and/or services. In reliability practice, the description and study of data, which are often incomplete (censored), is facilitated through the estimation of specific density distribution functions. Although a number of quite efficient modelling techniques have been made available in the literature, the ones that prevail and are thus widely employed are the non-parametric methods, where reliability estimates are obtained directly from the data. In this paper three such inference techniques are reviewed (i.e. the Kaplan-Meier, the Cumulative-Hazard and the Piecewise Exponential Estimator) and the numerical differences when all three are employed for the same data set are analytically established. Furthermore, a study regarding the finite sample behaviour in terms of estimated mean squared error based on Monte Carlo simulations is presented.
Key Words: reliability management, non-parametric methods, Kaplan-Meier, Cumulative-Hazard, Piecewise Exponential Estimator, finite sample behavior, simulation
1
INTRODUCTION AND NOTATION
In reliability management and maintenance the study of the life characteristics of a piece of engineering equipment is often facilitated through estimation of the survival function from a set of censored data. For this purpose some very efficient modelling techniques have been introduced and studied in the literature [1 & 2 & 3 & 4 & 5]. In this paper three prevalent such inference procedures are discussed within the applied reliability and maintenance context: the Kaplan-Meier (KM) and Cumulative-Hazard (CH) estimators, well known and widely accepted in both the engineering and biomedical worlds, are presented first, followed by the Piecewise Exponential Estimator (PEXE), which is a non-parametric density estimation method. The problem of estimating the reliability function from the data (non-parametric methods) within the reliability and maintenance fields can be described as follows [6]: Assume that a collection of N identical items is put in a life-test experiment; after the termination of the experiment the available data consist of a number of lifelength times (failures) and a number of truncated lifelength times (censorings). The latter is a result of components which, at the end of the study period, either have not reached the end-point event of interest (remain unfailed) or have been removed prior to reaching it. Hence, the outcome of reliability experiments of this nature is indeed a set of lifetimes randomly intermixed with incomplete observations, i.e. multiply censored reliability data. The variable of interest is the lifespan of the units, and an investigator wishes to estimate the reliability function, i.e. the probability of survival of a unit beyond any given time. Among the estimators proposed for the reliability (or survival) function, in this work we focus on non-parametric methods, i.e. those where the reliability is estimated directly from the data. On the other hand, a purely parametric estimator would be based on the assumption that the underlying life distribution belongs to some specific family of distributions and would require estimation of parameters from the sample. The non-parametric estimators employed here are equally available, although due to
their distinct formulation they result in distinct survival probabilities, all approximations of the unique true sample survival function [7, 8, 9]. Of interest to us are, however, their dissimilarities rather than their common characteristics, being all suitable estimation methods. Prior to proceeding any further it is useful to define the notation used: Let us assume that the available multiply censored data set is a random sample of the studied population, consisting of N independent and identically distributed (iid) lifetimes. Let nf be the number of distinct times to failure, denoted as T1, T2, …, Tnf. The censored data sample may be re-ordered by magnitude and written as,
$$0 < t_{0,1} \le \dots \le t_{0,e_0} < T_1 < t_{1,1} \le \dots \le t_{1,e_1} < T_2 < \dots < T_j < t_{j,1} \le \dots \le t_{j,e_j} < \dots < T_{nf} < t_{nf,1} \le \dots \le t_{nf,e_{nf}}$$

where T0 = 0 and Tnf+1 = ∞. Strict inequality is assumed between complete and incomplete observations. Furthermore, let dj be the number of failures occurring at time Tj (j = 1, …, nf), with d0 = 0, i.e. there are no failures at time zero. Denote by ej the number of right-censored observations, i.e. $t_{j,1}, \dots, t_{j,e_j}$, that fall in the interval [Tj, Tj+1) with j = 0, …, nf. Let nj be the number of items "at risk of failure", constituted from the items with lifetimes greater than or equal to Tj (j = 1, …, nf):

$$n_j = \sum_{l=j}^{nf} (d_l + e_l) \;\Rightarrow\; n_0 = N \qquad (1)$$

2
AN OVERVIEW OF THE RELIABILITY ESTIMATORS EMPLOYED
The density estimation procedures known as Kaplan-Meier (KM), Cumulative-Hazard (CH) and Piecewise Exponential Estimator (PEXE) will be reviewed in this section. All three methods can be applied to all types of censored data sets and provide the most accurate estimates of the survival (or reliability) function.
2.1 Kaplan-Meier (KM) Estimator
A quite popular estimator of the survival function is the KM estimator, often referred to as the Product-Limit (PL) estimator, introduced by Kaplan and Meier [10]. The KM density estimation method has played a central role in the analysis of clinical data and has been the standard procedure for estimating the survival function in many biomedical and statistical computer programs. Furthermore, the KM is a key quantity in several more complicated survival analysis models, such as Proportional Hazards, Goodness Of Fit (GOF) and two-sample tests. This is partly due to the fact that the KM estimate reduces to the Empirical Distribution Function (EDF) when used with non-life (complete) data samples. The estimator itself is defined through the product:

$$\hat{R}_{KM}(T_j) = \prod_{l=1}^{j} \frac{n_l - d_l}{n_l} \qquad (2)$$
KM is a step function and is the non-parametric maximum likelihood estimator of the reliability function, while its properties have been studied by a number of people [2, 11 & 12, 3].
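A small sketch of Eq. (2), using the (n_j, d_j) values of the worked example presented later in Table 1, is given below; it reproduces the KM column of Table 2 and is included only as an illustration.

```python
# Kaplan-Meier product-limit estimate, Eq. (2), for the example data of Table 1.
n = [21, 20, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 5, 4, 3]   # items at risk n_j
d = [1] * 15                                                   # failures d_j at each T_j

R = 1.0
for j, (nj, dj) in enumerate(zip(n, d), start=1):
    R *= (nj - dj) / nj        # multiply in the factor for the j-th failure time
    print(j, round(R, 4))      # 0.9524, 0.9048, ... as in the KM column of Table 2
```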
2.2 Cumulative-Hazard (CH) Estimator
The CH procedure, proposed by Nelson [13], estimates the reliability function through computation of the hazard and cumulative hazard functions (chf):

$$\hat{h}(T_j) = \frac{d_j}{n_j} \qquad \text{and} \qquad \hat{H}(T_j) = \sum_{l=1}^{j} \hat{h}(T_l) \qquad (3)$$

Calculation of the reliability is then a direct application of the relationship between the cumulative hazard and the reliability function [14]:

$$\hat{R}_{CH}(T_j) = \exp\left(-\hat{H}(T_j)\right) \qquad (4)$$
The CH estimator is the method mainly used by the engineering world in analysing multiply censored data [1]. When the CH estimator is compared with the KM estimator, the following can be said:
• It results in higher survival probabilities [1 & 6];
• It has the same form as the KM estimator, i.e. they are both step functions;
• It is at least as easy to calculate as the KM estimator;
• It is closely related to graphical assessment techniques (Cumulative Hazard Plots) [14 & 3];
• It tends only asymptotically to zero after the last event in the data. The KM estimator is defined to be zero if the last event happens to be a failure, leading to the rather extreme conclusion that no population failures are statistically possible beyond that point [6].
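For comparison, a similarly minimal sketch of Eqs. (3)-(4) on the same example data follows; it reproduces the chf and CH columns of Table 2 and illustrates the slightly higher CH survival probabilities.

```python
# Cumulative-hazard (Nelson) estimate, Eqs. (3)-(4), for the example data of Table 1.
import math

n = [21, 20, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 5, 4, 3]   # items at risk n_j
d = [1] * 15                                                   # failures d_j at each T_j

H = 0.0
for j, (nj, dj) in enumerate(zip(n, d), start=1):
    H += dj / nj                 # Eq. (3): accumulate the hazard contributions
    R_ch = math.exp(-H)          # Eq. (4): reliability from the cumulative hazard
    print(j, round(H, 5), round(R_ch, 4))   # matches columns 7 and 8 of Table 2
```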
2.3 Piecewise Exponential Estimator (PEXE)
The Piecewise Exponential Estimator was introduced by Kitchin et al [15], and among the few who have studied this very promising non-parametric estimator are: Whittemore and Keler [16], who produced modified versions of PEXE to overcome the additional difficulty imposed by the presence of ties in the data; Mimmack and Proschan [17], who provided results for the discrete counterpart of PEXE, the Piecewise Geometric Estimator (PEGE); and Kim and Proschan [18], who compared PEXE with the KM. A similar estimator was proposed by Kulaskekera & White [19]. The authors produced an estimator based on the estimation of the hazard and the cdf in the intervals between successive failures through employment of the TTT, computed however on the failure intervals closed from the left, [Tj, Tj+1) with j = 0, …, nf. The procedure was derived as a modification of the CH method, using the TTT instead of the risk set (nj) in the denominator of the hazard function and then estimating the cdf and, thus, the survival function. The proposed procedure is indeed a variant of the PEXE method with the time intervals closed from the left. More recently, Goodman et al. [20] studied estimators based on change-point hazard functions and Malla [9] extended PEXE and studied its small-sample properties through simulation. PEXE is based on the Total Time on Test (TTT) between successive failures and the assumption that the hazard function between the observed failure times is constant; the survival function within the failure intervals therefore assumes an exponential form. The exponential pieces of the survival function are then extrapolated and the estimator follows. The successive steps for computing PEXE are summarised below:
Step 1: Let us assume again that a number of items, say N, are put on test and the basic sample quantities are computed (nj, dj and ej for j = 0, …, nf).
Step 2: Calculate an estimate of the hazard function separately on each of the intervals between successive failure times in the censored survival data through the following expression:
(observed number of failures in the time interval between successive failures) / (observed TTT in the time interval between successive failures)    (5)
The hazard function estimates (Eq. 5) are then defined at the observed distinct failure times Tj (j = 1, …, nf) by,
$$\hat{h}_{PEXE,j} = \frac{\text{number of failures observed during } (T_{j-1}, T_j]}{\text{TTT observed during } (T_{j-1}, T_j]} = \frac{d_j}{TTT_j} \qquad (j = 1, \dots, nf) \qquad (6)$$

The procedure for the computation of the TTT_j's (j = 1, …, nf) is captured by the following computational expression:

$$TTT_j = \sum_{l=1}^{e_{j-1}} \left(t_{j-1,l} - T_{j-1}\right) \;+\; d_j \left(T_j - T_{j-1}\right) \;+\; \left(\sum_{l=j+1}^{nf} d_l + \sum_{l=j}^{nf} e_l\right)\left(T_j - T_{j-1}\right)$$

where the first term is the time on test of the censorings observed during [T_{j-1}, T_j), the second term is the time on test of the failures observed during (T_{j-1}, T_j], and the third term is the time on test of all remaining observations (failures and censorings) with lifetimes greater than T_j.

Step 3: Thereafter, an exponential survival function is independently computed for each of the time intervals between the successive failures in the survival data, the exponential "pieces" (one for each failure interval) are joined and the continuous piecewise exponential survival estimator follows:

$$\hat{R}_{PEXE}(t) = \begin{cases} 1 & \text{for } t \le 0 \\ \exp\left\{-\left[\hat{h}_{PEXE,j}\,(t - T_{j-1}) + \sum_{l=1}^{j-1} \hat{h}_{PEXE,l}\,(T_l - T_{l-1})\right]\right\} & \text{for } T_{j-1} < t \le T_j,\; j = 1, \dots, nf \\ \text{no estimator} & \text{for } t > T_{nf} \end{cases} \qquad (7)$$
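A short sketch of Eq. (6) follows. It computes each interval's TTT directly as the time on test accrued by every unit inside the interval, which is equivalent to the three-term expression above, and reproduces the TTT and PEXE hazard columns of Table 2 (e.g. TTT_3 = 595); the encoding of censored lifetimes as negative numbers follows Table 1 and is otherwise an assumption of this sketch.

```python
# PEXE hazard estimates, Eq. (6): TTT_j is the total time on test accumulated by all
# units within the failure interval (T_{j-1}, T_j]; censored lifetimes are negative.
data = [69, 176, -195, 208, 215, 233, 289, 300, 384, 390, -393,
        441, 453, 567, -617, -718, 719, 783, 900, -1000, -1022]
T = sorted({t for t in data if t > 0})               # distinct failure times T_1..T_nf

prev = 0
for j, Tj in enumerate(T, start=1):
    ttt = sum(max(0, min(abs(x), Tj) - prev) for x in data)  # time on test in (prev, Tj]
    d_j = sum(1 for x in data if x == Tj)                    # failures at T_j
    print(j, ttt, round(d_j / ttt, 5))   # e.g. j=3 -> 595, 0.00168 as in Table 2
    prev = Tj
```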
2.4 Inferences on the characteristics of KM, CH and PEXE estimators The computational effort required for the implementation of PEXE method is slightly increased when compared to either KM or CH reliability estimators. An illustrative example for the computation of the KM, CH and PEXE reliability estimators is presented later on. Kitchin et al [15] showed that PEXE and CH are asymptotically equivalent, i.e. both estimators converge to the unknown true reliability function with probability 1 as the sample size increases without limit. Moreover, the same is known for the KM and the CH estimators. The practical significance is that for testing situations with large sample sizes with a moderate amount of censored observations the three reliability estimation methods are equally applicable with little differences in the numerical values of the resulting survival probabilities. With reliability data, however, which are often heavy censored including only a moderate number of incomplete observations, certain issues arises such as the ones mentioned bellow [7]: 2.4.1 Reliability function representation PEXE is constructed specifically under the assumption that the underlying distribution is continuous and results in a continuous survival function. The estimator preserves the assumed properties of the underlying life distribution by taking into account and reflecting the continuity of any life process. PEXE is therefore decreasing at any possible failure time within the observation window and not only at the observed failure times. On the other hand, KM and CH result in a step function representation of the underlying distribution with discontinuities at the observed complete observations in the data. 2.4.2 Hazard function estimates In PEXE procedure a constant hazard function is assumed separately for each of the time intervals between successive failures. In the computation of the hazard function estimates however, only the relevant information to the particular failure interval is utilized. Therefore, the constant hazards may only provide a reasonable estimate for the proneness to failure at any possible time within each specific failure interval. On the other hand, the KM and CH estimators suggests zero hazard within the time intervals between successive failure times prompting to the unreasonable conclusion that just because no failure is observed at any possible failure time there is no chance of failure at that time and, for that mater, at any time within each of the failure internals in the data. 2.4.3 Information usage PEXE utilizes both interval and ordinal information from the complete and incomplete observations in the data. In fact, the exact location of the censoring and the failure(s) is employed in the computation of the TTT for that interval. KM and CH are insensitive to the exact location of the incomplete observations in the data as long as they remain in their original failure intervals. 2.4.4 Extrapolation beyond the data Inferences beyond the last observation in the data are often of interest. Mimmack & Proschan, [15], introduced the assumption that the estimate of the hazard function beyond Tnf, i.e. in the interval (Tnf , ¥) , is equal to the hazard estimate at the last failure interval that is
$\hat{h}_{PEXE_{nf+1}} \equiv \hat{h}_{PEXE_{nf}}$. Hence, either through this assumption or due to the continuous
nature of the PEXE survival estimator, extrapolation beyond the data is always possible. The KM and CH estimators do not offer this opportunity.
2.4.5 Graphical analysis of the survival data
A significant feature of CH is its relation to graphical analysis techniques and, in particular, the graphical goodness of fit methods, such as the cumulative hazard plots [17 & 18]. However, KM, CH and PEXE, being all equally suitable
estimators of the survival function with multiply censored data, may be employed in a graphical analysis investigation [21].
Table 1: Calculation of the basic sample quantities for the multiply censored data set of the example [7].

Event       Event      Failure    Failure    Basic Sample Quantities
Number i    Time ti*   Number j   Time Tj    nj    dj    ej
 1            69        1           69       21     1     0
 2           176        2          176       20     1     1
 3          -195        –            –        –     –     –
 4           208        3          208       18     1     0
 5           215        4          215       17     1     0
 6           233        5          233       16     1     0
 7           289        6          289       15     1     0
 8           300        7          300       14     1     0
 9           384        8          384       13     1     0
10           390        9          390       12     1     1
11          -393        –            –        –     –     –
12           441       10          441       11     1     0
13           453       11          453       10     1     0
14           567       12          567        9     1     2
15          -617        –            –        –     –     –
16          -718        –            –        –     –     –
17           719       13          719        5     1     0
18           783       14          783        4     1     0
19           900       15          900        3     1     2
20         -1000        –            –        –     –     –
21         -1022        –            –        –     –     –
* Negative lifetimes denote censored observations
2.5 An illustrative numerical example
A multiply censored data sample is presented, the basic sample quantities are calculated and the three procedures KM, CH and PEXE are employed in order to estimate the true underlying survival function. Let us assume that the illustrative multiply censored data set of Table 1 has been made available [1], i.e. N = 21 lifetimes in total, of which nf = 15 are failures and $n_c = \sum_{l=0}^{nf} e_l = 6$ are censored observations. In Table 1 the basic sample quantities are presented. In Table 2 the required computations for the estimation of the survival function according to KM, CH and PEXE are presented. Computation of the KM survival estimator is based on Eq. 2 and is presented in the last column of Table 2. The CH procedure involves estimation of the hazard and the chf (Eq. 3), with the computations presented in columns 6 and 7, and thereupon estimation of the survival function follows in column 8 (Eq. 4).
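For readers who wish to reproduce the KM and CH columns of Table 2, the following minimal sketch (plain Python, not part of the original paper) applies Eqs. 2-4 to the basic sample quantities nj and dj of Table 1:

```python
# Minimal sketch: KM and CH survival estimates from the basic sample quantities
# of Table 1 (n_j = items at risk at T_j, d_j = failures at T_j).
import math

T = [69, 176, 208, 215, 233, 289, 300, 384, 390, 441, 453, 567, 719, 783, 900]
n = [21, 20, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 5, 4, 3]
d = [1] * 15

R_km, H = 1.0, 0.0
for Tj, nj, dj in zip(T, n, d):
    R_km *= (nj - dj) / nj      # Kaplan-Meier product-limit step (Eq. 2)
    H += dj / nj                # cumulative hazard estimate (Eq. 3)
    R_ch = math.exp(-H)         # CH survival estimate (Eq. 4)
    print(f"T={Tj:4d}  R_KM={R_km:.4f}  H_CH={H:.5f}  R_CH={R_ch:.4f}")
```

The first iteration, for example, returns R_KM = 0.9524 and R_CH = 0.9535, matching the first row of Table 2.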
Table 2: Estimated survival probabilities according to the KM, CH and PEXE non-parametric density estimation procedures

                            Computation of PEXE estimator            Computation of KM & CH estimators
Failure  Intervals        TTT      Hazard       Reliability   Hazard       Chf          Reliability   Reliability
j        (Tj-1, Tj]       TTTj     hPEXE(Tj)    RPEXE(Tj)     hCH(Tj)      HCH(Tj)      RCH(Tj)       RKM(Tj)
 1       (0, 69]          1449     0.00069      0.9535        0.04762      0.04762      0.9535        0.9524
 2       (69, 176]        2140     0.00047      0.9070        0.05000      0.09762      0.9070        0.9048
 3       (176, 208]        595     0.00168      0.8595        0.05556      0.15317      0.8580        0.8545
 4       (208, 215]        119     0.00840      0.8104        0.05882      0.21200      0.8090        0.8042
 5       (215, 233]        288     0.00347      0.7613        0.06250      0.27450      0.7600        0.7540
 6       (233, 289]        840     0.00119      0.7122        0.06667      0.34116      0.7109        0.7037
 7       (289, 300]        154     0.00649      0.6631        0.07143      0.41259      0.6619        0.6534
 8       (300, 384]       1092     0.00092      0.6140        0.07692      0.48952      0.6129        0.6032
 9       (384, 390]         72     0.01389      0.5649        0.08333      0.57285      0.5639        0.5529
10       (390, 441]        513     0.00195      0.5115        0.10000      0.67285      0.5103        0.4976
11       (441, 453]        108     0.00926      0.4577        0.11111      0.78396      0.4566        0.4423
12       (453, 567]        912     0.00110      0.4039        0.12500      0.90896      0.4029        0.3870
13       (567, 719]        961     0.00104      0.3448        0.20000      1.10896      0.3299        0.3096
14       (719, 783]        256     0.00391      0.2685        0.25000      1.35896      0.2569        0.2322
15       (783, 900]        351     0.00285      0.1924        0.33333      1.69229      0.1841        0.1548
Computation of the PEXE method is somewhat more complicated and a relatively small number of examples have been made available. On that account the calculation of the PEXE method is exhibited in more detail in columns 3, 4 and 5 of Table 2. According to the computational algorithm of PEXE, for each of the failure intervals $(T_{j-1}, T_j]$ (j=1,…,nf) the TTT is calculated and thus a separate constant hazard function estimate is obtained (columns 3 and 4 of Table 2). For example, for the 3rd failure interval (176, 208], $TTT_3 = (195-176) + 1\cdot(208-176) + (12+5)\cdot(208-176) = 595$. Then the estimate of the hazard function for that interval follows from Eq. 6: $\hat{h}_{PEXE_3} = 1/TTT_3 = 1/595 = 0.00168$. In turn, the hazard estimates are employed in the PEXE survival estimator, in agreement with Eq. 7, in the last step of the computational algorithm of the method. For example, the estimator of the survival function within the third failure interval (176, 208] is expressed by,
$$\hat{R}_{PEXE}(t) = \exp\left\{-\left[\hat{h}_{PEXE_3}(t - T_2) + \sum_{l=1}^{2} \hat{h}_{PEXE_l}(T_l - T_{l-1})\right]\right\}$$

$$\hat{R}_{PEXE}(t) = \exp\{-[0.00168\,(t - 176) + 0.00047\,(176 - 69) + 0.00069 \cdot 69]\}$$

An estimate of the reliability function at a specific failure time may be obtained from the relevant expression of the PEXE survival estimator (Eq. 9), which is computed for the failure interval including the required complete observation, i.e. column 5 in Table 2. Therefore, at the 3rd failure time ($T_3 = 208$), $\hat{R}_{PEXE}(T_3) = 0.8595$.
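The PEXE columns of Table 2 can be reproduced with a corresponding sketch (again illustrative Python, not the authors' code); it recomputes the total time on test, the constant hazard of Eq. 6 and the survival estimator of Eq. 7 directly from the raw lifetimes of Table 1, with negative values marking censored observations:

```python
# Sketch: PEXE hazard and survival estimates for the multiply censored sample
# of Table 1 (negative values denote censored lifetimes, distinct failure times).
import math

data = [69, 176, -195, 208, 215, 233, 289, 300, 384, 390, -393, 441, 453, 567,
        -617, -718, 719, 783, 900, -1000, -1022]
times = [abs(x) for x in data]
failure = [x > 0 for x in data]

fail_times = sorted(t for t, f in zip(times, failure) if f)
cum, prev = 0.0, 0.0
for Tj in fail_times:
    # Total time on test accumulated in the interval (T_{j-1}, T_j] by all items
    ttt = sum(min(max(t - prev, 0.0), Tj - prev) for t in times)
    h_j = 1.0 / ttt                  # Eq. 6 with d_j = 1 failure per interval
    cum += h_j * (Tj - prev)         # exponent of Eq. 7 evaluated at T_j
    print(f"({prev:.0f},{Tj:.0f}]  TTT={ttt:.0f}  h={h_j:.5f}  R_PEXE={math.exp(-cum):.4f}")
    prev = Tj
```

For the third interval (176, 208] this yields TTT = 595, h = 0.00168 and R_PEXE(208) = 0.8595, as in Table 2.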
3 A RELATION FOR THE RELIABILITY PROBABILITIES OBTAINED FROM KM, CH AND PEXE
It has been shown in [1] that the CH estimator consistently results in larger survival probabilities than the KM estimator when both are employed on the same data set. Similar results have been produced by [17] for the case of PEXE when compared with the CH. Hence, $\hat{R}_{CH}(T_j) > \hat{R}_{KM}(T_j)$ and $\hat{R}_{PEXE}(T_j) > \hat{R}_{KM}(T_j)$ for every $T_j$ (j=0,…,nf). Furthermore, one may observe that when all three estimators are computed for the same data set, the survival probabilities obtained by PEXE are
larger than the ones obtained by the CH estimator. This relation, however, is not specific to any particular data set and it will be shown that it holds consistently. In other words,

$$\hat{R}_{PEXE}(T_j) \ge \hat{R}_{CH}(T_j) \;\Leftrightarrow\; \frac{\hat{R}_{PEXE}(T_j)}{\hat{R}_{CH}(T_j)} \ge 1 \quad \text{for } j = 1,\dots,nf.$$

Let,

$$\frac{\hat{R}_{PEXE}(T_j)}{\hat{R}_{CH}(T_j)} = \gamma \;\Leftrightarrow\; \ln\hat{R}_{PEXE}(T_j) - \ln\hat{R}_{CH}(T_j) = \ln\gamma, \quad \text{where } \gamma \text{ is a positive quantity} \qquad (8)$$
The PEXE estimator at any failure time $T_j$ (j=1,…,nf) may be expressed as (Eq. 7),

$$\hat{R}_{PEXE}(T_j) = \exp\left[-\sum_{l=1}^{j} \hat{h}_{PEXE_l}(T_l - T_{l-1})\right], \quad \text{where } \hat{h}_{PEXE_l} \text{ is the hazard estimate} \qquad (9)$$
Employing Eq. 4 for the CH estimator and Eq. 9 for PEXE, Eq. 8 can be rewritten as follows:

$$-\sum_{l=1}^{j}\left[\hat{h}_{PEXE_l}(T_l - T_{l-1}) - \frac{d_l}{n_l}\right] = \ln\gamma \ge 0 \qquad (10)$$

For the right hand side of the above equation to be greater than or equal to zero it is sufficient to show that the left hand side of Eq. 10 is positive or zero and, therefore, it is sufficient to show that each term of the summation is a negative quantity or zero:

$$\hat{h}_{PEXE_l}(T_l - T_{l-1}) - \frac{d_l}{n_l} \le 0 \;\Leftrightarrow\; \hat{h}_{PEXE_l}(T_l - T_{l-1}) \le \frac{d_l}{n_l}, \quad \forall \text{ integer } l \in [1, nf] \qquad (11)$$
Now, from Eq. 6 one can deduce that,

$$\hat{h}_{PEXE_l} = \frac{d_l}{\sum_{m=1}^{e_{l-1}}(t_{l-1,m} - T_{l-1}) + d_l(T_l - T_{l-1}) + \left[\sum_{m=l+1}^{nf} d_m + \sum_{m=l}^{nf} e_m\right](T_l - T_{l-1})} = \frac{d_l}{\sum_{m=1}^{e_{l-1}}(t_{l-1,m} - T_{l-1}) + n_l(T_l - T_{l-1})}, \quad \forall \text{ integer } l \in [1, nf]$$

and hence

$$\hat{h}_{PEXE_l}(T_l - T_{l-1}) = \frac{d_l(T_l - T_{l-1})}{\sum_{m=1}^{e_{l-1}}(t_{l-1,m} - T_{l-1}) + n_l(T_l - T_{l-1})} \qquad (12)$$
The right hand side of Eq. 11 may be re-written as,

$$\frac{d_l}{n_l} = \frac{d_l(T_l - T_{l-1})}{n_l(T_l - T_{l-1})}, \quad \forall \text{ integer } l \in [1, nf] \qquad (13)$$

Comparing Eq. 12 with Eq. 13, for Eq. 11 to be true the quantity $\sum_{m=1}^{e_{l-1}}(t_{l-1,m} - T_{l-1})$ must be greater than or equal to zero. However, by definition $t_{l-1,m} > T_{l-1}$ for every integer $l \in [1, nf]$ and $m \in [1, e_{l-1}]$, therefore $\sum_{m=1}^{e_{l-1}}(t_{l-1,m} - T_{l-1}) \ge 0$. The trivial case of
equality is obtained when no censoring is observed in the interval between two successive failures (i.e. el-1=0). Therefore, the CH and PEXE estimators produce equal survival probabilities until the first censored observation is observed. As a result, the reliability estimates obtained from PEXE and CH are exactly the same when the two estimators are employed with any complete data sample. After the first censored observation the survival probabilities obtained using PEXE method are consistently larger than the ones obtained by CH. A relationship has been established for the three reliability estimators according to which for any multiply censored data sample, the probabilities obtained by KM, CH and PEXE conform to the following inequality,
$$\hat{R}_{PEXE}(T_j) \ge \hat{R}_{CH}(T_j) > \hat{R}_{KM}(T_j) \quad \text{for } j = 1,\dots,nf.$$
Table 3: Monte Carlo Simulation outcome for the Estimated Mean Squared Error of the reliability estimators [7]

                               Weibull shape parameter values
              0.50                   1.00                   2.00                   3.00
 N   %C   Ω1     Ω2     Ω3       Ω1     Ω2     Ω3       Ω1     Ω2     Ω3       Ω1     Ω2     Ω3
 30  10   1.140  1.136  0.996    1.262  1.131  1.010    1.118  1.141  1.021    1.117  1.147  1.027
     25   1.232  1.240  1.006    1.186  1.202  1.013    1.153  1.189  1.031    1.145  1.184  1.034
     50   1.511  1.717  1.137    1.430  1.520  1.063    1.291  1.341  1.039    1.254  1.309  1.044
     60   1.666  2.138  1.283    1.615  1.836  1.137    1.415  1.487  1.051    1.346  1.420  1.055
     70   1.817  2.824  1.554    1.836  2.348  1.279    1.610  1.751  1.088    1.488  1.593  1.071
     80   1.945  2.824  1.554    2.065  3.128  1.515    1.849  2.112  1.142    1.683  1.853  1.101
     90   –      –      –        –      –      –        –      –      –        –      –      –
 50  10   1.086  1.083  0.997    1.075  1.078  1.003    1.071  1.086  1.010    1.067  1.088  1.017
     25   1.152  1.155  1.003    1.113  1.121  1.007    1.093  1.112  1.017    1.087  1.106  1.017
     50   1.395  1.553  1.113    1.288  1.336  1.037    1.179  1.200  1.018    1.148  1.173  1.021
     60   1.553  1.946  1.253    1.443  1.570  1.088    1.265  1.295  1.024    1.207  1.239  1.026
     70   1.710  2.616  1.530    1.681  2.046  1.217    1.424  1.493  1.048    1.317  1.362  1.034
     80   1.887  3.986  2.113    1.974  3.025  1.533    1.728  1.937  1.121    1.542  1.639  1.063
     90   1.992  6.254  3.140    2.197  4.704  2.141    2.122  2.662  1.254    1.858  2.048  1.102
 100 10   1.045  1.043  0.998    1.037  1.039  1.002    1.035  1.044  1.008    1.035  1.045  1.009
     25   1.086  1.087  1.001    1.057  1.061  1.004    1.046  1.054  1.007    1.045  1.052  1.007
     50   1.284  1.402  1.092    1.166  1.185  1.016    1.090  1.095  1.005    1.074  1.084  1.009
     60   1.415  1.715  1.212    1.282  1.341  1.046    1.136  1.141  1.004    1.102  1.113  1.010
     70   1.600  2.346  1.467    1.482  1.689  1.140    1.235  1.252  1.014    1.162  1.179  1.014
     80   1.787  3.700  2.070    1.766  2.487  1.408    1.463  1.554  1.062    1.311  1.345  1.026
     90   1.903  6.935  3.645    2.079  4.840  2.328    2.025  2.603  1.286    1.766  1.910  1.081
4 A SIMULATION STUDY FOR THE BEHAVIOR OF KM, CH AND PEXE
A Monte Carlo Simulation (MCS) study has been conducted in order to exhibit the finite sample behaviour of the KM, CH and PEXE reliability estimators in terms of Estimated Mean Squared Error (EMSE). In every censored sample generated in the simulation experiments the failure and the censoring mechanisms were always Weibull and Exponential, respectively. The Weibull distribution plays a central role in the analysis of reliability and maintenance data [13]. The reason for the selection of the latter relates to the requirement that for the generation of multiply censored data the distribution of the censoring ought to be random as well [11, 9, 7]. The source of standard Uniform deviates was the routine RAN1 obtained from Press et al [23], initialized before each simulation experiment with a different random integer seed, while the inversion method was employed for the generation of random deviates from the Weibull and the Exponential distributions [22]. The percentages of censoring in the data can vary as required by keeping the value of the Weibull shape parameter fixed and appropriately choosing combinations of the distribution means [7]. Thus, multiply censored data samples were generated possessing combinations of the following basic sample properties (an illustrative sampling sketch is given after the list):

Percentage of Censoring: 10%, 25%, 50%, 60%, 70%, 80%, 90%
Weibull Shape Parameter Value: 0.5, 1.0, 2.0, 3.0
Sample Size: 30, 50, 100
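As referenced above, a small illustrative sketch of this sampling scheme follows. It is not the authors' RAN1-based implementation; NumPy's generator is used instead, and the Exponential censoring mean is a hypothetical value chosen only to produce a moderate censoring percentage:

```python
# Sketch: generate one multiply censored sample by the inversion method,
# with Weibull failure times and Exponential (random) censoring times.
import numpy as np

rng = np.random.default_rng(1)           # stand-in for the RAN1 routine of [23]
N, beta, eta = 30, 2.0, 1000.0           # sample size, Weibull shape and scale
theta = 1500.0                            # hypothetical Exponential censoring mean

u1, u2 = rng.random(N), rng.random(N)
t_fail = eta * (-np.log(1.0 - u1)) ** (1.0 / beta)   # Weibull deviates by inversion
t_cens = -theta * np.log(1.0 - u2)                    # Exponential deviates by inversion

lifetimes = np.minimum(t_fail, t_cens)    # observed (possibly censored) lifetimes
is_failure = t_fail <= t_cens             # False -> right-censored observation
print(f"censoring in this sample: {100 * (~is_failure).mean():.1f}%")
```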
The relative performance of the three reliability estimators was examined in relation to the accuracy, in terms of EMSE, of the survival probabilities obtained at the complete observations in the reliability data samples. The EMSE for a reliability estimator, say $\hat{R}$, is defined as follows:

$$\mathrm{EMSE}\{\hat{R}\} = \frac{1}{l}\sum_{i=1}^{l}\frac{1}{N}\sum_{j=1}^{nf}\left[R_o(T_j) - \hat{R}_i(T_j)\right]^2 \qquad (14)$$

where l is the simulation size, N is the sample size and $R_o(T_j)$ (j=1,…,nf) represents the theoretical Weibull survival probabilities estimated at the observed failures. The number of MCS repetitions was fixed to 20,000. Moreover, in order to study the relative behaviour of the methods, the following ratios of the EMSE were obtained:
$$\Omega_1 = \frac{\mathrm{EMSE}\{\hat{R}_{KM}(t)\}}{\mathrm{EMSE}\{\hat{R}_{CH}(t)\}} \;(\text{KM compared with CH}), \quad \Omega_2 = \frac{\mathrm{EMSE}\{\hat{R}_{KM}(t)\}}{\mathrm{EMSE}\{\hat{R}_{PEXE}(t)\}} \;(\text{KM compared with PEXE}), \quad \Omega_3 = \frac{\mathrm{EMSE}\{\hat{R}_{CH}(t)\}}{\mathrm{EMSE}\{\hat{R}_{PEXE}(t)\}} \;(\text{CH compared with PEXE}) \qquad (15)$$
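A compact sketch of how such an EMSE-ratio experiment can be assembled is given below. It is illustrative only (not the code used to produce Table 3), uses a small number of repetitions, assumes distinct failure times, and relies on hypothetical Weibull/Exponential parameter values:

```python
# Sketch of the EMSE-ratio experiment (Eqs. 14-15): simulate multiply censored
# Weibull/Exponential samples, estimate R by KM, CH and PEXE at the observed
# failure times, and accumulate squared errors against the true Weibull survival.
import numpy as np

def estimators(times, failed):
    """Failure times plus KM, CH and PEXE survival estimates at those times."""
    order = np.argsort(times)
    t, f = times[order], failed[order]
    n = len(t)
    Tj, R_km, R_ch, R_pexe = [], [], [], []
    H, cum, prod, prev = 0.0, 0.0, 1.0, 0.0
    for j in range(n):
        if not f[j]:
            continue
        at_risk = n - j                       # items with lifetime >= t[j]
        prod *= (at_risk - 1) / at_risk       # KM product-limit step
        H += 1.0 / at_risk                    # cumulative hazard (CH)
        ttt = np.sum(np.clip(t - prev, 0.0, t[j] - prev))   # TTT in (prev, t[j]]
        cum += (t[j] - prev) / ttt            # PEXE exponent: h_j * interval length
        Tj.append(t[j]); R_km.append(prod); R_ch.append(np.exp(-H)); R_pexe.append(np.exp(-cum))
        prev = t[j]
    return np.array(Tj), np.array(R_km), np.array(R_ch), np.array(R_pexe)

def emse_ratios(N=30, beta=1.0, eta=1000.0, theta=1500.0, reps=2000, seed=1):
    rng = np.random.default_rng(seed)
    sq = {"KM": 0.0, "CH": 0.0, "PEXE": 0.0}   # accumulated squared errors
    for _ in range(reps):
        tf = eta * (-np.log(1 - rng.random(N))) ** (1 / beta)
        tc = -theta * np.log(1 - rng.random(N))
        times, failed = np.minimum(tf, tc), tf <= tc
        if not failed.any():
            continue
        Tj, km, ch, pexe = estimators(times, failed)
        Ro = np.exp(-(Tj / eta) ** beta)        # true Weibull survival probabilities
        for key, est in (("KM", km), ("CH", ch), ("PEXE", pexe)):
            sq[key] += np.sum((Ro - est) ** 2) / N
    # dividing by reps would give the EMSE; the ratios are unaffected
    return sq["KM"] / sq["CH"], sq["KM"] / sq["PEXE"], sq["CH"] / sq["PEXE"]

print(emse_ratios())   # (Omega_1, Omega_2, Omega_3)
```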
Table 3 presents the results of the simulation experiments for Ω1, Ω2 and Ω3, for all combinations of sample sizes, percentages of censoring and Weibull shape parameter values. The following may, in summary, be observed in terms of the EMSE for the finite sample behaviour of the non-parametric reliability estimation methods:

• The overall simulation outcome suggests that utilization of the actual censoring times in the calculation of PEXE substantially improves the accuracy of the reliability estimation method.
• As the sample size increases the differences in terms of EMSE between the KM and the other two survival estimators decrease, with the observed ratios Ω1 and Ω2 consistently above but closer to unity.
• As the amount of censoring in the data increases the difference in performance in favor of PEXE consistently increases and, therefore, the increments of EMSE observed for PEXE are smaller than for KM and CH.
• The probability density function of a Weibull distribution with a decreasing or constant hazard function has a heavier right tail than one with an increasing hazard function. The conservative nature (overestimating the true sample cdf) of the KM and CH reliability estimators is evident especially with an increasing amount of censoring and heavier tailed distributions. The simulation experiments suggest that PEXE performs considerably better than the other two in estimating the reliability function in relation to the shape of the Weibull distribution (i.e. different values of the Weibull shape parameter) and the percentage of censoring in the multiply censored data sample:
  o With a Weibull distribution with a decreasing hazard function, PEXE results consistently in smaller EMSE when compared with KM, while only with 10% censoring does CH perform better than PEXE.
  o With a Weibull distribution having a constant or increasing hazard function, the KM and CH reliability estimators result in higher EMSE than the one obtained for the PEXE estimator.

5 CONCLUSIONS
Three important non-parametric methods for estimating the survival function from multiply censored data were presented. The CH method is frequently employed within applications of reliability analysis in order to estimate the reliability probabilities from the data. PEXE is regarded as a promising procedure that has a number of advantages when compared to the other two, i.e. KM and CH. The availability and suitability of all density estimation procedures in fulfilling the requirements of reliability data with high percentages of incomplete observations were examined in the applied reliability and maintenance context through Monte Carlo Simulations for a wide variety of censored data sets with the Weibull distribution. Furthermore, a relationship has been established for the numerical probabilities from all three reliability estimators when applied to any multiply censored data set. It seems that all three reliability estimators are affected by increments in the amount of censoring. PEXE, however, appears to accommodate incomplete observations in a more consistent manner by utilizing in its procedure both interval and ordinal information from both complete and incomplete observations. Indicative of the superiority of PEXE with increased amounts of censoring in the data is the outcome of the MCS experiments presented in Section 4.
Furthermore, PEXE provides a convincing representation of the true survival function, sustaining the continuous nature inherent in any life process.
6 REFERENCES
1. Bohoris G.A. (1994a) Comparison of the Cumulative-Hazard and the Kaplan-Meier Estimators of the Survivor Function, IEEE Transactions on Reliability, 43(2), 230-232.
2. Lawless J.F. (1982) Statistical Methods and Models for Lifetime Data, John Wiley and Sons, New York.
3. Nelson W. (1982) Applied Life Data Analysis, John Wiley and Sons, New York.
4. Padgett W.J. & McNicholls D.T. (1984) Nonparametric Density Estimation from Censored Data, Communications in Statistics - Theory and Methods, 13(13), 1581-1611.
5. Izenman A.J. (1991) Recent Developments in Nonparametric Density Estimation, Journal of the American Statistical Association, 86(413), 205-224.
6. Bohoris G.A. (1994b) Numerical Differences in the Survival Probabilities Obtained by the Cumulative-Hazard and Kaplan-Meier Estimators of the Reliability Function, Quality and Reliability Engineering International, 10, 99-104.
7. Kostagiolas P.A. (2000) The Goodness of Fit Problem with Industrial Reliability Data, Ph.D. thesis, University of Birmingham.
8. Skinner K.R., Keats B.J. & Zimmer W.J. (2001) A comparison of three estimators of the Weibull parameters, Quality and Reliability Engineering International, 17, 249-256.
9. Malla G.B. (2008) Extending the Piecewise Exponential Estimator of the Survival Function, Proceedings of the 4th Annual GRASP Symposium, Wichita State University.
10. Kaplan E.L. & Meier P. (1958) Nonparametric Estimation from Incomplete Observations, Journal of the American Statistical Association, 53, 457-481.
11. Miller R.G. (1981) Survival Analysis, John Wiley and Sons, New York.
12. Kalbfleisch J.D. & Prentice R.L. (2002) The Statistical Analysis of Failure Time Data, Wiley-Interscience, New York.
13. Nelson W. (1969) Hazard Plotting Methods for Incomplete Failure Data, Journal of Quality Technology, 1(1), 27-52.
14. Nelson W. (1979) How to Analyse Data with Simple Plots, American Society for Quality Control, Milwaukee.
15. Kitchin J., Langberg N.A. & Proschan F. (1983) A new method for estimating life distributions from incomplete data, Statistics & Decisions, 1, 241-255.
16. Whittemore A.S. & Keller J.B. (1983) Survival Estimation with Censored Data, Stanford University Technical Report No 69.
17. Mimmack G.M. & Proschan F. (1988) Piecewise geometric estimation of a survival function, in Handbook of Statistics Volume 7: Quality Control and Reliability, edited by Krishnaiah P.R. & Rao C.R., pp. 251-280.
18. Kim J.S. & Proschan F. (1991) Piecewise Exponential Estimator of the Survivor Function, IEEE Transactions on Reliability, 40(2), 134-139.
19. Kulasekera K.B. & White W.H. (1996) Estimation of the Survival Function from Censored Data: A Method Based on Total Time on Test, Communications in Statistics - Simulation, 25(1), 189-200.
20. Goodman M.S., Li Y. & Tiwari R.C. (2006) Survival analysis with change point hazard functions, Harvard University Biostatistics Working Paper Series, paper 40, The Berkeley Electronic Press.
21. Waller L.A. & Turnbull B.W. (1992) Probability Plotting with Censored Data, The American Statistician, 46(1), 5-12.
22. Kleijnen J.P.C. & Groenendaal W. (1992) Simulation: A Statistical Perspective, John Wiley and Sons, Chichester.
23. Press W.H., Flannery B.P., Teukolsky S.A. & Vetterling W.T. (1986) Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, Cambridge.
Acknowledgments The second author would like to express his gratitude to Prof. Brian Haley, University of Birmingham for his support during his Ph.D. studies.
AN ADVANCED MAINTENANCE SYSTEM FOR POLYGENERATION APPLICATIONS
Eduardo Gilabert, Eneko Gorritxategi, Egoitz Conde, Alvaro Garcia, Olatz Areitioaurtena and Amaya Igartua a
a Fundación Tekniker, Avda. Otaola 20, 20600 Eibar, Spain
This paper presents an advanced maintenance procedure and a maintenance management model for polygeneration applications, developed as part of the HEGEL project. A proper maintenance strategy has been designed, identifying the critical components based on RCM methods and afterwards selecting the best on-line condition monitoring indicators. The maintenance model is distributed between a local module, dealing with monitoring and user feedback information, and a centralized management platform nicknamed TESSnet, a system that can be identified as a predictive maintenance management system (PMMS) which performs condition monitoring (CM), diagnosis, prognosis and decision support, based on both condition and reliability data and taking energy efficiency into account. The application has been developed according to a condition based maintenance (CBM) strategy and following the structure proposed by the OSA-CBM architecture (the standardization of CBM in order to facilitate the integration and interoperability between CBM components). Moreover, this platform interacts with a set of web services that range from simple CM protocols to complex diagnosis protocols for certain applications based on Decision Network algorithms.

Key Words: E-maintenance, RCM, CBM, Diagnosis, polygeneration, energy efficiency
1 INTRODUCTION
Nowadays, industry is realising that the efficient use of industrial assets is a key issue in supporting our current standard of living. Companies demand a considerable improvement in system productivity, availability and safety in order to satisfy increasing product quality and customer satisfaction requirements. In this respect, maintenance is a very important area that can provide great improvements in the effectiveness of industrial production systems. To achieve this target, the maintenance concept must undergo several major developments taking proactive considerations into account, moving from a "fail and fix" approach to a "predict and prevent" e-maintenance strategy [1][2]. In the frame of the European project "High efficiency polygeneration applications" (HEGEL), the responsibility of TEKNIKER has been to design the Maintenance Rules for the polygeneration applications of the ICED (Internal Combustion Engine-Desiccant) system. First of all, an FMEA (Failure Mode and Effects Analysis) of the polygeneration applications was made in order to define the most critical components in these systems: a methane gas compressor system for the ICED system was selected. These applications have been studied in depth in order to reconsider the lubricants involved and increase the lifetime of these critical components. Accelerated tests have been designed for each application in order to select the best lubricants to be used, trying to predict their behaviour in use. In this framework, TESSnet is used as a predictive maintenance management system (PMMS) which performs condition monitoring (CM), diagnosis, prognosis and decision support, based on both condition and reliability data. The application has been developed according to a condition based maintenance (CBM) strategy and following the structure proposed by the OSA-CBM architecture (the standardization of CBM in order to facilitate the integration and interoperability between CBM components) [3]. The system avoids the systematic application of corrective and preventive maintenance tasks, doing this in a predictive way, with the main objective of performing a balanced maintenance with operational reliability and energy efficiency awareness.
2 MAINTENANCE STRATEGIES
Tekniker has been working over the last years on the development of a new continuous maintenance improvement ("Kaizen") program using cost-effectiveness techniques to obtain and define the proper maintenance strategy. The structure and steps of the continuous improvement program can be seen in the next figure and are outlined below.
Figure 1. Maintenance cycle

1. Selection of the objectives: The first step should be the establishment of the main objective, and it is one of the most important steps. The approach of the objective could be very different, such as financial, learning or technical. It is important to identify the indicators (KPIs) to evaluate the final results.
2. Identification of the most important products/processes: The next step is the identification of the main objects or processes where the improvements are going to be critical. Criticality tables are used for this identification.
3. Analysis of the selected products/processes: An exhaustive analysis of the selected products/processes is carried out to have a clear idea of the most important aspects of the products/processes. Failure Mode and Effects Analysis (FMEA) can be used to obtain the information.
4. Development of the proper strategy for each critical product/process: Analysis and assessment of different maintenance strategies for the selected critical products/processes. There are many different techniques to implement and analyse these aspects. The main objective is to obtain cost-effectiveness with the improvements.
5. Implementation of actions: One of the final steps is to implement the selected strategy in each case, at least in one control group, in order to finally evaluate the results.
6. Final assessment: The evaluation of whether the initial objectives have been fulfilled or not using the initially defined indicators. If the initial objectives have not been fulfilled, one should return to a previous step to identify the problem. If the objectives have been fulfilled, new objectives should be defined to obtain a continuous improvement program.
During the HEGEL project, the responsibility of Tekniker has been to design the Maintenance Rules for the polygeneration applications of the ICED (Internal Combustion Engine-Desiccant) system. Following the previous maintenance structure, the main objective of the analysis was established: to find the most cost-effective maintenance strategy for the cogeneration system. To minimize the cost of the maintenance system, the most important indicators are security, availability, quality and criticality. The next step was to identify the most critical components in the system, and for that purpose criticality tables were used.
Table 1 Criticality table (each component is rated on Security: yes/no; Availability and Quality: Important / Acceptable / Insignificant; and an overall Criticity value)

Component                                     Criticity
Micro Gas Turbine                             10
Gas fired absorption chiller                  8
Heat exchanger (exhaust / diathermic oil)     8
Heat exchanger (exhaust / hot water)          8
Diathermic oil circuit                        8
Once we had the main objects where the improvements will be the most effective, we made an analysis of these objects. Two of the tools to make this type of analysis are the FMEA (Failure Mode and Effects Analysis) and the FTA (Fault Tree Analysis). An FMEA analysis of the polygeneration applications was made in order to analyse the most critical components in these systems. Xfmea software was used to make the analysis. The development of the most cost-effective strategy showed that web-based vibration condition monitoring was more effective than preventive maintenance strategies. The assurance of energy production through condition monitoring and the prediction of failures, allowing the maintenance actions to be better scheduled, minimize the cost of maintenance. There are savings in the material stock, in the required human resources, in maintenance actions (there are fewer preventive maintenance actions) and in fewer unscheduled maintenance actions. Different authors demonstrate that condition monitoring techniques are the most cost-effective ones [4][5]. The next step was implementing the strategy in the polygeneration plant, where the compressor is controlled via the web. To confirm that what we are doing is correct, we will see if there is any improvement in the maintenance costs before starting the program again.
Figure 2. Compressor axial accelerometer
3 LUBRICANT SELECTION
Tekniker has also been involved in analysing the most critical components from the friction, wear and ageing point of view, according to previous experiences [6]. A methane gas compressor system is used in the ICED system, and these applications have been studied in depth in order to reconsider the lubricants involved and increase the lifetime of these critical components. Accelerated tests have been designed for each application in order to select the best lubricants to be used, trying to predict their behaviour in use. The actual lubricants of the gas reciprocating engine are typical mineral oils. Two viscosity grades, 68 and 100 cSt, with three different formulations (A, C and R) have been considered. The behaviour of the lubricants has been compared by carrying out different physico-chemical tests:

- Oxidation temperature resistance, by means of Differential Scanning Calorimetry (DSC). These tests were carried out performing a dynamic curve from 50ºC to 500ºC with a 10ºC/min heating rate, under 20 MPa of pressure in an O2 atmosphere.
- Resistance of the lubricant at fixed temperature (DSC). In these tests, the isothermal curve at 195ºC was measured, under 20 MPa of pressure in an O2 atmosphere.
- Demulsibility, measured according to the standard ASTM D-1401-02 at 54ºC.

Reference        Viscosity 40ºC (cSt)   DSC (TºC resistance)   DSC time resistance (min)   Demulsibility
Lubricant C68    68                     252                    15                          40.40.0 (10)
Lubricant C100   100                    222                    17                          40.40.0 (10)
Lubricant A100   100                    177                    *                           40.37.3 (15)
Lubricant R100   100                    248                    8                           40.40.0 (20)
* The isothermal curve at 195ºC could not be obtained.

The samples referenced as C100, R100 and C68 present good resistance to temperature and good demulsibility. Additionally, different friction and wear tests have been performed:
- Four Ball Extreme Pressure tests following the standard ASTM D 2783
- Four Ball Antiwear tests following the standard ASTM D 4172 at 75ºC, 1200 rpm, 40 kg, 60 minutes
- Abrasion resistance tests, using three-ball-on-disc tests with silicon nitride balls and a 100Cr6 steel disc at 45 lb load, 1000 rpm speed (2.67 m/s), 2.57 GPa pressure, and 60 minutes duration
Reference        Four Ball EP Load (N)   Four Ball Antiwear Wear Scar (μm)   Abrasion: Friction Coefficient   Abrasion: Wear ball/disc (μm)   Abrasion: Increase of TºC
Lubricant C68    100                     570                                 0.047                            505/543                         33
Lubricant C100   100                     552                                 0.050                            605/572                         38
Lubricant A100   80                      667                                 0.055                            524/504                         44
Lubricant R100   100                     656                                 0.051                            678/722                         38
According to the test results, the lubricant of reference C68 was selected for incorporation in the compressor, due to its good temperature/time resistance relation and good abrasion resistance, generating less friction in the contact.

4 DIAGNOSIS MODEL
A specific diagnosis model was designed for the ICED application in order to automate the health assessment process and also to assess the energy efficiency of the cogeneration system. For this purpose, the FMEA performed previously was the basis of the diagnosis system, from which a Bayesian Network (BN) was developed. An FMEA is used to analyse the problems and failures that may occur in a machine or unit, as well as the effects of these problems and the criticality of these effects. The analysis is made using the BOTTOM-UP methodology, starting with the basic
elements (clogs, pumps…), following with the modules including the basic elements, and so on until the whole system is covered and analysed. Moreover, an FMEA includes a summary report of all the functions, possible failures and causes. These lists come in handy as each point of these reports will become a node of the Bayesian network (BN). A BN is a model [7][8] which reflects the states of some parts of a world that is being modelled. It describes how those states are related through conditional probabilities. All the possible states of the model represent all the possible worlds that can exist, that is, all the possible ways that the parts or states can be configured. The representation is a directed acyclic graph (DAG) consisting of nodes, which correspond to random variables, and arcs, which in turn correspond to probabilistic dependencies between the variables. A conditional probability distribution is associated with each node and describes the dependency between the node and its parents. The way of building the net is: take one unit from the function report (each unit has one function only). One or more failures could cause the unit to operate incorrectly, and one or more causes can lead to a failure. One cause may lead to different failures, even in different units. Moreover, the malfunction of one unit may be the cause of failure in another unit. If we make a diagram following these steps we arrive at something like Figure 3, which happens to be a Bayesian network.
Figure 3. Cause-Failure-Unit Bayesian net

All the nodes of the net will be Boolean (units work or not, causes happen or not and failures occur or not). When assigning the probabilities of the nodes, the FMEA's ORS (Occurrence Rating Scale) will be used for "leaf" nodes (the ones without parents). "Branch" and "root" node probabilities will be an OR-gate of their parent nodes (if one or more of the parents are true, the node is true). The FMEA has to be read in detail, as it can contain a huge number of causes, some of which may be repeated. In the same way, the FMEA distinguishes between broken and failing elements. To simplify things, the cause nodes will cover both cases (broken/failing elements). A less restrictive perspective can be taken when building the net. Instead of using the units as causes we can use the failures of the units. This can be useful when the same type of failure occurs in an upper-level unit. For example: a cogenerator producing both electric and thermal energy has a generation module (among other units). Both the cogenerator and the generation module have the same two types of failure (not producing electricity and not producing heat). Heat and electricity generation are independent within the unit. A malfunction in thermal energy production in the generation module should not affect the production of electricity in the whole cogenerator.
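The OR-gate construction described above can be illustrated with a minimal sketch (plain Python rather than the Hugin models actually used in the project; the cause names and probabilities are hypothetical stand-ins for FMEA occurrence ratings):

```python
# Sketch: deterministic OR-gate propagation of FMEA occurrence probabilities
# through a Cause -> Failure -> Unit structure. With independent Boolean causes,
# a node is 'true' (failed) if any parent is true, so P(node) = 1 - prod(1 - P(parent)).
def or_gate(parent_probs):
    p_all_false = 1.0
    for p in parent_probs:
        p_all_false *= (1.0 - p)
    return 1.0 - p_all_false

# Hypothetical leaf probabilities standing in for an FMEA occurrence rating scale
causes = {"clogged_filter": 0.05, "pump_wear": 0.02, "sensor_drift": 0.01}

p_failure_no_flow = or_gate([causes["clogged_filter"], causes["pump_wear"]])
p_failure_bad_measurement = or_gate([causes["sensor_drift"]])
p_unit_down = or_gate([p_failure_no_flow, p_failure_bad_measurement])

print(f"P(no flow) = {p_failure_no_flow:.4f}, P(unit down) = {p_unit_down:.4f}")
```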
Figure 4. Cause-Failure-Unit Bayesian net (2nd version)

Following this methodology, a BN for the cogeneration system was developed using the Hugin Research tool [9]. This software has a graphical interface on the Windows operating system so that Bayesian networks can be designed, and it is possible to see probability propagation when node instances are set. The BN has been included in the TESSnet platform, since the Hugin software provides an API so users can integrate their BNs in their own applications.
5 TESSNET PLATFORM
TESSnet is a Predictive Maintenance Management System (PMMS), able to perform different types of tasks: condition monitoring, health assessment, prognostics assessment and decision support. These tasks are carried out based on condition data (measurements related to vibration, oil, temperature, pressure), reliability data and energy efficiency indicators. This tool is web-based, collaborative and a central management system, with user access control. The platform stores measurements both from on-line and off-line sensors as well as laboratory analysis results. They are stored using a hierarchy of components: Company, Plant, Machine, Assembly, Sensor and Measurement. The development of this intelligent system is based on previously developed systems [10][11][12], and during the HEGEL project new functionalities were added and adapted for polygeneration applications. TESSnet also uses DynaWeb web services [13]. The condition monitoring function receives data from the sensor modules, the signal processing modules and other condition monitors. Its primary focus is to compare data with expected values, and it is able to generate alerts based on preset operational limits. The diagnosis or health assessment receives data from different condition monitors or from health assessment modules. The primary focus of the health assessment web service is to determine whether the health of the monitored component, sub-system or system has degraded. The health assessment layer should be able to generate diagnosis records and propose fault possibilities, and for this purpose the Bayesian network developed from the FMEA has been integrated on the platform. The diagnosis should be based upon trends in the health history, operational status, loading and maintenance history. Another important feature that TESSnet provides is the prognostics assessment. The prognosis projects the health state of the equipment into the future. TESSnet is able to estimate the remaining useful life (RUL) from sensor measurements, based on linear and exponential regression of the measured values towards the corresponding alarm limit, taking their evolution in time into account. Furthermore, this information is very useful to set the right maintenance order list when scheduling tasks in case different applications have to be monitored. The algorithm for the task arrangement takes into account the RUL of the machines, scheduling earlier in time the maintenance tasks related to the machine with the smaller RUL.
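As an illustration of the linear-regression variant of this RUL estimate, the following sketch (not TESSnet code; the measurement series and alarm limit are hypothetical) extrapolates a fitted degradation trend to a preset alarm limit:

```python
# Sketch: estimate remaining useful life (RUL) by fitting a linear trend to a
# monitored condition indicator and extrapolating it to a preset alarm limit.
import numpy as np

t = np.array([0.0, 10.0, 20.0, 30.0, 40.0])     # hours since monitoring started
vib = np.array([2.1, 2.4, 2.9, 3.1, 3.6])       # hypothetical vibration level (mm/s)
alarm_limit = 7.0                                # hypothetical alarm threshold

slope, intercept = np.polyfit(t, vib, 1)         # linear degradation trend
if slope <= 0:
    print("No degrading trend detected; RUL not estimated.")
else:
    t_alarm = (alarm_limit - intercept) / slope  # time at which the trend crosses the limit
    rul = t_alarm - t[-1]
    print(f"Estimated RUL: {rul:.1f} hours")
```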
Figure 5. TESSnet platform

This new type of maintenance aims to achieve maximum resource optimization and operational availability, minimizing cost with intelligent "decision support" based on "operational support" and "energy awareness".
6 CONCLUSIONS
This paper has presented part of the work carried out during the European project HEGEL, where the main objective is to define the most cost-effective maintenance strategy in relation to the energy efficiency of the application. A proper maintenance strategy has been designed, identifying the critical components based on RCM methods and afterwards selecting the best on-line condition monitoring indicators. The maintenance model is distributed between a local module, dealing with monitoring and user feedback information, and the TESSnet platform, a predictive maintenance management system able to perform different types of tasks: condition monitoring, health assessment, prognostics assessment and decision support. The system avoids the systematic application of corrective and preventive maintenance tasks, doing this in a predictive way, with the main objective of performing a balanced maintenance with operational reliability and energy efficiency awareness. Bayesian networks are a useful technology for developing diagnostic systems, and a fault diagnosis model based on FMEA information has been developed. One of the main challenges is to provide new types of maintenance strategies based on prediction and prognosis that, combined with accelerated tests, replace the preventive maintenance currently carried out in industry.
7 REFERENCES
1. Al-Najjar B & Alsyouf I. (2003) Selecting the most efficient maintenance approach using fuzzy multiple criteria decision making, International Journal of Production Economics, 84, 85-100.
2. Crespo Márquez A & Gupta JND. (2006) Contemporary Maintenance Management: Process, Framework and Supporting Pillars, Omega, 34(3), 325-338.
3. Bengtsson M. (2003) Standardization issues in condition based maintenance, Department of Innovation, Design and Product Development, Mälardalen University, Sweden.
4. Jardine AKS, Joseph T & Banjevic D. (1999) Optimizing condition-based maintenance decisions for equipment subject to vibration monitoring, Journal of Quality in Maintenance Engineering, 5(3), 192-202.
5. Al-Najjar B. (1999) Economic criteria to select a cost-effective maintenance policy, Journal of Quality in Maintenance Engineering, 5(3), 236-247.
6. Igartua A, Barriga J & Aranzabe A. (2005) Biodegradable lubricants, ISBN 83-7204-449-X, Vol 1, pag 1-VI.3, edited in Poland by the Virtual Tribological Institute.
7. Diez FJ. (2000) Introduction to Approximate Reasoning, UNED, Madrid.
8. Diez FJ. (2000) Probabilidad y teoría de la decisión de medicina, UNED, Madrid.
9. Andersen SK, Olesen KG, Jensen FV & Jensen F. (1989) Hugin – a shell for building Bayesian belief universes for expert systems, Proceedings of the 11th International Joint Conference on Artificial Intelligence, pp. 1080-1085.
10. Arnaiz A. (2006) Análisis de datos de fiabilidad para predicción de fallos en mantenimiento, Jornada mantenimiento y Fiabilidad en energía, Madrid.
11. Gilabert E & Arnaiz A. (2006) Intelligent automation systems for predictive maintenance: A case study, Robotics and Computer Integrated Manufacturing (RCIM), 22(5-6), 543-549 (ISSN 0736-5845).
12. Arnaiz A, Levrat E, Mascolo J & Gorritxategi E. (2007) Scenarios for development and demonstration of dynamic maintenance strategies, ESReDA European Safety, Reliability & Data Analysis 32nd seminar, Sardinia.
13. Arnaiz A, Iung B, Jantunen E, Levrat E & Gilabert E. (2007) DYNAWeb: A web platform for flexible provision of e-maintenance services, Harrogate.
ACKNOWLEDGEMENTS The authors gratefully acknowledge the support of the European Commission Sixth Framework programme for Research and Technological Development. This paper summarises work performed as part of FP6 STREP Project 20153, HEGEL "High efficiency polygeneration applications".
FINITE SAMPLE BEHAVIOR OF THE HOLLANDER-PROSCHAN GOODNESS OF FIT WITH RELIABILITY AND MAINTENANCE DATA
Kostagiolas P.A. a & Bohoris G.A. b
a Department of Archive and Library Science, Ionian University, Ioannou Theotokis 72, GR-491 00 Corfu, Greece, Tel: +30 26610 87402 & Email: [email protected]
b Department of Business Administration, University of Piraeus, Karaoli & Dimitriou 80, GR-185 34 Piraeus, Greece, Tel: +30 210 4142253 & Email: [email protected]
In everyday maintenance and reliability practice the data samples which are often made available are multiply censored, i.e. the times to failure are randomly mixed with incomplete lifetimes. This fact adds complexity when sound decisions are required to be made in identifying failure mechanisms, evaluating maintenance practices and/or manufacturing methods. Hollander & Proschan have derived a Goodness of Fit (GOF) test statistic which has a number of advantages, i.e. it can accommodate multiply censored data, it can be employed irrespective of the failure and censoring distributions, and finally it is an omnibus and straightforward method to compute. Although the test statistic has received attention in the literature of reliability and maintenance applications over the last two decades, only a limited investigation has been carried out in terms of its finite sample behavior. This paper investigates the usefulness of this particular GOF method within the reliability and maintenance context and provides a literature review of alternative approaches. Furthermore, the finite sample properties are investigated through extensive Monte Carlo Simulations, with the Weibull distribution when parameters are estimated from the data and a wide range of censoring percentages.

Key Words: reliability management, non-parametric methods, goodness of fit, finite sample behavior, simulation
1 INTRODUCTION
Goodness of fit (GOF) test statistics are regarded as invaluable "tools" in the everyday activities of reliability engineers and practitioners. In maintenance and reliability practice the data samples which are commonly made available are multiply censored [Newton in the discussion of 1]. Sound decisions require further investigation of the adequacy of the model in representing the censored data. In an aggregated sense, within the reliability and maintenance field there appears to be an absence of appropriate and serviceable formal GOF statistical methods. The aftereffects of this shortcoming may be further appreciated by considering the significant amount of management and engineering effort which is associated with identifying failure mechanisms, evaluating maintenance practices and/or manufacturing methods [2]. In an effort to circumvent the above mentioned difficulties, reliability practitioners rely on graphical means for the assessment of distributional assumptions with censored reliability data [3]. Hollander & Proschan [4] derived an analogue of Efron's [5] two sample test statistic for GOF purposes. The test received attention and was recommended as a standard GOF procedure to be employed with multiply censored data and a simple GOF hypothesis [6 & 7]. It is quite clear that the method conforms to the requirements imposed by the practicability of a GOF procedure to be employed within reliability analysis:
• It is applicable irrespective of the distribution of the censoring and, therefore, does not require an explicit assumption about the underlying censoring mechanism involved in the generation of the survival data;
• It may accommodate all types of censored data;
• It is omnibus and simple to compute.
This paper investigates the behavior of the Hollander & Proschan (HP) GOF test within the reliability and maintenance context. In the next section the notation employed is introduced together with the Kaplan-Meier (KM) non-parametric reliability estimator. Thereafter, alternative GOF methods to the HP test statistic are presented and commented upon, and the computational details of the HP GOF procedure are exhibited. Finally, the finite sample behavior in terms of size and power of the HP GOF test statistic is examined through Monte Carlo Simulations (MCS).
2 BACKGROUND INFORMATION
Prior to proceeding any further it is useful to define the notation used. Let us assume that the available multiply censored data set is a random sample of the studied population, consisting of N independent and identically distributed (iid) lifetimes. Let nf be the number of distinct times to failure, denoted as $T_1, T_2, \dots, T_{nf}$. The censored data sample may be re-ordered by magnitude and written as,

$$0 < t_{01} \le \dots \le t_{0e_0} < T_1 < t_{11} \le \dots \le t_{1e_1} < T_2 < \dots < T_j < t_{j1} \le \dots \le t_{je_j} < \dots < T_{nf} < t_{nf1} \le \dots \le t_{nfe_{nf}}$$

where $T_0 = 0$ and $T_{nf+1} = \infty$. Strict inequality is assumed between complete and incomplete observations. Furthermore, let $d_j$ be the number of failures occurring at time $T_j$ (j=1,…,nf), with $d_0 = 0$, i.e. there are no failures at time zero. Denote by $e_j$ the number of right censored observations, i.e. $t_{j1}, \dots, t_{je_j}$, that fall in the interval $[T_j, T_{j+1})$ with j = 0,…,nf. Let $n_j$ be the number of items "at risk of failure", constituted from items with lifetimes greater than or equal to $T_j$ (j=1,…,nf),

$$n_j = \sum_{l=j}^{nf} (d_l + e_l) \;\Rightarrow\; n_0 = N \qquad (1)$$
A quite popular non-parametric estimator of the survival function is the KM estimator, often referred to as the Product-Limit (PL) estimator, introduced by Kaplan and Meier [8]. The KM density estimation method has played a central role in the analysis of clinical data and has been the standard procedure for estimating the survival function in many biomedical and statistical computer programs. Furthermore, the KM is a key quantity in several more complicated survival analysis models like the Proportional Hazards, GOF and two-sample tests. This is partly due to the fact that the KM estimate reduces to the Empirical Distribution Function (EDF) when used with complete (uncensored) data samples. The estimator itself is defined through the product:

$$\hat{R}_{KM}(T_j) = \prod_{l=1}^{j} \frac{n_l - d_l}{n_l} \qquad (2)$$
KM is a step function and is the non-parametric maximum likelihood estimator of the reliability function, while its properties have been studied by a number of people [9 & 10]. The true cumulative distribution function (cdf) of the survival data set is denoted by F (= 1 - R), where R is the corresponding reliability function. Let us assume that information about the population is available, implying that a specific parametric lifetime model, $F_o(\cdot, \theta)$, where θ is a λ-dimensional set of completely specified distributional parameters, represents the available survival data. The simple GOF hypothesis may now be stated as follows:

Simple Null Hypothesis: $R_o(\cdot, \theta) = \hat{R}$, against $R_o(\cdot, \theta) \ne \hat{R}$,

where $\hat{R}$ is the KM estimator and $R_o = (1 - F_o)$ is the parametric survival function hypothesized for the censored data sample.
3 A REVIEW OF ALTERNATIVES TO HP GOF TEST STATISTICS
GOF test statistics for grouped data have been extended to include multiply censored data and assess simple or composite hypotheses [11, 12, 13, 14, 15, 16, 17, 18, 9, 19, 20]. Within reliability and maintenance applications, however, such a categorization of data is often inconvenient, reduces the information extracted from the survival data, and the power of the test statistics is generally lower than that of alternative GOF methods. That is, by using a chi-square type test statistic regularly, one is more likely not to reject the null hypothesis when it is in fact false. More recently, Johnson [21] proposed a Bayesian chi-squared test statistic by generalizing the classical Pearson chi-squared GOF test, while a similar approach has been proposed by Yin [22] for right censored survival data. On the other hand, test statistics based on the empirical distribution function (EDF) are useful procedures and have enjoyed wide acceptance in industrial applications of survival analysis. Extensive research produced a complete set of tables with
critical values to be used with complete or singly censored data and important life distributions [10, 23, 24]. A GOF investigation, however, becomes substantially more complicated when the problem is addressed with multiply censored data, and the available literature comprises only a few papers. The presence of multiply censored observations has the effect of invalidating the sample distribution of the tests with complete or singly censored data. Establishment of suitable critical values for existing test statistics with multiply censored data is the main difficulty that research in the area faces [25]. Koziol and Green [26] proposed a quadratic Cramér-von Mises (CvM) procedure for testing a simple GOF hypothesis. The test was based on the KM estimator and reduced to the usual CvM test if employed with complete data. In order, however, to study the asymptotic distribution of the test statistic and to provide a mechanism for sampling with multiply censored data, a significant assumption on the random censoring model was made. Namely, the authors assumed that the hypothesized cdf for the failures ($F_o$) is related to the cdf of the censoring, $G_o$, via the following expression: $1 - G_o = (1 - F_o)^c$, where c is a nonnegative constant governing the amount of censoring. This approach, initiated by [26], is referred to as the Koziol-Green (K-G) model for random censoring. Csorgo & Horvath [27] revisited the modified CvM test by [26] and further proposed a modified Kolmogorov-Smirnov (KS) test statistic for a simple GOF hypothesis. The authors established that the censoring parameter of the K-G model (c) could be estimated from the data without affecting the distribution of the GOF procedures. Although the work carried out by [26 & 27] is of significant theoretical interest, the Koziol and Green assumption is difficult to establish with reliability and maintenance data. Other approaches include those by Pena [28], Chen et al. [29] and Ren [30]. Koziol [31] introduced modifications of the KS and the CvM GOF tests for a simple hypothesis with multiply censored data. The modified statistics were based on the KM estimator and were derived as functionals of the difference between the estimator and the hypothesized cdf weighted by the square root of the sample size. The author suggested that the asymptotic distributions already derived for the singly censored (or the truncated) "version" of the test statistics [32, 33, 34] could also be employed when assessing a GOF hypothesis with multiply censored data. The approach by Koziol may be computationally convenient; however, the proposed asymptotic distribution is an additional drawback since it introduces extra variability to the method. Fleming et al. [35] and Fleming and Harrington [36] modified the supremum KS procedure, proposing a different formulation for the test statistic. Through this reformulation a new inference procedure for the two sample problem as well as a new GOF test, both to be used with multiply censored data, were derived. The GOF procedure was based on a modification introduced to the reliability estimator by [37] in order to estimate survival probabilities from both the failure and the censoring underlying mechanisms in the censored data. The authors mostly concentrated on the two sample problem, providing examples and investigating the modified two sample procedure through a simulation study.
Gastaldi [38, 39, 40] introduced an interesting framework which allowed the derivation of a supremum KS two sample procedure and a GOF test statistic for a simple null hypothesis with multiply censored data. The author assumed that the subset of failures within the survival data is what remains after censoring of a complete random sample consisting of both censorings and failures. The null hypothesis was thereafter constructed to test distributional assumptions made for the complete (yet unknown) sample, given that the subset of failures has also been drawn from the hypothesized distribution. Although the distribution of the KS test is unknown in the presence of random censoring, the author provides expressions for lower and upper bounds aiding in drawing inferences for the GOF test statistic.
4 THE HOLLANDER-PROSCHAN TEST STATISTIC
Hollander & Proschan [4] derived a GOF test statistic, denoted by HP, which is based on the one sample analogue of Efron's statistic,

$$C = \int R_o \, d\hat{R} \qquad (3)$$

where $\hat{R}$ is the KM estimate of the true survival function and $R_o$ is a completely specified distributional model which is hypothesized for the data. The HP test is applicable with complete, singly censored and multiply censored data. GOF is judged by the HP test upon the accumulation of survival probabilities and not through direct (or weighted) comparisons of parametric to non-parametric KM estimates as, for example, in the EDF type test statistics. The computational algorithm for the calculation of the HP GOF test statistic is summarized below:

Step 1. Compute the KM estimate ($\hat{R}$) of the survival function for the data set (Eq. 2). Define the quantity $\Delta\hat{R}(T_j)$ to be the observed jumps of the KM estimate computed at the distinct failures in the censored data ($T_0 = 0$ and $\hat{R}(0) = 1$),

$$\Delta\hat{R}(T_j) = \hat{R}(T_{j-1}) - \hat{R}(T_j), \quad \text{for } j = 1,\dots,nf \qquad (4)$$
Furthermore, calculate the jump of the reliability estimator at the last observation in the data ($t_{nfe_{nf}}$) by,

$$\Delta\hat{R}(t_{nfe_{nf}}) = \begin{cases} 0 & \text{if it is a failure, i.e. } e_{nf} = 0 \\ \hat{R}(T_{nf}) & \text{if it is a censoring, i.e. } e_{nf} > 0 \end{cases} \qquad (5)$$

Step 2. Compute the value of Eq. 3 over the observed distinct failure times and the last observation in the data by,

$$C = \sum_{j=1}^{nf} R_o(T_j)\,\Delta\hat{R}(T_j) + R_o(t_{nfe_{nf}})\,\Delta\hat{R}(t_{nfe_{nf}}) \qquad (6)$$
Step 3. The value of the GOF test statistic is obtained through the expression,

$$HP = \frac{\sqrt{N}\,(C - 1/2)}{\hat{\sigma}} \qquad (7)$$

where $\hat{\sigma}$ is an estimate of the standard deviation of C which under the null hypothesis is given by [4],

$$\hat{\sigma} = \sqrt{\frac{1}{16}\sum_{l=1}^{N}\frac{N}{N - l + 1}\left\{[R_o(t_{l-1})]^4 - [R_o(t_l)]^4\right\}} \qquad (8)$$
with $t_0 = 0$ and $R_o(t_0) = 1$.

Step 4. In the uncommon event, within the reliability and maintenance field, that a one-sided hypothesis is of interest, reject the null hypothesis in favor of $H_1: \hat{R} > R_o$ if $HP < -Z_\alpha$, or in favor of $H_1: \hat{R} < R_o$ if $HP > Z_\alpha$. In the usual two-sided case, reject the null hypothesis in favor of the general alternative if $HP > Z_{\alpha/2}$ or $HP < -Z_{\alpha/2}$, where α is the required level of significance and $Z_\alpha$ is the upper α percentile point of the standard normal distribution.

In Hollander & Proschan a limited simulation study was presented: simple Exponential and Normal GOF null hypotheses were considered with small sample sizes N = 20 & 50 and percentages of censoring 33% and 50% through the K-G model of censored data. The estimated type I errors confirmed the appropriateness of the asymptotic normal distribution. Moreover, the simulation results suggested that the HP GOF test compares favorably in terms of estimated power with the EDF test of [26] for scale and location alternatives to an Exponential and a Normal distribution, respectively.
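A minimal sketch of Steps 1-4 is given below. It is not the authors' implementation; it assumes the reconstructed forms of Eqs. 7 and 8 above, and the sample data and the hypothesized Weibull null model are hypothetical:

```python
# Sketch: Hollander-Proschan GOF statistic for a multiply censored sample,
# following Steps 1-4 (KM jumps, C of Eq. 6, sigma of Eq. 8, HP of Eq. 7).
import math

def hp_statistic(times, failed, R_o):
    """times/failed: lifetimes and failure indicators; R_o: hypothesized survival."""
    data = sorted(zip(times, failed))
    n = len(data)
    # Step 1 & 2: KM jumps at the distinct failure times and accumulation of C
    R_prev, C = 1.0, 0.0
    for i, (t, f) in enumerate(data):
        if f:
            R_curr = R_prev * (n - i - 1) / (n - i)   # KM step, at-risk set of size n - i
            C += R_o(t) * (R_prev - R_curr)            # Eq. 6 accumulation
            R_prev = R_curr
    t_last, f_last = data[-1]
    if not f_last:                                     # jump at a censored last observation (Eq. 5)
        C += R_o(t_last) * R_prev
    # Step 3: standard deviation estimate (Eq. 8) and the HP statistic (Eq. 7)
    s2, prev_t = 0.0, 0.0
    for l, (t, _) in enumerate(data, start=1):
        s2 += (n / (n - l + 1)) * (R_o(prev_t) ** 4 - R_o(t) ** 4) / 16.0
        prev_t = t
    return math.sqrt(n) * (C - 0.5) / math.sqrt(s2)

# Hypothetical data and a hypothesized Weibull(shape 1.5, scale 300) null model
times = [69, 176, 195, 208, 215, 233, 289, 300, 384, 390]
failed = [1, 1, 0, 1, 1, 1, 1, 1, 1, 0]
R_null = lambda t: math.exp(-(t / 300.0) ** 1.5)
print(f"HP = {hp_statistic(times, failed, R_null):.3f}")  # |HP| > 1.96 rejects at alpha = 0.05
```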
5 AN INVESTIGATION OF THE SIZE AND THE POWER OF THE TEST
An investigation of the size assesses the behavior of the statistic in regard to the probability of incorrectly rejecting the null hypothesis when it is true. This probability should be approximately α for a test of size α, i.e. at the α level of significance. A power investigation assesses the behavior of the test statistic in regard to the probability of correctly rejecting the null hypothesis when it is in fact false. Studies of this nature depend heavily on the nature of the hypothesis to be tested, the statistic under examination and the alternative distributions considered. The size and the power of the HP GOF test statistic are examined through simulation. The characteristics of the simulated multiply censored data samples are throughout determined by two sampling distributions representing the parent (failure and censoring) underlying mechanisms. In every censored sample generated in the simulation experiments the failure and censoring distributions were always Weibull and Exponential, respectively. The Weibull distribution plays a central role in the analysis of reliability and maintenance survival data [25]. The reason for the selection of
the latter relates to the requirement that, for the generation of multiply censored data, the underlying censoring mechanism ought to be random [41, 42, 43]. The source of standard Uniform random deviates was the routine RAN1 from Press et al. [44], initialized before each simulation experiment with a different random integer seed, while the inversion method was employed for the generation of random deviates from the Weibull and the Exponential distributions [45]. The percentage of censoring in the data can be varied as required by keeping the Weibull shape parameter (β) fixed at a selected constant value and appropriately choosing combinations of the distribution means. The number of repetitions in the simulation experiments was fixed at l = 20,000.

Table 1: Estimated Type I errors for the HP GOF test statistic at an asymptotic nominal level of 0.05, as a function of the sample size, percentage of censoring and Weibull shape parameter [the complete tabulated simulation results have been presented in 46 and can be made available from the authors]. The table reports the estimated size of the HP, MHP and KHP type GOF tests (α = 0.05) for Weibull shape parameters β = 1.0, 2.0 and 3.0, sample sizes N = 20, 30, 40, 50 and 100, and censoring percentages of 40%, 50%, 60% and 70%; the estimated sizes range from approximately 0.03 to 0.12.
In regard to the distributions introduced as alternatives, these are chosen to be families of models commonly employed in the estimation of the power of GOF test statistics with a simple null Weibull hypothesis, e.g. [47, 24]. More specifically, for tests concerned with a simple null hypothesis, two families of alternative distributions of particular interest in survival analysis are selected:

i. Shape alternatives to the Weibull distribution, with cdf

W_\phi(\phi\beta, \eta) = 1 - \exp\left[-\left(\frac{t}{\eta}\right)^{\phi\beta}\right], \quad \text{with } \phi = 0.20, 0.35\ \&\ 0.60 \qquad (9)

ii. Scale alternatives to the Weibull distribution, with cdf

W_\omega(\beta, \omega\eta) = 1 - \exp\left[-\left(\frac{t}{\omega\eta}\right)^{\beta}\right], \quad \text{with } \omega = 0.40, 0.60\ \&\ 0.80 \qquad (10)
The cdfs of the shape and scale alternatives to the Weibull distribution (Eq. 9 & 10, respectively) suggest that the closer the values of the parameters φ and ω are to zero, the larger the introduced disagreements. The size and the power of the test statistic were studied as a function of sample size, percentage of censoring, shape parameter of the null Weibull distribution Wo(β, η), and levels of significance α = 0.01, 0.05 and 0.10. However, only the simulation outcome relevant to α = 0.05 is presented here, since the HP GOF test statistic behaved similarly for α = 0.01 and 0.10. In the simulation experiments, combinations of the following basic sample characteristics were considered (a sketch of the sample generation follows the list):

Sample size: 20, 30, 40, 50, 100, 150, 200
Average percentage of censoring: 10% to 70%, in steps of 10%
Weibull shape, β: 1.0 to 5.0, in steps of 1.0
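A minimal sketch of how one such multiply censored sample can be generated by the inversion method, assuming a Weibull failure distribution and an Exponential censoring distribution as described above; the function name and the use of NumPy's generator are illustrative (the study itself used the RAN1 routine of [44]).

```python
import numpy as np

def simulate_multiply_censored(n, beta, eta, censor_mean, seed=None):
    """Generate one multiply censored sample of size n.

    Failure times   ~ Weibull(shape=beta, scale=eta), via inversion.
    Censoring times ~ Exponential(mean=censor_mean), via inversion.
    Each observation is the smaller of the two; `censored` flags the censorings.
    """
    rng = np.random.default_rng(seed)
    u1, u2 = rng.random(n), rng.random(n)
    failure = eta * (-np.log(1.0 - u1)) ** (1.0 / beta)   # inverse Weibull cdf
    censor = -censor_mean * np.log(1.0 - u2)              # inverse Exponential cdf
    times = np.minimum(failure, censor)
    censored = censor < failure
    return times, censored

# Example: one sample of N = 50 with beta = 2 and eta = 1; censor_mean controls the
# expected proportion of censored observations in the sample.
times, censored = simulate_multiply_censored(50, beta=2.0, eta=1.0, censor_mean=1.2, seed=1)
```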
For each sample generated with a combination of the desired properties, the true Weibull distribution, Wo(β, η), and each member of the two families of alternative distributions, Wφ(φβ, η) and Wω(β, ωη), were hypothesized in turn to be representative and submitted to the HP GOF test statistic. The proportion of samples rejected as not conforming to the null hypothesis defined through the true Weibull distribution represents an approximation to the size of the test, while the proportion of samples for which the alternative models were rejected represents the power of the test statistic. Table 1 has been constructed to present an overview of the simulation outcome for the size of the HP GOF test statistic and, therefore, to provide a more precise statement about the appropriateness of the asymptotic sampling distribution of the procedure when employed with finite sample sizes. Namely, estimated Type I errors at an asymptotic nominal level of α = 0.05 are tabulated for a wide selection of testing situations. The estimated sizes for the HP GOF test (Table 1) are close to the asymptotic nominal levels for sample sizes as small as N = 20 and for high percentages of censoring. Overall, the asymptotic distribution of the HP test seems appropriate even with small sample sizes.

Figure 1: Estimated power results, α = 0.05, as a function of the sample size for the HP GOF test statistic with a constant (β = 1) and increasing (β = 2) Weibull hazard function [46].
The general expected trends for the power of the HP GOF test statistic are exhibited: power increases with sample size and decreases as the percentage of censoring increases. A compilation of the estimated powers for the HP test as a function of the sample size with α = 0.05 is illustrated in separate graphs within Figure 1. The simulation outcome is in conformity with the findings of [4] for both size and power of the HP GOF test. The plethora of additional cases considered in our simulations, however, reveals the behavior, in terms of power, of the HP procedure under a variety of testing situations [46]:

• An increased amount of incomplete observations in the data reduces the ability of the GOF procedure to detect the alternative models. The "source of information" for the GOF test is the available complete observations. By increasing the sample size the HP statistic "receives more information", in the sense that the non-parametric constituent of the test (the KM estimator) approximates the underlying survival function more closely and, thus, the discriminatory ability increases. Excellent estimated powers are observed for large sample sizes even with high percentages of censoring.

• The HP GOF test appears to be more powerful in detecting scale alternatives, Wω(β, ωη), than shape alternatives, Wφ(φβ, η), to the null Weibull distribution. In fact, scale disagreements (ω = 0.4, 0.6 & 0.8) were detected successfully even with small sample sizes and a rather large amount of censoring in the data (Figure 1). The estimated power of the test against scale alternatives to the Weibull model was found to tend quite rapidly, as a function of sample size, to its asymptotic value. On the other hand, estimated powers for the shape disagreements (φ = 0.2, 0.35 & 0.6) were not as high. The test exhibited little power with small sample sizes (N = 20 & 30) and moderate discriminating ability with medium sample sizes (N = 40 & 50), while the estimated powers approached their limiting value for φ = 0.2 & 0.35 with large sample sizes.

• The estimated powers tend to increase as a function of the Weibull shape parameter, with the larger differences observed between the corresponding estimated powers for β = 1 and β = 2 (Figure 1).

The presence of random incomplete observations in the data complicates the analysis and, in fact, the GOF investigation. However, the simulation outcome suggests that the HP statistic is a reasonably powerful GOF test against shape and scale alternatives to the Weibull distribution.

Table 2: Computations involved in the non-parametric components of the HP GOF test statistic [46]
Event Time   Basic Sample Quantities   KM Estimator              Steps 2 & 3 of the HP algorithm (Eq. 5, 6 & 8)
ti*          nj    dj    ej            R̂KM(Tj)    ΔR̂(Tj)        Ro(Tj)      [Ro(t(l-1))]^4 − [Ro(tl)]^4
69           21    1     0             0.9524     0.0476         0.96915     0.11779
176          20    1     1             0.9048     0.0476         0.87531     0.29519
195+         –     –     –             –          –              0.85553     0.05130
208          18    1     0             0.8545     0.0503         0.84164     0.03394
215          17    1     0             0.8042     0.0503         0.83406     0.01784
233          16    1     0             0.7540     0.0503         0.81428     0.04431
289          15    1     0             0.7037     0.0503         0.75081     0.12185
300          14    1     0             0.6534     0.0503         0.73814     0.02093
384          13    1     0             0.6032     0.0503         0.64105     0.12798
390          12    1     1             0.5529     0.0503         0.63417     0.00713
393+         –     –     –             –          –              0.63074     0.00347
441          11    1     0             0.4976     0.0553         0.57665     0.04777
453          10    1     0             0.4423     0.0553         0.56326     0.00984
567           9    1     2             0.3870     0.0553         0.44396     0.06181
617+         –     –     –             –          –              0.39641     0.01415
718+         –     –     –             –          –              0.31049     0.01540
719           5    1     0             0.3096     0.0774         0.30971     0.00009
783           4    1     0             0.2322     0.0774         0.26258     0.00445
900           3    1     2             0.1548     0.0774         0.19047     0.00344
1000+        –     –     –             –          –              0.14206     0.00091
1022+        –     –     –             –          –              0.13289     0.00019
(+ denotes a censored observation)
6
ILLUSTRATIVE EXAMPLE OF THE HP GOF TEST STATISTIC
In an effort to look more closely at the successive steps involved in the calculation of the HP GOF test statistic, a simple but informative example is presented. The computational algorithm is employed and the details of the arithmetic calculations involved in the HP test for assessing a simple GOF hypothesis with multiply censored data are exhibited [46]. Let us assume that the simple and illustrative multiply censored data set (Table 2, column 1) has been made available [48], i.e. N = 21 lifetimes in total, of which n_f = 15 are failures and n_c = \sum_{l=0}^{n_f} e_l = 6 are incomplete observations. Table 2 summarizes the data together with the basic quantities e_j, d_j and n_j, computed at each distinct failure time in the sample (j = 1, …, n_f). Furthermore, n_0 = N and no failures are observed at time T_0 = 0, i.e. d_0 = 0, while for the multiply censored data sample of this example e_0 = 0. The simple null Weibull hypothesis submitted for examination to the HP GOF test suggests that a Weibull distribution with shape β = 1.545 and scale η = 648.784 adequately represents the overall multiply censored data sample. The reliability function for the hypothesized Weibull distribution is

R_o(t) = \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right] = \exp\left[-\left(\frac{t}{648.784}\right)^{1.545}\right] \qquad (11)
The non-parametric constituent of the HP GOF test statistic requires computation of the KM survival estimator (Eq. 2) and the differences of the reliability estimates between successive distinct failures (Eq. 4 & 5). The relevant calculations are summarized in Table 2, columns 5 & 6. Because the last observation in the data is a censoring (t_{21} ≡ t_{15 e_{15}} = 1022), the difference of the survival estimator at this point is also required, ΔR̂(t_{15 e_{15}}) = R̂(T_{n_f}) = 0.1548.
Employment of the reliability function (Eq. 11) allows calculation of the survival probabilities for the hypothesized Weibull model at both complete and incomplete observations (Table 2, column 7). The agreement of the non-parametric with the parametric survival probabilities may now be assessed through the HP GOF test statistic. The computation of the GOF procedure may therefore be realized by employing the results of Table 2, as summarized below. Employing Eq. 6, the quantity C is evaluated (Table 2, columns 6 & 7),

C = \sum_{j=1}^{n_f} R_o(T_j)\,\Delta\hat{R}(T_j) + R_o(t_{n_f e_{n_f}})\,\Delta\hat{R}(t_{n_f e_{n_f}}) = 0.4985 + 0.0206 = 0.5191
The standard deviation of C is computed from Eq. 8 (Table 2, column 8),

\hat{\sigma} = \left\{ \frac{1}{16} \sum_{l=1}^{N} \frac{N \left( [R_o(t_{l-1})]^4 - [R_o(t_l)]^4 \right)}{N - l + 1} \right\}^{1/2} = \left\{ \frac{1}{16}\,(1.4733) \right\}^{1/2} = 0.3034
The value of the HP GOF test statistic may now be obtained from Eq. 7,

HP = \frac{\sqrt{N}\,(C - 1/2)}{\hat{\sigma}} = \frac{\sqrt{21}\,(0.5191 - 0.5000)}{0.3034} = 0.2885
Therefore, the Weibull lifetime model for the survival data cannot be rejected at a level of significance α = 0.05:

-1.96 = -Z_{\alpha/2} < HP = 0.2885 < Z_{\alpha/2} = 1.96

A visualization of the Weibull distribution fit to the survival data may be obtained through the graphical GOF investigation provided in Figure 2. The graphical GOF analysis includes the Cumulative Hazard Plot (CHP) and the Probability Plot (P-P) detailed in [2], and the Stabilized Probability plot (S-P) introduced by [49]. The different graphical GOF methods provide distinct visual emphasis in the resulting plots: the CHP is useful in a tail-fit assessment, while both the P-P and S-P methods may pass judgment on the overall fit. The graphs resulting from the CHP, P-P and S-P methods (Figure 2) indicate a satisfactory agreement between the Weibull model and the multiply censored data sample. A worked check of these computations is sketched below.
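For readers who wish to reproduce the arithmetic, the following snippet applies the hp_test sketch given earlier to the 21 observations of Table 2 under the hypothesized Weibull model; up to the rounding used in the table, the resulting statistic should be close to 0.2885.

```python
import numpy as np

# Failure and censoring times read off Table 2, column 1.
failures = [69, 176, 208, 215, 233, 289, 300, 384, 390, 441, 453, 567, 719, 783, 900]
censorings = [195, 393, 617, 718, 1000, 1022]

times = np.array(failures + censorings, dtype=float)
censored = np.array([False] * len(failures) + [True] * len(censorings))

# Hypothesized Weibull reliability function of Eq. 11.
weibull_null = lambda t: np.exp(-(t / 648.784) ** 1.545)

HP, p_value = hp_test(times, censored, weibull_null)   # hp_test as sketched earlier
print(HP, p_value)   # HP close to 0.29, well inside (-1.96, 1.96)
```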
Figure 2: Graphical GOF assessment of the Weibull null hypothesis
7
CONCLUSIONS
The applicability, within the reliability and maintenance field, of standard GOF methods (e.g. EDF-type test statistics) is restricted mainly due to the presence of random censoring in the data. The HP GOF test statistic is a useful method for the assessment of distributional assumptions with multiply censored reliability and maintenance data. The computational details of the GOF method were illustrated through an example. The finite sample behavior of the HP GOF test was studied through simulation: the adequacy of its asymptotic null distribution was established, and the power of the test against shape and scale alternatives to the Weibull distribution was investigated.
8
REFERENCES
1  Ansell J.I. & Phillips M.J. (1989) Practical Problems in the Statistical Analysis of Reliability Data (with discussion), Applied Statistics, 38(2), 205-247.
2  Nelson W. (1982) Applied Life Data Analysis, John Wiley and Sons, New York.
3  Waller L.A. & Turnbull B.W. (1992) Probability Plotting With Censored Data, The American Statistician, 46(1), 5-12.
4  Hollander M. & Proschan F. (1979) Testing to Determine the Underlying Distribution Using Randomly Censored Data, Biometrics, 35, 393-401.
5  Efron B. (1967) The Two Sample Problem with Censored Data, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, edited by J. Neyman, 4, 831-853, Berkeley: University of California Press.
6  Lee E.T. (1992) Statistical Methods for Survival Data Analysis, John Wiley and Sons, New York.
7  Dodson B. (1994) Weibull Analysis, ASQC Quality Press.
8  Kaplan E.L. & Meier P. (1958) Nonparametric Estimation from Incomplete Observations, Journal of the American Statistical Association, 53, 457-481.
9  Hollander M. & Pena E.A. (1992) A Chi-Squared Goodness-of-Fit Test for Randomly Censored Data, Journal of the American Statistical Association, 87(418), 458-463.
10 D'Agostino R.B. & Stephens M.A. (1986) Goodness-of-Fit Techniques, Marcel Dekker, New York.
11 Gail M.H. & Ware J.H. (1979) Comparing Observed Life Table Data with a Known Survival Curve in the Presence of Random Censorship, Biometrics, 35, 385-391.
12 O'Neill T.J. (1984) A Goodness-of-Fit Test for One-Sample Life Table Data, Journal of the American Statistical Association, 79(385), 194-199.
13 Li G. (1995) On Nonparametric Likelihood Ratio Estimation of Survival Probabilities for Censored Data, Statistics and Probability Letters, 25(2), 95-104.
14 Mihalko D.P. & Moore D.S. (1980) Chi-Square Tests of Fit for Type II Censored Data, The Annals of Statistics, 8(3), 625-644.
15 Mihalko D.P. (1993) Chi-Square Tests-of-Fit for Location-Scale Families Using Type-II Censored Data, IEEE Transactions on Reliability, 42(1), 76-80.
16 Habib M.G. & Thomas D.R. (1986) Chi-square Goodness-of-fit Tests for Randomly Censored Data, The Annals of Statistics, 14(2), 759-765.
17 Akritas M.G. (1988) Pearson-Type Goodness-of-Fit Tests: The Univariate Case, Journal of the American Statistical Association, 83(401), 222-230.
18 Hjort N.L. (1990) Goodness-of-fit Tests in Models for Life History Data Based on Cumulative Hazard Rates, The Annals of Statistics, 18(3), 1221-1258.
19 Li G. & Doss H. (1993) Generalized Pearson-Fisher Chi-square Goodness-of-fit Tests, with Applications to Models with Life History Data, The Annals of Statistics, 21(2), 772-797.
20 Kim J.H. (1993) Chi-Square Goodness-of-fit Tests for Randomly Censored Data, The Annals of Statistics, 21(3), 1621-1639.
21 Johnson V.E. (2004) A Bayesian χ2 Test for Goodness-of-fit, Annals of Statistics, 32, 2361-2384.
22 Yin G. (2009) Bayesian Goodness-of-fit Test for Censored Data, Journal of Statistical Planning and Inference, 139, 1474-1483.
23 Wozniak P.J. & Li X. (1990) Goodness-of-Fit for the Two-parameter Weibull Distribution with Estimated Parameters, Journal of Statistical Computation and Simulation, 34, 133-143.
24 Wozniak P.J. (1994) Power of Goodness of Fit Tests for the Two-parameter Weibull Distribution with Estimated Parameters, Journal of Statistical Computation and Simulation, 50, 153-161.
25 Lawless J.F. (1982) Statistical Models and Methods for Lifetime Data, John Wiley and Sons, New York.
26 Koziol J.A. & Green S.B. (1976) A Cramér-von Mises Statistic for Randomly Censored Data, Biometrika, 63(3), 465-474.
27 Csorgo S. & Horvath L. (1981) On the Koziol-Green Model for Random Censorship, Biometrika, 68(2), 391-401.
28 Pena E.A. (1998) Smooth Goodness-of-fit Tests for Composite Hypothesis in Hazard Based Models, Annals of Statistics, 26, 1935-1971.
29 Chen H.S., Lai K. & Ying Z. (2004) Goodness-of-fit Tests and Minimum Power Divergence Estimators for Survival Data, Statistica Sinica, 14, 231-248.
30 Ren J.-J. (2003) Goodness of Fit Tests with Interval Censored Data, Scandinavian Journal of Statistics, 30, 211-226.
31 Koziol J.A. (1980b) Goodness-of-fit Tests for Randomly Censored Data, Biometrika, 67(3), 693-696.
32 Koziol J.A. & Byar D.P. (1975) Percentage Points of the Asymptotic Distributions of One and Two Sample K-S Statistics for Truncated or Censored Data, Technometrics, 17(4), 507-510.
33 Pettitt A.N. & Stephens M.A. (1976) Modified Cramér-von Mises Statistics for Censored Data, Biometrika, 63(2), 291-298.
34 Koziol J.A. (1980a) Percentage Points of the Asymptotic Distributions of One and Two Sample Kuiper Statistics for Truncated or Censored Data, Technometrics, 22(3), 437-442.
35 Fleming T.R., O'Fallon J.R. & O'Brien P.C. (1980) Modified Kolmogorov-Smirnov Test Procedures with Application to Arbitrarily Right-Censored Data, Biometrics, 36, 607-625.
36 Fleming T.R. & Harrington D.P. (1981) A Class of Hypothesis Tests for One and Two Sample Censored Survival Data, Communications in Statistics - Theory and Methods, A10(8), 763-794.
37 Nelson W. (1969) Hazard Plotting Methods for Incomplete Failure Data, Journal of Quality Technology, 1(1), 27-52.
38 Gastaldi T. (1991) Generalized Two Sample Kolmogorov-Smirnov Test Involving a Possibly Censored Sample, Communications in Statistics - Simulation, 20(1), 365-373.
39 Gastaldi T. (1992) Testing a Hypothesis Through an Identity Test, Communications in Statistics - Theory and Methods, 21(5), 1267-1272.
40 Gastaldi T. (1993) A Kolmogorov-Smirnov Test Procedure Involving a Possibly Censored or Truncated Sample, Communications in Statistics - Theory and Methods, 22(1), 31-39.
41 Miller R.G. (1981) Survival Analysis, John Wiley and Sons, New York.
42 Hahn G.J. & Meeker W.Q. (1993) Assumptions for Statistical Inference, The American Statistician, 47(1), 1-11.
43 Bohoris G.A. (1995) Effectiveness of the Comparative Reliability Assessment Techniques Evaluating Quality & Reliability in Engineering Design, International Journal of Reliability, Quality and Safety Engineering, 2(3), 309-326.
44 Press W.H., Flannery B.P., Teukolsky S.A. & Vetterling W.T. (1986) Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, Cambridge.
45 Kleijnen J.P.C. & Groenendaal W. (1992) Simulation: A Statistical Perspective, John Wiley and Sons, Chichester.
46 Kostagiolas P.A. (2000) The Goodness of Fit Problem with Industrial Reliability Data, Ph.D. thesis, University of Birmingham.
47 Aho M., Bain L.J. & Engelhardt M. (1985) Goodness-of-Fit Tests for the Weibull Distribution with Unknown Parameters and Heavy Censoring, Journal of Statistical Computation and Simulation, 21, 213-225.
48 Bohoris G.A. (1994) Comparison of the Cumulative-Hazard and the Kaplan-Meier Estimators of the Survivor Function, IEEE Transactions on Reliability, 43(2), 230-232.
49 Michael J.R. (1983) The Stabilized Probability Plot, Biometrika, 70(1), 11-17.
Acknowledgments The first author would like to express his gratitude to Prof. Brian Haley, University of Birmingham for his support during his Ph.D. studies.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
LIFE-CYCLE ENGINEERING ASSET MANAGEMENT
Woo-bang Lee, Sang-young Moh, and Hong-jung Choi
Korea Hydro & Nuclear Power Co., 411 Youngdongdaero, Gangnam-gu, Seoul, Korea.
All asset life cycle phases, including project planning, design, procurement, construction, operation, and decommissioning, should be considered in order to manage an asset appropriately. Existing EAM processes are usually planned and implemented during the operation phase of the asset life cycle. Therefore, the effectiveness of the processes is reduced, in spite of the huge amount of effort put into process introduction. A high standard EAM model should be planned, implemented, and provided with a feedback loop at the beginning of the asset life cycle. Thereby, a life cycle approach can be applied to reduce operating and maintenance costs, improve equipment reliability and configuration management, and minimize risk. KHNP (Korea Hydro and Nuclear Power Co.) has developed its own Asset Management Model with reference to the NEI SNPM (Standard Nuclear Performance Model), the INPO ER (Equipment Reliability) process and the EPRI recommendations on ER for New Nuclear Plant Projects (Design and Procurement). KHNP is gradually implementing the model in the nuclear plants in operation and will also extend it to the plants under construction, thus refining the model from the commissioning to the decommissioning phases for a complete cradle-to-grave approach. We plan to develop plant O&M programs at the beginning of the plant design and procurement phase and transfer the programs to the operation phase. The O&M programs consist of asset registration (asset list, structure, BOM, etc.), asset classification (equipment class, functional importance, SPV analysis, etc.), and maintenance strategies and plans. Therefore, asset vendors and suppliers will be required, in terms of contract, to provide all data and information related to the O&M programs for the physical assets they offer. We are also preparing to utilize Life Cycle Cost (LCC) in the decision making process to achieve economical asset management. Technical and financial risks can be controlled within long term equipment reliability management plans as well. This paper presents KHNP's Asset Management Model, in order to share our lessons learned from the implementation tasks and to obtain experts' comments or suggestions to improve the model.

Key Words: Life-Cycle EAM, LCC, EAM, Asset Management, KHNP

1
INTRODUCTION
KHNP has 20 nuclear reactors and 27 hydro generators in operation and 8 nuclear reactors under construction. Most of KHNP's assets are physical assets (generation plants), and almost all of its electricity comes from nuclear plants, so the EAM model presented in this paper is for nuclear plants. KHNP has developed the EAM model through benchmarking of US models and processes such as the SNPM (Standard Nuclear Performance Model) and the ER (Equipment Reliability) process. We have been gradually implementing the EAM model in all nuclear plants in operation, meanwhile sending our engineers to U.S. nuclear plants to learn the processes through on-the-job training. The KHNP engineers take on roles and responsibilities for implementation of the processes in their local plant after completing their one-year overseas training course on system engineering related tasks. The purposes of KHNP's EAM model are: 1) To improve maintenance strategies from reactive to proactive. 2) To introduce a graded approach concept to better manage limited resources, with the focus on the critical assets. 3) To enhance the efficiency of asset management and the funding process by implementing long term planning processes. However, implementing the EAM processes in nuclear plants which have been operating for more than 10 years results in the following difficulties and restrictions:
1) Lack of understanding of design concepts.
2) Less than sufficient knowledge of the system engineering basis.
3) Difficulty in obtaining historical engineering documents on design changes and configuration management.
4) Strong resistance to new processes by staff who feel comfortable with existing business processes.
Due to the above reasons, it takes a long time to introduce new EAM processes into the plants in operation, and the effectiveness of the processes is inevitably limited. KHNP is now refining and improving the EAM model to be applicable at the plant design and procurement phase, in order to resolve the problems listed above and enhance the processes' effectiveness. The new model also brings not only the construction and operation phases but also the decommissioning phase into its management scope by considering life cycle cost (LCC). This paper first presents the existing EAM model, then reviews additional considerations for enhancing the effectiveness and efficiency of the model when applied to new nuclear plant projects, and finally suggests a new EAM model incorporating these considerations.

2
EXISTING EAM MODEL FOR THE PLANTS IN OPERATION

Figure-1 represents the existing EAM model, which was designed for, and is being implemented in, the plants in operation.
Figure-1 Existing EAM Model of KHNP
2.1 Asset Registration
The asset registration process is very important since it is the starting point of EAM. An unregistered asset can incur severe risks at unexpected times because it is left outside the management scope.

2.1.1 Equipment List and Its Hierarchy Structure
All equipment in the plant must be registered in the equipment master of the asset management system (SAP). The maintenance management unit is the lowest level of equipment registered. The equipment is hierarchically structured in accordance with the actual equipment composition, and the structure focuses on key components, with sub-components and support components at the lower hierarchy levels. The equipment master is updated whenever any modification of the equipment composition is performed due to design change, replacement, additional installation or elimination. The attributes of the equipment are also registered in the equipment master; they consist of technical specification, design data, location and cost-related information, linkages to related documents, etc. Once a piece of equipment is registered, all information on its operation and maintenance is automatically accumulated under the equipment ID.
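As an illustration of the kind of record this process implies, the sketch below models a minimal equipment-master entry with a parent link for the hierarchy and a free-form attribute map; the class and field names are illustrative and are not taken from KHNP's SAP data model.

```python
from dataclasses import dataclass, field
from typing import Optional, Dict, List

@dataclass
class EquipmentRecord:
    """Minimal equipment-master entry (illustrative, not the SAP schema)."""
    equipment_id: str
    description: str
    parent_id: Optional[str] = None                            # hierarchy: key component -> sub/support components
    attributes: Dict[str, str] = field(default_factory=dict)   # tech spec, design data, location, cost info
    bom: List[str] = field(default_factory=list)               # material IDs needed for maintenance

# Example: a pump registered under its system, with two spare-part links.
pump = EquipmentRecord(
    equipment_id="PMP-001", description="Charging pump",
    parent_id="SYS-CVCS",
    attributes={"rated_flow": "150 m3/h", "location": "Aux. building"},
    bom=["MAT-SEAL-12", "MAT-BRG-07"],
)
```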
2.1.2 Material List
All spare parts and replaceable materials are registered in the material master of the asset management system (SAP). Attribute information regarding the manufacture and procurement of the material is also registered along with the material. The safety stock for each material is optimized by the inventory optimization process, considering usage history and purchasing lead time.

2.1.3 Equipment BOM (Bill of Material)
The equipment BOM is the internal linkage between an equipment ID and the material IDs needed for maintenance of that equipment. The equipment BOM supports convenience and efficiency in the maintenance planning task by offering the required material list for the maintenance, while supporting decision making in the inventory optimization process by presenting the list of equipment using a certain material.

2.2 Asset Classification
Asset classification is utilized as a strategic methodology so that limited resources can be focused on the more important assets.

2.2.1 SPV (Single Point Vulnerability) Analysis
Since SPV components can cause a plant shut down, they are managed with the highest priority and attention in the EAM model. Once SPV components are identified, we usually look for a design change opportunity to make them redundant and, eventually, to minimize the number of SPV components. If there is no way to make the component redundant, we try to improve its reliability by finding alternatives such as replacing the component with a more reliable one, improving the maintenance program, strengthening condition monitoring, etc.

2.2.2 MR (Maintenance Rule) Scoping
KHNP has developed and implemented its own MR program with reference to the US MR program. The MR program classifies safety-related functions from the plant system function list. After defining the risk significance and appropriate performance criteria for each function in the MR scope, system engineers monitor the MR functions and periodically evaluate system health. Any function in a worse condition than its performance criteria becomes an intensive monitoring target and new performance goals are set. Root cause analysis and corrective actions are also conducted to recover the performance of the affected function, in parallel with the intensive monitoring. The Korean regulatory agency has deferred the decision on whether to make the MR program a mandatory legal rule, on the condition that utilities implement the program voluntarily.

2.2.3 FID (Functional Importance Determination)
The functional importance of equipment is determined in consideration of its intended functions. Even if the physical characteristics of two pieces of equipment are the same, they might have different importance when their functions are not the same. The equipment's functional characteristics determined by the FID process consist of equipment class, duty cycle, and service condition. If the equipment is an SPV component or performs an MR function, its importance is classified as Critical. FID result data are used in the maintenance optimization process to decide the applicable PM template, PM tasks, and task intervals. The FID data are also utilized as inputs to decision making in most EAM processes. Figure-2 shows the equipment functional characteristics determined by the FID process.

2.2.4 FEG (Functional Equipment Group)
A FEG is a group of components which support a certain function. If a component fails, the function supported by the failed component also fails simultaneously. Since the operability of the other components is not required during corrective maintenance work on the failed component, all planned maintenance work for the other components can be conducted at the same time. Utilizing FEGs supports efficiency in maintenance scheduling and work management. We have to manage equipment performance based on system functions for more reliable plant operation, and the FEG is an effective tool supporting function-based performance management. The main purpose of the FEG is to minimize inoperable time and frequency through optimization of the maintenance schedule based on FEGs. FEG-based management also supports effective performance monitoring in the MR program.
Characteristic            Code       Description     Comment
Functional Importance     A          Critical A      Both A and B fall under "Critical" in the PM program
                          B          Critical B      (as above)
                          C          Minor           Non-critical
                          X          No Impact       RTF in the PM program
Component Type            Char(4)    244 types       One type is selected
Duty Cycle                H          High            Defined by common criteria described in the FID process description
                          L          Low             (as above)
Service Condition         S          Severe          (as above)
                          M          Mild            (as above)

Figure-2 Equipment's Functional Characteristics by FID
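A small sketch of how the FID characteristics in Figure-2 might be combined into a single code used to select a PM template column; the rule that SPV components and MR functions force a "Critical" classification follows the text above, while the function and field names are illustrative only.

```python
def functional_importance(is_spv: bool, performs_mr_function: bool,
                          default_grade: str = "C") -> str:
    """Return the FID importance code: A/B (critical), C (minor) or X (no impact)."""
    if is_spv or performs_mr_function:
        return "A"          # SPV components and MR functions are treated as Critical
    return default_grade

def fid_code(importance: str, component_type: str, duty: str, service: str) -> str:
    """Combine the four FID characteristics into one code, e.g. 'A-PPHC-H-S'."""
    return "-".join([importance, component_type, duty, service])

# Example: a critical component of type 'PPHC' with high duty cycle and severe service.
print(fid_code(functional_importance(is_spv=True, performs_mr_function=False),
               "PPHC", "H", "S"))
```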
2.3 System Performance Monitoring
System performance monitoring means function-level performance monitoring for each system. Whenever an individual component fails, we have to assess whether any system function is affected and update the performance status of the system function. System engineers develop system monitoring plans for their systems and monitor them in accordance with those plans. The system monitoring plan includes performance criteria, failure modes, affecting parameters, monitoring interval, monitoring method, and action requirements in case of failure. The result data of performance monitoring are stored in the System Health Evaluation System (SHES), automatically or manually. System health statuses are determined by logically structured performance criteria and displayed on the SHES web screen in the colors Red, Yellow, White, and Green. System performance data are used to evaluate plant performance, which is also displayed on the SHES screen in colors showing the quantitative status of achievement of the business strategies and goals of the plant. Most of the direct information and some of the indirect information needed to evaluate system performance are collected automatically by the IT system. The system performance monitoring process aims at preventing functional failure through continuous performance trending and corrective actions against potential failures. Condition Based Maintenance (CBM) must be established in the plants to achieve the purposes of system performance monitoring. Therefore, KHNP has introduced an integrated predictive maintenance system to support CBM. The system contains state-of-the-art diagnosis technologies such as vibration analysis, ultrasound analysis, motor current analysis, thermography, oil analysis, etc.

2.4 Maintenance Optimization
The Maintenance Optimization (MO) process establishes optimized maintenance strategies for each equipment type. The process defines appropriate preventive maintenance tasks and intervals to prevent the progression of failure modes through FMEA analysis, and develops PM templates for each equipment type to implement the maintenance strategies according to FID data. The MO process prefers CBM tasks over TBM (Time Based Maintenance) tasks in order to realize a proactive maintenance strategy and enhance the effectiveness of preventive maintenance. The process adopts TBM tasks restrictively, only where the failure mode cannot be fully addressed by CBM tasks or where there is no effective CBM task addressing the failure mode. An interval of "As-Required" for some TBM tasks means that the TBM task is performed only when evidence of degradation is identified by the performance of CBM tasks. Figure-3 is a sample PM template. A PM template is developed for each equipment type and classified by 4 characters (e.g., PPHC) on the upper right side of the template. There are preventive tasks on the left side and intervals in the centre area of the template. The appropriate interval for each task is determined by FID data.
Figure-3 Sample of PM Template
Figure-3 shows the finally summarized information of the PM template. It is used when we implement the PM strategies in the plant PM program. The PM template includes additional background information such as detailed FMEA information, the maintenance strategy selection process, the content of each PM task, etc. KHNP has developed 200 PM templates; 60 of them were developed with reference to the EPRI PM templates, while the others were developed by KHNP. We are gradually implementing these PM templates in all the plants in operation. Implementing these PM templates, which embody new maintenance strategies, requires culture change. Many engineers in the local plants do not like to introduce the PM templates into their workplace and adhere to existing maintenance practices based on TBM tasks. They also lack understanding of CBM tasks and the state-of-the-art diagnosis technologies. KHNP has been conducting training and qualification courses on the CBM diagnosis technologies since 2007 to overcome this resistance.

2.5 Equipment Reliability Database
KHNP used to perform time based preventive maintenance according to guidance from vendors or manufacturers. Since we recently developed and implemented the PM templates focused on CBM, we are developing a reliability database system to enhance equipment reliability and maintenance effectiveness by supporting continuous optimization of the PM templates. Some parts of the database supporting PSA (Probabilistic Safety Assessment) and the MR program have already been developed, while other parts for FMEA analysis and maintenance optimization are being developed. The reliability database will promote the advance of our asset management processes in the future.

2.6 Risk Informed Application
Risk Informed Application (RIA) optimises the regulation program, shifting from a deterministic methodology to a risk-informed and performance-based methodology utilizing PSA results and risk information coming from reliability analysis. RIA aims to eliminate unnecessary regulatory items while focusing on the more important safety issues, improving both nuclear safety and economic benefit. From the EAM point of view, we classify important equipment by analysing its criticality and weakness, and promote Allowable Outage Time (AOT) extension so that we can conduct preventive maintenance tasks for the equipment during the plant operation period. Utilizing risk information, we are able to concentrate our resources on the more important physical assets to enhance the effectiveness of asset management.

2.7 Corrective Action Program (CAP)
In case of any performance degradation, health risk, or issue requiring improvement identified in the performance monitoring process, an action request is transferred to the CAP process so that the root cause can be analysed and corrective actions taken to recover the performance. Existing maintenance processes focus on resolving the apparent health problems of the affected equipment, while the CAP process aims at root cause analysis and corrective actions, in parallel with the progression of the existing maintenance process, to resolve all health issues, including equipment-related incidents and other non-equipment-related incidents concerning human performance, procedures, processes, work practices, and safety culture. Once an action request is issued by anyone on the site, the CAP committee screens and defines the significance level for each action request (AR), and the cause analysis method is selected according to the significance level of the AR. Figure-4 is the flowchart of the CAP process.
Figure-4 Process Flowchart of CAP
2.8 Investment Management & Long-Term Asset Management
Investment requests on maintenance programs and corrective actions for improving plant performance are prioritized based on feasibility analysis before being enrolled into the long term asset management plan. Feasibility analysis of an investment can be divided into economic analysis and requirement analysis. Economic analysis selects the optimal investment alternative by comparing life cycle cost (LCC) converted into net present value (NPV), as sketched below, while requirement analysis reviews the needs for the investment, such as legal considerations, regulatory guidelines, national standards, technical codes, or pending issues on public acceptance of nuclear energy. The results of the analyses are quantified and prioritised in the investment requests pool, and the investments are carried out in order of ranked priority. A company-wide long-term asset management (LTAM) plan is established for the major components which significantly affect the plant's cost and safety. The purposes of the LTAM process are: 1) To minimize financial risk on funding by determining equipment replacement times in advance based on LCC analysis. 2) To prevent unplanned losses by systematic performance monitoring and proactive corrective actions against potential failures or latent problems which could result in significant losses. 3) To find opportunities for performance improvement of major components.
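A minimal sketch of the kind of NPV-based LCC comparison described above: each alternative's yearly cost stream is discounted to present value and the cheapest alternative is selected. The discount rate, cost figures and function names are illustrative assumptions, not KHNP data.

```python
def npv(cash_flows, rate):
    """Net present value of a yearly cost stream (year 0 first)."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))

def cheapest_alternative(alternatives, rate=0.05):
    """alternatives: mapping of name -> list of yearly life-cycle costs."""
    return min(alternatives, key=lambda name: npv(alternatives[name], rate))

# Example: replace a component now vs. refurbish and replace later (costs in arbitrary units).
options = {
    "replace_now":            [120, 5, 5, 5, 5, 5],
    "refurbish_then_replace": [40, 10, 10, 10, 130, 5],
}
print(cheapest_alternative(options))  # picks the option with the lower discounted LCC
```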
2.9 Long-Term Planning & Work Management
The long-term investment plan, for the evaluated investment requests and maintenance tasks, is developed with consideration of the company's financial status. Maintenance planning utilizes FID data and PM templates; the plan is subjected to applicability reviews in each nuclear plant. The final maintenance plans are stored and managed in the SAP PM module, and long-term maintenance plans over the plant life-cycle are established automatically by SAP according to the predefined task intervals. Once the long term plan is confirmed, we can check the life cycle investment profile, and the company's funding plan is modified according to the long term plan. The confirmed maintenance plan is scheduled weekly, considering the day-to-day plant condition. Scheduling is performed with reference to the core damage frequency (CDF) calculated by the risk monitor on the assumption that the maintained component is out of service, and work schedules may be adjusted to bring the CDF peak down under a manageable limit value. To minimize errors in all human performance regarding operational actions and maintenance work, we utilize various human performance improvement tools such as Pre-job Briefing, Post-job Critique, 3-way Communication, Self/Peer Check, Concurrent/Independent Verification, Operational Decision Making, Management Observation, etc.

2.10 Continuous Optimization
Since all EAM processes result in system performance, the EAM processes can be optimized continuously by corrective actions against performance gaps. The maintenance strategy is optimized by trend analysis of maintenance feedback, using the as-found equipment condition code, a numerical code representing the degree of gap between the actual equipment condition and the expected equipment condition. The expected condition is the decision basis for the maintenance task and interval defined in the PM template for the target equipment. Systematic process integration and feedback are prerequisites for continuous optimization, and KHNP's asset management system, SAP, fully supports process integration and feedback.
3
ADDITIONAL CONSIDERATIONS FOR NEW NUCLEAR PLANT PROJECTS
As mentioned in the introduction, there are many difficulties and restrictions, such as lack of understanding of design concepts, difficulty obtaining vendors' information, and insufficient historical engineering documents, when we develop and implement EAM processes in plants already in operation. The lesson learned from EAM process introduction into the plants in operation is that developing and implementing the EAM processes during the plant design or procurement phase is incomparably more efficient.

3.1 Asset Registration
It is much easier and more cost-beneficial to compose the material master during the plant design and procurement phase than during the operation phase. We can easily develop the whole material master and equipment master using the purchase order list, material list, suppliers' information and detailed technical specifications from vendors when the plant is in the design and procurement phase. We can also easily organize the hierarchy structure of the equipment master by utilizing the design engineers in the design company. Therefore, the above considerations, including the equipment bill of material (BOM), should be contained in the contractual terms so that vendors have the duty to provide the related information. All data and information should be in digital formats and transferred through networks, which is more beneficial, reliable, and fast. We have to register all kinds of physical asset during the design and procurement phase and transfer them to the operation phase, so as to establish a reliable asset management foundation.

3.2 Asset Classification
Asset classification includes single point vulnerability (SPV) analysis, MR scoping, functional importance determination (FID), functional equipment group (FEG) development, etc. Because the asset classification task requires a thorough understanding of plant design, the best way to conduct asset classification is to perform the task during the plant design and procurement phase, getting design engineers involved in the task. We conduct SPV analysis first, and initiate design changes for the identified SPV components to make them redundant, to minimize the number of SPV components and to improve the plant's inherent reliability. The acceptance decision for the design changes should be based on LCC analysis of the possible alternatives, and the most reasonable one should be selected. We develop the system function list for all plant systems and classify the MR scope functions for which performance monitoring is conducted according to the MR program. After classifying the FID data for each component based on its actual usage in the installed
system, FEGs are developed by grouping the function-related components which support each function. The data quality of the function analysis and importance determination tasks mentioned above depends on understanding of the plant design, PSA experience and PSA insight. Hence, asset classification should also be contained in the contractual terms as a vendor duty. Figure-5 represents the asset classification process during the design and procurement phase of a new plant project.
Figure-5 Asset Classification during the Design & Procurement Phase of a new plant project (flowchart; process blocks: Equipment Master in SAP PM, SPV Analysis, Review for Design Change, Initiate Design Change, Develop System Function List, MR Function Scoping, Functional Importance Determination, Develop Performance Monitoring Plans, FMEA Libraries, Establish PM Program, Develop Functional Equipment Groups)
3.3 Maintenance Program
The development of maintenance strategies and their implementation have to be completed prior to the start up (commissioning) phase of a new plant project, since the maintenance management program needs to be in service from the start up phase. Since existing plants in operation do not have enough documents or information accumulated during the start up phase, there are many difficulties for engineering analysis when resolving problems during the operation phase. Data and information on the initial start up performance tests are so valuable that they are used as a reference for performance management over the plant life cycle. We can utilize the FMEA libraries and PM templates of the existing plants in operation when developing the maintenance program for the new plant. However, additional FMEA libraries and PM templates for the new component types of the new plant should be contained in the contractual terms, for vendors to prepare them.

3.4 Other Considerations
It is efficient for the PSA analysis organization to participate in the development of the risk monitor and the MR program for the new plant. Although the investment management process and the LTAM process may not be considered processes that are in service from the design phase, these processes must be in service from the plant design phase, because we have to find the best cost-benefit alternative by LCC evaluation when selecting the models of major equipment and when developing the maintenance program. The CAP process also has to be in service from the plant design and procurement phase, in cooperation with design companies and vendors, because the CAP process is a very effective tool to enhance overall equipment reliability by preventing and resolving potential problems and by obtaining feedback of operational experience from existing plants.
4
NEW EAM MODEL CONSIDERING ASSET LIFE CYCLE

Figure-6 represents KHNP's new EAM model considering the asset life cycle; it is similar to the existing EAM model.
The key concepts of the new EAM model are that the EAM model should be developed and implemented in the design and procurement phase, and that the following items are fully prepared by vendors or design companies:
1) Asset registration
2) Asset classification
3) Risk monitor development
4) FMEA libraries and PM templates for new component types
5) MR program development
6) Technical support on PSA and LCC evaluation
Figure-6 New EAM Model considering Asset Life-Cycle
The new EAM model for new plants must be systematically developed and implemented by a specialized organization from the design and procurement phase of the new plant project, whereas the existing EAM model is developed and implemented in the operation phase. LCC evaluation recommends the best alternative for each investment request. LCC is calculated by evaluating all cost elements regarding design, procurement, manufacturing, installation, operation, maintenance, and disposal, and converting those costs to net present value (NPV) so that the investment alternatives can be compared easily.

5
CONCLUSIONS
KHNP has developed its own EAM model and the model is being implemented in the plants in operation. Furthermore, KHNP will expand the EAM model to the plants under construction as well, so that the purpose of the EAM model can be accomplished effectively and efficiently. The expected benefits of the new EAM model are:
1) Establishing a qualified equipment master, material master, and equipment BOM
2) Reliable asset classification by design engineering experts
3) Maximizing utilization of information and experience from design companies, vendors, and construction companies
4) Assuring consistency between safety and risk considerations by involving PSA experts in MR program and risk monitor development
5) Establishing an information management system including historical records on design, procurement, construction, and commissioning (start up)
6) Enhancing reliability and efficiency of new nuclear plant projects by early implementation of the problem solving process (CAP)
7) Preventing the equipment reliability declines that usually emerge in the initial operation period, through implementation of a high standard asset management system during the design and procurement phase of the new plant project, and transfer of the system to the operation phase.
We are at present implementing the EAM model for the plants under construction by ourselves, with the support of design companies and vendors. For new nuclear plant projects under planning, we plan to include this in the contractual terms as duties of design companies and vendors.

6
REFERENCES
1
Woo-bang Lee, et al., (2004) System Oriented Plant Maintenance, KHNP.
2
Woo-bang Lee, (2005) Optimization of Maintenance Management for a Nuclear Power Plant Considering Its Safety and Economics, Bukyong University.
3
Woo-bang Lee, et al., (2005) The Evaluation of Capital Investment for Improving the Performance of Aging Equipment in Nuclear Power Plant with AHP, KHNP.
4
Woo-bang Lee, et al., (2006) Strategies for Physical Asset Management in KHNP, WCEAM.
5
Anderson Consortium and KPMG Consulting Consortium, (2002) KHNP BPR Deliverables for MO, PM, FEG and IO processes.
6
OnMarc Consulting Inc. & ACA Inc., (2005) i-ERIP process overview.
7
INPO, Equipment Reliability Process Description, INPO AP-913, Rev. 2, 2007.
8
NEI, (2003) The Standard Nuclear Performance Model - A Process Management Approach - Revision 4.
9
EPRI, (2008) Advanced Nuclear Technology: Equipment Reliability for New Nuclear Plant Projects: Industry Recommendations — Design, 1016754.
10 EPRI, (2008) Advanced Nuclear Technology: Equipment Reliability for New Nuclear Plant Projects: Industry Recommendations — Procurement, 1018393. 11 EPRI, (2003) Critical Component Identification Process - Licensee Examples, TR-1007935. 12 EPRI, (1999) Preventive Maintenance Basis Guideline, TR-106857 Vol. 1-38. 13 EPRI, (2004) Preventive Maintenance Basis Database (PMBD) Version 5.1.1, Project 4109. 14 J. Moubray, (1997) RCM
, Reliability-centered Maintenance, Industrial Press.
15 SAE, JA-1011 (1999) Evaluation Criteria for Reliability-centered Maintenance (RCM) Processes 16 SAE, JA-1012 (2002) A Guide to the Reliability-centered Maintenance (RCM) Standard. 17 J.D. Campbell & A.K. Jardine, (2001) Maintenance Excellence, Marcel Dekker Inc. 18 H. Paul Barringer, (2003) A Life Cycle Cost Summary, Barringer & Associates Inc., ICOMS-2003 19 Frank Bescherer, (2005) Established Life Cycle Concept in the Business Environment, Helsinki University of Technology. 20 NEI, (2000) Industry Guideline for Monitoring the Effectiveness of Maintenance at Nuclear Power Plants, NUMARC 9301. 21 INPO, (2003) Work Management Process Description, INPO AP-928 Rev. 1. 22 EPRI, (1997) The Work Control Process Module in Support of a Living Maintenance Program, TR-108559. 23 EPRI, (2004) Guidance for Developing and Implementing an On-Line Maintenance Strategy, TR-1009708. 24 EPRI, (1997) Guideline for System Monitoring by System Engineers, TR-107668. 25 INPO, (2005) Performance Objectives and Criteria, INPO 05-003. 26 B. Stengl & R. Ematinger, (2001) SAP R/3 Plant Maintenance - Making it works for your Business, SAP Press. 27 NEI, (2005) Nuclear Asset Management Process Description and Guideline, NEI AP-940, Rev. 0. 28 EPRI, (1998) Nuclear Plant Life Cycle Management Implementation Guideline, TR-106109. 29 EPRI, (1998) Inventory Optimization in Support of the EPRI Work Control Process, TR-109648.
DEVELOPMENT OF A DYNAMIC MAINTENANCE SYSTEM FOR ELECTRIC MOTOR'S FAILURE PROGNOSIS
Mpinos Chr. Anastasios a and Karakatsanis S. Theoklitos b
a Production & Management Engineer, Grad. of D.U.Th., PhD Candidate, Democritus University of Thrace, Greece.
b Electrical Engineer, PhD of Nat. Tech. Univ. of Athens, Assistant Professor, Democritus University of Thrace, Greece.
In recent years many studies have addressed the preventive maintenance of mechanical equipment and fault diagnosis, each with a different approach. Predictive maintenance allows the kind of maintenance and care that a system needs to be known continuously, and mainly what kind of equipment replacement is foreseen in the future. This paper presents the development of a smart, dynamic maintenance system for electric motor failure prognosis. The method is based on the analysis of the motor into its subsystems by using neural networks and by recording their relations and interactions. The possible damages of every subsystem and their respective symptoms are also embodied in this model. The objective of this analysis is to develop an algorithm for the calculation of the probability of a damage appearing in a motor part according to its dynamic operational status. The creation of a database reflects the technicians' experience concerning the causes and the possible preventive actions that are necessary. The suggested method is general and can be applied to every part or system of the mechanical equipment. Finally, the paper presents a simple case study for the rotor, and practical conclusions are drawn which can lead to a better focus on preventive maintenance.

Key Words: Preventive maintenance, Electric motor, Failure prognosis, Neural Networks (NN), Dynamic Neural Networks (DNN).
1
INTRODUCTION
The maintenance of all electromechanical installations is an essential requirement, the main concern of maintenance engineers and one of the most interesting issues of modern engineering. It is a fact that no equipment, especially when it comprises mechanical features or movable parts, can operate normally or continuously without maintenance. One day it will need a repair or replacement. The normal practice is that the repair is performed when the equipment demonstrates damage, which results in its shutdown for an initially unknown reason. The procedure followed in this case is first to locate the cause of the damage and then to repair it. The time necessary for the restoration of service of the installation varies, and it depends on the time spent searching for the cause, the availability of possible spare parts and the time of the main repair. Preventive maintenance is based on damage prevention through the systematic maintenance of the equipment at regular time intervals. The expectation is that in this way eventual damage will be prevented before it appears. Preventive maintenance includes the replacement of equipment that has reached the end of its useful life, whether necessary or not, the inspection of possible wear and fatigue points and the lubrication of the movable parts. A characteristic example is the replacement of lubricants, belts, elastic pipes, etc. before they even demonstrate damage and without identifying a direct necessity for such a thing, since the objective criterion is the lapse of the set time. The majority of the programmed tasks concerning preventive maintenance aim at developing algorithms to determine the best planning of the time set for the servicing or replacement of particular parts [1]. The criteria applied are the quality and reliability of the equipment, the time and the repair and replacement cost, the availability of spare parts and the storehouse capacity. These methods consider the spare parts' lifetime, the machines' operating conditions and the pre-set tolerance and stressing of the spare parts as stable, without taking into account the dynamic operational status of the equipment [2].
The prognostic maintenance is based in the objective evaluation of the special functional and environmental equipment during its life and total fatigue, something which allows the universal and constant knowledge of the maintenance’s type and care that a system demands, and principally the future prognostic equipment substitution. The prognostic maintenance consists of two separate parts: the diagnosis and the prognosis. The diagnosis refers to fault detection of every part while the prognosis can be performed by developing systems capable of prognosticating with various sensors and algorithms the potential failure of equipment parts. This fact renders the prognostic maintenance a necessary tool in all modern industrial units. Its advantages are obvious. Important advantages include reduction in lost production, reduction in maintenance costs, minimizing the possibility of secondary faults; suspending the need for a “spare part store”, expanding the industrial machines’ productive life, as well as better product quality [3-4]. During the last years, many studies have been conducted and many systems have been suggested, both failure diagnosis and prognosis. Some of these are based in statistical knowledge of reliability factors [5], others in tree analysis and decisionmaking [6], and others in comprehending the failure mechanisms of every part [7]. A prognosis system aims to receive systematic calculations with appropriate sensors and to statistically analyze the results in order to evaluate the instantaneous and dynamic functional equipment behavior. Many and interesting modern methods have been suggested in the field of failure prognosis, which use mainly audiovisual means; for instance, Oil-wear Particle Analysis, Vibration analysis [8], Ultrasonic Analysis [9], and Thermal Analysis. These measurements from the right sensors come obligatorily with a complete database so they can be compared. This database includes cases of relevant functional situations in order to find identical functional situations which have led to these failures or fatigues. In this article, a prognosis failure and damage method is presented in industrial electric induction rotors which not only are they the basic motion motor of the industry, but of the transports as well as of the majority of electric devices. This method is based in the complete analysis of the induction motor in integral subsystems which perform particular tasks and further classification and grouping of their parts. The relations and interactions between them is registered with the use of Dynamic Neural Networks which can take advantage of the accumulated experience and the artificial intelligence systems can provide fast, reliable and, above all, in real function time of the actual dynamic functional situation and evaluating a potential failure. A complete evaluation of the failure danger should take into account as well the potential correctional actions which can be used so as to remove the system from the dangerous situation and the right evaluation according to their results. The proposed method is general and can be applied simultaneously or parallel to any part of system of the mechanical equipment. In conclusion, in this article a simple example of the method’s application to the subsystem bearing–rotor of the electric motor is presented. This subsystem is the most common to present failure due to temperature-induced fatigue. 
Through this analysis and the presentation of the method, important conclusions are drawn concerning its suitability for various subsystems and its future extension to further subsystems, as well as to the monitoring of additional combined functional and environmental parameters such as vibration, current, network harmonics, humidity and dust.
2 NEURAL NETWORKS
Artificial neural networks are inspired by biology and are composed of elements, the artificial neurons, which behave in a way analogous to the most basic functions of biological cells. The artificial neurons are organised in such a way that they imitate the anatomy of the human brain: they learn from accumulated experience, they are capable of generalising from earlier examples to new ones, and they can assess a group of data and distinguish its most important characteristics. Over the last decade, artificial intelligence techniques have been applied to various problems of statistics, computing and structural dynamics. Such applications include the reliability analysis of structures, aiming to predict the results of the analysis, and optimal design problems, where the required values of every new design constraint originate in the prognosis of an appropriately trained neural network; they have also been applied to fracture mechanics problems as well as to adaptive and stochastic finite element problems [10]. A well-trained neural network can produce results of acceptable accuracy in a short computation time, and this ability is the main advantage of NNs. The approximation of a solution by a neural network is very valuable in cases of time-consuming analyses, where a fast evaluation of a factor's behaviour is necessary. Neural networks can use the same linguistic norms humans use and can convey the outcome of elaborated information resulting from many factors. In the present case, in order to develop the failure prognosis method for an electric motor, a Dynamic Neural Network is applied to a subsystem of the rotor, according to the McCulloch-Pitts (step) model [10]. The DNN aims to represent as precisely as possible the specific functional and environmental conditions of the subsystem. By using sensors appropriately placed at parts of special interest, quantities describing mechanical, physical and electrical properties of certain parts of the subsystems, such as temperature, vibration frequency, sound, current, humidity and the presence of foreign matter (dust), can be measured. The main interest is not the mere recording of these values, but the continuous observation of the dynamic behaviour of the measured quantities and the monitoring of their abrupt changes [27]. The mathematical model of a neuron of the DNN which is applied and analysed below has one output, several inputs and the activation function represented in Figure 1.
Figure 1. Function representation of a DNN neuron [14-16].

The output is defined as

y = 0 → inactive neuron, y = 1 → maximum pulse frequency    (1)

and the inputs as

x_1, x_2, \dots, x_n    (2)

with attached weights

w_1, w_2, \dots, w_n    (3)

The total impulse of the neuron is

u = w_1 x_1 + w_2 x_2 + \dots + w_n x_n    (4)

and θ is the threshold of the neuron inputs:

if u > θ the neuron fires, if u < θ the neuron remains inactive    (5)

With f(·) denoting the activation function, the output is given by

y = f(u − θ)    (6)
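As a concrete illustration of equations (1)–(6), the short Python sketch below implements the step-model neuron; the input values, weights and threshold used are placeholders rather than values from the paper.

```python
def step_neuron(inputs, weights, threshold):
    """McCulloch-Pitts step neuron: returns 1 (fires) when the total impulse
    u = sum(w_i * x_i) exceeds the threshold, otherwise 0 (inactive)."""
    u = sum(w * x for w, x in zip(weights, inputs))   # total impulse, eq. (4)
    return 1 if u > threshold else 0                  # y = f(u - theta), eqs. (5)-(6)

# Placeholder example: three inputs with equal weights and a threshold of 10.
y = step_neuron(inputs=[4.0, 3.5, 1.0], weights=[1.0, 1.0, 1.0], threshold=10.0)
print("neuron fires" if y == 1 else "neuron inactive")
```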
3 ELECTRIC MOTOR ANALYSIS
In this part of the article the induction motor is analysed. The induction motor with a squirrel-cage rotor is the most common type, owing to its simple construction and operation. This motor has no separate excitation circuit; the rotor currents are induced according to the operating principle of the transformer. Because of its many advantages (robust construction, low cost, increased reliability and, in most cases, direct connection to the supply network), it is used extensively as the principal drive not only in industry but in nearly all everyday applications, accounting for more than 90% of all motors. Although each manufacturer may use particular construction methods to improve the motor's efficiency, the basic construction characteristics have not changed and can be standardised according to the basic design norms of NEMA (National Electrical Manufacturers Association). As with every electric machine, it consists of mechanical and electrical parts and may also carry electronic protection systems. The mechanical parts comprise the stationary part (stator) and the rotating part (rotor). The stator carries the laminated armature core; in its slots the three-phase winding is placed, supplied by a three-phase current. The six ends of the three-phase winding, which usually forms a certain number of poles, terminate in the familiar arrangement at the motor's terminal box, where the windings are connected in star or in delta before power is applied. The squirrel-cage rotor carries a fitted shaft, the core with its winding and the cooling fan. The core of the squirrel-cage rotor is built from sheets of ferromagnetic material which are insulated from one another to limit magnetic losses. Along the outer surface of the core, slots are formed in which the rotor bars are placed; the type of slot depends on the rotor's characteristics and the motor's function. The rotor winding consists of the bars, on which the induction phenomenon takes place, and of two short-circuit rings through which the electric circuit of the rotor is closed. The shape of the squirrel-cage bars matches the slot type. The bars are placed in the slots of the rotor core and differ in number from those of the stator armature; they are usually skewed (by roughly 15°) with respect to the shaft axis. The two short-circuit rings are then fitted and welded so that magnetic noise is avoided while the motor is running.
The analysis of the electric motor is carried out first at the level of its electrical and mechanical parts. These are then broken down into subsystems according to the functions they perform. The grouping of the subsystems is done in such a way that each group performs a distinct function. The classification of the different groups of the system is done with a process chart, which distinguishes basic functional systems from support or protection systems. The examination of their correlation must therefore take into consideration the dependent or independent operation of the various groups, the degree of dependence they exhibit and, in general, their place in the process chart. Finally, every subsystem is composed of complex or simple parts. One objective of this study is to account for the relations between the different systems and their degree of dependence. The decomposition of the motor into its subsystems is performed with the use of UDNNs, and the relationships and interactions between them are recorded. The application of the UDNNs allows the recording, observation and control of both deterministic and vague data [17-23]. Figure 2 presents the complete decomposition of the squirrel-cage motor into its subsystems, while Tables 1 and 2 list, indicatively, certain faults and symptoms of subsystems of the electric motor. The faults are coded so as to indicate the region (subsystem) in which each of them appears, and Table 2 lists the symptoms that accompany each fault or dysfunction.
Figure 2. Decomposition of the electric motor into its subsystems.

Table 1. Faults of the subsystems of the electric motor.

Codes F1–F13: deterioration (F1); destruction (F2); bent axis (F3); crack (F4); twisting (F5); brake (F6); rotation friction (F7); deterioration of insulation (F8); destruction of insulation (F9); short-circuit (F10); interruption of continuity (F11); absence of oil (F12); reduced oil quantity (F13).

Codes F14–F26: deterioration of lubricant; interruption of current; adverse operating conditions; hours of operation; high temperature; high dust levels; numerous and excessive harmonics in the network; poor cooling design; noise resonance; reduction of rotor laminations; loss of air-gap eccentricity; poor coupling of the transmission.
Table 2. Symptoms of the faults for the subsystems of the electric motor.
Next, one of the principal subsystems of the electric motor, the bearing–rotor subsystem, is analysed with the application of Neural Networks, as it also appears in relevant studies by IEEE and EPRI [12]. As the statistical data in Figure 3 show, bearings and the rotor account for the largest share of failures, with a cumulative failure frequency of about 50%, while the stator accounts for 36%. The main causes of bearing failure are mechanical breakage and overheating, while stator windings fail mainly through insulation breakdown and overheating. On the basis of these data, we choose to study the mechanical breakage of bearings due to overheating, to develop a model of this subsystem with the use of Neural Networks and to present the corresponding numerical example in the application study.
Figure 3. Failure frequencies and their causes in an electric motor [37].
4 CASE STUDY
For this example of the method's application, a laboratory device was built so as to monitor continuously the functional state of the bearings of an electric motor and to take, through corrective control, the measures necessary to avoid a failure. The device collects data and measurements, examines them while controlling the corrective actions, and classifies the functional state of the bearings into the appropriate category with respect to the danger of overheating. The description of this device and the development of the mathematical model are presented below.
4.1 Description of the laboratory device
The experimental device includes a Siemens induction motor with a rated capacity of 13 hp at 1500 rpm. The load on the induction motor is applied by a hydraulic digital duplex unit with a maximum power of 18 hp. Furthermore, an RTD platinum 4B sensor was installed for continuous temperature observation. The sensor was connected to a computer, through which the data were received and recorded and the necessary control was performed by software written in MatLab. Figure 4 shows point A, where the RTD platinum 4B sensor was placed on the bearing housing in order to perform continuous temperature observation. The sensor is mounted according to DIN 43760, as shown in Figure 5.
Figure 4. Temperature sensor position
Figure 5. Sensor positioning according to DIN 43760.
Figure 6 presents the flow diagram of the laboratory device, which comprises the mechanical part and the modelling of the load, the data collection in the computer and its processing. Through the implemented NN system, the actual dynamic functional status of the point of interest, i.e. the bearing–rotor subsystem, and its thermal fatigue can be represented at all times [12].
Figure 6. Schematic flow diagram of the device.

Through the database, the Neural Network can classify this functional status into danger categories and select, from the available corrective actions, the right one with which to perform the necessary corrective control. The choice of the corrective action can be based on the sensitivity of the system or be given a certain weighting coefficient. The whole system thus works as an intelligent agent whose aim is to evaluate a dangerous functional status of the motor and to release the motor from it. With this procedure the failure prognosis system knows at all times the actual health of the equipment, taking into account its total fatigue from the environmental and operating conditions, while in parallel it predicts the remaining operating period until the next necessary repair or replacement of the equipment on the basis of objective criteria rather than only its nominal expiry date. The aims are to minimise or eliminate the unplanned stoppages of the motor, to improve the reliability of the production process and to reduce the cost of scheduled maintenance.
4.2 System Modeling
It is known that, during the operation of the motor, the bearings produce vibrations due to friction and that their temperature rises. The friction on the sliding surfaces produces a power loss PR which is converted into heat:
P_R = F \cdot \mu \cdot u \quad [\mathrm{W} = \mathrm{N\,m/s}]    (7)

where
F is the load on the bearing, in N,
μ is the coefficient of friction, calculated from the Sommerfeld number as \mu = 3\psi / S_o for S_o \le 1 and \mu = 3\psi / \sqrt{S_o} for S_o > 1,
u is the sliding velocity, in m/s:

u = \frac{d \cdot \pi \cdot n}{60 \times 1000}    (8)

where d is the shaft diameter in mm and n the rotational speed in rpm. The frictional heat is dissipated to the environment through the lubricant, the housing and the shaft, mainly by convection and partly by radiation. The temperature of the bearing, and of the oil, rises until thermal equilibrium is reached between the heat produced and the heat released [33]. Then the following applies:

P_R = P_\alpha = \alpha \cdot A \cdot (\theta - \theta_0) \quad [\mathrm{W} = \mathrm{N\,m/s}]    (9)

From this equation the operating temperature of the bearing, which equals the oil temperature, is calculated as

\theta = \frac{P_R}{\alpha \cdot A} + \theta_0 \quad [^{\circ}\mathrm{C}]    (10)

where
P_R is the heat produced, in W,
P_α is the heat released, in W,
α is the coefficient of heat transfer between the surface A of the bearing cover and the air, in W/m²·°C; here α = 15 to 20 W/m²·°C is taken for a slight breeze,
θ is the temperature of the bearing in operation, in °C (the allowable temperature range is defined in the application below),
θ_0 is the surrounding air temperature, in °C, assumed equal to 20 °C,
A is the total surface of the bearing which releases heat, in m².
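A compact numerical sketch of equations (7)–(10) is given below (Python, not part of the original paper). The values used correspond to the case study of Section 4 (F = 18000 N, d = 80 mm, n = 1500 rpm, α = 25 W/m²·°C, A = 0.2 m², θ0 = 20 °C); the friction coefficient is passed in directly rather than derived from the Sommerfeld number.

```python
import math

def sliding_velocity(d_mm, n_rpm):
    """Eq. (8): sliding velocity u in m/s for shaft diameter d (mm) and speed n (rpm)."""
    return d_mm * math.pi * n_rpm / (60 * 1000)

def friction_power(F_N, mu, u_ms):
    """Eq. (7): friction power loss P_R = F * mu * u, in W."""
    return F_N * mu * u_ms

def bearing_temperature(P_R, alpha, A, theta_0):
    """Eq. (10): steady-state bearing (and oil) temperature from the thermal
    equilibrium P_R = alpha * A * (theta - theta_0) of eq. (9)."""
    return P_R / (alpha * A) + theta_0

u = sliding_velocity(d_mm=80, n_rpm=1500)                      # approx. 6.3 m/s
P_R = friction_power(F_N=18000, mu=0.0032, u_ms=u)             # approx. 362-363 W
theta = bearing_temperature(P_R, alpha=25, A=0.2, theta_0=20)  # approx. 93 deg C
print(f"u = {u:.2f} m/s, P_R = {P_R:.0f} W, theta = {theta:.1f} deg C")
```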
In this application, the description of the neural network according to Figure 1 and the matching of its inputs are presented in Table 3.

Table 3. Neural network development.
X1 = θ, the temperature of the bearing in operation
X2 = P_R, the heat produced, in W
X3 = A, the total heat-releasing surface of the bearing
X4 = θ_0, the temperature of the surrounding air
X5 = α, the heat transfer coefficient between the surface A of the bearing cap and the surrounding air
The attached weights were defined by assigning equal weight to all inputs apart from the ambient temperature, whose weight was set to zero: w1 = w2 = w3 = w5 = 1, while w4 = 0. This was done because the ambient temperature is assumed to remain constant at 20 °C. The threshold θ was defined as a pure number which resulted from many measurements made on a particular electric motor, the results of which were registered in a database; it was set at θ = 400. A radial load F = 18000 N at n = 1500 rpm was applied to the plain bearing of the electric motor; the shaft diameter is d = 80 mm. For the Heaviside unit-step activation function it follows that f(x) = 0 if x ≤ 0 and f(x) = 1 if x > 0. State one (C1) means that the neuron fires, while state zero (C0) means that the neuron remains inactive. In practice, state 1 expresses that, due to the temperature rise, damage to the motor is possible; it is an unwanted state that must be removed. This state is defined as C1 and is called the initial or possible-damage state. State zero is defined as C0 and is called the healthy state. More than one operational condition can, however, be defined, depending on the danger rating of each (low, medium, high), together with the corresponding remedial actions for every condition, ranked by their effectiveness for that particular operational condition. Therefore, in order to smooth out state C1 and remove the danger to the whole system, the intelligent agent orders the oil pump to operate so as to cool the system sufficiently. The application to the «Bearing – Rotor» subsystem of the electric motor is analysed in Figure 7. At every stage of operation, depending on the conditions and the relevant database, the corresponding lifetime prediction and damage probability are evaluated. The statistical survey of SKF [37] shows that the lifetime of bearings operating beyond their normal temperature limits can be reduced, within the space of one hour, by up to a thousand times compared with the lifetime proposed by the manufacturer.
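The classification into the healthy state C0 and the possible-damage state C1 can be sketched as follows, using the weights and threshold quoted above (w1 = w2 = w3 = w5 = 1, w4 = 0, θ = 400); the direct use of the raw physical values as inputs and the oil-pump hook are simplifying assumptions for illustration, not the authors' actual control code.

```python
def classify_state(theta_bearing, P_R, A, theta_ambient, alpha,
                   weights=(1, 1, 1, 0, 1), threshold=400):
    """Step-neuron decision for the bearing-rotor subsystem; the inputs X1..X5
    follow Table 3, and the ambient temperature carries zero weight."""
    x = (theta_bearing, P_R, A, theta_ambient, alpha)
    u = sum(w * xi for w, xi in zip(weights, x))      # total impulse of the neuron
    return "C1" if u > threshold else "C0"            # C1 = possible damage, C0 = healthy

def corrective_action(state):
    """Placeholder for the intelligent agent's corrective control."""
    return "start the cooling-oil pump" if state == "C1" else "no action required"

state = classify_state(theta_bearing=93, P_R=363, A=0.2, theta_ambient=20, alpha=25)
print(state, "->", corrective_action(state))
```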
Figure 7. Functional Flow Diagram of the system.
4.3 Results
From the overheating control of the system performed by the intelligent agent, the following data arose, which agree with the theoretical solution of the system. The heat PR produced in the bearing due to friction is:
P_R = F \cdot \mu \cdot u = 18000 \times 0.0032 \times 6.3 = 363 \ \mathrm{W\ (N\,m/s)}

with

\mu = \frac{3\psi}{\sqrt{S_0}} = \frac{3 \times 0.0016}{\sqrt{2.3}} = 0.0032 \quad \text{for } S_0 > 1.0

The total heat-releasing surface A is given by the following formula, since the bearing is an upright (pedestal) lead–tin alloy type (according to DIN 1705):

A = 25 \cdot d \cdot b = 25 \times 0.08 \times 0.08 = 0.16 \ \mathrm{m}^2
The shaft surface that releases heat results from:

0.25 \cdot A = 0.25 \times 0.16 = 0.04 \ \mathrm{m}^2
Finally, the temperature θ of the bearing is determined from the following relation:

\theta = \frac{P_R}{\alpha \cdot A} + \theta_0 = \frac{363}{25 \times 0.2} + 20 = 93 \ ^{\circ}\mathrm{C}
Since the bearing is an upright lead–tin alloy type (according to DIN 1705), its permissible operating temperature lies between 70 and 90 °C. The intelligent agent therefore undertakes to start the oil pump, since at θ = 93 °C the total impulse of the NN reaches 425.2, well above the firing threshold. In order to bring the temperature back within the permitted operating limits (60 °C in this particular case), an oil pump supplying the necessary quantity of cooling oil Qk was used. The lubricating oil used was ISO VG 46 according to DIN 51519. The necessary discharge of the oil pump is derived from the following formula:
Q_k = \frac{P_R}{c \cdot \rho \cdot (\theta_2 - \theta_1)}

where
θ_1 is the oil inlet temperature,
θ_2 is the oil outlet temperature,
c is the specific heat of the lubricating oil (measured in this particular mechanism as 2000 Nm/kg·°C),
ρ is the oil density (measured in this particular mechanism as 900 kg/m³),
Q_k is the necessary quantity of cooling oil, in m³/s.
Therefore, according to the theoretical solution, the supply of cooling oil required from the pump, which is controlled by the PC through the Neural Network in the laboratory construction, is:

Q_k = \frac{P_R}{c \cdot \rho \cdot (\theta_2 - \theta_1)} = \frac{363}{1.8 \times 10^6 \times 15} = 0.00001344 \ \mathrm{m^3/s} = 0.81 \ \mathrm{l/min}
Here θ_2 = 60 °C is the oil outlet temperature, which equals the temperature of the bearing, and θ_1 = 45 °C is the oil inlet temperature; their difference θ_2 − θ_1 = 15 °C is a usual value. The result of using the oil pump (procedure C1 of the NN) was the drop of the bearing temperature and its maintenance at 60 °C. The required (minimum) quantity of lubricating oil Q, which expresses an adequate flow of lubricant over the sliding surfaces from a lubricating ring, is:

Q \approx 0.0003 \cdot d^2 \cdot b \cdot n \cdot \psi = 0.0003 \times 8^2 \times 8 \times 1500 \times 0.0016 = 0.37 \ \mathrm{l/min}

where
d is the bearing diameter, in cm,
b is the bearing width, in cm,
n is the rotational speed, in rpm,
ψ is the relative bearing clearance.
Since Q_k = 0.81 l/min > Q = 0.37 l/min, the pump used is capable of cooling the bearings adequately.
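The cooling-oil check of this section can be reproduced with the short sketch below; the numerical values are those of the worked example, while the conversion to litres per minute is added here.

```python
def required_cooling_flow(P_R, c, rho, t_out, t_in):
    """Q_k = P_R / (c * rho * (theta2 - theta1)), in m^3/s."""
    return P_R / (c * rho * (t_out - t_in))

def minimum_ring_flow(d_cm, b_cm, n_rpm, psi):
    """Empirical minimum lubricating-ring flow Q ~ 0.0003 * d^2 * b * n * psi, in l/min."""
    return 0.0003 * d_cm**2 * b_cm * n_rpm * psi

Qk = required_cooling_flow(P_R=363, c=2000, rho=900, t_out=60, t_in=45)  # m^3/s
Qk_lpm = Qk * 1000 * 60                                                  # convert to l/min
Q_lpm = minimum_ring_flow(d_cm=8, b_cm=8, n_rpm=1500, psi=0.0016)
print(f"Qk = {Qk_lpm:.2f} l/min, Q = {Q_lpm:.2f} l/min, pump adequate: {Qk_lpm > Q_lpm}")
```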
5 CONCLUSIONS
This paper proposes the development of a new model for damage forecasting in an electric motor which at the same time provides an overall view of its health status. The model receives information from special sensors, while the processing of the measurements is performed through the development of a Hybrid Dynamic Neural Network. The possible damages are stored in the database, together with the full set of possible remedial actions and their corresponding priority, which reflects the contribution of each action to the desired result. In this way an intelligent agent is implemented that can evaluate at any time the risk of every operational status, classify it into the relevant category and execute the necessary remedial actions so as to remove the electric motor from the eventual damage. As a result, the duration of unplanned stoppages of the motor is minimised, its functional stress during its lifetime is reduced, and the reliability and efficiency of the production process are improved. The method is general and flexible and, with the right modifications, can be implemented in every subsystem of the motor. The extension of the method to other subsystems can thus also include dynamic vibration and noise control, visual–spectrographic control and the treatment of the different ambient conditions in which the electric motor has to work, such as increased humidity, dust, the existence of network harmonics, etc. The Neural Network is an intelligent system and its advantage, in combination with the database, is that it can be "educated" and can enrich the set of operational and ambient states that it administers, thereby continuously improving the produced forecast and the overall management of the system's risk. There is, of course, a large gap between the theoretical work and the needs of application within effective running time that still has to be covered. Among the problems to be faced are the increased number of sensors and measurements needed to cover every weak spot in the different sections and subsystems, as well as the choice of the right position and manner of placing them, so as to obtain a closer and more exact picture of the system's state. A method developed to restrict the necessary measurement points will have to determine the parameters that a) mostly affect the system and are indicative of its state, and b) consider that adjacent or functionally strongly connected points or components act, and are stressed, similarly. The development of the method and its application to the particular example of the bearing–rotor subsystem, with dynamic control of its thermal stress, shows the advantages of the proposed method and of the use of Dynamic Neural Networks in failure prediction systems, in the risk assessment of the operational status of an electric motor and in the automated corrective control that removes the electric motor from an unwanted operational status.
6 REFERENCES

1. J. D. Patton, Preventive Maintenance. Instrument Society of America, 1983.
2. S. S. Rao, Reliability-Based Design. McGraw-Hill, 1992.
3. D. Chelidze and J. Cusumano, "A dynamical systems approach to failure prognosis," J. Vib. Acoust., vol. 126, no. 1, pp. 1–7, 2004.
4. W. Wang, F. Golnaraghi, and F. Ismail, "Condition monitoring of a multistage printing press," J. Sound Vib., vol. 270, no. 5-6, pp. 755–766, 2004.
5. W. Wang, F. Ismail, and F. Golnaraghi, "A neuro-fuzzy approach for gear system monitoring," IEEE Trans. Fuzzy Syst., vol. 12, no. 5, pp. 710–723, Oct. 2004.
6. S. K. Yang and T. S. Liu, "A Petri net approach to early failure detection and isolation for preventive maintenance," Qual. Reliab. Eng. Int., vol. 14, pp. 319–330, 1998.
7. C. Li and H. Lee, "Gear fatigue crack prognosis using embedded model, gear dynamic model and fracture mechanics," Mech. Syst. Signal Process., vol. 20, pp. 836–846, 2005.
8. P. McFadden and J. Smith, "Vibration monitoring of rolling element bearings by the high frequency resonance technique—A review," Tribology Int., vol. 17, no. 1, pp. 3–10, 1984.
9. N. Tandon and A. Choudhury, "A review of vibration and acoustic measurement methods for the detection of defects in rolling element bearings," Tribology Int., vol. 32, pp. 469–480, 1999.
10. J. Jang, C. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing. Englewood Cliffs, NJ: Prentice-Hall, 1997.
11. W. Wang, F. Golnaraghi, and F. Ismail, "Prognosis of machine health condition using neuro-fuzzy systems," Mech. Syst. Signal Process., vol. 18, no. 4, pp. 813–831, 2004.
12. "An intelligent system for machinery condition monitoring," IEEE Trans. Fuzzy Syst., doi: 10.1109/TFUZZ.2007.896237, January 17, 2006.
13. R. Patton, P. Frank, and R. Clark, Issues of Fault Diagnosis for Dynamic Systems. New York: Springer-Verlag, 2000.
14. J. Korbicz, Fault Diagnosis: Models, Artificial Intelligence, Applications. New York: Springer-Verlag, 2004.
15. M. Pourahmadi, Foundation of Time Series Analysis and Prediction Theory. New York: Wiley, 2001.
16. J. Korbicz, Fault Diagnosis: Models, Artificial Intelligence, Applications. New York: Springer-Verlag, 2004.
17. F. Zhao, X. Koutsoukos, H. Haussecker, J. Reich, and P. Cheung, "Monitoring and fault diagnosis of hybrid systems," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 35, no. 6, pp. 1225–1240, Dec. 2005.
18. J. Cusumano, D. Chelidze, and A. Chatterjee, "Dynamical systems approach to damage evolution tracking, Part 2: Model-based validation and physical interpretation," J. Vib. Acoust., vol. 124, no. 2, pp. 258–264, 2002.
19. Y. Murphey, J. Crossman, Z. Chen, and J. Cardillo, "Automotive fault diagnosis—Part II: A distributed agent diagnostic system," IEEE Trans. Veh. Technol., vol. 52, no. 4, pp. 1076–1098, Jul. 2003.
20. I. Rish, M. Brodie, S. Ma, N. Odintsova, A. Beygelzimer, G. Grabarnik, and K. Hernandez, "Adaptive diagnosis in distributed systems," IEEE Trans. Neural Netw., vol. 16, no. 5, pp. 1088–1109, Sep. 2005.
21. D. Quinn, G. Mani, M. Kasarda, T. Bash, D. Inman, and R. Kirk, "Damage detection of a rotating cracked shaft using an active magnetic bearing as a force actuator—analysis and experimental verification," IEEE/ASME Trans. Mechatronics, vol. 10, no. 6, pp. 640–647, Dec. 2005.
22. H. Ishibuchi and T. Nakashima, "Effect of rule weights in fuzzy rule-based classification systems," IEEE Trans. Fuzzy Syst., vol. 9, no. 4, pp. 506–515, Aug. 2001; M. Kowal and J. Korbicz, "Robust fault detection using neuro-fuzzy networks," in Proc. 16th IFAC World Congr., Prague, Czech Republic, 2004, CD-ROM.
23. J. Wang and C. Lee, "Self-adaptive neuro-fuzzy inference systems for classification applications," IEEE Trans. Fuzzy Syst., vol. 10, no. 6, pp. 790–802, Dec. 2002.
24. W. Wang, "An adaptive predictor for dynamic system forecasting," Mech. Syst. Signal Process., vol. 21, no. 2, pp. 809–823, 2007.
25. W. Wang, F. Ismail, and F. Golnaraghi, "Assessment of gear damage monitoring techniques using vibration measurements," Mech. Syst. Signal Process., vol. 15, no. 5, pp. 905–922, 2001.
26. Y. Li, T. Kurfess, and S. Liang, "Stochastic prognostics for rolling element bearings," Mech. Syst. Signal Process., vol. 14, no. 5, pp. 737–762, 2000.
27. P. Tse and D. Atherton, "Prediction of machine deterioration using vibration based fault trends and recurrent neural networks," J. Vib. Acoust., vol. 121, no. 3, pp. 355–362, 1999.
28. A. Ray and S. Tangirala, "Stochastic modeling of fatigue crack dynamics for online failure prognostics," IEEE Trans. Control Syst. Technol., vol. 4, no. 4, pp. 443–451, Jul. 1996.
29. A. Atiya, S. El-Shoura, S. Shaheen, and M. El-Sherif, "A comparison between neural-network forecasting techniques—Case study: River flow forecasting," IEEE Trans. Neural Netw., vol. 10, no. 2, pp. 402–409, Mar. 1999.
30. G. Corani and G. Guariso, "Coupling fuzzy modeling and neural networks for river flood prediction," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 35, no. 3, pp. 382–390, Aug. 2005.
31. V. Giurgiutiu, "Current issues in vibration-based fault diagnostics and prognostics," in Proc. SPIE 9th Int. Symp. Smart Structures and Materials, San Diego, CA, 2002, pp. 17–21.
32. D. McFadden, "Detecting fatigue cracks in gears by amplitude and phase demodulation of the meshing vibration," J. Vib., Acoust., Stress, Reliab. Design, vol. 108, pp. 165–170, 1986.
33. N. Nikolaou and I. Antoniadis, "Rolling element bearing fault diagnosis using wavelet packets," Nondestructive Test. Eval. Int., vol. 35, pp. 197–205, 2002.
34. M. Figueiredo, R. Ballini, S. Soares, M. Andrade, and F. Gomide, "Learning algorithms for a class of neurofuzzy network and applications," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 34, no. 3, pp. 293–301, Aug. 2004.
35. D. Nauck, "Adaptive rule weights in neuro-fuzzy systems," Neural Comput. Appl., vol. 9, pp. 60–70, 2000.
36. H. Ishibuchi and T. Yamamoto, "Rule weight specification in fuzzy rule-based classification systems," IEEE Trans. Fuzzy Syst., vol. 13, no. 4, pp. 428–435, Oct. 2005.
37. G. Schram, Statistical Analysis. SKF, May 2003.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
AUTOMATED DIAGNOSTIC APPROACHES FOR DEFECTIVE ROLLING ELEMENT BEARINGS USING MINIMAL TRAINING PATTERN CLASSIFICATION METHODS K. C. Gryllias, C. Yiakopoulos and I. Antoniadis School of Mechanical Engineering, Machine Design and Control Systems Section, Dynamics & Structures Laboratory, Athens, 15773, Greece.
Rolling element bearings constitute one of the most widely used industrial machine elements, forming the interface between the stationary and the rotating part of the machine. Owing to their importance, a plethora of monitoring methods and fault diagnosis procedures have been developed in order to reduce maintenance costs, improve productivity and prevent malfunctions and failures during operation which could lead to machine downtime. In this direction, among the different automatic diagnostic methods, the Support Vector Machine (SVM) method has been shown to present a number of advantages. The Support Vector Machine is a relatively new computational learning method based on Statistical Learning Theory, combining fundamental concepts and principles related to learning with a well-defined formulation and a self-consistent mathematical theory. The key aspects of the use of SVMs as a rolling element bearing health monitoring tool are the lack of actual experimental data, the optimal selection of the type and the number of input features, and the correct selection of the kernel function and its corresponding parameters. A large number of input features have been proposed, falling into two broad categories: A) traditional statistical features of the signal in the time domain, such as mean value, RMS value, variance, skewness, kurtosis, etc.; B) frequency-domain indices, such as energy values obtained at characteristic frequency bands of the measured and the demodulated signals. In this paper, the structure and the performance of a Support Vector Machine based approach for rolling element bearing fault diagnosis are presented. The main advantage of this method is that the training of the SVM is based on a model describing the dynamic behavior of a defective rolling element bearing, thus enabling the direct application of the SVM to experimental measurements of defective bearings without the need to train it with experimental data of a defective bearing. Key Words: Condition Monitoring, Fault Detection, Support Vector Machines, Vibration Analysis, Rolling Element Bearings.
1 INTRODUCTION
Machine condition monitoring and fault diagnosis constitute an extremely important issue in the manufacturing industry, since they bring reduced maintenance costs, improved productivity, and increased machine availability and safety. The capability to detect quickly, accurately and easily the existence and severity of a fault in an installation during operation is very important, as an unexpected machine failure can lead to unacceptably long maintenance stops. An impressive number of methods have been presented and developed as part of industrial maintenance systems based on intelligent and/or expert systems, such as artificial neural networks, fuzzy expert systems, case-based reasoning and random forests. Artificial intelligence techniques are combined with expert systems in an attempt to transfer expertise from humans to computers. Rolling element bearings are among the most common as well as most important components of rotating machinery. It is therefore essential to be able to detect the existence and severity of a rolling element bearing defect automatically and reliably. The use of vibration and acoustic emission signals is extremely common in the field of condition monitoring of rotating machinery, since these signals can reveal incipient failures of machine components. Although careful inspection of the time- and frequency-domain features of the measured signals may be adequate to identify the faults, there is a real need for a reliable, fast and automated procedure for health monitoring and fault diagnosis.
Artificial neural networks (ANNs) have been widely applied to automatic fault detection and machine health monitoring, treating these tasks as pattern recognition and classification problems. Traditional neural network approaches have limited generalization capability, giving rise to models that can overfit the data. A strong drawback of ANNs, as of all supervised learning methods, is the requirement for a relatively large number of training samples; empirical or modelled data are frequently used for training, and an analytical simulation model of the machine is usually lacking. Unlike most classification methods, Support Vector Machines (SVMs) do not require a large number of training samples [1, 2] and can solve the learning problem even when only a small amount of training data is available. Because sufficient fault samples are hard to obtain in practice, the SVM is introduced into health monitoring for its high accuracy and good generalization with small sample sets. The possibilities of using SVMs in machine health monitoring have recently been explored, and different procedures have been presented [3-12] which use statistical features based on moments, cumulants and other statistics of the time series and spectra of vibration data for the fault detection and monitoring of rolling element bearings. Statistical features of the signals, both of the original ones and of signals after a certain degree of preprocessing such as differentiation, integration, low- and high-pass filtering, or certain spectral data of the signals, have been used as input features [4, 7]. However, the key issues in the use of SVMs as a machine health monitoring tool remain the lack of actual training data, the proper selection of the type and number of input features, and the proper selection of the kernel function and its corresponding parameters. In this paper, in order to overcome the lack of real training data and to avoid resorting to novelty detection, a stochastic simulation of the dynamic behavior of defective rolling element bearings is used, and appropriate features of the simulated signals are used for the training of the SVMs. The proposed fault diagnosis method consists of three steps. In the first step, a stochastic simulation based on a bearing model is used to produce fault signals. The model is tuned according to the running speed and the ball pass frequencies of the outer and the inner race of the bearing concerned, while the normal condition is simulated using a white noise signal with amplitude tuned to the vibration level of the operating bearing. Since the characteristic defect frequencies in rotating-machine vibration spectra are proportional to the rotational speed, the vibration data are preprocessed using order analysis; the basic concept of order analysis is the transformation of the traditional Fourier spectrum of the signal from a frequency spectrum into an order spectrum. The frequency-domain features of Table 1 are then extracted for the training of the SVMs. In the second step two Support Vector Machines are trained. The first SVM is used to detect whether a fault exists or not, and for this purpose only three features are used. The second SVM is used to identify the type of the fault (outer or inner race), using a set of 15 frequency-domain features. This setup is necessary as an SVM can only deal with two classes.
Moreover, such a structure simplifies the procedure and speeds up the fault detection. In the third step, vibration signals measured directly on the machine under condition monitoring are introduced into the trained SVMs for fault diagnosis.
2 DEFECTIVE ROLLING ELEMENT BEARING VIBRATION MODEL
The dynamic behavior of rolling element bearings and machines under localized defects has been the subject of intensive research, leading to a number of well-established models [13, 14]. As a result, instead of experimental data, which are difficult to obtain, a model can be used in order to train the Support Vector Machine. Following essentially the same concepts as in [13, 14], the basic elements of such a model are as follows. The repetitive impacts produced by a localized bearing defect can be described by a train of Dirac delta functions δ(t), with the period between two successive impulses equal to the reciprocal T_d of the characteristic ball pass frequency of the outer or the inner race (BPFO/BPFI) of the rolling element bearing, depending on the type of defect. These two frequencies are proportional to the shaft rotation speed f_shaft, and their value depends on the geometric characteristics of the bearing. The amplitude of the impacts depends mainly on the load distribution around the circumference of the bearing, as well as on other parameters, such as the variation in the dynamic stiffness of the assembly, the waviness of the rolling elements and races, and the existence of off-sized balls in the ball complement. Under these assumptions, the train of impacts can be expressed as:

d(t) = d_0\, q(t) \sum_{k=0}^{N} \delta(t - kT_d) \quad \text{for an inner race defect},
d(t) = d_0 \sum_{k=0}^{N} \delta(t - kT_d) \quad \text{for an outer race defect}    (1)

where d_0 is the amplitude of the impulse force, characterizing the severity of the defect, and q(t) is the distribution of the load around the rolling element bearing under radial load, which is typically approximated by the well-known Stribeck equation.
In parallel, rolling element bearings present a slip motion which introduces nonlinear effects into the system. Consequently, the series of impacts is considered to present a random modulation in amplitude and a time lag between successive impacts due to the presence of slip. The slip τ_k is typically assumed to have zero mean and a normal (Gaussian) probability density function. When the impulse train is applied to the rolling element bearing, it excites structural resonances. The impulsive structural response of the system to each impulse can be expressed by the following function:

s(t) = \sum_{i=1}^{M} B_i\, e^{-2\pi \zeta_i f_{ni} t} \cos(2\pi f_{0i} t)    (2)

f_{0i} = f_{ni} \sqrt{1 - \zeta_i^2}    (3)

where i = 1, ..., M are the excited modes and, for each mode i, f_{ni} is the resonance frequency and ζ_i is the damping factor. Therefore, the dynamic response x(t) resulting from an induced defect in the bearing can be expressed as:

x(t) = q(t) \left[\sum_{k=0}^{N} A_k\, \delta(t - kT_d - \tau_k)\right] * \left[\sum_{i=1}^{M} B_i\, e^{-2\pi \zeta_i f_{ni} t} \cos(2\pi f_{0i} t)\right] + n(t) \quad \text{for an inner race defect},
x(t) = \left[\sum_{k=0}^{N} A_k\, \delta(t - kT_d - \tau_k)\right] * \left[\sum_{i=1}^{M} B_i\, e^{-2\pi \zeta_i f_{ni} t} \cos(2\pi f_{0i} t)\right] + n(t) \quad \text{for an outer race defect}    (4)

where the symbol * denotes convolution and n(t) is an additive background noise.
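A minimal numerical sketch of the model of equations (1)–(4) is given below (Python); the single resonance mode, the Gaussian slip and all parameter values are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

def simulate_defect_signal(fs=12000, T=1.0, bpf=107.0, f_n=3000.0, zeta=0.05,
                           amp=1.0, slip_std=0.01, noise_std=0.05, seed=0):
    """Impulse train at the ball-pass frequency with random slip (eq. 1),
    convolved with one decaying resonance response (eqs. 2-3), plus noise (eq. 4)."""
    rng = np.random.default_rng(seed)
    n = int(fs * T)
    x = np.zeros(n)
    Td = 1.0 / bpf                                   # period between successive impacts
    k_times = np.arange(0.0, T, Td)
    k_times = k_times + rng.normal(0.0, slip_std * Td, size=k_times.size)  # slip tau_k
    idx = np.clip((k_times * fs).astype(int), 0, n - 1)
    x[idx] = amp                                     # train of weighted Dirac impulses
    t = np.arange(0.0, 0.05, 1.0 / fs)               # 50 ms impulse-response window
    f0 = f_n * np.sqrt(1.0 - zeta**2)                # damped natural frequency, eq. (3)
    s = np.exp(-2 * np.pi * zeta * f_n * t) * np.cos(2 * np.pi * f0 * t)  # eq. (2)
    y = np.convolve(x, s)[:n]                        # convolution of eq. (4)
    return y + rng.normal(0.0, noise_std, size=n)    # additive background noise n(t)

signal = simulate_defect_signal()                    # one simulated defect-type record
```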
3 OVERVIEW OF SUPPORT VECTOR MACHINES
The theory of Support Vector Machines was systematically presented by Vapnik and Chervonenkis in the late 1960s. However, it was not until the mid-1990s, with the greater availability of computing power, that SVM methods began to emerge, leading to many practical applications. The basic idea of the SVM [15-17] is to transform the signal to a higher-dimensional feature space and to find the optimal hyperplane in that space that maximizes the margin between the classes. Briefly, the SVM solves a binary problem in which the data are separated by a hyperplane. The hyperplane is defined through the use of support vectors, which are a subset of the data of both classes and define the boundary between the two classes.
Figure 1: Data Classification by Support Vector Machine. Briefly, SVM can be considered as a method to create a line or hyperplane between two sets of data for classification and regression. In a two-dimensional case, the action of the SVM can be easily explained without any loss of generality. In Fig. 1 a series of points for two different classes of data are shown, circles (class A) and squares (class B). The SVM attempts to draw a linear boundary (solid line) between the two different classes, and orient it in such a way that the margin (represented by dotted lines) is maximized. In other words, the SVM tries to orient the boundary in such a way that the distance between the boundary and the nearest data point in each class is maximal. The boundary is then placed in the middle of this margin between the two
points. The nearest data points are used to define the margins and are known as support vectors (SV, represented by grey circle and square). Once the support vectors are selected, the rest of the feature set can be discarded, as the support vectors contain all the necessary information for the classifier. The boundary can be expressed in terms of
(w \cdot x) + b = 0, \quad w \in \mathbb{R}^N, \; b \in \mathbb{R}    (5)

where the vector w defines the boundary, x is the input vector of dimension N and b is a scalar threshold. At the margins, where the SVs are located, the equations for class A and B, respectively, are

(w \cdot x) + b = +1 \quad \text{and} \quad (w \cdot x) + b = -1    (6)

As SVs correspond to the extremities of the data for a given class, a decision function can be created to specify whether a given data point belongs to either A or B. This is defined as:

f(x) = \mathrm{sign}\big((w \cdot x) + b\big)    (7)

The optimal hyperplane can be obtained as the solution of the optimization problem: minimize

\tau(w) = \tfrac{1}{2}\,\|w\|^2    (8)

subject to

y_i\big((w \cdot x_i) + b\big) \ge 1, \quad i = 1, \dots, l    (9)

where l is the number of training sets. The solution of the constrained quadratic programming (QP) optimization problem can be written as:

w = \sum_i v_i x_i    (10)

where x_i are the SVs obtained from training. Substituting (10) in (7), the decision function is obtained as:

f(x) = \mathrm{sign}\Big(\sum_{i=1}^{l} v_i (x \cdot x_i) + b\Big)    (11)

However, cases exist where a linear boundary in the input space is not able to separate the two classes accurately. In this case it is possible to create a hyperplane that allows linear separation in a higher dimension (corresponding to a curved surface in the lower-dimensional input space). In SVMs, this is achieved through the use of a transformation φ(x) which maps the data from the N-dimensional input space to a Q-dimensional feature space:

s = \phi(x)    (12)

where x \in \mathbb{R}^N and s \in \mathbb{R}^Q. Substituting the transformation in (11) gives

f(x) = \mathrm{sign}\Big(\sum_{i=1}^{l} v_i \big(\phi(x) \cdot \phi(x_i)\big) + b\Big)    (13)

The transformation into the higher-dimensional feature space is relatively computation-intensive. A kernel can be used to perform this transformation and the dot product in a single step, provided the transformation can be replaced by an equivalent kernel function; this reduces the computational load while retaining the effect of the higher-dimensional transformation. The construction and selection of the kernel function are very important to the SVM. Any function that satisfies Mercer's theorem can be used as a kernel function to compute a dot product in feature space; in practice the kernel function is usually given directly. Different kernel functions, such as the polynomial, sigmoid and radial basis function (RBF) kernels, are used in SVMs. The definition of a legitimate kernel function is given by Mercer's theorem: the function must be continuous and positive definite. The kernel function K(x, y) is defined as:

K(x, y) = \phi(x) \cdot \phi(y)    (14)

The decision function is accordingly modified as:

f(x) = \mathrm{sign}\Big(\sum_{i=1}^{l} v_i K(x, x_i) + b\Big)    (15)

The parameters v_i are used as weighting factors to determine which of the input vectors are actually support vectors (0 < v_i < \infty). Some common kernel functions are the following.

The linear kernel:

K(x, y) = x \cdot y    (16)

The polynomial kernel:

K(x, y) = (x \cdot y + 1)^p    (17)

where p is the order of the polynomial.

The radial basis function (RBF) kernel:

K(x, y) = \exp\big(-\|x - y\|^2 / 2\sigma^2\big)    (18)

The sigmoid kernel:

K(x, y) = \tanh(\nu\, x \cdot y + c)    (19)
The kernel function reflects the geometric relationship between the input vector and the support vector, as well as the similarity of the features of the faults. For example, the polynomial kernel function describes the similarity of two vectors, since the dot product expresses their canonical correlation; choosing a different order p results in a different similarity measure and therefore in different results. For cases where there is an overlap between the classes, with non-separable data, the range of the parameters v_i may be limited to reduce the effect of outliers on the boundary defined by the SVs. For the non-separable case, the constraint is modified to 0 < v_i < C. C is a penalty constant for the sample points mis-separated by the optimal separating plane; its role is to strike a proper balance between the computational complexity and the separation error. For the separable case C is infinite, while for the non-separable case it may be varied depending on the number of allowable errors in the trained solution: few errors are permitted for a high C, while a low C allows a higher proportion of errors in the solution. The generalization capability of the SVM is controlled by a few free parameters, such as the limiting term C and the kernel parameters, e.g. the RBF width σ.
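As an illustration of the training step, the sketch below fits a single SVM with the linear kernel and the penalty C = 100 quoted later in the paper; scikit-learn is assumed here as the implementation, and the toy feature vectors merely stand in for the energy features of Table 1.

```python
import numpy as np
from sklearn.svm import SVC  # scikit-learn assumed as the SVM implementation

# Toy feature vectors standing in for the energy features of Table 1:
# one cluster for normal signals, one for faulty signals.
rng = np.random.default_rng(0)
X_normal = rng.normal(loc=0.0, scale=0.3, size=(50, 3))
X_faulty = rng.normal(loc=1.5, scale=0.3, size=(50, 3))
X = np.vstack([X_normal, X_faulty])
y = np.array([+1] * 50 + [-1] * 50)     # +1 = normal, -1 = faulty

# Linear kernel and penalty parameter C = 100, as selected in this work.
clf = SVC(kernel="linear", C=100.0)
clf.fit(X, y)
print("support vectors per class:", clf.n_support_)
print("prediction for a faulty-looking sample:", clf.predict([[1.4, 1.6, 1.5]]))
```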
4 BEARING FEATURE SELECTION AND EXTRACTION
The dynamic model for the vibration response of defective rolling element bearings described in the previous section predicts that bearing defects cause impacts at characteristic "defect frequencies", governed by the rotating speed of the machine and the geometry of the bearing, which in turn excite various machine natural frequencies. As a result, a "spiky" behavior is generated in the time-domain signal, while in the frequency domain entire high-frequency regions around the excited natural frequencies are dominated by characteristic defect-frequency sidebands. Owing to this characteristic dynamic behavior, different features and feature extraction methods have been proposed, including statistical signal analysis in the time domain, low- and high-pass filtering, normalization, time differentiation and integration of the signals, the Fourier transform and the wavelet transform. Characteristic statistical time-domain feature parameters are the kurtosis, skewness, variance and RMS of a signal; skewness and kurtosis in particular are considered especially appropriate for "spiky" signals, such as those characterizing the behavior of defective rolling element bearings, and a number of other time-domain features can be considered in addition. As an alternative to time-domain features, frequency-domain features are considered in this paper, since taking into account the energy in well-specified frequency bands can lead to a more accurate diagnosis and monitoring of the operating condition of rotating machinery. Especially for rolling element bearings, due to the presence of characteristic sidebands of the inner or outer race defect frequencies around the structural natural frequencies, a number of demodulation methods have been proposed. Demodulation or enveloping based methods offer a stronger and more reliable diagnostic potential, since they are based on a more solid physical background; the corresponding physical mechanism is described in [14]. The goal of enveloping is first to isolate the measured signal in a relatively narrow frequency band around a specific natural frequency using a band-pass filter, and then to demodulate it in order to produce a low-frequency signal called the "envelope". The carrier signals are removed, which decreases the influence of irrelevant information. The envelope signal is obtained by applying the well-known Hilbert transform. The Hilbert envelope spectrum is given by:
h(f) = \int_{-\infty}^{+\infty} \sqrt{x^2(t) + H^2[x(t)]}\; e^{-j 2\pi f t}\, dt    (20)

where H[x(t)] is the Hilbert transform of the series x(t):

H[x(t)] = \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{x(\tau)}{t - \tau}\, d\tau    (21)
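One possible implementation of the enveloping step of equations (20)–(21) is sketched below in Python; the text specifies only the high-frequency band above fH and the use of the Hilbert transform, so the filter order, the band-energy width and the normalization details here are assumptions.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def envelope_spectrum(x, fs, f_low=1000.0):
    """High-pass the signal above f_low (the band excited by the bearing impacts),
    take the Hilbert envelope sqrt(x^2 + H[x]^2) and return its one-sided spectrum."""
    b, a = butter(4, f_low / (fs / 2.0), btype="high")
    xf = filtfilt(b, a, x)
    env = np.abs(hilbert(xf))                          # Hilbert envelope
    spec = np.abs(np.fft.rfft(env - env.mean())) / len(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return freqs, spec

def band_energy(freqs, spec, f0, width=2.0):
    """Energy of the envelope spectrum in a narrow band around frequency f0 (Hz)."""
    mask = (freqs >= f0 - width) & (freqs <= f0 + width)
    return float(np.sum(spec[mask] ** 2))

# Example usage for a record x sampled at 12 kHz (hypothetical BPFO of 105 Hz):
# freqs, spec = envelope_spectrum(x, fs=12000)
# e_bpfo = band_energy(freqs, spec, f0=105.0)
```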
Afterwards, the vibration energies at characteristic frequency bands of the envelope spectrum are calculated. The proposed indices, based on vibration energies in the frequency domain, which are to be used as input features, are presented in Table 1.
Table 1: Frequency domain based vibration energy features.

Envelope signal:
• Shaft frequency: 1x
• 2nd harmonic of shaft frequency: 2x
• 3rd harmonic of shaft frequency: 3x
• 4th harmonic of shaft frequency: 4x
• Bearing outer race defect frequency: BPFO
• 2nd harmonic of bearing outer race defect frequency: 2xBPFO
• 3rd harmonic of bearing outer race defect frequency: 3xBPFO
• 4th harmonic of bearing outer race defect frequency: 4xBPFO
• Bearing inner race defect frequency: BPFI
• 2nd harmonic of bearing inner race defect frequency: 2xBPFI
• 3rd harmonic of bearing inner race defect frequency: 3xBPFI
• 4th harmonic of bearing inner race defect frequency: 4xBPFI
• Sum of shaft harmonics: SSHE = \sum_{k=1}^{4} kx
• Sum of bearing outer race defect harmonics: SORH = \sum_{k=1}^{4} k\,\mathrm{BPFO}
• Sum of bearing inner race defect harmonics: SIRH = \sum_{k=1}^{4} k\,\mathrm{BPFI}
• Sum of bearing outer and inner race defect harmonics: SIORH = \sum_{k=1}^{4} k\,\mathrm{BPFO} + \sum_{k=1}^{4} k\,\mathrm{BPFI}

Raw signal:
• Sum of shaft harmonics: SSHR = \sum_{k=1}^{4} kx
• High frequency energy: HFE = \int_{f_H}^{f_s/2} X^2(f)\, df
These indices are energies in the spectrum of the signal or of its envelope at the following characteristic frequency bands:
• The shaft rotating speed and its harmonics up to fourth order.
• The Ball Passing Frequency Outer Race (BPFO) and its harmonics up to fourth order. Peaks at the BPFO frequency harmonics characterize the existence of an outer race fault.
• The Ball Passing Frequency Inner Race (BPFI) and its harmonics up to fourth order. Peaks at the BPFI frequency harmonics characterize the existence of an inner race fault.
The first sixteen energies in Table 1 are derived from the spectrum of the envelope of the signal and are normalized by the total energy of the demodulated signal; the last two energies are derived from the spectrum of the raw signal and are normalized by the corresponding total energy of the raw signal. The main advantage of normalization is that it prevents attributes with large numeric ranges from dominating those with smaller ranges. It also avoids numerical difficulties during the calculation: kernel values usually depend on the inner products of the feature vectors (e.g. the linear and the polynomial kernel), and large attribute values can cause numerical problems. The lower frequency fH, used in the definition of the HFE index in Table 1, is set equal to 1000 Hz, since this frequency is considered to characterize the typical lower limit of the high-frequency bands excited by the bearing impulses, while fs/2 denotes the Nyquist frequency. The enveloping procedure takes place in the frequency band from fH up to fs/2. In the proposed approach, two two-class classifiers are arranged as a binary tree in order to form a multi-class fault diagnosis system. The two Support Vector Machines are trained using a number of simulated signals presenting an outer or an inner race defect, produced by the defective rolling element bearing vibration model, while the healthy condition is represented by a number of white noise signals. The signals are preprocessed using order analysis, in order to overcome the "spectral smearing" caused by the nonstationarity of the signals, and the energy features presented in Tables 1 and 2 are derived.
Table 2: Frequency domain based vibration energy features from Table 1, as used in the two stages of the proposed SVM-based fault classification.

Parameters used in the first stage: SSHR, SIORH, HFE
Parameters used in the second stage: 1x, 2x, 3x, 4x, BPFO, 2xBPFO, 3xBPFO, 4xBPFO, BPFI, 2xBPFI, 3xBPFI, 4xBPFI, SSHE, SORH, SIRH
The first SVM is trained to detect whether a fault exists or not; for this purpose only three features are used, as shown in Table 2. When the test signal is normal, the output of SVM 1 is +1 and the classification process ends; otherwise the output is set to -1 and the fault diagnosis is passed to SVM 2. The second SVM is used to identify the type of the fault (outer or inner race); in this case a set of 15 frequency-domain features is used, as shown in Table 2. When the test signal presents an inner race fault the output of SVM 2 is set to -1, otherwise it is set to +1. This setup is necessary as an SVM can only deal with two classes, and such a structure moreover simplifies the procedure and speeds up the fault detection. The measured signals are then preprocessed and the energy features of Tables 1 and 2 are extracted; the data are fed consecutively to the trained SVM 1 and SVM 2, and the outputs of the Support Vector Machines reveal the condition of the tested bearing. The selection of the type of the kernel function and its parameter, as well as the choice of the penalty factor C, exert a considerable influence on the performance of the SVM. In practice the parameter C is varied over a wide range of values; its role is to strike a proper balance between the computational complexity and the separation error, and too large or too small a value of C can reduce the generalization capability of the SVM. In this work, a linear kernel function was selected and the penalty parameter C was set equal to 100.
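A compact sketch of the two-stage binary tree described above is given below; scikit-learn is assumed as the SVM implementation, and the ±1 label conventions follow the text.

```python
from sklearn.svm import SVC  # scikit-learn assumed as the SVM implementation

class TwoStageBearingClassifier:
    """Binary tree of two SVMs: stage 1 separates normal from faulty signals
    (features SSHR, SIORH, HFE); stage 2 separates inner from outer race
    faults (the 15 features of Table 2). Linear kernel, C = 100."""

    def __init__(self, C=100.0):
        self.svm1 = SVC(kernel="linear", C=C)
        self.svm2 = SVC(kernel="linear", C=C)

    def fit(self, X1, y1, X2, y2):
        self.svm1.fit(X1, y1)   # y1: +1 = normal, -1 = faulty
        self.svm2.fit(X2, y2)   # y2: -1 = inner race fault, +1 = outer race fault
        return self

    def predict(self, x_stage1, x_stage2):
        if self.svm1.predict([x_stage1])[0] == 1:
            return "normal"
        if self.svm2.predict([x_stage2])[0] == -1:
            return "inner race fault"
        return "outer race fault"
```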
5 EXPERIMENTAL APPLICATION
5.1 Description of Experimental Setup
The evaluation of the proposed approach has been performed on an experimental application. The vibration data used in this section were obtained from the ball bearing test data set of the Case Western Reserve University Bearing Data Center website [18]. As shown in Figure 2, the test stand consists of a 2 hp Reliance Electric motor (left), a torque transducer/encoder (center), a dynamometer (right), and control electronics (not shown). The test bearings support the motor shaft. Vibration data were collected using accelerometers attached to the housing with magnetic bases. The vibration signals were collected with a 16-channel DAT recorder and post-processed in a Matlab environment. Digital data were collected at 12,000 samples per second, while speed and horsepower data were collected using the torque transducer/encoder and recorded by hand. Each signal is 8,192 samples long.
Figure 2: Experimental Test Stand.
The bearing monitored is a deep groove ball bearing manufactured by SKF. The drive-end bearing is a 6205-2RS JEM, with a BPFI and a BPFO equal to 5.4152 and 3.5848 times the shaft frequency respectively, leading to theoretical estimates of the expected BPFO and BPFI frequencies presented in the corresponding table. Single point faults were introduced into the drive-end test bearings using electro-discharge machining, with fault diameters of 7 mils (1 mil = 0.001 inch) at the inner race and at the outer race respectively. Each bearing was tested under four different loads (0, 1, 2 and 3 hp), resulting in four different motor speeds. The bearing data set was obtained from the experimental test rig under three different operating conditions: (a) normal condition (3 cases), (b) with an outer race fault (4 cases) and (c) with an inner race fault (4 cases), as presented in Table 6. Figures 3-8 present the spectra of indicative signals and of their corresponding envelopes.
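Since the defect frequencies are fixed multiples of the shaft frequency, the expected BPFI and BPFO can be computed directly from the ratios quoted above; in the sketch below the shaft speed is an assumed example value, not one taken from the data set.

```python
def defect_frequencies(shaft_rpm, bpfi_ratio=5.4152, bpfo_ratio=3.5848):
    """Expected defect frequencies (Hz) for the 6205-2RS JEM drive-end bearing,
    using the BPFI/BPFO-to-shaft-frequency ratios quoted in the text."""
    f_shaft = shaft_rpm / 60.0
    return {"shaft": f_shaft, "BPFI": bpfi_ratio * f_shaft, "BPFO": bpfo_ratio * f_shaft}

# Example with an assumed shaft speed of 1750 rpm.
print(defect_frequencies(1750))
```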
5.2 Experimental Classification Performance Analysis
The two-stage SVM approach was applied to eleven (11) test cases. The defective bearing model is tuned in order to produce a certain number of simulated signals: the model is updated with the shaft rotation speed and the expected BPFO and BPFI frequencies of the bearing, and four model parameters are then selected: (a) the signal amplitude (g), (b) the slip (%), (c) the level of noise NL (%) and (d) the excited natural frequency (Hz). Four different amplitudes were selected in order to cover the vibration amplitude range between 0 and 40 g RMS for signals presenting an outer race defect and between 0 and 20 g RMS for signals presenting an inner race defect. Simulated signals were produced using, in total, four different amplitudes, three resonance frequencies, three levels of noise and three levels of slip; the noise in the simulated signals is up to 5% SNR, while the percentage of slip fluctuates up to 3%. In total, 108 signals were produced with the model assuming a defect on the inner race of the bearing and another 108 assuming a defect on the outer race, while the normal condition is simulated using white noise with 12 different levels. The envelope spectra of the simulated signals are presented in Figures 9, 10 and 11: the simulated signals in Fig. 9 are white noise signals indicating the normal condition, the signals in Fig. 10 present an outer race defect, and the signals in Fig. 11 present an inner race defect. The 228 signals are further processed using order analysis and the features presented in Table 1 are derived; these energy features are the inputs of the two Support Vector Machines, as presented in Table 2. The two SVMs are then trained. For each SVM the linear kernel function is used, and the limiting term C is chosen equal to 100 (a high value), so that few errors are permitted. The inputs of the first Support Vector Machine are a) the Sum of Shaft Harmonics (SSHR), extracted from the raw signal, b) the Sum of bearing Inner and Outer Race defect Harmonics (SIORH), extracted from the envelope
signal, and (c) the High Frequency Energy (HFE), extracted from the raw signal. The training set consists of 228 examples, 12 positive and 216 negative. Two (2) support vectors were selected, none of which is at the bound C, as shown in Table 3. The inputs of the second SVM are presented in Tables 1 and 2. Its training set consists of 216 examples, 108 positive and 108 negative. Three (3) support vectors were selected, one (1) from the positive examples and two (2) from the negative examples (Table 4). The 11 available measurements, corresponding to the different stages and the different types of fault, are then used to test the trained SVMs. The results of the application of the proposed two-stage classification using SVMs are presented in Table 5. As can be observed, the success rate of the method is 100% in both stages for all measurements of all cases.
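For readers who wish to reproduce the overall flow, the following sketch outlines the two-stage scheme using scikit-learn's SVC as a stand-in for the authors' implementation. The function names and data layout are assumptions; only the linear kernel, the value C = 100 and the ±1 class conventions of Table 5 follow the text:

```python
# Minimal sketch of the two-stage classification scheme (not the authors' code):
# SVM 1 separates normal from faulty signals, SVM 2 then discriminates
# outer-race from inner-race defects. The feature matrices are placeholders
# for the order-analysis energy features described in the text.
from sklearn.svm import SVC

def train_two_stage(X_stage1, y_normal_vs_fault, X_stage2, y_outer_vs_inner):
    svm1 = SVC(kernel="linear", C=100.0).fit(X_stage1, y_normal_vs_fault)
    svm2 = SVC(kernel="linear", C=100.0).fit(X_stage2, y_outer_vs_inner)
    return svm1, svm2

def diagnose(svm1, svm2, f_stage1, f_stage2):
    """f_stage1: [SSHR, SIORH, HFE]; f_stage2: harmonic energy features."""
    if svm1.predict([f_stage1])[0] == +1:      # +1 = normal condition (Table 5)
        return "normal"
    # -1 at stage 1 means faulty; stage 2 assigns the fault type
    return "outer race" if svm2.predict([f_stage2])[0] == +1 else "inner race"
```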
6 CONCLUSIONS
In this paper, a two-stage procedure for the automated diagnosis of bearing condition, based on a defective rolling element bearing vibration model and on Support Vector Machines, is presented. The basic concept and major advantage of the method is that its training can be performed using simulated data from a dynamic model describing the response of defective rolling element bearings. Vibration measurements from the machine under condition monitoring can then be directly processed and imported into the multi-class fault diagnosis system. The data are preprocessed using order analysis in order to overcome problems related to sudden changes of the shaft rotating speed. Frequency domain features from the preprocessed measured signals and the simulated signals are used as inputs to the SVM classifiers for two-stage recognition and classification. At the first stage, an SVM classifier separates the normal condition signals from the faulty signals. At the second stage, an SVM classifier recognizes and categorizes the type of fault. The effectiveness of the method was tested in an experimental case including successive measurements from bearings with different types of defects, different loads and different rotation speeds. The properly selected input features, extracted from the frequency domain, yield 100% classification success.
7 REFERENCES
1
Burges, C.J.C. (1998) A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2, 955–974.
2
Gunn, S.R. (1998) Support vector machines for classification and regression. Technical report. University of Southampton. Department of Electrical and Computer Science.
3
Hu, Q., He, Z., Zhang, Z. & Zi, Y. (2007) Fault diagnosis of rotating machinery based on improved wavelet package transform and SVMs ensemble. Mechanical Systems and Signal Processing 21, 688-705.
4
Jack, L.B. & Nandi, A.K. (2001) Support vector machines for detection and characterisation of rolling element bearing faults. Proceedings of Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science 215, 1065–1074.
5
Jack, L.B. & Nandi, A.K. (2002) Fault detection using support vector machines and artificial neural networks, augmented by genetic algorithms. Mechanical Systems and Signal Processing 16, 373–390.
6
Rojas, A. & Nandi, A. (2006) Practical scheme for fast detection and classification of rolling-element bearing faults using support vector machines. Mechanical Systems and Signal Processing 20, 1523-1536.
7
Samanta, B., Al-Balushi, K.R. & Al-Araimi, S.A. (2003) Artificial neural networks and support vector machines with genetic algorithm for bearing fault detection. Engineering Applications of Artificial Intelligence 16, 657–665.
8
Samanta, B. & Nataraj, C. (2009) Use of particle swarm optimization for machinery fault detection, Engineering Applications of Artificial Intelligence 22, 308-316.
9
Yuan, S.-F. & Chu, F.-L. (2007a) Fault diagnosis based on support vector machines with parameter optimisation by artificial immunisation algorithm. Mechanical Systems and Signal Processing 21, 1318–1330.
10
Yuan, S.-F. & Chu, F.-L. (2007b) Fault diagnosis based on particle optimisation and support vector machines. Mechanical Systems and Signal Processing 21, 1787–1798.
11
Yang, B.-S., Han, T. & Hwang, W.-W. (2005) Fault Diagnosis of Rotating Machinery based on Multi-Class Support Vector Machines. Journal of Mechanical Science and Technology 19(3), 846-859.
12
Yang, J., Zhang, Y. & Zhu, Y. (2007) Intelligent fault diagnosis of rolling element bearing based on SVMs and fractal dimension. Mechanical Systems and Signal Processing 21, 2012-2024.
13
Antoni, J. & Randall, R. B. (2002) Differential diagnosis of gear and bearing faults. Transactions of the ASME. Journal of Vibration and Acoustics 124, 165-171.
14
McFadden, P. D. & Smith, J. D. (1984) Model for the vibration produced by a single point defect in a rolling element bearing. Journal of Sound and Vibration 96, 69-82.
15
Vapnik, V. (1995) The Nature of Statistical Learning Theory, Springer-Verlag, New York.
16
Vapnik, V. (1998) Statistical Learning Theory, John Wiley and Sons, Inc., New York.
17
Christianini, N. & Shawe-Taylor, J. (2000) An Introduction to Support Vector Machines and other kernel-based learning methods, Cambridge University Press.
18
Loparo, K. A., Bearings vibration data http://www.eecs.case.edu/laboratory/bearing/download.htm.
set.
Case
Western
Reserve
University.
Acknowledgments This paper is part of the 03ED_78 research project, implemented within the framework of the “Reinforcement Programme of Human Research Manpower” (PENED) and co-financed by National and Community Funds (25% from the Greek Ministry of Development-General Secretariat of Research and Technology and 75% from E.U.-European Social Fund).
Table 3: Experimental Case: Support Vectors of SVM 1.

sv    SSHR     SIORH    HFE
1     0.9123   0.1623   0.1361
2     0.9769   0.6278   0.0681
Table 4: Experimental Case: Support Vectors of SVM 2.

sv   1x      2x      3x      4x      BPFO    2BPFO   3BPFO   4BPFO   BPFI    2BPFI   3BPFI   4BPFI   SSHE    SORH    SIRH
1    0.0306  0.2459  0.0192  0.1136  0.1320  0.0092  0.0512  0.0840  0.0709  0.0157  0.0496  0.0199  0.2111  0.5114  0.1192
2    0.0214  0.2514  0.0209  0.0870  0.1598  0.0150  0.0394  0.1121  0.0867  0.0157  0.0854  0.0241  0.1634  0.6087  0.1467
3    0.0022  0.0078  0.4519  0.0066  0.1768  0.2630  0.0121  0.0038  0.1784  0.0209  0.0223  0.1297  0.0418  0.2108  1.0230
Table 5: Experimental Case: Data and Results.

No.  Fault Type   Fault Diameter  Motor Load  Approx. Motor  Characteristic    SVM1                   SVM2
                  (inches)        (HP)        Speed (rpm)    Frequency (Hz)    Distance    Class      Distance    Class
1    Normal       -               1           1772           -                 1.8022      +1         -           -
2    Normal       -               2           1750           -                 1.6298      +1         -           -
3    Normal       -               3           1725           -                 1.7187      +1         -           -
4    Inner Race   0.007           0           1797           162.19            -1.1082     -1         -0.9922     -1
5    Inner Race   0.007           1           1772           159.93            -1.3672     -1         -1.0199     -1
6    Inner Race   0.007           2           1750           157.94            -1.2227     -1         -0.9689     -1
7    Inner Race   0.007           3           1730           156.14            -1.0219     -1         -0.8763     -1
8    Outer Race   0.007           0           1797           107.36            -2.9291     -1         0.6269      +1
9    Outer Race   0.007           1           1772           105.87            -2.9128     -1         0.4788      +1
10   Outer Race   0.007           2           1750           104.56            -3.2225     -1         0.5450      +1
11   Outer Race   0.007           3           1730           103.36            -2.9765     -1         0.6385      +1
[Figures 3-8 show measured spectra (Gs Rms versus frequency): raw-signal spectra over 0-6000 Hz and envelope spectra over 0-1000 Hz. Figures 9-11 show envelope spectra of the simulated signals over 0-1000 Hz.]

Figure 3: Experimental Case: Spectra of normal raw signals.
Figure 4: Experimental Case: Spectra of envelopes of normal signals.
Figure 5: Experimental Case: Spectra of BPFO raw signals.
Figure 6: Experimental Case: Spectra of envelopes of BPFO signals.
Figure 7: Experimental Case: Spectra of BPFI raw signals.
Figure 8: Experimental Case: Spectra of envelopes of BPFI signals.
Figure 9: Experimental Case: Spectra of envelopes of normal simulated signals.
Figure 10: Experimental Case: Spectra of envelopes of BPFO simulated signals.
Figure 11: Experimental Case: Spectra of envelopes of BPFI simulated signals.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
EXPERIMENTAL DETERMINATION OF CUTTING AND DEFORMATION ENERGY FACTORS FOR WEAR PREDICTION OF PNEUMATIC CONVEYING PIPELINE
Kim Pang a,b, Ahmed Cenna a,b, Shengming Tan a,b and Mark Jones a,b
a CRC for Integrated Engineering Asset Management, Brisbane, Australia
b Department of Mechanical Engineering, the University of Newcastle, University Drive, Callaghan, Australia.
Pneumatic conveying has become a well established method of transporting materials in the resource and process industries. Erosion is a phenomenon that occurs in pneumatic conveying pipelines due to the inherent nature of the conveying process. In pneumatic conveying, particulate material is transported by means of compressed gas, with velocities usually less than 60 m/s. In the present investigation, erosion tests were performed in order to study the wear behaviour and determine specific energy factors of pipeline materials for predictive models of wear in dense phase pneumatic conveying pipelines. These tests were performed on mild steel and aluminium surfaces with alumina and ilmenite particles. The double disc method was used to measure the particle impact velocities at different powder mass flow rates and compressed air pressures for the erosion tests. The erosion rate and erosion behaviour were studied under solid particle erosion at dense phase conveying conditions. Deformation and cutting energy factors were then determined for predicting wear based on the material removal mechanisms. These factors will subsequently be incorporated in a generic software algorithm to predict the service life of pneumatic conveying pipelines. Key Words: Pneumatic conveying, erosion test, pipeline wear, service life.
1 INTRODUCTION
Current industrial regulations aim to protect the environment from dust, spillage and mess from any industrial operation. This is particularly true for industries that involve the transportation of powder or granular materials of any category, and for toxic and hazardous materials these regulations are even more stringent. Pneumatic conveying is thus an ideal method for conveying particulate materials within an industrial environment or over other relatively short distances. Pneumatic conveying involves the transportation of a wide variety of dry powdered and granular solids through pipelines and bends using high pressure gas. It is a frequently used method of material transport, particularly for in-plant transport over relatively short distances, primarily because of the flexibility it offers in terms of pipeline routing as well as dust minimization. Approximately 80% of industrial systems are traditionally dilute phase systems, which use relatively large amounts of air to achieve high particle velocities and so avoid problems such as pipeline blockage. However, for many applications higher velocities lead to excessive levels of particle attrition or wear of pipelines, bends and fittings. To combat these problems, some systems are designed to operate at relatively low velocities; yet wear remains a major issue for these conveying systems. Numerous experimental and numerical studies on solid particle erosion and failure assessment have been undertaken during the last fifty years [1, 2]. As a result, the treatment of solid particle erosion based on fracture mechanics concepts has matured, and standard methods for determining the erosion rate, in particular for laboratory testing, are already available [3]. To achieve the desired results, it is important to maintain the test conditions as accurately as possible. These standards allow the ranking of different candidate materials in a certain wear environment; even with these rankings, however, materials are often found to behave differently in applied wear situations. As the pipeline wall experiences the impingement of large amounts of abrasive, its failure is usually associated with the deformation and cutting wear processes. Since the applicability of micromachining and the associated fracture mechanics concepts is limited, the present study focused on the erosion behaviour of the target material in conjunction with the impinging
particles' kinetic energy dissipated into the surface. The objective of this research is to evaluate the erosive wear properties of two ductile materials: aluminium and mild steel. This paper also analyses the eroded surface features in order to relate the wear features to the experimentally determined energy factors in different wear environments.
2 LITERATURE REVIEW
In ductile materials erosion occurs by a process of plastic deformation in which material is removed by the displacement or cutting action of the particles, the material being removed as chips. In brittle materials, the material is removed by the intersection of cracks which radiate out from the point of impact of the eroding particle. Models of solid particle erosion presented in the literature show the great importance of the particle impact energy in removing material from the surface. Finnie [4] divided the erosion problem into two major parts. The first part involves the determination of the number, direction and velocity of the particles striking the surface, from the fluid flow conditions. The second part of the problem is the calculation of the material removed from the surface. The first part is basically a problem of fluid mechanics. During erosion of a ductile material, a large number of abrasive particles strike the surface. Some of these particles land on flat faces and do no cutting, while others cut into the surface and remove material. Finnie developed a model for material removal by the particles which displace or cut away material from the surface. An idealized picture of the particle interaction with a material surface is presented in Figure 1. Finnie [4] derived and solved the equations of motion of the idealized particle and compared the predicted material loss with experimental results. To solve the equations of motion of the particle the following assumptions were made:
Figure 1: Idealized abrasive particle striking a surface and removing material. The initial velocity of the particle's centre of gravity makes an angle α with the surface [4]
1) The ratio of the vertical and horizontal components of the force is assumed to have a constant value (K). This is reasonable if a geometrically similar configuration is maintained throughout the period of cutting.
2) The ratio of the depth of contact (l) to the depth of cut (ye) has a constant value (ψ).
3) The particle cutting face is of uniform width, which is large compared to the cutting depth.
4) A constant plastic flow stress (p) is reached immediately upon impact.
Based on these assumptions, the first micro-cutting model was developed from the deformation caused by an individual particle. The volume of material W removed by a single abrasive grain of mass m, velocity V and impact angle α is given by
W = \frac{mV^2}{p\psi K}\left(\sin 2\alpha - \frac{6}{K}\sin^2\alpha\right) \quad \text{for } \tan\alpha \le \frac{K}{6} \qquad (1)

W = \frac{mV^2}{p\psi K}\,\frac{K\cos^2\alpha}{6} \quad \text{for } \tan\alpha \ge \frac{K}{6} \qquad (2)
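The following short sketch evaluates expressions (1) and (2) for a single grain; it is illustrative only, and the values of K, p and ψ are placeholders chosen for demonstration rather than values from the paper:

```python
# Minimal sketch (illustrative, not from the paper): Finnie's micro-cutting
# expressions (1)-(2) for the volume removed by a single abrasive grain.
import math

def finnie_volume(m, V, alpha, K=2.0, p=1.0e9, psi=2.0):
    """Volume removed by a grain of mass m (kg) at speed V (m/s), angle alpha (rad)."""
    pre = m * V**2 / (p * psi * K)
    if math.tan(alpha) <= K / 6.0:          # particle leaves the surface while cutting
        return pre * (math.sin(2 * alpha) - (6.0 / K) * math.sin(alpha) ** 2)
    return pre * (K * math.cos(alpha) ** 2 / 6.0)  # horizontal motion ceases while cutting

# Example: predicted removal peaks at a shallow angle and vanishes at 90 degrees.
for deg in (10, 18, 30, 60, 90):
    print(deg, finnie_volume(1e-9, 60.0, math.radians(deg)))
```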
These two expressions predict the same weight loss when tan α = K/6, and the maximum erosion occurs at a slightly lower angle, given by tan 2α = K/3. The first equation applies to lower impact angles, for which the particle leaves the surface while still cutting. The second equation applies to higher impact angles, at which the horizontal component of the particle motion ceases while it is still cutting. The critical angle αc is the impact angle at which the horizontal velocity component has just become zero when the particle leaves the body; i.e., the impact angle above which the residual tangential speed of the particle equals zero. Based on this understanding of the material removal processes in erosion, Neilson and Gilchrist [5] proposed a simplified model for the erosion of material. They introduced a cutting wear factor φ (the kinetic energy needed to release unit mass of material from the surface through cutting) and a deformation wear factor ε (the kinetic energy needed to release unit mass of material from the surface through deformation), and proposed the following relationships for erosive wear loss based on the material and process parameters:
W = \frac{M\left(V^2\cos^2\alpha - v_p^2\right)}{2\varphi} + \frac{M\left(V\sin\alpha - K\right)^2}{2\varepsilon} \quad \text{for } \alpha < \alpha_0 \qquad (3)

W = \frac{M V^2\cos^2\alpha}{2\varphi} + \frac{M\left(V\sin\alpha - K\right)^2}{2\varepsilon} \quad \text{for } \alpha > \alpha_0 \qquad (4)

(the first term of (3) is part A, the first term of (4) is part C, and the common second term is part B)
where W is the erosion value and M is the mass of particles striking at angle α with velocity V. K is the velocity component normal to the surface below which no erosion takes place in certain materials, and v_p is the residual parallel component of particle velocity at small angles of impact. Part B accounts for deformation wear, while parts A and C account for cutting wear at small and large angles of impact respectively. α₀ is the angle at which v_p is zero, so that at this angle both equations predict the same erosion. The major emphasis in these models is on the energy factors that determine the material removal in the cutting and deformation processes. Sheldon and Kanhere [6] studied the formation process of craters and concluded that material removal can be characterised on the basis of the indentation hardness of the material. They showed that the penetration depth depends on the particle impact velocity, and observed that material flows around the crater under the impacts of particles until the strain of the displaced material is large enough for it to break off. The volume of material removed is proportional to the absorbed kinetic energy of the impacting particles. From these studies it can be clearly stated that the particle impact energy is the most dominant factor affecting the erosion rate of the target material. Particle erosion is a complex two-phase flow phenomenon. Currently there are no suitable models available to predict the particle velocity in the high pressure air stream used in these erosion tests. As a result, researchers depend on empirical formulas developed on the specific test rig to reduce the number of tests required to determine particle velocities for a range of pressures and particle mass flow rates. There are several methods for measuring the particle impact velocities for specific air pressures and solids flow rates. One of the simplest and most practical methods of measuring particle velocity is the double disc method of Ruff and Ives [7]. The following figure shows a schematic drawing of the arrangement of the double disc apparatus.
Figure 2 Schematic of double disc method for measuring particle impact velocity
In the double disc method the particle velocity is measured through the time of flight between the two fixed discs. The two discs, mounted on a shaft, are rotated by a motor at a known speed. The particle-accelerating nozzle is fixed on the other side, as shown in the diagram. There is a hole in the disc in front of the nozzle for the particles to pass through, and a fixed mark directly opposite the hole on disc A. As the shaft is rotated at high speed, particles passing through the hole on disc A hit disc B somewhere further along the path on the disc. An aluminium foil was stuck to the surface of disc B covering the positions of both the first and the second mark, so that the foil was exposed to the abrasive stream passing through the hole on disc A. From the distance between the two marks on disc B, the particle velocity can be obtained through the following equation [5]:

v = \frac{2\pi r \gamma L}{S} \qquad (5)
In equation (5), r is the radial distance of the marks on disc B in m, γ is the rotation rate in revolutions per second, L is the (constant) separation between the two discs, and S is the linear distance between the two marks on disc B, measured after the test. Neilson and Gilchrist [5] introduced a simple approach to the experimental analysis of material removal due to erosion and correlated the experimental results with the theoretical model developed by Bitter [8]. The aim of the erosion tests was to determine the values of the deformation and cutting energy factors for stable, progressive wear loss of the eroded surface. For a direct comparison between the controlled factors of particle impact angle and velocity, the erosion value was determined by measuring the progressive mass loss of each specimen and the mass of impinged erodent during the erosion test.
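A minimal sketch of this velocity calculation is given below; the function and the input numbers are illustrative, not taken from the paper:

```python
# Minimal sketch (illustrative): particle velocity from double-disc measurements,
# v = 2*pi*r*gamma*L / S  (equation (5)). The numbers below are made-up inputs.
import math

def particle_velocity(r, gamma, L, S):
    """r: radius to the erosion marks on disc B (m); gamma: disc speed (rev/s);
    L: separation between the two discs (m); S: arc distance between marks (m)."""
    return 2.0 * math.pi * r * gamma * L / S

print(particle_velocity(r=0.05, gamma=50.0, L=0.02, S=0.005))  # ~62.8 m/s
```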
The erosion rate was obtained by computing the mass loss of the target material per unit mass of impinged erodent periodically during the erosion test. From the test results, the values of the deformation and cutting energy factors were determined according to the following procedure:

1. Using the mass loss of material per unit mass of erodent obtained at α = 90°, ε can be determined from

   \frac{W_{90^\circ}}{M} = \frac{\tfrac{1}{2}V^2}{\varepsilon}

2. Using ε, the contribution that deformation wear makes to the total wear at all angles can be obtained from equations (3) and (4).

3. The cutting wear can be obtained at all angles by subtracting the deformation wear from the experimental values.

4. Using the cutting wear values obtained in step 3, φ can be determined from

   \frac{W}{M} = \frac{\tfrac{1}{2}V^2\cos^2\alpha}{\varphi}
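The sketch below illustrates this procedure under the simplifying assumption that the threshold velocity K and the residual velocity v_p are negligible; the erosion-rate inputs are invented numbers, not the measured data of Section 4:

```python
# Minimal sketch of the factor-determination procedure above (illustrative only).
import math

def deformation_factor(erosion_rate_90, V):
    """epsilon from the 90-degree test: W90/M = 0.5*V^2 / epsilon."""
    return 0.5 * V**2 / erosion_rate_90

def cutting_factor(erosion_rate, erosion_rate_90, V, alpha_deg):
    """phi from an oblique-angle test, after subtracting the deformation part."""
    a = math.radians(alpha_deg)
    deformation_part = 0.5 * (V * math.sin(a)) ** 2 / deformation_factor(erosion_rate_90, V)
    cutting_part = erosion_rate - deformation_part           # step 3
    return 0.5 * (V * math.cos(a)) ** 2 / cutting_part       # step 4

eps = deformation_factor(erosion_rate_90=4.0e-6, V=60.0)     # J/kg
phi = cutting_factor(erosion_rate=3.0e-5, erosion_rate_90=4.0e-6, V=60.0, alpha_deg=30)
print(eps, phi)
```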
As cutting and deformation are the primary mechanisms of wear in ductile materials, these two factors can determine the wear loss in any wear situation, provided the factors affecting the particle impact energy are determined. These are the particle velocity, the particle angularity and the energy absorbed by the surface. The effect of these factors and their quantitative assessment have been discussed in detail elsewhere [9].
3 EXPERIMENTAL METHOD
Erosion tests were performed using two different particles, namely alumina (average size 70 µm) and ilmenite (average size 130 µm), on a micro sandblaster, model SWAM_BLAST® MV-2L from Crystal Mark Incorporated (Figure 3). A specialized chamber was built to keep the specimen under controllable testing conditions; the chamber also enabled the particles to be collected after each test. The investigation of the erosion rate was carried out at abrasive particle velocities of 30 m/s, 50 m/s and 60 m/s, over an impact angle range of 0-90°. The particle velocity was determined using the double disc method and calibrated for different mass flow rates as well as pressure conditions. Calibration of particle velocity using the double disc method has been presented earlier. The erosion rate was determined from the weight loss of the target material and the mass of impacted abrasive particles. The target mass loss was measured by weighing to an accuracy of 0.01 mg.
Figure 3 Schematic of SWAM_BLAST® MV-2L erosion tester
Erosion experiments were conducted with alumina and ilmenite to determine the wear factors according to the procedure described earlier. The study also analysed the influence of particle angularity on the erosion rate and on the material removal mechanism in ductile materials. The SEM images of the abrasive particles clearly reveal the primary difference between the two particles. The following figures show the SEM images of the particles as well as the particle size distributions. The physical properties and conveying conditions are given in Table 1.
Figure 4 SEM observation of size and shape of alumina
Figure 5 SEM observation of size and shape of ilmenite.
Figure 6 The size distribution of alumina in the erosion test
Figure 7 The size distribution of ilmenite in the erosion test
Table 1 Test parameters of erodent materials

Parameter                         Alumina                    Ilmenite
Erodent bulk density              3766 kg/m3                 4352 kg/m3
Erodent size range (µm)           65-125                     65-180
Erodent hardness (Mohs)           9                          5-6
Erodent shape                     Less angular edges         Irregular, more angular
Impingement angle                 20°, 30°, 90°              30°, 45°, 90°
Impact velocity (m/s)             30 ± 10, 60 ± 10           30 ± 10, 60 ± 10
Nozzle diameter (mm)              1                          1
Nozzle to sample distance (mm)    21 ± 2                     21 ± 2
Erodent feed rate (g/min)         10.2 ± 0.6                 10.2 ± 0.6
Test temperature                  Room temperature           Room temperature
A specimen size of 10 mm x 30 mm x 1 mm was used for the solid particle erosion tests. Specimens were cut from a single stock for consistency of the test materials. Surface grinding was performed using sand paper to 400 grit. Specimens were cleaned using an ultrasonic cleaner, and acetone was used to remove chemical residue from the surface of the target material. The specimen was positioned using a rigid specimen holder to avoid buckling under the particle stream. The attached mechanism enabled monitoring of the progressive mass loss from the target material, with accurate repositioning of the sample after each weighing session. The specimen design was in agreement with the recommendations given in the standard [3].
4 RESULTS AND DISCUSSION
4.1 Eroded Surface Features
Although the primary aim of this study was to determine the cutting and deformation factors for two different surface-erodent combinations, it also played a significant role in developing an understanding of the material removal mechanisms under these conditions. One of the features demonstrated here is the development of ripples on the wear surfaces. Even though this is a well-known feature of material removal in ductile materials, it is usually reported for spherical erodents; in these studies it was observed equally for a highly angular erodent such as ilmenite and for the less angular alumina, as shown in Figures 4-5. Figure 5 shows that the ilmenite abrasive has a larger average size and a more angular shape than the alumina. Figures 8-11 present the ripple patterns generated in each surface-erodent combination at different impact angles. Figure 8 presents the wear surface of aluminium eroded by alumina particles at three different impact angles; ripples are observed in all the wear tests. On the other hand, the mild steel surface did not show any ripples under similar test conditions (Figure 9). This is because aluminium presents a softer surface than mild steel. Although cutting was expected to be the primary mechanism of material removal, the two surfaces presented two different phenomena: surface hardness and particle angularity played the major role in the material removal and the eventual surface texture in these two cases. As the alumina particles are less angular, the depth of cut of individual particles is very small for mild steel; at the same time, because of the higher hardness, the deformation is also limited under these test conditions. For aluminium, the depth of cut is small but the deformation is much higher than for mild steel. The higher deformation rate is clearly visible for the 30° impact compared to both 90° and 20°, as the normal stress component of the particles is higher. Another phenomenon demonstrated in these tests is the initial increase of the sample weight due to particles embedded in the surface.
Fig. 8 Photographs of aluminium surface eroded by impinging alumina at 60 m/s: (a) 90° for 960 s, (b) 30° for 3600 s, (c) 20° for 2160 s
Fig. 9 Photographs of mild steel surface eroded by impinging alumina at 60 m/s: (a) 90° for 1587 s, (b) 30° for 938 s, (c) 20° for 305 s

Figures 10 and 11 present the wear surfaces of aluminium and mild steel eroded by ilmenite. Ilmenite particles are highly angular and relatively large compared to sand. Although the wear surfaces of aluminium present a topography similar to that produced by the alumina erodent, the wear surfaces of mild steel at 45° are clearly different with respect to ripple formation. The wear scars for 90° impacts are clearly the same for both aluminium and mild steel, and one of the major similarities between mild steel and aluminium was observed for 45° impacts. Ripple formation is associated with the deformation of the surface by the erodent. In general, if the particle is angular, it cuts into the surface and removes the material as chips. For higher impact angles, particles may dig into the surface and stop cutting before leaving it; in this case a particle may become embedded in the surface and increase the weight of the sample. As a particle cuts into the surface, it also deforms the surface around itself, forming lips in front of it. The chip formed by the particle can be removed by subsequent impacts of other particles. In surface deformation, particles push the surface material in front of them, creating small wave patterns. As more particles impact these wave-like discontinuities, the surface layer moves along the particle direction. With higher impact angles of the particle stream, the depth of these waves is greater than at lower impact angles, and as a result a larger wave pattern can be observed. In the case of mild steel impacted by alumina, the depth of deformation is small for the
wave patterns. For the ilmenite particles, however, due to their angularity, the surface deforms more than it is cut; the deformation of the surface exceeds the cutting. This is probably the primary reason for the wave formation on the mild steel surface eroded by ilmenite.
Fig. 10 Photographs of aluminium surface eroded by impinging ilmenite at 60 m/s with different impact angles and test durations: (a) 90° for 1440 s, (b) 45° for 870 s, (c) 30° for 1080 s
Fig. 11 Photographs of mild steel surface eroded by impinging ilmenite at 60 m/s with different impact angles and test durations: (a) 90° for 900 s, (b) 40° for 1080 s, (c) 30° for 840 s

4.2 Erosion Rate of Ductile Materials

The erosion rate of the ductile materials under the test conditions is discussed in the following sections. Figures 12a and 12b show the erosion rates as a function of impingement angle. The characteristic behaviour of the erosion rate of ductile materials is a minimum at normal incidence and a maximum around 20°-30° impact angles. Due to the limitations of the apparatus, it was not possible to measure the weight loss satisfactorily below 20° incidence, so these conditions were dropped. Figures 12a and 12b show the measured erosion rates for the mild steel and aluminium surfaces at different impact angles. The aluminium wear rates are almost 3-4 times the wear rates of the mild steel surfaces for a particle velocity of 60 m/s; this difference appears smaller at the lower impact velocity. Another difference between aluminium and mild steel is the change in wear rate from 20° to 30° impact angles, which appears smoother for mild steel than for aluminium. The difference in wear rates between mild steel and aluminium can be related to the wear surfaces in Figures 8 and 9.
[Figure 12: erosion rate of aluminium (g/g) and erosion rate of mild steel (g/g) versus impingement angle of alumina (degrees), panels (a) and (b), for impact velocities of 30 m/s and 60 m/s.]
Fig. 12 Influence of impingement angle of alumina on the steady state erosion rate of ferrous and non-ferrous materials

Wear rates of mild steel and aluminium for ilmenite particles are presented in Figures 13a and 13b. From the figures it can be seen that the wear rates for aluminium are almost 2-4 times those of mild steel under these wear conditions. The wear rates for ilmenite increased by up to 10 times compared to the alumina wear rates, which is primarily due to the angularity of the ilmenite particles. In the case of ilmenite, the ratio between the wear rates of mild steel and aluminium is reduced considerably compared to the alumina erodent. This is because, with ilmenite, material removal is primarily through deformation. As the hardness of mild steel is higher than that of aluminium, higher particle energy is needed for permanent deformation of the surface, so a similar particle impact energy results in less material removal from the harder surface. On the other hand, ilmenite, being an angular material, can deform the surface similarly to the
aluminium, even if to a smaller extent. This increases the wear rate of mild steel relative to aluminium. The changes in wear rates are reflected in the surface profiles seen in Figures 10 and 11, which show deformation wear characteristics such as ripple formation in mild steel that were absent in Figure 9.
[Figure 13: erosion rate of aluminium (g/g) and erosion rate of mild steel (g/g) versus impingement angle of ilmenite (degrees), panels (a) and (b), for impact velocities of 30 m/s and 60 m/s.]
Fig. 13. Influence of impingement angle of Ilmenite on steady state erosion rate of ferrous and non-ferrous materials
4.3 Determination of Deformation and Cutting Energy Factors

The importance of the deformation and cutting energy factors and the procedure for determining them from the experimental results were described in earlier sections. These factors are very important in developing predictive models of wear loss in different wear situations: once they are determined for a combination of surface and erodent, they can be considered constants. The variation of the deformation energy factor (ε) and cutting energy factor (φ) with particle velocity for alumina particles is presented in Figures 14a and 14b. It shows clearly that the deformation energies required by mild steel and aluminium are nearly comparable. On the other hand, the cutting energy factors of aluminium and mild steel differ considerably. It appears that with increasing velocity, mild steel requires more energy per unit of material removed by cutting than at lower velocity. This is due to the particle shape and the hardness of the surface; in fact, these results show the inefficiency of the alumina particles in cutting the harder surface.
Figure 14 a) Deformation energy factor of both aluminium and mild steel impinged by alumina, b) Cutting energy factor of both aluminium and mild steel impinged by alumina

The variation of the deformation energy factor (ε) and cutting energy factor (φ) with particle velocity for ilmenite particles is presented in Figures 15a and 15b. Wear rates at 90°, 45° and 30° are required for the determination of the cutting and deformation energy factors for particle velocities between 30 m/s and 60 m/s. Trends similar to those for alumina were observed, with the deformation energy factors increasing with increasing velocity. The cutting energy factor of mild steel, however, showed a different trend from that for alumina, increasing slightly here compared to alumina. This is because the ilmenite particles are more efficient than alumina in cutting the mild steel surface at higher velocity, which is again reflected in the wear surface of mild steel eroded by ilmenite.
Figure 15: a) Deformation energy factor of both aluminium and mild steel impinged by ilmenite, b) Cutting energy factor of both aluminium and mild steel impinged by ilmenite
5 CONCLUSION
(1) The wear behaviour of ductile materials has been analysed for two different erodents having similar hardness but different angularity. It was observed that the material removal mechanisms change with particle characteristics.
(2) The wear mechanism in aluminium is the same for both alumina and ilmenite, but the mechanisms differ in mild steel: ilmenite developed ripples in mild steel, whereas no ripples were observed when eroding with alumina.
(3) Deformation wear and the cutting process are the most important material removal mechanisms in ductile materials. By determining the cutting and deformation energies, changes in the material removal mechanisms can be recognised.
(4) Experiments were carried out in this investigation to determine the cutting and deformation wear factors for different ductile materials and different erodents. These factors were found to be highly dependent on both the erodent and the surface material. Although they vary with particle velocity, they can be considered constants for a particular surface-erodent combination; in particular, over the small range of particle velocities typical of pneumatic conveying processes they can be treated as constants.
6 REFERENCES
1
I. Finnie, (1995) Some reflections on the past and future of erosion, Wear, 1-10
2
Tilly GP. (1969) Sand erosion of metals and plastic: a brief review, Wear, 14, 241-8.
3
ASTM G76-07, (1995) Standard Test Method for Conducting Erosion Tests by Solid Particle Impingement Using Gas Jets, ASTM, Philadelphia.
4
I. Finnie, (1960) Erosion of surfaces by solid particles, Wear 3, 87-103.
5
J. H. Neilson and A. Gilchrist, (1968) Erosion by a stream of solid particles, Wear, 11, 111-122.
6
G.L. Sheldon, A. Kanhere, (1972) Investigation of Impingement Erosion Using Single Particles, Wear, 21, 195-208.
7
Ruff AW, Ives I.K. (1975) Measurement of solid particle velocity in erosive wear, Wear, 35, 195-9
8
J.G.A. Bitter, (1962) A study of erosion phenomena. Part I and II, Wear 6, 5-21, 169-190.
9
A. Cenna, N.W.Page, K.C. Williams and M.G. Jones, (2008) Wear mechanisms in dense phase pneumatic conveying of alumina, Wear 264, 905-913
Acknowledgement
Financial support from the Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM) for this work is gratefully acknowledged.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DIAGNOSTIC SUPPORT TECHNOLOGY BY FUSION OF MODEL AND SEMANTIC NETWORK
Hideki Yachiku a, Ryota Inoue b and Tadao Kawai b
a OMRON Corporation, Shiokoji Horikawa, Shimogyo-ku, KYOTO, 600-8530 JAPAN.
b Osaka City University, 3-3-138 Sugimoto Sumiyoshi-ku Osaka-city, OSAKA, 558-8585 JAPAN.
There are two main approaches to diagnosing failures. The first is based on models; the other is based on statistics. Although model-based diagnosis can give exact numerical answers, its drawbacks are the difficulty of building models and of interpreting the results. On the other hand, statistics-based diagnosis has the advantage of being easily understandable for its users, because a phenomenon and its cause can be directly associated. However, for high accuracy the technique requires many failure cases, which is yet another problem. This study examines the characteristics of the above two diagnostic techniques and considers a method to build a better diagnostic system by blending them. By creating virtual case data with the model-based technique for the statistics-based technique, and by verbal abstraction of the model-based outputs, the authors try to solve the problems of both techniques: data shortage and difficult interpretation. Key Words: Diagnostic Support System, Model-Based, Statistics-Based, Semantic Network
1 INTRODUCTION
As many machines are used in electric power plants and factories, and the damage due to failures of this equipment can be immense, there is strong demand for trouble diagnosis enabling early detection of mechanical abnormality. In large companies, diagnostic know-how has often been accumulated from past anomalous occurrences and analyses are conducted by skilled people. In small and medium sized companies, however, diagnosis experts are often unavailable and securing quality diagnosis in the future is an issue. Meanwhile, a review of studies relating to trouble diagnosis shows that many past studies on quantitative diagnosis aimed for ever more accurate diagnosis. However, a high level of expert knowledge has become necessary in the process, and circumstances have arisen in which only diagnosis experts can make use of the diagnostic technology. In this study, we aim for a technology able to solve the diagnosis problems of the aforementioned small and medium sized companies, and we discuss a support system enabling diagnosis even by non-experts.
2 THE APPROACHES OF TROUBLE DIAGNOSIS
When building a mechanical trouble diagnostic system, past approaches can be divided into two main
categories. One is the model-based approach [1], which simulates troubles deductively from theory and diagnoses troubles by checking similarities between the simulated results and the actual troubles. The other is the statistics-based/case-based approach [2], which accumulates cases that actually occurred in the past and cases that were analysed and whose causes were found, organises them inductively/statistically, and diagnoses new troubles by checking similarities when they occur. (Figure 1, Figure 2)
Figure 1 Model Based Diagnostic Approach
Figure 2 Statistics Based Diagnostic Approach
The advantage of the model-based approach lies in the fact that it is easier to probe detailed causes, since a numerical phenomenon can be predicted very precisely if the model matches reality. However, the model-based approach also has issues. The first is that it is difficult to represent actual equipment completely with a simple model. The targeted machines are diverse, and the phenomena produced by the model and by the actual equipment can differ if even one parameter is slightly different. Especially for new targets, it is sometimes difficult to obtain an exact solution from simulations because the target may be poorly modelled. Also, as the results derived from models are often expressed as numerical values, they are difficult for normal operators and maintenance people to interpret. Although experts can analyse them with expert knowledge, this is far removed from the user-friendliness required by normal maintenance personnel for analysing causes in case of failure. On the other hand, the statistics-based/case-based approach records occurred troubles from the point of view of observers/machine users and effectively reuses causes analysed in the past when the phenomenon reoccurs or a similar trouble occurs, so it has the advantage that diagnosis becomes easier for observers/machine users. For objects that have been known for a long time, past troubles have been accumulated and it is quite possible to build a case-based diagnostic system by examining them statistically; for new objects, however, case data about troubles have often not been obtained and diagnosis may not even be possible due to insufficient data. To mitigate this, an approach of accumulating prior experiments as sample cases can be considered, but it is difficult to deliberately produce troubles for collecting sample cases if the machines are expensive. Therefore, the issue for the statistics-based/case-based approach lies in how data can be collected. In this paper, we discuss an approach for building a better diagnostic system by considering the merits and demerits of the model-based and statistics-based diagnostic systems and fusing them. As a basic concept, we create virtual case data for the statistics-based diagnostic system using the model-based diagnostic system. In this way, we try to solve the issue of the difficult collection of sample cases that is problematic in the statistics-based diagnostic system. We also try to increase accountability and user-friendliness by converting the output of the model-based diagnostic system into statistical/verbal form, addressing the issue of understandability that is problematic in the model-based diagnostic system. (Figure 3)
Figure 3 Diagnostic Approach by Fusion of Model and Statistics
3 MAKING VIRTUAL CASES BY MODEL BASED SYSTEM
In this paper, we build the diagnostic system with these merits in mind, taking the diagnosis of rotating body vibration as an example.
Many studies on rotating body vibration have been conducted in the past, so we judged it would be straightforward to assess the proposed approach on this object. For modelling, using the modelling language Modelica [3], we modelled the flexural oscillation of a shaft due to static unbalance and the subsynchronous resonance of a roller bearing due to backlash, for simple systems made of a motor, bearings, a rod and a rotor, so that they can be reproduced by simulation. With these models and the simulation environment, we conducted preliminary experiments on the occurrence of troubles. For instance, we obtained results for the static unbalance and for the bearing backlash as shown in Figure 4 and Figure 5 respectively. We could confirm that the simulation was reasonable, as these results almost reproduced the behaviour predictable from theory.
Figure 5 Simulation for Subharmonic Resonance
Figure 4 Simulation for Unbalance Vibration
Next, applying the above simulation, we consider a rotating system made of three rotors as a model of the diagnosis object (Figure 6). For this system, we recorded the phenomena that occurred as virtual cases by introducing three kinds of causes: eccentricity of a rotor, a defect of a rotor part, and backlash of a bearing. For the rotor eccentricity, we ran simulations with each rotor having 0, 1 or 3% eccentricity relative to its radius. For the rotor part defect, we ran simulations with a 1, 5 or 10% mass defect on any one rotor. For the bearing backlash, we ran simulations with backlash in the bearing on one side only and in both bearings.

Figure 6 Test System for Diagnosis

After recording the phenomena resulting from the simulations, we perform abstraction/verbalization processing on the characteristic properties of those phenomena. For example, Figure 7 shows the oscillatory waveform of the left-side rotor (Rotor 1) when a 1% defect was produced in the central rotor (Rotor 2); note that the rotation speed is kept well below the critical speed. In this case, we analyse the amplitude change and the frequency content of the oscillatory waveform, assign the label "rapid increase" if the amplitude change exceeds a predefined threshold, and assign further labels describing, from the frequency analysis, how many times the rotation frequency the components exceeding a threshold correspond to. We call this process of putting labels on measurement results based on predetermined rules "verbalization".
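A minimal sketch of such a verbalization rule is given below; the thresholds and label names are assumptions for illustration, not the rules actually used in the study:

```python
# Minimal sketch of the "verbalization" step: numeric features of a simulated
# waveform are turned into discrete labels that a semantic network can learn.
import numpy as np

def verbalize(signal, dt, shaft_hz, amp_change_threshold=2.0, amp_threshold=0.1):
    labels = []
    # amplitude change: compare peak amplitude of the second half to the first half
    half = len(signal) // 2
    a0, a1 = np.max(np.abs(signal[:half])), np.max(np.abs(signal[half:]))
    if a1 > amp_change_threshold * a0:
        labels.append(("amplitude change", "rapid increase"))
    # frequency content: label dominant components as multiples of the shaft frequency
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), dt)
    for f, a in zip(freqs, spectrum):
        if a > amp_threshold and f > 0:
            labels.append(("vibration frequency", f"{f / shaft_hz:.1f}x rotation"))
    return labels
```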
Table 1 shows the results of verbalizing the vibration frequency, rotation speed, amplitude change and phase variation for the diagnostic object model of Figure 6 using the above approach.
Figure 7 Verbalization Example for Vibration
Table 1 Result of Verbalization for Test System

4 STATISTICS BASED LEARNING AND DIAGNOSIS
As described above, we could build virtual cases, equivalent to accumulated actual cases, by recording the causes and phenomena
of the simulations after verbalization. Next, we describe an approach to statistical learning and diagnosis using these virtual cases. As the learning/diagnosis approach, we apply the Semantic Network [4] developed by us. We have used the Semantic Network to learn linguistic phenomena relating to troubles statistically and to increase the precision of search and classification. Whereas in the past learning was performed from linguistic information described by humans, here we learn from linguistic information output as simulation results. In the Semantic Network, the "meaning" of linguistic information is expressed as a set of relations between words. We therefore learn, as the minimum data unit, with how much strength (S) word (A) and word (B) maintain relation (R). This minimum data unit is called a "semantic primitive", and a set of semantic primitives forms a Semantic Network. For the diagnostic object model in this paper, we express the property names and property values verbalized in the preceding section by the Semantic Network. For example, if many cases of a "rapid increase" in "amplitude change" are observed for models with troubles, this increases the strength (S) of the relation (R) "cause and effect" between word (A) "amplitude change" and word (B) "rapid increase". In this way, we have the models of each trouble phenomenon learn the Semantic Network (Figure 8).

Figure 8 Learning Semantic Network

Regarding the quantification of the strength of cause and effect, although approaches such as path analysis [5], graphical modelling [6] and Bayesian networks [7] have been introduced in recent years as efforts to handle cause and effect explicitly, we decided to use correlation analysis, as the theory of correlation analysis forms the foundation of those technologies and we judged a simple model to be better for this simplified problem. Here, we treat the problem as a kind of classification problem under multiple observed variables and learn classification rules. In this case, we learn one classification rule per cause and use the correlation coefficient to formulate how effectively each property of a phenomenon identifies the cause. For language data, we perform the calculation in the same way, taking the appearance of a label as 1 and its non-appearance as 0. The correlation coefficient is defined by the following formula:
r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} \qquad (1)
Now, denoting the number of cases of cause c, the number of occurrences of property p among them, the number of cases of causes other than c, and the number of occurrences of property p among those as c_0, p_0, c_1 and p_1 respectively, and in order to eliminate the influence of errors growing with the number of cases, we perform the correlation analysis of Equation (1) on the two points

(x_0, y_0) = \left(\frac{p_0}{c_0},\, 1\right), \qquad (x_1, y_1) = \left(\frac{p_1}{c_1},\, -1\right) \qquad (2)

i.e., we substitute Equation (2) into Equation (1). By this process, when estimating a cause we focus on the properties that discriminate it from other causes, and we learn a classification rule that estimates a cause by giving more weight to properties with higher correlation. In an actual diagnosis, the estimated score of a cause is obtained by multiplying the deviation of each observed property value from its mean by the correlation coefficient of Equation (1) and summing over all properties. It therefore becomes possible to display the causes considered closest to the input phenomenon in order of score.
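The following sketch shows how the per-property weights obtained by substituting Equation (2) into Equation (1), and the resulting cause score, could be computed; variable names and the toy data layout are assumptions:

```python
# Minimal sketch of the correlation-based learning/scoring described above.
# Each case is a binary vector of verbalized labels (1 = label observed).
import numpy as np

def label_weights(cases, cause_of_case, cause):
    """Per-label correlation weights for one cause, using the two points of Eq. (2)."""
    cases = np.asarray(cases, dtype=float)
    in_cause = np.asarray([c == cause for c in cause_of_case])
    weights = []
    for j in range(cases.shape[1]):
        x0 = cases[in_cause, j].mean()        # p0 / c0
        x1 = cases[~in_cause, j].mean()       # p1 / c1
        x, y = np.array([x0, x1]), np.array([1.0, -1.0])
        denom = np.sqrt(((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum())
        weights.append(0.0 if denom == 0 else ((x - x.mean()) * (y - y.mean())).sum() / denom)
    return np.array(weights), cases.mean(axis=0)

def score(observation, weights, means):
    """Score = sum over properties of (deviation from mean) x (correlation weight)."""
    return float(((np.asarray(observation) - means) * weights).sum())
```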
5 DIAGNOSTIC TEST
We conducted a diagnostic test to verify the effectiveness of this approach. Preparing data group A and data group B for the diagnostic test, and using the Semantic Network learned from data group A, a closed test and an open test were conducted on data group A and data group B respectively (Figure 9). The results obtained by the diagnostic test are given in Table 2. When diagnosing the cause and the trouble spot, the cause was diagnosed correctly in 100% of the cases within this test coverage, in both the closed test and the open test. For the trouble spot, however, the accuracy rate was only 82% for the closed test and 60% for the open test.
Figure 9 Diagnostic Test Method
Nonetheless, if combined-factor learning/diagnosis is excluded, both tests had a 100% accuracy rate. We believe the reason the accuracy rate drops when combined factors are included is that the simple correlation analysis is not accurate enough, and it may be necessary to apply an approach that can handle the effect of combined factors quantitatively, such as path analysis.
6 THE RESULT OF THIS RESEARCH
In this paper, we proposed a system concept to overcome data deficiency and over-specialization by clarifying the merits and demerits of model-based diagnosis and statistics-based diagnosis and fusing the two together. We also showed that a simple learning diagnosis can actually be realized for the vibration problem of a rotating body by fusing model and simulation with a basic statistical approach. Although the present approach was prototyped for rotating body vibration, domain-specific knowledge is needed only in the rules for verbalizing the simulation results, so the approach can easily be applied to other objects, since, unlike diagnostic structures built from knowledge prepared beforehand, it is intended to learn the diagnostic rules automatically.

Table 2 Results of Diagnosis

Although this diagnostic test gave positive results for the cause and the trouble spot, except in the combined-factor case, further study is needed on the diagnosis of the trouble spot in the case of combined factors. We also need to conduct verification on more sophisticated objects in the future.
7 REFERENCES
1
J. Chen and R. Patton , (1999) Robust Model-Based Fault Diagnosis for Dynamic Systems. Kluwer
2
T. Iwata, Y. Motomura and K. Machida, (2002) Diagnosis of Satellites -Top-Down Approach-. In 23rd International Symposium on Space Technology and Science, ISTS 2002-f-18.
3
H. Elmqvist, S. E. Mattsson, and M. Otter, (1998) Modelica—The new object-oriented modelling language. In Proc. 12th Eur. Simulation Multiconf., pp. 127–131.
4
H. Yachiku, (2008) Quality Feedback Model for Trouble Relapse Prevention in Product Development. In Proceedings of the Third World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM IMS2008), Springer-Verlag London Limited, No.ISBN 978-1-84882-216-0, pp.1823-1827.
5
H. B. Asher, (1976) Causal Modeling. Beverly Hills, CA/London: Sage Publications.
6
D. Edwards (2000) Introduction to graphical modelling. Springer, 2nd edition.
7
D. Heckerman, C. Meek and G. Cooper (1999) A Bayesian approach to causal discovery.In Cooper and Glymour, pp.141–166.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
E-LEARNING MAINTENANCE MANAGEMENT TRAINING AND COMPETENCE ASSESSMENT: DEVELOPMENT AND DEMONSTRATION
Nikos Papathanassiou and Christos Emmanouilidis
CETI/ATHENA Research & Innovation Centre, Greece

Recent advances in industrial production and manufacturing processes, as well as rapidly increasing global competition, are key drivers of a growing demand for improving personnel competences in maintenance management. Most employees in the manufacturing sector are beyond the typical age of standard education, so traditional teaching methods are not sufficient. Vocational Education and Training should be targeted to serve the specific needs for enhancing employable skills and competences. Time and place constraints, which are often a significant concern for professional training, can be mitigated by e-training. This paper presents an overview of the e-learning toolkit employed in the iLearn2Main project, a focused EU collaborative project targeting Maintenance Management Training. The learning toolkit offers customized maintenance management training, while facilitating the standardisation of competence assessment and learning evaluation. It is based on the Moodle Learning Management System platform and comprises a series of courses on maintenance management, structured into a complete Maintenance Management curriculum specified by taking into account stakeholder needs. Each course is delivered in short and easily completed sessions, which are followed by comprehension questions, aiming at maximizing learner engagement and understanding. A full set of assessment questions certifies that learners acquire the required knowledge and provides an immediate, reliable and automated way of evaluating trainee performance. The learning toolkit is customised to offer not only interactive training but also to steer learners through individual learning paths, thus offering a more engaging and efficient learning process. Key Words: Maintenance management training, e-learning, learning management systems
INTRODUCTION
Industries are confronting challenging global competition that drives them to seek to rationalise the use of their assets. The maintenance management function is strengthened by the introduction of advanced tools and enabling technologies aimed at streamlining the planning and execution of maintenance and asset management tasks. This new environment increases the pressure put on personnel involved in Maintenance Management to excel in performing their intended duties. As a consequence, there is a constantly increasing demand for improvement in maintenance management, delivered through personnel training. Vocational Education and Training (VET) should be targeted to specific needs and to the improvement of certain employable skills and competences. Whether VET is targeting people entering their working life (initial VET, IVET) or people during their working life (continuing VET – CVET), there is a clear need for a common framework for maintenance management competence assessment, ultimately leading to a competence certification process. An EU initiative in this direction is taken through the EFNMS, in the form of Competence Requirement Specification [1]. Traditional delivery of training in Maintenance Management is often considered impractical, as personnel need to operate under time and space constraints that lack flexibility. Therefore it is the training itself that needs to turn flexible. This can be achieved by employing e-learning techniques and tools, enabling trainees to choose the training pace and courses that fit their needs. Beyond that, e-tools can also facilitate streamlining the competence assessment procedure, by providing a uniform and standardized way to assess Maintenance Management knowledge and skills. This paper presents steps taken towards developing a Maintenance Management e-learning and e-competence assessment IT toolkit, as part of the iLearn2Main EU project. Based on the popular open source Learning Management System platform Moodle, the toolkit offers customized maintenance management training and automated competence assessment. In the remainder we describe the motivation for undertaking this work, as well as the key offered functionalities.
2
MAINTENANCE TRAINING
Maintenance training is acknowledged as critically important if industry is to apply adequate maintenance practices effectively and to support the effort of the modern enterprise to confront global competition through optimal use of its assets. Providing adequate maintenance training is a twofold issue:
Appropriate maintenance training curricula should be constructed by taking into consideration maintenance theories & practice, academic knowledge and industrial needs. Close collaboration between academics, professional trainers and industrialists is crucial in the development of this curriculum [2].
An acknowledged competence assessment and knowledge accreditation system should lead to recognised qualifications for maintenance personnel. An internationally recognised accreditation system would facilitate the recognition of maintenance competences. This in turn would facilitate personnel mobility, as personnel would carry acknowledged qualifications while industry would be aided in filling maintenance posts with adequately trained personnel. In the EU, the main effort towards establishing a common accreditation scheme is through the EFNMS (European Federation of National Maintenance Societies) recommendations on maintenance competencies [3].
As a result, a number of recent European initiatives seek to provide up-to-date and adequate maintenance training [1]. Most of them are funded by the European Commission Vocational Education and Training (VET) funding programme, Leonardo da Vinci. The nature of maintenance training can vary considerably depending on the targeted learner group. Maintenance management training has very different requirements compared to the training of maintenance technical personnel. In all cases training can be delivered more efficiently by employing additional tools rather than relying on conventional training alone. There have been several examples of employing advanced technology tools to deliver maintenance training. When the target group is maintenance technical personnel, a key learning outcome is the ability to perform maintenance tasks. Naturally, on-the-job training is well suited to this purpose. Nonetheless, it is often impractical and in most cases too expensive. One of the most effective technologies is Augmented Reality (AR), which provides “real-time” assistance to engineers. AR offers a way to deliver problem-based maintenance training without the cost of on-the-job training, which in cases such as aircraft maintenance [4] [5], large scale factory maintenance [6] or maintenance of power facilities [7] can be very expensive. When aiming at delivering Maintenance Management training, a number of factors differentiate this type of training from the training appropriate for industrial workers:
The primary occupation of Maintenance Managers is the planning of the Maintenance procedures and observation of their correct application throughout the industrial system. Details about specific machine particularities are almost always out of their scope of interest.
Trainees in Maintenance Management have to be constantly aware of the latest advancements and trends in the area of industrial maintenance and to be ready to apply them in practice. Excellent grasp of Maintenance Management concepts and strategies is a necessary ability.
Their Maintenance training must harmonize with international standards and guidelines for maintenance management practice, such as the EFNMS Competence Requirement Specification.
To be able to perform successfully in their role, personnel involved in maintenance management must be prepared to operate in a global competition environment; thus they must be able at any time to take quick and informed decisions on the best applicable maintenance policies.
In maintenance management training, e-learning can be a practical and efficient means to deliver the required training. Although the cost of developing an e-training solution is higher than that of conventional training, the lower cost of running the training, the flexibility offered to the trainees and the fact that e-learning can offer an interactive and engaging training experience make it appropriate for maintenance management training. Before describing the iLearn2Main e-training solution, we first look into some aspects of e-learning technology that make it appropriate for Maintenance Management training. 3
e-LEARNING FLEXIBILITY AND ADAPTATION
One of the key advantages of employing e-learning for maintenance management training is the flexibility it offers and the ease with which the delivery of training can be adapted to individual learners. Much of the adaptation capacity of Learning Management Solutions draws inspiration from an understanding of the way humans learn. Two key principles are involved in such a process:
Learners should be actively involved in learning, thus be motivated to learn.
People learn in different ways and at different rates. Customised, individual material leads to more efficient learning.
Reflecting on the way humans learn, three main learning theories have been influencing e-learning [8]:
Behaviourism that treats learning as a set of changes to the learner as he reacts to environmental events. Memorising and imitation are critical in this learning process. Focus is on the teacher, or the computer providing the carefully arranged material and not the learner.
Cognitive science that bases learning on attention, motivation, perception and other internal processes. It focuses on screen design and on human – computer interaction, where the teacher usually has the role of the facilitator or partner.
Constructivism that claims that learners construct their knowledge as they interact with and interpret their environment. The most important difference from the other theories is that the focus is on the learner and his actions rather than on the teacher or the teaching methods. Thus the aim is to provide stimuli and support for the users to construct their knowledge.
A number of recent studies indicate that e-learning has a rapidly growing impact on learning practice, at every level and in every aspect of education [9, 10]. The number of universities and tutors including e-learning in their curricula demonstrates the maturity e-learning has reached. It offers unique benefits over traditional methods of studying, such as lower costs, ubiquitous learning and independence from time and space limitations. Especially for learners beyond compulsory education and in VET, the mitigation of time and space limitations is an indisputable benefit. Personalisation is the latest trend in e-learning and generally relates to the ability of a Learning Management System to deliver the content best suited to each individual user [8]. Personalisation can thus be seen as a way of tailoring learning content to each separate user: a learning system that supports it should be able to identify the learner's educational needs and select the most appropriate learning material for delivery from a multitude of redundant resources. More generally, adaptation is the ability of an e-learning system to change the delivered training material, at both the navigational and the content level, to best fit each user's needs [11]. Adaptation can be distinguished into four levels [12] (a brief illustrative sketch follows the list):
Navigation adaptation that customises links by generating, hiding, annotating and ranking them.
Content adaptation that customises content by hiding or providing extra or different versions of content.
Presentation adaptation that changes the way content is delivered, by highlighting, adding, removing or sorting parts of it.
Collaboration adaptation that bases customisation on collaboration preferences between the learners, and supports cooperative problem solving.
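To make the notions of adaptation rules and learner profiles concrete, the following minimal Python sketch shows how stored performance data could drive content, presentation and navigation adaptation. The profile fields, pass mark and course names are illustrative assumptions and do not correspond to the iLearn2Main or Moodle data model.

```python
from dataclasses import dataclass, field

@dataclass
class LearnerProfile:
    # Hypothetical profile fields; a real LMS profile stores many more attributes.
    learner_id: str
    quiz_scores: dict = field(default_factory=dict)  # lesson -> score in [0, 1]
    preferred_media: str = "text"                    # e.g. "text" or "video"

def adapt_lesson(profile: LearnerProfile, lesson: str) -> dict:
    """Return simple content, presentation and navigation adaptations for one lesson."""
    score = profile.quiz_scores.get(lesson)
    passed = score is not None and score >= 0.6      # assumed pass mark
    return {
        # Content adaptation: offer an extended variant to weaker performers.
        "content_variant": "standard" if passed else "extended",
        # Presentation adaptation: honour the stored media preference.
        "presentation": profile.preferred_media,
        # Navigation adaptation: hide the link to the next lesson until this one is passed.
        "show_next_link": passed,
    }

profile = LearnerProfile("trainee-01", quiz_scores={"maintenance-strategies": 0.45})
print(adapt_lesson(profile, "maintenance-strategies"))
```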
Personalisation and adaptation base their efficiency on the correct identification of the learner's learning needs. This identification is closely intertwined with the notion of Learner Profiles. A learner profile is a standardised way to store all the important information about a learner in one convenient and searchable place. This information usually includes the learner's preferences, goals, previous knowledge of a subject, general knowledge, achievements, performance, and anything else that could help an automated system or a tutor decide on the appropriate learning material for that learner [13]. Another research area aimed at improving the e-learning experience deals with Learning Styles. The identification of Learning Styles is based on cognitive theoretical approaches that attempt to classify users into different categories depending on the most efficient way for them to learn. A Learning Style can thus be defined as a set of characteristic cognitive, affective and physiological factors that indicate how a person learns and interacts in a learning environment [12]. Categorising users according to their Learning Styles is also presented as an efficient way to implement customisation without an unmanageable number of differentiated parameters for each user.
The iLearn2Main toolkit seeks to exploit LMS adaptation concepts in order to deliver the right training and competence assessment content and tools to the targeted groups, i.e. those involved in maintenance management training, both trainers and trainees. It is of interest to assess the potential impact and acceptance likelihood of an e-learning solution for delivering maintenance management training. A survey of 70 professionals involved in the maintenance function in the UK, Sweden, Greece, Latvia and Romania was conducted as part of the iLearn2Main project. The computer literacy of the interviewees is worth noting: an especially high proportion (94.5%) use a computer on a daily basis and consider themselves “very much” familiar with computers (81.8%). Furthermore, 100% of the interviewees responded that they expect to benefit “much” (40%) or “very much” (60%) from a computer based automated learning platform; there were no negative or indifferent replies to this question. These responses bode well for the potential future use of the iLearn2Main platform and its acceptance prospects. The next section provides an overview of the LMS system that the iLearn2Main toolkit is based upon. 4
MOODLE LEARNING MANAGEMENT SYSTEM
Moodle is an acronym for “Modular Object-Oriented Dynamic Learning Environment”. By its core definition, Moodle is an open-source Learning Management System which aims at stimulating users to explore its pages, interact with the learning material and communicate with the teachers and other learners. Moodle has been the selected deployment platform of iLearn2Main training system for a series of reasons:
Moodle has been built on top of a relatively recent educational theory called social constructivism. While most other LMSs have been designed around technological rather than educational concepts, Moodle has pedagogy at its core [14]. Constructivism claims that we learn more efficiently when we construct knowledge artefacts for others; “social” denotes that this construction of knowledge is even more efficient when performed collaboratively. The widespread acceptance and positive reactions from the academic community suggest that this is, at the least, a successful theory.
Learners are free to navigate through the material and choose the courses that they need. The learning environment can guide them by providing feedback and making recommendations on the courses that may be necessary for them to take. This, in conjunction with a learner enrolment system gives the course administrator or the tutor adequate control on which learners are attending specific courses.
Open source. This results in exceptional adaptability and transferability [3]. It is a significant advantage, as it enables easy expansion, amendment and porting of courses to different environments, whereas a system built on proprietary software is always bound to the decisions of the software vendor.
Huge user base. Registered Moodle based sites and users have grown exponentially, thus indicating a platform with positive outlook. It also designates the existence of an active support community. Furthermore it ensures multilingual support, which is necessary for the wider reach of the training. Moodle already includes language packs for more than 70 languages.
Minimal requirements. Moodle has very limited resource requirements, which supports transferability, as lessons can be delivered in a platform-independent way with few hardware restrictions.
Moodle classifies all learning content into two large and distinct categories [15]:
Resources. These are the static material that learners can read or attend but cannot interact with. They include Labels, Web pages, Text pages, File Directories and Links to files.
Activities. This is the interactive course material, where learners can answer questions, upload files and communicate. They include Assignments, Choices, Journal, Lessons, Quizzes, Surveys, Wikis, Workshops, Chats, Forums and Glossaries. Some of the above further encourage collaboration between users, such as Forums, Wikis and Chats.
Course construction is relatively simple, enabling easy creation of taught curricula. Overall, Moodle provides a highly customizable environment. This customization is based on self-contained, customizable entities called blocks, which can be enabled, disabled and moved around the user screen by an administrator. The most commonly used standard blocks are Activities, Administration, Calendar, Course/site description, Courses, Latest news, Main menu, Online users, People, Recent Activity, Site Administration and Search Forums [15]. It is worth mentioning that, although the basic Moodle installation contains a full set of blocks, a vast number of additional blocks is provided by other members of the open-source Moodle community.
Learners' track record and performance data can also be stored and handled by authorised users through detailed activity logs: Moodle keeps analytic logs for every type of action that occurs inside the LMS, and the tracking data related to a particular user can shape an individual learning profile. While it is possible to define a learning path for a user, social constructivism indicates that it is preferable to provide feedback and guidance, thus influencing rather than forcing the learning pathways. Moodle manages users by defining user roles. Every user is a member of one or more roles. Roles can be defined dynamically and determine exactly the rights of the user inside every part of the environment. There are a number of predefined roles to facilitate quick setup of courses, such as administrator, teacher, course creator and learner, but administrators can define new roles with more refined rights. It is worth noting that these roles are completely dynamic, meaning that a learner in one course can be a teacher in another, or have the right to create courses on a topic he knows well.
Regarding interoperability, course content in Moodle is pure HTML, which means that the core material is kept in a form completely independent of hardware and software limitations. Other pieces of information, such as user personal data and qualifications, can be stored inside the environment's database. From the e-learning perspective, an important consideration is conformance to Learning Object standards such as SCORM [16]. SCORM stands for Sharable Content Object Reference Model and was developed in 1997 by the Advanced Distributed Learning initiative (ADL), a joint White House / U.S. Department of Defense initiative. SCORM incorporated the best parts of pre-existing e-learning standards bodies, such as IMS, AICC, ARIADNE and IEEE-LTSC, into a new model [17]. In SCORM, Sharable Content Objects (SCOs) are defined as the smallest logical instructional units and are considered the building blocks for learning content. The most recent SCORM version is the 2004 4th edition, available since April 2009. SCORM compliance will lead to fully independent, interoperable Learning Objects which contain all the information necessary for deployment in different setups. Full compliance with SCORM is a primary goal for Moodle version 2.
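The dynamic, per-course role model described above can be pictured with a small sketch. The role names, capabilities and enrolment structure below are a simplified illustration, not Moodle's actual data model or API.

```python
# Simplified illustration of per-course roles: the same user can be a learner
# in one course and a teacher in another. Not Moodle's real schema or API.
ROLE_CAPABILITIES = {
    "administrator": {"create_course", "edit_content", "view_content", "grade"},
    "teacher":       {"edit_content", "view_content", "grade"},
    "learner":       {"view_content", "attempt_quiz"},
}

# course -> {user: role}  (hypothetical course and user names)
enrolments = {
    "maintenance-management-101": {"nikos": "teacher", "maria": "learner"},
    "condition-monitoring":       {"nikos": "learner", "maria": "teacher"},
}

def can(user: str, course: str, capability: str) -> bool:
    """Check whether a user's role in a given course grants a capability."""
    role = enrolments.get(course, {}).get(user)
    return role is not None and capability in ROLE_CAPABILITIES[role]

print(can("nikos", "maintenance-management-101", "grade"))   # True
print(can("nikos", "condition-monitoring", "grade"))          # False
```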
5
ILEARN2MAIN LEARNING ENVIRONMENT
The targeted user group in iLearn2Main is people involved, or aiming to become involved, in Maintenance Management. An assessment of VET objectives was completed, taking into account a user survey of 70 stakeholders, and on this basis a Maintenance Management Training Curriculum was defined (Figure 1). The iLearn2Main training toolkit offers an integrated environment that supports trainees and trainers in enrolling and participating in e-training and e-assessment for Maintenance Management competences. The training modules have been developed and deployed on a Moodle platform that was set up and customised to fit the needs of Maintenance Management training, as specified in the VET objectives.
Figure 1. iLearn2Main Maintenance Management Learning Curriculum
The Learning system resides inside the project site, which is accessible at www.ilearn2main.eu. When a user first arrives at the project site, he is presented with a list of the offered learning courses. If the user selects a course, or clicks on the login hyperlink, he is presented with a login form where he has to enter his credentials. Inside a course, under the course title, there are the different parts that comprise a Moodle course (Figure 2):
Course modules at the center of the screen: these include Lessons, Glossaries and References. This is the learning content.
Links to other participants in the course to facilitate communication.
Links to activity types in the course for easier navigation.
Direct access to the learner's grades for every type of quiz or exercise included in the course.
List of all other courses the learner has enrolled in.
Latest news and events relevant to the course, i.e. uploading of a new module.
Figure 2. An iLearn2Main course page
Figure 3. Part of an ILearn2Main course
As the learner progresses through the course screens (Figure 3), comprehension questions are offered (Figure 4). The learner has to answer them to continue with the rest of the lesson. Multiple choice questions appear with a shuffled answer order and help learners assess their knowledge before continuing. According to [8], immediate feedback is three times more efficient than delayed feedback in terms of learning time, so the system provides immediate feedback on the given answers. When the learner reaches the final page of a lesson, he is presented with a message indicating his performance on the comprehension questions and offering options for the next steps (Figure 5). Performance data can also be used to offer specific learning path suggestions, for example advising the learner to take a specific course, repeat the same course content or progress to the next item.
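One possible form of such a performance-driven suggestion rule is sketched below; the pass and retry thresholds are assumed values for illustration and are not taken from the iLearn2Main configuration.

```python
def suggest_next_step(correct: int, attempted: int,
                      pass_threshold: float = 0.8,
                      retry_threshold: float = 0.5) -> str:
    """Map comprehension-question performance to a learning-path suggestion.

    Thresholds are illustrative assumptions, not project-defined values.
    """
    if attempted == 0:
        return "complete the comprehension questions before moving on"
    ratio = correct / attempted
    if ratio >= pass_threshold:
        return "proceed to the next lesson"
    if ratio >= retry_threshold:
        return "review the highlighted sections, then retake the questions"
    return "repeat the lesson before attempting the questions again"

print(suggest_next_step(correct=4, attempted=10))
```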
Figure 4. Comprehension question during a lesson
Figure 5. Learner results at the end of a lesson
Inside a course a user can find not only complete lessons but also References, in the form of a separate web page, which is convenient both for direct linking from the courses and as a collective reference for external material. Finally, learners can consult the course glossary, with definitions for all the maintenance terms used inside the course. This glossary is integrated with the learning content so as to provide direct and easy access to any of its terms. The terms are automatically linked everywhere they appear in the lessons and make up a full, analytical reference guide of maintenance terms, conveniently located to help trainees access them when studying the training content. Quite often in maintenance practice there is an evident lack of a common vocabulary among professionals involved in Maintenance Management. In recognition of the need to support the spread of such a common vocabulary, a Maintenance Management e-Glossary has been developed. The development is based on the established European Standard on Maintenance Terminology (CEN EN 13306), while additional terms have been included as needed for each training course. The end result is a comprehensive glossary with terms interlinked with every course. A page with glossary definitions is shown in Figure 6.
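The automatic linking of glossary terms to the lesson text can be imagined along the lines of the following sketch; the regular-expression approach and the URL pattern are assumptions made for illustration and are not the mechanism Moodle actually uses.

```python
import re

def link_glossary_terms(html: str, glossary: dict) -> str:
    """Wrap each known glossary term in a link to its definition page.

    `glossary` maps a term to the (hypothetical) URL of its definition.
    Matching is case-insensitive and on whole words only.
    """
    for term, url in glossary.items():
        pattern = re.compile(r"\b(" + re.escape(term) + r")\b", re.IGNORECASE)
        html = pattern.sub(rf'<a href="{url}">\1</a>', html)
    return html

glossary = {"preventive maintenance": "/glossary#preventive-maintenance"}
print(link_glossary_terms("Plan preventive maintenance early.", glossary))
```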
Figure 6. An ILearn2Main glossary page.
6
COMPETENCE ASSESSMENT
While the delivered learning material helps learners expand their knowledge and acquire the required level of technical and theoretical background, an assessment tool is necessary to evaluate learner performance and the impact of the teaching solution. Furthermore, an assessment tool based on an e-learning platform and targeted at Maintenance Managers has some desirable characteristics (a schematic sketch of the first two follows the list):
First, it should be fully automated. Freedom from any kind of class commitment or time restriction has already been mentioned as a major benefit of our solution, providing a flexible training alternative for Maintenance Managers, and the same should hold for assessment tests. Our implementation offers completely automated competence assessment, while keeping detailed data about users' attempts and grades in case it becomes necessary for a human tutor to check learners' performance.
Second, it should provide useful feedback on the learner's performance, so that learners can learn from their mistakes and perfect their knowledge. An automated learning tool should point the user in the right direction and clarify his mistakes or misunderstandings. Our system provides immediate feedback on user answers; it also identifies wrong answers and failure to reach the desired competence level, and directs the user to the relevant theory chapters.
Third, it should be completely separate and independent from the e-learning content. The need for a clear separation between training and certification procedures is stated in the standard EN ISO/IEC 17024:2003, in order to ensure confidentiality of the results [1]. Accreditation and evaluation of learners' knowledge is appropriately undertaken by bodies separate from those offering training. It is therefore important that the competence assessment tool is separate from the e-learning, as it is intended to be used by different users. The assessment content can be updated and improved in the future to cover the competence assessment requirements better; however, the assessment tests should cover all the required knowledge areas sufficiently right from the beginning.
Last, conformance to a widespread maintenance standard is necessary, since we primarily aim at exploiting our learning platform at an international level of assessment and accreditation in industrial maintenance. Adherence to the European Standard on Maintenance Terminology, namely CEN EN 13306, and to the EFNMS Guidelines on Maintenance Competence Assessment was the first important step in this direction.
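A minimal sketch of the first two requirements, automated grading with immediate feedback that points back to the relevant theory, is given below; the questions, grading scale and chapter names are invented for illustration.

```python
# Illustrative automated assessment: grade multiple-choice answers and point
# the learner back to the relevant theory chapter for every wrong answer.
QUESTIONS = {
    "q1": {"correct": "b", "chapter": "Maintenance strategies"},
    "q2": {"correct": "d", "chapter": "Condition monitoring"},
}

def grade(answers: dict) -> dict:
    feedback = {}
    correct = 0
    for qid, spec in QUESTIONS.items():
        if answers.get(qid) == spec["correct"]:
            correct += 1
            feedback[qid] = "correct"
        else:
            feedback[qid] = f"wrong - revise chapter: {spec['chapter']}"
    return {"score": correct / len(QUESTIONS), "feedback": feedback}

print(grade({"q1": "b", "q2": "a"}))
```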
An example of a multiple choice quiz related to the condition monitoring course can be seen in Figure 7, while Figure 8 is a typical results page from an assessment test, where we can see the calculated grades.
Figure 7. Questions within assessment test.
Figure 8. Results of an assessment test. 7
CONCLUSION
This paper presented the development of an e-learning toolkit for Maintenance Management training. Learners involved in such training are usually either people before entering their working life or – more often – people during their working life. The latter usually have to operate under time and space constraints, making conventional training impractical. E-Training offers a flexible and adaptable solution for Maintenance Management training. The iLearn2Main project, an EU initiative to establish IT tools for Maintenance Management training and competence assessment, is a concerted effort in this direction. The project results include a Learning Toolkit for delivering training content in a flexible and interactive way and an e-Assessment tool for automated competence assessment. Based on an open source platform, the iLearn2Main toolkit is a flexible and
expandable training platform that can be employed for web-based e-Training in Maintenance Management. Evaluation of the platform in 5 EU countries is currently under way. 8
REFERENCES
1
Franlund, J. (2008) Some European Initiatives in Requirements of Competence in Maintenance. In Proc. of CM-MFPT 2008, 5th Int. Conf. on Condition Monitoring & Machinery Failure Prevention Technologies. Edinburgh, UK.
2
Bakouros, Y. and S. Panagiotidou. (2008) An analysis of maintenance education and training needs in European SMEs and an IT platform enriching maintenance curricula with industrial expertise. In Proc. of CM-MFPT 2008, 5th Int. Conf. on Condition Monitoring & Machinery Failure Prevention Technologies. Edinburgh, UK.
3
Emmanouilidis, C., N. Papathanassiou, and A. Papakonstantinou. (2008) Current trends in e-training and prospects for maintenance vocational training. In Proc. of CM-MFPT 2008, 5th Int. Conf. on Condition Monitoring & Machinery Failure Prevention Technologies. Edinburgh, UK.
4
Haritos, T. and D. Macchiarella. (2005) A mobile application of augmented reality for aerospace maintenance training. In The 24th Digital Avionics Systems Conference, 2005. Washington, DC: IEEE.
5
Christian, J., et al., (2007) Virtual and Mixed Reality Interface for e-Training: Examples of Applications in Light Aircraft Maintenance, in 4th International Conference on Universal Access in Human-Computer Interaction, UAHCI 2007 Held as Part of HCI International 2007, Part III. Springer Berlin /Heidelberg: Beijing, China. 520-529.
6
Schwald, B. and B. Laval, (2003) An Augmented Reality System for Training and Assistance to Maintenance in the Industrial Context. Journal of WSCG, Plzen, Czech Republic, 11(1).
7
Nakajima, C. and N. Itho. (2003) A Support System for Maintenance Training by Augmented Reality. In Proceedings of the 12th International Conference on Image Analysis and Processing (ICIAP’03). Mantova, Italy: IEEE.
8
Woolf, B.P., (2009) Building Intelligent Interactive Tutors. Morgan Kauffman.
9
Jones, N. and J. O'Shea, (2004) Challenging hierarchies: The impact of e-learning. Higher Education, 48(3), 379-395.
10
Huddlestone, J. and J. Pike, (2007) Seven key decision factors for selecting e-learning. Cognition, Technology & Work, 10(3), 237-247.
11
Dolog, P., et al. (2004) Personalisation in distributed e-learning environments. In Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters (WWW2004). New York, USA: Association for Computing Machinery (ACM).
12
Popescu, E., P. Trigano, and C. Badica, (2007) Adaptive Educational Hypermedia Systems: A Focus on Learning Styles, in EUROCON, 2007. The International Conference on Computer as a Tool, IEEE: Warsaw, Poland. 2473-2478.
13
Dolog, P. and M. Schaefer. (2005) A Framework for Browsing, Manipulating and Maintaining Interoperable Learner Profiles. In Proc. of UM2005 - 10th International Conference on User Modeling. Edinburgh, UK: Springer Berlin / Heidelberg.
14
Cole, J.R., (2008) Using Moodle. 2nd ed. Farnham: O'Reilly. xiii, 266.
15
Rice, W.H., (2006) Moodle e-learning course development : a complete guide to successful learning using Moodle. Packt Pub.: Birmingham, U.K. p. ix, 236.
16
Kazi, S.A. (2004) A Conceptual Framework for Web-based Intelligent Learning Environments using SCORM-2004. In Proceedings of the IEEE International Conference on Advanced Learning Technologies. Joensuu, Finland: IEEE.
17
Dodds, P. and S.E. Thropp, eds. (2006) SCORM® 2004 3rd Edition Content Aggregation Model (CAM) Version 1.0. ADL.
Acknowledgment The authors wish to acknowledge the financial support received through the UK/07/LLP-LdV/TOI-004 project iLearn2Main, which is a collaboration of the Univ. of Portsmouth, the Athena Research & Innovation Centre, ATLANTIS Engineering, the Swedish Maintenance Society UTEK, the Latvia Technology Park and the CNIPMMR National Council of Small and Medium Sized Private Enterprises.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
A RISK BASED INSPECTION (RBI) PREVENTIVE MAINTENANCE PROGRAMME: A CASE STUDY By P.N. Botsaris, A. D. Naris, G. Gaidajis [email protected], [email protected], [email protected] Department of Production Engineering and Management, Democritus University of Thrace, School of Engineering, 67100, Xanthi, Greece, Kimmeria campus The present study deals with the Risk Based Inspection (RBI) method, a method for planning preventive maintenance and inspection. The RBI method aims to create an inspection and maintenance programme by taking into consideration the probability of failure of the equipment and the consequences that result from such a failure. There are three approaches to the application of the method, depending on the desired level of precision: qualitative, semi-quantitative, and quantitative. After the approaches of the method are analysed, a case study of the qualitative and semi-quantitative approaches is presented for a heat exchanger (E102) in the ammonia unit of the Phosphoric Fertilizers Industry (P.F.I.) S.A., located in Nea Karvali, Kavala, Greece. Key Words: preventive maintenance, inspection, equipment, level of precision, approaches of method, risk based inspection (RBI). 1
INTRODUCTION
The method that dominated until a few years ago for programming the inspection and maintenance of the equipment of most industries was based on a logbook: a programme is kept that has been written on the basis of general directions from the manufacturers and the experience of the engineers and technicians in charge. Nevertheless, this method is rigid, since it does not incorporate feedback, and it does not provide the risk of failure of the various parts of the equipment as a measurable quantity. As a consequence, the risk of the various parts of the equipment is not exactly known, and the priorities set during inspection and maintenance are not fully documented, but rest on the knowledge and experience of the industry's engineers and technicians. Nowadays, owing to increased competitiveness and globalisation, all costs, including the cost of inspection and maintenance, should be compressed while the benefits are maximised. The aim of this study is to suggest a new method of programming inspections and maintenance, which should have the following characteristics: - First, with its application, the risk of each separate piece of equipment should become a measurable quantity, so that comparisons can be made and the funds available for inspection and maintenance distributed in the best possible way. - Second, in case the findings demand it, it should suggest shutting down the unit so that repair or replacement takes place, in order to avoid failure of the equipment during operation. - Third, because this method is going to be applied to chemical industries, the consequences of a failure for the health of the workers as well as for the integrity of the equipment should also be taken into account.
- Fourth, in case a change in the operating conditions of a process is suggested (e.g. a higher pressure or operating temperature), it should be possible to estimate quantitatively the impact of this change on the risk of the equipment. This is a very important tool for the people in charge when deciding whether such a change is in their interest, because if the suggested change relocates the equipment in question to a high risk area, the result would be a higher frequency of inspections and maintenance so as to avoid failure during operation. The cost of each choice can thus be estimated and decision making becomes easier. - Fifth, the method should be dynamic: the findings of the inspections and the maintenance should be analysed so that the position of the various parts of the equipment on the risk chart changes, and with it the time span between inspections and maintenance. In this way the new data provide feedback to the system set up for the integrity of the equipment. - Sixth, all damage mechanisms acting on the equipment should be identified and their negative impact on the integrity of the equipment quantified. It should be noted that the results of the damage mechanisms are not always linear in time. The method that can satisfy all of the above is the programming of inspections and maintenance based on the risk of failure (Risk Based Inspection), which has been applied by the biggest chemical industries internationally. The correct application of the method requires knowledge of all the damage mechanisms that act on the equipment in the order in which they appear. In addition, the inspection techniques applied should be the appropriate ones, the corresponding instruments should be accurate, and they should be operated by adequately trained people. RBI (Risk Based Inspection) considers that the risk of a piece of equipment depends on the probability of failure and the consequences that result from this failure, i.e.: Risk = Failure Probability x Failure Consequences (1). As shown in Figure 1, the horizontal axis measures the consequences of failure while the vertical axis measures the probability of failure. The top right corner of the chart contains equipment considered to be high risk, so there the time span between inspections and maintenance should be small, while the bottom left corner contains equipment considered to be low risk, so there the time span between inspections and maintenance can be large. From this lengthening of time spans comes the obvious financial benefit for the company that applies the RBI method. The hidden financial benefit, which is also very important, derives from the funds the company saves by avoiding failure of the equipment during operation. Such failures create both direct and indirect damage to the company, related to: production loss, environmental restoration cost, cost of repairing or replacing damaged equipment, compensation cost, rise in insurance fees, a bad public image (indicating a company that values nothing above short term profit), and legal fees. There are three ways to approach the RBI method depending on the degree of elaboration required.
These approaches are the qualitative, the semi-quantitative, and the quantitative. The usual plan is to apply the qualitative approach to the entire equipment first, because its realization time is short. Afterwards, the semi-quantitative approach is applied to the equipment considered to be of medium or high risk. Finally, the quantitative approach, which is the most time consuming, is applied to specific parts of the equipment that require greater attention. Once the entire equipment is represented on the RBI chart, the inspection and maintenance programme is designed with the help of the technical module subfactor. Figure 2 shows the action chart for creating the programme based on the RBI method.
2 THE RBI PROCESS
As mentioned before, there are three approaches to the method, depending on the required level of detail and the realization time available. The qualitative approach is the first level of analysis and requires very good knowledge of the unit. It gives a preliminary but clear picture of the condition of the equipment with regard to the risk of failure. The failure probability factor is graded from 1 to 5 (1 low risk, 5 high risk), while the consequence factor is graded from A to E (A low risk, E high risk). The probability factor is estimated by taking into account the following factors: Equipment factor, Damage factor, Inspection factor, Current Condition factor, Process factor and Mechanical Design factor.
Figure 1 The RBI matrix
The highest score that the probability factor can achieve is 87 points. The damage factor carries the greatest weight of the factors above, since the highest score it can achieve is 37 points, that is 42.7% of the total score of the probability factor. Table 1 shows how the probability category is determined from the score achieved. To find the consequence category, the consequence factor related to the health of people and the consequence factor related to the integrity of the equipment are both estimated, and the factor with the highest score is chosen. The consequence factor related to the integrity of the equipment has a maximum of 98 points and is estimated by taking into account the following factors: Chemical factor, Quantity factor, State factor, Auto-ignition factor, Pressure factor, Credit factor and Damage potential factor. Of these, the damage potential factor carries the greatest weight, since it can reach 50 points, that is 51% of the total score. Table 2 shows how the consequence damage category is determined from the total score achieved with regard to the integrity of the equipment. The consequence factor related to the health of people has a maximum of 70 points and is estimated by taking into account the following factors: Toxic quantity factor, Dispersibility factor, Credit factor and Population factor. Of these, the toxic quantity factor carries the greatest weight, since it can reach 55 points, that is 78.5% of the total score. Table 3 shows how the consequence health category is determined from the total score achieved with regard to the health of people. Finally, the consequence category with the worst score is selected (a short reading-aid sketch follows Tables 1–4). The semi-quantitative approach is the second level of analysis and should be applied to the equipment characterized as of medium or high risk. It is more analytical than the qualitative approach, since it takes more parameters into consideration, and it is more accurate in quantifying the change in failure risk when some operational characteristic is altered. It is based on the estimation of the technical module subfactor for finding the failure probability, which is also used for the creation of the inspection and maintenance schedule (Table 4).
Figure 2 Flow chart for organizing the application of the RBI
Table 1 Determination of Likelihood Category
Probability Factor    Probability Category
0 – 15                1
16 – 25               2
26 – 35               3
36 – 50               4
> 51                  5
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p A5.
Table 2 Determination of Consequence Damage Category for qualitative analysis
Total Score    Consequence Category
0 – 19         A
20 – 34        B
35 – 49        C
50 – 79        D
> 80           E
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p A8.
Table 3 Determination of Consequence Health Category for qualitative analysis
Total Score    Consequence Category
< 10           A
10 – 19        B
20 – 29        C
30 – 39        D
> 40           E
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p A10.
Table 4 Determination of Probability Category for semi-quantitative analysis
Probability Category    Total Score
1                       < 1
2                       1 to 10
3                       10 to 100
4                       100 to 1000
5                       > 1000
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p B1.
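The category determinations of Tables 1–3 amount to simple threshold look-ups, combined by keeping the worse of the two consequence categories, as described above. The sketch below is a reading aid only, with illustrative input scores; it is not the full API 581 qualitative procedure.

```python
def probability_category(prob_score: int) -> int:
    # Table 1: probability factor score -> category 1 (low) to 5 (high)
    for limit, cat in [(15, 1), (25, 2), (35, 3), (50, 4)]:
        if prob_score <= limit:
            return cat
    return 5

def damage_consequence_category(score: int) -> str:
    # Table 2: equipment-damage consequence score -> category A (low) to E (high)
    for limit, cat in [(19, "A"), (34, "B"), (49, "C"), (79, "D")]:
        if score <= limit:
            return cat
    return "E"

def health_consequence_category(score: int) -> str:
    # Table 3: health consequence score -> category A (low) to E (high)
    for limit, cat in [(9, "A"), (19, "B"), (29, "C"), (39, "D")]:
        if score <= limit:
            return cat
    return "E"

def consequence_category(damage_score: int, health_score: int) -> str:
    # The worse (higher) of the two consequence categories is retained.
    return max(damage_consequence_category(damage_score),
               health_consequence_category(health_score))

# Illustrative scores only: probability factor 30, damage 45, health 22.
print(probability_category(30), consequence_category(45, 22))
```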
The consequence of failure category is determined on the basis of the total area that will be affected, according to Table 5. The affected area ranges from zero to more than 10,000 square feet, in which case it belongs to category E and the consequences of any equipment failure are very serious. A critical role is played by the rate of release of the fluid from the equipment, the phase of the released fluid, the nature of the fluid and the duration of the release. There are two categories of consequences, those due to flammability and those due to toxicity; whichever of the two is found to have the more adverse consequences is the one chosen. The quantitative approach of the RBI method is the third level of analysis and is applied to the equipment that has been charted in the very high risk area. It is the most analytical of the three approaches and consequently provides the most accurate results, but its realization time is much longer than for the other two. For this reason this approach is applied only to selected equipment. The failure probability is determined according to equation (2): Failure Probability = Generic Failure Frequency x Equipment Modification Factor x Management Systems Evaluation Factor (2), where: • Generic Failure Frequency: there is one for every kind of equipment, taken from the analysis of previous occurrences, as shown in Table 6. • Equipment Modification Factor: a considerable number of parameters must be estimated to determine it, of which the one with the greatest weight is the technical module subfactor. • Management Systems Evaluation Factor: it is estimated with the help of a logarithmic chart after completing a check list that grades a company on the existence and application of procedures concerning production, the inspection department, the control department and the personnel department. After the failure probability is estimated, the consequences of equipment failure are estimated. The consequences are divided into two categories: consequences due to fire, and consequences due to release of toxic fluid. Of the two, the one taken into consideration is the one with the more adverse consequences, which are estimated in ft², meaning the total area affected by the consequences of the failure. Finally, after all the above have been estimated, the equipment is charted on the RBI chart. After the application of the approaches of the RBI method is completed for all the vessels and the piping of an industrial unit, all the vessels are charted on one RBI chart and the piping on another. In this way all the dangerous parts of the equipment can be seen at a glance. The method should be supported by a program that provides feedback, so that equipment can be relocated from the area of low risk to that of high risk, or vice versa, depending on the results of the inspections and maintenance. For the creation of a programme of inspections and maintenance, the technical module subfactor should be estimated. This subfactor is estimated in a different way according to the damage mechanism acting on the equipment.
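Equation (2) is a simple product of three factors. The sketch below shows the arithmetic with placeholder inputs; the numerical values are not taken from API 581 or from the case study.

```python
def quantitative_failure_probability(generic_failure_frequency: float,
                                     equipment_modification_factor: float,
                                     management_evaluation_factor: float) -> float:
    """Equation (2): adjusted failure probability for one piece of equipment."""
    return (generic_failure_frequency
            * equipment_modification_factor
            * management_evaluation_factor)

# Placeholder inputs: a generic leak frequency of 1e-4 per year, an equipment
# modification factor of 3 and a management systems evaluation factor of 0.8.
print(quantitative_failure_probability(1e-4, 3.0, 0.8))  # about 2.4e-4 per year
```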
After estimating the technical module subfactor, the time of the next inspection or maintenance is defined with the help of Table 7. It should be noted that, depending on the damage mechanism, the inspection techniques are divided into five effectiveness categories. The goal during scheduling is to keep the technical module subfactor as low as possible, so as to reduce the failure risk of the equipment. Furthermore, the required inspection effectiveness grade should be as low as possible, so as to reduce cost, and for the same reason the time span between inspections and maintenance actions should be as small as possible.
3 APPLICATION
In order to assess how easy or difficult the method is to apply, it was applied to a heat exchanger (E102) situated in unit 100 of the ammonia plant of the Phosphoric Fertilizers Industry in Nea Karvali, Kavala. Unit 100 is where the reforming of natural gas takes place, so that the product, pure H2, can react with nitrogen (N2) for the production of ammonia (NH3). Both the qualitative and the semi-quantitative approaches were applied to this exchanger in order to map it on the RBI risk chart.
In both cases (i.e. with both the qualitative and the semi-quantitative approach), the equipment was found to belong to the medium risk area. It should be noted that the damage mechanism affecting this equipment is High Temperature Hydrogen Attack (HTHA). For this mechanism to occur, the presence of hydrogen at high temperature and pressure (temperature > 400°F and pressure > 80 psia) over a long period of time is required. The result of this mechanism is the appearance of small cracks inside the material, the growth of which can lead to larger cracks and ultimately to failure of the equipment.
Table 5 Determination of Consequence Category for semi-quantitative analysis
Consequence Category    Total Score
A                       < 10
B                       10 to 100
C                       100 to 1000
D                       1000 to 10000
E                       > 10000
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p B2.
The application of the semi-quantitative approach in particular made obvious the crucial effect of the operating characteristics on the risk position of the equipment. The RBI method thus constitutes a very useful tool when an alteration in operating conditions is proposed, because the change in the risk of the equipment becomes measurable, as does any change in the time span between inspections and maintenance. In this way the engineer in charge is able to compare the financial benefit of altering the operating parameters with the extra cost that might result from shortening the interval between inspections and maintenance. During the application of the semi-quantitative approach the technical module subfactor was found equal to 100. According to Table 7 this means that a Usually Effective inspection should be done every 6 years, or alternatively a Fairly Effective inspection every three years. Table 8 gives the definitions of the effectiveness of each inspection for this damage mechanism. It should be noted that the operating time of this exchanger is 19 years. To make the impact of time obvious, the following scenario is examined: if the operating time were 25 years, the technical module subfactor would reach 2000, which means that a Usually Effective inspection should be done every three years. When the factor reaches 10000, an immediate shutdown of the unit is required, together with preventive maintenance, because the risk of equipment failure during operation is very high. As far as the calculation of this factor is concerned, it depends on the type of damage mechanism, which also affects the way in which the effectiveness of the inspection is defined.
4 CONCLUSIONS
A survey showed that, on a world-wide scale, the most common method of programming inspections and preventive maintenance is Risk Based Inspection. The biggest chemical companies have invested money and scientific human resources in the development and evolution of this method; among them are PETROBRAS, BP, SHELL, PETRO-CANADA, SAUDI ARAMCO, DOW, MOBIL, SIEMENS, TUV ENGINEERING SERVICE, PLACID REFINING and TOTAL. All three approaches of the method have specific advantages and disadvantages. The basic direction for the application of RBI to an industrial unit is the following: first the qualitative approach is applied to all industrial vessels, so that an initial picture of the failure risk of the equipment is obtained from the charting. Next, the semi-quantitative
approach is applied to the equipment charted in the low and medium risk areas, which usually constitutes the vast majority of the equipment. Finally, the quantitative approach is applied to the small part of the equipment charted in the high risk area; as mentioned earlier, it is the most analytical and provides the most accurate results, but it is also the most time consuming because it requires the most calculations. Because of this disadvantage, applying it to the entirety of the equipment is not a plausible goal.
Table 6 Suggested Generic Equipment Failure Frequencies
Equipment type: Centrifugal Pump, single seal; Centrifugal Pump, double seal; Column; Compressor, Centrifugal; Filter; Fan Coolers; Heat Exchanger Shell; Heat Exchanger Tube Side; Piping, 0.75 in diameter per foot; Piping, 1 in diameter per foot; Piping, 2 in diameter per foot; Piping, 4 in diameter per foot; Piping, 6 in diameter per foot; Piping, 8 in diameter per foot; Piping, 10 in diameter per foot; Piping, 12 in diameter per foot; Piping, 16 in diameter per foot; Piping, >16 in diameter per foot; Pressure Vessels; Reactor; Reciprocating Pumps; Atmospheric tanks
Data Source
¼ in
Leak Frequency (per year for four hole sizes) 1 in 4 in Rupture
1
6x10-2
5x10-4
1x10-4
1
6x10-3
5x10-4
1x10-4
2
8x10-5
2x10-4
2x10-5
1x10-3
1x10-4
1
6x10-6
1 3
9x10-4 2x10-3
1x10-4 3x10-4
5x10-5 5x10-8
1x10-5 2x10-8
1
4x10-5
1x10-4
1x10-5
6x10-6
1
4x10-5
1x10-4
1x10-5
6x10-6
3
1x10-5
1x10-4
3x10-7
3
5x10-6
1x10-4
5x10-7
3
3x10-6
1x10-4
6x10-7
3
9x10-7
6x10-7
7x10-8
3
4x10-7
4x10-7
8x10-8
3
3x10-7
3x10-7
8x10-8
2x10-8
3
2x10-7
3x10-7
8x10-8
2x10-8
3
1x10-7
3x10-7
3x10-8
2x10-8
3
1x10-7
2x10-7
2x10-8
2x10-8
3
6x10-8
2x10-7
2x10-8
1x10-8
2 2
4x10-5 1x10-4
1x10-4 3x10-4
1x10-5 3x10-5
6x10-6 2x10-5
7
0.7
0.01
0.001
0.001
5
4x10-5
1x10-4
1x10-5
2x10-5
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p 8-3.
The API 510 code provides the directive that an industrial vessel should be internally inspected within the first 10 years of its operation. The only case in which this ten-year interval can be extended is if RBI is applied to the specific vessel and the calculated TMSF takes a value that allows the 10 years to be exceeded. More specifically, when the damage mechanism is High Temperature Hydrogen Attack and the TMSF takes values below 100, the next inspection can be done twenty years later. For this time extension to be granted, there must be certainty about the nature and extent of the damage mechanism affecting the equipment and, naturally, all the calculations must be based on parameters that remain stable. Table 7 summarises the required actions (a scheduling sketch follows the table).
Table 7 Actions required for H.T.H.A.
TMSF                     Actions                         Frequency of inspection and preventive maintenance
10000 <= TMSF            Preventive maintenance          ASAP
2000 <= TMSF < 10000     Usually effective inspection    Every three years
500 <= TMSF < 2000       Usually effective inspection    Every six years
                         Fairly effective inspection     Every three years
100 <= TMSF < 500        Usually effective inspection    Every twelve years
                         Fairly effective inspection     Every six years
10 <= TMSF < 100         Usually effective inspection    Every twenty years
                         Fairly effective inspection     Every twelve years
TMSF < 10                No need for inspection          No need for inspection
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p 9-15.
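Read as a scheduling rule, Table 7 maps the technical module subfactor to an inspection requirement. The sketch below encodes the mapping as reconstructed above; the pairing of intervals with TMSF bands should be checked against API 581 before any practical use.

```python
def htha_inspection_actions(tmsf: float) -> list:
    """Return the Table 7 actions for a given technical module subfactor (HTHA)."""
    if tmsf >= 10000:
        return ["preventive maintenance as soon as possible"]
    if tmsf >= 2000:
        return ["usually effective inspection every 3 years"]
    if tmsf >= 500:
        return ["usually effective inspection every 6 years",
                "fairly effective inspection every 3 years"]
    if tmsf >= 100:
        return ["usually effective inspection every 12 years",
                "fairly effective inspection every 6 years"]
    if tmsf >= 10:
        return ["usually effective inspection every 20 years",
                "fairly effective inspection every 12 years"]
    return ["no inspection needed"]

print(htha_inspection_actions(2000))  # the 25-years-in-service scenario from the case study
```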
Table 8 Inspection effectiveness guidelines for H.T.H.A.
Inspection Effectiveness Category    Typical Inspection Practices
Highly Effective                     None
Usually Effective                    Extensive Advanced Backscatter Technique (AUBT), spot AUBT based on stress analysis or extensive in-situ metallography
Fairly Effective                     Spot AUBT or spot in-situ metallography
Poorly Effective                     Ultrasonic backscatter plus attenuation
Ineffective                          Attenuation only
Reprinted from: The American Petroleum Institute, (2000), API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May, p I-3.
From the above it becomes obvious that, for the correct application of RBI, the procedure applied should be workable, user friendly and easy to comprehend, and should provide feedback as well as the possibility of recording the findings. For all of the above to be realised, the application of the RBI method should be supported by a computer program that offers both operability and savings in time and effort. In this way the findings of the inspections and the maintenance can be fully utilised and a database can be created which could be used in other industrial units as well. 5
REFERENCES
1
The American Petroleum Institute, (2000) API publication 581 Risk Based Inspection, base resource document, API Publication 1st edition, May.
2
Safety assessment federation, (1997) Guidelines on Periodicity of Examination, SAFed.
3
Confederation organism of control, (1999) Risk assessment: a qualitative and quantitative approach, HSE books.
4
American Petroleum Institute, (1997) API publication 510 Pressure Vessel Inspection Code, API Publication, 8th edition, June.
5
American Petroleum Institute, (2000) API publication 580 Risk Based Inspection, API Publication 1st edition, May.
6
Mr Wintle and B W Kenzie TWI, (1998) Best practice for risk based inspection as a part of plant integrity management, HSA books.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ASSESSING OCCUPATIONAL RISK FOR CONTACT WITH MOVING PARTS OF MACHINES DURING MAINTENANCE I.A. Papazoglou (a), O. N. Aneziris (a), M. Konstandinidou (a), L.J. Bellamy (b) and M. Damen (c)
(a) Systems Reliability and Industrial Safety Laboratory, National Center for Scientific Research "DEMOKRITOS", Aghia Paraskevi 15310, Greece
(b) WhiteQueen, NL-2130 AS Hoofddorp, Netherlands
(c) RIGO, Postbus 2805, 1000CV, Amsterdam, Netherlands
In this paper a methodology for managing occupational risk owing to contact with moving parts while maintaining machines is presented. The methodology is based on the principles of quantified risk assessment. A probabilistic model has been developed to assess the risk from contact with moving parts of machines while maintaining them as a function of working conditions and time exposure. Sixty-two other models have been developed to cover different hazards in all working activities. They have been developed under the Workgroup Occupational Risk Model (WORM - Metamorphosis) project, financed by the Dutch government. These models allow the delineation of accidents into sequences of events describing measures (technical and/or procedural) in place to prevent them or to mitigate their consequences. Identification of these sequences enables the identification of specific root causes of such accidents and hence the determination of specific and practical actions that can influence the probability of an accident. Quantification of these models provides, furthermore, a way of assessing the relative value of such measures and hence a basis for supporting decisions aiming at reducing the consequences of accidents in the work fields. Qualitative information on the safety functions, measures and barriers whose failure constitutes the causes of accidents has been derived from the analysis of actual accidents that occurred and were reported under Dutch law over a period of years. The information contained in the accident reports, required by Dutch law and composed by investigating labor inspectors, has been systematically analyzed. The results of this analysis provide a Bowtie-like visual aid of the various events that contribute to the accident. This information is then used to develop the logical models describing the logical interconnection of the various events corresponding to technical and/or procedural safety related measures and is amenable to quantification according to the laws of probability theory. The relative importance of each measure to risk reduction is depicted with the aid of sensitivity calculations. Optimization, that is minimization of risk under specific initial conditions and constraints, is achieved through alterations in the working conditions.
Key Words: Occupational risk, moving parts, machine, logical model, quantification, importance analysis, WORM
1 INTRODUCTION
Occupational risk owing to contact with moving parts of machines or other equipment is very high in all industry sectors and among different working activities. OSHA documents that amputations are among the most severe and disabling injuries in the workplace and often result in permanent disability. These injuries are the outcome not only of the use but also of the care and maintenance of machines such as saws, presses, conveyors, and bending, rolling or shaping machines, as well as forklifts and trash compactors [1]. The incident rate for events related to being "caught in or compressed by equipment or object" is over 6 cases per 10,000 workers per year for the period 2003 – 2007 in the United States, with more than 250 fatal incidents per year registered in the same period according to data from the US Bureau of Labour Statistics. The high incident rate is the reason for the existence of many regulations and guidelines that have been issued to offer guidance and standardization in the prevention of accidents of this specific type. In the US the machinery and machine safeguarding standards by OSHA make specific reference to moving parts of machines, while in Europe a series of machine safety regulations exist under the
name of Machinery Directives [2-4] along with relevant standards. The main purpose of these standards and policies is the protection of operators by avoiding any contact with the movable parts of machines, either by safeguarding these parts or by using adequate Personal Protective Equipment or even by using equipment in conformity with standardization (CE Marking). More than 400 accidents as a result of contact with moving parts of machines take place each year in the Netherlands alone [5]; they are the most frequent source of injuries in the workplace and sometimes also lead to fatalities. Most of the accidents occur while personnel are operating the machines, but many accidents are also registered during periods when machines are not operating but are under maintenance, clearing or cleaning. The Workgroup Occupational Risk Model (WORM) project has been launched by the Dutch government in order to manage and reduce occupational risk. The aim of the project is the quantification of occupational risk through logical models for a full range of potential risks from accidents in the workspace [6]. Sixty-three logical models have been developed, each coupling working conditions with the consequences of accidents owing to sixty-three specific hazards. Data for the development of these models are derived from the GISAI database [5] of the Netherlands Ministry of Work, which includes approximately 12500 accident cases reported between January 1998 and February 2004. Of the analysed GISAI occupational accidents, approximately 3000 have been classified as owing to contact with moving parts of machines. The modelling and the quantification of these cases are described in this paper with emphasis on the cases concerning maintenance and cleaning of machines with moving parts. Logical models for contact with moving parts of machines have been presented by Backsteen et al [7]. Quantification for other types of occupational accidents such as falls from ladders [8], occupational accidents related to crane activities [9], chemical explosions [10], or electrical accidents [11] has already been performed within the WORM project. An overall assessment of the risk from 63 specific occupational hazards is given in Papazoglou et al [12]. From the observed accident cases, scenario-models have first been developed to capture the sequence of events leading to the accident with the use of the Storybuilder tool, which allows the visual representation of the accident paths [13]. The scenario-model is the basis for the logical modelling in the WORM project [14]. This logical model consists of the successive decomposition of the overall accident consequence into simpler and simpler events until a final level of event resolution is achieved. Each level of events is logically interconnected with the more general events of the immediately upper level. The events of the lower level of decomposition form an influence diagram consisting of two parts connected by a main event called the Centre Event (CE) and representing the occurrence of an accident resulting in a reportable consequence (here contact with moving parts of machines). This is a very important characteristic of the model. Owing to the nature of the available data, which correspond to joint events of contacts resulting in reportable consequences, the Centre Event refers to events that either result in a reportable consequence or not (i.e. no contact or contact without reportable consequences). 
Usually all events to the left of this event represent events aiming at preventing the CE from occurring and the corresponding part of the diagram is called Left Hand Side (LHS). All events to the right of the CE correspond to events aiming at mitigating the consequences of the CE and this part of the model is called Right Hand Side (RHS) [14]. In the core of the WORM project, however, the events to the left are events that influence the probability of the Centre Event occurring, the latter being an accident with reportable consequence. The events to the right of the Centre Event, simply condition the severity of the reportable consequence. For communication purposes with safety engineers not familiar with logical models this influence diagram is called, within the WORM project, bowtie model. The logical model provides a way for organising various events from a root cause via the centre event, ending up with a reportable damage to the health of the worker. The use of such a model is twofold. On the one hand it provides the accident sequences, that is, the sequences of events that lead from a fundamental or root cause to the final consequence. On the other hand, it provides a way for quantifying the risk [14]. Data required for the quantification of these models are derived from the following sources: a) a user survey that produces the exposure rate of working near a machine with moving parts in the Netherlands; b) a survey that assesses the success rates for various measures; c) analysis of the details of 797 reported accidents with moving parts of machines during operation, maintenance, clearing and cleaning processes in the Netherlands, provided by the GISAI database. The results of this analysis and their inclusion in the logical model provide a probabilistic approach for the quantification of occupational risk due to moving parts of machines based on real data which is performed for the first time. This paper presents the logical model and the process for the quantification of risk, the specific causes and their prioritization with focus on activities related to maintenance of machines with moving parts. Activities for cleaning machines are also included in this process. The analysis covers 62 accident cases during maintenance and 108 accident cases while cleaning a machine. The paper is organized as follows. After the introduction of section 1, section 2 presents the logical model for contact with moving parts of machines while maintaining a machine. Section 3 describes the quantification process while the results from this process are presented in Section 4. Section 5 presents the ranking of the various working conditions and/or safety measures in terms of their contribution to the risk. Finally section 6 offers a summary of the paper and the conclusions.
2 LOGICAL MODEL FOR CONTACT WITH MOVING PARTS OF MACHINE
In this section a general model for accidents where injuries are the result of contact with moving parts of machines is presented. The accident may take place while operating, maintaining, clearing or cleaning a machine. Focus in this paper will be on maintaining activities as well as cleaning. Figure 1 presents the "Contact with moving parts of a machine" bowtie. The Centre Event represents a contact or not with the moving parts of a machine. The prevention part is decomposed into the initiating event and the safety barriers aiming at preventing the accident. The initiating event represents various activities related to maintenance of machines such as: modifying, installing, assembling, de-assembling, inspecting & testing during the maintenance process and cleaning a machine. The main difference between operating a machine and the maintenance activities is that during the operation process the machine is always ON while during maintenance the machine may be ON or OFF. The status of the machine differentiates the safety barriers put in place to prevent the contact. A safety barrier is a physical entity, a technical, hardware, procedural or organisational element in the working environment that aims either at preventing something from happening (e.g. the CE) or at mitigating the consequences of something that has happened. Safety Barriers can be distinguished into Primary and Support Barriers. A Primary Safety Barrier (PSB) either alone or in combination with other PSBs may prevent the contact with moving parts of the machine. A Support Safety Barrier (SSB) sustains the adequate function of the PSB and influences the probability with which the primary safety barrier states occur. There are two situations in which the operator may come in contact with the moving parts of the machine: I. hit by a moving part when the machine is ON, and II. hit by a moving part while the machine is OFF. This specific property has been included in the logical model in the form of a Primary Barrier with the name 'Operating Status'. According to the data provided by accident analysis there are two distinct ways of having contact with the moving parts of a machine while the machine is ON and a different one when the machine is OFF:
1. the worker enters the dangerous zone of the machine without adequate physical guarding (machine is ON)
2. the machine enters the safe zone of the worker who does not have adequate physical guarding (machine is ON)
3. the machine moves unexpectedly while the worker is in the dangerous zone of the machine (machine is OFF)
The safety barriers used to model the above mentioned situations are:
- Physical Guarding: measures to prevent the physical contact with the moving parts of machines, such as covering of the machine or safeguarding measures.
- Prevention of body part inside Danger Zone: measures to prevent the operator from entering the Dangerous Zone of a machine (e.g. signals, mark up).
- Prevention of machine part inside Safety Zone: measures to prevent the machine from entering the zone where the operator is (e.g. good condition of the machine).
- Prevention of unexpected movement when body inside Danger Zone: measures to prevent the machine from starting unexpectedly while the operator is located inside the Dangerous Zone of the machine for his work (e.g. tagging, locks). This barrier is activated when the machine is normally OFF.
Figure 1. Logical model for contact with moving parts while maintaining a machine.
2.1 Support Safety Barriers
A Support Safety Barrier (SSB) contributes to the adequate function of the Primary Safety Barriers and influences the probability with which the primary safety barrier states occur. Support Safety Barriers for contact with moving parts of a machine while maintaining or cleaning it are:
- Respecting the Danger Zone of machine: this barrier refers to preventing people from moving their body (parts) intentionally into a danger zone of a machine while machine parts are moving.
- Body Control & Awareness: this is the ability of people to control their body movements in order to stay out of the danger zone of a machine. This only refers to unintentional and/or unaware movements into the danger zone.
- Ability to use the machine: this barrier is the ability of the user to use the machine as intended and within the operational safe limits (according to its specification).
- Machine Integrity: this barrier refers to the property of the machine to stay intact, without endangering the user outside the normal danger zone, including the prevention of ejected parts.
- Prompt Interruption: this refers to the action of stopping machine movement either automatically or manually.
- Lock-out & Tag-out of Machine: this refers to the system in place that prevents unintentional operation of the machine, such as lock-out, tag-out or mechanical securing.
In order to complete the model, the following additional conditional block was added:
- Start of machine and body (parts) inside DZ at start: this refers to the unintentional start of the machine while the operator is inside the Danger Zone.
Finally one more block is used as a support barrier:
- Age of operator: the age of the person that was injured by the contact may have an influence on the consequences of this contact on the victim. This event has two states: 'Age of operator <= 50 years old' and 'Age of operator > 50 years old'. "Age of operator" influences all prevention barriers.
2.2 Probability Influencing Entities (PIEs)
In several instances the safety barriers of the model are simple enough to link directly to easily understood working conditions and measures, as in the barrier "Prompt Interruption". Assessing the frequency with which prompt interruption of the machinery occurs is straightforward. In other instances, however, this is not possible. For example, the support barrier "Machine Integrity" may be analysed into more detailed and more concrete measures that affect its quality. Such specific measures are: (i) CE marking; (ii) equipment hazard identification and risk evaluation; and (iii) machinery in good condition. Similarly the barrier "Body Control and Awareness" may be analysed into the following measures: (i) marking & signalling of danger zones; (ii) no residual movement of the machine; (iii) machine danger zones change location (dynamic); (iv) accessibility of danger zone; (v) workplace conditions (visibility, noise, slippery work floor); (vi) physical condition of person (fatigue, illness, dizziness); (vii) mental alertness; (viii) entanglement of clothing/hair. Such factors are called Probability Influencing Entities (PIEs). Each influencing factor (PIE) is assumed to have two possible levels, "Adequate" and "Inadequate". The quality of an influencing factor is then set equal to the frequency with which this factor is at the adequate level in the working places. Then the quality of the barrier is given by a weighted sum of the influencing factor qualities. The weights reflect the relative importance of each factor and are assessed by the analyst on the basis of expert judgement. Currently equal weights have been used. In this way the probability of a support barrier being in one of its possible states is given by the weighted sum of the frequencies of the influencing factors [15]. PIEs and their frequencies for the barriers they influence for the model of contact with moving parts of machine are presented in Table 1. Frequencies of PIEs have been assessed through surveys of the working conditions in the Dutch working population and reflect the Dutch National Average [15]. Barrier failure probabilities are also presented in Table 1.
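As a minimal illustration of this weighting scheme (a sketch using the equal weights mentioned above and the "Maintain" PIE frequencies of Table 1 for the barrier "Physical Guarding"; it is not code from the WORM project), the barrier failure probability can be reproduced directly from the PIE frequencies:

```python
# PIE frequencies for "Physical Guarding" during maintenance (Table 1).
pie_frequencies = {
    "Completeness of physical guarding": 0.25,
    "Presence of physical safeguard": 0.24,
    "Condition of physical safeguard": 0.11,
    "Bypassing a physical safeguard": 0.16,
    "Provision of physical safeguard": 0.35,
}

def barrier_failure_probability(pie_freqs, weights=None):
    """Weighted sum of PIE frequencies; equal weights are used by default."""
    values = list(pie_freqs.values())
    if weights is None:
        weights = [1.0 / len(values)] * len(values)
    return sum(w * v for w, v in zip(weights, values))

print(round(barrier_failure_probability(pie_frequencies), 3))  # 0.222, as reported in Table 1
```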
2.3 Right hand side (RHS)
The right hand side of the bowtie, in combination with the outcome of the centre event, determines the consequences of the contact. Four levels of consequences are used: C1: no consequence; C2: recoverable injury; C3: permanent injury; C4: death. Events of the RHS are the type of contact (entanglement or not) and the adequacy and promptness of the emergency response. More detail on these events is presented by Backsteen et al [16].
Table 1. Barriers and their PIE characteristics. For each barrier the overall failure probability is given for the maintaining and the cleaning activity; each Probability Influencing Entity (PIE) is listed with its frequency for maintaining and for cleaning.

Physical Guarding (barrier failure: maintain 0.222, clean 0.184)
- Completeness of physical guarding (moving parts covered): maintain 0.25, clean 0.17
- Presence of Physical Safeguard: maintain 0.24, clean 0.2
- Condition of Physical Safeguard: maintain 0.11, clean 0.08
- Bypassing a physical safeguard: maintain 0.16, clean 0.13
- Provision of Physical Safeguard: maintain 0.35, clean 0.34

Respecting DZ of machine (barrier failure: maintain 0.28, clean 0.23)
- Respecting Danger Zone: maintain 0.28, clean 0.23

Body control & awareness (barrier failure: maintain 0.145, clean 0.138)
- Marking & signalling of danger zones: maintain 0.25, clean 0.23
- Residual movement of the machine: maintain 0.1, clean 0.11
- Dynamic danger zones: maintain 0.13, clean 0.08
- Accessibility of danger zone: maintain 0.22, clean 0.23
- Workplace conditions: maintain 0.18, clean 0.15
- Physical condition of person: maintain 0.07, clean 0.07
- Mental alertness: maintain 0.13, clean 0.14
- Entanglement of clothing/hair: maintain 0.08, clean 0.09

Ability to use machine (barrier failure: maintain 0.13, clean 0.14)
- Safe operating limits: maintain 0.13, clean 0.14

Machine integrity (barrier failure: maintain 0.203, clean 0.137)
- CE marking: maintain 0.26, clean 0.17
- Equipment hazard identification and risk evaluation: maintain 0.14, clean 0.09
- Machine in good condition: maintain 0.21, clean 0.15

Prompt Interruption (barrier failure: maintain 0.15, clean 0.1)
- Emergency stop / Contact switch failure: maintain 0.15, clean 0.1

Lock out / Tag out of a machine (barrier failure: maintain 0.08, clean 0.09)
- Lock-outs & Tag-outs: maintain 0.08, clean 0.09

3 QUANTIFICATION PROCESS
In general the level of resolution of a logical model used in WORM is driven by the available data. A logical model provides a collection of event outcomes or barrier states which may lead to an accident when they coexist in particular states. These accidents have specific consequences. The general form of such a sequence is:
C = {S1, S2, …, Sn, B1, B2, …, Bm}    (1)
Analysis of available accident data allowed the assessment of the number of times such accident sequences occurred during a given period of time. Surveys of the Dutch working population assessed the exposure of the workers to the specific hazards over the same period of time. Consequently it was possible to assess the probability P(C) of the various accident sequences. Surveys of the Dutch working places and of the corresponding conditions allowed the assessment of the overall probability of some individual barriers (e.g. see Table 1). If such an assessment is made then probabilities of the form P(S1, S2, …, B1, …, Bi, …) can be estimated, where (S1, S2, …, B1, …, Bi, …) are the barriers that can be quantified independently of the accident data. Then equation (1) can be written as:
P(C) = P(S1, …, Bi) P(Bi+1, …, Bj | S1, …, Bi)    (2)
In the above equation P(C) and P(S1, …, Bi) are known, hence P(Bi+1, …, Bj | S1, …, Bi) can be calculated. The overall objective of WORM is then to be able to calculate the new value of P(C) given a number of safety measures that have a specific effect and consequently change P(S1, …, Bi) to P′(S1, …, Bi).
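To make equation (2) concrete, the following minimal numerical sketch (with hypothetical values rather than figures from the WORM data) shows how a change in the surveyed barrier probabilities propagates to a new value of P(C):

```python
# Hypothetical values for illustration only.
p_c = 1.6e-7    # P(C): observed probability of the accident sequence (accident data + exposure)
p_lhs = 0.2     # P(S1, ..., Bi): joint probability of the independently surveyed barrier states

# The remaining conditional factor is inferred from the observed data (equation 2):
p_cond = p_c / p_lhs            # P(Bi+1, ..., Bj | S1, ..., Bi)

# A safety measure changes the surveyed barrier probabilities:
p_lhs_new = 0.1                 # P'(S1, ..., Bi) after the measure

# New risk estimate, assuming the conditional factor is unchanged by the measure:
p_c_new = p_lhs_new * p_cond
print(p_cond, p_c_new)
```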
4 RESULTS
Quantification of the model for contact with moving parts of machine according to the quantification process described in the previous paragraph resulted in the following probabilities:
Probability of contact with moving parts of machines while maintaining a machine, with reportable consequence (/hr) = 1.63 x 10^-7
Probability of lethal injury while maintaining a machine (/hr) = 5.26 x 10^-9
Probability of permanent injury while maintaining a machine (/hr) = 1.20 x 10^-7
Probability of recoverable injury while maintaining a machine (/hr) = 3.67 x 10^-8
Probability of contact with moving parts of machines while cleaning a machine, with reportable consequence (/hr) = 6.80 x 10^-7
Probability of lethal injury while cleaning a machine (/hr) = 9.73 x 10^-9
Probability of permanent injury while cleaning a machine (/hr) = 5.40 x 10^-7
Probability of recoverable injury while cleaning a machine (/hr) = 1.33 x 10^-7
Individual risk of death, permanent injury and recoverable injury per hour has been assessed according to the methodology presented with more detail by Papazoglou et al [12].
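A simple consistency check (ours, not part of the paper) is that the three consequence-level probabilities should sum approximately to the corresponding total probability of a contact with a reportable consequence:

```python
maintaining = {"lethal": 5.26e-9, "permanent": 1.20e-7, "recoverable": 3.67e-8}
cleaning = {"lethal": 9.73e-9, "permanent": 5.40e-7, "recoverable": 1.33e-7}

print(sum(maintaining.values()))  # ~1.62e-7 per hour, vs the reported total of 1.63e-7
print(sum(cleaning.values()))     # ~6.83e-7 per hour, vs the reported total of 6.80e-7
```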
5 IMPORTANCE ANALYSIS
To assess the relative importance of each factor influencing the risk from contact with moving parts of machines, two importance measures have been calculated:
1. Risk Decrease: this measure gives the relative decrease of risk, with respect to the present state, if the barrier (or PIE) achieves its perfect state with probability equal to unity.
2. Risk Increase: this measure gives the relative increase of risk, with respect to the present state, if the barrier (or PIE) achieves its failed state with probability equal to unity.
Risk Decrease prioritizes the various elements of the model for the purposes of possible improvements. It is more risk-effective to try to improve first a barrier with a higher risk decrease effect than another with a lower one. Risk Increase provides a measure of the importance of each element in the model being maintained at its current level. It is more important to concentrate on the maintenance of a barrier with a high risk increase importance than on one with a lesser one.
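A minimal sketch of how the two indices can be computed follows; the exact formulas are our assumption (chosen to be consistent with the way indices are quoted later, e.g. a decrease index of 0.528 or an increase of 3.53 times), and the toy risk model is purely illustrative:

```python
def toy_risk(failure_prob):
    # Hypothetical risk model: risk grows linearly with the barrier failure probability.
    return 1.0e-7 * (0.5 + 4.5 * failure_prob)

baseline = toy_risk(failure_prob=0.15)   # present state of the barrier

risk_decrease = (baseline - toy_risk(0.0)) / baseline  # relative risk removed if the barrier is perfect
risk_increase = toy_risk(1.0) / baseline                # factor by which risk grows if the barrier is failed
print(round(risk_decrease, 3), round(risk_increase, 2))
```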
5.1 Contact with moving parts of machine – Maintaining a machine
When maintaining a machine the machine can be either ON or OFF. According to survey results the machine is OFF in 30% of the total maintaining time but may unintentionally start up.
Figure 2. Risk Increase and Risk Decrease indices for Support Barriers
In order to minimize risk the physical guarding of the moving parts of the machine is the most effective barrier since, if properly used, it will not allow the contact of the moving parts with operators. If this is not achievable there are other measures to decrease or maintain risk while maintaining a machine. In order to maintain fatality and injury risk the most effective barrier is the ability of people to control their body movements such that they stay out of the danger zone of a machine, along with the ability of the user to use the machine as intended and within the operational safe limits. Results depicting the importance of different safety barriers are shown in Figure 2.
The results of the sensitivity analysis at a more detailed level identify the following measures as the most important for risk decrease and risk increase purposes.
The most important measure to decrease fatality risk when maintaining a machine is the "operation of the machine within the designed safety limits", that is, in compliance with the machine's specifications and safety regulations, with a risk decrease index of 0.528, followed by the "respect of the Dangerous Zone of the machine" with 0.480 and the existence of a properly functioning emergency stop with 0.412. The most important measures to maintain fatality risk at the present level are the "operation of the machine within safety limits", the existence of a properly functioning emergency stop switch and the use of technical measures against unintentional start-up, such as lock-outs & tag-outs on the machine while it is being maintained. If those measures are not respected or not used they will increase risk by 3.53, 2.33 and 1.62 times respectively.

Important measures to decrease permanent and recoverable injuries are also the "operation of the machine within safety limits" and the "respecting of the Dangerous Zone", meaning no intentional reaching towards the machine moving parts, while the third most important for both consequence levels is the "provision of a physical safeguard" with a risk index of 0.28. The most important measures to maintain injury risk are the "operation of the machine within safety limits" and the "use of lock-out & tag-outs" on the machine when the machine is being maintained, with almost similar risk increase indices for both consequence levels of 4 and 1.5 respectively. Results are shown in Figures 3-5.

Figure 3. Fatality risk increase and risk decrease for contact with moving parts while maintaining a machine for PIEs
Figure 4. Permanent injury risk increase and decrease for contact with moving parts while maintaining a machine for PIEs
Figure 5. Recoverable injury risk increase and decrease for contact with moving parts while maintaining a machine for PIEs
5.2 Contact with moving parts of machine – Cleaning a machine
When cleaning a machine the machine can be ON or OFF. According to the survey results the machine is OFF 29% of the time but may start up unexpectedly. In order to minimize injury risk the physical guarding of the moving parts of the machine is the most effective barrier since, if properly used, it will not allow the contact with the moving parts. Prompt interruption of the machinery is the most efficient barrier to decrease or to maintain fatality risk at its present level. In order to decrease or to maintain injury risk (for permanent or recoverable injuries) the most effective barrier is the ability of people to control their body movements such that they stay out of the danger zone of a machine, along with the ability of the user to use the machine within the operational safe limits. Results are shown in Figure 6.

Figure 6. Risk Increase and Risk Decrease indices for Support Barriers

The results of the sensitivity analysis at a more detailed level identify the following measures as the most important for risk decrease and risk increase purposes. The most important measures to decrease fatality risk while cleaning a machine are the existence of a properly functioning emergency switch within the reach of operators and the respect of the dangerous zone of the machine. These measures will decrease fatality risk by 0.995 and 0.522 respectively if used 100% of the time. The most important measures to maintain fatality risk at the present level are the emergency stop, the operation of the machine within safety limits and the lock-out & tag-out of the machine while it is being cleaned. If not used they will increase risk by 8.96, 2.30 and 2.12 times respectively. The most important measures to decrease injury risk are the operation of the machine within safety limits and the respect of the dangerous zone of the machine. These measures will decrease injury risk by half if used 100% of the time. The most important measures to maintain injury risk are the operation of the machine within safety limits, the use of the emergency stop switch, the respect of the dangerous zone of the machine and the use of lock-out & tag-out on the machine when the machine is being cleaned. If not used they will increase risk by 3.3, 2.5, 1.8 and 1.3 respectively. Results are shown in detail in Figures 7 - 9. As shown in these figures, the importance of measures when cleaning a machine is quite different from the importance while maintaining it.

Figure 7. Fatality risk increase and risk decrease for contact with moving parts while cleaning a machine for PIEs
Figure 8. Permanent injury risk increase and decrease for contact with moving parts while cleaning a machine for PIEs
Figure 9. Recoverable injury risk increase and decrease for contact with moving parts while cleaning a machine for PIEs

Alternative combinations of the above measures result in reduced occupational risk. Furthermore, each set of measures might be associated with a different level of cost and related risk. Risk optimization calculations may be performed to compare potential reducing strategies and define the optimum one based on different criteria such as minimum risk criteria, specific risk criteria and economic criteria. The selection of the reducing strategy is up to the company's management or to the regulatory committee's decision according to the required objective.

6 CONCLUSIONS
A logical model has been presented for quantifying the probability of contact with moving parts of machines during maintenance and the various types of consequences following these types of accidents. The model includes primary and support safety barriers aiming at preventing the contact. For the quantification of the model the exposure rates (total time spent in an activity involving each hazard per hour) have been used, which were estimated with user (operator) surveys, together with real accident data coming from the reported accident database GISAI. The probability of the consequences of such accidents is presented in three levels: fatalities, permanent injury and non-permanent injury. Surveys also provided data for the working places and the corresponding conditions, allowing in this way the assessment of the overall probability of some individual barriers. The model has been used for the prioritization of risk reducing measures through the calculation of two risk importance measures: the risk decrease and the risk increase. The calculations were made for the overall risk and the risk at three levels of consequence severity. Results show that the most important measures to maintain fatality risk at its present levels are the presence of a properly functioning emergency stop switch when cleaning a machine and the operation of the machine within the designed safety limits when maintaining it. Injuries can be kept manageable by operating the machine within the safety limits and by using lock-outs/tag-outs on the machine to prevent its unintentional start-up.
In order to decrease fatality risk the most efficient measure is the existence and use of the emergency stop switch, while for reducing injuries in the workplace the most efficient measures are to operate the machine within the safety limits as described in its instructions and to respect the dangerous zone of the moving parts of the machine while maintaining or cleaning it. The selection of the final reducing strategy, including all of the proposed measures or combinations of them, is up to the company's management or to the regulatory committee's decision according to the required objective, e.g. minimizing risk based on specific initial conditions and potential economic budget, thus also specific constraints.
7 REFERENCES
1
OSHA (2007) Safeguarding Equipment and Protecting Employees from Amputations, Small Business Safety and Health Management Series, OSHA 3170-02R.
2
EC (1998) Directive 98/37/EC of the European Parliament and of the Council of 22 June 1998 on the approximation of the laws of the Member States relating to machinery, Official Journal of the European Communities, Luxembourg.
3
EC (2006) Directive 2006/42/EC of the European Parliament and of the Council of 17 May 2006 on machinery, and amending Directive 95/16/EC (recast), Official Journal of the European Communities, Luxembourg.
4
EEC (1989) Council Directive 89/392/EEC of 14 June 1989 on the approximation of the laws of the Member States relating to machinery, Official Journal of the European Communities, Luxembourg.
5
GISAI (2005) Geintegreerd Informatie Systeem Arbeids Inspectie: Integrated Information System of the Labor Inspection in the Netherlands.
6
Ale B.J.M., Baksteen H., Bellamy L.J., Bloemhof A., Goossens L., Hale A.R., Mud M.L., Oh J.I.H., Papazoglou I.A., Post J., and Whiston J.Y. (2008) Quantifying occupational risk: The development of an occupational risk model. Safety Science, 46 (2), 176-185.
7
Baksteen H., Mud M., Papazoglou I.A., Aneziris O. N., Ale B.J.M, Bellamy L.J., Hale A.R., Bloemhoff A., Post J. , Oh J. (2006) Quantified risk assessment for contact with a moving part of a machine. Working on Safety Conference 2006, Dutch Ministry of Social Affairs and Employment and the Delft University of Technology, Netherlands.
8
Aneziris O.N, Papazoglou I.A., Baksteen H., Mud M.L., Ale B.J.M, Bellamy L.J., Hale A.R., Bloemhoff A., Post J., Oh J.I.H. (2008) Quantified risk assessment for fall from height. Safety Science, 46 (2), 198-220.
9
Aneziris O. N., Papazoglou I.A., Mud M.L., Damen M., Kuiper J., Baksteen H., Ale B.J.M., Bellamy L.J., Hale A.R., Bloemhoff A., Post J.G., Oh J.I.H. (2008) Towards risk assessment for crane activities. Safety Science. 46 (6), 872-884.
10
Papazoglou I.A., Aneziris O. N., Konstandinidou M., Mud M., Damen M., Kuiper J., Bloemhoff A., Backsteen H., Bellamy L.J., Post J.G., Oh J. (2008) Occupational Risk Management for Vapour/Gas Explosions, In S. Martorell, C. Guedes Soares & J. Barnett (Eds) ESREL 2008 Safety, Reliability and Risk Analysis: Theory, Methods and Applications. Valencia. 777-785, London: Taylor & Francis Group.
11
Aneziris O. N., Papazoglou I.A., Konstandinidou M., Mud M., Damen M., Kuiper J., Backsteen H., Bellamy L.J., Oh J. (2009) Occupational Risk Management for Electricity Accidents, Submitted for publication in ESREL 2009.
12
Papazoglou I.A, Bellamy L.J., Leidelmeijer K.C.M., Damen M., Bloemhoff A., Kuiper J., Ale BJ.M., Oh J.I.H. (2008) Quantification of Occupational Risk from Accidents, Proceedings of PSAM 9.
13
Bellamy L.J., Ale B.J.M., Geyer T.A.W., Goossens L.H.J., Hale A.R., Oh J.I.H., Mud M.L., Bloemhoff A, Papazoglou I.A., Whiston J.Y. (2007) Storybuilder—A tool for the analysis of accident reports. Reliability Engineering and System Safety 92, 735–744.
14
Papazoglou I.A., Ale B.J.M. (2007). A logical model for quantification of occupational risk, Reliability Engineering & System Safety 92 (6), 785-803.
15
RIVM (2008) WORM Metamorphosis Consortium. The Quantification of Occupational Risk. The development of a risk assessment model and software. RIVM Report 620801001/2007 The Hague.
16
Baksteen H., Mud M., I.A. Papazoglou, O. N. Aneziris, Konstandinidou M., Schepers W. (2008) Technical report on Bowtie: Contact with moving part of a machine, RIVM.
Acknowledgments The work presented in this paper was performed under the Work Occupational Risk Model – Metamorphosis (WORM-M) project and the RAM project both financed by the Dutch Ministry of Social Affairs and Employment through RIVM, Netherlands.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ACCIDENT CAUSES DURING REPAIR & MAINTENANCE ACTIVITIES AND MANAGERIAL MEASURES EFFECTIVENESS
Dr George Skroumpelos a, b
a Vice President, ACRM Consulting S.A.
b Adjunct Lecturer, University of Piraeus, 80 Karaoli & Demetriou str., Piraeus 18534, Greece.
This paper presents the results of a large-scale research effort in industry operations that recorded the causes of incidents during repair and maintenance activities and the effectiveness of the implementation of managerial measures in order to prevent them; the residual risk for the same activities is also mentioned. The chain-event mechanism is briefly analysed and a new contributing behavioural factor, the unsafe mentality bonding, is briefly explained. An updated table of all incident causes that comprise the chain-event mechanism is presented. The potential incident scenarios were linked to three categories of managerial root causes, namely lack of (a) a health and safety system, (b) communication and (c) enforcement. A brief description of the research arithmology and methodology is followed by the results. The research revealed that during repair and maintenance the main employee-related accident causes are lack of housekeeping, unexpected machinery start, the use of defective hardware, safety rules violation and bantering, while the main supervisor-related causes were hastiness, familiarization with danger and insufficient job specifications. The managerial causes were analysed not only independently but also in combination and were linked to corresponding measures. The most effective managerial measure proved to be enforcement, which in combination with the development of a health and safety system could reduce potential incidents in repair and maintenance activities by half.
Key Words: safety, repair, maintenance, accident, incident, causes, measures
1 INTRODUCTION
Most risk assessment studies are mainly focused on regular production or on activities directly linked to production. Repair and maintenance (r&m) activities are most often either omitted or superficially approached in the risk assessment process in several studies. By nature, these activities are more complex and involve a higher degree of uncertainty, which in turn means that they entail a higher degree of risk. In spite of the fact that several r&m-related procedures were introduced as safe methods of work, like lockout-tagout, welding and confined space entry, a general overview of maintenance activities, their risks and measures is rare. This paper presents the results of a wide industry-oriented survey on the causes of potential incidents during r&m works and their relation to the managerial causes inflicting these incidents. These causes are then linked to the corresponding managerial measures and their individual as well as combined effectiveness in reducing the incident probability in such works.
2 THE ROOT CAUSE-EVENT CHAIN MECHANISM
Researchers have proven that an accident is the result of a sequence of events that is triggered by an initiating event linked to poor management practices [1], [2], [3]. More specifically the root cause-event chain mechanism is: Managerial Causes → Basic Causes → Immediate Causes → Unpleasant Event → Consequences [4], [5]. Immediate causes are technically oriented and hence first-line-worker sensitive. Basic causes are less technically oriented; they include management issues as well and are therefore more supervisory-personnel oriented. Managerial causes depend on company culture and are analyzed in the following. By applying effective preventive measures at any of the links, the chain is broken and the event does not happen.
However, this mechanism is not a straight-line flow chart; actually, the managerial causes affect a broader range of basic causes which in turn induce a bigger number of immediate causes, thus creating an environment in which a variety of unpleasant events (incidents) is possible. Therefore, one managerial cause initiates not one, but a number of chain-event mechanisms that may result in more than one incident (Fig. 1). On the other hand, if managerial measures are fully and effectively implemented no chain-event mechanism can be initiated.
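The chain-breaking idea can be summarised in a small sketch (an illustration of the mechanism described above, not a tool used in the research): an unpleasant event can only occur if no link of its chain has been broken by an effective measure.

```python
chain = ["Managerial causes", "Basic causes", "Immediate causes"]

def event_occurs(broken_links):
    """True only if no link of the chain has been broken by a preventive measure."""
    return not any(link in broken_links for link in chain)

print(event_occurs(set()))                 # True: no measure applied, the event can happen
print(event_occurs({"Immediate causes"}))  # False: the chain is broken at the worker level
```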
Figure 1: The root cause-event chain mechanism and the effect of Unsafe Mentality Bonding (UMB), depicted by the horizontal dotted lines (MC = Managerial Causes, BC = Basic Causes, IC = Immediate Causes, Inc = Incidents; resistance to UMB is strong at the managerial measures level, weaker at the supervisory level and trivial at the technical/first-line level).

Managerial, Basic and Immediate causes are explicitly listed in the ANSI Z16.2 standard [6]. These causal factor lists were upgraded to yield Table 1 [7].

Table 1: Immediate, Basic and Managerial cause list

A. Immediate Causes
UA. Unsafe Acts: UA1. Acting without authorization; UA2. Acting without personnel warning; UA3. Guard bypassing; UA4. LOTO violation; UA5. Improper material handling; UA6. Non-use of equipment; UA7. Use of defective equipment; UA8. Misuse of equipment; UA9. Non-use of PPE; UA10. Use of defective PPE; UA11. Misuse of PPE; UA12. Improper manual handling; UA13. Maintenance of machinery/equipment in motion; UA14. Maintenance of electrical machinery/equipment under voltage; UA15. Rules violation; UA16. Bantering; UA17. Working under influence; UA18. Repetitive motion; UA19. Improper working posture; UA20. Overexertion
UC. Unsafe Conditions: UC1. Guards out of place; UC2. Insufficient guarding; UC3. Insufficient working space; UC4. Insufficient access; UC5. Heat/ignition sources; UC6. Unexpected movement; UC7. Protruding parts; UC8. Unstable material storage; UC9. Insufficient equipment; UC10. Defective equipment; UC11. Insufficient work area demarcation; UC12. Insufficient equipment/installation demarcation; UC13. Insufficient material labeling; UC14. Insufficient Lockout-Tagout; UC15. Improper working outfit/jewelry; UC16. Improper-insufficient PPE; UC17. Insufficient lighting; UC18. Insufficient air conditioning; UC19. Vibration; UC20. Insufficient housekeeping

B. Basic Causes
WM. Wrong Motive: WM1. Saving time (hastiness); WM2. Saving effort; WM3. Seeking ease, comfort; WM4. Attract attention; WM5. Display independence; WM6. Seeking approval; WM7. Express hostility; WM8. Seek financial rewards
PF. Personal Factors: PF1. Lack of knowledge-skills; PF2. Lack of attention focus; PF3. Familiarisation with danger
OF. Occupational Factors: OF1. Insufficient job specifications; OF2. Insufficient design; OF3. Insufficient maintenance; OF4. Normal wear; OF5. Abnormal wear; OF6. Insufficient equipment

C. Managerial Causes
M1. Insufficient programming; M2. Lack of programming; M3. Insufficient specifications-programming-procedures; M4. Lack of specifications-programming-procedures; M5. Insufficient training; M6. Lack of training; M7. Insufficient enforcement; M8. Lack of enforcement; M9. Insufficient knowhow; M10. Lack of knowhow

Then an association was created among the workplace hazards, the potential incident scenarios and the potential incident chain mechanisms. This tool was developed as part of this research [6]. All incident chain mechanisms sprang from three broad managerial-cause categories:
- Partial or total lack of health & safety system
- Partial or total lack of internal communication techniques
- Partial or total lack of managerial enforcement
Managerial Causes were then classified into three major categories and corresponding recommended measures were linked to each category as shown in Table 2, namely:

Table 2: Incident Managerial Causes and corresponding recommended measures
- M1. Partial or total lack of a Health & Safety System → 1S. Written policies, procedures, guidelines, rules, safe methods of work and action planning for continual improvement
- M2. Partial or total lack of internal Communications → 2C. Training, meetings, verbal & non-verbal internal communication techniques
- M3. Partial or total lack of managerial enforcement → 3E. Continuous safety supervision, auditing, & follow-up on the implementation of the above
Another issue that was also taken into account was the Unsafe Mentality Bonding (UMB) [8], which describes the behaviour of a worker who may ignore the measures because of his total perception of the management's general attitude or of the working environment safety, and not because he is not aware of the danger, an issue which is raised by researchers [9], [10], [11], [12]. This phenomenon is easily identifiable in comments of employees of the type "Keeping machine guards in place is the least of my problems; don't you see there is no air conditioning here?", or, "We work under extreme stress, I cannot demand from my workers to wear safety shoes." These comments were very frequent during the research.
3 RESEARCH ARITHMOLOGY AND METHODOLOGY
The research was conducted in 10 industrial operations, all of which had repair and maintenance (R&M) activities as part of their in-house operations. 105 R&M activities were surveyed that engaged 146 full-time employees, of which 89 were interviewed. The results presented were obtained from the maintenance and technical personnel involved in the survey process using the incident recall method [4], [13], which is widely used in risk assessment and aims at recording the incidents (accidents and near accidents) experienced by personnel and not only the ones recorded or reported, which are filtered by the assessor, thus limiting the latter's subjectivity. The method's further analysis is beyond the scope of this paper. The 105 R&M activities that were examined resulted in 1122 R&M-related potential incident scenarios. By using the tools described above, the chain mechanisms were constructed that led to the identification of 8392 immediate (worker-related) causes that were the result of 5854 basic (supervisor-related) causes which in turn were induced by 2286 managerial causes.
4 RESULTS
Of the immediate causes 41% were related to unsafe conditions while 59% were related to unsafe acts. Of the basic causes 29% were related to wrong motives, 37% to personal factors and 34% to occupational factors. A further analysis showed the frequency with which the various causes appeared in the potential incident scenario chain mechanisms:
- Immediate causes/unsafe conditions: lack of housekeeping 13%, unexpected machinery movement 13%, use of defective equipment 12%
- Immediate causes/unsafe acts: safety rules violation 15%, bantering 12%, lack of authorization 11%, lack of PPE use 10%
- Basic causes/wrong motive: hastiness 37%, saving effort 26%
- Basic causes/personal factors: familiarisation with danger 43%, lack of focus 38%
- Basic causes/occupational factors: insufficient job specifications 43%
Managerial causes were examined not only with respect to their individual contribution but also with respect to their combined one, yielding the following frequencies of initiation of the potential incident scenarios:
- Lack only of a Health & Safety System (HSS) appears as an initiating event in 10.3% of the scenarios
- Lack only of communication appears as an initiating event in 2.2% of the scenarios
- Lack only of enforcement appears as an initiating event in 14.4% of the scenarios
- Lack of both HSS and communication appears as an initiating event in 15.2% of the scenarios
- Lack of both HSS and enforcement appears as an initiating event in 23.8% of the scenarios
- Lack of both communication and enforcement appears as an initiating event in 3.4% of the scenarios
- Lack of all three managerial causes appears as an initiating event in 30.7% of the scenarios
In order to more clearly illustrate the above, if an operation established an effective HSS, only 10.3% of the potential incidents would be avoided; however, if effective enforcement was also applied then a total of 48.5% of the potential incidents would be avoided: 10.3% due to the HSS, 14.4% due to enforcement and 23.8% due to the combination of HSS and enforcement. Hence, the effectiveness of the managerial measures can be laid out in rounded figures as follows (see also the sketch after this list):
- If an effective HSS is applied then 10% of the incident scenarios can be avoided
- If an effective communication scheme is applied then 2% of the incident scenarios can be avoided
- If an effective enforcement scheme is applied then 14% of the incident scenarios can be avoided
- If effective HSS and communication schemes are applied then 28% of the incident scenarios can be avoided
- If effective HSS and enforcement schemes are applied then 48% of the incident scenarios can be avoided
- If effective communication and enforcement schemes are applied then 20% of the incident scenarios can be avoided
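A minimal sketch (ours, not from the paper) reproducing these rounded figures from the initiating-event frequencies listed earlier: a combination of measures avoids every scenario whose initiating managerial causes are all covered.

```python
freq = {
    frozenset({"HSS"}): 10.3,
    frozenset({"COM"}): 2.2,
    frozenset({"ENF"}): 14.4,
    frozenset({"HSS", "COM"}): 15.2,
    frozenset({"HSS", "ENF"}): 23.8,
    frozenset({"COM", "ENF"}): 3.4,
    frozenset({"HSS", "COM", "ENF"}): 30.7,
}

def avoided(measures):
    """Percentage of potential incident scenarios avoided by the given set of measures."""
    return sum(f for causes, f in freq.items() if causes <= set(measures))

print(round(avoided({"HSS"})))          # ~10
print(round(avoided({"HSS", "ENF"})))   # ~48 (10.3 + 14.4 + 23.8)
print(round(avoided({"COM", "ENF"})))   # ~20 (2.2 + 14.4 + 3.4)
```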
These data were also statistically analysed; while the details are beyond the scope of this paper, the results of this analysis showed that, owing to the UMB, there is a 2% residual risk in R&M activities.
5 RESULTS' DISCUSSION
R&M activities are usually non-routine, extremely diverse activities, conducted under time pressure, involving handling equipment without the manufacturer's protective devices and requiring high skills, experience and initiative; they must therefore be considered high-risk tasks. Housekeeping still remains of imperative importance in R&M activities since by nature most technical works involve disassembling, laying out tools, spare parts and chemicals and using a sufficient amount of floor space. Lockout-tagout procedures specific to each applicable activity are also extremely important in order to avoid unexpected machinery movement, since R&M works are conducted after removing the machine guards or opening the electrical panels, thus leaving personnel exposed to the hazards involved. Keeping focused on the job and not allowing for any distractions would also significantly contribute to safety, since R&M works require specialized knowhow and the activities performed are not self-explanatory, requiring extra caution, as the slightest mistake could result in operational failures as well as accidents. R&M personnel are usually trained professionals who, however, seem to develop overconfident behaviours, thus exposing themselves to unnecessary risks, a phenomenon which is enhanced by the pressure exercised by management or production to keep rundown times to a minimum. In spite of the fact that production employees usually receive ample training according to developed written procedures, R&M activities lack these procedures, mainly due to the diversity and complexity of the tasks and therefore the difficulty in developing such procedures. Having acquired an acceptable level of knowledge through training and experience but, on the other hand, experiencing the pressure to minimize work-time cycles, R&M personnel are more vulnerable to adopting unsafe behaviours; therefore, enforcement is individually the most effective managerial measure should the other two not be initiated or well organized. If enforcement is supported by the appropriate detailed procedures and guidelines in the framework of a health & safety system then it becomes the most effective combination in reducing the potential incident scenarios since, as shown above, this combination can avoid almost half of them.
6 CONCLUSIONS
Detailed health & safety procedures, specific to each maintenance task, within the framework of a health & safety system enhanced by enforcement, are the most effective managerial measures for repair and maintenance activities, a combination that could reduce potential incident scenarios by as much as 50%. R&M personnel must adopt excellent housekeeping practices and implement lockout-tagout procedures irrespective of the work pressure. Equipment and tools should be in excellent condition. R&M personnel should also remain focused while performing their tasks and follow health & safety rules without exception. Authorization should be obtained where appropriate. Supervisors should monitor closely all R&M activities and discourage unsafe behaviours like hastiness and minimum-effort practices springing from overconfidence and familiarization with the involved dangers. Supervisors should also demand the implementation of the detailed health & safety procedures specific to each maintenance task; if these are non-existent they should contribute to their acquisition or development. Management should instigate a robust health & safety system to assure that enforcement is consistently present and that, where they are absent, detailed health & safety procedures specific to each maintenance task are developed as a top priority. Strong support must be given to R&M supervisory personnel to relieve pressure on R&M departments where it is such that health & safety is compromised.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
DIAGNOSIS FOR IMPROVED MAINTENANCE SERVICES: ANALYSIS OF STANDARDS RELATED TO CONDITION BASED MAINTENANCE
Luca Fumagalli a, Erkki Jantunen b, Marco Garetti c, Marco Macchi c
a Politecnico di Milano, Department of Management, Economics and Industrial Engineering, P.zza Leonardo Da Vinci 23, 20133, Milano, Italy. Corresponding author: Email: [email protected]
b VTT Technical Research Centre of Finland, P.O.Box 1702, FIN-02044 VTT, Finland. Email: [email protected]
c Politecnico di Milano, Department of Management, Economics and Industrial Engineering, P.zza Leonardo Da Vinci 23, 20133, Milano, Italy. Email: [email protected], [email protected]
Depending on the maintenance strategy, there can be enormous differences in how much energy machinery uses and how much waste it produces. It has become common practice to study the efficiency of production machinery together with the quality of production and the availability of this machinery, i.e. the overall effectiveness is studied. In order to reach high efficiency, high availability and good quality, the production machinery has to be in a condition that allows these goals to be fulfilled. In principle there are two questions that need to be answered when maintenance is planned for tackling the situation described above: 1) What do we have to do? 2) When do we need to take action? The maintenance strategy that has been developed to answer these questions in an optimal way is Condition Based Maintenance (CBM), i.e. maintenance actions are based on the needs of the machinery. Up to this level everything is very logical and clear, but unfortunately the current reality in industry is far from optimal. It is not an easy task to define the condition of production machinery, and it is not easy to say what needs to be done and when. This paper is oriented to help answer the What question, i.e. diagnosis of the condition of machinery; the When question, i.e. prognosis of wear development, is not discussed in detail. However, the What question as such is already very demanding. The reason is that automatic diagnosis should be based on measurements of the condition, and this becomes very difficult in practice due to the differences in production machinery and the difficulty in separating condition information from information related to the production parameters. The paper provides an ontology of diagnosis in order to support the building of diagnostic tools. The motivation for defining an ontology is that it takes a lot of work to define a reliable diagnosis tool, and it is also very demanding to keep the system working when changes in the machinery or the software environment take place. The ontology is built in such a way that the diagnosis of similar machinery, and of new types of machinery as well, can be identified based on the similarity of components. One of the most important findings in the development of this process has been that, in order to make the development of the ontology possible in practice and to keep the system up to date, it is necessary to rely on available definitions in the form of standards and practices that other developers are prepared to support and update. The main focus of this paper is on defining the environment that supports the development of the ontology of diagnosis of the condition of rotating machinery. To build the ontology, both references related to standards for the exchange of information in a CBM system [1] and upper ontologies are analysed. Upper ontologies define top-level classes, such as physical objects and activities, from which more specific classes and relations can be defined. For example, in line with the scope of this paper, ISO 15926-2 [2] is analysed. Using upper ontologies to develop a more specific domain ontology makes it possible to define concepts based on the more general concepts provided by the upper ontology, avoiding reinventing the wheel while achieving better integration [3] and standardization.
1
INTRODUCTION
Maintenance is becoming a crucial competence factor in the management of assets, also due to the growing attention to sustainability [4] issues. Maintenance is the most efficient way to keep the functional level of a product above the required level, also from the viewpoint of environmental impact. This concept is not new, since sustainability issues date back to the 1980s. What is new, instead, is that new technologies now enable maintenance activities to be carried out so efficiently that corporate profits can also benefit. The concept of CBM was proposed, based on the development of machine diagnostic techniques, in the 1970s. In the case of CBM, preventive actions are taken when symptoms of failures are recognized through monitoring or diagnosis. Only later did authors start to investigate diagnostic systems in depth, mainly thanks to the availability of IT solutions. Works of this decade on condition monitoring and diagnosis include [5], [6], [7], [8].
Equipment must be maintained according to its characteristics. To this end, different decision-making frameworks are available in the literature to explain which maintenance policy should be adopted in different industrial environments. As an example, it is possible to refer to [9], where a decision-making framework is provided that also helps to determine when a CBM approach is suitable. Some questions are put to the decision maker in order to decide upon the maintenance policy. In particular, according to [9], the following combination of questions and answers leads to CBM (a minimal sketch of this decision logic is given at the end of this introduction):
• Is the equipment critical for the system dependability? YES
• Is the fault hidden, not detectable by the operating crew in normal operation? NO
• Does the condition degrade noticeably before functional failure? YES
• Is the degraded condition detectable during normal operation? NO
• Are there reliable means to measure the condition or performance? YES
Result: Condition monitoring: i) automatic or human-based sensing; ii) process parameter monitoring.
Once the CBM approach is identified as suitable, the decision maker should then decide upon the best diagnostic technique to adopt. However, this is not a trivial task, since different techniques are available. The aim of this paper is to clarify which information should be taken into account to perform diagnosis on industrial equipment. This paper presents a short literature review of: i) monitoring techniques, ii) standards for CBM. Standards for CBM are presented in order to understand which information is available to the maintenance decision maker facing the problem presented here.
The paper is structured in the following way. Paragraph 2 briefly describes the main monitoring techniques, according to [10]. Paragraph 3 summarises the concept of ontology as a means to support this research. Paragraph 4 deals with the analysis of standards. Paragraph 5 summarizes the analysis of the standards, integrating it with other references and proposing an ontology. Paragraph 6 ends the paper with conclusions and issues related to further development of the present work.
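As a minimal illustration of how the question-and-answer path of [9] can be encoded, the following sketch expresses the decision logic leading to condition monitoring as a simple function. The function name and argument names are illustrative assumptions, not part of [9].

```python
def cbm_suitable(critical, fault_hidden, degrades_noticeably,
                 detectable_in_operation, reliable_measurement):
    """Return True when the answer pattern of [9] points to condition monitoring.

    Expected pattern: critical=True, fault_hidden=False,
    degrades_noticeably=True, detectable_in_operation=False,
    reliable_measurement=True.
    """
    return (critical and not fault_hidden and degrades_noticeably
            and not detectable_in_operation and reliable_measurement)

# Example: a critical pump whose degradation is measurable but not visible to operators
print(cbm_suitable(True, False, True, False, True))  # True -> condition monitoring
```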
2
MONITORING TECHNIQUES – INDUSTRIAL APPLICATION
A variety of techniques can be used as part of a CBM program, and a good CBM program needs the adoption of several techniques that can be used together in a plant system. In [10] a survey is presented, carried out in 2004 in 15 different countries, including the Americas, Europe, Japan, Australia, South East Asia, the Middle East and Africa, on a sample of 157 companies. According to this survey, some of the most adopted diagnostic techniques are:
• Vibration analysis, adopted by 94% of the interviewed companies (148 of 157)
• Oil analysis, adopted by 72% of the companies
• Infra-red thermography, adopted by 65% of the companies
Here some monitoring techniques are described, i.e. some of the most adopted as shown by [10]. As a general remark, before going ahead, it is worth saying that two stages are mainly covered by the solutions available on the market: data acquisition and data processing with analysis tools [1].
Vibration monitoring
This is a technique that can be used for electromechanical systems such as pumps, fans and rotating machinery in general, both in continuous processes and in manufacturing systems, and it is a primary predictive maintenance tool. A machine is subjected to several sources of vibration, and so it has a composite vibration profile. This profile can be acquired by accelerometers and analysed in the frequency domain thanks to the Fast Fourier Transform (FFT), in order to identify the different sources of vibration and focus on the abnormal component indicating anomalous behaviour of the equipment (e.g. deriving from worn bearings). According to aspects such as working temperatures, presence of electromagnetic fields, signal quality and vibration frequency band, different sensors might be chosen in order to implement a vibration monitoring program [11]; for this reason, sensor producers often offer a wide range of solutions.
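To illustrate the frequency-domain analysis described above, the following sketch (not taken from any cited work) simulates an accelerometer signal containing an assumed bearing fault component and uses an FFT to locate it. The sampling rate, frequencies and amplitudes are arbitrary choices for the example.

```python
import numpy as np

fs = 10_000                      # sampling rate [Hz], assumed
t = np.arange(0, 1.0, 1 / fs)    # 1 s of signal
# healthy rotation component (25 Hz) plus an assumed fault component (157 Hz) and noise
signal = (1.0 * np.sin(2 * np.pi * 25 * t)
          + 0.3 * np.sin(2 * np.pi * 157 * t)
          + 0.1 * np.random.randn(t.size))

spectrum = np.abs(np.fft.rfft(signal)) / t.size
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

# report the strongest component above the running speed, where a fault would show up
mask = freqs > 50
print("strongest component above 50 Hz:",
      freqs[mask][np.argmax(spectrum[mask])], "Hz")
```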
Oil analysis
Three oil analysis techniques are often used in condition-based maintenance: i) lubricating oil analysis, ii) wear particle analysis and iii) ferrography (for further details see [12]). These techniques are relatively slow and expensive because the analysis requires the use of laboratory facilities such as a spectrometer and a scanning electron microscope. In lubricating oil analysis, samples of lubricating, hydraulic and dielectric oils are analysed at regular intervals to determine whether they can still meet the lubricating requirements of their application. Lubricating oil analysis involves the use of spectrographic techniques to analyse the elements contained in the oil sample. However, it must be supplemented with other diagnostic procedures in order to identify the specific failure mode which may have caused the observed degradation of the oil condition. The limitations of oil analysis in a condition-based maintenance programme are: high equipment costs, the fact that it is a laboratory-based procedure, reliance on the acquisition of accurate oil samples, and the skills needed for proper interpretation of data. In recent years, however, some solutions for oil analysis have become available at lower cost: portable devices enabling oil analysis through visual inspection, e.g. online oil sensors, are available nowadays [13].
Thermography
Thermal non-destructive methods involve the measurement or mapping of surface temperatures as heat flows to, from and/or through an object; by detecting thermal anomalies, incipient problems can be located and defined. Only emitted energy is relevant for predicting and preventing failures, so the other energy forms (reflected and transmitted) must be filtered out in order to obtain a good analysis. Infrared imaging is the most used technique for capturing thermal data, because it can provide such data faster than alternatives (such as line scanners or infrared thermometers). Thermography is a relatively inexpensive technique and has a wide application range, so its use is very frequent in CBM programs (e.g. see [14]). Thermal cameras can be used to acquire images that can be analysed either manually or automatically (thanks to supporting software). Thermal imaging is a fast and cost-effective way to perform detailed thermal analysis.
Temperature/pressure monitoring
In order to measure the parameters that indicate the actual operating conditions of plant systems, sensors measuring temperature, pressure and other quantities can be used. They aim at defining the conditions at which the output of the production process is obtained, and they are usually used in process control and automation: temperature and pressure sensors can be parts of feedback control systems, which aim at maintaining constant conditions in a production process. They can also be used as part of a CBM program: the measurement of process parameters can be introduced in a CBM program with the purpose of monitoring the production capacity of a plant and discovering process inefficiencies [15]. In this case, instrumentation must be installed to measure the parameters that indicate the actual operating condition of plant systems; the data so obtained can be periodically recorded and analysed (a small illustration is sketched below). As for acceleration sensors, there is a wide range of possible solutions according to measurement range, response time, environmental conditions, accuracy, etc. [16].
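As a minimal sketch of how a recorded process parameter could feed a CBM program, the example below (an assumption, not taken from [15] or [16]) compares a monitored temperature against a baseline established under reference operating conditions and flags readings that deviate beyond a chosen threshold.

```python
import statistics

def deviation_alarm(baseline_readings, new_reading, n_sigma=3.0):
    """Flag a reading that deviates from the baseline by more than n_sigma standard deviations."""
    mean = statistics.fmean(baseline_readings)
    sigma = statistics.stdev(baseline_readings)
    return abs(new_reading - mean) > n_sigma * sigma

baseline = [71.8, 72.1, 72.0, 71.9, 72.2, 72.0]   # bearing temperature [°C] at reference load (invented)
print(deviation_alarm(baseline, 72.1))            # False: within normal scatter
print(deviation_alarm(baseline, 75.4))            # True: possible developing fault
```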
3
DIAGNOSIS ONTOLOGY
In recent years a number of ontologies have been constructed and applied in selected domains. Here we avoid speaking about ontologies for software interoperability, which are not relevant for the present discussion. In medicine and biology, instead, Semantic Web technologies and web mining have been exploited in new intelligent applications. However, these disciplines are generally influenced by government support and are not as commercially fragmented as the manufacturing and process industries. Creating an industry-wide standard in a fragmented field is a task that must be handled carefully. There are some important standards for industry in the maintenance context, but there are also smaller standards, and many companies use their own internal terminologies for particular areas. This does not help in establishing a common terminology or a common structure of concepts. To this end, the research approach followed in this paper keeps the analysis of industrial standards at a basic level. The development of an ontology can then be clarified. An ontology can be defined as: a taxonomy of concepts [17]; a list of constraints and interrelations among the concepts [18]; a hierarchically ordered collection of classes, instances, relations, functions and axioms [19]. In the scope of the present paper the last definition is adopted. A list of classes will be presented to describe the “environment” where the machine operates, see Figure 1. These classes will be linked to each other through the establishment of relations. These relations will allow classifying the machine environment in an ordered way. Classes describing machine components will be included. Once the machine components and the operating environment are established, they can be matched with the monitoring technique to adopt. This last issue will be presented in a further development of the present work, while in this paper an ontology supporting the description of the “environment” is presented.
Figure 1. Processes in defining the classes of a diagnosis system.
In Figure 1 an example of classification through classes is presented. Machines can be classified according to, e.g., their failure mode or type of inspection interval. The inspection interval, for instance, can be determined by rules concerning safety. Machines can be classified according to these sets. More precisely, the item that can be classified is one component of a machine. How to manage the composition of components into a machine is explicitly described by ISO 15926 [2]. This furthermore underlines the importance and role of standards in approaching this kind of work.
4
STANDARDS AND REFERENCES
Standards are of great importance in building the diagnosis ontology, keeping in mind that the diagnosis ontology is defined in order to make it easy to define automatic diagnosis systems and to keep them up to date. Here the proper use of standards is the main means of dramatically decreasing the amount of definition work, and it guarantees that the diagnosis system can be kept alive and working for a long period even if there are changes in the operating system and programming environment. The paper provides a brief overview of the current status of standards that support the establishment of a CBM system and are thus useful for building the diagnosis ontology that will be presented in paragraph 5. The standards analysed here are designed for different purposes. The benefit of the ontology described in this paper will be to utilise part of the standards in order to describe the context where a machine works. The rationale behind these standards is that when developers of CBM systems start to follow standards and standardization proposals, it is easier to direct development towards new innovative ways of predicting remaining useful life. The CBM community would achieve interchangeable hardware and software components, more technological choices for users, more rapid development of technology, reduced prices and improved ease of upgrading of system components. Interface standards that enhance the ability to integrate products from different vendors in one system have several positive effects: overall system costs will be reduced, performance will be optimized, and the ability to implement new CBM approaches will be enhanced by the adoption of standards [20]. In the scope of the present work these standards must be considered to ensure that the result of this work can be adopted in an environment where these standards are applied, through the adoption of a common terminology. However, one main benefit of this approach is to synthesize the available material concerning CBM and try to integrate it with the material available in the literature.
OSA-CBM
OSA-CBM is an abbreviation for Open System Architecture for CBM. As declared by the OSA-CBM organization, the standard proposal shall cover the whole range of functions of a CBM system, for both hardware and software components. The OSA-CBM proposed standard divides a CBM system into different processes: Data Acquisition, Data Manipulation, State Detection, Health Assessment, Prognostic Assessment and Advisory Generation. In the scope of the present work, the OSA-CBM documentation does not provide information for building the diagnosis ontology describing the industrial environment condition. However, OSA-CBM must be kept in mind since it represents a reference for the above mentioned processes.
MIMOSA
The Machinery Information Management Open System Alliance, MIMOSA, was founded in 1994 and introduced in the September 1995 issue of Sound and Vibration. The purpose and goal of MIMOSA is to develop open conventions for information exchange between plant and machinery maintenance information systems. The development of the MIMOSA CRIS (Common Relational Information Schema) has been openly published on their website (www.mimosa.org). The CRIS provides coverage of the information (data) that will be managed within a CBM system. This is done by a relational database schema for machinery maintenance information.
The typical information that needs to be handled is: a) a list of the specific assets being tracked, b) a description of system functions, failure modes and failure mode effects, c) a description of the monitoring system and the characteristics of the monitoring components, d) a record of alarm limits and triggered alarms, e) resources describing degradation in a system as well as prognostics of system health trends, f) a record of recommended actions, and g) a complete record of work requests. The adoption of the CRIS specification in the ontology makes it possible to consider the above-listed information as already enumerated and defined. MIMOSA in particular defines the terminology to adopt and the way the data can then be managed in the practical implementation of diagnosis systems, which however is out of the scope of this paper. It can be concluded that MIMOSA as such is of the highest importance, as it defines the format and relations for most of the data needed in a system to diagnose the condition of rotating machinery.
ISO 15926
ISO 15926 [2] is a standard for integrating life-cycle data across phases (e.g. concept, design, construction, operation, decommissioning) and across disciplines (e.g. geology, reservoir, process, automation). It consists of 7 parts, of which parts 2 and 4 are the most relevant for the present work. Part 2 specifies a meta-model or top-level ontology [3] for defining application-specific terminologies. Part 2 includes about 200 entities. It is intended to provide the basic types necessary for defining any kind of industrial data. Part 4 of ISO 15926 includes application- or discipline-specific terminologies, and it is usually referred to as the Reference Data Library. These terminologies are instances of the data types from part 2. Part 4 contains around 50,000 general concepts. Standards for geometry and topology (Part 3), procedures for adding and maintaining reference data (Parts 5 and 6), and methods for integrating distributed systems (Part 7) are the other parts of the norm, outside the scope of this work. The scope of the information model (Part 2) is to provide: generic concepts associated with set theory and functions; concepts and relationships that describe changes to physical objects during time periods; and generic relationships relevant to engineering, such as connection, composition, containment and involvement (in an activity). ISO 15926 defines a format for the representation of information about a process plant. The basis for ISO 15926 is a record of: a) the physical objects that exist within a process plant, b) identifications of the physical objects, c) properties of the physical objects, d) classifications of the physical objects, e) how the physical objects are assembled, and f) how the physical objects are connected. ISO 15926 does not attempt to standardize all these classes, but instead provides a small set of basic engineering classes which can be specialized by reference to a dictionary. The reference is made by instantiating a proxy for the class defined in a dictionary and by associating information with this proxy, such as:
• the name of the source dictionary, which defines a namespace for the identification of the class;
• the identifier of the class within the source dictionary.
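A minimal sketch of the proxy mechanism just described is given below: the record simply carries the name of the source dictionary and the class identifier. The field names and the example values are illustrative and are not taken from ISO 15926.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClassProxy:
    """Stand-in for a class defined in an external reference data library."""
    source_dictionary: str   # namespace identifying the dictionary
    class_identifier: str    # identifier of the class within that dictionary

# hypothetical example: a centrifugal pump class taken from some reference data library
pump_class = ClassProxy(source_dictionary="ExampleRDL", class_identifier="CENTRIFUGAL_PUMP")
print(pump_class)
```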
ISO 15926 records not only the process plant as it exists at an instant, but also how the process plant changes as a result of maintenance and refurbishment activities, and the requirements for a process plant and the design for a process plant, which may not directly correspond to a process plant as it exists. The reference data library (Part 4), instead, contains a dictionary of basic classes and properties used within the process industries. The dictionary specializes the generic concepts within the information model. The first release of the reference data library contains about 10,000 classes and properties. It is intended that the reference data library will be subject to continual revision and extension as an ISO register. ISO 15926 part 4 standardizes an initial set of a few thousand generic classes. In the scope of the present work, from the above description, partially coming directly from the ISO documentation, it seems that ISO 15926 is almost ready to describe all industrial systems and machines as an aggregation of classes. However, the way ISO 15926 can be accessed through the web (http://15926.org) is not so user-friendly, and some difficulties arise when surfing the website and trying to find specific classes defined by the standard. From a theoretical point of view it can be concluded that our ontology can rely on this standard for what concerns the description of components.
ISO 17359
This International Standard sets out guidelines for the general procedures to be considered when setting up a condition monitoring program for machines and includes references to associated standards required in this process. It is applicable to all machines. ISO 17359 [21] also provides a condition monitoring procedure flow chart, where the procedures from selecting the equipment to monitor to determining the required maintenance actions are enumerated, going through processes that fit the OSA-CBM layers. In the scope of the present work, the part where measurement locations are discussed seems particularly useful. Measurement locations should be chosen to give the best possibility of fault detection. Measurement points should be identified uniquely. The use of a permanent label or identification mark is recommended. Factors to take into consideration are:
• safety,
• high sensitivity to change in fault condition,
• reduced sensitivity to other influences,
• repeatability of measurements,
• attenuation or loss of signal,
• accessibility,
• environment, and
• costs.
It can be concluded that, for vibration condition monitoring, information on measurement locations is contained in ISO 13373-1, while for tribology-based condition monitoring, information on measurement locations is contained in ISO 14830-1. These references are provided within the norm ISO 17359. ISO 17359 also provides an informative table with examples of condition monitoring parameters to monitor according to machine type. This is similar to what the ontology presented in this paper aims to achieve, namely the matching between a machine and the proper diagnostic techniques.
ISO 13373-1
This International Standard [22] is dedicated to condition monitoring. It describes in detail transducers and measurement locations, according to the different problems to diagnose. A vibration condition monitoring flowchart is provided within the documentation of the norm. This flowchart, however, differs from the one provided by ISO 17359, since it is very specific to the practical application of vibration measurement sensors. Without discussing the measurement location, it seems that only the first two steps of that flowchart are worth noting in this context. In the scope of the ontology presented here, instead, Table A.1 of the documentation of the standard completely fits the needs of the ontology that this work aims at defining. In particular, within that table, types and locations of measurement are provided as a function of the machines and parameters to monitor. To this end it is envisioned that the machines enumerated there are to be included in the ontology as types of machines on which vibration analysis can be performed. Annex C of the standard helps to evaluate which causes can be derived from the symptoms identified through vibration analysis on different equipment. Basic but useful information is provided concerning shafts, gears and bearings. The most common causes of turbomachinery and torsional vibration and the resulting vibration characteristics are also provided; they mainly refer to electrical problems. It can be concluded that ISO 17359 provides suggestions coming from the MIMOSA standard for what concerns the definition of measurement locations; this is reported in Annex D of the ISO.
ISO 13379
This standard [23] provides the generic steps of a diagnostics study, which include:
a. analyse the machine availability, maintainability and criticality with respect to the whole process;
b. list the major components and their functions;
c. analyse the failure modes and their causes as component faults;
d. express the criticality, taking into account the gravity (safety, availability, maintenance costs, production quality) and the occurrence;
e. decide accordingly which faults should be covered by diagnostics (“diagnosable”);
f. analyse under which operating conditions the different faults can be best observed and define reference conditions;
g. express the symptoms that can serve in assessing the condition of the machine, and that will be used for diagnostics;
h. list the descriptors that will be used to evaluate (recognize) the different symptoms;
i. identify the necessary measurements and transducers from which the descriptors will be derived or computed.
According to the research scope, this paper aims at covering some of the points mentioned above. Point “b” is partially covered by the ontology proposed here, obviously once it is customized for a specific industrial case. Point “c” is only partially covered, since in the ontology the failure modes and causes as component faults are enumerated, i.e. room is reserved in the ontology for their definition. Point “e” is covered in the sense that monitoring techniques to make the fault diagnosable are suggested. Point “f” is satisfied through the definition of the operating condition of the machine. ISO 13379 also provides an interesting definition of diagnostic methods, which are classified there into two main approaches:
• Numerical methods (neural networks, pattern recognition, statistical methods, the histographic Pareto approach, or other numerical approaches). These methods are generally automatic and do not need deep knowledge of the mechanisms of fault initiation and propagation, but they require a learning period with a large set of observed fault data.
• Knowledge-based methods, which rely on the use of fault models, correct-behaviour models or case descriptions.
This partially fits the definition of Jardine (2006), where diagnostic methods are classified as:
o Statistical approach,
o Artificial Intelligence (AI) approach,
o Physical model based approach.
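As a toy illustration of the numerical (pattern-recognition) family of methods mentioned above, the sketch below classifies a new feature vector by its nearest neighbour among previously observed fault examples. The feature values and fault labels are invented for the example and are not taken from ISO 13379.

```python
import numpy as np

# observed examples: feature vectors (e.g. RMS, kurtosis) with known fault labels (invented data)
examples = np.array([[0.8, 3.0], [0.9, 3.2], [2.5, 6.5], [2.7, 7.0]])
labels = ["healthy", "healthy", "outer-race fault", "outer-race fault"]

def diagnose(features):
    """Nearest-neighbour diagnosis: return the label of the closest known example."""
    distances = np.linalg.norm(examples - np.asarray(features), axis=1)
    return labels[int(np.argmin(distances))]

print(diagnose([2.4, 6.8]))   # -> "outer-race fault"
```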
These diagnostic-method classifications are not included in the ontology presented here, which aims at clarifying concepts for the purpose of classifying the machine environment for diagnostic purposes. However, this information will be kept in mind for the further development of this research.
ISO 18436-2
This standard [24] is not useful in the scope of this paper, namely in the definition of the ontology. The authors would however like to point out that this norm helps to define the amount of time required to train personnel in monitoring techniques. It thus seems that this standard is relevant in practice, above all in the initial phase of implementation of a CBM approach.
ISO 13380
This norm [25] provides an important point of view on machine operating conditions: it states that measurements of different parameters should be taken, wherever possible, at the same time or under the same operating conditions. For variable-duty or variable-speed machines, it may be possible to achieve similar measurement conditions by varying speed, load or some other control parameter. Monitoring should be performed, where possible, when the machine has reached a predetermined set of operating conditions (e.g. normal operating temperature) or, for transients, a predetermined start and finish condition and operating profile (e.g. coast down). These are also conditions which may be used, for a specific machine configuration, to establish baselines. Subsequent measurements are compared to the baseline values to detect changes. The trending of measurements is useful in highlighting the development of faults. This is important information to take into account when evaluating which diagnostic techniques must be selected. The norm is quite restrictive in this sense since, following its instructions, it seems that no diagnostic approach is possible if the conditions are not stable as explained above. However, this is not the case in all situations, and thus the case mentioned above represents only one of the possible industrial environments and conditions in which a machine is called to operate. It can be concluded that Annex A and Annex C of this norm contain examples similar to those in ISO 13373-1. These examples can be taken into account to feed the ontology at its early stage. In particular, Annex C provides faults and symptoms or parameter changes for several types of machines, such as industrial gas turbines, pumps, compressors, electric generators and fans.
5
PROPOSAL OF THE DIAGNOSIS ONTOLOGY
Standards must be integrated with other references in order to approach this kind of problem in a way that also considers very practical issues. Therefore some other references were consulted, mainly coming from industrial presentations. It must be clear to the reader that this part of the paper does not aim at being a complete literature review; rather, it aims at presenting an ontology that is, of course, influenced by the analysed references. Information coming from the above described i) monitoring techniques, ii) standards and iii) review of practical cases is synthesized in the following picture (Figure 2), describing the ontology as a result of the analysis presented in this paper. The ontology presents the classes that classify the environment, the components and the related faults, according to some rules. The rules come from the review presented in the paper. Figure 2 shows the classes, while the model developed in Protégé is also able to provide relations between the classes in order to check the consistency of the classification. The classes are then presented.
Figure 2. Ontology describing the “environment” where the machine operates.
Here a few examples of the analysed documents are presented. In [26] the authors point out that inspection for CBM can be performed based on a plan or upon a call due to an emergency situation. Moreover, in that reference thermography is indicated as a suitable method when the availability of the equipment during the inspection must be guaranteed. It is then important to consider whether the equipment to be monitored must be maintained through regular inspections required by laws and regulations. This led to the definition of the class “Inspection interval”, in order to consider this aspect.
In [27], the authors point out that the different failure modes experienced by rotating machinery result from random shocks to the system or from deterioration mechanisms such as wear and fatigue. Component deterioration rates depend on different factors, such as operational loading, quality of maintenance and other external effects. It is possible to state that components are generally designed to operate satisfactorily, even if stressed, as long as they remain under the forecasted operating conditions. This reference led to the definition of the class describing the capability to handle stress, called “Attitude to handle stress” in the ontology. Large variations of the failure rate, with increased rates of failure, may result from other factors such as reduced safety margins, poor quality of maintenance, hostile operating conditions or extreme environmental conditions. These increased rates may develop as a function of time due to component degradation. From these considerations arises the need to consider operating conditions and type of design in the analysis. In [28] it is stated that complete monitoring techniques, like vibration monitoring and thermography analysis, require special tools and skilled personnel. Moreover, a critical issue is that the analysis of the signals, when performed manually, requires a certain time frame. Reference [28] points out that when there is physical separation between the monitored equipment and the analysis centre, the situation could be critical. This driver must then be taken into account. This reference led to the definition of the class describing the skills of the operators using the machines; it is called “Driver of the machine”. The CBM system can then associate multiple metered equipment values to calculate an overall equipment health factor. For instance, a feed water pump might require the lube-oil pressure to be at a different reading when it is running idle than when it is at full capacity (www.matrikon.com). Consequently, placing high and low alarm points on the lube-oil pressure may not be accurate at different operating conditions. Process analysis software determines the relationship between the various metered equipment values and their interactions. A normal relationship between the values indicates a good health factor, while an abnormal relationship indicates a poor health factor and is a warning that maintenance is required. Furthermore, it is also important to understand the behaviour of the failure mode; this is done by identifying the corresponding class within the macro-class “Failure mode”. The class “Component” contains subclasses defining all the possible components of a production plant. This class is not detailed by our ontology, since the information to build it is contained in ISO 15926. The definition of the “Failure rate value” comes from the traditional statistical maintenance approach, where the failure behaviour is modelled through an exponential function describing a constant failure rate; conversely, with the Weibull function an increasing (or possibly decreasing) failure rate can be modelled. In our ontology the case of a failure rate that is a function of the type of production campaign is also considered. The “Industrial situation” class deals with the classification of the operating environment and the managerial situation: maintenance can be done properly, or poor maintenance can be performed (e.g. in locations where the maintenance culture is not high). A schematic rendering of these classes is sketched below.
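To make the structure of Figure 2 concrete, the following sketch represents the environment-describing classes as plain Python types and links a hypothetical component to them. It is an illustrative rendering only, not the authors' Protégé model, and the example values are invented.

```python
from dataclasses import dataclass

@dataclass
class EnvironmentDescription:
    """Classes describing the 'environment' of a monitored component (after Figure 2)."""
    component: str                  # detailed by ISO 15926, not by this ontology
    failure_mode: str
    inspection_interval: str        # e.g. fixed by law/regulation vs. freely planned
    attitude_to_handle_stress: str  # capability of the component to withstand stress
    driver_of_the_machine: str      # skill level of the operating personnel
    failure_rate_value: str         # constant, increasing (Weibull) or campaign-dependent
    industrial_situation: str       # quality of maintenance / operating environment

feed_water_pump_bearing = EnvironmentDescription(
    component="rolling element bearing",
    failure_mode="outer race wear",
    inspection_interval="regulatory, fixed",
    attitude_to_handle_stress="designed for forecasted operating conditions",
    driver_of_the_machine="skilled operator",
    failure_rate_value="increasing with degradation",
    industrial_situation="good maintenance culture",
)
print(feed_water_pump_bearing)
```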
6
CONCLUSION
This paper presented an ontology that synthesizes part of the material available in the literature and, above all, the standards related to CBM. The focus of the ontology is on describing the “environment” where a machine is called to operate, as well as describing the machine itself through the listing of its components. It is envisioned that the ontology represents a good synthesis of the standards and can be adopted to further develop other tools for machine diagnosis. In particular, some co-authors of this paper are carrying out a project related to the development of a tool aimed at making the definition of an automatic diagnostic system a semi-automatic process. In that project the ontology presented here will be adopted as the basis for a decision-making tool that can decide which diagnostic technique is the most suitable approach in a certain industrial environment and situation. The present work shows how standards must be adopted in order to classify the “environment” where a machine is specified to work. This helps to answer the question “What” concerning the maintenance action to perform on the machine, as stated at the beginning of the paper. The analysis presented here should help, in the scope of machine diagnosis, to improve the maintenance services related to CBM. It is envisioned that, through the better development of the CBM approach, maintenance actions will be more effective and will also allow machines to be kept in a better working condition, saving resources through reduced energy and spare parts consumption.
7
REFERENCES
1
MIMOSA, OSA-CBM Primer, August 2006, www.mimosa.org
2
ISO 15926, www.iso.org
3
R. Batres, M. West, D. Leal, D. Price, M. Katsube, Y. Shimada, T. Fuchino. (2007) An upper ontology based on ISO 15926. Comp. & Chem. Eng., 31(5-6), 519–534
4
Brundtland Commission, (1987) Our Common Future, Report of the World Commission on Environment and Development, World Commission on Environment and Development, Published as Annex to General Assembly document A/42/427, Development and International Co-operation: Environment, August 2.
935
5
Grall, A., Dieulle, L., Berenguer, C. and Roussignol, M. (2002) Continuous time predictive maintenance scheduling for a deteriorating system. IEEE Transactions on Reliability, 51(2), 141-50.
6
Chen, D. and Trivedi, K.S. (2002) Closed-form analytical results for condition-based maintenance. Reliability Engineering and System Safety, 76(1), 43-51.
7
Marseguerra, M., Zio, E. and Podofillini, L. (2002) Condition based maintenance optimization by means of genetic algorithms and Monte Carlo simulation. Reliability Engineering and System Safety, 77(2), 151-65.
8
Jamali, M.A., Ait-Kadi, D., Cle´roux, R. and Artiba, A. (2005) Joint optimal periodic and conditional maintenance strategy. Journal of Quality in Maintenance Engineering, 11(2), 107-14.
9
Rosqvist T., K. Laakso, M. Reunanen, (2009) Value-driven maintenance planning for a production plant. Reliability Engineering and System Safety 94, 97–110.
10 Higgs A., Parkin R., Jackson M., Al-Habaibeh A., Zorriassatine F. and Coy J. (2004) A survey on condition monitoring systems in industry. Proceedings of ESDA 2004, 7th Biennial ASME Conference on Engineering Systems Design and Analysis, July 19-22, Manchester, UK.
11 www.skf.com
12 Nunnari J.J. and Dalley R.J. (1991) An overview of ferrography and its use in maintenance. Tappi Journal, ISSN 0734-1415, 74(8), 85-94.
13 Jantunen E., Adgar A., Arnaiz A. (2008) Actors and roles in e-maintenance. Proceedings of the 5th International Conference on Condition Monitoring and Machine Failure Prevention Technologies.
14 Ierace S., Carminati V. (2007) Application of thermography to Condition Based Maintenance: a case study in a manufacturing company. In: Proc. of 3rd International Conference on Maintenance and Facility Management, pp. 159-166, Roma, Italy, September 27-28, 2007.
15 Mobley R. Keith (2002) An introduction to predictive maintenance. Butterworth-Heinemann.
16 Garetti M., Taisch M. (1997) Automatic production systems (in Italian, original title: Sistemi di produzione automatizzati). 2nd ed., CUSL, Milan.
17 Alberts (1994) Ymir: A sharable ontology for the formal representation of engineering design knowledge. Design Methods for CAD, http://citeseer.ist.psu.edu/alberts94ymir.html, pp. 3-32.
18 Storey V.C., Ullrich H., Sundaresan S. (1997) An ontology for database design automation. Proceedings of the 16th International Conference on Conceptual Modelling, pp. 2-15.
19 Visser P.R.S. and Bench-Capon T.J.M. (1998) A comparison of four ontologies for the design of legal knowledge systems. Artificial Intelligence and Law, 6(1), 25-57.
20 Tassey G. (1992) Technology Infrastructure and Competition Position. Kluwer, Norwell, MA.
21 ISO 17359, www.iso.org
22 ISO 13373-1, www.iso.org
23 ISO 13379, www.iso.org
24 ISO 18436-2, www.iso.org
25 ISO 13380, www.iso.org
26 Thermography analysis on gas turbine systems. Inprotec MCM Milano, April 2007.
27 Moss T.R. and Andrews J.D. (1996) Factors influencing rotating machinery reliability. In: Proceedings of the European Safety and Reliability Data Association Conference, 10th ESReDA, Chamonix, France, pp. 149-171.
28 Neapolitan Engineering Association, Navy Commission. Remote monitoring of ships and maintenance strategies. Eng. Fabio Spetrini.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
MACHINE PROGNOSTICS BASED ON SURVIVAL ANALYSIS AND SUPPORT VECTOR MACHINE
Achmad Widodo a, Bo-Suk Yang b
a Mechanical Engineering Department, Diponegoro University, Tembalang, Semarang 50275, Indonesia.
b School of Mechanical Engineering, Pukyong National University, San 100 Yongdang-dong, Nam-gu, Busan 608-739, South Korea.
An intelligent machine prognostics system estimates the remaining useful life of machine components. It deals with the prediction of machine health condition based on past measured data from condition monitoring (CM). Its benefits are reduced production downtime, spare-parts inventory, maintenance cost and safety hazards. Many papers have reported valuable models and methods for prognostics systems. However, papers dealing with censored data, which are common in machine condition monitoring practice, are rarely found. This work concerns the development of an intelligent machine prognostics system using survival analysis (SA) and the support vector machine (SVM). Survival analysis utilizes censored and uncensored data collected from the condition monitoring routine and then estimates the survival probability of the failure time of machine components. The SVM is trained on input from CM data that corresponds to target vectors of estimated survival probability. After validation, the SVM is used to predict the survival probability of individual machine units. Progressive bearing degradation data were simulated and used for training to validate the proposed method. The results show that the proposed method is promising as a probability-based machine prognostics system.
Key Words: Machine Prognostics, Support Vector Machine, Survival Analysis, Censored and Uncensored Data
1
INTRODUCTION
The study of machine prognostics systems has been carried out by many researchers using their own techniques. Knowing early how much useful life is left in machine components is necessary to support engineering asset assessment, reducing operational downtime, avoiding sudden machine breakdowns, minimizing safety hazards and supporting good maintenance planning. A machine prognostics system is intended to find the remaining useful life of machine components based on various methods, such as data-driven, physics-based and probability-based methods. Among them, data-driven and physics-based methods are more popular than probability-based methods. Data-driven methods use condition monitoring (CM) data, which record the trend of some features over operating time, as input for the predictor, while physics-based methods basically combine system-specific mechanical knowledge, e.g. system dynamics and fatigue growth formulae, with CM data to predict the propagation of defects. Published reviews of machine prognostics systems can be found in Refs. [1-2]. This paper contributes an intelligent machine prognostics system based on probability estimation from the CM data of historical units, where some data are censored, i.e. the units did not undergo failure. This situation commonly occurs in practice when preventive replacements are conducted while the units under study are still operating. Moreover, CM data are integrated with reliability analysis to enable a longer-range prognostics system. The censored data of historical units are rarely considered as prognostics input data and have not been fully utilized, although this phenomenon is very common in practice, where the system does not consist of only a single unit but of a population of units. Therefore, the relation between CM data and the actual survival state of the assets needs to be deduced.
This work complements the intelligent prognostics system of the previous work in Ref. [3] and utilizes the support vector machine (SVM) for predicting the survival probability of the units under study. The training inputs are generated from simulated CM bearing defect degradation data that include censored data. Target vectors are obtained from survival analysis using the Kaplan-Meier (KM) estimator and the probability density function (PDF).
2
INTELLIGENT PROGNOSTICS SYSTEM BASED ON PROBABILITY
2.1 Survival Analysis
Survival analysis is the name for a collection of statistical techniques used to describe and quantify time-to-event data. In survival analysis, we use the term ‘failure’ to define the occurrence of the event of interest and the term ‘survival time’ to specify the length of time taken for failure to occur. Situations where survival analysis has been used include prognostics of the lifetime of machine components, time from diagnosis to death in clinical trials, duration of industrial disputes, time from infection to disease onset, etc. Our work uses survival analysis to estimate the lifetime of machine components. We therefore draw a random sample of these machine components, put them on test, collect and analyse the data and then make inferences from them. This work employs the KM and PDF estimators to generate the survival probability used as target vectors of our prognostics system. The Kaplan-Meier (KM) estimator, also known as the product-limit estimator of the survivor function, is a non-parametric estimator [4] which uses intervals starting at the observed failure times. The standard formula of this survivor function is given by
$$\hat{S}(t) = \prod_{j=1}^{k} \frac{n_j - d_j}{n_j} \qquad (1)$$
Since, by construction, there are $n_j$ units surviving just before $t_j$ and $d_j$ failures occurring at $t_j$, the probability that a unit fails in the interval ending at $t_j$ is estimated by $d_j/n_j$. Thus, the probability of units surviving through $[t_j, t_{j+1}]$ is estimated by $(n_j - d_j)/n_j$. The only influence of the censored data is on the computation of the number of units $n_j$ which survive just before $t_j$. If a censored survival time occurs simultaneously with one or more unit failures, the censored survival time is taken to occur immediately after the failure time. In the case of complete failure data we adapt the previous work in Ref. [3]; that is, when the machine components have reached failure when removed from the machine, the survivor function is calculated by
$$\hat{S}(t+k) = \begin{cases} 1, & 0 \le t+k < T \\ 0, & t+k \ge T \end{cases} \qquad (2)$$
where $T$ is the failure time. A data set is considered censored if the machine components have not reached the failure threshold when removed from the machine. In this work, the standard formula of the KM estimator was modified to produce the cumulative survival probability for an individual machine component unit, given by

$$\hat{S}(t+k) = \begin{cases} 1, & 0 \le t+k < L \\ \displaystyle\prod_{j:\; L \le t_j \le t+k} \frac{n_j - d_j}{n_j}, & t+k \ge L \end{cases} \qquad (3)$$
where L denotes the last observed survival time of the unit machine component. Note that we use the last observed survival time L of each censored unit as the starting time, rather than time 0, to compute appropriate training target survival probabilities.
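A minimal sketch of the product-limit computation in Eqs. (1) and (3) is given below; the event times are invented for illustration. Censored observations only reduce the number of units at risk $n_j$, exactly as described above.

```python
import numpy as np

def kaplan_meier(times, failed):
    """Product-limit estimate of S(t) at each distinct failure time.

    times  : observed times (failure or censoring) of all units
    failed : True if the unit failed at that time, False if it was censored
    Returns (failure_times, survival_probabilities).
    """
    times = np.asarray(times, dtype=float)
    failed = np.asarray(failed, dtype=bool)
    s, out_t, out_s = 1.0, [], []
    for t in np.unique(times[failed]):          # distinct failure times t_j
        n_j = np.sum(times >= t)                # units at risk just before t_j
        d_j = np.sum((times == t) & failed)     # failures at t_j
        s *= (n_j - d_j) / n_j                  # factor of Eq. (1)
        out_t.append(t)
        out_s.append(s)
    return out_t, out_s

# invented example: 6 units, two of them censored (still running when removed)
t, s = kaplan_meier([5, 8, 8, 10, 12, 15], [True, True, False, True, False, True])
print(list(zip(t, s)))
```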
The probability density function (PDF) is employed to estimate the survivor function of each unit $j$, which is derived from the CM data $Y_j(t)$ at time $t$. In this case, the estimated survival probability is the successive multiplication of the probabilities that a unit survives the preceding intervals, i.e. that its condition index remains higher than the observed index of item $j$ but lower than the threshold; this is given by

$$\hat{S}(t+dk) = \prod_{j=1}^{k} \frac{\int_{y_{i,t+dk}}^{Y_{\mathrm{threshold}}} f(y \mid t+dk)\, dy}{\int_{y_{i,t+dk}}^{\infty} f(y \mid t+dk)\, dy} \qquad (4)$$
where $d$ is the time interval. Finally, the target vectors for training are the mean of the survival probabilities obtained by the above methods.
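The sketch below illustrates the ratio of integrals in Eq. (4) for a single interval, assuming, purely for the example, that the condition index at that interval follows a normal distribution; the distribution, threshold and observed index are invented. The full Eq. (4) multiplies such factors over successive intervals.

```python
from scipy.stats import norm

def interval_survival(y_observed, y_threshold, mean, std):
    """One factor of Eq. (4): P(index stays below threshold | index already above y_observed)."""
    cdf_obs = norm.cdf(y_observed, loc=mean, scale=std)
    cdf_thr = norm.cdf(y_threshold, loc=mean, scale=std)
    numerator = cdf_thr - cdf_obs          # integral from y_observed to the failure threshold
    denominator = 1.0 - cdf_obs            # integral from y_observed to infinity
    return numerator / denominator

# invented numbers: observed condition index 1.2, failure threshold 2.0,
# fitted distribution of the index at this interval: N(1.0, 0.5)
print(interval_survival(1.2, 2.0, mean=1.0, std=0.5))
```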
2.2 Support Vector Machine
The theoretical foundation of SVM was developed by Vapnik [5]. Applications of SVM in the field of machine fault diagnosis and prognosis have been reported in Refs. [6-7]. Here, we summarize the SVM considered as a nonlinear regression model in which the dependence of a scalar $d$ on a vector $\mathbf{x}$ is described by
$$d = f(\mathbf{x}) + v \qquad (5)$$
where both the nonlinear function $f(\cdot)$ and the statistics of the additive noise $v$ are unknown. All the information we have is a set of training data $\{(\mathbf{x}_i, d_i)\}_{i=1}^{N}$, where $\mathbf{x}_i$ is a sample value of the input vector $\mathbf{x}$ and $d_i$ is the corresponding value of the target output $d$. The problem is to provide an estimate of the dependence of $d$ on $\mathbf{x}$. In performing nonlinear regression, we map the input vector $\mathbf{x}$ into a high-dimensional feature space in which we then perform linear regression. The architecture of the SVM is shown in Figure 1, where the kernel function is $K(\mathbf{x}, \mathbf{x}_i) = \varphi^{T}(\mathbf{x})\varphi(\mathbf{x}_i)$ and $j = 1, 2, \ldots, m_i$, with $m_i$ the dimension of the feature space. It is assumed that $\varphi_j(\mathbf{x})$ is defined a priori for all $j$. Given such a set of nonlinear transformations, we may define an estimate of $d$, denoted by $y$, as follows
$$y = \sum_{j=1}^{m_i} w_j \varphi_j(\mathbf{x}) + b \qquad (6)$$
where $\{w_j\}_{j=1}^{m_i}$ denotes the linear weights connecting the feature space to the output, and $b$ is the bias.
Figure 1. Architecture of SVM adapted from Ref. [8].
2.3 Prognostics Method
The prognostics method is depicted in Figure 2; it employs degradation data of $j$ units obtained from the CM routine. Feature calculation is performed to obtain good features that represent the clear progressive degradation of the machine. Feature extraction maps the calculated features from a high-dimensional space onto a lower-dimensional space. We employ unsupervised learning, namely principal component analysis (PCA), for feature extraction [9]. A one-dimensional feature is obtained by PCA, from which the survival probability estimates are calculated. The survival probability is generated from the KM and PDF estimators and then regarded as target vectors for SVM training and validation. The quality of the validation process is measured by the root-mean-square error (RMSE): the lower the RMSE, the better the validation. The weights and bias obtained from the validation process are saved and then used to test the ability of the SVM prognostics.
Figure 2. Prognostics method (CM data from j units → feature calculation and extraction → survival probability estimation → target vectors → SVM training and validation → SVM prognosis).
3
MODEL VALIDATION USING SIMULATED DATA
A defective-bearing simulation model was developed to validate the proposed method. The vibration signal of a rolling element bearing with outer race defects under constant radial load was modelled using Matlab. The signals were repeatedly generated by the computer program, while the defect severity was increased exponentially with random fluctuations to represent real conditions. Each simulated signal has defect impulses that increase at different rates and measurement times. The signals were set up to have the same failure threshold, but the time to reach failure was different for each data set. Figure 3 shows the defective bearing signal simulation with defect impulses increasing over time. We calculated three features from the time-domain signals, namely peak, kurtosis and an entropy estimate, and then performed feature extraction by means of PCA to reduce the dimensionality of the calculated features. This feature reduction was intended to minimize the input of the SVM network and the training time. After PCA training, the deviations between the mapped features of the simulated signals and the healthy-state condition were calculated. These deviations are regarded as quantization errors (QE) [10], defined as

$$QE = \lVert \mathbf{x}_j - \mathbf{x}'_j \rVert \qquad (7)$$

where $\mathbf{x}_j$ and $\mathbf{x}'_j$ are the healthy-state vectors and the vectors mapped by PCA, respectively. A schematic sketch of this feature pipeline is given after Figure 3.
Figure 3. Defective bearing signal simulation.
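The sketch below gives one possible, simplified reading of the pipeline described above, using invented signals: peak, kurtosis and a histogram-based entropy estimate are computed per measurement, PCA reduces them to one dimension, and the deviation from the healthy-state mapping is used as a degradation index in the spirit of Eq. (7). It is not the authors' original implementation.

```python
import numpy as np
from scipy.stats import kurtosis, entropy
from sklearn.decomposition import PCA

def features(signal):
    """Peak, kurtosis and a histogram-based entropy estimate of one vibration record."""
    hist, _ = np.histogram(signal, bins=32, density=True)
    return [np.max(np.abs(signal)), kurtosis(signal), entropy(hist + 1e-12)]

rng = np.random.default_rng(0)
healthy = np.array([features(rng.normal(0.0, 1.0, 2048)) for _ in range(20)])
# sparse impulses of growing amplitude mimic a developing outer-race defect
degraded = np.array([features(rng.normal(0.0, 1.0, 2048)
                              + amp * (rng.random(2048) < 0.01))
                     for amp in np.linspace(1.0, 8.0, 20)])

pca = PCA(n_components=1).fit(healthy)           # mapping learned from healthy-state data only
baseline = pca.transform(healthy).mean()         # healthy-state position in the reduced space
qe = np.abs(pca.transform(degraded).ravel() - baseline)   # deviation used as degradation index
print(np.round(qe, 2))
```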
We generated 40 datasets and obtained the corresponding QE values. Thirty-six of the 40 datasets were employed for training and the remaining ones for testing the system. Within the training datasets, 1/3 of the data were imposed as censored data. The target vectors for the training process were obtained from the KM estimator and the PDF. Figure 4 depicts the estimated survival probability of the censored data used for training the SVM. Five prediction horizons were set up to be predicted by the SVM.
4
SIMULATION RESULT AND DISCUSSION
The SVM was trained on input data obtained from the simulated CM bearing defect degradation, with target vectors set up from the survival probability estimation with five prediction horizons. In the training validation, we used the radial basis function (RBF) kernel for mapping the input into the feature space. The RBF kernel parameter (g) and the regularization parameter (C) were selected using 5-fold cross-validation over the ranges g = {1.45, 1.55, …, 1.75} and C = {1, 10, …, 1000} to obtain an optimized training process. The parameters resulting from cross-validation are 1.55 and 1000 for g and C, respectively. The validation of the training process is shown in Figure 5, with an RMSE of 0.073.
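The training set-up described above can be reproduced schematically with a standard SVR implementation. The sketch below runs a 5-fold grid search over the RBF kernel parameter (gamma, playing the role of g) and the regularization parameter C, in the same ranges as reported, on invented degradation/target data; it is not the authors' original code.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 70, 120)).reshape(-1, 1)        # invented degradation index over time
y = np.clip(1.0 - (X.ravel() / 70) ** 2                    # pseudo survival probability target
            + 0.05 * rng.standard_normal(120), 0, 1)

grid = {"gamma": np.arange(1.45, 1.80, 0.10),              # {1.45, 1.55, 1.65, 1.75}
        "C": [1, 10, 100, 1000]}
search = GridSearchCV(SVR(kernel="rbf"), grid, cv=5,
                      scoring="neg_root_mean_squared_error").fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated RMSE: %.3f" % (-search.best_score_))
```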
Figure 4. Estimated survival probability of censored data.
Figure 5. Validation process of SVM training.
After model validation, the result of the prognostics is presented as the survival probability of the testing data, while the failure time is not represented directly. The predicted failure time is identified by noting when the predicted survival probability drops below 0.5. Figure 6 shows the QE of the bearing defect degradation for dataset 37, which fails at t = 67 s and is regarded as testing data. The validated SVM model is then used to predict the survival probability of the testing data (No. 37) based on the weights and bias saved in the model. Using the CM defect degradation of the testing data as input, the validated SVM predicts five horizons of survival probability. The result is presented in Figure 7. A predicted survival probability below 0.5 first emerges at the 5th horizon at t = 43 and is followed by the next 5 horizons with the same survival probability until t = 53. From t = 54, the survival probability decreases to 0.37; at t = 57 (at the 5th horizon) the probability becomes 0.24. This shows that the SVM over-predicts until t = 60. A survival probability of 0.11 is reached at t = 61, and the predictions are then effective for the following horizons until the failure time is reached at t = 68. Compared with the actual failure, the prediction of the failure time is satisfactory, with accuracy
Accuracy = (1 − |ta − tp| / ta) × 100% = (1 − |67 − 68| / 67) × 100% = 98.51%
The predicted failure time t = 68 appears to be an overestimate, owing to the assumption that the training data still had remaining useful life when the measurements were stopped. Nevertheless, an accuracy of 98.51% is acceptable for building the prognostics model.
Figure 6. Defect bearing degradation of dataset No. 37.
Figure 7. Predicted survival probability for simulated dataset No. 37.
5 CONCLUSION
A study of machine prognostics based on survival analysis and SVM has been presented. The proposed method employs bearing degradation trend data generated from a simulated CM routine and survival probabilities obtained from the KM and PDF estimators. The SVM was trained with simulated CM data including censored data to obtain a good prognostics model, which is better than employing only CM data that end in failure. Target vectors are generated by the KM and PDF estimators, which represent the population characteristics of the machines being studied. A prognostics method that employs both population characteristic information and individual unit condition (CM data) is expected to enable longer-range prediction. The results deduced from the simulated data show promise for a probability-based prognostics model. However, the method still needs to be validated in a real application, and this will be future work.
6 REFERENCES
1 Kothamasu R, Huang SH, and VerDuin WH. (2006) System health monitoring and prognostics – A review of current paradigms and practices. International Journal of Advanced Manufacturing Technology, 28, 1012-1024.
2 Jardine AKS, Lin D, and Banjevic D. (2006) A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, 20, 1483-1510.
3 Heng A, Tan A, and Mathew J. (2008) Asset health prognosis incorporating reliability data and condition monitoring histories. In Gao J, Lee J, Ni J, Ma L, and Mathew J (eds) Proceedings of the 3rd World Congress on Engineering Asset Management and Intelligent Maintenance Systems (WCEAM-IMS 2008), Beijing, China, pp. 666-672. London: Springer-Verlag.
4 Kaplan EL, and Meier P. (1958) Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457-481.
5 Vapnik V. (1995) The Nature of Statistical Learning Theory. New York: Springer-Verlag.
6 Widodo A, and Yang BS. (2007) Support vector machine in machine condition monitoring and fault diagnosis. Mechanical Systems and Signal Processing, 21(6), 2560-2574.
7 Widodo A, and Yang BS. (2008) Support vector machine for machine fault diagnosis and prognosis. Journal of System Design and Dynamics, 2(1), 12-23.
8 Haykin S. (1999) Neural Networks, 2nd Ed. Upper Saddle River, New Jersey: Prentice-Hall.
9 Widodo A, and Yang BS. (2007) Application of nonlinear feature extraction and support vector machines for fault diagnosis of induction motors. Expert Systems with Applications, 33(1), 241-250.
10 Huang R, Xi L, Li X, Liu CR, Qiu H, and Lee J. (2007) Residual life predictions of ball bearings based on self-organizing map and back propagation neural network methods. Mechanical Systems and Signal Processing, 21, 193-207.
Acknowledgments The authors gratefully acknowledge the financial support from Brain Korea (BK) 21 project.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
IMAGE HISTOGRAM FEATURES BASED THERMAL IMAGE RETRIEVAL TO PATTERN RECOGNITION OF MACHINE CONDITION Ali Md. Younus a, Achmad Widodo a, Bo-Suk Yang*a a
School of Mechanical Engineering, Pukyong National University, san 100,Yongdang-dong, Nam-gu, Busan 608-739, South Korea.
Thermal image investigation enables up-to-date and remote diagnosis of machine condition, which is an important part of industrial maintenance. Using thermal images, information on the machine condition can be investigated more easily than with other conventional methods of machine condition monitoring. In the current work, the behaviour of thermal images is investigated for different machine conditions. A test rig representing an industrial machine was set up to produce thermal image data in experiments. Significant features were extracted and selected by means of PCA and ICA, and irrelevant features were discarded. The aim of this study is to retrieve thermal images by selecting proper features to recognize the fault pattern of the machine. The results show that classification of thermal image features by SVM and other classifiers can serve machine fault diagnosis. Key Words: Thermal image, Features, Condition monitoring, Pattern recognition.
1 INTRODUCTION
Among condition monitoring tools, infrared thermography (IRT) is an important one that can assist in reducing maintenance time and cost in industry. IRT allows inspection of mechanical machinery for thermal patterns on pumps, motors, bearings, fans, pulleys and other rotating machinery [1]. IRT plays a significant role in new approaches to machine condition monitoring because it supports frequent fault diagnosis. Indeed, a thermal image is able to indicate whether the machine condition is normal or abnormal. For example, the support bearings contain very useful information about the machine condition, so the condition monitoring (CM) data in real applications should be measured on them. Infrared (IR) imaging is used in industry as part of the non-destructive evaluation of machine condition, especially to check misalignment, bent shafts and rolling element bearing faults based on thermographic (temperature) data [2-4]. Hence, a data analysis technique is essential in every approach to machine condition identification. Fault diagnosis of rotating machinery can be handled as a pattern recognition task consisting of three steps: data acquisition, feature extraction and final condition identification [13]. In order to determine whether the condition is normal or abnormal, it is important to complete all signal processing steps whatever the signal type, such as vibration, image, current or acoustic signals. In this study, image histogram features have been chosen for pattern classification and condition monitoring, because not all thermal image features are useful for fault diagnosis. Images carry many features such as shape, histogram, colour, spectra and texture [5]. Fault pattern classification for images typically includes image acquisition, pre-processing, segmentation, feature extraction or dimension reduction, feature selection, classification and decision steps; the task is to explore the model that generated the patterns. Histogram-based features such as standard deviation, skew, variance, energy and entropy, as dimensionless parameters, are effective and practical in fault diagnosis of rotating machinery due to their relative sensitivity to early faults and their robustness to varying load and speed. Here, dimensionless parameters are extracted from the raw IRT data, which unfortunately have a large dimensionality that may increase the computational burden of a subsequent classifier and degrade its generalization capability. Therefore, to overcome these difficulties, a few salient features that characterize the machine operating condition need to be selected from all features. Methods such as the distance evaluation technique [6], genetic algorithms [7, 8] and conditional entropy [9] have been applied to seek the proper features that establish the machine characteristics. After normalization, the extracted features are fed into the classification algorithm to
identify the machine status as the final step of condition monitoring. Widodo and Yang [6] and Son et al. [10] used support vector machine (SVM) classifiers for vibration and current signals, which performed very well for fault diagnosis. Niu et al. [11] introduced multi-agent decision fusion using several classifiers together, and Yang et al. [12] presented a new approach to fault classification, the adaptive resonance theory Kohonen neural network (ART-KNN). This paper provides a fault diagnosis scheme for rotating machinery based on thermographic signals, employing image histogram features. In the experiment, four machine conditions were measured to acquire data. The acquired data are processed as follows: firstly, features of the thermal images are calculated based on image histogram features; secondly, feature extraction is conducted by the pre-processing techniques PCA and ICA; lastly, the extracted data are used as input to the classification algorithms: support vector machines (SVM), fuzzy k-nearest neighbour (FKNN), adaptive resonance theory Kohonen neural network (ART-KNN) and the Parzen probabilistic neural network (PPNN). The proposed method is tested by characterizing different conditions of a machine fault simulator (MFS). The results validate this method for assessing the machine state.
2 FEATURE EXTRACTION AND EVALUATION
Feature extraction is one of the most important parts of condition monitoring and fault diagnosis; its aim is to find a simple and effective transform of the original signals. Important features contained in the signal can be extracted for machine condition monitoring and fault diagnosis. The selected features are the major factor determining the complexity and success of the signal pattern classification process. Image features are discussed in detail by Umbaugh [5]. For thermal image analysis, image histogram features have been used because the data structure of a thermograph in temperature scale is similar to the gray level distribution of an image, i.e. an array structure.
2.1 Histogram Features
The histogram features can be considered statistically based features which provide information about the characteristics of the gray level distribution of the image. Note that, for a thermal image, the gray level depends on temperature and therefore varies with it. Consider an image I; the first-order histogram probability P(g) can be expressed as
P(g) = N(g) / M    (1)
where M is the number of pixels in the image I (or sub-image, whose entire dimension is N×N) and N(g) is the number of pixels at gray level g. The mean is the average value, which gives some information about the general brightness of the image. As the colour distribution varies with temperature, the thermal image can be classified according to its colour intensities. We use L as the total number of available gray levels, ranging from 0 to 256 for an image; for a thermographic signature this is comparable with the temperature range, from 0 K up to the maximum temperature value. Therefore, the mean can be defined as

ḡ = Σ_{g=0}^{L-1} g P(g) = Σ_r Σ_c I(r, c) / M    (2)
Variance is defined as a measure of the dispersion of a set of data points around their mean value.

σg² = Σ_{g=0}^{L-1} (g − ḡ)² P(g)    (3)
The standard deviation, which is the square root of the variance, tells us something about the contrast. It describes the spread in the data, so a high-contrast image will have a wide temperature distribution over the image; using it, thermal images of various machine conditions can be classified. The standard deviation is defined as

σg = sqrt( Σ_{g=0}^{L-1} (g − ḡ)² P(g) )    (4)
The skew S measures the asymmetry about the mean in the gray level distribution. It is defined as
S = (1/σg³) Σ_{g=0}^{L-1} (g − ḡ)³ P(g)    (5)
The energy E tells us something about how the gray levels are distributed:

E = Σ_{g=0}^{L-1} [P(g)]²    (6)
The energy measure has a maximum value of 1 for an image with a constant value, and gets increasingly smaller as the pixel values are distributed across more gray level values. The entropy Et is a measure that tells us how many bits we need to code the image data, and is given by

Et = −Σ_{g=0}^{L-1} P(g) log₂[P(g)]    (7)
As the pixel values in the image are distributed among more gray levels, the entropy increases; a complex image has higher entropy than a simple image. The kurtosis K is the ratio of the fourth central moment to the square of the variance:

K = (1/σg⁴) Σ_{g=0}^{L-1} (g − ḡ)⁴ P(g)    (8)
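As a concrete illustration, the histogram features above can be computed directly from a temperature matrix. The following minimal sketch (not the authors' code) assumes a NumPy array of per-pixel temperatures that is quantised to L gray levels before the statistics are taken:

```python
import numpy as np

def histogram_features(temps, L=256):
    """Histogram features (eqs. 1-8) of a thermal image given as a 2-D
    temperature array; linear quantisation to L gray levels is an assumption."""
    t = np.asarray(temps, float)
    g_img = np.floor((t - t.min()) / (t.max() - t.min() + 1e-12) * (L - 1)).astype(int)
    M = g_img.size
    P = np.bincount(g_img.ravel(), minlength=L) / M          # eq. (1)
    g = np.arange(L)
    mean = np.sum(g * P)                                     # eq. (2)
    var = np.sum((g - mean) ** 2 * P)                        # eq. (3)
    std = np.sqrt(var)                                       # eq. (4)
    skew = np.sum((g - mean) ** 3 * P) / std ** 3            # eq. (5)
    energy = np.sum(P ** 2)                                  # eq. (6)
    nz = P[P > 0]
    entropy = -np.sum(nz * np.log2(nz))                      # eq. (7)
    kurt = np.sum((g - mean) ** 4 * P) / var ** 2            # eq. (8)
    return np.array([mean, var, std, skew, energy, entropy, kurt])

# Synthetic stand-in for one thermal frame (temperatures in Kelvin)
features = histogram_features(np.random.uniform(300, 320, (240, 320)))
```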
2.2 Feature Extraction and Classification
When the input data to an algorithm are too large to be processed and are suspected to be highly redundant (much data, but not much information), the input data are transformed into a reduced representation set of features (also called a feature vector). Transforming the input data into the set of features is called feature extraction. If the features are carefully extracted, it is expected that the feature set will contain the relevant information from the input data needed to perform the desired task. Two feature extraction methods are applied in the current work: principal component analysis (PCA) and independent component analysis (ICA). PCA is a classical statistical method often used in machine fault diagnosis and pattern classification. ICA is a technique that transforms a multivariate random signal into a signal whose components are mutually independent in the complete statistical sense. For classification, the SVM, ART-KNN, FKNN and PPNN algorithms are used for machine condition diagnosis.
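A hedged sketch of this reduction step (assuming scikit-learn; the component count of three matches the plots discussed later, but the standardisation choice and placeholder data are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA
from sklearn.preprocessing import StandardScaler

# X: rows = thermal images, columns = the seven histogram features above
X = np.random.rand(120, 7)                 # placeholder feature matrix
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)                  # keep components with the largest eigenvalues
X_pca = pca.fit_transform(X_std)

ica = FastICA(n_components=3, random_state=0)
X_ica = ica.fit_transform(X_std)
```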
3 EXPERIMENTS AND MEASUREMENT
3.1 Experimental setup
Figure 1 shows the fault simulator with the thermo-cam sensor placed apart from it. A short shaft of 30 mm diameter is attached to the motor shaft through a flexible coupling, which minimizes the effects of misalignment and the transmission of vibration from the motor. Using the coupling, a misalignment condition of the fault simulator can be set. The shaft is supported at its ends by two ball bearings. A disk attached to the shaft is used to create balanced and unbalanced conditions of the fault simulator; to create unbalance, an extra mass is added to the disk. A variable speed DC motor (0.5 HP) with a speed of up to 3450 rpm is used as the
Figure 1. Experimental setup.
Figure 2. Original Thermal Image.
basic drive. Table 1 shows the main specification of thermo cam and fault simulator. The sensor used in the experiments for this study is a long-wave IR camera from FLIR with a thermal sensitivity of 0.08 °C at 30 °C.
Table 1 Specification of thermo-cam and fault simulator

Thermo-cam (FLIR A40 series):
  Solid state, uncooled microbolometer detector, 7.5 to 13 µm
  Storage temperature range: -40 °C to +70 °C
  Thermal sensitivity: 0.08 °C at 30 °C

Fault simulator:
  Shaft diameter: 30 mm; Bearings: two ball bearings
  Bearing housings: two bearing housings, aluminium horizontally split bracket for simple and easy changes, tapped to accept transducer mount
  Bearing housing base: completely movable using jack bolts for easy misalignment in all three planes
  Rotors: two rotors, 6" diameter with two rows of tapped holes at every 20° (with lip for introducing unbalance force)
3.2 Experimental procedure
In this experiment, the thermo-cam is the key device, and several of its parameters were set for data acquisition. Some specifications of the thermo-cam were given in the previous section. One important parameter for data acquisition is the emissivity of the object, which plays a fundamental role in image (signal) acquisition; this parameter was handled automatically by the thermo-cam, and all machine materials were approximately the same. Other object parameters, for example relative humidity, temperature scale, camera focal length and distance, were set as required by the experiment. For all four conditions — normal, mass unbalance, misalignment and bearing fault — the same parameters were used. In the current study, different types of faulty machine condition are analysed through this experiment. Firstly, the normal condition of the machine was set, after which the motor speed was increased gradually up to 900 rpm. The machine was then run for five minutes to reach a stable condition, and data acquisition was launched. Experiments for the normal, misalignment and mass unbalance conditions were conducted successively. Data from the thermo-cam were saved directly to a notebook or PC. The image from the thermo-cam is shown in Figure 2, which conveys only a visual inspection of the machine condition. Figure 3 shows the data structure from the thermo-cam in Kelvin scale, which was processed for fault diagnosis.
Figure 3. Temperature in Kelvin Scale at each pixel of thermal image.
4 RESULT AND DISCUSSION
4.1 Feature representation and feature extraction
In the following feature extraction process, histogram-based image features of the thermal image data have been used, which is a new approach to machine condition monitoring.
Figure 4. Original features selected randomly.
Figure 5. Clustering of features by PCA.
Figure 6. Clustering of features by ICA.
In fact, the original feature data could not be clustered well because of their high-dimensional structure and mutual overlap. This is shown in Figure 4, where three of the original features (out of 120 records) were chosen manually without applying any feature extraction algorithm. The features appear scattered, since only a limited number of the data can be presented, and the different fault types overlap each other. These data cannot be separated, and they cannot be fed directly into a classifier because they would degrade its performance. To overcome these problems, useful features must be extracted and the dimensionality of the original feature data reduced. Accordingly, dimension reduction and feature extraction algorithms such as PCA and ICA have been employed to avoid disordered data. In this study, PCA and ICA are applied based on the variation of the eigenvalues: the largest eigenvalues are retained and the remainder discarded, and the PCA components are calculated from the retained eigenvalues. The first three principal components are plotted in Figure 5, and the different classes of machine condition are well separated. Comparing the randomly selected original features (Figure 4) with the PCA features (Figure 5), the PCA-extracted data clearly perform better than features chosen randomly or manually. Figure 6 shows the thermal image histogram features extracted by ICA, i.e. the independent component analysis of the four machine conditions. The features extracted by ICA separate the data better than PCA or the randomly chosen features: the four machine-condition classes are clearly separated and each cluster is well away from the others in Figure 6.
Table 2 Classifier performance

                        Supervised classifiers      Neural network classifiers
Features        Data    SVM       FKNN              ARTKNN    PPNN
PCA             Valid   0.9833    0.9167            0.7500    0.8167
                Test    0.8667    0.8333            0.8000    0.7833
ICA             Valid   1.0000    1.0000            0.9660    0.9312
                Test    0.9861    0.9633            0.9425    0.9100
Original data   Valid   0.7800    0.7837            0.6941    0.6965
                Test    0.7487    0.7600            0.6528    0.6281
4.2 Training and Classification
The parameter setup for the classifiers follows Niu et al. [11], Widodo et al. [6], Son et al. [10] and Yang et al. [12], who carried out extensive research on classifiers for machine fault diagnosis to find optimum classifier parameters. In the current work, the RBF kernel is used as the basis function of the SVM; it has two parameters, C and g, whose optimal values are set to 10 and 2⁻² respectively. ART-KNN is a classifier of the neural network family whose main parameters are the number of neurons and a criterion parameter denoted r. A criterion parameter r > 0.96 indicates the optimal number of neurons, because it is directly proportional to the number of neurons; to achieve satisfactory performance, the number of neurons is set to 27. The performance of FKNN depends on the parameter K, so finding a suitable K is an important problem; values of k < 5 gave satisfactory classification results, and in practice the FKNN parameter value differs from case to case. PPNN is a simple type of neural network used to classify data vectors. In the classification process, fifty percent of the feature data were used for training and the remainder for test validation. After classification, the training and testing accuracies of the four classifiers can be observed in Table 2. For the thermal image data, the classification performance using the ICA data is much better than with the PCA and original feature data for all classifiers. Two types of classifier are compared: supervised learning methods and neural network classifiers. The best classification accuracy for the thermal image data is obtained by the SVM and FKNN classifiers, with a training (validation) value of 1.0000; both belong to the family of supervised classifiers. However, only the SVM test performance is clearly better than the other classifiers. ART-KNN and PPNN belong to the family of neural network classifiers; their performance is also appreciable, as shown in Table 2. It can be concluded that all classifiers used in this work are validated for thermal image data in machine condition diagnosis, which is of course not the case for the original features.
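A minimal sketch of this comparison for the two supervised classifiers (assuming scikit-learn; the SVM parameters C = 10 and g = 2⁻² follow the text, while the plain k-nearest-neighbour model below is a non-fuzzy stand-in for FKNN and the data are placeholders):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Placeholder ICA-extracted features and the four condition labels
X = np.random.rand(120, 3)
y = np.repeat([0, 1, 2, 3], 30)   # normal, misalignment, unbalance, bearing fault
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                          stratify=y, random_state=0)

svm = SVC(kernel="rbf", C=10, gamma=2 ** -2).fit(X_tr, y_tr)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)

print("SVM valid/test:", svm.score(X_tr, y_tr), svm.score(X_te, y_te))
print("kNN valid/test:", knn.score(X_tr, y_tr), knn.score(X_te, y_te))
```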
5 CONCLUSION
This work presents a useful application of thermography in the area of machine condition monitoring and fault diagnosis. Thermographic data were used to investigate different types of machine fault. First, the experiment was carried out for four machine conditions under the same experimental conditions, after which the raw data were extracted from their original structure to be compatible with the data processing technique. Histogram features, based on image statistics, were employed as suitable features for the thermal image data. The calculated image features were passed to feature extraction algorithms because of their large dimensionality. The data extracted by ICA show better clustering performance than PCA. Finally, a comparison of classifier accuracy using the original, ICA and PCA data shows notably better results for SVM than for the other machine learning methods.
6 REFERENCES
1 Meola C, and Carlomagno GM. (2004) Recent advances in the use of infrared thermography. Measurement Science and Technology, 15, R27–R58.
2 Thomas R. The continued and future use of infrared thermography. Proceedings of the 2nd WCEAM, pp. 1897-1907.
3 Abdel-Qader I, Yohali S, Abudayyeh O, and Yehia S. (2008) Segmentation of thermal images for non-destructive evaluation of bridge decks. NDT&E International, 41, 395-405.
4 Epperly RA, Herberlin GE, and Eads LG. (1997) A tool for reliability and safety: predict and prevent equipment failure with thermography. IEEE Applications Society Proceedings, pp. 59-68.
5 Umbaugh SE. (2005) Computer Imaging: Digital Image Analysis and Processing. Taylor & Francis.
6 Widodo A, and Yang BS. (2007) Application of nonlinear feature extraction and support vector machines for fault diagnosis of induction motors. Expert Systems with Applications, 33(1), 241-250.
7 Samanta B. (2004) Artificial neural networks and genetic algorithms for gear fault detection. Mechanical Systems and Signal Processing, 18, 1273-1282.
8 Jack LB, and Nandi AK. (2002) Fault detection using support vector machines and artificial neural networks, augmented by genetic algorithms. Mechanical Systems and Signal Processing, 16, 373-390.
9 Lehrman M, Rechester AB, and White RB. (1997) Symbolic analysis of chaotic signals and turbulent fluctuations. Physical Review Letters, 78, 54-57.
10 Son JD, Niu G, Yang BS, Hwang DH, and Kang DS. (2009) Development of smart sensors system for fault diagnosis. Expert Systems with Applications, in press.
11 Niu G, Han T, Yang BS, and Tan ACC. (2007) Multi-agent decision fusion for motor fault diagnosis. Mechanical Systems and Signal Processing, 21, 1285-1299.
12 Yang BS, Han T, and An JL. (2004) ART-KOHONEN neural network for fault diagnosis of rotating machinery. Mechanical Systems and Signal Processing, 18, 645-657.
13 Qiao S, Ping C, Da Z, and Feng X. (2004) Pattern recognition for automatic machinery fault diagnosis. Transactions of the ASME: Journal of Vibration and Acoustics, 126, 307-316.
Proceedings of the 4th World Congress on Engineering Asset Management Athens, Greece 28 - 30 September 2009
ACOUSTIC EMISSIONS OBSERVED FROM A NATURALLY DEGRADING SLOW SPEED BEARING AND SHAFT
M. Elforjani, D. Mba
School of Engineering, Cranfield University, Cranfield, Beds. MK43 0AL, UK.
Condition monitoring through the use of Acoustic Emission (AE) is gaining acceptance as a useful complementary tool. This study presents an experimental investigation of incipient fault detection on a low speed thrust bearing and shaft with AE technology. A low speed test rig was employed to accelerate natural degradation of both machine components, and AE measurements were acquired throughout the life of the components. It is shown that AE technology offers the opportunity to monitor crack propagation and subsequent fracture in slow speed rotating machines. Key Words: Condition Monitoring, Acoustic Emission, Slow Speed bearing, Shafts
1 INTRODUCTION
Shafts and bearings are critical components in rotating machines and are often expected to carry heavy loads and operate reliably. Undetected failure can cause significant damage to machinery, affect production rates and even raise safety concerns. In condition monitoring of low speed components, faults can be very difficult to detect because of their relatively low vibration energy; thus, traditional methods of vibration measurement and analysis may not be able to detect growing faults. The major problems associated with employing conventional vibration monitoring techniques on such slow rotating machines were presented by Jamaludin et al [1]. However, the high sensitivity of AE in detecting early stages of loss of mechanical integrity has become one of its significant advantages over the well-established vibration monitoring technique. Acoustic emission (AE) is the phenomenon whereby transient elastic waves are generated by a rapid release of strain energy from localized sources within or on the surface of a material [2]; the typical frequency content of AE is between 100 kHz and 1 MHz. Limitations and difficulties in the application of AE to machinery have been detailed [3]. Several applications of AE technology for machine health monitoring over the last 20 years have been presented [4], and considerable success has been reported in the application of AE to monitoring slow speed machinery components (e.g. bearings) [5, 6, 7, 8]. To date, the investigations undertaken by Elforjani et al [9, 10] are among the few publications that address natural mechanical degradation of rotating machine components. This paper presents an experimental assessment of the application of AE technology to detecting natural crack propagation in a slow rotating thrust bearing and shaft.
2 EXPERIMENTAL PROGRAM
The test rigs employed for the following tests are shown in figures 1 to 3. The shaft test rig design incorporated three bearings on one rotating shaft, a coupling system and an electrical geared motor (MOTOVARIO-Type HA52 B3-B6-B7 j20, 46-Lubricated: AGIP). Two single-row tapered roller bearings (SKF 30207 J2/Q) were employed to support the shaft. An overhung single-row cylindrical roller bearing (SKF NU 1007 ECP) was used to locate the hydraulic load rod onto the shaft. To accelerate crack initiation and growth, a V-notched shaft of 235 mm length and 35 mm diameter was designed. A radial load was applied to the shaft through the overhung bearing by a hydraulic system. The test rig rotational speed was kept constant at 72 rpm. A flexible coupling was employed between the test shaft and drive motor, see figure 1. To detect AE during shaft rotation, an oil-filled cylinder was designed. This is similar to a hydraulic bearing, but in this case the cylinder is completely filled with oil, see figure 1. A flat was machined on the outer periphery of the cylinder for placement of the AE sensor. A rubber seal on both cylinder sides ensured no leakage, and consequently there was no mechanical contact between the shaft and oil bath that could result in AE noise generation. This enclosed circular bath allowed direct contact between the rotating shaft and the oil and was completely filled with oil (CASTROL, Alpha, SP, 460, 3186DM).
Figure 1 Test-rig layout
On the bearing test rig, figures 2 and 3, one of the challenges was to accelerate the natural crack signatures at the early stage of defect development. To implement this, a combination of a thrust ball bearing and a thrust roller bearing was selected. One race of a thrust ball bearing (SKF 51210) was replaced with a flat race taken from a thrust roller bearing (SKF 81210 TN) of the same size, as shown in figures 2 and 3. As a consequence, this arrangement caused higher contact pressure on the flat track relative to the grooved track, due to the reduced contact area between the ball elements and the flat race. For this study, bearing run-to-failure tests were performed under natural damage conditions. The bearing test rig consisted of a hydraulic loading device, a geared electric motor, a coupling and a supporting structure. The test bearing was placed between the fixed thrust loading shaft and the rotating disk, which housed the grooved race. The flat race was fitted onto the loading shaft in a specifically designed housing. This housing was constructed to allow placement of AE sensors and thermocouples directly onto the race, see figure 3. The thrust shaft was driven by a hydraulic cylinder which moved forwards to load the bearing and backwards to allow periodic inspections of the test bearing face. The rotating disk was driven by a shaft attached to a geared motor with an output speed of 72 rpm. A thrust bearing (SKF 81214 TN) was placed between the coupling and the test bearing to react the axial load. A flexible coupling was employed between the shaft and the geared motor.
Figure 2 Test bearing arrangement for accelerated failure on the flat race
Figure 3 Test bearing and location of measuring sensors
3 INSTRUMENTATION
The AE from the rotating shaft was measured throughout the tests. Two Physical Acoustics Corporation WD transducers were employed; these piezoelectric sensors have a bandwidth of 200-750 kHz. The AE sensors were attached to the overhung bearing and the cylinder using superglue. Pre-amplification was set at 40 dB. The system was set to continuously acquire AE absolute energy (Joules) over a time constant of 10 ms at a sampling rate of 100 Hz. The absolute energy is a measure of the true energy and is derived from the integral of the squared voltage signal divided by the reference resistance (10 kΩ) over the duration of the AE signal. In addition, AE waveforms were periodically acquired at a sampling rate of 2 MHz. In all cases, AE measurements were taken simultaneously from all AE sensors. For the bearing test rig, the AE acquisition system employed commercially available piezoelectric sensors (Physical Acoustics Corporation type "PICO") with an operating range of 200-750 kHz at temperatures ranging from -65 to 177 °C. Four AE sensors, together with two thermocouples (RoHS-Type: J x 1M 455-4371), were attached to the back of the flat raceway, see figure 2. The acoustic sensors were connected to a data acquisition system through a preamplifier set at 40 dB. The system was set to continuously acquire AE absolute energy (Joules) over a time constant of 10 ms at a sampling rate of 100 Hz. AE waveforms were periodically acquired at a sampling rate of 2 MHz. In all cases, AE measurements were taken simultaneously
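The absolute energy parameter described above can be sketched numerically as follows; this is a hedged illustration rather than the acquisition system's implementation, with the 10 ms window, 2 MHz waveform rate and 10 kΩ reference resistance taken from the text and the waveform itself synthetic.

```python
import numpy as np

def absolute_energy(voltage, fs, r_ref=10e3):
    """Integral of the squared voltage over the record duration, divided by
    the reference resistance (result in Joules)."""
    voltage = np.asarray(voltage, float)
    return np.sum(voltage ** 2) / fs / r_ref

# Synthetic 10 ms window of an AE waveform sampled at 2 MHz
fs = 2e6
t = np.arange(0, 10e-3, 1 / fs)
waveform = 1e-3 * np.exp(-t / 2e-3) * np.sin(2 * np.pi * 300e3 * t)  # decaying 300 kHz burst
print(absolute_energy(waveform, fs))
```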
from all four AE sensors. Under normal conditions of load, rotational speed and good alignment, surface damage begins with small cracks, located between the surface of the flat track and the rolling elements, which gradually propagate to the surface generating detectable AE signals. To determine the sub-surface stresses on the test bearing, and thereby estimate the time or number of cycles to surface fatigue on the race, the following theories were employed: the Hertzian theory for determining surface stresses and deformations [11], the Thomas and Hoersch theory for sub-surface stress [11], and the Lundberg and Palmgren theory for fatigue evaluation [12]. For the grooved race the standard procedure, as described by BS (British Standards Documents) 5512:1991, was employed for determining the dynamic load rating. Finally, the anticipated life for the defined stresses was computed for both the grooved and flat races, and the results clearly illustrated that surface fatigue, such as flaking, could be initiated on the flat race within a few days depending on the load condition, thereby authenticating the test-rig design. The test rig rotational speed was 72 rpm and an axial load of 35 kN was employed for this particular test, which according to the calculations would cause flaking/surface damage (L10) within 3 days on the flat race, whilst the grooved race would show similar defect signs after 53 days, see table A1 in appendix A. The L10 is defined as the number of hours in service that 90% of a group of bearings subjected to the same conditions (load, speed and lubrication) will survive before the onset of fatigue. It should be noted that the theoretical estimation of rolling contact fatigue is known to be subject to variability or scatter when compared to experimental results, and this has been attributed to the probability of inclusions in the steel material located in the highest load zones of the race [13]. It is also worth noting that the actual test period leading to visual damage on the race was much shorter than the theoretical calculations predicted. This variation was random but always earlier than predicted; it is attributed to issues such as misalignment, unbalance, etc., which are not incorporated in theoretical estimates, although best efforts were undertaken to minimise these.
4 EXPERIMENTAL OBSERVATIONS, ANALYSIS AND DISCUSSIONS
4.1 Shaft tests
For this particular paper, observations of continuous monitoring of the AE levels over 300 mins of shaft operation, which reflect the general observations from the other experimental tests, are presented in figure 4. A radial force of 4 kN was applied to the test shaft. At the end of the test (300 mins) fatigue failure had occurred in the region of the V-notch, see figure 5. A relative increase in AE energy was noted between 75 and 90 mins of operation; this was attributed to crack initiation. After approximately 90 mins of operation, the AE levels reduced to those prior to the increased AE activity. It was also observed that at approximately 240 mins into the test the AE showed significant increases in level, albeit transient in nature. A continuous increase in AE levels was noted from 250 mins of operation until the final fracture. Prior to fracture, the opening and closing of the crack was also visibly observed, and final fracture occurred before 300 mins. Visual examination of the fracture surfaces revealed that the fatigue crack had propagated along the circumference of the shaft. Moreover, the fractured surfaces had been rubbing during rotation of the shaft.
Figure 4 Observations of a run-to–failure shaft test.
Figure 5 Observed fatigue failure on several test shafts
Interestingly, the AE waveforms, sampled at 2 MHz, showed changing characteristics as a function of time, as shown in figure 6. These are typical AE waveforms associated with AE transient events after 225, 250 and 275 mins of operation. It was noted that the AE waveforms at 50, 100 and 225 mins of operation were at levels equivalent to electronic noise. It is also particularly interesting to note that the waveforms at 250 and 275 mins of operation showed high AE transient bursts superimposed on continuous-type AE; the authors postulate that these high AE transients are associated with the rapid propagation of the crack in the shaft and are a clear indication of a fatigued shaft. The AE waveforms recorded throughout the test period were further evaluated using spectrogram analysis. The use of the spectrogram is particularly appropriate as it gives information about the measured AE signals in both the frequency and time domains. A spectrogram was computed on part of the AE waveform recorded at 250 mins into operation, presented in figure 6; the spectrogram is presented in figure 7. It was evident that the large transient AE burst contained frequencies ranging from 100 to 900 kHz, whilst the continuous type of AE contained frequencies between 100 and 300 kHz. The authors believe that the high frequency components are attributed to the release of elastic waves from the propagating crack in the shaft, whilst the low frequency components are due to shearing and rubbing of the cracked faces noted at the end of the test.
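A time-frequency view of this kind can be obtained, for example, with a short-time Fourier transform. The following hedged sketch (not the authors' processing chain) assumes SciPy and a 2 MHz waveform, here synthesised as continuous low-frequency AE with one broadband transient burst:

```python
import numpy as np
from scipy import signal

fs = 2e6                                   # AE waveform sampling rate (2 MHz)
t = np.arange(0, 0.05, 1 / fs)
waveform = (0.01 * np.sin(2 * np.pi * 200e3 * t)
            + (np.abs(t - 0.025) < 1e-4) * np.random.randn(t.size))

f, tt, Sxx = signal.spectrogram(waveform, fs=fs, nperseg=1024, noverlap=512)
# Sxx[f_idx, t_idx] is the power spectral density; plotting 10*log10(Sxx)
# against tt (time) and f (frequency) gives a figure analogous to figure 7.
```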
Figure 6 AE waveforms at 225-, 250- and 275-mins of operation
Figure 7 Spectrogram results after 250-mins of operation.
4.2 Bearing tests
For this particular paper, observations of continuous monitoring of the AE levels over 18 hrs of bearing operation, which reflect the general observations from over a dozen experimental tests, are presented in figure 8. An axial force of 35 kN was applied to the test bearing. It was observed that at approximately 13 hrs into the test the AE showed significant increases in level, albeit transient in nature. A continuous increase in AE levels was noted from 17 hrs of operation until the test was terminated. It is worth noting that the lubricant temperature was recorded by two thermocouple channels attached to the back of the flat raceway. Following run-in (0 to 1 hr) the bearing temperature stabilized at 35 °C, and after 18 hrs of operation a maximum temperature of 37 °C was recorded, see figure 8. Visual inspection after 18 hrs of operation indicated that surface damage had occurred in a region located approximately 20 mm from the ch-1 AE sensor, see figure 9; the relevance of this will become evident later in the paper.
Figure 8 Observations of a run-to–failure bearing test.
Figure 9 Surface damage on flat bearing race, see 'box' section
It is particularly interesting to note the waveforms at 10, 14 and 16 hrs of operation, where a periodicity of AE transient bursts of approximately 9 Hz was noted, see figure 10; this periodicity corresponds to the defect frequency (9 Hz) of the bearing and is a clear indication of a damaged bearing, as noted by several authors [4 and 7].
Figure 10 AE waveforms at 10-, 14- and 18-hrs of operation
The capability of AE to determine, in real time, the source locations of signals emanating from materials under load is one of its significant advantages over other non-destructive test (NDT) technologies. With knowledge of the signal velocity, the location
of the AE source can be identified. For this particular investigation, efforts were made to identify the defect location (AE source location) in real time. This was accomplished by determining the wave velocity on the bearing ring experimentally. At a threshold of 52 dB and with known distances between the AE sensors, the velocity of the AE waveform under these conditions was experimentally determined to be 4,000 m/s. This velocity was used for all source location investigations, and prior to the onset of testing several lead breaks were made at various positions on the surface to establish the accuracy at this velocity and threshold level; results were within 4% of the exact geometric location of the lead break. Figure 11 shows the source location layout used, which essentially 'unwrapped' the bearing race for linear location. AE waves travelling through a medium are attenuated and arrive at the different sensors with a certain time delay; this delay can be attributed to the distance between the source and the AE sensors. The source location estimation employed in the bearing test provided a simple and rapid means to identify and locate the crack initiation and its propagation to a surface spall. The source location over the duration of this test is presented in figures 12 and 13; the region where the surface damage occurred is highlighted by the 'box' section. The location plot shows cumulative energy over the test. It is worth noting that only AE events above a threshold of 52 dB contribute to the source location: whenever the threshold is exceeded, the location of the source is computed and the AE energy is assigned to that geometric position. This is a cumulative process, so a fixed source will have the largest contributory energy in a cumulative plot. Evident from these figures is that, at the start of the test (figure 12), AE activity was distributed across a very broad circumferential position on the bearing ring. A maximum energy value of 10 × 10⁶ atto-Joules was noted at 18 hrs of operation and, just as importantly, the AE energy was concentrated over the defect region, 20 mm from AE channel 1.
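The linear (one-dimensional) location principle described here can be sketched as follows; this is an illustrative two-sensor example assuming the 4,000 m/s wave velocity quoted in the text, not the acquisition software's algorithm:

```python
def linear_source_location(x1, x2, dt, velocity=4000.0):
    """Estimate the AE source position between two sensors at x1 and x2 (m)
    from the arrival-time difference dt = t1 - t2 (s). A negative dt means
    the wave reached sensor 1 first, i.e. the source lies closer to x1."""
    midpoint = 0.5 * (x1 + x2)
    return midpoint + 0.5 * velocity * dt

# Example: sensors 0.20 m apart on the 'unwrapped' race; the burst arrives
# 25 microseconds earlier at sensor 1 than at sensor 2.
print(linear_source_location(0.0, 0.20, dt=-25e-6))   # ~0.05 m, i.e. nearer sensor 1
```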
Figure 11 Source location layout for linear detection
Figure 12 Source location estimates of AE events at 4-hrs operation
Figure 13 Source location estimates of AE events at 18-hrs operation
5 CONCLUSION
Investigations into the application of the Acoustic Emission technique to condition monitoring of slow speed shafts and bearings have been presented. It can be concluded that AE parameters, such as energy, have been validated as reliable, robust and sensitive for the detection of crack propagation and its location in slow speed rotating bearings, and of crack propagation and rubbing between cracked surfaces in slow speed shafts. At the rotational speeds at which these tests were performed, this is the first known attempt at correlating AE with natural defect generation in slow speed rotating bearings and shafts.
6 REFERENCES
1 Jamaludin N, Mba D, and Bannister RH. (2001) Condition monitoring of slow-speed rolling element bearings using stress waves. Proceedings of the IMechE, Part E: Journal of Process Mechanical Engineering, 215(4), 245-271.
2 ISO 22096. (2007) Condition monitoring and diagnostics of machines – Acoustic emission.
3 Sikorska JZ, and Mba D. (2008) Truth, lies, acoustic emission and process machines. Proceedings of the IMechE, Part E: Journal of Process Mechanical Engineering, 222(1), 1-19.
4 Mba D, and Rao RBKN. (2006) Development of acoustic emission technology for condition monitoring and diagnosis of rotating machines: bearings, pumps, gearboxes, engines, and rotating structures. The Shock and Vibration Digest, 38, 3-16.
5 Miettinen J, and Pataniitty P. (1999) Acoustic emission in monitoring extremely slowly rotating rolling bearing. Proceedings of the 12th International Congress on Condition Monitoring and Diagnostic Engineering Management, COMADEM 99, England.
6 Morhain A, and Mba D. (2003) Bearing defect diagnosis and acoustic emission. Proceedings of the Institution of Mechanical Engineers, Part J: Journal of Engineering Tribology, 217(4), 257-272.
7 Al-Ghamd AM, and Mba D. (2006) A comparative experimental study on the use of acoustic emission and vibration analysis for bearing defect identification and estimation of defect size. Mechanical Systems and Signal Processing, 20(7), 1537-1571.
8 Choudhury A, and Tandon N. (2000) Application of acoustic emission technique for the detection of defects in rolling element bearings. Tribology International, 33(1), 39-45.
9 Elforjani M, and Mba D. (2008) Observations and location of acoustic emissions for a naturally degrading rolling element thrust bearing. Journal of Failure Analysis and Prevention, Springer, ISSN 1547-7029 (Print) 1864-1245 (Online), May 2008.
10 Elforjani M, and Mba D. (2008) Monitoring the onset and propagation of natural degradation process in a slow speed rolling element bearing with acoustic emission. Journal of Vibration and Acoustics, 130(4), 041013.
11 Harris TA. (2001) Rolling Bearing Analysis, 4th edition. John Wiley & Sons, New York, USA.
12 Palmgren A. (1959) Ball and Roller Bearing Engineering, 3rd edition. SKF Industries, S. H. Burbank & Co., Philadelphia, USA.
13 Voskamp AP. (1985) Material response to rolling contact loading. Transactions of the ASME, Journal of Tribology, 107(3), July 1985.
MEASUREMENT OF GAS CONTENT IN TWO-PHASE FLOW WITH ACOUSTIC EMISSION
A Addali, S Al-lababidi, H Yeung, D Mba
School of Engineering, Cranfield University, MK43 0AL, UK
ABSTRACT The two-phase liquid/gas slug flow regime can be encountered over a range of gas and liquid flow rates. Monitoring of slugs and measurement of their characteristics, such as the gas void fraction, are necessary to minimise the disruption of downstream process facilities. This paper presents experimental results correlating Acoustic Emission measurements with Gas Void Fraction (GVF) in a two-phase water/air flow regime. It is concluded that measurements of Acoustic Emission offer a complementary means of measuring the GVF non-intrusively. Keywords: Acoustic Emission, slug flow, two-phase flow, gas void fraction
1. SLUG FLOW MECHANISM AND ACOUSTIC EMISSION
In the oil and gas production process, the multiphase slug flow regime is normally encountered over a range of pipe inclinations and over a wide range of gas and liquid flow rates. Slug flow is characterised by a complex dynamic structure, which consists of aerated slugs of liquid that travel down the pipeline at the local gas velocity. The mechanism of slug initiation has been experimentally investigated by many authors [1-8]. The idealised picture of a "stable slug" flow in a horizontal pipe is presented in figure 1. The section 'F' represents the front region of the liquid slug body (LSB) and section 'T' represents the tail region of the liquid slug body. In a stable slug flow, the liquid slug body length, LLSB, and the elongated bubble (EB) length, LEB, remain essentially constant in the downstream direction.
Figure 1 Schematic description of an idealised developed slug flow
The gas in the elongated bubble moves at a velocity, VGEB, which is faster than the average mixture velocity, Vmix, in the liquid slug body. As a result, the liquid is shed from the back of the liquid slug body to form the liquid film layer along the elongated bubble. The liquid in the film at the EB nose may be aerated. Also, bubbles in the liquid slug body coalesce with the elongated bubble interface and are gradually absorbed, which in the fullness of time results in the liquid film becoming un-aerated. The mixture velocity is the sum of the liquid velocity and the gas velocity (VSL + VSG).
At the same time, gas bubbles are fragmented from the tail of the elongated bubble and re-entrained into the front section of the liquid slug body 'F' at a defined rate, ΦGE. The fragmentation of the elongated bubble tail and the entrainment of the bubbles into the front section of the liquid slug body are due to the dispersing forces induced by the flow of the liquid film as it plunges into the liquid slug front. From this description of slug flow formation, dispersed gas bubbles can be generated in the liquid slug body region through bubble formation, coalescence, breakage and collapse processes. The entrained gas bubbles in the liquid experience a transient pressure as they move through the hydrodynamic pressure field of the liquid. The transient pressure causes the gas bubbles to oscillate at their natural frequencies, one consequence of which is the generation of sound [9].
1.1 Acoustic Emission technology
Acoustic emission (AE) is defined as transient elastic waves generated from a rapid release of strain energy caused by a deformation or damage within or on the surface of a material [10]. As a non-destructive testing tool, AE has been successfully applied in a range of industries [11-13]. In addition, the AE technology has gained considerable recognition as a complementary tool for machine condition monitoring [14, 15]. The earliest known reference relating the sound emitted from two-phase gas/liquid flow to the presence of air bubbles was published by Bragg [16] in the early 1920s. Thereafter, numerous investigations of the sound emitted from two-phase flow have focused on understanding bubble characteristics, including bubble size and shape [17-20]. Such studies on the dynamics of bubble behaviour have shown that bubble formation, coalescence and/or division result in bubble oscillations at the bubble resonant frequencies, which depend on the radius of the bubble and the mode of excitation. In two-phase gas/liquid flow regimes, gas bubbles entrained in the liquid will generate sound pressures when excited by external pressure fluctuations such as those experienced within the slug. Such excitation, in addition to the formation, coalescence, breakage and collapse of bubbles in the slug (as described earlier), will result in volumetric bubble oscillations at the various modes of the bubble. All such pressure pulsations, depending on their magnitude, will excite a broad frequency range extending into the Acoustic Emission spectrum. It is also known that the collapse of a bubble results in a release of AE energy [21]. The objective of this study was to develop a correlation to predict the gas void fraction in two-phase air/water slug flow as a function of the absolute AE energy and slug velocity.
1.2 Conductivity measurement technique
Several techniques are commercially available for measuring gas void fractions in two-phase and multiphase flows; however, the measurement is always accompanied by a degree of difficulty owing to the natural complexity of the two-phase mixture. The separation method is the most traditional method used for void fraction measurement, where an expensive separator facility is employed to physically separate and then measure the two-phase flow components. Gamma-ray attenuation and electrical impedance are the most generally adopted techniques in this area; alternative techniques include microwave transducers, X-ray and gamma-ray tomography, magnetic resonance imaging and nuclear magnetic resonance [22]. The basic principle of electrical impedance methods for component fraction (void fraction) measurement is to measure the electrical impedance or the dielectric constant of the mixed flow. The mixed flow is characterised as an electrical conductor, and by applying a well calibrated relationship between the void fraction and the conductivity and permittivity of the oil, gas and water components, the void fraction of these components can be determined. Generally, measurements of the electrical impedance are carried out across the pipe diameter (using e.g. contact or non-contact electrodes). Several methods are described in the literature, such as arc electrodes [23], ring electrodes [24], helical electrodes [25], and rotating field electrodes [26]. The fast response of this method makes it possible to employ it for measurements during both steady state and transient situations. However, it suffers from two significant limitations: it cannot be used at high gas fraction ranges and it is flow regime dependent. Also, due to the nature of its online installation, it is considered an invasive measurement technique that requires special arrangements to install, unlike the proposed passive AE technique, which is flow regime independent and non-invasive.
2. EXPERIMENTAL SETUP AND PROCEDURE
A purpose-built experimental facility that can simulate two-phase flow was employed, see figure 2. The majority of the piping system was made from ABS (class E) pipe; however, two Perspex sections were installed to allow visual observation of the flow. Measurements of GVF were undertaken with a conductivity probe section followed by a stainless steel pipe of 750 mm length and 8 mm thickness, thereby allowing a direct correlation between the AE measured from the stainless steel pipe and the measured GVF. The flow loop pipeline was of sufficient length to allow the formation of fully developed slugs. Water was supplied to the flow loop using a centrifugal pump with a maximum capacity of 40 m³/hr and a maximum discharge pressure of 5 barg. The water flow was metered using an electromagnetic flow meter with a 0–20 m³/h range. Air was injected into the liquid flow through a 0.5 inch (13 mm) pipe fitted with an airflow meter.
The experiments covered a range of superficial water velocities (VSL) from 0.3 ms⁻¹ to 1.2 ms⁻¹ at increments of 0.1 ms⁻¹, and superficial air velocities (VSG) of 0.2 to 1.4 ms⁻¹ at increments of 0.2 ms⁻¹ at each constant VSL. The VSL and VSG values were achieved by throttling valves downstream of the flow meters, and every test condition was maintained for 120 seconds, during which period data were acquired. Comparisons between the conductivity sensor measurement and AE were undertaken for every test condition.

Figure 2 2-inch air/water horizontal flow test facility
The conductivity probe measuring section is a 0.5 m long Perspex pipe of 50 mm inner diameter, equipped with two pairs of flush-mounted ring-electrode conductivity probes, as seen in figure 3; only one ring pair is utilized in this investigation. The probes discussed here are of the twin-ring electrode type. They consist of two stainless steel ring electrodes with a width (Sp) of 3.7 mm, spaced 17 mm (= De) apart, as shown in figure 4.
Figure 3 2-inch test section (AE sensor and preamplifier) and conductivity sensor
Figure 4 Scheme of flush-mounted stainless steel conductivity ring electrodes
An electronic circuit is used to measure the electrical impedance between the electrodes. Probes based on this technique have been used by Andritsos et al. [27] and Fossa et al. [28]. Such probes can be operated either in conductance (lower AC frequency) or capacitance (very high AC frequency) mode. Labview software and an AT-M10 data acquisition (DAQ) card from National Instruments were used to acquire and store the data continuously to the computer hard disk. Calibration of the probe was performed by connecting the electrode pairs to the Conductivity Electronic Box device, which supplies a 7 kHz AC carrier signal. The aspect ratios of the probe, De/d = 0.34 and Sp/d = 0.074, were chosen based on the design recommendations of Fossa [29], where d was equal to 0.05 m. The gas-liquid phase distribution was achieved by introducing known liquid volumes into the horizontally positioned test pipe. Tap water was used and great care was taken to check the inclination of the pipe at each measurement. A total of 48 measurements were performed in order to cover the liquid fraction range 0 to 1. At each measurement, both the weight of the water, excluding the weight of the conductivity ring, and the corresponding voltage value were recorded; as a result, a calibration of the probe was obtained. The correlated liquid holdup (E) as a function of the normalised output voltage G (set to one when the pipe was full and zero when the pipe was empty) is given as:
E = 1.489 G^4 - 1.475 G^3 + 0.368 G^2 + 0.623 G        (1)
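As a rough illustration only, the sketch below evaluates the calibration polynomial (with the signs read so that E is approximately 1 when G = 1) and shows how an equivalent zero-intercept least-squares fit could be obtained from the 48 weighed-water/voltage calibration pairs; the function names and the fitting routine are illustrative, not the authors' implementation.

import numpy as np

def liquid_holdup(g_norm):
    """Equation (1): liquid holdup E from the normalised probe output voltage G
    (G = 1 for a full pipe, G = 0 for an empty pipe)."""
    return 1.489 * g_norm**4 - 1.475 * g_norm**3 + 0.368 * g_norm**2 + 0.623 * g_norm

def fit_calibration(g_samples, e_samples):
    """Fit E = c4*G^4 + c3*G^3 + c2*G^2 + c1*G to measured calibration points
    by zero-intercept least squares (an illustrative stand-in for the calibration fit)."""
    X = np.column_stack([g_samples**4, g_samples**3, g_samples**2, g_samples])
    coeffs, *_ = np.linalg.lstsq(X, e_samples, rcond=None)
    return coeffs  # [c4, c3, c2, c1]

print(liquid_holdup(1.0))   # ~1.005, i.e. essentially a full pipe
print(liquid_holdup(0.5))   # holdup at a mid-range normalised voltage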
A sample trace collected from the conductivity probe under slug flow conditions is presented in Figure 5.
Figure 5   Slug flow trace from the conductivity probe (liquid holdup versus time, showing the slug body and film regions)

A commercially available AE acquisition system was used to acquire the data from a Pico-type AE sensor with a broadband operating frequency range of 150-750 kHz. The AE sensor was non-invasively mounted with industrial superglue onto the stainless steel pipe, as shown in Figure 3; this placement of the AE sensor had previously been investigated and justified [30]. Sensor sensitivity was evaluated using the pencil lead fracture (PLF) technique. The output AE signals from the sensor were pre-amplified at 60 dB, and the AE absolute energy parameter (joules) was recorded at a 10 ms sample rate. The absolute energy is a measure of the true energy and is derived from the integral of the squared voltage signal (raw signal) divided by the reference resistance (10 kΩ) over the duration of the AE signal. A typical AE energy signal collected from the AE sensor under slug flow conditions is presented in Figure 6.
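Because the absolute energy is simply the time integral of the squared raw voltage divided by the 10 kΩ reference resistance, it can be reproduced numerically as in the sketch below; the sampling rate and the synthetic decaying burst are assumptions for illustration, not parameters of the experiment.

import numpy as np

def absolute_ae_energy(voltage, fs, ref_resistance=10e3):
    """Absolute AE energy: integral of the squared raw voltage signal divided by
    the reference resistance (10 kOhm) over the duration of the signal."""
    energy_joules = np.trapz(voltage**2, dx=1.0 / fs) / ref_resistance
    return energy_joules * 1e18  # expressed in attojoules (1e-18 J), as in Figure 6

# Example: a 10 ms window of an assumed 1 MHz-sampled, decaying 300 kHz burst
fs = 1e6
t = np.arange(0, 0.010, 1 / fs)
v = 0.05 * np.exp(-t / 0.002) * np.sin(2 * np.pi * 300e3 * t)
print(absolute_ae_energy(v, fs))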
Figure 6   Typical AE signal from slug flow (absolute AE energy in attojoules versus time, showing the front region, slug body and film region)
3.   RESULTS AND DISCUSSION

3.1   Gas Bubbles in Slug Body
Adamson [31] stated that the surface free energy per unit interfacial area is equal to the interfacial surface tension between a liquid phase and a gas phase. Assuming the gas bubbles are all spherical with diameter d_bubble, the total surface free energy of the discrete gas bubbles (E_surface) in the liquid slug body was proposed by Brauner et al. [32] as:
E_surface = (6σ / d_bubble) A (1 - E_LSB) L_LSB        (2)
where σ is the interfacial surface tension, A is the internal cross-sectional area of the pipe, L_LSB is the length of the liquid slug body, and E_LSB is the liquid hold-up in the slug body. From Barnea et al. [33], the critical bubble diameter d_bubble is given as:
d_bubble = 2 [0.4σ / ((ρ_L - ρ_G) g)]^(1/2)        (3)
where ρ_L and ρ_G are the liquid and gas densities respectively, and g is the gravitational acceleration. The slug length L_LSB was calculated as a function of the pipe diameter D [3] as:
L_LSB = 32 D        (4)
Values employed in estimating E_surface were a slug length of 1.6 m and a bubble diameter of 0.003451 m. Zhang et al. [34] assumed that the surface free energy of the discrete gas bubbles, based on the maximum amount of gas the liquid slug can hold, is proportional to the turbulent kinetic energy in the liquid slug body. This assumption is used in this paper to relate AE to surface free energy. The gas void fraction measured using the conductivity sensor is used to calculate the surface free energy under air/water slug flow conditions as per equation (2), and the result is plotted in Figure 7. As expected, the relationship between the surface free energy and the gas void fraction is linear for the given slug length, bubble diameter and interfacial surface tension: the surface free energy is proportional to the amount of gas bubbles held in the slug body.
Figure 7   Surface free energy (J) versus gas void fraction measured by the conductivity sensor in the slug body
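A minimal sketch of the surface-free-energy calculation of equations (2)-(4) is shown below; the air/water properties (surface tension and densities) are assumed ambient-condition textbook values, since they are not quoted in the paper, and the computed bubble diameter and slug length reproduce the values listed above.

import numpy as np

# Assumed fluid properties for air/water at ambient conditions (not stated in the paper)
SIGMA = 0.0728              # N/m, air-water interfacial surface tension
RHO_L, RHO_G = 997.0, 1.2   # kg/m^3, liquid and gas densities
G = 9.81                    # m/s^2, gravitational acceleration
D = 0.05                    # m, pipe inner diameter (2-inch test section)

d_bubble = 2.0 * np.sqrt(0.4 * SIGMA / ((RHO_L - RHO_G) * G))   # equation (3)
L_LSB = 32.0 * D                                                 # equation (4)
A = np.pi * D**2 / 4.0                                           # pipe cross-sectional area

def surface_free_energy(gvf):
    """Equation (2): E_surface = (6*sigma/d_bubble) * A * (1 - E_LSB) * L_LSB,
    where gvf = 1 - E_LSB is the slug-body gas void fraction from the conductivity probe."""
    return (6.0 * SIGMA / d_bubble) * A * gvf * L_LSB

print(d_bubble, L_LSB)           # ~0.00345 m and 1.6 m, as quoted in the text
print(surface_free_energy(0.4))  # E_surface in joules for a 40% slug-body void fraction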
At a fixed superficial water velocity, for example VSL = 0.8 m/s, increasing the superficial gas velocity resulted in an increase of the measured absolute AE energy; see Figure 8. This was observed for all VSL levels investigated. This was not surprising and showed that an increase in bubble content, and its associated bubble dynamics, resulted in an increase of the AE generated. It was interesting to note that an increase in VSL at a fixed VSG resulted in a relative decrease in surface free energy whilst a simultaneous increase in AE energy was observed; Figure 9 describes the relationship between the AE energy measured by the AE sensor and the surface free energy calculated from equation (2). This suggests that there are two mechanisms responsible for the generation of AE. When the water superficial velocity increases, the intensity of turbulent diffusion increases and, as a result, the associated absolute AE energy increases, as illustrated in Figure 9. Similarly, an increase in water superficial velocity reduces the GVF for a given superficial air velocity, and this GVF is directly correlated to the surface free energy. Therefore, the authors believe that two processes influence the generation of AE: the surface free energy, which is a measure of the bubble content in the liquid (air velocity), and turbulent diffusion (turbulent kinetic energy) due to high superficial liquid velocities. Figure 8 illustrates and confirms this interplay between liquid and air velocities and the associated absolute AE energy.
Figure 8   Contribution of the liquid and air velocities to the increase of the measured absolute acoustic energy (absolute AE energy, 10^-18 J, plotted against VSL and VSG)
Figure 9   Contribution of the turbulent kinetic energy to the increase of the absolute acoustic energy (absolute AE energy, 10^-18 J, versus surface free energy for VSL = 0.6-1.2 m/s)

3.2   Acoustic Gas Void Fraction Correlation in Slug Body
Figure 10 presents the gas void fraction measured by the conductivity sensor against the associated absolute AE energy. The data suggest a non-linear relationship and provided the basis for establishing the GVF as a function of the AE energy. A multiple exponential regression resulted in the following relationship:
ε = a AE^b Vmix^c VSG^d        (5)
where ε is the gas void fraction in the slug body, a = 0.768, b = 0.003, c = -0.690 and d = 0.744. Figure 11 shows the gas void fraction in the liquid slug body obtained from the developed correlation (5) as a function of absolute AE energy and slug velocities. The predictions of the proposed model, equation (5), were compared with the model of Beggs et al. [35], and a variation of 9% was obtained.
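A minimal sketch of how correlation (5) would be applied is given below; the mixture velocity is taken as Vmix = VSL + VSG, which is a common definition for slug flow but is an assumption here since the paper does not define it explicitly, and the numerical inputs are illustrative only.

# Coefficients of the proposed AE-GVF correlation, equation (5)
A_, B_, C_, D_ = 0.768, 0.003, -0.690, 0.744

def gvf_from_ae(ae_abs_energy, v_sl, v_sg):
    """Equation (5): slug-body gas void fraction from the absolute AE energy
    (in units of 1e-18 J) and the slug velocities.
    Vmix is taken here as VSL + VSG (assumption, not stated in the paper)."""
    v_mix = v_sl + v_sg
    return A_ * ae_abs_energy**B_ * v_mix**C_ * v_sg**D_

# Example: VSL = 0.8 m/s, VSG = 0.6 m/s, measured absolute AE energy of 100e-18 J
print(gvf_from_ae(100.0, 0.8, 0.6))   # predicted slug-body gas void fraction (~0.4)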
Figure 10   Absolute AE energy (10^-18 J) versus gas void fraction measured by the conductivity sensor, with power-law fits for each superficial water velocity (VSL = 0.3-1.2 m/s)
Figure 11   Gas void fraction predicted from the proposed AE model, equation (5), versus mixture velocity for VSL = 0.3-1.2 m/s
4.   CONCLUSIONS
The applicability of AE technology for measuring the gas void fraction has been demonstrated. A correlation was developed as a function of the absolute AE energy and slug velocities for the range of liquid and gas velocities investigated. In order to validate the applicability of the model for other flow and physical conditions, further experimental investigations are required. Work currently being undertaken investigates GVF prediction for varying liquid viscosity, pipeline orientation, and pipe surface roughness.
5.   References

1. Kordyban, E. S. and Ranov, T., 1970. Mechanism of slug formation in horizontal two-phase flow. Journal of Basic Engineering, 92, 857-864.
2. Wallis, G. B. and Dobson, J. E., 1973. The onset of slugging in horizontal stratified air-water flow. International Journal of Multiphase Flow, 1(1), 173-193.
3. Taitel, Y. and Dukler, A. E., 1976. A model for predicting flow regime transitions in horizontal and near horizontal gas-liquid flow. AIChE Journal, 22(1), 47-55.
4. Mishima, K. and Ishii, M., 1980. Theoretical prediction of onset of horizontal slug flow. Journal of Fluids Engineering, Transactions of the ASME, 102, 441-445.
5. Nydal, O. J., Pintus, S. and Andreussi, P., 1992. Statistical characterisation of slug flow in horizontal pipes. International Journal of Multiphase Flow, 18(3), 439-453.
6. Barnea, D. and Taitel, Y., 1993. A model for slug length distribution in gas-liquid slug flow. International Journal of Multiphase Flow, 19(5), 829-838.
7. Fan, Z., Lusseyran, F. and Hanratty, T. J., 1993. Initiation of slugs in horizontal gas-liquid flows. AIChE Journal, 39, 1741-1753.
8. Hale, C. P., 2000. Slug Formation, Growth and Decay in Gas-Liquid Flow. PhD Thesis, Imperial College, London, UK.
9. Strasberg, M., 1956. Gas bubbles as sources of sound in liquids. The Journal of the Acoustical Society of America, 28(1).
10. ISO 22096, 2007. Condition Monitoring and Diagnostics of Machines - Acoustic Emission.
11. Pao, Y.-H., Gajewski, R. R. and Ceranoglu, A. N., 1979. Acoustic emission and transient waves in an elastic plate. Journal of the Acoustical Society of America, 65(1), 96-102.
12. Pollock, A. A., 1989. Acoustic Emission Inspection. Physical Acoustics Corporation, Technical Report TR-103-96-12/89.
13. Mathews, J. R., 1983. Acoustic Emission. Gordon and Breach Science Publishers Inc., New York. ISSN 0730-7152.
14. Mba, D. and Rao, R. B. K. N., 2006. Development of acoustic emission technology for condition monitoring and diagnosis of rotating machines: bearings, pumps, gearboxes, engines, and rotating structures. Shock and Vibration Digest, 38(1), 3-16.
15. Sikorska, J. Z. and Mba, D., 2008. Truth, lies, acoustic emission and process machines. Proceedings of the IMechE, Part E: Journal of Process Mechanical Engineering, 222(1), 1-19.
16. Bragg, Sir W. H., 1921. The World of Sound. Bell, London, pp. 69-74.
17. Minnaert, M., 1933. On musical air-bubbles and the sounds of running water. Philosophical Magazine, 16, 235-248.
18. Leighton, T. G., 1994. The Acoustic Bubble. Academic Press, London.
19. Al-Masry, W. A., Ali, E. M. and Aqeel, Y. M., 2005. Determination of bubble characteristics in bubble columns using statistical analysis of acoustic sound measurements. IChemE, 83(A10), 1196-1207.
20. Manasseh, R., 1997. Acoustic sizing of bubbles at moderate to high bubbling rates. 4th World Conference on Experimental Heat Transfer, Fluid Mechanics and Thermodynamics, Brussels, Belgium, pp. 943-947.
21. Derakhshan, O., Houghton, J. R. and Jones, R. K., 1989. Cavitation monitoring of hydroturbines with RMS acoustic emission measurements. World Meeting on Acoustic Emission, pp. 305-315.
22. Crowe, C. T., 2005. Multiphase Flow Handbook. Taylor and Francis.
23. Xie, C. G., Stott, A. L., Plaskowski, A. and Beck, M. S., 1990. Design of capacitance electrodes for concentration measurement of two-phase flow. Measurement Science and Technology, 65-78.
24. Andreussi, P., Di Donfrancesco, A. and Messia, M., 1988. An impedance method for the measurement of liquid holdup in two-phase flow. International Journal of Multiphase Flow, 14, 777-787.
25. Abouelwafa, M. S. A. and Kendall, E. J. M., 1980. The measurement of component ratios in multiphase systems using gamma ray attenuation. Journal of Physics E: Scientific Instruments, 13, 341-345.
26. Merilo, M., Dechene, R. L. and Cichowlas, W. M., 1977. Void fraction measurement with a rotating electric field conductance gauge. ASME Journal of Heat Transfer, 99, 330-332.
27. Andreussi, P. and Bendiksen, K. H., 1989. An investigation of void fraction in liquid slugs for horizontal and inclined gas-liquid pipe flow. International Journal of Multiphase Flow, 15(6), 937-946.
28. Fossa, M., Guglielmini, G. and Marchitto, A., 2003. Intermittent flow parameters from void fraction analysis. Flow Measurement and Instrumentation, 14, 161-168.
29. Fossa, M., 1998. Design and performance of a conductance probe for measuring the liquid fraction in two-phase gas-liquid flows. Flow Measurement and Instrumentation, 9, 103-109.
30. Addali, A., Al-Lababidi, S., Mba, D. and Yeung, H., 2008. Observations of acoustic emission in two-phase flow. 22nd International Congress and Exhibition on Condition Monitoring and Diagnostic Engineering Management (COMADEM 2008), Prague, Czech Republic, 11-13 June 2008, pp. 3-10. ISBN 978-80-254-2276-2.
31. Adamson, A. W., 1990. Physical Chemistry of Surfaces, fifth ed. John Wiley and Sons Inc.
32. Brauner, N. and Ullmann, A., 2004. Modelling of gas entrainment from Taylor bubbles. Part B: A stationary bubble. International Journal of Multiphase Flow, 30, 273-290.
33. Barnea, D., Shoham, O. and Taitel, Y., 1982. Flow pattern transition for vertical downward two phase flow. Chemical Engineering Science, 37, 741-744.
34. Zhang, H.-Q., Wang, Q., Sarica, C. and Brill, J. P., 2003. Unified model for gas-liquid pipe flow via slug dynamics - Part 2: Model validation. ASME Journal of Energy Resources Technology, 125, 266-273.
35. Beggs, H. D. and Brill, J. P., 1973. A study of two-phase flow in inclined pipes. Trans. AIME, 255, 607.
Author Index