2151_C000.fm Page i Thursday, October 26, 2006 1:39 PM
Stream of Variation Modeling and Analysis for Multistage Manufacturing Processes
Stream of Variation Modeling and Analysis for Multistage Manufacturing Processes
Jianjun Shi
University of Michigan, Michigan
Boca Raton London New York
CRC Press is an imprint of the Taylor & Francis Group, an informa business
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2007 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number-10: 0-8493-2151-4 (Hardcover)
International Standard Book Number-13: 978-0-8493-2151-1 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
Shi, Jianjun, 1963-
Stream of variation modeling and analysis for multistage manufacturing processes / Jianjun Shi.
p. cm.
Includes bibliographical references and index.
ISBN 0-8493-2151-4
1. Manufacturing processes--Statistical methods. I. Title.
TS183.S524 2006
658.5--dc22 2006049164

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com
Preface
The aim of this book is to summarize major achievements in stream of variation (SoV) methodology research and its implementation in various manufacturing processes. The SoV methodology focuses on developing and implementing a unified, systematic, and generic methodology for variation management and reduction in multistage manufacturing processes (MMPs). An MMP is a complex manufacturing system with the following common characteristics: (1) multiple operations or stations are involved in producing a product, (2) the product quality can be quantitatively characterized by a set of features or attributes, and (3) product quality deviations and variations arise from errors generated at the current station as well as from accumulated errors transmitted from previous stations. The major challenges of variation reduction in an MMP are to model, analyze, control, and reduce the process-induced variation in final products. In an MMP, the output-workpiece deviation of one station is the input-workpiece deviation of the next station. Each station adds to the part variation: design-inherent process variation due to tooling tolerance when no fault occurs, and special, assignable process variation due to tooling errors when a fault occurs. The final product variation is an accumulation of variations from all stations. Because of the complexity of an MMP, variation propagation depends on both product design and process design. The variation stack-up is normally not simply additive and can be nonlinear.
This book addresses those fundamental issues by developing a unified methodology, which includes: (1) modeling variation propagation in an MMP using a state space model; (2) expanding the concepts of system and control theory for variation management through design synthesis and optimization, to achieve an affordable minimum variability at the design stage; (3) developing statistical methods driven by engineering models for root cause identification and quick failure recovery in the manufacturing stage; and (4) integrating process tool degradation, reliability models, and product quality information for prognostics and defect prevention, to maintain the minimum variability throughout the manufacturing lifetime. The book can be used as a major reference or textbook for researchers, engineers, and students who are interested in manufacturing design and analysis, manufacturing process control, advanced quality control, industrial statistics, or the general areas of manufacturing process modeling, monitoring, diagnostics, and control.
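As a rough illustration of the stage-to-stage accumulation described above, the following minimal sketch propagates the covariance of a part deviation state through a chain of stations. It assumes a generic linear state space form, x_k = A_k x_{k-1} + B_k u_k + w_k, with mutually independent terms. The symbols A_k, B_k, u_k, and w_k, the function name propagate_covariance, and all numerical values are illustrative assumptions; they are not taken from the models derived later in the book.

```python
import numpy as np


def propagate_covariance(A, B, Sigma_u, Sigma_w, Sigma_x0):
    """Return the state covariance after each station.

    Assumed (hypothetical) model per station k:
        x_k = A_k x_{k-1} + B_k u_k + w_k
    where x_{k-1}, u_k, and w_k are mutually independent, so that
        Cov(x_k) = A_k Cov(x_{k-1}) A_k^T + B_k Cov(u_k) B_k^T + Cov(w_k).
    """
    Sigma_x = np.asarray(Sigma_x0, dtype=float)
    history = []
    for A_k, B_k, Su_k, Sw_k in zip(A, B, Sigma_u, Sigma_w):
        # Incoming variation is transformed by A_k; the station adds its own.
        Sigma_x = A_k @ Sigma_x @ A_k.T + B_k @ Su_k @ B_k.T + Sw_k
        history.append(Sigma_x)
    return history


# Illustrative three-station line with a two-dimensional deviation state.
n_stations = 3
A = [np.array([[1.0, 0.5], [0.0, 1.0]])] * n_stations  # couples state 2 into state 1
B = [np.eye(2)] * n_stations                           # tooling errors enter directly
Sigma_u = [0.01 * np.eye(2)] * n_stations              # tooling error covariance
Sigma_w = [np.zeros((2, 2))] * n_stations              # no unmodeled noise here
history = propagate_covariance(A, B, Sigma_u, Sigma_w, np.zeros((2, 2)))

for k, Sigma_k in enumerate(history, start=1):
    print(f"station {k}: state variances = {np.diag(Sigma_k)}")
```

In this toy setting each station contributes a variance of 0.01 to each state, yet the first state ends with a variance of 0.0425 after three stations instead of 0.03, because the off-diagonal term in A_k carries upstream variation of the second state into the first. This is one simple way in which a stack-up fails to be plainly additive, even in a linear model.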
OUTLINE OF THE BOOK
The book is organized into 5 parts and 18 chapters. The five parts are summarized below:
• Part I (Chapter 2 to Chapter 5) provides a review of the basics of matrix theory and multivariate statistics, which will be used in later parts of the SoV study. Topics include basics of matrix theory (Chapter 2), multivariate statistics (Chapter 3), statistical inference (Chapter 4), and principal component analysis and factor analysis (Chapter 5).
• Part II (Chapter 6 to Chapter 8) discusses variation propagation modeling issues, including SoV modeling based on first principles (engineering knowledge) for assembly processes (Chapter 6) and machining processes (Chapter 7), and SoV modeling based on observational data (Chapter 8).
• Part III addresses issues in variation source diagnosis. It starts with a diagnosability study (Chapter 9), followed by two classes of diagnostic methods: pattern matching methods (Chapter 10) and estimation-based methods (Chapter 11).
• Part IV investigates issues related to design for variation reduction, including optimal sensor placement and distribution (Chapter 12), design evaluation and process capability analysis (Chapter 13), optimal fixture layout (Chapter 14), and process-oriented tolerance synthesis (Chapter 15).
• The last part introduces additional SoV-related topics, including QR chain modeling and analysis (Chapter 16), quality-ensured maintenance (Chapter 17), and a review of other SoV research achievements (Chapter 18).
In addition to serving as a major reference book for researchers and engineers, this book can also be used as a textbook for senior undergraduate students or graduate students in advanced quality control courses or manufacturing design courses. The book can be organized into three different tracks for a one-semester course according to students’ backgrounds and interests:
1. Data-driven track (fits industrial engineering–style quality and manufacturing for senior undergraduate and junior graduate students): includes Chapter 2 to Chapter 5 and Chapter 8 to Chapter 11.
2. Physics-model-based track (fits mechanical engineering–style manufacturing process control for senior undergraduate and junior graduate students): includes Chapter 2, Chapter 3, Chapter 5, Chapter 6 or Chapter 7, Chapter 10, Chapter 14, and Chapter 15.
3. Advanced track (senior graduate and Ph.D. students): includes Chapter 6 (or Chapter 7) and any chapter from Chapter 8 to Chapter 17.
Acknowledgments
I have been fortunate to have had three advisors in my graduate study: Prof. Jun Zhang (M.S. degree advisor in the Automatic Control Department, Beijing Institute of Technology), who introduced me to control and system theories; Prof. Zhifang Zhang (Ph.D. advisor in electrical engineering at Beijing Institute of Technology), who trained me in system identification and adaptive control; and Prof. S. M. Wu (Ph.D. advisor in mechanical engineering at the University of Michigan), who provided me with the opportunity to study at the University of Michigan and led me into the area of manufacturing science and engineering. Many of the research ideas and achievements described in this book originated from their guidance and training. I am indebted to many of my former and current students with whom I have collaborated, or am still collaborating, in regard to the technical ideas described in this book. The individuals include, but are not limited to, D. Ceglarek, D. Khorzad, B. W. Shiu, D. W. Apley, J. Jin, Y. Ding, S. Zhou, Y. Chen, Q. Huang, F. Tsung, Q. Rong, P. Chaipradubkiat, J. Li, J. Liu, L. E. Izquierdo, J. Zhong, R. Jin, and H. Zheng. I am also indebted to other friends and collaborators, including S. J. Hu, J. Ni, Y. Koren, G. Ulsoy, C. F. J. Wu, V. Nair, D. Djurdjanovic, J. Camelio, C. W. Wampler, and Z. Kong. The experience of working with them has been invaluable, and I have learned a great deal from them, which has benefited me both personally and professionally. My appreciation is also due to the many research associates and students, as well as managers and engineers in companies, who have directly contributed to the technology transfer and the implementation of the SoV methodologies in industrial practice. Special acknowledgement goes to three individuals who directly contributed to or helped write some of the chapters: Y. Ding for Chapter 6 and Chapter 11 to Chapter 14; S. Zhou for Chapter 7, Chapter 9, and Chapter 10; and Y. Chen for Chapter 15 to Chapter 17. They also reviewed the manuscript draft of the book and provided many invaluable comments for improvement. Without their strong support and contribution, this manuscript would not have been completed. I am also grateful to the many students, collaborators, more senior colleagues, and others who provided valuable critiques and feedback on the manuscript. I have been fortunate to have research support from many sources, including the National Science Foundation (NSF), NSF Engineering Research Center for Reconfigurable Manufacturing Systems, NIST-Advanced Technology Program, General Motors, Ford, DaimlerChrysler, Auto Body Consortium, and many other industrial companies. Their generous funding support made many of these SoV research initiatives feasible.
Finally, a very special note of appreciation is extended to my wife, Liping Luo, and my two daughters, Helen and Katherine, who have provided continuous support in the past years.
Author
Jianjun Shi is a professor in the Department of Industrial and Operations Engineering and a professor in the Department of Mechanical Engineering at the University of Michigan. He received his bachelor of science and master of science in electrical engineering at the Beijing Institute of Technology in 1984 and 1987, respectively, and his Ph.D. in mechanical engineering at the University of Michigan in 1992. His research interests focus on the fusion of advanced statistics and domain knowledge to develop methodologies for modeling, monitoring, diagnosis, and control of complex systems in a data-rich environment. He is one of the early pioneers in the field. He has guided 19 Ph.D. graduates and published more than 100 papers. He has also worked closely with industrial companies and has led various research projects funded by the National Science Foundation, NIST Advanced Technology Program, General Motors, DaimlerChrysler, Ford, Lockheed-Martin, Honeywell, Pfizer, and various other industrial companies and funding agencies. The technologies developed in his research group have been implemented in various production systems with significant economic impact. Professor Shi is the founding chairperson of the Quality, Statistics and Reliability (QSR) Subdivision of the Institute for Operations Research and the Management Sciences (INFORMS). He also serves as the director of the Laboratory for In-Process Quality Improvement Research (IPQI) at the University of Michigan. He is currently serving as the departmental editor of IIE Transactions on Quality and Reliability and as an associate editor of the International Journal of Flexible Manufacturing Systems. He is a member of ASME, ASQ, IIE, SME, and ASA. More information about Dr. Shi can be found on his web site: http://www-personal.engin.umich.edu/~shihang/index.html
Table of Contents

Chapter 1 What Is Stream of Variation for Multistage Manufacturing Processes? ..... 1
1.1 History and Current Status of SoV ..... 1
  1.1.1 SoV Initiation and Impact in Industry ..... 1
  1.1.2 The History of SoV Methodology Development ..... 3
1.2 Overview of SoV Methodology ..... 5
1.3 Relationship of SoV Methodologies with Other Existing Methods ..... 9
References ..... 11

PART I Basis of Matrix Theory and Multivariate Statistics

Chapter 2 Basics of Matrix Theory ..... 15
2.1 Introduction ..... 15
2.2 Definitions of Vector, Matrix, and Operations ..... 15
  2.2.1 Definitions ..... 15
  2.2.2 Partitioned Matrices ..... 18
2.3 Quadratic Forms ..... 19
2.4 Vector Space and Geometrical Interpretations ..... 20
2.5 Eigenvalues and Eigenvectors of a Matrix ..... 22
2.6 Vector and Matrix Differentiation, Maximization, and Operators ..... 24
  2.6.1 Differentiation with Vectors ..... 24
  2.6.2 Matrix Inequalities and Maximization ..... 26
  2.6.3 Vec Operator, Kronecker Product, and Hadamard Product ..... 27
    2.6.3.1 Vec Operator and Kronecker Product ..... 27
    2.6.3.2 Hadamard Product ..... 30
2.7 Exercises ..... 30
References ..... 32

Chapter 3 Basics of Multivariate Statistical Analysis ..... 33
3.1 Introduction ..... 33
3.2 Multivariate Distribution and Properties ..... 33
  3.2.1 Random Vectors, Cumulative Distribution Functions (CDF), and Probability Density Functions (PDF) ..... 34
  3.2.2 Marginal and Conditional Distributions ..... 35
  3.2.3 Population Moments ..... 37
  3.2.4 Correlation Coefficients ..... 40
3.3 Multivariate Normal Distribution and Quadratic Forms ..... 40
  3.3.1 Multivariate Normal Distribution and Its Properties ..... 43
  3.3.2 Quadratic Forms ..... 44
  3.3.3 Noncentral χ² and F Distributions ..... 45
3.4 Sampling Theory ..... 45
  3.4.1 Sample Geometry ..... 49
  3.4.2 Random Samples and the Expected Values of the Sample Mean and Covariance Matrix ..... 50
  3.4.3 Some Important Results of Sampling from Multivariate Normal Distributions ..... 50
3.5 The Wishart Distribution and Some Properties ..... 51
3.6 Exercises ..... 53
References ..... 55

Chapter 4 Statistical Inferences on Mean Vectors and Linear Models ..... 57
4.1 Statistical Inferences on Mean Vectors ..... 57
  4.1.1 Hotelling’s T² Test ..... 57
  4.1.2 Confidence Regions and Simultaneous Confidence Intervals ..... 59
    4.1.2.1 Confidence Regions ..... 59
    4.1.2.2 Simultaneous Confidence Intervals ..... 61
  4.1.3 Inference on a Population Mean Vector with Large Sample Size ..... 63
  4.1.4 Multivariate Quality Control Charts ..... 64
    4.1.4.1 Control Charts for One Multivariate Sample ..... 64
    4.1.4.2 Control Charts Based on Subgroup Means ..... 66
4.2 Multiple Linear Regression ..... 68
  4.2.1 Model Description ..... 69
  4.2.2 Least-Squares Estimates ..... 70
  4.2.3 Inferences about the Regression Model ..... 73
  4.2.4 Model Checking: Normality Checking and Outlier Detection ..... 75
  4.2.5 Inferences from the Estimated Regression Model ..... 76
  4.2.6 Multivariate Linear Regression ..... 78
    4.2.6.1 Multivariate Linear Regression Model ..... 78
    4.2.6.2 Least-Squares Estimation (LSE) and Maximum Likelihood Estimation (MLE) of Parameters ..... 79
    4.2.6.3 Inferences for Multivariate Regression Model under Normality Assumption ..... 82
    4.2.6.4 Predictions from Multivariate Regression ..... 83
    4.2.6.5 Selection Method of Independent Variables ..... 85
4.3 Exercises ..... 86
References ..... 89

Chapter 5 Principal Component Analysis and Factor Analysis ..... 91
5.1 Principal Component Analysis ..... 91
  5.1.1 Mathematical Model of PCA ..... 91
  5.1.2 Geometrical Interpretation ..... 95
  5.1.3 Inferences on PCs ..... 96
  5.1.4 Applying PCs to Process Control ..... 97
5.2 Factor Analysis ..... 99
  5.2.1 Introduction ..... 99
  5.2.2 Comparison of Factor Analysis and PCA ..... 99
  5.2.3 Orthogonal Factor Model ..... 100
  5.2.4 Statistical Interpretations of Factor Loadings and Communality ..... 102
    5.2.4.1 Interpretation of Factor Loadings ..... 102
    5.2.4.2 Interpretation of Communality ..... 102
  5.2.5 Estimation of Loading Matrix ..... 103
  5.2.6 Factor Rotation ..... 105
  5.2.7 Factor Score ..... 109
  5.2.8 General Procedures for Factor Analysis ..... 111
5.3 Exercises ..... 111
References ..... 113

PART II Variation Propagation Modeling in MMP

Chapter 6 State Space Modeling for Assembly Processes ..... 117
6.1 Introduction of Multistage Assembly Processes ..... 117
6.2 Variation Factors and Assumptions ..... 119
  6.2.1 Station-Level Variation Factors ..... 119
  6.2.2 Across-Station Variation Factors ..... 120
  6.2.3 Summary and Assumptions for Modeling ..... 121
6.3 State Space Modeling ..... 122
  6.3.1 Representation of Part Position and Its Deviation State ..... 122
  6.3.2 Some Preliminary Results ..... 123
  6.3.3 State Space Representation ..... 125
6.4 Model Validation ..... 131
6.5 Exercises ..... 133
Appendix 6.1: Determination of the Deviations of a Part (Lemma 6.1) ..... 137
Appendix 6.2: Effect of Fixture Deviation on Part Deviation (Lemma 6.2) ..... 138
References ..... 140

Chapter 7 State Space Modeling for Machining Processes ..... 143
7.1 Introduction ..... 143
  7.1.1 Introduction to Machining Processes and Dimensional Variation Sources ..... 143
  7.1.2 Modeling of Variation Propagation in Multistage Machining Processes ..... 145
  7.1.3 Model Formulation ..... 147
7.2 Derivation of Variation Propagation Model ..... 148
  7.2.1 Basics of Kinematic Analysis of Machining Operations ..... 148
  7.2.2 Representation of Workpiece Geometric Deviation ..... 152
  7.2.3 Single-Stage Modeling of Dimensional Variation ..... 153
    7.2.3.1 Analysis of Datum-Induced Error ..... 154
    7.2.3.2 Analysis of Fixture Errors ..... 158
    7.2.3.3 Identify the Overall Dimensional Error by Combining Error Sources Together ..... 159
    7.2.3.4 Modeling Variation Propagation in Multistage Machining Processes ..... 161
7.3 Model Validation ..... 162
  7.3.1 Introduction to the Experimental Machining Process ..... 162
  7.3.2 Comparison between the Real Measurement and the Model Prediction ..... 164
7.4 Summary ..... 166
7.5 Exercises ..... 166
7.6 Appendix: The System Matrices Used in Equation 7.27 ..... 169
References ..... 173

Chapter 8 A Factor Analysis Method for Variability Modeling ..... 175
8.1 Introduction ..... 175
8.2 A Factor Analysis Model for Process Variability ..... 176
  8.2.1 Model Structure ..... 176
  8.2.2 Interpretation of Variation Patterns ..... 177
8.3 Limitations of PCA and Factor Rotation ..... 180
8.4 Estimating the Number of Faults ..... 181
  8.4.1 Likelihood Ratio Test ..... 181
  8.4.2 AIC and MDL Information Criteria ..... 182
  8.4.3 An Illustrative Example ..... 183
8.5 Unique Identification of Multiple Faults ..... 187
  8.5.1 Estimating the Fault Geometry Vectors ..... 187
  8.5.2 Identifying Subgroups ..... 189
  8.5.3 Fault Interpretation and Illustrative Example ..... 190
8.6 Statistical Properties ..... 193
8.7 Summary ..... 195
8.8 Exercises ..... 196
8.9 Appendix 8.1: Discussion of the Distributions of Test Statistics ..... 196
References ..... 197

PART III Variation Source Diagnosis

Chapter 9 Diagnosability Analysis for Variation Source Identification ..... 201
9.1 Motivation and Formulation of Diagnosability Study ..... 201
9.2 Definitions of Diagnosability ..... 203
9.3 Criterion of Fault Diagnosability ..... 205
9.4 Minimal Diagnosable Class ..... 207
9.5 Gauging System Evaluation Based on Minimal Diagnosable Class ..... 210
  9.5.1 Information Quantity ..... 210
  9.5.2 Information Quality ..... 211
  9.5.3 System Flexibility ..... 211
9.6 Case Study ..... 212
  9.6.1 Case Study of a Multistage Assembly Process ..... 212
  9.6.2 Case Study of a Multistage Machining Process ..... 219
9.7 Summary ..... 221
9.8 Exercises ..... 224
Reference ..... 226

Chapter 10 Diagnosis through Variation Pattern Matching ..... 229
10.1 Introduction to Variation Patterns ..... 229
10.2 Links between the Fault-Quality Model and Variation Patterns ..... 231
10.3 Procedure of Pattern Matching for Variation Source Identification ..... 233
  10.3.1 Disturbance due to Unstructured Noises ..... 235
  10.3.2 Disturbance due to Sampling Uncertainty ..... 236
  10.3.3 A Robust Pattern Matching Procedure ..... 237
10.4 Case Study ..... 240
  10.4.1 A Machining Process and Its Variation Propagation Model ..... 240
  10.4.2 Pattern Matching for Root Cause Identification in the Machining Process ..... 241
10.5 Summary ..... 244
10.6 Exercises ..... 244
References ..... 248

Chapter 11 Estimation-Based Diagnosis ..... 249
11.1 LS Estimators for Variance Components ..... 249
  11.1.1 Deviation LS Estimator ..... 250
  11.1.2 Variation LS Estimator ..... 251
  11.1.3 Other Variation LS Estimators ..... 252
11.2 Relationship among Variance Estimators ..... 254
11.3 Comparison of Variance Estimators ..... 262
  11.3.1 Unbiasedness of Variance Estimators ..... 262
  11.3.2 Dispersion of Variance Estimators ..... 263
  11.3.3 Comparison of Variance Estimators ..... 264
11.4 Chapter Summary ..... 270
11.5 Exercises ..... 270
References ..... 272
2151_C000.fm Page xvi Thursday, October 26, 2006 1:39 PM
PART IV Design for Variation Reduction
Chapter 12 Optimal Sensor Placement and Distribution
12.1 Introduction
12.2 Design Criteria for Sensor Placement
  12.2.1 Diagnosability Index as Design Criterion
  12.2.2 Sensitivity Index as Design Criterion
12.3 Single-Station Sensor Placement
  12.3.1 Optimization Formulation
  12.3.2 Exchange Algorithms from Optimal Experimental Design
  12.3.3 Fast Exchange Algorithm with Sort-and-Cut
  12.3.4 Comparison among Alternative Algorithms
  12.3.5 Results of Optimal Sensor Layout on a Single Station
12.4 Multiple-Station Sensor Distribution
  12.4.1 Optimization Formulation
  12.4.2 Variation Transmissibility Ratio
  12.4.3 Detection Power on an Individual Station
  12.4.4 Optimal Strategy of Sensor Distribution
    12.4.4.1 Strategy of Sensor Distribution
  12.4.5 Example
12.5 Summary
12.6 Exercises
References
Chapter 13 Design Evaluation and Process Capability Analysis
13.1 Introduction
13.2 Sensitivity-Based Design Evaluation
13.3 Multivariate Process Capability Analysis
13.4 Examples
  13.4.1 Sensitivity-Based Design Evaluation
  13.4.2 Multivariate Process Capability Analysis
13.5 Exercises
13.6 Appendix 13.1: System Matrices of Configuration C1
13.7 Appendix 13.2: System Matrices of Configuration C2
13.8 Appendix 13.3: System Matrices of Configuration C3
13.9 Appendix 13.4: System Matrices of Configuration C4
References
Chapter 14 Optimal Fixture Layout Design
14.1 Introduction
14.2 Design Criteria for Variation Reduction
14.3 Data-Mining-Aided Design Algorithm
  14.3.1 Overview of the Data-Mining-Aided Design
  14.3.2 Candidate Design Space
  14.3.3 Uniform Coverage Selection
  14.3.4 Feature and Feature Function
  14.3.5 Clustering Method
  14.3.6 Classification Method
  14.3.7 Selection of K and J
  14.3.8 An Overall Description of the Data-Mining-Aided Design
14.4 Example and Performance Comparison
14.5 Summary
14.6 Exercises
References
Chapter 15 Process-Oriented Tolerance Synthesis
15.1 Concept of Process-Oriented Tolerancing
15.2 Framework of Process-Oriented Tolerancing
  15.2.1 Overview
  15.2.2 Variation Propagation Model
  15.2.3 Relationship between Tolerance and Variation
  15.2.4 Process Degradation Model
  15.2.5 Cost Function
  15.2.6 Optimization Formulation and Optimality
15.3 Case Study for Process-Oriented Tolerancing
  15.3.1 Tolerance Allocation when Tooling Degradation Is Not Considered
  15.3.2 Tolerance Allocation Considering Tooling Degradation
15.4 Integration of Process-Oriented Tolerance Synthesis and Maintenance Planning
  15.4.1 Decision Variables of Tolerance and Maintenance Design
  15.4.2 Cost Components
    15.4.2.1 Tolerance Cost
    15.4.2.2 Maintenance Cost
    15.4.2.3 Quality Loss Function
  15.4.3 Formulation of Optimization Problems
    15.4.3.1 Optimization Formulation Using a Quality Loss Function (F1)
    15.4.3.2 Optimization Formulation with a Quality Constraint
15.5 Integrated Tolerance and Maintenance Design in BIW Assembly Processes
15.6 Optimizations and Optimality for Integrated Tolerance and Maintenance Design
  15.6.1 Optimality Analysis of Optimization Formulation F1
  15.6.2 Optimality Analysis of Optimization Formulation F2
15.7 Case Study for Integrated Tolerance and Maintenance Design
  15.7.1 Optimal Tolerance and Maintenance Design for Optimization Formulation F1
  15.7.2 Optimal Tolerance and Maintenance Design for Optimization Formulation F2
  15.7.3 Cost Comparison under Different Design Schemes
15.8 Exercises
References
PART V Quality and Reliability Integration and Advanced Topics
Chapter 16 Quality and Reliability Chain Modeling and Analysis
16.1 Introduction
  16.1.1 Example 1: Machining Processes
  16.1.2 Example 2: Transfer or Progressive Die-Stamping Processes
16.2 QR-Chain Modeling
  16.2.1 Relationship between Component Performance and Product Quality
  16.2.2 System Component Degradation
  16.2.3 Product Quality Assessment and System Failure due to Nonconforming Products
  16.2.4 Component Catastrophic Failure and Its Induced System Catastrophic Failure
16.3 System Reliability Evaluation
  16.3.1 Challenges in System Reliability Evaluation
  16.3.2 System Reliability Evaluation of MMPs
  16.3.3 Self-Improvement of Product Quality and the Upper Bound of System Reliability
16.4 Implementation of QR-Chain Modeling and Analysis in Body-in-White Assembly Processes
  16.4.1 QR-Chain in Multistation BIW Assembly Processes
  16.4.2 QR-Chain Model of a BIW Assembly Process
    16.4.2.1 Locating-Pin Degradation Model
    16.4.2.2 Relationship between Process Variables and Deviations of Quality Characteristics
    16.4.2.3 Product Quality Assessment and Pin Catastrophic Failure
    16.4.2.4 System Reliability Evaluation
  16.4.3 Case Study
16.5 Exercises
16.6 Appendix 16.1: Derivation of Equation 16.4
16.7 Appendix 16.2: Proof of Result 16.1
16.8 Appendix 16.3: Derivation of Equation 16.10
16.9 Appendix 16.4: Proof of Lemma 16.1
16.10 Appendix 16.5: Proof of Result 16.4
16.11 Appendix 16.6: Distribution of Random Variable ξ(tk)
References
Chapter 17 Quality-Oriented Maintenance for Multiple Interactive System Components
17.1 Quality-Oriented Maintenance Model
17.2 Multicomponent Maintenance Policies
  17.2.1 Simple Block Resetting (Replacement)
  17.2.2 Modified Block Resetting
  17.2.3 Age Resetting
17.3 Further Discussion on Solutions of Optimization Problems
  17.3.1 Optimal Solutions of SBR Policy
  17.3.2 Optimal Solutions of Modified Block Resetting Policy
17.4 Case Study
17.5 Exercises
17.6 Appendix 17.1: Proof of Lemma 17.7
References
Chapter 18 Additional Topics on Stream of Variation
18.1 SoV Modeling for Multistation Assembly Process of Compliant Parts
  18.1.1 Introduction
  18.1.2 SoV Modeling of Compliant Parts in an MMP
    18.1.2.1 Single Station Assembly Modeling
    18.1.2.2 Multistation Assembly Modeling
  18.1.3 Additional Comments
18.2 SoV Modeling for Serial-Parallel Multistage Manufacturing Systems
  18.2.1 Introduction
  18.2.2 SoV Modeling for Serial-Parallel MMPs
    18.2.2.1 State Space Modeling for Multiple SoVs in an SP-MMP
    18.2.2.2 Model Dimension Reduction
  18.2.3 Additional Comments
18.3 SoV-Based Quality-Ensured Setup Planning
  18.3.1 Introduction
  18.3.2 Quality-Ensured Setup Planning Methodologies
    18.3.2.1 Variation Propagation Modeling for Setup Planning
    18.3.2.2 Optimization Formulation
  18.3.3 Additional Comments
18.4 Active Control for Variation Reduction of Multistage Manufacturing Processes
  18.4.1 Introduction
  18.4.2 Active Control for Variation Reduction of MMP
  18.4.3 Additional Comments
References
Index
Related Titles
2151_book.fm Page 1 Friday, October 20, 2006 5:04 PM
1
What Is Stream of Variation for Multistage Manufacturing Processes?
This chapter will present some background information about the Stream of Variation (SoV) methodology and its applications. We will particularly discuss the history of SoV initiation and development, the topics addressed in SoV, and the relationship of SoV with other existing methods.
1.1 HISTORY AND CURRENT STATUS OF SOV
1.1.1 SOV INITIATION AND IMPACT IN INDUSTRY
SoV research was initiated in the late 1980s by the need for variation reduction in automotive body assembly processes. At that time, U.S. automobile producers were losing market share to Asian and European competitors. The quality of a body-in-white (BIW), reflected mainly in its dimensional variability, was identified as one of the most important factors affecting the competitiveness of the U.S. automobile industry [1].
Variation reduction in automotive body assembly is a very challenging task [2]. A typical BIW has about 100 to 150 sheet metal parts, which are assembled at 80 to 120 assembly stations. A BIW assembly line normally has about 1500 to 2000 fixture locators and approximately 4000 welding spots. Figure 1.1 shows several layers of a BIW assembly process. If a deviation (or failure) occurs at a locator, at a welding spot, or in an incoming part, a dimensional variation is generated on the part assembled at that station. This variation propagates to downstream assembly stations and finally accumulates at the BIW. After assembly is finished, the door, hood, and deck-lid panels are installed into the openings of the BIW; after the painting process, the windshield and backlight are installed into the appropriate openings. The dimensional variations accumulated at the BIW openings, compounded by variations associated with panels or other subassemblies, significantly increase the production complexity, leading to more tooling failures and unexpected downtime and reducing both product quality and production throughput. In addition, dimensional variation in auto body assembly directly affects the gaps and flushness of panel fitting, which increases wind noise and the risk of water leakage. Variation reduction therefore demands effective process control techniques through monitoring and diagnosis based on dimensional measurement data.
FIGURE 1.1 A body-in-white (BIW) and major closure panels.
In the late 1980s, the inline optical coordinate measuring machine (OCMM) was introduced to automotive body assembly plants. The OCMM was installed at the end of an assembly line and used optical laser sensors to measure critical features of a BIW. An OCMM had about 100 laser sensors. Each sensor targeted a critical feature of a BIW and provided dimension readings in body coordinates. As a result, a large amount of measurement data was obtained from each auto body produced in an assembly line. The tremendous amount of inline quality data provided significant opportunities for more effective process control. However, it also challenged existing process control technologies. With hundreds of quality attributes being measured, some out-of-control conditions were invariably detected by statistical process control (SPC) charts. In general, reacting to those out-of-control conditions was not easy using SPC techniques alone: they could not be resolved quickly because of the complexity of the process and the time-consuming effort required for root cause identification. Therefore, the massive dimensional data generated by the OCMM was not fully utilized for variation reduction in body assembly. More effective models and data analysis methods were needed to address this challenging problem.
Professor S.M. Wu at the University of Michigan took up the challenge and initiated the “Two Millimeter Program.” The main goal of this program was to increase the competitiveness of U.S. automakers by reducing the BIW dimensional variation. In the early 1990s, the dimensional variation (measured as six sigma) was about 2.0 mm for Japanese automakers, 2.5 mm for European automakers, and more than 3.0 mm (even close to 8.0 mm) for U.S. automakers. The 2 mm program aimed to reduce car body variation to the lowest level possible: the six-sigma value of all key dimensional quality attributes on a car body should be reduced to below 2 mm, the benchmark set by the best-quality Japanese cars at that time.
Under the leadership of Professor Wu and with strong support from GM, Chrysler, Ford, Auto Body Consortium, and NIST-Advanced Technology Programs (NIST-ATP), a group of researchers and engineers devoted much effort to studying various critical issues on variation reduction for automotive body assembly and enabled concurrent technology transfer and implementation in the automotive industry.
The 2 mm program made a significant impact on automotive body assembly plants. In December 1992, one U.S. assembly plant located in Detroit, MI, achieved the “2 mm” variation level and marked the first success of the 2 mm program. In 1995, a “process navigator” software package was developed by Auto Body Consortium and the University of Michigan that provided an integrated environment for BIW dimensional data monitoring, analysis, and root cause diagnosis for the entire body assembly process. In 1998, a distributed sensing strategy was deployed in a body shop for more effective data acquisition for process control. By 2000, the developed variation reduction methodologies had been successfully implemented in the design improvement and product launch of 32 new vehicle programs [1]. The implementation of those methodologies significantly improved the quality of assembly and also reduced the ramp-up time during new production launches in automotive body assembly processes.
The SoV methodology was expanded to multistage machining processes in 1998 with the support of the NSF Engineering Research Center on Reconfigurable Manufacturing Systems (NSF ERC on RMS) and its industrial member companies. Unlike assembly, which is a part-addition process, machining is a material-removing process. A set of new technologies was developed for the SoV methodology in machining. In 1999, a test bed with four CNC machines, one CMM, and one reconfigurable inspection machine (RIM) was built in the NSF ERC on RMS for SoV research and validation. The developed SoV-based technologies for machining processes were validated and implemented in several powertrain plants from 2002 to 2006. In 2004, an SoV-based sensitivity study was conducted on a new engine head machining line with 22 stations to identify the critical machining stations that contributed the most variability to the process.
Gauging strategies were studied for quality feature selection and measurement of engine and transmission processes in 2005. Root cause diagnosis techniques were successfully implemented in process control in a diesel engine plant and adopted as the standard practice for the system in 2005. In 2003 the DCS Company won a grant from the NIST-Advanced Technology Program to develop commercial software based on the SoV methodology to support product and process modeling, design analysis, and diagnostic control of multistage manufacturing processes.
1.1.2 THE HISTORY OF SOV METHODOLOGY DEVELOPMENT
SoV methodology development attempts to develop a math-based technique for variation reduction in both design and manufacturing of a multistage manufacturing process. The methodology development can be divided into two phases: initial learning and technologies development (1990–1997) and a unified SoV methodology development based on state space models (1997–2006). Phase I research (1990–1997) targeted the development of advanced methods to analyze massive OCMM data to find root causes of variability. Some representative achievements in the early phase of technology development included (1) proposing the concept of using principal component analysis (PCA) to analyze the multivariate OCMM data [3], (2) developing a hierarchical representation or model and associated ad hoc diagnostic rules for failures in multistation assembly systems [4], (3) developing
a theoretical basis by proving the inherent relationships between the fixture failure pattern obtained from design and the first principal component calculated from the OCMM data [5], (4) developing multiple fault diagnosis techniques based on geometric modeling and least-squares estimation [6], and (5) variation modeling and analysis of compliant parts by integrating the part dimensional variability and material properties [7,8]. In 1997, Hu discussed the variation and its propagation problem in multistage manufacturing processes using the term stream of variation (SoV), as suggested by Yoram Koren [8].
Although various significant technological advancements were achieved in Phase I, the methodologies in this phase were devised to address particular variation reduction objectives. In addition, most of them focused on a single-stage manufacturing system (e.g., single fixture, single tooling). The hierarchical representation of multistation systems [4] was still an engineering knowledge “representation,” rather than a mathematical model, for body assembly systems. Thus, there was no systematic, unified methodology for variation reduction for multistage manufacturing systems at that time.
The concept of using a state space model to describe the variation and its propagation ushered in a new era of SoV methodology development [9,10]. The state space model was typically used in system or automatic control theories, but not in discrete-part manufacturing. The introduction of a state space modeling structure to describe variation and its propagation resolved two fundamental issues in variation reduction of multistage manufacturing processes: (1) the lack of analytical models for variation propagation modeling and (2) the lack of systematic methodologies for variation propagation analysis and optimization.
The development of state space models to describe variation propagation provided a solid scientific foundation for integrating manufacturing information (product design, process design, and quality measurements) for variation reduction in multistage manufacturing processes. With the state space model as the basis, various SoV-related research efforts have been conducted in both product and process design and manufacturing process control. The major technological milestones in SoV development are summarized in Figure 1.2, which consists of three major paths: SoV modeling, SoV-based design, and SoV-based process control. More than 40 papers on those topics were published in various journals, including ASME Transactions, IIE Transactions, IEEE Transactions, and Technometrics. They represent a truly multidisciplinary research effort that draws on mechanical engineering, industrial engineering, system and control engineering, and applied statistics. The work was also broadly recognized across research disciplines, as evidenced by the Best Paper Awards received from the American Society of Mechanical Engineers (ASME), the North American Manufacturing Research Institution (NAMRI), the Institute of Industrial Engineers (IIE), and the Institute for Operations Research and the Management Sciences (INFORMS). The SoV methodology, as a unified approach, has made fundamental contributions and has had a significant impact on variation reduction of multistage manufacturing processes.
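The Phase I idea of relating a fixture failure pattern to the first principal component of the measurement data can be illustrated with a toy example. The sketch below is not the method of [3] or [5]; it assumes a single dominant fault whose effect on four sensors is given by a hypothetical vector `pattern`, plus independent sensor noise, and all numerical values are made up for illustration. Under those assumptions the first principal component of the simulated data should be nearly parallel to the fault pattern.

```python
import numpy as np


def first_principal_component(Y):
    """Return (unit eigenvector, explained-variance ratio) of the largest PC.

    Y is an (n_samples, n_features) array of measured deviations.
    """
    cov = np.cov(Y, rowvar=False)            # sample covariance (mean removed)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1], eigvals[-1] / eigvals.sum()


# Hypothetical fault pattern: how one locator failure moves four sensors.
pattern = np.array([1.0, -1.0, 0.5, 0.0])
pattern_unit = pattern / np.linalg.norm(pattern)

# Simulated measurements (assumed values, not real OCMM data).
rng = np.random.default_rng(0)
n_bodies = 500
fault_size = rng.normal(0.0, 1.0, size=n_bodies)    # random fault magnitude
noise = rng.normal(0.0, 0.1, size=(n_bodies, 4))    # independent sensor noise
Y = np.outer(fault_size, pattern) + noise

pc1, ratio = first_principal_component(Y)
similarity = abs(pc1 @ pattern_unit)   # the sign of an eigenvector is arbitrary
print(f"|cos(angle)| between PC1 and fault pattern: {similarity:.3f}")
print(f"fraction of variance explained by PC1: {ratio:.3f}")
```

With several simultaneous faults or correlated noise, the principal components no longer map one-to-one onto fault patterns, which is one reason the later chapters rely on model-based diagnosis.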
FIGURE 1.2 Important milestones in SoV methodology development.
1.2 OVERVIEW OF SOV METHODOLOGY
Although SoV methodology originated from dimensional variation control of auto body assembly processes, it has been expanded to address variation management and reduction issues for generic complex multistage manufacturing processes (MMPs). An MMP is a process that involves multiple stations to produce a product. Examples of such processes include: (1) an automotive body assembly that has multiple parts assembled at multiple stations, (2) an engine head production that involves multiple machining operations of a single part at multiple stations, (3) a transfer or progressive stamping process that involves multiple die stations to form a part, and (4) semiconductor manufacturing processes in which a silicon wafer develops in several stages with several layers to form a chip. In an MMP, workpiece dimensional errors come from the errors generated at the current stage as well as from accumulated errors transmitted from previous stages. Thus, the nature of variation and its propagation introduce significant challenges to variation management and reduction for an MMP. This book will discuss how to conduct variation reduction by presenting SoV methodologies.
SoV attempts to describe the complex production stream and data stream involved in modeling and analysis of variation and its propagation in an MMP. The production stream refers to the physical layout of the multistage manufacturing process. As shown in Figure 1.3, multiple stations form a subassembly line, and multiple subassembly lines merge into the main assembly line in an automotive body assembly process. Each station generates a dimensional variation on the subassembly, and the subassembly is transferred to the next station to be assembled with more parts. As the parts or subassemblies go through all production
FIGURE 1.3 Layout of a multistage automotive body assembly process.
lines, a stream of part flow, as well as an associated flow of part variation, will be generated. One interpretation of SoV reflects these multiple-station and multiple-line configurations, which lead to variation and its propagation through the MMP. Another interpretation of SoV reflects the complex data relationships in an MMP. As shown in Figure 1.4, the x-axis represents the manufacturing stages; the y-axis, time; the z-axis, the quality attributes; and Mi, the quality features. In an MMP, there are three types of correlations among those data streams: (1) the quality attributes are autocorrelated across the stages along the production line, shown as M2 along the x-axis; (2) the quality attributes are cross-correlated with one another within the same stage, represented as [M1, M2, …, Mm] at stage N along the z-axis; and (3) each quality attribute is also autocorrelated in terms of time owing to the
FIGURE 1.4 Complex data relationships in an MMP.
FIGURE 1.5 Variation propagation and notations in SoV modeling.
degradation or wear of production tooling over time and represented as Mi (i = 1, 2, …, m) along the y-axis. Those three types of correlations, observed as a stream of data, introduce significant challenges in variation modeling, analysis, and control. To investigate the variations of those data streams, this book presents the development of the SoV methodology and its applications. The foundation of the SoV methodology is a mathematical model that links the key product quality characteristics (KPC) with key control characteristics (KCC) (e.g., fixture error, machine error, etc.). This model has a state space representation that describes the deviation and its propagation in an N-station process (as shown in Figure 1.5), i.e.,
$$\mathbf{x}_k = \mathbf{A}_{k-1}\mathbf{x}_{k-1} + \mathbf{B}_k\mathbf{u}_k + \mathbf{w}_k, \quad k = 1, 2, \ldots, N, \tag{1.1}$$
$$\mathbf{y}_k = \mathbf{C}_k\mathbf{x}_k + \mathbf{v}_k, \quad \{k\} \subset \{1, 2, \ldots, N\}, \tag{1.2}$$
where k = 1, 2, …, N is the stage index; $\mathbf{x}_k$ is the state vector representing the key quality characteristics of the product (or intermediate workpiece) after stage k; $\mathbf{u}_k$ is the control vector representing the tooling errors (e.g., tolerance when no faults occur, or deviation when failures occur on the tooling) at stage k; $\mathbf{y}_k$ is the measurement vector representing product quality measurements at stage k; and $\mathbf{w}_k$ and $\mathbf{v}_k$ are the modeling error and sensing error, respectively. The index set in Equation 1.2 indicates that measurements may be taken at only a subset of the stages. The coefficient matrices $\mathbf{A}_{k-1}$, $\mathbf{B}_k$, and $\mathbf{C}_k$ are determined by product and process design information: $\mathbf{A}_{k-1}$ represents the impact of deviation transition from stage k – 1 to stage k, $\mathbf{B}_k$ represents the impact of the local tooling deviation on product quality at stage k, and $\mathbf{C}_k$ is the measurement matrix, which can be obtained from the defined key product quality features at stage k. If we repeat the modeling efforts for each stage from k = 1 to N, we obtain the deviation and its propagation throughout the MMP. By taking the variance (covariance) of both sides of Equation 1.1 and Equation 1.2, and making certain assumptions, we obtain the “variation” and its propagation model for the MMP.
With the mathematical model, variation reduction can be achieved in both the design and manufacturing stages through rigorous mathematical decision making. However, significant challenges must be overcome in model development and utilization to realize the benefits of the analytical capability of this model. These challenges are addressed in this book. The SoV methodology addresses the following important questions for variation reduction in an MMP:
2151_book.fm Page 8 Friday, October 20, 2006 5:04 PM
8
Stream of Variation Modeling and Analysis for MMPs
• How to model and represent the product and process design information to achieve variation reduction? In this book, we introduce two basic methods: the physical modeling method and the data-driven modeling method. In the former, the kinematic relationship between KCC and KPC is identified through a detailed physical analysis of the manufacturing process; in the latter, the model is obtained through statistical fitting based on historical process measurement data. Details of SoV modeling will be discussed in Chapter 6 to Chapter 8.

• How to systematically find the root causes of variability, in terms of which manufacturing station, and what in that station, introduces the variability? During continuous production, a larger product variation may occur at any stage of an MMP because of worn-out tooling, tooling breakage, and incoming part variation. This book will present a systematic approach to root cause identification. In this approach, a new concept of “statistical methods driven by engineering models” is proposed to integrate the product and process design knowledge with online statistics. The variation model, developed from the design information, is used to link the product variation (quality attributes) with the tooling variation (or potential failure). During production, the product features are measured and the data are used to conduct statistical analysis, based on Equation 1.1 and Equation 1.2, to identify root causes. Advanced statistics and estimation theory will be used in these efforts. This book presents two types of diagnosis techniques for root cause identification in MMPs: variation pattern matching (Chapter 10) and estimation-based diagnosis (Chapter 11).

• How to distribute measurement gauges for effective process control in an MMP by determining when, where, and what to measure on the final and intermediate workpieces? One of the major tasks in variation reduction is to design the gauging strategies to measure the product features in a manufacturing process. Most existing industrial practices focus on product coherence inspection (i.e., product-oriented measurements), which is effective in detecting product imperfection but may not be as effective in identifying root causes of product variation. This book proposes a “process-oriented” measurement concept with a distributed sensing strategy. In this strategy, selected key control and product characteristics are measured at specific stages both to detect product defects and to identify their root causes. The diagnosability issue, i.e., the capability of identifying potential root causes of the process variation for a given measurement strategy, is discussed in Chapter 9. The related issues of optimal sensor placement and distribution are discussed in Chapter 12.

• How to conduct design evaluation and tolerance synthesis for an MMP? Variation analysis and design evaluation are conducted during the product and process design stages to identify the critical components, features, and manufacturing operations. With the SoV model defined in Equation 1.1, the following three tasks can be performed: (1) tolerance analysis, by allocating the part tolerance (x0) and tooling tolerance (uk) and then predicting the final product tolerance (xN) by solving the difference equations; (2) tolerance synthesis, by fixing the final product tolerance (xN) and then assigning the tolerance for individual parts (x0) and tooling components (uk) with minimized cost objective functions; and (3) sensitivity study, by identifying the critical parts (xk) or tooling components (uk) that have significant impacts on the final part variation through the evaluation of defined sensitivity indices. The details of these topics will be discussed in Chapter 13 to Chapter 15 of this book.

• How to integrate product quality and production tooling reliability for effective system design and maintenance decision making? There is a complex and intriguing relationship between product quality and tooling reliability. A degraded (or failed) production tool will lead to greater product variability or a larger number of defects; in addition, the variability of product quality features will impact the degradation rate or failure rate of production tooling. For an MMP, those interactions are more complex, as variations propagate from one stage to the next. Thus, a “chain effect” between the product quality and tooling reliability can be observed, and this is known as the “QR Chain” effect. The modeling of the QR Chain is an integrated effort of the SoV model and the semi-Markov process model. Chapter 16 will discuss the modeling of the QR Chain for MMPs. Chapter 17 investigates the applications of the QR Chain effect in reliability and maintenance decisions.
1.3 RELATIONSHIP OF SOV METHODOLOGIES WITH OTHER EXISTING METHODS

A typical product realization involves product design, process design, production ramp-up, production and process control and, finally, system maintenance. Various methodologies have been developed and implemented in each stage of the product realization. However, very few techniques exist to systematically address issues related to variation reduction and management. There is a lack of a unified methodology to conduct variation management and reduction over all the different steps from design and manufacturing to maintenance in MMPs.

Currently, computer-aided design (CAD) and computer-aided manufacturing/engineering (CAM or CAE) are widely used in product and process design, and have significantly impacted product realization. CAD mainly focuses on product design in terms of functionality, part geometry, component interactions, and their tolerances. CAM addresses issues of process planning, sequencing, and tooling design for a given product. There is usually an efficient interface for data communication and data transfer among CAD and CAM systems. However, few of them address issues related to “variation” modeling and analysis.

The closest efforts in variation modeling and analysis are Variation Simulation Analysis (VSA), or similar industrial packages such as DVA, 3DCS, or Valysis. These technologies use homogeneous transformation matrices to model the variation and its propagation based on geometrical data from the product and tooling design. Then, numerical
simulations, mainly based on the Monte Carlo method, are conducted to predict the final product variation with the given process sequence, tooling layout, and tooling tolerance. In general, these efforts can be time consuming and only lead to the “best available” solution from different simulation trials. The VSA-based technology mainly serves as a tolerance analysis and design evaluation tool rather than a tolerance synthesis and design optimization tool for MMPs.

Unlike VSA-based technologies, SoV models the variation and its propagation with a station-indexed state space model based on the product and process design. The SoV model in Equation 1.1 contains all essential design information that impacts the product variability. That information includes production sequence information, fixture locator layout and datum changes along the process, tooling and part interactions, part and tooling tolerance, and key product quality features. All those design parameters are integrated into the SoV model in an analytical format. Thus, it facilitates effective computation to obtain analytical solutions for design evaluations (unlike Monte Carlo simulations). It also provides capabilities for tolerance synthesis (or allocation) and design optimization for process sequencing, tooling layout, and variation management.

In production and process control, statistical process control (SPC) plays an essential role in monitoring process data and detecting mean or variation changes. Although SPC is effective in terms of process change detection, it does not systematically integrate the product and process design data into the process control efforts. Thus, human experience plays a major role in identifying root causes of variation. These efforts can be very challenging and time consuming, especially for a new production line for which experience is limited, or for MMPs in which the process and product parameters have complex and intricate relationships.
The SoV models the product measurement data with an observation equation (Equation 1.2), which can be used jointly with the state space equation (Equation 1.1) representing the variation and its propagations based on the product design. As a result, SoV-based process control can systematically integrate production measurement data with the process and product design information. This leads to effective process monitoring, change detection, and root cause identification in terms of which stage introduces product variability, and what process in that stage introduces the variability. In addition, the SoV-based process control also leads to systematic distribution of sensors (e.g., gauging strategies) to ensure identification of the root causes of variability. This effort leads to the transformation of the gauging strategy from the end-of-line “product-inspection”-oriented sensing placement to distributed “process-diagnosis”-oriented sensing strategy. Therefore, more effective process control capabilities can be achieved. Variation reduction and management also affects tooling reliability and maintenance decisions during production. Tooling reliability modeling and analysis are typically separated from the product quality and process control in current industrial practice. However, recent research has indicated that product quality and tooling reliability are inherently related. The QR Chain is an important aspect of an MMP. The SoV model provides a basis for integrated modeling and analysis for product quality and tooling reliability integration, which further leads to integrated design
of tolerance optimization and maintenance decision making with consideration of the life cycle of production. In summary, SoV methodology expands the current technological base by developing a unified tool for variation modeling, design analysis, and process control with a focus on variation reduction and management in an MMP.
References

1. Long, B., March 1999, A Systems Solution to a Quality Problem in Auto Body Manufacturing, NIST Special Pub 950-1.
2. Ceglarek, D. and Shi, J., 1995, Dimensional variation reduction for automotive body assembly, Journal of Manufacturing Review, 8, 139–154.
3. Hu, S.J. and Wu, S.M., May 1992, Identifying sources of variation in automobile body assembly using principal component analysis, Transactions of NAMRI, XX, 311–316.
4. Ceglarek, D., Shi, J., and Wu, S.M., 1994, A knowledge-based diagnostic approach for the launch of the auto-body assembly process, ASME Transactions, Journal of Engineering for Industry, 116, 491–499. (Also published in 1993, PED-Vol. 64, Manufacturing Science and Engineering, 401–412.)
5. Ceglarek, D. and Shi, J., 1996, Fixture failure diagnosis for auto body assembly using pattern recognition, ASME Transactions, Journal of Engineering for Industry, 118, 55–65.
6. Apley, D. and Shi, J., 1998, Diagnosis of multiple fixture faults in panel assembly, ASME Transactions, Journal of Manufacturing Science and Engineering, 120, 793–801. (Also Proceedings of the ’95 ASME Winter Annual Meeting, ASME MED-Vol. 4, November 1996, 575–581.)
7. Liu, S.C. and Hu, S.J., 1997, Variation simulation for deformable sheet metal assembly using finite element methods, ASME Journal of Manufacturing Science and Engineering, 119, 368–374.
8. Hu, S.J., 1997, Stream of variation theory for automotive body assembly, Annals of the CIRP, 46(1), 1–6.
9. Shi, J. and Jin, J., 1997, Modeling and diagnosis for automotive body assembly process using state space models, Proceedings of International Intelligent Manufacturing System’97, Seoul, Korea, pp. 189–196.
10. Jin, J. and Shi, J., November 1999, State space modeling of sheet metal assembly for dimensional control, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 756–762.
11. Huang, Q., Zhou, N., and Shi, J., 2000, Stream of variation modeling and diagnosis of multi-station machining processes, Proceedings of the 2000 ASME International Mechanical Engineering Congress and Exposition, MED-Vol. 11, November 5–10, Orlando, FL, pp. 81–88.
12. Zhou, S., Huang, Q., and Shi, J., April 2003, State space modeling for dimensional monitoring of multistage machining process using differential motion vector, IEEE Transactions on Robotics and Automation, 19(2), 296–309.
13. Camelio, J., Hu, S.J., and Ceglarek, D., 2003, Modeling variation propagation of multi-station assembly systems with compliant parts, ASME Transactions, Journal of Mechanical Design, 125, 673–681.
14. Chen, Y., Jin, J., and Shi, J., 2004, Integration of dimensional quality and locator reliability in design and evaluation of multi-station body-in-white assembly processes, IIE Transactions, 36(9), 827–839.
15. Ding, Y., Jin, J., Ceglarek, D., and Shi, J., 2000, Process-oriented tolerance synthesis for multi-station manufacturing systems, Proceedings of the 2000 International Mechanical Engineering Congress and Expositions, November 5–10, Orlando, FL, pp. 15–22.
16. Ding, Y., Jin, J., Ceglarek, D., and Shi, J., 2005, Process-oriented tolerancing for multistation assembly systems, IIE Transactions, 37(6), 493–508.
17. Ding, Y., Ceglarek, D., and Shi, J., 2002, Design evaluation of multi-station assembly processes by using state space approach, ASME Transactions, Journal of Mechanical Design, 124(2), 408–418.
18. Kim, P. and Ding, Y., 2004, Optimal design of fixture layout in multi-station assembly processes, IEEE Transactions on Automation Science and Engineering, 1(2), 133–145.
19. Ding, Y., Ceglarek, D., and Shi, J., 2002, Fault diagnosis of multistage manufacturing processes by using state space approach, ASME Transactions, Journal of Manufacturing Science and Engineering, 124(2), 313–322.
20. Ding, Y., Shi, J., and Ceglarek, D., 2002, Diagnosability analysis of multi-station manufacturing processes, ASME Transactions, Journal of Dynamic Systems, Measurement, and Control, 124, 1–13.
21. Ding, Y., Kim, P., Ceglarek, D., and Jin, J., 2003, Optimal sensor distribution for variation diagnosis for multi-station assembly processes, IEEE Transactions on Robotics and Automation, 19(4), 543–556.
22. Zhou, S., Chen, Y., and Shi, J., 2004, Root cause identification for quality improvement of multistage machining processes, IEEE Transactions on Robotics and Automation, 1(1), 73–83.
Part I Basis of Matrix Theory and Multivariate Statistics
2 Basics of Matrix Theory
2.1 INTRODUCTION

Matrix concepts, techniques, and theorems are crucial tools for multivariate analysis and its applications, e.g., process monitoring and quality improvement for multiple significant characteristics. By employing matrix notation, results can be expressed in a clear and compact manner. Matrix notation also facilitates the geometrical interpretation and algebraic explanation of calculations. In this chapter, we summarize the concepts, operations, and theorems of matrix algebra that will be required in later sections of this book. Further reading on the topic can be found in References 1–5.
2.2 DEFINITIONS OF VECTOR, MATRIX, AND OPERATIONS

2.2.1 DEFINITIONS

Definition 2.1 (Matrix): A matrix of dimension m × n is defined as a rectangular array of elements arranged in m rows and n columns, denoted by an uppercase boldface letter as follows:

$$\mathbf{A} = (a_{ij}) = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}_{m \times n} \in R^{m \times n}, \tag{2.1}$$

where $a_{ij}$ denotes the element located in the ith row and the jth column of the matrix $\mathbf{A}$.

Definition 2.2 (Vector): A vector is a matrix with only a single row or column, named row vector or column vector, respectively.

Remark 2.1: In this book, without special indication, we will refer to a vector as a column vector, which is customarily denoted by a lowercase boldface letter as follows:

$$\mathbf{a} = [a_1 \;\; a_2 \;\; \cdots \;\; a_m]^T \in R^{m \times 1}. \tag{2.2}$$

Remark 2.2: From Definition 2.1 and Definition 2.2, a matrix can also be viewed as a collection of row or column vectors, i.e.,

$$\mathbf{A} = [\tilde{\mathbf{a}}_1^T \;\; \tilde{\mathbf{a}}_2^T \;\; \cdots \;\; \tilde{\mathbf{a}}_m^T]^T = [\mathbf{a}_1 \;\; \mathbf{a}_2 \;\; \cdots \;\; \mathbf{a}_n], \tag{2.3}$$
where $\tilde{\mathbf{a}}_i = [a_{i1} \;\; a_{i2} \;\; \cdots \;\; a_{in}]$ is the ith row vector with dimension n, and $\mathbf{a}_j = [a_{1j} \;\; a_{2j} \;\; \cdots \;\; a_{mj}]^T$ is the jth column vector with dimension m.

Definition 2.3 (Determinant of a square matrix): Let $\mathbf{A}$ be an m × m square matrix. The determinant of $\mathbf{A}$, denoted by $|\mathbf{A}|$ or $\det(\mathbf{A})$, is defined as:

$$|\mathbf{A}| = \sum (-1)^{\alpha} a_{1j_1} a_{2j_2} \cdots a_{mj_m}. \tag{2.4}$$

Each product consists of one element from each row and each column, and is multiplied by (−1) if the number of inversions α of the particular permutation $j_1, j_2, \ldots, j_m$ from the standard order 1, 2, …, m is odd. The summation extends over all m! permutations of the column subscripts. The number of inversions α in a particular permutation is defined as the total number of times an element is followed by numbers that would ordinarily precede it in the standard order 1, 2, …, m. Matrix $\mathbf{A}$ is referred to as a singular matrix if $|\mathbf{A}| = 0$.

Example 1: $\mathbf{A} = [a_{11}]$, $|\mathbf{A}| = a_{11}$

Example 2: $\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, $|\mathbf{A}| = a_{11}a_{22} - a_{12}a_{21}$

Example 3: $\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$,

$$|\mathbf{A}| = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}$$

Definition 2.4 (Trace): Let $\mathbf{A}$ be an m × m square matrix. The sum of diagonal elements is called the trace of $\mathbf{A}$, denoted by

$$\mathrm{tr}(\mathbf{A}) = \sum_{i=1}^{m} a_{ii} \tag{2.5}$$

Definition 2.5 (Inverse): The inverse of the m × m square matrix $\mathbf{A} = (a_{ij})$, denoted by $\mathbf{A}^{-1}$, is the unique matrix such that

$$\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I} \tag{2.6}$$
If such an $\mathbf{A}^{-1}$ exists, then $\mathbf{A}$ is nonsingular; otherwise, $\mathbf{A}$ is singular. Some useful results related to inverses are summarized below. Here, we assume that all the inverses exist.

1. $(\mathbf{A}^{-1})^T = (\mathbf{A}^T)^{-1}$
2. $(\mathbf{A}^{-1})^{-1} = \mathbf{A}$
3. $(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}$
4. $(k\mathbf{A})^{-1} = \frac{1}{k}\mathbf{A}^{-1}$, where k is a nonzero scalar
5. $\begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{mm} \end{bmatrix}^{-1} = \begin{bmatrix} 1/a_{11} & 0 & \cdots & 0 \\ 0 & 1/a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/a_{mm} \end{bmatrix}$
6. If $\mathbf{A}$ is orthogonal, then $\mathbf{A}^{-1} = \mathbf{A}^T$

Definition 2.6 (Reduced row echelon form of a matrix): A matrix is said to be in reduced row echelon form if the following conditions are satisfied:

• The first nonzero number in a row is 1 (we call it the leading 1).
• All rows of zeros (if any) are together at the bottom of the matrix.
• Each column that contains a leading 1 has only zeros below it.
• Each column that contains a leading 1 has zeros everywhere else.

If all these properties except the last are satisfied, then the matrix is in row echelon form.

Remark 2.3: The reduced row echelon form of a matrix is unique.

Remark 2.4: The reduced row echelon form of a matrix can be obtained through row operations, i.e., trading two rows, multiplying a row by a nonzero scalar, or adding a scalar multiple of one row to another row. For example, the matrix

$$\mathbf{A} = \begin{bmatrix} 1 & 1 & 1 & 3 \\ 2 & 3 & 7 & 0 \\ 1 & 3 & -2 & 17 \end{bmatrix}$$

can be transformed into reduced row echelon form as

$$\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & -2 \end{bmatrix}.$$
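The three row operations named in Remark 2.4 can be sketched in code. The following is a minimal Gauss–Jordan sketch in Python using exact fractions; the function name `rref` is an illustrative choice and is not defined in the text. It is applied to the 3 × 4 example matrix of Remark 2.4, read as rows (1, 1, 1, 3), (2, 3, 7, 0), and (1, 3, −2, 17).

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if src is None:
            continue
        # Row operation 1: trade two rows.
        m[pivot_row], m[src] = m[src], m[pivot_row]
        # Row operation 2: scale the pivot row so that its leading entry is 1.
        lead = m[pivot_row][col]
        m[pivot_row] = [v / lead for v in m[pivot_row]]
        # Row operation 3: subtract multiples of the pivot row from every other row.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

A = [[1, 1, 1, 3],
     [2, 3, 7, 0],
     [1, 3, -2, 17]]
R = rref(A)
```

Exact fractions are used here only to avoid rounding in the comparison; a floating-point version would follow the same steps with a tolerance on the pivot test.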
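2.2.2 PARTITIONED MATRICES

In certain cases, some blocks in a matrix possess common characteristics. Thus, it is sometimes convenient to partition the whole matrix into several blocks, for example:

$$\mathbf{A} = (\mathbf{A}_{ij}) = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} & \cdots & \mathbf{A}_{1n} \\ \mathbf{A}_{21} & \mathbf{A}_{22} & \cdots & \mathbf{A}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{A}_{m1} & \mathbf{A}_{m2} & \cdots & \mathbf{A}_{mn} \end{bmatrix} \tag{2.7}$$

where each submatrix $\mathbf{A}_{ij}$ has $p_i$ rows and $q_j$ columns.

Remark 2.5: In Equation 2.7, submatrices in the same row of $\mathbf{A}$ have the same number of rows; so do the submatrices in the same column of $\mathbf{A}$.

Remark 2.6: The operations of partitioned matrices are similar to those of unpartitioned ones, except that the elements in partitioned matrices are submatrices. For example, if $\mathbf{A} = (\mathbf{A}_{ij})$ and $\mathbf{B} = (\mathbf{B}_{ij})$ are partitioned into submatrices with similar dimensions, then $\mathbf{A} + \mathbf{B} = (\mathbf{A}_{ij} + \mathbf{B}_{ij})$, and

$$\mathbf{A}\mathbf{B} = \begin{bmatrix} \sum_{k=1}^{n} \mathbf{A}_{1k}\mathbf{B}_{k1} & \cdots & \sum_{k=1}^{n} \mathbf{A}_{1k}\mathbf{B}_{kp} \\ \vdots & \ddots & \vdots \\ \sum_{k=1}^{n} \mathbf{A}_{mk}\mathbf{B}_{k1} & \cdots & \sum_{k=1}^{n} \mathbf{A}_{mk}\mathbf{B}_{kp} \end{bmatrix}$$

in which the dimensions of each submatrix product must conform.

By employing submatrices, we can simplify computations of the inverse and determinant. Consider a particularly important partition of the form:

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix} \tag{2.8}$$

If both $\mathbf{A}_{11}$ and $\mathbf{A}_{22}$ are square, then the inverse and determinant of $\mathbf{A}$ can be readily calculated in terms of submatrices:

1. If $\mathbf{A}_{22}$ has an inverse, then

$$\mathbf{A}^{-1} = \begin{bmatrix} (\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1} & -(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1}\mathbf{A}_{12}\mathbf{A}_{22}^{-1} \\ -\mathbf{A}_{22}^{-1}\mathbf{A}_{21}(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1} & \mathbf{A}_{22}^{-1} + \mathbf{A}_{22}^{-1}\mathbf{A}_{21}(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1}\mathbf{A}_{12}\mathbf{A}_{22}^{-1} \end{bmatrix} \tag{2.9}$$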
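2. The determinant of $\mathbf{A}$ is

$$|\mathbf{A}| = \begin{cases} |\mathbf{A}_{11}| \cdot |\mathbf{A}_{22} - \mathbf{A}_{21}\mathbf{A}_{11}^{-1}\mathbf{A}_{12}| & \text{if } \mathbf{A}_{11} \text{ has an inverse} \\ |\mathbf{A}_{22}| \cdot |\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21}| & \text{if } \mathbf{A}_{22} \text{ has an inverse} \end{cases} \tag{2.10}$$

3. If both $\mathbf{A}_{11}$ and $\mathbf{A}_{22}$ have inverses, then

$$(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1} = \mathbf{A}_{11}^{-1} + \mathbf{A}_{11}^{-1}\mathbf{A}_{12}(\mathbf{A}_{22} - \mathbf{A}_{21}\mathbf{A}_{11}^{-1}\mathbf{A}_{12})^{-1}\mathbf{A}_{21}\mathbf{A}_{11}^{-1}. \tag{2.11}$$

As a special case, if $\mathbf{A}$ is an m × m nonsingular matrix, $\mathbf{a}$ is an m × 1 vector, and k is a scalar, then we have the identity:

$$(\mathbf{A} + k\mathbf{a}\mathbf{a}^T)^{-1} = \mathbf{A}^{-1} - \frac{k\,\mathbf{A}^{-1}\mathbf{a}\mathbf{a}^T\mathbf{A}^{-1}}{1 + k\,\mathbf{a}^T\mathbf{A}^{-1}\mathbf{a}} \tag{2.12}$$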
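The partitioned-matrix results can be checked numerically. The sketch below assumes NumPy and uses an arbitrary, hypothetical 5 × 5 test matrix (a random matrix plus a multiple of the identity, chosen only so that the required inverses exist); it compares the block formulas for the inverse, the determinant, and the rank-one update against direct computation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical test matrix; adding 10*I keeps A and its diagonal blocks well conditioned.
A = rng.standard_normal((5, 5)) + 10.0 * np.eye(5)
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]

# Equation 2.9: inverse of A assembled from its blocks.
A22_inv = np.linalg.inv(A22)
S_inv = np.linalg.inv(A11 - A12 @ A22_inv @ A21)
block_inv = np.block([
    [S_inv, -S_inv @ A12 @ A22_inv],
    [-A22_inv @ A21 @ S_inv, A22_inv + A22_inv @ A21 @ S_inv @ A12 @ A22_inv],
])
inverse_ok = np.allclose(block_inv, np.linalg.inv(A))

# Equation 2.10 (second case): determinant of A from its blocks.
det_blocks = np.linalg.det(A22) * np.linalg.det(A11 - A12 @ A22_inv @ A21)
det_ok = np.isclose(det_blocks, np.linalg.det(A))

# Equation 2.12: inverse of a rank-one update of A.
a = rng.standard_normal((5, 1))
k = 0.7
A_inv = np.linalg.inv(A)
denominator = 1.0 + k * (a.T @ A_inv @ a).item()
updated_inv = A_inv - k * (A_inv @ a @ a.T @ A_inv) / denominator
rank_one_ok = np.allclose(updated_inv, np.linalg.inv(A + k * (a @ a.T)))
```

The rank-one form is useful when the inverse of A is already available, because the update needs only matrix-vector products rather than a new inversion.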
2.3 QUADRATIC FORMS

Definition 2.7 (Quadratic Form): A quadratic form of the variables $x_i$ (i = 1, 2, …, n) is of the type:

$$f(x_1, x_2, \ldots, x_n) = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + 2a_{12}x_1x_2 + \cdots + 2a_{1n}x_1x_n + \cdots + 2a_{(n-1)n}x_{n-1}x_n \tag{2.13}$$

Let $\mathbf{x}^T = [x_1 \;\; x_2 \;\; \cdots \;\; x_n]$ and $\mathbf{A} = (a_{ij})$ be a symmetric matrix of order n × n. Then Equation 2.13 can readily be expressed in an equivalent matrix notation:

$$f(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x} \tag{2.13′}$$

Remark 2.7: If $\mathbf{x}$ is a random vector, the quadratic form plays an important role in univariate and multivariate analysis.

Definition 2.8 (Positive definite and positive semidefinite matrix): A symmetric matrix $\mathbf{A} = (a_{ij})$ and its associated quadratic form in Equation 2.13 are called positive definite if

$$\mathbf{x}^T\mathbf{A}\mathbf{x} > 0 \quad \text{for } \forall \mathbf{x} \neq \mathbf{0} \tag{2.14}$$

and positive semidefinite if

$$\mathbf{x}^T\mathbf{A}\mathbf{x} \geq 0 \quad \text{for } \forall \mathbf{x} \neq \mathbf{0} \tag{2.15}$$
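Definition 2.8 quantifies over every nonzero x, so it cannot be tested directly point by point. A common practical check, sketched below under the assumption that NumPy is available, uses the signs of the eigenvalues of the symmetric matrix (this relies on the eigenvalue properties summarized in Section 2.5). The function name `classify_symmetric`, the tolerance, and the three small matrices are illustrative choices, not part of the text.

```python
import numpy as np

def classify_symmetric(A, tol=1e-12):
    """Classify a symmetric matrix by the sign of its smallest eigenvalue."""
    eigenvalues = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix, ascending
    if eigenvalues[0] > tol:
        return "positive definite"
    if eigenvalues[0] >= -tol:
        return "positive semidefinite"
    return "not positive semidefinite"

A_pd = np.array([[2.0, -1.0], [-1.0, 2.0]])   # eigenvalues 1 and 3
A_psd = np.array([[1.0, 1.0], [1.0, 1.0]])    # eigenvalues 0 and 2
A_ind = np.array([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1 and 3

# The quadratic form x^T A x evaluated at one sample point.
x = np.array([1.0, -1.0])
q_pd = x @ A_pd @ x     # 2 + 2 + 2 = 6
q_psd = x @ A_psd @ x   # 1 - 1 - 1 + 1 = 0
```

Note that the function reports the most specific label; a positive definite matrix also satisfies Equation 2.15.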
2.4 VECTOR SPACE AND GEOMETRICAL INTERPRETATIONS

Because a column vector $\mathbf{a} = [a_1 \;\; a_2 \;\; \cdots \;\; a_m]^T$ can be viewed as a special matrix of order m × 1, as defined in Definition 2.2, addition and multiplication by a scalar also hold for vectors. Hence, we can introduce the definition of vector space as follows:

Definition 2.9 (Vector space): Vector space is the space of all the real vectors $\mathbf{a} = [a_1 \;\; a_2 \;\; \cdots \;\; a_m]^T$ with the operations of addition and multiplication by a scalar.

Definition 2.10 (Linear combination and linear span): The linear combination of vectors $\mathbf{a}_i$ (i = 1, 2, …, n) is defined as a vector $\mathbf{b}$ of the form:

$$\mathbf{b} = \sum_{i=1}^{n} k_i \mathbf{a}_i \tag{2.16}$$

where all $k_i$'s are scalars. The linear span of $\mathbf{a}_i$ (i = 1, 2, …, n) is formed by the set of all linear combinations of $\mathbf{a}_i$ (i = 1, 2, …, n).

Remark 2.8: An m-dimensional vector $\mathbf{a} = [a_1 \;\; a_2 \;\; \cdots \;\; a_m]^T$ can be geometrically viewed as a directed line in m-dimensional space with component $a_i$ along the i-axis, i = 1, 2, …, m. The 2-D case is illustrated in Figure 2.1. Thus, we can define the length and angle between two vectors.

Definition 2.11 (Length of a vector): Let $\mathbf{a} = [a_1 \;\; a_2 \;\; \cdots \;\; a_m]^T$ be an m-dimensional vector. Its length, $l_{\mathbf{a}}$, is defined by:

$$l_{\mathbf{a}} = \sqrt{a_1^2 + a_2^2 + \cdots + a_m^2} \tag{2.17}$$

Remark 2.9: For the scalar multiplication $k\mathbf{a}$, the direction of $\mathbf{a}$ does not change if k > 0, but reverses if k < 0.

FIGURE 2.1 A 2-D vector.
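Remark 2.10: For the scalar multiplication $k\mathbf{a}$, its length is changed in the form:

$$l_{k\mathbf{a}} = \sqrt{(ka_1)^2 + (ka_2)^2 + \cdots + (ka_m)^2} = |k|\sqrt{a_1^2 + a_2^2 + \cdots + a_m^2} = |k|\,l_{\mathbf{a}} \tag{2.18}$$

Thus, the length increases if |k| > 1 and decreases if 0 < |k| < 1.

Before introducing the angle between two vectors, we will first define their inner product.

Definition 2.12 (Inner product): The inner product of two m-dimensional vectors $\mathbf{a}$ and $\mathbf{b}$ is defined by:

$$(\mathbf{a}, \mathbf{b}) = \mathbf{a}^T\mathbf{b} = \mathbf{b}^T\mathbf{a} = \sum_{i=1}^{m} a_i b_i \tag{2.19}$$

Remark 2.11: $l_{\mathbf{a}} = \sqrt{(\mathbf{a}, \mathbf{a})} = \sqrt{\mathbf{a}^T\mathbf{a}}$

Definition 2.13 (Angle): The angle θ between two m-dimensional vectors $\mathbf{a}$ and $\mathbf{b}$ is defined by:

$$\cos(\theta) = \frac{(\mathbf{a}, \mathbf{b})}{\sqrt{(\mathbf{a}, \mathbf{a})} \cdot \sqrt{(\mathbf{b}, \mathbf{b})}} = \frac{\mathbf{a}^T\mathbf{b}}{l_{\mathbf{a}} \cdot l_{\mathbf{b}}} = \frac{\mathbf{b}^T\mathbf{a}}{l_{\mathbf{a}} \cdot l_{\mathbf{b}}} \tag{2.20}$$

From Equation 2.20, cos(θ) = 0 if and only if $\mathbf{a}^T\mathbf{b} = 0$, in which case we call $\mathbf{a}$ and $\mathbf{b}$ perpendicular, denoted by $\mathbf{a} \perp \mathbf{b}$. Given two m-dimensional vectors $\mathbf{a}$ and $\mathbf{b}$, we can project one on the other. For example, when m = 2, Figure 2.2 illustrates the projection of $\mathbf{a}$ on $\mathbf{b}$.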
FIGURE 2.2 Angle between two vectors and projection of a on b.
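Definition 2.14 (Projection of vectors): The projection of $\mathbf{a}$ on $\mathbf{b}$ is defined by

$$\frac{(\mathbf{a}, \mathbf{b})}{(\mathbf{b}, \mathbf{b})}\,\mathbf{b} = \frac{\mathbf{a}^T\mathbf{b}}{l_{\mathbf{b}}} \cdot \frac{\mathbf{b}}{l_{\mathbf{b}}} \tag{2.21}$$

where $\mathbf{b}/l_{\mathbf{b}}$ is the unit vector along the direction of $\mathbf{b}$. The length of the projection is

$$\frac{|\mathbf{a}^T\mathbf{b}|}{l_{\mathbf{b}}} = l_{\mathbf{a}} \cdot \frac{|\mathbf{a}^T\mathbf{b}|}{l_{\mathbf{a}} \cdot l_{\mathbf{b}}} = l_{\mathbf{a}}\,|\cos(\theta)| \tag{2.22}$$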
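As a small worked instance of Equations 2.17 and 2.19 through 2.22, the following sketch (assuming NumPy; the two vectors are arbitrary illustrative values) computes the inner product, lengths, angle cosine, and projection of a on b.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([2.0, 0.0])

inner = a @ b                               # Equation 2.19: 3*2 + 4*0 = 6
l_a = np.sqrt(a @ a)                        # Equation 2.17: length of a, here 5
l_b = np.sqrt(b @ b)                        # length of b, here 2
cos_theta = inner / (l_a * l_b)             # Equation 2.20: 6 / 10 = 0.6
projection = (inner / (b @ b)) * b          # Equation 2.21: (6 / 4) * b = [3, 0]
projection_length = l_a * abs(cos_theta)    # Equation 2.22: 5 * 0.6 = 3
```

Because b lies along the first axis, the projection simply keeps the first component of a, which makes the result easy to verify by inspection.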
2.5 EIGENVALUES AND EIGENVECTORS OF A MATRIX

Let $\mathbf{A}$ be an m × m matrix. It is said to have an eigenvalue λ and corresponding eigenvector $\mathbf{x} \neq \mathbf{0}$ if

$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x} \tag{2.23}$$

Remark 2.12: Equation 2.23 has a clear geometrical interpretation. The left side $\mathbf{A}\mathbf{x}$ indicates a linear transformation of vector $\mathbf{x}$. The right side $\lambda\mathbf{x}$ is a rescaling of $\mathbf{x}$. So Equation 2.23 indicates that when a linear transformation $\mathbf{A}$ is applied to an eigenvector, its length is multiplied by |λ|; its direction remains unchanged if λ > 0 and is reversed if λ < 0.

The set of eigenvalues, denoted by λ($\mathbf{A}$), can be obtained by solving the so-called mth-order characteristic polynomial equation:

$$|\mathbf{A} - \lambda\mathbf{I}| = 0 \tag{2.24}$$

By using Laplace expansion,

$$|\mathbf{A} - \lambda\mathbf{I}| = (-\lambda)^m + t_1(-\lambda)^{m-1} + \cdots + t_{m-1}(-\lambda)^1 + |\mathbf{A}| \tag{2.25}$$

where $t_i$ is the sum of all i × i principal minor determinants. So:

$$t_1 = \sum_{i=1}^{m} a_{ii} = \mathrm{tr}(\mathbf{A}) \tag{2.26}$$

Based on the theory of polynomial equations, it follows that:

$$\prod_{i=1}^{m} \lambda_i = |\mathbf{A}| \tag{2.27}$$
$$\sum_{i=1}^{m} \lambda_i = \mathrm{tr}(\mathbf{A}) \tag{2.28}$$
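Equations 2.27 and 2.28 are easy to confirm numerically. The sketch below assumes NumPy and uses a small symmetric matrix chosen only for illustration; its determinant is 18 and its trace is 9.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigenvalues = np.linalg.eigvals(A)

# Equation 2.27: the product of the eigenvalues equals the determinant.
product_ok = np.isclose(np.prod(eigenvalues), np.linalg.det(A))
# Equation 2.28: the sum of the eigenvalues equals the trace.
sum_ok = np.isclose(np.sum(eigenvalues), np.trace(A))
```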
Some properties of eigenvalues are summarized as follows:

1. The eigenvalues of a real symmetric matrix are all real.
2. The eigenvalues of a positive definite matrix are all positive.
3. For an m × m positive semidefinite symmetric matrix with rank r, i.e., $|\mathbf{A}| \geq 0$ and rank($\mathbf{A}$) = r, $\mathbf{A}$ has exactly r positive eigenvalues and m − r zero eigenvalues.
4. $\lambda(\mathbf{A}\mathbf{B}) = \lambda(\mathbf{B}\mathbf{A})$.
5. $\lambda\big(\mathrm{diag}(a_{11}, a_{22}, \ldots, a_{mm})\big) = \{a_{11}, a_{22}, \ldots, a_{mm}\}$.

After obtaining the set of eigenvalues $\lambda(\mathbf{A}) = \{\lambda_1, \lambda_2, \ldots, \lambda_m\}$, the corresponding eigenvectors can be calculated by solving the equation:

$$(\mathbf{A} - \lambda_i\mathbf{I}) \cdot \mathbf{x}_i = \mathbf{0} \tag{2.29}$$

Remark 2.13: From Equation 2.29, it is apparent that the eigenvector $\mathbf{x}_i$ corresponding to the eigenvalue $\lambda_i$ is not unique.

For a symmetric matrix $\mathbf{A}$, the eigenvalues and eigenvectors have the following properties:

1. If $\lambda_i \neq \lambda_j$ are two distinct eigenvalues of symmetric matrix $\mathbf{A}$, then the associated eigenvectors $\mathbf{x}_i$ and $\mathbf{x}_j$ are perpendicular, i.e., $\mathbf{x}_i \perp \mathbf{x}_j$.
2. For any real symmetric matrix $\mathbf{A}$, there exists an orthogonal matrix $\mathbf{P}$ such that

$$\mathbf{P}^T\mathbf{A}\mathbf{P} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m) \tag{2.30}$$

Remark 2.14: Using the preceding properties, the quadratic form of symmetric matrices can be simplified by orthogonal transformation:

$$\mathbf{x} = \mathbf{P}\mathbf{y} \tag{2.31}$$

It follows immediately that

$$\mathbf{x}^T\mathbf{A}\mathbf{x} = \mathbf{y}^T\mathbf{P}^T\mathbf{A}\mathbf{P}\mathbf{y} = \sum_{i=1}^{m} \lambda_i y_i^2 \tag{2.32}$$

Remark 2.15: If $\mathbf{A}$ is not full rank, say rank($\mathbf{A}$) = r, then only r terms are left in the summation.

Generally, the so-called spectral decomposition of an m × m symmetric matrix $\mathbf{A}$ is given by:
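$$\mathbf{A} = \lambda_1\mathbf{e}_1\mathbf{e}_1^T + \lambda_2\mathbf{e}_2\mathbf{e}_2^T + \cdots + \lambda_m\mathbf{e}_m\mathbf{e}_m^T \tag{2.33}$$

where the $\lambda_i$'s, i = 1, 2, …, m, are eigenvalues of $\mathbf{A}$ and the $\mathbf{e}_i$'s, i = 1, 2, …, m, are the associated normalized eigenvectors. So

$$\mathbf{e}_i^T\mathbf{e}_j = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} \qquad i, j = 1, 2, \ldots, m \tag{2.34}$$

Remark 2.16: Comparing Equation 2.32 and Equation 2.33, we can see that

$$y_i = \mathbf{x}^T\mathbf{e}_i, \quad i = 1, 2, \ldots, m \tag{2.35}$$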
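The diagonalization, spectral decomposition, and quadratic-form simplification can be illustrated together. This is a minimal sketch assuming NumPy; the 2 × 2 matrix (eigenvalues 1 and 3) and the vector x are illustrative values only.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns ascending eigenvalues and orthonormal eigenvectors as the columns of P.
eigenvalues, P = np.linalg.eigh(A)

# Equation 2.30: P^T A P is diagonal with the eigenvalues on the diagonal.
diag_ok = np.allclose(P.T @ A @ P, np.diag(eigenvalues))

# Equation 2.33: A is the sum of lambda_i * e_i e_i^T.
A_rebuilt = sum(lam * np.outer(P[:, i], P[:, i]) for i, lam in enumerate(eigenvalues))
spectral_ok = np.allclose(A_rebuilt, A)

# Equations 2.31, 2.32, and 2.35: with y_i = x^T e_i, x^T A x = sum of lambda_i * y_i^2.
x = np.array([1.0, 3.0])
y = P.T @ x
quadratic_ok = np.isclose(x @ A @ x, np.sum(eigenvalues * y**2))
```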
2.6 VECTOR AND MATRIX DIFFERENTIATION, MAXIMIZATION, AND OPERATORS

2.6.1 DIFFERENTIATION WITH VECTORS

In later chapters, we will need to determine the partial derivatives of likelihoods and other scalar functions of vectors and matrices. Let $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_m)$ be a continuous function whose first and second partial derivatives

$$\frac{\partial f(\mathbf{x})}{\partial x_i} \quad \text{and} \quad \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} \tag{2.36}$$

exist for all points $\mathbf{x}$ in the span of m-dimensional Euclidean space.

Definition 2.15 (Partial differentiation with vectors): The partial differentiation of function $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_m)$ with respect to (w.r.t.) vector $\mathbf{x}$ is defined by:

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial f(\mathbf{x})}{\partial x_1} \\ \dfrac{\partial f(\mathbf{x})}{\partial x_2} \\ \vdots \\ \dfrac{\partial f(\mathbf{x})}{\partial x_m} \end{bmatrix}. \tag{2.37}$$

The partial derivatives of some special forms of $f(\mathbf{x})$ are summarized as follows.
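1. $f(\mathbf{x})$ = constant for all $\mathbf{x}$:

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{0} \tag{2.38}$$

2. $f(\mathbf{x}) = \mathbf{a}^T\mathbf{x} = \mathbf{x}^T\mathbf{a}$:

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{a} \tag{2.39}$$

3. Quadratic form $f(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x}$:

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = (\mathbf{A} + \mathbf{A}^T)\,\mathbf{x} \tag{2.40}$$

4. General quadratic form $f(\mathbf{x}) = (\mathbf{a} - \mathbf{B}\mathbf{x})^T\mathbf{A}(\mathbf{a} - \mathbf{B}\mathbf{x})$, where $\mathbf{a}$ is an m × 1 vector, $\mathbf{B} = [\mathbf{B}_1 \;\; \mathbf{B}_2 \;\; \cdots \;\; \mathbf{B}_n]$ is an m × n matrix, and $\mathbf{A}$ is an m × m symmetric matrix. By using the chain rule:

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = -2\mathbf{B}^T\mathbf{A}(\mathbf{a} - \mathbf{B}\mathbf{x}) \tag{2.41}$$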
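The gradient formulas for the two quadratic forms can be checked against finite differences. The sketch below assumes NumPy; the helper `numerical_gradient`, the random test data, and the choice of B as an m × n matrix (so that Bx conforms with a) are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad

rng = np.random.default_rng(1)
m, n = 4, 3
M = rng.standard_normal((m, m))      # general (nonsymmetric) matrix for Equation 2.40
A = M + M.T                          # symmetric matrix for Equation 2.41
B = rng.standard_normal((m, n))      # assumed m x n so that B x is an m-vector
a = rng.standard_normal(m)
x_m = rng.standard_normal(m)
x_n = rng.standard_normal(n)

# Equation 2.40: the gradient of x^T M x is (M + M^T) x.
grad_quadratic = (M + M.T) @ x_m
quadratic_ok = np.allclose(
    numerical_gradient(lambda v: v @ M @ v, x_m), grad_quadratic, atol=1e-5)

# Equation 2.41: the gradient of (a - Bx)^T A (a - Bx) is -2 B^T A (a - Bx).
grad_general = -2.0 * B.T @ A @ (a - B @ x_n)
general_ok = np.allclose(
    numerical_gradient(lambda v: (a - B @ v) @ A @ (a - B @ v), x_n), grad_general, atol=1e-5)
```

For quadratic functions the central difference is exact up to rounding, so the loose tolerance only absorbs floating-point error.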
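Definition 2.16 (Hessian matrix): A Hessian matrix is defined by the matrix of second-order partial derivatives of a function $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_m)$ w.r.t. vector $\mathbf{x}$, which is

$$\mathbf{H} = \frac{\partial^2 f(\mathbf{x})}{\partial \mathbf{x}^T \partial \mathbf{x}} = \begin{bmatrix} \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_1 \partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f(\mathbf{x})}{\partial x_m \partial x_1} & \cdots & \dfrac{\partial^2 f(\mathbf{x})}{\partial x_m^2} \end{bmatrix} \tag{2.42}$$

As an example, the Hessian matrix of the quadratic form $f(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x}$ is $2\mathbf{A}$, if $\mathbf{A}$ is symmetric.

After introducing definitions of first- and second-order partial derivatives, we can state the necessary condition for a maximum or minimum of $f(\mathbf{x})$ at $\mathbf{x} = \mathbf{x}^*$:

$$\left.\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \mathbf{x}^*} = \mathbf{0} \tag{2.43}$$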
Remark 2.17: This kind of maximum or minimum is called a stationary maximum or minimum, because a global extreme may instead be located on the boundary of the feasible region of $\mathbf{x}$, or at a cusp or another point at which the first-order derivative does not exist.

A sufficient condition for a maximum (or minimum) at the point $\mathbf{x}^*$ satisfying Equation 2.43 is that the Hessian matrix at $\mathbf{x}^*$

$$\mathbf{H} = \left.\frac{\partial^2 f(\mathbf{x})}{\partial \mathbf{x}^T \partial \mathbf{x}}\right|_{\mathbf{x} = \mathbf{x}^*} \tag{2.44}$$

be negative (or positive) definite.

Remark 2.18: If the Hessian matrix is indefinite at $\mathbf{x}^*$, then the point $\mathbf{x}^*$ is not an extreme point.
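The necessary and sufficient conditions can be traced on a simple case. The sketch below assumes NumPy and uses a hypothetical function f(x) = xᵀAx − 2bᵀx with a symmetric A (both A and b are illustrative values); its gradient is 2Ax − 2b and its Hessian is the constant matrix 2A.

```python
import numpy as np

# Hypothetical example: f(x) = x^T A x - 2 b^T x with A symmetric.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

# Equation 2.43: the gradient 2 A x - 2 b vanishes at x* = A^{-1} b.
x_star = np.linalg.solve(A, b)
gradient_at_x_star = 2.0 * A @ x_star - 2.0 * b

# Equation 2.44: the Hessian of this f is 2 A at every point, including x*.
H = 2.0 * A
hessian_eigenvalues = np.linalg.eigvalsh(H)
is_minimum = bool(np.all(hessian_eigenvalues > 0))   # positive definite Hessian
is_maximum = bool(np.all(hessian_eigenvalues < 0))   # negative definite Hessian
```

Here the Hessian is positive definite, so the stationary point is a minimum; replacing A with −A would turn the same point into a maximum.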
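2.6.2 MATRIX INEQUALITIES AND MAXIMIZATION

The matrix inequalities help obtain the maximization results. First, some important matrix inequalities are introduced.

Cauchy–Schwarz Inequality: Let $\mathbf{a}$ and $\mathbf{b}$ be any two m × 1 vectors. Then

$$(\mathbf{a}, \mathbf{b})^2 \leq (\mathbf{a}, \mathbf{a}) \cdot (\mathbf{b}, \mathbf{b}) \quad \text{or} \quad (\mathbf{a}^T\mathbf{b})^2 \leq (\mathbf{a}^T\mathbf{a}) \cdot (\mathbf{b}^T\mathbf{b}) \tag{2.45}$$

The equality holds if and only if $\mathbf{a} = k\mathbf{b}$ or $\mathbf{b} = k\mathbf{a}$, where k is a constant. An important extension of the Cauchy–Schwarz Inequality is given as follows:

Extended Cauchy–Schwarz Inequality: Let $\mathbf{a}$ and $\mathbf{b}$ be any two m × 1 vectors, and $\mathbf{D}$ be an m × m positive definite matrix. Then

$$(\mathbf{a}, \mathbf{b})^2 \leq (\mathbf{a}, \mathbf{D}\mathbf{a}) \cdot (\mathbf{b}, \mathbf{D}^{-1}\mathbf{b}) \quad \text{or} \quad (\mathbf{a}^T\mathbf{b})^2 \leq (\mathbf{a}^T\mathbf{D}\mathbf{a}) \cdot (\mathbf{b}^T\mathbf{D}^{-1}\mathbf{b}) \tag{2.46}$$

The equality holds if and only if $\mathbf{a} = k\mathbf{D}^{-1}\mathbf{b}$ or $\mathbf{b} = k\mathbf{D}\mathbf{a}$, where k is a constant. The proofs of Equation 2.45 and Equation 2.46 can be found in Reference 1. Based on the extended Cauchy–Schwarz Inequality, we have the following maximization theorem:

Theorem 2.1 (Maximization): Suppose $\mathbf{A}$ is a positive definite matrix of order m × m and $\mathbf{a}$ is a given m × 1 vector. Then for $\forall \mathbf{x} \neq \mathbf{0}$ of dimension m × 1,

$$\max_{\mathbf{x} \neq \mathbf{0}} \frac{(\mathbf{x}, \mathbf{a})^2}{(\mathbf{x}, \mathbf{A}\mathbf{x})} = \mathbf{a}^T\mathbf{A}^{-1}\mathbf{a} \tag{2.47}$$

The maximum is obtained when $\mathbf{x} = k\mathbf{A}^{-1}\mathbf{a}$ for any constant k ≠ 0.

The following theorem gives the maximization result of quadratic forms for points on the unit sphere: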
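Theorem 2.2 (Maximization of quadratic forms for points on the unit sphere): Suppose $\mathbf{A}$ is a positive definite matrix of dimension m × m with eigenvalues

$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m > 0 \tag{2.48}$$

and corresponding eigenvectors $\mathbf{e}_i$, i = 1, 2, ..., m. Then

1. $$\max_{\mathbf{x} \neq \mathbf{0}} \frac{(\mathbf{x}, \mathbf{A}\mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \lambda_1 \quad \text{obtained when } \mathbf{x} = \mathbf{e}_1 \tag{2.49}$$

2. $$\min_{\mathbf{x} \neq \mathbf{0}} \frac{(\mathbf{x}, \mathbf{A}\mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \lambda_m \quad \text{obtained when } \mathbf{x} = \mathbf{e}_m \tag{2.50}$$

3. $$\max_{\mathbf{x} \perp \mathbf{e}_1, \ldots, \mathbf{e}_i} \frac{(\mathbf{x}, \mathbf{A}\mathbf{x})}{(\mathbf{x}, \mathbf{x})} = \lambda_{i+1} \quad \text{obtained when } \mathbf{x} = \mathbf{e}_{i+1}, \; i = 1, 2, \ldots, m - 1 \tag{2.51}$$

The proofs can be found in Reference 1.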
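Both maximization theorems lend themselves to a quick numerical sanity check. The sketch below assumes NumPy; the positive definite matrix A, the vector a, and the random trial points are illustrative choices. It confirms that the stated maximizers attain the bounds and that random points never exceed them (this is a spot check, not a proof).

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # symmetric positive definite
a = np.array([1.0, -2.0, 0.5])

def ratio_2_47(x):
    """Objective of Theorem 2.1: (x, a)^2 / (x, Ax)."""
    return (x @ a) ** 2 / (x @ A @ x)

def rayleigh(x):
    """Objective of Theorem 2.2: (x, Ax) / (x, x)."""
    return (x @ A @ x) / (x @ x)

# Theorem 2.1: the maximum a^T A^{-1} a is attained at x = k A^{-1} a (here k = 1).
x_best = np.linalg.solve(A, a)
bound = a @ x_best
attained = np.isclose(ratio_2_47(x_best), bound)

# Theorem 2.2: eigh sorts eigenvalues in ascending order, so e_1 is the last column.
eigenvalues, E = np.linalg.eigh(A)
lambda_max, lambda_min = eigenvalues[-1], eigenvalues[0]
max_attained = np.isclose(rayleigh(E[:, -1]), lambda_max)
min_attained = np.isclose(rayleigh(E[:, 0]), lambda_min)

# Random trial points never exceed the stated bounds (up to rounding).
trials = rng.standard_normal((1000, 3))
never_exceeds = all(
    ratio_2_47(x) <= bound + 1e-9
    and lambda_min - 1e-9 <= rayleigh(x) <= lambda_max + 1e-9
    for x in trials
)
```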
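2.6.3 VEC OPERATOR, KRONECKER PRODUCT, AND HADAMARD PRODUCT

2.6.3.1 Vec Operator and Kronecker Product

The concepts and results of the Kronecker product, which is also often called the direct product, are mentioned in many papers on variation modeling and analysis. Before introducing the definitions and some of the important properties of this product, we will first cover the concept of a vector operator, vec(⋅).

Definition 2.17 (Vector operator or Vec operation): Let $\mathbf{X} = [\mathbf{X}_1 \;\; \mathbf{X}_2 \;\; \cdots \;\; \mathbf{X}_q]$ be a p × q matrix with columns $\mathbf{X}_i \in R^{p \times 1}$, i = 1, 2, …, q. Then the vec operator, vec($\mathbf{X}$), is defined by stacking all the q columns of $\mathbf{X}$ in one column, i.e.,

$$\mathrm{vec}(\mathbf{X}) = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \\ \vdots \\ \mathbf{X}_q \end{bmatrix} \in R^{pq \times 1} \tag{2.52}$$

Definition 2.18 (Kronecker product, or direct product): Given matrices $\mathbf{A} = (a_{ij}) \in R^{p \times q}$ and $\mathbf{B} = (b_{ij}) \in R^{r \times t}$, the Kronecker product of $\mathbf{A}$ and $\mathbf{B}$, denoted by $\mathbf{A} \otimes \mathbf{B}$, is the pr × qt matrix,

$$\mathbf{A} \otimes \mathbf{B} = \begin{bmatrix} a_{11}\mathbf{B} & a_{12}\mathbf{B} & \cdots & a_{1q}\mathbf{B} \\ a_{21}\mathbf{B} & a_{22}\mathbf{B} & \cdots & a_{2q}\mathbf{B} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1}\mathbf{B} & a_{p2}\mathbf{B} & \cdots & a_{pq}\mathbf{B} \end{bmatrix}_{pr \times qt} \tag{2.53}$$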
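Remark 2.19: Based on this definition, we can derive a special result about block-diagonal matrices as follows:
$$\begin{bmatrix} B & 0 & \cdots & 0 \\ 0 & B & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B \end{bmatrix}_{pr \times pt} = I_p \otimes B \tag{2.54}$$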
where $B \in R^{r \times t}$ occurs $p$ times on the diagonal. From the definition, it is easy to prove some important properties of the Kronecker product, which are summarized as follows:
Property 1: $(aA) \otimes (bB) = ab\,(A \otimes B)$ holds for any scalars $a$ and $b$.
Property 2: If $A \in R^{p \times q}$, $B \in R^{p \times q}$, and $C \in R^{s \times t}$, then $(A + B) \otimes C = A \otimes C + B \otimes C$.
Property 3: $(A \otimes B) \otimes C = A \otimes (B \otimes C)$.
Property 4: If $A \in R^{m \times n}$, $B \in R^{p \times q}$, $C \in R^{n \times s}$, and $D \in R^{q \times t}$, then $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.
Property 5: $(A \otimes B)^T = A^T \otimes B^T$.
Property 6: If $A$ and $B$ are both nonsingular matrices, then $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.
Property 7: If $A \in R^{p \times p}$ and $B \in R^{p \times p}$, then $\operatorname{tr}(A \otimes B) = \operatorname{tr}(A) \cdot \operatorname{tr}(B)$.
Property 8: If $A$ and $B$ are both orthogonal matrices, then $A \otimes B$ is also an orthogonal matrix.
Property 9: If $A \in R^{p \times p}$ and $B \in R^{q \times q}$, then $\det(A \otimes B) = [\det(A)]^q \cdot [\det(B)]^p$.
Property 10: If $A > 0$ and $B > 0$, then $A \otimes B > 0$.
Property 11: If $A \in R^{p \times p}$ and $B \in R^{q \times q}$ have eigenvalues $\alpha_i$, $i = 1, 2, \ldots, p$, and $\beta_j$, $j = 1, 2, \ldots, q$, respectively, then $A \otimes B$ has eigenvalues $\alpha_i \beta_j$, $i = 1, 2, \ldots, p$, $j = 1, 2, \ldots, q$.
The relationships between the vec operator and the Kronecker product are given in the following lemmas.
Lemma 2.1: Let $A \in R^{p \times m}$, $X \in R^{m \times n}$, and $B \in R^{n \times q}$. Then
$$\operatorname{vec}(AXB) = (B^T \otimes A) \cdot \operatorname{vec}(X) \tag{2.55}$$
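The definitions of the vec operator and the Kronecker product, together with Lemma 2.1, can be checked numerically. The sketch below is a minimal plain-Python illustration; the helper names (`matmul`, `transpose`, `kron`, `vec`) and the integer test matrices are assumptions introduced here and do not come from the text.

```python
def matmul(A, B):
    """Ordinary matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Kronecker product: block (i, j) of the result is a_ij * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def vec(X):
    """Stack the columns of X into one column (returned as a flat list)."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

# Hypothetical integer matrices with p = m = 2, n = 3, q = 2
A = [[1, 2], [3, 4]]             # p x m
X = [[1, 0, 2], [-1, 3, 1]]      # m x n
B = [[2, 1], [0, 1], [1, -1]]    # n x q

lhs = vec(matmul(matmul(A, X), B))                 # vec(AXB)
vec_X_column = [[v] for v in vec(X)]               # vec(X) as a column matrix
rhs = [row[0] for row in matmul(kron(transpose(B), A), vec_X_column)]
```

With integer entries the two sides agree exactly, so `lhs == rhs` holds without any rounding tolerance.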
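The following remarks provide an application of this lemma.
Remark 2.20: If $X = [X_1 \; X_2 \; \cdots \; X_n] \in R^{m \times n}$ is a random matrix whose columns are independent $m \times 1$ random vectors, each with the same covariance matrix $\Sigma$, i.e., $\operatorname{cov}(X_i) = \Sigma$ for $i = 1, 2, \ldots, n$, then
$$\operatorname{vec}(X) = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} \in R^{mn \times 1} \tag{2.56}$$
$$\operatorname{cov}[\operatorname{vec}(X)] = \begin{bmatrix} \Sigma & 0 & \cdots & 0 \\ 0 & \Sigma & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma \end{bmatrix}_{mn \times mn} = I_n \otimes \Sigma \in R^{mn \times mn} \tag{2.57}$$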
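Equation 2.57 holds because the $X_i$'s are independent and share the same covariance matrix.
Remark 2.21: Let $A \in R^{p \times m}$ and $B \in R^{n \times q}$ be constant matrices, and let $X \in R^{m \times n}$ be a random matrix whose columns are independent, each with covariance matrix $\Sigma$. Then, applying this lemma and the properties of the Kronecker product, the transformed random matrix $Y = AXB$ satisfies the following relationships:
1. $E[\operatorname{vec}(Y)] = (B^T \otimes A) \cdot E[\operatorname{vec}(X)]$
2. $\operatorname{cov}[\operatorname{vec}(Y)] = \operatorname{cov}[(B^T \otimes A) \cdot \operatorname{vec}(X)] = (B^T \otimes A) \cdot \operatorname{cov}[\operatorname{vec}(X)] \cdot (B^T \otimes A)^T = (B^T \otimes A) \cdot (I_n \otimes \Sigma) \cdot (B \otimes A^T) = (B^T B) \otimes (A \Sigma A^T)$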
Lemma 2.2: For appropriate sizes of matrices, the following statements hold:
1. $\operatorname{vec}(AB) = (I \otimes A) \cdot \operatorname{vec}(B) = (B^T \otimes I) \cdot \operatorname{vec}(A) = (B^T \otimes A) \cdot \operatorname{vec}(I)$
2. $\operatorname{tr}(ABC) = [\operatorname{vec}(A^T)]^T (I \otimes B) \operatorname{vec}(C)$
3. $\operatorname{tr}(AX^T BXC) = [\operatorname{vec}(X)]^T (A^T C^T \otimes B) \operatorname{vec}(X) = [\operatorname{vec}(X)]^T (CA \otimes B^T) \operatorname{vec}(X)$
2.6.3.2 Hadamard Product
Definition 2.19 (Hadamard product): Suppose $A$ and $B$ are two $m \times m$ matrices. Their Hadamard product is the entrywise product of $A$ and $B$, i.e., the $m \times m$ matrix $A \circ B$ whose $(i, j)$th entry is $a_{ij} b_{ij}$.
For the following properties, suppose $A$, $B$, and $C$ are $m \times m$ matrices and $\lambda$ is a scalar.
Property 1: $A \circ B = B \circ A$.
Property 2: $A \circ (B + C) = A \circ B + A \circ C$.
Property 3: $A \circ (\lambda B) = \lambda (A \circ B)$.
Property 4: If $A$ and $B$ are diagonal matrices, then $A \circ B = AB$.
Property 5 (Oppenheim inequality): If $A$ and $B$ are positive definite matrices and the $a_{ii}$'s are the diagonal entries of $A$, then
$$|A \circ B| \geq \left( \prod_{i=1}^{m} a_{ii} \right) |B|$$
with equality if and only if $B$ is a diagonal matrix.
2.7 EXERCISES
1. Suppose $a^T = [3, 1, 4]$ and $b^T = [2, -3, 1]$.
a. Determine the lengths of $a$ and $b$.
b. What is the angle between $a$ and $b$?
c. What is the projection of $a$ on $b$?
d. Find the length of the projection in (c).
2. Let
$$A = \begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & -\frac{3}{5} \end{bmatrix}$$
a. Determine whether $AA^T = A^T A = I$, i.e., whether $A$ is an orthogonal matrix.
b. If $A$ is an orthogonal matrix, verify that $A^{-1} = A^T$.
3. Let
$$A = \begin{bmatrix} 3 & -2 \\ -2 & 6 \end{bmatrix}$$
a. Check whether $A$ is symmetric.
b. Verify that $A$ is positive definite.
4. Suppose
$$A = \begin{bmatrix} 2 & 2 \\ 2 & 5 \end{bmatrix}.$$
Determine the following:
a. The eigenvalues and eigenvectors of $A$.
b. $A^{-1}$.
c. The eigenvalues and eigenvectors of $A^{-1}$.
5. Let
$$A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$$
a. What is the determinant of $A$?
b. Determine the trace of $A$.
c. Determine whether the product of the eigenvalues of $A$ is the same as the determinant of $A$.
d. Determine whether the sum of the eigenvalues of $A$ is the same as the trace of $A$.
6. Consider the symmetric matrix
$$A = \begin{bmatrix} 5 & 2 & 3 \\ 2 & 9 & -2 \\ 3 & -2 & 6 \end{bmatrix}$$
Verify whether $A = \lambda_1 e_1 e_1^T + \lambda_2 e_2 e_2^T + \lambda_3 e_3 e_3^T$. This is the so-called spectral decomposition of a symmetric matrix $A$.
7. Suppose $f(x) = 2x_1^2 + 3x_1 x_2^3 + 7x_3^2$. Find
a. $\dfrac{\partial f(x)}{\partial x}$
b. The Hessian matrix, or $\dfrac{\partial^2 f(x)}{\partial x^T \partial x}$.
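8. Suppose $f(x)$ is in a quadratic form ($f(x) = x^T A x$), i.e.,
$$f(x) = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 2 & 3 & 1 \\ 4 & -1 & 5 \\ -1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
Determine whether
$$\frac{\partial f(x)}{\partial x} = (A + A^T)\,x.$$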
References
1. Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 4th ed., Prentice Hall, Upper Saddle River, NJ, 1998.
2. Anderson, T.W., An Introduction to Multivariate Statistical Analysis, John Wiley & Sons, New York, 1958.
3. Graybill, F.A., Introduction to Matrices with Applications in Statistics, Wadsworth Publishing Company, Inc., Belmont, CA, 1969.
4. Harville, D.A., Matrix Algebra from a Statistician’s Perspective, Springer-Verlag, New York, 1997.
5. Noble, B. and Daniel, J.W., Applied Linear Algebra, 3rd ed., Prentice Hall, Englewood Cliffs, NJ, 1987.
3
Basics of Multivariate Statistical Analysis
3.1 INTRODUCTION
In variation modeling and analysis, it is common to consider multiple random variables. Thus, multivariate statistical analysis is essential for understanding and mastering variation reduction methodologies. This chapter provides some basics of multivariate statistical analysis, including the underlying theoretical population settings for random phenomena involving multivariate distributions, marginal distributions, conditional distributions, population means, variances, covariances, correlations, and other characteristics. The most important multivariate distribution, the multivariate normal distribution, is discussed in detail and some important results are summarized. An introduction to sampling theory is also included in this chapter. Proofs are not provided here; readers who want to study them may refer to related mathematical statistics materials [1–10].
3.2 MULTIVARIATE DISTRIBUTION AND PROPERTIES
3.2.1 RANDOM VECTORS, CUMULATIVE DISTRIBUTION FUNCTIONS (CDF), AND PROBABILITY DENSITY FUNCTIONS (PDF)
A random vector is a vector comprising elements that are random variables. Uppercase letters denote random variables and random vectors, and corresponding lowercase letters denote realizations of the random variables or random vectors.
Definition 3.1 (Cumulative distribution function): Let $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T$ be a random vector with $X_1, X_2, \ldots, X_p$ as the $p$ random variables. The associated joint CDF is the function $F$ defined by:
$$F(\mathbf{x}) = \Pr(\mathbf{X} \leq \mathbf{x}) = \Pr(X_1 \leq x_1, X_2 \leq x_2, \ldots, X_p \leq x_p), \tag{3.1}$$
where $\mathbf{x} = [x_1, x_2, \ldots, x_p]^T$ is a realization of random vector $\mathbf{X}$. Two important cases are absolutely continuous and discrete distributions.
Definition 3.2 (Absolutely continuous distributions): A random vector $\mathbf{X}$ is absolutely continuous if there exists a probability density function (PDF), $f(\mathbf{x})$, such that
$$F(\mathbf{x}) = \int_{-\infty}^{\mathbf{x}} f(\mathbf{u})\, d\mathbf{u} = \int_{-\infty}^{x_p} \cdots \int_{-\infty}^{x_1} f(u_1, u_2, \ldots, u_p)\, du_1 \cdots du_p$$
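Here, $f(\mathbf{x})$ is called a joint PDF. Note that for any measurable set $D \subseteq R^p$,
$$\Pr(\mathbf{X} \in D) = \int_D f(\mathbf{u})\, d\mathbf{u} \tag{3.2}$$
and
$$\int_{-\infty}^{+\infty} f(\mathbf{u})\, d\mathbf{u} = 1 \tag{3.3}$$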
Definition 3.3 (Discrete distributions): For a discrete random vector $\mathbf{X}$, the total probability is concentrated on a countable (or finite) set of points $\{\mathbf{x}_i;\ i = 1, 2, \ldots\}$. Its probability mass function (PMF) is defined as:
$$f(\mathbf{x}_i) = \Pr(\mathbf{X} = \mathbf{x}_i) \text{ for } i = 1, 2, \ldots; \text{ and } f(\mathbf{x}) = 0 \text{ otherwise.} \tag{3.4}$$
Similarly,
$$\Pr(\mathbf{X} \in D) = \sum_{i:\, \mathbf{x}_i \in D} f(\mathbf{x}_i)$$
Then, for both cases, the support $S$ of $\mathbf{X}$ is defined as the set:
$$S = \{\mathbf{x} \in R^p : f(\mathbf{x}) > 0\}$$
3.2.2 MARGINAL AND CONDITIONAL DISTRIBUTIONS
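Consider the partitioned vector
$$\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix},$$
where $\mathbf{X}_1 \in R^{k \times 1}$ and $\mathbf{X}_2 \in R^{(p-k) \times 1}$, respectively.
Definition 3.4 (Marginal CDF and PDF): Under the previous partition, the function:
$$\Pr(\mathbf{X}_1 \leq \mathbf{x}_1^0) = \Pr(X_1 \leq x_1^0, X_2 \leq x_2^0, \ldots, X_k \leq x_k^0, X_{k+1} \leq \infty, \ldots, X_p \leq \infty) \tag{3.5}$$
is called the marginal CDF of $\mathbf{X}_1$. Let $f(\mathbf{x})$ be the joint PDF of $\mathbf{X}$. Then the marginal PDF of $\mathbf{X}_1$ is given by:
$$f_1(\mathbf{x}_1) = \int_{-\infty}^{+\infty} f(\mathbf{x}_1, \mathbf{x}_2)\, d\mathbf{x}_2 \tag{3.6}$$
The marginal PDF of $\mathbf{X}_2$ can be defined in a similar manner.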
Definition 3.5 (Conditional PDF): Under the previous partition, the conditional PDF of $\mathbf{X}_2$, given $\mathbf{X}_1 = \mathbf{x}_1^0$, is defined as:
$$f(\mathbf{x}_2 \mid \mathbf{X}_1 = \mathbf{x}_1^0) = \frac{f(\mathbf{x}_1^0, \mathbf{x}_2)}{f_1(\mathbf{x}_1^0)} \tag{3.7}$$
(assuming $f_1(\mathbf{x}_1^0) \neq 0$). Similarly, we can define the conditional PDF of $\mathbf{X}_1$ given $\mathbf{X}_2 = \mathbf{x}_2^0$. Based on the conditional distribution, we can introduce the important concept of statistical independence as follows:
Definition 3.6 (Statistical independence): When the conditional PDF $f(\mathbf{x}_2 \mid \mathbf{X}_1 = \mathbf{x}_1^0)$ is the same for all possible values of $\mathbf{x}_1^0$, then $\mathbf{X}_1$ and $\mathbf{X}_2$ are said to be statistically independent of each other, denoted as $\mathbf{X}_1 \perp\!\!\!\perp \mathbf{X}_2$.
Theorem 3.1: If $\mathbf{X}_1 \perp\!\!\!\perp \mathbf{X}_2$, then
$$f(\mathbf{x}) = f_1(\mathbf{x}_1)\, f_2(\mathbf{x}_2) \tag{3.8}$$
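3.2.3 POPULATION MOMENTS
Definition 3.7 (Population mean vector): Suppose $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T$ is a random vector with PDF $f(\mathbf{x})$. Then the expectation or mean of random vector $\mathbf{X}$ is defined as:
$$\boldsymbol{\mu}_X = E(\mathbf{X}) = \begin{bmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_p) \end{bmatrix} \tag{3.9}$$
where $\boldsymbol{\mu}_X = [\mu_1, \mu_2, \ldots, \mu_p]^T$ and $\mu_i = \int_{-\infty}^{+\infty} x_i f_i(x_i)\, dx_i$, $i = 1, 2, \ldots, p$, with $f_i$ the marginal PDF of $X_i$. $\boldsymbol{\mu}_X$ is also called the population mean vector. It is easy to show that the expectation operator has the following properties:
1. Linearity: $E(A\mathbf{X} + \mathbf{b}) = A\,E(\mathbf{X}) + \mathbf{b}$ for any $A \in R^{q \times p}$ and $\mathbf{b} \in R^q$.
2. Partition: Let $\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$; then $E(\mathbf{X}_1) = \int_{-\infty}^{+\infty} \mathbf{x}_1 f_1(\mathbf{x}_1)\, d\mathbf{x}_1$.
3. If $\mathbf{X}_1 \perp\!\!\!\perp \mathbf{X}_2$, then $E(\mathbf{X}_1 \mathbf{X}_2^T) = E(\mathbf{X}_1)\, E(\mathbf{X}_2^T)$.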
Definition 3.8 (Population covariance matrix): The covariance matrix (or variance-covariance matrix, or dispersion matrix) of random vector $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T$ is defined as:
$$\Sigma_X = \operatorname{var}(\mathbf{X}) = E\left( (\mathbf{X} - \boldsymbol{\mu}_X)(\mathbf{X} - \boldsymbol{\mu}_X)^T \right) \tag{3.10}$$
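For simplicity, if random vector $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T$ has mean vector $\boldsymbol{\mu}_X$ and covariance matrix $\Sigma_X$, we denote it as $\mathbf{X} \sim (\boldsymbol{\mu}_X, \Sigma_X)$ or $(\boldsymbol{\mu}, \Sigma)$. It can be proved that the covariance matrix has the following properties:
1. $\Sigma_X$ is positive semidefinite, i.e., for all $\mathbf{a} \in R^p$, $\mathbf{a}^T \Sigma_X \mathbf{a} \geq 0$.
2. $\Sigma_X$ is symmetric, i.e., $\Sigma_X^T = \Sigma_X$, with diagonal elements $\sigma_{ii} = \operatorname{var}(X_i)$ and the $(i, j)$th element $\sigma_{ij} = \operatorname{cov}(X_i, X_j)$ for $i \neq j$. Thus,
$$\Sigma_X = (\sigma_{ij})_{p \times p} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} = \begin{bmatrix} \operatorname{var}(X_1) & \operatorname{cov}(X_1, X_2) & \cdots & \operatorname{cov}(X_1, X_p) \\ \operatorname{cov}(X_1, X_2) & \operatorname{var}(X_2) & \cdots & \operatorname{cov}(X_2, X_p) \\ \vdots & \vdots & & \vdots \\ \operatorname{cov}(X_1, X_p) & \operatorname{cov}(X_2, X_p) & \cdots & \operatorname{var}(X_p) \end{bmatrix}.$$
3. $\Sigma_X = E(\mathbf{X}\mathbf{X}^T) - \boldsymbol{\mu}_X \boldsymbol{\mu}_X^T$.
4. For all constant vectors $\mathbf{a} \in R^p$, $\operatorname{var}(\mathbf{a}^T \mathbf{X}) = \mathbf{a}^T \Sigma_X \mathbf{a}$.
5. $\operatorname{var}(A\mathbf{X} + \mathbf{b}) = A \Sigma_X A^T$, where $A$ is a matrix of an appropriate size.
6. $\operatorname{cov}(\mathbf{X}, \mathbf{X}) = \Sigma_X$.
7. $\operatorname{cov}(\mathbf{X}_1 + \mathbf{X}_2, \mathbf{Y}) = \operatorname{cov}(\mathbf{X}_1, \mathbf{Y}) + \operatorname{cov}(\mathbf{X}_2, \mathbf{Y})$.
8. Let $\mathbf{X}, \mathbf{Y} \in R^p$; then $\operatorname{cov}(\mathbf{X}, \mathbf{Y}) = \operatorname{cov}(\mathbf{Y}, \mathbf{X})^T$ and $\operatorname{var}(\mathbf{X} + \mathbf{Y}) = \Sigma_X + \operatorname{cov}(\mathbf{X}, \mathbf{Y}) + \operatorname{cov}(\mathbf{Y}, \mathbf{X}) + \Sigma_Y$.
9. $\operatorname{cov}(A\mathbf{X}, B\mathbf{Y}) = A \operatorname{cov}(\mathbf{X}, \mathbf{Y}) B^T$, where $A$ and $B$ are matrices of appropriate sizes.
10. If $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$, then $\operatorname{cov}(\mathbf{X}, \mathbf{Y}) = 0$, but the converse may not be true.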
Based on the covariance matrix, we can define the statistical distance between two random vectors. Clearly, it is a special case of Mahalanobis distance.
Definition 3.9 (Statistical distance or Mahalanobis distance): If $\mathbf{X}, \mathbf{Y} \in R^p$, then the statistical distance between $\mathbf{X}$ and $\mathbf{Y}$ with matrix $\Sigma$ is defined as:
$$d_\Sigma(\mathbf{X}, \mathbf{Y}) = \sqrt{(\mathbf{X} - \mathbf{Y})^T \Sigma^{-1} (\mathbf{X} - \mathbf{Y})} \tag{3.11}$$
For convenience, the matrix $\Sigma$ is usually chosen to be the covariance matrix. For example:
1. Let $\mathbf{X} \sim (\boldsymbol{\mu}, \Sigma)$. The Mahalanobis distance between $\mathbf{X}$ and $\boldsymbol{\mu}$, $d_\Sigma(\mathbf{X}, \boldsymbol{\mu})$, is a random variable.
2. Let $\mathbf{X} \sim (\boldsymbol{\mu}_X, \Sigma)$ and $\mathbf{Y} \sim (\boldsymbol{\mu}_Y, \Sigma)$. Then $d_\Sigma(\boldsymbol{\mu}_X, \boldsymbol{\mu}_Y)$ is a Mahalanobis distance between the parameters. Also, for any nonsingular matrix $A \in R^{p \times p}$, $d_\Sigma(\boldsymbol{\mu}_X, \boldsymbol{\mu}_Y)$ is invariant under the following transformations:
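$$\mathbf{X} \to \tilde{\mathbf{X}} = A\mathbf{X} + \mathbf{b}, \qquad \mathbf{Y} \to \tilde{\mathbf{Y}} = A\mathbf{Y} + \mathbf{b}, \qquad \Sigma \to \tilde{\Sigma} = A \Sigma A^T,$$
i.e., $d_{\tilde{\Sigma}}(\boldsymbol{\mu}_{\tilde{X}}, \boldsymbol{\mu}_{\tilde{Y}}) = d_\Sigma(\boldsymbol{\mu}_X, \boldsymbol{\mu}_Y)$.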
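The invariance of the Mahalanobis distance under a nonsingular affine transformation can be illustrated with a small two-dimensional example. The sketch below assumes the distance is taken as the square root of the quadratic form $(\mathbf{u} - \mathbf{v})^T \Sigma^{-1} (\mathbf{u} - \mathbf{v})$ (the invariance holds equally for the squared form); all numerical values and helper names are hypothetical.

```python
import math

def matmul2(P, Q):
    """Product of two 2 x 2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec2(P, v):
    return [P[0][0] * v[0] + P[0][1] * v[1], P[1][0] * v[0] + P[1][1] * v[1]]

def inv2(P):
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def mahalanobis(u, v, sigma):
    """sqrt((u - v)^T sigma^{-1} (u - v)) for 2-vectors u and v."""
    d = [u[0] - v[0], u[1] - v[1]]
    w = matvec2(inv2(sigma), d)
    return math.sqrt(d[0] * w[0] + d[1] * w[1])

# Hypothetical parameters
mu_x, mu_y = [1.0, 2.0], [3.0, 0.0]
sigma = [[2.0, 0.5], [0.5, 1.0]]
A = [[2.0, 1.0], [0.0, 1.0]]      # nonsingular
b = [1.0, -1.0]

# Transformed parameters: mean -> A mean + b, sigma -> A sigma A^T
mu_x_t = [m + c for m, c in zip(matvec2(A, mu_x), b)]
mu_y_t = [m + c for m, c in zip(matvec2(A, mu_y), b)]
A_T = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
sigma_t = matmul2(matmul2(A, sigma), A_T)

d_before = mahalanobis(mu_x, mu_y, sigma)
d_after = mahalanobis(mu_x_t, mu_y_t, sigma_t)
```

Up to floating-point rounding, `d_before` and `d_after` coincide, whereas the ordinary Euclidean distance between the two mean vectors changes under the same transformation.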
3.2.4 CORRELATION COEFFICIENTS
There are three types of correlation coefficients: simple correlation coefficients, partial correlation coefficients, and multiple correlation coefficients. It should be noted that partial correlation coefficients and multiple correlation coefficients are special cases of simple correlation coefficients. As before, suppose $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T$ is a random vector with $\mathbf{X} \sim (\boldsymbol{\mu}_X, \Sigma_X)$.
Definition 3.10 (Simple correlation coefficients): The correlation coefficient between any two random variables $X_i$ and $X_j$ is defined as:
$$\rho_{ij} = \frac{\operatorname{cov}(X_i, X_j)}{\sqrt{\operatorname{var}(X_i) \cdot \operatorname{var}(X_j)}} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii} \cdot \sigma_{jj}}} \tag{3.12}$$
Clearly, $\rho_{ij}$ is the normalized covariance of $X_i$ and $X_j$, for $i, j = 1, 2, \ldots, p$.
Properties of $\rho_{ij}$:
1. $\rho_{ij}$ is dimensionless.
2. $-1 \leq \rho_{ij} \leq +1$.
3. If $\rho_{ij} = -1$ or $+1$, there is a line in the $X_i$ and $X_j$ sample space of the form $X_j = a + bX_i$ such that all the nonzero probability values for the bivariate distribution of $X_i$ and $X_j$ lie on the line. Furthermore, if $\rho_{ij} > 0$, the slope is greater than zero; if $\rho_{ij} < 0$, the slope is less than zero.
4. If $-1 < \rho_{ij} < +1$, the value of $\rho_{ij}$ is a relatively subjective measure of the tendency of $X_i$'s and $X_j$'s probabilities to concentrate about the same line.
It can be seen that the simple correlation coefficient between any two random variables $X_i$ and $X_j$ describes their joint behavior, whereas the partial correlation coefficient reflects the behavior of two random variables when other random variables are held fixed. For example, considering random variables $Y, X_1, X_2, \ldots, X_l$, the partial correlation between $Y$ and $X_1$ reflects the behavior between them when the other random variables $X_2, \ldots, X_l$ are held fixed, which is commonly denoted by $\rho_{X_1 Y \bullet X_2 X_3 \cdots X_l}$ or $\rho_{X_1 Y | X_2 X_3 \cdots X_l}$. Without loss of generality, let $X_1, X_2, \ldots, X_k$ and $X_{k+1}, X_{k+2}, \ldots, X_{k+r}$ be any two subsets of a collection of random variables $X_1, X_2, \ldots, X_k, X_{k+1}, \ldots, X_p$ $(2 \leq k < p)$.
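Definition 3.11 (Partial correlation coefficients or conditional correlation coefficients): Suppose $X_i$ and $X_j$ are any two of the random variables $X_1, X_2, \ldots, X_k$, and $\sigma_{ij \bullet k+1, k+2, \ldots, k+r}$ is the conditional covariance of $X_i$ and $X_j$ in the joint conditional distribution. Then the partial correlation coefficient of $X_i$ and $X_j$ given $X_{k+1}, X_{k+2}, \ldots, X_{k+r}$ is defined as:
$$\rho_{X_i X_j \bullet X_{k+1} X_{k+2} \cdots X_{k+r}} = \frac{\sigma_{ij \bullet k+1, k+2, \ldots, k+r}}{\sqrt{\sigma_{ii \bullet k+1, k+2, \ldots, k+r} \cdot \sigma_{jj \bullet k+1, k+2, \ldots, k+r}}} \tag{3.13}$$
The partial correlation coefficient can be calculated from simple correlation coefficients. Let $Z$ be the collection of random variables $X_{k+1}, X_{k+2}, \ldots, X_{k+r}$; then Equation 3.13 can be expressed in a more explicit but equivalent form:
$$\rho_{X_i X_j \bullet Z} = \frac{\sigma_{X_i X_j \bullet Z}}{\sqrt{\sigma_{X_i X_i \bullet Z} \cdot \sigma_{X_j X_j \bullet Z}}} = \frac{\rho_{X_i X_j} - \rho_{X_i Z} \cdot \rho_{X_j Z}}{\sqrt{(1 - \rho_{X_i Z}^2)(1 - \rho_{X_j Z}^2)}} \tag{3.13}$$
The last expression applies directly when $Z$ is a single random variable; with several conditioning variables it is applied recursively, one conditioning variable at a time.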
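The two forms of the partial correlation coefficient can be compared numerically for a single conditioning variable $Z$. The sketch below uses a hypothetical 3 × 3 covariance matrix for $(X_i, X_j, Z)$ and takes the conditional covariance as $\sigma_{ij \bullet z} = \sigma_{ij} - \sigma_{iz}\sigma_{jz}/\sigma_{zz}$, which is the form it takes under multivariate normality; both the data and this choice are assumptions made for illustration.

```python
import math

# Hypothetical covariance matrix of (X_i, X_j, Z), in that order
S = [[4.0, 1.8, 1.0],
     [1.8, 9.0, 1.2],
     [1.0, 1.2, 1.0]]

def corr(S, a, b):
    """Simple correlation coefficient from covariance entries (Equation 3.12)."""
    return S[a][b] / math.sqrt(S[a][a] * S[b][b])

def partial_from_simple(r_ij, r_iz, r_jz):
    """Partial correlation from simple correlations, single conditioning variable."""
    return (r_ij - r_iz * r_jz) / math.sqrt((1 - r_iz ** 2) * (1 - r_jz ** 2))

def cond_cov(S, a, b, z):
    """Conditional covariance sigma_{ab.z} = sigma_ab - sigma_az * sigma_bz / sigma_zz."""
    return S[a][b] - S[a][z] * S[b][z] / S[z][z]

i, j, z = 0, 1, 2
rho_simple_route = partial_from_simple(corr(S, i, j), corr(S, i, z), corr(S, j, z))
rho_cond_route = cond_cov(S, i, j, z) / math.sqrt(
    cond_cov(S, i, i, z) * cond_cov(S, j, j, z))
```

For these values the simple correlation between $X_i$ and $X_j$ is 0.3, while both routes give a partial correlation of about 0.126, showing how conditioning on $Z$ can weaken the apparent linear relationship.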
Equation (3.13) also gives the calculation approach. The interpretation of a partial correlation coefficient is the same as that of a simple correlation coefficient except that the partial correlation coefficient generally depends on the conditional values of the conditional random variables. Graphically, the simple correlation coefficient describes the data in an ordinary scatter plot, whereas the partial correlation coefficient describes the data in the partial regression residual plot. For example, considering the collection of random variables $Y, X_1, X_2, \ldots, X_l$, suppose $Y$ and $X_1$ are of interest and the remaining variables are held at fixed values. The residuals of $Y$ and $X_1$ are calculated by regressing each of them on $X_2, \ldots, X_l$. These residuals are the parts of $Y$ and $X_1$ that cannot be predicted from $X_2, \ldots, X_l$. The partial correlation coefficient between $Y$ and $X_1$ after adjusting for $X_2, \ldots, X_l$ is the correlation between these two sets of residuals. Furthermore, the regression coefficient when $Y$'s residuals are regressed on $X_1$'s residuals is equal to the regression coefficient of $X_1$ in the multiple regression equation when $Y$ is regressed on $X_1, X_2, \ldots, X_l$. For more details, please refer to the regression section in Chapter 4.
The simple correlation coefficient is a measure of the linear relationship between two random variables. The partial correlation coefficient is a measure of the linear relationship between two random variables when other random variables are held fixed, and the multiple correlation coefficient is a measure of the linear relationship between a random variable and a set of other random variables.
Definition 3.12 (Multiple correlation coefficients): Without loss of generality, suppose $X_1$ is a random variable and $X_2, X_3, \ldots, X_p$ are a set of other random variables. Let
$$Y = \sum_{i=2}^{p} a_i X_i$$
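be an arbitrary linear combination of $X_2, X_3, \ldots, X_p$. Then the multiple correlation coefficient of $X_1$ and $X_2, X_3, \ldots, X_p$ is the maximum simple correlation coefficient of $X_1$ and $Y$ for variation over $a_i$, $i = 2, 3, \ldots, p$. For example, let
$$\rho_{X_1 Y}(a_2, a_3, \ldots, a_p) = \frac{\operatorname{cov}(X_1, Y)}{\sqrt{\operatorname{var}(X_1) \cdot \operatorname{var}(Y)}} \tag{3.14}$$
Then the multiple correlation coefficient of $X_1$ and $X_2, X_3, \ldots, X_p$, denoted by $\rho_{1 \bullet 23 \cdots p}$, is defined as the maximum of $\rho_{X_1 Y}(a_2, a_3, \ldots, a_p)$ with respect to $a_2, a_3, \ldots, a_p$, i.e.,
$$\rho_{1 \bullet 23 \cdots p} = \max_{a_i,\, 2 \leq i \leq p} \left\{ \rho_{X_1 Y}(a_2, a_3, \ldots, a_p) \right\}$$
It can be calculated according to the following procedure:
1. Let $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T$; partition $\Sigma$ in the following manner:
$$\Sigma = \operatorname{var}(\mathbf{X}) = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}^T & \Sigma_{22} \end{bmatrix}$$
where $\Sigma_{11} = \operatorname{var}(X_1) = \sigma_{11} \in R^{1 \times 1}$, $\Sigma_{12} = [\operatorname{cov}(X_1, X_2) \; \cdots \; \operatorname{cov}(X_1, X_p)] = [\sigma_{12} \; \cdots \; \sigma_{1p}]$, and $\Sigma_{22} = \operatorname{cov}(\mathbf{X}_{-1}, \mathbf{X}_{-1})$, where $\mathbf{X}_{-1} = [X_2, X_3, \ldots, X_p]^T$.
2. Based on the definition of $\rho_{X_1 Y}(a_2, a_3, \ldots, a_p)$, compute
$$\rho^2_{X_1 Y}(a_2, a_3, \ldots, a_p) = \frac{(\Sigma_{12}\, \mathbf{a}_{-1})^2}{\sigma_{11} \left( \mathbf{a}_{-1}^T \Sigma_{22}\, \mathbf{a}_{-1} \right)}$$
where $\mathbf{a}_{-1} = [a_2 \; \cdots \; a_p]^T$.
3. Calculate $\rho^2_{1 \bullet 23 \cdots p}$ by
$$\rho^2_{1 \bullet 23 \cdots p} = \max_{a_i,\, 2 \leq i \leq p} \left\{ \rho^2_{X_1 Y}(a_2, a_3, \ldots, a_p) \right\}.$$
It can be shown that
$$\rho^2_{1 \bullet 23 \cdots p} = \frac{1}{\sigma_{11}} \left( \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{12}^T \right). \tag{3.15}$$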
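4. Determine $\rho_{1 \bullet 23 \cdots p}$ by taking the nonnegative square root of $\rho^2_{1 \bullet 23 \cdots p}$, i.e., $\rho_{1 \bullet 23 \cdots p} = +\sqrt{\rho^2_{1 \bullet 23 \cdots p}}$. The root is nonnegative because replacing $\mathbf{a}_{-1}$ by $-\mathbf{a}_{-1}$ flips the sign of $\rho_{X_1 Y}$, so the maximum cannot be negative.
Remark 3.1: Let
$$S = \begin{bmatrix} S_{11} & S_{12} \\ S_{12}^T & S_{22} \end{bmatrix}$$
be the sample covariance matrix. Then the sample multiple correlation coefficient is:
$$r^2_{1 \bullet 23 \cdots p} = \frac{1}{S_{11}} \left( S_{12} S_{22}^{-1} S_{12}^T \right).$$
Remark 3.2: The multiple correlation coefficient can be used to test whether $X_1$ is independent of $X_2, X_3, \ldots, X_p$ by the F-test with appropriate degrees of freedom.
Remark 3.3: The multiple correlation coefficient describes the correlation between one dependent variable $X_1$ and its best estimate from a regression on several independent variables $X_2, X_3, \ldots, X_p$. The squared multiple correlation coefficient is the proportion of variance accounted for by the multiple regression.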
3.3 MULTIVARIATE NORMAL DISTRIBUTION AND QUADRATIC FORMS
The basic and central distribution in multivariate analysis is the multivariate normal distribution. This is due to two main factors. First, multivariate observations are often normally distributed, or approximately normally distributed, due to the central limit theorem. This fact lays the foundation for formal inferences. Second, the multivariate normal distribution and the sampling distributions are tractable. This is not generally the case with other multivariate distributions, even for those that are close to normality.
3.3.1 MULTIVARIATE NORMAL DISTRIBUTION AND ITS PROPERTIES
Definition 3.13 (Multivariate normal distribution): Let $\mathbf{X} = [X_1, X_2, \ldots, X_p]^T \in R^p$ be a random vector with $\mathbf{X} \sim (\boldsymbol{\mu}_X, \Sigma_X)$, where all the $p$ random variables $X_i$ are continuous such that $-\infty < X_i < \infty$, and $\Sigma_X$ is a positive definite matrix. Thus $S = \{(x_1, x_2, \ldots, x_p)^T \mid -\infty < x_i < \infty,\ i = 1, 2, \ldots, p\}$ forms the sample space for $\mathbf{X}$. Then random vector $\mathbf{X}$ is said to have a $p$-variate normal distribution if its multivariate probability density function is given by:
$$f(\mathbf{X}) = \frac{1}{(2\pi)^{p/2} |\Sigma_X|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{X} - \boldsymbol{\mu}_X)^T \Sigma_X^{-1} (\mathbf{X} - \boldsymbol{\mu}_X) \right\} \tag{3.16}$$
Often, it is denoted as $\mathbf{X} \sim N_p(\boldsymbol{\mu}_X, \Sigma_X)$, or simply $N_p(\boldsymbol{\mu}, \Sigma)$.
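Properties of multivariate normal distribution:
1. If $\mathbf{X}$ follows a $p$-variate normal distribution, or $\mathbf{X} \sim N_p(\boldsymbol{\mu}_X, \Sigma_X)$, then both $\boldsymbol{\mu}_X \equiv E(\mathbf{X})$ and $\Sigma_X \equiv \operatorname{cov}(\mathbf{X})$ exist and the distribution of $\mathbf{X}$ is uniquely determined by $\boldsymbol{\mu}_X$ and $\Sigma_X$.
2. If $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, and $A \in R^{q \times p}$, $\mathbf{b} \in R^q$, then $\mathbf{Y} = A\mathbf{X} + \mathbf{b}$ has a $q$-variate normal distribution with $\mathbf{Y} \sim N_q(A\boldsymbol{\mu} + \mathbf{b}, A \Sigma A^T)$.
3. If $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, then the marginal distribution of any subset of $k$ $(k < p)$ components of $\mathbf{X}$ is $k$-variate normal.
4. If $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, for the partition:
$$\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}, \quad \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix},$$
where $\mathbf{X}_1 \in R^k$, $\boldsymbol{\mu}_1 \in R^k$, and $\Sigma_{11} \in R^{k \times k}$, the subvectors $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent iff $\Sigma_{12} = 0$, i.e.,
$$\mathbf{X}_1 \perp\!\!\!\perp \mathbf{X}_2 \Leftrightarrow \Sigma_{12} = 0 \tag{3.17}$$
Remark 3.4: Property 4 can be easily extended to the case in which $\mathbf{X}$ is partitioned into more than two subvectors.
Remark 3.5: Property 4 shows that, in order to determine whether two subvectors of a normally distributed vector are independent, it suffices to check that the covariance matrix between these two subvectors is zero.
5. Contours of $N_p(\boldsymbol{\mu}, \Sigma)$ are defined as:
$$(\mathbf{X} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{X} - \boldsymbol{\mu}) = C^2 \tag{3.18}$$
where $C$ is a positive constant. So Equation 3.18 defines an ellipsoid with its center at $\boldsymbol{\mu}$ and axes $\pm C \sqrt{\lambda_i}\, \mathbf{e}_i$, where $\lambda_i$ and $\mathbf{e}_i$ are the eigenvalues and associated eigenvectors of $\Sigma$. So $\boldsymbol{\mu}$ determines its position and $\Sigma$ defines its shape and orientation.
6. (Regression) Given $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, and the partition:
$$\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}, \quad \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \quad \text{and} \quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$
where $\mathbf{X}_1 \in R^k$, $\boldsymbol{\mu}_1 \in R^k$, and $\Sigma_{11} \in R^{k \times k}$. Let $\Sigma_{22}^-$ be the generalized inverse of $\Sigma_{22}$, i.e., $\Sigma_{22} = \Sigma_{22} \Sigma_{22}^- \Sigma_{22}$, and define $\Sigma_{11}^{(1)} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^- \Sigma_{21}$. Then: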
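a. $\mathbf{X}_1 - \Sigma_{12} \Sigma_{22}^- \mathbf{X}_2 \sim N_k(\boldsymbol{\mu}_1 - \Sigma_{12} \Sigma_{22}^- \boldsymbol{\mu}_2,\ \Sigma_{11}^{(1)})$; furthermore,
$$\mathbf{X}_1 - \Sigma_{12} \Sigma_{22}^- \mathbf{X}_2 \perp\!\!\!\perp \mathbf{X}_2 \tag{3.19}$$
b. The conditional distribution of $\mathbf{X}_1$ given $\mathbf{X}_2 = \mathbf{x}_2$ is:
$$\mathbf{X}_1 \mid \mathbf{X}_2 = \mathbf{x}_2 \sim N_k(\boldsymbol{\mu}_1 + \Sigma_{12} \Sigma_{22}^- (\mathbf{x}_2 - \boldsymbol{\mu}_2),\ \Sigma_{11}^{(1)}) \tag{3.20}$$
Remark 3.6:
$$E(\mathbf{X}_1 \mid \mathbf{X}_2) = \boldsymbol{\mu}_1 + \Sigma_{12} \Sigma_{22}^- (\mathbf{X}_2 - \boldsymbol{\mu}_2) \tag{3.21}$$
$$\operatorname{var}(\mathbf{X}_1 \mid \mathbf{X}_2) = \Sigma_{11}^{(1)} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^- \Sigma_{21} \tag{3.22}$$
Rewrite $E(\mathbf{X}_1 \mid \mathbf{X}_2)$ as:
$$E(\mathbf{X}_1 \mid \mathbf{X}_2) = (\boldsymbol{\mu}_1 - \Sigma_{12} \Sigma_{22}^- \boldsymbol{\mu}_2) + \Sigma_{12} \Sigma_{22}^- \mathbf{X}_2 = \mathbf{b} + A\mathbf{X}_2 \tag{3.23}$$
where $\mathbf{b} = \boldsymbol{\mu}_1 - \Sigma_{12} \Sigma_{22}^- \boldsymbol{\mu}_2$ and $A = \Sigma_{12} \Sigma_{22}^-$. So $E(\mathbf{X}_1 \mid \mathbf{X}_2)$ is usually called the regression function of $\mathbf{X}_1$ on $\mathbf{X}_2$ with the intercept vector $\mathbf{b}$ and the regression matrix $A$. It is a linear regression function because it depends linearly on the variable $\mathbf{X}_2$ being held constant, whereas $\operatorname{var}(\mathbf{X}_1 \mid \mathbf{X}_2)$ is constant and does not depend on $\mathbf{X}_2$.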
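For the bivariate case ($p = 2$, $k = 1$) the regression function and the conditional variance reduce to scalar formulas, which the following sketch evaluates. The function name `conditional_normal` and the numerical values are hypothetical; the code assumes $\sigma_{22} > 0$, so the generalized inverse of $\Sigma_{22}$ is simply $1/\sigma_{22}$.

```python
def conditional_normal(mu, sigma, x2):
    """Mean and variance of X1 | X2 = x2 for a bivariate normal (k = 1, p = 2).

    mu is [mu_1, mu_2]; sigma is the 2 x 2 covariance matrix with sigma[1][1] > 0.
    """
    A = sigma[0][1] / sigma[1][1]          # regression coefficient Sigma_12 / Sigma_22
    b = mu[0] - A * mu[1]                  # intercept mu_1 - A * mu_2
    cond_mean = b + A * x2                 # regression function b + A * x2
    cond_var = sigma[0][0] - sigma[0][1] * sigma[1][0] / sigma[1][1]
    return cond_mean, cond_var

# Hypothetical parameters: X1 has mean 1 and variance 4, X2 has mean 2 and variance 1
mean, var = conditional_normal([1.0, 2.0], [[4.0, 1.2], [1.2, 1.0]], 3.0)
```

Here the conditional mean is about 2.2 and the conditional variance about 2.56. The variance is smaller than the marginal variance 4 and is the same for every observed value of $X_2$, while the conditional mean shifts linearly with that value.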
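7. For random vectors $\mathbf{X}$ and $\mathbf{Y} \in R^p$, if $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$ and $\mathbf{X} + \mathbf{Y}$ is $p$-variate normally distributed, then both $\mathbf{X}$ and $\mathbf{Y}$ are $p$-variate normal.
8. Linear combination of random vectors: If $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N \in R^p$ are all independent random vectors and $\mathbf{X}_i \sim N_p(\boldsymbol{\mu}_i, \Sigma_i)$ for $i = 1, 2, \ldots, N$, then:
a. For any fixed constant $\mathbf{a}^T = [a_1 \; \cdots \; a_N]$,
$$\sum_{i=1}^{N} a_i \mathbf{X}_i \sim N_p\left( \sum_{i=1}^{N} a_i \boldsymbol{\mu}_i,\ \sum_{i=1}^{N} a_i^2 \Sigma_i \right) \tag{3.24}$$
b. As a special case of Equation 3.24, suppose $\mathbf{X}_i \overset{iid}{\sim} N_p(\boldsymbol{\mu}, \Sigma)$ for $i = 1, 2, \ldots, N$ and $a_1 = \cdots = a_N = 1/N$; then:
$$\bar{\mathbf{X}} \sim N_p\left( \boldsymbol{\mu},\ \frac{1}{N} \Sigma \right), \tag{3.25}$$
where
$$\bar{\mathbf{X}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{X}_i$$
is called the sample mean vector.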
9. Let X ~ N p (µ , ∑), where ∑ is nonsingular, so that there exists a nonsingular matrix D such that ∑ = DDT . Then the transformation Y = D–1 (X – µ) ~ N p (0, I p ).
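One concrete choice of the matrix $D$ in Property 9 is the lower-triangular Cholesky factor of $\Sigma$; this is an assumption made here for illustration, since the property only requires some nonsingular $D$ with $\Sigma = DD^T$. The sketch below builds $D$ for a hypothetical 2 × 2 covariance matrix and checks that $D^{-1} \Sigma D^{-T}$ is the identity, which is the covariance of $\mathbf{Y} = D^{-1}(\mathbf{X} - \boldsymbol{\mu})$.

```python
import math

def cholesky2(S):
    """Lower-triangular D with S = D D^T for a 2 x 2 positive definite S."""
    d11 = math.sqrt(S[0][0])
    d21 = S[1][0] / d11
    d22 = math.sqrt(S[1][1] - d21 * d21)
    return [[d11, 0.0], [d21, d22]]

def inv_lower2(D):
    """Inverse of a 2 x 2 lower-triangular matrix."""
    return [[1.0 / D[0][0], 0.0],
            [-D[1][0] / (D[0][0] * D[1][1]), 1.0 / D[1][1]]]

def matmul2(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical population parameters
mu = [1.0, 2.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]

D = cholesky2(sigma)
D_inv = inv_lower2(D)
D_inv_T = [[D_inv[0][0], D_inv[1][0]], [D_inv[0][1], D_inv[1][1]]]

# Covariance of Y = D^{-1}(X - mu) is D^{-1} Sigma D^{-T}
cov_y = matmul2(matmul2(D_inv, sigma), D_inv_T)

def whiten(x):
    """Apply Y = D^{-1} (x - mu) to one realization x."""
    c = [x[0] - mu[0], x[1] - mu[1]]
    return [D_inv[0][0] * c[0] + D_inv[0][1] * c[1],
            D_inv[1][0] * c[0] + D_inv[1][1] * c[1]]
```

The transformed vector has zero mean and identity covariance, so its components are uncorrelated with unit variance; under normality they are independent standard normal variables.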
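3.3.2 QUADRATIC FORMS
Quadratic forms in multivariate normal random variables arise very often and play an important role in multivariate analysis. Some important results for certain quadratic forms under various parametric conditions are given in this subsection.
Theorem 3.2: Suppose $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, where $\Sigma$ is nonsingular; then:
1. $(\mathbf{X} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{X} - \boldsymbol{\mu}) \sim \chi^2_p$ and
$$\Pr\{(\mathbf{X} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{X} - \boldsymbol{\mu}) \leq \chi^2_p(\alpha)\} = 1 - \alpha, \tag{3.26}$$
where $\chi^2_p(\alpha)$ is the upper $(100\alpha)$th percentile of the $\chi^2$ distribution with $p$ degrees of freedom.
2. $$\mathbf{X}^T \Sigma^{-1} \mathbf{X} \sim \chi^2_p(\Delta) \tag{3.27}$$
where $\Delta = \boldsymbol{\mu}^T \Sigma^{-1} \boldsymbol{\mu}$.
Remark 3.7: This result is used in testing a hypothesis about the mean of a multivariate normal distribution when the covariance matrix is known, e.g., as a $\chi^2$ control chart in a quality control problem.
Remark 3.8: From this theorem and Property 8, it is easy to show that, for the sample mean vector $\bar{\mathbf{X}}$ of Equation 3.25,
$$N (\bar{\mathbf{X}} - \boldsymbol{\mu})^T \Sigma^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}) \sim \chi^2_p \tag{3.28}$$
Remark 3.9: If $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, then $(\mathbf{X} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{X} - \boldsymbol{\mu})$ is the squared statistical distance (squared Mahalanobis distance) from $\mathbf{X}$ to the population mean vector $\boldsymbol{\mu}$.
Remark 3.10: If $\sigma_{ii} > \sigma_{jj}$, then $X_i$ contributes less to the statistical distance than $X_j$; if $\sigma_{ij} > \sigma_{st}$, then $X_i$ and $X_j$ contribute less to the statistical distance than $X_s$ and $X_t$.
Theorem 3.3: Suppose $\mathbf{X} \sim N_p(\boldsymbol{\mu}, \Sigma)$, where $\Sigma$ is nonsingular. Then for the partition:
$$\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}, \quad \boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \quad \text{and} \quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix},$$
where $\mathbf{X}_1 \in R^k$, $\boldsymbol{\mu}_1 \in R^k$, and $\Sigma_{11} \in R^{k \times k}$, the following result holds:
$$(\mathbf{X} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{X} - \boldsymbol{\mu}) - (\mathbf{X}_1 - \boldsymbol{\mu}_1)^T \Sigma_{11}^{-1} (\mathbf{X}_1 - \boldsymbol{\mu}_1) \sim \chi^2_{p-k} \tag{3.29}$$
Theorem 3.4: Suppose $\mathbf{X} \sim N_p(\mathbf{0}, \Sigma)$, where $\operatorname{rank}(\Sigma) = r$ $(r \leq p)$, and $\Sigma^-$ is the generalized inverse of $\Sigma$, i.e., $\Sigma \Sigma^- \Sigma = \Sigma$; then
$$\mathbf{X}^T \Sigma^- \mathbf{X} \sim \chi^2_r \tag{3.30}$$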
3.3.3 NONCENTRAL $\chi^2$ AND F DISTRIBUTIONS
The noncentral $\chi^2$ distribution can be expressed as a mixture of central $\chi^2$ distributions where the weights are Poisson probabilities.
Definition 3.14 (Noncentral $\chi^2$ distribution): If $\mathbf{X} \sim N_p(\boldsymbol{\mu}, I_p)$, then the random variable $Z = \mathbf{X}^T \mathbf{X}$ has the CDF:
$$F(z; p, \delta) = \sum_{i=0}^{\infty} \frac{\left( \frac{1}{2}\delta \right)^i e^{-\delta/2}}{i!} \Pr(\chi^2_{p+2i} \leq z) \tag{3.31}$$
where $\delta = \boldsymbol{\mu}^T \boldsymbol{\mu}$. $Z$ is said to have the noncentral $\chi^2$ distribution with degrees of freedom (d.o.f.) $p$ and noncentrality parameter $\delta$, denoted by $\chi^2_p(\delta)$.
Corollary: If $Z \sim \chi^2_p(\delta)$, then its PDF can be expressed as:
$$\sum_{k=0}^{\infty} P(K = k)\, g_{p+2k}(z), \quad z > 0 \tag{3.32}$$
where $K \sim \text{Poisson}(\text{mean} = \delta/2)$, and
$$g_r(z) = \frac{1}{2^{r/2}\, \Gamma\left( \frac{1}{2} r \right)} e^{-z/2} z^{r/2 - 1} \tag{3.33}$$
is the density function of the $\chi^2_r$ distribution.
Property 1: If $Z \sim \chi^2_p(\delta)$, then $E(Z) = p + \delta$, $\operatorname{var}(Z) = 2p + 4\delta$.
Property 2: If $Z_1 \sim \chi^2_{p_1}(\delta_1)$ and $Z_2 \sim \chi^2_{p_2}(\delta_2)$ are independent, then $Z_1 + Z_2 \sim \chi^2_{p_1 + p_2}(\delta_1 + \delta_2)$.
Recall that the central F distribution is obtained by taking the ratio of two independent $\chi^2$ variables, each divided by its degrees of freedom. The noncentral F distribution is obtained by allowing the numerator variable to be noncentral $\chi^2$.
Definition 3.15 (Noncentral F distribution): If $Z_1 \sim \chi^2_{p_1}(\delta)$, $Z_2 \sim \chi^2_{p_2}$, and $Z_1$ is independent of $Z_2$, then
$$F = \frac{Z_1 / p_1}{Z_2 / p_2} \tag{3.34}$$
is said to have the noncentral F distribution with d.o.f. $p_1$ and $p_2$, and noncentrality parameter $\delta$, denoted by $F_{p_1, p_2}(\delta)$.
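Property: If $F \sim F_{p_1, p_2}(\delta)$, then
$$E(F) = \frac{p_2 (p_1 + \delta)}{p_1 (p_2 - 2)}, \quad p_2 > 2 \tag{3.35}$$
$$\operatorname{var}(F) = 2 \left( \frac{p_2}{p_1} \right)^2 \frac{(p_1 + \delta)^2 + (p_1 + 2\delta)(p_2 - 2)}{(p_2 - 2)^2 (p_2 - 4)}, \quad p_2 > 4 \tag{3.36}$$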
Remark 3.11: The noncentral $\chi^2$ CDF can be evaluated with the Matlab® function NCX2CDF(x,df,delta) or the SAS function PROBCHI(x,df,delta). Both return the noncentral chi-square probability with df degrees of freedom and noncentrality parameter delta at the values in x.
Remark 3.12: The noncentral F CDF can be evaluated with the Matlab function NCFCDF(x,ndf,ddf,delta) or the SAS function PROBF(x,ndf,ddf,delta). Both return the noncentral F CDF with numerator degrees of freedom ndf, denominator degrees of freedom ddf, and noncentrality parameter delta at the values in x. For more details about these functions, please refer to the related Matlab [11] and SAS [12] manuals.
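The Poisson-mixture representation of the noncentral $\chi^2$ distribution can also be evaluated directly. The sketch below is a plain-Python illustration and is not a substitute for the Matlab or SAS functions named above: it is restricted to even degrees of freedom, for which the central $\chi^2$ CDF has the closed form $1 - e^{-z/2} \sum_{j=0}^{m-1} (z/2)^j / j!$ with $m = \text{d.o.f.}/2$, and it truncates the infinite sum after an assumed number of terms. All function names are hypothetical.

```python
import math

def central_chi2_cdf_even(z, dof):
    """CDF of the central chi-square distribution for even dof (closed form)."""
    m = dof // 2
    half = z / 2.0
    term, total = 1.0, 0.0
    for j in range(m):
        total += term               # adds (z/2)^j / j!
        term *= half / (j + 1)
    return 1.0 - math.exp(-half) * total

def noncentral_chi2_cdf_even(z, p, delta, terms=200):
    """Poisson mixture of central CDFs, truncated after `terms` summands."""
    half_delta = delta / 2.0
    weight = math.exp(-half_delta)  # Poisson probability P(K = 0)
    cdf = 0.0
    for i in range(terms):
        cdf += weight * central_chi2_cdf_even(z, p + 2 * i)
        weight *= half_delta / (i + 1)   # P(K = i + 1) from P(K = i)
    return cdf

def noncentral_chi2_mean(p, delta, terms=200):
    """Mixture mean: sum over k of P(K = k) * (p + 2k), which should equal p + delta."""
    half_delta = delta / 2.0
    weight = math.exp(-half_delta)
    mean = 0.0
    for k in range(terms):
        mean += weight * (p + 2 * k)
        weight *= half_delta / (k + 1)
    return mean

cdf_noncentral = noncentral_chi2_cdf_even(3.0, 4, 2.0)
cdf_central = central_chi2_cdf_even(3.0, 4)
```

With $\delta = 0$ the mixture collapses to the central CDF, and for $\delta > 0$ the noncentral CDF lies below the central one at the same point, since the noncentrality shifts probability mass to the right. The mixture mean reproduces $E(Z) = p + \delta$.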
3.4 SAMPLING THEORY
3.4.1 SAMPLE GEOMETRY
Assume $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n \in R^p$ are $n$ random vectors and $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are the associated realizations. Let $\mathbf{x}$ be an $n \times p$ matrix:
$$\mathbf{x} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_n^T \end{bmatrix} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}_{n \times p} \tag{3.37}$$
Definition 3.16 (Observation matrix, sample matrix): An observation matrix $\mathbf{x}$ with $n$ multivariate observations (or sample size $n$) is defined as shown in Equation 3.37; each row is called a single multivariate observation and is the collection of measurements of $p$ different variables taken at the same trial.
Remark 3.13: Matrix $\mathbf{x}$ is also referred to as a sample of size $n$ from a $p$-variate population. From Definition 3.16, each row of an observation matrix is a realization of a $p$-variate random vector.
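Remark 3.14: In observation matrix $\mathbf{x}$, all the rows are independent of each other. For simplicity, we introduce the 1-vector:
$$\mathbf{1}_n = [1 \; 1 \; \cdots \; 1]^T \in R^n \tag{3.38}$$
We define the sample mean vector $\bar{\mathbf{x}}$, sample covariance matrix $\mathbf{s}_n$, and sample correlation matrix $R_n$ as follows:
Sample mean vector:
$$\bar{\mathbf{x}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i = \frac{1}{n} \mathbf{x}^T \mathbf{1} \tag{3.39}$$
Sample covariance matrix:
$$\mathbf{s}_n = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T = \frac{1}{n-1} (\mathbf{x} - \mathbf{1}\bar{\mathbf{x}}^T)^T (\mathbf{x} - \mathbf{1}\bar{\mathbf{x}}^T) = \frac{1}{n-1} \mathbf{x}^T \left( I - \frac{1}{n} \mathbf{1}\mathbf{1}^T \right) \mathbf{x} \tag{3.40}$$
Sample correlation matrix:
$$R_n = (r_{ij}) = \left( \frac{s_{ij}}{\sqrt{s_{ii}\, s_{jj}}} \right) \tag{3.41}$$
where $s_{ij}$ is the $(i, j)$ entry of $\mathbf{s}_n$. To simplify the notation, let
$$A = \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T \tag{3.42}$$
So Equation 3.40 becomes
$$\mathbf{s}_n = \frac{1}{n-1} A \tag{3.43}$$
Remark 3.15: $\mathbf{s}_n$ is a symmetric matrix with $p$ variances and $\frac{1}{2} p(p-1)$ potentially different covariances.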
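The sample mean vector, sample covariance matrix, and sample correlation matrix can be computed directly from an observation matrix stored as a list of rows. The following plain-Python sketch uses a small hypothetical data set with $n = 4$ observations on $p = 2$ variables; the function names are introduced here for illustration only.

```python
import math

def sample_mean(x):
    """Sample mean vector: column averages of the n x p observation matrix."""
    n = len(x)
    return [sum(row[j] for row in x) / n for j in range(len(x[0]))]

def sample_cov(x):
    """Sample covariance matrix with divisor n - 1."""
    n, p = len(x), len(x[0])
    m = sample_mean(x)
    return [[sum((row[i] - m[i]) * (row[j] - m[j]) for row in x) / (n - 1)
             for j in range(p)] for i in range(p)]

def sample_corr(x):
    """Sample correlation matrix r_ij = s_ij / sqrt(s_ii * s_jj)."""
    s = sample_cov(x)
    p = len(s)
    return [[s[i][j] / math.sqrt(s[i][i] * s[j][j]) for j in range(p)]
            for i in range(p)]

# Hypothetical observation matrix: each row is one bivariate observation
x = [[1.0, 2.0],
     [2.0, 4.0],
     [3.0, 7.0],
     [4.0, 7.0]]

x_bar = sample_mean(x)
s_n = sample_cov(x)
r_n = sample_corr(x)
```

For this data set the sample means are 2.5 and 5, the sample variances are 5/3 and 6, the sample covariance is 3, and the sample correlation is $3/\sqrt{10} \approx 0.949$.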
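Remark 3.16: $\mathbf{s}_n$ can be denoted by
$$\mathbf{s}_n = (s_{ij}), \quad i, j = 1, 2, \ldots, p \tag{3.44}$$
where
$$s_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j).$$
The observation matrix defined in Equation 3.37 can be geometrically interpreted in the following ways:
1. Express
$$\mathbf{x} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_n^T \end{bmatrix} = [\mathbf{x}_1 \; \mathbf{x}_2 \; \cdots \; \mathbf{x}_n]^T.$$
According to this expression, the total $n$ data points, $\mathbf{x}_i = [x_{i1} \; x_{i2} \; \cdots \; x_{ip}]^T$ $(i = 1, 2, \ldots, n)$, can be plotted in $R^p$ space. This scatter plot contains rich information about the locations and variability of these points.
2. Express $\mathbf{x}$ as $\mathbf{x} = [\tilde{\mathbf{x}}_1 \; \tilde{\mathbf{x}}_2 \; \cdots \; \tilde{\mathbf{x}}_p]$, where
$$\tilde{\mathbf{x}}_j = \begin{bmatrix} x_{1j} \\ x_{2j} \\ \vdots \\ x_{nj} \end{bmatrix}, \quad j = 1, 2, \ldots, p$$
Here, there are a total of $p$ vectors in $R^n$ space. Each $\tilde{\mathbf{x}}_j$ can be decomposed into two mutually perpendicular parts,
$$\mathbf{d}_j = \tilde{\mathbf{x}}_j - \bar{x}_j \mathbf{1}_n = \begin{bmatrix} x_{1j} - \bar{x}_j \\ x_{2j} - \bar{x}_j \\ \vdots \\ x_{nj} - \bar{x}_j \end{bmatrix} \quad \text{and} \quad \bar{x}_j \mathbf{1}_n,$$
which is illustrated in Figure 3.1.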
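FIGURE 3.1 Illustration of the decomposition of each column vector $\tilde{\mathbf{x}}_j$ of data matrix $\mathbf{x} = [\tilde{\mathbf{x}}_1 \; \tilde{\mathbf{x}}_2 \; \cdots \; \tilde{\mathbf{x}}_p]$ into $\bar{x}_j \mathbf{1}_n$ and a perpendicular residual, where $\bar{x}_j$ is the mean value of the $j$th column vector $\tilde{\mathbf{x}}_j$, i.e., $\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij} = \frac{1}{n} \tilde{\mathbf{x}}_j^T \cdot \mathbf{1}_n$.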
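In the decomposition
$$\tilde{\mathbf{x}}_j = \bar{x}_j \mathbf{1}_n + \mathbf{d}_j, \quad j = 1, 2, \ldots, p \tag{3.45}$$
each $\bar{x}_j \mathbf{1}_n$ is called a mean vector and $\mathbf{d}_j$ is called the associated residual vector. The residual matrix is defined by:
$$[\mathbf{d}_1 \; \mathbf{d}_2 \; \cdots \; \mathbf{d}_p] = \begin{bmatrix} \mathbf{x}_1^T - \bar{\mathbf{x}}^T \\ \mathbf{x}_2^T - \bar{\mathbf{x}}^T \\ \vdots \\ \mathbf{x}_n^T - \bar{\mathbf{x}}^T \end{bmatrix} = \mathbf{x} - \mathbf{1}_n \bar{\mathbf{x}}^T$$
The length of each residual vector $\mathbf{d}_j$ is given by:
$$l_{d_j} = \|\mathbf{d}_j\| = \sqrt{\mathbf{d}_j^T \mathbf{d}_j} = \sqrt{\sum_{k=1}^{n} (x_{kj} - \bar{x}_j)^2} = \sqrt{(n-1)\, s_{jj}} \tag{3.46}$$
For any two residual vectors $\mathbf{d}_i$ and $\mathbf{d}_j$, the inner product
$$(\mathbf{d}_i, \mathbf{d}_j) = \mathbf{d}_i^T \mathbf{d}_j = \sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j) = (n-1)\, s_{ij} \tag{3.47}$$
$$= l_{d_i}\, l_{d_j} \cos(\theta_{ij}) \tag{3.48}$$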
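where $\theta_{ij}$ is the angle between the two residual vectors $\mathbf{d}_i$ and $\mathbf{d}_j$. Substituting Equation 3.46 into Equation 3.48 gives:

$$(\mathbf{d}_i, \mathbf{d}_j) = \sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j) = \sqrt{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)^2} \, \sqrt{\sum_{k=1}^{n} (x_{kj} - \bar{x}_j)^2} \, \cos(\theta_{ij})$$

This implies that:

$$\cos(\theta_{ij}) = \frac{s_{ij}}{\sqrt{s_{ii}} \sqrt{s_{jj}}} = r_{ij} \tag{3.49}$$

i.e., $\cos(\theta_{ij})$ is the sample correlation coefficient. Note the following:

1. If $\mathbf{d}_i$ is nearly parallel to $\mathbf{d}_j$ with the same direction, i.e., $\theta_{ij} \approx 0°$, then $r_{ij} \approx 1$; if $\mathbf{d}_i$ is nearly parallel to $\mathbf{d}_j$ but with opposite direction, i.e., $\theta_{ij} \approx 180°$, then $r_{ij} \approx -1$.
2. If $\mathbf{d}_i$ and $\mathbf{d}_j$ are mutually perpendicular, then $r_{ij} = 0$.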
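The geometric relationship in Equation 3.46 through Equation 3.49 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the data values and all variable names are made up for illustration and do not come from the text.

```python
import numpy as np

# Made-up data: n = 5 observations on p = 2 variables.
x = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
n, p = x.shape

# Residual matrix [d_1 ... d_p] = x - 1_n * xbar^T (broadcasting subtracts the column means).
xbar = x.mean(axis=0)
d = x - xbar

# Sample covariance matrix with divisor n - 1.
s = d.T @ d / (n - 1)

# Lengths of the residual vectors (Equation 3.46) and cosine of the angle between d_1 and d_2.
lengths = np.sqrt(np.sum(d ** 2, axis=0))
cos_theta = (d[:, 0] @ d[:, 1]) / (lengths[0] * lengths[1])

# Sample correlation coefficient r_12 (Equation 3.49).
r12 = s[0, 1] / np.sqrt(s[0, 0] * s[1, 1])

print(cos_theta, r12)
```

The two printed values should agree up to floating-point rounding, since both are computed from the same inner products.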
3.4.2 RANDOM SAMPLES AND THE EXPECTED VALUES OF THE SAMPLE MEAN AND COVARIANCE MATRIX

The data matrix $\mathbf{x}$ is constructed from the observations, or realizations, of a $p$-variate random vector. Before taking the actual measurements, however, we cannot predict these values exactly. So each set of measurements $\mathbf{X}_i$ on $p$ variables is a random vector, and the matrix $\mathbf{X} = [\mathbf{X}_1 \ \mathbf{X}_2 \ \cdots \ \mathbf{X}_n]^T$ is actually a random matrix.

Definition 3.17 (Random sample): The $p$-variate random vectors $\mathbf{X}_i$, $i = 1, 2, \ldots, n$, are said to be a random sample taken from a common joint distribution with density function $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_p)$ if the row vectors in $\mathbf{X} = [\mathbf{X}_1 \ \mathbf{X}_2 \ \cdots \ \mathbf{X}_n]^T$, i.e., $\mathbf{X}_i^T$, $i = 1, 2, \ldots, n$, are independent observations from $f(\mathbf{x})$, i.e., the joint PDF of $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ is given by

$$\prod_{i=1}^{n} f(\mathbf{x}_i),$$

where $f(\mathbf{x}_i) = f(x_{i1}, x_{i2}, \ldots, x_{ip})$ is the PDF for the $i$th row vector.
As the sample mean vector $\bar{\mathbf{X}}$ is also a random vector, Theorem 3.5 gives its expected value and covariance matrix. Before introducing the theorem, we need the following lemma.

Lemma 3.1: For any random vector $\mathbf{Y}$ with mean vector $E(\mathbf{Y}) = \boldsymbol{\mu}$ and covariance matrix $\mathrm{cov}(\mathbf{Y}) = \boldsymbol{\Sigma}$, the following relationship holds:

$$E(\mathbf{Y}\mathbf{Y}^T) = \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}^T \tag{3.50}$$
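A short derivation of Lemma 3.1 is sketched here for convenience. It uses only the definition of the covariance matrix and the linearity of expectation, with the symbols defined in the lemma:

```latex
\begin{aligned}
\boldsymbol{\Sigma}
  &= E\left[(\mathbf{Y} - \boldsymbol{\mu})(\mathbf{Y} - \boldsymbol{\mu})^T\right] \\
  &= E(\mathbf{Y}\mathbf{Y}^T) - E(\mathbf{Y})\boldsymbol{\mu}^T
     - \boldsymbol{\mu}E(\mathbf{Y})^T + \boldsymbol{\mu}\boldsymbol{\mu}^T \\
  &= E(\mathbf{Y}\mathbf{Y}^T) - \boldsymbol{\mu}\boldsymbol{\mu}^T ,
\end{aligned}
\qquad\text{so}\qquad
E(\mathbf{Y}\mathbf{Y}^T) = \boldsymbol{\Sigma} + \boldsymbol{\mu}\boldsymbol{\mu}^T .
```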
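Theorem 3.5: Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n \in R^p$ be a random sample taken from a joint distribution with $(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. Then the sample mean vector $\bar{\mathbf{X}}$ is an unbiased estimator of $\boldsymbol{\mu}$ and its covariance matrix is $\frac{1}{n}\boldsymbol{\Sigma}$, i.e.,

$$E(\bar{\mathbf{X}}) = \boldsymbol{\mu} \tag{3.51}$$

$$\mathrm{cov}(\bar{\mathbf{X}}) = \frac{1}{n}\boldsymbol{\Sigma} \tag{3.52}$$

The sample variance-covariance matrix $\mathbf{s}_n$ defined in Equation 3.40 is an unbiased estimator for $\boldsymbol{\Sigma}$, i.e.,

$$E(\mathbf{s}_n) = \boldsymbol{\Sigma} \tag{3.53}$$

3.4.3 SOME IMPORTANT RESULTS OF SAMPLING FROM MULTIVARIATE NORMAL DISTRIBUTIONS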
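For multivariate normally distributed random vectors, the following fundamental results hold:

Theorem 3.6 (Distributions of sample means and covariance matrices): Suppose $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ are independently identically distributed (i.i.d.) random samples drawn from $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. Then:

1. The sample mean vector $\bar{\mathbf{X}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{X}_i \sim N_p(\boldsymbol{\mu}, \frac{1}{n}\boldsymbol{\Sigma})$. So

$$\sqrt{n}(\bar{\mathbf{X}} - \boldsymbol{\mu}) \sim N_p(\mathbf{0}, \boldsymbol{\Sigma}). \tag{3.54}$$

2. $(n-1)\mathbf{s}_n$ is distributed as a Wishart random matrix with d.o.f. $= n - 1$.
3. $\bar{\mathbf{X}}$ and $\mathbf{s}_n$ are independent, where

$$\mathbf{s}_n = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T.$$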
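Theorem 3.7 (Multivariate central limit theorem, Cramer): Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be i.i.d. random vectors with mean $\boldsymbol{\mu}$ and positive definite covariance matrix $\boldsymbol{\Sigma}$, and let

$$\bar{\mathbf{X}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{X}_i.$$

Then,

(1) $\sqrt{n}(\bar{\mathbf{X}} - \boldsymbol{\mu}) \to N_p(\mathbf{0}, \boldsymbol{\Sigma})$ in distribution, as $n \to \infty$.

(2) $n(\bar{\mathbf{X}} - \boldsymbol{\mu})^T \mathbf{s}_n^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}) \to \chi_p^2$ in distribution, when $n - p$ is large. (3.55)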
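3.5 THE WISHART DISTRIBUTION AND SOME PROPERTIES

In many cases, observations are collected on $k$ independent groups of sampling units. The responses are described by a multinormal random variable with mean vector $\boldsymbol{\mu}_h$ for the $h$th group and a covariance matrix $\boldsymbol{\Sigma}$ common to all groups. So the observation matrix for the $h$th group is:

$$\mathbf{x}_h = \begin{bmatrix} \mathbf{x}_{1h}^T \\ \mathbf{x}_{2h}^T \\ \vdots \\ \mathbf{x}_{n_h h}^T \end{bmatrix} = \begin{bmatrix} x_{11h} & x_{12h} & \cdots & x_{1ph} \\ x_{21h} & x_{22h} & \cdots & x_{2ph} \\ \vdots & \vdots & & \vdots \\ x_{n_h 1h} & x_{n_h 2h} & \cdots & x_{n_h ph} \end{bmatrix} \tag{3.56}$$

The maximum likelihood estimate (MLE) of $\boldsymbol{\mu}_h$ is the sample mean vector $\bar{\mathbf{x}}_h$ of the $h$th group, and the unbiased estimate of $\boldsymbol{\Sigma}$ is given by:

$$\mathbf{s} = \frac{1}{n-k} \sum_{h=1}^{k} \sum_{i=1}^{n_h} (\mathbf{x}_{ih} - \bar{\mathbf{x}}_h)(\mathbf{x}_{ih} - \bar{\mathbf{x}}_h)^T = \frac{1}{n-k} \sum_{h=1}^{k} \mathbf{A}_h \tag{3.57}$$

where $k$ is the total number of groups,

$$\mathbf{A}_h = \sum_{i=1}^{n_h} (\mathbf{x}_{ih} - \bar{\mathbf{x}}_h)(\mathbf{x}_{ih} - \bar{\mathbf{x}}_h)^T$$

is the matrix of sums of squares and cross products within the $h$th group, and

$$n = \sum_{h=1}^{k} n_h.$$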
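The pooled estimate in Equation 3.57 can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy; the function name `pooled_covariance` and the two small groups of data are hypothetical and are not taken from the text.

```python
import numpy as np

def pooled_covariance(groups):
    """Pooled within-groups covariance s of Equation 3.57.

    groups is a list of (n_h x p) observation matrices, one per group.
    Returns s and the list of within-group matrices A_h.
    """
    k = len(groups)
    n = sum(g.shape[0] for g in groups)
    a_list = []
    for g in groups:
        centered = g - g.mean(axis=0)          # rows are x_ih - xbar_h
        a_list.append(centered.T @ centered)   # A_h: sums of squares and cross products
    s = sum(a_list) / (n - k)
    return s, a_list

# Hypothetical data: k = 2 groups with n_1 = 3 and n_2 = 4 observations on p = 2 variables.
groups = [np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 3.0]]),
          np.array([[5.0, 1.0], [6.0, 2.0], [7.0, 0.0], [8.0, 1.0]])]
s, a_list = pooled_covariance(groups)
print(s)
```

Each $\mathbf{A}_h$ equals $(n_h - 1)$ times the ordinary sample covariance matrix of group $h$, so the result can be cross-checked against `np.cov` applied group by group.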
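Remark 3.17: From the definitions of $\mathbf{A}$ and $\mathbf{A}_h$, the single-sample matrix $\mathbf{A}$ and the within-groups matrix $\sum_h \mathbf{A}_h$ can be expressed as the sums of products of $n - 1$ and $n - k$ independent $p$-variate random vectors, respectively, with the common distribution $N_p(\mathbf{0}, \boldsymbol{\Sigma})$. In general, any symmetric positive definite matrix $\mathbf{A} \in R^{p \times p}$ of quadratic and bilinear forms which can be transformed to the sum:

$$\sum_{i=1}^{n} \mathbf{Y}_i \mathbf{Y}_i^T, \tag{3.58}$$

with $p$-variate vectors $\mathbf{Y}_i$ independently distributed according to the distribution $N_p(\mathbf{0}, \boldsymbol{\Sigma})$, is said to have the Wishart distribution [13]. The density function of a positive definite $\mathbf{A}$ with d.o.f. $= n$ is given by:

$$W_n(\mathbf{A} \mid \boldsymbol{\Sigma}) = \frac{|\mathbf{A}|^{(n-p-1)/2} \exp\left(-\frac{1}{2} \mathrm{tr}(\mathbf{A}\boldsymbol{\Sigma}^{-1})\right)}{2^{np/2} \, \pi^{p(p-1)/4} \, |\boldsymbol{\Sigma}|^{n/2} \prod_{i=1}^{p} \Gamma\left(\frac{1}{2}(n + 1 - i)\right)} \tag{3.59}$$

where $\Gamma(\cdot)$ is the gamma function. If $\mathbf{A}$ is not positive definite, then $W_n(\mathbf{A} \mid \boldsymbol{\Sigma}) = 0$. In Equation 3.59, the d.o.f. $n$ equals the sample size minus 1 for a single-sample matrix, and the total sample size minus $k$ for the $k$-sample within-groups matrix. The d.o.f. $n$ and $\boldsymbol{\Sigma}$ specify the form of the density function.

Remark 3.18: If $p = 1$ and $\boldsymbol{\Sigma} = 1$, then the Wishart density reduces to that of $\chi_n^2$.

Wishart-distributed matrices share many properties with $\chi^2$ variates. Some properties are summarized as follows:

Property 1: If $\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_k$ are independent Wishart-distributed matrices with the common parameter matrix $\boldsymbol{\Sigma}$, i.e., $\mathbf{A}_i \sim W_{n_i}(\mathbf{A}_i \mid \boldsymbol{\Sigma})$ for $i = 1, 2, \ldots, k$, then

$$\sum_{i=1}^{k} \mathbf{A}_i \sim W_n\left(\sum_{i=1}^{k} \mathbf{A}_i \,\Big|\, \boldsymbol{\Sigma}\right), \quad \text{where } n = \sum_{i=1}^{k} n_i.$$

Property 2: If $\mathbf{A} \sim W_n(\mathbf{A} \mid \boldsymbol{\Sigma})$, then $\mathbf{C}\mathbf{A}\mathbf{C}^T \sim W_n(\mathbf{C}\mathbf{A}\mathbf{C}^T \mid \mathbf{C}\boldsymbol{\Sigma}\mathbf{C}^T)$, where $\mathbf{C}$ is a constant matrix of proper order.

Property 3: The sample mean vector $\bar{\mathbf{x}}$ and the matrix of sums of squares and cross products

$$\mathbf{A} = \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T$$

computed from the same sample are independently distributed.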
3.6 EXERCISES

1. A transportation company gives the list of ticket prices in dollars (x1) for different travel distances in miles (x2) as follows:

x1:   2    6    6    8    8   10   12   15   17   20
x2:   4    7   10   15   20   25   35   45   55   70

a. Construct a scatter plot of the data.
b. Calculate the mean of the ticket price x1 and of the travel distance x2.
c. By looking at the scatter plot, what will the sign of the covariance s12 be?
d. Calculate the variances s11 and s22.
e. Calculate the covariance s12 and the correlation coefficient r12. Explain the physical meaning of these values.

2. Suppose two variables x1 and x2 are measured and shown to be the following:

x1:   2    5    3    9    6    1    8    4    3    7
x2:   4    6    5   13    7    3   11    5    4    9

a. Construct a scatter plot.
b. Express $\mathbf{x}$ as $\mathbf{x}^T = \begin{bmatrix} \tilde{\mathbf{x}}_1^T \\ \tilde{\mathbf{x}}_2^T \end{bmatrix}$, where $\tilde{\mathbf{x}}_j = [x_{1j} \ x_{2j} \ \cdots \ x_{10j}]^T$, $j = 1, 2$.
c. Determine $\mathbf{d}_j = \tilde{\mathbf{x}}_j - \bar{x}_j \mathbf{1}_n$ for $j = 1, 2$, where $n = 10$ in this case. $\mathbf{d}_j$ is called the associated residual vector.
d. Find the length of each residual vector $\mathbf{d}_j$ for $j = 1, 2$.
e. Find the angle between the two residual vectors $\mathbf{d}_1$ and $\mathbf{d}_2$.
f. Verify that $\cos(\theta_{12}) = r_{12}$.

3. The measurements of two quality characteristics are as follows:

Time:   1     2     3     4     5     6     7     8     9    10
x1:    4.5    2    2.5    4    3.5   4.5    3     4     5     3
x2:     3    1.5    2     3    2.5    4     2    3.5    4    2.5

a. Find the correlation coefficient r between x1 and x2.
b. Find the covariance matrix S for x1 and x2, and determine the eigenvalues λ1 and λ2 and the eigenvectors {u1, u2} of S.
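c. Use the preceding eigenvectors to form the transformation matrix $\mathbf{Q}$. Then the variables x1 and x2 can be transformed into uncorrelated variables z1 and z2 by using

$$\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \mathbf{Q}^T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$

Calculate the corresponding values of z1 for all time steps.

4. Suppose

$$\mathbf{X} = \begin{bmatrix} 3 & 5 & 4 \\ 9 & 7 & 2 \\ 3 & 6 & 3 \end{bmatrix}, \quad \mathbf{a}^T = [3 \ \ {-1} \ \ 2], \quad \mathbf{b}^T = [2 \ \ 4 \ \ {-2}], \quad \text{and} \quad \mathbf{c}^T = [-3 \ \ {-1} \ \ 1].$$

There are p = 3 variables X1, X2, and X3. Each variable has n = 3 observations.

a. Find the sample means of $\mathbf{a}^T\mathbf{X}$, $\mathbf{b}^T\mathbf{X}$, and $\mathbf{c}^T\mathbf{X}$.
b. Find the variances of $\mathbf{a}^T\mathbf{X}$, $\mathbf{b}^T\mathbf{X}$, and $\mathbf{c}^T\mathbf{X}$.
c. Calculate the covariances of $\mathbf{a}^T\mathbf{X}$ and $\mathbf{b}^T\mathbf{X}$, $\mathbf{b}^T\mathbf{X}$ and $\mathbf{c}^T\mathbf{X}$, and $\mathbf{a}^T\mathbf{X}$ and $\mathbf{c}^T\mathbf{X}$.

5. Suppose the random vector $\mathbf{X}^T = [X_1, X_2, X_3, X_4]$ has $\boldsymbol{\mu}_X^T = [5, 1, 4, 6]$ and covariance matrix

$$\boldsymbol{\Sigma}_X = \begin{bmatrix} 1 & 2 & 3 & -4 \\ 2 & 3 & 0 & 2 \\ 3 & 0 & 7 & 0 \\ -4 & 2 & 0 & 8 \end{bmatrix}$$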
X is partitioned as X1 X1 X = X2 = X3 X 2 X4 Suppose 2 A = and B = 1 1
2
−2
Determine the following:

a. E(X1) and E(X2).
b. E(AX1) and E(BX2).
c. cov(X1) and cov(X2).
d. cov(AX1) and cov(BX2).
e. cov(X2, X1).
f. cov(BX2, AX1).

6. Suppose x1 and x2 are two random measurements from a process and each follows a Gaussian distribution, x1 ~ N(2,1) and x2 ~ N(5,4). Suppose also that x1 and x2 are not independent and have covariance cov(x1,x2) = 1.5. Calculate:

a. The mean vector and covariance matrix of the 2-D random vector x ≡ [x1 x2]T.
b. The joint probability density function of x and the marginal probability density functions of x1 and x2, respectively.
c. Draw the equidensity contour of x (i.e., the curves over which pdf(x) assumes a constant value) and the pdfs of x1 and x2 on the same figure.
d. If cov(x1,x2) = 0, repeat (a) to (c).
References

1. Kshirsagar, A.M., Multivariate Analysis, Marcel Dekker, New York, 1972.
2. Mardia, K.V. et al., Multivariate Analysis, Academic Press, London, 1979.
3. Kleinbaum, D.G. et al., Applied Regression Analysis and Other Multivariable Methods, 2nd ed., PWS-Kent Publishing Company, Boston, MA, 1987.
4. Hair, J.F. et al., Multivariate Data Analysis, 3rd ed., Macmillan Publishing Company, New York, 1992.
5. Muirhead, R.J., Aspects of Multivariate Statistical Theory, John Wiley & Sons, New York, 1982.
6. Karson, M.J., Multivariate Statistical Methods, The Iowa State University Press, Ames, IA, 1982.
7. Morrison, D.F., Multivariate Statistical Methods, 2nd ed., McGraw-Hill, New York, 1976.
8. Takeuchi, K. et al., The Foundations of Multivariate Analysis, Wiley Eastern Limited, New Delhi, India, 1982.
9. Srivastava, M.S., Methods of Multivariate Statistics, John Wiley & Sons, New York, 2002.
10. Manly, B.F.J., Multivariate Statistical Methods: A Primer, Chapman and Hall, London, 1986.
11. The MathWorks, Getting Started with Matlab Version 7, The MathWorks, Inc., Natick, MA, 2006.
12. SAS, SAS 9.1 Documentation, http://support.sas.com/documentation/index.html, SAS Institute, Inc., Cary, NC, 2006.
13. Wishart, J., The generalized product moment distribution in samples from a normal multivariate population, Biometrika, 20A, 32–52, 1928.
4
Statistical Inferences on Mean Vectors and Linear Models
4.1 STATISTICAL INFERENCES ON MEAN VECTORS

In this section, we discuss systematic approaches to making inferences about mean vectors based on the sampling information from a multivariate distribution. The necessary assumption is that the data follow a multivariate normal distribution.

4.1.1 HOTELLING’S $T^2$ TEST

Let $\mathbf{X}_i$, $i = 1, 2, \ldots, n$, be a sequence of independently identically distributed (i.i.d.) $p$-dimensional random vectors distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, and let $\mathbf{x}_i$, $i = 1, 2, \ldots, n$, be their associated realizations. In a multivariate hypothesis testing problem, we test whether a given vector $\boldsymbol{\mu}_0 \in R^p$ is the true mean vector of the multivariate normal distribution. The squared distance in the multivariate case is defined as:

$$T^2 = (\bar{\mathbf{X}} - \boldsymbol{\mu}_0)^T \left(\frac{1}{n}\mathbf{S}\right)^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}_0) = n(\bar{\mathbf{X}} - \boldsymbol{\mu}_0)^T \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}_0), \tag{4.1}$$

where $\boldsymbol{\mu}_0 = [\mu_{10} \ \mu_{20} \ \cdots \ \mu_{p0}]^T \in R^p$, and $\bar{\mathbf{X}}$ and $\mathbf{S}$ are the sample mean vector and sample covariance matrix, respectively, which are defined as:

$$\bar{\mathbf{X}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{X}_i \in R^p, \quad \mathbf{S} = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T \in R^{p \times p}$$

Remark 4.1: In Equation 4.1, $\frac{1}{n}\mathbf{S}$ is the estimate of $\mathrm{cov}(\bar{\mathbf{X}})$.

Remark 4.2: The statistic $T^2$ defined in Equation 4.1 is called Hotelling’s $T^2$ in honor of Harold Hotelling, and is distributed as $\frac{(n-1)p}{n-p} F_{p,n-p}$ when $\boldsymbol{\mu} = \boldsymbol{\mu}_0$ holds. Its sampling distribution is as follows:
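For a random sample $\mathbf{x}_i$, $i = 1, 2, \ldots, n$, drawn from an $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ population, no matter what the true $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are,

$$\Pr\left[T^2 > \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)\right] = \Pr\left[n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) > \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)\right] = \alpha \tag{4.2}$$

where $F_{p,n-p}(\alpha)$ is the upper $(100\alpha)$th percentile of the F-distribution with degrees of freedom (d.o.f.) $p$ and $n - p$.

Remark 4.3 (Property of the $T^2$ statistic): It is easy to verify that the $T^2$ statistic is invariant to the scale of the observations, i.e., for the nonsingular transformation

$$\mathbf{Y} = \mathbf{C}\mathbf{X} + \mathbf{d}$$

$T^2$ is identical for both $\mathbf{X}$ and $\mathbf{Y}$ (with $\boldsymbol{\mu}_0$ transformed accordingly to $\mathbf{C}\boldsymbol{\mu}_0 + \mathbf{d}$), where $\mathbf{C}$ and $\mathbf{d}$ are a constant nonsingular matrix and a constant vector, respectively, with proper dimensions.

For the multivariate hypothesis testing problem, we have the following result.

Theorem 4.1: For the multivariate hypothesis testing problem with estimated covariance matrix $\mathbf{S}$:

$$H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0, \quad H_1: \boldsymbol{\mu} \neq \boldsymbol{\mu}_0 \tag{4.3}$$

$H_0$ is rejected if

$$T^2 = n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}_0) > \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha) \tag{4.4}$$

Remark 4.4: The preceding Hotelling’s $T^2$ test is based on the assumption that the covariance matrix $\boldsymbol{\Sigma}$ is unknown. If $\boldsymbol{\Sigma}$ is known, the statistic becomes:

$$T^2 = n(\bar{\mathbf{X}} - \boldsymbol{\mu}_0)^T \boldsymbol{\Sigma}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu}_0), \tag{4.5}$$

where $T^2$ follows a chi-square distribution with d.o.f. $p$ under $H_0$. This also holds approximately with $\mathbf{S}$ in place of $\boldsymbol{\Sigma}$ when the number of samples is large, because then $\mathbf{S} \approx \boldsymbol{\Sigma}$.

Remark 4.5: Because $p$ separate univariate tests do not account for the dependency between the variables in a multivariate measurement, a multivariate hypothesis should not be tested using $p$ separate univariate tests; doing so can lead to wrong conclusions.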
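The test in Theorem 4.1 and the invariance property in Remark 4.3 can be sketched as follows. This is a minimal illustration assuming NumPy; the function names `hotelling_t2` and `reject_h0`, the data, the hypothesized mean, and the transformation are all made up. The critical value $F_{p,n-p}(\alpha)$ is passed in as an argument and is assumed to be read from a standard F table.

```python
import numpy as np

def hotelling_t2(x, mu0):
    """T^2 = n (xbar - mu0)^T S^{-1} (xbar - mu0) for an (n x p) data matrix x, p >= 2."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    diff = x.mean(axis=0) - np.asarray(mu0, dtype=float)
    s = np.cov(x, rowvar=False)                 # sample covariance, divisor n - 1
    return float(n * diff @ np.linalg.solve(s, diff))

def reject_h0(t2, n, p, f_crit):
    """Decision rule of Equation 4.4; f_crit is F_{p,n-p}(alpha) from a table."""
    return bool(t2 > (n - 1) * p / (n - p) * f_crit)

# Made-up data: n = 6 observations on p = 2 variables, and a hypothesized mean.
x = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 3.5],
              [4.0, 3.0], [5.0, 5.5], [6.0, 4.5]])
mu0 = np.array([3.0, 3.0])
t2_x = hotelling_t2(x, mu0)

# Invariance check: Y = C X + d with a nonsingular C, and mu0 mapped to C mu0 + d.
C = np.array([[2.0, 1.0], [0.0, 3.0]])
d = np.array([1.0, -1.0])
y = x @ C.T + d
t2_y = hotelling_t2(y, C @ mu0 + d)

# 6.94 is the tabulated F_{2,4}(0.05).
print(t2_x, t2_y, reject_h0(t2_x, n=6, p=2, f_crit=6.94))
```

The sketch solves the linear system $\mathbf{S}\mathbf{v} = \bar{\mathbf{x}} - \boldsymbol{\mu}_0$ rather than forming $\mathbf{S}^{-1}$ explicitly, which is a common numerical choice and not something prescribed by the text.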
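Remark 4.6: It can be proved [1] that:

$$T^2 = \frac{(n-1)\,|\hat{\boldsymbol{\Sigma}}_0|}{|\hat{\boldsymbol{\Sigma}}|} - (n-1) = (n-1) \frac{\left|\sum_{i=1}^{n} (\mathbf{X}_i - \boldsymbol{\mu}_0)(\mathbf{X}_i - \boldsymbol{\mu}_0)^T\right|}{\left|\sum_{i=1}^{n} (\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T\right|} - (n-1). \tag{4.6}$$

Equation 4.6 facilitates the calculation of $T^2$ by avoiding the calculation of $\mathbf{S}^{-1}$.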
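As a numerical check of Equation 4.6, the sketch below computes $T^2$ both from the definition in Equation 4.1 and from the ratio of determinants. It assumes NumPy; the data and the hypothesized mean `mu0` are made up for illustration.

```python
import numpy as np

# Made-up data: n = 6 observations on p = 2 variables, and a hypothesized mean.
x = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 3.5],
              [4.0, 3.0], [5.0, 5.5], [6.0, 4.5]])
mu0 = np.array([3.0, 3.0])
n = x.shape[0]
xbar = x.mean(axis=0)

# Direct form (Equation 4.1): n (xbar - mu0)^T S^{-1} (xbar - mu0).
s = np.cov(x, rowvar=False)
t2_direct = n * (xbar - mu0) @ np.linalg.solve(s, xbar - mu0)

# Determinant form (Equation 4.6): sums of squares and cross products
# about mu0 (numerator) and about the sample mean (denominator).
sscp_mu0 = (x - mu0).T @ (x - mu0)
sscp_xbar = (x - xbar).T @ (x - xbar)
t2_det = (n - 1) * np.linalg.det(sscp_mu0) / np.linalg.det(sscp_xbar) - (n - 1)

print(t2_direct, t2_det)
```

Both values should coincide up to rounding error.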
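4.1.2 CONFIDENCE REGIONS AND SIMULTANEOUS CONFIDENCE INTERVALS

4.1.2.1 Confidence Regions

Definition 4.1 (Confidence region): Let $\boldsymbol{\theta} \in \Theta$, where $\Theta$ is the parameter space. Given the data matrix $\mathbf{x} = [\mathbf{x}_1 \ \mathbf{x}_2 \ \cdots \ \mathbf{x}_n]^T$ and the significance level $\alpha$, before the sample is selected, the $100(1-\alpha)\%$ confidence region $CR(\mathbf{x})$ is defined by

$$1 - \alpha = \Pr(\boldsymbol{\theta} \in CR(\mathbf{x})) = \Pr(CR(\mathbf{x}) \text{ will contain the true } \boldsymbol{\theta}) \tag{4.7}$$

which is calculated under the unknown true parameter $\boldsymbol{\theta}$.

For example, before the sample is selected, from Equation 4.2 and Equation 4.7, we have

$$1 - \alpha = \Pr\{\boldsymbol{\mu} \in CR(\mathbf{x})\} = \Pr\left\{n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \leq \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)\right\} \tag{4.8}$$

regardless of the values of the unknown $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$. Therefore, for the $p$-dimensional normal case, $CR(\mathbf{x})$ can be determined by the following theorem.

Theorem 4.2 (Confidence region): Given sample data $\mathbf{x}_i$, $i = 1, 2, \ldots, n$, a $100(1-\alpha)\%$ confidence region for the mean of a $p$-dimensional normal distribution is the space within the ellipsoid centered at $\bar{\mathbf{x}}$, i.e.,

$$CR(\mathbf{x}): \ n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \leq \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha) = T_\alpha^2 \tag{4.9}$$

where

$$\bar{\mathbf{x}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i \in R^p, \quad \mathbf{S} = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T \in R^{p \times p}.$$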
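Remark 4.7: To determine whether a given $\boldsymbol{\mu} \in CR(\mathbf{x})$, we need to compare the values of $n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu})$ and

$$T_\alpha^2 = \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha).$$

If $n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \leq T_\alpha^2$, then $\boldsymbol{\mu} \in CR(\mathbf{x})$; otherwise, $\boldsymbol{\mu} \notin CR(\mathbf{x})$.

Remark 4.8: The application of the confidence region $CR(\mathbf{x})$ defined in Equation 4.9 is equivalent to testing $H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0$, $H_1: \boldsymbol{\mu} \neq \boldsymbol{\mu}_0$, because $CR(\mathbf{x})$ contains all possible $\boldsymbol{\mu}_0$ vectors for which the $T^2$-test fails to reject $H_0$ at significance level $\alpha$.

Remark 4.9: The joint confidence region for $\boldsymbol{\mu}$ cannot be graphed when the dimension $p \geq 4$. It can nevertheless be characterized by the directions and lengths of the axes of the ellipsoid defined in Equation 4.9, with the scaling factor

$$\frac{T_\alpha}{\sqrt{n}} \sqrt{\lambda_i}, \quad i = 1, 2, \ldots, p,$$

along the eigenvectors $\mathbf{e}_i$, where $\lambda_i$ and $\mathbf{e}_i$ are the eigenvalues and associated eigenvectors of $\mathbf{S}$.

The procedure for determining the ellipsoid involves the following steps:

1. Determine $\bar{\mathbf{x}}$, the center of the ellipsoid;
2. Determine the axes of the ellipsoid:

$$\pm \sqrt{\lambda_i} \sqrt{\frac{p(n-1)}{n(n-p)} F_{p,n-p}(\alpha)} \ \mathbf{e}_i, \quad i = 1, 2, \ldots, p \tag{4.10}$$

where the $\lambda_i$’s and $\mathbf{e}_i$’s are calculated by solving $\mathbf{S}\mathbf{e}_i = \lambda_i \mathbf{e}_i$.

From Equation 4.10, the half-lengths of the axes are

$$\sqrt{\lambda_i} \sqrt{\frac{p(n-1)}{n(n-p)} F_{p,n-p}(\alpha)} \quad \text{for } i = 1, 2, \ldots, p. \tag{4.11}$$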
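So the elongation of the confidence ellipsoid is given by the ratio of the lengths of the axes, i.e.,

$$\frac{2\sqrt{\lambda_i} \sqrt{\frac{p(n-1)}{n(n-p)} F_{p,n-p}(\alpha)}}{2\sqrt{\lambda_j} \sqrt{\frac{p(n-1)}{n(n-p)} F_{p,n-p}(\alpha)}} = \frac{\sqrt{\lambda_i}}{\sqrt{\lambda_j}}, \quad i \neq j \tag{4.12}$$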
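The two-step procedure for the confidence ellipsoid (Equation 4.10 and Equation 4.11) can be sketched with an eigendecomposition of $\mathbf{S}$. The following assumes NumPy; the function name `confidence_ellipsoid_axes` and the data are hypothetical, and the critical value $F_{p,n-p}(\alpha)$ is assumed to be supplied from a standard F table.

```python
import numpy as np

def confidence_ellipsoid_axes(x, f_crit):
    """Center, half-lengths (Equation 4.11), and axis directions of the region in Equation 4.9.

    x is an (n x p) data matrix; f_crit is F_{p,n-p}(alpha) taken from a table.
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    center = x.mean(axis=0)
    s = np.cov(x, rowvar=False)
    # eigh handles symmetric matrices: eigenvalues ascending, eigenvectors as columns.
    eigvals, eigvecs = np.linalg.eigh(s)
    scale = np.sqrt(p * (n - 1) * f_crit / (n * (n - p)))
    half_lengths = np.sqrt(eigvals) * scale
    return center, half_lengths, eigvecs

# Made-up data: n = 6 observations on p = 2 variables; 6.94 is the tabulated F_{2,4}(0.05).
x = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 3.5],
              [4.0, 3.0], [5.0, 5.5], [6.0, 4.5]])
f_crit = 6.94
center, half_lengths, directions = confidence_ellipsoid_axes(x, f_crit)

# Elongation (Equation 4.12): ratio of the longest to the shortest axis.
elongation = half_lengths[-1] / half_lengths[0]
print(center, half_lengths, elongation)
```

Each axis endpoint, `center ± half_lengths[i] * directions[:, i]`, lies exactly on the boundary of the region in Equation 4.9, which gives a simple way to validate the computation.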
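4.1.2.2 Simultaneous Confidence Intervals

Based on the matrix theory presented in Chapter 2, it can be shown that

$$\max_{\mathbf{a} \neq \mathbf{0}} \frac{[\mathbf{a}^T(\bar{\mathbf{x}} - \boldsymbol{\mu})]^2}{\mathbf{a}^T \mathbf{S} \mathbf{a}} = (\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \tag{4.13}$$

Thus, Equation 4.8 is equivalent to:

$$1 - \alpha = \Pr\{\boldsymbol{\mu} \in CR(\mathbf{x})\} = \Pr\left\{n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \leq T_\alpha^2\right\} = \Pr\left\{n \frac{[\mathbf{a}^T(\bar{\mathbf{x}} - \boldsymbol{\mu})]^2}{\mathbf{a}^T \mathbf{S} \mathbf{a}} \leq T_\alpha^2, \ \forall \mathbf{a} \in R^p, \ \mathbf{a} \neq \mathbf{0}\right\} \tag{4.14}$$

where

$$T_\alpha^2 = \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha).$$

Hence, Equation 4.14 becomes

$$\mathbf{a}^T \bar{\mathbf{x}} - T_\alpha \sqrt{\frac{\mathbf{a}^T \mathbf{S} \mathbf{a}}{n}} \leq \mathbf{a}^T \boldsymbol{\mu} \leq \mathbf{a}^T \bar{\mathbf{x}} + T_\alpha \sqrt{\frac{\mathbf{a}^T \mathbf{S} \mathbf{a}}{n}}, \quad \forall \mathbf{a} \in R^p \text{ and } \mathbf{a} \neq \mathbf{0} \tag{4.15}$$

This region consists of all plausible values of $\boldsymbol{\mu}$ at the significance level $\alpha$. In summary, we have the following theorem:

Theorem 4.3 (Simultaneous intervals, or $T^2$-intervals): Suppose $\mathbf{x}_i$, $i = 1, 2, \ldots, n$, is a random sample drawn from an $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ population with $\boldsymbol{\Sigma}$ positive definite. Then simultaneously, for all $\mathbf{a}$, the interval:

$$\left(\mathbf{a}^T \bar{\mathbf{x}} - T_\alpha \sqrt{\frac{\mathbf{a}^T \mathbf{S} \mathbf{a}}{n}}, \ \ \mathbf{a}^T \bar{\mathbf{x}} + T_\alpha \sqrt{\frac{\mathbf{a}^T \mathbf{S} \mathbf{a}}{n}}\right) \tag{4.16}$$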
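will contain $\mathbf{a}^T \boldsymbol{\mu}$ with probability $1 - \alpha$, where

$$T_\alpha^2 = \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha).$$

The confidence interval in Equation 4.16 is called Roy’s confidence interval. This theorem states that the confidence coefficient $1 - \alpha$ remains unchanged for any $\mathbf{a} \in R^p$. Some special cases are discussed in the following remarks:

Remark 4.10: Construct $100(1-\alpha)\%$ simultaneous confidence intervals for $\mu_i$, $i = 1, 2, \ldots, p$. If we select $\mathbf{a}^T$ in the following forms:

$$[1 \ \ 0 \ \ \cdots \ \ 0], \quad [0 \ \ 1 \ \ \cdots \ \ 0], \quad \ldots, \quad [0 \ \ 0 \ \ \cdots \ \ 1],$$

then the $T^2$-intervals become:

$$\left(\bar{x}_i - T_\alpha \sqrt{\frac{s_{ii}}{n}}, \ \ \bar{x}_i + T_\alpha \sqrt{\frac{s_{ii}}{n}}\right), \quad \text{for } i = 1, 2, \ldots, p \tag{4.17}$$

All intervals hold simultaneously with confidence coefficient $1 - \alpha$.

Remark 4.11: Construct $100(1-\alpha)\%$ confidence intervals for the differences $\mu_i - \mu_j$, $i \neq j$, $i, j = 1, 2, \ldots, p$. If we select $\mathbf{a}^T = [0 \ \cdots \ 0 \ \ a_i \ \ 0 \ \cdots \ 0 \ \ a_j \ \ 0 \ \cdots \ 0]$, where $a_i = 1$ and $a_j = -1$, then $\mathbf{a}^T \mathbf{S} \mathbf{a} = s_{ii} - 2s_{ij} + s_{jj}$. So the $T^2$-intervals become:

$$\left((\bar{x}_i - \bar{x}_j) - T_\alpha \sqrt{\frac{s_{ii} - 2s_{ij} + s_{jj}}{n}}, \ \ (\bar{x}_i - \bar{x}_j) + T_\alpha \sqrt{\frac{s_{ii} - 2s_{ij} + s_{jj}}{n}}\right). \tag{4.18}$$

It holds with confidence $1 - \alpha$.

Remark 4.12: Construct $100(1-\alpha)\%$ confidence ellipses for pairs of means. $(\mu_i, \mu_j)$ is within the sample mean-centered ellipse

$$n \begin{bmatrix} \bar{x}_i - \mu_i & \bar{x}_j - \mu_j \end{bmatrix} \begin{bmatrix} s_{ii} & s_{ij} \\ s_{ij} & s_{jj} \end{bmatrix}^{-1} \begin{bmatrix} \bar{x}_i - \mu_i \\ \bar{x}_j - \mu_j \end{bmatrix} \leq \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha) = T_\alpha^2 \tag{4.19}$$

It holds with confidence $1 - \alpha$.

Remark 4.13: If we want to obtain $100(1-\alpha)\%$ simultaneous confidence intervals for $\mathbf{a}_i^T \boldsymbol{\mu}$, $i = 1, 2, \ldots, k$, and if $k$ is small, then Roy’s confidence intervals defined by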
Equation 4.16 will be too wide. In that case, where the number $k$ of linear combinations of interest is small, the Bonferroni method gives narrower and more precise intervals. More discussion of the Bonferroni confidence interval can be found in Reference 1.
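The $T^2$-intervals of Theorem 4.3 and the special cases in Remark 4.10 and Remark 4.11 can be sketched as follows. This is an illustration under stated assumptions: NumPy is used, the function name `t2_interval` and the data are made up, and the critical value $F_{p,n-p}(\alpha)$ is taken from a standard F table rather than computed.

```python
import numpy as np

def t2_interval(a, xbar, s, n, t_alpha_sq):
    """Roy's interval (Equation 4.16) for the linear combination a^T mu."""
    a = np.asarray(a, dtype=float)
    center = a @ xbar
    half_width = np.sqrt(t_alpha_sq * (a @ s @ a) / n)
    return center - half_width, center + half_width

# Made-up data: n = 6 observations on p = 2 variables.
x = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 3.5],
              [4.0, 3.0], [5.0, 5.5], [6.0, 4.5]])
n, p = x.shape
xbar = x.mean(axis=0)
s = np.cov(x, rowvar=False)

# T_alpha^2 = (n - 1) p / (n - p) * F_{p,n-p}(alpha); 6.94 is the tabulated F_{2,4}(0.05).
t_alpha_sq = (n - 1) * p / (n - p) * 6.94

interval_mu1 = t2_interval([1.0, 0.0], xbar, s, n, t_alpha_sq)    # Equation 4.17, i = 1
interval_mu2 = t2_interval([0.0, 1.0], xbar, s, n, t_alpha_sq)    # Equation 4.17, i = 2
interval_diff = t2_interval([1.0, -1.0], xbar, s, n, t_alpha_sq)  # Equation 4.18
print(interval_mu1, interval_mu2, interval_diff)
```

Because the same `t_alpha_sq` is used for every choice of `a`, all intervals produced this way hold simultaneously, which is also why they are wider than intervals built for a small, fixed set of combinations.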
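4.1.3 INFERENCE ON A POPULATION MEAN VECTOR WITH LARGE SAMPLE SIZE

So far, the hypothesis tests and confidence regions for $\boldsymbol{\mu}$ have been constructed based on the assumption of a normal population. When the sample size $n$ is large, however, the normality assumption can be relaxed, even if the population departs seriously from normality. All large-sample inferences on $\boldsymbol{\mu}$ are based on a chi-square distribution. Because $n(\bar{\mathbf{X}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{X}} - \boldsymbol{\mu})$ is approximately $\chi_p^2$ with d.o.f. $p$ when the sample size $n$ is large enough,

$$\Pr\left\{n(\bar{\mathbf{x}} - \boldsymbol{\mu})^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}) \leq \chi_p^2(\alpha)\right\} \approx 1 - \alpha, \tag{4.20}$$

where $\chi_p^2(\alpha)$ is the upper $(100\alpha)$th percentile of the $\chi_p^2$ distribution. From Equation 4.20, we can obtain the hypothesis test and simultaneous confidence regions for the large-sample case.

Theorem 4.4: Suppose $\mathbf{x}_i$, $i = 1, 2, \ldots, n$, is a random sample drawn from a population with mean vector $\boldsymbol{\mu}$ and positive definite covariance matrix $\boldsymbol{\Sigma}$, and assume $n - p$ is large. Then we have the following:

1. Hypothesis testing: The test of $H_0: \boldsymbol{\mu} = \boldsymbol{\mu}_0$ against $H_1: \boldsymbol{\mu} \neq \boldsymbol{\mu}_0$ rejects $H_0$ at a level of significance of approximately $\alpha$ if

$$n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)^T \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}_0) > \chi_p^2(\alpha), \tag{4.21}$$

where $\chi_p^2(\alpha)$ is the upper $(100\alpha)$th percentile of the $\chi_p^2$ distribution.

2. Simultaneous confidence regions: For any $\mathbf{a} \in R^p$, the confidence intervals

$$\left(\mathbf{a}^T \bar{\mathbf{x}} - \sqrt{\chi_p^2(\alpha)} \cdot \sqrt{\frac{\mathbf{a}^T \mathbf{S} \mathbf{a}}{n}}, \ \ \mathbf{a}^T \bar{\mathbf{x}} + \sqrt{\chi_p^2(\alpha)} \cdot \sqrt{\frac{\mathbf{a}^T \mathbf{S} \mathbf{a}}{n}}\right) \tag{4.22}$$

will contain $\mathbf{a}^T \boldsymbol{\mu}$ with probability approximately $1 - \alpha$.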
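Remark 4.14: As in the remarks following Theorem 4.3, the $100(1-\alpha)\%$ simultaneous confidence intervals for $\mu_i$ are:

$$\left(\bar{x}_i - \sqrt{\chi_p^2(\alpha)} \cdot \sqrt{\frac{s_{ii}}{n}}, \ \ \bar{x}_i + \sqrt{\chi_p^2(\alpha)} \cdot \sqrt{\frac{s_{ii}}{n}}\right), \quad \text{for } i = 1, 2, \ldots, p \tag{4.23}$$

Remark 4.15: $100(1-\alpha)\%$ confidence ellipses for pairs of means $(\mu_i, \mu_j)$ can be constructed by:

$$\left\{(\mu_i, \mu_j): \ n \begin{bmatrix} \bar{x}_i - \mu_i & \bar{x}_j - \mu_j \end{bmatrix} \begin{bmatrix} s_{ii} & s_{ij} \\ s_{ij} & s_{jj} \end{bmatrix}^{-1} \begin{bmatrix} \bar{x}_i - \mu_i \\ \bar{x}_j - \mu_j \end{bmatrix} \leq \chi_p^2(\alpha)\right\}. \tag{4.24}$$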
4.1.4 MULTIVARIATE QUALITY CONTROL CHARTS

Quality is of great importance in any industry. To improve the quality and performance of products and services, the process should be stable, with no special causes of variation present. A quality control chart can be effectively employed to continuously monitor the process and help identify special causes of variation. A typical control chart is composed of data (e.g., individual observations, or sample means) plotted in time order and control limits that indicate the amount of variation due to common causes.

A univariate control chart is used to monitor an individual characteristic in a process. When more than one important feature must be monitored, however, separate univariate charts are inadequate because they ignore the correlation between these features. Hence, it is necessary to develop multivariate control charts to monitor process stability.

The multivariate control chart has two main tasks. One is to monitor the stability of a process with multivariate observations; the other is to set a control region for future observations. $T^2$ control charts will be introduced in this section.

4.1.4.1 Control Charts for One Multivariate Sample

Two useful control charts are described in the following text:

1. $T^2$ control chart to monitor stability

Suppose $\mathbf{x}_i$, $i = 1, 2, \ldots, n$, is a random sample drawn from a normal population $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. After simple calculation,

$$\mathbf{x}_i - \bar{\mathbf{x}} = \mathbf{x}_i - \frac{1}{n} \sum_{j=1}^{n} \mathbf{x}_j = \left(1 - \frac{1}{n}\right) \mathbf{x}_i - \frac{1}{n} \sum_{j=1, j \neq i}^{n} \mathbf{x}_j.$$
It is easy to verify that

$$E(\mathbf{x}_i - \bar{\mathbf{x}}) = \mathbf{0} \quad \text{and} \quad \mathrm{cov}(\mathbf{x}_i - \bar{\mathbf{x}}) = \left(1 - \frac{1}{n}\right)^2 \boldsymbol{\Sigma} + \frac{1}{n^2} \cdot (n-1) \boldsymbol{\Sigma} = \frac{n-1}{n} \boldsymbol{\Sigma}.$$

Thus, $\mathbf{x}_i - \bar{\mathbf{x}}$ is also multivariate normally distributed with zero mean vector, but it is not independent of the sample covariance matrix $\mathbf{S}$. A chi-square distribution can be used to set the control limits. A $T^2$ control chart can be obtained through the following three steps:

Step 1: Construct control limits

Set up the $100(1-\alpha)\%$ control limits:

$$UCL = \chi_p^2(\alpha), \quad LCL = 0 \tag{4.25}$$

where $p$ is the total number of monitored features.

Step 2: Determine the points to be plotted

Calculate and plot the $T^2$ statistic for the $i$th point:

$$T_i^2 = (\mathbf{x}_i - \bar{\mathbf{x}})^T \mathbf{S}^{-1} (\mathbf{x}_i - \bar{\mathbf{x}}). \tag{4.26}$$

Step 3: Identify special causes

If some points fall outside the control limits, we need to identify which variables are responsible for these points. This can be done with the Bonferroni intervals. For example, the $j$th variable of the $i$th sample is out of control if

$$x_{ij} \notin \left(\bar{x}_j - t_{n-1}(\alpha/2p) \sqrt{s_{jj}}, \ \ \bar{x}_j + t_{n-1}(\alpha/2p) \sqrt{s_{jj}}\right) \tag{4.27}$$
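Steps 1 and 2 of the stability chart can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `t2_chart_points` and the data are made up, and the upper control limit $\chi_p^2(\alpha)$ is assumed to be read from a standard chi-square table.

```python
import numpy as np

def t2_chart_points(x):
    """T_i^2 = (x_i - xbar)^T S^{-1} (x_i - xbar) for every row of x (Equation 4.26)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    s = np.cov(x, rowvar=False)
    # Solve S v_i = (x_i - xbar) for all i at once, then take row-wise inner products.
    solved = np.linalg.solve(s, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)

# Made-up data: n = 6 observations on p = 2 monitored features.
x = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 3.5],
              [4.0, 3.0], [5.0, 5.5], [6.0, 4.5]])
t2 = t2_chart_points(x)

# Step 1: control limits (Equation 4.25); 5.99 is the tabulated chi-square value for p = 2, alpha = 0.05.
ucl, lcl = 5.99, 0.0

# Indices of points plotting above the upper control limit (candidates for Step 3).
out_of_control = np.flatnonzero(t2 > ucl)
print(t2, out_of_control)
```

A useful sanity check on any implementation is that the plotted statistics always sum to $(n-1)p$, because $\sum_i (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T = (n-1)\mathbf{S}$.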
2. $T^2$ control chart for future individual measurements

If a $T^2$ control chart indicates that the process is stable, we can employ the collected data to construct another control chart that can be used to monitor future measurements. For this $T^2$ control chart, the control limits are set as:

$$UCL = \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha), \quad LCL = 0 \tag{4.28}$$
and the following $T^2$ calculated from the new measurement $\mathbf{x}$ is then plotted:

$$T^2 = \frac{n}{n+1} (\mathbf{x} - \bar{\mathbf{x}})^T \mathbf{S}^{-1} (\mathbf{x} - \bar{\mathbf{x}}) \tag{4.29}$$

where

$$\bar{\mathbf{x}} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{x}_i \in R^p \quad \text{and} \quad \mathbf{S} = \frac{1}{n-1} \sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T \in R^{p \times p}$$

are the sample mean vector and covariance matrix, respectively. If the $T^2$ from the future measurement $\mathbf{x}$ exceeds the upper control limit, we conclude that it is an out-of-control signal, and further action may be taken to identify the special causes.

4.1.4.2 Control Charts Based on Subgroup Means

Two useful control charts are described in the following text:

1. Control ellipse and $T^2$ chart for monitoring stability

A multivariate control chart can also be constructed based on subgroup means. The basic assumption is that each multivariate sample is drawn from $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$. The subgroup size is $m > 1$. Suppose the subgroup mean vectors and covariance matrices are $\bar{\mathbf{x}}_i$, $\mathbf{S}_i$, $i = 1, 2, \ldots, n$, where $n$ is the number of subgroups (samples) and

$$\bar{\mathbf{x}}_i = \frac{1}{m} \sum_{j=1}^{m} \mathbf{x}_{ij}.$$

Because the population is multivariate normal, $\bar{\mathbf{x}}_i$ and $\mathbf{S}_i$ are independent of each other. The grand mean vector and the pooled sample covariance matrix are calculated by

$$\bar{\bar{\mathbf{x}}} = \frac{1}{n} \sum_{i=1}^{n} \bar{\mathbf{x}}_i \quad \text{and} \quad \mathbf{S}_p = \frac{1}{n} \sum_{i=1}^{n} \mathbf{S}_i.$$
Following procedures similar to those in Subsection 4.1.4.1, it is easy to verify that $E(\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}}) = \mathbf{0}$ and

$$\mathrm{cov}(\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}}) = \left(1 - \frac{1}{n}\right)^2 \mathrm{cov}(\bar{\mathbf{x}}_i) + \frac{n-1}{n^2} \mathrm{cov}(\bar{\mathbf{x}}_i) = \frac{n-1}{mn} \boldsymbol{\Sigma},$$

where $m > 1$ is the subgroup size and $n$ is the number of subgroups.

Remark 4.16: The pooled $\mathbf{S}_p$ is an estimate of $\boldsymbol{\Sigma}$.

Remark 4.17: $n(m-1)\mathbf{S}_p$ is independent of each $\bar{\mathbf{x}}_i$, and thus of the grand mean $\bar{\bar{\mathbf{x}}}$.

Remark 4.18: $n(m-1)\mathbf{S}_p$ is distributed as a Wishart random matrix with d.o.f. $n(m-1)$.

As $\boldsymbol{\Sigma}$ can be estimated from each subgroup, pooling all the estimators yields a single estimator with a large number of degrees of freedom. Consequently,

$$T^2 = \frac{mn}{n-1} (\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}})^T \mathbf{S}_p^{-1} (\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}}) \sim \frac{n(m-1)p}{mn-n-p+1} F_{p,mn-n-p+1}, \tag{4.30}$$

where $F_{p,mn-n-p+1}$ is an F-distribution with d.o.f. $p$ and $mn - n - p + 1$.

For a given dimension $p$, we can construct a $T^2$ control chart with control limits:

$$UCL = \frac{(n-1)(m-1)p}{mn-n-p+1} F_{p,mn-n-p+1}(\alpha), \quad LCL = 0 \tag{4.31}$$

For a large number of subgroups $n$,

$$\frac{(n-1)(m-1)p}{mn-n-p+1} F_{p,mn-n-p+1}(\alpha) \approx \chi_p^2(\alpha).$$

The plotted points are:

$$T_i^2 = m(\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}})^T \mathbf{S}_p^{-1} (\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}}), \quad i = 1, 2, \ldots, n \tag{4.32}$$

Any $T_i^2$ exceeding the UCL may indicate a potential out-of-control signal, and appropriate action should be taken to identify the responsible special causes.
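The subgroup-based chart (Equation 4.31 and Equation 4.32) can be sketched as follows. This is an illustration only, assuming NumPy; the function names `subgroup_t2` and `ucl_subgroup` are hypothetical, the data are synthetic draws from a seeded random generator, and the F critical value is assumed to come from a table.

```python
import numpy as np

def subgroup_t2(data):
    """Plotted points T_i^2 of Equation 4.32.

    data has shape (n, m, p): n subgroups, each with m observations on p variables.
    Returns the T_i^2 values, the grand mean, and the pooled covariance S_p.
    """
    data = np.asarray(data, dtype=float)
    n, m, p = data.shape
    means = data.mean(axis=1)                                    # subgroup means, shape (n, p)
    grand_mean = means.mean(axis=0)
    s_p = np.mean([np.cov(g, rowvar=False) for g in data], axis=0)  # average of the S_i
    diff = means - grand_mean
    t2 = m * np.einsum("ij,ij->i", diff, np.linalg.solve(s_p, diff.T).T)
    return t2, grand_mean, s_p

def ucl_subgroup(n, m, p, f_crit):
    """Upper control limit of Equation 4.31; f_crit is F_{p,mn-n-p+1}(alpha) from a table."""
    return (n - 1) * (m - 1) * p / (m * n - n - p + 1) * f_crit

# Synthetic data: n = 5 subgroups of size m = 4 on p = 2 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 4, 2))
t2, grand_mean, s_p = subgroup_t2(data)
print(t2)
```

In use, each value in `t2` would be compared against `ucl_subgroup(n, m, p, f_crit)` with the tabulated critical value for the chosen $\alpha$.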
2. $T^2$ control chart for future subgroup measurements

Once we have determined that the process is stable based on the $T^2$ control chart, we can use the measurements drawn from this stable process to set up the control limits for future subgroup means.

Suppose $\bar{\mathbf{X}}$ is the future subgroup mean. Because $\bar{\mathbf{X}} - \bar{\bar{\mathbf{X}}} \sim N_p\left(\mathbf{0}, \frac{n+1}{mn} \boldsymbol{\Sigma}\right)$, we have:

$$\frac{mn}{n+1} (\bar{\mathbf{X}} - \bar{\bar{\mathbf{X}}})^T \mathbf{S}_p^{-1} (\bar{\mathbf{X}} - \bar{\bar{\mathbf{X}}}) \sim \frac{n(m-1)p}{mn-n-p+1} F_{p,mn-n-p+1} \tag{4.33}$$

Based on Equation 4.33, we can obtain the following $T^2$ control chart for future data monitoring. For the $100(1-\alpha)\%$ prediction $T^2$ control chart, the control limits are:

$$UCL = \frac{(n+1)(m-1)p}{mn-n-p+1} F_{p,mn-n-p+1}(\alpha), \quad LCL = 0 \tag{4.34}$$

For a large number of subgroups $n$,

$$\frac{(n+1)(m-1)p}{mn-n-p+1} F_{p,mn-n-p+1}(\alpha) \approx \chi_p^2(\alpha).$$

The points plotted in time order are the $T^2$ statistics defined by

$$T^2 = m(\bar{\mathbf{x}} - \bar{\bar{\mathbf{x}}})^T \mathbf{S}_p^{-1} (\bar{\mathbf{x}} - \bar{\bar{\mathbf{x}}}).$$

Any point exceeding the UCL may indicate a potential out-of-control signal, and appropriate corrective action should be taken to eliminate the special causes.

Remark 4.19: So far, we have discussed hypothesis testing for a one-sample mean vector from a single multivariate population. In some situations, we are required to compare two or more populations. Interested readers can find approaches to making inferences on multivariate two-sample testing problems in Reference 1.

4.2 MULTIPLE LINEAR REGRESSION

Regression analysis is a collection of techniques that help assess how much of the variation of certain variables can be explained by the effects of another set of variables. Similar to simple linear regression analysis, multiple regression analysis is a statistical technique that can be used to analyze the relationship between dependent (or response) variables and independent (or controlled) variables. The
objective of the multiple regression analysis is to use the independent variables, whose values are known, to predict the dependent values. The result is a linear combination of the independent variables that “best” predicts the dependent variables. Each independent variable is weighted according to its relative contribution to the overall prediction, so that the regression model has the maximum predictive power.
4.2.1 MODEL DESCRIPTION

The accuracy of prediction of $y$ based on a single variable $x$ is determined by the correlation coefficient $\rho_{xy}$. If $\rho_{xy}$ is low, it is necessary to employ multiple explanatory variables to increase the accuracy of prediction. So multiple regression analysis is a method to predict the value of a dependent variable $y$ based on a set of independent variables, $x_1, x_2, \ldots, x_p$. If $y$ is centered, i.e., $E(y) = 0$, then a multiple linear regression model can be described by:

$$y = a_1 x_1 + a_2 x_2 + \cdots + a_p x_p + \xi \tag{4.35}$$

where $y$ is called the dependent variable (or response variable, or criterion); the $p$ variables $x_i$, $i = 1, 2, \ldots, p$, are called independent variables (or controlled, or explanatory variables), having values that are considered known or fixed, not random; and $a_i$, $i = 1, 2, \ldots, p$, are called regression parameters, whose values are unknown. Some basic assumptions are made regarding the error term $\xi$:

(A1) $E(\xi) = 0$;
(A2) $\mathrm{var}(\xi) = \sigma^2$

So from Equation 4.35 and the assumptions (A1) and (A2), we obtain:

$$E(y) = a_1 x_1 + a_2 x_2 + \cdots + a_p x_p + E(\xi) = a_1 x_1 + a_2 x_2 + \cdots + a_p x_p; \quad \mathrm{var}(y) = \mathrm{var}(\xi) = \sigma^2$$

Suppose $n$ observations are made,

$$y_i = a_1 x_{i1} + a_2 x_{i2} + \cdots + a_p x_{ip} + \xi_i, \quad i = 1, 2, \ldots, n. \tag{4.36}$$

We assume that these observations are uncorrelated, i.e.,

(A3) $\mathrm{cov}(y_i, y_j) = \mathrm{cov}(\xi_i, \xi_j) = 0$ for $i \neq j$.
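To simplify the notation, we rewrite Equation 4.36 in the equivalent matrix form:

$$\mathbf{Y} = \mathbf{X}\mathbf{a} + \boldsymbol{\xi}, \tag{4.36′}$$

where

$$\mathbf{Y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad \mathbf{X} = [\mathbf{X}_1 \ \mathbf{X}_2 \ \cdots \ \mathbf{X}_p] = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \ \text{and} \ \mathbf{X}_i \in R^n,$$

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix}, \quad \boldsymbol{\xi} = \begin{bmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_n \end{bmatrix}.$$

Here, matrix $\mathbf{X}$ is usually referred to as a design matrix, and is assumed to have full column rank. Thus, assumptions (A1)~(A3) can be rewritten in a matrix form known as the Gauss–Markov conditions (GM conditions):

(A1′) $E(\boldsymbol{\xi}) = \mathbf{0}$;
(A2′) $\mathrm{cov}(\boldsymbol{\xi}) = \sigma^2 \mathbf{I}$.

So Equation 4.36′ together with (A1′) and (A2′) is called a multiple linear regression model. The unknown parameters are the regression parameter vector $\mathbf{a}$ and the variance $\sigma^2$. Efforts are made to estimate and evaluate these parameters.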
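4.2.2 LEAST-SQUARES ESTIMATES

We need to find the linear composite vector

$$\hat{\mathbf{Y}} = \mathbf{X}\mathbf{a} = \sum_{i=1}^{p} a_i \mathbf{X}_i$$

that is as close to the vector $\mathbf{Y}$ as possible, where $\hat{\mathbf{Y}}$ is an approximation of $\mathbf{Y}$; i.e., find $\mathbf{a} \in R^p$ to minimize:

$$\|\boldsymbol{\xi}\|_2^2 = \|\mathbf{Y} - \hat{\mathbf{Y}}\|_2^2 = \|\mathbf{Y} - \mathbf{X}\mathbf{a}\|_2^2, \tag{4.37}$$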
or minimize

$$\xi^T \xi = (Y - \hat{Y})^T (Y - \hat{Y}) = (Y - Xa)^T (Y - Xa). \tag{4.38}$$

Definition 4.2 (Least-squares estimate [LSE], residual vector, and residual sum of squares [RSS]): The coefficient vector a selected by minimizing Equation 4.38 is called the least-squares estimate of the regression parameters a, denoted by $\hat{a}$, i.e.,

$$\hat{a} = \arg\min_{a \in R^p} (Y - Xa)^T (Y - Xa) = \arg\min_{a \in R^p} \|Y - Xa\|_2^2. \tag{4.39}$$

The difference between Y and its predictor $X\hat{a}$ is called the residual vector, denoted by $\hat{\xi}$. So, $\hat{\xi} = Y - \hat{Y} = Y - X\hat{a}$, and

$$\hat{\xi}^T \hat{\xi} = (Y - X\hat{a})^T (Y - X\hat{a}) \tag{4.40}$$

is called the residual sum of squares, or RSS. The LSE of a can be obtained by the following theorem:

Theorem 4.5 (LSE): If X has full rank, i.e., rank(X) = p ≤ n, then the LSE of a in Equation 4.39 and the residual vector $\hat{\xi}$ are given by:

$$\hat{a} = (X^T X)^{-1} X^T Y, \qquad \hat{\xi} = Y - \hat{Y} = \left[ I_n - X (X^T X)^{-1} X^T \right] Y. \tag{4.41}$$

Further, $E(\hat{a}) = a$ and $\mathrm{cov}(\hat{a}) = \sigma^2 (X^T X)^{-1}$.

Remark 4.20 (BLUE: the best linear unbiased estimator): From Equation 4.41, $\hat{a}$ is a linear function of Y. It is also an unbiased estimator under GM conditions. Moreover, the Gauss–Markov theorem [2, Theorem 2.6, p. 41] ensures that $\hat{a}$ is the best estimate because it has the smallest covariance among all linear unbiased estimators of a.

Remark 4.21 (Collinearity): If X does not have full column rank, the inverse in Equation 4.41 is replaced by a generalized inverse. In this case, some linear combination of the columns, e.g., $X\tilde{a}$, must be 0, and the columns are said to be collinear. In most regression analyses, $X\tilde{a}$ is not likely to be exactly 0. But a very small magnitude of $\|X\tilde{a}\|$ will lead to a numerically unstable $(X^T X)^{-1}$. Some diagonal elements of $(X^T X)^{-1}$ will be large and, consequently, the estimated variances of the corresponding $\hat{a}_i$'s are large. This makes it difficult to detect the "significant" regression coefficients $a_i$.
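As a numerical aside on Theorem 4.5 and Remark 4.21, the sketch below computes $\hat{a}$ by solving the normal equations and compares the condition number of $X^T X$ for a well-behaved design and a nearly collinear one. The data, the function name `least_squares_estimate`, and the perturbation size 1e-6 are illustrative assumptions, not values from the text.

```python
import numpy as np

def least_squares_estimate(X, Y):
    """Return (a_hat, residuals) for Y = X a + xi, assuming X has full column rank."""
    # Solve the normal equations (X^T X) a = X^T Y rather than forming the inverse.
    a_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    residuals = Y - X @ a_hat
    return a_hat, residuals

# Hypothetical noise-free data: n = 6 observations, p = 2 regressors.
X = np.column_stack([np.ones(6), np.arange(6, dtype=float)])
a_true = np.array([2.0, 0.5])
Y = X @ a_true
a_hat, residuals = least_squares_estimate(X, Y)

# Nearly collinear design: the second column is almost a copy of the first.
X_bad = np.column_stack([X[:, 0], X[:, 0] + 1e-6 * X[:, 1]])
cond_good = np.linalg.cond(X.T @ X)
cond_bad = np.linalg.cond(X_bad.T @ X_bad)
```

A large `cond_bad` relative to `cond_good` is one way to see the numerical instability of $(X^T X)^{-1}$ that the remark describes.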
The following two methods can be used to solve the problems caused by collinearity:

1. Deleting one predictor variable from a strongly correlated pair
2. Representing the original predictor variables using their principal components and then regressing Y on these new predictor variables

Remark 4.22 (Hat matrix): Define the matrix:

$$H = X (X^T X)^{-1} X^T, \tag{4.42}$$

where H is usually called the "hat" matrix. Then $\hat{Y} = HY$ and $\hat{\xi} = Y - \hat{Y} = (I_n - H) Y = (I_n - H)\xi$.

Regarding a linear function of a, say

$$c^T a = \sum_{i=1}^{p} c_i a_i,$$

the following Gauss theorem [1, p. 389] gives its "best" estimator.

Theorem 4.6 (Gauss least-squares theorem): Consider the multiple regression model $Y = Xa + \xi$, where $\xi \sim (0, \sigma^2 I)$ and X has full column rank p. Then for any vector $c \in R^p$, the estimator $c^T \hat{a}$ has the smallest possible variance among all linear estimators that are unbiased for $c^T a$.

Theorem 4.7: The residual vector $\hat{\xi}$ defined in Equation 4.41, i.e., $\hat{\xi} = Y - \hat{Y} = [I_n - H] Y$, has the following property:

$$E\left( \hat{\xi}^T \hat{\xi} \right) = (n - p)\sigma^2. \tag{4.43}$$

Remark 4.23: From Theorem 4.7, an unbiased estimator of $\sigma^2$ is given by:

$$s^2 = \frac{1}{n-p} Y^T (I - H) Y = \frac{1}{n-p} (Y - \hat{Y})^T (Y - \hat{Y}) = \frac{1}{n-p} (Y - X\hat{a})^T (Y - X\hat{a}) = \frac{1}{n-p} \left( Y^T Y - Y^T X \hat{a} \right). \tag{4.44}$$
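The hat matrix and the estimator $s^2$ can be checked numerically. The following is a minimal sketch under assumed, randomly generated data; the helper names `hat_matrix` and `residual_mean_square` are hypothetical.

```python
import numpy as np

def hat_matrix(X):
    """H = X (X^T X)^{-1} X^T for a design matrix X with full column rank."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def residual_mean_square(X, Y):
    """s^2 = Y^T (I - H) Y / (n - p), the first form in Equation 4.44."""
    n, p = X.shape
    return float(Y @ (np.eye(n) - hat_matrix(X)) @ Y) / (n - p)

# Hypothetical data, used only to exercise the formulas.
rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=n)

H = hat_matrix(X)
a_hat = np.linalg.solve(X.T @ X, X.T @ Y)
residuals = Y - X @ a_hat          # equals (I - H) Y
s2 = residual_mean_square(X, Y)
```

Here H is idempotent and trace(I − H) equals n − p, which is where the divisor in Equation 4.44 comes from.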
Remark 4.24 (Decomposition of sum of squares): The total sum of squares (TSS) can be decomposed into the regression sum of squares (RegSS) and the RSS, i.e.,

$$\|Y\|_2^2 = \|\hat{Y}\|_2^2 + \|\hat{\xi}\|_2^2. \tag{4.45}$$

If we denote the total variance of Y, the variance of $\hat{Y}$, and the variance of $\hat{\xi}$ as $s_Y^2$, $s_{Y_x}^2$, and $s_{Y_x^{\perp}}^2$, respectively, then,

$$s_Y^2 = s_{Y_x}^2 + s_{Y_x^{\perp}}^2. \tag{4.46}$$

4.2.3 INFERENCES ABOUT THE REGRESSION MODEL
In this subsection, we give methods for testing the general linear hypothesis and obtaining the confidence intervals under the conditions in Theorem 4.8. It can be proved that the MLE for the regression parameters is the same as the LSE under the normality condition [1, pp. 390–392], which is established by the following theorems:

Theorem 4.8 (MLE for a and $\sigma^2$): Given the regression model $Y = Xa + \xi$, where

$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix}, \quad \xi = \begin{bmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_n \end{bmatrix},$$

the design matrix $X = [X_1 \; X_2 \; \cdots \; X_p]$, $X_i \in R^n$, has full column rank, and the error vector follows a multinormal distribution, $\xi \sim N_n(0, \sigma^2 I)$. Then:

1. The MLE of a is the same as the LSE $\hat{a}$. Moreover,

$$\hat{a} = (X^T X)^{-1} X^T Y \sim N_p\left( a, \sigma^2 (X^T X)^{-1} \right). \tag{4.47}$$

2. $\hat{a}$ is distributed independently of the residuals $\hat{\xi} = Y - X\hat{a}$. Furthermore,

$$\hat{\xi}^T \hat{\xi} = n \hat{\sigma}^2 \sim \sigma^2 \chi^2_{n-p}, \tag{4.48}$$

where $\hat{\sigma}^2 = \frac{1}{n} (Y - X\hat{a})^T (Y - X\hat{a})$ is the MLE of $\sigma^2$.
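A small simulation can illustrate the difference between the MLE $\hat{\sigma}^2 = \mathrm{RSS}/n$ and the unbiased estimator $s^2 = \mathrm{RSS}/(n-p)$. This is a sketch under assumed settings (n = 10, p = 2, σ = 1, 2000 replications, fixed seed); none of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 10, 2, 1.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
a = np.array([1.0, 0.5])
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # I - H

reps = 2000
noise = sigma * rng.normal(size=(reps, n))
Y = X @ a + noise                 # each row is one simulated sample of size n
resid = Y @ M                     # M is symmetric, so each row is (I - H) y
rss = np.sum(resid**2, axis=1)    # residual sum of squares per sample

sigma2_mle = rss / n              # MLE of sigma^2 (Theorem 4.8)
s2 = rss / (n - p)                # unbiased estimator (Remark 4.23)
mean_rss = rss.mean()             # should be close to (n - p) * sigma^2
```

On average the MLE underestimates $\sigma^2$ by the factor (n − p)/n, while $s^2$ is centered on $\sigma^2$.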
To draw inferences about the regression parameters, the confidence ellipsoid for a is constructed based on the following theorem in terms of the estimated covariance matrix $s^2 (X^T X)^{-1}$.

Theorem 4.9 (Confidence region and confidence interval for parameters a): Under the assumptions of Theorem 4.8, the 100(1 − α)% confidence region for a is given by:

$$(a - \hat{a})^T (X^T X) (a - \hat{a}) \le p\, s^2 F_{p,n-p}(\alpha), \tag{4.49}$$

where $s^2 = \frac{1}{n-p} \hat{\xi}^T \hat{\xi}$. Simultaneous 100(1 − α)% confidence intervals for $a_i$, i = 1, 2, ..., p, are given by:

$$\hat{a}_i \pm \sqrt{\widehat{\mathrm{var}}(\hat{a}_i)} \sqrt{p F_{p,n-p}(\alpha)}, \tag{4.50}$$

where $\widehat{\mathrm{var}}(\hat{a}_i)$ is the ith diagonal element of $s^2 (X^T X)^{-1}$.

Remark 4.25: If the regression model contains a constant term, e.g.,

$$Y = a_0 + a_1 X_1 + \cdots + a_r X_r + \xi, \tag{4.51}$$

then the 100(1 − α)% confidence region for a and the simultaneous 100(1 − α)% confidence intervals for $a_i$, i = 0, 1, 2, ..., r, are defined as:

$$(a - \hat{a})^T (X^T X) (a - \hat{a}) \le (r + 1) s^2 F_{r+1,n-r-1}(\alpha), \tag{4.52}$$

$$\hat{a}_i \pm \sqrt{\widehat{\mathrm{var}}(\hat{a}_i)} \sqrt{(r + 1) F_{r+1,n-r-1}(\alpha)}, \tag{4.53}$$

where $s^2 = \frac{1}{n-r-1} \hat{\xi}^T \hat{\xi}$.
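Equations 4.49 and 4.50 can be sketched in code as follows. The critical value $F_{p,n-p}(\alpha)$ is passed in by the caller (here the tabulated value 4.10 for $F_{2,10}(0.05)$); the data and both function names are assumptions made for illustration.

```python
import numpy as np

def simultaneous_intervals(X, Y, f_crit):
    """Intervals a_hat_i +/- sqrt(var_hat(a_hat_i)) * sqrt(p * f_crit), as in Equation 4.50.

    f_crit stands for F_{p, n-p}(alpha) and must be supplied by the caller.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    a_hat = XtX_inv @ X.T @ Y
    resid = Y - X @ a_hat
    s2 = resid @ resid / (n - p)
    half = np.sqrt(s2 * np.diag(XtX_inv)) * np.sqrt(p * f_crit)
    return a_hat, s2, np.column_stack([a_hat - half, a_hat + half])

def in_confidence_region(a, a_hat, X, s2, f_crit):
    """Check (a - a_hat)^T X^T X (a - a_hat) <= p s^2 F_{p, n-p}(alpha), as in Equation 4.49."""
    d = a - a_hat
    return bool(d @ (X.T @ X) @ d <= X.shape[1] * s2 * f_crit)

# Hypothetical data: n = 12 observations on a straight line plus noise.
rng = np.random.default_rng(3)
t = np.arange(12, dtype=float)
X = np.column_stack([np.ones(12), t])
Y = 1.0 + 2.0 * t + 0.5 * rng.normal(size=12)

f_crit = 4.10   # tabulated F_{2,10}(0.05); an input, not computed here
a_hat, s2, intervals = simultaneous_intervals(X, Y, f_crit)
```

The center $\hat{a}$ always lies inside the ellipsoid, and each simultaneous interval is symmetric about the corresponding $\hat{a}_i$.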
Remark 4.26: The confidence ellipsoid is centered at the MLE $\hat{a}$, and its orientation and size are determined by the eigenvalue–eigenvector pairs (λ, e) of $X^T X$, i.e., $\lambda e = (X^T X) e$. If λ ≈ 0, then the confidence ellipsoid will be very long in the direction of the corresponding eigenvector.

Remark 4.27: If the "simultaneous" confidence property defined in Equation 4.50 or Equation 4.53 is ignored, the factor $\sqrt{p F_{p,n-p}(\alpha)}$ in Equation 4.50 or $\sqrt{(r + 1) F_{r+1,n-r-1}(\alpha)}$ in Equation 4.53 can be replaced by a one-at-a-time t value, $t_{n-p}(\alpha/2)$ or $t_{n-r-1}(\alpha/2)$. This can be used to search for important independent variables.

Remark 4.28: If the confidence interval for $a_i$ contains 0, then the associated independent variable $x_i$ may not have an influence on the dependent variable. In other words, it might contribute nothing to the response. Hence, it can be dropped from the regression model.
4.2.4 MODEL CHECKING: NORMALITY CHECKING AND OUTLIER DETECTION
Regression modeling requires the assumption of normality and no outliers in the data. We have $\hat{\xi} = Y - \hat{Y} = (I - H) Y = (I - H)\xi$. If the model is valid, then $\hat{\xi}$ is a normally distributed random vector with $E(\hat{\xi}) = 0$ and $\mathrm{var}(\hat{\xi}) = \sigma^2 (I - H)$. Although the covariance matrix is not diagonal, the off-diagonal elements are usually small and the diagonal elements are nearly equal. Clearly, if the diagonal elements of H vary greatly, then the variances of the individual $\hat{\xi}_i$ will also differ greatly from one another. Using the residual mean square

$$s^2 = \frac{1}{n-p} \|Y - X\hat{a}\|_2^2$$

as the estimate of $\sigma^2$, we obtain the estimate of $\mathrm{var}(\hat{\xi}_i)$:

$$\widehat{\mathrm{var}}(\hat{\xi}_i) = s^2 (1 - h_{ii}), \quad i = 1, 2, \ldots, n, \tag{4.54}$$

and the studentized residuals:

$$\hat{\xi}_i^* = \frac{\hat{\xi}_i - E(\hat{\xi}_i)}{\sqrt{\widehat{\mathrm{var}}(\hat{\xi}_i)}} = \frac{\hat{\xi}_i}{\sqrt{s^2 (1 - h_{ii})}}, \quad i = 1, 2, \ldots, n. \tag{4.55}$$

The studentized residuals are expected to be approximately independently distributed as N(0, 1). By ranking the $\hat{\xi}_i^*$, i = 1, 2, …, n, we get the order statistics $\hat{\xi}_{(i)}^*$, i = 1, 2, …, n. Then normality can be checked visually or analytically. The commonly used graphical approaches include:

1. Plot $(\hat{y}_i, \hat{\xi}_i)$: If there is a trend on the plot (e.g., incorrect calculations, or if a constant term is ignored in the regression model) or the variance is not constant (e.g., transformations or weighted least squares are required), then the model assumptions do not hold.
2. Plot $(x_{ji}, \hat{\xi}_i)$ or $(x_{ji} x_{ki}, \hat{\xi}_i)$: If there is a systematic pattern occurring in the plot, then it suggests that more terms are needed in the regression model.
3. Plot $(i, \hat{\xi}_i)$: This plot can be used to check independence over time. A systematic pattern suggests the data are time dependent.
4. Q-Q plot: The Q-Q plot can be used to examine severe departures from normality or the presence of abnormal observations. If the sample size n is large, the extreme points may depart from normality but will not affect inferences about the regression parameter a.

Statistical approaches include the Shapiro–Wilk test or Kolmogorov's test. Interested readers may refer to Reference 3 in Chapter 3.

An outlier can be detected using the methods described in Reference 3 (p. 274):

1. Construct the test statistic:

$$P = \max_{1 \le i \le n} \hat{\xi}_i^*, \tag{4.56}$$

where $\hat{\xi}_i^*$ is defined in Equation 4.55.
2. Make inferences: There is an outlier in the data if $P > F_{1,n-1}(\alpha/n)$; furthermore, if

$$i_0 = \arg\max_{1 \le i \le n} \hat{\xi}_i^*, \tag{4.57}$$

then the $i_0$th observation is an outlier.
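The studentized residuals of Equation 4.55 and the index in Equation 4.57 can be computed as in the sketch below. The data are hypothetical (a straight line with one deliberately shifted observation), the function name is an assumption, and taking the absolute value before locating the maximum is also an assumption of this sketch; the comparison against the critical value is left out.

```python
import numpy as np

def studentized_residuals(X, Y):
    """xi_hat_i / sqrt(s^2 (1 - h_ii)), as in Equation 4.55."""
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)
    resid = Y - H @ Y
    s2 = resid @ resid / (n - p)          # residual mean square
    return resid / np.sqrt(s2 * (1.0 - np.diag(H)))

# Hypothetical data: small noise everywhere, one large shift at index 6.
rng = np.random.default_rng(4)
t = np.arange(10, dtype=float)
X = np.column_stack([np.ones(10), t])
Y = 1.0 + 2.0 * t + 0.1 * rng.normal(size=10)
Y[6] += 5.0

r_star = studentized_residuals(X, Y)
i0 = int(np.argmax(np.abs(r_star)))       # candidate outlier (0-based index)
```

The candidate index `i0` would then be confirmed or rejected by the test in step 2 above.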
4.2.5 INFERENCES FROM THE ESTIMATED REGRESSION MODEL
After having checked the fitted regression model and evaluated it as proper, one can use it to make predictions, given a particular predictor vector $x_0 = [x_{10} \; x_{20} \; \cdots \; x_{p0}]^T$. Based on $x_0$ and the MLE $\hat{a}$, we can estimate the regression function and forecast a new observation at $x_0$:

1. Estimating the regression function $x_0^T a$ at $x_0$: Suppose $y_0$ is the value of the response when the predictor vector $x_0$ is given. Then:

$$E(y_0 \mid x_0) = x_0^T a. \tag{4.58}$$

Its LSE is $x_0^T \hat{a}$.

Theorem 4.10 (Estimating): For the multiple regression model given in Equation 4.36 or Equation 4.36′,

$$Y = Xa + \xi, \quad E(\xi) = 0, \quad \mathrm{cov}(\xi) = \sigma^2 I,$$

$x_0^T \hat{a}$ is the best linear unbiased estimator (BLUE) of $E(y_0 \mid x_0)$ in terms of minimum variance. Its variance is given by:

$$\mathrm{var}(x_0^T \hat{a}) = \sigma^2 x_0^T (X^T X)^{-1} x_0. \tag{4.59}$$

Further, if ξ is normally distributed, then a 100(1 − α)% confidence interval for $E(y_0 \mid x_0)$ is given by:

$$x_0^T \hat{a} \pm t_{n-p}\!\left(\frac{\alpha}{2}\right) \sqrt{s^2 x_0^T (X^T X)^{-1} x_0}, \tag{4.60}$$

where $t_{n-p}(\alpha/2)$ is the upper 100(α/2)th percentile of a t-distribution with d.o.f. n − p.

2. Predicting a new observation $y_0$ at $x_0$: Predicting a new observation is more uncertain than estimating the expected value of $y_0$. For the regression model:

$$y_0 = x_0^T a + \xi_0, \quad \xi_0 \sim N(0, \sigma^2),$$

where $y_0$ is the new response; $x_0^T a$, the expected value of $y_0$ at $x_0$; and $\xi_0$, the new error.

Theorem 4.11 (Predicting): For the multiple regression model given in Equation 4.36 or Equation 4.36′,

$$Y = Xa + \xi, \quad E(\xi) = 0, \quad \mathrm{cov}(\xi) = \sigma^2 I,$$

a new observation $y_0$ has the unbiased predictor $x_0^T \hat{a}$. The variance of the forecast error $y_0 - x_0^T \hat{a}$ is given by:

$$\mathrm{var}(y_0 - x_0^T \hat{a}) = \sigma^2 \left( 1 + x_0^T (X^T X)^{-1} x_0 \right). \tag{4.61}$$

Further, if the error ξ is normally distributed, then a 100(1 − α)% prediction interval for $y_0$ is given by:

$$x_0^T \hat{a} \pm t_{n-p}\!\left(\frac{\alpha}{2}\right) \sqrt{s^2 \left( 1 + x_0^T (X^T X)^{-1} x_0 \right)}, \tag{4.62}$$

where $t_{n-p}(\alpha/2)$ is the upper 100(α/2)th percentile of a t-distribution with d.o.f. n − p.

The proofs for Theorem 4.10 and Theorem 4.11 can be found in [1, pp. 400–402].
Remark 4.29: By comparing Equation 4.60 with Equation 4.62, we find:

$$t_{n-p}\!\left(\frac{\alpha}{2}\right) \sqrt{s^2 \left( 1 + x_0^T (X^T X)^{-1} x_0 \right)} > t_{n-p}\!\left(\frac{\alpha}{2}\right) \sqrt{s^2 x_0^T (X^T X)^{-1} x_0}, \tag{4.63}$$

i.e., the half width of the prediction interval for $y_0$ is longer than the half width of the confidence interval for the value of the regression function $E(y_0 \mid x_0) = x_0^T a$. This additional uncertainty in forecasting $y_0$ comes from the presence of the new unknown error term $\xi_0$.
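The two intervals compared in Remark 4.29 can be computed side by side. In this sketch the t critical value is supplied by the caller (2.228 is the tabulated $t_{10}(0.025)$); the data, the point `x0`, and the function name are illustrative assumptions.

```python
import numpy as np

def mean_and_prediction_intervals(X, Y, x0, t_crit):
    """Return the intervals of Equations 4.60 and 4.62 at x0.

    t_crit stands for t_{n-p}(alpha/2) and must be supplied by the caller.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    a_hat = XtX_inv @ X.T @ Y
    resid = Y - X @ a_hat
    s2 = resid @ resid / (n - p)
    center = x0 @ a_hat
    q = x0 @ XtX_inv @ x0                        # x0^T (X^T X)^{-1} x0
    half_ci = t_crit * np.sqrt(s2 * q)           # mean response, Equation 4.60
    half_pi = t_crit * np.sqrt(s2 * (1.0 + q))   # new observation, Equation 4.62
    return (center - half_ci, center + half_ci), (center - half_pi, center + half_pi)

# Hypothetical data: n = 12, p = 2, so n - p = 10 degrees of freedom.
rng = np.random.default_rng(5)
t = np.arange(12, dtype=float)
X = np.column_stack([np.ones(12), t])
Y = 1.0 + 2.0 * t + 0.5 * rng.normal(size=12)
x0 = np.array([1.0, 5.5])

ci, pi = mean_and_prediction_intervals(X, Y, x0, t_crit=2.228)
```

Both intervals share the center $x_0^T \hat{a}$; only the "1 +" term under the square root makes the prediction interval wider.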
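4.2.6 MULTIVARIATE LINEAR REGRESSION
4.2.6.1 Multivariate Linear Regression Model
Multiple linear regression is used to model the linear relationship between one response, Y, and a set of independent variables $X_1, X_2, \ldots, X_p$. But in some cases, there are many response variables, $Y_{(1)}, Y_{(2)}, \ldots, Y_{(q)}$. We need to model the relationship between the q response variables and the independent variables. Each response follows its own multiple regression model:

$$Y_{(j)} = X a_{(j)} + \xi_{(j)}, \quad j = 1, 2, \ldots, q.$$

If each response has n observations, we introduce the notation:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}_{n \times p},$$

$$Y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1q} \\ y_{21} & y_{22} & \cdots & y_{2q} \\ \vdots & \vdots & & \vdots \\ y_{n1} & y_{n2} & \cdots & y_{nq} \end{bmatrix}_{n \times q} = \begin{bmatrix} Y_{(1)} & Y_{(2)} & \cdots & Y_{(q)} \end{bmatrix},$$

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1q} \\ a_{21} & a_{22} & \cdots & a_{2q} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pq} \end{bmatrix}_{p \times q} = \begin{bmatrix} a_{(1)} & a_{(2)} & \cdots & a_{(q)} \end{bmatrix}, \tag{4.64}$$

$$\xi = \begin{bmatrix} \xi_{11} & \xi_{12} & \cdots & \xi_{1q} \\ \xi_{21} & \xi_{22} & \cdots & \xi_{2q} \\ \vdots & \vdots & & \vdots \\ \xi_{n1} & \xi_{n2} & \cdots & \xi_{nq} \end{bmatrix}_{n \times q} = \begin{bmatrix} \xi_{(1)} & \xi_{(2)} & \cdots & \xi_{(q)} \end{bmatrix} = \begin{bmatrix} \xi_1^T \\ \xi_2^T \\ \vdots \\ \xi_n^T \end{bmatrix}.$$

Thus, the multivariate linear regression model is defined as:

$$Y_{(i)} = X a_{(i)} + \xi_{(i)}, \; i = 1, 2, \ldots, q, \quad \text{or} \quad Y = XA + \xi, \qquad E(\xi_{(i)}) = 0, \quad \mathrm{cov}(\xi_{(i)}, \xi_{(j)}) = \sigma_{ij} I \;\; \text{for } i, j = 1, 2, \ldots, q, \tag{4.65}$$

where $Y \in R^{n \times q}$, $X \in R^{n \times p}$, $A \in R^{p \times q}$, $\xi \in R^{n \times q}$.

Remark 4.30: The q responses observed on the kth sample have the covariance matrix $\Sigma = (\sigma_{ij})$, but observations for different samples are uncorrelated.

Remark 4.31: In the multivariate linear regression model, A and $\sigma_{ij}$, i, j = 1, 2, …, q, are unknown parameters.

Remark 4.32: The error terms ξ corresponding to different responses may be correlated.

4.2.6.2 Least-Squares Estimation (LSE) and Maximum Likelihood Estimation (MLE) of Parameters
Based on the results of multiple regression analysis, the LSE of $a_{(i)}$ is obtained by minimizing $\|Y_{(i)} - X a_{(i)}\|_2^2$ as:

$$\hat{a}_{(i)} = (X^T X)^{-1} X^T Y_{(i)}, \quad i = 1, 2, \ldots, q. \tag{4.66}$$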
Thus, the criterion to be minimized for multivariate regression analysis is given by:

$$f(\hat{A}) = \mathrm{tr}\left[ (Y - X\hat{A})^T (Y - X\hat{A}) \right]. \tag{4.67}$$

We can obtain the LSE $\hat{A}$ by differentiating $f(\hat{A})$ w.r.t. each element of $\hat{A}$:

$$\hat{A} = (X^T X)^{-1} X^T Y. \tag{4.68}$$
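The matrix formula in Equation 4.68 is the column-by-column estimate of Equation 4.66 stacked side by side, and it minimizes the trace criterion in Equation 4.67. The sketch below checks both statements on hypothetical data (n = 15, p = 3, q = 2); the true coefficient matrix `A_true` and the helper `criterion` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 15, 3, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
A_true = np.array([[1.0, -1.0], [2.0, 0.5], [0.0, 3.0]])   # hypothetical
Y = X @ A_true + 0.1 * rng.normal(size=(n, q))

# Equation 4.68: all q responses at once.
A_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Equation 4.66: one multiple regression per response column.
A_cols = np.column_stack(
    [np.linalg.solve(X.T @ X, X.T @ Y[:, j]) for j in range(q)]
)

def criterion(A):
    """Trace criterion f(A) = tr[(Y - X A)^T (Y - X A)] from Equation 4.67."""
    R = Y - X @ A
    return float(np.trace(R.T @ R))
```

Any other coefficient matrix, including `A_true` itself, gives a criterion value at least as large as `criterion(A_hat)`.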
So, using the LSE $\hat{A}$, we can obtain the matrices of predicted values and residuals:

$$\hat{Y} = X\hat{A} = X (X^T X)^{-1} X^T Y, \tag{4.69}$$

$$\hat{\xi} = Y - \hat{Y} = \left[ I - X (X^T X)^{-1} X^T \right] Y. \tag{4.70}$$

We know that multiple regression analysis possesses orthogonality properties among residuals, predicted values, and columns of X. In multivariate regression analysis, these properties also hold.

Property 1: The columns of X and the residual $\hat{\xi}$ are orthogonal, i.e.,

$$X^T \hat{\xi} = 0. \tag{4.71}$$

It can be easily verified by $X^T \hat{\xi} = X^T \left[ I - X (X^T X)^{-1} X^T \right] Y = 0$.

Property 2: The predicted value $\hat{Y}$ and the residual $\hat{\xi}$ are orthogonal, i.e.,

$$\hat{Y}^T \hat{\xi} = 0. \tag{4.72}$$

This property holds because $\hat{Y}^T \hat{\xi} = \left( X (X^T X)^{-1} X^T Y \right)^T \left[ I - X (X^T X)^{-1} X^T \right] Y = 0$.

Similar to multiple regression analysis, the total sum of squares can also be decomposed into two components because:

$$Y = \hat{Y} + \hat{\xi} = X\hat{A} + (Y - X\hat{A}).$$

Using Property 2, we have:

$$Y^T Y = (\hat{Y} + \hat{\xi})^T (\hat{Y} + \hat{\xi}) = \hat{Y}^T \hat{Y} + \hat{\xi}^T \hat{\xi} + 0^T + 0 = \hat{Y}^T \hat{Y} + \hat{\xi}^T \hat{\xi} \tag{4.73}$$

$$= Y^T X (X^T X)^{-1} X^T Y + \left( Y^T Y - Y^T X (X^T X)^{-1} X^T Y \right). \tag{4.74}$$

Thus, the TSS and cross products can be decomposed into two components: the predicted sum of squares (RegSS) and cross products, and the residual sum of squares (RSS) and cross products, i.e.,

$$Y^T Y = \hat{Y}^T \hat{Y} + \hat{\xi}^T \hat{\xi}. \tag{4.75}$$

So,

$$\hat{\xi}^T \hat{\xi} = Y^T Y - \hat{Y}^T \hat{Y} = Y^T Y - \hat{A}^T X^T X \hat{A}. \tag{4.76}$$
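The orthogonality properties and the decomposition in Equations 4.71 through 4.76 are easy to confirm numerically. This is a minimal sketch on hypothetical random data; the dimensions and seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, q = 15, 3, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = rng.normal(size=(n, q))                   # hypothetical responses

A_hat = np.linalg.solve(X.T @ X, X.T @ Y)     # Equation 4.68
Y_hat = X @ A_hat                             # Equation 4.69
resid = Y - Y_hat                             # Equation 4.70

total_sscp = Y.T @ Y                          # total sums of squares and cross products
reg_sscp = Y_hat.T @ Y_hat                    # predicted part
res_sscp = resid.T @ resid                    # residual part
```

Up to floating-point error, `X.T @ resid` and `Y_hat.T @ resid` are zero matrices and `total_sscp` equals `reg_sscp + res_sscp`.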
The following results give the properties of the LSE $\hat{A}$ and the residual $\hat{\xi}$, as in the multiple regression model.

Theorem 4.12 (Mean vectors and covariance matrices): Let the multivariate multiple regression model in Equation 4.65 have full rank, i.e., rank(X) = p < n. Then:

1. The LSE $\hat{A} = [\hat{a}_{(1)} \; \hat{a}_{(2)} \; \cdots \; \hat{a}_{(q)}]$ is an unbiased estimator of $A = [a_{(1)} \; a_{(2)} \; \cdots \; a_{(q)}]$, i.e.,

$$E(\hat{A}) = A. \tag{4.77}$$

2. $$\mathrm{cov}(\hat{a}_{(i)}, \hat{a}_{(j)}) = \sigma_{ij} (X^T X)^{-1}, \quad i, j = 1, 2, \ldots, q. \tag{4.78}$$

3. The residual $\hat{\xi} = [\hat{\xi}_{(1)} \; \hat{\xi}_{(2)} \; \cdots \; \hat{\xi}_{(q)}] = Y - X\hat{A}$ satisfies

$$E\left( \hat{\xi}_{(i)} \right) = 0, \qquad E\left( \hat{\xi}_{(i)}^T \hat{\xi}_{(j)} \right) = (n - p)\sigma_{ij}. \tag{4.79}$$

Thus,

$$E(\hat{\xi}) = 0, \qquad E\left( \frac{1}{n-p} \hat{\xi}^T \hat{\xi} \right) = \Sigma.$$

4. The residual $\hat{\xi} = [\hat{\xi}_{(1)} \; \hat{\xi}_{(2)} \; \cdots \; \hat{\xi}_{(q)}] = Y - X\hat{A}$ and the LSE $\hat{A}$ are uncorrelated.

The detailed proof can be found in Reference 1 (p. 417). Similar to multiple regression analysis, when the error ξ follows a normal distribution, we can obtain the MLE and its distribution. This result is stated in the following theorems:

Theorem 4.13 (MLE): Assume that the multivariate regression model in Equation 4.65 has full rank, i.e., rank(X) = p, p + q ≤ n, and the error ξ is normally distributed. Then:

1. The MLE of A is:

$$\hat{A} = (X^T X)^{-1} X^T Y. \tag{4.80}$$

2. $\hat{A}$ is normally distributed with

$$E(\hat{A}) = A, \quad \text{and} \quad \mathrm{cov}(\hat{a}_{(i)}, \hat{a}_{(j)}) = \sigma_{ij} (X^T X)^{-1}, \quad i, j = 1, 2, \ldots, q. \tag{4.81}$$

3. $\hat{A}$ is independent of the MLE $\hat{\Sigma}$ of the positive definite Σ, where

$$\hat{\Sigma} = \frac{1}{n} \hat{\xi}^T \hat{\xi} = \frac{1}{n} (Y - X\hat{A})^T (Y - X\hat{A}). \tag{4.82}$$

4. $$n\hat{\Sigma} \sim W_{n-p}(\Sigma). \tag{4.83}$$
4.2.6.3 Inferences for Multivariate Regression Model under Normality Assumption
Suppose we want to test the hypothesis that the responses do not depend on some independent variables. Without loss of generality, we can split the matrices in the following manner:

$$A_{p \times q} = \begin{bmatrix} A_{(1)} \\ A_{(2)} \end{bmatrix}, \quad A_{(1)} \in R^{l \times q}, \; A_{(2)} \in R^{(p-l) \times q}; \qquad X_{n \times p} = \begin{bmatrix} X_1 & X_2 \end{bmatrix}, \quad X_1 \in R^{n \times l}, \; X_2 \in R^{n \times (p-l)}.$$

So the general model can be rewritten as:

$$E(Y) = XA = \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} A_{(1)} \\ A_{(2)} \end{bmatrix} = X_1 A_{(1)} + X_2 A_{(2)}. \tag{4.84}$$

Under the null hypothesis $H_0: A_{(2)} = 0$, the model becomes

$$Y = X_1 A_{(1)} + \xi. \tag{4.85}$$

Then the extra sum of squares and cross products is given by:

$$(Y - X_1 \hat{A}_{(1)})^T (Y - X_1 \hat{A}_{(1)}) - (Y - X\hat{A})^T (Y - X\hat{A}) = n(\hat{\Sigma}_1 - \hat{\Sigma}), \tag{4.86}$$

where

$$\hat{A}_{(1)} = (X_1^T X_1)^{-1} X_1^T Y \quad \text{and} \quad \hat{\Sigma}_1 = \frac{1}{n} (Y - X_1 \hat{A}_{(1)})^T (Y - X_1 \hat{A}_{(1)}).$$
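Thus, the likelihood ratio, Λ, can be expressed based on generalized variances:

$$\Lambda = \frac{\max_{A_{(1)}, \Sigma} L(A_{(1)}, \Sigma)}{\max_{A, \Sigma} L(A, \Sigma)} = \frac{L(\hat{A}_{(1)}, \hat{\Sigma}_1)}{L(\hat{A}, \hat{\Sigma})} = \left( \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|} \right)^{n/2}, \tag{4.87}$$

or equivalently Wilks' lambda statistic

$$\Lambda^{2/n} = \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}. \tag{4.88}$$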
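Equation 4.88 can be evaluated directly from the two fitted models. The following sketch uses hypothetical data in which the first l = 2 columns of X always matter and the third column matters only in the second data set; the function names and all numerical settings are assumptions.

```python
import numpy as np

def mle_sigma(X, Y):
    """MLE of Sigma: (Y - X A_hat)^T (Y - X A_hat) / n (Equation 4.82)."""
    n = X.shape[0]
    A_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    R = Y - X @ A_hat
    return R.T @ R / n

def wilks_lambda(X, Y, l):
    """Lambda^(2/n) = |Sigma_hat| / |Sigma_hat_1|, with X_1 the first l columns of X."""
    sigma_full = mle_sigma(X, Y)
    sigma_reduced = mle_sigma(X[:, :l], Y)
    return float(np.linalg.det(sigma_full) / np.linalg.det(sigma_reduced))

rng = np.random.default_rng(7)
n, l = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
noise = rng.normal(size=(n, 2))
A1 = np.array([[1.0, 0.0], [2.0, -1.0]])                    # hypothetical A_(1)

Y_null = X[:, :l] @ A1 + noise                              # H0 holds: A_(2) = 0
Y_alt = Y_null + np.outer(X[:, 2], np.array([3.0, -3.0]))   # H0 fails

lam_null = wilks_lambda(X, Y_null, l)
lam_alt = wilks_lambda(X, Y_alt, l)
```

The ratio lies in (0, 1]; values near 1 are consistent with $H_0$, and small values indicate that the dropped variables explain a substantial part of the responses.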
Theorem 4.14 (Likelihood ratio test): Suppose the multivariate regression model in Equation 4.65 holds with X of full rank, i.e., rank(X) = p, p + q ≤ n. Assume that the error ξ follows a normal distribution. Under $H_0: A_{(2)} = 0$, $n\hat{\Sigma} \sim W_{n-p}(\Sigma)$ and is independent of $n(\hat{\Sigma}_1 - \hat{\Sigma})$, which follows $W_{p-l}(\Sigma)$. The likelihood ratio test of $H_0$ is equivalent to rejecting $H_0$ for large values of

$$-2 \ln(\Lambda) = -n \ln \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|} = -n \ln \frac{|n\hat{\Sigma}|}{|n\hat{\Sigma} + n(\hat{\Sigma}_1 - \hat{\Sigma})|}. \tag{4.89}$$

When the sample size n is large, the distribution of the modified statistic

$$-\left[ n - p - \frac{1}{2}(q - p + l + 1) \right] \ln \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|} \tag{4.90}$$

is close to a $\chi^2$ distribution with d.o.f. q(p − l), the number of elements of $A_{(2)}$ set to zero under $H_0$.

The theorem and its proof can be found on page 422 in Reference 1, where the possibility that X is not of full rank is also discussed. Besides the likelihood ratio test, other multivariate test statistics are available for testing $H_0: A_{(2)} = 0$. Wilks' lambda, Pillai's trace, the Hotelling–Lawley trace, and Roy's greatest root are the four multivariate test statistics commonly used in computer-based packages. Interested readers may refer to Reference 1.

4.2.6.4 Predictions from Multivariate Regression
As in multiple regression analysis, the multivariate regression model can also be used for predictions if the model has been fit and checked for any inadequacies. Suppose the multivariate regression model has the form

$$Y = XA + \xi,$$

where ξ is normally distributed. If the model is adequate, two predictive problems can be solved using the fitted model.

1. Predicting the mean responses corresponding to fixed values $x_0$: Based on Theorem 4.13 (MLE) and the remarks therein, we have
$$\hat{A}^T x_0 \sim N_q\left( A^T x_0, \; x_0^T (X^T X)^{-1} x_0 \, \Sigma \right), \tag{4.91}$$

$$n\hat{\Sigma} \sim W_{n-p}(\Sigma). \tag{4.92}$$

Clearly, the unknown value of the regression function at $x_0$, $A^T x_0$, is estimated by $\hat{A}^T x_0$. So the $T^2$ statistic can be defined as:

$$T^2 = \left( \frac{\hat{A}^T x_0 - A^T x_0}{\sqrt{x_0^T (X^T X)^{-1} x_0}} \right)^T \left( \frac{n}{n-p} \hat{\Sigma} \right)^{-1} \left( \frac{\hat{A}^T x_0 - A^T x_0}{\sqrt{x_0^T (X^T X)^{-1} x_0}} \right). \tag{4.93}$$

The 100(1 − α)% confidence ellipsoid for $A^T x_0$ is given by the inequality:

$$\left( A^T x_0 - \hat{A}^T x_0 \right)^T \left( \frac{n}{n-p} \hat{\Sigma} \right)^{-1} \left( A^T x_0 - \hat{A}^T x_0 \right) \le x_0^T (X^T X)^{-1} x_0 \left[ \frac{q(n-p)}{n-p+1-q} F_{q,n-p+1-q}(\alpha) \right], \tag{4.94}$$

where $F_{q,n-p+1-q}(\alpha)$ is the upper (100α)th percentile of an F-distribution with d.o.f. q and n − p + 1 − q. The 100(1 − α)% simultaneous confidence intervals for $E(Y_i) = x_0^T a_{(i)}$ are:

$$x_0^T \hat{a}_{(i)} \pm \sqrt{\frac{q(n-p)}{n-p+1-q} F_{q,n-p+1-q}(\alpha)} \sqrt{x_0^T (X^T X)^{-1} x_0 \left( \frac{n}{n-p} \hat{\sigma}_{ii} \right)}, \quad i = 1, 2, \ldots, q, \tag{4.95}$$

where $\hat{a}_{(i)}$ is the ith column of $\hat{A}$ and $\hat{\sigma}_{ii}$ is the ith diagonal element of $\hat{\Sigma}$.

2. Forecasting new responses $Y_0$ at $x_0$: If the model is adequate, it can be used to forecast new responses $Y_0 = A^T x_0 + \xi_0$ at $x_0$, where $\xi_0$ is independent of ξ.
Because

$$Y_0 - \hat{A}^T x_0 = A^T x_0 + \xi_0 - \hat{A}^T x_0 = (A^T - \hat{A}^T) x_0 + \xi_0,$$

its distribution is:

$$Y_0 - \hat{A}^T x_0 \sim N_q\left( 0, \; \Sigma + x_0^T (X^T X)^{-1} x_0 \, \Sigma \right), \tag{4.96}$$

which is independent of $n\hat{\Sigma}$. Thus, the 100(1 − α)% prediction region for $Y_0$ is the ellipsoid centered at $\hat{A}^T x_0$:

$$\left( Y_0 - \hat{A}^T x_0 \right)^T \left( \frac{n}{n-p} \hat{\Sigma} \right)^{-1} \left( Y_0 - \hat{A}^T x_0 \right) \le \left( 1 + x_0^T (X^T X)^{-1} x_0 \right) \left[ \frac{q(n-p)}{n-p+1-q} F_{q,n-p+1-q}(\alpha) \right]. \tag{4.97}$$

The 100(1 − α)% simultaneous prediction intervals for the individual responses $y_{0i}$ are:

$$x_0^T \hat{a}_{(i)} \pm \sqrt{\frac{q(n-p)}{n-p+1-q} F_{q,n-p+1-q}(\alpha)} \sqrt{\left( 1 + x_0^T (X^T X)^{-1} x_0 \right) \left( \frac{n}{n-p} \hat{\sigma}_{ii} \right)}, \quad i = 1, 2, \ldots, q, \tag{4.98}$$

where $\hat{a}_{(i)}$ is the ith column of $\hat{A}$ and $\hat{\sigma}_{ii}$ is the ith diagonal element of $\hat{\Sigma}$.

Remark 4.33: By comparing the previous two 100(1 − α)% simultaneous intervals, we can see that, because of the presence of the random error $\xi_{0i}$, the prediction intervals for the actual values of the response variables are wider than the corresponding intervals for the expected values.

4.2.6.5 Selection Method of Independent Variables
In the previous chapters, we have discussed that the multiple correlation coefficient increases with an increase in the number of independent variables involved in the regression model. If the number of independent variables exceeds n − 1, where n is the number of samples, the value of the multiple correlation coefficient becomes irrelevant to the magnitude of the correlation between $X_1, X_2, \ldots, X_p$. To deal with this problem, some methods are useful in determining a set of extra variables to be included in the regression function, and the multiple correlation coefficient is adjusted for d.o.f. Among these methods, the AIC criterion, which was first proposed by Akaike [4], and the PSS criterion [5] are commonly used. Detailed descriptions are available in Reference 6.
4.3 EXERCISES
1. Let X be a sequence of i.i.d. 2-D random vectors with a multivariate normal distribution. Suppose that the observations x are:

$$x = \begin{bmatrix} 3 & 6 \\ 5 & 4 \\ 7 & 5 \\ 2 & 7 \\ 4 & 10 \\ 6 & 9 \end{bmatrix}$$

   Given that $\mu_0 = [5 \;\; 7]^T$:
   a. Conduct the hypothesis testing for $H_0: \mu = \mu_0$. Determine $T^2$ using the given data.
   b. From (a), what is the distribution of $T^2$? Also, determine all parameters of this distribution.
   c. Given the α = .05 level and based on the given data, what will be the conclusion for the test in (a)?

2. The following data are the off-target measurements (in mm) of an automobile body in the panel assembly process. X1, X2, X3, and X4 are the associated deviations from their nominal positions in mm (based on sensors). Twenty-five parts are measured.

   Part       X1        X2        X3        X4
     1     0.0388    0.0050    0.0073    0.4319
     2    -0.0322    0.0004    0.0161    0.4479
     3    -0.0195   -0.0146    0.0581    0.3606
     4     0.0254   -0.0747    0.0644    0.4115
     5     0.0054   -0.0476   -0.0065    0.3864
     6     0.0425   -0.0504   -0.0158    0.3508
     7    -0.0059    0.0047    0.0156    0.4185
     8    -0.0352    0.0404   -0.1069    0.3803
     9    -0.0457   -0.0094   -0.0309    0.4165
    10    -0.0620    0.0531    0.0007    0.3587
    11    -0.0076   -0.0063   -0.0101    0.3819
    12    -0.0181    0.0219    0.0274    0.4491
    13     0.0142    0.0290    0.0157    0.3799
    14     0.0166    0.0805   -0.0255    0.3346
    15     0.0151   -0.0351   -0.0156    0.4253
    16     0.0358    0.0091   -0.0223    0.3699
    17    -0.0166    0.0375    0.0181    0.3903
    18     0.0298    0.0137    0.0736    0.3818
    19    -0.0040    0.0458    0.0412    0.3279
    20    -0.0175    0.0117    0.0271    0.5460
    21     0.0403    0.0355    0.0196    0.3710
    22     0.0274   -0.0487    0.0115    0.3898
    23    -0.0154   -0.0581    0.0120    0.5007
    24     0.0026   -0.0097    0.0253    0.4102
    25     0.0128   -0.0092    0.0272    0.3958
   Given that the true mean vector is $\mu_0 = [0 \;\; 0]^T$, use the data on X1 and X2 to do the following:
   a. Conduct the hypothesis testing for $H_0: \mu = \mu_0$. Determine $T^2$ using the given data.
   b. Calculate $T^2$ using Equation 4.6. Is it the same as the result from (a)?
   c. Find the distribution of $T^2$.
   d. Given the α = .05 level and based on the given data, what will be the conclusion of the test in (a)?

3. Using the data on X1 and X2 in Problem 2, assume that α = 0.05 and $\mu_0 = [0 \;\; 0]^T$. The hypothesis testing is $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$. Construct a confidence ellipse. Is the result from the confidence ellipse the same as the conclusion in Problem 2d?

4. Using the data in Problem 2 and given α = 0.05, construct a $T^2$ chart and a control ellipse for X3 and X4.

5. From the data on X1 and X2 in Problem 2, construct a control ellipse for a future observation $x^T = (x_1, x_2)$. Comment on the result.

6. Repeat Problem 5, but use the data on X3 and X4 in Problem 2.

7. Suppose

$$X^T = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 3 & 5 & 8 & 7 & 9 & 4 \\ 1 & 3 & 4 & 4 & 5 & 2 \end{bmatrix} \quad \text{and} \quad Y^T = \begin{bmatrix} 15 & 9 & 20 & 18 & 25 & 16 \end{bmatrix}.$$
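   The multiple linear regression model is $Y = Xa + \xi$, where

$$a = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} \quad \text{and} \quad \xi = \begin{bmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_6 \end{bmatrix}.$$

   a. Determine the least-squares estimates $\hat{a}$.
   b. Determine the fitted values $\hat{Y}$.
   c. Calculate the residuals $\hat{\xi}$.
   d. Calculate the residual sum of squares, $\hat{\xi}^T \hat{\xi}$.

8. The relationship between the measurement deviations (y) and the pin deviations (u) in a single-station assembly process can be represented by the following linear relationship:

$$y = \Gamma \cdot u + \varepsilon,$$

   where Γ is the geometric relationship between y and u, and ε is the model error. Suppose

$$y = \begin{bmatrix} \delta x_1 & \delta z_1 & \delta x_3 & \delta z_3 & \delta x_5 & \delta x_9 & \delta z_9 \end{bmatrix}^T, \quad u = \begin{bmatrix} \delta P_{1x} & \delta P_{1z} & \delta P_{2z} \end{bmatrix}^T, \quad \sigma_\varepsilon^2 = 1.7361 \times 10^{-5},$$

   and

$$\Gamma = \begin{bmatrix} 0.9972 & -0.1157 & 0.1157 \\ -0.0093 & 0.6196 & 0.3804 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ -0.0150 & 0.3846 & 0.6154 \\ 1.0072 & 0.2961 & -0.2961 \\ -0.0246 & -0.0049 & 1.0049 \end{bmatrix}.$$

   Let N = 25, i.e., there are 25 parts observed.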
   The observation data for y are provided in the table below:

   Part     δx1       δz1       δx3       δz3       δx5       δx9       δz9
     1    0.4231    0.0073    0.4319    0.0331    0.4226    0.4496   -0.0348
     2    0.4337    0.0161    0.4479    0.0351    0.4396    0.4669   -0.0380
     3    0.3563    0.0581    0.3606    0.0834    0.3513    0.3873    0.0088
     4    0.4174    0.0644    0.4115    0.0404    0.4137    0.4032    0.0856
     5    0.3748   -0.0065    0.3864    0.0496    0.3769    0.4227   -0.0881
     6    0.3472   -0.0158    0.3508   -0.0077    0.3411    0.3683   -0.0430
     7    0.4237    0.0156    0.4185   -0.0119    0.4228    0.4078    0.0399
     8    0.3879   -0.1069    0.3803   -0.1330    0.3965    0.3644   -0.0595
     9    0.4197   -0.0309    0.4165   -0.0251    0.4195    0.4231   -0.0486
    10    0.3532    0.0007    0.3587   -0.0085    0.3510    0.3496    0.0010
    11    0.3686   -0.0101    0.3819    0.0113    0.3736    0.3968   -0.0478
    12    0.4454    0.0274    0.4491    0.0329    0.4412    0.4569    0.0202
    13    0.3769    0.0157    0.3799    0.0078    0.3786    0.3707    0.0407
    14    0.3320   -0.0255    0.3346   -0.0219    0.3292    0.3217   -0.0262
    15    0.4306   -0.0156    0.4253   -0.0107    0.4301    0.4360   -0.0249
    16    0.3921   -0.0223    0.3699   -0.0895    0.3885    0.3265    0.0688
    17    0.3849    0.0181    0.3903    0.0078    0.3949    0.3669    0.0463
    18    0.3738    0.0736    0.3818    0.0890    0.3748    0.3861    0.0584
    19    0.3237    0.0412    0.3279    0.0652    0.3167    0.3464    0.0032
    20    0.5418    0.0271    0.5460    0.0439    0.5421    0.5565   -0.0107
    21    0.3687    0.0196    0.3710    0.0279    0.3687    0.3781    0.0088
    22    0.3925    0.0115    0.3898    0.0119    0.3846    0.3875   -0.0038
    23    0.4846    0.0120    0.5007    0.0294    0.4868    0.5132   -0.0238
    24    0.3888    0.0253    0.4102    0.0713    0.3875    0.4248   -0.0426
    25    0.3828    0.0272    0.3958    0.0618    0.3805    0.4221   -0.0199

   a. Use least-squares estimation to find $\hat{u}$.
   b. From Theorem 4.7, find an unbiased estimator of $\sigma_\varepsilon^2$.
   c. Given that α = 0.05, find the confidence region for u.
   d. Determine the simultaneous 100(1 − α)% confidence intervals for the $u_i$, i = 1, 2, 3. Use α = 0.05.
REFERENCES
1. Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 4th ed., Prentice Hall, Upper Saddle River, NJ, 1998.
2. Sen, A. and Srivastava, M., Regression Analysis: Theory, Methods and Applications, Springer-Verlag, New York, 1990.
3. Srivastava, M.S., Methods of Multivariate Statistics (Wiley Series in Probability and Statistics), John Wiley & Sons, New York, 2002.
4. Akaike, H., Autoregressive model fitting for control, Annals of the Institute of Statistical Mathematics, 23, 163–180, 1971.
5. Allen, D.M., Mean squared error of prediction as a criterion for selecting variables, Technometrics, 13, 469–475, 1971.
6. Takeuchi, K., Yanai, H., and Mukherjee, B.N., The Foundations of Multivariate Analysis: A Unified Approach by Means of Projection onto Linear Subspaces, John Wiley & Sons, New York, 1982.
7. Muirhead, R.J., Aspects of Multivariate Statistical Theory (Wiley Series in Probability and Statistics), John Wiley & Sons, New York, 1982.
8. Morrison, D.F., Multivariate Statistical Methods, 2nd ed., McGraw-Hill, New York, 1976.
9. Kleinbaum, D.G. et al., Applied Regression Analysis and Other Multivariate Methods, 2nd ed., PWS-KENT Publishing Company, Boston, MA, 1988.
10. Manly, B.F.J., Multivariate Statistical Methods: A Primer, Chapman and Hall, London, 1986.
11. Kshirsagar, A.M., Multivariate Analysis, Marcel Dekker, New York, 1972.
12. Karson, M.J., Multivariate Statistical Methods, The Iowa State University Press, Ames, IA, 1982.
13. Hair, J.F. et al., Multivariate Data Analysis, 3rd ed., Macmillan Publishing Company, New York, 1992.

ADDITIONAL READING
Hair, J.F. et al., Multivariate Data Analysis, 3rd ed., Macmillan Publishing Company, New York, 1992.
Karson, M.J., Multivariate Statistical Methods, The Iowa State University Press, Ames, IA, 1982.
Kleinbaum, D.G. et al., Applied Regression Analysis and Other Multivariate Methods, 2nd ed., PWS-KENT Publishing Company, Boston, MA, 1988.
Kshirsagar, A.M., Multivariate Analysis, Marcel Dekker, New York, 1972.
Manly, B.F.J., Multivariate Statistical Methods: A Primer, Chapman and Hall, London, 1986.
Morrison, D.F., Multivariate Statistical Methods, 2nd ed., McGraw-Hill, New York, 1976.
Muirhead, R.J., Aspects of Multivariate Statistical Theory (Wiley Series in Probability and Statistics), John Wiley & Sons, New York, 1982.
5 Principal Component Analysis and Factor Analysis
5.1 PRINCIPAL COMPONENT ANALYSIS
A complex system may involve large volumes of data with hundreds of different variables. In such situations, it is desirable to find ways to effectively reduce the dimensions of the data with minimum loss of information, and to interpret the data with physical meanings. Principal component analysis (PCA) provides the capability to identify patterns in high-dimensional data and to reconstruct the data so as to highlight similarities and differences. The concept of PCA was first introduced by Karl Pearson in 1901 and formally treated by Hotelling [1] and Rao [2]. PCA has been widely and successfully applied in various fields by analyzing the variance-covariance structure through linear combinations of multivariate data. In this section, the basic mathematical model of PCA, its geometrical interpretation, as well as related statistical tests will be discussed. A potential application in process control is also presented.

5.1.1 MATHEMATICAL MODEL OF PCA
Suppose $X^T = [X_1 \; X_2 \; \cdots \; X_p]$ is a random vector with variance-covariance

$$\Sigma_X = \mathrm{var}(X) = E\left[ (X - \mu_X)(X - \mu_X)^T \right]. \tag{5.1}$$

Clearly, $\Sigma_X$ is at least positive semidefinite with rank r ≤ p; thus, all the eigenvalues are real and nonnegative. Without loss of generality, let the eigenvalues be $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ and the standardized eigenvectors be $e_1, e_2, \ldots, e_p$. Denote

$$E = \begin{bmatrix} e_1 & e_2 & \cdots & e_p \end{bmatrix} = \begin{bmatrix} e_{11} & e_{12} & \cdots & e_{1p} \\ e_{21} & e_{22} & \cdots & e_{2p} \\ \vdots & \vdots & & \vdots \\ e_{p1} & e_{p2} & \cdots & e_{pp} \end{bmatrix}^T, \tag{5.2}$$

then $E E^T = E^T E = I$ and $E^T \Sigma_X E = \Lambda$, or $\Sigma_X = E \Lambda E^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$. Consider the linear combination:
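$$Y_i = a_{i1}X_1 + a_{i2}X_2 + \cdots + a_{ip}X_p = a_i^T X, \quad i = 1, 2, \ldots, p \tag{5.3}$$

where $a_i^T = [a_{i1}\ a_{i2}\ \cdots\ a_{ip}]$, $a_i^T a_i = 1$, and $X^T = [X_1\ X_2\ \cdots\ X_p]$. The variance-covariance structure of $Y_i$ is determined by:

$$\begin{aligned} \mathrm{var}(Y_i) &= \mathrm{var}(a_i^T X) = a_i^T \Sigma_X a_i \\ \mathrm{cov}(Y_i, Y_j) &= \mathrm{cov}(a_i^T X, a_j^T X) = a_i^T \Sigma_X a_j, \quad i, j = 1, 2, \ldots, p \end{aligned} \tag{5.4}$$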
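As $\Sigma_X = E \Lambda E^T$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$,

$$\mathrm{var}(Y_i) = a_i^T \Sigma_X a_i = a_i^T E \Lambda E^T a_i = \sum_{j=1}^{p} \lambda_j (a_i^T e_j)^2 \le \lambda_1 \sum_{j=1}^{p} (a_i^T e_j)^2 = \lambda_1 a_i^T E E^T a_i = \lambda_1 a_i^T a_i = \lambda_1 \tag{5.5}$$

Equality holds when $a_i = e_1$. So $a_i = e_1$ maximizes $\mathrm{var}(Y_i)$ and

$$\lambda_1 = \mathrm{var}(e_1^T X) = e_1^T \Sigma_X e_1 = \max_{a:\, a^T a = 1} a^T \Sigma_X a \tag{5.6}$$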
It can be shown that for any $i = 1, 2, \ldots, p$, $\lambda_i = \mathrm{var}(e_i^T X)$, and

$$\mathrm{cov}(e_i^T X, e_j^T X) = e_i^T \Sigma_X e_j = e_i^T E \Lambda E^T e_j = 0 \quad \text{for } i \ne j \tag{5.7}$$

So the principal components (PCs) of $X_i$, $i = 1, 2, \ldots, p$, are the uncorrelated linear combinations of $X_i$, $i = 1, 2, \ldots, p$, with the components of the eigenvectors as combination coefficients.

Property 1: Let $\Sigma_X$ be the covariance matrix of the random vector $X^T = [X_1\ X_2\ \cdots\ X_p]$ and let its eigenvalue-eigenvector pairs be $(\lambda_i, e_i)$, $i = 1, 2, \ldots, p$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$. Then the ith PC is defined by the linear combination:

$$Y_i = e_{i1}X_1 + e_{i2}X_2 + \cdots + e_{ip}X_p = e_i^T X, \quad i = 1, 2, \ldots, p \tag{5.8}$$

and $\mathrm{var}(Y_i) = \lambda_i$.

Remark 5.1: $\mathrm{var}(Y_1) \ge \mathrm{var}(Y_2) \ge \cdots \ge \mathrm{var}(Y_p) \ge 0$ because $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$.
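Property 1 and Equations 5.6 and 5.7 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the covariance matrix `sigma` is a hypothetical example chosen only for illustration, not data from the text.

```python
import numpy as np

# Hypothetical 3x3 covariance matrix, used only for illustration.
sigma = np.array([[4.0, 2.0, 0.5],
                  [2.0, 3.0, 1.0],
                  [0.5, 1.0, 2.0]])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(sigma)
order = np.argsort(eigvals)[::-1]      # reorder so lambda_1 >= ... >= lambda_p
lam = eigvals[order]
E = eigvecs[:, order]                  # columns are e_1, ..., e_p

# Covariance of Y = E^T X is E^T Sigma E, which should equal diag(lambda).
cov_y = E.T @ sigma @ E

# Any other unit vector a gives a variance no larger than lambda_1 (Equation 5.6).
rng = np.random.default_rng(0)
a = rng.normal(size=3)
a /= np.linalg.norm(a)
var_a = a @ sigma @ a
```

Here `cov_y` is diagonal up to rounding error, which is the numerical counterpart of Equation 5.7.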
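Remark 5.2: If some eigenvalues are identical, the choice of $e_i$, and hence $Y_i$, is not unique.

Property 2:

$$\sum_{i=1}^{p} \mathrm{var}(Y_i) = \sum_{i=1}^{p} \mathrm{var}(X_i)$$

Property 3: The correlation between the PCs and the original variables is:

$$\rho(Y_i, X_j) = \frac{\mathrm{cov}(Y_i, X_j)}{\sqrt{\mathrm{var}(Y_i)\,\mathrm{var}(X_j)}} = \frac{e_{ij}\sqrt{\lambda_i}}{\sqrt{\sigma_{jj}}}, \quad i, j = 1, 2, \ldots, p$$

where $\sigma_{jj} = \mathrm{var}(X_j)$.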
From Property 2, we can see that the total variance will not be significantly affected by ignoring components with smaller variances. This is why the PCA approach can be used to reduce the number of variables without losing too much information. The contribution rate of the ith principal component can be defined as:

$$\frac{\mathrm{var}(Y_i)}{\sum_{j=1}^{p} \mathrm{var}(Y_j)} = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j} \tag{5.9}$$
The larger this rate, the larger the capability of this PC to summarize the information in $X_i$, $i = 1, 2, \ldots, p$. The accumulated contribution rate of the first k PCs is given by:

$$\frac{\sum_{j=1}^{k} \mathrm{var}(Y_j)}{\sum_{j=1}^{p} \mathrm{var}(Y_j)} = \frac{\sum_{j=1}^{k} \lambda_j}{\sum_{j=1}^{p} \lambda_j} \tag{5.10}$$
If this rate reaches 85%, the first k PCs capture 85% of the original variation in $X_i$, $i = 1, 2, \ldots, p$. Thus, the number of variables is reduced effectively without losing too much information. We define the Loss of Information (LoI) incurred by using the first k principal components instead of all the variables $X_i$, $i = 1, 2, \ldots, p$, as:

$$\mathrm{LoI} = \sum_{i=1}^{p} \mathrm{var}(X_i) - \sum_{i=1}^{k} \mathrm{var}(Y_i) = \sum_{i=1}^{p} \lambda_i - \sum_{i=1}^{k} \lambda_i = \sum_{i=k+1}^{p} \lambda_i \tag{5.11}$$

where k is chosen so that LoI is acceptably small.
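Equations 5.9 through 5.11 translate directly into a few lines of code. The sketch below assumes NumPy; the function names and the eigenvalues in `lam_example` are hypothetical and serve only as an illustration.

```python
import numpy as np

def contribution_rates(eigenvalues):
    """Individual and cumulative contribution rates (Equations 5.9 and 5.10)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    rates = lam / lam.sum()
    return rates, np.cumsum(rates)

def components_needed(eigenvalues, threshold=0.85):
    """Smallest k whose cumulative contribution rate reaches the threshold."""
    _, cumulative = contribution_rates(eigenvalues)
    k = np.searchsorted(cumulative, threshold) + 1
    return int(min(k, len(cumulative)))

def loss_of_information(eigenvalues, k):
    """Sum of the discarded eigenvalues (Equation 5.11)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    return float(lam[k:].sum())

# Hypothetical eigenvalues, for illustration only.
lam_example = [5.0, 3.0, 1.0, 0.6, 0.4]
rates, cumulative = contribution_rates(lam_example)
k = components_needed(lam_example, 0.85)
loi = loss_of_information(lam_example, k)
```

With these eigenvalues, the first two PCs account for 80% of the total variance and the first three for 90%, so an 85% target leads to k = 3.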
For any other choice of unit-length, mutually orthogonal coefficient vectors $a_1, a_2, \ldots, a_k$:

$$\sum_{i=1}^{p} \mathrm{var}(X_i) - \sum_{i=1}^{k} \mathrm{var}(a_i^T X) \ge \sum_{i=1}^{p} \mathrm{var}(X_i) - \sum_{i=1}^{k} \mathrm{var}(e_i^T X) \tag{5.12}$$

Equality holds if and only if $a_i = e_i$. PCs can also be calculated using "standardized" variables.

Remark 5.3: In general, the eigenvalue–eigenvector pairs obtained from a covariance matrix are different from those obtained from the correlation matrix. This leads to different PCs from the covariance matrix $\Sigma$ and the correlation matrix $R$. The covariance matrix $\Sigma$ is recommended when the variables are measured in the same, or comparable, units.

Sometimes, the population variance-covariance matrix $\Sigma_X$ is not known. In this case, the sample covariance (or correlation) matrix is used in PCA. Corresponding to the results for the population, there are similar results for a random sample.

Property 4: Let $x$ be the sample data matrix and $S$ be the sample covariance matrix with eigenvalue–eigenvector pairs $(\hat\lambda_i, \hat e_i)$, $i = 1, 2, \ldots, p$, with $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_p \ge 0$. Then the ith PC is defined by:

$$\hat y_i = \hat e_{i1}x_1 + \hat e_{i2}x_2 + \cdots + \hat e_{ip}x_p = \hat e_i^T x, \quad i = 1, 2, \ldots, p \tag{5.13}$$
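The variance-covariance matrix of $\hat y^T = [\hat y_1, \hat y_2, \ldots, \hat y_p]$ is:

$$\mathrm{var}(\hat y) = \mathrm{diag}(\hat\lambda_1, \hat\lambda_2, \ldots, \hat\lambda_p) \tag{5.14}$$

The total sample variance is:

$$\sum_{i=1}^{p} \mathrm{var}(\hat y_i) = \sum_{i=1}^{p} \mathrm{var}(x_i) = \sum_{i=1}^{p} s_{ii} = \sum_{i=1}^{p} \hat\lambda_i \tag{5.15}$$

and the sample correlation between the ith PC and the jth variable is:

$$\rho(\hat y_i, x_j) = \frac{\mathrm{cov}(\hat y_i, x_j)}{\sqrt{\mathrm{var}(\hat y_i)\,\mathrm{var}(x_j)}} = \frac{\hat e_{ij}\sqrt{\hat\lambda_i}}{\sqrt{s_{jj}}}, \quad i, j = 1, 2, \ldots, p \tag{5.16}$$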
In summary, the PCA procedure using sample measurements is:

Step 1: Obtain the sample matrix $x$ and standardize the data if the variables are not measured in the same unit.
Step 2: Calculate the sample covariance matrix $S$ or correlation matrix $R$.
Step 3: Compute the sample eigenvalue–eigenvector pairs $(\hat\lambda_i, \hat e_i)$, $i = 1, 2, \ldots, p$, with $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_p \ge 0$.
Step 4: Obtain the sample PCs: $\hat y_i = \hat e_{i1}x_1 + \hat e_{i2}x_2 + \cdots + \hat e_{ip}x_p = \hat e_i^T x$, $i = 1, 2, \ldots, p$.

Remark 5.4: From the derivation of PCs, it is evident that they depend only on the covariance matrix $\Sigma_X$; a multinormal distribution is not required. However, normality is required for statistical tests.

FIGURE 5.1 Relationship between PCs Y1, Y2 and original variables X1, X2.
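The four-step sample PCA procedure summarized in this subsection can be sketched as follows. This is an illustrative implementation assuming NumPy, not the author's code; the function name `sample_pca` and the simulated data are hypothetical. The scores are computed from centered data, so each sample PC has mean zero.

```python
import numpy as np

def sample_pca(x, standardize=False):
    """Sample PCA on an n-by-p data matrix x (rows are observations)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean(axis=0)                    # center each variable
    if standardize:                            # Step 1: use when units differ
        xc = xc / x.std(axis=0, ddof=1)
    s = np.cov(xc, rowvar=False)               # Step 2: S (or R if standardized)
    eigvals, eigvecs = np.linalg.eigh(s)       # Step 3: eigen-pairs
    order = np.argsort(eigvals)[::-1]
    lam_hat, e_hat = eigvals[order], eigvecs[:, order]
    scores = xc @ e_hat                        # Step 4: column i holds y_hat_i
    return lam_hat, e_hat, scores

# Simulated data, for illustration only.
rng = np.random.default_rng(1)
data = rng.multivariate_normal([0, 0, 0],
                               [[4, 2, 0.5], [2, 3, 1], [0.5, 1, 2]], size=500)
lam_hat, e_hat, scores = sample_pca(data)
```

The sample covariance of `scores` is diagonal with entries `lam_hat`, matching Equation 5.14, and the eigenvalues sum to the total sample variance as in Equation 5.15.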
5.1.2 GEOMETRICAL INTERPRETATION

From the previous subsection, the p PCs can be algebraically viewed as special linear combinations of the original variables $X_i$, $i = 1, 2, \ldots, p$, i.e.,

$$Y_i = e_{i1}X_1 + e_{i2}X_2 + \cdots + e_{ip}X_p = e_i^T X, \quad i = 1, 2, \ldots, p \tag{5.17}$$

From a geometrical point of view, these uncorrelated linear combinations $Y_i$, $i = 1, 2, \ldots, p$, form a new coordinate system obtained by rotating the old coordinate system spanned by $X_i$, $i = 1, 2, \ldots, p$. The PCs have the maximum variance. For simplicity, we will illustrate this transformation using p = 2. The 2-D scatter plot of the data is shown in Figure 5.1. Suppose we have n samples. Let the original variables be $X_1, X_2$ and the two PCs be $Y_1, Y_2$. Further, let us assume $X^T = (X_1, X_2) \sim N(\mu, \Sigma)$. Without loss of generality, we can assume that the data are centered at the origin, i.e., $\mu = 0$. For a bivariate normally distributed random vector $X$, the n points are scattered in an ellipse with the direction of $Y_1$ being its long-axis direction and the direction of $Y_2$ being its short-axis direction. The new coordinate system is obtained by rotating the original coordinate system through an angle $\theta$. Let

$$E = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \tag{5.18}$$
Clearly, $E$ is an orthogonal matrix. The preceding relationship can be written in matrix form as:

$$\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = E \cdot X \tag{5.19}$$

From Figure 5.1, the first PC, $Y_1$, exhibits the maximum variance and the second PC, $Y_2$, exhibits the second largest variance. So by choosing the linear combination with the maximum dispersion, most of the variation in $X_1, X_2$ can be explained. If the ellipse is very flat, then the total sample variation can be approximated by the dispersion in the $Y_1$ direction, and thus the effect of $Y_2$ on the total variation can be ignored. In this sense, the original 2-D data are reduced to a single one-dimensional PC. In general, p variables $X_i$, $i = 1, 2, \ldots, p$, span a p-dimensional space, and n samples can be viewed as n points in this p-dimensional space. Then, selecting p PCs is equivalent to identifying the p principal axes of the p-dimensional ellipsoid.
5.1.3 INFERENCES ON PCS
In this subsection, we consider two types of inference, both of which require the normality assumption $X \sim N_p(\mu, \Sigma)$:

1. Statistical testing for independence of the original variables, i.e., $H_0: R = I$, where $R$ is the correlation matrix of the multinormal variables $X$. For a large sample size n, a likelihood ratio test for $H_0$ is based on

$$|R| = \prod_{i=1}^{p} \hat\lambda_i \tag{5.20}$$

where $\hat\lambda_i$, $i = 1, 2, \ldots, p$, are the eigenvalues of the correlation matrix $R$. $H_0$ is rejected if:

$$-m \ln|R| > \chi^2_{p(p-1)/2}(\alpha) \tag{5.21}$$

where $m = n - 1 - \dfrac{2p + 5}{6}$. It should be noted that PCA is worthwhile only if $H_0$ is rejected.

2. Statistical testing for isotropy of the eigenvalues, i.e., $H_0: \lambda_{q+1} = \lambda_{q+2} = \cdots = \lambda_p = \lambda$, which can be used to determine the number of PCs.
Equation 5.10 or Equation 5.11 can be used to determine the number of PCs, i.e., to choose the first few PCs to replace the original p (usually very large) variables without much loss of information. Here, an alternative method is given based on the test of isotropy:

$$\begin{aligned} H_0 &: \lambda_{q+1} = \lambda_{q+2} = \cdots = \lambda_p = \lambda \\ H_a &: \lambda_i \ne \lambda \text{ for at least one } i = q+1, q+2, \ldots, p \end{aligned} \tag{5.22}$$

where $\lambda$ is an unknown constant. If $H_0$ is not rejected, then in addition to the first q PCs, all the remaining PCs must be added if any one of them is added, because they are isotropic. The likelihood ratio test for $H_0$ is based on the statistic:

$$Q = \frac{\left(\prod_{i=q+1}^{p} \hat\lambda_i\right)^{1/(p-q)}}{\dfrac{1}{p-q}\sum_{i=q+1}^{p} \hat\lambda_i} \tag{5.23}$$

where $\hat\lambda_i$ is an eigenvalue of the sample covariance matrix $S$, $i = 1, 2, \ldots, p$. Lawley [3] showed that

$$Q^* = -\left[n - 1 - q - \frac{1}{6}\left(2(p-q) + 1 + \frac{2}{p-q}\right)\right] \cdot \ln(Q) \tag{5.24}$$

is asymptotically distributed as $\chi^2_f$, where $f = \frac{1}{2}(p-q)(p-q+1) - 1$.

So $H_0$ is rejected if $Q^* > \chi^2_f(\alpha)$. When $H_0$ is rejected, more than q PCs are needed. If $H_0$ is not rejected, then at the significance level $\alpha$, the first q PCs are enough to replace the p original variables without much loss of information.
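The independence test of Equation 5.21 (the first of the two inferences above) can be sketched as follows. This assumes NumPy only; the function name `independence_statistic` and the simulated data are hypothetical. The critical value must be looked up separately; for 6 degrees of freedom and α = 0.05 it is about 12.592.

```python
import numpy as np

def independence_statistic(x):
    """Return (-m ln|R|, degrees of freedom) for H0: R = I (Equation 5.21)."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    r = np.corrcoef(x, rowvar=False)
    m = n - 1 - (2 * p + 5) / 6
    _, logdet = np.linalg.slogdet(r)           # ln|R|, computed stably
    return -m * logdet, p * (p - 1) // 2

# Simulated data, for illustration only.
rng = np.random.default_rng(2)
independent = rng.normal(size=(200, 4))
correlated = independent.copy()
correlated[:, 1] += 0.9 * correlated[:, 0]     # induce correlation between two columns

stat_ind, dof = independence_statistic(independent)
stat_cor, _ = independence_statistic(correlated)
```

For the correlated data, the statistic is far above the critical value, so $H_0$ would be rejected and PCA is worth pursuing.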
5.1.4 APPLYING PCS TO PROCESS CONTROL

It is known from previous chapters that quality ellipses and $T^2$ charts can be used as multivariate control charts. With a large number of variables, PCs can be used to construct a control chart.
Consider the data matrix

$$x = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}$$

with mean and covariance $(\mu, \Sigma)$. The first two sample PCs $\hat y_1$ and $\hat y_2$ are considered because they account for the largest cumulative proportion of the total sample variance and can be visualized easily. Richard Johnson [4] suggested two types of control charts:

1. Ellipse format chart based on the first two PCs for large n: The first two PCs $\hat y_1$ and $\hat y_2$ are independent, so the control ellipse is constructed by:

$$\frac{\hat y_1^2}{\hat\lambda_1} + \frac{\hat y_2^2}{\hat\lambda_2} \le \chi^2_2(\alpha) \tag{5.25}$$

This ellipse can be used to check the stability of measurements. If there are out-of-control points, special causes may exist in the process.

2. $T^2$ chart based on the remaining PCs: Assume $X - \mu \sim N_p(0, \Sigma)$. Let $Y_i$, $i = 1, 2, \ldots, p$, be centered PCs with mean 0. A $T^2$ chart is constructed from the residual information not contained in the first two PCs. Based on the last p – 2 PCs, the statistic

$$\sum_{j=3}^{p} \frac{Y_j^2}{\lambda_j} \sim \chi^2_{p-2}.$$

As $\hat Y_j / \sqrt{\hat\lambda_j}$, $j = 3, 4, \ldots, p$, are independent and normally distributed, the $T^2$ chart can be created based on the statistic

$$T_i^2 = \sum_{j=3}^{p} \frac{\hat y_{ij}^2}{\hat\lambda_j}$$

and the upper control limit (UCL) can be defined as $\mathrm{UCL} = \chi^2_{p-2}(\alpha)$. This chart can be used to identify special causes of variation.
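Both chart statistics can be computed from the sample PC scores. The sketch below assumes NumPy; the function name `pc_chart_statistics` and the simulated in-control data are hypothetical, and the control limit 5.991 is the upper 5% point of the chi-square distribution with 2 degrees of freedom.

```python
import numpy as np

def pc_chart_statistics(x):
    """Ellipse statistic from the first two PCs and T^2 from the remaining PCs."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    lam_hat, e_hat = eigvals[order], eigvecs[:, order]
    scaled = (xc @ e_hat) ** 2 / lam_hat       # y_hat_ij^2 / lambda_hat_j
    ellipse = scaled[:, :2].sum(axis=1)        # left side of Equation 5.25
    t2 = scaled[:, 2:].sum(axis=1)             # T_i^2 over j = 3, ..., p
    return ellipse, t2

# Simulated in-control measurements, for illustration only.
rng = np.random.default_rng(3)
measurements = rng.normal(size=(300, 5))
ellipse, t2 = pc_chart_statistics(measurements)

ucl_ellipse = 5.991                            # chi-square(2) upper 5% point
flagged = np.flatnonzero(ellipse > ucl_ellipse)
```

The indices in `flagged` are the samples that fall outside the control ellipse; the same comparison applies to `t2` against $\chi^2_{p-2}(\alpha)$.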
5.2 FACTOR ANALYSIS

5.2.1 INTRODUCTION

In many areas such as psychology, sociology, and health care, it is not always possible to directly measure concepts of major interest such as intelligence, satisfaction, or quality of life. In the manufacturing and service industries, the quality of a product or service is the critical issue and is usually affected by many direct or indirect factors. The effective identification and interpretation of these common factors is the key to success. The early stages of most statistical investigations in the areas listed earlier are characterized by little prior quantitative information about the area of study. Consequently, early research efforts are hit-and-miss by nature: many variables are studied in the hope of identifying some kind of order that will deepen understanding of the subject. Although correlation analysis can be used to identify the interrelationships among variables, practical problems arise when we have to deal with a large number of variables. The problem is that the number of correlation coefficients grows quadratically with the number of variables: p variables yield p(p – 1)/2 distinct coefficients. So, a higher-order data reduction technique is required that can systematically summarize large covariance or correlation matrices.

Factor analysis is one such data reduction technique. It is a family of procedures for removing redundancy from a set of correlated variables and representing the variables with a smaller set of "derived" variables, or factors. Alternatively, the factor analysis procedure can be thought of as removing duplicated information from a set of variables. We may also think of a factor loosely as a group of similar variables, although the factors themselves are often unobservable.
5.2.2 COMPARISON OF FACTOR ANALYSIS AND PCA
Factor analysis and PCA are functionally very similar, and both can be used for data dimension reduction. However, they are quite different in terms of their underlying assumptions. PCA is concerned with explaining the variance-covariance structure of a set of variables ($X^T = [X_1\ X_2\ \cdots\ X_p]$) through a few linear combinations of these variables ($Y^T = [Y_1\ Y_2\ \cdots\ Y_p]$). The PCs are those uncorrelated linear combinations that can be defined as $Y = UX$. Algebraically, the PCs ($Y_i$'s) are particular (uncorrelated) linear combinations of the random variables ($X_i$'s). Geometrically, these linear combinations represent the selection of a new coordinate system obtained by rotating the original system with $U$. Therefore, PCA is a method for analyzing the structure of the data covariance, but not for modeling the relationship between the data and some latent variables.

Factor analysis assumes that the variance of a single variable can be decomposed into a common variance that is shared by other variables included in the model, and a specific variance that is unique to a particular variable, including the error component. Factor analysis analyzes only the common variance of the observed variables, whereas PCA considers the total variance and makes no distinction between common and unique variances [5]. Stevens [6] described two criteria that could be used to choose one technique over the other:

• The objective of the analysis: Factor analysis and PCA are similar in that the purpose of both methods is to reduce the original variables to fewer composite variables, called factors or PCs. However, they are distinct because the composite variables obtained serve different purposes. In factor analysis, a small number of factors are extracted to account for the intercorrelations among the observed variables, i.e., to identify the latent variables that explain why they are correlated with one another. The objective of PCA is to account for the maximum portion of the variance present in the original set of variables with a minimum number of composite variables, called PCs.

• The assumption about the variance in the original variables: If the observed variable measurements are relatively error free (for example, age, years of education, or number of family members), or if it is assumed that the error and specific variance represent a small portion of the total variance in the original set of variables, then PCA is appropriate. But if the observed variables are only indicators of the latent structures to be measured (such as test scores or responses to attitude scales), or if the error variance represents a significant portion of the total variance, then factor analysis is the more appropriate technique.
5.2.3 ORTHOGONAL FACTOR MODEL

Factor analysis is a multivariate analysis technique that aims to reduce the number of variables while preserving as much of the original information as possible. In other words, it is used to uncover the latent structure (dimensions) of a set of variables. The essential purpose of factor analysis is to describe the covariance relationships among many variables in terms of a few underlying, but unobservable, random quantities. The factor model is motivated by the argument that variables can be grouped according to their correlations. That is, suppose all variables within a particular group are highly correlated among themselves, but have relatively small correlations with variables in a different group. Then it is conceivable that each group of variables represents a single underlying factor that is responsible for the observed correlations.

We suppose $X_1, X_2, \ldots, X_p$ are p observable random variables. Remember that if these p variables do not have the same unit, standardization should be done to avoid unreasonable results. The factor model postulates that these variables are linearly dependent on a few unobservable random variables $F_1, F_2, \ldots, F_m$, and p additional sources of variation $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$, which cannot be explained by $F_j$, $j = 1, 2, \ldots, m$. In particular, the factor analysis model in matrix notation is:
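$$\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \end{bmatrix} \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_m \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_p \end{bmatrix} \tag{5.26}$$

or

$$X_{p \times 1} = A_{p \times m} F_{m \times 1} + \varepsilon_{p \times 1} \tag{5.26′}$$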
where $F_1, F_2, \ldots, F_m$ are called common factors and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$ are called residual errors or specific factors. The coefficient $a_{ij}$ is called the loading of the ith variable on the jth factor. So the matrix $A$ is the loading matrix of factor analysis. Because the p variables $X_1, X_2, \ldots, X_p$ are expressed in terms of p + m unobservable random variables, $F_1, F_2, \ldots, F_m$ and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$, it is impossible to verify the factor model directly from the observations on $X_1, X_2, \ldots, X_p$. Thus, additional assumptions on $F_1, F_2, \ldots, F_m$ and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$ must be made to simplify the covariance relationships:

1. $m \le p$.
2. $E(\varepsilon) = 0$, $\Sigma_\varepsilon = \mathrm{Cov}(\varepsilon) = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_p^2)$, i.e., $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$ are uncorrelated and their variances may differ.
3. $\mathrm{Cov}(F, \varepsilon) = 0$, i.e., $F$ and $\varepsilon$ are uncorrelated.
4. $E(F) = 0$, $\Sigma_F = \mathrm{Cov}(F) = \mathrm{diag}(1, 1, \ldots, 1) = I_m$, i.e., $F_1, \ldots, F_m$ are uncorrelated and their variances are 1.

The preceding assumptions make the model of Equation 5.26 an orthogonal factor model.

Remark 5.5: The models of Equation 5.26 or Equation 5.26′ give the objective of factor analysis, i.e., to use $F$ as a surrogate of $X$ such that the dimension of the variable space is reduced from p to m.

Remark 5.6: In PCA, the coefficient matrix is determined and unique. But in factor analysis, the loading matrix $A$ is not unique, because for any m × m orthogonal matrix $\Gamma$, Equation 5.26′ can be rewritten as:

$$X_{p \times 1} = (A_{p \times m} \Gamma_{m \times m})(\Gamma^T_{m \times m} F_{m \times 1}) + \varepsilon_{p \times 1} \tag{5.26″}$$

It is easy to verify that Equation 5.26″ still satisfies the assumptions, i.e., $\Sigma_{\Gamma^T F} = \mathrm{Cov}(\Gamma^T F) = I_m$ and $\mathrm{Cov}(\Gamma^T F, \varepsilon) = 0$. So $\Gamma^T_{m \times m} F_{m \times 1}$ is also a vector of common factors, and $A_{p \times m} \Gamma_{m \times m}$ is the corresponding loading matrix.
Remark 5.7: This nonuniqueness provides a nice property. When the structure of $A$ is complex, it can be simplified by an appropriate transformation such that the new factors have meanings that can be better interpreted. Unlike the regressors in a regression model, the common factors $F$ in the factor analysis model are unobservable.
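The nonuniqueness described in Remark 5.6 can be checked numerically: rotating the loadings by any orthogonal matrix leaves the implied covariance of $X$ unchanged. The following sketch assumes NumPy; the loading matrix `A` and the 30-degree angle are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical 4x2 loading matrix, for illustration only.
A = np.array([[0.8, 0.1],
              [0.7, 0.3],
              [0.2, 0.9],
              [0.1, 0.6]])
psi = np.diag(1.0 - (A ** 2).sum(axis=1))      # specific variances so var(X_i) = 1

phi = np.deg2rad(30.0)                          # any angle gives an orthogonal Gamma
gamma = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])

sigma_original = A @ A.T + psi                  # covariance implied by A
sigma_rotated = (A @ gamma) @ (A @ gamma).T + psi   # covariance implied by A Gamma
```

The two covariance matrices agree because $(A\Gamma)(A\Gamma)^T = A \Gamma \Gamma^T A^T = AA^T$.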
5.2.4 STATISTICAL INTERPRETATIONS OF FACTOR LOADINGS AND COMMUNALITY
5.2.4.1 Interpretation of Factor Loadings

Under the assumptions listed in the previous section, we know that $E(F) = 0$, $\mathrm{Cov}(F) = I_m$, $\mathrm{var}(\varepsilon_i) = \sigma_i^2$, $\mu = E(X) = 0$, and $\mathrm{var}(X_i) = \sigma_{ii}^2$, $i = 1, 2, \ldots, p$. The orthogonal factor model given in Equation 5.26 implies a covariance structure for $X$:

$$\mathrm{cov}(X, F) = E\left[(X - \mu)(F - 0)^T\right] = E\left[(AF + \varepsilon)F^T\right] = A\Sigma_F + E(\varepsilon F^T) = A\Sigma_F = A$$

or

$$\mathrm{cov}(X_i, F_j) = a_{ij} \tag{5.27}$$

So the factor loadings are the covariances between the observable variables and the factors. Similarly, if the data $X$ are standardized, i.e., $\mathrm{var}(X_i) = \sigma_{ii}^2 = 1$, the factor loadings are the correlation coefficients between the observable variables and the factors, i.e., $\rho(X_i, F_j) = a_{ij}$. Define

$$S_j = \sum_{i=1}^{p} a_{ij}^2, \quad j = 1, 2, \ldots, m$$

where, for standardized data, the squared factor loading $a_{ij}^2$ is the proportion of the variance in variable $X_i$ accounted for by the factor $F_j$. $S_j$ is the contribution of $F_j$ to $X$, i.e., the total variance contributed by $F_j$ to all the variables in $X$. To obtain the percentage of variance in all the variables explained by each factor, the sum of the squared factor loadings for that factor is calculated and then divided by the number of variables.

5.2.4.2 Interpretation of Communality

As mentioned earlier, we have:

$$\Sigma_X = \mathrm{Cov}(X) = E\left[(X - \mu)(X - \mu)^T\right] = AA^T + \Sigma_\varepsilon$$

or

$$\sigma_{ii}^2 = \mathrm{var}(X_i) = a_{i1}^2 + \cdots + a_{im}^2 + \sigma_i^2, \qquad \mathrm{Cov}(X_i, X_j) = a_{i1}a_{j1} + \cdots + a_{im}a_{jm} \tag{5.28}$$
The portion of the variance of the ith variable, $\mathrm{var}(X_i)$, contributed by the m common factors, i.e., $a_{i1}^2 + \cdots + a_{im}^2$, is called the ith communality. The portion of $\mathrm{var}(X_i)$ due to the specific factor, i.e., $\sigma_i^2$, is called the uniqueness or specific variance. Thus, the variance of the ith variable can be decomposed into two parts as follows:

var(Xi) = communality + specific variance, i = 1, 2, …, p
(5.29)

Denoting the ith communality by $h_i^2 = a_{i1}^2 + \cdots + a_{im}^2$, the ith communality is the sum of squares of the loadings of the ith variable on the m common factors. Hence, from Equation 5.28, the "standardization" yields:

$$\sigma_{ii}^2 = h_i^2 + \sigma_i^2 = 1$$
(5.29)

If $h_i^2 \approx \sigma_{ii}^2$, then almost all the original information contained in the ith variable $X_i$ is extracted by the common factors, and thus $h_i^2$ is the dominant component of $\mathrm{var}(X_i)$. However, if $h_i^2 \approx 0$, then the common factors have a small impact on $X_i$, and the specific factor $\varepsilon_i$ dominates $X_i$. So the communality measures the proportion of variance in a given variable accounted for by all the factors.
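As a small worked example of Equations 5.28 and 5.29, the following sketch (assuming NumPy) uses the one-factor loadings and specific variances stated in Exercise 1 of Section 5.3; the variable names are hypothetical.

```python
import numpy as np

# One-factor loadings and specific variances taken from Exercise 1 (Section 5.3).
A1 = np.array([[0.7], [0.1], [0.9]])
psi1 = np.diag([0.51, 0.99, 0.19])

communality = (A1 ** 2).sum(axis=1)            # h_i^2 = sum_j a_ij^2
factor_contribution = (A1 ** 2).sum(axis=0)    # S_j = sum_i a_ij^2
implied_cov = A1 @ A1.T + psi1                 # Equation 5.28
```

The communalities are 0.49, 0.01, and 0.81, so the common factor explains most of the third variable and almost none of the second; each communality plus its specific variance equals 1, as Equation 5.29 requires for standardized variables.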
5.2.5 ESTIMATION OF LOADING MATRIX
Suppose we are given n observations $x_1, x_2, \ldots, x_n$ on p generally correlated variables. To construct the factor analysis model, we need to estimate the factor loading matrix $A$ based on these observations. There are several popular methods of parameter estimation, such as the principal component method and the maximum likelihood method. Here, we will concentrate only on the first method. The maximum likelihood method is described in Reference 4. Let $\Sigma_X = \mathrm{Cov}(X)$ have eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ and associated normalized eigenvectors $e_i$, $i = 1, 2, \ldots, p$; then:

$$\Sigma_X = \sum_{i=1}^{p} \lambda_i e_i e_i^T = \begin{bmatrix} \sqrt{\lambda_1}\, e_1 & \cdots & \sqrt{\lambda_p}\, e_p \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\, e_1^T \\ \vdots \\ \sqrt{\lambda_p}\, e_p^T \end{bmatrix} \tag{5.30}$$

If m = p, then from Equation 5.28 and Equation 5.30,

$$\Sigma_X = AA^T + 0 = AA^T \tag{5.31}$$

A comparison of Equation 5.30 and Equation 5.31 yields

$$A = \begin{bmatrix} \sqrt{\lambda_1}\, e_1 & \cdots & \sqrt{\lambda_p}\, e_p \end{bmatrix} \tag{5.32}$$
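If m < p and the last p – m eigenvalues are small, we neglect the contribution of $\sum_{i=m+1}^{p} \lambda_i e_i e_i^T$ to $\Sigma_X$; then:

$$\Sigma_X \approx \begin{bmatrix} \sqrt{\lambda_1}\, e_1 & \cdots & \sqrt{\lambda_m}\, e_m \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\, e_1^T \\ \vdots \\ \sqrt{\lambda_m}\, e_m^T \end{bmatrix} \tag{5.33}$$

If we consider the specific factors, then:

$$\Sigma_X \approx \begin{bmatrix} \sqrt{\lambda_1}\, e_1 & \cdots & \sqrt{\lambda_m}\, e_m \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\, e_1^T \\ \vdots \\ \sqrt{\lambda_m}\, e_m^T \end{bmatrix} + \Sigma_\varepsilon = AA^T + \Sigma_\varepsilon \tag{5.34}$$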
To apply this method to a data set $x_1, x_2, \ldots, x_n$ whose variables do not have the same unit, it is customary to first standardize the data by

$$\frac{x_{ji} - \bar x_i}{\sqrt{s_{ii}}}, \quad j = 1, 2, \ldots, n, \quad i = 1, 2, \ldots, p \tag{5.35}$$

If $\Sigma_X$ is unknown, we can use the sample variance-covariance matrix $S$ obtained from the standardized data. So the estimated factor loading matrix is given by

$$\hat A = \begin{bmatrix} \sqrt{\hat\lambda_1}\, \hat e_1 & \cdots & \sqrt{\hat\lambda_m}\, \hat e_m \end{bmatrix} \tag{5.36}$$

where $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_p$ are the eigenvalues of the sample covariance matrix $S$ (or correlation matrix $R$), and $\hat e_i$, $i = 1, 2, \ldots, p$, are the associated normalized orthogonal eigenvectors of $S$ or $R$. The estimated specific variances are provided by the diagonal elements of the matrix $S - \hat A \hat A^T$ and, thus, $\hat\Sigma_\varepsilon = \mathrm{diag}(\hat\sigma_1^2, \hat\sigma_2^2, \ldots, \hat\sigma_p^2)$ with

$$\hat\sigma_i^2 = s_{ii} - \sum_{j=1}^{m} \hat a_{ij}^2.$$

The communalities are estimated by $\hat h_i^2 = \hat a_{i1}^2 + \cdots + \hat a_{im}^2$.
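The principal component method of Equation 5.36 can be sketched as follows, assuming NumPy. The function name `pc_loadings` is hypothetical, and the correlation matrix is the one given in Exercise 1 of Section 5.3. Because eigenvectors are determined only up to sign, the signs of the columns of the estimated loading matrix may differ between software packages.

```python
import numpy as np

def pc_loadings(r, m):
    """Principal component estimate of the loading matrix (Equation 5.36)."""
    r = np.asarray(r, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1][:m]      # indices of the m largest eigenvalues
    a_hat = eigvecs[:, order] * np.sqrt(eigvals[order])   # column j = sqrt(lam_j) e_j
    communalities = (a_hat ** 2).sum(axis=1)   # h_hat_i^2
    specific = np.diag(r) - communalities      # sigma_hat_i^2
    return a_hat, communalities, specific

# Correlation matrix from Exercise 1 (Section 5.3).
rho = np.array([[1.00, 0.07, 0.63],
                [0.07, 1.00, 0.09],
                [0.63, 0.09, 1.00]])
a_hat, h2, psi_hat = pc_loadings(rho, m=2)
```

Each estimated communality plus its specific variance reproduces the corresponding diagonal entry of the input matrix, here 1 for every variable.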
FIGURE 5.2 Orthogonal rotation of common factors. (The axes $F_j$ and $F_k$ are rotated through the angle $\varphi_{kj}$ to the new axes $\tilde F_j$ and $\tilde F_k$.)
5.2.6 FACTOR ROTATION

The objective of factor modeling is not only to find the common factors $F_j$, $j = 1, 2, \ldots, m$, and group the original variables $X_i$, $i = 1, 2, \ldots, p$, but also to make each common factor more interpretable and meaningful. Thus, factor rotation is usually necessary to facilitate the interpretation of factors. The nonuniqueness of the loading matrix makes the rotation feasible. Through factor rotation, the structure of the loading matrix is simplified in the sense that each variable has a large loading on only one common factor, and small or moderate loadings on the other common factors.

There are two categories of rotational methods to choose from: orthogonal rotation and oblique rotation. Oblique rotations allow the factors to be correlated, and so a factor correlation matrix is generated when an oblique rotation is requested. With orthogonal rotation, however, the axes are kept perpendicular as they are rotated and no factor correlation matrix is produced, as the correlation of any factor with another is zero. In this subsection, the commonly used maximum variance orthogonal rotation is introduced. The orthogonal rotation is illustrated in Figure 5.2; oblique rotation methods can be found in Reference 4.

We start with m = 2. Let the factor loading matrix be

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \vdots & \vdots \\ a_{p1} & a_{p2} \end{bmatrix}.$$

To avoid the imbalance caused by the communalities $h_i^2 = a_{i1}^2 + a_{i2}^2$, $i = 1, 2, \ldots, p$, the ith row of this loading matrix is first divided by $h_i$, $i = 1, 2, \ldots, p$. For simplicity, we still denote the normalized matrix by $A$. Let

$$T = \begin{bmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{bmatrix}$$

be an orthogonal matrix and define the new matrix

$$B = AT = \begin{bmatrix} a_{11}\cos\varphi + a_{12}\sin\varphi & -a_{11}\sin\varphi + a_{12}\cos\varphi \\ a_{21}\cos\varphi + a_{22}\sin\varphi & -a_{21}\sin\varphi + a_{22}\cos\varphi \\ \vdots & \vdots \\ a_{p1}\cos\varphi + a_{p2}\sin\varphi & -a_{p1}\sin\varphi + a_{p2}\cos\varphi \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ \vdots & \vdots \\ b_{p1} & b_{p2} \end{bmatrix} \tag{5.37}$$

The rotation angle $\varphi$ is selected to make the structure of the loading matrix as simple as possible. In other words, we hope to divide the variables $X_i$, $i = 1, 2, \ldots, p$, into two categories, one related to $F_1$ and the other related to $F_2$. This is achieved by making the two variances $V_1$ and $V_2$ as large as possible, where $V_1$ and $V_2$ are the variances of the squared loadings in the two columns of $B$, respectively. Thus, we select the rotation angle $\varphi$ such that

$$\varphi = \arg\max_{\varphi}(V) = \arg\max_{\varphi}(V_1 + V_2) \tag{5.38}$$
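where

$$V_j = \frac{1}{p}\sum_{i=1}^{p} \left(b_{ij}^2\right)^2 - \left(\frac{1}{p}\sum_{i=1}^{p} b_{ij}^2\right)^2, \quad j = 1, 2.$$

Using the maximization condition $\dfrac{dV}{d\varphi} = 0$, we have:

$$\tan(4\varphi) = \frac{D - \dfrac{1}{p}(2AB)}{C - \dfrac{1}{p}(A^2 - B^2)} \tag{5.39}$$

where the scalars $A$, $B$, $C$, and $D$ (not to be confused with the loading matrices $A$ and $B$) are

$$A = \sum_{i=1}^{p} \alpha_i, \quad B = \sum_{i=1}^{p} \beta_i, \quad C = \sum_{i=1}^{p} (\alpha_i^2 - \beta_i^2), \quad D = 2\sum_{i=1}^{p} \alpha_i \beta_i,$$

and

$$\alpha_i = \left(\frac{a_{i1}}{h_i}\right)^2 - \left(\frac{a_{i2}}{h_i}\right)^2, \quad \beta_i = 2\,\frac{a_{i1}}{h_i}\,\frac{a_{i2}}{h_i} \quad \text{for } i = 1, 2, \ldots, p$$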
TABLE 5.1 The Range of Rotation Angle ϕ

Sign of Numerator   Sign of Denominator   Range of 4ϕ     Range of ϕ
+                   +                     0 ~ π/2         0 ~ π/8
+                   –                     π/2 ~ π         π/8 ~ π/4
–                   –                     –π ~ –π/2       –π/4 ~ –π/8
–                   +                     –π/2 ~ 0        –π/8 ~ 0
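The range of $\varphi$ is determined according to the signs of the numerator and denominator of $\tan(4\varphi)$, which are summarized in Table 5.1.

If there are m > 2 common factors, we need to conduct the preceding orthogonal rotation on each pair of common factors, i.e., orthogonally rotate the factor space span{$F_k$, $F_j$} through $\varphi_{kj}$ for $k = 1, 2, \ldots, (m - 1)$ and $j = k + 1, \ldots, m$. Each $\varphi_{kj}$ is determined by rotating

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \end{bmatrix}$$

to

$$B = AT_{kj} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{p1} & b_{p2} & \cdots & b_{pm} \end{bmatrix}$$

by maximizing

$$V = \sum_{j=1}^{m} V_j = \sum_{j=1}^{m} \left[\frac{1}{p}\sum_{i=1}^{p} \left(b_{ij}^2\right)^2 - \left(\frac{1}{p}\sum_{i=1}^{p} b_{ij}^2\right)^2\right].$$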
where $T_{kj}$ is the fundamental rotation matrix, and

$$\begin{aligned} b_{ik} &= a_{ik}\cos\varphi + a_{ij}\sin\varphi \\ b_{ij} &= -a_{ik}\sin\varphi + a_{ij}\cos\varphi \\ b_{il} &= a_{il}, \quad l \ne k, j \end{aligned} \tag{5.40}$$

where $i = 1, 2, \ldots, p$ and $j, k, l = 1, 2, \ldots, m$. The rotation angle $\varphi$ is determined as before. For m factors, the total number of rotations is $\binom{m}{2}$; thus, the rotated loading matrix is given by:

$$B_{(1)} = A T_{12} \cdots T_{1m} \cdots T_{(m-1)m} = AC_1 \tag{5.41}$$

where $C_1 = T_{12} \cdots T_{1m} \cdots T_{(m-1)m}$. Let $V_{(1)}$ be the value of $V$ for $B_{(1)}$. Continuing the earlier procedure, we get the sequences:

$$B_{(1)}, B_{(2)}, B_{(3)}, \ldots \tag{5.42}$$

and

$$V_{(1)} \le V_{(2)} \le V_{(3)} \le \cdots \tag{5.43}$$

The sequence has an upper bound because the absolute value of a factor loading cannot exceed 1. Thus $\lim_{k \to \infty} V_{(k)}$ exists; let

$$V^* = \lim_{k \to \infty} V_{(k)}.$$

For a given precision $\delta$, when k is sufficiently large,

$$|V_{(k)} - V^*| < \delta \tag{5.44}$$

In practice, if $|V_{(k)} - V_{(k+1)}| < \varepsilon$ ($\varepsilon > 0$ is a small number), then we stop the rotation and take

$$B_{(k)} = A C_1 C_2 C_3 \cdots C_k = AC \tag{5.45}$$

as the factor loading matrix.

Remark 5.8: The sum of eigenvalues is not affected by rotation, but the rotation will alter the eigenvalues of particular factors and will change the factor loadings.
Because multiple rotations may explain the same variance (have the same total eigenvalue) but have different factor loadings, and because factor loadings are used to intuit the meaning of factors, this implies that different meanings may be ascribed to the factors depending on the rotation — a problem some cite as a drawback of factor analysis.
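For the two-factor case, the maximization in Equation 5.38 can be illustrated with a simple grid search over the rotation angle. This is a sketch assuming NumPy, and it deliberately swaps the closed-form solution of Equation 5.39 for a brute-force search; the function names and the unrotated `loadings` are hypothetical. The search is restricted to [–π/4, π/4] because a rotation by π/2 only swaps the two columns (up to sign) and leaves V unchanged.

```python
import numpy as np

def varimax_criterion(b):
    """V: sum over columns of the variance of the squared loadings."""
    b2 = b ** 2
    return float((np.mean(b2 ** 2, axis=0) - np.mean(b2, axis=0) ** 2).sum())

def rotation_matrix(phi):
    """The 2x2 orthogonal matrix T used in Equation 5.37."""
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

def rotate_two_factors(a, n_grid=2001):
    """Grid-search phi maximizing V for a p-by-2 loading matrix; return (phi, A T)."""
    a = np.asarray(a, dtype=float)
    h = np.sqrt((a ** 2).sum(axis=1, keepdims=True))   # square roots of communalities
    a_norm = a / h                                     # row normalization
    grid = np.linspace(-np.pi / 4, np.pi / 4, n_grid)
    values = [varimax_criterion(a_norm @ rotation_matrix(phi)) for phi in grid]
    best_phi = float(grid[int(np.argmax(values))])
    return best_phi, a @ rotation_matrix(best_phi)

# Hypothetical unrotated loadings, for illustration only.
loadings = np.array([[0.6, 0.6], [0.7, 0.5], [0.6, -0.6], [0.5, -0.7]])
phi_star, rotated = rotate_two_factors(loadings)
```

The rotation leaves every communality unchanged while concentrating each variable's loading on one factor, which is the simplification the rotation is meant to achieve.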
5.2.7 FACTOR SCORE

The factor model is given in Equation 5.26,

$$X_i = a_{i1}F_1 + a_{i2}F_2 + \cdots + a_{im}F_m + \varepsilon_i, \quad i = 1, 2, \ldots, p$$

The common factors $F_j$, $j = 1, 2, \ldots, m$, reflect the relationship between the original variables $X_i$, $i = 1, 2, \ldots, p$. Usually, it is convenient to study the characteristics of the problem using the common factors in place of the original variables. Suppose the common factors can be linearly represented by the original variables, i.e.,

$$F_j = \beta_{j1}X_1 + \beta_{j2}X_2 + \cdots + \beta_{jp}X_p, \quad j = 1, 2, \ldots, m \tag{5.46}$$

Equation 5.46 is called the factor score function and can be used to compute the common factor score for each sample. For example, when m = 2, we can plug in the p variables of each sample and compute $F_1$ and $F_2$. The scatter plot of $F_2$ vs. $F_1$ can be used to classify the samples or conduct further analysis. When m < p, the factor scores cannot be precisely computed. There are many methods, such as weighted least-squares regression, that can be used to estimate the factor scores. Here, Thomson's [7] regression method is introduced.

Suppose the original variables and common factors have been standardized. If the common factors $F_j$, $j = 1, 2, \ldots, m$, can be regressed on $X_i$, $i = 1, 2, \ldots, p$, then the regression equation can be expressed as:

$$\hat F_j = b_{j1}X_1 + b_{j2}X_2 + \cdots + b_{jp}X_p, \quad j = 1, 2, \ldots, m \tag{5.47}$$

where the common factors are unknown and need to be estimated. From Equation 5.27 and Equation 5.47, for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, m$, we can obtain:

$$\begin{aligned} a_{ij} = \mathrm{cov}(X_i, F_j) = E(X_i F_j) &= E\left[X_i (b_{j1}X_1 + b_{j2}X_2 + \cdots + b_{jp}X_p)\right] \\ &= b_{j1}E(X_iX_1) + b_{j2}E(X_iX_2) + \cdots + b_{jp}E(X_iX_p) \\ &= b_{j1}\mathrm{cov}(X_i, X_1) + b_{j2}\mathrm{cov}(X_i, X_2) + \cdots + b_{jp}\mathrm{cov}(X_i, X_p) \\ &= b_{j1}r_{i1} + b_{j2}r_{i2} + \cdots + b_{jp}r_{ip} \end{aligned}$$
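i.e.,

$$
\begin{aligned}
b_{j1}r_{11} + b_{j2}r_{12} + \cdots + b_{jp}r_{1p} &= a_{1j} \\
b_{j1}r_{21} + b_{j2}r_{22} + \cdots + b_{jp}r_{2p} &= a_{2j} \\
&\;\;\vdots \\
b_{j1}r_{p1} + b_{j2}r_{p2} + \cdots + b_{jp}r_{pp} &= a_{pj}
\end{aligned}
\tag{5.48}
$$

In matrix notation, Equation 5.48 can be expressed as:

$$ \mathbf{R}\mathbf{b}_j = \mathbf{a}_j \tag{5.49} $$

where $\mathbf{b}_j = (b_{j1}, b_{j2}, \ldots, b_{jp})^T$, $\mathbf{a}_j = (a_{1j}, a_{2j}, \ldots, a_{pj})^T$, and $\mathbf{R}$ is the correlation matrix. By solving Equation 5.49, we get:

$$ \mathbf{b}_j = \mathbf{R}^{-1}\mathbf{a}_j \tag{5.50} $$

Let

$$
\mathbf{B} = \begin{bmatrix} \mathbf{b}_1^T \\ \mathbf{b}_2^T \\ \vdots \\ \mathbf{b}_m^T \end{bmatrix}
= \begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & & \vdots \\
b_{m1} & b_{m2} & \cdots & b_{mp}
\end{bmatrix}
$$

Then:

$$
\mathbf{B} = \begin{bmatrix} (\mathbf{R}^{-1}\mathbf{a}_1)^T \\ (\mathbf{R}^{-1}\mathbf{a}_2)^T \\ \vdots \\ (\mathbf{R}^{-1}\mathbf{a}_m)^T \end{bmatrix} = \mathbf{A}^T\mathbf{R}^{-1}
\tag{5.51}
$$

Thus, from Equation 5.47 and Equation 5.51, the common factors can be estimated by:

$$
\hat{\mathbf{F}} = \begin{bmatrix} \hat{F}_1 \\ \hat{F}_2 \\ \vdots \\ \hat{F}_m \end{bmatrix}
= \begin{bmatrix} \mathbf{b}_1^T\mathbf{X} \\ \mathbf{b}_2^T\mathbf{X} \\ \vdots \\ \mathbf{b}_m^T\mathbf{X} \end{bmatrix}
= \mathbf{B}\mathbf{X} = \mathbf{A}^T\mathbf{R}^{-1}\mathbf{X}
\tag{5.52}
$$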
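The regression estimate in Equation 5.52 can be sketched numerically as follows. This is a minimal illustration, not the book's implementation: the function name `regression_factor_scores`, the numerical values of `R`, `A`, and `Z`, and the choice of NumPy are all assumptions made here. `A` is taken to be the p × m loading matrix whose columns are the vectors a_j of Equation 5.49.

```python
import numpy as np

def regression_factor_scores(Z, A, R):
    """Regression factor scores, F_hat = A^T R^{-1} x, for each row x of Z.

    Z : (n, p) standardized samples
    A : (p, m) factor loading matrix (columns a_j)
    R : (p, p) correlation matrix
    """
    # Solve R B^T = A (Equation 5.49 for all j at once) rather than inverting R.
    B = np.linalg.solve(R, A).T      # (m, p); equals A^T R^{-1} because R is symmetric
    return Z @ B.T, B                # scores have shape (n, m)

# Hypothetical inputs, for illustration only.
R = np.array([[1.00, 0.07, 0.63],
              [0.07, 1.00, 0.09],
              [0.63, 0.09, 1.00]])
A = np.array([[0.7], [0.1], [0.9]])  # one common factor (m = 1)
Z = np.array([[ 1.2, -0.3,  0.8],
              [-0.5,  0.4, -1.1]])   # two standardized samples

scores, B = regression_factor_scores(Z, A, R)
```

Solving the linear system instead of forming the inverse explicitly is a numerical convenience; it yields the same B as Equation 5.51.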
5.2.8 GENERAL PROCEDURES FOR FACTOR ANALYSIS
The general procedures for factor analysis are summarized as follows:
Step 1: Standardize original data if the variables do not have the same unit.
Step 2: Derive the covariance matrix S or correlation matrix R.
Step 3: Derive the eigenvalue and eigenvector pairs of S or R to get the factor loadings according to the cumulative contribution of the eigenvalues.
Step 4: Rotate the factor loading.
Step 5: Name and interpret the factors.
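Steps 1 to 3 above can be sketched as follows, assuming the principal component solution on the correlation matrix R. The function name `factor_loadings_pc`, the 85% cumulative-contribution threshold, and the synthetic data are assumptions for illustration; rotation and interpretation (Steps 4 and 5) are not shown.

```python
import numpy as np

def factor_loadings_pc(X, threshold=0.85):
    """Principal component solution for the factor loadings of data X (n x p)."""
    # Step 1: standardize each variable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation matrix.
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigenvalue/eigenvector pairs, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the smallest m whose cumulative contribution reaches the threshold.
    cum = np.cumsum(eigvals) / eigvals.sum()
    m = min(int(np.searchsorted(cum, threshold)) + 1, len(eigvals))
    L = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # loadings, p x m
    psi = 1.0 - np.sum(L**2, axis=1)            # specific variances (diagonal of R is 1)
    return L, psi, m, cum

# Synthetic data: 4 variables driven by 2 hypothetical common factors plus noise.
rng = np.random.default_rng(0)
F = rng.normal(size=(200, 2))
load = np.array([[0.9, 0.0], [0.8, 0.1], [0.1, 0.9], [0.0, 0.8]])
X = F @ load.T + 0.3 * rng.normal(size=(200, 4))

L, psi, m, cum = factor_loadings_pc(X)
```

The threshold is a design choice; other criteria for choosing m (for example, retaining eigenvalues greater than 1) could be substituted in the same place.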
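5.3 EXERCISES
1. It is given that the covariance matrix of the three (p = 3) standardized random variables $Z_1$, $Z_2$, and $Z_3$ is

$$ \boldsymbol{\rho} = \begin{bmatrix} 1.0 & .07 & .63 \\ .07 & 1.0 & .09 \\ .63 & .09 & 1.0 \end{bmatrix} $$

a. In order to generate $Z_1$, $Z_2$, and $Z_3$ by an m = 1 factor model, list the necessary assumptions regarding $\boldsymbol{\varepsilon}$ ($\boldsymbol{\varepsilon} = [\varepsilon_1 \; \varepsilon_2 \; \varepsilon_3]^T$) and $F_1$.
b. Show that, based on the assumptions listed in (a), $Z_1$, $Z_2$, and $Z_3$ can be generated by

$$
\begin{aligned}
Z_1 &= .7F_1 + \varepsilon_1 \\
Z_2 &= .1F_1 + \varepsilon_2 \\
Z_3 &= .9F_1 + \varepsilon_3
\end{aligned}
\quad \text{and} \quad
\boldsymbol{\Psi} = \operatorname{cov}(\boldsymbol{\varepsilon}) = \begin{bmatrix} .51 & 0 & 0 \\ 0 & .99 & 0 \\ 0 & 0 & .19 \end{bmatrix}.
$$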
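That is, write $\boldsymbol{\rho}$ in the form $\boldsymbol{\rho} = \mathbf{L}\mathbf{L}^T + \boldsymbol{\Psi}$.
2. The eigenvalues and eigenvectors of the correlation matrix $\boldsymbol{\rho}$ in Problem 1 are

$$
\begin{aligned}
\lambda_1 &= 1.64, & \mathbf{e}_1 &= [.6953 \;\; .1716 \;\; .6980]^T \\
\lambda_2 &= .98, & \mathbf{e}_2 &= [.1375 \;\; -.9849 \;\; .1052]^T \\
\lambda_3 &= .37, & \mathbf{e}_3 &= [.7055 \;\; .0228 \;\; -.7084]^T
\end{aligned}
$$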
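a. Assuming an m = 2 factor model, calculate the loading matrix $\mathbf{L}$ and the specific variance $\boldsymbol{\Psi}$ using the principal component solution method.
b. What proportion of the total variance is explained by the m = 2 common factors?
3. Using the basic factor analysis model and assumptions, prove the following statements:
a. If $\mathbf{T}$ is an m × m orthogonal matrix, then $\operatorname{cov}(\mathbf{T}^T\mathbf{F}) = \mathbf{I}$ and $\operatorname{cov}(\mathbf{T}^T\mathbf{F}, \boldsymbol{\varepsilon}) = \mathbf{0}$.
b. The factor model is

$$ \underset{(p\times 1)}{\mathbf{X}} - \underset{(p\times 1)}{\boldsymbol{\mu}} = \underset{(p\times m)}{\mathbf{L}}\;\underset{(m\times 1)}{\mathbf{F}} + \underset{(p\times 1)}{\boldsymbol{\varepsilon}} \quad (m < p). $$

Let $\mathbf{L}^* = \mathbf{L}\mathbf{T}$ and $\mathbf{F}^* = \mathbf{T}^T\mathbf{F}$, where $\mathbf{T}$ is an m × m orthogonal matrix; then $\operatorname{cov}(\mathbf{X}) = \mathbf{L}^*\mathbf{L}^{*T} + \boldsymbol{\Psi}$.
c. If the factor score is given by $\hat{\mathbf{F}} = \mathbf{L}^T\boldsymbol{\Sigma}^{-1}(\mathbf{X} - \boldsymbol{\mu})$, where $\mathbf{L}$ is estimated from the covariance matrix $\boldsymbol{\Sigma}$ by the principal component decomposition method (i.e., $\mathbf{L} = [\sqrt{\lambda_1}\,\mathbf{e}_1 \;\; \sqrt{\lambda_2}\,\mathbf{e}_2 \;\; \cdots \;\; \sqrt{\lambda_m}\,\mathbf{e}_m]$), then $E(\hat{\mathbf{F}}) = \mathbf{0}$ and $\operatorname{Cov}(\hat{\mathbf{F}}) = \mathbf{I}$.
Hint:
i. $\operatorname{cov}(\mathbf{A}) = E[(\mathbf{A} - E(\mathbf{A}))(\mathbf{A} - E(\mathbf{A}))^T]$, where $E(\mathbf{A})$ is the expectation of matrix $\mathbf{A}$.
ii. $\mathbf{L}$ and $\boldsymbol{\Sigma}$ are deterministic, i.e., $E(\mathbf{L}) = \mathbf{L}$ and $E(\boldsymbol{\Sigma}) = \boldsymbol{\Sigma}$.
iii. $\{\lambda_j, \mathbf{e}_j\}$, $j = 1, 2, \ldots, m$, are the eigenvalue and eigenvector pairs of $\boldsymbol{\Sigma}$, which have the property that $\lambda_j\mathbf{e}_j = \boldsymbol{\Sigma}\mathbf{e}_j$ for $j = 1, 2, \ldots, m$.
4. Consider an assembly process. A side panel of a car body is located on the fixture through a 4-way pin P1 and a 2-way pin P2. To monitor the side panel's quality, ten points are measured and their nominal positions are shown in Figure 5.3. In this case, only three potential sources of variation are considered:
a. The x direction variation of P1
b. The z direction variation of P1
c. The z direction variation of P2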
FIGURE 5.3 Product and process description.
The process is simulated by deliberately introducing process variation sources. The measurement data of points M1, M2, M5, and M6 are collected and given in the attached data file (Problem 4 dataset.xls). Use the factor analysis method to identify the introduced process variation sources.
Hint: You may use the following procedure to solve this diagnosis problem:
i. Derive the data covariance matrix S or correlation matrix R. Indicate the reason why you have chosen one (either S or R) rather than the other.
ii. Derive the eigenvalue/eigenvector pairs from the matrix you selected in (i).
iii. Determine the number of common factors. Indicate the criteria for your decision.
iv. Derive the factor loadings for the common factors and try to interpret and link them with the potential process variation sources considered.
v. Are the factor loadings in (iv) interpretable? If not, try to improve the interpretability of the factor loadings by deriving new loading vectors. Indicate the method and criteria for achieving your new factor loadings. Interpret the new factor loadings, and identify the variation sources introduced.
REFERENCES
1. Hotelling, H., Analysis of a complex of statistical variables into principal components, Journal of Educational Psychology, 24, 417–441, 498–520, 1933.
2. Rao, C.R., The use and interpretation of principal component analysis in applied research, Sankhya A, 26, 329–358, 1964.
3. Lawley, D.N., Tests of significance for the latent roots of covariance and correlation matrices, Biometrika, 43, 128, 1956.
4. Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 4th ed., Prentice Hall, Upper Saddle River, NJ, 1998.
5. Morrison, D.F., Multivariate Statistical Methods, McGraw-Hill, New York, 1990.
6. Stevens, J., Applied Multivariate Statistics for the Social Sciences, 4th ed., Lawrence Erlbaum Associates, Mahwah, NJ, 2001.
7. Thomson, G.H., The Factor Analysis of Human Ability, Houghton Mifflin Company, New York, 1939.
8. Darlington, R.B., Weinberg, S., and Walberg, H., Canonical variate analysis and related techniques, Review of Educational Research, 453–454, 1973.
9. Rubenstein, D.I., Ecological Aspects of Social Evolution: Birds and Mammals, Princeton University Press, Princeton, NJ, 1986.
ADDITIONAL READING Darlington, R.B., Weinberg, S., and Walberg, H., Canonical variate analysis and related techniques, Review of Educational Research, 453–454, 1973. Rubenstein, D.I., Ecological Aspects of Social Evolution: Birds and Mammals, Princeton University Press, Princeton, NJ, 1986.
Part II Variation Propagation Modeling in MMP
6 State Space Modeling for Assembly Processes
This chapter develops a state space model characterizing variation propagation in a multistage rigid-part assembly process. First, the background of a multistage assembly process is briefly described using automotive body assembly processes. Major variation sources are then identified and modeling assumptions are summarized, on which a station-indexed state space model is subsequently developed. The state space model for depicting stream of dimensional variation is actually a result of recursively applying standard kinematic analysis to individual operation stages. However, the state space model is not a simple summation of multiple single-stage models, because the across-stage transitions will affect how variation propagates and complicate the dynamics of the entire system. The model is validated through a comparison with commercially available variation simulation software.
6.1 INTRODUCTION OF MULTISTAGE ASSEMBLY PROCESSES
A multistage assembly process involves multiple stations or operations to assemble individual components into a sophisticated product. Typical multistage assembly processes include automotive assembly, printed circuit board (PCB) assembly, and aircraft fuselage assembly, to name a few. We will use the car body assembly process to illustrate the features of such a process. The major part of a car body is the body-in-white (BIW), namely, the structural skeleton of a car. Figure 6.1 shows an example of the BIW. The BIW provides a physical frame on which closure panels, including doors, hood, fenders, and lift gate or trunk, will be mounted. The finished BIW will be measured in a coordinate system as shown in Figure 6.1. This coordinate system is known as the body coordinate system in the automotive industry. A BIW is usually made of 100 to 150 sheet metal panels, which are assembled by a process involving 55 to 75 assembly stations [1]. Figure 6.2 shows the corresponding process layout for producing the BIW in Figure 6.1. The BIW assembly process starts with the component subassembly process, in which a sequence of assembly stations builds up a set of large subassemblies, including the left-hand side aperture (LH-APT), the right-hand side aperture (RH-APT), and the underbody. These subassemblies will then be fed into a framing station, where industrial robot welders will join them together to form the BIW. Figure 6.2 also marks the optical coordinate measuring machine (OCMM) at a few different locations. OCMM stations are where the key dimensional features of a finished or semifinished car are measured in the production line. In a modern assembly system, OCMM stations are commonly distributed along the assembly line, instead of being implemented only at the end of a production line. The taxonomy of this treelike representation is the exact graphical equivalent of the physical process. More details can be found in [1].

FIGURE 6.1 Example of an automotive body (body-in-white) and inline sensing using OCMM.

FIGURE 6.2 BIW assembly layout (arrows represent a part input/output at each assembly station). Labeled elements: Stations #1 to #6, left-hand aperture assembly, right-hand aperture assembly, underbody assembly, framing, body-in-white, and OCMM (optical coordinate measuring machine) stations.

On each assembly station, part assembly is initiated in two or more sets of fixture locators that aim to constrain the degrees of freedom of a part during a joining operation to ensure a repeatable build. Figure 6.3 illustrates a typical 3-2-1 fixture set used in car body assembly processes. It consists of two locating pins, LP4-way and LP2-way, and three net contact blocks, NCi, i = 1, 2, 3. The two locating pins constrain three degrees of freedom in the x-z plane, where the 4-way pin controls part motion in both the x and z directions and the 2-way pin controls part motion in the z direction. Three NC blocks constrain the other degrees of freedom of the workpiece. When a workpiece is nonrigid, more than three NC blocks may be needed to reduce part deformation. An n-2-1 fixture layout, denoted by {LP4-way, LP2-way, NCi, i = 1, 2, …, n}, is a more generic setting in multistage assembly processes. For notation simplicity, unless otherwise indicated, we will also use LP1 to denote a 4-way locator and LP2 for a 2-way locator on a part. Hence, for a 2-D part, we can use the simplified notation {LP1, LP2} to denote the fixture locator pair used to position the part.
FIGURE 6.3 Illustration of a 3-2-1 fixture. Labeled elements: workpiece, locating pins LP4-way and LP2-way, NC blocks NC1, NC2, and NC3, and the x, y, and z axes.
6.2 VARIATION FACTORS AND ASSUMPTIONS Dimensional variation is a major problem affecting product quality in multistage assembly processes. Previous studies have shown that dimensional problems contribute to roughly two thirds of all quality-related problems during new product launches [2,3]. In a multistage system, the major variation factors can be generally classified into those on individual stations and those induced by across-station transitions.
6.2.1 STATION-LEVEL VARIATION FACTORS
One of the major factors for dimensional quality of the finished product at the station level is the repeatability of fixture locators used to hold the parts, generally referred to as fixture variation. Fixture locators are used extensively in processes such as car body assembly to provide physical support and dimensional reference within the body coordinate system, thereby determining the dimensional accuracy of the final assembly [3,4]. Fixture locators may fail to provide the desired positioning repeatability (relative to tolerances) during production owing to gradual deterioration of locators and catastrophic events, such as broken or deformed locators. Figure 6.4 shows an example of fixture fault manifestation, where δLP2(z) is a z-directional fixturing deviation caused by the malfunctioning pinhole at LP2. This deviation will affect the positions of the measurement points SPi, i = 1, 2, 3.

FIGURE 6.4 Fixture-locator-induced dimensional variation. Labeled elements: workpiece (a panel), sensor positions SP1 to SP3, the 4-way locator LP1 (positioning variability in two directions, x and z), and the 2-way locator LP2 (positioning variability in one direction, z).

Another variation factor is the manufacturing imperfection of individual parts, such as deviations in part dimensions or geometrical features from their design, generally referred to as part variation. Part variation is actually a result of the processes fabricating the parts: as the manufacturing process is inherently imperfect, part variation is unavoidable. Nevertheless, whether part variation affects the downstream process or not depends on what type of part-to-part joint is involved. In car body assembly processes, Ceglarek and Shi [5] reported that there are two basic types of joints: one is the lap joint (Figure 6.5a) and the other is the butt joint (Figure 6.5b). The lap joint can absorb variation in the direction of the slip plane, and the butt joint can be used as a datum to locate another part. Therefore, variation will propagate in the defined direction through the mating feature only when using butt joints. Figure 6.5c shows a generic joint geometry that can partially allow motion in the defined direction.

FIGURE 6.5 Cross-sectional views of joint geometries: (a) lap joint, (b) butt joint, (c) mixed joint. Labels: Part 1, Part 2, mating feature, defined direction.

The third variation factor is related to the flexibility of a part. If a part is flexible, such as the metal panels of a car body, it can deform during the assembly operation owing to incorrect locator positions, uneven forces, or part-to-part interferences. This deformation will cause additional dimensional variability in the finished product. In this chapter, we will assume that rigid body parts are used in assembly operations.
6.2.2 ACROSS-STATION VARIATION FACTORS
Shiu et al. [6] discovered a phenomenon called reorientation in a multioperation process, when a change in fixture layouts occurs as the subassembly proceeds to a new station. Figure 6.6 illustrates the reorientation phenomenon, in which the dashed line represents the nominal position of parts and the solid line indicates the actual position. Assume that positioning errors are associated with fixture locators on station k, but not with those on station k + 1. Then, when part 1 and part 2 are assembled on station k, there is some relative position deviation between these two components; i.e., if part 1 is positioned according to its nominal position, part 2 will deviate from its nominal position, and vice versa. When the subassembly of part 1 and part 2 is transferred to station k + 1, it will be repositioned on station k + 1 in order to be assembled with a new part 3. Suppose that the pin-holes on part 2 are used to hold subassembly “1+2.” Even if fixture locators that hold part 2 on station k + 1 are free of error, as shown in Figure 6.6, part 1 may deviate from its nominal position in subassembly “1+2+3.” This deviation of part 1 is not a result of locator deviation at station k + 1. Rather, it is a combined result of deviation of locators on the previous station and the subassembly repositioning at the current station. The significance of this reorientation-induced deviation is that it actually couples the variations originating from individual stations, as we will see in Subsection 6.3.2. This reorientation-induced error would be almost unavoidable for a multistation assembly process when a different set of fixture locators or a different datum scheme is used to reposition a subassembly on a downstream station. When the parts are considered flexible, the reorientation effect is influenced by flexibility-related effects (such as spring back), and they will further complicate the way variation propagates across stations. The modeling of variation and its propagation taking part flexibility into account can be found in Camelio et al. [7].

FIGURE 6.6 Across-station variation factor: reorientation. The figure shows parts 1 and 2 with a fixture deviation on station k, and parts 1, 2, and 3 on station k + 1, where the deviation of part 1 comes from reorientation of subassembly 1+2 from station k to station k + 1.
6.2.3 SUMMARY AND ASSUMPTIONS FOR MODELING
The key variation factors can be summarized as:
• Dimensional errors associated with fixture locators at each assembly station
• Dimensional errors due to part fabrication
• Across-station reorientation-induced error
• Flexibility effects of individual components
In this chapter, we will focus on a simplified rigid-part assembly process. That is, we will limit ourselves to considering only 2-D panel-like parts positioned by a 3-2-1 fixture layout. Moreover, we will concentrate on the modeling and analysis of fixture-locator-related errors and assume that only the lap joint is present between parts. Owing to the variation-absorbing nature of a lap joint, part fabrication errors will therefore not affect the stream of variation in the whole production line. The assumptions are summarized as follows:
• 2-D rigid body part
• 3-2-1 fixture layout for rigid part
• Lap joint between parts
The resulting model will also be applicable to an n-2-1 fixture layout if the fixture variations being considered cause panel motion largely in the plane of rigidity. Shiu et al. [6] indicated that this simplified process is applicable to 60% of situations in a typical automotive body assembly process. The part fabrication errors will be explicitly considered and modeled using a differential motion vector in Chapter 7 for a machining process.
6.3 STATE SPACE MODELING
6.3.1 REPRESENTATION OF PART POSITION AND ITS DEVIATION STATE
In dimensional quality control, each part is characterized by its deviation from the nominal position. Let x, y, and z be the translation coordinate variables and α, β, and φ be the corresponding rotation coordinate variables (refer to Figure 6.7 for the six coordinate variables, or degrees of freedom). As we will only explicitly model a 2-D assembly process in this chapter, the position of each single part or multipart subassembly is represented by three coordinate variables, x, z, and β. In Figure 6.7, the subscripts (i, k) of the coordinate variables indicate that they are for part i on station k, where $i = 1, \ldots, n_p$, $k = 1, \ldots, N$, $n_p$ is the total number of parts, and N is the total number of stations. To define the position and orientation of a part, we need to specify reference points. In this chapter, we will assign two reference points to each part, and make them coincide with the locating points denoted by $LP_{1,i,k}$ and $LP_{2,i,k}$. On station k, the position of a part will be represented by the instantaneous position of $LP_{1,i,k}$, $(LP_{1,i,k}(x), LP_{1,i,k}(z))$, in the body coordinate system, and its orientation by the angle between the line connecting $LP_{1,i,k}$ and $LP_{2,i,k}$ and the x-axis of the body coordinate system. For a subassembly that consists of more than one part, its position and orientation are similarly defined. Each 2-D rigid body subassembly will be positioned by a 4-way locator and a 2-way locator, which are chosen from the pool of 4-way and 2-way locators on the subassembly. The 4-way/2-way locator pair used to position the subassembly is treated as the reference points for the subassembly, and they are denoted by $LS_{1,s,k}$ and $LS_{2,s,k}$, respectively, where the subscripts have a similar meaning but s is the index of the subassembly.

FIGURE 6.7 Representation of part deviation. Labeled elements: part i on station k, locating points $LP_{1,i,k}$ and $LP_{2,i,k}$, the position $(LP_{1,i,k}(x), LP_{1,i,k}(z))$, the orientation angle $\beta_{i,k}$, and the x, y, and z axes.

The part position indicated in a design blueprint is known as the nominal position, which is the ideal situation when neither fixture variation nor part variation exists. As locating tools and part fabrications are not perfect in reality, the part will deviate from its nominal position. Because the nominal position of a part is known from the design blueprint, its actual position is usually represented by its deviation from the nominal position. Denote the random deviations associated with each of the six degrees of freedom (three translations and three rotations) of part i at station k by $\mathbf{x}_{i,k} \equiv [\delta x_{i,k} \;\; \delta y_{i,k} \;\; \delta z_{i,k} \;\; \delta\alpha_{i,k} \;\; \delta\beta_{i,k} \;\; \delta\varphi_{i,k}]^T$, where δ in front of a coordinate variable represents a small deviation. For a 2-D assembly, the deviation state $\mathbf{x}_{i,k}$ simplifies to $[\delta x_{i,k} \;\; \delta z_{i,k} \;\; \delta\beta_{i,k}]^T$. For all the $n_p$ parts in the final assembly, the deviation state of the whole assembly on station k can be expressed as $\mathbf{x}_k \equiv [\mathbf{x}_{1,k}^T \;\; \cdots \;\; \mathbf{x}_{n_p,k}^T]^T$. In a multistation process, part i may not yet have appeared on station k. If not, the corresponding $\mathbf{x}_{i,k} = \mathbf{0}$. We denote the index of the first station where part i appears by $\xi_i$. For the deviation state of a multipart subassembly s at station k, we use the notation $\mathbf{q}_{s,k}$ to distinguish it from an individual part, but the notation's physical meaning is the same. We will use $\mathbf{u}_k$ to denote the random deviation associated with locator pairs on station k. For example, for the locator pair that supports subassembly s on station k, its deviation will be $\mathbf{u}_{s,k} = [\delta LS_{1,s,k}(x) \;\; \delta LS_{1,s,k}(z) \;\; \delta LS_{2,s,k}(x) \;\; \delta LS_{2,s,k}(z)]^T$. For all the $n_k$ locator pairs used on station k, the fixture deviation vector is $\mathbf{u}_k \equiv [\mathbf{u}_{1,k}^T \;\; \cdots \;\; \mathbf{u}_{n_k,k}^T]^T$.
6.3.2 SOME PRELIMINARY RESULTS
Two lemmas that will be used intensively in later derivations are presented here before we derive the state space model. They are based on rigid part kinematics. The detailed proofs of both lemmas can be found in Appendix 6.1 and Appendix 6.2, and also in Reference 8 or Reference 9.

Lemma 6.1: When a rigid body undergoes a translation and a rotation away from its nominal position, if the rotation angle (denoted as δβ) is small, then the deviations of any two points on the same part (denoted as $[\delta x_a, \delta z_a]^T$ and $[\delta x_b, \delta z_b]^T$) have the relationship:

$$
\begin{bmatrix} \delta x_b \\ \delta z_b \\ \delta\beta \end{bmatrix}
= \mathbf{R}_1^{a,b} \begin{bmatrix} \delta x_a \\ \delta z_a \\ \delta\beta \end{bmatrix}
\tag{6.1}
$$

where

$$
\mathbf{R}_1^{a,b} = \begin{bmatrix} 1 & 0 & -(z_b - z_a) \\ 0 & 1 & x_b - x_a \\ 0 & 0 & 1 \end{bmatrix}
\tag{6.2}
$$

and $(x_a, z_a)$, $(x_b, z_b)$ are the nominal coordinates of those two points.
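The small-angle relationship in Lemma 6.1 can be checked numerically against an exact rigid-body motion. The sketch below is illustrative only: the function names `r1` and `exact_deviation`, the coordinates, and the deviation values are assumptions, not data from the text.

```python
import numpy as np

def r1(point_a, point_b):
    """R_1^{a,b} of Equation 6.2 for nominal points a = (xa, za) and b = (xb, zb)."""
    xa, za = point_a
    xb, zb = point_b
    return np.array([[1.0, 0.0, -(zb - za)],
                     [0.0, 1.0,  (xb - xa)],
                     [0.0, 0.0,  1.0]])

def exact_deviation(point_a, point_b, dev_a):
    """Exact deviation at b when the body rotates by dbeta about a and a moves by (dxa, dza)."""
    dxa, dza, dbeta = dev_a
    rel = np.subtract(point_b, point_a)               # nominal vector from a to b
    c, s = np.cos(dbeta), np.sin(dbeta)
    rot = np.array([[c, -s], [s, c]])
    new_b = np.add(point_a, [dxa, dza]) + rot @ rel   # actual position of b
    return np.array([new_b[0] - point_b[0], new_b[1] - point_b[1], dbeta])

# Hypothetical nominal coordinates (mm) and a small deviation at point a.
a, b = (100.0, 200.0), (600.0, 500.0)
dev_a = np.array([0.5, -0.3, 1e-3])                   # [dx_a, dz_a, dbeta]

linear = r1(a, b) @ dev_a                             # Lemma 6.1 approximation
exact = exact_deviation(a, b, dev_a)
```

For these values the linearized and exact deviations differ only by second-order terms in δβ, which is the approximation the lemma relies on.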
Lemma 6.2: The part deviation state $\mathbf{x}_{i,k}$ can be related to the deviations of its reference points using a linear approximation, provided that the deviations at these points are much smaller than the distance between them; that is:

$$
\mathbf{x}_{i,k} = \mathbf{R}_2^{i,k}
\begin{bmatrix} \delta LP_{1,i,k}(x) \\ \delta LP_{1,i,k}(z) \\ \delta LP_{2,i,k}(x) \\ \delta LP_{2,i,k}(z) \end{bmatrix}
\tag{6.3}
$$

where

$$
\mathbf{R}_2^{i,k} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
\dfrac{\sin\beta_{i,k}}{LD_{i,k}} & -\dfrac{\cos\beta_{i,k}}{LD_{i,k}} & -\dfrac{\sin\beta_{i,k}}{LD_{i,k}} & \dfrac{\cos\beta_{i,k}}{LD_{i,k}}
\end{bmatrix}_{3\times 4}
\tag{6.4}
$$

and $LD_{i,k}$ is the Euclidean distance between $LP_{1,i,k}$ and $LP_{2,i,k}$. Lemma 6.2 can be easily extended to a multipart subassembly, as stated in the following two corollaries.

Corollary 6.1: If subassembly s at station k is located by points $LS_{1,s,k}$ and $LS_{2,s,k}$, then its deviation state due to small deviations at the locating points is:

$$
\mathbf{q}_{s,k} = \mathbf{R}_3^{s,k}
\begin{bmatrix} \delta LS_{1,s,k}(x) \\ \delta LS_{1,s,k}(z) \\ \delta LS_{2,s,k}(x) \\ \delta LS_{2,s,k}(z) \end{bmatrix}
= \mathbf{R}_3^{s,k}\,\mathbf{u}_{s,k}
\tag{6.5}
$$

where

$$
\mathbf{R}_3^{s,k} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
\dfrac{\sin\beta_{s,k}}{SD_{s,k}} & -\dfrac{\cos\beta_{s,k}}{SD_{s,k}} & -\dfrac{\sin\beta_{s,k}}{SD_{s,k}} & \dfrac{\cos\beta_{s,k}}{SD_{s,k}}
\end{bmatrix}
\tag{6.6}
$$

and $SD_{s,k}$ is the Euclidean distance between $LS_{1,s,k}$ and $LS_{2,s,k}$. The next corollary characterizes the reorientation when a subassembly is transferred to the next station. As explained in Subsection 6.2.2, reorientation will induce deviation in a part even if the current locating points are free of error.
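As a quick check of Equation 6.5 and Equation 6.6, consider the special case in which the two locating points lie on a line parallel to the x-axis, so that $\beta_{s,k} = 0$. This case is an illustrative assumption and is not worked in the text. With $\sin\beta_{s,k} = 0$ and $\cos\beta_{s,k} = 1$:

```latex
\mathbf{q}_{s,k}
= \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & -\dfrac{1}{SD_{s,k}} & 0 & \dfrac{1}{SD_{s,k}}
\end{bmatrix}
\begin{bmatrix} \delta LS_{1,s,k}(x) \\ \delta LS_{1,s,k}(z) \\ \delta LS_{2,s,k}(x) \\ \delta LS_{2,s,k}(z) \end{bmatrix}
= \begin{bmatrix}
\delta LS_{1,s,k}(x) \\
\delta LS_{1,s,k}(z) \\
\dfrac{\delta LS_{2,s,k}(z) - \delta LS_{1,s,k}(z)}{SD_{s,k}}
\end{bmatrix}
```

In this case the subassembly translates with the 4-way locator and rotates by the difference of the two z deviations divided by the locator spacing. The x deviation of the 2-way locator has no effect, which is consistent with a 2-way pin that controls part motion only in the z direction.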
Corollary 6.2: Suppose that subassembly s is supported by $LS_{1,s,k}$ and $LS_{2,s,k}$ at station k, and also assume that these two locating points are free of error at the current station k. The deviation state of subassembly s on station k when moving from station k – 1 to station k can be expressed as a linear combination of the deviations accumulated in its locating points $LS_1$ and $LS_2$ at the previous station k – 1:

$$
\mathbf{q}_{s,k} = \mathbf{R}_4^{k-1,k}
\begin{bmatrix} \delta LS_{1,s,k-1}(x) \\ \delta LS_{1,s,k-1}(z) \\ \delta LS_{2,s,k-1}(x) \\ \delta LS_{2,s,k-1}(z) \end{bmatrix}
\tag{6.7}
$$

where

$$
\mathbf{R}_4^{k-1,k} = \begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
-\dfrac{\sin\beta_{s,k}}{SD_{s,k}} & \dfrac{\cos\beta_{s,k}}{SD_{s,k}} & \dfrac{\sin\beta_{s,k}}{SD_{s,k}} & -\dfrac{\cos\beta_{s,k}}{SD_{s,k}}
\end{bmatrix}
\tag{6.8}
$$
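The reorientation effect in Corollary 6.2 can be illustrated with a short numerical sketch. It assumes that Equation 6.8 is read as the negative of the matrix in Equation 6.6, that the locators lie along the x-axis, and that the deviation values are hypothetical; the function names `r1`, `r3`, and `r4` are introduced here for illustration only.

```python
import numpy as np

def r1(point_a, point_b):
    """R_1^{a,b} of Equation 6.2."""
    return np.array([[1.0, 0.0, -(point_b[1] - point_a[1])],
                     [0.0, 1.0,  (point_b[0] - point_a[0])],
                     [0.0, 0.0,  1.0]])

def r3(ls1, ls2):
    """R_3^{s,k} of Equation 6.6 from nominal locator coordinates (x, z)."""
    dx, dz = ls2[0] - ls1[0], ls2[1] - ls1[1]
    sd = np.hypot(dx, dz)                 # distance between the two locating points
    s, c = dz / sd, dx / sd               # sin(beta), cos(beta)
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [s / sd, -c / sd, -s / sd, c / sd]])

def r4(ls1, ls2):
    """R_4^{k-1,k}, assumed here to equal -R_3 (our reading of Equation 6.8)."""
    return -r3(ls1, ls2)

# Hypothetical nominal locators (mm) and deviations accumulated at station k-1.
ls1, ls2 = (0.0, 0.0), (800.0, 0.0)
u_prev = np.array([0.3, -0.2, 0.3, 0.4])  # [dLS1(x), dLS1(z), dLS2(x), dLS2(z)]

q = r4(ls1, ls2) @ u_prev                 # reorientation-induced motion of the subassembly
moved_ls2 = r1(ls1, ls2) @ q              # motion of LS2 implied by q (Lemma 6.1)

residual_ls1 = u_prev[:2] + q[:2]         # LS1 returns to nominal in x and z
residual_ls2_z = u_prev[3] + moved_ls2[1] # LS2 returns to nominal in z
```

Under these assumptions, repositioning on error-free locators cancels the accumulated deviation at the 4-way pin in both directions and at the 2-way pin in z, while every other point on the subassembly inherits the motion q.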
6.3.3 STATE SPACE REPRESENTATION
Based on the previous discussions, as well as the analysis done by Jin and Shi [10], the deviation state of part i on station k can be expressed as a summation:

$$ \mathbf{x}_{i,k} = \mathbf{x}_{i,k-1} + \mathbf{d}_f^{i,k} + \mathbf{d}_r^{i,k} \tag{6.9} $$

where $\mathbf{d}_f^{i,k}$ and $\mathbf{d}_r^{i,k}$ are the fixture-error-induced deviation and reorientation-induced deviation, respectively, and the superscripts have the same meaning as the subscripts of $\mathbf{x}$. Equation 6.9 is illustrated in Figure 6.8, in which $\tilde{\mathbf{x}}_{i,k-1}$ is the intermediate state of part i during the operations on station k. The two deviation terms $\mathbf{d}_f^{i,k}$ and $\mathbf{d}_r^{i,k}$ in Equation 6.9 are given by the following theorems.
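FIGURE 6.8 Deviation accumulation of part i on station k. The deviated part from the previous station, $\mathbf{x}_{i,k-1}$, is changed by reorientation ($\mathbf{d}_r$) into the intermediate state $\tilde{\mathbf{x}}_{i,k-1}$ and then by fixture error ($\mathbf{d}_f$) into $\mathbf{x}_{i,k}$, which is the input to the next station.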
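FIGURE 6.9 Fixture error-induced deviation. Labeled elements: part i with locating points $LP_{1,i,k}$ and $LP_{2,i,k}$, and subassembly s with locating points $LS_{1,s,k}$ and $LS_{2,s,k}$, in the x-z plane.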
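Theorem 6.1: Suppose part i is in subassembly s on station k. As illustrated in Figure 6.9, its fixture error-induced deviation $\mathbf{d}_f^{i,k}$ can be expressed as

$$ \mathbf{d}_f^{i,k} = \mathbf{B}_1^{i,k}\mathbf{u}_k \tag{6.10} $$

where

$$
\mathbf{B}_1^{i,k} = \begin{cases}
\mathbf{R}_1^{LS_{1,s,k},\,LP_{1,i,k}}\;\mathbf{R}_3^{s,k}\;\mathbf{W}_1(s) & \text{if } k \ge \xi_i \\
\mathbf{0}_{3\times 4n_k} & \text{if } k < \xi_i
\end{cases}
\tag{6.11}
$$

$$
\mathbf{W}_1(s) = \begin{bmatrix} \Delta_{1s}\mathbf{I}_{4\times 4} & \Delta_{2s}\mathbf{I}_{4\times 4} & \cdots & \Delta_{hs}\mathbf{I}_{4\times 4} & \cdots \end{bmatrix}_{4\times 4n_k}
\tag{6.12}
$$

The Kronecker delta is

$$
\Delta_{hs} = \begin{cases} 1 & \text{if } h = s \\ 0 & \text{if } h \ne s \end{cases}
\tag{6.13}
$$

Proof: According to Corollary 6.1, the deviation of subassembly s due to fixture errors at $LS_{1,s,k}$ and $LS_{2,s,k}$ is the $\mathbf{q}_{s,k}$ expressed in Equation 6.5. Because part i is on subassembly s, Lemma 6.1 can be employed to obtain the deviation state of part i, which is

$$ \mathbf{d}_f^{i,k} = \mathbf{R}_1^{LS_{1,s,k},\,LP_{1,i,k}}\;\mathbf{q}_{s,k} \tag{6.14} $$

Substituting Equation 6.5 into Equation 6.14 yields

$$ \mathbf{d}_f^{i,k} = \mathbf{R}_1^{LS_{1,s,k},\,LP_{1,i,k}}\;\mathbf{R}_3^{s,k}\,\mathbf{u}_{s,k} \tag{6.15} $$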
The $\mathbf{W}_1(s)$ defined in Equation 6.12 is a selecting matrix that determines which locator pair affects the deviation of part i, i.e.,

$$ \mathbf{u}_{s,k} = \mathbf{W}_1(s)\,\mathbf{u}_k \tag{6.16} $$

Combining Equation 6.15 and Equation 6.16, we end up with the expression of the upper part in Equation 6.11. The lower part in Equation 6.11 implies that fixture error will not be reflected in the deviation of part i before this part appears in the assembly stream.

Theorem 6.2: If part i is on subassembly s on station k, its reorientation-induced deviation $\mathbf{d}_r^{i,k}$ can be expressed as

$$ \mathbf{d}_r^{i,k} = \mathbf{B}_2^{i,k}\mathbf{F}_1\mathbf{x}_{J,k-1} + \mathbf{B}_2^{i,k}\mathbf{F}_2\mathbf{x}_{G,k-1} \tag{6.17} $$

where

$$
\mathbf{B}_2^{i,k} = \begin{cases}
\mathbf{R}_1^{LS_{1,s,k},\,LP_{1,i,k}}\;\mathbf{R}_4^{k-1,k} & \text{if } k > \xi_i \\
\mathbf{0} & \text{if } k \le \xi_i
\end{cases}
\tag{6.18}
$$

$$
\mathbf{F}_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\tag{6.19}
$$

$$
\mathbf{F}_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{R}_1^{LP_{1,G,k},\,LP_{2,G,k}}
\tag{6.20}
$$

and J and G are the indices of the parts on which the points $LS_{1,s,k}$ and $LS_{2,s,k}$ are located, respectively.

Proof: The $\mathbf{d}_r^{i,k}$ is actually caused by the locating deviation accumulated up to station k – 1; therefore, it can be expressed as:

$$
\mathbf{d}_r^{i,k} = \mathbf{R}_1^{LS_{1,s,k},\,LP_{1,i,k}}\;\mathbf{R}_4^{k-1,k}
\begin{bmatrix} \delta LS_{1,s,k-1}(x) \\ \delta LS_{1,s,k-1}(z) \\ \delta LS_{2,s,k-1}(x) \\ \delta LS_{2,s,k-1}(z) \end{bmatrix}
\tag{6.21}
$$
where $\delta LS_{h,s,k-1}(x)$ and $\delta LS_{h,s,k-1}(z)$ (h = 1, 2) have the same meaning as those in Corollary 6.2. Here, $\mathbf{R}_4^{k-1,k}$ converts fixture deviations at the previous station into the deviation state of the associated subassembly during part transferring across stations (Corollary 6.2), whereas $\mathbf{R}_1^{LS_1,LP_1}$ further connects the deviation of subassembly s to that of part i using Lemma 6.1. Provided that $LS_{1,s,k}$ is on part J and $LS_{2,s,k}$ on part G, respectively, the deviation information of these two points at the previous station k – 1 is actually preserved in $\mathbf{x}_{J,k-1}$ and $\mathbf{x}_{G,k-1}$, that is,

$$
\begin{bmatrix} \delta LS_{1,s,k-1}(x) \\ \delta LS_{1,s,k-1}(z) \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{x}_{J,k-1}
\tag{6.22}
$$

and

$$
\begin{bmatrix} \delta LS_{2,s,k-1}(x) \\ \delta LS_{2,s,k-1}(z) \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{R}_1^{LP_{1,G,k},\,LP_{2,G,k}}\;\mathbf{x}_{G,k-1}
\tag{6.23}
$$

Putting them together, Equation 6.22 and Equation 6.23 yield

$$
\begin{bmatrix} \delta LS_{1,s,k-1}(x) \\ \delta LS_{1,s,k-1}(z) \\ \delta LS_{2,s,k-1}(x) \\ \delta LS_{2,s,k-1}(z) \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \mathbf{x}_{J,k-1}
+ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{R}_1^{LP_{1,G,k},\,LP_{2,G,k}}\;\mathbf{x}_{G,k-1}
\tag{6.24}
$$

Substituting Equation 6.24 into Equation 6.21, and also using the definitions in Equation 6.18 to Equation 6.20, Equation 6.17 can be obtained. Again, the lower part of Equation 6.18 is 0 because the reorientation-induced deviation only begins to contribute from station k + 1 if part i first appears on station k.

Utilizing Theorem 6.1 and Theorem 6.2, Equation 6.9 can be written as

$$ \mathbf{x}_{i,k} = \mathbf{x}_{i,k-1} + \mathbf{B}_1^{i,k}\mathbf{u}_k + \mathbf{B}_2^{i,k}\mathbf{F}_1\mathbf{x}_{J,k-1} + \mathbf{B}_2^{i,k}\mathbf{F}_2\mathbf{x}_{G,k-1} \tag{6.25} $$
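This equation implies that the deviation state of part i is coupled with that of other parts. It is then impossible to write a decoupled deviation propagation equation for each individual part. Alternatively, we can write a state space equation using the deviation state vector of all $n_p$ parts, which is $\mathbf{x}_k = [\mathbf{x}_{1,k}^T \;\; \cdots \;\; \mathbf{x}_{n_p,k}^T]^T$. First, we need to replace $\mathbf{x}_{J,k-1}$ and $\mathbf{x}_{G,k-1}$ in Equation 6.25 with the aggregated state vector $\mathbf{x}_{k-1}$. Define $\mathbf{W}_2$, another selecting matrix, as

$$
\mathbf{W}_2(s,k) = \begin{bmatrix}
\mathbf{W}_{11} & \mathbf{W}_{12} & \cdots & \mathbf{W}_{1n_p} \\
\mathbf{W}_{21} & \mathbf{W}_{22} & \cdots & \mathbf{W}_{2n_p}
\end{bmatrix}_{6\times 3n_p}
\tag{6.26}
$$

where

$$
\mathbf{W}_{jg} = \begin{cases}
\mathbf{I}_{3\times 3} & \text{if } (j, g) = (1, J) \text{ or } (2, G) \\
\mathbf{0}_{3\times 3} & \text{otherwise}
\end{cases}
\tag{6.27}
$$

$\mathbf{W}_2$ is used to determine which parts on subassembly s contribute to the reorientation-induced error at station k. Then, define $\mathbf{H}_{i,k}$ as

$$ \mathbf{H}_{i,k} = \begin{bmatrix} \mathbf{B}_2^{i,k}\mathbf{F}_1 & \mathbf{B}_2^{i,k}\mathbf{F}_2 \end{bmatrix} \tag{6.28} $$

Using Equation 6.26 and Equation 6.28, Equation 6.25 turns out to be

$$ \mathbf{x}_{i,k} = \mathbf{x}_{i,k-1} + \mathbf{B}_1^{i,k}\mathbf{u}_k + \mathbf{H}_{i,k}\mathbf{W}_2(s,k)\,\mathbf{x}_{k-1} \tag{6.29} $$

Furthermore, define $\mathbf{B}_k$ and $\mathbf{H}_{k-1}$ as

$$
\begin{aligned}
\mathbf{B}_k &= \begin{bmatrix} (\mathbf{B}_1^{1,k})^T & (\mathbf{B}_1^{2,k})^T & \cdots & (\mathbf{B}_1^{n_p,k})^T \end{bmatrix}^T \\
\mathbf{H}_{k-1} &= \begin{bmatrix} (\mathbf{H}_{1,k}\mathbf{W}_2(s,k))^T & (\mathbf{H}_{2,k}\mathbf{W}_2(s,k))^T & \cdots & (\mathbf{H}_{n_p,k}\mathbf{W}_2(s,k))^T \end{bmatrix}^T \quad \text{for } k \ge 2,
\end{aligned}
\tag{6.30}
$$

and $\mathbf{H}_0 = \mathbf{0}_{3n_p\times 3n_p}$. Then, Equation 6.29 can be written as $\mathbf{x}_k = (\mathbf{I}_{3n_p\times 3n_p} + \mathbf{H}_{k-1})\,\mathbf{x}_{k-1} + \mathbf{B}_k\mathbf{u}_k$, or one step further,

$$ \mathbf{x}_k = \mathbf{A}_{k-1}\mathbf{x}_{k-1} + \mathbf{B}_k\mathbf{u}_k \tag{6.31} $$

where $\mathbf{A}_{k-1} = \mathbf{I}_{3n_p\times 3n_p} + \mathbf{H}_{k-1}$.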
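The two selecting matrices used in this derivation, W1(s) of Equation 6.12 and W2(s, k) of Equation 6.26 and Equation 6.27, can be built compactly with Kronecker products. The sketch below is one possible construction; the function names `w1` and `w2` and the 1-based index convention are assumptions made here, and the vectors are placeholders rather than real deviation data.

```python
import numpy as np

def w1(s, n_k):
    """4 x 4*n_k selector of Equation 6.12; s is the 1-based locator-pair index."""
    delta = np.zeros((1, n_k))
    delta[0, s - 1] = 1.0                 # Kronecker deltas Delta_{hs}, h = 1..n_k
    return np.kron(delta, np.eye(4))

def w2(J, G, n_p):
    """6 x 3*n_p selector of Equations 6.26 and 6.27; J and G are 1-based part indices."""
    sel = np.zeros((2, n_p))
    sel[0, J - 1] = 1.0                   # block row 1 picks part J
    sel[1, G - 1] = 1.0                   # block row 2 picks part G
    return np.kron(sel, np.eye(3))

# Placeholder vectors: 3 locator pairs (4 entries each) and 3 parts (3 entries each).
u_k = np.arange(12.0)
x_prev = np.arange(9.0)

u_s = w1(2, 3) @ u_k                      # deviations of the second locator pair
x_JG = w2(3, 1, 3) @ x_prev               # states of part J = 3 stacked over part G = 1
```

Using Kronecker products keeps the block structure explicit; an equivalent construction could slice the stacked vectors directly.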
Equation 6.31 is the state equation in our station-indexed state space model, governing the deviation propagation in an assembly process under the assumptions listed in Subsection 6.2.3. If coordinate sensors are placed on assembly stations along the assembly line, an observation equation should be employed to describe the relationship between the dimensional measurements and the deviation state. Denote $m_{i,k}$ as the number of measurement points on part $i$ at station $k$ and $m_k$ as the total number of measurement points at station $k$. Then the deviation of each point on part $i$ is:

\[
\mathbf{y}^{j}_{i,k} = \mathbf{R}_0^{LP_{1,i,k},\,SP_{j,i}}\,\mathbf{x}_{i,k}, \quad j = 1, 2, \ldots, m_{i,k}
\tag{6.32}
\]

where

\[
\mathbf{R}_0^{LP_{1,i,k},\,SP_{j,i}} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0
\end{bmatrix}
\mathbf{R}_1^{LP_{1,i,k},\,SP_{j,i}}
\tag{6.33}
\]

and $SP_{j,i}$ is the location of the $j$th sensor on part $i$. Thus, the measurement vector of part $i$ on station $k$ is

\[
\mathbf{y}_{i,k} = [(\mathbf{y}^{1}_{i,k})^T \quad (\mathbf{y}^{2}_{i,k})^T \quad \cdots \quad (\mathbf{y}^{m_{i,k}}_{i,k})^T]^T
\tag{6.34}
\]

and it can be written as

\[
\mathbf{y}_{i,k} = \mathbf{C}_{i,k}\,\mathbf{x}_{i,k}
\tag{6.35}
\]

where

\[
\mathbf{C}_{i,k} =
\begin{bmatrix}
\mathbf{R}_0^{LP_{1,i,k},\,SP_{1,i}} \\
\mathbf{R}_0^{LP_{1,i,k},\,SP_{2,i}} \\
\vdots \\
\mathbf{R}_0^{LP_{1,i,k},\,SP_{m_{i,k},i}}
\end{bmatrix}_{2 m_{i,k} \times 3}
\tag{6.36}
\]

Then, the deviation vector of all the measurement points on station $k$ is

\[
\mathbf{y}_k \equiv [\mathbf{y}_1^T \quad \cdots \quad \mathbf{y}_{m_k}^T]^T = \mathbf{C}_k\,\mathbf{x}_k
\tag{6.37}
\]

where

\[
\mathbf{C}_k =
\begin{bmatrix}
\mathbf{C}_{1,k} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{C}_{2,k} & \cdots & \mathbf{0} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{C}_{n_p,k}
\end{bmatrix}
\tag{6.38}
\]
If process background disturbances, higher-order terms due to linearization, and measurement noises are considered together with Equation 6.31 and Equation 6.32, the state space model governing dimensional deviation propagation in an N-station process will be:

\[
\mathbf{x}_k = \mathbf{A}_{k-1}\mathbf{x}_{k-1} + \mathbf{B}_k\mathbf{u}_k + \mathbf{w}_k, \quad k = 1, 2, \ldots, N
\tag{6.39}
\]

\[
\mathbf{y}_k = \mathbf{C}_k\mathbf{x}_k + \mathbf{v}_k, \quad \{k\} \subset \{1, 2, \ldots, N\}
\tag{6.40}
\]

where $\mathbf{w}_k$ and $\mathbf{v}_k$ are the random vectors representing process disturbances or higher-order terms and measurement noises, respectively.
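As an illustration of the observation equation, the following minimal sketch builds the matrix of Equation 6.36 for a single part by stacking the blocks of Equation 6.33. It is not taken from the book: the helper names (`r1`, `r0`, `observation_matrix`, `matvec`) are hypothetical, the state values are invented, and we assume that LP1 is the reference locator and that SP1 and SP2 (coordinates from Table 6.1 and Table 6.2) lie on the same part.

```python
def r1(a, b):
    """R_1^{a,b} of Appendix 6.1 for points a = (xa, za) and b = (xb, zb)."""
    (xa, za), (xb, zb) = a, b
    return [[1.0, 0.0, -(zb - za)],
            [0.0, 1.0, xb - xa],
            [0.0, 0.0, 1.0]]


def r0(a, b):
    """R_0^{a,b} (Equation 6.33): the first two rows of R_1^{a,b}."""
    return r1(a, b)[:2]


def observation_matrix(lp1, sensors):
    """C_{i,k} (Equation 6.36): the R_0 blocks stacked into a 2m x 3 matrix."""
    rows = []
    for sp in sensors:
        rows.extend(r0(lp1, sp))
    return rows


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Assumption: LP1 is the reference locator and SP1, SP2 lie on the same part.
lp1 = (367.8, 906.05)                          # LP1, Table 6.1
sensors = [(271.50, 905.0), (565.7, 1634.7)]   # SP1 and SP2, Table 6.2
C = observation_matrix(lp1, sensors)

x = [0.10, -0.05, 0.001]   # hypothetical state: dx (mm), dz (mm), dbeta (rad)
y = matvec(C, x)           # deviations at SP1(x), SP1(z), SP2(x), SP2(z)
print(C)
print(y)
```

Each sensor contributes two rows (x and z), so the rotation term is scaled by the lever arm between the reference locator and the sensor.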
6.4 MODEL VALIDATION

To validate the state space model developed in Section 6.3, consider the assembly process of the side frame of an SUV. The final product, the inner-panel-complete, comprises four components: A-pillar, B-pillar, rail roof side panel, and rear quarter panel, which are shown in Figure 6.10. The four parts are assembled on three stations (Stations I, II, and III). The final assembly is then inspected at Station IV, where SP1 to SP10, marked in Figure 6.10d, are the key dimensional features. In such a multistation process, the aforementioned 3-2-1 fixture is used on every station to ensure product dimensional accuracy. In Figure 6.10, LP1 to LP8 are locating points, often called principal locating points (PLPs) in the automotive industry. A fixture layout in a multistation process can be represented using these PLPs as follows:

{{LP1, LP2}, {LP3, LP4}}I → {{LP1, LP4}, {LP5, LP6}}II → {{LP1, LP6}, {LP7, LP8}}III → {{LP1, LP8}}IV,

where the assembly process starts from Station I (indicated by the subscript) and the arrow represents a transition from one station to the next. As an example, {{LP1, LP4}, {LP5, LP6}}II means that at Station II, the first workpiece, the subassembly “A-pillar+B-pillar,” is located by LP1 and LP4 and the second workpiece, the rail roof side panel, is located by LP5 and LP6. The design nominal x and z coordinates for fixture locators and measurement points are given in Table 6.1 and Table 6.2, respectively.
FIGURE 6.10 Assembly process of an SUV side frame. (a) Station I: A-pillar and B-pillar assembly. (b) Station II: addition of rail roof side panel. (c) Station III: final assembly of inner-panel. (d) Station IV: key product features. (The panels show the locating points LP1 to LP8 and the measurement points SP1 to SP10 in the x-z plane.)
TABLE 6.1 Coordinates of Fixture Locators (in Figure 6.10a, Figure 6.10b, and Figure 6.10c; units: mm)

Locator    x          z
LP1        367.8      906.05
LP2        667.47     1295.35
LP3        1301       1368.89
LP4        1272.73    537.37
LP5        1470.71    1640.40
LP6        1770.50    1702.62
LP7        2941.42    1691.31
LP8        2120.32    1402.83
TABLE 6.2 Coordinates of Measurement Points (in Figure 6.10d; units: mm)

Point      x          z
SP1        271.50     905
SP2        565.7      1634.7
SP3        1289.7     1227.5
SP4        1306.5     633.5
SP5        1244.5     85
SP6        1604.5     1781.8
SP7        2884.8     1951.5
SP8        2743.5     475.2
SP9        1838.4     226.3
SP10       1979.8     1459.4
A state space model is established, following the general modeling procedure outlined in Section 6.2, to represent the variation propagation in this four-station assembly process. Suppose that there are tolerances of ±1.0 mm associated with all the locating points on each of the three assembly stations, and assume that the fixture locators at the measurement station (Station IV) are free of error. Also, it is assumed that deviations at the locating points are normally distributed and that the tolerance limits (±1.0) correspond to the 6σ value of a corresponding normal distribution. Variations observed at the sensor locations are calculated using the state space model and a simulation software package, Variation System Analysis (VSA), based on 5000 simulations. The mean value and the standard deviation (std) of all measurement points from both models are compared in Table 6.3. The discrepancy between the std values from the two models is less than 0.054% for this four-station example. We conclude that the state space model provides an adequate linear variation propagation model that, under some circumstances, can predict dimensional variation as accurately as VSA does.

TABLE 6.3 Comparison between the State Space Model and VSA (unit: mm)

            State Space Model          VSA                        Discrepancy between
Point       Mean         Std(σ)        Mean         Std(σ)        Std’s (%)
SP1(x)      271.5002     0.0020        271.5000     0.0020        0.0000
SP1(z)      905.0007     0.1862        905.0007     0.1861        0.0537
SP2(x)      565.7050     1.4103        565.7053     1.4102        0.0071
SP2(z)      1634.6972    0.3833        1634.6986    0.3833        0.0000
SP3(x)      1289.6876    0.8361        1289.6876    0.8361        0.0000
SP3(z)      1227.4772    0.9885        1227.4778    0.9885        0.0000
SP4(x)      1306.5058    1.0123        1306.5059    1.0123        0.0000
SP4(z)      633.4785     0.9999        633.4783     0.9999        0.0000
SP5(x)      1244.4226    1.7122        1244.5228    1.7122        0.0000
SP5(z)      84.9773      0.9608        84.9764      0.9608        0.0000
SP6(x)      1640.5004    0.9545        1640.5007    0.9545        0.0000
SP6(z)      1781.7971    0.8902        1781.7985    0.8900        0.0225
SP7(x)      2884.8178    1.1852        2884.8065    1.1853        0.0084
SP7(z)      1951.5027    1.1136        1951.5037    1.1136        0.0000
SP8(x)      2743.5147    1.6114        2743.5094    1.6115        0.0062
SP8(z)      475.2045     0.9210        475.2034     0.9211        0.0109
SP9(x)      1838.4151    1.9107        1838.4099    1.9107        0.0000
SP9(z)      226.3067     0.4698        226.3016     0.4698        0.0000
SP10(x)     1979.8176    0.9165        1979.8075    0.9165        0.0000
SP10(z)     1459.4052    0.3229        1459.4019    0.3229        0.0000
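The discrepancy column of Table 6.3 can be checked with a few lines of arithmetic. The text does not state the formula, so the sketch below assumes that the discrepancy is the absolute difference between the two std values divided by the first one, expressed in percent; the function name `std_discrepancy_percent` is hypothetical. Only the rows with a nonzero discrepancy are listed.

```python
def std_discrepancy_percent(std_1, std_2):
    """Assumed definition: |std_1 - std_2| / std_1, expressed in percent."""
    return abs(std_1 - std_2) / std_1 * 100.0


# (point, first std column, second std column) copied from Table 6.3
rows = [
    ("SP1(z)", 0.1862, 0.1861),
    ("SP2(x)", 1.4103, 1.4102),
    ("SP6(z)", 0.8902, 0.8900),
    ("SP7(x)", 1.1852, 1.1853),
    ("SP8(x)", 1.6114, 1.6115),
    ("SP8(z)", 0.9210, 0.9211),
]

results = {point: round(std_discrepancy_percent(a, b), 4) for point, a, b in rows}
for point, value in results.items():
    print(point, value)
print("largest discrepancy (%):", max(results.values()))
```

Under this assumed definition the largest value occurs at SP1(z), which agrees with the bound of 0.054% quoted in the text.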
6.5 EXERCISES

These exercises are intended to build familiarity with the state space model and with assembly process simulation. The process analyzed corresponds to the assembly of an automobile side frame, as presented in Figure 6.11. The side frame has three parts: the fender, the side ring, and the rear quarter.
FIGURE 6.11 Description of the components of a sedan side frame.
FIGURE 6.12 Assembly sequence of the sedan side frame.
The assembly sequence is presented in Figure 6.12. At Station 1, the fender and the side ring are assembled together. The subassembly is then moved to Station 2, where the rear quarter is added. Finally, the whole assembly is transferred to the measurement station to measure the KPCs given as the measurement points (M1 to M8). The dots on each part represent the locating points used to locate the parts in each station. The odd numbers correspond to the 4-way locators and the even numbers to the 2-way locators. It is important to note that at Station 2, the subassembly coming from Station 1 is held by locators P1 and P4. In the measurement station, the locators used are P1 and P6. The positions of the locators and measurement points are presented in Table 6.4.

Problem 6.1 (Single-part model): In this problem, consider the complete side frame as one part, as shown in Figure 6.13.

1. Derive the relationships that relate the deviations at the measurement points M4 and M6 to the deviations of the locating points P1 and P6. (Hint: Obtain matrix T (T = R1 R2) in this case.)
2. From the result in (1), determine the amount of deviation in the locators (P1 and P6) that causes the deviations of the measurement points for the different parts presented in Table 6.5. Provide a table with the faults that you identified and their associated deviations.
TABLE 6.4 Location of the Locators and the Measurement Points

                  X (mm)    Z (mm)
Locators
P1 (4-way)        250       360
P2 (2-way)        700       365
P3 (4-way)        1078      770
P4 (2-way)        2992      812
P5 (4-way)        3437      495
P6 (2-way)        3905      503
Measurement Points
M1                456       401.7
M2                770       555.3
M3                1078      770
M4                1740      1146.4
M5                1807      35
M6                2456      1260
M7                4050      800
M8                4250      402
FIGURE 6.13 Simplified model of a sedan side frame (single-part model).
Note that:
• It is recommended to use Matlab™ (Matlab is a registered trademark of The MathWorks, Inc.) to calculate matrices R1 and R2.
• To determine the deviation of the locators in (2), you should use the command pinv (pseudoinverse) in Matlab instead of inv to calculate the inverse of the matrix.
Problem 6.2 (Multiple-parts model): In this problem, consider the complete assembly process of the three parts that form the side frame presented in Figure 6.12, and the position of the locators and the measurement points presented in Table 6.4.
TABLE 6.5 Observed Measurements (unit: mm)

Part #    M4 (x)    M4 (z)    M6 (x)    M6 (z)
1         0.992     0.016     0.990     0.024
2         0.215     0.407     0.246     0.603
3         0         1         0         1
4         0.43      0.186     0.492     0.205
5         1.206     0.609     1.236     0.421
6         0.215     0.593     0.246     0.397
7         0.562     0.17      0.499     0.229
1. Obtain the state space model matrices (A, B for each station and matrix C for the measurement station) for the process presented in Figure 6.12. The locators used at Station 1 are P1 and P2 for the fender, and P3 and P4 for the side ring. At Station 2, the locators used are P1 and P4 for the subassembly and P5 and P6 for the rear quarter. At the measurement station, the locators P1 and P6 are used to hold the complete side frame. To verify the results, note that A0(1,1) = 1; A1(3,3) = 1; B1(3,2) = 0.0022; B2(3,2) = 0.0004; C3(16,9) = 813. Remember that the indices of the matrices A must be shifted by 1 in Matlab, because Matlab indices cannot be 0.
2. Compare the following two mounting strategies in terms of their impact on quality (the simulation conditions are stated at the end of the problem):
   • First strategy: Locators P1 and P6 are used to hold the frame at the measurement station.
   • Second strategy: Locators P1 and P4 are used (i.e., P4 instead of P6) to hold the frame at the measurement station.
   Do each of the following:
   a. Compare the standard deviation of both strategies for each individual measurement point M1 to M8 in the x and z directions (use the Matlab command std to calculate the standard deviation).
   b. Compare the 2-norm (square root of the sum of squared elements of a vector) of the vectors containing the standard deviations of the eight measurement points for both strategies (use the command norm in Matlab to calculate the vector norms).
   c. Which mounting strategy do you recommend?

When solving this problem, please consider that the only source of variation in the whole process is the fixture variations at each station, which are independent and normally distributed with zero mean and standard deviation 0.0416 mm.
FIGURE 6.14 Part deviating from its nominal position. (Labels: nominal position with points A and B; deviated position with points A′ and B′; deviations δX_A, δZ_A, δX_B, and δZ_B; angles β and δβ; axes X and Z.)
APPENDIX 6.1: DETERMINATION OF THE DEVIATIONS OF A PART (LEMMA 6.1)

Figure 6.14 presents a view of a part deviating from its nominal position. As the deviation is in the 2-D plane (x-z) and the part is assumed to be rigid, three parameters are required to completely characterize the deviations of the part. Those are the deviations in the x and z directions and the rotation about the y-axis (going into the plane, and henceforth considered positive in the y direction). The position of any arbitrary point B on the part can be determined based on the known position of any other arbitrary point A and the part rotation. In Figure 6.14, point A coincides with the location of the part hole. However, point A can be any point on the part. Then the position of point B can be determined as:

\[
\begin{aligned}
x_b &= x_a + x_{ab} \\
z_b &= z_a + z_{ab}
\end{aligned}
\tag{6.41}
\]

where $x_a$, $z_a$, $x_b$, and $z_b$ are the positions of points A and B in the x- and z-axis directions (absolute position), respectively. The terms $x_{ab}$ and $z_{ab}$ stand for the distance between points A and B in the x and z directions and are calculated as $x_{ab} = x_b - x_a$ and $z_{ab} = z_b - z_a$. If the part deviates in the plane and the deviations of the reference point A are known (including part rotation), then it is possible to determine the final position of point B, $x_b'$ and $z_b'$, as

\[
\begin{aligned}
x_b' &= x_a' + x_{ab}\cos(\delta\beta) - z_{ab}\sin(\delta\beta) \\
z_b' &= z_a' + x_{ab}\sin(\delta\beta) + z_{ab}\cos(\delta\beta)
\end{aligned}
\tag{6.42}
\]

where $x_a'$ and $z_a'$ are the positions of point A after the deviation of the part in the x and z directions, and $\delta\beta$ is the part rotation. Now, considering that the rotations of the part are small (less than 10°), Equation 6.42 can be simplified (based on the linearization of the trigonometric functions, $\cos(\delta\beta) \approx 1$ and $\sin(\delta\beta) \approx \delta\beta$) as

\[
\begin{aligned}
x_b' &= x_a' + x_{ab} - z_{ab}\,\delta\beta \\
z_b' &= z_a' + x_{ab}\,\delta\beta + z_{ab}
\end{aligned}
\tag{6.43}
\]

By subtracting Equation 6.41 from Equation 6.43, it is possible to determine the deviations of point B with respect to its nominal position as

\[
\begin{aligned}
\delta x_b &= x_b' - x_b = \delta x_a - z_{ab}\,\delta\beta \\
\delta z_b &= z_b' - z_b = \delta z_a + x_{ab}\,\delta\beta
\end{aligned}
\tag{6.44}
\]

Then Equation 6.44 can be rewritten as a matrix multiplication:

\[
\begin{bmatrix} \delta x_b \\ \delta z_b \\ \delta\beta \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & -(z_b - z_a) \\
0 & 1 & x_b - x_a \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} \delta x_a \\ \delta z_a \\ \delta\beta \end{bmatrix}
= \mathbf{R}_1^{a,b}
\begin{bmatrix} \delta x_a \\ \delta z_a \\ \delta\beta \end{bmatrix}
\]

where matrix $\mathbf{R}_1^{a,b}$ is defined as

\[
\mathbf{R}_1^{a,b} =
\begin{bmatrix}
1 & 0 & -(z_b - z_a) \\
0 & 1 & x_b - x_a \\
0 & 0 & 1
\end{bmatrix}
\]
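To see how much is lost in the small-angle step, the sketch below compares the linear relation of Equation 6.44 (through the matrix R1) with the exact deviation obtained by subtracting Equation 6.41 from Equation 6.42. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def r1(a, b):
    """R_1^{a,b}: maps [dx_a, dz_a, dbeta] at point a to the deviation at b."""
    (xa, za), (xb, zb) = a, b
    return [[1.0, 0.0, -(zb - za)],
            [0.0, 1.0, xb - xa],
            [0.0, 0.0, 1.0]]


def linear_deviation(a, b, dxa, dza, dbeta):
    """Deviation of B from the linearized relation (Equation 6.44)."""
    state = [dxa, dza, dbeta]
    out = [sum(m * s for m, s in zip(row, state)) for row in r1(a, b)]
    return out[0], out[1]


def exact_deviation(a, b, dxa, dza, dbeta):
    """Deviation of B from Equation 6.42 minus Equation 6.41 (no linearization)."""
    xab, zab = b[0] - a[0], b[1] - a[1]
    dxb = dxa + xab * math.cos(dbeta) - zab * math.sin(dbeta) - xab
    dzb = dza + xab * math.sin(dbeta) + zab * math.cos(dbeta) - zab
    return dxb, dzb


# Hypothetical geometry (mm) and deviations (mm, mm, rad)
a, b = (0.0, 0.0), (100.0, 50.0)
lin = linear_deviation(a, b, 0.2, -0.1, 0.001)
exact = exact_deviation(a, b, 0.2, -0.1, 0.001)
print("linear:", lin)
print("exact: ", exact)
print("error: ", (lin[0] - exact[0], lin[1] - exact[1]))
```

For a rotation of 0.001 rad and lever arms of this size, the two results differ by a few hundredths of a micrometer in each direction, which is why the linear form is adequate for small rotations.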
APPENDIX 6.2: EFFECT OF FIXTURE DEVIATION ON PART DEVIATION (LEMMA 6.2)

Lemma 6.2 expresses the part deviations, or state ($\mathbf{x}_{i,k}$), as a function of the deviations of the hole and slot, which are caused by the deviations of the pins that fit into those orifices. In the current analysis, the reference point on the part where the deviations are calculated coincides with the location of the hole. The relations between the fixture or pin deviations and the part deviations are obtained by decomposing the effects that single deviations have on the deviation state. The case presented in Figure 6.15a corresponds to the general case where the locators of part $i$ in station $k$ ($LP_{1,i,k}$ and $LP_{2,i,k}$) are not necessarily aligned along the x or z directions. Figure 6.15b presents the effects that a deviation only in the z direction of the pin that fits into the hole ($LP_{1,i,k}$) has on the deviations of the part.
FIGURE 6.15 Effects of the deviation of LP1 on the part (Note: In the figure, the subscripts i,k have been omitted for simplicity). (a) Case where locators are not aligned with X or Z axis. (b) Impact of δLP1(z) on part orientation.
Those effects are a translation of the reference point $LP_{1,i,k}$ in the z direction and the rotation of the part. Figure 6.16 presents the detail of the effects that the displacement of $LP_{1,i,k}$, $\delta LP_{1,i,k}(z)$, has on the part. The part rotation $\delta\beta_{i,k}$ can be determined as a function of the displacement using geometric relations, as presented in Equation 6.45. The rotation is negative according to the axis defined in Figure 6.15.

\[
\delta\beta_{i,k} \approx -\cos(\beta_{i,k}) \cdot \frac{\delta LP_{1,i,k}(z)}{LD_{i,k}}
\tag{6.45}
\]

The same type of analysis helps to determine the effects that deviations $\delta LP_{1,i,k}(x)$, $\delta LP_{2,i,k}(x)$, and $\delta LP_{2,i,k}(z)$ have on the rotation of the part, as presented in Equation 6.46. In this case, a deviation in the x direction of $LP_{1,i,k}$ and in the z direction of $LP_{2,i,k}$ will generate a positive rotation; a deviation in the x direction of $LP_{2,i,k}$ will generate a negative one.

\[
\delta\beta_{i,k} \approx \sin(\beta_{i,k}) \cdot \frac{\delta LP_{1,i,k}(x) - \delta LP_{2,i,k}(x)}{LD_{i,k}} + \cos(\beta_{i,k}) \cdot \frac{\delta LP_{2,i,k}(z) - \delta LP_{1,i,k}(z)}{LD_{i,k}}
\tag{6.46}
\]
FIGURE 6.16 Details of the effect that the deviation of LP1,i,k (z) has on the deviations of the rotation (Note: In the figure, the subscripts i,k have been omitted for simplicity). (Labels: LD, δLP1(z), and δLP1(z)·Cos(β).)
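Finally, the total effect of the deviations of the reference or locating points on the deviations of the part $\mathbf{x}_{i,k}$ (state) can be represented as:

\[
\mathbf{x}_{i,k} =
\begin{bmatrix} \delta x_{i,k} \\ \delta z_{i,k} \\ \delta\beta_{i,k} \end{bmatrix}
= \mathbf{R}_2^{i,k}
\begin{bmatrix} \delta LP_{1,i,k}(x) \\ \delta LP_{1,i,k}(z) \\ \delta LP_{2,i,k}(x) \\ \delta LP_{2,i,k}(z) \end{bmatrix}
\]

where matrix $\mathbf{R}_2^{i,k}$ is defined as:

\[
\mathbf{R}_2^{i,k} =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
\dfrac{\sin\beta_{i,k}}{LD_{i,k}} & -\dfrac{\cos\beta_{i,k}}{LD_{i,k}} & -\dfrac{\sin\beta_{i,k}}{LD_{i,k}} & \dfrac{\cos\beta_{i,k}}{LD_{i,k}}
\end{bmatrix}_{3 \times 4}
\]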
The first two rows of $\mathbf{R}_2^{i,k}$ account for the translations, and the last row accounts for the rotation. The same analysis is used to derive Corollary 6.1, which explains the effect of locating point deviations of subassemblies on the deviations of the subassembly. Corollary 6.2 explains the effect of reorientation due to errors in the previous station. This is why the terms in the matrix in Equation 6.8 are the negation of the elements in the matrix in Equation 6.6: in this case the error is a locator error (inherited from station k – 1) and not a pin error in station k.
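The mapping from locator deviations to the part state can be sketched numerically as follows. This is an illustration under stated assumptions, not the book's code: we assume that β is the angle of the line from LP1 to LP2 measured from the x-axis toward the z-axis and that LD is the distance between the two locators; the function names and coordinates are hypothetical.

```python
import math


def r2(lp1, lp2):
    """R_2: 3 x 4 matrix mapping [dLP1(x), dLP1(z), dLP2(x), dLP2(z)] to [dx, dz, dbeta].

    Assumptions: beta is the angle of the line LP1 -> LP2 from the x-axis,
    and LD is the distance between the two locators.
    """
    dx, dz = lp2[0] - lp1[0], lp2[1] - lp1[1]
    ld = math.hypot(dx, dz)
    beta = math.atan2(dz, dx)
    s, c = math.sin(beta) / ld, math.cos(beta) / ld
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [s, -c, -s, c]]


def part_deviation(lp1, lp2, d_locators):
    """Part state [dx, dz, dbeta] caused by the four locator deviations."""
    return [sum(m * d for m, d in zip(row, d_locators)) for row in r2(lp1, lp2)]


# Hypothetical locators aligned with the x-axis, 100 mm apart
lp1, lp2 = (0.0, 0.0), (100.0, 0.0)
pin_up = part_deviation(lp1, lp2, [0.0, 1.0, 0.0, 0.0])    # only dLP1(z) = 1 mm
slot_up = part_deviation(lp1, lp2, [0.0, 0.0, 0.0, 1.0])   # only dLP2(z) = 1 mm

# Hypothetical tilted locator pair; both locators shifted by 1 mm in x
shift = part_deviation((0.0, 0.0), (30.0, 40.0), [1.0, 0.0, 1.0, 0.0])
print(pin_up, slot_up, shift)
```

With the locators aligned along x, a z deviation at LP1 produces a negative rotation, as stated for Equation 6.45, while the same deviation at LP2 produces a positive one; an equal shift of both locators is a pure translation with no rotation.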
References

1. Ceglarek, D., Shi, J., and Wu, S.M., A knowledge-based diagnostic approach for the launch of the auto-body assembly process, ASME Transactions, Journal of Engineering for Industry, 116, pp. 491–499, 1994.
2. Shalon, D., Gossard, D., Ulrich, K., and Fitzpatrick, D., Representing geometric variations in complex structural assemblies on CAD systems, Proceedings of the 19th Annual ASME Advances in Design Automation Conference, 44, 121, 1992.
3. Ceglarek, D. and Shi, J., Dimensional variation reduction for automotive body assembly, Manufacturing Review, 8, 139, 1995.
4. Cunningham, T.W. et al., Definition, analysis, and planning of a flexible assembly process, Proceedings of the 1996 Japan/USA Symposium on Flexible Automation, 2, 767, 1996.
5. Ceglarek, D. and Shi, J., Design evaluation of sheet metal joints for dimensional integrity, ASME Transactions, Journal of Manufacturing Science and Engineering, 120, 452, 1998.
6. Shiu, B.W., Ceglarek, D., and Shi, J., Multi-station sheet metal assembly modeling and diagnostics, NAMRI/SME Transactions, XXIV, 199, 1996.
7. Camelio, A.J., Hu, S.J., and Ceglarek, D.J., Modeling variation propagation of multistation assembly systems with compliant parts, Transactions of the ASME, Journal of Mechanical Design, 125, 673, 2003.
8. Ding, Y., Ceglarek, D., and Shi, J., Modeling and diagnosis of multistage manufacturing processes, Proceedings of the 2000 Japan/USA Symposium on Flexible Automation, July 23–26, Ann Arbor, MI, 2000JUSFA-13146.
9. Ding, Y., Modeling and Analysis of Stream-of-Variation in Multistage Manufacturing Processes, Ph.D. dissertation, The University of Michigan, Ann Arbor, MI, 2001.
10. Jin, J. and Shi, J., State space modeling of sheet metal assembly for dimensional control, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 756, 1999.
7
State Space Modeling for Machining Processes*
7.1 INTRODUCTION

Chapter 6 presented the variation propagation modeling for multistage assembly processes. In this chapter, stream of variation (SoV) modeling for multistage machining processes will be investigated.

7.1.1 INTRODUCTION TO MACHINING PROCESSES AND DIMENSIONAL VARIATION SOURCES

Machining is a very important manufacturing process that removes material from a workpiece to obtain the design dimensional accuracy and a better surface finish, or to obtain complicated surface forms that cannot be obtained by other processes. The basic steps of machining are illustrated in Figure 7.1.

The first step in a machining process is to set up the workpiece in a fixture system, as illustrated in Figure 7.1b. A fixture is a device that holds a workpiece in position in a particular setup and provides a means to reference and align the cutting tool to the workpiece. Proper location of the workpiece is essential to ensure accuracy and repeatability of the machining process. Figure 7.2 depicts the most popular 3-2-1 fixture system. Figure 7.2a presents the 6 degrees of freedom (d.o.f.’s) of a workpiece in space (3 translational and 3 rotational d.o.f.’s). By requiring the workpiece to touch the 6 locators, as illustrated by Figure 7.2b, the workpiece is fully constrained, and the required positional and orientational relationship between the workpiece and the locators, and hence the machine tool, can be attained. A feature of the workpiece is defined as a physically identifiable portion of the workpiece, such as a plane, hole, slot, or chamfer. The features that touch the fixture locators are called datums or datum features in the machining literature. For example, the bottom plane of the cube in Figure 7.2b is the primary datum.

The second step in a machining process is to clamp the workpiece on the fixture system and then mount the fixture system on the machine working table (for some dedicated machining systems, the fixture is an integrated part of the machine). One point worth mentioning is that the purpose of clamping is to fasten the workpiece to the fixture, and it does not change the location of the workpiece in space.

In the last step, the cutting tool cuts the workpiece that is located on the fixture. It is worth noting that the cutting-tool path is programmed with respect to the working table.
* Part of chapter material is based on Reference 22 (pp. 296–309).
FIGURE 7.1 Basic steps in machining. (a) Workpiece. (b) Fixturing system. (c) Machining operation.

FIGURE 7.2 Illustration of a fixture system. (a) Degrees of freedom of a workpiece. (b) 3-2-1 fixture system, with locators {1, 2, 3}, locators {4, 5}, and locator {6}.
A machining operation often produces multiple features on a workpiece. The dimensional accuracy of the workpiece is determined by the geometrical shape of the features and the relative positions and orientations of different features. Because of the complexity of the machining operation, the dimensional accuracy of the final product of a machining process is affected by many interrelated factors. To understand the factors that influence dimensional accuracy, the relationships among the key elements of a machining process are illustrated in Figure 7.3. From Figure 7.3, it is clear that if the cutting-tool path is at the nominal location of the geometric feature with respect to the workpiece, then the dimensions of the resulting feature will be perfect and there will be no error. However, because it is unrealistic to measure precisely the location of each workpiece after it is mounted in the fixture, it is very difficult, if not impossible, to program the cutting-tool path with respect to the workpiece location. Therefore, in most cases it is programmed with respect to the working table of the machine. Clearly, the errors of the fixture location with respect to the working table (called fixture error) and the errors of the workpiece location with respect to the fixture (called datum error) will result in dimensional error of the features in this case. These two types of errors are often referred to as setup errors in machining processes. Besides the setup error, machine error is another important error source: the true cutting path is different from the programmed path because of various errors in the machine itself, such as geometric and dimensional errors of the kinematic links of the machine, thermal errors, wear of the cutting tool, etc. The deviation of the cutting-tool path will certainly cause dimensional errors in the workpiece. These errors are defined as machine errors in this book. Some error sources will induce both machine error and setup error. For example, the cutting force will cause deformation in both machine and fixture. Therefore, cutting-force-induced error is a combined error source. These aforementioned errors involved in machining operations often cause large variations in product dimensional quality and, thus, they are also called variation sources in this book. We will use “process fault,” “process error,” and “variation source” interchangeably in this book.

FIGURE 7.3 The key elements and their relationships in machining. (Elements shown: cutting tool, workpiece, fixture, machine, table.)

Instead of classifying the error sources in machining processes into setup error, machine error, and combined error, these errors are also classified as quasi-static errors and dynamic errors [1]. Quasi-static errors are the static or slow-varying errors between the cutting tool and the workpiece. They include geometric and kinematic errors, thermal errors, cutting-force-induced errors, tool-wear-induced error, fixturing error, etc. Quasi-static errors account for about 70% of overall errors. Dynamic errors are fast-changing errors, caused by sources such as spindle error, machine structure vibration, controller error, etc. They are more dependent on the particular operating conditions of the machine.
7.1.2 MODELING OF VARIATION PROPAGATION IN MULTISTAGE MACHINING PROCESSES

The analysis and compensation of a specific type of error on a single-stage operation have been intensively studied. A huge body of literature can be found on the modeling and compensation of machine geometric errors [2–6], thermal errors [7–10], fixture-induced errors [11–14], and force-induced errors [15–17]. However, the research on building a comprehensive model that includes multiple error sources and multiple operation stages is limited. For a multistage machining process, the product variation at a certain stage consists of two components: the variation brought by the volumetric error of the current machine stage and the variation brought by the datum feature error introduced in previous stages. The second component exists because we have to use features produced by previous stages as the reference in the current operation. The variation from previous stages will be accumulated onto the current operation. The variation propagation can be illustrated in a simple two-step machining example, as shown in Figure 7.4. The workpiece is a cube of metal (the front view is shown). Surface C of the workpiece is milled in the first step (Figure 7.4a). Because of an error in the milling operation, surface C is not perpendicular to surface D. A hole is drilled on surface D using surface C as one of the datum features (Figure 7.4b) in the second step. The location and orientation of the workpiece are determined by both the fixture and the datum features. Clearly, the resulting hole is not perpendicular to surface D (Figure 7.4c). The geometric error of the hole is not caused by the drilling operation; rather, it is caused by the milling operation in the first step.

FIGURE 7.4 A two-step machining example. (Panels (a), (b), and (c) show surfaces C and D of the workpiece and the datum error.)

From this example, we have the following key observations:
• The dimensional variation propagation and the interactions among multiple machining stages exist in machining processes in general.
• The key reason for variation propagation in machining processes is that the previously machined features are used as the datum in the subsequent machining operations.
From the preceding observations, we will develop a quantitative state space model to describe the variation propagation and the relationship between process error sources and product dimensional quality in multistage machining processes. It is unrealistic to include all the error sources in the model, because of the complexity of a machining process. Instead, we will focus on modeling the following error sources, which mainly account for the variation propagation:

1. Datum error refers to the dimensional and geometric error of datum features. It causes deviation of the workpiece in the fixture. The effect of datum error can be obtained through a kinematics analysis based on product/process design information, such as datum feature selection, the nominal locations of datum features and new features, and fixture layout information. Datum error captures the variation propagated from previous stages, because datum features are introduced at previous stages.
2. Fixture error is caused by the errors in a fixture system, such as setup errors, wear of locating pins, clamping errors, etc. Similar to datum-induced error, the contribution of this error can be obtained by a kinematics analysis based on the product/process design information.
3. Machine error is the most complicated contributor. It involves multiple error sources, and the impact of some of those sources, such as geometric error and thermal error, can only be identified empirically. Fortunately, many of these error sources have been well studied, and effective analysis and compensation schemes have been developed [1,18]. Because the focus of the modeling is on variation propagation, the detailed modeling of certain machine errors is not included. Instead, the machine error in the model is represented as a deviation of the tool path from its nominal path. The mapping of certain machining error components to the tool path deviation can be done but is not included in this chapter for the sake of simplicity.

In this chapter, a quantitative state space model will be developed to capture the combined impact of the process error sources and the dimensional variation propagation among different stages. The formulation of this model is further introduced in the following subsection.
7.1.3 MODEL FORMULATION

The most challenging issue in variation analysis and reduction for complicated multistage manufacturing processes is the modeling of variation accumulation and propagation along different stages. To attack this problem, a quantitative state space model is proposed (Figure 7.5). For an N-stage process, the state space model is

\[
\mathbf{x}_k = \mathbf{A}_{k-1}\mathbf{x}_{k-1} + \mathbf{B}_k\mathbf{u}_k + \mathbf{w}_k \quad \text{and} \quad \mathbf{y}_k = \mathbf{C}_k\mathbf{x}_k + \mathbf{v}_k
\tag{7.1}
\]

where k = 1, 2, …, N, is the stage index. The key quality characteristics of the product (e.g., dimensional deviations of key features) after each stage are represented by the state vector $\mathbf{x}_k$. The process faults (e.g., fixturing errors and tooling errors) are denoted by $\mathbf{u}_k$. Natural variation and unmodeled errors in the process are represented by a noise input to the system, $\mathbf{w}_k$. The product quality measurements are collected in $\mathbf{y}_k$. Output $\mathbf{y}_k$ is not necessarily available at every stage. The measurement noise is denoted by vector $\mathbf{v}_k$. The coefficient matrices $\mathbf{A}_k$, $\mathbf{B}_k$, and $\mathbf{C}_k$ are determined by the system layout and specific characteristics of the process. Particularly, $\mathbf{A}_k$ represents the impact of datum error, i.e., the propagated variation source, on the product quality; $\mathbf{B}_k$ represents the impact of the local variation sources, such as fixture error, on the product quality; and $\mathbf{C}_k$ is the measurement matrix that maps $\mathbf{x}_k$ to $\mathbf{y}_k$.

FIGURE 7.5 Diagram of a complicated multistage manufacturing process. (Stages 1 to N are connected in series; stage k takes $\mathbf{x}_{k-1}$, $\mathbf{u}_k$, and $\mathbf{w}_k$ as inputs and produces $\mathbf{x}_k$; the measurements $\mathbf{y}_k$ and $\mathbf{y}_N$ are affected by the noises $\mathbf{v}_k$ and $\mathbf{v}_N$.)
Complicated variation propagation is handled automatically in this model through the state transition. To construct this model, we only need to study the local relationship among $\mathbf{x}_{k-1}$, $\mathbf{x}_k$, and $\mathbf{u}_k$ at each individual stage k. Hence, the large body of knowledge about single-stage operations can be readily reused in the model construction. Furthermore, because of its chainlike structure, the state space model is very flexible; we can easily pick out and study any critical segment of the process. Another advantage of this model is its linear structure. Although the relationships between the variation sources and the key product quality characteristics are in general nonlinear, those relationships can often be linearized around a nominal working condition using a Taylor series approximation. The linear state space model in Equation 7.1 can significantly reduce the complexity of the subsequent analysis and synthesis. The derivation of this model through the kinematic analysis of the machine-cutting operation is presented in Section 7.2. An experimental validation of the developed model is included in Section 7.3. The last section concludes this chapter and provides some coverage of applications of this model.
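The chainlike recursion of Equation 7.1 can be sketched in a few lines. The example below is a toy illustration, not a model of any real process: the two-dimensional state, all matrix entries, and the fault vectors are hypothetical, and the function name `simulate` and the dictionary layout are our own choices. Each stage entry stores A (meaning A_{k-1}), B, u, and an optional C, since the output is not necessarily available at every stage.

```python
import random


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(items) for items in zip(*vectors)]


def simulate(x0, stages, w_std=0.0, v_std=0.0, seed=0):
    """Run x_k = A_{k-1} x_{k-1} + B_k u_k + w_k and y_k = C_k x_k + v_k.

    w_k and v_k are drawn as independent zero-mean Gaussian noise
    (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    x = list(x0)
    states, outputs = [], {}
    for k, stage in enumerate(stages, start=1):
        x_next = add(matvec(stage["A"], x), matvec(stage["B"], stage["u"]))
        x = [xi + rng.gauss(0.0, w_std) for xi in x_next]
        states.append(x)
        if stage.get("C") is not None:      # measurement taken at this stage
            y = matvec(stage["C"], x)
            outputs[k] = [yi + rng.gauss(0.0, v_std) for yi in y]
    return states, outputs


identity = [[1.0, 0.0], [0.0, 1.0]]
stages = [
    # Stage 1: local fault u_1 acts on the first feature; no measurement.
    {"A": identity, "B": identity, "u": [0.1, 0.0], "C": None},
    # Stage 2: half of the first feature's deviation propagates to the second
    # feature (datum effect); only the second feature is measured.
    {"A": [[1.0, 0.0], [0.5, 1.0]], "B": identity, "u": [0.0, 0.2], "C": [[0.0, 1.0]]},
]

states, outputs = simulate([0.0, 0.0], stages)   # noise-free run
print(states)
print(outputs)
```

In the noise-free run, the measured deviation at stage 2 is the sum of the local fault at stage 2 and the part of the stage 1 fault that propagates through A, which is the accumulation mechanism the model is meant to capture.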
7.2 DERIVATION OF VARIATION PROPAGATION MODEL

7.2.1 BASICS OF KINEMATIC ANALYSIS OF MACHINING OPERATIONS

Homogeneous transformation (HT) is commonly used in kinematic analysis. In this subsection, we briefly introduce the basic properties of HT that are useful for the analysis of machining operations. In space, the location of a point is represented by a 3 × 1 vector consisting of the three coordinates of the point in a coordinate system. As shown in Figure 7.6, the location of point p can be represented by its coordinates in coordinate system (CS) 1 as $v_{p,1} = [x_{p,1}\ y_{p,1}\ z_{p,1}]^T$, where the subscript 1 indicates that the coordinates are w.r.t. CS 1. The point p can also be represented by its coordinates in CS 2 as $v_{p,2}$. The transformation between $v_{p,1}$ and $v_{p,2}$ is given by

$$\begin{bmatrix} v_{p,2} \\ 1 \end{bmatrix} = H_1^2 \cdot \begin{bmatrix} v_{p,1} \\ 1 \end{bmatrix} \tag{7.2}$$

where

$$H_1^2 = \begin{bmatrix} R_1^2 & t_1^2 \\ 0 & 1 \end{bmatrix}$$

is a 4 × 4 homogeneous transformation matrix from CS 2 to CS 1. The translational vector $t_1^2$ contains the coordinates of the origin of CS 1 in CS 2. The rotation matrix $R_1^2$ is a 3 × 3 matrix that captures the orientational difference between CS 2 and CS 1. The rotation matrix is an orthogonal matrix and, thus, $R_1^2$ has only three independent parameters. Three Euler angles ($\phi$, $\theta$, and $\psi$) are often used to represent these three independent parameters. The orientation of CS 1 can be viewed as rotating CS 2 around $z_2$ by an angle of $\phi$, then rotating the new coordinate system around the new $y_2$ axis by $\theta$, and finally rotating the new coordinate system around the new $x_2$ axis by $\psi$. Given the three Euler angles, the rotation matrix is

$$R_1^2 = \begin{bmatrix}
\cos\phi\cos\theta\cos\psi - \sin\phi\sin\psi & -\cos\phi\cos\theta\sin\psi - \sin\phi\cos\psi & \cos\phi\sin\theta \\
\sin\phi\cos\theta\cos\psi + \cos\phi\sin\psi & -\sin\phi\cos\theta\sin\psi + \cos\phi\cos\psi & \sin\phi\sin\theta \\
-\sin\theta\cos\psi & \sin\theta\sin\psi & \cos\theta
\end{bmatrix} \tag{7.3}$$

FIGURE 7.6 Coordinate transformation through homogeneous transformation.
Although the expression of the rotation matrix looks complicated, $R_1^2$ is an orthogonal matrix; hence $(R_1^2)^{-1} = (R_1^2)^T$ and the determinant of $R_1^2$ is 1. Because of this property of $R_1^2$, we have

$$(H_1^2)^{-1} = \begin{bmatrix} (R_1^2)^T & -(R_1^2)^T t_1^2 \\ 0 & 1 \end{bmatrix} \tag{7.4}$$

It is also noteworthy that $(H_1^2)^{-1}$ is actually $H_2^1$, where

$$\begin{bmatrix} v_{p,1} \\ 1 \end{bmatrix} = H_2^1 \cdot \begin{bmatrix} v_{p,2} \\ 1 \end{bmatrix} \tag{7.5}$$
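The relationships in Equation 7.2, Equation 7.4, and Equation 7.5 can be checked numerically with a short Python sketch. The rotation (30 degrees about z), the offset, and the helper names `make_ht` and `invert_ht` are assumptions made for this illustration only.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def make_ht(R, t):
    """Assemble the 4x4 matrix [[R, t], [0, 1]] from a 3x3 R and a 3-vector t."""
    return [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def invert_ht(H):
    """Closed-form inverse of Equation 7.4: [[R^T, -R^T t], [0, 1]]."""
    Rt = [[H[j][i] for j in range(3)] for i in range(3)]
    t = [H[i][3] for i in range(3)]
    minus_Rt_t = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return make_ht(Rt, minus_Rt_t)


# Hypothetical example: CS 1 is rotated 30 degrees about z and offset in CS 2.
a = math.radians(30.0)
R21 = [[math.cos(a), -math.sin(a), 0.0],
       [math.sin(a), math.cos(a), 0.0],
       [0.0, 0.0, 1.0]]
H21 = make_ht(R21, [10.0, -5.0, 2.0])   # maps CS 1 coordinates to CS 2
H12 = invert_ht(H21)                    # maps CS 2 coordinates back to CS 1

p1 = [[1.0], [2.0], [3.0], [1.0]]       # point p in CS 1, homogeneous column
p2 = matmul(H21, p1)                    # the same point in CS 2 (Equation 7.2)
p1_back = matmul(H12, p2)               # Equation 7.5 recovers the CS 1 coordinates
identity_check = matmul(H21, H12)       # should be the 4x4 identity
```

The closed-form inverse avoids a general 4 × 4 matrix inversion, which is why only a transpose and one matrix-vector product appear in `invert_ht`.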
If the rotational deviation between CS 1 and CS 2 is small, the expression of $R_1^2$ can be simplified as in Reference 19 to

$$R_1^2 \approx \begin{bmatrix} 1 & -\phi & \theta \\ \phi & 1 & -\psi \\ -\theta & \psi & 1 \end{bmatrix} \tag{7.6}$$
The concept of the differential motion vector [19] is used to describe the small deviation between CS 2 and CS 1:

$$x_1^2 = \begin{bmatrix} t_1^2 \\ \omega_1^2 \end{bmatrix} \tag{7.7}$$

where $t_1^2 = [t_{1x}^2\ t_{1y}^2\ t_{1z}^2]^T$ and $\omega_1^2 = [\phi\ \theta\ \psi]^T$. In other words, the differential motion vector from CS 2 to CS 1 is a stack of vectors consisting of the independent parameters of $H_1^2$ when the deviations between CS 2 and CS 1 are small. We have the identity

$$H(x_1^2) = \begin{bmatrix} 1 & -\phi & \theta & t_{1x}^2 \\ \phi & 1 & -\psi & t_{1y}^2 \\ -\theta & \psi & 1 & t_{1z}^2 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{7.8}$$
Another important concept in describing small deviations between coordinate systems is the differential transformation matrix (DTM), which is often denoted as $\Delta_1^2$. The matrix $\Delta_1^2$ comes from the identity $H_1^2 \approx I_{4\times 4} + \Delta_1^2$, and thus the DTM can be written as

$$\Delta_1^2 = \begin{bmatrix} 0 & -\phi & \theta & t_{1x}^2 \\ \phi & 0 & -\psi & t_{1y}^2 \\ -\theta & \psi & 0 & t_{1z}^2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{7.9}$$
If the skew-symmetric matrix associated with a vector $p = [p_1\ p_2\ p_3]^T$ is defined as

$$\hat{p} = \begin{bmatrix} 0 & -p_3 & p_2 \\ p_3 & 0 & -p_1 \\ -p_2 & p_1 & 0 \end{bmatrix},$$

then the DTM $\Delta(x_1^2)$ can be written as

$$\Delta(x_1^2) = \begin{bmatrix} \hat{\omega}_1^2 & t_1^2 \\ 0 & 0 \end{bmatrix}.$$

One very useful fact about the DTM is

$$\Delta_1^2 = -\Delta_2^1 \tag{7.10}$$

where $\Delta_1^2$ and $\Delta_2^1$ are the differential transformation matrices associated with $H_1^2$ and $H_2^1$, respectively. Because the differential motion vector is heavily used in the derivation of the variation propagation model, we now prove some of its properties.

FIGURE 7.7 Transition of differential motion vector.

Corollary 7.1: Consider the coordinate systems CS 0, CS 1, CS 1′, CS 2, CS 2′, and CS 2″ shown in Figure 7.7. Given $H_1^0$, $H_2^0$, $H_{2''}^{1'}$, the differential motion vector $x_{1'}^1$ that describes the deviation between CS 1 and CS 1′, and $x_{2'}^{2''}$ that describes the deviation between CS 2″ and CS 2′, we have the following: if $H_1^0 \cdot H_{2''}^{1'} = H_2^0$, then

$$x_{2'}^2 = \begin{bmatrix} (R_{2''}^{1'})^T & -(R_{2''}^{1'})^T \cdot \hat{t}_{2''}^{1'} & I_{3\times 3} & 0 \\ 0 & (R_{2''}^{1'})^T & 0 & I_{3\times 3} \end{bmatrix} \begin{bmatrix} x_{1'}^1 \\ x_{2'}^{2''} \end{bmatrix} \tag{7.11}$$
Proof: Note that $H_{2'}^0 = H_1^0 \cdot H(x_{1'}^1) \cdot H_{2''}^{1'} \cdot H(x_{2'}^{2''})$ and $H_{2'}^0 = H_2^0 \cdot H(x_{2'}^2)$; we have $H_1^0 \cdot H(x_{1'}^1) \cdot H_{2''}^{1'} \cdot H(x_{2'}^{2''}) = H_2^0 \cdot H(x_{2'}^2)$. Substituting $H_1^0 \cdot H_{2''}^{1'} = H_2^0$ into this equation, we have

$$H(x_{1'}^1) \cdot H_{2''}^{1'} \cdot H(x_{2'}^{2''}) = H_{2''}^{1'} \cdot H(x_{2'}^2) \tag{7.12}$$

In Equation 7.12, only $x_{2'}^2$ is unknown. Writing each $H(x)$ as $I_{4\times 4} + \Delta(x)$ and ignoring the second-order small values gives $\Delta(x_{2'}^2) \approx (H_{2''}^{1'})^{-1} \cdot \Delta(x_{1'}^1) \cdot H_{2''}^{1'} + \Delta(x_{2'}^{2''})$, from which Equation 7.11 can be obtained through a straightforward derivation.

Corollary 7.2: Consider the coordinate systems CS 0, CS 1, CS 1′, CS 2, CS 2′, and CS 2″ shown in Figure 7.7. Given $H_1^0$, $H_2^0$, $H_{2''}^{1'}$, the differential motion vector $x_{1'}^1$ that describes the deviation between CS 1 and CS 1′, and $x_{2'}^2$ that describes the deviation between CS 2 and CS 2′, we have the following: if $H_1^0 \cdot H_{2''}^{1'} = H_2^0$, then

$$x_{2'}^{2''} = \begin{bmatrix} -(R_{2''}^{1'})^T & (R_{2''}^{1'})^T \cdot \hat{t}_{2''}^{1'} & I_{3\times 3} & 0 \\ 0 & -(R_{2''}^{1'})^T & 0 & I_{3\times 3} \end{bmatrix} \begin{bmatrix} x_{1'}^1 \\ x_{2'}^2 \end{bmatrix} \tag{7.13}$$

Proof: The proof is very similar to the proof of Corollary 7.1. Again, Equation 7.12 is used. In this case, only $x_{2'}^{2''}$ is unknown. From Equation 7.12, Equation 7.13 can be obtained through a straightforward derivation.

These two corollaries are very useful when we switch the reference coordinates in the derivation of the variation propagation model. Equipped with this basic knowledge, we can move on to the detailed model derivation.
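As a numerical sanity check of Corollary 7.1, the following Python sketch compares the first-order formula of Equation 7.11 with the value obtained by solving Equation 7.12 directly. It assumes that a differential motion vector is stored as three translations followed by a generic small rotation vector $[\omega_x\ \omega_y\ \omega_z]$ whose skew-symmetric matrix forms the rotational block of the DTM; the nominal transformation and all deviation values are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def hat(w):
    """Skew-symmetric matrix associated with a 3-vector."""
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]


def ht_of(x):
    """First-order HT, I + Delta(x), for x = [d (3 entries), w (3 entries)]."""
    W = hat(x[3:])
    rows = [[W[i][j] + (1.0 if i == j else 0.0) for j in range(3)] + [x[i]]
            for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def invert_ht(H):
    """Inverse of a rigid HT as in Equation 7.4."""
    Rt = [[H[j][i] for j in range(3)] for i in range(3)]
    mt = [-sum(Rt[i][j] * H[j][3] for j in range(3)) for i in range(3)]
    return [Rt[i] + [mt[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def corollary_7_1(R, t, x1, x2pp):
    """Equation 7.11: x = [[R^T, -R^T t^], [0, R^T]] x1 + x2pp."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_cross_w = mat_vec(hat(t), x1[3:])
    d = mat_vec(Rt, [x1[i] - t_cross_w[i] for i in range(3)])
    w = mat_vec(Rt, x1[3:])
    return [a + b for a, b in zip(d + w, x2pp)]


# Hypothetical nominal transformation: 90 degrees about z plus an offset.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t = [100.0, 50.0, 20.0]
H_nom = [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

x1 = [0.01, -0.02, 0.005, 1e-4, -2e-4, 1.5e-4]    # small deviation of CS 1'
x2pp = [0.003, 0.0, -0.004, 5e-5, 0.0, -1e-4]     # small deviation of CS 2'

x_formula = corollary_7_1(R, t, x1, x2pp)

# Solve Equation 7.12 for H(x) and read the six parameters back out.
lhs = matmul(matmul(ht_of(x1), H_nom), ht_of(x2pp))
Hx = matmul(invert_ht(H_nom), lhs)
x_direct = [Hx[0][3], Hx[1][3], Hx[2][3], Hx[2][1], Hx[0][2], Hx[1][0]]
```

The two results differ only by second-order terms, which are products of two small deviations and are therefore several orders of magnitude smaller than the deviations themselves.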
7.2.2 REPRESENTATION OF WORKPIECE GEOMETRIC DEVIATION
To regulate the dimensional deviations of features of a workpiece, standards were developed for geometric dimensioning and tolerancing (ISO 1101 [1983] or ANSI Y14.5 [1982]). However, these conventional geometric tolerances originated from gauging practice. They are not suited to the operational principle of the coordinate measurement machine (CMM), which now serves as the standard measurement equipment for machining processes. In addition, the representation of workpiece features in conventional geometric tolerances does not conform to the part representations used in CAD/CAM systems. Recently, a new vectorial dimensioning and tolerancing (VD&T) strategy was proposed and has drawn significant attention [20]. The principle of VD&T is based on the concept of substitute elements or substitute features. A substitute feature is an imaginary, geometrically ideal feature (e.g., plane, circle, line) whose location, orientation, and size (if applicable) are calculated from the measurement data points of a workpiece surface. Roughly speaking, a substitute feature is an ideal feature fitted to the measurement data of the corresponding true feature. Substitute features are represented by a location vector, an orientation vector, and sizes. The location vector indicates the location of a specified point of the substitute feature. The orientation vector is a unit vector that is normal to the substitute plane or parallel to the substitute axis (cylinder, cone, etc.). The size is available only for some features; for example, the diameter is the size of a circular hole. The VD&T workpiece feature representation follows the operational principle of CMM and CAD/CAM systems. The measurement data from a CMM can be analyzed and compared with the design model directly, and the difference between the true feature and the design requirement can be fed back to the manufacturing process directly. In this chapter, we adopt a feature representation that is consistent with VD&T.
A location vector and a vector that consists of three rotating Euler angles are used to represent a workpiece feature. Because the size of a feature is usually formed at one machining stage, it is not considered in the model derivation. The part representation is illustrated in Figure 7.8. CS 0 is the reference coordinate system. CS 1 is attached to the nominal cylinder feature. The position and orientation of the nominal cylinder can be represented by $H_1^0$. After the machining operation, the true location and orientation of the cylinder feature are represented by CS 1′, which deviates from the nominal. The deviation is captured by the differential motion vector $x_{1'}^1$.

FIGURE 7.8 Illustration of feature representation.

To describe the accuracy of a machined workpiece, we need to study the relationships among different features. These relationships can be described by the relationships of the corresponding coordinate systems attached to the local features. If CS i is attached to the nominal feature i and CS i′ is attached to the true feature i, the deviation of the feature is described by the differential motion vector

$$x_{i'}^i = \begin{bmatrix} t_{i'}^i \\ \omega_{i'}^i \end{bmatrix}.$$

In summary, the relative position and orientation of a feature is described by a homogeneous transformation matrix. The deviation of a feature from its nominal value is represented by a differential motion vector. The rationale of this representation is that it conforms to the working principles of CMM and CAD/CAM models. Moreover, various mathematical tools are available for analyzing this representation.
7.2.3 SINGLE-STAGE MODELING OF DIMENSIONAL VARIATION
Based on the differential motion vector representation, we can set up a state space model to describe the variation propagation in a multistage machining process. As discussed in the previous section, various coordinate systems are attached to the workpiece, fixture, and cutting tool to represent the position and orientation of the workpiece, the fixture, and the newly generated features. The definitions of the coordinate systems involved in the derivation are illustrated in Figure 7.9 and listed as follows:

• CS 0 is the reference part coordinate system in this analysis. The primary datum feature is often selected as the reference for convenience.
• CS 1 and CS 1′ are the nominal and true fixture coordinate systems, respectively. Because of the datum error, the true fixture coordinate system deviates from the nominal fixture coordinate system w.r.t. CS 0. The nominal fixture coordinate system w.r.t. CS 0 is represented by CS 1″ in Figure 7.9.
• CS 2 and CS 2′ are the nominal and true newly generated feature coordinate systems, respectively. Similarly, the nominal feature coordinate system w.r.t. CS 0 is represented by CS 2″.

FIGURE 7.9 Coordinate systems involved in the model derivation.
TABLE 7.1 Transformations Among Coordinate Systems

| From → To | Representation and Explanation |
|---|---|
| CS 0 → CS 1″ | $H_{1''}^0$: the designed (nominal) value of the fixture w.r.t. CS 0; known from product/process design |
| CS 1″ → CS 1′ | $x_{1'}^{1''}$: caused by the datum errors |
| CS 1′ → CS 1 | $x_1^{1'}$: caused by the fixture locator errors |
| CS 1 → CS 2 | $H_2^1$: the designed (nominal) value representing the cutting-tool path (also the newly generated feature) w.r.t. CS 1; known from product/process design |
| CS 2 → CS 2′ | $x_{2'}^2$: caused by machine errors |
| CS 1″ → CS 2″ | $H_{2''}^{1''}$: the designed (nominal) value representing the cutting-tool path (also the newly generated feature) w.r.t. CS 1″; known from product/process design. It is worth noticing that $H_{2''}^{1''}$ equals $H_2^1$ |
| CS 2″ → CS 2′ | $x_{2'}^{2''}$: the final overall deviation of the newly generated feature from its nominal location and orientation w.r.t. the part coordinate system CS 0 |
The transformations and the root causes of the deviations among these coordinate systems are listed in Table 7.1. From Table 7.1, it is clear that the deviations $x_{1'}^{1''}$ caused by datum errors, $x_1^{1'}$ caused by fixture errors, and $x_{2'}^2$ caused by machine errors need to be identified. The combination of those three errors gives the final deviation of the newly generated feature at a single stage.

7.2.3.1 Analysis of Datum-Induced Error

The most commonly used fixture scheme in practice is the 3-2-1 fixturing scheme. A general 3-2-1 layout is shown in Figure 7.10. Surface ABCD defines the primary datum plane, which constrains two rotational motions and one translational motion. ADHE is the secondary datum plane, which constrains one rotational and one translational motion. CDHG is the tertiary datum plane, which constrains the last translational motion. The points F1, F2, and F3 are the perpendicular projection points of the locators P1, P2, and P3 on the primary datum. The fixture coordinate system (CS 1′) is shown in Figure 7.10 as $O_F X_F Y_F Z_F$. F1F2 is the y-axis of CS 1′, the line passing through F3 and perpendicular to F1F2 is the x-axis of CS 1′, and the z-axis is perpendicular to the primary datum plane. The coordinate system attached to the primary datum plane (surface ABCD) is taken as CS 0.

FIGURE 7.10 The 3-2-1 fixture setup.

The deviations of the secondary datum plane and the tertiary datum plane are represented as $x_{d2'}^{d2}$ and $x_{d3'}^{d3}$, respectively. The symbols “di” and “di′” represent the coordinate systems attached to the ith nominal and true datum features, respectively. Given $x_{d2'}^{d2}$ and $x_{d3'}^{d3}$, the deviation of CS 1′ w.r.t. CS 1″, $x_{1'}^{1''}$, can be obtained by Equation 7.14:

$$x_{1'}^{1''} = T_1 x_{d2'}^{d2} + T_2 x_{d3'}^{d3} \tag{7.14}$$
where $T_1$ and $T_2$ are determined by the nominal positions of the secondary and tertiary datum features and the positions of the fixture locators. The values of $T_1$ and $T_2$ can be obtained by the following general procedure. Let $p_1$ and $p_2$ be the datum points touching the secondary datum and $p_3$ be the datum point touching the tertiary datum. The nominal coordinates of these three points in CS 1′ are denoted as $p_{1,1'}$, $p_{2,1'}$, and $p_{3,1'}$. Denoting $\tilde{p} = [p^T\ 1]^T$, we have

$$\begin{aligned}
H_0^{d2'} H_{1'}^0\, \tilde{p}_{1,1'} &= \tilde{p}_{1,d2'} \\
H_0^{d2'} H_{1'}^0\, \tilde{p}_{2,1'} &= \tilde{p}_{2,d2'} \\
H_0^{d3'} H_{1'}^0\, \tilde{p}_{3,1'} &= \tilde{p}_{3,d3'}
\end{aligned} \tag{7.15}$$

To guarantee that these points touch the corresponding part surface, the z coordinates of $\tilde{p}_{1,d2'}$, $\tilde{p}_{2,d2'}$, and $\tilde{p}_{3,d3'}$ should be zero (the z direction is defined as the normal direction of the surface). Consider the first entry in Equation 7.15. Noting that $H_0^{d2'} = (H_{d2'}^0)^{-1} = (H_{d2}^0 \cdot H_{d2'}^{d2})^{-1} = (H_{d2'}^{d2})^{-1} \cdot H_0^{d2} = (\Delta_{d2}^{d2'} + I) \cdot H_0^{d2}$ and $H_{1'}^0 = H_{1''}^0 \cdot (\Delta_{1'}^{1''} + I)$, the left-hand side changes to

$$(\Delta_{d2}^{d2'} + I) \cdot H_0^{d2} \cdot H_{1''}^0 \cdot (\Delta_{1'}^{1''} + I)\, \tilde{p}_{1,1'} \approx (\Delta_{d2}^{d2'} \cdot H_{1''}^{d2} + H_{1''}^{d2} \cdot \Delta_{1'}^{1''} + H_{1''}^{d2})\, \tilde{p}_{1,1'} \tag{7.16}$$

The third element of $\tilde{p}_{1,d2'}$ is zero in order to guarantee touching; therefore,

$$[(\Delta_{d2}^{d2'} \cdot H_{1''}^{d2} + H_{1''}^{d2} \cdot \Delta_{1'}^{1''} + H_{1''}^{d2})\, \tilde{p}_{1,1'}]_{(3)} = 0 \tag{7.17}$$

where the subscript “(3)” means the third element of $[\cdot]$. Under nominal conditions, $p_1$, $p_2$, and $p_3$ should touch the datum plane. Hence, because $\tilde{p}_{1,1'}$ is the nominal coordinate of the locator in CS 1′, $[H_{1''}^{d2} \cdot \tilde{p}_{1,1'}]_{(3)} = [H_{1''}^{d2} \cdot \tilde{p}_{1,1''}]_{(3)} = 0$. Using $\Delta_{d2}^{d2'} = -\Delta_{d2'}^{d2}$ (Equation 7.10), Equation 7.17 changes to

$$[\Delta_{d2'}^{d2} \cdot H_{1''}^{d2} \cdot \tilde{p}_{1,1'}]_{(3)} = [H_{1''}^{d2} \cdot \Delta_{1'}^{1''} \cdot \tilde{p}_{1,1'}]_{(3)} \tag{7.18}$$
Similarly, we can get the other two equations for $p_2$ and $p_3$. In Equation 7.18, $H_{1''}^{d2}$ and $\tilde{p}_{1,1'}$ are the nominal values that are known from the design specifications, and $\Delta_{d2'}^{d2}$ contains the datum errors $x_{d2'}^{d2}$. The DTM $\Delta_{1'}^{1''}$ contains the differential motion vector $x_{1'}^{1''}$. Although there are six parameters in $x_{1'}^{1''}$, only three of them are unknown. By solving the equation system consisting of Equation 7.18 and the corresponding equations for $p_2$ and $p_3$, the relationship between $x_{1'}^{1''}$, $x_{d2'}^{d2}$, and $x_{d3'}^{d3}$ can be obtained and put in the form of Equation 7.14 by rearranging the terms. The expression of $T_1$ and $T_2$ for a general 3-2-1 setup is very complicated. However, if the datum features are orthogonal to each other, which is very common in practice, $T_1$ and $T_2$ can be significantly simplified. For example, given the nominal positions for the secondary and tertiary datums and locating pins in Figure 7.10 as

$$H_{1''}^{d2} = \begin{bmatrix} 0 & 0 & 1 & t_{1''x}^{d2} \\ 0 & 1 & 0 & t_{1''y}^{d2} \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
H_{1''}^{d3} = \begin{bmatrix} 1 & 0 & 0 & t_{1''x}^{d3} \\ 0 & 0 & 1 & t_{1''y}^{d3} \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

$p_{1,1'} = [0\ p_{1y}\ p_{1z}]^T$, $p_{2,1'} = [0\ p_{2y}\ p_{2z}]^T$, and $p_{3,1'} = [p_{3x}\ 0\ p_{3z}]^T$, the relationships can be obtained as Equation 7.14, where

$$T_1 = \begin{bmatrix}
0 & 0 & -1 & -t_{1''x}^{d2} & p_{2z} + t_{1''x}^{d3} + \dfrac{p_{2y}(p_{2z} - p_{1z})}{p_{1y} - p_{2y}} & 0 \\
0 & 0 & 0 & -p_{3x} & \dfrac{p_{3x}(p_{1z} - p_{2z})}{p_{1y} - p_{2y}} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -\dfrac{p_{1z} - p_{2z}}{p_{1y} - p_{2y}} & 0
\end{bmatrix},$$

$$T_2 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & -p_{3z} - t_{1''y}^{d3} & p_{3x} + t_{1''x}^{d3} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{7.19}$$
A very common variation of the general 3-2-1 fixture setup is shown in Figure 7.11.

FIGURE 7.11 Another 3-2-1 fixture setup.
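The six degrees of freedom of the workpiece are constrained by the plane ABCD (two rotational motions and one translational motion), a short circular hole P1 and P3 (two translational motions), and a slot P2 (one rotational motion). Given the nominal positions for the secondary and tertiary datum features and locating pins in Figure 7.11 as

$$H_{1''}^{d2} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
H_{1''}^{d3} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

$p_{1,1'} = [0\ 0\ 0]^T$, $p_{2,1'} = [0\ p_{2y}\ 0]^T$, and $p_{3,1'} = [0\ 0\ 0]^T$, the $T_1$ and $T_2$ matrices in Equation 7.14 can be obtained as

$$T_1 = \begin{bmatrix}
0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0
\end{bmatrix}, \quad
T_2 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{7.20}$$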
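To make Equation 7.14 concrete for the setup of Figure 7.11, the following Python sketch applies the $T_1$ and $T_2$ of Equation 7.20 to a pair of datum deviations. It assumes that every differential motion vector is stored with three translations followed by three rotations, as in Equation 7.7; the numerical datum errors and the function name are hypothetical.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def zeros(n_rows, n_cols):
    return [[0.0] * n_cols for _ in range(n_rows)]


# T1 and T2 of Equation 7.20 (1-based nonzero entries noted in comments).
T1 = zeros(6, 6)
T1[0][2] = -1.0   # entry (1, 3)
T1[5][3] = 1.0    # entry (6, 4)
T2 = zeros(6, 6)
T2[1][2] = -1.0   # entry (2, 3)


def datum_induced_error(x_d2, x_d3):
    """Equation 7.14: datum-induced fixture deviation from the two datum errors."""
    return [a + b for a, b in zip(mat_vec(T1, x_d2), mat_vec(T2, x_d3))]


# Hypothetical datum errors: the secondary datum is offset by 0.05 along its
# normal and has a 0.001 value in its 4th component; the tertiary datum is
# offset by 0.02 along its normal.
x_d2 = [0.0, 0.0, 0.05, 0.001, 0.0, 0.0]
x_d3 = [0.0, 0.0, 0.02, 0.0, 0.0, 0.0]
x_datum = datum_induced_error(x_d2, x_d3)
```

Only the first, second, and sixth components of the result can be nonzero, which reflects the statement that three of the six parameters are fixed by the primary datum.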
The procedure presented in this subsection can be used to study the datum-induced error for a general 3-2-1 fixture setup. Following this derivation method, similar results can be obtained for other fixture setups.

7.2.3.2 Analysis of Fixture Errors

In a general 3-2-1 fixture scheme as shown in Figure 7.10, the workpiece position is located by six locators (P1, P2, P3, L1, L2, and L3). Assume the nominal coordinates of these six points in the nominal fixture coordinate system CS 1 are $(0, p_{1y}, p_{1z})$, $(0, p_{2y}, p_{2z})$, $(p_{3x}, 0, p_{3z})$, $(L_{1x}, L_{1y}, 0)$, $(L_{2x}, L_{2y}, 0)$, and $(L_{3x}, L_{3y}, 0)$, respectively. (Note that we assume that the z coordinates of P1 and P2 are the same to simplify the problem.) If there are small deviations on these six locators, $(\Delta p_{ix}, \Delta p_{iy}, \Delta p_{iz})$ and $(\Delta L_{ix}, \Delta L_{iy}, \Delta L_{iz})$, where i = 1, 2, 3, the true fixture coordinate system CS 1′ will deviate from its nominal CS 1. Cai et al. [21] gave an analytical infinitesimal error analysis for a rigid body locating scheme with six general locating points. The fixture error can be derived based on their results. In Figure 7.10, the surface normal vector for L1, L2, and L3 is (0, 0, 1); for P1 and P2, (-1, 0, 0); and for P3, (0, -1, 0). With the locators’ position vectors, Equation 3.1.6 in Cai et al. [21] can be applied to obtain

$$x_{1'}^1 = T_3 \cdot [\Delta L_{1z}\ \ \Delta L_{2z}\ \ \Delta L_{3z}\ \ \Delta p_{1x}\ \ \Delta p_{2x}\ \ \Delta p_{3y}]^T \tag{7.21}$$

where

$$T_3 = \begin{bmatrix}
\dfrac{(L_{2y} - L_{3y})p_{1z}}{C} & \dfrac{(L_{3y} - L_{1y})p_{1z}}{C} & \dfrac{(L_{1y} - L_{2y})p_{1z}}{C} & \dfrac{-p_{2y}}{p_{1y} - p_{2y}} & \dfrac{p_{1y}}{p_{1y} - p_{2y}} & 0 \\
\dfrac{(L_{3x} - L_{2x})p_{3z}}{C} & \dfrac{(L_{1x} - L_{3x})p_{3z}}{C} & \dfrac{(L_{2x} - L_{1x})p_{3z}}{C} & \dfrac{p_{3x}}{p_{1y} - p_{2y}} & \dfrac{-p_{3x}}{p_{1y} - p_{2y}} & 1 \\
\dfrac{L_{3y}L_{2x} - L_{2y}L_{3x}}{C} & \dfrac{L_{3x}L_{1y} - L_{3y}L_{1x}}{C} & \dfrac{L_{2y}L_{1x} - L_{1y}L_{2x}}{C} & 0 & 0 & 0 \\
\dfrac{L_{2x} - L_{3x}}{C} & \dfrac{L_{3x} - L_{1x}}{C} & \dfrac{L_{1x} - L_{2x}}{C} & 0 & 0 & 0 \\
\dfrac{L_{2y} - L_{3y}}{C} & \dfrac{L_{3y} - L_{1y}}{C} & \dfrac{L_{1y} - L_{2y}}{C} & 0 & 0 & 0 \\
0 & 0 & 0 & \dfrac{1}{p_{1y} - p_{2y}} & \dfrac{-1}{p_{1y} - p_{2y}} & 0
\end{bmatrix}$$

and $C = L_{3x}L_{1y} - L_{1y}L_{2x} + L_{3y}L_{2x} + L_{2y}L_{1x} - L_{2y}L_{3x} - L_{3y}L_{1x}$. We need to know $x_1^{1'}$ as well. From the properties of the DTM,

$$x_1^{1'} = -x_{1'}^1 = -T_3 \cdot [\Delta L_{1z}\ \ \Delta L_{2z}\ \ \Delta L_{3z}\ \ \Delta p_{1x}\ \ \Delta p_{2x}\ \ \Delta p_{3y}]^T \tag{7.22}$$
For the fixture setup shown in Figure 7.11, the nominal coordinates of P1, P2, and P3 in CS 1 are (0, 0, 0), $(0, p_{2y}, 0)$, and (0, 0, 0), respectively. Substituting these values into the expression of $T_3$, we obtain the $T_3$ matrix for the particular setup shown in Figure 7.11 as

$$T_3 = \begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
\dfrac{L_{3y}L_{2x} - L_{2y}L_{3x}}{C} & \dfrac{L_{3x}L_{1y} - L_{3y}L_{1x}}{C} & \dfrac{L_{2y}L_{1x} - L_{1y}L_{2x}}{C} & 0 & 0 & 0 \\
\dfrac{L_{2x} - L_{3x}}{C} & \dfrac{L_{3x} - L_{1x}}{C} & \dfrac{L_{1x} - L_{2x}}{C} & 0 & 0 & 0 \\
\dfrac{L_{2y} - L_{3y}}{C} & \dfrac{L_{3y} - L_{1y}}{C} & \dfrac{L_{1y} - L_{2y}}{C} & 0 & 0 & 0 \\
0 & 0 & 0 & -\dfrac{1}{p_{2y}} & \dfrac{1}{p_{2y}} & 0
\end{bmatrix}$$
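The fixture error calculation of Equation 7.21 and Equation 7.22 can be sketched in Python as follows. The function builds $T_3$ from the entries given for the general 3-2-1 layout; the locator coordinates, the deviation values, and the function and argument names are hypothetical and serve only to illustrate the calculation for a layout like that of Figure 7.11.

```python
def t3_matrix(L, p1y, p1z, p2y, p3x, p3z):
    """T3 for a 3-2-1 layout; L = [(L1x, L1y), (L2x, L2y), (L3x, L3y)]."""
    (L1x, L1y), (L2x, L2y), (L3x, L3y) = L
    C = (L3x * L1y - L1y * L2x + L3y * L2x
         + L2y * L1x - L2y * L3x - L3y * L1x)
    dp = p1y - p2y
    return [
        [(L2y - L3y) * p1z / C, (L3y - L1y) * p1z / C, (L1y - L2y) * p1z / C,
         -p2y / dp, p1y / dp, 0.0],
        [(L3x - L2x) * p3z / C, (L1x - L3x) * p3z / C, (L2x - L1x) * p3z / C,
         p3x / dp, -p3x / dp, 1.0],
        [(L3y * L2x - L2y * L3x) / C, (L3x * L1y - L3y * L1x) / C,
         (L2y * L1x - L1y * L2x) / C, 0.0, 0.0, 0.0],
        [(L2x - L3x) / C, (L3x - L1x) / C, (L1x - L2x) / C, 0.0, 0.0, 0.0],
        [(L2y - L3y) / C, (L3y - L1y) / C, (L1y - L2y) / C, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0 / dp, -1.0 / dp, 0.0],
    ]


def fixture_error(T3, deviations):
    """Return (x of Equation 7.21, x of Equation 7.22) for the locator deviations
    [dL1z, dL2z, dL3z, dp1x, dp2x, dp3y]."""
    x_true_wrt_nominal = [sum(t * d for t, d in zip(row, deviations))
                          for row in T3]
    x_nominal_wrt_true = [-v for v in x_true_wrt_nominal]
    return x_true_wrt_nominal, x_nominal_wrt_true


# Hypothetical layout in the style of Figure 7.11: P1 and P3 at the origin,
# P2 at (0, 200, 0), and three primary locators on the z = 0 plane.
T3 = t3_matrix([(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)],
               p1y=0.0, p1z=0.0, p2y=200.0, p3x=0.0, p3z=0.0)

# Case 1: all three primary locators are 0.1 too high.
lift, lift_neg = fixture_error(T3, [0.1, 0.1, 0.1, 0.0, 0.0, 0.0])

# Case 2: only pin P1 deviates by 0.02 in x.
pin, pin_neg = fixture_error(T3, [0.0, 0.0, 0.0, 0.02, 0.0, 0.0])
```

In the first case the fixture coordinate system moves purely along z, and in the second case the x offset of one pin produces both an x translation and a small rotation in the sixth component, as expected from the last row of $T_3$.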
The fixture error analysis procedure for a general 3-2-1 fixture scheme is presented in this subsection. For other fixture systems, the analysis can follow the same procedure.

7.2.3.3 Identify the Overall Dimensional Error by Combining Error Sources Together

In previous sections, the expressions of $x_{1'}^{1''}$ due to datum errors and $x_1^{1'}$ due to fixture errors are obtained. To obtain the final overall feature deviation $x_{2'}^{2''}$, these two errors need to be combined with $x_{2'}^2$, which is the machining error of the newly generated feature. This can be achieved by applying Corollary 7.1 twice, as shown in Figure 7.12a and Figure 7.12b.

FIGURE 7.12 Derivation of $x_{2'}^{2'''}$ and $x_{2'}^{2''}$: (a) calculation of $x_{2'}^{2'''}$; (b) calculation of $x_{2'}^{2''}$.

Corollary 7.1 can be directly applied to Figure 7.12a to obtain $x_{2'}^{2'''}$ by knowing $x_1^{1'}$, $H_2^1$, $x_{2'}^2$, and $H_{2'''}^{1'}$, where CS 2‴ is the nominal location of the cutting-tool path w.r.t. CS 1′ and $H_{2'''}^{1'}$ is taken to be the same as $H_2^1$. In this case, we can take $H_2^1$ as $H_{2''}^{1'}$, $x_1^{1'}$ as $x_{1'}^1$, and $x_{2'}^2$ as $x_{2'}^{2''}$ in Corollary 7.1 to identify $x_{2'}^{2'''}$ as

$$x_{2'}^{2'''} = \begin{bmatrix} (R_2^1)^T & -(R_2^1)^T \cdot \hat{t}_2^1 \\ 0 & (R_2^1)^T \end{bmatrix} \cdot x_1^{1'} + x_{2'}^2 \tag{7.23}$$

Corollary 7.1 can be further applied to Figure 7.12b to obtain $x_{2'}^{2''}$ by knowing $x_{1'}^{1''}$, $H_{2''}^{1''}$ (which equals $H_2^1$), $x_{2'}^{2'''}$, and $H_{2'''}^{1'}$ (which also equals $H_2^1$). In this case, we can take $H_2^1$ as $H_{2''}^{1'}$, $x_{1'}^{1''}$ as $x_{1'}^1$, and $x_{2'}^{2'''}$ as $x_{2'}^{2''}$ in Corollary 7.1 to identify $x_{2'}^{2''}$ as

$$x_{2'}^{2''} = \begin{bmatrix} (R_2^1)^T & -(R_2^1)^T \cdot \hat{t}_2^1 \\ 0 & (R_2^1)^T \end{bmatrix} \cdot x_{1'}^{1''} + x_{2'}^{2'''} \tag{7.24}$$

The final expression for the overall feature deviation $x_{2'}^{2''}$ can be obtained by substituting Equation 7.23 into Equation 7.24 to get

$$x_{2'}^{2''} = \begin{bmatrix} (R_2^1)^T & -(R_2^1)^T \cdot \hat{t}_2^1 \\ 0 & (R_2^1)^T \end{bmatrix} \cdot x_{1'}^{1''} + \begin{bmatrix} (R_2^1)^T & -(R_2^1)^T \cdot \hat{t}_2^1 \\ 0 & (R_2^1)^T \end{bmatrix} \cdot x_1^{1'} + x_{2'}^2 \tag{7.25}$$

After the overall deviations of the newly generated features are obtained, the modeling of variation propagation can be achieved by linking multiple stages together. The steps are presented in the following subsection.
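The combination in Equation 7.25 can be illustrated with a short Python sketch. It assumes that each differential motion vector holds three translations followed by a generic small rotation vector, so that the block $-(R_2^1)^T \hat{t}_2^1$ acts on the rotational part; the nominal tool-path pose and the error values are hypothetical.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def cross(a, b):
    """Cross product a x b, i.e., hat(a) applied to b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def transfer(R, t, x):
    """Apply [[R^T, -R^T t^], [0, R^T]] to x = [d (3 entries), w (3 entries)]."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    d, w = x[:3], x[3:]
    t_cross_w = cross(t, w)
    return mat_vec(Rt, [d[i] - t_cross_w[i] for i in range(3)]) + mat_vec(Rt, w)


def overall_deviation(R12, t12, x_datum, x_fixture, x_machine):
    """Equation 7.25: combine datum-induced, fixture, and machining errors."""
    a = transfer(R12, t12, x_datum)
    b = transfer(R12, t12, x_fixture)
    return [a[i] + b[i] + x_machine[i] for i in range(6)]


# Hypothetical nominal tool path: same orientation as the fixture CS,
# located 100 units above it along z.
R12 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t12 = [0.0, 0.0, 100.0]

x_datum = [0.0, 0.0, 0.0, 0.0, 0.001, 0.0]     # small datum-induced rotation
x_fixture = [0.02, 0.0, 0.0, 0.0, 0.0, 0.0]    # small fixture-induced shift
x_machine = [0.0] * 6                          # no machining error

x_overall = overall_deviation(R12, t12, x_datum, x_fixture, x_machine)
```

The example shows the lever-arm effect carried by the $\hat{t}_2^1$ block: a rotation of 0.001 acting over an offset of 100 contributes 0.1 to the translational deviation of the new feature, five times the contribution of the 0.02 fixture shift.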
FIGURE 7.13 Procedures for variation propagation model development. (Steps from $x_{k-1}$ to $x_k$: S1, extract information on datum and relocate the part; S2, gather information on fixture error; S3, calculate the datum error and the fixture error; S4, calculate the overall deviation of the newly generated feature; S5, assemble the newly generated feature with other features.)
7.2.3.4 Modeling Variation Propagation in Multistage Machining Processes

Figure 7.13 shows the steps of the modeling of variation propagation in multiple stages. The vector $x_k$ is the collection of dimensional deviations of all key features on the product. The deviation of each feature is represented by a differential motion vector. If a feature has not been generated after the kth stage, the corresponding components of that feature in $x_k$ are set to zero; after it has been generated, the zero components are replaced by nonzero deviations. The detailed procedure is presented as follows.

In step S1, we reorient the part on the current stage and collect information on the datum deviation. This is an important step because, in the single-stage analysis, we always take the primary datum feature as the part coordinate system (CS 0), and the feature deviations are obtained w.r.t. the primary datum feature. Clearly, if the same primary datum as that in the previous stage is used in the current stage, no extra transformations need be made to the feature deviations obtained from previous stages. However, if the primary datum changes to a different feature in the current stage, a transformation is needed to change the feature deviations into the new CS 0. This can be done using Corollary 7.2. The transformation is illustrated in Figure 7.7. In Figure 7.7, if we treat CS 0 as the primary datum at the previous stage, CS 1′ as the primary datum at the current stage, and CS 2′ as a feature, then the deviation of CS 2′ w.r.t. CS 1′ ($x_{2'}^{2''}$) can be obtained using Equation 7.13 by knowing $H_{2''}^{1'}$ and the deviations of CS 1′ and CS 2′ w.r.t. CS 0 (i.e., $x_{1'}^1$ and $x_{2'}^2$). Corollary 7.2 can be applied to all features from the previous stage to transform the deviations into the new part coordinate system. After the transformation is done, the deviations of the secondary and the tertiary datum can be collected. In other words, $x_{d2'}^{d2}$ and $x_{d3'}^{d3}$ can be obtained. This is an important step in building the state transition model; it transforms the coordinates of $x_{k-1}$ from the previous stage to the current stage.

In step S2, the fixture imperfections are aggregated as a vector $u_{f,k}$. In step S3, the datum-induced error $x_{1'}^{1''}$ can then be obtained by using Equation 7.14 and the fixture error $x_1^{1'}$ by using Equation 7.22. The matrices $T_1$ and $T_2$ in Equation 7.14 contribute to the expression of $A_{k-1}$, and $T_3$ in Equation 7.22 contributes to the expression of $B_k$. Following the derivation in Subsection 7.2.3.3, in step S4, the overall deviation of the newly generated feature $x_{2'}^{2''}$ can be obtained by combining $x_{1'}^{1''}$, $x_1^{1'}$, and $x_{2'}^2$.
This procedure can be applied to all newly generated features to get their predicted deviations. It may be noted that, for different newly generated features, $x_{1'}^{1''}$ and $x_1^{1'}$ are kept unchanged, whereas $x_{2'}^2$ may change from feature to feature. By substituting Equation 7.14 and Equation 7.22 into Equation 7.25, we can obtain the parts of $A_{k-1}$ and $B_k$ corresponding to the newly generated feature.

The last step, S5, is to assemble the deviations of the newly generated features together with the unchanged features to form $x_k$ as the output state vector of the current stage. Further, the calculations in steps S1 through S4 can also be assembled together to put the state transition in the form of Equation 7.1. The equation formation is quite straightforward; however, the final expression is fairly tedious. The detailed expression is not given here; the interested reader can refer to Zhou et al. [22].

After the state equation is obtained, the observation equation in the state space model is easy to obtain. We can simply use a selection matrix as $C_k$ to select the measured features as the output $y_k$. If the reference coordinate system changes at the measurement station, a transformation similar to the S1 transformation is needed in the observation equation. The preceding model describes the deviation propagation in a multistage machining process. An experimental validation of this model is presented in Section 7.3.
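The selection matrix mentioned above can be sketched in Python as follows. The sketch assumes that $x_k$ stacks the six-component differential motion vectors of the features in a fixed feature order; that layout, the feature indices, and the function name are assumptions made for illustration.

```python
def selection_matrix(measured, n_features, dof=6):
    """Build C_k whose rows pick the `dof` components of each measured feature.

    `measured` lists zero-based feature indices; the state is assumed to hold
    `dof` consecutive entries per feature.
    """
    n = n_features * dof
    rows = []
    for f in measured:
        for j in range(dof):
            row = [0.0] * n
            row[f * dof + j] = 1.0
            rows.append(row)
    return rows


def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Hypothetical state with three features; features 0 and 2 are measured.
x_k = [float(i) for i in range(18)]
C_k = selection_matrix([0, 2], n_features=3)
y_k = mat_vec(C_k, x_k)   # noise-free observation y_k = C_k x_k
```

If the measurement station uses a different reference coordinate system, the rows of `C_k` would no longer be pure selections, because the S1-type transformation mixes the components of several features.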
7.3 MODEL VALIDATION

The deviation propagation model has been validated on an experimental multistage machining process.

7.3.1 INTRODUCTION TO THE EXPERIMENTAL MACHINING PROCESS

The product is a V-6 automotive engine head. Its key datum features are shown in Figure 7.14.

FIGURE 7.14 The workpiece: a V-6 engine head. (a) Joint face of the engine head; (b) cover face of the engine head.
TABLE 7.2 Description of the Operations

| Operation # | Datum (Primary + Secondary + Tertiary) | Operation Description |
|---|---|---|
| 1 | (X1, X2, X3) + (Y1, Y2) + Z | Mill cover face M |
| 2 | M + (Y1, Y2) + Z | Mill joint face A, drill hole B and C |
| 3 | A + B + C | Mill slot S |
TABLE 7.3 Nominal Positions and Orientations of Key Features w.r.t. the Reference Coordinate System

| Feature Number | Feature Name | Euler Angles | Translational Vector |
|---|---|---|---|
| 1 | Surface D | [0, 0, 0] | [133.7, 134.2, 0] |
| 2 | Joint face A | [0, 0, 0] | [0, 0, 2.5] |
| 3 | Cover face M | [0, π, 0] | [0, 0, -117.03] |
| 4 | Hole B | [0, 0, 0] | [0, 0, 2.5] |
| 5 | Hole C | [0, 0, 0] | [0, 306, 2.5] |
| 6 | Slot | [0, π, 0] | [0, 0, -100.9] |
For the sake of simplicity, we only consider three operations in this process. Each operation and its corresponding datum setup are described in Table 7.2. In the modeling of variation propagation, we focus on six features. Their nominal positions and orientations are listed in Table 7.3. These numbers are w.r.t. a reference coordinate system. The origin of the reference coordinate system is the intersection point between the center line of hole B and the plane determined by the rough datum X1, X2, and X3. The y-axis is the line connecting the centers of hole B and hole C, the z-axis coincides with the center line of hole B and points from the cover face to the joint face, and the x-axis is determined by the right-hand rule. It is straightforward to obtain the nominal relationship between any two features from the design parameters listed in Table 7.3. To develop the state space model, the nominal locations of the fixture locators are also needed. Those values are listed in Table 7.4. Note that the nominal fixture locations are given in the coordinate systems attached to the corresponding touching feature (or datum feature).

TABLE 7.4 Nominal Locator Positions w.r.t. the Touching Feature

| Operation | Datum | Locator Positions |
|---|---|---|
| #1 | Primary | [-4.6, -92.2, 0], [-4.6, 128.8, 0], [-169, 38.3, 0] |
| #1 | Secondary | [0, -92.2, -2], [0, 128.8, -2] |
| #1 | Tertiary | [0, 0, 0] |
| #2 | Primary | [42.43, 74, 0], [42.43, 274.5, 0], [-148.07, 157.5, 0] |
| #2 | Secondary | [0, -92.2, -2], [0, 128.8, -2] |
| #2 | Tertiary | [0, 0, 0] |
| #3 | Primary | [10, 95.25, 0], [-95.25, 0, 0], [-95.25, 317.5, 0] |
| #3 | Secondary | [0, 0, 0], [0, 0, 0] |
| #3 | Tertiary | [0, 0, 0] |

Touching Feature Number entries listed for the datums above: D, M, D, A, C, B, B.

With this known information, a state space model with three stages can be established to link the process root causes such as datum, fixture, and machine errors with the product quality measurements as follows:

$$x_k = A_{k-1} x_{k-1} + B_k u_k + w_k, \quad k = 1, 2, 3 \tag{7.26}$$

$$y_k = C_k x_k + v_k \tag{7.27}$$

The details of the system matrices are listed in the appendix.
7.3.2 COMPARISON BETWEEN THE REAL MEASUREMENT AND THE MODEL PREDICTION

To validate the model, a workpiece is machined in the test bed with multistage machining operations. In the machining, a fixture error is intentionally added to the process at each stage. The inputs to the model that correspond to the fixture error are $u_{f,1} = [0\ 0\ {-0.2}\ 0\ 0\ 0]^T$, $u_{f,2} = [{-0.39}\ 0\ 0\ 0\ 0\ 0]^T$, and $u_{f,3} = [{-0.39}\ 0\ 0\ 0\ 0\ 0]^T$. The nonzero values in $u_{f,k}$ correspond to the mean shift magnitudes of the fixture errors. All the inputs corresponding to the machining error $u_{m,k}$ are set to zero. The overall inputs to the model are an aggregation of $u_{f,k}$ and $u_{m,k}$, i.e., $u_k = [u_{f,k}^T\ \ u_{m,k}^T]^T$.

The workpiece is measured on a CMM after it is machined. The measurement results from the CMM are listed in Table 7.5. For a plane feature, the CMM measurement gives the coordinates of a point on the surface and the normal direction vector of the surface. For a cylinder feature (hole B and hole C), the CMM measurement gives the coordinates of a point on the cylinder axis, the direction vector of the axis, and the diameter. The diameters of the holes are not listed in the table, because they are not used in this study. All the measurements are w.r.t. the global CMM coordinate system. These data can be easily transformed to the same coordinate system used in the state space model.

TABLE 7.5 CMM Measurement Results

| Feature | x | y | z | i | j | k |
|---|---|---|---|---|---|---|
| Rough datum | -361.661 | 75.708 | 27.553 | -1 | -0.005 | -0.003 |
| Joint face | -364.264 | 103.454 | 35.086 | 1 | 0.005 | 0.005 |
| Hole B | -357.147 | 154.084 | -132.65 | 1 | 0.005 | 0.005 |
| Hole C | -358.721 | 153.795 | 173.319 | 1 | 0.003 | 0.005 |
| Cover face | -244.887 | 103.429 | 45.558 | 1 | 0.006 | 0.003 |
| Slot | -261.027 | 99.433 | 31.59 | 1 | 0.008 | 0.005 |

Note: Unit: mm for x, y, and z.

In the state space model, the coefficient matrices A, B, and C and the input to the model u are known. These coefficient matrices are listed in the appendix of this chapter. If we neglect the model noise and measurement noise and set the initial state vector $x_0$ to zero, we can iteratively calculate the state vectors at the following stations using the state space model. In this way, the measurement values can be predicted. The comparison between the real measurement and the model prediction is listed in Table 7.6. For a plane feature, the relevant parameters are its orientation and its distance from the datum plane. The deviation of the orientation of a plane is denoted by $[\phi, \theta, \psi]$, and the deviation of the distance from the datum plane is denoted by dz. The value of dz is measured at the origin of the local coordinate system (LCS) of the plane w.r.t. the datum plane. For the cylinder features (hole B and hole C), only the deviation in orientation is considered.
TABLE 7.6 Comparison between the Measurement and Model Prediction

| Feature | Measurement φ | Measurement θ | Measurement ψ | Measurement dz | Model φ | Model θ | Model ψ | Model dz |
|---|---|---|---|---|---|---|---|---|
| Cover face w.r.t. rough datum | 0.0000 | 0.0010 | 0.0000 | –0.113 | 0.0000 | 0.0012 | 0 | –0.157 |
| Joint face w.r.t. cover face | 0.0020 | –0.0010 | 0.0000 | –0.508 | 0.0019 | –0.0012 | 0 | –0.483 |
| Hole B w.r.t. cover face | 0.0020 | –0.0010 | 0.0000 | n/a | 0.0019 | –0.0012 | 0 | n/a |
| Hole C w.r.t. cover face | 0.0020 | –0.0030 | 0.0000 | n/a | 0.0019 | –0.0012 | 0 | n/a |
| Slot w.r.t. joint face | 0.0000 | 0.0030 | 0.0000 | –0.370 | 0 | 0.0037 | 0 | –0.353 |

Note: Units are rad for φ, θ, and ψ; mm for dz.
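The prediction procedure described in this subsection (zero initial state, noise terms neglected, stage-by-stage recursion) can be sketched in a few lines of Python. This is only a minimal illustration: the 2 × 2 matrices below are hypothetical stand-ins for the 36-dimensional system matrices of the appendix, the helper names `mat_vec`, `vec_add`, and `predict` are not from the book, and the inputs reuse the fixture mean shifts of this case study as scalars purely for flavor.

```python
# Toy illustration only: the case study uses the system matrices from the
# appendix; every matrix below is a hypothetical placeholder.

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def vec_add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

def predict(A, B, C, u, x0):
    """Noise-free recursion x_k = A_{k-1} x_{k-1} + B_k u_k, y_k = C_k x_k.

    A[k-1] holds A_{k-1}; B[k-1], C[k-1], and u[k-1] hold the stage-k terms.
    Returns the predicted measurements [y_1, ..., y_N].
    """
    x = list(x0)
    predictions = []
    for A_prev, B_k, C_k, u_k in zip(A, B, C, u):
        x = vec_add(mat_vec(A_prev, x), mat_vec(B_k, u_k))
        predictions.append(mat_vec(C_k, x))
    return predictions

I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [I2, I2, I2]                                       # hypothetical A_0, A_1, A_2
B = [[[1.0], [0.0]], [[0.0], [1.0]], [[0.5], [0.5]]]   # hypothetical B_1, B_2, B_3
C = [I2, I2, I2]                                       # hypothetical C_1, C_2, C_3
u = [[-0.2], [-0.39], [-0.39]]                         # scalar stand-ins for u_1, u_2, u_3

y = predict(A, B, C, u, x0=[0.0, 0.0])
for k, y_k in enumerate(y, start=1):
    print(f"stage {k}: y = {y_k}")
```

With the real matrices, the same loop would be run with the 36-dimensional state and the aggregated inputs $u_k = [u_{f,k}^T\ u_{m,k}^T]^T$; a linear algebra library would then be the more natural choice than nested lists.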
From the data presented in Table 7.6, the discrepancies between the model prediction and the real measurement for the joint face, hole B, and the slot are reasonably small. Because only the fixture errors are taken as inputs to the model in this case study, these discrepancies are attributable to other machining errors, such as machine geometric error, thermal error, and force-induced error, that are neglected in the model input. The small discrepancies for these three features show that the fixture error is dominant in this machining process. Relatively large discrepancies occur at the cover face and hole C. The large difference at the cover face is understandable because its reference datum feature is the rough datum, which itself contains large natural variation. The large difference at hole C indicates a large geometric error in the machine. Because the joint face, hole B, and hole C are machined at the same time under the same setup, they would have the same orientation, as the model predicts, if only fixture errors were present. A difference in orientation among hole C, hole B, and the joint face indicates that the cutting-tool orientation differs between machine configurations, which is considered a geometric error. It is also worth noting that the deviation patterns (i.e., the sign of each deviation and their relative magnitudes) are correctly predicted by the model. This is important because root cause identification can be conducted by considering the error patterns in the measurement. The application of this model to root cause identification will be discussed in Chapter 9 to Chapter 11.
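The comparison discussed above can be reproduced with simple arithmetic on the entries of Table 7.6. The following sketch computes measurement minus model for θ and dz and checks that the signs agree; the values are transcribed from the table, while the variable names and the rounding to four decimals are choices made here for illustration.

```python
# Values transcribed from Table 7.6 (rad for theta, mm for dz).
# None marks entries that the table leaves blank for the holes.
features = ["cover face", "joint face", "hole B", "hole C", "slot"]
theta_meas = [0.0010, -0.0010, -0.0010, -0.0030, 0.0030]
theta_model = [0.0012, -0.0012, -0.0012, -0.0012, 0.0037]
dz_meas = [-0.113, -0.508, None, None, -0.370]
dz_model = [-0.157, -0.483, None, None, -0.353]

def diff(meas, model):
    """Measurement minus model prediction, or None when either is missing."""
    if meas is None or model is None:
        return None
    return round(meas - model, 4)

def same_sign(meas, model):
    """True when both values are missing or share the same sign."""
    if meas is None or model is None:
        return True
    return (meas > 0) == (model > 0) and (meas < 0) == (model < 0)

theta_err = {f: diff(m, p) for f, m, p in zip(features, theta_meas, theta_model)}
dz_err = {f: diff(m, p) for f, m, p in zip(features, dz_meas, dz_model)}
signs_agree = all(
    same_sign(m, p)
    for m, p in list(zip(theta_meas, theta_model)) + list(zip(dz_meas, dz_model))
)

for f in features:
    print(f"{f:10s}  theta error = {theta_err[f]}  dz error = {dz_err[f]}")
print("signs agree:", signs_agree)
```

In this tabulation the largest θ discrepancy falls on hole C and the largest dz discrepancy on the cover face, which matches the two features singled out in the discussion.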
7.4 SUMMARY

The complexity of a multistage machining process makes it very difficult to understand and model the variation and its propagation. In this chapter, an analytical linear model is developed to describe the propagation of workpiece geometric deviation among multiple machining stages. Using the state transitions among multiple machining stages, this model describes the geometric error accumulation and transformation as the workpiece passes through the whole process. A systematic procedure is presented to model the workpiece setup and machining process. The model has potential applications in several areas of variation reduction, such as variation analysis of multistage machining processes, variation source identification, sensor placement optimization, and fixture design and optimization. All these potential applications are presented in the following chapters of this book.
7.5 EXERCISES

In Problem 1 to Problem 3, the following conditions are common. In Figure 7.15, the nominal dimensions of the 2-D workpiece are given in the drawing and expressed in millimeters. The CS0 is defined in Figure 7.15 as (Oxy), and the nominal PCS is defined as being the CS0.
FIGURE 7.15 The nominal dimensions of the 2-D workpiece (for Problem 1 to Problem 3). (Drawing labels: axes x and y with origin O; features f1 to f5; points A, B, and C; dimensions 10, 18, 10, and 15.)
1. Assume the raw material is a square with dimensions 20 mm × 20 mm. The workpiece is mounted in a 2-1 locating system (note that you may consider this system as a special 3-2-1 locating system with perfect L1, L2, and L3). The nominal positions of the locators in the CS0 are given in the following table. Also assume that feature f1 is used as the primary datum and feature f5 is used as the secondary datum. The goal in this machining stage is to mill the top surface (feature f4).

| Locating Point | (x,y) |
|---|---|
| P1 | (0,5) |
| P2 | (0,13) |
| P3 | (8,0) |

a. What is the workpiece’s degree of freedom?
If locator P1 deviates from its nominal position in the y direction by 0.1 mm:
b. What is the corresponding deviation of the CS1?
c. What is the corresponding deviation of point A on feature f4 in the (x,y)-axis after the machining operation is performed?
d. What is the corresponding orientation deviation of feature f4 after the machining operation is performed?
2. Rework Problem 1 questions (b), (c), and (d) if locator P2 deviates from its nominal position in the y direction by 0.1 mm.
3. We want to evaluate a different locating scheme. The new positions of the locating points are given in the following table:

| Locating Point | (x,y) |
|---|---|
| P1 | (0,5) |
| P2 | (0,7) |
| P3 | (8,0) |
a. Rework (1) for the new locating scheme.
b. Compare with your results obtained in (1); which fixture design would you recommend?
4. Assume the raw material is a rectangle with dimensions 20 mm × 18 mm. At the current stage, a milling operation will be performed on the raw material to obtain features f3 and f2. Suppose that we want to evaluate two fixture designs and that, at this stage, feature f4 has the deviation found in Problem 1c and 1d. Please answer the following questions:

| Locating Point | Fixture design 1 (x,y) | Fixture design 2 (x,y) |
|---|---|---|
| P1 | (0,15) | (0,15) |
| P2 | (0,12) | (0,12) |
| P3 | (5,18) | (13,0) |

a. Given that both sets of locators are at their nominal locations, determine the deviation in the (x,y) axis of point C in the middle of feature f3 and point B on feature f2. What can you conclude?
b. Given that locator L3 has a deviation of 0.07 mm in the y direction, determine the deviation in the (x,y) axis of point C in the middle of feature f3 and point B on feature f2. What can you conclude?
c. By comparing the results, which fixture design would you recommend?
5. The nominal dimensions of the 2-D workpiece are given in Figure 7.16 and expressed in millimeters. The CS0 is defined in the figure as (Oxy), and the nominal CS1 is defined as being the CS0. The raw material is shown in solid lines and the feature to be created is shown in dashed lines in Figure 7.16. Suppose that we want to evaluate two fixture designs. The nominal positions of the locators are given in the following table. Suppose that feature f1 is used as the primary datum and feature f2 is used as the secondary datum in the first design, and that f1, f3, and f2 are all used as datums in the second design (note that you may revise the T3 in Equation 7.21 and Equation 7.22 to deal with the locating scheme used in this problem).

FIGURE 7.16 The nominal dimensions of the 2-D workpiece (for Problem 5). (Drawing labels: axes x and y with origin O; features f1 to f4; point A; dimensions 18, 10, 15, and 10.)

| Locating Point | Fixture design 1 (x,y) | Fixture design 2 (x,y) |
|---|---|---|
| P1 | (3,0) | (3,0) |
| P2 | (8,0) | (12.5,4) |
| P3 | (0,10) | (0,10) |

a. What is the workpiece’s degree of freedom for each fixture design?
b. What is the sensitivity of the orientation of feature f4 (point A) to a deviation of locator P1 for both fixture designs?
c. What is the sensitivity of the orientation of feature f4 (point A) to a deviation of locator P2 for both fixture designs?
d. By comparing the results obtained in (b) and (c), which design would you recommend?
7.6 APPENDIX: THE SYSTEM MATRICES USED IN EQUATION 7.27

The system matrices for the state space model used in model validation are as follows:

$$A_0 = \begin{bmatrix} 0_{6\times 6} & 0_{6\times 30} \\ 0_{30\times 6} & I_{30\times 30} \end{bmatrix},$$
012×6
B1 =
0.2865
0.4132
0.5295
−0.5295
0.8685
−0.6538
−0.0045
0.0045
0.0025
0.0036
0
0
−0.6997
012×6 1.1900
−0.1900
0
0
0.6050
−0.6050
−1
0.7853
0
0
0
0
0
0
0
0
0
0
−00.0061 0
0.0045 018×6
−0.0045
I 6× 6
0 018×6
,
1
0
0
0
−117.03
−134.2 2
0
−1
0
−117.03
0
133.7
0
0
1
134.2
133.70
0
0
0
0
1
0
0
0
0
0
0
−1
0
0 6× 6
0
0
0
0
0
1
I 6× 6
1
0
0
0
−115.03
0
0 6× 6
0
−1
0
−117.03
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
I 6×6 1
0
0
0
−2
134.2
0
1
0
0
0
−133.7
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0 6× 6
A1 =
018×18
0 6× 6 1
0
0
0
−2
134.2
1
0
0
0
−115.03
0
0
1
0
0
0
−133.7
0
−1
0
−117.03
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
1
1
0
0
0
−2
−171.88
1
0
0
0
−115.03
0
1
0
0
0
−133.7
0
−1
0
−117.03
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
1
0 6× 6
018×6
0 6× 6
−306
,
I18×18
0 6× 6 −0.0236
0.1378
0.0098
0.0125
−0.0125
0
1.239
−0.4618
0.2227
−0.005
0.005
0.0031
0.0022
0
0
0
1.19
−0.1900
0
−0.6050
0.6050
−1
0
0
0
−0.0052 0
0
0
0
0 0
0 6× 6 I 6× 6
012×12
0
0.00452
−0.00452
1.1900
−0.1900
0
0.6050
−1
0
0 6× 6
B2 =
−0.0236
0.0138
0.0098
0.0125
−0.0125
0
1.2391
−0.4618
0.2227
0
0
0
−0.0050
0.0050
0
0
0
0
0.0031
0.0022
0
0
0
0
0
0.0123
0.0087
0.0100
−0.0100
−0.2871
1.0643
−0.0050
0.0050
0.0031
0.0022
0
0
−0.6050
−0.0052 0
0.0045
−0.0210 0
−00.0045
0
−0.1946
1.1946
0
−0.6050
0.6050
-1
0.2 2227
0
0
0
0
0
0
0
0
0
0
−0.0052 0
0.0045
0 6× 6
−0.0045
018×6
0 6×12 I12×12
0
0 6× 6
0 6×12
,
I 6×6 0 6×6
−1
0
0
0
2.5
134.2
0
−1
0
−2.5
0
−133.7
0
0
−1
−134.2
133.7
0
0
0
0
−1
0
0
0
0
0
0
−1
0
0
0
0
0
0
−1
0 12 × 24
0 6×6
A2 =
0 6×6
1
0
0
0
−119.525
0
0
−1
0
−119.525
0
0
0
0
1
0
0
0
0
0
0
1
0
0
0
0
0
0
−1
0
0
0
0
0
0
1
I 6×6
,
0 6 ×18
− I 6×6
0 18 × 6
−1
0
0
0
0.5
306
0
−1
0
−0.5
0
0
0
0
−1
−306
0
0
0
0
0
−1
0
0
0
0
0
0
−1
0
0
0
0
0
0
−1
1
0
0
0
0
0
−1
0
−1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.0016
1
−0.0033
0
0
0
0
0 18 × 6
I 6×6
0 6×6
0 6×6
I 6×6
0 12 × 6
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.0033
0
0
0
0
0
I 6×6
−0.9824
B3 =
0.6877
0 30 × 6 0.2947
0 30 × 6 1
0
0
0
0.3257
−0.3257
0
−1
0
0.9050
0.3665
−0.2715
0
0
0
−0.0031
0.0031
0
0
0
0.0067
0.0029
0
0
0
0
0
0.0033
0
0 −0.0095 0
I 6×6
.
−0.0033
References

1. Ramesh, R., Mannan, M.A., and Poo, A.N., Error compensation in machine tools – a review, part I: geometric, cutting-force induced and fixture dependent errors, International Journal of Machine Tools and Manufacture, 40, 1235, 2000.
2. Kreng, V.B., Liu, C.R., and Chu, C.N., A kinematic model for machine tool accuracy characterization, International Journal of Advanced Manufacturing Technology, 9, 79, 1994.
3. Lee, E.S., Suh, S.H., and Shon, J.W., A comprehensive method for calibration of volumetric positioning accuracy of CNC-Machines, International Journal of Advanced Manufacturing Technology, 14, 43, 1998.
4. Chen, G., Yuan, J., and Ni, J., A displacement measurement approach for machine geometric error assessment, International Journal of Machine Tools and Manufacture, 41, 149, 2001.
5. Ni, J. and Wu, S.M., An on-line measurement technique for machine volumetric error compensation, ASME Transactions, Journal of Engineering for Industry, 115, 85, 1993.
6. Okafor, A.C. and Ertekin, Y.M., Derivation of machine tool error models and error compensation procedure for three axes vertical machining center using rigid body kinematics, International Journal of Machine Tools and Manufacture, 40, 1199, 2000.
7. Lee, J.H. and Yang, S.H., Statistical optimization and assessment of a thermal error model for CNC machine tools, International Journal of Machine Tools and Manufacture, 42, 147, 2002.
8. Mize, C.D. and Ziegert, J.C., Neural network thermal error compensation of a machining center, Precision Engineering, 24, 338, 2000.
9. Lo, C., Yuan, J., and Ni, J., Optimal temperature variable selection by grouping approach for thermal error modeling and compensation, International Journal of Machine Tools and Manufacture, 39, 1383, 1999.
10. Yang, J., Yuan, J., and Ni, J., Thermal error mode analysis and robust modeling for error compensation on a CNC turning center, International Journal of Machine Tools and Manufacture, 39, 1367, 1999.
11. Fuh, J., Chang, C., and Melkanoff, M., An integrated fixture planning and analysis system for machining processes, Robotics and Computer-Integrated Manufacturing, 10, 339, 1993.
12. De Meter, E.C., Min-Max load model for optimizing machining fixture performance, ASME Transactions, Journal of Engineering for Industry, 117, 186, 1995.
13. Hockenberger, M.J. and De Meter, E.C., The application of meta functions to the Quasi-Static analysis of workpiece displacement within a machining fixture, ASME Transactions, Journal of Manufacturing Science and Engineering, 118, 325, 1996.
14. Rong, Y. and Bai, Y., Machining accuracy analysis for computer-aided fixture design verification, ASME Transactions, Journal of Manufacturing Science and Engineering, 118, 289, 1996.
15. Chen, S.G., Ulsoy, A.G., and Koren, Y., Error source diagnostics using a turning process simulator, ASME Transactions, Journal of Manufacturing Science and Engineering, 120, 409, 1998.
16. Mayer, J.R., Phan, A.V., and Cloutier, G., Prediction of diameter errors in bar turning: a computationally effective model, Applied Mathematical Modeling, 24, 943, 2000.
17. Yang, J., Yuan, J., and Ni, J., Real-Time cutting force induced error compensation on a turning center, International Journal of Machine Tools and Manufacture, 37, 1597, 1997.
18. Ramesh, R., Mannan, M.A., and Poo, A.N., Error compensation in machine tools – a review, part II: thermal errors, International Journal of Machine Tools and Manufacture, 40, 1257, 2000.
19. Paul, R.P., Robot Manipulators: Mathematics, Programming, and Control, The MIT Press, Cambridge, MA, 1981.
20. Henzold, G., Handbook of Geometrical Tolerancing: Design, Manufacturing and Inspection, John Wiley & Sons, New York, 1995.
21. Cai, W., Hu, J., and Yuan, J., A variational method of robust fixture configuration design for 3-D workpieces, Journal of Manufacturing Science and Engineering, 119, 593, 1997.
22. Zhou, S., Huang, Q., and Shi, J., State space modeling of dimensional variation propagation in multistage machining process using differential motion vectors, IEEE Transactions on Robotics and Automation, 19, 296, 2003.
8
A Factor Analysis Method for Variability Modeling*
8.1 INTRODUCTION

In the last two chapters, a state space model was used to describe variation and its propagation in a multistage manufacturing process. Product and process design information was assumed to be available, and first principles were used to obtain the state space model coefficient matrices. The prerequisite for using that methodology is that the variation model can be obtained from the design of the product and its process. If the variation model cannot be obtained offline from the product and process design, the stream of variation (SoV) modeling methodology is not directly applicable. Apley and Shi [1] have developed a methodology that models and analyzes the variation of a complex system with factor analysis based on product measurement data.

This chapter presents a method to facilitate the identification and elimination of root causes of variability in manufacturing processes in which the coefficients of the latent model cannot be obtained directly from the design data. The method extracts diagnostic information from what would typically be large quantities of multivariate process measurement data. As the technology for automated in-process measurement becomes more affordable, accurate, reliable, and widely used by manufacturing industries, there are many applications in which the proposed method could be used to reduce variability. The effectiveness of the method will be demonstrated with an example from auto body assembly.

The problem formulation and objectives of the methodology presented in this chapter are closely related to the factor analysis problem. The variability patterns (the fault geometry vectors) must be estimated from process data with no a priori knowledge of the faults, except for an assumed model structure.
In contrast to traditional factor analysis methods, much greater emphasis has been placed on the underlying physical model that describes the effects of the faults on process variability. The criterion for “rotating” the factors is that the results should correspond closely to the actual faults, as opposed to artificial criteria that seek to aggregate the maximum amount of variability into the minimum number of variables. This allows more physically meaningful interpretations that better facilitate the overall goal of identifying and eliminating the root causes of variability.

The remainder of the chapter is organized as follows. Section 8.2 describes the model used to represent the effects of process faults on variability. Section 8.3 discusses the capabilities and limitations of principal component analysis (PCA) and factor rotation for process diagnosis. Section 8.4 discusses methods for estimating the number of faults that are simultaneously present. Section 8.5 presents the main results of the chapter, a technique for estimating the effect of each individual fault when multiple faults are present, and an illustrative example. In Section 8.6, the statistical properties of the estimates are discussed.

* Part of chapter material is based on Reference 5 (pp. 84–95).
8.2 A FACTOR ANALYSIS MODEL FOR PROCESS VARIABILITY

8.2.1 MODEL STRUCTURE

One of the keys to diagnosing variability faults is the incorporation of a suitable fault model. This subsection presents the model structure that is assumed throughout this chapter and illustrates it with an example from auto body assembly. Let $y = [y_1, y_2, \ldots, y_n]^T$ be an n × 1 random vector that represents a set of n measured features from the product or process. Let $y_i$, i = 1, 2, …, N, be a random sample of N observations of y. It is assumed that y obeys the model

$$y = Cu + w \tag{8.1}$$
where $C = [c_1, c_2, \ldots, c_p]$ is an n × p constant matrix, $u = [u_1, u_2, \ldots, u_p]^T$ is a p × 1 zero-mean random vector with covariance matrix $\Sigma_u = I$, and w is an n × 1 zero-mean random vector that is independent of u and has covariance matrix $\Sigma_w = \sigma^2 I$. I denotes the identity matrix of appropriate dimension. It can be shown that the model of Equation 8.1 can be obtained from the state space model of Equation 6.39 and Equation 6.40 if such a model can be obtained from design information, as discussed in earlier chapters. Thus, the model of Equation 8.1 represents the same underlying physics as that of the system described in Chapter 6 and Chapter 7.

The interpretation of the model is that there are p separate uncorrelated faults that affect the measurement vector y. Each fault has a linear effect on y that is dictated by the corresponding column of C. Together, $c_i u_i$ describes the effect of the ith fault on y. The direction of $c_i$ indicates the nature of the variation pattern caused by the ith fault. Specifically, it indicates how the fault causes the different measured features to vary with respect to each other. $c_i$ will be referred to as the “fault geometry” vector. Because the elements of u are scaled to have unit variance, $c_i$ also indicates the magnitude or severity of the ith fault. w represents the aggregated effects of measurement noise and any inherent unmodeled variation in the manufacturing process. It is assumed that p ≤ n and that C has full rank p. The focus of this chapter is on faults that act as sources of variation, as opposed to mean shifts. All random variables are assumed to be zero mean. If not, the mean of y should be estimated and subtracted out from the data.

The objective is to estimate the number of faults p that are contributing to process variability, as well as each of the p fault geometry vectors in C, using the set of multivariate observations of y. The presumption is that if C can be accurately
estimated, it may be used to diagnose each of the p faults and aid in the identification and elimination of their root causes. The model and the objective of estimating C are similar to the standard linear orthogonal factor analysis problem [2], in which the elements of u are referred to as the factors, C is referred to as the factor loading matrix, and $\Sigma_w$ is typically only assumed to be diagonal. For the method presented in this chapter to produce estimates of the fault geometry vectors that have the desired interpretability, it is necessary to assume $\Sigma_w = \sigma^2 I$. This can be assumed without loss of generality if $\Sigma_w$ is known up to a constant scalar multiple. Whereas this may not be a reasonable assumption for typical applications of factor analysis, for statistical process control (SPC) of manufacturing processes it is often reasonable. Essentially, it is equivalent to assuming that a sufficient quantity of data has been collected when the process was known to be in control (i.e., when there were no faults present, y = w). In this case, the sample covariance matrix of the in-control data would provide an estimate of $\Sigma_w$. Before applying the method of this chapter, the data would first be transformed via $\Sigma_w^{-1/2} y$, where $\Sigma_w^{-1/2}$ is the inverse of the square root of $\Sigma_w$. The noise covariance matrix for the transformed data would then be the identity matrix. After estimating C for the transformed data, the estimates should be transformed back by premultiplying by $\Sigma_w^{1/2}$ before interpreting the results.
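The transformation by $\Sigma_w^{-1/2}$ and the back-transformation by $\Sigma_w^{1/2}$ can be sketched as follows. This is a minimal illustration under stated assumptions: the 3 × 3 noise covariance, the simulated in-control data, and the vector `c_hypothetical` are made up for the example, and the symmetric square root is computed from an eigendecomposition, which is one common choice rather than something prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noise covariance for n = 3 measured features (not from the book).
Sigma_w_true = np.array([[0.04, 0.01, 0.00],
                         [0.01, 0.09, 0.02],
                         [0.00, 0.02, 0.16]])

# Simulated in-control data (no faults present, so y = w); one observation per row.
W = rng.multivariate_normal(np.zeros(3), Sigma_w_true, size=5000)

# The sample covariance of the in-control data estimates Sigma_w.
Sigma_w_hat = np.cov(W, rowvar=False)

def sym_sqrt_pair(S):
    """Return the symmetric square root of S and its inverse."""
    vals, vecs = np.linalg.eigh(S)
    root = vecs @ np.diag(np.sqrt(vals)) @ vecs.T
    inv_root = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    return root, inv_root

S_half, S_inv_half = sym_sqrt_pair(Sigma_w_hat)

# Transform the data: each row becomes Sigma_w^{-1/2} y (the matrix is symmetric).
W_white = W @ S_inv_half
cov_white = np.cov(W_white, rowvar=False)   # identity, up to rounding error

# A fault geometry vector estimated in the transformed space is mapped back
# by premultiplying by Sigma_w^{1/2}.
c_hypothetical = np.array([0.5, 0.5, 0.0])
c_transformed = S_inv_half @ c_hypothetical
c_back = S_half @ c_transformed

print(np.round(cov_white, 6))
print(np.round(c_back, 6))
```

Because the same estimate is used both to whiten the data and to compute the covariance of the whitened data, the identity is recovered to rounding error here; with fresh production data the transformed noise covariance would only be approximately the identity.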
8.2.2 INTERPRETATION OF VARIATION PATTERNS
Although factor analysis is widely used in the social sciences and other fields, it is seldom used as a tool for diagnosing variability in manufacturing processes. The applicability of the model is illustrated with the following example from auto body assembly shown in Figure 8.1. Figure 8.2 shows the layout of 26 measurement points (labeled 1–26 in the figure) taken at the BIW stage of the assembly process. All three coordinates (x, y, and z in the figure) are measured for each point, except for points 10 and 23, for which only the x and z coordinates are measured. Thus, the measurement vector y has a total of n = 76 elements.

FIGURE 8.1 Laser sensing at the BIW stage in auto body assembly. (Labeled in the figure: laser sensors; axes x (fore), y (right), and z (up).)

FIGURE 8.2 Measurement layout at the BIW stage. (Labeled in the figure: measurement points 1 to 26; left bodyside, right bodyside, roof header, plenum, cross-member, and lift-gate support; front door, rear door, and rear window openings; axes x (fore), y (right), and z (up).)

The cross-member, roof header, and plenum (in addition to the roof, underbody, cowl, and additional roof bows, which have all been omitted from the figure for clarity) join the left and the right body sides to form the BIW. Before the body sides and connecting members are welded together, they must be accurately located with respect to one another with fixtures and then clamped into place. The body sides are positioned in the x-z plane with pins rigidly attached to the fixtures. Each pin mates with either a hole or a slot in the auto body panel, so that the panel position is constrained but not overly constrained. Through repeated use (a thousand panels per day may be placed into each fixture), the pins frequently become worn or loose. In this event, the panel will no longer be constrained to lie in its proper location when it is placed into the fixture. For example, suppose the pin that constrains the right body side in the x direction becomes loose. When a body side is placed into the fixture, it may be positioned too far forward (in the positive x direction) by, say, 1 mm. When it is subsequently clamped into place and welded to the rest of the BIW, it will retain the incorrect position. For the next auto body, the right body side may be positioned too far backward (in the negative x direction). From auto body to auto body, the loose pin will cause a distinct variation pattern in the BIW dimensions. The elements of y that represent measurements in the x direction on the right body side will reflect this pattern when the BIW is
measured. If we refer to this loose pin as fault 1, u1 is the random variable representing by how much the right body side translates in the x direction for each auto body (scaled to have unit variance). All elements of c1 that do not correspond to x direction measurements on the right body side would be zero. All elements that do correspond to x direction measurements on the right body side would be the same constant value, equal to the standard deviation of the body side translations. Likewise, suppose a second pin becomes loose or worn and causes the right body side to rotate in the x-z plane before being welded to the BIW. This, referred to as fault 2, will also cause a distinct variation pattern in the BIW measurements that is well represented by the model of Equation 8.1. The c2 vector will be determined by the measurement layout and the geometry of the panel and fixture. Apley and Shi [3] provide details on how to analytically model the effects of tooling faults and give a justification for the linear structure of model of Equation 8.1. Technically, the rotation of a panel is nonlinear in the parameters, but is closely approximated as linear when small angles of rotation are involved. The linear model structure is quite versatile and provides a good representation of a variety of commonly encountered faults, including those that are not rigid body translations and rotations. This includes variability introduced by stamping, welding, and material-handling faults, in which case the model of Equation 8.1 can be viewed as a linearization of a more accurate nonlinear model [3]. Remark 8.1: We point out that it is not necessary to analytically model the faults to use the method presented in this chapter. This is the main distinction between this work and that of Apley and Shi [3], or SoV modeling techniques presented in Chapter 6 and Chapter 7. 
In Apley and Shi [3] it is assumed that an exhaustive set of potential faults can be analytically modeled offline to obtain the C matrix. The problem then reduces to one of fault classification, where, based on the online measurements, one seeks to identify which of the modeled faults is present. The method of this chapter attempts to estimate C directly from the data, with no a priori knowledge of the faults, and use the results to gain insight into the root causes of the variability.

To uniquely estimate C with no a priori knowledge of the faults, an additional assumption must be made regarding the structure of C. The reason this assumption is necessary is discussed in Section 8.3. It is assumed that C has the ragged lower triangular form

$$C = \begin{bmatrix} c_{1,1} & & & & \\ c_{2,1} & c_{2,2} & & & \\ c_{3,1} & c_{3,2} & c_{3,3} & & \\ \vdots & \vdots & \vdots & \ddots & \\ c_{p,1} & c_{p,2} & c_{p,3} & \cdots & c_{p,p} \end{bmatrix} \tag{8.2}$$

where $c_{i,j}$ is an $n_i \times 1$ vector with $n_i \ge 2$ and $\sum_{i=1}^{p} n_i = n$. The interpretation of this structure is that there exists a subgroup of $n_1$ measurements $\{y_1, y_2, \ldots, y_{n_1}\}$ that are
affected by only one fault and not by the remaining p – 1 faults. Furthermore, there must exist a second subgroup of $n_2$ measurements $\{y_{n_1+1}, y_{n_1+2}, \ldots, y_{n_1+n_2}\}$ that are affected by only one of the remaining p – 1 faults. Note that these measurements may also be affected by the first fault. There must also exist a third subgroup of measurements affected by only one of the remaining p – 2 faults, and so on. In the previous example, $\{y_1, y_2, \ldots, y_{n_1}\}$ could be taken to be any set of z direction measurements on the right body side. Such a group would be affected by fault 2, which causes the right body side to rotate, but not by fault 1, which causes only an x-direction translation of the body side and does not affect the z direction measurements. Upon appropriate reordering of the measurements and faults, C has the structure in Equation 8.2. If the faults are such that C does not have the assumed structure of Equation 8.2, the method of this chapter cannot be applied. When n is large relative to p, this structure is often satisfied in auto body assembly. An additional example in which this is the case will be given in Subsection 8.5.3.
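The structural assumption of Equation 8.2 can be checked mechanically once the measurements have been grouped. The sketch below is an illustration, not a procedure from the text: the function name `has_ragged_lower_triangular_form`, the tolerance, and the numerical entries of `C_example` are hypothetical, and the requirement that each diagonal block be nonzero is this sketch's reading of "affected by only one fault". The example orders the rotation fault first and the x-translation fault second, with two z measurements followed by two x measurements.

```python
def has_ragged_lower_triangular_form(C, block_sizes, tol=1e-12):
    """Check the structure of Equation 8.2 for an n x p matrix C (list of rows).

    block_sizes = [n_1, ..., n_p]. Rows in block i may be nonzero only in
    columns 1..i, and the diagonal block c_{i,i} must not vanish.
    """
    p = len(block_sizes)
    if sum(block_sizes) != len(C) or any(n_i < 2 for n_i in block_sizes):
        return False
    if any(len(row) != p for row in C):
        return False
    start = 0
    for i, n_i in enumerate(block_sizes):
        rows = C[start:start + n_i]
        # Entries to the right of the diagonal block must be zero.
        if any(abs(row[j]) > tol for row in rows for j in range(i + 1, p)):
            return False
        # The diagonal block itself must contain a nonzero entry.
        if all(abs(row[i]) <= tol for row in rows):
            return False
        start += n_i
    return True

# Hypothetical fault geometry matrix: column 1 = rotation, column 2 = x translation.
C_example = [
    [0.20, 0.00],   # z measurement, affected by the rotation only
    [-0.10, 0.00],  # z measurement, affected by the rotation only
    [0.05, 0.30],   # x measurement, affected by both faults
    [0.10, 0.30],   # x measurement, affected by both faults
]

# Same faults listed in the opposite order: the structure is no longer visible.
C_swapped = [[row[1], row[0]] for row in C_example]

print(has_ragged_lower_triangular_form(C_example, [2, 2]))
print(has_ragged_lower_triangular_form(C_swapped, [2, 2]))
```

The second call illustrates why the text speaks of "appropriate reordering of the measurements and faults": the same physical faults satisfy the structure only for a suitable ordering.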
8.3 LIMITATIONS OF PCA AND FACTOR ROTATION

From the model structure and assumptions, the covariance matrix of y is

$$\Sigma_y = E[(Cu + w)(Cu + w)^T] = CC^T + \sigma^2 I \tag{8.3}$$
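For readers who want the intermediate step, Equation 8.3 can be expanded term by term. The expansion uses only the assumptions stated in Subsection 8.2.1: u and w are zero mean and independent (so the cross terms vanish), $\Sigma_u = I$, and $\Sigma_w = \sigma^2 I$.

$$
\begin{aligned}
\Sigma_y &= E[(Cu + w)(Cu + w)^T] \\
&= C\,E[uu^T]\,C^T + C\,E[uw^T] + E[wu^T]\,C^T + E[ww^T] \\
&= C\,\Sigma_u\,C^T + 0 + 0 + \Sigma_w \\
&= CC^T + \sigma^2 I .
\end{aligned}
$$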
Let $z_i$, i = 1, 2, …, n, denote an orthogonal set of unit norm eigenvectors of $\Sigma_y$. Let $\lambda_i$, i = 1, 2, …, n, denote the corresponding eigenvalues, arranged in descending order. It follows from Equation 8.3 that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > \sigma^2 = \lambda_{p+1} = \lambda_{p+2} = \cdots = \lambda_n$, and $\mathrm{span}\{z_i\}_{i=1}^{p} = \mathrm{span}\{c_i\}_{i=1}^{p}$. PCA can also be used to decompose $\Sigma_y$ in terms of its eigenvectors and eigenvalues as

$$\Sigma_y = \sum_{i=1}^{n} \lambda_i z_i z_i^T = \sum_{i=1}^{p} (\lambda_i - \sigma^2) z_i z_i^T + \sigma^2 \sum_{i=1}^{n} z_i z_i^T = Z_p [\Lambda_p - \sigma^2 I] Z_p^T + \sigma^2 I \tag{8.4}$$
where Zp = [z1, z2, …, z p], and Λp = diag{λ1, λ2, …, λ p}. Comparing Equation 8.3 and Equation 8.4, it is obvious that one possible estimate of C is Z p [Λ Λ p − σ 2I]1/ 2 . Another possible estimate that would preserve the covariance structure is Z p [Λ p − σ 2I]1/ 2 Q, where Q is any p × p orthogonal matrix. When there is a single fault present, PCA is an effective tool for diagnosing process variability. ∑y has a single dominant eigenvalue, and a unique estimate of C (= c1) is Z1[λ1 – σ2]1/2. σ2 can be taken to be any of the smallest n – 1 eigenvalues. In practice, one must work with the sample covariance matrix. In this case, it may not be clear how many eigenvalues are “dominant” and, thus, how many faults are present. Methods of estimating the number of faults from the sample covariance matrix are discussed in Section 8.4. When multiple faults are present (p > 1), the fault geometry vectors will not correspond one-to-one with the eigenvectors unless the fault geometry vectors happen
to be orthogonal. PCA does provide an estimate of the number of faults that are present (refer to Section 8.4), but estimation of C is less straightforward. As discussed earlier, the estimate is not unique. Treating the eigenvectors as (scaled) estimates of the fault geometry vectors may yield noninterpretable results and provide little useful diagnostic information. To improve interpretability, a commonly used procedure is factor rotation [2,4]. The standard factor rotation problem is to find the p × p orthogonal matrix Q so that the resulting estimate of C (typically of the form $Z_p \Lambda_p^{1/2} Q$) provides the clearest interpretability. What is meant by “interpretability” is subjective, but the most widely used criterion is the varimax rotation [2,4]. The varimax criterion for best interpretability is that each column of $Z_p \Lambda_p^{1/2} Q$ should consist of elements that are either very large or very small in magnitude, with as few moderate-sized elements as possible. For diagnosis of manufacturing variability, there is little justification for the varimax criterion. The Q matrix that results in the clearest interpretability would be such that $Z_p[\Lambda_p - \sigma^2 I]^{1/2}Q = C$, whose columns are the physical fault geometry vectors themselves. That is, knowing the fault geometry vectors for the actual faults will surely provide the most effective root cause diagnosis. As shown in Section 8.5, C may be uniquely identified if it has the structure assumed in Equation 8.2.
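The non-uniqueness discussed above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `C`, the noise variance `sigma2`, and the random orthogonal matrix `Q` are hypothetical values chosen only for illustration. It shows that the PCA-based estimate and any rotation of it reproduce $CC^T$, while neither needs to match the columns of C when the fault geometry vectors are not orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fault geometry matrix (n = 6 measurements, p = 2 faults).
# The two columns are deliberately not orthogonal.
C = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.5, 1.0],
              [0.0, 1.0],
              [0.0, 2.0],
              [1.0, -1.0]])
sigma2 = 0.1
p = C.shape[1]
Sigma_y = C @ C.T + sigma2 * np.eye(C.shape[0])    # Equation 8.3

# PCA-based estimate Z_p [Lambda_p - sigma^2 I]^{1/2}
eigvals, eigvecs = np.linalg.eigh(Sigma_y)         # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
lam, Z = eigvals[order], eigvecs[:, order]
C_pca = Z[:, :p] @ np.diag(np.sqrt(lam[:p] - sigma2))

# Any orthogonal Q gives another estimate with the same covariance structure
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
C_rot = C_pca @ Q

same_cov_pca = np.allclose(C_pca @ C_pca.T, C @ C.T)
same_cov_rot = np.allclose(C_rot @ C_rot.T, C @ C.T)
columns_match = np.allclose(np.abs(C_pca), np.abs(C))   # compare up to sign
print(same_cov_pca, same_cov_rot, columns_match)
```

Both covariance checks hold, but the PCA columns are mutually orthogonal by construction and therefore cannot coincide with the non-orthogonal columns of this `C`.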
8.4 ESTIMATING THE NUMBER OF FAULTS

The method for estimating C that will be presented in Section 8.5 requires an estimate of the number of faults present. If p faults are present, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p > \sigma^2 = \lambda_{p+1} = \lambda_{p+2} = \dots = \lambda_n$. Thus, a natural means of estimating p is to look at the eigenvalues $\{\hat\lambda_i\}_{i=1}^{n}$ of the sample covariance matrix $\hat\Sigma_y$. In this chapter, the symbol “^” will be used to denote an estimate of a parameter. A number of methods for estimating p have been suggested, the most popular of which can be classified as either maximum-likelihood based or information based. Strictly speaking, these methods require that the data be multivariate Gaussian.
8.4.1 LIKELIHOOD RATIO TEST

Anderson [5] developed results for the asymptotic distribution of eigenvectors and eigenvalues of a sample covariance matrix for Gaussian data. This yields an asymptotically (for large enough N) valid likelihood ratio test of the null hypothesis that $\lambda_{m+1} = \lambda_{m+2} = \dots = \lambda_n$, for some fixed m, in contrast to the alternative hypothesis that not all of the n – m smallest eigenvalues are equal. Note that when m ≥ p, the null hypothesis holds. For m = 0, 1, …, n – 1, one calculates the test statistics

$$\Lambda(m) = N(n - m)\log\frac{a_m}{g_m}$$

where $a_m$ and $g_m$ are the arithmetic and geometric means, respectively, of the n – m smallest eigenvalues of $\hat\Sigma_y$. Under the null hypothesis, Λ(m) is asymptotically chi-squared distributed with (n – m)(n – m + 1)/2 – 1 degrees of freedom. A set of
thresholds η(m), m = 0, 1, …, n – 1, are specified, typically based on the null distribution. The suggested procedure for estimating p is, while increasing m sequentially from 0, to choose $\hat p$ to be the first m for which Λ(m) < η(m). To improve the chi-squared approximation to the null distribution for finite N, Lawley [6] introduced the modified statistic

$$\Lambda_{\mathrm{mod}}(m) = \left[1 - \frac{m}{N} - \frac{2(n-m)^2 + (n-m) + 2}{6N(n-m)} + \frac{1}{N}\sum_{i=1}^{m}\left(\frac{a_m}{\hat\lambda_i - a_m}\right)^{2}\right]\Lambda(m)$$

which has the same asymptotic chi-squared null distribution with (n – m)(n – m + 1)/2 – 1 degrees of freedom. The procedure for estimating p is the same.
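The sequential test can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the statistics follow Λ(m) and the modified statistic attributed to Lawley [6] in the text, and the thresholds in the example are arbitrary placeholders rather than chi-squared percentiles (which would normally come from a statistics library).

```python
import math

def tail_means(eigvals, m):
    """Arithmetic and geometric means of the n - m smallest eigenvalues."""
    tail = sorted(eigvals, reverse=True)[m:]
    a_m = sum(tail) / len(tail)
    g_m = math.exp(sum(math.log(v) for v in tail) / len(tail))
    return a_m, g_m

def lr_statistic(eigvals, N, m):
    """Lambda(m) = N (n - m) log(a_m / g_m)."""
    a_m, g_m = tail_means(eigvals, m)
    return N * (len(eigvals) - m) * math.log(a_m / g_m)

def lr_statistic_mod(eigvals, N, m):
    """Modified statistic: a finite-N correction factor times Lambda(m)."""
    q = len(eigvals) - m
    a_m, _ = tail_means(eigvals, m)
    leading = sorted(eigvals, reverse=True)[:m]
    correction = (1.0 - m / N
                  - (2 * q * q + q + 2) / (6 * N * q)
                  + sum((a_m / (lam - a_m)) ** 2 for lam in leading) / N)
    return correction * lr_statistic(eigvals, N, m)

def degrees_of_freedom(n, m):
    """Degrees of freedom of the asymptotic chi-squared null distribution."""
    return (n - m) * (n - m + 1) // 2 - 1

def estimate_p(eigvals, N, thresholds, statistic=lr_statistic_mod):
    """Return the first m (counting up from 0) whose statistic is below eta(m)."""
    for m, eta in enumerate(thresholds):
        if statistic(eigvals, N, m) < eta:
            return m
    return None  # no m passed the test with the supplied thresholds

# Hypothetical eigenvalues: two dominant values, four equal noise-level values
eigvals = [10.0, 8.0, 1.0, 1.0, 1.0, 1.0]
thresholds = [50.0] * len(eigvals)    # placeholder thresholds, not chi-squared percentiles
p_hat = estimate_p(eigvals, N=100, thresholds=thresholds)
print(p_hat)
```

In this idealized example the four smallest eigenvalues are exactly equal, so the statistic vanishes at m = 2 and the procedure stops there.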
8.4.2 AIC AND MDL INFORMATION CRITERIA
Alternative procedures for estimating p are based on the Akaike (AIC) and minimum description length (MDL) information criteria introduced by Akaike [7], Schwartz [8], and Rissanen [9]. For the problem of estimating the number of significant factors in PCA, the AIC criterion was applied in Akaike [10] and Wax and Kailath [11], and the MDL criterion was applied in Wax and Kailath [11]. The AIC and MDL tests require calculation of, for m = 0, 1, …, n – 1,

$$\mathrm{AIC}(m) = N(n - m)\log\frac{a_m}{g_m} + m(2n - m)$$

and

$$\mathrm{MDL}(m) = N(n - m)\log\frac{a_m}{g_m} + \frac{m(2n - m)\log(N)}{2}$$

Using the AIC or MDL criterion, $\hat p$ is chosen to be the m that minimizes AIC(m) or MDL(m), respectively. One major advantage of using either the AIC test or MDL test is simplicity. For the likelihood ratio tests one must select, somewhat arbitrarily, the set of thresholds. The information-based tests do not require this. Some properties of the various methods of estimating p are addressed in the following remarks:

Remark 8.2: The likelihood ratio methods often lead to overestimating p and choosing a higher number of factors than can be interpreted [12]. The MDL and AIC criteria are less dependent on the validity of the asymptotic chi-squared approximation.

Remark 8.3: The AIC estimate is not consistent (in the sense of yielding the true number of faults with probability one as N approaches infinity), whereas the other three methods do produce consistent estimates [11]. The AIC method slightly overestimates p asymptotically.
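As a rough illustration of the two criteria, the sketch below evaluates AIC(m) and MDL(m) for a list of eigenvalues and returns the minimizing m. The function names and the example eigenvalues are hypothetical; only the two formulas come from the text.

```python
import math

def log_mean_ratio(eigvals, m):
    """log(a_m / g_m) for the n - m smallest eigenvalues."""
    tail = sorted(eigvals, reverse=True)[m:]
    a_m = sum(tail) / len(tail)
    g_m = math.exp(sum(math.log(v) for v in tail) / len(tail))
    return math.log(a_m / g_m)

def aic(eigvals, N, m):
    n = len(eigvals)
    return N * (n - m) * log_mean_ratio(eigvals, m) + m * (2 * n - m)

def mdl(eigvals, N, m):
    n = len(eigvals)
    return N * (n - m) * log_mean_ratio(eigvals, m) + m * (2 * n - m) * math.log(N) / 2

def estimate_p(eigvals, N, criterion):
    """Choose the m in 0, 1, ..., n - 1 that minimizes the given criterion."""
    return min(range(len(eigvals)), key=lambda m: criterion(eigvals, N, m))

# Hypothetical eigenvalues of a sample covariance matrix (n = 6, N = 100)
eigvals = [10.0, 8.0, 1.0, 1.0, 1.0, 1.0]
p_aic = estimate_p(eigvals, 100, aic)
p_mdl = estimate_p(eigvals, 100, mdl)
print(p_aic, p_mdl)
```

The only difference between the two criteria is the penalty term, which is heavier for MDL whenever log(N)/2 exceeds 1; this is in line with the observation that AIC tends to pick up smaller faults but may overestimate p.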
8.4.3 AN ILLUSTRATIVE EXAMPLE

For some problems of typical scale (i.e., typical n, N, and p) in auto body assembly, we present a Monte Carlo comparison of the different methods. For all simulations, 10,000 Monte Carlo trials were used, and the generated data were Gaussian. Note that, as shown in the appendix, the distribution of each of the four test statistics depends only on n, N, and $\{\lambda_i/\sigma^2\}_{i=1}^{p}$, and not on the individual eigenvectors. Table 8.1 through Table 8.3 show the probability mass functions of $\hat p$ for various cases when at least
TABLE 8.1 Probability Mass Function of $\hat p$ from AIC for Various n, N, and $\{\lambda_i/\sigma^2\}_{i=1}^{p}$ (pj ≡ Probability{$\hat p$ = j})

n     N     {λi/σ²}       p0      p1      p2      p3      p4      p5      p6
40    50    {11,11,11}    0       0       0       0.951   0.046   0.003   0
40    50    {5,5,5}       0       0.002   0.067   0.892   0.037   0.002   0
40    50    {3,3,3}       0.084   0.334   0.394   0.18    0.008   0       0
40    50    {11}          0       0.975   0.024   0.001   0       0       0
40    50    {5}           0.006   0.973   0.02    0.001   0       0       0
40    50    {3}           0.365   0.621   0.014   0       0       0       0
40    100   {11,11,11}    0       0       0       0.941   0.058   0.001   0
40    100   {5,5,5}       0       0       0       0.947   0.052   0.001   0
40    100   {2,2,2}       0.294   0.446   0.22    0.039   0.001   0       0
40    100   {11}          0       0.962   0.038   0       0       0       0
40    100   {5}           0       0.961   0.039   0       0       0       0
40    100   {2}           0.608   0.379   0.013   0       0       0       0
40    500   {11,11,11}    0       0       0       0.906   0.091   0.003   0
40    500   {3,3,3}       0       0       0       0.914   0.083   0.003   0
40    500   {2,2,2}       0       0       0       0.927   0.071   0.002   0
40    500   {11}          0       0.914   0.085   0.001   0       0       0
40    500   {3}           0       0.917   0.082   0.001   0       0       0
40    500   {2}           0       0.918   0.081   0.001   0       0       0
200   250   {11,11,11}    0       0       0       1       0       0       0
200   250   {5,5,5}       0       0       0       1       0       0       0
200   250   {3,3,3}       0.003   0.122   0.567   0.308   0       0       0
200   250   {11}          0       1       0       0       0       0       0
200   250   {5}           0       1       0       0       0       0       0
200   250   {3}           0.181   0.819   0       0       0       0       0
200   800   {5,5,5}       0       0       0       1       0       0       0
200   800   {3,3,3}       0       0       0       1       0       0       0
200   800   {2,2,2}       0       0.009   0.319   0.672   0       0       0
200   800   {3}           0       1       0       0       0       0       0
200   800   {2}           0.047   0.953   0       0       0       0       0

Note: In the original table, the box corresponding to the actual value of p is shaded. The actual p equals the number of entries listed in {λi/σ²}.
TABLE 8.2 Probability Mass Function of $\hat p$ from MDL for Various n, N, and $\{\lambda_i/\sigma^2\}_{i=1}^{p}$ (pj ≡ Probability{$\hat p$ = j})

n     N     {λi/σ²}       p0      p1      p2      p3      p4      p5      p6
40    50    {11,11,11}    0       0       0.008   0.992   0       0       0
40    50    {5,5,5}       0.39    0.405   0.168   0.037   0       0       0
40    50    {3,3,3}       0.995   0.005   0       0       0       0       0
40    50    {11}          0       1       0       0       0       0       0
40    50    {5}           0.577   0.423   0       0       0       0       0
40    50    {3}           0.997   0.003   0       0       0       0       0
40    100   {11,11,11}    0       0       0       1       0       0       0
40    100   {5,5,5}       0.005   0.029   0.233   0.733   0       0       0
40    100   {2,2,2}       1       0       0       0       0       0       0
40    100   {11}          0       1       0       0       0       0       0
40    100   {5}           0.047   0.953   0       0       0       0       0
40    100   {2}           1       0       0       0       0       0       0
40    500   {11,11,11}    0       0       0       1       0       0       0
40    500   {3,3,3}       0       0       0       1       0       0       0
40    500   {2,2,2}       0.931   0.067   0.002   0       0       0       0
40    500   {11}          0       1       0       0       0       0       0
40    500   {3}           0       1       0       0       0       0       0
40    500   {2}           0.96    0.04    0       0       0       0       0
200   250   {11,11,11}    0       0       0       1       0       0       0
200   250   {5,5,5}       0.995   0.005   0       0       0       0       0
200   250   {3,3,3}       1       0       0       0       0       0       0
200   250   {11}          0       1       0       0       0       0       0
200   250   {5}           1       0       0       0       0       0       0
200   250   {3}           1       0       0       0       0       0       0
200   800   {5,5,5}       0       0       0       1       0       0       0
200   800   {3,3,3}       1       0       0       0       0       0       0
200   800   {2,2,2}       1       0       0       0       0       0       0
200   800   {3}           1       0       0       0       0       0       0
200   800   {2}           1       0       0       0       0       0       0

Note: In the original table, the box corresponding to the actual value of p is shaded. The actual p equals the number of entries listed in {λi/σ²}.
one fault is present, using AIC, MDL, and Λmod, respectively. The values are rounded to three decimal places. The results using the unmodified likelihood ratio statistic Λ are not shown. Except for the case where n = 40 and N = 500, this method greatly overestimated the number of faults and will not be further considered. For the Λmod method, all thresholds were chosen to be the upper 0.001 percentile of the appropriate chi-squared distribution. The values of $\{\lambda_i/\sigma^2\}_{i=1}^{p}$ were chosen to span the range of what can be considered small (in terms of the tests having difficulty in detecting the
TABLE 8.3 Probability Mass Function of $\hat p$ from Λmod for Various n, N, and $\{\lambda_i/\sigma^2\}_{i=1}^{p}$ (pj ≡ Probability{$\hat p$ = j})

n     N     {λi/σ²}       p0      p1      p2      p3      p4      p5      p6
40    50    {11,11,11}    0       0       0.033   0.867   0.034   0.021   0.021
40    50    {5,5,5}       0.002   0.111   0.462   0.319   0.04    0.022   0.016
40    50    {3,3,3}       0.289   0.31    0.202   0.099   0.038   0.022   0.011
40    50    {11}          0.005   0.878   0.047   0.027   0.014   0.011   0.005
40    50    {5}           0.315   0.573   0.047   0.021   0.016   0.007   0.01
40    50    {3}           0.678   0.212   0.045   0.02    0.016   0.008   0.011
40    100   {11,11,11}    0       0       0       0.998   0.001   0.001   0
40    100   {5,5,5}       0       0       0.358   0.639   0.002   0.001   0
40    100   {2,2,2}       0.858   0.12    0.017   0.004   0.001   0       0
40    100   {11}          0       0.999   0       0.001   0       0       0
40    100   {5}           0.108   0.891   0       0.001   0       0       0
40    100   {2}           0.985   0.013   0       0.002   0       0       0
40    500   {11,11,11}    0       0       0       0.998   0.001   0.001   0
40    500   {3,3,3}       0       0       0       0.999   0.001   0       0
40    500   {2,2,2}       0       0.012   0.669   0.318   0.001   0       0
40    500   {11}          0       0.998   0.001   0       0       0.001   0
40    500   {3}           0       0.998   0.001   0       0.001   0       0
40    500   {2}           0.419   0.578   0.002   0       0.001   0       0
200   250   {11,11,11}    0       0       0       0       0       0       0
200   250   {5,5,5}       0       0       0       0       0       0       0
200   250   {3,3,3}       0       0       0       0       0       0       0
200   250   {11}          0       0       0       0       0       0       0
200   250   {5}           0       0       0       0       0       0       0
200   250   {3}           0       0       0       0       0       0       0
200   800   {5,5,5}       0       0       0       0.995   0.005   0       0
200   800   {3,3,3}       0       0.003   0.461   0.523   0.004   0.003   0.002
200   800   {2,2,2}       0.289   0.475   0.208   0.025   0.003   0       0
200   800   {3}           0.301   0.693   0.005   0.001   0       0       0
200   800   {2}           0.939   0.055   0.005   0.001   0       0       0

Note: In the original table, the box corresponding to the actual value of p is shaded. The actual p equals the number of entries listed in {λi/σ²}.
faults) to large (in terms of the tests detecting the faults with high probability). Note that for the case p = 1, $c_1^T c_1/\sigma^2 = \lambda_1/\sigma^2 - 1$. The quantity $c_1^T c_1/\sigma^2$ can be viewed as the total variance (summed over all n measurements) due to the fault, divided by the average variance (averaged over all n measurements) due to the noise. Table 8.4 shows the probability of correctly concluding p = 0 when there are truly no faults present. This situation is of particular interest if an eigenvalue test is used to detect the presence of a fault. In this event, the probabilities shown in
TABLE 8.4 Probability That $\hat p$ = 0 when No Fault Is Present, for Various n and N

                Probability{$\hat p$ = 0}
n     N     AIC     MDL     Λmod
20    50    0.926   1.000   0.998
20    100   0.911   1.000   0.999
20    200   0.879   1.000   0.999
20    500   0.858   1.000   0.999
50    75    0.991   1.000   0.952
50    150   0.978   1.000   0.998
50    250   0.968   1.000   0.999
50    500   0.947   1.000   0.999
100   120   1.000   1.000   0.016
100   200   1.000   1.000   0.979
100   500   0.992   1.000   1.000
200   250   1.000   1.000   0.000
200   500   1.000   1.000   0.961
Table 8.4 are equal to one minus the alpha error. The values in Table 8.4 are also rounded to three decimal places. The results in Table 8.1 through Table 8.3 agree with the fact that the MDL and Λmod methods provide consistent estimates of p, whereas the AIC method asymptotically overestimates p (consider the n = 40, N = 500 cases). On the other hand, the AIC method was generally able to detect much smaller magnitude faults than either the MDL or Λmod methods. For example, consider the case when n = 40, N = 100, and three faults are present with $\{\lambda_i/\sigma^2\}_{i=1}^{p}$ = {5,5,5}. The AIC method correctly estimates p with 0.947 probability, whereas MDL and Λmod correctly estimate p with only 0.733 and 0.639 probability, respectively. Consider also the cases where n = 200, N = 250, and $\{\lambda_i/\sigma^2\}_{i=1}^{p}$ = {5,5,5}, {3,3,3}, {5}, or {3}. In these situations, the AIC method provides much more accurate estimates of p. Furthermore, when n is large and N is not sufficiently large, Λmod greatly overestimates p. Although not shown in Table 8.3, for n = 200 and N = 250, Λmod often estimates p to be greater than 20. These observations also extend to the case when there are no faults present, as shown in Table 8.4. The alpha error is generally very low for the MDL method and slightly higher for the AIC method. Note that for all cases the alpha error for the MDL is zero to three decimal places. For large N/n (e.g., n = 20 or 50, N = 500), Λmod (which is consistent) has a lower alpha error than AIC (which asymptotically overestimates p). For small N/n (e.g., n = 100, N = 120), AIC still has a very small alpha error, whereas the alpha error for Λmod is close to one.

In light of this, we suggest using either the AIC or MDL methods. If small-magnitude faults are expected and it is desired that they be detected, the AIC method is more suitable. Otherwise, the MDL method is preferable.
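A scaled-down version of such a Monte Carlo study can be sketched as follows. This is an illustrative sketch under stated assumptions, not the simulation code behind the tables: it assumes NumPy, sets σ² = 1, uses a diagonal population covariance (which relies on the stated fact that the distributions depend only on the eigenvalue ratios), runs only 200 trials instead of 10,000, and uses the N – 1 divisor of `numpy.cov`. The function names are hypothetical.

```python
import numpy as np

def mdl_estimate(eigvals, N):
    """Return the m in 0..n-1 minimizing MDL(m) for the given sample eigenvalues."""
    lam = np.sort(eigvals)[::-1]
    n = lam.size
    scores = []
    for m in range(n):
        tail = lam[m:]
        log_ratio = np.log(tail.mean()) - np.log(tail).mean()   # log(a_m / g_m)
        scores.append(N * (n - m) * log_ratio + m * (2 * n - m) * np.log(N) / 2)
    return int(np.argmin(scores))

def simulate_pmf(n, N, ratios, trials, seed=0):
    """Monte Carlo estimate of the probability mass function of p-hat under MDL."""
    rng = np.random.default_rng(seed)
    variances = np.ones(n)                  # sigma^2 = 1 for the noise directions
    variances[:len(ratios)] = ratios        # lambda_i / sigma^2 for the fault directions
    counts = np.zeros(n, dtype=int)
    for _ in range(trials):
        Y = rng.standard_normal((N, n)) * np.sqrt(variances)
        S = np.cov(Y, rowvar=False)         # sample covariance matrix
        counts[mdl_estimate(np.linalg.eigvalsh(S), N)] += 1
    return counts / trials

# One cell of the comparison: n = 40, N = 100, three faults with ratio 11
pmf = simulate_pmf(n=40, N=100, ratios=[11, 11, 11], trials=200)
print(pmf[:7])
```

With so few trials the estimated probabilities are coarse, but most of the mass should fall on $\hat p$ = 3 for this case, in line with the corresponding row of Table 8.2.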
8.5 UNIQUE IDENTIFICATION OF MULTIPLE FAULTS

8.5.1 ESTIMATING THE FAULT GEOMETRY VECTORS

Assume y follows the model of Equation 8.1, where C has the structure in Equation 8.2, and further assume that $\Sigma_y$ is known. Assume also that one has identified a subgroup of measurements that are affected by a single fault, for which a procedure is given in Subsection 8.5.2. Define the “latent” covariance matrix $\Sigma_C$ to be the component of $\Sigma_y$ that is due to the underlying latent variables that represent the faults, i.e.,

$$\Sigma_C = Z_p[\Lambda_p - \sigma^2 I]Z_p^T = CC^T = \sum_{i=1}^{p} c_i c_i^T = \begin{bmatrix} c_{1,1} \\ c_{2,1} \\ \vdots \\ c_{p,1} \end{bmatrix} \begin{bmatrix} c_{1,1}^T & c_{2,1}^T & \cdots & c_{p,1}^T \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & \Sigma_C^{(2)} \end{bmatrix} \tag{8.5}$$
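The last equality follows from the assumed structure of C. From Equation 8.5, the upper-left n1 × n1 block of $\Sigma_C$ is the rank-1 matrix $c_{1,1}c_{1,1}^T$, which has a single nonzero eigenvalue $\lambda_{1,1} = c_{1,1}^T c_{1,1}$ with eigenvector $z_{1,1} = c_{1,1}\lambda_{1,1}^{-1/2}$. Thus, the first fault geometry vector can be obtained from $\Sigma_C$ via

$$\Sigma_C \begin{bmatrix} z_{1,1}\lambda_{1,1}^{-1/2} \\ 0 \end{bmatrix} = \left\{ \begin{bmatrix} c_{1,1} \\ c_{2,1} \\ \vdots \\ c_{p,1} \end{bmatrix} \begin{bmatrix} c_{1,1}^T & c_{2,1}^T & \cdots & c_{p,1}^T \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & \Sigma_C^{(2)} \end{bmatrix} \right\} \begin{bmatrix} c_{1,1}\left(c_{1,1}^T c_{1,1}\right)^{-1} \\ 0 \end{bmatrix} = \begin{bmatrix} c_{1,1} \\ c_{2,1} \\ \vdots \\ c_{p,1} \end{bmatrix} = c_1$$

After identifying $c_1$, deflate the latent covariance matrix via

$$\Sigma_C - c_1 c_1^T = \sum_{i=2}^{p} c_i c_i^T = \begin{bmatrix} 0 & 0 \\ 0 & \Sigma_C^{(2)} \end{bmatrix}$$

where

$$\Sigma_C^{(2)} = \begin{bmatrix} c_{2,2} \\ c_{3,2} \\ \vdots \\ c_{p,2} \end{bmatrix} \begin{bmatrix} c_{2,2}^T & c_{3,2}^T & \cdots & c_{p,2}^T \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & \Sigma_C^{(3)} \end{bmatrix}$$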
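The upper-left n2 × n2 block of $\Sigma_C^{(2)}$ is the rank-1 matrix $c_{2,2}c_{2,2}^T$, which has a single nonzero eigenvalue $\lambda_{2,2} = c_{2,2}^T c_{2,2}$ with eigenvector $z_{2,2} = c_{2,2}\lambda_{2,2}^{-1/2}$. Thus, $c_2$ can be obtained via

$$\left(\Sigma_C - c_1 c_1^T\right) \begin{bmatrix} 0 \\ z_{2,2}\lambda_{2,2}^{-1/2} \\ 0 \end{bmatrix} = \left\{ \begin{bmatrix} 0 \\ c_{2,2} \\ \vdots \\ c_{p,2} \end{bmatrix} \begin{bmatrix} 0 & c_{2,2}^T & \cdots & c_{p,2}^T \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \Sigma_C^{(3)} \end{bmatrix} \right\} \begin{bmatrix} 0 \\ c_{2,2}\left(c_{2,2}^T c_{2,2}\right)^{-1} \\ 0 \end{bmatrix} = c_2$$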
The entire process can be repeated, each time deflating the latent covariance matrix with the most recently identified fault geometry vector, until $c_p$ is identified. At the final stage, C is completely determined.

In practice, $\Sigma_y$ will not be known, and the sample covariance matrix $\hat\Sigma_y$ must be used instead. We suggest using the preceding procedure with all quantities replaced by their estimates. Let $\hat p$ denote an estimate of p. Form $\hat Z_{\hat p}$ and $\hat\Lambda_{\hat p}$ from the $\hat p$ largest eigenvalues/eigenvectors of $\hat\Sigma_y$, and estimate $\sigma^2$ via

$$\hat\sigma^2 = (n - \hat p)^{-1} \sum_{i=\hat p+1}^{n} \hat\lambda_i \tag{8.6}$$

Anderson [5] has shown that Equation 8.6 is asymptotically the maximum likelihood estimate of $\sigma^2$. An estimate of $\Sigma_C$ would then be $\hat\Sigma_C = \hat Z_{\hat p}[\hat\Lambda_{\hat p} - \hat\sigma^2 I]\hat Z_{\hat p}^T$. At each stage, the estimate of $c_i$ is

$$\hat c_i = \left(\hat\Sigma_C - \sum_{j=0}^{i-1} \hat c_j \hat c_j^T\right) \begin{bmatrix} 0 \\ \hat z_{i,i}\hat\lambda_{i,i}^{-1/2} \\ 0 \end{bmatrix} \tag{8.7}$$

where $\hat c_0$ is taken to be the zero vector, and $\hat\lambda_{i,i}$ and $\hat z_{i,i}$ denote the dominant eigenvalue/eigenvector pair of the ni × ni block of the deflated latent covariance matrix, corresponding to the subgroup of measurements affected by only one of the remaining faults. The subgroups can be identified by the procedure given in Subsection 8.5.2.

Although the upper-left n1 × n1 submatrix of $\Sigma_C$ is exactly rank 1, the corresponding submatrix of $\hat\Sigma_C$ will not necessarily be so. Its rank will be close to 1, however, if $\hat\Sigma_C$ is sufficiently close to $\Sigma_C$. One eigenvalue of the upper-left n1 × n1 block of $\hat\Sigma_C$ will then be much larger than the others, and it should be clear how to
choose $\hat\lambda_{1,1}$. Likewise for $\hat\lambda_{i,i}$, i = 2, 3, …, $\hat p$. This issue is related to whether the subgroups affected by only one fault have been appropriately selected, and will be further discussed in Subsection 8.5.2.
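The estimation and deflation steps of this subsection can be sketched as follows. This is a minimal sketch assuming NumPy; the function names, the example matrix `C`, and the subgroup index lists are hypothetical, and the subgroups are taken as given rather than found by the procedure of Subsection 8.5.2. Because eigenvectors are determined only up to sign, each fault geometry vector is recovered up to sign.

```python
import numpy as np

def latent_covariance(S, p):
    """Estimate sigma^2 (Equation 8.6) and the latent covariance matrix Sigma_C."""
    eigvals, eigvecs = np.linalg.eigh(S)             # ascending order
    order = np.argsort(eigvals)[::-1]
    lam, Z = eigvals[order], eigvecs[:, order]
    sigma2 = lam[p:].mean()                          # average of the n - p smallest
    Sigma_C = Z[:, :p] @ np.diag(lam[:p] - sigma2) @ Z[:, :p].T
    return Sigma_C, sigma2

def fault_vectors(Sigma_C, subgroups):
    """Apply Equation 8.7 stage by stage, deflating after each fault."""
    n = Sigma_C.shape[0]
    deflated = Sigma_C.copy()
    columns = []
    for idx in subgroups:
        block = deflated[np.ix_(idx, idx)]           # n_i x n_i block of the subgroup
        w, v = np.linalg.eigh(block)
        lam_ii, z_ii = w[-1], v[:, -1]               # dominant eigenpair of the block
        selector = np.zeros(n)
        selector[idx] = z_ii / np.sqrt(lam_ii)
        c_i = deflated @ selector
        columns.append(c_i)
        deflated = deflated - np.outer(c_i, c_i)     # remove the identified fault
    return np.column_stack(columns)

# Hypothetical example: rows 0-1 see only fault 1; rows 2-3 see fault 2
# (row 2 also sees fault 1), which is the structure the method requires.
C = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.5, 1.0],
              [0.0, 2.0],
              [1.0, -1.0]])
S = C @ C.T + 0.1 * np.eye(5)                        # exact covariance, sigma^2 = 0.1
Sigma_C, sigma2_hat = latent_covariance(S, p=2)
C_hat = fault_vectors(Sigma_C, subgroups=[[0, 1], [2, 3]])
print(np.round(C_hat, 3), round(sigma2_hat, 3))
```

Here the exact covariance matrix is used, so the recovery is exact up to sign. With a sample covariance matrix the same steps apply, but the blocks are only approximately rank 1.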
8.5.2 IDENTIFYING SUBGROUPS

The accuracy in estimating the fault geometry vectors depends on, among other factors, whether the subgroups of measurements affected by only one of the remaining faults can be identified. This subsection discusses a simple procedure to accomplish this. If {y1, y2, …, yn1} is a subgroup affected by only one fault, the submatrix of $\Sigma_C$ corresponding to this subgroup is the rank-1 matrix $c_{1,1}c_{1,1}^T$. Thus, to identify the first subgroup it is necessary to find a set of measurements whose latent covariance matrix has rank 1. Define $R_C$ to be the latent correlation matrix associated with $\Sigma_C$, obtained in the usual way by scaling each element of $\Sigma_C$ by the square roots of the corresponding diagonal elements. A submatrix of $\Sigma_C$ will have rank 1 if and only if all elements of the corresponding submatrix of $R_C$ have unit magnitude. This follows from the fact that the covariance matrix of a set of random variables has rank 1 if and only if the variables are linearly related, which occurs if and only if their correlation coefficients are either 1 or –1. Consequently, finding a subgroup of measurements whose latent covariance matrix has rank 1 reduces to finding a subgroup whose latent correlation matrix has all elements either 1 or –1.

In the event that $\Sigma_C$ is known, this can be easily accomplished by inspection of the elements of $R_C$. If $\Sigma_C$ is estimated, no submatrix of $\hat R_C$ will have all unit magnitude elements with positive probability. Therefore, $\hat R_C$ must be searched for submatrices whose elements are all close to one in magnitude. If n is large, we recommend using a clustering technique with the magnitude of the latent correlation coefficients as the similarity measure. We have found agglomerative hierarchical methods with complete linkage to work well in practice. Refer to Everitt [13] for details. The result of the clustering will be a set of candidate subgroups, one of which must be selected as the first subgroup {y1, y2, …, yn1}.

To accomplish this, we recommend calculating the eigenvalues of the submatrices of $\hat\Sigma_C$ that correspond to each candidate subgroup. For notational convenience, denote the eigenvalues for a given subgroup (of size n1) as $\{\lambda_i\}_{i=1}^{n_1}$, arranged in descending order. We suggest choosing the subgroup that maximizes the criterion

$$\frac{\lambda_1}{(n_1 - 1)^{-1}\sum_{i=2}^{n_1} \lambda_i} \tag{8.8}$$

This criterion is the ratio of the largest eigenvalue to the average of the remaining eigenvalues, and helps to ensure that the rank of the corresponding submatrix of $\hat\Sigma_C$ is close to 1. After identifying the first subgroup, the first fault geometry vector is estimated and $\hat\Sigma_C$ is deflated as described in Subsection 8.5.1. The preceding
procedure can then be repeated on the deflated latent covariance matrix to identify the second subgroup, and so on.
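The selection step can be sketched as a small scoring function. The sketch below is illustrative only: the function names and the correlation threshold are hypothetical, and the clustering step that produces the candidate subgroups is not shown. The criterion is computed as described in the text, i.e., the largest eigenvalue divided by the average of the remaining ones.

```python
def rank_one_criterion(eigvals):
    """Largest eigenvalue divided by the average of the remaining eigenvalues.

    Assumes at least two eigenvalues and a nonzero sum of the remaining ones.
    """
    lam = sorted(eigvals, reverse=True)
    rest = lam[1:]
    return lam[0] / (sum(rest) / len(rest))

def all_near_unit(R, threshold=0.98):
    """True if every latent correlation in the submatrix is close to 1 in magnitude."""
    return all(abs(r) >= threshold for row in R for r in row)

def best_subgroup(candidates):
    """Pick the candidate name whose eigenvalues maximize the criterion.

    `candidates` maps a subgroup name to the eigenvalues of its submatrix.
    """
    return max(candidates, key=lambda name: rank_one_criterion(candidates[name]))

# Two hypothetical candidates: one nearly rank 1, one clearly not
candidates = {
    "A": [0.2873, 0.0008, 0.0004, 0.0004, 0.0, 0.0, 0.0, 0.0],
    "B": [0.30, 0.10, 0.05, 0.01],
}
score_a = rank_one_criterion(candidates["A"])
chosen = best_subgroup(candidates)
print(round(score_a, 1), chosen)
```

Candidate "A" has one eigenvalue that dwarfs the rest and scores in the thousands, whereas candidate "B" scores in the single digits, so "A" is selected.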
8.5.3 FAULT INTERPRETATION AND ILLUSTRATIVE EXAMPLE

The estimated fault geometry vectors can provide powerful diagnostic tools for identifying root causes of process variability, as illustrated in the following example. Consider again the measurement layout on the BIW described in Section 8.2 and shown in Figure 8.2. The sample consists of N = 200 auto bodies, produced and measured over a 4-hr period. The measurements are in units of millimeters. All three methods discussed in Section 8.4 gave $\hat p$ = 4. $\hat\Sigma_C$ was formed from the four dominant eigenvalue/eigenvector pairs, with $\hat\sigma^2$ = 0.040 estimated from the remaining 72 eigenvalues. Using the clustering procedure outlined in Subsection 8.5.2, a number of candidate subgroups were found. The subgroup {11X, 13X, 15X, 16X, 17X, 18X, 24X, 25X} maximized Equation 8.8 and was selected to be the first subgroup. The eigenvalues of the corresponding submatrix of $\hat\Sigma_C$ were {0.2873, 0.0008, 0.0004, 0.0004, 0, 0, 0, 0}, and the corresponding submatrix of $\hat R_C$ was

1      .994   .997   .991   .995   .985   .993   .998
.994   1      .994   .985   .997   .982   .998   .991
.997   .994   1      .996   .998   .994   .997   .999
.991   .985   .996   1      .996   .989   .993   .995
.995   .997   .998   .996   1      .987   .999   .995
.985   .982   .994   .989   .987   1      .989   .994
.993   .998   .997   .993   .999   .989   1      .994
.998   .991   .999   .995   .995   .994   .994   1
Thus, it appears that subgroup one is affected by only a single fault. $\hat c_1$, estimated via Equation 8.7, is shown graphically in Figure 8.3. The length of each arrow is the Six Sigma value, due to fault 1, for that coordinate, i.e., the arrow length is 6 times the magnitude of the corresponding element of $\hat c_1$. To make the plot less cluttered, an arrow was omitted if the Six Sigma level was less than 0.25 mm (deemed insignificant from a practical viewpoint). Fault 1 appears to affect only fore/aft direction coordinates, except for a relatively minor effect on point 1Y. Moreover, it appears to cause all points on the vehicle to move by approximately the same amount in the fore/aft direction. In other words, the entire vehicle is translating in the fore/aft direction. Because the fault does not appear to cause the auto body panels to vary relative to each other, it was suspected that the cause was measurement error. Investigation quickly revealed the following. In the measurement station, the vehicle's position is fixed via a pin/hole/slot combination in the underbody, as described in Section 8.2. The same underbody hole is used to locate the vehicle throughout the assembly process, prior to measurement. Through repeated use, the hole in each underbody was significantly enlarged in the fore/aft direction. This allowed the vehicles to translate in the fore/aft direction when placed into the measurement station fixture, resulting in the measurement error
FIGURE 8.3 Graphical illustration of $\hat c_1$ for the example. (Scale markers in the figure: 1.0 mm.)
represented by fault 1. The problem was solved by changing the design of the pin/hole.

After estimating the first fault, $\hat\Sigma_C$ was deflated and the procedure was repeated. The clustering procedure produced as a second subgroup {1X, 2X, 3X, 4X, 5X, 6X, 7X, 8X, 9X}. The eigenvalues of the corresponding submatrix of the deflated latent covariance matrix were {0.5205, 0.0018, 0.0008, 0, 0, 0, 0, 0, 0}, and the corresponding correlation matrix contained elements mostly greater than 0.99 in magnitude. $\hat c_2$, estimated via Equation 8.7, is shown graphically in Figure 8.4. It appears the second fault also affects only fore/aft direction points. In contrast to fault 1, fault 2 only affects points on the right body side, causing each point to translate by approximately the same amount. Thus, it appears fault 2 results in the right body side being incorrectly positioned with respect to the rest of the vehicle, identical to the variation pattern described in Section 8.2. From this it was suspected that, in the station in which the body sides are joined to the roof and underbody, the pin that locates the right body side in the fore/aft direction was loose or worn. Investigation revealed that it was loose.

The latent covariance matrix was deflated again, and the procedure was repeated twice more to estimate the two remaining fault geometry vectors. Although not shown, faults three and four affected the y-direction measurements predominantly and also had clear explanations. One of the faults was because of a misaligned robotic weld gun. Together, the four fault geometry vectors resulted in a C matrix with the structure of Equation 8.2.

These results can be compared with standard PCA, in which the four dominant eigenvectors of $\hat\Sigma_y$ are directly plotted. Two of the eigenvectors, $\hat z_1$ and $\hat z_3$, are shown in Figure 8.5 and Figure 8.6.
FIGURE 8.4 Graphical illustration of $\hat c_2$ for the example. (Scale markers in the figure: 1.0 mm.)

FIGURE 8.5 Graphical illustration of $\hat z_1$ in the example. (Scale markers in the figure: 1.0 mm.)
Interpretation of $\hat z_1$ could possibly have led to the same conclusion as $\hat c_1$, although $\hat z_1$ makes it appear that the fault causes the right body side to translate by larger amounts than the rest of the vehicle. $\hat c_1$ more clearly represents the actual fault. $\hat z_3$ would be difficult to interpret, appearing that the fault causes the right and left sides of the vehicle to move in opposite directions.

We point out that $\hat C$ in this example can be written in the form $\hat Z_{\hat p}[\hat\Lambda_{\hat p} - \hat\sigma^2 I]^{1/2}Q$, where Q is a 4 × 4 orthogonal matrix. This, in fact, must be the case, because the
FIGURE 8.6 Graphical illustration of $\hat z_3$ in the example. (Scale markers in the figure: 1.0 mm.)
method of this chapter provides a $\hat C$ that preserves the latent covariance structure. Specifically, because of the way $\hat C$ is formed, $\hat C\hat C^T$ is always equal to $\hat\Sigma_C$. Thus, the method of this chapter represents a form of factor rotation, where the criterion for “best” interpretability is that the columns of $\hat C$ are the actual fault geometry vectors.

As in PCA, a breakdown of the percentage of total sample variation that is due to each fault can be easily obtained. The estimated variation due to the ith fault is $\hat c_i^T \hat c_i$. The total variation is the trace of $\hat\Sigma_y$, or, equivalently, the sum of its eigenvalues. For the preceding example, the total variation is 5.18 mm². The percentages due to fault 1 through fault 4 are 15.5%, 10.3%, 10.4%, and 5.6%, respectively. Together, the four faults account for 41.8% of the total sample variability.
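The variation breakdown is a one-line computation once the fault geometry vectors are available. The sketch below uses hypothetical vectors and a hypothetical total variation, since the estimated vectors of the example are not listed in the text; only the formula (squared norm of each vector divided by the trace of the sample covariance matrix) comes from the paragraph above.

```python
def variation_breakdown(fault_vectors, total_variation):
    """Percentage of total variation attributed to each fault: 100 * c_i^T c_i / trace."""
    return [100.0 * sum(x * x for x in c) / total_variation for c in fault_vectors]

# Hypothetical fault geometry vectors (mm) and total variation (mm^2)
c_hat = [
    [0.3, 0.4, 0.0],    # c_1^T c_1 = 0.25 mm^2
    [0.0, 0.0, 0.5],    # c_2^T c_2 = 0.25 mm^2
]
total_variation = 1.0
percentages = variation_breakdown(c_hat, total_variation)
print([round(v, 1) for v in percentages], round(sum(percentages), 1))
```

Read in the other direction, the 15.5% share reported for fault 1 in the example corresponds to roughly 0.155 × 5.18 ≈ 0.80 mm² of variation.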
8.6 STATISTICAL PROPERTIES

The statistical properties of the estimated fault geometry vectors are complicated, in general, and depend on a number of factors that include n, N, p, $\sigma^2$, and $\{c_i\}_{i=1}^{p}$. The fact that the estimates depend on which measurements are selected as the subgroups, which involves user subjectivity, further complicates the analysis. Some results can be obtained, however, in special situations. It is assumed that the data are normally distributed. A case of particular interest is where there is only one fault present. Suppose that this is the case and that the number of faults has been correctly estimated. Then $\hat\Sigma_C$ has rank 1, and

$$\hat c_1 = \left(\hat\lambda_1 - \hat\sigma^2\right)^{1/2} \hat z_1.$$
Although the distribution of eigenvalues and eigenvectors of a sample covariance matrix is, in general, complicated, Anderson [5] provides some asymptotic (for large N) results when the data are normally distributed. Using these results, the following asymptotic expression for the accuracy of $\hat c_1$ is derived in Appendix B of Reference 1.

$$\kappa \equiv E\left[\frac{\|\hat c_1 - c_1\|^2}{\|c_1\|^2}\right] \cong \frac{1}{N}\left[(n - 1)\gamma(1 + \gamma) + \frac{(1 + \gamma)^2}{2} + \frac{\gamma^2}{2(n - 1)}\right] \tag{8.9}$$

where $\gamma \equiv \sigma^2/c_1^T c_1$. γ can be viewed as the inverse of a “signal-to-noise ratio,” because $c_1^T c_1$ is the total variance due to the fault. For large n, the third term can be neglected, and Equation 8.9 reduces to

$$\kappa \cong \frac{1 + \gamma}{N}\left[n\gamma + \frac{1 - \gamma}{2}\right] \tag{8.10}$$
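The two accuracy expressions can be compared numerically with a short sketch. The formulas below are a reading of Equation 8.9 and Equation 8.10 (κ as a function of n, N, and γ), and the function names and parameter values are hypothetical.

```python
def kappa_full(n, N, gamma):
    """Asymptotic relative error of c_1-hat, following Equation 8.9."""
    return ((n - 1) * gamma * (1 + gamma)
            + (1 + gamma) ** 2 / 2
            + gamma ** 2 / (2 * (n - 1))) / N

def kappa_large_n(n, N, gamma):
    """Large-n approximation, following Equation 8.10."""
    return (1 + gamma) / N * (n * gamma + (1 - gamma) / 2)

# Hypothetical setting: n = 40 measurements, N = 100 samples, gamma = 0.1
n, N, gamma = 40, 100, 0.1
full = kappa_full(n, N, gamma)
approx = kappa_large_n(n, N, gamma)
neglected = gamma ** 2 / (2 * (n - 1) * N)   # the third term of Equation 8.9
print(full, approx, neglected)
```

For these values the two expressions differ only by the neglected third term, which is several orders of magnitude smaller than κ itself. Doubling N halves both values.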
Clearly, increasing sample size N reduces κ and improves accuracy. It can be verified that the partial derivative of Equation 8.10 with respect to γ is always positive. Consequently, holding n and N fixed and increasing the signal-to-noise ratio (decreasing γ) always reduces κ, which is intuitively obvious. The effects of n are less obvious. Equation 8.10 seems to imply that increasing n increases κ and decreases accuracy. γ, however, may also change with n, because $c_1$ will increase in dimension and $c_1^T c_1$ may change. As n increases, if the additional measurements are not affected by the fault, then γ remains constant, and κ does increase. Thus, adding irrelevant measurements will decrease accuracy. On the other hand, suppose the additional measurements are affected by the fault. In this case, it is convenient to define $\bar\gamma \equiv \sigma^2/(n^{-1} c_1^T c_1) = n\gamma$, which can be viewed as the average variance due to the noise, divided by the average variance due to the fault. As n increases, suppose the additional measurements are affected by the fault in such a way that $\bar\gamma$ remains constant. If we substitute $\gamma = \bar\gamma/n$ into Equation 8.10, it can also be verified that the partial derivative of κ with respect to n (holding $\bar\gamma$ and N fixed) is always negative for n > 1. Consequently, adding measurements that are affected by the fault (to the point that $\bar\gamma$ does not decrease) will reduce κ and improve accuracy.

The general case where p > 1 is more difficult to analyze. Suppose the MDL method, which is known to yield a consistent estimate of p, is used, so that asymptotically

$$\hat\Sigma_C = \hat Z_p[\hat\Lambda_p - \hat\sigma^2 I]\hat Z_p^T = \sum_{i=1}^{p} \left(\hat\lambda_i - \hat\sigma^2\right) \hat z_i \hat z_i^T$$
For the case that λ1 > λ2 > …> λp, the results of Anderson [5] apply directly, yielding for i = 1, 2, …, p,
$$\hat\lambda_i \sim N\!\left(\lambda_i, \frac{2\lambda_i^2}{N}\right), \qquad \hat\sigma^2 \sim N\!\left(\sigma^2, \frac{2\sigma^4}{(n - p)N}\right),$$

and

$$\hat z_i \sim N\!\left(z_i, \frac{\lambda_i}{N} \sum_{\substack{k=1 \\ k \ne i}}^{n} \frac{\lambda_k}{(\lambda_k - \lambda_i)^2}\, z_k z_k^T\right)$$

$\{\hat\lambda_i, \hat z_i\}_{i=1}^{p}$ and $\hat\sigma^2$ are clearly consistent estimates. Therefore, so is $\hat\Sigma_C$. Because $\{\hat c_i\}_{i=1}^{p}$ are calculated entirely from $\hat\Sigma_C$, they also are consistent estimates, provided the measurement subgroups have been correctly identified.

For the case where $\lambda_j = \lambda_{j+1}$ for some 1 ≤ j < p, Anderson's results do not apply. In fact, $z_j$ and $z_{j+1}$ are not even uniquely defined when their eigenvalue has multiplicity 2, although the projection operator onto their eigenspace is. Consider the extreme case where $\lambda_1 = \lambda_2 = \dots = \lambda_p \equiv \lambda$. Note that $Z_p Z_p^T$, the projection operator onto the eigenspace of λ, is well defined. Tyler [14] showed that

$$\sqrt{N}\left(\hat Z_p \hat Z_p^T - Z_p Z_p^T\right) \to \frac{1}{\lambda - \sigma^2}\left\{ Z_p Z_p^T B \left[I - Z_p Z_p^T\right] + \left[I - Z_p Z_p^T\right] B\, Z_p Z_p^T \right\}$$

where the convergence is in distribution as N → ∞. Here, B is an n × n multivariate normal matrix with zero mean and covariance given by $\mathrm{cov}([B]_{i,j}, [B]_{k,l}) = [\Sigma_y]_{i,k}[\Sigma_y]_{j,l} + [\Sigma_y]_{i,l}[\Sigma_y]_{j,k}$. Thus, $\hat Z_p \hat Z_p^T$ is a consistent estimate of $Z_p Z_p^T$. Because $\hat\Lambda_p$ and $\hat\sigma^2$ are consistent estimates of $\Lambda_p = \lambda I$ and $\sigma^2$, respectively, $\hat\Sigma_C$ is also consistent. Similar arguments can be applied when some, but not all, of $\{\lambda_i\}_{i=1}^{p}$ are equal.
8.7 SUMMARY

The applicability of the method depends predominantly on whether the linear (or linearized) model of Equation 8.1, with the assumed structure of Equation 8.2 for C, adequately represents the effects of process faults. Ultimately, this depends on the underlying physics of the process and the faults, although we believe the model provides a reasonable representation of manufacturing processes. Additionally, the method requires some degree of subjective judgment by the user in identifying the measurement subgroups. The development of more “black-box” methods that remove the subjectivity and relax the model assumptions would be a valuable contribution.
8.8 EXERCISES

1. Assume eight quality variables are measured from an assembly process. The sample covariance matrix obtained from data of 150 observations is as follows. Determine the number of process faults with a scree plot.

\[
S = \begin{bmatrix}
0.1553 & 0.0068 & 0.1535 & 0.0024 & 0.1549 & 0.0106 & 0.15377 & 0.0184 \\
0.0068 & 0.0183 & 0.0009 & 0.0011 & 0.0056 & 0.02988 & 0.0024 & 0.0534 \\
0.1535 & 0.0009 & 0.1541 & 0.0020 & 0.15388 & 0.0008 & 0.1536 & 0.0009 \\
0.0024 & 0.0011 & 0.0020 & 0.00033 & 0.0023 & 0.0018 & 0.0021 & 0.0033 \\
0.1549 & 0.0056 & 0.15388 & 0.0023 & 0.1551 & 0.0087 & 0.1538 & 0.0149 \\
0.0106 & 0.02988 & 0.0008 & 0.0018 & 0.0087 & 0.0494 & 0.0034 & 0.0880 \\
0.15377 & 0.0024 & 0.1536 & 0.0021 & 0.1538 & 0.0034 & 0.1536 & 0.00544 \\
0.0184 & 0.0534 & 0.0009 & 0.0033 & 0.0149 & 0.0880 & 0.00544 & 0.1576
\end{bmatrix}
\]

2. For the same sample covariance matrix given in (1), determine the number of process faults with the likelihood ratio test method.
3. For the same sample covariance matrix given in (1), determine the number of process faults with the AIC and MDL criteria.
4. Derive Equation 8.7 based on the properties of eigen-decomposition.
5. For the same diagnosis problem described in Chapter 5, Problem 4:
   a. Derive the data covariance S or correlation R matrix.
   b. Derive the eigenvalue/eigenvector pairs from the matrix you selected in (a).
   c. Determine the number of common factors (or number of process faults).
   d. Estimate the fault geometry vectors for each of the process faults with the methodology introduced in this chapter.
   e. Compare the results from (d) with those from Problem 4 of Chapter 5.
8.9 APPENDIX 8.1: DISCUSSION OF THE DISTRIBUTIONS OF TEST STATISTICS

In this appendix, it is proved that the distributions of $\Lambda(m)$, $\Lambda_{mod}(m)$, AIC($m$) and MDL($m$) depend only on $n$, $N$, and $\{\lambda_i/\sigma^2\}_{i=1}^{p}$, and not on the individual eigenvectors. Assume $y$ is Gaussian. Consider the eigenvector decomposition of $\Sigma_y$ in Equation 8.4. Expanding $y$ in terms of its principal components, it follows that $y$ is distributed as $Ze$, where $e$ follows a multivariate normal distribution with covariance matrix $\Lambda$, and $Z = [z_1, z_2, \dots, z_n]$. Then,

\[
\hat{\Sigma}_y \equiv \frac{1}{N-1}\sum_{i=1}^{N}(y_i-\bar{y})(y_i-\bar{y})^T
= Z\left[\frac{1}{N-1}\sum_{i=1}^{N}(e_i-\bar{e})(e_i-\bar{e})^T\right]Z^T
\]
where $\bar{y}$ and $\bar{e}$ are the sample averages for $y$ and $e$, respectively. Because $Z$ is an orthogonal matrix, the eigenvalues of $\hat{\Sigma}_y$ are the eigenvalues of the matrix in brackets, the distribution of which depends only on $n$, $N$, and $\{\lambda_i\}_{i=1}^{n}$, and not on $Z$. Now suppose that $y' \equiv a^{1/2}y$ for some positive constant $a$, so that the covariance matrix of $y'$ is $Z(a\Lambda)Z^T$. The eigenvalues of $\hat{\Sigma}_{y'}$ are distributed as the eigenvalues of $\hat{\Sigma}_y$ multiplied by the constant $a$. By inspection of each of the four test statistics, it is clear that the constant drops out of the distribution. Consequently, if all eigenvalues of $\Sigma_y$ are divided by a constant, say $\sigma^2$, the distribution of each of the test statistics remains unchanged. Therefore, the distribution of each test statistic, which is a function of the eigenvalues of $\hat{\Sigma}_y$, depends only on $n$, $N$, and $\{\lambda_i/\sigma^2\}_{i=1}^{n}$. Noting that $\lambda_i/\sigma^2 = 1$ for $i = p+1, p+2, \dots, n$ completes the proof.
References

1. Apley, D. and Shi, J., A factor-analysis method for diagnosing variability in multivariate manufacturing processes, Technometrics, 43, 84, 2001.
2. Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, 4th ed., Prentice Hall, Upper Saddle River, NJ, 1998.
3. Apley, D.W. and Shi, J., Diagnosis of multiple fixture faults in panel assembly, ASME Journal of Manufacturing Science and Engineering, 120, 793, 1998.
4. Jackson, J.E., Principal components and factor analysis: part II - additional topics related to principal components, Journal of Quality Technology, 13, 46, 1981.
5. Anderson, T.W., Asymptotic theory for principal components analysis, The Annals of Mathematical Statistics, 34, 122, 1963.
6. Lawley, D.N., Tests of significance for the latent roots of covariance and correlation matrices, Biometrika, 43, 128, 1956.
7. Akaike, H., Autoregressive model fitting for control, Annals of the Institute of Statistical Mathematics, 23, 163, 1971.
8. Schwarz, G., Estimating the dimension of a model, The Annals of Statistics, 6, 461, 1978.
9. Rissanen, J., Modeling by shortest description length, Automatica, 14, 465, 1978.
10. Akaike, H., Factor analysis and AIC, Psychometrika, 52, 317, 1987.
11. Wax, M. and Kailath, T., Detection of signals by information theoretic criteria, IEEE Transactions on Acoustics, Speech, and Signal Processing, 33, 387, 1985.
12. Basilevsky, A., Statistical Factor Analysis and Related Methods, John Wiley & Sons, New York, 1994.
13. Everitt, B.S., Cluster Analysis, 3rd ed., Edward Arnold, London, 1993.
14. Tyler, D.E., Asymptotic inference for eigenvectors, The Annals of Statistics, 9, 725, 1981.
Part III Variation Source Diagnosis
9
Diagnosability Analysis for Variation Source Identification*
9.1 MOTIVATION AND FORMULATION OF DIAGNOSABILITY STUDY

The state space model developed in Chapter 6 and Chapter 7 links process variation sources with product quality. Based on this quantitative model, variation sources in the process can be identified. However, before discussing variation source diagnosis, the diagnosability aspect needs to be thoroughly examined. The issue of diagnosability refers to the problem of whether product measurements contain sufficient information for the diagnosis of critical process faults, i.e., if root causes of process faults can be diagnosed. This problem has significant engineering relevance.

Let us begin by revisiting the simple machining process shown in Figure 7.4, in which the perpendicularity problem of the hole is caused by the milling operation. The same process is shown again in Figure 9.1; here the milling operation is fine but the fixture in (b) has an error. Clearly, the fixture error will cause the same perpendicularity problem in the hole as shown in Figure 7.4. This implies that we cannot distinguish the datum error in Figure 7.4 and the fixture error in Figure 9.1 if only the hole orientation is measured.

The diagnosability issue is particularly relevant for a multistage manufacturing system. First, it is challenging to evaluate diagnosability in a multistage system. As in Figure 9.1, the quality characteristic of the hole at the second station can be affected by either the locators on the second station or the machining errors on the first station. It is not obvious what kind of information can be obtained when the dimensional quality of the workpiece is measured. Overall, are all process faults diagnosable, given current measurements of product features? If not, what is the “aliasing” structure among the coupled process faults? Second, even if it is technically feasible, it is not cost-effective to install sensors or probes at every intermediate manufacturing stage.
Therefore, the quantitative performance evaluation of a gauging strategy is very important. There is a great need for a systematic methodology that answers these questions.

The diagnosability analysis is based on a fault-quality diagnostic model that links process faults and product quality measurements. As presented in Chapter 6 to Chapter 8, this fault-quality model can be obtained through engineering analysis of the process or through data-driven techniques. In general, the fault-quality model for multistage processes can be expressed in a state space model form as

* Part of the chapter material is based on Reference 9 (pp. 312–325).
FIGURE 9.1 A machining process with fixture error. [Figure: three panels (a), (b), and (c) of the workpiece carrying the labels C and D; the label “Fixture Error” marks the fixture.]
\[
x_k = A_{k-1}x_{k-1} + B_k u_k + w_k \quad \text{and} \quad y_k = C_k x_k + v_k \tag{9.1}
\]
where $k = 1, 2, \dots, N$; $A_{k-1}x_{k-1}$ represents the transformation of quality information from stage $k-1$ to stage $k$; $B_k u_k$ represents the product quality as it is affected by the process faults at stage $k$; and $C_k$ is the observation matrix that maps process states to measurements. System matrices $A_k$, $B_k$, and $C_k$ are constant matrices. They are determined by the process or product design information. The state space model can be transformed into a general mixed linear model as follows. First, it can be written in an input-output format as:

\[
y_k = \sum_{i=1}^{k} C_k\Phi_{k,i}B_i u_i + C_k\Phi_{k,0}x_0 + \sum_{i=1}^{k} C_k\Phi_{k,i}w_i + v_k \tag{9.2}
\]
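where $\Phi_{k,i} = A_{k-1}A_{k-2}\cdots A_i$ for $k > i$ and $\Phi_{k,k} = I_{n_x}$, where $n_x$ is the dimension of $x_k$. The quality characteristics $x_0$ correspond to the initial condition of the product before it goes into the manufacturing line. If the measurement of $x_0$ is available, $C_k\Phi_{k,0}x_0$ can be moved to the left side of Equation 9.2 and the difference $y_k - C_k\Phi_{k,0}x_0$ can then be treated as a new measurement. If the measurement of $x_0$ is not available, we can treat it as an additional process fault input. Without loss of generality, we set $x_0$ to 0. Define $\mu_k$ as the expectation of $u_k$ and $\tilde{u}_k = u_k - \mu_k$. Combining all available measurements from station 1 to station $N$, we have

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}
= \Gamma \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_N \end{bmatrix}
+ \Gamma \begin{bmatrix} \tilde{u}_1 \\ \tilde{u}_2 \\ \vdots \\ \tilde{u}_N \end{bmatrix}
+ \Psi \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}
+ \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix} \tag{9.3}
\]

where

\[
\Gamma = \begin{bmatrix}
C_1B_1 & 0 & \cdots & 0 \\
C_2\Phi_{2,1}B_1 & C_2B_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
C_N\Phi_{N,1}B_1 & C_N\Phi_{N,2}B_2 & \cdots & C_NB_N
\end{bmatrix}, \qquad
\Psi = \begin{bmatrix}
C_1 & 0 & \cdots & 0 \\
C_2\Phi_{2,1} & C_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
C_N\Phi_{N,1} & C_N\Phi_{N,2} & \cdots & C_N
\end{bmatrix},
\]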
$\mu_k$ is an unknown constant vector, and $\tilde{u}_k$, $w_k$, $v_k$ are zero mean random vectors. As $y_k$ is not necessarily available at every stage, if no measurement is available at station $k$, the corresponding rows should be eliminated. For the sake of simplicity, we can define $y$ as $[y_1^T\ y_2^T\ \cdots\ y_N^T]^T$, and $\mu$, $\tilde{u}$, $v$, and $w$ are defined in a similar way. Then Equation 9.3 can be rewritten as:

\[
y = \Gamma\mu + \Gamma\tilde{u} + \Psi w + v \tag{9.4}
\]
Let $P$ denote the total number of potential faults (i.e., the dimension of $\mu$) and $Q$ denote the number of system noises (i.e., the dimension of $w$) considered at all the stages. We use the lowercase $u_i$, $i = 1, \dots, P$, to represent the $i$th element of the vector $u$, and the variance of $u_i$ is denoted as $\sigma^2_{u_i}$. Similarly, we use the lowercase $w_i$ to represent the $i$th element of the vector $w$, and the variance of $w_i$ is denoted as $\sigma^2_{w_i}$. With this notation, the variance components of process faults and system noises, and the variance of measurement noises are represented by $\{\sigma^2_{u_i}\}_{i=1\ldots P}$, $\{\sigma^2_{w_i}\}_{i=1\ldots Q}$, and $\sigma^2_v$, respectively. During production, multiple samples of the product are available at each stage. Assume we have $n$ samples that can be stacked up as

\[
y_a = (1_n \otimes \Gamma)\,\mu + (I_n \otimes \Gamma)\,\tilde{u}_a + (I_n \otimes \Psi)\,w_a + v_a \tag{9.5}
\]
where $y_a$ is a stack of the $n$ samples of $y$, i.e., $y_a = [y(1)^T\ \cdots\ y(n)^T]^T$, and $y(i)$ is the $i$th sample of $y$. The vectors $\tilde{u}_a$, $w_a$, and $v_a$ are defined in a similar manner as $y_a$, $\otimes$ is the Kronecker matrix product [1], and $1_n$ is the summing vector whose $n$ elements equal unity. The process faults manifest themselves as the mean deviation and variance of process variation sources. The diagnosability problem can then be restated: From $n$ samples, can we identify the values of $\{\mu_i\}_{i=1\ldots P}$ and $\{\sigma^2_{u_i}\}_{i=1\ldots P}$?
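The assembly of Γ and Ψ in Equation 9.3 can be sketched in a few lines of plain Python. This is only an illustration: the two-stage process, all numeric values, and the helper names (`matmul`, `phi`, `build_gamma_psi`) are assumptions made here and do not come from the text.

```python
# Sketch: assemble Gamma and Psi (Equation 9.3) from stage matrices.
# All numbers are made up. Dicts are keyed by stage number, so A[j] is A_j.

def matmul(X, Y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(X[r][t] * Y[t][c] for t in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def identity(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]

def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]

def hstack(blocks):
    """Concatenate matrices with equal row counts side by side."""
    return [sum((blk[r] for blk in blocks), []) for r in range(len(blocks[0]))]

def phi(A, k, i, nx):
    """Phi_{k,i} = A_{k-1} ... A_i for k > i, and the identity for k == i."""
    out = identity(nx)
    for j in range(i, k):
        out = matmul(A[j], out)
    return out

def build_gamma_psi(A, B, C, N, nx):
    gamma, psi = [], []
    for k in range(1, N + 1):
        g_blocks, p_blocks = [], []
        for i in range(1, N + 1):
            if i <= k:
                c_phi = matmul(C[k], phi(A, k, i, nx))
                g_blocks.append(matmul(c_phi, B[i]))
                p_blocks.append(c_phi)
            else:  # later stages cannot affect the measurements at stage k
                g_blocks.append(zeros(len(C[k]), len(B[i][0])))
                p_blocks.append(zeros(len(C[k]), nx))
        gamma.extend(hstack(g_blocks))
        psi.extend(hstack(p_blocks))
    return gamma, psi

# Hypothetical two-stage process: two-dimensional state, one fault per stage.
A = {1: [[1, 0], [1, 1]]}                # A_1 carries stage-1 errors into stage 2
B = {1: [[1], [0]], 2: [[0], [1]]}
C = {1: [[1, 0]], 2: [[0, 1]]}
gamma, psi = build_gamma_psi(A, B, C, N=2, nx=2)
print(gamma)   # [[1, 0], [1, 1]]
print(psi)     # [[1, 0, 0, 0], [1, 1, 0, 1]]

# If only the final stage is measured, the rows of stage 1 are eliminated.
gamma_end = gamma[1:]
print(gamma_end)   # [[1, 1]]: both faults leave the same signature

# E(y_a) = (1_n (x) Gamma) mu is simply n stacked copies of Gamma mu (here n = 3).
mu = [[2], [3]]
mean_stack = matmul(gamma, mu) * 3
```

With measurements at both stages the two columns of Γ differ, whereas with end-of-line measurement only they coincide, which is the kind of coupling the diagnosability analysis is meant to expose.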
9.2 DEFINITIONS OF DIAGNOSABILITY

There is currently limited reported research on diagnosability of variation root causes in multistage manufacturing processes. Ding et al. [2] conducted a preliminary study. They proposed the concepts of within-station diagnosability and between-station diagnosability to distinguish the local process faults and propagated process faults. The diagnosability condition listed in their work is a special case of the diagnosability analysis presented here.
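FIGURE 9.2 Model comparison. [Figure: the fault-quality model, with terms $y$, $\mu$, $\tilde{u}$, $w$, and $v$, is matched against the general mixed linear model $y = X\beta + Zu + e$. System matrices correspond to design matrices, the mean of faults to fixed effects, fault variations and model noise to random effects, and measurement error to residual error.]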
In this chapter, the diagnosability problem is investigated under the framework of variance components analysis (VCA). The model in Equation 9.5 fits a general mixed linear model given by Rao and Kleffe (1988) as:

\[
y = X\beta + Zu + e \tag{9.6}
\]
where $y$ is the observation vector; $X$ is a known constant matrix; $\beta$ is a vector of unknown constants; $Z$ is a known constant matrix; $u$ is a vector of independent variables with zero mean and unknown variances; and $e$ is a vector of independent variables with zero mean and unknown variance $\sigma^2_e$. The unknown variances of $u$ and $e$ are called “variance components.” A mixed model is used to describe both fixed and random effects. This model is often applied to biological and agricultural data. In designed experiments, the matrices $X$ and $Z$ are determined by designers. They often contain only 0 or 1, depending upon whether the relevant effect contributes to the measurement. Given a mixed model, researchers are primarily interested in estimating the fixed effects and variance components. A large body of literature about VCA is available. Excellent overviews can be found in Rao and Kleffe [3] and Searle et al. [4].

We can establish a one-to-one correspondence between terms in our fault-quality model, Equation 9.4, and those in the mixed model, Equation 9.6, as shown in Figure 9.2. In our fault diagnosis problem, however, the matrices $X$ and $Z$ are computed from system matrices $A_k$, $B_k$, and $C_k$, $k = 1, \dots, N$, which are determined by process design information and measurement deployment information. The fixed effects are the mean values ($\mu$) of process faults, and the random effects are the process faults and the process noises, $\tilde{u}$, $w$, and $v$. Fault diagnosis is thus equivalent to the problem of variance components estimation. The definition of diagnosability in this chapter follows the same concept of identifiability in VCA [3]. The term “diagnosability” is used because it is more relevant in the context of our engineering applications. Based on Equation 9.5, we have

\[
E(y_a) = \left[\Gamma^T\ \cdots\ \Gamma^T\ \cdots\ \Gamma^T\right]^T \mu \tag{9.7}
\]
\[
\mathrm{Cov}(y_a) = F_1\sigma^2_{u_1} + \cdots + F_P\sigma^2_{u_P} + F_{P+1}\sigma^2_{w_1} + \cdots + F_{P+Q}\sigma^2_{w_Q} + F_{P+Q+1}\sigma^2_v \tag{9.8}
\]

where $E(\cdot)$ represents the expectation,

\[
F_i = \begin{cases}
I_n \otimes \left(\Gamma_{:i}\Gamma_{:i}^T\right) & \text{when } 1 \le i \le P \\
I_n \otimes \left(\Psi_{:(i-P)}\Psi_{:(i-P)}^T\right) & \text{when } P < i \le P+Q
\end{cases}
\]
where the subscript “$:i$” represents the $i$th column of the matrix, and $F_{P+Q+1}$ is an identity matrix with the appropriate dimension. Define $[\sigma^2_{u_1}\ \cdots\ \sigma^2_{u_P}\ \sigma^2_{w_1}\ \cdots\ \sigma^2_{w_Q}\ \sigma^2_v]^T$ in Equation 9.8 as $\theta$, $E_\mu$ as the space containing all possible values of $\mu$, and $E_S$ as the space containing all possible values of $\theta$ (in the most general case, $E_\mu$ is $\Re^{P\times 1}$ and $E_S$ is the set of $(P+Q+1)\times 1$ vectors with nonnegative real entries).

Definition 9.1: In the model of Equation 9.5, a linear parametric function $p^T\alpha$, $p \in \Re^{P\times 1}$, $\alpha \in E_\mu$, is said to be diagnosable if, $\forall\, \alpha_1, \alpha_2 \in E_\mu$,

\[
p^T\alpha_1 \neq p^T\alpha_2 \ \Rightarrow\ E(y_a)\big|_{\mu=\alpha_1} \neq E(y_a)\big|_{\mu=\alpha_2} \tag{9.9}
\]

A linear parametric function $f^T\theta$, $f \in \Re^{(P+Q+1)\times 1}$, $\theta \in E_S$, is said to be diagnosable if, $\forall\, \theta_1, \theta_2 \in E_S$,

\[
f^T\theta_1 \neq f^T\theta_2 \ \Rightarrow\ \mathrm{Cov}(y_a)\big|_{\theta=\theta_1} \neq \mathrm{Cov}(y_a)\big|_{\theta=\theta_2} \tag{9.10}
\]
In the model of Equation 9.5, we are only concerned with the mean and variance of process faults. Therefore, only the first- and second-order moments are considered in the definition. The intuition behind this definition is that if any change in a linear combination of the means or the variances of the process faults produces a corresponding change in the mean or covariance of the observation $y_a$, then this fault combination is diagnosable. Clearly, this definition is independent of the estimation algorithms. This definition is also very flexible. By selecting different $p$ and $f$, the diagnosability of different fault combinations can be evaluated. For example, by selecting $p$ or $f = [1\ 0\ \cdots\ 0]^T$, we can check if the mean or variance of the first fault is diagnosable. If it is diagnosable, we say the mean or variance of this fault can be uniquely identified or diagnosed.
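A small worked example may make the definition concrete. It is constructed here for illustration and is not taken from the text: a single measurement is assumed to be affected identically by two faults, and the system noise $w$ is neglected.

```latex
\[
\Gamma = \begin{bmatrix} 1 & 1 \end{bmatrix}, \qquad
E(y) = \Gamma\mu = \mu_1 + \mu_2, \qquad
\mathrm{Cov}(y) = \sigma^2_{u_1} + \sigma^2_{u_2} + \sigma^2_v .
\]
\[
p = \begin{bmatrix} 1 \\ 1 \end{bmatrix}: \quad
p^T\alpha = \alpha_{(1)} + \alpha_{(2)} = E(y)\big|_{\mu=\alpha}
\ \Rightarrow\ p^T\mu \text{ is diagnosable.}
\]
\[
p = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\
\alpha_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\
\alpha_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}: \quad
p^T\alpha_1 = 1 \neq 0 = p^T\alpha_2
\ \text{ but } \
E(y)\big|_{\mu=\alpha_1} = E(y)\big|_{\mu=\alpha_2} = 1
\ \Rightarrow\ \mu_1 \text{ alone is not diagnosable.}
\]
```

Here $\alpha_{(1)}$ and $\alpha_{(2)}$ denote the two elements of $\alpha$. The same reasoning applies to the variances: in this example only the sum $\sigma^2_{u_1} + \sigma^2_{u_2} + \sigma^2_v$ affects $\mathrm{Cov}(y)$, so no single variance component can be diagnosed on its own.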
9.3 CRITERION OF FAULT DIAGNOSABILITY

From the preceding definition, the diagnosability of the system can be checked according to the system fault-quality model. Theorem 9.1 provides the fundamental results that can be used for checking.
Theorem 9.1: Define the range space of a matrix as $R(\cdot)$ and $D = [\Gamma\ \Psi]$. In the model of Equation 9.5,

1. $p^T\alpha$ is diagnosable if and only if $p \in R(\Gamma^T)$.
2. $f^T\theta$ is diagnosable if and only if $f \in R(H)$, where $H$ is symmetric and given as

\[
H = \begin{bmatrix}
(D_{:1}^T D_{:1})^2 & \cdots & (D_{:1}^T D_{:i})^2 & \cdots & (D_{:1}^T D_{:(P+Q)})^2 & D_{:1}^T D_{:1} \\
\vdots & & \vdots & & \vdots & \vdots \\
(D_{:i}^T D_{:1})^2 & \cdots & (D_{:i}^T D_{:i})^2 & \cdots & (D_{:i}^T D_{:(P+Q)})^2 & D_{:i}^T D_{:i} \\
\vdots & & \vdots & & \vdots & \vdots \\
(D_{:(P+Q)}^T D_{:1})^2 & \cdots & (D_{:(P+Q)}^T D_{:i})^2 & \cdots & (D_{:(P+Q)}^T D_{:(P+Q)})^2 & D_{:(P+Q)}^T D_{:(P+Q)} \\
D_{:1}^T D_{:1} & \cdots & D_{:i}^T D_{:i} & \cdots & D_{:(P+Q)}^T D_{:(P+Q)} & L
\end{bmatrix} \tag{9.11}
\]

where $L$ is the dimension of $[y_1^T\ y_2^T\ \cdots\ y_N^T]^T$ in Equation 9.3, i.e., $L = \sum_{k=1}^{N} q_k$, with $q_k$ the dimension of $y_k$.

Proof: To prove this theorem, we need the following results in Rao and Kleffe [3]. Consider a linear mixed model $Y = X\beta + \varepsilon$, where $\beta$ represents the fixed effects and $\varepsilon$ is zero mean with $\mathrm{Cov}(\varepsilon) = \theta_1V_1 + \cdots + \theta_rV_r$. Denote $\theta = [\theta_1\ \cdots\ \theta_r]^T$ as the variance components. Rao and Kleffe [3] prove that $p^T\beta$ is identifiable if and only if $p \in R(X^T)$, and $f^T\theta$ is identifiable if and only if $f \in R(H')$, where $H' = (\mathrm{tr}(V_iV_j))$, $1 \le i \le r$, $1 \le j \le r$. Based on this result, $p^T\alpha$ is diagnosable if and only if $p \in R([\Gamma^T\ \cdots\ \Gamma^T])$ in the model of Equation 9.5. It is clear that $R([\Gamma^T\ \cdots\ \Gamma^T]) = R(\Gamma^T)$. Therefore, part 1 holds. For part 2, $f^T\theta$ is diagnosable if and only if $f \in R(H')$, where $H' = (\mathrm{tr}(F_iF_j))$, $1 \le i \le P+Q+1$, $1 \le j \le P+Q+1$, and $F_i$ and $F_j$ are defined in Equation 9.8. It can be further shown that $H' = nH$. Because a constant coefficient does not affect the range space of a matrix, part 2 of Theorem 9.1 follows. Q.E.D.

Theorem 9.1 gives us a powerful tool to test if some combinations of faults are diagnosable. From Theorem 9.1, it is clear that the means of all the faults are uniquely diagnosable if and only if $\Gamma^T$ is of full row rank (equivalently, $\Gamma$ has full column rank $P$). The variances of all the faults are uniquely diagnosable if and only if $H$ is of full rank. For the preceding criterion, the diagnosability of the variance of a process fault includes the effects of the modeling error $w$ and the observation noise $v$. This means that even if a fault can be distinguished from other faults, it could still be nonuniquely diagnosable if it is mixed up with the modeling error or the observation noise. In some cases, if the modeling error and the observation noise can be assumed small or their variance can be estimated from the normal working condition of a manufacturing process, we can ignore their effects when exploring the diagnosability of process faults.
The testing matrix is revised accordingly by reducing $\theta$ to include only $[\sigma^2_{u_1}\ \cdots\ \sigma^2_{u_P}]^T$ and reducing the $H$ matrix in Theorem 9.1 to $H_r$, where $H_r$ is a subblock of $H$, i.e.,

\[
H_r = \begin{bmatrix}
(\Gamma_{:1}^T\Gamma_{:1})^2 & \cdots & (\Gamma_{:1}^T\Gamma_{:i})^2 & \cdots & (\Gamma_{:1}^T\Gamma_{:P})^2 \\
\vdots & & \vdots & & \vdots \\
(\Gamma_{:i}^T\Gamma_{:1})^2 & \cdots & (\Gamma_{:i}^T\Gamma_{:i})^2 & \cdots & (\Gamma_{:i}^T\Gamma_{:P})^2 \\
\vdots & & \vdots & & \vdots \\
(\Gamma_{:P}^T\Gamma_{:1})^2 & \cdots & (\Gamma_{:P}^T\Gamma_{:i})^2 & \cdots & (\Gamma_{:P}^T\Gamma_{:P})^2
\end{bmatrix}. \tag{9.12}
\]
Under the condition that noises $w$ and $v$ are assumed negligible, the diagnosability matrix was defined in Ding, Shi, and Ceglarek [2] as $\pi(\Gamma)$, where $\pi(\cdot)$ is a matrix transformation defined in their paper. The variances of process faults are considered fully diagnosable if and only if $\pi(\Gamma)^T\pi(\Gamma)$ is of full rank. In fact, this condition is the same as what we derived here. It can be shown that $R(H_r) = R(\pi(\Gamma)^T\pi(\Gamma))$. Therefore, their work can be considered as a special case of the general framework presented in this chapter. If noise terms are not included, Ding et al. [2] showed that the mean being diagnosable is a sufficient condition for variance being diagnosable. However, the converse is not true. This is illustrated in the case study of a machining process in Section 9.6.
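The rank and range tests of Theorem 9.1 can be tried out with a short script. The following is a minimal sketch in plain Python using exact rational arithmetic; the matrices are made-up examples, and the helper names (`build_H`, `build_Hr`, `rank`, `in_range`) are hypothetical, not part of the source.

```python
from fractions import Fraction

def columns(M):
    return [list(col) for col in zip(*M)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_H(gamma, psi):
    """H of Equation 9.11 with D = [Gamma Psi]; last row/column belong to sigma_v^2."""
    d = columns(gamma) + columns(psi)
    H = [[dot(di, dj) ** 2 for dj in d] + [dot(di, di)] for di in d]
    H.append([dot(dj, dj) for dj in d] + [len(gamma)])   # len(gamma) is L
    return H

def build_Hr(gamma):
    """Reduced matrix H_r of Equation 9.12 (noise terms neglected)."""
    g = columns(gamma)
    return [[dot(gi, gj) ** 2 for gj in g] for gi in g]

def rank(M):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for c in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                f = rows[r][c] / rows[rk][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def in_range(M, f):
    """f lies in R(M) iff appending f as an extra column leaves the rank unchanged."""
    return rank([row + [fi] for row, fi in zip(M, f)]) == rank(M)

# Made-up example 1: one measurement, two faults with identical signatures.
gamma_end = [[1, 1]]
Hr = build_Hr(gamma_end)
print(rank(Hr))                                  # 1: variances not all diagnosable
print(in_range(Hr, [1, 1]))                      # True: the sum of variances is diagnosable
print(in_range(Hr, [1, 0]))                      # False: the first variance alone is not
print(in_range(columns(gamma_end), [1, 0]))      # False: p = [1 0]^T is not in R(Gamma^T)

# Made-up example 2: two measurements; noise neglected, then included.
gamma_all = [[1, 0], [1, 1]]
psi_all = [[1, 0, 0, 0], [1, 1, 0, 1]]
print(rank(build_Hr(gamma_all)))                 # 2: full rank when w and v are neglected
H = build_H(gamma_all, psi_all)
print(rank(H))                                   # 3, although H is 7 x 7
print(in_range(H, [1, 0, 0, 0, 0, 0, 0]))        # False
print(in_range(H, [1, 0, 1, 0, 0, 0, 0]))        # True
```

In the second example the first fault and the first noise term share the same column of $D$, so only the sum of their variances lies in $R(H)$. This mirrors the remark above that a fault which is distinguishable from other faults can still be mixed up with modeling error.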
9.4 MINIMAL DIAGNOSABLE CLASS

Theorem 9.1 alone is not very effective in analyzing a partially diagnosable system in which not all faults are diagnosable. It is not obvious from Theorem 9.1 which faults are mixed or coupled together. To analyze a partially diagnosable system, we propose the concept of a minimal diagnosable class. We will first introduce the concept of the diagnosable class and then present the definition of the minimal diagnosable class.

Definition 9.2: A nonempty set of $n$ faults $\{u_{i_1}, \dots, u_{i_n}\}$ forms a mean or variance diagnosable class if a nontrivial linear combination of their means $\{\mu_{i_1}, \dots, \mu_{i_n}\}$ or variances $\{\sigma^2_{i_1}, \dots, \sigma^2_{i_n}\}$ is diagnosable. “Nontrivial” means at least one coefficient of the linear combination is nonzero.

Definition 9.3: A nonempty set of $n$ faults $\{u_{i_1}, \dots, u_{i_n}\}$ forms a minimal mean or variance diagnosable class if it is a diagnosable class and no strict subset of $\{u_{i_1}, \dots, u_{i_n}\}$ is mean or variance diagnosable.

The diagnosability of the mean and variance can be dealt with separately, and the testing procedures are very similar (the only difference is the testing matrix; it is $\Gamma^T$ for mean and $H$ for variance). Hence, no distinction between mean or variance diagnosability will be made hereafter unless otherwise indicated. The minimal diagnosable classes expose the interrelationship between different faults. Intuitively, a minimal diagnosable class is a group of faults that cannot be further distinguished. Minimal diagnosable classes expose the “aliasing” structure of the nondiagnosable faults in a partially diagnosable system. A minimal diagnosable class represents a set of faults that are closely coupled together. We can only identify a linear combination of them, but we cannot identify any strict subset. With
this information, we can show the coupling relationship among faults and learn what additional information is needed to identify certain faults. We found that the minimal diagnosable class can be generated from the reduced row echelon form (RREF) (see Chapter 2) of the transpose of testing matrices. This result is stated in the following theorem:

Theorem 9.2: Given a testing matrix $G \in \Re^{n\times m}$ ($G$ is $\Gamma^T$ or $H$) and $n$ faults $\theta = [u_1\ \cdots\ u_n]^T$ corresponding to $G$, the fault set $\theta[v]$ is a minimal diagnosable class if $v$ is a nonzero row of the RREF of $G^T$, where $\theta[v]$ is a subset of $\theta$ such that $\theta(i)$ (the $i$th element of $\theta$) $\in \theta[v]$ if $v(i)$ (the $i$th element of $v$) $\neq 0$.

Proof: Denote the row and column space of a matrix as Row($\cdot$) and Col($\cdot$), respectively, the RREF of $G^T$ as $G_r^T$, and the nonzero row vectors of $G_r^T$ as $\{v_i\}_{i=1\ldots\rho}$, where $\rho$ is the rank of $G_r^T$. Noting that $G_r^T$ is unique and Row($G_r^T$) = Row($G^T$), we have $v_i \in$ Col($G$). Hence, $\theta[v_i]$ is a diagnosable class. We need to further prove that $\theta[v_i]$ is a minimal diagnosable class. From the algorithm to obtain the RREF, the leftmost nonzero element of $v_i$ is always a “leading 1.” The position of such a “leading 1” in $v_i$ is called the pivot position. Denote the set of all pivot positions contributed by the rows of $G_r^T$ as $\Xi$. It is known that: (i) given an $i \in \{1, \dots, \rho\}$, there is only one nonzero element in $\{v_i(j)\}_{j\in\Xi}$; (ii) if $\{c_i\}_{i=1\ldots n}$ are the columns of $G_r^T$, then there is only one nonzero element in $c_i$ if $i \in \Xi$. From (i), $\theta[v_i]$ must be of the form $\{u_{p_i}, u_{i_1}, \dots, u_{i_k}\}$, $p_i \in \Xi$, $i_1, \dots, i_k \notin \Xi$. Assuming that $\theta[v_i]$ is not a minimal diagnosable class, we can then find a vector $v_i'$ such that $\theta[v_i'] \subset \theta[v_i]$, $v_i' \in$ Row($G_r^T$), and $v_i'$ can thus be written as

\[
v_i' = \sum_{j=1}^{\rho} a_j v_j .
\]

However, from (ii), if there is a $j$ such that $a_j$ is nonzero, $u_{p_j}$ must be in $\theta[v_i']$, where $p_j$ is the pivot position of $v_j$. Because $\theta[v_i]$ only contains one fault at a pivot position, namely $p_i$, $a_i$ is the only possible nonzero coefficient. Then $\theta[v_i'] = \theta[v_i]$. This contradicts the assumption that $\theta[v_i'] \subset \theta[v_i]$, implying that $\theta[v_i]$ is a minimal diagnosable class. Q.E.D.

When the RREF of $G^T$ is calculated, we can obtain some of the minimal diagnosable classes. The following corollary shows that by rearranging the columns in $G^T$, we can obtain all the possible minimal diagnosable classes. The rearranging process is known as matrix permutation. The permuted matrix is defined as follows: if $\{c_i\}_{i=1\ldots n}$ denote the column vectors of $G^T$ and correspond to the faults $\theta = [u_1\ \cdots\ u_n]^T$, the columnwise permuted matrix $G'^T = [c_{i_1}\ \cdots\ c_{i_n}]$ is called the permuted matrix corresponding to the fault permutation $\theta' = [u_{i_1}\ \cdots\ u_{i_n}]^T$.

Corollary 9.1: Given a testing matrix $G \in \Re^{n\times m}$, if $\Theta = \{u_{i_1}, \dots, u_{i_s}\}$ is a minimal diagnosable class, then $\theta[v] = \Theta$, where $v$ is the last nonzero row of $G_r'^T$, and $G_r'^T$ is the RREF of the permuted matrix of $G^T$ corresponding to the fault permutation $[u_{i_{s+1}}\ \cdots\ u_{i_n}\ u_{i_1}\ \cdots\ u_{i_s}]$.

Proof: Denote $\{v_i\}_{i=1\ldots\rho}$ as the nonzero row vectors of $G_r'^T$. We want to prove that the pivot position of the last row $v_\rho$ must be $n - s + 1$ (this position corresponds to $u_{i_1}$).
First, suppose that the pivot position of $v_\rho$ is larger than $n - s + 1$. If so, $\theta[v_\rho] \subset \Theta$. According to Theorem 9.2, however, $\theta[v_\rho]$ is a diagnosable class. This contradicts the fact that $\Theta$ is minimal. Second, assume that the pivot position of $v_\rho$ is smaller than $n - s + 1$. If so, a fault among $\{u_{i_{s+1}}, \dots, u_{i_n}\}$ must belong to $\theta[v_\rho]$. Because the pivot position of $v_\rho$ is the largest among all the pivot positions of $\{v_i\}_{i=1\ldots\rho}$, given any vector $v_f = \sum_{j=1}^{\rho} a_j v_j$ (defined as an arbitrary nontrivial linear combination of $\{v_i\}_{i=1\ldots\rho}$), $\theta[v_f]$ contains at least one element among $\{u_{i_{s+1}}, \dots, u_{i_n}\}$. According to Theorem 9.1, any diagnosable class should contain at least one element among $\{u_{i_{s+1}}, \dots, u_{i_n}\}$ because $v_f$ is an arbitrary vector in Row($G_r'^T$). This contradicts the assertion that $\Theta = \{u_{i_1}, \dots, u_{i_s}\}$ is a minimal diagnosable class. Therefore, the pivot position of $v_\rho$ is at $n - s + 1$, i.e., $\theta[v_\rho] \subseteq \Theta$. Because $\theta[v_\rho]$ and $\Theta$ are both minimal, $\theta[v_\rho] = \Theta$. Q.E.D.

Corollary 9.1 indicates that a complete list of minimal diagnosable classes can be obtained by thoroughly permuting $G^T$. However, the number of permutations will rapidly become very large if the number of faults is large. To handle this problem, we need the concept of the “connected fault class.” Given the RREF of $G^T$, assume we can divide its nonzero rows into two sets of rows ($C_1$ and $C_2$) such that for any $v_i \in C_1$ and $v_j \in C_2$, $v_i * v_j = 0$, where $*$ is the Hadamard product [6]. In other words, $v_i$ does not share any common nonzero column positions with $v_j$. Define $\theta[C]$ as the fault set $\bigcup_k \theta[v_k]$ over all $v_k \in C$, where $C$ is a set of rows. We can show that for an arbitrary minimal diagnosable class $\theta[v]$, either $\theta[v] \subseteq \theta[C_1]$ or $\theta[v] \subseteq \theta[C_2]$. From Theorem 9.1, $v$ is in the space spanned by the rows of $G^T$. Thus, $v = a_1v_1 + a_2v_2$, where $v_1$ and $v_2$ are in the space spanned by the rows in $C_1$ and $C_2$, respectively.
However, if $a_1$ and $a_2$ are both nonzero, the fact that $v_i * v_j = 0$ and that $\theta[v_1]$ and $\theta[v_2]$ are both diagnosable will lead to the contradiction that $\theta[v]$ is not minimal. The implication is that the complete list of minimal diagnosable classes can be obtained by only permuting the faults within $\theta[C_1]$ and $\theta[C_2]$, respectively. Following the same rule, $C_1$ and $C_2$ can be further divided into smaller groups iteratively until they are no longer dividable. If $C_i$ is an undividable set of rows, $\theta[C_i]$ is called a connected fault class. Following a similar argument, we know that the complete list of minimal diagnosable classes can be obtained through permutations only within each connected fault class. If there are many small connected fault classes in the system, the computational load required to find all minimal diagnosable classes can be significantly reduced. The worst case is that all faults are connected in one big fault class. However, that is usually not a common situation. For instance, one principle in manufacturing process design is to reduce the accumulation and propagation chain of process faults [5]. For many actual engineering systems, the entire fault set can often be partitioned into much smaller connected fault subsets, as we will see in the case studies of Section 9.6.

In summary, the algorithm for obtaining all the minimal diagnosable classes is as follows:

1. Calculate the RREF of $G^T$.
2. Remove all the uniquely identifiable faults because each of them will form a minimal diagnosable class; remove the faults corresponding to zero columns because they are invisible to the measurement system and hence not diagnosable, and will not appear in any minimal diagnosable classes.
3. Find the connected fault classes based on the RREF of $G^T$.
4. Permute the columns within the connected fault classes and obtain the minimal diagnosable classes based on the permuted matrices until all possible permutations are visited.

The minimal diagnosable classes expose the “aliasing” structure among the faults in the system, revealing critical fault diagnosability information. For example, if a single fault forms a minimal diagnosable class, it is uniquely diagnosable. If a fault is not uniquely diagnosable and forms a minimal diagnosable class with several other faults, it can be identified when all other faults in that class are known. Thus, by looking at the minimal diagnosable classes, we can tell whether a fault can be identified from the measurements and, if not, what other faults need to be known to identify it. This is illustrated in Figure 9.3, where there are five faults in the system. Fault 3 is uniquely diagnosable, whereas faults 1 and 2, and faults 4 and 5 are coupled together.

FIGURE 9.3 Illustration of minimal diagnosable class. [Figure: five faults, numbered 1 to 5.]
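The procedure above can be sketched as a brute-force script. This is an illustration under simplifying assumptions, not the authors' implementation: it permutes all faults rather than only those within each connected fault class (steps 2 and 3 are skipped), so it is practical only for a handful of faults. The function names and the example matrix are hypothetical; the matrix is chosen so that the result matches the situation of Figure 9.3.

```python
from fractions import Fraction
from itertools import permutations

def rref(M):
    """Nonzero rows of the reduced row echelon form, using exact arithmetic."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for c in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        lead = rows[rk][c]
        rows[rk] = [x / lead for x in rows[rk]]          # make the pivot a leading 1
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                f = rows[r][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rows[:rk]

def minimal_classes(GT, names):
    """Collect theta[v] for every nonzero RREF row v of every column permutation of G^T.

    GT has one column per fault; names[j] labels the fault of column j.
    """
    n = len(names)
    found = set()
    for perm in permutations(range(n)):
        permuted = [[row[j] for j in perm] for row in GT]
        for v in rref(permuted):
            found.add(frozenset(names[perm[k]] for k in range(n) if v[k] != 0))
    return found

# Made-up G^T for five faults: u3 stands alone, u1/u2 and u4/u5 are coupled.
GT = [[1, 1, 1, 0, 0],
      [0, 0, 1, 0, 0],
      [1, 1, 0, 1, 1]]
classes = minimal_classes(GT, ["u1", "u2", "u3", "u4", "u5"])
print(sorted(sorted(c) for c in classes))
# [['u1', 'u2'], ['u3'], ['u4', 'u5']]
```

Restricting the permutations to each connected fault class, as described in the text, would reduce the number of RREF computations from $n!$ to the sum of the factorials of the class sizes.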
9.5 GAUGING SYSTEM EVALUATION BASED ON MINIMAL DIAGNOSABLE CLASS

To evaluate a gauging system, we may need several easy-to-interpret indices to characterize the information obtained through the system. We propose three criteria for evaluation of gauging systems: information quantity, information quality, and system flexibility.
9.5.1 INFORMATION QUANTITY

Information quantity refers to the level of knowledge regarding process faults we have obtained from the measurement data. When two gauging systems are used for the same manufacturing system, the number of potential faults is the same. However, for two different partially diagnosable systems, the number of faults that we need to know to ensure full diagnosability will often be different. This number can be used to quantify the amount of information obtained by different gauging systems. The following corollary indicates that the rank of the diagnosability testing matrix should be used to quantify the amount of measurement information.

Corollary 9.2: Given a testing matrix $G \in \Re^{n\times m}$ and $n$ faults $\theta = [u_1\ \cdots\ u_n]^T$ corresponding to $G$, if the rank of $G$ is $\rho$, then $n - \rho$ faults need to be known to uniquely identify all $n$ faults.
The proof is omitted here. It uses the property of the RREF of a matrix. An intuitive understanding of this corollary is given as follows. The solvability condition of a linear system Y = AX can be determined by analyzing the RREF of A. In such a linear system, n – ρ free variables need to be known before uniquely solving X, where n is the dimension of X and ρ = rank (A). If we consider the testing matrix G to be in a similar situation as matrix A, the result of Corollary 9.2 is not surprising.
9.5.2 INFORMATION QUALITY The second criterion is information quality. Even if two gauging systems provide the same amount of information as per the criterion developed earlier, the detailed information content could be very different. In practice, it is always desirable to have unique identification of a fault so that corrective action can immediately be undertaken to eliminate the fault and restore the system to its normal condition. The decision of corrective action cannot be made for a fault coupled with others without further investigation or measurement. Thus, we use the number of uniquely identifiable faults to benchmark the quality of measurement information. Uniquely identifiable faults can be easily found by counting the number of minimal diagnosable classes that contain only one single fault.
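Counting the single-fault classes is straightforward once the RREF is available. The following sketch assumes the RREF is already computed and uses a hypothetical five-fault matrix; `uniquely_identifiable` is an illustrative name.

```python
def uniquely_identifiable(rref_rows):
    """Return 1-based indices of faults whose RREF row has a single nonzero entry."""
    unique = []
    for row in rref_rows:
        support = [j + 1 for j, v in enumerate(row) if v != 0]
        if len(support) == 1:
            unique.append(support[0])
    return unique


# Hypothetical RREF with one column per fault (five faults).
RREF = [
    [1, 0.5, 0, 0, 0],   # faults 1 and 2 are coupled
    [0, 0, 1, 0, 0],     # fault 3 stands alone
    [0, 0, 0, 1, 2],     # faults 4 and 5 are coupled
]
print(uniquely_identifiable(RREF))  # [3]
```

The information-quality index of this toy system is therefore 1.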
9.5.3 SYSTEM FLEXIBILITY The third criterion is the flexibility provided by the current gauging system toward achieving full diagnosability. Some gauging systems could be rigid because certain faults or fault combinations, which may be difficult to measure in practice, have to be known to achieve a fully diagnosable system. Other gauging systems may provide information in a flexible manner, i.e., many fault combinations can be selected to make the system fully diagnosable. This comparison needs the concept of minimal complementary classes. A minimal complementary class is a minimal set of faults such that if they are known, all the faults of the system can be uniquely identified. Consider a system with four faults and three minimal diagnosable classes as {u1, u2}, {u1, u3, u4}, and {u2, u3, u4}. One can verify that the minimal complementary classes for this system are {u1, u3}, {u1, u4}, {u2, u3}, {u2, u4}, and {u3, u4}. The number of minimal complementary classes is five. A system with more minimal complementary classes is considered to be more flexible. In general, it is difficult to find the complete set of minimal complementary classes by simply trying out different fault combinations, especially for a complex system with a large number of faults and intricate fault combinations. Corollary 9.3 facilitates the determination of minimal complementary classes. Corollary 9.3: A set of faults forms a minimal complementary class if and only if the set contains n – ρ faults but does not contain any minimal diagnosable class, where n is the total number of faults and ρ is the rank of the diagnosability testing matrix. Proof: From Corollary 9.2, it is clear that a minimal complementary class should contain exactly n – ρ faults. Assume that a minimal complementary class contains a minimal diagnosable class that includes n1 faults. Because a minimal diagnosable class is diagnosable, we only need to know n1 – 1 faults in the minimal diagnosable
class to identify all the n1 faults. Then, the number of faults in the minimal complementary class can be reduced by 1. Thus, a fault class is a minimal complementary class only if it does not contain any minimal diagnosable class. Now we need to prove that if a fault class with n – ρ elements does not include any minimal diagnosable class, it is a minimal complementary class. Assume that a fault class {u_{i_1}, …, u_{i_{n–ρ}}} does not contain any minimal diagnosable class. Consider the RREF of the permuted matrix G′^T corresponding to the fault permutation i_{n–ρ+1}, …, i_n, i_1, …, i_{n–ρ}. As {u_{i_1}, …, u_{i_{n–ρ}}} does not include any minimal diagnosable class, the last n – ρ columns of the RREF should not include any pivot positions, according to Corollary 9.1. However, because there are a total of ρ pivot positions, every column among the first ρ columns of the RREF should contain only a “leading 1.” Hence, it is clear that all the faults can be uniquely identified if the n – ρ faults that correspond to the last n – ρ columns are known. Q.E.D.
With Corollary 9.3, all minimal complementary classes can be found through a search among all fault sets with n – ρ faults. If the entire fault set can be partitioned into many smaller distinct, connected fault classes, the computational load of searching for the complete set of minimal complementary classes can be substantially reduced. Corollary 9.3 can be applied to a connected fault class, but n should be the total number of faults in the connected fault class and ρ should be the rank of the space spanned by the associated row vectors in the RREF of the transpose of the testing matrix. Individual searches can be conducted within each connected fault class. The complete set of minimal complementary classes can then be obtained by joining the minimal complementary classes from each connected fault class and adding the nondiagnosable faults. An example will be given in Section 9.6 to illustrate this procedure.
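The search implied by Corollary 9.3 can be sketched as a brute-force enumeration. The example reuses the four-fault system of Section 9.5.3; the rank ρ = 2 is not stated in the text but is inferred from the fact that each listed complementary class has n – ρ = 2 faults. The function name is illustrative.

```python
from itertools import combinations


def minimal_complementary_classes(faults, rank, diagnosable_classes):
    """Enumerate fault sets of size n - rho that contain no minimal diagnosable class."""
    size = len(faults) - rank
    classes = [frozenset(c) for c in diagnosable_classes]
    return [
        set(candidate)
        for candidate in combinations(faults, size)
        if not any(c <= set(candidate) for c in classes)
    ]


faults = ["u1", "u2", "u3", "u4"]
mdc = [{"u1", "u2"}, {"u1", "u3", "u4"}, {"u2", "u3", "u4"}]
result = minimal_complementary_classes(faults, 2, mdc)  # rank 2 is inferred, see above
print(len(result))  # 5
```

Only the pair {u1, u2} is rejected, because it is itself a minimal diagnosable class; the remaining five pairs match the list given in the text.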
The order of using the three criteria generally depends on the requirements of individual applications. In some cases, when the ultimate goal is to design a gauging system providing full diagnosability, we can skip the second criterion and compare the number of minimal complementary classes directly. In other cases, the second criterion can be used before the first criterion if unique identification of faults is highly desired. Based on our experience, using the three criteria in the sequence in which they were presented here is an effective approach to gauging system evaluation in many industrial applications.
9.6 CASE STUDY
9.6.1 CASE STUDY OF A MULTISTAGE ASSEMBLY PROCESS
Let us consider the assembly processes shown in Figure 9.4. In this process, three stations are involved in assembling four parts (labeled 1, 2, 3, and 4, respectively, in Figure 9.4) and inspecting the assembly: part 1 and part 2 are assembled at station I; subassembly “1 + 2” is assembled with part 3 and part 4 at station II; and the final assembly with four parts is inspected at station III for surface finish, joint quality, and dimensional defects. Each part is restrained by a set of fixtures consisting of a 4-way locator, which controls motion in both x- and z-directions, and a 2-way locator, which controls motion only in the z-direction. A subassembly with several parts also needs a 4-way locator and a 2-way locator to completely control its degrees of freedom. The active locating points are marked as Pi, i = 1, ..., 8, in Figure 9.4.
FIGURE 9.4 A multistage assembly process: (a) Station I; (b) Station II; (c) Station III. Pi denotes locating points and Mi denotes measurement points; the legend distinguishes active and inactive 4-way and 2-way locators and measurement locations. A 4-way locator has 2 potential faults; a 2-way locator has 1 potential fault.
In this process, five coordinate sensors are installed on all three stations. Each coordinate sensor measures the position of a part feature, such as a corner, in two orthogonal directions (x and z). The measurement points are marked as {Mi, i = 1, …, 5} in Figure 9.4. There is another measurement point on station III, which is marked as M5′. This point represents another gauging strategy: the point M5′ is measured on station III instead of the point M5. Counting potential locator errors on all stations, we have a total of n = 18 potential faults, which are assigned serial numbers from 1 to 18, as shown in Figure 9.4. It is interesting to compare the diagnosability of these 18 faults under these two gauging strategies. The product state variable xk is denoted by random deviations associated with the degrees of freedom (d.o.f.’s) of each part. Each 2-D part in this example has three d.o.f.’s (two translational and one rotational) and the size of xk is 12 × 1, given that there are four parts. The state vector xk is expressed as
$$x_k = [\,\delta x_{1,k} \;\; \delta z_{1,k} \;\; \delta\alpha_{1,k} \;\; \cdots \;\; \delta x_{4,k} \;\; \delta z_{4,k} \;\; \delta\alpha_{4,k}\,]^T \qquad (9.13)$$
where δ is the deviation operator, and δx_{i,k}, δz_{i,k}, and δα_{i,k} are the two translational and one rotational deviations of part i on station k, respectively. If part i has not yet appeared on station k, the corresponding δx_{i,k}, δz_{i,k}, and δα_{i,k} are zero. The input vector uk represents the random deviations associated with fixture locators on station k. There are a total of 18 components of fixture deviations on three stations, as indicated by the numbers 1 to 18 (i.e., the 18 faults) in Figure 9.4. Thus, we have u1 = [δp1 … δp6]^T, u2 = [δp7 … δp15]^T, u3 = [δp16 δp17 δp18]^T, where δpi is the deviation associated with fault i. The measurement y contains positional deviations detected at Mi, i = 1, …, 5. In this 2-D case, each Mi can deviate in x- or z-directions. Hence: y1 = [δM1(x) δM1(z) δM2(x) δM2(z)]^T, y2 = [δM3(x) δM3(z) δM4(x) δM4(z)]^T, and y3 = [δM5(x) δM5(z)]^T. The state space representation of this process is shown as follows:
$$x_1 = B_1 u_1 + w_1 \quad\text{and}\quad x_k = A_{k-1} x_{k-1} + B_k u_k + w_k, \quad k = 2, 3 \qquad (9.14)$$
$$y_k = C_k x_k + v_k, \quad k = 1, 2, 3 \qquad (9.15)$$
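To see how Equations 9.14 and 9.15 lead to a single stacked relation between all measurements and all fixture deviations, the recursion can be unrolled with the noise terms set to zero. The sketch below is an illustration under that assumption; it does not reproduce the chapter's Equation 9.3, and the matrices are small made-up values rather than the case-study matrices. The names `simulate` and `stacked_gamma` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def simulate(A, B, C, u):
    """Noise-free run of Equations 9.14 and 9.15; returns the stacked measurements."""
    x = matmul(B[0], u[0])
    y = list(matmul(C[0], x))
    for k in range(1, len(B)):
        ax, bu = matmul(A[k - 1], x), matmul(B[k], u[k])
        x = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(ax, bu)]
        y += matmul(C[k], x)
    return y


def stacked_gamma(A, B, C):
    """Assemble Gamma so that stacked y equals Gamma times stacked u."""
    rows = []
    for k in range(len(B)):
        blocks = []
        for i in range(len(B)):
            if i > k:  # later stations cannot affect earlier measurements
                blocks.append([[0] * len(B[i][0]) for _ in C[k]])
                continue
            prop = B[i]
            for j in range(i, k):  # carry the effect of u_i forward to station k
                prop = matmul(A[j], prop)
            blocks.append(matmul(C[k], prop))
        for r in range(len(C[k])):
            rows.append([v for blk in blocks for v in blk[r]])
    return rows


# Toy values (hypothetical): two states, one input and one sensor per station.
A = [[[1, 1], [0, 1]], [[1, 0], [2, 1]]]   # A_1, A_2
B = [[[1], [0]], [[0], [1]], [[1], [1]]]   # B_1, B_2, B_3
C = [[[1, 0]], [[0, 1]], [[1, 1]]]         # C_1, C_2, C_3
u = [[[2]], [[3]], [[5]]]                  # u_1, u_2, u_3

gamma = stacked_gamma(A, B, C)
u_stacked = [row for block in u for row in block]
print(gamma)  # [[1, 0, 0], [0, 1, 0], [3, 1, 2]]
print(matmul(gamma, u_stacked) == simulate(A, B, C, u))  # True
```

The block lower-triangular shape of the result reflects the fact that a fixture deviation at a station can only influence measurements taken at that station or downstream.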
Matrices Ak, Bk, and Ck are determined by process design and sensor deployment. Ak characterizes the change in product state when a product is transferred from station k to station k + 1. Thus, Ak depends on the coordinates of fixture locators on two adjacent stations, k and k + 1. Bk determines how fixture deviations affect product deviations on station k and is thus determined by the coordinates of fixture locators on station k. Ck is determined by the coordinates of measurement points such as M1 to M5 in this example. Following the model development presented in Chapter 6, we give the numerical expressions of A’s, B’s, and C’s of the assembly processes shown in Figure 9.4, respectively. The A’s, B’s, C1, and C2 are the same for these two processes because their fixture layouts are the same for all stations and the sensor deployments are the same for station I and station II. In this particular example, since each locator pair used in each station has the same z coordinate, the deviation of a 2-way locator in the x direction will not affect the part location because the deviation will be absorbed by the slot. Compared with the models in Chapter 6, the third element in each us,k does not exist. This is also reflected in i,k equaling 0; thus the third column in R3s,k and R4k−1,k degenerates to a zero column in Equations 6.11 and 6.18, respectively. In this special condition, simply omitting the two-way locator deviation in the x direction in the models developed in Chapter 6 will give the B matrices in Equation 9.17.
[Equation 9.16: numerical expressions of A1 and A2, each of size 12 × 12.]
[Equation 9.17: numerical expressions of B1 (12 × 6), B2 (12 × 9), and B3 (12 × 3).]
[Equation 9.18: numerical expressions of C1 and C2, each of size 4 × 12.]
We use C_3^1 and C_3^2 to denote C3 of these two gauging systems, respectively.
[Equation 9.19: numerical expressions of C_3^1 and C_3^2, each of size 2 × 12.]
For simplicity, we will only discuss the variance diagnosability of fixture faults in this study. Thus, we use Hr in Equation 9.12 as the testing matrix. To use Hr, we need to obtain Γ first. Substituting A’s, B’s, and C’s in Equation 9.16–Equation 9.19 into Equation 9.3 yields Γ^1 and Γ^2:
[Equation 9.20: numerical expression of Γ^1, of size 10 × 18.]
[Equation 9.21: numerical expression of Γ^2, of size 10 × 18.]
where the superscript 1 or 2 indicates which gauging strategy the Γ is associated with. Further, Hr can be obtained following its definition in Equation 9.12.
[Equation 9.22: numerical expression of H_r^1, of size 18 × 18.]
[Equation 9.23: numerical expression of H_r^2, of size 18 × 18.]
The RREF of Hr’s and the corresponding fault structures are compared in Table 9.1. For gauging strategy 1, 14 rows have only one nonzero element, corresponding to 14 uniquely identified faults and, hence, minimal diagnosable classes, {1}, …, {12}, {16}, {17}. Two faults {13, 14} correspond to zero columns and are therefore not diagnosable. The 13th row has two nonzero elements, i.e., [0_{1×12} | 0 0 1 0 0 1], indicating that {15, 18} is a minimal diagnosable class. The class {15, 18} is also a connected fault class and because it is already minimal, no further permutation is needed. Similarly, for gauging strategy 2, there are 13 uniquely diagnosable classes, {1}, …, {12}, {18}. Two minimal diagnosable classes, {13, 16} and {14, 15, 17}, correspond to the 13th and 14th row, respectively. No permutation of Hr is needed for gauging strategy 2, either. For gauging strategy 1, to achieve a fully diagnosable system, at least n – ρ = 3 faults need to be known. We first search fault set {15, 18} with n = 2 and ρ = 1. It is clear that {15} and {18} are two minimal complementary fault classes for the connected fault class {15, 18}. Adding the nondiagnosable faults {13, 14}, we obtain the minimal complementary classes as {13, 14, 15} and {13, 14, 18}. The number of minimal complementary classes is two. For gauging strategy 2, to find the minimal complementary class, we search for the faults among {13, 16} with n = 2 and ρ = 1, and among {14, 15, 17} with n = 3 and ρ = 1. The search yields {13} and {16} for {13, 16} and {14, 15}, {14, 17}, and {15, 17} for {14, 15, 17}. Joining these two fault groups together gives us C_2^1 · C_3^1 = 6 minimal complementary classes, which are listed in Table 9.1.
TABLE 9.1 Comparison of Gauging Systems 1 and 2
RREF(Hr): for both strategies, [I_{12×12} 0_{12×6}; 0_{6×12} R]_{18×18}, where R is a 6 × 6 block
  Gauging strategy 1: the rows of R are [0 0 1 0 0 1], [0 0 0 1 0 0], [0 0 0 0 1 0], followed by three zero rows
  Gauging strategy 2: the rows of R are [1 0 0 1 0 0], [0 1 0.58 0 0.06 0], [0 0 0 0 0 1], followed by three zero rows
Number of potential faults: 18 (strategy 1); 18 (strategy 2)
Rank of testing matrix: 15 (strategy 1); 15 (strategy 2)
Minimal diagnosable classes:
  Gauging strategy 1: {1}, …, {12}, {16}, {17}, {15, 18}
  Gauging strategy 2: {1}, …, {12}, {18}, {13, 16}, {14, 15, 17}
Number of uniquely identified faults: 14 (strategy 1); 13 (strategy 2)
Minimal complementary classes:
  Gauging strategy 1: {13, 14, 15}, {13, 14, 18}
  Gauging strategy 2: {13, 14, 15}, {13, 14, 17}, {13, 15, 17}, {16, 14, 15}, {16, 14, 17}, {16, 15, 17}
Number of minimal complementary classes: 2 (strategy 1); 6 (strategy 2)
This analysis verifies that although engineering systems have many potential faults (18 faults in this case), they can often be partitioned into smaller connected fault classes. Neither gauging system provides full diagnosability because neither Hr is of full rank. The ranks of the two Hr’s are the same (ρ = 15), suggesting that the amount of information obtained by both systems is the same. But gauging strategy 1 can uniquely identify 14 faults, which are faults 1–12, 16, and 17, whereas gauging strategy 2 can only uniquely identify 13 faults, which are faults 1–12 and 18. The information quality provided by gauging strategy 1 is considered better than that of gauging strategy 2. In this sense, gauging strategy 1 provides more valuable information. However, one may also notice that gauging strategy 2 offers six possible ways of measuring additional faults to achieve a fully diagnosable system, whereas gauging strategy 1 offers only two. This difference indicates that gauging strategy 2 is more flexible. If the third criterion is of higher priority, gauging strategy 2 is more favorable.
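The joining step used in this comparison can be sketched as follows. The fault classes and ranks are those quoted in the text; the function names `class_choices` and `join` are illustrative and not part of the original method description.

```python
from itertools import combinations, product


def class_choices(faults, rank, diagnosable_classes):
    """Apply Corollary 9.3 within one connected fault class."""
    size = len(faults) - rank
    return [
        set(c) for c in combinations(faults, size)
        if not any(set(d) <= set(c) for d in diagnosable_classes)
    ]


def join(connected, nondiagnosable):
    """Join one choice per connected class and add the nondiagnosable faults."""
    per_class = [class_choices(*entry) for entry in connected]
    return [
        sorted(set(nondiagnosable).union(*choice))
        for choice in product(*per_class)
    ]


# Each entry is (faults in the connected class, rank, minimal diagnosable classes).
strategy_1 = join([([15, 18], 1, [{15, 18}])], nondiagnosable=[13, 14])
strategy_2 = join(
    [([13, 16], 1, [{13, 16}]), ([14, 15, 17], 1, [{14, 15, 17}])],
    nondiagnosable=[],
)
print(len(strategy_1), len(strategy_2))  # 2 6
```

Searching within each connected class and taking the Cartesian product avoids enumerating all three-fault subsets of the 18 faults.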
9.6.2 CASE STUDY OF A MULTISTAGE MACHINING PROCESS
The proposed evaluation criteria can also be applied to multistage machining processes. To machine a workpiece, we need to first fix the location of the workpiece in the space. Figure 9.5 shows a widely used 3-2-1 fixturing setup. If we require the workpiece to touch all the locating pads (L1~L3) and locating pins (P1~P3), the location of the workpiece in the machine coordinate system XYZ is fixed. The surface of the workpiece that touches the locating pads (L1~L3) (surface ABCD in Figure 9.5) is called the “primary datum.” Similarly, surface ADHE is called the “secondary datum” and DCGH is called the “tertiary datum” in Figure 9.5. Because the primary datum (surface ABCD) touches L1–L3, the translational motion in the z direction and the rotational motion in x and y directions are restrained. Similarly, the secondary datum constrains the translational motion in the x direction and the rotational motion in the z direction; the tertiary datum constrains the translational motion in the y direction. Therefore, all six d.o.f.’s associated with the workpiece are constrained by these three datum surfaces and the corresponding locating pins and pads.
FIGURE 9.5 A typical 3-2-1 fixturing configuration, showing the cutting tool, the workpiece (corners A to H), the fixture system with locating pads L1 to L3 and locating pins P1 to P3, the machine table, and the machine coordinate axes X, Y, and Z.
The cutting toolpath is calibrated w.r.t. the machine coordinate system XYZ. Clearly, an error in the position of locating pads and pins will cause a geometric error in the machined feature. Suppose that we mill a slot on surface EFGH in Figure 9.5. If L1 is higher than its nominal position, the workpiece will be tilted w.r.t. XYZ. However, the cutting tool path is still determined w.r.t. XYZ. Hence, the bottom surface of the finished slot will not be parallel to the primary datum (ABCD). Besides the fixture error, the geometric errors in the datum feature will also affect the workpiece quality. For example, if the primary datum (ABCD) is not perpendicular to the secondary datum (ADHE), the milled slot will not be perpendicular to the secondary datum, either.
A three-stage machining process using this 3-2-1 fixture setup is shown in Figure 9.6. The product is an automotive engine head. The features are the cover face (M), joint face, and the slot (S).
The cover face, joint face, and the slot are milled at the first (Figure 9.6a), second (Figure 9.6b), and third (Figure 9.6c) stages, respectively. We treat the positional errors of product features after stage k as state vector xk, the errors of fixture and the cutting tool path at stage k as input uk, and the measurements of positions and orientations of the machined product features as yk, which can be obtained by a coordinate measuring machine (CMM). The state space model in Equation 9.1 can be obtained through a similar (to the preceding panel assembly) but more complicated 3-D kinematics analysis, where A_{k−1}x_{k−1} is the error contributed by the errors of datum features (these features are produced in previous stages) and B_k u_k is the error contributed by the fixture or cutting tool at stage k. Details of this process and the corresponding state space model can be found in Chapter 7.
FIGURE 9.6 Process layout at three stages: (a) cover face (M); (b) joint face; (c) slot (S).
After the model in Equation 9.1 is obtained, the diagnosability study for the multistage machining process can be conducted following the theories from Sections 9.2 to 9.5. We will focus on the fixture error in this case study. For a 3-2-1 fixture setup, there are six potential fixture errors at each stage (each locating pad and pin could have one error). Hence, there are 18 potential faults in the whole system, where faults 1–6 represent locator errors at the first stage, faults 7–12 at the second stage, and faults 13–18 at the third stage, respectively. Three gauging systems are used to measure slot S, cover face M, and the rough datum, respectively, where the rough datum is the primary datum at the first stage and can be seen from the joint face. The results of the fault diagnosability analysis for the three systems are listed in Table 9.2. The RREFs of the testing matrices of gauging strategies 1 and 2 have a very simple structure. For gauging strategy 3 (the fourth column in Table 9.2), the first three rows of RREF(Γ) share common nonzero column positions. The corresponding faults, {1, 2, 3, 7, 8, 9}, form a connected fault class regarding its mean diagnosability. By permuting the corresponding columns of RREF(Γ), we can generate 15 minimal diagnosable classes (each has four faults) within this connected fault class, as shown in Table 9.2. The minimal complementary class of this connected fault class can be found by searching the class with n = 6, ρ = 3. We obtain C_6^3 = 20 minimal complementary classes for the connected class: {1, 7, 8}, {2, 7, 8}, {3, 7, 8}, {1, 7, 9}, {2, 7, 9}, {3, 7, 9}, {1, 8, 9}, {2, 8, 9}, {3, 8, 9}, {1, 2, 7}, {1, 2, 8}, {1, 2, 9}, {1, 3, 7}, {1, 3, 8}, {1, 3, 9}, {2, 3, 7}, {2, 3, 8}, {2, 3, 9}, {1, 2, 3}, {7, 8, 9}. Adding the nondiagnosable faults {4, 5, 6, 13~18}, we can obtain 20 minimal complementary classes for the system regarding the mean diagnosability. It is also interesting to see that although the faults {1, 2, 3, 7, 8, 9} form a connected fault class regarding their mean diagnosability, they are uniquely diagnosable regarding their variance diagnosability. This verifies our previous remark that mean diagnosability requires a stronger condition than variance diagnosability.
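The two counts quoted for the connected fault class {1, 2, 3, 7, 8, 9} follow from elementary combinatorics, under the reading (taken from Table 9.2) that every four-fault subset of this six-fault class is a minimal diagnosable class:

```latex
\[
\binom{6}{4} = \frac{6!}{4!\,2!} = 15
\quad\text{(minimal diagnosable classes)},
\qquad
\binom{6}{3} = \frac{6!}{3!\,3!} = 20
\quad\text{(minimal complementary classes)}.
\]
```

No three-fault subset can contain a four-fault class, so by Corollary 9.3 all 20 subsets of size n – ρ = 3 qualify.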
TABLE 9.2 Comparison of Gauging Systems
Gauging systems: System 1 (Slot S); System 2 (Cover Face M); System 3 (Rough Datum)
Mean diagnosability, RREF(Γ), of size 36 × 18:
  System 1: [0_{6×12} I_{6×6}; 0_{30×12} 0_{30×6}]
  System 2: [0_{6×3} I_{6×6} X 0_{6×6}; 0_{30×3} 0_{30×6} 0_{30×3} 0_{30×6}], where the rows of the 6 × 3 block X are [1 0 0], [0 1 0], [0 0 −1], followed by three zero rows
  System 3: [I_{3×3} 0_{3×3} Y 0_{3×3} 0_{3×6}; 0_{3×3} 0_{3×3} 0_{3×3} I_{3×3} 0_{3×6}; 0_{30×3} 0_{30×3} 0_{30×3} 0_{30×3} 0_{30×6}], where Y is a 3 × 3 block with all entries nonzero
Variance diagnosability, RREF(Hr), of size 18 × 18:
  System 1: [0_{6×12} I_{6×6}; 0_{12×12} 0_{12×6}]
  System 2: [0_{6×3} I_{6×6} X′ 0_{6×6}; 0_{12×3} 0_{12×6} 0_{12×3} 0_{12×6}], where the rows of the 6 × 3 block X′ are [1 0 0], [0 1 0], [0 0 1], followed by three zero rows
  System 3: [I_{3×3} 0_{3×3} 0_{3×6} 0_{3×6}; 0_{6×3} 0_{6×3} I_{6×6} 0_{6×6}; 0_{9×3} 0_{9×3} 0_{9×6} 0_{9×6}]
Number of potential faults: 18 (system 1); 18 (system 2); 18 (system 3)
Rank of testing matrix Γ: 6 (system 1); 6 (system 2); 6 (system 3)
Rank of testing matrix Hr: 6 (system 1); 6 (system 2); 9 (system 3)
Minimal diagnosable classes, mean:
  System 1: {13}, {14}, {15}, {16}, {17}, {18}
  System 2: {7}, {8}, {9}, {4, 10}, {5, 11}, {6, 12}
  System 3: {10}, {11}, {12}, {1, 7, 8, 9}, {2, 7, 8, 9}, {3, 7, 8, 9}, {1, 3, 8, 9}, {2, 3, 8, 9}, {1, 7, 3, 9}, {2, 7, 3, 9}, {1, 7, 8, 3}, {2, 7, 8, 3}, {1, 2, 8, 9}, {1, 7, 2, 9}, {1, 7, 8, 2}, {1, 2, 3, 9}, {1, 2, 8, 3}, {1, 7, 2, 3}
Minimal diagnosable classes, variance:
  System 1: {13}, {14}, {15}, {16}, {17}, {18}
  System 2: {7}, {8}, {9}, {4, 10}, {5, 11}, {6, 12}
  System 3: {1}, {2}, {3}, {7}, {8}, {9}, {10}, {11}, {12}
Number of uniquely identified faults, mean: 6 (system 1); 3 (system 2); 3 (system 3)
Number of uniquely identified faults, variance: 6 (system 1); 3 (system 2); 9 (system 3)
Number of minimal complementary classes, mean: 1 (system 1); 8 (system 2); 20 (system 3)
Number of minimal complementary classes, variance: 1 (system 1); 8 (system 2); 1 (system 3)
9.7 SUMMARY This chapter studied the diagnosability of process faults given the product quality measurements in a complicated multistage manufacturing process. The study reveals that the diagnosis capability that a gauging strategy can provide strongly depends on sensor deployment in a multistage manufacturing system. A poorly designed gauging strategy is likely to result in loss of diagnosability. On the contrary, a well-designed gauging system that achieves the desired level of diagnosability can not only monitor the process change but also quickly identify the process root causes of quality-related problems. Quick root cause identification will lead to product quality improvement, production downtime reduction, and a remarkable cost reduction in manufacturing systems. This study took a model-based approach built on a linear fault-quality model. The results can be used when a linear diagnostic model is available. Because the errors of tooling elements considered in quality control problems are often much smaller than the nominal parameters, most manufacturing systems can be linearized and then represented by a linear model under the small error assumption. Many of the linear state space models were validated through comparison with either a commercial software simulation [6] or with experimental data [7,8]. Thus, the small error assumption is not restrictive, and the methodology presented in this chapter is generic and applicable to various manufacturing systems. Another note on the applicability of the reported methodology is that for some poorly designed manufacturing systems, a large number of process faults could be coupled together and form a single, huge connected fault class. As a result, it would be impractical to exhaust the matrix column permutations in finding the complete list of minimal diagnosable classes, and the diagnosability study itself then becomes intractable.
9.8 EXERCISES 1. In an assembly process, a liftgate (Figure 9.7b) will be mounted on the back of a car (Figure 9.7a). Six points are measured on the liftgate, as illustrated in Figure 9.7c. Among them, points 1, 2, 3, and 4 measure the horizontal deviation only, and points 5 and 6 measure the up-down deviation only. In practice, five variation patterns of the liftgate are identified, as given in Figure 9.8. The dotted line shows the ideal liftgate, and the solid line shows the deformation pattern of the liftgate.
FIGURE 9.7 Liftgate assembly and measurement process: (a) a body in white, showing the liftgate opening, roof cross-member, left bodyside, and right bodyside; (b) a liftgate; (c) measurement points 1 to 6 on a liftgate.
FIGURE 9.8 Variation patterns of the liftgate in assembly.
We use a vector u to represent the amount of deviation along each of the patterns shown in Figure 9.8. For example, the first element of u represents the amount by which the liftgate opening enlarges on the auto body, the second element of u represents the amount by which the liftgate opening translates on the auto body, etc. Further, we use a vector y to represent the six measurements. Answer the following questions:
   a. If a linear model is used to link the error patterns and the measurements as y = Γu + ε, then what is Γ? (Hint: Γ is a 6 × 5 matrix that only contains 1, –1, and 0.)
   b. If we ignore the influence of noise, then is u mean diagnosable? Is u variance diagnosable?
   c. If measurement points 1 and 4 are removed, then is u still mean diagnosable? Is u still variance diagnosable? If not, which faults are coupled together?
   d. If measurement points 1 and 4 are removed and measurement point 5 is moved to the middle of the roof crossmember, then is u mean diagnosable? Is u variance diagnosable?
   e. Based on the results of (d), what is the difference between mean diagnosis and variance diagnosis?
2. Figure 9.9a shows an assembly station in which two parts are assembled together. Suppose we are not concerned with the displacement of the parts in the x direction and are only considering the four variance components that represent the z-direction deviations of the four pin/hole or pin/slot combinations. If the z-direction deviations of the four pins are denoted as u and the four measurements are denoted as y, and the relationship between u and y is modeled linearly as y = Γ · u + ε, answer the following questions:
   a. For the setup in Figure 9.9a, what is Γ?
   b. For the setup in Figure 9.9a, if the measurement noise is ignored, then is u mean diagnosable? Is u variance diagnosable?
   c. For the setup in Figure 9.9a, if the measurement noise is included as a separate variation source and its covariance has the structure σ2I, then is u mean diagnosable? Is u variance diagnosable?
   d. For the setup in Figure 9.9b (the distance between y1 and y2 and y3 and y4 is 0.5), repeat (a) to (c).
   e. Which setup is better in terms of diagnosability?
(a) Set up 1.
(b) Set up 2.
FIGURE 9.9 An assembly process.
3. For a linear system y = Γu + ε, answer the following questions (noise can be ignored in the analysis):
   a. If Γ is the 4 × 4 matrix
$$ \Gamma = \begin{bmatrix} 0.9501 & 0.2311 & 0.6068 & 0.4860 \\ 0.8913 & 0.7621 & 0.4565 & 0.0185 \\ 0.8214 & 0.4447 & 0.6154 & 0.7919 \\ 0.9218 & 0.7382 & 0.1763 & 0.4057 \end{bmatrix}, $$
   what are the mean and variance diagnosability of u?
   b. If Γ is the 4 × 4 matrix
$$ \Gamma = \begin{bmatrix} 0.9501 & 0.2311 & 0.6068 & 0.4860 \\ 0.8913 & 0.7621 & 0.4565 & 0.0185 \\ 0.8214 & 0.4447 & 0.6154 & 0.7919 \\ 0.8712 & 0.2119 & 0.5564 & 0.4456 \end{bmatrix}, $$
   what are the mean and variance diagnosability of u?
   c. If Γ is the 4 × 4 matrix
$$ \Gamma = \begin{bmatrix} 0.9501 & 0.2311 & 0.6068 & 0.4860 \\ 0.8913 & 0.7621 & 0.4565 & 0.0185 \\ 0.3695 & 0.2823 & 0.1962 & 0.0347 \\ 1.1863 & 0.7759 & 0.6569 & 0.2159 \end{bmatrix}, $$
   what are the mean and variance diagnosability of u?
References
1. Schott, J.R., Matrix Analysis for Statistics, John Wiley & Sons, New York, 1997.
2. Ding, Y., Shi, J., and Ceglarek, D., Diagnosability analysis of multistage manufacturing processes, ASME Journal of Dynamic Systems, Measurement and Control, 124, 1, 2002.
3. Rao, C.R. and Kleffe, J., Estimation of Variance Components and Applications, North-Holland, Amsterdam, 1988.
4. Searle, S.R., Casella, G., and McCulloch, C.E., Variance Components, John Wiley & Sons, New York, 1992.
5. Halevi, G. and Weill, R.D., Principles of Process Planning: A Logical Approach, Chapman and Hall, New York, 1995.
6. Ding, Y., Ceglarek, D., and Shi, J., Modeling and diagnosis of multistage manufacturing processes: part I - state space model, Proceedings of the 2000 Japan/USA Symposium on Flexible Automation, July 23–26, Ann Arbor, MI, 2000JUSFA-13146, 2000.
7. Djurdjanovic, D. and Ni, J., Linear state space modeling of dimensional machining errors, Transactions of NAMRI/SME, XXIX, 541, 2001.
8. Zhou, S., Huang, Q., and Shi, J., State space modeling for dimensional monitoring of multistage machining process using differential motion vector, IEEE Transactions on Robotics and Automation, 19, 296, 2003.
9. Zhou, S., Ding, Y., Chen, Y., and Shi, J., Diagnosability study of multistage manufacturing processes based on linear mixed-effects models, Technometrics, 45, 312, 2003.
10
Diagnosis through Variation Pattern Matching*
* Part of the chapter material is based on Reference 4.
10.1 INTRODUCTION TO VARIATION PATTERNS
The relationships between product quality and process variation sources, such as fixture errors and machine errors, in a multistage manufacturing process are modeled in a linear state space form. They share the same linear model structure because the process faults are assumed to be small in magnitude compared with the nominal product dimensions. The chain-like state space model can be further transformed into an input-output model, which can be generally expressed as
$$ y = \Gamma \cdot u + \varepsilon \tag{10.1} $$
where $y$ is an $n \times 1$ vector of product quality measurements, $\Gamma$ is an $n \times p$ constant system matrix determined by the product and process designs, $u$ is a $p \times 1$ random vector representing process faults, and $\varepsilon$ is an $n \times 1$ random vector representing measurement noises, unmodeled faults, and high-order nonlinear terms. Equation 10.1 is equivalent to Equation 9.4. The term $u$ in Equation 10.1 is actually $\mu + u$ in Equation 9.4, and $\varepsilon$ is actually $\Psi \cdot w + v$ in Equation 9.4. Without loss of generality, we can further assume that the columns of $\Gamma$ are of unit length. This can be achieved through simple scaling.
Because process variation errors, such as an increase in variability, are more difficult to diagnose and compensate for than a mean change of process faults, most root cause identification methods focus on identifying root causes relevant to the variance changes of process faults. Therefore, the process fault vector $u$ is often modeled as a random vector to describe variation errors, as opposed to a simple mean-shift type of error in the process. Because the mean of the quality measurements can always be subtracted, it is often assumed that $u$ and $\varepsilon$ are zero mean.
In Equation 10.1, the column vectors of $\Gamma$ determine how a specific process fault affects product quality characteristics and causes specific variation patterns in product quality measurements. This concept can be illustrated by the following examples from assembly and machining processes.
Variation pattern in assembly: Figure 10.1 shows the dimensional variation in a 2-D panel assembly process. The rectangle in Figure 10.1a represents a sheet metal part that needs to be assembled with another piece of sheet metal. The position of the sheet metal is determined by a 4-way pin locator (P1), which controls the translations in the x and z directions, and a 2-way pin locator (P2), which controls rotation around the y-axis. The dash-line rectangle is the nominal position of the sheet metal. If there is a variation in the position of P2, the position of the sheet metal will vary as indicated by the two solid-line rectangles. A key observation in this case is that the variation of the position of the sheet metal is not purely random: the sheet metal actually rotates around P1. In other words, the variations in the multiple dimensional measurements satisfy a certain relationship. For example, the ratios among the magnitudes of the variations in y1, y2, and y3 will be fixed. Such a fixed relationship among the variations of quality measurements is called a variation pattern.
Variation pattern in machining: A similar phenomenon can be found in a machining process, as shown in Figure 10.1b. In this operation, the top surface (EFGH) will be milled. The position of the workpiece is determined by a 3-2-1 fixture layout. If there is a variation in the locator L2, then the machined surface will have an upward shift or a downward shift with a slope that is determined by the magnitude of the L2 deviation. Clearly, the location and orientation of the top surface show a pattern in this case.
FIGURE 10.1 Illustration of dimensional variation pattern: (a) variation pattern in an assembly operation; (b) variation pattern in a machining operation.
From the preceding two examples, it can be seen that the variation sources in the process manifest themselves as patterns in the product quality measurements (i.e., a fixed relationship among the variables of the multivariate quality measurements). Based on this observation, variation source identification can be realized through pattern matching. The basic idea is illustrated in Figure 10.2. First, based on the fault-quality model, we can obtain the unique pattern of each potential fault. Meanwhile, the symptom of an occurring fault can be extracted from measurement data. Finally, the occurring fault can be identified if there is a match between the fault symptom and a fault pattern obtained from the model.
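The matching idea can be illustrated with a minimal sketch in which a symptom vector is compared with each model signature by the absolute cosine of the angle between them. The signature and symptom values below are hypothetical, and the nearest-direction rule is only a stand-in for the statistical test developed later in this chapter.

```python
import math

def cosine_alignment(a, b):
    """Return |cos(angle)| between two vectors.

    The sign is ignored because an eigenvector is only defined up to sign.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return abs(dot) / (norm_a * norm_b)

def best_matching_fault(signatures, symptom):
    """Return (index, alignment) of the signature closest in direction to the symptom."""
    scores = [cosine_alignment(sig, symptom) for sig in signatures]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical signatures (columns of some Gamma) and a noisy symptom vector.
signatures = [
    [1.0, 0.0, 0.0],
    [0.0, 0.6, 0.8],
]
symptom = [0.05, -0.58, -0.81]  # close to the second pattern, with opposite sign

index, score = best_matching_fault(signatures, symptom)
print(index, round(score, 4))
```

An alignment close to one suggests a candidate fault, but it does not say how close is close enough; that question is what the confidence boundary in Section 10.3 addresses.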
FIGURE 10.2 Outline of the pattern matching method. (Flowchart. Off-line: engineering knowledge and the linear fault-quality model yield the variation signatures of potential faults. In-line: in-process sensing data, followed by data collection and variance analysis, yield an estimate of the occurring fault symptom. Pattern matching between signature and symptom then identifies the root cause of the fault.)
10.2 LINKS BETWEEN THE FAULT-QUALITY MODEL AND VARIATION PATTERNS
Section 10.1 provides only an intuitive illustration of variation patterns. To develop a rigorous pattern matching technique, a formal mathematical analysis is needed to link the quality variation to the parameters of the fault-quality model, which is the focus of this section. Assuming that the process fault $u$ is independent of the system noise $\varepsilon$, we obtain the following from Equation 10.1:
$$ \Sigma_y = \Gamma \cdot \Sigma_u \cdot \Gamma^T + \Sigma_\varepsilon \tag{10.2} $$
where $\Sigma_u$ and $\Sigma_\varepsilon$ are the covariance matrices of $u$ and $\varepsilon$, respectively. Furthermore, it is usually assumed that process faults (i.e., the elements in $u$) are independent of each other and, hence, $\Sigma_u$ is a diagonal matrix. Without loss of generality, let us assume that the $i$th fault occurs and only a single diagonal element of $\Sigma_u$, $\sigma_i^2$, is nonzero. Then, we have
$$ \Sigma_y = \sigma_i^2 \cdot r_i \cdot r_i^T + \Sigma_\varepsilon \tag{10.3} $$
where $r_i$ is the $i$th column vector of $\Gamma$. If we right-multiply both sides of Equation 10.3 by $r_i$, we have $\Sigma_y \cdot r_i = \sigma_i^2 \cdot r_i \cdot r_i^T \cdot r_i + \Sigma_\varepsilon \cdot r_i$. If the components in $\varepsilon$ are independent of each other and have the same variance, then $\Sigma_\varepsilon$ takes the simplest form $\sigma_\varepsilon^2 \cdot I$, where $\sigma_\varepsilon^2$ is the variance of the noise and $I$ is an identity matrix of appropriate dimension. Under these conditions, we have $\Sigma_y \cdot r_i = \sigma_i^2 (r_i \cdot r_i^T \cdot r_i) + \sigma_\varepsilon^2 \cdot r_i$, and further
$$ \Sigma_y \cdot r_i = (\sigma_i^2 (r_i^T \cdot r_i) + \sigma_\varepsilon^2) \cdot r_i \tag{10.4} $$
Because $\sigma_i^2 (r_i^T \cdot r_i) + \sigma_\varepsilon^2$ is a scalar, $r_i$ is evidently an eigenvector of $\Sigma_y$. If we right-multiply both sides of Equation 10.3 by $r_i^{\perp}$, where $r_i^{\perp}$ is any unit-length vector perpendicular to $r_i$, we have $\Sigma_y \cdot r_i^{\perp} = \sigma_i^2 (r_i \cdot r_i^T \cdot r_i^{\perp}) + \sigma_\varepsilon^2 \cdot r_i^{\perp}$. Because $r_i^T \cdot r_i^{\perp}$ is zero, we get
$$ \Sigma_y \cdot r_i^{\perp} = \sigma_\varepsilon^2 \cdot r_i^{\perp} \tag{10.5} $$
which suggests that $r_i^{\perp}$ is also an eigenvector of $\Sigma_y$. The difference between $r_i$ and $r_i^{\perp}$ lies in the eigenvalues they are associated with: $r_i$ is associated with the largest eigenvalue of $\Sigma_y$, whereas $r_i^{\perp}$ is associated with the small eigenvalue $\sigma_\varepsilon^2$. This analysis indicates that when the $i$th process fault occurs, the eigenvector associated with the largest eigenvalue of $\Sigma_y$, known as its principal eigenvector, should match the $i$th column vector of $\Gamma$. Thus, the principal eigenvector of $\Sigma_y$ can be viewed as the fault symptom, and the corresponding column vector of $\Gamma$ can be viewed as the variation pattern or signature determined by the fault-quality model.
An example is given to demonstrate the variation patterns. We consider a simple machining operation in which the top surface of a cubic metal workpiece is milled, as seen in Figure 10.3. The primary datum is located at the bottom surface, indicated by u1, u2, and u3. Their x and y coordinates are (10,10), (110,10), and (10,90), respectively. Because we are only concerned with the depth of cutting, which is solely determined by the primary datum, the other datums are not shown in this figure.
FIGURE 10.3 An example from machining processes.
After milling, we measure the z coordinates of four points on the top surface to evaluate the depth of cutting. These four points are labeled as M1, M2, M3, and M4, and their x and y coordinates are (10,10), (200,10), (200,200), and (10,200), respectively. Clearly, the z coordinates of the locators influence the z coordinates of the measurement points. A kinematic analysis shows that the following linearized relationship holds:
$$ \begin{bmatrix} z_{M1} \\ z_{M2} \\ z_{M3} \\ z_{M4} \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 \\ 0.9 & -1.9 & 0 \\ 3.275 & -1.9 & -2.375 \\ 1.375 & 0 & -2.375 \end{bmatrix} \begin{bmatrix} z_{u1} \\ z_{u2} \\ z_{u3} \end{bmatrix} + \varepsilon \tag{10.6} $$
where $z_{Mi}$ and $z_{ui}$ are the deviations of the measurement points and the locators in the z direction, respectively, and $\varepsilon$ represents the linearization and measurement noises. Each column of the coefficient matrix is the variation pattern associated with the corresponding locator variation error. For example, the first column $[-1\ \ 0.9\ \ 3.275\ \ 1.375]^T$ indicates that if the first locator u1 has a 1 unit upward shift and the other two locators are kept at their nominal locations, then the z coordinates of the four measurements will have deviations of –1, 0.9, 3.275, and 1.375 units, respectively, after milling. Note that the columns of the coefficient matrix in Equation 10.6 are not of unit length. However, a simple scaling yields
$$ \begin{bmatrix} z_{M1} \\ z_{M2} \\ z_{M3} \\ z_{M4} \end{bmatrix} = \begin{bmatrix} -0.263 & 0 & 0 \\ 0.237 & -0.707 & 0 \\ 0.862 & -0.707 & -0.707 \\ 0.362 & 0 & -0.707 \end{bmatrix} \begin{bmatrix} 3.798 \cdot z_{u1} \\ 2.687 \cdot z_{u2} \\ 3.359 \cdot z_{u3} \end{bmatrix} + \varepsilon $$
and then we can treat $[3.798 \cdot z_{u1}\ \ 2.687 \cdot z_{u2}\ \ 3.359 \cdot z_{u3}]^T$ as the standardized variation sources, so that the columns of the coefficient matrix have unit length. If u1 is loose and thus $z_{u1}$ has a large variation, then, from the previous derivations, the first eigenvector of the covariance matrix of $[z_{M1}\ \ z_{M2}\ \ z_{M3}\ \ z_{M4}]^T$ will be close to $[-0.263\ \ 0.237\ \ 0.862\ \ 0.362]^T$. This result leads to the following variation source identification procedure. If only one variation source exists in the process, we need to check whether the symptom vector falls within a confidence boundary of a specific signature vector. This boundary is due to the sample uncertainty and measurement noises. If a match is found, we conclude that the fault represented by that signature vector has occurred.
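The numbers in this example can be reproduced with a short numerical sketch. It assumes that the three locators define a datum plane and that raising the workpiece lowers the machined surface by the same amount (a sign convention chosen here to reproduce Equation 10.6); the fault and noise variances are illustrative values, not taken from the text.

```python
import numpy as np

# Locator and measurement-point (x, y) coordinates from the example.
locators = np.array([[10.0, 10.0], [110.0, 10.0], [10.0, 90.0]])
points = np.array([[10.0, 10.0], [200.0, 10.0], [200.0, 200.0], [10.0, 200.0]])

# The three locators define a plane z = a + b*x + c*y. Solving for (a, b, c)
# in terms of the locator heights gives the plane height at any (x, y).
A = np.column_stack([np.ones(3), locators])   # 3 x 3
P = np.column_stack([np.ones(4), points])     # 4 x 3

# Assumed sign convention: the measured deviation is the negative of the
# datum-plane height under each measurement point.
Gamma = -P @ np.linalg.inv(A)                 # 4 x 3 coefficient matrix

# Scale the columns to unit length; the norms become the scaling factors
# of the standardized variation sources.
norms = np.linalg.norm(Gamma, axis=0)
Gamma_unit = Gamma / norms

# Population covariance when only locator u1 varies (Equation 10.3 with
# structured noise). Both variances are illustrative assumptions.
sigma1_sq, sigma_eps_sq = 0.01, 1e-4
r1 = Gamma_unit[:, 0]
Sigma_y = sigma1_sq * np.outer(r1, r1) + sigma_eps_sq * np.eye(4)

eigvals, eigvecs = np.linalg.eigh(Sigma_y)    # eigenvalues in ascending order
principal = eigvecs[:, -1]
alignment = abs(principal @ r1)               # 1 means identical direction

print(np.round(Gamma, 3))
print(np.round(norms, 3))
print(np.round(Gamma_unit[:, 0], 3), round(float(alignment), 6))
```

The largest eigenvalue equals $\sigma_i^2 (r_i^T r_i) + \sigma_\varepsilon^2$ and the remaining eigenvalues equal $\sigma_\varepsilon^2$, as Equation 10.4 and Equation 10.5 indicate.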
10.3 PROCEDURE OF PATTERN MATCHING FOR VARIATION SOURCE IDENTIFICATION
If a single fault happens in the system, we need to match the symptom vector and the corresponding signature vector. In practice, however, the population covariance matrix Σy is unavailable; only the sample covariance matrix Sy is available. Sampling uncertainty in Sy makes the sample principal eigenvector a random vector. Instead of checking whether the sample principal eigenvector equals the theoretical fault signature vector in an exact and deterministic sense, we should establish a confidence boundary to see whether they are equal in a probabilistic sense.
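The effect of sampling uncertainty can be seen in a small simulation. The sketch below draws single-fault data along the first scaled signature of Section 10.2 and measures the angle between the sample principal eigenvector and that signature for two sample sizes; the standard deviations, sample sizes, and random seed are assumptions made for illustration only.

```python
import numpy as np

def sample_principal_eigenvector(Y):
    """Y is an N x n data matrix; return (largest eigenvalue, unit eigenvector) of S_y."""
    S = np.cov(Y, rowvar=False)               # sample covariance, divisor N - 1
    eigvals, eigvecs = np.linalg.eigh(S)      # ascending eigenvalues
    return eigvals[-1], eigvecs[:, -1]

def angle_deg(a, b):
    """Angle between the lines spanned by a and b, in degrees (sign is ignored)."""
    c = abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(min(1.0, c))))

rng = np.random.default_rng(0)
r = np.array([-0.263, 0.237, 0.862, 0.362])   # first scaled signature, Section 10.2
r = r / np.linalg.norm(r)
sigma_u, sigma_eps = 1.0, 0.1                 # assumed standard deviations

angles = {}
for N in (25, 400):
    u = rng.normal(0.0, sigma_u, size=(N, 1))
    eps = rng.normal(0.0, sigma_eps, size=(N, 4))
    Y = u * r + eps                           # one row per measured part
    _, u_s = sample_principal_eigenvector(Y)
    angles[N] = angle_deg(u_s, r)

print(angles)
```

Neither angle is exactly zero even though the model is exact, which is why an exact equality check between symptom and signature is not appropriate.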
In Section 10.2 we assumed that the covariance of the noise $\varepsilon$ has the form $\Sigma_\varepsilon = \sigma_\varepsilon^2 \cdot I$. This assumption probably holds well only when identical measurement devices are used to measure the quality features. When different measurement devices are used, the variances of the different components in $\varepsilon$ will be different. Moreover, the noise term $\varepsilon$ could comprise unmodeled faults and high-order nonlinear residuals, which are more likely to be correlated and make even a diagonal structure in $\Sigma_\varepsilon$ less likely. For this reason, it is more practical to study $\Sigma_y$ with a general noise structure. The unstructured $\Sigma_\varepsilon$ will disturb the principal eigenvector of $\Sigma_y$ and make its direction deviate from the corresponding signature vector. Thus, in pattern matching, we need to consider not only sample uncertainty but also the disturbance of unstructured noise.
Several researchers have contributed to pattern-matching-based variation source identification. Ceglarek and Shi [1] developed a pattern matching method for fixture fault diagnosis in automotive body assembly processes by assuming a structured noise and a large sample size (namely, $\Sigma_y$ is known). Their method was extended by Rong et al. [2] by considering the sample properties of the principal eigenvector while retaining the structured noise assumption. Ding et al. [3] studied the impact of unstructured noise under the large-sample assumption. In this chapter, a pattern matching technique considering both sample uncertainty and unstructured noise is presented. This work is based on Li et al. [4].
Figure 10.4 illustrates the basic idea of this pattern matching technique. Vector $r_i$ is a column vector of $\Gamma$, which is the same as the principal eigenvector of $\Gamma \Sigma_u \Gamma^T$ when the $i$th fault happens. In the presence of unstructured noise, the principal eigenvector of $\Sigma_y$, denoted as $v_i^{(1)}$ (the subscript $i$ means that the $i$th fault happens, and the superscript (1) indicates the eigenvector associated with the largest eigenvalue, also called the principal eigenvector), is no longer the same as $r_i$. Instead, it will fall in a cone, as shown in Figure 10.4. The boundary of the cone can be represented by the angle $\gamma_c$. The sample principal eigenvector of $S_y$, denoted as $u_s$, will differ from $v_i^{(1)}$ because of sample uncertainty. The final boundary for $u_s$ is illustrated by the dashed line in Figure 10.4. If the sample principal eigenvector falls within the confidence boundary of a fault signature vector $r_i$, we can claim that the corresponding fault occurs. Note that Figure 10.4 exaggerates the size of the region for the sake of illustration. When the sample size is large and the noise variance is small, the confidence range of $u_s$ is actually small.
FIGURE 10.4 The boundary for pattern matching of the columns of Γ. (Labels: $r_i$, $\gamma_c$, boundary of $v_i^{(1)}$, boundary of $u_s$.)
The subsequent sections will present the confidence boundary of $u_s$. We list our assumptions as follows:
1. The fault-quality relation can be adequately described by Equation 10.1. The process faults and the system noises are assumed to be independent of each other.
2. The covariance of $\varepsilon$ has a general form $\Sigma_\varepsilon$. $\Sigma_\varepsilon$ may be unknown, but the range of its eigenvalues is assumed known.
3. One fault occurs at a time, meaning that multiple faults do not occur in the system simultaneously.
The rationale of the second assumption is that $\varepsilon$ is usually dominated by measurement noises, whose range of eigenvalues represents the range of sensor accuracy that can be obtained from the sensor vendor's specification. Even if $\varepsilon$ comprises unmodeled faults and high-order nonlinear residuals, offline calibration could provide information regarding the range of eigenvalues of $\Sigma_\varepsilon$.
10.3.1 DISTURBANCE DUE TO UNSTRUCTURED NOISES
In this subsection we focus on the perturbation of unstructured noise on the eigenvectors of $\Sigma_y$. When the $i$th fault happens, the principal eigenvalue and eigenvector pairs of $\Gamma \Sigma_u \Gamma^T$, $\Sigma_y$, and $S_y$ are $\{\lambda_{u,i}^{(1)}, h_i^{(1)}\}$, $\{\lambda_{y,i}^{(1)}, v_i^{(1)}\}$, and $\{l_{y,i}^{(1)}, u_s\}$, respectively. As previously discussed, if only the $i$th fault occurs, the principal eigenvector $h_i^{(1)}$ of $\Gamma \Sigma_u \Gamma^T$ is the same as the $i$th column vector $r_i$ of $\Gamma$. In the presence of unstructured noises, the principal eigenvector $v_i^{(1)}$ of $\Sigma_y$ differs from $r_i$, and the boundary of this difference is related to the additive noise covariance matrix $\Sigma_\varepsilon$.
A useful result regarding the perturbation boundary caused by unstructured noise is given by Ding et al. [3]: When only the $i$th fault occurs in the system (i.e., only $\lambda_{u,i}^{(1)}$, the principal eigenvalue of $\Gamma \Sigma_u \Gamma^T$, is nonzero) and $\|\Sigma_\varepsilon\|_2 \le \lambda_{u,i}^{(1)}/4$, then
$$ \mathrm{dist}(\mathrm{span}\{r_i\}, \mathrm{span}\{v_i^{(1)}\}) \le \frac{4}{\lambda_{u,i}^{(1)}} \sqrt{\lambda_{\max}^2(\Sigma_\varepsilon) - \lambda_{\min}^2(\Sigma_\varepsilon)} \tag{10.7} $$
where $\lambda_{\max}(\Sigma_\varepsilon)$ and $\lambda_{\min}(\Sigma_\varepsilon)$ are the largest and the smallest eigenvalues of $\Sigma_\varepsilon$, and $\mathrm{dist}(\mathrm{span}\{r_i\}, \mathrm{span}\{v_i^{(1)}\})$ is the distance [5] between the subspaces spanned by $r_i$ and $v_i^{(1)}$. The condition $\|\Sigma_\varepsilon\|_2 \le \lambda_{u,i}^{(1)}/4$ implies that the largest variance of the noise term is at most one quarter of the variance of the process fault, which is not restrictive in practice.
Because $r_i$ and $v_i^{(1)}$ are two unit vectors (in the sense of the 2-norm) in $R^n$, the distance between the two subspaces $\mathrm{span}\{r_i\}$ and $\mathrm{span}\{v_i^{(1)}\}$ equals the sine of the angle $\Delta\theta$ between $r_i$ and $v_i^{(1)}$. That is, $\sin(\Delta\theta) = \mathrm{dist}(\mathrm{span}\{r_i\}, \mathrm{span}\{v_i^{(1)}\})$. The detailed proof of this result can be found in Li, Zhou, and Ding [4]. Based on this result and Equation 10.7, we have
$$ \Delta\theta \le \sin^{-1}\left( \frac{4}{\lambda_{u,i}^{(1)}} \sqrt{\lambda_{\max}^2(\Sigma_\varepsilon) - \lambda_{\min}^2(\Sigma_\varepsilon)} \right) = \gamma_c \tag{10.8} $$
The angle between $r_i$ and $v_i^{(1)}$ can be calculated by using $\cos(\Delta\theta) = v_i^{(1)T} \cdot r_i$. Its boundary will be denoted by
$$ \gamma_c = \sin^{-1}\left( \frac{4}{\lambda_{u,i}^{(1)}} \sqrt{\lambda_{\max}^2(\Sigma_\varepsilon) - \lambda_{\min}^2(\Sigma_\varepsilon)} \right) $$
hereafter. The preceding result tells us that the difference between $r_i$ and $v_i^{(1)}$, represented by the angle $\Delta\theta$ between them (as shown in Figure 10.4), is determined by the eigenvalue of $\Gamma \Sigma_u \Gamma^T$ and the extreme eigenvalues of $\Sigma_\varepsilon$. Also, note that the boundary specified by Equation 10.8 is a worst-case boundary. The geometric meaning of this boundary is shown in Figure 10.4; that is, the possible positions of $v_i^{(1)}$ form a hyperdimensional cone with $r_i$ as its central axis.
Equation 10.8 suggests that the perturbation boundary between $r_i$ and $v_i^{(1)}$ depends on $\lambda_{u,i}^{(1)}/\lambda_{\max}(\Sigma_\varepsilon)$ and $\lambda_{\max}(\Sigma_\varepsilon)/\lambda_{\min}(\Sigma_\varepsilon)$. The value of $\lambda_{u,i}^{(1)}/\lambda_{\max}(\Sigma_\varepsilon)$ can be regarded as the signal-to-noise ratio, whereas $\lambda_{\max}(\Sigma_\varepsilon)/\lambda_{\min}(\Sigma_\varepsilon)$ indicates the imbalance in accuracy associated with different measurement devices. As the signal-to-noise ratio increases, the perturbation boundary gets smaller. For instance, if $\lambda_{\max}(\Sigma_\varepsilon) = 0.030$, $\lambda_{u,i}^{(1)}/\lambda_{\max}(\Sigma_\varepsilon)$ is 10, and $\lambda_{\max}(\Sigma_\varepsilon)/\lambda_{\min}(\Sigma_\varepsilon)$ is around 10, the perturbation boundary (i.e., the angle) will be around 20°. However, if $\lambda_{u,i}^{(1)}/\lambda_{\max}(\Sigma_\varepsilon)$ is around 50, meaning that the fault magnitude is five times larger than before, the perturbation angle will reduce to about 5°. On the other hand, with a higher imbalance among the eigenvalues of $\Sigma_\varepsilon$, the perturbation boundary becomes larger. If all eigenvalues of $\Sigma_\varepsilon$ are the same, there will be no perturbation on $v_i^{(1)}$ even if the measurement device is not very accurate, that is, even if the eigenvalues of $\Sigma_\varepsilon$ are large but equal.
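The worst-case angle is straightforward to evaluate. The following sketch computes $\gamma_c$ from Equation 10.8, taking the bound in its square-root form, for the signal-to-noise ratios and imbalance quoted above; the function name is hypothetical.

```python
import math

def perturbation_bound_deg(lambda_u, lam_max, lam_min):
    """Worst-case angle gamma_c of Equation 10.8, in degrees.

    lambda_u : principal eigenvalue of Gamma * Sigma_u * Gamma^T
    lam_max, lam_min : extreme eigenvalues of Sigma_eps
    """
    if lam_max > lambda_u / 4.0:
        raise ValueError("condition ||Sigma_eps||_2 <= lambda_u / 4 is violated")
    s = 4.0 / lambda_u * math.sqrt(lam_max ** 2 - lam_min ** 2)
    return math.degrees(math.asin(min(1.0, s)))

lam_max = 0.030
lam_min = lam_max / 10.0                      # imbalance ratio of 10

gamma_snr10 = perturbation_bound_deg(10.0 * lam_max, lam_max, lam_min)
gamma_snr50 = perturbation_bound_deg(50.0 * lam_max, lam_max, lam_min)
gamma_equal = perturbation_bound_deg(10.0 * lam_max, lam_max, lam_max)

print(round(gamma_snr10, 1))   # about 23.5 degrees
print(round(gamma_snr50, 1))   # about 4.6 degrees
print(gamma_equal)             # 0.0 when all noise eigenvalues are equal
```

These values are of the same order as the approximate angles quoted in the text. Only the two ratios matter: rescaling all eigenvalues by a common factor leaves $\gamma_c$ unchanged.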
10.3.2 DISTURBANCE DUE TO SAMPLING UNCERTAINTY
This subsection presents a testing procedure for the principal eigenvector of a sample covariance matrix. We use the same notation as before; namely, $u_s$ and $v_i^{(1)}$ are the principal eigenvectors of $S_y$ and $\Sigma_y$, respectively, and the subscript $i$ indicates that the $i$th fault happens. According to Muirhead [6], if $\lambda_{y,i}^{(1)}$ is a distinct eigenvalue (which is true in our case when a single fault occurs) and $y$ follows a normal distribution, then $(N-1)^{1/2}(u_s - v_i^{(1)})$ asymptotically follows an $n$-variate normal distribution with zero mean and covariance matrix $\Pi$, where
$$ \Pi = \lambda_{y,i}^{(1)} \sum_{j=2}^{n} \frac{\lambda_{y,i}^{(j)}}{(\lambda_{y,i}^{(1)} - \lambda_{y,i}^{(j)})^2} \, v_i^{(j)} v_i^{(j)T} \tag{10.9} $$
Here $\lambda_{y,i}^{(j)}$ and $v_i^{(j)}$ denote the $j$th largest eigenvalue of $\Sigma_y$ and its eigenvector, $N$ is the sample size, and $n$ is the dimension of the measurement vector in Equation 10.1. Because of the sampling uncertainty, different realizations of $u_s$ will fill a hyperdimensional cone centered on the corresponding vector $v_i^{(1)}$.
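As a minimal sketch of Equation 10.9, the function below assembles $\Pi$ from the eigen-decomposition of a given population covariance matrix. The covariance used in the example (a unit-length signature with fault variance 1 and noise variance 0.01) and the sample size are assumptions chosen for illustration.

```python
import numpy as np

def asymptotic_covariance(Sigma_y):
    """Return (Pi, v1): Pi of Equation 10.9 and the principal eigenvector of Sigma_y."""
    eigvals, eigvecs = np.linalg.eigh(Sigma_y)
    lam = eigvals[::-1]                       # eigenvalues in descending order
    V = eigvecs[:, ::-1]
    lam1 = lam[0]
    Pi = np.zeros_like(Sigma_y, dtype=float)
    for j in range(1, len(lam)):              # sum over j = 2, ..., n
        weight = lam1 * lam[j] / (lam1 - lam[j]) ** 2
        Pi += weight * np.outer(V[:, j], V[:, j])
    return Pi, V[:, 0]

r = np.array([-0.263, 0.237, 0.862, 0.362])
r = r / np.linalg.norm(r)
Sigma_y = 1.0 * np.outer(r, r) + 0.01 * np.eye(4)   # assumed variances

Pi, v1 = asymptotic_covariance(Sigma_y)

# Rough root-mean-square size of u_s - v1 for a sample of N parts, read off
# the asymptotic distribution: sqrt(trace(Pi) / (N - 1)).
N = 100
rms_deviation = float(np.sqrt(np.trace(Pi) / (N - 1)))
print(round(float(np.trace(Pi)), 4), round(rms_deviation, 4))
```

$\Pi$ has no component along $v_i^{(1)}$ itself, so to first order the sample eigenvector moves only within the directions orthogonal to the principal eigenvector.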
Based on Equation 10.9, given a symptom vector $u_s$, an asymptotic test can be developed to determine whether the principal eigenvector of $\Sigma_y$ is $v_i^{(1)}$. The test statistic can be formulated as
$$ W_i \equiv (N-1)\,(v_i^{(1)} - u_s)^T \left( l_{y,i}^{(1)} S_y^{-1} - 2I + \frac{1}{l_{y,i}^{(1)}} S_y \right) (v_i^{(1)} - u_s) \tag{10.10} $$
It is known that $W_i$ asymptotically follows the $\chi^2_{n-1}$ distribution if the hypothesis is true. If $W_i > \chi^2_{n-1}(\alpha)$, where $\chi^2_{n-1}(\alpha)$ is the upper $100\alpha\%$ point of the $\chi^2_{n-1}$ distribution, we reject the hypothesis, meaning that $v_i^{(1)}$ is not the principal eigenvector of $\Sigma_y$ under the sampling uncertainty. Otherwise, we accept the hypothesis, meaning that $v_i^{(1)}$ is the principal eigenvector of $\Sigma_y$.
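A minimal sketch of the statistic in Equation 10.10 follows. The data are simulated for a single fault along the first scaled signature of Section 10.2, and the statistic is evaluated for the correct signature and for a different one; the variances, sample size, and seed are assumptions, and the critical value 11.345 is the tabulated upper 1% point of the chi-square distribution with 3 degrees of freedom.

```python
import numpy as np

def eigenvector_test_statistic(Y, v):
    """W of Equation 10.10 for a hypothesized unit vector v; Y is an N x n data matrix."""
    N, n = Y.shape
    S = np.cov(Y, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)
    l1, u_s = eigvals[-1], eigvecs[:, -1]
    if u_s @ v < 0:                           # eigenvectors are defined up to sign
        u_s = -u_s
    M = l1 * np.linalg.inv(S) - 2.0 * np.eye(n) + S / l1
    d = v - u_s
    return float((N - 1) * d @ M @ d)

rng = np.random.default_rng(0)
r1 = np.array([-0.263, 0.237, 0.862, 0.362])
r1 = r1 / np.linalg.norm(r1)
r2 = np.array([0.0, -1.0, -1.0, 0.0]) / np.sqrt(2.0)   # second scaled signature

N = 200
u = rng.normal(0.0, 1.0, size=(N, 1))         # assumed fault standard deviation
eps = rng.normal(0.0, 0.1, size=(N, 4))       # assumed noise standard deviation
Y = u * r1 + eps

W_true = eigenvector_test_statistic(Y, r1)
W_wrong = eigenvector_test_statistic(Y, r2)
critical = 11.345                             # chi-square, 3 d.o.f., alpha = 0.01

print(W_true, W_wrong, critical)
```

The hypothesis would be rejected for any candidate whose statistic exceeds the critical value; the mismatched signature produces a statistic several orders of magnitude above it.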
10.3.3 A ROBUST PATTERN MATCHING PROCEDURE
A robust pattern matching procedure that considers the effects of both noise perturbation and sampling uncertainty is developed in this subsection. Consider two faults, represented by $r_i$ and $r_j$, in Figure 10.5. Vectors $r_i$ and $r_j$ are fault signature vectors from matrix $\Gamma$. Because of the perturbation of unstructured noises, there are two cones centered around $r_i$ and $r_j$, respectively, within which the eigenvectors of $\Sigma_y$, $v_i^{(1)}$ and $v_j^{(1)}$, will fall. Once measurement data are collected, the fault symptom vector $u_s$, the principal eigenvector, can be calculated from $S_y$. We want to identify exactly which fault happens. There are two cases: (1) $u_s$ falls into one of the two cones, and (2) $u_s$ falls outside both cones. In case 1, because $u_s$, even under the extra sampling uncertainty, still falls within the confidence boundary of the population covariance matrices, we conclude that the corresponding fault occurs. The decision rule is then simple and the same as when there is no sampling uncertainty; namely, we can immediately claim that the fault represented by $r_i$ or $r_j$, whichever is applicable, occurs. In case 2, we need to further calculate a test statistic similar to that in Equation 10.10 to decide which fault has occurred.
FIGURE 10.5 The method to identify the boundary of $v_i^{(1)}$. (Labels: $r_i$, $r_j$, $v_i^{(1)}$, $v_j^{(1)}$, $u_s$, boundary due to noise.)
However, owing to the perturbation from the unstructured noise $\varepsilon$, we do not know the precise position of $v_i^{(1)}$. Instead, we only know the worst-case boundary within which $v_i^{(1)}$ falls. One straightforward approach would be to use the worst-case boundary as the position of $v_i^{(1)}$. However, a simple numerical study shows that the test statistic in Equation 10.10 using a "worst-case" $v_i^{(1)}$ (i.e., using the value on the worst-case boundary) often yields a very large $W_i$ value, so that the hypothesis that $v_i^{(1)}$ is the principal eigenvector of $\Sigma_y$ is always rejected and a high miss detection rate results. To reduce the miss detection rate, we adopt a conservative approach to locate $v_i^{(1)}$ for the test in Equation 10.10. Instead of using a boundary vector, we try to find a vector located within the hyperdimensional cone (including the boundary) that minimizes the $W_i$ value. This vector is then substituted in Equation 10.10 as $v_i^{(1)}$ to calculate $W_i^*$. A typical nonlinear programming problem to find an appropriate $v_i^{(1)}$ can be formulated as follows:
Objective:
$$ W_i^* = \min_{v_i^{(1)}} \; (N-1)\,(v_i^{(1)} - u_s)^T \left( l_{y,i}^{(1)} S_y^{-1} - 2I + \frac{1}{l_{y,i}^{(1)}} S_y \right) (v_i^{(1)} - u_s) $$
Constraints:
$$ v_i^{(1)T} v_i^{(1)} = 1, \qquad \Delta\theta_i = \cos^{-1}\left( \frac{v_i^{(1)T} r_i}{\|r_i\|} \right) \le \gamma_c $$
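The formulation above can be sketched numerically. The sketch below uses the SLSQP solver in SciPy as a stand-in for a general constrained optimizer; it is not the routine used for the numerical study reported in this chapter. The angle constraint is written in the equivalent form $v^T r_i \ge \cos\gamma_c$ for a unit-length $r_i$, and the covariance matrix, sample size, and $\gamma_c$ in the example are assumed values.

```python
import numpy as np
from scipy.optimize import minimize

def minimize_w(S, N, r, gamma_c):
    """Minimize W of Equation 10.10 over unit vectors within angle gamma_c (radians) of r.

    Returns (v_star, W_star). S is the sample covariance matrix.
    """
    r = r / np.linalg.norm(r)
    eigvals, eigvecs = np.linalg.eigh(S)
    l1, u_s = eigvals[-1], eigvecs[:, -1]
    if u_s @ r < 0:                           # fix the sign of the eigenvector
        u_s = -u_s
    M = l1 * np.linalg.inv(S) - 2.0 * np.eye(S.shape[0]) + S / l1

    def W(v):
        d = v - u_s
        return (N - 1) * d @ M @ d

    cos_t = float(np.clip(u_s @ r, -1.0, 1.0))
    if np.arccos(cos_t) <= gamma_c:
        return u_s, 0.0                       # u_s is inside the cone, so W* = 0

    # Feasible starting point: the cone-boundary vector in the plane of r and u_s.
    w = u_s - cos_t * r
    w = w / np.linalg.norm(w)
    v0 = np.cos(gamma_c) * r + np.sin(gamma_c) * w

    constraints = [
        {"type": "eq", "fun": lambda v: v @ v - 1.0},
        {"type": "ineq", "fun": lambda v: v @ r - np.cos(gamma_c)},
    ]
    res = minimize(W, v0, method="SLSQP", constraints=constraints)
    v = res.x / np.linalg.norm(res.x)
    feasible = v @ r >= np.cos(gamma_c) - 1e-9
    if res.success and feasible and W(v) < W(v0):
        return v, float(W(v))
    return v0, float(W(v0))                   # fall back to the feasible start

# Assumed example: the dominant direction is tilted 10 degrees away from r,
# the noise is unstructured (unequal variances), and gamma_c is 5 degrees.
r = np.array([-0.263, 0.237, 0.862, 0.362])
r = r / np.linalg.norm(r)
e = np.array([1.0, 0.0, 0.0, 0.0])
w_dir = e - (e @ r) * r
w_dir = w_dir / np.linalg.norm(w_dir)
t = np.cos(np.radians(10.0)) * r + np.sin(np.radians(10.0)) * w_dir
S = np.outer(t, t) + np.diag([0.010, 0.020, 0.015, 0.012])
gamma_c = np.radians(5.0)

v_star, W_star = minimize_w(S, N=100, r=r, gamma_c=gamma_c)
print(W_star)
```

Starting from a feasible boundary point and falling back to it if the solver fails keeps the returned vector inside the cone, which matters because the decision depends on comparing $W_i^*$ with the chi-square critical value.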
In the preceding formulation, the objective is to find a $v_i^{(1)}$ that minimizes $W_i$ while satisfying a set of constraints. The constraints mean that $v_i^{(1)}$ should remain a unit vector and that the angle between $v_i^{(1)}$ and $r_i$ should not exceed the worst-case boundary $\gamma_c$. Based on the Kuhn–Tucker conditions [7], we can convert the inequality constraint to an equation and obtain the solution for $v_i^{(1)}$. In our numerical study, we use the function fmincon in Matlab to solve this nonlinear programming problem.
After obtaining the vectors $v_i^{(1)}$ and $u_s$, the final step is to calculate the value of $W_i^*$ and compare it with the upper $100\alpha\%$ point of the $\chi^2_{n-1}$ distribution. Our decision rule using the optimized $v_i^{(1)}$ and $W_i^*$ follows what was previously stated: if $W_i^*$ is not greater than the critical value, we claim that the corresponding fault occurs; otherwise, we claim that this fault does not occur. This conclusion is conservative in terms of fault isolation. Because the vector that minimizes the $W_i^*$ statistic might not be the true $v_i^{(1)}$, we are more inclined to conclude that a fault has indeed occurred. Subsequent investigations are still needed to verify the conclusion. In practice, this strategy fits fault-critical applications. In such cases, we prefer a small miss detection rate and are very cautious about any potential signs of significant process faults because of the very high warranty cost associated with quality problems.
After identifying the $i$th process fault, we can further estimate the variance magnitude of the fault. We have [5],
$$ \lambda_{u,i}^{(1)} + \lambda_{\min}(\Sigma_\varepsilon) \le \lambda_{y,i}^{(1)} \le \lambda_{u,i}^{(1)} + \lambda_{\max}(\Sigma_\varepsilon) \tag{10.11} $$
Based on Equation 10.11, we can get the range of the eigenvalue of $\Gamma \Sigma_u \Gamma^T$ as follows:
$$ \lambda_{y,i}^{(1)} - \lambda_{\max}(\Sigma_\varepsilon) \le \lambda_{u,i}^{(1)} \le \lambda_{y,i}^{(1)} - \lambda_{\min}(\Sigma_\varepsilon) \tag{10.12} $$
It is known that $\sigma_i^2 (r_i^T \cdot r_i)$ is the eigenvalue of $\Gamma \Sigma_u \Gamma^T$ corresponding to the process fault. Substituting $\lambda_{u,i}^{(1)} = \sigma_i^2 (r_i^T \cdot r_i)$ in Equation 10.12 and using the average of the upper and lower bounds in Equation 10.12 as an approximation, we estimate the variation magnitude of a fault as
$$ \sigma_i^2 \approx \frac{1}{2 (r_i^T \cdot r_i)} \left\{ l_{y,i}^{(1)} - \lambda_{\min}(\Sigma_\varepsilon) + l_{y,i}^{(1)} - \lambda_{\max}(\Sigma_\varepsilon) \right\} \tag{10.13} $$
where $\lambda_{y,i}^{(1)}$ is substituted by its sample version $l_{y,i}^{(1)}$. If the variance of the noise is relatively small compared with that of a process fault, which is usually the case, the approximation error will be small. The reason is that when the largest and smallest eigenvalues of $\Sigma_\varepsilon$ become smaller, so does the range stated in Equation 10.12.
To summarize, the proposed pattern matching procedure is as follows:
S1: A fault-quality model $y = \Gamma \cdot u + \varepsilon$ is developed. Based on the model, fault signature vectors $r_k$, $k = 1, \ldots, p$, are obtained from the columns of $\Gamma$.
S2: The multivariate measurements $y$ of product quality features are obtained during production. The sample size is $N$ and the dimension of the measurements is $n$.
S3: Based on $y$, calculate the sample covariance matrix $S_y$ and the principal eigenvector $u_s$ using PCA.
S4: Estimate $\lambda_{\max}(\Sigma_\varepsilon)$ and $\lambda_{\min}(\Sigma_\varepsilon)$ from an analysis of the accuracy specification of the measurement system. Based on Equation 10.8, calculate $\gamma_c$. Here, we often need to substitute $\lambda_{u,i}^{(1)}$ with the tolerance specification of the corresponding process variable if the precise value of $\lambda_{u,i}^{(1)}$ is not yet known. Then, calculate the angles $\Delta\theta_k$ between $r_k$ and $u_s$, $k = 1, \ldots, p$. If all of the angles are larger than $\gamma_c$, go to the next step; otherwise go to step S6.
S5: Select a confidence level $1 - \alpha$. Solve the nonlinear programming problem to find the optimal $v_k^{(1)}$, $k = 1, \ldots, p$, and calculate the values of $W_k^*$, $k = 1, \ldots, p$. If all the values are larger than $\chi^2_{n-1}(\alpha)$, we can claim that none of the known faults happens and the system is working normally; otherwise go to the next step.
S6: If $\Delta\theta_k \le \gamma_c$ or $W_k^* \le \chi^2_{n-1}(\alpha)$, we can claim that the $k$th fault, represented by the $k$th column of matrix $\Gamma$, happens. After that, the variation magnitude of the fault can be calculated by Equation 10.13. If all $W_k^*$, $k = 1, \ldots, p$, are larger than $\chi^2_{n-1}(\alpha)$, then we claim that none of the known faults happens and the system is working normally.
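Steps S3, S4, and the angle branch of S6, together with the variance estimate of Equation 10.13, can be sketched as follows. The sketch reuses the coefficient matrix of Equation 10.6 as $\Gamma$ and simulates a fault on the third locator; the noise levels, tolerance, sample size, and seed are assumptions, a single $\gamma_c$ is used for all candidate faults for simplicity, and step S5 is deliberately left out.

```python
import numpy as np

def angle_between(a, b):
    """Angle in radians between the lines spanned by a and b."""
    c = abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(min(1.0, c)))

def estimate_fault_variance(l1, r, lam_max, lam_min):
    """Equation 10.13: midpoint of the bounds in Equation 10.12 divided by r^T r."""
    return ((l1 - lam_min) + (l1 - lam_max)) / (2.0 * (r @ r))

def screen_faults(Y, Gamma, gamma_c, lam_max, lam_min):
    """Return (fault index, variance estimate), or (None, None) if step S5 is needed."""
    S = np.cov(Y, rowvar=False)                               # S3
    eigvals, eigvecs = np.linalg.eigh(S)
    l1, u_s = eigvals[-1], eigvecs[:, -1]
    angles = [angle_between(Gamma[:, k], u_s) for k in range(Gamma.shape[1])]  # S4
    k = int(np.argmin(angles))
    if angles[k] <= gamma_c:                                  # S6, angle branch
        return k, estimate_fault_variance(l1, Gamma[:, k], lam_max, lam_min)
    return None, None                                         # proceed to S5

Gamma = np.array([[-1.0, 0.0, 0.0],
                  [0.9, -1.9, 0.0],
                  [3.275, -1.9, -2.375],
                  [1.375, 0.0, -2.375]])
noise_std = np.array([0.008, 0.012, 0.010, 0.009])            # assumed sensor accuracies
lam_min, lam_max = noise_std.min() ** 2, noise_std.max() ** 2

# S4: substitute lambda_u with the tolerance-based value (assumed tolerance 0.05).
fault, tolerance = 2, 0.05
lambda_u = tolerance ** 2 * (Gamma[:, fault] @ Gamma[:, fault])
gamma_c = float(np.arcsin(4.0 / lambda_u * np.sqrt(lam_max ** 2 - lam_min ** 2)))

rng = np.random.default_rng(1)
N = 2000
true_std = 0.08                                               # assumed fault magnitude
u3 = rng.normal(0.0, true_std, size=N)
Y = np.outer(u3, Gamma[:, fault]) + rng.normal(size=(N, 4)) * noise_std

k, var_hat = screen_faults(Y, Gamma, gamma_c, lam_max, lam_min)
print(k, var_hat, true_std ** 2)
```

Because the columns of $\Gamma$ are not scaled to unit length here, the division by $r_i^T r_i$ in Equation 10.13 is what returns the estimate to the units of the locator variance.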
10.4 CASE STUDY
10.4.1 A MACHINING PROCESS AND ITS VARIATION PROPAGATION MODEL
The manufacturing process considered here is the same as that used in Chapter 7. To simplify the problem, we consider only the first two operations (milling the cover face and the joint face) in this case study. A total of 15 points on the cover face and 16 points on the joint face are measured to determine the quality of the machining operation. These measurement points are evenly distributed on the two surfaces. Therefore, y is a 31 × 1 vector (i.e., n = 31), consisting of the deviations at 15 points on the cover face and 16 points on the joint face. Fault u is then a 6 × 1 vector (i.e., p = 6), where the first three elements correspond to the three pins at the first operation and the last three elements correspond to the three pins at the second operation. A fault-quality model can be established from a sophisticated kinematic analysis (refer to Chapter 7 for details). Matrix Γ of this specific process is given as:
\[
\Gamma = \begin{bmatrix} \Gamma_1 \\ \Gamma_2 \end{bmatrix} \tag{10.14}
\]
where $\Gamma_1 = [\Gamma_{11}\ \ 0]$, $\Gamma_2 = [0\ \ \Gamma_{22}]$, and
\[
\Gamma_{11} = \begin{bmatrix}
-0.5402 & -0.6455 & -0.2664 & 0.0683 & 0.5158 & 0.8555 & 0.9902 & 1.1863 & 1.3106 & 1.0177 & 0.6394 & 0.3403 & 0.0530 & -0.0820 & -0.2776 \\
0.9690 & 0.5991 & 0.2203 & -0.1140 & -0.5612 & -0.9005 & -0.7058 & -0.4225 & -0.2428 & 0.1052 & 0.4597 & 0.7585 & 1.0455 & 0.8192 & 0.5367 \\
0.5711 & 1.0464 & 1.0461 & 1.0458 & 1.0453 & 1.0450 & 0.7156 & 0.2362 & -0.0678 & -0.1229 & -0.0991 & -0.0988 & -0.0985 & 0.2628 & 0.7409
\end{bmatrix}^T
\]
\[
\Gamma_{22} = \begin{bmatrix}
-0.6349 & -0.4590 & -0.0332 & 0.2510 & 0.6955 & 1.0215 & 1.1372 & 1.1155 & 0.9048 & 0.6472 & 0.5864 & 0.3875 & 0.0777 & -0.1490 & -0.4063 & 0.0776 \\
0.9602 & 0.6817 & 0.2556 & -0.0288 & -0.4736 & -0.5857 & -0.5031 & -0.3477 & -0.1368 & 0.1210 & -0.0142 & 0.3998 & 0.7098 & 0.9368 & 0.9339 & 0.4496 \\
0.6747 & 0.7773 & 0.7776 & 0.7778 & 0.7781 & 0.5641 & 0.3659 & 0.2322 & 0.2320 & 0.2318 & 0.4279 & 0.2127 & 0.2124 & 0.2123 & 0.4725 & 0.4728
\end{bmatrix}^T
\]
The six column vectors of Γ are the signature vectors of six potential faults. In this example, noise ε is dominated by errors associated with the measurement device. Because the orientation of the CMM probe differs when it measures the joint face and the cover face, the variances of the measurement errors differ for measurement points on these two surfaces. Based on engineering analysis, the standard deviations of components in εi are between 0.008 and 0.015 mm. Furthermore, from the design specification of our fixture system, the locating tolerance is around 0.050 mm. If the standard deviation of a component in u is larger than this tolerance, we deem that the corresponding fault has occurred. Next we perform root cause identification using the pattern matching technique based on this fault-quality model and the CMM measurements.
10.4.2 PATTERN MATCHING FOR ROOT CAUSE IDENTIFICATION IN THE MACHINING PROCESS
Given that the standard deviations of components in εi are between 0.008 and 0.015 mm, we randomly generate a positive definite matrix with diagonal elements falling in the range of 64 × 10–6 to 225 × 10–6 mm2. We then use this matrix as the noise covariance matrix, whose largest and smallest eigenvalues, λmax(Σε) and λmin(Σε), can be easily computed. We consider the situation when the third fault occurs. The variations of u are specified accordingly as:
\[
\operatorname{var}\, u(k) = \begin{cases} c_1\ \text{mm}^2, & k = 3 \\ c_2\ \text{mm}^2, & k \ne 3 \end{cases}
\]
where $c_1$ is larger than 2.5 × 10–3, the square of the locating tolerance, and $c_2$ is smaller than this value. This means that variation source number 3 occurs in the system. We set the significance level of the hypothesis test to α = 0.01. The theoretical chi-square critical value in Equation 10.12 is $\chi^2_{30}(0.01) = 50.892$.
First, we would like to compare the miss detection rates of the proposed method with those of two prior methods developed in Ding et al. [3] and Rong et al. [2], respectively. The method in Ding et al. [3] considers perturbation of unstructured noises but not sampling uncertainty, whereas the one in Rong et al. [2] considers sampling uncertainty in the absence of unstructured noise. We conducted 1000 replicates of fault detection using the aforementioned three procedures for a set of combinations of sample size N and perturbation angle $\gamma_c$. Table 10.1 lists the resulting miss detection rates of different procedures, which are
TABLE 10.1 Miss Detection Rates of Three Pattern Matching Methods

γc          N = 50                   N = 100                  N = 400                  N = 1000
(degrees)   e1      e2      e3       e1      e2      e3       e1      e2      e3       e1      e2      e3
0           100%    83.8%   83.6%    100%    30.1%   30.0%    100%    4.0%    4.0%     100%    2.3%    2.3%
0.54        100%    85.7%   0        100%    30.2%   0        100%    3.5%    0        99.9%   3.1%    0
0.91        100%    85.9%   0        100%    30.0%   0        100%    4.1%    0        95.7%   3.0%    0
2.42        100%    86.4%   0        99.8%   30.0%   0        69.7%   4.1%    0        19.4%   3.4%    0
4.06        96.7%   87.6%   0        74.2%   30.4%   0        7.4%    6.1%    0        0       7.4%    0
9.94        67.5%   88.3%   0        30.4%   33.8%   0        0       10.7%   0        0       21.7%   0
15.62       42.6%   88.7%   0        15.6%   37.7%   0        0       14.4%   0        0       31.5%   0
20.29       36.5%   88.3%   0        10.1%   37.4%   0        0       16.3%   0        0       43.2%   0
29.08       26.7%   90.1%   0        6.7%    47.4%   0        0       21.4%   0        0       51.2%   0
the rates of the number of miss detection cases over the total number of tests (i.e., 1000 replications). We denote by e1, e2, and e3 the miss detection rates of the procedure without considering sample uncertainty, the procedure without considering unstructured disturbances, and the robust method described in this chapter, respectively. From the table we can observe the following:
1. The value of e1 could be very high when the sample size is small. For instance, e1 is nearly 100% for a sample size as high as N = 1000 when γc < 2.42°. The miss detection rate e1 depends on the value of γc; given the same sample size, the larger γc, the smaller e1. This is understandable because γc provides the worst-case boundary, and a larger γc can help in offsetting disturbance from sampling uncertainty. When a smaller sample size such as N = 50 is used, e1 will have a considerably higher value even for a γc as large as 29.08°. Too large a γc is not preferred because the worst-case boundaries of different faults will easily overlap. On the other hand, if we have a large N (say, N = 1000), the miss detection rate will indeed reduce to zero provided that γc is of a reasonable size (say, γc ≥ 4.06°).
2. The value of e2 will increase very rapidly when γc is nonzero. This is particularly obvious for a large sample. Given the same γc, e2 tends to decrease when the sample size increases.
3. Using the robust procedure, the miss detection rate is never larger than that of the other two methods. In fact, except for the cases when γc = 0, the robust method has a near-zero miss detection rate. For γc = 0, using the robust procedure is equivalent to using the procedure without considering unstructured noise perturbation, and e3 is almost exactly the same as e2 under that condition.
Usually, a low miss detection rate is accompanied by a high false alarm rate, which is the probability of identifying a normal working condition as a faulty working condition.
Because we adopted a conservative approach to find Wi* for diagnosis, it will not be surprising if the false alarm rate is higher than that of the other two methods. In this study, we used the same aforementioned example but changed the variances of u to $c_1 = 20 \times 10^{-4}$ and $c_2 = 2 \times 10^{-4}$, both of which are smaller than the 2.5 × 10–3 tolerance level and thus correspond to the case in which no fault occurs. We again replicate our numerical study 1000 times and list the resulting false alarm rates in Table 10.2, where the false alarm rate δ is defined as the ratio of the number of false alarms to the total number of tests (1000 replicates). As with the notation used for the miss detection rate, δ1, δ2, and δ3 represent the false alarm rates using these three procedures, respectively. From this table we find that δ3 is larger than δ1 and δ2 under almost all circumstances. As explained earlier, this is not surprising. However, the false alarm rate using our proposed procedure is not alarmingly high. When there is an unstructured noise disturbance (say, γc > 4.06°) or the sample size is relatively large (say, N > 400), the false alarm rate using our method is quite comparable to that of the other two.
TABLE 10.2 False Alarm Rates of Three Pattern Matching Methods

γc          N = 50                   N = 100                  N = 400                  N = 1000
(degrees)   δ1      δ2      δ3       δ1      δ2      δ3       δ1      δ2      δ3       δ1      δ2      δ3
0           0       2.47%   3.08%    0       11.63%  12.42%   0       16.03%  16.30%   0       16.35%  16.45%
0.54        0       2.10%   5.85%    0       11.80%  14.60%   0       15.92%  16.67%   0       16.35%  16.67%
0.91        0       2.07%   16.63%   0       11.50%  16.65%   0       16.10%  16.67%   0       16.20%  16.67%
2.42        0       2.05%   16.73%   0       11.45%  16.67%   0.22%   16.08%  16.67%   7.85%   16.15%  16.67%
4.06        0       2.02%   16.88%   0.17%   11.47%  16.67%   11.17%  16.07%  16.67%   16.60%  16.22%  16.67%
9.94        4.52%   2.07%   16.92%   14.72%  11.33%  16.67%   16.67%  15.92%  16.67%   16.67%  16.15%  16.67%
15.62       14.95%  2.00%   17.17%   16.67%  11.28%  16.67%   16.67%  15.82%  16.67%   16.67%  15.65%  16.67%
20.29       16.38%  1.95%   17.20%   16.67%  11.02%  16.67%   16.67%  15.33%  16.67%   16.67%  14.50%  16.67%
29.08       16.67%  1.93%   18.15%   16.67%  10.40%  16.69%   16.67%  13.98%  16.67%   16.67%  10.13%  16.67%
Combining the results for the miss detection rate and the false alarm rate, we find that our proposed method gains significantly in detection power while giving slightly more false alarms. Overall, the proposed procedure is more robust in identifying the occurring fault and is a preferable tool for root cause identification.
Finally, we show some estimates of the fault magnitudes in Table 10.3. The fault magnitude is estimated under the following conditions: N = 400 and the noise covariance matrix is fixed at λmax(Σε) = 3 × 10–4 and λmin(Σε) = 10–6. Each time, we increase the variance level of one element in u so that it becomes 10 times larger than that of the rest of the elements in u. This element then becomes an outstanding fault. Then we use Equation 10.13 to estimate its magnitude. The same practice is repeated for all p elements in u. The variations of u are defined as follows:
\[
\operatorname{var}\, u(k) = \begin{cases} 2 \times 10^{-2}\ \text{mm}^2, & k = i \\ 2 \times 10^{-3}\ \text{mm}^2, & k \ne i \end{cases}, \quad i = 1, \ldots, p
\]
From Table 10.3, one can see that the estimated magnitudes for all p faults are very close to the true value, which is 2 × 10–2 mm2.
TABLE 10.3 The Estimated Values of the Fault Magnitudes (10–2 mm2)

Occurring fault i    1        2        3        4        5        6
Magnitude            2.035    2.055    1.995    2.075    2.075    2.015
This numerical case study illustrates the necessity of considering both sample uncertainty and unstructured noise perturbation. The method described in this chapter is more robust than previous methods.
10.5 SUMMARY
Pattern matching is a widely used approach for root cause identification in quality improvement. It has a few obvious advantages: there is a clear geometric explanation of fault signatures or fault patterns; the results can thus be easily visualized with a physical interpretation attached; and it is intuitive and easy for practitioners to implement and execute. This chapter presented a procedure that makes the pattern matching method more robust under perturbation due to unstructured noise and disturbance due to sampling uncertainty. Case studies have been conducted to illustrate the effectiveness of this method. It was demonstrated that the detection capability using the new procedure is significantly improved. This method can be used for rapid root cause identification in manufacturing processes and will help reduce the variability of products. In the next chapter, the direct-estimation-based method will be introduced.
The technique introduced in this chapter mainly deals with the single variation source case, i.e., only one variation source exists in the system. However, the method can also be extended to the case of multiple variation sources. One important result, which can be derived from the factor rotation properties in multivariate statistics [6], is that the eigenvectors of Σy associated with eigenvalues larger than $\sigma_\varepsilon^2$ span the same linear space as that spanned by the column vectors of the Γ matrix corresponding to the significant variation sources. From this observation, if multiple variation sources exist in the process simultaneously, we can determine them by checking the closeness of the linear space spanned by the symptom vectors to that spanned by the selected signature vectors. The closeness of the two spaces can be measured by the principal angle. The pattern matching procedure for the multiple faults case considering both sample uncertainty and unstructured noise perturbation is fairly involved. The details are omitted in this chapter. Interested readers can refer to Li and Zhou [8].
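The subspace comparison mentioned above can be sketched numerically. The snippet below is only an illustration of principal angles computed from orthonormal bases and an SVD; it is not the robust multiple-fault procedure of Li and Zhou [8]. The function name, the choice of active sources, and the use of the population covariance (instead of a sample covariance) are assumptions made for the example, and the Γ matrix is the one from Exercise 1 in Section 10.6.

```python
import numpy as np

def principal_angles_deg(A, B):
    """Principal angles (degrees, ascending) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)          # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)          # orthonormal basis of span(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))

Gamma = np.array([[0.0579, 0.1389, 0.2722],
                  [0.3529, 0.2028, 0.1988],
                  [0.8132, 0.1987, 0.0153],
                  [0.0099, 0.6038, 0.7468]])
sigma2_u = np.array([1.5, 1.5, 0.0])   # assumed: sources 1 and 2 are both active
sigma2_eps = 0.08                      # isotropic noise variance (assumed)
Sigma_y = Gamma @ np.diag(sigma2_u) @ Gamma.T + sigma2_eps * np.eye(4)

# Symptom vectors: eigenvectors whose eigenvalues exceed the noise level.
eigvals, eigvecs = np.linalg.eigh(Sigma_y)
symptoms = eigvecs[:, eigvals > sigma2_eps + 1e-9]

right = principal_angles_deg(symptoms, Gamma[:, [0, 1]])  # true pair of sources
wrong = principal_angles_deg(symptoms, Gamma[:, [0, 2]])  # a wrong pair
print(right.max(), wrong.max())
```

With the population covariance, the largest principal angle to the true pair of signature vectors is numerically zero, whereas a wrong pair leaves a clearly nonzero angle. With a sample covariance the first angle would be small but not zero, which is where the boundaries discussed in this chapter come in.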
10.6 EXERCISES
1. For a linear system y = Γ ⋅ u + ε, we know
\[
\Gamma = \begin{bmatrix} 0.0579 & 0.1389 & 0.2722 \\ 0.3529 & 0.2028 & 0.1988 \\ 0.8132 & 0.1987 & 0.0153 \\ 0.0099 & 0.6038 & 0.7468 \end{bmatrix},
\]
the covariance of the components of u is
\[
\Sigma_u = \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\]
and the variances of the different components of noise ε are all equal to 0.08. (Note: if the angle θ between an eigenvector, v, and a specified vector, r, is close to 180°, you may consider using |180° – θ| as the angle between those two vectors.)
a. What is the dimension of y?
b. What is the covariance of the components of y, Σy?
c. What are the eigenvectors of Σy?
d. Is there a specific relationship between the eigenvectors of Σy and the matrix Γ?
e. Assuming
\[
\Sigma_u = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1.5 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\]
redo parts (a) to (d).
2. For the same linear system as in Exercise 1, we have:
\[
\Gamma = \begin{bmatrix} 0.0579 & 0.1389 & 0.2722 \\ 0.3529 & 0.2028 & 0.1988 \\ 0.8132 & 0.1987 & 0.0153 \\ 0.0099 & 0.6038 & 0.7468 \end{bmatrix}, \quad
\Sigma_u = \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \text{and} \quad
\Sigma_\varepsilon = \begin{bmatrix} 0.08 & 0 & 0 & 0 \\ 0 & 0.08 & 0 & 0 \\ 0 & 0 & 0.08 & 0 \\ 0 & 0 & 0 & 0.08 \end{bmatrix}.
\]
a. Use computer simulation to generate 50 measurements of y by using the system equation y = Γ ⋅ u + ε. Assume all the variables are normally distributed with zero mean.
b. Calculate the sample covariance of the 50 measurements of y, denoted as Sy.
c. Compare Sy with Σy obtained from Exercise 1. Are they the same? If not, why?
d. Calculate the eigenvectors of Sy and compare them with the eigenvectors of Σy. Are they the same? If not, why?
e. Repeat the simulation 500 times to generate 500 groups of 50 measurements of y. For each group of measurements, calculate the sample covariance matrix and the corresponding eigenvalues and eigenvectors. Plot the histogram of the first component of the eigenvector corresponding to the largest eigenvalue. How is this component roughly distributed? What conclusion can we arrive at from this simulation exercise?
3. Consider the linear system of Exercise 1 except that the covariance of noise ε is
\[
\Sigma_\varepsilon = \begin{bmatrix} 0.0127 & 0.0041 & 0.0088 & 0.0076 \\ 0.0041 & 0.0024 & 0.0023 & 0.0030 \\ 0.0088 & 0.0023 & 0.0068 & 0.0048 \\ 0.0076 & 0.0030 & 0.0048 & 0.0059 \end{bmatrix}.
\]
Use
\[
\Gamma = \begin{bmatrix} 0.0579 & 0.1389 & 0.2722 \\ 0.3529 & 0.2028 & 0.1988 \\ 0.8132 & 0.1987 & 0.0153 \\ 0.0099 & 0.6038 & 0.7468 \end{bmatrix} \quad \text{and} \quad
\Sigma_u = \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
a. What is the covariance of the components of y, Σy?
b. What are the eigenvectors of Σy?
c. What is the angle between the eigenvector of Σy that is associated with the largest eigenvalue and the first column of Γ?
d. What is the predicted boundary of this angle? Is the angle in (c) less than this boundary?
e. Assuming
\[
\Sigma_u = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1.5 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\]
redo parts (a) to (d).
(Note: The covariance of ε can be changed to generate new problems. It should be generated in Matlab as $A D A^T$, where A can be an arbitrary matrix and D is a diagonal matrix whose diagonal elements are positive. After the matrix is generated, we need to check its eigenvalues to make sure that they are all small. If not, we can shrink the diagonal elements of D and regenerate it.)
4. Consider the linear system of Exercise 3 and assume
\[
\Sigma_u = \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
a. Use computer simulation to generate 50 measurements of y. Assume all the variables are normally distributed with zero mean.
b. Calculate the sample covariance of the 50 measurements of y, denoted as Sy.
c. Calculate the eigenvectors of Sy and compare them to the eigenvectors of Σy. Are they the same? If not, what is the angle between them?
d. Repeat the simulation 500 times to generate 500 groups of 50 measurements of y. For each group of measurements, calculate the sample covariance matrix and the corresponding eigenvalues and eigenvectors. Calculate the angle between the eigenvector corresponding to the largest eigenvalue of the sample covariance matrix and the first column of Γ. Are all the angles smaller than the boundary obtained in part (d) of Exercise 3? What does this tell us?
5. Consider the linear system of Exercise 3. We have 25 measurements as follows:

y1         y2         y3         y4
-0.0639    0.3446     0.9083     -0.1253
-0.2331    -0.0290    0.0641     -0.1588
0.1299     0.1798     0.3354     0.0680
0.0275     0.5061     1.1540     -0.0133
0.0175     -0.1283    -0.2424    -0.0019
0.0170     -0.0223    -0.1362    0.0172
-0.0877    -0.5406    -1.2689    0.0020
-0.2313    -0.7883    -1.6845    -0.0955
-0.1568    -0.2938    -0.8079    0.0197
-0.0224    0.1175     0.2147     -0.0470
-0.1723    -0.2276    -0.6414    -0.0974
-0.2138    -0.5767    -1.4479    -0.0290
0.2423     0.4942     1.1935     0.0888
-0.0442    -0.3487    -0.7012    -0.0633
-0.1383    -0.0514    -0.0922    -0.0786
-0.0047    -0.0128    0.0138     0.0345
-0.1451    -0.1767    -0.3289    -0.0699
-0.1242    0.1391     0.2344     -0.0004
-0.0482    -0.0740    -0.3015    0.0183
-0.0103    -0.7029    -1.5377    0.0785
-0.0584    -0.4683    -0.9786    -0.0371
0.0914     0.1206     0.2594     0.0220
-0.2210    -0.5633    -1.3537    -0.0889
-0.0766    -0.1952    -0.3854    -0.0073
-0.0964    -0.4563    -0.9649    -0.0745

a. Which fault occurs in the system based on these measurements?
b. What is the estimated fault magnitude?
6. Discussion questions:
a. What is the motivation for studying variation source identification? What is the objective?
b. Why is there a variation pattern in the assembly or machining problem?
c. What are the single fault and multiple faults discussed in the chapter?
d. Can you find a variation pattern for a single fault with a given SoV model (i.e., state space model)?
e. Why do we need to discuss the “boundary” for pattern matching problems? How do we find the “boundary” for a given variation pattern?
f. What are the challenges for variation pattern matching in multiple-fault problems?
References
1. Ceglarek, D. and Shi, J., Fixture failure diagnosis for autobody assembly using pattern recognition, ASME Journal of Engineering for Industry, 118, 55, 1996.
2. Rong, Q., Ceglarek, D., and Shi, J., Dimensional fault diagnosis for compliant beam structure assemblies, ASME Journal of Manufacturing Science and Engineering, 122, 773, 2000.
3. Ding, Y., Ceglarek, D., and Shi, J., Fault diagnosis of multistage manufacturing processes by using state space approach, ASME Transactions, Journal of Manufacturing Science and Engineering, 124, 313, 2002.
4. Li, Z., Zhou, S., and Ding, Y., Pattern matching for root cause identification of manufacturing processes with consideration of general structured noise, IIE Transactions on Quality and Reliability Engineering, accepted 2005.
5. Golub, G.H. and Van Loan, C.F., Matrix Computations, 3rd ed., The Johns Hopkins University Press, Baltimore, MD, 1996.
6. Muirhead, R., Aspects of Multivariate Statistical Theory, John Wiley & Sons, New York, 1982.
7. Taha, H.A., Operations Research, An Introduction, 5th ed., Prentice-Hall, NJ, 1992.
8. Li, Z. and Zhou, S., Robust method of multiple variation sources identification in manufacturing processes for quality improvement, ASME Transactions, Journal of Manufacturing Science and Engineering, 128(1), 326–336, 2006.
11
Estimation-Based Diagnosis*
As discussed earlier, there are two major approaches to fault diagnosis: pattern-based recognition methods and estimation-based methods. Chapter 10 focused on the former. This chapter discusses the latter. An estimation-based diagnosis starts with the linear diagnostic model (such as Equation 10.1) and then estimates the variance components for each potential variation source (or random effect) based on the measurement data sample. Knowledge of the variance components, in conjunction with subsequent statistical testing, allows us to assess whether each variation source is present in the current sample and, if present, the severity of the source.
Two statistical methods are commonly used in estimating variance components: the least-squares (LS) method (and its variants) and the maximum likelihood (ML) method. In engineering applications, the least-squares methods are more prevalent, perhaps because their ease of use and quick computation make them more suitable for in-line applications. This chapter presents a few variants of LS estimators of variance components. Their interrelationship will be revealed and their performance will be compared. Some convenient tools will be developed to guide the appropriate use of these variance estimators under specific circumstances.
This chapter is organized as follows. Section 11.1 presents several variants of LS estimators for variance components. Section 11.2 discusses the interrelationship among variance estimators. Section 11.3 compares their performances. Section 11.4 summarizes the chapter.
11.1 LS ESTIMATORS FOR VARIANCE COMPONENTS
For an estimation-based diagnosis method, the first step is to establish a linear diagnostic model that links product measurements to process variation sources. Equation 10.1 serves this purpose. In this chapter, for simplicity, we will assume that the effect of background noise, Ψ ⋅ w, is negligible. Then, writing $\gamma_i$ for the ith column of Γ and $u_i(t)$ for the ith element of u(t), Equation 10.1 becomes
\[
y(t) = \Gamma u(t) + v(t) = \sum_{i=1}^{P} \gamma_i u_i(t) + v(t), \quad t = 1, 2, \ldots, n \tag{11.1}
\]
where t is an observation index and n is the sample size. y, u, and v are the aggregated vectors for the measurements, the inputs of variation sources, and the sensor noise, respectively. Following the notations in Part I, we know that y and v have a dimension of M × 1 and u has a dimension of P × 1. The following assumptions are commonly used:
(A1) The underlying distributions of u and v are normal.
(A2) Noise vector v has zero mean, is independent of u, and has the variance-covariance matrix $\sigma_v^2 I_M$ ($I_M$ is an M × M identity matrix), where $\sigma_v^2$ is the sensor noise variance.
(A3) The P variation sources are independent such that u has a diagonal variance-covariance matrix $\Sigma_u = \operatorname{diag}\{\sigma_1^2, \sigma_2^2, \ldots, \sigma_P^2\}$, where $\sigma_i^2$ is the variance of the ith input in u. It is further assumed that u has zero mean because it represents the deviation from the designed nominal position.
(A4) Unless otherwise noted, we assume that the sample mean $\bar{y} = n^{-1} \sum_{t=1}^{n} y(t)$ has been subtracted from the data so that the resulting sample {y(t): t = 1, 2, …, n} can be taken to be zero mean.
* Part of chapter material is based on Reference 2 (pp. 200–210) and Reference 13 (pp. 69–79).
The diagnostic objective in this chapter is to estimate the variance components $\{\sigma_1^2, \sigma_2^2, \ldots, \sigma_P^2, \sigma_v^2\}$ for each of the P potential variation sources (or random effects) and for sensor noise, based on the data sample {y(t): t = 1, 2, …, n}. The estimation of variance components is more significant than that of mean components because variance is more problematic than mean shifts in dimensional quality control. A sustained, consistent deviation from nominal (i.e., a mean shift) can often be compensated relatively easily by process engineers via shimming and other adjustments. In contrast, variance is much more difficult to compensate and requires either some form of online, constant feedback control or the removal of the variation root causes.
The LS method can clearly be applied to the linear model in Equation 11.1 to estimate the variance components. However, there are different ways of applying the LS method.
11.1.1 DEVIATION LS ESTIMATOR
An LS estimator was developed by Apley and Shi [1]. The basic idea is as follows: one first estimates the random deviations $\{\hat{u}(t)\}_{t=1}^{n}$, where the hat “^” denotes an estimated quantity. The sample variances of the elements in $\hat{u}$ are then used as the estimates of the variance components of u. The detailed procedure is as follows:
DP1. Estimate
\[
\hat{u}(t) = (\Gamma^T \Gamma)^{-1} \Gamma^T y(t);
\]
DP2. Estimate
\[
\hat{\sigma}_v^2 = \frac{1}{(n-1)(M-P)} \sum_{t=1}^{n} \hat{v}(t)^T \hat{v}(t),
\]
where $\hat{v}(t) = y(t) - \Gamma \hat{u}(t)$;
DP3. Estimate the variance components of u:
\[
\hat{\sigma}_i^2 = \frac{1}{n-1} \sum_{t=1}^{n} \hat{u}_i(t)^2 - \hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1}_{i,i}
\]
for i = 1, …, P, where $\hat{u}_i(t)$ represents the ith element of $\hat{u}(t)$ and $(\Gamma^T \Gamma)^{-1}_{i,i}$ is the (i,i)th element of $(\Gamma^T \Gamma)^{-1}$.
In the preceding procedure, the quantity $\hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1}_{i,i}$ is subtracted to eliminate the bias due to measurement noise. Because the deviations $\{\hat{u}(t)\}_{t=1}^{n}$ are directly estimated via a least-squares method, this estimator will be referred to as a deviation LS estimator.
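Steps DP1 to DP3 translate almost line by line into array operations. The following is a minimal sketch under the assumptions A1 to A4; the function name `deviation_ls`, the 6 × 2 matrix `Gamma`, and the simulated variances are hypothetical values chosen only to exercise the formulas.

```python
import numpy as np

def deviation_ls(Y, Gamma):
    """Deviation LS estimator (DP1-DP3).

    Y     : n x M array, row t is y(t)^T, already mean-centered (assumption A4)
    Gamma : M x P matrix with full column rank and M > P
    Returns (sigma2_u, sigma2_v): P source variances and the noise variance.
    """
    n, M = Y.shape
    P = Gamma.shape[1]
    GtG_inv = np.linalg.inv(Gamma.T @ Gamma)
    Gamma_pinv = GtG_inv @ Gamma.T
    U_hat = Y @ Gamma_pinv.T                       # DP1: row t is u_hat(t)^T
    V_hat = Y - U_hat @ Gamma.T                    # residuals v_hat(t)
    sigma2_v = np.sum(V_hat**2) / ((n - 1) * (M - P))             # DP2
    sigma2_u = (np.sum(U_hat**2, axis=0) / (n - 1)
                - sigma2_v * np.diag(GtG_inv))                    # DP3
    return sigma2_u, sigma2_v

# Simulated check with assumed values.
rng = np.random.default_rng(1)
Gamma = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                  [1.0, -1.0], [0.5, 0.2], [0.1, 0.7]])
true_u, true_v = np.array([1.5, 0.3]), 0.08
n = 20000
U = rng.normal(size=(n, 2)) * np.sqrt(true_u)
V = rng.normal(scale=np.sqrt(true_v), size=(n, 6))
Y = U @ Gamma.T + V
Y = Y - Y.mean(axis=0)                             # subtract the sample mean (A4)

est_u, est_v = deviation_ls(Y, Gamma)
print(est_u, est_v)
```

Because the bias correction in DP3 is a subtraction, the estimate of a small variance component can come out slightly negative in finite samples; how to treat such values is left to the subsequent statistical testing.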
11.1.2 VARIATION LS ESTIMATOR
An alternative approach to estimating the variance components was developed in Ding et al. [2]. Their procedure is as follows. Taking the covariance matrix of both sides of Equation 11.1 gives:
\[
\Sigma_y = \Gamma \Sigma_u \Gamma^T + \sigma_v^2 I \tag{11.2}
\]
Given the fact that $\Sigma_u$ is a diagonal matrix, Equation 11.2 can be written as
\[
\Sigma_y = \sum_{i=1}^{P} (\gamma_i \gamma_i^T) \sigma_i^2 + \sigma_v^2 I \tag{11.3}
\]
where $\gamma_i$ is the ith column vector of Γ. In practice, the population covariance $\Sigma_y$ is estimated by the sample covariance matrix
\[
S_y = \frac{1}{n-1} \sum_{t=1}^{n} y(t) y(t)^T = \Sigma_y + E \tag{11.4}
\]
where E denotes the estimation error matrix. If we define $V_i \equiv \gamma_i \gamma_i^T$ for i = 1, …, P, $V_{P+1} \equiv I$, and $\sigma_{P+1}^2 \equiv \sigma_v^2$, Equation 11.3 and Equation 11.4 become:
\[
S_y = \sum_{i=1}^{P+1} V_i \sigma_i^2 + E \tag{11.5}
\]
Based on this, one approach to estimating the variance components is to choose $\hat{\sigma}^2$ to minimize the sum of the squares of the elements of the error matrix $S_y - \sum_{i=1}^{P+1} V_i \hat{\sigma}_i^2$. For square matrices A and B of compatible dimension, define the matrix inner product $\langle A, B \rangle = \operatorname{tr}(A^T B)$ and the associated matrix norm $\|A\|^2 = \langle A, A \rangle$, which is exactly the sum of the squares of the elements of A. Using standard results
for least-squares estimation in inner-product spaces [3], the estimates in this case must satisfy the so-called normal equations:
\[
G \hat{\sigma}^2 = b \tag{11.6}
\]
where the notation is as follows: G is the Gram matrix, defined so that its ith row, jth column element is $\langle V_i, V_j \rangle$ for 1 ≤ i, j ≤ P + 1, and the (P + 1)-length column vector b is defined so that its ith element is $\langle V_i, S_y \rangle$. For the particular inner product defined earlier, it can be verified that
\[
G = \begin{bmatrix}
(\gamma_1^T \gamma_1)^2 & \cdots & (\gamma_1^T \gamma_P)^2 & \gamma_1^T \gamma_1 \\
\vdots & \ddots & \vdots & \vdots \\
(\gamma_P^T \gamma_1)^2 & \cdots & (\gamma_P^T \gamma_P)^2 & \gamma_P^T \gamma_P \\
\gamma_1^T \gamma_1 & \cdots & \gamma_P^T \gamma_P & M
\end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} \gamma_1^T S_y \gamma_1 \\ \vdots \\ \gamma_P^T S_y \gamma_P \\ \operatorname{tr}(S_y) \end{bmatrix} \tag{11.7}
\]
When G is nonsingular, or equivalently when the matrices $\gamma_1 \gamma_1^T, \ldots, \gamma_P \gamma_P^T$ and I are linearly independent, $\hat{\sigma}^2 = G^{-1} b$ is the unique solution to Equation 11.6. We refer to this approach as the variation LS estimator. This estimator is also known as the LS fit estimator in D’Assumpcao [4] and Bohme [5].
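The normal equations in Equation 11.6 with G and b from Equation 11.7 can be assembled directly from Γ and Sy. The sketch below is illustrative only: `variation_ls` is a hypothetical name and the 6 × 2 matrix `Gamma` is an assumed example. It is checked on the population covariance (E = 0), for which the estimator must return the true variance components exactly.

```python
import numpy as np

def variation_ls(S_y, Gamma):
    """Variation LS estimator: solve G sigma2 = b (Equations 11.6 and 11.7).

    Returns a vector of length P + 1: the P source variances followed by
    the sensor noise variance.
    """
    M, P = Gamma.shape
    C = Gamma.T @ Gamma                    # C[i, j] = gamma_i^T gamma_j
    G = np.empty((P + 1, P + 1))
    G[:P, :P] = C**2                       # (gamma_i^T gamma_j)^2
    G[:P, P] = np.diag(C)                  # gamma_i^T gamma_i
    G[P, :P] = np.diag(C)
    G[P, P] = M
    b = np.append(np.diag(Gamma.T @ S_y @ Gamma), np.trace(S_y))
    return np.linalg.solve(G, b)

# Assumed example: with the population covariance the estimate is exact.
Gamma = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                  [1.0, -1.0], [0.5, 0.2], [0.1, 0.7]])
sigma2_true = np.array([1.5, 0.3, 0.08])   # sigma_1^2, sigma_2^2, sigma_v^2
Sigma_y = Gamma @ np.diag(sigma2_true[:2]) @ Gamma.T + sigma2_true[2] * np.eye(6)

sigma2_hat = variation_ls(Sigma_y, Gamma)
print(sigma2_hat)
```

Note that, unlike the deviation LS estimator, this construction only requires G to be nonsingular; Γ itself is never inverted.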
11.1.3 OTHER VARIATION LS ESTIMATORS
One may wonder about the variance estimators that were developed in variance component analysis (VCA) theory [6,7] and other application domains (such as signal processing). We will present a few other variance estimators here.
• MLE
With the normality assumption, the probability density function (PDF) of y(t) is
\[
f(y(t)) = (2\pi)^{-\frac{M}{2}} \, |\Sigma_y|^{-\frac{1}{2}} \, e^{-\frac{1}{2} y(t)^T \Sigma_y^{-1} y(t)}.
\]
Denote the complete observations as $y^T = [y(1)^T\ y(2)^T\ \cdots\ y(n)^T]_{nM \times 1}$; the PDF of y is $f(y) = \prod_{t=1}^{n} f(y(t))$. Then the log-likelihood function of $\Sigma_y$ is
\[
L(\Sigma_y \mid y) = \ln f(y) = -\frac{nM}{2} \ln(2\pi) - \frac{n}{2} \ln |\Sigma_y| - \frac{n}{2} \operatorname{tr}(\Sigma_y^{-1} S_y) \tag{11.8}
\]
Based on this log-likelihood function, Anderson [8,9] derived an MLE as the solution to the following nonlinear equation (iterative numerical algorithms were also presented in the same paper to solve the nonlinear equation),
\[
\left\{ \operatorname{tr}(\hat{\Sigma}_y^{-1} V_i \hat{\Sigma}_y^{-1} V_j) \right\}_{i,j=1}^{P+1} \hat{\sigma}^2 = \left\{ \operatorname{tr}(\hat{\Sigma}_y^{-1} V_i \hat{\Sigma}_y^{-1} S_y) \right\}_{i=1}^{P+1} \tag{11.9}
\]
where $\hat{\Sigma}_y = \sum_{j=1}^{P+1} \hat{\sigma}_j^2 V_j$, $\{\cdot\}_{i,j=1}^{P+1}$ is a (P + 1) × (P + 1) matrix, and $\{\cdot\}_{i=1}^{P+1}$ is a (P + 1) × 1 column vector.
• MINQUE and MINQUE0
Rao and Kleffe [7] presented the MINQUE (the last letter “E” stands for either “estimator” or “estimation” in this chapter, depending on the context) theory to estimate variance components in a linear mixed-effects model. According to Searle et al. [6], a MINQUE can be solved from the ML Equation 11.9 by using a preassigned covariance value $\Sigma_0$, i.e., the equation
\[
\left\{ \operatorname{tr}(\Sigma_0^{-1} V_i \Sigma_0^{-1} V_j) \right\}_{i,j=1}^{P+1} \hat{\sigma}^2 = \left\{ \operatorname{tr}(\Sigma_0^{-1} V_i \Sigma_0^{-1} S_y) \right\}_{i=1}^{P+1} \tag{11.10}
\]
yields a MINQUE for the given $\Sigma_0$. The solution of MINQUE does not require iteration in computation. However, it only gives estimates that are locally “optimal” (in terms of the norm defined in the MINQUE theory) in the neighborhood of $\Sigma_0$. A special case of MINQUE is to assign $\Sigma_0 = I_M$, that is, to assign $\{\sigma_j^2\}_{j=1}^{P} = 0$ and $\sigma_v^2 = 1$. Then, Equation 11.10 becomes
\[
\left\{ \operatorname{tr}(V_i V_j) \right\}_{i,j=1}^{P+1} \hat{\sigma}^2 = \left\{ \operatorname{tr}(V_i S_y) \right\}_{i=1}^{P+1} \tag{11.11}
\]
This special case was called MINQUE0 or MINQUE(0) [6], which, according to MINQUE theory, is a locally “optimal” estimator in the neighborhood of $\Sigma_0 = I_M$.
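Equation 11.10 is a linear system once Σ0 is fixed, so a MINQUE needs only one solve. The sketch below builds that system for an arbitrary positive definite Σ0; setting Σ0 to the identity gives the MINQUE(0) case of Equation 11.11. The helper names and the example matrix are hypothetical. The last function re-solves the system with Σ0 replaced by the current estimate, which is one simple fixed-point scheme whose limit, if it exists, satisfies Equation 11.9; it is an assumption made for illustration, not necessarily the algorithm of Anderson [8,9], and it has no safeguard against an indefinite intermediate estimate.

```python
import numpy as np

def basis_matrices(Gamma):
    """V_i = gamma_i gamma_i^T for i = 1..P, followed by V_{P+1} = I_M."""
    M, P = Gamma.shape
    return [np.outer(Gamma[:, i], Gamma[:, i]) for i in range(P)] + [np.eye(M)]

def minque(S_y, V_list, Sigma0):
    """Solve Equation 11.10 for a preassigned positive definite Sigma0."""
    W = np.linalg.inv(Sigma0)
    WV = [W @ V for V in V_list]
    A = np.array([[np.trace(WVi @ WVj) for WVj in WV] for WVi in WV])
    rhs = np.array([np.trace(WVi @ W @ S_y) for WVi in WV])
    return np.linalg.solve(A, rhs)

def ml_fixed_point(S_y, V_list, n_iter=20):
    """Repeatedly re-solve Equation 11.10 with Sigma0 = current estimate.

    Illustrative only: assumes every intermediate estimate stays
    positive definite.
    """
    sigma2 = minque(S_y, V_list, np.eye(S_y.shape[0]))   # start from MINQUE(0)
    for _ in range(n_iter):
        Sigma_hat = sum(s * V for s, V in zip(sigma2, V_list))
        sigma2 = minque(S_y, V_list, Sigma_hat)
    return sigma2

# Assumed example using the population covariance in place of S_y.
Gamma = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                  [1.0, -1.0], [0.5, 0.2], [0.1, 0.7]])
sigma2_true = np.array([1.5, 0.3, 0.08])
V_list = basis_matrices(Gamma)
Sigma_y = sum(s * V for s, V in zip(sigma2_true, V_list))

minque0 = minque(Sigma_y, V_list, np.eye(6))
minque_other = minque(Sigma_y, V_list, np.diag(np.arange(1.0, 7.0)))
ml_est = ml_fixed_point(Sigma_y, V_list)
print(minque0, minque_other, ml_est)
```

When the exact covariance is supplied, every choice of Σ0 returns the same (true) variance components; with a sample covariance the estimates differ, which is why the choice of Σ0 matters in practice.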
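• Estimator in Reference 10
Ding et al. [10] define an estimator as follows. Denote vec(⋅) as an operator that stacks the columns of a matrix on top of one another, i.e., $\operatorname{vec}(S) = [s_{11}\ s_{21}\ s_{12}\ s_{22}]^T$ for a 2 × 2 matrix S. Using this operator, Equation 11.2 can be written as
\[
\operatorname{vec}(\Sigma_y) = \begin{bmatrix} \pi(\Gamma) & \operatorname{vec}(I_M) \end{bmatrix} \sigma^2 \tag{11.12}
\]
where π(⋅) is a matrix transform defined as
\[
\pi(\Gamma) = \begin{bmatrix} (\gamma^1 * \gamma^1)^T & \cdots & (\gamma^1 * \gamma^M)^T & \cdots & (\gamma^M * \gamma^1)^T & \cdots & (\gamma^M * \gamma^M)^T \end{bmatrix}^T \tag{11.13}
\]
where $\gamma^j$ is the jth row vector of Γ for j = 1, …, M (written with a superscript to distinguish it from the column vector $\gamma_j$), and * denotes the element-wise product. Using $S_y$ to replace $\Sigma_y$ in Equation 11.12, we obtain $\hat{\sigma}^2$ as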
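\[
\begin{aligned}
\hat{\sigma}^2 &= \left( \begin{bmatrix} \pi(\Gamma) & \operatorname{vec}(I_M) \end{bmatrix}^T \begin{bmatrix} \pi(\Gamma) & \operatorname{vec}(I_M) \end{bmatrix} \right)^{-1} \cdot \begin{bmatrix} \pi(\Gamma) & \operatorname{vec}(I_M) \end{bmatrix}^T \cdot \operatorname{vec}(S_y) \\
&= (\Pi^T \Pi)^{-1} \Pi^T \cdot \operatorname{vec}(S_y) = \Pi^+ \operatorname{vec}(S_y)
\end{aligned} \tag{11.14}
\]
where $\Pi \equiv \begin{bmatrix} \pi(\Gamma) & \operatorname{vec}(I_M) \end{bmatrix}$ and $\Pi^+ \equiv (\Pi^T \Pi)^{-1} \Pi^T$.
• Estimator in Reference 11
For a full rank Γ, i.e., $\Gamma^T \Gamma$ is nonsingular, Stoica and Nehorai [11] defined an estimator as
\[
\begin{aligned}
\hat{\Sigma}_u &= \Gamma^+ S_y \Gamma^{+T} - \hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1} \\
\hat{\sigma}_v^2 &= \operatorname{tr}\left( (I_M - \Gamma \Gamma^+) S_y \right) / (M - P)
\end{aligned} \tag{11.15}
\]
where $\Gamma^+ \equiv (\Gamma^T \Gamma)^{-1} \Gamma^T$. When the random variables in u are known to be independent, as assumed in A3, the heuristic estimator of $\{\sigma_i^2\}_{i=1}^{P}$ uses the diagonal elements of $\hat{\Sigma}_u$.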
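Both estimators above can be sketched in a few lines. The code below is illustrative and rests on stated assumptions: π(Γ) is built by taking element-wise products of all pairs of rows of Γ, vec(⋅) stacks columns (Fortran order in NumPy), and the function names and the example matrix are hypothetical. Both functions are checked on the population covariance, where they must reproduce the true variance components.

```python
import numpy as np

def vec(S):
    """Stack the columns of S on top of one another."""
    return S.flatten(order="F")

def pi_matrix(Gamma):
    """M^2 x P matrix whose rows are element-wise products of row pairs of Gamma."""
    M, P = Gamma.shape
    return (Gamma[:, None, :] * Gamma[None, :, :]).reshape(M * M, P)

def vec_ls(S_y, Gamma):
    """Equation 11.14: least-squares solution of vec(S_y) = Pi sigma2."""
    M = Gamma.shape[0]
    Pi = np.column_stack([pi_matrix(Gamma), vec(np.eye(M))])
    sigma2, *_ = np.linalg.lstsq(Pi, vec(S_y), rcond=None)
    return sigma2                                   # [sigma_1^2..sigma_P^2, sigma_v^2]

def stoica_nehorai(S_y, Gamma):
    """Equation 11.15: returns (diagonal of Sigma_u_hat, sigma_v^2_hat)."""
    M, P = Gamma.shape
    GtG_inv = np.linalg.inv(Gamma.T @ Gamma)
    Gamma_pinv = GtG_inv @ Gamma.T
    sigma2_v = np.trace((np.eye(M) - Gamma @ Gamma_pinv) @ S_y) / (M - P)
    Sigma_u = Gamma_pinv @ S_y @ Gamma_pinv.T - sigma2_v * GtG_inv
    return np.diag(Sigma_u), sigma2_v

# Assumed example using the population covariance in place of S_y.
Gamma = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                  [1.0, -1.0], [0.5, 0.2], [0.1, 0.7]])
sigma2_true = np.array([1.5, 0.3, 0.08])
Sigma_y = Gamma @ np.diag(sigma2_true[:2]) @ Gamma.T + sigma2_true[2] * np.eye(6)

est_vec = vec_ls(Sigma_y, Gamma)
est_u, est_v = stoica_nehorai(Sigma_y, Gamma)
print(est_vec, est_u, est_v)
```

`np.linalg.lstsq` is used instead of forming $(\Pi^T \Pi)^{-1}$ explicitly; the two agree whenever Π has full column rank.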
11.2 RELATIONSHIP AMONG VARIANCE ESTIMATORS
Although the aforementioned variance estimators appear to be quite different, some are intrinsically equivalent, as will be shown in the following. As a result, these variance estimators can be grouped into three distinct categories.
First, it will be shown that the variation LS estimator, the estimator in Ding et al. [10], and the MINQUE0 are identical, i.e., that $\Pi^T \Pi = \{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P+1} = G$ and $\Pi^T \cdot \operatorname{vec}(S_y) = \{\operatorname{tr}(V_i S_y)\}_{i=1}^{P+1} = b$, as declared by the following lemma. The term “variation LS estimator” is used hereafter to refer to those estimators.
Lemma 11.1: $\Pi^T \Pi = \{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P+1} = G$ and $\Pi^T \cdot \operatorname{vec}(S_y) = \{\operatorname{tr}(V_i S_y)\}_{i=1}^{P+1} = b$.
Proof: Using the definition of Π in Equation 11.14,
\[
\Pi^T \Pi = \begin{bmatrix} \pi(\Gamma)^T \pi(\Gamma) & \pi(\Gamma)^T \operatorname{vec}(I_M) \\ \operatorname{vec}(I_M)^T \pi(\Gamma) & \operatorname{vec}(I_M)^T \operatorname{vec}(I_M) \end{bmatrix}
= \begin{bmatrix} \pi(\Gamma)^T \pi(\Gamma) & \pi(\Gamma)^T \operatorname{vec}(I_M) \\ \operatorname{vec}(I_M)^T \pi(\Gamma) & M \end{bmatrix} \tag{11.16}
\]
However, as $V_{P+1} = I_M$, $\{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P+1}$ can be expressed as
\[
\{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P+1} = \begin{bmatrix} \{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P} & \{\operatorname{tr}(V_i)\}_{i=1}^{P} \\ \left( \{\operatorname{tr}(V_i)\}_{i=1}^{P} \right)^T & M \end{bmatrix} \tag{11.17}
\]
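Furthermore, it can be shown that $\pi(\Gamma)^T \pi(\Gamma) = \{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P}$ and $\pi(\Gamma)^T \operatorname{vec}(I_M) = \{\operatorname{tr}(V_i)\}_{i=1}^{P}$. Recalling that $\operatorname{tr}(AB) = \operatorname{vec}(A)^T \operatorname{vec}(B)$ for any symmetric matrices A and B, the (i,j)th element in $\{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P}$ is $(\operatorname{vec}(\gamma_i \gamma_i^T))^T \operatorname{vec}(\gamma_j \gamma_j^T)$. In fact, $\operatorname{vec}(\gamma_i \gamma_i^T)$ is the ith column vector in π(Γ), leading to the conclusion that $\pi(\Gamma)^T \pi(\Gamma) = \{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P}$.
Following the same procedure, $\{\operatorname{tr}(V_i S_y)\}_{i=1}^{P} = \{(\operatorname{vec}(\gamma_i \gamma_i^T))^T \cdot \operatorname{vec}(S_y)\}_{i=1}^{P} = \pi(\Gamma)^T \operatorname{vec}(S_y)$. Then, $\pi(\Gamma)^T \operatorname{vec}(I_M) = \{\operatorname{tr}(V_i)\}_{i=1}^{P}$ constitutes a special case when $S_y = I_M$. Moreover, $\Pi^T \cdot \operatorname{vec}(S_y) = \{\operatorname{tr}(V_i S_y)\}_{i=1}^{P+1}$ results, following a similar matrix partition as in Equation 11.16 and Equation 11.17. This proves the first equality in both expressions in the lemma.
To prove the second equality in each expression, note that:
\[
\begin{aligned}
\operatorname{tr}(V_i V_j) &= \operatorname{tr}(\gamma_i \gamma_i^T \gamma_j \gamma_j^T) = \operatorname{tr}(\gamma_i^T \gamma_j \gamma_j^T \gamma_i) = (\gamma_i^T \gamma_j)^2 \\
\operatorname{tr}(V_i) &= \operatorname{tr}(\gamma_i \gamma_i^T) = \operatorname{tr}(\gamma_i^T \gamma_i) = \gamma_i^T \gamma_i \\
\operatorname{tr}(V_i S_y) &= \operatorname{tr}(\gamma_i \gamma_i^T S_y) = \operatorname{tr}(\gamma_i^T S_y \gamma_i) = \gamma_i^T S_y \gamma_i,
\end{aligned}
\]
and $V_{P+1} = I_M$. It is then obvious that $\{\operatorname{tr}(V_i V_j)\}_{i,j=1}^{P+1} = G$ and $\{\operatorname{tr}(V_i S_y)\}_{i=1}^{P+1} = b$.
Q.E.D.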
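The identities of Lemma 11.1 are easy to confirm numerically. The following check is a sketch under assumptions: Γ and the symmetric matrix standing in for Sy are random, and π(Γ) is built from element-wise products of row pairs of Γ with a column-stacking vec(⋅). It is a sanity check, not a substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(2)
M, P = 5, 3
Gamma = rng.normal(size=(M, P))            # arbitrary example matrix
A = rng.normal(size=(M, M))
S_y = A @ A.T                              # any symmetric matrix will do

vec = lambda X: X.flatten(order="F")       # column-stacking operator

# Pi = [pi(Gamma)  vec(I_M)], with pi(Gamma) of size M^2 x P
pi_Gamma = (Gamma[:, None, :] * Gamma[None, :, :]).reshape(M * M, P)
Pi = np.column_stack([pi_Gamma, vec(np.eye(M))])

# G and b as written in Equation 11.7
C = Gamma.T @ Gamma
G = np.block([[C**2, np.diag(C)[:, None]],
              [np.diag(C)[None, :], np.array([[float(M)]])]])
b = np.append(np.diag(Gamma.T @ S_y @ Gamma), np.trace(S_y))

# Trace form used by MINQUE(0), Equation 11.11
V = [np.outer(Gamma[:, i], Gamma[:, i]) for i in range(P)] + [np.eye(M)]
T = np.array([[np.trace(Vi @ Vj) for Vj in V] for Vi in V])
t = np.array([np.trace(Vi @ S_y) for Vi in V])

print(np.allclose(Pi.T @ Pi, G), np.allclose(T, G),
      np.allclose(Pi.T @ vec(S_y), b), np.allclose(t, b))
```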
Lemma 11.2: The deviation LS estimator in DP1 to DP3 comprises the diagonal elements of the estimator proposed by Stoica and Nehorai [11] in Equation 11.15. This estimator is hereafter referred to as the deviation LS estimator.
Proof: Substituting $\hat{v}(t) = y(t) - \Gamma \hat{u}(t)$ and $\hat{u}(t) = \Gamma^+ y(t)$ into $\hat{\sigma}_v^2$ in DP2 yields
\[
\begin{aligned}
\hat{\sigma}_v^2 &= \frac{1}{(n-1)(M-P)} \sum_{t=1}^{n} y^T(t) (I_M - \Gamma \Gamma^+)^T (I_M - \Gamma \Gamma^+) y(t) \\
&= \frac{1}{(n-1)(M-P)} \operatorname{tr}\left( (I_M - \Gamma \Gamma^+) \cdot \sum_{t=1}^{n} y(t) y^T(t) \right) = \frac{1}{M-P} \cdot \operatorname{tr}\left( (I_M - \Gamma \Gamma^+) \cdot S_y \right)
\end{aligned} \tag{11.18}
\]
which is the same as $\hat{\sigma}_v^2$ in Equation 11.15. Furthermore, $\hat{u}_j(t) = \Gamma_j^+ y(t)$, where $\Gamma_j^+$ is the jth row vector of $\Gamma^+$. $\hat{\sigma}_j^2$ can then be expressed as:
\[
\begin{aligned}
\hat{\sigma}_j^2 &= \frac{1}{n-1} \sum_{t=1}^{n} \hat{u}_j^2(t) - \hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1}_{j,j} \\
&= \frac{1}{n-1} \sum_{t=1}^{n} (\Gamma_j^+ y(t)) (\Gamma_j^+ y(t))^T - \hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1}_{j,j} \\
&= \Gamma_j^+ \cdot \left[ \frac{1}{n-1} \sum_{t=1}^{n} y(t) (y(t))^T \right] \cdot (\Gamma_j^+)^T - \hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1}_{j,j} \\
&= \Gamma_j^+ \cdot S_y \cdot (\Gamma_j^+)^T - \hat{\sigma}_v^2 (\Gamma^T \Gamma)^{-1}_{j,j}
\end{aligned} \tag{11.19}
\]
It is clear that the preceding result is the (j,j)th diagonal element of $\hat{\Sigma}_u$ in Equation 11.15. Thus, all $\{\hat{\sigma}_j^2\}_{j=1}^{P}$ in DP3 of the deviation LS estimator constitute the diagonal elements of $\hat{\Sigma}_u$ in Stoica and Nehorai [11].
Q.E.D.
A matrix expression of the deviation LS estimator is introduced as follows and will be used subsequently to analyze estimation variance. Let $e_j$ be the jth column vector of $I_P$. The first P elements in the deviation LS estimator can then be expressed as:
\[
\hat{\sigma}_{1,P}^2 \equiv \begin{bmatrix} \hat{\sigma}_1^2 & \cdots & \hat{\sigma}_P^2 \end{bmatrix}^T = Q \cdot \left[ (\Gamma^+ \otimes \Gamma^+) \operatorname{vec}(S_y) - \operatorname{vec}\left( (\Gamma^T \Gamma)^{-1} \right) \hat{\sigma}_v^2 \right] \tag{11.20}
\]
where Q is defined as
\[
Q \equiv \begin{bmatrix} (e_1 \otimes e_1)^T \\ \vdots \\ (e_P \otimes e_P)^T \end{bmatrix} \tag{11.21}
\]
where $\otimes$ represents a Kronecker matrix product.

Among the aforementioned variance estimators, the MLE is different from the others. Collectively, according to the previous two lemmas, we have three distinct types of estimators: the deviation LS estimator, the variation LS estimator, and the ML estimator. These variance estimators require different existence conditions. The existence condition for the deviation LS estimator is that $\Gamma^T\Gamma$ be full rank and M ≥ P + 1, where P + 1 is the number of independent variance components, including sensor noise variance. The existence condition for the variation LS estimator is that $\Pi^T\Pi$, G, or $\{\mathrm{tr}(V_iV_j)\}_{i,j=1}^{P+1}$ be full rank.

The existence condition of a variance estimator is related to the diagnosability condition, as discussed in Chapter 9. In general, the diagnosability condition characterizes whether or not the observations of y contain sufficient information to ensure that the variance components can be estimated. This condition is independent of specific estimation algorithms but should be required as a necessary condition for all variance estimators. Individual variance estimators may, however, require stronger diagnosability conditions.

Theorem 9.1 defines the diagnosability condition of variance components for the linear diagnostic model in Equation 11.1 as the requirement that the H matrix in Equation 9.11, which is the same as G in Equation 11.7 after neglecting Ψ, be full rank. In other words, the diagnosability condition is the same as the existence condition of a variation LS estimator. The conditions that $\Gamma^T\Gamma$ be full rank and M ≥ P + 1 are stronger conditions than G being full rank. It can be proved that the existence of a deviation LS estimator guarantees the existence of a variation LS estimator. The result is stated
in Lemma 11.3. However, the converse is not true; that is, G could be full rank even if $\Gamma^T\Gamma$ is singular.

Lemma 11.3: If $\Gamma^T\Gamma$ is full rank and M ≥ P + 1, G is full rank.

Proof: Suppose $\Gamma^T\Gamma$ is of full rank and M > P, but G is singular. Because G is the Gram matrix of $\{\gamma_1\gamma_1^T, \gamma_2\gamma_2^T, \ldots, \gamma_P\gamma_P^T, I\}$, its singularity implies that $\{\gamma_1\gamma_1^T, \gamma_2\gamma_2^T, \ldots, \gamma_P\gamma_P^T, I\}$ are linearly dependent. Thus, there exists a set of scalars $\{\alpha_1, \alpha_2, \ldots, \alpha_P, \alpha_{P+1}\}$, not all zero, such that $\gamma_1\gamma_1^T\alpha_1 + \gamma_2\gamma_2^T\alpha_2 + \cdots + \gamma_P\gamma_P^T\alpha_P = -I\alpha_{P+1}$. For this to hold, we must have $\alpha_{P+1} = 0$. Otherwise, $\mathrm{rank}(\alpha_{P+1}I) = M$, whereas the summation of matrices on the left-hand side can have at most rank P < M. It follows that $\gamma_1\gamma_1^T\alpha_1 + \gamma_2\gamma_2^T\alpha_2 + \cdots + \gamma_P\gamma_P^T\alpha_P = 0$, and at least one of the α's (say, $\alpha_i$) is nonzero. Postmultiplying the preceding equation by $\gamma_i$ gives $\gamma_1(\gamma_1^T\gamma_i\alpha_1) + \gamma_2(\gamma_2^T\gamma_i\alpha_2) + \cdots + \gamma_P(\gamma_P^T\gamma_i\alpha_P) = 0$. Because at least one of the coefficients ($\gamma_i^T\gamma_i\alpha_i$) is nonzero, this implies that the vectors $\{\gamma_1, \gamma_2, \ldots, \gamma_P\}$ are linearly dependent. Their Gram matrix $\Gamma^T\Gamma$ must therefore be singular, which contradicts the condition that $\Gamma^T\Gamma$ be full rank.
Q.E.D.

An MLE exists if $S_y$ is positive definite and $V_1, V_2, \ldots, V_{P+1}$ are linearly independent [7, p. 231]. The positive definiteness of the sample variance $S_y$ is usually satisfied in practice because independent sensor noise having nonzero variances exists. It is easy to verify that $V_1, V_2, \ldots, V_{P+1}$ are linearly independent if and only if G is full rank. Therefore, the existence condition for an MLE is the same as the diagnosability condition for Equation 11.1, and, likewise, the same as that of a variation LS estimator.

The deviation LS estimator and the variation LS estimator are actually equivalent in the special case that all columns of Γ are orthogonal, i.e., when $\gamma_i^T\gamma_j = 0$, $\forall i \neq j$. This obviously requires that $\Gamma^T\Gamma$ be full rank. This is stated as Proposition 11.1, which follows.
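Proposition 11.1: If M > P and $\gamma_i^T\gamma_j = 0$, $\forall i \neq j$, the variation LS estimator is the same as the deviation LS estimator.

Proof: Utilizing the fact that the columns in Γ are orthogonal to each other, i.e., $\gamma_i^T\gamma_j = 0$, $\forall i \neq j$, we can rewrite Equation 11.6 and Equation 11.7 as

$$\begin{bmatrix}(\gamma_1^T\gamma_1)^2 & & 0 & \gamma_1^T\gamma_1\\ & \ddots & & \vdots\\ 0 & & (\gamma_P^T\gamma_P)^2 & \gamma_P^T\gamma_P\\ \gamma_1^T\gamma_1 & \cdots & \gamma_P^T\gamma_P & M\end{bmatrix}\cdot\begin{bmatrix}\hat\sigma_1^2\\ \vdots\\ \hat\sigma_P^2\\ \hat\sigma_v^2\end{bmatrix} = \begin{bmatrix}\mathrm{tr}(\gamma_1\gamma_1^TS_y)\\ \vdots\\ \mathrm{tr}(\gamma_P\gamma_P^TS_y)\\ \mathrm{tr}(S_y)\end{bmatrix} \tag{11.22}$$

where those zeros result from the orthogonal condition $\gamma_i^T\gamma_j = 0$, $\forall i \neq j$. The preceding equation is equivalent to:

$$\begin{aligned}(\gamma_i^T\gamma_i)^2\cdot\hat\sigma_i^2 + \gamma_i^T\gamma_i\cdot\hat\sigma_v^2 &= \mathrm{tr}(\gamma_i\gamma_i^TS_y),\quad i = 1, 2, \ldots, P\\ \sum_{i=1}^{P}\gamma_i^T\gamma_i\cdot\hat\sigma_i^2 + M\cdot\hat\sigma_v^2 &= \mathrm{tr}(S_y)\end{aligned} \tag{11.23}$$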
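We can solve $\{\hat\sigma_i^2\}_{i=1}^{P}$ in terms of $\hat\sigma_v^2$ from the first equation and substitute it into the second equation. Then we have:

$$(M-P)\hat\sigma_v^2 = \mathrm{tr}(S_y) - \sum_{i=1}^{P}\frac{1}{\gamma_i^T\gamma_i}\mathrm{tr}\big(\gamma_i\gamma_i^TS_y\big) = \mathrm{tr}(S_y) - \mathrm{tr}\Big(\sum_{i=1}^{P}\frac{1}{\gamma_i^T\gamma_i}\gamma_i\gamma_i^T\cdot S_y\Big) \tag{11.24}$$

Note that $\Gamma^T\Gamma$ is a diagonal matrix with $\gamma_i^T\gamma_i$ as its (i,i)th element and $\sum_{i=1}^{P}\omega_i\gamma_i\gamma_i^T = \Gamma\Omega\Gamma^T$, where $\Omega = \mathrm{diag}\{\omega_1, \ldots, \omega_P\}$ and $\omega_i$, i = 1, …, P, is an arbitrary real number. Then we have:

$$\sum_{i=1}^{P}\frac{1}{\gamma_i^T\gamma_i}\gamma_i\gamma_i^T = \Gamma(\Gamma^T\Gamma)^{-1}\Gamma^T = \Gamma\Gamma^+ \tag{11.25}$$

Given all these results, we can write Equation 11.24 as

$$\hat\sigma_v^2 = \frac{1}{M-P}\mathrm{tr}\big((I-\Gamma\Gamma^+)S_y\big) \tag{11.26}$$

It can be further shown that this $\hat\sigma_v^2$ is the same as that in the deviation LS estimator. Substitute $\hat{v}(t) = y(t) - \Gamma\hat{u}(t)$ and $\hat{u}(t) = \Gamma^+y(t)$ in $\hat\sigma_v^2 = \big(\sum_{t=1}^{n}\hat{v}(t)^T\hat{v}(t)\big)/((n-1)(M-P))$. It yields:

$$\begin{aligned}\hat\sigma_v^2 &= \frac{1}{(n-1)(M-P)}\sum_{t=1}^{n}y(t)^T(I-\Gamma\Gamma^+)^T(I-\Gamma\Gamma^+)y(t)\\ &= \frac{1}{(n-1)(M-P)}\mathrm{tr}\Big\{\sum_{t=1}^{n}y(t)^T(I-\Gamma\Gamma^+)y(t)\Big\}\\ &= \frac{1}{M-P}\mathrm{tr}\Big\{(I-\Gamma\Gamma^+)\cdot\frac{1}{n-1}\sum_{t=1}^{n}y(t)y(t)^T\Big\}\\ &= \frac{1}{M-P}\cdot\mathrm{tr}\big((I-\Gamma\Gamma^+)\cdot S_y\big)\end{aligned} \tag{11.27}$$

After obtaining the solution of $\hat\sigma_v^2$, we can substitute it in Equation 11.23 to solve for $\hat\sigma_i^2$ as

$$\hat\sigma_i^2 = \frac{1}{(\gamma_i^T\gamma_i)^2}\mathrm{tr}(\gamma_i\gamma_i^TS_y) - \hat\sigma_v^2\cdot\frac{1}{\gamma_i^T\gamma_i},\quad i = 1, 2, \ldots, P \tag{11.28}$$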
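Remember that $\mathrm{tr}(\gamma_i\gamma_i^TS_y) = \gamma_i^TS_y\gamma_i$ and $1/\gamma_i^T\gamma_i$ is the (i,i)th element of $(\Gamma^T\Gamma)^{-1}$. Then $\gamma_i^T/\gamma_i^T\gamma_i$ is the ith row of $(\Gamma^T\Gamma)^{-1}\Gamma^T$. We can further write Equation 11.28 as

$$\hat\sigma_i^2 = \frac{\gamma_i^T}{\gamma_i^T\gamma_i}S_y\frac{\gamma_i}{\gamma_i^T\gamma_i} - \hat\sigma_v^2\cdot\frac{1}{\gamma_i^T\gamma_i} = \Gamma_i^+S_y(\Gamma_i^+)^T - \hat\sigma_v^2(\Gamma^T\Gamma)^{-1}_{i,i},\quad i = 1, 2, \ldots, P \tag{11.29}$$

where $\Gamma_i^+$ is the ith row of $(\Gamma^T\Gamma)^{-1}\Gamma^T$. The second term $\hat\sigma_v^2(\Gamma^T\Gamma)^{-1}_{i,i}$ on the right-hand side of the preceding equation is the same as the one in DP3. All we need to show now is that $\sum_{t=1}^{n}\hat{u}_i(t)^2/(n-1)$ is the same as $\Gamma_i^+S_y(\Gamma_i^+)^T$. In fact, $\hat{u}_i(t) = ((\Gamma^T\Gamma)^{-1}\Gamma^T)_i\cdot y(t) = \Gamma_i^+y(t)$. Then:

$$\frac{1}{n-1}\sum_{t=1}^{n}\hat{u}_i(t)^2 = \frac{1}{n-1}\sum_{t=1}^{n}(\Gamma_i^+y(t))(\Gamma_i^+y(t))^T = \Gamma_i^+\cdot\frac{1}{n-1}\sum_{t=1}^{n}y(t)y(t)^T\cdot(\Gamma_i^+)^T = \Gamma_i^+\cdot S_y\cdot(\Gamma_i^+)^T \tag{11.30}$$

This completes the proof.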
Q.E.D.
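Proposition 11.1 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the matrix `Gamma` (orthogonal columns, M = 4, P = 2) and the sample covariance `Sy` are hypothetical values, and the helper functions are not from the text. The variation LS estimate is obtained by solving the linear system of Equation 11.22 in its general form (entries tr(ViVj) and tr(ViSy)), and the deviation LS estimate follows Equation 11.26 and Equation 11.29. Exact fractions are used so that the two results can be compared without rounding.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    aug = [[F(x) for x in row] + [F(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # first usable pivot
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[-1] for row in aug]

def inverse(A):
    n = len(A)
    cols = [solve(A, [int(i == j) for i in range(n)]) for j in range(n)]
    return transpose(cols)

def variation_ls(Gamma, Sy):
    """Solve G sigma = b with G = {tr(Vi Vj)} and b = {tr(Vi Sy)}."""
    M = len(Gamma)
    V = [[[gi * gj for gj in g] for gi in g] for g in transpose(Gamma)]  # Vi = gamma_i gamma_i^T
    V.append([[int(i == j) for j in range(M)] for i in range(M)])        # V_{P+1} = I_M
    G = [[trace(matmul(Vi, Vj)) for Vj in V] for Vi in V]
    b = [trace(matmul(Vi, Sy)) for Vi in V]
    return solve(G, b)

def deviation_ls(Gamma, Sy):
    """Equation 11.26 for the noise variance, Equation 11.29 for the rest."""
    M, P = len(Gamma), len(Gamma[0])
    Gt = transpose(Gamma)
    GtG_inv = inverse(matmul(Gt, Gamma))
    Gplus = matmul(GtG_inv, Gt)                       # pseudo-inverse, P x M
    proj = matmul(Gamma, Gplus)                       # Gamma Gamma^+
    resid = [[int(i == j) - proj[i][j] for j in range(M)] for i in range(M)]
    s_v = trace(matmul(resid, Sy)) / (M - P)
    GSG = matmul(matmul(Gplus, Sy), transpose(Gplus))
    return [GSG[i][i] - s_v * GtG_inv[i][i] for i in range(P)] + [s_v]

# Hypothetical data: the two columns of Gamma are orthogonal.
Gamma = [[1, 1], [1, -1], [0, 1], [0, 0]]
Sy = [[5, 2, 1, 0], [2, 6, -1, 0], [1, -1, 3, 0], [0, 0, 0, 1]]

sigma_vls = variation_ls(Gamma, Sy)
sigma_dls = deviation_ls(Gamma, Sy)
print(sigma_vls == sigma_dls)  # True
```

Both calls return [73/24, 13/12, 17/12] for this input. Replacing a column of `Gamma` so that the columns are no longer orthogonal will, in general, make the two results differ.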
EXAMPLE 11.1: EXISTENCE CONDITIONS FOR VARIANCE ESTIMATORS
Let us consider a simplified example involving two assembly stations and three panel parts (Figure 11.1). At station II, the locating hole on part 2 is used to position the whole assembly. The locator on station II is assumed to be free of positioning errors (i.e., $u_2 = 0$). Other settings are $B_1 = I$, $C_1 = 0$, $C_2 = I$, and $x_0 = 0$. The state space model of fixture error propagation in this three-panel assembly process is:

$$x_1 = u_1,\qquad x_2 = A_1x_1,\qquad y = x_2 + v \tag{11.31}$$
FIGURE 11.1 A three-panel two-station assembly process. (Panels: Station I and Station II, linked by the assembly transfer. Legend: sensor, welding gun, locator being used, locator not being used.)
The state transition matrix $A_1$ can be determined as follows. When this three-panel assembly is transferred to Station II, it undergoes a translation by the amount $-x_{21}$. Therefore,

$$x_2 = x_1 + \begin{bmatrix}0 & -1 & 0\\ 0 & -1 & 0\\ 0 & -1 & 0\end{bmatrix}x_1 = \begin{bmatrix}1 & -1 & 0\\ 0 & 0 & 0\\ 0 & -1 & 1\end{bmatrix}x_1 \tag{11.32}$$

That is, the state transition matrix is

$$A_1 = \begin{bmatrix}1 & -1 & 0\\ 0 & 0 & 0\\ 0 & -1 & 1\end{bmatrix} \tag{11.33}$$
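The two existence conditions can be checked for this example with a short exact-arithmetic sketch. It assumes, as the example goes on to state, that Γ equals A1 from Equation 11.33, and it builds the Gram matrix G from the trace identities in the proof of Lemma 11.1 (entries (γiᵀγj)², γiᵀγi, and M). The helper `det` is a hypothetical utility written for this illustration.

```python
from fractions import Fraction

def det(A):
    """Determinant by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in A]
    n, d = len(A), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if A[r][c] != 0), None)
        if p is None:
            return Fraction(0)        # no pivot in this column: singular
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d                    # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return d

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

A1 = [[1, -1, 0],
      [0,  0, 0],
      [0, -1, 1]]                     # Equation 11.33, taken as Gamma
M = len(A1)
cols = [list(c) for c in zip(*A1)]    # gamma_1, gamma_2, gamma_3

GtG = [[dot(gi, gj) for gj in cols] for gi in cols]
G = [[dot(gi, gj) ** 2 for gj in cols] + [dot(gi, gi)] for gi in cols]
G.append([dot(g, g) for g in cols] + [M])

print(det(GtG))  # 0  -> Gamma^T Gamma is singular
print(det(G))    # 2  -> G is full rank
```

A zero determinant for ΓᵀΓ together with a nonzero determinant for G is exactly the situation described before Lemma 11.3.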
In this model, the relationship $\Gamma = A_1$ holds. It is easy to verify that $\Gamma^T\Gamma$ is singular, so the deviation LS estimator does not exist. However, the Gram matrix

$$G = \begin{bmatrix}1 & 1 & 0 & 1\\ 1 & 4 & 1 & 2\\ 0 & 1 & 1 & 1\\ 1 & 2 & 1 & 3\end{bmatrix} \tag{11.34}$$

is of full rank. This suggests that the variance vector $\sigma^2$ can be estimated using the variation LS estimator but not the deviation LS estimator. This example illustrates the point made before Lemma 11.3, i.e., G could be full rank even if $\Gamma^T\Gamma$ is singular.

EXAMPLE 11.2: AN ASSEMBLY EXAMPLE WITH AN ORTHOGONAL Γ
An automotive assembly process was described in Apley and Shi [1]. The assembly is shown in Figure 11.2 with measurement points and locating points marked. Both the x and z directions of points M1 through M4 are measured. The x direction of points M5 through M8 and the z direction of points M9 and M10 are also measured. Pins P1 and P2 locate the body side in the x-z plane at the framing station; P1 is a 4-way locator mating with a hole in the body side, and P2 is a 2-way locator mating with a slot. The goal is to detect fixturing errors in the framing station due to P1 and P2 faults. Under this particular circumstance, two fixture faults will be considered: a P1 failure in the x direction (fault 1) and a P2 failure in the z direction (fault 2). Note that fault 1 causes a translation in the x direction and fault 2 causes a rotation about P1.
FIGURE 11.2 Side frame assembly of a sports utility vehicle.
For the preceding process, there were 14 measurements (M = 14) and two potential fixture errors (P = 2). Kinematics analysis determined the matrix Γ (which is the C matrix in their paper) to be: ΓT = 0 0 0 0 0 .354 .354 .354 .354 .354 .354 .354 .354 0 .057 −.026 0 −.004 .046 −.087 −.024 .043 .187 .361 0 .535 .495 .536 (11.35)
This Γ matrix is of full column rank and M > P, suggesting that the variation LS estimator and the deviation LS estimator can both be applied. For this side frame assembly system, both variance estimators are used to estimate the variance components associated with fixture errors. Monte Carlo simulations with K = 5000 replicates were conducted in a Matlab environment, and fixture errors were assumed to follow a normal distribution. Five different sample sizes were used (n = 5, 10, 25, 50, 100) in the simulation. We compare the two types of estimators using the mean square error (MSE) of the estimates as the performance measure. The MSE is defined to be:

$$\mathrm{MSE} \equiv \frac{1}{P+1}\sum_{j=1}^{P+1}\frac{1}{K}\sum_{k=1}^{K}\big(\hat\sigma_{j,k}^2 - \sigma_j^2\big)^2 \tag{11.36}$$

where $\hat\sigma_{j,k}^2$ is the estimate of $\sigma_j^2$ for the kth replicate. Figure 11.3 shows the MSEs of the two variance estimators. It is clear that the two estimators perform identically in this example. The reason is that the two columns of Γ are almost orthogonal ($\gamma_1^T\gamma_2 = 0.018$, $\gamma_1^T\gamma_1 = 1.0025$, and $\gamma_2^T\gamma_2 = 0.999$). This agrees with Proposition 11.1, which states that the two estimators are equivalent when the columns of Γ are orthogonal.
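The MSE in Equation 11.36 is simple to compute once the replicate estimates are available. The following is a minimal sketch, not the code used for the reported study: the function name `mse` and the tiny data set are hypothetical, and each replicate is assumed to be stored as a list of P + 1 variance estimates in the same order as the true values.

```python
def mse(estimates, true_values):
    """Equation 11.36: for each of the P + 1 variance components, average the
    squared estimation error over the K replicates, then average over components."""
    K = len(estimates)
    total = 0.0
    for j, sigma2 in enumerate(true_values):
        total += sum((est[j] - sigma2) ** 2 for est in estimates) / K
    return total / len(true_values)

# Hypothetical illustration: P + 1 = 3 components, K = 2 replicates.
true_sigma2 = [1.0, 2.0, 0.5]
replicates = [[1.5, 2.0, 0.5],
              [0.5, 2.0, 0.5]]
value = mse(replicates, true_sigma2)   # only the first component has any error
```

In a Monte Carlo comparison such as the one described above, this function would be called once per estimator and per sample size n, with K equal to the number of replicates.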
FIGURE 11.3 MSE for the linear system with Γ, as in Equation 11.35. (Axes: MSE, from 0 to 4, versus sample size n = 5, 10, 25, 50, 100.)
11.3 COMPARISON OF VARIANCE ESTIMATORS
The performances of the deviation and variation LS estimators are compared in this section. We start with a discussion of their properties related to estimation unbiasedness and dispersion.

11.3.1 UNBIASEDNESS OF VARIANCE ESTIMATORS
When u and v are assumed to be zero-mean vectors, $E(S_y) = \Sigma_y$, i.e., $S_y$ is an unbiased estimate of $\Sigma_y$. From Equation 11.14, it can be readily seen that the variation LS estimator is unbiased.

The unbiasedness of a deviation LS estimator can be more easily determined from Equation 11.15. Note that $E(\hat\sigma_v^2) = \mathrm{tr}((I_M - \Gamma\Gamma^+)\Sigma_y)/(M-P)$. Applying Equation 11.2 yields

$$\begin{aligned}E(\hat\sigma_v^2) &= \mathrm{tr}\big((I_M-\Gamma\Gamma^+)(\Gamma\Sigma_u\Gamma^T + \sigma_v^2I_M)\big)/(M-P)\\ &= \sigma_v^2\cdot\{\mathrm{tr}(I_M) - \mathrm{tr}(\Gamma\Gamma^+)\}/(M-P)\\ &= \sigma_v^2\cdot\{\mathrm{tr}(I_M) - \mathrm{tr}((\Gamma^T\Gamma)^{-1}\Gamma^T\Gamma)\}/(M-P)\\ &= \sigma_v^2\cdot\{\mathrm{tr}(I_M) - \mathrm{tr}(I_P)\}/(M-P) = \sigma_v^2\end{aligned} \tag{11.37}$$

The expectation of $\hat\Sigma_u$ is then taken as

$$E(\hat\Sigma_u) = \Gamma^+\Sigma_y\Gamma^{+T} - E(\hat\sigma_v^2)\cdot(\Gamma^T\Gamma)^{-1} \tag{11.38}$$

Utilizing the results from Equation 11.2 and Equation 11.37, it is easy to show that $E(\hat\Sigma_u) = \Sigma_u$. Thus, the deviation LS estimator is also unbiased.
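The final step can be written out explicitly. The derivation below is a worked sketch that assumes Equation 11.2 has the form $\Sigma_y = \Gamma\Sigma_u\Gamma^T + \sigma_v^2I_M$ (the form used in the expectation of $\hat\sigma_v^2$ above) and that $\Gamma^+ = (\Gamma^T\Gamma)^{-1}\Gamma^T$, so that $\Gamma^+\Gamma = I_P$ and $\Gamma^+\Gamma^{+T} = (\Gamma^T\Gamma)^{-1}$.

```latex
\begin{aligned}
E(\hat{\Sigma}_u)
  &= \Gamma^{+}\,\Sigma_y\,\Gamma^{+T} - E(\hat{\sigma}_v^2)\,(\Gamma^T\Gamma)^{-1} \\
  &= \Gamma^{+}\big(\Gamma\Sigma_u\Gamma^T + \sigma_v^2 I_M\big)\Gamma^{+T}
     - \sigma_v^2\,(\Gamma^T\Gamma)^{-1} \\
  &= (\Gamma^{+}\Gamma)\,\Sigma_u\,(\Gamma^{+}\Gamma)^T
     + \sigma_v^2\,\Gamma^{+}\Gamma^{+T}
     - \sigma_v^2\,(\Gamma^T\Gamma)^{-1} \\
  &= \Sigma_u + \sigma_v^2\,(\Gamma^T\Gamma)^{-1} - \sigma_v^2\,(\Gamma^T\Gamma)^{-1}
   = \Sigma_u .
\end{aligned}
```

The noise contribution $\sigma_v^2\,\Gamma^{+}\Gamma^{+T}$ that leaks into $\Gamma^{+}\Sigma_y\Gamma^{+T}$ is exactly what the correction term $\hat\sigma_v^2(\Gamma^T\Gamma)^{-1}$ removes on average.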
11.3.2 DISPERSION OF VARIANCE ESTIMATORS
Variance estimator dispersion is characterized by the trace of its variance-covariance matrix, i.e., $\mathrm{var}_{\mathrm{ML,\,VLS,\,or\,DLS}} \equiv \mathrm{tr}(\mathrm{Cov}(\hat\sigma^2_{\mathrm{ML,\,VLS,\,or\,DLS}}))$. The same criterion has been used for a composite comparison of variance estimation in Corbeil and Searle [12]. Because the MLE will be used as the reference later for comparison of deviation and variation LS estimators, MLE dispersion is also presented.

First, the variance of $\hat\sigma_v^2$ for a deviation LS estimator is derived as

$$\mathrm{Var}(\hat\sigma_v^2) = \frac{2\sigma_v^4}{n(M-P)}. \tag{11.39}$$

The complete derivation can be found in Appendix A3 of Ding et al. [13]. Using Equation 11.20, $\mathrm{var}_{DLS}$ can be calculated as:

$$\begin{aligned}\mathrm{var}_{DLS} &= \mathrm{tr}(\mathrm{Cov}(\hat\sigma^2_{1,P})) + \mathrm{Var}(\hat\sigma_v^2)\\ &= \frac{1}{n}\mathrm{tr}\big(P_1(I_{M^2}+K_M)(\Sigma_y\otimes\Sigma_y)P_1^T\big) - \frac{1}{n(M-P)}\mathrm{tr}\big(P_1(I_{M^2}+K_M)(\Sigma_y\otimes\Sigma_y)P_3^T\big)\\ &\quad + \mathrm{tr}(P_2P_2^T)\frac{2\sigma_v^4}{n(M-P)} + \frac{2\sigma_v^4}{n(M-P)}\end{aligned} \tag{11.40}$$

where $P_1 = Q\cdot(\Gamma^+\otimes\Gamma^+)$, $P_2 = Q\cdot\mathrm{vec}((\Gamma^T\Gamma)^{-1})$, $P_3 = P_2\cdot(\mathrm{vec}(I_M-\Gamma\Gamma^+))^T$, and $K_M$ is as defined in Magnus and Neudecker [14]. Similarly, the $\mathrm{var}_{VLS}$ for the variation LS estimator is

$$\mathrm{var}_{VLS} = \mathrm{tr}(\mathrm{Cov}(\hat\sigma^2_{VLS})) = \frac{1}{n}\mathrm{tr}\big(\Pi^+(I_{M^2}+K_M)(\Sigma_y\otimes\Sigma_y)(\Pi^+)^T\big) \tag{11.41}$$

The covariance matrix of an MLE is approximated by the inverse of its Fisher information matrix [6], given as

$$\Psi(\hat\sigma^2_{ML}) \equiv -E\left\{\frac{\partial^2L(\Sigma_y\,|\,y)}{\partial\sigma_i^2\,\partial\sigma_j^2}\right\}_{i,j=1}^{P+1} \tag{11.42}$$

Thus, $\mathrm{var}_{ML}$ is approximated as

$$\mathrm{var}_{ML} \approx \mathrm{tr}(\Psi^{-1}) = \frac{2}{n}\mathrm{tr}\left[\left(\big\{\mathrm{tr}(\Sigma_y^{-1}V_i\Sigma_y^{-1}V_j)\big\}_{i,j=1}^{P+1}\right)^{-1}\right] \tag{11.43}$$
It should be noted that the sample size n has the same effect on the dispersion of all three preceding estimators.
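As an illustration of how the MLE reference in Equation 11.43 can be evaluated, the sketch below computes the approximate var_ML for a given set of variance components. It is a minimal example under several assumptions that are not spelled out in this passage: Σy is taken to have the linear structure Σy = Σ σi²Vi with Vi = γiγiᵀ and V(P+1) = I_M, the function names (`basis_matrices`, `var_ml`) are hypothetical, the Γ used is the one read from Equation 11.46 in Section 11.3.3, and all variance components are set to 1 purely for demonstration.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(A)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def basis_matrices(Gamma):
    """V_i = gamma_i gamma_i^T for every column of Gamma, followed by V_{P+1} = I_M."""
    M = len(Gamma)
    V = [[[gi * gj for gj in g] for gi in g] for g in zip(*Gamma)]
    V.append([[float(i == j) for j in range(M)] for i in range(M)])
    return V

def var_ml(V, sigma2, n):
    """Equation 11.43: (2/n) * trace of the inverse of {tr(Sy^-1 Vi Sy^-1 Vj)}."""
    M = len(V[0])
    Sigma_y = [[sum(s * Vk[i][j] for s, Vk in zip(sigma2, V)) for j in range(M)]
               for i in range(M)]
    Si = inverse(Sigma_y)
    SV = [matmul(Si, Vk) for Vk in V]                      # Sigma_y^-1 V_k
    info = [[trace(matmul(A, B)) for B in SV] for A in SV]
    return 2.0 / n * trace(inverse(info))

# Gamma^T as read from Equation 11.46 (M = 6, P = 3).
Gamma_T = [[1, 1, 0, 0, 0, 0],
           [-1, -1, 1, 1, -1, -1],
           [0, 0, 0, 0, 1, 1]]
Gamma = [list(r) for r in zip(*Gamma_T)]
V = basis_matrices(Gamma)
sigma2 = [1.0, 1.0, 1.0, 1.0]      # assumed values: three sources plus sensor noise
v25 = var_ml(V, sigma2, 25)
v50 = var_ml(V, sigma2, 50)
```

Doubling n halves the result, which is the 1/n dependence noted above. As a quick consistency check, with no variation sources (only V = I_M) the function reduces to 2σv⁴/(nM), which agrees with Equation 11.39 for P = 0.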
11.3.3 COMPARISON OF VARIANCE ESTIMATORS
As the deviation and variation LS estimators are both unbiased, estimator dispersion is the main criterion for performance comparison. One objective of this comparison is to determine the condition under which the variation LS or the deviation LS estimator may be an effective alternative to MLE for online variance estimation. LS estimators can be computed using their closed-form expressions and, consequently, should require much less computation time than an MLE. The primary disadvantage of LS estimators is that they may demonstrate unacceptably higher variances than MLE. $\mathrm{var}_{ML}$ is used as the reference for this performance comparison. The relative difference between an LS estimator and an MLE is characterized by the percentage difference (Diff), defined as:

$$\mathrm{Diff}_{VLS\,(\text{or }DLS)\text{ vs }ML} \equiv \frac{\mathrm{var}_{VLS\,(\text{or }DLS)} - \mathrm{var}_{ML}}{\mathrm{var}_{ML}}\times 100\% \tag{11.44}$$
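Equation 11.44 translates directly into a small helper. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical, and the 10% default threshold simply mirrors the allowance used as an example later in this section.

```python
def diff_percent(var_ls, var_ml):
    """Equation 11.44: excess dispersion of an LS estimator over the MLE, in percent."""
    return (var_ls - var_ml) / var_ml * 100.0

def is_acceptable(var_ls, var_ml, max_diff=10.0):
    """True when the LS estimator stays within max_diff percent of the MLE dispersion."""
    return diff_percent(var_ls, var_ml) <= max_diff

# Hypothetical dispersion values, for illustration only.
d = diff_percent(1.2e-6, 1.0e-6)          # about 20 percent
ok = is_acceptable(1.05e-6, 1.0e-6)       # about 5 percent, within the 10 percent allowance
```

The same function can be applied to theoretical dispersions (Equation 11.40, Equation 11.41, and Equation 11.43) or to sample variances obtained from simulation.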
A direct analytical comparison of variance estimators is difficult, if not impossible. To address this issue, a general understanding of the performance of the LS-based estimator is provided, followed by a numerical evaluation to illustrate the conclusions of this study.

An LS estimator becomes an MLE under special conditions. A deviation LS estimator is an MLE in a noise-free environment, i.e., when $\sigma_v^2 = 0$. In this case, when a deviation LS estimator exists, the observation of y(t) is equivalent to direct observation of u(t). The sample variances computed from direct observation of u(t), t = 1, …, n, are the maximum likelihood estimators of $\{\sigma_j^2\}_{j=1}^{P}$. Under noise-free conditions, the diagonal elements of $\hat\Sigma_u$ are the same as the sample variances of u. As such, a deviation LS estimator is the MLE of $\{\sigma_j^2\}_{j=1}^{P}$.

In contrast, a variation LS estimator becomes an MLE when the signal u is not random. Randomness in y is solely due to sensor noise, i.e., $\Sigma_u = 0$ but $\sigma_v^2 \neq 0$. This equivalence can be easily demonstrated by substituting $\hat\Sigma_y = \hat\sigma_v^2I_M$ into Equation 11.9 (the ML equation). The results for $\hat\sigma_v^2$ will be the same as those obtained from Equation 11.6 (the variation LS equation).

Hence, the variation LS and deviation LS estimators are two extreme cases of the MLE. It is expected that a deviation LS estimator will perform as well as an MLE when sensor noise is relatively small, whereas a variation LS estimator will perform as well as an MLE when sensor noise is dominant in the process. To characterize sensor noise dominance, an average signal-to-noise ratio (SNR) is defined as follows:

$$\mathrm{SNR} = \frac{\mathrm{tr}(\Sigma_u)}{P\cdot\sigma_v^2} \tag{11.45}$$

where $\mathrm{tr}(\Sigma_u)/P$ is the average power of signals and $\sigma_v^2$ is the power of sensor noises.

In addition to the SNR, the structure of matrix Γ will also affect the estimator variance. However, because the way in which matrix Γ enters the dispersion indicators in Equation 11.40 to Equation 11.43 is complicated, it is recommended that the percentage difference (Diff) vs. SNR be plotted for a given matrix Γ. As an example, let M = 6, P = 3, and the Γ matrix be

$$\Gamma^T = \begin{bmatrix}1 & 1 & 0 & 0 & 0 & 0\\ -1 & -1 & 1 & 1 & -1 & -1\\ 0 & 0 & 0 & 0 & 1 & 1\end{bmatrix} \tag{11.46}$$

for which the SNR range is selected to be [0.01, 100]. This SNR range is achieved by fixing $\sigma_v^2$ at unit variance and varying the value of $\mathrm{tr}(\Sigma_u)$. Although the components $\sigma_i^2$ in $\Sigma_u$ can take different values for a given $\mathrm{tr}(\Sigma_u)$, in this case all $\sigma_i^2$ are assigned equal values, i.e., $\sigma_i^2 = \mathrm{tr}(\Sigma_u)/P$. For instance, if SNR = 1, given $\sigma_v^2 = 1$, then $\mathrm{tr}(\Sigma_u) = 1$. Subsequently, $\sigma_i^2 = 0.333$ for i = 1, …, 3.

Diff vs. SNR for both DLS vs. ML and VLS vs. ML are plotted in Figure 11.4, which provides a quantitative characterization of the relationship between the two estimators and is consistent with the general concept described earlier. In practice, the curves are plotted for a given matrix Γ to determine the SNR range over which a deviation or a variation LS estimator demonstrates acceptable performance compared to an MLE. With respect to matrix Γ in Equation 11.46, the point at which the deviation and variation LS estimators exhibit the same performance is approximately SNR = 1 (SNR = $10^0$). If the maximum allowed difference from an MLE is 10%, then a deviation LS estimator is a good alternative for an MLE when SNR > 2. Similarly, a variation LS estimator is an effective alternative when SNR < 0.2.

FIGURE 11.4 Performance curves for two types of LS estimators. (Curves: deviation LS estimator vs. MLE, variation LS estimator vs. MLE.)

EXAMPLE 11.3: SINGLE-STATION ASSEMBLY PROCESS
This example uses a single-station automotive body assembly process similar to that in Example 11.1. For this particular problem, however, there are nine measurements (M = 9) and three independent variation sources (P = 3). The associated Γ matrix is determined to be
$$\Gamma^T = \begin{bmatrix}.093 & 0 & -.093 & .093 & 0 & .647 & -.370 & 0 & .647\\ .577 & 0 & 0 & .577 & 0 & 0 & .577 & 0 & 0\\ -.120 & 0 & .843 & -.120 & 0 & -.120 & .482 & 0 & -.120\end{bmatrix} \tag{11.47}$$
Matrix Γ in this example is of full column rank and M > P + 1, suggesting that both LS estimators are applicable. The performance curves for this system design are presented in Figure 11.5. To evaluate the performance of variance estimators, the SNR is first estimated from engineering design specifications. The sensor used in this example is a type of noncontact coordinate sensor having regular precision. The specified sensor accuracy is $(6\sigma)_{sensor}$ = 0.1 mm. In contrast, the tolerance of the pinhole contact is roughly 0.2 mm. If the tolerance is approximated by the Six Sigma value, then $(6\sigma)_{locator}$ = 0.2 mm, implying that the repeatability of a malfunctioning locator will have a Six Sigma value larger than 0.2 mm. Based on this approximation, SNR ≥ $((6\sigma)_{locator}/(6\sigma)_{sensor})^2$ = 4. Given an SNR of around 4, a deviation LS estimator can exhibit a 1% difference from an MLE in terms of estimation variance, whereas the difference for a variation LS estimator would be around 18%.

FIGURE 11.5 Performance curves for the linear system in Equation 11.47. (Curves: deviation LS estimator, variation LS estimator.)

To determine if meaningful guidance can be obtained from the OC curve, a simulation (500 trials) was performed with variance components $\sigma^2$ = [0.0011 0.0025 0.0044 0.0006] (including sensor noise) and a sample size of n = 25. The results from the two LS variance estimators are compared in Table 11.1. MLE results are also included in Table 11.1 as a reference. The first row ($\hat\sigma^2$) presents the sample average of variance estimation from all three estimators and indicates that both the deviation and variation LS estimators are unbiased. The bias of the MLE in this example is not noticeable. In Table 11.1, row 2 details the sample variance of each estimator. The computed Diff values (row 3) are near those predicted by the performance curves. The simulation result confirms the conclusion drawn from the performance curves.

TABLE 11.1 Comparison of Three Estimators for Linear System in Equation 11.47
| | Deviation LS Estimator | Variation LS Estimator | MLE |
| $\hat\sigma^2$ | [0.0011 0.0025 0.0044 0.0006] | [0.0011 0.0025 0.0044 0.0006] | [0.0011 0.0025 0.0044 0.0006] |
| $\mathrm{tr}(\mathrm{Cov}(\hat\sigma^2))$ | 3.20 × $10^{-6}$ | 3.58 × $10^{-6}$ | 3.17 × $10^{-6}$ |
| Diff (%) | 1.1% | 13.1% | - |
Note: The true values of $\sigma^2$ = [0.0011 0.0025 0.0044 0.0006].

EXAMPLE 11.4: MULTISTATION ASSEMBLY PROCESS
Figure 11.6 depicts a two-station process, derived as a segment of the simplified automotive body assembly process in Figure 9.4 of Chapter 9. In this example, three workpieces are welded together at station I, with the first workpiece, a subassembly from prior assembly operations, consisting of two components. Once welding operations are completed, the entire assembly is transferred to a dedicated in-process OCMM station (station II) for inspection. The nine points at which measurements are taken are indicated in Figure 11.6. High-precision laser-optic coordinate sensors are used to measure two directional coordinates at each measurement point; therefore, M = 18. As the dedicated OCMM station is assumed to always be well maintained, only those variation sources associated with locators on station I are considered. Fixturing variation sources are marked from 1 to 9 in Figure 11.6a; 1 through 3 represent fixturing variations in the x direction and 4 through 9 are variations in the z direction. Hence, $P_x$ = 3, $P_z$ = 6, and P = $P_x$ + $P_z$ = 9.

FIGURE 11.6 A two-station assembly process. (Panels: (a) Station I and (b) Station II, linked by the assembly transfer. Legend: pinhole being used, pinhole not being used, measurement points.)

Because this process involves more than one station, the state space modeling approach outlined in Chapter 6 is employed to generate the variation propagation model. The state space model was converted to a linear diagnostic model as

$$y(t) = \Gamma u(t) + v(t) = [\Gamma_x\ \ \Gamma_z]\,u(t) + v(t),\quad t = 1, 2, \ldots, n \tag{11.48}$$
where Γx and Γz are two blocks in matrix Γ, corresponding to horizontal and vertical fixturing variations, respectively; the numerical expression of matrix Γ is as follows:
Γ = [Γ x
0 0 0 0 0 0 0 0 0 Γz ] = 0 −1 0 −1 0 −1 0 −1 0
0
0
0.1215 −0.3846
0
0
0
0
0
0.0221 −0.0699
0
0
0
0.0478
0
0
0.1215 −0.3846
0
0
0
0.2632
0
0
−0.1817
0.5944
0
0
0
0
0
−0.0773
0.2448
−0.3379
1.0699
0
0
0
0
0
0
0
0
0
0
−0.2054
0.6503
1
0
−0.3110
0
0
0
0.0574
0
1
0
−0.2153
0
0
0
−0.2392
0
0
1
−0.0957
0
0
0
0.0574
0
0
1
0
0
0
0
0.1656 −0.5245 −0.3379 0
−0.2392
1.0699 0
0
0.2632
0 0 0 −0.1675 0 0 0 −0.7321 0 0 0 0.3589 −0.7321 0 0 0 0 0 0 0 −0.445 0 0 0 0.4 −0.4 0 0.311 −0.24 −1.0574 1.24 0 0 0 0 0.2153 1 0 0 −0.7608 0 0 0.4 −0.3043 0 0 0.1826 −0.24 0 0 0 0 0 0 1 −0.7608 18×9 (11.49) −0.4067
For the purposes of this case study, it was assumed that only horizontal variance sources or vertical sources exist, so that only $\Gamma_x$ or $\Gamma_z$ is needed for variance estimation. This simplified treatment is employed only to demonstrate different variance estimator performances. Similarly, both horizontal and vertical variance sources can be simultaneously estimated using the entire Γ matrix.

First, $\Gamma_x \in \Re^{18\times 3}$ is evaluated. $\Gamma_x^T\Gamma_x$ is

$$\Gamma_x^T\Gamma_x = \begin{bmatrix}4 & -2 & -2\\ -2 & 2 & 0\\ -2 & 0 & 2\end{bmatrix} \tag{11.50}$$

which has a rank of 2. In this case, $\Gamma_x^T\Gamma_x$ is singular, illustrating a critical aspect of many multistation processes, i.e., matrix Γ often does not have full column rank. In this singular system, the deviation LS estimator cannot be used. The existence of a variation LS estimator is dependent on the rank of $\Pi_x^T\Pi_x$, where $\Pi_x = [\pi(\Gamma_x)\ \ \mathrm{vec}(I_M)]$. In this case,
$$\Pi_x^T\Pi_x = \begin{bmatrix}16 & 4 & 4 & 4\\ 4 & 4 & 0 & 2\\ 4 & 0 & 4 & 2\\ 4 & 2 & 2 & 18\end{bmatrix} \tag{11.51}$$
which is of full rank. Hence, a variation LS estimator exists. The performance curve for the variation LS estimator is given in Figure 11.7, a roughly flat line near the Diff = 0 value. As a result, to estimate variance location in the x direction, a variation LS estimator can be used in place of an MLE.

The same evaluation is performed for $\Gamma_z \in \Re^{18\times 6}$. Likewise, $\Gamma_z^T\Gamma_z$ is also singular (rank = 5); however, $\Pi_z^T\Pi_z$ (defined similarly as $\Pi_x^T\Pi_x$) is full rank. For convenience, $\Gamma_z^T\Gamma_z$ and $\Pi_z^T\Pi_z$ are not computed here, though the performance curve for $\Gamma_z$ is also given in Figure 11.7. Unfortunately, when the SNR is large, the variation LS estimator for a system having $\Gamma_z$ can exhibit an unacceptable deterioration from the preferred MLE performance. Hence, the variation LS estimator may not be appropriate for use in estimating variance location in the z direction, depending on the amplitude of the process SNR.

In this multistation example, a dedicated OCMM station houses highly accurate coordinate sensors with $(6\sigma)_{sensor}$ = 0.02 mm. The tolerance for fixture locators is the same as before, resulting in an SNR of 100. Thus, the variance of a variation LS estimator can be as much as 80% greater than that of an MLE. Again, 500 simulation trials were run with n = 25. The variance estimation sample average (Table 11.2, row 1) indicates that the variation LS estimators are unbiased and the bias of the MLE is not noticeable. The Diff values are also near those obtained from the performance curve. Hence, the behavior of the performance curve in Figure 11.7 is verified by the simulation results given in Table 11.2.
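The rank statements for the x-direction block can be reproduced from Equation 11.50 alone. The sketch below is an illustration under one assumption taken from the proof of Lemma 11.1: the entries of ΠxᵀΠx are (γiᵀγj)², γiᵀγi, and M, so the matrix in Equation 11.51 can be rebuilt from ΓxᵀΓx and M = 18 without the full Γx. The helper `rank` is a hypothetical utility written for this example.

```python
from fractions import Fraction

def rank(A):
    """Matrix rank by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in A]
    rows, cols, r = len(A), len(A[0]), 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return r

GtG_x = [[4, -2, -2],
         [-2, 2, 0],
         [-2, 0, 2]]                      # Equation 11.50
M = 18
P = len(GtG_x)

# Rebuild Pi_x^T Pi_x: squared inner products, column norms, and M in the corner.
PtP_x = [[GtG_x[i][j] ** 2 for j in range(P)] + [GtG_x[i][i]] for i in range(P)]
PtP_x.append([GtG_x[j][j] for j in range(P)] + [M])

print(rank(GtG_x))   # 2 -> singular, so no deviation LS estimator
print(rank(PtP_x))   # 4 -> full rank, so a variation LS estimator exists
```

The rebuilt `PtP_x` matches the matrix in Equation 11.51 entry by entry.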
FIGURE 11.7 Performance curves for $\Gamma_x$ and $\Gamma_z$. (Curves: OC curve for $\Gamma_x$, OC curve for $\Gamma_z$.)

TABLE 11.2 Simulation Comparison of Variation LS Estimator and ML Estimator
| | LS ($\Gamma_x$) | MLE ($\Gamma_x$) | LS ($\Gamma_z$) | MLE ($\Gamma_z$) |
| $\hat\sigma^2$ | [0.0045 0.0003 0.0003 1.11 × $10^{-5}$] | [0.0045 0.0003 0.0003 1.11 × $10^{-5}$] | [0.0005 0.0005 0.0014 0.0025 0.0005 0.000 1.15 × $10^{-5}$] | [0.0005 0.0005 0.0014 0.0025 0.0005 0.0005 1.11 × $10^{-5}$] |
| $\mathrm{tr}(\mathrm{Cov}(\hat\sigma^2))$ | 1.741 × $10^{-6}$ | 1.675 × $10^{-6}$ | 1.678 × $10^{-6}$ | 0.954 × $10^{-6}$ |
| Diff (%) | 3.97% | - | 75.93% | - |
| True values | [.0045 .0003 .0003, 1.11 × $10^{-5}$] | (same as LS ($\Gamma_x$)) | [.0005 .0005 .0015 .0025 .0005 .0005, 1.11 × $10^{-5}$] | (same as LS ($\Gamma_z$)) |
11.4 CHAPTER SUMMARY
This chapter discusses estimation-based diagnosis, complementing the pattern-based diagnosis methods in Chapter 10. A variance estimator for linear diagnostic models is of central importance in identifying sources of variation. Variance estimators with closed-form expressions are demonstrably more appropriate for online quality control than an MLE. This chapter studies the intrinsic relationship among several variants of variance estimators that are developed using the least-squares principle. The results of this study will significantly facilitate efforts supporting root cause analysis by enabling rapid and accurate estimation of underlying variation sources and providing information for corrective actions.

We note that the presented methods typically require a random sample of 25 to 50 units. For a dynamic process with tool degeneration, the process data are inherently autocorrelated. However, because 25 to 50 units typically translate to production periods of 1 h or less, the sampling period will generally be too small to observe any noticeable process degeneration effects. Consequently, the methods should still be applicable to diagnosing other types of tooling errors in processes that also experience relatively slow tool degeneration dynamics (although other methods would be required to diagnose the tool degeneration itself). For processes with faster degeneration dynamics, recursive estimation methods need to be developed.
11.5 EXERCISES
1. Derive Equation 11.5 and Equation 11.6.
2. Implement an iterative computer program that can solve Equation 11.9. You can consult the procedures in Anderson [8,9] or use other available numerical solution tools that you are familiar with.
3. For the π-transform defined in Equation 11.13, prove the following:
   a. If the columns in Γ are independent, then those in $\pi(\Gamma)$ are also independent.
   b. Given any two matrices $\Gamma_i$ and $\Gamma_j$, if all columns in $\Gamma_i$ are independent of those in $\Gamma_j$, then all columns in $\pi(\Gamma_i)$ will be independent of columns in $\pi(\Gamma_j)$.
4. Given a system of which the system matrix is (this is actually a design matrix from a $2^3$ factorial design):

$$\Gamma^T = \begin{bmatrix}1 & 1 & 1 & 1 & -1 & -1 & -1 & -1\\ 1 & 1 & -1 & -1 & 1 & 1 & -1 & -1\\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1\end{bmatrix}$$

   a. Simulate n = 5 observations for y, assuming $u_1$ ~ N(1, 1), $u_2$ ~ N(0.5, 1.5), $u_3$ ~ N(-0.5, 0.5), and v ~ N(0, 0.1).
   b. Treat the values you obtained in (a) as if they were the observations gathered from a real process. Repeat Example 11.2 to see if the variation LS estimator and the deviation LS estimator yield the same estimation of u and v.
   c. Repeat (a) and (b) for n = 10, 25, 50, and 100 so that you can plot a figure similar to Figure 11.3.
5. Consider the following single-station fixturing system shown in Figure 11.8:
   a. Decide the system matrix Γ for this single-station system.
   b. Generate the OC-curves for the two LS estimators using Equation 11.40, Equation 11.41, and Equation 11.43. Present your result in the format of Figure 11.5.
   c. Given $\sigma^2$ = [0.0025 0.0011 0.0040 0.0005] and n = 25, use a 500-trial simulation to verify if the OC-curves generated in (b) accurately represent the performance difference of the two LS estimators. Present your result in the format of Table 11.1. Note that you need your computer program from Exercise 2 to solve the ML estimator.
FIGURE 11.8 Single-station fixturing system. (Labels: locators P1 and P2; measurement points M1, M2, and M3; axes X, Y, and Z; dimensions 100, 100, 200, 600, 700, and 1000.)
References
1. Apley, D.W. and Shi, J., Diagnosis of multiple fixture faults in panel assembly, ASME Journal of Manufacturing Science and Engineering, 120, 793, 1998.
2. Ding, Y., Gupta, A., and Apley, D., Singularity of fixture fault diagnosis in multistation assembly systems, ASME Transactions, Journal of Manufacturing Science and Engineering, 126, 200, 2004.
3. Luenberger, D.G., Optimization by Vector Space Methods, John Wiley & Sons, New York, 1968.
4. D’Assumpcao, H.A., Some new signal processors for arrays of sensors, IEEE Transactions on Information Theory, IT-26, 441, 1980.
5. Bohme, J.F., Estimation of spectral parameters of correlated signals in wavefields, Signal Processing, 11, 329, 1986.
6. Searle, S.R., Casella, G., and McCulloch, C.E., Variance Components, John Wiley & Sons, New York, 1992.
7. Rao, C.R. and Kleffe, J., Estimation of Variance Components and Applications, North-Holland, Amsterdam, 1988, p. 231.
8. Anderson, T.W., Statistical inference for covariance matrices with linear structure, Proceedings of the Second International Symposium of Multivariate Analysis, Academic Press, New York, 1969, p. 55.
9. Anderson, T.W., Estimation of covariance matrices which are linear combinations or whose inverses are linear combinations of given matrices, in Essays in Probability and Statistics, University of North Carolina Press, Chapel Hill, NC, 1970, p. 1.
10. Ding, Y., Shi, J., and Ceglarek, D., Diagnosability analysis of multi-station manufacturing processes, ASME Journal of Dynamic Systems, Measurement, and Control, 124, 1, 2002.
11. Stoica, P. and Nehorai, A., On the concentrated stochastic likelihood function in array signal processing, Circuits Systems Signal Process, 14, 669, 1995.
12. Corbeil, R.R. and Searle, S.R., A comparison of variance component estimators, Biometrics, 32, 779, 1976.
13. Ding, Y., Zhou, S., and Chen, Y., A comparison of process variance estimation methods for in-process dimensional measurement and control, ASME Transactions, Journal of Dynamic Systems, Measurement, and Control, 127, 1, 69–79, 2005.
14. Magnus, J.R. and Neudecker, H., The commutation matrix: some properties and applications, Annals of Statistics, 7, 381, 1979.
Part IV Design for Variation Reduction
12
Optimal Sensor Placement and Distribution*
This chapter considers the problem of optimal sensor placement and distribution for quality control in multistation processes. The introduction of in-process automatic quality-assurance sensors, such as optical coordinate measuring machines (OCMMs), offers the potential of diagnosing the variation sources responsible for product quality defects in a timely manner. Such a sensor system can help manufacturers improve product quality and reduce process downtime. Effective use of sensory data in diagnosing variation sources depends on the optimal design of a sensor system, a problem that is often known as sensor placement and distribution.

In the literature, the “location of a sensor” takes two different meanings: the first is the literal meaning of a sensor location, i.e., where a sensor is physically installed; the second refers to the location of a product feature that a sensor measures. The first meaning, i.e., where to physically install a sensor, is often the focus of computer vision research [1]. The second meaning, i.e., which product feature to measure, is more commonly used in quality control research and is the one adopted in this chapter. Under this meaning, distributing sensors is, in fact, equivalent to selecting which product features to measure at different stations in a manufacturing process.

In this chapter, Section 12.1 describes the background of the problem and reviews the relevant literature. Section 12.2 presents the design criterion, as well as the optimization formulation used to achieve the above-stated design objective. Section 12.3 discusses a revised exchange algorithm for sensor placement on an individual manufacturing station. Section 12.4 presents a backward propagation algorithm for optimal sensor distribution in a multistation process. Section 12.5 concludes and summarizes this chapter.
* Part of chapter material is based on Reference 19 and Reference 21.

12.1 INTRODUCTION
Coordinate measuring machines (CMMs) are widely used in discrete-part industries to ensure the dimensional quality of a product. The mechanism of a CMM is illustrated in Figure 12.1a. A CMM usually consists of a spatial frame that provides the coordinate reference (not shown in the figure), a mechanical arm that can move along guided tracks, and a probe that retrieves coordinate information when its tip touches the surface of a manufactured workpiece. One disadvantage of CMMs is their low throughput. Performing the measurement job sequentially, a CMM with a single mechanical arm and touch probe will usually take hours to finish all measurements on a complicated product. For instance, a CMM can measure only 6 to 8 automotive bodies per day in an automotive body shop that fabricates 1000 units daily. The high manufacturing cost of a CMM may also prohibit using multiple CMMs to perform measurement jobs in parallel.

FIGURE 12.1 Mechanism of CMM and OCMM: (a) CMM with mechanical arm and touch probe on a manufactured workpiece; (b) OCMM sensor unit with laser source, optical lens, and CCD image sensors on a manufactured workpiece; (c) laser coordinate sensors in automotive body assembly for variation root cause diagnosis.

Recent innovations in sensor technology have enabled manufacturers to distribute quality-assurance metrology sensors in multistation manufacturing processes. For example, optical coordinate measuring machines (OCMMs) are built into automotive assembly lines. An OCMM replaces the mechanical arm and the touch probe in a CMM with an optical sensor unit (Figure 12.1b) that consists of a laser source and two CCD (charge-coupled device) image sensors. The laser source projects a beam onto the surface of a workpiece, and the CCD sensors detect the reflected laser beam. The sensor unit is installed within a spatial frame and calculates the coordinate of the measured point relative to the frame reference using triangulation. The OCMM frame and sensor unit are less expensive than the frame and touch probe of the CMM. It is therefore more affordable to deploy multiple optical sensor units and build more OCMM stations, performing parallel measurement jobs of multiple product characteristics. An OCMM station with multiple sensor units is capable of measuring as many as 150 product features on a car body within 1 min (refer to Figure 12.1c). This high throughput enables OCMMs to be built into the production process and to perform 100% inspection of dimensional quality characteristics [2].

The deployment of OCMMs and other fast, automatic, in-process sensors results in a shift in quality control philosophy. With CMMs, dimensional measurements are taken offline and sampled from a large product population. CMMs are also used to inspect key dimensional product features and ensure that they are statistically acceptable. With in-process OCMMs, dimensional measurements can be taken continuously on every manufactured product, so that the underlying process variation sources responsible for product defects can potentially be determined; this process is known as root cause diagnosis. Root cause diagnosis is critical because the identification of variation sources will lead to corrective actions, restoring the manufacturing system to its normal condition in a timely manner. Many of the recent research advances in root cause diagnosis in multistation manufacturing systems have been presented in Part II of this book.
Clearly, effective use of product measurements in root cause diagnosis depends to a great extent on the design of a sensor system. A poorly designed sensor system is likely to generate an extensive amount of irrelevant or even conflicting information, and as such, may not be able to provide the desired diagnosability in identifying variation sources. The effectiveness of a sensor system is characterized by its capability and accuracy to identify major variation sources. The efficiency of the system can be benchmarked by the sensing cost required to achieve certain levels of diagnosability or accuracy. The optimal design of a sensor system in terms of its effectiveness and efficiency is to realize the desired capability for root cause diagnosis at a minimum sensing cost. In a multistation manufacturing process, the design of a sensor system involves the determination of: (1) which workstation to distribute sensing devices at, (2) the number of sensors required for individual stations, (3) the location of sensors at individual stations, and (4) operation strategies, such as how many and how often measurements will be taken. However, given the high measurement throughput of the chosen optical sensor units, in-process data are automatically collected from every product. This makes the operation strategy of the sensors simple and straightforward. Therefore, the sensor system design considered here focuses on the first three questions, namely, where to place sensors and how many are needed. This problem is generally referred to as sensor placement and distribution in engineering practice. Relevant research in this area falls into two major categories: sensor allocation for the purpose of multistage product inspection and single-station sensing optimization. Optimal allocation of inspection efforts has been studied for serial and nonserial production lines with either perfect or imperfect inspection capability; see the recent survey on this topic [3]. 
The objective of multistage product inspection is to minimize overall cost, including fixed inspection, variable inspection, scrap or repair, and warranty costs. The problem is often formulated as a dynamic programming problem. Other optimization methods used therein include nonlinear programming, genetic algorithms, and simulated annealing. Although the preceding inspection strategy potentially improves the quality of products delivered to customers, it does nothing to alter overall product quality because nothing has been done to improve the underlying process.

By contrast, root cause diagnosis aims to draw inferences regarding process variation sources that are correlated to product measurements. For this reason, a diagnosis-oriented strategy focuses on choosing the features that could lead to certain optimal conditions (e.g., maximum separation) for identifying variation sources. This involves defining certain criteria to characterize the distinction between variation sources and then employing an optimization routine to enhance the chosen criteria. Prior research in this area has reportedly been conducted mainly at the single-machine level rather than at the system level, i.e., the variation sources and locations of sensors are limited to a single manufacturing station; papers include Khan et al. [4], Khan et al. [5], Wang and Nagarkar [6], Khan and Ceglarek [7], and Djurdjanovic and Ni [8,9]. For instance, Wang and Nagarkar [6] used a D-optimal criterion for sensor placement on a single coordinate checking station and employed Powell’s direct search [10] to find the optimal solution.
Research on sensor distribution for multistation systems, which considers the effectiveness of variation diagnosis, is very limited. Khan et al. [4] and Khan and Ceglarek [7] studied: (1) end-of-line sensing, in which the sensing station is located at the end of a manufacturing system but variation sources include those from upstream stations; and (2) distributed sensing, in which sensing stations can be located in preselected yet arbitrary places in a manufacturing system. Their approach optimized sensor layout by maximizing the minimum distances between any pair of variation patterns, which were obtained using the variation model of a single set of fixtures. Owing to the fact that their diagnostic procedure assumes the occurrence of a single variation source at a time, their sensor placement strategy only ensures that each single variation source is optimally distinguished from the others. Another limitation is that their strategy is based on the specific manner in which the variation patterns are defined and constructed. Their results may no longer be optimal if a different type of variation pattern is defined and used. This chapter will provide a comprehensive solution to the sensor placement and distribution problem in a multistation manufacturing process, based on analysis and optimization on the state space variation model developed in Part I, which integrates the sensor deployment information with the process configuration information. Two different types of design criteria, diagnosability and detection sensitivity, are presented, and their relationship is discussed. The algorithms targeting sensor optimization on individual stations and in a multistation process are introduced. We devise various means of improving the algorithm’s efficiency and effectiveness.
12.2 DESIGN CRITERIA FOR SENSOR PLACEMENT
12.2.1 DIAGNOSABILITY INDEX AS DESIGN CRITERION
In Chapter 9, we defined the concept of diagnosability (Equation 9.9 and Equation 9.10) and the diagnosability condition (Theorem 9.1) for identifying the mean and variance components. Diagnosability characterizes whether or not the sensor system provides sufficient information to ensure that the mean or variance components of variation sources can be separated. This means that if a variation source is diagnosable, no matter how small a change it undergoes, we can theoretically find an algorithm to estimate it, provided we have enough samples. If a sensor system cannot ensure diagnosability, no matter how large a change a variation source undergoes, we cannot uniquely pinpoint which variation source has changed. Let $G$ denote the testing matrix for diagnosability, where $G$ could be either $\Gamma^T \Gamma$ or $H$, as defined in Theorem 9.1, depending on whether the mean diagnosability or the variance diagnosability is relevant. We further define a generic diagnosability index as:

\[ \mu = \frac{\rho(G)}{P} \tag{12.1} \]

where $\rho(\cdot)$ is the rank of a matrix and $P$ is the number of variation sources. The diagnosability index here characterizes the information quantity aspect, as discussed in Chapter 9. This diagnosability index is a normalized quantity in [0, 1], with $\mu = 1$ being equivalent to a full-rank $G$. As such, we say that a sensor system provides complete diagnosability if and only if $\mu = 1$. When the diagnosability index is used as a design criterion for sensor system design, the objective would be to achieve the desired diagnosability level at a minimum cost.
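As a minimal sketch of Equation 12.1, the index can be evaluated numerically from the rank of the testing matrix. The function name `diagnosability_index` and the small example matrix are hypothetical and chosen only for illustration; the example uses $G = \Gamma^T \Gamma$ (the mean-diagnosability case) and assumes NumPy is available.

```python
import numpy as np

def diagnosability_index(G: np.ndarray, num_sources: int) -> float:
    """Return mu = rank(G) / P for a testing matrix G and P variation sources."""
    return np.linalg.matrix_rank(G) / num_sources

# Hypothetical 3-measurement, 3-source Gamma (illustration only):
# the third column is the sum of the first two, so the sources
# cannot all be separated.
Gamma = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 1.0, 2.0]])

G_mean = Gamma.T @ Gamma  # testing matrix for mean diagnosability
mu = diagnosability_index(G_mean, num_sources=Gamma.shape[1])
print(f"diagnosability index: {mu:.3f}")  # below 1, i.e., not fully diagnosable
```

Note that `matrix_rank` relies on a numerical tolerance, so for a nearly rank-deficient $G$ the computed index may depend on that tolerance.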
12.2.2 SENSITIVITY INDEX AS DESIGN CRITERION
The diagnosability index does not make any distinction among diagnosable systems, even though some sensor systems may have superior performance compared to others, making it easier to detect a small change in the variation sources. This difference in detection capability is characterized by the concept of sensitivity, which may be interpreted as follows: a sensor system that has zero sensitivity to any one of the variation sources provides no diagnosability, whereas a sensor system with a nonzero sensitivity to all variation sources possesses a certain level of diagnosability. It is desirable that a sensor system not only have full diagnosability, but also be sensitive to the underlying changes of variation sources. Equation 9.4 offers a mixed linear model, which connects the process variation inputs ($u$, $w$, $v$) to the sensor measurements $y$. In this chapter, we focus on the process fault inputs $u$ and assume that the effect of the background noise $\Psi \cdot w$ is negligible. This simplifies our technical treatment. The resulting concepts and the optimization algorithms can also be easily extended to a more complicated model that includes $\Psi \cdot w$. Under this simplification, the mixed-effect model becomes:

\[ y = \Gamma \cdot \mu + \Gamma \cdot u + v \tag{12.2} \]
Based on this model, we have:

\[ m_y = \Gamma \cdot \mu \tag{12.3} \]

and

\[ \operatorname{vec}(\Sigma_y) = \pi(\Gamma) \cdot \theta + \sigma_v^2 \cdot \operatorname{vec}(I_m) \tag{12.4} \]

where $m_y$ and $\Sigma_y$ are the mean vector and covariance matrix of $y$, respectively, and $\pi(\cdot)$ is a matrix transform defined as

\[ \pi(\Gamma) = \begin{bmatrix} (\gamma_1 * \gamma_1)^T & \cdots & (\gamma_1 * \gamma_m)^T & \cdots & (\gamma_m * \gamma_1)^T & \cdots & (\gamma_m * \gamma_m)^T \end{bmatrix}^T \tag{12.5} \]

where $*$ represents the Hadamard product and $\gamma_j$ ($j = 1, 2, \ldots, m$) is the $j$th row vector of $\Gamma$. The other notations follow their meaning in Equation 9.7 and Equation 9.8, except that $\theta$ is adjusted to be $[\sigma_1^2 \;\; \sigma_2^2 \;\; \cdots \;\; \sigma_P^2]^T$, namely, the variance components of random elements in $u$. Note that we will assume in this chapter that knowledge
about sensor noise variance, $\sigma_v^2$, is available from the sensor vendor’s calibration and specification. Following the same spirit in defining diagnosability as in Equation 9.9 and Equation 9.10, the sensitivity for detecting changes in mean and variance components can be defined as the ratio of the change in the mean or variance of $y$ over a perturbation of the mean or variance of the input sources. We can define the detecting sensitivity of mean and variance components as follows:

Definition 12.1: Given measurement $y$, the mean-detecting sensitivity, denoted as $S_m$, is defined as:

\[ S_m \equiv \min_{\delta\mu \neq 0} \frac{(\delta m_y)^T (\delta m_y)}{(\delta\mu)^T (\delta\mu)} \tag{12.6} \]

and the variance-detecting sensitivity, denoted as $S_v$, is defined as:

\[ S_v \equiv \min_{\delta\theta \neq 0} \frac{\operatorname{tr}(\delta\tilde{\Sigma}_y^T \cdot \delta\tilde{\Sigma}_y)}{(\delta\theta)^T (\delta\theta)} \tag{12.7} \]

where $\tilde{\Sigma}_y$ is the covariance matrix contributed from the process variation sources, i.e., $\operatorname{vec}(\tilde{\Sigma}_y) = \pi(\Gamma) \cdot \theta$.

Given the linear relation in Equation 12.2 and utilizing the eigenvalue property of a symmetric matrix [11, p.105], we can express the preceding sensitivity indices in terms of eigenvalues as follows (the proof is fairly straightforward and is thus omitted):

\[ S_m = \lambda_{\min}(\Gamma^T \Gamma) \quad \text{and} \quad S_v = \lambda_{\min}(\pi(\Gamma)^T \pi(\Gamma)) \tag{12.8} \]

where $\lambda_{\min}(\cdot)$ denotes the smallest eigenvalue of a matrix. In deriving $S_v$, the relation $\operatorname{tr}(\delta\tilde{\Sigma}_y^T \cdot \delta\tilde{\Sigma}_y) = \operatorname{vec}(\delta\tilde{\Sigma}_y)^T \cdot \operatorname{vec}(\delta\tilde{\Sigma}_y)$ is used.
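The following sketch illustrates Equation 12.5 and Equation 12.8 numerically. It builds $\pi(\Gamma)$ by stacking the Hadamard products of all ordered pairs of rows of $\Gamma$ and then takes the smallest eigenvalues. The helper names (`pi_transform`, `mean_sensitivity`, `variance_sensitivity`) and the example values are hypothetical; the code assumes $\Gamma$ is stored as an $m \times P$ NumPy array.

```python
import numpy as np

def pi_transform(Gamma: np.ndarray) -> np.ndarray:
    """Stack the Hadamard products of all ordered row pairs of Gamma (Eq. 12.5).

    Row (i*m + j) of the result is gamma_i * gamma_j, so the output is (m*m) x P.
    """
    m, P = Gamma.shape
    return (Gamma[:, None, :] * Gamma[None, :, :]).reshape(m * m, P)

def mean_sensitivity(Gamma: np.ndarray) -> float:
    """S_m: smallest eigenvalue of Gamma^T Gamma (Eq. 12.8)."""
    return float(np.linalg.eigvalsh(Gamma.T @ Gamma)[0])  # eigvalsh sorts ascending

def variance_sensitivity(Gamma: np.ndarray) -> float:
    """S_v: smallest eigenvalue of pi(Gamma)^T pi(Gamma) (Eq. 12.8)."""
    Pi = pi_transform(Gamma)
    return float(np.linalg.eigvalsh(Pi.T @ Pi)[0])

# Hypothetical example: 3 measurements, 2 variation sources.
Gamma = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [1.0, 1.0]])
theta = np.array([0.3, 0.7])  # assumed variance components

# Consistency check: vec(Gamma diag(theta) Gamma^T) equals pi(Gamma) theta.
Sigma_tilde = Gamma @ np.diag(theta) @ Gamma.T
print(np.allclose(pi_transform(Gamma) @ theta, Sigma_tilde.flatten(order="F")))
print(mean_sensitivity(Gamma), variance_sensitivity(Gamma))
```

Because $\delta m_y = \Gamma \, \delta\mu$, the ratio in Equation 12.6 is a Rayleigh quotient of $\Gamma^T \Gamma$, which is why its minimum is the smallest eigenvalue; the same argument applies to Equation 12.7 with $\pi(\Gamma)$ in place of $\Gamma$.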
Remark 12.1: The squared summation of elements in input or output vectors is used in the preceding definition so that we can have a scalar sensitivity index that is easy to interpret. The squared summations are equivalent to the squared Euclidean norm of the corresponding vector or matrix, i.e., $\operatorname{tr}(\delta\tilde{\Sigma}_y^T \cdot \delta\tilde{\Sigma}_y)$ is the squared Euclidean norm of matrix $\delta\tilde{\Sigma}_y$.

Remark 12.2: In the variance sensitivity definition, we use $\tilde{\Sigma}_y$ rather than $\Sigma_y$ because the sensor noise variance $\sigma_v^2$ is assumed to be known.

Remark 12.3: Without the minimum, the ratios in Equation 12.6 and Equation 12.7 are input dependent. Using input-dependent indices, we will have to design a sensor system for individual changes of input variation sources, which would be inconvenient. The minimum operator defines the sensitivity indices to be the smallest ratios
given all possible combinations of input changes. Equation 12.8 shows that the previously defined sensitivity indices are actually input independent; they are solely determined by matrix $\Gamma$.

Remark 12.4: The earlier definition is also consistent with the intuitive relation between sensitivity and diagnosability that we mentioned before the definition. The diagnosability conditions obtained in Theorem 9.1 and Equation 9.12 are: (1) the mean components are diagnosable if $\Gamma^T \Gamma$ is of full rank and (2) the variance components are diagnosable if the matrix $\{(\gamma_i^T \gamma_j)^2\}_{i,j=1}^{P}$ is of full rank, where $\{\cdot\}_{i,j=1}^{P}$ is a $P \times P$ matrix (in this condition and in Lemma 12.1, $\gamma_i$ denotes the $i$th column vector of $\Gamma$). It can be shown that $\{(\gamma_i^T \gamma_j)^2\}_{i,j=1}^{P} = \pi(\Gamma)^T \pi(\Gamma)$; the proof is included in Lemma 12.1. It is then apparent that full diagnosability is guaranteed if the corresponding sensitivity index is nonzero, and that a system with zero sensitivity is equivalent to one that is not fully diagnosable.

Lemma 12.1: $\{(\gamma_i^T \gamma_j)^2\}_{i,j=1}^{P} = \pi(\Gamma)^T \pi(\Gamma)$

Proof: Recall that $(\gamma_i^T \gamma_j)^2 = \operatorname{tr}(\gamma_i \gamma_i^T \gamma_j \gamma_j^T)$ and $\operatorname{tr}(AB) = \operatorname{vec}(A)^T \operatorname{vec}(B)$ for any symmetric matrices $A$ and $B$. The $(i,j)$th element in $\{\operatorname{tr}(\gamma_i \gamma_i^T \gamma_j \gamma_j^T)\}_{i,j=1}^{P}$ is $(\operatorname{vec}(\gamma_i \gamma_i^T))^T \operatorname{vec}(\gamma_j \gamma_j^T)$. Actually, $\operatorname{vec}(\gamma_i \gamma_i^T)$ is the $i$th column vector in $\pi(\Gamma)$. That leads to the conclusion that $\pi(\Gamma)^T \pi(\Gamma) = \{(\gamma_i^T \gamma_j)^2\}_{i,j=1}^{P}$. Q.E.D.

The mean and variance component sensitivity indices $S_m$ and $S_v$ are also related to the estimation variance of the mean and variance components. For Equation 12.2, if the variance components in $\theta$ are estimated using a maximum likelihood estimator (MLE), the variance-covariance matrix of $\hat{\theta}$ is approximated by the inverse of the Fisher Information Matrix as

\[ \operatorname{cov}(\hat{\theta}) \propto \left[ \left\{ \operatorname{tr}\left( \Sigma_y^{-1} (\gamma_i \gamma_i^T) \Sigma_y^{-1} (\gamma_j \gamma_j^T) \right) \right\}_{i,j=1}^{P} \right]^{-1} \tag{12.9} \]

A constant related to sample size is omitted from the right-hand side in Equation 12.9; thus, we used “∝” instead of “=”. This expression suggests that $\operatorname{cov}(\hat{\theta})$ depends on the true value of $\theta$ because $\operatorname{vec}(\Sigma_y) = \pi(\Gamma) \cdot \theta + \sigma_v^2 \cdot \operatorname{vec}(I_m)$. Under a normal process condition when there is no outstanding variation source, we can assume that $\theta = 0$ and then $\Sigma_y = \sigma_v^2 I_m$. Then Equation 12.9 becomes

\[ \operatorname{cov}(\hat{\theta}) \big|_{\theta = 0} \propto \sigma_v^4 \cdot \left[ \left\{ \operatorname{tr}(\gamma_i \gamma_i^T \gamma_j \gamma_j^T) \right\}_{i,j=1}^{P} \right]^{-1} = \sigma_v^4 \cdot \left[ \left\{ (\gamma_i^T \gamma_j)^2 \right\}_{i,j=1}^{P} \right]^{-1} = \sigma_v^4 \cdot \left[ \pi(\Gamma)^T \pi(\Gamma) \right]^{-1} \tag{12.10} \]

Thus, the variance of the linear parametric function $f^T \hat{\theta}$ under a normal process condition is:

\[ \operatorname{cov}(f^T \hat{\theta}) \big|_{\theta = 0} \propto \sigma_v^4 \cdot f^T \left[ \pi(\Gamma)^T \pi(\Gamma) \right]^{-1} f \tag{12.11} \]
Then, up to the omitted constant, the maximum variance of $f^T \hat{\theta}$ over all unit vectors $f$ is determined by the maximum eigenvalue of $[\pi(\Gamma)^T \pi(\Gamma)]^{-1}$. Because this maximum eigenvalue is $1/S_v$, where $S_v$ is the smallest eigenvalue of $\pi(\Gamma)^T \pi(\Gamma)$, the index $S_v$ determines the maximum variance of $f^T \hat{\theta}$, $\forall \, \|f\| = 1$, under a normal process condition. The criterion to maximize $S_v$ is then equivalent to selecting a $\Gamma$ to minimize the maximum variance of $f^T \hat{\theta}$ under a normal process condition. Similarly, it is not difficult to show that maximizing $S_m$ is equivalent to minimizing the maximum variance of the linear parametric function $p^T \hat{\mu}$, $\forall \, \|p\| = 1$, under a normal process condition.

As $S_m$ and $S_v$ are different functions of $\Gamma$, a sensor system design may end up with different results depending on which one of the objectives is chosen, either achieving the maximum mean detection sensitivity or the maximum variance detection sensitivity. One can define certain weighted criteria, for instance, $c_1 S_m^2 + c_2 S_v$, as the objective function, where constants $c_1$ and $c_2$ determine the trade-off between mean and variance detection sensitivities. However, further investigation found an inequality relationship between $S_m$ and $S_v$: $S_m^2$ is a lower bound for $S_v$ for the same $\Gamma$. The result is stated in Lemma 12.2.

Lemma 12.2: For the same $\Gamma$, $S_v \geq (S_m)^2$.

Proof: We know that $\{\gamma_i^T \gamma_j\}_{i,j=1}^{P} = \Gamma^T \Gamma$. Then, $\{(\gamma_i^T \gamma_j)^2\}_{i,j=1}^{P}$ is actually $(\Gamma^T \Gamma) * (\Gamma^T \Gamma)$. From the previous proof, we know that $\{(\gamma_i^T \gamma_j)^2\}_{i,j=1}^{P} = \pi(\Gamma)^T \pi(\Gamma)$, which means that $\pi(\Gamma)^T \pi(\Gamma) = (\Gamma^T \Gamma) * (\Gamma^T \Gamma)$. Theorem 7.28 in Schott [14, p. 276] states that $\lambda_{\min}(A * B) \geq \lambda_{\min}(AB)$ for any nonnegative definite matrices $A$ and $B$. The matrix $\Gamma^T \Gamma$ is a nonnegative definite matrix, so that we have $\lambda_{\min}((\Gamma^T \Gamma) * (\Gamma^T \Gamma)) \geq \lambda_{\min}((\Gamma^T \Gamma)(\Gamma^T \Gamma))$. As $\lambda_{\min}((\Gamma^T \Gamma)(\Gamma^T \Gamma)) = \lambda_{\min}^2(\Gamma^T \Gamma)$, this inequality is equivalent to $\lambda_{\min}(\pi(\Gamma)^T \pi(\Gamma)) \geq \lambda_{\min}^2(\Gamma^T \Gamma)$. Q.E.D.
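Lemma 12.1 and Lemma 12.2 can be spot-checked numerically. The sketch below draws a random $\Gamma$ (a purely hypothetical test matrix, not data from the chapter), forms $\pi(\Gamma)$ from the Hadamard products of its row pairs, and compares $\pi(\Gamma)^T \pi(\Gamma)$ with the Hadamard square of $\Gamma^T \Gamma$. A numerical check of this kind illustrates the lemmas but is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
Gamma = rng.standard_normal((6, 4))  # hypothetical: 6 measurements, 4 sources
m, P = Gamma.shape

# pi(Gamma): Hadamard products of all ordered row pairs, stacked as (m*m) x P.
Pi = (Gamma[:, None, :] * Gamma[None, :, :]).reshape(m * m, P)

gram = Gamma.T @ Gamma   # {gamma_i^T gamma_j}, with gamma_i the columns of Gamma
hadamard_sq = gram * gram  # elementwise square, i.e., (Gamma^T Gamma) * (Gamma^T Gamma)

# Lemma 12.1: pi(Gamma)^T pi(Gamma) equals the Hadamard square of Gamma^T Gamma.
lemma_12_1_holds = bool(np.allclose(Pi.T @ Pi, hadamard_sq))

# Lemma 12.2: S_v >= S_m^2 (a small tolerance absorbs round-off).
S_m = np.linalg.eigvalsh(gram)[0]
S_v = np.linalg.eigvalsh(Pi.T @ Pi)[0]
lemma_12_2_holds = bool(S_v >= S_m ** 2 - 1e-12)

print(lemma_12_1_holds, lemma_12_2_holds)
```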
Based on Lemma 12.2, we choose to use $S_m$, i.e., $\lambda_{\min}(\Gamma^T \Gamma)$, as the unified criterion for optimal sensor placement to simplify the design process. In other words, we optimize $S_m$, while regulating $S_v$. Nonetheless, the optimization routines presented in the next section should be equally applicable to the maximization of $S_v$ or other combinations using $S_m$ and $S_v$. A final note on this sensitivity index is that it is the same as the E-optimality in optimal experimental design that was initially proposed by Ehrenfeld [12], when the $\Gamma$ matrix is considered as the mathematical equivalent of $X$ in a regression model such as $y = X\beta + \varepsilon$.
12.3 SINGLE-STATION SENSOR PLACEMENT
12.3.1 OPTIMIZATION FORMULATION
With the unified sensitivity index $S_m = \lambda_{\min}(\Gamma^T \Gamma)$, the problem of optimal sensor placement will be formulated as follows: The design parameters are the number and locations of sensors, denoted by $\phi(s) \equiv [X_1 \; Z_1 \; \cdots \; X_s \; Z_s]^T$. Moreover, certain constraints should be satisfied. One constraint is that a sensor location has to be a point on the product (geometrical constraint), represented by $G(\cdot) \geq 0$, where $G(\cdot)$ represents the appropriate geometry function of a manufactured product. For an OCMM to perform parallel measurements, we also apply a second constraint to avoid any possible optical interference among laser beams when taking measurements. Our engineering knowledge indicates that parallel measurements could be taken if we keep the sensor locations at least 100 mm apart from each other. For a given number of sensors, we try to find the optimal sensor locations that maximize $S_m$, namely:

\[ \max_{\phi(s)} \; S_m \equiv \lambda_{\min}(\Gamma^T \Gamma) \quad \text{subject to } G(\phi(s)) \geq 0 \text{ and the 100-mm separation rule} \tag{12.12} \]

The optimization in Equation 12.12 does not determine the number of sensors. As an increase in sensor number will generally result in a larger $S_m$, researchers usually try to determine the appropriate sensor number by trading off between the benefit gained from an increase in $S_m$ and the cost for more sensors. However, in engineering practice, it is not easy to quantify the monetary savings associated with an increase in $S_m$. It is thus difficult to define an accurate cost function to attain this trade-off. Alternatively, we can specify a lower bound for $S_m$. Then the second optimization formulation is to minimize the sensor number and also satisfy a lower-bound constraint on $S_m$, in addition to other constraints previously specified, i.e.,

\[ \min \; s \quad \text{subject to } S_m \geq c, \; G(\phi(s)) \geq 0, \text{ and the 100-mm separation rule} \tag{12.13} \]

where $c$ is the lower bound, arrived at based on engineering requirements. In the next subsection, we will mainly study the optimization in Equation 12.12, which is equivalent to the “exact” design problem in optimal experimental designs. Optimization Equation 12.13 can be solved using the resulting exact design algorithm with a gradually increasing sensor number. In Subsection 12.3.5, we will briefly discuss other considerations in solving Equation 12.13, as well as how to select constant $c$.
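The two formulations can be sketched in code as follows. The helper names (`separation_ok`, `smallest_sensor_count`) and the callable `solve_exact_design` are hypothetical; the latter stands in for any routine that solves Equation 12.12 for a fixed sensor number $s$ and returns a design together with its $S_m$. The geometrical constraint $G(\cdot) \geq 0$ is assumed here to be handled by restricting candidate locations to points on the part.

```python
from itertools import combinations
import numpy as np

def separation_ok(locations, min_dist=100.0):
    """True if every pair of sensor locations (X, Z), in mm, is at least min_dist apart."""
    return all(np.linalg.norm(np.subtract(a, b)) >= min_dist
               for a, b in combinations(locations, 2))

def smallest_sensor_count(solve_exact_design, c, s_max):
    """Eq. 12.13 via Eq. 12.12: increase s until the best S_m reaches the bound c.

    solve_exact_design(s) is a hypothetical solver of the exact design
    problem for s sensors; it must return (design, S_m).
    Returns (s, design), or None if no s <= s_max satisfies S_m >= c.
    """
    for s in range(1, s_max + 1):
        design, S_m = solve_exact_design(s)
        if S_m >= c:
            return s, design
    return None

# Usage with a stub solver whose S_m grows linearly in s (illustration only).
stub_solver = lambda s: (list(range(s)), 0.5 * s)
print(smallest_sensor_count(stub_solver, c=1.2, s_max=10))
print(separation_ok([(0, 0), (100, 0), (0, 150)]))
```

Scanning $s$ upward relies on the observation in the text that a larger sensor number generally yields a larger $S_m$, so the first $s$ that meets the bound is taken as the minimum.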
12.3.2 EXCHANGE ALGORITHMS FROM OPTIMAL EXPERIMENTAL DESIGN
The optimization problems formulated in Subsection 12.3.1 are nonlinear in the design parameters ϕ. Standard nonlinear programming approaches (such as quadratic programming) are usually based on a derivative calculation and are easily trapped in a local optimum. The derivative-based approaches will be ineffective for a nonconvex design space, imposed by the geometry of the panels involved, not to mention those design spaces in the assembly that are not simply connected; an example is the rear quarter panel, where the window-opening area is not a candidate area for sensor placement. In the research of optimal experimental design, exchange algorithms were developed for optimizing the aforementioned design criteria, such as D-, A-, and E-optimality; refer to Cook and Nachtsheim [13] and Atkinson and Donev [14] for reviews and comparisons of exchange algorithms. According to Meyer and Nachtsheim [15], an exchange algorithm has more freedom to maneuver on a complicated design space because each of its exchanges involves only a part of the design parameters
(associated with one design point). Exchange algorithms could then be more effective in escaping from a local optimum than derivative-based nonlinear programming. Additionally, exchange algorithms have other advantages when applied to engineering system designs: their procedures are intuitive and implementation is easy, the algorithms are flexible and can easily handle complicated constraints in engineering system design, and they can also be used for a wide variety of design criteria. To use exchange algorithms, we discretize the continuous design space first. The resulting discretized design space with Nc candidate sensor locations is called the candidate space (denoted as Dc) and the space with s current sensor locations is known as the sensor space (denoted as Ds). The basic idea of an exchange algorithm is to start with a set of s design points (i.e., the sensor location) in Ds, usually randomly selected, and exchange the current design points with those points in the much larger Dc to improve the chosen design criteria. In exchange algorithms, however, the action of exchange is not carried out for every single point. Every point in Ds will be tested against a point in Dc, meaning that the improvements in design criterion are recorded, supposing that the point in Ds has been exchanged with a point in Dc. There are different variants to this basic idea, depending on how often the action of exchange is actually carried out. One option is to perform the exchange action after all points in Ds have been tested against the entire set of points in Dc. It exchanges the pair of points, one in Ds and one in Dc, which results in the maximum improvement in design criteria. This option is actually the celebrated Fedorov exchange algorithm. Another option is to perform the exchange action for every point in Ds after that design has been tested against all points in Dc. 
In other words, point i in Ds will be exchanged with a point in Dc that maximizes the improvement in design criteria and the same action is repeated in a sequential order for i = 1, 2, …, s. The second option is the modified Fedorov exchange. In combinatorial optimization, these are two extreme cases of a general k-exchange algorithm, with k = s for the Fedorov exchange and k = 1 for the modified Fedorov exchange [16]. When applying them to the sensor placement problem, we notice that exchange algorithms, especially the Fedorov algorithm, could consume a great deal of CPU time for cases involving a large number of sensors. This is not surprising, because the exchange algorithm was initially developed for experimental design with a relatively small number of factors and experiments [13]. In the following subsection, we will introduce and implement a sort-and-cut procedure that will shorten computation time without sacrificing much optimal value.
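To make the difference between the two variants concrete, the following sketch implements one loop of each on a discretized candidate space. It makes a simplifying assumption that is not from the text: each candidate location contributes exactly one row of $\Gamma$, stored as a row of the hypothetical array `C`, so that a design is a list of row indices. The separation and geometry constraints are omitted for brevity.

```python
import numpy as np

def s_m(C, design):
    """S_m of the design: smallest eigenvalue of Gamma^T Gamma for the selected rows."""
    G = C[list(design)]
    return float(np.linalg.eigvalsh(G.T @ G)[0])

def fedorov_step(C, design):
    """Test every (design point, candidate) pair, then carry out only the best swap."""
    design = list(design)
    best_val, best = s_m(C, design), design
    for pos in range(len(design)):
        for j in range(len(C)):
            if j in design:
                continue
            trial = design[:pos] + [j] + design[pos + 1:]
            val = s_m(C, trial)
            if val > best_val:
                best_val, best = val, trial
    return best, best_val

def modified_fedorov_step(C, design):
    """For each design point in turn, swap it with the candidate that improves S_m most."""
    design = list(design)
    for pos in range(len(design)):
        best_val, best_j = s_m(C, design), design[pos]
        for j in range(len(C)):
            if j in design:
                continue
            val = s_m(C, design[:pos] + [j] + design[pos + 1:])
            if val > best_val:
                best_val, best_j = val, j
        design[pos] = best_j  # unchanged if no candidate improves S_m
    return design, s_m(C, design)

# Hypothetical candidate space: 40 locations, 3 variation sources.
C = np.random.default_rng(1).standard_normal((40, 3))
start = [0, 1, 2, 3]
print(s_m(C, start), fedorov_step(C, start)[1], modified_fedorov_step(C, start)[1])
```

Each call evaluates on the order of $s \cdot N_c$ trial designs, which is consistent with the remark above that the run time grows quickly with the number of sensors and candidates.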
12.3.3 FAST EXCHANGE ALGORITHM WITH SORT-AND-CUT
Let us first conceptually understand the factors affecting the algorithm’s computation time. Define the process to pass over the entire Dc set once as a “loop.” There are two major factors affecting the run time: the average number of loops and the size of the candidate space Nc. To reduce the computation time, we will have to reduce the average number of loops as well as the size of the candidate design space Nc. The following sort-and-cut procedure is employed to achieve both goals. The basic idea is to perform multiple exchanges in each loop to reduce the average number of loops and, after each loop, discard a subset of the candidate design points to reduce Nc.
When designing a uniform coverage design in molecule selection, Lam et al. [17] suggested that instead of exchanging one design point per loop it may be preferable to exchange multiple candidate points in the upper tail of the distribution of improvements in design criteria among all the candidates. In this way, more than one design point is exchanged during each loop, so that the average number of loops required to replace all random initial designs can be reduced. Unlike the modified Fedorov algorithm, which uses a design point in Ds as the pivoting point in each exchange, Lam et al. [17] used a candidate point in Dc as the pivoting point. Specifically, for each point in Dc, add this point to Ds and augment the s-sensor design Ds to an (s + 1)-sensor design. For Ds to remain an s-sensor design, we then move one sensor, the one whose removal causes the smallest decrease in the sensitivity index, from the augmented Ds back to Dc. This process constitutes an exchange action. Following the algorithm in Lam et al. [17], we should record the improvement in design criteria that a candidate location can make if the corresponding exchange is indeed carried out. After each exchange action, denote the improvement in the Sm criterion by ∆, i.e., ∆ ≡ Sm(new) − Sm(old). Record all ∆j’s (j = 1, …, Nc) when we loop through the Nc candidate locations. Sort the values of ∆j’s in descending order as ∆(1) ≥ ∆(2) ≥ …, and so on. The distribution of improvements is approximated by the sorted values ∆(j). Then, an integer number q is set so that the first q candidate locations in the upper tail of ∆(j) will be exchanged in each loop. On the other hand, we can also reduce the total number of candidate points Nc. The sorted values of design improvement ∆(j) actually provide us with valuable information about the potential of a candidate location. Those candidate locations with a low ∆ value are less likely to be picked up by the exchange algorithm in subsequent iterations.
Thus, we can remove some candidate points after each iteration. Let ϕ denote the fraction of candidate points that will be retained after a cut. To implement this sort-and-cut procedure, two parameters must be determined: q and ϕ. In our sensor system design problem, the number of sensors in Ds is usually only a small percentage of the number of candidate locations in Dc (for instance, 10 sensors in Ds but 10,000 candidate locations in Dc). We recommend an aggressive choice of ϕ, e.g., from 10 to 20%. (For the preceding example, if ϕ = 10% for the first two iterations, the remaining locations in Dc are still about 10 times the number in Ds.)
For a sort-and-cut procedure to work, the assumption is that the distribution of improvements approximated by data from the previous exchange routine can sufficiently represent the distribution in the next exchange. However, whenever an exchange occurs, the distribution cannot be exactly the same, because the Ds space generating that distribution is no longer the same. The common ground is constituted by the design points that were not exchanged in the past iteration. Thus, for the preceding assumption to hold, q should be smaller than s, i.e., 1 ≤ q < s. When q is close to s, however, almost all sensors in Ds will be exchanged in one iteration, and the distribution recorded in ∆(j) from the previous loop does not truly represent the distribution for the new Ds space. A subsequent exchange based on ∆(j) could be a poor choice that has to be repeated in the following loops. On the other hand, a very small q will result in too few exchanges per loop and thus miss our
original goal of having multiple exchanges to reduce the average number of loops. We therefore recommend selecting q = s/2 to strike a balance, so that half of the sensors in Ds will not be exchanged, providing a common ground for the distribution, and half of the sensors will be exchanged to reduce the average loop number.
This is where we differ from Lam et al. [17]. Instead of using q as the direct control on the number of exchanges, Lam et al. [17] set ∆(q) as the threshold to control the exchange, i.e., if there is an improvement greater than ∆(q), then the exchange is carried out. We find that using ∆(q) in our application is not effective. The difference is due to our inclusion of a subsequent cut action, which is not included in the procedure used by Lam et al. [17]. The effect of the cut action, as explained in the preceding paragraph, requires us to have more direct control over the number of exchanges, which cannot be fulfilled by using ∆(q).
The algorithm for an s-sensor exact design is summarized as follows:
Step 1: The candidate design space Dc is discretized, and s locations are randomly selected to form Ds. Perform exchanges for all locations in Dc and establish the initial distribution for ∆.
Step 2: In every iteration:
1. Rank the sensor locations in Dc in descending order according to their ∆j values.
2. Cut off those locations with low ∆j values, and retain the top ϕ × 100% of candidate points in Dc.
3. Exchange the sensor locations in Dc that satisfy the constraint condition with the sensor locations in the current Ds space in sequential order, starting from the one with the largest ∆j. Repeat this exchange for the first q sensors in the sorted Dc.
Step 3: Repeat Step 2 until the improvement in the design criterion for two successive designs is smaller than a predetermined threshold (we used 0.1%).
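The steps above can be sketched in Python as follows. This is only one possible reading of the procedure, not the authors' Matlab code. The function names, the `n_cuts` parameter (how many loops apply the cut), and the random test matrix are hypothetical. The criterion used here, the smallest eigenvalue of an information matrix, is a stand-in for the sensitivity index Sm, whose definition lies outside this passage.

```python
import numpy as np

def criterion(X, design):
    """Stand-in criterion (NOT the book's S_m): smallest eigenvalue of X_d^T X_d."""
    Xd = X[sorted(design)]
    return float(np.linalg.eigvalsh(Xd.T @ Xd)[0])

def exchange_action(X, design, cand):
    """Add `cand` to the design, then drop the point whose removal decreases
    the criterion least. Returns (new design, new criterion value)."""
    augmented = design | {cand}
    best = max((augmented - {p} for p in sorted(augmented)),
               key=lambda d: criterion(X, d))
    return best, criterion(X, best)

def fast_exchange(X, s, phi=0.1, q=None, tol=1e-3, n_cuts=1, seed=0, max_loops=100):
    rng = np.random.default_rng(seed)
    q = max(1, s // 2) if q is None else q           # q = s/2 as recommended
    design = set(rng.choice(X.shape[0], size=s, replace=False).tolist())
    pool = set(range(X.shape[0])) - design           # candidate space D_c
    history = [criterion(X, design)]
    for loop in range(max_loops):
        # Record Delta_j for every candidate still in the pool, then sort.
        delta = {j: exchange_action(X, design, j)[1] - history[-1]
                 for j in sorted(pool)}
        ranked = sorted(delta, key=delta.get, reverse=True)
        if loop < n_cuts:                            # cut: keep the top phi fraction
            keep = max(q, int(np.ceil(phi * len(ranked))))
            ranked = ranked[:keep]
            pool = set(ranked)
        # Exchange the first q candidates in order of decreasing Delta_j.
        for j in ranked[:q]:
            new_design, _ = exchange_action(X, design, j)
            pool |= design - new_design              # displaced sensor returns to D_c
            pool -= new_design
            design = new_design
        history.append(criterion(X, design))
        if history[-1] - history[-2] <= tol * abs(history[-2]):
            break                                    # relative improvement below tol
    return design, history

# Hypothetical usage: 200 candidate locations, each contributing one row.
X = np.random.default_rng(1).normal(size=(200, 3))
design, history = fast_exchange(X, s=6)
print(sorted(design), history[0], history[-1])
```

Because an exchange action can always return the original design (by dropping the candidate just added), the recorded criterion never decreases from one loop to the next in this sketch.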
12.3.4 COMPARISON AMONG ALTERNATIVE ALGORITHMS
Algorithms described in the earlier sections are coded in Matlab and compared on the same computer. We measure algorithm efficiency by the time that it takes to find the optimal value. We measure algorithm effectiveness by the average value of the optimal solutions (i.e., average Sm) it finds when a group of random sensor layouts is used as the initial design. In the literature of algorithm comparison [13], a relative effectiveness R was often used, which is defined as the ratio of the average optimal value over the best optimal solution found by all the algorithms in the comparison under the same setting. We use both measures in this study.
We implemented the algorithm to determine the sensor placement on the SUV side panel assembly in Figure 6.10d of Chapter 6. In this subsection, we only consider the final piece of the full assembly supported by a set of 3-2-1 fixtures. We discretize it with candidate points 10 mm apart. Our engineering experience indicates that this resolution of discretization is sufficient to generate a fine enough grid on a panel several thousand millimeters in size. The discretization results in a total of Nc = 13,304 candidate positions in Dc. For the sort-and-cut procedure, we choose ϕ = 0.1 and q = s/2. In this application, we only implement the cut action in the first iteration. We performed a numerical study to compare these algorithms for different choices of s, starting with s = 2 because at least two sensors are needed to measure all variation sources associated with the two fixture locators (P1 and P8) that support the eventual side panel assembly. In the comparison, 50 trials with randomly generated initial designs are performed; the comparison results are included in Table 12.1. For the sake of brevity, only the results with an even s value are displayed, whereas the understanding can be generally extended to an odd s.
TABLE 12.1 Comparisons of the Resulting Algorithms
s | Algorithm | Average Computer Time (sec) | Average Maximal Sm | R
2 | Fedorov | 27.24 | 1.0044 | 0.9995
2 | Modified Fedorov | 19.12 | 0.9975 | 0.9926
2 | Fast exchange | 2.47 | 0.9300 | 0.9255
4 | Fedorov | 98.12 | 2.0105 | 0.9980
4 | Modified Fedorov | 50.61 | 2.0088 | 0.9972
4 | Fast exchange | 3.24 | 1.9794 | 0.9826
6 | Fedorov | 106.50 | 3.0145 | 0.9979
6 | Modified Fedorov | 63.50 | 3.0150 | 0.9981
6 | Fast exchange | 4.21 | 2.9715 | 0.9837
8 | Fedorov | 187.95 | 4.0177 | 0.9972
8 | Modified Fedorov | 87.55 | 4.0188 | 0.9975
8 | Fast exchange | 4.29 | 3.9913 | 0.9907
10 | Fedorov | 376.80 | 5.0214 | 0.9978
10 | Modified Fedorov | 142.04 | 5.0207 | 0.9976
10 | Fast exchange | 5.11 | 4.9965 | 0.9928
We observed the following:
1. The sort-and-cut procedure considerably improves the algorithm efficiency. The reduction in computation time as compared to the Fedorov exchange algorithm ranges from 90% (for s = 2) to 98% (for s = 10). In fact, the reduction is more significant when the sensor number is relatively large (e.g., s = 8 or 10), which is desirable for engineering system design problems. We also apply this fast exchange algorithm to large sensor numbers such as s = 20, 30, 40, 50, and 60. The computation time vs. the sensor number is shown in Figure 12.2, where the value indicated below each mark is the average computation time. From this figure, it is apparent that the average time required to find an optimal 60-sensor design using this fast exchange algorithm is only half of what the Fedorov algorithm needs for a 4-sensor design.
FIGURE 12.2 The computation time vs. the number of sensors. (Plot of average time in seconds against sensor number s, with s ranging up to 60; the data labels, in increasing order, are 3.0, 3.3, 4.0, 4.7, 5.1, 8.4, 13.0, 26.1, 35.9, and 41.9 seconds.)
2. We also noticed that the R value increases as s increases. For s = 2, R = 0.9255, which is noticeably lower than what the Fedorov and modified Fedorov algorithms found. This can be explained by the same reasoning we used to choose the q value in the previous section. When s = 2, even if only one sensor has been exchanged, the previously recorded ∆j can hardly represent the distribution for the next loop because it is based on a single (usually randomly selected) sensor. When the sensor number increases, this problem is alleviated. In fact, for s = 4 and upward, R is close to that of the other two algorithms, meaning that the fast exchange algorithm does not sacrifice much of the optimal value. For s = 2, because it is a small-scale problem similar to those in experimental designs, the Fedorov algorithm can be used directly.
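The two observations can be checked directly against the figures in Table 12.1. The short Python sketch below recomputes the time reduction of the fast exchange algorithm relative to the Fedorov algorithm, and backs out the best solution implied by each (average Sm, R) pair for s = 2. The variable names are illustrative; the numbers are those reported in the table.

```python
# Average computer time (sec) from Table 12.1, keyed by sensor number s.
fedorov_time = {2: 27.24, 4: 98.12, 6: 106.50, 8: 187.95, 10: 376.80}
fast_time = {2: 2.47, 4: 3.24, 6: 4.21, 8: 4.29, 10: 5.11}

# Relative reduction in computation time of fast exchange vs. Fedorov.
reduction = {s: 1.0 - fast_time[s] / fedorov_time[s] for s in fedorov_time}
for s, r in reduction.items():
    print(f"s = {s:2d}: time reduction = {100 * r:.1f}%")

# Average maximal Sm and relative effectiveness R for s = 2 (Table 12.1).
avg_sm = {"Fedorov": 1.0044, "Modified Fedorov": 0.9975, "Fast exchange": 0.9300}
r_value = {"Fedorov": 0.9995, "Modified Fedorov": 0.9926, "Fast exchange": 0.9255}

# R = (average optimum) / (best optimum found), so average / R recovers
# the best optimum; all three algorithms should imply the same value.
implied_best = {name: avg_sm[name] / r_value[name] for name in avg_sm}
print(implied_best)
```

The computed reductions are about 90.9% for s = 2 and 98.6% for s = 10, in line with the range quoted in the first observation, and the three implied best values agree to within rounding.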
12.3.5 RESULTS OF OPTIMAL SENSOR LAYOUT ON A SINGLE STATION
The resulting optimal sensor layouts with an even sensor number (s = 2, …, 10) are shown in Figure 12.3, where a "*" mark indicates a sensor location. From the layouts, we observe that the sensors are located in the area close to the panel boundary, and many of them are actually on the edge. This raises the question of whether we can reduce our candidate locations by limiting the search to the geometrical boundary of each part in the first place. The answer is yes. However, we should also notice that not all the sensor locations are on the edge (refer to the cases s = 8 and s = 10). Based on empirical knowledge alone, it is nontrivial to determine a search area that contains all the potentially good sensor locations. In this study, we used the approximated distribution of design improvements in the sort-and-cut procedure, which provides more reliable information and a quantitative evaluation for finding the area of potentially good sensor locations. The algorithm is fairly general and can be used together with a reduced candidate pool to further improve its efficiency if the aforementioned intuitive rule is implemented before the search.
FIGURE 12.3 Optimal sensor layouts on a single station.
The preceding exact design algorithm finds an optimal sensor layout when the sensor number is specified. To solve the optimization in Equation 12.13, it may be possible to use a sequential routine, i.e., start from one randomly generated point, and then sequentially add to Ds the sensor from Dc that maximizes the resulting Sm, until eventually Sm ≥ c. For such a sequential strategy to work well, the optimal layout for (s – 1) sensors should be a subset of the optimal layout for s sensors. This might be true when the sensor number is small (s < 4), but it does not hold when the sensor number grows larger. Although we did not show the sensor layouts for odd sensor numbers, they agree with the phenomenon demonstrated by the layouts displayed in Figure 12.3. Therefore, the sequential routine could miss the optimal layout. Nevertheless, we can combine sequential probing and the exact design. That is, first use the sequential routine to probe and find a sensor number that can yield Sm ≥ c, and then switch to an exact design routine to find the optimal sensor locations for sensor numbers around the one found by the sequential routine. We can thus skip a number of time-consuming exact designs, especially when the resulting sensor number is relatively large.
In the optimization of Equation 12.13, we specify a constant c to stop the algorithm. Usually, the choice of c depends on engineering requirements and is specified under a particular context. In this study, we can choose c based on the accuracy requirement. It is known that the OCMM, although more agile and faster, is not as accurate as the mechanical CMM: the OCMM measurement repeatability is five to ten times lower than that of the CMM [18]. Let us be optimistic and consider that an OCMM is five times less accurate than a CMM, namely, $\sigma^2_{v,\mathrm{OCMM}} = 5\,\sigma^2_{v,\mathrm{CMM}}$. According to the arguments after Lemma 12.1, under a normal process condition, the maximum variance in estimating process mean components and the lower bound for the maximum variance in estimating process variance components is $\sigma^2_{v,\mathrm{OCMM}} / S_m$. To achieve the same variance level as when a CMM is used to directly measure the process variation source, we hence require that $\sigma^2_{v,\mathrm{OCMM}} / S_m < \sigma^2_{v,\mathrm{CMM}}$, which translates into Sm > 5. So we will choose c = 5 in this study. This value may change when the accuracy requirement is different, but the earlier logic can still be applied in determining an appropriate c. When choosing c = 5, we find that 10 sensors will provide a sensing capability as efficient as a CMM.
One may wonder what happens if we use Sv instead of Sm as our design criterion. Examples using Sv are shown in Figure 12.3f to Figure 12.3h for s = 2, 6, and 10. Interestingly, the sensor layouts using Sv bear a strong similarity to those using Sm, especially in terms of the areas where the sensors will be located. Of course, the resulting layouts using Sv deviate to some extent from those using Sm; the deviation is less obvious for s = 2 but grows as s increases.
12.4 MULTIPLE-STATION SENSOR DISTRIBUTION
12.4.1 OPTIMIZATION FORMULATION
In the previous subsection, the problem of sensor placement at a single station was discussed. Basically, it answered the questions related to sensor location and sensor number. Regarding sensor distribution in a multistation process, one additional question would be: where in a system should we build sensing stations? In this subsection, we provide a backward propagation algorithm to determine the locations of the sensing stations. Specifically, we will study a sensor distribution strategy for achieving a fully diagnosable system and, for simplicity, we will only discuss the variance component diagnosability, i.e., let G = H when µ is defined in Equation 12.1. This approach can be extended to the mean component diagnosability following fairly straightforward steps. A diagnosability index will be used that can specify the location of sensing stations and the minimal number of sensors on each station. However, it cannot uniquely specify the location of sensors within each station. There could be multiple solutions. To find a unique sensor location, a sensitivity index may be used, which is an ongoing study yet to be completed.
The cost of a sensor system lies not only in the sensors but also in the expense of building sensing stations. So the optimization formulation for a fully diagnosable system (i.e., µ = 1) is:
$$J_{\mathrm{opt}} = \min \; c_1 \cdot \sum_{k=1}^{N} (\#\text{ of sensors at station } k) + c_2 \cdot (\#\text{ of sensing stations}) \qquad (12.14)$$
subject to µ = 1 and G(·) ≥ 0, where c1 and c2 are the average cost per sensor and cost per sensing station, respectively.
In this subsection, we will decompose the systemwide diagnosability in a multistation process into two steps (Figure 12.4): (1) the transmission of variation from station i to station k, with the transmitted information modeled by the state covariance matrix $\Sigma_{x_k}$; and (2) the detection of fixture variation by sensors located at station k, with the overall information modeled by the measurement covariance matrix $\Sigma_{y_k}$. Information transformation in these two steps is characterized by the transmissibility ratio λi|k and the detecting power τk (on station k), respectively. The optimal sensor distribution is studied through: (1) achieving the optimal detecting power on a single station and (2) identifying stations at which error information is not completely transmitted (i.e., λi|k < 100%).
FIGURE 12.4 Variation transmission and detection. (Diagram labels: fixture variation ui, Bi, Transmission Φk,i, Transmitted information xk, λi|k, Detection Ck, Overall information detected yk, diagnosability.)
12.4.2 VARIATION TRANSMISSIBILITY RATIO
When the transmission of variation is studied, we assume that a sufficient number of sensors are installed at station k. We will further discuss the meaning of "a sufficient number of sensors" in Subsection 12.4.3. For the time being, let us assume that it means Ck = I.
Variation transmission is determined by process configuration, such as fixture layout geometry (modeled by Bi) and the change in fixture layouts between stations (modeled by Φk,i). The 3-2-1 fixture shown in Figure 6.3 can restrain the degrees of freedom (DOF) of a rigid workpiece (a workpiece could be a single part or a multipart subassembly), where DOF = 3 for a 2-D workpiece and DOF = 6 for a 3-D workpiece. Suppose that there are ni 3-2-1 fixtures on station i and each of them supports one rigid workpiece. The total number of d.o.f. that these fixtures restrain is ni ⋅ DOF = pi = dimension(ui), which is the number of independent variation sources associated with the ni fixtures. Thus, pi is the number of unknown variance components of fixture variation that we try to diagnose.
Given Ck = I, ρ(π(Φk,i Bi)) represents the number of independent equations that we have in solving the pi unknown variance components on station k. If ρ(π(Φk,i Bi)) < pi, not all variance components of fixture variation on station i can be exactly solved. In that case, some information regarding fixture variation on station i is lost during the transmission step. We define a transmissibility ratio λi|k to quantify variation transmission from station i to station k as:
$$\lambda_{i|k} \equiv \frac{\rho(\pi(\Phi_{k,i} B_i))}{p_i} \qquad (12.15)$$
where λi|k = 1 suggests that complete information regarding fixture variation has been transmitted from station i to station k. If loss of information occurs during the transmission step, (1 – λi|k) is used to quantify the information loss. Any information loss during the transmission step suggests that fixture variations at station i are not fully diagnosable regardless of the number of sensors placed on station k. Furthermore, we have:
Lemma 12.3: A transmissibility ratio possesses the following properties:
(P1) λi|i = 1.
(P2) $\lambda_{i|j} = \lambda_{i|k} \ge \dfrac{n_i - 1}{n_i}$, for k > j > i.
The proofs of both properties can be found in Ding et al. [19]. The first property is intuitive because it implies that if we measure all the dimensional information of a workpiece (let Ci = I), the variation of the fixture that is currently used to support the workpiece can be uniquely determined. The second property seems counterintuitive. It implies that the variation transmissibility from station i to station j (j > i) is the same as that from station i to a station k located further downstream, k > j > i. This is an important property describing transmission of fixture variation in a multistation assembly process, under the condition that all measurement points on a product or part can be measured at any station if needed.
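As a rough numerical illustration of Equation 12.15, the sketch below computes a transmissibility ratio for small matrices. Several things here are assumptions: ρ(·) is read as the matrix rank, and the π-transform (Equation 12.5, not reproduced in this passage) is taken to stack the elementwise products of every pair of rows, which matches the row and column counts described in Subsection 12.4.3. The matrices are hypothetical toy inputs, not the Φk,i Bi of any process in the book.

```python
import itertools
import numpy as np

def pi_transform(gamma):
    """Assumed form of the pi-transform: one row per pair (i <= j) of rows of
    gamma, holding their elementwise product. An M x P input gives an
    M(M+1)/2 x P output."""
    gamma = np.asarray(gamma, dtype=float)
    pairs = itertools.combinations_with_replacement(range(gamma.shape[0]), 2)
    return np.array([gamma[i] * gamma[j] for i, j in pairs])

def transmissibility_ratio(phi_b):
    """lambda = rank(pi(Phi B)) / p, where p is the number of columns of Phi B
    (the number of unknown variance components)."""
    phi_b = np.asarray(phi_b, dtype=float)
    return np.linalg.matrix_rank(pi_transform(phi_b)) / phi_b.shape[1]

# Hypothetical case 1: every variation source reaches the state separately.
lossless = np.eye(3)

# Hypothetical case 2: sources 1 and 2 affect the downstream state identically
# (equal columns), so their variances cannot be separated after transmission.
lossy = np.array([[1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 1.0, 1.0]])

print(transmissibility_ratio(lossless))
print(transmissibility_ratio(lossy))
```

In the second case only two of the three variance components can be resolved, so one third of the information is lost in transmission, whatever sensors are placed downstream.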
If a sufficient number of sensors are installed on station N (to measure the final product), i.e., all the transmitted information modeled in $\Sigma_{x_k}$ can be detected, we can express the diagnosability index µ as:
$$\mu = \frac{\sum_{i=1}^{N} \lambda_{i|N} \cdot n_i \cdot \mathrm{DOF}}{\sum_{i=1}^{N} n_i \cdot \mathrm{DOF}} \qquad (12.16)$$
The transmissibility ratio λi|k provides some insight into how fixture variation propagates in a multistation process. The index λi|k is solely determined by the fixture design configuration, and can thus be calculated after the process is designed but before the sensor positions are allocated. The values of λi|k are not modifiable after the process design phase is completed. However, we can utilize properties (P1) and (P2) to decide where to place sensors to retrieve information lost during the transmission step. From Equation 12.16, it is obvious that µ will be 1 if all λi|N's are 1, i.e., fixture variations on all upstream stations are diagnosable by taking measurements on station N. In such a case, we only need to install sensors on the last station N. In many cases, not all λi|N's are equal to 1. If λi|N < 1, the strategy of increasing transmissibility by installing sensors on any station between i + 1 and N – 1 will not help because λi|k = λi|N < 1 for k = i + 1, …, N – 1, according to (P2) in Lemma 12.3. The only viable solution is to add sensors directly on station i because λi|i = 1 (according to (P1) in Lemma 12.3). The same procedure is repeated for all stations with λi|N < 1. The preceding intuition will be formalized into a sensor distribution strategy in Subsection 12.4.4.
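Equation 12.16 and the reasoning above are simple enough to evaluate by hand, but a small Python sketch makes the bookkeeping explicit. The function names and the three-station numbers below are hypothetical and serve only as an illustration.

```python
def diagnosability_index(lams, n_fixtures, dof):
    """Equation 12.16: mu = sum(lambda_i|N * n_i * DOF) / sum(n_i * DOF),
    assuming enough sensors on the last station N."""
    transmitted = sum(lam * n * dof for lam, n in zip(lams, n_fixtures))
    total = sum(n * dof for n in n_fixtures)
    return transmitted / total

def stations_losing_information(lams, tol=1e-9):
    """1-based indices of stations with lambda_i|N < 1; by (P1) and (P2),
    these are the stations that need their own sensors."""
    return [i for i, lam in enumerate(lams, start=1) if lam < 1.0 - tol]

# Hypothetical three-station line, 2-D parts (DOF = 3).
lams = [0.5, 1.0, 1.0]        # lambda_1|3, lambda_2|3, lambda_3|3
n_fixtures = [2, 2, 1]        # n_1, n_2, n_3

mu = diagnosability_index(lams, n_fixtures, dof=3)
print(mu, stations_losing_information(lams))
```

With these illustrative inputs, 12 of the 15 variance components are recoverable from station 3 alone, and station 1 is flagged as the place where additional sensors are needed.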
12.4.3 DETECTION POWER ON AN INDIVIDUAL STATION
Suppose that sensors are installed on station N with the resultant diagnosability of µ. The total number of variance components to be diagnosed from station 1 to N is $P = \sum_{i=1}^{N} p_i$. Then the quantity $\mu \cdot \sum_{i=1}^{N} p_i$ is considered to be the amount of information retrieved by sensors on station N, and $\sum_{i=1}^{N} \lambda_{i|N} \cdot p_i$ represents the amount of information on fixture variation transmitted from upstream stations. If Ck is assumed to be I, then $\mu \cdot \sum_{i=1}^{N} p_i$ always equals $\sum_{i=1}^{N} \lambda_{i|N} \cdot p_i$. However, given an arbitrary number and layout of sensors, we may have the inequality $\mu \cdot \sum_{i=1}^{N} p_i < \sum_{i=1}^{N} \lambda_{i|N} \cdot p_i$. We can then define a detectability ratio τ as
$$\tau \equiv \frac{\mu \cdot \left( \sum_{i=1}^{N} p_i \right)}{\sum_{i=1}^{N} \lambda_{i|N} \cdot p_i} \qquad (12.17)$$
Sensor placement on an individual station is considered as having a sufficient number of sensors if the variation detectability τ = 1. The minimum sufficient sensor number is studied for two cases: deviation detection and variation detection.
1. Minimum sufficient number of sensors in detecting product positional deviations: In an assembly process, coordinate sensors are used to measure the positional deviation of a rigid workpiece. One coordinate sensor can measure 3 d.o.f. on a 3-D workpiece or 2 d.o.f. on a 2-D workpiece. In order to measure all d.o.f. of a workpiece, three independent sensors are required for a 3-D case (or two sensors for a 2-D case). If all d.o.f. of each part in an assembly are measured, it can be concluded that the positional deviations of the product in all directions have been obtained. Remember that there are np parts in the assembly on station N. Then, we need 3np (or 2np for 2-D) sensors, with three (or two) sensors on every part, to detect positional deviations of all parts. Once positional deviations of all parts are detected, we can calculate $\Sigma_{x_N}$ from the deviational measurements. The converse is not true, though. That is, even if we know $\Sigma_{x_N}$, we cannot reconstruct the deviational measurements. Thus, the condition for detecting the deviations of all parts is a sufficient condition for the variation detection required in Equation 12.17. In the case of variation detection, it is possible to reduce the number of sensors while still reaching τ = 1.
2. Minimum sufficient number of sensors in detecting product positional variations: The reasoning behind a possible sensor-number reduction lies in the application of the π-transform in Equation 12.5. Suppose that we have a deviation relationship represented as y = Γu. The d.o.f. of parts is the same as the dimension of u if all parts are completely restrained. In order to solve for fixture deviations, the dimension of y has to be at least the same as that of u, and ΓᵀΓ should be of full rank. When fixture variations of u are considered, the model becomes vec(Σy) = π(Γ) · diag(Σu), according to Equation 12.4. As fixture deviations in u are physically independent, Σu is a diagonal matrix. The diag(Σu), which contains all variance components of fixture variation, is of the same size as u. This suggests that the number of unknowns is not changed from that of the deviation model, but the number of known quantities in vec(Σy) increases compared to the number of elements in y. If y is of size M × 1, vec(Σy) is then of size M(M + 1)/2 × 1. These additional terms in vec(Σy) are the covariances between the variables in y. The π-transform takes this change into account so that π(Γ) increases the number of rows but keeps the number of columns the same as that of Γ. As the covariance terms in the variation model provide more known quantities, the required number of sensors can be reduced. Let P be the dimension of u. Regarding the deviation model, the realization of diagnosability requires that M ≥ P and ΓᵀΓ be of full rank. But for the variation model, it requires that M(M + 1)/2 ≥ P and π(Γ)ᵀπ(Γ) be of full rank.
We have explained why the number of sensors can be reduced for variation diagnosis. However, an important question remains: how many sensors are necessary to reach τ = 1? It becomes clear to us that the minimum number of sensors satisfying τ = 1 depends on how the sensors are allocated on different parts. For instance,
given a subassembly consisting of two parts, we can place six sensors on one of the two parts, but the information they provide will be the same as that provided by three sensors placed on the same part. A more efficient method is to place three sensors on each of the two parts.
The sensor placement on an individual station is modeled by Ck. In general, it is very difficult to show analytically how the sensor placement will affect the rank of π(Γi), where Γi = Ck Φk,i Bi (refer to the information chain in Figure 12.4). Therefore, we conduct a numerical study to produce certain practical rules to follow. In this study, for the sake of simplicity, we consider a two-station assembly process that consists of all assembly operations (such as part positioning, joining, and transferring) so that it captures the interactions in Ck Φk,i Bi. Sensors will be placed on the second station in this two-station assembly segment. The numerical test results are summarized in Table 12.2, where τ(s1, …, sj) is the detectability when s1, …, sj sensors are placed on part 1, part 2, …, part j, respectively.
TABLE 12.2 Numerical Analysis of Sensor Placement on an Individual Station
Number of Parts | Number of Sensors | Sensor Placements and τ
np = 2 | 1 | τ(0,1) = τ(1,0) = 0.43
np = 2 | 2 | τ(0,2) = τ(2,0) = 0.43; τ(1,1) = 1
np = 2 | 3 | τ(0,3) = τ(3,0) = 0.43; τ(1,2) = τ(2,1) = 1
np = 2 | 4 | τ(0,4) = τ(4,0) = 0.43; τ(1,3) = τ(3,1) = 1; τ(2,2) = 1
np = 3 | 1 | τ(1,0,0) = τ(0,1,0) = τ(0,0,1) = 0.25
np = 3 | 2 | τ(2,0,0) = τ(0,0,2) = 0.25; τ(0,2,0) = 0.33; τ(1,0,1) = 0.58; τ(1,1,0) = 0.67; τ(0,1,1) = 0.75
np = 3 | 3 | τ(3,0,0) = τ(0,0,3) = 0.25; τ(0,3,0) = 0.33; τ(2,0,1) = τ(1,0,2) = 0.58; τ(1,2,0) = τ(2,1,0) = 0.67; τ(0,1,2) = τ(0,2,1) = 0.75; τ(1,1,1) = 1
np = 3 | 4 | τ(4,0,0) = τ(0,0,4) = 0.25; τ(0,4,0) = 0.33; τ(3,0,1) = τ(1,0,3) = τ(2,0,2) = 0.58; τ(1,3,0) = τ(3,1,0) = τ(2,2,0) = 0.67; τ(0,1,3) = τ(0,3,1) = τ(0,2,2) = 0.75; τ(1,1,2) = τ(2,1,1) = τ(1,2,1) = 1
np = 3 | 5 | Only for τ = 1: τ(1,1,3) = τ(3,1,1) = τ(1,3,1) = τ(2,2,1) = τ(2,1,2) = τ(1,2,2) = 1
np = 3 | 6 | Only for τ = 1: τ(1,1,4) = τ(4,1,1) = τ(1,4,1) = τ(3,2,1) = τ(2,3,1) = τ(3,1,2) = τ(2,1,3) = τ(1,3,2) = τ(1,2,3) = τ(2,2,2) = 1
np = 4 | 4 | τ(4,0,0,0) = τ(0,0,0,4) = 0.20; τ(0,4,0,0) = τ(0,0,4,0) = 0.27; τ(1,0,0,3) = τ(3,0,0,1) = 0.47; τ(1,3,0,0) = τ(3,1,0,0) = τ(1,0,3,0) = τ(3,0,1,0) = 0.53; τ(0,1,3,0) = τ(0,3,1,0) = τ(0,0,1,3) = τ(0,0,3,1) = τ(0,1,0,3) = τ(0,3,0,1) = 0.6; τ(1,1,2,0) = τ(1,1,0,2) = τ(1,0,1,2) = τ(1,0,2,1) = τ(1,2,1,0) = τ(1,2,0,1) = τ(0,1,1,2) = τ(0,1,2,1) = τ(0,2,1,1) = τ(2,0,1,1) = τ(2,1,0,1) = τ(2,1,1,0) = 0.8; τ(1,1,1,1) = 1
np = 4 | — | Cases with the number of sensors = 5, 6, 7, 8 were omitted.
There are a limited number of options in placing these sensors. The maximum number of sensors is 3np (3-D) or 2np (2-D) for all product deviations to be made detectable. For instance, if a subassembly consists of two 2-D parts, the maximum number of sensors needed is four. The possible sensor placements constitute the following sets: (0,4), (4,0), (1,3), (3,1), or (2,2). For each sensor placement, we test detectability τ through a numerical calculation. Comparison among all the possible sensor placements will lead us to the minimum sufficient number of sensors and the associated scheme of sensor placement. It should be noticed that a small position change of sensor locations on the same part may not affect µ as defined in Equation 12.1. In this numerical test, the position of each sensor is determined by following a simplified procedure, which postulates that no two sensors can be located at the same position and that no three sensors can be collinear. We observe the following:
1. Given the same number of sensors, the detectability is larger if the sensors are placed on different parts in a rigid multipart subassembly than if they are placed on the same part.
2. To make τ = 1, at least one sensor should be placed on each part.
3. The minimum sufficient number of sensors is np, and the associated distribution is one sensor per part.
In Table 12.2, we list only the detectability values up to np = 4. This type of exercise can be continued for more parts with more sensors. Remember that the number of sensors can be reduced because of the extra information generated from the covariance terms between variables in y. The fact that the number of covariance terms is a quadratic function of the number of variables in y provides useful information when additional parts are involved. Thus, when more parts are involved, the above conclusions hold true. We summarize this in the following lemma:
Lemma 12.4: When each part in an assembly has the same number of degrees of freedom, sensors should be uniformly allocated among all parts within an individual station so that one sensor per part will make τ = 1.
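The counting argument behind this sensor reduction (M(M + 1)/2 ≥ P for the variation model, against M ≥ P for the deviation model) can be illustrated numerically. The sketch below is not the numerical study behind Table 12.2. It uses a hypothetical 3 × 6 matrix Γ and hypothetical variances, and it assumes that the π-transform stacks the elementwise products of row pairs, ordered like the upper triangle of Σy, since Equations 12.4 and 12.5 are not reproduced here.

```python
import itertools
import numpy as np

def pi_transform(gamma):
    """Assumed pi-transform: elementwise products of all row pairs (i <= j)."""
    pairs = itertools.combinations_with_replacement(range(gamma.shape[0]), 2)
    return np.array([gamma[i] * gamma[j] for i, j in pairs])

def vech(sym):
    """Upper-triangular entries of a symmetric matrix, row by row."""
    return sym[np.triu_indices(sym.shape[0])]

# Hypothetical model y = Gamma u with M = 3 measurements and P = 6 sources.
gamma = np.array([[1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]])
var_u = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical diag(Sigma_u)

sigma_y = gamma @ np.diag(var_u) @ gamma.T           # covariance of y
pi_gamma = pi_transform(gamma)                       # 6 x 6 here

# The variation model: vec(Sigma_y) = pi(Gamma) diag(Sigma_u).
assert np.allclose(vech(sigma_y), pi_gamma @ var_u)

# Deviations are not recoverable (rank 3 < 6), but variances are (rank 6).
print(np.linalg.matrix_rank(gamma), np.linalg.matrix_rank(pi_gamma))
recovered, *_ = np.linalg.lstsq(pi_gamma, vech(sigma_y), rcond=None)
print(recovered)
```

Three scalar measurements cannot determine six deviations, yet their three variances and three covariances supply six equations, which in this example are enough to recover all six variance components.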
12.4.4 OPTIMAL STRATEGY OF SENSOR DISTRIBUTION
Lemma 12.3 indicates that if a variation source can be diagnosed at the next station, it will be diagnosable at any subsequent station. On the other hand, if fixture variation is not diagnosable at the following station, sensors will have to be placed directly on the station where the fixture error occurs; such a station is indicated by λi|k < 100%. Meanwhile, Lemma 12.4 shows that one sensor should be allocated to each part in order to use the minimal number of sensors to detect all the transmitted variation in $\Sigma_{x_k}$. We use Table 12.3 to summarize the meaning of µ, λ, and τ, as well as the earlier understandings regarding variation transmissibility and sensor placement on individual stations. The algorithm of the following subsection, which distributes sensors in the multistation assembly process, is a natural outcome of the results from Subsection 12.4.2 and Subsection 12.4.3.
TABLE 12.3 Interpretation of Three Indices µ, λ, and τ
Symbol | Name | Interpretation | Impact
λi|k | Transmissibility | Benchmarks how much information of fixture variation is transmitted from station i to station k | Determine the locations of sensing stations
τ | Detectability | Benchmarks how much information that has been transmitted to station k can be detected by sensors installed on station k | Determine the minimum sufficient number of sensors and the corresponding sensor layout on station k
µ | Diagnosability | Characterizes the overall information retrieved by a sensor system, µ = f(λ, τ) | Combination of the previous impacts
12.4.4.1 Strategy of Sensor Distribution
Step 1: On station k = N (the last station), place one sensor on each part. If µ = 1, then stop; else, go to Step 2.
Step 2: Let k = k – 1. Given the sensors installed at all downstream stations k + 1, k + 2, …, N, check whether λk|N equals 1. If λk|N ≠ 1, install sensors on station k; the installation procedure follows the general rule of sensor placement on an individual station. If λk|N = 1, do not install any sensor on that station.
Step 3: Stop if µ = 1; otherwise, repeat Step 2 for k = N – 1, N – 2, …, 1.
Using the preceding sensor distribution strategy, we expect to see the change of diagnosability µ, as shown in Figure 12.5, when sensors are sequentially installed in a production line.
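[Figure 12.5: diagnosability µ (vertical axis, with a target line at 100%) plotted against the number of sensors (horizontal axis). µ rises while sensors are installed on station N, saturates for the given number of stations, and steps up each time a new sensing station k is added.]
FIGURE 12.5 Diagnosability property related to sensor distribution.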
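The station-selection part of the strategy above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `choose_sensing_stations` and its tolerance argument are hypothetical, the transmissibility ratios λk|N are taken as already computed, and the early-stopping test on µ and the choice of how many sensors to put on each selected station are left out.

```python
def choose_sensing_stations(lam_to_last, tol=1e-9):
    """Return the 1-based indices of stations that should carry sensors.

    lam_to_last[k-1] is the transmissibility ratio lambda_{k|N} from
    station k to the last station N (hypothetical input format).
    """
    n_stations = len(lam_to_last)
    stations = [n_stations]  # Step 1: the last station is always sensed
    # Steps 2-3: walk upstream; a station needs its own sensors only when
    # part of its fixture variation is lost before it reaches station N.
    for k in range(n_stations - 1, 0, -1):
        if abs(lam_to_last[k - 1] - 1.0) > tol:
            stations.append(k)
    return sorted(stations)


# Ratios of a three-station line in which only station 1 loses information.
print(choose_sensing_stations([0.667, 1.0, 1.0]))  # [1, 3]
```

The sketch only answers where to sense; the number of sensors on each selected station still has to follow the single-station placement rules.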
When sensors are placed on station N, the entire transmitted information regarding fixture variation is detected, and the system diagnosability increases rapidly with more sensors placed on station N. If λi|N ≠ 1 for some station i, the maximum µ that can be achieved with sensors placed only on station N is always smaller than 1, i.e., $\mu = \left(\sum_{i=1}^{N} \lambda_{i|N}\, p_i\right) / \left(\sum_{i=1}^{N} p_i\right) < 1$ (from Equation 12.16). This is illustrated in Figure 12.5 as a dotted flat line signifying that the diagnosability level is saturated. Thus, we should place some sensors at the upstream stations where λi|N ≠ 1. Because λi|i = 1, the installation of sensors directly on station i can help further increase the system diagnosability (step increases in diagnosability are seen in Figure 12.5). The diagnosability obtained by sensors on station N is $\lambda_{i|N} \cdot p_i$ (transmitted information), and the diagnosability obtained by sensors on station i is $(1 - \lambda_{i|N}) \cdot p_i$ (information loss during transmission to station N). However, given that $\lambda_{i|N} \ge (n_i - 1)/n_i$ (according to (P2) in Lemma 12.3), we have $\lambda_{i|N} > 1 - \lambda_{i|N}$ for $n_i > 2$, so the slope of the curve (to the right) is less steep than that of the curve prior to it. Again, the diagnosability will saturate at a higher level until sensors are installed on all stations where λi|N ≠ 1. Following this procedure, full diagnosability is achieved and µ = 1.
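As a small numerical illustration of the saturation level, the sketch below evaluates the ratio Σ λi|N pi / Σ pi and splits each station's contribution into the part seen at station N and the part that must be recovered locally. The function names are hypothetical, and the numbers are those of the three-station example in Subsection 12.4.5, with pi taken as ni · DOF as in that example's numerical expression.

```python
def saturated_mu(lam_to_last, p):
    """Diagnosability reached when sensors sit only on the last station."""
    transmitted = sum(lam * p_i for lam, p_i in zip(lam_to_last, p))
    return transmitted / sum(p)


def information_split(lam_to_last, p):
    """Per station: (information seen at station N, information lost on the way)."""
    return [(lam * p_i, (1.0 - lam) * p_i) for lam, p_i in zip(lam_to_last, p)]


lam = [2 / 3, 1.0, 1.0]  # lambda_{1|3}, lambda_{2|3}, lambda_{3|3}
p = [6, 9, 3]            # n_i * DOF with n = (2, 3, 1) and DOF = 3
mu = saturated_mu(lam, p)
print(round(mu, 3))      # 0.889
```

With these inputs only station 1 shows a nonzero loss term, which is why a sensor added on that station is what lifts µ from the saturated level to 1.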
12.4.5 EXAMPLE
The optimal strategy of sensor distribution is illustrated by optimizing a sensor system in a three-station assembly process, as used in Subsection 9.6.1 (Figure 9.4). We will use the same physical process configuration and parameters, except that we will start with no sensor in the process. We still consider a 2-D assembly so that DOF = 3 here. Based on the values of the A and B matrices in Equation 9.16 and Equation 9.17, the transmissibility ratios λi|N are calculated as: λ1|3 = 0.667; λ2|3 = 1; λ3|3 = 1. Then, the sensor distribution algorithm in Subsection 12.4.4.1 is invoked to determine the location of sensing stations and the minimal number of sensors on each station:
Step 1: On the last station k = N = 3, install one sensor on every part. The value of µ is calculated given different numbers of sensors. The value of µ keeps increasing until it saturates (Curve 1 in Figure 12.6) at the level of 0.889 for four sensors, one sensor per part. Further increase in the number of sensors on Station III does not increase the index µ (the dotted line of Curve 1 in Figure 12.6). The saturated value of µ can be computed by:
\[
\mu = \frac{\sum_{i=1}^{3} \lambda_{i|3} \cdot n_i \cdot \mathrm{DOF}}{\sum_{i=1}^{3} n_i \cdot \mathrm{DOF}} = \frac{0.667 \cdot 2 \cdot 3 + 1 \cdot 3 \cdot 3 + 1 \cdot 1 \cdot 3}{2 \cdot 3 + 3 \cdot 3 + 1 \cdot 3} = 0.889,
\]
according to Equation 12.16.
Step 2: On station k = N – 1 = 2, check λ2|3. As λ2|3 = 1, we need not place any sensor on station II. A numerical calculation conducted by the authors verifies that µ does not change even if additional sensors are placed on station II.
[Figure 12.6: µ (vertical axis, 0 to 1) plotted against the sensor number (horizontal axis, 1 to 8). Curve 1: when sensors are installed only on Station III, µ saturates once the sensor number reaches 4. Curve 2: a sensor installed on Station I, the new sensing station, further increases µ to 100%.]
FIGURE 12.6 The impact of sensor number on µ.
Step 3: k = N – 2 = 1. Check λ1|3 and minimize the number of sensors on station I to reach µ = 1. Because λ1|3 = 0.667 < 1, we should place sensors on station I. Provided that some sensors were already installed on the downstream station, sensors do not usually need to be installed on every part on station I. A combinatorial test trying different numbers of sensors is necessary to find the minimum number of sensors required to reach µ = 1. Given $I_k$ parts on station k, the maximal number of possible combinations is $\sum_{i=1}^{I_k} C_{I_k}^{i}$, where $C_a^b$ is the combinatorial operator for integers a and b. In this example, $\sum_{i=1}^{I_1} C_{I_1}^{i} = 3$, where $I_1 = 2$. The three possible sensor placements are (1,0), (0,1), and (1,1). In fact, adding one more sensor on station I, either on part 1 or part 2, will result in µ = 1. The optimal sensor distribution is to place a total of five sensors at two stations (marked as SP1–SP5 in Figure 12.7).

[Figure 12.7: three panels, (a) Station I, (b) Station II, and (c) Station III, drawn in the x–z plane with dimensions in mm. LPi denote locating points and SPi denote measurement points. The legend distinguishes active and inactive 4-way pinholes, active and inactive 2-way slots, and sensor locations; a 4-way locator has 2 associated variation sources and a 2-way locator has 1.]
FIGURE 12.7 The three-station assembly process with the optimal sensor distribution.

TABLE 12.4 Coordinates of Sensors in Figure 12.7 Using the Fast Exchange Algorithm (unit: mm)

Sensor Point   Station            Coordinates (x, z)
SP1            (on Station III)   (950, 900)
SP2            (on Station III)   (1630, 1100)
SP3            (on Station III)   (2280, 1000)
SP4            (on Station III)   (2280, 150)
SP5            (on Station I)     (1630, 1100)

In Figure 12.7a,
sensor SP5 is placed on part 2, but it can alternatively be placed on part 1. Where to place a sensor within individual parts can be determined simply based on some empirical guidelines, such as those used in Subsection 12.4.3, i.e., that no two sensors can be located at the same position and no positions of any three sensors can be collinear. Alternatively, we can employ the sensitivity index defined in Subsection 12.2.2 for sensors on the same part and then use the fast exchange algorithm to determine their locations on an individual part. The results of this procedure are shown in Table 12.4.

The previous distributed sensing layout can be compared with two traditional sensing layouts, end-of-line sensing and saturated sensing, which are discussed in Ding et al. [20]. The end-of-line sensing layout is defined as placing a sufficient number of sensors at the last station to measure the degrees of freedom (d.o.f.) of all parts. The saturated sensing layout involves placing a sufficient number of sensors to measure the d.o.f. of all parts on every station. In both cases, “a sufficient number of sensors” means two sensors per part (for a 2-D assembly process). Thus, in this example, the end-of-line sensing layout will install eight sensors on station III, and the saturated sensing layout needs 20 sensors, 2 sensors on each part on every station. The results of all three sensing layouts are presented in Table 12.5.
TABLE 12.5 Comparison of Sensor Distributions for the Three-Station Assembly

Layout                Number of Sensors   Number of Stations   µ       Jopt
Saturated sensing     20                  3                    100%    3c2 + 20c1
End-of-line sensing   8                   1                    88.9%   c2 + 8c1
Optimal strategy      5                   2                    100%    2c2 + 5c1
This shows that the optimal algorithm yields the minimum number of sensors and sensing stations while attaining 100% system diagnosability (µ = 1). The cost reductions obtained by the optimal sensing strategy relative to the saturated sensing scheme are:

Cost reduction in station construction = c2 / (3c2) = 33.3%

Cost reduction in sensor deployment = 15c1 / (20c1) = 75%
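The arithmetic behind these two percentages can be reproduced with a short sketch. The cost form J = (number of stations) · c2 + (number of sensors) · c1 is read off the Jopt column of Table 12.5, with c1 interpreted as the cost per sensor and c2 as the cost per sensing station; the function and variable names are hypothetical.

```python
from fractions import Fraction

# (number of sensors, number of sensing stations) as listed in Table 12.5
LAYOUTS = {
    "saturated": (20, 3),
    "end_of_line": (8, 1),
    "optimal": (5, 2),
}


def sensing_cost(n_sensors, n_stations, c1, c2):
    """Total cost J = n_stations * c2 + n_sensors * c1."""
    return n_stations * c2 + n_sensors * c1


def relative_reduction(baseline, candidate):
    """Fraction of the baseline quantity saved by the candidate."""
    return Fraction(baseline - candidate, baseline)


sat_sensors, sat_stations = LAYOUTS["saturated"]
opt_sensors, opt_stations = LAYOUTS["optimal"]
station_saving = relative_reduction(sat_stations, opt_stations)
sensor_saving = relative_reduction(sat_sensors, opt_sensors)
print(float(station_saving), float(sensor_saving))
```

The end-of-line layout is cheaper still in both counts, but Table 12.5 shows that it reaches only 88.9% diagnosability, so it is not a valid alternative when µ = 1 is required.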
12.5 SUMMARY
This chapter investigates the design problem of sensor placement and distribution in a multistation system. The design is performed on the basis of the state space variation model and of design criteria from the diagnosability and sensitivity study. Diagnosability provides a necessary condition for variation sources to be identifiable. Sensitivity indices are defined to characterize the sensor system's ability to detect changes in the underlying process mean and variance. Mathematically, the sensitivity indices are the same as the E-optimality criterion proposed in optimal experimental design. To solve the sensor placement problem, we devised a fast exchange routine with a sort-and-cut procedure, which considerably reduces the algorithm’s computation time while maintaining any optimal value it can find. To find the optimal distribution among multiple stations, a backward propagation algorithm is developed. When the unique properties of variation propagation in a multistation process are considered, the resulting strategy of sensor distribution is optimal, i.e., the sensing cost is minimized.

Dimensional variation reduction is critical to ensuring high product quality in discrete-part manufacturing. Effective use of sensory data in diagnosing variation sources depends to a great extent on the optimal design of a sensor system with multiple sensors. Optimal design of sensor systems makes the task of variation root cause diagnosis more meaningful and efficient. The criteria and methods presented in this chapter will find applications beyond coordinate sensor placement because the approach is based on a general linear system model.
12.6 EXERCISES
1. Derive Equation 12.4.
2. Prove Equation 12.8.
3. For a standard regression model $y = X\beta + \varepsilon$, where $\beta$ is an unknown but constant vector and $\varepsilon \sim N(0, \sigma_\varepsilon^2 I)$, establish an equivalence relationship between the sensitivity index using E-optimality and the estimation accuracy of $\hat{\beta}$.
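[Figure 12.8: a single-station fixturing system drawn in X–Y–Z axes, with points labeled P1 and P2 and points labeled M1, M2, and M3; the dimension labels are 100, 200, 600, 100, 700, and 1000.]
FIGURE 12.8 Single-station fixturing system.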
4. Consider the following single-station fixturing system. Use the fast exchange algorithm to determine the optimal locations for a three-sensor measurement system. Determine whether the resulting three sensors are in the same locations as those that are currently used (shown in Figure 12.8). In order to complete the problem, use the following:
   a. The optimization objective and constraints are the same as in Equation 12.12.
   b. Treat the part as rectangular.
   c. Discretize the part using a resolution of 10 mm.
   d. Choose ϕ = 0.1 and q = s/2, and only implement the cut action in the first iteration.
5. Repeat (4) by using the Sv criterion. Is your answer the same as the one found in (4)?
6. Briefly explain why (P1) in Lemma 12.3 is true. You may want to support your argument with a specific example.
7. Consider the assembly process of an SUV in Figure 6.10.
   a. Calculate the transmissibility ratios λi|k for the whole process.
   b. What conclusion can you draw in terms of sensor distribution from the transmissibility ratios you obtained in (a)?
   c. Following the rules established in Section 12.4 (from both the transmissibility ratio and the detection power), what is your suggestion for a sensor distribution that provides 100% diagnosability?
   d. Verify your answer in (c) by numerically calculating µ for the whole process after substituting the sensor locations in your MATLAB program for the state space model.

References
1. Tarabanis, K.A., Allen, P.K., and Tsai, R.Y., A survey of sensor planning in computer vision, IEEE Transactions on Robotics and Automation, 11, 86, 1995.
2. Apley, D.W. and Shi, J., A factor analysis method for diagnosing variability in multivariate manufacturing processes, Technometrics, 43(1), 84–95, 2001.
3. Mandroli, S.S., Shrivastava, A.K., and Ding, Y., A survey of inspection strategy and sensor distribution studies in discrete-part manufacturing processes, IIE Transactions, 38(4), 309–328, 2006.
4. Khan, A., Ceglarek, D., and Ni, J., Sensor location optimization for fault diagnosis in multi-fixture assembly systems, ASME Transactions, Journal of Manufacturing Science and Engineering, 120, 781, 1998.
5. Khan, A., Ceglarek, D., Shi, J., Ni, J., and Woo, T.C., Sensor optimization for fault diagnosis in single fixture systems: a methodology, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 109, 1999.
6. Wang, Y. and Nagarkar, S.R., Locator and sensor placement for automated coordinate checking fixtures, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 709, 1999.
7. Khan, A. and Ceglarek, D., Sensor optimization for fault diagnosis in multi-fixture assembly systems with distributed sensing, ASME Transactions, Journal of Manufacturing Science and Engineering, 122, 215, 2000.
8. Djurdjanovic, D. and Ni, J., Bayesian approach to measurement scheme analysis in multistation machining systems, Journal of Engineering Manufacture, 217, 1117, 2003.
9. Djurdjanovic, D. and Ni, J., Measurement scheme synthesis in multi-station machining systems, ASME Transactions, Journal of Manufacturing Science and Engineering, 126, 178, 2004.
10. Powell, M.J.D., A Direct Search Optimization Method that Models the Objective and Constraint Functions by Linear Interpolation, Numerical Analysis Reports, DAMTP 1992/NA5, The University of Cambridge, England, 1992.
11. Schott, J.R., Matrix Analysis for Statistics, John Wiley & Sons, New York, 1997.
12. Ehrenfeld, S., On the efficiency of experimental designs, Annals of Mathematical Statistics, 26, 247, 1955.
13. Cook, R.D. and Nachtsheim, C.J., A comparison of algorithms for constructing exact D-optimal designs, Technometrics, 22, 315, 1980.
14. Atkinson, A.C. and Donev, A.N., Optimum Experimental Designs, Oxford University Press, New York, 1992.
15. Meyer, R.K. and Nachtsheim, C.J., The coordinate-exchange algorithm for constructing exact optimal experimental designs, Technometrics, 37, 60, 1995.
16. Aarts, E. and Lenstra, J.K., Local Search in Combinatorial Optimization, John Wiley & Sons, New York, 1997.
17. Lam, R.L.H., Welch, W.J., and Young, S.S., Uniform coverage designs for molecule selection, Technometrics, 44, 99, 2002.
18. Hu, S.J., Impact of 100% Measurement Data on Statistical Process Control (SPC) in Automobile Body Assembly, Ph.D. dissertation, The University of Michigan, Ann Arbor, MI, 1990.
19. Ding, Y., Kim, P., Ceglarek, D., and Jin, J., Optimal sensor distribution for variation diagnosis for multi-station assembly processes, IEEE Transactions on Robotics and Automation, 19, 543, 2003.
20. Ding, Y., Shi, J., and Ceglarek, D., Diagnosability analysis of multi-station manufacturing processes, ASME Transactions, Journal of Dynamic Systems, Measurement, and Control, 124, 1, 2002.
21. Liu, Q., Ding, Y., and Chen, Y., Optimal coordinate sensor placements for estimating mean and variance components of variation sources, IIE Transactions, 37(9), 877–889, 2005.
13
Design Evaluation and Process Capability Analysis*
This chapter considers the problem of evaluating and benchmarking process design and process capability in a multistation assembly process. We will focus on the unique challenges brought by the multistation system, namely:
• The necessity of describing the system response to variation inputs at both the global (system level) and local (station level and single-fixture level) scales
• Process capability propagation for a multivariate, multistage system
The analysis and evaluation will be based on the state space model developed in Part II. Following the background material of Section 13.1, Section 13.2 will present a multilayer sensitivity-based design evaluation approach, and Section 13.3 is devoted to a multivariate process capability analysis. We will illustrate the analysis and evaluation methods in Section 13.4 by using the assembly process for an SUV side panel.
* Part of chapter material is based on Reference 22.

13.1 INTRODUCTION
Product quality is characterized by a group of features that can greatly affect design functionality and the level of customer satisfaction. In industry, this group of critical features is often labeled key product characteristics (KPC). In a production process, KPCs are controlled by a set of determining process factors to achieve the required specification; this set of process factors is labeled key control characteristics (KCC). For the purpose of dimension control in assembly processes, for example, fixture locators constitute such a set of KCCs.

Design evaluation and comparative analysis of different designs have been studied by a number of researchers with regard to various goals and objectives. As product variability is taken to be the indicator of process performance in this book, only the design evaluation work aimed at improving quality and reducing variation will be discussed here.

We first distinguish between the imprecise description of design parameters and the analysis of manufacturing variability. The imprecision of parameters, which is
a critical problem in the preliminary design stage, was modeled by Wood and Antonsson [1] and Antonsson and Otto [2] using fuzzy calculus. Their work seems similar to the analysis of manufacturing variability but is conceptually different. The methodology developed in their papers does not allow for the evaluation of production system performance, but rather allows for a better representation of imprecisely defined design parameters.

Process design evaluation with respect to the process response of manufacturing variability can generally be classified into two categories: (1) process capability analysis and (2) sensitivity analysis. Process capability analysis [3] is based on defined indices, often Cp and Cpk, for single or multiple design or quality characteristics. These indices are defined in terms of the statistics of production output to indicate the expected performance of a manufacturing process under the influence of variation or bias. Kazmer and Roser [4] defined a capability index (called “robustness index” in their paper) for multiple simultaneous quality characteristics. In general, a process capability index depends on variation input and is computed directly from the output of KPCs, i.e., a process capability index is input dependent.

On the other hand, sensitivity analysis usually defines and develops input-independent ratios. Sensitivity-based analysis has been performed in many situations for different applications. One of the characteristics of sensitivity-based design evaluation is the characterization of the product and process into key characteristics (KC), more specifically KPC and KCC, as mentioned earlier. The intuitive decomposition of products and processes into key characteristics was proposed in Ceglarek et al. [5].
They characterized a product with measurement locating points, which correspond to our concept of KPCs, and characterized a process with principal locating points and clamping locating points, corresponding to our KCCs. Thornton [6] proposed a mathematical framework for the KC process. A systematic KC flowdown was developed, and an effectiveness measure was defined. A complex production system was broken into several layers corresponding to product-KC, part-KC, and process-KC. The KC defined in her paper is closely related to the KPC and KCC used here. KPC is equivalent to product-KC or part-KC, and KCC is actually process-KC. Figure 13.1 illustrates the variation propagation from KCC to KPC. If the sensitivity analysis is performed to characterize the influence of variations
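[Figure 13.1: variation propagates from KCC (process-KC) through part-KPC (part-KC) to product-KPC (product-KC); the diagram marks which portion of this chain is covered by process-oriented analysis and which by product-oriented analysis.]
FIGURE 13.1 Process-oriented vs. product-oriented KC analysis.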
of part-KPC or product-KPC, the corresponding variation model is developed within the product. Hence, the technique is labeled as product-oriented. By contrast, if the relationship between KCC and KPC is included in the variation model and sensitivity analysis, the technique is labeled as process-oriented. This distinction is also shown in Figure 13.1. Although KCCs (process-KC) are included in the general framework of KC flowdown, only the product-oriented model and analysis is materialized for the door assembly case in Thornton [6]. Similar product-oriented sensitivity analysis (sometimes including optimization) was performed toward different design goals by Whitney et al. [7], Parkinson et al. [8], Parkinson [9], and Ceglarek and Shi [10] among others. Process-oriented sensitivity analysis is more difficult because the problem domain expands to include the process information and process or product interaction. Product variables (KPC) are less diversified, for instance, quantities included in an assembly are largely geometrical. Process variables (KCC) could cover diversified physical parameters in a process design. When a process-oriented technique is discussed, it can be further divided into a single-station approach (at the station level or machine level) and a multistation approach (at the system level). Most of the process-oriented work has been done at the single-station level. Thornton [11,12] included fixture elements as the KCC (process-KC) in the variation model and analysis of aircraft wing assembly. Similar station-level research work includes Cai et al. [13], Wang [14], and Soderberg and Carlson [15]. In a multistation process, the impact of KCC variation on KPC depends on both the workpiece and tool layout at every station and the station-to-station design configuration. In the subsequent sections, sensitivity analysis and process capability analysis will be performed specifically for a multistage process, respectively. 
In Section 13.2, a set of multilayer sensitivity indices will be defined to characterize both the macro and micro variation sensitivity in a process. In Section 13.3, we will define a multivariate process capability index for each stage in a multistage process, utilizing the most recent developments in the related statistics field [16].
13.2 SENSITIVITY-BASED DESIGN EVALUATION
In this section, we present a set of design evaluation indices satisfying the following requirements: (1) ease of interpretation and comparison and (2) ability to capture the uniqueness of a multistage process. Unlike in a single-stage process, a single index may be inadequate to describe every aspect of a multistage system. If one devises a ratio at the high level, i.e., one that describes the behavior of the entire system in response to variation inputs, the detailed process characteristics related to individual stations and to a single fixture may not be captured well by this global ratio. On the other hand, if one devises a ratio at a local level, namely, regarding a single fixture, the combined effect of multiple inputs at the station level or at the system level may not be represented. Owing to this unique problem, a group of hierarchical multilevel indices instead of any single index should be employed to represent different aspects of variational behavior in a multistage manufacturing process. Finally, the multilayer
indices should be expressed in terms of critical design characteristics and be independent of input variation because it is the process or product design that needs to be evaluated rather than the transmitted variation.

The state space model developed in Part II prepares the ground for process design evaluation. First, we need to transform the state space model of a multistage assembly process in Equation 6.39 and Equation 6.40 into a variation propagation model. For design evaluation purposes, we usually assume that only the final assembly is measured on station N, i.e., k = N in Equation 6.40,
\[
y = C x_N + v \tag{13.1}
\]
where the subscript indices for y, C, and v are omitted without causing ambiguity. A comprehensive version of the variation propagation model has been presented in Chapter 9, Equation 9.1 to Equation 9.5. Given Equation 13.1, what we will need in this chapter and in Chapter 14 is merely a special case of Equation 9.2. Using similar notation, we have:
\[
y = \sum_{k=1}^{N} C \Phi_{N,k} B_k u_k + C \Phi_{N,0}\, x_0 + \sum_{k=1}^{N} C \Phi_{N,k}\, w_k + v \tag{13.2}
\]
Define $\Gamma_k = C \Phi_{N,k} B_k$, $\Psi_k = C \Phi_{N,k}$, and $\Gamma_0 = C \Phi_{N,0}$. Then Equation 13.2 can be simplified as
\[
y = \sum_{k=1}^{N} \Gamma_k u_k + \Gamma_0 x_0 + \sum_{k=1}^{N} \Psi_k w_k + v \tag{13.3}
\]
Provided that $u_k$ is a sequence of mutually uncorrelated Gaussian random vectors, the input-output variance-covariance relationship can be obtained as:
\[
\Sigma_y = \sum_{k=1}^{N} \Gamma_k \Sigma_{u_k} \Gamma_k^T + \Gamma_0 \Sigma_0 \Gamma_0^T + \sum_{k=1}^{N} \Psi_k \Sigma_{w_k} \Psi_k^T + \Sigma_v \tag{13.4}
\]
In the preceding equation, only the first term is related to KCCs. The second term, which involves $\Sigma_0$, is the variation from the preceding process and can be estimated through incoming part inspection. The third and fourth terms are associated with sensor noises, background disturbances, and higher-order terms due to linearization. These terms are usually of small magnitude and may be estimated through historical data or a calibration process. We can thus simplify Equation 13.4 to focus on the KPC-to-KCC relationship as:
\[
\Sigma_y = \sum_{k=1}^{N} \Gamma_k \Sigma_{u_k} \Gamma_k^T \tag{13.5}
\]
where $\Sigma_y$ is the variance-covariance matrix resulting from the tooling errors.
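To make the chain from the system matrices to Equation 13.5 concrete, here is a minimal pure-Python sketch. It assumes the usual state transition convention $\Phi_{N,k} = A_{N-1} A_{N-2} \cdots A_k$ with $\Phi_{N,N} = I$, which is not restated in this section, and it stores matrices as nested lists; all function names are hypothetical, and a numerical library would normally replace the small helpers.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def gamma_matrices(A, B, C):
    """Return [Gamma_1, ..., Gamma_N] with Gamma_k = C * Phi_{N,k} * B_k.

    A[k-1] holds A_k (only A_1 .. A_{N-1} are used), B[k-1] holds B_k, and
    C is the observation matrix of station N. Assumed convention:
    Phi_{N,k} = A_{N-1} ... A_k and Phi_{N,N} = I.
    """
    n_stations = len(B)
    phi = identity(len(B[0]))  # Phi_{N,N}
    gammas = [None] * n_stations
    for k in range(n_stations, 0, -1):
        gammas[k - 1] = matmul(matmul(C, phi), B[k - 1])
        if k > 1:
            phi = matmul(phi, A[k - 2])  # Phi_{N,k-1} = Phi_{N,k} * A_{k-1}
    return gammas


def kpc_covariance(gammas, sigma_u):
    """Equation 13.5: sum over stations of Gamma_k * Sigma_{u_k} * Gamma_k^T."""
    m = len(gammas[0])
    total = [[0.0] * m for _ in range(m)]
    for G, S in zip(gammas, sigma_u):
        term = matmul(matmul(G, S), transpose(G))
        total = [[t + x for t, x in zip(t_row, x_row)]
                 for t_row, x_row in zip(total, term)]
    return total


# Toy two-station line with a scalar state (illustrative numbers only).
A = [[[2.0]]]                    # A_1
B = [[[1.0]], [[1.0]]]           # B_1, B_2
C = [[1.0]]
gammas = gamma_matrices(A, B, C)                          # Gamma_1 = 2, Gamma_2 = 1
sigma_y = kpc_covariance(gammas, [[[1.0]], [[3.0]]])      # 2*1*2 + 1*3*1 = 7
```

In the toy line, the upstream input is amplified by A_1 before it reaches the measurement, which is why it contributes 4 of the 7 units of output variance even though its own variance is the smaller one.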
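[Figure 13.2: stations 1, …, k, …, N in sequence. The variation sources at station k are labeled $\sigma^2_{kp}$ and $\sigma^2_k$, the variation sources of the whole process are labeled $\sigma^2_{KCC}$, and the product variation at the output is labeled $\sigma^2_{KPC}$.]
FIGURE 13.2 Variation sources at different levels in a multistage process.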
Equation 13.5 states the relationship between KPC variations and KCC variations. The system design parameters are included in the $\Gamma_k$'s, which combine the process design information (such as manufacturing process configuration and KPC selection) embedded in the system matrices A, B, and C. The potential evaluation index, namely, the ratio of KPC variations to KCC variations, should be expressed as a function of $\Gamma_k$.

Using Figure 13.2, we first introduce the notations associated with variation sources at different levels. The KPC variation is denoted as $\sigma^2_{KPC}$, which contains the diagonal elements of $\Sigma_y$, i.e., $\sigma^2_{KPC} = \mathrm{diag}(\Sigma_y)$. The KCC variation associated with the pth input at station k is denoted as $\sigma^2_{kp}$, where $p = 1, 2, \ldots, p_k$ and $p_k$ is the number of inputs on station k. Because we assumed a mutually independent relationship among all the inputs at station k, $\Sigma_{u_k} = \mathrm{diag}[\sigma^2_{k1} \; \cdots \; \sigma^2_{kp_k}]$. Here we will use $\sigma^2_k$ to denote the variation vector at station k, namely, $\sigma^2_k = \mathrm{diag}(\Sigma_{u_k})$. The KCC variation at the whole process level is denoted as $\sigma^2_{KCC}$, containing the variance components from all stations, i.e., $\sigma^2_{KCC} = [(\sigma^2_1)^T \; \cdots \; (\sigma^2_N)^T]^T$.

The sensitivity analysis can define how the system responds to certain input variation sources, which variation source contributes most to the final product variation, and how and which process parameters most affect the variation propagation. As such, the sensitivity to be defined is the same as system gain in conventional control theory. Appropriate measures should be introduced to represent process sensitivity as the gain of a multiple-input-multiple-output (MIMO) system. Three-level sensitivity indices are defined to facilitate the description of the system behavior of a multistage process: (1) single input level, (2) station level with multiple inputs, and (3) system level with multiple stations.
Definition 13.1: The sensitivity-based design evaluation index at the single input level, denoted as $S_{kp}$, is defined as:
\[
S_{kp} = \frac{\left\| W \sigma^2_{KPC} \right\|_2}{\sigma^2_{kp}} \tag{13.6}
\]
where the weighting matrix W determines the relative importance of KPCs on the final product, and $\|\cdot\|_2$ is the Euclidean norm. The index $S_{kp}$ indicates how the pth input at station k contributes to the KPC variations. At this level, $S_{kp}$ actually corresponds to the gain of a single-input-multiple-output (SIMO) system.
Definition 13.2: The sensitivity-based design evaluation index at the station level, denoted as $S_k$, is defined as:
\[
S_k = \sup_{\sigma^2_k} \frac{\left\| W \sigma^2_{KPC} \right\|_2}{\left\| \sigma^2_k \right\|_2} \tag{13.7}
\]
Here, $S_k$ indicates how the kth station contributes to the KPC variation of the final product, and it will help to identify the critical station that contributes the most. It is a MIMO-type gain because each station contains multiple variation inputs.

Definition 13.3: The sensitivity-based design evaluation index at the system level, denoted as $S_o$, is defined as:
\[
S_o = \sup_{\sigma^2_{KCC}} \frac{\left\| W \sigma^2_{KPC} \right\|_2}{\left\| \sigma^2_{KCC} \right\|_2} \tag{13.8}
\]
Index $S_o$ indicates the system capacity to amplify or suppress the input KCC variations and is also a MIMO-type gain. The preceding indices $S_{kp}$, $S_k$, and $S_o$ will be expressed in terms of $\Gamma_k$, which encodes all critical information regarding process design.

Lemma 13.1: If the variation sources in Figure 13.2 are mutually uncorrelated, $\sigma^2_{KPC}$ can be represented as a linear combination of $\sigma^2_k$ as
\[
\sigma^2_{KPC} = \sum_{k=1}^{N} [\Gamma_k^2] \cdot \sigma^2_k \tag{13.9}
\]
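where $[\Gamma_k^2]$ represents a matrix in which each element is the square of the corresponding element in $\Gamma_k$, i.e., $[\Gamma_k^2] = [(\gamma^2_{ij,k})]$, where $\gamma_{ij,k}$ is the (i, j)th element of $\Gamma_k$.

Proof: According to Equation 13.5, we have
\[
\begin{aligned}
\Sigma_y &= \sum_{k=1}^{N} \Gamma_k \Sigma_{u_k} \Gamma_k^T
= \sum_{k=1}^{N} \Gamma_k \begin{bmatrix} \sigma^2_{k1} & & \\ & \ddots & \\ & & \sigma^2_{kp_k} \end{bmatrix} \Gamma_k^T \\
&= \sum_{k=1}^{N} \begin{bmatrix} \gamma_{1,k} & \cdots & \gamma_{p_k,k} \end{bmatrix}
\begin{bmatrix} \sigma^2_{k1} & & \\ & \ddots & \\ & & \sigma^2_{kp_k} \end{bmatrix}
\begin{bmatrix} \gamma_{1,k} & \cdots & \gamma_{p_k,k} \end{bmatrix}^T \\
&= \sum_{k=1}^{N} \begin{bmatrix} \sigma^2_{k1}\gamma_{1,k} & \cdots & \sigma^2_{kp_k}\gamma_{p_k,k} \end{bmatrix}
\begin{bmatrix} \gamma_{1,k} & \cdots & \gamma_{p_k,k} \end{bmatrix}^T
\end{aligned}
\tag{13.10}
\]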
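where $\gamma_{j,k}$, $j = 1, \ldots, p_k$, is the jth column vector of $\Gamma_k$. If the diagonal elements of $\Sigma_y$ are extracted and arranged into a vector, we have
\[
\sigma^2_{KPC} = \mathrm{diag}(\Sigma_y)
= \sum_{k=1}^{N}
\begin{bmatrix}
\sum_{j=1,\ldots,p_k} \sigma^2_{kj} \cdot \gamma^2_{1j,k} \\
\sum_{j=1,\ldots,p_k} \sigma^2_{kj} \cdot \gamma^2_{2j,k} \\
\vdots \\
\sum_{j=1,\ldots,p_k} \sigma^2_{kj} \cdot \gamma^2_{m_k j,k}
\end{bmatrix}
= \sum_{k=1}^{N}
\begin{bmatrix}
\gamma^2_{11,k} & \gamma^2_{12,k} & \cdots & \gamma^2_{1p_k,k} \\
\gamma^2_{21,k} & \gamma^2_{22,k} & \cdots & \gamma^2_{2p_k,k} \\
\vdots & \vdots & & \vdots \\
\gamma^2_{m_k 1,k} & \gamma^2_{m_k 2,k} & \cdots & \gamma^2_{m_k p_k,k}
\end{bmatrix}
\cdot
\begin{bmatrix}
\sigma^2_{k1} \\ \sigma^2_{k2} \\ \vdots \\ \sigma^2_{kp_k}
\end{bmatrix}
\tag{13.11}
\]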
From the definition of $[\Gamma_k^2]$ and $\sigma^2_k$, we know Equation 13.11 is the same as Equation 13.9. Q.E.D.

Utilizing Lemma 13.1, the following propositions present the relationship of the aforementioned three sensitivity indices with the system parameter matrices $\Gamma_k$.

Proposition 13.1: The sensitivity index $S_{kp}$ for the pth input at station k can be expressed as
\[
S_{kp} = \left\| W \cdot \gamma^2_{p,k} \right\|_2 \tag{13.12}
\]
where $\gamma^2_{p,k}$ represents a vector having elements that are the squares of the corresponding elements in $\gamma_{p,k}$, following a similar notation as $[\Gamma_k^2]$.

Proof: According to the definition of $S_{kp}$, it is assumed that there exists only a single variation source (instead of multiple simultaneous sources) throughout the entire process at each time. Let us denote the only nonzero variation vector by $\sigma^2_k$, which contains one nonzero element
\[
\sigma^2_k = \begin{bmatrix} 0 & \cdots & \sigma^2_{kp} & \cdots & 0 \end{bmatrix}^T \tag{13.13}
\]
Substituting Equation 13.13 in Equation 13.9 yields
\[
\sigma^2_{KPC} = [\Gamma_k^2] \cdot \begin{bmatrix} 0 & \cdots & \sigma^2_{kp} & \cdots & 0 \end{bmatrix}^T = \gamma^2_{p,k} \cdot \sigma^2_{kp} \tag{13.14}
\]
Substituting Equation 13.14 in Equation 13.6 will give us Equation 13.12. Q.E.D.
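Lemma 13.1 and Proposition 13.1 can be checked numerically on a small case. The sketch below, with hypothetical function names and made-up numbers, computes $\sigma^2_{KPC}$ twice, once through the element-wise squared matrices of Equation 13.9 and once as the diagonal of the covariance in Equation 13.5, and then evaluates $S_{kp}$ from Equation 13.12. Column indices are 0-based in the code.

```python
import math


def squared(G):
    """[Gamma^2]: element-wise square of a matrix stored as a list of rows."""
    return [[g * g for g in row] for row in G]


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def kpc_variance_lemma(gammas, sigmas2):
    """Equation 13.9: sum over stations of [Gamma_k^2] * sigma_k^2."""
    out = [0.0] * len(gammas[0])
    for G, s2 in zip(gammas, sigmas2):
        out = [o + c for o, c in zip(out, matvec(squared(G), s2))]
    return out


def kpc_variance_direct(gammas, sigmas2):
    """Diagonal of sum_k Gamma_k diag(sigma_k^2) Gamma_k^T (Equation 13.5)."""
    m = len(gammas[0])
    cov = [[0.0] * m for _ in range(m)]
    for G, s2 in zip(gammas, sigmas2):
        for i in range(m):
            for l in range(m):
                cov[i][l] += sum(G[i][j] * s2[j] * G[l][j] for j in range(len(s2)))
    return [cov[i][i] for i in range(m)]


def s_input(W, G, p):
    """Equation 13.12: Euclidean norm of W * gamma_{p,k}^2 (p is 0-based)."""
    col_sq = [row[p] ** 2 for row in G]
    return math.sqrt(sum(x * x for x in matvec(W, col_sq)))


W = [[1.0, 0.0], [0.0, 1.0]]              # equal weight on both KPCs
gamma_k = [[1.0, 2.0], [3.0, 0.0]]        # a single station, two inputs
sigma2_k = [0.5, 2.0]

via_lemma = kpc_variance_lemma([gamma_k], [sigma2_k])     # [8.5, 4.5]
via_direct = kpc_variance_direct([gamma_k], [sigma2_k])   # same values
```

The agreement is exact here because the inputs are uncorrelated, so only the squared entries of each row of $\Gamma_k$ reach the diagonal of the covariance; the off-diagonal terms that the direct computation also produces play no role in $\sigma^2_{KPC}$.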
The second index is the station sensitivity $S_k$. It is then assumed that only one station contains variation inputs at each time. But within each station, more than one fixture could contribute to $\sigma^2_{KPC}$ simultaneously.

Proposition 13.2: The station-level sensitivity index $S_k$ can be expressed as
\[
S_k = \left\| W \cdot [\Gamma_k^2] \right\|_2 \tag{13.15}
\]
Proof: If variation sources exist only on station k, then $\sigma^2_j = 0 \;\; \forall\, j \neq k$. Hence,
\[
\sigma^2_{KPC} = [\Gamma_k^2] \cdot \sigma^2_k \tag{13.16}
\]
Substituting Equation 13.16 in Equation 13.7 gives
\[
S_k = \sup_{\sigma^2_k} \frac{\left\| W \cdot [\Gamma_k^2] \cdot \sigma^2_k \right\|_2}{\left\| \sigma^2_k \right\|_2} = \left\| W \cdot [\Gamma_k^2] \right\|_2 \tag{13.17}
\]
The preceding equality holds according to the definition of the matrix 2-norm [17]. Q.E.D.

System-level sensitivity considers all possible combinations of multiple variation inputs, within a station or across stations. Thus, it represents the overall sensitivity level of a process to the variation inputs.

Proposition 13.3: The system-level sensitivity index $S_o$ can be expressed as

$$S_o = \| W \cdot [\Gamma_1^2 \;\; \Gamma_2^2 \;\; \cdots \;\; \Gamma_N^2] \|_2 \tag{13.18}$$
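Proof: The KCC variance $\sigma_{KCC}^2$ is the combination of variance vectors at all stations. Equation 13.9 in Lemma 13.1 can be rearranged as:

$$\sigma_{KPC}^2 = [\Gamma_1^2 \;\; \Gamma_2^2 \;\; \cdots \;\; \Gamma_N^2] \cdot \begin{bmatrix} \sigma_1^2 \\ \sigma_2^2 \\ \vdots \\ \sigma_N^2 \end{bmatrix} = [\Gamma_1^2 \;\; \Gamma_2^2 \;\; \cdots \;\; \Gamma_N^2] \cdot \sigma_{KCC}^2 \tag{13.19}$$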
According to the definition of So and the definition of matrix 2-norm, Equation 13.18 can be obtained. Q.E.D. Sensitivity-based design evaluation analysis using the aforementioned expressions of sensitivity indices is much more generic and comprehensive than numerical methods, such as VSA software [18]. First, it is numerically efficient. Unlike the
numerical methods, the time-consuming Monte Carlo simulation can be avoided in obtaining these indices. Hence, as long as the system matrices A, B, and C are available, the calculation of three-level sensitivity indices for a large-scale system can be completed within seconds of CPU time. The time required for numerical simulation can be much longer, depending on how complex and how large the manufacturing process is. Second, numerical methods can readily compute a SIMO-type input-output sensitivity ratio, $S_{kp}$. A MIMO-type index, however, is difficult to calculate using simulation methods: because its definition involves a supremum operation, the inputs would have to be enumerated over all possible values. Third, according to the definition of $\Gamma_k$, the preceding indices are fully determined by the information from product and process design, such as tooling layout, datum change scheme, and quality feature selection. Redesigning a manufacturing process could result in a decrease in system sensitivity or, equivalently, an increase in robustness to external noises. These analytical models and sensitivity expressions provide a basis for further design optimization, which will be presented in the subsequent chapters. In fact, simulation methods are used more for design evaluation than for design optimization.

Based on the previously defined indices, system sensitivity analysis can be conducted in three steps:
1. When there are multiple process configuration options, such as different tooling layouts, datum change schemes, or assembly sequences, a sensitivity analysis can be performed at the system level to reveal the optimal design that yields the lowest process sensitivity value.
2. Within that design, a study on station-level sensitivity can identify critical stations in the process that contribute most to the KPC variation.
3. A single input index will further isolate the largest variation input within these critical stations.
The three-step sensitivity analysis can help select better process designs and set up a proper priority policy so that the most critical variation sources can be focused on and solved.
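The three indices of Propositions 13.1 to 13.3 can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy is available, the function name `sensitivity_indices` is hypothetical, and the two small matrices standing in for $\Gamma_1$ and $\Gamma_2$ are made up for the example rather than taken from the case study.

```python
import numpy as np

def sensitivity_indices(gammas, W=None):
    """Return (S_kp per station, S_k per station, S_o).

    gammas[k] plays the role of Gamma_k, shape (n_kpc, n_inputs_k).
    W is the weight matrix; identity is assumed when omitted.
    """
    n_kpc = np.asarray(gammas[0]).shape[0]
    W = np.eye(n_kpc) if W is None else np.asarray(W, dtype=float)
    # [Gamma_k^2]: elementwise square, as defined for Equation 13.12
    squared = [np.asarray(g, dtype=float) ** 2 for g in gammas]

    # Proposition 13.1: vector 2-norm of each weighted column gamma^2_{p,k}
    s_kp = [np.linalg.norm(W @ g2, axis=0) for g2 in squared]
    # Proposition 13.2: matrix 2-norm (largest singular value) per station
    s_k = [float(np.linalg.norm(W @ g2, 2)) for g2 in squared]
    # Proposition 13.3: 2-norm of all station blocks placed side by side
    s_o = float(np.linalg.norm(W @ np.hstack(squared), 2))
    return s_kp, s_k, s_o

# Hypothetical two-station example with two KPCs
gamma_1 = [[1.0, 0.0],
           [0.0, 2.0]]
gamma_2 = [[3.0],
           [0.0]]
s_kp, s_k, s_o = sensitivity_indices([gamma_1, gamma_2])
```

For these made-up matrices the single-input indices are 1 and 4 at the first station and 9 at the second, the station indices are 4 and 9, and the system index is $\sqrt{82} \approx 9.06$. The ordering $S_o \geq \max_k S_k \geq \max_{k,p} S_{kp}$ holds in general, because the 2-norm of a matrix is never smaller than that of any of its column blocks.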
13.3 MULTIVARIATE PROCESS CAPABILITY ANALYSIS

The state space model developed in Part II characterizes variation propagation in a multistage process. A relevant issue is how the process capability changes along the production line. For a univariate quality characteristic, a process capability index is defined as [3]:

$$C_p = \frac{USL - LSL}{6\sigma} \tag{13.20}$$

where USL and LSL are the upper and lower specification limits of the quality characteristic, respectively, and σ is its standard deviation. As a rule of thumb, industries usually require $C_p \geq 1.33$ to deem a process capable. If a Six Sigma standard is implemented, the requirement becomes $C_p \geq 2$.
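As a small numerical companion to Equation 13.20, the sketch below computes $C_p$ and the fraction of nonconforming product it implies. The second function rests on assumptions that are not part of the definition itself: the characteristic is normally distributed and the process mean sits exactly midway between the limits. The function names and the numerical limits are illustrative only.

```python
from statistics import NormalDist

def cp(usl, lsl, sigma):
    """Univariate process capability index, Equation 13.20."""
    return (usl - lsl) / (6.0 * sigma)

def nonconforming_fraction(cp_value):
    """Two-sided fraction outside the limits.

    Assumes a normal characteristic centered between LSL and USL,
    so each limit lies 3 * Cp standard deviations from the mean.
    """
    return 2.0 * (1.0 - NormalDist().cdf(3.0 * cp_value))

# Hypothetical limits and standard deviation (mm)
cp_value = cp(usl=10.5, lsl=9.5, sigma=0.125)   # 1.0 / 0.75
fraction_at_cp_one = nonconforming_fraction(1.0)
```

With $C_p = 1$ the function returns approximately 0.0027, the conventional 3-sigma nonconforming proportion.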
For a complicated production process, there always exist multiple quality characteristics. One such example is seen in the ten features marked in Figure 6.10d. The challenge lies in defining a scalar index that provides an intuitive measure of multivariate process capability, with properties similar to those of $C_p$ in Equation 13.20. Over the past two decades, several sensible multivariate process capability ratios have been proposed; a general review and comparison can be found in Wang et al. [16]. Here we adopt the multivariate process capability ratio proposed by Chen [19] for a general type of tolerance zone that includes rectangular regions as a special case. First, define a tolerance zone as

$$V = \{ Y_k \in R^{q_k} : h(Y_k - \mu_k) \leq r_0(k) \} \tag{13.21}$$
where $r_0(k)$ is a positive number associated with station k, $\mu_k$ is the targeted mean for $Y_k$, and $h(Y_k - \mu_k)$ is a specific positive function defining the tolerance region of $Y_k$. For instance, $h(Y_k - \mu_k) = [(Y_k - \mu_k)^T \Sigma_k^{-1} (Y_k - \mu_k)]^{1/2}$ defines an ellipsoidal tolerance region for $Y_k$. Mathematically, Chen [19] further required $h(\cdot)$ to be a positive homogeneous function of degree one, i.e., $h(tx) = t\,h(x)$ for $t > 0$, so as to ensure that the process capability ratio defined later is well posed.

Let α be the allowable expected proportion of nonconforming products from a process, for instance, α = 0.0027 for conventional 3-sigma natural tolerance limits. With the tolerance zone defined in Equation 13.21, a process is capable if $\Pr(Y_k \in V) \geq 1 - \alpha$ or, equivalently, $\Pr(h(Y_k - \mu_k) \leq r_0(k)) \geq 1 - \alpha$. Further, define $r_k = \min\{c : \Pr(h(Y_k - \mu_k) \leq c) \geq 1 - \alpha\}$. When the cumulative distribution function of $h(Y_k - \mu_k)$ is increasing in a neighborhood of $r_k$, then $r_k$ is the unique root of the equation $\Pr(h(Y_k - \mu_k) \leq r_k) = 1 - \alpha$. The process is deemed capable if and only if $r_k \leq r_0(k)$, i.e., $r_0(k)/r_k \geq 1$. This suggests that the quantity $r_0(k)/r_k$ describes the capability of a process. Therefore, a multivariate process capability ratio for quality characteristics on station k, denoted by $MC_p(k)$, is defined as

$$MC_p(k) = \frac{r_0(k)}{r_k} \tag{13.22}$$
We choose this multivariate capability ratio for the following reasons: (1) the tolerance specifications can be as general as given in Equation 13.21; (2) the statistical interpretation does not rely on a particular form, such as normality, of the distribution of $Y_k$; and (3) the choice of α provides flexibility in setting a criterion for the capability of a process. An index value of $MC_p(k) = 1.0$ corresponds to an expected proportion of conforming product of exactly 1 – α. A larger $MC_p(k)$ indicates a smaller expected proportion of nonconforming products, that is, a more capable process. These interpretations give the $MC_p$ ratio a meaning similar to that of the univariate ratio $C_p$.
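As a consistency check on the last remark, the following worked derivation shows $MC_p(k)$ collapsing to $C_p$ in one dimension. It is a sketch under assumptions added here for illustration, not stated in the text: $Y_k$ is normal with mean $\mu_k$, the matrix $\Sigma_k$ in the ellipsoidal $h$ is the covariance of $Y_k$, and the specification limits are symmetric about the target.

```latex
% Ellipsoidal zone with Y_k ~ N(mu_k, Sigma_k): h^2 is chi-square with q_k degrees of freedom
h(Y_k-\mu_k)^2 = (Y_k-\mu_k)^T \Sigma_k^{-1} (Y_k-\mu_k) \sim \chi^2_{q_k}
\quad\Longrightarrow\quad
r_k = \sqrt{\chi^2_{q_k,\,1-\alpha}}, \qquad
MC_p(k) = \frac{r_0(k)}{\sqrt{\chi^2_{q_k,\,1-\alpha}}}

% Univariate case q_k = 1, alpha = 0.0027:
% h = |Y - \mu| / \sigma, \quad r_0 = (USL - LSL)/(2\sigma), \quad r_k = z_{1-\alpha/2} \approx 3
MC_p = \frac{(USL-LSL)/(2\sigma)}{3} = \frac{USL-LSL}{6\sigma} = C_p
```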
In industrial applications, for the sake of convenience, most tolerance regions are specified as a rectangular zone, i.e., $V = \{Y_k \in R^{q_k} : |y_{i,k} - \mu_{i,k}| \leq r_i(k),\ i = 1, \ldots, q_k\}$, where $y_{i,k}$ and $\mu_{i,k}$ are the ith elements of $Y_k$ and $\mu_k$, respectively, and $r_i(k)$ is the tolerance range for the ith element associated with station k. Here, $h(\cdot)$ is an absolute value function. The rectangular zone can be written equivalently as:

$$V = \{ Y_k \in R^{q_k} : \max( |y_{i,k} - \mu_{i,k}| / r_i(k),\ i = 1, \ldots, q_k ) \leq 1 \} \tag{13.23}$$
In other words, V has the structure of Equation 13.21 with $h(Y_k - \mu_k) = \max\{|y_{i,k} - \mu_{i,k}| / r_i(k),\ i = 1, \ldots, q_k\}$ and $r_0(k) = 1$. Hence, the multivariate process capability index can be expressed as

$$MC_p(k) = \frac{1}{r_k} \tag{13.24}$$
where $r_k$ is such that $\Pr(\max\{|y_{i,k} - \mu_{i,k}| / r_i(k),\ i = 1, \ldots, q_k\} \leq r_k) = 1 - \alpha$. Let F be the cumulative distribution function of $h(Y_k - \mu_k)$. Then $r_k = F^{-1}(1 - \alpha)$, namely, the 100(1 – α)th percentile of F. In general high-dimensional cases, the cumulative distribution function of $h(Y_k - \mu_k)$ is usually not available in analytical form. Monte Carlo simulation is needed to numerically obtain an approximation of F and then an estimate of $MC_p(k)$.

Denote the estimate of $MC_p(k)$ by $\widehat{MC}_p(k)$. The inferential property of $\widehat{MC}_p(k)$ is generally difficult to obtain in closed form. Usually, resampling methods such as bootstrap or jackknife [20] are employed to obtain an asymptotic 100(1 – ν)% confidence interval for $MC_p(k)$ as $\widehat{MC}_p(k) \pm z_{\nu/2}\,\hat{\sigma}_{k,MC}$, where $z_{\nu/2}$ is the 100(1 – ν/2)th percentile of the standard normal distribution and $\hat{\sigma}_{k,MC}$ is the resampling estimate of the standard deviation of $\widehat{MC}_p(k)$. To test the hypothesis $H_0$: $MC_p(k) = 1$ (or another value designated by design engineers) vs. $H_1$: $MC_p(k) > 1$, one would reject $H_0$ at the significance level ν if $\widehat{MC}_p(k) - z_{\nu/2}\,\hat{\sigma}_{k,MC} > 1$. When the null hypothesis is rejected, we conclude that the process under test is capable.
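Although F is rarely available analytically, there is one special case in which it is, and it can serve as a reference when checking a Monte Carlo implementation. The sketch below assumes, beyond anything stated in the text, that the components of $Y_k$ are independent and normal with means on target; then $F(c) = \prod_i [2\Phi(c\,r_i/\sigma_i) - 1]$ for the rectangular zone, and $r_k$ follows by bisection. The function name and argument layout are hypothetical.

```python
from statistics import NormalDist

def mcp_independent_normal(sigmas, tolerances, alpha=0.0027):
    """MCp = 1 / r_k for a rectangular zone (Equation 13.24).

    Assumes independent, on-target normal components with standard
    deviations `sigmas` and tolerance ranges `tolerances` (r_i).
    """
    phi = NormalDist().cdf

    def coverage(c):
        # F(c) = Pr(max_i |y_i - mu_i| / r_i <= c) under independence
        p = 1.0
        for s, r in zip(sigmas, tolerances):
            p *= 2.0 * phi(c * r / s) - 1.0
        return p

    lo, hi = 0.0, 1.0
    while coverage(hi) < 1.0 - alpha:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):                # bisection on the increasing function F
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 1.0 / hi

# One characteristic with tolerance 3*sigma, and four such characteristics
mcp_one = mcp_independent_normal([1.0], [3.0])
mcp_four = mcp_independent_normal([1.0] * 4, [3.0] * 4)
```

With a single characteristic toleranced at 3σ and α = 0.0027 the ratio is essentially 1; adding further characteristics with the same tolerance lowers it, since all of them must fall inside the zone simultaneously.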
13.4 EXAMPLES

The assembly process of the SUV side panel, discussed in Section 6.4, is used to demonstrate the sensitivity-based design evaluation and the multivariate process capability analysis of a multistage process.

13.4.1 SENSITIVITY-BASED DESIGN EVALUATION

In addition to fixture locators LP1–LP8 used in the assembly process of Figure 6.10, there is an extra locator LP9 (coordinates: x = 3026.25, z = 950.30), as shown in Figure 13.3, on the rear quarter panel, which can be used to position this particular panel on station III, as well as to position the final subassembly on the measurement station, station IV. The positions of the KPC points are those of the measurement points, i.e.,
FIGURE 13.3 Fixture locators LP1–LP9 on the assembly. (Parts labeled in the figure: A-pillar, B-pillar, rail roof side panel, and rear quarter panel.)
TABLE 13.1 Process Sensitivity Index for C1–C4
Process Configuration    C1      C2      C3      C4
So                       6.14    3.33    3.26    3.13
TABLE 13.2 Station Sensitivity Index for Configuration C4
        Station I    Station II    Station III
Sk      2.94         1.69          3.01
SP1–SP10 in Figure 6.10. The coordinates of those locators and KPCs can be found in Table 6.1 and Table 6.2, respectively. We propose four alternative process configuration schemes, marked C1–C4. Configuration C1 is currently used in one domestic automotive assembly plant and has been described in Section 6.4. It is also used as the reference in our design evaluation. A major difference between the other configurations (C2, C3, and C4) and C1 is that locator LP9 replaces LP7 when the rear quarter panel is located on station III. The fixture locating layout for each configuration is presented as follows, using the same notation as in Chapter 6.

Configuration (C1): {{LP1, LP2}, {LP3, LP4}}I → {{LP1, LP4}, {LP5, LP6}}II → {{LP1, LP6}, {LP7, LP8}}III → {{LP1, LP8}}IV;
Configuration (C2): {{LP1, LP2}, {LP3, LP4}}I → {{LP1, LP4}, {LP5, LP6}}II → {{LP1, LP6}, {LP8, LP9}}III → {{LP1, LP9}}IV;
Configuration (C3): {{LP1, LP2}, {LP4, LP3}}I → {{LP4, LP2}, {LP5, LP6}}II → {{LP4, LP6}, {LP8, LP9}}III → {{LP4, LP9}}IV;
Configuration (C4): {{LP1, LP2}, {LP4, LP3}}I → {{LP4, LP2}, {LP5, LP6}}II → {{LP1, LP6}, {LP8, LP9}}III → {{LP1, LP9}}IV.

To evaluate these design configurations, a state space model with N = 4 is first developed for each of them, following the procedure presented in Chapter 6. The system matrices A, B, and C of the four design configurations are included in the appendices (Sections 13.6 through 13.9). The sensitivity-based design evaluation is then performed in three steps as outlined in Section 13.2. In this example, the weight coefficient matrix W is selected as an identity matrix, implying that all KPCs are regarded as equally important.

Step 1 (System-level design evaluation): The system sensitivity indices of all four process configurations are calculated and presented in Table 13.1. The lower the index value, the more robust the process design. Comparing two sensitivity indices, we further quantify the significance of improvement (SOI) as

$$SOI = \frac{S_o^{old} - S_o^{new}}{S_o^{old}} \times 100\% \tag{13.25}$$
Given a unit KCC variation input, SOI represents the percentage change in KPC variation when a new process design configuration is compared to the original design configuration. A negative SOI means that the process sensitivity actually increases and, hence, the system robustness deteriorates. What counts as a significant SOI depends on the trade-off between the savings from the quality improvement and the effort of making the changes. A quantitative significance range for SOI can be determined only where the following are known: (1) statistical distributions of KPC or KCC variables; (2) tolerance limits; and (3) variation or tolerance vs. cost (scrap, rework, warranty, etc.). In the presented case study, based on our discussions with automotive engineers, we consider an SOI greater than 20% to be significant, between 10% and 20% to be marginally significant, and less than 10% to be insignificant. Based on the sensitivity values in Table 13.1, SOI takes a value of 45%–49% when any one of the alternative configurations, C2, C3, or C4, replaces the current industrial configuration C1. It is then concluded that the sensitivity level drops considerably when LP9 is used to replace LP7. In other words, the new configurations with LP9 significantly improve the system's robustness. The result suggests that C1, the design configuration currently used in the industry, is not optimal in terms of robustness to dimensional variations. Meanwhile, the SOIs between any two of the other three process designs using LP9 (configurations C2, C3, and C4) are no more than about 6%; that is, their differences are not significant. The fourth scheme (C4) yields the lowest $S_o$ value among the four process configurations. The value of SOI equals 49.0% when C4 is compared
to C1, which corresponds to a 49.0% decrease in KPC variation level under the same condition of KCC variation inputs. Hence, it is recommended that the current process design be replaced by configuration C4.

Step 2 (Station-level design evaluation): Let us further study the station sensitivity of the fourth configuration (C4) to identify which station makes the biggest contribution to the KPC variation. Sensitivity indices for the three stations are shown in Table 13.2. The percentage of variation contribution (PVC) from station k can be calculated using the following index:

$$PVC_k = \frac{S_k}{\sum_{j=1}^{N} S_j} \times 100\%.$$
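The SOI figures quoted in Step 1 and the PVC index just defined can be reproduced directly from the tabulated values. The short script below uses only the numbers printed in Table 13.1 and Table 13.2; the helper names `soi` and `pvc` are illustrative.

```python
# System-level indices from Table 13.1 and station-level indices (C4) from Table 13.2
s_o = {"C1": 6.14, "C2": 3.33, "C3": 3.26, "C4": 3.13}
s_k = {"I": 2.94, "II": 1.69, "III": 3.01}

def soi(old, new):
    """Significance of improvement, Equation 13.25, in percent."""
    return 100.0 * (old - new) / old

def pvc(station_indices):
    """Percentage of variation contribution of each station."""
    total = sum(station_indices.values())
    return {k: 100.0 * v / total for k, v in station_indices.items()}

soi_vs_c1 = {c: round(soi(s_o["C1"], v), 1) for c, v in s_o.items() if c != "C1"}
pvc_c4 = {k: round(v, 1) for k, v in pvc(s_k).items()}

print(soi_vs_c1)  # {'C2': 45.8, 'C3': 46.9, 'C4': 49.0}
print(pvc_c4)     # {'I': 38.5, 'II': 22.1, 'III': 39.4}
```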
Results indicate that PVC3 = 39.4%, PVC1 = 38.5%, and PVC2 = 22.1%. The third station is the most critical station, with the highest sensitivity and PVC value. Station I also makes a substantial contribution to the KPC variation. Station I and station III together account for 77.9% of the KPC variation level. Station II has the lowest station sensitivity and the smallest PVC value. The designer's first priority is therefore to investigate the design layouts of station I and station III.

Step 3 (Single-input level design evaluation): In this process, each fixture locator constitutes a single process variation input. The fixture sensitivity index is computed using Equation 13.12. At each station, two parts or multipart subassemblies are positioned by four independent locators. Thus, a total of 12 indices are shown in Table 13.3. From Table 13.3, one can see that none of the locators at station II is a major variation input, whereas station I and station III include some critical variation sources. Locator 1 and locator 3 (both 4-way locators) at station I and station III cause the largest variations in the final assembly if the input variations have the same magnitude. The variation reduction and design efforts should first target station I and station III to reduce the sensitivity of these two 4-way locators.

Simulation software such as 3DCS [21] can be used to obtain the sensitivity indices by performing Monte Carlo simulations. As discussed in Section 13.2, it is
TABLE 13.3 Fixture Sensitivity Index for Configuration C4
             Station I    Station II    Station III
Locator 1    2.38         1.50          2.03
Locator 2    1.37         0.69          0.75
Locator 3    2.18         0.65          2.68
Locator 4    0.75         0.56          1.09
FIGURE 13.4 Comparison of sensitivity index from VSA and analytical approach. (Bar chart of the sensitivity index, vertical axis from 0 to 3, for locators 1–4 at stations I, II, and III; two series: VSA and Analytical.)
difficult and time-consuming to compute MIMO-type indices such as $S_k$ and $S_o$ when using VSA. Thus, the VSA software is used to obtain only the fixture sensitivity index $S_{kp}$. An assembly process identical to that presented in Figure 6.10 is modeled in VSA and the variation model is then generated. A normal variation source with 3σ = 1 is assigned to one fixture locator at a time, and 5000-run Monte Carlo simulations are conducted. The sensitivity index is computed by dividing the KPC variation by the input source's variance. The results are compared with the values in Table 13.3, which are calculated using Equation 13.12 and the design parameters in Tables 6.1 and 6.2. The comparison is shown in Figure 13.4. The analytical and numerical calculations of the 12 fixture sensitivity indices agree well; the maximum difference is less than 3.2%.
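The kind of comparison described above can be imitated on a toy model. The sketch below is not a reproduction of the VSA study: it assumes NumPy, a purely linear response of each KPC to a single active input, W = I, and a made-up coefficient vector `gamma` in place of a column of $\Gamma_k$. It contrasts the analytical index of Proposition 13.1 with a 5000-run Monte Carlo estimate obtained by dividing the norm of the simulated KPC variances by the input variance.

```python
import numpy as np

def skp_analytical(gamma_col):
    """Proposition 13.1 with W = I: 2-norm of the elementwise-squared column."""
    g = np.asarray(gamma_col, dtype=float)
    return float(np.linalg.norm(g ** 2))

def skp_monte_carlo(gamma_col, sigma=1.0 / 3.0, runs=5000, seed=0):
    """Simulate one active input; return ||KPC variances||_2 / input variance."""
    g = np.asarray(gamma_col, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma, size=runs)   # single input with 3*sigma = 1
    y = np.outer(u, g)                      # assumed linear KPC response
    kpc_var = y.var(axis=0, ddof=1)         # sample variance of each KPC
    return float(np.linalg.norm(kpc_var) / sigma ** 2)

gamma = [0.8, -1.2, 0.5]   # hypothetical coefficients, not from the case study
analytical = skp_analytical(gamma)
simulated = skp_monte_carlo(gamma)
rel_diff = abs(simulated - analytical) / analytical
```

In this linear toy model the only discrepancy between the two numbers is the sampling error of the variance estimate, which for 5000 runs is on the order of a few percent.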
13.4.2 MULTIVARIATE PROCESS CAPABILITY ANALYSIS

In this subsection, we apply the process capability analysis to the process configuration scheme C1 described in Subsection 13.4.1. Suppose that tolerances of ±0.25 mm are used for all the locating points at all four assembly stations. It is also assumed that deviations at the locating points are normally distributed and all tolerance limits correspond to their 6σ values. Simulated variations of the measurement points are computed by 3DCS. In this illustration of process capability analysis, we use the same number of quality characteristics on each station because a difference in the number of quality characteristics may unfairly affect the process capability of a station. In particular, we select four single-directional measurement points out of the original measurement points (namely, SP1 to SP10 in Figure 6.10) on each station; the selections are indicated in Table 13.4, where SPi(x) and SPi(z) denote the x and the z direction of the ith measurement point. Therefore, the $Y_k$ in our application is a four-dimensional vector, i.e., $Y_k \in R^4$, k = 1, 2, 3, 4. We consider a rectangular tolerance zone as given by Equation 13.23. In automotive assembly applications, it is commonly required that the tolerance range $r_i(k)$ for the final car body assembly be less than 2 mm. As we are only considering a subassembly process in this example, we decide to specify $r_i(k)$ to be 0.5 mm,
TABLE 13.4 Measurement Points Selected from Each Station
               Measurement Point 1    Measurement Point 2    Measurement Point 3    Measurement Point 4
Station I      SP1(x)                 SP2(z)                 SP3(x)                 SP4(x)
Station II     SP1(z)                 SP2(x)                 SP4(x)                 SP6(x)
Station III    SP2(x)                 SP4(x)                 SP5(x)                 SP9(z)
Station IV     SP3(z)                 SP5(x)                 SP8(x)                 SP10(z)
i = 1, 2, 3, 4 and k = 1, 2, 3, 4. Because $Y_k$ is a vector of the deviations at the measurement points, the target mean is $\mu_k = 0$. Hence, the tolerance zone is

$$V = \{ Y_k \in R^4 : \max( |y_{i,k}| / 0.5,\ i = 1, \ldots, 4 ) \leq 1 \}.$$

Monte Carlo simulations are used to numerically obtain an estimate of $MC_p(k)$. Assume that $Y_k$ is normally distributed with mean vector $t_k$ and covariance matrix $\Sigma_k$. Estimates $\hat{t}_k$ and $\hat{\Sigma}_k$ can be computed from the observations of the assembly process. Hence, if we generate K samples of the random variable $Y_k$ (K should be a large number, such as 1,000,000), then we can estimate the probability $\Pr(\max\{|y_{i,k}|/0.5,\ i = 1, \ldots, 4\} \leq r_k)$ for a given $r_k$ by

$$\widehat{\Pr} = \frac{\text{Number of samples that satisfy } \max\{|y_{i,k}|/0.5,\ i = 1, \ldots, 4\} \leq r_k}{K} \tag{13.26}$$
Therefore, we are able to obtain $r_k$ such that $\Pr(\max\{|y_{i,k}|/0.5,\ i = 1, \ldots, 4\} \leq r_k) = 1 - \alpha$ by the following procedure. Denote the samples we generated by $Y_k^1, Y_k^2, \ldots, Y_k^K$, with $Y_k^j = [y_{1,k}^j, y_{2,k}^j, y_{3,k}^j, y_{4,k}^j]$. If we denote $\beta_j = \max\{|y_{i,k}^j|/0.5,\ i = 1, \ldots, 4\}$, j = 1, 2, …, K, then we can estimate $r_k$ by the 100(1 – α)th percentile of $\{\beta_j\}_{j=1}^{K}$. $MC_p(k)$ is then estimated by $1/r_k$.

Under the process configuration described earlier, we obtained the measurement results under 500 different deviations at the locating points. Thus, the estimates $\hat{t}_k$ and $\hat{\Sigma}_k$ can be computed from those 500 observations. Using $\hat{t}_k$ and $\hat{\Sigma}_k$, we can generate K (say K = 1,000,000) samples of $Y_k$, and then estimate $r_k$ by the 100(1 – α)th percentile of $\{\beta_j\}_{j=1}^{K}$. The resulting estimates of the multivariate process capability index, $\widehat{MC}_p(k)$, k = 1, …, 4, are shown as dots in Figure 13.5. The value of $\widehat{MC}_p(k)$ is indicated next to each dot. $MC_p(k)$ decreases with k. For this process, $MC_p(4)$, the process capability at the final station, is still greater than 1.33, the general benchmark required for a process by the automobile industry.

The bootstrap method is used to obtain an asymptotic 100(1 – ν)% confidence interval for $MC_p(k)$ as $\widehat{MC}_p(k) \pm z_{\nu/2}\,\hat{\sigma}_{k,MC}$. We calculate $\hat{\sigma}_{k,MC}$ from 100 bootstrap samples of $\widehat{MC}_p(k)$. The confidence intervals are shown in Figure 13.5 as well. The two short bars at the ends of each line represent the range associated with the 95% confidence interval. Overall, Figure 13.5 demonstrates the change of process capability for a product with multivariate quality characteristics over a series of production stations. The uncertainty in the process capability estimates is a result of the Monte Carlo simulations.

FIGURE 13.5 Multivariate process capability index MCp(k). Values shown in the plot (lower 95% limit, estimate, upper 95% limit): Station I: 3.2346, 3.4206, 3.6066; Station II: 2.1141, 2.2041, 2.2941; Station III: 1.4154, 1.4902, 1.5649; Station IV: 1.3367, 1.4122, 1.4878.
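The estimation procedure of this subsection can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for Figure 13.5: it assumes NumPy, the function names are hypothetical, the 500 observations are synthetic stand-ins for the 3DCS output, and the bootstrap scheme (resampling the observed rows with replacement and repeating the whole estimate) is one plausible reading of the text, which does not spell out the resampling details. Sample sizes are reduced so that the sketch runs quickly.

```python
import numpy as np
from statistics import NormalDist

def estimate_mcp(obs, tol=0.5, alpha=0.0027, n_samples=200_000, rng=None):
    """Estimate MCp(k) = 1 / r_k for the rectangular zone with mu_k = 0."""
    rng = np.random.default_rng() if rng is None else rng
    t_hat = obs.mean(axis=0)                       # estimated mean vector
    sigma_hat = np.cov(obs, rowvar=False)          # estimated covariance matrix
    y = rng.multivariate_normal(t_hat, sigma_hat, size=n_samples)
    beta = np.max(np.abs(y) / tol, axis=1)         # beta_j = max_i |y_i| / 0.5
    r_k = np.quantile(beta, 1.0 - alpha)           # 100(1 - alpha)th percentile
    return 1.0 / r_k

def bootstrap_ci(obs, n_boot=100, nu=0.05, seed=0, **kwargs):
    """Point estimate and normal-approximation 100(1 - nu)% interval."""
    rng = np.random.default_rng(seed)
    point = estimate_mcp(obs, rng=rng, **kwargs)
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(obs), size=len(obs))   # resample rows
        reps.append(estimate_mcp(obs[idx], rng=rng, **kwargs))
    half = NormalDist().inv_cdf(1.0 - nu / 2.0) * np.std(reps, ddof=1)
    return point, point - half, point + half

# Synthetic stand-in for 500 observed deviation vectors at one station (mm)
demo_rng = np.random.default_rng(1)
observations = demo_rng.normal(0.0, 0.05, size=(500, 4))
point, lower, upper = bootstrap_ci(observations, n_boot=20, n_samples=20_000)
```

Following the test stated in Section 13.3, a station would be declared capable when `lower` exceeds 1. Increasing `n_samples` toward the 1,000,000 used in the text reduces the Monte Carlo noise in the tail percentile at the cost of run time.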
13.5 EXERCISES

1. Compared to a univariate case, what are the major challenges in defining a design sensitivity index for a multivariate system? Likewise, what are the major challenges in defining a process capability ratio?
2. Prove Lemma 13.1.
3. Prove Propositions 13.1 to 13.3.
4. If the process mean is truly on target (i.e., there is no mean shift), what kind of defective rate can a $C_p$ = 1.33 guarantee the manufacturer? How about $C_p$ = 2? Based on your calculation, explain why Six Sigma is broadly used as a quality standard in the industry.
5. Given the following single-station multilocator fixturing system (Figure 13.6), calculate the station-level sensitivity index, $S_k$, and the fixture-level sensitivity index for individual locators, $S_{kp}$. Here assume all measurements are equally important, or W = I.
6. Consider the same fixturing system as in (5). Suppose that the tolerance for each locator is ±0.25 mm (and assume that the tolerance limits correspond to the ±3σ value of a normal distribution) and the tolerance specification for each measurement point is ±1 mm. Calculate the process capability ratio for this MIMO system using Monte Carlo simulations.
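FIGURE 13.6 The geometry of the fixture and locations of measurements (units: mm). (Figure not reproduced; labels in the original include measurement points M1, M2, and M3, points P1 and P2, axes X, Y, and Z, and the dimension values 100, 200, 600, 700, and 1000.)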
Also, plot your result in a graph similar to Figure 13.5, but remember that you have only one station instead of four.
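13.6 APPENDIX 13.1: SYSTEM MATRICES OF CONFIGURATION C1

$$A(1) = \left[\begin{array}{cc}
\begin{array}{rrrrrr}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0.0004 & 0.0009 & 1 & -0.0004 & -0.0009 & -0.2942 \\
-1.1787 & -0.4387 & 0 & 1.1787 & 0.4387 & 136.2029 \\
0.3603 & -0.1155 & 0 & -0.3603 & 0.1155 & -274.5896 \\
0.0004 & 0.0009 & 0 & -0.0004 & -0.0009 & 0.7058
\end{array} & 0_{6\times 6} \\
0_{6\times 6} & I_{6\times 6}
\end{array}\right]_{12\times 12}$$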
0 0 −0.0003 −0.8583 −0.2857 A(2) = −0.0003 −0.7752 −0.33376 −0.0003 0 0 −0.0001 −0.9307 −0.1397 −0.0001 A(3) = −0.8900 −0.1651 −0.0001 −0.8824 −0.3854 −0.0001 1 0 0.0016 B(1) =
0
0
0
0
0.0005
1
−0.2495
0
−0.4970
0
0.0005
0
−0.3959
0
−0.4055
0
0.0005
0
0
0 0
3× 3
0
0
0
3× 3
0
0
0
0.0003
I
0
−0.0005 0.2495
83.6209
0.2857
−0.5030
−168.5822
0.0003
−0.0005
−0.1806
0.7752
0.3959
132.6691
0.3376
0.4055
−199.2403
0.0003
−0.0005
0.8194
3× 3
3× 3
0
0
0.0005
1
0.0001
−0.0005
−0.2445
0
−0.0693
0.2445
−0.5071
0
0.1397
−0.4929
0.0005
0
0.0001
−0.0005
−0.1100
0.3879
0.1651
−0.5825
−0.3879
0
−0.4175
0
6×6
0
0
0
0
0.0001
−0.0005
0
0.8824
0.4148
0.3854
−0.3593
0.0001
−0.0005
0
0.0005
0
0
0
0
0
1
0
0
−0.0012
−0.0016
0.0012
0
3× 4
0
6 ×8
3× 3
3× 3
0.4769 −220.74 445.01 0.4769 −350.21 525.94 0.4769 −374.49 1227.3 1.4769 12×12 0
0.0005
0.3593
0
3× 3
0
−0.4148
3× 6
0
I
0
I
0
3× 3
0
3× 6
−0.1806
−0.1417 3× 3
0
0
0
3× 3
0
3× 4
1
0
0
0
1
0
−0.0012
0
0.0012
0 0 0 12×8
12×12
1 0 −0.0004 1.1787 −0.3603 −0.0004 B(2) = 1 0 0.0003 0.8583 0.2857 0.0003 B(3) = 0.7752 0.3376 0.0003
0
0
0
1
0
0
−0.0009
0.0004
0.0009
0.4387
−0.1787
−0.4387
0.1155
0.3603
0.8845
−0.0009
0.0004
0.0009
0
1
0
0
0
0
−0.0005
−0.0003
0.0005
0.2495
0.1417
−0.2495
0.4970
−0.2857
0.5030
−0.0005
−0.0003
0.0005
0.33959
0.2248
−0.3959
0.4055
−0.3376
0.5945
−0.0005
−0.0003
0.0005
0
3× 4
6× 4
1
0
0
0
1
0
0.0007
6× 4
0
0
−0.0032
−0.0007
0
0
0
0
0
0
0
0
0
0
9× 4
1
0
0
0
1
0
0.0011
0.0004
−0.0004
0 0 0.0032 0 0 12×8 0
0 0 −0.001112×8
1 0 1 0 C=
0
1
1
−96.3
0
−728.7
1
197.9
0
6×3
0
0
2×3
8×3
0
4×3
0
1
0
141.39
0
1
−11.3
1
0
735.35
0
1
5. 5
1
0
1283.9
0
1
−56.5 0
2×3
0
8×3
4×3
0
1
0
0
1
6×3
−141.4 133.79
0
8×3
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
4×3 0 6×3 0 2×3 0 −260.19 −56.62 1216.1 −197.92 1465 −1103 231.91 −961.62 20×12
Matrices A(1), A(2), B(1), B(2), are the same as those of configuration C1.
13.7 APPENDIX 13.2: SYSTEM MATRICES OF CONFIGURATION C2

0 0 0 −0.9971 −0.0058 0 A(3) = −0.9954 −0.0069 0 −0.9969 −0.0010 0 1 0 0.0003 0.8583 0.2857 0.0003 B(3) = 0.7752 0.3376 0.0003
0
0
0
0
0.0004
1
0.1741
1
0
3× 6
0
0
0
0
0
−0.0004
0 −0.0029
0.1741
−0.6491
0
0.0058
−0.3509
0.0004
0
0
−0.0004
−0.2762
0
−0.5853
I
6×6
−0.0046
0.2762
0
0.0069
−0.4148
0.0004
0
0
−0.0004
−0.1868
0
0.9969
0.1868
0.0110
0.3410
−0.3410
0
0.0004
0
0 1
0
3× 6
0
0
0
−0.0005
−0.0003
0.0005
0.2495
0.1417
−0.2495
0.4970
−0.2857
0.5030
−0.0005
−0.0003
0.0005
0.3959
0.2248
−0.3959
0.4055
−0.3376
0.5945
−0.0005
−0.0003
0.0005
0
3× 4
−0.0004
0
0
−0.3378 513.38 −315.28 −0.3378 248.11 −372.61 −0.3378 167.85 −592.08 0.6622 12 ×12 0
0
9× 4
1
0
0
0
1
0
−0.0004
−0.0009
0.0004
0 0 0.000912 ×8
1 0 1 0 C=
0
1
1
−96.3
0
−728.7
1
197.9
0
6×3
0 0
0
4×3
0
1
0
141.39
0
1
−11.3
1
0
735.35
0
1
5.5
1
0
1283.9
0
1
−56.5
2×3
0
1× 3
0
2×3
4×3
0
1
0
0
1
1× 3
6×3
−141.4 133.79 0
1× 3
1
0
4×3 0 6×3 0 2×3 0 54.67 12×13
Matrices A(1), A(2), and B(1), B(2) are the same as those of configuration C1.
13.8 APPENDIX 13.3: SYSTEM MATRICES OF CONFIGURATION C3

0.7031 −0.7290 0.0008 A(1) = 0 0 0.0008
−0.2371
44.5677
−0.7031
0.2371
0
0.4178
109.4068
0.7290
−0.4178
0
0.0006
0.8791
−0.0008
−0.0006
0
0
0
0 0.0006
0
0
0
−0.1209 0
6×6
−0.0008
0
0
0
0
−0.0006
0
6×6
1 I
6×6
12×12
I 3× 3 A(2) = 0 6×3 3× 3 0 I 3× 3 0 3× 3 A(3) = 3× 3 0 3× 3 0
−0.7325
−0.1143
0
−0.2675
0.1143
50.907
0.6568
−1.2806
0
−0.6568
0.2806
−0.0007
0.0003
0
0.0007
−0.0003
0
0
0
0
0
0
0
0
0
0
0
0
−00.0007
0.0003
1
0.0007
−0.0003
−0.1995
−0.3420
0
0.1995
0.3420
152.37
−0.1437
−0.9386
0
0.1437
0.9386
−27.341
−0.0007
0.0003
0
0.0007
−0.0003
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
−0.9531
−0.1992
0
0.1151
−1.4890
0
−0.0001
0.0005
0
0.0000
0.0000
0
0.0000
0.0000
0
−0.0001
0.0005
1
−0.8597
−0.5960
0
−0.0252
0
0
3× 3
3× 3
3× 3
124.97 −0.1381
−0.1381
−0.0469
0.1992
−0.1151
0.4890
0.0001
−0.0005
0.0000
0.0000
0.0000
0.0000
0.0001
−0.0005
−0.1403
0.5960
0.0252
−0.1070
0
−0.0001
0.0005
0
0.0001
−0.0005
−0.8899
−0.4676
0
0.8899
0.4676
0.1078
0.5420
0.0001
−0.0005
−0.1078
−0.5420
0
−0.0001
0.0005
0
0
3× 3
0
6×3
0.8619
−0.8930
I
0
3× 3
I
3× 3
12×12
−0.4319 0.0000 0.0000 −0.4319 476.41 −85.412 −0.4319 373.80 −366.08 0.5681 12×12 159.22 390.85
1 0 0.0016 B(1) =
0
0
0
1
0
0
−0.0012
−0.0016
0.0012
0
0
0
0
1
0
0.0012
0
−0.0012
−0.2371
0.2969
0.2371
0.4178
0.7290
0.5822
0.0006
−0.0008
−0.0006
0.7325 −0.6568 0.0007 1 0 0.0007 B(3) = 0.1995 0.1437 0.0007
0.1143
0.2675
−0.1143
1.2806
0.6568
−0.2806
−0.0003
−0.0007
0.0003
0
0
0
1
0
0
−0.0008
0
0 0.0007
6× 4
0
0
0
1
0
0
−0.0003
−0.0007
0.0003
0.3420
0.8005
−0.3420
0.9386
−0.1437
0.0614
−0.0003
−0.0007
0.0003
0
3× 4
0 0 0 12×8
6× 4
−0.0006 1
0
0
6 ×8
0.7031 −0.7290 0.0008 1 0 0.0008 B(2) =
0.0006
3× 4
1 3× 4
0
0
0
1
0
−0.0032
−0.0007
0
0
0
0
0
0
0
0
0
0
9× 4
1
0
0
0
1
0
−0.0004
−0.0009
0.0004
0 0 0.0032 0 0 12×8 0 0 0 0.0009 12×8
1 0 1 0 C=
0
1
1
−96.3
0
−728.7
1
197.9
0
0
6×3
0
4×3
0
1
0
−690.13
0
1
16.97
1
0
−96.13
0
1
33.77
1
0
452.37
0
1
−28.23
2×3
0
2×3
4×3
0
1
0
0
1
6×3
−141.4 133.79 1 0 1
0
8× 3
0
8× 3
0
0
8× 3
1 0 1 0
4×3 0 6×3 0 2×3 0 0 −548.67 1 764.48 927.63 0 1 623.18 0 1176.5 1 −281.92 0 −56.57 1 −140.52 20 ×12
13.9 APPENDIX 13.4: SYSTEM MATRICES OF CONFIGURATION C4

0 0 −0.0003 −1.1129 −0.2770 A(2) = −0.0003 −0.7752 −0.3376 −0.0003
0
0
0
0
0.0005
1
0.1987
0
−0.5122
0
0.0005
0
−0.3959
0
−0.4055
0
0.0005
0
0
3× 3
0 0
3× 3
0 0.0003
I
3× 3
0
0
3× 3
3× 3
0
0
0
0
−0.0005
−0.1806
−0.1129
−0.1987
−66.5929
0.2770
−0.4878
−163.4753
0.0003
−0.0005
−0.1806
0.7752
0.3959
132.6691
0.3376
0.4055
−199.2403
0.0003
−0.0005
0.8194
0
3× 3
0
3× 3
0
0
I
3× 3
3× 3
3× 3
12×12
0 0 0 −1.0023 −0.0057 0 A(3) = −0.9954 −0.0069 0 −0.9969 −0.0110 0 1 0 0.0003 1.1129 0.2770 0.0003 B(3) = 0.7752 0.3376 0.0003
0
0
0
0
0.0004
1
0
−0.0004
0.1386
0
0.0023
−0.1386
0
3× 6
0
0
0
0
0
−0.6597
0
0.0057
−0.3403
0.0004
0
0
−0.0004
−0.2762
0
−0.5852
I
6×6
−0.0046
0.2762
0
0.0069
−0.4148
0.0004
0
0
−0.0004
−0.1868
0
0.9969
0.1868
0.0110
0.3410
−0.3410
0
0.0004
0
0 1
0
3× 6
0
0
0
−0.0005
−0.0003
0.0005
−0.1987
−0.1129
0.1987
0.5122
−0.2770
0.4878
−0.0005
−0.0003
0.0005
0.3959
0.2248
−0.3959
0.4055
−0.3376
0.5945
−0.0005
−0.0003
0.0005
0
3× 4
−0.0004
0
0
−0.3378 −124.54 −305.72 −0.3378 248.11 −372.61 −0.3378 167.85 −592.08 0.6622 12 ×12 0
0
9× 4
1
0
0
0
1
0
−0.0004
−0.0009
0.0004
0 0 0.000912 ×8
Matrices A(1), A(2), and B(1), B(2) are the same as those of configuration C3.
REFERENCES

1. Wood, K.L. and Antonsson, E.K., Computations with imprecise parameters in engineering design: background and theory, ASME Transactions, Journal of Mechanisms, Transmissions, and Automation in Design, 111, 616, 1989.
2. Antonsson, E.K. and Otto, K.N., Imprecision in engineering design, ASME Transactions, Journal of Mechanical Design, 117B, 25, 1995.
3. Montgomery, D.C., Introduction to Statistical Quality Control, 5th ed., John Wiley & Sons, New York, 2003.
4. Kazmer, D. and Roser, C., Evaluation of product and process design robustness, Research in Engineering Design, 11, 22, 1999.
5. Ceglarek, D., Shi, J., and Wu, S.M., A knowledge-based diagnostic approach for the launch of the autobody assembly process, ASME Transactions, Journal of Engineering for Industry, 116, 491, 1994.
6. Thornton, A.C., A mathematical framework for the key characteristic process, Research in Engineering Design, 11, 145, 1999.
7. Whitney, D.E., Gilbert, O., and Jastrzebski, M., Representation of geometric variations using matrix transforms for statistical tolerance analysis in assemblies, Research in Engineering Design, 6, 191, 1994.
8. Parkinson, A., Sorensen, C., and Pourhassan, N., A general approach for robust optimal design, ASME Transactions, Journal of Mechanical Design, 115, 74, 1993.
9. Parkinson, A., Robust mechanical design using engineering models, ASME Transactions, Journal of Mechanical Design, 117B, 48, 1995.
10. Ceglarek, D. and Shi, J., Design evaluation of sheet metal joints for dimensional integrity, ASME Transactions, Journal of Manufacturing Science and Engineering, 120, 452, 1998.
11. Thornton, A.C., Variation risk management using modeling and simulation, ASME Transactions, Journal of Mechanical Design, 121, 297, 1999.
12. Thornton, A.C., Quantitative selection of variation reduction plans, ASME Transactions, Journal of Mechanical Design, 122, 185, 2000.
13. Cai, W., Hu, S.J., and Yuan, J.X., A variational method of robust fixture configuration design for 3-D workpieces, ASME Transactions, Journal of Manufacturing Science and Engineering, 119, 593, 1997.
14. Wang, M.Y., An optimum design approach to fixture synthesis for 3-D workpieces, Transactions of NAMRI/SME, XXVII, 209, 1999.
15. Soderberg, R. and Carlson, J.S., Locating scheme analysis for robust assembly and fixture design, Proceedings of the 1999 ASME Design Engineering Technical Conferences, September 12–15, Las Vegas, NV, 1999.
16. Wang, F.K. et al., Comparison of three multivariate process capability indices, Journal of Quality Technology, 32, 263, 2000.
17. Chen, C.T., Linear System Theory and Design, Harcourt Brace Jovanovich Inc., Orlando, FL, 1984.
18. VSA, VSA-3D Release 12.5 User Manual, Variation System Analysis, Inc., MI, 1998.
19. Chen, H., A multivariate process capability index over a rectangular solid tolerance zone, Statistica Sinica, 4, 749, 1994.
20. Efron, B., The Jackknife, the Bootstrap, and Other Resampling Plans, SIAM, Philadelphia, 1982.
21. 3DCS Analyst, Dimensional Control Systems, Inc., Troy, MI, http://www.3dcs.com, 2004.
22. Ding, Y., Ceglarek, D., and Shi, J., Design evaluation of multi-station manufacturing processes by using state space approach, ASME Transactions, Journal of Mechanical Design, 124, 408, 2002.
14
Optimal Fixture Layout Design*
Following in the footsteps of Chapter 13, this chapter and the next consider, respectively, the problems of optimally designing a process configuration and of deciding a sensor deployment strategy in a multistation assembly process. Fixture layout design, a critical aspect of process configuration design, is considered in this chapter. The objective is to find the fixture layout for which the variability of the final assembly is least sensitive to fixture variation inputs. One of the challenges raised by a multistation design is that a high-dimensional design space, which usually contains many local optima, has to be explored. Consequently, global optimality is more difficult to achieve and, if an inefficient algorithm is used, the computing time may become prohibitive. To address this challenge, we devise a data-mining-aided optimal design method in this chapter, which is able to find a competitive design solution with a remarkably reduced computation cost.

This chapter unfolds as follows. Section 14.1 describes the background of the problem and reviews the relevant literature. Section 14.2 presents the design criterion, as well as the optimization formulation used for achieving the above-stated design objective. Section 14.3 lays out the details of the data-mining-aided design algorithm. Section 14.4 compares the performance of the data-mining-aided optimization algorithm with several widely used methods and presents the resulting fixture layout.
* Part of chapter material is based on Reference 42 and Reference 48.

14.1 INTRODUCTION

As we stated in Part I of the book, dimensional quality control is one of the major challenges in discrete-part manufacturing processes, such as the automotive assembly process. Fixtures are used extensively to provide physical support and to coordinate references of parts and subassemblies. As a result, fixture layout design greatly affects the dimensional accuracy of the final products.

Earlier research on fixture design employed kinematical and mechanical analysis to explore accessibility, detachability, and location uniqueness of a fixture, aiming at the automatic generation of fixture layouts [1]. Heuristic algorithms were developed for automatic generation of fixture configurations [2,3]. Trappey and Liu [4] summarized the research before 1990 on fixture-design automation, and a more recent summary can be found in Section 1 of Cai et al. [5]. These fixture designs are considered deterministic approaches because they consider neither random manufacturing errors of fixture elements nor workpiece positioning errors induced by fixturing operations. Because a workpiece or a fixture
element is unavoidably subject to manufacturing error, researchers studied the problem of robust fixture design in a stochastic environment. One branch of robust fixture design aims at finding optimal fixture positions that minimize the deflection of a compliant workpiece under working load [6–12]. This research usually does not consider manufacturing errors of fixture elements. However, fixture-related local deformation and microslippage are considered error sources [8,9].

Another branch of robust fixture design is known as the variational approach because it considers fixture error or workpiece surface error and tries to find an optimal fixture layout that makes the positioning accuracy of a workpiece insensitive to input errors [5,13–15]. Variational fixture design often starts with developing a sensitivity measure that characterizes the robustness of a fixture system. This sensitivity measure is determined by the fixture layout and is independent of the fixture error inputs. The smaller the sensitivity, the more robust the fixture system. For example, Wang [13] maximized the determinant of the Fisher information matrix (D-optimality), which is the inverse of the sensitivity matrix, and Cai et al. [5] minimized the Euclidean norm of the sensitivity matrix. Meanwhile, heuristic or rule-based methods have also been developed for designing robust fixture layouts [15]. Research work presented by Rong et al. [16], Choudhuri and DeMeter [17], Ding et al. [18], and Carlson [19], among others, is also relevant in the sense that it provides variation/tolerance analysis of a fixture system, although it does not address the issue of fixture synthesis.

In the past, variational fixture designs were conducted mainly at the single-machine level rather than at the multistation system level, i.e., the fixture layout being optimized was limited to a single workstation.
In a typical multistation assembly process, dimensional variation can originate from fixture elements on every station, propagate along the production line, and accumulate on the final assembly. A multistation fixture layout design should optimize the locations of fixtures on every assembly station for the whole system. A stationwise optimization of fixture layout is different from a systemwide optimization. Consider the sport utility vehicle (SUV) side panel assembly process presented in Section 6.4. Suppose that one had optimized the positions of P1, P2, P3, and P4 on Station I. Note that P1 and P4 will be reused on Station II. Thus, when a stationwise optimization is carried out on Station II, one could choose to optimize all fixtures on Station II as though P1 and P4 were not optimized on Station I; or one could retain the optimized positions of P1 and P4 and only optimize the fixture layout (P5 and P6) that supports the newly added part. Neither approach is guaranteed to lead to an overall optimal fixture layout in a multistation process.

Three aspects should be addressed for multistation fixture optimization: (1) a variation propagation model that links fixture variation inputs on every station to product dimensional variation, (2) a quantitative design measure that benchmarks the sensitivity of different fixture layouts, and (3) optimization algorithms that find the optimal fixture layout. Research on multistation fixture optimization is limited because of the inherent difficulty resulting from all three aspects.
TABLE 14.1 Comparison of Fixture Design Methodologies

Problem Domain                                   Methodologies
Fixture design
  Deterministic design                           [1–4]
  Robust design for minimal deflection           [6–12]
  Variational robust design
    Single station                               [5,13–17,19]
    Multistation, modeling and analysis          [20–25,46,47]
    Multistation, fixture optimization           To be presented in this chapter
In Part I of this book, we have demonstrated that variation propagation in a multistation process can be modeled by using a station-indexed state space model. The resulting state space models take care of the issues of variation modeling and variation analysis. Similar variation analysis has also been performed using a datum-machining surface relationship graph [24,25]. In Chapter 13, we developed a multilayer sensitivity-based design evaluation [18], which provides a quantitative benchmark for different fixture layouts. Given those developments, this chapter continues the development of design criteria (similar to the sensitivity indices from Chapter 13) and optimization algorithms for multistation fixture layout design. The methodologies reviewed in this section are summarized in Table 14.1.
14.2 DESIGN CRITERIA FOR VARIATION REDUCTION

In Equation 13.8, we defined a system-level sensitivity index that characterizes the system variation response to the input variance components. Here, we define a similar sensitivity index in a sum-square sense so that not only the variance components but also any mean shift associated with a fixture locator is taken into account. In Equation 13.2, we already converted the state space model in Equation 9.1 and Equation 9.2 into an input-output linear model, for the case when quality measurements are only taken at the end of the assembly line. Following the same arguments used in Section 13.2, in this fixture design problem, our focus is on the first term in Equation 13.2, i.e., $\sum_{k=1}^{N} C\Phi_{N,k}B_k u_k$, because it represents the fixture error inputs from all $N$ stations. Here, we simplify Equation 13.2 as

\[ \hat{y} \equiv D u = \sum_{k=1}^{N} C\Phi_{N,k} B_k u_k \tag{14.1} \]

where $D \equiv [\,C\Phi_{N,1}B_1 \;\; C\Phi_{N,2}B_2 \;\; \cdots \;\; CB_N\,]$, $u^T \equiv [\,u_1^T \;\; \cdots \;\; u_N^T\,]$, and $\hat{y}$ is the fixture-induced product variation. We use $\hat{y}^T\hat{y}$, the sum of squares of product deviations, to benchmark the overall level of product dimensional nonconformity. Thus, product quality is optimized if
$\hat{y}^T\hat{y}$ is minimized. Given $\hat{y}^T\hat{y} = u^T D^T D u$, the problem is equivalent to minimizing $u^T D^T D u$. However, $u^T D^T D u$ is an input-dependent quantity. Because our goal is to find a fixture layout in which product quality is insensitive to fixture inputs, we need a design criterion or a sensitivity index that is determined only by the fixture design information (modeled by $D$) and is independent of the variation inputs (represented by $u$). For a single input-output pair, the sensitivity can be defined as $S_{i,j} = y_i/u_j$, where $y_i$ is the $i$th product feature and $u_j$ is the $j$th fixture error input. For the entire assembly system with multiple inputs and multiple features, an intuitive way to define the sensitivity index is as

\[ S \equiv \sqrt{\frac{\hat{y}^T\hat{y}}{u^T u}} = \sqrt{\frac{u^T D^T D u}{u^T u}} \tag{14.2} \]

The square root is taken to make the unit of the sensitivity index the same as that of the input-output variables. The difficulty associated with this definition is that $S$ is still input dependent. Clearly, $D^T D$ plays a determining role in the preceding definition, which motivates researchers to define the sensitivity index using a measure of $D^T D$. Research has been conducted on a similar problem in experimental design, and several optimality criteria have been proposed [26–28]. The often-used criteria include D-optimality, which is to minimize $\det(D^T D)$; A-optimality, which is to minimize $\mathrm{tr}(D^T D)$; and E-optimality, which is to minimize the largest eigenvalue of $D^T D$, where $\mathrm{tr}(\cdot)$ and $\det(\cdot)$ are the trace and the determinant of a matrix, respectively. These three measures are related to one another through the eigenvalues of $D^T D$, $\{\lambda_i\}_{i=1}^{p}$, where $p$ is the number of columns of $D$. They can be expressed as

\[ D_{\mathrm{opt}}:\ \det(D^T D) = \prod_{i=1}^{p} \lambda_i; \qquad A_{\mathrm{opt}}:\ \mathrm{tr}(D^T D) = \sum_{i=1}^{p} \lambda_i; \qquad E_{\mathrm{opt}}:\ \lambda_{\max} \]
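The relation between the three criteria and the eigenvalues can be checked numerically. The following is a minimal sketch, not part of the original development: it assumes a hypothetical sensitivity matrix with only two columns, so that the eigenvalues of $D^T D$ follow in closed form from its trace and determinant. The matrix entries and the function names `gram` and `criteria_2col` are illustrative assumptions.

```python
import math

def gram(D):
    """Return D^T D for a matrix D given as a list of rows."""
    p = len(D[0])
    return [[sum(row[i] * row[j] for row in D) for j in range(p)] for i in range(p)]

def criteria_2col(D):
    """D-, A-, and E-criteria of D^T D for a matrix D with exactly two columns."""
    (a, b), (_, c) = gram(D)                 # symmetric 2x2 matrix [[a, b], [b, c]]
    det, tr = a * c - b * b, a + c
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    lam_min, lam_max = (tr - disc) / 2.0, (tr + disc) / 2.0
    return {"D": det, "A": tr, "E": lam_max, "eigs": (lam_min, lam_max)}

# Hypothetical 3x2 sensitivity matrix (illustrative numbers only).
D = [[1.0, 0.5],
     [0.0, 2.0],
     [1.0, 1.0]]
crit = criteria_2col(D)
lam_min, lam_max = crit["eigs"]
print("det =", crit["D"], " product of eigenvalues =", lam_min * lam_max)
print("tr  =", crit["A"], " sum of eigenvalues     =", lam_min + lam_max)
print("largest eigenvalue =", crit["E"])
```

For a matrix with more columns, the same quantities would be obtained from a general symmetric eigenvalue routine; the two-column case is used here only to keep the sketch self-contained.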
The D-optimality criterion is the most widely used in experimental designs for the following two reasons [26,27]:

1. For experimental designs, this criterion has a clear interpretation. The D-optimality criterion is equivalent to minimizing the prediction variance from an estimated model or the variances of least-squares estimates of unknown parameters.
2. It possesses an invariant property under scaling, i.e., experiments can be designed using a group of standardized dimensionless variables (say, all variables are in [−1, 1]) instead of the original physical variables.

In fact, this D-optimality criterion was also used in solving problems of fixture design and sensor placement by Wang and his colleagues [13,14,29]. However, in the multistation fixture system under consideration, the unique phenomenon of reorientation (please refer to Subsection 6.2.2) will cause the state
transition matrix $A$ to be singular. We have elaborated in Section 11.1 and Section 11.2 that this singularity of $A$ can further cause $C\Phi_{N,i}B_i$ to be rank deficient even if the $C$ and $B$ matrices are of full rank. As a result, matrix $D$ is less than full rank so that $D^T D$ is singular. In light of this singularity in $A$ as well as in $D$, we need to reconsider the design criterion.

When $D^T D$ is singular, at least one of its eigenvalues is zero, i.e., $\det(D^T D) = 0$. Recalling the reason why $A$ is singular (explained in Section 11.1), we know that this singularity issue cannot be resolved by simply changing the positions of fixture locators on a station. It is an inherent problem caused by the fixturing mechanism under consideration in a multistation assembly process. This fact implies that even if we choose new positions for fixture locators, $\det(D^T D)$ will remain zero. It is fair to conclude that $\det(D^T D)$, the D-optimality criterion, is noninformative in this multistation fixture design.

Given the singularity problem of design matrix $D$, we consider the A-optimality or the E-optimality to be an informative criterion for the multistation fixture design. We recommend the use of E-optimality because it has a clearer physical interpretation. It is known that

\[ S \equiv \sqrt{\frac{u^T D^T D u}{u^T u}} \le \sqrt{\lambda_{\max}(D^T D)} \quad \text{for any } u \ne 0 \tag{14.3} \]

That is, E-optimality, which minimizes $\lambda_{\max}(D^T D)$, is equivalent to minimizing the upper sensitivity bound of the fixture system. This criterion can also be derived using the definition of the matrix 2-norm. Defining the upper bound of sensitivity as $S_{\max}$, it follows from the definition of the matrix 2-norm [30] that

\[ S_{\max} \equiv \sup_{u \ne 0} \sqrt{\frac{u^T D^T D u}{u^T u}} = \|D\|_2 = \sqrt{\lambda_{\max}(D^T D)} \tag{14.4} \]

In other words, the E-optimality criterion, $\lambda_{\max}(D^T D)$, is the square of the 2-norm of design matrix $D$. It is not difficult to see that the sensitivities defined in Section 13.2 are also E-optimality-type criteria.

We cannot rule out the possible use of A-optimality in this multistation fixture design problem. Because an eigenvalue of $D^T D$ represents the sensitivity level related to one particular input-output pair for a canonical variation model, $\mathrm{tr}(D^T D)$ is the summation of sensitivities related to all input-output pairs, representing the overall sensitivity level of the fixture system. Using A-optimality can be considered to be minimizing the summation of sensitivities. By contrast, E-optimality is more conservative because it attempts to reduce the maximum sensitivity index. This conservativeness actually makes E-optimality more easily accepted by practitioners, because minimization of the maximum sensitivity is consistent with the Pareto principle in quality engineering.

Based on our experience with this multistation fixture design, we caution against the use of D-optimality in general engineering system designs. Engineering system
designs are different from experimental designs in many aspects. The differences could cause the advantages of using D-optimality in an experimental design to be inapplicable to an engineering design problem. The major differences include:

1. Engineering design problems are often accompanied by complex constraints, for example, the geometric constraints imposed by the shape of a part in the SUV side frame assembly process. This type of complexity makes it almost impossible to design an engineering system based on a group of dimensionless standardized variables. In this regard, the invariant property of D-optimality becomes much less attractive to general engineering designs.
2. The complexity of engineering systems often results in ill-conditioned systems with some eigenvalue of $D^T D$ close to zero or even singular systems (such as our multistation fixture system). Because the purpose of D-optimality is to minimize the product of all eigenvalues, it is possible in the presence of ill-conditioned systems that the near-zero eigenvalue is forced to become zero while leaving other eigenvalues uncontrolled as though a perfect D-optimal condition had been achieved. Obviously, this is an undesirable result. This problem is less likely to occur, though, in an experimental design or in a well-posed system (see Wang and Nagarkar [29] for a detailed discussion).

In the rest of this chapter, we will use the E-optimality criterion for determining a robust fixture system in a multistation panel assembly process. The design parameters are the locations of fixture locators, denoted as $\theta_L \equiv [\,X_1 \;\; Z_1 \;\; \cdots \;\; X_{n_L} \;\; Z_{n_L}\,]^T$, where $n_L$ is the total number of locators used in a process, e.g., $n_L = 8$ for the process in Figure 6.10. Mathematically, the optimization scheme is expressed as

\[ \min_{\theta_L} S(\theta_L) \equiv \|D\|_2 \quad \text{subject to } G(\theta_L) \ge 0 \tag{14.5} \]

where $G(\cdot)$ captures geometrical constraints on the locations of fixture locators, imposed by geometries of parts. Please note that the $S(\cdot)$ in Equation 14.5 is actually $S_{\max}(\cdot)$ in Equation 14.4. We drop the subscript "max" hereafter for notational simplicity.
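The behavior of the E-optimality criterion on a singular design matrix can be illustrated with a small numerical sketch. The matrix below is a hypothetical, rank-deficient $D$ (its third column is the sum of the first two), chosen only to mimic the singularity discussed in this section; it is not derived from any assembly model. The function names are assumptions, and power iteration is used here merely as one simple way to estimate $\|D\|_2$.

```python
import math
import random

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def sensitivity(D, u):
    """S(u) = sqrt(u^T D^T D u / u^T u), as in Equation 14.2."""
    y = matvec(D, u)
    return math.sqrt(sum(t * t for t in y) / sum(t * t for t in u))

def s_max(D, iters=500, seed=0):
    """Estimate S_max = ||D||_2 by power iteration on D^T D."""
    rng = random.Random(seed)
    Dt = transpose(D)
    u = [rng.gauss(0.0, 1.0) for _ in D[0]]
    for _ in range(iters):
        w = matvec(Dt, matvec(D, u))
        norm = math.sqrt(sum(t * t for t in w))
        u = [t / norm for t in w]
    return sensitivity(D, u), u

# Hypothetical rank-deficient matrix: column 3 = column 1 + column 2.
D = [[1.0, 0.0, 1.0],
     [0.0, 2.0, 2.0],
     [1.0, 1.0, 2.0]]

smax, u_star = s_max(D)
rng = random.Random(1)
trials = [sensitivity(D, [rng.gauss(0.0, 1.0) for _ in range(3)]) for _ in range(1000)]
print("S_max estimate:", smax)
print("largest S(u) over random inputs:", max(trials))
print("S(u) along the null direction [1, 1, -1]:", sensitivity(D, [1.0, 1.0, -1.0]))
```

In this sketch the input direction $[1, 1, -1]^T$ produces no output deviation, so $D^T D$ has a zero eigenvalue and its determinant is zero regardless of the other entries, whereas the estimated $S_{\max}$ remains a finite, informative bound on every sampled $S(u)$.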
14.3 DATA-MINING-AIDED DESIGN ALGORITHM

Equation 14.5 captures a general formulation of a nonlinear optimization problem: $S(\cdot)$ is the objective function, $G(\cdot)$ is the constraint function, and $\theta_L$ is the vector of design parameters. In Equation 14.5, without loss of generality, we present a minimization problem. A maximization problem can be solved in the same fashion. To solve the problem, nonlinear programming methods, such as sequential quadratic programming [31] or simplex search [32], can be employed to find the
optimal solution. They usually converge to a solution in a relatively short time, but the quality of the final solution depends greatly on the selection of an initial design. These methods are known as local optimization methods, because the solutions are easily entrapped in a local optimum. To escape local optima, one would prefer to use a random-search-based method such as the genetic algorithm (GA) [33] or the simulated annealing algorithm (SA) [34]. Empirical evidence shows that GA and SA are indeed quite effective in escaping local optima but at the expense of considerably slower convergence, and are thus impractical when the computation cost of evaluating an objective function is high [35].

As for the nonlinear optimization problem in Equation 14.5, the dimension of the design space is $2n_L = 16$ for the four-station, 2-D assembly process in Figure 6.10, a design space hardly small enough for the nonlinear programming methods to be effective. The objective function $S(\cdot)$ is generally accompanied by complicated constraints. If we consider the flexibility of parts during assembly operations, $S(\cdot)$ will have to be evaluated using computationally expensive finite element analysis (FEA) codes [22]. In that case, GA or SA would usually involve a considerably long computation time. In this chapter, instead of looking for the global optimum, we try to find a good balance in an algorithm that is able to find a competitive design solution with a remarkably reduced computation cost.
14.3.1 OVERVIEW OF THE DATA-MINING-AIDED DESIGN
An optimal design problem concerns selecting the best of alternative designs from a candidate design space subject to certain constraints. Following this logic, the idea of applying data-mining methods to design optimization was developed [35,36]. That is, if the design alternatives are treated as a data set, a data-mining method may be able to discover valuable structures within the data set and generalize design selection rules leading to a much smaller good design subset, which is more likely to yield a better design solution even if a local optimization method is applied. Here, the data-mining method usually refers to various classification methods.

The idea is illustrated in Figure 14.1. A data-mining method generalizes the design selection rules based on the training data in a design library, which is in turn created either from a collection of historical design results or from random sampling among design alternatives. The resulting selection rules are often expressed as a classification tree, or equivalently, a set of "if-then" rules. Then, the large number of design alternatives will pass through the selection rules, and certain local optimization methods will be applied to the selected good designs to find the final optimal design. Schwabacher et al. [35], for instance, applied this idea in the prototype selection of structures of a racing yacht and a supersonic aircraft, respectively, where their design library is created from historical designs and a C4.5 decision tree [37] is used to generate the design selection rules.

FIGURE 14.1 Design optimization utilizing data-mining methods. [Flowchart: historical designs and random sampling of the design alternatives feed a design library; data-mining methods derive design selection rules from the library; the design alternatives pass through the rules to give a good design subset; local optimization methods then yield the optimal design.]

Although the general idea described in Figure 14.1 could help in discovering valuable design selection guidelines, there is a major obstacle to applying this idea to engineering design problems, especially those with a computationally expensive objective function. The obstacle is that for a new design without enough historical data, generation of design selection rules requires evaluating the objective function of all designs in a design library. For the design library to be representative of a large volume of design alternatives, one will have to include a large enough number of designs in the library, potentially too many to be computationally affordable for generating the design selection rules.

For example, for the 2-D assembly process in Figure 6.10, we generate a finite candidate design space via discretization, say, using a resolution of 10 mm (the size of a locator's diameter) on each panel. This resolution level results in the number of candidate locations on each panel being $n_1 = 308$, $n_2 = 905$, $n_3 = 396$, and $n_4 = 6204$, respectively, where $n_j$ denotes the number of candidate locations on panel $j$.
Because the two locators used on each panel are of different types, the layout of P1 = $(X_1, Z_1)$ and P2 = $(X_2, Z_2)$ may generate a different response in product dimensional variability than the layout of P1 = $(X_2, Z_2)$ and P2 = $(X_1, Z_1)$ does. Thus, the total number of design alternatives is $2C_2^{308} \times 2C_2^{905} \times 2C_2^{396} \times 2C_2^{6204} \approx 4.65 \times 10^{23}$, where $C_a^b$ is the combination operator, namely, the number of ways $a$ objects can be selected from a set of size $b$. Clearly, the number of design alternatives is overwhelmingly large. Even if the objective function is computationally inexpensive, it is going to be impractical to apply the idea presented in Figure 14.1.

In the design of a civil structure, Igusa et al. [36] proposed a more sophisticated idea that circumvents frequent evaluation of an objective function. They recommend employing a much simpler feature function together with a clustering method to reduce the number of designs whose objective function needs to be evaluated for the generation of a classification tree.

Following the general idea proposed by Igusa et al. [36], we develop in this chapter a data-mining-aided optimization method for the aforementioned multistation fixture layout design. The method includes the following components: (1) a uniform-coverage selection method, which chooses design representatives from among a large number of original design alternatives for a nonrectangular design space; (2) feature functions whose evaluation is computationally economical as the surrogate for the design objective function; (3) a clustering method that generates a design library based on the evaluation of feature functions instead of an objective function; and (4) a classification method to create the design selection rules, eventually leading us to a competitive design. This procedure will allow us to eventually have an affordable number of designs as a training data set in a design library. The overall framework is illustrated in Figure 14.2, which is Figure 14.1 appropriately modified.

FIGURE 14.2 Modified data-mining aided design optimization procedure. [Flowchart: design alternatives pass through uniform-coverage selection to give design representatives; a clustering method with feature evaluation produces the design library; classification yields the design selection rules and a good design subset; local optimization then gives the optimal design.]
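The size of the discretized design space quoted in this subsection can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; it simply evaluates twice the number of unordered location pairs on each panel (the factor of 2 accounts for which location receives which type of locator) and multiplies across the four panels. The variable names are assumptions.

```python
from math import comb

# Candidate locations per panel at 10-mm resolution, as quoted in the text.
n_locations = [308, 905, 396, 6204]

total = 1
for n_j in n_locations:
    # Ordered locator pairs on one panel: 2 * C(n_j, 2) = n_j * (n_j - 1).
    total *= 2 * comb(n_j, 2)

print(f"number of design alternatives: {total:.3e}")
```

Under these assumptions the product evaluates to roughly 4.66 × 10^23, which agrees with the quoted figure up to rounding.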
14.3.2 CANDIDATE DESIGN SPACE

Before getting into the details of the proposed design method, we first describe the design space for candidate locators (called the candidate design space). The candidate design space imposed by $G(\cdot)$ in Equation 14.5 is different from the natural boundary of each panel. The boundary of the candidate design space should be at least 35 mm away from the edge of a panel as an engineering safety requirement, because a locating hole that is too close to the edge may not be able to withstand the load exerted during fixturing. In addition, we know that a part-positioning deviation is more sensitive to locating deviations when both locators are close to each other than when they are farther apart. This rule suggests that two locators on the same panel should be located far enough away from each other, so the neighborhood around a panel's geometric center (GC) is ruled out of consideration for the candidate design space.

The determination of this neighborhood is illustrated in Figure 14.3a. Distances from the GC to the vertex points of a panel are calculated, and their median value is chosen to represent the size of the panel, denoted as $d_0$. A hypothetical circle is drawn on the panel with the GC as its center and $d_0/2$ as its radius. The area inside this hypothetical circle is considered to be the neighborhood of a panel's GC. The use of the median of all GC-to-vertex distances in determining $d_0$, rather than their mean value, makes the resulting $d_0$ less sensitive to a very large or a very small GC-to-vertex distance on panels with an irregular shape (recall that a median is a more robust statistic than a mean value). The value of $d_0/2$ is an empirical choice. Our experience indicates that the choice is actually quite conservative, i.e., after removing candidate locations using $d_0/2$ as the neighborhood radius, we did not see much difference in terms of the best-found sensitivity value of fixture layouts. One can certainly decrease the neighborhood radius to be on the safe side.

The resulting candidate design space imposed by $G(\cdot)$ is shown as the shaded areas in Figure 14.3b, to which all the optimal design methods discussed later will be applied and their performances compared. If the candidate design space of each panel is discretized using the same 10-mm resolution, the numbers of candidate fixture locations are $n_1 = 239$, $n_2 = 707$, $n_3 = 200$, and $n_4 = 3496$, respectively. After eliminating the locator pairs whose separation is smaller than $d_0/2$ on each panel, we still have as many as $1.09 \times 10^{21}$ possible combinations of locator layouts. This number is still too large to be computationally affordable for design optimization.

FIGURE 14.3 (a) Neighborhood of a GC and (b) candidate design space on SUV side frames. [Panel (a) marks the boundary of the neighborhood around a panel's GC; panel (b) shows the shaded candidate design space on panels 1 to 4.]
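The construction of the candidate design space described above can be sketched in code. The example below is a minimal illustration under stated assumptions, not the implementation used for the SUV side frames: the panel is a hypothetical trapezoid, the GC is assumed to be the mean of the vertex coordinates, and all function names are illustrative. It applies the three rules from the text: a 10-mm grid, a 35-mm margin from every edge, and exclusion of the circle of radius $d_0/2$ around the GC, with $d_0$ the median GC-to-vertex distance.

```python
import math
from statistics import median

def point_in_polygon(pt, poly):
    """Ray-casting test; poly is a list of (x, z) vertices in boundary order."""
    x, z = pt
    inside = False
    for (x1, z1), (x2, z2) in zip(poly, poly[1:] + poly[:1]):
        if (z1 > z) != (z2 > z):                       # edge straddles the horizontal ray
            if x < x1 + (z - z1) * (x2 - x1) / (z2 - z1):
                inside = not inside
    return inside

def dist_to_segment(pt, a, b):
    """Shortest distance from pt to the segment a-b."""
    dx, dz = b[0] - a[0], b[1] - a[1]
    t = ((pt[0] - a[0]) * dx + (pt[1] - a[1]) * dz) / (dx * dx + dz * dz)
    t = max(0.0, min(1.0, t))
    return math.hypot(pt[0] - (a[0] + t * dx), pt[1] - (a[1] + t * dz))

def candidate_locations(poly, step=10.0, margin=35.0):
    """Grid points inside the panel, at least `margin` from every edge,
    and outside the circle of radius d0/2 around the GC."""
    n = len(poly)
    gc = (sum(x for x, _ in poly) / n, sum(z for _, z in poly) / n)  # assumed GC: vertex mean
    d0 = median(math.dist(gc, v) for v in poly)        # panel size: median GC-to-vertex distance
    edges = list(zip(poly, poly[1:] + poly[:1]))
    x_min, x_max = min(x for x, _ in poly), max(x for x, _ in poly)
    z_min, z_max = min(z for _, z in poly), max(z for _, z in poly)
    points = []
    for i in range(int((x_max - x_min) / step) + 1):
        for k in range(int((z_max - z_min) / step) + 1):
            p = (x_min + i * step, z_min + k * step)
            if (point_in_polygon(p, poly)
                    and min(dist_to_segment(p, a, b) for a, b in edges) >= margin
                    and math.dist(p, gc) > d0 / 2):
                points.append(p)
    return points, gc, d0

# Hypothetical trapezoidal panel, dimensions in mm.
panel = [(0.0, 0.0), (400.0, 0.0), (300.0, 200.0), (0.0, 200.0)]
points, gc, d0 = candidate_locations(panel)
print(f"GC = {gc}, d0 = {d0:.1f} mm, candidate locations = {len(points)}")
```

A different definition of the GC (for example, the area centroid) or a smaller neighborhood radius can be substituted without changing the structure of the filter.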
14.3.3 UNIFORM COVERAGE SELECTION

The first component of the proposed method is the selection of design representatives from the original design alternatives. Unless one has profound knowledge of which part of the candidate design space (after the neighborhood of a GC has been ruled out) is preferred, a safer way of selecting good representatives of the original design set is to select them from the design space as evenly as possible. Igusa et al. [36] suggested randomly selecting design representatives, based on a uniform distribution, from the set of design alternatives. The problem with random selection is that probabilistic uniformity does not guarantee even geometric coverage in a design space. When the design space is of a high dimension and the sample size is relatively small (e.g., 2000 chosen from $8.5 \times 10^{8}$ alternatives in Igusa's case), the selected sample could cluster in a small area and fail to cover large portions of the design space [38]. A space-filling design, widely used in computer experiments [39], aims to spread design points evenly throughout a design region and appears to fit well with our purpose of design representative selection. A space-filling design is usually devised by using Latin hypercube sampling (LHS) [40], a stratified sampling method, or by using a uniformity criterion from the number-theoretic method (NTM) [41].
These methods can be easily implemented over a hyperrectangular design space in experimental designs. In engineering system designs accompanied by complicated geometric and physical constraints, the design space is often nonrectangular or even highly irregular, such as the candidate design space shown in Figure 14.3b. Another constraint also comes into play in this fixture layout design, i.e., once a locator's position is chosen on a panel, the second locator on the same panel should not be located near the first one, following the same physical intuition related to positioning variability explained previously. This is different from the factor level selection in experimental designs, where there is usually no clear prior knowledge to indicate the dependency among factors.

Given the complexity in the design constraints, we have not seen a generic method to translate an LHS- or NTM-based space-filling design to an engineering system design problem. Here, we devise a heuristic procedure for the fixture layout design, attempting to provide a uniform-coverage selection of design representatives from the original fixture layouts.

Step 1: Uniformly discretize the candidate design space on each panel plane using the same resolution (in our implementation, the resolution is 10 mm between two adjacent locations). Associate a probability $p$ with each candidate location; $p$ is initially set to be equal for all locations.

Step 2: On each panel, the first locator is chosen sequentially to be at each of the locations from the discretization process. Once the first locator is selected, generate a subset of locations containing those with a distance from the first locator greater than half of the panel size ($d_0/2$). The second locator is selected according to the probability associated with all the locations in that subset. Once a location is chosen, update the probability for this location as $p_{\mathrm{new}} = \nu \cdot p_{\mathrm{old}}$, $0 < \nu < 1$, namely, reduce the probability of selecting this location by a factor $\nu$. Denote by $\Omega_j^{(0)}$ the resulting candidate locator set for panel $j$. Then, $n_j$ equals the number of locator pairs included in $\Omega_j^{(0)}$.

Step 3: For $i = 1, \ldots, \max_j (n_j)$:
(i) Randomly select one locator pair from each of $\Omega_j^{(i-1)}$ for $j = 1, 2, 3, 4$ without replacement.
(ii) Combine these four locator pairs as one design representative.
(iii) Whenever an $\Omega_j^{(i-1)}$ becomes empty, reset $\Omega_j^{(i-1)} = \Omega_j^{(0)}$; otherwise, set $\Omega_j^{(i)} = \Omega_j^{(i-1)}$, $j = 1, 2, 3, 4$.
End of the loop.

In Figure 6.3, we have shown that two different types of locators are used as a locator pair on each panel. In the preceding selection procedure, the 4-way locator restraining two d.o.f.'s is considered more important, so it is selected as the first locator in step 2; its uniformity is a result of uniform discretization. The 2-way locator is treated as the second locator and is chosen to be at least $d_0/2$ away from the first locator because of the aforementioned constraint on the between-locator distance. The threshold of $d_0/2$ is again chosen empirically, as described in
Subsection 14.3.2. The reason that we associate a probability with each location and subsequently reduce the probability for a selected location is to make it more likely that the second locator is evenly spread out. Had we used a simple random selection for choosing the second locator, there would be a greater chance that the second-locator positions would form clumps or clusters. With the discount factor ν set to 0.1 in our implementation, once a location is selected, the probability of choosing it again is 10% of the original probability, and choosing it a third time is very unlikely (1% of the original probability). After step 2, the set $\Omega_4^{(0)}$ has the largest number of locator pairs, n4 = 3496. Step 3 performs a stratified sampling to generate locator combinations. The stratified sampling goes over $\Omega_4^{(0)}$ once but has to go over $\Omega_j^{(0)}$ for j = 1, 2, 3 multiple times. Step 3 can be thought of as equivalent to the following procedure: first augment $\Omega_j^{(0)}$ for j = 1, 2, 3 to be the same size as $\Omega_4^{(0)}$; then, perform an LHS on this 4-D rectangular region. Hence, this step can be considered a generalization of LHS to nonregular regions. Eventually, a total of 3496 combinations of locators are generated as design representatives; we denote this number as Nr.
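The mechanics of steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation: the function names, the dictionary of per-location weights, and the toy grid in the usage note are all hypothetical, and the sketch assumes only what the text states (a d0/2 distance threshold, a discount factor ν, and drawing without replacement with a reset when a set is exhausted).

```python
import math
import random

def second_locator_candidates(first, grid, d0):
    """Keep only grid locations farther than d0/2 from the first locator."""
    return [q for q in grid if math.dist(first, q) > d0 / 2]

def pick_with_discount(candidates, weight, nu=0.1, rng=random):
    """Draw one candidate in proportion to its weight, then discount that weight."""
    chosen = rng.choices(candidates, weights=[weight[q] for q in candidates], k=1)[0]
    weight[chosen] *= nu  # p_new = nu * p_old
    return chosen

def combine_pairs(pair_sets, rng=random):
    """Step 3: draw one pair per panel without replacement, refilling emptied pools."""
    pools = [list(s) for s in pair_sets]
    designs = []
    for _ in range(max(len(s) for s in pair_sets)):
        design = []
        for j in range(len(pools)):
            design.append(pools[j].pop(rng.randrange(len(pools[j]))))
            if not pools[j]:
                pools[j] = list(pair_sets[j])  # reset to the original set
        designs.append(tuple(design))
    return designs
```

With pair sets of unequal size, `combine_pairs` visits the largest set exactly once and cycles through the smaller sets several times, which is the behavior described for step 3.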
14.3.4 FEATURE AND FEATURE FUNCTION
To avoid direct and frequent evaluations of the objective function S(⋅), we will use a set of feature functions to characterize system performance. A feature function maps an engineering system to a feature, which is tied to the design objective. For example, the distance between two locators in the fixture design can be considered a feature. Generally, any physical quantity that is potentially tied to the design objective can be used as a feature. The set of feature functions is actually a surrogate for the design objective function. Features are often selected based on prior experience, engineering knowledge, or physical intuitions. The advantage of such a feature definition/selection is that vague experiences, knowledge, or understandings of a complicated engineering system can be more systematically integrated into the optimal design process. What follows feature selection is a clustering method acting on the set of the chosen feature functions. If the feature functions do not form well-separated clusters, the subsequent clustering step will not be effective when it attempts to form a design library with a smaller number of designs. This problem may very well be avoided if the feature functions are chosen to be an actual surrogate for the objective function. If the objective function forms many local optima on the response surface, the feature functions will also be likely to form clusters, corresponding to those local optimum areas. Therefore, although the selection of a feature in the proposed method is rather flexible, we actually need to do so with care. We have the following generic considerations for an effective selection of features and feature functions. First, when choosing the feature functions, we need to make sure that they are directly related to the objective function instead of making them the replacements of design variables in θL.
In particular, one needs to avoid choosing feature functions that are just subsets of θL or are linearly related to θL. Second, because features are used to replace the direct evaluation of an objective function, a feature function should be computationally simple. Otherwise, it will not
serve our purpose. Third, because a feature is usually not connected to the design objective with mathematical explicitness, too few feature functions may generate a serious bias in the later selection of design representatives. On the other hand, too many feature functions will increase the computation burden. A trade-off will depend on specific applications, where 5 to 15 feature functions may be selected. Finally, it is desirable to select scalable features, i.e., a feature definition will remain the same when the size of a system has increased. For the example of multistation fixture design, a scalable feature means that it can be used to characterize system performance whether the multistation system has 3 or 10 stations. Keeping these guidelines in mind, we will choose a set of feature functions for the fixture layout design as follows. We know that the distance between locators is an important factor related to the variation sensitivity of a fixture layout. We will select the between-locator distance on a panel as one feature relevant to our design objective. We select the following five functions to characterize the feature of between-locator distance; the five feature functions actually approximate the distribution of the set of the between-locator distances and are as follows:
F1(θL): The largest value of the same-panel between-locator distances
F2(θL): The second-largest value of the same-panel between-locator distances
F3(θL): The mean of the same-panel between-locator distances
F4(θL): The second-smallest value of the same-panel between-locator distances
F5(θL): The smallest value of the same-panel between-locator distances
For a larger-scale assembly system with more parts and stations, the aforementioned feature functions can still be used; that is, they are scalable.
The approximation of the distribution could be improved by augmenting the number of feature functions so that they give more refined percentile values of the set of between-locator distances. If we are concerned only with a single part that is positioned by a pair of locators at a single station, the between-locator distance could be the only factor that matters. However, complexity results from the fact that locating holes on a panel are reused but usually in a different layout. For the multistation assembly process in Figure 6.10, A-pillar and B-pillar are positioned on Station I by {P1, P2} and {P3, P4}, respectively. After the assembly operation is finished, the subassembly becomes one single piece, and it is transferred to Station II and positioned by {P1, P4}. This assembly transition across stations and the reuse of fixture locating holes complicate the sensitivity analysis for a multistation system. It was shown by Kim and Ding [42] that a larger between-locator distance on one station may not necessarily produce a lower sensitivity for the whole process. To capture the across-station transition effect, we select a second feature, which is the ratio of between-locator distances on two adjacent stations. Denote by L1, L2, … , Lm the between-locator distance for m locator pairs on a station. After those parts are assembled, they are transferred to the next station and positioned by a locator pair with a between-locator distance L(m). The ratio of distance change rd is then defined for this transition as
$$ r_d \equiv \frac{L^{(m)}}{\left( \sum_{i=1}^{m} L_i \right) / m} \tag{14.6} $$
Here we include three more feature functions related to the feature of distance change ratio:
F6(θL): The largest value of distance change ratios
F7(θL): The mean value of distance change ratios
F8(θL): The smallest value of distance change ratios
Similarly, the three feature functions approximate the distribution of the set of rd. We do not include five functions as we did for the between-locator distances, because four stations in this example produce only three distance change ratios. We have defined eight scalable feature functions for two physically intuitive features relevant to the variation sensitivity of a multistation assembly process. Please note that the calculation of the aforementioned feature functions is very economical even for a large-scale system.
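As an illustration of how cheaply the eight feature functions can be evaluated, the following Python sketch computes F1 through F8 from a list of same-panel between-locator distances and a list of station transitions. The input layout (a list of distances, and a list of `(upstream_distances, downstream_distance)` tuples) and the function names are assumptions made for this example, not part of the original implementation.

```python
from statistics import mean

def distance_change_ratio(upstream_distances, downstream_distance):
    """Equation 14.6: r_d = L^(m) divided by the mean of L_1, ..., L_m."""
    return downstream_distance / mean(upstream_distances)

def feature_vector(same_panel_distances, transitions):
    """Return [F1, ..., F8] for one fixture layout.

    same_panel_distances: between-locator distance of each locator pair.
    transitions: (upstream distances, downstream distance) per station change.
    """
    d = sorted(same_panel_distances, reverse=True)
    r = [distance_change_ratio(up, down) for up, down in transitions]
    return [
        d[0],      # F1: largest distance
        d[1],      # F2: second-largest distance
        mean(d),   # F3: mean distance
        d[-2],     # F4: second-smallest distance
        d[-1],     # F5: smallest distance
        max(r),    # F6: largest distance change ratio
        mean(r),   # F7: mean distance change ratio
        min(r),    # F8: smallest distance change ratio
    ]
```

For example, `feature_vector([400, 250, 300, 500], [([400, 250], 600), ([600, 300], 800), ([800, 500], 900)])` uses made-up distances in millimeters for four panels and three transitions.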
14.3.5 CLUSTERING METHOD
Clustering aims to segment a heterogeneous population into a number of more homogeneous subgroups [43]. When using feature functions as the surrogate for a design objective to benchmark the dissimilarity criteria in a clustering procedure, design representatives in a resulting cluster will have a more similar distribution profile for the two physical features, namely the between-locator distance and the across-station distance change ratio. Empirical evidence [36] shows that each resulting cluster is associated with a local response surface and that its center is likely to be around a local optimum. For this reason, a design library for classification can then be created by selecting a few designs from each cluster around the cluster center, which results in fewer designs. The generation of a design library is illustrated in Figure 14.4.

FIGURE 14.4 Generation of a design library. [Diagram not reproduced: design alternatives, uniform-coverage selection, design representatives, clustering, clustered design representatives, and a design library of parameter/response pairs.]

For the ith fixture layout represented by $\theta_L^{i}$, $F_i \equiv [F_1(\theta_L^{i}) \; \ldots \; F_8(\theta_L^{i})]^T$ is the vector of its feature functions, and c(i) denotes the cluster to which it belongs. In our solution procedure, we employ a standard K-means clustering method [43]. For example, for K clusters, we will find the mean value of cluster k, $m_k$, as the cluster center, and the association of a fixture layout to cluster k (represented by c(i) = k) so that

$$ \min_{c,\,\{m_k\}_{1}^{K}} \; \sum_{k=1}^{K} N_k \sum_{c(i)=k} \left\| F_i - m_k \right\|^2 \tag{14.7} $$
where Nk is the number of elements in cluster k. The K-means method minimizes the dissimilarity measure, defined as the Euclidean distances of the elements within the same cluster. With different values of K, the minimization in Equation 14.7 will yield different clustering results, i.e., different cluster centers and cluster associations. We delay the discussion of how to choose K to Subsection 14.3.7. Once the design representatives are clustered, i.e., each fixture layout is labeled with a cluster identification, we will choose a few designs around the cluster center to form a design library, as illustrated in Figure 14.4. We call the selected designs from each cluster center seed designs and denote by Jk the number of seed designs chosen from cluster k. For the sake of simplicity, we will use the same seed number for all clusters, i.e., Jk = J for all k. Then, the design library contains KJ data pairs {Fi, Si} for i = 1, 2, … , KJ, where Si is the sensitivity value of the ith fixture layout. Finally, the clustering step is briefly summarized as follows. We are given Nr design representatives from the uniform-coverage selection, and then: Step 1: For each design, determine F = (F1, … , F8). Step 2: Cluster the F values into K clusters using some clustering method (for example, K-means clustering). The clustering method should be one in which the measure of similarity is consistent with the assumption that when two values of F are similar, the corresponding values of the objective function S are similar. Usually, this will mean that Euclidean distance is the measure of similarity. Step 3: For each of the K clusters, choose the J designs that are closest to the center of the cluster. This yields KJ designs, and one typically wants KJ << Nr .
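The three clustering steps summarized above can be sketched in a few lines of Python. This is a plain Lloyd-style K-means written for illustration only; the original work used the Matlab function kmeans, and the function names, the random initialization, and the fixed seed here are assumptions of this sketch.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd iterations with Euclidean distance; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def seed_designs(points, labels, centers, j):
    """Step 3: indices of the J designs closest to each cluster center."""
    library = []
    for c, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == c]
        members.sort(key=lambda i: math.dist(points[i], center))
        library.extend(members[:j])
    return library
```

Here `points` would hold the feature vectors F of the Nr design representatives, and the returned index list identifies the KJ seed designs whose sensitivity S must then be evaluated.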
14.3.6 CLASSIFICATION METHOD
We will perform classification on the data set in the design library to generate the design selection rules; this step is similar to what has been implemented before [35] (refer to Figure 14.1). Local optimization methods can be used to evaluate a few designs chosen by the selection rules and get the final optimal design. Often, as we will see in Section 14.4, a local optimization method may no longer be necessary, i.e., a direct comparison among all the selected designs could have given us a satisfactory result. In Schwabacher et al. [35] and other similar work, the classification step is fulfilled by using a tree-based method, including a C4.5 decision tree [37] and a
classification and regression tree (CART). According to Hastie et al. [43, p. 273], the later version of Quinlan’s decision tree is very similar to a CART. We will focus our discussion on CART here because it is widely accessible through commercial software such as Matlab. In our problem, we choose to apply a CART model to {Fi, Si} in the design library. The paths in a CART can be expressed as a set of if-then rules in terms of the feature functions. One resulting classification tree is shown in Figure 14.5. A decision condition such as F6 < 1.15 is indicated at each node. One takes the left-hand path if the answer to this condition is “yes” or the right-hand path for a “no.” An end node in the tree represents a set of designs associated with a narrow range of sensitivity values; the value indicated in Figure 14.5 next to an end node is the average sensitivity value of the corresponding design set. If a certain combination of feature function values leads us to a set of designs whose expected sensitivity value is the lowest among all end nodes, then the corresponding path (one such path is highlighted in Figure 14.5) constitutes a design selection rule that we are looking for. The resulting selection rule will be applied to the whole set of Nr design representatives. The designs finally selected constitute a so-called good design subset, as indicated in Figure 14.1 and Figure 14.2. Note that because of the random selection of the second locator on a panel, the resulting tree is not exactly the same each time we start the design process over. But our results show that this difference does not affect the final optimal design much.

FIGURE 14.5 Part of the classification tree for the fixture layout design. [Tree diagram not reproduced: internal nodes carry split conditions on the feature functions (for example, F6 < 1.15, F5 < 1576.57, F8 < 2.43), and the end-node values shown range from 4.11 to 20.96.]

One critical question is how to select the final tree structure. What we need to achieve here is to find a tree that not only has reasonably good predictive power but also can generate a substantially small good design subset when it is applied to Nr design representatives. These two objectives are somewhat conflicting.
A simple tree structure will usually have better predictive power but will cause the good design subset to become undesirably large. On the other hand, a full-structure tree without pruning will lead to a smaller good design subset but may lack predictive capability for a new data set. During our research, we found that applying the existing criteria
such as the one-standard-error rule [43, p. 57] or the cost-complexity index [43, p. 270] in selecting a tree structure will usually lead to a tree that is too simple for our application. This is not surprising, because both criteria attempt to find the simplest tree with a small prediction error; the size of the good design subset does not come into consideration. Here, we propose a revised cost-complexity index, C, more suitable for our problem:
$$ C \equiv \sum_{i=1}^{KJ} L(S_i, \hat{S}_i) + \frac{N_f}{\alpha \cdot N_r - KJ} \tag{14.8} $$
where $L(S_i, \hat{S}_i)$ is the squared-error loss function, $\hat{S}_i$ is the predicted sensitivity value using the resulting tree, Nf is the number of designs in the good design subset, and 0 < α < 1 is a predetermined constant. In Equation 14.8, the cost part (i.e., the loss function) is the same as the one used in the cost-complexity index in Hastie et al. [43]; it characterizes a tree’s predictive power and will be evaluated based on a tenfold cross-validation. The complexity part is replaced by a ratio tied to the size of the good design subset. The product αNr represents how many designs one would like to evaluate. After subtracting KJ, the number of designs used to generate the design library, the denominator αNr − KJ provides a reference level for the size of the good design subset. We recommend choosing α = 0.1 or smaller so that the computation for evaluating those designs will remain affordable. In our problem, the choice of α = 0.1 translates to 350 overall designs, and the final tree structure is selected by applying the one-standard-error rule on this new index C. It is not merely a coincidence that a CART model has been commonly chosen in the relevant research; it does have several advantages over competing methods. Hastie et al. [43, p. 313] compared the tree-based method with other methods, including the support vector machine (SVM) and the neural net. The tree-based method is better than other methods in terms of its capability of handling data of mixed types, dealing with irrelevant inputs, being robust to outliers, and also being computationally scalable. These are all desired properties when we are trying to establish a connection between the feature function and the sensitivity function. For example, its better capability of handling irrelevant inputs is important, because feature functions are usually selected based on rough engineering knowledge, and we might have selected a few irrelevant ones.
Computational scalability is also important, because we may need to select more feature functions to avoid bias when dealing with a large system. In engineering applications, the tree-based method is popular also because it can be easily implemented and the resulting decision rule has an intuitive interpretation. For this reason, Hastie et al. [43] considered that the tree-based method is one of the best candidates for an off-the-shelf data-mining method.
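The revised index in Equation 14.8 is simple to compute once a candidate tree has produced predictions for the library designs and a good design subset. The sketch below is a hypothetical helper, not the authors' code: it takes the predictions as given (in the text they come from tenfold cross-validation) and assumes that αNr exceeds KJ so that the denominator is positive.

```python
def revised_cost_complexity(s_true, s_pred, n_f, n_r, alpha=0.1):
    """Equation 14.8: squared-error loss plus N_f / (alpha * N_r - KJ).

    s_true, s_pred: sensitivity values and tree predictions for the KJ
    library designs; n_f: size of the good design subset; n_r: number of
    design representatives.
    """
    kj = len(s_true)
    reference = alpha * n_r - kj
    if reference <= 0:
        raise ValueError("alpha * N_r must exceed KJ")
    loss = sum((s - p) ** 2 for s, p in zip(s_true, s_pred))
    return loss + n_f / reference
```

Among candidate trees, a deeper tree typically lowers `n_f` while a shallower one lowers the cross-validated loss, so the index trades the two off as described above.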
14.3.7 SELECTION OF K AND J
One issue we left out in Subsection 14.3.5 is how to select K (cluster number) and J (seed design number). The importance of these two values, K and J, is obvious
because they determine the number of designs in the resulting design library. Apparently, these two factors are related to both the optimal sensitivity value our method can find and the computation time it requires. Unfortunately, a theoretical tie between the clustering result and the behavior of a response surface has not yet been established. Using the multistation fixture design at hand, we will further investigate this issue by employing an experimental design approach. For a given combination of K and J, two response variables are chosen. These are the smallest sensitivity value (before a local optimization method is applied) and computation time. For this data-mining-aided optimal design, the overall computation time can be calculated by T0 + KJ ⋅ T + Nf ⋅ T, where T0 is the time component in addition to that for evaluating the objective function, therefore known as the overhead time due to the uniform-coverage selection and clustering/classification processes; and T is the computation time to evaluate the objective function once. The second and third components are directly related to the number of times that the objective function is evaluated. For a given engineering design problem and a given choice of K and J, the first and second time components will be largely fixed and T is also a constant. Hence, we use Nf as the second response variable. We conduct a 3² factorial experiment, with three levels of K and J chosen at 3, 6, 9 and 5, 10, 15, respectively. We limit ourselves to the cases with K ≤ 9 because a large K will easily result in a large KJ, a situation less likely to be computationally advantageous. Because of the previously mentioned random selection of the second locator on a panel, for a given combination of K and J, the sensitivity and Nf are in fact random variables. Therefore, three replications are performed at each combined level of K and J.
A total of 27 computer experiments are conducted, each of which will go through the procedure outlined in Figure 14.2 (before applying the local optimization). The lowest sensitivity value and the value of Nf are recorded in Table 14.2.
TABLE 14.2 Results for Different Design Conditions

                 Sensitivity Value (S)           Number of Designs in the Good Design Subset (Nf)
K    Replicate   J = 5     J = 10    J = 15      J = 5     J = 10    J = 15
3    1           3.8870    3.8870    3.8870      1361      186       624
3    2           3.9082    3.8821    3.9335      592       628       138
3    3           3.9488    3.9512    3.9024      938       497       142
6    1           3.9134    4.008     3.9582      881       56        58
6    2           3.8824    3.9082    3.8821      380       249       152
6    3           3.9024    3.9024    3.9275      299       410       65
9    1           3.8870    3.8870    3.8934      258       136       46
9    2           3.9335    3.9083    3.9082      210       47        112
9    3           3.9024    3.9225    3.9244      315       270       53
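The replicate averages of Nf quoted in the discussion of Table 14.2 can be rechecked with a few lines of Python. The dictionary below assumes the layout in which each column of the table corresponds to one level of J and each group of three rows to one level of K; the helper names are illustrative only.

```python
from statistics import mean

# N_f replicates from Table 14.2, keyed by (K, J); layout is an assumption.
NF = {
    (3, 5): [1361, 592, 938], (3, 10): [186, 628, 497], (3, 15): [624, 138, 142],
    (6, 5): [881, 380, 299],  (6, 10): [56, 249, 410],  (6, 15): [58, 152, 65],
    (9, 5): [258, 210, 315],  (9, 10): [136, 47, 270],  (9, 15): [46, 112, 53],
}

def cell_means(data):
    """Average N_f over the three replicates of each (K, J) cell."""
    return {kj: mean(values) for kj, values in data.items()}

def total_evaluations(data):
    """Average N_t = KJ + N_f for each (K, J) cell."""
    return {(k, j): k * j + mean(values) for (k, j), values in data.items()}
```

Under this layout, the cell K = 3, J = 5 averages to about 964 and the cell K = 9, J = 15 to about 70, which agrees with the averages cited in the text.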
From Table 14.2, we find that the sensitivity value S is not significantly affected by the choice of K and J. On the other hand, the value of Nf is significantly affected, ranging from close to one thousand to less than one hundred, depending on the choice of K and J. An ANOVA using the S and Nf data confirms this finding. K and J are significant in the case of Nf at the 5% level, and their interaction has a p-value of 0.067, suggesting that Nf is much more sensitive to variation in K and J than S. A choice of K and J will not have much effect on the sensitivity value because Nf changes accordingly for different K and J. When K and J are small, i.e., the designs in the library are fewer, the partition of design sets corresponding to different levels of sensitivity is rough and, thus, the resulting selection rule generated by the design library is not very discriminating. As a result, when the rule is applied to the entire set of design representatives, there will be a large number of designs that will satisfy it (e.g., the average of Nf is 964 for K = 3, J = 5). Evaluation of the large number of the selected designs, however, will circumvent the limitation brought about by the nondiscriminating selection rule, and the whole design process is eventually able to yield a low sensitivity value. On the other hand, when a relatively large number of designs is chosen to constitute the design library, the resulting selection rule will be discriminating and able to select a small number of good designs, evaluation of which will give us a comparably low sensitivity value. Apparently, the adaptive nature of Nf makes the eventual sensitivity value insensitive to the initial choice of K and J. Therefore, the choice of K and J will mainly affect the algorithm efficiency benchmarked by how many times the objective function is evaluated. The case with both small K and J is not an efficient choice because of a large Nf . 
However, a large K and J will not be a good choice either, because KJ will still be large even though Nf decreases. Define the total number of function evaluations as Nt ≡ KJ + Nf. Utilizing the data in Table 14.2, we can fit a second-order polynomial, expressing Nt in terms of K and J as

$$ \hat{N}_t = 2220.1 - 254.0\,K - 163.9\,J + 8.9\,KJ + 9.0\,K^2 + 3.7\,J^2 \tag{14.9} $$
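Because Equation 14.9 is a simple quadratic, its minimizer over the experimental range can be located by evaluating it on the integer grid of K and J. The following Python sketch does exactly that; the function names are hypothetical, and the grid limits (K from 3 to 9, J from 5 to 15) are taken from the factorial experiment.

```python
def nt_hat(k, j):
    """Fitted second-order polynomial of Equation 14.9."""
    return (2220.1 - 254.0 * k - 163.9 * j
            + 8.9 * k * j + 9.0 * k ** 2 + 3.7 * j ** 2)

def best_on_grid(k_values, j_values):
    """Integer (K, J) pair with the smallest predicted N_t on the grid."""
    return min(((k, j) for k in k_values for j in j_values),
               key=lambda kj: nt_hat(*kj))

best = best_on_grid(range(3, 10), range(5, 16))
print(best, round(nt_hat(*best), 1))
```

Setting the two partial derivatives of the polynomial to zero gives a stationary point near K ≈ 7.8 and J ≈ 12.8, so the grid search and the analytical solution point to the same neighborhood.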
Based on the preceding expression, it is not difficult to show that the combination of K = 8 and J = 13 will give the lowest value of Nt. This combination of K and J is only optimal within the experimental range. But the benefit of a decreasing Nf does not appear to be much beyond the point of K = 9 and J = 15, where KJ = 135 is already more than the average value of Nf (which is 70). Further increase in KJ is likely to outnumber the decrease in Nf . Using the following approximation, we provide a guideline for choosing K and J that is independent of the specific relation in Equation 14.9. Recall that the good design subset is generated by passing Nr design representatives through the design selection rule. To have a meaningful design selection rule, the corresponding end node in the classification tree must have at least one design point. Suppose that there is only one design in the end node. Then, the percentage of good designs selected from KJ designs in the library is 1/KJ. If the same percentage applies to all the
design representatives, then Nf = Nr/KJ. The total number of function evaluations can be approximated as

$$ N_t \approx KJ + \frac{N_r}{KJ} \tag{14.10} $$

The preceding equation suggests that Nt is minimized when KJ = √Nr. In our problem, given Nr = 3496, KJ is roughly 60. A reasonable choice of K and J would be K = 6 and J = 10. In actual cases, a classification tree pruned by cross-validation usually keeps more than one element in its end nodes. We also observe that the percentage of good designs selected from the design representatives may be higher than that from the design library. These factors make the actual value of KJ minimizing Nt larger than what is estimated from Equation 14.10. We could treat KJ = √Nr as the lower bound for choosing K and J. As a rule of thumb, we recommend choosing the cluster number K from 6 to 9 and the number of seed designs J per cluster from 10 to 15. Decision making regarding cluster number is a major research topic in statistics. Tibshirani et al. [44] proposed a gap statistic for determining the cluster number and also provided a comparison of several available statistical rules, including Milligan’s method, Krzanowski’s method, Hartigan’s method, Kaufman’s silhouette statistics, and their own gap statistic method. For the details of these criteria and computational procedures, refer to Tibshirani et al. [44] and the references therein. Using these criteria for our fixture design problem, the cluster number selected ranges from 2 to 5, as shown in Table 14.3. According to our previous discussion, these resulting cluster numbers appear to be too small and will likely cause a large Nf. Because those criteria were originally devised for a different purpose, it is not really surprising that directly applying them here may not serve our optimal design well enough.

TABLE 14.3 The Number of Clusters Suggested by the Other Methods

Method                               K
Milligan’s Method                    5
Krzanowski’s Method                  5
Hartigan’s Method                    2
Kaufman’s Silhouette Statistics      3
Tibshirani’s Gap Statistic           3
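Returning to Equation 14.10, the square-root rule follows from a one-line calculus argument. The short derivation below treats the product KJ as a single continuous variable x > 0, which is a simplification introduced here for illustration.

```latex
\begin{align*}
N_t(x) &= x + \frac{N_r}{x}, \qquad x = KJ > 0, \\
\frac{dN_t}{dx} &= 1 - \frac{N_r}{x^2} = 0
  \;\Longrightarrow\; x^{*} = \sqrt{N_r}, \\
\frac{d^2N_t}{dx^2} &= \frac{2N_r}{x^3} > 0
  \quad \text{(so } x^{*} \text{ is a minimum)}, \\
N_t(x^{*}) &= 2\sqrt{N_r}.
\end{align*}
```

With Nr = 3496, this gives x* ≈ 59.1 and Nt(x*) ≈ 118, consistent with the statement that KJ is roughly 60.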
14.3.8 AN OVERALL DESCRIPTION OF THE DATA-MINING-AIDED DESIGN
Finally, we provide a generic description of the data-mining-aided design method as follows:
A. Choose a collection of Nr designs ∆ that are evenly spread out in the design space using the method described in Subsection 14.3.3.
B. Identify a relatively small number of feature functions F1, … , Ff that are believed to be related to the objective function and are easy to compute.
C. For each design in ∆, compute the corresponding values F = (F1, … , Ff ) of the feature functions.
D. Cluster the values F of these feature functions into K clusters using K-means clustering.
E. For each of these K clusters, identify the J designs that correspond to the J values of F that are closest to the center of the cluster. This is the design library. Note that KJ should be << Nr.
F. For each of the KJ designs in the design library, compute the objective function S.
G. Using the KJ values of F and S corresponding to the designs in the design library, use CART to classify F1, …, Ff into values leading to optimum values of S.
H. Compute the values of F for all designs in ∆, and apply the classification rule resulting from CART to these values. The result is a small subset δ of ∆ that contains designs likely to produce optimum values of S.
I. Apply some local optimization algorithm to the designs in δ to determine the one that optimizes S.
14.4 EXAMPLE AND PERFORMANCE COMPARISON
We implemented our data-mining-aided design for the SUV panel assembly process in Figure 6.10. Using this four-station assembly process, we compare the performance of our data-mining-aided optimal design algorithm (before a local optimization is applied) with other optimization routines. Our design algorithm is implemented with the choice of K = 8 and J = 13, the optimal combination found in Subsection 14.3.7. The other optimization algorithms in this comparison include the simulated annealing algorithm, simplex search, and a direct evaluation of all design representatives selected by the uniform-coverage selection procedure described in Subsection 14.3.3. The performance of simulated annealing is largely determined by the parameter kB ∈ [0,1], known as Boltzmann’s constant. A larger kB (close to 1) will cause the algorithm to converge quickly, but probably to a local optimum. On the other hand, a smaller kB will help in finding the global optimum or a solution closer to the global optimum, but at the cost of a long computing time. A general guideline given by Viswanadham et al. [45] is to choose kB between 0.85 and 0.95. We include both the cases with kB = 0.9 and kB = 0.95 in our comparison. The performance indices for comparison include the lowest sensitivity value an algorithm can find and the time it consumes. The objective function for the assembly process in Figure 6.10 is inexpensive because of various simplifications we made in our variation modeling process in Chapter 6 (e.g., a 2-D assembly, rigid part assumption, only four stations). T is only 0.018 sec on a computer with a 2.20-GHz P4 processor. When a computationally inexpensive function is used, the overhead computing cost T0 becomes noticeable, which may blind us to the benefit of the proposed method for a complicated system with a more expensive objective function.
To show the potential benefit for expensive objective functions, we also include the comparison
2151_book.fm Page 354 Friday, October 20, 2006 5:04 PM
354
Stream of Variation Modeling and Analysis for MMPs
TABLE 14.4 Comparison of Optimization Methods Optimization Methods Simplex search Simulated annealing (kB = 0.9) Simulated annealing (kB = 0.95) Direct evaluation of design representatives Data-mining-aided design using CART model
S
Time (sec)
6.825 3.831 3.979 3.892 3.913
73.8 542.8 259.5 79.3 52.3
Time for Evaluating the Objective Function 3,200 28,503 13,606 3,496 192
T T T T T
of the number of times the objective function is evaluated: when T is large, the time of function evaluation essentially dominates the entire computation cost. We implemented the above-mentioned optimization algorithms to solve the multistation fixture-layout design in the Matlab environment; for example, the Matlab function kmeans is used for the K-means clustering method, treefit and treeprune for the tree generation, and fminsearch for simplex search. All optimization methods are executed on the same computer, and the average performance data of 10 trials are included in Table 14.4. Based on the comparison, we have a few remarks:
1. The best design is found by the simulated annealing (SA) with kB = 0.9 at the cost of 542.8 sec of computation time, or over 28,000 function evaluations. By comparison, the data-mining-aided design using CART reaches a very close sensitivity value (only 2.2% higher than what the SA found) but uses one tenth of the computation time. We also notice that the data-mining-aided design using CART evaluates the objective function fewer than one-hundredth as many times as the SA did. The SA with a larger kB is not advantageous: the computing time is still long (five times that of the data-mining-aided method for kB = 0.95), but the resulting sensitivity value increases considerably.
2. Because of our current choice of objective function, the time that a data-mining-aided method takes is dominated by its overhead time, close to 50 sec when using a CART model. Because we used scalable feature functions, these overhead time components will not change much even for a system with an expensive objective function. The computation for other algorithms such as SA and simplex search, however, is mainly the result of evaluating the objective function. Therefore, the benefit of our data-mining-aided design method (28,503T for SA vs. 192T for our method) will be more obvious for a larger, more complicated system in which the evaluation of the objective function dominates the overall computation cost.
3. Another competitive solution for this fixture-design problem is to directly evaluate all 3496 design representatives and select the best design among them, which provides a simple method of optimization. In the previous
example, for instance, this direct comparison method finds the second-best design among the chosen optimization methods. The direct comparison method based on a uniform selection is also robust, i.e., its performance is less sensitive to the properties of response functions, the properties of constraints, or the choice of initial conditions. The same philosophy of optimization was advocated by Fang and Wang [38] using their NTM-based uniform number generation. The limitation of this solution is that it may need to evaluate a rather large number of design representatives. It thus becomes computationally unaffordable when the objective function is expensive (3496T vs. 192T or 105T in the case of fixture layout design). The number of function evaluations can be reduced with a data-mining method.
In summary, the advantage of the data-mining-aided design is noteworthy. For the multistation fixture layout design, it yields a solution with a sensitivity value as low as a random search method can find while taking a shorter time than a local optimization method (the simplex search takes 73.8 sec). A local optimization can be applied to the best design found by the data-mining method. It leads to a small improvement, reducing the sensitivity value further to 3.868. It is our observation that the data-mining-aided design can often produce a satisfactory design result without the need for a local optimization method.
The best fixture layout found by our data-mining-aided method is shown in Figure 14.6, in which a fixture location is denoted by a “+”. The fact that both locators on the rear quarter panel are on the same side of the panel’s gravity center does not cause problems here, because the panels are positioned on a horizontal platform in our application.
If the panels are actually vertically positioned, a force closure constraint in addition to the geometrical constraint G(⋅) should be included in the optimization scheme (i.e., in Equation 14.5) to ensure that the resultant force and moment will be zero. Under that circumstance, the resulting optimal fixture layout will be different but there is almost no change in the design procedure.
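As a rough illustration of remark 2, the total run time of a method can be sketched as an overhead plus the number of objective-function evaluations multiplied by the per-evaluation time T. The sketch below uses the evaluation counts quoted in the text (28,503 for the SA with kB = 0.9 and 192 for the data-mining-aided design) and the roughly 50 sec overhead reported for the CART-based method. The zero overhead assumed for the SA, the function name, and the sample values of T are illustrative assumptions, not measurements from the study.

```python
# Hypothetical cost model: total time = overhead + (number of evaluations) * T.
# Evaluation counts and the ~50 s overhead come from the text; everything else
# (SA overhead of zero, the sample values of T) is an assumption for illustration.

def total_time(overhead_sec, n_evals, t_eval_sec):
    """Return the total run time in seconds under the simple additive model."""
    return overhead_sec + n_evals * t_eval_sec

N_EVALS_SA = 28_503          # SA with kB = 0.9 (from the text)
N_EVALS_DATA_MINING = 192    # data-mining-aided design with CART (from the text)
OVERHEAD_DATA_MINING = 50.0  # "close to 50 sec" (from the text)
OVERHEAD_SA = 0.0            # assumption: SA overhead is negligible

if __name__ == "__main__":
    for t_eval in (0.01, 1.0, 60.0):  # hypothetical per-evaluation times, in seconds
        sa = total_time(OVERHEAD_SA, N_EVALS_SA, t_eval)
        dm = total_time(OVERHEAD_DATA_MINING, N_EVALS_DATA_MINING, t_eval)
        print(f"T = {t_eval:6.2f} s   SA: {sa:12.1f} s   data-mining: {dm:10.1f} s")
```

Under this model the fixed overhead matters only when T is small; as T grows, the ratio of the two run times approaches the ratio of the evaluation counts.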
FIGURE 14.6 Optimal fixture layouts with the lowest S value.
14.5 SUMMARY
This chapter investigates various aspects of optimal fixture layout design in a multistation assembly process, with a focus on design criteria and optimization methods. Because of the singularity of the design matrix of a multistation fixturing system, the widely used D-optimal criterion is not an appropriate design measure. Instead, the E-optimality criterion is recommended, which minimizes the maximum sensitivity level of a fixture system to the input variation.
Furthermore, this chapter presents a data-mining-aided design method to facilitate the optimal fixture layout design. Compared to other available optimization methods, the data-mining-aided optimal design method demonstrates clear advantages in terms of both the sensitivity value it can find (only 2.2% higher than what an SA found) and the computation time it consumes (shorter than a simplex search and one tenth of what an SA takes). The benefit could be more obvious for a larger system with a computationally expensive objective function. This method, although demonstrated in the specific context of fixture layout design, is rather flexible. It can be applied to a broad variety of objective functions and design criteria. It can also easily handle complicated geometric and physical constraints.
The reason that data-mining methods can facilitate optimal engineering design lies in their capability in knowledge discovery, transfer, and encapsulation. The clustering method connects, without performing direct evaluation of an objective function, vague human knowledge about an engineering system to design parameters and objectives that are mathematically defined. The reduction in the number of objective-function evaluations generates a remarkable benefit in terms of algorithm efficiency. Meanwhile, the knowledge about the performance of an engineering system becomes more explicit and numerical once the set of design selection rules is formed from a classification method. The accumulated knowledge, expressed in design rules and the optimal design conditions, can be translated into the optimization of a similar yet larger system.
Finally, we would like to add one more note on the use of the feature function, which transfers engineering knowledge for statistical treatments. Such an integration of engineering knowledge and statistical methods is considered an important way of improving statistical solutions when solving messy engineering problems. Traditional ways of transferring engineering knowledge include expert systems and physical modeling. The former is usually too qualitative, and the latter is highly quantitative but less flexible: in many sophisticated physical systems, accurate physical modeling of the system is almost impossible. We feel that the inclusion of the feature function strikes a balance, being more quantitative while remaining flexible enough to incorporate engineering knowledge and understanding into the process of design optimization.
14.6 EXERCISES
1. The sensitivity index defined in Equation 14.2 is not exactly the same as, but is somewhat related to, those defined in Chapter 13. Elaborate on the similarities and differences between them.
2. We have argued that a singular A matrix can cause the final D matrix to become singular even if C and B matrices are of full rank. Present a simple numerical example using a two-station assembly process to illustrate this argument.
3. Prove Equation 14.3.
4. Consider a 2-D space G = [0, 100] × [0, 100]. Use the minimum distance between any pair of points as the measure for characterizing the uniformity of a set of points generated in G. Denote by U(g) the uniformity measure for a point set g. A larger value of U(g) means a better uniform coverage of the point set g over G. Now use the following two different methods to generate a set of 100 points:
a. Select a value from a uniform distribution over [0, 100], and assign this value to the x-coordinate of a point. Repeat the same value selection process and assign the newly selected value to the y-coordinate of the same point. This decides the position of a point in G. Use the same procedure to randomly select 100 points to form a set of points, g. Its uniformity measure is U(g).
b. First, partition the region G into 10 × 10 smaller, equal-sized subregions. For each one of the 100 subregions, use a similar procedure as in (a) to randomly select one point. Of course, the uniform distribution will be over a smaller range, say, [0, 10] for the first subregion, and likewise for others. Repeat this selection for all 100 subregions, and select one point from each subregion. Then you will have a set of 100 points, denoted by h. Its uniformity measure is U(h).
Use methods (a) and (b) to generate 1000 point sets each, namely g1, g2, …, g1000 and h1, h2, …, h1000. Calculate their uniformity measures accordingly. Plot the uniformity measures in a box-whisker plot for both methods. What can you conclude from the box-whisker plot you just generated?
(Note: This problem is designed to help students understand that probabilistic uniformity may not guarantee geometric uniformity. The second method as described above is actually a Latin hypercube sampling method, which will generally produce a better uniform coverage than a pure random selection.)
5. Consider a single-station fixturing system, the current fixture locations of which are shown in Figure 14.7. In this problem, you are required to use the data-mining-aided design method to find the locator position that minimizes the sensitivity index defined as in Equation 14.5. To make the problem easier to solve, use the following simplifications:
a. Treat the part as rectangular.
b. Use a resolution of 10 mm to discretize the part.
c. For a single-station system, you may have only one feature, which is the distance between a pair of locators.
FIGURE 14.7 Single-station fixturing system.
d. When attempting to select the classification tree structure, do not use the criterion developed in Equation 14.8. Instead, simply use the Matlab function to obtain a CART by setting the number for when an impure node will stop splitting to ten, which should be the default setting used in Matlab. e. Use KJ = N r to determine K and J. If possible, set K = 6, J = 10. This may depend on what you got from the uniform coverage selection.
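For Exercise 4, the two point-generation schemes and the uniformity measure U can be sketched as follows. This is only a starting point under stated assumptions: the function names are hypothetical, plain Python is used instead of Matlab, and a five-number summary stands in for the box-whisker plot. The demo runs 50 point sets per method to keep the run time short; the exercise asks for 1000.

```python
import math
import random

def min_pairwise_distance(points):
    """Uniformity measure U: the smallest distance between any pair of points."""
    best = math.inf
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            best = min(best, math.hypot(xi - xj, yi - yj))
    return best

def sample_uniform(rng, n=100, side=100.0):
    """Method (a): n points with independent uniform x and y over [0, side]."""
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]

def sample_stratified(rng, cells=10, side=100.0):
    """Method (b): one uniform point inside each of the cells x cells subregions."""
    w = side / cells
    return [(rng.uniform(i * w, (i + 1) * w), rng.uniform(j * w, (j + 1) * w))
            for i in range(cells) for j in range(cells)]

def five_number_summary(values):
    """Min, lower quartile, median, upper quartile, max (simple rank-based)."""
    s = sorted(values)
    pick = lambda p: s[min(len(s) - 1, int(p * len(s)))]
    return s[0], pick(0.25), pick(0.5), pick(0.75), s[-1]

if __name__ == "__main__":
    rng = random.Random(2006)   # fixed seed so the demo is repeatable
    n_sets = 50                 # the exercise asks for 1000
    u_g = [min_pairwise_distance(sample_uniform(rng)) for _ in range(n_sets)]
    u_h = [min_pairwise_distance(sample_stratified(rng)) for _ in range(n_sets)]
    print("method (a):", five_number_summary(u_g))
    print("method (b):", five_number_summary(u_h))
```

The two summaries can then be compared, or the lists `u_g` and `u_h` passed to any plotting tool, to draw the conclusion the exercise asks for.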
References
1. Asada, H. and By, A.B., Kinematic analysis of workpart fixturing for flexible assembly with automatically reconfigurable fixture, IEEE Journal of Robotics and Automation, RA-1, 86, 1985.
2. Chou, Y.C., Chandru, V., and Barash, M.M., A mathematical approach to automatic configuration of machining fixtures: analysis and synthesis, ASME Transactions, Journal of Engineering for Industry, 111, 299, 1989.
3. Ferreira, P.M., Kochar, B., Liu, C.R., and Chandru, V., AIFIX: an expert system approach to fixture design, ASME Winter Annual Meeting on Computer Aided/Intelligent Process Planning, Miami Beach, FL, 1985.
4. Trappey, J.C. and Liu, C.R., A literature survey of fixture design automation, International Journal of Advanced Manufacturing Technology, 5, 240, 1990.
5. Cai, W., Hu, S.J., and Yuan, J.X., A variational method of robust fixture configuration design for 3-d workpieces, ASME Transactions, Journal of Manufacturing Science and Engineering, 119, 593, 1997.
6. Menassa, R.J. and DeVries, W.R., Optimization methods applied to selecting support positions in fixture design, ASME Transactions, Journal of Engineering for Industry, 113, 412, 1991.
7. Rearick, M.R., Hu, S.J., and Wu, S.M., Optimal fixture design for deformable sheet metal workpieces, Transactions of NAMRI/SME, XXI, 407, 1993.
8. DeMeter, E.C., Min-max load model for optimizing machining fixture performance, ASME Transactions, Journal of Engineering for Industry, 117, 186, 1995.
9. Melkote, S.N., Prediction of the reaction force system for machining fixtures based on machining process simulation, Transactions of NAMRI/SME, XXIII, 207, 1995.
10. Hockenberger, M.J. and DeMeter, E.C., The application of meta functions to the quasi-static analysis of workpiece displacement within a machining fixture, ASME Transactions, Journal of Manufacturing Science and Engineering, 118, 325, 1996.
11. Cai, W.J. and Hu, S.J., Optimal fixture configuration design for sheet metal assembly with spring back, Transactions of NAMRI/SME, XXIV, 229, 1996.
12. Huang, Y. and Hoshi, T., Study for Optimum Fixture Design Considering Flatness Error due to Moving Cutting Heat Source, Technical Paper, Society of Manufacturing Engineers, MS99-177, MS99-177-1–MS99-177-6, 1999.
13. Wang, M.Y., An optimum design for 3-D fixture synthesis in a point set domain, IEEE Transactions on Robotics and Automation, 16, 839–846, 2000.
14. Wang, M.Y. and Pelinescu, D.M., Optimizing fixture layout in a point-set domain, IEEE Transactions on Robotics and Automation, 17, 312, 2001.
15. Soderberg, R. and Carlson, J.S., Locating scheme analysis for robust assembly and fixture design, Proceedings of the 1999 ASME Design Engineering Technical Conferences, Las Vegas, NV, September 12–15, 1999.
16. Rong, Y., Li, W., and Bai, Y., Locating error analysis for fixturing accuracy verification, ASME Computer in Engineering, Boston, MA, September 17–21, 1995, p. 825.
17. Choudhuri, S.A. and DeMeter, E.C., Tolerance analysis of machining fixture locators, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 273, 1999.
18. Ding, Y., Ceglarek, D., and Shi, J., Design evaluation of multi-station manufacturing processes by using state space approach, Transactions of the ASME, Journal of Mechanical Design, 124, 408, 2002.
19. Carlson, J.S., Quadratic sensitivity analysis of fixtures and locating schemes for rigid parts, ASME Transactions, Journal of Manufacturing Science and Engineering, 123, 462, 2001.
20. Jin, J. and Shi, J., State space modeling of sheet metal assembly for dimensional control, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 756, 1999.
21. Ding, Y., Ceglarek, D., and Shi, J., Modeling and diagnosis of multi-station manufacturing processes: part I state space model, Proceedings of 2000 Japan-USA Symposium on Flexible Automation, July 23–26, Ann Arbor, MI, 2000JUSFA-13146, 2000.
22. Camelio, A.J., Hu, S.J., and Ceglarek, D.J., Modeling variation propagation of multistation assembly systems with compliant parts, Transactions of the ASME, Journal of Mechanical Design, 125, 673, 2003.
23. Zhou, S., Huang, Q., and Shi, J., State space modeling for dimensional monitoring of multistage machining process using differential motion vector, IEEE Transactions on Robotics and Automation, 19, 296, 2003.
24. Rong, Y. and Bai, Y., Machining accuracy analysis for computer-aided fixture design verification, ASME Transactions, Journal of Manufacturing Science and Engineering, 118, 289, 1996.
25. Xiong, C., Rong, Y., Koganti, R., Zaluzec, M., and Wang, N., Geometric variation prediction in automotive assembly, Assembly Automation Journal, 22, 260, 2002.
26. Fedorov, V.V., Theory of Optimal Experiments, Academic Press, New York, 1972.
27. Atkinson, A.C. and Donev, A.N., Optimum Experimental Designs, Oxford University Press, New York, 1992.
28. Pukelsheim, F., Optimal Design of Experiments, John Wiley & Sons, New York, 1993.
29. Wang, Y. and Nagarkar, S.R., Locator and sensor placement for automated coordinate checking fixtures, ASME Transactions, Journal of Manufacturing Science and Engineering, 121, 709, 1999.
30. Schott, J.R., Matrix Analysis for Statistics, John Wiley & Sons, New York, 1997.
31. Hillier, F.S. and Lieberman, G.J., Introduction to Operations Research, 7th ed., McGraw-Hill, New York, 2001.
32. Nelder, J.A. and Mead, R., A simplex method for function minimization, The Computer Journal, 7, 308, 1965.
33. Gen, M. and Cheng, R., Genetic Algorithms and Engineering Optimization, John Wiley & Sons, New York, 2000.
34. Bertsimas, D. and Tsitsiklis, J., Simulated annealing, Statistical Science, 8, 10, 1993.
35. Schwabacher, M., Ellman, T., and Hirsh, H., Learning to set up numerical optimizations of engineering designs, Data Mining for Design and Manufacturing, Kluwer Academic Publishers, Boston, MA, 2001, p. 87.
36. Igusa, T., Liu, H., Schafer, B., and Naiman, D.Q., Bayesian classification trees and clustering for rapid generation and selection of design alternatives, Proceedings of 2003 NSF Design, Service and Manufacturing Grantees and Research Conference, Birmingham, AL, January 4, 2003.
37. Quinlan, J.R., C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
38. Fang, K.T. and Wang, Y., Number-Theoretic Methods in Statistics, Chapman and Hall, New York, 1994.
39. Santner, T.J., Williams, B.J., and Notz, W.I., Design and Analysis of Computer Experiments, Springer-Verlag, New York, 2003.
40. McKay, M.D., Beckman, R.J., and Conover, W.J., A comparison of three methods for selecting values of input variables in the analysis of output from a computer code, Technometrics, 21, 239, 1979.
41. Fang, K.T., Lin, D.K.J., Winker, P., and Zhang, Y., Uniform design: theory and application, Technometrics, 42, 237, 2000.
42. Kim, P. and Ding, Y., Optimal design of fixture layout in multi-station assembly process, IEEE Transactions on Automation Science and Engineering, 1, 133, 2004.
43. Hastie, T., Tibshirani, R., and Friedman, J., The Elements of Statistical Learning, Springer-Verlag, New York, 2001.
44. Tibshirani, R., Walther, G., and Hastie, T., Estimating the number of clusters in a data set via the gap statistic, Journal of the Royal Statistical Society, Series B, 63, 411, 2001.
45. Viswanadham, N., Sharma, S., and Taneja, M., Inspection allocation in manufacturing systems using stochastic search techniques, IEEE Transactions on Systems, Man and Cybernetics, Part A, 26, 222, 1996.
46. Djurdjanovic, D. and Ni, J., Linear state space modeling of dimensional machining errors, Transactions of NAMRI/SME, XXIX, 541, 2001.
47. Mantripragada, R. and Whitney, D.E., Modeling and controlling variation propagation in mechanical assemblies using state transition models, IEEE Transactions on Robotics and Automation, 15, 124, 1999.
48. Kim, P. and Ding, Y., Optimal engineering design guided by data-mining methods, Technometrics, 47(3), 336–348, 2005.
15
Process-Oriented Tolerance Synthesis*
15.1 CONCEPT OF PROCESS-ORIENTED TOLERANCING
Traditional tolerancing research [1,2] mainly focused on an assembly that is built up through many mating features of individual components. Raw parts are inherently imprecise; tolerance is used to control the quality of the final product by specifying allowable limits on raw parts. There are two basic directions in tolerancing research: (1) tolerance analysis or (variation) prediction, and (2) tolerance synthesis or allocation.
Tolerance analysis predicts the variation of the final product given the tolerance of each part. The basic idea of tolerance analysis is represented in Figure 15.1. First, a mathematical expression, such as a geometrical or dimensional tolerance, is used to represent the raw part tolerances according to their properties. Then, based on a mathematical model of tolerance accumulation, such as the worst case (WC) model or the root square sum (RSS) model [3,4], the final product variation is computed and predicted as an accumulation of part tolerances.
Tolerance synthesis/allocation is conducted in the inverse direction: given the quality specification of the final product, what tolerance should be assigned to each raw part? There is always a manufacturing cost associated with the tolerance to be assigned: the tighter the tolerance, the higher the cost. Hence, the optimal tolerance with minimum manufacturing cost can be allocated by solving a constrained optimization problem. The tolerance research focusing mainly on product variables, such as dimensions of the final product and raw parts, is referred to as product-oriented tolerancing in this book.
From the models in Chapter 6, Chapter 7, and Chapter 16, the quality of the final product from an MMP is determined not only by the tolerance of each individual raw part but also by variations of many process variables, such as fixturing error, force, and tooling vibration at different manufacturing stages. In this chapter, we focus on the following tolerance synthesis/allocation problem: for an MMP, given the quality specification of the final product, how should the tolerances of process and product variables be allocated so that the given quality criteria are achieved at minimum cost? To differentiate this problem from product-oriented tolerance synthesis, we call it process-oriented tolerance synthesis. Although the research is presented in the specific context of BIW assembly processes, the methodology is applicable to generic multistage manufacturing processes.
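The difference between the two accumulation models mentioned above can be illustrated with a minimal sketch. It assumes a one-dimensional linear stack in which every part tolerance contributes with unit sensitivity, which is a common textbook simplification and not a model taken from this chapter; the function names and the sample tolerances are hypothetical.

```python
import math

def worst_case_stack(tolerances):
    """Worst case (WC) model: part tolerances add linearly."""
    return sum(abs(t) for t in tolerances)

def rss_stack(tolerances):
    """Root square sum (RSS) model: part tolerances add in quadrature."""
    return math.sqrt(sum(t * t for t in tolerances))

if __name__ == "__main__":
    part_tolerances = [0.3, 0.4]  # hypothetical tolerances, in mm
    print("WC :", worst_case_stack(part_tolerances))
    print("RSS:", rss_stack(part_tolerances))
```

For any stack with more than one nonzero term, the RSS prediction is smaller than the WC prediction, which is why the choice of accumulation model directly affects how tight the allocated part tolerances must be.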
* Part of chapter material is based on Reference 7 and Reference 8.
FIGURE 15.1 Tolerance analysis. [Diagram: representation of raw part tolerance, then tolerance accumulation or variation propagation, then variation of final product.]
15.2 FRAMEWORK OF PROCESS-ORIENTED TOLERANCING
15.2.1 OVERVIEW
Figure 15.2 shows schematically how tolerance, quality, and cost are intertwined in an MMP. The cost is associated with the tolerances assigned to both process and product variables (Figure 15.2). Variations of process/product variables, determined by their tolerances, affect the quality of the final product. The variation of process variables is the focus of this research. The variation of product variables, i.e., the variation of each part in the final product coming from the preceding process, is treated as an initial variation condition for the current fabrication process. Later on, with the example of the BIW assembly process, we will simplify the problem by setting the initial condition to zero; that is, we are only concerned with process variables in the current development. Our objective is to allocate optimal tolerances to process variables at minimum cost with satisfactory quality of the final product.
Unlike product variables, process variables carry dynamic process information such as tooling degradation and, thus, they are strongly related to process reliability and the corresponding maintenance policies. If the tolerances are allocated without considering tooling degradation, product quality can only be guaranteed at the very initial stages of production. However, the quality criteria should be satisfied not only during the initial stages of production but also during the whole life cycle of the production system. Currently, for many real production systems, maintenance service is still conducted following a fixed time schedule, e.g., all locating pins at assembly stations are replaced every half year. In this case, the initial tolerances need to be tighter to accommodate tooling degradation between conservative maintenance schedules to
FIGURE 15.2 Overview of process-oriented tolerance synthesis.
FIGURE 15.3 Relationship between tolerance and quality.
avoid out-of-specification products. Mathematically, the optimal tolerance $T^*$ can be formulated as the following constrained optimization problem:

\[ T^* = \arg\min_{T} C_T(T) \quad \text{subject to } \Im_Q(T, t) \ge 0 \text{ for } \{t : 0 < t < t_m\} \tag{15.1} \]

where $C_T$ represents the cost function of tolerancing, $T$ is the tolerance vector of selected key process variables, $\Im_Q(\cdot, \cdot)$ is a function of given measure or index of product quality, $t$ is time, and $t_m$ is the maintenance time period.
There are two fundamental questions to be answered: how to select the cost function, and how to formulate the quality constraint functions. The cost function is determined by the tolerances assigned to process variables. Generally speaking, the tighter the tolerance, the higher the cost of satisfying it. The reciprocal function and negative exponential function are widely used as cost functions. The second question is how to relate the tolerances to the product quality index, which actually is the constraint function. The construction of the constraint function relating the tolerance to quality requires development of several essential models (shown in Figure 15.3) with both forward (prediction) and backward (allocation) propagation. From the diagram in Figure 15.3, it can be seen that tolerance is first related to variation of process variables. Product variation propagates through the production line with the contribution and accumulation of process variables at each stage. Eventually, a suitable measure is devised to compare the variation of final product to the specified quality index. Therefore, there are four key elements to realize this optimization formulation:
• Variation propagation model
• Tolerance-variation relationship
• Process degradation model
• Cost function
15.2.2 VARIATION PROPAGATION MODEL
The state space model developed in Chapter 6 is used to describe the variation propagation in a multistation assembly process as
\[ x_k = A_{k-1} x_{k-1} + B_k u_k + w_k \tag{15.2} \]

\[ y_k = C_k x_k + v_k \tag{15.3} \]

Suppose there is only end-of-line observation, that is, k = N. Equation 15.3 can be converted to

\[ y_N = \sum_{k=1}^{N} C_N \Phi_{N,k} B_k u_k + C_N \Phi_{N,0}\, x_0 + \varepsilon \tag{15.4} \]

where the state transition matrix $\Phi_{k,i}$ is defined as in the previous chapters. Here, $x_0$ corresponds to the initial conditions coming from manufacturing imperfection of stamped parts and $\varepsilon$ is the summation of all modeling uncertainty and sensor noise terms. It was assumed that this process involves sheet metal assembly with only lap-lap joints and, thus, stamping imperfection of part dimensions will not affect propagation of variations. Then, we can set initial conditions to zero. The uncertainty term $\varepsilon$ can be neglected in the design stage or estimated based on historical data. The variation propagation can then be approximated as follows:

\[ \Sigma_Y = \sum_{k=1}^{N} \gamma(k)\, \Sigma_u(k)\, \gamma^T(k) \tag{15.5} \]

where $\Sigma_Y$ and $\Sigma_u(k)$ represent the covariance matrices of $Y \equiv y_N$ and $u_k$, respectively, and $\gamma(k) \equiv C_N \Phi_{N,k} B_k$. Thus, the product quality is affected by $\Sigma_u(k)$, which is the covariance of process variables.
15.2.3 RELATIONSHIP BETWEEN TOLERANCE AND VARIATION
In Figure 15.4, let dpin and dhole denote the diameter of a pin or a hole and Ti be the specified tolerance of the clearance, that is, the upper limit of the clearance. Our primary interest is the variation associated with a pin-hole locating pair caused by its clearance. The clearance-induced fault is shown in Figure 15.5.
FIGURE 15.4 Diagram of pin-hole locating pairs: (a) 4-way pin/hole; (b) 2-way pin/hole.
FIGURE 15.5 Clearance-induced process faults.
The geometrical relationship can be obtained as in Chapter 16. The deviations of a 4-way locating pin $P_1'$ (center of the pin-hole) from $P_1$ (center of the pin) in the x and z directions are

\[ \Delta x = \xi \cos\theta, \tag{15.6} \]

\[ \Delta z = \xi \sin\theta, \tag{15.7} \]

where $\xi$ is the distance between $P_1'$ and $P_1$ and $\theta$ is the contact orientation. If the clearance tolerance is $T_i$, $\xi$ is the random variable representing the actual clearance in one setup. Although $\xi(0)$, the initial clearance, is in fact bounded by $[0, T_i]$, it can be reasonably approximated by a normal distribution $N\!\left(\frac{T_i}{2}, \left(\frac{T_i}{6}\right)^2\right)$. As discussed in Chapter 16, $\theta \sim U(0, 2\pi)$. Given that the two random variables $\xi$ and $\theta$ are independent of each other, the statistics regarding $\Delta x$ and $\Delta z$ are as follows:

\[ E(\Delta x) = 0 \tag{15.8} \]

\[ E(\Delta z) = 0 \tag{15.9} \]

\[ \sigma^2_{x,4\text{-way}} = E(\Delta x^2) = E(\xi^2) \cdot E(\cos^2\theta) = \frac{5T_i^2}{36} \tag{15.10} \]

\[ \sigma^2_{z,4\text{-way}} = E(\Delta z^2) = E(\xi^2) \cdot E(\sin^2\theta) = \frac{5T_i^2}{36} \tag{15.11} \]

\[ \mathrm{Cov}(\Delta x, \Delta z) = 0 \tag{15.12} \]

The geometrical relationship of a 2-way locating pair with orientation angle $\alpha$ shown in Figure 15.5b and Figure 15.5c can be obtained as

\[ \Delta x = \xi \sin\alpha \cdot \kappa \quad \text{and} \quad \Delta z = -\xi \cos\alpha \cdot \kappa \tag{15.13} \]
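where $\kappa$ is a binary random variable with a value of either 1 or –1. We postulate that if the pin touches the top (or left, if $\alpha$ approaches 90˚) edge of the pin-hole, then $\kappa$ is 1; if the pin touches the bottom (or right, if $\alpha$ approaches 90˚) edge of the pin-hole, then $\kappa$ is –1. Also, $\kappa$ is independent of $\xi$. Hence, the variation associated with a 2-way locating pair can be expressed as

\[ E(\Delta x) = E(\Delta z) = 0 \tag{15.14} \]

\[ \sigma^2_{x,2\text{-way}} = E(\xi^2 \sin^2\alpha \cdot \kappa^2) = \frac{5T_i^2}{18} \cdot \sin^2\alpha \tag{15.15} \]

\[ \sigma^2_{z,2\text{-way}} = E(\xi^2 \cos^2\alpha \cdot \kappa^2) = \frac{5T_i^2}{18} \cdot \cos^2\alpha \tag{15.16} \]

\[ \mathrm{Cov}(\Delta x, \Delta z) = -E(\xi^2 \cos\alpha \sin\alpha \cdot \kappa^2) = -\frac{5T_i^2}{18} \cos\alpha \sin\alpha \tag{15.17} \]

The negative sign in Equation 15.17 follows from the opposite signs of $\Delta x$ and $\Delta z$ in Equation 15.13.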
Equation 15.17 implies that the deviations of a 2-way locating pair at an arbitrary orientation angle α are correlated. To eliminate this correlation, the angle α is usually made around 0˚ (horizontal) or 90˚ (vertical). In light of this argument, it can be concluded that the matrices Σu(k) are diagonal for all k according to Equation 15.12 and Equation 15.17. Equation 15.10, Equation 15.11, Equation 15.15, and Equation 15.16 will be iteratively applied to every pin-hole locating pair at each station in a multistation BIW assembly process so that Σu(k) can be expressed in terms of corresponding tolerances.
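The closed-form variances above rest on E(ξ²) = Var(ξ) + E(ξ)² = Ti²/36 + Ti²/4 = 5Ti²/18. A quick Monte Carlo check can be sketched as follows. It assumes the untruncated normal approximation for ξ stated in the text and, for the 2-way pair, that κ takes the values +1 and –1 with equal probability (which is what the zero means in Equation 15.14 imply); the function names, sample sizes, and seeds are hypothetical.

```python
import math
import random

def simulate_4way(T, n, seed=0):
    """Sample second moments of (dx, dz) for a 4-way pin (Eq. 15.6 and 15.7)."""
    rng = random.Random(seed)
    sxx = szz = sxz = 0.0
    for _ in range(n):
        xi = rng.gauss(T / 2.0, T / 6.0)        # clearance ~ N(T/2, (T/6)^2)
        theta = rng.uniform(0.0, 2.0 * math.pi)  # contact orientation
        dx, dz = xi * math.cos(theta), xi * math.sin(theta)
        sxx += dx * dx
        szz += dz * dz
        sxz += dx * dz
    return sxx / n, szz / n, sxz / n

def simulate_2way(T, alpha, n, seed=0):
    """Sample second moments of (dx, dz) for a 2-way pin (Eq. 15.13)."""
    rng = random.Random(seed)
    sxx = szz = sxz = 0.0
    for _ in range(n):
        xi = rng.gauss(T / 2.0, T / 6.0)
        kappa = rng.choice((1, -1))              # assumption: equal probability
        dx = xi * math.sin(alpha) * kappa
        dz = -xi * math.cos(alpha) * kappa
        sxx += dx * dx
        szz += dz * dz
        sxz += dx * dz
    return sxx / n, szz / n, sxz / n

if __name__ == "__main__":
    T = 1.0
    print("4-way sample moments:", simulate_4way(T, 100_000, seed=1))
    print("4-way closed form   :", 5 * T**2 / 36)
    alpha = math.radians(45.0)
    print("2-way sample moments:", simulate_2way(T, alpha, 100_000, seed=2))
    print("2-way closed form   :", 5 * T**2 / 18 * math.sin(alpha) ** 2)
```

Setting `alpha` close to 0 or to 90 degrees in `simulate_2way` drives the sample covariance toward zero, which is the decorrelation argument made in the text.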
15.2.4 PROCESS DEGRADATION MODEL
A stochastic degradation model of a locating pin is described in Chapter 16 for the QR-chain model. The aggregated wear $\Delta_d(t)$ at operation $t$ is expressed as

\[ \Delta_d(t) = \Delta_d(t-1) + \Delta(t) \tag{15.18} \]

According to the QR-chain model in Chapter 16, $\Delta(t)$ is of lognormal distribution, i.e., $\Delta(t) \sim \mathrm{LOGNOR}(\mu_\Delta(t), \sigma^2_\Delta(t))$. In this chapter, we assume that the mean of wear-out rate $\mu_\Delta$ consists of two components, a constant wear-out rate plus a higher initial wear-out rate that decreases exponentially. The mean of wear-out rate at operation $t$ is assumed to be

\[ \mu_\Delta(t) = \mu_0 + \mu_1 e^{-\beta t} \tag{15.19} \]

where $\mu_0 + \mu_1$ is the initial wear-out rate, $\mu_0$ is the constant rate, and $\beta$ determines how fast the wear-out will reach its steady state. The change of clearance of a pin-hole locating pair can be computed by
\[ \xi(t) = \xi(0) + \Delta_d(t) \tag{15.20} \]

where $\xi(t)$ is the clearance after operation $t$. We should substitute the enlarged clearance at time $t_m$ into Equation 15.6, Equation 15.7, and Equation 15.13 and recalculate the locating variation. In the following derivations, we make several assumptions: (1) the initial clearance $\xi(0)$, the contact orientation variables $\theta$ and $\kappa$, and the aggregated wear $\Delta_d(t)$ are independent of each other; (2) the variance of the wear-out rate $\sigma^2_\Delta$ is the same for all operations; and (3) according to the central limit theorem, the aggregated wear $\Delta_d(t)$ will be close to a normal distribution after a large enough number of operations. Based on these properties and assumptions, the following relationships can be obtained by substituting Equation 15.20 into Equation 15.10, Equation 15.11, Equation 15.15, and Equation 15.16, respectively.

\[ \sigma^2_{x,4\text{-way}}(t_m) = E\big((\xi(0) + \Delta_d(t_m))^2 \cdot \cos^2\theta\big) = \frac{1}{2} E\big((\xi(0) + \Delta_d(t_m))^2\big) \tag{15.21} \]

\[ \sigma^2_{z,4\text{-way}}(t_m) = E\big((\xi(0) + \Delta_d(t_m))^2 \cdot \sin^2\theta\big) = \frac{1}{2} E\big((\xi(0) + \Delta_d(t_m))^2\big) \tag{15.22} \]

\[ \sigma^2_{x,2\text{-way}}(t_m) = E\big((\xi(0) + \Delta_d(t_m))^2 \sin^2\alpha \cdot \kappa^2\big) = \sin^2\alpha \cdot E\big((\xi(0) + \Delta_d(t_m))^2\big) \tag{15.23} \]

\[ \sigma^2_{z,2\text{-way}}(t_m) = E\big((\xi(0) + \Delta_d(t_m))^2 \cos^2\alpha \cdot \kappa^2\big) = \cos^2\alpha \cdot E\big((\xi(0) + \Delta_d(t_m))^2\big) \tag{15.24} \]

where

\[
\begin{aligned}
E\big((\xi(0) + \Delta_d(t_m))^2\big) &= E\big(\xi(0)^2 + 2\,\xi(0)\,\Delta_d(t_m) + \Delta_d^2(t_m)\big) \\
&= E(\xi(0)^2) + 2 \cdot E(\xi(0)) \cdot d(t_m) + \mathrm{Var}(\Delta_d(t_m)) + d(t_m)^2 \\
&= \frac{5T_i^2}{18} + T_i \cdot d(t_m) + t_m \cdot \sigma^2_\Delta + d(t_m)^2 \\
&= \frac{5}{18}\Big(T_i + \frac{9}{5}\, d(t_m)\Big)^2 + t_m \cdot \sigma^2_\Delta + \frac{1}{10} \cdot d(t_m)^2
\end{aligned}
\tag{15.25}
\]

and $d(t_m) \equiv E(\Delta_d(t_m))$ is the average aggregated wear.
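The degradation relationships can be checked numerically with a short sketch. It assumes that µ∆(t) in Equation 15.19 is the mean of the wear increment ∆(t) itself, that the increments are independent with common variance σ∆², and that wear accumulates over operations t = 1, …, tm; these are reading assumptions, and the function names and parameter values are hypothetical.

```python
import math

def mean_aggregated_wear(t_m, mu0, mu1, beta):
    """d(t_m): sum of the mean wear increments of Eq. 15.19 for t = 1..t_m."""
    return sum(mu0 + mu1 * math.exp(-beta * t) for t in range(1, t_m + 1))

def clearance_second_moment(T_i, d_tm, t_m, sigma2_delta):
    """E((xi(0) + Delta_d(t_m))^2), third line of Eq. 15.25."""
    return 5.0 * T_i**2 / 18.0 + T_i * d_tm + t_m * sigma2_delta + d_tm**2

def clearance_second_moment_completed_square(T_i, d_tm, t_m, sigma2_delta):
    """Same quantity in the completed-square form, last line of Eq. 15.25."""
    return (5.0 / 18.0) * (T_i + 1.8 * d_tm) ** 2 + t_m * sigma2_delta + 0.1 * d_tm**2

def locating_variances(T_i, d_tm, t_m, sigma2_delta, alpha=None):
    """(sigma_x^2, sigma_z^2) at t_m: Eq. 15.21-15.22 if alpha is None, else 15.23-15.24."""
    m2 = clearance_second_moment(T_i, d_tm, t_m, sigma2_delta)
    if alpha is None:
        return 0.5 * m2, 0.5 * m2
    return math.sin(alpha) ** 2 * m2, math.cos(alpha) ** 2 * m2

if __name__ == "__main__":
    # Hypothetical values: tolerance in mm, wear per operation in mm.
    T_i, t_m = 0.2, 5000
    d_tm = mean_aggregated_wear(t_m, mu0=1e-5, mu1=4e-5, beta=0.01)
    print("d(t_m)            :", d_tm)
    print("4-way variances   :", locating_variances(T_i, d_tm, t_m, sigma2_delta=1e-10))
    print("forms agree       :", math.isclose(
        clearance_second_moment(T_i, d_tm, t_m, 1e-10),
        clearance_second_moment_completed_square(T_i, d_tm, t_m, 1e-10)))
```

With zero wear the second moment reduces to 5Ti²/18, so Equations 15.21 to 15.24 fall back to Equations 15.10, 15.11, 15.15, and 15.16.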
15.2.5 COST FUNCTION
Many different cost functions of tolerances have been proposed for different tolerance allocation schemes [5]. As for a dimensional tolerance made from a fabrication process, the reciprocal function and negative exponential function are the most often-used representations. In this problem, the cost function is chosen to be the reciprocal function represented as

\[ C_T = \sum_{i=1}^{p} \frac{w_i}{T_i}, \tag{15.26} \]

where $T_i$ is the ith tolerance, i = 1, 2, …, p, and $w_i$ is the weighting coefficient associated with $T_i$.
15.2.6 OPTIMIZATION FORMULATION AND OPTIMALITY

As long as these essential process models have been made available, a constrained optimization problem is formulated for the multistage assembly process as

$$\mathbf{T}^* = \arg\min_{\mathbf{T}} \{ C_T(\mathbf{T}) \} \tag{15.27}$$

subject to

$$\Im_Q(\mathbf{T}, t) = \sigma_s^2 - \|\mathrm{diag}(\Sigma_Y)\|_\infty \ge 0 \ \text{ for all } 0 < t < t_m, \quad T_i > 0 \ \forall i,$$

where $\|\mathrm{diag}(\Sigma_Y)\|_\infty = \max_i(|k_{ii}|)$ and $k_{ii}$ is the (i, i) diagonal element of matrix ΣY (see Equation 15.5). Vector diag(ΣY) includes variances of those selected measurement points on the final product that are key product characteristics (KPCs). The current choice of constraint function requires that variations of all KPCs on the final product must be less than the given upper variation limit (i.e., $\sigma_s^2$ in this formulation). This constraint function is only one of the choices corresponding to the criteria used in industry. It can be shown that this formulation will achieve global optimality, because the cost function is convex and the constraint function is a concave function.
15.3 CASE STUDY FOR PROCESS-ORIENTED TOLERANCING

The assembly process of side frame inner panel (Figure 15.6) from Chapter 6 is used to illustrate the tolerancing of multistation assembly processes. We will first consider this process without considering tooling degradation.
15.3.1 TOLERANCE ALLOCATION WHEN TOOLING DEGRADATION IS NOT CONSIDERED

There are twelve clearance tolerance variables, T1 to T12, to be allocated in this three-station process (each station has four pin-hole locating pairs). It is assumed that all process variables have the same weight in the cost function, that is, wi = 1 for i = 1, 2, …, 12 in Equation 15.26. The designer requires that the final product (the inner-panel-complete) have a Six Sigma value no greater than 1.5 mm at all KPCs, namely $\sigma_s^2 = (1.5/6)^2$ in Equation 15.27. From industrial practice, it is known that the tolerance of a clearance is usually above 0.01 mm. Thus, the initial tolerance is picked from the interval [0.01, 2] mm. The procedure for tolerance allocation is shown in the flowchart of Figure 15.7. The optimization problem is solved by using the Matlab function fmincon, which uses a sequential quadratic programming (SQP) method [6]. The algorithm converges within several minutes and yields the optimal tolerance after 290 iterations. The optimally allocated tolerances for these process variables are listed in Table 15.1. Compared with the current industry practice, where the tolerance of locating clearance is allocated empirically as uniform for all locating pairs, the process-oriented tolerancing approach no longer allocates tolerances uniformly. This nonuniformity is consistent with the process sensitivity, i.e., the more variation the process variable contributes to the final product, the tighter the corresponding tolerance should be. It is too hard for the empirical approach to determine which tolerance should be tight and which should be loose. As a result, either the cost is higher or the variation of final product is above the threshold.

FIGURE 15.6 Side frame inner panel assembly: (a) principal locating points (PLP) P1 to P8 on the A-pillar inner panel, B-pillar inner panel, rail roof side panel, and rear quarter inner panel; (b) measurement locating points (MLP) M1 to M10.

FIGURE 15.7 Tolerance allocation without degradation model. Flowchart steps: tolerance T is initiated from the interval [0.01, 2]; compute variation of process variables; compute variation of KPCs on final product; MATLAB optimization by sequential quadratic programming (iterate); stop and select tolerance with minimum cost.

TABLE 15.1 Tolerances without Tooling Degradation

T1    T2    T3    T4    T5    T6    T7    T8    T9    T10   T11   T12
0.21  0.36  0.19  0.31  0.30  0.42  0.63  0.36  0.34  0.34  0.34  0.32

Note: Unit: mm.
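The link between sensitivity and tolerance tightness can be illustrated in a deliberately simplified setting. The sketch below assumes a single KPC whose variance is a weighted sum of squared tolerances, Σ a_i T_i², instead of the max-norm constraint over all KPCs in Equation 15.27; the sensitivities a_i are hypothetical and are not values from the case study. Under that assumption the Lagrange conditions give T_i proportional to (w_i / a_i)^(1/3), so no SQP solver is needed.

```python
def allocate_single_kpc(weights, sensitivities, sigma_s_sq):
    """Minimize sum(w_i / T_i) subject to sum(a_i * T_i**2) <= sigma_s_sq.

    Stationarity of the Lagrangian gives T_i**3 = w_i / (2 * lam * a_i),
    so T_i = c * (w_i / a_i) ** (1/3); the scale c is fixed by making
    the variance constraint active.
    """
    shape = [(w / a) ** (1.0 / 3.0) for w, a in zip(weights, sensitivities)]
    scale_sq = sigma_s_sq / sum(a * s * s for a, s in zip(sensitivities, shape))
    scale = scale_sq ** 0.5
    return [scale * s for s in shape]


sigma_s_sq = (1.5 / 6.0) ** 2        # Six Sigma of 1.5 mm, as in the case study
a = [0.8, 0.1, 0.3]                  # hypothetical sensitivities
w = [1.0, 1.0, 1.0]                  # equal cost weights
T = allocate_single_kpc(w, a, sigma_s_sq)

kpc_variance = sum(ai * ti**2 for ai, ti in zip(a, T))
cost_optimal = sum(wi / ti for wi, ti in zip(w, T))
T_uniform = (sigma_s_sq / sum(a)) ** 0.5      # uniform tolerance meeting the same limit
cost_uniform = sum(wi / T_uniform for wi in w)
print([round(t, 3) for t in T], cost_optimal, cost_uniform)
```

In this toy problem the variable with the largest sensitivity receives the tightest tolerance, and the nonuniform allocation costs less than the uniform one that meets the same variance limit, which mirrors the qualitative behavior described above.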
15.3.2 TOLERANCE ALLOCATION CONSIDERING TOOLING DEGRADATION

Under this condition, the tolerances are allocated at the beginning of production, whereas the quality criteria are checked for all products produced by the degraded process. The procedure for tolerance allocation taking the tooling degradation model into consideration is shown in Figure 15.8. The optimization is still solved by using the Matlab function fmincon, but with the tooling degradation model implemented. Based on industry experience, parameters needed in the degradation model such as operation rate, maintenance period, and pin wear-out rate are listed in Table 15.2. The algorithm converges and yields the optimal tolerance after almost the same number of iterations as in Subsection 15.3.1. The new tolerances become tighter and are shown in Table 15.3.

FIGURE 15.8 Tolerance allocation with degradation. Flowchart steps: tolerance T is initiated from the interval [0.01, 2]; compute initial variation of process variables; tooling degradation: compute the end-period variation of process variables; compute end-period variation of KPCs on final product; optimization iteration (iterate); stop and select tolerance with minimum cost. Here, end-period means at the end of the scheduled maintenance period.
TABLE 15.2 Parameters in the Degradation Model

µ0 (mm)      µ1 (mm)      β            σ∆ (mm)      tm         Operations/day
5 × 10^-7    1 × 10^-6    1 × 10^-3    5 × 10^-5    6 months   500
TABLE 15.3 Tolerances with Tooling Degradation

T1    T2    T3    T4    T5    T6    T7    T8    T9    T10   T11   T12
0.16  0.31  0.14  0.25  0.23  0.34  0.58  0.26  0.30  0.27  0.28  0.26

Note: Unit: mm.
TABLE 15.4 σ of KPCs for 0.25 mm Tolerance

              Beginning       Half Year       Specified
Maximum 6σ    6σ = 1.44 mm    6σ = 1.77 mm    6σ = 1.50 mm
TABLE 15.5 Comparison of Manufacturing Cost under Different Scenarios

Conditions    Without Degradation    With Degradation    Uniform 0.25 mm
Cost          38.1                   47.9                48
Currently in the automotive industry, the tolerances are uniformly set to be 0.25 mm for all clearances. Using these tolerances, the maximum Six Sigma values of KPCs at the beginning of production and after a half year of production are listed in Table 15.4. Although the assigned tolerance can produce qualified products at the beginning of the production period, many out-of-specification products will be fabricated when tooling elements are degraded. Furthermore, the manufacturing costs of different cases, represented by the summation of the reciprocal of all the tolerances as in Equation 15.26 are compared in Table 15.5. When degradation is not considered, the tolerance is allocated nonuniformly and results in the manufacturing cost decreasing by 20.6% compared to that with uniform 0.25-mm tolerance. When process degradation is considered, product quality is ensured throughout the production without increasing manufacturing cost from that with uniform 0.25-mm tolerance. Because defective product will be unavoidably
produced under the scheme of uniform 0.25-mm tolerance, the actual cost is even higher for the empirical method when the quality-loss-related costs such as rework, labor, and material waste are counted. It can be concluded that process-oriented tolerance allocation can deliver high-quality product at a comparably lower cost and, thus, it is advantageous over the empirical approach.
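The cost entries in Table 15.5 follow from Equation 15.26 with wi = 1. As a quick check, the sketch below recomputes them from the rounded tolerances printed in Table 15.1 and Table 15.3; the function name is an assumption made for illustration.

```python
def reciprocal_cost(tolerances, weights=None):
    """Equation 15.26: C_T = sum_i w_i / T_i (w_i = 1 when weights is None)."""
    if weights is None:
        weights = [1.0] * len(tolerances)
    return sum(w / t for w, t in zip(weights, tolerances))


table_15_1 = [0.21, 0.36, 0.19, 0.31, 0.30, 0.42,
              0.63, 0.36, 0.34, 0.34, 0.34, 0.32]   # without degradation, mm
table_15_3 = [0.16, 0.31, 0.14, 0.25, 0.23, 0.34,
              0.58, 0.26, 0.30, 0.27, 0.28, 0.26]   # with degradation, mm
uniform = [0.25] * 12                               # current practice, mm

cost_no_deg = reciprocal_cost(table_15_1)
cost_deg = reciprocal_cost(table_15_3)
cost_uniform = reciprocal_cost(uniform)
saving = 1.0 - cost_no_deg / cost_uniform           # relative to uniform 0.25 mm
print(round(cost_no_deg, 1), round(cost_deg, 1), round(cost_uniform, 1), round(saving, 3))
```

Because the tabulated tolerances are rounded to two decimals, the recomputed saving can differ from the quoted 20.6% in the last digit.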
15.4 INTEGRATION OF PROCESS-ORIENTED TOLERANCE SYNTHESIS AND MAINTENANCE PLANNING

In the process-oriented tolerance synthesis framework developed from Section 15.1 to Section 15.3, we have seen that the process variables carry the dynamic process information such as tooling degradation and, thus, they are strongly related to process reliability and the corresponding maintenance policies. This suggests that tolerance design and maintenance decision-making policy are interconnected through process variables. Intuitively, tight initial tolerances specified on process variables can reduce the frequency of conducting maintenance during production, because the process can accommodate more deterioration to reduce maintenance cost; but they take a toll on tolerance cost. On the other hand, loose initial tolerances specified on process variables can lower design cost but increase the frequency of maintenance during production. Hence, there is a critical need to achieve a balance between the tolerance cost of tooling fabrication and the maintenance cost of tooling replacement. The maintenance schedule used in the previous sections is not considered a decision variable but is a predetermined constant period based on empirical experience. This and the following sections will present a general framework to integrate preventive maintenance decision-making and process-oriented tolerance synthesis for MMP.
15.4.1 DECISION VARIABLES OF TOLERANCE AND MAINTENANCE DESIGN
As in the QR-Chain model developed in Chapter 16, we consider an MMP consisting of p manufacturing system components that are distributed in N stations. The p process variables associated with the system components are denoted by the vector ξ = [ξ1 … ξp]T. The study of maintenance actions here is focused on replacements of system components. For new manufacturing system components (without degradation), the allowable varying ranges of ξ are determined by their tolerances T. Because of the continuous degradation of tooling elements during production time, ξ may shift continuously out of the ranges defined by initial tolerances T. Maintenance action can restore the states of one or more system components from their deteriorated states. In other words, by replacing the system component i with a new one, maintenance action resets ξi to a value within its original range defined by Ti. An easy-to-adopt preventive maintenance policy is the age replacement policy, under which tooling element i is replaced when it reaches age ta(i). Vector ta ≡ [ta(1) … ta(p)]T denotes the age replacement maintenance policies for all system components. In discrete-part manufacturing processes, such as automotive body assembly processes, ta is usually measured in terms of number of operations performed by the manufacturing system. T and ta are both key decision variables in the integrated design of manufacturing systems. They determine the states of process variable ξ. The relationship of T and ta to ξ is designated in segment II of Figure 15.9.

FIGURE 15.9 Integration of tolerance and maintenance design in MMP. Segment I: KPC deviations Y(t; T, ta) vs. process variables across Station 1, …, Station k, …, Station N. Segment II: effects of the decision variables (tolerance T and maintenance policy ta) on process variables. Segment III: different cost components entering the objective functions, namely tooling fabrication cost $C_T^i$, i = 1, …, p, maintenance cost $C_M(t; T, t_a)$, quality loss L(t; T, ta), and total cost C(t).
15.4.2 COST COMPONENTS

Costs are involved in the selection of tolerances and maintenance policies for manufacturing systems. The costs associated with tolerances have been discussed in Subsection 15.2.5. We use $C_T^i$ to denote the cost associated with the tolerance of the ith process variable, i = 1, …, p. The maintenance policy affects the maintenance cost at time t, denoted by $C_M(t)$. Here time t, measured by the number of operations, is in a discrete scale. Additionally, loss is always incurred as product quality deteriorates. The cost due to product quality deterioration is indicated by a loss function L(t). These three cost components are illustrated in segment III of Figure 15.9 and will be discussed in detail as follows.

15.4.2.1 Tolerance Cost

The same cost function as in Subsection 15.2.5 is applied for tolerance costs:

$$C_T^i = \frac{w_i}{T_i}, \quad i = 1, \ldots, p \tag{15.28}$$
15.4.2.2 Maintenance Cost

When a system component is replaced, the replacement cost should include the manufacturing cost of a new system component, which is its tolerance cost. In addition to the cost associated with its tolerance, the maintenance cost of replacing system component i includes various other items, such as labor and basic repair facility costs. The total of these additional costs is represented by $c_{0i}^{p}$ for system component i. The replacement cost induced at time t can be written as

$$C_M(t) = \sum_{i \in J_t} \left( C_T^i + c_{0i}^{p} \right) = \sum_{i \in J_t} \left( \frac{w_i}{T_i} + c_{0i}^{p} \right) \tag{15.29}$$

where Jt is the index set of the system components subject to a scheduled replacement at time t. If no system component is replaced at time t, $C_M(t)$ is zero. Obviously, Jt is affected by maintenance schedule ta, and $C_T^i$ is affected by tolerance T. Thus, $C_M(t)$ depends on both T and ta. The notation $C_M(t; T, t_a)$ is used when we need to make this dependency explicit.

15.4.2.3 Quality Loss Function

The following quality loss function, which is a multivariate quadratic loss function, is used:

$$L(\mathbf{Y}) = \mathbf{Y}^T \mathbf{S} \mathbf{Y} \tag{15.30}$$

where S = [sij] is a symmetric positive definite matrix. From Figure 15.9, the final product KPC deviation Y depends on tolerance T, the maintenance schedule ta, and operation time t. Therefore, the quality loss function L(Y) depends on all three of these variables. The dependency of L on t, T, and ta is made explicit in the notation L(t; T, ta).
15.4.3 FORMULATION OF OPTIMIZATION PROBLEMS
This subsection will formulate two different optimization problems depending on the use of the quality loss function.

15.4.3.1 Optimization Formulation Using a Quality Loss Function (F1)

Under this scenario, the overall cost C(t) includes both maintenance cost and quality loss and is

$$C(t) = C_M(t; \mathbf{T}, \mathbf{t}_a) + L(t; \mathbf{T}, \mathbf{t}_a)$$

The long-run expected production cost per unit time Φ(T, ta) is defined as

$$\Phi(\mathbf{T}, \mathbf{t}_a) \equiv \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} E\big(C(\tau)\big)}{t} = \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} E\big(C_M(\tau; \mathbf{T}, \mathbf{t}_a) + L(\tau; \mathbf{T}, \mathbf{t}_a)\big)}{t} \tag{15.31}$$

The objective of the integrated tolerance and maintenance design is to minimize the long-run average production cost Φ(T, ta). The optimization problem is formulated as

$$\text{F1:} \quad (\mathbf{T}^*, \mathbf{t}_a^*) = \arg\min_{\mathbf{T}, \mathbf{t}_a} \big( \Phi(\mathbf{T}, \mathbf{t}_a) \big) \tag{15.32}$$
subject to Ti > 0 and ta(i) > 0, ∀i. Typically, a tight tolerance and high replacement frequency lead to a high maintenance cost in the long run. On the other hand, a loose tolerance and low replacement frequency lead to high quality loss. Hence, optimal designs of both tolerance and maintenance are needed to trade off the maintenance cost and quality loss.

15.4.3.2 Optimization Formulation with a Quality Constraint

If a quality loss function is difficult to determine in some manufacturing processes, a scheme similar to Equation 15.27 may be used for the optimal design by formulating a constrained optimization problem: selecting the optimal tolerance and maintenance policy to minimize the maintenance cost while satisfying the constraints on the KPC deviations at any operation time t. Under this scenario, let Ψ(T, ta) denote the long-run expected maintenance cost per unit time, that is,

$$\Psi(\mathbf{T}, \mathbf{t}_a) \equiv \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} E\big(C_M(\tau)\big)}{t} \tag{15.33}$$

Then the second formulation of the optimization problem is expressed as

$$\text{F2:} \quad (\mathbf{T}^*, \mathbf{t}_a^*) = \arg\min_{\mathbf{T}, \mathbf{t}_a} \Psi(\mathbf{T}, \mathbf{t}_a) \tag{15.34}$$

subject to $[\Sigma_Y]_{(i,i)} \le \sigma_s^2$, ∀t, Ti > 0, ta(i) > 0, ∀i, where $[\Sigma_Y]_{(i,i)}$ is the (i, i)th entry of ΣY, i.e., $[\Sigma_Y]_{(i,i)}$ is the variance of the ith KPC. The solution to the aforementioned optimization problems depends on the model describing the impacts of T and ta on the final product KPC deviation Y. This model generally relies on the physical knowledge of specific manufacturing processes. In the following sections, we will develop these models in the context of BIW assembly processes.
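The long-run average in Equation 15.33 can be made concrete for a single component under age replacement. The sketch below is illustrative only: it sums the replacement cost of Equation 15.29 from τ = 1 (so that the new component at τ = 0 is not counted as a replacement, an assumption made here) and compares the finite-horizon average with the limit (w/T + c0)/ta. All numerical values are hypothetical.

```python
def average_maintenance_cost(T, t_a, w, c0, horizon):
    """Average of C_M(tau) over tau = 1..horizon for one component that is
    replaced every t_a operations (age replacement, Equation 15.29)."""
    replacement_cost = w / T + c0
    total = 0.0
    for tau in range(1, horizon + 1):
        if tau % t_a == 0:          # component reaches age t_a and is replaced
            total += replacement_cost
    return total / horizon


T, t_a, w, c0 = 0.25, 100, 200.0, 200.0     # illustrative values only
limit = (w / T + c0) / t_a                  # long-run average cost per operation
for horizon in (150, 1050, 100000):
    print(horizon, average_maintenance_cost(T, t_a, w, c0, horizon), limit)
```

The finite-horizon average fluctuates for short horizons and settles at the per-cycle cost divided by the cycle length, which is the form used later for the per-pin cost.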
15.5 INTEGRATED TOLERANCE AND MAINTENANCE DESIGN IN BIW ASSEMBLY PROCESSES

Based on Equation 15.5, we have

$$\Sigma_Y = \sum_{k=1}^{N} \gamma(k) \Sigma_u(k) \gamma^T(k) = \Gamma \Sigma_u \Gamma^T \tag{15.35}$$

where $\Gamma = [\gamma(1) \ \cdots \ \gamma(N)]$ and

$$\Sigma_u = \begin{bmatrix} \Sigma_u(1) & & 0 \\ & \ddots & \\ 0 & & \Sigma_u(N) \end{bmatrix}.$$

Let u ≡ [u1 … uN]T denote the process faults at all stations in the BIW assembly process and [u]i, i = 1, …, 2p, denote the ith element of u. The relationship between the process fault ([u]i) and the tolerance (Ti) and age (τi) of the corresponding locating pin has been derived in Subsection 15.2.4 (by replacing tm with τi) as follows:

$$E([\mathbf{u}]_i) = 0 \ \ \forall i, \quad \text{and} \quad \mathrm{Cov}([\mathbf{u}]_i, [\mathbf{u}]_j) = 0, \ \ \forall i \ne j \tag{15.36}$$
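Assuming constant pin degradation rate, i.e., assuming that µ∆(t) defined in Subsection 15.2.4 is a constant denoted by µi for pin i, the variance of the tool-locating error is

$$\mathrm{Var}([\mathbf{u}]_i) = \kappa_i E\big(\xi^2_{\lceil i/2 \rceil}\big) = \kappa_i \left[ \frac{5}{18} \left( T_{\lceil i/2 \rceil} + \frac{9}{5} \tau_{\lceil i/2 \rceil} \mu_{\lceil i/2 \rceil} \right)^2 + \tau_{\lceil i/2 \rceil} \sigma_\Delta^2 + \frac{\tau^2_{\lceil i/2 \rceil}}{10} \mu^2_{\lceil i/2 \rceil} \right], \quad i = 1, \ldots, 2p \tag{15.37}$$

where $\lceil x \rceil$ denotes the smallest integer greater than or equal to x, and

$$\kappa_i = \begin{cases} 0.5, & \text{for 4-way pin locating error} \\ \sin^2\alpha, & \text{for 2-way pin locating error in } x \text{ direction} \\ \cos^2\alpha, & \text{for 2-way pin locating error in } z \text{ direction} \end{cases}$$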
There are 2p process faults in the system, because each pin may have errors in both x- and z-directions. From Equation 15.36, E(Y) = 0 and Σu(k) is diagonal. Further, by utilizing Equation 15.5, the expected value of quality loss (defined in Equation 15.30) can be written as

$$\begin{aligned}
E(L(\mathbf{Y})) &= E\left( \sum_{i=1}^{MN} \sum_{j=1}^{MN} s_{ij} y_i y_j \right) = \sum_{i=1}^{MN} \sum_{j=1}^{MN} s_{ij} \mathrm{Cov}(y_i, y_j) \\
&= \sum_{i=1}^{2p} \left( [\Gamma]^T_{(:,i)} \mathbf{S} [\Gamma]_{(:,i)} \right) \mathrm{Var}([\mathbf{u}]_i) = \sum_{i=1}^{2p} \rho_i \mathrm{Var}([\mathbf{u}]_i)
\end{aligned} \tag{15.38}$$

where $\rho_i \equiv [\Gamma]^T_{(:,i)} \mathbf{S} [\Gamma]_{(:,i)}$. Substituting Equation 15.37 into Equation 15.38, we are able to express the quality loss function L in terms of T and ta for automotive body assembly processes.
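The identity behind Equation 15.38 can be checked on a small example. The sketch below assumes Y = Γu with zero-mean, uncorrelated faults (Equation 15.36) and computes the expected loss both as trace(S ΣY) and as Σ ρi Var([u]i). The 2 × 3 matrix Γ, the weight matrix S, and the fault variances are hypothetical numbers chosen for illustration.

```python
def expected_quality_loss(gamma, S, var_u):
    """Return E[Y^T S Y] two ways for Y = Gamma u with E[u] = 0 and
    Cov(u) = diag(var_u): as trace(S Sigma_Y) and as sum_i rho_i Var(u_i)."""
    m, n = len(gamma), len(var_u)
    # Sigma_Y = Gamma diag(var_u) Gamma^T  (Equation 15.35)
    sigma_y = [[sum(gamma[r][k] * var_u[k] * gamma[c][k] for k in range(n))
                for c in range(m)] for r in range(m)]
    via_trace = sum(S[r][c] * sigma_y[c][r] for r in range(m) for c in range(m))
    # rho_i = Gamma[:, i]^T S Gamma[:, i]
    rho = [sum(gamma[r][i] * S[r][c] * gamma[c][i]
               for r in range(m) for c in range(m)) for i in range(n)]
    via_rho = sum(r_i * v for r_i, v in zip(rho, var_u))
    return via_trace, via_rho, rho


gamma = [[1.0, 0.5, 0.0],      # hypothetical 2 x 3 sensitivity matrix
         [0.2, -1.0, 0.7]]
S = [[2.0, 0.3],               # symmetric positive definite weights
     [0.3, 1.0]]
var_u = [0.01, 0.02, 0.005]    # hypothetical fault variances (mm^2)
print(expected_quality_loss(gamma, S, var_u))
```

Because S is positive definite, every ρi is nonnegative, which is what later allows the loss to be split into per-pin contributions.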
15.6 OPTIMIZATIONS AND OPTIMALITY FOR INTEGRATED TOLERANCE AND MAINTENANCE DESIGN

Two optimization formulations F1 and F2, one with a quality loss function and the other with a quality constraint function, will be further elaborated using process models of the automotive body assembly process. Because there are a large number of locating pins in a BIW assembly process, an efficient optimization procedure is required. This subsection focuses on the study of the optimality of the formulated optimization problems to ensure a feasible optimization solution.
15.6.1 OPTIMALITY ANALYSIS OF OPTIMIZATION FORMULATION F1

Problem F1 is formulated in Equation 15.32 when the quality loss function is considered a part of the objective function. The objective function in Equation 15.32 can be written as

$$\begin{aligned}
\Phi(\mathbf{T}, \mathbf{t}_a) &= \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} E\big( L(\mathbf{Y}(\tau)) + C_M(\tau) \big)}{t} \\
&= \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} \left[ \sum_{i=1}^{2p} \rho_i \mathrm{Var}([\mathbf{u}]_i) + E\left( \sum_{i \in J_\tau} \left( \frac{w_i}{T_i} + c_{0i}^{p} \right) \right) \right]}{t} \\
&= \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} \sum_{i=1}^{p} \big( L_i(\mathbf{Y}(\tau)) + E(C_M^i(\tau)) \big)}{t} \\
&= \sum_{i=1}^{p} \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} \big( L_i(\mathbf{Y}(\tau)) + E(C_M^i(\tau)) \big)}{t}
\end{aligned} \tag{15.39}$$

where $L_i(\mathbf{Y}(\tau)) \equiv \rho_{2i-1} \mathrm{Var}([\mathbf{u}]_{2i-1}) + \rho_{2i} \mathrm{Var}([\mathbf{u}]_{2i})$, which is the contribution of pin i to the quality loss at time τ, and

$$C_M^i(\tau) \equiv \begin{cases} \dfrac{w_i}{T_i} + c_{0i}^{p}, & \text{if } i \in J_\tau \\ 0, & \text{otherwise,} \end{cases}$$

which is the replacement cost of pin i at time τ. From Equation 15.37, $\mathrm{Var}([\mathbf{u}]_{2i-1})$ and $\mathrm{Var}([\mathbf{u}]_{2i})$ depend only on the replacement time and tolerance of pin i. So both $C_M^i(\tau)$ and $L_i(\mathbf{Y}(\tau))$ are independent of replacement schedules and tolerances of the other locating pins. We further assume that the degradation processes of different locating pins are independent. Therefore, based on Equation 15.39, the optimization problem of the whole system defined in Equation 15.32 can be decomposed into optimization problems for each single locating pin as follows:
$$(T_i^*, t_a^*(i)) = \arg\min_{T_i, t_a(i)} \big( \Phi_i(T_i, t_a(i)) \big) \tag{15.40}$$

subject to Ti > 0 and ta(i) > 0, ∀i, where

$$\Phi_i(T_i, t_a(i)) \equiv \lim_{t \to \infty} \frac{\sum_{\tau=0}^{t} \big( L_i(\mathbf{Y}(\tau)) + E(C_M^i(\tau)) \big)}{t}.$$
Because each replacement cycle for locating pin i is fixed as ta(i), we can see that

$$\begin{aligned}
\Phi_i(T_i, t_a(i)) &= \frac{\text{average cost per replacement cycle for pin } i}{\text{replacement cycle of pin } i} \\
&\approx \frac{\int_0^{t_a(i)} \big( \rho_{2i-1} \mathrm{Var}([\mathbf{u}]_{2i-1}) + \rho_{2i} \mathrm{Var}([\mathbf{u}]_{2i}) \big)\, dt + \dfrac{w_i}{T_i} + c_{0i}^{p}}{t_a(i)}
\end{aligned} \tag{15.41}$$

where $\mathrm{Var}([\mathbf{u}]_i)$ is calculated by Equation 15.37. Here, instead of a summation, the integral is used to approximate the cumulative quality loss until time ta(i). In the following discussion, Equation 15.41 will be treated as the exact long-run average production cost for pin i. Substituting Equation 15.37 into Equation 15.41 gives us
$$\Phi_i(T_i, t_a(i)) = \big( \rho_{2i-1}\kappa_{2i-1} + \rho_{2i}\kappa_{2i} \big) \left[ \frac{5}{18} \left( T_i + \frac{9}{10} \mu_i t_a(i) \right)^2 + \frac{13}{120} \mu_i^2 (t_a(i))^2 + \frac{1}{2} \sigma_i^2 t_a(i) \right] + \frac{w_i}{T_i t_a(i)} + \frac{c_{0i}^{p}}{t_a(i)}$$

Given Ti > 0 and ta(i) > 0,

$$\nabla^2 \left( \frac{1}{T_i t_a(i)} \right) = \begin{bmatrix} \dfrac{2}{T_i^3 t_a(i)} & \dfrac{1}{(T_i t_a(i))^2} \\[2ex] \dfrac{1}{(T_i t_a(i))^2} & \dfrac{2}{T_i (t_a(i))^3} \end{bmatrix}$$

is positive definite, suggesting that $1/(T_i t_a(i))$ is convex. Also, ρi, κi, µi, σi, wi, and $c_{0i}^{p}$, ∀i, are nonnegative. Therefore, Φi(Ti, ta(i)), which is a linear combination of several convex functions with nonnegative coefficients, is also a convex function of ta(i) and Ti. Thus, the optimality analysis for the optimal solution of Equation 15.40 can be stated as follows:

Result 15.1: The objective function in Equation 15.40 is convex, and hence the optimization problem described by Equation 15.40 converges to a global optimum.
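Because each per-pin objective is convex in (Ti, ta(i)), a simple search is enough to locate its minimum. The sketch below is one possible way to do it and is not the procedure used for the case study: it alternates one-dimensional golden-section searches over Ti and ta(i). The coefficient A stands for ρ(2i−1)κ(2i−1) + ρ(2i)κ(2i); its value of 2.77 is a hypothetical choice, since the source does not report ρi, picked only so that the result has the same order of magnitude as the entries of Table 15.7. The search brackets are also assumptions.

```python
import math


def phi_i(T, t_a, A, mu, sigma, w, c0):
    """Per-pin long-run average cost from substituting Equation 15.37 into
    Equation 15.41; A = rho_{2i-1} kappa_{2i-1} + rho_{2i} kappa_{2i}."""
    quality = A * ((5.0 / 18.0) * (T + 0.9 * mu * t_a) ** 2
                   + (13.0 / 120.0) * (mu * t_a) ** 2
                   + 0.5 * sigma ** 2 * t_a)
    maintenance = (w / T + c0) / t_a
    return quality + maintenance


def golden_section(f, lo, hi, rel_tol=1e-9):
    """Minimize a unimodal one-dimensional function on [lo, hi], lo > 0."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > rel_tol * (lo + hi):
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(x1) < f(x2):
            hi = x2          # minimum lies in [lo, x2]
        else:
            lo = x1          # minimum lies in [x1, hi]
    return 0.5 * (lo + hi)


def minimize_phi(A, mu, sigma, w, c0, sweeps=40):
    """Alternate exact 1-D minimizations over T and t_a (coordinate descent)."""
    T, t_a = 0.25, 6.0e4                      # starting point (assumed)
    for _ in range(sweeps):
        T = golden_section(lambda x: phi_i(x, t_a, A, mu, sigma, w, c0), 1e-3, 2.0)
        t_a = golden_section(lambda x: phi_i(T, x, A, mu, sigma, w, c0), 1e3, 1e7)
    return T, t_a


# mu, sigma, w, c0 as in Table 15.6; A is hypothetical.
T_opt, ta_opt = minimize_phi(A=2.77, mu=5e-7, sigma=5e-5, w=200.0, c0=200.0)
print(T_opt, ta_opt, phi_i(T_opt, ta_opt, 2.77, 5e-7, 5e-5, 200.0, 200.0))
```

Coordinate descent is used here only because it needs nothing beyond the standard library; any general-purpose convex or nonlinear solver could be substituted.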
15.6.2 OPTIMALITY ANALYSIS OF OPTIMIZATION FORMULATION F2

Equation 15.34 is used for the optimization formulation F2 when product quality is treated as the constraint. Removing L(Y) from the objective function and then following the derivation of Equation 15.39 to Equation 15.41, we can express the long-run average maintenance cost as

$$\Psi(\mathbf{T}, \mathbf{t}_a) = \sum_{i=1}^{p} \frac{w_i / T_i + c_{0i}^{p}}{t_a(i)}$$

The following lemmas are used to show that the local minimum is also the global minimum under this formulation.

Lemma 15.1: The cost function Ψ(T, ta) is convex.

Proof: For Ti > 0 and ta(i) > 0, $1/(T_i t_a(i))$ is convex; for all i, 1/ta(i) is convex; and wi and $c_{0i}^{p}$ are nonnegative. Thus, Ψ(T, ta), which is a linear combination of convex functions with nonnegative coefficients, is also convex.

Lemma 15.2: The constraint of F2 is a convex set.

Proof: From Equation 15.5, $[\Sigma_Y]_{(i,i)}$ is a linear combination of Var([u]i), i = 1, 2, …, 2p, with nonnegative coefficients. It is known that given tolerance T and maintenance policy ta, ΣY depends on the age of each locating pin τi, i = 1, 2, …, p. This dependency is made explicit with the notation ΣY|τ1, …, τp. From Equation 15.37, we can see that Var([u]i) increases with $\tau_{\lceil i/2 \rceil}$. This monotonic property suggests that $[\Sigma_Y]_{(i,i)}$ achieves its maximum at the time when the age of each locating pin equals its scheduled replacement cycle. That is,

$$\max_{\tau_1, \ldots, \tau_p} [\Sigma_Y]_{(i,i)} = [\Sigma_Y]_{(i,i)} \Big|_{\tau_1 = t_a(1), \ldots, \tau_p = t_a(p)}$$

As such, the constraint of F2 can be written as
$$\left\{ \mathbf{T}, \mathbf{t}_a \ \middle|\ [\Sigma_Y]_{(i,i)} \le \sigma_s^2, \ \forall t, \ T_i > 0, \ t_a(i) > 0, \ \forall i \right\} = \left\{ \mathbf{T}, \mathbf{t}_a \ \middle|\ [\Sigma_Y]_{(i,i)} \Big|_{\tau_1 = t_a(1), \ldots, \tau_p = t_a(p)} \le \sigma_s^2, \ T_i > 0, \ t_a(i) > 0, \ \forall i \right\} \tag{15.42}$$

From Equation 15.37, $\mathrm{Var}\big([\mathbf{u}]_i \mid T_{\lceil i/2 \rceil}, t_a(\lceil i/2 \rceil)\big)$ is convex in T and ta. Then it is not difficult to see that the constraint set of F2 is convex.

Result 15.2: The nonlinear optimization problem (NLP) stated in Equation 15.34 converges to a global minimum T* and ta*.

Proof: Based on Lemma 15.2, the constraint in Equation 15.34 is a convex set. From Lemma 15.1, the objective function of Equation 15.34 is convex. Thus every local minimum of the objective function is a global minimum within the constraint.
15.7 CASE STUDY FOR INTEGRATED TOLERANCE AND MAINTENANCE DESIGN

A case study is conducted for the assembly process of side frame inner panel as shown in Figure 15.6. Parameters used in this study are listed in Table 15.6. We assume that S = qI, where I is an identity matrix and q is the weight in the quality loss function for all KPCs. The optimal tolerance and maintenance design will be studied for both F1 and F2, respectively.
15.7.1 OPTIMAL TOLERANCE AND MAINTENANCE DESIGN FOR OPTIMIZATION FORMULATION F1

The optimal tolerance and replacement cycle for each locating pin are calculated using Matlab functions and are listed in Table 15.7, where different tolerances and replacement cycles are assigned to individual locating pins. The difference in the assignments of tolerance and replacement cycle is because the contribution of each locating pin to the quality loss is different depending on its type (4-way or 2-way) and geometrical position. If a locating pin makes little contribution to the quality loss, then a loose tolerance and a long replacement cycle will be assigned to this pin. This helps reduce maintenance cost while keeping the quality loss at an acceptable level.

In addition, the optimal solution is sensitive to the cost ratios q/w and c0/w. The optimal tolerance and replacement cycle of locating pin P1 with different combinations of q/w and c0/w are shown in Figure 15.10. Similar plots can be easily obtained for other locating pins. Figure 15.10 can be used as a guideline for the optimal selection of pin tolerance and maintenance schedule for various applications with specific cost ratios. From Figure 15.10, the optimal tolerance T1 decreases when the cost ratio c0/w or q/w increases. That is, a tighter tolerance is used when the fixed maintenance cost (modeled by c0) and/or the quality loss (modeled by q) becomes significant when compared to the tooling fabrication cost (determined by w). On the other hand, the optimal replacement cycle ta(1) rises with the increase of c0/w or decrease of q/w. That is, a longer replacement cycle is used when the fixed maintenance cost becomes significant, and/or the quality loss becomes less significant when compared to the tooling fabrication cost.

TABLE 15.6 Parameters Used in the Case Study

µ∆ (mm)      σ∆ (mm)      c0 ≡ $c_{0i}^{p}$ ($)    wi ($·mm)    q ($/mm^2)
5 × 10^-7    5 × 10^-5    200                      200          1

TABLE 15.7 Optimal Tolerance and Replacement Cycle under Optimization Formulation F1

i              1      2      3      4      5      6      7      8      9      10     11     12
Ti (mm)        0.078  0.105  0.083  0.100  0.086  0.102  0.112  0.093  0.078  0.086  0.072  0.079
ta(i) (×10^5)  1.48   2.06   1.60   1.94   1.65   1.99   2.20   1.79   1.49   1.65   1.35   1.50

FIGURE 15.10 Optimal tolerance and maintenance schedule for different combinations of cost ratios. Left panel: tolerance T1 (mm) vs. c0/w; right panel: replacement cycle ta(1) (×10^5) vs. c0/w; curves for q/w = 0.0005, 0.001, 0.0025, 0.005, 0.01, 0.05, 0.5, and 5.
15.7.2 OPTIMAL TOLERANCE AND MAINTENANCE DESIGN FOR OPTIMIZATION FORMULATION F2

Under the scenario of optimization formulation F2, we find the optimal tolerance assignment and tooling replacement cycle shown in Table 15.8. Because product quality is treated as a constraint, the optimal solution depends only on the cost ratio c0/w. The optimal tolerance assignment and maintenance schedule under different cost ratio c0/w for locating pin P1 are shown in Figure 15.11. Figure 15.11 shows that the optimal tolerance decreases over c0/w while the optimal replacement cycle increases over c0/w. The results are consistent with Figure 15.10. The same intuitive interpretation can be applied here.

TABLE 15.8 Optimal Tolerance and Replacement Cycles of Optimization Formulation F2

i              1      2      3      4      5      6      7      8      9      10     11     12
Ti (mm)        0.109  0.172  0.100  0.144  0.137  0.176  0.288  0.148  0.163  0.153  0.159  0.152
ta(i) (×10^5)  1.24   2.10   1.14   1.70   1.61   2.14   3.87   1.76   1.97   1.82   1.91   1.81

FIGURE 15.11 Optimal tolerance assignment and maintenance schedule under different c0/w. Left panel: T1 (mm) vs. c0/w; right panel: ta(1) vs. c0/w.
15.7.3 COST COMPARISON UNDER DIFFERENT DESIGN SCHEMES
The maintenance schedule used in Section 15.3 is a fixed replacement cycle of 0.6 × 10^5 operations for each locating pin purely based on experience. Table 15.3 yields a looser tolerance and shorter replacement cycle than the optimal integrated design, using either optimization formulation F1 or F2. The cost efficiencies of the two design schemes compared in Section 15.3 and the integrated maintenance and tolerance design are investigated in Table 15.9 under formulation F1. From Table 15.9, we can see that although the integrated tolerance and maintenance design suffers a higher system component fabrication cost at the first setup because of its tighter tolerance, it has a similar long-run average component fabrication cost as the other two design schemes because of its lower replacement frequency (2nd row in Table 15.9). More importantly, the tighter tolerance and lower replacement frequency lead to a much lower long-run average maintenance cost and quality loss than the other two. Therefore, in terms of the overall production cost including long-run maintenance cost and quality loss, the integrated tolerance and maintenance design is much better than the other two design schemes.

TABLE 15.9 Comparison of Production Cost under Optimization Formulation F1

Scenario (definition)                                          Uniform 0.25-mm   Fixed 0.6 × 10^5     Integration of Tolerance
                                                               Tolerance         Replacement Cycle    and Maintenance
Component fabrication cost at the first setup ($),
  $\sum_{i=1}^{p} w_i / T_i$                                   9.6 × 10^3        9.41 × 10^3          27.4 × 10^3
Long-run average component fabrication cost ($/operation),
  $\lim_{t\to\infty} \sum_{\tau=0}^{t} E(\sum_{i \in J_\tau} w_i/T_i) / t$    0.160    0.157    0.165
Long-run average maintenance cost ($/operation),
  $\lim_{t\to\infty} \sum_{\tau=0}^{t} E(C_M(\tau; T, t_a)) / t$             0.200    0.197    0.179
Long-run average quality loss ($/operation),
  $\lim_{t\to\infty} \sum_{\tau=0}^{t} E(L(\tau; T, t_a)) / t$               0.484    0.555    0.175
Long-run overall production cost ($/operation),
  $\lim_{t\to\infty} \sum_{\tau=0}^{t} E(C_M(\tau; T, t_a) + L(\tau; T, t_a)) / t$    0.684    0.752    0.354

The cumulative total production costs over operation time for these three design schemes are compared in Figure 15.12. It confirms that the integrated design faces a higher initial setup cost than the other two designs. However, after τ0 (≈0.3 × 10^5) cycles of operations, the process under the integrated design scheme becomes more and more cost-effective compared to the other designs. The optimal integrated design, although obtained using infinite-horizon long-run average cost criteria, can lead to significant cost savings if the concerned time horizon is greater than 2τ0 (0.6 × 10^5 cycles). However, if the time horizon is less than 2τ0, the infinite-horizon cost criteria may not be appropriate to approximate the finite horizon reality of the application.

FIGURE 15.12 Cumulative costs at different operation time for three design schemes. Axes: total cumulative cost (×10^5 $) vs. operation time (×10^5); curves: fixed replacement cycle, uniform 0.25 mm tolerance, and integrated design.

As for the constrained optimization problem, Table 15.10 shows that the long-run maintenance cost using the integrated design methodology is 45% lower than that of the other two design schemes. The tolerance assignment given in Section 15.3 has a smaller Six Sigma value and hence is better than the uniform tolerance design. With the same maximal Six Sigma value, the integrated design has a much lower maintenance cost than the design scheme in Section 15.3. Thus, it can be concluded that the integrated tolerance and maintenance design is more sophisticated and can achieve better performance than the other design schemes in various optimization settings.

TABLE 15.10 Comparison of Maintenance Cost and Quality for Optimization Formulation F2

Scenario                                   Uniform 0.25-mm   Uniform 6-Month      Integration of Tolerance
                                           Tolerance         Replacement Cycle    and Maintenance
Long run maintenance cost ($/operation)    0.20              0.20                 0.11
Maximum 6σ of KPCs (mm)                    1.76              1.50                 1.50
15.8 EXERCISES
1. Describe the difference between process-oriented tolerancing and product-oriented tolerancing.
2. Give a detailed derivation of Equation 15.10, Equation 15.11, and Equation 15.15–Equation 15.17.
3. Suppose the designer of the side frame inner panel assembly process illustrated in Figure 15.6 requires that the final product must have a Six Sigma value no greater than 1 mm at all KPCs. With all the other parameters unchanged, rerun the optimal tolerance allocation procedure in Subsection 15.3.1 using a computer and compare the result with Table 15.1.
4. Rerun the optimal tolerance allocation procedure in Section 15.3.2 with β = 0 and all the other parameters remaining the same. Compare your result with Table 15.3.
5. Evaluate the manufacturing cost under the optimal tolerance allocation design obtained from Problem 4 and compare your result with Table 15.5.
6. Based on Figure 15.12, what are the major sources of long-term cost savings for the integrated design?
7. Obtain all the numbers in Table 15.10 by yourself.
8. Outline the basic steps to capture catastrophic failures of system components as well as their degradations in the integrated maintenance and tolerance design framework.
References
1. Bjorke, O., Computer Aided Tolerancing, Tapir Publishers, Trondheim, Norway, 1978.
2. Chase, K.W. and Parkinson, A.R., A survey of research in the application of tolerance analysis to the design of mechanical assemblies, Research in Engineering Design, 3, 23, 1991.
3. Greenwood, W.H. and Chase, K.W., A new tolerance analysis method for designers and manufacturers, ASME Transactions, Journal of Engineering for Industry, 109, 112, 1987.
4. Shapiro, S.S. and Gross, A., Statistical Modeling Techniques, Marcel Dekker, New York, 1981.
5. Wu, Z., ElMaraghy, W.H., and ElMaraghy, H.A., Evaluation of cost-tolerance algorithms for design tolerance analysis and synthesis, Manufacturing Review, 1, 168, 1988.
6. Matlab, Optimization Toolbox User's Guide, Version 5, The MathWorks Inc., Natick, MA, 1999.
7. Ding, Y., Jin, J., Ceglarek, D., and Shi, J., Process-oriented tolerancing for multistation assembly systems, IIE Transactions, Vol. 37, No. 6, pp. 493–508, 2005.
8. Chen, Y., Ding, Y., Jin, J., and Ceglarek, D., Integration of tolerance and maintenance design for multistage manufacturing processes, IEEE Transactions on Automation Science and Engineering, Vol. 3, No. 3, 2006.
Part V Quality and Reliability Integration and Advanced Topics
16
Quality and Reliability Chain Modeling and Analysis*
16.1 INTRODUCTION
The downtime of a manufacturing system is caused not only by manufacturing-system component failures, but also by nonconforming products produced by a degraded system. Therefore, system reliability analysis for a manufacturing system should take into account not only its tooling and machine uptime, but also the product quality. To develop an integrated system model that combines product quality, manufacturing-system component degradation, and process design information for manufacturing-system design evaluation and optimization, the following terms are defined in this chapter:
• Manufacturing system: a system consisting of tools and equipment to perform the designed operations in a multistage manufacturing process (MMP) with products as its end item.
• Component/manufacturing-system component: a physical part of a manufacturing system, such as a tool, a piece of equipment, or a machine. In this chapter, component means the component of a manufacturing system rather than the component of a product. Examples of components include locators in assembly processes, cutting tools and drills in machining processes, and dies in stamping processes.
• Component catastrophic failure: failure of a component that immediately leads to downtime of the component. Examples of component catastrophic failures include breakage of tools such as locators and cutting tools.
• Product: both the incoming and outgoing workpieces at each station of an MMP, which include both the intermediate parts and the final product.
• Product quality: measured as the variation of the dimensional quality characteristics, such as the size, straightness, and orientation of a drilled hole in machining processes, and the size of a burr and the dimension of a formed part in stamping processes.
• Component reliability information: includes component catastrophic failure information, such as the catastrophic failure rate, and component degradation information, such as the wear rate.
* Part of chapter material is based on Reference 15 and Reference 16.
• System catastrophic failure: system failure due to component catastrophic failures.
• System failure due to nonconforming products: the event that the manufactured products are out of specification. For an MMP, the specifications can generally be assigned to the intermediate or final products.
• System reliability of an MMP: the probability that neither the system catastrophic failure nor the failure due to nonconforming products occurs during a specific period of time.
The reliability analysis of a manufacturing system should consider both manufacturing-system component reliability and the product quality. There are interactions between manufacturing-system component reliability and product quality, which are represented by a newly defined concept, the QR-Co-Effect, as shown in Figure 16.1. The QR-Co-Effect has two aspects. The first is that the degradation of a manufacturing-system component has an impact on product quality. Thus, a product with unsatisfactory quality may result from manufacturing-system component degradation before a catastrophic component failure is observed. This leads to manufacturing-system downtime due to defective products in production. The second is that the incoming product quality has an impact on manufacturing-system component degradation and its probability of catastrophic failure. A larger variation of incoming parts may introduce more interference between workpieces and manufacturing-system components during operations. Therefore, accelerated degradation and/or more catastrophic failures can be observed when incoming products have a larger variation or mean deviation. As a result, manufacturing-system components working with low-quality incoming products may fail sooner than those working with consistently high-quality incoming products. Without considering the QR-Co-Effect, the results of the manufacturing-system reliability analysis could be biased.
In an MMP, each station consists of multiple components. To simplify the problem, all components in an MMP are assumed to be in series: the catastrophic failure of any component may lead to system catastrophic failure. In an MMP, the final product quality is affected by the accumulation or stack-up of all variations
[Figure 16.1 diagram. Labels: System Reliability and Downtime; System Component Reliability; Nonconforming Product Quality; Outgoing Product Quality; Component Degradation; QR-Co-Effect; Component Reliability; Component Catastrophic Failure; Product Quality; In-Coming Product Quality.]
FIGURE 16.1 QR-Co-Effect function between quality and reliability.
[Figure 16.2 diagram. Labels: Product Quality; Component Reliability; Station 1; Station 2; Station 3; Variation Propagation; QR-Co-Effect.]
FIGURE 16.2 General concepts of the QR-Chain in MMPs.
generated at previous stations. Considering the QR-Co-Effect at each station, the variation propagation in product quality leads to the propagation of the interaction between the manufacturing-system component reliability and the product quality (Figure 16.2), which is called the QR-Chain effect. The QR-Chain effect can be observed in many MMPs. Two examples are given here to illustrate the common characteristics of the QR-Chain effect in various manufacturing processes:
16.1.1 EXAMPLE 1: MACHINING PROCESSES
The QR-Chain effect can be observed in a cylinder head machining process. In an example from Huang et al. [1], as shown in Figure 16.3, there are two stations that drill a hole in a cylinder head (station I) and then tap a thread on it afterward (station II). In station I, the material properties of the incoming workpiece (taken as the product quality) have an important impact on the wear and breakage rate of the drill (taken as the manufacturing-system component reliability). The drill condition further
[Figure 16.3 diagram. Labels: Station I: Drill bolt hole; Station II: Tapping threads in the bolt. Y1, Y2, Z, M1, A1, B, and C are datum.]
FIGURE 16.3 QR-Chain in a machining process.
impacts the quality of the hole drilled in this station in terms of size, straightness, orientation, etc. In the next tapping station (station II), those quality characteristics of the drill hole at station I are essential factors affecting the thread quality of station II and the breakage rate of the tap. Therefore, the QR-Co-Effect at station I has propagated to station II. Because the stochastic degradation of the drill affects not only the breakage rate of the tap but also the conformity of the final product quality, the catastrophic failures of the tap and the system failure due to nonconforming products are statistically dependent.
16.1.2 EXAMPLE 2: TRANSFER OR PROGRESSIVE DIE-STAMPING PROCESSES
QR-Chain effects similar to those in Example 1 can be observed in a stamping process with transfer or progressive dies, where multiple stations are used to form a part. Figure 16.4 shows a doorknob-forming process consisting of six stations in a sequence of (1) notch and cutoff, (2) blanking, (3) the first draw, (4) the second draw, (5) the third draw, and (6) bulging [2]. In general, the product quality at previous stations impacts the current die or tool degradation, which further impacts the quality of products in the current station. As an example, a worn blanking die at station 2 (taken as component reliability) generates a burr on the edge of the part (taken as the outgoing product quality of station 2). In the following draw stations, from station 3 to station 5, the size of the burr (taken as incoming product quality) has an important impact on the draw die wear and breakage rate (taken as component reliability), and a further impact on the dimension and surface quality of the formed part. In the last bulging station, the combined effects of the formed part dimension and the size of the burr on the part surface impact the rubber tool wear rate (rubber functions as a die) at the bulging station. Thus, a clear QR-Chain effect
[Figure 16.4 diagram. Labels: Stamping Press; Tonnage Sensors; Doorknob Product; Slide; Blank. Multiple operations: Notch, Cutoff, Blanking, Draw, Redraw, 2nd-Redraw, Bulging.]
FIGURE 16.4 QR-Chain in a stamping process with a transfer die.
can be observed in this process. Because the stochastic degradation of the blanking die at station 2 affects not only the breakage rate of the draw dies in the following stations but also the conformity of the quality of the final formed workpiece, the failures of the draw dies and the system failure due to nonconforming products are statistically dependent.
From these MMP examples, the common characteristics of the QR-Chain effect can be summarized as follows:
1. The quality of the outgoing product at a station depends on the reliability of system components at the current station as well as the quality of incoming products produced by the previous stations.
2. The reliability of the manufacturing-system components at the current station can be affected by incoming product quality from the previous stations.
3. The deterioration of the outgoing product quality is caused by the degradation of the manufacturing-system components. The dependent propagation of product quality across stations causes the dependency between system catastrophic failures and system failure due to nonconforming products.
16.2 QR-CHAIN MODELING
Based on the characteristics of the QR-Chain effect summarized in Section 16.1, this section discusses models for the relationship between component performance and product quality, the degradation process of components, the assessment criteria of product quality, and the impact of the product quality on the component catastrophic failures. The following assumptions have been made in QR-Chain modeling:
A1. Discrete-part manufacturing processes are considered, in which the number of operation cycles is treated as the time index.
A2. Component degradation can be modeled by a discrete approximation to a diffusion process with mean wear linearly dependent on the component degradation state, and constant variance.
A3. The wear within a time interval has a Gaussian distribution.
A4. Without loss of generality, the component is in the ideal state if X(t) = 0.
A5. All components have nondecreasing wear.
A6. The product quality can be assessed by the mean squared deviation of the product quality characteristics from the target.
A7. Pr{a system component fails during the next operation time | it still works at the current operation time} is assumed to be proportional to the linear combination of the squared deviations of the product quality characteristics.
A8. The effects of different product quality characteristics on component catastrophic failure are independent.
16.2.1 RELATIONSHIP BETWEEN COMPONENT PERFORMANCE AND PRODUCT QUALITY
The product quality in an MMP is generally affected by the state of multiple manufacturing-system components. A measure of the component state, such as the diameter of a locator or the shut-height of a stamping press, is referred to as an adjustable process variable because these variables can be adjusted in the production phase of a manufacturing system through maintenance operations. Another type of variable affecting product quality is the random noise that is not determined by the component state, called noise variables. Examples of noise variables include random variations of raw material quality and random environmental variations. In general, noise variables randomly change from one operation time to the next and represent the natural variation of a manufacturing process. Considering further the interaction between the adjustable process variables and the noise variables, the following general linear model is assumed for the product quality characteristics $y_j(t)$, $j = 1, 2, \ldots, M$, which is called the process model:

$$y_j(t) = \eta_j + \alpha_j^T \xi(t) + \beta_j^T z_t + \xi(t)^T \Gamma_j z_t, \quad j = 1, 2, \ldots, M \tag{16.1}$$

where $\xi(t)$ is the vector of adjustable process variables; $z_t \equiv [z_{1t}, z_{2t}, \ldots, z_{lt}]^T \in R^l$ is the vector of noise variables, with mean $E(z_t)$ and covariance matrix $\mathrm{cov}(z_t)$ independent of the time index; $\eta_j$, $j = 1, 2, \ldots, M$, are constants; $\alpha_j$ and $\beta_j$ are vectors characterizing the effects of $\xi(t)$ and $z_t$; and $\Gamma_j$ is a matrix characterizing the effects of the interactions between $\xi(t)$ and $z_t$.
Remarks: The process model of Equation 16.1 is called the response model in robust parameter design [3,4]. In parameter design, the adjustable process variables in Equation 16.1 are often called the main effect of the control factors and the noise variables are called the main effect of the noise factors. Based on the effect-ordering principle in parameter design, the main effect of the control factors, the main effect of the noise factors, and the control-by-noise interactions are the most important effects on product quality. Thus, the linear combination of these three effects, as in Equation 16.1, is widely used in robust parameter design to model product quality characteristics. It will be seen in Section 16.4 that the process model of Equation 16.1 can be obtained from specific physical process models when the physical process knowledge is available. Otherwise, it can generally be obtained by using robust parameter design because the process model of Equation 16.1 has the same structure as the response model in robust parameter design.
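As a minimal sketch of how Equation 16.1 is evaluated for a single quality characteristic, the following Python fragment computes $y_j(t)$ from a given process-variable vector and noise vector. The function name and all numerical values are hypothetical and chosen only for illustration; they do not come from any process described in this chapter.

```python
import numpy as np

def quality_characteristic(eta_j, alpha_j, beta_j, Gamma_j, xi, z):
    """Evaluate Equation 16.1: eta_j + alpha_j^T xi + beta_j^T z + xi^T Gamma_j z."""
    return eta_j + alpha_j @ xi + beta_j @ z + xi @ Gamma_j @ z

# Hypothetical values: p = 2 adjustable process variables, l = 2 noise variables
eta_j = 0.1
alpha_j = np.array([0.5, -0.2])      # main effects of the process variables
beta_j = np.array([1.0, 0.3])        # main effects of the noise variables
Gamma_j = np.array([[0.0, 0.1],
                    [0.2, 0.0]])     # process-variable-by-noise interactions
xi = np.array([0.4, 0.1])            # component state xi(t)
z = np.array([0.05, -0.02])          # one realization of the noise z_t

y_j = quality_characteristic(eta_j, alpha_j, beta_j, Gamma_j, xi, z)
print(y_j)
```

The interaction term is what lets the component state change the sensitivity of the product to noise, which is why degradation can inflate product variation and not only shift its mean.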
16.2.2 SYSTEM COMPONENT DEGRADATION
In general, $\xi(t)$ in Equation 16.1 changes over time because of system component degradation. Let the time axis be divided into contiguous and uniform intervals of length $h$ and the successive endpoints of the intervals be denoted by $h, 2h, 3h, \ldots, kh, \ldots$. The process-variable vector $\xi(t_k) \in R^p$ represents the degradation state of system components at time $t_k \equiv kh$, $k = 1, 2, \ldots$. The production mission time is $t_K = Kh$. A well-known single component degradation model [5] is

$$X(t_{k+1}) - X(t_k) = \mu\big(X(t_k)\big) + \sigma\big(X(t_k)\big)\,\varepsilon_k \tag{16.2}$$
where $\{\varepsilon_k, k \ge 1\}$ is a sequence of independent and identically distributed (i.i.d.) random variables. When $\{\varepsilon_k, k \ge 1\}$ is normally distributed, Equation 16.2 becomes a discrete approximation to a diffusion process. For a multivariate version of Equation 16.2, the following model can be obtained based on assumption A2:

$$\xi(t_{k+1}) = P_k\, \xi(t_k) + G_k\, \varepsilon_k, \quad k = 0, 1, 2, \ldots \tag{16.3}$$

where $P_k$ and $G_k$ are possibly time-varying, known matrices of appropriate dimension; $\{\varepsilon_k, k \ge 1\}$ are i.i.d. and normally distributed, with $\varepsilon_k \sim N(\mu_\varepsilon, \Sigma_\varepsilon)$; and $\xi(t_0) \sim N(\mu_0, \Sigma_0)$. Equation 16.3 generally describes a Gauss–Markov process. The normal distribution of $\varepsilon_k$ is assumed here because of its mathematical tractability and the ample literature on diffusion processes and their applications in degradation modeling. For a large-scale MMP with many operation cycles during each time interval, the normal distribution assumption can also be justified by the central limit theorem if the wear increments of each operation cycle during a time interval are independent. From assumptions A4 and A5, for any $1 \le j \le p$, $\Pr\{[\xi(t_0)]_j < 0\}$ and $\Pr\{[\xi(t_{k+1}) - \xi(t_k)]_j < 0\}$ can be ignored. In fact, decreasing (or negative) wear is not meaningful in many applications [9].
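The recursion in Equation 16.3 can be simulated directly. The sketch below assumes time-invariant matrices $P$ and $G$ and a deterministic ideal initial state, which is a simplification of the general model; the function name and the parameter values are hypothetical. The mean wear increment is taken large relative to its standard deviation so that negative increments are negligible, in the spirit of assumptions A4 and A5.

```python
import numpy as np

def simulate_degradation(P, G, mu_eps, Sigma_eps, xi0, K, rng):
    """Simulate xi(t_0), ..., xi(t_K) from Equation 16.3 with constant P and G."""
    p = xi0.shape[0]
    path = np.empty((K + 1, p))
    path[0] = xi0
    for k in range(K):
        eps = rng.multivariate_normal(mu_eps, Sigma_eps)  # eps_k ~ N(mu_eps, Sigma_eps)
        path[k + 1] = P @ path[k] + G @ eps
    return path

# Hypothetical two-component example starting from the ideal state xi = 0
rng = np.random.default_rng(0)
P = np.eye(2)                          # wear accumulates on the current state
G = np.eye(2)
mu_eps = np.array([0.01, 0.02])        # mean wear per interval
Sigma_eps = np.diag([1e-6, 1e-6])      # small wear variance
path = simulate_degradation(P, G, mu_eps, Sigma_eps, np.zeros(2), K=50, rng=rng)
print(path[-1])                        # degradation state at the mission time t_K
```

Each row of `path` is one $\xi(t_k)$, so stacking the rows gives a realization of the degradation path used for conditioning in Section 16.3.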
16.2.3 PRODUCT QUALITY ASSESSMENT AND SYSTEM FAILURE DUE TO NONCONFORMING PRODUCTS
Based on the derivation in Appendix 16.1, the quality index $q_j(t)$ under a given component degradation state, $\xi(t_k)$, can be written in the following form:

$$q_j\big(t \mid \xi(t_k)\big) = \xi(t_k)^T Q_j\, \xi(t_k) + d_j, \quad j = 1, 2, \ldots, M, \quad t_k \le t < t_{k+1} \tag{16.4}$$
$Q_j$ in Equation 16.4 is positive semidefinite, and $d_j \ge 0$. For each quality index, there is a threshold value based on the product design specifications. Let the threshold for quality index $j$ be $a_j$. There is no system failure due to nonconforming products by time $t$ if $E_t^q$ holds, where $E_t^q$ is defined as the event that all quality indexes are within the specification by time $t$, i.e.,

$$E_t^q \equiv \bigcap_{j=1}^{M} \big( q_j(\tau) \le a_j, \ \forall\, 0 \le \tau \le t \big)$$

where $q_j(\tau)$, with $t_i \le \tau < t_{i+1}$, can be obtained by taking the expectation of $q_j(\tau \mid \xi(t_i))$, i.e.,

$$q_j(\tau) = E_{\xi(t_i)}\big[ q_j\big(\tau \mid \xi(t_i)\big) \big]$$
and $a_j$ can be determined by the product quality specification and the following process capability ratio, $C_{pm}$, which is widely used in quality engineering [7]:

$$C_{pm} = \frac{USL - LSL}{6\sqrt{MSE}} \tag{16.5}$$

where $USL - LSL$ is the tolerance range for a product quality characteristic and $MSE$ is the expected squared deviation.
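One way to turn Equation 16.5 into a threshold is to fix a required capability ratio and solve for the largest admissible MSE. The sketch below does this under the assumption that the quality index is the mean squared deviation (assumption A6), so that the threshold equals that maximal MSE; the function names and the specification limits are hypothetical.

```python
import math

def cpm(usl, lsl, mse):
    """Process capability ratio of Equation 16.5."""
    return (usl - lsl) / (6.0 * math.sqrt(mse))

def quality_threshold(usl, lsl, cpm_required):
    """Largest MSE that still achieves cpm_required (Equation 16.5 solved for MSE)."""
    return ((usl - lsl) / (6.0 * cpm_required)) ** 2

# Hypothetical specification: tolerance range of +/-1 mm and a required C_pm of 1.0
a_j = quality_threshold(usl=1.0, lsl=-1.0, cpm_required=1.0)
print(a_j, cpm(1.0, -1.0, a_j))
```

A stricter capability requirement shrinks the threshold quadratically, since the MSE enters Equation 16.5 under a square root.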
16.2.4 COMPONENT CATASTROPHIC FAILURE AND ITS INDUCED SYSTEM CATASTROPHIC FAILURE
From assumption A7,

$$\Pr\{\text{component } i \text{ fails at operation } t+1 \mid \text{it works at operation } t,\ y(t)\} = \lambda_{0i} + s_i^T \big( (y(t) - \gamma) \circ (y(t) - \gamma) \big), \quad i = 1, 2, \ldots, p \tag{16.6}$$

where $s_i \in R^M$ is called the QR-coefficient and has only nonnegative elements; $\gamma \equiv [\gamma_1\ \gamma_2\ \ldots\ \gamma_M]^T$ is the vector of target values; and $\circ$ denotes the elementwise product, so that $(y(t) - \gamma) \circ (y(t) - \gamma)$ is the vector of squared deviations referred to in assumption A7. With the failure rate being minimum at $y(t) = \gamma$ (quality characteristics right on the target), and with assumption A8, the quadratic relationship in Equation 16.6 can be considered as a second-order approximation of a general functional relationship based on the Taylor series. By taking the expectation over the noise variables in $y(t)$, and using the definition of the failure rate, $\lambda_i(t)$ can be written as

$$\begin{aligned}
\lambda_i(t) &= \Pr\{\text{component } i \text{ fails during } (t, t+1) \mid \text{it works at } t,\ \xi(t_k)\} \\
&= E\big[ \lambda_{0i} + s_i^T \big( (y(t) - \gamma) \circ (y(t) - \gamma) \big) \mid \xi(t_k) \big] \\
&= \lambda_{0i} + s_i^T E\big[ (y(t) - \gamma) \circ (y(t) - \gamma) \mid \xi(t_k) \big] \\
&= \lambda_{0i} + s_i^T q\big(t \mid \xi(t_k)\big)
\end{aligned} \tag{16.7}$$

where $q(t \mid \xi(t_k)) = [q_1(t \mid \xi(t_k))\ \ q_2(t \mid \xi(t_k))\ \ \ldots\ \ q_M(t \mid \xi(t_k))]^T$. Define $E_t^c$ as the event that catastrophic failures never occur at any of the $p$ system components by time $t$. Then there is no system catastrophic failure by time $t$ if $E_t^c$ holds.
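Equation 16.4 and Equation 16.7 combine into a direct computation of the failure rate from the degradation state. The following sketch shows that chain for one component; the matrices, offsets, and QR-coefficients are hypothetical values chosen for illustration only.

```python
import numpy as np

def quality_index(xi, Q_list, d):
    """Vector q(t | xi) with entries xi^T Q_j xi + d_j (Equation 16.4)."""
    return np.array([xi @ Q @ xi for Q in Q_list]) + d

def failure_rate(lambda0_i, s_i, xi, Q_list, d):
    """Failure rate lambda_i(t) = lambda_0i + s_i^T q(t | xi) (Equation 16.7)."""
    return lambda0_i + s_i @ quality_index(xi, Q_list, d)

# Hypothetical example with p = 2 process variables and M = 2 quality indexes
Q_list = [np.diag([1.0, 0.0]), np.diag([0.0, 2.0])]   # positive semidefinite Q_j
d = np.array([0.01, 0.02])                            # noise-induced part of q_j
xi = np.array([0.1, 0.2])                             # current degradation state
s_i = np.array([1e-3, 2e-3])                          # nonnegative QR-coefficient
lam = failure_rate(1e-5, s_i, xi, Q_list, d)
print(lam)
```

Because $s_i$ and $Q_j$ are nonnegative and positive semidefinite respectively, the failure rate can never fall below the baseline $\lambda_{0i} + s_i^T d$ reached at the ideal state $\xi = 0$.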
16.3 SYSTEM RELIABILITY EVALUATION
Based on the definition of system reliability of an MMP with the QR-Chain effect, the system reliability at the production mission time, $t_K$, is $R(t_K) = \Pr\{E_{t_K}^q \cap E_{t_K}^c\}$.
16.3.1 CHALLENGES IN SYSTEM RELIABILITY EVALUATION
The complex interactions among various elements of the QR-Chain model lead to the following challenges in system reliability evaluation:
• Dependency between $E_t^q$ and $E_t^c$: Both $E_t^q$ and $E_t^c$ depend on the system component degradation in an MMP. So they are not independent of each other in general: $R(t) = \Pr\{E_t^q \cap E_t^c\} \ne \Pr\{E_t^q\} \cdot \Pr\{E_t^c\}$.
• Dependency among system component degradation: Generally the degradation of a system component depends not only on its current state, but also on that of other system components. So the degradation processes of system components are not independent.
• Dependency among catastrophic failures of system components: The catastrophic failures of system components at each station of an MMP can depend on the incoming product quality. The same quality characteristic of the incoming product can affect multiple system components. Different quality characteristics are generally not independent of each other, because they all depend on the system component degradation processes in the previous stations. Consequently, the catastrophic failures of system components are not independent of each other.
16.3.2 SYSTEM RELIABILITY EVALUATION OF MMPS
Three steps are used to evaluate the system reliability based on the QR-Chain model:
Step 1: Condition on the degradation path of each component, which removes the dependency between $E_t^q$ and $E_t^c$.
Step 2: Uncondition on the degradation paths by taking the expectation of the conditional system reliability calculated in Step 1.
Step 3: Reorganize the integrand in Equation 16.10 obtained in Step 2 into the form of the probability density function (PDF) of a Gaussian random variable in order to efficiently apply Monte Carlo simulation methods.
Details on each step are as follows:
Step 1: A conditional system reliability is evaluated in this step by conditioning on the degradation path $\{\xi(t_k), k = 1, 2, \ldots\}$, or $\xi_K \equiv [\xi^T(t_0)\ \xi^T(t_1)\ \ldots\ \xi^T(t_K)]^T$. Conditioning on $\xi_K$, $E_{t_K}^q$ and $E_{t_K}^c$ are independent. So the conditional system reliability is

$$R(t_K \mid \xi_K) = \Pr\{E_{t_K}^c \mid \xi_K\} \cdot \Pr\{E_{t_K}^q \mid \xi_K\} \tag{16.8}$$

and $\Pr\{E_{t_K}^c \mid \xi_K\}$ and $\Pr\{E_{t_K}^q \mid \xi_K\}$ can be calculated by Result 16.1 and Result 16.2.
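Result 16.1: Let $d \equiv [d_1\ d_2\ \ldots\ d_M]^T$ and

$$c \equiv \sum_{i=1}^{p} \big( \lambda_{0i} + s_i^T d \big).$$

There exists a positive semidefinite matrix $U_K$, such that

$$\Pr\{E_{t_K}^c \mid \xi_K = \varsigma_K\} = \exp\big( -( c \cdot t_K + \varsigma_K^T U_K \varsigma_K ) \big).$$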
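The proof of Result 16.1 is in Appendix 16.2.
Result 16.2: $\Omega$ is a domain in $R^{(K+1) \times p}$ such that

$$\varsigma_K \in \Omega \iff \bigcap_{k=0}^{K} \bigcap_{j=1}^{M} \big\{ \varsigma^T(t_k)\, Q_j\, \varsigma(t_k) \le a_j - d_j \big\}.$$

Let

$$I_K \equiv \begin{cases} 1, & \text{if } \varsigma_K \in \Omega \\ 0, & \text{otherwise.} \end{cases}$$

Then

$$\Pr\{E_{t_K}^q \mid \xi_K = \varsigma_K\} = I_K.$$

This result is obvious from the definition of the system failure due to nonconforming products and Equation 16.4. From Result 16.1, Result 16.2, and Equation 16.8,

$$R(t_K \mid \xi_K = \varsigma_K) = \exp\big( -( c\, t_K + \varsigma_K^T U_K \varsigma_K ) \big) \cdot I_K \tag{16.9}$$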
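Step 2: Unconditioning on $\xi_K$ by taking the expectation of Equation 16.9, we have

$$\begin{aligned}
R(t_K) &= \int_{\varsigma_K \in \Omega} \exp\big( -( c\, t_K + \varsigma_K^T U_K \varsigma_K ) \big)\, \frac{1}{(2\pi)^{n(K+1)/2}\, |\Sigma_K|^{1/2}} \exp\Big( -\tfrac{1}{2} (\varsigma_K - \mu_K)^T \Sigma_K^{-1} (\varsigma_K - \mu_K) \Big)\, d\varsigma_K \\
&= \exp(-c\, t_K) \int_{\varsigma_K \in \Omega} \frac{1}{(2\pi)^{n(K+1)/2}\, |\Sigma_K|^{1/2}} \exp\Big( -\big( \varsigma_K^T U_K \varsigma_K + \tfrac{1}{2} (\varsigma_K - \mu_K)^T \Sigma_K^{-1} (\varsigma_K - \mu_K) \big) \Big)\, d\varsigma_K
\end{aligned} \tag{16.10}$$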
Please refer to Appendix 16.3 for the detailed derivation of Equation 16.10 and the definition of the notation used in Equation 16.10.
Step 3: We will convert the integrand in Equation 16.10 to the form of the PDF of another multivariate Gaussian r.v., $\tilde{\xi}_K$, based on Lemma 16.1.
Lemma 16.1: There exist a positive definite matrix $\tilde{\Sigma}_K$, a vector $\tilde{\mu}_K$, and a scalar $s_K > 0$ such that

$$\varsigma_K^T U_K \varsigma_K + \tfrac{1}{2} (\varsigma_K - \mu_K)^T \Sigma_K^{-1} (\varsigma_K - \mu_K) = \tfrac{1}{2} (\varsigma_K - \tilde{\mu}_K)^T \tilde{\Sigma}_K^{-1} (\varsigma_K - \tilde{\mu}_K) + s_K \tag{16.11}$$

Lemma 16.1 is proved in Appendix 16.4. From Lemma 16.1, the integrand in Equation 16.10 can be converted into the form of a multivariate Gaussian PDF as in Result 16.3.
Result 16.3: Let $\tilde{\xi}_K$ be an $n(K+1)$-dimensional Gaussian random variable (r.v.) whose PDF is

$$f_{\tilde{\xi}_K}(\varsigma_K) = \frac{1}{(2\pi)^{n(K+1)/2}\, |\tilde{\Sigma}_K|^{1/2}} \exp\Big( -\tfrac{1}{2} (\varsigma_K - \tilde{\mu}_K)^T \tilde{\Sigma}_K^{-1} (\varsigma_K - \tilde{\mu}_K) \Big)$$

Then the system reliability can be written as

$$R(t_K) = \exp(-c\, t_K) \cdot \exp(-s_K) \cdot \frac{|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}} \cdot \int_{\varsigma_K \in \Omega} dF_{\tilde{\xi}_K}(\varsigma_K) \tag{16.12}$$
Equation 16.12 can be proved by reorganizing the exponent of the integrand in Equation 16.10 based on Lemma 16.1. The multidimensional integral in Equation 16.12 can be calculated by the following Monte Carlo simulation method: Let $N_s$ Gaussian random vectors be generated with mean $\tilde{\mu}_K$ and covariance $\tilde{\Sigma}_K$, among which $N_0$ fall in the quality constraint region $\Omega$. Then $N_0/N_s$ is an estimate of the integral

$$\int_{\varsigma_K \in \Omega} dF_{\tilde{\xi}_K}(\varsigma_K).$$
Alternative approaches include other multidimensional numerical integration methods such as multidimensional Gaussian quadratures [8]. To be effective, however, these methods usually require a simple region of integration and low integral dimension. Owing to the complexity of the region of integration Ω and typically high dimension of the integral, we suggest using the Monte Carlo simulation method.
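The Monte Carlo procedure for Equation 16.12 can be sketched as follows. The expressions used for the tilted mean, covariance, and constant are obtained here by completing the square in Equation 16.11 (assuming a symmetric $U_K$): $\tilde{\Sigma}_K^{-1} = \Sigma_K^{-1} + 2U_K$, $\tilde{\mu}_K = \tilde{\Sigma}_K \Sigma_K^{-1} \mu_K$, and $s_K = \tfrac{1}{2}(\mu_K^T \Sigma_K^{-1} \mu_K - \tilde{\mu}_K^T \tilde{\Sigma}_K^{-1} \tilde{\mu}_K)$. They are an assumption of this sketch and may differ in form from the expressions in Appendix 16.4. The function names, the membership test `in_omega`, and all numbers are hypothetical.

```python
import numpy as np

def tilted_gaussian(mu, Sigma, U):
    """Mean, covariance, and constant s satisfying Equation 16.11.

    Assumes U is symmetric positive semidefinite and Sigma is positive definite.
    """
    Sigma_inv = np.linalg.inv(Sigma)
    A = Sigma_inv + 2.0 * U                  # inverse of the tilted covariance
    Sigma_t = np.linalg.inv(A)
    mu_t = Sigma_t @ Sigma_inv @ mu
    s = 0.5 * (mu @ Sigma_inv @ mu - mu_t @ A @ mu_t)
    return mu_t, Sigma_t, s

def reliability_mc(c, t_K, mu, Sigma, U, in_omega, n_samples, rng):
    """Monte Carlo estimate of R(t_K) in the form of Equation 16.12."""
    mu_t, Sigma_t, s = tilted_gaussian(mu, Sigma, U)
    samples = rng.multivariate_normal(mu_t, Sigma_t, size=n_samples)
    frac = np.mean([in_omega(x) for x in samples])        # N0 / Ns
    scale = np.exp(-s) * np.sqrt(np.linalg.det(Sigma_t) / np.linalg.det(Sigma))
    return np.exp(-c * t_K) * scale * frac

# Hypothetical two-dimensional example
rng = np.random.default_rng(1)
mu = np.array([0.2, 0.3])
Sigma = np.diag([0.01, 0.02])
U = np.diag([0.5, 0.5])
in_omega = lambda x: x @ x <= 0.5            # illustrative quality region
R_hat = reliability_mc(c=1e-4, t_K=100.0, mu=mu, Sigma=Sigma, U=U,
                       in_omega=in_omega, n_samples=5000, rng=rng)
print(R_hat)
```

Only the fraction $N_0/N_s$ is random in this estimate; the factors $\exp(-c\,t_K)$, $\exp(-s_K)$, and the determinant ratio are computed exactly.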
It is worth noting that the entire procedure we have proposed to evaluate the complex multidimensional integral in Equation 16.10 can be considered as an importance sampling approach [9] to reduce the variance of the Monte Carlo estimation of integrals. We take advantage of the similarity between the shape of the nonnegative integrand function in Equation 16.10 and the shape of a multivariate normal PDF by sampling from the multivariate normal distribution. It is known from the theory of the Monte Carlo method that its efficiency at evaluating integrals can be greatly improved by sampling from a distribution with PDF of a similar shape to the absolute value of the integrand [9,10].
16.3.3 SELF-IMPROVEMENT OF PRODUCT QUALITY AND THE UPPER BOUND OF SYSTEM RELIABILITY
The dimension of the integral in Equation 16.12 is $p(K+1)$, which is generally very large and grows with the production mission time $t_K$. Generating random variables of such a large dimension generally takes considerable computational resources. However, when the product quality does not have self-improvement, the properties of Gaussian random variables can be used to reduce the computation appreciably, as discussed below.
If $q_j(t)$ is decreasing at $t$, then the product quality is self-improved at that time. Usually, the product quality of a manufacturing process does not have self-improvement unless the process is set up inappropriately, so that degradation of the manufacturing-system components may even improve the system performance at certain times. Lemma 16.2 is used to examine under what condition the product quality of an MMP does not have self-improvement.
Lemma 16.2: If all elements of $Q_j$, $j = 1, 2, \ldots, M$, in Equation 16.4 are nonnegative, then the product quality of the MMP does not have self-improvement. In particular, if $Q_j$, $j = 1, 2, \ldots, M$, are diagonal, the product quality of the MMP does not have self-improvement.
If $Q_j$ is a diagonal matrix, then it has only nonnegative elements, because $Q_j$ is positive semidefinite. The proof of Lemma 16.2 is obvious from Equation 16.4 and assumptions A4 and A5. If the product quality of an MMP does not have self-improvement, then the event that $q_j(t)$ is within specification at all times up to $t_K$ is equivalent to the event that it is within specification at time $t_K$. Let

$$I_{t_K} \equiv \begin{cases} 1, & \varsigma(t_K) \in \Omega_K \\ 0, & \text{otherwise} \end{cases}$$

where

$$\varsigma(t_K) \in \Omega_K \iff \bigcap_{j=1}^{M} \big\{ \varsigma^T(t_K)\, Q_j\, \varsigma(t_K) \le a_j - d_j \big\}.$$
If the product quality of an MMP does not have self-improvement, then $I_K = I_{t_K}$. Thus, Equation 16.12 can be rewritten as

$$R(t_K) = \exp(-c\, t_K) \cdot \exp(-s_K) \cdot \frac{|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}} \cdot \int_{\varsigma(t_K) \in \Omega_K} dF_{\tilde{\xi}_K}(\varsigma_K) \tag{16.13}$$
The integral domain of Equation 16.13 depends only on $\varsigma(t_K)$, not on $\varsigma(t_{K-1}), \varsigma(t_{K-2}), \ldots, \varsigma(t_0)$. So the integral can be calculated based on the marginal distribution of $\tilde{\xi}(t_K)$, which consists of the last $p$ elements of $\tilde{\xi}_K$. Because $\tilde{\xi}_K$ is multivariate Gaussian, the marginal distribution of $\tilde{\xi}(t_K)$ can be directly obtained from the distribution of $\tilde{\xi}_K$. Following this, the $p(K+1)$-dimensional integral can be reduced to a $p$-dimensional integral as in Result 16.4.
Result 16.4:

$$R(t_K) = \exp(-c\, t_K) \cdot \exp(-s_K) \cdot \frac{|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}} \cdot \int_{\varsigma(t_K) \in \Omega_K} dF_{\tilde{\xi}(t_K)}\big(\varsigma(t_K)\big) \tag{16.14}$$
The proof of Result 16.4 is in Appendix 16.5. How to obtain the distribution of $\tilde{\xi}(t_K)$ used in Equation 16.14 is discussed in Appendix 16.6. In Result 16.4, the dimension of $\tilde{\xi}(t_K)$ is $p$, which is independent of the mission time index $K$. If the distribution of $\tilde{\xi}(t_K)$ is known, evaluation of Equation 16.14 requires the numerical calculation of an integral of dimension $p$ rather than $p(K+1)$. Thus, Equation 16.14 gives a simplified solution for the system reliability in the case when the product quality has no self-improvement. This result is meaningful, because the product quality of the majority of MMPs does not have self-improvement. The multidimensional integral in Equation 16.14 can be evaluated using the Monte Carlo method in a manner similar to that discussed in Subsection 16.3.2. In general cases when the product quality can have self-improvement, it is easy to see that $I_K = 1 \Rightarrow \xi(t_K) \in \Omega_K$, but the converse ($\xi(t_K) \in \Omega_K \Rightarrow I_K = 1$) is not true in general. So Equation 16.14 provides an upper bound estimate of the system reliability.
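The dimension reduction behind Equation 16.14 relies on a standard property of Gaussian vectors: the marginal distribution of a sub-vector is Gaussian, with the corresponding sub-vector of the mean and sub-block of the covariance. The sketch below illustrates this for the last $p$ elements and then estimates the $p$-dimensional integral by sampling. It assumes the tilted mean and covariance of $\tilde{\xi}_K$ are already available; the function names and numbers are hypothetical and do not reproduce the procedure of Appendix 16.6.

```python
import numpy as np

def last_stage_marginal(mu_t, Sigma_t, p):
    """Marginal mean and covariance of the last p elements of a Gaussian vector."""
    return mu_t[-p:], Sigma_t[-p:, -p:]

def in_omega_K(x, Q_list, a, d):
    """True if x^T Q_j x <= a_j - d_j for every quality index j."""
    return all(x @ Q @ x <= a_j - d_j for Q, a_j, d_j in zip(Q_list, a, d))

def fraction_in_spec(mu_last, Sigma_last, Q_list, a, d, n_samples, rng):
    """Monte Carlo estimate of the p-dimensional integral in Equation 16.14."""
    samples = rng.multivariate_normal(mu_last, Sigma_last, size=n_samples)
    return np.mean([in_omega_K(x, Q_list, a, d) for x in samples])

# Hypothetical example: p = 2 and K = 2, so the stacked vector has 6 elements
mu_t = np.array([0.0, 0.0, 0.05, 0.05, 0.1, 0.1])
Sigma_t = np.diag([1e-4, 1e-4, 2e-4, 2e-4, 3e-4, 3e-4])
mu_last, Sigma_last = last_stage_marginal(mu_t, Sigma_t, p=2)

Q_list = [np.eye(2)]            # one quality index, diagonal Q_j (no self-improvement)
a = np.array([0.05])            # threshold a_j
d = np.array([0.01])            # offset d_j
frac = fraction_in_spec(mu_last, Sigma_last, Q_list, a, d, 2000,
                        np.random.default_rng(2))
print(frac)
```

Sampling in $p$ dimensions instead of $p(K+1)$ keeps the cost of each draw independent of the mission time, which is the practical benefit of Result 16.4.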
16.4 IMPLEMENTATION OF QR-CHAIN MODELING AND ANALYSIS IN BODY-IN-WHITE ASSEMBLY PROCESSES

Body-in-white (BIW) assembly processes were introduced in Chapter 6. In this section, the QR-Chain model developed for general MMPs in this chapter will be applied to BIW assembly processes. We will focus on the reliability of a BIW assembly process with 3-2-1 fixtures and rigid parts.
16.4.1 QR-CHAIN IN MULTISTATION BIW ASSEMBLY PROCESSES
System reliability of the BIW assembly process is one of the key factors affecting productivity and product quality. In general, the system failure of a BIW assembly process includes both system component catastrophic failure and unsatisfactory product quality. System component catastrophic failures, such as a broken or loose locating pin, directly lead to immediate downtime of the automation process. Nonconforming product quality, such as large product variation, is an indication that the process has lost its capability of producing products with the specified quality. In addition, real process data have shown a significant QR-Co-Effect between locating-tool reliability and product quality propagated across multiple stations of a BIW assembly process. For example, previous research indicates that 72% of the root causes of dimensional errors of a BIW are due to locating-tool malfunction [11], which indicates significant effects of locating-tool reliability on dimensional product quality. On the other hand, large dimensional errors of the locating holes on the incoming product may lead to locating-tool catastrophic failures, such as a broken locating pin during the part-loading process, a part stuck at the pins, or a part that cannot be positioned correctly by the locators. These kinds of failures are called locating-tool failures induced by incoming product quality. Therefore, the catastrophic failure rates of locating tools are affected by the dimensional accuracy of the incoming product, which is determined by the propagation of the dimensional product quality from the previous stations. Based on a previous study by Yang et al. [12], locating-tool failures induced by incoming product quality account for about 44% of all locating-tool catastrophic failures. Therefore, the real process data have shown a strong QR-Co-Effect between locating-tool reliability and product quality in a multistation BIW assembly process.
From Chapter 6, we have seen that the product variation is propagated in a multistation assembly process. The variation propagation in product quality will lead to the QR-Chain effect. From Section 16.2, we can see that the process model plays a critical role in analyzing the QR-Chain. In this section, we will construct the process model for a multistation BIW assembly process based on the SoV model.
16.4.2 QR-CHAIN MODEL OF A BIW ASSEMBLY PROCESS
A general modeling procedure focusing on the x-z plane is presented in this chapter for rigid-part assembly, and the 4-way and 2-way locating pins are considered as system components. Suppose there are $p_i$ locating pins at the $i$th station, $i = 1, 2, \ldots, N$. The total number of locating pins in all stations is

\[ p = \sum_{i=1}^{N} p_i. \]
Changes in the pin diameter change the clearance between the locating pin on the fixture and the locating hole on the part, which affects product quality. Thus, the accumulated decrement in the pin diameter due to pin wear-out is considered the process variable. Let $P_{i,j}$ denote the $j$th locating pin at the $i$th station and $\xi_{i,j}(t)$ denote its accumulated diameter decrement at time $t$. Let $y_{i,j}(t)$, $i = 1, \ldots, N$, $j = 1, \ldots, M_i$, denote the $j$th product quality characteristic on the outgoing product of station $i$ at time $t$, where $M_i$ is the number of product quality characteristics on the outgoing product of station $i$. The key elements of a QR-Chain model as illustrated in Section 16.2 will be discussed in this section specifically for BIW assembly processes.

16.4.2.1 Locating-Pin Degradation Model

In an assembly process, pin wear is the result of friction from the sliding movement between the pin and the hole of a part. Therefore, the pin wear is the aggregate of all the wear that occurs during each operation. Archard [13] proposed a wear model based on the physical principle of contacting and rubbing wear:

\[ V = \frac{KFL}{3P} \tag{16.15} \]
where $V$ represents the worn volume, $F$ corresponds to the load force, $L$ is the sliding distance, $P$ is the penetration hardness of the softer material, and $K$ is a random wear factor. Based on the sliding wear theory, the wear factor $K$ closely depends on the contacting surface conditions. This is because the wear takes place at the contact points between asperities on the sliding surface. Wallbridge and Dowson [14] showed that $K$ generally follows a lognormal distribution, that is, $K \sim \mathrm{LOGNOR}(\mu_K, \sigma_K^2)$, or equivalently, $\log(K) \sim N(\mu_K, \sigma_K^2)$. The density function $f(K)$ is:

\[ f(K) = \frac{1}{K\,\sigma_K\sqrt{2\pi}}\,\exp\!\left(-\frac{[\ln(K)-\mu_K]^2}{2\sigma_K^2}\right), \quad K > 0 \tag{16.16} \]
Because of the random behavior of the wear factor $K$, the component wear $V$ given in Equation 16.15 is also considered a lognormally distributed random variable with $V \sim \mathrm{LOGNOR}(\mu_V, \sigma_V^2)$. Therefore, the aggregated wear of the pin diameter increases with the number of operations, which can be described by a stochastic process model with lognormally distributed increments:

\[ \xi_{i,j}(t) = \xi_{i,j}(t-1) + \Delta_{i,j}(t) \]

where $\Delta_{i,j}(t)$ is the random wear increment due to operation $t$, $\xi_{i,j}(0)$ is the initial clearance between locating pin $P_{i,j}$ and its corresponding locating hole, and $t$ is the operation index. It is assumed that

\[ \xi_{i,j}(0) \sim N\big(\mu_{i,j}(0), \sigma_{i,j}^2(0)\big) \quad \text{and} \quad \Delta_{i,j}(t) \sim \mathrm{LOGNOR}\big(\mu_{i,j}(\Delta), \sigma_{i,j}^2(\Delta)\big) \]

where $\mu_{i,j}(\Delta)$ and $\sigma_{i,j}(\Delta)$ are the mean and standard deviation of the lognormal random variable $\Delta_{i,j}(t)$. Let

\[ \mu_0 \equiv [\mu_{1,1}(0)\ \ \mu_{1,2}(0)\ \ \ldots\ \ \mu_{N,p_N}(0)]^T \quad \text{and} \quad \Sigma_0 \equiv \mathrm{diag}\big(\sigma_{1,1}^2(0),\ \sigma_{1,2}^2(0),\ \ldots,\ \sigma_{N,p_N}^2(0)\big). \]
We use one production day as the time interval for a BIW assembly process. Let $h$ denote the number of operations during each production day. Because the time is measured by the number of operations, $t_k$ is the total number of operations until the end of production day $k$. The BIW assembly operations are discrete in nature. The time of sliding wear when the part is positioned on the pin is much shorter than the cycle time of an operation (more time is spent on welding, clamping operations, and part handling and moving). Moreover, the accumulated wear of a locating pin is much smaller than the pin diameter and has little impact on the future wear mechanism. So it is reasonable to assume that the wear amount during an operation is independent of the wear in previous operations. In addition, a BIW assembly process can produce 500–1500 car bodies during each day of production. Therefore, the accumulated wear of such a large number of operations can reasonably be approximated as normally distributed based on the central limit theorem. Thus the following equation can be used to model the pin wear:

\[ \xi(t_k) = \xi(t_{k-1}) + \varepsilon_k, \quad k = 1, 2, \ldots \tag{16.17} \]

where $\varepsilon_k \sim N(\mu_\varepsilon, \Sigma_\varepsilon)$, $\mu_\varepsilon = h \cdot [\mu_{1,1}(\Delta)\ \ldots\ \mu_{1,p_1}(\Delta)\ \ldots\ \mu_{N,1}(\Delta)\ \ldots\ \mu_{N,p_N}(\Delta)]^T$, and $\Sigma_\varepsilon = h \cdot \mathrm{diag}\big(\sigma_{1,1}^2(\Delta)\ \ldots\ \sigma_{1,p_1}^2(\Delta)\ \ldots\ \sigma_{N,1}^2(\Delta)\ \ldots\ \sigma_{N,p_N}^2(\Delta)\big)$. It can be seen that Equation 16.17 is a special case of the general model in Equation 16.3 with $P_k = G_k = I$.

16.4.2.2 Relationship between Process Variables and Deviations of Quality Characteristics

By introducing the station index explicitly, the process model from Equation 16.1 can be written as follows for multistation BIW assembly processes:

\[ y_{i,j}(t) = \eta_{i,j} + \alpha_{i,j}^T \xi(t) + \beta_{i,j}^T z_t + \xi(t)^T \Gamma_{i,j}\, z_t, \quad i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M_i \tag{16.18} \]

where $\xi(t) \equiv [\xi_{1,1}(t)\ \ \xi_{1,2}(t)\ \ \ldots\ \ \xi_{N,p_N}(t)]^T$. We will see later that, for BIW assembly processes, $z_t$ describes the random pin-hole contact orientations. The construction of the process model in Equation 16.18 is based on the state space model for SoV and the study of the pin-hole relationship.
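Returning to the daily wear model of Equation 16.17, the mean and variance of a single pin's accumulated diameter decrement after $k$ production days follow directly from summing independent daily increments. The short sketch below illustrates this under the stated assumptions ($P_k = G_k = I$, independent days); the function name is hypothetical, and the numbers are the case-study values listed later in Table 16.3, with the initial clearance variance set to zero as in the case study.

```python
def wear_moments(mu0, var0, mu_delta, sigma_delta, h, k):
    """Mean and variance of xi(t_k) for one pin under Equation 16.17.

    Each day adds h independent increments with mean mu_delta and
    standard deviation sigma_delta (per operation).
    """
    mean = mu0 + k * h * mu_delta
    var = var0 + k * h * sigma_delta ** 2
    return mean, var

# Case-study values: 0.04 mm initial clearance, 500 operations per day,
# 2e-6 mm mean wear and 5e-5 mm standard deviation per operation.
mean_10, var_10 = wear_moments(mu0=0.04, var0=0.0, mu_delta=2e-6,
                               sigma_delta=5e-5, h=500, k=10)
```

After 10 days the mean decrement in this example is 0.05 mm, that is, the initial 0.04 mm plus 0.01 mm of accumulated wear.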
The locating pins’ wear is reflected in the reduction of the pin diameters, which causes an increasing clearance between a locating pin and the corresponding locating hole. This clearance results in the process faults. The following notations are used for the description of the relationship between pin wear and part-locating errors:

(a) $\Delta x_{P_{i,j}}$ and $\Delta z_{P_{i,j}}$ denote the part-locating errors of pin $P_{i,j}$ in the $x$ and $z$ directions.
(b) $u_i(t) \equiv [\Delta x_{P_{i,1}}\ \ \Delta z_{P_{i,1}}\ \ \ldots\ \ \Delta x_{P_{i,p_i}}\ \ \Delta z_{P_{i,p_i}}]^T$ represents the vector of process faults at station $i$.
(c) $\theta_{i,j}$ represents the orientation of the contacting point between the pin $P_{i,j}$ and the locating hole.

Based on observations from lab and real auto body assembly plants, in most cases the locating pin touches the wall of the locating hole during the assembly operations. So it is reasonable to assume that the locating hole contacts the pin on one side when the part is positioned by a fixture. Because there is still a possibility that the locating pin does not touch the locating hole, this assumption may result in a slight overestimation of the product variation and a conservative prediction of the reliability. From the preceding assumption, the contacting orientations between the locating pin and the locating hole for a 4-way pin and a 2-way pin are shown in Figure 16.5. The process faults in the x-z plane can be represented by the displacement of the locating-hole center from the center of the pin as shown in Figure 16.5. Based on Figure 16.5, the relationship between the process faults ($\Delta x_{P_{i,j}}$, $\Delta z_{P_{i,j}}$) and the pin diameter reduction of a 4-way pin can be obtained as

\[ \Delta x_{P_{i,j}} = 0.5\,\xi_{i,j}\cos\theta_{i,j}; \qquad \Delta z_{P_{i,j}} = 0.5\,\xi_{i,j}\sin\theta_{i,j} \tag{16.19} \]
where $\xi_{i,j}$ is the accumulated pin diameter decrement, which is the process variable corresponding to the locating pin $P_{i,j}$ in the QR-Chain model. Here $\theta_{i,j}$, for all $i$ and $j$, are assumed to be independent random variables following a uniform distribution within $[0, 2\pi]$, which is denoted as $\theta_{i,j} \sim U(0, 2\pi)$. It can also be obtained that $\mathrm{var}(\sin\theta_{i,j}) = \mathrm{var}(\cos\theta_{i,j}) = 0.5$ and $\mathrm{cov}(\sin\theta_{i,j}, \cos\theta_{i,j}) = 0$ for a 4-way locating pin.

FIGURE 16.5 Process faults due to pin wear. [Figure not reproduced: contact geometry in the x-z plane between the locating hole and a 4-way pin $P_{i,j}$ and a 2-way pin $P_{i,j}$, showing the displacements $\Delta x_{P_{i,j}}$, $\Delta z_{P_{i,j}}$ and the contact orientation $\theta_{i,j}$.]

Similarly, ignoring the effect of the wear of a 2-way pin in the $x$ direction, the relationship between the process faults and the wear of the 2-way pin can be obtained as

\[ \Delta z_{P_{i,j}} = 0.5\,\xi_{i,j}\sin\theta_{i,j} \tag{16.20} \]
Because the locating hole contacts the 2-way pin either on the upper or the lower side in the $z$ direction, $\theta_{i,j}$ is a random variable taking the two values $-\pi/2$ and $\pi/2$ with the same probability of 0.5, denoted as $\theta_{i,j} \sim \mathrm{Unif}\{-\pi/2, \pi/2\}$. It can also be shown that $\mathrm{var}(\sin\theta_{i,j}) = 1$ for a 2-way locating pin. Recall the state space model developed in Chapter 6:

\[ x_i = A_{i-1}x_{i-1} + B_i u_i, \qquad y_i = C_i x_i, \qquad i = 1, 2, \ldots, N \tag{16.21} \]
In Equation 16.21, the unmodeled errors and measurement errors are ignored. By recursively substituting $x_i(t), x_{i-1}(t), \ldots, x_1(t)$ in Equation 16.21, the product quality characteristics on the outgoing parts of station $i$ can be calculated based on both the process faults at station $i$ and those of previous stations as follows:

\[ y_i(t) = G^{(i)} u^{(i)}(t) + C_i\,(A_{i-1}A_{i-2}\cdots A_0)\,x_0(t) \tag{16.22} \]
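where matrix $G^{(i)} \equiv C_i\,[(A_{i-1}A_{i-2}\cdots A_1)B_1\ \ \ (A_{i-1}A_{i-2}\cdots A_2)B_2\ \ \ \cdots\ \ \ A_{i-1}B_{i-1}\ \ \ B_i]$ and vector $u^{(i)}(t) \equiv [u_1^T(t)\ \ u_2^T(t)\ \ \ldots\ \ u_i^T(t)]^T$. Let $z'_{t,i} \equiv [\cos\theta_{i,1}\ \ \sin\theta_{i,1}\ \ \ldots\ \ \cos\theta_{i,p_i}\ \ \sin\theta_{i,p_i}]^T$, $z'_t \equiv [(z'_{t,1})^T\ \ \ldots\ \ (z'_{t,N})^T]^T$, and

\[ z_t \equiv [x_0^T(t)\ \ \ (z'_t)^T]^T \tag{16.23} \]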
That is, the raw part error and the random orientation of the contacting point between the pin and the locating hole are considered as noise variables in the process model. From Equation 16.22, Equation 16.19, and Equation 16.20, it can be seen that

\[ y_{i,j}(t) = [\beta^{(1)}_{i,j}\ \ \beta^{(2)}_{i,j}]\,z_t + \xi(t)^T\,[\Gamma^{(1)}_{i,j}\ \ \Gamma^{(2)}_{i,j}]\,z_t, \quad i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M_i \tag{16.24} \]

where $\beta^{(1)}_{i,j} = [C_iA_{i-1}A_{i-2}\cdots A_0]_{(j,:)}$; $\beta^{(2)}_{i,j}$ is a $1 \times 2p$ vector whose elements are all zero; $\Gamma^{(1)}_{i,j}$ is a zero matrix of appropriate dimension; and
\[ \Gamma^{(2)}_{i,j} = \frac{1}{2}
\begin{bmatrix}
G^{(i)}_{(j,1)} & G^{(i)}_{(j,2)} & 0 & 0 & \cdots & 0 & 0 & 0_{(1\times 2(p-\bar{p}_i))} \\
0 & 0 & G^{(i)}_{(j,3)} & G^{(i)}_{(j,4)} & \cdots & 0 & 0 & 0_{(1\times 2(p-\bar{p}_i))} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & G^{(i)}_{(j,2\bar{p}_i-1)} & G^{(i)}_{(j,2\bar{p}_i)} & 0_{(1\times 2(p-\bar{p}_i))} \\
0_{((p-\bar{p}_i)\times 1)} & 0_{((p-\bar{p}_i)\times 1)} & 0_{((p-\bar{p}_i)\times 1)} & 0_{((p-\bar{p}_i)\times 1)} & \cdots & 0_{((p-\bar{p}_i)\times 1)} & 0_{((p-\bar{p}_i)\times 1)} & 0_{((p-\bar{p}_i)\times 2(p-\bar{p}_i))}
\end{bmatrix}_{(p\times 2p)} \]

where

\[ \bar{p}_i \equiv \sum_{k=1}^{i} p_k. \]
Thus, the coefficients of the process model of Equation 16.18 for a BIW assembly process are $\eta_{i,j} = 0$, $\alpha_{i,j} = 0$, $\beta_{i,j} = [\beta^{(1)}_{i,j}\ \ \beta^{(2)}_{i,j}]^T$, and $\Gamma_{i,j} = [\Gamma^{(1)}_{i,j}\ \ \Gamma^{(2)}_{i,j}]$.

16.4.2.3 Product Quality Assessment and Pin Catastrophic Failure

Following Subsection 16.2.3 and Appendix 16.1 with station indices explicitly shown, we have

\[ q_{i,j}\big(t \mid \xi(t_k)\big) = \xi(t_k)^T Q_{i,j}\,\xi(t_k) + d_{i,j}, \quad i = 1, \ldots, N;\ j = 1, \ldots, M_i \tag{16.25} \]
where $Q_{i,j} = \Gamma_{i,j}\,\mathrm{cov}(z_t)\,\Gamma_{i,j}^T$ and $d_{i,j} = \beta_{i,j}^T\,\mathrm{cov}(z_t)\,\beta_{i,j}$. Because $\beta_{i,j}$ is only related to the raw part errors, $d_{i,j}$ can be interpreted as the contribution of the raw part errors to the product quality at each station. For multistation BIW assembly processes,

\[ E_t^q \equiv \bigcap_{i=1}^{N}\bigcap_{j=1}^{M_i}\big(q_{i,j}(\tau) \le a_{i,j},\ \forall\, 0 \le \tau \le t\big), \]
where $a_{i,j}$ is the threshold of the specification for the $j$th product quality characteristic at the $i$th station. Following Subsection 16.2.4 with the station index explicitly shown, we have

\[ \lambda_{i,j}(t) = \lambda_{i,j}(0) + s_{i,j}^T\,q\big(t \mid \xi(t_k)\big) \]

where $q\big(t \mid \xi(t_k)\big) = \big[q_{1,1}(t \mid \xi(t_k))\ \ q_{1,2}(t \mid \xi(t_k))\ \ \ldots\ \ q_{N,M_N}(t \mid \xi(t_k))\big]^T$.
16.4.2.4 System Reliability Evaluation

From the definition of $\Gamma_{i,j}$, $z_t$, and the distribution of $\theta_{i,j}$ discussed in Subsection 16.4.2, it can be seen that $B_{i,j} = \Gamma_{i,j}\,\mathrm{cov}(z_t)\,\Gamma_{i,j}^T$ is diagonal. Based on Lemma 16.2, the product quality of such a process does not have self-improvement. Therefore, the system reliability of a BIW assembly process can be evaluated using Result 16.4.
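The diagonal structure claimed above can be checked numerically. The sketch below, offered only as an illustration, builds the nonzero block $\Gamma^{(2)}_{i,j}$ from one row of $G^{(i)}$, uses the contact-orientation variances stated for 4-way and 2-way pins, and forms $\Gamma\,\mathrm{cov}(z'_t)\,\Gamma^T$. The helper names, the pin types, and the sensitivity values in `g_row` are hypothetical; raw part errors are left out because the corresponding block $\Gamma^{(1)}_{i,j}$ is zero.

```python
import numpy as np

def orientation_cov(pin_types):
    """cov(z'_t) for independent pins: entries are var(cos), var(sin) per pin."""
    diag = []
    for kind in pin_types:
        # 4-way: theta ~ U(0, 2*pi); 2-way: theta in {-pi/2, pi/2}, so cos(theta) = 0.
        diag += [0.5, 0.5] if kind == "4-way" else [0.0, 1.0]
    return np.diag(diag)

def gamma2(g_row, p_total):
    """Gamma^(2): row k carries 0.5 * G entries for [dx, dz] of pin k."""
    n_upstream = len(g_row) // 2          # pins at stations 1..i
    gam = np.zeros((p_total, 2 * p_total))
    for k in range(n_upstream):
        gam[k, 2 * k] = 0.5 * g_row[2 * k]          # multiplies cos(theta_k)
        gam[k, 2 * k + 1] = 0.5 * g_row[2 * k + 1]  # multiplies sin(theta_k)
    return gam

# Hypothetical example: three pins in total, the first two upstream of station i.
pin_types = ["4-way", "2-way", "4-way"]
g_row = [1.0, 0.5, 0.2, 2.0]              # made-up sensitivities
Gam = gamma2(g_row, p_total=len(pin_types))
Q = Gam @ orientation_cov(pin_types) @ Gam.T
```

Each pin only multiplies its own pair of orientation terms, and those terms are uncorrelated across pins, which is why every off-diagonal entry of `Q` vanishes.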
16.4.3 CASE STUDY

A case study is presented to illustrate the developed methodology. The side frame inner panel assembly process studied in Chapter 6, as shown in Figure 16.6, is used in the study. Table 16.1 and Table 16.2 give all the dimensions of the tooling positions and the KPC points. The design parameters used for this example are shown in Table 16.3. The raw part dimensional errors and the variation of the initial pin/hole clearance are very small and are ignored in this case study. The system reliability analysis is conducted under the following three different definitions of system failures:

1. Only consider the probability of component catastrophic failures. That is, the pin degradation and the impact of the incoming product quality on the pin catastrophic failure are not considered in this model. It is equivalent to the case of setting the QR coefficient s and the pin degradation rate µ(∆) to zero in the QR-Chain model.
2. Consider both the pin catastrophic failure and the product quality deterioration due to component wear-out, but without considering the impact of the incoming product quality on the catastrophic failure rate of the locating pins. It is equivalent to the case of setting the QR coefficient s in the QR-Chain model to zero.
3. Use the integrated QR-Chain model.

FIGURE 16.6 Layout of the tooling positions and product quality characteristics: (a) locating pin positions; (b) product quality characteristics. [Figure not reproduced.]

TABLE 16.1 Nominal x-z Coordinates for Locating Points

Locating point      x (mm)     z (mm)
P1,1 and P3,1        367.7      906.1
P1,2 and P2,2        667.5     1295.4
P1,3 and P2,1       1272.7      537.4
P1,4                1301.0     1368.9
P2,3                1470.7     1640.4
P2,4 and P3,2       1770.5     1702.6
P3,3                2120.3     1402.8
P3,4                3026.3      950.3

TABLE 16.2 Nominal x-z Coordinates for Product Quality Characteristics

KPC point      x (mm)     z (mm)
M3,1            271.5      905.0
M3,2            565        1634
M3,3            1289       1227
M3,4            1306       633
M3,5            1244       85
M3,6            1640       1781
M3,7            2884       1951
M3,8            2743       475
M3,9            1838       226
M3,10           1980       1459

TABLE 16.3 Summary of the Parameters Used in the Study

Description of the parameter             Value
Initial failure rate λi,j(0)             λi,j(0) = λ(0) = 4 × 10^–7, ∀i, j
QR coefficient si,j                      [si,j]r = s = 0.001, ∀i, j, when the rth element corresponds to the quality characteristics of the locating hole located by the jth pin at the ith station
Initial pin/hole clearance µi,j(0)       µi,j(0) = µ0 = 0.04 mm, ∀i, j
Operations per time interval h           h = 500 operations
Unit degradation rate µi,j(∆)            µi,j(∆) = µ(∆) = 2 × 10^–6 mm/operation, ∀i, j
Threshold for quality index ai,j         6ai,j = 6a = 0.08 mm, ∀i, j
Unit degradation s.t.d. σi,j(∆)          σi,j(∆) = σ(∆) = 5 × 10^–5 mm/operation, ∀i, j

The system reliability results are shown in Figure 16.7. The Matlab code for the numerical evaluation of the system reliability was run on an IBM PC Pentium III machine. On average, it takes about 10 sec to evaluate the system reliability at a specific time based on the QR-Chain model. From the comparison study, it can be seen that the system reliabilities under definitions 1 and 2 are always higher than that under definition 3. As a result, overlooking the impact of incoming product quality as in 1 and 2 may lead to significant overestimation of the overall system reliability. If a scheduled maintenance policy is planned based on the predicted system reliability using definition 1 or 2, unexpected downtimes could be experienced because of overlooking the interdependency between product quality and reliability of locating pins.
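For failure definition 1, a rough hand calculation is possible because no quality term enters the failure rate. The sketch below assumes that the pins fail independently in series with the constant rate λ(0) per operation, so that the reliability reduces to an exponential in the total number of operations. The count of 12 pins is read off the pin labels in Table 16.1 (four pins at each of three stations), and the function name is hypothetical; this is an illustrative calculation, not a reproduction of the curves in Figure 16.7.

```python
import math

def reliability_catastrophic_only(lambda0, n_pins, h, day):
    """exp(-c * t_K) with c = n_pins * lambda0 and t_K = h * day operations."""
    return math.exp(-n_pins * lambda0 * h * day)

# Table 16.3 values: lambda(0) = 4e-7 per operation, h = 500 operations per day.
r80 = reliability_catastrophic_only(lambda0=4e-7, n_pins=12, h=500, day=80)
```

Under these assumptions the exponent after 80 days is 12 × 4 × 10^–7 × 40,000 = 0.192, which gives a reliability of about 0.825.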
FIGURE 16.7 System reliability with and without considering the QR-Chain effect. [Plot not reproduced: reliability (0 to 1) versus operating days (0 to 80) for three cases: (a) s = 0, µ(∆) = 0; (b) s = 0, µ(∆) ≠ 0; (c) s ≠ 0, µ(∆) ≠ 0.]
16.5 EXERCISES

1. What is the relationship between the QR-Co-Effect and the QR-Chain effect?
2. What events are considered as system failures under the QR-Co-Effect framework?
3. Explain why catastrophic failures of manufacturing components are statistically dependent in the presence of the QR-Chain effect.
4. Explain why system catastrophic failure and system failure due to nonconforming products are statistically dependent in the presence of the QR-Chain effect.
5. Derive µε and Σε in Equation 16.17 for the side frame inner panel assembly process studied in Subsection 16.4.3.
6. Derive the process model as in Equation 16.24 for the side frame inner panel assembly process studied in Subsection 16.4.3.
7. Derive Equation 16.25, in particular, Qi,j and di,j, for the side frame inner panel assembly process studied in Subsection 16.4.3.
8. Show in detail that Bi,j is a diagonal matrix for BIW assembly processes under the assumptions in this chapter.
16.6 APPENDIX 16.1: DERIVATION OF EQUATION 16.4

Based on the degradation model assumed in Subsection 16.2.2, Equation 16.1 can be rewritten as

\[ y_j(t) = \eta_j + \alpha_j^T\xi(t_k) + \beta_j^T z_t + \xi(t_k)^T\Gamma_j z_t, \quad t_k \le t < t_{k+1},\ j = 1, 2, \ldots, M \tag{16.26} \]
Under a given component degradation state $\xi(t_k)$, $y_j(t)$ is still a random variable due to the randomness of the noise variable $z_t$. The product quality, as a system performance index, is defined by the mean and variance of $y_j(t)$ under a given component state $\xi(t_k)$. Because of the term $\alpha_j^T\xi(t_k)$ in Equation 16.26, the change of $\xi(t_k)$ over time leads to a mean shift of $y_j(t)$. Because of the term $\xi(t_k)^T\Gamma_j z_t$ in Equation 16.26, the change of $\xi(t_k)$ can also lead to a change in the variability of $y_j(t)$. Because variability is a very important quality index, it is important to include the interaction between the process variable and the noise variable in our model. Quality is generally defined as the closeness of a quality characteristic to the target. To capture both the mean shift and the variability information, product quality can be assessed as in assumption A6. Following this concept, under a given component degradation state $\xi(t_k)$, $q_j(t)$ can be defined as

\[ q_j\big(t \mid \xi(t_k)\big) \equiv E\big((y_j(t)-\gamma_j)^2 \mid \xi(t_k)\big) = \mathrm{Var}\big(y_j(t) \mid \xi(t_k)\big) + E^2\big((y_j(t)-\gamma_j) \mid \xi(t_k)\big), \quad t_k \le t < t_{k+1} \tag{16.27} \]
where $\gamma_j$ is defined as the target value for the $j$th product quality characteristic. Based on assumption A4, the mean of the product quality characteristic attains the target value when $\xi(t_k) = 0$. Thus from Equation 16.26,

\[ \gamma_j = E\big(y_j(t) \mid \xi(t_k) = 0\big) = \eta_j + \beta_j^T E(z_t) \tag{16.28} \]

\[ \mathrm{Var}\big(y_j(t) \mid \xi(t_k)\big) = \beta_j^T\,\mathrm{cov}(z_t)\,\beta_j + \xi(t_k)^T\Gamma_j\,\mathrm{cov}(z_t)\,\Gamma_j^T\xi(t_k) \tag{16.29} \]

From Equation 16.26 and Equation 16.28,

\[ E^2\big((y_j(t)-\gamma_j) \mid \xi(t_k)\big) = \big(\alpha_j^T\xi(t_k) + \xi(t_k)^T\Gamma_j E(z_t)\big)^2 = \xi(t_k)^T\big(\alpha_j + \Gamma_j E(z_t)\big)\big(\alpha_j + \Gamma_j E(z_t)\big)^T\xi(t_k) \tag{16.30} \]
From Equation 16.27, Equation 16.29, and Equation 16.30, $q_j(t)$ is a quadratic function of $\xi(t_k)$; thus

\[ q_j\big(t \mid \xi(t_k)\big) = \xi(t_k)^T Q_j\,\xi(t_k) + d_j, \quad j = 1, 2, \ldots, M,\ t_k \le t < t_{k+1} \tag{16.31} \]

where $Q_j \equiv \Gamma_j\,\mathrm{cov}(z_t)\,\Gamma_j^T + \big(\alpha_j + \Gamma_j E(z_t)\big)\big(\alpha_j + \Gamma_j E(z_t)\big)^T$ and $d_j \equiv \beta_j^T\,\mathrm{cov}(z_t)\,\beta_j$. $Q_j$ in Equation 16.31 is positive semidefinite, and $d_j \ge 0$.
16.7 APPENDIX 16.2: PROOF OF RESULT 16.1

Conditioning on $\xi_K$, the catastrophic failures of each component are independent. Furthermore, all components are connected in series. Thus,

\[ \Pr\{E^c_{t_K} \mid \xi_K\} = \exp\!\left(-\sum_{i=1}^{p}\sum_{k=0}^{K-1}\lambda_i(t_k)\cdot h\right) \tag{16.32} \]
From Equation 16.4 and Equation 16.7,

\[ \lambda_i(t_k) = \lambda_{0i} + s_i^T q(t_k) = \lambda_{0i} + s_i^T d + \xi(t_k)^T\Big(\sum_{j=1}^{M} s_{ij}\,Q_j\Big)\xi(t_k), \quad i = 1, 2, \ldots, p,\ k = 0, 1, \ldots, K \]

where $s_{ij}$ denotes the $j$th element of $s_i$. So,

\[ \sum_{i=1}^{p}\sum_{k=0}^{K-1}\lambda_i(t_k)\,h = \sum_{i=1}^{p}\sum_{k=0}^{K-1}\big(\lambda_{0i} + s_i^T d\big)h + \sum_{i=1}^{p}\sum_{k=0}^{K-1}\xi(t_k)^T\Big(\sum_{j=1}^{M} s_{ij}\,Q_j\,h\Big)\xi(t_k) \]

Let

\[ U' \equiv h\cdot\sum_{i=1}^{p}\sum_{j=1}^{M} s_{ij}\,Q_j. \]

Because the elements of $s_i$ are nonnegative and $Q_j$, $j = 1, 2, \ldots, M$, are positive semidefinite, it can be seen that $U'$ is positive semidefinite. Also

\[ \sum_{i=1}^{p}\sum_{k=0}^{K-1}\big(\lambda_{0i} + s_i^T d\big)h = cKh = c\,t_K. \]

Thus

\[ \sum_{i=1}^{p}\sum_{k=0}^{K-1}\lambda_i(t_k)\,h = c\,t_K + \sum_{k=0}^{K-1}\xi(t_k)^T U'\,\xi(t_k) \tag{16.33} \]
Finally, let

\[ U_K \equiv \begin{bmatrix} I_{(K\times K)}\otimes U' & 0_{(pK\times p)} \\ 0_{(p\times pK)} & 0_{(p\times p)} \end{bmatrix}, \]

where $I_{(K\times K)}$ is the $K \times K$ identity matrix. $U_K$ is positive semidefinite. The $p \times p$ zero matrix in $U_K$ is for $\xi(t_K)$, which does not contribute to the catastrophic failure of components. From Equation 16.32 and Equation 16.33,

\[ \Pr\{E^c_{t_K} \mid \xi_K\} = \exp\!\big(-(c\,t_K + \xi_K^T U_K\,\xi_K)\big) \]
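The block structure of $U_K$ can be illustrated with a few lines of code. The sketch below assembles $U_K$ from a given $U'$ with a Kronecker product and checks that the stacked quadratic form equals the sum over $k = 0, \ldots, K-1$ of the per-day quadratic forms in Equation 16.33. The function name and the example matrix are assumptions made only for illustration.

```python
import numpy as np

def build_U_K(U_prime, K):
    """Place I_(KxK) kron U' in the top-left block; the last p rows/columns stay zero."""
    p = U_prime.shape[0]
    U_K = np.zeros((p * (K + 1), p * (K + 1)))
    U_K[:p * K, :p * K] = np.kron(np.eye(K), U_prime)
    return U_K

U_prime = np.array([[2.0, 0.5], [0.5, 1.0]])   # made-up positive definite example
K = 3
U_K = build_U_K(U_prime, K)

rng = np.random.default_rng(0)
xi = rng.standard_normal((K + 1, 2))           # rows are xi(t_0), ..., xi(t_K)
stacked = xi.reshape(-1)                       # xi_K = [xi(t_0); ...; xi(t_K)]
quad_stacked = stacked @ U_K @ stacked
quad_direct = sum(xi[k] @ U_prime @ xi[k] for k in range(K))
```

The final state $\xi(t_K)$ is multiplied only by zero blocks, which mirrors the remark that it does not contribute to the catastrophic failure term.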
16.8 APPENDIX 16.3: DERIVATION OF EQUATION 16.10

Removing the conditioning on $\xi_K$ by taking the expectation of Equation 16.9 gives

\[ R(t_K) = E\big(R(t_K \mid \xi_K = \varsigma_K)\big) = \int R(t_K \mid \xi_K = \varsigma_K)\,dF_{\xi_K}(\varsigma_K) \]

The $\int$ in this equation denotes a multidimensional integral. Further, from Equation 16.9,

\[ R(t_K) = \int \exp\!\big(-(c\,t_K + \varsigma_K^T U_K\,\varsigma_K)\big)\,I_K\,dF_{\xi_K}(\varsigma_K) = \int_{\varsigma_K\in\Omega} \exp\!\big(-(c\,t_K + \varsigma_K^T U_K\,\varsigma_K)\big)\,dF_{\xi_K}(\varsigma_K) \tag{16.34} \]
From Equation 16.3, $\xi(t_k)$ is Gaussian. Let $\xi(t_k) \sim N(\mu(k), \Sigma(k))$; then $\mu(k)$ and $\Sigma(k)$ can be obtained recursively as

\[ \mu(k) = P_{k-1}\,\mu(k-1) + G_{k-1}\,\mu_\varepsilon, \quad k = 1, 2, \ldots, K, \quad \mu(0) = \mu_0 \]

\[ \Sigma(k) = P_{k-1}\,\Sigma(k-1)\,P_{k-1}^T + G_{k-1}\,\Sigma_\varepsilon\,G_{k-1}^T, \quad k = 1, 2, \ldots, K, \quad \Sigma(0) = \Sigma_0 \]

There exist constant matrices $H(i)$, $i \ge 1$, so that

\[ \xi(t_{k+i}) = (P_{k+i-1}\cdots P_k)\,\xi(t_k) + H(i)\,\varepsilon_{k,i}, \quad k = 1, 2, \ldots, K,\ i = 1, 2, \ldots \tag{16.35} \]

where $\varepsilon_{k,i} = [\varepsilon_k^T\ \ \varepsilon_{k+1}^T\ \ \ldots\ \ \varepsilon_{k+i-1}^T]^T$. From Equation 16.35,

\[ \Sigma(k+i, k) \equiv \mathrm{cov}\big(\xi(t_{k+i}), \xi(t_k)\big) = (P_{k+i-1}\cdots P_k)\,\Sigma(k) \]

and $\Sigma(k, k+i) = \Sigma^T(k+i, k)$. Let $\mu_K \equiv [\mu^T(0)\ \ \mu^T(1)\ \ \ldots\ \ \mu^T(K)]^T$ and

\[ \Sigma_K \equiv \begin{bmatrix} \Sigma(0) & \Sigma(0,1) & \cdots & \Sigma(0,K) \\ \Sigma(1,0) & \Sigma(1) & \cdots & \Sigma(1,K) \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma(K,0) & \Sigma(K,1) & \cdots & \Sigma(K) \end{bmatrix} \]

Because $\xi(t_0), \xi(t_1), \ldots, \xi(t_K)$ are jointly normally distributed,

\[ \xi_K \sim N(\mu_K, \Sigma_K) \tag{16.36} \]
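For the BIW wear model, where $P_k = G_k = I$, the recursion above simplifies considerably, and the stacked moments in Equation 16.36 can be assembled directly. The following sketch is an illustration under that assumption only; the function name and the one-pin numbers are hypothetical. With identity transition matrices, the cross-covariance between two days equals the covariance at the earlier day.

```python
import numpy as np

def joint_moments(mu0, Sigma0, mu_eps, Sigma_eps, K):
    """Return (mu_K, Sigma_K) for xi_K = [xi(t_0); ...; xi(t_K)], assuming P_k = G_k = I."""
    mu0 = np.asarray(mu0, dtype=float)
    Sigma0 = np.asarray(Sigma0, dtype=float)
    mu_eps = np.asarray(mu_eps, dtype=float)
    Sigma_eps = np.asarray(Sigma_eps, dtype=float)
    p = mu0.size
    mus, Sigmas = [mu0], [Sigma0]
    for _ in range(K):
        mus.append(mus[-1] + mu_eps)           # mu(k) = mu(k-1) + mu_eps
        Sigmas.append(Sigmas[-1] + Sigma_eps)  # Sigma(k) = Sigma(k-1) + Sigma_eps
    mu_K = np.concatenate(mus)
    Sigma_K = np.zeros((p * (K + 1), p * (K + 1)))
    for a in range(K + 1):
        for b in range(K + 1):
            # With P_k = I, cov(xi(t_a), xi(t_b)) = Sigma(min(a, b)).
            Sigma_K[a * p:(a + 1) * p, b * p:(b + 1) * p] = Sigmas[min(a, b)]
    return mu_K, Sigma_K

# Hypothetical single-pin example over K = 2 days.
mu_K, Sigma_K = joint_moments(mu0=[0.04], Sigma0=[[1e-6]],
                              mu_eps=[1e-3], Sigma_eps=[[1.25e-6]], K=2)
```

For general $P_k$ and $G_k$ the same loop structure applies, with the products $P_{k+i-1}\cdots P_k$ inserted in the off-diagonal blocks.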
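Substituting $F_{\xi_K}(\varsigma_K)$ in Equation 16.34 based on Equation 16.36 gives

\[ \begin{aligned} R(t_K) &= \int_{\varsigma_K\in\Omega} \exp\!\big(-(c\,t_K + \varsigma_K^T U_K\,\varsigma_K)\big)\,\frac{1}{(2\pi)^{p(K+1)/2}\,|\Sigma_K|^{1/2}}\,\exp\!\Big(-\tfrac{1}{2}(\varsigma_K-\mu_K)^T\Sigma_K^{-1}(\varsigma_K-\mu_K)\Big)\,d\varsigma_K \\ &= \exp(-c\,t_K)\int_{\varsigma_K\in\Omega} \frac{1}{(2\pi)^{p(K+1)/2}\,|\Sigma_K|^{1/2}}\,\exp\!\Big(-\Big[\varsigma_K^T U_K\,\varsigma_K + \tfrac{1}{2}(\varsigma_K-\mu_K)^T\Sigma_K^{-1}(\varsigma_K-\mu_K)\Big]\Big)\,d\varsigma_K \end{aligned} \tag{16.37} \]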
16.9 APPENDIX 16.4: PROOF OF LEMMA 16.1

$\Sigma_K^{-1}$ is positive definite and $U_K$ is positive semidefinite (from Result 16.1) $\Rightarrow$ $\tilde{\Sigma}_K^{-1} \equiv 2U_K + \Sigma_K^{-1}$ is positive definite. Let

\[ \tilde{\mu}_K \equiv \big(U_K + \Sigma_K^{-1}/2\big)^{-1}\big(\Sigma_K^{-1}/2\big)\mu_K, \qquad \tilde{\Sigma}_K^{-1} \equiv 2U_K + \Sigma_K^{-1} \]

\[ \begin{aligned} s_K &\equiv \mu_K^T\big(\Sigma_K^{-1}/2\big)\mu_K - \mu_K^T\big(\Sigma_K^{-1}/2\big)^T\big(U_K + \Sigma_K^{-1}/2\big)^{-T}\big(\Sigma_K^{-1}/2\big)\mu_K \\ &= \mu_K^T U_K^T\big(U_K + \Sigma_K^{-1}/2\big)^{-T}\big(\Sigma_K^{-1}/2\big)\mu_K > 0 \end{aligned} \]

Substituting $\tilde{\mu}_K$, $\tilde{\Sigma}_K^{-1}$, and $s_K$, we have

\[ \tfrac{1}{2}(\varsigma_K-\tilde{\mu}_K)^T\tilde{\Sigma}_K^{-1}(\varsigma_K-\tilde{\mu}_K) + s_K = \varsigma_K^T U_K\,\varsigma_K + \tfrac{1}{2}(\varsigma_K-\mu_K)^T\Sigma_K^{-1}(\varsigma_K-\mu_K) \]
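The completing-the-square identity of Lemma 16.1 is easy to check numerically. The sketch below draws a random positive definite covariance and a random positive semidefinite matrix standing in for $U_K$, forms the tilted mean, the tilted inverse covariance, and $s_K$ as defined above, and compares both sides of the identity at a random point. All matrices here are randomly generated stand-ins, not quantities from the case study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)        # positive definite covariance
B = rng.standard_normal((n, 2))
U = B @ B.T                            # positive semidefinite, rank 2
mu = rng.standard_normal(n)

Sigma_inv = np.linalg.inv(Sigma)
half = Sigma_inv / 2.0
Sigma_tilde_inv = 2.0 * U + Sigma_inv                # tilted inverse covariance
mu_tilde = np.linalg.solve(U + half, half @ mu)      # tilted mean
s_K = mu @ U @ np.linalg.solve(U + half, half @ mu)  # constant left after completing the square

x = rng.standard_normal(n)                           # arbitrary evaluation point
lhs = 0.5 * (x - mu_tilde) @ Sigma_tilde_inv @ (x - mu_tilde) + s_K
rhs = x @ U @ x + 0.5 * (x - mu) @ Sigma_inv @ (x - mu)
```

Because both sides are quadratic in the evaluation point with identical coefficients, agreement at arbitrary random points is a reasonable sanity check of the definitions.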
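16.10 APPENDIX 16.5: PROOF OF RESULT 16.4

From Equation 16.13,

\[ \begin{aligned} R(t_K) &= \exp(-c\,t_K)\,\frac{\exp(-s_K)\,|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}} \int_{\varsigma(t_K)\in\Omega_K} dF_{\tilde{\xi}_K}(\varsigma_K) \\ &= \exp(-c\,t_K)\,\frac{\exp(-s_K)\,|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}} \int \Pr\{\tilde{\xi}(t_K)\in\Omega_K \mid \tilde{\xi}_K = \varsigma_K\}\,dF_{\tilde{\xi}_K}(\varsigma_K) \\ &= \exp(-c\,t_K)\,\frac{\exp(-s_K)\,|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}}\,E_{\tilde{\xi}_K}\big(\Pr\{\tilde{\xi}(t_K)\in\Omega_K \mid \tilde{\xi}_K\}\big) \\ &= \exp(-c\,t_K)\,\frac{\exp(-s_K)\,|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}}\,\Pr\{\tilde{\xi}(t_K)\in\Omega_K\} \\ &= \exp(-c\,t_K)\,\frac{\exp(-s_K)\,|\tilde{\Sigma}_K|^{1/2}}{|\Sigma_K|^{1/2}} \int_{\varsigma(t_K)\in\Omega_K} dF_{\tilde{\xi}(t_K)}\big(\varsigma(t_K)\big) \end{aligned} \]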
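16.11 APPENDIX 16.6: DISTRIBUTION OF RANDOM VARIABLE $\tilde{\xi}(t_K)$

Based on the property of a multivariate normally distributed random variable, $\tilde{\xi}(t_K)$ is also normally distributed with $\tilde{\xi}(t_K) \sim N(\tilde{\mu}(K), \tilde{\Sigma}(K))$, where $\tilde{\mu}(K) \equiv E(\tilde{\xi}(t_K))$ and $\tilde{\Sigma}(K) \equiv \mathrm{cov}(\tilde{\xi}(t_K))$. Partition $\tilde{\Sigma}_K$ and $\tilde{\mu}_K$ as:

\[ \tilde{\Sigma}_K = \begin{bmatrix} \tilde{\Sigma}_{11\,(pK\times pK)} & \tilde{\Sigma}_{12\,(pK\times p)} \\ \tilde{\Sigma}_{21\,(p\times pK)} & \tilde{\Sigma}_{22\,(p\times p)} \end{bmatrix}, \qquad \tilde{\mu}_K = \begin{bmatrix} \tilde{\mu}_{1\,(pK\times 1)} \\ \tilde{\mu}_{2\,(p\times 1)} \end{bmatrix} \]

It is easy to see that $\tilde{\mu}(K) = \tilde{\mu}_2$ and $\tilde{\Sigma}(K) = \tilde{\Sigma}_{22}$. So,

\[ dF_{\tilde{\xi}(t_K)}\big(\varsigma(t_K)\big) = \frac{1}{(2\pi)^{p/2}\,|\tilde{\Sigma}(K)|^{1/2}}\,\exp\!\Big(-\tfrac{1}{2}\big(\varsigma(t_K)-\tilde{\mu}(K)\big)^T\tilde{\Sigma}(K)^{-1}\big(\varsigma(t_K)-\tilde{\mu}(K)\big)\Big)\,d\varsigma(t_K). \]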
REFERENCES

1. Huang, Q., Zhou, N., and Shi, J., Stream of variation modeling and diagnosis of multi-station machining processes, Proceedings of IMECE 2000 International Mechanical Engineering Congress and Exposition, Nov. 5–10, 2000, Orlando, Florida, pp. 81–88.
2. Jin, J. and Shi, J., Tonnage Signal Decomposition for Transfer/Progressive Die Processes Monitoring and Control, NIST-ATP, 1997.
3. Taguchi, G., Introduction to Quality Engineering, American Supplier Institute, Dearborn, MI, 1986.
4. Wu, C.F.J. and Hamada, M., Experiments: Planning, Analysis, and Parameter Design Optimization, John Wiley & Sons, New York, 2000.
5. Lemoine, A.J. and Wenocur, M.L., On failure modeling, Naval Research Logistics Quarterly, 32, 479–508, 1985.
6. Singpurwalla, N.D., Survival in dynamic environments, Statistical Science, 10(1), 86–103, 1995.
7. Montgomery, D.C., Introduction to Statistical Quality Control, John Wiley & Sons, New York, 2001.
8. Stroud, A.H., Approximate Calculation of Multiple Integrals, Prentice Hall, Englewood Cliffs, NJ, 1971.
9. Rubinstein, R.Y., Simulation and the Monte Carlo Method, John Wiley & Sons, New York, 1981.
10. Kalos, M.H. and Whitlock, P.A., Monte Carlo Methods, John Wiley & Sons, New York, 1986.
11. Ceglarek, D. and Shi, J., Dimensional variation reduction for automotive body assembly, Journal of Manufacturing Review, 8, 139–154, 1995.
12. Yang, S., Chen, Y., Shi, J., and Hu, S.J., Modeling of Assembly Lines and Fixtures for Variation and Reliability, General Motors Satellite Research Laboratory, 2000.
13. Archard, J.F., Contact and rubbing of flat surfaces, Journal of Applied Physics, 24, 981–988, 1953.
14. Wallbridge, N.C. and Dowson, D., Distribution of wear rate data and a statistical approach to sliding wear theory, Wear, 119(3), 295–312, 1987.
15. Chen, Y. and Jin, J., Quality-reliability chain modeling for system-reliability analysis of complex manufacturing processes, IEEE Transactions on Reliability, Vol. 54, pp. 475–488, 2005.
16. Chen, Y., Jin, J., and Shi, J., Integration of dimensional quality and locator reliability in design and evaluation of multistation body-in-white assembly processes, IIE Transactions, Vol. 39, pp. 827–839, 2004.
17 Quality-Oriented Maintenance for Multiple Interactive System Components*
In Chapter 15, a maintenance model was proposed to minimize the quality loss and maintenance costs. Maintenance of a manufacturing process with consideration of product quality loss is called quality-oriented maintenance. There are two major limitations in the maintenance model developed in Chapter 15: (1) Chapter 15 focuses on BIW assembly processes instead of a general MMP with a more general process model as given in Equation 16.1; (2) catastrophic failures are not captured in the model of Chapter 15. In this chapter, we will develop a more general quality-oriented maintenance model based on the general process model given in Equation 16.1, with catastrophic failures of manufacturing-system components considered. Besides age replacement, other commonly used multicomponent maintenance strategies will also be investigated in this chapter.
17.1 QUALITY-ORIENTED MAINTENANCE MODEL

The quality-oriented maintenance model consists of three major elements: (1) degradation of manufacturing-system components and their catastrophic failures, (2) effects of process variables on product quality, which are described by the process model, and (3) maintenance costs and product quality loss. These elements are described in the following text:

• Degradation of manufacturing-system components: As in the QR-Chain model developed in Chapter 16, an adjustable process variable can be considered as the measure of the state of a manufacturing-system component. We assume the degradation of a manufacturing-system component over time is modeled as

\[ \xi_i(\tau_i) = \xi_i(0) + \mu_i\tau_i + w_i(\tau_i), \quad \tau_i \ge 0,\ i = 1, 2, \ldots, p, \tag{17.1} \]

where $\tau_i$ is the age of the $i$th system component, $E(\xi_i(0)) = \mu_{0i}$, $\mathrm{var}(\xi_i(0)) = \sigma_{0i}^2$, $E(w_i(\tau_i)) = 0$, and $\mathrm{var}(w_i(\tau_i)) = \sigma_i^2\tau_i$. The mean and variance of $\xi_i$, given $\tau_i$, can be obtained as

\[ E(\xi_i(\tau_i)) = \mu_{0i} + \mu_i\tau_i \tag{17.2} \]

\[ \mathrm{var}(\xi_i(\tau_i)) = \sigma_{0i}^2 + \sigma_i^2\tau_i \tag{17.3} \]

The catastrophic failure of each system component is assumed to have a constant hazard rate. That is, the catastrophic failure time follows an exponential distribution:

\[ \Pr\{\text{Component } i \text{ fails before age } \tau_i\} = 1 - e^{-\lambda_i\tau_i} \]

• Process model: The process model used in Chapter 16 for the QR-Chain model is used to describe the relationship between process variables and the product quality characteristic:

\[ y(\tau) = \eta + \alpha^T\xi(\tau) + \beta^T z + \xi(\tau)^T\Gamma z + \varepsilon \tag{17.4} \]

where $\tau \equiv [\tau_1, \ldots, \tau_p]^T$. The model in Equation 17.4 is the same as the process model in Chapter 16 except that only a single product quality characteristic is considered here. Based on Equation 17.4, given $\xi(\tau)$, the expectation and variance of $y(\tau)$ can be obtained as:

\[ E\big(y(\tau) \mid \xi(\tau)\big) = \eta + \alpha^T\xi(\tau) \]

\[ \mathrm{var}\big(y(\tau) \mid \xi(\tau)\big) = \big(\beta^T + \xi(\tau)^T\Gamma\big)\,V\,\big(\beta + \Gamma^T\xi(\tau)\big) + \sigma_\varepsilon^2 \]

• Product quality loss and maintenance costs: Using the quadratic loss function, the expected quality loss given $\xi(\tau)$ can be written as

\[ \begin{aligned} E\big(L(y(\tau)) \mid \xi(\tau)\big) &= E\big(q\,(y(\tau)-\gamma)^2 \mid \xi(\tau)\big) \\ &= q\cdot\mathrm{var}\big(y(\tau) \mid \xi(\tau)\big) + q\,\big(E(y(\tau) \mid \xi(\tau)) - \gamma\big)^2 \\ &= q\,\Big(\xi(\tau)^T\big(\Gamma V\Gamma^T + \alpha\alpha^T\big)\xi(\tau) + 2\big(\beta^T V\Gamma^T + (\eta-\gamma)\alpha^T\big)\xi(\tau) + \beta^T V\beta + (\eta-\gamma)^2 + \sigma_\varepsilon^2\Big) \end{aligned} \tag{17.5} \]

* Part of the chapter material is based on Reference 4.
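In Equation 17.5, the term $q\,\xi(\tau)^T(\Gamma V \Gamma^T + \alpha\alpha^T)\xi(\tau)$ represents the interactive impacts of the adjustable process variables on product quality loss. In the maintenance model for BIW assembly processes developed in Chapter 15, this interactive impact does not exist. For general processes, however, this interactive impact may exist. Given component ages $\tau$, $\xi(\tau)$ is still random because of the uncertainty in the component degradation processes. Taking the expectation over $\xi(\tau)$,

$$\begin{aligned}
Q(\tau) &= E\big(E(L(y(\tau)) \mid \xi(\tau))\big) \\
&= qE\big[\xi(\tau)^T(\Gamma V \Gamma^T + \alpha\alpha^T)\xi(\tau) + 2\big(\beta^T V \Gamma^T + (\eta - \gamma)\alpha^T\big)\xi(\tau) + \beta^T V \beta + (\eta - \gamma)^2 + \sigma_\varepsilon^2\big] \\
&= qE\big[\xi(\tau)^T B\,\xi(\tau) + p^T \xi(\tau) + d\big] \\
&= q\Big(E(\xi(\tau))^T B\,E(\xi(\tau)) + p^T E(\xi(\tau)) + \sum_{i=1}^{p} b_{ii}\,\mathrm{Var}(\xi_i(\tau_i)) + d\Big)
\end{aligned} \tag{17.6}$$

where $B = (b_{ij})_{p \times p} \equiv \Gamma V \Gamma^T + \alpha\alpha^T$, $p^T \equiv 2(\beta^T V \Gamma^T + (\eta - \gamma)\alpha^T)$, and $d \equiv \beta^T V \beta + (\eta - \gamma)^2 + \sigma_\varepsilon^2$. Define $b \equiv [b_{11} \ldots b_{pp}]^T$. Substituting Equation 17.2 and Equation 17.3 into Equation 17.6, we have

$$Q(\tau) = q\big(\tau^T B^* \tau + (p^*)^T \tau + d^*\big) \tag{17.7}$$

where $B^* \equiv (\mu\mu^T) \circ B$, $p^* \equiv \mu \circ p + 2(B \circ \mu)^T \mu_0 + b \circ \theta$, and $d^* \equiv d + p^T \mu_0 + \mu_0^T B \mu_0 + b^T \theta_0$. Here the vectors collect the parameters of Equation 17.2 and Equation 17.3: $\mu \equiv [\mu_1 \ldots \mu_p]^T$, $\mu_0 \equiv [\mu_{01} \ldots \mu_{0p}]^T$, $\theta \equiv [\sigma_1^2 \ldots \sigma_p^2]^T$, and $\theta_0 \equiv [\sigma_{01}^2 \ldots \sigma_{0p}^2]^T$.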
The $\circ$ product is an extension of the Hadamard product, which is defined as follows:

Definition 17.1: For two $n_1 \times n_2$ matrices $C = (c_{ij})$ and $D = (d_{ij})$, and one $n_2 \times 1$ vector $u = (u_i)$, the Hadamard product of $C$ and $D$ results in an $n_1 \times n_2$ matrix of the elementwise product

$$C \circ D = (c_{ij} d_{ij})_{n_1 \times n_2}$$

Further, similar to the Hadamard product, $C \circ u$ is defined as an $n_1 \times n_2$ matrix

$$C \circ u = (c_{ij} u_j)_{n_1 \times n_2}$$

Let $\tau(t)$ denote the ages of the adjustable process variables at time $t \in [0, \infty)$. The ages of the adjustable process variables $\tau_i(t)$, $i = 1, \ldots, p$, are independent random variables because catastrophic failures of system components are assumed independent. Taking the expectation of Equation 17.7 over $\tau(t)$,

$$\begin{aligned}
E\big(Q(\tau(t))\big) &= qE\big(\tau(t)^T B^* \tau(t) + (p^*)^T \tau(t) + d^*\big) \\
&= q\Big(E(\tau(t))^T B^* E(\tau(t)) + (p^*)^T E(\tau(t)) + \sum_{i=1}^{p} b_{ii}^*\,\mathrm{var}(\tau_i(t)) + d^*\Big)
\end{aligned} \tag{17.8}$$
where $b_{ii}^*$ is the $(i, i)$th element of $B^*$. The following lemma shows properties of $B^*$ and $b_{ii}^*$.

Lemma 17.1: $B^*$ is nonnegative definite and $b_{ii}^* \ge 0$, $\forall i = 1, \ldots, p$.

Lemma 17.1 is not difficult to see from the definitions of $B^*$ and $b_{ii}^*$.

In addition to quality loss, maintenance actions incur maintenance costs. If the $i$th process variable is reset preventively at a scheduled time, we assume that a cost $c_i^p$ is incurred. If it is reset at the unexpected catastrophic failure of its corresponding system component, we assume that a cost $c_i^f$ is incurred.
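The coefficients of Equation 17.7 can be computed mechanically once the process-model and degradation parameters are known. The sketch below is a minimal illustration, not the authors' implementation: it assumes a single noise variable (so V and β are scalars and Γ reduces to a vector, here called `g`), and the function name and argument names are hypothetical. The numbers passed in are the coded stamping-process values that appear later in Equation 17.14 and Table 17.2, with η = γ = 0 read off Equation 17.14.

```python
def quality_loss_coefficients(alpha, g, beta, V, eta, target, var_eps,
                              mu0, mu, theta0, theta, q):
    """Return (q*B_star, q*p_star, q*d_star) of Equation 17.7.

    Assumes one scalar noise variable: beta and V are scalars and g is the
    single column of Gamma (hypothetical simplification for illustration).
    """
    p = len(alpha)
    # B = Gamma V Gamma^T + alpha alpha^T
    B = [[V * g[i] * g[j] + alpha[i] * alpha[j] for j in range(p)] for i in range(p)]
    # p^T = 2 (beta^T V Gamma^T + (eta - gamma) alpha^T)
    pv = [2.0 * (beta * V * g[i] + (eta - target) * alpha[i]) for i in range(p)]
    d = beta * V * beta + (eta - target) ** 2 + var_eps

    # B* = (mu mu^T) o B  (ordinary Hadamard product)
    B_star = [[mu[i] * mu[j] * B[i][j] for j in range(p)] for i in range(p)]
    # Extended product of Definition 17.1: column j of B is scaled by mu[j]
    B_mu = [[B[i][j] * mu[j] for j in range(p)] for i in range(p)]
    # p* = mu o p + 2 (B o mu)^T mu0 + b o theta
    p_star = [mu[i] * pv[i]
              + 2.0 * sum(B_mu[j][i] * mu0[j] for j in range(p))
              + B[i][i] * theta[i]
              for i in range(p)]
    # d* = d + p^T mu0 + mu0^T B mu0 + b^T theta0
    d_star = (d
              + sum(pv[i] * mu0[i] for i in range(p))
              + sum(mu0[i] * B[i][j] * mu0[j] for i in range(p) for j in range(p))
              + sum(B[i][i] * theta0[i] for i in range(p)))
    return ([[q * x for x in row] for row in B_star],
            [q * x for x in p_star],
            q * d_star)


# Coded values of the stamping case study (Equation 17.14, Table 17.2).
qB, qp, qd = quality_loss_coefficients(
    alpha=[-0.1136, -0.0219], g=[-0.0130, 0.0], beta=0.0190, V=0.23,
    eta=0.0, target=0.0, var_eps=0.05,
    mu0=[0.5, -0.5], mu=[6.94e-6, 4.17e-6],
    theta0=[0.076 ** 2, 0.062 ** 2], theta=[7.45e-5 ** 2, 4.47e-5 ** 2],
    q=35.0)
print(qd, qp, qB)
```

With these inputs the constant, linear, and quadratic coefficients should land close to those printed in Equation 17.15; small differences are expected because the published coefficients are rounded.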
17.2 MULTICOMPONENT MAINTENANCE POLICIES
The adjustable process variables that drift because of component degradation need to be reset during the production phase by readjustment or replacement of the corresponding manufacturing-system component. Maintenance decisions determine when each of the adjustable process variables should be reset. Depending on the requirements on feasibility, flexibility, and cost efficiency of the maintenance actions, three replacement/resetting policies are widely studied for multicomponent systems. Each of them will be investigated in this section. Because the resetting of the adjustable process variables is not necessarily accomplished by component replacement, we will use the term resetting instead of replacement when we refer to a maintenance policy from the literature.
17.2.1 SIMPLE BLOCK RESETTING (REPLACEMENT)
Under the simple block resetting policy, each adjustable process variable is reset when its corresponding system component experiences a catastrophic failure. Additionally, regardless of their individual ages, all adjustable process variables are preventively reset at times $kb$ ($k = 1, 2, \ldots$). The optimal maintenance decision aims to minimize the expected long-run average cost by choosing the optimal $b$. The expected long-run average cost under the simple block resetting policy is given in the following lemma.

Lemma 17.2: The expected long-run average cost under the simple block resetting policy is

$$\Phi(b) = \sum_{i=1}^{p} c_i^f \lambda_i + \frac{c^p + \int_0^b E(Q(\tau(t)))\,dt}{b} \tag{17.9}$$

where $c^p \equiv \sum_{i=1}^{p} c_i^p$ is the total preventive resetting cost.
Proof: By definition,

$$\Phi(b) = \frac{E[\text{cost during one preventive resetting interval}]}{b}.$$

During each resetting interval, the total preventive resetting cost is always $\sum_{i=1}^{p} c_i^p$. The expected total failure resetting cost is $\sum_{i=1}^{p} c_i^f E(N_i(b))$, where $N_i(t)$ is the number of catastrophic failures of component $i$ during time interval $(0, t)$. Because each component has a constant hazard rate $\lambda_i$, $N_i(t)$ follows a Poisson distribution with mean $\lambda_i t$ for $t \le b$, so $E(N_i(b)) = \lambda_i b$ and the expected total failure resetting cost over the interval is $b\sum_{i=1}^{p} c_i^f \lambda_i$. The expected quality loss at time $t$ is $E(Q(\tau(t)))$, and the cumulative quality loss during the whole resetting interval is $\int_0^b E(Q(\tau(t)))\,dt$. Summing up the preventive resetting cost, failure resetting cost, and the cumulative quality loss, we have

$$E[\text{cost during one preventive resetting interval}] = \sum_{i=1}^{p} c_i^p + b\sum_{i=1}^{p} c_i^f \lambda_i + \int_0^b E(Q(\tau(t)))\,dt,$$

and

$$\Phi(b) = \frac{\sum_{i=1}^{p} c_i^p + b\sum_{i=1}^{p} c_i^f \lambda_i + \int_0^b E(Q(\tau(t)))\,dt}{b} = \sum_{i=1}^{p} c_i^f \lambda_i + \frac{c^p + \int_0^b E(Q(\tau(t)))\,dt}{b}.$$

Q.E.D.
From Lemma 17.2 and Equation 17.8, the expected long-run average cost $\Phi(b)$ under the simple block resetting policy can be calculated if $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ are known; these are given in the following lemma:

Lemma 17.3: Under the simple block resetting policy, for $t \in (0, b)$,

$$E(\tau_i(t)) = \frac{1}{\lambda_i}\big(1 - e^{-\lambda_i t}\big),$$

and

$$\mathrm{Var}(\tau_i(t)) = \frac{1 - 2\lambda_i t e^{-\lambda_i t} - e^{-2\lambda_i t}}{\lambda_i^2}.$$

Proof: Because a Poisson process has stationary and independent increments,

$$\Pr\{\tau_i(t) > x\} = \Pr\{\text{No failure during } (t - x, t)\} = \Pr\{\text{No failure during } (0, x)\} = e^{-\lambda_i x}, \quad \text{for } 0 < x < t. \tag{17.10}$$

Also, $\tau_i(t)$ is nonnegative. Therefore,

$$E(\tau_i(t)) = \int_0^t e^{-\lambda_i x}\,dx = \frac{1}{\lambda_i}\big(1 - e^{-\lambda_i t}\big),$$

$$E(\tau_i^2(t)) = \int_0^t 2x e^{-\lambda_i x}\,dx = \frac{2}{\lambda_i^2}\big(1 - e^{-\lambda_i t} - \lambda_i t e^{-\lambda_i t}\big),$$

and

$$\mathrm{Var}(\tau_i(t)) = E(\tau_i^2(t)) - \big(E(\tau_i(t))\big)^2 = \frac{1 - 2\lambda_i t e^{-\lambda_i t} - e^{-2\lambda_i t}}{\lambda_i^2}.$$

Q.E.D.
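The closed forms in Lemma 17.3 can be checked by simulating the failure process directly. The sketch below is only an illustrative check: the function names, the seed, and the values λ = 1 and t = 1.5 are arbitrary choices made for this example. Each replication generates exponential times between failures from time 0 (when the component is freshly reset) and records the age at time t as the time elapsed since the most recent failure, or t itself if no failure has occurred.

```python
import math
import random


def simulate_age(lam, t, rng):
    """Age at time t of a component reset at time 0 and again at every failure."""
    last_reset = 0.0
    clock = rng.expovariate(lam)       # time of the first catastrophic failure
    while clock < t:
        last_reset = clock             # failure resets the age to zero
        clock += rng.expovariate(lam)  # next failure of the renewed component
    return t - last_reset


def lemma_17_3(lam, t):
    """Closed-form mean and variance of the age from Lemma 17.3."""
    e = math.exp(-lam * t)
    mean = (1.0 - e) / lam
    var = (1.0 - 2.0 * lam * t * e - e * e) / lam ** 2
    return mean, var


rng = random.Random(0)                 # fixed seed so the check is repeatable
lam, t, n = 1.0, 1.5, 40000
ages = [simulate_age(lam, t, rng) for _ in range(n)]
sim_mean = sum(ages) / n
sim_var = sum((a - sim_mean) ** 2 for a in ages) / (n - 1)
exact_mean, exact_var = lemma_17_3(lam, t)
print(sim_mean, exact_mean, sim_var, exact_var)
```

The simulated and closed-form values should agree to within Monte Carlo error, which shrinks as the number of replications grows.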
From Lemma 17.2, it can be seen that the optimal value of $b$ does not depend on the failure resetting cost $c_i^f$. Based on Lemma 17.2 and Lemma 17.3, we have the following result:

Result 17.1: The optimal simple block resetting policy can be obtained by solving the following nonlinear optimization problem with one decision variable:

$$b^* = \arg\min_b \Phi(b) = \arg\min_b \frac{c^p + \int_0^b E(Q(\tau(t)))\,dt}{b} \quad \text{subject to } b > 0 \tag{17.11}$$
where $E(Q(\tau(t)))$ can be calculated by substituting $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ in Equation 17.8 with Lemma 17.3. By calculating the first-order derivative, if $b^* < \infty$, it should satisfy

$$\int_0^{b^*} \big(E(Q(\tau(b^*))) - E(Q(\tau(t)))\big)\,dt = c^p \tag{17.12}$$

Corollary 17.1: If $b^* < \infty$ is the optimal simple block resetting interval, then

$$\Phi(b^*) = \sum_{i=1}^{p} c_i^f \lambda_i + E(Q(\tau(b^*)))$$

Proof: From Equation 17.12,

$$\int_0^{b^*} E(Q(\tau(t)))\,dt = b^* E(Q(\tau(b^*))) - c^p$$

Using Lemma 17.2,

$$\Phi(b^*) = \sum_{i=1}^{p} c_i^f \lambda_i + \frac{c^p + \int_0^{b^*} E(Q(\tau(t)))\,dt}{b^*} = \sum_{i=1}^{p} c_i^f \lambda_i + E(Q(\tau(b^*))).$$

Q.E.D.
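The first-order condition in Equation 17.12 is stated without its intermediate step. Written out as a short sketch, using only the quotient rule and the fundamental theorem of calculus on Equation 17.11:

```latex
\frac{d\Phi(b)}{db}
  = \frac{b\,E\big(Q(\tau(b))\big) - c^p - \int_0^b E\big(Q(\tau(t))\big)\,dt}{b^2}
  = \frac{\int_0^b \Big(E\big(Q(\tau(b))\big) - E\big(Q(\tau(t))\big)\Big)\,dt - c^p}{b^2}
```

Setting the numerator to zero at $b = b^*$ gives Equation 17.12, and the sign of the numerator determines whether $\Phi(b)$ is decreasing or increasing at a given $b$.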
17.2.2 MODIFIED BLOCK RESETTING
Although the simple block resetting policy is simple to understand and implement, it may result in the resetting of newly adjusted process variables. To avoid this unnecessary waste, the modified block resetting policy was proposed by Berg and Epstein [1]. Under the modified block resetting policy, an adjustable process variable is reset on two kinds of events: a catastrophic failure of its component, and a prescheduled block resetting time ($kb$, $k = 1, 2, \ldots$) at which the age of the adjustable process variable is not less than $h \in [0, b]$. Let $\tau_i(b)$ denote the random age of system component $i$ at the end of a block resetting interval. The expected long-run average cost is

$$\Phi(b, h) = \frac{E(\text{cost during one preventive resetting interval})}{b} = \sum_{i=1}^{p} c_i^f \lambda_i + \frac{\sum_{i=1}^{p} c_i^p e^{-\lambda_i h} + \int_0^b E(Q(\tau(t)))\,dt}{b} \tag{17.13}$$
One difference between Equation 17.13 and Lemma 17.2 is in the preventive resetting cost. Under a modified block resetting policy, only adjustable process variables with ages no less than $h$ are preventively reset at the end of a block resetting interval. As a result, the preventive resetting cost under the modified block resetting policy is

$$\sum_{i=1}^{p} c_i^p \Pr\{\tau_i(b) \ge h\} = \sum_{i=1}^{p} c_i^p e^{-\lambda_i h}$$

where $\Pr\{\tau_i(b) \ge h\}$ is calculated from Equation 17.10. For the modified block resetting policy, Equation 17.8 can still be used to calculate $E(Q(\tau(t)))$. However, the calculation of $E(\tau_i(t))$ and $E(\tau_i^2(t))$ is different from the simple block resetting policy owing to the possibly nonzero $\tau_i(0)$, the age of adjustable process variable $i$ at the beginning of a block resetting interval. The following lemma evaluates $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ for the modified block resetting policy.

Lemma 17.4: Under the modified block resetting policy $(b, h)$, for $t \in (0, b)$,

$$E(\tau_i(t)) = \frac{1}{\lambda_i} - \Big(\frac{1}{\lambda_i} + h\Big) e^{-\lambda_i(h+t)}$$

$$\mathrm{Var}(\tau_i(t)) = \frac{1}{\lambda_i^2} - \frac{h^2\lambda_i + 2(h\lambda_i + 1)t}{\lambda_i}\,e^{-(h+t)\lambda_i} - \frac{(1 + h\lambda_i)^2}{\lambda_i^2}\,e^{-2(h+t)\lambda_i}$$
Proof: Conditioning on the last catastrophic failure time of component $i$ during $(0, t)$,

$$\begin{aligned}
E(\tau_i(t)) &= \Pr\{\text{no failure on } (0, t)\} \cdot E(\tau_i(t) \mid \text{no failure on } (0, t)) + \int_0^t E(\tau_i(t) \mid \text{a failure at time } t - x)\,\lambda_i e^{-\lambda_i x}\,dx \\
&= e^{-\lambda_i t} E(\tau_i(0) + t) + \int_0^t x \lambda_i e^{-\lambda_i x}\,dx \\
&= \frac{1}{\lambda_i} - \Big(\frac{1}{\lambda_i} + h\Big) e^{-\lambda_i(h+t)}
\end{aligned}$$

$$\begin{aligned}
E(\tau_i^2(t)) &= \Pr\{\text{no failure on } (0, t)\} \cdot E(\tau_i^2(t) \mid \text{no failure on } (0, t)) + \int_0^t E(\tau_i^2(t) \mid \text{a failure at time } t - x)\,\lambda_i e^{-\lambda_i x}\,dx \\
&= e^{-\lambda_i t} E\big((\tau_i(0) + t)^2\big) + \int_0^t x^2 \lambda_i e^{-\lambda_i x}\,dx \\
&= \frac{1}{\lambda_i^2}\Big(2 - e^{-(h+t)\lambda_i}\big(2 + 2\lambda_i t + h^2\lambda_i^2 + 2h\lambda_i(1 + \lambda_i t)\big)\Big)
\end{aligned}$$

$$\mathrm{Var}(\tau_i(t)) = E(\tau_i^2(t)) - \big(E(\tau_i(t))\big)^2 = \frac{1}{\lambda_i^2} - \frac{h^2\lambda_i + 2(h\lambda_i + 1)t}{\lambda_i}\,e^{-(h+t)\lambda_i} - \frac{(1 + h\lambda_i)^2}{\lambda_i^2}\,e^{-2(h+t)\lambda_i}$$

Q.E.D.
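The proof of Lemma 17.4 uses the moments of the initial age $\tau_i(0)$ without stating them. A reading that is consistent with the closed forms above, offered here as an assumption rather than as the authors' stated argument, is that the age just before a block resetting time has density $\lambda_i e^{-\lambda_i x}$ on $(0, h)$ by Equation 17.10, and that an age of $h$ or more is reset to zero. Under that reading:

```latex
E(\tau_i(0)) = \int_0^h x\,\lambda_i e^{-\lambda_i x}\,dx
             = \frac{1}{\lambda_i} - \Big(\frac{1}{\lambda_i} + h\Big)e^{-\lambda_i h},
\qquad
E(\tau_i^2(0)) = \int_0^h x^2\,\lambda_i e^{-\lambda_i x}\,dx
               = \frac{2}{\lambda_i^2} - \Big(h^2 + \frac{2h}{\lambda_i} + \frac{2}{\lambda_i^2}\Big)e^{-\lambda_i h}
```

Substituting these, together with $\int_0^t x\lambda_i e^{-\lambda_i x}\,dx = \frac{1}{\lambda_i} - \big(t + \frac{1}{\lambda_i}\big)e^{-\lambda_i t}$, into the second line of each derivation reproduces the stated expressions for $E(\tau_i(t))$ and $E(\tau_i^2(t))$. Setting $h = 0$ recovers Lemma 17.3.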
Result 17.2: The optimal modified block resetting policy can be obtained by solving the following nonlinear optimization problem with two decision variables:

$$(b^*, h^*) = \arg\min_{b,h} \Phi(b, h) = \arg\min_{b,h} \frac{\sum_{i=1}^{p} c_i^p e^{-\lambda_i h} + \int_0^b E(Q(\tau(t)))\,dt}{b} \quad \text{subject to } b > 0 \text{ and } 0 \le h \le b$$

where $E(Q(\tau(t)))$ is calculated by substituting $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ in Equation 17.8 with Lemma 17.4. Let $b_h$ denote the optimal value of $b \ge h$, which minimizes $\Phi(b, h)$ under fixed $h$. Similar to the simple block resetting policy, if $b_h < \infty$, then

$$\int_0^{b_h} \big(E(Q(\tau(b_h))) - E(Q(\tau(t)))\big)\,dt = \sum_{i=1}^{p} c_i^p e^{-\lambda_i h}$$

and

$$\Phi(b_h, h) = \sum_{i=1}^{p} c_i^f \lambda_i + E(Q(\tau(b_h))).$$
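The objective in Result 17.2 is straightforward to evaluate numerically. The sketch below is a minimal illustration under stated assumptions: it borrows the rounded stamping-process coefficients from Equation 17.16 and the hazard rates and preventive costs from Table 17.2 later in the chapter, uses composite Simpson integration instead of a closed form, and reports the cost net of the common term Σ c_i^f λ_i (as Table 17.4 does). All function and constant names are hypothetical.

```python
import math

LAM = (8e-6, 3e-6)        # hazard rates (Table 17.2)
CP = (3000.0, 3000.0)     # preventive resetting costs (Table 17.2)


def age_moments(lam, t, h):
    """E(tau) and E(tau^2) at time t of a block interval (Lemma 17.4 and its proof)."""
    e = math.exp(-lam * (h + t))
    m1 = 1.0 / lam - (1.0 / lam + h) * e
    m2 = (2.0 - e * (2.0 + 2.0 * lam * t + (h * lam) ** 2
                     + 2.0 * h * lam * (1.0 + lam * t))) / lam ** 2
    return m1, m2


def expected_loss(t, h):
    """E(Q(tau(t))) with the rounded coefficients of Equation 17.16."""
    m1, s1 = age_moments(LAM[0], t, h)
    m2, s2 = age_moments(LAM[1], t, h)
    return (1.83 + 2.19e-11 * s1 + m1 * (2.53e-6 + 5.06e-12 * m2)
            + 2.92e-7 * m2 + 3.04e-13 * s2)


def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * step)
    return total * step / 3.0


def phi(b, h):
    """Long-run average cost of Equation 17.13 without the sum of c_i^f * lambda_i."""
    preventive = sum(c * math.exp(-lam * h) for c, lam in zip(CP, LAM))
    return (preventive + simpson(lambda t: expected_loss(t, h), 0.0, b)) / b


cost_simple = phi(5.8e4, 0.0)        # h = 0 reduces to simple block resetting
cost_modified = phi(5.4e4, 2.18e4)   # policy reported in Table 17.4
print(cost_simple, cost_modified)
```

Setting h = 0 makes every variable eligible for preventive resetting, so `phi(b, 0.0)` is the simple block resetting cost. Because the coefficients are rounded, the printed values should be close to, but not identical with, the costs reported in Table 17.4; the gap between the two policies is the more meaningful comparison.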
17.2.3 AGE RESETTING
For both the simple block resetting policy and the modified block resetting policy, all adjustable process variables share the same maintenance policy ($b$ or $(b, h)$). Special drifting and/or failure characteristics of individual adjustable process variables are not fully considered. Under an age resetting policy, each adjustable process variable has its own resetting schedule. The $i$th adjustable process variable is reset when catastrophic failures of the $i$th system component occur. In addition, it is preventively reset whenever its age reaches $t_a(i)$. Therefore, an age resetting policy is determined by $t_a \equiv [t_a(1) \ldots t_a(p)]^T$. The drawback of the age resetting policy is that preventive maintenance cannot be planned in advance, and tracking the ages of adjustable process variables involves significant administration. The expected long-run average cost under an age resetting policy is given in the following lemma:

Lemma 17.5: Under the age resetting policy $t_a$, the expected long-run average cost is

$$\Phi(t_a) = \sum_{i=1}^{p} \lambda_i \Big(c_i^f - c_i^p + \frac{c_i^p}{1 - e^{-\lambda_i t_a(i)}}\Big) + \lim_{t \to \infty} E(Q(\tau(t)))$$
Proof: Let $\Phi_m(t_a)$ denote the expected long-run average maintenance cost and $\Phi_q(t_a)$ denote the expected long-run average quality loss, so that

$$\Phi(t_a) = \Phi_m(t_a) + \Phi_q(t_a)$$

The resettings of each system component, whether triggered by a catastrophic failure or by reaching age $t_a(i)$, form a renewal process with a cost incurred at each renewal. Based on renewal theory, when the catastrophic failure of each component has a constant hazard rate, the expected long-run average maintenance cost is

$$\Phi_m(t_a) = \sum_{i=1}^{p} \frac{c_i^p e^{-\lambda_i t_a(i)} + c_i^f\big(1 - e^{-\lambda_i t_a(i)}\big)}{\int_0^{t_a(i)} e^{-\lambda_i t}\,dt} = \sum_{i=1}^{p} \lambda_i \Big(c_i^f - c_i^p + \frac{c_i^p}{1 - e^{-\lambda_i t_a(i)}}\Big)$$

For the long-run average quality loss,

$$\Phi_q(t_a) = \lim_{T \to \infty} \frac{\int_0^T E(Q(\tau(t)))\,dt}{T} = \lim_{t \to \infty} E(Q(\tau(t)))$$

Q.E.D.
From Equation 17.8, $\lim_{t \to \infty} E(Q(\tau(t)))$ can be obtained based on $\lim_{t \to \infty} E(\tau_i(t))$ and $\lim_{t \to \infty} \mathrm{Var}(\tau_i(t))$, which are given in the following lemma:

Lemma 17.6: Under age resetting policy $t_a$,

$$\lim_{t \to \infty} E(\tau_i(t)) = \frac{1}{\lambda_i} - \frac{t_a(i)\,e^{-\lambda_i t_a(i)}}{1 - e^{-\lambda_i t_a(i)}}$$

$$\lim_{t \to \infty} \mathrm{Var}(\tau_i(t)) = \frac{1 + e^{-2 t_a(i)\lambda_i} - e^{-t_a(i)\lambda_i}\big(2 + t_a^2(i)\lambda_i^2\big)}{\lambda_i^2\big(1 - e^{-t_a(i)\lambda_i}\big)^2}$$

Proof: For each adjustable process variable, a renewal occurs at the time of resetting. Let $U_i$ denote the random length of the renewal/resetting interval of component $i$. Based on the limiting distribution of the age of a renewal process given in Ross [3], it is not difficult to see that

$$\lim_{t \to \infty} E(\tau_i(t)) = \frac{E(U_i^2)}{2E(U_i)} = \frac{1}{\lambda_i} - \frac{t_a(i)\,e^{-\lambda_i t_a(i)}}{1 - e^{-\lambda_i t_a(i)}}$$

$$\lim_{t \to \infty} E(\tau_i^2(t)) = \frac{E(U_i^3)}{3E(U_i)} = \frac{2}{\lambda_i^2} - \frac{t_a(i)\,e^{-\lambda_i t_a(i)}\big(2 + t_a(i)\lambda_i\big)}{\big(1 - e^{-\lambda_i t_a(i)}\big)\lambda_i}$$

The lemma follows from $\lim_{t \to \infty} \mathrm{Var}(\tau_i(t)) = \lim_{t \to \infty} E(\tau_i^2(t)) - \big(\lim_{t \to \infty} E(\tau_i(t))\big)^2$. Q.E.D.
Result 17.3: The optimal age resetting policy can be obtained by solving the following nonlinear optimization problem with $p$ decision variables:

$$t_a^* = \arg\min_{t_a} \Phi(t_a) = \arg\min_{t_a} \sum_{i=1}^{p} \frac{\lambda_i c_i^p}{1 - e^{-\lambda_i t_a(i)}} + \lim_{t \to \infty} E(Q(\tau(t))) \quad \text{subject to } t_a(i) > 0,\ i = 1, 2, \ldots, p$$

where $\lim_{t \to \infty} E(Q(\tau(t)))$ is calculated by using Equation 17.8 and Lemma 17.6. Although the age resetting policy usually results in lower cost than the simple block resetting and modified block resetting policies, its implementation is difficult because it applies different maintenance schedules to different system components.
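For a small number of components, the problem in Result 17.3 can be explored with a coarse grid search. The sketch below is an illustration only: it borrows the rounded stamping-process coefficients from Equation 17.16 and the parameters of Table 17.2 that appear later in the chapter, the grid ranges and step sizes are arbitrary choices, and the function names are hypothetical. The cost it reports keeps the −λ_i c_i^p term of Lemma 17.5 and omits only Σ λ_i c_i^f, so that it is comparable with Table 17.4.

```python
import math

LAM = (8e-6, 3e-6)        # hazard rates (Table 17.2)
CP = (3000.0, 3000.0)     # preventive resetting costs (Table 17.2)


def limiting_moments(lam, ta):
    """Limits of E(tau) and E(tau^2) under age resetting (Lemma 17.6 and its proof)."""
    em1 = math.expm1(lam * ta)            # e^{lam*ta} - 1
    m1 = 1.0 / lam - ta / em1             # ta*e^{-x}/(1-e^{-x}) equals ta/(e^{x}-1)
    m2 = 2.0 / lam ** 2 - ta * (2.0 + ta * lam) / (em1 * lam)
    return m1, m2


def age_policy_cost(ta):
    """Lemma 17.5 cost without the sum of c_i^f * lambda_i, for two components."""
    m1, s1 = limiting_moments(LAM[0], ta[0])
    m2, s2 = limiting_moments(LAM[1], ta[1])
    # lambda*c^p*(1/(1-e^{-x}) - 1) simplifies to lambda*c^p/(e^{x}-1)
    maintenance = sum(lam * c / math.expm1(lam * t)
                      for lam, c, t in zip(LAM, CP, ta))
    loss = (1.83 + 2.19e-11 * s1 + m1 * (2.53e-6 + 5.06e-12 * m2)
            + 2.92e-7 * m2 + 3.04e-13 * s2)
    return maintenance + loss


# Coarse exhaustive search over an illustrative grid of resetting ages.
best_cost, best_ta = min(
    (age_policy_cost((t1, t2)), (t1, t2))
    for t1 in range(20000, 80001, 1000)
    for t2 in range(60000, 200001, 2000))
print(best_ta, best_cost)
```

A finer grid or a local optimizer started from the grid optimum would sharpen the answer; the coarse search is used here only because the text later notes that exhaustive search is practical with two variables.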
17.3 FURTHER DISCUSSION ON SOLUTIONS OF OPTIMIZATION PROBLEMS
In general, studying the properties of an optimal maintenance policy is very difficult. Practical assumptions are needed to simplify the problem. For example, in the literature of traditional preventive replacement problems, assumptions such as monotone increasing hazard rates are made to derive properties of the optimal solutions. Here, we assume that $Q(\tau)$, the loss function corresponding to the product quality produced by system components of age $\tau$, is a continuous and monotone increasing function of $\tau$. This assumption is consistent with common intuition, because degraded system components cause inferior product quality. Based on this assumption, the optimal solutions for the simple block resetting and modified block resetting policies are further investigated. The following lemmas will be used in the discussion:

Lemma 17.7: For the simple block resetting policy, $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ are strictly increasing functions of $t$. For the modified block resetting policy, $E(\tau_i(t))$ is a strictly increasing function of $t$, and $\mathrm{Var}(\tau_i(t))$ is a strictly increasing function of $t$ when $t \ge h$. For the modified block resetting policy, both $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ are increasing functions of $h$.

Proof: The proof is based on evaluating the derivatives of $E(\tau_i(t))$ and $\mathrm{Var}(\tau_i(t))$ repeatedly. Refer to the appendix of this chapter for the detailed proof.

Lemma 17.8: If $Q(\tau)$ is a continuous and increasing function of $\tau$, for the simple block resetting policy, $E(Q(\tau(t)))$ is a continuous and increasing function of $t$. For the modified block resetting policy, $E(Q(\tau(t)))$ is a continuous and increasing function of $t$ for $t \ge h$. If $Q(\tau)$ is strictly increasing, so is $E(Q(\tau(t)))$.

Proof: Because $Q(\tau)$ is an increasing function of $\tau$, from Equation 17.7, $\tau^T B^* \tau + (p^*)^T \tau$ is increasing in $\tau$. Further, from Lemma 17.7, $E(\tau_i(t))$ is an increasing function of $t$. So, $E(\tau(t))^T B^* E(\tau(t)) + (p^*)^T E(\tau(t))$ is an increasing function of $t$. From Lemma 17.1 and Lemma 17.7, $b_{ii}^* \ge 0$ and $\mathrm{Var}(\tau_i(t))$ is an increasing function of $t$ (if $t \ge h$ for the modified resetting policy), so

$$E(\tau(t))^T B^* E(\tau(t)) + (p^*)^T E(\tau(t)) + \sum_{i=1}^{p} b_{ii}^*\,\mathrm{Var}(\tau_i(t))$$

is an increasing function of $t$. Based on Equation 17.8, this shows that $E(Q(\tau(t)))$ is an increasing function of $t$ (if $t \ge h$ for the modified resetting policy). If $Q(\tau)$ is strictly increasing, $E(Q(\tau(t)))$ is also strictly increasing because $E(\tau_i(t))$ is strictly increasing. Q.E.D.
17.3.1 OPTIMAL SOLUTIONS OF SBR POLICY
The following result shows the properties of the optimal solutions for the simple block resetting policy. These properties can be used to facilitate the nonlinear optimization procedure.

Result 17.4: For the simple block resetting policy, if $Q(\tau)$ is a continuous and monotone increasing function of $\tau$, we have that

1. Any local minimum of Equation 17.11 is also a global minimum.
2. If $\int_0^{\infty} \big(E(Q(\tau(\infty))) - E(Q(\tau(t)))\big)\,dt \le c^p$, infinity is an optimal solution to Equation 17.11, i.e., resetting at only catastrophic failures is an optimal policy. If $\int_0^{\infty} \big(E(Q(\tau(\infty))) - E(Q(\tau(t)))\big)\,dt > c^p$, a finite optimal solution to Equation 17.11 exists.
3. If $Q(\tau)$ is strictly increasing, the optimal solution is unique.
Proof: From Lemma 17.8, when $Q(\tau)$ is increasing, $E(Q(\tau(t)))$ is also increasing. So, for $b_2 > b_1 \ge 0$,

$$\int_0^{b_2} \big(E(Q(\tau(b_2))) - E(Q(\tau(t)))\big)\,dt \ge \int_0^{b_1} \big(E(Q(\tau(b_2))) - E(Q(\tau(t)))\big)\,dt \ge \int_0^{b_1} \big(E(Q(\tau(b_1))) - E(Q(\tau(t)))\big)\,dt$$

Therefore, $\int_0^{b} \big(E(Q(\tau(b))) - E(Q(\tau(t)))\big)\,dt$ is increasing as a function of $b$. This completes the proof for (1).

For (2), from Result 17.1 and $\int_0^{\infty} \big(E(Q(\tau(\infty))) - E(Q(\tau(t)))\big)\,dt \le c^p$,

$$\frac{d}{db}\Phi(b) \le 0, \quad \forall b < \infty.$$

Therefore, the optimal solution to Equation 17.11 equals infinity. If $\int_0^{\infty} \big(E(Q(\tau(\infty))) - E(Q(\tau(t)))\big)\,dt > c^p$, because $E(Q(\tau(t)))$ is continuous, there must exist $b < \infty$ such that

$$\int_0^{b} \big(E(Q(\tau(b))) - E(Q(\tau(t)))\big)\,dt = c^p.$$

From (1), $b$ is a finite optimal solution.

If $Q(\tau)$ is strictly increasing, so is $E(Q(\tau(t)))$. Additionally, $\int_0^{b} \big(E(Q(\tau(b))) - E(Q(\tau(t)))\big)\,dt$ is strictly increasing. So,

$$\int_0^{b} \big(E(Q(\tau(b))) - E(Q(\tau(t)))\big)\,dt = c^p$$

has a unique solution, if it has any at all. Thus, the optimal solution to Equation 17.11 is unique (and may be infinite), and (3) is proved. Q.E.D.
17.3.2 OPTIMAL SOLUTIONS OF MODIFIED BLOCK RESETTING POLICY
There are two decision variables in modified block resetting policies. Therefore, the study of the properties of the optimal modified block resetting policy is more difficult than that of the simple block resetting policy. The following result and its corollaries show the properties of the optimal modified block resetting policy:

Result 17.5: For the modified block resetting policy, if $Q(\tau)$ is strictly increasing, $h_1 \le h_2$, and $b_{h_1} < b_{h_2}$, then $\Phi(b_{h_1}, h_1) < \Phi(b_{h_2}, h_2)$.

Proof: When $b_{h_1} < b_{h_2} < \infty$ and $h_1 \le h_2$, from Lemma 17.8, $E(Q(\tau(b_{h_1}))) < E(Q(\tau(b_{h_2})))$. Further, from Result 17.2,

$$\Phi(b_{h_1}, h_1) = \sum_{i=1}^{p} c_i^f \lambda_i + E(Q(\tau(b_{h_1}))) < \sum_{i=1}^{p} c_i^f \lambda_i + E(Q(\tau(b_{h_2}))) = \Phi(b_{h_2}, h_2)$$

If $b_{h_2} = \infty$, it is easy to see that

$$\sum_{i=1}^{p} c_i^f \lambda_i + E(Q(\tau(b_{h_2}))) \le \Phi(b_{h_2}, h_2)$$

and the result still follows.

Corollary 17.2: For the modified block resetting policy, if $Q(\tau)$ is strictly increasing and $(b^*, h^*)$ is a global optimal policy, $h \le h^*$ implies $b_h \ge b^*$.

Proof: We have $b^* = b_{h^*}$. If $h \le h^*$, suppose $b_h < b^*$. Then, from Result 17.5, $\Phi(b_h, h) < \Phi(b_{h^*}, h^*)$, which contradicts the fact that $(b^*, h^*)$ is a global optimum. Therefore, $b_h \ge b^*$ if $h \le h^*$. Q.E.D.

In the next corollary, a sufficient condition is given for the modified block resetting policy to have a finite optimal solution.

Corollary 17.3: If $Q(\tau)$ is strictly increasing and there exists a finite optimal simple block resetting policy, a finite optimal modified block resetting policy must also exist.

Proof: The simple block resetting policy can be considered a modified block resetting policy with $h = 0$. Therefore, the optimal simple block resetting policy is $b_{h=0} < \infty$. From Corollary 17.2, because $h^* \ge 0$, the global optimal modified block resetting policy $(b^*, h^*)$ must satisfy $b^* \le b_{h=0} < \infty$. Q.E.D.
17.4 CASE STUDY
An experimental design was carried out for a sheet metal stamping process shown in Figure 17.1 [2]. The adjustable process variables include outer shut height, inner shut height, punch speed, and blank washer pressure. The noise variables include blank thickness and lubrication. Based on analysis of variance (ANOVA), it is found that outer shut height ($\xi_1$) and inner shut height ($\xi_2$) are two significant adjustable process variables, and blank thickness ($z_1$) is the only significant noise variable. The product quality characteristic ($y$) is the dimensional deviation from the nominal design of a critical point on the workpiece. The target value of $y$ is $\gamma = 0$. The variable measurements of outer shut height and inner shut height reflect the worn states of two separate system components: outer die and inner die, respectively. During production, the wear of the outer die and the inner die can be characterized as the positive drift of the outer shut height and the inner shut height from their initial settings. Therefore, the maintenance of the corresponding dies should be scheduled to maintain these two shut heights within a certain region to ensure satisfactory product quality.

FIGURE 17.1 Sheet metal stamping process. (Labeled elements of the stamping press: tie rod, flywheel, bearing, linkage, gib, slide, bolster, upright, die, blank, punch speed, shut height.)

The process model of this stamping process is obtained in Jin and Shi [2] as

$$y = -0.1136\,\xi_1 - 0.0219\,\xi_2 + 0.0190\,z_1 - 0.0130\,\xi_1 z_1 + \varepsilon \tag{17.14}$$

where the variances of $\varepsilon$ and $z_1$ are obtained as $\sigma_\varepsilon^2 = 0.05$ and $\sigma_z^2 = 0.23$, respectively. It should be noted that all values of the process variables in Equation 17.14 are coded values, which are certain linear transformations of the corresponding real physical values. The coding and its corresponding real values for the adjustable process variables used in the experiment are listed in Table 17.1. The model parameters used in the quality-oriented maintenance model are listed in Table 17.2. In this study, production time is measured in terms of the number of workpieces produced. The values in Table 17.2 correspond to the coded values used in statistical experimental design. The corresponding real values of these parameters are listed in Table 17.3. Based on the model parameters given in Table 17.3 and following the procedures in Section 17.1, $Q(\tau)$ and $E(Q(\tau(t)))$ for the stamping process can be obtained as
TABLE 17.1 Coding of Adjustable Process Variables

| Variable | ξ1 | ξ2 |
| Name | Outer Shut Height (in.) | Inner Shut Height (in.) |
| Low (–1) | 83.6553 | 95.9435 |
| High (+1) | 83.6725 | 95.9646 |

TABLE 17.2 Maintenance Model Parameters for the Stamping Process

| σz | µ01 | µ02 | µ1 | µ2 | σ01 | σ02 |
| 0.48 | 0.5 | –0.5 | 6.94 × 10^–6 | 4.17 × 10^–6 | 0.076 | 0.062 |

| σ1 | σ2 | λ1 | λ2 | q | c1^p | c2^p |
| 7.45 × 10^–5 | 4.47 × 10^–5 | 8 × 10^–6 | 3 × 10^–6 | 35 | 3000 | 3000 |

TABLE 17.3 Maintenance Model Parameters in Terms of Real Values

| σz (mm) | µ01 (in.) | µ02 (in.) | µ1 (in.) | µ2 (in.) | σ01 (in.) | σ02 (in.) |
| 0.03 | 83.6682 | 95.9488 | 5.97 × 10^–8 | 4.40 × 10^–8 | 6.54 × 10^–4 | 6.54 × 10^–4 |

| σ1 (in.) | σ2 (in.) | λ1 | λ2 | q ($/mm^2) | c1^p ($) | c2^p ($) |
| 6.41 × 10^–7 | 4.72 × 10^–7 | 8 × 10^–6 | 3 × 10^–6 | 35 | 3000 | 3000 |
$$Q(\tau) = 1.83 + 2.19 \times 10^{-11}\tau_1^2 + \tau_1\big(2.53 \times 10^{-6} + 5.06 \times 10^{-12}\tau_2\big) + 2.92 \times 10^{-7}\tau_2 + 3.04 \times 10^{-13}\tau_2^2 \tag{17.15}$$

$$\begin{aligned}
E(Q(\tau(t))) = {} & 1.83 + 2.19 \times 10^{-11} E(\tau_1^2(t)) + E(\tau_1(t))\big(2.53 \times 10^{-6} + 5.06 \times 10^{-12} E(\tau_2(t))\big) \\
& + 2.92 \times 10^{-7} E(\tau_2(t)) + 3.04 \times 10^{-13} E(\tau_2^2(t))
\end{aligned} \tag{17.16}$$

Following the procedures in Section 17.2, the optimization problems corresponding to the simple block resetting policy, the modified block resetting policy, and the age resetting policy are given in Equation 17.17, Equation 17.18, and Equation 17.19, respectively.

$$\begin{aligned}
b^* = \arg\min_b \frac{e^{-11 \times 10^{-6} b}}{b}\Big( & -19184 + (237095 + 0.685b)\,e^{3 \times 10^{-6} b} + (147859 + 0.0676b)\,e^{8 \times 10^{-6} b} \\
& + (-359770 + 3.202b)\,e^{11 \times 10^{-6} b}\Big) \quad \text{subject to } b > 0
\end{aligned} \tag{17.17}$$

$$\begin{aligned}
(b^*, h^*) = \arg\min_{(b,h)} \frac{e^{-11 \times 10^{-6}(b+h)}}{b}\Big( & 3.20219\,b\,e^{11 \times 10^{-6}(b+h)} - 4.60 \times 10^{-7}(125000 + h)(333333 + h) \\
& + 4.60 \times 10^{-7} e^{11 \times 10^{-6} b}(125000 + h)(333333 + h) \\
& - 2.74 \times 10^{-6} e^{10^{-6}(11b + 3h)}(160726 + h)(531696 + h) \\
& - 1.01 \times 10^{-7} e^{10^{-6}(11b + 8h)}(355464 + h)(4.02 \times 10^{6} + h) \\
& + e^{3 \times 10^{-6}(b+h)}\big(b(0.68 + 5.48 \times 10^{-6} h) + 2.74 \times 10^{-6}(163703 + h)(528720 + h)\big) \\
& + e^{8 \times 10^{-6}(b+h)}\big(b(0.068 + 2.03 \times 10^{-7} h) + 1.01 \times 10^{-7}(363558 + h)(4.01 \times 10^{6} + h)\big)\Big) \\
& \text{subject to } b > 0,\ 0 \le h \le b
\end{aligned} \tag{17.18}$$

$$\begin{aligned}
(t_a^*(1), t_a^*(2)) = \arg\min_{(t_a(1), t_a(2))} & \frac{1}{\big(e^{8 \times 10^{-6} t_a(1)} - 1\big)\big(e^{3 \times 10^{-6} t_a(2)} - 1\big)}\Big(3.24\,e^{10^{-6}(8 t_a(1) + 3 t_a(2))} + 3.20 - 3.23\,e^{8 \times 10^{-6} t_a(1)} - 3.21\,e^{3 \times 10^{-6} t_a(2)} \\
& + t_a(2)\big(1.13 \times 10^{-6} - 1.13 \times 10^{-6} e^{8 \times 10^{-6} t_a(1)}\big) + t_a(2)^2\big(3.04 \times 10^{-13} - 3.04 \times 10^{-13} e^{8 \times 10^{-6} t_a(1)}\big) \\
& + t_a(1)\big(9.70 \times 10^{-6} + 5.06 \times 10^{-12} t_a(2) - 9.70 \times 10^{-6} e^{3 \times 10^{-6} t_a(2)}\big) \\
& + t_a(1)^2\big(2.19 \times 10^{-11} - 2.19 \times 10^{-11} e^{3 \times 10^{-6} t_a(2)}\big)\Big) \\
& \text{subject to } t_a(1) > 0,\ t_a(2) > 0
\end{aligned} \tag{17.19}$$
From Equation 17.15, $Q(\tau)$ is strictly increasing. Further, from Result 17.4, the optimization problem in Equation 17.17 has a unique global optimum. If it has a local optimum, then the local optimum must be a global optimum. From Corollary 17.3, a finite optimal solution for Equation 17.18 also exists. The age resetting problem in Equation 17.19 can be solved by exhaustive search because there are only two significant adjustable process variables in this system. The optimal solutions for all three policies are listed in Table 17.4.

TABLE 17.4 Optimal Resetting Policies for the Stamping Process

| | Simple Block Resetting | Modified Block Resetting | Age Resetting |
| Optimal policy | b* = 5.8 × 10^4 | b* = 5.4 × 10^4, h* = 2.18 × 10^4 | ta*(1) = 4.1 × 10^4, ta*(2) = 12.4 × 10^4 |
| Minimum long-run average cost ($/part)* | 2.0244 | 2.0183 | 1.9915 |

* Because all three resetting policies are independent of $c_i^f$, the long-run average costs in this table do not include the common cost component $\sum_{i=1}^{p} \lambda_i c_i^f$.

We use the age resetting policy as a benchmark to evaluate the cost efficiency of the simple block resetting and modified block resetting policies. As the long-run average costs of both simple block resetting and modified block resetting are close to that of age resetting, the cost efficiency of both is satisfactory. Because simple block resetting is easier to implement, a simple block resetting policy with a preventive resetting interval of 5.8 × 10^4 operations is preferred.
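The simple block resetting result can be reproduced approximately from Equation 17.12, Equation 17.16, and Lemma 17.3. The sketch below is an illustration under stated assumptions, not the procedure used for Table 17.4: it uses the rounded coefficients printed in Equation 17.16, integrates numerically with a composite Simpson rule, and finds the root of the first-order condition by bisection, which is justified here because Result 17.4 shows the left side of Equation 17.12 is increasing in b. The function names and the bracketing interval are hypothetical choices.

```python
import math

LAM = (8e-6, 3e-6)        # hazard rates (Table 17.2)
CP_TOTAL = 6000.0         # c^p = c_1^p + c_2^p (Table 17.2)


def age_moments(lam, t):
    """E(tau) and E(tau^2) under simple block resetting (Lemma 17.3 and its proof)."""
    e = math.exp(-lam * t)
    return (1.0 - e) / lam, 2.0 * (1.0 - e - lam * t * e) / lam ** 2


def expected_loss(t):
    """E(Q(tau(t))) with the rounded coefficients of Equation 17.16."""
    m1, s1 = age_moments(LAM[0], t)
    m2, s2 = age_moments(LAM[1], t)
    return (1.83 + 2.19e-11 * s1 + m1 * (2.53e-6 + 5.06e-12 * m2)
            + 2.92e-7 * m2 + 3.04e-13 * s2)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * step)
    return total * step / 3.0


def first_order_gap(b):
    """Left side minus right side of Equation 17.12."""
    return b * expected_loss(b) - simpson(expected_loss, 0.0, b) - CP_TOTAL


def optimal_block_interval(lo=1e3, hi=1e6, tol=1.0):
    """Bisection on the increasing first-order gap; assumes a sign change on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_order_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b_star = optimal_block_interval()
# Long-run average cost without the common term sum of c_i^f * lambda_i.
phi_star = (CP_TOTAL + simpson(expected_loss, 0.0, b_star)) / b_star
print(b_star, phi_star, expected_loss(b_star))
```

The last two printed values should nearly coincide, which is Corollary 17.1 in numerical form. The interval found this way should round to the $b^*$ listed in Table 17.4, with a small offset in cost attributable to the rounding of the coefficients in Equation 17.16.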
17.5 EXERCISES
1. List at least three characteristics of the maintenance model presented in this chapter that are not captured in the maintenance model in Chapter 15.
2. Calculate $\begin{bmatrix} 1 & 2 \\ 0 & 2 \end{bmatrix} \circ \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}$ based on the definition of the $\circ$ operator given in this chapter.
3. Provide a proof for Lemma 17.1.
4. Derive Equation 17.17–Equation 17.19 rigorously.
5. Through process improvements, the hazard rates for both the outer die and the inner die are cut by half. Rerun the case study in Section 17.4 to obtain the optimal maintenance policies.
6. Describe how to apply the maintenance models studied in this chapter to the BIW assembly processes studied in previous chapters.
17.6 APPENDIX 17.1: PROOF OF LEMMA 17.7 If Lemma 17.7 is valid for the modified block resetting, then by setting h = 0 it will be also valid for the simple block resetting. Therefore, we only need to prove Lemma 17.7 for the modified block resetting. For E (τ i (t )) , from Lemma 17.4, dE (τ i (t )) d 1 1 = − ( + h)e − λi ( h+t ) = (1 + hλ i )e − λi ( h+t ) > 0 dt dt λ i λi dE (τ i (t )) d 1 1 = − ( + h)e − λi ( h+t ) = hλ ie − λi ( h+t ) ≥ 0 dh dh λ i λi Therefore, E (τ i (t )) is an increasing function of both t and h. For Var (τ i (t )), from Lemma 17.4, Var (τ i (t )) =
(
)
1 − h 2λ i2 + 2(hλ i + 1)λ it e − ( h+t ) λi − (1 + hλ i )2 e −2( h+t ) λi λ i2
The preceding equation can also be considered a function of λ i h and λ i t . Therefore, without loss of generality, we can set λ i = 1.
(
)
d 1 − (h 2 + 2(h + 1)t )e − ( h+t ) − (1 + h)2 e −2( h+t ) = dt e
−2 ( h + t )
(2(h + 1) + e 2
h+t
(17.20)
(h + 2(t − 1)(h + 1))) 2
First, we show that g1 ( h ) ≡ 2 (1 + h )2 + e2 h (3 h 2 − 2 ) ≥ 0. This can be seen by repeatedly calculating the derivatives of g as follows: g1 (0 ) = 0,
dg1 d 2 g1 d 3 g1 (0 ) = 0, ( 0 ) > 0 , ( h ) = 4e2 h (5 + 18 h + 6 h 2 ) > 0 dh dh 2 dh 3
Also, we have d (2 ( h + 1)2 + e h +t ( h 2 + 2 (t − 1)( h + 1))) = e h +t ( h 2 + 2 t + 2 ht ) ≥ 0, ∀h (17.21) dt When t = h, 2 ( h + 1)2 + e h +t ( h 2 + 2 (t − 1)( h + 1)) = g1 ( h ) ≥ 0
(17.22)
2151_book.fm Page 438 Friday, October 20, 2006 5:04 PM
438
Stream of Variation Modeling and Analysis for MMPs
From Equation 17.21 and Equation 17.22,

\[
2(h+1)^2 + e^{h+t}\left(h^2 + 2(t-1)(h+1)\right) \ge 0, \quad \forall t \ge h
\]

The preceding expression can equal zero only if t = h. It follows from Equation 17.20 that $\mathrm{Var}(\tau_i(t))$ is strictly increasing in t when t ≥ h. To show that $\mathrm{Var}(\tau_i(t))$ is an increasing function of h, we have

\[
\frac{d}{dh}\left(1 - (h^2 + 2(h+1)t)\,e^{-(h+t)} - (1+h)^2 e^{-2(h+t)}\right) = e^{-2(h+t)}\,h\left(2(h+1) + e^{h+t}(-2 + h + 2t)\right) \tag{17.23}
\]

First we show that $g_2(h) \equiv 2 + 2h + e^{h}(h - 2) \ge 0$. This can be seen by repeatedly calculating its derivatives as follows:

\[
g_2(0) = 0, \quad \frac{d}{dh}g_2(0) > 0, \quad \frac{d^2}{dh^2}g_2(h) = e^{h}h \ge 0
\]

Also we have

\[
\frac{d}{dt}\left(2(h+1) + e^{h+t}(-2 + h + 2t)\right) = e^{h+t}(h + 2t) \ge 0, \quad \forall h \tag{17.24}
\]

When t = 0,

\[
2(h+1) + e^{h+t}(-2 + h + 2t) = g_2(h) \ge 0, \quad \forall h \tag{17.25}
\]

From Equation 17.24 and Equation 17.25, $2(h+1) + e^{h+t}(-2 + h + 2t) \ge 0$, ∀t, ∀h. Further, from Equation 17.23, $\mathrm{Var}(\tau_i(t))$ is an increasing function of h.
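As a numerical sanity check on the algebra above (it is not part of the original proof), the following Python sketch evaluates the normalized variance with $\lambda_i = 1$ and compares central finite differences against the closed forms of Equation 17.20 and Equation 17.23. The function names, the grid of (h, t) points with t ≥ h, and the step size are illustrative assumptions.

```python
import math

def var_tau(h, t):
    """Normalized Var(tau_i(t)) with lambda_i = 1 (the expression differentiated in Eq. 17.20)."""
    return (1.0
            - (h ** 2 + 2.0 * (h + 1.0) * t) * math.exp(-(h + t))
            - (1.0 + h) ** 2 * math.exp(-2.0 * (h + t)))

def dvar_dt(h, t):
    """Right-hand side of Equation 17.20."""
    return math.exp(-2.0 * (h + t)) * (
        2.0 * (h + 1.0) ** 2
        + math.exp(h + t) * (h ** 2 + 2.0 * (t - 1.0) * (h + 1.0)))

def dvar_dh(h, t):
    """Right-hand side of Equation 17.23."""
    return math.exp(-2.0 * (h + t)) * h * (
        2.0 * (h + 1.0) + math.exp(h + t) * (-2.0 + h + 2.0 * t))

def central_diff(f, x, eps=1e-6):
    """Second-order central finite difference of a scalar function."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def max_derivative_error(points):
    """Largest gap between the finite differences and the closed forms."""
    worst = 0.0
    for h, t in points:
        num_dt = central_diff(lambda s: var_tau(h, s), t)
        num_dh = central_diff(lambda s: var_tau(s, t), h)
        worst = max(worst,
                    abs(num_dt - dvar_dt(h, t)),
                    abs(num_dh - dvar_dh(h, t)))
    return worst

# Illustrative grid: h in [0, 2], t = h + offset with offset in [0, 4], so t >= h.
GRID = [(0.25 * a, 0.25 * a + 0.5 * b) for a in range(9) for b in range(9)]

if __name__ == "__main__":
    print("max |finite difference - closed form| =", max_derivative_error(GRID))
    print("all dVar/dt >= 0:", all(dvar_dt(h, t) >= 0.0 for h, t in GRID))
    print("all dVar/dh >= 0:", all(dvar_dh(h, t) >= 0.0 for h, t in GRID))
```

A check of this kind only samples a finite grid, so it supports but does not replace the analytical argument.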
References

1. Berg, M. and Epstein, B., A modified block replacement policy, Naval Research Logistics Quarterly, 23(1), 15–24, 1976.
2. Jin, J. and Shi, J., Diagnostic feature extraction from stamping tonnage signals based on design of experiments, ASME Journal of Manufacturing Science and Engineering, 122, 360–369, 2000.
3. Ross, S.M., Stochastic Processes, John Wiley & Sons, New York, 1996.
4. Chen, Y. and Jin, J., Quality-oriented-maintenance for multiple interactive components, IEEE Transactions on Reliability, 55(1), 123–134, 2006.
18
Additional Topics on Stream of Variation
This chapter introduces some recent advances in stream of variation (SoV) modeling and analysis that are not addressed in earlier chapters. Four topics will be discussed in detail: SoV modeling for compliant parts, SoV modeling for serial-parallel configured MMPs, integration of the SoV model with process planning, and active control for variation reduction based on the SoV model in an MMP.
18.1 SoV MODELING FOR MULTISTATION ASSEMBLY PROCESS OF COMPLIANT PARTS

Compliant part assembly is a manufacturing process in which two or more nonrigid parts are connected with various joining techniques to form a subassembly or a final product. Because part deformation during assembly cannot be disregarded, understanding the dimensional variation propagation of this type of assembly process is especially challenging. In this section, we will introduce some recent advances in SoV modeling techniques for multistation compliant sheet metal assembly processes.
18.1.1 INTRODUCTION

SoV modeling for rigid parts was discussed in Chapter 6. In recent years, compliant sheet metal assembly has attracted great attention in automotive body design and manufacturing. For example, 37% of all assembly stations in automotive body structure manufacturing assemble nonrigid parts [1]. Takezawa [2] pointed out that the traditional additive theorem of variance used in rigid-part study was no longer valid for compliant sheet metal assemblies. To model the variation and resulting propagation of those compliant part assembly processes, the possible deformation of the parts during the assembly process should be considered, and the models should include a force analysis that takes into consideration the stiffness of each part and the forces applied by each tool.

Variation propagation modeling for compliant parts in a multistation assembly process introduces new challenges. In comparison to the station-level approach, it is necessary to define an appropriate variation representation to track the variation propagation from station to station. There are some issues that facilitate the application of SoV in such processes. The variation simulation process is sequential, i.e., to estimate the variation at station k, it is necessary to know the variation at station k – 1. Moreover, there is a station-to-station interaction introduced by the release of holding fixtures and the use of new fixtures in subsequent stations. However, compliant assembly variation analysis requires applying finite element methods to
calculate the deformation after assembly. Therefore, the computation involved increases with the number of stations. In this section, SoV concepts are extended to model the impact of part and tooling variation on dimensional quality in a multistation assembly process with compliant sheet metal parts, and to study how variation propagates from different subassemblies to the final product. Such a model is more realistic and can be quite useful for both the design and control of the manufacturing system. During the design stage, such models can be used to predict product variation so that improvements in parts or processes can be made early. During a production operation of the manufacturing system, such models can aid in the monitoring of the process changes and diagnosis of root causes of variation. The material in this subsection is mainly summarized from the original work by Camelio et al. [3].
18.1.2 SOV MODELING OF COMPLIANT PARTS IN AN MMP
The modeling methodology is applied to the study of variation propagation in a multistation assembly line, as shown in Figure 18.1. The state space representation for a discrete system requires independent modeling of each station. First, a state vector xk is defined to include every component on the final assembly. The state vector includes welding points, locating points, and measurement points for each of these components. In addition, for each station, it is necessary to define the locating matrix Mk using homogeneous transformation; the sensitivity matrix (springback matrix) Sk using finite element analysis (FEA) and the method of influence coefficients; and the deformation matrix Pk using FEA. Finally, by combining the initial part variation, fixture variation (N – 2 – 1 fixturing principles), and welding gun
FIGURE 18.1 Overview of SoV modeling for multistation assembly process of compliant parts. [Diagram: incoming parts variation x0 and tooling variation enter an assembly line of Station 1, …, Station k, …, Station N followed by a measurement station, with states x1, …, xk–1, xk between stations and the final assembly variation as output. For station k, the part and tooling configuration and the fixture configuration give the relocating matrix Mk, FEA gives Pk and Sk, and the state space model maps the estimated part variation at station k – 1 to the estimated part variation at station k.]
variation, it is possible to estimate the expected variation for the output subassembly at each station. Because the multistation assembly process is sequential, such a relationship can be described recursively. In other words, knowing the estimated part variation at station k, it is possible to estimate the part variation at station k + 1. In the following subsections, the derivation of the state space model at the single-station and multistation levels, taking part and tooling variation into consideration, is summarized.

18.1.2.1 Single Station Assembly Modeling

In this subsection, the variation modeling approach at a single-station level is based on the mechanistic simulation method developed by Liu and Hu [4]. This procedure assumes that (1) sheet deformation is within the linear elastic range, (2) the material is isotropic, (3) fixture and welding gun are rigid, (4) there is no thermal deformation, and (5) the stiffness matrix remains constant for nonnominal part shapes. Figure 18.2 illustrates the assembly process of compliant parts, which consists of four steps [4]:

S1. The parts are loaded and located in the station using a locating scheme (Figure 18.2a). A vector $V_u$ is used to represent the deviations from the nominal part shape. Index u refers to unwelded parts.

S2. Deviation ($V_u$) of part 1 is closed by a welding gun or a fixture applying a force $F_u$ (Figure 18.2b). Considering the part stiffness matrix $K_u$, the force required to close the gap due to $V_u$ is given by

\[
F_u = K_u \cdot V_u \tag{18.1}
\]
S3. The parts are joined together while the force $F_u$ is still being applied (Figure 18.2c).

FIGURE 18.2 Sheet metal assembly process: (a) part deviation from nominal design; (b) part clamped at nominal position; (c) welding; (d) clamp release and springback. [The panels are labeled with the clamping force Fu, the force Fw, and the deviations Vu and Vw.]
S4. The welding gun/fixture is removed (Figure 18.2d). After removing the forces applied by the clamping system, the new assembled structure will have spring-back. The spring-back of the assembly can be represented by the mechanistic deviation model as

\[
V_w = S \cdot V_u \tag{18.2}
\]
where S is the sensitivity matrix determined by the method of influence coefficients as presented in Liu and Hu [4]. Then, considering a linear relationship between the incoming parts deviation and the final assembly deviation for compliant parts at the station level, the sensitivity matrix for a specific station configuration can be obtained.

18.1.2.2 Multistation Assembly Modeling

Similar to Chapter 6, a state space representation can be used to model a multistation assembly process, where the dimensional deviation of assembly parts can be represented by

\[
x_k = A_{k-1} x_{k-1} + B_k u_k + w_k \tag{18.3}
\]

\[
y_k = C_k x_k + v_k \tag{18.4}
\]

All notation in Equation 18.3 and Equation 18.4 is defined in the same way as in Equation 6.39 and Equation 6.40 in Chapter 6. However, there are special interpretations of the state vector and state transition matrix for compliant part modeling.

18.1.2.2.1 State Vector $x_k$

The state vector includes the representation of all the parts/subassemblies at each station in the system. A single compliant sheet metal part requires more than two points to represent its state, which is different from a 2-D rigid body representation, where only two points are required. The number of points needed will depend on the complexity of the parts and the accuracy required. The relevant points required to represent a compliant part state are the part deviation on the welding positions or welding locating points (WLPs), the fixture points or principal locating points (PLPs), and any additional measurement points or measurement locating points (MLPs). For a compliant part j, at station k, the state vector can be defined as

\[
x_k^{j} = \left[\, X_{WLP_1} \;\cdots\; X_{WLP_s} \quad X_{PLP_1} \;\cdots\; X_{PLP_t} \quad X_{MLP_1} \;\cdots\; X_{MLP_l} \,\right]^T
\]

Here, assume that the total numbers of WLPs, PLPs, and MLPs on part j at station k are s, t, and l, respectively. Thus, for an assembly of n parts, the state vector at station k will be

\[
x_k = \left[\, x_k^{1} \quad x_k^{2} \quad \cdots \quad x_k^{n} \,\right]^T \tag{18.5}
\]
FIGURE 18.3 Relocation and assembly process. [Block diagram for station k: xk–1 → relocation (Mk) → x'k–1 → assembly process (Pk, Sk) → xk.]
18.1.2.2.2 State Transition Matrix $A_k$

The relationship between the input part deviations and the output subassembly deviations is depicted in Figure 18.3. First, incoming parts are relocated/reoriented in station k using a 3-2-1 fixture layout defined by the reorientation matrix $M_k$. Second, the part is deformed when the welding guns and additional clamps on the primary plane are closed, and the parts are welded to produce a subassembly defined by the deformation matrix $P_k$. Finally, the welding guns and fixtures are released, causing the spring-back, defined by the sensitivity matrix $S_k$. So, the state transition matrix $A_k$ with no tooling deviation is defined by

\[
A_k = f(M_k, P_k, S_k) \tag{18.6}
\]

In Equation 18.6, the reorientation matrix $M_k$ is defined as a block-diagonal matrix with one block per part (Part 1, Part 2, …, Part n), the block $\mathbf{M}^{i}$ for part i being assembled from the submatrices $M^{i}_{1}, M^{i}_{2}, \ldots, M^{i}_{m_i}$ and zero blocks:

\[
M_k =
\begin{bmatrix}
\mathbf{M}^{1} & 0 & \cdots & 0 \\
0 & \mathbf{M}^{2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \mathbf{M}^{n}
\end{bmatrix} \tag{18.7}
\]
where n is the number of parts in the assembly, or the number of elements in the state vector, and $m_i$ is the number of key points in part/subassembly i. The deformation and sensitivity matrices $P_k$ and $S_k$ can be defined using the method of influence coefficients introduced in Reference 4 and Reference 5:

\[
P_k = \mathrm{diag}\left[\, P_{Part_1} \quad P_{Part_2} \quad \cdots \quad P_{Part_n} \,\right] \tag{18.8}
\]

\[
S_k = \mathrm{diag}\left[\, S_{Subassembly_1} \quad S_{Subassembly_2} \quad \cdots \quad S_{Subassembly_n} \,\right] \tag{18.9}
\]

By incorporating the welding gun variation and fixture variation, the state space model is

\[
x'_{k-1} = x_{k-1} + M_k \cdot x_{k-1} - U^{t}_{3\text{-}2\text{-}1,\,k-1} \tag{18.10}
\]

\[
x_k = \left(S_k - P_k + I\right) \cdot x'_{k-1} - (S_k - P_k) \cdot \left[U^{g}_{k} + U^{t}_{(N-3),\,k}\right] + W_k \tag{18.11}
\]

where $U^{g}_{k}$, the welding gun deviation vector, has the form

\[
U^{g}_{k} = \Big[\, \underbrace{v^{g}_{11} \;\ldots\; v^{g}_{1p_1}}_{WLP_{part\,1}} \quad 0 \quad 0 \quad \underbrace{v^{g}_{31} \;\ldots\; v^{g}_{3p_3}}_{WLP_{part\,3}} \quad 0 \quad 0 \quad \ldots \quad \underbrace{v^{g}_{n1} \;\ldots\; v^{g}_{np_n}}_{WLP_{part\,n}} \quad 0 \quad 0 \,\Big]^T
\]

$U^{t}_{3\text{-}2\text{-}1,\,k-1}$ and $U^{t}_{(N-3),\,k}$ are the variation vectors of the (3-2-1) locating fixtures and the (N-3) additional holding fixtures, respectively. All the information required to create the state and input matrices may be obtained from the existing assembly line or from the design drawings of a new assembly line.
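The station-level update in Equation 18.10 and Equation 18.11 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names (`mat_vec`, `station_update`, and so on) are hypothetical, the matrices are small made-up arrays rather than outputs of homogeneous transformations or FEA, and the equations are implemented exactly as written above.

```python
def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix and a list vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def mat_sub(a, b):
    """Elementwise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def add_identity(a):
    """Return a + I for a square matrix a."""
    return [[x + (1.0 if i == j else 0.0) for j, x in enumerate(row)]
            for i, row in enumerate(a)]

def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

def vec_sub(a, b):
    return [x - y for x, y in zip(a, b)]

def station_update(x_prev, M, S, P, u_321, u_gun, u_extra, w):
    """Propagate part deviations through one station (Eq. 18.10 and Eq. 18.11)."""
    # Eq. 18.10: relocation on the 3-2-1 fixture, including its deviation u_321.
    x_reloc = vec_sub(vec_add(x_prev, mat_vec(M, x_prev)), u_321)
    # Eq. 18.11: deformation, spring-back, and tooling deviations.
    s_minus_p = mat_sub(S, P)
    part_term = mat_vec(add_identity(s_minus_p), x_reloc)
    tooling_term = mat_vec(s_minus_p, vec_add(u_gun, u_extra))
    return vec_add(vec_sub(part_term, tooling_term), w)

if __name__ == "__main__":
    # Hypothetical two-point example; all numbers are illustrative only.
    x0 = [1.0, 2.0]
    M = [[0.0, 0.0], [0.0, 0.0]]
    S = [[0.5, 0.0], [0.0, 0.5]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    x1 = station_update(x0, M, S, P, [0.0, 0.0], [0.2, 0.0], [0.0, 0.0], [0.0, 0.0])
    print(x1)
```

Calling `station_update` once per station, with the output of station k – 1 as the input of station k, gives the recursive propagation described in this subsection; in practice the vectors would have one entry per WLP, PLP, and MLP coordinate.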
18.1.3 ADDITIONAL COMMENTS

The SoV modeling technique for multistation assembly process of compliant sheet metal parts can be directly employed to predict the dimensional variation of the measurement points on the final product based on the given variation level of individual parts, tooling, and part material properties. Thus, it serves as an effective tool for design evaluation of both the product and the process. The case study conducted in Reference 3 demonstrates its capability in variation analysis through simulation. In addition, research has been conducted on process planning and sequence selection for compliant parts [6], and on sensitivity study and tolerance synthesis considering model uncertainties for compliant multistage assembly processes [5].
18.2 SoV MODELING FOR SERIAL-PARALLEL MULTISTAGE MANUFACTURING SYSTEMS

Manufacturing-system configuration and its impact on system performance are attractive problems for researchers and practitioners. Recent advances in system configuration design, especially the development of serial-parallel multistage manufacturing systems (SP-MMS), raise new challenges for quality control and improvement. In this section, we introduce some advancements in SoV modeling for SP-MMS, which are commonly used in industry [7].
18.2.1 INTRODUCTION

Traditional manufacturing systems for medium- and high-volume production are designed as serial lines. Each machine or station performs some of the operations needed to fabricate the part, and there is only one flow path. This type of system is cost-effective for medium- or high-volume production. Nowadays, the new era of global competition places great demands on manufacturers’ capabilities. Large fluctuations in production capacity and frequent change of product variety require a highly reliable manufacturing system. To meet these requirements, it is common for the manufacturers to adopt an MMP with multiple identical machines/stations utilized at each stage [8]. This type of system is called serial-parallel, or hybrid, MMP, as shown in Figure 18.4. The serial-parallel configuration of the system has profound impacts not only on productivity, adaptability to market demands, and reliability, but also on product quality [7].

FIGURE 18.4 An example of a serial-parallel multistage manufacturing process. [Diagram: Stage 1, Stage 2, …, Stage N, with parallel machines 1, 2, …, m1 at stage 1, 1, 2, …, m2 at stage 2, and 1, 2, …, mN at stage N.]

In Chapter 6 and Chapter 7, SoV modeling techniques for serial lines of assembly and machining processes were introduced. In this subsection, modeling techniques taking into consideration multiple SoVs in the SP-MMP will be discussed. When an SP-MMP is used to fabricate parts, different parts may go through different stations at each of the multiple stages that form different process routes. In other words, there will be multiple SoV in the process contributing to the final dimensional variation. The variation propagation of each individual potential process route can be modeled using the state space modeling technique introduced in the previous chapters. However, if those process routes merge at certain stages, routes
and their interactions need to be considered. This makes SoV modeling more complex, and the modeling techniques for serial systems are no longer adequate. Furthermore, multiple SoVs cause more difficulties in determining the sensing/gauging strategy as well as in identifying the root causes of product dimensional variation. Therefore, it is necessary to extend the previous methodologies to accommodate the quality control pressure exerted by the commonly adopted SP-MMP.

In this section, a generic system-level methodology will be studied to model and analyze multiple SoV in an SP-MMP. All the potential process routes and their interactions will be considered during modeling, and the model dimension reduction issue will also be addressed. The model is useful for variation propagation simulation and root cause diagnosis of the SP-MMP. The material in this subsection is mainly summarized from the original work by Huang and Shi [9].
18.2.2 SOV MODELING FOR SERIAL-PARALLEL MMPS
In an SP-MMP, parts being fabricated follow the same processing sequence, i.e., they sequentially go through every stage once. However, the process routes may vary from part to part. Figure 18.5a illustrates the process routes in an SP-MMP with a three-stage manufacturing process, where the circles represent machines/stations and the arrows represent the process routes. In such a system, a part could go through process route 1, as depicted in Figure 18.5b, i.e., through machine 1 at each stage. Or it could go through one of the other routes. In the following subsections, the derivation of the state space model for multiple SoV in the SP-MMP and model dimension reduction will be introduced.

18.2.2.1 State Space Modeling for Multiple SoVs in an SP-MMP

Assume the total number of process routes is R in an N-stage SP-MMP. Denote the part deviation after stage k through process route i by $x_k^{(i)}$, where the superscript (i) denotes process route i (i = 1, 2, …, R) and subscript k denotes stage k (k = 1, 2, …, N). Following the same definition convention as specified in Chapter 6 and Chapter 7, the state space model for the multiple SoV in the SP-MMP can be defined as:

\[
x_k^{(i)} = A_{k-1}^{(i)} x_{k-1}^{(i)} + B_k^{(i)} u_k^{(i)} + w_k^{(i)}, \quad k = 1, 2, \ldots, N;\; i = 1, 2, \ldots, R \tag{18.12}
\]

\[
y_k^{(i)} = C_k^{(i)} x_k^{(i)} + v_k^{(i)}, \quad \{k\} \subset \{1, 2, \ldots, N\} \tag{18.13}
\]

where each term has the same physical interpretation that was defined in Chapter 6 and Chapter 7. It should be noted that Equation 18.12 and Equation 18.13 are derived based on the following assumptions:

A1. All the parts going through the process routes are assumed to be from the same batch, so that the raw workpiece deviations $x_0^{(i)}$ follow the same distribution.
FIGURE 18.5 Process routes in an SP-MMP: (a) an SP-MMP; (b) portion of process routes. [Diagram: a three-stage process with n1 = 3, n2 = 2, and n3 = 3 machines; panel (b) lists routes 1 to 6 from x0 through the states after stages 1, 2, and 3. $x_k^{(i)}$ denotes the part features X after stage k in route i.]
A2. Different process routes are expected to perform the same fixturing and cutting operations at stage k. Therefore, the system matrices $A_{k-1}^{(i)}$, $B_k^{(i)}$, and $C_k^{(i)}$ defined by design are the same for all i, and the superscript will be dropped hereafter.

A3. If routes i and j merge at stage k, the input random vectors $u_k$ for those two routes at that stage are assumed to be the same, i.e., $u_k^{(i)} = u_k^{(j)}$.

A4. The error terms $v_k^{(i)}$ are assumed to be the same for every route. The superscript will be dropped too.

Let

\[
u_\bullet^{(i)} = \left[\, u_1^{(i)T} \quad u_2^{(i)T} \quad \cdots \quad u_N^{(i)T} \,\right]^T
\]

be the process deviations from operations 1 to N of route i,

\[
y_\bullet^{(i)} = \left[\, y_1^{(i)T} \quad y_2^{(i)T} \quad \cdots \quad y_N^{(i)T} \,\right]^T
\]

be the deviations of all measured characteristics in route i, and

\[
v_\bullet = \left[\, v_1^{T} \quad v_2^{T} \quad \cdots \quad v_N^{T} \,\right]^T
\]

be the measurement error in route i. By following the procedure in Chapter 11, we have

\[
y_\bullet^{(i)} = \Gamma_\bullet u_\bullet^{(i)} + \Gamma_0 x_0 + v_\bullet \tag{18.14}
\]

where $\Gamma_\bullet$ and $\Gamma_0$ have the same forms as those defined in Chapter 11. Generally, the observed part deviations in an SP-MMP with R routes can be modeled as:
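\[
y = \Gamma u + \Gamma_0 x_0 + v \tag{18.15}
\]

where

\[
y = \left[\, y_\bullet^{(1)T} \quad y_\bullet^{(2)T} \quad \cdots \quad y_\bullet^{(R)T} \,\right]^T, \qquad
\Gamma = \mathrm{diag}(\Gamma_\bullet, \Gamma_\bullet, \ldots, \Gamma_\bullet),
\]

\[
u = \left[\, u_\bullet^{(1)T} \quad u_\bullet^{(2)T} \quad \cdots \quad u_\bullet^{(R)T} \,\right]^T, \qquad
\Gamma_0 = \left[\, \Gamma_{0\bullet}^{T} \quad \Gamma_{0\bullet}^{T} \quad \cdots \quad \Gamma_{0\bullet}^{T} \,\right]^T, \quad \text{and} \quad
v = \left[\, v_\bullet^{T} \quad v_\bullet^{T} \quad \cdots \quad v_\bullet^{T} \,\right]^T.
\]

Here $\Gamma_{0\bullet}$ denotes the single-route matrix that multiplies $x_0$ in Equation 18.14, stacked R times.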
18.2.2.2 Model Dimension Reduction

Equation 18.12 to Equation 18.15 show that the dimension of this state space model for the multiple SoVs is R times as large as the one defined in Chapter 6, Chapter 7, and Chapter 11. If $n_k$ denotes the number of machine tools at stage k, the total possible number of routes R will theoretically equal

\[
R = \prod_{k=1}^{N} n_k .
\]

This means that the model dimension will dramatically increase with R, which will increase the measurement cost and reduce the diagnosability. Three dimension reduction techniques are proposed:

M1. u reduction or Γ column reduction: If routes i and j (i < j) merge at stage k, then $u_k^{(i)} = u_k^{(j)}$ (by A3). Those two identical vectors, i.e., $u_k^{(i)}$ and $u_k^{(j)}$, can be merged into one subvector in u of the model of Equation 18.15, i.e., retaining only $u_k^{(i)}$. Correspondingly, let $(\Gamma)_k^{(i)}$ be the block matrix in Γ corresponding to $u_k^{(i)}$. The reduced Γ, denoted by $\Gamma_u$, is obtained by replacing $(\Gamma)_k^{(i)}$ with $(\Gamma)_k^{(i)} + (\Gamma)_k^{(j)}$ and deleting $(\Gamma)_k^{(j)}$ in the original Γ.

M2. y reduction or Γ row reduction: If routes i and j (i < j) merge together through stage M, i.e., $u_k^{(i)} = u_k^{(j)}$ for k = 1, 2, …, M, then $y_k^{(i)}$ and $y_k^{(j)}$ are identical random variables for k = 1, 2, …, M. The dimension of y can be reduced by eliminating all $y_k^{(j)}$’s. The dimension of Γ is reduced by eliminating the corresponding rows.

M3. y reduction due to common datum scheme: In a machining system, there is another opportunity to perform y reduction. Suppose the features machined at stage k are used as datum in the process segment composed of stages k + 1 to k + S. Because $u_k^{(i)} = u_k^{(j)}$, $u_{k+S}^{(i)} = u_{k+S}^{(j)}$, and there is no
datum change, $y_{k+S}^{(i)}$ and $y_{k+S}^{(j)}$ are identical random vectors. The dimensions of u and y can be reduced accordingly.

Huang and Shi [9] proved that the aforementioned dimension reduction procedures retain the same information about the process variation propagation, and will not decrease the system diagnosability.
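The route count and the M1 (Γ column) reduction can be illustrated with a short Python sketch. It rests on two assumptions that are not stated in this form in the text: "routes i and j merge at stage k" is read as "both routes use the same machine at stage k", so each input $u_k^{(i)}$ is labeled by a (stage, machine) pair, and each $u_k^{(i)}$ is taken to be a scalar so that Γ blocks are single columns. All function names are hypothetical.

```python
from itertools import product

def enumerate_routes(machines_per_stage):
    """All theoretically possible routes: one machine index per stage."""
    return list(product(*(range(1, n + 1) for n in machines_per_stage)))

def input_labels(routes):
    """Label every u_k^(i) by (stage, machine), in the stacking order of u."""
    return [(k + 1, machine) for route in routes for k, machine in enumerate(route)]

def merge_columns(gamma, labels):
    """M1 sketch: sum the columns of gamma that share a label, keeping first-seen order."""
    kept = []
    for label in labels:
        if label not in kept:
            kept.append(label)
    reduced = [[sum(row[c] for c, label in enumerate(labels) if label == keep)
                for keep in kept]
               for row in gamma]
    return reduced, kept

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

if __name__ == "__main__":
    # Three stages with n1 = 3, n2 = 2, n3 = 3 machines, as in Figure 18.5a.
    routes = enumerate_routes((3, 2, 3))
    labels = input_labels(routes)
    print(len(routes), "routes,", len(labels), "input subvectors before M1,",
          len(set(labels)), "after M1")

    # Two routes of a two-stage process that share machine 1 at stage 1.
    small_labels = input_labels([(1, 1), (1, 2)])
    gamma = [[1.0, 2.0, 0.0, 0.0],   # Gamma = diag(Gamma_dot, Gamma_dot), Gamma_dot = [1 2]
             [0.0, 0.0, 1.0, 2.0]]
    gamma_u, kept = merge_columns(gamma, small_labels)
    print(gamma_u, kept)
```

In the two-route example the stacked input has four entries but only three distinct ones, and multiplying the reduced matrix by the three distinct inputs gives the same y as the original model, which is the sense in which M1 keeps the information content unchanged.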
18.2.3 ADDITIONAL COMMENTS

A major challenge of SoV research in the SP-MMP is the increased model dimension due to the multiple variation streams. The large model dimension is usually associated with a correspondingly large number of variation sources to be monitored, controlled, and diagnosed. It also poses a great challenge for devising an effective measurement strategy providing sufficient information in support of decision making. In addition to the model dimension reduction techniques presented in this subsection, recent studies have shown the advantages of considering “error-equivalent” phenomena in SoV modeling, measurement reduction, root cause diagnosis, and feedback adjustment [10–12]. Because different error sources may result in identical dimensional variation patterns, those equivalent error sources can be transformed into one common base error. This will lead to a significant reduction of model dimension for the SP-MMP. Another potential research topic is to apply dynamic programming and other optimization techniques to analyze the multiple variation paths for cost-effective decision making in the SP-MMP.
18.3 SoV-BASED QUALITY-ENSURED SETUP PLANNING

The SoV models can be applied to process variation source diagnosis, process design evaluation, and quality-reliability integration, as demonstrated in Chapter 9 through Chapter 17. All these applications are based on a well-defined process plan. In this section, we will briefly introduce the methodology of incorporating the SoV concepts and modeling techniques in process planning, especially in the setup planning stage, which essentially affects the outcome product quality of the manufacturing process.
18.3.1 INTRODUCTION

The purpose of process planning is to determine the steps by which a product can be manufactured economically and competitively. It is a key element that bridges the activities between design and manufacturing. Setup planning constitutes a critical component that connects conceptual process planning and detailed process planning. A setup plan defines a series of datum/fixturing schemes for an MMP [13]. Product quality is one of the main concerns of setup planning. A well-defined setup plan should be able to satisfy quality specifications under normal manufacturing conditions. Although some research has been reported to deliver quality-assured setup plans, their capabilities of evaluating the quality impact of candidate setup plans, especially those for complex MMPs, are restricted by their qualitative nature. In other words, there is a lack of quantitative methods to mathematically represent
the SoV of corresponding setup plan alternatives. Cost-effectiveness, in terms of the cost related to process accuracy, is another critical concern in setup planning. However, to ensure product quality, the current experience-based approach tends to be very conservative by selecting unnecessarily accurate fixtures and thus causing unnecessary cost. This is especially true for the upstream stages of an MMP because of the lack of a variation propagation evaluation method. The development of SoV concepts and tools overcomes those limitations and creates potential for developing optimal setup plans.

In this subsection, a systematic approach is proposed. Its objective is to fill the gap in process planning by conducting cost-effective, quality-ensured setup planning. SoV concepts and modeling techniques will be actively incorporated in the planning procedure. The material is mainly summarized from the original work of Liu et al. [14].
18.3.2 QUALITY-ENSURED SETUP PLANNING METHODOLOGIES

The proposed methodology is illustrated in Figure 18.6. Candidate setup plans and their corresponding SoV are quantitatively modeled based on input information and SoV concepts. Their cost-effectiveness is evaluated with the defined criterion, based on which setup planning is formulated as an optimization problem. In the following subsections, the extended state space modeling for setup planning and optimization formulation is summarized.

FIGURE 18.6 Overview of the SoV-based, quality-ensured setup planning.

18.3.2.1 Variation Propagation Modeling for Setup Planning

Adopting the same assumptions and notation conventions defined in previous chapters, the linear state space model for setup planning can be constructed as

\[
\begin{aligned}
x_k^{d_k} &= A_{k-1}^{d_k} x_{k-1}^{d_{k-1}} + B_k^{d_k} u_k^{d_k} + w_k \\
y_k^{d_k} &= C_k^{d_k} x_k^{d_k} + v_k, \quad k = 1, 2, \ldots, N
\end{aligned} \tag{18.16}
\]
where superscript $d_k$ represents the datum scheme of a candidate setup option at stage k; $d_k \in \Theta_k$, where $\Theta_k$ is the set of potential datum schemes. By ignoring the original parts deviation $x_0$, the linear input-output model that associates quality deviation with process deviation can be derived as:

\[
y_k^{d_k} = \sum_{i=1}^{k} C_k^{d_k} \Phi_{k,i}^{(\bullet)} B_i^{d_i} u_i^{d_i} + \sum_{i=1}^{k} C_k^{d_k} \Phi_{k,i}^{(\bullet)} w_i + v_k \tag{18.17}
\]

where $\Phi_{k,i}^{(\bullet)}$ is the state transition matrix that records the datum schemes transformation from stage i to stage k. Given the datum scheme $d_k$ and the decisions on datum schemes for upstream stages $\{d_1\; d_2\; \ldots\; d_{k-1}\}$, the coefficient matrices $A_k^{d_k}$, $B_k^{d_k}$, $C_k^{d_k}$, and $\Phi_{k,i}^{(\bullet)}$ (i = 1, 2, …, k) can be derived following the procedure presented in Chapter 6 and Chapter 7.

18.3.2.2 Optimization Formulation

The objective of setup planning is to minimize the cost related to process accuracy with constraints on the key product characteristic (KPC) quality. The mathematical representation is defined as:

\[
\min\{C_u(u)\} \quad \text{s.t.} \quad \sigma_{y_i} \le \frac{USL_i - LSL_i}{\tau_i}, \quad i = 1, 2, \ldots, M \tag{18.18}
\]

where

\[
C_u(u) = \sum_{j=1}^{P} \frac{w_j}{\eta_j \cdot \sigma_{u_j}} \tag{18.19}
\]
$USL_i$ and $LSL_i$ are the predefined upper specification limit and lower specification limit of KPC $y_i$, respectively. $\sigma_{y_i}$ is the standard deviation of KPC $y_i$ and $\tau_i$ is a constant. $\eta_j \cdot \sigma_{u_j}$ represents the tolerance of process variable $u_j$, e.g., position of a locator, j = 1, 2, …, P. The $\eta_j$’s and $w_j$’s are constants. The objective function can be interpreted as a weighted sum of the reciprocals of the tolerances of all the process variables, so tighter tolerances cost more.

Because setup planning is a sequential decision-making procedure, the decisions on the datum scheme selection for stage k are affected by those for the upstream stages and will affect those for the downstream stages. This matches the structure of dynamic programming (DP). DP-stage $(Q_k, x_k)$ is employed to represent decision making on datum scheme selection $d_k$ for stage k, where $Q_k$ is defined as the in-process quality specification for the features generated from stage 1 to
stage k. Selecting datum scheme $d_k$ incurs cost $V_k(u_k, d_k)$ and implements transition from DP-stage $(Q_{k-1}, x_{k-1})$ to DP-stage $(Q_k, x_k)$ through state transition $t((Q_k, x_k), d_k, d_{k-1})$, where

\[
V_k(u_k, d_k) = C_{u_k^{d_k}}\!\left(u_k^{d_k}\right) = \sum_{j=1}^{p_k} \frac{w_j}{\eta_j \cdot \sigma_{u_{k,j}^{d_k}}} \tag{18.20}
\]

and $t((Q_k, x_k), d_k, d_{k-1})$ has the explicit form defined in the first equation in the model of Equation 18.16. Then, let $L(Q_k, x_k)$ be the minimum process accuracy cost consumed from stage 1 to stage k by selecting datum schemes $d_1, d_2, \ldots, d_k$ and generating quality variation at most $Q_k$. The optimization formulation of Equation 18.18 can be replaced by a DP function defined as:

\[
L(Q_k, x_k) =
\begin{cases}
\displaystyle \min_{\substack{d_k \in \Theta_k,\; d_{k-1} \in \Theta_{k-1} \\ q_k(u_k, d_k) \le Q_k}} \Big\{ L\big(Q_k - q_k(u_k, d_k),\; t((Q_k, x_k), d_k, d_{k-1})\big) + V_k(u_k, d_k) \Big\}, & \text{for } k = 1, \ldots, N \\[2ex]
0, & \text{for } k = 0
\end{cases} \tag{18.21}
\]

where $q_k(u_k, d_k)$ is the maximum KPC’s variations allowed after the fabrication is performed from stage 1 to stage k. This DP function can be solved using a reaching algorithm [14]. The final results include: (1) the minimum total cost related to process accuracy, $L(Q_N, x_N)$; (2) a sequence of decisions $(d_1^{*}\; d_2^{*}\; \ldots\; d_N^{*})$ on datum schemes selected for a sequence of stages, which is the optimal setup plan; and (3) the tolerance specifications in terms of $\eta_j \cdot \sigma_{u_j}$’s, j = 1, 2, …, P.
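To make the objective of Equation 18.19 and the constraint of Equation 18.18 concrete, the following Python sketch evaluates both for a few candidate sets of process standard deviations. It is only an illustration, not the reaching algorithm of Reference 14: the function names are hypothetical, the numbers are invented, and the KPC standard deviations are computed from a linear model y = Γu under the added assumption that the process variables are independent and that the noise terms are negligible.

```python
import math

def accuracy_cost(weights, etas, sigma_u):
    """Equation 18.19: sum over j of w_j / (eta_j * sigma_uj)."""
    return sum(w / (eta * s) for w, eta, s in zip(weights, etas, sigma_u))

def kpc_sigmas(gamma, sigma_u):
    """Standard deviation of each KPC for y = gamma u.

    ASSUMPTION: independent process variables, noise terms ignored.
    """
    return [math.sqrt(sum((g * s) ** 2 for g, s in zip(row, sigma_u)))
            for row in gamma]

def is_feasible(gamma, sigma_u, usl, lsl, tau):
    """Constraint of Equation 18.18: sigma_yi <= (USL_i - LSL_i) / tau_i for all i."""
    return all(sy <= (hi - lo) / t
               for sy, hi, lo, t in zip(kpc_sigmas(gamma, sigma_u), usl, lsl, tau))

# Hypothetical data: two KPCs, two process variables.
GAMMA = [[1.0, 0.5],
         [0.0, 2.0]]
WEIGHTS = [1.0, 1.0]
ETAS = [6.0, 6.0]
USL = [0.1, 0.1]
LSL = [-0.1, -0.1]
TAU = [6.0, 6.0]

CANDIDATES = {
    "tight": [0.01, 0.01],
    "intermediate": [0.03, 0.015],
    "loose": [0.03, 0.02],
}

if __name__ == "__main__":
    for name, sigma_u in CANDIDATES.items():
        print(name,
              "cost = %.2f" % accuracy_cost(WEIGHTS, ETAS, sigma_u),
              "feasible =", is_feasible(GAMMA, sigma_u, USL, LSL, TAU))
```

With these invented numbers the loosest candidate violates the second KPC constraint, while the intermediate candidate is feasible and costs less than the tight one, which is the trade-off the optimization is meant to resolve.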
18.3.3 ADDITIONAL COMMENTS

The proposed approach creates the potential for addressing other issues in process planning. The interaction between setup planning and fixture design is one of them. The SoV-based optimal setup-planning method takes the fixture layout as given, and defines the tolerance specifications for the fixture design to ensure the product quality. However, because the coefficient matrices $A_k^{d_k}$, $B_k^{d_k}$, and $C_k^{d_k}$ in the state space model are derived based on fixture layout information, the setup planning is actually constrained by the fixture to be applied. In other words, the fixture accuracy specifications are defined as the results of the given fixture layout. On the other hand, fixture layout should be designed considering fixture accuracy. To deal with this chicken-and-egg dilemma, a methodology that completely integrates setup planning and fixture configuration design is required.
18.4 ACTIVE CONTROL FOR VARIATION REDUCTION OF MULTISTAGE MANUFACTURING PROCESSES

The methodologies presented in Chapter 9 to Chapter 11 addressed issues of SoV-based monitoring and diagnosis for the MMP, which is a reactive strategy based on detecting process changes and finding their root causes. This section will discuss some concepts and new achievements on active control for variation reduction in an MMP, which aims at automatically compensating disturbances as soon as they occur.
18.4.1 INTRODUCTION

Active control here refers to an automatic adjustment of the tool (locators, cutting tools, etc.) settings that minimizes the process variability, based on the in-line measurements. There are two basic approaches in controlling deviations of an MMP: feedback and feed-forward control. Feedback control makes use of information from downstream measurements (usually at the end of the process station, and from intermediate measurement stations, if they exist) to perform control actions. Feed-forward control requires having the measurements prior to the control stations to determine the control actions. Figure 18.7 presents schematics of both strategies.

In a discrete-part manufacturing process, a process change can be regarded as a mean deviation (or shift) and a variance change. The sources of variation are typically independent in an MMP. In this situation, feedback control is only effective with respect to the mean shift (or drifting) disturbances, but not with respect to variability change. Therefore, feed-forward control is preferable for variation reduction in MMPs.
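The difference between the two strategies under independent part-to-part disturbances can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions, not a method from the text: a single scalar quality feature, disturbances drawn independently from a normal distribution with a mean shift, a full-gain feedback rule that corrects the tool setting by the deviation measured on the previous part, and a feed-forward rule that measures each incoming part (with some sensor noise) before compensating. All names and numbers are hypothetical.

```python
import random
import statistics

def simulate(n_parts=5000, shift=0.5, sigma=1.0, sensor_sigma=0.1, seed=0):
    """Return {strategy: (mean, variance)} of the output deviation."""
    rng = random.Random(seed)
    disturbance = [rng.gauss(shift, sigma) for _ in range(n_parts)]
    sensor_error = [rng.gauss(0.0, sensor_sigma) for _ in range(n_parts)]

    # No control: the output deviation equals the disturbance.
    uncontrolled = list(disturbance)

    # Feedback: the setting is corrected by the deviation measured downstream
    # on the part just produced, so it only helps the next part.
    feedback = []
    setting = 0.0
    for d in disturbance:
        y = d + setting
        feedback.append(y)
        setting -= y

    # Feed-forward: each incoming part is measured (with sensor error) before
    # the station and compensated individually.
    feed_forward = [d - (d + e) for d, e in zip(disturbance, sensor_error)]

    outputs = {"none": uncontrolled, "feedback": feedback, "feed_forward": feed_forward}
    return {name: (statistics.mean(ys), statistics.pvariance(ys))
            for name, ys in outputs.items()}

if __name__ == "__main__":
    for name, (mean, var) in simulate().items():
        print("%-12s mean = %+.3f  variance = %.3f" % (name, mean, var))
```

In this toy model the feedback rule removes the mean shift but leaves each output equal to the difference of two consecutive independent disturbances, so its variance is about twice the uncontrolled variance, whereas the feed-forward rule leaves only the sensor error. This is consistent with the statement above that feedback addresses mean shifts but not variability when the variation sources are independent.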
FIGURE 18.7 Schematics of the control strategies: (a) feedback strategy; (b) feed-forward strategy. Each panel shows Station 1, Station 2, ..., Station k, Station k+1, ..., Station N and a measurement station.
Feed-forward control requires the use of distributed sensors and process information to determine the deviations of the incoming parts/components for calculating the control actions. In general, the control actions are obtained on a part-by-part basis so as to minimize the deviation of each part and thus the overall process variability.
Most active control efforts for discrete-part manufacturing have focused on control at a single station. In assembly, one of the first attempts to perform station-level corrections was made by Svensson [15], who used a vision system to help a robot arm accurately position doors and windshields in car assembly. In industry, Nissan developed the intelligent body assembly system (IBAS), which was capable of measuring and adjusting the positions of the body panels [16]. Later, Pasek and Ulsoy [17] developed a reconfigurable part locator system, or programmable tooling, based on a Stewart platform to correct part positions during the assembly process. With the same objective in mind, Khorzard et al. [18] developed a systematic approach to improve the fitting of parts during assembly (optimal panel fitting). Using optimization techniques, they were able to determine the optimal position of the parts. The optimization was necessary because the part-fitting problem usually involves several KPCs that cannot be improved simultaneously. In machining, Wang and Huang [19] used control to compensate for both static errors and quasi-static errors caused by thermal effects in machine tools.
Active control at a single station is effective if that station is the last stage of an MMP, or if the key quality features compensated at the current station will not be affected by downstream stations. However, its effectiveness is limited if the quality features propagate to the downstream stations: the optimal compensation at the current station may then not be optimal for minimizing the variation of the final product at the end of the MMP. In this situation, the SoV model should be adopted to predict and analyze the variation and its propagation for optimal control purposes. The ultimate goal of active dimensional control in manufacturing systems is to improve the quality of the final product systematically and cost-effectively by combining the SoV model with control theory.
18.4.2 ACTIVE CONTROL FOR VARIATION REDUCTION OF MMP
Efforts have been made to develop active control methods for MMPs. Mantripragada and Whitney [20] proposed to use optimal control theory to perform corrections during the assembly of an automobile structure. After measuring the parts before assembling them, they were able to calculate the control actions or corrections that minimize the final deviations, considering that the parts were the only source of variation in the process and the measurements did not include noise. Recently, Djurdjanovic and Zhu [21] proposed to control the position of the fixtures and tool path to improve the final product quality in an MMP. Focusing on assembly, Izquierdo [22] proposed to use feed-forward control by estimating the deviations of the incoming part at each station, and then determining the control actions that minimized the deviations of the final product at the end of the process.
An example is presented here to illustrate the formulation for determining the control actions $u_k^C$ that reduce the expected final product KPC variation at station N given the information available up to station k (i.e., $\hat{y}_{N/k}$ in Equation 18.22). It is important to note that the control vector $u_k^C$ contains the control actions for station k as well as those for the downstream stations in the MMP. The control problem is formulated as a constrained optimization:
$$J = \min_{u_k^C} \; \hat{y}_{N/k}^{T} \, Q_k \, \hat{y}_{N/k} \quad \text{s.t.} \quad g(u_k^C) \le 0 \tag{18.22}$$
where J is the objective function and $\hat{y}_{N/k}$ is the prediction, made at station k, of the quality deviation at the final station N. The constraints $g(u_k^C) \le 0$ account for restrictions such as actuator resolution, workspace limitations, and interference with other equipment. The weighting matrix $Q_k$ is usually diagonal and accounts for the differences in importance among the KPCs. To determine the control actions, it is necessary to estimate the deviations at the current station and then predict the final product deviations. This estimation requires the SoV model and in-line measurements. Many techniques are available for the estimation and prediction, including the ones discussed in Chapter 11. Here, we introduce an estimation method and the associated control strategies that were discussed in Reference 23.
1. Estimation of deviations: Suppose that the positions of the parts are measured before the control is applied (as presented in Figure 18.8). The deviations of the parts in the station can then be estimated from those measurements:
$$\hat{x}_k^B = C_k^{+} \, y_k^B \tag{18.23}$$
where $y_k^B$ is the vector of quality feature measurements before the control action, $C_k^{+}$ is the pseudo-inverse of the observation matrix, and $\hat{x}_k^B$ is the state vector estimate at station k.
2. Control law: If there are no constraints in Equation 18.22, the optimization problem has a closed-form solution:
$$u_k^C = -K_k \, \hat{x}_k^B \tag{18.24}$$
where $K_k$ is the control gain matrix at station k, which depends on the product-process information, design requirements, sensor characteristics, and KPC locations. One advantage of this solution is its simplicity: the control action is proportional to the estimated deviation, so it can be implemented easily in real-time control applications. In addition, it accounts for the variation and its propagation, and thus minimizes the deviations of the final product in an MMP.
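One way to see how a gain of the form in Equation 18.24 can arise is sketched below. The derivation assumes, purely for illustration, that the SoV model gives a final-station prediction that is linear in the estimated state and in the control vector. The matrices $M_k$ and $G_k$ are hypothetical placeholders that do not appear in the text, and the derivation in Reference 23 may differ.

```latex
\begin{align*}
% Assumed linear prediction; M_k and G_k are illustrative, not from the text
\hat{y}_{N/k} &= M_k \, \hat{x}_k^B + G_k \, u_k^C \\
% Objective of Equation 18.22 with the constraints dropped
J &= \left(M_k \hat{x}_k^B + G_k u_k^C\right)^{T} Q_k
     \left(M_k \hat{x}_k^B + G_k u_k^C\right) \\
% Stationarity condition (Q_k symmetric)
\frac{\partial J}{\partial u_k^C} &= 2 \, G_k^{T} Q_k
     \left(M_k \hat{x}_k^B + G_k u_k^C\right) = 0 \\
% Solution when G_k^T Q_k G_k is invertible
u_k^C &= -\left(G_k^{T} Q_k G_k\right)^{-1} G_k^{T} Q_k M_k \, \hat{x}_k^B
       = -K_k \, \hat{x}_k^B
\end{align*}
```

Under these assumptions the gain is $K_k = (G_k^{T} Q_k G_k)^{-1} G_k^{T} Q_k M_k$, a weighted least-squares solution, which is consistent with the statement that the control action is proportional to the estimated deviation.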
FIGURE 18.8 Feed-forward control strategies.
If there are constraints in Equation 18.22, the control action has to be obtained by solving the optimization problem as discussed in Reference 23. A feed-forward control strategy is able to reduce the part-to-part deviation and, with the help of the SoV model, takes variation propagation into account. However, it cannot compensate for disturbances that enter between the control station and the final station, so additional uncertainty can still be introduced into the system. A better strategy would be a hybrid control system that combines feedback and feed-forward control to minimize the variation in an MMP.
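For the constrained case, a minimal numerical sketch is given below. It is not the method of Reference 23. It assumes a linear prediction of the final deviation, a diagonal weighting matrix, and simple actuator travel limits |u_j| ≤ u_max standing in for g(u) ≤ 0, and it uses projected gradient descent as one possible solver. The names `y0` (predicted final deviation with no control), `G` (effect of each control action on the final KPCs), and `q` (diagonal weights) are hypothetical.

```python
def cost(y0, G, q, u):
    """Weighted squared final deviation: sum_i q[i] * (y0[i] + (G u)[i])**2."""
    total = 0.0
    for i, row in enumerate(G):
        y_i = y0[i] + sum(g * u_j for g, u_j in zip(row, u))
        total += q[i] * y_i * y_i
    return total


def solve_box_constrained(y0, G, q, u_max, iters=2000):
    """Minimise cost(y0, G, q, u) subject to |u_j| <= u_max for every j."""
    m, n = len(G), len(G[0])
    # H = G^T Q G and b = G^T Q y0, so the gradient of the cost is 2 (H u + b).
    H = [[sum(q[i] * G[i][a] * G[i][c] for i in range(m)) for c in range(n)]
         for a in range(n)]
    b = [sum(q[i] * G[i][a] * y0[i] for i in range(m)) for a in range(n)]
    # trace(H) is an upper bound on the largest eigenvalue of H (H is PSD),
    # so this step size does not exceed 1 / (Lipschitz constant of the gradient).
    step = 1.0 / (2.0 * sum(H[a][a] for a in range(n)))
    u = [0.0] * n
    for _ in range(iters):
        grad = [2.0 * (sum(H[a][c] * u[c] for c in range(n)) + b[a])
                for a in range(n)]
        # Gradient step followed by projection onto the actuator limits.
        u = [min(u_max, max(-u_max, u[a] - step * grad[a])) for a in range(n)]
    return u


if __name__ == "__main__":
    y0 = [1.0, -0.5, 0.8]                      # illustrative predicted deviations
    G = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # illustrative control effects
    q = [1.0, 1.0, 2.0]                        # illustrative KPC weights
    u = solve_box_constrained(y0, G, q, u_max=0.5)
    print("control:", u, "cost:", cost(y0, G, q, u))
```

When the limits are loose, the iteration settles on the unconstrained weighted least-squares solution. When a limit is active, the remaining actions are re-optimized around it, so the result generally differs from simply clipping the unconstrained solution.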
18.4.3 ADDITIONAL COMMENTS
The use of active dimensional control for variation reduction in MMPs is a very promising topic. In automotive body assembly, a series of feasibility studies has been conducted on in-process distributed sensing, controllability with programmable tooling, and SoV model validation in continuous production environments. All of those case studies were successful. Recently, beta-site testing of the technology was begun to further develop active control techniques for automotive body assembly.
The introduction of active control for variation reduction in MMPs has also raised other related research topics, such as optimal distributed sensing for active control, actuator placement considering controllability, and robust control considering sensing and model uncertainties. More effort is required to fully develop a technology base for active control for variation reduction in an MMP.
References
1. Shiu, B., Ceglarek, D., and Shi, J., Flexible beam-based modeling of sheet metal assembly for dimensional control, Transactions of NAMRI/SME, 25, 49–54, 1997.
2. Takezawa, N., An improved method for establishing the process wise quality standard, Reports of Statistical and Applied Research, Japanese Union of Scientists and Engineers (JUSE), 27(3), 63–76, September 1980.
3. Camelio, J., Hu, S.J., and Ceglarek, D., Modeling variation propagation of multistation assembly systems with compliant parts, ASME Transactions, Journal of Mechanical Design, 125, 673–681, 2003.
4. Liu, S.C. and Hu, S.J., Variation simulation for deformable sheet metal assemblies using finite element methods, ASME Transactions, Journal of Manufacturing Science and Engineering, 119, 368–374, 1997.
5. Yue, J.P., Sensitivity and Uncertainty in Variation Simulation Modeling for Multistage Manufacturing Systems, Ph.D. thesis, The University of Michigan, 2006.
6. Wang, H. and Ceglarek, D., Quality-driven sequence planning for compliant structure assemblies, Annals of CIRP, 54/1, 31–35, 2005.
7. Koren, Y., Hu, S.J., and Weber, T., Impact of manufacturing system configuration on performance, Annals of the CIRP, 47, 369–372, 1998.
8. Hu, S.J. and Koren, Y., Reconsider machine layout to optimize production, Manufacturing Engineering, 134(2), 81–90, 2005.
9. Huang, Q. and Shi, J., Stream of variation modeling and analysis of serial-parallel multistage manufacturing systems, ASME Transactions, Journal of Manufacturing Science and Engineering, 126, 611–618, 2004.
10. Wang, H., Huang, Q., and Katz, R., Multi-operational machining processes modeling for sequential root cause identification and measurement reduction, ASME Transactions, Journal of Manufacturing Science and Engineering, 127, 512–521, 2005.
11. Wang, H. and Huang, Q., Error cancellation modeling and its application in machining process control, IIE Transactions on Quality and Reliability, 38, 379–388, 2006.
12. Wang, H. and Huang, Q., Using error equivalence concept to automatically adjust discrete manufacturing processes for dimensional variation reduction, accepted with revision by ASME Transactions, Journal of Manufacturing Science and Engineering, 2006.
13. Huang, S.H., Automated setup planning for lathe machining, Journal of Manufacturing Systems, 17(3), 196–208, 1998.
14. Liu, J., Shi, J., and Hu, S.J., Quality ensured setup planning based on the stream-of-variation model for multistage machining processes, to be submitted to ASME Transactions, Journal of Manufacturing Science and Engineering, 2006.
15. Svensson, R., Car body assembly with ASEA 3D-Vision, Proceedings, 15th International Symposium on Industrial Robots, Tokyo, Japan, 1985.
16. Sekine, Y., Koyama, S., and Imazu, H., Nissan’s new production system: intelligent body assembly system, SAE Conference, Paper # 910816, Detroit, MI, 1991.
17. Pasek, Z. and Ulsoy, A.G., An adaptive assembly system for automotive applications, First S.M. Wu Symposium, 1, 341–348, 1994.
18. Khorzard, D., Shi, J., Hu, S.J., Ni, J., Zussman, E., and Seliger, G., Optimization of multiple panel fitting in automobile assembly, Transaction of NAMRI/SME, XXIII, 241–246, 1995.
19. Wang, H. and Huang, Q., Automatic process adjustment for reducing dimensional variation in discrete part machining processes, ASME International Mechanical Engineering Congress and Exposition (IMECE), Orlando, FL, Paper Number IMECE2005-80406, November 6–11, 2005.
20. Mantripragada, R. and Whitney, D.E., Modeling and controlling variation propagation in mechanical assemblies using state transition models, IEEE Transaction on Robotics and Automation, 15, 124–140, 1999.
21. Djurdjanovic, D. and Zhu, J., Stream of variations based error compensation in multistation manufacturing systems, ASME International Mechanical Engineering Congress and Exposition (IMECE), Orlando, FL, Paper Number IMECE2005-81550, November 6–11, 2005.
22. Izquierdo, L.E., Product, Process and Control Integration for Dimensional Quality Control in Reconfigurable Assembly Systems, Ph.D. thesis proposal, The University of Michigan, Ann Arbor, MI, 2005.
Index A Absolutely continuous distributions, 33 probability density function (PDF), 33 Across-station variation factors, 120–121 Active dimensional control multistage manufacturing processes (MMP) and, 453–457 variation reduction, 453–457 Age resetting, 427–429 AIC information criteria, 182 Algorithms comparison of, 286–288 data-mining-aided design, 338–353 exchange, 284–286 sort-and-cut, 284–286 Angle, definition of, 21 Assembly processes assumptions of, 119–122 deviation determination, 137–140 introduction to, 117–118 state space model, 117–133 variation factors, 119–122
B Best linear unbiased estimator (BLUE), 71 BIW. See body-in-white. Block resetting, modified, 425–427 simple, 425 BLUE. See best linear unbiased estimator. Body-in-white (BIW), 117 assembly processes, QR-chain modeling implementation of, 401–410 locating-pin degradation model, 403–404 multistation, 402 process variables, 404–407 product quality assessments, 407–408 quality characteristic deviations, 404–407 C CAD. See computer-aided design. Candidate design space, 341–342 CART. See classification and regression tree. Catastrophic failures, QR-chain modeling and, 396
Cauchy-Schwarz inequality, 26 extended, 26 CDF. See cumulative distribution function. Chain modeling and analysis, 389 QR-chain effect, 391–396 system reliability evaluation, 396–401 Classification and regression tree (CART), 348 Classification method, 347–349 classification and regression tree (CART), 348 Cluster number selection, 349–352 Clustering method, 346–347 CMMs. See coordinate measuring machines. Collinearity, 71–72 Communality, interpretation of, 102–103 Compliant parts, stream of variation (SOV) and, 439–444 multistage manufacturing processes, 440–441 multistation assembly modeling, 442–444 single station assembly modeling, 441–442 Component performance, QR-chain modeling and, 394 Computer-aided design (CAD), 9 Conditional correlation coefficients. See partial correlation coefficients. Conditional cumulative distribution functions, 35 Conditional distributions, 34–35 Confidence regions, 74–75 mean vectors and, 59–63 Connected fault class, 209 Control charts multivariate, 64–68 subgroup means, 66–68 Coordinate measuring machines (CMMs), 275–278 Coordinate system transformations, 154 Correlation coefficients, 37–40 multiple correlation coefficients, 38–40 partial correlation coefficients, 38 simple, 37 Cost components, 373–374 maintenance cost, 373 quality loss function, 374 tolerance cost, 373 Cost function, 368 Covariance matrices, 50–51 Cumulative distribution function (CDF), 33
D Data-mining-aided design algorithm, 338–353 candidate design space, 341–342 classification method, 347–349 clustering method, 346–347 feature function, 344–355 overview, 339–341 uniform coverage selection, 342–344 case study, 353–356 cluster number selection, 349–352 generic description of, 352–353 seed design number, 349–352 Datum, 143 Datum-induced error analysis, 154–158 Decision variables maintenance planning and, 372–373 tolerances and, 372–373 Decomposition of sum of squares, 73 Design evaluation, sensitivity-based, 307–313 Detection power, multiple-station sensor distribution and, 293–296 Deviation determination, assembly processes and, 137–140 Deviation least-square estimator, 250–251 Deviation LS estimator, 255–261 Deviations QR-chain modeling and, 404–407 state space modeling and, 122–131 Diagnosability analysis, case studies, 212–221 index, sensors and, 278–279 multistage assembly process, 212–219 machining process, 219–221 Diagnosable class, 207–210 reduced row echelon form (RREF), 208 Differential motion vector, 150 Differential transformation matrix (DTM), 150 Dimensional variation single-stage modeling, 153–162 coordinate system transformations, 154 datum-induced error analysis, 154–158 error source combinations, 159–160 fixture error analysis, 158–159 multistage machining processes, 161–162 sources, introduction to, 143–145 Direct product. See Kronecker product. Discrete distributions, 34 Distributed senses, sensors and, 278 Distributions, 33–34 sensors and, multiple-station, 290–301 Disturbances
sampling uncertainty, 236–237 unstructured noises, 235–236 Downtime, manufacturing and, 389–391 DTM. See differential transformation matrix.
E Eigenvalues, 22–24 Eigenvectors, 22–24, 234 End-of-line sensing, sensors and, 278 Error analysis combining sources, 159–160 datum-induced, 154–158 fixture, 158–159 Estimating, regression modeling and, 76–77 Estimation-based diagnosis, 249–270 least-squares estimators, 249–254 methods, 249 least-squares (LS), 249 maximum likelihood method (ML), 249 variance estimators, relationship between, 254–261 Exchange algorithms, sort-and-cut, 284–286 Existence conditions, variance estimators and, 259–260 Experimental machining processes, model validation and, 162–163 Extended Cauchy-Schwarz inequality, 26
F Factor analysis, 99–111 communality, interpretation of, 102–103 factor loadings, interpretation of, 102 factor rotation, 105–109 factor score, 109–111 general procedures, 111 loading matrix, estimation of, 103–105 method multiple fault identification, 187–193 statistical properties, 193–195 variability modeling and, 175–195 model AIC information criteria, 182 example of, 183–186 minimum description length (MDL), 182 number of faults estimating, 181–186 principal component analysis limitations, 180–181 process variability, 176 rotation, 180–181 structure, 176–177 variation pattern interpretation, 177–180
orthogonal factor model, 100–102 principal component analysis (PCA), comparison of, 99–100 Factor loadings, interpretation of, 102 Factor rotation, 105–122, 180–181 angle ϕ, 107 oblique, 105 orthogonal, 105 Factor score, 109–111 Failure, QR-chain modeling and, 396 Fault diagnosability analysis formulation for, 201–203 motivation for, 201–203 variation source identification and, 201–224 criterion of, 205–207 definitions of, 203–205 diagnosable class, 207–210 Fault estimating, likelihood ratio test, 181–182 Fault geometry vectors, 187–189 statistical properties and, 193–195 Fault identification, 187–193 Fault-quality model, variation pattern matching, connection of, 231–233 Fixture error analysis, 158–159 Fixture layout design, 333–356 methodologies, 335
G Gauging system, evaluation of, 210–212 information quality, 211 system flexibility, 211–212 Gauss least-squares theorem, 72
H Hadamard product, 30 Hat matrix, 72 Hessian matrix, 25–26 Homogeneous transformation (HT), 148 Hotelling’s T 2 test, 57–59 HT. See homogeneous transformation.
I Independent variables, multivariate linear regression and, 85–86 Inline optical coordination measurement machine (OCMM), 2 Inner product, definition of, 21
Integrated tolerance, process-oriented tolerance synthesis and, 375–376 case study, 380–384 Inverse, definition of, 16–17
K KCC. See key control characteristics. Key product characteristics (KPC), 305–307 Kinematic analysis basics, 148–152 differential motion vector, 150 differential transformation matrix (DTM), 150 homogeneous transformation (HT), 148 KPC. See key product characteristics. Kronecker product, 27–30 direct product, 27–30
L Layout design, optimal fixture, 333–356 Least-square estimates (LSE), 70–71 best linear unbiased estimator, 71 collinearity, 71–72 decomposition of sum of squares, 73 Gauss least-squares theorem, 72 hat matrix, 72 multivariate linear regression and, 79 residual sum of squares (RSS), 71 Least-square estimators, 249–254 deviation, 250–251 other types, 252–254 variation, 251–252 Least-squares method (LS), 249 Likelihood ratio test, 83, 181 Linear span, definition of, 20 Loading matrix, estimation of, 103–105 Locating-pin degradation model, QR-chain modeling and, 403–404 LS. See least-squares method. LSE. See least-squares estimates.
M Machining processes case studies, 240–244 datum, 143 introduction to, 143–145 model formulation, 147–148 model validation, 162–166 stream of variation modeling, 143–166 variation propagation model, 145–162 Mahalanobis distance. See statistical distance. Maintenance cost, 373
Maintenance design, process-oriented tolerance synthesis and, 375–376 case study, 380–384 optimization of, 377–380 Maintenance model, quality-oriented, 419–422 Maintenance multiple interactive system components and, 419–436 Maintenance planning cost components, 373–374 decision variables, 372–373 optimization problem formulation, 374–375 process-oriented tolerance synthesis and, 372–375 Maintenance policies for, multiple interactive system components, 422–429 Manufacturing downtime, 389–391 Marginal distributions, 34–35 Matlab function, 45 Matrix definition of, 15 Matrix differentiation, 25–26 Hessian matrix, 25–26 Matrix inequalities, 26–27 Cauchy-Schwarz inequality, 26 Matrix permutation, 208 Matrix theory basics of, 15–30 definitions of, 15–30 eigenvalues, 22–24 eigenvectors, 22–24 Hadamard product, 30 matrix differentiation, 25–26 matrix inequalities, 26–27 maximization, 26–27 operators, 27–30 partitioned matrices, 18–19 quadratic forms, 19 vector differentiation, 24–25 key terms, 15–19 inverse, 16–17 matrix, 15 reduced row echelon form, 17 square matrix, 16 trace, 16 vector, 15 space, 20–22 Maximization, 26–27 Maximum likelihood estimation (MLE), 79–82 Maximum likelihood method (ML), 249 MDL. See minimum description length. Mean vectors multivariate linear regression and, 81–82 multivariate quality control charts, 64–68 population, 63–64 statistical inferences on, 57–86
confidence regions, 59–63 Hotelling’s T 2 test, 57–59 simultaneous confidence intervals, 61–63 MIMO. See multiple-input-multiple-output. Minimal diagnosable class, 207–210 connected fault class, 209 gauging system, evaluation of, 210–212 matrix permutation, 208 permuted matrix, 208 pivot position, 208 Minimum description length (MDL), 182 ML estimators, vs. variance estimators, 264–270 ML. See maximum likelihood method. MMPs. See multistage manufacturing processes. Model checking, regression and, 75–76 Model dimension reduction, 448–449 Model formulation, machining processes and, 147–148 Model validation experimental machining processes, 162–163 machining processes and, 162–166 real vs. model measurement, 164–166 state space modeling and, 131–133 Modified block resetting, 425–427 Multicomponent maintenance policies age resetting, 427–429 modified block resetting, 425–427 policy, optimal solutions for, 430–431 multiple interactive system components and, 422–429 other issues, 429 simple block resetting (SBR), 422–425 policy, optimal solutions for, 430–431 Multiple correlation coefficients, 38–40 Multiple fault identification example of, 190–193 factor analysis method and, 187–193 fault geometry vectors, 187–189 fault interpretation, 190–193 subgroup identification, 189–190 Multiple interactive system components maintenance policies, 422–429 multicomponent maintenance policies, 422–429 quality-oriented maintenance, 419–436 case study, 432–436 Multiple linear regression, 68–86 inferences, 73–75 confidence region, 74–75 least-squares estimates (LSE), 70–71 model checking, 75–76 multivariate, 78–86 Multiple stream of variation, serial-parallel multistage manufacturing systems (SP-MMS) and, 446–448
Multiple-input-multiple-output (MIMO) system, 309 supremum operation, 313 Multiple-station sensor distribution, 290–301 optimal strategy, 296–297 optimization of, 290–291 single station, detection power, 293–296 variation transmissibility ratio, 291–293 Multistage assembly processes, 117–118 diagnosability analysis and, 212–219 Multistage machining processes, 161–162 diagnosability analysis and, 219–221 Multistage manufacturing processes (MMPs), 5–9 active dimensional control of, variation reduction, 453–457 chain modeling and analysis, 389–391 compliant parts and, 440–441 system reliability evaluation and, 397–400 stream of variation (SoV), 1–11 Multistage manufacturing systems (MMS), stream of variation (SOV) and, serial-parallel type, 445–449 Multistage, body-in-white (BIW), 117 Multistation assembly modeling, compliant parts and, 442–444 Multistation assembly process (MMS), stream of variation (SOV) modeling and, compliant parts, 439–444 Multistation body-in-white (BIW) assembly processes, QR-chain modeling and, 402 Multivariate central limit theorem, 51 Multivariate distribution, 33–40 absolutely continuous distributions, 33–34 conditional, 34–35 cumulative distribution functions, 35 correlation coefficients, 37–40 cumulative distribution function, 33 discrete, 34 marginal, 34–35 population moments, 35–37 statistical independence, 35 Multivariate linear regression independent variables selection method, 85–86 least-squares estimation, 79 likelihood ratio test, 83 maximum likelihood estimation (MLE), 79–82 mean vectors, 81–82 modeling of, 78–79 normality assumption, 82–83 predictions from, 83–85 Multivariate normal distributed random vectors, sampling theory and, results of, 50–51
Multivariate normal distribution, 40–55 properties of, 41 Multivariate process capability analysis, 313–315 case studies, 315–321 multivariate process capability analysis, 319–321 sensitivity-based design evaluation, 315–319 Multivariate quality control charts, 64–68 Multivariate statistical analysis, 33–52 multivariate distribution, 33–40 multivariate normal distribution, 40–45 noncentral χ2 distribution, 44 noncentral F distribution, 44–45 quadratic forms, 43 sampling theory, 45 Wishart distribution, 51–52
N Noncentral χ2 distribution, 44 Matlab function, 45 SAS function, 45 Noncentral F distribution, 44–45 Matlab function, 45 SAS function, 45 Nonconforming products quality assessment, 395–396 system failure, 395–396 Normality assumptions, multivariate linear regression and, 82–83
O Oblique rotation, 105 Observation matrix, 45 OCMM. See optical coordinate measuring machine. Operators, 27–30 Kronecker product, 27–30 Vec operator, 27 Optical coordinate measuring machine (OCMM), 275–278 Optimal fixture layout design, 333–356 data-mining-aided design algorithm, 338–353 variation reduction criteria, 335–338 Optimal sensor placement, multiple-station sensor distribution and, 296–297 Optimization problem formulation, 374–375 quality constraint, 375 quality loss function, 374–375
process-oriented tolerance synthesis and maintenance design, 377–380 Orthogonal factor model, 100–102 Orthogonal rotation, 105 Orthogonal Γ, variance estimators and, 260–262
P Part position deviation state of, 122–123 state space modeling and, 122–125 Partial correlation coefficients, 38 Partitioned matrices, 18–19 Pattern matching procedure, 237–239 Pattern matching variation source identification, 233–239 Pattern matching variation source identification disturbances sampling uncertainty, 236–237 unstructured noises, 235–236 eigenvector, 234 matching procedure, 238–239 PCA. See principal component analysis. PCs. See principal components. PDF. See probability density function. Permuted matrix, 208 Pivot position, 208 Population covariance matrix, 35–36 Population mean vectors, 63–64 Population moments, 35–37 population covariance matrix, 35–36 population mean vector, 35 statistical distance, 36–37 Positive definite, 19 Positive semidefinite matrix, 19 Predicting, regression modeling and, 77–78 Principal component analysis (PCA), 91–111 factor analysis, comparison of, 99–100 geometrical interpretation, 95–96 limitations of, 180–181 mathematical model, 91–95 principal components (PCs), 92, 96 Principal components (PCs), 92, 96 process control, 97–99 Probability density function (PDF), 33 Process capability analysis, 305–321 key control characteristics (KCC), 305–307 key product characteristics (KPC), 305–307 multivariate, 313–315 Process control, 97–99 Process degradation model, 366–367 Process-oriented tolerance synthesis, 361–384 case study, 368–372 concepts, 361
framework of, 362–368 cost function, 368 optimization formulation, 368 process degradation model, 366–367 tolerance and variation relationship, 364–366 variation propagation model, 363–364 integrated tolerance, 375–376 maintenance design, 375–387 case study, 380–384 optimization of, 377–380 maintenance planning, 372–375 cost components, 373–374 decision variables, 372–373 optimization problem formulation, 374–375 product-oriented tolerancing, 361 tolerance allocation, 368–372 decision variables, 372–373 Process variability, factor analysis model and, 176 Process variables, QR-chain modeling and, 404–407 Product quality self-improvement, system reliability evaluation and, 400–401 Product-oriented tolerancing, process-oriented tolerance synthesis, 361 Projection of vectors, 22
Q QR-chain effect, 391–393 examples of, 391–393 QR-chain modeling, 393–396 body-in-white (BIW) assembly processes, implementation of, 401–410 locating-pin degradation model, 403–404 model of, 402–408 multistation, 402 process variables, 404–407 product quality assessments, 407–408 quality characteristic deviations, 404–407 catastrophic failure, 396 component performance, product quality, 394 nonconforming products, 395–396 system component degradation, 394–395 Quadratic forms definition of, 19 multivariate statistical analysis and, 43 positive definite, 19 positive semidefinite matrix, 19 Quality assessments nonconforming products and, 395–396 QR-chain modeling and, 407–408
Quality characteristic deviations, QR-chain modeling and, 404–407 Quality constraint, optimization problem formulation and, 375 Quality ensured setup planning, stream of variation (SOV) and, 449–452 Quality loss function, 374–375 Quality, QR-chain modeling and, 394 Quality-oriented maintenance, case study, 432–436
R Random samples, 49–50 Reduced row echelon form (RREF), 208 definition of, 17 Regression modeling checking of, 75–76 inferences from, 76–78 estimating, 76–77 predicting, 77–78 Regression sum of squares (RSS), 73 Resetting age, 427–429 block, 425–427 Residual sum of squares (RSS), 71 Root cause diagnosis, sensors and, 276 Rotation angle ϕ, 107 RREF. See reduced row echelon form. RegSS. See regression sum of squares. RSS. See residual sum of squares.
S Sample geometry, 45–49 observation matrix, 45 Sample means, 50–51 covariance matrix, expected values of, 49–50 Sampling theory, 45–51 covariance matrices, 50–51 multivariate central limit theorem, 51 multivariate normal distributed random vectors and, results of, 50–51 random samples, 49–50 sample geometry, 45–49 sample means, 50–51 covariance matrix, 49–50 Sampling uncertainty, 236–237 SAS function, 45 SBR. See simple block resetting. Seed design number, 349–352 Sensitivity index, sensors and, 279–282 Sensitivity-based design evaluation, 307–313
case study of, 315–319 multiple-input-multiple-output (MIMO) system, 309 single-input-multiple-output (SIMO) system, 309 Sensor distribution case study, 298–301 strategy of, 297–298 Sensors layout, single station, 288–290 Sensors coordinate measuring machines (CMMs), 275–278 design criteria diagnosability index, 278–279 sensitivity index, 279–282 distribution of, distributed senses, 278 end-of-line, 278 multiple-station, 290–301 optical coordinate measuring machine (OCMM), 275–278 optimal placement and distribution, 275–301 placement of design criteria, 278–282 single-station, 282–290 root cause diagnosis, 276 Serial-parallel multistage manufacturing systems (SP-MMS), 445–449 model dimension reduction, 448–449 multiple stream of variation, 446–448 state space modeling, 446–448 SIMO. See single-input-multiple-output. Simple block resetting (SBR), 422–425 policy, optimal solutions for, 430–431 Simple correlation coefficients, 37 Simultaneous confidence intervals, mean vectors and, 61–63 Single station assembly modeling, compliant parts and, 441–442 sensor layout, 288–290 Single-input-multiple-output (SIMO) system, 309 Single-stage modeling, dimensional variation, 153–162 Single-station sensor placement, 282–290 optimization formula, 282–283 exchange algorithms, 283–284 Sort-and-cut exchange algorithms, 284–286 SoV. See stream of variation. SP-MMS. See serial-parallel multistage manufacturing systems. Square matrix, definition of, 16 State space modeling, 117–133 assembly processes and, 117–133 deviation of, 125–131
model validation, 131–133 multistage assembly processes, 117–118 part position, 122–123 deviation of, 122–125 serial-parallel multistage manufacturing systems (SP-MMS) and, 446–448 state space representation, 125–131 validation, 131–133 variation factors and assumptions, 119–122 variation system analysis (VSA), comparison of, 133 State space representation, 125–131 Station-level variation factors, 119–120 Statistical distance, 36–37 Mahalanobis distance, 36–37 Statistical independence, 35 Statistical inferences, mean vectors, 57–86 Statistical properties, fault geometry vectors, 193–195 Stream of variation (SoV) additional topics, 439–457 applications and methodology, 1–11 compliant parts, 439–444 multistage manufacturing processes, 440–441 multistation assembly modeling, 442–444 single station assembly modeling, 441–442 methodology history of, 3–4 multistage manufacturing processes (MMPs), 5–9 variation reductions, 7–9 overview, 5–9 relationship to other methodologies, 9–11 modeling machine processes and, 143–166 multistation assembly process (MMS), compliant parts, 439–444 serial-parallel multistage manufacturing systems (SP-MMS), 446–449 multistage manufacturing systems (MMS), serial-parallel type, 445–449 quality ensured setup planning, 449–452 research impact of, 1–3 inline optical coordinate measuring machine (OCMM), 2–4 initiation of, 1–3 Subgroup identification, multiple fault identification and, 189–190 Subgroup means, control charts and, 66–68
Supremum operation, 313 System component degradation, 394–395 System failure, nonconforming products and, 395–396 System reliability evaluation chain modeling and analysis and, 396–401 challenges of, 397 multistage manufacturing processes (MMPs), 397–400 product quality self-improvement, 400–401
T Tolerance allocation, 368–372 Tolerance cost, 373 Tolerances, decision variables, 372–373 Trace, definition of, 16 Transmissibility ratio, 291–293
U Uniform coverage selection, data-mining-aided design algorithm and, 342–344 Unstructured noises, 235–236
V Variability modeling, factor analysis method, 175–195 Variance estimators comparison of, 262–270 deviation LS estimator, 255–261 dispersion of, 263 existence conditions, 259–260 ML estimators, comparison of, 264–270 orthogonal Γ, 260–262 relationship between, 254–261 unbiasedness qualities of, 262 Variation factors, and assumptions across-station, 120–121 assembly processes and, 119–122 key, 121–122 state space modeling and, 119–122 station-level, 119–120 Variation least-square estimator, 251–252 Variation pattern definition of, 230 matching, 229–244 definition of, 230 fault-quality model, connection of, 231–233
machining process, case studies, 240–244 source identification, 233–239 Variation propagation model, 148–162, 363–364 dimensional variation single-stage modeling, 153 kinematic analysis basics, 148–152 workpiece geometric deviation, 152 Variation reduction criteria, optimal fixture layout design and, 335–338 Variation reduction methods stream of variation methodology and, 7–9 types, 9–10 Variation reduction, active control and, 453–457 Variation source identification fault diagnosability analysis, 201–224 pattern matching and, 233–239 Variation system analysis (VSA), state space modeling and, comparison of, 133 Variation transmissibility ratio, 291–293 Various pattern interpretation, 177–180
VD&T. See vectorial dimensioning tolerance. Vec operator, 27 Vector definition of, 15 differentiation, 24–25 space angle, 21 definition of, 20–22 inner product, 21 linear span, 20 projection of vectors, 22 Vectorial dimensioning tolerance (VD&T), 152
W Wishart distribution, 51–52 Workpiece geometric deviation, 152–153 vectorial dimensioning tolerance (VD&T), 152