Validation of the Measurement Process James R. DeVoe, EDITOR Institute for Materials Research, National Bureau of Standards
A symposium sponsored by the Division of Analytical Chemistry at the 171st Meeting of the American Chemical Society, New York, NY, April 5-6, 1976.
ACS SYMPOSIUM SERIES 63
AMERICAN CHEMICAL SOCIETY WASHINGTON, D. C. 1977
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Library of Congress Data
Validation of the measurement process. (ACS symposium series; 63 ISSN 0097-6156) Includes bibliographies and index. 1. Chemistry, Analytic—Statistical methods—Congresses. I. DeVoe, James R. II. American Chemical Society. Division of Analytical Chemistry. III. Series: American Chemical Society. ACS symposium series; 63. QD75.4.S8V34 ISBN 0-8412-0396-2
543'.01'82 77-15555 ACSMC8 63 1-207 1977
Copyright © 1977 American Chemical Society. All Rights Reserved. No part of this book may be reproduced or transmitted in any form or by any means (graphic, electronic, including photocopying, recording, taping, or information storage and retrieval systems) without written permission from the American Chemical Society. PRINTED IN THE UNITED STATES OF AMERICA
ACS Symposium Series
Robert F. Gould, Editor

Advisory Board
Donald G. Crosby
Jeremiah P. Freeman
E. Desmond Goddard
Robert A. Hofstader
John L. Margrave
Nina I. McClelland
John B. Pfeiffer
Joseph V. Rodricks
Alan C. Sartorelli
Raymond B. Seymour
Roy L. Whistler
Aaron Wold
FOREWORD
The ACS SYMPOSIUM SERIES [...] a medium for publishing [...] format of the SERIES parallels that of the continuing ADVANCES IN CHEMISTRY SERIES except that in order to save time the papers are not typeset but are reproduced as they are submitted by the authors in camera-ready form. As a further means of saving time, the papers are not edited or reviewed except by the symposium chairman, who becomes editor of the book. Papers published in the ACS SYMPOSIUM SERIES are original contributions not published elsewhere in whole or major part and include reports of research as well as reviews, since symposia may embrace both types of presentation.
PREFACE
The existence of integrated electronic circuits has changed radically our thinking with respect to performing chemical analyses. Low cost microprocessors are now integral parts of commercial analytical instrumentation. Minicomputers have the ability to control experiments, to collect data, and to perform calculations with ever increasing facility. Thus, there is considerable interest on the part of the chemical analyst to use computational techniques to validate the [...]
Chapters 1 and 2 describe [...] control of the measurement process and emphasize the use of graphical techniques which can be implemented conveniently on digital computers. After control of the measurement process has been established, it is necessary to evaluate systematic errors; Chapters 3 and 4 are devoted to this subject. Chapter 5 describes an innovative procedure which uses a laboratory minicomputer to optimize the variables in a chemical analysis. Chapter 6 outlines some examples for evaluating statistical control in testing laboratories.
I would like to thank the authors for their diligent effort and to express appreciation to Carol Shipley and the text editing staffs of the Analytical Chemistry Division and the Institute for Materials Research, NBS, for helping with the manuscripts.
JAMES R. DEVOE
Institute for Materials Research, NBS
Washington, DC 20234
August 12, 1977
1
Statistical Control of Measurement Processes
GRANT WERNIMONT
Department of Chemistry, Purdue University, Lafayette, IN 47905
Valid measurements are necessary whenever we make chemical tests [...] proper action can be [...] the material. Measurements are not valid until we evaluate the performance characteristics of the process which produced the measurements, and it is essential that the statements about the future behavior of these characteristics be correct. Statistical control is concerned with removing the assignable causes of variation in a measurement process (or correcting for their effects) so that we can associate approximate levels of confidence with these statements.
It is unfortunate, I think, that most academic courses involving measurement do not seem to make the student aware of how important it is to achieve a state of statistical control when we set up and run a measurement process. I was able to find only one current text on the theory and practice of quantitative analysis which addressed itself to this most important performance characteristic. In contrast, applied analytical chemists have been involved in statistical control activities for more than 40 years.
Some of the United States Government regulatory agencies are now becoming concerned about this important aspect of measurement operations. For example, the Nuclear Regulatory Commission requires (1):
"The licensee shall establish and maintain a statistical control system including control charts and formal statistical
procedures, designed to monitor the quality of each type of program measurement. Control chart limits shall be established to be equivalent to levels of significance of 0.05 and 0.001. Whenever control data exceed the 0.05 control limits, the licensee shall investigate the condition and take corrective action in a timely manner. The results of these investigations and actions shall be recorded. Whenever the control data exceed the 0.001 control limits, the measurement system which generated the data shall not be used for control purposes until the deficiency has been brought into control at the 0.05 level."
In this chapter the meaning of statistical control is explained, and the procedures which we can use to help set up and run a measurement process so that it is in a state of statistical control are reviewed.
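The significance levels in the quoted requirement can be related to the more familiar "sigma" multiples. The following Python sketch is only an illustration under assumptions that the regulation itself does not state: that the plotted statistic is normally distributed and that the limits are two-sided. The function name is hypothetical.

```python
from statistics import NormalDist


def two_sided_multiplier(alpha: float) -> float:
    """Return k such that P(|Z| > k) = alpha for a standard normal Z.

    Assumes a normally distributed statistic and two-sided limits.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


warning_k = two_sided_multiplier(0.05)    # multiplier for the 0.05 limits
action_k = two_sided_multiplier(0.001)    # multiplier for the 0.001 limits

print(f"0.05 limits:  +/- {warning_k:.2f} sigma")
print(f"0.001 limits: +/- {action_k:.2f} sigma")
```

Under these assumptions the 0.05 limits lie close to 2 sigma and the 0.001 limits somewhat beyond 3 sigma.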
WHAT IS MEASUREMENT
Measurement has been defined as "the operation of assigning numbers to represent properties using arbitrary rules based on scientific principles." Of course this is an over-simplification; a much broader interpretation of measurement formulates a hierarchy of measurement scales: Nominal, Ordinal, Interval, and Ratio (2). The mathematical transformations permitted on each scale determine what statistical methodology can be applied to the measurements. In general, the more unrestricted the permissible transformations, the more restricted the statistics; nearly all methodologies can be applied to ratio-scale measurements, but only a few serve for measurements on a nominal scale.
The most penetrating analysis, by far, of the basis for making measurements was formulated by Churchill Eisenhart (3); it should be carefully studied by all people who devise measurement methods and perform measurement operations, as well as by those who use measurement results to make decisions. Eisenhart states (3, p. 163):
"Measurement is the assignment of numbers to material things to represent the relations existing among them with respect to particular properties. The number assigned to some particular property serves to represent the relative amounts of this property associated with the object concerned.
Measurement always pertains to properties of things, not to the things themselves. Thus we cannot measure a meter bar, but can, and usually do, measure its length; and we could also measure its mass, its density, and perhaps [...]
The object of measurement is twofold: first, symbolic representation of properties of things as a basis for conceptual analysis; and second, to effect the representation in a form amenable to the powerful tools of mathematical analysis. The decisive feature is symbolic representation of properties, for which end numerals are not the only usable symbols."
There is a form of direct measurement which is independent of the prior knowledge of any other property; but the number system used to express magnitudes must behave like the property being measured. A simple example of direct measurement is the use of Johansson blocks to calibrate a micrometer. In this case it is evident that the property we call length does behave like numbers in the following two ways:
1. An experimental procedure can be devised which will produce an ordered sequence of the blocks.
2. Another experimental procedure can be devised to combine (wring) the blocks additively.
A more complex example is the property we call absorbance (A = -log Transmittance), which behaves according to the rules of matrix algebra (4).
In analytical chemistry, measurements are occasionally made by a direct method; but for reasons of convenience, we more often use an indirect method based on fundamental or empirical laws involving various physical, chemical and biological properties of matter, energy and radiation. It is a fact of experience that a measurement process can be used under more diversified conditions when it is based on fundamental laws rather than on empirical laws for which we have inadequate theoretical explanations. However, we should never forget that empirical laws are involved in almost every measurement process.
MEASUREMENT METHODS AND PROCESSES
Let us now examine how we devise and carry out the operations to make [...] Eisenhart (3, p. [...]) discussed this in great detail; I extract some of his points:
"Specification of the apparatus and equipment to be used, the operations to be performed, the conditions in which they are carried out - these instructions serve to define a method of measurement... A measurement process is the realization of a method of measurement in terms of a particular apparatus, equipment, conditions, etc. that at best only approximate those prescribed... Written specifications in methods of measurement often contain absolutely precise instructions which cannot be carried out (repeatedly) with exactitude in practice...to this extent there are certain discrepancies between a method and its realization by a particular process... The specification often includes imprecise instructions such as 'raise the temperature slowly', 'stir well', etc...to the extent the instructions are not absolutely definite, there will be room for differences between realizations of the same measurement method...
To qualify as a specification, a set of instructions must be sufficiently definite to insure statistical stability of repeated measurements of a single quantity,
that is the measurement process must be capable of meeting the criteria of statistical control."
Of course, we must carry out a well designed series of experiments to devise the types of equipment and the sequence of operations which make up the measurement process before we can formulate the measurement method; and it is necessary that we find optimum conditions for running the process so that it responds to significant changes in the level of the property being measured, but does not respond to small changes in its operating conditions (5). My experience leads me to conclude that [...] fail to recognize how difficult [...] conditions, (b) control significant assignable causes of variation, and (c) write a concise yet unambiguous set of procedural instructions.
Measurement processes in chemical analysis consist of unit operations which include,
- taking a gross sample from an aggregate of material,
- taking a laboratory subsample from the gross sample,
- treating the subsample, physically and chemically, to remove interferences,
- measuring a property of the treated subsample, and
- estimating the desired property using a calibration curve.
A useful technique for visually showing the measurement operations is a block diagram or flow chart. The person who developed the process, or has run it repeatedly, can easily draw the flow chart; it will supplement the method and help other people to understand the process.
It is important to realize that the final measurement is an attribute of the laboratory sample, and is only an estimate of the property in the entire aggregate of material. We must never forget that we make inferences about the magnitude of the property in the aggregate from a small finite group of measurements on the laboratory sample. If we fail to control the significant assignable causes which can, potentially, affect the various operations in the measurement process, we will certainly find that it will not meet the criteria of statistical control.
STATISTICAL CONTROL
I am unable to give a simple, concise definition of what we mean by "the state of being in statistical control". The concept was conceived and developed by Dr. Walter A. Shewhart, a physicist-engineer at the Bell Telephone Laboratories, to help solve the problems of manufacturing products of uniform and acceptable quality. While his publications (6, 7, 8) are not primarily concerned with measurement processes, they do present ideas which [...] be applied to them. The 1939 book gives [...] statistical control, presentation [...] results, and the specification of precision and accuracy.
Eisenhart presents a section on the requirement of statistical control (3, p. 166) which summarizes Shewhart's ideas and demonstrates how they apply to measurement processes; I extract some of these ideas:
"The point that Shewhart makes forcefully, and stresses repeatedly, is that the first n measurements of a quantity generated by a measurement process provide a logical basis for predicting the behavior of further measurements of the same quantity by the same measurement process, if and only if, these n measurements may be regarded as a random sample from a population or universe of all conceivable measurements...characterized by a probability distribution...nothing is said about the mathematical form of the distribution. The important thing is that there be one...
Shewhart was well aware that, from a set of n measurements in hand, it is not possible to decide, with certainty, whether they do or do not constitute a random sample from some definite statistical population characterized by a probability distribution. He therefore proposed (7) that in any particular instance one should decide to act for the present as if the measurements in hand (and their immediate
successors)...meet the requirements of the small sample version of Criterion I of his previous book (6) and...show no evidence of lack of statistical control when analyzed for randomness in the order in which they were taken by the control chart techniques, for averages and standard deviations that he had found so valuable in industrial process control, and by certain additional tests for randomness based on 'runs above and below average' and 'runs up and down'...
Experience shows that in the case of measurement processes [...] statistical control [...] is usually very difficult to attain, just as in the case of industrial production processes..."
Eisenhart also quotes from a paper by Dr. R. B. Murphy, another Bell Telephone engineer, on the validity of precision and accuracy statements (9):
"...a test method ought not to be known as a measurement process unless it is capable of statistical control...(which) means that either the measurements are the product of an identifiable statistical universe, or if not, the physical causes preventing such identification may themselves be identified and, if desired, isolated and suppressed. Incapability of control implies that the results of the measurement process are not to be trusted as indications of the property at hand - in short, we are not in any verifiable sense measuring anything...without this limitation on the notion of a measurement process, one is unable to go on to give meaning to those statistical measures which are the basis for any discussion of precision and accuracy."
I believe we can now formulate the idea of statistical control as follows: A measurement process may be said to be in a state of statistical control if the significant assignable causes of variation have been removed or corrected for, so that a finite set of n measurements from the process can be used to
(a) predict limits of variation for the n measurements and (b) assign a level of confidence that future measurements will lie within these limits.
CONTROL CHART ANALYSIS
The operational procedure for demonstrating that a process is in a state of statistical control is quite simple in concept but rather complex in practice. It consists of arranging to gather n measurements, in some kind of order, and in the form of so-called "rational subgroups", within which the variations may be considered [...] the basis of knowledge of the process, to be [...] which, the variation [...] suspected [...] assignable causes.
To illustrate how we make a control-chart-analysis of measurements, let us examine the results of a simple experiment which Shewhart carried out to simulate a "controlled" production process. He placed 998 circular chips in a large bowl; numbers between negative 3.0 and positive 3.0, at 0.1 intervals, were recorded on the chips, which were one color for the negative numbers and another for the positive. The magnitudes of the numbers were distributed according to a "normal" distribution with average = 0.0 and standard deviation = 1.007. The chips were drawn from the bowl one at a time, with replacement, until 4000 values were obtained and recorded in order. For further details, see (6, pp. 164-165 and Appendix II).
Shewhart observed that in this experiment we have as near an approach as is likely feasible to the conditions in which the law of large numbers applies since, to the best of our knowledge, the same essential conditions were maintained.
However, he once told me that this simple drawing operation is prone to show lack of statistical control unless great care is taken to mix up the bowl of chips between the drawings and keep the bookkeeping mistake-free.
I have plotted the results of the first 200 drawings as a control chart in Figure 1, using a rational subgroup of four consecutive values. The averages and standard deviations of the 50 subgroups were calculated as,
Figure 1. Consecutive drawings from Shewhart's bowl of chips (chart title: A Simulated Measurement Process; horizontal axis: rational subgroup number, 1 to 50 and 991 to 1000)
$\bar{X} = \sum X_i / 4$, and $S = \sqrt{\sum (X_i - \bar{X})^2 / (4-1)}$.
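The subgroup calculation can be sketched in Python as follows. This is a rough imitation, not Shewhart's experiment: instead of a bowl holding a fixed set of 998 chips, it assumes each drawing is a normal random value (mean 0.0, standard deviation 1.007) rounded to the nearest 0.1 and clipped to the range -3.0 to 3.0. The function names and the seed are arbitrary choices made for the illustration.

```python
import random
from statistics import mean, stdev


def draw_chip(rng: random.Random) -> float:
    """Draw one approximately normal value on the 0.1 grid from -3.0 to 3.0."""
    value = round(rng.gauss(0.0, 1.007), 1)
    return max(-3.0, min(3.0, value))


def subgroup_stats(values, size=4):
    """Split values into consecutive subgroups; return (averages, std devs)."""
    groups = [values[i:i + size] for i in range(0, len(values) - size + 1, size)]
    averages = [mean(g) for g in groups]
    std_devs = [stdev(g) for g in groups]   # stdev divides by (size - 1)
    return averages, std_devs


rng = random.Random(1977)                   # arbitrary seed, for repeatability only
drawings = [draw_chip(rng) for _ in range(200)]
averages, std_devs = subgroup_stats(drawings)

grand_average = mean(averages)
mean_std_dev = mean(std_devs)
print(len(averages), round(grand_average, 2), round(mean_std_dev, 3))
```

The simulated grand average and mean standard deviation will differ from the values obtained with the real chips, since a different random sequence is involved.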
The grand average of all 200 values is -0.08 and the average of the group standard deviations is 0.912. Three-sigma control limits for the 50 subgroups are,

Limits   Average                            Standard Deviation
Upper    -0.08 + (1.628 x 0.912) = 1.40     (2.266 x 0.912) = 2.07
Lower    -0.08 - (1.628 x 0.912) = -1.56    (0 x 0.912) = 0

The factors, $B_3 = 0$, $B_4 = 2.266$, and $A_3 = 1.628$, are tabled in various references (10, 11, 12, 13, 14).
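The limits can be recomputed directly from the quoted grand average, mean standard deviation, and factors. The short Python sketch below does only that arithmetic; the function name is hypothetical, and the factors are the ones quoted in the text for subgroups of four.

```python
# Control chart factors for subgroups of four, as quoted in the text.
A3, B3, B4 = 1.628, 0.0, 2.266


def three_sigma_limits(grand_average, mean_std_dev):
    """Return ((lower, upper) for averages, (lower, upper) for std devs)."""
    half_width = A3 * mean_std_dev
    average_limits = (grand_average - half_width, grand_average + half_width)
    std_dev_limits = (B3 * mean_std_dev, B4 * mean_std_dev)
    return average_limits, std_dev_limits


avg_limits, sd_limits = three_sigma_limits(-0.08, 0.912)
print("averages:", [round(x, 2) for x in avg_limits])
print("std devs:", [round(x, 2) for x in sd_limits])
```

Keeping the factors as named constants makes it easy to substitute the tabled values for another subgroup size.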
To evaluate these results for statistical control, we first examine [...] standard deviation [...] greater than the 3-sigma limit. This indicates that no assignable causes were affecting the operation of consecutively drawing and replacing four chips. Lack of control for standard deviation would lead us to look for local assignable causes in the way each group of four chips was removed from the bowl. Perhaps someone is surreptitiously exchanging the bowl with one which has a standard deviation greater than 1.007.
Next, we examine the upper graph for subgroup averages, which also shows none outside 3-sigma limits. This indicates that no assignable causes were affecting the drawing operation throughout the entire sequence of the first 200 values. Lack of control would suggest that some nonlocal assignable cause affected some subgroups differently than others. Perhaps the surreptitious exchange involved a bowl with a distribution which averages two rather than zero.
Shewhart suggested that criteria for randomness should also include the behavior of runs for consecutive groups within the 3-sigma limits. Duncan explains (13, p. 386) a run as "a succession of items of the same class" such as a series of increasing or decreasing values, or a series of consecutive values above or below the average. We find no runs, up or down, greater than five; but two runs of seven below the average occurred (beginning with subgroups 6 and 15). Statistical theory and practical experience indicate that assignable causes can usually be found to explain runs of seven or more; of course it is now impossible to look for them.
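Run lengths of the two kinds mentioned above can be counted mechanically. The following Python sketch is one possible way of doing so; the function names are hypothetical, and the counting conventions are assumptions: a point exactly on the center line breaks a run, and runs up or down are counted in steps (successive differences) rather than in points. Published tables may use other conventions.

```python
def longest_run_about_average(values, center):
    """Longest run of consecutive values all above, or all below, center."""
    longest = current = 0
    previous_side = 0
    for v in values:
        side = (v > center) - (v < center)   # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1
        else:
            current = 1 if side != 0 else 0
        previous_side = side
        longest = max(longest, current)
    return longest


def longest_run_up_or_down(values):
    """Longest run of consecutive increases or decreases, counted in steps."""
    longest = current = 0
    previous_direction = 0
    for a, b in zip(values, values[1:]):
        direction = (b > a) - (b < a)        # +1 up, -1 down, 0 no change
        if direction != 0 and direction == previous_direction:
            current += 1
        else:
            current = 1 if direction != 0 else 0
        previous_direction = direction
        longest = max(longest, current)
    return longest


# Hypothetical subgroup averages, for illustration only.
sample = [0.3, -0.2, -0.5, -0.1, -0.4, 0.6, 0.8, 0.2]
print(longest_run_about_average(sample, 0.0), longest_run_up_or_down(sample))
```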
No other types of systematic variation, such as cycles or trends, appear to be present for either the standard deviations or the averages.
Can we conclude that this process was in a state of statistical control? Well, we have two choices: (a) the process was not in control, or (b) the process was in control but two improbable runs occurred. This is exactly the situation we meet almost every time we examine results from a measurement process. No matter which choice we make, there is some chance that it is wrong. I would conclude that the evidence for lack of control is not convincing based on knowledge of the process, and predict that the 3-sigma limits, estimated from the first 200 drawings, should also include practically all [...] see that the last 4[...] drawings are well within these limits.
Duncan has given (13, p. 392) the following summary of criteria for lack of statistical control:
1. One or more points outside 3-sigma limits,
2. One or more points in the vicinity of a "warning limit" suggesting that additional observations be taken,
3. A run of seven or more points,
4. Cycles, trends, or other nonrandom patterns within 3-sigma limits,
5. A run of two or more points outside of 2-sigma limits,
6. A run of four or more points outside 1-sigma limits.
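Some of the criteria listed above lend themselves to a mechanical check. The Python sketch below covers criteria 1, 3, 5 and 6 only; criteria 2 and 4 call for judgment and are left out. It rests on assumptions not spelled out in the summary: that `sigma` is the standard deviation of the plotted statistic, that criterion 3 refers to runs above or below the center line, and that the runs in criteria 5 and 6 must lie on one side of the center line. The function names are hypothetical.

```python
def max_consecutive(flags):
    """Length of the longest unbroken stretch of True values."""
    longest = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest


def check_criteria(points, center, sigma):
    """Map criterion number (1, 3, 5, 6) to True when it signals lack of control."""
    z = [(p - center) / sigma for p in points]   # distance from center in sigma units

    def run_beyond(k):
        # Longest run of consecutive points more than k sigma from the
        # center line, all on the same side (k = 0 gives runs above or below).
        return max(max_consecutive([v > k for v in z]),
                   max_consecutive([v < -k for v in z]))

    return {
        1: any(abs(v) > 3 for v in z),
        3: run_beyond(0) >= 7,
        5: run_beyond(2) >= 2,
        6: run_beyond(1) >= 4,
    }


# Hypothetical plotted points, already expressed in sigma units.
example = [0.1, -0.2, 0.3, -0.1, 2.2, 2.5, 0.0]
signals = check_criteria(example, center=0.0, sigma=1.0)
print(signals)
```

In this made-up example only criterion 5 signals, because two consecutive points lie beyond 2 sigma on the same side.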
Of course we are always faced with the risk of being wrong when we decide whether or not a process is in a state of statistical control. We fix this risk by arbitrarily choosing critical 3-sigma limits. Using wider limits, we increase the risk of erroneously concluding that the process is in statistical control and decrease the chances of detecting significant assignable causes. The use of narrower limits will have the opposite effects. Experience has shown that the risks are quite tolerable, in most cases, when action limits are set between 2- and 3-sigma for subgroup standard deviations and averages.
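The trade-off between wider and narrower limits can be made concrete with a small calculation. The Python sketch below assumes, for illustration only, that the plotted statistic is normally distributed and that successive points are independent; neither assumption holds exactly for subgroup standard deviations. The function names are hypothetical.

```python
from statistics import NormalDist


def false_alarm_probability(k: float) -> float:
    """P(a point falls outside +/- k sigma) for an in-control normal process."""
    return 2.0 * (1.0 - NormalDist().cdf(k))


def chance_of_any_false_alarm(k: float, n_points: int) -> float:
    """P(at least one of n independent in-control points falls outside the limits)."""
    return 1.0 - (1.0 - false_alarm_probability(k)) ** n_points


for k in (2.0, 3.0):
    print(k,
          round(false_alarm_probability(k), 4),
          round(chance_of_any_false_alarm(k, 50), 3))
```

With 50 plotted subgroups, 2-sigma limits would be expected to give at least one false signal most of the time, whereas 3-sigma limits do so only occasionally; this is the risk that is accepted in exchange for less sensitivity to real assignable causes.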
RATIONAL SUBGROUPS
The key to success when we use control chart analysis to examine results from a measurement process lies in the strategy we use to set up "rational subgroups." The idea of arranging to gather the measurements in subgroups makes real sense, because it is my observation that assignable causes affecting a measurement process fall rather clearly into two classes.
The first class is under the local control of the person who operates the process; it includes such operations as manipulating equipment, dispensing reagents, calibrating instruments [...] points, and otherwise following procedural instructions in local time and space. Operators can be held responsible for maintaining rigid control of these local operations, and good operators soon learn how to do it. Lack of statistical control of these local operations is observed, occasionally, but only because of basic shortcomings in the method or equipment which the operator is unable to perceive or cope with.
The second class of assignable causes is not under the local control of the operator; it includes such things as long-range maintenance of laboratory conditions and equipment, types and/or methods of calibration, deterioration of reagents and instruments, the nature of interferences in the material being tested, and numerous other types of nonlocal or regional assignable causes. The laboratory supervisor must assume responsibility for finding and removing assignable causes affecting these operations.
I think it is obvious that control chart analysis for variation within rational subgroups (standard deviation or range) gives us important information about the local assignable causes, while the chart for averages reveals information about the regional assignable causes.
Two possible mistakes are easy to make when we set up a system of rational subgroups:

(a) The replications are so close together in time and/or space that they do not include all the local assignable causes. For instance, we would never want to record duplicate readings of an instrument scale because, as W. J. Youden often pointed out, this is merely "duplicity". The subgroup should include all the local random causes because a measurement process can never be brought into a state of statistical control if the rational subgroups are too restricted.

(b) The replications are so far apart in time and/or space that they include some of the regional assignable causes. This leads to wide control limits which lack the power to detect assignable causes, local or regional.

I have a detailed discussion of the concept of rational subgroups in my paper, "The Use of Control Charts in the Analytical Laboratory" (15). Specific instructions cannot be formulated to devise rational subgroups which will apply to all kinds of measurement processes. They should be limited so that variation is essentially random, and they should be sufficiently extended to reveal assignable causes which the operator is unable to control.
Let us now look at some real-world examples of how we can use control chart analysis.
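Before turning to the examples, it may help to see how the two charts are computed from rational subgroups. The following is only a minimal sketch in Python, not a procedure taken from this paper: it assumes equal-sized subgroups, the conventional three-sigma limits, and the usual normal-theory correction factor c4 for the average subgroup standard deviation. The function names (`c4`, `xbar_s_limits`, `out_of_control`) and the baseline numbers in the usage note are hypothetical.

```python
import math
from statistics import mean, stdev


def c4(n):
    """Normal-theory bias factor: expected sample SD of n values divided by sigma."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def xbar_s_limits(subgroups):
    """Three-sigma limits for subgroup averages and subgroup standard deviations.

    Both sets of limits are derived from the variation WITHIN the subgroups,
    so the chart for averages asks whether the averages vary more than the
    local (within-subgroup) variation can explain.
    """
    n = len(subgroups[0])
    if n < 2 or any(len(g) != n for g in subgroups):
        raise ValueError("subgroups must all have the same size, at least 2")
    averages = [mean(g) for g in subgroups]
    sds = [stdev(g) for g in subgroups]
    grand_average = mean(averages)
    s_bar = mean(sds)
    k = c4(n)
    half_width_x = 3.0 * s_bar / (k * math.sqrt(n))
    half_width_s = 3.0 * s_bar * math.sqrt(1.0 - k * k) / k
    return {
        "averages": averages,
        "sds": sds,
        # (lower limit, center line, upper limit)
        "xbar": (grand_average - half_width_x, grand_average, grand_average + half_width_x),
        "s": (max(0.0, s_bar - half_width_s), s_bar, s_bar + half_width_s),
    }


def out_of_control(values, limits):
    """Return the positions of values that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

For instance, with three made-up baseline subgroups of four measurements each, `xbar_s_limits([[10.0, 10.2, 9.8, 10.0], [10.1, 9.9, 10.0, 10.0], [9.9, 10.1, 10.2, 9.8]])` gives limits against which later subgroup averages and standard deviations can be checked with `out_of_control`. In practice many more baseline subgroups would be wanted before the limits are trusted.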
A PROCESS WITH NO ASSIGNABLE CAUSES

Figure 2 shows a control chart for a process to determine the water-equivalent of a Parr-type bomb combustion calorimeter. Once each month, the operator made four independent calibration runs on the same afternoon by weighing appropriate amounts of NBS Standard Benzoic Acid and burning it in the oxygen-charged bomb under essentially the same conditions as were used to determine heats of combustion of fuel. The material was ignited by heating electrically a small piece of pure iron wire. The calorimeter constant was computed from the observed temperature rise of the water surrounding the bomb, the weight of benzoic acid, and the NBS certified value for the heat of combustion of the acid. A small correction for the heat generated by the wire was applied.

The data for this chart were taken from historical records and you can see that during the previous 11-month period, no significant assignable causes were affecting the standard deviations, so we can conclude that the operator was controlling all the local operations. The chart for averages also shows satisfactory control, which means that no regional assignable causes were affecting the calibration operations over an extended period of time.
It is interesting to note that prior to this analysis of calibration measurements, the laboratory supervisor had been revising the water-equivalent each month. He now decided to adopt the long range average of 29030 but continue checking it every month as before. This was sound strategy because a few months later the calibration average was observed to be just out of control on the low side. Investigation revealed that a new supply of iron wire had been acquired but the supervisor neglected to give a revised correction factor to the operator.
A PROCESS WITH LOCAL ASSIGNABLE CAUSES
I have already indicated that lack of control of local assignable causes is not commonly observed; and I am aware of no simple techniques, other than control chart analysis, to detect it. This example involved the use of an instrument to measure the tearing strength of plastic sheeting used to support photographic emulsions. The instrument (Thwing-Albert), designed to measure the tearing strength of paper, consisted of a fairly massive pendulum arranged so that it could absorb the energy used to tear a small specimen of material, thus decreasing the amplitude of the pendulum. The instrument had been modified to make it more sensitive to the smaller strengths of film support by attaching a counterbalance to the pendulum, thus raising its center of gravity. The modified instrument was monitored by means of a reservoir of "reference" film support picked from a uniform production lot, cut into test specimens, and thoroughly randomized. The specimens were conditioned and tearing strengths were measured once each day using rational subgroups of five strips from the reservoir.

Control chart analysis showed no evidence for lack of statistical control for both standard deviation and average during the first 14 weeks, as you can see in Figure 3. During week 20, lack of control was indicated for one subgroup standard deviation and one average; and by week 24, it became evident that both standard deviation and average were out of statistical control. The operator could find no reasons to explain this, and the material behaved all right when tested on other instruments currently in use.

[Figure 2. Determination of the water-equivalent of a bomb calorimeter. Panel title: A PROCESS WITH NO ASSIGNABLE CAUSES; n = 4; x = observed value - 24,000; charts of X-bar and S; horizontal axis: month number.]

[Figure 3. Determination of the force to tear plastic film support. Panel title: A PROCESS WITH LOCAL ASSIGNABLE CAUSES; n = 5; charts of X-bar and S; horizontal axis: week number.]

The instrument was returned to the machine shop where the counterbalance had been installed, and it was found that the bearings, on which the pendulum was supported, were beginning to disintegrate because of the increased load of the counterbalance. Larger bearings were installed and, as you can see, the control chart for both standard deviation and average returned to normal. New bearings had to be installed on all the other instruments.
A PROCESS WITH REGIONAL ASSIGNABLE CAUSES

When control chart analysis shows satisfactory control for the variation within rational subgroups but lack of control among subgroup averages, we must look for regional assignable causes. Most interlaboratory studies of measurement processes show little or no evidence for lack of control within the laboratories over a short period of time; but it is very difficult to achieve statistical control among a group of laboratories all using the same test method.

Figure 4 shows results of a study of the Eberstadt method for determining the acetyl content of cellulose acetate. Samples of a reference material were analyzed in eight different laboratories, with two independent operators in each laboratory making duplicate tests on each of two different days. The lower chart for operator ranges shows that a state of statistical control existed for the variation within the laboratories, but it is obvious that laboratory averages vary more than can be explained by the variation within laboratories. It is difficult to find the reasons for this because they are often different from one laboratory to another. In this case it was found that some of the laboratories were not rigorously following the test method procedures.
THE PROBLEM OF DUPLICITY

Let us return to the critical problem of devising rational subgroups. In Figure 5, we see results for the determination of copper, made during the production of bronze castings. Two independent samples were drilled from each casting and analyzed, in duplicate, using a precise method of electrolytically depositing the copper and weighing it. The lower chart for ranges is in control, but the chart for subgroup averages shows that the duplicate samples are exceedingly variable compared to the duplicate determinations. When control limits are based on the variation of sample averages, within castings, there is some reason to believe that the manufacturing and testing operations are both in a state of statistical control, although a cyclic effect cannot be ruled out, as you can see in Figure 6.

[Figure 4. Determination of acetyl in cellulose acetate. Panel title: A PROCESS WITH REGIONAL ASSIGNABLE CAUSES; n = 2; charts of X-bar and R; horizontal axis: laboratory number.]

[Figure 5. Determination of copper in bronze castings. Panel title: A PROCESS WITH LIMITS BASED ON TEST VARIATION; n = 2; charts of X-bar and R; horizontal axis: casting number.]
SIMPLE AND COMPLEX CONTROL

In all of the examples so far, we have used a mathematical model which Eisenhart called SIMPLE statistical control (3, p. 174), that is, the variation of measurements within rational subgroups is random and serves as a valid estimate of the random variation of the subgroup averages. However, we often find processes for which this model is inadequate because regional assignable causes exist which we cannot identify and/or remove; in such cases, it is desirable to determine whether the process is in a state of COMPLEX, or multistage, statistical control (3, p. 178). We do this by setting up a control chart for the variation (standard deviation or range) of measurements within the rational subgroups, just as before. However, we estimate control limits for the subgroup averages by treating them as "individual" measurements and then use the "moving range" method, which calculates all the consecutive differences between the subgroup averages, thus partially eliminating the effects of the regional assignable causes (13, p. 451).

Figure 7 shows results for the measurement of the water content of a series of production lots of an organic solvent using the Karl Fischer method. The lower chart for standard deviations indicates that the measurement process is in control when three replicate determinations are made on a single sample of material from each lot. The upper graph shows the averages; the narrow limits are based on replicate measurement variation, while the wide limits correspond to the moving range of consecutive lot averages. Of course, we would not expect the distillation of an organic material to be in simple statistical control in a case like this; but we see that the overall operation of distilling and measuring is not even in the state of complex statistical control. The assignable causes for this may be in the measurement of the water content, but they are much more likely to be found in the distillation process.

[Figure 6. Determination of copper in bronze castings. Panel title: A PROCESS WITH LIMITS BASED ON MATERIAL VARIABILITY; n = 2; horizontal axis: casting number.]

[Figure 7. Determination of water in an organic solvent. Panel title: SIMPLE AND COMPLEX CONTROL; n = 3; charts of X-bar and S; horizontal axis: lot number.]
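The moving range calculation for subgroup averages can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of reference 13: the subgroup averages are treated as individual values, the moving ranges are absolute differences between consecutive averages, and the factor 2.66 is the customary three-sigma multiplier for moving ranges of two (3 divided by d2 = 1.128 from standard control chart tables). The function name `moving_range_limits` is hypothetical.

```python
from statistics import mean


def moving_range_limits(averages):
    """Control limits for subgroup averages treated as individual measurements.

    The spread is estimated from consecutive differences only, so slow drifts
    between subgroups inflate the limits much less than they would inflate an
    overall standard deviation.
    """
    if len(averages) < 2:
        raise ValueError("at least two subgroup averages are required")
    moving_ranges = [abs(later - earlier) for earlier, later in zip(averages, averages[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(averages)
    half_width = 2.66 * mr_bar  # 3 / d2, with d2 = 1.128 for ranges of two values
    return center - half_width, center, center + half_width
```

These limits would be drawn on the chart for averages alongside the narrower limits derived from within-subgroup variation, which is how the two sets of limits in Figure 7 are described.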
UNORDERED DATA ANALYSIS

Control chart analysis was originally applied to measurements taken in sequential order from a continuous process, but it can also be used to compare results from different sources where logical order cannot be assigned. A laboratory study o[...] is necessary to give very serious thought to how to arrange for subgroups within the laboratories. Some people have defined a subgroup as the measurements made by a single operator, using a single set of equipment, as closely together as possible. This can be considered to be duplicity. A more useful subgroup includes the local assignable causes over a more reasonable period of time, for example, a week or more. A logical reason for this more extensive rational subgroup is the fact that the people who use measurement results often require comparisons between repeated measurements to help make decisions relating to sample rechecks, production changes, material sources, etc., made over the interval of this period of time.

Many of these control chart methods were developed by Shewhart and successfully used by many people for nearly fifty years. During the last three decades, more sophisticated control charts for such things as cumulative sums, lot acceptance, multivariable responses, etc., have been developed (18); and some of these techniques will be found useful to help evaluate measurement processes.
RELATED ASSIGNABLE CAUSES

Many measurement processes show lack of statistical control of a type which often appears baffling because the assignable causes act together so that the effects of one are not the same at various levels of the other. For example, it has long been known that the oxidation of ferrous iron with potassium permanganate gives high results in hydrochloric acid solutions; the deviations increase with acid concentration. Also, the deviations are relatively smaller as the iron concentration increases, and the rate of titration decreases. It is most important that we find and remove the effects of this kind of differential response while a measurement process is being developed.

The classical experimental procedure (sometimes called the scientific method) for optimizing the response of a measurement process is inadequate to detect this kind of related behavior between assignable causes. In the case [...] studies each, at some [...] shown in Figure 8 on the left; but it never determines whether the effects of changing the levels of the factors are independent of each other. Differential response is easily detected using a complete factorial design as is shown on the right, where the effects of all combinations of the factors are measured with little or no extra work. In this case, the factors are acting independently if the difference between the diagonal averages is not significantly greater than zero.

Differential response (usually called interaction, or nonadditivity, by statisticians) can be of three types: (a) among factors within the measurement process, (b) between process factors and the type of material being tested, and (c) between test methods and the type of interferences in the material being tested. The example described above falls into the first type.
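The diagonal comparison for two factors at two levels can be sketched in a few lines of Python. This is a hedged illustration only: the layout of y1 to y4 in Figure 8 is not reproduced here, so the sketch uses its own hypothetical labeling in which each result is keyed by the coded levels (-1 or +1) of factors A and B. The function name `two_factor_summary` is likewise an assumption.

```python
def two_factor_summary(y):
    """Summarize a complete 2 x 2 factorial experiment.

    y maps (level_of_A, level_of_B), each coded -1 or +1, to the measured result.
    Returns the average effect of each factor and the difference between the
    two diagonal averages, which is zero when the factors act independently.
    """
    effect_a = (y[(1, -1)] + y[(1, 1)]) / 2 - (y[(-1, -1)] + y[(-1, 1)]) / 2
    effect_b = (y[(-1, 1)] + y[(1, 1)]) / 2 - (y[(-1, -1)] + y[(1, -1)]) / 2
    same_sign_diagonal = (y[(-1, -1)] + y[(1, 1)]) / 2
    opposite_sign_diagonal = (y[(-1, 1)] + y[(1, -1)]) / 2
    return {
        "A": effect_a,
        "B": effect_b,
        "diagonal_difference": same_sign_diagonal - opposite_sign_diagonal,
    }
```

Whether an observed diagonal difference is "significantly greater than zero" still has to be judged against the replication error of the process, which this sketch does not estimate.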
Figure 9 shows the problem of differential response when several materials are tested using a measurement process set up in various laboratories. The laboratories do not rank the materials in exactly the same order. This behavior is not serious as long as the variation among the laboratories is no greater than the replication error of the process. However, when unknown interferences are present in different types of material, which affect some laboratory results but not others, it soon becomes impossible to predict the response of the test method on types of material other than those used in the interlaboratory study.
[Figure 8. Two ... Panel title: EXPERIMENTAL DESIGNS; panels: ONE FACTOR AT A TIME and TWO FACTORS TOGETHER; horizontal axis: FACTOR A.]

[Figure 9. Resistance of floor materials to surface abrasion. Panel title: DIFFERENTIAL RESPONSE; horizontal axis: MATERIAL (A to F).]
Differential response is often observed when two test methods are based on slightly different physical and chemical principles. This has been observed for methods to evaluate material flammability, surface abrasion, fabric wrinkle resistance, paper smoothness, etc. We are usually unable to explain these interactions, especially if the fundamental principles of the methods are only vaguely understood.
RUGGEDNESS OF A MEASUREMENT PROCESS

Measurement processes are often developed in a single laboratory and the[...] other laboratories. We have already [...] results from different [...] state of statistical control with respect to the variation within laboratories. Occasionally we are able to identify some of the assignable causes but only with a great deal of effort. Dr. W. J. Youden addressed himself to this important problem and he observed (19):

"By no means an unusual occurrence is a collaborative test whose results obviously fall short of expectations based on data from the initiating laboratory. The explanation is usually found in the fact that the initiating laboratory has a set of operations and equipment that is never varied. In fact, care is taken not to vary the routine in any particular. Naturally no light is shed on what may happen when the procedure on trial is used by a number of laboratories each of which establishes its own particular routine".

He goes on to suggest that things like the source, age and concentration of reagents, the rate of heating solutions, the temperature and time of drying materials, the environmental conditions of temperature and relative humidity, and many other factors may not be specified in detail so that they vary, within small limits, from one laboratory to another.
The only protection against this type of assignable cause is for the initiating laboratory to deliberately introduce minor variations into the procedure and observe what happens. At first this appears to involve a great deal of extra work, most of which may yield negative results. Youden suggested a scheme of attack that will conserve labor yet be sensitive enough to pick up fairly small effects.

The principle of his suggested design can be illustrated by a simple experiment involving just three factors (20). Let the levels of the factors, chosen slightly above and below the specified operating conditions, be designated A, B, C and a, b, c respectively. Only four experiments need to be run:

Run Number    Level    Result
1             ABC      t
2             aBc      x
3             abC      y
4             Abc      z
Notice that two factors are always changed from one run to another. The differential effect of changing the level of each factor is obtained by computing the averages,

$$\bar{A} = (t+z)/2, \qquad \bar{a} = (x+y)/2;$$
$$\bar{B} = (t+x)/2, \qquad \bar{b} = (y+z)/2;$$
$$\bar{C} = (t+y)/2, \qquad \bar{c} = (x+z)/2.$$

The two runs for $\bar{A}$ involve the levels B, b, C, and c for the other two factors and this is also true for $\bar{a}$; thus if $\bar{A}$ differs significantly from $\bar{a}$, the first factor must be the assignable cause. The same logic applies to the other two factors.

It is important to recognize that the justification for this logic rests on the expectation that changes in the levels of all the factors have been quite small and are not supposed to have appreciable effect on the measurement process.
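The arithmetic of the four-run plan can be written out as a short Python sketch. The averages are exactly those given in the text for runs ABC, aBc, abC, and Abc with results t, x, y, and z; the function name `three_factor_effects` and the sample numbers in the usage note are hypothetical.

```python
def three_factor_effects(t, x, y, z):
    """Effects for the four-run plan: ABC -> t, aBc -> x, abC -> y, Abc -> z.

    For each factor, the result is the average at the upper-case level minus
    the average at the lower-case level.
    """
    upper = {"A": (t + z) / 2, "B": (t + x) / 2, "C": (t + y) / 2}
    lower = {"A": (x + y) / 2, "B": (y + z) / 2, "C": (x + z) / 2}
    return {factor: upper[factor] - lower[factor] for factor in upper}
```

With made-up results t = 10, x = 10, y = 8, z = 8, the call `three_factor_effects(10, 10, 8, 8)` attributes the whole change to factor B, because only the two runs at level B (runs 1 and 2) are high. Judging whether such a difference is significant requires an estimate of replication error from elsewhere.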
This experimental design has many advantages over the traditional one-factor-at-a-time scheme:

1. The number of experiments is minimized at one more than the number of factors being studied, although only certain combinations are possible.
2. Differences are evaluated in terms of averages so that the discriminating power is greater for the same number of runs.
3. If no significant factors are found, we can get a preliminary estimate of how the method will behave in other laboratories.
4. If significant factors are found, their effects can be estimated and appropriate tolerances set for their control.
The restriction to certain combinations of factors and runs is not very serious; Youden considered the plan for seven factors in eight runs to be a good compromise and he has published [...] example[s] (20, 21). Other people [...] experiments to test the ruggedness [...], without exception, they were able to detect one or more potential assignable causes of variation in the test method, as written (22).
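A seven-factor, eight-run plan of this kind can be generated and analyzed with a short Python sketch. The layout below is an assumption: it is one standard balanced construction (a full two-level design in A, B, C, with the remaining columns formed as products), and it is not claimed to match the run order or lettering of the table Youden published. Levels are coded +1 for the upper-case level and -1 for the lower-case level; the names `eight_run_design` and `factor_effects` are hypothetical.

```python
from itertools import product

FACTORS = "ABCDEFG"


def eight_run_design():
    """Return eight runs, each a tuple of seven coded levels (+1 or -1)."""
    runs = []
    for a, b, c in product((1, -1), repeat=3):
        # Columns D to G are products of the first three, which keeps every
        # pair of columns balanced against each other.
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs


def factor_effects(results):
    """Average result at the +1 level minus average at the -1 level, per factor."""
    design = eight_run_design()
    if len(results) != len(design):
        raise ValueError("one result is required for each of the eight runs")
    effects = {}
    for column, name in enumerate(FACTORS):
        high = [r for run, r in zip(design, results) if run[column] == 1]
        low = [r for run, r in zip(design, results) if run[column] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects
```

Each factor is at its upper level in four runs and its lower level in the other four, and within either set of four every other factor appears twice at each level. This is the same balancing argument the text gives for the three-factor plan.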
A MEASUREMENT HIERARCHY

Experience tells us that some measurement processes can be easily brought into a state of statistical control while others seem subject to a plethora of assignable causes that are hard to locate and difficult to control. Why is this so?

Let us arrange measurement operations into the following order:

1. Measurement processes to determine natural constants,
2. Calibration of physical and chemical reference processes,
3. Precise and accurate "standard" measurement processes,
4. Routine control measurement processes, and
5. Laboratory simulation processes to measure performance characteristics.
The determination of natural constants such as the speed of light, the acceleration of gravity, the atomic weights of elements, etc., requires that we spare no effort to correctly assign a precise and unbiased value to represent the property involved.
Operations to calibrate weights, volumetric-ware, etc., and to prepare pure chemical compounds and assign values to homogeneous reference materials, fall into the second class.

The third class includes "standard" measurement processes which are usually fairly complex so that all significant interferences are removed or corrected for.

Routine control measurements are in the fourth class; they require less elaborate equipment than the standard methods and are more economical to run although they give less precise and less accurate results.

Finally, the las[...] performance of a syste[...] magnitud[...] a property. Resistance to surface abrasion or weathering, flammability of children's sleepwear, the efficiency of removing dirt from a carpet, are examples of this class.
You will surely recognize several types of order as we proceed from top to bottom in this hierarchy:

- the scientific principles involved are fundamental and fairly well understood at the top; they are empirical and very complex at the bottom,
- operations to correct for the effects of known interferences become increasingly more difficult,
- assignable causes are more difficult to identify,
- differential response becomes more prevalent and more difficult to cope with, and
- ruggedness against stresses on the operational procedure decreases as we go down the hierarchy.
We should keep this classification in mind whenever we devise and develop a measurement process; its position in the hierarchy is an indication of the problems we must solve in order to maintain the process in a state of statistical control.
SUMMARY

Measurements are not valid until we evaluate the performance characteristics of the process which produces them. Statistical control is concerned with removing all significant assignable causes which can affect the process so that statements about the variability of future results will be correct with an associated level of confidence.

Control chart analysis, first developed by W. A. Shewhart, is used to examine a finite sequence of [...] statistical control. The measurements are divided into "rational subgroups" [...] the standard deviation within subgroups serve[...] variation of the subgroup standar[...] averages. Lack of statistical control is indicated when these limits are exceeded or when nonrandom patterns of variation occur within the limits.

The concept of rational subgroups makes sense because assignable causes of process variation fall rather clearly into two classes: local manipulations which are under the control of the operator; and regional operations, in time and space, to maintain the stability of the process, for which someone other than the operator must be responsible. We observe a lack of statistical control for subgroup averages more often than for standard deviations because regional assignable causes are difficult to find and remove.
Not infrequently, we find that two assignable causes act together so that their effects are not additive; that is, the effects of one are not the same at all levels of the other. Process interferences also may be present in some types of material being tested and not in others. We find it difficult to understand this kind of process response unless we use complete factorial experimental designs.

Lack of statistical control among laboratories is inevitable unless the measurement method is rugged against small changes in process operating conditions. W. J. Youden has provided us with an efficient experimental design to test the ruggedness of a method.
Finally, we can arrange measurement operations into a hierarchy which clearly shows that the better we understand the scientific principles involved, the easier it is to maintain the process in a state of statistical control.
LITERATURE CITED

1. Federal Register (1975) Vol. 40, No. 155, p. 33653.
2. Stevens, S. S., in "Measurement, Definitions and Theories", C. W. Churchman and R. Philburn, Ed., Chapter 2, John Wiley and Sons, Inc., New York, 1959.
3. Eisenhart, Churchill, Journal of Research of the National Bureau [...] Instrumentation (1963) [...], pp. [...] to 187.
4. Wernimont, Grant, Anal. Chem. (1967) Vol. 29, pp. 554-562.
5. Wernimont, Grant, Materials Research and Standards (1969) Vol. 9, No. 9, pp. 8-21.
6. Shewhart, W. A., "Economic Control of Quality of Manufactured Product", D. Van Nostrand Company, Inc., New York, 1931.
7. Shewhart, W. A., "Statistical Method from the Viewpoint of Quality Control", The Graduate School, U. S. Department of Agriculture, Washington, 1939.
8. Shewhart, W. A., in "Contribution of Statistics to the Science of Engineering, University of Pennsylvania Bicentennial Conference", University of Pennsylvania Press, 1941.
9. Murphy, R. B., Materials Research and Standards (1961) Vol. 1, No. 4, pp. 264-267.
10. "ASTM Manual on Quality Control of Materials", American Society for Testing and Materials, Philadelphia, 1976.
11. "Definitions, Symbols, Formulas, and Tables for Control Charts", American Society for Quality Control, Milwaukee, 1972.
12. "Glossary and Tables for Statistical Quality Control", American Society for Quality Control, Milwaukee, 1973.
13. Duncan, A. J., "Quality Control and Industrial Statistics, 4th Ed.", Richard D. Irwin, Inc., Homewood, 1974.
14. Bicking, C. A. and Gryna, F. M., Jr., in "Quality Control Handbook, 3rd Ed.", J. M. Juran, Ed., Section 23, McGraw-Hill Book Company, New York, 1974.
15. Wernimont, Grant, Anal. Chem. (1946) Vol. 18, pp. 587-592.
16. Wernimont, Grant, ASTM Bulletin (1950) No. 166, pp. 45-48.
17. Wernimont, Grant, Anal. Chem. (1951) Vol. 23, pp. 1572-1576.
18. Gibra, I. N., J. Qual. Tech. (1975) Vol. 7, No. 4, pp. 183-192.
19. Youden, W. J., J. Assoc. Offic. Agr. Chem. (1963) Vol. 46, p. 56.
20. Youden, W. J., Materials Research and Standards (1961) Vol. 1, No. 11, p. 863.
21. Youden, W. J. and Steiner, E. H., "Statistical Manual of the AOAC", p. 50, The Association of Official Analytical [...]
22. Wernimont, Grant, "Symposium [...] Preparation and Use of Precision Statements", American Society for Testing and Materials, Philadelphia, In Press.
2

Testing Basic Assumptions in the Measurement Process

JAMES J. FILLIBEN

National Bureau of Standards, Washington, DC 20234
The purpose of this paper is to discuss various statistical techniques for testing the basic assumptions in a measurement process. A measurement process refers to the act of collecting quantified information about some phenomenon of interest under well-defined conditions. Among the various components of the measurement process are the experimentalists themselves. The end objective in a measurement process is predictability (1,2), that is, the ability to make probability statements about measurements already taken and yet to be taken. If this predictability is not present, then the process will yield conclusions which are only temporal and local in nature, and which will lack the generality typical of scientific experimentation.

To achieve such predictability, the measurement process must be "in control" (3). The term "in control" is a statistical term, having nothing to do per se with whatever physical science area or phenomenon the experiment involves, but rather with the properties of sequences of measurements. A broad definition of "in control" is as follows: a measurement process is in control if the resulting observations from the process, when collected under any fixed experimental condition within the scope of the a priori well-defined conditions of the measurement process, behave like random drawings from some fixed probability distribution with fixed location and fixed variation parameters. The essential components implied by the above definition are:

1. randomness
2. fixed location
3. fixed variation
4. fixed distribution
It is to be noted that "fixed location" as used above and throughout this paper is an abbreviated way of stating that the measurement process has a single limiting mean (i.e., as the measurement process continues in time, it is conceived to have a unique limiting "typical value"). Similarly, "fixed variation" is an abbreviated way of stating that the measurement process has a stable degree of variation.

It is important, of course, to note that the above components in the definition of "in control" are identically those underlying [...] made, either knowingly [...]. The consequences of invalidity of these assumptions are the arrival at incorrect conclusions and the loss of the desired predictability that the scientist invariably seeks. In this light, the testing and checking of basic assumptions in a measurement process takes on its rightful importance. The testing of assumptions is a "necessary evil," tangential in a sense to the main analysis, but one which rarely can be short-circuited.

In order to test such assumptions after the fact, we have only the raw data resulting from the experiment. Fortunately, however, much information about the validity or invalidity of the underlying assumptions is still latent in the data, and considerable progress (of both a theoretical and practical nature) has been made in the last decade in the development of statistical techniques for the extraction of such information. The remainder of this paper will deal with an enumeration and discussion of such techniques.
It will be noted that almost all of the techniques to be presented are graphical in nature. There are many reasons for such a heavy dependence on graphics:

1. Plots take full advantage of the pattern recognition capabilities of the human (e.g., linearity is easy to detect).
2. Plots make use of a minimal number of assumptions. Thus, the use of plots increases the likelihood that the conclusions will not be approach-dependent.
3. From a communications point of view, a plot is generally a much more understandable and efficient way of conveying information to another analyst or experimentalist than is a set of statistics.
4. A plot allows the analyst to see and use all of the data; thus no information is lost in the typical exercise of forming statistics which (in essence) summarize and map information latent in the entire data set into a single number, a number which will usually only be sensitive to one particular analysis aspect of the data.
5. A plot allows the analyst to check many different aspects of the data simultaneously, and so information will be relayed not only about what is being investigated, but also about unsuspected anomalies (e.g., outliers) in the data.

[Figure 1d. Run sequence plot. Beam deflection data. Printer plot of X(I) (vertically) versus I (horizontally).]

RUN SEQUENCE PLOT

We assume that [...] observations $Y_1, Y_2, \ldots$ [...]

$$\text{response } Y_i = \text{constant } c + \text{error } e_i \tag{1}$$
Almost all data have a "time" run sequence (i = 1, 2, ..., n) associated with them. Although the collection of data points may or may not have been equispaced in time, the ordering of the data in time (i.e., the run sequence) is usually well-defined (unless observations are simultaneously collected). In cases where the data acquisition rate is such that there is an equal time-spacing between collected data points, the run sequence has a natural analogue to a possibly relevant factor (time) in the experiment; in other cases, when the data acquisition rate is variable or random, no such analogue exists, yet the run sequence "factor" is still frequently of interest.

The run sequence plot (defined as a plot of $Y_i$ versus $i$) is the simplest possible data plot and yet is almost invariably informative. This run sequence plot is the recommended first step in assessing whether the basic assumptions of the measurement process are tenable. In particular, this plot yields information about the assumptions of fixed location and fixed variation and the implicit corollary assumption that the data set is outlier-free.

Figures 1a through 1d illustrate this on three particular data sets. The two upper plots present data (voltage counts) from a Josephson junction cryothermometry experiment. The upper left plot (data all on the lower margin with the exception of an isolated point on the upper margin) is indicative of an outlier (in this case due to a keypunch error in the leading digit).
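As a rough illustration of the idea, the sketch below renders a run sequence plot as a character ("printer") plot, in the spirit of the computerized figures in this paper. It is not the DATAPAC routine: the function name `run_sequence_plot`, its `rows` and `cols` parameters, and the data values are hypothetical, chosen only to mimic a keypunch error in a leading digit.

```python
def run_sequence_plot(y, rows=9, cols=50):
    """Return a printer-style plot of Y_i (vertical) versus i (horizontal).

    Hypothetical helper for illustration; rows >= 2 is assumed.
    """
    n = len(y)
    lo, hi = min(y), max(y)
    span = (hi - lo) or 1.0            # avoid dividing by zero for constant data
    steps = rows - 1
    grid = [[" "] * cols for _ in range(rows)]
    for i, value in enumerate(y):
        col = round(i * (cols - 1) / max(n - 1, 1))   # run order -> column
        row = round((hi - value) / span * steps)      # row 0 holds the maximum
        grid[row][col] = "X"
    lines = []
    for r, cells in enumerate(grid):
        level = hi - r * span / steps                 # vertical-axis label
        lines.append(f"{level:12.4f} |" + "".join(cells).rstrip())
    return "\n".join(lines)


# Hypothetical voltage-count-like data; the seventh value has a wrong leading digit.
counts = [2898, 2899, 2897, 2898, 2900, 2898, 8898, 2899, 2898, 2897]
plot = run_sequence_plot(counts, rows=5, cols=10)
print(plot)
```

In the printed output the single X on the top row is the isolated point, while all remaining points collapse onto the bottom row, which is the pattern described for the upper left plot.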
The upper right plot is the corrected data set; note the absence of shifts or variational changes as one proceeds left to right (in time) across the plot. (Note also how this plot gives the analyst a clear and initial "feel" for the discrete character of these data.) The lower left figure is a run sequence plot for wind velocity data. Note the apparent shift (up) in location in the second half of the data, a clear indication of a process apparently not in statistical control. The lower right figure gives a run sequence plot for deflections of a steel-concrete beam when a periodic force is applied to it. No apparent shift in location or variation and no apparent outliers are evident from this plot, and so this particular data set "passes" the scrutiny of this first statistical technique.

It is to be noted in passing that the FORTRAN calling sequences in the upper left corner of this figure (and most other figures in this paper) refer to calls to subroutines in the DATAPAC (6,7) data analysis package which produced the computerized output that comprises the figures.

An important generalization of the run sequence plot is the control chart. Rather than plotting $X_i$ vs. $i$ (as above), the sample mean control chart [...] fixed) numbers of observations [...] been grouped together to form each $\bar{x}_j$. If the original process is normal, or if the number of observations grouped to form a single mean is large, then normal probability limits can be inserted onto the chart so as to define a typical band of variation for the process. Large (or frequent) excursions outside the band are an indication of a process that is no longer "in control." In addition to $\bar{x}$ control charts, other useful control charts are s (standard deviation) charts, r (range) charts and CUSUM (cumulative sum) charts. Collectively, they are excellent diagnostic tools for determining whether in particular an outlier has occurred and in general whether a shift in location or variation has occurred in the process.
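A minimal sketch of a sample mean control chart follows, under stated assumptions: consecutive subgroups of equal size m, limits placed at three standard deviations of a subgroup mean, and a pooled within-subgroup standard deviation used without the usual small-sample correction constants. The function name `xbar_chart` and the data are hypothetical; the sketch illustrates the idea and is not the procedure of any particular reference.

```python
import math
import statistics


def xbar_chart(y, m):
    """Subgroup means and 3-sigma limits for consecutive subgroups of size m.

    Hypothetical helper: any incomplete trailing subgroup is dropped.
    """
    groups = [y[k:k + m] for k in range(0, len(y) - m + 1, m)]
    means = [statistics.fmean(g) for g in groups]
    center = statistics.fmean(means)
    # Pooled within-subgroup standard deviation (equal subgroup sizes assumed)
    pooled_sd = math.sqrt(statistics.fmean([statistics.variance(g) for g in groups]))
    half_width = 3.0 * pooled_sd / math.sqrt(m)
    lower, upper = center - half_width, center + half_width
    # Indices of subgroup means falling outside the band of typical variation
    out = [j for j, xb in enumerate(means) if not lower <= xb <= upper]
    return means, (lower, center, upper), out


# Hypothetical data: five stable subgroups followed by one shifted upward.
y = [9.8, 10.0, 10.0, 10.2,
     10.0, 9.8, 10.2, 10.0,
     10.2, 10.0, 9.8, 10.0,
     10.0, 10.2, 10.0, 9.8,
     9.8, 10.2, 10.0, 10.0,
     10.8, 11.0, 11.2, 11.0]
means, (lower, center, upper), out = xbar_chart(y, 4)
print(out)
```

Because the limits here are computed from all of the data, a large shift also moves the center line; in practice the limits are often fixed from a period believed to be in control.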
For further information on control charts, the reader is referred to Himmelblau (8).

LAG-1 AUTOCORRELATION PLOT

The lag-1 autocorrelation plot is defined as a plot of $Y_i$ versus $Y_{i-1}$ over the entire data set; that is, the following n-1 points are plotted: $(Y_2, Y_1), (Y_3, Y_2), (Y_4, Y_3), \ldots, (Y_n, Y_{n-1})$. The lag-1 autocorrelation plot is sensitive to the randomness assumption in a measurement process. If the data are random, then adjacent observations will be uncorrelated and the plot of $Y_i$ versus $Y_{i-1}$ will appear as a data cloud with no apparent structure. However, if the data are not random and if adjacent observations do have some autocorrelation, this structure will frequently manifest itself in the autocorrelation plot.

Figure 2 gives three examples of the autocorrelation plot. The upper left plot is an autocorrelation plot of the aforementioned voltage counts; no apparent structure is noticeable (aside from the lattice effect due to the discreteness of the data). The upper right plot is for the wind velocity data; note the pronounced linear structure in this plot, which implies that the randomness assumption is untenable for these data. The lower plot is for the beam deflection data; note the well-defined elliptical structure of the autocorrelation plot, which is also indicative of the untenableness of the randomness assumption. In this last case the lack of randomness was, as it turned out, due to an underlying cyclic structure in the data (i.e., the true model was $Y_i = c + a \sin(\omega i + \phi) + e_i$, where $i$ is time, rather than the assumed $Y_i = c + e_i$). The reader should note the two points in the upper right portion of the plot which are off the ellipse. This is due to a single outlier in the data and demonstrates the secondary sensitivity of the lag-1 autocorrelation plot to outliers.

[Figure 2a. Lag-1 ($Y_i$ vs. $Y_{i-1}$) autocorrelation plot. Voltage counts. Printer plot of X(I) (vertically) versus X(I-1) (horizontally).]

[Figure 2b. Lag-1 ($Y_i$ vs. $Y_{i-1}$) autocorrelation plot. Wind velocities. CALL PLOTXX(X,1200).]

[Figure 2c. Lag-1 ($Y_i$ vs. $Y_{i-1}$) autocorrelation plot. Beam deflections. Printer plot of X(I) (vertically) versus X(I-1) (horizontally).]

RUNS TEST

The runs test is a technique that is specifically used for testing randomness [...] application of this [...] illustrate the technique, consider the run sequence plot of 50 spectrophotometry transmittance data points in figure 3. It is apparent from the plot that the data are not random (note how observations 35 to 45 are not random but rather near-monotonic in nature). To scrutinize the correlation structure in this data set, consider the runs analysis given in figure 4. A run up of length i means that there are exactly (i+1) successive observations such that each observation is greater than (or at least equal to) the previous observation.

The underlying theory behind the runs test is that if the data are random and if the sample size is known (in this case, n=50), the number of runs up of length 1, of length 2, etc., may be considered as random variables whose expected values and standard deviations can be calculated from theoretical considerations (9), and these calculations will not depend on the (unknown) distribution of the data but only on its assumed randomness. Having computed such theoretical values, the final step in the test is to compute from the data the observed number of runs (up) of length 1, of length 2, etc., and then determine how many theoretical standard deviations this observed statistic falls from the theoretically expected value. This is most easily done by formation of the standardized variable:

$$\frac{N_i - E(N_i)}{SD(N_i)}$$

where $N_i$ is the observed number of runs (up) of length i, $E(N_i)$ is the theoretical expected number of runs up of length i, and $SD(N_i)$ is the theoretical standard deviation of the number of runs up of length i. This standardized variate is given in the right-most column of figure 4. For random data, one would expect values of, say, ±1, ±2, ±3 in this column, i.e., the observed number of runs of length i should be only a few (at most) standard deviations away from the theoretical expected value for the number of runs of length i. For nonrandom data, the deviations from the expected values will, of course, be much larger, and this is the crux of the runs test.

[Figure 3. Run sequence plot for spectrophotometric measurement of transmittance. Printer plot of X(I) (vertically) versus I (horizontally).]

[Figure 4. Runs analysis for spectrophotometric measurement of transmittance. CALL RUNS(X,50). Tabulated for runs up of length exactly I and of length I or more: value of the statistic, EXP(STAT), SD(STAT), and (STAT - EXP(STAT))/SD(STAT).]

Note that in the spectrophotometry data, the randomness assumption is entirely untenable, as indicated by the excessively large values of the standardized statistic in the last column (e.g., the number of runs up of length 10 in the data is 1 and yet for n=50 observations and for random data, we should have essentially no runs up of length 10; this 1 run up of length 10 is over 1000 standard deviations from its expected value and so the randomness assumption must be rejected).

The above-described runs analysis is a valuable additional tool for testing the specific hypothesis of randomness. The net effect [...] the sample mean $\bar{X}$ of [...] this experiment, the uncertainty statement associated with $\bar{X}$ (for example $s_{\bar{X}}$ = the estimated standard deviation of $\bar{X}$) would certainly have to be based on many fewer degrees of freedom than n-1 = 49. As is evident from the data, there are not 50 independent observations of the transmittance; there would be considerably fewer, and this would result (from a practical point of view) in a larger (and more realistic) value for $s_{\bar{X}}$.

BAND PLOTS

The assumed underlying model for application of this technique is again as in eq. (1). To graphically test this model, however, we consider an alternative model, viz.,

$$\text{response } Y_i = f(X_i) + \text{error } e_i \tag{2}$$

where $f(X_i)$ is some unknown function of the variable X and where X is a possible variable affecting the response.

A band plot (4) is a specially-constructed plot of the response variable Y versus another variable X. A band plot considers all the data within various classes of the horizontal axis variable and then, rather than plotting all such points, summarizes each subset of data into five statistics: the median, the lower and upper quartiles, and the two extremes (minimum and maximum). A line connecting the medians across the horizontal axis adds continuity to the plot and gives a more robust indication of whether the response variable shifts location with respect to the horizontal axis variable. Lines connecting the various lower quartiles provide a lower practical limit to the "body" of the data whereas lines connecting the upper quartiles delineate an upper edge to the body of the data. The flatness (lack of trend) of the band between upper and lower quartiles is an indication of whether or not model (1) above (the fixed location model) is tenable. The width of the band between upper and lower quartiles is an indication of whether the fixed variation assumption (with respect to the horizontal axis variable) is tenable.
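The five statistics that a band plot draws for each class can be sketched as follows. This is an illustrative reading of the description above and not the DATAPAC routine: the function name `band_plot_summary`, the quartile convention (Python's `statistics.quantiles` with `method="inclusive"`), and the data are all assumptions.

```python
import statistics
from collections import defaultdict


def band_plot_summary(x, y, x_delta):
    """Summarize y within classes of x of width x_delta.

    Returns one tuple per non-empty class:
    (class lower edge, minimum, lower quartile, median, upper quartile, maximum).
    """
    classes = defaultdict(list)
    for xi, yi in zip(x, y):
        classes[int(xi // x_delta)].append(yi)   # class index along the x axis
    summary = []
    for k in sorted(classes):
        vals = sorted(classes[k])
        if len(vals) == 1:
            q1 = med = q3 = vals[0]              # quartiles undefined for one point
        else:
            q1, med, q3 = statistics.quantiles(vals, n=4, method="inclusive")
        summary.append((k * x_delta, vals[0], q1, med, q3, vals[-1]))
    return summary


# Hypothetical readings versus angle (degrees), two 15-degree classes.
angle = [2, 7, 11, 14, 16, 20, 25, 29]
reading = [30.1, 30.5, 29.9, 30.3, 31.0, 31.4, 30.8, 31.2]
summary = band_plot_summary(angle, reading, 15)
for row in summary:
    print(row)
```

Connecting the medians (and, separately, the quartiles) across classes gives the band; here the second class sits visibly higher than the first, which is the kind of shift in location the band plot is meant to expose.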
The example that w i l l be used f o r the band p l o t w i l l demonstrate how i t can ( i n c e r t a i n s p e c i a l circumstances) be used t o t e s t randomness. The data s e t c o n s i s t e d o f 400 percentage measurements taken from a near-complete surface i n s p e c t i o n o f a c i r c u l a r a u s t e n i t e standard reference material specimen. A given reading i s the percentage a u s t e n i t e value f o r t h a t p a r t i c u l a r small sub-area o f the specimen. To t e s t the hypothesis t h a t the specimen was homogeneous ( t h a t i s , t h a t 2dimensional randomness e x i s t e d ) , a band p l o t o f the percentage a u s t e n i t e readings versus angle (from some reference r a d i a l spoke o f the c i r c u l a r specimen) was constructed. Figure 5 i l l u s t r a t e s the r e s u l t i n g band p l o t s when t h i s angle f a c t o r was d i v i d e d i n t o 24 c l a s s e s with a c l a s s width o f 15 degrees was used. Thus, a l l data i n a given 15 degree wedge were assembled and then summarize q u a r t i l e , upper q u a r t i l e s t a t i s t i c s were then p l o t t e d t o represent t h a t specimen wedge r a t h e r than p l o t t i n g a l l the data i n the wedge. I f the sample were homogeneous (with respect to a n g l e ) , the band p l o t should be n e a r - f l a t over the e n t i r e 360° range. As the p l o t i l l u s t r a t e s , t h i s i s not the case f o r these d a t a — t h e percentage a u s t e n i t e measurements tend t o be low i n the v i c i n i t y o f 135°, tend t o be high near 280°, and tend again t o be low near 330°. The p l o t c l e a r l y shows the (homogeneity) randomness assumptions t o be suspect f o r t h i s specimen. 2-VARIABLE GRAPHICAL ANALYSIS OF VARIANCE This technique i s a p p l i c a b l e when a m u l t i - f a c t o r model o f the f o l l o w i n g type i s suspected: response Y. . = constant c + B^.X^. +
B
2 j
+
e r r o r e
i j
with an a l t e r n a t i v e general model of the f o l l o w i n g form: response Y. . = "Ρ(Χ ·> X p 2
ΐΊ
+
e r r o r
e n
*j
where f is an unknown function relating the nonrandom variables $X_1$ and $X_2$ to the response variable Y, and where $X_{1i}$ and $X_{2j}$ indicate different discrete levels within the variables $X_1$ and $X_2$, respectively. Although the plot of Y versus $X_1$ will certainly give the analyst some indication of the nature of the function f, the main point is whether and how the response is additionally affected by the second variable (say variable $X_2$) for some (but not all) values of the variable $X_1$. If the variable $X_2$ affects the response for only some (but not all) of the values of the variable $X_1$, in statistical terms this is referred to as an "interaction" existing between $X_1$ and $X_2$; thus the effect of $X_2$ on the response is dependent on the value of $X_1$. Rather than apply the usual 2-factor analysis of variance (ANOVA) to data from this model, we apply the graphical procedure illustrated in figure 6. This graphical analysis of variance (GANOVA) (10,11) is simply a plot of the response variable versus one factor, with different levels of the second factor indicated by different types of plot characters within the plot. The GANOVA procedure is very revealing in that it communicates all of the latent relevant information in this 2-factor system. This technique is illustrated in figure 6, which plots the residuals from a fit of the response variable (days to failure) from a stress fatigue experiment versus lab (9 labs), with the value of the plot character representing the levels (3 levels) of a second variable (experiment configuration) which could (but hypothetically should not) affect the response. A plot character value [...] data point was generated [...]. The two independent variables are:
  laboratory (9 levels, plotted horizontally)
  configuration (3 levels, denoted by different plot characters)

[Figure: printer plot titled "Specimen Homogeneity"; CALL PLOTBD(Y,X,N,XDELTA).]
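The plot-character idea can be sketched as a small printer-style routine. The Python below is only an illustration under stated assumptions: the function name `ganova_text_plot`, the grid height, and the residual values are hypothetical, and this is not the routine or the data behind figure 6.

```python
def ganova_text_plot(y, factor1, factor2, height=11):
    """Return a printer-style plot of y versus the levels of factor1.

    Each column is one level of factor1; the character printed in a cell
    is the level of factor2. Points that fall in the same cell overwrite
    one another, much like overstriking on a line printer.
    """
    levels = sorted(set(factor1))
    lo, hi = min(y), max(y)
    span = (hi - lo) or 1.0                      # guard against constant y
    grid = [[" "] * len(levels) for _ in range(height)]
    for yi, f1, f2 in zip(y, factor1, factor2):
        row = round((hi - yi) / span * (height - 1))   # row 0 holds the largest y
        grid[row][levels.index(f1)] = str(f2)[0]
    body = ["|" + "".join(f" {c} " for c in row) for row in grid]
    axis = "+" + "---" * len(levels)
    labels = " " + "".join(f"{str(level):^3}" for level in levels)
    return "\n".join(body + [axis, labels])


# Hypothetical residuals: 3 labs, 2 configurations per lab.
labs = [1, 1, 2, 2, 3, 3]
config = [1, 2, 1, 2, 1, 2]
resid = [0.5, 0.4, 0.6, -2.0, 0.3, 0.5]

plot = ganova_text_plot(resid, labs, config)
print(plot)
```

In this made-up data set the character 2 printed far below the rest of the lab 2 column is the kind of lab-configuration feature the text describes reading off the plot.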
Referring to figure 6, it is seen that the assumption that all levels of the configuration factor affect the response in a uniform fashion is untenable. It is clear from the plot that a lab-configuration interaction exists. For example, configurations 2 and 3 yield consistently low values for lab 4, while configuration 1 yields low and rather variable values for labs 5 and 7. A suspiciously low observation is also seen for lab 3, configuration 2. Such an augmented plot, as described above, is a useful technique for examining the assumption that the response is not dependent on some particular variable.

It is to be noted in passing that although the plot character is the recommended procedure for conveying information about the second variable, one could just as well use the type of line for conveying the second-variable information. The former is recommended when generating computer printer plots, which are by nature discrete. The latter is recommended when a continuous printing device (i.e., one capable of drawing different types of continuous lines) is available. Figure 7 illustrates the line-type alternative with an example based on measured voltages from electrical connectors. The two independent variables here are:
  time in days (plotted horizontally)
  connector type (3 levels, denoted by different line types)
The multiple lines of the same type are due to experimental replication. Although much can be said about the plot and about
the relative effects of the two factors on the response, we concentrate herein on violations of basic assumptions and note that an apparent violation in the form of an outlier is evident from the plot: note how the fourth data point of the bottom line of the plot is inconsistent with the other data lines in this bottom group. This fourth point is clearly an outlier, and yet its detection might very easily have been lost in the numerical mechanics of a standard ANOVA.

3-VARIABLE GANOVA

This 3-variable GANOVA technique is applicable where a multifactor model is appropriate, i.e., the underlying hypothesized model is of the form (e.g., for three factors):

  response $Y_{ijk}$ = constant [...] + error $e_{ijk}$

with an alternative general model of the following:

  $Y_{ijk} = f(X_{1i}, X_{2j}, X_{3k}) + e_{ijk}$
with f unknown, where the doubly-subscripted B's refer to factor effects and the doubly-subscripted X's refer to coded dummy levels of each factor. Again, rather than apply the standard 3-factor analysis of variance (ANOVA) to data from this model, we apply the graphical procedure illustrated in figure 7. This graphical analysis of variance (GANOVA) (10,11) is a generalization of the type of plot discussed in section 6 and is defined simply as a plot of the response variable versus one factor, with different levels of the second factor indicated by different types of plot characters within the plot, and with the different levels of the third factor indicated by different types of lines within the plot. The 3-variable GANOVA conveniently communicates at a glance all of the relevant information in this 3-factor system.

Figure 8 illustrates the application of the 3-variable GANOVA to drill thrust force. These data are drawn from the excellent article by Hamaker (10), whose main point was exactly what we are emphasizing here, viz., that there exist valuable graphical alternatives to the usual ANOVA. The three independent variables are:
  drill speed (5 levels, plotted horizontally)
  material (2 levels, denoted by different plot characters)
  feed rate (3 levels, denoted by different line types)
Referring to figure 8, one concludes that variables 2 (material) and 3 (feed rate) both affect the response in a non-negligible
Figure 7. Three graphic analyses of variance for electrical connectors data (CALL GANOV3(Y,F1,F2,F3,N)). Voltage drop versus elapsed days; hot/cold copper open plug, 20 amp; type of line = connector type (brass = solid, steel = small dash, innovative = large dash).

Figure 8. Three graphic analyses of variance for Hamaker (10) (CALL GANOV3(Y,F1,F2,F3,K)). Drill thrust force data; thrust force (y) vs. drill speed (x); plot character = material (2 levels); type of line = feed rate (3 levels). Source: Review of the International Statistical Institute.
way. The apparent existence of an outlier is also evident from the plot. It is to be noted that, due to the necessary use of line types, the 3-variable GANOVA can be done only with continuous printing devices.

YOUDEN PLOT

The Youden plot (12,13) is a useful graphical technique most commonly applicable to interlaboratory experiments when there exist exactly 2 runs (or 2 specimens, or 2 levels of some particular factor, etc.) to be tested for a certain property of interest. Ideally, laboratories are "alike," [...] (1); whereas if a laboratory effect existed, an appropriate model might be:

  response $Y_{ij}$ = constant c + $L_i$ + error $e_{ij}$

(where $L_i$ represents an effect due to laboratory i, i = 1, 2, ..., k); and additionally, if both a laboratory effect and a run effect existed, an appropriate model might be:

  response $Y_{ij}$ = constant c + $L_i$ + $R_j$ + error $e_{ij}$

(where $R_j$ represents a run effect for run j, j = 1, 2). To test which model is appropriate, a Youden plot is applied, which is defined as a plot of the k (where k = the number of laboratories in the experiment) coordinate pairs

  $(Y_{11}, Y_{12}), (Y_{21}, Y_{22}), \ldots, (Y_{k1}, Y_{k2})$,

where $Y_{ij}$ represents the measured value obtained from laboratory i (i = 1, 2, ..., k) on run j (j = 1, 2).
To facilitate the graphical analysis, the plot character is again used to "pack" in extra information, in this case about the laboratory factor. Thus, e.g., a plot character of 4 indicates that the measurement in question came from laboratory 4. The Youden plot is illustrated in figure 9 as applied to data from an ASTM stress corrosion experiment in which k = 7 laboratories were being tested. If no laboratory or run effects exist, the resulting Youden plot will appear as a random 2-dimensional scatter of points. Alternatively, if laboratory and/or run effects do exist, much useful information about the nature of such effects can be gleaned from the resulting plot. The plot in figure 9 actually is based on 7 × 5 = 35 plot points (not all of which
appear due to computer printer overstriking). The multiplicity of 5 is due to the existence of 5 replications per lab; such replications pose no problems in utilizing the Youden plot.

With respect to how to interpret a Youden plot, several characteristics are to be noted. A displacement of points from the same laboratory along the 45° diagonal indicates that this laboratory is consistently generating low (or high) readings relative to the other laboratories (the cluster of laboratory 4 points in figure 9 is illustrative of this negative laboratory bias). On the other hand, a cluster of points from the same laboratory displaced off the diagonal represents inconsistent readings by that laboratory from one run to the next. Figure 9 indicates [...] variability problem [...] are consistently higher than those for run 2. The Youden plot is a simple yet extremely effective method for analyzing interlaboratory data.
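The two readings of a Youden plot described above (displacement along the diagonal versus displacement off it) can be put in numerical form. The following Python is a minimal sketch, not a procedure given in the text: centering each run at its median is an assumption made here, the function name `youden_components` is hypothetical, and the five laboratory values are invented for illustration.

```python
import math
import statistics


def youden_components(run1, run2):
    """Split each (run 1, run 2) pair into along-diagonal and off-diagonal parts.

    Each run is centered at its median (an assumption of this sketch), so the
    45-degree diagonal passes through the "typical" laboratory.
    """
    c1, c2 = statistics.median(run1), statistics.median(run2)
    parts = []
    for y1, y2 in zip(run1, run2):
        u, v = y1 - c1, y2 - c2
        along = (u + v) / math.sqrt(2)    # consistent low/high bias of the lab
        across = (u - v) / math.sqrt(2)   # disagreement between the lab's two runs
        parts.append((along, across))
    return parts


# Hypothetical data: one value per laboratory per run; lab 4 reads low on both runs.
run1 = [10.1, 9.9, 10.0, 8.0, 10.2]
run2 = [10.0, 10.1, 9.9, 8.1, 10.0]

parts = youden_components(run1, run2)
lab4_along, lab4_across = parts[3]
print(f"lab 4: along = {lab4_along:.2f}, across = {lab4_across:.2f}")
```

A large negative `along` value with a small `across` value corresponds to a cluster displaced down the diagonal, which is the pattern the text associates with a negative laboratory bias.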
EXAMINING DISTRIBUTIONAL INFORMATION

The discussion has already touched on three (randomness, fixed location, fixed variation) of the four assumptions typically made about a measurement process. The fourth assumption (fixed distribution) will now be addressed. From a statistical point of view, there are five reasons why distributional information should be routinely checked:

1. optimal estimators for location and variation parameters;
2. validity of critical values used in statistical tests of significance;
3. assessment of goodness of fit in regression;
4. existence of outliers;
5. assessment of whether the measurement process is in control.
The last of these reasons (assessment of whether a measurement process is in statistical control) is the main one with respect to the overall purpose of this paper. The first four reasons provide additional motivation for checking distributional assumptions, and will be individually touched on at this time.
The first point (optimal estimators) refers to the case where one is interested in estimating from a given data set the location parameter c and the variation (dispersion or scale) parameter σ in the model described in eq. (1). (It is assumed that the error $e_i$ is a random variable with mean 0 and (unknown) standard deviation σ.) Various estimators of c would, for example, include the usual sample mean of n observations ($\hat{c} = \sum Y_i / n$), the sample median ($\hat{c}$ = the middle observation in the ordered set of observations), or the sample midrange ($\hat{c}$ = the average of the smallest and largest observations). It is a statistical "fact of life" that in estimating location and variation parameters, the goodness (accuracy) of a particular estimator and the choice of an optimal estimator are dependent on the underlying distribution which generated the data. If that distribution were bell-shaped (normal or Gaussian), the best estimator of c would be the sample mean. However, if the underlying distribution were uniform (i.e., it had a flat rather than bell-shaped probability function), it can be theoretically demonstrated that the sample midrange, $\hat{c}$ = (smallest + largest)/2, is a much more accurate estimator of c than the sample mean. Alternatively, if the underlying distribution for the data were, e.g., very "long-tailed" like the Cauchy (i.e., the probability function is bell-shaped but higher valued in the tails than the normal), then theory dictates and practice confirms that the sample median is a much more accurate estimator of c than either the sample mean or the sample midrange.
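The dependence of the best location estimator on the underlying distribution can be checked with a small simulation. The Python below is a rough sketch under assumptions that are not from the text: the sample size (25), the number of replications (2000), the seed, and the use of the median absolute error as the accuracy measure (chosen because the Cauchy has no finite variance) are all illustrative choices, and the helper names are hypothetical.

```python
import math
import random
import statistics


def midrange(xs):
    """Average of the smallest and largest observations."""
    return (min(xs) + max(xs)) / 2


ESTIMATORS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "midrange": midrange,
}

# Each sampler draws one observation from a distribution centered at c = 0.
SAMPLERS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "uniform": lambda rng: rng.uniform(-1.0, 1.0),
    "cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
}


def typical_error(sampler, estimator, n=25, reps=2000, seed=1):
    """Median absolute error of an estimator of c = 0 over repeated samples."""
    rng = random.Random(seed)          # same seed: every estimator sees the same samples
    errors = []
    for _ in range(reps):
        sample = [sampler(rng) for _ in range(n)]
        errors.append(abs(estimator(sample)))
    return statistics.median(errors)


def best_estimator(dist):
    """Name of the estimator with the smallest typical error for this distribution."""
    scores = {name: typical_error(SAMPLERS[dist], est) for name, est in ESTIMATORS.items()}
    return min(scores, key=scores.get)


for dist in SAMPLERS:
    print(dist, "->", best_estimator(dist))
```

The theory summarized in the paragraph above predicts the mean for the normal, the midrange for the uniform, and the median for the Cauchy; the simulation is only a numerical check of that ordering, not a proof.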
Thus, it is seen that for estimating the constant c in the simplest possible response model (Y = c + e), a necessary preliminary step is to "estimate" the underlying distribution. Although the central limit theorem provides a theoretical basis for suggesting that for many physical science experiments the normal distribution "should" be the underlying distribution, such normality should never be automatically assumed. As will be seen in the remaining sections, statistical techniques do exist which allow the analyst to easily and routinely check such distributional models.

The second reason why distributional information should be checked deals with the validity of test statistics. In the multifactor statistical techniques referred to as regression and analysis of variance, there are a variety of test statistics (mostly t and F statistics) which are applied to test the significance of various factors in the multifactor model. It is an important statistical fact that the validity of these test statistics holds only if the residuals (deviations) after the fit are normally distributed. That is to say, it is the distributional characteristic of the residuals after the fit that dictates the validity of the t and F statistics. If the true underlying distribution of the residuals is non-normal, this will affect the true significance levels of the test statistics. The net result is that ultimately the conclusions about the
significance of various factors in regression and ANOVA may be incorrect. Again, as emphasized before, no blind assumptions need be made about the distribution of such residuals. Techniques will be demonstrated to allow the distribution to be routinely checked.

The third reason for checking distributional information is related to the aforementioned regression and ANOVA. The point to be emphasized is that an additional important reason for examining the distribution of residuals after the fit is to determine whether or not one has arrived at a reasonable deterministic or functional model for the data. If the fitted regression or ANOVA model is correct, the residuals after the fit should ideally have the same four properties as have been previously discussed [...] variable, viz.:
  random
  fixed location
  fixed variation
  fixed distribution
In a large majority of cases, the residuals after the correct fit will not only follow some fixed distribution, but will also rather specifically follow a normal distribution. The implication of course is that in order to assess whether or not one has a correct fit, one ought to examine the distribution of the residuals to check for such normality. Though not a sufficient condition in itself for adequate fit, the normality of the residuals serves as a practical necessary condition which may profitably be used in determining model adequacy. From a pragmatic point of view, this third reason for examining distributional information is an extremely important one.
The fourth reason for checking distributional information deals with the outlier problem. How does one tell if a suspicious-looking observation is in fact an outlier? ("Outlier" as here used refers to an observation that was generated from a different model or a different distribution than was the main "body" of the data.) Frequently, an outlier will manifest itself in one or another of the plots already discussed in previous sections. However, an additional and at times more sensitive check is given by a detailed examination of the distribution of the data. An observation which appears to be a borderline outlier in some previous plots frequently turns out to be a well-defined outlier when examined relative to the distribution of the rest of the data. The same numerical observation may very well be a "typical" extreme observation relative to one distribution but an outlier relative to another distribution. By examining the distribution of the data (and/or the residuals after a fit), the analyst gains a much more sensitive tool for outlier detection and identification.
The fifth and final point with respect to the importance of checking for distributional information deals with the main point of this paper: predictability and the determination of whether a process is "in control." Predictability means being able to make probability statements about future output from the process. These probability statements will most commonly refer to expected variation (about some typical value) of output from the process. The main point is that such probability statements will change depending on the true underlying distribution of the process. A statement such as: "97-1/2% of the future observations from this measurement process should fall within (approximately) 3 standard deviation [...]" will of course be true if the underlying generating distribution is normal but on the other hand will be false if the underlying distribution is (for example) uniform, Cauchy or exponential. It is important for analysts to keep in mind that for non-normal distributions, a probability statement about expected future occurrences (e.g., within two standard deviations of the mean) will change from distribution to distribution. The exact probability value (= 97-1/2% for the normal) must be (and can be) determined once the underlying distribution is determined. It is a recurring requirement to "estimate" the underlying distribution.

With these motivations and justifications for examining distributional information, the next two sections will present various data analysis techniques to carry out such examinations.

PROBABILITY PLOTS

A probability plot (14,15,16,17,18,19,20,21) is a graphical tool for assessing the goodness of fit of some hypothesized distribution (e.g., normal, uniform, Poisson, etc.) to an observed data set. In describing a probability plot, it will be assumed that the model is as indicated in eq. (1). However, it is to be kept in mind that the probability plot technique has much greater generality inasmuch as it can be applied to the residuals after any multifactor fit as well as to the raw observations from the simple $Y_i = c + e_i$ model.
A probability plot is (in general) simply a plot of the observed ordered (smallest to largest) observations $Y_i$ on the vertical axis versus the corresponding typical ordered observations based on whatever distribution is being hypothesized. Thus, for example, if one were forming a normal probability plot, the following n coordinate plot points would
be formed: $(Y_1, M_1), (Y_2, M_2), \ldots, (Y_n, M_n)$, where $Y_1$ is the observed smallest data point, and $M_1$ is the theoretical "expected" value of the smallest data point from a sample of n normally distributed points. Similarly, $Y_2$ would be the second smallest observed value and $M_2$ would be the "expected value" of the second smallest observation in a sample of n normally distributed points. This proceeds up to $Y_n$, which would be the largest observed data value, and $M_n$, which would be the "expected value" of the largest observation in a sample of size n from a normal distribution. Thus, in forming a normal probability plot, the vertical axis values depend only on the observed data, while the horizontal axis values are generated independently of the observed data and depend only on the theoretical distribution being tested or hypothesized (normality in this case) and also on the value of the sample size n. In simplest terms, a probability plot is a plot of "observed" versus "expected."
The crux of the probability plot is that the i-th ordered observation in a sample of size n from some distribution is itself a random variable which has a distribution unto itself. This distribution of the i-th ordered observation can be theoretically derived and summarized (i.e., mapped into a single "typical value") as can any other random variable. One can then pose the relevant question as to what single number best typifies the distribution associated with a given ordered observation in a sample of size n. A computational disadvantage to the use of the mean is that different integration techniques may be needed for different types of distribution. For some distributions the integral defining the mean does not exist. These considerations dictate that the median is superior to the mean in terms of forming a theoretical "expected" or "typical" value to summarize the entire distribution of the i-th ordered observation in a sample of size n from the distribution being tested. Thus, to be precise, the $M_i$ on the horizontal axis of the probability plot is taken to be the median of the distribution of the i-th ordered observation in a sample of size n from whatever underlying distribution is being tested. It is to be noted that the set of $M_i$ as a whole will change from one hypothesized distribution to another, and therein lies the distributional sensitivity of the probability plot technique.
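The order statistic medians $M_i$ can be computed directly. The Python below is a sketch resting on two standard facts that the text does not spell out and that are therefore assumptions of this example: the i-th smallest of n uniform(0,1) observations is below x exactly when at least i observations are below x (a binomial tail probability), and a median passes unchanged through an increasing quantile function. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def uniform_order_stat_median(i, n, tol=1e-12):
    """Median of the i-th smallest of n uniform(0,1) observations.

    P(U_(i) <= x) is a binomial tail sum; bisection finds the x at which
    that probability equals 1/2.
    """
    def cdf(x):
        return sum(math.comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(i, n + 1))

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def order_stat_medians(n, quantile=lambda p: p):
    """M_1..M_n for the distribution with the given quantile (inverse CDF) function."""
    return [quantile(uniform_order_stat_median(i, n)) for i in range(1, n + 1)]


uniform_M = order_stat_medians(7)                        # hypothesized uniform(0,1)
normal_M = order_stat_medians(7, NormalDist().inv_cdf)   # hypothesized standard normal

print([round(m, 3) for m in uniform_M])
print([round(m, 3) for m in normal_M])
```

Only a quantile function is needed, not an integral, which is in line with the computational argument for the median made above.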
For example, if the hypothesized distribution is uniform, then a uniform probability plot would be formed and the $M_i$ will be approximately equi-spaced to reflect the flat nature of the uniform probability density function. On the other hand, if the hypothesized distribution is normal, then the $M_i$ will have a rather sparse spacing for the first few ($M_1, M_2, M_3, \ldots$) and last few ($\ldots, M_{n-2}, M_{n-1}, M_n$) values but will become more densely spaced as one proceeds toward the middle of the set ($\ldots, M_{(n-1)/2}, M_{n/2}, M_{(n+1)/2}, \ldots$). Such behavior for the $M_i$ is of course reflecting the bell shape of the normal probability density function.
In summary, for a specific hypothesized distribution $D_0$, the i-th value $M_i$ in the corresponding probability plot is a theoretical (but computable) value close to what one typically would "expect" for the value of the i-th ordered observation if in fact one had taken a random sample of size n from the distribution $D_0$.

How does one use and interpret probability plots? In light of the above, it is seen that if in fact the observed data do have the distribution that the analyst has hypothesized, then (except for an unimportant location and scale factor which can be determined after the fact) $Y_i \approx M_i$ for all i, that is, over the entire set, and so the plot of $Y_i$ versus $M_i$ will be near-linear. This linearity is the dominant feature to be checked for in any probability plot. A linear probability plot indicates that the hypothesized distribution $D_0$ gives a good distributional fit to the observed data set. This combination of simplicity of use along with distributional sensitivity makes the probability plot an extremely powerful tool for data analysis.
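The construction of a normal probability plot, together with the location and scale factors read off afterward, can be sketched as follows. This Python is illustrative only and makes several assumptions not found in this section: the uniform order statistic medians are taken from a commonly used closed-form approximation rather than computed exactly, the location and scale are taken as the intercept and slope of an ordinary least-squares line through the plot points, and the function names and data are hypothetical.

```python
from statistics import NormalDist, fmean


def normal_medians(n):
    """Approximate normal order statistic medians M_1..M_n (assumed approximation)."""
    top = 0.5 ** (1.0 / n)                       # median of the largest uniform value
    p = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    p[0], p[-1] = 1.0 - top, top
    return [NormalDist().inv_cdf(q) for q in p]


def probability_plot_fit(data):
    """Return the plot points (Y_i, M_i) and a least-squares line Y = a + b*M."""
    y = sorted(data)                             # ordered observations, vertical axis
    m = normal_medians(len(y))                   # horizontal axis, independent of the data
    ybar, mbar = fmean(y), fmean(m)
    sxy = sum((a - mbar) * (b - ybar) for a, b in zip(m, y))
    sxx = sum((a - mbar) ** 2 for a in m)
    syy = sum((b - ybar) ** 2 for b in y)
    slope = sxy / sxx                            # scale factor
    intercept = ybar - slope * mbar              # location factor
    r = sxy / (sxx * syy) ** 0.5                 # closeness of the plot to a straight line
    return list(zip(y, m)), intercept, slope, r


# Hypothetical data lying exactly on a line: location 50, scale 3 (given in reverse order).
data = [50.0 + 3.0 * m for m in reversed(normal_medians(20))]
points, location, scale, r = probability_plot_fit(data)
print(round(location, 6), round(scale, 6), round(r, 6))
```

With real measurements the points will scatter about the line; the correlation `r` is one convenient numerical summary of how near-linear the plot is, used here purely as an illustrative choice.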
The next logical question to be examined is what the probability plot will look like if the hypothesized distribution $D_0$ is not correct, i.e., if the underlying distribution that generated the data is not the same as the distribution $D_0$ hypothesized by the analyst. In this case, the $Y_i$ and $M_i$ will not match over the entire set and so the resulting probability plot will be nonlinear. A very useful aspect of the probability plot is that the type of nonlinearity exhibited by a given probability plot will give the analyst useful information as to how the distributional hypothesis $D_0$ should be adjusted so as to arrive at a better distributional fit to the data. This last point is an important asset of the probability plot technique for testing distributional assumptions.

For example, if the analyst believes that the true underlying distribution is in general a symmetric distribution (i.e., a distribution which has a probability function as illustrated in fig. 10) as opposed to a skewed distribution (e.g., with a probability function as illustrated in fig. 11), then the probability plot analysis to be presently described is rather typical. The first step in such analysis is usually to test the normal distribution hypothesis (the normal being the most commonly employed symmetric distribution) by forming a normal probability plot. In forming such a plot, let us consider the following five most commonly encountered appearances of the normal probability plot: linear, S-shaped, N-shaped, nonsymmetric cross-over, and convex (see fig. 12).
If the normal probability plot has the linear appearance of figure 12a, this indicates that the normal distribution yields an acceptably good fit to the data; so no further probability plots need be formed and the distribution analysis is completed.

If the normal probability plot has the S-shaped appearance of figure 12b, this indicates that the $D_0$ = normal hypothesis is incorrect, and that the true underlying distribution for the data is symmetric but is shorter-tailed than normal. Examples of such symmetric distributions, shorter-tailed than normal, would be a U-shaped distribution, a uniform distribution, or a truncated bell-shaped distribution. (These three distributions have probability functions as illustrated in fig. 13.) In such a case, the second iteration by the analyst would be to form an additional probability plot for a shorter-tailed distribution (e.g., a uniform probability plot). If the uniform probability plot is still S-shaped, the third iteration is to form a probability plot for a distribution that is even shorter-tailed than uniform (e.g., some U-shaped distribution). On the other hand, if the uniform probability plot has a form as in figure 12c (which will be represented very crudely as an "N shape"), the third iteration would be to form a probability plot for some distribution shorter-tailed than normal but longer-tailed than uniform. Such iteration is continued until there is convergence to an acceptable linear probability plot. In practice, the analysis will usually converge to an acceptable distribution in a relatively small number of iterations.
To consider another possibility, if the original normal probability plot has the "N-shaped" appearance of figure 12c, this suggests that the D = normal hypothesis is incorrect and that the true underlying distribution for the data is still symmetric but is longer-tailed than normal. An example would be the Cauchy (also known as the Lorentzian) distribution, which is a bell-shaped distribution whose "tails" are "longer" or "fatter" than the normal. Figure 10d illustrates the probability density function for the Cauchy distribution. The typical nature of long-tailed distributions like the Cauchy is that, if the measurement process is generating data from such a distribution, it is more likely to generate some observations which are considerably removed from the "body" of the data than in sampling from a more moderate-tailed distribution (such as the normal). As before, since the original normal probability plot was not linear, the analyst should perform the iterative analysis to produce a longer-tailed probability plot (like a Cauchy probability plot). If this second plot is linear, this implies that the Cauchy yields an acceptable distribution. If this second plot is not linear, other iterations on D must be made based on the S-shaped or N-shaped appearance of the Cauchy probability plot. A routine computerized procedure to carry out such iterations for the symmetric family of distributions will be presented in section 11.
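The first step of this iteration (reading a normal probability plot as linear, shorter-tailed, or longer-tailed) can be sketched in a few lines. This is only an illustrative sketch, not the computerized procedure the text refers to: the plotting positions (i - 0.5)/n, the function names, the "fit a line to the central half and look at the two extreme points" rule, and the tolerance `tol` are all assumptions introduced here. It also assumes a symmetric, non-constant sample of at least a dozen or so points.

```python
import math
from statistics import NormalDist, mean

def normal_plot(data):
    """Return (M, Y): normal quantiles at assumed plotting positions, ordered data."""
    y = sorted(data)
    n = len(y)
    # Assumed plotting positions (i - 0.5)/n; other conventions exist.
    m = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return m, y

def tail_diagnosis(data, tol=0.1):
    """Crude, hypothetical rule: compare the extreme points with a line
    fitted by least squares to the central half of the plot."""
    m, y = normal_plot(data)
    n = len(y)
    lo, hi = n // 4, n - n // 4
    mc, yc = m[lo:hi], y[lo:hi]
    mbar, ybar = mean(mc), mean(yc)
    slope = (sum((a - mbar) * (b - ybar) for a, b in zip(mc, yc))
             / sum((a - mbar) ** 2 for a in mc))
    intercept = ybar - slope * mbar
    # How far the largest point lies above the line, and the smallest below it.
    top = y[-1] - (intercept + slope * m[-1])
    bottom = (intercept + slope * m[0]) - y[0]
    score = (top + bottom) / (slope * (m[-1] - m[0]))
    if score > tol:
        return "longer-tailed than normal"
    if score < -tol:
        return "shorter-tailed than normal"
    return "approximately linear"

# Idealized samples: exact quantiles of three symmetric distributions.
n = 101
p = [(i - 0.5) / n for i in range(1, n + 1)]
uniform_like = p
normal_like = [NormalDist().inv_cdf(q) for q in p]
cauchy_like = [math.tan(math.pi * (q - 0.5)) for q in p]

for name, sample in [("uniform", uniform_like),
                     ("normal", normal_like),
                     ("Cauchy", cauchy_like)]:
    print(name, "->", tail_diagnosis(sample))
```

A "shorter-tailed" verdict corresponds to the S-shaped case in the text and would send the analyst toward the uniform or a U-shaped distribution; a "longer-tailed" verdict corresponds to the N-shaped case and would send the analyst toward the Cauchy.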
Figure 12. Typical shapes of probability plots. (a.) Linear; (b.) S-shaped; (c.) N-shaped; (d.) nonsymmetric crossover; (e.) convex.
Figure 13. Distributions shorter-tailed than normal. a. Tukey λ = 1.5 distribution (very short-tailed); b. uniform distribution (short-tailed); c. truncated normal distribution (moderate/short-tailed); d. normal distribution (moderate-tailed).
If the original normal probability plot has the appearance of figure 12d, where the diagonal line divides the data points on either side unequally, or of figure 12e, where the diagonal line does not divide the data at all, this indicates not only that the specific hypothesis D = normal may be incorrect but also that the hypothesis of a symmetric distribution may be incorrect. In such a case, the true underlying distribution for the data would then be some type of skewed distribution (e.g., of the types with probability density functions as illustrated in figure 11). In forming additional probability plots to fit the data, the analyst should consequently consider distributions which are skewed. To enumerate but a few of the skewed distributions that might be considered in subsequent iterations, one would include the log-normal distribution, the half-normal distribution, the exponential [...] of distributions, [...] the Pareto family of distributions. For an excellent general description of various distributions and distributional families (both skewed and symmetric) the reader is referred to the comprehensive texts by Johnson and Kotz (22,23).
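For a skewed candidate such as the log-normal, one simple way to form the plot is to take logarithms and then build an ordinary normal probability plot of the logged data. The sketch below illustrates that idea only; it is not claimed to be the procedure behind the source's own log-normal plots, and the plotting positions (i - 0.5)/n, the function names, and the idealized sample are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean

def corr(x, y):
    """Product moment correlation coefficient of two equal-length sequences."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lognormal_plot(data):
    """Return (M, log Y): normal quantiles versus ordered log-observations."""
    if min(data) <= 0:
        raise ValueError("log-normal plot needs strictly positive data")
    log_y = sorted(math.log(v) for v in data)
    n = len(log_y)
    # Assumed plotting positions (i - 0.5)/n.
    m = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return m, log_y

# Idealized (hypothetical) log-normal sample built from exact quantiles.
n = 50
ideal = [math.exp(3.9 + 0.5 * NormalDist().inv_cdf((i - 0.5) / n))
         for i in range(1, n + 1)]

m, log_y = lognormal_plot(ideal)
r_log = corr(m, log_y)          # linearity of the log-normal plot
r_raw = corr(m, sorted(ideal))  # linearity of the plain normal plot
print(f"log-normal plot r = {r_log:.5f}, normal plot r = {r_raw:.5f}")
```

For right-skewed data of this kind the plain normal plot bends, while the plot of the logged data straightens out.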
One final point regarding outlier detection is noteworthy. If in forming, for example, a normal probability plot, the plot turns out to be linear with the exception of one or two points (see fig. 14), how is this to be interpreted? This type of plot indicates that the normal fit is acceptable for most of the data but that one or two points are outliers and do not seem to agree with the normality assumption. The probability plot is thus seen to be usable for detecting outliers. The next step in the analysis is for the analyst to delete the one or two offending points and to form a probability plot with the remaining points. If this second plot is still strongly linear, this gives additional support to the hypothesis that the data are normally distributed and that the one or two questionable points are in fact outliers. The use of the probability plot as a tool for outlier detection is generally more sensitive than any of the techniques discussed in previous sections. The experimenter is also reminded that although such outliers may be deleted from further analysis, these outliers exist "for a reason," and the experimenter ought to satisfy himself that he has determined what set of experimental circumstances had led to them. The examination of outliers almost invariably leads to improved design of the experiment and ultimately to an improved understanding of the experimental factors which prevent a measurement process from being "in control."

Having discussed what a probability plot is and how one is to be interpreted, we now enumerate briefly some of the advantages of using a probability plot as opposed to other methods of checking for distributional information (e.g., histogram, χ² statistic, fit to probability density function).
Although it will be shown that the probability plot technique is to be highly recommended, the various techniques are complementary. An outline of the advantages of the probability plot approach is as follows:

Graphical Technique

The probability plot is a graphical technique and so benefits from all of the advantages of graphics as outlined at the end of section 1.

Easy to Use

The dominant feature to be checked in a probability plot is linearity. This is [...] easily detectable [...] probability plots is no longer a problem.

Applicable to a Wide Range of Distributions

The probability plot technique can be applied to a wide range of distributions, certainly to all distributions commonly encountered in practice. These distributions cover both the continuous (e.g., normal) and the discrete (e.g., Poisson) types. They include the normal (Gaussian), uniform, various U-shaped distributions, Cauchy, logistic, half-normal, log-normal, exponential, gamma, beta, Weibull, extreme value, Pareto, binomial, Poisson, geometric, and negative binomial. For each such distribution D, the same uniform approach applies in interpreting the resulting probability plot; viz., check for linearity and, if the plot is nonlinear, adjust the hypothesized distribution D according to the type of nonlinearity encountered.
No a priori Location and Variation Estimates Needed

One problem associated with the χ² goodness of fit techniques and with the empirical technique of superimposing a fitted probability density function over a histogram of the data is that a priori values of the parameters (usually location and variation) are needed before the technique can actually be applied. This is frequently impractical for two reasons:

1. Such known values for the parameters are rarely available.

2. Accurate estimates for the parameters can only be obtained after the distribution has been "estimated" rather than before.
Since the probability plot technique does not need a priori values to be applied, it is superior to, and definitely far more practical than, the χ² and fitted probability density function methods for distributional testing.
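The reason no a priori location or variation values are needed can be seen numerically: shifting and rescaling the data moves and stretches the plot but leaves its straightness unchanged. The following sketch checks this with the correlation between ordered data and normal quantiles as the straightness measure. The data values, the plotting positions (i - 0.5)/n, and the function names are made up for illustration.

```python
import math
from statistics import NormalDist, mean

def corr(x, y):
    """Product moment correlation coefficient of two equal-length sequences."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def normal_plot_linearity(data):
    """Correlation between ordered data and standard normal quantiles."""
    y = sorted(data)
    n = len(y)
    # Assumed plotting positions (i - 0.5)/n; no location or scale is supplied.
    m = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return corr(m, y)

# Hypothetical readings, then the same readings in different units and offset.
raw = [2.1, 3.4, 1.9, 2.8, 3.0, 2.5, 2.2, 3.7, 2.6, 2.9]
rescaled = [1000.0 + 50.0 * v for v in raw]

r_raw = normal_plot_linearity(raw)
r_rescaled = normal_plot_linearity(rescaled)
print(r_raw, r_rescaled)  # the two values agree up to rounding error
```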
Automatic Estimate of Location and Variation Obtained

An additional advantage of applying the technique is that estimates of location and scale parameters are automatically produced as a secondary output. These location and variation estimates are derivable, respectively, from the vertical axis intercept and the slope of the resulting probability plot. Although the analyst is reminded that such location and variation estimates are not to be considered as the optimal (minimum variance) estimates [...] practical indication [...] should be.

No Grouping of Data Need be Done

A problem associated with the histogram technique (whereby the analyst simply forms a histogram of the data and notes its general shape without applying or fitting a specific distribution to it) for gathering distributional information is that of choosing the grouping interval (the class width) for the histogram. The appearance of the resulting histogram is rather strongly affected by the choice of this class width. A class width which is "too narrow" will result in a histogram in which the true distributional shape is obscured by excessive variability in the height of the bar associated with each class; a class width which is "too wide" will result in a histogram in which the true distributional shape is obscured by "leakage" across neighboring classes, so that the distributional content for a given class will be "smeared" out over several classes. Although rules of thumb do exist for choosing a reasonable class width, this nevertheless calls for an intermediate judgment to be made by the analyst. The use of the probability plot technique eliminates the need for such a choice. Inasmuch as a probability plot uses each observation individually and requires no grouping, it frees the analyst from making choices about class widths and eliminates (if the wrong class width happens to have been chosen) a possible undesirable approach-dependency in the ultimate conclusions. The net positive effect of the probability plot is that it allows a distributional analysis to be performed in a completely direct and automatic fashion with no intermediate decisions (such as class width) to be made by the analyst. Thus, the conclusions from the distributional analysis will reflect only the content of the data and will avoid possible biases introduced by the analysis.
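The location and variation estimates described above (intercept and slope of the plot) can be illustrated with an ordinary least-squares line through the plotted points. This is a sketch under stated assumptions: the least-squares fit, the plotting positions (i - 0.5)/n, and the function names are choices made here, and, as the text cautions, such estimates are not the optimal ones. Note that each observation is used individually, with no class width to choose.

```python
from statistics import NormalDist, mean

def normal_plot(data):
    """Return (M, Y): standard normal quantiles and ordered observations."""
    y = sorted(data)
    n = len(y)
    # Assumed plotting positions (i - 0.5)/n.
    m = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return m, y

def intercept_and_slope(m, y):
    """Least-squares line Y = intercept + slope * M."""
    mbar, ybar = mean(m), mean(y)
    slope = (sum((a - mbar) * (b - ybar) for a, b in zip(m, y))
             / sum((a - mbar) ** 2 for a in m))
    return ybar - slope * mbar, slope

# Idealized (hypothetical) normal sample with location 10 and scale 2.
n = 60
sample = [NormalDist(10.0, 2.0).inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

m, y = normal_plot(sample)
location_estimate, scale_estimate = intercept_and_slope(m, y)
print(f"location = {location_estimate:.4f}, scale = {scale_estimate:.4f}")
```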
Obtain Feedback Information for Improved Distributional Fit

Any statistic by its very nature attempts to map information (about some characteristic of interest) latent in the entire data set into a single number. Needless to say, such a mapping must be very selective (in order to be sensitive), and so invariably a considerable amount of ancillary information will be lost in the mapping. The above is true for test statistics in general and for any distributional test statistic in particular. Thus, for example, the standardized third central moment may be sensitive to symmetry but is not sensitive to tail length; the standardized fourth central moment may be sensitive to tail length but not to symmetry; and the χ² statistic may be sensitive to the distribution being tested, but yields no information [...] the event of a poor distribution. Although not all distributional test statistics are equally bad, they are all necessarily "deficient" in the sense that information is irretrievably lost in the reduction of n numbers (the n observations) into one number (the test statistic value). This negative feature of test statistics is avoided in general by graphical procedures which utilize all of the individual data points. Thus, in the event that the original distribution D does not yield a good fit to the data, the analyst is able to improve the distributional fit. As discussed in detail previously, the shape of a given probability plot automatically gives the analyst the feedback information necessary to make another iteration in attempting to improve the distributional fit.
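The point about moment statistics can be made concrete with a small sketch. The standardized third central moment is essentially zero for each of three symmetric samples and so cannot tell them apart, whereas the standardized fourth central moment orders them by tail length. The idealized samples (exact quantiles at positions (i - 0.5)/n) and the function name are assumptions introduced for illustration.

```python
import math
from statistics import NormalDist, mean

def standardized_moment(data, k):
    """k-th central moment divided by the k-th power of the standard deviation."""
    mu = mean(data)
    variance = mean([(v - mu) ** 2 for v in data])
    return mean([(v - mu) ** k for v in data]) / variance ** (k / 2)

# Idealized symmetric samples of increasing tail length.
n = 200
p = [(i - 0.5) / n for i in range(1, n + 1)]
samples = {
    "uniform": p,
    "normal": [NormalDist().inv_cdf(q) for q in p],
    "logistic": [math.log(q / (1.0 - q)) for q in p],
}

third = {name: standardized_moment(s, 3) for name, s in samples.items()}
fourth = {name: standardized_moment(s, 4) for name, s in samples.items()}

for name in samples:
    print(f"{name:9s} third = {third[name]: .2e}   fourth = {fourth[name]:.3f}")
```

Each statistic answers one question and is silent on the others, which is the loss of information the text attributes to reducing n observations to a single number.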
Figures 15 through 21 demonstrate the use of the probability plot on various data sets. The first example is a data set consisting of 500 normal random numbers drawn from the Rand (24) Corporation random number tables. Even though it is known that the true D is normal, probability plots based on four hypothetical choices of distribution are given in figure 15. These four distributions (drawn from different regions of the tail length domain) are given for comparative purposes. The upper left plot is a uniform (short-tailed) probability plot, the upper right is a normal (moderate-tailed) probability plot, the lower left is a probability plot for the Tukey λ = -.5 distribution (25,26) (a moderate-long-tailed distribution), and the lower right is a Cauchy (long-tailed) probability plot. Note the linear character of the normal probability plot, as it should be since the considered data set is normal by construction. Note also the characteristic N- and S-shapes of the other three probability plots.
The second data example (fig. 16) consists of 500 random numbers generated from a uniform distribution. These random numbers were also based on the Rand (24) Corporation random number tables. Again, probability plots for the same four distributions are presented for comparative purposes. Note how the uniform probability plot is most linear (as it should be) and how the other three probability plots are decidedly nonlinear (as they should be). The above two data examples were included to illustrate the sensitivity of the probability plot technique.

The third example (fig. 17) is the 700 voltage readings collected from the Josephson junction cryothermometry experiment described in sections 2 and 3. Note how (even through the inherent discreteness of the data) it is seen that the normal distribution provides the best fit of the four distributions considered.

The fourth example [...] readings. The normal [...] this data set, although the hump in the normal line is suggestive of a location and correlation problem, as has already been discussed for this data set in sections 2 and 3.

The fifth example (fig. 19) is the 200 steel-concrete beam deflection readings when subjected to a periodic pressure. Note how the four probability plots are all S-shaped, and that the best fit (of the four hypothesized distributions) is provided by the uniform distribution. Note also how the uniform fit is obviously not fully linear; so another iteration is appropriate, in the shorter-tailed direction since the plot is S- rather than N-shaped. As it turned out, a short-tailed, U-shaped distribution resulted in a probability plot that was quite linear and so provided a much better fit to the data than any of the four distributions considered in figure 19.

Example 6 (fig. 20) is residuals from a rather complicated 100-variable least squares fit of x-ray crystallography data. None of the four probability plots is linear. From the nature of the transition between the N-shaped probability plots (uniform and normal) and the S-shaped probability plots (Tukey λ = -.5 distribution), there is seen a suggestion that the largest three data points are outliers. A subsequent examination of these three data points revealed that they were collected quite early in the experiment and so may possibly be reflecting an instrument warm-up effect.

Example 7 (fig. 21) is 53 yearly maximum wind speeds (27,28) for Valentine, Nebraska. The probability plots for the four usual symmetric distributions were all nonlinear in the fashion of figures 12d and 12e, and so suggested that some skewed distribution might better fit the data. The probability plot given is for one such skewed distribution, the log-normal distribution. It is seen that the probability plot is roughly linear but not optimally so, and some other skewed distribution might provide an improved distributional fit.
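Returning to the outlier question raised by Example 6, the delete-and-replot check described earlier for suspected outliers can be sketched as follows. The example does not use the crystallography residuals; it uses a made-up sample (exact normal quantiles plus one planted extreme value), and the plotting positions (i - 0.5)/n, the function names, and the use of a correlation as the measure of straightness are assumptions introduced here.

```python
import math
from statistics import NormalDist, mean

def corr(x, y):
    """Product moment correlation coefficient of two equal-length sequences."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def normal_plot_linearity(data):
    """Correlation between ordered data and normal quantiles."""
    y = sorted(data)
    n = len(y)
    # Assumed plotting positions (i - 0.5)/n.
    m = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return corr(m, y)

def linearity_before_and_after(data, suspects):
    """Straightness of the plot with all points, then with suspects deleted."""
    kept = list(data)
    for value in suspects:
        kept.remove(value)  # drops one occurrence of each suspect value
    return normal_plot_linearity(data), normal_plot_linearity(kept)

# Hypothetical sample: 40 idealized normal readings plus one planted outlier.
body = [NormalDist(100.0, 2.0).inv_cdf((i - 0.5) / 40) for i in range(1, 41)]
sample = body + [125.0]

before, after = linearity_before_and_after(sample, [125.0])
print(f"with suspect point: r = {before:.3f}; after deletion: r = {after:.3f}")
```

A marked improvement in straightness after deletion supports treating the point as an outlier, but, as the text stresses, the deleted point still calls for an explanation of the experimental circumstances that produced it.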
Figure 17a. Probability plots for voltage counts. Uniform. (Sample size N = 700; probability plot correlation coefficient = .95544.)
Figure 17b. Probability plots for voltage counts. Normal. (Sample size N = 700; probability plot correlation coefficient = .97484.)
Figure 17d. Probability plots for voltage counts. Cauchy. (Sample size N = 700; probability plot correlation coefficient = .42313.)
Figure 18a. Probability plots for wind velocities. Uniform.
Figure 18b. Probability plots for wind velocities. Normal.
Figure 18c. Probability plots for wind velocities. Tukey λ = -.5.
Figure 18d. Probability plots for wind velocities. Cauchy.
Figure 19d. Probability plots for beam deflections. Cauchy. (Sample size N = 200; probability plot correlation coefficient = .44084.)
Figure 20d. Probability plots for x-ray crystallography residuals. Cauchy. (Sample size N = 2419; probability plot correlation coefficient = .48098.)
Figure 21. Log-normal probability plot for Valentine, Nebraska annual maximum wind speeds. (53 measurements; plot generated by CALL LGNPLT(X,N).)
PROBABILITY PLOT CORRELATION COEFFICIENT

A natural extension of the probability plot concept is that of the probability plot correlation coefficient (14,15). The probability plot correlation coefficient is an attempt to summarize the linearity information latent in a probability plot into a single statistic. A natural choice for such a linearity statistic is the product moment correlation coefficient, r, which in general terms (for any two variables X and Y) is defined as:

$$ r = \mathrm{Corr}(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}} $$

where

$$ \bar{X} = \frac{\sum X_i}{n} \quad \text{and} \quad \bar{Y} = \frac{\sum Y_i}{n}. $$
If the variables X and Y are linearly related in a positive fashion (i.e., if Y = aX + b where a > 0), then r = 1. If the variables X and Y are linearly related in a negative fashion (i.e., if Y = aX + b where a < 0), then r = -1. If the variables X and Y are unrelated, then r = 0. The stronger the linear dependency between X and Y, the closer r will be to +1 (or -1). In a probability plot, the two "variables" of interest are the ordered observations Y and the "expected" (for a given distribution D) ordered observations M. The probability plot correlation coefficient is thus defined as r = Corr(Y, M). If the probability plot is near-linear, then r will be near unity. If the probability plot is nonlinear, then r will be appropriately smaller than unity. The value of the probability plot correlation coefficient r is thus a simple summary measure of the linearity of a given probability plot and hence a measure of the appropriateness of the distributional fit for the hypothesized distribution D.
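The definition r = Corr(Y, M) translates directly into code. In the sketch below the "expected" ordered observations M are approximated by evaluating the hypothesized distribution's quantile function at plotting positions (i - 0.5)/n; that approximation, the function names, and the idealized sample are assumptions made here, and the source's own choice of M may differ.

```python
import math
from statistics import NormalDist, mean

def corr(x, y):
    """Product moment correlation coefficient r = Corr(X, Y)."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ppcc(data, quantile):
    """Probability plot correlation coefficient for a hypothesized distribution,
    given its quantile (inverse cumulative distribution) function."""
    y = sorted(data)
    n = len(y)
    # Assumed stand-in for the expected ordered observations M.
    m = [quantile((i - 0.5) / n) for i in range(1, n + 1)]
    return corr(y, m)

def normal_quantile(p):
    return NormalDist().inv_cdf(p)

def uniform_quantile(p):
    return p

def cauchy_quantile(p):
    return math.tan(math.pi * (p - 0.5))

# The limiting cases quoted in the text.
r_positive = corr([1.0, 2.0, 3.0, 4.0], [5.0, 7.0, 9.0, 11.0])   # Y = 2X + 3
r_negative = corr([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -3.0, -5.0])  # Y = -2X + 3

# Idealized normal sample scored against three hypothesized distributions.
n = 100
sample = [NormalDist(5.0, 3.0).inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
r_normal = ppcc(sample, normal_quantile)
r_uniform = ppcc(sample, uniform_quantile)
r_cauchy = ppcc(sample, cauchy_quantile)
print(r_uniform, r_normal, r_cauchy)
```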
In p r a c t i c e the value o f r f o r a s i n g l e isolated d i s t r i b u t i o n a l f i t i s not important i n i t s e l f i n an absolute sense. Of more importance i s the r e l a t i v e values o f r f o r various members o f an a d m i s s i b l e t e s t s e t o f d i s t r i b u t i o n s . I f the four d i s t r i b u t i o n s (uniform, normal, Tukey λ = -.5, and Cauchy) from example 1 (RAND normal numbers--fig. 15) o f the previous s e c t i o n were considered, and i f the p r o b a b i l i t y p l o t c o r r e l a t i o n c o e f f i c i e n t f o r each o f the four d i s t r i b u t i o n s would be much c l o s e r t o u n i t y than would the p r o b a b i l i t y p l o t correlation coefficient f o r any o f t h e other three d i s t r i b u t i o n s . Used i n t h i s f a s h i o n , one i s l e d t o consider the Maximum P r o b a b i l i t y Plot Correlation Coefficient (MPPCC)
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
94
VALIDATION OF T H E M E A S U R E M E N T PROCESS
criterion as a reasonable one f o r choosing the "best" d i s t r i b u t i o n out of a s e t of admissible d i s t r i b u t i o n s . The "best" as defined above w i l l , of course, mean t h a t d i s t r i b u t i o n which y i e l d s the "most l i n e a r " p r o b a b i l i t y p l o t . Several examples are now presented which i l l u s t r a t e the use of the Maximum P r o b a b i l i t y Plot Correlation Coefficient criterion. Reverting t o example 1 (Rand (24) normal random numbers) as described p r e v i o u s l y , the reader i s d i r e c t e d to f i g u r e 22 which presents c a l c u l a t e d values of the p r o b a b i l i t y plot correlation coefficient f o r 44 s e l e c t e d symmetric d i s t r i b u t i o n s . These 44 d i s t r i b u t i o n s have been s e l e c t e d so as to present a dense t a i l - l e n g t h coverage i n the important symmetric d i s t r i b u t i o n case. The λ values i n the t a b l e are references t o s p e c i f i ^ y (25,26). The wide range o f t a i l lengths and are ordered from the very shorttailed distributions a t top t o the very long-tai led d i s t r i b u t i o n s a t bottom. The values of λ between 2.0 and 1.0 represent very s h o r t - t a i l e d U-shaped d i s t r i b u t i o n s , λ = 1.0 i s i d e n t i c a l t o the uniform d i s t r i b u t i o n . Values of λ smaller than 1 and l a r g e r than 0 are truncated bell-shaped d i s t r i b u t i o n s which are short t o moderate-tailed i n nature, λ = 0 i s e x a c t l y the l o g i s t i c d i s t r i b u t i o n . Values of λ smaller than 0 represent infinite-domain bell-shaped distributions. The normal, l o g i s t i c , double exponential and Cauchy d i s t r i b u t i o n s have been appropriately positioned i n the t a i l - l e n g t h ordering. 
The probability plot correlation coefficient values for each of the 44 hypothesized distributions are given on the far right of the table. Figure 22 is a single page which summarizes the linearity information of 44 different symmetric distributional probability plots. Note how the probability plot correlation coefficients are smaller at the top and bottom and reach a maximum at the normal distribution (as they should, since the data set was normal random numbers). The usefulness of such an analysis (which of course is completely computerized) is that the analyst can quickly determine not only the best-fitting distribution (in this case, normal), but also a subset of good-fitting distributions (in this case, distributions in the normal neighborhood) for the data set at hand. After narrowing the selection to an appropriately small neighborhood of distributions which have large values of the probability plot correlation coefficient, it is recommended that the analyst generate and scrutinize the actual probability plots for a few of the distributions in this neighborhood for outlier information and other possible anomalies.
As a second example, consider a data set consisting of 500 uniform random numbers based on the Rand (24) tables. The MPPCC criterion (see fig. 23) as applied to the uniform random numbers leads to the conclusion that the uniform distribution provides (as it should) the best distributional fit for these data.
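The two examples above can be imitated with a minimal sketch of the MPPCC scan. Several things here are assumptions rather than the author's program: the uniform order statistic medians use Filliben's published approximation, the samples are pseudo-random numbers from Python's generator rather than the Rand tables, and the candidate set is limited to the four distributions named earlier (uniform, normal, Tukey λ = -.5, Cauchy) instead of the 44 in the printouts.

```python
import math
import random
from statistics import NormalDist


def uniform_order_stat_medians(n):
    """Filliben's approximation to the medians of the n uniform order statistics."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def correlation(x, y):
    """Ordinary product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, ppf):
    """Correlation between ordered data and the order statistic medians of ppf."""
    ordered = sorted(data)
    medians = [ppf(p) for p in uniform_order_stat_medians(len(ordered))]
    return correlation(ordered, medians)


# Percent point functions of the four candidate distributions.
CANDIDATES = {
    "uniform": lambda p: p,
    "normal": NormalDist().inv_cdf,
    "Tukey lambda = -0.5": lambda p: (p ** -0.5 - (1.0 - p) ** -0.5) / -0.5,
    "Cauchy": lambda p: math.tan(math.pi * (p - 0.5)),
}


def mppcc(data, candidates=CANDIDATES):
    """Return the candidate with the largest PPCC, plus all the scores."""
    scores = {name: ppcc(data, ppf) for name, ppf in candidates.items()}
    return max(scores, key=scores.get), scores


rng = random.Random(1977)  # arbitrary seed, for repeatability only
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
uniform_sample = [rng.random() for _ in range(500)]

for label, sample in (("normal", normal_sample), ("uniform", uniform_sample)):
    best, scores = mppcc(sample)
    print(f"{label} sample: best fit = {best}")
    for name, r in scores.items():
        print(f"    {name:<20s} {r:.5f}")
```

Because the correlation coefficient is invariant to location and scale, each candidate needs only its standardized percent point function; no location or scale parameters are estimated before the comparison.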
For the Josephson Junction voltage counts data (see fig. 24), the MPPCC criterion indicated that the normal distribution was the best fit, which is in agreement with the already-seen linearity of the normal probability plot (fig. 17). For the wind velocity data (see fig. 25), the MPPCC criterion indicated that the normal distribution (though not precisely optimal) was nevertheless near-optimal, which is in agreement with the near-linearity of the normal probability plot (fig. 18). The MPPCC criterion as applied to the beam deflection data is presented in figure 26. As expected, the best-fit symmetric distribution to this data set is in the U-shaped distribution region. The analysis of the x-ray crystallography residuals (see fig. 27) also confirms what was previously suspected, viz., that the best distribution is in the moderate to moderately long-tailed region.

The final example illustrates the use of the MPPCC for a skewed distributional family, in this case the extreme value family. The data set considered is annual maximum wind speeds at Walla Walla, Washington. In a fashion similar to the symmetric family analysis, a representative set of 46 members of the extreme value distributional family was selected and the probability plot correlation coefficient was computed for each. The results of the analysis indicated that an extreme value distribution with shape parameter γ = 7 yields the best fit.
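The same scan carries over to a skewed family, as the following sketch suggests. It rests on several assumptions not stated in the text: the family is parametrized as F(x) = exp(-x^(-γ)) for x > 0, the grid of γ values is hypothetical (the 46 members actually used are not listed), Filliben's approximation supplies the uniform order statistic medians, and the sample is synthetic because the Walla Walla wind speeds are not reproduced here.

```python
import math
import random


def uniform_order_stat_medians(n):
    """Filliben's approximation to the medians of the n uniform order statistics."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def correlation(x, y):
    """Ordinary product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def extreme_value_ppf(gamma):
    """Percent point function of F(x) = exp(-x**(-gamma)) (assumed parametrization)."""
    return lambda p: (-math.log(p)) ** (-1.0 / gamma)


def ppcc(data, ppf):
    """Correlation between ordered data and the order statistic medians of ppf."""
    ordered = sorted(data)
    medians = [ppf(p) for p in uniform_order_stat_medians(len(ordered))]
    return correlation(ordered, medians)


def best_shape(data, gammas):
    """Scan the shape parameter and return the value with the largest PPCC."""
    scores = {g: ppcc(data, extreme_value_ppf(g)) for g in gammas}
    return max(scores, key=scores.get), scores


# Synthetic stand-in for the wind speed data: inverse-transform sampling
# from the assumed family with gamma = 7 (seed chosen arbitrarily).
rng = random.Random(7)
sample = [extreme_value_ppf(7.0)(rng.random()) for _ in range(200)]

gamma_grid = range(1, 26)  # hypothetical grid, not the author's 46 members
best, scores = best_shape(sample, gamma_grid)
print(f"best-fitting shape parameter on this grid: gamma = {best}")
print(f"PPCC at that value: {scores[best]:.5f}")
```

With a finite random sample the maximizing γ need not equal the generating value exactly, which is one reason to inspect the neighborhood of large coefficients rather than the single maximum.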
The use of the MPPCC criterion is recommended not as a replacement for examination of individual probability plots, but rather as an important complement to such analyses. The automated procedures presented above allow the analyst to quickly "converge" to a neighborhood of distributions which provide good fits to the data set under examination.

4-PLOT ANALYSIS

The general question posed in this section is as follows: Given that one would like to prepare a completely automated (computerized) first-pass analysis that would be applicable to a wide variety of data sets and which would take no more than one computer page, what would one include on that single page? As has been stressed throughout this paper, the basic assumptions which must be tested to assure that a measurement process is in control include randomness, fixed location, fixed variation, and fixed distribution. The value of the run sequence plot and the lag-1 autocorrelation plot has already been amply discussed with respect to the first three points. The probability plot has been discussed at length with respect to the last point. And so the following 4-plot analysis is presented as an automated
Each row of the printout reads: THE CORRELATION BETWEEN THE 700 ORDERED OBS. AND THE ORDER STAT. MEDIANS FROM THE [distribution] IS [correlation].

DISTRIBUTION             CORRELATION
LAMBDA =  2.0 DIST.      .95543
LAMBDA =  1.9 DIST.      .95447
LAMBDA =  1.8 DIST.      .95374
LAMBDA =  1.7 DIST.      .95308
LAMBDA =  1.6 DIST.      .95265
LAMBDA =  1.5 DIST.      .95244
LAMBDA =  1.4 DIST.      .95249
LAMBDA =  1.3 DIST.      .95267
LAMBDA =  1.2 DIST.      .95329
LAMBDA =  1.1 DIST.      .95407
LAMBDA =  1.0 DIST.      .95544
LAMBDA =   .9 DIST.      .95706
LAMBDA =   .8 DIST.      .95911
LAMBDA =   .7 DIST.      .96161
LAMBDA =   .6 DIST.      .96446
LAMBDA =   .5 DIST.      .96737
LAMBDA =   .4 DIST.      .97048
LAMBDA =   .3 DIST.      .97315
LAMBDA =   .2 DIST.      .97492
NORMAL DISTRIBUTION      .97484
LAMBDA =   .1 DIST.      .97434
LOGISTIC DIST.           .97045
DOUBLE EXP. DIST.        .95613
LAMBDA =  -.1 DIST.      .96028
LAMBDA =  -.2 DIST.      .94076
LAMBDA =  -.3 DIST.      .90818
LAMBDA =  -.4 DIST.      .85924
LAMBDA =  -.5 DIST.      .79357
LAMBDA =  -.6 DIST.      .71518
LAMBDA =  -.7 DIST.      .63128
LAMBDA =  -.8 DIST.      .55017
LAMBDA =  -.9 DIST.      .47743
CAUCHY DISTRIBUTION      .42313
LAMBDA = -1.0 DIST.      .41619
LAMBDA = -1.1 DIST.      .36553
LAMBDA = -1.2 DIST.      .32518
LAMBDA = -1.3 DIST.      .29315
LAMBDA = -1.4 DIST.      .26772
LAMBDA = -1.5 DIST.      .24749
LAMBDA = -1.6 DIST.      .23128
LAMBDA = -1.7 DIST.      .21820
LAMBDA = -1.8 DIST.      .20752
LAMBDA = -1.9 DIST.      .19871
LAMBDA = -2.0 DIST.      .19139

Figure 24. Printout plot correlation coefficient analysis for Josephson Junction cryothermometry voltage counts
Each row of the printout reads: THE CORRELATION BETWEEN THE 200 ORDERED OBS. AND THE ORDER STAT. MEDIANS FROM THE [distribution] IS [correlation].

DISTRIBUTION             CORRELATION
LAMBDA =  2.0 DIST.      .99255
LAMBDA =  1.9 DIST.      .99308
LAMBDA =  1.8 DIST.      .99351
LAMBDA =  1.7 DIST.      .99383
LAMBDA =  1.6 DIST.      .99405
LAMBDA =  1.5 DIST.      .99417
LAMBDA =  1.4 DIST.      .99416
LAMBDA =  1.3 DIST.      .99403
LAMBDA =  1.2 DIST.      .99374
LAMBDA =  1.1 DIST.      .99326
LAMBDA =  1.0 DIST.      .99255
LAMBDA =   .9 DIST.      .99155
LAMBDA =   .8 DIST.      .99015
LAMBDA =   .7 DIST.      .98822
LAMBDA =   .6 DIST.      .98558
LAMBDA =   .5 DIST.      .98198
LAMBDA =   .4 DIST.      .97703
LAMBDA =   .3 DIST.      .97026
LAMBDA =   .2 DIST.      .96097
NORMAL DISTRIBUTION      .95408
LAMBDA =   .1 DIST.      .94825
LOGISTIC DIST.           .93092
DOUBLE EXP. DIST.        .88949
LAMBDA =  -.1 DIST.      .90760
LAMBDA =  -.2 DIST.      .87677
LAMBDA =  -.3 DIST.      .83721
LAMBDA =  -.4 DIST.      .78846
LAMBDA =  -.5 DIST.      .73138
LAMBDA =  -.6 DIST.      .66839
LAMBDA =  -.7 DIST.      .60309
LAMBDA =  -.8 DIST.      .53933
LAMBDA =  -.9 DIST.      .48027
CAUCHY DISTRIBUTION      .44084
LAMBDA = -1.0 DIST.      .42864
LAMBDA = -1.1 DIST.      .38292
LAMBDA = -1.2 DIST.      .34520
LAMBDA = -1.3 DIST.      .31404
LAMBDA = -1.4 DIST.      .28852
LAMBDA = -1.5 DIST.      .26771
LAMBDA = -1.6 DIST.      .25074
LAMBDA = -1.7 DIST.      .23687
LAMBDA = -1.8 DIST.      .22548
LAMBDA = -1.9 DIST.      .21608
LAMBDA = -2.0 DIST.      .20828

Figure 26. Printout plot correlation coefficient analysis for beam deflection
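A short sketch shows how the best-fitting distribution and its neighborhood of good fits might be read off such a printout. The correlations are those printed in Figure 26 for the beam deflection data (three entries are restored from broken spacing in this copy: .99155, .87677, .21608). The cutoff of 0.001 below the maximum is a hypothetical choice, since the text does not define the neighborhood numerically.

```python
# Distribution labels in the printout's top-to-bottom (short- to long-tailed) order.
labels = (
    [f"lambda = {k / 10:.1f}" for k in range(20, 1, -1)]   # 2.0 down to 0.2
    + ["normal", "lambda = 0.1", "logistic", "double exponential"]
    + [f"lambda = {-k / 10:.1f}" for k in range(1, 10)]     # -0.1 down to -0.9
    + ["Cauchy"]
    + [f"lambda = {-k / 10:.1f}" for k in range(10, 21)]    # -1.0 down to -2.0
)

# Probability plot correlation coefficients from the Figure 26 printout.
coefficients = [
    .99255, .99308, .99351, .99383, .99405, .99417, .99416, .99403,
    .99374, .99326, .99255, .99155, .99015, .98822, .98558, .98198,
    .97703, .97026, .96097, .95408, .94825, .93092, .88949, .90760,
    .87677, .83721, .78846, .73138, .66839, .60309, .53933, .48027,
    .44084, .42864, .38292, .34520, .31404, .28852, .26771, .25074,
    .23687, .22548, .21608, .20828,
]
assert len(labels) == len(coefficients) == 44

# MPPCC criterion: the distribution with the largest coefficient.
best_value, best_label = max(zip(coefficients, labels))

# Neighborhood of good fits: within a hypothetical tolerance of the maximum.
tolerance = 0.001
neighborhood = [
    label for label, r in zip(labels, coefficients) if r >= best_value - tolerance
]

print(f"best fit: {best_label} (r = {best_value:.5f})")
print("neighborhood:", ", ".join(neighborhood))
```

The maximum falls among the λ values between 1.0 and 2.0, that is, in the U-shaped region, in line with the conclusion stated in the text for these data.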
Each row of the printout reads: THE CORRELATION BETWEEN THE 2419 ORDERED OBS. AND THE ORDER STAT. MEDIANS FROM THE [distribution] IS [correlation]. The 44 distributions appear in the following order: LAMBDA = 2.0 down to LAMBDA = .2 in steps of .1; NORMAL DISTRIBUTION; LAMBDA = .1; LOGISTIC DIST.; DOUBLE EXP. DIST.; LAMBDA = -.1 down to LAMBDA = -.9 in steps of .1; CAUCHY DISTRIBUTION; LAMBDA = -1.0 down to LAMBDA = -2.0 in steps of .1.
s
I s IS IS IS IS IS IS IS IS IS IS IS IS IS IS
Is IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS IS
I S IS IS IS IS IS IS IS
.89619 .89466 .89334 .89226 .89146 .89100 .89095 .89136 .89231 .89389 •89619 .89932 .90339 .90854 .91490 .92259 .93173 .94233 .95428 .96258 .96710 .97965 .98785 .98960 . 9 9 2 6 5 MAX .98206 .94981 .89075 .80825 .71434 .62288 .54283 .48098 .47729 .42510 .38433 .35243 .32730 .30730 .29118 • 27805 .26722 .25820 .25061
Figure 27. Printout plot correlation coefficient analysis for x-ray crystallography residuals
BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN EETWEEN BETWEEN BETWEEN BETWEEN BETWEEN BETWEEN
Ο
"S.
3
Co Co
ce Ci*
2
w M
VALIDATION OF T H E M E A S U R E M E N T
In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
PROCESS
2.
FiLLiBEN
Testing Basic Assumptions
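The kind of scan reported in Figure 27 can be sketched in a few lines. The sketch below is only illustrative and is not the DATAPAC code: it assumes Filliben's (1975) approximation for the uniform order statistic medians and the usual percent point function of Tukey's symmetric lambda family, and the function names (`ppcc`, `scan_lambda`) and the demonstration data are hypothetical.

```python
import math
from statistics import fmean


def uniform_order_stat_medians(n):
    """Approximate medians of the n uniform order statistics (Filliben, 1975)."""
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = 0.5 ** (1.0 / n)
    medians[0] = 1.0 - medians[-1]
    return medians


def tukey_lambda_ppf(p, lam):
    """Percent point function of Tukey's symmetric lambda distribution."""
    if abs(lam) < 1e-12:  # the limit lambda -> 0 is the logistic distribution
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


def correlation(x, y):
    """Ordinary product-moment correlation coefficient."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, lam):
    """Correlation between ordered data and lambda order statistic medians."""
    ordered = sorted(data)
    medians = [tukey_lambda_ppf(p, lam)
               for p in uniform_order_stat_medians(len(ordered))]
    return correlation(ordered, medians)


def scan_lambda(data, lambdas):
    """Return the lambda with the largest correlation, plus all the scores."""
    scores = {lam: ppcc(data, lam) for lam in lambdas}
    return max(scores, key=scores.get), scores


# Hypothetical demonstration data: exact lambda = -0.5 medians,
# NOT the x-ray crystallography residuals of the figure.
grid = [round(-2.0 + 0.1 * k, 1) for k in range(41)]
demo = [tukey_lambda_ppf(p, -0.5) for p in uniform_order_stat_medians(200)]
best, scores = scan_lambda(demo, grid)
print("best-fitting lambda:", best)
```

Under these assumptions lambda = 1 and lambda = 2 both give a percent point function that is linear in p (a uniform distribution), so the two correlations coincide.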
comprehensive first-pass tool for data analysis. The four plots are as follows:

1. run sequence plot
2. lag-1 autocorrelation plot
3. histogram
4. normal probability plot

The normal probability plot was chosen because the normality assumption is most commonly employed. The histogram is included as an additional graphical point of reference in case the normality assumption is [...] of about a dozen useful summarizing statistics is included on this single page. This particular 4-plot analysis has proved to be invariably informative in terms of assessing the validity of the underlying assumptions in a measurement process. The analysis can be applied both to raw response data and to residuals after a multifactor (e.g., regression, ANOVA) fit. The technique is recommended only as a first pass in a data analysis and should be complemented by more detailed analysis.

The application of this technique to several examples is now discussed. The first example (fig. 29) is the 500 Rand (24) normal random numbers of figures 15 and 22. The run sequence plot indicates fixed location and variation. The lag-1 autocorrelation plot indicates randomness. The histogram indicates a bell-shaped symmetric distribution. The normal probability plot indicates normality.

The second example (fig. 30) is the 700 Josephson junction cryothermometry voltage counts of figures 17 and 24. The run sequence plot indicates fixed location and variation and also the rather discrete nature of the data. The lag-1 autocorrelation plot indicates randomness and reinforces the discrete aspect of the data; the histogram indicates symmetry and a bell shape; the normal probability plot indicates normality.

The third example (fig. 31) is the 200 beam deflections data of figures 19 and 26. The run sequence plot indicates fixed location and variation and perhaps a single (high) outlier. The lag-1 autocorrelation plot indicates well-defined nonrandomness and additional evidence for an outlier. The histogram indicates symmetry and a U-shaped [...] (which indicates that a distribution shorter-tailed than normal is needed) and additional outlier evidence. A remodeling to take into account the dominant autocorrelation structure of the data is clearly called for in this case.
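The numerical ingredients of such a 4-plot are simple to assemble. The following is a minimal sketch, not the DATAPAC implementation: it assumes Filliben's (1975) approximation for the order statistic medians, leaves the actual drawing to whatever plotting tool is at hand, and uses hypothetical function names and sample values.

```python
from statistics import NormalDist


def uniform_order_stat_medians(n):
    """Approximate medians of the n uniform order statistics (Filliben, 1975)."""
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = 0.5 ** (1.0 / n)
    medians[0] = 1.0 - medians[-1]
    return medians


def run_sequence(y):
    """Points (i, y_i): the observations in the order they were taken."""
    return [(i, v) for i, v in enumerate(y, start=1)]


def lag1_pairs(y):
    """Points (y_{i-1}, y_i) for the lag-1 autocorrelation plot."""
    return list(zip(y[:-1], y[1:]))


def histogram(y, bins=10):
    """Return (lower edge, bin width, counts) for equal-width bins."""
    lo, hi = min(y), max(y)
    width = (hi - lo) / bins or 1.0  # guard against constant data
    counts = [0] * bins
    for v in y:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return lo, width, counts


def normal_probability_plot(y):
    """Points (normal order statistic median, ordered observation)."""
    normal = NormalDist()
    medians = [normal.inv_cdf(p) for p in uniform_order_stat_medians(len(y))]
    return list(zip(medians, sorted(y)))


def four_plot_data(y, bins=10):
    """Collect the coordinates needed for all four panels."""
    return {
        "run_sequence": run_sequence(y),
        "lag1": lag1_pairs(y),
        "histogram": histogram(y, bins),
        "normal_probability": normal_probability_plot(y),
    }


# Hypothetical measurements, for illustration only.
sample = [2.0, 1.0, 4.0, 3.0, 5.0]
panels = four_plot_data(sample)
print(panels["lag1"])
```

A roughly linear normal probability panel supports normality; structure in the lag-1 panel points to nonrandomness, as in the beam deflection example.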
The fourth example (fig. 32) is the residuals from the x-ray crystallography fit. The run sequence plot indicates fixed location but larger variation (or perhaps a few high outliers) at the beginning of the set. The lag-1 autocorrelation plot indicates randomness but gives additional evidence for the existence of outliers. The histogram indicates symmetry and a bell shape but also suggests longer-than-normal tails. The normal probability plot indicates non-normality: the N-shape implies that a distribution longer-tailed than normal is needed. The elongated upper right tail of the normal probability plot also gives further corroboration of the possible existence of outliers, as seen initially in the run sequence plot.

The last example (fig. 33) is the 50 spectrophotometry measurements of transmittance [...] section 4. The run sequence [...] in the second half of the data. The histogram is rather nondescript aside from its definite bimodal character. The normal probability plot indicates non-normality, and a shorter-tailed distribution is needed. A remodeling to take into account the dominant autocorrelation structure of the data is clearly needed.

CONCLUSION

In any given measurement process, the ultimate concern is predictability, that is, being able to make probability statements about future output from the process. To achieve such predictability, a measurement process must be "in control." This will be the case when the output from the process behaves like random drawings from some fixed distribution with fixed location and fixed variation. The core of this paper has been a discussion of various techniques for testing the four components (randomness, fixed location, fixed variation, and fixed distribution) in the above definition of "in control." These four components are implicitly assumed in the analysis of all measurement processes.

The randomness assumption helps assure that the experimentalist does in fact have as many independent realizations of the phenomenon of interest as is believed. If the randomness assumption is untenable, this frequently indicates either the existence of some extraneous variable (which has not yet been accounted for) or, at times, an overly fast sampling rate.

The fixed location and fixed variation assumptions help assure that the process is stable in the simplest sense. A process which is drifting in either its typical value (location) or its typical spread (variation) cannot be considered "in control," and such drift certainly negates any possibility of generating probability (predictability) statements about future output from
the process. Problems associated with drifting in either location or variation are generally traceable, from an analysis point of view, to the existence of an unaccounted-for variable which is influencing the response. From an experimentalist's point of view, this variable may be (to name just a few) instrumental, procedural, or environmental in nature and almost always necessitates a careful, detailed review of the experimental process as a whole. References (1), (2) and (13) are relevant in this regard.

The fixed distribution assumption is particularly important. In many (but not all) physical science experiments, the normal distribution is an appropriate distributional model. If normality is theoretically appropriate, and if the distributional tests confirm such normality, this supports the conclusion that the measurement process is indeed "in control." On the other hand, if normality is appropriate and yet the distributional tests indicate non-normality, then this is usually another indicator that the response is being influenced by some other nonrandom variable which the analyst has not yet taken into account.

Two distinct and opposite situations may be identified with respect to this assumption. First, the nature of the measurement process may be such that there exists an a priori theoretical basis which dictates what distribution the output from the process should follow; and second, no such a priori theory exists for predicting what distribution the output from the process will follow.

In the first case, the normal distribution is often (but not universally) theoretically appropriate; other commonly occurring distributional models are the Poisson distribution (e.g., for counting processes) and the Weibull/extreme value distributions (e.g., for lifetimes of fracture processes). If a certain type of distribution is theoretically appropriate, and if the distributional tests as described in sections 10 and 11 confirm such a distribution, then this gives considerable support to the conclusion that the measurement process is indeed "in control." On the other hand, if the tests indicate a best-fit distribution that is different from theory, this is usually another indication that the response is being influenced by some other variable which the analyst has not yet taken into account.

In the second case (when the theoretical distribution is unknown), the analyst has less definitive information upon which to act. Practically speaking, in such a case, a best-fit distribution close to normal is frequently an indication of stability in the measurement process. On the other hand, a best-fit distribution which is very long-tailed (e.g., one in the vicinity of the Cauchy distribution) is almost invariably an indication of a process which still has unresolved sources of variation.

In any event, for both of the above cases (known versus unknown theoretical distributions), it is seen that one of the obvious pieces of information which must be reported in the summary description of any measurement process is the nature of the best-fit distribution. The importance of providing such distributional information is reaffirmed when it is recalled that the ultimate objective in a measurement process is predictability, and in this regard the analyst is constantly brought back to the fact that it is the underlying distribution (either known or estimated) which must be employed to form such prediction (probability) statements.

With respect to the techniques for testing randomness, location, variation, and distributional aspects of the data, it is noted that most were graphical in nature. The techniques have wide applicability because they can be applied not only to the raw data from univariate models, but also to the residuals after the fit in multi-factor models (e.g., regression, ANOVA). With the exception of the graphical ANOVA technique, the procedures covered have been implemented as stand-alone subroutines in the portable DATAPAC FORTRAN (6,7) data analysis package, which is available from the author.

ACKNOWLEDGMENT

The author is indebted to the many NBS scientists whose data have served as the basis for the various examples used in this paper. It is to be noted that the examples have been chosen to illustrate specific statistical techniques and to emphasize particular statistical anomalies. Such anomalies are most frequently uncovered in the preliminary investigation of a measurement process, as was the case in many of the examples cited. The author is also pleased to acknowledge the valuable comments offered by Joseph Cameron, Joan Rosenblatt, James DeVoe and Judy Gilsinn, all members of the NBS Technical Staff.

ABSTRACT

This paper concerns itself with the important problem of testing the validity of the basic assumptions in a measurement process. The paper covers four principal areas in this regard: 1) the assumptions that are typically made in a measurement process, 2) the consequences to conclusions drawn from a measurement process if the assumptions do not hold, 3) theoretical statistical tests for the checking of basic assumptions, and 4) practical tools to facilitate the checking of basic assumptions. Examples of assumption-checking on data drawn from the chemical and physical sciences are included.
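The randomness, fixed location, and fixed variation components of "in control" described in the Conclusion can also be screened numerically. The sketch below is a rough illustration under stated assumptions, not one of the paper's procedures: it uses the lag-1 autocorrelation coefficient for randomness and a simple first-half versus second-half comparison for location and variation. The function names and demonstration series are hypothetical, and the outputs are screening numbers rather than formal tests.

```python
import math
from statistics import fmean, stdev


def lag1_autocorrelation(y):
    """Lag-1 autocorrelation coefficient; near 0 for a random sequence."""
    mean = fmean(y)
    num = sum((a - mean) * (b - mean) for a, b in zip(y[:-1], y[1:]))
    den = sum((v - mean) ** 2 for v in y)
    return num / den


def half_sample_checks(y):
    """Compare the first and second halves of the run for drift.

    Assumes at least four observations and non-constant halves.
    """
    half = len(y) // 2
    first, second = y[:half], y[half:]
    shift = fmean(second) - fmean(first)
    s1, s2 = stdev(first), stdev(second)
    se_of_shift = math.sqrt(s1 ** 2 / len(first) + s2 ** 2 / len(second))
    return {
        "location_shift": shift,                    # near 0 for fixed location
        "shift_in_se_units": shift / se_of_shift,   # large values suggest drift
        "sd_ratio": s2 / s1,                        # near 1 for fixed variation
    }


# Hypothetical series: one strongly nonrandom, one drifting in location.
alternating = [1.0, -1.0] * 50
drifting = [float(i) for i in range(100)]

r1 = lag1_autocorrelation(alternating)
drift = half_sample_checks(drifting)
print("lag-1 autocorrelation of alternating series:", r1)
print("half-sample checks on drifting series:", drift)
```

A value of the lag-1 coefficient far from zero, a large location shift in standard error units, or a standard deviation ratio far from one would each prompt the closer graphical examination recommended in the paper.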
Literature Cited

1. Cameron, Joseph M., Measurement Assurance, Journal of Quality Technology, (1976), Vol. 8, No. 1, p 53.
2. Cameron, Joseph M., Procedures for the Assurance of the Adequacy of Sound Level Measurements. National Bureau of Standards Technical Note 931. Chapter 4 of Environmental Effects on Microphones and Type 2 Sound Level Meters (edited by Edward B. Magrab), (1976), p 63.
3. Eisenhart, C., Realistic Evaluation of the Precision and Accuracy of Instrument Calibration Systems. Journal of Research of the National Bureau of Standards-C. Engineering and Instrumentation, (1963), Vol. 67C, No. 2, p 161.
4. Cleveland, William S. and Kleiner, Beat, A Graphical Technique of Enhancing Scatterplots With Moving Statistics. Technometrics, (1975), Vol. 17, No. 4, p 447.
5. Currie, Lloyd A., Filliben, James J. and DeVoe, James R., Statistical and Mathematical Methods in Analytical Chemistry, Analytical Chemistry Reviews, (1972), Vol. 44, No. 5, p 479R.
6. Filliben, James J., DATAPAC: A Data Analysis Package. Proceedings of the Ninth Interface on Computer Science and Statistics, Prindle, Weber and Schmidt, Inc., Boston, (1977), p 212.
7. Filliben, James J., A User's Guide to the DATAPAC Data Analysis Package. National Bureau of Standards Technical Note (in preparation) (1977).
8. Himmelblau, David M., Process Analysis by Statistical Methods, John Wiley, New York, (1968), p 78.
9. Levene, H. and Wolfowitz, J., The Covariance Matrix of Runs Up and Down, The Annals of Mathematical Statistics, (1944), Vol. 15, p 58.
10. Hamaker, H. C., New Techniques of Statistical Teaching. Review of the International Statistical Institute, (1971), Vol. 39, No. 3, p 351.
11. Filliben, James J., User's Guide to DATAPLOT: A System for Interactive Graphical Analysis. National Bureau of Standards Technical Note (in preparation) (1977).
12. Youden, W. J., Graphical Analysis of Interlaboratory Test Results. Industrial Quality Control, (1959), Vol. 15, No. 11, p 24. (Reprinted in Journal of Quality Technology, (1972), Vol. 4, No. 1, p 29.)
13. Youden, W. J., Experimental Design and ASTM Committees. Materials Research and Standards, (1961), Vol. 1, No. 11, p 862.
14. Filliben, James J., Techniques for Tail Length Analysis. Proceedings of the 18th Conference on the Design of Experiments in Army Research Development and Testing, October (1973), Part 2, ARO Report 73-2, p 425.
15. Filliben, James J., The Probability Plot Correlation Coefficient Test for Normality. Technometrics, (1975), Vol. 17, No. 1, p 111.
16. Ryan, Thomas A., Jr. and Joiner, Brian L., Normal Probability Plots and Tests for Normality, (1975), Pennsylvania State University Report.
17. Daniel, Cuthbert, Use of Half-Normal Plots in Interpreting Factorial Two-Level Experiments. Technometrics, (1959), Vol. 1, No. 4, p 311.
18. Hahn, G. and Shapiro, S., Statistical Methods in Engineering, John [...]
19. Mood, Alexander M. and Graybill, [...] to the Theory of Statistics, McGraw-Hill, New York (1963).
20. Nelson, W. and Thompson, V. C., Weibull Probability Papers. Journal of Quality Technology, (1971), Vol. 3, No. 2, p 45.
21. Wilk, M. and Gnanadesikan, R., Probability Plotting Methods for the Analysis of Data, Biometrika, (1968), Vol. 55, No. 1, p 1.
22. Johnson, Norman L. and Kotz, Samuel, Continuous Univariate Distributions-1 and 2, Houghton Mifflin Company, Boston (1970).
23. Johnson, Norman L. and Kotz, Samuel, Discrete Distributions, Houghton Mifflin Company, Boston (1969).
24. Rand Corporation, A Million Random Digits With 100,000 Normal Deviates, The Free Press, Glencoe, Illinois (1955).
25. Filliben, James J., Simple and Robust Linear Estimation of the Location Parameter of a Symmetric Distribution. Unpublished Ph.D. Dissertation, Princeton University (1969).
26. Joiner, B. L. and Rosenblatt, J. R., Some Properties of the Range in Samples from Tukey's Symmetric Lambda Distributions. Journal of the American Statistical Association, (1971), Vol. 66, p 394.
27. Simiu, Emil and Filliben, James J., Probability Distributions of Extreme Wind Speeds, Journal of the Structural Division, American Society of Civil Engineers, (1976), Vol. 102, No. ST9, p 1861.
28. Simiu, Emil and Filliben, James J., Statistical Analysis of Extreme Winds, NBS Technical Note 868, (1975), p 1.
3 Systematic Error in Chemical Analysis

L. A. CURRIE and J. R. DEVOE
Analytical Chemistry Division, Institute for Materials Research, National Bureau of Standards, Washington, DC 20234

*Contribution of the National Bureau of Standards. Not subject to copyright.

The fundamental limitation to accuracy in chemical analysis is systematic error. Unfortunately, systematic error--which comprises all nonrandom [...] the truth--is the rule in analytical chemistry. Systematic error comes about whenever the actual nature of the analytical process differs from that assumed. It results from invalid sampling, operator or equipment instability and blunders, unrecognized sample loss or contamination, poor instrument calibration, inadequate physical (mathematical) or random error distribution models, and faulty reporting of data. These problems, which will be covered in some detail below, are not exceptional. It is only through exhaustive, quantitative evaluation of the individual and collective effects of such violations in assumption that the analyst can hope to provide meaningful bounds for systematic error.

The impact of erroneous analytical measurements can be considerable. A recent New York Times article (1) entitled "Medical Labs May Not Be All That Accurate" pointed out that in a survey of the clinical laboratories involved in interstate commerce (and consequently under the monitoring of the Federal Center for Disease Control, USPHS), 31 percent were unable to identify sickle-cell anemia from blood smears. Additional tests, such as hemoglobin and electrolyte content in blood, were unsatisfactory in a similar fraction of laboratories. Naturally, this situation has resulted in some lack of confidence on the part of the physician, and confidence erosion can be dangerous. Instances have occurred where a test result deviated from the norm to such an extent that the physician, who ignored the result (assuming laboratory error when there was none), made an improper diagnosis with serious consequences to the patient.

Another example, of somewhat less immediate severity but greater long-term importance, is the measurement of ozone in the atmosphere. Figure 1 shows the deviations from the true concentration (2) of experimental results from a number of laboratories. Collectively, one sees that the laboratories produced results whose (negative) bias exceeds the imprecision bound. As a result of both systematic and random error components, reported O3 concentrations were too low by 20 percent to 60 percent (at the Air Quality Standard level). In this case the "true" concentration was provided to the testing laboratories in the form of an accurately prepared gaseous reference sample. This example raises an important point related to the role of reference materials for transferring accuracy from one laboratory to another. Though reference materials are exceedingly useful for disclosing laboratory error, they do not eliminate the need for the quantitative assessment of all potential sources of bias.
The overwhelming importance of the systematic component of error may be grasped from eq. (1):

total error:
$$ e = \delta + \Delta \tag{1a} $$

random error:
$$ \delta = z \cdot SE = z\,(\sigma/\sqrt{n}) \tag{1b} $$
where e represents the total error in x, and δ and Δ represent the random and systematic components, respectively.** If normality is assumed (Gaussian random error distribution), the random error is simply the product of the random normal deviate (z) and the standard error (SE). The standard error in x depends upon the precision parameter σ (standard deviation) and the number of replications n. With increased replication the standard error tends toward zero, with the result that the total error asymptotically approaches the bias--i.e., $e \to \Delta$. The ultimate capability of any analytical procedure thus rests upon the magnitude of the bias. The problem is compounded by the fact that only the precision may be directly estimated through experiment (replication). The two examples cited above simply illustrate the consequences of ignoring or giving inadequate attention to this extremely important, but more difficult to estimate, systematic component of error. When adequate care is given to estimating bounds for Δ, the results may appear surprising. For example, in the most recent tabulation of Eu-152 γ-ray decay probabilities, the estimated limits for systematic error exceed the standard error by factors of 2.5 to 40 (3). Since stated bounds for systematic error are highly dependent upon the scientific judgment and philosophy of the experimenter, under- and over-estimation of such bounds can completely cloud the meaning of analytical results.
An incisive discussion of this particular problem, as related to the fundamental physical constants, has been given by Taylor et al. (4).

**A list of terms and symbols is given at the end of the chapter.
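The behavior described by eq. (1) can be illustrated numerically. The following is only a minimal sketch: the function names, the choice z = 2, and the values σ = 1.0 and Δ = 0.5 are assumptions made for illustration and do not come from the text. It shows the standard error shrinking with replication while the limit on the total error levels off at the bias.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean of n replicates, eq. (1b): SE = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def total_error_limit(bias, sigma, n, z=2.0):
    """Limit on |e| suggested by eq. (1a): |Delta| plus z standard errors.

    z = 2.0 is an illustrative choice, not a value prescribed by the text.
    """
    return abs(bias) + z * standard_error(sigma, n)

# Hypothetical values: sigma = 1.0 and bias = 0.5, in the units of x.
for n in (1, 4, 16, 64, 256):
    print(n, standard_error(1.0, n), total_error_limit(0.5, 1.0, n))
```

However large n becomes, the limit never falls below the assumed bias of 0.5; replication alone cannot remove the systematic component.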
In the discussion which follows we shall first examine the means and limitations of nonrandom error detection. A systematic analysis of the individual steps of the Chemical Measurement Process (CMP) will then be undertaken in order to expose the sources and methods for controlling this component of error. Finally, some simple, yet powerful diagnostic techniques will be presented for the identification of bias and blunders affecting experimental results.

SYSTEMATIC ERROR BOUNDS

Limits for systematic error may be arrived at in two different ways. (1) They may be estimated (in the statistical sense) by comparing an experimental result x with the true value τ (if known) or with [...] methods or laboratories.*** [Since true values in the strict] sense exist only by definition, the first type of comparison generally implies the availability of "accepted" or "certified" values, such as those which accompany reference materials distributed by national standardizing laboratories. (2) The second approach to systematic error evaluation is through detailed analysis of the structure of the CMP in order to infer bounds for overall propagated systematic error. This approach, in contrast to the former, relies wholly upon sound, scientific judgment. The first, empirical approach thus yields

$$ e = x - \tau \tag{2a} $$

for the estimated bias, and

$$ \hat{\Delta}_{\pm} = e \pm \delta_M \tag{2b} $$

for its upper and lower limits. In eq. (2b), $\delta_M$ represents the absolute value of the maximum likely random error--commonly taken to be two to three times the standard error. The second, "theoretical" approach yields inferred bounds

$$ \Delta_{\pm} = P(\Delta_{\pm})_i \tag{3} $$

where $(\Delta_{\pm})_i$ represents the contribution of step i of the CMP, and P symbolizes the appropriate propagation operation, which in the simplest case is merely summation--e.g., $\Delta_{\pm} = \sum_i (\Delta_{\pm})_i$.
It is essential that both types of analysis take place. Verification of measurement accuracy can only come through intercomparison. ($\hat{\Delta}_{\pm}$, method 1, must cover zero for an unbiased measurement process.) However, meaningful uncertainty bounds for any given experiment ultimately depend upon careful evaluation of $\Delta_{\pm}$ (method 2). Furthermore, having both types of estimate makes possible an extremely valuable check for consistency: the overlap of $\hat{\Delta}_{\pm}$ and $\Delta_{\pm}$. (Further discussion of bounds for systematic and total error will be given in the section on reporting of data.)

Although scientific evaluation of systematic error bounds ($\Delta_{\pm}$) is quite difficult, adequate estimation via intercomparison ($\hat{\Delta}_{\pm}$) is perhaps even more difficult. This is because of random error. It is evident from eq. (1) that any observed difference or error (e) will have a random component (δ) which limits our ability to estimate Δ. Just to (reliably) detect systematic error, it can be shown for normally distributed random errors that Δ must exceed S[...] detect a systematic erro[r ...] deviation (σ), one therefore needs at least 15 observations. Figure 2 indicates in a different way the difficulty in detecting sources of error. The solid curve shows the detection limit for bias (Δ) relative to the standard deviation (σ) as a function of the number of observations. The dotted curve gives the same type of information for another common problem: extraneous random error ($\sigma_e$) additional to the Poisson component in counting experiments (6). In this case, if the additional random error is twice the Poisson component, one must have ten observations to demonstrate its existence. If the two are comparable, 47 observations suffice; and if the additional error is half the Poisson error, several hundred observations are required. Incidentally, the same (dotted) curve applies to the detection of the interlaboratory error (corresponding to $\sigma_e$) for a group of laboratories having comparable intralaboratory imprecision (corresponding to σ).

Clearly, in the absence of a very large number of measurements and long term stability, one cannot empirically (through intercomparison) establish error bounds (Δ or $\sigma_e$) much smaller than the standard deviation of a single measurement (σ). There is no substitute, however, for intercomparison and replication for the detection of unanticipated blunders or bias or lack of control which is relatively large compared to the standard deviation.

***Intercomparisons lacking either independence or reliability are fruitless. A dramatic illustration has been given by Yolken (5), who contrasts results obtained by expert laboratories with those obtained using "certification by consensus" of nonexperts.
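The two routes to systematic error bounds, and the consistency check between them, can be sketched in a few lines. This is an illustration only: the function names, the multiplier k = 2 for the maximum likely random error, the use of simple summation as the propagation operation, and all numerical values are assumptions, not data from the chapter.

```python
def empirical_bounds(x, tau, se, k=2.0):
    """Method 1: estimated bias e = x - tau (eq. 2a) and its limits
    e -/+ delta_M (eq. 2b), with delta_M taken as k standard errors."""
    e = x - tau
    delta_m = k * se
    return (e - delta_m, e + delta_m)

def inferred_bounds(step_bounds):
    """Method 2: propagate per-step (lower, upper) bias bounds by simple
    summation, the simplest case of the propagation operation in eq. (3)."""
    lower = sum(lo for lo, _ in step_bounds)
    upper = sum(hi for _, hi in step_bounds)
    return (lower, upper)

def intervals_overlap(a, b):
    """Consistency check: do the two (lower, upper) intervals overlap?"""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical result x = 10.4 against a certified value tau = 10.0, SE = 0.1.
emp = empirical_bounds(10.4, 10.0, 0.1)
# Hypothetical per-step bounds (sampling, separation, calibration).
inf = inferred_bounds([(-0.1, 0.2), (0.0, 0.15), (-0.05, 0.05)])

print("empirical:", emp, "covers zero:", emp[0] <= 0.0 <= emp[1])
print("inferred: ", inf)
print("consistent:", intervals_overlap(emp, inf))
```

In this made-up case the empirical interval does not cover zero, which signals bias, yet it still overlaps the inferred bounds, so the two analyses are mutually consistent.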
SOURCES OF SYSTEMATIC ERROR

The most effective way to identify and control sources of bias in chemical analysis is to treat the CMP as a carefully defined system. Critical analysis of the individual steps and their linkage will then make it possible to estimate individual bias components as well as overall propagated bounds for systematic error. For this purpose a generalized flow diagram is given in figure 3. In the following paragraphs we shall examine each of the steps for possible systematic error contributions.

Figure 1. Results of a collaborative test of the EPA reference method for ambient ozone (2). Dashed line indicates the true value.

Figure 2. Detection limits vs. number of observations for extraneous random error ($\sigma_e$, dashed curve) and systematic error (Δ, solid curve).

A measurement process cannot be said to exist in the absence of control (7). When control is achieved both the central value and the variability are stable. Under such circumstances the error can be completely defined by a fixed systematic component (the bias) and a random component having constant standard deviation. Such bias may be the result of fixed mistakes or blunders in experiment or theory, or it may arise from converted random errors--i.e., errors which occur in a random fashion but which remain fixed because one or more steps of the CMP are not repeated. If the systematic error is not constant, it becomes impossible to generate meaningful uncertainty bounds for experimental data. Lack of control may arise through carelessness or errati[c ...] These may be expose[d ...] variations may come about from the effects of systematic trends in uncontrolled variables (such as barometric pressure), or from unanticipated effects of seemingly remote factors. (Such effects are not necessarily a nuisance. They may provide an opportunity for discovery--as in the case of variations in the calibration curve for radiocarbon dating induced by the activities of man and various geophysical and climatic phenomena (8).) The assessment of whether a measurement process is in control is frequently accomplished through the use of control charts--a technique which has been thoroughly discussed above. The control chart, of course, merely signals instability; it does not generally compensate for it.
In order to achieve control, the experimenter must identify and either stabilize or correct for sources of erratic behavior. When variables cannot be held constant, it is often effective to correct for changes by means of an internal or external standard. Figure 4 gives one such example (39). Because of the extremely low concentrations present it was necessary to measure a sample of radioactive Ar-37 over a period of a month, during which time there was about a 10 percent drift in gain. Though it was not possible to prevent the drift, which came from slowly changing proportional counter gas composition, it was possible to correct for it. This was accomplished with an external monitor--an x-ray source which simulated the response of the detector to the sample radiation.
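The idea of correcting for drift with an external standard can be sketched as follows. This is not the procedure of ref. (39); it is a minimal illustration that assumes the drift acts as a single multiplicative gain factor on both the monitor and the sample responses. The function name and all numbers are hypothetical.

```python
def drift_corrected(sample_reading, monitor_reading, monitor_reference):
    """Rescale a sample reading by the ratio of the monitor's reference
    response to its response at the time of measurement.

    Assumes (for illustration) that sample and monitor share the same
    multiplicative gain, so the ratio cancels the drift.
    """
    if monitor_reading <= 0:
        raise ValueError("monitor reading must be positive")
    return sample_reading * (monitor_reference / monitor_reading)

# Hypothetical case: the gain falls by 10 percent over the counting period.
monitor_reference = 28.0
for gain in (1.00, 0.95, 0.90):
    sample = 20.0 * gain    # drifting sample response
    monitor = 28.0 * gain   # monitor drifts by the same factor
    print(gain, drift_corrected(sample, monitor, monitor_reference))
```

Under the stated assumption the corrected value stays at the undrifted response regardless of the gain; if the monitor did not track the sample response, the correction would itself introduce bias.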
Sample Validity

Among the more serious problems affecting the sample are contamination, heterogeneity and instability. Contamination will be discussed below. The most likely consequence of heterogeneity is a nonrepresentative sample. Quite often one can observe a major difference in sample composition with amount taken for analysis. For example, the appearance of severe heterogeneity among trace elements in Orchard Leaves (SRM #1571) was demonstrated when 10 mg rather than the recommended 250 mg samples were taken (9). Such heterogeneity depends, of course, upon sample type. (The authors of ref. (10) noted the "extreme homogeneity" of trace elements in fresh liver.) Also, homogeneity of some elements cannot assure the same for others. For certain methods of analysis, sample homogeneity requirements are indeed stringent. Electron probe microanalysis, for example, requires standards of experimentally demonstrated microhomogeneity (10).

Another major source of systematic error relates to the change of sample composition with time. Aerosols, for example, are known to be susceptible to moisture and gaseous contaminants. Spurious sulfate results have been obtained from the gradual oxidation of SO2 on air filters (11) and H2SO4 aerosol results hav[e ...] neutralization by trace [...] trace species from aqueous samples at container walls (adsorption or diffusion) is another common source of instability. This is particularly marked for heavy, volatile elements such as mercury (13).

Figure 3. The chemical measurement process--flow diagram. Critical aspects listed: existence of CMP (definition, control); sampling (homogeneity, contamination, stability); separation, i.e., sample preparation (recovery, contamination); calibration, resolution; model, error structure.

Figure 4. Quenching of Ar-37; external standard control (39). Plotted vs. day of 1974 (220 to 245) for Am-241 (Cu-Kα) and the Ar sample (SAWM-7).

The Blank

Figure 5 illustrates a systematic error that is troublesome in the first two steps of the CMP: that is, the occurrence of an unmeasured blank. A significant difference is shown between a simple solution of lead and the apparent lead content in whole blood (14). When the deviations at these low levels of concentration are all positive, a good supposition is a blank problem from contamination of reagents used to prepare the sample of whole blood but not used for the aqueous solution.
A common pitfall in trace analysis is insufficient attention to the variability of the blank. If variability due to contamination is such that it may play an important part in the setting of uncertainty bounds or detection limits, some caution is necessary in interpreting the result of just one or a few blank observations (19). Results such as those quoted above show the danger of blindly assuming that the relative range of the blank is no more than 10 percent, 100 percent, a factor of 2 or even a factor of 10 (15). Thus, even if the blank is ten times smaller than the signal of interest, its variability must be measured. If this is accomplished, for example, by examining the difference of just two experimental blanks, there is a significant chance that the actual range of the blanks will exceed that measured difference by a factor of 25, under the best of circumstances (normally distributed blanks; 95 percent tolerance interval). Giving up the assumption of normality, but requiring the blank to be under (statistical) control, one can be fairly (95 percent) certain that half of the blanks will fall within the range of 8 observations, or 90 percent of them within the range of 47 observations!
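The distribution-free statement about 8 and 47 observations can be checked against the standard result for the coverage of a sample range. The sketch below assumes that this is the result the figures rest on (the text does not name it): for n independent observations from any continuous distribution, the probability that the interval from the smallest to the largest observation contains at least a fraction p of the population is 1 - n p^(n-1) + (n-1) p^n. The function name is hypothetical.

```python
def range_coverage_confidence(n, p):
    """Probability that the range (min to max) of n independent observations
    from a continuous distribution covers at least a fraction p of the
    population: 1 - n*p**(n-1) + (n-1)*p**n."""
    if n < 2 or not 0.0 < p < 1.0:
        raise ValueError("need n >= 2 and 0 < p < 1")
    return 1.0 - n * p ** (n - 1) + (n - 1) * p ** n

# The two cases quoted in the text, plus the two-blank case for contrast.
print(range_coverage_confidence(8, 0.5))    # half of the blanks, 8 observations
print(range_coverage_confidence(47, 0.9))   # 90 percent, 47 observations
print(range_coverage_confidence(2, 0.5))    # range of just two blanks
```

Both quoted cases come out above 0.95 under this formula, while the range of only two blanks covers half of the population with a probability of just 0.25, which is in keeping with the caution urged above.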
Figure 5. Comparison of interlaboratory Pb results for an aqueous standard vs. whole blood (14). Panels: aqueous solution, 140 ng Pb/g; porcine blood, 30 ng Pb/g. Horizontal axis: laboratory number. (National Bureau of Standards Special Publication.)
Sample Preparation

Besides problems with the blank, great care must be taken when performing physical or chemical separations of components in a sample. If the recovery of the desired component is not quantitative--e.g., loss of volatile components during sample dissolution--serious systematic error may result (16). The recovery factor presents the same opportunity to err as does the instrument calibration factor--namely, the assumption that the (average) yield is quantitative or constant and that its variability (relative standard deviation) is fixed. Such fixed values may be deduced by assumption, a theoretical model (solubility product, partition coefficient, ...) or better still, by a few measurement[s ...] laid: as soon as varyin[g ...] encountered low and fluctuating yields will occur.

One of the most reliable means for eliminating bias due to nonquantitative separation is isotope dilution. Provided that the diluting isotope is added at the earliest possible stage and that complete isotopic mixing takes place, this technique is capable of very high accuracy. The art has perhaps reached its ultimate level at the hands of skilled chemical mass spectrometrists, who have succeeded in measuring isotope ratios with uncertainties of only 0.03 percent (17,18).

Measurement

The measurement step provides many chances for error. Operator bias, for example, commonly occurs in the making or recording of observations, as shown in figures 6 and 7 (19,20). Results of 1,000 weighings (fig. 6) show that operators favor the values of 0 and 5 for the last digit, and that even numbers tend to be favored over odd numbers. From 1,510 buret readings (fig. 7), on the other hand, one can observe that small numbers are favored over large numbers. The possibility of operator bias is, perhaps, sufficient justification for considering computer control for such types of processes. (Since even computers are programmed and run by the operator of the instruments, however, the threat of errors (blunders) of this type is only reduced, not eliminated.)

Two of the most important characteristics of analytical measurements are the calibration function and instrumental resolution. To assume that the calibration factor is constant, independent of the nature (matrix) or concentration of the sample, is to invite bias. It is in the calibration factor, together with recovery factors, that "real" samples differ most strikingly from pure solutions. Aside from the use of sound theoretical or semi-empirical correction formulas, the most reliable method to assure a correct calibration is the use of an internal standard. By adding to the sample a known aliquot of the substance being measured, one can translate the differential response into an effective calibration factor for the actual sample at hand (21).

Instrumental resolution, just like chemical or physical "resolution"--i.e., separation--is one of the most important means of preventing systematic error from unanticipated components or ill-defined spectral features. Some of the penalties from inadequate resolution will be examined below in our discussion of data evaluation. When dealing with complex materials containing potentially interfering species, however, a small investment in increased chemical resolution will be well repaid in decreased bias.

Figure 6. Operator bias--analytical balance. Histogram depicts observed terminal (estimated) digit distribution for 1000 student weighings. Dashed line indicates expected distribution. (Data from Ref. 20.)

Figure 7. Operator bias--buret reading. Histogram depicts terminal digit distribution for 1510 student observations. Dashed lines delimit the 95% confidence interval for a uniform distribution. (Data from Ref. 20.)

Data Evaluation

Although the last two steps of the CMP do not necessarily involve any work in the chemical laboratory, they are nevertheless an integral part of the overall measurement system and thus must be recognized as potential contributors of systematic error. In fact, perfectly valid sampling, chemistry and instrumental measurement can be rendered meaningless by faulty evaluation or reporting. This potential for inaccurate data evaluation has been recognized recently in an international comparison devoted strictly to the evaluation (and reporting) phase of gamma-ray spectroscopy (22). Errors which are due to the differences between pure solutions and complex samples often remain latent until the evaluation stage.
Systematic error can be minimized provided that such differences--connected with the blank, matrix effects, component interference--are adequately recognized in the evaluation process. (The possibility of making proper corrections may, of course, depend upon the prior introduction of a recovery tracer or use of a high resolution measuring device.)

For single component measurements a common source of evaluation bias is the assumed calibration "constant." Matrix corrections represent one area where the analyst must correctly adjust this factor (10,23). The other relates to the functional relationship assumed between the quantitative response of an analytical chemistry measurement system and the composition of standards. Many times the relationship is linear, or at least it appears to be so. However, one soon learns that a fit to a mathematical model can be defined in a variety of ways. It is in this process of determining whether the model adequately represents the experimental data that systematic errors can arise. A common but potentially misleading calibration procedure is fitting a straight line to the data and the subsequent examination of a test statistic to assess the goodness of fit.
An example is the calibration of linearity of the energy scale of a Ge(Li) γ-ray detector. Figure 8 shows a linear fit where the residual relative standard deviation (a measure of fit) was less than 0.1 percent, and the correlation coefficient (a measure of linearity) was 0.9999. However, through more detailed examination we found that the fit was not really adequate when compared with that expected on the basis of Poisson counting statistics. A very informative way to evaluate the fit is to observe the plot of residuals (algebraic difference between the experimental data points and the fitted mathematical model vs. γ-ray energy). One can see in the figure that the X's are not distributed randomly about zero. In fact, errors in both the calibration function and the tabulated standard energie[s] are represented by th[e ...] obtained, one can see by inspection that there is a slight decrease in the spread of the residuals at higher channel numbers. This indicates a possible additional problem that might warrant further study.

Multicomponent methods of analysis often suffer bias from inadequate resolution. The problem of accurately resolving obviously overlapping peaks, such as those shown in figure 9, has received considerable attention in the spectroscopic and chromatographic literature (24). Not so well appreciated, however, is the fact that significant systematic error may be introduced when an interfering peak is present but not apparent, and hence excluded from the data reduction model (25).
The magnitude of the resulting bias, when an undetected peak lies buried within the peak of interest, is shown in figure 10. It is a surprising result that the level of error can be so large and still go undetected. Plotted in the figure is the ratio of the systematic error to the standard deviation of the estimated area of a (Gaussian) peak as a function of its separation from a neighboring (undetected) peak. It can be seen that if the overlap is equal to or less than the halfwidth, very large systematic error can result (26).

Improved instrumental resolution may eliminate the above pitfall. In fact, advanced instrumentation may reveal quite a surprising degree of complexity; figure 11 shows the structure actually contained in the apparent γ-ray doublet of figure 9.

Figure 8. [...] from gamma-ray calibration curve vs. channel number (x = linear function; • = cubic function; O = bad physical input data [tabulated γ-energy]).

Figure 9. Gamma-ray spectrum from Bremsstrahlung-activated gold: NaI(Tl) detector.

Figure 10. Model error bias. Curve shows the (maximum) bias in the estimated area of the major peak in a spectral doublet when an undetected minor peak is omitted from the mathematical model. The minor peak is not detectable if it lies below the solid curve.

Figure 11. Gamma-ray spectrum from Bremsstrahlung-activated gold: Ge(Li) detector.

Reporting Results and Uncertainties

Among the results reported in a recent trace analysis laboratory intercomparison of an NBS Standard Reference Material (SRM 1577, bovine liver), one finds the following:

Hg-content (µg/g): 0.09; <0.2; 0.006 ± 0.0006
In the absence o f a d d i t i o n a l i n f o r m a t i o n , i n t e r p r e t a t i o n o f the r e s u l t s i n d i v i d u a l l y o r c o l l e c t i v e l y i s o b v i o u s l y impossible. One has no idea o f the u n c e r t a i n t y a s s o c i a t e d w i t h the f i r s t r e s u l t , and the meaning o f the upper l i m i t f o r the second i s unknown. Besides t h e mismatch o f decimal d i g i t s , the r e s u l t reported by the t h i r d l a b o r a t o r y i n c l u d e s an u n c e r t a i n t y which allows several i n t e r p r e t a t i o n s (27). C l e a r l y , r e s u l t s o f good experiments n i c a t e d are worthless A full verbal reported q u a n t i t i e s an recommend, as a minimu
inadequately commuexplanation of a l l
(a) The r e s u l t which was a c t u a l l y computed--even i f i t be negative. (b)
For " n o n s i g n i f i c a n t " or nondetected (i)
results:
the d e t e c t i o n c r i t e r i o n
( i i ) an upper l i m i t , together w i t h i t s meaning. ( c ) The estimated standard e r r o r . (i) number o f observations (n) and the number o f degrees o f freedom ( v ) , i f estimated by r e p l i c a t i o n , ( v = n-1 f o r simple, one-parameter estimates o r averages.) ( i i ) source o f the estimate, i f no r e p l i c a t i o n - - e . g . , p r i o r experience Poisson s t a t i s t i c s , ... (d) The estimated bounds f o r systematic e r r o r (not nece s s a r i l y symmetric), and method o f e s t i m a t i o n - - e . g . , converted random e r r o r (nonrepeated step, as c a l i b r a t i o n ) , comparison w i t h reference m a t e r i a l or a l t e r n a t i v e method, propagation o f b i a s bounds o f i n t r i n s i c CMP s t e p s , etc. F i n a l l y , the o v e r a l l u n c e r t a i n t y o f the experimental r e s u l t may be given as a combination o f the random and systematic components, provided the i n d i v i d u a l components ( ( c ) and ( d ) , above) and the e x p l i c i t combination r e c i p e i s included. Because of l a c k o f knowledge concerning e r r o r d i s t r i b u t i o n s and because of the somewhat s u b j e c t i v e nature o f i n f e r r e d systematic e r r o r bounds ( 4 ) , the c o n s e r v a t i v e approach i s p r e f e r r e d : simple summation o f the random and systematic e r r o r bounds, where the former i s equal t o the standard e r r o r m u l t i p l i e d by a given percentage o f Student's-t d i s t r i b u t i o n . References (28-30) In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
contain critical information on the estimation and treatment of uncertainty bounds.

DIAGNOSTIC TECHNIQUES

The task of the analyst is incomplete until systematic error or blunders, which have been detected, have been thoroughly examined and eliminated. Important progress in the Analysis of Blunders (ANOB) may be made through the use of certain numerical and graphical procedures which are just now being developed (31,32,33). Such techniques have been discussed in detail above; the following examples illustrate their application to specific analytical problems involving systematic [error. The numerical methods] utilize "resistive" and ["robust" techniques, which are] immune to the effects of outliers and assumptions concerning random error distributions, respectively. The graphical methods include the use of histograms, residual plots and correlation diagrams. Such one- or two-dimensional displays of the data can be an exceptionally efficient way to pinpoint erratic blunders and sources of bias. The purpose of such procedures is error diagnosis leading to improved experiments, not the repair of faulty data.

When assumptions (such as normality, linear calibration curves, negligible outlier or bias occurrence) are questionable, then resistive, robust and distribution-free methods are in order. Among the most convenient for the analyst, who often has a relatively small number of observations, is the median and its confidence interval (see Table I and ref. (38)). Both quantities can be determined from a set of data simply by ordering, without computation.
Furthermore, the estimate for the confidence interval is resistant to the effects of an outlier once the number of observations exceeds 8. (For n = 6, 7, 8 the 95 percent confidence interval for the median is just equal to the range.) With respect to the assessment of variability, a convenient method for estimating the standard deviation from a set of residuals is to take the median (unsigned) residual. Considerable protection against bad data is then afforded, especially as compared to the use of the (conventional) root mean squared residual.

1-Dimensional Plots - Residuals

Systematic errors were revealed above by means of histograms (figs. 6,7) and residual plots (fig. 8). Such plots are always helpful, but they are certainly called for whenever χ² (computed to test regression model or distributional accuracy) is unacceptable or when the estimated variance (s²) from model fitting is inconsistent with the expected variance (σ²), as in fig. 8.
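The resistance of the median unsigned residual, compared with the root mean squared residual, can be illustrated with a short sketch. The residual values below are hypothetical and chosen only for illustration. The factor 1.4826 (the reciprocal of 0.6745) is not part of the text; it is the usual scaling that makes the median unsigned residual comparable to a standard deviation when the underlying errors are normal.

```python
import statistics

# Assumption (not from the text): for normally distributed errors the median
# unsigned residual is about 0.6745 sigma, so multiplying by 1/0.6745 puts it
# on the same scale as a standard deviation.
NORMAL_CONSISTENCY = 1.4826


def rms_residual(residuals):
    """Conventional root mean squared residual."""
    return (sum(r * r for r in residuals) / len(residuals)) ** 0.5


def median_abs_residual(residuals):
    """Resistant scale estimate: the median of the unsigned residuals."""
    return statistics.median(abs(r) for r in residuals)


# Hypothetical residuals, for illustration only.
clean = [0.3, -1.1, 0.8, -0.4, 1.5, -0.9, 0.2, -0.6, 1.0]
spoiled = clean[:-1] + [25.0]  # the same set with one blunder substituted

for label, res in (("clean", clean), ("one blunder", spoiled)):
    print(label,
          "rms =", round(rms_residual(res), 3),
          "median |r| =", median_abs_residual(res),
          "scaled median |r| =",
          round(NORMAL_CONSISTENCY * median_abs_residual(res), 3))
```

In this sketch the single blunder inflates the root mean squared residual roughly tenfold while leaving the median unsigned residual unchanged, which is the protection against bad data described above.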
Table I. Confidence interval for the median (a, b) (distribution-free)

      n        k            n        k
     6-8       1          20-22      6
     9-11      2          23-24      7
    12-14      3          25-27      8
    15-16      4          28-29      9
    17-19      5          30-32     10

(a) Entries derived from table A-25 of reference (38).
(b) If observations are ordered x_1 ≤ x_2 ≤ ... ≤ x_n, the confidence interval (α ≤ .05) equals x_k to x_(n-k+1).
    CI = w (range) for n = 6, 7, 8.
    CI ≈ interquartile range (mid-50 percent of the observations) for n = 12 to 24.
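The rule in Table I can be reproduced from the binomial (sign-test) argument: for a continuous distribution, the interval from x_k to x_(n-k+1) misses the median only if fewer than k observations fall on one side of it. The following is a minimal sketch under that assumption; the function names and the sample data are illustrative and do not come from reference (38).

```python
from math import comb


def median_ci_rank(n, alpha=0.05):
    """Largest k for which [x_k, x_(n-k+1)] covers the median with
    probability at least 1 - alpha (returns 0 if no such k exists)."""
    best = 0
    lower_tail = 0  # count of the 2**n sign patterns with fewer than k on one side
    for k in range(1, n // 2 + 1):
        lower_tail += comb(n, k - 1)
        # Two-sided miss probability is 2 * lower_tail / 2**n.
        if 2 * lower_tail <= alpha * 2 ** n:
            best = k
        else:
            break
    return best


def median_ci(data, alpha=0.05):
    """Distribution-free confidence interval for the median, by ordering."""
    xs = sorted(data)
    n = len(xs)
    k = median_ci_rank(n, alpha)
    if k == 0:
        raise ValueError("too few observations for this confidence level")
    return xs[k - 1], xs[n - k]  # x_k and x_(n-k+1) in 1-based notation


# Hypothetical observations, for illustration only (n = 9, so k = 2).
print(median_ci([5, 1, 9, 3, 7, 2, 8, 4, 6]))
```

For n = 6 to 8 the function gives k = 1, so the interval is simply the range, in agreement with the note to Table I.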
Such inconsistency (s² ≫ σ²) led to the application of resistive and residual pattern techniques for the diagnosis of systematic error in the XRF analysis of CaO (34). A straight line fit to the calibration curve gave an estimated RSD of 1.2 percent, seven times the value expected from Poisson counting statistics (see table II). Figure 12a displays the normalized residuals resulting from the unweighted least-squares fit of a linear calibration curve to the data. Two peculiarities are suggested by this plot: (a) A slightly nonrandom pattern is apparent. (b) The largest residuals are too small: there is scarcely an excursion beyond ±1s.

An alternative method of computation, using the same model but incorporating the observed background (B̂) and Poisson statistics, was then performed. To provide some measure of insensitivity to outliers, residuals were then calculated from the median value of the estimated calibration constant (Â_m). The computation formulas and numerical results are shown in table II; the resulting residual plot is given in figure 12b. The pattern remains, but now we see significant excursions beyond the (Poisson) control limits, with the first measurement far worse than the remainder. We tentatively mark the first result, therefore, as a blunder. Note that this initial error was revealed only through the use of the resistive technique; it was completely masked when conventional least squares was applied.
Table II. Calcium x-ray fluorescence calibration curve (a)

Model: y = B + Ax + e

    x (% CaO)    y (counts/500 s)    Â_i
    0.202           83.5k            387.6k
    0.275          123.2k            428.5k
    0.719          307.7k            420.5k
    0.812          342.4k            415.1k (median)
    1.636          682.1k            413.8k
    2.047          854.3k            414.8k
    3.969         1661.2k            417.2k

                 Least squares       Observed background, median
    B̂           3620 ± 2360         5240 ± 70
    Â            417.1 ± 1.1k        Â_m = 415.1 ± 0.7k
    RSD          s = 1.2%            σ = 0.17%

    s/σ ≈ 7.

(a) Â, B̂ result from fitting the model. B̂ = observed background; Â_i = (y_i - B̂)/x_i.
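The resistant computation summarized in Table II can be sketched directly from the tabulated x and y values and the footnote formula Â_i = (y_i - B̂)/x_i. The normalization of the residuals below is an assumption made for illustration: the Poisson standard deviation of each count is taken as the square root of the count, and the uncertainty of the background is neglected. Because the tabulated inputs are rounded, the recomputed Â_i agree with the printed column only to within about 0.1 percent.

```python
import statistics

# Data from Table II.
x = [0.202, 0.275, 0.719, 0.812, 1.636, 2.047, 3.969]            # % CaO
y = [83.5e3, 123.2e3, 307.7e3, 342.4e3, 682.1e3, 854.3e3, 1661.2e3]  # counts/500 s
B_OBSERVED = 5240.0                                               # observed background

# Calibration constant estimated separately from every sample.
A = [(yi - B_OBSERVED) / xi for xi, yi in zip(x, y)]

# Resistant summary: the median of the individual estimates.
A_m = statistics.median(A)

# Assumption (illustrative): sd(y_i) = sqrt(y_i), background uncertainty ignored.
sigma_A = [yi ** 0.5 / xi for xi, yi in zip(x, y)]
z = [(ai - A_m) / si for ai, si in zip(A, sigma_A)]

for xi, ai, zi in zip(x, A, z):
    print(f"x = {xi:5.3f}  A_i = {ai / 1e3:6.1f}k  normalized residual = {zi:6.1f}")
print(f"median A = {A_m / 1e3:.1f}k")
```

Under these assumptions the first sample shows by far the largest normalized residual, which is the behavior described for figure 12b.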
The pattern among the remaining residuals in figure 12b does not exhibit the smoothness expected from a systematic effect. This was clarified by further inquiry, however, when we learned that the samples were loaded on a rotary sample wheel in order of increasing calcium concentration. Taking this into account, and deleting the presumed outlier, we replotted the residuals as a function of sample position in figure 12c. The relatively symmetric, smooth pattern is quite consistent with sample wheel distortion which produces varying response through sample-detector distance variations. Construction of an improved sample wheel eliminated this source of systematic error, yielding an experimental imprecision of just 0.2 percent (35).

Another important point should be made about this measurement method. Had it not been possible to adequately reduce the wobble from the turntable, it would have been best to randomize the positions of the CaO samples. This would transform a potential systematic error into a random error and of course reduce the precision of the measurement. Additional testing of the magnitude of the effect of the wobble would then be possible by plotting (or correlating) the residuals with position (θ). Alternatively, a plot of residuals vs CaO concentration might reveal concentration-dependent systematic error.
2-Dimensional Plots - Correlation

Interlaboratory comparisons provide an extremely powerful method for revealing systematic error. Using a technique originally conceived by Youden (36), one can effectively distinguish random error, laboratory bias and erratic blunders by a simple two-dimensional correlation technique. As originally devised, the method required the analysis of two similar samples by each of the participating laboratories. As a preliminary diagnostic method, however, it is useful even with the relaxation of the similarity constraint. To illustrate the potential of the method, figure 13 shows vanadium results from [the] NBS-EPA trace element intercomparison [...] provided with samples of two NBS-SRM's (SRM #1632, trace elements in coal, and SRM #1633, trace elements in coal fly ash). The test statistic which led us to explore graphic systematic error diagnosis was once again the variance ratio: the interlaboratory error had been computed to be several times the intralaboratory component.

The results are striking. Despite the appearance of a "fitted line" in figure 13, the points shown and the line are totally independent! The line has no free parameters. Its slope (on the log-log plot) was fixed at +45° based on the hypothesis of a multiplicative bias model, i.e., biased calibration factors. The location of the line was fixed by the known vanadium concentrations (dashed rectangle) for the two SRM's. (The concentrations were known to NBS, not to the participants.) Nine of the results shown (solid circles) involved replication; the tenth (cross) did not. The coherence of the replicated results to the theoretical line confirms the assumption of multiplicative bias, and their deviations from the line are consistent with the intralaboratory imprecision. It follows that use of the SRM's to expose the individual calibration biases has the potential for bringing all laboratories within the bounds expected from the intralaboratory errors, an improvement in reproducibility by at least a factor of two. The non-repeated measurement (cross) demonstrates also the vulnerability of isolated observations to erratic blunders. Displacement from the line in this case indicates a mistake in the analysis of the fly ash sample or the coal sample, or both.

[Figure 12. Residual diagnosis of systematic error in CaO XRF analysis: (a) residual pattern (y - ŷ) following least squares fitting of a linear calibration curve; (b) normalized Poisson residuals (Â - Â_m) using resistant method (Â_m); (c) normalized Poisson residuals (Â - Â_m) recalculated following deletion of the first point, using sample position as the independent variable. Horizontal axes: concentration (% CaO) for (a) and (b), sample position for (c); control limits at ±2σ.]

[Figure 13. Results of NBS-EPA vanadium intercomparison. X represents the result which lacks replication.]

SUMMARY

This chapter has been directed toward four key aspects of systematic error in chemical analysis: (a) the serious consequences of inaccuracy in the external use of analytical results, (b) the essential requirement of systematic error detection or accuracy verification via laboratory or method intercomparison, (c) the systems analytic approach to the CMP as the only reliable, organized way to anticipate the origin, magnitude and flow of systematic error, (d) the power of numerical and graphical diagnostic techniques, which are rapid and relatively immune to assumptions and blunders, for exposing the particular nature of systematic errors following their detection.

It is clear that the need for accuracy in analytical chemistry continues to increase along with our understanding of the importance of the composition of materials. Errors in such measurements can lead to incorrect decisions in fields ranging from environmental protection [...] Accuracy assurance can benefit from the use of reference materials of known composition, so long as careful iterative feedback of information among laboratories is used to eliminate methodic errors. In addition it is vital to investigate, on an individual-analyst basis, possible sources of errors in each of the steps of the specific CMP, for it is evident (fig. 2) that bias which is comparable to or smaller than the imprecision can easily escape empirical detection.

Often we understand where many of the errors arise. Lack of control, sampling, dissolution, chemical separation and purification, and instrumental errors come immediately to mind.
Table III. Assumption Limitations
(negligible bias + estimate imprecision → meaningful results)

Random Error:
    Poisson deviations from normality (N < 30)
    Random component of systematic error sources
    Random errors in corrections for systematic errors

Systematic Error:
    Sampling and sample preparation (recovery)
    Blank, interference, and contamination
    Improper calibration and/or standards
    Matrix effect: particle size and composition, enhancement, adsorption, and scattering
    Inaccurate data reduction models or correction formulas (assumed parameters, functional relations)
    Blunders and faulty reporting
However, reliable data evaluation and reporting are no less important. Great care must be taken to employ mathematical procedures for data reduction which reflect the actual physicochemical processes of the entire analytical system. Inadequate reporting, at best, can lead to misinterpretation of results and consequent erroneous decisions. In order to give a capsule summary of all of these factors, we show in table III (27) the deviations from assumptions which are most likely to lead to unanticipated error.
Terms and Symbols Used in Text and Figures

CMP    = Chemical Measurement Process
n      = number of observations
ν      = degrees of freedom = number of observations minus number of estimated parameters (unknowns)
τ      = true value (if known)
x      = experimental result (mean)
e      = error = result - truth = x - τ
e+     = uncertainty bounds in e
δ      = random error in x
δ[...] = maximum likely (absolute value) random error, typically taken as two to three times the standard error
σ      = standard deviation (single observation)
z      = random normal deviate
SE     = Standard Error, standard deviation of mean = σ/√n
σ[...] = extraneous random error
Δ      = systematic error in x, i.e., nonrandom component of error; equals constant bias when the CMP is in control
Δ̂      = estimated bounds for Δ (result of experimental comparison)
Δ[...] = inferred bounds for Δ (result of CMP-structure and scientific judgment)
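The symbols above combine into the conservative overall uncertainty recommended in the discussion of reporting: the random bound (Student's t times SE) is simply added to the systematic bound. The sketch below is illustrative only. The replicate values and the systematic bound are hypothetical, the choice of the two-sided 95 percent level is an assumption (the text leaves the percentage open), and the small table of t values is the standard one for 1 to 10 degrees of freedom.

```python
import math
import statistics

# Two-sided 95 percent points of Student's t for 1 to 10 degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def overall_uncertainty(replicates, systematic_bound):
    """Report the components and their simple (conservative) sum.

    systematic_bound is the analyst's estimated bound for the systematic
    error; it is supplied, not computed, in this sketch.
    """
    n = len(replicates)
    nu = n - 1                                # degrees of freedom for a simple mean
    mean = statistics.mean(replicates)
    s = statistics.stdev(replicates)          # standard deviation, single observation
    se = s / math.sqrt(n)                     # standard error of the mean
    random_bound = T_95[nu] * se              # random error bound
    return {
        "result": mean,
        "n": n,
        "nu": nu,
        "standard_error": se,
        "random_bound": random_bound,
        "systematic_bound": systematic_bound,
        "overall": random_bound + systematic_bound,  # simple summation
    }


# Hypothetical replicate results and systematic bound, for illustration only.
report = overall_uncertainty([10.0, 10.2, 9.8, 10.1, 9.9], systematic_bound=0.1)
for name, value in report.items():
    print(f"{name:>17}: {value}")
```

Returning every component alongside the sum follows the recommendation that the individual terms and the combination recipe be reported together, so a reader can recombine them differently if desired.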
Literature Cited

1. New York Times, Mar. 28, 1976, p E-9.
2. McNesby, J., Testimony on Assessment - Need for Standards, at Hearing on The Effects of Chronic Exposure to Low-Level Pollutants in the Environment before the Subcommittee on the Environment and the Atmosphere, U.S. Congress (1975).
3. "Some Europium-152 Gamma-Ray Probabilities," Radioactivity Section, NBS, Oct. 1976.
4. Taylor, B. N., Parker, W. H. and Langenberg, D. N., The Fundamental Constants and Quantum Electrodynamics, Academic Press, New York (1969). [See especially p 20].
5. Yolken, H. T., The Role of Standard Reference Materials in Environmental Monitoring, [...] Pollution Monitoring Symposium held at NBS, Gaithersburg, MD, May 13-17, 1974.
6. Currie, L. A., The Limit of Precision in Nuclear and Analytical Chemistry, Nucl. Instr. Meth., (1972), 100, 397.
7. Eisenhart, C., Realistic Evaluation of the Precision and Accuracy of Instrument Calibration Systems, Journal of Research NBS, (1963), Vol. 67C, No. 2, p 161.
8. Olsson, I. U., Editor, Radiocarbon Variations and Absolute Chronology, Proceedings of the 12th Nobel Symposium held at the Institute of Physics at Uppsala University, Wiley Interscience (1970).
9. Campbell, J. L., Orr, B. H., Herman, A. W., McNelles, L. A., Thomson, J. A. and Cook, W. B., Trace Element Analysis of Fluids by Proton-Induced X-Ray Fluorescence Spectrometry, Anal. Chem., (1975), 47, 1542.
10. Heinrich, K. F. J., Common Source of Error in Electron Probe Microanalysis, Advances in X-Ray Analysis, (1968), Vol. 11, p 40.
11. Lee, R. E., A Sampling Anomaly in the Determination of Atmospheric Sulfate Concentration, Am. Ind. Hyg. Assoc. J., (1966), 27, 266.
12. Charlson, R. J., Vanderpol, A. H., Covert, D. S., Waggoner, A. P. and Ahlquist, N. C., H2SO4/(NH4)2SO4 Background Aerosol: Optical Detection in St. Louis Region, Atmospheric Environment, (1974), Vol. 8, p 1257.
13. Burrows, W. D. and Krenkel, P. A., Loss of Methylmercury-203 from Water, Anal. Chem., (1974), 46, 1613.
14. Murphy, T. J., The Role of Analytical Blank in Accurate Trace Analysis, NBS Spec. Publ. 422, (1976), Vol. II, p 509.
15. Robertson, D. E., Role of Contamination in Trace Element Analysis of Sea Water, Anal. Chem., (1968), 40, 1067.
16. Apt, K. E. and Gladney, E. S., Loss of Osmium During Fusion of Geological Materials, Anal. Chem., (1975), Vol. 47, 1484.
17. Barnes, I. L., Murphy, T. J., Gramlich, J. W. and Shields, W. R., Lead Separation by Anodic Deposition and Isotope Ratio Mass Spectrometry of Microgram and Smaller Samples of Lead, Anal. Chem., (1973), 45, 1881.
18. Moore, L. J., Machlan, L. A., Shields, W. R. and Garner, E. L., Internal Normalization Techniques for High Accuracy Isotope Dilution Analyses - Application to Molybdenum and Nickel in Standard Reference Materials, Anal. Chem., (1974), 46, 1082.
19. Hume, D., Pitfalls in the Determination of Environmental Trace Metals, Progress in Anal. Chem., Vol. 5, "Chemical Analysis of the Environment," Plenum Press (1973).
20. Laitinen, H. A. and Harris, W. E., Chemical Analysis, McGraw-Hill Book Company, Second Edition (1975).
21. Intersociety Committee, Vanadium Content of Atmospheric Particulate Matter by Atomic Absorption Spectroscopy, Health Lab. Sci., (1974), 11, No. 3, 240.
22. International Atomic Energy Agency, "Intercomparison of Methods for Processing Ge(Li) Gamma-Ray Spectra," H. Houtermans and R. M. Parr, Analytical Quality Control Services (1976).
23. Nargolwalla, S. S. and Przybylowicz, E. P., Activation Analysis With Neutron Generators, Sources and Reduction of Systematic Error, Chap. 6, p 255, John Wiley and Sons (1973).
24. Blackburn, J. A., Editor, Spectral Analysis: Methods and Techniques, Marcel Dekker, New York, NY (1970).
25. Currie, L. A., The Discovery of Errors in the Detection of Trace Components in Gamma Spectral Analysis, Modern Trends in Activation Analysis, Vol. II, DeVoe, J. R. and LaFleur, P. D., Editors, p 1215, NBS Spec. Publ. 312 (1968).
26. Currie, L. A., Sources of Error and the Approach to Accuracy in Analytical Chemistry, Chap. 4 in Vol. I, Treatise on Analytical Chemistry, P. Elving and I. M. Kolthoff, Editors, J. Wiley and Sons, New York, NY (1977).
27. Currie, L. A., Detection and Quantitation in X-Ray Fluorescence Spectrometry, Chap. 23 in X-Ray Fluorescence Methods for Analysis of Environmental Samples, T. Dzubay, Editor, Ann Arbor Science Pub., Ann Arbor (1976).
28. Ku, H. H., Precision Measurement and Calibration - Statistical Concepts and Procedures, NBS Spec. Publ. 300, Vol. I, Superintendent of Documents, U.S. Government Printing Office, Washington, DC (1969).
29. Eisenhart, C., Expression of the Uncertainties of Final Results, Science, (1968), 160, 1201.
30. "Round-Table Discussion on Statement of Data and Errors," Nuclear Instr. Meth., (1973), 112, 391.
31. Lide, D. R., Jr. and Paul, M. A., Critical Evaluation of Chemical and Physical Structural Information, Chap. I, "Analysis of Experimental Data," Conference Proceedings, Dartmouth College, National Academy of Sciences (1973).
32. Tukey, J. W., Exploratory Data Analysis, Addison-Wesley (1977).
33. Draper, N. R. and Smith, H., Applied Regression Analysis, Chap. 3, "The Analysis of Residuals," John Wiley and Sons (1966).
34. Experimental Data; Courtesy of P. Pella, NBS (1976).
35. Breiter, D., Personal Communication (1977).
36. Youden, W. J., The Sample, The Procedure, and the Laboratory, Anal. Chem., (1960), 32 [13], 23A.
37. EPA-NBS Interlaboratory [...] Coal, Fly Ash, Fue[...]
38. Dixon, W. J. and Massey, F. J., Jr., Introduction to Statistical Analysis, McGraw-Hill Book Company, Inc., Third Edition (1969).
39. Currie, L. A., Lindstrom, R. M., Barkley, J. F. and Shoenfeld, P. S., "Natural Production Rate and Atmospheric Concentration of Ar-37," Technical Note of the National Bureau of Standards, to be published; 1977.
4
Role of Reference Materials and Reference Methods in the Measurement Process

GEORGE A. URIANO and J. PAUL CALI
Office of Standard Reference Materials, Institute for Materials Research, National Bureau of Standards, Washington, DC 20234
This paper is concerned with the role of reference material[s and reference methods in the] measurement process. [Reference materials and] reference methods are considered to be two vital components of measurement systems needed to assure the accuracy and compatibility of measurements. The views expressed in this paper are an outgrowth of ideas and concepts expressed recently in a number of publications (1,2,3) by J. Paul Cali and other members of the NBS staff. In particular, NBS Monograph 148 provides extensive background information concerning the use of reference methods and reference materials. This monograph was written in response to a request from the Agency for International Development to provide assistance for improving the measurement systems of developing countries.

This paper consists of two parts. The first part is concerned with the achievement of measurement compatibility. Some general considerations concerning the use of reference materials and reference methods in the measurement process are discussed first. Reference materials and reference methods are seen to be two necessary but not always sufficient mechanisms for achieving measurement compatibility between laboratories on a national scale. The general discussion of measurement compatibility is aimed at providing the conceptual framework needed to examine two specific examples of how reference methods and/or reference materials have improved measurement systems for determining calcium in serum and nitrogen dioxide in ambient air. Evidence is also presented showing that the lack of adequate reference materials and/or reference methods is impeding the development of compatible, national measurement systems for the determination of trace amounts of mercury in water and for the measurement of trace chromium in biological matrices.
THE ACHIEVEMENT OF MEASUREMENT COMPATIBILITY

The Importance of Measurement Compatibility
Why do we make measurements? Measurements are important for a number of reasons. Measurements provide the basis for equity in trade and for settling disputes between producer and user. Measurements [are] important [in] industrial quality control process[es], and are used to assess and improve the reliability or performance of materials and systems. In recent years a number of social and/or political considerations have highlighted the importance of good measurements, particularly in the areas of medical diagnosis (5) as well as the enforcement of environmental (6) and occupational safety regulations. Finally, measurements provide us with the quantitative, scientific, and engineering data that serve as the language of science, allowing improved communication of information among the scientists of the world, even across language barriers.

What do we mean by the measurement process? The measurement process consists of at least two components. First, some type of a scale is needed that allows one to quantitatively estimate the value of an intrinsic or extrinsic property of a material or system. Second, a method for applying the scale to whatever property is being measured is needed. The end result of applying a method plus a scale is to arrive at a number that allows a definite value to be assigned to the property under consideration by means of a measurement-property relationship.
In all communications or transactions involving two or more parties, whether they be economic, sociopolitical or scientific, one of the critical steps in the transaction is that the parties involved agree on the results of the measurement and the meaning of the numbers associated with the measurement. This agreement should take into consideration any imprecision and inaccuracies inherent in the measurement process under consideration. If the measurements are in agreement, we say that they are compatible. Thus all people involved in communicating via the measurement can agree that the measurement is useful for whatever end purposes the measurement was made.

By definition, a measurement is accurate when the resulting numbers are both precise and free of any systematic error. Under these conditions compatibility between parties is not only possible but highly probable. This leads to the somewhat obvious but not always appreciated conclusion that measurement accuracy leads to [...] compatibility.

Mechanisms for Achieving Measurement Compatibility
There are a number of different mechanisms by which measurement compatibility may be achieved and accuracy transferred between two or more laboratories. For example, all parties might agree to use a reliable, stable, generated radio signal (e.g., the time signals transmitted by NBS radio station WWV) as the common reference point for assuring the compatibility of time measurements.

The use of reference data provides another means for assuring measurement compatibility. Temperature measurements can be made compatible through the use of reference data such as the melting point of pure substances. Compatibility of electrical conductivity measurements can also be achieved in a similar manner. The purity of the material is very crucial in such an application. Unless all parties are using (the same) "pure" materials, compatibility will not be assured.

A third way to achieve compatibility is through the use of reference materials as a transfer medium. In the broadest sense, a reference material is a material, device, or system, which has been constructed or modified in such a way that definitive numerical values can be associated with specific properties. The "material" might be an ozone generator that emits known amounts of ozone or a homogeneous metal alloy that contains a known amount of chromium.
By using reference materials, measurement compatibility can be achieved on the basis of precision alone, if all parties agree to use the same measurement methods and reference material. This mode of achieving compatibility is illustrated schematically in figure 1. Whether the laboratories are obtaining the "true value" or not is unimportant as long as the measurement is confined to the group of laboratories (A to D), all of which are using exactly the same methods and reference materials. Buyer-seller transactions, for example, can occur in a compatible manner provided we remain within the immediate universe of users, i.e., among laboratories (A to D). However, [if one must] communicate with a [laboratory outside this] domain (e.g., laboratory E) and that laboratory is using a different measurement method and reference material, then measurement compatibility may be difficult if not impossible to achieve if precision alone is the basis.

This problem can usually be alleviated by achieving compatibility through accuracy rather than precision. In that case the recommended measurement methods and reference materials would have known uncertainties associated with them, i.e., they would be characterized in terms of reliable known values denoting both imprecision and systematic errors. One important mechanism for achieving compatibility on the basis of accuracy is through the use of reference materials and reference methods. This is the mechanism used in the determination of calcium in serum to be described later.

Accurate Measurement and True Values
This leads us to the concept of accurate measurement and true values. Let us perform the following thought experiment (illustrated in figure 2): We are to measure a specific property of a liquid substance, which is stable with respect to time and homogeneous with respect to spatial variations of the property being measured. Without getting into the philosophy of what one means by "true value," let us all agree that there can be only one unique true value of this property, for example the copper content of the liquid.

Since measurement compatibility implies a series of users, suppose we have N labs each making measurements in a way that they all get the true
• All laboratories (A-D) in universe using same methods and reference materials
• Hence, compatibility within immediate universe of users
• What happens if we need to communicate with "E," who is using different methods and reference materials?
Figure 1. Schematic of how measurement compatibility may be achieved through precision alone, if all interacting laboratories in a network are using exactly the same measurement methods and reference materials
[Figure 2 schematic: a sample (liquid, stable, homogeneous, with a specific property) is measured by Lab A, Lab B, ..., Lab N, giving $M_A \pm \sigma_A$, $M_B \pm \sigma_B$, ..., $M_N \pm \sigma_N$. If all M's are accurate, then within the σ's $M_A = M_B = \cdots = M_N$ and measurement compatibility results; thus ACCURACY ENSURES COMPATIBILITY.]
Figure 2. Schematic of how measurement accuracy assures measurement compatibility
value $M_T$ within random uncertainties. If all the M's are accurate, then within the measurement uncertainties $\sigma$

$$M_A = M_B = \cdots = M_N$$

and measurement compatibility results. Thus, again we emphasize that accuracy leads to compatibility.
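The thought experiment above can be sketched numerically. The following is only an illustration, not a procedure from this paper: the laboratory values are hypothetical, and the agreement criterion (two results agree if they differ by no more than k times their uncertainties combined in quadrature, with k = 2) is an assumption chosen for the example.

```python
from itertools import combinations
from math import sqrt


def compatible(results, k=2.0):
    """Return True if every pair of (value, sigma) results agrees.

    Assumed criterion: |M_i - M_j| <= k * sqrt(sigma_i**2 + sigma_j**2).
    """
    for (m1, s1), (m2, s2) in combinations(results, 2):
        if abs(m1 - m2) > k * sqrt(s1 ** 2 + s2 ** 2):
            return False
    return True


# Hypothetical copper results (value, standard uncertainty) for labs A, B, N.
accurate_labs = [(10.02, 0.05), (9.97, 0.04), (10.05, 0.06)]
# Same labs, but the last one now carries a systematic error.
biased_labs = [(10.02, 0.05), (9.97, 0.04), (10.60, 0.06)]

print(compatible(accurate_labs))  # True
print(compatible(biased_labs))    # False
```

When every laboratory is accurate, all pairs agree within their uncertainties; a single laboratory with a systematic error breaks compatibility even though its precision is unchanged.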
The necessary requirements for accurate measurement systems are that they be both precise and free of systematic errors (7). It might also be desirable that accurate measurement techniques have such characteristics as rapid operation [...]; such requirements are not necessary to achieving accuracy but rather are practical considerations.

Consider a pragmatic operational definition of what we call "true value." The true value of a property is that value that can ultimately be traced to the base or derived units of measurement, e.g., length, mass, or time, through experiments having no systematic errors (or with systematic error small relative to practical end-use requirements). One could look at the true value of the speed of light in two ways:

1. Philosophically: one may never be able to determine the true value with absolute certainty, and
2. Pragmatically: metrologists have been able to determine a best value of the speed of light with rather small uncertainties. This best value is operationally synonymous with the true value. Furthermore, these uncertainties in the measurement of the speed of light can be directly related to the uncertainties in the determination of the basic measurement units such as length or time.

This paper assumes that the pragmatic approach to determining true values is valid. Thus, true values are those that are determined by precise measurement methods that are essentially free of systematic error and are traceable to basic metrological measurements.
The Measurement Method Hierarchy and the Transfer of Accuracy
This leads us to the concept of the measurement method hierarchy, the problem of transferring accuracy throughout the hierarchy, and to the role of reference materials and reference methods in this process (8,9,10). There are a number of mechanisms that can be utilized to transfer accuracy throughout measurement networks. This is illustrated by referring to the so-called Measurement "Pyramid," which is shown inverted in figure 3.

At the bottom point are the fundamental metrologists of the world, who are concerned with accurate experimental determinations of the base units of measurement. The metrological community consists of a relatively small group of scientists dedicated to the accurate determination of the seven basic measurement units such as length and the 40 or so derived units such as voltage. We must emphasize that unless such fundamental experiments are carried out, the measurement infrastructure is on shaky ground as far as accuracy is concerned. Metrological measurements are normally of the highest accuracy, in some cases approaching 1 part in $10^{10}$ or better.

The next level on the measurement pyramid is represented by absolute measurement methods (or definitive methods, as they are being called in the clinical chemistry field). Definitive methods (11) are those that have been sufficiently well tested and evaluated so that reported results have essentially zero systematic errors and have high levels of precision. Therefore they give true values within a narrow range of uncertainty. These methods are usually expensive and time-consuming, require highly sophisticated analysts, and unfortunately are usually not very practical for everyday field use.

The third level is represented by other methods called by such names as reference methods or standardized methods. These, like absolute methods, are also of known accuracy, although usually the inaccuracies are of greater magnitude or less well defined than for definitive methods. These methods are generally faster and less expensive than the absolute methods. Many of the ASTM recommended analytical methods fall into this category (12).

The final level of the measurement pyramid consists of the routine or field methods, which are generally fast and cheap and usually require relatively unsophisticated personnel for application. Field methods are generally used in applications involving large numbers of separate measurements that must be performed rapidly. Although many of the field methods may be very precise, they may be highly inaccurate.
This brings us to a key point of this paper. In order to achieve measurement compatibility on a large scale, some mechanism for transferring accuracy from the bottom two steps to the top of the pyramid must be found. One mechanism for transferring accuracy through this network is through well-characterized reference materials.
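One simple way such a transfer might look in practice is sketched below. It is an illustration under stated assumptions, not a procedure given in this paper: a field method is run on a reference material of known certified value, the mean offset is taken as an estimate of the method's systematic error, and that offset is then removed from readings on unknown samples. The additive-bias model, the function names, and all numbers are hypothetical.

```python
from statistics import mean


def estimate_bias(reference_readings, certified_value):
    """Estimate the additive systematic error of a method from repeated
    readings on a reference material with a known certified value."""
    return mean(reference_readings) - certified_value


def correct(reading, bias):
    """Remove the estimated additive bias from a reading on an unknown."""
    return reading - bias


# Hypothetical numbers: certified value and three field-method readings
# taken on the reference material.
certified = 10.00
readings_on_reference = [10.42, 10.38, 10.40]

bias = estimate_bias(readings_on_reference, certified)
unknown_corrected = correct(9.15, bias)

print(f"estimated bias: {bias:+.2f}")
print(f"corrected unknown: {unknown_corrected:.2f}")
```

The sketch assumes the bias is constant over the range of interest and that the field method is precise enough for the mean offset to be meaningful; if either assumption fails, the reference material alone cannot repair the measurement.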
EXAMPLES OF THE TRANSFER OF ACCURACY

We now present examples that illustrate the transfer of accuracy throughout the measurement hierarchy. At this point, we ignore the bottom step of the pyramid with the understanding that to achieve measurement accuracy, one must be able to trace certain key measurements to the experimental realization of the base measurement units.

The Measurement of Calcium in Serum
Consider the example of the measurement of Ca in serum, which is used by physicians to diagnose certain thyroid diseases. Figure 4 describes the measurement methods and associated inaccuracies at different levels in the calcium measurement hierarchy. The most accurate method for determining Ca is the definitive method of isotope dilution-mass spectrometry (ID-MS) (13), whereby the accuracy can be evaluated from first principles and traced directly to the experimental realization of the base measurement units. As a reference material, a pure analyte such as CaCO3 is required. The level of inaccuracy is 0.2% for calcium.

The nationally accepted reference method (14,15,16) is an atomic absorption technique. The accuracy of the reference method is based on a standard reference material, which in turn was certified using the definitive method. The level of inaccuracy for the reference method is ±2.0%, a factor of ten higher than for the definitive method.

Finally, there are numerous field methods, many of which have been assessed by the College of American Pathologists (CAP) in its proficiency testing program (17,18). In these tests, the CAP used the reference method with reference sera, which are "matrix" reference materials
• Accuracy decreases and use increases in going from bottom
Figure 3. The measurement "pyramid" representing the hierarchy of measurement methods needed to transfer accuracy from the experimental realization of the base measurement units to methods used in the field
Level I. Definitive method: isotope-dilution mass spectrometry. Accuracy: via first principles; traceable to SI; pure analyte required (SRM). Accuracy level: ±0.2%. Reference material: pure CaCO3.
Level II. Reference method: atomic absorption. Accuracy: via samples accurately assayed using definitive method; SRM required. Accuracy level: ±2.0%. Reference material: pure CaCO3.
Level III. Field methods: 9 different methods assessed by CDC. Accuracy: via reference method and accurately assessed reference sera. Accuracy level: ±5-10%. Reference material: Ca in reference sera.
Figure 4. The hierarchy of reference methods and reference materials used to establish a compatible measurement system for the determination of calcium in serum
containing known amounts of calcium in the matrix of interest. Depending on the specific field method, the inaccuracies are known to be in the range of 5-10%.

The recent efforts to improve the accuracy of calcium measurements in the United States illustrate a number of important points. These include the fact that an important component of a total national measurement system is some mechanism (e.g., proficiency testing) that assures long-term quality control of the system. In 1969, NBS certified and issued an SRM for use in the clinical laboratory, which had very little immediate impact on the measurement of calcium in serum on a national scale, because no nationally accepted reference method was available at that time to evaluate the accuracy of the field methods. Recognizing this deficiency, NBS, together with several other government agencies [the Center for Disease Control (CDC) and the National Institutes of Health (NIH)] and several professional societies (e.g., the American Association for Clinical Chemistry), established a measurement network to develop a reference method for determining calcium in serum (14). Five "round-robin" tests using a network of seven qualified clinical laboratories were required before the accuracy of a reference method was sufficiently demonstrated. Reference method development is not a trivial undertaking: two years of effort costing over $125,000 were required in this case. An accuracy for the reference method of within ±1% of the true value was the initial goal. However, this level of accuracy simply was not attainable at that time, and the accuracy is now ±2%.

Having demonstrated this degree of overall accuracy for the reference method, a sample of serum was prepared and analyzed by seven participating laboratories (using the reference method and the SRM), and by nine commercial and hospital laboratories, which used field methods for calcium. The resulting data are summarized in figure 5. All results are plotted as absolute percent deviations from the "true value" as determined by ID-MS. Note that three of the twelve commercial laboratories reported results above the 8% error danger line, which is the level where incorrect medical diagnosis
could lead to erroneous treatment of the disease called hyperparathyroidism. Obviously, some of the field methods were in need of improvement.

Subsequent to this study, and based on the concept of using reference methods plus reference materials to help assure measurement accuracy and compatibility, the CDC (19) has discovered that of the nine different routine methods used for calcium, one is sufficiently biased that it is now recommended for discontinuance, while four others are in need of improvement. Field tests of methods used in the United Kingdom (20) parallel these findings.

During the development of the reference method, the following factors were found to contribute to systematic biases: quality of reagents (including water); the quality of volumetric glassware; instrument stability and linearity; analytical techniques; the quality and even the motivation of the analyst. When factors such as these were properly identified and controlled, accurate and compatible measurement soon followed.

If measurement compatibility and accuracy on a national scale are to be maintained, these findings indicate the need for a mechanism to assure long-term quality control over the measurement process even in those cases where good methodology and standards have already been developed.

NBS has issued over 20 clinical reference materials in recent years and is currently involved in a joint program with CDC and the Food and Drug Administration to develop a number of clinical reference methods for substances such as glucose, various electrolytes in serum, urea, and uric acid.

The Measurement of NO2 in Ambient Air
An example of the use of SRM's to improve the accuracy of an important environmental measurement system is the use of a series of NBS nitrogen dioxide SRM's to evaluate an official Environmental Protection Agency (EPA) reference method for the analysis of NO2 in ambient air. This particular study is described in a paper by J. R. McNesby (21).
In 1971, EPA designated the "Jacobs-Hochheiser Method" to be the official EPA Reference Method for the measurement of NO2 in ambient air. This is a colorimetric method involving the diazotization of sulfanilamide by ambient NO2 in combination with other appropriate reagents. Because of the legal necessity at the time to quickly designate an official Reference Method, it was not possible to evaluate the accuracy of the Jacobs-Hochheiser Method nor to perform any collaborative testing prior to its designation as an official Reference Method. This method was selected (22) in part because it had been previously used in a health effects study and was known to have good precision. Thus, in theory an internally consistent and compatible measurement system would result whereby unknown ambient air concentrations could be related to the health effects data.
In 1972, EPA requested and supported the issuance of an NO2 Permeation Tube Standard Reference Material designated as SRM 1629 (23). This SRM consists of a glass tube filled with liquid NO2. The NO2 permeating through the cap is accurately measured via gravimetry. The tubes are calibrated for permeation rate as a function of temperature and are used to generate known concentrations of NO2 in air.
In their first application, the SRM's were used by EPA (24) to evaluate the systematic errors inherent in the Jacobs-Hochheiser reference method. Previous applications of the Jacobs-Hochheiser method were based in part on the assumption that the collection efficiency was a constant 35% over the complete range of concentration. EPA used the SRM's to determine the collection efficiencies of the Jacobs-Hochheiser method, and the results are shown in figure 6. Using four different permeation tubes, the overall collection efficiencies were found to vary considerably with NO2 concentration. The 35% efficiency level is shown by the dashed line. The collection efficiencies at low concentrations were considerably higher than 35%. At high concentrations the collection efficiencies were considerably lower. These data showed the systematic errors inherent in the Jacobs-Hochheiser method. EPA withdrew the method as the official Reference Method for NO2 measurements and has recently evaluated other more promising analytical methods. They have proposed that NBS Standard Reference Materials be used to calibrate the methods (25).
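The size of the systematic error introduced by a wrong efficiency assumption can be sketched as follows. This is an illustration only: it assumes the reported concentration is obtained by dividing the collected amount by the assumed 35% efficiency, and the "actual" efficiencies and concentrations used below are hypothetical values chosen for the arithmetic, not data read from figure 6.

```python
ASSUMED_EFFICIENCY = 0.35  # constant efficiency assumed in earlier applications


def reported_concentration(true_conc, actual_eff, assumed_eff=ASSUMED_EFFICIENCY):
    """Concentration a laboratory would report if it corrects the collected
    amount with the assumed efficiency instead of the actual one."""
    collected = true_conc * actual_eff
    return collected / assumed_eff


def percent_error(true_conc, actual_eff, assumed_eff=ASSUMED_EFFICIENCY):
    """Systematic error of the reported value, as a percent of the true value."""
    reported = reported_concentration(true_conc, actual_eff, assumed_eff)
    return 100.0 * (reported - true_conc) / true_conc


# Hypothetical cases: efficiency above 35% at a low concentration,
# below 35% at a high concentration.
low = percent_error(true_conc=50.0, actual_eff=0.70)
high = percent_error(true_conc=400.0, actual_eff=0.20)

print(f"low concentration:  {low:+.0f}% error")
print(f"high concentration: {high:+.0f}% error")
```

Under these assumptions the relative error depends only on the ratio of actual to assumed efficiency, so a method that is highly precise can still overestimate low concentrations and underestimate high ones.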
[Figure 5 plot: deviations of laboratory results from the "true value" for laboratories using the SRM plus reference method (SRM + RM) and for laboratories using field methods. Marked levels: inaccuracy limit of RM + SRM, desired medical error limit, and the region where possible wrong medical decisions result. Source: National Bureau of Standards Special Publications.]
Figure 5. A comparison of calcium in serum data obtained by laboratories using the SRM plus reference method (RM) with results obtained by laboratories using field methods (14)
[Figure 6 plot: collection efficiency (vertical axis, ticks from 10 to 80) against concentrations of nitrogen dioxide sampled (horizontal axis, about 0.016 to 0.40 ppm, or 30 to 750 µg/m3), with separate symbols for the 1st, 2nd, 3rd, and 4th permeation tubes.]
Figure 6. Data illustrating the use of NBS NO2 permeation tube SRM's to evaluate the collection efficiencies of the Jacobs-Hochheiser method (21)
This example points out the great need for measurement methods with well-characterized accuracies as well as for good reference materials. The reference material by itself cannot assure interlaboratory measurement compatibility if the analytical methodology is poor. However, in such cases, the SRM provides an important means for evaluating analytical methods.

The Determination of Mercury in Water at the PPB Level
The need for good reference materials and analytical methods is also illustrated by the measurement of trace mercury in water at levels of 1 part per billion or less. In early 1975, NBS was about ready to issue two new Standard Reference Materials to be used in low-level mercury analysis. SRM's 1641 and 1642 were certified for mercury composition at the level of 1.49 µg/ml and 1.18 ng/ml, respectively. Since there were no mercury reference materials available at those levels, it was thought that the NBS SRM's might provide an opportunity to evaluate analytical field methods currently being used for mercury monitoring. Here was a situation where there were no accepted national reference methods but there were supposedly adequate field methods (e.g., cold-vapor atomic absorption) in rather widespread use.
D. A. Becker (26) and coworkers in the NBS Analytical Chemistry Division carried out an interlaboratory comparison for mercury measurements. The participating laboratories performed experiments with and without SRM's to see if the SRM would help to improve measurement compatibility between laboratories. Seventeen laboratories participated in one or both phases of this study, and eight laboratories submitted sufficient data to allow statistical analysis. Four different mercury concentrations were used to establish target values covering the range from 0.2 to 5.0 parts per billion. The raw data without the use of the SRM show a great deal of scatter, with several of the laboratories obtaining results that deviated from the target values by a factor of 10 or more.

This experiment was repeated at the same concentration levels, but the NBS SRM's were sent out along with the "unknown" solutions to be used as controls for checking the analytical procedures of the individual laboratories. A summary of the results is shown in tables 1 and 2. Summarized are the target value, the average value obtained by all laboratories (with and without the use of the SRM), and the coefficient of variation between laboratories (with and without the use of the SRM). The data in table 1 indicate that the precision between laboratories was improved little if at all through the use of the SRM. Table 2 shows the deviations of the average values from the target value, expressed as a percentage of the target value, with and without the use of the SRM. These results indicate little or no improvement in accuracy through the use of the SRM. Thus, in this particular case the use of the SRM appears to have little or no effect on improving either the interlaboratory precision or the accuracy of the measurements. Any improvements are probably too small to be of practical value. The interlaboratory precision and accuracy seem to be much better above 1 ppb than below, with or without the use of the SRM.

The fact that the SRM does not help to significantly improve the accuracy or compatibility of these measurements leads us to examine the measurement methodology being used. A more detailed examination of the raw data indicates that only a few of the participating laboratories achieved good accuracy and precision with or without the SRM at all concentrations studied.
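The two statistics used in tables 1 and 2 can be sketched as follows. The percent deviations are recomputed from the interlaboratory averages reported in table 1 (the 0.64 ppb row is omitted because its entries are not fully legible in this copy). The `percent_cv` function follows the definition in the footnote to table 1; its use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the paper does not say which form was used.

```python
from statistics import mean, stdev

# target value (ppb): (average without SRM, average with SRM), from table 1
table1_averages = {
    0.18: (0.24, 0.23),
    1.16: (1.30, 1.22),
    4.98: (5.17, 5.12),
}


def percent_deviation(average, target):
    """Deviation of the interlaboratory average from the target value,
    expressed as a percentage of the target (the quantity in table 2)."""
    return 100.0 * (average - target) / target


def percent_cv(lab_values):
    """Relative standard deviation among laboratories, as a percent of their
    average. Assumption: the sample (n - 1) standard deviation is used."""
    return 100.0 * stdev(lab_values) / mean(lab_values)


for target, (without_srm, with_srm) in table1_averages.items():
    d_without = round(percent_deviation(without_srm, target))
    d_with = round(percent_deviation(with_srm, target))
    print(f"{target} ppb: {d_without:+d}% without SRM, {d_with:+d}% with SRM")
# 0.18 ppb: +33% without SRM, +28% with SRM
# 1.16 ppb: +12% without SRM, +5% with SRM
# 4.98 ppb: +4% without SRM, +3% with SRM
```

The recomputed deviations agree with the corresponding rows of table 2, which confirms that table 2 is derived directly from the averages in table 1.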
Among the laboratories that performed well, one deviated from the target value by only +11% and +4% without the SRM at the lowest and highest concentrations studied. Using the SRM, the deviations were +11% and +3% at the lowest and highest concentrations. Such data indicate that the measurement procedures of several of the laboratories were in very good statistical control. The data also provided some evidence that the laboratories coming closest to the target values also had greater within-laboratory precision than the other laboratories.

It would be interesting to determine why certain laboratories were able to do so well while the measurements of others seemed completely out of control, even though a supposedly similar measurement technique was used by all. One approach to improving results in those laboratories would be to have the
Table 1. Average and Coefficient of Variation for Data Obtained by Eight Laboratories in the Evaluation of Methods Used to Determine Trace Mercury in Water

Target Value    Without Use of SRM          With Use of SRM
(ppb)           Average(1)    % CV(2)       Average(1)    % CV(2)
0.18            0.24          57            0.23          60
0.64            0.7           ...           ...           ...
1.16            1.30          25            1.22          16
4.98            5.17          7.3           5.12          9.3

(1) Average of 8 laboratories.
(2) Relative standard deviation, expressed as percent of the average (of 8 labs), of variability among laboratories.
Table 2. Deviation from the Target Values of the Interlaboratory Comparison Test Results for Mercury Measurements

Target Value    % Deviation From Target Value
(ppb)           Without Use of SRM    With Use of SRM
0.18            +33                   +28
0.64            +17                   -9
1.16            +12                   +5
4.98            +4                    +3
laboratories that are capable of producing results with good within-laboratory precision and accuracy develop a strict measurement protocol or reference method. Then one could repeat the experiment, requiring that the strict reference method protocol be followed by all laboratories. This should increase within-laboratory precisions to the point where the SRM should then be of greater value in reducing between-laboratory variations.

This example indicates that a reference material in and of itself is not sufficient to ensure accurate and compatible measurements if a state of internal or within-lab quality control has not first been attained by each laboratory. In addition to making proper use of the SRM, adequate analytical procedures and methods under strict quality control are also necessary.

Trace Cr in Biological Matrices
The final example is concerned with a lack of measurement compatibility between laboratories studying the role of trace metals in biological processes and systems. A recent paper by W. Mertz has reviewed the measurement problems associated with the analysis of the trace elements important to nutrition and health. For example, Cr is believed to play an important role in processes governing the onset of diabetes through a Cr-containing substance known as the "glucose tolerance factor." Over the past few years, some serious problems associated with interlaboratory compatibility of Cr analyses in certain biological matrices have become apparent. Mertz reported on values for Cr concentrations in blood as obtained by various investigators using several different analytical methods (27). Variations by a factor of 1000 were reported. Even if one takes into account the fact that the investigators used different specimens and the differences one might expect in "normal" values between individuals, the trace chromium measurement system seems to be out of control.

This is further illustrated by some recent measurements performed at NBS and elsewhere (26). The Cr content of two biological matrix SRM's was determined. The NBS Orchard Leaves SRM (SRM 1571) has a known value for the Cr concentration, which seems to be easily reproduced when measured independently by different laboratories. On the other hand, when Cr is determined in the NBS Bovine Liver SRM (SRM 1577) at different laboratories, the values obtained show considerable variability. For example, NBS determined the total Cr content in Bovine Liver to be 170 ± 20 ppb using Neutron Activation Analysis with radiochemical separation. Another laboratory obtained 50 ppb using the same technique.

Not only is the total Cr content difficult to determine in Bovine Liver, but quantitative estimates of organic species of Cr are even more difficult to obtain. A common technique for measuring chromium is graphite furnace atomic absorption. However, the organic chromium in the glucose tolerance factor is apparently lost during the charring cycle, since it is volatile, thus leading to large measurement errors.

What we hope to do at NBS is to help resolve the Cr measurement problem by producing a Standard Reference Material with known certified concentrations of total Cr and also, hopefully, organic Cr. Brewers Yeast is the candidate biological matrix material. If such an SRM can be certified, it can then be used to optimize the atomic absorption techniques used to determine Cr and to help expedite the development of a reference method. The question of determination of speciation is particularly important because metallorganic Cr can be a factor of 100 more active in the glucose tolerance factor than inorganic chromium.
This example illustrates the fact that problems involving the quantitative measurement of distinct chemical species, as opposed to elemental content, are becoming increasingly important, particularly in the health and environmental areas. We believe that the accurate determination of the composition of distinct chemical species will present developers of reference methods and reference materials with a host of new and challenging measurement problems in future years.
CONCLUSION

Consider the problems in chromium analysis; in this case we are in the early stages of establishing a compatible national measurement system. One might ask, what should be developed first, reference materials or reference methods? There is no definitive answer in general because each distinct type of measurement presents a unique set of problems and requirements. Good methodology (preferably definitive methods) is needed to certify SRM's, and yet well-characterized reference materials are obviously useful for evaluating methods (in particular reference methods). This leads to the conclusion that the establishment of a compatible measurement system is an iterative, self-consistent process sometimes requiring several generations of validating tests. This also leads to [...] measurement network [...] accuracy [...] number of important mechanisms required to develop a compatible measurement system. They are:

(1) An agreed-upon system of base units of measurement;
(2) Accurate reference methods and reference materials;
(3) Field methods made compatible through the use of reference methods and reference materials.

In addition, some type of built-in feedback mechanism is needed for maintaining quality control of the whole system and for assuring compatibility between various components of the system. One such idealized measurement system is shown schematically in Figure 7. Measurement systems analogous to this are already well on their way to being implemented in environmental and clinical areas such as those previously cited. It is through networks such as these that measurement compatibility can be assured and maintained and that "meaningful," i.e., compatible, measurements can be achieved on a national or international basis.

Figure 7. Schematic of an "idealized" measurement network needed to transfer accuracy and assure measurement compatibility. (Labels in the figure: "True Value"; Definitive Method; Reference Materials; Reference Methods; Method and Instrument Development; Method Evaluation; Proficiency Testing; Regulatory Requirements; Quality Control Materials; Secondary Reference Materials; Field Methods.)
ACKNOWLEDGMENTS

The authors wish to acknowledge several helpful discussions with Dr. James R. DeVoe during the preparation of this paper. We are grateful to Dr. Donald A. Becker and coworkers for supplying us with the data on mercury and chromium measurements. We also thank Dr. James R. McNesby for supplying us with the data on the use of NBS NO2 SRM's in evaluating the Jacobs-Hochheiser Method.
ABSTRACT

Reference Materials (called Standard Reference Materials, SRM's, by the National Bureau of Standards, NBS) and Reference Methods are two important mechanisms being utilized to assure the accuracy and compatibility of measurements in large measurement systems. SRM's are materials whose properties (compositional and/or physical) have been well characterized and certified by NBS. Reference Methods are analytical methods having high accuracy and precision, which have been thoroughly demonstrated. A systems approach to establishing accurate measurement systems is presented. Reference materials and reference methods assist in the transfer of accuracy [...] realization of base [...] performance of measurements in the field. The application of the systems approach to "real world" situations is illustrated through the presentation of four examples: (1) the measurement of calcium in serum; (2) the determination of NO2 in ambient air; (3) the analysis of trace levels of mercury in water; and (4) the measurement of chromium in biological matrices.
LITERATURE CITED

1. Cali, J. P., et al., "The Role of Standard Reference Materials in Measurement Systems," NBS Monograph 148, U.S. Government Printing Office, Washington, D.C. 20402 (1975).
2. Cali, J. P. and Stanley, C. L., Annual Review of Materials Science, 5, 329 (1975).
3. Cali, J. P., Med. Instru., 8, 17 (1974).
4. Wernimont, G., "Statistical Control of Measurement Processes," contained in this monograph.
5. Eilers, R. J., Clinical Chemistry, 21, 10 (1975).
6. Rhodes, R. C., "Importance of Sampling Errors in Chemical Analysis," contained in this monograph.
7. Eisenhart, C., Science, 160, (1968).
8. Seward, R. W., editor, "Standard Reference Materials and Meaningful Measurements," NBS Spec. Publ. 408, U.S. Government Printing Office, Washington, D.C. 20402 (1975).
9. Cali, J. P. and Reed, W. P., "The Role of NBS Standard Reference Materials in Accurate Trace Analysis," to be published in Accuracy in Trace Analysis, NBS Spec. Publ. 422, U.S. Government Printing Office, Washington, D.C. 20402 (1976).
10. Cali, J. P., Bulletin of the World Health Organization, 48, 721 (1973).
11. Young, D. S., Z. Klin. Biochem., 12, 560 (1974).
12. See for example, 1973 Annual Book of ASTM Standards, parts 12, 32 and 42, published annually by the American Society for Testing and Materials, Philadelphia, Pa., 19103.
13. Moore, L. J., Anal. Chem., 44, 2291 (1972).
14. Cali, J. P., et al., "A Referee Method for the Determination of Calcium in Serum," NBS Spec. Publ. 260-36, U.S. Government Printing Office, Washington, D.C. 20402 (1972), reprinted (1976).
15. Cali, J. P. (1973).
16. National Committee for Clinical Laboratory Standards, NCCLS Proposed Standard: PSC-4 (1976).
17. Gilbert, R. K., Am. J. Clin. Pathol. 63, 974 (1975).
18. Hanson, D. J., Am. J. Clin. Pathol. 61, 916 (1974).
19. Private communication from J. Boutwell to J. P. Cali.
20. Pickup, J. F., et al., Clin. Chem. 21, 1416 (1975).
21. McNesby, J. R., Berichte der Bunsen-Gesellschaft für Physikalische Chemie, 78, 158 (1974).
22. "Air Quality Criteria for Nitrogen Dioxides," EPA, AP-84, (1971).
23. Hughes, E. E., "Development of Standard Reference Materials for Air Quality Measurements," International Instrumentation Automation Conference and Exhibit, ISA Reprint 74-704 (1974).
24. Federal Register, 38, 15174 (June 8, 1973).
25. Federal Register, 41, 11261 (March 17, 1976).
26. Becker, D. A., Private communication.
27. Mertz, W., Clinical Chemistry, 21, 408 (1975).
5
Optimization of Experimental Parameters in Chemical Analysis
STANLEY N. DEMING
Department of Chemistry, University of Houston, Houston, TX 77004
The title of this symposium is "Validation of the Measurement Process."
In its usual usage, "validation" means "the determination of the degree of validity of a [measurement process]" (1). This definition suggests an activity that takes place after the measurement process has been developed. If the evaluation is successful, the process will receive official sanction, confirmation, or approval.

An alternate meaning of "validation" is "to make valid" in the sense of "producing the desired result" (2); that is, making the measurement process meet the criteria against which it is to be evaluated. This definition suggests activity that takes place while the measurement process is being developed. This latter interpretation has been emphasized by Youden (3) and is the interpretation I wish to stress here: if the initial development of a measurement process is carried out with the goal of meeting the evaluation criteria, then the probability that the process will receive rapid approval is greatly increased.

SYSTEMS THEORY

Figure 1 shows a systems theory view of the measurement process. The primary input to the system is a sample. The measurement process abstracts the desired information from the sample and transforms the information into a number. This number, or result, is the primary output from the system.
Ideally, the numerical value of the output should be related only to the desired information in the sample. As an example, a "perfect" measurement process for the analysis of enzyme activity would be sensitive to the amount of enzyme in the sample and insensitive to all other variables. In practice, the numerical value of the output is influenced by a host of other factors. Some are associated with the sample matrix, while others appear as additional inputs to the measurement process. These factors may be systematized and are shown schematically in Figure 1.

An obvious categorization of the factors influencing a measurement process is into a set of factors that are known to have an effect on the process (solid arrows) and a second set of factors that do affect the results of the measurement process but have not yet been identified; that is, they are unknown (dashed arrows). Another grouping divides the factors into those that are controlled (represented by a dot on the tail of the arrow) and those that are uncontrolled. When factors are categorized in these two ways, four distinct types result.

A factor that is known to exert a significant influence on the result of a measurement process is usually controlled. This will usually improve the precision of the method if variations in the uncontrolled factor level appear as noise (that is, the variations are rapid with respect to the frequency of measurement); it might also improve the accuracy of the method if the interval between calibrations is long with respect to variations in the uncontrolled factor level.

Some factors are known to influence the result of a measurement process but are left uncontrolled. For example, if a factor is difficult or expensive to control and if the functional relationship of its influence is known, the level of this factor might be measured and a correction applied to the result. Or it might be known that a factor's influence on the result, though real, is not significant; it would probably be unnecessary to control such a factor.
Factors that are unknown and controlled are not usually a problem unless the method of control is inadvertently changed. A familiar example of a factor that is unknown and controlled is an impurity in a reagent: because the reagent is always added in a fixed amount, the level of impurity is also constant and is controlled. The effects of the reagent and the impurity are confounded and are usually not separated unless a change in reagent lot or supplier is made.

Most unknown factors are uncontrolled. It is assumed that factors in this category do not or will not exert a significant influence. Whatever influence they do exert is accepted as "noise" or imprecision.

The sample can contain factors from all of these categories (4).

RUGGEDNESS OF MEASUREMENT PROCESSES
As Mandel has pointed out, "The development of a method of measurement is to a large extent the discovery of the most important environmental factors and the setting of tolerances for the variation of each one of them" (4). Tolerances make possible the operational implementation of the concept of control: it is often impossible or impractical to control a factor at a given level, but it is usually possible and practical to control a factor within a specified domain of factor levels; that is, to control a factor around a given level, within specified tolerances.

The specification of factor tolerances is based upon the required precision of the method and answers the question, "To what extent can a factor be allowed to vary before the output of the system changes by y amount?" For a specified value of y, it is desirable that these tolerances be broad so that the measurement process is relatively insensitive to small variations in factor levels.

To illustrate, consider reaction rate (the result of a measurement process) as a function of pH (a known and controlled factor) for the kinetic determination of enzyme activity (see Figure 2). In general, enzymes do not function well at extremes of pH and exhibit an optimum with respect to pH. Let us assume that a method is to be developed for measuring the activity of an enzyme. A performance criterion has been specified that requires an interlaboratory agreement within y reaction rate units.

If the initiating laboratory develops a method in which the pH level is set at point A in Figure 2, then the interlaboratory control of pH must be extremely tight: small differences in pH between the laboratories evaluating the method will show up as a large between-laboratory variance. Worse still, in the absence of additional information, it would be difficult to separate pH as one of the causes of this variance.

If, instead, the initiating laboratory develops a method in which the pH level is set at point B in Figure 2, then small differences in pH between the laboratories evaluating the method will contribute very little to the between-laboratory variance. This method would have a higher probability of being accepted after its first interlaboratory test.

Many of the factors affecting measurement processes exhibit the behavior shown in Figure 2. Other factors initially increase and then asymptotically approach a plateau (e.g., the substrate dependence of many enzymes). With factors exhibiting these types of behavior, adjusting the factor levels to improve the system output will also improve the factor tolerances (5).

Figure 1. Systems ... (Diagram: SAMPLE -> SYSTEM -> RESULT.)

Figure 2. Reaction rate as a function of pH for the kinetic determination of enzyme activity
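The contrast between a set point on the flank of the curve and one at the optimum can be illustrated numerically. The following is a minimal sketch, not part of the original study: it assumes a hypothetical bell-shaped (Gaussian) dependence of reaction rate on pH, with made-up values for the optimum, the width, and the maximum rate, and it takes point A on the steep flank and point B at the optimum.

```python
import math

# Hypothetical bell-shaped rate curve; the optimum, width, and maximum
# rate are illustrative assumptions, not values from the chapter.
def rate(ph, optimum=7.0, width=1.0, max_rate=100.0):
    return max_rate * math.exp(-0.5 * ((ph - optimum) / width) ** 2)

def ph_tolerance(ph_set, y, step=0.001, max_delta=5.0):
    """Largest symmetric pH deviation around ph_set for which the rate stays
    within y units of its set-point value (searched outward on a grid of
    size `step` until the limit is first exceeded)."""
    base = rate(ph_set)
    delta = 0.0
    while delta + step <= max_delta:
        trial = delta + step
        if (abs(rate(ph_set + trial) - base) > y
                or abs(rate(ph_set - trial) - base) > y):
            break
        delta = trial
    return delta

y = 2.0                          # allowed change in rate units (assumed)
tol_a = ph_tolerance(6.0, y)     # point A: on the steep flank
tol_b = ph_tolerance(7.0, y)     # point B: at the optimum
print(f"tolerance at A: +/-{tol_a:.3f} pH")
print(f"tolerance at B: +/-{tol_b:.3f} pH")
```

Under these assumptions the allowable pH tolerance at the optimum is several times wider than on the flank, which is the sense in which improving the response also improves the factor tolerances.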
DEVELOPMENT OF MEASUREMENT PROCESSES

The development of a measurement process should involve three stages: obtaining a response, improving the response, and understanding the response. Many laboratories carry the development through the first stage only. Youden (3) has pointed out the potential limitations of such methods and has emphasized the importance of acquiring an operational understanding of the measurement processes; that is, identifying and controlling those factors that exert a significant effect on the system. This is especially critical if the measurement processes are to become widely used by a number of laboratories.

The improvement of response has been carried out infrequently, although its importance has been recognized for some time. In 1952, Box (6) presented a paper in which it was pointed out that single-factor-at-a-time strategies are inadequate for optimizing most chemical processes (7,8) and that measurement processes (conceptually very similar to production processes) can be effectively improved by the use of sequential factorial designs, a technique that later became known as "evolutionary operation," or EVOP (9,10). EVOP strategies are well suited to the industrial environment: the production process is being run constantly and provides a continuous framework for the large number of experiments required by the sequential factorial designs. In the developmental laboratory, however, efficiency of initial experimentation is stressed, and more efficient strategies are desirable. In 1962, Spendley et al. (11) introduced the fixed size simplex as a more efficient sequential experimental design for traditional evolutionary operations; Long (12) appears to have been the first to apply fixed size sequential simplex designs to the development of measurement processes. Nelder and Mead (13) modified the sequential simplex method to allow acceleration in directions that are favorable and deceleration in directions that are unfavorable. We have found the variable size simplex (slightly modified) to be a rapid means of improving results in the development of analytical chemical measurement processes (14-17).

Factorial designs (18), central composite designs (19), and Box-Behnken designs (20) are useful for understanding the various factor effects upon the response in the region of the optimum.
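A single move of a two-factor sequential simplex can be sketched as follows. The vertices and absorbances are copied from rows 6, 7, and 8 of Table 1 in the example that follows; the helper names are illustrative, and the contraction rule used here (the midpoint between the worst vertex and the centroid of the retained vertices) is an assumption rather than a statement of the authors' exact algorithm.

```python
# One move of a two-factor sequential simplex (maximizing the response).
# Vertices are (ml CTA, ml H2SO4) with their measured absorbances,
# copied from rows 6, 7, and 8 of Table 1.
simplex = {
    (0.147, 2.65): 0.563,   # vertex 6
    (0.182, 3.09): 0.562,   # vertex 7
    (0.260, 2.77): 0.573,   # vertex 8
}

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def reflect(worst, retained):
    """Reflect the worst vertex through the centroid of the retained ones."""
    c = centroid(retained)
    return tuple(2 * ci - wi for ci, wi in zip(c, worst))

def contract_toward_worst(worst, retained):
    """Assumed contraction rule: midpoint between centroid and worst vertex."""
    c = centroid(retained)
    return tuple((ci + wi) / 2 for ci, wi in zip(c, worst))

worst = min(simplex, key=simplex.get)            # lowest absorbance
retained = [v for v in simplex if v != worst]
reflected = reflect(worst, retained)
contracted = contract_toward_worst(worst, retained)
print("worst vertex:", worst)
print("reflection:  (%.3f, %.2f)" % reflected)
print("contraction: (%.3f, %.2f)" % contracted)
```

With these three vertices the reflection falls at about (0.225, 2.33) and the contraction at about (0.193, 2.90), which agree to within rounding with the rejected vertex 9' and the retained vertex 9 listed in Table 1.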
EXAMPLE

Formaldehyde in an aqueous sample can be determined by the addition of chromotropic acid (4,5-dihydroxy-2,7-naphthalenedisulfonic acid) and sulfuric acid (21-25); a color develops, and the absorbance is read at 570 nm. In this study (14), a sample size of 2.00 ml was chosen. The amount of aqueous 20 g l^-1 chromotropic acid (CTA, factor x1) was allowed to vary between 0.00 and 1.00 ml; concentrated sulfuric acid (H2SO4, factor x2) could vary between 1.00 and 10.00 ml. The objectives of the study were: (a) to determine the amounts of H2SO4 and CTA that produced the greatest absorbance for a given amount of formaldehyde (2 ppm)
Figure 3. Simplex progress in the chromotropic acid-concentrated sulfuric acid domain. See text and Table 1 for details (14). (Horizontal axis: ml of CTA. Source: Analytica Chimica Acta.)
TABLE 1
Simplex progress

Vertex(a)   Retained    Chromotropic   Sulfuric     Absorbance
            vertices    acid (ml)      acid (ml)
1           -            0.200         2.00          0.221
2           -            ...           ...           0.080
3           1,2          0.355         ...           ...
...         -           -0.224         2.50         -1.000(b)
4           1,3          0.529         2.26          0.223
5'          -            0.684         2.94          0.197
5           3,4          0.321         2.23          0.325
6           3,5          0.147         2.65          0.563
6'          -           -0.044         2.85         -1.000(b)
7           3,6          0.182         3.09          0.562
8'          -           -0.027         3.07         -1.000(b)
8           6,7          0.260         2.77          0.573
9'          -            0.226         2.33          0.502
9           6,8          0.193         2.90          0.599
10'         -            0.305         3.03          0.570
11          8,9          0.187         2.74          0.584

(a) Primes indicate rejected vertices.
(b) Boundary violation.
Reprinted from reference 14 with permission of Elsevier Scientific Publishing Company.
TABLE 2
Results of factorial experiments

Chromotropic   Sulfuric     Absorbance
acid (ml)      acid (ml)    (replicates)
...            2.50         ...      0.53...
               2.80         0.515    0.516
               3.10         0.526    0.530
...            2.50         0.455    0.509
               2.80         0.583    0.575
               3.10         0.534    0.545
...            2.50         0.386    0.428
               2.80         0.545    0.537
               3.10         0.554    0.551

Reprinted in part from reference 14 with permission of Elsevier Scientific Publishing Company.
Figure 4. Cell mean plot of factorial study (14). (Source: Analytica Chimica Acta.)
TABLE 3
Factorial analysis of variance (ANOVA)

Source of            Degrees of   Sum of     Mean      F-ratio   Significance
variation            freedom      squares    square              (%)
Chromotropic acid    2            0.00369    0.00179    6.27     98.0
Sulfuric acid        2            0.01926    0.00963   33.66     99.9
Interaction          4            ...        ...       ...       ...
Error                ...          0.00258    0.00029

Reprinted in part from reference 14 with permission of Elsevier Scientific Publishing Company.
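The entries of Table 3 are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and each F-ratio is an effect mean square divided by the error mean square. The sketch below checks this for the values given in Table 3. The error degrees of freedom (9) is an assumption, taken as 18 observations minus 9 cells for a 3 x 3 design run in duplicate; the function names are illustrative.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / degrees_of_freedom

def f_ratio(ms_effect, ms_error):
    """F-ratio = effect mean square / error mean square."""
    return ms_effect / ms_error

# Values from Table 3; the error df of 9 is an assumption (see lead-in).
ms_error = mean_square(0.00258, 9)
ms_h2so4 = mean_square(0.01926, 2)
ms_cta = 0.00179                     # mean square as given in the table

f_h2so4 = f_ratio(ms_h2so4, ms_error)
f_cta = f_ratio(ms_cta, ms_error)
print(f"error mean square:     {ms_error:.5f}")
print(f"F (sulfuric acid):     {f_h2so4:.1f}")
print(f"F (chromotropic acid): {f_cta:.1f}")
```

Both ratios come out close to the tabulated 33.66 and 6.27; the small differences are consistent with rounding in the printed sums of squares.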
Figure 5. Absorbance response surface as a function of chromotropic acid volume and concentrated sulfuric acid volume (14). (Source: Analytica Chimica Acta.)
and (b) to understand the effects of H2SO4 and CTA upon the response in the region of the optimum so that factor tolerances could be specified.
Figure 3 shows the progress of the simplex toward the optimum (data in Table 1). The numbers in the figure indicate the sequence in which the retained vertices were evaluated; rejected vertices are not shown. Table 2 contains the results of a three-level, two-factor, full factorial study with replication carried out in the region of the simplex optimum; the results of the analysis of variance are presented in Table 3. In Figure 4, cell means are plotted vs. CTA for each of the three levels of H2SO4.

Other studies (25) suggest that the absorbance response is related to the ratio (sulfuric acid volume)/(total volume); this is probably an effect caused by the heat of mixing, which drives the reaction toward completion. Assuming this relationship to be approximately Gaussian in the region of the optimum, a model of the form

$$
\text{Absorbance} = k_1 \exp\left[-\frac{\left(\dfrac{x_2}{x_1 + x_2 + 2.0} - k_2\right)^2}{2k_3^2}\right]\left(\frac{2.0}{x_1 + x_2 + 2.0}\right)\left(1 - \exp(-k_4 x_1)\right)
$$

can be fit and is visualized in the pseudo-three-dimensional plot shown in Figure 5. The factorial points are superimposed on the surface. At low H2SO4 volumes, increasing the volume of CTA moves across the "front" of the response surface with the result that response decreases. At an intermediate level of H2SO4, increasing the volume of CTA moves from "behind" the diagonal ridge to the top of it and down again on the front side. At the highest level of H2SO4 studied, increasing the volume of CTA moves along the "back" of the ridge with the result that response increases.

With this operational understanding of a process for measuring the amount of formaldehyde in an aqueous sample, factor levels and factor tolerances can be specified. Because concentrated H2SO4 is a worrisome reagent, and because aqueous CTA solutions of accurate concentration are easily prepared and handled, it would be appropriate to specify a tight CTA level of 0.1 ml where the volume of H2SO4 has less of an effect.
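The ridge behavior described in the text follows from the dependence on the ratio (sulfuric acid volume)/(total volume). The sketch below keeps only a Gaussian term in that ratio and omits any other factors of the fitted model; the center and width are hypothetical values chosen for illustration (they are not fitted parameters from the study), and the CTA volumes of 0.1 and 0.3 ml are likewise assumed.

```python
import math

SAMPLE_ML = 2.0   # sample volume used in the study

def acid_ratio(cta_ml, h2so4_ml):
    """(sulfuric acid volume) / (total volume)."""
    return h2so4_ml / (cta_ml + h2so4_ml + SAMPLE_ML)

def relative_response(cta_ml, h2so4_ml, center=0.57, width=0.02):
    # Gaussian in the acid ratio; center and width are hypothetical.
    r = acid_ratio(cta_ml, h2so4_ml)
    return math.exp(-((r - center) ** 2) / (2 * width ** 2))

for h2so4 in (2.50, 3.10):
    low = relative_response(0.1, h2so4)
    high = relative_response(0.3, h2so4)
    trend = "increases" if high > low else "decreases"
    print(f"H2SO4 = {h2so4:.2f} ml: response {trend} as CTA goes 0.1 -> 0.3 ml")
```

With these assumed values the response falls with added CTA at 2.50 ml of H2SO4 and rises at 3.10 ml, the same pattern as the "front" and "back" of the ridge described in the text.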
CONCLUSION

The strategy of obtaining a response, improving the response, and understanding the response is a reasonable means of obtaining sound measurement processes. We have found the variable size sequential simplex design to be an efficient means of optimizing the primary response from a measurement process. Established statistical designs allow an understanding of the factor effects and their interactions in the region of the optimum.
ACKNOWLEDGMENTS

The following contributed to the work presented here: P. G. King, S. L. Morgan, L. R. Parker, Jr., A. S. Olansky, and L. A. Yarbro. The author acknowledges support from the National Science Foundation through grants GP-32911 and MPS-74-23157.
LITERATURE CITED

1. "Webster's New Collegiate Dictionary," 1292, G. Merriam Company, Springfield, MA, 1973.
2. "The Random House Dictionary of the English Language," 1578, Random House, New York, NY, 1969.
3. Youden, W. J., Materials Research & Standards (1961), 1, 862.
4. Mandel, J., "The Statistical Analysis of Experimental Data," Interscience, New York, NY, 1964.
5. Skogerboe, R. K., in Baer, W. K., Perkins, A. J., and Grove, E. L., Eds., "Developments in Applied Spectroscopy," Vol. 6, 127, Plenum, New York, NY, 1968.
6. Box, G. E. P., Analyst (1952), 77, 879.
7. Box, G. E. P., Biometrics (1954), 10, 16.
8. Morgan, S. L., and Deming, S. N., Anal. Chem. (1974), 46, 1170.
9. Box, G. E. P., Appl. Statist. (1957), 6, 81.
10. Box, G. E. P., and Draper, N. R., "Evolutionary Operation," Wiley, New York, NY, 1969.
11. Spendley, W., Hext, G. R., and Himsworth, F. R., Technometrics (1962), 4, 441.
12. Long, D. E., Anal. Chim. Acta (1969), 46, 193.
13. Nelder, J. A., and Mead, R., Computer J. (1965), 7, 308.
14. Olansky, A. S., and Deming, S. N., Anal. Chim. Acta (1976), 83, 241.
15. Morgan, S. L., and Deming, S. N., J. Chromatogr. (1975), 112, 267.
16. Parker, L. R., Morgan, S. L., and Deming, S. N., Appl. Spectrosc. (1975), 29, 429.
17. Deming, S. N., and King, P. G., Research/Development (1974), 25(5), 22.
18. Fisher, R. A., "The Design of Experiments," Oliver and Boyd, Edinburgh, 1935.
19. Box, G. E. P., and Wilson, K. B., J. Royal Statist. Soc., B (1951), 13, 1.
20. Box, G. E. P., and Behnken, D. W., Ann. Math. Statist. (1960), ...
21. Bricker, C. E., ... (1950), 22, 720.
22. Klein, B., and Weissman, M., Anal. Chem. (1953), 25, 771.
23. Kamel, M., and Wizinger, R., Helv. Chim. Acta (1960), 43, 594.
24. Sawicki, E., Hauser, T. R., and McPherson, S., Anal. Chem. (1962), 34, 1460.
25. Houle, M. J., and Powell, R. L., Anal. Biochem. (1965), 13, 562.
6
Components of Variation in Chemical Analysis
RAYMOND C. RHODES
Environmental Protection Agency, Research Triangle Park, NC
The first task in the evaluation of an analytical chemistry [...] sources of variability [...] minimize the total variability by searching out and controlling the major contributors. There are many sources of variability in a chemical analysis process. This work describes several of these sources that have been encountered in work by the Environmental Monitoring and Support Laboratory of EPA. Variability can be expected to occur in any or all of the following measurement steps within a given laboratory:

1. The material to be analyzed
2. Materials, including reagents, used in the analysis
3. Calibration materials or devices
4. Environmental factors
5. Analysts
6. Instruments, or apparatus
While these items are not definitive, they describe the general classes of sources of variability that must be considered. It is clear that when one is measuring the reproducibility of an analytical chemistry method, i t is important that the complete method's variability is being measured. For example, in the case where events are measured under the Poisson probability distribution i t is usually improper to describe the error as resulting from this effect alone. In another example, the replication of only a portion of the measurement method, such as replicate analysis of a single extraction, is 176 In Validation of the Measurement Process; DeVoe, J.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
6.
RHODES
sometimes method.
Components of Variation in Chemical Analysis
erroneously
177
q u o t e d as t h e p r e c i s i o n o f t h e
Every laboratory should systematically maintain a compilation or record of the effects of each of the above-listed factors, as obtained from special studies conducted in its own laboratory or the laboratories of others, so that if improvement is needed in the quality of the reported results, efforts to effect such improvements can be made in the most cost-effective way. The planning of such special studies and the analysis of the data therefrom is an area where statisticians and chemists, working together, can gain knowledge and understanding of the total measurement variability. Each laboratory should conduct periodic quality control checks to measure and control the combined effects of the sources of variability which affect its reported results.

THE LABORATORY MEASUREMENT PROCESS

A schematic diagram of portions of the measurement process of an industrial laboratory is presented in Figure 1. Inputs of physical samples to the laboratory are shown:

(1) Samples from the manufacturing process
    (a) Raw materials
    (b) In-process materials
    (c) Final product
(2) Calibration standards
(3) Reagents and other materials

The various internal factors of the laboratory measurement process, previously mentioned, are also shown. The measurement process and sampling considerations for pollutant measurements are identical except for the different types of sampled materials to be analyzed. The schematic of Figure 1 generally applies to any measurement process, whether of industrial, research, government, or independent laboratories.
Several recent articles and documents discuss the various quality assurance aspects of the laboratory measurement process. The measurement or analytical process in industrial situations is normally a part of the overall quality assurance system. Moreover, recent concepts quite appropriately view the measurement or analytical process not only as a part of the quality assurance system but also as one which should maintain its own quality assurance sub-system or program (1, 2, 3). In the Environmental Protection Agency (EPA) the environmental monitoring efforts may be viewed as a production process which is essentially a measurement process, having as its product, data. Accordingly, EPA is developing Quality Assurance Programs for each of its air measurement methods (4), and has issued a "Quality Assurance Handbook for Air Pollution Measurement Systems" (5). The above concept is recognized in EPA as appropriate not only for monitoring programs, but also for research projects in which it is desired to obtain data of high quality. Major research projects require rather extensive quality assurance programs (6).

The accuracy of the analytical portion of a chemical measurement system is achieved by the use of Standard Reference Materials and high quality reagents, and by control of the variables of the calibration process. It should be emphasized that the basis for accuracy of the results from a laboratory should emanate from outside the laboratory. Unless a laboratory has its own internal capability for producing primary standards, it must depend upon some external source of standards. Otherwise, it is attempting "to lift itself by its own bootstraps." If unquestioned accuracy is desired, all measurements should be traceable to the standards of the National Bureau of Standards or other national and international standards laboratories. Even with traceable standards, accuracy depends upon their proper care and use. The certificates issued by the National Bureau of Standards for its Standard Reference Materials (SRM) often include carefully worded cautions which explain that the SRMs have specific expiration dates and that they must be given proper care and used under specified conditions. In fact, the accuracy of calibrations depends not only upon the standard used, but upon the entire calibration process (7).
The relative accuracy among laboratories can be measured through the use of interlaboratory comparison studies, represented by an input-output relationship on Figure 1. The objective of interlaboratory comparison studies, in which the same samples are analyzed by various laboratories, is to make comparisons with respect to accuracy. Two types of interlaboratory studies are conducted. Collaborative studies are generally considered as those studies which are conducted among a group of selected laboratories to evaluate a newly developed analytical method (8, 9). Other interlaboratory studies are conducted to compare the results of standard methods in use at various selected or volunteering laboratories. Such interlaboratory studies frequently reveal startling differences in accuracy between laboratories even though special attention and care (not routinely used for internal measurements) may have been taken. Such results support the fact that accuracy depends upon many factors other than standards. The occurrence of excessively deviant results from a laboratory participating in an interlaboratory study should trigger the initiation of investigational efforts to determine the cause.

As described in Chapter I, controls for precision may be considered at two levels, i.e., "local" control and "regional" control.

CONTROLS FOR PRECISION

Local Control                 Regional Control
Duplicates, Back-to-Back      Duplicates by Different Analysts
Duplicates, Same Run          Different Equipment
Duplicates, Same Day          Different Calibrations
                              Different Days
For a given method, there are usually a number of duplicate or replicate measurements of the same sample which can be made to control and measure the precision of the partial or complete measurement process. Sequential duplicates for a particular part of the measurement process are the most "local" of controls. They are useful in assuring control by the particular combination of sample/analyst/equipment/procedure/conditions/time, and thus should show the greatest possible precision. As other normally existing variables of the measurement process (different analysts, different equipment, different calibrations) enter into the replications, lower precision results. These can be considered to be of a "regional" control nature, and always exist in the data for between-laboratory comparisons but normally are not reflected in most within-laboratory precision estimates. Nevertheless, they are part of the within-laboratory total measurement variability, and require special studies to ascertain their effects.

[Figure 1. The industrial laboratory measurement process. Schematic: raw materials, in-process materials, and final product from the manufacturing process pass through sampling and sample preparation into the laboratory (instruments, analysts, calibration, analysis conditions, computations), together with calibration standards, reagents and other materials, and interlaboratory comparison samples; the output is data.]

There are many levels of precision and it is very important that the specific variables which have entered into the comparison be identified (11, 12). If, in the normal course of events, a given sample could have been analyzed with one of several instruments, by any of a number of analysts, and at one of several times under any of several conditions, then the measure of precision relating to the results for that particular sample should involve all of the likely variables. Therefore, a most realistic measure of precision for a given method at a given laboratory would be one obtained by having had all routinely existing variables at play during the determination. Obviously, it is idealistic and impractical to conduct routine analyses under all likely combinations. However, such precision estimates should be determined by combining all available information on individual precision estimates. It may be necessary to conduct some special studies to fill any gaps. Such overall precision estimates should be periodically re-evaluated, particularly if changes are introduced into the measurement system.

Changes are frequently introduced in measurement systems either because of undesired necessity or intentionally desired improvement. Any change may affect the accuracy and/or precision of the measurement system in either a desirable or an undesirable manner. Statisticians and quality assurance people are particularly suspicious of change. Many times significant undesirable shifts in the level of analytical results have been determined to have been caused by the coincidental introduction of a seemingly insignificant change in procedures, equipment, or materials. The slogans of statisticians and quality assurance people might well be "Cave Vicissitudines" (beware of changes) or "Cave Varietas" (beware of differences).
A very important yet very simple principle, which should be used whenever changes are introduced into the measurement process, is "The Overlap Principle." Overlapping some "old" variable, A, with a "new" variable, B, should be employed whenever practicable to obtain objective data by comparing a "before" condition with the "after" condition, to assure that the change has not introduced a significant or deleterious effect. For example, if a new instrument, new analyst, new reagent source (supplier), or even a new calibration source is introduced into the measurement process, it is a good quality control procedure to analyze a given sample or samples of material under both the "old" and the "new" conditions to assure that the change has not introduced a significant or deleterious effect.

SAMPLING AND ANALYSIS VARIABILITY OF EPA REFERENCE SAMPLES

An example from an environmental measurement laboratory is used to illustrate the determination of the effects of several sources of variability in a measurement system. The Environmental Monitoring and Support Laboratory periodically distributes carefully prepared reference samples to various environmental laboratories in the United States as a check on the accuracy of the results from the laboratories. Although the samples are prepared as nearly identical as possible, some variability between the samples cannot be completely eliminated. To assess the average concentration level of the samples and to determine the sample-to-sample variability, EPA analyzes a number of randomly selected samples prior to distributing the remainder to the various participating laboratories. In the example to follow, EPA analyzed each sample on two different days to obtain some measure of between-day variability. Although there are a number of sources of variability in the entire measurement system, only the following four will be evaluated, namely:

1. Sample-to-sample variability
2. Within-sample, within-day variability
3. Day-to-day variability
4. Laboratory-to-laboratory variability
The measurement method involved in the example is the pararosaniline method for analysis of sulfur dioxide (13). As part of EPA's quality assurance program, samples of sodium sulfite in tetrachloromercurate solution are carefully prepared to simulate field samples, and these samples are distributed to the various participating laboratories for analysis.

A large number of samples at each of five different concentrations (or series) are prepared by a contractor. Each sample, consisting of one-half milliliter of solution, is freeze-dried and then sealed in a glass vial to maintain the integrity of the sample until it is analyzed by a participating laboratory. EPA analyzes a random sample of vials from each of the five concentration levels to estimate the variation among samples. In the analysis, the material in the vial is dissolved and diluted to a total volume of 50 milliliters, with 10-milliliter aliquots taken on each of two days for analysis. For brevity, only the results for Series 1000 (the lowest concentration level) and Series 5000 (the highest concentration level) will be considered in detail. The results of the individual analyses are presented in Table I.

Table I. Individual Results (in µg) of SO₂

Series 1000
Sample No.     Day 1      Day 2      Difference
1381           11.24      11.36        -.12
1427            9.84      11.01       -1.17
1436           10.54      11.01        -.47
1376           10.54      11.01        -.47
1490           11.24      11.01         .23
Average        10.680     11.080

Series 5000
Sample No.     Day 1      Day 2      Difference
5589           54.07      56.22       -2.15
5750           54.07      55.86       -1.79
5649           55.12      56.22       -1.10
5557           55.47      56.57       -1.10
5401           54.77      56.22       -1.45
Average        54.700     56.220

The statistical analysis of variance technique was used to estimate, for each series,

1. The between-sample variability, $\sigma_s$
2. The between-day variability, $\sigma_d$
3. The within-sample, within-day variability, $\sigma_e$

As an example, the results of the analysis of variance for series 5000 are shown in Table II.

Table II. Analysis of Variance

Source               Sum of      Degrees of    Mean       F-test    Estimated
                     Squares     Freedom       Square               Mean Square
Between-Sample       1.40634     4              .3516      3.40     $\sigma_e^2 + d\,\sigma_s^2$
Between-Days         5.76081     1             5.7608     55.69     $\sigma_e^2 + s\,\sigma_d^2$
Within Sample-Day     .41374     4              .1034               $\sigma_e^2$
Total                7.58089     9

From the analysis of variance given above,

$$0.3516 = \sigma_e^2 + d\,\sigma_s^2$$

$$5.7608 = \sigma_e^2 + s\,\sigma_d^2$$

$$0.1034 = \sigma_e^2$$

where d = number of days (here 2) and s = number of samples (here 5).

Solving for $\hat{\sigma}_s$ and $\hat{\sigma}_d$, together with $\hat{\sigma}_e$, gives

$s_s$ = .352
$s_d$ = 1.064
$s_e$ = .322
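The arithmetic behind Table II can be retraced with a short script. The following is a minimal sketch, assuming the two-way crossed layout (samples by days, one result per cell) implied by the degrees of freedom in Table II; the function name `variance_components` and the dictionary keys are illustrative and do not come from the original study. Negative variance estimates are reported as zero, which is also an assumption about how such cases were handled.

```python
import math

# Table I results (micrograms of SO2): one (day 1, day 2) pair per sample vial.
SERIES_1000 = [(11.24, 11.36), (9.84, 11.01), (10.54, 11.01),
               (10.54, 11.01), (11.24, 11.01)]
SERIES_5000 = [(54.07, 56.22), (54.07, 55.86), (55.12, 56.22),
               (55.47, 56.57), (54.77, 56.22)]


def variance_components(rows):
    """Components of variance for a samples-by-days layout, one result per cell."""
    s = len(rows)       # number of samples
    d = len(rows[0])    # number of days
    grand = sum(sum(r) for r in rows) / (s * d)
    sample_means = [sum(r) / d for r in rows]
    day_means = [sum(r[j] for r in rows) / s for j in range(d)]

    # Sums of squares for each source of variation.
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_sample = d * sum((m - grand) ** 2 for m in sample_means)
    ss_day = s * sum((m - grand) ** 2 for m in day_means)
    ss_error = ss_total - ss_sample - ss_day

    # Mean squares.
    ms_sample = ss_sample / (s - 1)
    ms_day = ss_day / (d - 1)
    ms_error = ss_error / ((s - 1) * (d - 1))

    # Solve the expected-mean-square equations; negative estimates become zero.
    var_e = ms_error
    var_s = max((ms_sample - ms_error) / d, 0.0)
    var_d = max((ms_day - ms_error) / s, 0.0)
    return {
        "ms_sample": ms_sample, "ms_day": ms_day, "ms_error": ms_error,
        "s_e": math.sqrt(var_e), "s_s": math.sqrt(var_s), "s_d": math.sqrt(var_d),
    }


result_5000 = variance_components(SERIES_5000)
result_1000 = variance_components(SERIES_1000)
for name, res in (("1000", result_1000), ("5000", result_5000)):
    print(name, {k: round(v, 4) for k, v in res.items()})
```

Applied to the series 5000 data, the sketch should reproduce the mean squares of Table II and the three standard deviations quoted above to the stated rounding.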
The above example is presented to illustrate the basic principles in determining the components of variance. Most often the experimental design for the analysis of components of variance is more complicated than this (14); however, this fact, which usually requires a statistician's assistance in planning such a study and in analyzing the results, does not detract from the importance of this type of study.

The components of variation resulting from the five separate analyses are presented in Table III.

Table III. Components of Variance Obtained from Analysis of Variance

Standard Deviations, µg

Series     Within Sample-Day     Between-Sample     Between-Day
           ($s_e$)               ($s_s$)            ($s_d$)
1000       .367                  .221               .230
2000       .207                  .493               zero*
3000       .285                  .401               .714
4000       .136                  .447               .804
5000       .322                  .352               1.064
Pooled     .276                  .394

*The zero value (actually it was negative) is possibly due to peculiar combinations of individual values. In this particular study, the lack of a between-day effect for series 2000 is suspected to be due to some assignable but unknown cause.

A general pattern should exist across levels for the three components of variation. The standard deviations for the within sample-day variability and the between-sample variability are plotted on Figure 2 for the various concentration levels. Since no general pattern, either increasing or decreasing, exists, these standard deviations were pooled, i.e., statistically averaged, to give:

$s_e$ = .276 for within sample-day
$s_s$ = .394 for between-sample

[Figure 2. Within sample-day (error) variability; between sample (sample) variability vs. sample concentration levels. Axes: standard deviation, µg, vs. sample concentration level, µg.]

The plot of the between-day standard deviations (Figure 3), however, does show a general increasing pattern with increasing concentrations, with the exception of series 2000. Omitting the results for series 2000, the standard regression relationship of the between-day standard deviation to concentration level is as shown on Figure 3. The statistically more correct weighted* regression relationship is also shown, and is appreciably different from the standard regression relationship. From the weighted regression line the expected between-day standard deviations are as follows:

Series     Standard Deviation, µg
1000        .275
2000        .511
3000        .504
4000        .911
5000       1.235

*A weighted regression is appropriate because of violation of the assumption of homogeneous variances: the variation of sample standard deviations is greater for larger expected or true standard deviations.

[Figure 3. Between day variability vs. sample concentration levels.]
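The pooled ("statistically averaged") within sample-day value can be checked directly from the five per-series standard deviations. The sketch below assumes that every series was analyzed with the same design, so that each estimate carries the same degrees of freedom and pooling reduces to the square root of the mean variance; `pooled_sd` is an illustrative name, not one used in the study.

```python
import math

# Within sample-day standard deviations (micrograms) for series 1000 to 5000.
within_sample_day = [0.367, 0.207, 0.285, 0.136, 0.322]


def pooled_sd(sds):
    """Root of the mean variance; assumes equal degrees of freedom per estimate."""
    return math.sqrt(sum(sd ** 2 for sd in sds) / len(sds))


pooled_e = pooled_sd(within_sample_day)
print(round(pooled_e, 3))
```

Averaging the variances rather than the standard deviations is the usual way to pool estimates of a common variance, and it gives a result consistent with the pooled within sample-day value quoted in the text.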
The above results are used to determine appropriate acceptability limits for laboratories participating in an interlaboratory survey when analyzing the remaining samples of the five series. It is assumed that other laboratories receive only one sample vial from each of the five series and that they analyze each sample only once on some given day. (This does not mean that they have to analyze more than one sample on any given day.) It is also obviously assumed (1) that the samples analyzed by the EPA are representative random samples from the total of each series of sample vials, (2) that the two days during which EPA analyzed the samples are representative of any day on which an analysis may be performed, and (3) that the participating laboratories have capability for accurate and precise analyses at least equal to that of EPA.

Table IV shows the determined standard deviations for:

1. within sample-day variability,
2. sample variability,
3. day-to-day variability, and
4. combined variability

for each series.

Table IV. Combination of Variation to Determine Acceptability Limits for Interlaboratory Comparisons

           Standard Deviations                                         95% Acceptability Limits
Series     Within       Between-   Between-   Combined   $s_{true}$    $\bar{x} \pm 2 s_{total}$   Lower     Upper
           Sample-Day   Sample     Day        $s$                                                  Limit     Limit
1000       .276         .394        .275       .554      .277          10.880 ± 1.239               9.641    12.119
2000       .276         .394        .511       .702      .411          21.864 ± 1.627              20.237    23.491
3000       .276         .394        .504       .697      .407          21.513 ± 1.614              19.899    23.127
4000       .276         .394        .911      1.030      .674          40.388 ± 2.462              37.926    42.850
5000       .276         .394       1.235      1.325      .895          55.459 ± 3.198              52.261    58.657

In computing the acceptability limits for the interlaboratory comparisons, two variabilities must be considered.

1. The variation of the estimate of the "true" level from EPA analysis.
2. The combined within sample-day, between-sample, and between-day variation for the single result obtained by a participating laboratory.

The standard deviation which is used as a basis for computing the acceptability limits of Table IV is the statistical combination of 1 and 2, as follows:

$$s_{total}^2 = s_{true}^2 + s^2 \qquad (1)$$

where

$s_{true}$ = standard deviation of the estimated "true" concentration level determined by EPA
$s$ = standard deviation of a single result from a participating laboratory, with

$$s^2 = s_s^2 + s_d^2 + s_e^2 \qquad (2)$$

$s_s$ = standard deviation of the sample variability
$s_d$ = standard deviation of the day-to-day variability
$s_e$ = standard deviation of the within sample-day variability

It has been shown by statistical theory that when combining the errors of independent variables, the squares of the standard deviations, i.e., the variances, are additive. (In some fields, the above addition of the variances is referred to as the Root-Mean Square or RMS sum.)

The variability of the estimated "true" concentration level for each series is given by the following expression:*

$$s_{true}^2 = \frac{s_s^2}{5} + \frac{s_d^2}{2} + \frac{s_e^2}{10}$$

*If necessary, consult your local statistician for the basis of this expression.
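As a check on the combined and "true" standard deviations of Table IV, the two expressions above can be evaluated for each series. This is a minimal sketch, assuming the pooled within sample-day and between-sample values and the weighted-regression between-day values quoted in the text, and taking the divisors 5, 2, and 10 to be the number of samples, the number of days, and their product; the function names are illustrative.

```python
import math

S_E = 0.276   # pooled within sample-day standard deviation, micrograms
S_S = 0.394   # pooled between-sample standard deviation, micrograms
# Between-day standard deviations from the weighted regression line.
S_D = {1000: 0.275, 2000: 0.511, 3000: 0.504, 4000: 0.911, 5000: 1.235}
N_SAMPLES, N_DAYS = 5, 2   # EPA design: five vials, each analyzed on two days


def combined_sd(s_e, s_s, s_d):
    """Standard deviation of a single result: variances of independent sources add."""
    return math.sqrt(s_e ** 2 + s_s ** 2 + s_d ** 2)


def true_level_sd(s_e, s_s, s_d, n_samples=N_SAMPLES, n_days=N_DAYS):
    """Standard deviation of the EPA mean over n_samples vials and n_days days."""
    return math.sqrt(s_s ** 2 / n_samples
                     + s_d ** 2 / n_days
                     + s_e ** 2 / (n_samples * n_days))


combined = {k: combined_sd(S_E, S_S, sd) for k, sd in S_D.items()}
s_true = {k: true_level_sd(S_E, S_S, sd) for k, sd in S_D.items()}
for series in S_D:
    print(series, round(combined[series], 3), round(s_true[series], 3))
```

Each component of the mean is divided by the number of independent draws of that source that were averaged: five samples, two days, and ten individual determinations.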
These calculated values are:

Series     $s_{true}$
1000       .277
2000       .411
3000       .407
4000       .674
5000       .895

For each series, the total standard deviation for expected differences between the EPA value and a participating laboratory's result is given in the following table:

Series     $s_{total}$
1000        .620
2000        .814
3000        .807
4000       1.231
5000       1.599

The corresponding 95% acceptability limits are therefore:

$$\bar{x} \pm 2 s_{total}$$

where $\bar{x}$ is the average of the 10 values of each series.

The computed values are shown in Table IV. It should be emphasized that these limits are minimum limits. Based on the information in the data reported herein, agreement of the participating laboratories' results with those of EPA is not expected to be any better than that reflected by the limits given, because no allowance is made in the above computations for the existence of any real differences (or biases) between EPA and the participating laboratories. In practice, an additional allowance for some between-laboratory variability, based on previous survey results, is appropriately added (15).
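The total standard deviations and the limits of Table IV follow from the tabulated values by one further combination. The sketch below assumes the rounded per-series values of the "true" and combined standard deviations and the series averages as given in Table IV; `total_sd` and `acceptability_limits` are hypothetical helper names. The multiplier 2 is the usual approximation to the 1.96 normal deviate for 95% coverage.

```python
import math

# Per series: (average of the 10 EPA values, s_true, combined s), in micrograms.
TABLE_IV = {
    1000: (10.880, 0.277, 0.554),
    2000: (21.864, 0.411, 0.702),
    3000: (21.513, 0.407, 0.697),
    4000: (40.388, 0.674, 1.030),
    5000: (55.459, 0.895, 1.325),
}


def total_sd(s_true, s_combined):
    """Standard deviation of the difference between the EPA mean and one lab result."""
    return math.sqrt(s_true ** 2 + s_combined ** 2)


def acceptability_limits(mean, s_total, k=2.0):
    """Lower and upper limits, mean -/+ k * s_total (k = 2 for roughly 95%)."""
    half_width = k * s_total
    return mean - half_width, mean + half_width


s_total = {series: total_sd(st, sc) for series, (_, st, sc) in TABLE_IV.items()}
limits = {series: acceptability_limits(TABLE_IV[series][0], s_total[series])
          for series in TABLE_IV}
for series in TABLE_IV:
    low, high = limits[series]
    print(series, round(s_total[series], 3), round(low, 3), round(high, 3))
```

Because the inputs are already rounded to three decimals, the last digit of a computed limit may differ by one or two units from the tabulated value.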
It is helpful to present in graphical form the individual and combined effects of various sources of variability. The equation given previously for combining such effects assumes that the various effects are independent of each other, a valid assumption in many similar measurement systems. By making use of the Pythagorean Theorem, which relates the lengths of the sides of right triangles, the relative magnitudes of the standard deviations of the above expression (Equation 2) can be displayed graphically by the sides of a group of adjoining right triangles. Thus, using the data from Table IV, the graphical representation for combining the within sample-day*, between-day, and between-sample variations for each of the series is as shown in Figure 4. Adding the between-day variabilities for series 1000 and for series 5000 gives the variabilities shown in Figure 4.

*On Figure 4, the within sample-day variability, for the sake of space, is called "error", the usual statistical term for the inherent variability not identified with some specific factor.

[Figure 4. Combined between sample, within sample-day (error), and between day variabilities for series 1000 and 5000.]

Thus, it can readily be seen that the between-day variability becomes an overwhelming portion of the combined variability for the higher concentrations. Further study should be initiated to identify the specific cause for the day-to-day variability.

It is of interest to compare the combined variability of the sampling and analytical variabilities with the variability experienced recently by participating laboratories analyzing these types of samples. Recent interlaboratory comparisons among laboratories result in computed coefficients of variation of 20% within the concentration ranges of this study. The coefficient of variation of 20% includes the combined variation due to samples, days, and within sample-day, since each laboratory analyzed one sample (at each level) on some given day. Assuming that this combined variation for the participating laboratories is equal to that for the EPA laboratory, we must subtract this variability from the computed 20% value to obtain an estimate of the true variation between laboratories.

[Figure 5. Combined within laboratory and between laboratory variability for series 1000 and 5000.]
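The subtraction described above can be illustrated numerically. The text does not state how the subtraction was carried out; the sketch below assumes it is done on variances, consistent with the additivity of independent variance components used elsewhere in the chapter, and it uses the series averages and combined standard deviations of Table IV. The function name `between_lab_sd` is illustrative.

```python
import math


def between_lab_sd(level, cv_total, within_lab_sd):
    """Between-laboratory standard deviation left after removing within-lab variance.

    level          concentration level (micrograms)
    cv_total       overall coefficient of variation among laboratories (0.20 = 20%)
    within_lab_sd  combined sample, day, and within sample-day standard deviation
    """
    total_sd = cv_total * level
    var_between = total_sd ** 2 - within_lab_sd ** 2
    if var_between < 0:
        raise ValueError("within-laboratory variance exceeds the total variance")
    return math.sqrt(var_between)


# Series 1000 and 5000: averages and combined standard deviations from Table IV.
lab_1000 = between_lab_sd(10.880, 0.20, 0.554)
lab_5000 = between_lab_sd(55.459, 0.20, 1.325)
print(round(lab_1000, 3), round(lab_5000, 3))
```

Under these assumptions the between-laboratory component is several times larger than the combined within-laboratory standard deviation at both concentration levels.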
Addition of these between-laboratory components of variation to those of the previous figures provides the results shown in Figure 5. Thus, if the participating laboratories are a representative sample of all the laboratories in the country, the graphical representation reflects the relative magnitudes of the sources of variation of the chemical analytical portion of SO₂ data being obtained throughout the country. The picture clearly indicates that the area for investigation to improve the reproducibility of the reported data is the determination of the causes of the between-laboratory differences. It should be noted that additional sources of variability, such as between analysts, could be considered in the planning and analysis for such studies, and that the graphical representation can also be extended to include these additional sources.

SUMMARY

Variability in the measurement process may be associated with instruments or apparatus, analysts, conditions, time, calibration sequence, calibration standards, and reagents and materials used in the entire measurement process. The entire laboratory measurement process has been viewed as a process requiring its own quality assurance system. Several pertinent points and means of control for achieving and maintaining good accuracy and precision of the product, DATA, of a laboratory measurement system have been presented.

An example has been presented to show how designed studies of the various factors of the measurement process may be evaluated to determine the components of variance attributable to each of the factors. A graphical method of presenting the measures of individual and combined components of variability due to independent factors has been demonstrated. Many fruitful efforts in the study of laboratory measurement systems result from a close working relationship between chemists and statisticians.
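As an illustration of how individual and combined components of variability relate when the factors are independent, the following Python sketch assumes the standard rule that the variances of independent factors add, so that the combined standard deviation is the square root of the sum of the squared component standard deviations. The component names and numerical values are hypothetical and are not taken from the study; the function names are likewise illustrative.

```python
import math


def combine_components(std_devs):
    """Return the combined standard deviation of independent components.

    Variances of independent factors add, so the combined standard
    deviation is the square root of the summed variances.
    """
    total_variance = sum(s ** 2 for s in std_devs.values())
    return math.sqrt(total_variance)


def variance_shares(std_devs):
    """Return each component's fraction of the total variance."""
    total_variance = sum(s ** 2 for s in std_devs.values())
    return {name: s ** 2 / total_variance for name, s in std_devs.items()}


# Hypothetical component standard deviations (arbitrary concentration units).
components = {
    "within_day": 3.0,
    "between_day": 4.0,
    "between_laboratory": 12.0,
}

total = combine_components(components)
shares = variance_shares(components)

print(f"combined standard deviation: {total:.1f}")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of total variance")
```

With values of this kind, the largest component dominates the total, which is why reducing the largest source of variation gives the greatest improvement in reproducibility.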
LITERATURE CITED

1. Cameron, Joseph M., Journal of Quality Technology (1976) 8 (1), pp. 53-55, "Measurement Assurance".
2. Curley, James B., ASTM Standardization News (1976) 4 (9), pp. 16-18, "Quality Assurance and Test Methodology".
3. Wening, Robert J., ASTM Standardization News (1976) 4 (3), pp. 11-16, "Quality Assurance in the Laboratory".
4. Environmental Protection Agency, "Guidelines for Development of a Quality Assurance Program", EPA-R4-73-028 series and EPA-650/4-74005 series, Environmental Monitoring Laboratory (1973-1976).
5. Environmental Protection Agency, "Quality Assurance Handbook for Air Pollution Measurement Systems, Volume 1, Principles", EPA-600/9-76005, Environmental Monitoring and Support Laboratory, Research Triangle Park, N.C., March 1976.
6. Von Lehmden, Darryl J., Raymond C. Rhodes and Seymour Hochheiser, "Applications of Quality Assurance in Major Air Pollution Monitoring Studies--CHAMP and RAMS," Proceedings, International Conference on Environmental Sensing and Assessment, Las Vegas, Nevada, September 14-19, 1975.
7. Cameron, Joseph M., Journal of Quality Technology (1975) 7 (4), pp. 153-195, "Traceability?".
8. American Society for Testing and Materials, "Manual for Conducting Interlaboratory Study of a Test Method" (STP 335), American Society for Testing and Materials, Philadelphia, PA 19103.
9. Youden, William J. and Steiner, E. H., Association of Official Analytical Chemists (1975), "Statistical Manual of The Association of The Official Analytical Chemists, Statistical Techniques for Collaborative Tests, Planning and Analysis of Results of Collaborative Tests".
10. Lewis, Lynn L., ASTM Standardization News (1976) 4 (9), pp. 19-23, "Interlaboratory Testing Programs for the Chemical Analysis of Metals".
11. ASTM E177, "Use of the Terms Precision and Accuracy as Applied to Measurement of a Property of a Material", American Society for Testing and Materials, Philadelphia, PA 19103.
12. ASTM E180-67, "Developing Precision Data on ASTM Methods for Analysis and Testing of Industrial Chemicals", American Society for Testing and Materials, Philadelphia, PA 19103.
13. Environmental Protection Agency, Federal Register (1971) 36 (84), pp. 8187-8190, "National Primary and Secondary Ambient Air Quality Standards".
14. Hicks, C. R., "Fundamental Concepts in the Design of Experiments", Holt, Rinehart, and Winston (1964).
15. Bromberg, S. M., Akland, G. G. and Bennett, B. I., "Survey of Laboratory Performance Analysis of Simulated Ambient Sulfur Dioxide Bubbler Samples", Environmental Monitoring Support Laboratory, Research Triangle Park, N.C., 1975.