Common Statistical Methods for Clinical Research: with SAS® Examples
Glenn A. Walker
SAS Publishing
More praise for this new edition “This book is an extension and improvement to the excellent first edition, which appeared in 1996. Each chapter has an introduction covering a specific statistical method and gives examples of where the method is used. The introduction is followed by a synopsis presenting the necessary and significant information for the reader to get a good understanding of the statistical method. Excellent examples are given, followed by the SAS code for the example. Detailed annotations make the SAS output easy to understand. The book also gives numerous extensions for the methods. Dr. Walker has certainly used his many years of consulting experience, his teaching experience, and his understanding of SAS to produce an even better book that is equally understandable and helpful to the novice as well as the experienced statistician. Each will benefit from this insightful journey into statistics and the use of SAS, whether as a teaching tool or a refresher. I can recommend this book wholeheartedly.” Stephan Ogenstad, Ph.D. Senior Director, Biometrics Vertex Pharmaceuticals Incorporated
SECOND EDITION
CommonStatistical Methods for Clinical Research with SAS Examples ®
Glenn A. Walker
58086
The correct bibliographic citation for this manual is as follows: Walker, Glenn A. 2002. Common Statistical Methods ® for Clinical Research with SAS Examples, Second Edition. Cary, NC: SAS Institute Inc. ®
Common Statistical Methods for Clinical Research with SAS Examples, Second Edition Copyright © 2002 by SAS Institute Inc., Cary, NC, USA ISBN 1-59047-040-0 All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc. U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987). SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513. 1st printing, July 2002 SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hardcopy books, visit the SAS Publishing Web site at www.sas.com/pubs or call 1-800-727-3228. ®
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
Table of Contents
Preface to the Second Edition .........................................................................................................xi Preface to the First Edition ............................................................................................................xiii Chapter 1 - Introduction & Basics 1.1 1.2 1.3 1.4 1.5 1.6
Statistics—the Field .................................................................................................................... 1 Probability Distributions ........................................................................................................... 4 Study Design Features ............................................................................................................... 9 Descriptive Statistics ................................................................................................................ 13 Inferential Statistics ................................................................................................................. 16 Summary ................................................................................................................................... 21
Chapter 2 – Topics in Hypothesis Testing 2.1 2.2 2.3 2.4 2.5 2.6 2.7
Significance Levels .................................................................................................................... 23 Power ......................................................................................................................................... 25 One-Tailed and Two-Tailed Tests ............................................................................................ 26 p-Values ..................................................................................................................................... 27 Sample Size Determination ..................................................................................................... 27 Multiple Testing ........................................................................................................................ 30 Summary ................................................................................................................................... 38
Chapter 3 – The Data Set TRIAL 3.1 3.2 3.3 3.4 3.5
Introduction............................................................................................................................... 39 Data Collection ......................................................................................................................... 39 Creating the Data Set TRIAL.................................................................................................. 42 Statistical Summarization ....................................................................................................... 45 Summary.................................................................................................................................... 54
Chapter 4 – The One-Sample t-Test 4.1 4.2 4.3 4.4
Introduction............................................................................................................................... 55 Synopsis ..................................................................................................................................... 55 Examples ................................................................................................................................... 56 Details & Notes ......................................................................................................................... 64
Chapter 5 – The Two-Sample t-Test 5.1 5.2 5.3 5.4
Introduction............................................................................................................................... 67 Synopsis ..................................................................................................................................... 67 Examples ................................................................................................................................... 68 Details & Notes ......................................................................................................................... 73
iv Table of Contents
Chapter 6 – One-Way ANOVA 6.1 6.2 6.3 6.4
Introduction............................................................................................................................... 77 Synopsis ..................................................................................................................................... 77 Examples ................................................................................................................................... 80 Details & Notes ......................................................................................................................... 86
Chapter 7 – Two-Way ANOVA 7.1 7.2 7.3 7.4
Introduction............................................................................................................................... 91 Synopsis ..................................................................................................................................... 91 Examples ................................................................................................................................... 94 Details & Notes ....................................................................................................................... 107
Chapter 8 – Repeated Measures Analysis 8.1 8.2 8.3 8.4
Introduction............................................................................................................................. 111 Synopsis ................................................................................................................................... 111 Examples ................................................................................................................................. 115 Details & Notes ....................................................................................................................... 148
Chapter 9 – The Crossover Design 9.1 9.2 9.3 9.4
Introduction............................................................................................................................. 157 Synopsis ................................................................................................................................... 158 Examples ................................................................................................................................. 160 Details & Notes ....................................................................................................................... 171
Chapter 10 – Linear Regression 10.1 10.2 10.3 10.4
Introduction............................................................................................................................. 175 Synopsis ................................................................................................................................... 175 Examples ................................................................................................................................. 178 Details & Notes ....................................................................................................................... 193
Chapter 11 – Analysis of Covariance 11.1 11.2 11.3 11.4
Introduction............................................................................................................................. 199 Synopsis ................................................................................................................................... 200 Examples ................................................................................................................................. 203 Details & Notes ....................................................................................................................... 221
Chapter 12 – The Wilcoxon Signed-Rank Test 12.1 12.2 12.3 12.4
Introduction............................................................................................................................. 227 Synopsis ................................................................................................................................... 227 Examples ................................................................................................................................. 228 Details & Notes ....................................................................................................................... 234
Chapter 13 – The Wilcoxon Rank-Sum Test 13.1 13.2 13.3 13.4
Introduction............................................................................................................................. 237 Synopsis ................................................................................................................................... 237 Examples ................................................................................................................................. 239 Details & Notes ....................................................................................................................... 244
Table of Contents
v
Chapter 14 – The Kruskal-Wallis Test 14.1 14.2 14.3 14.4
Introduction............................................................................................................................. 247 Synopsis ................................................................................................................................... 247 Examples ................................................................................................................................. 249 Details & Notes ....................................................................................................................... 253
Chapter 15 – The Binomial Test 15.1 15.2 15.3 14.4
Introduction............................................................................................................................. 257 Synopsis ................................................................................................................................... 257 Examples ................................................................................................................................. 259 Details & Notes ....................................................................................................................... 261
Chapter 16 – The Chi-Square Test 16.1 16.2 16.3 16.4
Introduction............................................................................................................................. 265 Synopsis ................................................................................................................................... 265 Examples ................................................................................................................................. 267 Details & Notes........................................................................................................................ 279
Chapter 17 – Fisher’s Exact Test 17.1 17.2 17.3 17.4
Introduction............................................................................................................................. 285 Synopsis ................................................................................................................................... 285 Examples ................................................................................................................................. 286 Details & Notes........................................................................................................................ 289
Chapter 18 – McNemar’s Test 18.1 18.2 18.3 18.4
Introduction............................................................................................................................. 293 Synopsis ................................................................................................................................... 293 Examples ................................................................................................................................. 294 Details & Notes........................................................................................................................ 301
Chapter 19 – The Cochran-Mantel-Haenszel Test 19.1 19.2 19.3 19.4
Introduction............................................................................................................................. 305 Synopsis ................................................................................................................................... 305 Examples ................................................................................................................................. 307 Details & Notes........................................................................................................................ 313
Chapter 20 – Logistic Regression 20.1 20.2 20.3 20.4
Introduction............................................................................................................................. 317 Synopsis ................................................................................................................................... 318 Examples ................................................................................................................................. 322 Details & Notes........................................................................................................................ 340
Chapter 21 – The Log-Rank Test 21.1 21.2 21.3 21.4
Introduction............................................................................................................................. 349 Synopsis ................................................................................................................................... 350 Examples.................................................................................................................................. 351 Details & Notes........................................................................................................................ 359
vi Table of Contents
Chapter 22 – The Cox Proportional Hazards Model 22.1 22.2 22.3 22.4
Introduction............................................................................................................................. 365 Synopsis ................................................................................................................................... 366 Examples.................................................................................................................................. 367 Details & Notes........................................................................................................................ 376
Chapter 23 – Exercises 23.1 23.2 23.3 23.4
Introduction............................................................................................................................. 381 Exercises .................................................................................................................................. 381 Appropriateness of the Methods ........................................................................................... 383 Summary.................................................................................................................................. 387
Appendix A – Probability Tables A.1 A.2 A.3
Probabilities of the Standard Normal Distribution............................................................. 390 Critical Values of the Student t-Distribution....................................................................... 391 Critical Values of the Chi-Square Distribution ................................................................... 392
Appendix B – Common Distributions Used in Statistical Inference B.1 B.2 B.3 B.4
Notation.................................................................................................................................... 393 Properties................................................................................................................................. 394 Results ...................................................................................................................................... 395 Distributional Shapes.............................................................................................................. 397
Appendix C – Basic ANOVA Concepts C.1 C.2 C.3
Within- vs. Between-Group Variation.................................................................................. 399 Noise Reduction by Blocking ................................................................................................. 401 Least Squares Mean (LS-mean) ............................................................................................ 405
Appendix D – SS Types I, II, III, and IV Methods for an Unbalanced Two-Way Layout D.1 D.2 D.3 D.4 D.5
SS Types Computed by SAS .................................................................................................. 409 How to Determine the Hypotheses Tested ........................................................................... 411 Empty Cells.............................................................................................................................. 416 More Than Two Treatment Groups..................................................................................... 417 Summary.................................................................................................................................. 419
Appendix E – Multiple Comparison Methods E.1 E.2 E.3
Multiple Comparisons of Means ........................................................................................... 421 Multiple Comparisons of Binomial Proportions.................................................................. 432 Summary.................................................................................................................................. 436
Appendix F – Data Transformations F.1 F.2
Introduction............................................................................................................................. 437 The Log Transformation........................................................................................................ 438
Appendix G – SAS Code for Exercises in Chapter 23.................................................... 441 References .............................................................................................................................................. 447 Index ................................................................................................................................................... 453
Examples
4.1 4.2 5.1 6.1 7.1 7.2 8.1 8.2 8.3 9.1 9.2 10.1 10.2 11.1 11.2 12.1 13.1 14.1 15.1 16.1 16.2 16.3 17.1 18.1 18.2 19.1 20.1 20.2 20.3 21.1 22.1 22.2
Body-Mass Index ......................................................................................................................56 Paired-Difference in Weight Loss ..........................................................................................60 FEV1 Changes ..........................................................................................................................68 HAM-A Scores in GAD ...........................................................................................................80 Hemoglobin Changes in Anemia ............................................................................................94 Memory Function ...................................................................................................................103 Arthritic Discomfort Following Vaccine ..............................................................................115 Treadmill Walking Distance in Intermittent Claudication................................................127 Disease Progression in Alzheimer’s Trial.............................................................................137 Diaphoresis Following Cardiac Medication ........................................................................160 Antibiotic Blood Levels Following Aerosol Inhalation.......................................................165 Anti-Anginal Response vs. Disease History .........................................................................178 Symptomatic Recovery in Pediatric Dehydration...............................................................185 Triglyceride Changes Adjusted for Glycemic Control.......................................................203 ANCOVA in Multi-Center Hypertension Study.................................................................214 Rose-Bengal Staining in KCS................................................................................................228 Global Evaluations of Seroxatene in Back Pain..................................................................239 Psoriasis Evaluation in Three Groups..................................................................................249 Genital Wart Cure Rate.........................................................................................................259 ADR Frequency with Antibiotic Treatment ........................................................................267 Comparison of Dropout Rates for 4 Dose Groups..............................................................274 Active vs. Placebo Comparisons of Degree of Response ....................................................277 CHF Incidence in CABG after ARA ....................................................................................286 Bilirubin Abnormalities Following Drug Treatment..........................................................294 Symptom Frequency Before and After Treatment .............................................................300 Dermotel Response in Diabetic Ulcers..................................................................................307 Relapse Rate Adjusted for Remission Time in AML..........................................................322 Symptom Relief in Gastroparesis..........................................................................................331 Intercourse Success Rate in Erectile Dysfunction...............................................................336 HSV-2 Episodes Following gD2 Vaccine..............................................................................351 HSV-2 Episodes with gD2 Vaccine (continued) ..................................................................367 Hyalurise in Vitreous Hemorrhage.......................................................................................370
viii Common Statistical Methods for Clinical Research with SAS Examples
Figures
1.1. 1.2. 1.3. 1.4. 1.5. 1.6. 2.1. 3.1. 3.2. 3.3. 7.1. 7.2. 8.1. 8.2. 8.3. 8.4. 10.1. 10.2. 11.1. 20.1. 20.2. 20.3. 20.4. B.1. C.1.
Probability Distribution for Z = X1 .................................................................................... 7 Probability Distribution for Z = X1+X2 .............................................................................. 7 Probability Distribution for Z = X1+X2+X3 ...................................................................... 8 Probability Distribution for Z = X1+X2+X3+X4+X5+X6+X7+X8 ...................................... 9 Histogram of Height Measurements (n=25) .................................................................... 13 Histogram of Height Measurements (n=300) ................................................................. 14 Power Curve for the Z-Test .............................................................................................. 26 Sample Data Collection Form 1 (Dichotomous Response) ............................................. 40 Sample Data Collection Form 2 (Categorical Response)................................................ 41 Sample Data Collection Form 3 (Continuous Numeric Response) ................................ 41 No Interaction (a,b,c) and Interaction (d,e,f) Effects.................................................... 102 Interaction in Example 7.2 .............................................................................................. 106 Sample Profiles of Drug Response.................................................................................. 114 Response Profiles (Example 8.1) ..................................................................................... 123 Mean Walking Distances for Example 8.2 ..................................................................... 136 ADAS-cog Response Profiles (Example 8.3) .................................................................. 139 Simple Linear Regression of y on x ................................................................................ 176 95% Confidence Bands for Example 10.1...................................................................... 196 ANCOVA-Adjusted Means for Example 11.1 ............................................................... 208 Logistic Probability Function ......................................................................................... 319 Probability of Relapse (Px) vs. Remission Time (X) for Example 20.1........................ 325 Plot of Transformation from P to the Logit(P).............................................................. 343 Plot of Estimated Relapse Probabilities vs. Time since Prior Remission (X).............. 346 Distributional Shapes ...................................................................................................... 398 Plot of Mean Weights Shows No Treatment-by-Gender Interaction .......................... 405
x Common Statistical Methods for Clinical Research with SAS Examples
Prreeffaaccee ttoo tthhee SSeeccoonndd EEddiittiioonn This second edition expands the first edition with the inclusion of new sections, examples, and extensions of statistical methods described in the first edition only in their most elementary forms. New methods sections include analysis of crossover designs (Chapter 9) and multiple comparison methods (Appendix D). Chapters that present repeated measures analysis (Chapter 8), linear regression (Chapter 10), analysis of covariance (Chapter 11), the chi-square test (Chapter 16), and logistic regression (Chapter 20) have been notably expanded, and 50% more examples have been added throughout the book. A new chapter of exercises has also been added to give the reader practice in applying the various methods presented. Also new in this edition is an introduction to α-adjustments for interim analyses (Chapter 2). Although many of the new features will have wide appeal, some are targeted to the more experienced data analyst. These include discussion of the proportional odds model, the clustered binomial problem, collinearity in multiple regression, the use of time-dependent covariates with Cox regression, and the use of generalized estimating equations in repeated measures analysis. These methods, which are based on more advanced concepts than those found in most of the book, are routinely encountered in data analysis applications of clinical investigations and, as such, fit the description of ‘common statistical methods for clinical research’. However, so as not to overwhelm the less experienced reader, these concepts are presented only briefly, usually by example, along with references for further reading. First and foremost, this is a statistical methods book. It is designed to have particular appeal to those involved in clinical research, biometrics, epidemiology, and other health or medical related research applications. Unlike other books in the SAS Books by Users (BBU) library, SAS is not the primary focus of this book. Rather, SAS is presented as an indispensable tool that greatly simplifies the analyst’s task. While consulting for dozens of companies over 25 years of statistical application to clinical investigation, I have never seen a successful clinical program that did not use SAS. Because of its widespread use within the pharmaceutical industry, I include SAS here as the ‘tool’ of choice to illustrate the statistical methods. The examples have been updated to Version 8 of SAS, however, the programming statements used have been kept ‘portable’, meaning that most can be used in earlier versions of SAS as they appear in the examples, unless otherwise noted. This includes the use of portable variable and data set names, despite accommodation for use of long names beginning with Version 8. Because SAS is
xii Common Statistical Methods for Clinical Research with SAS Examples
not the main focus of this book but is key to efficient data analysis, programming details are not included here, but they can be found in numerous references cited throughout the book. Many of these references are other books in the Books by Users program at SAS, which provide the details, including procedure options, use of ODS, and the naming standards that are new in Version 8. For statistical programming, my favorites include Categorical Data Analysis Using the SAS System, Second Edition, by Stokes, Davis, and Koch (2000) and Survival Analysis Using the SAS System, A Practical Guide, by Paul Allison (1995). I welcome and appreciate reader comments and feedback through the SAS Publications Web site.
Glenn A. Walker July 2002
Preface
xiii
Prreeffaaccee ttoo tthhee FFiirrsstt EEddiittiioonn This book was written for those involved in clinical research and who may, from time to time, need a guide to help demystify some of the most commonly used statistical methods encountered in our profession. All too often, I have heard medical directors of clinical research departments express frustration at seemingly cryptic statistical methods sections of protocols which they are responsible for approving. Other nonstatisticians, including medical monitors, investigators, clinical project managers, medical writers and regulatory personnel, often voice similar sentiment when it comes to statistics, despite the profound reliance upon statistical methods in the success of the clinical program. For these people, I offer this book (sans technical details) as a reference guide to better understand statistical methods as applied to clinical investigation and the conditions and assumptions under which they are applied. For the clinical data analyst and statistician new to clinical applications, the examples from a clinical trials setting may help in making the transition from other statistical fields to that of clinical trials. The discussions of 'Least-Squares' means, distinguishing features of the various SAS® types of sums-of-squares, and relationships among various tests (such as the Chi-Square Test, the CochranMantel-Haenszel Test and the Log-Rank Test) may help crystalize the analyst's understanding of these methods. Analysts with no prior SAS experience should benefit by the simplifed SAS programming statements provided with each example as an introduction to SAS analyses. This book may also aid the SAS programmer with limited statistical knowledge in better grasping an overall picture of the clinical trials process. Many times knowledge of the hypotheses being tested and appropriate interpretation of the SAS output relative to those hypotheses will help the programmer become more efficient in responding to the requests of other clinical project team members. Finally, the medical student will find the focused presentation on the specific methods presented to be of value while proceeding through a first course in biostatistics. For all readers, my goal was to provide a unique approach to the description of commonly used statistical methods by integrating both manual and computerized solutions to a wide variety of examples taken from clinical research. Those who learn best by example should find this approach rewarding. I have found no other book which demonstrates that the SAS output actually does have the same results as the manual solution of a problem using the calculating formulas. So ever reassuring this is for the student of clinical data analysis!
xiv Common Statistical Methods for Clinical Research with SAS Examples
Each statistical test is presented in a separate chapter, and includes a brief, nontechnical introduction, a synopsis of the test, one or two examples worked manually followed by an appropriate solution using the SAS statistical package, and finally, a discussion with details and relevant notes. Chapters 1 and 2 are introductory in nature, and should be carefully read by all with no prior formal exposure to statistics. Chapter 1 provides an introduction to statistics and some of the basic concepts involved in inference-making. Chapter 2 goes into more detail with regard to the main aspects of hypothesis testing, including significance levels, power and sample size determination. For those who use analysis-of-variance, Appendix C provides a non-technical introduction to ANOVA methods. The remainder of the book may be used as a text or reference. As a reference, the reader should keep in mind that many of the tests discussed in later chapters rely on concepts presented earlier in the book, strongly suggesting prerequisite review. This book focuses on statistical hypothesis testing as opposed to other inferential techniques. For each statistical method, the test summary is clearly provided, including the null hypothesis tested, the test statistic and the decision rule. Each statistical test is presented in one of its most elementary forms to provide the reader with a basic framework. Many of the tests discussed have extensions or variations which can be used with more complex data sets. The 18 statistical methods presented here (Chapters 3-20) represent a composite of those which, in my experience, are most commonly used in the analysis of clinical research data. I can't think of a single study I've analyzed in nearly 20 years which did not use at least one of these tests. Furthermore, many of the studies I've encountered have used exclusively the methods presented here, or variations or extensions thereof. Thus, the word 'common' in the title. Understanding of many parts of this book requires some degree of statistical knowledge. The clinician without such a background may skip over many of the technical details and still come away with an overview of the test's applications, assumptions and limitations. Basic algebra is the only prerequisite, as derivations of test procedures are omitted, and matrix algebra is mentioned only in an appendix. My hope is that the statistical and SAS analysis aspects of the examples would provide a springboard for the motivated reader, both to go back to more elementary texts for additional background and to go forward to more advanced texts for further reading. Many of the examples are based on actual clinical trials which I have analyzed. In all cases, the data are contrived, and in many cases fictitious names are used for different treatments or research facilities. Any resemblence of the data or the tests' results to actual cases is purely coincidental.
Glenn A. Walker May 1996
&++$$3377((55 ,QWURGXFWLRQ %DVLFV
6WDWLVWLFV²WKH)LHOG 3UREDELOLW\'LVWULEXWLRQV 6WXG\'HVLJQ)HDWXUHV 'HVFULSWLYH6WDWLVWLFV ,QIHUHQWLDO6WDWLVWLFV 6XPPDU\
6WDWLVWLFV²WKH)LHOG
,QVRPHZD\VZHDUHDOOERUQVWDWLVWLFLDQV,QIHUULQJJHQHUDOSDWWHUQVIURPOLPLWHG NQRZOHGJH LV QHDUO\ DV DXWRPDWLF WR WKH KXPDQ FRQVFLRXVQHVV DV EUHDWKLQJ
7KHSXUSRVHRIWKHILHOGRI6WDWLVWLFVLVWRFKDUDFWHUL]HDSRSXODWLRQEDVHGRQWKH LQIRUPDWLRQ FRQWDLQHG LQ D VDPSOH WDNHQ IURP WKDW SRSXODWLRQ 7KH VDPSOH LQIRUPDWLRQ LV FRQYH\HG E\ IXQFWLRQV RI WKH REVHUYHG GDWD ZKLFK DUH FDOOHG VWDWLVWLFV7KHILHOGRI6WDWLVWLFVLVDGLVFLSOLQHWKDWHQGHDYRUVWRGHWHUPLQHZKLFK IXQFWLRQVDUHWKHPRVWUHOHYDQWLQWKHFKDUDFWHUL]DWLRQRIYDULRXVSRSXODWLRQV7KH FRQFHSWV RI µSRSXODWLRQV¶ µVDPSOHV¶ DQG µFKDUDFWHUL]DWLRQ¶ DUH GLVFXVVHG LQ WKLV FKDSWHU )RU H[DPSOH WKH DULWKPHWLF PHDQ PLJKW EH WKH PRVW DSSURSULDWH VWDWLVWLFWRKHOS FKDUDFWHUL]HFHUWDLQSRSXODWLRQVZKLOHWKHPHGLDQPLJKWEHPRUHDSSURSULDWHIRU RWKHUV 6WDWLVWLFLDQV XVH VWDWLVWLFDO DQG SUREDELOLW\ WKHRU\ WR GHYHORS QHZ PHWKRGRORJ\DQGDSSO\WKHPHWKRGVEHVWVXLWHGIRUGLIIHUHQWW\SHVRIGDWDVHWV $SSOLHG6WDWLVWLFVFDQEHYLHZHGDVDVHWRIPHWKRGRORJLHVXVHGWRKHOSFDUU\RXW VFLHQWLILF H[SHULPHQWV ,Q NHHSLQJ ZLWK WKH VFLHQWLILF PHWKRG DSSOLHG VWDWLVWLFV FRQVLVWV RI GHYHORSLQJ D K\SRWKHVLV GHWHUPLQLQJ WKH EHVW H[SHULPHQW WR WHVW WKH K\SRWKHVLV FRQGXFWLQJ WKH H[SHULPHQW REVHUYLQJ WKH UHVXOWV DQG PDNLQJ FRQFOXVLRQV 7KH VWDWLVWLFLDQ¶V UHVSRQVLELOLWLHV LQFOXGH VWXG\ GHVLJQ GDWD FROOHFWLRQVWDWLVWLFDODQDO\VLVDQGPDNLQJDSSURSULDWHLQIHUHQFHVIURPWKHGDWD,Q GRLQJ VR WKH VWDWLVWLFLDQ VHHNV WR OLPLW ELDV PD[LPL]H REMHFWLYLW\ DQG REWDLQ UHVXOWVWKDWDUHVFLHQWLILFDOO\YDOLG
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
3RSXODWLRQV $SRSXODWLRQLVDXQLYHUVHRIHQWLWLHVWREHFKDUDFWHUL]HGEXWLVWRRYDVWWRVWXG\LQ LWV HQWLUHW\ 7KH SRSXODWLRQ LQ D FOLQLFDO WULDO ZRXOG EH GHILQHG E\ LWV OLPLWLQJ FRQGLWLRQVXVXDOO\VSHFLILHGYLDVWXG\LQFOXVLRQDQGH[FOXVLRQFULWHULD ([DPSOHVRISRSXODWLRQVLQFOXGH SDWLHQWVZLWKPLOGWRPRGHUDWHK\SHUWHQVLRQ REHVHWHHQDJHUV DGXOWLQVXOLQGHSHQGHQWGLDEHWLFSDWLHQWV 7KH ILUVW H[DPSOH KDV RQO\ RQH OLPLWLQJ IDFWRU GHILQLQJ WKH SRSXODWLRQ WKDW LV PLOGWRPRGHUDWHK\SHUWHQVLRQ7KLVSRSXODWLRQFRXOGEHGHILQHGPRUHSUHFLVHO\DV SDWLHQWV ZLWK GLDVWROLF EORRG SUHVVXUH ZLWKLQ D VSHFLILF UDQJH RI YDOXHV DV DQ LQFOXVLRQFULWHULRQIRUWKHFOLQLFDOSURWRFRO$GGLWLRQDOFULWHULDZRXOGIXUWKHUOLPLW WKHSRSXODWLRQWREHVWXGLHG 7KHVHFRQGH[DPSOHXVHVERWKDJHDQGZHLJKWDVOLPLWLQJFRQGLWLRQVDQGWKHWKLUG H[DPSOHXVHVDJHGLDJQRVLVDQGWUHDWPHQWDVFULWHULDIRUGHILQLQJWKHSRSXODWLRQ ,WLVLPSRUWDQWWRLGHQWLI\WKHSRSXODWLRQRILQWHUHVWLQDFOLQLFDOVWXG\DWWKHWLPHRI SURWRFROGHYHORSPHQWEHFDXVHWKHSRSXODWLRQLVWKHµXQLYHUVH¶WRZKLFKVWDWLVWLFDO LQIHUHQFHVPLJKWDSSO\6HYHUHO\UHVWULFWLQJWKHSRSXODWLRQE\XVLQJPDQ\VSHFLILF FULWHULD IRU DGPLVVLRQ PLJKW XOWLPDWHO\OLPLWWKHFOLQLFDOLQGLFDWLRQWRDUHVWULFWHG VXEVHWRIWKHLQWHQGHGPDUNHW 6DPSOHV
&KDSWHU,QWURGXFWLRQ %DVLFV
$OWKRXJK LQIHUHQFHV FDQ EH ELDVHG LI WKH VDPSOH LV QRW UDQGRP DGMXVWPHQWV FDQ VRPHWLPHVEHXVHGWRFRQWUROELDVLQWURGXFHGE\QRQUDQGRPVDPSOLQJ$QHQWLUH EUDQFK RI 6WDWLVWLFV NQRZQ DV 6DPSOLQJ 7KHRU\ KDV EHHQ GHYHORSHG WR SURYLGH DOWHUQDWLYHDSSURDFKHVWRVLPSOHUDQGRPVDPSOLQJ0DQ\RIWKHVHDSSURDFKHVKDYH WKH JRDO RI PLQLPL]LQJ ELDV 7KH WHFKQLTXHV FDQ EHFRPH TXLWH FRPSOH[ DQG DUH EH\RQGWKHVFRSHRIWKLVRYHUYLHZ )RUORJLVWLFDOUHDVRQVFOLQLFDOVWXGLHVDUHFRQGXFWHGDWDFRQYHQLHQWVWXG\FHQWHU ZLWKWKHDVVXPSWLRQWKDWWKHSDWLHQWVHQUROOHGDWWKDWFHQWHUDUHW\SLFDORIWKRVHWKDW PLJKWEHHQUROOHGHOVHZKHUH0XOWLFHQWHUVWXGLHVDUHRIWHQXVHGWREOXQWWKHHIIHFW RIFKDUDFWHULVWLFVRIWKHSDWLHQWRURISURFHGXUDODQRPDOLHVWKDWPLJKWEHXQLTXHWR DQ\VSHFLILFFHQWHU 6WUDWLILHG VDPSOLQJ LV DQRWKHU WHFKQLTXH WKDW LV RIWHQ XVHG WR REWDLQ D EHWWHU UHSUHVHQWDWLRQRISDWLHQWV6WUDWLILHGVDPSOLQJXVHVUDQGRPVDPSOHVIURPHDFKRI VHYHUDOVXEJURXSVRIDSRSXODWLRQZKLFKDUHFDOOHGµVWUDWD¶(QUROOPHQWLQDVWXG\ LVVRPHWLPHVVWUDWLILHGE\GLVHDVHVHYHULW\DJHJURXSRUVRPHRWKHUFKDUDFWHULVWLF RIWKHSDWLHQW %HFDXVH LQIHUHQFHV IURP QRQUDQGRP VDPSOHV PLJKW QRW EH DV UHOLDEOH DV WKRVH PDGHIURPUDQGRPVDPSOHVWKHFOLQLFDOVWDWLVWLFLDQPXVWVSHFLILFDOO\DGGUHVVWKH LVVXH RI VHOHFWLRQ ELDV LQ WKH DQDO\VLV 6WDWLVWLFDO PHWKRGV FDQ EH DSSOLHG WR GHWHUPLQH ZKHWKHU WKH WUHDWPHQW JURXS DVVLJQPHQW µDSSHDUV¶ UDQGRP IRU FHUWDLQ UHVSRQVHYDULDEOHV)RUH[DPSOHEDVHOLQHYDOXHVPLJKWEHORZHUIRU*URXS$WKDQ *URXS%LQDFRPSDUDWLYHFOLQLFDOVWXG\,I*URXS$VKRZVDJUHDWHUUHVSRQVHSDUW RIWKDWSHUFHLYHGUHVSRQVHPLJKWEHDUHJUHVVLRQWRZDUGWKHPHDQHIIHFWWKDWLVD WHQGHQF\ WR UHWXUQ WR QRUPDOIURPDQDUWLILFLDOO\ORZEDVHOLQHOHYHO6XFKHIIHFWV VKRXOG EH LQYHVWLJDWHG WKRURXJKO\ WR DYRLG PDNLQJ IDXOW\ FRQFOXVLRQV GXH WR VHOHFWLRQELDV $GGLWLRQDO FRQILUPDWRU\ VWXGLHV LQ VHSDUDWH LQGHSHQGHQW VDPSOHV IURP WKH VDPH SRSXODWLRQFDQDOVREHLPSRUWDQWLQDOOD\LQJFRQFHUQVUHJDUGLQJSRVVLEOHVDPSOLQJ ELDVHV
&KDUDFWHUL]DWLRQ 6RKRZLVWKHSRSXODWLRQFKDUDFWHUL]HGIURPDVDPSOH"6WDWLVWLFDOPHWKRGVXVHGWR FKDUDFWHUL]HSRSXODWLRQVFDQEHFODVVLILHGDVGHVFULSWLYHRULQIHUHQWLDO 'HVFULSWLYH VWDWLVWLFV DUH XVHG WR GHVFULEH WKH GLVWULEXWLRQ RI SRSXODWLRQ PHDVXUHPHQWV E\ SURYLGLQJ HVWLPDWHV RI FHQWUDO WHQGHQF\ DQG PHDVXUHV RI YDULDELOLW\ RU E\ XVLQJ JUDSKLFDO WHFKQLTXHV VXFK DV KLVWRJUDPV ,QIHUHQWLDO PHWKRGVXVHSUREDELOLW\WRH[SUHVVWKHOHYHORIFHUWDLQW\DERXWHVWLPDWHVDQGWRWHVW VSHFLILFK\SRWKHVHV ([SORUDWRU\ DQDO\VHV UHSUHVHQW D WKLUG W\SH RI VWDWLVWLFDO SURFHGXUH XVHG WR FKDUDFWHUL]H SRSXODWLRQV $OWKRXJK H[SORUDWRU\ PHWKRGV XVH ERWK GHVFULSWLYHDQG LQIHUHQWLDO WHFKQLTXHV FRQFOXVLRQV FDQQRW EH GUDZQ ZLWK WKH VDPH OHYHO RI FHUWDLQW\EHFDXVHK\SRWKHVHVDUHQRWSUHSODQQHG*LYHQDODUJHGDWDVHWLWLVYHU\
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
OLNHO\ WKDW DW OHDVW RQH VWDWLVWLFDOO\ VLJQLILFDQW UHVXOW FDQ EH IRXQG E\ XVLQJ H[SORUDWRU\ DQDO\VHV 6XFK UHVXOWV DUH µK\SRWKHVLVJHQHUDWLQJ¶ DQG RIWHQ OHDG WR QHZVWXGLHVSURVSHFWLYHO\GHVLJQHGWRWHVWWKHVHQHZK\SRWKHVHV 7ZR PDLQ LQIHUHQWLDO PHWKRGV DUH FRQILGHQFH LQWHUYDO HVWLPDWLRQ DQG K\SRWKHVLV WHVWLQJZKLFKDUHGLVFXVVHGLQGHWDLOODWHULQWKLVFKDSWHU
3UREDELOLW\'LVWULEXWLRQV
$Q XQGHUVWDQGLQJ RI EDVLF SUREDELOLW\ FRQFHSWV LV HVVHQWLDO WR JUDVS WKH IXQGDPHQWDOV RI VWDWLVWLFDO LQIHUHQFH 0RVW LQWURGXFWRU\ VWDWLVWLFV WH[WV GLVFXVV WKHVH EDVLFV WKHUHIRUH RQO\ VRPH EULHI FRQFHSWV RI SUREDELOLW\ GLVWULEXWLRQV DUH UHYLHZHGKHUH (DFK RXWFRPH RI D VWDWLVWLFDO H[SHULPHQW FDQ EH PDSSHG WR D QXPHULFYDOXHG IXQFWLRQFDOOHGDµUDQGRPYDULDEOH¶6RPHYDOXHVRIWKHUDQGRPYDULDEOHPLJKWEH PRUH OLNHO\ WR RFFXU WKDQRWKHUV7KHSUREDELOLW\GLVWULEXWLRQDVVRFLDWHGZLWKWKH UDQGRPYDULDEOH;GHVFULEHVWKHOLNHOLKRRGRIREWDLQLQJFHUWDLQYDOXHVRUUDQJHVRI YDOXHVRIWKHUDQGRPYDULDEOH )RU H[DPSOH FRQVLGHU WZR FDQFHU SDWLHQWV HDFK KDYLQJ D FKDQFH RI VXUYLYLQJ DW OHDVW PRQWKV 7KUHH PRQWKV ODWHU WKHUH DUH SRVVLEOH RXWFRPHV ZKLFKDUHVKRZQLQ7DEOH 7$%/(3UREDELOLW\'LVWULEXWLRQRI1XPEHURI6XUYLYRUVQ
2XWFRPH
3DWLHQW
3DWLHQW
;
3UREDELOLW\
'LHG
'LHG
'LHG
6XUYLYHG
6XUYLYHG
'LHG
6XUYLYHG
6XUYLYHG
(DFK RXWFRPH FDQ EHPDSSHGWRWKHUDQGRPYDULDEOH;ZKLFKLVGHILQHGDVWKH QXPEHURISDWLHQWVVXUYLYLQJDWOHDVWPRQWKV;FDQWDNHWKHYDOXHVRUZLWK SUREDELOLWLHV DQG UHVSHFWLYHO\ EHFDXVH HDFK RXWFRPH LV HTXDOO\ OLNHO\
&KDSWHU,QWURGXFWLRQ %DVLFV
7KHSUREDELOLW\GLVWULEXWLRQIRU;LVJLYHQE\3[DVIROORZV ;
3[
'LVFUHWH'LVWULEXWLRQV 7KH SUHFHGLQJ H[DPSOH LV D GLVFUHWH SUREDELOLW\ GLVWULEXWLRQ EHFDXVH WKH UDQGRP YDULDEOH;FDQRQO\WDNHGLVFUHWHYDOXHVLQWKLVFDVHLQWHJHUVIURPWR 7KH ELQRPLDO GLVWULEXWLRQ LV SHUKDSV WKH PRVW FRPPRQO\ XVHG GLVFUHWH GLVWULEXWLRQLQFOLQLFDOELRVWDWLVWLFV7KLVGLVWULEXWLRQLVXVHGWRPRGHOH[SHULPHQWV LQYROYLQJQLQGHSHQGHQWWULDOVHDFKZLWKSRVVLEOHRXWFRPHVVD\µHYHQW¶RUµQRQ HYHQW¶DQGWKHSUREDELOLW\RIµHYHQW¶SLVWKHVDPHIRUDOOQWULDOV7KHSUHFHGLQJ H[DPSOH ZKLFK LQYROYHV WZR FDQFHU SDWLHQWV LV DQ H[DPSOH RI D ELQRPLDO GLVWULEXWLRQLQZKLFKQ SDWLHQWV S DQGµHYHQW¶LVVXUYLYDORIDWOHDVW PRQWKV 2WKHU FRPPRQ GLVFUHWH GLVWULEXWLRQV LQFOXGH WKH SRLVVRQ DQG WKH K\SHUJHRPHWULF GLVWULEXWLRQV &RQWLQXRXV'LVWULEXWLRQV ,IDUDQGRPYDULDEOHFDQWDNHDQ\YDOXHZLWKLQDQLQWHUYDORUFRQWLQXXPLWLVFDOOHG DFRQWLQXRXVUDQGRPYDULDEOH+HLJKWZHLJKWEORRGSUHVVXUHDQGFKROHVWHUROOHYHO DUH XVXDOO\ FRQVLGHUHG FRQWLQXRXV UDQGRP YDULDEOHV EHFDXVH WKH\ FDQ WDNH DQ\ YDOXHZLWKLQFHUWDLQLQWHUYDOVHYHQWKRXJKWKHREVHUYHGPHDVXUHPHQWLVOLPLWHGE\ WKHDFFXUDF\RIWKHPHDVXULQJGHYLFH 7KHSUREDELOLW\GLVWULEXWLRQIRUDFRQWLQXRXVUDQGRPYDULDEOHFDQQRWEHVSHFLILHG LQDVLPSOHIRUPDVLWLVLQWKHGLVFUHWHH[DPSOHDERYH7RGRWKDWZRXOGHQWDLODQ LQILQLWH OLVW RI SUREDELOLWLHV RQH IRU HDFK SRVVLEOH YDOXH ZLWKLQ WKH LQWHUYDO 2QH ZD\ WR VSHFLI\ WKH GLVWULEXWLRQ IRU FRQWLQXRXV UDQGRP YDULDEOHV LV WR OLVW WKH SUREDELOLWLHV IRU UDQJHV RI ;YDOXHV +RZHYHU VXFK D VSHFLILFDWLRQ FDQ DOVR EH YHU\FXPEHUVRPH &RQWLQXRXVGLVWULEXWLRQVDUHPRVWFRQYHQLHQWO\DSSUR[LPDWHGE\IXQFWLRQVRIWKH UDQGRPYDULDEOH;VXFKDV3[([DPSOHVRIVXFKIXQFWLRQVDUH
3 [ IRU[
3 DH
RU
[
[
±D[
IRU[
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH QRUPDO GLVWULEXWLRQ LV WKH PRVW FRPPRQO\ XVHG FRQWLQXRXV GLVWULEXWLRQ LQ FOLQLFDOUHVHDUFKVWDWLVWLFV0DQ\QDWXUDOO\RFFXUULQJSKHQRPHQDIROORZWKHQRUPDO GLVWULEXWLRQZKLFKFDQEHH[SODLQHGE\DSRZHUIXOUHVXOWIURPSUREDELOLW\WKHRU\ NQRZQDVWKH&HQWUDO/LPLW7KHRUHPGLVFXVVHGLQWKHQH[WVHFWLRQ 7KHQRUPDOSUREDELOLW\GLVWULEXWLRQLVJLYHQE\WKHIXQFWLRQ [ - ± = IRU ± ¥ < [ < ¥ H 3[ ZKHUHmDQGsDUHFDOOHGµSDUDPHWHUV¶RIWKHGLVWULEXWLRQ)RUDQ\YDOXHVRImDQGs ! DSORWRI3[YHUVXV[KDVDµEHOO¶VKDSHLOOXVWUDWHGLQ$SSHQGL[% 2WKHU FRPPRQ FRQWLQXRXV GLVWULEXWLRQV DUH WKH H[SRQHQWLDO GLVWULEXWLRQ WKH FKL VTXDUH GLVWULEXWLRQ WKH)GLVWULEXWLRQDQGWKH6WXGHQWWGLVWULEXWLRQ$SSHQGL[% OLVWV VRPH DQDO\WLF SURSHUWLHV RI FRPPRQ FRQWLQXRXV GLVWULEXWLRQV XVHG LQ VWDWLVWLFDOLQIHUHQFHPHQWLRQHGWKURXJKRXWWKLVERRN 7KHQRUPDOFKLVTXDUH) DQGWGLVWULEXWLRQVDUHDOOLQWHUUHODWHGDQGVRPHRIWKHVHUHODWLRQVKLSVDUHVKRZQLQ $SSHQGL[% :KHWKHUGLVFUHWHRUFRQWLQXRXVHYHU\SUREDELOLW\GLVWULEXWLRQKDVWKHSURSHUW\WKDW WKHVXPRIWKHSUREDELOLWLHVRYHUDOO;YDOXHVHTXDOV 7KH&HQWUDO/LPLW7KHRUHP 7KH &HQWUDO /LPLW 7KHRUHP VWDWHV WKDW UHJDUGOHVV RI WKH GLVWULEXWLRQ RI PHDVXUHPHQWVVXPVDQGDYHUDJHVRIDODUJHQXPEHURIOLNHPHDVXUHPHQWVWHQGWR IROORZ WKH QRUPDO GLVWULEXWLRQ %HFDXVH PDQ\ PHDVXUHPHQWV UHODWHG WR JURZWK KHDOLQJRUGLVHDVHSURJUHVVLRQPLJKWEHUHSUHVHQWHGE\DVXPRUDQDFFXPXODWLRQ RILQFUHPHQWDOPHDVXUHPHQWVRYHUWLPHWKHQRUPDOGLVWULEXWLRQLVRIWHQDSSOLFDEOH WRFOLQLFDOGDWDIRUODUJHVDPSOHV 7R LOOXVWUDWH WKH &HQWUDO /LPLW 7KHRUHP FRQVLGHU WKH IROORZLQJ H[SHULPHQW $ SODFHER LQDFWLYH SLOO LVJLYHQWRQSDWLHQWVIROORZHGE\DQHYDOXDWLRQRQHKRXU ODWHU6XSSRVHWKDWHDFKSDWLHQW VHYDOXDWLRQFDQUHVXOWLQµLPSURYHPHQW¶FRGHGDV µQRFKDQJH¶ RUµGHWHULRUDWLRQ¶± ZLWKHDFKUHVXOWHTXDOO\SUREDEOH/HW ; ; ; UHSUHVHQWWKHPHDVXUHPHQWVIRUWKHQSDWLHQWVDQGGHILQH=WREHD UDQGRP YDULDEOH WKDW UHSUHVHQWV WKH VXP RI WKHVH HYDOXDWLRQ VFRUHV IRU DOO Q SDWLHQWV
Q
= ; ; ;
Q
&KDSWHU,QWURGXFWLRQ %DVLFV
)RUQ WKHSUREDELOLW\GLVWULEXWLRQRI=LVWKHVDPHDV;ZKLFKLVFRQVWDQWIRU DOOSRVVLEOHYDOXHVRI;7KLVLVFDOOHGDµXQLIRUP¶GLVWULEXWLRQ6HH)LJ ),*85(3UREDELOLW\'LVWULEXWLRQIRU= ;
=
3]
0.5 Pz 0
-1
0
1
Z )RUQ WKHUHDUHHTXDOO\SUREDEOHRXWFRPHVUHVXOWLQJLQSRVVLEOHGLVWLQFW YDOXHVIRU=DVVKRZQLQ7DEOH 7$%/($OO3RVVLEOH(TXDOO\3UREDEOH2XWFRPHVQ
3DWLHQW
3DWLHQW
=
3URE
7KHUHVXOWLQJSUREDELOLW\GLVWULEXWLRQIRU=LVVKRZQLQ)LJXUH ),*85(3UREDELOLW\'LVWULEXWLRQIRU= ;;
=
3]
0.5
Pz
0 -2
-1
0 Z
1
2
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
)RUQ =FDQWDNHYDOXHVIURPWR6HH)LJXUHIRUWKHGLVWULEXWLRQ ),*85(3UREDELOLW\'LVWULEXWLRQIRU= ;;;
0.3
=
3]
Pz
0 -3
-2
-1
0
1
2
3
Z
&KDSWHU,QWURGXFWLRQ %DVLFV
),*85(3UREDELOLW\'LVWULEXWLRQIRU = ;;;;;;;;
=
3]
0.2
Pz
0 -8
-6
-4
-2
0
2
4
6
8
Z
6WXG\'HVLJQ)HDWXUHV
6RXQGVWDWLVWLFDOUHVXOWVFDQEHYDOLGRQO\LIWKHVWXG\SODQLVZHOOWKRXJKWRXWDQG DFFRPSDQLHG E\ DSSURSULDWH GDWD FROOHFWLRQ WHFKQLTXHV (YHQ WKH PRVW VRSKLVWLFDWHG VWDWLVWLFDO WHVWV PLJKW QRW OHDG WR YDOLG LQIHUHQFHV RU DSSURSULDWH FKDUDFWHUL]DWLRQV RI WKH SRSXODWLRQ LI WKH VWXG\ LWVHOI LV IODZHG 7KHUHIRUH LW LV LPSHUDWLYH WKDW VWDWLVWLFDO GHVLJQ FRQVLGHUDWLRQV EH DGGUHVVHG LQ FOLQLFDO VWXGLHV GXULQJSURWRFROGHYHORSPHQW 7KHUHDUHPDQ\VWDWLVWLFDOGHVLJQFRQVLGHUDWLRQVWKDWJRLQWRWKHSODQQLQJVWDJHRID QHZVWXG\7KHSUREDELOLW\GLVWULEXWLRQRIWKHSULPDU\UHVSRQVHYDULDEOHVZLOOKHOS SUHGLFW KRZ WKH PHDVXUHPHQWV ZLOO YDU\ %HFDXVH JUHDWHU YDULDELOLW\ RI WKH PHDVXUHPHQWVUHTXLUHVDODUJHUVDPSOHVL]HGLVWULEXWLRQDODVVXPSWLRQVHQDEOHWKH FRPSXWDWLRQRIVDPSOHVL]HUHTXLUHPHQWVWRGLVWLQJXLVKDUHDOWUHQGIURPVWDWLVWLFDO YDULDWLRQ'HWHUPLQLQJWKHVDPSOHVL]HLVGLVFXVVHGLQ&KDSWHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
0HWKRGVWRKHOSUHGXFHUHVSRQVHYDULDELOLW\FDQDOVREHLQFRUSRUDWHGLQWRWKHVWXG\ GHVLJQ )HDWXUHV RI FRQWUROOHG FOLQLFDO WULDOV VXFK DV UDQGRPL]DWLRQ DQG EOLQGLQJ DQG VWDWLVWLFDO µQRLVHUHGXFLQJ¶ WHFKQLTXHV VXFK DV WKH XVH RI FRYDULDWHV VWUDWLILFDWLRQRUEORFNLQJIDFWRUVDQGWKHXVHRIZLWKLQSDWLHQWFRQWUROV DUHZD\VWR KHOS FRQWURO H[WUDQHRXV YDULDELOLW\ DQG IRFXV RQ WKH SULPDU\ UHVSRQVH PHDVXUHPHQWV &RQWUROOHG6WXGLHV $FRQWUROOHGVWXG\XVHVDNQRZQWUHDWPHQWZKLFKLVFDOOHGDµFRQWURO¶DORQJZLWK WKHWHVWWUHDWPHQWV$FRQWUROPD\EHLQDFWLYHVXFKDVDSODFHERRUVKDPRULWPD\ EHDQRWKHUDFWLYHWUHDWPHQWSHUKDSVDFXUUHQWO\PDUNHWHGSURGXFW $ VWXG\ WKDW XVHV D VHSDUDWH LQGHSHQGHQW JURXS RISDWLHQWVLQDFRQWUROJURXSLV FDOOHG D SDUDOOHOJURXS VWXG\ $ VWXG\ WKDW JLYHV ERWK WKH WHVW WUHDWPHQW DQG WKH FRQWUROWRWKHVDPHSDWLHQWVLVFDOOHGDZLWKLQSDWLHQWFRQWUROVWXG\ $FRQWUROOHGVWXG\KDVWKHDGYDQWDJHRIEHLQJDEOHWRHVWLPDWHWKHSXUHWKHUDSHXWLF HIIHFWRIWKHWHVWWUHDWPHQWE\FRPSDULQJLWVSHUFHLYHGEHQHILWUHODWLYHWRWKHEHQHILW RI WKH FRQWURO %HFDXVH WKH SHUFHLYHG EHQHILW PLJKW EH GXH WR QXPHURXV VWXG\ IDFWRUVRWKHUWKDQWKHWUHDWPHQWLWVHOIDFRQFOXVLRQRIWKHUDSHXWLFEHQHILWFDQQRWEH PDGH ZLWKRXW ILUVW UHPRYLQJ WKRVH RWKHU IDFWRUV IURP FRQVLGHUDWLRQ %HFDXVH WKH FRQWUROVDUHVXEMHFWWRWKHVDPHVWXG\IDFWRUVWUHDWPHQWHIIHFWUHODWLYHWRFRQWURO LQVWHDG RI DEVROXWH SHUFHLYHG EHQHILW LV PRUH UHOHYDQW LQ HVWLPDWLQJ DFWXDO WKHUDSHXWLFHIIHFW 5DQGRPL]DWLRQ 5DQGRPL]DWLRQLVDPHDQVRIREMHFWLYHO\DVVLJQLQJH[SHULPHQWDOXQLWVRUSDWLHQWVWR WUHDWPHQW JURXSV ,Q FOLQLFDO WULDOV WKLV LV GRQH E\ PHDQV RI D UDQGRPL]DWLRQ VFKHGXOHJHQHUDWHGSULRUWRVWDUWLQJWKHHQUROOPHQWRISDWLHQWV 7KH UDQGRPL]DWLRQ VFKHPH VKRXOG KDYH WKH SURSHUW\ WKDW DQ\ UDQGRPO\ VHOHFWHG SDWLHQWKDVWKHVDPHFKDQFHDVDQ\RWKHUSDWLHQWRIEHLQJLQFOXGHGLQDQ\WUHDWPHQW JURXS 5DQGRPL]DWLRQ LV XVHG LQ FRQWUROOHG FOLQLFDO WULDOV WR HOLPLQDWH V\VWHPDWLF WUHDWPHQWJURXSDVVLJQPHQWZKLFKPLJKWOHDGWRELDV,QDQRQUDQGRPL]HGVHWWLQJ SDWLHQWVZLWKWKHPRVWVHYHUHFRQGLWLRQPLJKWEHDVVLJQHGWRDJURXSEDVHGRQWKH WUHDWPHQW VDQWLFLSDWHGEHQHILW:KHWKHUWKLVDVVLJQPHQWLVLQWHQWLRQDORUQRWWKLV FUHDWHVELDVEHFDXVHWKHWUHDWPHQWJURXSVZRXOGUHSUHVHQWVDPSOHVIURPGLIIHUHQW SRSXODWLRQV VRPH RI ZKRP PLJKW KDYH PRUH VHYHUH FRQGLWLRQV WKDQ RWKHUV 5DQGRPL]DWLRQ ILOWHUV RXW VXFK VHOHFWLRQ ELDV DQG KHOSV HVWDEOLVK EDVHOLQH FRPSDUDELOLW\DPRQJWKHWUHDWPHQWJURXSV 5DQGRPL]DWLRQSURYLGHVDEDVLVIRUXQELDVHGFRPSDULVRQVRIWKHWUHDWPHQWJURXSV 2PLWWLQJ VSHFLILF UHVSRQVHV IURP WKH DQDO\VLV LV D IRUP RI WDPSHULQJ ZLWK WKLV UDQGRPL]DWLRQ DQG ZLOO SUREDEO\ ELDV WKH UHVXOWV LI WKH H[FOXVLRQV DUH PDGH LQ D QRQUDQGRPL]HGIDVKLRQ)RUWKLVUHDVRQWKHSULPDU\DQDO\VLVRIDFOLQLFDOWULDOLV RIWHQ EDVHG RQ WKH µLQWHQWWRWUHDW¶ SULQFLSOH ZKLFK LQFOXGHV DOO UDQGRPL]HG SDWLHQWV LQ WKH DQDO\VLV HYHQ WKRXJK VRPH PLJKW QRW FRPSO\ ZLWK SURWRFRO UHTXLUHPHQWV
&KDSWHU,QWURGXFWLRQ %DVLFV
%OLQGHG5DQGRPL]DWLRQ %OLQGHG RU PDVNHG UDQGRPL]DWLRQ LV RQH RI WKH PRVW LPSRUWDQW IHDWXUHV RI D FRQWUROOHG VWXG\ 6LQJOHEOLQG GRXEOHEOLQG DQG HYHQ WULSOHEOLQG VWXGLHV DUH FRPPRQDPRQJFOLQLFDOWULDOV $VLQJOHEOLQGVWXG\LVRQHLQZKLFKWKHSDWLHQWVDUHQRWDZDUHRIZKLFKWUHDWPHQW WKH\ UHFHLYH 0DQ\ SDWLHQWV DFWXDOO\ VKRZ D FOLQLFDO UHVSRQVH ZLWK PHGLFDO FDUH HYHQ LI WKH\ DUH QRW WUHDWHG 6RPH SDWLHQWV PLJKW UHVSRQG ZKHQ WUHDWHG ZLWK D SODFHER EXWDUHXQDZDUHWKDWWKHLUPHGLFDWLRQLVLQDFWLYH7KHVHDUHH[DPSOHVRI WKH ZHOONQRZQ SODFHER HIIHFW ZKLFK PLJKW KDYH D SV\FKRORJLFDO FRPSRQHQW GHSHQGHQW RQ WKH SDWLHQW V EHOLHI WKDW KH LV UHFHLYLQJ DSSURSULDWH FDUH $ SODFHERUHVSRQVHLVQRWXQFRPPRQLQPDQ\FOLQLFDOLQGLFDWLRQV 6XSSRVH WKDW D UHVSRQVH < FDQ EH UHSUHVHQWHG E\ D WUXH WKHUDSHXWLF UHVSRQVH FRPSRQHQW 75 DQG D SODFHER HIIHFW 3( /HWWLQJ VXEVFULSWV $ DQG 3 GHQRWH µDFWLYH¶DQGµSODFHER¶WUHDWPHQWVUHVSHFWLYHO\WKHHVWLPDWHGWKHUDSHXWLFEHQHILWRI WKHDFWLYHFRPSRXQGPLJKWEHPHDVXUHGE\WKHGLIIHUHQFH <$±<3 75$3($ ±7533(3 %HFDXVHDSODFHERKDVQRWKHUDSHXWLFEHQHILW753 :LWK3(D 3($±3(3\RX REWDLQ <$±<3 75$3(D :KHQSDWLHQWVDUHXQDZDUHRIWKHLUWUHDWPHQWWKHSODFHERHIIHFW3( VKRXOGEHWKH VDPHIRUERWKJURXSVPDNLQJ3(D 7KHUHIRUHWKHGLIIHUHQFHLQUHVSRQVHYDOXHV HVWLPDWHVWKHWUXHWKHUDSHXWLFEHQHILWRIWKHDFWLYHFRPSRXQG +RZHYHULISDWLHQWVNQRZZKLFKWUHDWPHQWWKH\KDYHEHHQDVVLJQHGWKHSODFHER HIIHFWLQWKHDFWLYHJURXSPLJKWGLIIHUIURPWKDWRIWKHFRQWUROJURXSSHUKDSVGXH WR EHWWHU FRPSOLDQFH RU H[SHFWDWLRQ RI EHQHILW ,Q WKLV FDVH WKH HVWLPDWH RI WKHUDSHXWLFEHQHILWLVFRQWDPLQDWHGE\DQRQ]HUR3(D ,QDGGLWLRQELDVZKHWKHUFRQVFLRXVRUQRWPLJKWDULVHLIWKHLQYHVWLJDWRUGRHVQRW HYDOXDWHDOOSDWLHQWVXQLIRUPO\(YDOXDWLRQRIVWXG\PHDVXUHPHQWVVXFKDVJOREDO DVVHVVPHQWV DQG GHFLVLRQV UHJDUGLQJ GRVLQJ FKDQJHV YLVLW WLPLQJ XVH RI FRQFRPLWDQW PHGLFDWLRQV DQG GHJUHH RI IROORZXS UHODWLQJ WR DGYHUVH HYHQWV RU DEQRUPDOODEV PLJKWEHDIIHFWHGE\WKHLQYHVWLJDWRU¶VNQRZOHGJHRIWKHSDWLHQW¶V WUHDWPHQW6XFKELDVFDQEHFRQWUROOHGE\GRXEOHEOLQGLQJWKHVWXG\ZKLFKPHDQV WKDW LQIRUPDWLRQ UHJDUGLQJ WUHDWPHQW JURXS DVVLJQPHQW LV ZLWKKHOG IURP WKH LQYHVWLJDWRUDVZHOODVWKHSDWLHQW 'RXEOHEOLQGLQJ LV DFRPPRQDQGLPSRUWDQWIHDWXUHRIDFRQWUROOHGFOLQLFDOWULDO HVSHFLDOO\ ZKHQ HYDOXDWLRQV DUH RSHQ WR VRPH GHJUHH RI VXEMHFWLYLW\ +RZHYHU GRXEOHEOLQGLQJLVQRWDOZD\VSRVVLEOHRUSUDFWLFDO)RUH[DPSOHWHVWDQGFRQWURO WUHDWPHQWVPLJKWQRWEHDYDLODEOHLQWKHVDPHIRUPXODWLRQ,QVXFKFDVHVWUHDWPHQW FDQVRPHWLPHVEHDGPLQLVWHUHGE\RQHLQYHVWLJDWRUDQGWKHHYDOXDWLRQVSHUIRUPHG
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
E\ D FRLQYHVWLJDWRU DW WKH VDPH FHQWHU LQ DQ DWWHPSW WR PDLQWDLQ VRPH VRUW RI PDVNLQJRIWKHLQYHVWLJDWRU 6WXGLHV FDQ DOVR EH WULSOHEOLQG ZKHUHLQ WKH SDWLHQW LQYHVWLJDWRU DQG FOLQLFDO SURMHFWWHDPLQFOXGLQJWKHVWDWLVWLFLDQ DUHXQDZDUHRIWKHWUHDWPHQWDGPLQLVWHUHG XQWLOWKHVWDWLVWLFDODQDO\VLVLVFRPSOHWH7KLVUHGXFHVDWKLUGOHYHORISRWHQWLDOELDV WKDWRIWKHLQWHUSUHWDWLRQRIWKHUHVXOWV 6HOHFWLRQ RI DSSURSULDWH VWDWLVWLFDO PHWKRGV IRU GDWD DQDO\VLV LQ FRQILUPDWRU\ VWXGLHV VKRXOG EH GRQH LQ D EOLQGHG PDQQHU ZKHQHYHU SRVVLEOH 8VXDOO\ WKLV LV DFFRPSOLVKHG WKURXJK WKH GHYHORSPHQW RI D VWDWLVWLFDO DQDO\VLV SODQ SULRU WR FRPSOHWLQJ GDWD FROOHFWLRQ 6XFK D SODQ KHOSV UHPRYH WKH SRWHQWLDO IRU ELDVHV DVVRFLDWHGZLWKGDWDGULYHQPHWKRGRORJ\,WDOVRHOLPLQDWHVWKHDELOLW\WRVHOHFWD PHWKRGIRUWKHSXUSRVHRISURGXFLQJDUHVXOWFORVHVWWRWKHRXWFRPHWKDWLVEHLQJ VRXJKW 6HOHFWLRQRI6WDWLVWLFDO0HWKRGV )HDWXUHV RI FRQWUROOHGFOLQLFDOWULDOVVXFKDVUDQGRPL]DWLRQDQGEOLQGLQJKHOSWR OLPLW ELDV ZKHQ PDNLQJ VWDWLVWLFDO LQIHUHQFHV 7KH VWDWLVWLFDO PHWKRGV WKHPVHOYHV PLJKW DOVR LQWURGXFH ELDV LI WKH\ DUH µGDWDGULYHQ¶ WKDW LV WKH PHWKRG LV VHOHFWHG EDVHGRQWKHVWXG\RXWFRPHV,QPRVWFDVHVWKHVWXG\GHVLJQDQGREMHFWLYHVZLOO SRLQW WR WKH PRVW DSSURSULDWH VWDWLVWLFDO PHWKRGV IRU WKH SULPDU\ DQDO\VLV 7KHVH PHWKRGV DUH XVXDOO\ GHWDLOHG LQ D IRUPDO DQDO\VLV SODQ SUHSDUHG SULRU WR GDWD FROOHFWLRQ DQG WKHUHIRUH UHSUHVHQW WKH EHVW µWKHRUHWLFDO¶ PHWKRGRORJ\ QRW LQIOXHQFHGE\WKHGDWD 2IWHQ VXIILFLHQW NQRZOHGJH RI WKH YDULDELOLW\ DQG GLVWULEXWLRQ RI WKH UHVSRQVH LQ 3KDVHRULQSLYRWDOWULDOVLVREWDLQHGIURPSUHYLRXVVWXGLHV,IQHFHVVDU\WKHUHDUH ZD\V WR FRQILUP GLVWULEXWLRQDO DVVXPSWLRQV EDVHG RQ SUHOLPLQDU\ EOLQGHG GDWD LQ RUGHU WR IXOO\ SUHVSHFLI\ WKH PHWKRGRORJ\ %HFDXVH GLIIHUHQW VWDWLVWLFDO PHWKRGV PLJKWOHDGWRGLIIHUHQWFRQFOXVLRQVIDLOXUHWRSUHVSHFLI\WKHPHWKRGVPLJKWOHDGWR WKHDSSHDUDQFHRIVHOHFWLQJDPHWKRGWKDWUHVXOWVLQWKHPRVWGHVLUDEOHFRQFOXVLRQ 0HWKRGRORJ\ELDVLVRQHFRQFHUQDGGUHVVHGE\DQDQDO\VLVSODQ0RUHLPSRUWDQWO\ SUHVSHFLI\LQJ PHWKRGRORJ\ KHOSV WR HQVXUH WKDW WKH VWXG\ REMHFWLYHV DUH DSSURSULDWHO\DGGUHVVHG7KHVWDWLVWLFDOPHWKRGVHOHFWHGZLOOGHSHQGYHU\VWURQJO\ RQWKHDFWXDOREMHFWLYHRIWKHVWXG\&RQVLGHUDWULDOWKDWLQFOXGHVWKUHHGRVHVRIDQ DFWLYH FRPSRXQG DQG DQ LQDFWLYH SODFHER 3RVVLEOH VWXG\ REMHFWLYHV LQFOXGH GHWHUPLQLQJLI WKHUHLVDQ\GLIIHUHQFHDPRQJWKHIRXUJURXSVEHLQJVWXGLHG DQ\RIWKHDFWLYHGRVHVLVEHWWHUWKDQWKHSODFHER WKHKLJKHVWGRVHLVVXSHULRUWRWKHORZHUGRVHV WKHUHLVDGRVHUHVSRQVH $GLIIHUHQWVWDWLVWLFDOPHWKRGPLJKWEHUHTXLUHGIRUHDFKRIWKHVHREMHFWLYHV7KH VWXG\REMHFWLYHPXVWEHFOHDUEHIRUHWKHVWDWLVWLFDOPHWKRGFDQEHVHOHFWHG
&KDSWHU,QWURGXFWLRQ %DVLFV
'HVFULSWLYH6WDWLVWLFV
'HVFULSWLYHVWDWLVWLFVGHVFULEHWKHSUREDELOLW\GLVWULEXWLRQRIWKHSRSXODWLRQ7KLVLV GRQH E\ XVLQJ KLVWRJUDPV WR GHSLFW WKH VKDSH RI WKH GLVWULEXWLRQ E\ HVWLPDWLQJ GLVWULEXWLRQDOSDUDPHWHUVDQGE\FRPSXWLQJYDULRXVPHDVXUHVRIFHQWUDOWHQGHQF\ DQGGLVSHUVLRQ $ KLVWRJUDP LV D SORW RI WKH PHDVXUHG YDOXHV RI D UDQGRP YDULDEOH E\ WKHLU IUHTXHQF\)RUH[DPSOHKHLJKWPHDVXUHPHQWVIRU\HDUROGPDOHVWXGHQWVFDQEH GHVFULEHGE\DVDPSOHKLVWRJUDPEDVHGRQVWXGHQWV6HH)LJXUH
)LJXUH+LVWRJUDPRI+HLJKW0HDVXUHPHQWVQ
Frequency
6 4 2 0 60 62 64 66 68 70 72 74 76 78 Height (in.) ,I PRUHDQGPRUH PHDVXUHPHQWV DUH WDNHQ WKH KLVWRJUDP PLJKW EHJLQ ORRNLQJ OLNH D µEHOOVKDSHG¶ FXUYH ZKLFK LV FKDUDFWHULVWLF RI D QRUPDO GLVWULEXWLRQ 6HH )LJXUH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
)LJXUH+LVWRJUDPRI+HLJKW0HDVXUHPHQWVQ
Frequency
58 60 62 64 66 68 70 72 74 76 78 80 Height (in) ,I \RX DVVXPH WKH SRSXODWLRQ GLVWULEXWLRQ FDQ EH PRGHOHG ZLWK D NQRZQ GLVWULEXWLRQVXFKDVWKHQRUPDO \RXQHHGRQO\HVWLPDWHWKHSDUDPHWHUVDVVRFLDWHG ZLWK WKDW GLVWULEXWLRQ LQ RUGHU WR IXOO\ GHVFULEH LW 7KH ELQRPLDO GLVWULEXWLRQ KDV RQO\RQHSDUDPHWHUSZKLFKFDQEHGLUHFWO\HVWLPDWHGIURPWKHREVHUYHGGDWD7KH QRUPDO GLVWULEXWLRQ KDV WZR SDUDPHWHUV m DQG s UHSUHVHQWLQJ WKH PHDQ DQG YDULDQFHUHVSHFWLYHO\ 6XSSRVHDVDPSOHRIQPHDVXUHPHQWVGHQRWHGE\[[[QLVREWDLQHG9DULRXV GHVFULSWLYH VWDWLVWLFV FDQ EH FRPSXWHG IURP WKHVH PHDVXUHPHQWV WR KHOS GHVFULEH WKH SRSXODWLRQ 7KHVH LQFOXGH PHDVXUHV RI FHQWUDO WHQGHQF\ ZKLFK GHVFULEH WKH FHQWHURIWKHGLVWULEXWLRQDQGPHDVXUHVRIGLVSHUVLRQZKLFKGHVFULEHWKHYDULDWLRQ RIWKHGDWD&RPPRQH[DPSOHVRIHDFKDUHVKRZQLQ7DEOH ,Q DGGLWLRQ WR GLVWULEXWLRQDO SDUDPHWHUV \RX VRPHWLPHV ZDQW WR HVWLPDWH SDUDPHWHUV DVVRFLDWHG ZLWK D VWDWLVWLFDO PRGHO ,I DQ XQNQRZQ UHVSRQVH FDQ EH PRGHOHG DV D IXQFWLRQ RI NQRZQ RU FRQWUROOHG YDULDEOHV \RX FDQ RIWHQ REWDLQ YDOXDEOH LQIRUPDWLRQ UHJDUGLQJ WKH UHVSRQVH E\ HVWLPDWLQJ WKH ZHLJKWV RU FRHIILFLHQWVRIHDFKRIWKHVHNQRZQYDULDEOHV7KHVHFRHIILFLHQWVDUHFDOOHGPRGHO SDUDPHWHUV 7KH\ DUH HVWLPDWHG LQ D ZD\ WKDW UHVXOWV LQ WKH JUHDWHVW FRQVLVWHQF\ EHWZHHQWKHPRGHODQGWKHREVHUYHGGDWD
&KDSWHU,QWURGXFWLRQ %DVLFV
7$%/(&RPPRQ'HVFULSWLYH6WDWLVWLFV
0HDVXUHVRI &HQWUDO7HQGHQF\ $ULWKPHWLF0HDQ
[ S[ Q [ [ [ Q
0HGLDQ
WKHPLGGOHYDOXHLIQLVRGGWKHDYHUDJHRI WKHWZRPLGGOHYDOXHVLIQLVHYHQ SHUFHQWLOH
L
Q
WK
0RGH
WKHPRVWIUHTXHQWO\RFFXUULQJYDOXH
*HRPHWULF0HDQ
P[ [ ¼[ ¼¼[
+DUPRQLF0HDQ
QS[ Q^[ [ [ `
:HLJKWHG0HDQ
[ SZ [ :ZKHUH: SZ
7ULPPHG0HDQ
$ULWKPHWLFPHDQRPLWWLQJWKHODUJHVWDQG VPDOOHVWREVHUYDWLRQV
:LQVRUL]HG0HDQ
$ULWKPHWLFPHDQDIWHUUHSODFLQJRXWOLHUVZLWK WKHFORVHVWQRQRXWOLHUYDOXHV
Q
Q
L
Q
±
±
L
Z
L
L
Q
L
0HDVXUHVRI 'LVSHUVLRQ 9DULDQFH
V S[ ± [ Q±
6WDQGDUG'HYLDWLRQ
V VTXDUHURRWRIWKHYDULDQFH
6WDQGDUG(UURURIWKH PHDQ
V Q 6WDQGDUGGHYLDWLRQRI [
5DQJH
/DUJHVWYDOXH6PDOOHVWYDOXH
0HDQ$EVROXWH 'HYLDWLRQ
S_[ ± [ _ Q
,QWHU4XDUWLOH5DQJH
SHUFHQWLOH± SHUFHQWLOH
&RHIILFLHQWRI9DULDWLRQ
V [
L
L
WK
WK
'HVFULSWLYH VWDWLVWLFDO PHWKRGV DUH RIWHQ WKH RQO\ DSSURDFK WKDW FDQ EH XVHG IRU DQDO\]LQJWKHUHVXOWVRISLORWVWXGLHVRU3KDVH,FOLQLFDOWULDOV'XHWRVPDOOVDPSOH VL]HVWKHODFNRIEOLQGLQJRUWKHRPLVVLRQRIRWKHUIHDWXUHVRIDFRQWUROOHGWULDO VWDWLVWLFDOLQIHUHQFHPLJKWQRWEHSRVVLEOH+RZHYHUWUHQGVRUSDWWHUQVREVHUYHGLQ WKH GDWD E\ XVLQJ GHVFULSWLYH RU H[SORUDWRU\ PHWKRGV ZLOO RIWHQ KHOS LQ EXLOGLQJ K\SRWKHVHVDQGLGHQWLI\LQJLPSRUWDQWFRIDFWRUV7KHVHQHZK\SRWKHVHVFDQWKHQEH WHVWHG LQ D PRUH FRQWUROOHG PDQQHU LQ VXEVHTXHQW VWXGLHV ZKHUHLQ LQIHUHQWLDO VWDWLVWLFDOPHWKRGVZRXOGEHPRUHDSSURSULDWH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,QIHUHQWLDO6WDWLVWLFV
7KHWZRSULPDU\VWDWLVWLFDOPHWKRGVIRUPDNLQJLQIHUHQFHVDUHFRQILGHQFHLQWHUYDO HVWLPDWLRQDQGK\SRWKHVLVWHVWLQJ
&RQILGHQFH,QWHUYDOV 3RSXODWLRQSDUDPHWHUVVXFKDVWKHPHDQm RUWKHVWDQGDUGGHYLDWLRQs FDQEH HVWLPDWHG E\ XVLQJ D SRLQW HVWLPDWH VXFK DV WKH VDPSOH PHDQ [ RU WKH VDPSOH VWDQGDUG GHYLDWLRQ V $ FRQILGHQFH LQWHUYDO LV DQ LQWHUYDO DURXQG WKH SRLQW HVWLPDWHWKDWFRQWDLQVWKHSDUDPHWHUZLWKDVSHFLILFKLJKSUREDELOLW\RUFRQILGHQFH OHYHO $ FRQILGHQFH LQWHUYDO IRU WKH PHDQ m FDQ EH FRQVWUXFWHG IURP WKH VDPSOH GDWD ZLWK WKH IROORZLQJ LQWHUSUHWDWLRQ ,I WKH VDPH H[SHULPHQW ZHUH FRQGXFWHGDODUJHQXPEHURIWLPHVDQGFRQILGHQFHLQWHUYDOVZHUHFRQVWUXFWHGIRU HDFKDSSUR[LPDWHO\RIWKRVHLQWHUYDOVZRXOGFRQWDLQWKHSRSXODWLRQPHDQm 7KH JHQHUDO IRUP RI D FRQILGHQFH LQWHUYDO LV >q/ ± q8@ ZKHUH q/ UHSUHVHQWV WKH ORZHUOLPLWDQGq8LVWKHXSSHUOLPLWRIWKHLQWHUYDO,IWKHSUREDELOLW\GLVWULEXWLRQRI WKHSRLQWHVWLPDWHLVV\PPHWULFVXFKDVWKHQRUPDOGLVWULEXWLRQ WKHLQWHUYDOFDQ EHIRXQGE\
A&Â s Ö q q
A LV WKH SRLQW HVWLPDWH RI WKH SRSXODWLRQ SDUDPHWHU q s Ö LV WKH VWDQGDUG ZKHUH q q HUURU RI WKH HVWLPDWH DQG & UHSUHVHQWV D YDOXH GHWHUPLQHG E\ WKH SUREDELOLW\ GLVWULEXWLRQRIWKHHVWLPDWHDQGWKHVLJQLILFDQFHOHYHOWKDW\RXZDQW:KHQ sqÖ LV XQNQRZQWKHHVWLPDWH sÖ qÖ PD\EHXVHG )RU H[DPSOH IRU a EHWZHHQ DQG D ±a FRQILGHQFH LQWHUYDO IRU D QRUPDOSRSXODWLRQPHDQm LV
£ [=a Âs» Q
£
ZKHUH WKH SRLQW HVWLPDWH RI m LV [ WKH VWDQGDUG HUURU RI [ LV s» Q DQG WKH YDOXHRI=aLVIRXQGLQWKHQRUPDOSUREDELOLW\WDEOHV6HH$SSHQGL[$ 6RPH FRPPRQO\XVHGYDOXHVRIaDQGWKHFRUUHVSRQGLQJFULWLFDO=YDOXHVDUH
a
=a
&KDSWHU,QWURGXFWLRQ %DVLFV
,QPRVWFDVHVWKHVWDQGDUGGHYLDWLRQs ZLOOQRWEHNQRZQ,ILWFDQEHHVWLPDWHG XVLQJWKHVDPSOHVWDQGDUGGHYLDWLRQV D±a FRQILGHQFHLQWHUYDOIRUWKH PHDQm FDQEHIRUPHGDV
£ [Wa ÂV» Q
ZKHUHWaLVIRXQGIURPWKH6WXGHQWWSUREDELOLW\WDEOHVVHH$SSHQGL[$ EDVHG RQ WKH QXPEHU RI GHJUHHV RI IUHHGRP LQWKLVFDVHQ±)RUH[DPSOHDYDOXHRI Wa ZRXOGEHXVHGIRUDFRQILGHQFHLQWHUYDOZKHQQ 0DQ\6$6SURFHGXUHVZLOOSULQWSRLQWHVWLPDWHVRISDUDPHWHUVZLWKWKHLUVWDQGDUG HUURUV 7KHVH SRLQW HVWLPDWHV FDQ EH XVHG WR IRUP FRQILGHQFH LQWHUYDOV XVLQJ WKH JHQHUDO IRUP IRU qÖ WKDW LV JLYHQ DERYH 6RPH RI WKH PRVW FRPPRQO\ XVHG FRQILGHQFH LQWHUYDOV DUH IRU SRSXODWLRQ PHDQV m GLIIHUHQFHV LQ PHDQV EHWZHHQ WZRSRSXODWLRQVm±m SRSXODWLRQSURSRUWLRQVS DQGGLIIHUHQFHVLQSURSRUWLRQV EHWZHHQ WZR SRSXODWLRQV S ± S )RU HDFK RI WKHVH WKH IRUP IRU qÖ DQG LWV VWDQGDUGHUURUDUHVKRZQLQ7DEOH 7$%/(&RQILGHQFH,QWHUYDO&RPSRQHQWV$VVRFLDWHGZLWK 0HDQVDQG3URSRUWLRQV
q
A q
sqÖ
sÖ qÖ
&
m
[
s Q
V Q
=a LIsLVNQRZQWa LIsLVXQNQRZQ
[ - [
m ±m
V Q Q
s Q s Q
=a LI s DQGs DUHNQRZQ Wa LIs RUs LVXQNQRZQ,IXQNQRZQ DVVXPHHTXDOYDULDQFHVDQGXVH
V Q ± V Q ± V Q Q ±
S
SA [Q
S ±S
S±S Q
S ±S Q
DSSOLHVWRODUJHVDPSOHV
[µHYHQWV¶LQQELQRPLDOWULDOV =a
A Q SA ±S
=a
SA±SA Q
A Q SA ±SA S ±S Q SA ±S
SA [ Q IRUL L
L
L
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
+\SRWKHVLV7HVWLQJ
+\SRWKHVLV WHVWLQJLVDPHDQVRIIRUPDOL]LQJWKHLQIHUHQWLDOSURFHVVIRUGHFLVLRQ PDNLQJ SXUSRVHV ,W LV D VWDWLVWLFDO DSSURDFK IRU WHVWLQJ K\SRWKHVL]HG VWDWHPHQWV DERXWSRSXODWLRQSDUDPHWHUVEDVHGRQORJLFDODUJXPHQW 7R XQGHUVWDQG WKH FRQFHSW EHKLQG WKH K\SRWKHVLV WHVW OHW¶V H[DPLQH D IRUP RI GHGXFWLYHDUJXPHQWIURPORJLFXVLQJWKHIROORZLQJH[DPSOH ,I\RXKDYHDQDSSOH\RXGRQRWKDYHDQRUDQJH
LI3WKHQSUREDEO\QRW4 4 BBBBBBBBBBBBBBBBBBBB WKHUHIRUHSUREDEO\QRW3
FRQGLWLRQDOSUHPLVH SUHPLVH FRQFOXVLRQ
&KDSWHU,QWURGXFWLRQ %DVLFV
7KHIROORZLQJH[DPSOHVLOOXVWUDWHVXFKµVWDWLVWLFDODUJXPHQWV¶
([DPSOH 6WDWHPHQWV 3 WKHFRLQLVIDLU 4 \RXREVHUYHWDLOVLQDURZ $UJXPHQW ,IWKHFRLQLVIDLU\RXZRXOGSUREDEO\QRWREVHUYHWDLOVLQ DURZ
,QWKHILUVWH[DPSOH\RXPLJKWLQLWLDOO\VXVSHFWWKHFRLQRIEHLQJELDVHGLQIDYRURI WDLOV 7R WHVW WKLV K\SRWKHVLV DVVXPH WKH QXOO FDVHZKLFKLVWKDWWKHFRLQLVIDLU 7KHQGHVLJQDQH[SHULPHQWWKDWFRQVLVWVRIWRVVLQJWKHFRLQWLPHVDQGUHFRUGLQJ WKHRXWFRPHRIHDFKWRVV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
)RUPDOO\ WKH VWXG\ LV VHW RXW E\ LGHQWLI\LQJ WKH K\SRWKHVLV GHYHORSLQJ D WHVW FULWHULRQDQGIRUPXODWLQJDGHFLVLRQUXOH)RU([DPSOH
1XOOK\SRWKHVLV:WKHFRLQLVIDLU $OWHUQDWLYH WKHFRLQLVELDVHGLQIDYRURIWDLOV 7HVWFULWHULRQ: WKHQXPEHURIWDLOVLQFRQVHFXWLYH WRVVHVRIWKHFRLQ 'HFLVLRQUXOH: UHMHFWWKHQXOOK\SRWKHVLVLIDOOWRVVHV UHVXOWLQ WDLOV
)LUVWHVWDEOLVKWKHK\SRWKHVLV37KHK\SRWKHVLVLVWHVWHGE\REVHUYLQJWKHUHVXOWV RIWKHVWXG\RXWFRPH4,I\RXFDQGHWHUPLQHWKDWWKHSUREDELOLW\RIREVHUYLQJ4LV YHU\ VPDOO ZKHQ 3 LV WUXH DQG \RX GR REVHUYH 4 \RX FDQ FRQFOXGH WKDW 3 LV SUREDEO\ QRW WUXH 7KH GHJUHH RI FHUWDLQW\ RI WKH FRQFOXVLRQ LV UHODWHG WR WKH SUREDELOLW\DVVRFLDWHGZLWK4DVVXPLQJ3LVWUXH +\SRWKHVLVWHVWLQJFDQEHVHWIRUWKLQDQDOJRULWKPZLWKSDUWV WKHQXOOK\SRWKHVLVDEEUHYLDWHG+ WKHDOWHUQDWLYHK\SRWKHVLVDEEUHYLDWHG+$ WKHWHVWFULWHULRQ WKHGHFLVLRQUXOH WKHFRQFOXVLRQ 7KH QXOO K\SRWKHVLV LV WKH VWDWHPHQW 3 WUDQVODWHG LQWR WHUPV LQYROYLQJ WKH SRSXODWLRQ SDUDPHWHUV ,Q ([DPSOH µWKH FRLQ LV IDLU¶ LV HTXLYDOHQW WR µWKH SUREDELOLW\RIWDLOVRQDQ\WRVVLVò¶3DUDPHWULFDOO\WKLVLVVWDWHGLQWHUPVRIWKH ELQRPLDOSDUDPHWHUSZKLFKUHSUHVHQWVWKHSUREDELOLW\RIWDLOV + S 7KHDOWHUQDWLYHK\SRWKHVLVLVµQRW3¶RU
+ S! 8VXDOO\ \RX WDNH µQRW 3¶ DV WKH K\SRWKHVLV WR EH GHPRQVWUDWHG EDVHG RQ DQ DFFHSWDEOHULVNIRUGHILQLQJµSUREDEO\¶DVXVHGLQ([DPSOHVDQG $
7KHWHVWFULWHULRQRUµWHVWVWDWLVWLF¶LVVRPHIXQFWLRQRIWKHREVHUYHGGDWD7KLVLV VWDWHPHQW4RIWKHVWDWLVWLFDODUJXPHQW6WDWHPHQW4PLJKWEHWKHQXPEHURIWDLOV LQ WRVVHV RI D FRLQ RU WKH QXPEHU RI LPSURYHG DUWKULWLF SDWLHQWV DV XVHG LQ ([DPSOHVDQGRU\RXPLJKWXVHDPRUHFRPSOH[IXQFWLRQRIWKHGDWD2IWHQWKH WHVWVWDWLVWLFLVDIXQFWLRQRIWKHVDPSOHPHDQDQGYDULDQFHRUVRPHRWKHUVXPPDU\ VWDWLVWLFV
&KDSWHU,QWURGXFWLRQ %DVLFV
7KHGHFLVLRQUXOHUHVXOWVLQWKHUHMHFWLRQRIWKHQXOOK\SRWKHVLVLIXQOLNHO\YDOXHVRI WKHWHVWVWDWLVWLFDUHREVHUYHGZKHQDVVXPLQJWKHWHVWVWDWLVWLFLVWUXH7RGHWHUPLQH D GHFLVLRQ UXOH WKH GHJUHH RI VXFK µXQOLNHOLQHVV¶ QHHGV WR EH VSHFLILHG 7KLV LV UHIHUUHGWRDVWKHVLJQLILFDQFHOHYHORIWKHWHVWGHQRWHG a DQGLQFOLQLFDOWULDOVLV RIWHQEXWQRWDOZD\V VHWWR%\NQRZLQJWKHSUREDELOLW\GLVWULEXWLRQRIWKH WHVW VWDWLVWLF ZKHQ WKH QXOO K\SRWKHVLV LV WUXH \RX FDQ LGHQWLI\ WKH PRVW H[WUHPH aRIWKHYDOXHVDVDUHMHFWLRQUHJLRQ7KHGHFLVLRQUXOHLVVLPSO\UHMHFW+ ZKHQWKHWHVWVWDWLVWLFIDOOVLQWKHUHMHFWLRQUHJLRQ
6HH&KDSWHUIRUPRUHLQIRUPDWLRQDERXWVLJQLILFDQFHOHYHOV
6XPPDU\
7KLVLQWURGXFWRU\FKDSWHUSURYLGHVVRPHRIWKHEDVLFFRQFHSWVRIVWDWLVWLFVJLYHVDQ RYHUYLHZ RI VWDWLVWLFV DV D VFLHQWLILF GLVFLSOLQH DQG VKRZV WKDW WKH UHVXOWV RI D VWDWLVWLFDO DQDO\VLV FDQ EH QR EHWWHU WKDQ WKH GDWD FROOHFWHG
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
&++$$3377((55 7RSLFVLQ+\SRWKHVLV7HVWLQJ
6LJQLILFDQFH/HYHOV 3RZHU 2QH7DLOHGDQG7ZR7DLOHG7HVWV S9DOXHV 6DPSOH6L]H'HWHUPLQDWLRQ 0XOWLSOH7HVWLQJ 6XPPDU\
6LJQLILFDQFH/HYHOV
:KHQFRQGXFWLQJK\SRWKHVLVWHVWLQJDQHUURQHRXVFRQFOXVLRQLVPDGHLIWKHQXOO K\SRWKHVLVLVUHMHFWHGZKHQLWLVUHDOO\WUXH7KLVHUURULVFDOOHGD7\SH,HUURUDQG LWVSUREDELOLW\LVGHQRWHGE\ aZKLFKLVNQRZQDVWKHµVLJQLILFDQFHOHYHO¶RIWKH WHVW :KHQ VHWWLQJ XS WKH K\SRWKHVLV WHVW WKH UHMHFWLRQ UHJLRQ LV VHOHFWHG EDVHG RQ D SUHGHWHUPLQHG YDOXH IRU a XVXDOO\ D VPDOO YDOXH VXFK DV 7KLV PHDQV WKDW WKHUHLVRQO\DFKDQFHRIUHMHFWLQJDWUXHQXOOK\SRWKHVLV )RU H[DPSOH VXSSRVH WKDW DGPLQLVWUDWLRQ RI D GUXJ ZDV VXVSHFWHG WR FDXVH LQFUHDVHGDONDOLQHSKRVSKDWDVHOHYHOVLQDGXOWPDOHVDSRSXODWLRQNQRZQWRKDYH DQDONDOLQHSKRVSKDWDVHPHDQRI8OLQDFHUWDLQODERUDWRU\7RWHVWWKLVWKHQXOO DQGDOWHUQDWLYHK\SRWKHVHVDUHVHWDV
+ m
YHUVXV
+ m! $
ZKHUHmUHSUHVHQWVWKHSRSXODWLRQPHDQDONDOLQHSKRVSKDWDVHLQDOOPHQZKRPLJKW TXDOLI\WRUHFHLYHWKHGUXJDQGEHWHVWHGDWWKLVWHVWLQJIDFLOLW\
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$ VDPSOH RI Q PHQ WUHDWHG ZLWK WKH GUXJ LV REVHUYHG DQG WKHLU DONDOLQH SKRVSKDWDVHOHYHOVDUHPHDVXUHG7KH=WHVWZKLFKLVEDVHGRQWKHVWDQGDUGQRUPDO GLVWULEXWLRQDQGFRPSXWHGIURPWKHVDPSOHPHDQ [ LVFKRVHQDVWKHWHVWVWDWLVWLF $FFRUGLQJWRWKH&HQWUDO/LPLW7KHRUHP&KDSWHU [ KDVDQRUPDOGLVWULEXWLRQ ZLWKPHDQmDQGVWDQGDUGHUURU Q IRUODUJHQVRWKDW [ ±
= Q KDVDµVWDQGDUGQRUPDO¶GLVWULEXWLRQVHH$SSHQGL[% 7KHQXOOK\SRWKHVLVZRXOGEHFRQWUDGLFWHGLIWKHVDPSOHPHDQ [ LVPXFKJUHDWHU WKDQWKHNQRZQPHDQ7KHGHFLVLRQUXOHLVWRUHMHFW+ LQIDYRURI+ ZKHQWKH WHVWVWDWLVWLFLVWRRODUJHFRPSXWHGXQGHUWKHDVVXPSWLRQWKDW+ LVWUXH
$
[ ± = Q 7KH UHMHFWLRQ UHJLRQ LV = ! F ZKHUH F LV VHOHFWHG DFFRUGLQJ WR WKH FKRVHQ VLJQLILFDQFHOHYHOa7KDWLV
a 3UUHMHFW+ ZKHQ+ LVWUXH 3U= !F 7KHFULWLFDOYDOXHFFDQEHGHQRWHGE\=aZKLFKLVIRXQGIURPZLGHO\DYDLODEOH WDEOHVRIWKHSUREDELOLWLHVIRUWKHVWDQGDUGQRUPDOGLVWULEXWLRQLQFOXGLQJ$SSHQGL[ $RIWKLVERRN)RUWKHFRPPRQO\XVHGYDOXHRIa =a
6XSSRVHWKDWSUHYLRXVODERUDWRU\WHVWLQJDWWKHVWXG\ODERUDWRU\HVWDEOLVKHGDPHDQ DONDOLQHSKRVSKDWDVHOHYHORI8OZLWKDVWDQGDUGGHYLDWLRQRIs $FXUUHQW VDPSOH RI WUHDWHG PHQ UHVXOWHG LQ D VDPSOH PHDQ RI 8O 7KH =WHVW VXPPDU\LV
QXOOK\SRWKHVLV DOWK\SRWKHVLV WHVWVWDWLVWLF
+ m + m!
UHMHFWLRQUHJLRQ
5HMHFW+ LI= !DWVLJQLILFDQFHOHYHO a
FRQFOXVLRQ
%HFDXVH GR QRW UHMHFW + ,QVXIILFLHQW HYLGHQFH H[LVWV WR LQGLFDWH DQ LQFUHDVH LQ PHDQ DONDOLQHSKRVSKDWDVHOHYHOV
$
= =
[ - - = = Q
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
3RZHU
$FFHSWLQJWKHQXOOK\SRWKHVLVZKHQLWLVQRWWUXHLVDVHFRQGW\SHRIHUURUWKDWFDQ RFFXU ZKHQ WHVWLQJ D K\SRWKHVLV 7KLV LV NQRZQ DV D 7\SH ,, HUURU DQG KDV WKH SUREDELOLW\b
)RUDJLYHQWHVW bLVSDUWO\GHWHUPLQHGE\WKHFKRLFHIRU a,GHDOO\ERWK aDQG b ZRXOG EH VPDOO +RZHYHU LQ JHQHUDO WKHUH LV DQ LQYHUVH UHODWLRQVKLS EHWZHHQ a DQG bIRUDIL[HGVDPSOHVL]HQ'HFUHDVLQJ aWKHSUREDELOLW\RID7\SH,HUURU LQFUHDVHVbWKHSUREDELOLW\RID7\SH,,HUURU DQGLIWDNHQWRRIDUWHQGVWRUHQGHU WKHWHVWSRZHUOHVVLQLWVDELOLW\WRGHWHFWUHDOGHYLDWLRQVIURPWKHQXOOK\SRWKHVLV $WHVW VSRZHULVGHILQHGE\± bWKHSUREDELOLW\RIUHMHFWLQJWKHQXOOK\SRWKHVLV ZKHQ LW LV QRW WUXH )RU WKH IL[HG VLJQLILFDQFH OHYHO a WKH VDPSOH VL]H ZLOO GHWHUPLQHbDQGWKHUHIRUHWKHSRZHURIWKHWHVW ,QWKHH[DPSOHGLVFXVVHGLQ6HFWLRQLI\RXDFFHSW+DQGFRQFOXGHWKDWWKHUHLV QR LQFUHDVH LQ PHDQ DONDOLQH SKRVSKDWDVH OHYHOV ZLWK WUHDWPHQW \RX ZRXOG EH JXLOW\ RI D 7\SH ,, HUURU LI D WUXH LQFUHDVH JRHV XQGHWHFWHGE\WKHVWDWLVWLFDOWHVW 8QWLO WKH WHVW V SRZHU FDQ EH LQYHVWLJDWHG \RX PXVW FRQFOXGH WKDW WKHUH LV µLQVXIILFLHQWHYLGHQFHWRLQGLFDWHDFKDQJH¶UDWKHUWKDQµWKHUHLVQRFKDQJH¶ 1RWHWKDWbLVQRWRQO\DIXQFWLRQRIWKHVLJQLILFDQFHOHYHODQGWKHVDPSOHVL]HEXW DOVRRIWKHYDOXHRIWKHDOWHUQDWLYHK\SRWKHVLV7KH7\SH,,HUURUSUREDELOLW\IRUWKLV DONDOLQHSKRVSKDWDVHH[DPSOHLVJLYHQE\ b 3UDFFHSW+ZKHQ+$LVWUXH 3U=ZKHQm! ZKLFKZLOOGLIIHUIRUHDFKDOWHUQDWLYHYDOXHRIm! )RUH[DPSOHWKHSUREDELOLW\RID7\SH,,HUURUZKHQm LV
b = 3U = £ ZKHQ m = æ [ - ö = 3U ç £ ZKHQ m = ÷ è s Q ø ö æ [ - = 3U ç £ è s Q s Q ÷ø
= 3U = £ ± EHFDXVH s DQGQ )URPWKHQRUPDOSUREDELOLW\WDEOHV$SSHQGL[$ \RX REWDLQ b DQG D SRZHU RI ±b 6LPLODU FDOFXODWLRQV ZKHQ m UHVXOWLQb ZKLFKJLYHVDSRZHURI
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHSRZHUIXQFWLRQRIWKHWHVWFDQEHGHVFULEHGE\DSORWRIDOWHUQDWLYHYDOXHVRIm YV WKH SRZHU FRPSXWHG DV GHPRQVWUDWHG LQ WKH SUHFHGLQJ HTXDWLRQV )LJXUH VKRZVWKHSRZHUFXUYHRIWKH=WHVWIRURXUH[DPSOH )LJXUH3RZHU&XUYHIRUWKH=7HVW
1
0.847
0.8 POWER
0.6 0.377
0.4 0.2 0 60
61
62
63
64
65
ME AN
3RZHU FXUYHV DUH LPSRUWDQW IRU GHWHUPLQLQJ WKH EHVW WHVW VWDWLVWLF WR XVH :KHQ PRUHWKDQRQHVWDWLVWLFDOWHVWLVDORJLFDOFDQGLGDWHIRUWHVWLQJWKHVDPHK\SRWKHVLV WKHVWDWLVWLFLDQXVHVWKHWHVW VSRZHUIXQFWLRQWRGHWHUPLQHWKHPRUHSRZHUIXOWHVW IRUDUDQJHRIOLNHO\DOWHUQDWLYHYDOXHV3RZHUFXUYHVDUHDOVRLPSRUWDQWLQVDPSOH VL]HGHWHUPLQDWLRQGXULQJVWXG\GHVLJQ6DPSOHVL]HFDOFXODWLRQVDUHGLVFXVVHGLQ 6HFWLRQ
2QH7DLOHGDQG7ZR7DLOHG7HVWV
7KHIRUPRIWKHDOWHUQDWLYHK\SRWKHVLVGHWHUPLQHVZKHWKHUWKHWHVWLVDRQHRUWZR WDLOHG WHVW 7KH DONDOLQH SKRVSKDWDVH H[DPSOH LV D RQHWDLOHG WHVW EHFDXVH WKH DOWHUQDWLYHK\SRWKHVLV+$m!LVRQO\FRQFHUQHGZLWKDOWHUQDWLYHPHDQYDOXHVLQ RQHGLUHFWLRQWKDWLVJUHDWHUWKDQ7KHWZRWDLOHGDOWHUQDWLYHIRUWKLVH[DPSOH ZRXOGEHVSHFLILHGDV+$m ZKLFKLQGLFDWHVDQLQWHUHVWLQDOWHUQDWLYHYDOXHV RIWKHPHDQHLWKHUJUHDWHURUOHVVWKDQ
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
7KH UHMHFWLRQ UHJLRQ IRU WKH WZRWDLOHG =WHVW ZRXOG LQFOXGH ERWK YHU\ ODUJH DQG YHU\VPDOOYDOXHVRIWKHWHVWVWDWLVWLF)RUDVLJQLILFDQFHOHYHORIa\RXUHMHFW+LQ IDYRU RI WKH WZRWDLOHG DOWHUQDWLYH LI = ! =a RU = ±=a )RU a a LQHDFKµWDLO¶RIWKHQRUPDOGLVWULEXWLRQREWDLQLQJ= DVWKH FULWLFDOYDOXHIRUUHMHFWLRQIURPWKHQRUPDOSUREDELOLW\WDEOHV$SSHQGL[$ 7KH UHMHFWLRQ UHJLRQ IRU WKH WZRWDLOHG =WHVW ZKHQ a LV _=_ ! WKDWLV=!RU=± GXHWRWKHV\PPHWU\RIWKHQRUPDOGLVWULEXWLRQ :KHQWKHGLVWULEXWLRQRIWKHWHVWVWDWLVWLFLVQRWV\PPHWULFWKHUHODWLRQVKLSEHWZHHQ RQHDQGWZRWDLOHGWHVWVLVPRUHFRPSOH[DQGEH\RQGWKHVFRSHRIWKLVERRN
S9DOXHV
)RUPDOO\ WKH FRQFOXVLRQ RI D VWDWLVWLFDO K\SRWKHVLV WHVW LV WR µUHMHFW¶ RU WR µQRW UHMHFW¶WKHQXOOK\SRWKHVLVDWDSUHVHWVLJQLILFDQFHOHYHO$QRWKHUZD\WRFRQYH\D WHVW¶VOHYHORIVLJQLILFDQFHLVZLWKLWVSYDOXHV7KHSYDOXHLVWKHDFWXDOSUREDELOLW\ RI REWDLQLQJ WKH FDOFXODWHG WHVW VWDWLVWLF RU D YDOXH LQ D PRUH H[WUHPH SDUW RIWKH UHMHFWLRQUHJLRQZKHQ+LVWUXH ,Q WKH DONDOLQH SKRVSKDWDVH H[DPSOH GLVFXVVHG SUHYLRXVO\ WKH WHVW VWDWLVWLF RI = LVREWDLQHG7KHSYDOXHLVFRPSXWHGDV
S 3U=DVVXPLQJ+LVWUXH EDVHG RQ WKH QRUPDO SUREDELOLW\ WDEOHV $SSHQGL[ $ &DOFXODWHG SYDOXHV OHVV WKDQWKHSUHVHWVLJQLILFDQFHOHYHOVXFKDVZRXOGEHFRQVLGHUHGVWDWLVWLFDOO\ VLJQLILFDQW ,I WKH SUREDELOLW\ GLVWULEXWLRQ RI WKH WHVW VWDWLVWLF LV V\PPHWULF WKH SYDOXH WKDW FRUUHVSRQGVWRDWZRWDLOHGWHVWFDQEHKDOYHGWRREWDLQWKHSYDOXHFRUUHVSRQGLQJ WR WKH RQHWDLOHGWHVW 7KLV LV WUXH IRU WKH =WHVW 6HFWLRQ ZKLFK KDV WKH VWDQGDUGQRUPDOGLVWULEXWLRQ&KDSWHU DQGIRUWKHWWHVW&KDSWHUV ZKLFK KDVWKH6WXGHQWWGLVWULEXWLRQERWKRIZKLFKDUHV\PPHWULFDOO\GLVWULEXWHGDERXW 7KHUHVXOWVRIDVWDWLVWLFDOWHVWDUHRIWHQGHVFULEHGDVµKLJKO\VLJQLILFDQW¶ZKHQYHU\ VPDOOSYDOXHVVXFKDVSSHWF DUHREWDLQHG
6DPSOH6L]H'HWHUPLQDWLRQ
6DPSOH VL]H UHTXLUHPHQWV DUH GHWHUPLQHG IURP WKH GHVLUHG SRZHU RI WKH WHVW WKH VLJQLILFDQFHOHYHOWKHPHDVXUHPHQWYDULDELOLW\DQGWKHVWDWHGDOWHUQDWLYHWRWKHQXOO K\SRWKHVLV,QGHVLJQLQJFRPSDUDWLYHFOLQLFDOVWXGLHV a LVRIWHQXVHGDQGD SRZHURIDWOHDVWLVVRXJKW9DULDELOLW\LVHVWLPDWHGIURPSUHYLRXVVWXGLHVRU NQRZQGDWDVHWVDQGWKHDOWHUQDWLYHYDOXHRIWKHK\SRWKHVLVLVGHWHUPLQHGE\WKDW ZKLFKLVFOLQLFDOO\LPSRUWDQW
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
0DQ\ H[FHOOHQW VRXUFHV DUH DYDLODEOH WKDW VKRZ PHWKRGV IRU FRPSXWLQJ VDPSOH VL]HV LQFOXGLQJ /DFKLQ DQG WKHVH DUH QRW UHSHDWHG KHUH ,QVWHDG WKUHH H[DPSOHVXVHGTXLWHIUHTXHQWO\LQGHWHUPLQLQJVDPSOHVL]HVIRUFOLQLFDOVWXGLHVDUH SUHVHQWHG KHUH 7KHVH DUH DSSUR[LPDWH IRUPXODV EDVHG RQ WKH FRPPRQO\ XVHG VLJQLILFDQFHOHYHORI 2QH6DPSOH7HVWRID0HDQ )RU WKH RQHVDPSOH =WHVW DERXW D SRSXODWLRQ PHDQ SUHVHQWHG HDUOLHU LQ 6HFWLRQ RUWKHRQHVDPSOHWWHVW&KDSWHU WKHVDPSOHVL]HQHHGHGLVJLYHQE\
Q
&
(
)
ZKHUH s LV WKH VWDQGDUG GHYLDWLRQ D UHSUHVHQWV WKH GLIIHUHQFH EHWZHHQ WKH K\SRWKHVL]HG YDOXH DQG WKH DOWHUQDWLYH YDOXH DQG & LV REWDLQHG IURP 7DEOH EDVHGRQWKHSRZHUDQGW\SHRIWHVW,I LVXQNQRZQWKHVDPSOHVWDQGDUGGHYLDWLRQ V HVWLPDWHGIURPSUHYLRXVVWXGLHVPD\EHVXEVWLWXWHG ,QWKHDONDOLQHSKRVSKDWDVHH[DPSOHGLVFXVVHGSUHYLRXVO\WKHVDPSOHVL]HUHTXLUHG WRGHWHFWDQLQFUHDVHLQPHDQDONDOLQHSKRVSKDWDVHIURPWR8OEDVHGRQD VWDQGDUGGHYLDWLRQRILV Q SDWLHQWV 7KLVVDPSOHVL]HUHVXOWVLQDSRZHURIEDVHGRQDRQHWDLOHGVLJQLILFDQFHOHYHO RI
7ZR6DPSOH&RPSDULVRQRI0HDQV 7KHFRPSDULVRQRIWZRPHDQVXVLQJDWZRVDPSOH=WHVWRUWKHWZRVDPSOHWWHVW &KDSWHU W\SLFDOO\ WHVWV ZKHWKHU WKH GLIIHUHQFH LQ PHDQV EHWZHHQ WZR LQGHSHQGHQWJURXSVLV7KHDOWHUQDWLYHLVWKDWWKHGLIIHUHQFHLVDQRQ]HURYDOXH VXFK DV D ,I SDWLHQWV ZLOO EH HTXDOO\ GLYLGHG EHWZHHQ WZR JURXSV EDVHG RQ D VWDQGDUGGHYLDWLRQLQHDFKJURXSRIsWKHVDPSOHVL]HIRUHDFKJURXSLVIRXQGE\
Q
(
¼&
)
ZKHUH&LVREWDLQHGIURP7DEOH$JDLQWKHVDPSOHVWDQGDUGGHYLDWLRQVPD\ EHVXEVWLWXWHGZKHQsLVXQNQRZQ
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
)RUH[DPSOHEDVHGRQDWZRWDLOHGWHVWDWSRZHUDQGDVWDQGDUGGHYLDWLRQRI DGLIIHUHQFHLQPHDQVRIDWOHDVWFDQEHGHWHFWHGZLWKDVDPSOHVL]HRI
Q SDWLHQWVSHUJURXS
7ZR6DPSOH&RPSDULVRQRI3URSRUWLRQV 7KH FRPSDULVRQ RI WZR LQGHSHQGHQW ELQRPLDO SURSRUWLRQV XVLQJ D QRUPDO DSSUR[LPDWLRQ =WHVW RU FKLVTXDUH WHVW &KDSWHU WHVWV IRU D GLIIHUHQFH LQ SURSRUWLRQV + S ±S 7KHDOWHUQDWLYHLVWKDWWKHGLIIHUHQFHLVDQRQ]HURYDOXHVD\ D7KHVDPSOHVL]H SHUJURXS WKDWLVUHTXLUHGWRGHWHFWDGLIIHUHQFHZKHQWKHWUXHGLIIHUHQFHLV DLV JLYHQE\
Q
× & × S × ± S
ZKHUHS S S DQG&LVREWDLQHGIURP7DEOH$VLQWKHWZRVDPSOH PHDQVFRPSDULVRQWKLVIRUPXODFDQEHXVHGZKHQWKHVWXG\FDOOVIRUWZRJURXSV WKDWKDYHWKHVDPHQXPEHURISDWLHQWVLQHDFK
)RUH[DPSOHWKHVDPSOHVL]HQHHGHGWRGHWHFWDGLIIHUHQFHLQWUXHUHVSRQVHUDWHVRI YVZLWKDWOHDVWSRZHULVIRXQGDV Q SDWLHQWVSHUJURXS 7KLVLVEDVHGRQDWZRWDLOHGWHVWDWDVLJQLILFDQFHOHYHORI
7$%/(&9DOXHVIRUa
&
32:(5
b
2QH7DLOHG
7ZR7DLOHG
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
0XOWLSOH7HVWLQJ
&RQVLGHU WKH DONDOLQH SKRVSKDWDVH H[DPSOH GLVFXVVHG DW WKH EHJLQQLQJ RI WKLV FKDSWHU DQG VXSSRVH WKDW WZR VXFK VWXGLHV DUH FRQGXFWHG RQH LQ HDFK RI WZR LQGHSHQGHQWFHQWHUV7KHVLJQLILFDQFHOHYHOaLVWKHSUREDELOLW\RIDQHUURQHRXVO\ VLJQLILFDQW ILQGLQJ LH WKH SUREDELOLW\ RI D VLJQLILFDQW UHVXOW LQ &HQWHU RU LQ &HQWHUZKHQ+ LVWUXH8VLQJWKHODZRISUREDELOLW\WKDWVWDWHV)RUDQ\HYHQW$
3U$ ±3UQRW$
\RXKDYHa ±3UDQRQVLJQLILFDQWUHVXOWLQ&HQWHUVDQGZKHQ+LVWUXH ,I a DQG a UHSUHVHQW WKH VLJQLILFDQFH OHYHOV RI WKH WHVWV IRU &HQWHUV DQG UHVSHFWLYHO\\RXKDYH
3UDQRQVLJQLILFDQWUHVXOWLQ&HQWHULZKHQ+LVWUXH ±a IRUL L
$SSO\LQJDODZRISUREDELOLW\ZKLFKVWDWHVWKDWIRUDQ\WZRLQGHSHQGHQWHYHQWV$ DQG% 3U$DQG% 3U$ ¼3U% \RXREWDLQa ±±a ±a ,Ia a \RXKDYH
a ±± ,Q JHQHUDO LI WKHUH DUH N FHQWHUV RU N LQGHSHQGHQW WHVWV HDFK FRQGXFWHG DW D VLJQLILFDQFHOHYHORIWKHRYHUDOOVLJQILFDQFHOHYHOaLV a ±± ZKLFKLVVHHQWRLQFUHDVHPDUNHGO\HYHQIRUVPDOOYDOXHVRINDVVKRZQKHUH N
N
a
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
7KLV LOOXVWUDWHV D SUREOHP HQFRXQWHUHG ZLWK VLPXOWDQHRXVO\ FRQGXFWLQJ PDQ\ K\SRWKHVLVWHVWV$OWKRXJKWKHWHVWVXVXDOO\ZLOOQRWEHLQGHSHQGHQWWKHIDFWLVWKDW WKH RYHUDOO VLJQLILFDQFH OHYHO ZLOO GLIIHU IURP WKH VLJQLILFDQFH OHYHO DW ZKLFK WKH LQGLYLGXDOWHVWVDUHSHUIRUPHG 0XOWLSOHK\SRWKHVLVWHVWLQJDULVHVLQDQXPEHURIZD\VLQWKHDQDO\VLVRIFOLQLFDO WULDOV GDWD DQG WKH UHVHDUFKHU PXVW EH DZDUH RI DQ\ HIIHFWV RQ WKH RYHUDOO FRQFOXVLRQVUHVXOWLQJIURPWKHVHVLWXDWLRQV 0XOWLSOH&RPSDULVRQVRI7UHDWPHQW5HVSRQVH 2QHW\SHRIPXOWLSOHWHVWLQJVLWXDWLRQDULVHVLQWKHFRPSDULVRQRIDVLQJOHUHVSRQVH YDULDEOH DPRQJ PRUH WKDQ WZR UDQGRPL]HG WUHDWPHQW JURXSV :LWK WKUHH JURXSV IRU H[DPSOH D ORZDFWLYHGRVH JURXS D KLJKDFWLYHGRVH JURXS DQG D SODFHER JURXS\RXPLJKWZDQWWRFRPSDUHWKHUHVSRQVHRIHDFKDFWLYHJURXSWRWKDWRIWKH SODFHER JURXS DQG FRPSDUH WKH UHVSRQVHV RI WKH ORZGRVH DQG WKH KLJKGRVH JURXSV 0XOWLSOHFRPSDULVRQVFDQEHGHVLJQHGWRDQVZHUVSHFLILFTXHVWLRQVVXFKDV :KLFKLIDQ\WUHDWPHQWVDUHEHWWHUWKDQDFRQWUROJURXS" 'RHVUHVSRQVHLPSURYHZLWKLQFUHDVLQJGRVHVRIDWUHDWPHQW" :KLFKRIVHYHUDOGRVHVLVWKHPRVWHIIHFWLYH" :KDWLVWKHVPDOOHVWHIIHFWLYHGRVH" :KLFKWUHDWPHQWVDUHVWDWLVWLFDOO\LQIHULRUWRWKHµEHVW¶" ,VWKHUHDUHYHUVDORIGRVHUHVSRQVHDWDKLJKHUGRVH" ,Q VRPH FDVHV D VWDWLVWLFDO WHVW GHVLJQHG WR GHWHFW YHU\ VSHFLILF DOWHUQDWLYH K\SRWKHVHVFDQEHDSSOLHGWRKHOSDQVZHUVXFKTXHVWLRQV7KH-RQFKHHUH7HUSVWUD WHVW IRU PRQRWRQLF GRVH UHVSRQVH DQG WKH &RFKUDQ$UPLWDJH WHVW IRU OLQHDU WUHQG DUHWZRH[DPSOHVVHH6$62QOLQH'RFIRU352&)5(4 2IWHQFRQWUDVWVOLQHDU IXQFWLRQV RIJURXSPHDQVFDQEHXVHGWRKHOSDQVZHUVSHFLILFTXHVWLRQVDERXWWKH UHODWLRQVKLSVRIUHVSRQVHVDPRQJJURXSV6HWVRIFRQWUDVWVFDQEHVLPXOWDQHRXVO\ WHVWHG E\ XVLQJ PXOWLSOH FRPSDULVRQ PHWKRGV WR DGMXVW WKH SYDOXHV LQ RUGHU WR PDLQWDLQFRQWURORIWKHRYHUDOOVLJQLILFDQFHOHYHOV 0RVWFRPPRQO\WKHJRDOVRIPXOWLSOHFRPSDULVRQVRIWUHDWPHQWHIIHFWVDUHL WR SHUIRUPDOOSDLUZLVHFRPSDULVRQVDPRQJWKHWUHDWPHQWJURXSVDQGLL WRWHVWHDFK DFWLYHWUHDWPHQWJURXSDJDLQVWDVLQJOHFRQWUROJURXS :LWK . . JURXSV WKHUH DUH . Â .± SRVVLEOH SDLUZLVH FRPSDULVRQV +RZHYHUZKHQFRPSDULQJHDFKWUHDWHGJURXSZLWKDFRQWUROWKHUHDUHRQO\.± FRPSDULVRQVDVXEVWDQWLDOUHGXFWLRQIURPWKHFDVHRIDOOSDLUZLVHFRPSDULVRQV DVVKRZQLQ7DEOH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(1XPEHURI*URXS&RPSDULVRQVIRU. WR 1XPEHURI *URXSV
1XPEHURI3DLUZLVH &RPSDULVRQV
1XPEHURI&RPSDULVRQV ZLWKµ&RQWURO¶
.
..±
.±
,QDVWXG\ZLWKGRVHJURXSVIRUH[DPSOHWKHUHZRXOGEHSRVVLEOHSDLUZLVH FRPSDULVRQV ,I HDFK RI WKHVH LV FRQGXFWHG DW D VLJQLILFDQFH OHYHO RI WKH RYHUDOOVLJQLILFDQFHOHYHOLVDIIHFWHGVRWKDWWKHOLNHOLKRRGRIREWDLQLQJDWOHDVWRQH HUURQHRXVILQGLQJPLJKWLQFUHDVHVXEVWDQWLDOO\)RUWXQDWHO\WKLVSUREOHPLVHDVLO\ RYHUFRPHE\XVLQJRQHRUPRUHRIWKHPDQ\DSSURDFKHVWRPXOWLSOHFRPSDULVRQV WKDWZLOOFRQWURORYHUDOOVLJQLILFDQFHOHYHOV $YDVWDUUD\RIPXOWLSOHWHVWLQJPHWKRGVLVDYDLODEOHLQ6$60XOWLSOHFRPSDULVRQ SURFHGXUHV FDQ EH FDUULHG RXW E\ XVLQJ DSSURSULDWH RSWLRQV LQ WKH 0($16 VWDWHPHQW LQ 352& $129$ RU LQ 352& */0 RU E\ XVLQJ WKH /60($16 VWDWHPHQW ZLWK WKH $'-867 RSWLRQ LQ 352& */0 RU 352& 0,;(' &HUWDLQ PXOWLSOH WHVWLQJ PHWKRGV FDQ DOVR EH FRQGXFWHG E\ XVLQJ 352& 08/77(67 RU WKH352%0&IXQFWLRQLQD'$7$VWHS6HH$SSHQGL[(IRUIXUWKHUGHWDLOV 0XOWLSOH5HVSRQVH9DULDEOHV $QRWKHU LQVWDQFH LQ ZKLFK PXOWLSOH WHVWLQJ DULVHV LV ZKHQ FRQGXFWLQJ LQGLYLGXDO WHVWVRQPDQ\UHVSRQVHYDULDEOHV)RUH[DPSOHWHVWLQJIRUVLJQLILFDQWSUHWRSRVW VWXG\FKDQJHVLQODERUDWRU\YDOXHVE\FRQGXFWLQJLQGLYLGXDO=RUWWHVWVRQDODUJH QXPEHU RI ODERUDWRU\ SDUDPHWHUV PLJKW UHVXOW LQ FKDQFH VLJQLILFDQFH :LWK LQGHSHQGHQWWWHVWVHDFKDWDVLJQLILFDQFHOHYHORI\RXZRXOGH[SHFWRQHWHVW UHVXOWWREHVLJQLILFDQWGXHWRFKDQFHYDULDWLRQZKHQWKHUHLVQRUHDOGHYLDWLRQIURP WKHQXOOK\SRWKHVLV,IRURUWHVWVDUHFRQGXFWHGRQWKHODERUDWRU\GDWD DOWKRXJK QRW LQGHSHQGHQW RQH PLJKW H[SHFW D QXPEHU RI WKHVH WR EH IDOVHO\ VLJQLILFDQW 0XOWLSOH WHVWLQJ VLWXDWLRQV PLJKW DOVR DULVH ZKHQ D VWXG\ KDV PRUH WKDQ RQH SULPDU\ HQGSRLQW DVVRFLDWHG ZLWK HVWDEOLVKLQJ WUHDWPHQW HIILFDF\ 7KH RYHUDOO VLJQLILFDQFHOHYHOIRUHIILFDF\GHSHQGVRQZKHWKHUDWOHDVWRQHVRPHRUDOORIWKH HQGSRLQWV PXVW LQGLYLGXDOO\ DWWDLQ D FHUWDLQ OHYHO RI VLJQLILFDQFH &RQVLGHUDWLRQV PXVWEHJLYHQDVWRZKLFKFRPELQDWLRQVRISULPDU\UHVSRQVHYDULDEOHVPXVWVKRZ VLJQLILFDQFHEHIRUHWUHDWPHQWHIILFDF\FDQEHGHFODUHG&HUWDLQFRPELQDWLRQVPLJKW UHTXLUHXVHRIDPXOWLYDULDWHVWDWLVWLFDOPHWKRGRUDQDGMXVWPHQWRIWKHLQGLYLGXDO VLJQLILFDQFHOHYHOVXVHGWRWHVWHDFKYDULDEOH6RPHRIWKHPDQ\SYDOXHDGMXVWPHQW
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
PHWKRGVDYDLODEOHWRDGGUHVVWKLVPXOWLSOLFLW\SUREOHPDUHGLVFXVVHGLQ$SSHQGL[( LQWKHFRQWH[WRIPXOWLSOHFRPSDULVRQV :KLOH YHU\ FRQVHUYDWLYH D VLPSOH %RQIHUURQL DGMXVWPHQW LV RIWHQ XVHG LQ WKHVH VLWXDWLRQV)RUH[DPSOHZKHQHIILFDF\PD\EHFODLPHGLIMXVWRQHRINFRSULPDU\ UHVSRQVHYDULDEOHVLVVLJQLILFDQWWKH%RQIHUURQLPHWKRGFDOOVIRUWHVWLQJHDFKRIWKH N UHVSRQVH YDULDEOHV DW D VLJQLILFDQFH OHYHO RI N WR PDLQWDLQ DQ RYHUDOO VLJQLILFDQFH OHYHO RI $ OHVV FRQVHUYDWLYH PHWKRG FDQ EH XVHG E\ WDNLQJ LQWR DFFRXQWWKHFRUUHODWLRQVDPRQJWKHUHVSRQVHYDULDEOHV3RFRFNHWDO KDYH VKRZQ WKDW ZLWK N QRUPDOO\ GLVWULEXWHG UHVSRQVH YDULDEOHV DQG D FRUUHODWLRQ RI EHWZHHQDQ\SDLUWKHWHVWVFDQEHFRQGXFWHGDWDVLJQLILFDQFHOHYHOWKDWLVVOLJKWO\ KLJKHUWKDQWKH%RQIHUURQLYDOXH7KLVLVLOOXVWUDWHGLQ7DEOHIRUDWZRWDLOHG WHVWZLWKDQRYHUDOOVLJQLILFDQFHOHYHORI
7$%/($GMXVWHG6LJQLILFDQFH/HYHOV1HHGHGWR0DLQWDLQDQ2YHUDOO /HYHO:KHQ7HVWLQJN&R3ULPDU\(QGSRLQWVZLWK&RUUHODWLRQ
N
%RQIHUURQL
,QWHULP$QDO\VHV ,QWHULPDQDO\VHVRIRQJRLQJVWXGLHVUHSUHVHQWVDQRWKHUVLWXDWLRQLQYROYLQJPXOWLSOH WHVWLQJ 8VH RI LQWHULP DQDO\VHV KDV EHFRPH KLJKO\ DFFHSWHG LQ ODUJH RU OHQJWK\ FOLQLFDO UHVHDUFK VWXGLHV ,Q VRPH VLWXDWLRQV LW LV ORRNHG XSRQ DV XQHWKLFDO WR FRQWLQXH D VWXG\ ZKHQ WKHUH LV RYHUZKHOPLQJ HYLGHQFH RI WKH HIILFDF\ RI D QHZ WKHUDS\ %\ FRQWLQXLQJ VXFK DVWXG\SDWLHQWVPLJKWUHFHLYHDSODFHERRUDQRWKHU OHVV HIIHFWLYH WUHDWPHQW WKDW GHSULYHV WKHP RI WKH PRUH HIIHFWLYH WUHDWPHQW $VVXPLQJ WKHUH DUH QR VDIHW\ LVVXHV LW LV JHQHUDOO\ SUHIHUDEOH WR PDNH WKH QHZ WKHUDS\DYDLODEOHWRSDWLHQWVDVVRRQDVSRVVLEOH :KHQDGHFLVLRQLVPDGHWRVWRSRUWRFRQWLQXHWKHVWXG\RUWRFKDQJHWKHVWXG\LQ VRPHIXQGDPHQWDOZD\EDVHGRQDQLQWHULPORRNDWWKHGDWDWKHILQDOVLJQLILFDQFH OHYHOVZLOOEHDOWHUHG*URXSVHTXHQWLDOPHWKRGVDUHVSHFLDOVWDWLVWLFDODSSURDFKHV WKDWFDQEHDSSOLHGWRKDQGOHVXFKSUREOHPVDQGLQPRVWFDVHVRIIHUDGMXVWPHQWV LQ RUGHU WR PDLQWDLQ DQ RYHUDOO VLJQLILFDQFH OHYHO DW D SUHGHWHUPLQHG YDOXH 7KH JURXS VHTXHQWLDO PHWKRGV PRVW FRPPRQO\ XVHG LQ FOLQLFDO WULDOV LQFOXGH WKRVH GHVFULEHG E\ 3RFRFN 2¶%ULHQ DQG )OHPLQJ DQG /DQ DQG 'H0HWV 7KHVHDUHGLVFXVVHGLQWKHVHFWLRQVWKDWIROORZ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
%HFDXVH WKH LVVXH RI LQWHULP DQDO\VHV FDQ DIIHFW WKH RYHUDOO VLJQLILFDQFH OHYHO FDUHIXO SODQQLQJ DW WKH GHVLJQ VWDJH LV YHU\ LPSRUWDQW LQ VWXGLHV ZLWK DQWLFLSDWHG LQWHULPDQDO\VHVLQRUGHUWRSURWHFWWKHRYHUDOO 7KHVWXG\SURWRFRODQGVWDWLVWLFDO DQDO\VLVSODQVVKRXOGVSHFLILFDOO\OD\RXWWKHGHWDLOVLQFOXGLQJ WKHQXPEHURILQWHULPDQDO\VHVWKDWZLOOEHGRQH ZKHQWKH\ZLOOEHFRQGXFWHG WKHLUSXUSRVH ZKLFKYDULDEOHVZLOOEHDQDO\]HG KRZWKHDQDO\VHVZLOOEHKDQGOHG DQ\DGMXVWPHQWVWREHPDGHWRWKHVLJQLILFDQFHOHYHOV ZKRZLOOUHPDLQEOLQGHG :KHQLQWHULPDQDO\VHVRFFXUZLWKRXWSUHSODQQLQJFDUHIXOGRFXPHQWDWLRQPXVWEH NHSWWRDYRLGFRPSURPLVLQJWKHVWXG\LQWHJULW\ 7R PDLQWDLQ DQ RYHUDOO VLJQLILFDQFH OHYHO VXFK DV WKH LQWHULP DQDO\VHV PXVWEHFRQGXFWHGDWVLJQLILFDQFHOHYHOVVRPHZKDWOHVVWKDQ2QHPHWKRGLV WRFRQGXFWHDFKLQWHULPDQDO\VLVDWDYHU\VPDOO OHYHOVXFKDVRUOHVVVR WKDW WKH ILQDO DQDO\VLV FDQ EH FRQGXFWHG QHDU WKH OHYHO 7KLV LV D YHU\ FRQVHUYDWLYHDSSURDFKEHFDXVHLWLVH[WUHPHO\GLIILFXOWWRILQGDGLIIHUHQFHEHWZHHQ WUHDWPHQW JURXSV DW LQWHULP WHVWLQJ DW VXFK D UHGXFHG OHYHO +RZHYHU LW PLJKW DFFRPSOLVKWKHSXUSRVHRILGHQWLI\LQJRYHUZKHOPLQJWUHDWPHQWGLIIHUHQFHVHDUO\LQ WKHDQDO\VLVZKLOHSHUPLWWLQJDOPRVWDIXOOaOHYHOWHVWDWWKHILQDODQDO\VLV
3RFRFN¶V$SSURDFK
3RFRFN SURSRVHGDJURXSVHTXHQWLDOPHWKRGZKHUHE\WKHDQDO\VLVDWHDFK VWDJH RI WHVWLQJ LQFOXGLQJ WKH ILQDO DQDO\VLV LV FRQGXFWHG DW WKH VDPH UHGXFHG VLJQLILFDQFHOHYHO LQRUGHUWRPDLQWDLQDQRYHUDOO RI9DOXHVRI DUH VKRZQLQ7DEOHIRUWRSODQQHGLQWHULPDQDO\VHV 3
3
7$%/(3RFRFN¶Va3IRUa
1XPEHURI 1XPEHU $QDO\VHV RI,QWHULPV
3
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
2¶%ULHQ)OHPLQJ$SSURDFK
$GUDZEDFNRI3RFRFN¶VPHWKRGLVWKDWWKHILQDODQDO\VLVLVFRQGXFWHGDWDOHYHO PXFKVPDOOHUWKDQWKHOHYHO7KH2¶%ULHQ)OHPLQJDSSURDFK SUREDEO\ WKH PRVW ZLGHO\ XVHG PHWKRG LQ KDQGOLQJ LQWHULP DQDO\VHV RI FOLQLFDO WULDOV RYHUFRPHVWKLVREMHFWLRQEXWDWWKHH[SHQVHRIUHTXLULQJHYHQJUHDWHUFRQVHUYDWLVP DW WKH HDUO\ LQWHULPV 7KLV PHWKRG XVHV SURJUHVVLYHO\ LQFUHDVLQJ a OHYHOV DW HDFK LQWHULPDQDO\VLVVXFKDV a2) VRWKDWWKHILQDODQDO\VLVLVFRQGXFWHGFORVHWRWKH OHYHOZKLOHPDLQWDLQLQJDQRYHUDOOaRI$VVKRZQLQ7DEOHWKHILQDO DQDO\VLVLVFRQGXFWHGDWDQa2)OHYHOEHWZHHQDQGZKHQWKHUHDUHRUOHVV SODQQHGLQWHULPDQDO\VHV 7$%/(2¶%ULHQ)OHPLQJ¶Va2)IRUa ,QWHULP$QDO\VLV
1XPEHURI $QDO\VHV
1XPEHURI ,QWHULPV
)LQDO
)RUH[DPSOHLQDWZRVWDJHGHVLJQWKDWXVHVWKH2¶%ULHQ)OHPLQJDSSURDFKZLWK RQH VFKHGXOHG LQWHULP DQDO\VLV K\SRWKHVLV WHVWLQJ ZRXOG EH FRQGXFWHG DW DQ LQWHULPVLJQLILFDQFHOHYHORIa2) ,IVLJQLILFDQFHLVIRXQGLQIDYRURIWKHWHVW WUHDWPHQWWKHVWXG\PD\EHVWRSSHGZLWKVXIILFLHQWVWDWLVWLFDOHYLGHQFHRIHIILFDF\ ,IWKHOHYHORIVLJQLILFDQFHLVQRWUHDFKHGDWWKHLQWHULPWKHVWXG\FRQWLQXHV WR QRUPDO FRPSOHWLRQ DW ZKLFK WLPH WKH K\SRWKHVLV WHVWLQJ LV FRQGXFWHG DW D VLJQLILFDQFH OHYHO RI a2) 7KLV VWDJHZLVH SURFHGXUH ZLOO KDYH DQ RYHUDOO VLJQLILFDQFHRIa
:KHQDVWXG\LVGHVLJQHGWRLQFOXGHRQHRUPRUHLQWHULPDQDO\VHVVDPSOHVL]HVFDQ EHSUHGHWHUPLQHGWRDFKLHYHWKHSRZHU\RXZDQWLQDZD\WKDWLVVLPLODUWRWKDW GLVFXVVHG SUHYLRXVO\ LQ WKLV FKDSWHU %HFDXVH WKH ILQDO DQDO\VLV RI D JURXS VHTXHQWLDOGHVLJQLVFRQGXFWHGDWDQDOSKDOHYHOVPDOOHUWKDQWKHQRPLQDO VDPSOHVL]HUHTXLUHPHQWVIRUVWXGLHVLQYROYLQJLQWHULPDQDO\VHVDUHJHQHUDOO\ODUJHU WKDQ IRU VLPLODU IL[HG VDPSOH VL]H VWXGLHV LH WKRVH ZLWK QR SODQQHG LQWHULP DQDO\VHV 2QHRIWKHIHDWXUHVRIWKH2¶%ULHQ)OHPLQJDSSURDFKLVWKDWVDPSOHVL]H UHTXLUHPHQWV DUH YHU\ FORVH WR WKRVH RI WKH IL[HG VDPSOH VL]H VWXG\ XVXDOO\ QR PRUHWKDQWRKLJKHULQRUGHUWRDFKLHYHWKHVDPHSRZHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
/DQ'H0HWV 6SHQGLQJ)XQFWLRQ
7KH 2¶%ULHQ)OHPLQJ DSSURDFK ZDV GHYHORSHG ZLWK WKH DVVXPSWLRQ WKDW D SUH VSHFLILHG QXPEHU RI LQWHULP DQDO\VHV ZLOO EH SHUIRUPHG DW DSSUR[LPDWHO\ HTXDOO\ VSDFHG LQWHUYDOV GXULQJ WKH VWXG\ EDVHG RQ SDWLHQW DFFUXDO 6LPXODWLRQ VWXGLHV KDYH VKRZQ WKDW WKH SURFHGXUH LV QRW JUHDWO\ DIIHFWHG XQGHU QRQH[WUHPH GHYLDWLRQV IURP WKLV DVVXPSWLRQ,QPDQ\FDVHVWKHQXPEHURUWLPLQJRILQWHULP DQDO\VHVFDQQRWEHSUHVSHFLILHG/HQJWK\WULDOVIRUH[DPSOHPLJKWVLPSO\UHTXHVW DQLQWHULPHYHU\PRQWKVDQGSDWLHQWDFFUXDOPLJKWQRWEHXQLIRUPWKURXJKHDFK RIWKRVHPRQWKSHULRGV /DQDQG'H0HWV LQWURGXFHGDPHWKRGIRUKDQGOLQJWKHPXOWLSOLFLW\SUREOHP ZKHQ WKH QXPEHU RI LQWHULP DQDO\VHV LV QRW NQRZQ DW WKH SODQQLQJ VWDJH 7KLV PHWKRGLVEDVHGRQDQµ VSHQGLQJIXQFWLRQ¶WKDWDOORFDWHVDSRUWLRQRIWKHRYHUDOO VLJQLILFDQFHOHYHO IRUWHVWLQJDWHDFKLQWHULPDQDO\VLVEDVHGRQWKHDPRXQWRI LQIRUPDWLRQDYDLODEOHDWWKDWDQDO\VLVµLQIRUPDWLRQIUDFWLRQ¶
7KHLQIRUPDWLRQIUDFWLRQLVXVXDOO\EDVHGRQWKHUDWLRRIWKHQXPEHURISDWLHQWV DYDLODEOHIRULQWHULPDQDO\VLVWRWKHWRWDODQWLFLSDWHGVDPSOHVL]HLIWKHVWXG\ZHUH WRJRWRFRPSOHWLRQ,QVRPHFDVHVWKHLQIRUPDWLRQIUDFWLRQFDQEHWKHIUDFWLRQRI HODSVHGWLPHRULQWKHFDVHRIVXUYLYDODQDO\VLVWKHQXPEHURIGHDWKVREVHUYHG UHODWLYHWRWKHQXPEHURIGHDWKVH[SHFWHG $QXPEHURIVSHQGLQJIXQFWLRQVKDYHEHHQSURSRVHGIRUXVHZLWKWKH/DQ'H0HWV PHWKRG LQFOXGLQJ WKH VSHQGLQJ IXQFWLRQ WKDW LV EDVHG RQ WKH 2¶%ULHQ)OHPLQJ DSSURDFK7KHUHMHFWLRQERXQGDULHVDQGLQWHULPWHVWLQJOHYHOVFDQEHREWDLQHGE\ XVLQJDFRPSXWHUSURJUDPFDSDEOHRIHYDOXDWLQJPXOWLYDULDWHQRUPDOSUREDELOLWLHV :KLOH QRW DYDLODEOH LQ 6$6 PDQ\ JRRG VRIWZDUH SURJUDPV DUH DYDLODEOH IRU FRPSXWLQJ WKHVH ERXQGDULHV LQFOXGLQJ WKH LQWHUDFWLYH /$1'(0 SURJUDP DYDLODEOHRQWKH,QWHUQHW (D6WDQG3$66 7DEOH VKRZV WKH SRUWLRQ RI DYDLODEOH IRU DOO LQWHULP DQDO\VHV XSWR DQG LQFOXGLQJ D VSHFLILHG LQIRUPDWLRQ IUDFWLRQ 7KLV FXPXODWLYH XVHV WKH 2¶%ULHQ )OHPLQJ VSHQGLQJ IXQFWLRQ DQG DVVXPHV D WZRWDLOHG V\PPHWULF WHVW ZLWK RYHUDOO ,Q7DEOH\RXVHHWKDWRQO\RIWKHRYHUDOO LVDOORFDWHG IRU DOO LQWHULP DQDO\VHV FRQGXFWHG E\ WKH PLGSRLQW RI WKH VWXG\ LQIRUPDWLRQ IUDFWLRQ
&KDSWHU7RSLFVLQ+\SRWKHVLV7HVWLQJ
7$%/(&XPXODWLYH/DQ'H0HWVa6SHQGLQJ)XQFWLRQIRUWKH 2¶%ULHQ)OHPLQJ6SHQGLQJ)XQFWLRQ7ZR7DLOHG7HVWDW 2YHUDOOa
,QIRUPDWLRQ)UDFWLRQ
&XPXODWLYH
,QWHULP DQDO\VLV WHVWLQJ OHYHOV /' IRU WKH 2¶%ULHQ)OHPLQJ VSHQGLQJ IXQFWLRQ XQGHU WKH /DQ'H0HWV PHWKRG DUH VKRZQ LQ 7DEOH 7KHVH DVVXPH HTXDOO\ VSDFHGLQWHULPDQDO\VHVGXULQJWKHVWXG\DQGDUHEDVHGRQDWZRWDLOHGWHVWZLWKDQ RYHUDOOVLJQLILFDQFHOHYHORI )RUH[DPSOHDVWXG\LQYROYLQJWKUHHHTXDOO\VSDFHGLQWHULPDQDO\VHVZRXOGHQWDLO WHVWLQJDWWKHOHYHODWWKHILQDODQDO\VLVLQRUGHUWRPDLQWDLQDQRYHUDOOWZR WDLOHG RIDVVXPLQJWKHVWXG\JRHVWRSODQQHGFRPSOHWLRQ $VVHHQLQ7DEOH VWRSSLQJWKHVWXG\DWDQ\LQWHULPVWDJHZRXOGUHTXLUHRYHUZKHOPLQJHYLGHQFH RIHIILFDF\QDPHO\VLJQLILFDQFHDWDWWKHILUVWVWDJHDWWKHVHFRQG VWDJHRUDWWKHWKLUGVWDJHRIWHVWLQJ
7$%/(/DQ'H0HWVa/'IRU2YHUDOOa 8VLQJWKH2¶%ULHQ )OHPLQJ6SHQGLQJ)XQFWLRQ(TXDOO\6SDFHG,QWHULPV
,QWHULP$QDO\VLV
1XPEHURI ,QWHULPV
1XPEHURI $QDO\VHV
)LQDO
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6XPPDU\
7KLV FKDSWHU SUHVHQWV D GLVFXVVLRQ RI VRPH RI WKH PRVW IXQGDPHQWDO HOHPHQWV RI K\SRWKHVLVWHVWLQJLQFOXGLQJVLJQLILFDQFHOHYHOVSRZHUSYDOXHVLOOXVWUDWLRQRIWKH GLIIHUHQFH EHWZHHQ RQH DQG WZRWDLOHG WHVWV WKH LGHD EHKLQG VDPSOH VL]H FRPSXWDWLRQV DQG WKH HIIHFW RI PXOWLSOH WHVWLQJ RQ VLJQLILFDQFH OHYHOV LQFOXGLQJ LQWHULPDQDO\VHV0DQ\RIWKHVWDWLVWLFDOWHUPVXVHGLQWKLVFKDSWHUDUHIUHTXHQWO\ XVHGE\FOLQLFDOUHVHDUFKHUVERWKVWDWLVWLFLDQVDQGQRQVWDWLVWLFLDQVIURPSURWRFRO GHYHORSPHQW WKURXJK UHJXODWRU\ VXEPLVVLRQ 7KH FRQFHSWV SUHVHQWHGKHUHIRUPD EDVLVIRUDJHQHUDORYHUYLHZRIK\SRWKHVLVWHVWLQJWKHSULPDU\LQIHUHQWLDOWRROXVHG LQSUHVHQWLQJWKHVWDWLVWLFDOPHWKRGVLQ&KDSWHUVWKURXJK
&++$$3377((55 7KH'DWD6HW75,$/ ,QWURGXFWLRQ 'DWD&ROOHFWLRQ &UHDWLQJWKH'DWD6HW75,$/ 6WDWLVWLFDO6XPPDUL]DWLRQ 6XPPDU\
,QWURGXFWLRQ
7KLVFKDSWHUSUHVHQWVGDWDIURPDYHU\VLPSOHK\SRWKHWLFDOFOLQLFDOWULDOWKDWZLOOEH XVHG WR LOOXVWUDWH PDQ\ RI WKH PHWKRGV GLVFXVVHG WKURXJKRXW WKLV ERRN )LUVW WKURXJK WKH XVH RI YHU\ UXGLPHQWDU\ FDVH UHSRUW IRUPV &5)¶V \RX¶OO VHH KRZ UHVSRQVH GDWD FDQ EH FROOHFWHG LQ YDULRXV IRUPDWV GLFKRWRPRXV FDWHJRULFDO RU QXPHULF7KHQRQHRIVHYHUDOPHWKRGVDYDLODEOHLQ6$6IRUFUHDWLQJDGDWDVHWIURP WKHVH &5)¶V LV LOOXVWUDWHG 7KH GDWD VHW 75,$/ LV XVHG WR GHPRQVWUDWH VRPH HOHPHQWDU\GDWDVXPPDUL]DWLRQPHWKRGVXVLQJ6$6 )LQDOO\WKHGDWDVHW75,$/LVXVHGLQWKHH[HUFLVHVSUHVHQWHGLQ&KDSWHU7KH H[HUFLVHV DUH GHVLJQHG WR JLYH WKH UHDGHU SUDFWLFH LQ DSSO\LQJ WKH VWDWLVWLFDO WHFKQLTXHVGLVFXVVHGLQWKLVERRN
'DWD&ROOHFWLRQ
&RQVLGHU D YHU\ VLPSOH WULDO XVHG WR VWXG\ D FOLQLFDO UHVSRQVH IURP HDFK RI SDWLHQWV 7KH GHVLJQ LV D PXOWLFHQWHU UDQGRPL]HG GRXEOHEOLQG SDUDOOHO VWXG\ FRQGXFWHGDWWKUHHVWXG\FHQWHUV3DWLHQWVDUHUDQGRPL]HGLQHTXDOQXPEHUVWRRQH RIWZRWUHDWPHQWJURXSV7KHREMHFWLYHLVWRFRPSDUHWKHHIILFDF\RIH[SHULPHQWDO GUXJ$ ZLWKWKDWRIUHIHUHQFHGUXJ% LQUHGXFLQJRUHOLPLQDWLQJWKHV\PSWRPVRI DQXQVSHFLILHGGLVHDVH7KHSULPDU\UHVSRQVHYDULDEOHWKDWPHDVXUHVWKHVHYHULW\RI WKHV\PSWRPVLVREWDLQHGE\XVLQJDSDWLHQWUDWHGJOREDODVVHVVPHQWDWWKHHQGRI WKHWULDO7KUHHW\SHVRIUHVSRQVHVDUHFRQVLGHUHG SUHVHQFHRUDEVHQFHRIV\PSWRPVDGLFKRWRPRXVUHVSRQVH GLVFUHWHVHYHULW\UDWLQJDQRUGHUHGFDWHJRULFDOUHVSRQVH SHUFHQWLOH UDWLQJ XVLQJ D YLVXDO DQDORJ VFDOH 9$6 D FRQWLQXRXV QXPHULF UHVSRQVH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6DPSOHGDWDFROOHFWLRQIRUPV&5)¶V ZKLFKLOOXVWUDWHKRZWKHVHUHVSRQVHVPLJKW EH REWDLQHG DUH VKRZQ LQ )LJXUHV WR 7KH IRUPV LQFOXGH EDVLF SDWLHQW LQIRUPDWLRQSDWLHQWLGHQWLILFDWLRQDJHDQGJHQGHU VLWHQXPEHUDQGYLVLWGDWHLQ DGGLWLRQWRWKHUHVSRQVHPHDVXUH ),*85(6DPSOH'DWD&ROOHFWLRQ)RUP'LFKRWRPRXV5HVSRQVH PROTOCOL NO. _______________
SITE NO. _________________
3$7,(17,1)250$7,21 PATIENT NO.
PATIENT INITIALS
VISIT DATE
____________
mm
AGE (years)
dd
GENDER:
____________
Male
Female
*/2%$/$66(660(17 [To be completed at study termination]
Are symptoms present now?
YES
NO
yy
&KDSWHU7KH'DWD6HW75,$/
),*85(6DPSOH'DWD&ROOHFWLRQ)RUP&DWHJRULFDO5HVSRQVH PROTOCOL NO. _______________
SITE NO. _______________
3$7,(17,1)250$7,21 PATIENT NO.
PATIENT INITIALS
VISIT DATE
____________
mm
AGE (years)
dd
yy
GENDER:
____________
Male
Female
*/2%$/$66(660(17 [To be completed at study termination]
Check box corresponding to how your symptoms are now
No Symptoms
Mild
Moderate
Severe
),*85(6DPSOH'DWD&ROOHFWLRQ)RUP&RQWLQXRXV1XPHULF 5HVSRQVH PROTOCOL NO. ______________
SITE NO. _________________
3$7,(17,1)250$7,21 PATIENT NO.
PATIENT INITIALS
VISIT DATE
____________
mm
AGE (years)
dd
yy
GENDER:
____________
Male
Female
*/2%$/$66(660(17 [To be completed at study termination]
Place an
X on the line corresponding to how you feel now
0 (No symptoms)
100 (Most severe)
42
Common Statistical Methods for Clinical Research with SAS Examples
&UHDWLQJWKH'DWD6HW75,$/ Create and store the data set TRIAL in the directory bookfiles\examples\sas on Drive C. The SAS code for creating this data set is shown below. All three response types are input as follows: • • •
The dichotomous response variable is named RESP. It has coded values of 0 (if symptoms are absent) and 1 (if symptoms are present); The categorical response variable is named SEV. It is coded as 0, 1, 2, or 3 for ‘none’, ‘mild’, ‘moderate’, and ‘severe’; The continuous numeric response variable is named SCORE. Its values can range between 0 and 100. These values represent the percentage of the greatest possible severity perceived by the patient.
Because these data are used for illustrative purposes only, the values of the SCORE variable å are entered and, for ease in constructing the data set, the other response variables (RESP and SEV) are computed rather than entered manually. RESP is set to 0 or 1, based on whether SCORE is 0 or >0, respectively . The severity category (SEV) field is created as follows: SCORE values of 1 to 30 are defined as ‘Mild’; 31 to 69, ‘Moderate’; and 70 to 100, ‘Severe’ ê.
6$6&RGHIRU&UHDWLQJWKH'DWD6HW75,$/ LIBNAME EXAMP ’c:\bookfiles\examples\sas’; DATA EXAMP.TRIAL; INPUT TRT $ CENTER PAT SEX $ AGE SCORE @@; å RESP = (SCORE GT 0); /* RESP=0 (symptoms are absent), =1 (symptoms are present) IF (SCORE = 0) THEN SEV = 0; /* “No Symptoms” IF ( 1 LE SCORE LE 30) THEN SEV = 1; /* “Mild Symptoms” IF (31 LE SCORE LE 69) THEN SEV = 2; /* “Moderate Symptoms” IF (SCORE GE 70) THEN SEV = 3; /* “Severe Symptoms” DATALINES; A 1 101 M 55 5 A 1 104 F 27 0 A 1 106 M 31 35 A 1 107 F 44 21 A 1 109 M 47 15 A 1 111 F 69 70 A 1 112 F 31 10 A 1 114 F 50 0 A 1 116 M 32 20 A 1 118 F 39 25 A 1 119 F 54 0 A 1 121 M 70 38 A 1 123 F 57 55 A 1 124 M 37 18 A 1 126 F 41 0 A 1 128 F 48 8 A 1 131 F 35 0 A 1 134 F 28 0 A 1 135 M 27 40 A 1 138 F 42 12 A 2 202 M 58 68 A 2 203 M 42 22 A 2 206 M 26 30 A 2 207 F 36 0 A 2 210 F 35 25 A 2 211 M 51 0 A 2 214 M 51 60 A 2 216 F 42 15 A 2 217 F 50 50 A 2 219 F 41 35 A 2 222 F 59 0 A 2 223 F 38 10 A 2 225 F 32 0 A 2 226 F 28 16 A 2 229 M 42 48 A 2 231 F 51 45 A 2 234 F 26 90 A 2 235 M 42 0 A 3 301 M 38 28 A 3 302 M 41 20 A 3 304 M 65 75 A 3 306 F 64 0 A 3 307 F 30 30 A 3 309 F 64 5 A 3 311 M 39 80
*/ */ */ */ */
ê
&KDSWHU7KH'DWD6HW75,$/
A A A B B B B B B B B B B B B B B B B ;
3 3 3 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3
314 319 325 105 113 120 127 132 137 201 208 213 220 227 232 305 312 317 323
F F F M M F F M F F F M F F M M M M F
57 34 56 45 61 19 53 49 57 63 48 24 45 41 37 46 52 52 51
85 26 35 20 75 55 40 20 95 10 12 88 72 0 32 15 40 60 35
A A B B B B B B B B B B B B B B B B
3 3 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3
315 321 102 108 115 122 129 133 139 204 209 215 221 228 233 308 313 320
M M M F M F M F F M F M F F F M F F
61 39 19 44 45 38 48 28 47 36 42 40 27 24 33 59 33 32
12 10 68 65 83 0 0 0 40 49 40 59 55 65 0 82 40 2
A A B B B B B B B B B B B B B B B B
3 3 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3
318 324 103 110 117 125 130 136 140 205 212 218 224 230 303 310 316 322
F M F M F M F F M M F M M M M F M F
45 27 51 32 21 37 36 34 29 36 32 31 56 44 40 62 62 43
95 0 10 25 0 72 80 45 0 16 0 24 70 30 26 38 87 0
)LUVWVRUWWKHGDWDVHWE\SDWLHQWQXPEHUWKHQREWDLQDSULQWRXWVHH287387 E\XVLQJWKHIROORZLQJ6$6VWDWHPHQWV
PROC SORT DATA = EXAMP.TRIAL; BY PAT; PROC PRINT DATA = EXAMP.TRIAL; VAR PAT TRT CENTER SEX AGE RESP SEV SCORE; TITLE ‘Printout of Data Set TRIAL, Sorted by PAT’; RUN;
2873875HVXOWVIURP352&35,177KH'DWD6HW75,$/ Printout of Data Set TRIAL, Sorted by PAT
Obs
PAT
TRT
CENTER
SEX
AGE
RESP
SEV
SCORE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122
A B B A B A A B A B A A B A B A B A A B A B
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
M M F F M M F F M M F F M F M M F F F F M F
55 19 51 27 45 31 44 44 47 32 69 31 61 50 45 32 21 39 54 19 70 38
1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 0 1 0 1 1 0
1 2 1 0 1 2 1 2 1 1 3 1 3 0 3 1 0 1 0 2 2 0
5 68 10 0 20 35 21 65 15 25 70 10 75 0 83 20 0 25 0 55 38 0
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873875HVXOWVIURP352&35,177KH'DWD6HW75,$/FRQWLQXHG
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 301 302 303 304 305
A A B A B A B B A B B A A B B A B B B A A B B A A B B A A B B A B A A B A B B A A B A A B B A B A B B A A A A B A B
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3
F M M F F F M F F M F F M F F F F M F M M M M M F F F F M F M M M F F M F F F F F M F F F F M M F M F F M M M M M M
57 37 37 41 53 48 48 36 35 49 28 28 27 34 57 42 47 29 63 58 42 36 36 26 36 48 42 35 51 32 24 51 40 42 50 31 41 45 27 59 38 56 32 28 41 24 42 44 51 37 33 26 42 38 41 40 65 46
1 1 1 0 1 1 0 1 0 1 0 0 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 1 1
2 1 3 0 2 1 0 3 0 1 0 0 2 2 3 1 2 0 1 2 1 2 1 1 0 1 2 1 0 0 3 2 2 1 2 1 2 3 2 0 1 3 0 1 0 2 2 1 2 2 0 3 0 1 1 1 3 1
55 18 72 0 40 8 0 80 0 20 0 0 40 45 95 12 40 0 10 68 22 49 16 30 0 12 40 25 0 0 88 60 59 15 50 24 35 72 55 0 10 70 0 16 0 65 48 30 45 32 0 90 0 28 20 26 75 15
&KDSWHU7KH'DWD6HW75,$/
2873875HVXOWVIURP352&35,177KH'DWD6HW75,$/FRQWLQXHG 81 82 83 84 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
306 307 308 309 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325
A A B A A B B A A B B A A B A B B A A
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
F F M F M M F F M M M F F F M F F M F
64 30 59 64 39 52 33 57 61 62 52 45 34 32 39 43 51 27 56
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1
0 1 3 1 3 2 2 3 1 3 2 3 1 1 1 0 2 0 2
0 30 82 5 80 40 40 85 12 87 60 95 26 2 10 0 35 0 35
6WDWLVWLFDO6XPPDUL]DWLRQ
/HW¶V EHJLQ ZLWK VRPH HOHPHQWDU\ 6$6 SURFHGXUHV WR REWDLQ VRPH VLPSOH GDWD VXPPDULHV VWDUWLQJ ZLWK WKH VXPPDU\ VWDWLVWLFV IRU WKH YDULDEOH 6&25( IRU HDFK WUHDWPHQWJURXS 7KLVERRNDVVXPHVVRPHHOHPHQWDU\NQRZOHGJHRIWKH6$6'$7$VWHSDQG EDVLFSURFHGXUHVHJ352&35,17DQG352&6257 )RUPRUHLQIRUPDWLRQ DERXWKRZWRUHDGGDWDLQWR6$6DQGZRUNLQJZLWKYDULDEOHVWKHQHZXVHULV UHIHUUHGWR³7KH/LWWOH6$6%RRN$3ULPHU´E\'HOZLFKHDQG6ODXJKWHUZKLFK LVZULWWHQXQGHUWKH6$6%RRNVE\8VHUVSURJUDP
)LUVWXVH352&0($16VHH287387 PROC SORT DATA = EXAMP.TRIAL; BY TRT; PROC MEANS MEAN STD N MIN MAX DATA = EXAMP.TRIAL; BY TRT; VAR SCORE AGE; TITLE “Summary Statistics for ‘SCORE’ and ‘AGE’ Variables”; RUN;
1RZXVH352&81,9$5,$7(VHH287387 PROC UNIVARIATE DATA = EXAMP.TRIAL; BY TRT; VAR SCORE; TITLE “Expanded Summary Statistics for ‘SCORE’” ; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
1H[W ZULWH WKH PHDQ DQG PHGLDQ YDOXHV IRU 6&25( IRU HDFK WUHDWPHQW JURXS DQG VWXG\ FHQWHU FRPELQDWLRQ WR DQ RXWSXW GDWD VHW WKHQ SULQW WKH UHVXOWV VHH 287387 PROC SORT DATA = EXAMP.TRIAL; BY TRT CENTER; PROC UNIVARIATE NOPRINT DATA = EXAMP.TRIAL; BY TRT CENTER; VAR SCORE; OUTPUT OUT = SUMMRY N = NUM MEAN = AVESCORE MEDIAN = MEDSCORE; RUN;
PROC PRINT DATA = SUMMRY; TITLE “Summary Statistics for ‘SCORE’ by Treatment Group & Study Center”; RUN;
9LVXDOL]HWKHGLVWULEXWLRQRIVFRUHVXVLQJDURXJKKLVWRJUDPVHH287387 PROC CHART DATA = EXAMP.TRIAL; VBAR SCORE / MIDPOINTS = 10 30 50 70 90 GROUP = TRT; TITLE “Distribution of ‘SCORE’ by Treatment Group”; RUN;
2EWDLQUHVSRQVHUDWHVE\WUHDWPHQWJURXSXVLQJ352&)5(4VHH287387 PROC FORMAT; VALUE RSPFMT RUN;
0 = ‘0=Abs.’
1 = ‘1=Pres’;
PROC FREQ DATA = EXAMP.TRIAL; TABLES TRT*RESP / NOCOL NOPCT; FORMAT RESP RSPFMT.; TITLE ‘Summary of Response Rates by Treatment Group’; RUN;
2EWDLQ WKH IUHTXHQF\ WDEOHV IRU WKH VHYHULW\ FDWHJRULHV XVLQJ 352& )5(4 VHH287387 PROC FORMAT; VALUE SEVFMT
0 1 2 3
= = = =
‘0=None’ ‘1=Mild’ ‘2=Mod.’ ‘3=Sev.’
;
RUN; PROC FREQ DATA = EXAMP.TRIAL; TABLES TRT*SEV / NOCOL NOPCT; FORMAT SEV SEVFMT.; TITLE ‘Severity Distribution by Treatment Group’; RUN;
&KDSWHU7KH'DWD6HW75,$/
2EWDLQ WKH IUHTXHQF\ GLVWULEXWLRQ RI VHYHULW\ FDWHJRU\ E\ WUHDWPHQW JURXS VWUDWLILHGE\6(;VHH287387
PROC FREQ DATA = EXAMP.TRIAL; TABLES SEX*TRT*SEV / NOCOL NOPCT; FORMAT SEV SEVFMT.; TITLE ‘Severity Distribution by Treatment Group and Sex’; RUN;
2EWDLQDKLVWRJUDPRIVHYHULW\IRUHDFKWUHDWPHQWJURXSVHH287387 PROC CHART DATA = EXAMP.TRIAL; VBAR SEV / MIDPOINTS = 0 1 2 3 GROUP = TRT; TITLE “Distribution of ‘SEV’ by Treatment Group”; RUN;
7KH 6257 35,17 )250$7 0($16 81,9$5,$7( )5(4 DQG &+$57 SURFHGXUHV UHSUHVHQW VRPH RI WKH EDVLF 6$6 SURFHGXUHV 0RUH LQIRUPDWLRQ DERXW WKHVH DQG RWKHU 6$6 SURFHGXUHV LQFOXGLQJ V\QWD[ DQG DYDLODEOH RSWLRQV FDQ EH IRXQGLQQXPHURXV6$6ERRNVHJ'HOZLFKHDQG6ODXJKWHU (OOLRWW WKH 6$6 3URFHGXUHV *XLGH RU 6$6 2QOLQH'RF 7KH RXWSXW JHQHUDWHG IURP WKH SUHFHGLQJFRGHSURYLGHVDVPDOOVDPSOLQJRIWKHW\SHVRIVXPPDULHVDYDLODEOHIURP WKHYDVWZHDOWKRIVXPPDUL]DWLRQSRVVLELOLWLHVXVLQJWKHYDULRXV6$6SURFHGXUHVDQG RSWLRQV
2873875HVXOWVIURP352&0($16 Summary Statistics for ‘SCORE’ and ‘AGE’ Variables ------------------------------ TRT=A --------------------------------The MEANS Procedure Variable Mean Std Dev N Minimum Maximum ---------------------------------------------------------------------SCORE 26.6730769 26.9507329 52 0 95.0000000 AGE 43.7307692 12.2187824 52 26.0000000 70.0000000 --------------------------------------------------------------------------------------------------- TRT=B ---------------------------------
Variable Mean Std Dev N Minimum Maximum ---------------------------------------------------------------------SCORE 38.333333 29.6937083 48 0 95.0000000 AGE 41.3333333 11.7949382 48 19.0000000 63.0000000 ----------------------------------------------------------------------
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873875HVXOWVIURP352&81,9$5,$7( Expanded Summary Statistics for ‘SCORE’ ---------------------------- TRT=A -----------------------------------The UNIVARIATE Procedure Variable: SCORE Moments N Mean Std Deviation Skewness Uncorrected SS Coeff Variation
Location Mean Median Mode
Test
52 26.6730769 26.9507329 1.02297616 74039 101.04096
Sum Weights Sum Observations Variance Kurtosis Corrected SS Std Error Mean
52 1387 726.342006 0.13964333 37043.4423 3.73739421
Basic Statistical Measures Variability
26.67308 20.00000 0.00000
Std Deviation Variance Range Interquartile Range
26.95073 726.34201 95.00000 36.50000
Tests for Location: Mu0=0 -Statistic-----p Value------
Student's t Sign Signed Rank
t M S
7.136811 19.5 390
Pr > |t| Pr >= |M| Pr >= |S|
<.0001 <.0001 <.0001
Quantiles (Definition 5) Quantile Estimate 100% Max 99% 95% 90% 75% Q3 50% Median 25% Q1 10% 5% 1% 0% Min
95.0 95.0 85.0 70.0 39.0 20.0 2.5 0.0 0.0 0.0 0.0
Extreme Observations ----Lowest-------Highest---
Value
Obs
Value
Obs
0 0 0 0 0
51 42 38 33 31
75 80 85 90 95
41 45 46 37 48
&KDSWHU7KH'DWD6HW75,$/
2873875HVXOWVIURP352&81,9$5,$7(FRQWLQXHG Expanded Summary Statistics for ‘SCORE’ ----------------------------- TRT=B -------------------------------Moments N Mean Std Deviation Skewness Uncorrected SS Coeff Variation
48 38.3333333 29.6937083 0.21433323 111974 77.4618477
Location Mean Median Mode
Sum Weights Sum Observations Variance Kurtosis Corrected SS Std Error Mean
48 1840 881.716312 -1.2051435 41440.6667 4.28591762
Basic Statistical Measures Variability
38.33333 39.00000 0.00000
Std Deviation Variance Range Interquartile Range
29.69371 881.71631 95.00000 54.00000
Tests for Location: Mu0=0 Test
-Statistic-
-----p Value------
Student's t Sign Signed Rank
t M S
Pr > |t| Pr >= |M| Pr >= |S|
8.94402 19.5 390
Quantiles (Definition 5) Quantile Estimate 100% Max 99% 95% 90% 75% Q3 50% Median 25% Q1 10% 5% 1% 0% Min
95 95 87 82 65 39 11 0 0 0 0
Extreme Observations ----Lowest-------Highest---
Value
Obs
Value
Obs
0 0 0 0 0
47 37 33 26 20
82 83 87 88 95
40 7 44 27 18
<.0001 <.0001 <.0001
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876XPPDU\6WDWLVWLFVLQDQ2XWSXW'DWD6HW Summary Statistics for ‘SCORE’ by Treatment Group & Study Center
Obs
TRT
CENTER
NUM
AVESCORE
MEDSCORE
1 2 3 4 5 6
A A A B B B
1 2 3 1 2 3
20 18 14 20 17 11
18.6000 28.5556 35.7857 39.6500 36.5882 38.6364
13.5 23.5 27.0 40.0 32.0 38.0
2873875HVXOWVIURP352&&+$57 Distribution of ‘SCORE’ by Treatment Group Frequency 25 + *** | *** 24 + *** | *** 23 + *** | *** 22 + *** | *** 21 + *** | *** 20 + *** | *** 19 + *** | *** 18 + *** | *** 17 + *** | *** 16 + *** | *** 15 + *** *** | *** *** 14 + *** *** *** | *** *** *** 13 + *** *** *** | *** *** *** 12 + *** *** *** | *** *** *** 11 + *** *** *** | *** *** *** 10 + *** *** *** *** | *** *** *** *** 9 + *** *** *** *** *** | *** *** *** *** *** 8 + *** *** *** *** *** *** | *** *** *** *** *** *** 7 + *** *** *** *** *** *** | *** *** *** *** *** *** 6 + *** *** *** *** *** *** *** | *** *** *** *** *** *** *** 5 + *** *** *** *** *** *** *** *** | *** *** *** *** *** *** *** *** 4 + *** *** *** *** *** *** *** *** *** *** | *** *** *** *** *** *** *** *** *** *** 3 + *** *** *** *** *** *** *** *** *** *** | *** *** *** *** *** *** *** *** *** *** 2 + *** *** *** *** *** *** *** *** *** *** | *** *** *** *** *** *** *** *** *** *** 1 + *** *** *** *** *** *** *** *** *** *** | *** *** *** *** *** *** *** *** *** *** •------------------------------------------------------------10 30 50 70 90 10 30 50 70 90 SCORE Midpoint |--------- A ---------|
|--------- B ---------|
TRT
&KDSWHU7KH'DWD6HW75,$/
2873875HVXOWVIURP352&)5(4'LFKRWRPRXV5HVSRQVH )UHTXHQFLHV Summary of Response Rates by Treatment Group The FREQ Procedure Table of TRT by RESP TRT
RESP
Frequency| Row Pct |0=Abs. |1=Pres | ---------+--------+--------+ A | 13 | 39 | | 25.00 | 75.00 | ---------+--------+--------+ B | 9 | 39 | | 18.75 | 81.25 | ---------+--------+--------+ Total 22 78
Total 52
48
100
2873875HVXOWVIURP352&)5(46HYHULW\)UHTXHQFLHVE\ 7UHDWPHQW*URXS Severity Distribution by Treatment Group The FREQ Procedure Table of TRT by SEV TRT
SEV
Frequency| Row Pct |0=None |1=Mild |2=Mod. |3=Sev. | ---------+--------+--------+--------+--------+ A | 13 | 22 | 11 | 6 | | 25.00 | 42.31 | 21.15 | 11.54 | ---------+--------+--------+--------+--------+ B | 9 | 12 | 17 | 10 | | 18.75 | 25.00 | 35.42 | 20.83 | ---------+--------+--------+--------+--------+ Total 22 34 28 16
Total 52
48
100
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873875HVXOWVIURP352&)5(46HYHULW\)UHTXHQFLHVE\ *HQGHUDQG7UHDWPHQW*URXS Severity Distribution by Treatment Group and Sex The FREQ Procedure Table 1 of TRT by SEV Controlling for SEX=F TRT
SEV
Frequency| Row Pct |0=None |1=Mild |2=Mod. |3=Sev. | ---------+--------+--------+--------+--------+ A | 10 | 12 | 5 | 4 | | 32.26 | 38.71 | 16.13 | 12.90 | ---------+--------+--------+--------+--------+ B | 7 | 4 | 11 | 3 | | 28.00 | 16.00 | 44.00 | 12.00 | ---------+--------+--------+--------+--------+ Total 17 16 16 7
Total 31
25
56
Table 2 of TRT by SEV Controlling for SEX=M TRT
SEV
Frequency| Row Pct |0=None |1=Mild |2=Mod. |3=Sev. | ---------+--------+--------+--------+--------+ A | 3 | 10 | 6 | 2 | | 14.29 | 47.62 | 28.57 | 9.52 | ---------+--------+--------+--------+--------+ B | 2 | 8 | 6 | 7 | | 8.70 | 34.78 | 26.09 | 30.43 | ---------+--------+--------+--------+--------+ Total 5 18 12 9
Total 21
23
44
&KDSWHU7KH'DWD6HW75,$/
2873875HVXOWVIURP352&&+$576(9E\*HQGHUDQG 7UHDWPHQW*URXS Distribution of "SEV" by Treatment Group" Frequency 22 + ***** | ***** 21 + ***** | ***** 20 + ***** | ***** 19 + ***** | ***** 18 + ***** | ***** 17 + ***** ***** | ***** ***** 16 + ***** ***** | ***** ***** 15 + ***** ***** | ***** ***** 14 + ***** ***** | ***** ***** 13 + ***** ***** ***** | ***** ***** ***** 12 + ***** ***** ***** ***** | ***** ***** ***** ***** 11 + ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** 10 + ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** 9 + ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** 8 + ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** 7 + ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** 6 + ***** ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** ***** 5 + ***** ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** ***** 4 + ***** ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** ***** 3 + ***** ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** ***** 2 + ***** ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** ***** 1 + ***** ***** ***** ***** ***** ***** ***** ***** | ***** ***** ***** ***** ***** ***** ***** ***** •----------------------------------------------------------0 1 2 3 0 1 2 3 Midpoint |----------- A ----------|
|----------- B ----------|
SEV
TRT
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6XPPDU\
7KLVFKDSWHULOOXVWUDWHVKRZGDWDIURPDVLPSOLILHGK\SRWKHWLFDOFOLQLFDOWULDOFDQEH FROOHFWHGDQGHQWHUHGLQWRD6$6GDWDVHWDQGKRZVLPSOHVWDWLVWLFDOVXPPDULHVFDQ EH REWDLQHG IRU WKH NH\ GDWD 7KUHH W\SHV RI UHVSRQVH YDULDEOHV DUH SUHVHQWHG D GLFKRWRPRXVELQDU\YDULDEOH5(63 DQRUGHUHGFDWHJRULFDOYDULDEOH6(9 DQGD FRQWLQXRXVQXPHULFYDULDEOH6&25( 7KHVHW\SHVRIUHVSRQVHVDUHUHSUHVHQWDWLYH RIWKRVHPRVWRIWHQHQFRXQWHUHGLQFOLQLFDOVWXGLHV7KHGDWDVHW75,$/LVXVHGLQ WKHH[HUFLVHVLQ&KDSWHU
CHHAAPPTTEERR 4 The One-Sample t-Test 4.1 4.2 4.3 4.4
Introduction........................................................................................................ 55 Synopsis .............................................................................................................. 55 Examples............................................................................................................. 56 Details & Notes................................................................................................... 64
4.1
Introduction The one-sample t-test is used to infer whether an unknown population mean differs from a hypothesized value. The test is based on a single sample of ‘n’ measurements from the population. A special case of the one-sample t-test is often used to determine if a mean response changes under different experimental conditions by using paired observations. Paired measurements most often arise from repeated observations for the same patient, either at two different time points (e.g., pre- and post-study) or from two different treatments given to the same patient. Because the analyst uses the changes (differences) in the paired measurements, this test is often referred to as the ‘paired-difference’ t-test. A hypothesized mean difference of 0 might be interpreted as ‘no clinical response’. The one-sample t-test, along with its two-sample counterpart presented in the next chapter, must surely be the most frequently cited inferential tests used in statistics.
4.2
Synopsis A sample of n data points, y1, y2, ..., yn, is randomly selected from a normally distributed population with unknown mean, m. This mean is estimated by the sample mean, y. You hypothesize the mean to be some value, m0. The greater the deviation between y and m0, the greater the evidence that the hypothesis is untrue. The test statistic t is a function of this deviation, standardized by the standard error of y, namely, s/ n . Large values of t will lead to rejection of the null hypothesis. When H0 is true, t has the Student-t probability distribution with n–1 degrees of freedom.
56
Common Statistical Methods for Clinical Research with SAS Examples
The one-sample t-test is summarized as follows:
null hypothesis: alt. hypothesis:
H0: m = m0 HA: m =/ m0 t=
test statistic: decision rule:
y −µ 0 s/ n
reject H0 if | t | > ta/2,n–1
The value ta/2,n–1 represents the ‘critical t-value’ of the Student t-distribution at a two-tailed significance level, a, and n–1 degrees of freedom. This value can be obtained from Appendix A.2 or from SAS, as described later (see 4.4.3). The one-sample t-test is often used in matched-pairs situations, such as testing for pre- to post-study changes. The goal is to determine whether a difference exists between two time points or between two treatments based on data values from the same patients for both measures. The ‘paired-difference’ t-test is conducted using the procedures outlined for the one-sample t-test with m=the mean difference (sometimes denoted md), m0= 0 , and the yi’s = the paired-differences. Examples 4.1 and 4.2 demonstrate application of the one-sample t-test in non-matched and matched situations, respectively.
4.3
Examples
p Example 4.1 –Body-Mass Index
In a number of previous Phase I and II studies of male, non-insulin-dependent diabetic (NIDDM) patients conducted by Mylitech Biosystems, Inc., the mean body mass index (BMI) was found to be 28.4. An investigator has 17 male NIDDM patients enrolled in a new study and wants to know if the BMI from this sample is consistent with previous findings. BMI is computed as the ratio of weight in kilograms to the square of the height in meters. What conclusion can the investigator make based on the following height and weight data from his 17 patients?
Chapter 4: The One-Sample t-Test 57
TABLE 4.1. Raw Data for Example 4.1 Patient Number
Height (cm)
1
178
2
Weight (kg)
Patient Number
Height (cm)
Weight (Kg)
101.7
10
183
97.8
170
97.1
11
n/a
78.7
3
191
114.2
12
172
77.5
4
179
101.9
13
183
102.8
5
182
93.1
14
169
81.1
6
177
108.1
15
177
102.1
7
184
85.0
16
180
112.1
8
182
89.1
17
184
89.7
9
179
95.8
Solution First, calculate BMI as follows: convert height in centimeters to meters by dividing by 100, square this quantity, then divide it into the weight. BMI data (yi) are shown below, along with the squares and summations. Because no height measurement is available for Patient No. 11, BMI cannot be computed and is considered a missing value. TABLE 4.2. Computation of Sum of Squares / Example 4.1 Patient Number
BMI (y)
1
32.1
2
2
2
Patient Number
BMI (y)
BMI 2 (y )
1030.30
10
29.2
852.85
33.6
1128.87
11
.
.
3
31.3
979.94
12
26.2
686.26
4
31.8
1011.43
13
30.7
942.28
5
28.1
789.98
14
28.4
806.30
6
34.5
1190.58
15
32.6
1062.08
7
25.1
630.33
16
34.6
1197.07
8
26.9
723.55
17
26.5
701.96
9
29.9
893.96
BMI 2 (y )
TOTALS
Σyi
= 481.5 2
Σ(yi) = 14627.74
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHVDPSOHPHDQ%0, \ DQGVWDQGDUGGHYLDWLRQVDUHFRPSXWHGIURPWKHQ SDWLHQWVZLWKQRQPLVVLQJGDWDDVIROORZV7KHIRUPXODVKRZQIRUVLVHTXLYDOHQWWR WKDW JLYHQ LQ &KDSWHU EXW LW¶V LQ D PRUH FRQYHQLHQW IRUPDW IRU PDQXDO FRPSXWDWLRQV
\ \ Q DQG \ Q\ V Q :LWKDVLJQLILFDQFHOHYHOWKHWHVWVXPPDU\EHFRPHV L
L
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ m
+ m \ - - $
W
WHVWVWDWLVWLF
GHFLVLRQUXOH
UHMHFW+ LI_W_!W
FRQFOXVLRQ
%HFDXVH!\RXUHMHFW+ DQGFRQFOXGH WKDWWKH%0,RIWKHQHZSDWLHQWVGLIIHUVIURPWKDW RIWKHSDWLHQWVVWXGLHGSUHYLRXVO\
V Q
6$6$QDO\VLVRI([DPSOH
7KH 6$6 FRGH DQG RXWSXW IRU DQDO\]LQJ WKLV GDWD VHW DUH VKRZQ RQ WKH QH[W WZR SDJHV7KHGDWDDUHUHDGLQWRWKHGDWDVHW',$%åXVLQJWKH,1387VWDWHPHQW 7KH,1),/(VWDWHPHQWFDQEHXVHGLIWKHGDWDDUHDOUHDG\LQDQH[WHUQDOILOH 7KH UHVSRQVHYDULDEOH%0,ERG\PDVVLQGH[ LVFRPSXWHGêDVHDFKUHFRUGLVUHDGLQ 7KH SURJUDP SULQWV D OLVWLQJRIWKHGDWDE\XVLQJ352&35,17 7KHWWHVWLV FRQGXFWHGZLWKWKH77(67SURFHGXUHVSHFLI\LQJWKHQXOOYDOXHRIIRU+ 7KH RXWSXW VKRZV D WVWDWLVWLF RI ñ ZKLFK FRUURERUDWHV WKH PDQXDO FDOFXODWLRQV SHUIRUPHG 7KH WHVW LV VLJQLILFDQW LI WKH SYDOXH ò LV OHVV WKDQ WKH VLJQLILFDQFHOHYHORIWKHWHVW,QWKLVFDVH\RXUHMHFWWKHQXOOK\SRWKHVLVDWDWZR WDLOHGVLJQLILFDQFHOHYHORIEHFDXVHWKHSYDOXHRILVOHVVWKDQ
Chapter 4: The One-Sample t-Test 59
The output automatically shows the BMI summary statistics, namely, the mean and the standard deviation . The standard error and the upper and lower confidence limits associated with a 95% confidence interval are also shown for both the mean and standard deviation. In SAS Releases 6.12 and earlier, PROC TTEST cannot be used to conduct the one-sample t-test. Instead, you first transform the response variable in the DATA step by subtracting the null value (e.g., BMI0 = BMI-28.4), then use PROC MEANS with the T and PRT options on this new variable. The SAS statements for this example would be: PROC MEANS T PRT DATA = DIAB; VAR BMI0; RUN;
SAS Code for Example 4.1 DATA DIAB; INPUT PATNO WT_KG HT_CM @@; BMI = WT_KG / ((HT_CM/100)**2); DATALINES; 1 101.7 178 2 97.1 170 3 114.2 191 4 101.9 179 5 93.1 182 6 108.1 177 7 85.0 184 8 89.1 182 9 95.8 179 10 97.8 183 11 78.7 . 12 77.5 172 13 102.8 183 14 81.1 169 15 102.1 177 16 112.1 180 17 89.7 184 ;
PROC PRINT DATA = DIAB; VAR PATNO HT_CM WT_KG BMI; FORMAT BMI 5.1; TITLE1 'One-Sample t-Test'; TITLE2 'EXAMPLE 4.1: Body-Mass Index Data'; RUN; PROC TTEST H0=28.4 DATA=DIAB; VAR BMI; RUN;
60
Common Statistical Methods for Clinical Research with SAS Examples
Output 4.1. SAS Output for Example 4.1 One-Sample t-Test EXAMPLE 4.1: Body-Mass Index Data Obs
PATNO
HT_CM
WT_KG
BMI
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
178 170 191 179 182 177 184 182 179 183 . 172 183 169 177 180 184
101.7 97.1 114.2 101.9 93.1 108.1 85.0 89.1 95.8 97.8 78.7 77.5 102.8 81.1 102.1 112.1 89.7
32.1 33.6 31.3 31.8 28.1 34.5 25.1 26.9 29.9 29.2 . 26.2 30.7 28.4 32.6 34.6 26.5
The TTEST Procedure
Variable BMI
Lower CL N Mean 16
28.478
Variable BMI
Statistics Upper CL Lower CL Mean Mean Std Dev
30.093
31.709
2.24
Statistics Std Err Minimum 0.7581
25.106
Variable
T-Tests DF t Value
BMI
15
2.23
Std Dev
Upper CL Std Dev
3.0323
4.693
Maximum 34.599
Pr > |t| 0.0411
p Example 4.2 -- Paired-Difference in Weight Loss
Mylitech is developing a new appetite suppressing compound for use in weight reduction. A preliminary study of 35 obese patients provided the following data on patients' body weights (in pounds) before and after 10 weeks of treatment with the new compound. Does the new treatment look at all promising?
Chapter 4: The One-Sample t-Test 61
TABLE 4.3. Raw Data for Example 4.2 Weight In Pounds Subject Number
Pre-
1
165
2
Weight In Pounds
Post-
Subject Number
Pre-
Post-
160
19
177
171
202
200
20
181
170
3
256
259
21
148
154
4
155
156
22
167
170
5
135
134
23
190
180
6
175
162
24
165
154
7
180
187
25
155
150
8
174
172
26
153
145
9
136
138
27
205
206
10
168
162
28
186
184
11
207
197
29
178
166
12
155
155
30
129
132
13
220
205
31
125
127
14
163
153
32
165
169
15
159
150
33
156
158
16
253
255
34
170
161
17
138
128
35
145
152
18
287
280
Solution First, calculate the weight loss for each patient (yi) by subtracting the ‘post'- weight from the ‘pre-’ weight (see Output 4.2). The mean and standard deviation of these differences are yd = 3.457 and s = 6.340 based on n = 35 patients. A test for significant mean weight loss is equivalent to testing whether the mean loss, md, is greater than 0. You use a one-tailed test because the interest is in weight loss (not weight ‘change’). The hypothesis test is summarized as follows: null hypothesis:
H0: md = 0
alt. hypothesis:
HA: md > 0
test statistic:
t=
decision rule:
reject H0 if t > t0.05,34 = 1.691
conclusion:
Because 3.226 > 1.691, you reject H0 at a significance level of 0.05 and conclude that a significant mean weight loss occurs following treatment.
yd sd / n
=
3.457 = 3.226 6.340/ 35
62
Common Statistical Methods for Clinical Research with SAS Examples
SAS Analysis of Example 4.2 The SAS code and output for analyzing this data set are shown below. The program computes the paired differences (variable=WTLOSS) as the data are read into the data set OBESE. The program prints a listing of the data by using PROC PRINT. A part of the listing is shown in Output 4.2 11 . As in Example 4.1, PROC TTEST without a CLASS statement can be used to conduct the paired t-test. In this case, you tell SAS to compute the differences, WTLOSS, so you can simply include the VAR WTLOSS statement with PROC TTEST. Alternatively, you may omit the VAR statement and include the PAIRED statement followed by the pair of variables being compared, separated by an asterisk (*). This eliminates the need to compute the differences in a separate DATA step. This example illustrates the PAIRED statement by including the two variables, WTPRE and WTPST 12 . SAS will automatically compute the paired differences (WTPRE–WTPST), then conduct the t-test. The output shows a t-statistic of 3.23 13 , which corroborates the manual computations. Because SAS prints the twotailed p-value 14 , this value must be halved to obtain the one-tailed result, p= 0.0028/2 =0.0014. You reject the null hypothesis and conclude that a significant weight loss occurs because this p-value is less than the nominal significance level of a = 0.05. In SAS Releases 6.12 and earlier, PROC TTEST cannot be used to conduct paired t-tests. Instead, use PROC MEANS with the T and PRT options on the variable WTLOSS. The SAS statements for this example would be: PROC MEANS MEAN STD N T PRT DATA=OBESE; VAR WTLOSS; RUN;
SAS Code for Example 4.2 DATA OBESE; INPUT SUBJ WTPRE WTPST @@; WTLOSS = WTPRE – WTPST; DATALINES; 1 165 160 2 202 200 3 256 5 135 134 6 175 162 7 180 9 136 138 10 168 162 11 207 13 220 205 14 163 153 15 159 17 138 128 18 287 280 19 177 21 148 154 22 167 170 23 190 25 155 150 26 153 145 27 205 29 178 166 30 129 132 31 125 33 156 158 34 170 161 35 145 ;
259 187 197 150 171 180 206 127 152
4 8 12 16 20 24 28 32
155 174 155 253 181 165 186 165
156 172 155 255 170 154 184 169
Chapter 4: The One-Sample t-Test 63
11 PROC PRINT DATA = OBESE; VAR SUBJ WTPRE WTPST WTLOSS; TITLE1 'One-Sample t-Test'; TITLE2 'EXAMPLE 4.2: Paired-Difference in Weight Loss'; RUN; PROC TTEST DATA = OBESE; 12 PAIRED WTPRE*WTPST; RUN;
Output 4.2. SAS Output for Example 4.2 One-Sample t-Test EXAMPLE 4.2: Paired-Difference in Weight Loss OBS
SUBJ
WTPRE
WTPST
1 2 3 4 5 6 7 8 . . . 33 34 35
1 2 3 4 5 6 7 8 . . . 33 34 35
165 202 256 155 135 175 180 174 . . . 156 170 145
160 200 259 156 134 162 187 172 . . . 158 161 152
WTLOSS 5 2 -3 -1 1 13 -7 2 . . . -2 9 -7
11
The TTEST Procedure Statistics
Variable WTPRE – WTPST
N 35
Lower CL Mean 1.2792
Upper CL Mean Mean 3.4571 5.635
Lower CL Std Dev 5.1283
Std Dev 6.3401
Statistics Variable Std Err WTPRE - WTPST 1.0717
Minimum -7
Maximum 15
T-Tests Variable WTPRE - WTPST
DF 34
t Value 3.23 13
Pr > |t| 0.0028 14
Upper CL Std Dev 8.3068
64
Common Statistical Methods for Clinical Research with SAS Examples
4.4
Details & Notes The main difference between the Z-test (Chapter 2) and the t-test 4.4.1. is that the Z-statistic is based on a known standard deviation, s, while the tstatistic uses the sample standard deviation, s, as an estimate of s. With the assumption of normally distributed data, the variance s2 is more closely estimated by the sample variance s2 as n gets large. It can be shown that the t-test is equivalent to the Z-test for infinite degrees of freedom. In practice, a ‘large’ sample is usually considered n ³ 30. The distributional relationship between the Z- and t-statistics is shown in Appendix B. If the assumption of normally distributed data cannot be made, the mean might not represent the best measure of central tendency. In such cases, a nonparametric rank test, such as the Wilcoxon signed-rank test (Chapter 12), might be more appropriate. The UNIVARIATE procedure in SAS can be used to test for normality. Because Example 4.1 tests for any difference from a hypothesized 4.4.2. value, a two-tailed test is used. A one-tailed test would be used when you want to test whether the population mean is strictly greater than or strictly less than the hypothesized threshold level (m0), such as in Example 4.2. We use the rejection region according to the alternative hypothesis as follows:
Type of Test
Alternative Hypothesis
Corresponding Rejection Region
two-tailed
HA: m =/ m0
reject H0 if t > ta/2,n–1 or t < –ta/2,n–1
one-tailed (right)
HA: m > m0
reject H0 if t > ta,n–1
one-tailed (left)
HA: m < m0
reject H0 if t < -ta,n–1
Most statistics text books have tables of the t-distribution in an 4.4.3. Appendix. These tables often provide the critical t-values for levels of a = 0.10, 0.05, 0.025, 0.01, and 0.005, as in Appendix A.2. Critical t-values can also be found from many statistical programs. In SAS, you use the function TINV(1–a,n–1), where a = a for a one-tailed test and a = a/2 for a two-tailed test. In Example 4.1, the critical t-value of 2.131 with n=16 (15 degrees of freedom) can be found with the SAS function TINV(0.975,15).
Chapter 4: The One-Sample t-Test 65
Alternatively, the p-value associated with the test statistic, t, based on υ degrees of freedom, can be found by using the SAS function PROBT(t,υ), which gives the cumulative probability of the Student-t distribution. Because you want the tail probabilities, the associated two-tailed p-value is 2H(1–PROBT(t, υ)). In Example 4.1, the p-value 0.041 can be confirmed by using the SAS expression 2H(1–PROBT(2.23,15)). In Example 4.2, the onetailed p-value 0.0014 can be confirmed by using the SAS expression 1–PROBT(3.226,34). A non-significant result does not necessarily imply that the null 4.4.4. hypothesis is true, only that insufficient evidence exists to contradict it. Larger sample sizes are often needed and an investigation of the power curve (see Chapter 2) is necessary for ‘equivalency studies’. Significance does not imply causality. In Example 4.2, concluding 4.4.5. that the treatment caused the significant weight loss would be presumptuous. Causality can better be investigated using a concurrent untreated or control group in the study, and strictly controlling other experimental conditions. In controlled studies, comparison of responses among groups is carried out with tests such as the two-sample t-test (Chapter 5) or analysis of variance methods (Chapters 6-8). Known values of n measurements uniquely determine the sample 4.4.6. mean y . But given y, the n measurements cannot uniquely be determined. In fact, n–1 of the measurements can be freely selected, with the nth determined by y. Thus the term n–1 degrees of freedom. Note: The number of degrees of freedom, often denoted by the Greek letter υ (nu), is a parameter of the Student-t probability distribution.
66
Common Statistical Methods for Clinical Research with SAS Examples
CHHAAPPTTEERR 5 The Two-Sample t-Test 5.1 5.2 5.3 5.4
Introduction........................................................................................................ 67 Synopsis .............................................................................................................. 67 Examples............................................................................................................. 68 Details & Notes................................................................................................... 73
5.1
Introduction The two-sample t-test is used to compare the means of two independent populations, denoted m1 and m2. This test has ubiquitous application in the analysis of controlled clinical trials. Examples might include the comparison of mean decreases in diastolic blood pressure between two groups of patients receiving different antihypertensive agents, or estimating pain relief from a new treatment relative to that of a placebo based on subjective assessment of percentimprovement in two parallel groups. Assume that the two populations are normally distributed and have the same variance (s2).
5.2
Synopsis Samples of n1 and n2 observations are randomly selected from the two populations, with the measurements from Population i denoted by yi1, yi2, ..., yini (for i = 1, 2). The unknown means, m1 and m2, are estimated by the sample means, y1 and y2 , respectively. The greater the difference between the sample means, the greater the evidence that the hypothesis of equality of population means (H0) is untrue. The test statistic, t, is a function of this difference standardized by its standard error, namely s(1/n1 + 1/n2)1/2. Large values of t will lead to rejection of the null hypothesis. When H0 is true, t has the Student-t distribution with N–2 degrees of freedom, where N = n1+n2.
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHEHVWHVWLPDWHRIWKHFRPPRQXQNQRZQSRSXODWLRQYDULDQFHs LVWKH SRROHG YDULDQFHVS FRPSXWHGDVWKHZHLJKWHGDYHUDJHRIWKHVDPSOHYDULDQFHVXVLQJWKH IRUPXOD Q - ¼ V + Q - ¼ V VS = Q + Q - 7KHWZRVDPSOHWWHVWLVVXPPDUL]HGDVIROORZV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
WHVWVWDWLVWLF
+ m m + m m
$
W
\ \ V S Q Q
GHFLVLRQUXOH
¬ ®
UHMHFW+ LI_W_!Wa
1±
([DPSOHV
p ([DPSOH)(9&KDQJHV
$QHZFRPSRXQG$%&LVEHLQJGHYHORSHGIRUORQJWHUPWUHDWPHQWRI SDWLHQWVZLWKFKURQLFDVWKPD$VWKPDWLFSDWLHQWVZHUHHQUROOHGLQDGRXEOH EOLQGVWXG\DQGUDQGRPL]HGWRUHFHLYHGDLO\RUDOGRVHVRI$%&RUD SODFHERIRUZHHNV7KHSULPDU\PHDVXUHPHQWRILQWHUHVWLVWKHUHVWLQJ)(9 IRUFHGH[SLUDWRU\YROXPHGXULQJWKHILUVWVHFRQGRIH[SLUDWLRQ ZKLFKLV PHDVXUHGEHIRUHDQGDWWKHHQGRIWKHZHHNWUHDWPHQWSHULRG'DWDLQ OLWHUV DUHVKRZQLQWKHWDEOHZKLFKIROORZV'RHVDGPLQLVWUDWLRQRI$%& DSSHDUWRKDYHDQ\HIIHFWRQ)(9"
Chapter 5: The Two-Sample t-Test 69
TABLE 5.1. Raw Data for Example 5.1 ABC-123 Group
Placebo Group
Patient Number
Baseline
Week 6
Patient Number
Baseline
Week 6
101
1.35
n/a
102
3.01
3.90
103
3.22
3.55
104
2.24
3.01
106
2.78
3.15
105
2.25
2.47
108
2.45
2.30
107
1.65
1.99
109
1.84
2.37
111
1.95
n/a
110
2.81
3.20
112
3.05
3.26
113
1.90
2.65
114
2.75
2.55
116
3.00
3.96
115
1.60
2.20
118
2.25
2.97
117
2.77
2.56
120
2.86
2.28
119
2.06
2.90
121
1.56
2.67
122
1.71
n/a
124
2.66
3.76
123
3.54
2.92
Solution Let m1 and m2 represent the mean increases in FEV1 for the ABC-123 and the placebo groups, respectively. The first step is to calculate each patient's increase in FEV1 from baseline to Week 6 (shown in Output 5.1). Patients 101, 111, and 122 are excluded from this analysis because no Week-6 measurements are available. The FEV1 increases (in liters) are summarized by treatment group as follows: TABLE 5.2. Treatment Group Summary Statistics for Example 5.1
ABC-123
Placebo
Mean
( yi )
0.503
0.284
SD
(si)
0.520
0.508
Sample Size
(ni)
11
10
The pooled variance is
s2p =
(11 − 1) ⋅ 0.5202 + (10 − 1) ⋅ 0.5082 = 0.265 21 − 2
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
%HFDXVH\RXDUHORRNLQJIRUDQ\HIIHFWXVHDWZRWDLOHGWHVWDVIROORZV
QXOOK\SRWKHVLV
+ m m
DOWK\SRWKHVLV
+ m m
WHVWVWDWLVWLF
W
GHFLVLRQUXOH
UHMHFW+ LI_W_!W
FRQFOXVLRQ
%HFDXVH LV QRW ! \RX FDQQRW UHMHFW +
$
¬ ®
6$6$QDO\VLVRI([DPSOH 7KH6$6FRGHIRUDQDO\]LQJWKLVGDWDVHWDQGWKHUHVXOWLQJRXWSXWDUHVKRZQRQWKH QH[W WKUHH SDJHV 7KH SURJUDP FRPSXWHV WKH SUH WR SRVWVWXG\ FKDQJHV LQ )(9 µ&+*¶ åWKHQSULQWVDOLVWLQJRIWKHGDWDE\XVLQJ352&35,17 352& 0($16 LV XVHG LQ WKLV H[DPSOH WR REWDLQ WKH VXPPDU\ VWDWLVWLFV IRU WKH EDVHOLQHDQG:HHNUHVSRQVHYDOXHVIRUHDFKJURXSê7KH7DQG357RSWLRQV DUH VSHFLILHG LQ WKH 352& 0($16 VWDWHPHQW WR LOOXVWUDWH WKHRQHVDPSOH WWHVWV IRU WKH VLJQLILFDQFH RI WKH ZLWKLQJURXS FKDQJHV YDULDEOH &+* $V GHPRQVWUDWHGLQ&KDSWHUWKHRQHVDPSOHWWHVWFDQDOVREHFRQGXFWHGXVLQJWKH 77(67SURFHGXUHLQ6$69HUVLRQRUODWHU1RWHWKDWWKH$%&*URXSVKRZV DVLJQLILFDQWLQFUHDVHLQPHDQ)(9S ZKLOHWKH3ODFHER*URXSGRHVQRW S 7KH WZRVDPSOH WWHVW LV FDUULHG RXW E\ XVLQJ 352& 77(67 ZLWK WKH FODVV YDULDEOH 757*53 VSHFLILHG LQ D &/$66 VWDWHPHQW $VVXPLQJ HTXDO YDULDQFHV \RX XVH WKH WYDOXH DQG SYDOXH 3U!_W_ FRUUHVSRQGLQJ WR µ(TXDO¶ XQGHU WKH µ9DULDQFHV¶ FROXPQ 7KH WVWDWLVWLF FRQILUPV WKH UHVXOW IURP WKH PDQXDO FDOFXODWLRQDERYH7KHSYDOXH! LQGLFDWHVQRVLJQLILFDQWGLIIHUHQFHLQ WKH)(9LQFUHDVHVEHWZHHQJURXSV 6$6SULQWVVXPPDU\VWDWLVWLFVIRUHDFKOHYHORIWKHFODVVYDULDEOHDQGEHJLQQLQJ ZLWK 9HUVLRQ WKH XSSHU DQG ORZHUFRQILGHQFH OLPLWV IRU WKH PHDQV DQG VWDQGDUG GHYLDWLRQV DUH DOVR SULQWHG E\ GHIDXOW 7KH FRQILGHQFH OLPLWV FDQ EH VXSSUHVVHGE\XVLQJWKH&, 121(RSWLRQLQ352&77(67
Chapter 5: The Two-Sample t-Test 71
SAS Code for Example 5.1 DATA FEV; INPUT PATNO TRTGRP $ FEV0 FEV6 @@; CHG = FEV6 - FEV0; IF CHG = . THEN DELETE; DATALINES; 101 A 1.35 . 103 A 3.22 3.55 106 108 A 2.45 2.30 109 A 1.84 2.37 110 113 A 1.90 2.65 116 A 3.00 3.96 118 120 A 2.86 2.28 121 A 1.56 2.67 124 102 P 3.01 3.90 104 P 2.24 3.01 105 107 P 1.65 1.99 111 P 1.95 . 112 114 P 2.75 2.55 115 P 1.60 2.20 117 119 P 2.06 2.90 122 P 1.71 . 123 ;
A A A A P P P P
2.78 2.81 2.25 2.66 2.25 3.05 2.77 3.54
3.15 3.20 2.97 3.76 2.47 3.26 2.56 2.92
PROC FORMAT; VALUE $TRT 'A' = 'ABC-123' 'P' = 'PLACEBO'; RUN;
PROC PRINT DATA = FEV; VAR PATNO TRTGRP FEV0 FEV6 CHG; FORMAT TRTGRP $TRT. FEV0 FEV6 CHG 5.2; TITLE1 'Two-Sample t-Test'; TITLE2 'EXAMPLE 5.1: FEV1 Changes'; RUN; PROC MEANS MEAN STD N T PRT DATA = FEV; BY TRTGRP; VAR FEV0 FEV6 CHG; FORMAT TRTGRP $TRT.; RUN; PROC TTEST DATA = FEV; CLASS TRTGRP; VAR CHG; FORMAT TRTGRP $TRT.; RUN;
72
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT 5.1. SAS Output for Example 5.1 Two-Sample t-Test EXAMPLE 5.1: FEV1 Changes Obs
PATNO
TRTGRP
FEV0
FEV6
CHG
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
103 106 108 109 110 113 116 118 120 121 124 102 104 105 107 112 114 115 117 119 123
ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 ABC-123 PLACEBO PLACEBO PLACEBO PLACEBO PLACEBO PLACEBO PLACEBO PLACEBO PLACEBO PLACEBO
3.22 2.78 2.45 1.84 2.81 1.90 3.00 2.25 2.86 1.56 2.66 3.01 2.24 2.25 1.65 3.05 2.75 1.60 2.77 2.06 3.54
3.55 3.15 2.30 2.37 3.20 2.65 3.96 2.97 2.28 2.67 3.76 3.90 3.01 2.47 1.99 3.26 2.55 2.20 2.56 2.90 2.92
0.33 0.37 -0.15 0.53 0.39 0.75 0.96 0.72 -0.58 1.11 1.10 0.89 0.77 0.22 0.34 0.21 -0.20 0.60 -0.21 0.84 -0.62
Two-Sample t-Test EXAMPLE 5.1: FEV1 Changes ------------------- TRTGRP=ABC-123 ------------------------------The MEANS Procedure
Variable Mean Std Dev N t Value Pr > |t| ----------------------------------------------------------------FEV0 2.4845455 0.5328858 11 15.46 <.0001 FEV6 2.9872727 0.5916095 11 16.75 <.0001 CHG 0.5027273 0.5198286 11 3.21 0.0094 --------------------------------------------- ---------------------------------------- TRTGRP=PLACEBO ------------------------------Variable Mean Std Dev N t Value Pr > |t| -----------------------------------------------------------------FEV0 2.4920000 0.6355365 10 12.40 <.0001 FEV6 2.7760000 0.5507006 10 15.94 <.0001 CHG 0.2840000 0.5077882 10 1.77 0.1107 ------------------------------------------------------------------
Chapter 5: The Two-Sample t-Test 73
OUTPUT 5.1. SAS Output for Example 5.1 (continued)
Two-Sample t-Test EXAMPLE 5.1: FEV1 Changes The TTEST Procedure Statistics
N
Lower Mean
ABC-123 11 PLACEBO 10 Diff (1-2)
0.1535 -0.079 -0.251
Variable CHG CHG CHG
Class
CL Upper CL Mean Mean 0.5027 0.284 0.2187
0.852 0.6472 0.6889
Lower CL Upper CL Std Dev Std Dev Std Dev 0.3632 0.3493 0.391
0.5198 0.5078 0.5142
0.9123 0.927 0.751
Statistics Variable CHG CHG CHG
Class
ABC-123 PLACEBO Diff (1-2)
Std Err
Minimum
Maximum
0.1567 0.1606 0.2247
-0.58 -0.62
1.11 0.89
T-Tests Variable CHG CHG
Method
Variances
DF
t Value
Pooled Satterthwaite
Equal Unequal
19 18.9
0.97 0.97
Variable CHG
5.4
Method Folded F
Equality of Variances Num DF Den DF 10
9
Pr > |t|
0.3425 0.3420
F Value
Pr > F
1.05
0.9532
Details & Notes The assumption of equal variances can be tested using the F-test. 5.4.1. An F-test generally arises as a ratio of variances. When the hypothesis of equal variances is true, the ratio of sample variances should be about 1. The probability distribution of this ratio is known as the F-distribution (which is widely used in the analysis of variance).
74
Common Statistical Methods for Clinical Research with SAS Examples
The test for comparing two variances can be made as follows: 2
2
null hypothesis: alt. hypothesis:
H0: s1 = s2 2 2 HA: s1 =/ s2
test statistic:
2 2 F = sU / sL
decision rule:
reject H0 if F > Fc(α/2)
The subscripts U and L denote ‘upper’ and ‘lower’. The sample with the larger 2 sample variance (sU ) is considered the ‘upper’ sample, and the sample with the 2 smaller sample variance (sL ) is the ‘lower’. Large values of F indicate a large disparity in sample variances and would lead to rejection of H0. The critical F value, Fc, can be obtained from the F-tables in most elementary statistics books or many computer packages, based on nU-1 upper and nL-1 lower degrees of freedom. In SAS, the critical F value is found by using the FINV function:
Fc(α/2) = FINV(1– α/2, nU–1, nL–1) 2
2
In Example 5.1, compute F = 0.520 / 0.508 = 1.048, which leads to non-rejection of the null hypothesis of equal variances at α=0.05 based on Fc(0.025) = 3.96. As noted in Output 5.1, the p-value associated with this preliminary test is 0.9532 , which can also be obtained using the PROBF function in SAS: 2*(1–PROBF(1.048,10,9)). You can proceed with the t-test assuming equal variances because the evidence fails to contradict that assumption. If the hypothesis of equal variances is rejected, the t-test might give erroneous results. In such cases, a modified version of the t-test proposed by Satterthwaite is often used. The ‘Satterthwaite adjustment’ consists of using the statistic similar to the t-test but with an approximate degrees of freedom, carried out as follows:
test statistic:
decision rule:
t′ =
y1 – y 2 2
2
s1 + s 2 n1 n 2
reject H 0 if
t ¢ > t α/2, q
Chapter 5: The Two-Sample t-Test 75
where q represents the approximate degrees of freedom computed as follows 2 (with wi = si / ni ):
q=
(w 1 + w 2 )2 w12 w22 + (n 1 – 1) (n 2 – 1)
Although Satterthwaite’s adjustment is not needed for Example 5.1, it can be used to illustrate the computational methods as follows. Compute 2
w1 = 0.520 / 11 = 0.0246 and
2
w2 = 0.508 / 10 = 0.0258 so that q=
(0.0246 + 0.0258) 2 = 18.9 0.0246 2 0.0258 2 + 10 9
and
t′ =
0.503 – 0.284 0.520 2 0.508 2 + 11 10
= 0.975
SAS computes these quantities automatically in the output of PROC TTEST (See Output 5.1). Satterthwaite’s t-value and approximate degrees of freedom are given for Unequal under Variances in the SAS output. (See in Output 5.1.) These results should be used when the F-test for equal variances is significant, or when it is known that the population variances differ. The significance level of a statistical test will be altered if it is 5.4.2. conditioned on the results of a preliminary test. Therefore, if a t-test depends on the results of a preliminary F-test for variance homogeneity, the actual significance level might be slightly different than what is reported. This difference gets smaller as the significance level of the preliminary test increases. With this in mind, you might want to conduct the preliminary F-test for variance homogeneity at a significance level greater than 0.05, usually 0.10, 0.15, or even 0.20. For example, if 0.15 were used, Satterthwaites adjustment would be used if the F-test were significant at p < 0.15.
76
Common Statistical Methods for Clinical Research with SAS Examples
The assumption of normality can be tested by using the Shapiro-Wilk 5.4.3. test or the Kolmogorov-Smirnov test executed with the NORMAL option in PROC UNIVARIATE in SAS. Rejection of the assumption of normality in the small sample case precludes the use of the t-test. As n1 and n2 become large (generally³30), you don’t need to rely so heavily on the assumption that the data have an underlying normal distribution in order to apply the two-sample t-test. However, with non-normal data, the mean might not represent the most appropriate measure of central tendency. With a skewed distribution, for example, the median might be more representative of the distributional center than the mean. In such cases, a rank test, such as the Wilcoxon rank-sum test (Chapter 13) or the log-rank test (Chapter 21), should be considered. Note that, if within-group tests are used for the analysis instead of 5.4.4. the two-sample t-test, the researcher might reach a different, erroneous conclusion. In Example 5.1, you might hastily conclude that a significant treatment-related increase in mean FEV1 exists just by looking at the withingroup results. The two-sample t-test, however, is the more appropriate test for the comparison of between-group changes because the control group response must be factored out of the response from the active group. In interpreting the changes, you might argue that the mean change in FEV1 for the active group is comprised of two additive effects, namely the placebo effect plus a therapeutic benefit. If it can be established that randomization to the treatment groups provides effectively homogeneous groups, you might conclude that the FEV1 mean change for the active group in Example 5.1 can be broken down as follows: 0.503 = 0.284 + 0.219 (Total Effect) (Placebo Effect) (Therapeutic Effect) To validate such interpretations of the data, you would first establish baseline comparability of the two groups. The two-sample t-test can also be applied for this purpose by analyzing the baseline values in the same way the changes were analyzed in Example 5.1. While larger p-values (> 0.10) give greater credence to the null 5.4.5. hypothesis of equality, such a conclusion should not be made without considering the test’s power function, especially with small sample sizes. Statistical power is the probability of rejecting the null hypothesis when it is not true. This probability is based on the value of the parameter assumed as the alternative and, with many alternatives, a power function can be constructed. This concept is discussed briefly in Chapter 2. In checking for differences between means in either direction, 5.4.6. Example 5.1 uses a two-tailed test. If interest is restricted to differences in only one direction, a one-tailed test can be applied in the same manner as described for the one-sample t-test (Chapter 4). The probability given in the SAS output, Pr>|t|, can be halved to obtain the corresponding one-tailed p-value.
&++$$3377((55 2QH:D\$129$ ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
2QHZD\$129$DQDO\VLVRIYDULDQFH LVXVHGWRVLPXOWDQHRXVO\FRPSDUHWZRRU PRUHJURXSPHDQVEDVHGRQLQGHSHQGHQWVDPSOHVIURPHDFKJURXS7KHELJJHUWKH YDULDWLRQ DPRQJ VDPSOH JURXS PHDQV UHODWLYH WR WKH YDULDWLRQ RI LQGLYLGXDO PHDVXUHPHQWV ZLWKLQ WKH JURXSV WKH JUHDWHU WKH HYLGHQFH WKDW WKH K\SRWKHVLV RI HTXDO JURXS PHDQV LV XQWUXH 7KLV FRQFHSW LV LOOXVWUDWHG LQ $SSHQGL[ & 6HFWLRQ & ZKLFKGLVFXVVHVVRPHEDVLFFRQFHSWVRI$129$ ,Q FOLQLFDO WULDOV WKLV $129$ PHWKRG PLJKW EH DSSURSULDWH IRU FRPSDULQJ PHDQ UHVSRQVHVDPRQJDQXPEHURISDUDOOHOGRVHJURXSVRUDPRQJYDULRXVVWUDWDEDVHG RQSDWLHQWV¶EDFNJURXQGLQIRUPDWLRQVXFKDVUDFHDJHJURXSRUGLVHDVHVHYHULW\ ,QWKLVFKDSWHUDVVXPHWKHVDPSOHVDUHIURPQRUPDOO\GLVWULEXWHGSRSXODWLRQVDOO ZLWKWKHVDPHYDULDQFH
6\QRSVLV ,Q JHQHUDO WKHUH DUH N N OHYHOV RI WKH IDFWRU *5283 )URP HDFK LQGHSHQGHQWO\ VDPSOH D QXPEHU RI REVHUYDWLRQV OHWWLQJ \LM UHSUHVHQW WKH MWK PHDVXUHPHQW IURP WKH LWK JURXS DQG QL UHSUHVHQW WKH QXPEHU RI PHDVXUHPHQWV ZLWKLQ*URXSLL N 'DWDDUHFROOHFWHGDVVKRZQLQ7DEOH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(7\SLFDO/D\RXWIRUWKH2QH:D\$129$
*5283 *URXS
*URXS \ \ \Q
\ \ \Q
*URXSN
\N \N \NQN
7KH QXOO K\SRWKHVLV LV WKDW RI ³QR *URXS HIIHFW´ LH QR GLIIHUHQFH LQ PHDQ UHVSRQVHV DPRQJ JURXSV 7KH DOWHUQDWLYH K\SRWKHVLV LV WKDW ³WKH *URXS HIIHFW LV LPSRUWDQW´ LH DW OHDVW RQH SDLU RI *URXS PHDQV GLIIHUV :KHQ + LV WUXH WKH YDULDWLRQDPRQJJURXSVDQGWKHYDULDWLRQZLWKLQJURXSVDUHLQGHSHQGHQWHVWLPDWHV RIWKHVDPHPHDVXUHPHQWYDULDWLRQ DQGWKHLUUDWLRVKRXOGEHFORVHWR7KLV UDWLRLVXVHGDVWKHWHVWVWDWLVWLF)ZKLFKKDVWKH)GLVWULEXWLRQZLWKN±XSSHUDQG 1±NORZHUGHJUHHVRIIUHHGRP1 Q Q Q
N
7KHWHVWVXPPDU\LVJLYHQDV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ + QRW+
$
N
WHVWVWDWLVWLF
06* ) BBBBB 06(
GHFLVLRQUXOH
N UHMHFW+ LI)!)1N
06*LVDQHVWLPDWHRIWKHYDULDELOLW\DPRQJJURXSVDQG06(LVDQHVWLPDWHRIWKH YDULDELOLW\ZLWKLQJURXSV ,Q $129$ \RX DVVXPH µYDULDQFH KRPRJHQHLW\¶ ZKLFK PHDQV WKDW WKH ZLWKLQ JURXSYDULDQFHLVFRQVWDQWDFURVVJURXSV7KLVFDQEHH[SUHVVHGDV
N
ZKHUH GHQRWHV WKH XQNQRZQ YDULDQFH RI WKH LWK SRSXODWLRQ 7KH FRPPRQ YDULDQFH LV HVWLPDWHG E\ V ZKLFK LV D ZHLJKWHG DYHUDJH RI WKH N VDPSOH YDULDQFHV
L
V
Q - × V Q - × V Q N - × V N Q Q Q N - N
&KDSWHU2QH:D\$129$
5HFDOOWKHIRUPXODIRUWKHVDPSOHYDULDQFHIRU*URXSL7DEOH WREH Q \ LM - \L VL = QL - M- %HFDXVH V LV WKH HVWLPDWHG YDULDQFH ZLWKLQ *URXS L V UHSUHVHQWV DQ DYHUDJH ZLWKLQJURXS YDULDWLRQ RYHU DOO JURXSV ,Q $129$ V LV FDOOHG WKH PHDQ VTXDUH HUURU06( DQGLWVQXPHUDWRULVWKHVXPRIVTXDUHVIRUHUURU66( 7KHµHUURU¶LV WKHGHYLDWLRQRIHDFKREVHUYDWLRQIURPLWVJURXSPHDQ,I66(LVH[SUHVVHGDVWKH VXPRIVTXDUHGHUURUV
å L
L
66(
åå\ ± \ N
QL
LM L L M WKHQWKHSRROHGYDULDQFHV LVMXVW66( 1±N 7KHGHQRPLQDWRU1±NZKHUH1 Q Q Q LV WKH WRWDO VDPSOH VL]H RYHU DOO VDPSOHV DQG LV NQRZQ DV WKH GHJUHHVRIIUHHGRPDVVRFLDWHGZLWKWKHHUURU
N
7KH YDULDELOLW\ DPRQJ JURXSV FDQ EH PHDVXUHG E\ WKH GHYLDWLRQ RI WKH DYHUDJH REVHUYDWLRQLQHDFKJURXSIURPWKHRYHUDOODYHUDJH \ 7KDWLVWKHRYHUDOOYDULDQFH REWDLQHG E\ UHSODFLQJ HDFK REVHUYDWLRQ ZLWK LWV JURXS PHDQ \ UHSUHVHQWV WKH EHWZHHQJURXS YDULDELOLW\ 06* ,WV QXPHUDWRU LV WKH VXP RI VTXDUHV IRU JURXSV 66* FRPSXWHGDV L
66*
å Q \ ± \ N
L
L
L
ZKHUH \ LVWKHPHDQRIDOO1REVHUYDWLRQV(DFKJURXSPHDQLVWUHDWHGDVDVLQJOH REVHUYDWLRQ VR WKHUH DUH N± GHJUHHV RI IUHHGRP DVVRFLDWHG ZLWK WKH 66* 7KH PHDQVTXDUHIRUWKH*5283HIIHFWLVWKHVXPRIVTXDUHVGLYLGHGE\LWVGHJUHHVRI IUHHGRP 06* 66*N± :KHQWKHQXOOK\SRWKHVLVLVWUXHWKHYDULDWLRQEHWZHHQJURXSVVKRXOGEHWKHVDPH DVWKHYDULDWLRQZLWKLQJURXSV7KHUHIRUHXQGHU+ WKHWHVWVWDWLVWLF)VKRXOGEH FORVHWRDQGKDVDQ)GLVWULEXWLRQZLWKN±XSSHUGHJUHHVRIIUHHGRPDQG1±N ORZHUGHJUHHVRIIUHHGRP&ULWLFDO)YDOXHVEDVHGRQWKH)GLVWULEXWLRQDUHXVHGWR GHWHUPLQHWKHUHMHFWLRQUHJLRQ6HH6HFWLRQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
([DPSOHV
p ([DPSOH+$0$6FRUHVLQ*$' $QHZVHURWRQLQXSWDNHLQKLELWLQJDJHQW61;LVEHLQJVWXGLHGLQVXEMHFWV ZLWKJHQHUDODQ[LHW\GLVRUGHU*$' )LIW\WZRVXEMHFWVGLDJQRVHGZLWK*$' RIPRGHUDWHRUJUHDWHUVHYHULW\FRQVLVWHQWZLWKWKH³'LDJQRVWLFDQG6WDWLVWLFDO 0DQXDOUG(GLWLRQ´'60,,,5 ZHUHHQUROOHGDQGUDQGRPO\DVVLJQHGWR RQHRIWKUHHWUHDWPHQWJURXSVPJ61;PJ61;RUSODFHER $IWHUZHHNVRIRQFHGDLO\RUDOGRVLQJLQDGRXEOHEOLQGIDVKLRQDWHVW EDVHGRQWKH+DPLOWRQ5DWLQJ6FDOHIRU$Q[LHW\+$0$ ZDVDGPLQLVWHUHG 7KLVWHVWFRQVLVWVRIDQ[LHW\UHODWHGLWHPVHJ DQ[LRXVPRRG WHQVLRQ
LQVRPQLD IHDUV HWF HDFKUDWHGE\WKHVXEMHFWDV QRWSUHVHQW PLOG
PRGHUDWH VHYHUH RU YHU\VHYHUH +$0$WHVWVFRUHVZHUHIRXQGE\ VXPPLQJWKHFRGHGYDOXHVRIDOOLWHPVXVLQJWKHQXPHULFFRGLQJVFKHPHRI IRU QRWSUHVHQW IRU PLOG IRU PRGHUDWH IRU VHYHUH DQGIRU YHU\ VHYHUH 7KHGDWDDUHSUHVHQWHGLQ7DEOH$UHWKHUHDQ\GLIIHUHQFHVLQ PHDQ+$0$WHVWVFRUHVDPRQJWKHWKUHHJURXSV" 7$%/(5DZ'DWDIRU([DPSOH
/R'RVHPJ
+L'RVHPJ
3ODFHER
3DWLHQW 1XPEHU
+$0$
3DWLHQW 1XPEHU
+$0$
3DWLHQW 1XPEHU
+$0$
QD
QD
QD
QD
&KDSWHU2QH:D\$129$
6ROXWLRQ
3DWLHQWVZKRGURSSHGRXWZLWKQRGDWD1RV DUHH[FOXGHGIURP WKHDQDO\VLV$UELWUDULO\DVVLJQLQJVXEVFULSWVIRU/R'RVHIRU+L'RVHDQG IRU 3ODFHER WKH JURXS VXPPDU\ VWDWLVWLFV IRU WKH +$0$ VFRUHV DUH VKRZQ LQ 7DEOH 7$%/(6XPPDU\6WDWLVWLFVE\7UHDWPHQW*URXSIRU([DPSOH
/R'RVH L
+L'RVH L
3ODFHER L
2YHUDOO
í \
V
Q
L
L
L
&RPSXWH
*5283
66* ± ± ±
ZLWKN JURXSVVRWKDW
06* 66*N±
$OVR
66( ± ± ± ± ± ± ± ± ±
ZLWK1±N ± GHJUHHVRIIUHHGRPVRWKDW06( 7R FKHFN WKH FDOFXODWLRQV \RX FDQ DOVR FRPSXWH WKH 06( DOWHUQDWLYHO\ DV WKH ZHLJKWHGDYHUDJHRIWKHJURXSYDULDQFHV
06(
× × ×
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHWHVWVXPPDU\FRQGXFWHGDWDVLJQLILFDQFHOHYHORI LVVKRZQDV
QXOOK\SRWKHVLV
+
DOWK\SRWKHVLV
+ QRW+
WHVWVWDWLVWLF
)
GHFLVLRQUXOH
UHMHFW+ LI)!)
FRQFOXVLRQ
EHFDXVH ! \RX UHMHFW + DQG FRQFOXGH WKDW WKHUH LV D VLJQLILFDQW GLIIHUHQFH LQ PHDQ +$0 $VFRUHVDPRQJWKHGRVHJURXSV
$
6$6$QDO\VLVRI([DPSOH 7KHQH[WIRXUSDJHVSURYLGHWKHFRGHDQGRXWSXWIRUWKHDQDO\VLVRI([DPSOH XVLQJWKH*/0SURFHGXUHLQ6$6 7KH VXPPDU\ VWDWLVWLFV DUH ILUVW REWDLQHG IRU HDFK GRVH JURXS E\ XVLQJ 352& 0($16å7KHVHVWDWLVWLFVDUHDOVRVXPPDUL]HGLQ7DEOH 7KH NH\ WR XVLQJ 352& */0 LV LQ FRUUHFWO\ VSHFLI\LQJ WKH 02'(/ VWDWHPHQW ZKLFKOLVWVWKHUHVSRQVHYDULDEOHRQWKHOHIWVLGHRIWKHHTXDOVLJQDQGWKHPRGHO IDFWRUVRQWKHULJKWVLGHRIWKHHTXDOVLJQ,QWKHRQHZD\$129$WKHUHLVRQO\RQH PRGHOIDFWRU,QWKLVFDVHWKHPRGHOIDFWRULV'RVH*URXSYDULDEOH '26(*53 7KHPRGHOIDFWRUPXVWDOVREHVSHFLILHGLQWKH&/$66VWDWHPHQWWRLQGLFDWHLWLVD FODVVLILFDWLRQIDFWRUUDWKHUWKDQDQXPHULFFRYDULDWH 7KH RXWSXW VKRZV WKH 06( WKH 06* ê DQG WKH )WHVW IRU WKH 'RVH *URXS HIIHFW RI ZKLFK FRUURERUDWH WKH PDQXDO FDOFXODWLRQV 7KH SYDOXH RI LQGLFDWHVWKDWWKHPHDQ+$0$VFRUHVGLIIHUVLJQLILFDQWO\DPRQJ 'RVH *URXSV 7KH 0($16 VWDWHPHQW ñ LV XVHG WR REWDLQ PXOWLSOH FRPSDULVRQ UHVXOWV XVLQJ ERWK WKH SDLUZLVH WWHVW ò DQG 'XQQHWW¶V WHVW ô %RWK RI WKHVH PHWKRGVUHYHDOVLJQLILFDQWGLIIHUHQFHVEHWZHHQHDFKDFWLYHJURXSDQGSODFHER$ &2175$67 VWDWHPHQW LV DOVR LQFOXGHG õ WR LOOXVWUDWH D FXVWRPL]HG WHVW VHH 6HFWLRQ
&KDSWHU2QH:D\$129$
6$6&RGHIRU([DPSOH DATA GAD; INPUT PATNO DOSEGRP $ HAMA @@; DATALINES; 101 LO 21 104 LO 18 106 LO 19 110 LO . 112 LO 28 116 LO 22 120 LO 30 121 LO 27 124 LO 28 125 LO 19 130 LO 23 136 LO 22 137 LO 20 141 LO 19 143 LO 26 148 LO 35 152 LO . 103 HI 16 105 HI 21 109 HI 31 111 HI 25 113 HI 23 119 HI 25 123 HI 18 127 HI 20 128 HI 18 131 HI 16 135 HI 24 138 HI 22 140 HI 21 142 HI 16 146 HI 33 150 HI 21 151 HI 17 102 PB 22 107 PB 26 108 PB 29 114 PB 19 115 PB . 117 PB 33 118 PB 37 122 PB 25 126 PB 28 129 PB 26 132 PB . 133 PB 31 134 PB 27 139 PB 30 144 PB 25 145 PB 22 147 PB 36 149 PB 32 ; PROC SORT DATA = GAD; BY DOSEGRP; PROC MEANS MEAN STD N DATA = GAD; BY DOSEGRP; VAR HAMA; TITLE1 ’One-Way ANOVA’; TITLE2 ’EXAMPLE 6.1: HAM-A Scores in GAD’; RUN;
å
PROC GLM DATA = GAD; CLASS DOSEGRP; MODEL HAMA = DOSEGRP; MEANS DOSEGRP/T DUNNETT(‘PB’); ñ CONTRAST ‘ACTIVE vs. PLACEBO’ DOSEGRP 0.5 0.5 –1; RUN;
õ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH One-Way ANOVA EXAMPLE 6.1: HAM-A Scores in GAD --------------------------- DOSEGRP=HI -----------------------------The MEANS Procedure Analysis Variable : HAMA
å
Mean Std Dev N ---------------------------------21.5882353 4.9630991 17 ------------------------------------------------------------ DOSEGRP=LO -----------------------------Analysis Variable : HAMA Mean Std Dev N ---------------------------------23.8000000 4.9742192 15 ------------------------------------------------------------ DOSEGRP=PB -----------------------------Analysis Variable : HAMA Mean Std Dev N ---------------------------------28.0000000 5.0332230 16 ----------------------------------
The GLM Procedure Class Level Information Class DOSEGRP
Levels 3
Values HI LO PB
Number of observations
52
NOTE: Due to missing values, only 48 observations can be used in this analysis.
&KDSWHU2QH:D\$129$
2873876$62XWSXWIRU([DPSOHFRQWLQXHG One-Way ANOVA EXAMPLE 6.1: HAM-A Scores in GAD The GLM Procedure Dependent Variable: HAMA
Source
Sum of Squares
DF
Model 2 Error 45 Corrected Total 47
Mean Square
347.149020 1120.517647 1467.666667
R-Square 0.236531
173.574510 24.900392
Coeff Var 20.43698
F Value
Pr > F
6.97
0.0023
Root MSE 4.990029
HAMA Mean 24.41667
Source DOSEGRP
DF 2
Type I SS 347.1490196
Mean Square 173.5745098
F Value 6.97
Pr > F 0.0023
Source DOSEGRP
DF 2
Type III SS 347.1490196
Mean Square 173.5745098
F Value 6.97
Pr > F 0.0023
ê ò
t Tests (LSD) for HAMA
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05 Error Degrees of Freedom 45 Error Mean Square 24.90039 Critical Value of t 2.01410
Comparisons significant at the 0.05 level are indicated by ***.
DOSEGRP Comparison PB PB LO LO HI HI
-
LO HI PB HI PB LO
Difference Between Means 4.200 6.412 -4.200 2.212 -6.412 -2.212
95% Confidence Limits 0.588 2.911 -7.812 -1.349 -9.912 -5.772
7.812 9.912 -0.588 5.772 -2.911 1.349
*** *** *** ***
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG One-Way ANOVA EXAMPLE 6.1: HAM-A Scores in GAD The GLM Procedure
ô
Dunnett’s t Tests for HAMA NOTE: This test controls the Type I experimentwise error for comparisons of all treatments against a control. Alpha 0.05 Error Degrees of Freedom 45 Error Mean Square 24.90039 Critical Value of Dunnett’s t 2.28361 Comparisons significant at the 0.05 level are indicated by ***.
DOSEGRP Comparison LO HI
- PB - PB
Difference Between Means -4.200 -6.412
Simultaneous 95% Confidence Limits -8.295 -10.381
-0.105 -2.443
*** ***
Dependent Variable: HAMA Contrast ACTIVE vs. PLACEBO
Mean Square
F Value
Pr > F
1 299.9001075
299.9001075
12.04
0.0012
õ
'HWDLOV 1RWHV
DF Contrast SS
7KH SDUDPHWHUV DVVRFLDWHG ZLWK WKH )GLVWULEXWLRQ DUH WKH XSSHU DQGORZHUGHJUHHVRIIUHHGRP0DQ\HOHPHQWDU\VWDWLVWLFDOWH[WVSURYLGHWDEOHV RI FULWLFDO )YDOXHV DVVRFLDWHG ZLWK D IL[HG VLJQLILFDQFH OHYHO XVXDOO\ IRUYDULRXVFRPELQDWLRQVRIYDOXHVIRUWKHXSSHUDQGORZHUGHJUHHV RIIUHHGRP)YDOXHVRUDVVRFLDWHGSUREDELOLWLHVIRUNQRZQGHJUHHVRIIUHHGRP FDQ DOVR EH IRXQG E\ XVLQJ PRVW VWDWLVWLFDO SURJUDPV ,Q 6$6 WKH ),19 IXQFWLRQ ± XGI OGI ZKHUH XGI DQG OGI UHSUHVHQW WKH XSSHU DQG ORZHU GHJUHHV RI IUHHGRP UHVSHFWLYHO\ LV XVHG WR REWDLQ WKH FULWLFDO)YDOXHVIRUD VLJQLILFDQFHOHYHORI :KHQFRQGXFWLQJDQDO\VLVRIYDULDQFHXVLQJDQ$129$WDEOHLV DWUDGLWLRQDODQGZLGHO\XVHGPHWKRGRIVXPPDUL]LQJWKHUHVXOWV7KH$129$ VXPPDU\LGHQWLILHVHDFKVRXUFHRIYDULDWLRQWKHGHJUHHVRIIUHHGRPWKHVXP RI VTXDUHV WKH PHDQ VTXDUH DQG WKH )VWDWLVWLF $Q $129$ WDEOH LV D FRQYHQWLRQDOZD\RIRUJDQL]LQJUHVXOWVIRUDQDO\VHVWKDWXVHPDQ\IDFWRUV,Q ([DPSOH WKHUH LV RQO\ RQH PRGHO IDFWRU 'RVH *URXS EXW LW LV VWLOO
&KDSWHU2QH:D\$129$
FRQYHQLHQW WR XVH WKH $129$ WDEOH IRUPDW WR VXPPDUL]H WKH UHVXOWV 7KH $129$WDEOHIRUWKLVH[DPSOHLV
$129$
GI
66
06
)
'RVH*URXS
(UURU
7RWDO
6LJQLILFDQWS
6RXUFH
$RQHZD\$129$UHTXLUHVWKHDVVXPSWLRQVRI QRUPDOO\GLVWULEXWHGGDWD LQGHSHQGHQWVDPSOHVIURPHDFKJURXS YDULDQFHKRPRJHQHLW\DPRQJJURXSV 7KH6KDSLUR:LONWHVWRU.ROPRJRURY6PLUQRIIWHVWIRUQRUPDOLW\DQG%DUWOHWW V WHVWIRUYDULDQFHKRPRJHQHLW\DUHVRPHRIWKHPDQ\IRUPDOWHVWVDYDLODEOHIRU GHWHUPLQLQJ ZKHWKHU WKH VDPSOH GDWD DUH FRQVLVWHQW ZLWK WKHVH DVVXPSWLRQV 7KHVH WHVWV FDQ EH FRQGXFWHG LQ 6$6 XVLQJ RSWLRQV DYDLODEOH LQ 352& 81,9$5,$7(DQG352&*/0*UDSKLFVRIWKHGDWDVXFKDVKLVWRJUDPVRU SORWV RI WKH UHVLGXDOV RIWHQ SURYLGH DQ LQIRUPDO EXW FRQYHQLHQW PHDQV RI LGHQWLI\LQJGHSDUWXUHVIURPWKH$129$DVVXPSWLRQV ,Q WKH HYHQW RI VLJQLILFDQW SUHOLPLQDU\ WHVWV WKDW LGHQWLI\ GHSDUWXUHV IURP WKH DVVXPSWLRQVDGDWDWUDQVIRUPDWLRQPLJKWEHDSSURSULDWH/RJDULWKPLFDQGUDQN WUDQVIRUPDWLRQVDUHSRSXODUWUDQVIRUPDWLRQVXVHGLQFOLQLFDOGDWDDQDO\VLVVHH $SSHQGL[ ) 7KH .UXVNDO:DOOLV WHVW &KDSWHU PD\ EH XVHG DV DQ DOWHUQDWLYHWRWKHRQHZD\$129$ZKHQQRUPDOLW\FDQQRWEHDVVXPHG :KHQ FRPSDULQJ PRUH WKDQ WZR PHDQV N ! D VLJQLILFDQW ) WHVWLQGLFDWHVWKDWDWOHDVWRQHSDLURIPHDQVGLIIHUVEXWZKLFKSDLURUSDLUV LVQRWLGHQWLILHGE\$129$,IWKHQXOOK\SRWKHVLVRIHTXDOPHDQVLVUHMHFWHG IXUWKHUDQDO\VLVPXVWEHXQGHUWDNHQWRLQYHVWLJDWHZKHUHWKHGLIIHUHQFHVOLH 1XPHURXVPXOWLSOHFRPSDULVRQSURFHGXUHVDUHDYDLODEOHIRUFRQGXFWLQJVXFK LQYHVWLJDWLRQV )RU QRZ DWWHQWLRQ LV FRQILQHG WR WZR FRPPRQO\XVHG PXOWLSOH FRPSDULVRQ PHWKRGVRQHPHWKRGLVDYHU\VLPSOHDSSURDFKWRDOOSDLUZLVHWHVWVDQGRQH PHWKRG LV D YHU\ XVHIXO ZD\ WR FRPSDUH QXPHURXV JURXSV ZLWK D FRQWURO 0XOWLSOH FRPSDULVRQV DUH GLVFXVVHG LQ PRUH GHWDLO LQ $SSHQGL[ ( 7KH VLPSOHVW DSSURDFK WR PXOWLSOH FRPSDULVRQV LV WR FRQGXFW SDLUZLVH WWHVWV IRU HDFKSDLURIWUHDWPHQWV7KLVPHWKRGXVHVWKHDSSURDFKRIWKHWZRVDPSOHWWHVW GLVFXVVHGLQ&KDSWHUEXWUDWKHUWKDQXVLQJWKHSRROHGVWDQGDUGGHYLDWLRQRI
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
RQO\ WKH WZR JURXSV EHLQJ FRPSDUHG WKH SRROHG VWDQGDUG GHYLDWLRQ RI DOO N JURXSV LV XVHG LH WKH VTXDUH URRW RI WKH 06( IURP WKH $129$ 7KLV PHWKRGFDQEHFDUULHGRXWLQ6$6XVLQJWKH7RSWLRQLQWKH0($16VWDWHPHQW LQ352&*/0
$QRWKHUDSSURDFKWRPXOWLSOHFRPSDULVRQVZKLFKFRPSDUHVWUHDWPHQWPHDQV ZLWK D FRQWURO JURXS LV FDOOHG 'XQQHWW¶V WHVW 6SHFLDO WDEOHV DUH QHHGHG WR FRQGXFW WKLV WHVW PDQXDOO\ 7KH UHVXOWV DUH SULQWHG E\ 6$6 LI \RX XVH WKH '811(77 RSWLRQ LQ WKH 0($16 VWDWHPHQW LQ 352& */0 RU XVH WKH 352%0&IXQFWLRQLQWKH'$7$VWHS 7KH7DQG'811(77RSWLRQVDUHVKRZQLQ([DPSOH7KHRXWSXWIRUWKH SDLUZLVHWWHVWV/6' òVKRZVVLJQLILFDQWGLIIHUHQFHVLQPHDQ+$0$VFRUHV EHWZHHQ WKH SODFHER JURXS DQG HDFK RI WKH DFWLYHGRVH JURXSV EXW QR GLIIHUHQFH EHWZHHQ WKH ORZ DQG KLJKGRVH JURXSV 7KH UHVXOWV XVLQJ '811(77 ô DOVR VKRZ VLJQLILFDQW FRPSDULVRQV RI HDFK DFWLYH WUHDWPHQW JURXSYHUVXVWKHFRQWUROJURXSSODFHER 7KH ELJJHVW GUDZEDFN RI XVLQJ SDLUZLVH WWHVWV LV WKH HIIHFW RQ WKH RYHUDOO VLJQLILFDQFH OHYHO DV GHVFULEHG LQ &KDSWHU ,I HDFK SDLUZLVH WHVW LV FRQGXFWHGDWDVLJQLILFDQFHOHYHORI WKHFKDQFHRIKDYLQJDWOHDVWRQH HUURQHRXV FRQFOXVLRQ DPRQJ DOO WKH FRPSDULVRQV SHUIRUPHG LQFUHDVHV ZLWK WKHQXPEHURIJURXSVN$SSURDFKHVWRPXOWLSOHFRPSDULVRQVWKDWFRQWUROWKH RYHUDOOHUURUUDWHKDYHEHHQGHYHORSHGDQGLWLVLPSRUWDQWWRFRQVLGHUWKHXVH RIWKHVHPHWKRGVZKHQWKHVWXG\XVHVDODUJHUQXPEHURIJURXSV'XQQHWW¶VWHVW FRQWUROV WKH RYHUDOO VLJQLILFDQFH OHYHO EHFDXVH LW XVHV DQ µH[SHULPHQWZLVH HUURUUDWH¶DVQRWHGLQ2XWSXW 1RWH)RUDFRPSUHKHQVLYHGLVFXVVLRQRIWKHQXPHURXVRWKHUDSSURDFKHV WRPXOWLSOHFRPSDULVRQVDQH[FHOOHQWUHIHUHQFHLV³0XOWLSOH &RPSDULVRQVDQG0XOWLSOH7HVWV8VLQJWKH6$66\VWHP´E\:HVWIDOO 7RELDV5RPDQG:ROILQJHUZKLFKLVSDUWRIWKH6$6%%8OLEUDU\
:KHQ FRQGXFWLQJ DQ $129$ FXVWRPL]HG FRPSDULVRQV DPRQJ FRPELQDWLRQVRIWKHJURXSPHDQVFDQEHPDGH7KLVLVGRQHE\WHVWLQJZKHWKHU VSHFLILF OLQHDU FRPELQDWLRQV RI WKH JURXS PHDQV FDOOHG µOLQHDU FRQWUDVWV¶ GLIIHUIURP ,Q ([DPSOH D &2175$67 VWDWHPHQWLVLQFOXGHGLQWKH6$6DQDO\VLVWR FRPSDUHPHDQVFRUHVEHWZHHQµ$FWLYH¶DQGµ3ODFHER¶ õµ$FWLYH¶UHSUHVHQWV WKH SRROHG UHVXOWV IURP ERWK DFWLYHGRVH JURXSV DQG WKLV RYHUDOO PHDQ LV FRPSDUHGWRWKHUHVXOWVIURPWKHSODFHERJURXS7KH&2175$67VWDWHPHQWLQ 352& */0 LV IROORZHG E\ D GHVFULSWLYH ODEHO LQ WKLV FDVH $&7,9( YV 3/$&(%2 WKHQWKHQDPHRIWKHHIIHFW'26(*53 DQGILQDOO\WKHFRQWUDVW VSHFLILFDWLRQ7KHFRQWUDVWLVVLPSO\DYHFWRUZKRVHHOHPHQWVDUHFRHIILFLHQWV RI WKH SDUDPHWHUV ZKLFK UHSUHVHQW WKH OHYHOV RI WKH HIIHFW EHLQJ WHVWHG %HFDXVHWKHHIIHFW'26(*53KDVWKUHHOHYHOVZHLQFOXGHWKHFRHIILFLHQWV DQG±WKHRUGHURIZKLFKPXVWFRUUHVSRQGWRWKHRUGHUWKDW6$6
&KDSWHU2QH:D\$129$
XVHV HLWKHU DOSKDEHWLFDO HJ +, /2 3% RU DV VSHFLILHG LQ DQ 25'(5 VWDWHPHQW7KLVVSHFLILFDWLRQWHOOV6$6WRWHVW
+ +L'RVHPHDQ/R'RVHPHDQ 3ODFHERPHDQ
7KHRXWSXWVKRZVDVLJQLILFDQWFRQWUDVWZLWKDQ)YDOXHRIS 0XOWLSOH FRPSDULVRQ UHVXOWV FDQ VRPHWLPHV EH FRQWUDU\ WR \RXU LQWXLWLRQ )RU H[DPSOH OHW \ \ \ DQG VXSSRVH WKH DQDO\VLV GHWHUPLQHV WKDW WKH *URXS $ PHDQ LV QRW VWDWLVWLFDOO\ GLIIHUHQW IURP WKDW RI *URXS % ZKLFK LV QRW GLIIHUHQW IURP &
\ ±W L
1
-N
× 06(
± ×
= -
6LPLODUO\FRQILGHQFHLQWHUYDOVFDQEHIRXQGIRUWKH/R'RVHJURXS ± DQG IRU WKH 3ODFHER JURXS ± 6XFK LQWHUYDOV DUH RIWHQ GHSLFWHGLQJUDSKLFVLQPHGLFDOMRXUQDODUWLFOHV FRQILGHQFH LQWHUYDOV FDQ DOVR EH REWDLQHG IRU WKH PHDQ GLIIHUHQFHEHWZHHQDQ\SDLURIJURXSVHJ*URXSLYV*URXSM E\XVLQJWKH IRUPXOD M
1
-N
06(
+ Q Q M
,Q([DPSOHDFRQILGHQFHLQWHUYDOIRUWKHGLIIHUHQFHLQPHDQ+$0$ VFRUHVEHWZHHQWKH3ODFHERDQG/R'RVHJURXSVLV
ö ÷ ø è æ
± × ç RU WR
L
L
Q
,Q([DPSOHDFRQILGHQFHLQWHUYDOIRUWKH+L'RVHJURXSLV
L
&
6RPHWLPHV ZKHQ D VLJQLILFDQW )WHVW IRU JURXSV LV IRXQG LW LV SUHIHUDEOH WR UHSRUW WKH FRQILGHQFH LQWHUYDOV IRU WKH JURXS PHDQ UHVSRQVHV UDWKHUWKDQSHUIRUPPXOWLSOHFRPSDULVRQV$FRQILGHQFHLQWHUYDOIRUWKH PHDQRI*URXSLLVIRXQGE\
\ - \ ± W
%
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$ FRQILGHQFH LQWHUYDO IRU WKH GLIIHUHQFH LQ PHDQV WKDW GRHV QRW FRQWDLQ LV LQGLFDWLYHRIVLJQLILFDQWO\GLIIHUHQWPHDQV 0DNLQJ LQIHUHQFHV EDVHG RQ D VHULHV RI FRQILGHQFH LQWHUYDOV FDQ OHDG WR WKH VDPHW\SHRILQIODWLRQRIWKHRYHUDOOVLJQLILFDQFHOHYHODVZDVGLVFXVVHGZLWK K\SRWKHVLVWHVWLQJ$GMXVWPHQWVWRWKHLQWHUYDOZLGWKVFDQEHPDGHZKHQXVLQJ VLPXOWDQHRXV FRQILGHQFH LQWHUYDOV E\ UHVRUWLQJ WR WKH VDPH PHWKRGRORJ\ GHYHORSHGIRUPXOWLSOHFRPSDULVRQLQIHUHQFHV ,IWKHUHDUHRQO\WZRJURXSVN WKHSYDOXHIRU*URXSHIIHFW XVLQJ DQ $129$ LV WKH VDPH DV WKDW RI D WZRVDPSOH WWHVW 7KH ) DQG W GLVWULEXWLRQVHQMR\WKHUHODWLRQVKLSWKDWZLWKXSSHUGHJUHHRIIUHHGRPWKH) VWDWLVWLFLVWKHVTXDUHRIWKHWVWDWLVWLF:KHQN WKH06(FRPSXWHGDVD SRROHG FRPELQDWLRQ RI WKH JURXS VDPSOH PHDQV LV LGHQWLFDO WR WKH SRROHG YDULDQFHV ZKLFKLVXVHGLQWKHWZRVDPSOHWWHVW&KDSWHU
S
0DQXDO FRPSXWDWLRQV ZLWK D KDQG FDOFXODWRU FDQ EH IDFLOLWDWHG ZLWKVWDQGDUGFRPSXWLQJIRUPXODVDVIROORZV
7RWDOIRU*URXSL
2YHUDOO7RWDO
å< * = å* * = L
L
L
L
* 1
&RUUHFWLRQ)DFWRU
&=
7RWDO6XPRI6TXDUHV
72766 =
åå \ L
6XPRI6TXDUHVIRU*URXSV
66* =
å L
6XPRI6TXDUHVIRU(UURU
M
LM
-&
* -& Q L
L
66( 72766 ±66*
6$6FRPSXWHVWKHVXPRIVTXDUHVLQIRXUZD\V7KHVHDUHFDOOHG WKH7\SH,,,,,,DQG,9VXPVRIVTXDUHV&RPSXWDWLRQDOGLIIHUHQFHVDPRQJ WKHVHW\SHVDUHEDVHGRQWKHPRGHOXVHGDQGWKHPLVVLQJYDOXHVWUXFWXUH)RU WKHRQHZD\$129$DOOIRXUW\SHVRIVXPVRIVTXDUHVDUHLGHQWLFDOVRLWGRHV QRW PDWWHU ZKLFK W\SH LV VHOHFWHG 6$6 SULQWV WKH 7\SH , DQG ,,, UHVXOWV E\ GHIDXOW 7KHVH GLIIHUHQW W\SHV RI VXPV RI VTXDUHV DUH GLVFXVVHG LQ &KDSWHU DQG$SSHQGL['
&++$$3377((55 7ZR:D\$129$ ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ 7KH WZRZD\ $129$ LV D PHWKRG IRU VLPXOWDQHRXVO\ DQDO\]LQJ WZR IDFWRUV WKDW DIIHFW D UHVSRQVH $V LQ WKH RQHZD\ $129$ WKHUH LV D JURXS HIIHFW VXFK DV WUHDWPHQW JURXS RU GRVH OHYHO 7KH WZRZD\ $129$ DOVR LQFOXGHV DQRWKHU LGHQWLILDEOH VRXUFH RI YDULDWLRQ FDOOHG D EORFNLQJ IDFWRU ZKRVH YDULDWLRQ FDQ EH VHSDUDWHGIURPWKHHUURUYDULDWLRQWRJLYHPRUHSUHFLVHJURXSFRPSDULVRQV)RUWKLV UHDVRQ WKH WZRZD\ $129$ OD\RXW LV VRPHWLPHV FDOOHG D µUDQGRPL]HG EORFN GHVLJQ¶ %HFDXVHFOLQLFDOVWXGLHVRIWHQXVHIDFWRUVVXFKDVVWXG\FHQWHUJHQGHUGLDJQRVWLF JURXS RU GLVHDVH VHYHULW\ DV D VWUDWLILFDWLRQ RU EORFNLQJ IDFWRU WKH WZRZD\ $129$ LV RQH RI WKH PRVW FRPPRQ $129$ PHWKRGV XVHG LQ FOLQLFDO GDWD DQDO\VLV 7KHEDVLFLGHDVXQGHUO\LQJWKHWZRZD\$129$DUHJLYHQLQ$SSHQGL[&6HFWLRQ &
6\QRSVLV ,QJHQHUDOWKHUDQGRPL]HGEORFNGHVLJQKDVJJ OHYHOVRIDµJURXS¶IDFWRUDQG E E OHYHOV RI D µEORFN¶ IDFWRU $Q LQGHSHQGHQW VDPSOH RI PHDVXUHPHQWV LV WDNHQIURPHDFKRIWKHJ EFHOOVIRUPHGE\WKHJURXSEORFNFRPELQDWLRQV/HWQ UHSUHVHQWWKHQXPEHURIPHDVXUHPHQWVWDNHQLQ*URXSLDQG%ORFNM&HOOL±M DQG OHW1UHSUHVHQWWKHQXPEHURIPHDVXUHPHQWVRYHUDOOJ [ E FHOOV/HWWLQJ\ GHQRWH WKHN UHVSRQVHLQ&HOOL±MN Q WKHJHQHUDOOD\RXWRIWKHUDQGRPL]HG EORFNGHVLJQLVVKRZQLQ7DEOH LM
LMN
WK
LM
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(5DQGRPL]HG%ORFN/D\RXW
*URXS
*URXS
«
*URXSJ
%ORFN
\\«\Q \\«\Q
«
\J\J«\JQJ
%ORFN
\\«\Q \\«\Q
«
\J\J«\JQJ
«
«
«
\JE\JE«\JEQJE
« %ORFNE
«
«
\E\E«\EQE \E\E«\EQE
7KHJHQHUDOHQWULHVLQDWZRZD\$129$VXPPDU\WDEOHDUHUHSUHVHQWHGDVVKRZQ LQ7DEOH
7$%/($129$6XPPDU\7DEOHIRUWKH7ZR:D\$129$ 6RXUFH
66
06
)
*URXS*
J±
66*
06*
)* 06*06(
%ORFN%
E±
66%
06%
)% 06%06(
06*%
)*% 06*%06(
*[% LQWHUDFWLRQ
GI
J± E± 66*%
(UURU
1±JE
66(
06(
7RWDO
1±
72766
66UHSUHVHQWVWKHVXPRIVTXDUHGGHYLDWLRQVDVVRFLDWHGZLWKWKHIDFWRUOLVWHGXQGHU 6RXUFH 7KHVH DUH FRPSXWHG LQ DZD\VLPLODUWRWKDWVKRZQLQ&KDSWHUIRUWKH RQHZD\$129$ 7KHPHDQVTXDUH06 LVIRXQGE\GLYLGLQJWKH66E\WKHGHJUHHVRIIUHHGRP7KH 06 UHSUHVHQWV D PHDVXUH RI YDULDELOLW\ DVVRFLDWHG ZLWK WKH IDFWRU OLVWHG XQGHU VRXUFH:KHQWKHUHLVQRHIIHFWGXHWRWKHVSHFLILHGIDFWRUWKLVYDULDELOLW\UHIOHFWV PHDVXUHPHQWHUURUYDULDELOLW\ ZKLFKLVDOVRHVWLPDWHGE\06(
7KH)YDOXHVDUHUDWLRVRIWKHHIIHFWPHDQVTXDUHVWRWKHPHDQVTXDUHHUURU06( 8QGHUWKHQXOOK\SRWKHVLVRIQRHIIHFWWKH)UDWLRVKRXOGEHFORVHWR7KHVH) YDOXHV DUH XVHG DV WKH WHVW VWDWLVWLFV IRU WHVWLQJ WKH QXOO K\SRWKHVLV RI QR PHDQ GLIIHUHQFHVDPRQJWKHOHYHOVRIWKHIDFWRU
&KDSWHU7ZR:D\$129$
7KH)WHVWIRUJURXS)* WHVWVWKHSULPDU\K\SRWKHVLVRIQRJURXSHIIHFW'HQRWLQJ WKHPHDQIRUWKHL JURXSE\ WKHWHVWVXPPDU\LV WK
L
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ + 127+
$
J
WHVWVWDWLVWLF
06* ) BBBB 06(
GHFLVLRQUXOH
UHMHFW+ LI) ! )1 -J
*
J -
*
7KH)WHVWIRUWKHEORFNHIIHFW)% SURYLGHVDVHFRQGDU\WHVWZKLFKLVXVHGLQD VLPLODU ZD\ WR GHWHUPLQH LI WKH PHDQ UHVSRQVHV GLIIHU DPRQJ EORFNLQJ OHYHOV $ VLJQLILFDQWEORFNHIIHFWRIWHQUHVXOWVLQDVPDOOHUHUURUYDULDQFH06( DQGJUHDWHU SUHFLVLRQIRUWHVWLQJWKHSULPDU\K\SRWKHVLVRI³QRJURXSHIIHFW´WKDQLIWKHEORFN HIIHFWZHUHLJQRUHG 7KH *URXS[%ORFN IDFWRU * [% UHSUHVHQWV WKH VWDWLVWLFDO LQWHUDFWLRQ EHWZHHQ WKH WZRPDLQHIIHFWV,IWKH)WHVWIRULQWHUDFWLRQLVVLJQLILFDQWWKLVUHVXOWLQGLFDWHVWKDW WUHQGVDFURVVJURXSVGLIIHUDPRQJWKHOHYHOVRIWKHEORFNLQJIDFWRU7KLVLVXVXDOO\ WKH ILUVW WHVW RI LQWHUHVW LQ D WZRZD\ $129$ EHFDXVH WKH WHVW IRU JURXS HIIHFWV PLJKW QRW EH PHDQLQJIXO LQ WKH SUHVHQFH RI D VLJQLILFDQW LQWHUDFWLRQ ,I WKH LQWHUDFWLRQLVVLJQLILFDQWIXUWKHUDQDO\VLVPLJKWEHUHTXLUHGVXFKDVDSSOLFDWLRQRI DRQHZD\$129$WRFRPSDUHJURXSVZLWKLQHDFKOHYHORIWKHEORFNLQJIDFWRU ,QDWZRZD\$129$\RXDVVXPHWKDWWKHVDPSOHVZLWKLQHDFKFHOODUHQRUPDOO\ GLVWULEXWHGZLWKWKHVDPHYDULDQFH
LM
ååQ - × V J
E
LM
V
L M
1 - J × E
LM
7KH QXPHUDWRU RI WKLV TXDQWLW\ LV WKH VXP RI VTXDUHV IRU HUURU 66( EDVHG RQ 1±JÂE GHJUHHVRIIUHHGRP *HQHUDOO\WKHHIIHFWVXPVRIVTXDUHV66 DUHFRPSXWHGLQDZD\WKDWLVVLPLODUWR WKDWVKRZQIRUWKHRQHZD\$129$)RUH[DPSOHWKHVXPRIVTXDUHVIRUµ*URXS¶ 66* ZKLFKUHSUHVHQWVWKHYDULDELOLW\DPRQJJURXSOHYHOVLVEDVHGRQWKHVXPRI VTXDUHGGHYLDWLRQVRIWKHJURXSPHDQVIURPWKHRYHUDOOPHDQ,QWKHVDPHZD\WKH VXPRIVTXDUHVIRUµ%ORFN¶66% ZKLFKUHSUHVHQWVWKHYDULDELOLW\DPRQJWKHEORFN OHYHOV LV EDVHG RQ WKH VXP RI VTXDUHG GHYLDWLRQV RI WKH EORFN PHDQV IURP WKH RYHUDOO PHDQ 7KH LQWHUDFWLRQ VXP RI VTXDUHV 66*% LV EDVHG RQ WKH VXP RI VTXDUHGGHYLDWLRQVRIWKHFHOOPHDQVIURPWKHRYHUDOOPHDQ7KHFRPSXWDWLRQVDUH VKRZQLQ([DPSOH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
([DPSOHV
p ([DPSOH+HPRJORELQ&KDQJHVLQ$QHPLD
$QHZV\QWKHWLFHU\WKURSRLHWLQW\SHKRUPRQH5HEOLJHQZKLFKLVXVHGWRWUHDW FKHPRWKHUDS\LQGXFHGDQHPLDLQFDQFHUSDWLHQWVZDVWHVWHGLQDVWXG\RI DGXOWFDQFHUSDWLHQWVXQGHUJRLQJFKHPRWKHUDSHXWLFWUHDWPHQW+DOIWKH SDWLHQWVUHFHLYHGORZGRVHDGPLQLVWUDWLRQRI5HEOLJHQYLDLQWUDPXVFXODU LQMHFWLRQWKUHHWLPHVDWGD\LQWHUYDOVKDOIWKHSDWLHQWVUHFHLYHGDSODFHERLQ DVLPLODUIDVKLRQ3DWLHQWVZHUHVWUDWLILHGDFFRUGLQJWRWKHLUW\SHRIFDQFHU FHUYLFDOSURVWDWHRUFRORUHFWDO)RUVWXG\DGPLVVLRQSDWLHQWVZHUHUHTXLUHG WRKDYHDEDVHOLQHKHPRJORELQOHVVWKDQPJGODQGDGHFUHDVHLQ KHPRJORELQRIDWOHDVWPJGOIROORZLQJWKHODVWFKHPRWKHUDS\&KDQJHVLQ KHPRJORELQLQPJGO IURPSUHILUVWLQMHFWLRQWRRQHZHHNDIWHUODVWLQMHFWLRQ DVVKRZQLQ7DEOH ZHUHREWDLQHGIRUDQDO\VLV'RHV5HEOLJHQKDYHDQ\ HIIHFWRQWKHKHPRJORELQ+JE OHYHOV"
7$%/(5DZ'DWDIRU([DPSOH
&DQFHU7\SH
3DWLHQW 1XPEHU
+JE &KDQJH
3DWLHQW 1XPEHU
+JE &KDQJH
&(59,&$/
35267$7(
&2/25(&7$/
$&7,9(
3/$&(%2
&KDSWHU7ZR:D\$129$
6ROXWLRQ
7$%/(6XPPDU\6WDWLVWLFVE\7UHDWPHQW*URXSIRU([DPSOH &DQFHU 7\SH
3/$&(%2
1
1
1
1
1
1
1
&ROXPQ0HDQ
1
35267$7( &2/25(&7$/
(QWULHVLQ7DEOHDUH&HOOPHDQ6' DQGVDPSOHVL]H1
7KH VXP RI VTXDUHV IRU WKH PDLQ HIIHFWV 7UHDWPHQW DQG &DQFHU 7\SH DUH FRP SXWHGDV
5RZ0HDQ
$&7,9(
&(59,&$/
7UHDWPHQW*URXS
66757 ± ±
667<3( ± ± ±
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHLQWHUDFWLRQVXPRIVTXDUHVFDQEHFRPSXWHGDV
66757E\7<3( ± ± ± ± ± ± ± 66757 ±667<3(
7KHWRWDOVXPRIVTXDUHVLVVLPSO\WKHQXPHUDWRURIWKHVDPSOHYDULDQFHEDVHGRQ DOOREVHUYDWLRQV
66727 ± ±± ±
)LQDOO\ WKH HUURU VXP RI VTXDUHV 66( FDQ EH IRXQG E\ VXEWUDFWLQJ WKH VXP RI VTXDUHVRIHDFKRIWKHHIIHFWVIURPWKHWRWDOVXPRIVTXDUHV 66( 66727 ±66757 ±667<3( ±66757E\7<3( ±±± $VDFKHFN\RXFDQDOVRFRPSXWH66(IURPWKHFHOOVWDQGDUGGHYLDWLRQV
66( Â Â Â Â Â Â
1RZ\RXFDQFRPSOHWHWKH$129$WDEOHDQGFRPSXWHWKH)VWDWLVWLFV6HH7DEOH
7$%/($129$6XPPDU\IRU([DPSOH
6RXUFH
GI
66
06
)
S9DOXH
7UHDWPHQW757
&DQFHU7<3(
757E\7<3(
(UURU
7RWDO
6LJQLILFDQWS
SYDOXHVREWDLQHGIURP6$6
&KDSWHU7ZR:D\$129$
$W D VLJQLILFDQFH OHYHO RI WKH )VWDWLVWLF LV FRPSDUHG ZLWK WKHFULWLFDO)YDOXH ZKLFKFDQEHREWDLQHGIURPZLGHO\WDEXODWHG)WDEOHVRUIURP6$6E\XVLQJWKH ),19 8/ IXQFWLRQ7KHXSSHUGHJUHHVRIIUHHGRPFRUUHVSRQGWRWKH06LQ WKH QXPHUDWRU WKH HUURU GHJUHHV RI IUHHGRP ZKLFK FRUUHVSRQG WR WKH 06( DUH XVHGDVWKHORZHUGHJUHHVRIIUHHGRP7KH)YDOXHIRU7UHDWPHQWHIIHFWLQH[DPSOH LV WKH UDWLR 06757 06( EDVHG RQ XSSHU DQG ORZHU GHJUHHV RI IUHHGRP 7KH$129$VXPPDU\LQGLFDWHVDQRQVLJQLILFDQWLQWHUDFWLRQZKLFKVXJJHVWVWKDW WKHGLIIHUHQFHVEHWZHHQ7UHDWPHQWOHYHOVDUHFRQVLVWHQWRYHU&DQFHU7\SHV7KH) WHVW IRU 7UHDWPHQW LV VLJQLILFDQW S ZKLFK LQGLFDWHV WKDW WKH PHDQ KHPRJORELQUHVSRQVHIRUWKH$FWLYHJURXSGLIIHUVIURPWKDWRIWKH3ODFHERJURXS DYHUDJHGRYHUDOO&DQFHU7\SHV 7KH &DQFHU 7\SH HIIHFW LV DOVR VLJQLILFDQW DW WKH OHYHO ZKLFK VXJJHVWV GLIIHULQJPHDQUHVSRQVHOHYHOVDPRQJWKH&DQFHU7\SHV6XFKLQIRUPDWLRQPLJKW EHXVHIXOLQGHVLJQLQJIXWXUHVWXGLHVRULQJXLGLQJIXUWKHUDQDO\VHV
6$6$QDO\VLVRI([DPSOH 7KH6$6FRGHDQGRXWSXWIRUDQDO\]LQJWKHVHGDWDZLWKWKHWZRZD\$129$DUH VKRZQ RQWKHQH[WIRXUSDJHV7KHFHOOVXPPDU\VWDWLVWLFVDUHILUVWSULQWHGXVLQJ WKH 0($16 SURFHGXUH å 7KH */0 SURFHGXUH LV XVHG ZLWK WKH PDLQ HIIHFWV 7UHDWPHQW 757 DQG &DQFHU 7\SH 7<3( VSHFLILHG DV FODVV YDULDEOHV LQ WKH &/$66 VWDWHPHQW DQG DV IDFWRUV LQ WKH 02'(/ VWDWHPHQW 7KH LQWHUDFWLRQ 757 7<3( LVDOVRLQFOXGHGLQWKH02'(/VWDWHPHQW )RUDEDODQFHGOD\RXWDVVKRZQLQWKLVH[DPSOHVDPHVDPSOHVL]HLQHDFKFHOO WKH 6$67\SHV,,,,,,DQG,9VXPVRIVTXDUHVDUHLGHQWLFDO7KH7\SH,,,UHVXOWV ê VSHFLILHG E\ WKH 66 RSWLRQ LQ WKH 02'(/ VWDWHPHQW FRUURERUDWH WKH UHVXOWV REWDLQHGE\PDQXDOFRPSXWDWLRQV :KLOH WKH SULPDU\ FRQFHUQ LV WKH WUHDWPHQW HIIHFW 757 ZKLFK LV VHHQ WR EH VLJQLILFDQWS WKHUHLVDOVRDVLJQLILFDQWGLIIHUHQFHLQUHVSRQVHDPRQJ &DQFHU7\SHVS %HFDXVHWKHUHDUHPRUHWKDQWZROHYHOVRIWKLVHIIHFW IXUWKHU DQDO\VHV DUH QHHGHG WR GHWHUPLQH ZKHUH WKH GLIIHUHQFHV H[LVW 7R XVH SDLUZLVHWWHVWVIRUPXOWLSOHFRPSDULVRQVLQFOXGHWKH0($16VWDWHPHQWZLWKWKH 7RSWLRQDIWHUWKH02'(/VWDWHPHQWLQWKH6$6FRGH7KLVSURYLGHVFRPSDULVRQV EHWZHHQHDFKSDLURI&DQFHU7\SHVDVGHVFULEHGLQ&KDSWHU$VVHHQLQ2XWSXW PHDQ KHPRJORELQ UHVSRQVH GLIIHUV VLJQLILFDQWO\ EHWZHHQ WKH SURVWDWH DQG FROHUHFWDO&DQFHU7\SHVñ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH /60($16 VWDWHPHQW LV DOVR LQFOXGHG WR LOOXVWUDWH DQRWKHU ZD\RIREWDLQLQJ SDLUZLVHFRPSDULVRQV:KHQWKH3',))RSWLRQLVXVHG6$6SULQWVRXWDPDWUL[RI SYDOXHVIRUDOOSDLUZLVHFRPSDULVRQV ò$VLQGLFDWHGE\DSYDOXHOHVVWKDQ PHDQ UHVSRQVHV IRU WKH FROHUHFWDO QR PHDQ DQG SURVWDWH QR PHDQ &DQFHU 7\SHVGLIIHUVLJQLILFDQWO\S 6$6&RGHIRU([DPSOH DATA HGBDS; INPUT TRT $ DATALINES; ACT C 1 1.7 ACT C 7 2.3 ACT C 13 1.3 ACT P 24 1.6 ACT P 29 2.6 ACT P 36 1.3 ACT R 46 1.7 ACT R 51 -0.4 PBO C 2 2.3 PBO C 8 1.3 PBO C 14 -0.2 PBO P 23 1.7 PBO P 30 1.4 PBO P 35 1.5 PBO R 44 1.9 PBO R 53 -0.9 ;
TYPE $ ACT ACT ACT ACT ACT ACT ACT ACT PBO PBO PBO PBO PBO PBO PBO PBO
C C C P P R R R C C C P P R R R
3 10 15 26 31 42 47 52 4 9 16 25 32 41 48 55
PATNO HGBCH @@; -0.2 2.7 0.6 2.5 3.7 -0.3 0.5 0.1 1.2 -1.1 1.9 0.8 0.7 1.6 -1.6 1.5
ACT ACT ACT ACT ACT ACT ACT ACT PBO PBO PBO PBO PBO PBO PBO PBO
C C P P P R R R C C P P P R R R
6 1.7 12 0.4 22 2.7 28 0.5 34 2.7 45 1.9 49 2.1 54 1.0 5 -0.6 11 1.6 21 0.6 27 1.7 33 0.8 43 -2.2 50 0.8 56 2.1
PROC FORMAT; VALUE $TYPFMT ’C’ = ’CERVICAL ’ ’P’ = ’PROSTATE ’ ’R’ = ’COLORECTAL’ ; RUN; PROC SORT DATA = HGBDS; BY TRT TYPE; PROC MEANS MEAN STD N; å VAR HGBCH; BY TRT TYPE; FORMAT TYPE $TYPFMT.; TITLE1 ’Two-Way ANOVA’; TITLE2 ’EXAMPLE 7.1: Hemoglobin Changes in Anemia’; RUN; PROC GLM DATA = HGBDS; CLASS TRT TYPE; MODEL HGBCH = TRT TYPE TRT*TYPE / SS3; MEANS TYPE / T; LSMEANS TYPE / PDIFF STDERR; FORMAT TYPE $TYPFMT.; RUN;
&KDSWHU7ZR:D\$129$
2XWSXW6$62XWSXWIRU([DPSOH Two-Way ANOVA EXAMPLE 7.1: Hemoglobin Changes in Anemia
The MEANS Procedure Analysis Variable : HGBCH
------------------------ TRT=ACT TYPE=CERVICAL ---------------------Mean Std Dev N ---------------------------------1.3125000 0.9876921 8 --------------------------------------------------------- TRT=ACT TYPE=PROSTATE ---------------------Mean Std Dev N ---------------------------------2.2000000 1.0042766 8 ------------------------------------------
----------- TRT=ACT TYPE=COLORECTAL --------------------Mean Std Dev N ---------------------------------0.8250000 0.9982127 8 ----------------------------------
å
------------------------ TRT=PBO TYPE=CERVICAL ----------------------Mean Std Dev N ---------------------------------0.8000000 1.2581165 8 --------------------------------------------------------- TRT=PBO TYPE=PROSTATE ----------------------Mean Std Dev N ---------------------------------1.1500000 0.4690416 8 --------------------------------------------------------- TRT=PBO TYPE=COLORECTAL --------------------Mean Std Dev N ---------------------------------0.4000000 1.7071279 8 ----------------------------------
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2XWSXW6$62XWSXWIRU([DPSOHFRQWLQXHG
Two-Way ANOVA EXAMPLE 7.1: Hemoglobin Changes in Anemia The GLM Procedure Class Level Information Class
Levels
TRT TYPE
2 3
Values ACT PBO CERVICAL COLORECTAL PROSTATE
Number of observations
48
Dependent Variable: HGBCH
DF
Sum of Squares
Model 5 Error 42 Corrected Total 47
15.29604167 53.88375000 69.17979167
Source
Source TRT TYPE TRT*TYPE
Mean Square
F Value
Pr > F
3.05920833 1.28294643
2.38
0.0543
R-Square
Coeff Var
Root MSE
HGBCH Mean
0.221106
101.6229
1.132672
1.114583
DF
Type III SS
Mean Square
F Value
Pr > F
1 2 2
5.26687500 9.11291667 0.91625000
5.26687500 4.55645833 0.45812500
4.11 3.55 0.36
0.0491 0.0376 0.7018
ê
&KDSWHU7ZR:D\$129$
2XWSXW6$62XWSXWIRU([DPSOHFRQWLQXHG
Two-Way ANOVA EXAMPLE 7.1: Hemoglobin Changes in Anemia The GLM Procedure t Tests (LSD) for HGBCH NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate. Alpha 0.05 Error Degrees of Freedom 42 Error Mean Square 1.282946 Critical Value of t 2.01808 Least Significant Difference 0.8082
Means with the same letter are not significantly different. t Grouping
B B B
A A A
Mean
N
TYPE
1.6750
16
PROSTATE
1.0563
16
CERVICAL
0.6125
16
COLORECTAL
ñ
Least Squares Means
TYPE CERVICAL COLORECTAL PROSTATE
HGBCH LSMEAN
Standard Error
1.05625000 0.61250000 1.67500000
0.28316806 0.28316806 0.28316806
Pr > |t| 0.0006 0.0363 <.0001
LSMEAN Number 1 2 3
Least Squares Means for effect TYPE Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: HGBCH i/j
1
1 2 3
0.2741 0.1298
2
3
0.2741
0.1298 0.0112
0.0112
ò
NOTE: To ensure overall protection level, only probabilities associated with pre-planned comparisons should be used.
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7REHWWHUXQGHUVWDQGWKHLQWHUDFWLRQHIIHFWLWLVKHOSIXOWRYLVXDOL]HWKHWUHQGVZLWK JUDSKLFV8VXDOO\DQLQWHUDFWLRQLVLQGLFDWHGLIWKHUHVSRQVHSURILOHVFURVVRUKDYH PDUNHGO\ GLIIHUHQW VORSHV DPRQJ WKH OHYHOV RI WKH EORFNLQJ IDFWRU ,Q )LJXUH JUDSKV D E DQG F VKRZ QR LQWHUDFWLRQ JUDSKV G H DQG I GHSLFW DQ LQWHUDFWLRQ EHWZHHQWKHJURXSDQGEORFNPDLQHIIHFWV ),*85(1R,QWHUDFWLRQDEF DQG,QWHUDFWLRQGHI (IIHFWV
(a)
B lock 1
B lock 1
B lock 2 B lock 2 Gr oup A
Gr oup B
(b)
(d)
Gr oup A
Gr oup B
(e)
B lock 1 B lock 1
B lock 2 Gr oup A
B lock 2
Gr oup B
(c)
Gr oup A
Gr oup B
(f)
B lock 1 B lock 1 B lock 2 B lock 2 Gr oup A
Gr oup B
Gr oup A
Gr oup B
&KDSWHU7ZR:D\$129$
([DPSOH VKRZV WKH WZRZD\ $129$ IRU DQ XQEDODQFHG OD\RXW ZLWK WKUHH WUHDWPHQWJURXSVDQGDVLJQLILFDQWLQWHUDFWLRQ
p ([DPSOH0HPRU\)XQFWLRQ
7ZRLQYHVWLJDWRUVFRQGXFWHGDFOLQLFDOWULDOWRGHWHUPLQHWKHHIIHFWRIWZR GRVHVRIDQHZWKHUDSHXWLFDJHQWRQVKRUWWHUPPHPRU\IXQFWLRQ$VLQJOH RUDOGRVHRIWKHWHVWSUHSDUDWLRQZDVDGPLQLVWHUHGWRVXEMHFWVZKRZHUHWKHQ DVNHGWRUHFDOOLWHPVRQHKRXUDIWHUH[SRVXUHWRDOLVWFRQVLVWLQJRILWHPV 7KHQXPEHURILWHPVFRUUHFWO\LGHQWLILHGDUHVKRZQLQ7DEOH$SODFHER JURXSZDVLQFOXGHGDVDFRQWUROLQDSDUDOOHOJURXSGHVLJQ
7$%/(5DZ'DWDIRU([DPSOH
'26(*5283
&(17(5
3ODFHER
PJ
PJ
'U$EHO
'U%HVW
62/87,21
7KHVXPPDU\VWDWLVWLFVDQG$129$WDEOHDUHVKRZQLQ7DEOH 7$%/(6XPPDU\6WDWLVWLFVE\'RVH*URXSIRU([DPSOH
'26(*5283
&(17(5
3ODFHER
PJ
PJ
'U$EHO
'U%HVW
(QWULHVLQ7DEOHDUHPHDQ6' DQGVDPSOHVL]H
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6$6$QDO\VLVRI([DPSOH
7KLVLVDQXQEDODQFHGOD\RXWDQGWKHFRPSXWDWLRQRIWKHHIIHFWVXPVRIVTXDUHVDV LOOXVWUDWHGLQ([DPSOHPLJKWQRWSURGXFHWKHPRVWDSSURSULDWHWHVWV:LWKDQ XQEDODQFHGOD\RXWWKH6$67\SHV,,,DQG,,,VXPVRIVTXDUHVGLIIHUDVGLVFXVVHG LQ$SSHQGL['7KLVDQDO\VLVIRFXVHVRQO\RQWKH7\SH,,,UHVXOWV 7KH$129$VXPPDU\WDEOHIURPWKH6$6RXWSXWLVVKRZQLQ7DEOH7KH6$6 FRGHIRUJHQHUDWLQJWKHWZRZD\$129$LVVKRZQIROORZLQJWKHWDEOH 7$%/($129$6XPPDU\IRU([DPSOH
)
GI
66
06
'RVH
&HQWHU
&HQWHUE\'RVH
(UURU
7RWDO
6LJQLILFDQWS
6$6&RGHIRU([DPSOH DATA MEMRY; INPUT DOSE $ DATALINES; 0 A 6 0 A 5 0 A 4 0 A 5 0 A 7 0 A 8 0 B 6 0 B 7 0 B 11 0 B 4 30 A 7 30 A 8 30 A 11 30 B 5 30 B 3 30 B 8 30 B 5 60 A 11 60 A 9 60 A 10 60 B 9 60 B 12 60 B 12 60 B 14 ;
CENTER $ 0 0 0 0 0 30 30 30 60 60 60 60
A 6 A 6 B 7 B 8 B 7 A 6 B 6 B 6 A 7 A 12 B 13 B 15
Y @@; 0 0 0 0 30 30 30 30 60 60 60 60
A 8 A 5 B 4 B 5 A 8 A 9 B 6 B 9 A 7 A 9 B 9 B 12
0 0 0 0 30 30 30 30 60 60 60
A A B B A A B B A A B
3 5 7 9 12 6 5 11 11 15 13
PROC GLM DATA = MEMRY; CLASS CENTER DOSE; MODEL Y = DOSE CENTER CENTER*DOSE; LSMEANS DOSE/PDIFF STDERR; TITLE1 ’Two-Way ANOVA’; TITLE2 ’EXAMPLE 7.2: Memory Function’; RUN;
S9DOXH
6RXUFH
&KDSWHU7ZR:D\$129$
2XWSXW6$62XWSXWIRU([DPSOH
Two-Way ANOVA EXAMPLE 7.2: Memory Function The GLM Procedure Class Level Information Class CENTER DOSE
Levels 2 3
Values A B 0 30 60
Number of observations
59
Dependent Variable: Y Source Model Error Corrected Total
DF 5 53 58
R-Square 0.561076
Source DOSE CENTER CENTER*DOSE
DF 2 1 2
Source DF DOSE 2 CENTER 1 CENTER*DOSE 2
Sum of Squares 299.5763953 234.3558081 533.9322034
Mean Square 59.9152791 4.4218077
F Value 13.55
Coeff Var 26.17421
Root MSE 2.102809
Y Mean 8.033898
Type I SS 256.6302710 3.1777983 39.7683260 Type III SS 251.4197307 2.2272995 39.7683260
Mean Square 128.3151355 3.1777983 19.8841630 Mean Square 125.7098653 2.2272995 19.8841630
F Value 29.02 0.72 4.50
F Value 28.43 0.50 4.50
Pr > F <.0001
Pr > F <.0001 0.4004 0.0157
Pr > F <.0001 0.4810 0.0157
ô õ
Least Squares Means
DOSE 0 30 60
Y LSMEAN
Standard Error
Pr > |t|
LSMEAN Number
6.2424242 7.3875000 11.1111111
0.4388811 0.4987251 0.4956369
<.0001 <.0001 <.0001
1 2 3
Least Squares Means for effect DOSE Pr > |t| for H0: LSMean(i)=LSMean(j)
i/j 1 2 3
Dependent Variable: Y 1 2 0.0906 0.0906 <.0001 <.0001
3 <.0001 <.0001
NOTE: To ensure overall protection level, only probabilities associated with pre-planned comparisons should be used.
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH 'RVH HIIHFW LV VHHQ WR EH KLJKO\ VLJQLILFDQW S ô ZKLFK LQGLFDWHV GLIIHUHQFHV DPRQJ WKH GRVH JURXSV +RZHYHU ZKHQHYHU WKH LQWHUDFWLRQ WHUP LV VLJQLILFDQWFDXWLRQPXVWEHXVHGLQWKHLQWHUSUHWDWLRQRIWKHPDLQHIIHFWV,QWKLV H[DPSOHWKH)YDOXHIRUWKH&HQWHUE\'RVHLQWHUDFWLRQLVVLJQLILFDQWS õZKLFK LQGLFDWHVWKDWWKH'RVHUHVSRQVHLVQRWWKHVDPHIRUHDFKVWXG\FHQWHU 7KLVLQWHUDFWLRQLVVKRZQLQ)LJXUH
),*85(,QWHUDFWLRQLQ([DPSOH
13
10
'U$EHO V&HQWHU
7
'U%HVW V&HQWHU
4 Placebo
30 mg
60 mg
DOSE GROUP
,Q WKH SUHVHQFH RI VXFK DQ LQWHUDFWLRQ \RX ZRXOG W\SLFDOO\ SHUIRUP DQDO\VHV VHSDUDWHO\ZLWKLQHDFKFHQWHU7RGRWKLVDRQHZD\$129$&KDSWHU FDQEH XVHGIRU([DPSOH6XFKDQDQDO\VLVUHVXOWVLQDVLJQLILFDQW'RVHHIIHFWIRUERWK FHQWHUVZLWKSDLUZLVHFRPSDULVRQV6HH7DEOH 7$%/(5HVXOWVRI3DLUZLVH'RVH&RPSDULVRQVIRU([DPSOH
3DLUZLVH&RPSDULVRQV 3ODFHERYV 3ODFHERYV
PJYV
&(17(5
PJ
PJ
PJ
'U$EHO
16
'U%HVW
16
$VWHULVN VLJQLILFDQWS 16 QRWVLJQLILFDQWS!
&KDSWHU7ZR:D\$129$
)URP7DEOH\RXFDQFRQFOXGHWKDWDGLIIHUHQFHH[LVWVEHWZHHQWKHPJ GRVHDQGWKHSODFHEREXWWKHRYHUDOOHIILFDF\RIWKHPJGRVHLVLQTXHVWLRQ :KHQ VLJQLILFDQW LQWHUDFWLRQV DUH IRXQG WKH DQDO\VW FDQ XVH H[SORUDWRU\ DQDO\VHVWRWU\WREHWWHUXQGHUVWDQGRUWRH[SODLQWKHLQWHUDFWLRQ,QWKLVFDVH IXUWKHU DQDO\VHV PLJKW UHYHDO IRU H[DPSOH WKDW 'U %HVW¶V SDWLHQWV DUH ROGHU DQGKDYHJUHDWHUGLIILFXOW\ZLWKVKRUWWHUPPHPRU\WKHUHE\UHTXLULQJDKLJKHU GRVHRIPHGLFDWLRQ
'HWDLOV 1RWHV
,Q([DPSOHWKHPDLQEHQHILWRILQFOXGLQJ&DQFHU7\SHDVDQ $129$IDFWRULQWKHDQDO\VLVLVWRLPSURYHWKHSUHFLVLRQRIWKHWUHDWPHQW FRPSDULVRQV E\ LGHQWLI\LQJ WKH ODUJH YDULDWLRQV DPRQJ &DQFHU 7\SHV DQG UHPRYLQJ WKHP IURP WKH HVWLPDWH RI WKH HUURU YDULDWLRQ 06( ,QGHHG LI WKHRQHZD\$129$KDGEHHQXVHGWRFRPSDUHWUHDWPHQWVLJQRULQJ&DQFHU 7\SH \RX PLJKW YHULI\ WKDW WKH UHVXOW ZRXOG EH QRQUHMHFWLRQ RI WKH K\SRWKHVLVRIHTXDOWUHDWPHQWPHDQVEDVHGRQDQ)WHVWRIS 7KLVDQDO\VLVSURGXFHVDQLQIODWHGHVWLPDWHRIWKH06(RIFRPSDUHG ZLWKDQ06(RIZKHQXVLQJWKHPRUHDSSURSULDWHWZRZD\$129$ ,Q GHYHORSLQJ WKH FRQFHSW RI DQDO\VLV RI YDULDQFH LQ WKLV DQG RWKHU FKDSWHUV PDQXDO FDOFXODWLRQV DUH GHPRQVWUDWHG IRU EDODQFHG OD\RXWV WRHPSKDVL]HKRZWKHZLWKLQDQGEHWZHHQJURXSYDULDELOLW\HVWLPDWHVDULVH DQG KRZ WKH UHVXOWV DUH LQWHUSUHWHG 7KH FRPSXWLQJ PHWKRGV VKRZQ LQ ([DPSOH VKRZ WKH GHYLDWLRQV XVHG LQ REWDLQLQJ WKH HIIHFW VXPV RI VTXDUHV)RUPXODVVLPLODUWRWKRVHXVHGLQWKHRQHZD\$129$VHH6HFWLRQ FDQ DOVR EH XVHG IRU WKH WZRZD\ DQG KLJKHU $129$ ZLWK EDODQFHG OD\RXWV +RZHYHU VXFK HTXDWLRQV DUH QRW RIWHQ XVHG 6WDWLVWLFDO SDFNDJHVREYLDWHWKHQHHGIRUPDQXDOFDOFXODWLRQVLQWKHEDODQFHGFDVHDQG DUH PDQGDWRU\ IRU PRVW SUDFWLFDO DSSOLFDWLRQV WKDW LQYROYH XQEDODQFHG OD\RXWV ,Q FOLQLFDO WULDOV WKHUH LV DOPRVW DOZD\V LPEDODQFH LQ WZRZD\ OD\RXWVHYHQLIWKHVWXG\ZHUHGHVLJQHGDVDEDODQFHGRQH7KLVPLJKWEH GXH WR SDWLHQWV ZKR GURSSHG RXW RUPLVVHGYLVLWVGDWDH[FOXVLRQRUVRPH RWKHUUHDVRQWKDWUHVXOWVLQPLVVLQJGDWD,QVXFKFDVHVWKHUHPLJKWEHPRUH WKDQ RQH ZD\ WR FRPSXWH WKH VXP RI VTXDUHV IRU WKH $129$ HIIHFWV DV VHHQE\WKHGLIIHUHQW6$6W\SHVRIVXPVRIVTXDUHV ,Q WKH XQEDODQFHG FDVH WKH WRWDO VXP RI VTXDUHV FDQQRW EH EURNHQ GRZQ DV HDVLO\LQWRDGGLWLYHFRPSRQHQWVGXHWRWKHVRXUFHVRIYDULDWLRQ$QXPEHURI VWDWLVWLFDO PHWKRGV KDYH EHHQ GHYLVHG WR FLUFXPYHQW WKH SUREOHPV WKDW DULVH ZLWKVDPSOHVL]HLPEDODQFH7KHPHWKRGRIµILWWLQJFRQVWDQWV¶DQGWKHPHWKRG RI µZHLJKWHG VTXDUHV RI PHDQV¶ DUH WZR FODVVLFDO DSSURDFKHV ZKLFK FDQ EH H[SORUHGIXUWKHUE\WKHLQWHUHVWHGUHDGHU%DQFURIW (DFKPHWKRGWHVWVD VOLJKWO\GLIIHUHQWK\SRWKHVLVDVIUDPHGLQWHUPVRIFHOOPHDQV7KHPHWKRGWKDW VKRXOGEHXVHGLVWKHPHWKRGWKDWFRUUHVSRQGVPRVWFORVHO\WRWKHDVVXPSWLRQV DQGK\SRWKHVHVUHOHYDQWWRWKHJLYHQSUREOHP
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,Q6$67\SH,,,FRUUHVSRQGVWRWKHPHWKRGRIZHLJKWHGVTXDUHVRIPHDQV DQG LV RIWHQ WKH PHWKRG RI FKRLFH IRU WKH DQDO\VLV RI FOLQLFDO GDWD 7KH K\SRWKHVLVWHVWHGE\XVLQJ7\SH,,,UHVXOWVLVWKHHTXDOLW\RIWUHDWPHQWJURXS PHDQVZKHUHHDFKJURXSPHDQLVWKHXQZHLJKWHGDYHUDJHRIWKHFHOOPHDQV WKDW FRPSULVH WKDW JURXS 7KH JURXS PHDQ GRHV QRW GHSHQG RQ WKH FHOO VDPSOH VL]HV /6PHDQ DV LV WKH FDVH ZLWK 7\SH , DQG ,, ,Q FDVHV RI H[WUHPHLPEDODQFHVXFKDVDODUJHQXPEHURISDWLHQWVLQVRPHFHQWHUVDQG DYHU\VPDOOQXPEHURISDWLHQWVLQRWKHUFHQWHUVWKHXVHRI7\SH,,,UHVXOWV IRUWKHWUHDWPHQWHIIHFWZKHQFHQWHULVXVHGDVDEORFNLQJIDFWRUVKRXOGEH TXHVWLRQHG ZKHQ ILQLWH SRSXODWLRQV PXVW EH DVVXPHG /6PHDQV DUH GLVFXVVHG LQ $SSHQGL[ & 6HFWLRQ & DQG LQWHUSUHWDWLRQ RI WKH YDULRXV W\SHVRIVXPVRIVTXDUHVLVGLVFXVVHGLQ$SSHQGL[' ,Q D PXOWLFHQWHU WULDO WKHUH PLJKW EH GLIIHUHQFHV DPRQJ VWXG\ FHQWHUV GXH WR VXFK WKLQJV DV JHRJUDSK\ FOLPDWH JHQHUDO SRSXODWLRQ FKDUDFWHULVWLFV RU VSHFLDOW\ GLIIHUHQFHV DW VSHFLILF FHQWHUV ,I \RX LJQRUH WKHVHGLIIHUHQFHVWKHYDULDWLRQGXHWR6WXG\&HQWHUVZLOOEHLQFOXGHGLQWKH HVWLPDWH RI H[SHULPHQWDO HUURU 06( $V VHHQ LQ &KDSWHU ³2QH:D\ $129$´ SRWHQWLDO WUHDWPHQW GLIIHUHQFHV EHFRPH REVFXUHG ZLWK ODUJH YDOXHVRI06(7KHUHIRUH\RXZDQWWRWU\WRLGHQWLI\DOOLPSRUWDQWVRXUFHV RI YDULDWLRQ WKDW FDQ EH IDFWRUHG RXW RI WKH H[SHULPHQWDO HUURU WR \LHOG D PRUH SUHFLVH WHVW IRU WUHDWPHQW HIIHFW ,Q WKH FDVH RI D PXOWLFHQWHU VWXG\ µ6WXG\ &HQWHU¶ LV XVXDOO\ FRQVLGHUHG DQ LPSRUWDQW IDFWRU LQ DQDO\]LQJ WUHDWPHQW GLIIHUHQFHV EHFDXVH RI NQRZQ RU VXVSHFWHG GLIIHUHQFHV DPRQJ FHQWHUV 6RPHWLPHV LQFOXGLQJ D EORFN HIIHFW ZLWK D WZRZD\ $129$ UHVXOWV LQ D OHVV SUHFLVH WUHDWPHQW FRPSDULVRQ RI WKH JURXS PHDQV WKDQ ZRXOG UHVXOW E\ LJQRULQJ WKH EORFNLQJ IDFWRU DQG XVLQJ WKH RQHZD\ $129$%HFDXVHWKHHUURUGHJUHHVRIIUHHGRPDUHUHGXFHGZKHQLQFOXGLQJ WKH%ORFNYDULDEOHDVDVRXUFHRIYDULDWLRQLQWKH$129$WKH06(PLJKW LQFUHDVH DSSUHFLDEO\ LI DPRQJEORFN YDULDWLRQ LV VPDOO DQG WKH QXPEHU RI OHYHOVRIWKHEORFNLQJIDFWRULVODUJH:LWKDQLQFUHDVHG06(WKH)WHVWIRU *5283 LV VPDOOHU GXH WR WKH UHGXFHG GHJUHHV RI IUHHGRP IRU HUURU WKHUHIRUH WKH SUHFLVLRQ RI WKH JURXS FRPSDULVRQV GHFUHDVHV 7KXV XVLQJ $129$VKRXOGDYRLGLQFOXGLQJEORFNLQJIDFWRUVWKDWKDYHDODUJHQXPEHU RIKRPRJHQHRXVOHYHOV 3DLUZLVH FRPSDULVRQV RI JURXS PHDQV DUH RIWHQ FRQGXFWHG E\ XVLQJ WKH 3',)) RSWLRQ LQ WKH /60($16 VWDWHPHQW LQ 352& */0 DV VKRZQLQ([DPSOHVDQG7KLVPHWKRGFRPSDUHVWKH/6PHDQVRIWKH OHYHOVRIWKHIDFWRUVSHFLILHGLQWKH/60($16VWDWHPHQWXVLQJµWWHVWW\SH¶ SURFHGXUHV7RDYRLGJUHDWO\DOWHULQJWKHWHVWV¶VLJQLILFDQFHOHYHOVRQO\SUH SODQQHG FRPSDULVRQV VKRXOG EH PDGH 0XOWLSOH FRPSDULVRQ SURFHGXUHV FRQWUROOLQJ IRU WKH RYHUDOO VLJQLILFDQFH OHYHO VKRXOG EH XVHG ZKHQ WKH QXPEHURIFRPSDULVRQVEHFRPHVODUJHVHH$SSHQGL[(
&KDSWHU7ZR:D\$129$
7KH06(LVDQHVWLPDWHRIWKHYDULDQFHDPRQJVLPLODUSDWLHQWV %HFDXVHSDWLHQWVDUHPRVWVLPLODUZLWKLQ*URXSE\%ORFNFHOOVWKH06(LV DQ DYHUDJH RU SRROHG YDULDQFH RYHU DOO FHOOV RI WKH ZLWKLQFHOO YDULDQFHV +RZHYHU LI HDFK FHOO KDV RQO\ RQH PHDVXUHPHQW Q IRU DOO LM WKH ZLWKLQFHOO YDULDELOLW\ FDQQRW EH HVWLPDWHG ,QVXFKFDVHVWKHDQDO\VLVFDQ SURFHHGE\DVVXPLQJWKHUHLVQRLQWHUDFWLRQDQGXVLQJWKHLQWHUDFWLRQVXPRI VTXDUHVDVWKH66( LM
(YHQ LQ OD\RXWV ZLWK FHOOV WKDW KDYH PRUH WKDQ RQH PHDVXUHPHQW LI LW LV NQRZQRUFDQEHVDIHO\DVVXPHGWKDWWKHUHLVQRLQWHUDFWLRQEHWZHHQPDLQ HIIHFWVWKHLQWHUDFWLRQFDQEHLJQRUHGDVDVRXUFHLQWKH$129$7KHVXP RI VTXDUHV IRU LQWHUDFWLRQ LV WKHQ DEVRUEHG LQWR WKH 66( $OWKRXJK WKLV LQFUHDVHVWKH66(WKHQXPEHURIGHJUHHVRIIUHHGRPDOVRLQFUHDVHV,IWKH LQWHUDFWLRQ HIIHFW LV UHDOO\ LQVLJQLILFDQW WKHQ LQFUHDVLQJ WKH GHJUHHV RI IUHHGRPIRU06(PLJKWPRUHWKDQRIIVHWWKH66(LQFUHDVHZLWKWKHSRVVLEOH QHWHIIHFWRIJDLQLQJVHQVLWLYLW\LQWHVWLQJWKHPDLQHIIHFWV $Q $129$ FDQ EH HDVLO\ H[WHQGHG WR LQFOXGH PRUH WKDQ WZR IDFWRUVZLWKWKHDQDO\VHVE\IROORZLQJWKHVDPHSDWWHUQDVGHPRQVWUDWHGLQ WKLVFKDSWHU)RUH[DPSOH\RXFDQFRQGXFWDWKUHHZD\$129$RQWKHGDWD VHW75,$/LQ&KDSWHUE\XVLQJWKHIDFWRUV7UHDWPHQW*URXS757 6WXG\ &HQWHU &(17(5 DQG SDWLHQW JHQGHU 6(; 8VLQJ WKH */0 SURFHGXUH WKHIROORZLQJ6$6VWDWHPHQWVVKRZDPRGHOWKDWLQFOXGHVDOOWZRZD\DQG WKHWKUHHZD\LQWHUDFWLRQV PROC GLM DATA = TRIAL; CLASS TRT CENTER SEX; MODEL SCORE = TRT CENTER SEX TRT*CENTER TRT*SEX CENTER*SEX TRT*CENTER*SEX;
7KHQXPEHURISRVVLEOHLQWHUDFWLRQVLQFUHDVHVPDUNHGO\DVQHZPDLQHIIHFWV DUHDGGHGWRWKHPRGHO+LJKHURUGHULQWHUDFWLRQVDQGHYHQVRPHWZRZD\ LQWHUDFWLRQV PLJKW QRW EH PHDQLQJIXO RU PLJKW EH GLIILFXOW WR H[SODLQ DQG LQFOXGLQJ D ODUJH QXPEHU RI LQWHUDFWLRQV FDQ GHFUHDVH WKH GHJUHHV RI IUHHGRPDYDLODEOHIRUHVWLPDWLRQRIWKHHUURU06( )RUWKHVHUHDVRQVRQO\ PHDQLQJIXO LQWHUDFWLRQV RU LQWHUDFWLRQV WKDW PLJKW OHDG WR LPSRUWDQW VXEJURXS GLIIHUHQFHV VKRXOG EH FRQVLGHUHG ZKHQ SHUIRUPLQJ WKH ILQDO DQDO\VLV 7KHUHDUHWZRW\SHVRIHIIHFWVLQDQ$129$PRGHOWKHµIL[HG¶ HIIHFW DQG WKH µUDQGRP¶ HIIHFW $OO WKH HIIHFWV LQ WKH H[DPSOHV LQ WKLV FKDSWHU DUH DVVXPHG WR EH IL[HG WKDW LV KDYLQJ SUHVSHFLILHG OHYHOV ZLWK WKH JRDO RI FRPSDULQJ VSHFLILF OHYHOV RI WKDW HIIHFW 7UHDWPHQW *URXS IRU H[DPSOH LV QRUPDOO\ D IL[HG HIIHFW VLQFH \RX ZDQW WR FRPSDUH UHVSRQVHV DPRQJWKHIL[HGJURXSVRUGRVHOHYHOV$UDQGRPHIIHFWLVRQHZKRVHOHYHOV DUHUDQGRPO\VHOHFWHGIURPDODUJHSRSXODWLRQRIOHYHOVDQGDUHQRW
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
DPHQDEOHWRDSULRULVSHFLILFDWLRQ6WXG\&HQWHUIRUH[DPSOHPLJKWEH FRQVLGHUHGDUDQGRPHIIHFWLIDVDPSOHRIFHQWHUVLVUDQGRPO\VHOHFWHGIURP DODUJHQXPEHURIFHQWHUVDYDLODEOHWRFRQGXFWWKHVWXG\ $ VWDWLVWLFDO PRGHO DVVRFLDWHG ZLWK D WZRZD\ RU KLJKHURUGHU $129$ LV FDOOHG D µIL[HGHIIHFWV¶ PRGHO LI DOO HIIHFWV DUH IL[HG D µUDQGRPHIIHFWV¶ PRGHOLIDOOHIIHFWVDUHUDQGRPDQGDµPL[HG¶PRGHOLIWKHUHLVDPL[WXUHRI IL[HG DQG UDQGRP HIIHFWV 7KH WZRZD\ $129$ WKDW LQFOXGHV 7UHDWPHQW *URXS DQG 6WXG\ &HQWHU DV IDFWRUV LV JHQHUDOO\ FRQVLGHUHG D IL[HGHIIHFWV PRGHO HVSHFLDOO\ LI RQO\ D VPDOO QXPEHU RIFHQWHUVLVXVHG,IWKHFHQWHUV WKDW DUH LQFOXGHG DUH VHOHFWHG IURP D ODUJH QXPEHU RI FHQWHUV WKDW PLJKW TXDOLI\ WR FRQGXFW WKH VWXG\ 6WXG\ &HQWHU FDQ EH FRQVLGHUHG D UDQGRP HIIHFWLQZKLFKFDVHWKHWZRZD\$129$PRGHOLVPL[HG :KHQ WKHUH LV QR LQWHUHVW LQ WKH LQWHUDFWLRQ LQ WKH WZRZD\ $129$ WKH DQDO\VLV RI WUHDWPHQW FRPSDULVRQV LV LGHQWLFDO ZKHWKHU 6WXG\ &HQWHU LV FRQVLGHUHG IL[HG RU UDQGRP ,I KRZHYHU 6WXG\ &HQWHU LV FRQVLGHUHG D UDQGRP HIIHFW DQG WKH LQWHUDFWLRQ WHUP LV LQFOXGHG LQWHUDFWLRQ DOVR EHLQJ FRQVLGHUHGDUDQGRPHIIHFW WKHQWKHWHVWIRUWUHDWPHQWHIIHFWUHTXLUHVPRUH ZRUN,QWKHEDODQFHGPRGHOWKH)WHVWIRU7UHDWPHQW*URXSLVWKHUDWLRRI WKH PHDQ VTXDUH IRU 7UHDWPHQW WR WKH PHDQVTXDUHIRULQWHUDFWLRQ7KH) WHVWFDQEHSHUIRUPHGE\XVLQJWKH7(67VWDWHPHQWLQ352&*/0LQ6$6 7KLV LV EHFDXVH PRGHO DVVXPSWLRQV DUH GLIIHUHQW IRU IL[HG DQG UDQGRP HIIHFWVDQGWKH)WHVWVDUHEDVHGRQH[SHFWHGPHDQVTXDUHV $ 5$1'20 VWDWHPHQW IROORZLQJ WKH 02'(/ VWDWHPHQW VSHFLILHV ZKLFK HIIHFWVDUHUDQGRP:KHQRQHRUPRUHHIIHFWVDUHLGHQWLILHGDVUDQGRPWKH 6$6 RXWSXW ZLOO VKRZ WKH IRUP RI WKH H[SHFWHG PHDQVTXDUHV ,Q WKH XQEDODQFHGPRGHOWKHDQDO\VWVKRXOGXVHWKHVHIRUPVWRGHWHUPLQHWKHPRVW DSSURSULDWHHIIHFWVWRXVHLQWKH7(67VWDWHPHQW 7KHFRPSOH[LWLHVLQYROYHGLQWKHDQDO\VLVRIKLJKHURUGHUXQEDODQFHGPL[HG PRGHOVDUHEHVWKDQGOHGE\XVLQJ352&0,;('LQ6$6 1RWH$QH[FHOOHQWUHIHUHQFHIRUWKHDQDO\VLVRIPL[HGPRGHOVIRUWKH PRUHDGYDQFHGUHDGHULV³6$66\VWHPIRU0L[HG0RGHOV´E\/LWWHOO 0LOOLNHQ6WURXSDQG:ROILQJHUZKLFKZDVZULWWHQXQGHUWKH6$6 %RRNVE\8VHUVSURJUDP
&++$$3377((55 5HSHDWHG0HDVXUHV$129$ ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
5HSHDWHG PHDVXUHV UHIHU WR PXOWLSOH PHDVXUHPHQWV WDNHQ IURP WKH VDPH H[SHULPHQWDOXQLWVXFKDVVHULDOHYDOXDWLRQVRYHUWLPHRQWKHVDPHSDWLHQW6SHFLDO DWWHQWLRQ LV JLYHQ WR WKHVH W\SHV RI PHDVXUHPHQWV EHFDXVH WKH\ FDQQRW EH FRQVLGHUHG LQGHSHQGHQW ,Q SDUWLFXODU WKH DQDO\VLV PXVW PDNH SURYLVLRQV IRU WKH FRUUHODWLRQVWUXFWXUH 0RVW FOLQLFDO VWXGLHV UHTXLUH RXWSDWLHQWV WR UHWXUQ WRWKHFOLQLFIRUPXOWLSOHYLVLWV GXULQJ WKH WULDO ZLWK UHVSRQVH PHDVXUHPHQWV PDGH DW HDFK 7KLV LV WKH PRVW FRPPRQ H[DPSOH RI UHSHDWHG PHDVXUHV VRPHWLPHV FDOOHG µORQJLWXGLQDO¶ GDWD 7KHVH UHSHDWHG UHVSRQVH PHDVXUHPHQWV FDQ EH XVHG WR FKDUDFWHUL]H D UHVSRQVH SURILOH RYHU WLPH 2QH RI WKH PDLQ TXHVWLRQV WKH UHVHDUFKHU DVNV LV ZKHWKHU WKH PHDQUHVSRQVHSURILOHIRURQHWUHDWPHQWJURXSLVWKHVDPHDVIRUDQRWKHUWUHDWPHQW JURXSRUDSODFHERJURXS7KLVVLWXDWLRQPLJKWDULVHIRUH[DPSOHZKHQWU\LQJWR GHWHUPLQHLIWKHRQVHWRIHIIHFWRUUDWHRILPSURYHPHQWGXHWRDQHZWUHDWPHQWLV IDVWHUWKDQWKDWRIDFRPSHWLWRU¶VWUHDWPHQW&RPSDULVRQRIUHVSRQVHSURILOHVFDQEH WHVWHGZLWKDVLQJOH)WHVWIURPDUHSHDWHGPHDVXUHVDQDO\VLV
6\QRSVLV
,QJHQHUDO\RXKDYHJLQGHSHQGHQWJURXSVRISDWLHQWVHDFKRIZKRPDUHVXEMHFWHG WRUHSHDWHGPHDVXUHPHQWVRIWKHVDPHUHVSRQVHYDULDEOH\DWWHTXDOO\VSDFHGWLPH SHULRGV /HWWLQJ Q UHSUHVHQW WKH QXPEHU RI SDWLHQWV LQ *URXS L L J WKH OD\RXWIRUJ JURXSVLVVKRZQLQ7DEOH,QFRPSDUDWLYHWULDOVWKHJURXSVRIWHQ UHSUHVHQWGLIIHUHQWSDUDOOHOWUHDWPHQWJURXSVRUGRVHOHYHOVRIDGUXJ L
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(/D\RXWIRUD5HSHDWHG0HDVXUHV'HVLJQZLWK*URXSV
*URXS
3DWLHQW
W
\
\
\W
\
\
\W
Q
\Q
\Q
\QW
\
\
\W
\
\
\W
Q
\Q
\Q
\QW
\
\
\W
\
\
\W
Q
\Q
\Q
\QW
7LPH
7KHUHDUHDQXPEHURIDQDO\WLFDSSURDFKHVIRUKDQGOLQJUHSHDWHGPHDVXUHV)LUVW H[DPLQH D µXQLYDULDWH¶ PHWKRG WKDW XVHV WKH VDPH $129$ FRQFHSWV GLVFXVVHG LQ &KDSWHU$µPXOWLYDULDWH¶PHWKRGZKLFKWUHDWVWKHUHSHDWHGPHDVXUHPHQWVDVD PXOWLYDULDWHUHVSRQVHYHFWRUPD\DOVREHXVHGLQPDQ\FLUFXPVWDQFHV
&RQVLGHUWKH*URXS3DWLHQWDQG7LPHHIIHFWVVKRZQLQ7DEOHDVWKUHHIDFWRUV LQ DQ $129$ 8VLQJ WKH LGHDV GLVFXVVHG LQ &KDSWHU \RX FDQ H[DPLQH WKH YDULDELOLW\ZLWKLQDQGDPRQJWKHVHIDFWRUVQRWLQJWKDWWKH7LPHHIIHFWUHSUHVHQWV FRUUHODWHG PHDVXUHPHQWV 1RWLFH WKDW WKH UHVSRQVH PLJKW YDU\ DPRQJ JURXSV DPRQJ SDWLHQWV ZLWKLQ JURXSV DQG DPRQJ WKH GLIIHUHQW PHDVXUHPHQW WLPHV 7KHUHIRUH\RXLQFOXGHD*URXSHIIHFWD3DWLHQWZLWKLQ*URXS HIIHFWDQGD7LPH HIIHFW DV VRXUFHV RI YDULDWLRQ LQ WKH $129$ ,Q DGGLWLRQ WKH UHSHDWHG PHDVXUHV DQDO\VLVXVLQJDXQLYDULDWHDSSURDFKLQFOXGHVWKH*URXSE\7LPHLQWHUDFWLRQ $V ZLWK RWKHU $129$ PHWKRGV \RX DVVXPH QRUPDOLW\ RI WKH UHVSRQVH PHDVXUHPHQWVDQGYDULDQFHKRPRJHQHLW\DPRQJJURXSV,QDGGLWLRQWKHXQLYDULDWH $129$ UHTXLUHV WKDW HDFK SDLU RI UHSHDWHG PHDVXUHV KDV WKH VDPH FRUUHODWLRQ D IHDWXUHNQRZQDVµFRPSRXQGV\PPHWU\¶ $ VLJQLILFDQW LQWHUDFWLRQ PHDQV WKDW FKDQJHV LQ UHVSRQVH RYHU WLPH GLIIHU DPRQJ JURXSVLHDVLJQLILFDQWGLIIHUHQFHLQUHVSRQVHSURILOHVDVLOOXVWUDWHGLQ)LJXUH :KHQWKHSURILOHVDUHVLPLODUDPRQJJURXSVLHQR*URXSE\7LPHLQWHUDFWLRQ
&KDSWHU5HSHDWHG0HDVXUHV$129$
WHVWVIRUWKH*URXSHIIHFWPHDVXUHWKHGHYLDWLRQIURPWKHK\SRWKHVLVRIHTXDOLW\RI PHDQUHVSRQVHVDPRQJJURXSVµDYHUDJHG¶RYHUWLPH7KLVWHVWXVLQJWKHUHSHDWHG PHDVXUHVDQDO\VLVPHWKRGPLJKWEHPRUHVHQVLWLYHWRGHWHFWLQJJURXSGLIIHUHQFHV WKDQXVLQJDRQHZD\$129$WRFRPSDUHJURXSVDWDVLQJOHWLPHSRLQW+RZHYHU WKH*URXSHIIHFWPLJKWQRWEHPHDQLQJIXOLIWKHUHLVDVLJQLILFDQW7LPHHIIHFW7KH 7LPH HIIHFW LV D PHDVXUH RI GHYLDWLRQ IURP WKH K\SRWKHVLV RI HTXDOLW\ RI PHDQ UHVSRQVHV DPRQJ WKH PHDVXUHPHQW WLPHV IRU DOO JURXSV FRPELQHG 5HSHDWHG PHDVXUHV$129$DOVRSURYLGHVDWHVWRIWKLVK\SRWKHVLV ,Q WKH VLPSOHVW FDVH RI UHSHDWHG PHDVXUHV $129$ ZKLFK LV SUHVHQWHG KHUH WKH *URXS DQG WKH HYDOXDWLRQ 7LPH DUH FURVVFODVVLILHG PDLQ HIIHFWV +RZHYHU WKH VDPSOHVDUHQRWLQGHSHQGHQWDPRQJWKH*URXS7LPHFHOOVEHFDXVHPHDVXUHPHQWV RYHUWLPHWKDWDUHWDNHQIURPWKHVDPHSDWLHQWDUHFRUUHODWHG7RDFFRXQWIRUWKLV FRUUHODWLRQ WKH 3DWLHQW HIIHFW PXVW EH LQFOXGHG DV D VRXUFH RI YDULDWLRQ LQ WKH $129$ :LWK 1 Q Q QJ WKH UHSHDWHG PHDVXUHV $129$ VXPPDU\ WDEOH WDNHV WKH IROORZLQJIRUP
7$%/($129$6XPPDU\IRU5HSHDWHG0HDVXUHV'HVLJQ 6285&(
GI
66
06
*5283
J
66*
06*
3$7,(17 ZLWKLQ*5283
1J
663*
063*
7,0(
W
667
067
*5283E\7,0(
J W
66*7
06*7
(UURU
1J W
66(
06(
1W
72766
7RWDO
) )* 06*063* )7 06706( )*7 06*706(
)RUWKHEDODQFHGOD\RXWQ Q QJ WKHVXPVRIVTXDUHVFDQEHFRPSXWHGLQ D PDQQHU VLPLODU WR WKDW XVHG IRU WKH WZRZD\ $129$ &KDSWHU 7KHVH FRPSXWDWLRQV DUH VKRZQ LQ ([DPSOH $V XVXDO WKH PHDQ VTXDUHV 06 DUH IRXQG E\ GLYLGLQJ WKH VXPV RI VTXDUHV 66 E\ WKH FRUUHVSRQGLQJ GHJUHHV RI IUHHGRP
9DULDWLRQIURPSDWLHQWWRSDWLHQWLVRQHW\SHRIUDQGRPHUURUDVHVWLPDWHGE\WKH PHDQVTXDUHIRU3DWLHQWZLWKLQ*URXS ,IWKHUHLVQRGLIIHUHQFHDPRQJJURXSVWKH EHWZHHQJURXS YDULDWLRQ PHUHO\ UHIOHFWV SDWLHQWWRSDWLHQW YDULDWLRQ 7KHUHIRUH XQGHUWKHQXOOK\SRWKHVLVRIQR*URXSHIIHFW06*DQG063* DUHLQGHSHQGHQW HVWLPDWHV RI WKH DPRQJSDWLHQW YDULDELOLW\ VR WKDW ) KDV WKH )GLVWULEXWLRQ ZLWK JXSSHUDQG1JORZHUGHJUHHVRIIUHHGRP *
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,I WKHUH LV QR 7LPH HIIHFW WKH PHDQ VTXDUH IRU 7LPH 067 LV DQ HVWLPDWH RI ZLWKLQSDWLHQW YDULDELOLW\ DV LV WKH HUURU YDULDWLRQ 06( 7KH UDWLR RI WKHVH LQGHSHQGHQWHVWLPDWHV) LVWKH)VWDWLVWLFXVHGWRWHVWWKHK\SRWKHVLVRIQR7LPH HIIHFW6LPLODUO\WKHLQWHUDFWLRQPHDQVTXDUHZKLFKLVDOVRDPHDVXUHRIZLWKLQ SDWLHQWYDULDWLRQXQGHU+ LVFRPSDUHGWRWKH06(WRWHVWIRUDVLJQLILFDQW*URXS E\7LPHLQWHUDFWLRQ 7
6DPSOH UHVSRQVH SURILOHV DUH VKRZQ LQ )LJXUH IRU GUXJ JURXSV DQG WLPH SHULRGV$VGHSLFWHGWUHDWPHQWGLIIHUHQFHVGHSHQGRQWLPHLIWKHSURILOHVGLIIHU ),*85(6DPSOH3URILOHVRI'UXJ5HVSRQVH
Response
Response
Drug A
Drug B
Drug A 1
2
3
Drug B
4
1
2
Time
6LPLODU3URILOHV1R'UXJ(IIHFW
4
3
4
Drug A
1
2
3 Time
Time
'LIIHUHQW3URILOHV
6LPLODU3URILOHV'UXJ(IIHFW
Response
Drug B
2
Drug B
Drug A
1
3 Time
Response
'LIIHUHQW3URILOHV
4
&KDSWHU5HSHDWHG0HDVXUHV$129$
([DPSOHV
7KH IROORZLQJ H[DPSOH LV XVHG WR GHPRQVWUDWH WKH µXQLYDULDWH¶ DSSURDFK WR WKH UHSHDWHGPHDVXUHV$129$LQWKHEDODQFHGFDVH
p ([DPSOH$UWKULWLF'LVFRPIRUW)ROORZLQJ9DFFLQH
$SLORWVWXG\ZDVFRQGXFWHGLQSDWLHQWVWRHYDOXDWHWKHHIIHFWRIDQHZ YDFFLQHRQGLVFRPIRUWGXHWRDUWKULWLFRXWEUHDNV)RXUSDWLHQWVZHUH UDQGRPO\DVVLJQHGWRUHFHLYHDQDFWLYHYDFFLQHDQGSDWLHQWVZHUHWR UHFHLYHDSODFHER7KHSDWLHQWVZHUHDVNHGWRUHWXUQWRWKHFOLQLFPRQWKO\IRU PRQWKVDQGHYDOXDWHWKHLUFRPIRUWOHYHOZLWKURXWLQHGDLO\FKRUHVGXULQJWKH SUHFHGLQJPRQWKXVLQJDVFDOHRIQRGLVFRPIRUW WRPD[LPXP GLVFRPIRUW (OLJLELOLW\FULWHULDUHTXLUHGSDWLHQWVWRKDYHDUDWLQJRIDWOHDVW DQLQWKHPRQWKSULRUWRYDFFLQDWLRQ7KHUDWLQJGDWDDUHVKRZQLQ7DEOH ,VWKHUHDQ\HYLGHQFHRIDGLIIHUHQFHLQUHVSRQVHSURILOHVEHWZHHQWKH DFWLYHDQGSODFHERYDFFLQHV"
7$%/(5DZ'DWDIRU([DPSOH
9DFFLQH
3DWLHQW 1XPEHU
0RQWK
0RQWK
0RQWK
$FWLYH
3ODFHER
9LVLW
6ROXWLRQ
8VLQJ D µXQLYDULDWH¶ UHSHDWHG PHDVXUHV $129$ \RX LGHQWLI\ 9DFFLQH DV WKH *URXS HIIHFW DQG 9LVLW DV WKH 7LPH HIIHFW 7KH 3DWLHQWZLWKLQ9DFFLQH DQG 9DFFLQHE\9LVLW HIIHFWV DUH DOVR LQFOXGH LQ WKH $129$7KHVXPRIVTXDUHVIRU HDFK RI WKHVH VRXUFHV FDQ EH FRPSXWHG LQ WKH VDPH PDQQHU DV GHPRQVWUDWHG SUHYLRXVO\&KDSWHUVDQG )LUVW\RXREWDLQWKHPDUJLQDOPHDQVDVVKRZQLQ 7DEOH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(0DUJLQDO6XPPDU\6WDWLVWLFVIRU([DPSOH
9LVLW 0RQWK 0RQWK 0RQWK
9DFFLQH
3DWLHQW 1XPEHU
0HDQ
$FWLYH
0HDQ
6'
3ODFHER
0HDQ
6'
&RPELQHG 0HDQ
7KH9DFFLQHVXPRIVTXDUHVLVSURSRUWLRQDOWRWKHVXPRIVTXDUHGGHYLDWLRQVRIWKH JURXSPHDQVIURPWKHRYHUDOOPHDQ 669DFFLQH ± ± 7KLVLVEDVHGRQGHJUHHRIIUHHGRPEHFDXVHWKHUHDUHWZROHYHOVRIWKH*URXS IDFWRU9DFFLQH 7KHVXPRIVTXDUHVIRU9LVLWEDVHGRQGHJUHHVRIIUHHGRPDQGWKH9DFFLQHE\ 9LVLWLQWHUDFWLRQDOVRZLWKGHJUHHVRIIUHHGRPFDQEHFRPSXWHGDVIROORZV 669LVLW ± ± ± 669DFFLQHE\9LVLW ± ± ± ± ± ± ± 669DFFLQH ±669LVLW ±±
&KDSWHU5HSHDWHG0HDVXUHV$129$
7KH 3DWLHQW ZLWKLQ 9DFFLQH VXP RI VTXDUHV LV IRXQG E\ VXPPLQJ WKH VTXDUHG GHYLDWLRQV EHWZHHQ WKH PHDQ UHVSRQVH IRU HDFK SDWLHQW DQG WKH 9DFFLQH *URXS PHDQZLWKLQZKLFKWKDWSDWLHQWLVQHVWHG:LWKLQHDFK9DFFLQH*URXSWKHIDFWRU 3DWLHQW KDV GHJUHHV RI IUHHGRP VR IRU WKH WZR YDFFLQH JURXSV \RX KDYH GHJUHHVRIIUHHGRPIRU3DWLHQW9DFFLQH 663DWLHQW9DFFLQH ± ± ± ± ± ± ± ± 7KHWRWDOVXPRIVTXDUHVEDVHGRQGHJUHHVRIIUHHGRP LVIRXQGE\VXPPLQJ WKHVTXDUHGGHYLDWLRQVRIHDFKREVHUYDWLRQIURPWKHRYHUDOOPHDQ
667RWDO ± ± ± ± « ± )LQDOO\EHFDXVHWKLVLVDEDODQFHGOD\RXWWKHHUURUVXPRIVTXDUHV66( FDQEH IRXQGE\VXEWUDFWLRQ
66(
667RWDO ±669DFFLQH ±663DWLHQW9DFFLQH ±669LVLW ±669DFFLQHE\9LVLW
±±±±
7KH $129$ WDEOH FDQ QRZ EH FRPSOHWHG DV VKRZQ LQ 7DEOH 7KH PHDQ VTXDUHV IRU HDFK HIIHFW DUH IRXQG E\ GLYLGLQJ WKH VXP RI VTXDUHV 66 E\ WKH GHJUHHV RI IUHHGRP 7KH )WHVWV DUH WKH UDWLRV RI WKH HIIHFW PHDQ VTXDUHV WR WKH DSSURSULDWHHUURUDVVRFLDWHGZLWKWKDWHIIHFW
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/($129$6XPPDU\IRU([DPSOH 6285&(
GI
66
06
)
9DFFLQH
3DWLHQW9DFFLQH
9LVLW
9DFFLQHE\9LVLW
(UURU
7RWDO
6LJQLILFDQWS
7KH)WHVWIRUWKH9DFFLQHHIIHFWLVWKHUDWLRRIWKHPHDQVTXDUHIRU9DFFLQHWRWKH PHDQVTXDUHIRU3DWLHQW9DFFLQH ) 7KH)YDOXHVIRU9LVLW DQG9DFFLQHE\9LVLWDUHIRXQGE\WKHUDWLRVRIWKHPHDQVTXDUHVIRUWKHVHHIIHFWV WRWKHPHDQVTXDUHHUURU06( 7KHQXOOK\SRWKHVLVRIVLPLODUUHVSRQVHSURILOHVRYHUWLPHIRUHDFKRIWKHYDFFLQH JURXSVLVWHVWHGE\WKH9DFFLQHE\9LVLWLQWHUDFWLRQ)RUWZRWUHDWPHQWJURXSVDV XVHGLQWKLVH[DPSOHWKHK\SRWKHVLVFDQEHH[SUHVVHGDVWKHVLPXOWDQHRXVHTXDOLW\ RI 7UHDWPHQW *URXS GLIIHUHQFHV DW HDFK WLPH SRLQW 7KDW LV LI UHSUHVHQWV WKH GLIIHUHQFH LQ PHDQ UHVSRQVHV EHWZHHQ WKH DFWLYH DQG SODFHER JURXSV DW 0RQWKM HJ ± IRUM RUWKHQWKHWHVWVXPPDU\FDQEHH[SUHVVHGDV M
M
M
M
QXOOK\SRWKHVLV
+
+ QRW+
DOWK\SRWKHVLV WHVWVWDWLVWLF
) 069DFFLQHE\9LVLW 06(
GHFLVLRQUXOH
UHMHFW+ LI)!)
FRQFOXVLRQ
%HFDXVH ! \RX UHMHFW + DQG FRQFOXGH WKDWWKHUHLVDVLJQLILFDQW9DFFLQH*URXSE\9LVLW LQWHUDFWLRQ
$
%HFDXVH9DFFLQH*URXSGLIIHUHQFHVDUHEDVHGRQWKHVWXG\PRQWKIXUWKHUDQDO\VHV FDQEHSHUIRUPHGE\GHSLFWLQJWKHPHDQUHVSRQVHVRYHUWLPHLQJUDSKLFDOIRUPDW VHH )LJXUH DQG E\ XVLQJ OLQHDU FRQWUDVWV WR FRPSDUH GLIIHUHQFHV EHWZHHQ YDFFLQH JURXSV LQ WKH FKDQJHV IURP VXFFHVVLYH WLPH SRLQWV $ RQHZD\ $129$ &KDSWHU FDQDOVREHXVHGWRWHVWIRUWKH9DFFLQHHIIHFWDWHDFKYLVLW ,QWKH$129$VXPPDU\WDEOHWKH9LVLWHIIHFWLVDOVRVHHQWREHVLJQLILFDQW7KLV PHDQV WKDW WKH DYHUDJH UHVSRQVHV IRU WKH FRPELQHG YDFFLQH JURXSV GLIIHU DPRQJ HYDOXDWLRQ PRQWKV 7KH )WHVW IRU WKH 9DFFLQH HIIHFW FDQ EH LQWHUSUHWHG DV D
&KDSWHU5HSHDWHG0HDVXUHV$129$
FRPSDULVRQ RI PHDQ UHVSRQVHV EHWZHHQ YDFFLQH JURXSV µDYHUDJHG¶ RYHU DOO PHDVXUHPHQWWLPHV:LWKDQRQVLJQLILFDQW)YDOXHRIIRUWKH9DFFLQHHIIHFW DQG D VLJQLILFDQW LQWHUDFWLRQ IXUWKHU DQDO\VHV DV VXJJHVWHG DERYH DUH UHTXLUHG SULRUWRIRUPLQJDQ\FRQFOXVLRQVZLWKUHJDUGWRWKHPDLQHIIHFWV
6$6$QDO\VLVRI([DPSOH ,QWKH6$6FRGHIRUDQDO\]LQJWKHGDWDIRU([DPSOHWKHLQSXWGDWDVHW$57+5 LV WUDQVIRUPHG WR D QHZ IRUPDW ZLWK REVHUYDWLRQ SHU UHFRUG LQ WKH GDWD VHW ',6&207KLVGDWDVHWLVGLVSOD\HGE\XVLQJ352&35,17VHH2XWSXW å 352&*/0LQ6$6LVDSSOLHGWRWKLVGDWDVHWE\XVLQJD02'(/VWDWHPHQWWKDW LQFOXGHVHIIHFWVIRU9DFFLQH*URXS9$&*53 7LPH9,6,7 DQG3DWLHQW3$7 QHVWHGZLWKLQ9DFFLQH*URXS7KHVHHIIHFWVPXVWEHGHVLJQDWHGDVFODVVYDULDEOHV LQWKH&/$66VWDWHPHQW 7KHLQWHUDFWLRQLVVSHFLILHGLQWKH02'(/VWDWHPHQW DV9$&*53 9,6,77KH66RSWLRQLVXVHGLQWKH02'(/VWDWHPHQWWRUHTXHVW RQO\WKH7\SH,,,VXPVRIVTXDUHV $VVKRZQLQ2XWSXWWKH06(YHULILHVWKHFRPSXWDWLRQ ê1RWLFHWKDW 6$6DXWRPDWLFDOO\FRPSXWHV)YDOXHVIRUHDFKPRGHOHIIHFWE\XVLQJWKH06(DV WKHGHQRPLQDWRUXQOHVVRWKHUZLVHVSHFLILHG7KHSURJUDPPHUPXVWWHOO6$6WRWHVW WKH *URXS HIIHFW 9$&*53 DJDLQVW WKH 3DWLHQWZLWKLQ*URXS 3$79$&*53 µHUURU¶ E\ XVLQJ WKH 7(67 VWDWHPHQW 7KH UHVXOWLQJ )YDOXH RI LV WKH DSSURSULDWH)WHVWIRUWKHQXOOK\SRWKHVLVRIQR9DFFLQH*URXSHIIHFW7KH)YDOXH IRU9$&*53ZKLFKLVVKRZQLQWKHRXWSXWLVQRWPHDQLQJIXOIRUDUHSHDWHG PHDVXUHV $129$ 7KH )YDOXH IRU 3$79$&*53 FDQ DOVR EH LJQRUHG EHFDXVHWKLVWHVWVIRUDGLIIHUHQFHDPRQJSDWLHQWVZLWKLQHDFKJURXSDQGZLOORIWHQ EH LQFRQVHTXHQWLDOO\ VLJQLILFDQW 7KH VLJQLILFDQW 9DFFLQHE\9LVLW LQWHUDFWLRQ ñ VXJJHVWVWKDWWKHGLIIHUHQFHEHWZHHQWUHDWPHQWVLVWLPHUHODWHGDVZDVSUHYLRXVO\ GLVFRYHUHG 6$6&RGHIRU([DPSOH DATA ARTHR; INPUT VACGRP $ DATALINES; ACT 101 6 3 0 ACT 103 7 3 1 ACT 104 4 1 2 ACT 107 8 4 3 PBO 102 6 5 5 PBO 105 9 4 6 PBO 106 5 3 4 PBO 108 6 2 3 ;
PAT MO1 MO2 MO3 ;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
DATA DISCOM; SET ARTHR; KEEP VACGRP PAT VISIT SCORE; SCORE = MO1; VISIT = 1; OUTPUT; SCORE = MO2; VISIT = 2; OUTPUT; SCORE = MO3; VISIT = 3; OUTPUT; RUN; PROC PRINT DATA = DISCOM; å VAR VACGRP PAT VISIT SCORE; TITLE1 ’Repeated-Measures ANOVA’; TITLE2 ’Example 8.1: Arthritic Discomfort Following Vaccine’; RUN; PROC GLM DATA = DISCOM; CLASS VACGRP PAT VISIT; MODEL SCORE = VACGRP PAT(VACGRP) VISIT VACGRP*VISIT/SS3; RANDOM PAT(VACGRP); TEST H=VACGRP E=PAT(VACGRP); QUIT; RUN;
2873876$62XWSXWIRU([DPSOH
Example 8.1:
Repeated-Measures ANOVA Arthritic Discomfort Following Vaccine
Obs
VACGRP
PAT
VISIT
SCORE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO
101 101 101 103 103 103 104 104 104 107 107 107 102 102 102 105 105 105 106 106 106 108 108 108
1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
6 3 0 7 3 1 4 1 2 8 4 3 6 5 5 9 4 6 5 3 4 6 2 3
å
&KDSWHU5HSHDWHG0HDVXUHV$129$
2873876$62XWSXWIRU([DPSOHFRQWLQXHG
Example 8.1:
Repeated-Measures ANOVA Arthritic Discomfort Following Vaccine The GLM Procedure
Class Level Information Class
Levels
VACGRP PAT VISIT
Values
2 8 3
ACT PBO 101 102 103 104 105 106 107 108 1 2 3
Number of observations
24
Dependent Variable: SCORE DF
Sum of Squares
Model 11 Error 12 Corrected Total 23
103.1666667 12.1666667 115.3333333
Source
R-Square 0.894509
Source VACGRP PAT(VACGRP) VISIT VACGRP*VISIT
Source VACGRP
PAT(VACGRP) VISIT VACGRP*VISIT
Coeff Var 24.16609
Mean Square 9.3787879 1.0138889
Root MSE 1.006920
F Value
Pr > F
9.25
0.0003
ê
SCORE Mean 4.166667
DF
Type III SS
Mean Square
F Value
1 6 2 2
10.66666667 25.33333333 58.58333333 8.58333333
10.66666667 4.22222222 29.29166667 4.29166667
10.52 4.16 28.89 4.23
Pr > F
0.0070
0.0171 <.0001 ñ0.0406
Type III Expected Mean Square Var(Error) + 3 Var(PAT(VACGRP)) + Q(VACGRP,VACGRP*VISIT) Var(Error) + 3 Var(PAT(VACGRP)) Var(Error) + Q(VISIT,VACGRP*VISIT) Var(Error) + Q(VACGRP*VISIT)
ò
Tests of Hypotheses Using the Type III MS for PAT(VACGRP) as an Error Term
Source
DF
Type III SS
Mean Square
F Value
VACGRP
1
10.66666667
10.66666667
2.53
Pr > F
0.1631
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHµ0XOWLYDULDWH¶$SSURDFK
7KH PHWKRGRORJ\ GHPRQVWUDWHG LQ ([DPSOH LV D µXQLYDULDWH¶ DSSURDFK WR UHSHDWHG PHDVXUHV DQDO\VLV DQG DV QRWHG UHTXLUHV WKH DVVXPSWLRQ RI FRPSRXQG V\PPHWU\ 7KLV DVVXPSWLRQ LVRIWHQQRWYDOLGHVSHFLDOO\LIWKHWULDOLVOHQJWK\RU WLPHSRLQWVDUHXQHTXDOO\VSDFHGEHFDXVHPHDVXUHPHQWVWDNHQIDUWKHUDSDUWLQWLPH PLJKW EH OHVV FRUUHODWHG WKDQ WKRVH WDNHQ FORVHU WRJHWKHU (UURQHRXV FRQFOXVLRQV PLJKWUHVXOWLIWKHµXQLYDULDWH¶DSSURDFKLVXVHGXQGHUYLRODWLRQRIWKHDVVXPSWLRQ RIFRPSRXQGV\PPHWU\ $µPXOWLYDULDWH¶DSSURDFKFDQVRPHWLPHVEHXVHGWRFLUFXPYHQWWKLVVLWXDWLRQ7KLV DSSURDFKXVHVWKHUHSHDWHGPHDVXUHPHQWVDVPXOWLYDULDWHUHVSRQVHYHFWRUVDQGLV PRUHUREXVWZKHQWKHDVVXPSWLRQRIFRPSRXQGV\PPHWU\LVYLRODWHG 7KH6$6FRGHQHHGHGWRDQDO\]H([DPSOHXVLQJWKHµPXOWLYDULDWH¶DSSURDFKLV VKRZQQH[W7KLVDQDO\VLVXVHVWKHGDWDVHW$57+5ZKLFKLVLQWKHFRUUHFWIRUPDW IRUPXOWLYDULDWHDQDO\VLV1RWLFHWKDWWKH2'6VWDWHPHQWLVXVHGWRFXVWRPL]HWKH RXWSXWVHH6HFWLRQ
1RWH)RUGHWDLOVDERXWXVLQJ2'6VHH³7KH&RPSOHWH*XLGHWRWKH6$6 2XWSXW'HOLYHU\6\VWHP9HUVLRQ´
6$6&RGHIRU([DPSOH±µ0XOWLYDULDWH¶$SSURDFK PROC PRINT DATA = ARTHR; VAR VACGRP PAT MO1 MO2 MO3; RUN;
ô
ODS EXCLUDE PartialCorr ErrorSSCP; PROC GLM DATA = ARTHR; CLASS VACGRP; MODEL MO1 MO2 MO3 = VACGRP / SS3; õ REPEATED VISIT PROFILE / PRINTE SUMMARY; TITLE3 ’[Multivariate Approach]’; QUIT; RUN;
352&35,17LVXVHGWRGLVSOD\WKHGDWDVHWVKRZQLQ2XWSXWô7KHDQDO\VLV LVSHUIRUPHGZLWKWKH*/0SURFHGXUHZKLFKFDQEHXVHGIRUWKHµPXOWLYDULDWH¶DV ZHOO DV WKH µXQLYDULDWH¶ DSSURDFK 7KH 02'(/ VWDWHPHQW LV VSHFLILHG XVLQJ WKH UHVSRQVHV DW HDFK WLPH SRLQW 02 02 02 DV GHSHQGHQW YDULDEOHV LH LQFOXGHG EHIRUH WKH HTXDO VLJQ õ 7KH 5(3($7(' VWDWHPHQW VSHFLILHV WKDW WKHVHGHSHQGHQWYDULDEOHVDUHUHSHDWHGREVHUYDWLRQVDQGSURYLGHV6$6ZLWKDQDPH IRU WKH UHSHDWHG IDFWRU LQ WKLV FDVH 9,6,7 7KH 352),/( RSWLRQ LV XVHG WR FRPSDUHVXFFHVVLYHFKDQJHVDPRQJJURXSVVHHGHWDLOVLQ6HFWLRQ :KHQWKH 35,17(RSWLRQLVVSHFLILHGLQWKH5(3($7('VWDWHPHQWWKH6$6RXWSXWLQFOXGHV DVSKHULFLW\WHVWLQGLFDWHGDV0DXFKO\ V&ULWHULRQ DSSOLHGWRDVHWRIRUWKRJRQDO
&KDSWHU5HSHDWHG0HDVXUHV$129$
FRPSRQHQWV 7KLV WHVW LV EDVHG RQ D FKLVTXDUH DSSUR[LPDWLRQ DQG VLJQLILFDQW YDOXHV LQGLFDWH GHSDUWXUH IURP WKH DVVXPSWLRQ RI FRPSRXQG V\PPHWU\ )RU WKLV H[DPSOHWKHVSKHULFLW\WHVWVDUHQRWVLJQLILFDQWù :KHQWKHUHLVQRHYLGHQFHWRFRQWUDGLFWWKHDVVXPSWLRQRIFRPSRXQGV\PPHWU\DV LQ WKLV FDVH WKH µXQLYDULDWH¶ UHVXOWV SUHYLRXVO\ REWDLQHG FDQ DQG VKRXOG EH XVHG 7KHµPXOWLYDULDWH¶DSSURDFKLVNQRZQWRSURGXFHRYHUO\FRQVHUYDWLYHUHVXOWVZKHQ WKH FRPSRXQG V\PPHWU\ DVVXPSWLRQ LV VDWLVILHG HVSHFLDOO\ ZLWK VPDOO VDPSOH VL]HV7KHRXWSXWFRQILUPVWKLVDVERWKWKH9,6,7 11 DQGWKH9,6,7 9$&*53 12 HIIHFWV KDYH OHVV VLJQLILFDQFH LH JUHDWHU SYDOXHV WKDQ REWDLQHG E\ WKH µXQLYDULDWH¶DSSURDFKQDPHO\YVIRU9,6,7DQGYV IRU 9,6,7 9$&*53 ,Q IDFW WKH LQWHUDFWLRQ GHSLFWHG E\ D SORW RI WKH PHDQV VKRZQLQ)LJXUH JRHVXQGHWHFWHGZKHQWKHµPXOWLYDULDWH¶DSSURDFKLVXVHG ),*85(5HVSRQVH3URILOHV([DPSOH
3ODFHER 5DWLQJ
$FWLYH
0RQWK
2873876$62XWSXWIRU([DPSOH±µ0XOWLYDULDWH¶$SSURDFK Example 8.1:
Repeated-Measures ANOVA Arthritic Discomfort Following Vaccine [Multivariate Approach]
Obs
VACGRP
PAT
MO1
MO2
MO3
1 2 3 4 5 6 7 8
ACT ACT ACT ACT PBO PBO PBO PBO
101 103 104 107 102 105 106 108
6 7 4 8 6 9 5 6
3 3 1 4 5 4 3 2
0 1 2 3 5 6 4 3
ô
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH±µ0XOWLYDULDWH¶$SSURDFK FRQWLQXHG The GLM Procedure Class Level Information Class VACGRP
Levels 2
Values ACT PBO
Number of observations 8 Dependent Variable: MO1 Sum of Source DF Squares Mean Square Model Error Corrected Total
1 6 7
R-Square 0.006993
0.12500000 17.75000000 17.87500000 Coeff Var 26.98009
0.12500000 2.95833333
Root MSE 1.719981
F Value
Pr > F
0.04
0.8439
MO1 Mean 6.375000
Source
DF
Type III SS
Mean Square
F Value
Pr > F
VACGRP
1
0.12500000
0.12500000
0.04
0.8439
Dependent Variable: MO2 Source Model Error Corrected Total
DF
Sum of Squares
1 6 7
1.12500000 9.75000000 10.87500000
R-Square 0.103448
Source VACGRP
Coeff Var 40.79216
DF 1
Type III SS 1.12500000
Mean Square
F Value
Pr > F
1.12500000 1.62500000
0.69
0.4372
Root MSE 1.274755
MO2 Mean 3.125000
Mean Square 1.12500000
F Value 0.69
Pr > F 0.4372
Dependent Variable: MO3
Source Model Error Corrected Total
DF
Sum of Squares
1 6 7
18.00000000 10.00000000 28.00000000
R-Square 0.642857
Coeff Var 43.03315
Mean Square
F Value
Pr > F
18.00000000 1.66666667
10.80
0.0167
Root MSE 1.290994
MO3 Mean 3.000000
Source
DF
Type III SS
Mean Square
F Value
Pr > F
VACGRP
1
18.00000000
18.00000000
10.80
0.0167
&KDSWHU5HSHDWHG0HDVXUHV$129$
2873876$62XWSXWIRU([DPSOH±µ0XOWLYDULDWH¶$SSURDFK FRQWLQXHG Example 8.1:
Repeated-Measures ANOVA Arthritic Discomfort Following Vaccine [Multivariate Approach] The GLM Procedure
Repeated Measures Analysis of Variance Repeated Measures Level Information Dependent Variable Level of VISIT
MO1 1
MO2 2
MO3 3
Sphericity Tests
Variables Transformed Variates Orthogonal Components
DF
Mauchly’s Criterion
2 2
0.8962875 0.9547758
Chi-Square
Pr > ChiSq
0.5474703 0.2313939
0.7605 0.8907
ù
Manova Test Criteria and Exact F Statistics for the Hypothesis of no VISIT Effect H = Type III SSCP Matrix for VISIT E = Error SSCP Matrix S=1
M=0
Statistic
Value
Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
0.10207029 0.89792971 8.79716981 8.79716981
11
N=1.5
F Value
Num DF
21.99 21.99 21.99 21.99
Den DF
2 2 2 2
5 5 5 5
Manova Test Criteria and Exact F Statistics for the Hypothesis of no VISIT*VACGRP Effect H = Type III SSCP Matrix for VISIT*VACGRP
Pr > F 0.0033 0.0033 0.0033 0.0033
12
E = Error SSCP Matrix S=1 Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=0
N=1.5
Value
F Value
0.44444444 0.55555556 1.25000000 1.25000000
3.13 3.13 3.13 3.13
Num DF 2 2 2 2
Den DF 5 5 5 5
Pr > F 0.1317 0.1317 0.1317 0.1317
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH±µ0XOWLYDULDWH¶$SSURDFK FRQWLQXHG Example 8.1:
Repeated-Measures ANOVA Arthritic Discomfort Following Vaccine [Multivariate Approach] The GLM Procedure
Repeated Measures Analysis of Variance Tests of Hypotheses for Between Subjects Effects Source
DF
Type III SS
Mean Square
VACGRP Error
1 6
10.66666667 25.33333333
10.66666667 4.22222222
F Value
Pr > F
2.53 13
0.1631
Univariate Tests of Hypotheses for Within Subject Effects Source
DF
Type III SS
Mean Square
F Value
Pr > F
VISIT VISIT*VACGRP Error(VISIT)
2 2 12
58.58333333 8.58333333 12.16666667
29.29166667 4.29166667 1.01388889
28.89 4.23
<.0001 0.0406
Source VISIT VISIT*VACGRP Error(VISIT)
Adj Pr > F G - G H - F <.0001 0.0433
Greenhouse-Geisser Epsilon Huynh-Feldt Epsilon
<.0001 0.0406
0.9567 14 1.6282
Analysis of Variance of Contrast Variables VISIT_N represents the nth successive difference in VISIT Contrast Variable: VISIT_1 Source
DF
Type III SS
Mean Square
F Value
Mean VACGRP Error
1 1 6
84.50000000 0.50000000 11.00000000
84.50000000 0.50000000 1.83333333
46.09 0.27
Pr > F 0.0005 0.6202 15
Contrast Variable: VISIT_2
Source
DF
Type III SS
Mean Square
F Value
Mean VACGRP Error
1 1 6
0.12500000 10.12500000 10.75000000
0.12500000 10.12500000 1.79166667
0.07 5.65
Pr > F 0.8005 0.0550 15
&KDSWHU5HSHDWHG0HDVXUHV$129$
)RXU PXOWLYDULDWH PHWKRGV GXH WR :LONV 3LOODL +RWHOOLQJ DQG 5R\ VHH 6HFWLRQ DUHXVHGWRREWDLQWKH)WHVWVWDWLVWLFIRUWKHVHZLWKLQSDWLHQWHIIHFWV1RWLFH WKDWWKHWHVWIRUWKHJURXSHIIHFW9$&*53 13 LVWKHVDPHDVLQWKHµXQLYDULDWH¶ DQDO\VLV 7KLV H[DPSOH LOOXVWUDWHVWKHPHWKRGE\ZKLFKWKHµPXOWLYDULDWH¶DSSURDFKFDQEH FRQGXFWHGDOWKRXJKLWLVQRWWKHEHVWDSSURDFKIRUDQDO\]LQJWKHGDWDLQWKLVFDVH EHFDXVH RI WKH VPDOO VDPSOH VL]HV DQG FRQVLVWHQF\ ZLWK FRPSRXQG V\PPHWU\ 2XWSXWIXUWKHUVKRZVKRZWKHµPXOWLYDULDWH¶DSSURDFKSURYLGHVDVHULHVRIRQH ZD\$129$¶VDWHDFKWLPHSRLQWGHSHQGHQWYDULDEOHV0202DQG02 DQG KRZFKDQJHVDFURVVWLPHSRLQWVFDQEHFRPSDUHG$VLPSOHH[HFXWLRQRIWKH2'6 VWDWHPHQWLQ6$6WRFXVWRPL]HWKHRXWSXWLVDOVRVKRZQ7KHVHSRLQWVDUHIXUWKHU GLVFXVVHGLQ6HFWLRQ $GGLWLRQDOIDFWRUVFDQDOVREHXVHGLQWKHUHSHDWHGPHDVXUHVDQDO\VLV([DPSOH VKRZV KRZ 6WXG\ &HQWHU FDQ EH XVHG DV D EORFNLQJ HIIHFW LQ DQ XQEDODQFHG UHSHDWHGPHDVXUHVOD\RXWDQGIXUWKHULOOXVWUDWHVWKHµPXOWLYDULDWH¶DSSURDFK
p ([DPSOH7UHDGPLOO:DONLQJ'LVWDQFHLQ,QWHUPLWWHQW&ODXGLFDWLRQ
3DWLHQWVZHUHUDQGRPO\DVVLJQHGWRUHFHLYHHLWKHUWKHQHZGUXJ1RYDI\OOLQH WKRXJKWWRUHGXFHWKHV\PSWRPVRILQWHUPLWWHQWFODXGLFDWLRQRUDSODFHERLQD PRQWKGRXEOHEOLQGVWXG\7KHSULPDU\PHDVXUHPHQWRIHIILFDF\LVWKH ZDONLQJGLVWDQFHRQDWUHDGPLOOXQWLOGLVFRQWLQXDWLRQGXHWRFODXGLFDWLRQ SDLQ,QWZRVWXG\FHQWHUVSDWLHQWVXQGHUZHQWWUHDGPLOOWHVWLQJDWEDVHOLQH 0RQWK DQGDWHDFKRIPRQWKO\IROORZXSYLVLWV7KHWUHDGPLOOZDONLQJ GLVWDQFHVLQPHWHUV DUHVKRZQLQ7DEOH,VWKHUHDQ\GLVWLQFWLRQLQ H[HUFLVHWROHUDQFHSURILOHVEHWZHHQSDWLHQWVZKRUHFHLYH1RYDI\OOLQHDQG WKRVHRQSODFHER"7KHVHULHVSDWLHQWQXPEHUVDUHIURP&HQWHUWKH VHULHVSDWLHQWQXPEHUVDUHIURP&HQWHU
6ROXWLRQ
$VXPPDU\RIWKHPHDQZDONLQJGLVWDQFHVLQPHWHUV DQGVDPSOHVL]HVLVVKRZQ LQ7DEOH %HFDXVHRIWKHVDPSOHVL]HLPEDODQFHWKHPHWKRGVLOOXVWUDWHGLQ([DPSOHIRU PDQXDOO\FRPSXWLQJWKHVXPVRIVTXDUHVFDQQRWEHXVHG,QVXFKFDVHVWKH7\SH ,,,VXPVRIVTXDUHVDUHPRVWHIILFLHQWO\FRPSXWHGE\XVLQJPDWUL[DOJHEUDZLWKD SURJUDPVXFKDV6$67KHµPXOWLYDULDWH¶DSSURDFKLVXVHGZLWK352&*/0IRU WKLVH[DPSOH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(5DZ'DWDIRU([DPSOH
1RYDI\OOLQH*URXS
7UHDWPHQW0RQWK
3DWLHQW 1XPEHU
3ODFHER*URXS
7UHDWPHQW0RQWK
3DWLHQW 1XPEHU
&KDSWHU5HSHDWHG0HDVXUHV$129$
7$%/(6XPPDU\6WDWLVWLFVIRU([DPSOH
0RQWK
7UHDWPHQW
&HQWHU
$FWLYH
&RPELQHG
3ODFHER
&RPELQHG
7RWDO
,QWKLVH[DPSOHSDWLHQWVDUHQHVWHGZLWKLQHDFK7UHDWPHQWE\&HQWHUFHOOVRWKDW WKHIDFWRUV7UHDWPHQW&HQWHUDQG7UHDWPHQWE\&HQWHUDOOUHSUHVHQWEHWZHHQ SDWLHQWHIIHFWV7KH)WHVWVIRUWKHVHHIIHFWVWKHUHIRUHDUHEDVHGRQWKH 3DWLHQWZLWKLQ7UHDWPHQWE\&HQWHU µHUURU¶
6$6$QDO\VLVRI([DPSOH $V VKRZQ LQ WKH 6$6 FRGH WKDW IROORZV WKH UHSHDWHG PHDVXUHV DUH VSHFLILHG DV GHSHQGHQW YDULDEOHV RQ WKH OHIW VLGH RI WKH HTXDO VLJQ LQ WKH 02'(/ VWDWHPHQW 16 DQGWKHEHWZHHQSDWLHQWHIIHFWVDUHVSHFLILHGWRWKHULJKWRIWKHHTXDO VLJQ 7KH 5(3($7(' VWDWHPHQW WHOOV 6$6 WKDW WKHVH GHSHQGHQW YDULDEOHV DUH UHSHDWHGPHDVXUHVDQGODEHOVWKHWLPHIDFWRUDV0217+ 17 7KH35,17(RSWLRQ LVXVHGWRREWDLQWKHVSKHULFLW\WHVWIRUFRPSRXQGV\PPHWU\2QO\VHOHFWHGSRUWLRQV RIWKHHQWLUHRXWSXWDUHUHSURGXFHGKHUHXVLQJRQHRIWKH2'6IHDWXUHVLQ6$6VHH 6HFWLRQ 8QOHVVVXSSUHVVHGZLWKWKH1281,RSWLRQLQWKH02'(/VWDWHPHQW6$6SULQWV WKH UHVXOWV RI D WZRZD\ $129$ IRU HDFK WLPH SRLQW GHQRWHG E\ WKH GHSHQGHQW YDULDEOHV:':':':'DQG:'2XWSXWLQGLFDWHVQRVLJQLILFDQW 7UHDWPHQWRU&HQWHUPDLQHIIHFWVDQGQR7UHDWPHQWE\&HQWHULQWHUDFWLRQDWDQ\RI WKHLQGLYLGXDOPHDVXUHPHQWWLPHVS!IRUHDFK 18
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6$6&RGHIRU([DPSOH DATA WDVIS; INPUT TREATMNT $ PAT WD0 WD1 WD2 WD3 WD4; CENTER = INT(PAT/100); DATALINES; ACT 101 190 212 213 195 248 ACT 102 98 137 185 215 225 ACT 104 155 145 196 189 176 (more data lines)... PBO 216 165 140 153 180 155 PBO 217 196 195 204 188 178 ; ODS EXCLUDE GLM.Repeated.MANOVA.Model.Error.ErrorSSCP GLM.Repeated.MANOVA.Model.Error.PartialCorr GLM.Repeated.WithinSubject.ModelANOVA GLM.Repeated.WithinSubject.Epsilons; PROC GLM DATA = WDVIS; CLASS TREATMNT CENTER; 16 MODEL WD0 WD1 WD2 WD3 WD4 = TREATMNT CENTER TREATMNT*CENTER / SS3; REPEATED MONTH CONTRAST(1) / PRINTE SUMMARY; 17 TITLE1 ’Repeated-Measures ANOVA’; TITLE2 ’Example 8.2: Treadmill Walking Distance in Intermittent Claudication’; QUIT; RUN;
7KH µPXOWLYDULDWH¶ DQDO\VLV LQGLFDWHV D VLJQLILFDQW WHVW IRU FRPSRXQG V\PPHWU\ EDVHGRQWKHVSKHULFLW\WHVWVS 19 ZKLFKZRXOGLQYDOLGDWHWKHµXQLYDULDWH¶ DSSURDFK DV GHPRQVWUDWHG LQ ([DPSOH 7KH 0217+ DQG 0217+ 75($7017 HIIHFWV DUH VLJQLILFDQW EDVHG RQ WKH µPXOWLYDULDWH¶ WHVWV VKRZQ DV :LONV /DPEGD 3LOODL¶V 7UDFH +RWHOOLQJ/DZOH\ 7UDFH DQG 5R\¶V *UHDWHVW5RRW 7KHVHWHVWVDOO\LHOGWKHVDPH)YDOXHVIRUHDFKHIIHFWWKDWLQYROYHV 0217+ IRU WKLV GDWD VHW 7KH SYDOXH RI 20 IRU WKH 0217+ 75($7017 LQWHUDFWLRQ LQGLFDWHV WKDW WKH UHVSRQVH SURILOHV GLIIHU EHWZHHQJURXSV7KLVGLIIHUHQFHLVGHSLFWHGLQ)LJXUHZKLFKVKRZVWKHPHDQ UHVSRQVHDWHDFKPRQWKIRUWKHWZRWUHDWPHQWJURXSV 7KH 7UHDWPHQW HIIHFW ZLWK DQ )YDOXH RI LV QRW VLJQLILFDQW 21 ZKHQ µDYHUDJHG¶RYHUDOOPHDVXUHPHQWWLPHVS 7KH&HQWHUDQG7UHDWPHQWE\ &HQWHUHIIHFWVDUHDOVRQRQVLJQLILFDQW 22 7KHWLPHHIIHFW0217+ LVVLJQLILFDQW 23 IRUWKHFRPELQHGWUHDWPHQWJURXSVDQGVWXG\FHQWHUVS +RZHYHUWKLV VKRXOGQRWEHLQWHUSUHWHGDORQHZLWKRXWFRQVLGHUDWLRQRIWKHVLJQLILFDQWLQWHUDFWLRQ ZKLFKLQGLFDWHVDWUHDWPHQWSURILOHGLIIHUHQFH
&KDSWHU5HSHDWHG0HDVXUHV$129$
,Q WKH 5(3($7(' VWDWHPHQW &2175$67 LV VSHFLILHG LQ DGGLWLRQ WR WKH 6800$5<RSWLRQ 17 WRWHOO6$6WRSURYLGHDQDQDO\VLVRIWKHFKDQJHVLQPHDQ UHVSRQVHIURPWKHILUVWWLPHSHULRGXVHG%HFDXVHWKHILUVWWLPHSHULRGLVµ0RQWK¶ ZKLFKUHIHUVWREDVHOLQHSULRUWRPHGLFDWLRQ WKH&2175$67 RSWLRQSURYLGHV EHWZHHQWUHDWPHQWJURXSFRPSDULVRQVRIWKHPHDQFKDQJHVIURPEDVHOLQHWRHDFK WLPHSRLQW 24 7KHVHUHVXOWVLQGLFDWHDVLJQLILFDQWS GLIIHUHQFHEHWZHHQWKH DFWLYHDQGSODFHERJURXSVLQPHDQZDONLQJGLVWDQFHFKDQJHVIURPEDVHOLQHDWHDFK RI WKH IROORZXS PRQWKV SYDOXHV RI DQG 1RWLFHWKDWWKH5(3($7('VWDWHPHQWODEHOV0RQWKVDQGDV0217+B 0217+B0217+BDQG0217+BUHVSHFWLYHO\ 2873876$62XWSXWIRU([DPSOH
Example 8.2:
Repeated-Measures ANOVA Treadmill Walking Distance in Intermittent Claudication The GLM Procedure Class Level Information Class Levels Values TREATMNT CENTER
2 2
ACT PBO 1 2
Number of observations
38
Dependent Variable: WD0 Source
DF
Model Error Corrected Total
3 34 37
R-Square 0.034023 Source TREATMNT CENTER TREATMNT*CENTER
Sum of Squares Mean Square 2190.13085 62181.76389 64371.89474
730.04362 1828.87541
F Value
Pr > F
0.40
0.7545
Coeff Var 25.01668
Root MSE 42.76535
WD0 Mean 170.9474
DF
Type III SS
Mean Square
F Value
1 1 1
0.280018 45.591846 2088.387545
0.280018 45.591846 2088.387545
0.00 0.02 1.14
Pr > F 0.9902 0.8755 18 0.2928
Dependent Variable: WD1 Source
DF
Model Error Corrected Total
3 34 37
R-Square 0.071590 Source TREATMNT CENTER TREATMNT*CENTER
Sum of Squares Mean Square 3825.59284 49611.98611 53437.57895
1275.19761 1459.17606
F Value
Pr > F
0.87
0.4642
Coeff Var 21.59429
Root MSE 38.19916
WD1 Mean 176.8947
DF
Type III SS
Mean Square
F Value
1 1 1
3215.520161 788.946685 201.720878
3215.520161 788.946685 201.720878
2.20 0.54 0.14
Pr > F 0.1469 0.4672 18 0.7123
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Example 8.2:
Repeated-Measures ANOVA Treadmill Walking Distance in Intermittent Claudication The GLM Procedure
Dependent Variable: WD2 Source
DF
Sum of Squares
Model Error Corrected Total
3 34 37
7128.96857 74577.34722 81706.31579
R-Square 0.087251 Source TREATMNT CENTER TREATMNT*CENTER
Mean Square
F Value
Pr > F
2376.32286 2193.45139
1.08
0.3693
Coeff Var 24.88399
Root MSE 46.83430
WD2 Mean 188.2105
DF
Type III SS
Mean Square
F Value
1 1 1
6860.021953 504.355287 225.215502
6860.021953 504.355287 225.215502
3.13 0.23 0.10
Pr > F 0.0860 0.6346 18 0.7506
Dependent Variable: WD3 Source
DF
Model Error Corrected Total
3 34 37
R-Square 0.115061 Source TREATMNT CENTER TREATMNT*CENTER
Sum of Squares 7086.08918 54499.30556 61585.39474
Mean Square 2362.02973 1602.92075
F Value
Pr > F
1.47
0.2391
Coeff Var 20.90104
Root MSE 40.03649
WD3 Mean 191.5526
DF
Type III SS
Mean Square
F Value
1 1 1
3542.716846 2512.286738 771.211470
3542.716846 2512.286738 771.211470
2.21 1.57 0.48
Pr > F 0.1463 0.2191 18 0.4926
Dependent Variable: WD4 Source
DF
Model Error Corrected Total
3 34 37
R-Square 0.113554 Source TREATMNT CENTER TREATMNT*CENTER
Sum of Squares 8437.19591 65863.77778 74300.97368
Mean Square 2812.39864 1937.16993
F Value
Pr > F
1.45
0.2450
Coeff Var 22.45275
Root MSE 44.01329
WD4 Mean 196.0263
DF
Type III SS
Mean Square
F Value
1 1 1
5975.405018 1141.225806 1319.290323
5975.405018 1141.225806 1319.290323
3.08 0.59 0.68
Pr > F 0.0880 0.4481 18 0.4150
&KDSWHU5HSHDWHG0HDVXUHV$129$
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Example 8.2:
Repeated-Measures ANOVA Treadmill Walking Distance in Intermittent Claudication The GLM Procedure
Repeated Measures Level Information Dependent Variable Level of MONTH
WD0 1
WD1 2
WD2 3
WD3 4
WD4 5
Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r| DF = 34
WD0
WD1
WD2
WD3
WD4
WD0
1.000000
0.881446 <.0001
0.778029 <.0001
0.781755 <.0001
0.742002 <.0001
WD1
0.881446 <.0001
1.000000
0.703789 <.0001
0.663442 <.0001
0.754124 <.0001
WD2
0.778029 <.0001
0.703789 <.0001
1.000000
0.695239 <.0001
0.724997 <.0001
WD3
0.781755 <.0001
0.663442 <.0001
0.695239 <.0001
1.000000
0.826616 <.0001
WD4
0.742002 <.0001
0.754124 <.0001
0.724997 <.0001
0.826616 <.0001
1.000000
Sphericity Tests
Variables Transformed Variates Orthogonal Components
DF
Mauchly’s Criterion
Chi-Square
9 9
0.3190188 0.5139265
37.036211 21.578963
Pr > ChiSq <.0001 0.0103 19
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Example 8.2:
Repeated-Measures ANOVA Treadmill Walking Distance in Intermittent Claudication Manova Test Criteria and Exact F Statistics for the Hypothesis of no MONTH Effect H = Type III SSCP Matrix for MONTH E = Error SSCP Matrix S=1
Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=1
Value 0.56579676 0.43420324 0.76741911 0.76741911
N=14.5
F Value 5.95 5.95 5.95 5.95
Num DF 4 4 4 4
Den DF 31 31 31 31
Pr > F 0.0011 0.0011 23 0.0011 0.0011
Manova Test Criteria and Exact F Statistics for the Hypothesis of no MONTH*TREATMNT Effect H = Type III SSCP Matrix for MONTH*TREATMNT E = Error SSCP Matrix S=1 Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=1
Value 0.68930532 0.31069468 0.45073593 0.45073593
N=14.5
F Value 3.49 3.49 3.49 3.49
Num DF 4 4 4 4
Den DF 31 31 31 31
Pr > F 0.0182 0.0182 20 0.0182 0.0182
Manova Test Criteria and Exact F Statistics for the Hypothesis of no MONTH*CENTER Effect H = Type III SSCP Matrix for MONTH*CENTER E = Error SSCP Matrix S=1 Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=1
Value 0.81294393 0.18705607 0.23009714 0.23009714
N=14.5
F Value 1.78 1.78 1.78 1.78
Num DF 4 4 4 4
Den DF 31 31 31 31
Pr > F 0.1574 0.1574 0.1574 0.1574
Manova Test Criteria and Exact F Statistics for the Hypothesis of no MONTH*TREATMNT*CENTER Effect H = Type III SSCP Matrix for MONTH*TREATMNT*CENTER E = Error SSCP Matrix S=1 Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=1
Value 0.88498645 0.11501355 0.12996081 0.12996081
N=14.5
F Value 1.01 1.01 1.01 1.01
Num DF 4 4 4 4
Den DF 31 31 31 31
Pr > F 0.4188 0.4188 0.4188 0.4188
&KDSWHU5HSHDWHG0HDVXUHV$129$
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Example 8.2:
Repeated-Measures ANOVA Treadmill Walking Distance in Intermittent Claudication
The GLM Procedure Repeated Measures Analysis of Variance Tests of Hypotheses for Between Subjects Effects Source TREATMNT CENTER TREATMNT*CENTER Error
DF 1 1 1 34
Type III SS 15215.6775 141.5815 3864.2911 245350.6639
Mean Square 15215.6775 141.5815 3864.2911 7216.1960
F Value 2.11 0.02 0.54
Pr > F 0.1556 21 0.8894 0.4693 22
Analysis of Variance of Contrast Variables MONTH_N represents the contrast between the nth level of MONTH and the 1st Contrast Variable: MONTH_2 Source DF Type III SS
Mean Square
F Value
1035.12545 3275.81362 455.22581 992.00000 13878.44444
1035.12545 3275.81362 455.22581 992.00000 408.18954
2.54 8.03 1.12 2.43
Contrast Variable: MONTH_3 Source DF Type III SS
Mean Square
F Value
9854.88351 6947.95878 246.66846 941.98029 30794.47222
9854.88351 6947.95878 246.66846 941.98029 905.71977
10.88 7.67 0.27 1.04
Contrast Variable: MONTH_4 Source DF Type III SS
Mean Square
F Value
Mean TREATMNT CENTER TREATMNT*CENTER Error
13127.49507 3605.98970 3234.75314 321.41980 25663.01389
13127.49507 3605.98970 3234.75314 321.41980 754.79453
17.39 4.78 4.29 0.43
Contrast Variable: MONTH_5 Source DF Type III SS
Mean Square
F Value
Mean TREATMNT CENTER TREATMNT*CENTER Error
20601.23701 6057.49507 1643.02195 87.92518 972.78717
Mean TREATMNT CENTER TREATMNT*CENTER Error
Mean TREATMNT CENTER TREATMNT*CENTER Error
1 1 1 1 34
1 1 1 1 34
1 1 1 1 34
1 1 1 1 34
20601.23701 6057.49507 1643.02195 87.92518 33074.76389
21.18 6.23 1.69 0.09
Pr > F 0.1205 0.0077 24 0.2984 0.1283
Pr > F 0.0023 0.0090 24 0.6051 0.3150
Pr > F 0.0002 0.0358 24 0.0461 0.5184
Pr > F <.0001 0.0176 24 0.2025 0.7655
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
),*85(0HDQ:DONLQJ'LVWDQFHVIRU([DPSOH
220
WD (meters)
Novafylline 205 190 175 Placebo 160 0 1 2 3 4 Month 0LVVLQJ'DWD $QDO\VHVRIUHSHDWHGPHDVXUHVXVLQJWKHJHQHUDOOLQHDUPRGHO*/0 LQ6$6FDQ EH GLVFRQFHUWLQJ LQ WKH SUHVHQFH RI PLVVLQJ GDWD VHH 6HFWLRQ ,I WKH µPXOWLYDULDWH¶DSSURDFKLVXVHGRQO\PLVVLQJYDOXHIURPHDFKRIVHYHUDOVXEMHFWV FDQOHDGWRWKHH[FOXVLRQRIODUJHDPRXQWVRIGDWDQRWRQO\ORZHULQJWKHSRZHURI WKHVWDWLVWLFDOWHVWVEXWDOVRUHPRYLQJLQIRUPDWLRQIURPWKHDQDO\VLVWKDWFRXOGEH YDOXDEOHLQLQWHUSUHWLQJWKHWUHQGVRIWKHGDWD
352&0,;('LQ6$6KDQGOHVPLVVLQJYDOXHVPRUHHIILFLHQWO\WKDQ352&*/0 DOORZLQJWKHLQFOXVLRQRIVXEMHFWVZLWKPLVVLQJGDWD,QDGGLWLRQWKHDVVXPSWLRQRI FRPSRXQGV\PPHWU\LVQRWUHTXLUHGZLWK352&0,;(',QIDFWWKHDQDO\VWFDQ VSHFLI\DPRGHOWKDWXVHVWKHPRVWDSSURSULDWHFRUUHODWLRQSDWWHUQVDPRQJSDLUVRI PHDVXUHPHQWVDFURVVWLPH ([DPSOHGHPRQVWUDWHVWKHXVHRI352&0,;('WRDQDO\]HUHSHDWHGPHDVXUHV LQWKHSUHVHQFHRIPLVVLQJYDOXHV
&KDSWHU5HSHDWHG0HDVXUHV$129$
p ([DPSOH'LVHDVH3URJUHVVLRQLQ$O]KHLPHU¶V7ULDO
3DWLHQWVZHUHUDQGRPL]HGWRUHFHLYHRQHRIWZRGDLO\GRVHV/ ORZGRVHRU + KLJKGRVH RIDQHZWUHDWPHQWIRU$O]KHLPHU¶VGLVHDVH$' RUDSODFHER 3 LQDSDUDOOHOVWXG\GHVLJQ(DFKSDWLHQWZDVWRUHWXUQWRWKHFOLQLFHYHU\ PRQWKVIRU\HDUIRUDVVHVVPHQWRIGLVHDVHSURJUHVVLRQEDVHGRQFRJQLWLYH PHDVXUHPHQWVRQWKH$O]KHLPHU¶V'LVHDVH$VVHVVPHQW6FDOH$'$6FRJ 7KLVWHVWHYDOXDWHVPHPRU\ODQJXDJHDQGSUD[LVIXQFWLRQDQGLVEDVHGRQ WKHVXPRIVFRUHVIURPDQLWHPVFDOHZLWKDSRWHQWLDOUDQJHRIWR KLJKHUVFRUHVLQGLFDWLYHRIJUHDWHUGLVHDVHVHYHULW\7KHSULPDU\JRDOLVWR GHWHUPLQHLIWKHUDWHRIGLVHDVHSURJUHVVLRQLVVORZHGZLWKDFWLYHWUHDWPHQW FRPSDUHGZLWKDSODFHER7KHGDWDDUHVKRZQLQ7DEOHDGHFLPDOSRLQW UHSUHVHQWVPLVVLQJYDOXHV ,VWKHUHDGLIIHUHQFHLQUHVSRQVHSURILOHVRYHU WLPHDPRQJWKHWKUHHJURXSV"
7$%/(5DZ'DWDIRU([DPSOH 7UHDWPHQW *URXS + + + + + + + + + + + + + + + + + + + + + + + + / / / / / / / / /
3DWLHQW 6WXG\0RQWK 1XPEHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(5DZ'DWDIRU([DPSOHFRQWLQXHG 7UHDWPHQW *URXS / / / / / / / / / / / / / / / / / 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3DWLHQW 6WXG\0RQWK 1XPEHU
&KDSWHU5HSHDWHG0HDVXUHV$129$
6ROXWLRQ
7KHPHDQ$'$6FRJVFRUHVDUHVKRZQE\7UHDWPHQWDQG0RQWKLQ7DEOHDQG )LJXUH 0HDQ VFRUHV VKRZ DQ LQFUHDVLQJ WUHQG DFURVV WLPH ZLWKLQ HDFK 7UHDWPHQW *URXS LQGLFDWLYH RI GLVHDVH SURJUHVVLRQ 7KH TXHVWLRQ LV ZKHWKHU WKH UDWH RI LQFUHDVH KDV EHHQ VORZHG E\ XVLQJ DQ DFWLYH WUHDWPHQW FRPSDUHG ZLWK D SODFHER 6WDWLVWLFDOO\ WKLV LV DGGUHVVHG E\ H[DPLQLQJ WKH 7UHDWPHQWE\0RQWK LQWHUDFWLRQHIIHFW 7$%/(6XPPDU\6WDWLVWLFVIRU([DPSOH Month Treatment
Placebo (N=30) Low Dose (N=26) High Dose (N=24) Combined (N=80)
4
6
8
10
12
31.6 (8.4) 29 30.6 (7.6) 25 31.6 (10.0) 24 31.3 (8.6) 78
34.6 (8.0) 28 34.4 (8.5) 23 34.4 (9.1) 23 34.5 (8.4) 74
36.2 (7.7) 29 36.1 (9.1) 25 35.0 (9.0) 22 35.8 (8.5) 76
39.2 (8.4) 28 35.2 (10.0) 25 35.0 (9.5) 24 36.6 (9.4) 77
39.2 (7.4) 29 37.2 (10.7) 25 34.0 (7.3) 23 37.0 (8.7) 77
40.8 (9.2) 26 38.0 (10.3) 24 35.9 (9.4) 22 38.4 (9.7) 72
),*85($'$6FRJ5HVSRQVH3URILOHV([DPSOH 3%2
/2:'26(
+,*+'26(
$'$6FRJ
Mean (SD) N Mean (SD) N Mean (SD) N Mean (SD) N
2
0RQWK
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6$6$QDO\VLVRI([DPSOH
,QLWLDOO\ \RX PLJKW SURFHHG ZLWK D UHSHDWHG PHDVXUHV DQDO\VLV XVLQJ D µPXOWLYDULDWH¶DSSURDFK7KH6$6VWDWHPHQWVIRUGRLQJWKLVDUHVKRZQLQWKHFRGH WKDW IROORZV 7KH GDWD DUH UHDG LQWR WKH GDWD VHW $/=+056 (DFK SDWLHQW¶V UHSHDWHG PHDVXUHPHQWV FRPSULVH D VHSDUDWH UHFRUG 7KLV IRUPDW DOORZV GLUHFW PXOWLYDULDWH PRGHO VSHFLILFDWLRQ LQ 352& */0 DV VKRZQ 25 7KH 5(3($7(' VWDWHPHQW LGHQWLILHV 0217+ DV WKH WLPH IDFWRU DQG UHTXHVWV VSKHULFLW\ WHVWV 35,17(RSWLRQ DQGZLWKLQSDWLHQWHIIHFWVVXPPDULHV6800$5<RSWLRQ 26 6$6&RGHIRU([DPSOH352&*/0 DATA ALZHMRS; INPUT TREAT $ DATALINES; L 1 22 30 . L 5 34 35 46 L 8 40 41 41
PAT ADAS02 ADAS04 ADAS06 ADAS08 ADAS10 ADAS12; 33 37 46
28 31 52
30 35 48
. . .(more data lines). . .
P 71 P 74 P 77 ;
32 . 24
34 50 32
27 52 31
30 56 37
36 52 35
35 54 30
ODS SELECT GLM.Repeated.PartialCorr GLM.Repeated.MANOVA.Model.Error.Sphericity GLM.Repeated.MANOVA.Model.MONTH.MultStat GLM.Repeated.MANOVA.Model.MONTH_TREAT.MultStat GLM.Repeated.WithinSubject.ModelANOVA; PROC GLM DATA = ALZHMRS; CLASS TREAT; MODEL ADAS02 ADAS04 ADAS06 ADAS08 ADAS10 ADAS12 = TREAT / NOUNI; 26 REPEATED MONTH / PRINTE SUMMARY; TITLE1 ’Repeated-Measures ANOVA’; TITLE2 ‘Example 8.3: Disease Progression in Alzheimer’s; TITLE3 ' '; TITLE4 'Multivariate Approach Using GLM'; QUIT; RUN;
25
,Q 2XWSXW WKH VHFWLRQ ODEHOHG 6SKHULFLW\ 7HVWV 27 LQGLFDWHV D VLJQLILFDQW GHSDUWXUH IURP WKH DVVXPSWLRQ RI FRPSRXQG V\PPHWU\ S 7KHUHIRUH WKH µXQLYDULDWH¶ DSSURDFK GHVFULEHG HDUOLHU LQ WKLV FKDSWHU ZRXOG QRW EH DSSURSULDWH /RRNLQJ IXUWKHU LQ WKH RXWSXW \RX VHH WKDW WKH PXOWLYDULDWH WHVWV :LONV¶ /DPEGD 3LOODL¶V 7UDFH DQG +RWHOOLQJ/DZOH\ 7UDFH IRU WKH 0RQWKE\ 7UHDWPHQW LQWHUDFWLRQ DUH QRW VLJQLILFDQW ZLWK SYDOXHV RI 28 7KH
&KDSWHU5HSHDWHG0HDVXUHV$129$
µXQLYDULDWH¶ WHVW IRU WKH 0RQWKE\7UHDWPHQW LQWHUDFWLRQ 29 SURYLGHV WKH UHVXOW S \RXZRXOGJHWE\XVLQJWKHµXQLYDULDWH¶DSSURDFKWRUHSHDWHGPHDVXUHV DIWHU H[FOXGLQJ DOO SDWLHQWV ZKR KDYH RQH RU PRUH PLVVLQJ YDOXHV DQG DVVXPLQJ FRPSRXQGV\PPHWU\%HFDXVHLWZDVDOUHDG\FRQFOXGHGWKDWWKLVZRXOGQRWEHDQ DSSURSULDWHPHWKRGRIDQDO\VLV\RXPLJKWORRNDWWKH*UHHQKRXVH*HLVHUDGMXVWHG YDOXHV VHH 6HFWLRQ 7KH SYDOXH IRU WKH LQWHUDFWLRQ HIIHFW XVLQJ WKH *UHHQKRXVH*HLVHUDGMXVWPHQWLV 30 2873876$62XWSXWIRU([DPSOH±352&*/0 Repeated-Measures ANOVA Example 8.3: Disease Progression in Alzheimer’s Multivariate Approach Using GLM The GLM Procedure Repeated Measures Analysis of Variance
Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r| DF = 53
ADAS02
ADAS04
ADAS06
ADAS08
ADAS10
ADAS12
ADAS02
1.000000
0.892847 <.0001
0.715396 <.0001
0.743266 <.0001
0.734904 <.0001
0.703999 <.0001
ADAS04
0.892847 <.0001
1.000000
0.750344 <.0001
0.745911 <.0001
0.731188 <.0001
0.653269 <.0001
ADAS06
0.715396 <.0001
0.750344 <.0001
1.000000
0.852443 <.0001
0.690418 <.0001
0.690489 <.0001
ADAS08
0.743266 <.0001
0.745911 <.0001
0.852443 <.0001
1.000000
0.847485 <.0001
0.784974 <.0001
ADAS10
0.734904 <.0001
0.731188 <.0001
0.690418 <.0001
0.847485 <.0001
1.000000
0.900155 <.0001
ADAS12
0.703999 <.0001
0.653269 <.0001
0.690489 <.0001
0.784974 <.0001
0.900155 <.0001
1.000000
Sphericity Tests 27
Variables
DF
Mauchly’s Criterion
Transformed Variates Orthogonal Components
14 14
0.022696 0.2211441
Chi-Square
Pr > ChiSq
193.4424 77.106867
<.0001 <.0001
42
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH±352&*/0FRQWLQXHG Repeated-Measures ANOVA Example 8.3: Disease Progression in Alzheimer’s
Manova Test Criteria and Exact F Statistics for the Hypothesis of no MONTH Effect H = Type III SSCP Matrix for MONTH E = Error SSCP Matrix S=1 Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=1.5
N=23.5
Value
F Value
Num DF
Den DF
Pr > F
0.38811297 0.61188703 1.57656938 1.57656938
15.45 15.45 15.45 15.45
5 5 5 5
49 49 49 49
<.0001 <.0001 <.0001 <.0001
Manova Test Criteria and F Approximations for the Hypothesis of no MONTH*TREAT Effect H = Type III SSCP Matrix for MONTH*TREAT E = Error SSCP Matrix S=2 Statistic Wilks’ Lambda Pillai’s Trace Hotelling-Lawley Trace Roy’s Greatest Root
M=1
N=23.5
Value
F Value
Num DF
1.18 1.18 1.18 1.96
10 10 10 5
0.79719596 0.21044586 0.24481085 0.19587117
Den DF
98 100 70.804 50
Pr > F 0.3163 0.3160 0.3162 0.1012
NOTE: F Statistic for Roy’s Greatest Root is an upper bound. NOTE: F Statistic for Wilks’ Lambda is exact.
Univariate Tests of Hypotheses for Within Subject Effects Source MONTH MONTH*TREAT Error(MONTH)
DF
Type III SS
Mean Square
F Value
5 10 265
1529.760451 222.315729 4366.776533
305.952090 22.231573 16.478402
18.57 1.35
Source MONTH MONTH*TREAT Error(MONTH)
Pr > F <.0001 0.2044 29
Adj Pr > F G - G H - F <.0001 0.2337
<.0001 0.2271
30
28
&KDSWHU5HSHDWHG0HDVXUHV$129$
)URP WKLV DQDO\VLV DORQH \RX PLJKW EHJLQ WR EHOLHYH WKDW WKHUH LV QR VWDWLVWLFDO GLIIHUHQFHLQ$'SURJUHVVLRQDPRQJWKHWUHDWPHQWJURXSVHYHQZKHQ\RXFRQVLGHU WKH HQFRXUDJLQJ GHSLFWLRQ LQ )LJXUH +RZHYHU DV QRWHG HDUOLHU WKH µPXOWLYDULDWH¶ DSSURDFK FDQ EH FRQVHUYDWLYH HVSHFLDOO\ LQ WKH ZD\ LW KDQGOHV PLVVLQJYDOXHV$Q\SDWLHQWZLWKRQHRUPRUHPLVVLQJYDOXHVLVH[FOXGHGIURPWKH DQDO\VLV1RWLFHWKDWDPRQJWKHSDWLHQWVVWXGLHGSDWLHQWVKDYHDWRWDORI PLVVLQJ YDOXHV 7KXV WKH UHSHDWHG PHDVXUHV DQDO\VLV XVLQJ WKH µPXOWLYDULDWH¶ DSSURDFKLJQRUHV ± YDOLGGDWDSRLQWV 1RZDQDO\]HWKHGDWDE\XVLQJ352&0,;('ZKLFKPRUHHIILFLHQWO\KDQGOHV PLVVLQJYDOXHV7KH6$6FRGHDQGRXWSXWIROORZ$VLQSUHYLRXVH[DPSOHVRQO\ VHOHFWHGRXWSXWLVVKRZQE\XVLQJ2'6VHH6HFWLRQ
OUTPUT; OUTPUT; OUTPUT; OUTPUT; OUTPUT; OUTPUT;
ODS SELECT Mixed.RCorr Mixed.Dimensions Mixed.FitStatistics Mixed.Tests3;
PROC MIXED DATA = UNIALZ; CLASS TREAT MONTH PAT; MODEL ADASCOG = TREAT MONTH TREAT*MONTH; 31 REPEATED / TYPE=UN SUBJECT=PAT(TREAT) RCORR; 32 TITLE4 ’PROC MIXED Using Unstructured Covariance (UN)’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH±352&0,;(' Repeated-Measures ANOVA Example 8.3: Disease Progression in Alzheimer’s PROC MIXED Using Unstructured Covariance (UN) The Mixed Procedure Dimensions Covariance Parameters Columns in X Columns in Z Subjects Max Obs Per Subject Observations Used Observations Not Used Total Observations
21 28 0 80 6 454 26 480
Estimated R Correlation Matrix for PAT(TREAT) 2 H 41
Row 1 2 3 4 5 6
Col1
Col2
Col3
Col4
Col5
Col6
1.0000 0.9005 0.7703 0.8002 0.7763 0.7773
0.9005 1.0000 0.8179 0.7995 0.7521 0.7261
0.7703 0.8179 1.0000 0.8738 0.7223 0.7418
0.8002 0.7995 0.8738 1.0000 0.8628 0.8273
0.7763 0.7521 0.7223 0.8628 1.0000 0.9206
0.7773 0.7261 0.7418 0.8273 0.9206 1.0000
34
Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better)
2649.9 2691.9 2694.1 2741.9
Type 3 Tests of Fixed Effects
Effect TREAT MONTH TREAT*MONTH
Num DF
Den DF
F Value
Pr > F
2 5 10
77 77 77
0.94 21.55 2.08
0.3967 <.0001 0.0357
33
2XWSXW VKRZV D VLJQLILFDQW S 7UHDWPHQWE\0RQWK LQWHUDFWLRQ 33 XVLQJWKHXQVWUXFWXUHGFRYDULDQFH7KHVHFWLRQHQWLWOHG)LW6WDWLVWLFV 34 SURYLGHVDQ LQGLFDWLRQ RI UHODWLYH JRRGQHVV RI ILW VPDOOHU YDOXHV VXJJHVWLQJ D EHWWHU ILW
&KDSWHU5HSHDWHG0HDVXUHV$129$
6$6&RGHIRU([DPSOH352&0,;(' FRQWLQXHG … 35 REPEATED / TYPE=AR(1) SUBJECT=PAT(TREAT); TITLE4 'PROC MIXED Using First-Order Auto-Regressive Covariance (AR(1))';
… 36 REPEATED / TYPE=TOEP SUBJECT=PAT(TREAT); TITLE4 'PROC MIXED Using Toeplitz Covariance (TOEP)';
… 37 REPEATED / TYPE=ARMA(1,1) SUBJECT=PAT(TREAT); TITLE4 'PROC MIXED Using 1st-Order Auto-Regressive Moving Average Covariance (ARMA(1,1))';
… 38 REPEATED / TYPE=CS SUBJECT=PAT(TREAT); TITLE4 'PROC MIXED Using Compound Symmetric Covariance (CS)'; RUN;
2873876$62XWSXWIRU([DPSOH±352&0,;('FRQWLQXHG
Repeated-Measures ANOVA Example 8.3: Disease Progression in Alzheimer’s PROC MIXED Using First-Order Auto-Regressive Covariance (AR(1)) Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better)
2698.2 2702.2 2702.2 2706.9
Type 3 Tests of Fixed Effects 35
Effect TREAT MONTH TREAT*MONTH
Num DF
Den DF
F Value
Pr > F
2 5 10
77 359 359
0.94 11.21 1.43
0.3952 <.0001 0.1671
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH±352&0,;('FRQWLQXHG Repeated-Measures ANOVA Example 8.3: Disease Progression in Alzheimer’s PROC MIXED Using Toeplitz Covariance (TOEP) Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better)
2671.1 2683.1 2683.3 2697.4
Type 3 Tests of Fixed Effects
36
Effect TREAT MONTH TREAT*MONTH
Num DF
Den DF
F Value
Pr > F
2 5 10
77 359 359
0.91 19.65 2.06
0.4077 <.0001 0.0266
PROC MIXED Using 1st-Order Auto-Regressive-Moving Average Covariance (ARMA(1,1)) Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better)
2691.2 2697.2 2697.2 2704.3
Type 3 Tests of Fixed Effects 37
Num Effect
Den DF
TREAT MONTH TREAT*MONTH
2 5 10
DF
F Value
77 359 359
0.90 14.02 1.47
Pr > F 0.4116 <.0001 0.1478
PROC MIXED Using Compound Symmetric Covariance (CS) Fit Statistics -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better)
2743.5 2747.5 2747.5 2752.2
Type 3 Tests of Fixed Effects 38
Effect
DF
Num DF
TREAT MONTH TREAT*MONTH
2 5 10
77 359 359
Den F Value
Pr > F
0.88 26.69 2.30
0.4172 <.0001 0.0125
&KDSWHU5HSHDWHG0HDVXUHV$129$
$,& VWDQGV IRU ³$NDLNH¶V ,QIRUPDWLRQ &ULWHULRQ´ ZKLFK PHDVXUHV PRGHO ILW %DVHGRQ2XWSXWYDOXHVIRUWKH$,&DUHFRQVLVWHQWZLWKWKHRWKHUILWVWDWLVWLFV $,&& %,& ± 5HV /RJ /LNHOLKRRG 7RHSOLW] SURYLGHV WKH EHVW ILW RI WKH ILYH FRYDULDQFHVWUXFWXUHVFRQVLGHUHGEDVHGRQWKH$,&$,&&DQG%,&ILWSDUDPHWHUV VPDOOHVWYDOXHV 7KHXQVWUXFWXUHGFRYDULDQFH 34 LVDFORVHVHFRQGEDVHGRQWKHVH PHDVXUHVDQGLWLVWKHSUHIHUUHGPHWKRGEDVHGRQWKH±5HV/RJ/LNHOLKRRGILW VWDWLVWLF%RWKRIWKHVHFRYDULDQFHVWUXFWXUHVUHVXOWLQDVLJQLILFDQW7UHDWPHQWE\ 0RQWK LQWHUDFWLRQ &RPSRXQG V\PPHWU\ VKRZV WKH ZRUVW ILW DQG DV SUHYLRXVO\ GLVFRYHUHG 27 LVQRWDJRRGDVVXPSWLRQ 7$%/(6XPPDU\RI352&0,;('5HVXOWVIRU([DPSOHIRU 9DULRXV&RUUHODWLRQ6WUXFWXUHV &RYDULDQFH6WUXFWXUH
6$6 1DPH
$,&
±5HV/RJ /LNHOLKRRG
SYDOXHIRU 75($7 0217+
81
$XWRUHJUHVVLYH
$5
7RHSOLW]
72(3
$50$
&6
8QVWUXFWXUHG
$XWRUHJUHVVLYH0$ &RPSRXQG 6\PPHWULF
*(($QDO\VLV
7KHXVHRIJHQHUDOL]HGHVWLPDWLQJHTXDWLRQV*(( LVDQRWKHUPHWKRGIRUDQDO\]LQJ UHSHDWHGPHDVXUHVGDWDVHWVWKDWKDYHPLVVLQJYDOXHV352&*(102'ZKLFKLV XVHGIRU*((DQDO\VLVLQ6$6UHTXLUHVWKHVSHFLILFDWLRQRIDµZRUNLQJ¶FRUUHODWLRQ VWUXFWXUH VXFK DV FRPSRXQG V\PPHWULF &6 RU XQVWUXFWXUHG 81 ZKLFK UHSUHVHQWVWKHFRUUHODWLRQVDPRQJWKHUHSHDWHGPHDVXUHPHQWV7KH*((PRGHOLQJ PHWKRGRORJ\ZLOOXVXDOO\SURGXFHJRRGHVWLPDWHVRIWKHPRGHOSDUDPHWHUVHYHQLI WKH FRUUHODWLRQ VWUXFWXUH LV PLVVSHFLILHG ,Q WKLV VHQVH D *(( DQDO\VLV LV PRUH UREXVW WKDQ XVLQJ WKH JHQHUDOL]HG OLQHDU PRGHOLQJ PHWKRGV RI 352& */0 RU 352&0,;('
7KH 6$6 VWDWHPHQWV WKDW IROORZ WKLV VHFWLRQ LOOXVWUDWH KRZ WR FRQGXFW D *(( DQDO\VLVRIWKHGDWDLQ([DPSOHE\XVLQJWKHDXWRUHJUHVVLYH7<3( $5 ZRUNLQJ FRUUHODWLRQ PDWUL[ 39 7KH ',67 RSWLRQ LQ WKH 02'(/ VWDWHPHQW VSHFLILHVWKHDVVXPSWLRQRIQRUPDOGLVWULEXWLRQRIWKHGDWD',67 1250$/ ,Q WKLV FDVH WKH ',67 RSWLRQ LV QRW UHTXLUHG EHFDXVH WKH QRUPDO GLVWULEXWLRQ LV WKH GHIDXOW7KH7<3(RSWLRQLQWKH02'(/VWDWHPHQWUHTXHVWVWHVWVDQDORJRXVWR WKH352&*/06$67\SH,,,WHVWV IRUHDFKRIWKHHIIHFWVJLYHQLQWKH02'(/ VWDWHPHQW 7KLV DSSURDFK DOVR UHVXOWV LQ D VLJQLILFDQW 7UHDWPHQWE\0RQWK LQWHUDFWLRQS 40
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6$6&RGHIRU([DPSOH352&*(102' ODS SELECT Genmod.Type3; PROC GENMOD DATA = UNIALZ; CLASS TREAT MONTH PAT; MODEL ADASCOG = TREAT MONTH TREAT*MONTH / DIST = NORMAL TYPE3; 39 REPEATED SUBJECT = PAT / TYPE = AR(1); TITLE4 'GEE Analysis Using PROC GENMOD’; TITLE5 ‘Autoregressive Correlation (AR(1)) Working Correlation'; RUN;
2873876$62XWSXWIRU([DPSOH±352&*(102' Repeated-Measures ANOVA Example 8.3: Disease Progression in Alzheimer’s GEE Analysis Using PROC GENMOD Autoregressive Correlation (AR(1)) Working Correlation The GENMOD Procedure Score Statistics For Type 3 GEE Analysis ChiSource DF Square Pr > ChiSq TREAT MONTH TREAT*MONTH
2 5 10
2.02 46.14 21.76
0.3639 <.0001 0.0164
40
'HWDLOV 1RWHV
:LWK RQO\ RQH JURXS J LQ 7DEOH WKH UHSHDWHG PHDVXUHV OD\RXW LV LGHQWLFDO WR WKDW RI D UDQGRPL]HG EORFN GHVLJQ ,Q WKLV FDVH WKH DQDO\VLVLVFDUULHGRXWDVGHVFULEHGLQ&KDSWHUE\XVLQJWKHUHSHDWHGIDFWRUDV WKH*URXSHIIHFWDQG3DWLHQWDVWKHEORFNLQJIDFWRU
7KH PDQXDO FRPSXWLQJ PHWKRGV GHPRQVWUDWHG LQ ([DPSOH DSSO\RQO\WREDODQFHGGHVLJQVLHVDPHQXPEHURISDWLHQWVSHUJURXS 7KLV GHPRQVWUDWLRQLVXVHGWRVKRZZKLFKGHYLDWLRQVDUHUHSUHVHQWHGE\HDFKRIWKH HIIHFWVXPRIVTXDUHVDQGLVQRWUHFRPPHQGHGLQSUDFWLFH(YHQLIDVXLWDEOH FRPSXWHUSDFNDJHLVXQDYDLODEOHHDVLHUFRPSXWLQJIRUPXODVDUHDYDLODEOHIRU PDQXDOFRPSXWDWLRQVZLWKDEDODQFHGOD\RXWVHHHJ:LQHU
&KDSWHU5HSHDWHG0HDVXUHV$129$
)RU WKHEDODQFHGFDVHWKHUHSHDWHGPHDVXUHV$129$HUURUVXP RI VTXDUHV 66( LV VLPSO\ WKH VXP RI WKH 66( V IRXQG E\ XVLQJ WZRZD\ $129$ V ZLWK QR LQWHUDFWLRQ ZLWKLQ HDFK WUHDWPHQW JURXS ,Q ([DPSOH WKHWZRZD\$129$WDEOHVZLWKLQHDFK9DFFLQH*URXSDUHDVIROORZV $129$
$FWLYH*URXS
3ODFHER*URXS
6RXUFH
GI
66
06
)
GI
66
06
)
3$7,(17
9,6,7
(UURU
7RWDO
6LJQLILFDQWS
66( IRU WKH UHSHDWHG PHDVXUHV $129$ LV EDVHG RQ GHJUHHV RI IUHHGRP 6LPLODUO\ WKH 663DWLHQW FDQ EH DGGHG WR REWDLQ WKH 663DWLHQW9DFFLQH IRU WKH UHSHDWHG PHDVXUHV $129$ ZLWK GHJUHHV RI IUHHGRP 1RWLFH WKDW WKH VXPVRIVTXDUHVIRUWKH7LPHHIIHFW669LVLW DUHQRWDGGLWLYH
,Q WKH µXQLYDULDWH¶ DQDO\VLV RI ([DPSOH 3DWLHQWZLWKLQ 9DFFLQH LV FRQVLGHUHG D UDQGRP HIIHFW EHFDXVH SDWLHQWV DUH VHOHFWHG IURP D ODUJH SRSXODWLRQ RI HOLJLEOH SDWLHQWV DQG \RX ZDQW WR PDNH LQIHUHQFHV DERXW WKDWSRSXODWLRQQRWMXVWWKHVHOHFWHGSDWLHQWV7RLQGLFDWHWKLVWKH5$1'20 VWDWHPHQWLVXVHGIROORZLQJWKH02'(/VWDWHPHQWLQWKH6$6FRGH:KHQHYHU WKH5$1'20VWDWHPHQWLVLQFOXGHGLQ352&*/06$6SULQWVWKHIRUPRI WKHH[SHFWHGPHDQVTXDUHVDVIXQFWLRQVRIWKHYDULDQFHFRPSRQHQWVDWWULEXWHG WRHDFKVRXUFHRIYDULDWLRQ2XWSXWòVKRZVWKDWWKHH[SHFWHGPHDQVTXDUH IRUWKHIL[HG9DFFLQHHIIHFWLVFRPSRVHGRIDFRPSRQHQWGXHWRWKH9DFFLQH LH9$&*53 DQGWZRDGGLWLRQDOFRPSRQHQWVGXHWRWKHUDQGRPHIIHFWV 8QGHUWKHQXOOK\SRWKHVLVRIQR9DFFLQHHIIHFWWKHTXDGUDWLFIRUPLQYROYLQJ 9DFFLQHLH 49$&*539$&*53 9,6,7 ZRXOGEHUHGXFLQJLWWRWKHVDPH IRUPDVWKHH[SHFWHGPHDQVTXDUHIRU3DWLHQW9DFFLQH 7KHUHIRUHZKHQ+ LVWUXHWKHUDWLRRIWKHVHWZRPHDQVTXDUHVZRXOGEHZKLFKFRQILUPVWKDW WKLVLVWKHDSSURSULDWHWHVWIRU9DFFLQH
,QPRUHFRPSOH[UHSHDWHGPHDVXUHVGHVLJQVDQGWKRVHLQYROYLQJPLVVLQJGDWD WKH VWUXFWXUHV RI WKH H[SHFWHG PHDQ VTXDUHV DUH LPSRUWDQW LQ KHOSLQJ WR GHWHUPLQH KRZ WR FRQVWUXFW WKH )WHVWV $V DQ H[DFW WHVW PLJKW QRW H[LVW WKH DQDO\VW PXVW H[DPLQH WKH YDULDQFH VWUXFWXUHV WR GHWHUPLQH WKH UDWLR IRU WKH PRVWDSSURSULDWH)WHVWVDVGLVFXVVHGDERYHDQGWKHQXVHWKH7(67VWDWHPHQW WRFDUU\RXWWKHWHVW
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHµPXOWLYDULDWH¶DSSURDFKWRUHSHDWHGPHDVXUHVDQDO\VLVXVLQJ 6$6 PDNHV WKH ZLWKLQSDWLHQW WHVWV XVLQJ FULWHULD :LONV¶ /DPEGD 3LOODL¶V 7UDFH +RWHOOLQJ/DZOH\ 7UDFH DQG 5R\ V *UHDWHVW 5RRW (DFK RI WKHVH VWDWLVWLFV LV EDVHGRQPXOWLYDULDWHVWDWLVWLFDODQDO\VLVPHWKRGVWRREWDLQDQ) WHVWRIVLJQLILFDQFHIRUWKHJLYHQIDFWRUEDVHGRQFRPSOH[IXQFWLRQVRIVXPVRI VTXDUHV 7KH GLIIHUHQFHV DPRQJ WKHVH WHVWV GHSHQG RQ WKH DOWHUQDWLYH K\SRWKHVLV :LONV¶ /DPEGD IRU H[DPSOH PLJKW EH WKH PRVW SRZHUIXO WHVW WR XVH XQGHU VRPH DOWHUQDWLYHV DQG WKH OHDVW SRZHUIXO RI WKH IRXU WHVWV XQGHU RWKHUV5HVHDUFKWKDWFRPSDUHVWKHVHWHVWVVXJJHVWVWKDW3LOODL V7UDFHPLJKWEH WKH PRVW DSSURSULDWH RQH IRU XVH XQGHU D ZLGH UDQJH RI DSSOLFDWLRQV 7KH UHDGHU LV UHIHUUHG WR RWKHU UHIHUHQFHV IRU IXUWKHU GLVFXVVLRQ RI WKHVH FRPSDULVRQVHJ+DQG 7D\ORU ,QDQ\FDVHZKHQWKHUHDUHRQO\WZR JURXSVJ DOOIRXUWHVWV\LHOGWKHVDPHUHVXOWVDVVKRZQLQ([DPSOHV DQG 7KH µXQLYDULDWH¶ DSSURDFK WR UHSHDWHG PHDVXUHV $129$ FDQ VRPHWLPHV EH XVHG HYHQ XQGHU YLRODWLRQV WR WKH DVVXPSWLRQ RI FRPSRXQG V\PPHWU\E\DSSO\LQJDQDGMXVWPHQWWRWKHDQDO\VLV$FRUUHFWLRQIDFWRUFDOOHG WKH *UHHQKRXVH*HLVHU HSVLORQ FDQ EH FRPSXWHG DV D IXQFWLRQ RI WKH YDULDQFHV DQG FRYDULDQFHV RI WKH UHSHDWHG PHDVXUHPHQWV 7KH DGMXVWPHQW LV PDGHE\PXOWLSO\LQJHDFKRIWKHGHJUHHVRIIUHHGRPLQWKH$129$WDEOHWKDW LV DVVRFLDWHG ZLWK WKH 7LPH HIIHFW E\ WR REWDLQ WKH DGMXVWHG GHJUHHV RI IUHHGRP 7KH *UHHQKRXVH*HLVHU DGMXVWHG SYDOXHV DUH SULQWHG LQ2XWSXW ** IRUHDFKHIIHFWWKDWLQYROYHVWKH7LPHIDFWRUDVLOOXVWUDWHGLQ([DPSOH µ0XOWLYDULDWH¶$SSURDFK 14 :KHQWKHVSKHULFLW\WHVWLVKLJKO\VLJQLILFDQW WKH µXQLYDULDWH¶ DSSURDFK LV QRW UHFRPPHQGHG HYHQ ZLWK WKH DGMXVWPHQW GHVFULEHG :KHQ WKHUH LV D VLJQLILFDQW SURILOH GLIIHUHQFH DPRQJ JURXSV IXUWKHUDQDO\VHVDUHUHTXLUHGWRGHWHUPLQHDWZKDWWLPHSRLQWVWKHVHGLIIHUHQFHV RFFXU )RU WKH µPXOWLYDULDWH¶ DSSURDFK XVLQJ WKH 6800$5< RSWLRQ LQ WKH 5(3($7(' VWDWHPHQW LQ 352& */0 SURGXFHV $129$ V DW YDULRXV VHWV RI WLPH SRLQWV 6SHFLI\LQJ &2175$67 ZLOO JLYH WKH VDPH UHVXOWV DV LI DQ $129$ZHUHFRQGXFWHGRQWKHFKDQJHVLQWKHUHVSRQVHPHDVXUHIURPWKHILUVW WLPH SRLQWDVVKRZQLQWKH5(3($7('VWDWHPHQWIRU([DPSOH 17 7KLV FDQEHXVHIXOLIWKHILUVWWLPHSRLQWLVDEDVHOLQHYDOXH $OWHUQDWLYHO\ XVLQJ WKH 352),/( RSWLRQ LQ WKH 5(3($7(' VWDWHPHQW SURYLGHVWKHUHVXOWVDVLIDQ$129$ZHUHFRQGXFWHGRQWKHFKDQJHVEHWZHHQ VXFFHVVLYH WLPH SRLQWV 7KLV PLJKW EH XVHIXO IRU H[DPSOH LQ GHWHUPLQLQJ DW ZKDWWLPHSRLQWWKHUHVSRQVHVRIWZRWUHDWPHQWJURXSVILUVWEHJLQWRFRQYHUJH RUGLYHUJH7KLVLVLOOXVWUDWHGLQWKHµPXOWLYDULDWH¶DSSURDFKIRU([DPSOH 15 7KHILUVWDQDO\VLVVKRZVDQRQVLJQLILFDQW7UHDWPHQWHIIHFWS EDVHG RQ WKH FKDQJHV IURP 0RQWK WR 0RQWK WKH VHFRQG DQDO\VLV UHVXOWV LQ D PDUJLQDOO\ VLJQLILFDQW S WUHDWPHQW HIIHFW EDVHG RQ WKH FKDQJHV IURP0RQWKWR0RQWK
&KDSWHU5HSHDWHG0HDVXUHV$129$
2WKHU FRQWUDVWV FDQ DOVR EH XVHG ZLWK 6$6 VHH WKH 6$6 +HOS DQG 'RFXPHQWDWLRQ IRU GHWDLOV DERXW WKH 6800$5< RSWLRQ LQ WKH 5(3($7(' VWDWHPHQWLQ352&*/0 7KHHUURUYDULDWLRQIRUEHWZHHQJURXSFRPSDULVRQVDYHUDJHGRYHU WLPH LV HVWLPDWHG E\ WKH PHDQ VTXDUH IRU 3DWLHQWVZLWKLQ*URXS 063* ZLWK 1±J GHJUHHV RI IUHHGRP 7KH ZLWKLQ3DWLHQW HUURU YDULDWLRQ RYHU DOO JURXSV LVHVWLPDWHGE\06(ZLWK1±J W± GHJUHHVRIIUHHGRP7KHHUURU YDULDWLRQ ZLWKLQ WKH *URXSE\7LPH FHOOV LV WKH SRROHG FRPELQDWLRQ RI WKHVH WZRVRPHWLPHVGLVSDUDWHHUURUHVWLPDWHVQDPHO\
Ö
FHOO
6 6 3 * 6 6 ( 1 - J 1 - J W -
6 6 3 * 6 6 ( W × 1 - J
7KLV ZLWKLQFHOO HUURU YDULDWLRQ LV WKH ZLWKLQWLPH DPRQJSDWLHQW YDULDWLRQ DYHUDJHGRYHUDOOWLPHSRLQWV7KDWLVLI06( UHSUHVHQWVWKH06(IURPDRQH ZD\ $129$ FRQGXFWHG DW WLPH SHULRG M WKH ZLWKLQFHOO YDULDQFH RI WKH UHSHDWHGPHDVXUHVOD\RXWLVWKHDYHUDJHRIWKHVH06( V M
å 06( W
Ö
FHOO
=
=
M
M
W
7KLV ZLWKLQFHOO HUURU YDULDWLRQ FDQ EH XVHG WR FRPSDUH FHOO PHDQV HJ *URXS L YV *URXS M IRU D VSHFLILF WLPH SRLQW RU WR IRUP FRQILGHQFH LQWHUYDOVIRUWKHFHOOPHDQV 7KHUHSHDWHGPHDVXUHVDQDO\VLVLVQRWUHFRPPHQGHGZKHQWKHUH DUHPDQ\PLVVLQJYDOXHV$UHSHDWHGPHDVXUHVOD\RXWZLWKWZRWLPHSHULRGVLV VLPLODU WR WKH SDLUHGGLIIHUHQFH VLWXDWLRQ &KDSWHU 3DWLHQWV ZKR KDYH D PLVVLQJ YDOXH DW RQH RI WKH WZR WLPH SHULRGV PLJKW FRQWULEXWH LQIRUPDWLRQ UHJDUGLQJ EHWZHHQSDWLHQW YDULDELOLW\ EXW QRWKLQJ UHJDUGLQJ WKH SDWLHQWV¶ UHVSRQVH SURILOH ZLWKLQSDWLHQW YDULDELOLW\ 6LPLODU SUREOHPV WKRXJK QRW DV H[WUHPHRFFXUZKHQSDWLHQWVKDYHPLVVLQJGDWDLQOD\RXWVZLWKPRUHWKDQWZR WLPHSHULRGV :KHQWKHUHDUHPLVVLQJYDOXHVWKHµXQLYDULDWH¶DSSURDFKWRUHSHDWHGPHDVXUHV DQDO\VLV XVLQJ 6$6 FRQIRXQGV K\SRWKHVHV UHJDUGLQJ WUHDWPHQW HIIHFWV ZLWK RWKHU PRGHO HIIHFWV WKDW UHVXOW LQ FRPSOH[ K\SRWKHVHV DQG LQWHUSUHWDWLRQ GLIILFXOWLHV8QOHVVWKHDQDO\VWFDQFOHDUO\UHVWDWHWKHVWDWLVWLFDOK\SRWKHVLVLQ FOLQLFDOWHUPVWKLVDSSURDFKWRUHSHDWHGPHDVXUHVDQDO\VLVVKRXOGEHDYRLGHG ZKHQWKHGDWDVHWLVSODJXHGZLWKPLVVLQJGDWDMXVWDIHZPLVVLQJYDOXHVLQD ODUJHVWXG\VKRXOGQRWSUHVHQWLQWHUSUHWDWLRQGLIILFXOWLHV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHµPXOWLYDULDWH¶DSSURDFKXVLQJ6$6HOLPLQDWHVSDWLHQWVZKRKDYHPLVVLQJ YDOXHVIURPWKHDQDO\VLVWKHUHE\GHFUHDVLQJWKHWHVW¶VSRZHU0XFKRIWKHGDWD ZLOOEHH[FOXGHGIURPHYDOXDWLRQZKHQXVLQJWKLVDQDO\VLVPHWKRGLIWKHUHDUHD ODUJHQXPEHURISDWLHQWVHDFKKDYLQJMXVWRQHPLVVLQJYDOXH 352&0,;('LVFDSDEOHRIKDQGOLQJPLVVLQJYDOXHVDQGXQEDODQFHGFDVHVIRU UHSHDWHG PHDVXUHV DQDO\VLV EXW LW DVVXPHV YDOXHV DUH PLVVLQJ DW UDQGRP 0$5 *HQHUDOL]HGHVWLPDWLQJHTXDWLRQV*(( UHTXLUHWKHPRUHUHVWULFWLYH DVVXPSWLRQWKDWPLVVLQJYDOXHVDUHPLVVLQJFRPSOHWHO\DWUDQGRP0&$5 0$5DQG0&$5VXJJHVWWKDWWKHPLVVLQJGDWDUHSUHVHQWDUDQGRPVDPSOH IURP DOO GDWD WKDW ZRXOG KDYH EHHQ DYDLODEOH DQG WKHUHIRUH WKH UHVXOWVDUH QRW ELDVHG E\ WKHLU H[FOXVLRQ 8QIRUWXQDWHO\ WKHUH LV QR ZD\ WR WHVW WKH DVVXPSWLRQRI0$5RU0&$5 :LWKRQO\DERXWRIWKHGDWDPLVVLQJDQGQRDSSDUHQWSDWWHUQVUHODWHGWR WUHDWPHQW RU GLVFRQWLQXDWLRQ WKH PLVVLQJ YDOXH VWUXFWXUH RI ([DPSOH DSSHDUV FRQVLVWHQW ZLWK WKH 0$5 DVVXPSWLRQ +RZHYHU WKLV LV QRW XVXDOO\ WKH FDVH LQ FOLQLFDO VWXGLHV %HFDXVH WKHUH LV IUHTXHQWO\ PLVVLQJ GDWD LQ FOLQLFDOWULDOVIURPHDUO\GLVFRQWLQXDWLRQRISDWLHQWVGXHWRLQHIIHFWLYHQHVVRU VLGHHIIHFWVDVVXPSWLRQVRI0$5RU0&$5DUHRIWHQYLRODWHGLQVXFKGDWD VHWV (VWLPDWHVFDQVRPHWLPHVEHFRPSXWHGIRUPLVVLQJYDOXHVDQGWKHVHFDQWKHQ EHXVHGLQWKHDQDO\VLVDVLIWKH\ZHUHREVHUYHG7KHVLPSOHVWHVWLPDWHDQGRQH WKDWLVRIWHQXVHGLQFOLQLFDOWULDOVLVWKHSDWLHQW VREVHUYDWLRQIURPWKHSUHYLRXV WLPHSRLQW7KLVKDVEHHQUHIHUUHGWRDVWKHµODVWREVHUYDWLRQFDUULHGIRUZDUG¶ /2&) WHFKQLTXH EXW LW LV QRW XQLYHUVDOO\ HQGRUVHG /DFKLQ IRU H[DPSOH UHIHUV WR WKLV PHWKRG DV µFOHDUO\ ULGLFXORXV¶ LQ FOLQLFDO WULDOV WKDW LQYROYHGLVHDVHVWDWHVWKDWPLJKWSURJUHVVRUGHWHULRUDWHGXULQJWKHVWXG\2WKHU HVWLPDWHVRIPLVVLQJYDOXHVPLJKWEHEDVHGRQDYHUDJHVRIDGMDFHQWYDOXHVRU RQURZDQGFROXPQPHDQVRUEDVHGRQUHJUHVVLRQWHFKQLTXHVDFFRXQWLQJIRU H[SODQDWRU\FRYDULDWHV+RZHYHUVXFKHVWLPDWLRQWHFKQLTXHVDOVRUHO\KHDYLO\ RQWKH0$5DVVXPSWLRQWRDYRLGELDV ,IWKHUHSHDWHGPHDVXUHVPRGHODVVXPSWLRQVDUHQRWVDWLVILHGDQGLPSXWDWLRQ WHFKQLTXHVDUHQRWWHQDEOHLQGDWDVHWVSODJXHGZLWKPLVVLQJYDOXHVRQHPD\ GHHPWKHPRVWDSSURSULDWHDQDO\VLVWREHVHSDUDWH$129$¶VDWHDFKWLPHSRLQW *HQHUDOO\ LI ODUJH DPRXQWV RI PLVVLQJ GDWD DUH H[SHFWHG WKH UHSHDWHG PHDVXUHV$129$VKRXOGQRWHYHQEHFRQVLGHUHGDVDQDQDO\VLVWRRO+RZHYHU LIODUJHDPRXQWVRIPLVVLQJGDWDXQH[SHFWHGO\RFFXULQDVWXG\IRUZKLFK\RX SODQ WR XVH D UHSHDWHG PHDVXUHV DQDO\VLV \RX PLJKW FRQVLGHU UHYLVLQJ WKH K\SRWKHVLVDQGFRQGXFWLQJDQHZVWXG\6RPHDXWKRUVHJ*LOOLQJVDQG.RFK VXJJHVWWKDWWKHFUHGLELOLW\RIDQHQWLUHVWXG\PLJKWEHLQTXHVWLRQLILW XQH[SHFWHGO\ FRQWDLQV D ODUJH QXPEHU RI PLVVLQJ YDOXHV ,Q DQ\ FDVH WKH RFFXUUHQFH RI PLVVLQJ GDWD LQ D UHSHDWHG PHDVXUHV VHWWLQJ FDOOV IRU WKH DSSOLFDWLRQ RI DOWHUQDWLYH VWDWLVWLFDO PHWKRGV WR FRQILUP WKH UHVXOWV RI WKH PHWKRGVHOHFWHG
&KDSWHU5HSHDWHG0HDVXUHV$129$
:KHQ WKHUH DUH QR PLVVLQJ YDOXHV 352& 0,;(' XVLQJ WKH FRPSRXQG V\PPHWULF FRYDULDQFH VWUXFWXUH 7<3( &6 SURGXFHV WKH VDPH UHVXOWVDV352&*/0LQWKHµXQLYDULDWH¶DSSURDFK ,QWHUSUHWDWLRQ GLIILFXOWLHV XVLQJ WKH $129$ PHWKRG IRU UHSHDWHG PHDVXUHVGDWDLQWKHFDVHRILPEDODQFHRUPLVVLQJGDWDDUHPDJQLILHGZLWKWKH LQFOXVLRQ RI QXPHULF FRYDULDWHV DV H[SODQDWRU\ YDULDEOHV +RZHYHU *(( DQDO\VHVDUHZHOOVXLWHGIRUXVHZLWKFRYDULDWHVEXWRQO\ZLWKODUJHGDWDVHWV 7KH UHVXOWV RI *(( DQDO\VHV PLJKW QRW EH UHOLDEOH IRU VPDOOHU GDWD VHWV *HQHUDOO\ LI WKHUH DUH QR PRUH WKDQ RU H[SODQDWRU\ YDULDEOHV LQFOXGLQJ µ7UHDWPHQW*URXS¶ DPLQLPXPVDPSOHVL]HRIWRSDWLHQWVLVGHVLUDEOH (VWLPDWHV RI WKH PRGHO SDUDPHWHUV XVH D /LNHOLKRRG 5DWLR PHWKRG ZKHQ WKH 7<3(PRGHORSWLRQLVXVHGLQ352&*(102'DQGSYDOXHVDUHEDVHGRQD FKLVTXDUHWHVW7KLVPHWKRGFDQEHUHVRXUFHLQWHQVLYHIRUYHU\ODUJHGDWDVHWV LQ ZKLFK FDVH WKH :$/' RSWLRQ FDQ RIWHQ EH XVHG PRUH HIIHFWLYHO\ 7KH S YDOXHVEDVHGRQWKH:DOGWHVWDUHDOVRDV\PSWRWLFFKLVTXDUHWHVWV 7KHFRYDULDQFHVWUXFWXUHVSHFLILHGLQ352&0,;('ZLOOPRGHOWKH YDULDQFHDVVXPSWLRQVDWGLIIHUHQWWLPHSRLQWVDQGWKHSDWWHUQVRIFRUUHODWLRQV DPRQJWKHWLPHSRLQWVDFFRUGLQJWRKRZ\RXWKLQNPHDVXUHPHQWVDFURVVWLPH DUHUHODWHG7KHXQVWUXFWXUHGDSSURDFK7<3( 81 PDNHVQRDVVXPSWLRQDW DOO DERXW WKH UHODWLRQVKLS LQ WKH FRUUHODWLRQV DPRQJ YLVLWV 0HDVXUHPHQWV WDNHQ FORVHU WRJHWKHU LQ WLPH KRZHYHU RIWHQ WHQG WR EH PRUH KLJKO\ FRUUHODWHG WKDQ PHDVXUHPHQWV WDNHQ RYHU OHQJWK\ SHULRGV RI WLPH $V VHHQ SUHYLRXVO\ WKH FRPSRXQG V\PPHWULF VWUXFWXUH 7<3( &6 DVVXPHV WKH VDPH FRUUHODWLRQ VD\ EHWZHHQ HDFK SDLU RI WLPH SRLQWV ZLWKRXW UHJDUGWRKRZIDUDSDUWWKH\RFFXU ,Q WKH ILUVWRUGHU DXWRUHJUHVVLYH VWUXFWXUH 7<3( $5 PHDVXUHPHQWV WDNHQ DW DGMDFHQW WLPH SRLQWV HJ FRQVHFXWLYH YLVLWV KDYH WKH VDPH FRUUHODWLRQVXFKDV 7KHFRUUHODWLRQRI LVDVVLJQHGWRPHDVXUHPHQWVWKDW DUH YLVLWV DSDUW WR PHDVXUHPHQWV WKDW DUH YLVLWV DSDUW HWF 7KH DXWRUHJUHVVLYH PRYLQJ DYHUDJH 7<3( $50$ LV VLPLODU H[FHSW WKH HQWULHVWKDWLQYROYHSRZHUVRI DUHPXOWLSOLHGE\DFRQVWDQW 7KH 7RHSOLW]VWUXFWXUH7<3( 72(3 LVPRUHJHQHUDO,WDVVLJQVDFRUUHODWLRQRI WRPHDVXUHPHQWVWDNHQIURPFRQVHFXWLYHYLVLWVDGLIIHUHQWFRUUHODWLRQ WR PHDVXUHPHQWV WKDW DUH WDNHQ YLVLWV DSDUW WR PHDVXUHPHQWV WKDW DUH WDNHQ YLVLWV DSDUW HWF (DFK RI WKHVH H[FHSW WKH XQVWUXFWXUHG DVVXPHV HTXDOYDULDQFHVDWDOOPHDVXUHPHQWWLPHV
%\ XVLQJ WKH 5&255 RSWLRQ LQ WKH 5(3($7(' VWDWHPHQW LQ 352& 0,;('6$6ZLOOSURYLGHWKHHVWLPDWHGFRUUHODWLRQPDWUL[EDVHGRQWKHW\SH RI VWUXFWXUHDVVXPHG7KHHVWLPDWHGFRUUHODWLRQPDWUL[IRUWKHXQVWUXFWXUHG W\SHLVVKRZQLQ2XWSXWIRU([DPSOH352&0,;(' 41 1RWLFHWKDW WKH HVWLPDWHG FRUUHODWLRQV VKRZQ LQ 2XWSXW IRU ([DPSOH 352& */0 ³3DUWLDO &RUUHODWLRQ &RHIILFLHQWV IURP WKH (UURU 66&3 0DWUL[´ 42 DUHFRPSXWHGDIWHUH[FOXGLQJSDWLHQWVZKRKDYHGDWDPLVVLQJ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
1RWH)XUWKHUGHWDLOVDQGH[DPSOHVRI352&0,;('LQDQDO\]LQJUHSHDWHG PHDVXUHVGDWDDUHDYDLODEOHLQ³6$66\VWHPIRU0L[HG0RGHOV´E\/LWWHOO 0LOLNHQ6WURXSDQG:ROILQJHUZKLFKZDVZULWWHQXQGHUWKH6$6%RRNVE\ 8VHUVSURJUDP 7KH*((DSSURDFKGRHVQRWGHSHQGVRKHDYLO\RQWKHFRUUHFWVSHFLILFDWLRQRI WKH FRUUHODWLRQ VWUXFWXUH 7R LOOXVWUDWH WKLV UREXVWQHVV WKH VLJQLILFDQFH RI WKH LQWHUDFWLRQ WHUP RI ([DPSOH LV VHHQ WR EH FRQVLVWHQW IRU GLIIHUHQW FRUUHODWLRQV VWUXFWXUHV DV VKRZQ EHORZ IRU WKH H[FKDQJHDEOH RU FRPSRXQG V\PPHWULF &6 ZRUNLQJFRUUHODWLRQVWUXFWXUHDQGIRUDXVHUGHILQHGVWUXFWXUH 86(5 7KH XVHUGHILQHG W\SH LV SDWWHUQHG DIWHU WKH HVWLPDWHGFRUUHODWLRQV VKRZQ LQ WKH 6$6 RXWSXW IRU ([DPSOH 352& 0,;(' 41 ZKHUH WKH FRUUHODWLRQVDUHKLJKHJ IRUDGMDFHQWWLPHSRLQWVWKHQGHFUHDVHDQGOHYHO RIIWRIRUWLPHSRLQWVIDUWKHUDSDUW7KHLQWHUDFWLRQSYDOXHVRIIRU WKH H[FKDQJHDEOH DQG IRU WKH XVHUGHILQHG FRUUHODWLRQ VWUXFWXUHV DUH DERXWWKHVDPHDVWKDWREWDLQHGXVLQJWKHDXWRUHJUHVVLYHVWUXFWXUHS
REPEATED SUBJECT=PAT / TYPE=CS;
GEE Analysis Using PROC GENMOD Exchangeable (Compound Symmetric) Working Correlation (CS) Score Statistics For Type 3 GEE Analysis ChiSource DF Square Pr > ChiSq TREAT MONTH TREAT*MONTH
2 5 10
REPEATED SUBJECT=PAT / TYPE=USER(1.0 0.9 0.8 0.9 1.0 0.9 0.8 0.9 1.0 0.7 0.8 0.9 0.7 0.7 0.8 0.7 0.7 0.7
1.96 45.41 21.28
0.7 0.8 0.9 1.0 0.9 0.8
0.7 0.7 0.8 0.9 1.0 0.9
0.3745 <.0001 0.0192
0.7 0.7 0.7 0.8 0.9 1.0);
GEE Analysis Using PROC GENMOD User-Specified Working Correlation (USER()) Score Statistics For Type 3 GEE Analysis ChiSource DF Square Pr > ChiSq TREAT MONTH TREAT*MONTH
2 5 10
2.05 45.39 21.44
0.3582 <.0001 0.0182
&KDSWHU5HSHDWHG0HDVXUHV$129$
7KH WHUP µ7LPH¶ KDV EHHQ XVHG LQ WKLV FKDSWHU WR GHVFULEH WKH VWXG\YLVLWRUWLPHRIPHDVXUHPHQWRIWKHUHVSRQVHYDULDEOHGXULQJDWULDO7KLV LV SHUKDSV WKH PRVW FRPPRQ UHSHDWHG PHDVXUHV VLWXDWLRQ HVSHFLDOO\ IRU FRPSDUDWLYH GUXJ VWXGLHV ,Q JHQHUDO KRZHYHU WKH UHSHDWHG IDFWRU QHHG QRW UHIHU WR WLPH DW DOO 7KH UHSHDWHG IDFWRU FRXOG EH D VHW RI H[SHULPHQWDO FRQGLWLRQVHDFKDSSOLHGWRWKHVDPHVHWRISDWLHQWVLQUDQGRPRUGHUVXFKDV VXEMHFWLQJHDFKSDWLHQWWRHDFKGRVHOHYHORIDWHVWGUXJ7KHGRVHOHYHOVLQWKLV H[DPSOH UHSUHVHQW WKH UHSHDWHG PHDVXUH DQG WKH µSURILOHV¶ DUH WKH GRVH UHVSRQVHFXUYHV 7KH FURVVRYHU GHVLJQ &KDSWHU LV DQ H[DPSOHRIWKLVDQGFDQEHDQDO\]HG ZLWKWKHUHSHDWHGPHDVXUHV$129$,QWKHSRSXODUWZRSHULRGFURVVRYHUWKH WUHDWPHQW$RU% LVWKHUHSHDWHGIDFWRUEHFDXVHERWKWUHDWPHQWV$DQG%DUH JLYHQ WR HDFK SDWLHQW 7KH VHTXHQFH JURXS UHSUHVHQWLQJ WKH RUGHU LQ ZKLFK SDWLHQWV UHFHLYH WKH WUHDWPHQWV $% RU %$ ZRXOG UHSUHVHQW WKH µ*5283¶ HIIHFW 3DWLHQWV DUH QHVWHG ZLWKLQ VHTXHQFH JURXS DQG WKH DQDO\VLV ZRXOG EH FDUULHG RXW DV LQ ([DPSOH $ VLJQLILFDQW 7UHDWPHQWE\6HTXHQFH JURXS LQWHUDFWLRQ ZRXOG VXJJHVW D µSHULRG HIIHFW¶ LQ ZKLFKFDVHWKHDQDO\VWZRXOG SHUIRUPVHSDUDWHDQDO\VHVIRUHDFKWUHDWPHQWSHULRGE\XVLQJWKHWZRVDPSOH WWHVW 7KLV FKDSWHU IRFXVHV RQ UHVSRQVH PHDVXUHPHQWV DVVXPHG WR EH QRUPDOO\ GLVWULEXWHG 1RQQRUPDO GDWD DUH DOVR IUHTXHQWO\ FROOHFWHG LQ D UHSHDWHG PHDVXUHV OD\RXW 6LPSOH H[DPSOHV LQFOXGH VWXGLHV LQ ZKLFK WKH UHVSRQVH PHDVXUH LV ZKHWKHU WKH SDWLHQW LV LPSURYLQJ ELQDU\ UHVSRQVH RU VKRZLQJ D FDWHJRULFDO GHJUHH RI LPSURYHPHQW HJ µQRQH¶ µVRPH¶ µFRPSOHWH¶ DWHDFKYLVLWGXULQJWKHWULDO $QDO\VHV RI QRQQRUPDO GDWD FDQ RIWHQ EH KDQGOHG E\ XVLQJ FDWHJRULFDO WHFKQLTXHV:KHQWKHDQDO\VLVLQFOXGHVDGGLWLRQDOIDFWRUVRUFRYDULDWHVPRUH VRSKLVWLFDWHG PRGHOLQJ XVLQJ:HLJKWHG/HDVW6TXDUHV:/6 RU*HQHUDOL]HG (VWLPDWLQJ(TXDWLRQV*(( FDQEHHPSOR\HG7KHVHWHFKQLTXHVPDNHXVHRI WKH 6$6 SURFHGXUHV &$702' DQG *(102' ZKLFK FDQ KDQGOH YHU\ FRPSOH[GHVLJQV 1RWH6HYHUDOGLIIHUHQWDSSURDFKHVWRWKHDQDO\VLVRIUHSHDWHGPHDVXUHVIRU FDWHJRULFDOGDWDLQFOXGLQJ:/6DQG*((PHWKRGRORJ\DUHLOOXVWUDWHGLQ WKHH[FHOOHQWUHIHUHQFHHQWLWOHG³&DWHJRULFDO'DWD$QDO\VLV8VLQJWKH6$6 6\VWHP´6HFRQG(GLWLRQE\6WRNHV'DYLVDQG.RFKZKLFKZDVZULWWHQ XQGHUWKH6$6%RRNVE\8VHUVSURJUDP
6LPSOHXVDJHRIWKH2'6VWDWHPHQWWRFXVWRPL]HWKH6$6RXWSXW LVGHPRQVWUDWHGLQ([DPSOHVDQG(DFKVHFWLRQRIWKH6$6RXWSXW KDVDQDVVRFLDWHGODEHODQGSDWKQDPH2'6FDQEHXVHGWRVHOHFWRUH[FOXGH VSHFLILFVHFWLRQVRIWKHRXWSXWVLPSO\E\LGHQWLI\LQJLWVQDPH&RPSOHWHQDPHV FDQEHGHWHUPLQHGIURPWKH6$6ORJE\LQFOXGLQJWKHVWDWHPHQW2'675$&( 21DVDVWDWHPHQWZLWKLQWKH6$6SURJUDP
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
([DPSOH 0XOWLYDULDWH $SSURDFK DQG ([DPSOH WHOO 6$6 WR RPLW WKH SULQWLQJ RI WKH FRUUHODWLRQ DQG FRYDULDQFH PDWULFHV QDPHG 3DUWLDO&RUU DQG (UURU66&3 E\ XVLQJ DQ 2'6 (;&/8'( VWDWHPHQW $OWHUQDWLYHO\ \RX FDQ XVHWKH2'66(/(&7VWDWHPHQWWRLQFOXGHRQO\VSHFLILFSDUWVRIWKHRXWSXWDV LOOXVWUDWHGLQ([DPSOH2EYLRXVO\WKHVHDUHYHU\VLPSOHH[DPSOHVRIXVLQJ WKH SRZHUIXO 2'6 IHDWXUHV LQ 6$6 $GGLWLRQDO XVHV RI WKH 2'6 (;&/8'( DQG 2'6 ,1&/8'( VWDWHPHQWV DUH XVHG LQ RWKHU H[DPSOHV WKURXJKRXW WKLV ERRN
CHHAAPPTTEERR 9 The Crossover Design 9.1 9.2 9.3 9.4
Introduction...................................................................................................... 157 Synopsis ............................................................................................................ 158 Examples........................................................................................................... 160 Details & Notes................................................................................................. 171
9.1
Introduction The crossover design is used to compare the mean responses of two or more treatments when each patient receives each treatment over successive time periods. Typically, patients are randomized to treatment sequence groups which determine the order of treatment administration. While responses among treatments in a parallel study are considered independent, responses among treatments in a crossover study are correlated because they are measured on the same patient, much like the repeated measures setup discussed in Chapter 8. In fact, the crossover is a special case of a repeated measures design. Procedures for treatment comparisons must consider this correlation when analyzing data from a crossover design. In addition to treatment comparisons, the crossover analysis can also investigate period effects, sequence effects, and carryover effects. Because the crossover study features within-patient control among treatment groups, fewer patients are generally required than with a parallel study. This advantage is often offset, however, by an increase in the length of the study. For example, when there are two treatments, the study will be at least twice as long as its parallel-design counterpart because treatments are studied in each of two successive periods. The addition of a wash-out period between treatments, which is usually required in a crossover study, tends to further lengthen the trial duration. The crossover design is normally used in studies with short treatment periods, most frequently in pre-clinical and early-phase clinical studies, such as bioavailability, dose-ranging, bio-equivalence, and pharmacokinetic trials. If it’s thought that the patient’s condition will not be the same at the beginning of each treatment period, the crossover design should be avoided. Lengthier treatment periods provide
158
Common Statistical Methods for Clinical Research with SAS Examples
increased opportunities for changing clinical conditions and premature termination. For the same reasons, crossover designs should limit the number of study periods as much as possible. The focus in this chapter is on the 2-period crossover design, although an example that uses 4 periods is also presented. Responses are assumed to be normally distributed.
9.2
Synopsis For the two-way crossover design, each patient receives two treatments, A and B, in one of two sequences, that is, A followed by B (A-B) or B followed by A (B-A). Samples of n1 and n2 patients are randomly assigned to each of the two sequence groups, A-B and B-A, respectively (usually, n1 = n2). Response measurements (yijk) are taken for each patient following each treatment for a total th of N = 2n1 + 2n2 measurements, where yijk is the measurement for the k patient in the sequence group for which Treatment i (i = 1 for Treatment A, i = 2 for Treatment B) is given in Period j (j = 1, 2). A sample layout is shown in Table 9.1. TABLE 9.1. Crossover Layout Sequence
Patient Number
Period 1
Period 2
(Treatment A)
(Treatment B)
A-B
1 3 6 7 . . .
y111 y112 y113 y114 . . .
y221 y222 y223 y224 . . .
B-A
2 4 5 8 . . .
(Treatment B) y211 y212 y213 y214 . . .
(Treatment A) y121 y122 y123 y124 . . .
The data are analyzed using analysis of variance methods with the following factors: Treatment Group, Period, Sequence, and Patient-within-Sequence. Computations proceed as shown in Chapters 6, 7, and 8 to obtain the ANOVA summary shown in Table 9.2.
Chapter 9: The Crossover Design 159
TABLE 9.2. ANOVA Summary Table for a 2-Period Crossover Source
df
SS
MS
F
Treatment (T)
1
SST
MST
FT = MST / MSE
Period (P)
1
SSP
MSP
FP = MSP / MSE
Sequence (S)
1
SSS
MSS
FS = MSS / MSP(S)
Patient-withinSeqence (P(S))
n1+n2–2
SSP(S)
MSP(S)
Error
n1+n2–2
SSE
MSE
Total
N–1
TOT(SS)
The Treatment sum of squares, SST, is proportional to the sum of squared deviations of each Treatment mean from the overall mean. Likewise, SSP is computed from the sum of squared deviations of each Period mean from the overall mean, etc. The mean squares (MS) are found by dividing the sums of squares (SS) by their respective degrees of freedom (df). Computations are shown in Example 9.1. The hypothesis of equal treatment means is tested at a two-tailed significance level, α, by comparing the F-test statistic, FT, with the critical F-value based on 1 upper 1 and n1+n2–2 lower degrees of freedom, denoted by Fn1 + n 2 − 2 (α). The test summary for no treatment effect is
null hypothesis: alt. hypothesis:
H0: µA = µB HA: mA =/ mB
test statistic:
MST FT = ____ MSE
decision rule:
1 reject H0 if FT > Fn1 + n 2 − 2 (α)
The Period and Sequence effects can also be tested in a similar manner. As in the repeated measures ANOVA (Chapter 8), the test for Sequence effect uses the mean square for Patient-within-Sequence in the denominator because patients are nested within sequence groups. When the hypothesis of no Sequence Group effect is true, variation between sequence groups simply reflects patient-to-patient variation.
160
Common Statistical Methods for Clinical Research with SAS Examples
9.3
Examples
p Example 9.1 -- Diaphoresis Following Cardiac Medication
A preliminary study was conducted on 16 normal subjects to examine the time to perspiration following administration of a single dose of a new cardiac medication with known diaphoretic effects. Subjects were asked to walk on a treadmill at the clinic following dosing with the new medication (A), and again on a separate clinic visit following placebo (B). A two-period crossover design was used with 8 subjects randomized to each of the two sequence groups. Time in minutes until the appearance of perspiration beads on the forehead was recorded as shown in Table 9.3. Is there any difference in perspiration times? TABLE 9.3. Raw Data for Example 9.1 Sequence A-B
B-A
Subject Number 1 3 5 6 9 10 13 15
2 4 7 8 11 12 14 16
Period 1
Period 2
(A)
(B)
6 8 12 7 9 6 11 8
4 7 6 8 10 4 6 8
(B)
(A)
7 6 11 7 8 4 9 13
5 9 7 4 9 5 8 9
Solution Using the two-way crossover ANOVA, you want to compare mean perspiration times between treatment groups, controlling for treatment period and sequence (order) of administration. For a balanced design, the sum of squares for each of these sources can be computed in the same manner as demonstrated previously (Chapters 6, 7, 8). First, obtain the marginal means, as shown in Table 9.4.
Chapter 9: The Crossover Design 161
TABLE 9.4. Summary of Marginal Means for Example 9.1 Sequence
Subject Number
Subject Means
Period 1
Period 2
(A)
(B)
6 8 12 7 9 6 11 8
4 7 6 8 10 4 6 8
5.0 7.5 9.0 7.5 9.5 5.0 8.5 8.0
8.375
6.625
7.5000
(B) 5 9 7 4 9 5 8 9
(A) 7 6 11 7 8 4 9 13
6.0 7.5 9.0 5.5 8.5 4.5 8.5 11.0
Sequence Means
7.000
8.125
7.5625
Overall Means
7.6875
7.3750
7.53125
A-B
1 3 5 6 9 10 13 15
Sequence Means B-A
2 4 7 8 11 12 14 16
Treatment A Mean: 8.2500
Treatment B Mean: 6.8125
Now, you can compute the sum of squares as follows: 2
SS(Treatment)
= (16 · (8.2500 – 7.53125) ) + 2 (16 · (6.8125 – 7.53125) ) = 16.53125
SS(Period)
= (16 · (7.6875 – 7.53125) ) + 2 (16 · (7.3750 – 7.53125) ) = 0.78125
SS(Sequence)
= (16 · (7.5000 – 7.53125) ) + 2 (16 · (7.5625 – 7.53125) ) = 0.03125
TOTAL(SS)
= ((6 – 7.53125) + 2 (8 – 7.53125) + … + 2 (13 – 7.53125) ) = 167.96875
2
2
2
162
Common Statistical Methods for Clinical Research with SAS Examples
SS(Patients-within-Sequence) 2
2
2
= ((5.0 – 7.5000) + (7.5 – 7.5000) + … + ( 8.0 – 7.5000) + 2 2 2 (6.0 – 7.5625) + (7.5 – 7.5625) + … + (11.0 – 7.5625) ) = 103.4375 Because this is a balanced design, the error sum of squares can be found by subtraction, SS(Error)
= TOTAL(SS) – (SS(TRT) + SS(PD) + SS(SEQ) + SS(PAT(SEQ))) = 167.96875 – (16.53125 + 0.78125 + 0.03125 + 103.4375) = 47.1875
Completing the ANOVA table, you obtain Table 9.5. ANOVA Summary for Example 9.1 Source
df
SS
MS
F
Treatment (T)
1
16.53125
16.53125
4.90
Period (P)
1
0.78125
0.78125
0.23
Sequence (S)
1
0.03125
0.03125
0.004
Patient-within Sequence (P(S))
14
103.4375
7.38839
Error
14
47.1875
3.37054
Total
31
167.96875
The test summary for Treatment effect becomes null hypothesis:
H0: mA = mB
alt. hypothesis:
HA: mA =/ mB
test statistic:
FT = 4.90
decision rule:
reject H0 if FT > F14 (0.05) = 4.60
conclusion:
Because 4.90 > 4.60, you reject H0 and conclude that the new medication and placebo have different diaphoretic profiles, based on a significance level of a = 0.05.
1
Chapter 9: The Crossover Design 163
SAS Analysis of Example 9.1 The SAS code for analyzing this data set and resulting output are shown below. All four sources of variation (Sequence Group, Treatment, Period, and Patient) must appear in the CLASS statement in PROC GLM. Note that the MODEL statement in PROC GLM is very similar to that of the repeated measures ANOVA (Chapter 8) . The Sequence Group (SEQ) represents the two randomized groups, and patients are nested within Sequence Group. Therefore, a TEST statement is needed to conduct the appropriate F-test for the Sequence Group effect using the PAT(SEQ) as the error term . The SS3 option in the MODEL statement requests printing the SAS Type III results only. Because this data set is balanced, the Type I and III results would be identical. The ANOVA results shown in Output 9.1 agree with the manual calculations. The Treatment effect has a significant p-value of 0.0439 , which
indicates a departure from the null hypothesis of equal treatment means. Neither the Sequence effect (which tests for differences in the order of administration) nor the Period effect (which tests for differences between periods) are significant (p=0.9491 and 0.6376, respectively).
SAS Code for Example 9.1 DATA XOVER; INPUT PAT SEQ DATALINES; 1 AB A 1 6 3 9 AB A 1 9 10 1 AB B 2 4 3 9 AB B 2 10 10 2 BA A 2 7 4 11 BA A 2 8 12 2 BA B 1 5 4 11 BA B 1 9 12 ;
$ TRT $ PD Y @@; AB AB AB AB BA BA BA BA
A A B B A A B B
1 1 2 2 2 2 1 1
8 6 7 4 6 4 9 5
5 13 5 13 7 14 7 14
AB AB AB AB BA BA BA BA
A A B B A A B B
1 12 1 11 2 6 2 6 2 11 2 9 1 7 1 8
6 15 6 15 8 16 8 16
AB AB AB AB BA BA BA BA
A A B B A A B B
1 7 1 8 2 8 2 8 2 7 2 13 1 4 1 9
PROC GLM DATA = XOVER; CLASS SEQ TRT PD PAT; MODEL Y = SEQ PAT(SEQ) TRT PD / SS3; TEST H=SEQ E=PAT(SEQ); TITLE1 'Crossover Design'; TITLE2 ‘Example 9.1: Diaphoresis Following Cardiac Medication’; RUN;
164
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT 9.1. SAS Output for Example 9.1 Crossover Design Example 9.1: Diaphoresis Following Cardiac Medication The GLM Procedure Class Level Information Class
Levels
SEQ TRT PD PAT
Values
2 2 2 16
AB BA A B 1 2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Number of observations
32
Dependent Variable: Y Source
DF
Sum of Squares
Mean Square
F Value
Pr > F
Model
17
120.7812500
7.1047794
2.11
0.0824
Error
14
47.1875000
3.3705357
Corrected Total
31
167.9687500
Source SEQ PAT(SEQ) TRT PD
R-Square
Coeff Var
Root MSE
Y Mean
0.719070
24.37712
1.835902
7.531250
DF 1 14 1 1
Type III SS 0.0312500 103.4375000 16.5312500 0.7812500
Mean Square
F Value
Pr > F
0.0312500 7.3883929 16.5312500 0.7812500
0.01 2.19 4.90 0.23
0.9247 0.0771 0.0439 0.6376
Tests of Hypotheses Using the Type III MS for PAT(SEQ) as an Error Term Source SEQ
DF
Type III SS
Mean Square
F Value
Pr > F
1
0.03125000
0.03125000
0.00
0.9491
The next example illustrates the analysis of a 4-period crossover design, including a test for carryover effect using SAS.
Chapter 9: The Crossover Design 165
p Example 9.2 -- Antibiotic Blood Levels Following Aerosol Inhalation
Normal subjects were enrolled in a pilot study to compare blood levels among 4 doses (A, B, C, and D) of an antibiotic using a new aerosol delivery formulation. The response measure is area-under-the-curve (AUC) up to 6 hours after dosing. A 4-period crossover design was used with a 3-day washout period between doses. Three subjects were randomized to each of four sequence groups, as shown in Table 9.6 along with the data (given as logAUC’s). Are there any differences in blood levels among the 4 doses? TABLE 9.6. Raw Data for Example 9.2 Dosing Sequence
Subject Number
Period 1
Period 2
Period 3
Period 4
A-B-D-C
102
2.31
3.99
11.75
4.78
106
3.95
2.07
7.00
4.20
109
4.40
6.40
9.76
6.12
104
6.81
8.38
1.26
10.56
105
9.05
6.85
4.79
4.86
111
7.02
5.70
3.14
7.65
101
6.00
4.79
2.35
3.81
108
5.25
10.42
5.68
4.48
112
2.60
6.97
3.60
7.54
103
8.15
3.58
8.79
4.94
107
12.73
5.31
4.67
5.84
110
6.46
2.42
4.58
1.37
B-C-A-D
C-D-B-A
D-A-C-B
Solution Treatment, Period, and Sequence effects are computed in the same way as presented for Example 9.1. Going directly to the SAS analysis of these data, a test for any treatment carryover effects is included. The carryover effect is somewhat entangled with the treatment effect. To separate them, you need to compute both the adjusted and unadjusted treatment effects and include a carry-over effect in the ANOVA model. In SAS, this can be done by obtaining both the Type I and Type III results from PROC GLM, as shown in the SAS analysis that follows.
166
Common Statistical Methods for Clinical Research with SAS Examples
SAS Analysis of Example 9.2 The data are input in data lines according to Sequence Group (SEQGRP) and Patient (PAT), as shown in the INHAL1 data set in the SAS code for Example 9.2. The Dose Group (DOSE) and Period (PER) variables are added in the data set INHAL2 using the OUTPUT statement in the DATA step. The final data set for analysis, INHAL, is created to include a character-valued variable for the carryover effect, CO , whose values represent the dose group from the previous period. A value of 0 is assigned to CO for the first period. Three additional carryover effects are included, one for each pairwise difference. Select Dose D as the ‘reference’ dose and define COA, COB, and COC as the difference in carryover effect between Doses A and D, B and D, and C and D, respectively. Thus, COA will take the value of +1 if Dose A was given in the previous period, –1 if Dose D was given in the previous period, and 0 otherwise. Similarly for COB and COC. The printout of the data set INHAL shows the values of each of these carryover effects . PROC GLM is used to analyze the data. Initially, the overall carryover effect, CO, is included in the crossover analysis (this is referred to as “Model 1”). The MODEL statement includes effects for Sequence Group (SEQGRP), Dose Group (DOSE), Period (PER), Patient-within-Sequence Group (PAT(SEQGRP)), and the overall carryover (CO). The carryover effect has a sum of squares of 30.936 and a p-value of 0.0687 , which deserves further analysis because of its marginal significance. You can run a second GLM model that includes the individual dose carryover effects (relative to Dose D) by including COA, COB, and COC in the MODEL statement (this is referred to as “Model 2”) 11 . This second GLM model separates the overall carryover effect sum of squares into the components due to each dose comparison. Note that the Type I sums of squares for COA, COB, and COC in Model 2 12 add up to the sum of squares for CO in Model 1. The largest component of carryover is clearly due to Dose B, which has a significant p-value of 0.0267 13 . The Dose Group effect adjusted for carryover is highly significant in both Models 1 14 and 2 15 (p=0.0001). No Sequence Group (p=0.6526) 16 or Period effects (p=0.7947) 17 are evident. SAS Code for Example 9.2 DATA INHAL1; INPUT SEQGRP $ PAT AUC1 AUC2 AUC3 AUC4 DATALINES; ABDC 102 2.31 3.99 11.75 4.78 ABDC 106 3.95 2.07 7.00 4.20 ABDC 109 4.40 6.40 9.76 6.12 BCAD 104 6.81 8.38 1.26 10.56 BCAD 105 9.05 6.85 4.79 4.86 BCAD 111 7.02 5.70 3.14 7.65 CDBA 101 6.00 4.79 2.35 3.81
@@;
Chapter 9: The Crossover Design 167
CDBA CDBA DACB DACB DACB ;
108 5.25 10.42 112 2.60 6.97 103 8.15 3.58 107 12.73 5.31 110 6.46 2.42
5.68 3.60 8.79 4.67 4.58
4.48 7.54 4.94 5.84 1.37
DATA INHAL2; SET INHAL1; DOSE = SUBSTR(SEQGRP,1,1); DOSE = SUBSTR(SEQGRP,2,1); DOSE = SUBSTR(SEQGRP,3,1); DOSE = SUBSTR(SEQGRP,4,1); RUN;
PER PER PER PER
= = = =
1; 2; 3; 4;
AUC AUC AUC AUC
= = = =
AUC1; AUC2; AUC3; AUC4;
PROC SORT DATA = INHAL2; BY PAT PER; DATA INHAL; SET INHAL2; KEEP PAT SEQGRP DOSE PER AUC CO COA COB COC; CO = LAG(DOSE); IF PER = 1 THEN CO = '0'; COA = 0; COB = 0; COC = 0; IF CO = 'A' THEN COA = 1; IF CO = 'B' THEN COB = 1; IF CO = 'C' THEN COC = 1; IF CO = 'D' THEN DO; COA = -1; COB = -1; COC = -1; END; RUN;
OUTPUT; OUTPUT; OUTPUT; OUTPUT;
PROC SORT DATA = INHAL; BY SEQGRP PER PAT; PROC PRINT DATA = INHAL; TITLE1 'CrossOver Design'; TITLE2 'Example 9.2: Antibiotic Blood Levels Following Aerosol Inhalation' ; RUN; /* Model 1 */ PROC GLM DATA = INHAL; CLASS SEQGRP DOSE PER PAT CO; MODEL AUC = SEQGRP PAT(SEQGRP) DOSE PER CO; TEST H=SEQGRP E=PAT(SEQGRP); TITLE3 ‘Model 1’; RUN;
/* Model 2 */ PROC GLM DATA = INHAL; CLASS SEQGRP DOSE PER PAT; MODEL AUC = SEQGRP PAT(SEQGRP) DOSE PER COA COB COC; TEST H=SEQGRP E=PAT(SEQGRP); TITLE3 ‘Model 2’; RUN;
11
168
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT 9.2. SAS Output for Example 9.2 CrossOver Design Example 9.2: Antibiotic Blood Levels Following Aerosol Inhalation Obs
PAT
SEQGRP
DOSE
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
102 106 109 102 106 109 102 106 109 102 106 109 104 105 111 104 105 111 104 105 111 104 105 111 101 108 112 101 108 112 101 108 112 101 108 112 103 107 110 103 107 110 103 107 110 103 107 110
ABDC ABDC ABDC ABDC ABDC ABDC ABDC ABDC ABDC ABDC ABDC ABDC BCAD BCAD BCAD BCAD BCAD BCAD BCAD BCAD BCAD BCAD BCAD BCAD CDBA CDBA CDBA CDBA CDBA CDBA CDBA CDBA CDBA CDBA CDBA CDBA DACB DACB DACB DACB DACB DACB DACB DACB DACB DACB DACB DACB
A A A B B B D D D C C C B B B C C C A A A D D D C C C D D D B B B A A A D D D A A A C C C B B B
PER 1 1 1 2 2 2 3 3 3 4 4 4 1 1 1 2 2 2 3 3 3 4 4 4 1 1 1 2 2 2 3 3 3 4 4 4 1 1 1 2 2 2 3 3 3 4 4 4
AUC 2.31 3.95 4.40 3.99 2.07 6.40 11.75 7.00 9.76 4.78 4.20 6.12 6.81 9.05 7.02 8.38 6.85 5.70 1.26 4.79 3.14 10.56 4.86 7.65 6.00 5.25 2.60 4.79 10.42 6.97 2.35 5.68 3.60 3.81 4.48 7.54 8.15 12.73 6.46 3.58 5.31 2.42 8.79 4.67 4.58 4.94 5.84 1.37
CO
COA
COB
0 0 0 A A A B B B D D D 0 0 0 B B B C C C A A A 0 0 0 C C C D D D B B B 0 0 0 D D D A A A C C C
0 0 0 1 1 1 0 0 0 -1 -1 -1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 -1 -1 -1 0 0 0 0 0 0 -1 -1 -1 1 1 1 0 0 0
0 0 0 0 0 0 1 1 1 -1 -1 -1 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1 -1 1 1 1 0 0 0 -1 -1 -1 0 0 0 0 0 0
COC 0 0 0 0 0 0 0 0 0 -1 -1 -1 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 1 -1 -1 -1 0 0 0 0 0 0 -1 -1 -1 0 0 0 1 1 1
Chapter 9: The Crossover Design 169
OUTPUT 9.2. SAS Output for Example 9.2 (continued) CrossOver Design Example 9.2: Antibiotic Blood Levels Following Aerosol Inhalation Model 1 The GLM Procedure
Class
Class Level Information Values
Levels
SEQGRP
4
ABDC BCAD CDBA DACB
DOSE
4
A B C D
PER
4
1 2 3 4
PAT
12
CO
5
101 102 103 104 105 106 107 108 109 110 111 112 0 A B C D Number of observations
48
Dependent Variable: AUC Source
DF
Sum of Squares
Model Error Corrected Total
20 27 47
225.1866862 104.8902450 330.0769312
Mean Square
F Value
11.2593343 3.8848239
2.90
Pr > F 0.0053
R-Square
Coeff Var
Root MSE
AUC Mean
0.682225
34.38658
1.970996
5.731875
DF
Type I SS
Mean Square
F Value
Pr > F
3 8 3 3 3
7.1111896 48.6932667 134.4534729 3.9931229 30.9356342
2.3703965 6.0866583 44.8178243 1.3310410 10.3118781
0.61 1.57 11.54 0.34 2.65
0.6142 0.1815 <.0001 0.7947 0.0687
Pr > F
Source SEQGRP PAT(SEQGRP) DOSE PER CO
Source SEQGRP PAT(SEQGRP) DOSE PER CO
DF
Type III SS
Mean Square
F Value
3 8 3 2 3
10.3361626 48.6932667 118.0792959 0.0628167 30.9356342
3.4453875 6.0866583 39.3597653 0.0314083 10.3118781
0.89 1.57 10.13 0.01 2.65
0.4604 0.1815 0.0001 14 0.9920 0.0687
Tests of Hypotheses Using the Type III MS for PAT(SEQGRP) as an Error Term Source
DF
Type III SS
Mean Square
F Value
Pr > F
SEQGRP
3
10.33616258
3.44538753
0.57
0.6526
170
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT 9.2. SAS Output for Example 9.2 (continued)
CrossOver Design Example 9.2: Antibiotic Blood Levels Following Aerosol Inhalation Model 2 The GLM Procedure
Class
Class Level Information Values
Levels
SEQGRP DOSE PER PAT
4 4 4 12
ABDC BCAD CDBA DACB A B C D 1 2 3 4 101 102 103 104 105 106 107 108 109 110 111 112 Number of observations
48
Dependent Variable: AUC Source
DF
Sum of Squares
Model Error Corrected Total
20 27 47
225.1866862 104.8902450 330.0769312
Mean Square
F Value
Pr > F
11.2593343 3.8848239
2.90
0.0053
R-Square
Coeff Var
Root MSE
AUC Mean
0.682225
34.38658
1.970996
5.731875
Source
DF
Type I SS
3 8 3 3 1 1 1
7.1111896 48.6932667 134.4534729 3.9931229 0.0870204 21.3521113 12 9.4965025
SEQGRP PAT(SEQGRP) DOSE PER COA COB COC Source
Mean Square
F Value
2.3703965 6.0866583 44.8178243 1.3310410 0.0870204 21.3521113 9.4965025
0.61 1.57 11.54 0.34 0.02 5.50 2.44
DF
Type III SS
Mean Square
F Value
3 8 3 3 1 1 1
10.3361626 48.6932667 118.0792959 3.9931229 1.9374669 28.9850625 9.4965025
3.4453875 6.0866583 39.3597653 1.3310410 1.9374669 28.9850625 9.4965025
0.89 1.57 10.13 0.34 0.50 7.46 2.44
SEQGRP PAT(SEQGRP) DOSE PER COA COB COC
Pr > F 0.6142 0.1815 <.0001 0.7947 0.8821 0.0267 13 0.1296 Pr > F 0.4604 0.1815 0.0001 15 0.7947 17 0.4861 0.0110 0.1296
Tests of Hypotheses Using the Type III MS for PAT(SEQGRP) as an Error Term Source
DF
Type III SS
Mean Square
F Value
SEQGRP
3
10.33616258
3.44538753
0.57
Pr > F 0.6526 16
Chapter 9: The Crossover Design 171
9.4
Details & Notes Study resources might be wasted by conducting crossover studies 9.4.1 that are likely to yield inconsistent treatment differences among periods. This scenario could arise from a carryover effect of certain treatments, improper wash-out periods between treatments, or the inability of patients to be restored to their original baseline conditions at the start of each period. Designs balanced for residual effects should be used whenever possible with crossover studies. Such schemes have the property that each treatment is preceded by every other treatment the same number of times. The presence of treatment carryover effects can easily be investigated with these types of designs. The layout used in Example 9.2 is that of a 4-treatment design balanced for residuals. In this design, four sequence groups are used, and each treatment is preceded by each of the other treatments exactly once. An example of a 3-treatment (A, B, and C), 3-period crossover design balanced for residuals is shown in Table 9.5. Six sequence groups are used, and each treatment is preceded by every other treatment exactly twice. Note that such balance would not be possible with only three sequence groups. In general, if k represents the number of treatments, the minimum number of sequence groups needed for complete residual balance is k, if k is even, and 2k, if k is odd. TABLE 9.5. A 3-Period Crossover Design Balanced for Residuals Sequence
Period 1
Period 2
Period 3
1 2 3 4 5 6
A B C A B C
B C A C A B
C A B B C A
Study design features should also include an adequate wash-out duration between treatment periods. As previously mentioned, the length of the trial and the number of periods should also be limited to minimize changing clinical conditions of the patients during the study. In a two-period crossover design under the assumption of no 9.4.2 carryover effect (such as in Example 9.1), the Period effect is completely confounded with the Treatment-by-Sequence Group interaction. In other words, any difference between Sequence Groups in the Treatment comparisons are indistinguishable from Period differences. The results of Example 9.1 would be identical if the PD effect is replaced with the TRT*SEQ interaction in the MODEL statement in PROC GLM.
172
Common Statistical Methods for Clinical Research with SAS Examples
Similarly, a Treatment effect implies that response differences between Period 1 and Period 2 are in opposite directions for the two Sequence Groups, i.e., a Period-by-Sequence Group interaction. The Treatment (TRT) effect in the MODEL statement in PROC GLM used in Example 9.1 could be replaced with this interaction (PD*SEQ) with the same results. When the MODEL statement is written in this way, it is identical to the form discussed for the general repeated measures ANOVA of Chapter 8. The preceding interactions cannot be used interchangeably with main effects in crossovers that involve more than two periods or those in which a carryover effect might exist. In the crossover analysis, the presence of a Treatment-by-Period 9.4.3 interaction generally requires separate analyses within each period using, for example, the two-sample t-test (Chapter 5). In such a situation, many analysts accept only the results of the first period. Dropouts following the first period will create bias in the analysis 9.4.4 of the crossover study. For example, in the two-way crossover, the ‘withinpatient control’ advantage of the crossover is lost when a patient has data from the first period, but not from the second. Whether that patient is dropped from the analysis or the analysis is adjusted for the missing value (e.g., use of imputation), the results will likely have some unknown bias. If there are many dropouts, the primary analysis might use just the data from the first period. Because dropout rates usually increase with lengthier studies, this problem can be controlled by using a small number of short treatment periods whenever a crossover design is contemplated. In Example 9.2, the SAS Type I sums of squares for Dose Group 9.4.5 (DOSE), Period (PER), and Sequence Group (SEQGRP) are unadjusted for carryover effects. These are the same SS’s that would be obtained if the carryover terms (CO or COA, COB, COC) were omitted from the model (Model 2). Tests for a significant Dose Group effect would typically use the Type III sum of squares, which is adjusted for residuals. However, when a significant carryover effect is found, caution must be used in the interpretation of the Treatment effects (see Section 9.4.7). The term ‘balanced residuals’ implies that, if the carryover effect 9.4.6 is the same for each treatment, these effects will be cancelled when looking at differences in treatment means, and thus, the true treatment difference is free of any carryover effects. Biased estimates of treatment differences will result if the residuals are not balanced across treatment groups. To check for different degrees of carryover among treatments, a 9.4.7 preliminary test for carryover effects can be made in the manner demonstrated in Example 9.2. Such preliminary tests for model assumptions are often conducted at a significance level greater than 0.05, perhaps in the 0.10 to 0.20 range. If non-significant, the carryover effect can be dropped from the model.
Chapter 9: The Crossover Design 173
The p-value for the overall carryover effect of 0.0687 in Example 9.2 is small enough to raise concern regarding carryover. Significant carryover effects might provide useful information for designing future studies (e.g., using a longer observation period and/or a longer wash-out period). Although this chapter introduces the crossover design for normal 9.4.8 response data, crossover designs can also be performed with non-normal responses, which include categorical and rank data. Using binary responses and non-parametric approaches to the analysis of non-normal responses in crossovers are discussed in Jones and Kenward (1989), and examples using SAS are included in Stokes, Davis, and Koch (2000).
174
Common Statistical Methods for Clinical Research with SAS Examples
&++$$3377((55 /LQHDU5HJUHVVLRQ ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
5HJUHVVLRQDQDO\VLVLVXVHGWRDQDO\]HWKHUHODWLRQVKLSEHWZHHQDUHVSRQVH\DQGD TXDQWLWDWLYH IDFWRU [ .QRZOHGJH RI WKLV UHODWLRQVKLS FDQ EH LPSRUWDQW IRU SUHGLFWLQJXQPHDVXUHGUHVSRQVHVIURPDNQRZQ[YDOXH ([DPSOHV ZKHUH UHJUHVVLRQ DQDO\VLV PLJKW EH XVHIXO LQ FOLQLFDO GDWD DQDO\VLV LQFOXGH WKH PRGHOLQJ RI EORRG SUHVVXUH UHVSRQVH \ RQ WKH GRVH RI D QHZ DQWL K\SHUWHQVLYHGUXJ[ FKROHVWHUROOHYHO\ RQSDWLHQW VDJH[ SDLQUHOLHI\ RQ WLPH DIWHU GRVLQJ ZLWK DQ DQWLLQIODPPDWRU\ WUHDWPHQW [ RU GHJUHH RI ZRXQG KHDOLQJ\ RQWKHEDVHOLQHVXUIDFHDUHDRIDEXUQZRXQG[
:KLOHDQXPEHURISRWHQWLDOUHODWLRQVKLSVEHWZHHQ[DQG\PLJKWEHFRQVLGHUHGLQ WKLV FKDSWHU WKH IRFXV LV RQ OLQHDU UHODWLRQVKLSV RQO\ 6LPSOH OLQHDU UHJUHVVLRQ PHWKRGRORJ\ SURYLGHV DQ HVWLPDWH RI WKH EHVWILWWLQJ OLQH WKURXJK D VHW RI GDWD SRLQWV [\ [\ [Q\Q
6\QRSVLV
*LYHQ D QXPEHU RI REVHUYHG YDOXHV RI D QRUPDOO\ GLVWULEXWHG UHVSRQVH YDULDEOH \\\Q WKHPHDQ \ UHSUHVHQWVWKHEHVWHVWLPDWHRIDIXWXUHUHVSRQVH7KH LGHD EHKLQG UHJUHVVLRQ DQDO\VLV LV WR LPSURYH WKLV HVWLPDWH E\ XVLQJ WKH YDOXHRI VRPHUHODWHGIDFWRU[,I[DQG\KDYHDNQRZQOLQHDUUHODWLRQVKLSWKHEHVWHVWLPDWH RI\ZLOOEHDOLQHDUIXQFWLRQRIWKHNQRZQYDOXH[
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$OLQHDUUHODWLRQVKLSEHWZHHQDUHVSRQVH\DQGDQLQGHSHQGHQWYDULDEOH[FDQEH H[SUHVVHGDV \ [ ZKHUH LVWKHLQWHUFHSWDQG LVWKHVORSHDVVKRZQLQ)LJXUH%HFDXVHWKH UHVSRQVH LV VXEMHFW WR UDQGRP PHDVXUHPHQW HUURU \ PLJKW GLIIHU LQ UHSHDWHG VDPSOLQJ IRU WKH VDPH YDOXH RI [ 7KH DFFRXQWV IRU WKLV UDQGRP QDWXUH RI WKH UHVSRQVH\
)LJXUH6LPSOH/LQHDU5HJUHVVLRQRI\RQ[
e
< = a + b;
< b\XQLWV
a --
[XQLW
GDWDSRLQWV
;
7\SLFDOO\\RXKDYHQSDLUVRIFRRUGLQDWHV[L\L IRUL Q DQGDVVXPH WKDWWKH\L VDUHLQGHSHQGHQWQRUPDOO\GLVWULEXWHGDQGKDYHWKHVDPHYDULDQFH IRUDOO[YDOXHV)URPWKHVHGDWD\RXHVWLPDWHWKHPRGHOSDUDPHWHUV DQG E\D DQGEUHVSHFWLYHO\LQVXFKDZD\WKDWWKHUHVXOWLQJµSUHGLFWLRQHTXDWLRQ¶
\A DE [ LV WKH µEHVWILWWLQJ¶ OLQH WKURXJK WKH PHDVXUHG FRRUGLQDWHV ,Q WKLV VHQVH WKH SUHGLFWLRQHTXDWLRQUHSUHVHQWVWKHEHVWHVWLPDWHRIWKHXQNQRZQUHJUHVVLRQPRGHO
&KDSWHU/LQHDU5HJUHVVLRQ
VDWLVI\WKLVUHTXLUHPHQWLVWRPLQLPL]HWKHVXPRIVTXDUHGGLIIHUHQFHVEHWZHHQ\ DQG\A
å \ - \Ö
66(
Q
L
L
L 7KLVLVNQRZQDVWKH/HDVW6TXDUHVFULWHULRQ66(LVWKHVXPRIVTXDUHGHUURUVRU GHYLDWLRQVEHWZHHQWKHDFWXDODQGSUHGLFWHGUHVSRQVHV
'HILQHWKHTXDQWLWLHV6 6 DQG6 DVIROORZV \\
6\\ =
å \ - \ = å \
6[[ =
å
[ L - [ =
å
[ L - [ \L - \ =
6[\ =
Q
L = Q
L =
Q
L
L =
L
å[ Q
L =
L
- Q × \
- Q × [
Q
L =
å [ × \ - Q × [ × \ Q
L =
L
L
7KHµEHVWILWWLQJ¶OLQHEDVHGRQWKH/HDVW6TXDUHVFULWHULRQLVJLYHQE\
A\ DE [
ZKHUH E 6 6
DQG
[\
[[
D \ ±E [
7KH EHVW HVWLPDWH RI WKH YDULDQFH FDOFXODWHGIURPWKHIRUPXOD
[\
DQG
[[
6 [[ 66(
×
6 \\
-
6 [\
LV V 66( Q± ZKHUH 66( FDQ EH
6 [[
EDVHGRQQ±GHJUHHVRIIUHHGRP 7KHPDLQTXHVWLRQSRVHGE\VLPSOHOLQHDUUHJUHVVLRQFRQFHUQVWKHVLJQLILFDQFHRI WKH VORSH SDUDPHWHU,IWKHVORSH LVWKHQWKHYDOXHRI[ZLOOQRWLPSURYHWKH SUHGLFWLRQRI\RYHUWKHRUGLQDU\SUHGLFWRU \ $VLJQLILFDQWVORSH LQGLFDWLQJD OLQHDU UHODWLRQVKLS EHWZHHQ [ DQG \ PHDQV WKDW NQRZOHGJH RI WKH [YDOXHV ZLOO VLJQLILFDQWO\LPSURYH\RXUSUHGLFWLRQDELOLW\
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH VWDWLVWLFDOWHVWLVEDVHGRQDIXQFWLRQRIWKHVORSHHVWLPDWHEZKLFKKDVWKH WGLVWULEXWLRQZKHQWKHQXOOK\SRWKHVLVRIµVORSH¶LVWUXH7KHWHVWVXPPDU\LV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
WHVWVWDWLVWLF
GHFLVLRQUXOH
+ +
$
E W V 6[[
UHMHFW+ LI_ W _ !W
Q±
,IWKHVORSHLVPHDQLQJIXO\RXFDQHVWLPDWHWKHPHDQUHVSRQVHDWDJLYHQYDOXHRI [IRUH[DPSOH[ ZLWKDFRQILGHQFHLQWHUYDOE\
\Ö [ ± W Q - × V ×
[ - [ + Q 6[[
ZKHUHA \ DE [ [
([DPSOHV
p ([DPSOH$QWL$QJLQDO5HVSRQVHYV'LVHDVH+LVWRU\
7UHDGPLOOVWUHVVWHVWVZHUHDGPLQLVWHUHGWRSDWLHQWVZLWKDQJLQDSHFWRULV EHIRUHDQGZHHNVDIWHURQFHGDLO\GRVLQJZLWKDQH[SHULPHQWDODQWLDQJLQDO PHGLFDWLRQ7KHLQYHVWLJDWRUZDQWHGWRNQRZLIWKHLPSURYHPHQWLQH[HUFLVH GXUDWLRQLVUHODWHGWRWKHSDWLHQW VGLVHDVHKLVWRU\'LVHDVHGXUDWLRQVLQFH LQLWLDOGLDJQRVLVLQ\HDUV DQGSHUFHQWLPSURYHPHQWLQWUHDGPLOOZDONLQJ WLPHVDUHVKRZQLQ7DEOHIRUDVWXG\ZLWKSDWLHQWVHQUROOHG,VWKHUHD VLJQLILFDQWOLQHDUUHODWLRQVKLSEHWZHHQLPSURYHPHQWRQPHGLFDWLRQDQG GLVHDVHGXUDWLRQ"
&KDSWHU/LQHDU5HJUHVVLRQ
7$%/(5DZ'DWDIRU([DPSOH [
[
\ 3DWLHQW 1XPEHU
,PSURYHPHQW
3DWLHQW 1XPEHU
,PSURYHPHQW
\
'LVHDVH 'XUDWLRQ \HDUV
'LVHDVH 'XUDWLRQ \HDUV
6ROXWLRQ 8VLQJWKHIRUPXODVSUHVHQWHGFRPSXWH
å[ å[ L
L
å \ å [ × \
L
L
L
å \ Q
L
VRWKDW
6\\ - î - 6[\ - 6[[ -
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KHHVWLPDWHGUHJUHVVLRQHTXDWLRQLVIRXQGDVIROORZV
E
-
-
DQG D
- - ×
ZKLFK\LHOG \A ± [
,QDGGLWLRQ\RXILQG
66(
- -
DQG
V
7KH UHJUHVVLRQ VORSH ZKLFK UHSUHVHQWV WKH DYHUDJH FKDQJH LQ \ IRU D RQHXQLW FKDQJH LQ [ LV WKH EHVW HVWLPDWH RI WKH UDWH RI LPSURYHPHQW LQ WUHDGPLOO SHUIRUPDQFH IRU HDFK RQH\HDU LQFUHDVH LQ GLVHDVH GXUDWLRQ LH ± SHU \HDU 7KH VWDWLVWLFDO VLJQLILFDQFH LV GHWHUPLQHG E\ WKH WWHVW VXPPDUL]HG DV IROORZV
QXOOK\SRWKHVLV
+
DOWK\SRWKHVLV
+
WHVWVWDWLVWLF
GHFLVLRQUXOH
UHMHFW+ LI_ W _ !W
FRQFOXVLRQ
%HFDXVH ! \RX UHMHFW + DQG
$
W
-
-
FRQFOXGH WKDW WUHDGPLOO LPSURYHPHQW KDV D VLJQLILFDQWOLQHDUUHODWLRQVKLSWRGLVHDVHGXUDWLRQ
&KDSWHU/LQHDU5HJUHVVLRQ
7KH PHDQ LPSURYHPHQW LQ WUHDGPLOO H[HUFLVH WLPH IRU WKH DYHUDJH SDWLHQW ZLWK D \HDUKLVWRU\RIDQJLQDLVFRPSXWHGDV
\Ö
7KHFRQILGHQFHLQWHUYDOIRUWKLVPHDQLV
-
× ×
× RU±WR %HFDXVHWKLVLQWHUYDOFRQWDLQV\RXZRXOGEHLQFOLQHGWR FRQFOXGHWKDWWKHDYHUDJHWUHDWHGSDWLHQWZLWKD\HDUKLVWRU\RIDQJLQDGRHVQRW KDYHVLJQLILFDQWLPSURYHPHQWLQH[HUFLVHWROHUDQFH
6$6$QDO\VLVRI([DPSOH 7KH6$6FRGHIRUDQDO\]LQJWKHGDWDVHWLQWKLVH[DPSOHLVVKRZQEHORZDQGWKH UHVXOWLQJRXWSXWLVVKRZQLQ2XWSXW$SULQWRXWRIWKHGDWDVHWLVILUVWREWDLQHG E\ XVLQJ 352& 35,17 å IROORZHG E\ WKH VXPPDU\ VWDWLVWLFV IRU WKH [ DQG \ YDOXHVXVLQJ352&0($16 )RUWKLVH[DPSOHWKHUHJUHVVLRQDQDO\VLVLVSHUIRUPHGXVLQJ352&*/0DOWKRXJK 352& 5(* FDQ DOVR EH XVHG :KHQ 352& */0 LV XVHG ZLWKRXW D &/$66 VWDWHPHQW6$6DVVXPHVWKDWWKHLQGHSHQGHQWYDULDEOHVLQWKH02'(/VWDWHPHQW DUHTXDQWLWDWLYHDQGSHUIRUPVDUHJUHVVLRQDQDO\VLV
6$6&RGHIRU([DPSOH DATA ANGINA; INPUT PAT X_DUR Y_IMPR DATALINES; 1 1 40 2 1 90 3 5 1 80 6 5 60 7 9 2 50 10 6 40 11 13 2 50 14 2 110 15 17 5 -30 18 3 20 19 ;
@@; 3 1 1 3 1
30 10 60 20 40
4 8 12 16 20
2 30 4 -10 4 0 3 70 6 0
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
PROC SORT DATA = ANGINA; BY X_DUR Y_IMPR; PROC PRINT DATA = ANGINA; å VAR PAT X_DUR Y_IMPR; TITLE1 ’Linear Regression & Correlation’; TITLE2 ’Example 10.1: Improvement in Angina vs. Disease Duration’; RUN;
PROC MEANS MEAN STD N; VAR X_DUR Y_IMPR; RUN;
PROC GLM DATA = ANGINA; MODEL Y_IMPR = X_DUR / P CLM SS1; RUN;
7KH UHJUHVVLRQ HVWLPDWHV D DQG E DUH SULQWHG LQ WKH (VWLPDWH FROXPQ LQ 2XWSXW ê 1RWH WKH WWHVW IRU VORSH LV ± ZLWK D SYDOXH RI ZKLFKFRQILUPVWKHDQDO\VLVXVLQJWKHFDOFXODWLQJIRUPXODV 7KH 3 DQG &/0 RSWLRQV VSHFLILHG LQ WKH 02'(/ VWDWHPHQW UHTXHVW WKH SUHGLFWHGYDOXHVEDVHGRQWKHUHJUHVVLRQHTXDWLRQIRUHDFKGDWDSRLQWDORQJZLWK FRQILGHQFHLQWHUYDOVIRUWKHPHDQUHVSRQVHDWWKHFRUUHVSRQGLQJ[YDOXHV)RU [ 2EVHUYDWLRQV DQG WKH SUHGLFWHG UHVSRQVH KDV D FRUUHVSRQGLQJ FRQILGHQFH LQWHUYDO RI WR ñZKLFKFRQILUPVWKH PDQXDOFDOFXODWLRQV 2873876$62XWSXWIRU([DPSOH Linear Regression & Correlation Example 10.1: Improvement in Angina vs. Disease Duration
Obs
PAT
X_DUR
Y_IMPR
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
7 1 19 11 5 2 4 9 13 14 15 18 3 16 8 12 17 6 20 10
1 1 1 1 1 1 2 2 2 2 3 3 3 3 4 4 5 5 6 6
10 40 40 60 80 90 30 50 50 110 20 20 30 70 -10 0 -30 60 0 40
å
&KDSWHU/LQHDU5HJUHVVLRQ
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Linear Regression & Correlation Example 10.1: Improvement in Angina vs. Disease Duration The MEANS Procedure
Variable Mean Std Dev N ---------------------------------------------X_DUR 2.8000000 1.7044833 20 Y_IMPR 38.0000000 35.0338182 20 ----------------------------------------------
The GLM Procedure Number of observations
20
Dependent Variable: Y_IMPR
Source
DF
Model Error Corrected Total
1 18 19
R-Square 0.241880 Source X_DUR
ô
Sum of Squares 5640.65217 17679.34783 23320.00000
Coeff Var 82.47328
Mean Square
F Value
Pr > F
5640.65217 982.18599
5.74
0.0276
Root MSE 31.33985
Y_IMPR Mean 38.00000
DF
Type I SS
Mean Square
F Value
1
5640.652174
5640.652174
5.74
Parameter
Estimate
Intercept X_DUR
66.30434783 -10.10869565
Standard Error
ê
13.73346931 4.21820157
t Value 4.83 -2.40
Pr > F
ò
0.0276
Pr > |t| 0.0001 0.0276
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Linear Regression & Correlation Example 10.1: Improvement in Angina vs. Disease Duration The GLM Procedure
Observation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Observed
95% Confidence Limits for Residual Mean Predicted Value
Predicted
10.0000000 40.0000000 40.0000000 60.0000000 80.0000000 90.0000000 30.0000000 50.0000000 50.0000000 110.0000000 20.0000000 20.0000000 30.0000000 70.0000000 -10.0000000 0.0000000 -30.0000000 60.0000000 0.0000000 40.0000000
56.1956522 -46.1956522 34.4879982 56.1956522 - 16.1956522 34.4879982 56.1956522 -16.1956522 34.4879982 56.1956522 3.8043478 34.4879982 56.1956522 23.8043478 34.4879982 56.1956522 33.8043478 34.4879982 46.0869565 -16.0869565 29.7460282 46.0869565 3.9130435 29.7460282 46.0869565 3.9130435 29.7460282 46.0869565 63.9130435 29.7460282 35.9782609 -15.9782609 21.1491101 35.9782609 -15.9782609 21.1491101 35.9782609 -5.9782609 21.1491101 35.9782609 34.0217391 21.1491101 25.8695652 -35.8695652 7.7076388 25.8695652 -25.8695652 7.7076388 15.7608696 -45.7608696 -8.6702890 15.7608696 44.2391304 -8.6702890 5.6521739 -5.6521739 -26.3006276 5.6521739 34.3478261 -26.3006276
Sum of Residuals Sum of Squared Residuals Sum of Squared Residuals - Error SS PRESS Statistic First Order Autocorrelation Durbin-Watson D
-0.00000 17679.34783 0.00000 22187.33121 -0.05364 1.91983
77.9033062 77.9033062 77.9033062 77.9033062 77.9033062 77.9033062 62.4278848 62.4278848 62.4278848 62.4278848 50.8074117 50.8074117 50.8074117 50.8074117 44.0314916 44.0314916 40.1920281ñ 40.1920281 37.6049755 37.6049755
õ
0XOWLSOH/LQHDU5HJUHVVLRQ
6LPSOHOLQHDUUHJUHVVLRQUHIHUVWRDVLQJOHUHVSRQVHYDULDEOH\PRGHOHGRQDVLQJOH SUHGLFWRU YDULDEOH [ 6RPHWLPHV VHYHUDO TXDQWLWDWLYH IDFWRUV [ [ [ DUH WKRXJKWWRDIIHFWDUHVSRQVH\7KHVHSUHGLFWRUVFDQEHXVHGVLPXOWDQHRXVO\LQD PXOWLSOHOLQHDUUHJUHVVLRQHTXDWLRQRIWKHIRUP
\ E E [ E [ E [
N
N
DVVXPLQJNIDFWRUV/LNHWKHHVWLPDWHERI LQWKHVLPSOHOLQHDUPRGHOE HVWLPDWHV WKH XQNQRZQ SDUDPHWHU DQG LV GHWHUPLQHG E\ WKH DPRXQW RI FRQWULEXWLRQ [ PDNHVWRWKHSUHGLFWLRQRI\IRUL «N ,QWHUSUHWDWLRQDQGHVWLPDWLRQRIWKH ¶VLVWKHVDPHDVLQWKHVLPSOHOLQHDUUHJUHVVLRQFDVH UHSUHVHQWVWKHH[SHFWHG L
L
L
L
L
&KDSWHU/LQHDU5HJUHVVLRQ
LQFUHDVH LQ UHVSRQVH \ IRU D XQLW LQFUHDVH LQ [ ZKHQ DOO RWKHU [¶V DUH KHOG FRQVWDQW 7KHVH ¶VDUHHVWLPDWHGE\XVLQJWKH/HDVW6TXDUHVPHWKRG7KH5(* SURFHGXUHLQ6$6LVXVHGWRLOOXVWUDWHWKLVDQDO\VLVDVVKRZQLQ([DPSOH L
L
p ([DPSOH6\PSWRPDWLF5HFRYHU\LQ3HGLDWULF'HK\GUDWLRQ
$VWXG\ZDVFRQGXFWHGWRGHWHUPLQHWKHGHJUHHRIUHFRYHU\WKDWWDNHVSODFH PLQXWHVIROORZLQJWUHDWPHQWRIFKLOGUHQGLDJQRVHGDWWKHFOLQLFZLWK PRGHUDWHWRVHYHUHGHK\GUDWLRQ8SRQGLDJQRVLVDQGVWXG\HQWU\SDWLHQWV ZHUHWUHDWHGZLWKDQHOHFWURO\WLFVROXWLRQDWYDULRXVGRVHOHYHOV P(TO LQDIUR]HQIODYRUHGLFHSRSVLFOH7KHGHJUHHRI UHK\GUDWLRQZDVGHWHUPLQHGE\XVLQJDVXEMHFWLYHVFDOHEDVHGRQSK\VLFDO H[DPLQDWLRQDQGSDUHQWDOLQSXW7KHVHVFRUHVZHUHFRQYHUWHGWRDWR SRLQWVFDOHUHSUHVHQWLQJWKHSHUFHQWRIUHFRYHU\7KHFKLOG¶VDJHDQGZHLJKW ZHUHDOVRUHFRUGHG,VUHFRYHU\UHODWHGWRHOHFWURO\WHGRVHEDVHGRQWKHGDWD VKRZQLQ7DEOH"
6ROXWLRQ 7KHUHJUHVVLRQSURFHGXUH352&5(* LQ6$6LVXVHGWRSHUIRUPDPXOWLSOHOLQHDU UHJUHVVLRQ DQDO\VLV UHJUHVVLQJ WKH SHUFHQW UHFRYHU\ DW PLQXWHV \ RQ WKH FRQFRPLWDQW YDULDEOHV 'RVH /HYHO ; $JH ; DQG :HLJKW ; $OWKRXJK 352& */0 FDQ DOVR EH XVHG IRU WKLV DQDO\VLV 352& 5(* KDV GLDJQRVWLF DQG PRGHO EXLOGLQJ DGYDQWDJHV IRU UHJUHVVLRQ DQDO\VLV ZKHQ WKHUH DUH QR FODVV YDULDEOHV,QDGGLWLRQPXOWLSOH02'(/VWDWHPHQWVFDQEHXVHGZLWK352&5(*
,QWKLVH[DPSOHWKHPRGHOLVZULWWHQ
\
[
[
[
ZKHUH[ GRVHP(TO [ SDWLHQW¶VDJH\HDUV DQG[ SDWLHQW¶VZHLJKWOEV DUH WKH LQGHSHQGHQW RU µH[SODQDWRU\¶ YDULDEOHV 7KH SDUDPHWHUV DUH ZKLFK UHSUHVHQWV WKH RYHUDOO DYHUDJH UHVSRQVH DQG ZKLFK UHSUHVHQWV WKH DYHUDJH LQFUHDVHLQWKHGHJUHHRIUHFRYHU\\ IRUDRQHXQLWLQFUHDVHLQ[ L 7KH REMHFWLYHLVWRLQYHVWLJDWHWKHHIIHFWRI'RVH/HYHORQUHVSRQVHDQGKRZDJHDQG ZHLJKWPLJKWLQIOXHQFHWKDWHIIHFW
L
L
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(5DZ'DWDIRU([DPSOH
3DWLHQW 1XPEHU
'HJUHHRI 5HFRYHU\ <
'RVHP(TO ;
$JH\UV ;
:HLJKWOEV ;
&KDSWHU/LQHDU5HJUHVVLRQ
6$6$QDO\VLVRI([DPSOH 7KH6$6FRGHDQGRXWSXWIRUDQDO\]LQJWKHGDWDVHWLQWKLVH[DPSOHDUHVKRZQRQ WKHIROORZLQJSDJHV$SULQWRXWRIWKHGDWDVHWLVILUVWREWDLQHGXVLQJ352&35,17ù 7KH &255 RSWLRQ LQ 352& 5(* UHTXHVWV WKH LQFOXVLRQ RI WKH FRUUHODWLRQ FRHIILFLHQWVEHWZHHQHDFKSDLURIYDULDEOHVLQWKHRXWSXW 11 1RWLFHWKDW$JHDQG :HLJKW DUH KLJKO\ FRUUHODWHG U DQG WKDW WKH GHJUHH RI UHFRYHU\ DW PLQXWHV DIWHU WUHDWPHQW \ LV SRVLWLYHO\ FRUUHODWHG ZLWK GRVH DQG QHJDWLYHO\ FRUUHODWHGZLWK$JHDQG:HLJKW 7KH )YDOXH
12
K\SRWKHVLV + YV +
D
IRU WKH 02'(/ VWDWHPHQW LV D JOREDO WHVW RI WKH
RU RU
7KLVLVKLJKO\VLJQLILFDQWS LQGLFDWLQJWKDWWKHUHVSRQVHLVOLQHDUO\UHODWHG WRDWOHDVWRQHRIWKHLQGHSHQGHQWYDULDEOHV 7KHSUHGLFWLRQPRGHOFDQEHZULWWHQIURPWKHSDUDPHWHUHVWLPDWHV 13 DV
\A 'RVH $JH ± :HLJKW 7KH ¶VFDQEHWHVWHGLQGLYLGXDOO\IRUHTXDOLW\WRXVLQJWKHWWHVWDVVKRZQLQ ([DPSOH6$6SULQWVWKHVHWHVWVZLWKWKHFRUUHVSRQGLQJSYDOXHVDVVKRZQLQ 2XWSXW 14 'RVH/HYHOLVVLJQLILFDQWS ZKLOH$JHS DQG :HLJKWS DUHQRWEDVHGRQWKLVPRGHO L
2QH PHWKRG WR FKHFN WKH µJRRGQHVVRIILW¶ RI D UHJUHVVLRQ PRGHO LV E\ LWV FRHIILFLHQWRIGHWHUPLQDWLRQ5
0XOWLSOH 02'(/ VWDWHPHQWV FDQ EH XVHG LQ 352& 5(* 7KHVH DUH ODEHOHG FRQVHFXWLYHO\LQWKH6$6RXWSXW02'(/02'(/HWF DFFRUGLQJWRWKHRUGHU VSHFLILHGLQWKH352&5(*V\QWD[7KHVHFRQG02'(/VWDWHPHQWVHH6$6FRGH IRU([DPSOH UHTXHVWVDVLPSOHOLQHDUUHJUHVVLRQDQGXVHV'RVH/HYHODVWKH RQO\ H[SODQDWRU\ YDULDEOH 7KH RXWSXW VKRZV D VLJQLILFDQW OLQHDU UHODWLRQVKLS EHWZHHQ'RVHDQGUHVSRQVHS ZKHQ$JHDQG:HLJKWDUHQRWLQFOXGHG +RZHYHUWKH5 KDVEHHQUHGXFHGWRRQO\ 16 ZKLFKLQGLFDWHVDUHODWLYHO\ SRRUILWFRPSDUHGWRWKHIXOOPRGHOGHVSLWHWKHODFNRIVLJQLILFDQFHIRXQGIRU$JH DQG:HLJKWLQ02'(/
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KLV ILQGLQJ FRPELQHG ZLWK WKH KLJK FRUUHODWLRQ EHWZHHQ $JH DQG :HLJKW VXJJHVWV WKH SRVVLELOLW\ RI FROOLQHDULW\ D FRQGLWLRQ LQ ZKLFK WZR RU PRUH LQGHSHQGHQW YDULDEOHV ERWK RI ZKLFK PHDVXUH VLPLODU XQGHUO\LQJ HIIHFWV DUH LQFOXGHG LQ WKH UHJUHVVLRQ PRGHO %\ XVLQJ UHGXQGDQW PRGHO LQIRUPDWLRQ WKH SDUDPHWHU HVWLPDWHV E ¶V PLJKW EH LQDSSURSULDWH DQGRU WKH YDULDQFH RI WKHVH HVWLPDWHVPLJKWEHLQIODWHGZKLFKUHVXOWVLQIHZHUVLJQLILFDQWILQGLQJV L
5HVXOWV IRU WZR DGGLWLRQDO PRGHOV DUH VKRZQ $JH DQG :HLJKW DUH VHSDUDWHO\ LQFOXGHG ZLWK 'RVH /HYHO LQ 02'(/ 17 DQG 02'(/ 18 UHVSHFWLYHO\ $ FRHIILFLHQWRIGHWHUPLQDWLRQRI5 19 LVREWDLQHGE\XVLQJ'RVH/HYHODQG $JHDQGDQ5 20 LVIRXQGZLWK'RVH/HYHODQG:HLJKW7KHVHUHVXOWV LQGLFDWHWKDWLI$JHRU:HLJKWLVXVHGLQWKHPRGHOLQFOXGLQJWKHRWKHUGRHVQRW KHOSPXFK
7KHILQDOPRGHOXVHGLV02'(/EHFDXVHLWKDVWKHKLJKHVW5 :LWKWKLVPRGHO :HLJKWLVDKLJKO\VLJQLILFDQWH[SODQDWRU\YDULDEOHS
6$6&RGHIRU([DPSOH DATA DEHYD; INPUT PAT Y DOSE AGE WT @@; DATALINES; 1 77 0.0 4 28 2 65 1.5 5 35 4 63 1.0 9 76 5 75 0.5 5 31 7 70 1.0 6 35 8 90 2.5 6 47 10 72 3.0 8 50 11 67 2.0 7 50 13 75 1.5 4 33 14 58 3.0 8 59 16 55 0.0 8 58 17 80 1.0 7 55 19 44 0.5 9 66 20 62 1.0 6 43 22 75 2.5 7 50 23 77 1.5 5 29 25 68 3.0 9 61 26 71 2.5 10 71 28 80 2.0 3 27 29 70 0.0 9 56 31 88 1.0 3 22 32 68 0.5 5 37 34 90 3.0 5 45 35 79 1.5 8 53 ;
3 75 2.5 8 55 6 82 2.0 5 27 9 49 0.0 9 59 12 100 2.5 7 46 15 58 1.5 6 40 18 55 2.0 10 76 21 60 1.0 6 48 24 80 2.5 11 64 27 90 1.5 4 26 30 58 2.5 8 57 33 60 0.5 6 44 36 90 2.0 4 29
PROC PRINT DATA = DEHYD; ù VAR PAT Y DOSE AGE WT; TITLE1 ’Multiple Linear Regression’; TITLE2 ’Example 10.2: Recovery in Pediatric Dehydration’; RUN; PROC REG CORR MODEL Y = MODEL Y = MODEL Y =
11 DATA = DEHYD; DOSE AGE WT / SS1 SS2 VIF COLLINOINT; DOSE ; 17 DOSE AGE ;
MODEL Y = DOSE WT RUN;
;
18
&KDSWHU/LQHDU5HJUHVVLRQ
2873876$62XWSXWIRU([DPSOH Multiple Linear Regression Example 10.2: Recovery in Pediatric Dehydration
Obs
PAT
Y
DOSE
AGE
WT
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
77 65 75 63 75 82 70 90 49 72 67 100 75 58 58 55 80 55 44 62 60 75 77 80 68 71 90 80 70 58 88 68 60 90 79 90
0.0 1.5 2.5 1.0 0.5 2.0 1.0 2.5 0.0 3.0 2.0 2.5 1.5 3.0 1.5 0.0 1.0 2.0 0.5 1.0 1.0 2.5 1.5 2.5 3.0 2.5 1.5 2.0 0.0 2.5 1.0 0.5 0.5 3.0 1.5 2.0
4 5 8 9 5 5 6 6 9 8 7 7 4 8 6 8 7 10 9 6 6 7 5 11 9 10 4 3 9 8 3 5 6 5 8 4
28 35 55 76 31 27 35 47 59 50 50 46 33 59 40 58 55 76 66 43 48 50 29 64 61 71 26 27 56 57 22 37 44 45 53 29
ù
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Multiple Linear Regression Example 10.2: Recovery in Pediatric Dehydration The REG Procedure 11
Correlation Variable DOSE AGE WT Y
DOSE
AGE
WT
Y
1.0000 0.1625 0.1640 0.3595
0.1625 1.0000 0.9367 -0.4652
0.1640 0.9367 1.0000 -0.5068
0.3595 -0.4652 -0.5068 1.0000
The REG Procedure Model: MODEL1 Dependent Variable: Y Analysis of Variance
Source
DF
Sum of Squares
Mean Square
Model Error Corrected Total
3 32 35
2667.66870 3151.22019 5818.88889
889.22290 98.47563
Root MSE Dependent Mean Coeff Var
9.92349 71.55556 13.86823
R-Square Adj R-Sq
F Value
Pr > F
9.03 12
0.4584 0.4077
0.0002
15
Parameter Estimates
Variable
DF
Parameter Estimate 13
Standard Error
Intercept DOSE AGE WT
1 1 1 1
85.47636 6.16969 0.27695 -0.54278
5.96528 1.79081 2.28474 0.32362
t Value 14 14.33 3.45 0.12 -1.68
Pr>|t| Type I SS <.0001 0.0016 0.9043 0.1032
Parameter Estimates
Variable
DF
Type II SS
Variance Inflation
Intercept DOSE AGE WT
1 1 1 1
20219 1168.84352 1.44701 277.00687
0 1.02833 8.16330 8.16745
184327 752.15170 638.51013 277.00687
&KDSWHU/LQHDU5HJUHVVLRQ
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Multiple Linear Regression Example 10.2: Recovery in Pediatric Dehydration The REG Procedure Model: MODEL1 21
Collinearity Diagnostics(intercept adjusted)
Number
Eigenvalue
1 2 3
1.99054 0.94617 0.06329
Condition Index 1.00000 1.45044 5.60804
-----Proportion of Variation--DOSE AGE WT 0.02518 0.97480 0.00002134
0.02918 0.00337 0.96745
0.02918 0.00330 0.96752
Model: MODEL2 Dependent Variable: Y Analysis of Variance
Source
DF
Sum of Squares
Mean Square
Model Error Corrected Total
1 34 35
752.15170 5066.73719 5818.88889
752.15170 149.02168
Root MSE Dependent Mean Coeff Var
12.20744 71.55556 17.06009
R-Square Adj R-Sq
F Value
Pr > F
5.05
0.0313
0.1293 0.1037
16
Parameter Estimates
Variable
DF
Parameter Estimate
Standard Error
t Value
Pr > |t|
Intercept DOSE
1 1
63.89576 4.88058
3.97040 2.17242
16.09 2.25
<.0001 0.0313
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Multiple Linear Regression Example 10.2: Recovery in Pediatric Dehydration The REG Procedure Model: MODEL3 Dependent Variable: Y 17
Analysis of Variance
Source
DF
Sum of Squares
Model Error Corrected Total
2 33 35
2390.66183 3428.22706 5818.88889
1195.33092 103.88567
10.19243 71.55556 14.24408
R-Square Adj R-Sq
Root MSE Dependent Mean Coeff Var
Mean Square
F Value
Pr > F
11.51
0.0002
19
0.4108 0.3751
Parameter Estimates
Variable
DF
Parameter Estimate
Standard Error
t Value
Pr > |t|
Intercept DOSE AGE
1 1 1
84.07228 6.06709 -3.30580
6.06631 1.83827 0.83240
13.86 3.30 -3.97
<.0001 0.0023 0.0004
Model: MODEL4 Dependent Variable: Y 18
Analysis of Variance
Source
DF
Sum of Squares
Mean Square
Model Error Corrected Total
2 33 35
2666.22169 3152.66720 5818.88889
1333.11084 95.53537
Root MSE Dependent Mean Coeff Var
9.77422 71.55556 13.65962
R-Square Adj R-Sq
F Value
Pr > F
13.95
<.0001
0.4582 0.4254
20
Parameter Estimates
Variable
DF
Parameter Estimate
Standard Error
t Value
Pr > |t|
Intercept DOSE WT
1 1 1
85.59416 6.17526 -0.50610
5.79705 1.76329 0.11307
14.77 3.50 -4.48
<.0001 0.0013 <.0001
&KDSWHU/LQHDU5HJUHVVLRQ
'HWDLOV 1RWHV
,Q VLPSOH OLQHDU UHJUHVVLRQ WKH K\SRWKHVLV RI µ]HUR UHJUHVVLRQ VORSH¶ LV HTXLYDOHQW WR WKH K\SRWKHVLV WKDW WKH FRYDULDWH LV QRW DQ LPSRUWDQW SUHGLFWRURIUHVSRQVH7KHFRYDULDWH[FDQEHYLHZHGDVDQ$129$PRGHO IDFWRUZLWKGHJUHHRIIUHHGRPZLWKWKHIROORZLQJ$129$VXPPDU\ $129$ 6285&( GI
66
06
)
;
66;
06;
)[ 06;06(
(UURU
Q±
66(
06(
7RWDO
Q±
72766
7KH66;DQG06;FDQEHFRPSXWHGDV
6 [\ 66; 06;
6 [[
7KH )YDOXH IRU WKH QXOO K\SRWKHVLV WKDW WKH FRYDULDWH LV XQLPSRUWDQW LV ) 06; 06( ZLWK XSSHU DQG Q± ORZHU GHJUHHV RI IUHHGRP $OJHEUDLFDOO\WKLVLVHTXLYDOHQWWRWKHVTXDUHRIWKHWWHVWIRUWKHµ]HURVORSH¶ K\SRWKHVLVEHFDXVH
06; ) 06(
æ 6 [\ ö ç ÷ è 6 [[ ø
V
æ 6 [\ ö ç ÷ è 6 [[ ø
V 6[[
æ ç è V
E
ö ÷ 6[[ ø
W
7KH)YDOXHIRUWKHFRYDULDWHHIIHFWLQ([DPSOHLVVHHQLQ2XWSXW WREHòZKLFKLVWKHVTXDUHRIWKHWYDOXH±
$Q$129$WDEOHFDQEHSUHSDUHGLQDVLPLODUPDQQHUIRUHDFKPRGHOHIIHFW [ LQDPXOWLSOHOLQHDUUHJUHVVLRQDQDO\VLV7KHUHLVGHJUHHRIIUHHGRPIRU HDFKQHZSDUDPHWHU VRWKDWWKHHUURUGHJUHHVRIIUHHGRPGHFUHDVHE\ IRUHDFKH[SODQDWRU\YDULDEOH L
L
$QRWKHU HTXLYDOHQW PHWKRG RI PDNLQJ LQIHUHQFHV DERXW WKH FRYDULDWH HIIHFW LQ VLPSOH OLQHDU UHJUHVVLRQ LV ZLWK WKH FRUUHODWLRQ FRHIILFLHQW$SRVLWLYHUHJUHVVLRQVORSHLQGLFDWHVDSRVLWLYHFRUUHODWLRQLH \LQFUHDVHVDV[LQFUHDVHV$QHJDWLYHVORSHLQGLFDWHVDQHJDWLYHFRUUHODWLRQ LH \ GHFUHDVHV DV [ LQFUHDVHV 7KH 3HDUVRQSURGXFWPRPHQW FRUUHODWLRQ FRHIILFLHQW LV RIWHQ XVHG DV D PHDVXUH RI WKH GHJUHH RI OLQHDU FRUUHODWLRQ EHWZHHQWZRYDULDEOHV[DQG\7KHFRUUHODWLRQFRHIILFLHQW LVDXQLWOHVV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
PHDVXUHEHWZHHQ±DQG$FRUUHODWLRQRILQGLFDWHVDSHUIHFWSRVLWLYH FRUUHODWLRQDFRUUHODWLRQRI±LQGLFDWHVDSHUIHFWQHJDWLYHFRUUHODWLRQDQGD LQGLFDWHV QR OLQHDU FRUUHODWLRQ 7KH SRSXODWLRQ SDUDPHWHU LV HVWLPDWHG E\WKHVDPSOHFRUUHODWLRQFRHIILFLHQWU
U
6 [\
6 [[ × 6 \\
$WWHVWIRUVLJQLILFDQWOLQHDUFRUUHODWLRQLV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
WHVWVWDWLVWLF
GHFLVLRQUXOH
+ +
$
W
\\
7KHFRUUHODWLRQFRHIILFLHQWIRU([DPSOHLV
U ±
7KHWVWDWLVWLFIRUVLJQLILFDQWFRUUHODWLRQLV
±
× W - - -
ZKLFKLVWKHVDPHWYDOXHFRPSXWHGIURPWKHVORSHHVWLPDWHE7KH6$6RXWSXW IRU ([DPSOH SULQWV U ± ô 56TXDUH ZKLFK LV D PHDVXUHRIJRRGQHVVRIILW
Q±
[\
UHMHFW+ LI_ W _!W
- U
([SUHVVLQJWKHVORSHEWKHVWDQGDUGGHYLDWLRQVDQGWKHFRUUHODWLRQFRHIILFLHQW ULQWHUPVRI6 6 DQG6 LWLVHDV\WRVKRZWKDWWKHWWHVWEDVHGRQWKH FRUUHODWLRQLVLGHQWLFDOWRWKHWWHVWEDVHGRQWKHVORSH1RWLFHWKHHTXLYDOHQF\ RIWKHK\SRWKHVHV+ DQG+ [[
U × Q -
7KH FRHIILFLHQW RI GHWHUPLQDWLRQ IRU DPXOWLSOH OLQHDU UHJUHVVLRQ PRGHO5 UHSUHVHQWVWKHUHGXFWLRQLQHUURUYDULDELOLW\E\XVLQJWKHH[SODQDWRU\ YDULDEOHVDVSUHGLFWRUVRI\RYHUMXVWXVLQJWKHPHDQ \ 7KXV\RXFDQZULWH
5 766±66( 766
&KDSWHU/LQHDU5HJUHVVLRQ
ZKHUH 766 LV WKH WRWDO VXP RI VTXDUHV ,Q WKLV FDVH WKH 66( LV D PHDVXUH RI HUURU YDULDWLRQ DVVRFLDWHG ZLWK WKH µIXOO¶ PRGHO ZKLFK LQFOXGHV DOO H[SODQDWRU\ YDULDEOHV DQG 766 LV D PHDVXUH RI HUURU YDULDWLRQ DVVRFLDWHG ZLWK WKH µUHGXFHG¶ PRGHO ZKLFK LQFOXGHV QRQH RI WKH H[SODQDWRU\YDULDEOHV7KHGLIIHUHQFHLVWKHµPRGHO¶VXPRIVTXDUHV,Q2XWSXW IRU 02'(/ WKH 56TXDUH YDOXH 15 FDQ EH IRXQG DV 0RGHO66 7RWDO66 6LPLODUO\ \RX FDQ FRPSXWH µSDUWLDO¶ FRUUHODWLRQ FRHIILFLHQWV U EHWZHHQ WKH UHVSRQVH\ DQGDQ\LQGHSHQGHQWYDULDEOH[ DGMXVWHGIRUDOORWKHUYDULDEOHV LQ WKH PRGHO 7KH FRHIILFLHQW RI GHWHUPLQDWLRQ DVVRFLDWHG ZLWK [ LV U D PHDVXUH RI WKH UHGXFWLRQ LQ YDULDELOLW\ DFFRXQWHG IRU E\ LQFOXGLQJ [ LQ WKH PRGHO7KLVFDQEHIRXQGE\XVLQJWKHSDUWLDOVXPRIVTXDUHVODEHOHG7\SH,, 66 LQ 2XWSXW ZKLFK UHSUHVHQW WKH VXPV RI VTXDUHV DVVRFLDWHG ZLWK [ DGMXVWHGIRUDOORWKHULQGHSHQGHQWYDULDEOHVLQWKHPRGHO M
M
M
M
M
M
,I 66( UHSUHVHQWV WKH HUURU VXP RI VTXDUHV EDVHG RQ WKH µIXOO¶ PRGHO ZKLFK LQFOXGHV N H[SODQDWRU\ YDULDEOHV DQG 66(M LV WKH HUURU VXP RI VTXDUHV EDVHG RQ WKH VHW RI N± YDULDEOHV ZKLFK H[FOXGHV [ WKHQ WKH UHGXFWLRQ LQ YDULDELOLW\ GXH WR [ LV WKH 7\SH ,, VXP RI VTXDUHV IRU [ 7KXV 66(M 7\SH,,66; 66(DQGWKHSDUWLDOFRHIILFLHQWRIGHWHUPLQDWLRQFDQ EHFRPSXWHGDV M
M
M
M
U 66(M ±66( 66(M
M
7KH 66 DQG 66 RSWLRQV LQ WKH LQLWLDO 02'(/ VWDWHPHQW 02'(/ LQ ([DPSOHUHTXHVWWKH6$67\SH,DQG,,VXPVRIVTXDUHV)RU'RVH/HYHO '26( WKH7\SH,,66LVVR66( DQG U
7KH6$67\SH,66VRPHWLPHVFDOOHGWKHVHTXHQWLDOVXPRIVTXDUHVUHSUHVHQW WKH VXP RI VTXDUHV DGMXVWHG RQO\ IRU WKH IDFWRUV SUHFHGLQJ LW LQ WKH PRGHO VSHFLILFDWLRQ &RQVHTXHQWO\ WKH VHTXHQFH LQ ZKLFK WKH\ DUH VSHFLILHG LQ WKH 02'(/VWDWHPHQWLVLPSRUWDQW :KHQXVLQJ352&*/0WKH7\SHV,DQG,,,VXPVRIVTXDUHVDUH SULQWHGE\GHIDXOW,QWKH6$6&RGHIRU([DPSOHRQO\7\SH,LVUHTXHVWHG E\ XVLQJ 66 LQ WKH 02'(/ VWDWHPHQW EHFDXVH 7\SH ,,, XVXDOO\ SHUWDLQV WR $129$ UDWKHU WKDQ UHJUHVVLRQ DQDO\VLV ,QWKHVLPSOHOLQHDUUHJUHVVLRQFDVH 667\SHV,,,DQG,,,DUHLGHQWLFDO $ VLJQLILFDQW FRUUHODWLRQ LPSOLHV D VLJQLILFDQW OLQHDU UHODWLRQVKLS EHWZHHQ [ DQG \ EXW GRHV QRW LPSO\ FDXVDOLW\ &RQYHUVHO\ D QRQVLJQLILFDQW FRUUHODWLRQ GRHV QRW LPSO\ WKDW WKHUH LV QR FDXVDO RU RWKHU UHODWLRQVKLS EHWZHHQ[DQG\7KHUHPLJKWLQIDFWEHDTXDGUDWLFRUSDUDEROLFUHODWLRQVKLS RUVRPHRWKHUUHODWLRQVKLSEHWZHHQ[DQG\UHVXOWLQJLQQROLQHDUFRUUHODWLRQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,WLVDOZD\VDJRRGLGHDWRSORWWKHGDWDWRREWDLQDYLVXDOLPSUHVVLRQRIDQ\ UHODWLRQVKLS 4XDGUDWLF FXELF DQG KLJKHURUGHU UHJUHVVLRQ DQDO\VHV FDQ EH SHUIRUPHGTXLWHHDVLO\E\LQFOXGLQJWKHDSSURSULDWHYDULDEOHVLQWKH02'(/ VWDWHPHQW ZKHQ XVLQJ 352& */0 RU 352& 5(* LQ 6$6 )RU H[DPSOH D TXDGUDWLFPRGHOIRU([DPSOHFDQEHVSHFLILHGE\XVLQJWKHIROORZLQJ6$6 VWDWHPHQWV DIWHU GHILQLQJ WKH QHZ YDULDEOH ;B'85 ;B'85
LQ WKH '$7$VWHS PROC GLM DATA = ANGINA; MODEL YBIMPR = XBDUR XBDUR2;
:KHQHVWLPDWLQJDPHDQUHVSRQVHIRUDVSHFLILHGYDOXHRI[VD\ [ WKHFRQILGHQFHLQWHUYDOZLOOEHVPDOOHVWZKHQ[ LVFORVHVWWRWKHPHDQ[ EHFDXVH WKH FRQILGHQFH LQWHUYDO ZLGWK LV DPRQRWRQLFDOO\LQFUHDVLQJIXQFWLRQ RI[ ±[ 6XFKLQWHUYDOVFDQEHFRPSXWHGIRUPXOWLSOH[ YDOXHVDQGSORWWHG DORQJ ZLWK WKH UHJUHVVLRQ HTXDWLRQ WR IRUP FRQILGHQFH EDQGV )LJXUH VKRZVWKHFRQILGHQFHEDQGVIRU([DPSOH
)LJXUH&RQILGHQFH%DQGVIRU([DPSOH
8SSHU&RQILGHQFH%DQG (VW5HJUHVVLRQ\ [
,PSURYHPHQW\
/RZ HU&RQILGHQFH%DQG
7KH3DQG&/0RSWLRQVFDQEHXVHGLQ6$6ZLWKPXOWLSOHOLQHDU UHJUHVVLRQPRGHOVWRREWDLQSUHGLFWHGYDOXHVRIWKHUHVSRQVHYDULDEOHDQG FRQILGHQFH LQWHUYDOV LQ WKH VDPH ZD\ DV LOOXVWUDWHG LQ WKH VLPSOH OLQHDU UHJUHVVLRQFDVH([DPSOH 7KH SUHGLFWLRQ HTXDWLRQ FDQ EH XVHG WR HVWLPDWHWKHUHVSRQVH\ IRU DQ\ YDOXH RI WKH SUHGLFWRU [ ZLWKLQ WKH H[SHULPHQWDO UHJLRQ 7KH
&KDSWHU/LQHDU5HJUHVVLRQ
H[SHULPHQWDO UHJLRQ FRQVLVWV RI WKH UDQJH RI [YDOXHV IURP WKH GDWD XVHG WR FRPSXWH WKH UHJUHVVLRQ HVWLPDWHV 3UHGLFWLQJ D UHVSRQVH \ IRU D YDOXH RI [ RXWVLGHWKHH[SHULPHQWDOUHJLRQLVFDOOHGµH[WUDSRODWLRQ¶([WUDSRODWLRQWRWKH SRLQW[ VKRXOGQRWEHXVHGXQOHVVLWFDQEHVDIHO\DVVXPHGWKDW[DQG\KDYH WKH VDPH OLQHDU UHODWLRQVKLS RXWVLGH WKH H[SHULPHQWDO UHJLRQ DW SRLQW [ %HFDXVH RIWHQ WKLV DVVXPSWLRQ PLJKW QRW EH PHW H[WUDSRODWLRQ VKRXOG EH DYRLGHGLIWKHUHLVDQ\XQFHUWDLQW\DERXWWKLVDVVXPSWLRQ H
H
2QHRIWKHDVVXPSWLRQVQHHGHGWRSHUIRUPUHJUHVVLRQDQDO\VLVLVD FRQVWDQW HUURU YDULDQFH RYHU WKH H[SHULPHQWDO UHJLRQ 7R LQYHVWLJDWH WKH YDOLGLW\RIWKLVDVVXPSWLRQ\RXFDQORRNIRUDFRQVWDQWKRUL]RQWDOSDWWHUQLQ WKH SORW RI WKH UHVLGXDOV LH WKH GLIIHUHQFH EHWZHHQ WKH REVHUYHG DQG SUHGLFWHG YDOXHV DJDLQVW WKH [YDOXHV ,I WKH UHVLGXDOV DSSHDU WR LQFUHDVH RU GHFUHDVHZLWK[WKHDVVXPSWLRQRIYDULDQFHKRPRJHQHLW\PLJKWEHYLRODWHG,I WKDW LV WKH FDVH \RX PLJKW XVH D WUDQVIRUPDWLRQ RI WKH GDWD WR DWWHPSW WR VWDELOL]HWKHYDULDQFH3RSXODUYDULDQFHVWDELOL]LQJWUDQVIRUPDWLRQVLQFOXGHWKH QDWXUDOORJDULWKPVTXDUHURRWDQGDUFVLQHWUDQVIRUPDWLRQVHH$SSHQGL[) $QRWKHU DVVXPSWLRQ RI WKH UHJUHVVLRQ SURFHGXUHV LV WKDW RI LQGHSHQGHQWREVHUYDWLRQV,I[UHSUHVHQWVWLPHDQGWKHUHVSRQVHV\UHSUHVHQW PHDVXUHPHQWV RQ WKH VDPH H[SHULPHQWDO XQLW HJ SDWLHQW DW YDULRXV WLPH SRLQWV WKH DVVXPSWLRQ RI LQGHSHQGHQFH LV YLRODWHG ,Q VXFK FDVHV XVH RI D UHSHDWHGPHDVXUHV$129$PLJKWEHPRUHDSSURSULDWHVHH&KDSWHU $ FRUUHODWLRQ RYHU WLPH LV UHIHUUHG WR DV µDXWRFRUUHODWLRQ¶ LQ WLPHVHULHV DQDO\VLV 6$6 XVHV WKH 'XUELQ:DWVRQ WHVW WR FKHFN IRU DXWRFRUUHODWLRQ $ YDOXH FORVH WR VXFK DV õ LQ 2XWSXW LQGLFDWHV QR VLJQLILFDQW DXWRFRUUHODWLRQ :KHQ XVLQJ 352& 5(* WKH 'XUELQ:DWVRQ WHVW FDQ EH UHTXHVWHG E\ XVLQJ WKH ': RSWLRQ LQ WKH 02'(/ VWDWHPHQW WR FKHFN IRU DXWRFRUUHODWLRQ )RUDSDUWLFXODUYDOXHRI[VD\[ DµSUHGLFWLRQLQWHUYDO¶IRU WKHSUHGLFWHGUHVSRQVHRIDIXWXUHREVHUYDWLRQLVJLYHQE\ S
[ S - x \Ö [ S W Q - × V Q 6 [[ ZKHUHA \[S DE[ S
3UHGLFWLRQ EDQGV FDQ EH HVWDEOLVKHG DURXQG WKH SUHGLFWLRQ HTXDWLRQ LQ D ZD\ WKDWLVVLPLODUWRWKDWGHVFULEHGIRUFRQILGHQFHEDQGV 'LDJQRVWLFV DYDLODEOH IRU GHWHFWLQJ FROOLQHDULW\ LQPXOWLSOHOLQHDU UHJUHVVLRQ PRGHOV WKDW FRQWDLQ D ODUJH QXPEHU RI LQGHSHQGHQW YDULDEOHV LQFOXGH WKH YDULDQFH LQIODWLRQ IDFWRU 9,) DQG HLJHQYDOXH DQDO\VLV RI WKH PRGHO IDFWRUV 7KHVH DQDO\VHV DUH SHUIRUPHG LQ 6$6 E\ XVLQJ WKH 9,) DQG
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
&2//,12,17 RSWLRQV LQ WKH 02'(/ VWDWHPHQW LQ 352& 5(* 9HU\ ODUJH UHODWLYHYDOXHVRI9,)PLJKWVXJJHVWFROOLQHDULW\7KHHLJHQYDOXHDQDO\VLVLVD PHWKRG XVHG LQ D VWDWLVWLFDO WHFKQLTXH FDOOHG SULQFLSDO FRPSRQHQW DQDO\VLV RIWHQ HQFRXQWHUHG LQ WKH VRFLDO VFLHQFHV 7KH EDVLF LGHD LV WKDW YHU\ VPDOO YDOXHV RI WKH HLJHQYDOXHV PLJKW VXJJHVW FROOLQHDULW\ HVSHFLDOO\ LI WKH µFRQGLWLRQLQGH[¶DOVRSULQWHGLQWKH6$6RXWSXW LVODUJHHJ! 1RWH)RUIXUWKHUUHDGLQJDQHQWLUHFKDSWHULVGHYRWHGWRDGLVFXVVLRQRI SULQFLSDOFRPSRQHQWDQDO\VLVLQ³$6WHSE\6WHS$SSURDFKWR8VLQJWKH6$6 6\VWHPIRU8QLYDULDWHDQG0XOWLYDULDWH6WDWLVWLFV´E\+DWFKHUDQG6WHSDQVNL ZKLFKZDVZULWWHQXQGHUWKH6$6%RRNVE\8VHUVSURJUDP
,Q ([DPSOH WKH 9,) DV VKRZQ LQ WKH 3DUDPHWHU (VWLPDWHV VHFWLRQ LQ 0RGHORI2XWSXW LVIRUERWK$JHDQG:HLJKW7KHVHDUHUHODWLYHO\ ODUJH YDOXHV FRPSDUHG ZLWK WKH 9,) IRU 'RVH /HYHO FRQVLVWHQW ZLWK FROOLQHDULW\ 7KH WKLUG HLJHQYDOXH LQ WKH &ROOLQHDULW\ 'LDJQRVWLFV RXWSXW 21 LV YHU\VPDOODQGDOWKRXJKWKHFRQGLWLRQLQGH[LVQRWFORVHWRH[FHHGLQJ\RX VHHWKDWZHLJKWLQJVIRU$JHDQG:HLJKWDUHODUJH FRPSDUHGZLWK'RVH 7KLVSRLQWVWRHYLGHQFHRIDOLQHDUGHSHQGHQFHEHWZHHQWKHVHWZRLQGHSHQGHQW YDULDEOHV 7KHVH REVHUYDWLRQV VXSSRUW WKH VXVSLFLRQ RI FROOLQHDULW\ EHWZHHQ $JH DQG :HLJKW DQG VXJJHVW WKDW WKH PRGHO VKRXOG DYRLG LQFOXGLQJ ERWK RI WKHVHIDFWRUVVLPXOWDQHRXVO\ 352& 5(* LV YHU\ XVHIXO IRU µPRGHOEXLOGLQJ¶ 7KH SURFHVV RI ILWWLQJ YDULRXV PRGHOV DQG FKHFNLQJ WKH 5 DV VKRZQ LQ ([DPSOH LV VLPLODUWRRQHRIWKHPRGHOEXLOGLQJSURFHGXUHVWKDWXVHVWKHVWHSZLVHRSWLRQLQ 352& 5(* VHH 6$6 +HOS DQG 'RFXPHQWDWLRQ WR KHOS GHWHUPLQH WKH EHVW ILWWLQJPRGHOWRGHVFULEHWKHUHVSRQVH6WHSZLVHUHJUHVVLRQWHFKQLTXHVFDQEH YHU\XVHIXOLQILWWLQJDPRGHOWRH[LVWLQJGDWD*UHDWFDXWLRQPXVWEHXVHGLQ PDNLQJLQIHUHQFHVKRZHYHUZKHQHYHUWKHPRGHOLVGDWDJHQHUDWHG
5HJUHVVLRQ DQDO\VLV KDV OLPLWHG XVHLQFRPSDUDWLYHFOLQLFDOWULDOV EHFDXVHWKHµ7UHDWPHQW*URXS¶HIIHFWLVDOPRVWDOZD\VDFODVVYDULDEOH:KHQ FODVVLILFDWLRQ YDULDEOHV VXFK DV 7UHDWPHQW HIIHFW DUH LQFOXGHG LQ WKH VDPH PRGHODVQXPHULFH[SODQDWRU\YDULDEOHV\RXXVHDQDQDO\WLFSURFHGXUHNQRZQ DVDQDO\VLVRIFRYDULDQFH$1&29$ ZKLFKLVWKHVXEMHFWRIWKHQH[WFKDSWHU 7KH FRQFHSWV RI UHJUHVVLRQ DQDO\VLV DUHLQWURGXFHGLQWKLVFKDSWHUWRSURYLGH WKHQHFHVVDU\EDFNJURXQGIRUDSSO\LQJ$1&29$PHWKRGVZKLFKKDYHJUHDWHU XVHLQWKHFOLQLFDOVHWWLQJ 352& 5(* LQ 6$6 LV YHU\ HIILFLHQW IRU SHUIRUPLQJ UHJUHVVLRQ DQDO\VLV DQG VHOHFWLQJDSSURSULDWHPRGHOVEXWLWFDQQRWEHXVHGIRU$1&29$352&*/0 DOWKRXJKODFNLQJVRPHRIWKHUHJUHVVLRQGLDJQRVWLFVDYDLODEOHLQ352&5(* FDQEHXVHGIRUUHJUHVVLRQLQDGGLWLRQWR$129$DQG$1&29$ 1RWH)RUPRUHGHWDLOVDERXWUHJUHVVLRQDQDO\VLVXVLQJ352&5(*LQ 6$6VHH³6$66\VWHPIRU5HJUHVVLRQ6HFRQG(GLWLRQ´E\)UHXQGDQG /LWWHOOZKLFKZDVZULWWHQXQGHUWKH6$6%RRNVE\8VHUVSURJUDP
&++$$3377((55 $QDO\VLVRI&RYDULDQFH ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
$QDO\VLV RI FRYDULDQFH $1&29$ SURYLGHV D PHWKRG IRU FRPSDULQJ UHVSRQVH PHDQVDPRQJWZRRUPRUHJURXSVDGMXVWHGIRUDTXDQWLWDWLYHFRQFRPLWDQWYDULDEOH RUµFRYDULDWH¶WKRXJKWWRLQIOXHQFHWKHUHVSRQVH7KHDWWHQWLRQKHUHLVFRQILQHGWR FDVHV LQ ZKLFK WKH UHVSRQVH \ PLJKW EH OLQHDUO\ UHODWHG WR WKH FRYDULDWH [ $1&29$ FRPELQHV UHJUHVVLRQ DQG $129$ PHWKRGV E\ ILWWLQJ VLPSOH OLQHDU UHJUHVVLRQPRGHOVZLWKLQHDFKJURXSDQGFRPSDULQJUHJUHVVLRQVDPRQJJURXSV $1&29$ PHWKRGV UHSUHVHQW RQH RI WKH PRVW ZLGHO\ XVHG VWDWLVWLFDO PHWKRGV LQ FOLQLFDOWULDOV([DPSOHVZKHUH$1&29$PLJKWEHDSSOLHGLQFOXGH
FRPSDULQJFKROHVWHUROOHYHOV\ EHWZHHQDWUHDWHGJURXSDQGDUHIHUHQFH JURXSDGMXVWHGIRUDJH[LQ\HDUV
FRPSDULQJ VFDU KHDOLQJ \ EHWZHHQ FRQYHQWLRQDO DQG ODVHU VXUJHU\ DGMXVWHGIRUH[FLVLRQVL]H[LQPP
FRPSDULQJH[HUFLVHWROHUDQFH\ LQGRVHOHYHOVRIDWUHDWPHQWXVHGIRU DQJLQDSDWLHQWVDGMXVWHGIRUVPRNLQJKDELWV[LQFLJDUHWWHVGD\
$1&29$FDQRIWHQLQFUHDVHWKHSUHFLVLRQRIFRPSDULVRQVRIWKHJURXSPHDQVE\ LQFOXGLQJ WKH FRYDULDWH [ DV D VRXUFH RI YDULDWLRQ WKHUHE\ GHFUHDVLQJ WKH HVWLPDWHG HUURU YDULDQFH $1&29$ LV HVSHFLDOO\ XVHIXO ZKHQ WKH YDOXHV RI WKH FRYDULDWHGLIIHUDPRQJWKHJURXSV:KHQWKLVRFFXUVDQGWKHUHVSRQVHLVOLQHDUO\ UHODWHG WR WKH FRYDULDWH FRYDULDQFH DGMXVWPHQWV FDQ OHDG WR PDUNHGO\ GLIIHUHQW FRQFOXVLRQVWKDQWKRVHREWDLQHGXVLQJWKHXQDGMXVWHG$129$PHWKRGV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6\QRSVLV
6XSSRVH \RX KDYH N JURXSV ZLWK Q LQGHSHQGHQW [\ SRLQWV LQ *URXS L L N DVVKRZQLQ7DEOH L
7$%/($1&29$/D\RXW
*5283
*5283
*5283N
[
\
[
\
[
\
[
\
[
\
[N
\N
[
\
[
\
[N
\N
[
\
[
\
[N
\N
[Q
\Q
[Q
\Q
[NQN
\NQN
$V ZLWK $129$ $1&29$ DVVXPHV LQGHSHQGHQW JURXSV ZLWK D QRUPDOO\ GLVWULEXWHG UHVSRQVH PHDVXUH \ DQG YDULDQFH KRPRJHQHLW\ ,Q DGGLWLRQ \RX DVVXPHWKDWWKHUHJUHVVLRQVORSHVDUHWKHVDPHIRUHDFKJURXSLH 7KHPHDQUHVSRQVHZLWKLQHDFKJURXSGHSHQGVRQWKHFRYDULDWHZKLFKUHVXOWV LQDPRGHOIRUWKHL JURXSPHDQDV
N
WK
L
[ L
)RUHDFKJURXS\RXFDQFRPSXWH [ \ 6 6 DQG6 XVLQJWKHIRUPXODVJLYHQ [[
LQ &KDSWHU /HW [ \ 6 L
L
6
[[L
\\
[\
DQG 6
\\L
UHSUHVHQW WKHVH TXDQWLWLHV
[\L
UHVSHFWLYHO\ IRU *URXS L L N $OVRFRPSXWH6 6 DQG6 IRUDOO JURXSVFRPELQHGLJQRULQJJURXS 7KHHVWLPDWHGUHJUHVVLRQOLQHIRU*URXSLLV [[
A\ D E[ L
ZKHUHD DQGEDUHWKH/HDVW6TXDUHVHVWLPDWHVFRPSXWHGDV L
E
DQG
Ê6[\ L L Ê6[[ L L
D \ ±E [ L
L
\\
[\
&KDSWHU$QDO\VLVRI&RYDULDQFH
:LWK1 Q Q Q WKH$129$WDEOHVXPPDUL]LQJWKHVLJQLILFDQFHRIWKH VRXUFHVRIYDULDWLRQLVVKRZQLQ7DEOH
N
7$%/($129$6XPPDU\7DEOHIRUD6LPSOH$1&29$0RGHO
6285&(
GI
66
06
)
*5283
N±
66*
06*
)* 06*06(
;FRYDULDWH
66;
06;
); 06;06(
(UURU
1±N± 66(
06(
7RWDO
1±
7KHVXPVRIVTXDUHV66 FDQEHIRXQGIURPWKHIROORZLQJFRPSXWLQJIRUPXODV
72766 6 \\
72766
æ ç è
66(
66*
=
66; =
å L
ö
æ 6 [[ L ÷ × ç ø è
å å6 L
æ ç è
6[[ × 6\\
-
6[\
6[[
å6
\\L
ö æ 6 \\ L ÷ - ç ø è
L
-
å L
ö 6 [\ L ÷ ø
ö [[ L ÷
ø
66(
- 66(
L
1RWLFH WKDW ZKHQ N RQH JURXS 66* DQG WKH FRPSXWLQJ IRUPXODV DUH LGHQWLFDOWRWKRVHXVHGIRUOLQHDUUHJUHVVLRQLQ&KDSWHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH PHDQ VTXDUHV 06 DUH WKH VXP RI VTXDUHV 66 GLYLGHG E\ WKH GHJUHHV RI IUHHGRPGI 2ISULPDU\LQWHUHVWLVWKHFRPSDULVRQRIWKHJURXSPHDQVDGMXVWHGWR DFRPPRQYDOXHRIWKHFRYDULDWHHJ[ /HWWLQJ UHSUHVHQWWKHPHDQRI*URXSL IRU[ [ WKHWHVWFDQEHVXPPDUL]HGDVIROORZV
L
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ + 127+
WHVWVWDWLVWLF
) 06*06(
UHMHFWLRQUHJLRQ
$
N
*
N -
UHMHFW+ LI) ! )1 - N -
*
1RWLFHWKDWWKHK\SRWKHVLVLVHTXLYDOHQWWRWKHHTXDOLW\RILQWHUFHSWVDPRQJJURXSV + N DQGWKHWHVWGRHVQRWGHSHQGRQWKHYDOXHRI[
7KLVWHVWPLJKWUHVXOWLQLPSURYHGSUHFLVLRQIRUWKHJURXSPHDQVFRPSDULVRQVRYHU RQHZD\$129$PHWKRGVLIWKHVORSH GLIIHUVIURP7RGHWHUPLQHZKHWKHUWKH FRYDULDWH KDV D VLJQLILFDQW HIIHFW RQ WKH UHVSRQVH XVH WKH ) UDWLR WR WHVW IRU UHJUHVVLRQVORSHXVLQJWKHIROORZLQJWHVWVXPPDU\ ;
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ +
WHVWVWDWLVWLF
) 06; 06(
UHMHFWLRQUHJLRQ
$
;
UHMHFW+ LI) ! )1 - N -
[
&KDSWHU$QDO\VLVRI&RYDULDQFH
([DPSOHV
p ([DPSOH7ULJO\FHULGH&KDQJHV$GMXVWHGIRU*O\FHPLF&RQWURO 7KHQHZFKROHVWHUROORZHULQJVXSSOHPHQW)LEUDORZDVVWXGLHGLQDGRXEOH EOLQGVWXG\DJDLQVWWKHPDUNHWHGUHIHUHQFHVXSSOHPHQW*HPILEUR]LOLQ QRQLQVXOLQGHSHQGHQWGLDEHWLF1,''0 SDWLHQWV2QHRIWKHVWXG\ V REMHFWLYHVZDVWRFRPSDUHWKHPHDQGHFUHDVHLQWULJO\FHULGHOHYHOVEHWZHHQ JURXSV7KHGHJUHHRIJO\FHPLFFRQWUROPHDVXUHGE\KHPRJORELQ$FOHYHOV +E$F ZDVWKRXJKWWREHDQLPSRUWDQWIDFWRULQUHVSRQVHWRWKHWUHDWPHQW 7KLVFRYDULDWHZDVPHDVXUHGDWWKHVWDUWRIWKHVWXG\DQGLVVKRZQLQ7DEOH ZLWKWKHSHUFHQWFKDQJHVLQWULJO\FHULGHVIURPSUHWUHDWPHQWWRWKHHQG RIWKHZHHNWULDO,VWKHUHDGLIIHUHQFHLQPHDQUHVSRQVHVEHWZHHQ VXSSOHPHQWV"
7$%/(5DZ'DWDIRU([DPSOH ),%5$/2*5283
*(0),%52=,/*5283
3DWLHQW 1XPEHU
+E$F QJPO ;
7ULJO\FHULGH &KDQJH <
3DWLHQW 1XPEHU
+E$F QJPO ;
7ULJO\FHULGH &KDQJH <
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6ROXWLRQ
7RDSSO\$1&29$XVLQJ+E$ DVDFRYDULDWHILUVWREWDLQVRPHVXPPDU\UHVXOWV IURPWKHGDWDDVVKRZQLQ7DEOH F
7$%/(6XPPDU\5HVXOWVIRU([DPSOH
)LEUDOR *URXS
*HPILEUR]LO *URXS
&RPELQHG
[
[
\
\
[\
[
\
1
8VLQJWKHIRUPXODVLQ&KDSWHU\RXFRPSXWHIRUWKH)LEUDORJURXSL
6
±
6
±
6
±± ±
[[
\\
[\
6LPLODUO\FRPSXWLQJIRUWKH*HPILEUR]LOJURXSL \RXREWDLQ
6
6
6
±
[[
\\
[\
&KDSWHU$QDO\VLVRI&RYDULDQFH
)LQDOO\IRUWKHFRPELQHGGDWDLJQRULQJJURXSV FRPSXWH
6
6
6 ±
[[
\\
[\
7KHVXPVRIVTXDUHVFDQQRZEHREWDLQHGDV
72766 - - -
66(
66*
DQGWKH$129$VXPPDU\WDEOHFDQEHFRPSOHWHGDVVKRZQLQ7DEOH
- -
-
66; ±
7$%/($1&29$6XPPDU\IRU([DPSOH
6285&(
GI
66
06
7UHDWPHQW
;+E$F
(UURU
7RWDO
6LJQLILFDQWS FULWLFDO)YDOXH
)
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH)VWDWLVWLFVDUHIRUPHGDVWKHUDWLRVRIHIIHFWPHDQVTXDUHV06 WRWKH06( (DFK)VWDWLVWLFLVFRPSDUHGZLWKWKHFULWLFDO)YDOXHZLWKXSSHUDQG ORZHUGHJUHHVRIIUHHGRP7KHFULWLFDO)YDOXHIRU LV 7KH VLJQLILFDQW FRYDULDWH HIIHFW ) LQGLFDWHV WKDW WKH WULJO\FHULGH UHVSRQVH KDV D VLJQLILFDQW OLQHDU UHODWLRQVKLS ZLWK +E$ 7KH VLJQLILFDQW )YDOXH IRU 7UHDWPHQW LQGLFDWHV WKDW WKH PHDQ WULJO\FHULGH UHVSRQVH DGMXVWHG IRU JO\FHPLF FRQWUROGLIIHUVEHWZHHQWUHDWPHQWJURXSV F
$SSOLFDWLRQRIWKHRQHZD\$129$&KDSWHU WRFRPSDUHWUHDWPHQWJURXSPHDQV LJQRULQJWKHFRYDULDWH UHVXOWVLQWKHIROORZLQJ 7$%/($129$IRU([DPSOH,JQRULQJ&RYDULDWH 6285&(
GI
66
06
7UHDWPHQW
(UURU
7RWDO
)
7KH UHMHFWLRQ UHJLRQ IRU WKH 7UHDWPHQW HIIHFW EDVHG RQ D VLJQLILFDQFH OHYHO RI LQFOXGHV)YDOXHVJUHDWHUWKDQWKHFULWLFDO)YDOXHZLWKXSSHUDQG ORZHU GHJUHHV RI IUHHGRP :LWK DQ )YDOXH RI RQO\ \RX FDQQRW UHMHFW WKH K\SRWKHVLV RI HTXDO PHDQV 7KHUHIRUH \RX PXVW FRQFOXGH WKDW QR GLIIHUHQFH LQ PHDQUHVSRQVHEHWZHHHQWUHDWPHQWJURXSVLVHYLGHQWEDVHGRQWKLVWHVW 2QHUHDVRQWKH$1&29$SURGXFHVDVLJQLILFDQWWUHDWPHQWJURXSGLIIHUHQFHEXWWKH $129$GRHVQRWLVDODUJHUHGXFWLRQLQ06(IURPWR 7KLVUHVXOWVLQ JUHDWHUSUHFLVLRQRIWKHWUHDWPHQWJURXSFRPSDULVRQ $QRWKHUUHDVRQIRUWKLVGLIIHUHQFHLVWKHLQFUHDVHLQWKHGLIIHUHQFHEHWZHHQPHDQV E\XVLQJWKHDGMXVWHGYDOXHVDVVKRZQEHORZ
E
±
±
DQGWKHLQWHUFHSWVDL V DV D ± ±±
D ± ±±
&KDSWHU$QDO\VLVRI&RYDULDQFH
ZKLFK\LHOGWKHHVWLPDWHGUHJUHVVLRQHTXDWLRQV
)LEUDOR*URXS
A\ ± [
*HPILEUR]LO
A\ ± [
7KH DGMXVWHG PHDQ UHVSRQVHV IRU HDFK JURXS DUH WKH YDOXHV RI WKH HVWLPDWHG UHJUHVVLRQHTXDWLRQVHYDOXDWHGDWWKHRYHUDOOPHDQRIWKHFRYDULDWH [ DV IROORZV
)LEUDOR
± ±
*HPILEUR]LO
± ±
7KHVH PHDQV GLIIHU FRQVLGHUDEO\ IURP WKH XQDGMXVWHG PHDQV DOUHDG\ FRPSXWHG 7DEOH 7DEOH VXPPDUL]HV WKH XQDGMXVWHG DQG DGMXVWHG PHDQV E\ WUHDWPHQWJURXSV
7$%/(0HDQVE\7UHDWPHQW*URXSIRU([DPSOH )LEUDOR *URXS
*HPILEUR]LO *URXS
8QDGMXVWHG
±
±
$GMXVWHG
±
±
0HDQ 5HVSRQVH
6LJQLILFDQWS
SYDOXH
SYDOXHVIURP6$6RXWSXW
7ULJO\FHULGHVGHFUHDVHE\DPHDQRIIRUHYHU\QJPOLQFUHDVHLQ+E$ DQGWKLVUDWHRIFKDQJHLVVHHQWREHVLJQLILFDQW1RWLFHWKDW7DEOH WKHPHDQ +E$ OHYHOLVVOLJKWO\KLJKHULQWKH)LEUDORJURXSFRPSDUHGZLWKWKH*HPILEUR]LO JURXS Y 7KH XQDGMXVWHG PHDQ UHVSRQVHV DUH WKRVH WKDW OLH RQ WKH UHJUHVVLRQ OLQHV WKDW FRUUHVSRQG WR WKH PHDQ RI WKH FRYDULDWH IRU HDFK JURXS DV VKRZQ LQ )LJXUH &RPSDULQJ WKH PHDQ UHVSRQVHV RI WKH WZR JURXSV DW WKH VDPH YDOXH RI WKH FRYDULDWH [ UHTXLUHV XVLQJ D VPDOOHU [ YDOXH IRU WKH )LEUDOR JURXS ZKLFK UHVXOWV LQ D VPDOOHU WULJO\FHULGH GHFUHDVH DQG D ODUJHU [ YDOXH IRU WKH *HPILEUR]LO JURXS ZKLFK UHVXOWV LQ D ODUJHU WULJO\FHULGH GHFUHDVH 7KH HQG UHVXOW LV D ODUJHU GLIIHUHQFH LQ UHVSRQVH PHDQV EDVHG RQ WKH DGMXVWHG YDOXHVFRPSDUHGZLWKWKHXQDGMXVWHGYDOXHV F
F
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
),*85($1&29$$GMXVWHG0HDQVIRU([DPSOH
\ 7ULJO\FHULGHV&KDQJH
)LEUDOR*URXS\ [ *HPILEUR]LO*URXS\ [
[ +E$FQJPO
(QODUJHPHQWRI$UHD$ERYH
Fibr alo Gr oup
\ DGM -1.9 \ \
-4.1
Gemfibr ozil Gr oup -10.3
\ DGM -12.2
6.64
6.81
7.00
[ [ [
&KDSWHU$QDO\VLVRI&RYDULDQFH
6$6$QDO\VLVRI([DPSOH 7KH6$6FRGHIRUSHUIRUPLQJWKLVDQDO\VLVLVVKRZQEHORZIROORZHGE\WKH6$6 RXWSXW 2XWSXW 7KH 35,17 3/27 DQG 0($16 SURFHGXUHV DUH XVHG WR SURYLGH D GDWD OLVWLQJ å D VFDWWHUSORW RI WKH GDWD DQG EDVLF VXPPDU\ VWDWLVWLFVê 352&*/0LVXVHGWRFRQGXFWWKH$1&29$XVLQJWKH7UHDWPHQW757 HIIHFWDV DFODVVYDULDEOHDQG+*%$&DVDQXPHULFFRYDULDWH7KHVXPVRIVTXDUHVPHDQ VTXDUHVDQG)WHVWVDOOFRUURERUDWHWKHPDQXDOFDOFXODWLRQV 7KH62/87,21RSWLRQLQWKH02'(/VWDWHPHQW LQ352&*/0REWDLQVWKH HVWLPDWHV IRU WKH UHJUHVVLRQ HTXDWLRQV7KHVORSHHVWLPDWHELV± ñ7KH LQWHUFHSWHVWLPDWHVDUHIRXQGE\DGGLQJHDFKWUHDWPHQWJURXSHIIHFWWRWKHLQWHUFHSW HVWLPDWH
DQG
D
D ò
7KH /60($16 VWDWHPHQW LQVWUXFWV 6$6 WR SULQW RXW WKH DGMXVWHG PHDQ UHVSRQVHVô DW WKH FRPPRQ FRYDULDWH PHDQ [ DV GHPRQVWUDWHG LQ WKH PDQXDO FDOFXODWLRQV 7KHGDWDVHWLVDOVRDQDO\]HGXVLQJWKHRQHZD\$129$LJQRULQJWKHFRYDULDWHõ 7KHVH UHVXOWV DV GLVFXVVHG SUHYLRXVO\ VKRZ D QRQVLJQLILFDQW WUHDWPHQW HIIHFW S
6$6&RGHIRU([DPSOH DATA TRI; INPUT TRT $ PAT HGBA1C TRICHG DATALINES; FIB 2 7.0 5 FIB 4 6.0 10 FIB 8 8.6 –20 FIB 11 6.3 0 FIB 16 6.6 10 FIB 17 7.4 –10 FIB 21 6.5 –15 FIB 23 6.2 5 FIB 27 8.5 –40 FIB 28 9.2 –25 FIB 33 7.0 –10 GEM 1 5.1 10 GEM 5 7.2 –15 GEM 6 6.4 5 GEM 10 6.0 –15 GEM 12 5.6 -5 GEM 15 6.7 –20 GEM 18 8.6 –40 GEM 22 6.0 –10 GEM 25 9.3 –40 GEM 29 7.9 -35 GEM 31 7.4 0 GEM 34 6.5 –10 ; PROC SORT DATA = TRI; BY TRT HGBA1C TRICHG;
@@; FIB FIB FIB FIB FIB GEM GEM GEM GEM GEM GEM
7 13 19 24 30 3 9 14 20 26 32
7.1 -5 7.5 –15 5.3 20 7.8 0 5.0 25 6.0 15 5.5 10 5.5 –10 6.4 -5 8.5 –20 5.0 0
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
/* Print data set */ PROC PRINT DATA = TRI; å VAR TRT PAT HGBA1C TRICHG; TITLE1 ’Analysis of Covariance’; TITLE2 ’Example 11.1: Triglyceride Changes Adjusted for Glycemic Control’; RUN; /* Plot data, by group */ PROC PLOT VPERCENT=45 DATA = TRI; PLOT TRICHG*HGBA1C = TRT; RUN;
/* Obtain summary statistics for each group */ PROC MEANS MEAN STD N DATA = TRI; BY TRT; VAR HGBA1C TRICHG; RUN; /* Use glycemic control as covariate */ PROC GLM DATA = TRI; CLASS TRT; MODEL TRICHG = TRT HGBA1C / SOLUTION; LSMEANS TRT/PDIFF STDERR; RUN;
ê
ô
/* Compare groups with ANOVA, ignoring the covariate */ PROC GLM DATA = TRI; õ CLASS TRT; MODEL TRICHG = TRT / SS3; RUN;
2873876$62XWSXWIRU([DPSOH Analysis of Covariance Example 11.1: Triglyceride Changes Adjusted for Glycemic Control
Obs
TRT
PAT
HGBA1C
TRICHG
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
FIB FIB FIB FIB FIB FIB FIB FIB FIB FIB FIB FIB FIB FIB FIB
30 19 4 23 11 21 16 33 2 7 17 13 24 27 8
5.0 5.3 6.0 6.2 6.3 6.5 6.6 7.0 7.0 7.1 7.4 7.5 7.8 8.5 8.6
25 20 10 5 0 -15 10 -10 5 -5 -10 -15 0 -40 -20
å
&KDSWHU$QDO\VLVRI&RYDULDQFH
2873876$62XWSXWIRU([DPSOHFRQWLQXHG 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
FIB GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM GEM
28 32 1 14 9 12 10 22 3 20 6 34 15 5 31 29 26 18 25
9.2 5.0 5.1 5.5 5.5 5.6 6.0 6.0 6.0 6.4 6.4 6.5 6.7 7.2 7.4 7.9 8.5 8.6 9.3
-25 0 10 -10 10 -5 -15 -10 15 -5 5 -10 -20 -15 0 -35 -20 -40 -40
Analysis of Covariance Example 11.1: Triglyceride Changes Adjusted for Glycemic Control Plot of TRICHG*HGBA1C.
Symbol is value of TRT.
TRICHG | | 25 +F 20 + F 15 + G 10 + G G F F 5 + F G F 0 +G F G F -5 + G G F -10 + G G G F F -15 + G F G F -20 + G GF -25 + F -30 + -35 + G -40 + FG G | •+------------+-------------+-------------+-------------+-------+ 5 6 7 8 9 10 HGBA1C
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG The MEANS Procedure ------------------------------- TRT=FIB ----------------------------Variable Mean Std Dev N ---------------------------------------------HGBA1C 7.0000000 1.1587349 16 TRICHG -4.0625000 16.9527530 16 ----------------------------------------------
ê
----------------------- ----- TRT=GEM ----------------------------Variable Mean Std Dev N ---------------------------------------------HGBA1C 6.6444444 1.2594220 18 TRICHG -10.2777778 16.4023153 18 ----------------------------------------------
Analysis of Covariance Example 11.1: Triglyceride Changes Adjusted for Glycemic Control The GLM Procedure Class Level Information Class TRT
Levels 2
Values FIB GEM
Number of observations
34
Dependent Variable: TRICHG Source Model Error Corrected Tot
Sum of Squares
DF 2 31 33
8.075283 2903.689422 9211.764706
Mean Square
F Value
Pr > F
3154.037642 93.667401
33.67
<.0001
R-Square
Coeff Var
Root MSE
TRICHG Mean
0.684785
-131.6234
9.678192
-7.352941
Source
DF
Type I SS
Mean Square
F Value
Pr > F
TRT HGBA1C
1 1
327.216095 5980.859189
327.216095 5980.859189
3.49 63.85
0.0711 <.0001
Source
DF
Type III SS
Mean Square
F Value
Pr > F
TRT HGBA1C
1 1
865.363546 5980.859189
865.363546 5980.859189
9.24 63.85
0.0048 <.0001
&KDSWHU$QDO\VLVRI&RYDULDQFH
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Parameter Intercept TRT FIB TRT GEM HGBA1C
Standard Error
Estimate
ò
t Value
Pr > |t|
6.70 3.04 . -7.99
<.0001 0.0048 . <.0001
64.59251309 B 9.64331471 10.22171475 B 3.36293670 0.00000000 B . -11.26810398 ñ 1.41014344
ù
NOTE: The X’X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter ’B’ are not uniquely estimable. Least Squares Means
TRT FIB GEM
TRICHG LSMEAN -1.9414451 -12.1631599
Standard Error 2.4340646 11 2.2933414
ô
H0:LSMean1= LSMean2 Pr > |t|
H0:LSMEAN=0 Pr > |t| 0.4312 <.0001
0.0048
Analysis of Covariance Example 11.1: Triglyceride Changes Adjusted for Glycemic Control The GLM Procedure Class Level Information Class TRT
Levels 2
Values FIB GEM
Number of observations
Dependent Variable: TRICHG Sum of Source DF Squares Model Error Corrected Tot
Source TRT
1 32 33
327.216095 8884.548611 9211.764706
34
Mean Square
F Value
Pr > F
327.216095 277.642144
1.18
0.2858
õ
R-Square
Coeff Var
Root MSE
TRICHG Mean
0.035522
-226.6113
16.66260
-7.352941
DF
Type III SS
Mean Square
F Value
Pr > F
1
327.2160948
327.2160948
1.18
0.2858
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH QH[W H[DPSOH LOOXVWUDWHV $1&29$ XVLQJ PXOWLSOH QXPHULF FRYDULDWHV ZLWK PXOWLSOHFODVVLILFDWLRQIDFWRUV
p ([DPSOH$1&29$LQ0XOWL&HQWHU+\SHUWHQVLRQ6WXG\
7KHH[SHULPHQWDODQWLK\SHUWHQVLYHDJHQW*%ZDVVWXGLHGLQVWXG\ FHQWHUVWRFRPSDUHZHHNVRIWUHDWPHQWXVLQJ*%DVWKHVROHWKHUDS\ YVXVLQJ*%LQFRPELQDWLRQZLWKDVWDEOHGRVHRIDPDUNHWHGFDOFLXP FKDQQHOEORFNHUZKHQJLYHQWRSDWLHQWVZLWKPRGHUDWHWRVHYHUHK\SHUWHQVLRQ 7KHSULPDU\DQDO\VLVIRFXVHVRQDFRPSDULVRQRIWKHPHDQGHFUHDVHVLQ GLDVWROLFEORRGSUHVVXUHEHWZHHQWKHWZRWUHDWPHQWJURXSVDGMXVWHGIRUWKH SDWLHQW¶VDJHDQGVHYHULW\RIK\SHUWHQVLRQDWVWXG\HQWU\%DVHOLQHVHYHULW\LV WKHDYHUDJHGLDVWROLFEORRGSUHVVXUHREWDLQHGRQWKUHHSUHVWXG\YLVLWV 'R WKHGDWDLQ7DEOHVKRZDQ\GLIIHUHQFHLQUHVSRQVHEHWZHHQWKHWZR WUHDWPHQWJURXSV"
7$%/(5DZ'DWDIRU([DPSOH
6ROH7KHUDS\ 3DWLHQW 1XPEHU
$*( \UV
%DVHOLQH ',$%3 PP+J
)LUVWGLJLWLQGLFDWHVVWXG\FHQWHU
',$%3 &KDQJH PP +J
&RPELQDWLRQ7KHUDS\ 3DWLHQW 1XPEHU
$*( \UV
%DVHOLQH ',$%3 PP+J
',$%3 &KDQJH PP +J
&KDSWHU$QDO\VLVRI&RYDULDQFH
6ROH7KHUDS\ %DVHOLQH 3DWLHQW $*( ',$%3 1XPEHU \UV PP+J
)LUVWGLJLWLQGLFDWHVVWXG\FHQWHU
',$%3 &KDQJH PP +J
&RPELQDWLRQ7KHUDS\
3DWLHQW $*( 1XPEHU \UV
',$%3 %DVHOLQH &KDQJH PP ',$%3 +J PP+J
6$6$QDO\VLVRI([DPSOH 7KH6$6FRGHDQGRXWSXWIRUWKLVH[DPSOHDUHVKRZQRQWKHIROORZLQJSDJHV7KH 35,17 DQG 0($16 SURFHGXUHV SURYLGH D SDUWLDO GDWD OLVWLQJ 12 DQG VXPPDU\ VWDWLVWLFV E\ WUHDWPHQW JURXS 13 &RPELQDWLRQ WUHDWPHQW &20% VKRZV DQ XQDGMXVWHG PHDQ GHFUHDVH LQ GLDVWROLF EORRG SUHVVXUH RI PP +J FRPSDUHG ZLWKDGHFUHDVHRIPP+JIRUVROHWKHUDS\62/( 352& */0 LV XVHG WR FRQGXFW DQ $1&29$ XVLQJ 7UHDWPHQW*URXS757 DQG 6WXG\&HQWHU&(17(5 DVYDULDEOHVLQWKH&/$66VWDWHPHQW 14 $JH$*( DQG EDVHOLQH GLDVWROLF EORRG SUHVVXUH %3',$ DUH XVHG DV QXPHULF FRYDULDWHV E\ LQFOXGLQJ WKHP LQ WKH 02'(/ VWDWHPHQW EXW RPLWWLQJ WKHP IURP WKH &/$66 VWDWHPHQW 7KH 02'(/ VWDWHPHQW FRUUHVSRQGV WR D PDLQ HIIHFWV PRGHO WKDW LJQRUHVDQ\LQWHUDFWLRQV7KHOLQHDUPRGHOXVHGLQWKLVDQDO\VLVFDQEHZULWWHQDV
\ [ [ LM
L
M
ZKHUH \ LV WKH UHVSRQVH PHDVXUHPHQW FKDQJH LQ GLDVWROLF EORRG SUHVVXUH IRU 7UHDWPHQWLL RU DQG&HQWHUMM RU LVWKHRYHUDOOPHDQUHVSRQVH LVWKHHIIHFWRI7UHDWPHQWL LVWKHHIIHFWRI&HQWHUM[ LVDJHLQ\HDUV [ LV EDVHOLQH GLDVWROLF EORRG SUHVVXUH LQ PP +J LV WKH LQFUHDVH LQ UHVSRQVH LM
L
M
N
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
DVVRFLDWHG ZLWK D RQHXQLW FKDQJH LQ [ N RU DQG UHSUHVHQWV WKH UDQGRP HUURUFRPSRQHQW N
7KLVDQDO\VLVDVVXPHVWKDWLIWKHUHVSRQVHLVOLQHDUO\UHODWHGWRWKHFRYDULDWHVWKHQ WKDWVDPHOLQHDUUHODWLRQVKLSH[LVWVZLWKLQHDFKVXEJURXSIRUPHGE\FRPELQDWLRQV RI OHYHOV RI WKH FODVV YDULDEOHV 7KLV DVVXPSWLRQ FDQ EH WHVWHG E\ LQFOXGLQJ LQWHUDFWLRQV VXFK DV 757 $*( RU 757 %3',$ LQ WKH PRGHO VHH 6HFWLRQ $SUHOLPLQDU\WHVWIRUD7UHDWPHQWE\&HQWHULQWHUDFWLRQFDQDOVREHPDGH EHIRUHSURFHHGLQJZLWKWKLVPRGHO 7KH RXWSXW VKRZV D PHDQ VTXDUH HUURU 06( RI EDVHG RQ GHJUHHV RI IUHHGRP DQG LQGLFDWHV D VLJQLILFDQW RYHUDOO PRGHO HIIHFW S 15 ZKLFK PHDQVWKDWQRWDOOIDFWRUVLQFOXGHGLQWKHPRGHODUHXQUHODWHGWRWKHUHVSRQVH 7KH 6$6 7\SH ,,, VXPV RI VTXDUHV LQGLFDWH D VLJQLILFDQW WUHDWPHQW 757 HIIHFW S 1R HYLGHQFH RI GLIIHUHQFHV LV VHHQ DPRQJ &HQWHUV S ZKLOH WKH EDVHOLQH VHYHULW\ %3',$ LV D VLJQLILFDQW FRYDULDWH S 7KH 7\SH,,,UHVXOWVIRUHDFKIDFWRUDUHDGMXVWHGIRUWKHLQFOXVLRQRIDOORWKHUIDFWRUVLQ WKHPRGHO7KH7\SH,UHVXOWVDUHRQO\DGMXVWHGIRUWKHIDFWRUVSUHFHGLQJLWLQWKH 02'(/VWDWHPHQW1RWHWKDWWKH$JHFRYDULDWHLVVLJQLILFDQWS ZKHQLWLV QRW DGMXVWHG IRU EDVHOLQH GLDVWROLF EORRG SUHVVXUH 7\SH , 66 EXW $JH LV QRW VLJQLILFDQWS DIWHUDGMXVWLQJIRU%3',$7\SH,,,UHVXOWV 7KH DGMXVWHG PHDQ UHVSRQVHV /HDVW 6TXDUHV 0HDQV ± PP +J IRU FRPELQDWLRQ WUHDWPHQW DQG ± PP +J IRU VROH WKHUDS\ DUH VOLJKWO\JUHDWHU WKDQWKHXQDGMXVWHGPHDQV±DQG±UHVSHFWLYHO\ 7KH 62/87,21 RSWLRQ XVHG LQ WKH 02'(/ VWDWHPHQW HVWLPDWHV WKH PRGHO SDUDPHWHUV VKRZQ LQ WKH (VWLPDWH FROXPQ LQ 2XWSXW )RU H[DPSOH WKH HVWLPDWHGUHVSRQVHIRUFRPELQDWLRQWKHUDS\LQ&HQWHUFDQEHIRXQGE\XVLQJWKH HTXDWLRQ
A\ ±±± $*( ± %3',$
RU
A\ ± $*(± %3',$
$QHVWLPDWHRIWKHPHDQUHVSRQVHIRUSDWLHQWVZKRUHFHLYHFRPELQDWLRQWKHUDS\LQ &HQWHUFDQEHIRXQGE\VXEVWLWXWLQJWKHRYHUDOOPHDQDJH\UV DQGWKH RYHUDOO PHDQ EDVHOLQH GLDVWROLF EORRG SUHVVXUH PP +J LQWR WKLV HTXDWLRQDVIROORZV
\ ± ± ± 6LPLODUO\ HVWLPDWHG UHVSRQVH HTXDWLRQVDQGPHDQVIRUWKHRWKHUFRPELQDWLRQVRI 7UHDWPHQWDQG&HQWHUDUHVKRZQLQ7DEOH
&KDSWHU$QDO\VLVRI&RYDULDQFH
7$%/($GMXVWHG0HDQ5HVSRQVHVE\7UHDWPHQW*URXSDQG&HQWHU IRU([DPSOH
757
&(17(5
&20%
62/(
(VWLPDWHG5HVSRQVH(TXDWLRQ
A\ A\ A\ A\ A\ A\ A\ A\
0HDQ5HVSRQVH DW$YHUDJH 9DOXHRI &RYDULDWHV
± $*(± %3',$
± $*(± %3',$
± $*(± %3',$
± $*(± %3',$
± $*(± %3',$
± $*(± %3',$
± $*(± %3',$
± $*(± %3',$
7KH DGMXVWHG WUHDWPHQW PHDQV /6PHDQV FDQ EH YHULILHG E\ DYHUDJLQJ WKH HVWLPDWHVRIWKHZLWKLQFHQWHUPHDQVRYHUDOOFHQWHUVIRUWKDWWUHDWPHQW
\
FRPE
\
VROH
±±±± ±
±±±± ±
6$6&RGHIRU([DPSOH DATA BP; INPUT TRT $ PAT AGE BPDIA0 CENTER = INT(PAT/100); DATALINES; SOLE 101 55 102 -6 SOLE 103 SOLE 104 45 97 -2 SOLE 106 SOLE 109 54 115 -7 SOLE 202 ... (more data lines) COMB 422 55 103 COMB 426 38 105 ;
-8 -3
BPDIACH @@;
68 115 -10 69 107 -5 58 99 -4
...
COMB 423 66 115 -14
12 PROC PRINT DATA = BP; TITLE1 ’Analysis of Covariance’; TITLE2 ’Example 11.2: ANCOVA in Multi-Center Hypertension Study’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
PROC SORT DATA = BP; BY TRT CENTER; 13
PROC MEANS MEAN STD N MIN MAX DATA = BP; BY TRT; VAR AGE BPDIA0 BPDIACH; RUN;
PROC GLM DATA = BP; 14 CLASS TRT CENTER; MODEL BPDIACH = TRT CENTER AGE BPDIA0 / SOLUTION; LSMEANS TRT / STDERR PDIFF; TITLE3 ’GLM Model Including Effects: TRT, CENTER, AGE, BPDIA0’; RUN;
2873876$62XWSXWIRU([DPSOH Analysis of Covariance Example 11.2: ANCOVA in Multi-Center Hypertension Study
Obs
TRT
PAT
AGE
BPDIA0
BPDIACH
CENTER
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 . . . 70 71 72 73 74 75
SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE SOLE
101 103 104 106 109 202 203 204 206 208 211 212 214 216 217
55 68 45 69 54 58 62 51 47 61 40 36 58 64 55
102 115 97 107 115 99 119 107 96 98 110 103 109 119 104
-6 -10 -2 -5 -7 -4 -11 -9 0 -6 5 -3 10 -12 4
1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
COMB COMB COMB COMB COMB COMB
417 419 420 422 423 426
53 43 77 55 66 38
109 96 100 103 115 105
3 -14 -5 -8 -14 -3
4 4 4 4 4 4
12
&KDSWHU$QDO\VLVRI&RYDULDQFH
2873876$62XWSXWIRU([DPSOHFRQWLQXHG The MEANS Procedure --------------------------- TRT=COMB -----------------------------Variable Mean Std Dev N Minimum Maximum ------------------------------------------------------------------AGE 55.3846154 11.0706235 39 34.0000000 79.0000000 BPDIA0 106.3333333 7.0012530 39 93.0000000 119.0000000 BPDIACH -6.8205128 6.2318021 39 -20.0000000 7.0000000 ------------------------------------------------------------------13
---------------------------- TRT=SOLE ----------------------------Variable Mean Std Dev N Minimum Maximum ------------------------------------------------------------------AGE 54.1944444 9.4923110 36 36.0000000 73.0000000 BPDIA0 106.6666667 7.9677923 36 95.0000000 119.0000000 BPDIACH -4.0555556 5.9997354 36 -15.0000000 10.0000000 -------------------------------------------------------------------
Analysis of Covariance Example 11.2: ANCOVA in Multi-Center Hypertension Study GLM Model Including Effects: TRT, CENTER, AGE, BPDIA0 The GLM Procedure Class Level Information Class Levels Values TRT 2 COMB SOLE CENTER 4 1 2 3 4 Number of observations
75
Dependent Variable: BPDIACH
Source
DF
Sum of Squares
Model Error Corrected Total
6 68 74
641.585250 2237.161416 2878.746667
R-Square 0.222870
Coeff Var -104.4139
Mean Square
F Value
Pr > F
106.930875 32.899433
3.25
0.0072
Root MSE 5.735803
BPDIACH Mean -5.493333
15
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Source
DF
Type I SS
Mean Square
F Value
Pr > F
TRT CENTER AGE BPDIA0
1 3 1 1
143.1141880 49.0195734 195.3204109 254.1310779
143.1141880 16.3398578 195.3204109 254.1310779
4.35 0.50 5.94 7.72
0.0408 0.6858 0.0175 0.0070
Source
DF
Type III SS
Mean Square
F Value
Pr > F
TRT CENTER AGE BPDIA0
1 3 1 1
139.2345407 13.1006818 92.6596679 254.1310779
139.2345407 4.3668939 92.6596679 254.1310779
4.23 0.13 2.82 7.72
0.0435 0.9403 0.0979 0.0070
Parameter Intercept TRT COMB TRT SOLE CENTER 1 CENTER 2 CENTER CENTER AGE BPDIA0
Estimate 30.88660621 -2.74007804 0.00000000 -0.96419474 -0.65014780 -1.03110593 0.00000000 -0.11500993 -0.26398254
B B B B B B B
Standard Error
t Value
Pr > |t|
10.04435860 1.33193688 . 2.23622128 1.63467283 1.86433408 . 0.06853055 0.09498183
3.08 -2.06 . -0.43 -0.40 -0.55 . -1.68 -2.78
0.0030 0.0435 . 0.6677 0.6921 0.5820 . 0.0979 0.0070
NOTE: The X’X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter ’B’ are not uniquely estimable.
Analysis of Covariance Example 11.2: ANCOVA in Multi-Center Hypertension Study GLM Model Including Effects: TRT, CENTER, AGE, BPDIA0
Least Squares Means
TRT
BPDIACH LSMEAN
Standard Error
H0:LSMEAN=0 Pr > |t|
COMB SOLE
-6.93129242 -4.19121438
0.98013179 0.99307483
<.0001 <.0001
H0:LSMean1= LSMean2 Pr > |t| 0.0435
&KDSWHU$QDO\VLVRI&RYDULDQFH
'HWDLOV 1RWHV
VE
7KHVWDQGDUGHUURURIWKHVORSHHVWLPDWHELVJLYHQE\
06(
å6 L
[[ L
%HFDXVHDVLJQLILFDQWFRYDULDWHHIIHFWPHDQVWKHVDPHDVDQRQ]HURVORSH DQDOWHUQDWHEXWHTXLYDOHQWWHVWWRWKH)WHVWIRUWKHFRYDULDWHHIIHFWLVWKHWWHVW DVIROORZV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+
WHVWVWDWLVWLF
W
GHFLVLRQUXOH
+ $
E VE
UHMHFW+ LI_ W _!W
,Q([DPSOH\RXKDYH06( VRWKDW
VE
7KHWVWDWLVWLFLVFRPSXWHGDV W ± ± EDVHGRQGHJUHHVRIIUHHGRPDVVKRZQLQ2XWSXW ù1RWLFHWKDWWKH VTXDUHRIWKHWVWDWLVWLFLVWKH)VWDWLVWLF± EDVHGRQXSSHUDQG ORZHUGHJUHHVRIIUHHGRP
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
'LIIHUHQFHVLQDGMXVWHGUHVSRQVHPHDQVDUHWKHVDPHUHJDUGOHVVRI WKHYDOXHRIWKHFRYDULDWH[EHFDXVHWKHVORSHVDUHDVVXPHGWREHHTXDODPRQJ JURXSV,Q([DPSOHWKHHVWLPDWHGGLIIHUHQFHLQDGMXVWHGUHVSRQVHPHDQV DW[ LV
±[ ±±[
UHJDUGOHVV RI ZKDW YDOXH [ WDNHV 7KLV LV QRW WKH FDVH LI WKH VORSHV GLIIHU DPRQJJURXSV
,IWKHDVVXPSWLRQRIHTXDOVORSHVZLWKLQHDFKJURXSLVQRWYDOLGWKH$1&29$ PHWKRGV GHVFULEHG LQWKLVFKDSWHUFDQQRWEHXVHG2QHPHWKRGRIWHVWLQJIRU HTXDO VORSHV LV WR XVH DQ µLQWHUDFWLRQ¶ IDFWRU EHWZHHQ *5283 DQG ; LQ WKH 02'(/VWDWHPHQWLQ6$6HJ MODEL Y = GROUP X GROUP*X;
$ VLJQLILFDQW LQWHUDFWLRQ HIIHFW *5283 ; LPSOLHV WKDW WKH GLIIHUHQFHV DPRQJ JURXS OHYHOV FKDQJH IRU GLIIHUHQW ; YDOXHV LH GLIIHUHQW VORSHV $1&29$ VKRXOG QRW EH XVHG LI D SUHOLPLQDU\ WHVW UHVXOWV LQ VLJQLILFDQWO\ GLIIHUHQWVORSHVDPRQJJURXSV $ FRQILGHQFH LQWHUYDO IRU WKH DGMXVWHG PHDQ UHVSRQVH LQ *URXSLL N LVJLYHQE\
D E [ W L
([L - [ ) ÂVÂ QL 6[[
1±N±
ZKHUHV 06( IURPWKH$129$WDEOH ,Q([DPSOHFRQILGHQFHLQWHUYDOVIRUWKHDGMXVWHGPHDQUHVSRQVHVIRU HDFKWUHDWPHQWJURXSDUH )LEUDOR*URXS
× ±Â ±RU±WR
×
-
&KDSWHU$QDO\VLVRI&RYDULDQFH
*HPILEUR]LO*URXS
× × ±Â
±RU±WR± 1RWLFH WKDW WKH FRQILGHQFH LQWHUYDO ZLGWKV DQG FDQ EH HDVLO\ FRPSXWHG DV W 6(0 ZKHUH W LV WKH FULWLFDO WYDOXH DQG 6(0 LV WKH VWDQGDUGHUURURIWKH/6PHDQVKRZQLQ2XWSXW 11 $ FRQILGHQFH LQWHUYDO IRU WKH GLIIHUHQFH LQ DGMXVWHG PHDQ UHVSRQVHVEHWZHHQWZRJURXSVVD\*URXSXDQG*URXSYLVJLYHQE\
D X - D Y W 1N × V ×
± × ×
QY
[ X - [ Y 6[[
-
 RUWR
1RWLFHWKDWWKHVWDQGDUGHUURU XVHGLQWKLVFDOFXODWLRQFDQEHREWDLQHG IURPWKH6$6RXWSXWIRU757HIIHFWVHH2XWSXW 7KH VWDQGDUG HUURUV RI WKH DGMXVWHG JURXS PHDQV DQG PHDQ GLIIHUHQFHVDUHVPDOOHVWZKHQ [ [ IRUDOOL N ,QWKLVFDVHWKH VWDQGDUGHUURULVWKHVDPHDVWKDWRIWKHXQDGMXVWHGFDVHVHH&KDSWHU L
QX
,Q ([DPSOH D FRQILGHQFH LQWHUYDO IRU WKH GLIIHUHQFH LQ DGMXVWHG PHDQVEHWZHHQWKH)LEUDORDQG*HPILEUR]LOJURXSVLV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$V VKRZQ LQ 2XWSXWWKHDGMXVWHG)WHVWVLQ6$6DUHIRXQG ZLWKWKH7\SH,,,VXPVRIVTXDUHV6$6SULQWVWKHVHE\GHIDXOWLQDGGLWLRQWR WKH7\SH,VXPVRIVTXDUHV7KH7\SH,UHVXOWVGHSHQGRQWKHRUGHULQZKLFK WKHIDFWRUVDUHVSHFLILHGLQWKH02'(/VWDWHPHQWVHH$SSHQGL[' :KHQWKH 7UHDWPHQWHIIHFWLVVSHFLILHGEHIRUHWKHFRYDULDWHLQWKH02'(/VWDWHPHQWLQ 352& */0 WKH 7\SH , VXP RI VTXDUHV UHSUHVHQWV WKH XQDGMXVWHG VXP RI VTXDUHV IRU 7UHDWPHQW LJQRULQJ WKH FRYDULDWH ZKLFK FRXOG DOWHUQDWLYHO\ EH IRXQGE\XVLQJWKHRQHZD\$129$ZLWK7UHDWPHQWDVWKHRQO\HIIHFW 1RWLFHWKDWWKH7\SH,66IRU757LQ2XWSXWXVLQJWKH$1&29$PRGHOLV WKHVDPHDVWKH66IRU757LQWKHRQHZD\$129$UHVXOWV:KLOHWKHVDPH VXP RI VTXDUHV DUH REWDLQHG WKH FRUUHVSRQGLQJ )WHVW LV QRW WKH VDPH )WHVW EHFDXVHWKHFRYDULDWHDGMXVWHG06(LVXVHGDVDGLYLVRU 1RWH$OWKRXJKWKHH[DPSOHVLQWKLVVHFWLRQXVH352&*/0LQ6$6352& 0,;('FDQDOVREHXVHGIRU$1&29$PHWKRGV7KHERRN³6$66\VWHPIRU 0L[HG0RGHOV´E\/LWWHOO0LOOLNHQ6WURXSDQG:ROILQJHUSURYLGHVDJRRG GLVFXVVLRQRIWKHXVHRI352&0,;('ZLWK$1&29$7KLVERRNZDV ZULWWHQXQGHUWKH6$6%RRNV%\8VHUVSURJUDP
,Q ([DPSOH YDULRXV RWKHUPRGHOVFRXOGKDYHEHHQVHOHFWHG IRUWKHDQDO\VLV2XWSXWVKRZVWKH$1&29$UHVXOWVSURGXFHGE\PRGHOV ZKLFKFRUUHVSRQGWRWKHIROORZLQJ*/002'(/VWDWHPHQWV (1) (2) (3) (4) (5) (6) (7)
MODEL MODEL MODEL MODEL MODEL MODEL MODEL
BPDIACH BPDIACH BPDIACH BPDIACH BPDIACH BPDIACH BPDIACH
= = = = = = =
TRT TRT TRT TRT TRT TRT TRT
/ SS3 ; CENTER / SS3; BPDIA0 / SS3; AGE / SS3; AGE BPDIA0 / SS3; CENTER BPDIA0 / SS3; CENTER AGE / SS3;
7KH6$67\SH,,,UHVXOWVIRUHDFKPRGHOIDFWRUDUHSURGXFHGLQWKHRXWSXWIRUHDFK RIWKHDERYHPRGHOVE\XVLQJWKHods select ModelANOVAVWDWHPHQW 8VLQJ D RQHZD\ $129$ &KDSWHU RU HTXLYDOHQWO\ WKH WZRVDPSOH WWHVW &KDSWHU WRFRPSDUHWUHDWPHQWJURXSVLJQRULQJDOORWKHUVWXG\IDFWRUVUHVXOWVLQ RQO\PDUJLQDOVLJQLILFDQFHS RIWKH7UHDWPHQWHIIHFW0RGHO DERYH 8VLQJ D WZRZD\ $129$ &KDSWHU E\ LQFOXGLQJ 7UHDWPHQW *URXS DQG &HQWHU ZLWKRXWWKHFRYDULDWHVUHVXOWVLQDVLPLODUSYDOXHIRU7UHDWPHQWS 0RGHO DERYH
&KDSWHU$QDO\VLVRI&RYDULDQFH
2XWSXW6$62XWSXWIRU([DPSOH8VLQJ$OWHUQDWLYH0RGHOV (1) Source
DF
Type III SS
Mean Square
F Value
Pr > F
TRT 1 143.1141880 143.1141880 3.82 0.0545 ------------------------------------------------------------------(2) Source DF Type III SS Mean Square F Value Pr > F TRT 1 146.5980349 146.5980349 3.82 0.0547 CENTER 3 49.0195734 16.3398578 0.43 0.7352 ------------------------------------------------------------------(3) Source DF Type III SS Mean Square F Value Pr > F TRT 1 153.8692890 153.8692890 4.72 0.0331 BPDIA0 1 388.0563898 388.0563898 11.90 0.0009 ------------------------------------------------------------------(4) Source DF Type III SS Mean Square F Value Pr > F TRT 1 122.7314895 122.7314895 3.51 0.0649 AGE 1 220.8161121 220.8161121 6.32 0.0142 ------------------------------------------------------------------(5) Source DF Type III SS Mean Square F Value Pr > F TRT 1 137.3094860 137.3094860 4.33 0.0410 AGE 1 97.3139906 97.3139906 3.07 0.0840 BPDIA0 1 264.5542683 264.5542683 8.35 0.0051 ------------------------------------------------------------------(6) Source DF Type III SS Mean Square F Value Pr > F TRT 1 157.4152735 157.4152735 4.66 0.0343 CENTER 3 17.7550045 5.9183348 0.18 0.9128 BPDIA0 1 356.7918209 356.7918209 10.57 0.0018 ------------------------------------------------------------------(7) Source DF Type III SS Mean Square F Value Pr > F TRT CENTER AGE
1 3 1
124.5076559 23.5238722 195.3204109
124.5076559 7.8412907 195.3204109
3.45 0.22 5.41
0.0676 0.8842 0.0230
7KHVH DQDO\VHV LQGLFDWH WKH LPSRUWDQFH RI JLYLQJ DGHTXDWH FRQVLGHUDWLRQ ZKHQ HVWDEOLVKLQJ DQ DSSURSULDWH VWDWLVWLFDO PRGHO DW WKH GHVLJQ VWDJH $V GLVFXVVHG LQ &KDSWHUELDVFDQEHLQWURGXFHGDWWKHDQDO\VLVVWDJHLIVWDWLVWLFDOPHWKRGVDUHQRW SUHSODQQHG 7KLV LQFOXGHV VSHFLILFDWLRQ RI DSSURSULDWH VWDWLVWLFDO PRGHOV ,Q D SLYRWDOVWXG\DSSURSULDWHFRYDULDWHVVKRXOGEHVSHFLILHGLQWKHSULPDU\PRGHORQO\ LI LQGLFDWHG VXFK DV E\ HDUOLHU SKDVH VWXGLHV ,QFOXVLRQ RI SHUWLQHQW FRYDULDWHV SURYLGHV DQ DXWRPDWLF DGMXVWPHQW IRU GLIIHUHQFHV LQ FRYDULDWH PHDQV EHWZHHQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
WUHDWPHQW JURXSV DQG FDQ PDUNHGO\ LQFUHDVH WKH SRZHU IRU GHWHFWLQJ WUHDWPHQW GLIIHUHQFHV +RZHYHU LQFOXVLRQ RI LQDSSURSULDWH FRYDULDWHV PLJKW OHDG WR IDFWRU GLOXWLRQ FROOLQHDULW\ SUREOHPV DQG XQQHFHVVDU\ UHGXFWLRQ LQ HUURU GHJUHHV RI IUHHGRP ZKLFK LQ WXUQ FRXOG UHVXOW LQ ORVV RI SRZHU RU HYHQ LQDSSURSULDWH SYDOXHV
:KHQ $1&29$ LV XVHG IRU PRGHOEXLOGLQJ RU H[SORUDWRU\ DQDO\VLV UDWKHU WKDQ DV D SULPDU\ DQDO\VLV WKH FRHIILFLHQW RI GHWHUPLQDWLRQ 5VTXDUH FDQEHXVHGDVDUHODWLYHPHDVXUHRIJRRGQHVVRIILWDVLOOXVWUDWHG LQ([DPSOH6RPHVWHSZLVHSURFHGXUHVUHO\RQ5 YDOXHVIRULQFOXVLRQRU H[FOXVLRQRIFRYDULDWHVVHH6$6+HOSDQG'RFXPHQWDWLRQIRU352&*/0
CHHAAPPTTEERR 12 The Wilcoxon Signed-Rank Test 12.1 12.2 12.3 12.4
Introduction...................................................................................................... 227 Synopsis ............................................................................................................ 227 Examples........................................................................................................... 228 Details & Notes................................................................................................. 234
12.1 Introduction The Wilcoxon signed-rank test is a non-parametric analog of the one-sample t-test (Chapter 4). The signed-rank test can be used to make inferences about a population mean or median without requiring the assumption of normally distributed data. As its name implies, the Wilcoxon signed-rank test is based on the ranks of the data. One of the most common applications of the signed-rank test in clinical trials is to compare responses between correlated or paired data, such as testing for significant pre- to post-study changes of a non-normally distributed evaluation measurement. The layout is the same as that for the paired-difference t-test (Chapter 4).
12.2 Synopsis Given a sample, y1, y2, ..., yn, of n non-zero differences or changes (zeroes are ignored), let Ri represent the rank of | yi | (when ranked lowest to highest). Let R(+) represent the sum of the ranks associated with the positive values of the yi’s, and let R(–) represent the sum of the ranks associated with the negative values of the yi’s. The test statistic is based on the smaller of R(+) and R(–). The hypothesis of ‘no mean difference’ is rejected if this statistic is larger than the critical value found in a special set of tables. To avoid the requirement of special table values, you can use an approximation to this test based on the t-distribution, as shown next.
228
Common Statistical Methods for Clinical Research with SAS Examples
Let (+)
S = (R
(–)
– R )/2
and V = (n(n + 1)(2n + 1)) / 24 The approximate test can be summarized as
null hypothesis: alt. hypothesis: test statistic:
decision rule:
H0: θ = 0 HA: θ =/ 0 T=
S ⋅ n–1 n ⋅ V – S2
reject H0 if |T| > tα/2,n–1
θ represents the unknown population mean, median, or other location parameter. When H0 is true, T has an approximate Student-t distribution with n–1 degrees of freedom. Widely available t-tables can be used to determine the rejection region. When there are tied data values, the averaged rank is assigned to the corresponding Ri values. A small adjustment to V for tied data values is appropriate, as follows: suppose there are g groups of tied data values. For the jth group, compute cj = m(m-1)(m+1), where m is the number of tied data values for that group. The correction factor of C = c1 + c2 + ... + cg is used to adjust V, as follows:
V = (n (n + 1) (2n + 1) – C / 2) / 24 12.3 Examples p Example 12.1 – Rose-Bengal Staining in KCS
A study enrolling 24 patients diagnosed with keratitis sicca (KCS) was conducted to determine the effect of a new ocular wetting agent, Oker-Rinse, for improving ocular dryness based on Rose-Bengal staining scores. Patients were instructed to use Oker-Rinse in one eye and Hypotears®, as a control, in the other eye 4 times a day for 3 weeks. Rose-Bengal staining scores were measured as 0, 1, 2, 3, or 4 in each of four areas of the eye: cornea, limbus, lateral conjunctiva, and medial conjunctiva. Higher scores represent a greater number of devitalized cells, a condition associated with KCS. The overall scores for each eye (see Table 12.1) were obtained by adding the scores over the four areas. Is there evidence of a difference between the two test preparations in the overall Rose-Bengal staining scores?
Chapter12: The Wilcoxon Signed-Rank Test 229
TABLE 12.1. Raw Data for Example 12.1 Patient Number
Hypotears
Oker-Rinse
Patient Number
Hypotears
Oker-Rinse
1
15
8
13
8
10
2
10
3
14
10
2
3
6
7
15
11
4
4
5
13
16
13
7
5
10
2
17
6
1
6
15
12
18
6
11
7
7
14
19
9
3
8
5
8
20
5
5
9
8
13
21
10
2
10
12
3
22
9
8
11
4
9
23
11
5
12
13
3
24
8
8
Solution Because each patient receives both treatments, you have a matched-pairs setup. The analysis is performed on the differences in Rose-Bengal scores between the treatment groups. The data can be plotted to reveal that the differences have a bimodal distribution (two peaks), which is not consistent with an assumption of normality. Indeed, a formal test for normality (see SAS results later) indicates a significant departure from this assumption. Therefore, use of the Wilcoxon signedrank test is preferable to using the paired-difference t-test. You want to test the hypothesis that θ, the ‘average’ difference in Rose-Bengal scores between the treatment groups, is 0. The differences in scores and ranks of their absolute values are shown in Table 12.2. The average rankings for tied differences are used. For example, a difference of ±1 occurs twice (Patients 3 and 22) so the average of the ranks 1 and 2 (= 1.5) is used. Notice that the 0 differences are omitted from the rankings, leaving n = 22, not 24.
230
Common Statistical Methods for Clinical Research with SAS Examples
TABLE 12.2. Ranks of the Data for Example 12.1
Patient Number
Difference
Rank
Patient Number
Difference
Rank
1 2 3 4 5 6 7 8 9 10 11 12
7 7 -1 -8 8 3 -7 -3 -5 9 -5 10
14.5 14.5 1.5 18.5 18.5 4.5 14.5 4.5 7.5 21 7.5 22
13 14 15 16 17 18 19 20 21 22 23 24
-2 8 7 6 5 -5 6 0 8 1 6 0
3 18.5 14.5 11 7.5 7.5 11 -18.5 1.5 11 --
Compute R(+) = (14.5 + 14.5 + 18.5 + 4.5 + 21 + 22 + 18.5 + 14.5 + 11 + 7.5 + 11 + 18.5 + 1.5 + 11) = 188.5 and R(–) = (1.5 + 18.5 + 14.5 + 4.5 + 7.5 + 7.5 + 3 + 7.5) = 64.5 so that S = (188.5 – 64.5) / 2 = 62.0 To apply the correction factor for ties, you record 6 groups of ties with the frequencies (m's): 2, 2, 4, 3, 4, and 4, (see Table 12.3). Thus, c1 = 2 (1) (3) = 6, c2 = 6, c3 = 4 (3) (5) = 60, c4 = 3 (2) (4) = 24, c5 = 60, and c6 = 60, yielding C = 216. TABLE 12.3. Calculation of Correction for Ties for Example 12.1
Tie Group (i)
|Difference|
Number of Ties (m)
ci =m(m-1)(m+1)
1 2 3 4 5 6
1 3 5 6 7 8
2 2 4 3 4 4
6 6 60 24 60 60 C=216
Chapter12: The Wilcoxon Signed-Rank Test 231
Finally, V
= (n (n + 1) (2n + 1) – C / 2) / 24
= (22 (23) (45) – 108) / 24 = 944.25 The test is summarized as follows: null hypothesis: alt. hypothesis:
H0: θ = 0 HA: θ =/ 0 T=
test statistic: =
S ⋅ n –1 n ⋅ V – S2 62 ⋅ 21 (22 ⋅ 944.25) – 622
= 2.184
decision rule:
reject H0 if |T| > t0.025,21 = 2.080
conclusion:
Because 2.184 > 2.080, reject H0, and conclude that there is a significant difference in RoseBengal scores between treatments.
SAS Analysis of Example 12.1 In the test summary just presented, the SAS function PROBT can be used to obtain the p-value associated with a t-statistic of 2.184 based on 21 degrees of freedom, namely, p = 0.0405 (found by using the SAS expression 2*(1–PROBT(2.184, 21))). The SAS code and output for executing this analysis are shown on the next pages. The Wilcoxon signed-rank test can be performed in SAS using PROC UNIVARIATE. The NORMAL option is used to provide a test for normality based on the Shapiro-Wilk test. This test is rejected with a p-value of 0.0322 , which indicates that the data cannot be assumed to have come from a normal distribution. This result makes the paired-difference one-sample t-test (Chapter 4) inappropriate for this analysis. In Output 12.1, the signed-rank statistic is confirmed to be 62 with a p-value of 0.0405 . SAS uses the t-approximation to the Wilcoxon signed-rank test if n is greater than 20. For n £ 20, SAS computes the exact probability.
232
Common Statistical Methods for Clinical Research with SAS Examples
SAS Code for Example 12.1 DATA KCS; INPUT PAT HYPOTEAR OKERINSE @@; DIFF = HYPOTEAR - OKERINSE; DATALINES; 1 15 8 2 10 3 3 6 7 5 10 2 6 15 12 7 7 14 9 8 13 10 12 3 11 4 9 13 8 10 14 10 2 15 11 4 17 6 1 18 6 11 19 9 3 21 10 2 22 9 8 23 11 5 ;
4 5 13 8 5 8 12 13 3 16 13 7 20 5 5 24 8 8
PROC UNIVARIATE NORMAL DATA = KCS; VAR DIFF; TITLE1 'The Wilcoxon-Signed-Rank Test'; TITLE2 'Example 12.1: Rose Bengal Staining in KCS'; RUN;
OUTPUT 12.1. SAS Output for Example 12.1 The Wilcoxon-Signed-Rank Test Example 12.1: Rose Bengal Staining in KCS The UNIVARIATE Procedure Variable: DIFF Moments N Mean Std Deviation Skewness Uncorrected SS Coeff Variation
24 2.29166667 5.66821164 -0.4011393 865 247.340144
Sum Weights Sum Observations Variance Kurtosis Corrected SS Std Error Mean
24 55 32.1286232 -1.2859334 738.958333 1.15701886
Basic Statistical Measures Location Mean Median Mode
2.29167 4.00000 -5.00000
Variability Std Deviation Variance Range Interquartile Range
5.66821 32.12862 18.00000 9.50000
NOTE: The mode displayed is the smallest of 4 modes with a count of 3.
Chapter12: The Wilcoxon Signed-Rank Test 233
OUTPUT 12.1 SAS Output for Example 12.1 (continued) The Wilcoxon-Signed-Rank Test Example 12.1: Rose Bengal Staining in KCS The UNIVARIATE Procedure Variable: DIFF
Tests for Location: Mu0=0 Test
-Statistic-
-----p Value------
Student's t Sign Signed Rank
t M S
Pr > |t| Pr >= |M| Pr >= |S|
1.980665 3 62
0.0597 0.2863 0.0405
Tests for Normality Test
--Statistic---
-----p Value------
Shapiro-Wilk Kolmogorov-Smirnov Cramer-von Mises Anderson-Darling
W D W-Sq A-Sq
Pr Pr Pr Pr
0.908189 0.201853 0.145333 0.859468
< > > >
Quantiles (Definition 5) Quantile 100% Max 99% 95% 90% 75% Q3 50% Median 25% Q1 10% 5% 1% 0% Min
Estimate 10.0 10.0 9.0 8.0 7.0 4.0 -2.5 -5.0 -7.0 -8.0 -8.0
Extreme Observations ----Lowest----
----Highest---
Value
Obs
Value
Obs
-8 -7 -5 -5 -5
4 7 18 11 9
8 8 8 9 10
5 14 21 10 12
W D W-Sq A-Sq
0.0322 0.0125 0.0248 0.0235
234
Common Statistical Methods for Clinical Research with SAS Examples
12.4 Details & Notes 12.4.1 . The Wilcoxon signed-rank test is considered a non-parametric test because it makes no assumptions regarding the distribution of the population from which the data are collected. This appears to be a tremendous advantage over the t-test, so why not use the signed-rank test in all situations? First, the t-test has been shown to be a more powerful test in detecting true differences when the data are normally distributed. Because the normal distribution occurs quite often in nature, the t-test is the method of choice for a wide range of applications. Secondly, analysts often feel more comfortable reporting a t-test whenever it is appropriate, especially to a non-statistician, because many clinical research professionals have become familiar with the terminology. One reason the t-test has enjoyed popular usage is its robustness under deviations to the underlying assumptions. Finally, the Wilcoxon signedrank test does require the assumption of a symmetrical underlying distribution. When the data are highly skewed or otherwise non-symmetrical, an alternative to the signed-rank test, such as the sign test (discussed in Chapter 15), can be used. 12.4.2. Let Wi = Ri if yi > 0 and Wi = –Ri if yi < 0. It can be shown that the T statistic discussed is equivalent to performing a one-sample t-test (Chapter 4) on the Wi’s, that is, the ranked data adjusted by the appropriate sign. This is sometimes referred to as performing a t-test on the ‘rank-transformed’ data. 12.4.3. An alternative approximation to the signed-rank test for large samples is the Z-test. The test statistic is taken to be S′ = smaller of (R(+), R(–)). S′ has mean and variance:
µ S′ = and σ
2
S′
=
n ⋅ (n+1) 4 n ⋅ (n+1) ⋅ (2n+1) 24
Under H0,
Z=
S′ – µS′ σS′
has an approximate normal distribution with mean 0 and variance 1. The Z–test statistic is compared with the tabled value of Zα/2 for a two-tailed test (= 1.96 for α = 0.05).
Chapter12: The Wilcoxon Signed-Rank Test 235
To illustrate this approximation for Example 12.1, compute S′ = 64.5 µ S′ = and σ S′ =
22 ⋅ 23 = 126.5 4 22 ⋅ 23 ⋅ 45 = 30.8 24
so that Z = (64.5 – 126.5) / 30.8 = –2.013. At p = 0.05, you reject H0 because 2.013 > 1.96. The SAS function PROBNORM can be used to obtain the actual p-value of 0.044. 12.4.4. The one-sample t-test is often used with larger samples, regardless of the distribution of the data, because the Central Limit Theorem states that the sample mean y is normally distributed for large n, no matter what distribution the underlying data have (see Chapter 1). Usually, for n > 30, we can safely use the one-sample t-test (Chapter 4), and ignore distributional assumptions, providing the distribution is symmetric. The Wilcoxon signedrank test and the t-test become closer as n gets larger. For non-symmetrical distributions, the mean, median, and other measures of central tendency might have widely disparate values. The t-test should be used only if the mean is the appropriate measure of central tendency for the population being studied. 12.4.5. The method shown here is for a two-tailed test. For a one-tailed test, the same methods that are shown in Chapter 4 for the one-sample t-test can be used. If SAS is used, the p-value associated with the signed-rank test can be halved for an approximate one-tailed test.
236
Common Statistical Methods for Clinical Research with SAS Examples
CHHAAPPTTEERR 1133 The Wilcoxon Rank-Sum Test 13.1 13.2 13.3 13.4
Introduction...................................................................................................... 237 Synopsis ............................................................................................................ 237 Examples........................................................................................................... 239 Details & Notes................................................................................................. 244
13.1 Introduction The Wilcoxon rank-sum test is a non-parametric analog of the two-sample t-test (Chapter 5). It is based on ranks of the data, and is used to compare location parameters, such as the mean or median, between two independent populations without the assumption of normally distributed data. Although the Wilcoxon rank-sum test was developed for use with continuous numeric data, the test is also applied to the analysis of ordered categorical data. In clinical data analysis, this test is often useful, for example, for comparing patient global assessments or severity ratings between two treatment groups.
13.2 Synopsis The data are collected as two independent samples of size n1 and n2, denoted by y11, y12, ..., y1n1 and y21, y22, ..., y2n2. The data are ranked, from lowest to highest, over the combined samples, and the test statistic is based on the sum of ranks for the first group. Let r1j = rank of y1j (j = 1, 2, ..., n1) and r2j = rank of y2j (j = 1, 2, ..., n2), and compute n1
R1 = å r1j j=1
and n2
R 2 = å r2j j=1
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH K\SRWKHVLV RI HTXDO PHDQV ZRXOG EH VXSSRUWHG E\ VLPLODU DYHUDJH UDQNV EHWZHHQWKHWZRJURXSVLHLI5 Q LVFORVHWR5 Q 5 LVFRPSDUHGWRDFULWLFDO YDOXH REWDLQHG IURP D VSHFLDO VHW RI WDEOHV EDVHG RQ :LOFR[RQ UDQNVXP H[DFW SUREDELOLWLHV WR GHWHUPLQH WKH DSSURSULDWH UHMHFWLRQ UHJLRQ 6XFK WDEOHV FDQ EH IRXQG LQ PDQ\ LQWHUPHGLDWH VWDWLVWLFDO ERRNV RU QRQSDUDPHWULF VWDWLVWLFDO UHIHUHQFHV
/DFNRIDYDLODELOLW\RIWKHVHWDEOHVPLJKWVRPHZKDWOLPLWWKHXVHIXOQHVVRIWKLVWHVW +RZHYHUWKHUHTXLUHPHQWIRUVSHFLDOWDEOHVFDQRIWHQEHFLUFXPYHQWHGE\XVLQJD QRUPDO DSSUR[LPDWLRQ IRU ODUJHU VDPSOHV ,W KDV EHHQ VKRZQ WKDW WKH QRUPDO DSSUR[LPDWLRQ IRU WKH UDQNVXP WHVW LV H[FHOOHQW IRU VDPSOHV DV VPDOO DV SHU JURXS :LWK 1 Q Q WKH VXP RI WKH UDQNV 1 FDQ EH H[SUHVVHG DV 1 ù 1 :KHQWKHK\SRWKHVLVRIHTXDOLW\LVWUXH+ \RXZRXOGH[SHFWWKH SURSRUWLRQRIWKLVVXPIURP6DPSOHWREHDERXWQ 1DQGWKHSURSRUWLRQIURP 6DPSOHWREHDERXWQ 17KDWLVWKHH[SHFWHGYDOXHRI5 XQGHU+ LV
5
Q ¬ 11 ¬ Q ¸1 ¸ 1 ® ®
$GGLWLRQDOO\WKHYDULDQFHRI5 FDQEHFRPSXWHGDV
s
5
Q ×Q 1
,IWKHUHDUHWLHGGDWDYDOXHVWKHDYHUDJHUDQNLVDVVLJQHGWRWKHFRUUHVSRQGLQJU YDOXHV6XSSRVHWKHUHDUHJJURXSVRIWLHGGDWDYDOXHV)RUWKHN JURXSFRPSXWH F P P ± ZKHUH P LV WKH QXPEHU RI WLHG YDOXHV IRU WKDW JURXS
WK
N
T
5
& Q ¸ Q ¬ 1 ± 11 ± ®
Chapter 13: The Wilcoxon Rank-Sum Test 239
The test statistic, using a 0.5 continuity correction, is based on an approximate normal distribution, summarized as follows:
null hypothesis: alt. hypothesis:
H0: θ1 = θ2 HA: θ1 =/ θ2 Z=
test statistic: decision rule:
| R 1 – µ R 1 | – 0.5 σ R1
reject H0 if |Z| > Zα/2
where θ1 and θ2 represent the mean, median, or other location parameters for the two populations.
13.3 Examples p Example 13.1 -- Global Evaluations of Seroxatene in Back Pain
In previous studies of the new anti-depressant, Seroxatene, researchers noticed that patients with low back pain experienced a decrease in radicular pain after 6 to 8 weeks of daily treatment. A new study was conducted in 28 patients to determine whether this phenomenon is a drug-related response or coincidental. Patients with MRI-confirmed disk herniation and symptomatic leg pain were enrolled and randomly assigned to receive Seroxatene or a placebo for 8 weeks. At the end of the study, patients were asked to provide a global rating of their pain, relative to baseline, on a coded improvement scale as follows: -------- Deterioration -------
--------- Improvement -------
Marked
Moderate
Slight
No Change
Slight
Moderate
Marked
-3
-2
-1
0
+1
+2
+3
The data are shown in Table 13.1. Is there evidence that Seroxatene has any effect on radicular back pain?
240
Common Statistical Methods for Clinical Research with SAS Examples
TABLE 13.1. Raw Data for Example 13.1 ------ Seroxatene Group -------
-------- Placebo Group ---------
Patient Number
Score
Patient Number
Score
Patient Number
Score
Patient Number
Score
2
0
16
-1
1
3
15
0
3
2
17
2
4
-1
18
-1
5
3
20
-3
7
2
19
-3
6
3
21
3
9
3
23
-2
8
-2
22
3
11
-2
25
1
10
1
24
0
13
1
28
0
12
3
26
2
14
3
27
-1
Solution With a situation involving a two-sample comparison of means, you might first consider the two-sample t-test or the Wilcoxon rank-sum test. Because the data consist of ordered categorical responses, the assumption of normality is dubious, so you can apply the rank-sum test. Tied data values are identified as shown in Table 13.2. TABLE 13.2. Calculation of Correction for Ties in Example 13.1 Response (y)
Number of Ties (m)
Ranks
Average Rank 1.5
6
4
24
2
ck = m(m -1)
-3
2
1, 2
-2
3
3, 4, 5
-1
4
6, 7, 8, 9
7.5
60
0
4
10, 11, 12, 13
11.5
60
+1
3
14, 15, 16
15
24
+2
4
17, 18, 19, 20
18.5
60
+3
8
21, 22, 23, 24, 25, 26, 27, 28
24.5
504 C = 738
&KDSWHU7KH:LOFR[RQ5DQN6XP7HVW
:LWKQ Q DQG1 WKHUDQNHGYDOXHVU V RIWKHUHVSRQVHV\ V DUH VKRZQLQ7DEOH
7$%/(5DQNVRIWKH'DWDIRU([DPSOH 6HUR[DWHQH*URXS
3ODFHER*URXS
3DWLHQW 1XPEHU
6FRUH 5DQN
3DWLHQW 1XPEHU
6FRUH 5DQN
3DWLHQW 1XPEHU
6FRUH 5DQN
3DWLHQW 1XPEHU
6FRUH 5DQN
&RPSXWH
5 DQG
5
$VDFKHFNQRWHWKDW5 5 1 1
5
DQG
×
¬ ¸ ± 5 ®
242
Common Statistical Methods for Clinical Research with SAS Examples
The test summary, based on a normal approximation at a significance level of α = 0.05, becomes null hypothesis: alt. hypothesis:
H0: θ1 = θ2 HA: θ1 =/ θ2
test statistic:
Z= =
R1 − µ R1 − 0.5 σ R1 (261 − 232) − 0.5 = 1.346 448.38
decision rule:
reject H0 if |Z| > 1.96
conclusion:
Because 1.346 is not > 1.96, you do not reject H0, and conclude that there is insufficient evidence of a difference between Seroxatene and placebo in global back pain evaluations.
Interpolation of normal probabilities tabulated in Appendix A.1 results in a twotailed p-value of 0.178 for the Z-statistic of 1.346. The p-value can also be found by using the SAS function PROBNORM. In SAS, the two-tailed p-value for a Z-value of 1.346 is (1–PROBNORM(1.346)) = 0.1783.
SAS Analysis of Example 13.1 The Wilcoxon rank-sum test is performed in SAS by using PROC NPAR1WAY with the WILCOXON option, as shown in the SAS code on the next page. The sum of ranks for either group can be used to compute the test statistic. The manual calculations use R1 = 261, while SAS uses R2 =145 . When the smaller value is used, the result is a negative Z-value . In either case, for a two-tailed test, | Z | = 1.346 with a p-value of 0.1783 . The output from PROC NPAR1WAY also gives the results of the analysis using the Kruskal-Wallis chi-square approximation (Chapter 14).
Chapter 13: The Wilcoxon Rank-Sum Test 243
SAS Code for Example 13.1 DATA RNKSM; INPUT TRT $ DATALINES; SER 2 0 SER SER 8 -2 SER SER 16 –1 SER SER 22 3 SER PBO 1 3 PBO PBO 11 –2 PBO PBO 19 -3 PBO ;
PAT SCORE @@; 3 2 10 1 17 2 24 0 4 –1 13 1 23 -2
SER SER SER SER PBO PBO PBO
5 3 12 3 20 -3 26 2 7 2 15 0 25 1
SER SER SER SER PBO PBO PBO
6 3 14 3 21 3 27 –1 9 3 18 -1 28 0
PROC NPAR1WAY WILCOXON DATA = RNKSM; CLASS TRT; VAR SCORE; TITLE1 'The Wilcoxon Rank-Sum Test'; TITLE2 'Example 13.1: Seroxatene in Back Pain'; RUN;
OUTPUT 13.1. SAS Output for Example 13.1 The Wilcoxon Rank-Sum Test Example 13.1: Seroxatene in Back Pain The NPAR1WAY Procedure Wilcoxon Scores (Rank Sums) for Variable SCORE Classified by Variable TRT Sum of Expected Std Dev Mean TRT N Scores Under H0 Under H0 Score ------------------------------------------------------------------SER 16 261.0 232.0 21.175008 16.312500 PBO 12 145.0 174.0 21.175008 12.083333 Average scores were used for ties. Wilcoxon Two-Sample Test Statistic Normal Approximation Z One-Sided Pr < Z Two-Sided Pr > |Z| t Approximation One-Sided Pr < Z Two-Sided Pr > |Z|
145.0000
-1.3459 0.0892 0.1783
0.0948 0.1895
Z includes a continuity correction of 0.5. Kruskal-Wallis Test Chi-Square DF Pr > Chi-Square
1.8756 1 0.1708
244
Common Statistical Methods for Clinical Research with SAS Examples
13.4 Details & Notes 13.4.1. The term ‘mean’ is used loosely in this chapter to refer to the appropriate location parameter. Because the mean is not always the best measure of a distribution’s center, the location parameter can refer to some other measure, such as one of the measures of central tendency described in Chapter 1 (Table 1.3). 13.4.2. For symmetric distributions, the population mean and median are the same. For skewed distributions with long tails to the right, the median is usually smaller than the mean and considered a better measure of the distributional ‘center’ or location. The geometric mean is also smaller than the arithmetic mean and is often used as a location parameter for exponentially distributed data. The Wilcoxon rank-sum test tests for a location or positional shift in distributions without the need to identify the best measure of central tendency. Thus, the parameter θ is simply a symbol generically denoting a distribution’s location. 13.4.3. Although you need not make assumptions regarding the actual distributions of the data, the Wilcoxon rank-sum test does assume that the two population distributions have the same shape and differ only by a possible shift in location. Thus, you assume the same dispersion, which is analogous to the assumption of variance homogeneity required of the two-sample t-test. Unlike the Wilcoxon signed-rank test, you need not assume the data come from a symmetric distribution. 13.4.4. The test statistic, R1, has a symmetric distribution about its mean, n1(N+1) / 2. Because of this symmetry, the one-tailed p-value is easily obtained by halving the two-tailed p-value. 13.4.5. The Mann-Whitney U-test is another non-parametric test for comparing location parameters based on two independent samples. Mathematically, it can be shown that the Mann-Whitney test is equivalent to the Wilcoxon rank-sum test. 13.4.6. Notice that the approximate test statistic, Z, can be computed from R2 instead of R1. When using R2, the mean and standard deviation (µR, σR) are computed by reversing n1 and n2 in the formulas given. 13.4.7. Another approximation to the Wilcoxon rank-sum test is the use of the two-sample t-test on the ranked data. In SAS, use PROC RANK to rank the pooled data from the two samples, then use PROC TTEST. A comparison between using the SAS procedures with the standard Wilcoxon rank-sum test is discussed by Conover and Iman (1981).
Chapter 13: The Wilcoxon Rank-Sum Test 245
13.4.8. The assumption of normality can be tested by using PROC UNIVARIATE in SAS. If this is rejected, the Wilcoxon rank-sum test is preferable to the two-sample t-test. Tests such as the Kolmogorov-Smirnov test, available by specifying the EDF option in PROC NPAR1WAY in SAS, can be used to compare the equality of two distributions. Significance indicates a difference in location, variability, or distributional ‘shapes’ between the two populations.
246
Common Statistical Methods for Clinical Research with SAS Examples
CHHAAPPTTEERR 1144 The Kruskal-Wallis Test 14.1 14.2 14.3 14.4
Introduction...................................................................................................... 247 Synopsis ............................................................................................................ 247 Examples........................................................................................................... 249 Details & Notes................................................................................................. 253
14.1 Introduction The Kruskal-Wallis test is a non-parametric analogue of the one-way ANOVA (Chapter 6). It is used to compare population location parameters (mean, median, etc.) among two or more groups based on independent samples. Unlike an ANOVA, the assumption of normally distributed responses is not necessary. The KruskalWallis test is an extension of the Wilcoxon rank-sum test (Chapter 13) for more than two groups, just as a one-way ANOVA is an extension of the two-sample t-test. The Kruskal-Wallis test is based on the ranks of the data. This test is often useful in clinical trials for comparing responses among three or more dose groups or treatment groups using samples of non-normally distributed response data.
14.2 Synopsis The data are collected as k (³2) independent samples of size n1, n2, ... nk, as shown in Table 14.1. TABLE 14.1. Layout for the Kruskal-Wallis Test
Group 1
Group 2
y11 y12 ... y1n1
y21 y22 ... y2n2
...
Group k
yk1 yk2 ... yknk
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
)RU L « N DQG M « Q OHW U UDQN RI \ RYHU WKH N FRPELQHG VDPSOHV)RUHDFKJURXSL «N FRPSXWH L
LM
LM
QL
Ê
5 L = ULM M= 7KH DYHUDJH UDQN RI DOO 1 Q Q Q REVHUYDWLRQV LV 5 1 7KHUHIRUH ZKHQ WKH QXOO K\SRWKHVLV LV WUXH WKH DYHUDJH UDQN IRU HDFK JURXS QDPHO\ 5 = 5 Q VKRXOG EH FORVH WR WKLV YDOXH DQG WKH VXPRIVTXDUHG GHYLDWLRQV
L
L
N
L
N
ÊQ 5 ±5
L
L
L= VKRXOGEHVPDOO
7KLVIRUPLVUHFRJQL]DEOHDVWKHIDPLOLDUµEHWZHHQJURXS¶VXPRIVTXDUHVXVHGLQ WKH DQDO\VLV RI YDULDQFH &KDSWHUV DQG DIWHU UHSODFLQJ WKH REVHUYHG YDOXHV ZLWK WKHLU UDQNV 7KH .UXVNDO:DOOLV WHVW VWDWLVWLF LV D IXQFWLRQ RI WKLV VXP RI VTXDUHVZKLFKVLPSOLILHVDOJHEUDLFDOO\WRWKHTXDQWLW\
K
N 5 ¬ L 1 ¸ 11 L O Q L ®
:KHQ+ LVWUXHK KDVDQDSSUR[LPDWHFKLVTXDUHGLVWULEXWLRQZLWKN±GHJUHHV RIIUHHGRP
$V ZLWK RWKHU UDQNLQJ SURFHGXUHV ZKHQ \RX KDYH WLHG GDWD YDOXHV WKH DYHUDJH UDQNRIWKHWLHGYDOXHVLVDVVLJQHGWRWKHFRUUHVSRQGLQJU YDOXHV6XSSRVHWKHUHDUH J FDWHJRULHV RI WLHG YDOXHV )RU WKH / VXFK FDWHJRU\ FRPSXWH F P P ± ZKHUHPLVWKHQXPEHURIWLHGYDOXHVIRUWKDWFDWHJRU\$VPDOODGMXVWPHQWFDQEH PDGHWRWKHWHVWVWDWLVWLFE\XVLQJWKHFRUUHFWLRQIDFWRU& F F FJ:LWK UHSUHVHQWLQJ WKH SRSXODWLRQ ORFDWLRQ SDUDPHWHU WKH WHVW LV VXPPDUL]HG DV VKRZQ QH[W LM
WK
/
&KDSWHU7KH.UXVNDO:DOOLV7HVW
QXOOK\SRWKHVLV
+
DOWK\SRWKHVLV
+
WHVWVWDWLVWLF
K
GHFLVLRQUXOH
UHMHFW+ LIK! cN - a
$
L
N
IRUDWOHDVWRQHSDLULM M
K
& ¬ 1 1 ®
cN - a UHSUHVHQWV WKH FULWLFDO FKLVTXDUHYDOXHEDVHGRQN±GHJUHHVRIIUHHGRP DQGDVLJQLILFDQFHOHYHORI &ULWLFDOFKLVTXDUHYDOXHVFDQEHIRXQGLQWDEOHVRI WKHFKLVTXDUHGLVWULEXWLRQRUE\XVLQJWKH352%&+,IXQFWLRQLQ6$6
([DPSOHV
p ([DPSOH3VRULDVLV(YDOXDWLRQLQ7KUHH*URXSV
$VWXG\FRPSDULQJDORZGRVH DQGDKLJKGRVH RIDQHZQRQ VWHURLGDODQWLSVRULDVLVPHGLFDWLRQZDVFRQGXFWHGXVLQJDSDUDOOHOGHVLJQ LQFOXGLQJDSODFHERJURXSDVFRQWURO7KLUW\WZRSDWLHQWVZHUHVWXGLHGIRU ZHHNVRIGDLO\WUHDWPHQW7KHSULPDU\HIILFDF\UHVSRQVHPHDVXUHZDVWKH GHJUHHRISVRULDWLFOHVLRQUHGXFWLRQDWVWXG\WHUPLQDWLRQUDWHGRQDQRUGLQDO VFDOHDVIROORZV &RGHG 5HVSRQVH &DWHJRU\
5HGXFWLRQLQ /HVLRQ6L]H
&RGHG 5HVSRQVH &DWHJRU\
5HGXFWLRQLQ /HVLRQ6L]H
%DVHGRQWKHGDWDVKRZQLQ7DEOHLVWKHUHDQ\GLIIHUHQFHLQUHVSRQVHDPRQJ WKHWKUHHJURXSV"
250
Common Statistical Methods for Clinical Research with SAS Examples
TABLE 14.2. Raw Data for Example 14.1 0.1% Solution
0.2% Solution
Placebo
Patient Number
Category Code
Patient Number
Category Code
Patient Number
Category Code
1
5
3
5
2
5
6
4
5
8
4
3
9
1
7
2
8
7
12
7
10
8
11
1
15
4
14
7
13
2
19
3
18
4
16
4
20
6
22
5
17
2
23
7
26
4
21
1
27
8
28
6
24
4
32
7
31
4
25
5
29
4
30
5
Solution This data set has an ordinal scale response with unequally spaced intervals. If you use a one-way ANOVA, the results might depend on the coding scheme used. Because the codes of 1 to 8 are arbitrary, the Kruskal-Wallis test is appropriate; the results do not depend on the magnitude of the coded values, only their ranks. The data can be summarized in a frequency table format that includes the ranks, as shown in Table 14.3. TABLE 14.3. Calculation of Ranks and Correction for Ties in Example 14.1
Ranks
Average Rank
Total Frequency (m)
c= 2 m(m -1)
2
1-3
2
3
24
1
2
4-6
5
3
24
1
0
1
7-8
7.5
2
6
4
2
3
3
9-16
12.5
8
504
26-50%
5
1
2
3
17-22
19.5
6
210
51-75%
6
1
1
0
23-24
23.5
2
6
76-99%
7
3
1
1
25-29
27
5
120
100%
8
1
2
0
30-32
31
3
24
10
10
12
32
C = 918
-- Frequencies -0.1% 0.2% Placebo
Response Category
Code
<0%
1
1
0
0%
2
0
1-10%
3
11-25%
&KDSWHU7KH.UXVNDO:DOOLV7HVW
7KHWDEOHRIUDQNVDQGWKHUDQNVXPVIRUHDFKJURXSDUHVKRZQLQ7DEOH 7$%/(5DQNVRIWKH'DWDLQ([DPSOH 6ROXWLRQ
6ROXWLRQ
3ODFHER
3DWLHQW 1XPEHU
&DWHJRU\ 5DQN
3DWLHQW 1XPEHU
&DWHJRU\ 5DQN
3DWLHQW 1XPEHU
&DWHJRU\ 5DQN
5 Q
5 Q
5 Q
&RPSXWHWKHXQDGMXVWHGWHVWVWDWLVWLFDV
K
¬ ¸ ®
DQGWKHWHVWLVVXPPDUL]HGDVIROORZV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ + QRW+
WHVWVWDWLVWLF
K
GHFLVLRQUXOH
UHMHFW+ LIK!
FRQFOXVLRQ
%HFDXVHLVQRW!WKHUHLVLQVXIILFLHQW HYLGHQFHWRUHMHFW+
$
¬ ¸ ®
252
Common Statistical Methods for Clinical Research with SAS Examples
SAS Analysis of Example 14.1 In the preceding test summary, the PROBCHI function in SAS can be used to obtain the p-value associated with the test statistic 4.473 based on 2 degrees of freedom namely, p = 0.1068 (=1 – PROBCHI(4.473,2)). The Kruskal-Wallis test can be performed in SAS by using PROC NPAR1WAY with the WILCOXON option, as shown in the SAS code for Example 14.1. SAS will automatically perform the Wilcoxon rank-sum test (Chapter 13) if there are only two groups and the Kruskal-Wallis test if there are more than two groups, as determined by the number of levels in the class variable. In this example, there are three treatment levels in the DOSE variable, which is used in the CLASS statement . The output shows the rank sums for each group and the chi-square statistic , which corroborate the manual calculations demonstrated. The p-value is also printed out by SAS. SAS Code for Example 14.1 DATA PSOR; INPUT DOSE $ PAT SCORE @@; DATALINES; 0.1 1 5 0.1 6 4 0.1 9 0.1 15 4 0.1 19 3 0.1 20 0.1 27 8 0.1 32 7 0.2 3 0.2 7 2 0.2 10 8 0.2 14 0.2 22 5 0.2 26 4 0.2 28 PBO 2 5 PBO 4 3 PBO 8 PBO 13 2 PBO 16 4 PBO 17 PBO 24 4 PBO 25 5 PBO 29 ;
1 6 5 7 6 7 2 4
0.1 0.1 0.2 0.2 0.2 PBO PBO PBO
12 23 5 18 31 11 21 30
7 7 8 4 4 1 1 5
PROC NPAR1WAY WILCOXON DATA = PSOR; CLASS DOSE; VAR SCORE; TITLE1 'The Kruskal-Wallis Test'; TITLE2 'Example 14.1: Psoriasis Evaluation in Three Groups’; RUN;
Chapter 14: The Kruskal-Wallis Test 253
OUTPUT 14.1. SAS Output for Example 14.1 Example 14.1:
The Kruskal-Wallis Test Psoriasis Evaluation in Three Groups The NPAR1WAY Procedure
Wilcoxon Scores (Rank Sums) for Variable SCORE Classified by Variable DOSE Sum of Expected Std Dev Mean DOSE N Scores Under H0 Under H0 Score ------------------------------------------------------------------0.1 10 189.50 165.0 24.249418 18.950000 0.2 10 194.00 165.0 24.249418 19.400000 PBO 12 144.50 198.0 25.327691 12.041667 Average scores were used for ties. Kruskal-Wallis Test Chi-Square DF Pr > Chi-Square
4.4737 2 0.1068
14.4 Details & Notes 14.4.1. When k=2, the Kruskal-Wallis chi-square value has 1 degree of freedom This test is identical to the normal approximation used for the Wilcoxon rank-sum test (Chapter 13). As noted in previous sections, a chisquare with 1 degree of freedom can be represented by the square of a standardized normal random variable (see Appendix B). For k=2, the h-statistic is the square of the Wilcoxon rank-sum Z-test (without the continuity correction). 14.4.2. The effect of adjusting for tied ranks is to slightly increase the value of the test statistic, h. Therefore, omission of this adjustment results in a more conservative test. 14.4.3. For small ni’s (< 5), the chi-square approximation might not be appropriate. Special tables for the exact distribution of h are available in many non-parametric statistics books. These tables can be used to obtain the appropriate rejection region for small samples. (For example, Nonparametric Statistical Methods by Hollander and Wolfe (1973) provides tables of the critical values for the Kruskal-Wallis test, the Wilcoxon rank-sum test, and the Wilcoxon signed-rank test.) Exact tests are also available in SAS by using the EXACT statement in PROC NPAR1WAY (see SAS documentation). Exact tests should be used in special cases when the distribution of the test statistic (h) is not well approximated with the chi-square, such as tests involving very small sample sizes or many ties.
254
Common Statistical Methods for Clinical Research with SAS Examples
14.4.4. When the distribution of the response data is unknown and cannot be assumed to be normal, the median (rather than the mean) is frequently used as a measure of location or central tendency. The median is computed as the middle value (for an odd number of observations) or the average of the two middle values (for an even number of observations). Approximate 95% confidence intervals for the median can be constructed using non-parametric methods developed by Hodges and Lehmann (see Hollander and Wolfe, 1973). 14.4.5. Another approximate test based on the ranks uses a one-way ANOVA on the ranked data. This method is often used instead of the KruskalWallis test because similar results can be expected. Use the following SAS statements to rank the data for Example 14.1, then perform the ANOVA. PROC RANK DATA = PSOR OUT = RNK; VAR SCORE; RANKS RNKSCORE; RUN; PROC GLM DATA = RNK; CLASS DOSE; MODEL RNKSCORE = DOSE / SS3; RUN;
The output (Output 14.2) shows the ANOVA results using the ranked data, which gives the p-value 0.1044 (compared with 0.1068 for the chi-square method). 14.4.6. When the Kruskal-Wallis test is significant, pairwise comparisons can be carried out with the Wilcoxon rank-sum test for each pair of groups. However, the multiplicity problem affecting the overall error rate must be considered for larger values of k. For example, the adjusted p-value method, such as a Bonferroni or Holm’s method using PROC MULTTEST in SAS, can be performed as described in Appendix E. Other techniques for handling multiple comparisons when using non-parametric tests are discussed in Hollander and Wolfe (1973).
Chapter 14: The Kruskal-Wallis Test 255
OUTPUT 14.2. Analysis of Example 14.1 Using a One-Way ANOVA on the Ranks
Example 14.1:
The Kruskal-Wallis Test Psoriasis Evaluation in Three Groups The GLM Procedure
Class Level Information Class DOSE
Levels 3
Values 0.1 0.2 PBO
Number of observations Dependent Variable: RNKSCORE
DF
Sum of Squares
Model 2 Error 29 Corrected Tota1 31
382.645833 2268.854167 2651.500000
Source
Source DOSE
32 Rank for Variable SCORE
Mean Square
F Value
191.322917 78.236351
2.45
Pr > F 0.1044
R-Square
Coeff Var
Root MSE
RNKSCORE Mean
0.144313
53.60686
8.845131
16.50000
DF
Type III SS
Mean Square
F Value
2
382.6458333
191.3229167
2.45
Pr > F 0.1044
14.4.7. When the k levels of the Group factor can be ordered, you might want to test for association between response and increasing levels of Group. The dose-response study is an example. If the primary question is whether increasing response is related to larger doses, the Jonckheere-Terpstra test for ordered alternatives can be used. This is a non-parametric test that shares the same null hypothesis with the Kruskal-Wallis test, but the alternative is more specific:
HA: θ1 ≤ θ2 ≤ … ≤ θk with at least one strict inequality (<).
256
Common Statistical Methods for Clinical Research with SAS Examples
The Jonckheere-Terpstra test can be performed by specifying the JT option in the TABLES statement when using PROC FREQ in SAS. The SAS statements for this test using Example 14.1 are shown in the program that follows. (Notice that you first create a new variable, DOS, so that the dose levels are in ascending order.) DATA JTTEST; SET PSOR; IF DOSE = 'PBO' THEN DOS = 0; ELSE DOS = DOSE; RUN; PROC FREQ DATA = JTTEST; TABLES DOS*SCORE / JT; RUN;
This test results in a p-value of 0.056 (compared with 0.107 for the KruskalWallis test), which illustrates the increased power when using this test for the more targeted alternative. The Jonckheere-Terpstra test uses a chi-square approximation for large samples. Exact tests are available (see Hollander and Wolfe, 1973), and these tests can be performed by using the EXACT statement in PROC FREQ in SAS. 14.4.8. As illustrated in 14.4.5, using ANOVA methods based on the ranktransformed data is a common way to analyze data when the parametric assumptions might not be fulfilled. This approach can also be used for a twoway layout by first ranking the observations from lowest to highest across all treatment groups within each block, then using the usual two-way ANOVA methods (Chapter 7) on these ranks to test for significant treatment effects. Such a technique is often substituted for Friedman's test, which is the nonparametric analog of the two-way ANOVA. This approach assumes multiple observations per cell (nij > 1). Other types of layouts can be analyzed using similar ranking methods (see Conover and Iman, 1981).
CHHAAPPTTEERR 1155 The Binomial Test 15.1 15.2 15.3 15.4
Introduction...................................................................................................... 257 Synopsis ............................................................................................................ 257 Examples........................................................................................................... 259 Details & Notes................................................................................................. 261
15.1 Introduction The binomial test is used to make inferences about a proportion or response rate based on a series of independent observations, each resulting in one of two possible mutually exclusive outcomes. The outcomes can be: response to treatment or no response, cure or no cure, survival or death, or in general, event or non-event. These observations are ‘binomial’ outcomes if the chance of observing the event of interest is the same for each observation. In clinical trials, a common use of the binomial test is for estimating a response rate, p, using the number of patients (X) who respond to an investigative treatment out of a total of n studied. Special cases of the binomial test include two commonly used tests in clinical data analysis, McNemar's test (Chapter 18) and the sign test (discussed later in this chapter).
15.2 Synopsis The experiment consists of observing n independent observations, each with one of two possible outcomes: event or non-event. For each observation, the probability of the event is denoted by p (0 < p < 1). The total number of events in n observations, X, follows the binomial probability distribution (Chapter 1). Intuitively, the sample proportion, X/n, would be a good estimate of the unknown population proportion, p. Statistically, it is the best estimate.
258
Common Statistical Methods for Clinical Research with SAS Examples
The general formula for a binomial probability is: Pr(X=x) =
n! ⋅ p x (1 – p) n – x x!(n – x)!
where x can take integer values 0, 1, 2, ..., n. The symbol ‘!’ is read ‘factorial’ and indicates multiplication by successively smaller integer values down to 1, i.e., a! = (a)·(a-1)·(a-2) ... (3)·(2)·(1). (For example, 5! = 5 x 4 x 3 x 2 x 1 = 120). You want to determine whether the population proportion, p, differs from a hypothesized value, p0. If the unknown proportion, p, equals p0, then the estimated proportion, X/n, should be close to p0, i.e., X should be close to n · p0. When p differs from p0, X might be much larger or smaller than n · p 0. Therefore, you reject the null hypothesis if X is ‘too large’ or ‘too small’. The test summary is: null hypothesis: alt. hypothesis:
H0: p = p0 HA: p =/ p0
test statistic:
X = the number of events in n observations
decision rule:
reject H0 if X £ XL or X ³ XU, where XL and XU are chosen to satisfy Pr(XL < X < XU) ³ 1–α
Tables of binomial probabilities found in most introductory statistical texts or the SAS function PROBBNML can be used to determine XL and XU. This method is demonstrated in Example 15.1. Another option, using a normal approximation, is discussed following Example 15.1.
Chapter 15: The Binomial Test 259
15.3 Examples p Example 15.1 -- Genital Wart Cure Rate
A company markets a therapeutic product for genital warts with a known cure rate of 40% in the general population. In a study of 25 patients with genital warts treated with this product, patients were also given high doses of vitamin C. As shown in Table 15.1, 14 patients were cured. Is this consistent with the cure rate in the general population? TABLE 15.1. Raw Data for Example 15.1 Patient Number
Cured ?
Patient Number
Cured ?
Patient Number
Cured ?
YES
8
NO
9
YES
15
NO
16
YES
22
YES
NO
23
NO
3
YES
10
NO
17
NO
24
YES
4
NO
11
YES
18
YES
25
YES
5
YES
12
NO
19
YES
6
YES
13
YES
20
NO
7
NO
14
NO
21
YES
Patient Number
Cured ?
1 2
Solution Let p represent the probability of a cure in any randomly selected patient with genital warts and treated with Vitamin C concomitantly with the company's product. You want to know if this unknown rate differs from the established rate of p0 = 0.4. Thus, null hypothesis: alt. hypothesis:
H0: p = 0.4 HA: p =/ 0.4
test statistic:
X = 14 (the number cured out of the n = 25 patients treated)
decision rule:
reject H0 if X £ 4 or X ³ 15
conclusion:
Do not reject the null hypothesis. Because 14 does not lie in the rejection region, you conclude that there is insufficient evidence from this study to indicate that concomitant Vitamin C treatment has an effect on the product's cure rate.
260
Common Statistical Methods for Clinical Research with SAS Examples
The decision rule is established by finding a range of X values such that the probability of X being outside that range when the null hypothesis is true is approximately equal to the significance level. With a nominal significance level of α = 0.05, the limits of the rejection region, 4 and 15, can be found from a table of binomial probabilities, satisfying Pr(X £ 4) + Pr(X ³ 15) £ 0.05. The PROBBNML function in SAS can also be used (see Section 15.4.4). The actual significance level is 0.044. Normal Approximation For large values of n and non-extreme values of p, a binomial response, X, can be approximated by a normal distribution with mean n×p and variance n×p×(1–p). The approximation improves as n gets larger or as p gets closer to 0.5. Based on this statistical principle, another way to establish the rejection region for α = 0.05 and large n is by computing ½
and
XL = (n × p0) – 1.96(n × p0 × (1 – p0))
½
XU = (n × p0) + 1.96(n × p0 × (1 – p0))
Applying these formulas to Example 15.1, you obtain
½
XL = (25 · 0.4) – 1.96(25(0.4)(0.6)) = 5.2 and ½
XU = (25 · 0.4) + 1.96(25(0.4)(0.6)) = 14.8 Because X can only take integer values, the rejection region becomes X £ 5 and X ³ 15. Because of the small sample size, this approximation produces a slightly larger lower rejection region and yields an actual significance level of 0.064 (based on exact binomial probabilities). A more common form of the normal approximation to the binomial test uses as the test statistic the actual estimate of p, namely pˆ = X/n. Under the assumption of approximate normality, the test summary becomes:
Chapter 15: The Binomial Test 261
null hypothesis: alt. hypothesis:
H0: p = p0 HA: p =/ p0 1 2n p 0 ⋅ (1 – p 0) n
| pˆ – p 0 | –
test statistic:
Z=
decision rule:
reject H0 if | Z | > Zα/2
Because the binomial distribution is a discrete distribution (i.e., takes integer values), while the normal distribution is continuous, the 1/(2n) in the numerator of the test statistic is used as a ‘continuity correction’ to improve the approximation. When H0 is true, Z has an approximate standard normal distribution. For the commonly used value of α = 0.05, Z0.025 = 1.96. In Example 15.1, the test statistic based on the normal approximation is
14 1 − 0.4 − 25 50 0.140 Z= = = 1.429 0.098 0.4 ⋅ 0.6 25
Because 1.429 is less than 1.96, the null hypothesis is not rejected.
15.4 Details & Notes 15.4.1. The normal approximation to the binomial distribution is a result of the Central Limit Theorem (Chapter 1). Let yi represent a binomial response for Patient i (i = 1, 2, …, n) with numeric values of 0 (‘non-event’) and 1 (‘event’). Note that X, the number of ‘events’ in n trials, is simply the sum of the yi’s. The probability of ‘event’, p, is simply the population mean of the distribution from which the yi’s have been selected. That distribution is the binomial distribution with mean, p, and variance, p·(1–p). Therefore, pˆ = X/n = y has an approximate normal distribution with mean p and variance p·(1-p)/n for large n, according to the Central Limit Theorem. The Z-test statistic shown above uses p0, the hypothesized value of the unknown parameter p because the test is conducted assuming H0 is true.
262
Common Statistical Methods for Clinical Research with SAS Examples
15.4.2. The normal approximation to the binomial is generally a good approximation if
and
( n ⋅ p)
– 2 n ⋅ p ⋅ (1 – p) ≥ 0
( n ⋅ p)
+ 2 n ⋅ p ⋅ (1 – p) ≤ n
or equivalently, if n ≥ 4 ´ max((p/(1–p)), ((1–p)/p)) The minimum sample size, n, satisfying this inequality is shown in Table 15.2 for various values of p. Larger n’s are required for response probabilities (p) closer to 0 or 1. TABLE 15.2. Minimum n Required by Normal Approximation as a Function of p
p
n
0.5
4
0.4 or 0.6
6
0.3 or 0.7
10
0.2 or 0.8
16
0.1 or 0.9
36
15.4.3. An approximate 95% confidence interval for estimating the proportion, p, is given by pˆ ± 1.96 ⋅
ˆ pˆ ⋅ (1 – p) n
where pˆ = X/n. With X/n = 14 / 25 = 0.56 in Example 15.1, a 95% confidence interval for p is 0.56 ± 1.96 ((0.56 x 0.44) / 25)½ = 0.56 ± 0.19 = (0.37 - 0.75)
Chapter 15: The Binomial Test 263
15.4.4. The SAS function PROBBNML can be used to obtain binomial probabilities for specified p and n. The actual significance level 0.044 in Example 15.1 is found by using the SAS statement PROBBNML(0.4,25,4) + (1 – PROBBNML(0.4,25,14)).
The PROBNORM function in SAS can be used with the normal approximation to obtain p-values. The approximate p-value for Example 15.1 with a Z-test statistic of 1.429 is found in SAS as p=2*(1 – PROBNORM(1.429) = 0.136. 15.4.5. When p0 = 0.5, the binomial test is sometimes called the sign test. A common application of the sign test is in testing for pre- to post-treatment changes, given information only about whether a measurement increases or decreases following treatment. The number of increases is a binomial random variable with p = 0.5 when the null hypothesis of no pre- to post-treatment changes is true. The UNIVARIATE procedure in SAS can be used to conduct the sign test (Chapter 18). 15.4.6. Because Example 15.1 tests for a difference from the hypothesized value in either direction, a two-tailed test is used. A one-tailed test would be used when you want to test whether the population proportion, p, is strictly greater than or strictly less than the threshold level, p0. Use the rejection region according to the alternative hypothesis as follows: Alternative Hypothesis
Corresponding Rejection Region
two-tailed
HA: p =/ po
reject H0 if Z > Zα/2 or Z < –Zα/2
one-tailed (right)
HA: p > p0
reject H0 if Z > Zα
one-tailed (left)
HA: p < p0
reject H0 if Z < –Zα
Type of Test
264
Common Statistical Methods for Clinical Research with SAS Examples
&++$$3377((55 7KH&KL6TXDUH7HVW ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
7KHFKLVTXDUHWHVWLVXVHGWRFRPSDUHWZRLQGHSHQGHQWELQRPLDOSURSRUWLRQVS DQGS,QWKHDQDO\VLVRIFOLQLFDOGDWDWKHELQRPLDOSURSRUWLRQW\SLFDOO\UHSUHVHQWV DUHVSRQVHUDWHFXUHUDWHVXUYLYDOUDWHDEQRUPDOLW\UDWHRUVRPHRWKHUµHYHQW¶UDWH DVLQWURGXFHGLQWKHSUHYLRXVFKDSWHU2IWHQ\RXZDQWWRFRPSDUHVXFKµUHVSRQVH¶ UDWHVEHWZHHQDWUHDWHGJURXSDQGDSDUDOOHOFRQWUROJURXS 7KH FKLVTXDUH WHVW LV DQ DSSUR[LPDWH WHVW ZKLFK PD\ EH XVHGZKHQWKHQRUPDO DSSUR[LPDWLRQ WR WKH ELQRPLDO GLVWULEXWLRQ LV YDOLG VHH &KDSWHU $ SRSXODU DOWHUQDWLYH)LVKHU VH[DFWWHVW&KDSWHU LVEDVHGRQH[DFWSUREDELOLWLHVDQGLWLV RIWHQDSSOLHGZKHQFRQGLWLRQVIRUXVLQJWKHFKLVTXDUHWHVWDUHQRWPHW
6\QRSVLV
2EVHUYDWLRQ LV PDGH RI ; UHVSRQGHUV RXW RI Q SDWLHQWV ZKR DUH VWXGLHG LQ RQH JURXSDQG;QRQUHVSRQGHUVRXWRIQSDWLHQWVLQDVHFRQGLQGHSHQGHQWJURXSDV VKRZQLQ7DEOH 7$%/(/D\RXWIRUWKH&KL6TXDUH7HVW
1XPEHURI 5HVSRQGHUV
1XPEHURI 1RQ5HVSRQGHUV
7RWDO
*URXS
;
Q±;
Q
*URXS
;
Q±;
Q
&RPELQHG
;;
1±;;
1 QQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$VVXPHWKDWHDFKRIWKHQ SDWLHQWVLQ*URXSLL KDYHWKHVDPHFKDQFHS RI UHVSRQGLQJ VR WKDW ; DQG ; DUH LQGHSHQGHQW ELQRPLDO UDQGRP YDULDEOHV VHH &KDSWHU 7KHJRDOLVWRFRPSDUHSRSXODWLRQµUHVSRQVH¶UDWHVS YVS EDVHG RQWKHVHVDPSOHGDWD&RPSXWH L
L
180
DQG
'(1
; × Q - ; × Q 1 Q × Q × ; ; × 1 - ; - ; 1
$VVXPLQJWKDWWKHQRUPDODSSUR[LPDWLRQWRWKHELQRPLDOGLVWULEXWLRQLVDSSOLFDEOH VHH6HFWLRQ WKHFKLVTXDUHWHVWVXPPDU\LV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
$
WHVWVWDWLVWLF
GHFLVLRQUXOH
+ S S + S S
180 '(1
UHMHFW+ LI !
7KHUHMHFWLRQUHJLRQLVIRXQGE\REWDLQLQJWKHFULWLFDOFKLVTXDUHYDOXHEDVHGRQ GHJUHHRIIUHHGRPGHQRWHGDV IURPFKLVTXDUHWDEOHV$SSHQGL[$ RUE\ XVLQJWKH6$6IXQFWLRQ&,19± GI )RU WKHFULWLFDOFKLVTXDUHYDOXHLV &,19 IURP6$6
7KLVFRPSXWLQJIRUPXODIRUWKHFKLVTXDUHVWDWLVWLFFDQEHVKRZQWREHHTXLYDOHQW WRWKHPRUHSRSXODUIRUP
=
å
L
=
(2 L
-( (
L
L
)
&KDSWHU7KH&KL6TXDUH7HVW
ZKHUHWKH2 VDQG( VDUHWKHREVHUYHGDQGH[SHFWHGFHOOIUHTXHQFLHVUHVSHFWLYHO\ DVVKRZQLQ7DEOH L
L
7$%/(2EVHUYHG2 DQG([SHFWHG( &HOO)UHTXHQFLHV L
2
(
;
Q ;; 1
;
Q ;; 1
Q±;
Q 1±;±; 1
Q±;
Q 1±;±; 1
L
L
([DPSOHV
p ([DPSOH$'5)UHTXHQF\ZLWK$QWLELRWLF7UHDWPHQW
$VWXG\ZDVFRQGXFWHGWRPRQLWRUWKHLQFLGHQFHRIJDVWURLQWHVWLQDO*, DGYHUVHGUXJUHDFWLRQVRIDQHZDQWLELRWLFXVHGLQORZHUUHVSLUDWRU\WUDFW LQIHFWLRQV/57, 7ZRSDUDOOHOJURXSVZHUHLQFOXGHGLQWKHVWXG\2QHJURXS FRQVLVWHGRI/57,SDWLHQWVUDQGRPL]HGWRUHFHLYHWKHQHZWUHDWPHQWDQGD UHIHUHQFHJURXSRI/57,SDWLHQWVUDQGRPL]HGWRUHFHLYHHU\WKURP\FLQ 7KHUHZHUHSDWLHQWVLQWKHWHVWJURXSDQGSDWLHQWVLQWKHFRQWURO HU\WKURP\FLQ JURXSZKRUHSRUWHGRQHRUPRUH*,FRPSODLQWVGXULQJGD\V RIWUHDWPHQW,VWKHUHHYLGHQFHRIDGLIIHUHQFHLQ*,VLGHHIIHFWUDWHVEHWZHHQ WKHWZRJURXSV"
6ROXWLRQ 'HILQH µUHVSRQVH¶ DV WKH HYHQW WKDW D SDWLHQW GHYHORSV RQH RU PRUH *, UHDFWLRQV GXULQJ WKH VWXG\ DQG OHW S DQG S UHSUHVHQW WKH SUREDELOLWLHV WKDW D UDQGRPO\ VHOHFWHG/57,SDWLHQWKDVVXFKDµUHVSRQVH¶ZKHQWUHDWHGZLWKWKHWHVWGUXJDQGWKH FRQWUROGUXJUHVSHFWLYHO\7KHGDWDDUHQRUPDOO\VXPPDUL]HGLQDFRQWLQJHQF\ WDEOHDVVKRZQLQ7DEOH
7$%/(5HVSRQVH)UHTXHQFLHVIRU([DPSOH
1XPEHURI 5HVSRQGHUV
1XPEHURI 1RQ5HVSRQGHUV
7RWDO
7HVW'UXJ
&RQWURO
&RPELQHG
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
&RPSXWH
180
DQG
'(1
× - × -
× × ×
$WDVLJQLILF DQFHOHYHORIWKHWHVWVXPPDU\LV
QXOOK\SRWKHVLV
+ S S
DOWK\SRWKHVLV
+ S S
WHVWVWDWLVWLF
GHFLVLRQUXOH
FRQFOXVLRQ
$
- UHMHFW+ LI !
%HFDXVH!\RXUHMHFW+ DQG FRQFOXGHWKHUHLVDVLJQLILFDQWGLIIHUHQFHLQWKH LQFLGHQFHRI*,DGYHUVHHIIHFWVEHWZHHQWUHDWPHQW JURXSVDWDOHYHORIVLJQLILFDQFH
7KH SYDOXH FDQ EH REWDLQHG IURP 6$6 E\ XVLQJ WKH 352%&+, IXQFWLRQ DV
S ±352%&+, ZKLFKUHWXUQVWKHYDOXHRI
6$6$QDO\VLVRI([DPSOH 7KH 6$6 FRGH IRU FRQGXFWLQJ WKH FKLVTXDUH WHVW XVLQJ 352& )5(4 LV VKRZQ QH[W7KH7$%/(6VWDWHPHQW åLGHQWLILHVWKHURZDQGFROXPQYDULDEOHVWKDWDUH XVHG WR IRUP WKH WDEOH 7KH URZ YDULDEOH VSHFLILHG ILUVW LQ WKH 7$%/(6 VWDWHPHQW LVWKHWUHDWPHQWJURXS*53 DQGWKHFROXPQYDULDEOHLVWKHUHVSRQVH 5(63 7KH&+,64RSWLRQLQWKH7$%/(6VWDWHPHQWVSHFLILHVWKDW\RXZDQWWKH FKLVTXDUHWHVW 7KH:(,*+7VWDWHPHQW VSHFLILHVWKDWWKHUHVSRQVHYDULDEOH&17UHSUHVHQWV WKHFHOOIUHTXHQFLHV,IWKHUHVSRQVHGDWDLH<(6RU12 DUHLQSXWLQGLYLGXDOO\ IRU HDFK SDWLHQW WKH &17 YDULDEOH LV QRW XVHG DQG WKH :(,*+7 VWDWHPHQW LV RPLWWHGIURP352&)5(4 2XWSXWFRQILUPVWKHFKLVTXDUHVWDWLVWLFRIZLWKDSYDOXHRIê
&KDSWHU7KH&KL6TXDUH7HVW
6$6&RGHIRU([DPSOH DATA ADR; INPUT GRP RESP $ CNT @@; DATALINES; 1 YES 22 1 _NO 44 2 YES 28 2 _NO 24 ; /* GRP 1 = Test Drug, GRP 2 = Control */ PROC FREQ DATA = ADR; TABLES GRP*RESP / CHISQ NOPERCENT NOROW; å WEIGHT CNT; TITLE1 ’The Chi-Square Test’; TITLE2 ’Example 16.1: ADR Frequency with Antibiotic Treatment’; RUN;
2873876$62XWSXWIRU([DPSOH
The Chi-Square Test Example 16.1: ADR Frequency with Antibiotic Treatment The FREQ Procedure Table of GRP by RESP GRP
RESP
Frequency‚ Row Pct | YES | NO | ---------+--------+--------+ 1 | 22 | 44 | | 33.33 | 66.67 | ---------+--------+--------+ 2 | 28 | 24 | | 53.85 | 46.15 | ---------+--------+--------+ Total 50 68
Total 66
52
118
Statistics for Table of GRP by RESP Statistic DF Value Prob -----------------------------------------------------Chi-Square 1 5.0119 ê0.0252 Likelihood Ratio Chi-Square 1 5.0270 0.0250 Continuity Adj. Chi-Square 1 4.2070 0.0403 Mantel-Haenszel Chi-Square 1 4.9694 0.0258 Phi Coefficient -0.2061 Contingency Coefficient 0.2018 Cramer's V -0.2061 Fisher's Exact Test ---------------------------------Cell (1,1) Frequency (F) 22 Left-sided Pr <= F 0.0201 Right-sided Pr >= F 0.9925 Table Probability (P) Two-sided Pr <= P Sample Size = 118
0.0125 0.0385
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
&KL6TXDUH7HVWVIRUWKHJU&RQWLQJHQF\7DEOH 7KLVGLVFXVVLRQDERXWWKHFKLVTXDUHWHVWVRIDUKDVIRFXVHGRQLWVVLPSOHVWIRUP QDPHO\WKDWRIDWDEOH,QVWDWLVWLFVFKLVTXDUHWHVWLVDJHQHULFUHIHUHQFHIRU DQ\LQIHUHQWLDOVWDWLVWLFWKDWKDVDFKLVTXDUHRUDSSUR[LPDWHFKLVTXDUHSUREDELOLW\ GLVWULEXWLRQ7KHUHDUHPDQ\VLWXDWLRQVWKDWLQYROYHDK\SRWKHVLVWKDWFDQEHWHVWHG ZLWK D FKLVTXDUH WHVW DQG VHYHUDO GLIIHUHQW IRUPV FDQ EH XVHG ZLWK IUHTXHQF\ WDEOHV $IUHTXHQF\WDEOHZLWKJURZVDQGUFROXPQVLVFDOOHGDµJUFRQWLQJHQF\WDEOH¶ 7KH URZ DQG FROXPQ YDULDEOHV FDQ EH QRPLQDO RU RUGLQDO 1RPLQDO OHYHOV DUH GHVFULSWLYH RUGLQDO OHYHOV DUH TXDQWLWDWLYH DQG KDYH D KLHUDUFKLFDO RUGHULQJ ([DPSOHVRIQRPLQDODQGRUGLQDOYDULDEOHVDUHVKRZQLQ7DEOH 7$%/(([DPSOHVRI1RPLQDODQG2UGLQDO9DULDEOHV
7\SH
9DULDEOH
/HYHOV&DWHJRULHV
1RPLQDO
5$&(
&DXFDVLDQ%ODFN+LVSDQLF2ULHQWDO
/(6,216,7(
/HJ%DFN$UP)DFH&KHVW
75($70(17
$FWLYH3ODFHER5HIHUHQFH
2UGLQDO
,03529(0(17
:RUVH1RQH6RPH0DUNHG&XUHG
6(9(5,7<
1RQH0LOG0RGHUDWH6HYHUH
'26$*(
PJPJPJPJ
,QFRPSDUDWLYHFOLQLFDOWULDOVLWLVRIWHQWKHFDVHWKDWWKHURZYDULDEOHUHIHUVWRD JURXSVXFKDVWUHDWPHQWJURXSRUGRVHJURXSDQGWKHFROXPQYDULDEOHUHIHUVWRD UHVSRQVHFDWHJRU\$W\SLFDOJUFRQWLQJHQF\WDEOHKDVWKHIROORZLQJOD\RXWZLWK ; UHSUHVHQWLQJWKHIUHTXHQF\RIHYHQWVLQ*URXSLDQG5HVSRQVH&DWHJRU\M LM
7$%/(/D\RXWIRUDJU&RQWLQJHQF\7DEOH
5HVSRQVH&DWHJRU\
*URXS
«
U
;
;
«
;U
;
;
«
;U
«
«
J
;J
;J
«
;JU
7KHPRVWFRPPRQIRUPRIWKHFKLVTXDUHWHVWIRUFRQWLQJHQF\WDEOHDQDO\VLVLVIRU WHVWLQJµJHQHUDODVVRFLDWLRQ¶EHWZHHQWKH*URXSDQG5HVSRQVHYDULDEOHV7KHQXOO K\SRWKHVLV LV WKDW WKHUH LV QR DVVRFLDWLRQ 7KH DOWHUQDWLYH K\SRWKHVLV LV WKDW WKH GLVWULEXWLRQ RYHU UHVSRQVH FDWHJRULHV GLIIHUV DPRQJ *URXS OHYHOV RU PRUH JHQHUDOO\ WKDW 5HVSRQVH LV µDVVRFLDWHG¶ ZLWK *URXS 7KLV LV WHVWHG E\ XVLQJ D YHUVLRQRIWKHFKLVTXDUHWHVWLQWKHIRUPDOUHDG\LQWURGXFHG
&KDSWHU7KH&KL6TXDUH7HVW
,I WKH 5HVSRQVH YDULDEOH LV RUGLQDO \RX FDQ FRPSDUH PHDQ UHVSRQVHV DPRQJ WKH *URXSOHYHOVE\DVVLJQLQJQXPHULFDOO\FRGHGYDOXHVFDOOHGµVFRUHV¶WRWKHUHVSRQVH FDWHJRULHV+HUHµWDEOH¶RUµLQWHJHU¶VFRUHVDUHXVHGZKLFKDUHVLPSO\WKHUDQNVRI WKH FDWHJRULHV ZKHQ RUGHUHG IURPVPDOOHVWWRODUJHVW,QWKLVFDVHWKHDOWHUQDWLYH K\SRWKHVLV LV WKDW PHDQVFRUHVGLIIHUDPRQJJURXSV$GLIIHUHQWIRUPRIWKHFKL VTXDUHWHVWFDQEHXVHGWRWHVWWKLVK\SRWKHVLV$PHDQVFRUHFDQEHFRPSXWHGDV WKHVXPRIDOOVFRUHVIRUDVSHFLILFURZLHVFRUHîIUHTXHQF\VXPPHGRYHUDOO VFRUHV GLYLGHGE\WKHURZWRWDO ,IWKH*URXSDQG5HVSRQVHYDULDEOHVDUHERWKRUGLQDO\RXFDQXVHDQRWKHUW\SHRI FKLVTXDUH VWDWLVWLF WR WHVW ZKHWKHU WKHUH LV D WUHQG RU FRUUHODWLRQ EHWZHHQ LQFUHDVLQJUHVSRQVHVFRUHVDQGWKH*URXSVFRUHV 7KH IROORZLQJ WKUHH FDVHV DUH WHVWHG XVLQJ WKH FKLVTXDUH WHVW IRU µJHQHUDO DVVRFLDWLRQ¶
EDVHGRQJ± ÂU± GHJUHHVRIIUHHGRPWKHFKLVTXDUHWHVWIRU
*$
µURZ PHDQ VFRUHV GLIIHU¶
EDVHG RQ J± GHJUHHV RI IUHHGRP DQG WKH FKL
06'
VTXDUH WHVW IRU µQRQ]HUR FRUUHODWLRQ¶ EDVHG RQ GHJUHH RI IUHHGRP UHVSHFWLYHO\ DV VKRZQ LQ 7DEOH 1RWLFH WKDW WKH µFRUUHODWLRQ¶ WHVW LV RQO\ PHDQLQJIXOZKHQERWKWKH5HVSRQVHDQG*URXSFDWHJRULHVDUHRUGLQDODQGµPHDQ VFRUHV¶ZLWKLQDURZPDNHVHQVHRQO\ZKHQWKH5HVSRQVHFDWHJRU\LVRUGLQDO
&25
7$%/(7\SHVRI&KL6TXDUH7HVWVIRU&RQWLQJHQF\7DEOHV
*URXS URZV
$OWHUQDWLYH+\SRWKHVLV *HQHUDO 5RZ0HDQ 1RQ=HUR $VVRFLDWLRQ 6FRUHV'LIIHU &RUUHODWLRQ
1RPLQDO
1RPLQDO
*$
1RPLQDO
2UGLQDO
*$
2UGLQDO
1RPLQDO
*$
2UGLQDO
2UGLQDO
*$
06'
06'
&25
7KH WDEOHV ZKLFK IROORZ LOOXVWUDWH SDWWHUQV RI FHOO IUHTXHQFLHV WKDW FDQ EH DVVRFLDWHG ZLWK WKH WKUHH DOWHUQDWLYH K\SRWKHVHV DERYH 7DEOH L VKRZV D VLJQLILFDQW JHQHUDO DVVRFLDWLRQ ZKLFKLQGLFDWHVDGLIIHUHQFHDPRQJJURXSVLQWKH GLVWULEXWLRQ RI UHVSRQVHV 7DEOH LL LQGLFDWHV D VLJQLILFDQW GLIIHUHQFH LQ PHDQ VFRUHVDPRQJJURXSV7KHPHDQVFRUHVDUHDQGIRU*URXSV$% DQG & UHVSHFWLYHO\ EDVHG RQ µWDEOH¶ VFRUHV $ SDWWHUQ UHVXOWLQJ LQ VLJQLILFDQW FRUUHODWLRQLVVKRZQLQ7DEOHLLL XQGHUWKHDVVXPSWLRQWKDWWKH*URXSIDFWRULV RUGLQDOZLWK$%DQG&UHSUHVHQWLQJLQFUHDVLQJYDOXHVHJLQFUHDVLQJGRVDJH 1RWLFH WKDW WKH RQO\ GLIIHUHQFH EHWZHHQ 7DEOHV LL DQG LLL LV WKDW WKH URZ IUHTXHQFLHV IRU WKH % DQG & JURXSV KDYH EHHQ H[FKDQJHG VR WKDW WKH URZ PHDQ VFRUHVLQFUHDVHZLWKLQFUHDVLQJ*URXSOHYHOV
5HVSRQVH FROXPQV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7DEOHL *HQHUDO$VVRFLDWLRQ *URXS
5HVSRQVH
$
%
&
7HVW
S9DOXH
*$
06'
&25
7DEOHLL 5RZ0HDQ6FRUHV'LIIHU *URXS
5HVSRQVH
$
%
&
7HVW
S9DOXH
*$
7HVW
S9DOXH
06'
&25
7DEOHLLL 1RQ=HUR&RUUHODWLRQ *URXS
5HVSRQVH
$
%
&
*$
06'
&25
7KHFKLVTXDUHVWDWLVWLFIRUJHQHUDODVVRFLDWLRQFDQEHFRPSXWHGLQDZD\VLPLODU WR WKDW GHPRQVWUDWHG IRU WKH WDEOH 7KH PHDQ VFRUHV FKLVTXDUH YDOXH LV D IXQFWLRQRIWKHVDPSOHJURXSPHDQVDQGYDULDQFHVDQGWKHFRUUHODWLRQVWDWLVWLFLV FRPSXWDWLRQDOO\ D PXOWLSOH RI WKH FRUUHODWLRQ FRHIILFLHQW 7KHVH FDQ HDVLO\ EH REWDLQHGLQ6$6XVLQJWKH&0+RSWLRQLQWKH7$%/(6VWDWHPHQWLQ352&)5(4 DV GHPRQVWUDWHG LQ WKH FRGH IRU WKH VDPSOH GDWD LQ 7DEOHV L LL DQG LLL WKDW IROORZV
&KDSWHU7KH&KL6TXDUH7HVW
([DPSOH6$6&RGHIRU&KL6TXDUH7HVWVIRU7DEOHVL LL DQGLLL 8VLQJ6$6
DATA G_BY_R; INPUT TBL GROUP $ RESP $ CNT @@; DATALINES; 1 A 1 10 1 A 2 15 1 A 3 5 1 B 1 10 1 B 2 10 1 B 3 10 1 C 1 10 1 C 2 5 1 C 3 15 2 A 1 15 2 A 2 10 2 A 3 5 2 B 1 5 2 B 2 10 2 B 3 15 2 C 1 10 2 C 2 10 2 C 3 10 3 A 1 15 3 A 2 10 3 A 3 5 3 B 1 10 3 B 2 10 3 B 3 10 3 C 1 5 3 C 2 10 3 C 3 15 ; ODS SELECT CMH; PROC FREQ ORDER = DATA DATA = G_BY_R; BY TBL; TABLES GROUP*RESP / CMH NOPERCENT NOCOL; WEIGHT CNT; TITLE1 ’Chi-Square Test for 3-by-3 Table’; RUN;
287387&0+7HVWVIRUE\7DEOHV
Chi-Square Test for 3-by-3 Table The FREQ Procedure Summary Statistics for GROUP by RESP Cochran-Mantel-Haenszel Statistics (Based on Table Scores) ------------------------------- TBL=1 ------------------------------Statistic Alternative Hypothesis DF Value Prob --------------------------------------------------------------1 Nonzero Correlation 1 2.4722 0.1159 2 Row Mean Scores Differ 2 2.4722 0.2905 3 General Association 4 9.8889 0.0423 ------------------------------- TBL=2 ------------------------------Statistic Alternative Hypothesis DF Value Prob --------------------------------------------------------------1 Nonzero Correlation 1 2.4722 0.1159 2 Row Mean Scores Differ 2 9.8889 0.0071 3 General Association 4 9.8889 0.0423 ------------------------------- TBL=3 ------------------------------Statistic Alternative Hypothesis DF Value Prob --------------------------------------------------------------1 Nonzero Correlation 1 9.8889 0.0017 2 Row Mean Scores Differ 2 9.8889 0.0071 3 General Association 4 9.8889 0.0423
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
%LQRPLDO5HVSRQVHZLWK0RUH7KDQ7ZR*URXSV 7KH ELQRPLDO UHVSRQVH U IRU PRUH WKDQ WZR JURXSV J ! LV D VSHFLDO EXW IUHTXHQWO\HQFRXQWHUHGFDVHRIWKHJUWDEOHMXVWGLVFXVVHG,QWKLVFDVHWKHQXOO K\SRWKHVLVLVWKDWUHVSRQVHUDWHVDUHWKHVDPHDPRQJDOOJURXSVS S « SJ
2QH DOWHUQDWLYH K\SRWKHVLV LV WKDW UHVSRQVH UDWHV GLIIHU DPRQJ JURXSV :LWK ELQRPLDO UHVSRQVHV WKH K\SRWKHVLV RI µJHQHUDO DVVRFLDWLRQ¶ LV HTXLYDOHQW WR WKH K\SRWKHVLVRIµURZPHDQVFRUHVGLIIHU¶EHFDXVHWKHURZPHDQLVDOLQHDUIXQFWLRQRI WKHSUREDELOLW\RIUHVSRQVHDVELQRPLDOUHVSRQVHFDWHJRULHVFDQDOZD\VEHFRGHG µQRUHVSRQVH¶ DQGµUHVSRQVH¶ :KHQWKH*URXSIDFWRULVRUGLQDOWKHWHVWIRU µQRQ]HURFRUUHODWLRQ¶LVDFWXDOO\DOLQHDUµWUHQG¶WHVW7KLVFDQEHXVHGWRWHVWIRU GRVHUHVSRQVHIRUH[DPSOHZKHQWKH*URXSIDFWRUUHSUHVHQWVGRVHOHYHO
p ([DPSOH&RPSDULVRQRI'URSRXW5DWHVIRU'RVH*URXSV
7KHIROORZLQJWDEOHVXPPDUL]HVGURSRXWUDWHVIRUSDWLHQWVUDQGRPL]HGWR GRVHJURXSVLQDFOLQLFDOVWXG\L ,VWKHUHDGLIIHUHQFHLQGURSRXWUDWHV DPRQJJURXSVDQGLL GRGURSRXWUDWHVKDYHDOLQHDUFRUUHODWLRQZLWK GRVDJH"
7$%/()UHTXHQF\7DEOHIRU([DPSOH
'URSRXW"
*URXS
<(6
12
PJ
PJ
PJ
PJ
6$6$QDO\VLVRI([DPSOH
7KH&0+RSWLRQLQWKH7$%/(6VWDWHPHQWSURGXFHVWKHRXWSXWHQWLWOHGµ&RFKUDQ 0DQWHO+DHQV]HO6WDWLVWLFV¶ZKLFKVKRZVWKHWHVWVIRUµJHQHUDODVVRFLDWLRQ¶µURZ PHDQ VFRUHV GLIIHU¶ DQG µQRQ]HUR FRUUHODWLRQ¶ 2XWSXW
*$
06'
)URP WKLV \RX FDQ FRQFOXGH WKDW UHVSRQVH UDWHV DUH QRW VLJQLILFDQWO\ GLIIHUHQW DPRQJ WKH GRVH JURXSV +RZHYHU WKH FKLVTXDUH WHVW IRU QRQ]HUR FRUUHODWLRQ LV EDVHG RQ GHJUHH RI IUHHGRP ZLWK D SYDOXHRI ñZKLFK LQGLFDWHV D VLJQLILFDQW WUHQG LQ UHVSRQVH UDWHV ZLWK LQFUHDVLQJ GRVDJH ,W¶V QRW XQXVXDOWREHXQDEOHWRGHWHFWDGLIIHUHQFHLQUHVSRQVHUDWHVDPRQJJURXSV\HWILQG WKDWDGRVHUHVSRQVHWUHQGH[LVWVDVWKLVH[DPSOHLOOXVWUDWHV
&25
&KDSWHU7KH&KL6TXDUH7HVW
7KH FKLVTXDUH WHVW SURGXFHG E\ WKH &+,64 RSWLRQ LQWKH7$%/(6VWDWHPHQWLV DOVRDWHVWRIµJHQHUDODVVRFLDWLRQ¶ò7KLVLVVLPSO\DPXOWLSOH11± RI DQGWKHWZRIRUPVDUHHVVHQWLDOO\HTXLYDOHQWIRUODUJH17KH0DQWHO+DHQV]HOFKL
*$
VTXDUHWHVWôLVHTXLYDOHQWWR
ñLQDQRQVWUDWLILHGOD\RXWDVSUHVHQWHGKHUH
&25
6$6&RGHIRU([DPSOH
DATA DORATE; INPUT DOSE_MG DATALINES; 10 YES 5 10 _NO 20 YES 6 20 _NO 40 YES 10 40 _NO 80 YES 12 80 _NO ;
RESP $ CNT @@; 35 29 28 27
PROC FREQ DATA = DORATE; TABLES DOSE_MG*RESP / CHISQ CMH TREND NOPERCENT NOCOL; WEIGHT CNT; TITLE1 ’The Chi-Square Test’; TITLE2 ’Example 16.2: Comparison of Dropout Rates for 4 Dose Groups’; RUN;
2873876$62XWSXWIRU([DPSOH The Chi-Square Test Example 16.2: Comparison of Dropout Rates for 4 Dose Groups The FREQ Procedure Table of DOSE_MG by RESP DOSE_MG RESP Frequency‚ Row Pct |YES |_NO | ---------+--------+--------+ 10 | 5 | 35 | | 12.50 | 87.50 | ---------+--------+--------+ 20 | 6 | 29 | | 17.14 | 82.86 | ---------+--------+--------+ 40 | 10 | 28 | | 26.32 | 73.68 | ---------+--------+--------+ 80 | 12 | 27 | | 30.77 | 69.23 | ---------+--------+--------+ Total 33 119
Total 40
35
38
39
152
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Statistics for Table of DOSE_MG by RESP Statistic DF Value Prob -----------------------------------------------------Chi-Square 3 4.7831 0.1884 Likelihood Ratio Chi-Square 3 4.9008 0.1792 Mantel-Haenszel Chi-Square 1 4.2171 0.0400 Phi Coefficient 0.1774 Contingency Coefficient 0.1747 Cramer’s V 0.1774
Cochran-Armitage Trend Test -------------------------Statistic (Z) 2.0603 One-sided Pr > Z 0.0197 Two-sided Pr > |Z| 0.0394
ò ô
õ
Sample Size = 152 Summary Statistics for DOSE_MG by RESP Cochran-Mantel-Haenszel Statistics (Based on Table Scores) Statistic Alternative Hypothesis DF Value Prob --------------------------------------------------------------1 Nonzero Correlation 1 4.2171 0.0400 2 Row Mean Scores Differ 3 4.7516 0.1909 3 General Association 3 4.7516 0.1909
ñ
Total Sample Size = 152
0XOWLQRPLDO5HVSRQVHV &RPSDULVRQ RI WZR JURXSV J ZKHQ WKHUH DUH PRUH WKDQ WZR UHVSRQVH FDWHJRULHV U ! LV DOVR D VSHFLDO FDVH RI WKH JU WDEOH WKDW ZDV GLVFXVVHG SUHYLRXVO\,QWKLVFDVH\RXXVHDUWDEOHZKHUHU! LVWKHQXPEHURIUHVSRQVH OHYHOV )RU H[DPSOH LI SDWLHQWV UDQGRPL]HG WR DFWLYH DQG SODFHER JURXSV KDG UHVSRQVHVFDWHJRUL]HGDVµQRQH¶µSDUWLDO¶RUµFRPSOHWH¶WKHWDEOHZRXOGEH 7$%/(/D\RXWIRUDHU&RQWLQJHQF\7DEOH
5HVSRQVH
*URXS
1RQH
3DUWLDO
&RPSOHWH
$FWLYH
;
;
;
3ODFHER
;
;
;
&KDSWHU7KH&KL6TXDUH7HVW
7KH K\SRWKHVLV RI µJHQHUDO DVVRFLDWLRQ¶ LV HTXLYDOHQW WR WKH K\SRWKHVLV WKDW WKH GLVWULEXWLRQRIUHVSRQVHVGLIIHUVEHWZHHQJURXSV7KLVWHVWFDQEHXVHGUHJDUGOHVV RI ZKHWKHU WKH UHVSRQVH OHYHOV DUH RUGLQDO RU QRPLQDO :KHQ WKH UHVSRQVH LV RUGLQDO WKH µURZ PHDQ VFRUHV GLIIHU¶ K\SRWKHVLV LV HTXLYDOHQW WR WKH µQRQ]HUR FRUUHODWLRQ¶K\SRWKHVLVLQWKHUWDEOH
p ([DPSOH$FWLYHYV3ODFHER&RPSDULVRQVRI'HJUHHRI5HVSRQVH
7KHIROORZLQJWDEOHFRQWDLQVWKHQXPEHURISDWLHQWVZKRVKRZFRPSOHWH SDUWLDORUQRUHVSRQVHDIWHUWUHDWPHQWZLWKHLWKHUDFWLYHPHGLFDWLRQRUD SODFHER,VWKHUHDGLIIHUHQFHEHWZHHQWUHDWPHQWJURXSVLQUHVSRQVH GLVWULEXWLRQV"
7$%/()UHTXHQF\7DEOHIRU([DPSOH
5HVSRQVH
*URXS
1RQH
3DUWLDO
&RPSOHWH
$FWLYH
3ODFHER
6$6$QDO\VLVRI([DPSOH
7KH&0+RSWLRQLQWKH7$%/(6VWDWHPHQWLQ352&)5(4UHTXHVWVWKHFKL VTXDUHWHVWVIRUJHQHUDODVVRFLDWLRQDQGOLQHDUFRUUHODWLRQDVVKRZQLQWKH6$6 FRGHWKDWIROORZV,Q2XWSXW\RXVHHDFKLVTXDUHYDOXHRIEDVHG RQ GHJUHH RI IUHHGRP DQG D SYDOXH RI ù ZKLFK LQGLFDWHV D VLJQLILFDQW FRUUHODWLRQ RU µWUHQG¶ EHWZHHQ WUHDWPHQW JURXS DQG UHVSRQVH 7KH WHVWIRUJHQHUDODVVRFLDWLRQ 11 LVQRQVLJQLILFDQWS ZKLFKLQGLFDWHVD ODFNRIHYLGHQFHRIGLIIHULQJUHVSRQVHGLVWULEXWLRQVEHWZHHQWKHDFWLYHDQGWKH SODFHERJURXSV
6$6&RGHIRU([DPSOH DATA MULTI; INPUT TRT $ RESP $ CNT @@; DATALINES; ACT NONE 16 ACT PART 26 ACT FULL 29 PBO NONE 24 PBO PART 26 PBO FULL 18 ; PROC FREQ ORDER = DATA DATA = MULTI; TABLES TRT*RESP / CMH CHISQ TREND NOPERCENT NOCOL; WEIGHT CNT; TITLE1 ’The Chi-Square Test; TITLE2 ’Example 16.3: Active vs. Placebo Comparison of Degree of Response’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
1RWLFHWKDWWKHGHIDXOWVFRULQJV\VWHPLQ352&)5(4LVWKHµWDEOHV¶VFRUHV,I\RX ZDQWWRH[SOLFLWO\LQFOXGHWKHVFRUHVSHFLI\6&25(6 7$%/(6
2873876$62XWSXWIRU([DPSOH
The Chi-Square Test Example 16.3: Active vs. Placebo Comparison of Degree of Response The FREQ Procedure Table of TRT by RESP TRT
RESP
Frequency| Row Pct |NONE |PART |FULL | ---------+--------+--------+--------+ ACT | 16 | 26 | 29 | | 22.54 | 36.62 | 40.85 | ---------+--------+--------+--------+ PBO | 24 | 26 | 18 | | 35.29 | 38.24 | 26.47 | ---------+--------+--------+--------+ Total 40 52 47
Total 71
68
139
Statistics for Table of TRT by RESP Statistic DF Value Prob -----------------------------------------------------Chi-Square 2 4.1116 0.1280 Likelihood Ratio Chi-Square 2 4.1446 0.1259 Mantel-Haenszel Chi-Square 1 4.0727 0.0436 Phi Coefficient 0.1720 Contingency Coefficient 0.1695 Cramer’s V 0.1720
Cochran-Armitage Trend Test -------------------------Statistic (Z) 2.0254 One-sided Pr > Z 0.0214 Two-sided Pr > |Z| 0.0428
12
Sample Size = 139
Summary Statistics for TRT by RESP Cochran-Mantel-Haenszel Statistics (Based on Table Scores) Statistic Alternative Hypothesis DF Value Prob --------------------------------------------------------------1 Nonzero Correlation 1 4.0727 0.0436 ~ 2 Row Mean Scores Differ 1 4.0727 0.0436 3 General Association 2 4.0821 0.1299 11
Total Sample Size = 139
&KDSWHU7KH&KL6TXDUH7HVW
'HWDLOV 1RWHV
$Q HTXLYDOHQW WHVW WR WKH FKLVTXDUH WHVW IRU FRPSDULQJ WZR ELQRPLDOSURSRUWLRQVLVEDVHGRQWKH=WHVWVWDWLVWLFDVIROORZV
SÖ -SÖ
=
S - S × Q Q
; Q
; + ; Q + Q
SÖ = S= ZKHUHDQG L
L
L
:KHQ + LV WUXH DQG YDOXHV IRU Q DQG Q DUH ODUJH = KDV DQ DSSUR[LPDWH VWDQGDUGQRUPDOGLVWULEXWLRQDQG\RXUHMHFW+ LI_ = _!=
%HFDXVHWKHVTXDUHRIDVWDQGDUGQRUPDOYDULDEOHLVDFKLVTXDUHZLWKGHJUHH RIIUHHGRP= KDVDFKLVTXDUHGLVWULEXWLRQDQGFDQEHVKRZQDOJHEUDLFDOO\WR EH HTXLYDOHQW WR WKH FKLVTXDUH WHVW VWDWLVWLF DOUHDG\ LQWURGXFHG ,Q ([DPSOH WKHUHVSRQVHUDWHHVWLPDWHVDUHIRXQGDVIROORZV
7$%/((VWLPDWLQJ5HVSRQVH5DWHVIRU([DPSOH
7HVW*URXS L
&RQWURO *URXSL
&RPELQHG
;L
QL
(VWRIS
DQGWKH=WHVWVWDWLVWLFLV
=
- × ×
±
%HFDXVH!\RXUHMHFW+ EDVHGRQWKH=VWDWLVWLFDWDOHYHORI VLJQLILFDQFH
1RWLFHWKDW= ± LVWKHFKLVTXDUHYDOXHWKDWZDVREWDLQHG SUHYLRXVO\ 7KH FULWLFDO FKLVTXDUH YDOXH ZLWK GHJUHH RIIUHHGRPIURPWKH FKLVTXDUH WDEOHV FRUUHVSRQGV WR WKH VTXDUH RI WKH FULWLFDO = YDOXH IURP WKH QRUPDO WDEOHV )RU WKH FULWLFDO YDOXH LV =
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,QFRQGXFWLQJDWZRVDPSOHELQRPLDOFRPSDULVRQXVLQJHLWKHUWKH FKLVTXDUH WHVW RU WKH =VWDWLVWLF WKH QRUPDO DSSUR[LPDWLRQ WR WKH ELQRPLDO GLVWULEXWLRQPXVWEHYDOLG7KHWHVWVGHVFULEHGKHUHPLJKWQRWEHYDOLGLI; RU Q ±; LVVPDOOIRUL RU*HQHUDOO\WKHDQDO\VWVKRXOGEHZDU\RIUHVXOWV WKDWXVHWKLVDSSUR[LPDWHWHVWLIDQ\FHOOIUHTXHQF\LVOHVVWKDQ)LVKHU VH[DFW WHVW &KDSWHU PLJKW EH DSSOLFDEOH IRU FDVHV WKDW LQYROYH VPDOO FHOO IUHTXHQFLHV L
L
7KHQRUPDODSSUR[LPDWLRQWRWKHELQRPLDOLVXVXDOO\DJRRGDSSUR[LPDWLRQLI
Q × S - Q × S × - S
L
L
L
L
L
L
³
DQG
Q × S Q × S × - S L
L
L
L
L
£
Q L
IRUL DQGL VHH6HFWLRQ )RUVPDOOYDOXHVRIQ DQGQ WKHQRUPDODSSUR[LPDWLRQFDQEHLPSURYHGE\ DGMXVWLQJWKHWHVWVWDWLVWLFZLWKDµFRQWLQXLW\FRUUHFWLRQ¶& Q Q 7KH DGMXVWPHQW LV PDGH E\ VXEWUDFWLQJ & IURP WKH QXPHUDWRU RI WKH =WHVW VWDWLVWLFLIWKHQXPHUDWRULVJUHDWHUWKDQRUE\DGGLQJ&WRWKHQXPHUDWRURI= LILWLVOHVVWKDQ
,Q([DPSOHWKH=VWDWLVWLFXVLQJWKHFRQWLQXLW\FRUUHFWLRQLV
=
× ×
-
-
7KH FRQWLQXLW\FRUUHFWHG FKLVTXDUH YDOXH LV WKH VTXDUH RI WKLV =YDOXH ± 7KLVDGMXVWHGYDOXHLVVKRZQLQ2XWSXWZLWKDSYDOXH RI $QDSSUR[LPDWHFRQILGHQFHLQWHUYDOIRUWKHSURSRUWLRQS LV JLYHQE\
L
SÖ L
×
SÖ × SÖ L
L
Q
ZKHUH SÖ L
L
; Q L
$Q DSSUR[LPDWH FRQILGHQFH LQWHUYDO IRU WKH GLIIHUHQFH LQ SURSRUWLRQV S ±S LVJLYHQE\
±
SÖ - SÖ
±
×
SÖ × - SÖ Q
SÖ × - SÖ Q
&KDSWHU7KH&KL6TXDUH7HVW
:LWK ; Q DQG ; Q LQ ([DPSOH D FRQILGHQFHLQWHUYDOIRUS ±S LV
æ
± ±WR±
%HFDXVH ([DPSOH WHVWV IRU D GLIIHUHQFH IURP WKH K\SRWKHVL]HG YDOXH LQ HLWKHU GLUHFWLRQ D WZRWDLOHG WHVW LV XVHG $ RQHWDLOHG WHVW ZRXOG EH XVHG LI \RX ZDQW WR WHVW ZKHWKHU RQH SRSXODWLRQ SURSRUWLRQ LV VWULFWO\ JUHDWHU WKDQ RU VWULFWO\ OHVV WKDQ WKH RWKHU 8VH WKH UHMHFWLRQ UHJLRQ DFFRUGLQJWRWKHDOWHUQDWLYHK\SRWKHVLVDVIROORZV
× × ö -± ×ç ÷ è ø
7\SHRI7HVW
$OWHUQDWLYH +\SRWKHVLV
&RUUHVSRQGLQJ 5HMHFWLRQ5HJLRQ
WZRWDLOHG
+ S S
UHMHFW+ LI=!= RU==
RQHWDLOHGULJKW
+ S!S
UHMHFW+ LI=!=
RQHWDLOHGOHIW
+ SS
UHMHFW+ LI==
$
$
$
1RWLFH WKDW WKH FKLVTXDUH WHVW LV D WZRWDLOHG WHVW EHFDXVH ODUJH FKLVTXDUH YDOXHV ZLWKLQ WKH UHMHFWLRQ UHJLRQ RFFXU ZKHQ = LV D YHU\ ODUJH SRVLWLYH RU ODUJH QHJDWLYH YDOXH 7R FRQGXFW D RQHWDLOHG FKLVTXDUH WHVW WKH QRPLQDO VLJQLILFDQFHOHYHO VKRXOGEHGRXEOHGZKHQORRNLQJXSWKHFULWLFDOFKLVTXDUH YDOXHV 7KH FULWLFDO YDOXHV IRU D UHMHFWLRQ UHJLRQ WKDW FRUUHVSRQG WR D VLJQLILFDQFHOHYHORI DUH 7HVW
2QH7DLOHG
7ZR7DLOHG
=7HVW
&KL6TXDUH
7KHSYDOXHIRUDFKLVTXDUHWHVWLQWKH6$6RXWSXWFRUUHVSRQGVWRDWZRWDLOHG WHVW7KHRQHWDLOHGSYDOXHLVIRXQGE\KDOYLQJWKLVYDOXH )RU WKH FDVH RI Q Q DPHWKRGIRUHVWLPDWLQJVDPSOHVL]HVWR GHWHFWDGLIIHUHQFHLQUHVSRQVHUDWHV S ±S LVVKRZQLQ&KDSWHU
,Q DGGLWLRQ WR WKH FKLVTXDUH WHVW IRU µQRQ]HUR FRUUHODWLRQ¶ WKH &RFKUDQ$UPLWDJH WHVW IRU WUHQG FDQ DOVR EH XVHG WR WHVW IRU OLQHDU FRUUHODWLRQEHWZHHQDELQRPLDOUHVSRQVHDQGDQRUGLQDO*URXSYDULDEOH7KLV WHVWLVEDVHGRQD=WHVWVWDWLVWLFWKDWKDVWKHVWDQGDUGQRUPDOGLVWULEXWLRQIRU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
ODUJH17KH0DQWHO+DHQV]HOFKLVTXDUHDQGWKH&RFKUDQ$UPLWDJHWHVWIRU WUHQG FDQ EH XVHG LQWHUFKDQJHDEO\ ZKHQ 5HVSRQVH DQG *URXS DUH RUGLQDO IDFWRUVDQGRQHRIWKHVHIDFWRUVKDVRQO\WZROHYHOV:KHQWKHWRWDOVDPSOH VL]H 1 LV ODUJH WKH\ DUH HTXLYDOHQW 7KH 0DQWHO+DHQV]HO FKLVTXDUH LV VLPSO\= DGMXVWHGE\DIDFWRURI1± 1
)RU([DPSOHWKH&RFKUDQ$UPLWDJHWHVWUHVXOWVDUHSULQWHGE\XVLQJWKH 75(1' RSWLRQ LQ WKH 7$%/(6 VWDWHPHQW LQ 352& )5(4 2XWSXW VKRZV D = VWDWLVWLF RI õ :LWK 1 WKH 0DQWHO+DHQV]HO FKL VTXDUH FDQ EH FRQILUPHG DV Â 7KH &RFKUDQ $UPLWDJHWHVWLVDOVRSULQWHGLQ2XWSXW 12
:KHQ FRPSDULQJ ELQRPLDO SURSRUWLRQV DPRQJ PRUH WKDQ WZR JURXSV D VLJQLILFDQWWHVWIRUJHQHUDODVVRFLDWLRQZRXOGLPSO\WKDWUHVSRQVH UDWHVGLIIHUIRUDWOHDVWRQHSDLURIJURXSOHYHOV0XOWLSOHFRPSDULVRQVXVLQJ DVHULHVRIWDEOHVFDQEHXVHGWRGHWHUPLQHZKLFKSDLUVDFWXDOO\GLIIHU 7KLVPLJKWUHTXLUHPXOWLSOLFLW\DGMXVWPHQWVIRUPXOWLSOHFRPSDULVRQVZKLFK DUHGLVFXVVHGLQ$SSHQGL[( :KHQ WKHUH DUH RQO\ WZR FDWHJRULHV LH J RU U WKH VFRUHV RI DQG FDQ EH DVVLJQHG WR WKH OHYHOV UHJDUGOHVV RI ZKHWKHU WKH IDFWRU LV QRPLQDO RU RUGLQDO 7KHUHIRUH ZKHQ WKHUH DUH RQO\ WZR JURXSV J WKHDOWHUQDWLYHK\SRWKHVLVRIµURZPHDQVFRUHVGLIIHU¶LVHTXLYDOHQWWR WKH µQRQ]HUR FRUUHODWLRQ¶ K\SRWKHVLV :KHQ WKHUH DUH RQO\ WZR UHVSRQVH FDWHJRULHVU WKHK\SRWKHVLVRIµJHQHUDODVVRFLDWLRQ¶LVHTXLYDOHQWWRWKH µURZPHDQVFRUHVGLIIHU¶K\SRWKHVLV
µ7DEOH¶ VFRUHV DUH WKH VLPSOHVWDQGPRVWFRPPRQO\XVHGVFRUHV IRU RUGLQDO UHVSRQVH FDWHJRULHV ZKHQ FRQGXFWLQJ FKLVTXDUH WHVWV 2QH FULWLFLVP LQ XVLQJ WDEOH VFRUHV LV WKDW EHFDXVH WKH\ DUH HTXDOO\ VSDFHG WKH\ PLJKW QRWJLYHDQDFFXUDWHUHSUHVHQWDWLRQRIWKHRUGLQDOQDWXUHRIFDWHJRULHV ZKLFKDUHWKRXJKWQRWWREHHTXDOO\VSDFHG)RUH[DPSOHLIUHVSRQVHLVµGHJUHH RI LPSURYHPHQW¶ ZLWK FDWHJRULHV µQRQH¶ µVRPH¶ µPDUNHG¶ DQG µFXUHG¶ WKH UHODWLYH GLIIHUHQFH EHWZHHQ µPDUNHG¶ DQG µFXUHG¶ PLJKW EH FOLQLFDOO\ PRUH LPSRUWDQWWKDQWKHGLIIHUHQFHEHWZHHQµQRQH¶DQGµVRPH¶ 5DQN ORJUDQN ULGLW DQG PRGLILHG ULGLW VFRUHV DUH DOWHUQDWLYHV WR WKH WDEOH VFRUHV DYDLODEOH ZLWK 352& )5(4 LQ 6$62IWKHVHPRGLILHGULGLWVFRUHV DUHSUREDEO\WKHPRVWIUHTXHQWO\HQFRXQWHUHGLQFOLQLFDOWULDOV7KHVHVFRUHV DUHEDVHGRQWKHPLGUDQNVRIWKHFROXPQWRWDOVVWDQGDUGL]HGE\GLYLGLQJE\ 1 )RUH[DPSOHVXSSRVHSDWLHQWVZHUHFODVVLILHGLQIRXUUHVSRQVHFDWHJRULHVDV µQRQH¶ Q µVRPH¶ Q µPDUNHG¶ Q DQG µFXUHG¶ Q &RPSXWDWLRQRIWKHPRGLILHGULGLWVFRUHVLVVKRZQLQ7DEOH
&KDSWHU7KH&KL6TXDUH7HVW
7$%/(6DPSOH&DOFXODWLRQRI0RGLILHG5LGLW6FRUHV
1RQH
6RPH
0DUNHG
&XUHG
0LGUDQN
0RG5LGLW6FRUH
&ROXPQ7RWDO
7KH6&25(6 02'5,',7RSWLRQLQWKH7$%/(6VWDWHPHQWLQ352&)5(4 LVXVHGWRSHUIRUPWKHFKLVTXDUHWHVWEDVHGRQPRGLILHGULGLWVFRUHVLQ6$6 0LGUDQNVDUHFRPSXWHGDV 1RQH 6RPH 0DUNHG &XUHG
&XVWRPVFRULQJV\VWHPVFDQDOVREHXVHGZKHQWKHUHODWLYHLPSRUWDQFHRIWKH UHVSRQVH FDWHJRULHV LV ZHOO GHILQHG ,Q WKH SUHFHGLQJ H[DPSOH \RX FDQ DVVLJQ VFRUHV RI DQG IRU H[DPSOH WR UHIOHFW WKH FOLQLFDO LPSRUWDQFHRIWKHUHVSRQVHUHODWLYHWRRWKHUUHVSRQVHV+RZHYHULQWKHUDUH FDVHV ZKHQ VXFK UHODWLYH LPSRUWDQFH FDQ EH PHDQLQJIXOO\ DVVLJQHG LW LV XVXDOO\ HDVLHU WR WUHDW WKH UHVSRQVH DV D QXPHULF UDWKHU WKDQ FDWHJRULFDO YDULDEOH ZKHQ GHWHUPLQLQJ WKH EHVW DQDO\VLV WR XVH ,I WKH UHODWLYH FOLQLFDO LPSRUWDQFH RI WKH UHVSRQVH FDWHJRULHV LV QRW FOHDU WKH DQDO\VW VKRXOG H[DPLQH WKH UREXVWQHVV RI WKH UHVXOWV XQGHU YDULRXV VFRULQJ DVVLJQPHQWV EHIRUHPDNLQJDFRQFOXVLRQ
7KH FKLVTXDUH WHVWV WKDW ZHUH GLVFXVVHG IRU RUGLQDO UHVSRQVHV DUHLQYDULDQWRYHUOLQHDUWUDQVIRUPDWLRQVRIWKHVFRUHV)RUH[DPSOHLIWKHUH DUHIRXURUGHUHGUHVSRQVHFDWHJRULHVDVVKRZQLQ7DEOH\RXFDQFRGH WKHFDWHJRULHVE\XVLQJWDEOHVFRUHVRIRUVFRUHVEDVHGRQDQ\OLQHDU WUDQVIRUPDWLRQRIWKHVHVFRUHVVXFKDVRUDQGREWDLQWKH VDPHUHVXOWV7KHVDPHLQYDULDQFHSULQFLSOHKROGVZKHQXVLQJPRGLILHGULGLW VFRUHV7KHPRGLILHGULGLWVFRUHVIRXQGLQ7DEOHFDQEHFRPSDUHGZLWK WKHWDEOHVFRUHVXVLQJWKHOLQHDUWUDQVIRUPDWLRQVKRZQLQ7DEOH
7$%/(&RPSDULVRQRI5LGLW6FRUHVZLWK7DEOH6FRUHV
1RQH
6RPH
0DUNHG
&XUHG
0RG5LGLW6FRUH
6XEWUDFW
0XOWLSO\E\
7DEOH6FRUHV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KLV UHSUHVHQWDWLRQ PRUH HDVLO\ GHSLFWV WKH VSDFLQJ RI WKH PRGLILHG ULGLW VFRUHV UHODWLYH WR WKH WDEOH VFRUHV 8VLQJ D VLPLODU DSSURDFK H[DPSOHV RI PRGLILHG ULGLW VFRUHV WUDQVIRUPHG WR DLQWHUYDOIRUFRPSDULVRQZLWKWKH WDEOHVFRUHVDUHVKRZQLQ7DEOHIRUYDULRXVRYHUDOOGLVWULEXWLRQVRIWKH UHVSRQVHFDWHJRU\LHFROXPQWRWDOV 1RWLFHWKDWWKHPRGLILHGULGLWVFRUHV DUHHTXLYDOHQWWRWDEOHVFRUHVZKHQWKHGLVWULEXWLRQLVXQLIRUP
7$%/(&RPSDULVRQRI5LGLW6FRUHVIRU9DULRXV0DUJLQDO 'LVWULEXWLRQV 'LVWULEXWLRQ
1RQH
6RPH
0DUNHG
&XUHG
8QLIRUP
&ROXPQ7RWDO
6FDOHG0RG5LGLW
1RUPDO
&ROXPQ7RWDO
6FDOHG0RG5LGLW
%LPRGDO
&ROXPQ7RWDO
6FDOHG0RG5LGLW
([SRQHQWLDO
&ROXPQ7RWDO
6FDOHG0RG5LGLW
6NHZHG
&ROXPQ7RWDO
6FDOHG0RG5LGLW
,Q D JU FRQWLQJHQF\ WDEOH ZLWK RUGLQDO UHVSRQVHV WKH FKL VTXDUH WHVW IRU WKH µURZ PHDQ VFRUHV GLIIHU¶ K\SRWKHVLV LV LGHQWLFDO WR WKH .UXVNDO:DOOLVWHVW&KDSWHU ZKHQPRGLILHGULGLWVFRUHVDUHXVHG7KLVLV FOHDU ZKHQ \RX UHFRJQL]H WKDW D FRQWLQJHQF\ WDEOH LV VLPSO\ D VXPPDU\ RI UHVSRQVHV LQ ZKLFK IUHTXHQFLHV UHSUHVHQW WKH QXPEHUV RI WLHG YDOXHV 7KH .UXVNDO:DOOLVWHVWLVEDVHGRQUDQNVRIWKHGDWDZLWKDYHUDJHUDQNVDVVLJQHG WRWLHGYDOXHV7KHVHDUHWKHVDPHDVWKHPLGUDQNVXVHGLQWKHPRGLILHGULGLW VFRUHV
CHHAAPPTTEERR 1177 Fisher’s Exact Test 17.1 17.2 17.3 17.4
Introduction...................................................................................................... 285 Synopsis ............................................................................................................ 285 Examples........................................................................................................... 286 Details & Notes................................................................................................. 289
17.1 Introduction Fisher's exact test is an alternative to the chi-square test (Chapter 16) for comparing two independent binomial proportions, p1 and p2. This method is based on computing exact probabilities of observing a given result or a more extreme result, when the hypothesis of equal proportions is true. Fisher's exact test is useful when the normal approximation to the binomial might not be applicable, such as in the case of small cell sizes or extreme proportions.
17.2 Synopsis Using the same notation as in Chapter 16, you observe X1 ‘responders’ out of n1 patients studied in one group and X2 ‘responders’ of n2 patients in a second independent group, as shown in Table 17.1. TABLE 17.1. Layout for Fisher’s Exact Test Number of Responders
Number of Non-Responders
Total
Group 1
X1
n1-X1
n1
Group 2
X2
n2-X2
n2
Combined
X1+X2
N-(X1+X2)
N = n1 + n2
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
*LYHQ HTXDO SURSRUWLRQV S S WKH SUREDELOLW\ RI REVHUYLQJ WKH FRQILJXUDWLRQ VKRZQ LQ 7DEOH ZKHQ WKH PDUJLQDO WRWDOV DUH IL[HG LV IRXQG E\ WKH µK\SHUJHRPHWULFSUREDELOLW\GLVWULEXWLRQ¶DV
Q¬ Q ¬ ¸ ;® ; ® D ¬ SURE ZKHUH 1 ¬ E® ; ; ®
D E ¸ D - E
LVWKHFRPELQDWRULDOV\PEROWKDWUHSUHVHQWV³WKHQXPEHURIZD\VµE¶LWHPVFDQEH VHOHFWHG IURP D VHW RI µD¶ LWHPV´ 7KH SUREDELOLW\ RI WKH WDEOH FRQILJXUDWLRQ VLPSOLILHVWR
SURE
; ; × 1 - ; - ; × Q × Q 1 × ; × ; × Q - ; × Q - ;
7KH SYDOXH IRU WKH WHVW )LVKHU V H[DFW SUREDELOLW\ LV WKH SUREDELOLW\ RI WKH REVHUYHGFRQILJXUDWLRQSOXVWKHVXPRIWKHSUREDELOLWLHVRIDOORWKHUFRQILJXUDWLRQV ZLWKDPRUHH[WUHPHUHVXOWIRUIL[HGURZDQGFROXPQWRWDOV
([DPSOHV
p ([DPSOH&+),QFLGHQFHLQ&$%*DIWHU$5$
$QHZDGHQRVLQHUHOHDVLQJDJHQW$5$ WKRXJKWWRUHGXFHVLGHHIIHFWVLQ SDWLHQWVXQGHUJRLQJFRURQDU\DUWHU\E\SDVVVXUJHU\&$%* ZDVVWXGLHGLQD SLORWWULDOWKDWHQUROOHGSDWLHQWVZKRUHFHLYHGDFWLYHPHGLFDWLRQDQG SDWLHQWVZKRUHFHLYHGDSODFHER)ROORZXSREVHUYDWLRQUHYHDOHGWKDW SDWLHQWVZKRUHFHLYHGDFWLYHPHGLFDWLRQDQGSDWLHQWVZKRUHFHLYHGWKH SODFHERKDGVKRZQV\PSWRPVRIFRQJHVWLYHKHDUWIDLOXUH&+) ZLWKLQ GD\VSRVWVXUJHU\,VWKLVHYLGHQFHRIDUHGXFHGUDWHRI&+)IRUSDWLHQWV WUHDWHGZLWKWKH$5$FRPSRXQG"
6ROXWLRQ
/HW S DQG S UHSUHVHQW WKH &+) UDWHV IRU WKH DFWLYHPHGLFDWLRQ DQG SODFHER JURXSV UHVSHFWLYHO\
+ S S + S S
$
Chapter 17: Fisher’s Exact Test 287
The summary results are shown in Table 17.2. TABLE 17.2. CHF Frequency Summary for Example 17.1 Active Group (i=1)
Placebo Group (i=2)
Combined
Xi
2
5
7
ni
35
20
55
Estimate of p
0.057
0.250
0.127
The conditions for using the chi-square test (Chapter 16) are not met in this case because of the small cell sizes. However, Fisher's exact test can be used as demonstrated next. The observed table and tables with a more extreme result that have the same row and column totals are shown in Table 17.3. TABLE 17.3. Configuration of ‘Extreme’ Tables for Calculation of Fisher’s Exact Probability Table (i)
Table (ii)
Table (iii)
Row Total
X
n–X
X
n–X
X
n–X
n
ACT
2
33
1
34
0
35
35
PBO
5
15
6
14
7
13
20
Column Total
7
48
7
48
7
48
55
Under the null hypothesis, H0, the probability for Table (i) is found by
prob1 =
(7)! ⋅ (48)! ⋅ (35)! ⋅ (20)! = 0.04546 (55)! ⋅ (2)! ⋅ (5)! ⋅ (33)! ⋅ (15)!
Similarly, the probabilities for Tables (ii) and (iii) can be computed as prob2 = 0.00669 and prob3 = 0.00038, respectively. The exact one-tailed p-value is p = 0.04546 + 0.00669 + 0.00038 = 0.05253. Because this value is greater than 0.05, you would not reject the hypothesis of equal proportions at a significance level of 0.05. (This is close to 0.05, however, and it might encourage a researcher to conduct a larger study).
288
Common Statistical Methods for Clinical Research with SAS Examples
SAS Analysis of Example 17.1 Fisher's exact test can be performed by using the FISHER option in the TABLES statement in PROC FREQ, as shown in the SAS code for Example 17.1 . This example uses the WEIGHT statement because the summary results are input. If you use the data set that contains the observations for individual patients, the WEIGHT statement would be omitted. Notice that, with a 2´2 table, either the FISHER or the CHISQ option in the TABLES statement will print both the chisquare test and the Fisher’s exact test results. Fisher's exact probability is printed in Output 17.1, with a one-tailed value of 0.0525 . Notice that SAS prints a warning about using the chi-square test when cell sizes are too small. SAS Code for Example 17.1
DATA CABG; INPUT GRP RESP $ CNT DATALINES; 1 YES 2 1 _NO 33 2 YES 5 2 _NO 15 ;
@@;
/* GRP 1 = Active Group, GRP 2 = Placebo */ PROC FREQ DATA = CABG; TABLES GRP*RESP / FISHER NOPERCENT NOCOL; WEIGHT CNT; TITLE1 “Fisher's Exact Test”; TITLE2 'Example 17.1: CHF Incidence in CABG after ARA'; RUN;
Chapter 17: Fisher’s Exact Test 289
OUTPUT 17.1. SAS Output for Example 17.1
Example 17.1:
Fisher's Exact Test CHF Incidence in CABG after ARA The FREQ Procedure Table of GRP by RESP
GRP
RESP
Frequency‚ Row Pct |YES |_NO | ---------+--------+--------+ 1 | 2 | 33 | | 5.71 | 94.29 | ---------+--------+--------+ 2 | 5 | 15 | | 25.00 | 75.00 | ---------+--------+--------+ Total 7 48
Total 35
20
55
Statistics for Table of GRP by RESP Statistic DF Value Prob -----------------------------------------------------Chi-Square 1 4.2618 0.0390 Likelihood Ratio Chi-Square 1 4.1029 0.0428 Continuity Adj. Chi-Square 1 2.7024 0.1002 Mantel-Haenszel Chi-Square 1 4.1843 0.0408 Phi Coefficient -0.2784 Contingency Coefficient 0.2682 Cramer's V -0.2784 WARNING: 50% of the cells have expected counts less than 5. Chi-Square may not be a valid test.
Fisher's Exact Test ---------------------------------Cell (1,1) Frequency (F) 2 Left-sided Pr <= F 0.0525 Right-sided Pr >= F 0.9929 Table Probability (P) Two-sided Pr <= P
0.0455 0.0857
Sample Size = 55
17.4 Details & Notes 17.4.1. Fisher's exact test is considered a non-parametric test because it does not rely on any distributional assumptions. 17.4.2. Manual computations for Fisher's exact probabilities can be very tedious if you use a calculator, especially, with larger cell sizes. A statistical program, such as SAS, is recommended to facilitate the computations.
290
Common Statistical Methods for Clinical Research with SAS Examples
17.4.3. Fisher's probabilities are not necessarily symmetric. Although some analysts will double the one-tailed p-value to obtain the two-tailed result, this method is usually overly conservative. To obtain the two-tailed p-value, first compute the probabilities associated with all possible tables that have the same row and column totals. The 2-tailed p-value is then found by adding the probability of the observed table with the sum of the probabilities of each table whose probability is less than that of the observed table. To obtain the two-tailed test for Example 17.1 in this way, first compute the probabilities for each table as follows: TABLE 17.4. Individual Table Probabilities for Example 17.1
Table
Group
Responders
NonResponders
1
Active
0
35
Placebo
7
13
Active
1
34
Placebo
6
14
2 3 4 5 6 7 8
Probabilities
0.0004
+
0.0067
+
0.0455
+
Active
2
33
Placebo
5
15
Active
3
32
Placebo
4
16
Active
4
31
Placebo
3
17
Active
5
30
Placebo
2
18
Active
6
29
Placebo
1
19
Active
7
28
Placebo
0
20
0.0331
TOTAL:
1.0000
(Observed)
0.1563 0.2941 0.3040 0.1600 +
+ = Probability included in two-tailed p-value
The observed Table 3 has probability 0.0455. Tables 1, 2, and 8 have probabilities less than 0.0455 and are included in the p-value for the twotailed alternative. Thus, the two-tailed p-value is P = 0.0004 + 0.0067 + 0.0455 + 0.0331 = 0.0857 As noted in Output 17.1, both the one-tailed and two-tailed results for Fisher's exact test are provided.
Chapter 17: Fisher’s Exact Test 291
17.4.4. Fisher's exact test can be extended to situations that involve more than two treatment groups or more than two response levels. This is sometimes referred to as the generalized Fisher’s exact test or the Freeman-Halton test. With g treatment groups, you would establish a g´2 table of responses by extending the 2´2 table. We may also extend this method to multinomial responses, i.e., responses that can result in one of r possible outcomes (r > 2). With two treatment groups, the 2´2 table would be extended to a 2´r table, or more generally, to a g´r table. An example that illustrates the generalized Fisher’s exact test applied to a 3´6 table is shown in the “SAS/STAT User’s Guide”. The alternative hypothesis that is tested by the generalized Fisher’s exact test is the hypothesis that corresponds to ‘general association’, as discussed in Chapter 16. In SAS, the FISHER or EXACT option must be specified in the TABLES statement in PROC FREQ to perform the Fisher’s exact test when the table has more than two rows or two columns.
292
Common Statistical Methods for Clinical Research with SAS Examples
CHHAAPPTTEERR 18 McNemar’s Test 18.1 18.2 18.3 18.4
Introduction...................................................................................................... 293 Synopsis ............................................................................................................ 293 Examples........................................................................................................... 294 Details & Notes................................................................................................. 301
18.1 Introduction McNemar's test is a special case of the binomial test (Chapter 15) for comparing two proportions using paired samples. McNemar's test is often used in a clinical trials application when dichotomous outcomes are recorded twice for each patient under different conditions. The conditions might represent different treatments or different measurement times, for example. The goal is to compare response rates under the two sets of conditions. Because measurements come from the same patients, the assumption of independent groups, which is needed for the chi-square test and Fisher's exact test, is not met. Typical examples where McNemar's test might be applicable include testing for a shift in the proportion of abnormal responses from before treatment to after treatment in the same group of patients, or comparing response rates of two ophthalmic treatments when both are given to each patient, one treatment in each eye.
18.2 Synopsis In general, there are n patients, each observed under two conditions (time points, treatments, etc.) with each condition resulting in a dichotomous outcome. The results can be partitioned into 4 subgroups: the number of patients who respond under both conditions (A), the number of patients who respond under the first but not the second condition (B), the number of patients who respond under the second but not the first condition (C), and the number of patients who fail to respond under either condition (D), as shown in the 2H2 table which follows.
294
Common Statistical Methods for Clinical Research with SAS Examples
TABLE 18.1. Layout for McNemar’s Test
Condition 2
Condition 1
Number of Responders
Number of NonResponders
Total
Number of Responders
A
B
A+B
Number of NonResponders
C
D
C+D
Total
A+C
B+D
n= A+B+C+D
The hypothesis of interest is the equality of the response proportions, p1 and p2, under conditions 1 and 2, respectively. The test statistic is based on the difference in the discordant cell frequencies (B,C) as shown in the test summary below. This statistic has an approximate chi-square distribution when H0 is true.
null hypothesis: alt. hypothesis:
H0: p1 = p2 HA: p1 =/ p2
test statistic:
χ2 =
decision rule:
reject H0 if χ > χ 21 (α)
(B − C) 2 B+C 2
The rejection region is found by obtaining the critical chi-square value based on 1 degree of freedom, χ 21(α), from chi-square tables or by using the SAS function CINV(1–α, df). For α = 0.05, the critical chi-square value is 3.841 (see Table A.3 in Appendix A.3, or the SAS function CINV(0.95,1)).
18.3 Examples p Example 18.1 -- Bilirubin Abnormalities Following Drug Treatment
Eighty-six patients were treated with an experimental drug for 3 months. Preand post-study clinical laboratory results showed abnormally high total bilirubin values (above the upper limit of the normal range) as indicated in Table 18.2. Is there evidence of a change in the pre- to post-treatment rates of abnormalities?
Chapter 18: McNemar’s Test 295
TABLE 18.2. Raw Data for Example 18.1 Patient Number
Pre-
Post-
Patient Number
Pre-
Post-
Patient Number
Pre-
Post-
1
N
N
31
N
N
61
N
N
2
N
N
32
N
N
62
N
N
3
N
N
33
Y
N
63
N
N
4
N
N
34
N
N
64
N
N
5
N
N
35
N
N
65
N
N
6
N
Y
36
N
N
66
N
N
7
Y
Y
37
N
N
67
N
N
8
N
N
38
N
Y
68
N
N
9
N
N
39
N
Y
69
N
Y
10
N
N
40
N
N
70
N
Y
11
N
N
41
N
N
71
Y
Y
12
Y
N
42
N
Y
72
N
N
13
N
N
43
N
N
73
N
N
14
N
Y
44
Y
N
74
N
Y
15
N
N
45
N
N
75
N
N
16
N
N
46
N
N
76
N
Y
17
N
N
47
Y
Y
77
N
N
18
N
N
48
N
N
78
N
N
19
N
N
49
N
N
79
N
N
20
N
Y
50
N
Y
80
N
N
21
N
N
51
Y
N
81
Y
Y
22
Y
N
52
N
N
82
N
N
23
N
N
53
N
Y
83
N
N
24
N
N
54
N
N
84
N
N
25
Y
N
55
Y
Y
85
N
N
26
N
N
56
N
N
86
N
N
27
N
N
57
N
N
28
Y
Y
58
N
N
29
N
Y
59
N
N
30
N
Y
60
N
N
N = normal, Y = abnormally high
296
Common Statistical Methods for Clinical Research with SAS Examples
Solution Let p1 and p2 represent the proportions of patients who have abnormally high bilirubin values (Y) before and after treatment, respectively. The data are summarized in Table 18.3. TABLE 18.3. Abnormality Frequency Summary for Example 18.1 Post-Treatment
Pre-Treatment
N
Y
Total
N
60
14
74
Y
6
6
12
Total
66
20
86
Y = Total bilirubin above upper limit of normal range
The test summary is null hypothesis: alt. hypothesis:
H0: p1 = p2 HA: p1 =/ p2
test statistic:
2 2 c = (14 – 6) ⁄ (14 + 6)
= 64 ⁄ 20 = 3.20 2
decision rule:
reject H0 if c > 3.841
conclusion:
Because 3.20 is not > 3.841, you do not reject H0, concluding that, at a significance level of 0.05, there is insufficient evidence that a shift in abnormality rates occurs with treatment.
The actual p-value of 0.074 for the chi-square value of 3.20 can be found by using the SAS expression 1–PROBCHI(3.20,1).
Chapter 18: McNemar’s Test 297
SAS Analysis of Example 18.1 Use the AGREE option in the TABLES statement in the FREQ procedure to conduct McNemar's test in SAS. The SAS code and output for Example 18.1 follow. The chi-square value of 3.20 with a p-value of 0.074 are shown in the output. SAS Code for Example 18.1 PROC FORMAT; VALUE RSLTFMT 0 = 'N' RUN; DATA BILI; INPUT PAT PRE PST @@; DATALINES; 1 0 0 2 0 0 3 0 0 4 0 7 1 1 8 0 0 9 0 0 10 0 13 0 0 14 0 1 15 0 0 16 0 19 0 0 20 0 1 21 0 0 22 1 25 1 0 26 0 0 27 0 0 28 1 31 0 0 32 0 0 33 1 0 34 0 37 0 0 38 0 1 39 0 1 40 0 43 0 0 44 1 0 45 0 0 46 0 49 0 0 50 0 1 51 1 0 52 0 55 1 1 56 0 0 57 0 0 58 0 61 0 0 62 0 0 63 0 0 64 0 67 0 0 68 0 0 69 0 1 70 0 73 0 0 74 0 1 75 0 0 76 0 79 0 0 80 0 0 81 1 1 82 0 85 0 0 86 0 0 ;
1 = 'Y';
0 0 0 0 1 0 0 0 0 0 0 1 1 0
5 11 17 23 29 35 41 47 53 59 65 71 77 83
0 0 0 0 0 0 0 1 0 0 0 1 0 0
0 0 0 0 1 0 0 1 1 0 0 1 0 0
6 12 18 24 30 36 42 48 54 60 66 72 78 84
0 1 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 1 0 0 0 0 0 0 0
ODS EXCLUDE SimpleKappa; PROC FREQ DATA = BILI; TABLES PRE*PST / AGREE NOROW NOCOL; FORMAT PRE PST RSLTFMT.; TITLE1 “McNemar's Test”; TITLE2 'Example 18.1: Bilirubin Abnormalities Following Drug Treatment'; RUN;
298
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT 18.1. SAS Output for Example 18.1 McNemar's Test Example 18.1: Bilirubin Abnormalities Following Drug Treatment The FREQ Procedure Table of PRE by PST PRE
PST
Frequency| Percent | N | Y | Total ---------+--------+--------+ N | 60 | 14 | 74 | 69.77 | 16.28 | 86.05 ---------+--------+--------+ Y | 6 | 6 | 12 | 6.98 | 6.98 | 13.95 ---------+--------+--------+ Total 66 20 86 76.74 23.26 100.00
Statistics for Table of PRE by PST McNemar's Test ----------------------Statistic (S) 3.2000 DF 1 Pr > S 0.0736
Sample Size = 86
Three Response Categories Many situations arise in clinical data analysis in which the data are paired and the response has three categorical levels. The bilirubin response in Example 18.1, for example, can be classified as ‘normal’, ‘above normal’, or ‘below normal’. Clinical laboratory data are often analyzed using pre- to post-study ‘shift’ tables, which show the frequencies of patients shifting among these three categories from baseline to some point in the study following treatment. Such data can be analyzed using the Stuart-Maxwell test, which is an extension of McNemar’s test. The Stuart-Maxwell test is also a chi-square test based on the cell frequencies from a layout similar to that shown in Table 18.4. The null hypothesis is that the distribution of responses among the three categories is the same under both conditions, in this case, pre- and post-study.
Chapter 18: McNemar’s Test 299
TABLE 18.4. Layout for a Paired Experiment with a Trinomial Response
--------------Post-Study ----------Pre-Study
Low
Normal
High
Low
n11
n12
n13
Normal
n21
n22
n23
High
n31
n32
n33
Compute d1 = (n12 + n13) – (n21 + n31) d2 = (n21 + n23) – (n12 + n32) d3 = (n31 + n32) – (n13 + n23) and, for i ¹ j, n¯ij = (nij + nji) / 2 For i = 1, 2, 3, let pi(1) represent the probability of being classified in Category i under Condition 1 and pi(2) represent the probability of being classified in Category i under Condition 2, the test summary is null hypothesis:
H0: pi(1) = pi(2) for i = 1, 2, and 3
alt. hypothesis:
HA: pi(1) =/ pi(2) for i = 1, 2, or 3
test statistic:
S=
decision rule:
reject H0 if S > χ 22 ( α )
(n 23 ⋅ d12 ) + (n13 ⋅ d 22 ) + (n12 ⋅ d 32 ) 2 ⋅ ((n12 ⋅ n 23 ) + (n12 ⋅ n13 ) + (n13 ⋅ n 23 ))
When H0 is true, S has an approximate chi-square distribution with 2 degrees of freedom. For α = 0.05, the critical chi-square value, χ 2(α), is 5.991 from Table A.3 2
(Appendix A) or by using the SAS function =CINV(0.95,2) .
300
Common Statistical Methods for Clinical Research with SAS Examples
p Example 18.2 -- Symptom Frequency Before and After Treatment
Patients characterized their craving of certain high-fat food products before and two weeks after an experimental diet therapy as ‘never’, ‘occasional’ or ‘frequent’, as summarized in Table 18.5. Does the diet appear to have an effect on the frequency of these cravings? TABLE 18.5. Cell Frequencies for Example 18.2 --------------- Two Weeks --------------
Pre-Study
Never
Occasional
Frequent
Never
14
6
4
Occasional
9
17
2
Frequent
6
12
8
Solution Compute d1 = (6 + 4) – (9 + 6) = –5 d2 = (9 + 2) – (6 + 12) = –7 d3 = (6 + 12) – (4 + 2) = 12 and n 12 = (n12 + n21) / 2 = (6 + 9) / 2 = 7.5 n 13 = (n13 + n31) / 2 = (4 + 6) / 2 = 5 n 23 = (n23 + n32) / 2 = (2 + 12) / 2 = 7
The test statistic is S=
(7 ⋅ ( − 5) 2 ) + (5 ⋅ ( − 7) 2 ) + (7.5 ⋅ (12) 2 ) 2 ⋅ ((7.5 ⋅ 7) + (7.5 ⋅ 5) + (5 ⋅ 7))
= 6.00
which is larger than the critical chi-square value of 5.991. Therefore, there has been a significant shift in the distribution of responses at the α = 0.05 level. The marginal response probabilities are estimated as TABLE 18.6. Marginal Response Probabilities for Example 18.2 Pre-Study
2 Weeks
Never
24 / 78 = 30.8%
29 / 78 = 37.2%
Occasional
28 / 78 = 35.9%
35 / 78 = 44.9%
Frequent
26 / 78 = 33.3%
14 / 78 = 17.9%
Chapter 18: McNemar’s Test 301
18.4 Details & Notes 18.4.1. For McNemar’s test, the estimate of p1 is (A+B) / n, and the estimate of p2 is (A+C) / n. The difference in proportions, p1 – p2, is estimated by ((A+B) / n) – ((A+C) / n) = (B–C) / n. An approximate 95% confidence interval for p1–p2 is given by (B – C) 1 ± 1.96 ⋅ ( ) ⋅ n n
B+C –
(B – C) 2 n
For Example 18.1, the approximate 95% confidence interval for p1 – p2 is
(14 – 6) 1 ± 1.96 ⋅ ( ) ⋅ 86 86
14 + 6 –
(14 – 6) 2 86
= 0.093 ± 0.100 or (–0.007 to 0.193) 18.4.2. Note that the chi-square test statistic is based only on the discordant cell sizes (B, C) and ignores the concordant cells (A, D). However, the estimates of p1, p2 and the size of the confidence interval are based on all cells because they are inversely proportional to the total sample size, n. 18.4.3. follows: χ2=
A continuity correction can be used with McNemar's test as ( | B – C| – 1 )2 B+C
This adjustment results in a more conservative test, and the correction factor may be omitted as B+C gets larger. Re-analyzing the data in Example 18.1 2 with the continuity correction, the test statistic becomes χ = 2.45, which is seen to be smaller and less significant (p=0.118) than the uncorrected value. 18.4.4. A well-known relationship in probability theory is that if a random variable, Z, has a standard normal distribution (i.e., mean 0 and variance 1), 2 then Z has a chi-square distribution with 1 degree of freedom. This principle can be used to show the relationship of the McNemar chi-square to the binomial test using the normal approximation as follows.
302
Common Statistical Methods for Clinical Research with SAS Examples
When the null hypothesis is true, the discordant values, B and C, should be about the same, that is B/(B+C) should be about 1/2. McNemar's test is equivalent to using the binomial test (Chapter 15) to test H0: p = 0.5, where p equals the fraction of the events that fall in 1 of the 2 discordant cells. Using the setup of Chapter 15 with n = B + C and X = B, the normal approximation to the binomial test (expressed as a standard normal Z when H0 is true), is 1 ) |B – C|–1 2(B+C) = 0.5 ⋅ 0.5 B+C (B+C)
( | pˆ – 0.5 | – Z =
pˆ =
because
B (B + C)
Therefore, 2
Z =
( | B – C | – 1 )2 = χ2 (B+C)
18.4.5. Because McNemar's test is identical to the binomial test using a normal approximation, it should be used only when the conditions for the normal approximation apply (see Chapter 15). Using the notation here, these 2 2 conditions can be simplified as B ³ C(4–B) and C ³ B(4–C). Notice that these conditions need to be checked only if either B or C is less than 4. If the conditions are not satisfied, the exact binomial probabilities should be used rather than the normal approximation or the chi-square statistic. 18.4.6. The above application of the binomial test is an example of the sign test mentioned in Chapter 15. In applying the sign test, the number of increases (+) are compared with the number of decreases (–), ignoring tied values. In Example 18.1, B represents the number of increases, C represents the number of decreases, and the concordant values (A and D) represent the tied values, which are ignored. 18.4.7. The p-value in the SAS output corresponds to a two-tailed hypothesis test. The one-tailed p-value can be found by halving this value. 18.4.8. McNemar’s test is often used to detect shifts in response rates between pre- and post-treatment measurement times within a single treatment group. A comparison in shifts can be made between two treatment or dose groups (e.g., Group 1 and Group 2) using a normal approximation as follows. As in Table 18.1, let Ai, Bi, Ci, Di, and ni represent the cell and overall frequencies for Group i (i = 1, 2). The proportional shift for Group i is Di =
Ai + Ci Ai + Bi Ci − Bi − = ni ni ni
Chapter 18: McNemar’s Test 303
To test for a difference between groups in these proportional shifts, you can use D1 − D 2
Z=
2
σD
1
+
2
σD
2
as a test statistic that has an approximate standard normal distribution under the hypothesis of no difference in shifts between groups. Above, the variance of Di is computed as 2
σD
i
=
n i (Bi + Ci ) − (Bi − Ci )2 ni3
18.4.9. The Stuart-Maxwell test is sometimes referred to as a test for ‘marginal homogeneity’ because it tests for equality between the two conditions in the marginal response probabilities. Another alternative hypothesis that might be of interest is that of symmetry. Symmetry occurs when the off-diagonal cell probabilities are the same for each 2H2 sub-table. This implies that changes in response levels between the conditions (e.g., preand post-) occur in both directions with the same probability.
With dichotomous outcomes, the conditions of marginal homogeneity and symmetry are equivalent. With more than two response categories, symmetry is a stronger condition and implies marginal homogeneity. The 3H3 tables in Table 18.6 show a pattern of marginal homogeneity with symmetry (Table i) and without symmetry (Table ii). TABLE 18.6. Examples of Symmetry (i) and non-Symmetry (ii) in a 3 x 3 Table
Table i Condition 1
Condition 2 1
2
3
1
30
15
5
50
2
15
5
10
30
3
5
10
5
20
50
30
20
100
Table ii Condition 1
Condition 2 1
2
3
1
30
0
20
50
2
20
10
0
30
3
0
20
0
20
50
30
20
100
304
Common Statistical Methods for Clinical Research with SAS Examples
Bowker’s test can be used to test for symmetry in 3H3 or larger tables. The test statistic is found by adding the McNemar chi-square statistics for each 2H2 sub-table. For a table with k categories, there are k·(k–1) / 2 2H2 sub-tables. For example, the 3H3 table contains three 2H2 sub-tables with categories 1 and 2, 1 and 3, and 2 and 3. Bowker’s test for k categories is a chi-square test with k·(k–1) / 2 degrees of freedom. Bowker’s test for symmetry can be performed in SAS in the same way that McNemar’s test is run (by using the AGREE option in the TABLES statement in PROC FREQ). If SAS detects k > 2, the test for symmetry is automatically performed. Bowker’s test results for Example 18.2 are found by using the SAS program that follows. The output shows a chi-square test statistic of 8.1429 , which indicates a significant departure from the hypothesis of symmetry (p=0.0431). DATA DIET; INPUT PRE WK2 DATALINES; 0 0 14 0 1 6 0 1 0 9 1 1 17 1 2 0 6 2 1 12 2 ;
CNT @@; 2 4 2 2 2 8
ODS SELECT SymmetryTest; PROC FREQ DATA = DIET; TABLES PRE*WK2 / AGREE NOCOL NOROW; WEIGHT CNT; RUN;
OUTPUT 18.2. Output for Symmetry Test in Example 18.2 Statistics for Table of PRE by WK2 Test of Symmetry ----------------------Statistic (S) 8.1429 DF 3 Pr > S 0.0431
Sample Size = 78
18.4.10. Notice that Section 18.4.8 shows one method for comparing the magnitude of shifts in conditions based on correlated responses among independent groups. If there are more than two groups or more than two response levels, or the design involves other stratification factors, a more complex analysis is required. Stokes, Davis, and Koch (2000) illustrate the analysis of examples that involve repeated measures using PROC CATMOD in SAS.
&++$$3377((55 7KH&RFKUDQ0DQWHO+DHQV]HO7HVW ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
7KH &RFKUDQ0DQWHO+DHQV]HO WHVW LV XVHG LQ FOLQLFDO WULDOV WR FRPSDUH WZR ELQRPLDO SURSRUWLRQV IURP LQGHSHQGHQW SRSXODWLRQV EDVHG RQ VWUDWLILHG VDPSOHV 7KLV WHVW SURYLGHV D PHDQV RI FRPELQLQJ D QXPEHU RI H WDEOHV RI WKH W\SH GLVFXVVHG LQ &KDSWHUV DQG ZKHQ HDFK LV IURP D VHSDUDWH LQGHSHQGHQW VWUDWXP 7KH VWUDWLILFDWLRQ IDFWRU FDQ UHSUHVHQW SDWLHQW VXEJURXSV VXFK DV VWXG\ FHQWHUV JHQGHUDJHJURXSRUGLVHDVHVHYHULW\DQGDFWVVLPLODUWRWKHEORFNLQJIDFWRULQD WZRZD\ $129$ &KDSWHU 7KH &RFKUDQ0DQWHO+DHQV]HO WHVW REWDLQV DQ RYHUDOO FRPSDULVRQ RI UHVSRQVH UDWHV DGMXVWHG IRU WKH VWUDWLILFDWLRQ YDULDEOH 7KH DGMXVWPHQW LV VLPSO\ D ZHLJKWLQJ RI WKH H WDEOHV LQ SURSRUWLRQ WR WKH ZLWKLQ VWUDWDVDPSOHVL]HV 7KH &RFKUDQ0DQWHO+DHQV]HO WHVW LV RIWHQ XVHG LQ WKH FRPSDULVRQ RI UHVSRQVH UDWHVEHWZHHQWZRWUHDWPHQWJURXSVLQDPXOWLFHQWHUVWXG\XVLQJWKHVWXG\FHQWHUV DVVWUDWD
6\QRSVLV
$VVXPH WKHUH DUH N VWUDWD N :LWKLQ 6WUDWXP M WKHUH DUH 1 SDWLHQWV M N UDQGRPO\DOORFDWHGWRRQHRIWZRJURXSV,Q*URXSWKHUHDUHQ SDWLHQWV ; RI ZKRP DUH FRQVLGHUHG µUHVSRQGHUV¶ 6LPLODUO\ *URXS KDV Q SDWLHQWVZLWK; µUHVSRQGHUV¶DVVKRZQLQ7DEOH M
M
M
M
M
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(/D\RXWIRUWKH&RFKUDQ0DQWHO+DHQV]HO7HVW
6WUDWXP
*URXS
5HVSRQGHUV
1RQ5HVSRQGHUV
7RWDO
;
Q±;
Q
;
Q±;
Q
7RWDO
;;
1±;;
1
;
Q±;
Q
;
Q±;
Q
7RWDO
;;
1±;;
1
N
;N
QN±;N
QN
;N
QN±;N
QN
7RWDO
;N;N
1N±;N;N
1N
/HW S DQG S GHQRWH WKH RYHUDOO UHVSRQVH UDWHV IRU *URXS DQG *URXS UHVSHFWLYHO\)RU6WUDWXPMFRPSXWHWKHTXDQWLWLHV
180 M
;
M
×
Q ± ; M
1
'(1
Q
M
Q × Q × ; ; × 1 ± ; ± ; M
M
M
M
1
M
M
×
M
M
M
1 ± M
7KH&RFKUDQ0DQWHO+DHQV]HOWHVWVXPPDU\LV
QXOOK\SRWKHVLV DOWK\SRWKHVLV WHVWVWDWLVWLF
+ S S + S S
$
&0+
å å '(1
æ N ç è M =
ö 180 M ÷ ø
N
M
M
×
M
M
DQG
GHFLVLRQUXOH
UHMHFW+ LI
!
&0+
&KDSWHU7KH&RFKUDQ0DQWHO+DHQV]HO7HVW
$V LQ SUHYLRXV FKDSWHUV UHSUHVHQWV WKH FULWLFDO FKLVTXDUH YDOXH ZLWK
VLJQLILFDQFHOHYHO DQGGHJUHHRIIUHHGRP
([DPSOHV
p ([DPSOH'HUPRWHO5HVSRQVHLQ'LDEHWLF8OFHUV
$PXOWLFHQWHUVWXG\ZLWKFHQWHUVLVWHVWLQJDQH[SHULPHQWDOWUHDWPHQW 'HUPRWHOXVHGWRDFFHOHUDWHWKHKHDOLQJRIGHUPDOIRRWXOFHUVLQGLDEHWLF SDWLHQWV6RGLXPK\DOXURQDWHZDVXVHGLQDFRQWUROJURXS3DWLHQWVZKR VKRZHGDGHFUHDVHLQXOFHUVL]HDIWHUZHHNVWUHDWPHQWRIDWOHDVWE\ VXUIDFHDUHDPHDVXUHPHQWVZHUHFRQVLGHUHGµUHVSRQGHUV¶7KHQXPEHUVRI UHVSRQGHUVLQHDFKJURXSDUHVKRZQLQ7DEOHIRUHDFKVWXG\FHQWHU,V WKHUHDQRYHUDOOGLIIHUHQFHLQUHVSRQVHUDWHVEHWZHHQWKH'HUPRWHODQG FRQWUROJURXSV"
7$%/(5HVSRQVH)UHTXHQFLHVE\6WXG\&HQWHUIRU([DPSOH 6WXG\ &HQWHU
7UHDWPHQW *URXS
5HVSRQVH
1RQ5HVSRQVH
727$/
'HUPRWHO
&RQWURO
7RWDO
'HUPRWHO
&RQWURO
7RWDO
'HUPRWHO
&RQWURO
7RWDO
'HUPRWHO
&RQWURO
7RWDO
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6ROXWLRQ &RQVLGHUWKHVWXG\FHQWHUVDVVHSDUDWHVWUDWD)RU6WXG\&HQWHUFRPSXWH ; × Q ± ; × Q 1 × ± ×
180
DQG
'(1
Q × Q
×
; ; × 1 ± ; ± ; 1 × 1 ±
× × × ×
7KHVHTXDQWLWLHVFDQEHFRPSXWHGLQDVLPLODUZD\IRUWKHRWKHUFHQWHUVDQGWKH UHVXOWVDUHVKRZQLQ7DEOH 7$%/(&RPSXWDWLRQDO6XPPDU\IRU&0+7HVW6WDWLVWLFRI ([DPSOH
678'< &(17(5M
180
'(1
727$/
M
M
&KDSWHU7KH&RFKUDQ0DQWHO+DHQV]HO7HVW
7KHWHVWVXPPDU\XVLQJWKH&RFKUDQ0DQWHO+DHQV]HOWHVWDWDVLJQLILFDQFHOHYHO RILV
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ S S + S S
$
WHVWVWDWLVWLF
&0+
å å'(1
æ ç è M=
ö 180 M ÷ ø
M
M
GHFLVLRQUXOH
UHMHFW+ LI &0+!
FRQFOXVLRQ
%HFDXVH ! \RX UHMHFW + DW D
»
VLJQLILFDQFH OHYHO RI DQG FRQFOXGH WKDW WKHUHLVDVLJQLILFDQWGLIIHUHQFHLQUHVSRQVHUDWHV EHWZHHQWKH'HUPRWHOWUHDWPHQWDQGWKHFRQWURO
6$6$QDO\VLVRI([DPSOH 7KH &0+ RSWLRQ LQ WKH )5(4 SURFHGXUH LQ 6$6 FDQ EH XVHG WR FRQGXFW WKH &RFKUDQ0DQWHO+DHQV]HOWHVWDVVKRZQRQWKHIROORZLQJSDJHV7KHVWUDWLILFDWLRQ IDFWRU&175 PXVWEHVSHFLILHGILUVWLQWKH7$%/(6VWDWHPHQWDVVKRZQLQWKH 6$6FRGH å8VHWKH123(5&(17DQG12&2/RSWLRQVWRVXSSUHVVSULQWLQJRI XQQHHGHGUHVXOWVWKHURZSHUFHQWDJHVZLOOEHSULQWHGEHFDXVHWKH1252:RSWLRQ LVQRWVSHFLILHG 7KH RXWSXWVKRZVWKHHWDEOHVIRUHDFKVWXG\FHQWHU 7KHVHFDQEHRPLWWHG XVLQJ WKH 12)5(4 RSWLRQ LQ WKH 7$%/(6 VWDWHPHQW 7KH URZ SHUFHQWDJHV DUH SULQWHGXQGHUWKHFHOOIUHTXHQFLHVDQGWKHSHUFHQWIRU5(63 <(6UHSUHVHQWVWKH HVWLPDWHGUHVSRQVHUDWH7KH&RFKUDQ0DQWHO+DHQV]HOWHVWUHVXOWVDUHVKRZQRQ WKHVXEVHTXHQWRXWSXWSDJH êFRQILUPLQJWKHFKLVTXDUHVWDWLVWLFRI1RWH WKDW WKH WHVWV IRU µJHQHUDO DVVRFLDWLRQ¶ µURZ PHDQ VFRUHV GLIIHU¶ DQG µQRQ]HUR FRUUHODWLRQ¶ DUH DOO HTXLYDOHQW IRU H WDEOHV 7KH SYDOXH RI LQGLFDWHV VLJQLILFDQFHZKHQWHVWHGDW 7KH 6$6 FRGH LQFOXGHV DQ RYHUDOO FKLVTXDUH WHVW &KDSWHU LJQRULQJ WKH VWUDWLILFDWLRQIDFWRU6WXG\&HQWHU ZKLFKLVQRWVLJQLILFDQWS 7KLVLV GLVFXVVHGPRUHLQ6HFWLRQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6$6&RGHIRU([DPSOH DATA ULCR; INPUT CNTR $ TRT $ RESP $ FRQ @@; DATALINES; 1 ACT YES 26 1 CTL YES 18 1 ACT _NO 4 1 CTL _NO 11 2 ACT YES 8 2 CTL YES 7 2 ACT _NO 3 2 CTL _NO 5 3 ACT YES 7 3 CTL YES 4 3 ACT _NO 5 3 CTL _NO 6 4 ACT YES 11 4 CTL YES 9 4 ACT _NO 6 4 CTL _NO 5 ; /* Analysis using CNTR as stratification factor */ ODS EXCLUDE CommonRelRisks; PROC FREQ DATA = ULCR; TABLES CNTR*TRT*RESP / CMH NOPERCENT NOCOL;å WEIGHT FRQ; TITLE1 ’The Cochran-Mantel-Haenszel Test’; TITLE2 ’Example 19.1: Response to Dermotel in Diabetic Ulcers’; RUN; /* Analysis without stratification (ignoring CNTR) */ PROC FREQ DATA = ULCR; TABLES TRT*RESP / CHISQ NOPERCENT NOCOL; WEIGHT FRQ; RUN;
2873876$62XWSXWIRU([DPSOH
The Cochran-Mantel-Haenszel Test Example 19.1: Response to Dermotel in Diabetic Ulcers The FREQ Procedure Table 1 of TRT by RESP Controlling for CNTR=1 TRT
Frequency| Row Pct | YES | NO | ---------+--------+--------+ ACT | 26 | 4 | | 86.67 | 13.33 | ---------+--------+--------+ CTL | 18 | 11 | | 62.07 | 37.93 | ---------+--------+--------+ Total 44 15
RESP
Total 30
29
59
&KDSWHU7KH&RFKUDQ0DQWHO+DHQV]HO7HVW
2873876$62XWSXWIRU([DPSOHFRQWLQXHG
The Cochran-Mantel-Haenszel Test Example 19.1: Response to Dermotel in Diabetic Ulcers
Table 2 of TRT by RESP Controlling for CNTR=2 TRT
RESP
Frequency| Row Pct | YES | NO | ---------+--------+--------+ ACT | 8 | 3 | | 72.73 | 27.27 | ---------+--------+--------+ CTL | 7 | 5 | | 58.33 | 41.67 | ---------+--------+--------+ Total 15 8
Total 11
12
23
Table 3 of TRT by RESP Controlling for CNTR=3 TRT
RESP
Frequency| Row Pct | YES | NO | ---------+--------+--------+ ACT | 7 | 5 | | 58.33 | 41.67 | ---------+--------+--------+ CTL | 4 | 6 | | 40.00 | 60.00 | ---------+--------+--------+ Total 11 11
Total 12
10
22
Table 4 of TRT by RESP Controlling for CNTR=4 TRT
Frequency| Row Pct | YES | NO | ---------+--------+--------+ ACT | 11 | 6 | | 64.71 | 35.29 | ---------+--------+--------+ CTL | 9 | 5 | | 64.29 | 35.71 | ---------+--------+--------+ Total 20 11
RESP
Total 17
14
31
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG
The Cochran-Mantel-Haenszel Test Example 19.1: Response to Dermotel in Diabetic Ulcers Summary Statistics for TRT by RESP Controlling for CNTR Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
ê
Statistic Alternative Hypothesis DF Value Prob --------------------------------------------------------------1 Nonzero Correlation 1 4.0391 0.0445 2 Row Mean Scores Differ 1 4.0391 0.0445 3 General Association 1 4.0391 0.0445
Breslow-Day Test for Homogeneity of the Odds Ratios -----------------------------Chi-Square 1.8947 DF 3 Pr > ChiSq 0.5946
Total Sample Size = 135 Table of TRT by RESP TRT RESP Frequency| Row Pct | YES | NO | ---------+--------+--------+ ACT | 52 | 18 | | 74.29 | 25.71 | ---------+--------+--------+ CTL | 38 | 27 | | 58.46 | 41.54 | ---------+--------+--------+ Total 90 45
Total 70
65
135
Statistics for Table of RESP by TRT Statistic DF Value Prob -----------------------------------------------------Chi-Square 1 3.7978 0.0513 Likelihood Ratio Chi-Square 1 3.8136 0.0508 Continuity Adj. Chi-Square 1 3.1191 0.0774 Mantel-Haenszel Chi-Square 1 3.7697 0.0522 Phi Coefficient 0.1677 Contingency Coefficient 0.1654 Cramer’s V 0.1677 Fisher’s Exact Test ---------------------------------Cell (1,1) Frequency (F) 52 Left-sided Pr <= F 0.9837 Right-sided Pr >= F 0.0385 Table Probability (P) Two-sided Pr <= P Sample Size = 135
0.0222 0.0676
&KDSWHU7KH&RFKUDQ0DQWHO+DHQV]HO7HVW
'HWDLOV 1RWHV
$ FRQYHQLHQW ZD\ WR VXPPDUL]H WKH WHVW UHVXOWV E\ VWUDWXP LV VKRZQIRU([DPSOHLQ7DEOH7KHUHVSRQVHUDWHIRU*URXSL6WUDWXP M LV HVWLPDWHG E\ ; Q 7KH RYHUDOO UHVSRQVH UDWH IRU *URXS L LV HVWLPDWHGE\ ; ; ; ; Q Q Q Q ML
L
ML
L
L
L
L
L
L
L
7$%/(6XPPDU\RI5HVXOWVIRU([DPSOH
5HVSRQVH5DWHV
6WXG\ &HQWHU
$FWLYH
Q
&RQWURO
Q
&KL6TXDUH
S9DOXH
2YHUDOO
$OWKRXJK QRW LQFOXGHG LQ WKH 6$6 RXWSXW IRU WKLV H[DPSOH WKH FKLVTXDUH YDOXHVDQGSYDOXHVFDQEHRXWSXWIRUHDFKVWUDWXPXVLQJWKH&+,64RSWLRQLQ WKH7$%/(6VWDWHPHQWRI352&)5(4 %\LJQRULQJWKHVWUDWDDQGFRPELQLQJDOOWKHGDWDRI([DPSOH LQWRRQHVLPSOHFKLVTXDUHWHVW&KDSWHU \RXREWDLQWKHIROORZLQJUHVXOWV DVVKRZQLQ2XWSXW
7$%/(&KL6TXDUH7HVWIRU([DPSOH,JQRULQJ6WUDWD
*URXS
5HVSRQVH
1RQ5HVSRQVH
7RWDO
$FWLYH
&RQWURO
7RWDO
FKLVTXDUHYDOXH S 7KHWHVWVWDWLVWLFGRHVQRWTXLWHDWWDLQVLJQLILFDQFHDWWKHOHYHODVZLWKWKH &RFKUDQ0DQWHO+DHQV]HO WHVW ,QWKLVH[DPSOHWKHZLWKLQVWUDWDLQIRUPDWLRQ XVHGE\WKH&RFKUDQ0DQWHO+DHQV]HOWHVWLVDGYDQWDJHRXVLQUHYHDOLQJJUHDWHU VWDWLVWLFDO VLJQLILFDQFH 7KLV ZLOO RIWHQ EH WKH FDVH ZKHQ WKHUH LV D ELJ GLIIHUHQFHLQVDPSOHVL]HVDPRQJVWUDWDDQGWKHODUJHVWVWUDWDVKRZWKHELJJHVW UHVSRQVHUDWHGLIIHUHQFHV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
:LWKVRPHDOJHEUDLFPDQLSXODWLRQ
180
M
;
M
×
Q ± ; M
1
FDQEHH[SUHVVHGDV Z
M
ZKHUH SÖ = ML
×
Q
M
M
M
;
ML
Q
ML
M
LV WKH HVWLPDWH RI S DQG Z LV D IXQFWLRQ RI WKH JURXS VDPSOH ML
æ ç è
Z M
Q M
M
±
+
ö ÷ Q M ø
:ULWWHQLQWKLVZD\WKH&RFKUDQ0DQWHO+DHQV]HOVWDWLVWLFLVVHHQWREHEDVHG RQ WKH ZLWKLQVWUDWD UHVSRQVH UDWH GLIIHUHQFHV FRPELQHG RYHU DOO VWUDWD ZHLJKWHG E\ Z %HFDXVH Z LQFUHDVHV ZLWK WKH Q ¶V LW LV VHHQ WKDW JUHDWHU ZHLJKWVDUHDVVLJQHGWRWKRVHVWUDWDWKDWKDYHODUJHUVDPSOHVL]HV M
×
SÖ ± SÖ
VL]HV
M
M
LM
7KH &RFKUDQ0DQWHO+DHQV]HO WHVW ZDV RULJLQDOO\ GHYHORSHG IRU XVH ZLWK UHWURVSHFWLYH GDWD LQ HSLGHPLRORJLFDO DSSOLFDWLRQV 7KH VDPH PHWKRGRORJ\ KDV EHHQ ZLGHO\ DSSOLHG WR SURVSHFWLYH FOLQLFDO WULDOV VXFK DV WKRVHLOOXVWUDWHGKHUH$QXPEHURIDOWHUQDWLYHVWDWLVWLFVVLPLODUEXWZLWKPLQRU YDULDWLRQVWRWKHYHUVLRQSUHVHQWHGKDYHDOVREHHQXVHG )RUH[DPSOH&RFKUDQ VFKLVTXDUH WHVWLVFRPSXWHGLQWKHVDPHZD\DVWKH &RFKUDQ0DQWHO+DHQV]HOVWDWLVWLFZLWKWKHH[FHSWLRQWKDWWKHGHQRPLQDWRURI '(1 LV 1 LQVWHDG RI 1 1 ± 7KH &RFKUDQ0DQWHO+DHQV]HO WHVW LV DQ DWWHPSWWRLPSURYHRQ&RFKUDQ VWHVWE\JLYLQJWKHVWUDWDZLWKIHZHUSDWLHQWV OHVVZHLJKWLQWKHRYHUDOODQDO\VLVZKLOHOHDYLQJVWUDWDZLWKODUJH1 VUHODWLYHO\ XQDOWHUHG1RWLFHWKDWLI&RFKUDQ VWHVWLVXVHGZLWKRQO\RQHVWUDWXPN WKHFKLVTXDUHYDOXHLV180 '(1 ZKLFKLVLGHQWLFDOWRWKHFKLVTXDUH WHVWGLVFXVVHGLQ&KDSWHU
M
M
M
M
M
$QRWKHUYDULDWLRQRIWKH&RFKUDQ0DQWHO+DHQV]HOVWDWLVWLFLVWKHFRQWLQXLW\ FRUUHFWHGYDOXHDVIROORZV
&0+
(
N
å 180 - ) M
M
N
å'(1 M
M
&KDSWHU7KH&RFKUDQ0DQWHO+DHQV]HO7HVW
7KH FRQWLQXLW\ FRUUHFWLRQ VKRXOG EH XVHG LI WKHUH DUH PDQ\ VPDOO FHOO VL]HV +RZHYHUXVHRIWKHFRQWLQXLW\FRUUHFWLRQPLJKWSURGXFHRYHUO\FRQVHUYDWLYH UHVXOWVDQGLWFDQJHQHUDOO\EHRPLWWHGIRUUHDVRQDEOHFHOOVL]HV $QLQWHUDFWLRQH[LVWVLIWKHGLIIHUHQFHVLQUHVSRQVHUDWHVEHWZHHQ JURXSV DUH QRW FRQVLVWHQW DPRQJ VWUDWD $Q H[DPSOH RI DQ LQWHUDFWLRQ LV DQ $FWLYH YV &RQWURO UHVSRQVH UDWH FRPSDULVRQRIYVLQRQHVWUDWXP DQG YV LQ DQRWKHU VWUDWXP &RPELQLQJ VWUDWD PLJKW PDVN WKLV LQWHUDFWLRQZKLFKOHDGVWRRIIVHWWLQJUHVSRQVHVDQGQRRYHUDOOGLIIHUHQFHV ,Q WKH SUHVHQFH RI DQ LQWHUDFWLRQ RU ODFN RI µKRPRJHQHLW\¶ DPRQJ UHVSRQVH GLIIHUHQFHVHDFKVWUDWXPVKRXOGEHDQDO\]HGVHSDUDWHO\DQGIXUWKHUDQDO\VHV PLJKWEHSXUVXHGLQDQDWWHPSWWRH[SODLQWKHLQWHUDFWLRQ 7KH %UHVORZ'D\ WHVW IRU KRPRJHQHLW\ DPRQJ VWUDWD LV LQFOXGHG LQ WKH 6$6 RXWSXWZKHQFRQGXFWLQJWKH&RFKUDQ0DQWHO+DHQV]HOWHVW7KLVWHVWSURYLGHV DQLQGLFDWLRQRIWKHSUHVHQFHRIDQLQWHUDFWLRQHYHQWKRXJKLWLVEDVHGRQWKH RGGV UDWLRV VHH &KDSWHU UDWKHU WKDQ WKH GLIIHUHQFHV LQ UHVSRQVH UDWHV $ VLJQLILFDQW %UHVORZ'D\ WHVW GRHV QRW LQYDOLGDWH WKH UHVXOWVRIWKH&0+WHVW EXW LQGLFDWHV FDXWLRQ VKRXOG EH XVHG LQ LWV LQWHUSUHWDWLRQ DQG WKDW SHUKDSV LQGLYLGXDOVWUDWDVKRXOGEHIXUWKHULQYHVWLJDWHG7KH6$6RXWSXWIRU([DPSOH VKRZVDQRQVLJQLILFDQW%UHVORZ'D\WHVWZLWKDSYDOXHRI &RPELQLQJGDWDDFURVVVWUDWDLQWRDVLQJOHHWDEOHVKRXOGRQO\ EH FRQVLGHUHG ZKHQ WKHLQGLYLGXDOWDEOHVKDYHOLNHSURSRUWLRQV7RVKRZWKH SUREOHPV WKDW FDQ DULVH E\ DXWRPDWLFDOO\ FRPELQLQJ WDEOHV ZLWKRXW FDUHIXOO\ H[DPLQLQJHDFKFRQVLGHUWKHUHVXOWVIRUWZRVWUDWDVKRZQLQ7DEOH
7$%/(([DPSOHRIH7DEOHVZLWK&RXQWHU,QWXLWLYH &RPELQHG5HVXOWV
6WUDWXP
*URXS
5HVSRQGHUV
1RQ 5HVSRQGHUV
7RWDO
5HVSRQVH 5DWH
$
%
$
%
&RPELQHG
$
%
7KHUHVSRQVHUDWHVIRU*URXS$DUHJUHDWHUWKDQWKRVHRI*URXS%ZLWKLQHDFK VWUDWXP%XWWKHRYHUDOOUHVSRQVHUDWHIRU*URXS%LVJUHDWHUIRUWKHFRPELQHG VWUDWD 7KLV VLWXDWLRQ FDQ RFFXU LI WKHUH LV D ELJ GLIIHUHQFH LQ UHVSRQVH UDWHV DPRQJ VWUDWD DQG VDPSOH VL]HV DUH LPEDODQFHG EHWZHHQ JURXSV LQ RSSRVLWH ZD\VDPRQJVWUDWD
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH &RFKUDQ0DQWHO+DHQV]HO WHVW FDQ EH H[WHQGHG WR FRQWLQJHQF\ WDEOHV ODUJHU WKDQ WKH H WDEOHV FRQVLGHUHG KHUH 7KH JHQHUDO OD\RXW LV EDVHG RQ N VWUDWD RI JHU FRQWLQJHQF\ WDEOHV ZKHUH J QXPEHU RI URZV DQGUFROXPQV W\SLFDOO\UHSUHVHQWWKHQXPEHURIWUHDWPHQWJURXSVDQG WKHQXPEHURIUHVSRQVHFDWHJRULHVUHVSHFWLYHO\7KH)5(4SURFHGXUHLQ6$6 FDQ DOVR EH XVHG WR DQDO\]H UHVXOWV IRU WKLV PRUH JHQHUDO VHWXS 7KH &0+ RSWLRQLQWKH7$%/(6VWDWHPHQWSULQWVWKHUHVXOWVIRUWKHµJHQHUDODVVRFLDWLRQ¶ µURZPHDQVFRUHVGLIIHU¶DQGµQRQ]HURFRUUHODWLRQ¶K\SRWKHVHVZKHQVDPSOHV DUHVWUDWLILHGLQWKHVDPHZD\DVWKHXQVWUDWLILHGFDVHGLVFXVVHGLQ&KDSWHU &DXWLRQ PXVW EH XVHG ZLWK ODUJHU YDOXHV RI J DQG U GXH WR LQWHUSUHWDWLRQ GLIILFXOWLHVVPDOOHUFHOOVL]HVDQGWKHLQFUHDVHGSRWHQWLDOIRULQWHUDFWLRQV ,Q &KDSWHU \RX VHH WKDW WKH FKLVTXDUH WHVW IRU H WDEOHV LV HTXLYDOHQW WR XVLQJ D QRUPDO DSSUR[LPDWLRQ WR WKH ELQRPLDOGLVWULEXWLRQDQG IRU JHQHUDO XVH WKH FHOO IUHTXHQFLHV VKRXOG EH ODUJH HQRXJK WR YDOLGDWH WKDW DSSUR[LPDWLRQ $ FRQVHUYDWLYH ³UXOHRIWKXPE´ LV WR UHTXLUH H[SHFWHG FHOO IUHTXHQFLHV RI DW OHDVW 7KLV UXOH FDQ EH ORRVHQHG VRPHZKDW IRU VWUDWLILHG GDWD DV ORQJ DV WKH FRPELQHG FHOO IUHTXHQFLHV RYHU DOO VWUDWD DUH VXIILFLHQWO\ ODUJH7KHUHTXLUHPHQWIRUODUJHUFHOOIUHTXHQFLHVEHFRPHVHYHQOHVVLPSRUWDQW ZKHQWKHUHVSRQVHYDULDEOHLVRUGLQDODQGKDVPRUHWKDQWZRFDWHJRULHV 'HWDLOVDERXWVDPSOHVL]HJXLGHOLQHVIRUDSSO\LQJWKH&0+WHVWDUH GLVFXVVHGLQWKHKLJKO\UHFRPPHQGHG³&DWHJRULFDO'DWD$QDO\VLV8VLQJ WKH6$66\VWHP6HFRQG(GLWLRQ´E\6WRNHV'DYLVDQG.RFK7KLVERRN ZDVZULWWHQXQGHUWKH6$6%RRNVE\8VHUVSURJUDP
7KH &RFKUDQ0DQWHO+DHQV]HO WHVW DV SUHVHQWHG KHUH LV D WZR WDLOHG WHVW 7KH SYDOXH IURP WKH 6$6 RXWSXW FDQ EH KDOYHG WR REWDLQ D RQH WDLOHGWHVW
&++$$3377((55 /RJLVWLF5HJUHVVLRQ ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
/RJLVWLF UHJUHVVLRQ DQDO\VLV LV D VWDWLVWLFDO PRGHOLQJ PHWKRG IRU DQDO\]LQJ GLFKRWRPRXV UHVSRQVH GDWD ZKLOH DFFRPPRGDWLQJ DGMXVWPHQWV IRU RQH RU PRUH H[SODQDWRU\ YDULDEOHV RU µFRYDULDWHV¶ 7KLV PHWKRG LV DQDORJRXV WR DQDO\VLV RI FRYDULDQFH&KDSWHU ZKLFKLVXVHIXOIRUFRPSDULQJWZRRUPRUHJURXSVZKLOH DGMXVWLQJ IRU YDULRXV EDFNJURXQG IDFWRUV FRYDULDWHV $OWKRXJK ERWK PHWKRGV LQFOXGH FRYDULDWH DGMXVWPHQWV $1&29$ DQDO\]HV PHDQV RI QXPHULF UHVSRQVH PHDVXUHVORJLVWLFUHJUHVVLRQDQDO\]HVSURSRUWLRQVEDVHGRQFDWHJRULFDOUHVSRQVHV PRVWFRPPRQO\ELQDU\UHVSRQVHVHJVXFFHVVUDWHVVXUYLYDOUDWHVRUFXUHUDWHV +LVWRULFDOO\ORJLVWLFUHJUHVVLRQWHFKQLTXHVKDYHEHHQZLGHO\XVHGIRULGHQWLI\LQJ ULVNIDFWRUVDVVRFLDWHGZLWKGLVHDVHLQHSLGHPLRORJLFDOVWXGLHV/RJLVWLFUHJUHVVLRQ DOVR LV SRSXODU IRU DQDO\]LQJ SURVSHFWLYH FOLQLFDO WULDOV DQG LQ LGHQWLI\LQJ SRWHQWLDOO\LPSRUWDQWFRYDULDWHVLQH[SORUDWRU\DQDO\VHVRIFOLQLFDOUHVHDUFKGDWD ([DPSOHVIURPFOLQLFDOUHVHDUFKZKHUHORJLVWLFUHJUHVVLRQPLJKWEHXVHIXOLQFOXGH
FRPSDULQJ VXUYLYDO UDWHV LQ FDQFHU SDWLHQWV DPRQJ YDULRXV WUHDWPHQW JURXSVDGMXVWHGIRUDJHDQGGXUDWLRQRIGLVHDVH
FRPSDULQJ SURSRUWLRQV RI SDWLHQWV ZKRVH GHUPDO XOFHUV VKRZ FRPSOHWH KHDOLQJ EHWZHHQ DQ DFWLYH DQG SODFHER JURXS DGMXVWHG IRU EDVHOLQH XOFHU VL]H
FRPSDULQJWKHSURSRUWLRQRIQRUPDOL]HGK\SHUWHQVLYHSDWLHQWVEHWZHHQWZR DQWLK\SHUWHQVLYH WUHDWPHQW JURXSV DGMXVWHG IRU DJH FKROHVWHURO OHYHO WREDFFRXVHDQGH[HUFLVHKDELWV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6\QRSVLV
7KH/RJLW0RGHO2QH&RYDULDWH &RQVLGHU UHVSRQVH YDULDEOHV < WKDW WDNH RQH RI WZR SRVVLEOH YDOXHV \HVQR QRUPDODEQRUPDOSUHVHQWDEVHQWFXUHGQRWFXUHGGLHGVXUYLYHGHWF ZLWKFRGHG YDOXHV DQG $ UHVSRQVH RI < LQGLFDWHV WKDW WKH HYHQW RI LQWHUHVW RFFXUV HYHQW DQGDUHVSRQVHRI< LQGLFDWHVWKDWWKHHYHQWGRHVQRWRFFXUQRQHYHQW ,I \RX VXVSHFW WKDW RQH RU PRUH EDFNJURXQG IDFWRUV ZLOODIIHFWWKHUHVSRQVH\RX ZDQWWKHDQDO\VLVWRUHIOHFWWKLVUHODWLRQVKLSDQG\RXPXVWLQFRUSRUDWHWKHYDOXHVRI WKHFRYDULDWHV; LQWRWKHDQDO\VLV 7KHPHWKRGVXVHGWRGHYHORS$1&29$SURFHGXUHV&KDSWHU DUHQRWDSSOLFDEOH EHFDXVH WKH UHVSRQVHV DUH QRW QRUPDOO\ GLVWULEXWHG ,QVWHDG \RX DSSO\ D WUDQVIRUPDWLRQRIWKHGDWDFDOOHGWKHµORJLWIXQFWLRQ¶ < = OQË 3 Û Ì Ü Í - 3 Ý ZKHUH 3 LV WKH H[SHFWHG YDOXH RI < IRU D VSHFLILHG VHW RI ;YDOXHV DQG µOQ¶ UHSUHVHQWVWKHQDWXUDOORJDULWKPIXQFWLRQ )RUQRZDVVXPHWKHUHLVMXVWRQHFRYDULDWH;%HFDXVH<RQO\WDNHVWKHYDOXHV RUIRUDJLYHQYDOXHRI;WKHPHDQRI<HTXDOVWKHSUREDELOLW\WKDW<
[
3[ H - ;
[
:LWKVRPHDOJHEUDLFPDQLSXODWLRQWKLVFDQEHUHH[SUHVVHGDV
æ
3[ ö ÷ ; è - 3[ ø
OQ ç
WKHOHIWVLGHRIZKLFKLVWKHORJLVWLFWUDQVIRUPDWLRQRUµORJLW¶IXQFWLRQ< 3 ± 3 LVNQRZQDVWKHµRGGV¶WKDW< LHWKHRGGVWKDWWKHHYHQWRILQWHUHVWRFFXUV 7KHORJLWLVVRPHWLPHVUHIHUUHGWRDVWKHµORJRGGV¶7KHORJRGGVEHFRPHVDOLQHDU IXQFWLRQRIWKHFRYDULDWH;ZKHQDVLJPRLGDOUHODWLRQVKLSLVDVVXPHGEHWZHHQ; DQG3 DVLOOXVWUDWHGLQ)LJXUH ;
;
;
&KDSWHU/RJLVWLF5HJUHVVLRQ
),*85(/RJLVWLF3UREDELOLW\)XQFWLRQ
Px
X
7KH2GGV5DWLR 1RWHWKDWWKHRGGVIRUDVSHFLILFFRYDULDWHYDOXHRI; [ LV
[
DQGWKHRGGVIRUDFRYDULDWHYDOXHRI; [LV
2GGV[ H
2GGV[ H
[
VRWKDWWKHUDWLRRIWKHRGGVEDVHGRQDXQLWLQFUHPHQWLQ;LVVLPSO\
2GGV [ + =H 2GGV [ 7KLVLVFDOOHGWKHµRGGVUDWLR¶25 IRUWKHFRYDULDWH;%\VXEWUDFWLQJIURPWKH 25 DQG PXOWLSO\LQJ E\ \RX REWDLQ WKH SHUFHQW FKDQJH LQ WKH RGGV RI HYHQW RFFXUUHQFH ZKHQ WKH FRYDULDWH ; LQFUHDVHV E\ XQLW ,I ; LV D GLFKRWRPRXV YDULDEOH VXFK DV *HQGHU ZLWK YDOXHV µ0DOH¶ DQG µ)HPDOH¶ \RX DVVLJQ QXPHULF YDOXHVWRLWVOHYHOVHJ 0DOHDQG )HPDOHLQZKLFKFDVHWKH25UHSUHVHQWV WKHIDFWRUE\ZKLFKWKHRGGVRIHYHQWLQFUHDVHVIRUIHPDOHVUHODWLYHWRPDOHV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
0RGHO(VWLPDWLRQ )RUDIL[HGYDOXHRIWKHFRYDULDWHVXFKDV; [RQHZD\WRHVWLPDWH3 LVE\XVLQJ SA \ Q ZKHUHQ LVWKHQXPEHURIREVHUYDWLRQVDW; [DQG\ LVWKHQXPEHURI [
[
[
[
[
[
HYHQWVRXWRIWKHQ REVHUYDWLRQV%HFDXVH\ LVDELQRPLDOUDQGRPYDULDEOHSA ZLOO EHDEHWWHUHVWLPDWHZKHQQ LVODUJH 6XSSRVH SDWLHQWV ZLWK XOFHUV VHFRQGDU\ WR + S\ORUL ZHUH FXUHG RXW RI SDWLHQWVZKRZHUHWUHDWHGZLWKDQDQWLELRWLF7KHVWXG\UHVXOWVZHUHEURNHQGRZQ E\WKHXOFHUKLVWRU\;LQ\HDUV DVVKRZQLQ7DEOH [
[
[
[
7$%/(1DwYH(VWLPDWHVRI3[
[
\[
Q[
3[HVW
/RJLW
727$/
$QRWKHU ZD\ WR HVWLPDWH 3 LV WR REWDLQ DQ HVWLPDWH RI WKH ORJLVWLF UHJUHVVLRQ PRGHOWKHQSOXJLQWKHYDOXHRI[DQGVROYHIRU3 7KLVDSSURDFKLVEDVHGRQD PHWKRG NQRZQ DV PD[LPXP OLNHOLKRRG ZKLFK XVHV DOO WKH GDWD WR GHWHUPLQH WKH HVWLPDWH RI HDFK PRGHO SDUDPHWHU 7KHVH HVWLPDWHV DUH GHWHUPLQHG LQ D ZD\ WKDW PD[LPL]HV WKH OLNHOLKRRG RI REVHUYLQJ WKH GDWD FROOHFWHG DQG WKH\ DUH FDOOHG µPD[LPXP OLNHOLKRRG HVWLPDWHV¶ 0/( ,Q DGGLWLRQ WR KDYLQJ VRPH SRZHUIXO VWDWLVWLFDOSURSHUWLHV0/(VKDYHWKHDGYDQWDJHWKDWGDWDQHHGQRWEHJURXSHGDVLQ WKH DERYH H[DPSOH DQG WKH UHVXOWLQJ PRGHO FDQ EH XVHG WR HVWLPDWH 3 HYHQ IRU XQREVHUYHGYDOXHVRI; [
[
[
$OWKRXJKWKHPDWKHPDWLFDOGHULYDWLRQVEDVHGRQWKHPD[LPXPOLNHOLKRRGPHWKRG DUH FRPSOH[ DQG EH\RQG WKH VFRSH RI WKLV ERRN WKLV PHWKRG \LHOGV D VHW RI VLPXOWDQHRXV HTXDWLRQVWKDWFDQEHVROYHGIRUDDQGEWKHHVWLPDWHVRI DQG 7KHVHHTXDWLRQVZKLFKGRQRWKDYHDFORVHGVROXWLRQKDYHWKHIRUP
å< å + H
DQG
L 1
L
L
å;L ×
1
1
(
- DE[L - )
1
å;L × ( + H L
- DE[L
-
)
&KDSWHU/RJLVWLF5HJUHVVLRQ
1XPHULFDO WHFKQLTXHV VXFK DV LWHUDWLYHO\ ZHLJKWHG OHDVWVTXDUHV DQG 1HZWRQ 5DSKVRQDOJRULWKPVDUHXVHGE\6$6DQGRWKHUFRPSXWHUSURJUDPVWRVROYHWKHVH HTXDWLRQV
7KH/RJLW0RGHO0XOWLSOH&RYDULDWHV ,Q JHQHUDO WKH ORJLVWLF UHJUHVVLRQ OD\RXW KDV 1 SDWLHQWV DQG N FRYDULDWHV ; ; ; DQGDW\SLFDOGDWDVHWFDQKDYHWKHIROORZLQJOD\RXW
N
7$%/(/D\RXWIRU/RJLVWLF5HJUHVVLRQ
3DWLHQW 1XPEHU
5HVSRQVH
&RYDULDWHV
<
;
;
;N
\
[
[
[N
\
[
[
[N
1
\1
[1
[1
[N1
7KH; VFDQEHNQRZQQXPHULFYDOXHGFRQFRPLWDQWIDFWRUVVXFKDVDJH:%&RU IDVWLQJJOXFRVHOHYHORUWKH\FDQUHSUHVHQWOHYHOVRIRUGLQDOFDWHJRULFDOYDULDEOHV VXFK DV VPDOOPHGLXPODUJH RU QRQHPLOGPRGHUDWHVHYHUH
L
7KHPRGHOIRUWKHSUREDELOLW\RIµHYHQW¶3LV
3
H
3
æ
ö ÷ è - 3ø
OQ ç
-
[
[ N [ N
VRWKHORJLWEHFRPHVWKHOLQHDUIXQFWLRQ
[ [ N [ N
7KHRGGVFDQEHH[SUHVVHGDV
æ 3 ö ç ÷ è - 3ø
H
+
[ [ N [N
DQGWKHRGGVUDWLRIRU; LV L
25 = H [
L
L
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
ZLWKWKHLQWHUSUHWDWLRQWKDW
× H
L
-
UHSUHVHQWVWKHSHUFHQWLQFUHDVHLQWKHRGGVRIµHYHQW¶RFFXUUHQFHZKHQ; LQFUHDVHV E\XQLWDQGDOORWKHU;¶VDUHKHOGFRQVWDQW L
7KH LPSRUWDQFH RI HDFK FRYDULDWH ; IRU SUHGLFWLQJ WKH HYHQW SUREDELOLW\ LV PHDVXUHG E\ WKH PDJQLWXGH RI WKH SDUDPHWHU FRHIILFLHQW (VWLPDWHV RI WKHVH SDUDPHWHUV FDQ EH IRXQG E\ WKH PHWKRG RI PD[LPXP OLNHOLKRRG XVLQJ 352& /2*,67,& LQ 6$6 )RU ODUJH VDPSOHV WKHVH HVWLPDWHV E KDYH DQ DSSUR[LPDWH QRUPDOGLVWULEXWLRQ,IV UHSUHVHQWVWKHVWDQGDUGHUURURIWKHHVWLPDWHE WKHQE V KDV DQ DSSUR[LPDWH VWDQGDUG QRUPDO GLVWULEXWLRQ XQGHU WKH QXOO K\SRWKHVLV WKDW DQGLWVVTXDUHKDVWKHFKLVTXDUHGLVWULEXWLRQZLWKGHJUHHRIIUHHGRP7KH WHVW VXPPDU\ IRU HDFK PRGHO SDUDPHWHU LV EDVHG RQ WKLV :DOG FKLVTXDUH VXPPDUL]HGDVIROORZV L
L
L
E
L
L
E
L
L
QXOOK\SRWKHVLV DOWHUQDWLYHK\SRWKHVLV
+ +
L
$
WHVWVWDWLVWLF
GHFLVLRQUXOH
Z
L
ËE Û ÌÌ L ÜÜ Í VE Ý
UHMHFW+ LI
Z
!
([DPSOH LOOXVWUDWHV D ORJLVWLF UHJUHVVLRQ DQDO\VLV IRU N XVLQJ 352& /2*,67,&LQ6$6
([DPSOHV
p ([DPSOH5HODSVH5DWH$GMXVWHGIRU5HPLVVLRQ7LPHLQ$0/
2QHKXQGUHGDQGWZRSDWLHQWVZLWKDFXWHP\HORJHQRXVOHXNHPLD$0/ LQ UHPLVVLRQZHUHHQUROOHGLQDVWXG\RIDQHZDQWLVHQVHROLJRQXFOHRWLGH DV2'1 7KHSDWLHQWVZHUHUDQGRPO\DVVLJQHGWRUHFHLYHDGD\LQIXVLRQ RIDV2'1RUQRWUHDWPHQW&RQWURO DQGWKHHIIHFWVZHUHIROORZHGIRU GD\V7KHWLPHRIUHPLVVLRQIURPGLDJQRVLVRUSULRUUHODSVH;LQPRQWKV DW VWXG\HQUROOPHQWZDVFRQVLGHUHGDQLPSRUWDQWFRYDULDWHLQSUHGLFWLQJ UHVSRQVH7KHUHVSRQVHGDWDDUHVKRZQLQ7DEOHZLWK< LQGLFDWLQJ UHODSVHGHDWKRUPDMRULQWHUYHQWLRQVXFKDVERQHPDUURZWUDQVSODQWEHIRUH 'D\,VWKHUHDQ\HYLGHQFHWKDWDGPLQLVWUDWLRQRIDV2'1LVDVVRFLDWHG ZLWKDGHFUHDVHGUHODSVHUDWH"
&KDSWHU/RJLVWLF5HJUHVVLRQ
7$%/(5DZ'DWDIRU([DPSOH DV2'1*URXS 3DWLHQW 1XPEHU
;
<
3DWLHQW 1XPEHU
;
<
3DWLHQW 1XPEHU
;
<
&RQWURO*URXS 3DWLHQW 3DWLHQW 3DWLHQW ; ; < < ; < 1XPEHU 1XPEHU 1XPEHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6ROXWLRQ
DV2'1*URXS
&RQWURO*URXS
; PRQWKV
1XPEHU RI(YHQWV
1
(VWLPDWHG 3[
1XPEHU RI(YHQWV
1
(VWLPDWHG 3[
727$/
,I SULRU UHPLVVLRQ WLPH ; LV LJQRUHG RYHUDOO UHODSVH UDWHV DUH HVWLPDWHG DV IRU WKH DV2'1 JURXS DQG IRU WKH &RQWURO JURXS 7KLV LV QRW D VLJQLILFDQW GLIIHUHQFH LI \RX XVH WKH FKLVTXDUH WHVW GLVFXVVHG LQ &KDSWHU S
&KDSWHU/RJLVWLF5HJUHVVLRQ
),*85(3UREDELOLW\RI5HODSVH3[ YV5HPLVVLRQ7LPH; IRU([DPSOH
3[
DV2'1 &RQWURO
;
7RSHUIRUPORJLVWLFUHJUHVVLRQLQFOXGHDWUHDWPHQWIDFWRUVD\; DQGDUHPLVVLRQ WLPHFRYDULDWHVD\; 'HILQH; DV
; LIQRWUHDWPHQW&RQWURO*URXS
LIWUHDWHGDV2'1*URXS 3ULRUUHPLVVLRQWLPH; LVFRQVLGHUHGDFRQWLQXRXVQXPHULFFRYDULDWH;LQ7DEOH 7KHORJLVWLFUHJUHVVLRQPRGHOLQFRUSRUDWHVERWK; DQG; DVFRYDULDWHV
3 H
; ;
8VH 352& /2*,67,& LQ 6$6 WR ILW WKLV PRGHO DV VKRZQ LQ WKH DQDO\VLV WKDW IROORZV
6$6$QDO\VLVRI([DPSOH ,Q WKH LQSXW GDWD VHW GHILQH WKH WUHDWPHQW HIIHFW 757 DV D GLFKRWRPRXV H[SODQDWRU\YDULDEOHZLWKYDOXHVSODFHERJURXS DQGDFWLYHJURXS å&RGH WKH UHVSRQVH YDULDEOH < DV LI WKH SDWLHQW UHODSVHG DQG LI UHODSVH GLG QRW RFFXU 7KH6$6FRGHIRUSHUIRUPLQJWKHORJLVWLFUHJUHVVLRQDQDO\VLVLVVKRZQIROORZLQJ WKHGDWDLQSXWVHH6$6FRGHIRU([DPSOH DQGWKHUHVXOWVDUHJLYHQLQ2XWSXW 7KH'(6&(1',1*RSWLRQLVXVHGLQWKH352&/2*,67,&VWDWHPHQW êWR OHW6$6NQRZWKDWWKHHYHQWRILQWHUHVWLVµUHODSVH¶RU< 2WKHUZLVH6$6ZLOO PRGHOWKHVPDOOHUYDOXHRI<< DVWKHHYHQW
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH 0/(V RI WKH PRGHO SDUDPHWHUV DQG DUH VHHQ WR EH D E ±DQGE ± ZKLFKUHVXOWLQDQHVWLPDWHGORJLVWLFUHJUHVVLRQ HTXDWLRQRI
3Ö = - -× ; - × ; + H 2QHTXHVWLRQRILQWHUHVWLVZKHWKHU; UHPLVVLRQWLPH;LQ6$6DQDO\VLV LVDQ LPSRUWDQWFRYDULDWHWHVWHGDVIROORZV
QXOOK\SRWKHVLV
+
DOWK\SRWKHVLV
+
WHVWVWDWLVWLF
GHFLVLRQUXOH
UHMHFW+ LI Z !
FRQFOXVLRQ
%HFDXVH ! \RX UHMHFW +
$
Z
LQGLFDWLQJDVLJQLILFDQWFRYDULDWH;
7KHRXWSXWLQGLFDWHVWKDWWKHWLPHVLQFHSULRUUHPLVVLRQ; LVDKLJKO\VLJQLILFDQW EDFNJURXQG IDFWRU IRU SUHGLFWLQJ WKH SUREDELOLW\ RI D UHODSVH ZLWKLQ GD\V S 7KLV LV QRW WDNHQ LQWR DFFRXQW LQ WKH XQDGMXVWHG FKLVTXDUH WHVW ZKLFK\LHOGVUHODSVHUDWHVRIDQGIRUWKH$FWLYHDQG&RQWUROJURXSV UHVSHFWLYHO\S ñ 6LPLODUO\ FRPSDULVRQ RI UHODSVH UDWHV EHWZHHQ WUHDWPHQW JURXSV 757 DGMXVWHG IRUWKHFRYDULDWHLVWHVWHGE\WKHK\SRWKHVLV
+
7KH 6$6 RXWSXW LQGLFDWHV D VLJQLILFDQW :DOG FKLVTXDUH YDOXH RI ZLWK D SYDOXH RI ò 7KH HVWLPDWH RI LV ± DQG WKH HVWLPDWH RI WKH 7KLVPHDQVWKDWDGMXVWHGIRUSULRUUHPLVVLRQWLPH WUHDWPHQW25LVH WKHUH LV D ± ± LQFUHDVH RU UHGXFWLRQ LQ WKH RGGV RI UHODSVH DVVRFLDWHG ZLWK WKH DFWLYH WUHDWPHQW JURXS ; FRPSDUHG ZLWK WKH SODFHERJURXS; %HJLQQLQJZLWK6$69HUVLRQWKHRGGVUDWLRVDQG FRQILGHQFHLQWHUYDOVDUHSULQWHGLQWKHRXWSXWô
±
&KDSWHU/RJLVWLF5HJUHVVLRQ
6$6&RGHIRU([DPSOH DATA AML; INPUT PAT GROUP $ X RELAPSE $ TRT = (GROUP = ’ACT’) ; Y = (RELAPSE = ’YES’); DATALINES; 1 ACT 3 NO 2 ACT 3 YES 4 6 ACT 6 YES 7 ACT 15 NO 10 11 ACT 6 YES 14 ACT 6 YES 15 17 ACT 15 NO 20 ACT 12 NO 21 22 ACT 6 YES 25 ACT 15 NO 26 28 ACT 15 NO 29 ACT 12 YES 32 33 ACT 6 YES 36 ACT 6 NO 39 42 ACT 6 NO 44 ACT 3 YES 46 49 ACT 9 NO 50 ACT 12 YES 52 54 ACT 9 YES 56 ACT 9 YES 58 60 ACT 9 YES 62 ACT 12 NO 63 66 ACT 3 NO 67 ACT 12 NO 69 71 ACT 12 NO 73 ACT 9 YES 74 77 ACT 12 NO 79 ACT 6 NO 81 83 ACT 9 NO 85 ACT 3 YES 88 90 ACT 9 NO 92 ACT 9 NO 94 95 ACT 9 YES 98 ACT 12 YES 99 102 ACT 6 YES 3 PBO 9 YES 5 8 PBO 12 YES 9 PBO 3 YES 12 13 PBO 15 YES 16 PBO 9 YES 18 19 PBO 3 YES 23 PBO 9 YES 24 27 PBO 9 YES 30 PBO 6 YES 31 34 PBO 6 YES 35 PBO 12 NO 37 38 PBO 15 NO 40 PBO 15 YES 41 43 PBO 9 NO 45 PBO 12 YES 47 48 PBO 6 YES 51 PBO 6 YES 53 55 PBO 12 NO 57 PBO 12 YES 59 61 PBO 12 YES 64 PBO 3 YES 65 68 PBO 6 YES 70 PBO 6 YES 72 75 PBO 15 NO 76 PBO 15 NO 78 80 PBO 9 NO 82 PBO 12 NO 84 86 PBO 18 YES 87 PBO 12 NO 89 91 PBO 15 NO 93 PBO 15 NO 96 97 PBO 18 YES 100 PBO 18 NO 101 ;
@@;
ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT ACT PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO PBO
å 3 6 15 18 6 9 6 18 6 3 12 12 6 15 9 9 3 3 3 12 15 9 9 9 3 12 3 12 9 12 15 15 18 18
YES YES NO NO YES NO NO NO NO NO NO NO YES YES NO NO YES NO YES YES YES YES NO NO YES NO YES YES YES NO NO YES NO NO
PROC LOGISTIC DATA = AML DESCENDING; ê MODEL Y = TRT X; TITLE1 ’Logistic Regression’; TITLE2 ’Example 20.1: Relapse Rate Adjusted for Remission Time in AML’; RUN; PROC FREQ DATA = AML; TABLES TRT*Y / CHISQ NOCOL NOPERCENT NOCUM; TITLE3 ’Chi-Square Test Ignoring the Covariate’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH
Example 20.1:
Logistic Regression Relapse Rate Adjusted for Remission Time in AML The LOGISTIC Procedure
Model Information Data Set WORK.AML Response Variable Y Number of Response Levels 2 Number of Observations 102 Link Function Binary Logit Optimization Technique Fisher’s scoring Response Profile Ordered Value
Y
Total Frequency
1 2
1 0
53 49
Model Convergence Status Convergence criterion (GCONV=1E-8) satisfied. Model Fit Statistics
Criterion
Intercept Only
Intercept and Covariates
143.245 145.870 141.245
129.376 137.251 123.376
AIC SC -2 Log L
Testing Global Null Hypothesis: BETA=0 Test
Chi-Square
DF
Pr > ChiSq
17.8687 16.4848 14.0612
2 2 2
0.0001 0.0003 0.0009
Likelihood Ratio Score Wald
Analysis of Maximum Likelihood Estimates
Parameter
DF
Estimate
Standard Error
Intercept TRT X
1 1 1
2.6135 -1.1191 -0.1998
0.7149 0.4669 0.0560
Chi-Square
Pr > ChiSq
13.3662 5.7446 12.7187
0.0003 0.0165 0.0004
õ
Odds Ratio Estimates
Effect TRT X
Point Estimate 0.327 0.819
95% Wald Confidence Limits
ù
0.131 0.734
0.815 0.914
ô
ò
&KDSWHU/RJLVWLF5HJUHVVLRQ
2873876$62XWSXWIRU([DPSOHFRQWLQXHG Example 20.1:
Logistic Regression Relapse Rate Adjusted for Remission Time in AML
Association of Predicted Probabilities and Observed Responses Percent Concordant Percent Discordant Percent Tied Pairs
68.5 23.1 8.4 2597
Somers’ D Gamma Tau-a c
0.454 0.496 0.229 0.727
Chi-Square Test Ignoring the Covariate The FREQ Procedure Table of TRT by Y TRT
Y
Frequency| Row Pct | 0| 1| ---------+--------+--------+ 0 | 20 | 30 | | 40.00 | 60.00 | ---------+--------+--------+ 1 | 29 | 23 | | 55.77 | 44.23 | ---------+--------+--------+ Total 49 53
Total 50
52
ñ
102
Statistics for Table of TRT by Y Statistic DF Value Prob -----------------------------------------------------Chi-Square 1 2.5394 0.1110 Likelihood Ratio Chi-Square 1 2.5505 0.1103 Continuity Adj. Chi-Square 1 1.9469 0.1629 Mantel-Haenszel Chi-Square 1 2.5145 0.1128 Phi Coefficient -0.1578 Contingency Coefficient 0.1559 Cramer’s V -0.1578
Fisher’s Exact Test ---------------------------------Cell (1,1) Frequency (F) 20 Left-sided Pr <= F 0.0813 Right-sided Pr >= F 0.9637 Table Probability (P) Two-sided Pr <= P Sample Size = 102
0.0450 0.1189
ñ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
0XOWLQRPLDO5HVSRQVHV ,QDGGLWLRQWRGLFKRWRPRXVRXWFRPHVORJLVWLFUHJUHVVLRQPHWKRGVFDQEHXVHGLQ FDVHV WKDW LQYROYH RUGLQDO FDWHJRULFDO UHVSRQVHV ZLWK PRUH WKDQ WZR OHYHOV 7KH VWUDWHJ\ XVHV D µSURSRUWLRQDO RGGV¶ PRGHO ZKLFK VLPXOWDQHRXVO\ ILWV P ELQDU\ ORJLVWLF UHJUHVVLRQ PRGHOV RI WKH IRUP MXVW GLVFXVVHG ZKHQ WKHUH DUH P ! RUGLQDOUHVSRQVHOHYHOV &RQVLGHU DQ H[DPSOH LQ ZKLFK WKH UHVSRQVH < LV PHDVXUHG RQ DQ RUGLQDO LPSURYHPHQWVFDOHRIµZRUVH¶µQRFKDQJH¶µLPSURYHG¶RUµFXUHG¶,QWKLVFDVHP LVHTXDOWRVR\RXFDQGHILQHWKUHHµFXPXODWLYH¶ELQDU\UHVSRQVHVDVIROORZV < LI< µZRUVH¶ LI< µQRFKDQJH¶µLPSURYHG¶RUµFXUHG¶ < LI< µZRUVH¶RUµQRFKDQJH¶ LI< µLPSURYHG¶RUµFXUHG¶ < LI< µZRUVH¶µQRFKDQJH¶RUµLPSURYHG¶ LI< µFXUHG¶ 7KH ILUVW GLFKRWRPRXV UHVSRQVH YDULDEOH < LV PHDVXULQJ ZKHWKHU WKH SDWLHQW¶V FRQGLWLRQ GRHV QRW GHWHULRUDWH 6LPLODUO\ < PHDVXUHV ZKHWKHU WKH SDWLHQW LPSURYHVDQG< PHDVXUHVZKHWKHUWKHSDWLHQWLVFXUHG7DNHQWRJHWKHU\RXFDQ XVH< < DQG< DVDQRYHUDOOPHDVXUHRILPSURYHPHQW
æ 3 ö OQ ç ÷ [ [ + + [ è - 3 ø ZLWK WKH FRQVWUDLQW WKDW WKH VORSHV PXVW EH HTXDO LH HWF VR\RXDFWXDOO\ILWWKHPRGHO M
M
M
M
NM
N
M
æ 3 ö OQ ç ÷ [ [ + + [ è 3 ø 5HTXLULQJ HTXDO VORSHV DPRQJ WKH P OHYHOV IRU HDFK FRYDULDWH LV FDOOHG WKH µSURSRUWLRQDO RGGV DVVXPSWLRQ¶ 8VLQJ ORJLVWLF UHJUHVVLRQ DQDO\VLV \RX ILW WKH PRGHO E\ REWDLQLQJ HVWLPDWHV RI WKH FRHIILFLHQWV RI HDFK FRYDULDWH XQGHU WKLV H FRQVWUDLQW$VEHIRUHLVWKHRGGVUDWLRWKDWUHSUHVHQWVWKHSURSRUWLRQDOLQFUHDVH LQRGGVRILPSURYHPHQWIRUHDFKXQLWLQFUHDVHLQ[ L «N M
M
N
N
M
L
L
,Q6$6352&/2*,67,&DXWRPDWLFDOO\ILWVWKHSURSRUWLRQDORGGVPRGHOZKHQLW GHWHFWVPRUHWKDQWZRTXDQWLWDWLYHUHVSRQVHOHYHOVDVVKRZQLQ([DPSOH
&KDSWHU/RJLVWLF5HJUHVVLRQ
p ([DPSOH6\PSWRP5HOLHILQ*DVWURSDUHVLV
3DWLHQWVZLWKVHYHUHV\PSWRPVRIJDVWURSDUHVLVZHUHUDQGRPL]HGWRUHFHLYH DQH[SHULPHQWDOWKHUDS\$ RUGLHWDU\FKDQJHVZLWKµZDWFKIXOZDLWLQJ¶% 5HVSRQVHZDVPHDVXUHGDIWHUGD\VXVLQJWKHIROORZLQJVFDOH QR UHVSRQVH VRPHUHVSRQVH PDUNHGUHVSRQVHRU FRPSOHWHUHVSRQVH EDVHGRQWKHGHJUHHRIV\PSWRPUHOLHI$SDWLHQW¶VKLVWRU\RIVHYHUH JDVWURSDUHVLVWKRXJKWWREHDQLPSRUWDQWFRYDULDWHZDVDOVRUHFRUGHGDVµQR SULRUHSLVRGHV¶ µRQHSULRUHSLVRGH¶ RUµPRUHWKDQRQHSULRUHSLVRGH¶ 7KHGDWDDUHVKRZQLQ7DEOH,VWKHUHDQ\GLIIHUHQFHLQUHVSRQVH EHWZHHQWKHSDWLHQWVZKRUHFHLYHGWKHH[SHULPHQWDOWKHUDS\DQGWKHXQWUHDWHG SDWLHQWV"
7$%/(5DZ'DWDIRU([DPSOH 1HZ7UHDWPHQW$ 3DWLHQW +LVWRU\ 5HVSRQVH 1XPEHU
8QWUHDWHG*URXS% 3DWLHQW +LVWRU\ 5HVSRQVH 1XPEHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6ROXWLRQ
7KHUHVSRQVH<LVRUGLQDOZLWKOHYHOVVRWKHSURSRUWLRQDORGGVDSSURDFKLVXVHG E\ ILWWLQJ WKUHH GLFKRWRPRXV ORJLVWLF UHJUHVVLRQ PRGHOV DV VKRZQ LQ 7DEOH 7KHUHDUHWZRFDWHJRULFDOFRYDULDWHV6WXG\*URXS; ZLWKOHYHOV$DQG%DQG +LVWRU\RIVHYHUHJDVWURSDUHVLV; ZLWKOHYHOVDQG
7$%/('LFKRWRPRXV6XE0RGHOVIRUWKH3URSRUWLRQDO2GGV0RGHO
'LFKRWRPRXV &DWHJRULHV
,QWHUSUHWDWLRQ
L
YV
µFRPSOHWH¶YVµLQFRPSOHWHUHVSRQVH¶
; ;
LL
YV
µPDUNHGRUFRPSOHWHUHVSRQVH¶YV µVRPHRUQRUHVSRQVH¶
; ;
LLL
YV
µUHVSRQVH¶YVµQRUHVSRQVH¶
; ;
0RGHOORJLW
0D[LPXPOLNHOLKRRG HVWLPDWHV DUH HDVLO\ IRXQG E\ XVLQJ 6$6 DV VKRZQ LQ WKH DQDO\VLVWKDWIROORZV
6$6$QDO\VLVRI([DPSOH
,QWKLVFDVHWKHV\QWD[XVHGLQ352&/2*,67,&LVVLPLODUWRWKDWXVHGLQ([DPSOH IRUDGLFKRWRPRXVUHVSRQVH7KHPDLQGLIIHUHQFHKHUHLVWKDWWKH6WXG\*URXS HIIHFWKDVFKDUDFWHUYDOXHV$DQG% 5DWKHUWKDQXVHWKH'$7$VWHSWRFRQYHUW WKHP WR QXPHULF YDOXHV DQG UHVSHFWLYHO\ \RX XVH D &/$66 VWDWHPHQW LQ 352&/2*,67,&ZKLFKWHOOV6$6WKDWWKHYDOXHVKDYHQRW\HWEHHQFRQYHUWHG 11
&KDSWHU/RJLVWLF5HJUHVVLRQ
L
3 ö ÷ è - 3 ø æ
OQ ç
= - +
;
-
;
æ 3 ö LL OQ ç ÷ = + ; - ; è - 3 ø æ 3 ö LLL OQ ç - 3 ÷ = + ; - ; è ø 7KH RGGV UDWLR IRU WKH 6WXG\ *URXS HIIHFW LV H 14 ZLWK WKH LQWHUSUHWDWLRQ WKDW WKH RGGV RI VRPH GHJUHH RI UHVSRQVH ZLWK *URXS $ H[SHULPHQWDO WUHDWPHQW LV JUHDWHU WKDQ WKDW RI WKH XQWUHDWHG JURXS % DGMXVWHGIRUKLVWRU\RISULRUH[SHULHQFHZLWKVHYHUHJDVWURSRUHVLV7KHSYDOXHLV QRWTXLWHVLJQLILFDQWS 15
±
6$6&RGHIRU([DPSOH DATA GI; INPUT PAT RX $ HIST RESP @@; /* A = new treatment, B = untreated /* HIST = 0 if no prior episodes, HIST = 1 if one /* prior episode, HIST = 2 if >1 prior episodes /* RESP = 1 (none), 2 (some), 3 (marked), 4 (complete) DATALINES; 101 A 1 3 102 A 2 3 103 B 1 3 104 A 1 2 105 B 1 3 106 A 0 4 107 B 1 2 108 A 2 1 109 A 0 4 110 B 2 3 111 B 2 1 112 B 0 4 113 A 1 3 114 A 1 4 115 A 0 4 116 A 2 2 117 A 1 3 118 B 0 3 119 B 2 2 120 B 1 1 121 A 2 1 122 B 2 1 123 B 1 4 124 A 1 4 125 A 1 2 126 B 0 2 127 A 2 4 128 A 0 3 129 B 2 2 130 A 1 4 131 B 1 3 132 A 1 3 133 A 1 1 134 B 1 2 135 A 1 4 136 B 2 3 137 A 0 3 138 B 2 1 139 B 0 4 140 A 2 4 141 B 1 3 142 B 2 4 143 A 1 3 144 A 2 3 145 A 2 2 146 B 1 2 147 A 1 4 148 B 1 1 149 A 2 1 150 B 1 2 151 A 2 4 152 B 1 3 153 A 1 2 154 A 1 2 155 A 1 4 ; PROC LOGISTIC DESCENDING DATA = GI; 11 CLASS RX / PARAM = REF ; MODEL RESP = RX HIST; TITLE1 ’Logistic Regression’; TITLE2 ’Example 20.2: Symptom Relief in Severe Gastroparesis’; RUN;
*/ */ */ */
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOH
Logistic Regression Example 20.2: Symptom Relief in Severe Gastroparesis The LOGISTIC Procedure Model Information Data Set Response Variable Number of Response Levels Number of Observations Link Function Optimization Technique
WORK.GI RESP 4 55 Cumulative Logit Fisher’s scoring
Response Profile Ordered Value
RESP
Total Frequency
1 2 3 4
4 3 2 1
16 17 13 9
Probabilities modeled are cumulated over the lower Ordered Values. Class Level Information
Class
Value
RX
A B
Design Variables 1 12
1 0
Model Convergence Status Convergence criterion (GCONV=1E-8) satisfied. Score Test for the Proportional Odds Assumption Chi-Square
DF
Pr > ChiSq
2.5521
4
0.6353
Model Fit Statistics
Criterion AIC SC -2 Log L
Intercept Only
Intercept and Covariates
155.516 161.538 149.516
149.907 159.944 139.907
18
&KDSWHU/RJLVWLF5HJUHVVLRQ
2873876$62XWSXWIRU([DPSOHFRQWLQXHG
Logistic Regression Example 20.2: Symptom Relief in Severe Gastroparesis The LOGISTIC Procedure Testing Global Null Hypothesis: BETA=0 Test
Chi-Square
DF
Pr > ChiSq
9.6084 9.0147 8.7449
2 2 2
0.0082 0.0110 0.0126
Likelihood Ratio Score Wald
Type III Analysis of Effects
Effect RX HIST
DF
Wald Chi-Square
Pr > ChiSq
1 1
2.9426 6.4304
0.0863 0.0112
Analysis of Maximum Likelihood Estimates
Parameter
DF
Estimate
Standard Error
Chi-Square
0.5831 0.6001 0.6698 0.5061 0.3841
0.4449 3.2123 13.3018 2.9426 6.4304
Pr > ChiSq
13
Intercept4 Intercept3 Intercept2 RX A HIST
1 1 1 1 1
-0.3890 1.0756 2.4430 0.8682 -0.9741
0.5048 0.0731 0.0003 0.0863 15 0.0112 16
Odds Ratio Estimates
Effect RX A vs B HIST
Point Estimate 2.383 14 0.378 17
95% Wald Confidence Limits 0.884 0.178
6.425 0.802
Association of Predicted Probabilities and Observed Responses Percent Concordant Percent Discordant Percent Tied Pairs
57.6 23.7 18.7 1115
Somers’ D Gamma Tau-a c
0.339 0.417 0.255 0.670
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
&OXVWHUHG%LQRPLDO'DWD ,Q &KDSWHU \RX VDZ KRZ WKH &RFKUDQ0DQWHO+DHQV]HO WHVW FDQ EH XVHG WR DQDO\]HELQRPLDOUHVSRQVHVIURPVWUDWLILHGJURXSV,QWKDWFDVHREVHUYDWLRQVZLWKLQ VWUDWDDUHFRQVLGHUHGLQGHSHQGHQWJHQHUDOO\DULVLQJIURPGLIIHUHQWSDWLHQWV 6RPHWLPHVELQDU\UHVSRQVHVZLWKLQVWUDWDDUHFRUUHODWHGLQZKLFKFDVHWKH&0+ WHVW ZRXOG QRW EH DSSURSULDWH 6XFK VLWXDWLRQV DULVH IUHTXHQWO\ LQ FOLQLFDO WULDOV ZKHQDQXPEHURIELQDU\RXWFRPHVDUHREVHUYHGLQHDFKRIVHYHUDOSDWLHQWV )RU H[DPSOH VXSSRVH PLJUDLQH SDWLHQWV DUH SUHVFULEHG D QHZ µDV QHHGHG¶ PHGLFDWLRQDQGDUHDVNHGWRNHHSDGLDU\RIWKHQXPEHURIPLJUDLQHKHDGDFKHVWKH\ H[SHULHQFHDQGWKHQXPEHURIWKRVHZKLFKDUHUHVROYHGZLWKLQWZRKRXUVRIWDNLQJ WKH PHGLFDWLRQ ,Q WKLV FDVH HDFK PLJUDLQH RFFXUUHQFH UHVXOWV LQ WKH ELQDU\ RXWFRPH µUHVROYHG¶ RU µQRW UHVROYHG¶ ZLWKLQ WZR KRXUV RI PHGLFDWLRQ :KLOH LQGHSHQGHQFH FDQ EH DVVXPHG DPRQJ SDWLHQWV WKHUH LV D QDWXUDO FOXVWHULQJ RI REVHUYDWLRQVIRUHDFKSDWLHQWIRUZKRP\RXFRXOGUHDVRQDEO\DVVXPHUHVSRQVHVDUH FRUUHODWHG 6XFK GDWD FDQ EH DQDO\]HG E\ XVLQJ 352& /2*,67,& ZLWK D FRUUHFWLRQ IRU RYHUGLVSHUVLRQDVVKRZQLQ([DPSOH7KLVSURFHGXUHHVVHQWLDOO\UHVFDOHVWKH FRUUHODWLRQ PDWUL[ WKDW LV XVHG WR HVWLPDWH WKH VWDQGDUG HUURUV RI WKH SDUDPHWHU HVWLPDWHV :LWKRXW WKLV DGMXVWPHQW WKH ZLWKLQFOXVWHU FRUUHODWLRQ PLJKW OHDG WR XQGHUHVWLPDWLQJWKHVWDQGDUGHUURUVZKLFKLQWXUQOHDGVWRUHVXOWVWKDWDUHRYHUO\ VLJQLILFDQW
p ([DPSOH,QWHUFRXUVH6XFFHVV5DWHLQ(UHFWLOH'\VIXQFWLRQ
0DOHSDWLHQWVZKRH[SHULHQFHGHUHFWLOHG\VIXQFWLRQIROORZLQJSURVWDWH VXUJHU\ZHUHHQUROOHGLQDSDUDOOHOVWXG\WRFRPSDUHDQHZDQWLLPSRWHQFH WUHDWPHQWZLWKDPDUNHWHGGUXJ3DWLHQWVZHUHDVNHGWRNHHSDGLDU\IRURQH PRQWKWRUHFRUGWKHQXPEHURIDWWHPSWVDWVH[XDOLQWHUFRXUVHDIWHUWDNLQJWKH VWXG\PHGLFDWLRQDQGWKHQXPEHURIWKRVHDWWHPSWVWKDWZHUHµVXFFHVVIXO¶ 'DWDLQFOXGLQJSDWLHQW¶VDJHZKLFKLVWKRXJKWWREHDQLPSRUWDQWFRYDULDWH DUHVKRZQLQ7DEOH,VWKHUHDQ\GLIIHUHQFHLQµVXFFHVV¶UDWHVEHWZHHQWKH QHZPHGLFDWLRQDQGWKHSURGXFWDOUHDG\RQWKHPDUNHW"
&KDSWHU/RJLVWLF5HJUHVVLRQ
7$%/(5DZ'DWDIRU([DPSOH 5HIHUHQFH&RQWURO 3DWLHQW $JH 1XPEHURI 1XPEHURI 1XPEHU \UV 6XFFHVVHV $WWHPSWV
3DWLHQW 1XPEHU
1HZ'UXJ $JH 1XPEHURI \UV 6XFFHVVHV
1XPEHURI $WWHPSWV
6ROXWLRQ
,Q WKLV FDVH HDFK SDWLHQW UHSUHVHQWV D FOXVWHU RI OLNH ELQRPLDO UHVSRQVHV ,W LV QDWXUDOWRDVVXPHWKDWZLWKLQSDWLHQWUHVSRQVHVPLJKWEHFRUUHODWHGFHUWDLQO\PRUH VRWKDQDPRQJGLIIHUHQWSDWLHQWV,I\RXSHUIRUPDORJLVWLFUHJUHVVLRQDQDO\VLVRI UHVSRQVHUDWHVIRUWUHDWPHQW*URXSDQG$JHWKDWLJQRUHVWKLVFOXVWHULQJ\RXPLJKW JHWLQDFFXUDWHUHVXOWVVR\RXSURFHHGZLWKDORJLVWLFUHJUHVVLRQDQDO\VLVWKDWXVHVD FRUUHFWLRQIRURYHUGLVSHUVLRQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6$6$QDO\VLVRI([DPSOH
7ZRSDWLHQWVZLWKQRDWWHPSWV 19 DUHH[FOXGHGIURPWKHDQDO\VLVZKLFKOHDYHV SDWLHQWVRUµFOXVWHUV¶RIUHVSRQVHGDWD7KHORJLVWLFUHJUHVVLRQPRGHOLVVSHFLILHGLQ 6$6LQWKHXVXDOZD\XVLQJ352&/2*,67,&H[FHSWWKDWLQWKLVFDVH\RXXVHWKH µHYHQWVWULDOV¶IRUPZKLFKLVDQDOWHUQDWLYHVSHFLILFDWLRQIRUWKHGHSHQGHQWYDULDEOH LQWKH02'(/VWDWHPHQW7KHµHYHQWV¶DUHWKHQXPEHURIVXFFHVVHV68&& DQG WKHµWULDOV¶DUHWKHQXPEHURIDWWHPSWV$7737 ,QFOXGHWUHDWPHQWJURXS757 DV D GLFKRWRPRXV H[SODQDWRU\ YDULDEOH DQG DJH $*( DV D FRQWLQXRXV QXPHULF FRYDULDWHLQWKHPRGHO 6HYHUDO GLIIHUHQW PHWKRGV IRU UHVFDOLQJ WKH FRYDULDQFH PDWUL[ DUH DYDLODEOH LQ 6$6:LOOLDPV¶PHWKRGRIWHQZRUNVZHOOZLWKGDWDRIWKLVW\SHHVSHFLDOO\ZKHQ VDPSOH VL]HV GLIIHU DPRQJ WKH FOXVWHUV 7KLV DSSURDFK LV LPSOHPHQWHG E\ VSHFLI\LQJWKH6&$/( :,//,$06RSWLRQLQWKH02'(/VWDWHPHQW 20 7KHUHVXOWVLQGLFDWHDVLJQLILFDQWWUHDWPHQWHIIHFW 21 S ZLWKDQHVWLPDWHG RGGVUDWLRRI 22 7KLVPHDQVWKDWWKHDJHDGMXVWHGRGGVRIVXFFHVVLQFUHDVH E\IRUWKHQHZGUXJUHODWLYHWRWKHFRQWURO$JHLVVKRZQWREHDVLJQLILFDQW FRYDULDWH 23 S ,W¶VRGGVUDWLRLV 24 ZKLFKPHDQVWKDWWKHRGGVRI VXFFHVVGHFUHDVHE\DERXWIRUHDFK\HDUWKHSDWLHQWDJHV 6$6&RGHIRU([DPSOH DATA DIARY; INPUT PAT TRT AGE SUCC ATTPT @@; /* TRT = 1 for new drug, TRT = 0 IF ATTPT = 0 THEN DELETE; DATALINES; 1 0 41 3 6 3 0 44 5 15 5 0 8 0 70 3 8 11 0 35 4 8 13 0 18 0 61 1 7 22 0 35 5 5 24 0 27 0 35 4 10 30 0 61 4 8 31 0 37 0 53 2 4 39 0 72 4 6 40 0 44 0 53 8 15 45 0 45 3 4 48 0 4 1 54 10 12 7 1 65 0 0 9 1 12 1 44 17 22 14 1 66 2 3 16 1 19 1 40 2 4 20 1 44 9 16 21 1 26 1 51 6 12 28 1 67 5 11 29 1 33 1 69 0 2 35 1 53 4 14 36 1 42 1 39 4 9 43 1 35 8 10 46 1 ;
for reference control */ 19
62 0 4 72 1 6 52 6 8 55 2 5 68 0 0 40 14 20 51 5 8 55 9 11 64 5 9 44 3 3 49 5 8 47 4 5
6 15 25 34 41 2 10 17 23 32 38 47
0 0 0 0 0 1 1 1 1 1 1 1
44 1 2 34 5 15 66 1 7 41 7 9 56 12 17 57 3 8 53 8 10 37 6 8 78 1 3 65 7 18 74 10 15 46 6 7
PROC LOGISTIC DATA = DIARY; MODEL SUCC/ATTPT = TRT AGE / SCALE = WILLIAMS; 20 TITLE1 ’Logistic Regression’; TITLE2 ’Example 20.3: Intercourse Success Rate in Erectile Dysfunction’; RUN;
&KDSWHU/RJLVWLF5HJUHVVLRQ
2873876$62XWSXWIRU([DPSOH
Logistic Regression Example 20.3: Intercourse Success Rate in Erectile Dysfunction The LOGISTIC Procedure Model Information Data Set Response Variable (Events) Response Variable (Trials) Number of Observations Weight Variable Sum of Weights Link Function Optimization Technique
WORK.DIARY SUCC ATTPT 46 1 / ( 1 + 0.056638 * (ATTPT - 1) ) 26 268.53902437 binary logit Fisher’s scoring
Response Profile Binary Total Outcome Frequency
Ordered Value 1 2
Event Nonevent
Total Weight
234 183
149.48211 119.05692
Model Convergence Status Convergence criterion (GCONV=1E-8) satisfied. Deviance and Pearson Goodness-of-Fit Statistics Criterion
DF
Value
Value/DF
Pr > ChiSq
Deviance Pearson
43 43
48.2014 42.9997
1.1210 1.0000
0.2706 0.4713
Number of events/trials observations: 46
NOTE: Because the Williams method was used to accommodate 25 overdispersion, the Pearson chi-squared statistic and the deviance can no longer be used to assess the goodness of fit of the model. Model Fit Statistics
Criterion AIC SC -2 Log L
Intercept Only
Intercept and Covariates
370.820 374.853 368.820
364.792 376.891 358.792
Testing Global Null Hypothesis: BETA=0 Test Likelihood Ratio Score Wald
Chi-Square
DF
Pr > ChiSq
10.0277 9.9254 9.6295
2 2 2
0.0066 0.0070 0.0081
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG
Logistic Regression Example 20.3: Intercourse Success Rate in Erectile Dysfunction The LOGISTIC Procedure Analysis of Maximum Likelihood Estimates
Parameter
DF
Estimate
Standard Error
Chi-Square
Intercept TRT AGE
1 1 1
1.3384 0.5529 -0.0271
0.5732 0.2534 0.0108
5.4515 4.7614 6.2497
Pr > ChiSq 0.0196 0.0291 21 0.0124 23
Odds Ratio Estimates
Effect TRT AGE
Point Estimate 1.738 0.973
95% Wald Confidence Limits 22 24
1.058 0.953
2.856 0.994
Association of Predicted Probabilities and Observed Responses Percent Concordant Percent Discordant Percent Tied Pairs
57.9 38.5 3.6 42822
Somers’ D Gamma Tau-a c
0.194 0.201 0.096 0.597
'HWDLOV 1RWHV
/RJLVWLF UHJUHVVLRQ LV D XVHIXO WRRO IRU FRQGXFWLQJ H[SORUDWRU\ DQDO\VHVWRLGHQWLI\EDFNJURXQGIDFWRUVWKDWPLJKWKHOSH[SODLQWUHQGVREVHUYHG LQWUHDWPHQWFRPSDULVRQV6$6SULQWVRXWWZRPHDVXUHV$NDLNH¶V,QIRUPDWLRQ &ULWHULRQ$,& DQGWKH6FKZDU]&ULWHULRQ6& ZKRVHUHODWLYHYDOXHVFDQEH XVHG WR DVVHVV WKH JRRGQHVV RI ILW RI WKH PRGHO &RYDULDWHV FDQ EH DGGHG RU UHPRYHG LQ D VWHSZLVH IDVKLRQ WR REWDLQ WKH PRVW DSSURSULDWH PRGHO E\ DWWHPSWLQJWRPLQLPL]HWKHYDOXHVRIWKH$,&DQG6& :KHQ \RX XVH WKH 6&$/( RSWLRQ LQ WKH 02'(/ VWDWHPHQW LQ 352& /2*,67,& WZR RWKHU JRRGQHVVRIILW WHVWV WKH 3HDUVRQ DQG 'HYLDQFH WHVWV DUH SULQWHG 1RQVLJQLILFDQFH RI WKHVH WHVWV ZKLFK KDYH DQ DSSUR[LPDWH FKL VTXDUHGLVWULEXWLRQZLWKVXIILFLHQWO\ODUJHVDPSOHVL]HVSURYLGHVDQLQGLFDWLRQ RIDGHTXDWHILW$VQRWHGLQ2XWSXW 25 WKHVHJRRGQHVVRIILWWHVWVDUHQRW YDOLGZKHQWKH:LOOLDPV¶FRUUHFWLRQIRURYHUGLVSHUVLRQLVPDGH
&KDSWHU/RJLVWLF5HJUHVVLRQ
7KH $**5(*$7( RSWLRQ LQ WKH 02'(/ VWDWHPHQW FDQEHXVHGWRLGHQWLI\ VXEJURXSVGHILQHGE\FRPELQDWLRQVRIIDFWRUOHYHOVIRUDVVHVVLQJJRRGQHVVRI ILW 7KHVH VDPH VXEJURXSV DUH WKHQ XVHG WR FRPSDUH UHODWLYH JRRGQHVVRIILW PHDVXUHVDPRQJFRPSHWLQJPRGHOV :KHQRQHRUPRUHRIWKHH[SODQDWRU\YDULDEOHVLVPHDVXUHGRQDFRQWLQXRXV QXPHULF VFDOH \RX PLJKW SUHIHU WR XVH WKH /$&.),7 RSWLRQ WR DVVHVV JRRGQHVVRIILW7KLVRSWLRQSHUIRUPVDJRRGQHVVRIILWWHVWEDVHGRQDGLYLVLRQ RIWKHUDQJHRIWKHFRQWLQXRXVQXPHULFFRYDULDWHVLQWRLQWHUYDOV $VVHVVLQJWKHILWRIDORJLVWLFUHJUHVVLRQPRGHOFDQEHDFRPSOH[SURFHVVDQGWKH GHWDLOVDUHEH\RQGWKHVFRSHRIWKLVLQWURGXFWLRQWRORJLVWLFUHJUHVVLRQ([DPSOHV WKDWGHPRQVWUDWHJRRGQHVVRIILWPHWKRGVDQGSURYLGHDGGLWLRQDOGHWDLOVFDQEH IRXQG LQ PDQ\ H[FHOOHQW UHIHUHQFHV LQFOXGLQJ WZR KLJKO\ UHFRPPHQGHG 6$6 %%8ERRNV$OOLVRQ DQG6WRNHV'DYLVDQG.RFK ,Q FOLQLFDO WULDOV HTXDOLW\ RI UHVSRQVH UDWHV EHWZHHQ JURXSV LV XVXDOO\WHVWHGE\WKHK\SRWKHVLVRIDGLIIHUHQFHDV\RXKDYHVHHQLQSUHYLRXV FKDSWHUV
$FWLYHJURXS
&RQWUROJURXS
$FRPSDULVRQRIWKHVHRGGVUHVXOWVLQDQRGGVUDWLRRI ZKLFK LV FRQVLGHUDEO\ ODUJHU WKDQ WKH RGGV UDWLR IRXQG E\ XVLQJ WKH FRYDULDQFHDGMXVWPHQW $Q DSSUR[LPDWH FRQILGHQFH LQWHUYDO FDQ EH FRQVWUXFWHG IRU WKHRGGVUDWLRVE\H[SRQHQWLDWLQJWKHXSSHUDQGORZHUFRQILGHQFHOLPLWVIRUWKH PRGHO SDUDPHWHU %HFDXVH WKH 0/(V RI WKH PRGHO SDUDPHWHUV KDYH DV\PSWRWLF QRUPDO GLVWULEXWLRQV FRQILGHQFH OLPLWV FDQ EH FRQVWUXFWHG LQ WKH XVXDOPDQQHU,Q([DPSOHWKHFRQILGHQFHOLPLWVIRUWKH757SDUDPHWHU LVUHSUHVHQWHGE\ L
E ¼VE
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
ZKHUH E DQG VE FDQ EH REWDLQHG IURP WKH 6$6 RXWSXW õ 2XWSXW VKRZVE ±DQGVE UHVXOWLQJLQDFRQILGHQFHLQWHUYDO IRU DV
±¼ RU WR± $Q DSSUR[LPDWH FRQILGHQFH LQWHUYDO IRU WKH RGGV RI UHVSRQVH RQ DFWLYH WUHDWPHQWUHODWLYHWRFRQWUROLV H WRH RU WR ±
±
%HJLQQLQJ ZLWK 6$6 9HUVLRQ HVWLPDWHV RI WKH RGGV UDWLRV DQG FRQILGHQFH OLPLWV DUH DXWRPDWLFDOO\ SULQWHG LQ 6$6 RXWSXW ô ,Q YHUVLRQV RI 6$6 SULRU WR 9 FRQILGHQFH LQWHUYDOV FDQ EH REWDLQHG E\ XVLQJ WKH 5,6./,0,76RSWLRQLQWKH02'(/VWDWHPHQWLQ352&/2*,67,&VHHWKH 6$6+HOSDQG2QOLQH'RFXPHQWDWLRQIRU352&/2*,67,& 2WKHU WUDQVIRUPDWLRQV VXFK DV WKH SURELW EDVHG RQ WKH LQYHUVH QRUPDOGLVWULEXWLRQ FDQEHXVHGLQDPDQQHUVLPLODUWRWKHORJLW7KHORJLVWLF WUDQVIRUPDWLRQ HQMR\V SRSXODU XVDJH EHFDXVH RI LWV DSSOLFDWLRQ HDVH DQG DSSURSULDWHQHVV DV D PRGHO IRU D ZLGH UDQJH RI QDWXUDO SKHQRPHQD $V LOOXVWUDWHGLQDSORWRI3YVLWVORJLWIXQFWLRQ)LJXUH OQ3±3 LVIDLUO\ OLQHDULQ3RYHUDODUJHSRUWLRQRIWKHUDQJHRI3DSSUR[LPDWHO\WR )RU YHU\ODUJHRUYHU\VPDOOYDOXHVRI3WKHORJLWRI3LVJUHDWO\PDJQLILHG7KLVLV FRQVLVWHQW ZLWK D PRGHO EDVHG RQ LQWXLWLRQ LQ ZKLFK D FKDQJH LQ UHVSRQVH UDWHV VD\ IURP WR PLJKW QRW EH DV FOLQLFDOO\ UHOHYDQW DV D FKDQJH IURPWR 7KHORJLVWLFWUDQVIRUPDWLRQDOVRKDVWKHDGYDQWDJHRIPDSSLQJDOLPLWHGUDQJH RI REVHUYDEOH YDOXHV WR RQWR DQ XQOLPLWHG UDQJH ±WR 7KXVWKH SRWHQWLDO SUREOHP RI REWDLQLQJ HVWLPDWHV RXWVLGH WKH REVHUYDEOH UDQJH LV HOLPLQDWHG
&KDSWHU/RJLVWLF5HJUHVVLRQ
),*85(3ORWRI7UDQVIRUPDWLRQIURP3WRWKH/RJLW3
OQ33 3 7KLVFKDSWHUSUHVHQWVDQLQWURGXFWLRQWRORJLVWLFUHJUHVVLRQXVLQJ FRYDULDWHVWKDWKDYHHLWKHUQXPHULFRURUGLQDOYDOXHV7KHVHPHWKRGVLQFOXGH ERWK RUGLQDO DQG QRPLQDO GLFKRWRPRXV H[SODQDWRU\ YDULDEOHV EHFDXVH WKH GLFKRWRPRXVOHYHOVFDQDOZD\VEHFRGHGDVWR/RJLVWLFUHJUHVVLRQFDQDOVR EH XVHG ZLWK QRPLQDOOHYHO FDWHJRULFDOH[SODQDWRU\YDULDEOHVWKDWKDYHPRUH WKDQWZROHYHOVE\GHILQLQJGXPP\YDULDEOHVWKDWFRUUHVSRQGWRWKHOHYHOV)RU H[DPSOH VXSSRVH JHRJUDSKLF UHJLRQ LV D FRYDULDWH RI LQWHUHVW ZLWK OHYHOV $ 1RUWK %6RXWK &0LGZHVW DQG':HVW 2QHRIWKHOHYHOVHJ$LV GHVLJQDWHGDVDUHIHUHQFHDQGWKUHHGXPP\YDULDEOHVDUHFUHDWHGDVIROORZV ; LIIURP5HJLRQ% ; RWKHUZLVH ; LIIURP5HJLRQ& ; RWKHUZLVH ; LIIURP5HJLRQ' ; RWKHUZLVH 8VLQJDORJLVWLFUHJUHVVLRQPRGHOZLWKWKHVHYDULDEOHVDVFRYDULDWHVWKHRGGV UDWLRVUHSUHVHQWWKHRGGVRIUHVSRQVHUHODWLYHWR5HJLRQ$ :KLOHWKHVHGXPP\YDULDEOHVFDQHDVLO\EHFUHDWHGE\XVLQJD'$7$VWHSLQ 6$6 WKLV FDQ EHFRPH FXPEHUVRPH IRU FDWHJRULFDO YDULDEOHV WKDW KDYH ODUJH QXPEHUV RI OHYHOV %HJLQQLQJ ZLWK 6$6 9HUVLRQ 352& /2*,67,& DXWRPDWLFDOO\FUHDWHVWKHGXPP\YDULDEOHVZKHQWKHYDULDEOHLVLQFOXGHGLQD &/$66VWDWHPHQW7KLVDOVRPDNHVLWHDV\WRVSHFLI\LQWHUDFWLRQWHUPVLQWKH 02'(/VWDWHPHQWE\XVLQJ*/0W\SHVSHFLILFDWLRQVHJ$ % ,QYHUVLRQV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
RI 6$6 SULRU WR 9 352& /2*,67,& UHTXLUHV LQWHUDFWLRQV WR EH H[SOLFLWO\ GHILQHGLQWKH'$7$VWHS,Q([DPSOH\RXFDQLQFOXGHWKH6WXG\*URXS E\+LVWRU\ LQWHUDFWLRQ VLPSO\ E\ LQFOXGLQJ 5; +,67 LQ WKH 02'(/ VWDWHPHQW /RJLVWLF UHJUHVVLRQ FDQ DOVR EH FDUULHG RXW E\ XVLQJ WKH 6$6 SURFHGXUH &$702' 7KLV SURFHGXUH LV UHFRPPHQGHG IRU FDWHJRULFDO PRGHOLQJ ZLWK QRPLQDOFDWHJRULFDOFRYDULDWHV %\ PDNLQJ WKH SURSRUWLRQDO RGGVDVVXPSWLRQORJLVWLFUHJUHVVLRQ WHFKQLTXHVFDQEHXVHGWRDQDO\]HRUGLQDOFDWHJRULFDOUHVSRQVHVZLWKPRUHWKDQ WZR OHYHOV $ WHVW IRU WKLV DVVXPSWLRQ LV LQFOXGHG LQ WKH 6$6 RXWSXW IURP 352& /2*,67,&,Q([DPSOHDµ6FRUH¶WHVWLVXVHGEDVHGRQWKHFKL VTXDUHGLVWULEXWLRQZKLFKVKRZVDQRQVLJQLILFDQWUHVXOWS 18 7KLV VXSSRUWVWKHDVVXPSWLRQRIHTXDOVORSHSDUDPHWHUV ,IWKLVWHVWLVVLJQLILFDQWFDXWLRQVKRXOGEHXVHGLQWKHLQWHUSUHWDWLRQ,ILWLV QRW SRVVLEOH WR MXVWLI\ WKH SURSRUWLRQDO RGGV DVVXPSWLRQ DQ DOWHUQDWLYH PHWKRG FDQ EH XVHG ZLWK 352& *(102' E\ FRQVLGHULQJ WKH UHVSRQVH OHYHOVWREHQRPLQDO 0RGHO EXLOGLQJ E\ DGGLQJ RU UHPRYLQJ H[SODQDWRU\ YDULDEOHV LQ WKHORJLVWLFPRGHOVKRXOGEHSHUIRUPHGXQGHUFRQVWDQWDVVXPSWLRQVUHJDUGLQJ WKH FRYDULDQFH VWUXFWXUH RI WKH UHVSRQVH YDULDEOH :KHQ REVHUYDWLRQV DULVH IURPGLIIHUHQWSDWLHQWVWKHDVVXPSWLRQRILQGHSHQGHQWREVHUYDWLRQVLVXVXDOO\ YDOLG VR WKLV LV QRW D SUREOHP ,Q WKH FOXVWHUHG ELQRPLDO FDVH WKH YDULDQFH FRYDULDQFHPDWUL[LVZHLJKWHGXVLQJDVFDOHSDUDPHWHUWKDWLVHVWLPDWHGE\6$6 DQG LQFOXGHG LQ WKH RXWSXW )RU WKH 6&$/( :,//,$06 DGMXVWPHQW XVHGLQ ([DPSOH WKLV HVWLPDWH LV DV VKRZQ LQ WKH RXWSXW IRU :HLJKW 9DULDEOH 26 7RPDLQWDLQWKHVDPHFRYDULDQFHVWUXFWXUHIRUDOWHUQDWLYHPRGHOV \RX FDQ LQFOXGH WKH VFDOH SDUDPHWHU LQ SDUHQWKHVLV LQ WKH 6&$/( RSWLRQ WR UHTXHVW WKDW WKH VDPH ZHLJKWLQJV EH XVHG ZKHQ DGGLQJ RU UHPRYLQJ PRGHO HIIHFWVHJ6&$/( :,//,$06) ,Q DGGLWLRQ WR 352& /2*,67,& WZR RWKHU 6$6 SURFHGXUHV 352&&$702'DQG352&*(102'FDQEHXVHGIRUORJLVWLFUHJUHVVLRQ DQDO\VLV352&&$702'XVHVDµZHLJKWHGOHDVWVTXDUHV¶DSSURDFK:/6 LQ DGGLWLRQWRPD[LPXPOLNHOLKRRG7KLVLVRIWHQSUHIHUDEOHZKHQDOOFRYDULDWHV DUHFDWHJRULFDODQGIRUFDVHVRIDPXOWLQRPLDOUHVSRQVHRQDQRPLQDOVFDOH 352& *(102' ZKLFK LV D PRUH JHQHUDOL]HG PRGHOLQJ SURFHGXUH FDQ EH XVHGIRUORJLVWLFUHJUHVVLRQE\VSHFLI\LQJWKHORJLWDVWKHOLQNIXQFWLRQ352& *(102' LV YHU\ XVHIXO ZKHQ REVHUYDWLRQV DUH FRUUHODWHG VXFK DV LQ D UHSHDWHG PHDVXUHV H[SHULPHQW 'HWDLOV DERXW XVLQJ WKHVH DOWHUQDWLYH SURFHGXUHV LQ 6$6 IRU ORJLVWLF UHJUHVVLRQ FDQ EH IRXQG LQ RWKHU ERRNV LQFOXGLQJ 6WRNHV 'DYLV DQG .RFK DQG $OOLVRQ ZKLFK ZHUH ZULWWHQXQGHUWKH6$6%RRNVE\8VHUV3URJUDP
&KDSWHU/RJLVWLF5HJUHVVLRQ
6XEVWLWXWLQJDQGUHVSHFWLYHO\IRU; LQWKHHVWLPDWHGORJLVWLFUHJUHVVLRQ PRGHOIRU([DPSOH\RXREWDLQ
3Ö
,Q2XWSXW\RXVHHWKDWWKHRGGVUDWLRIRUWKHWUHDWPHQWHIIHFW LVùZKLFKPHDQVWKDWWKHRGGVRIUHODSVHIRUDFWLYHWUHDWPHQWLV OHVV WKDQ WKDW IRU WKH SODFHER 7KLV HVWLPDWH GRHV QRW GHSHQG RQ WKH RWKHU FRYDULDWH SULRU UHPLVVLRQ WLPH +RZHYHU LQ FRPSDUDWLYH FOLQLFDO WULDOV LW LV RIWHQ GHFLGHG WR UHSRUW FRYDULDWHDGMXVWHG UHVSRQVH UDWHV IRU HDFK WUHDWPHQW JURXS (VWLPDWHV RI WKHVH UHVSRQVH UDWHV GR KRZHYHU GHSHQG RQ RWKHU FRYDULDWHV LQ WKH PRGHO 7DEOH VKRZV WKH µQDwYH¶ HVWLPDWHV RI WKH FRYDULDWHDGMXVWHGUHVSRQVHUDWHV
$
H
- ×; -
3Ö - H &
$FWLYHJURXS
&RQWUROJURXS
- × ;
*LYHQ D SULRU UHPLVVLRQ WLPH RI ; PRQWKV IRU H[DPSOH WKH SUHGLFWHG UHODSVH UDWHV IRU WKH DFWLYH DQG SODFHER JURXSV DUH HVWLPDWHG IURP WKH SUHFHGLQJHTXDWLRQVQDPHO\IRUWKHDFWLYHJURXSDQGIRUWKHFRQWURO JURXS 7KLV PHWKRG FDQ EH UHSHDWHG IRU DQ\ RWKHU YDOXHV RI SULRU UHPLVVLRQ WLPH LQ WKH UDQJH RI REVHUYHG ; YDOXHV WR PRQWKV ZKLFK LQFOXGHV XQREVHUYHGYDOXHVVXFKDVWKHRYHUDOOPHDQUHPLVVLRQWLPH&DXWLRQPXVWEH XVHGZKHQH[WUDSRODWLQJEH\RQGWKHH[SHULPHQWDOUHJLRQ
7KHVH SUHGLFWHG YDOXHV FDQ EH ZULWWHQ WR DQ RXWSXW GDWD VHW LQ 6$6 E\ VSHFLI\LQJWKH287387VWDWHPHQWIROORZLQJWKH02'(/VWDWHPHQWLQ352& /2*,67,&:KHQWKH3 RSWLRQLVXVHG6$6LQFOXGHVHVWLPDWHGSUREDELOLWLHV LQWKHRXWSXWGDWDVHWIRUHDFKFRPELQDWLRQRIREVHUYHGFRYDULDWHYDOXHV,I\RX ZDQW SUHGLFWHG SUREDELOLWLHV IRU QRQREVHUYHG YDOXHV DGGLWLRQDO REVHUYDWLRQV ZLWKPLVVLQJYDOXHVIRUWKHUHVSRQVHYDULDEOHDQGWKHFRYDULDWHYDOXHVWKDW\RX FKRRVHFDQEHLQFOXGHGLQWKHGDWDVHW1RWLFHWKDWLIWKHUHVSRQVHYDULDEOHLV PLVVLQJIRUDQREVHUYDWLRQWKDWREVHUYDWLRQLVQRWXVHGE\352&/2*,67,& LQHVWLPDWLQJWKHPRGHO ,Q([DPSOH\RXFDQREWDLQWKHUHODSVHUDWHVIRUHDFKJURXSE\FUHDWLQJ D GXPP\ GDWD VHW RI REVHUYDWLRQV WKDW FRQWDLQV WKH YDOXHV RI WKH FRYDULDWH DQG FRQFDWHQDWH WKLV ZLWK WKH DQDO\VLV GDWD VHW )RU H[DPSOH WR LQFOXGH HVWLPDWHG SUREDELOLWLHV IRU ; « FRQFDWHQDWH WKH GDWD VHW $''21VHHEHORZ ZLWKWKH$0/GDWDVHW
1RWH&RQFDWHQDWLRQUHIHUVWRWKHSURFHVVRIDSSHQGLQJRQHGDWDVHWZLWK REVHUYDWLRQVIURPDQRWKHUGDWDVHW7KLVLVGRQHZLWKWKH6(7VWDWHPHQW LQWKH6$6'$7$VWHS)RUPRUHLQIRUPDWLRQVHH³7KH/LWWOH6$6 %RRND3ULPHU´E\'HOZLFKHDQG6ODXJKWHUSDJH ZKLFKLVZULWWHQ XQGHUWKH6$6%RRNVE\8VHUVSURJUDP
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
DATA ADDON; DO I = 1 TO 16; X = I + 2; TRT = 0; OUTPUT; TRT = 1; OUTPUT; END; RUN;
%\DGGLQJWKHVWDWHPHQW OUTPUT OUT = ESTIM
P = P_EST;
IROORZLQJ WKH 02'(/ VWDWHPHQW LQ 352& /2*,67,& WKH HVWLPDWHG SUREDELOLWLHV DUH LQFOXGHG LQ WKH RXWSXW GDWD VHW (67,0 ZKLFK FDQ EH UH DUUDQJHGWRGLVSOD\WKHUHVXOWVDVVKRZQLQ7DEOHDQGSORWWHGDVVKRZQLQ )LJXUH&RQILGHQFHLQWHUYDOVIRUWKHVHSUHGLFWHGYDOXHVFDQDOVREHHDVLO\ REWDLQHGLQWKHRXWSXWGDWDVHWVHHWKH6$6+HOSDQG2QOLQH'RFXPHQWDWLRQ IRU352&/2*,67,&
7$%/((VWLPDWHG5HODSVH5DWHVIRU9DULRXV5HPLVVLRQ7LPHV; LQ([DPSOH ;
&RQWURO
$FWLYH
;
&RQWURO
$FWLYH
),*85(3ORWRI(VWLPDWHG5HODSVH3UREDELOLWLHVYV7LPHVLQFH 3ULRU5HPLVVLRQ;
&RQWURO $FWLYH
H VS DO H 5 ER U 3 W V (
5HPLVVLRQ7LPHPRV
&KDSWHU/RJLVWLF5HJUHVVLRQ
([DPSOH XVHV DQ RUGLQDO UHVSRQVH ZLWK OHYHOV VR WKH HVWLPDWHGSUREDELOLWLHVDUHSURYLGHGIRUHDFKRIWKHWKUHHGLFKRWRPRXVPRGHOV XVHGLLLDQGLLLLQ7DEOH )RU; *URXS$ DQG; QRSUHYLRXV HSLVRGHV WKHVHPRGHOVUHVXOWLQWKHIROORZLQJHVWLPDWHGSUREDELOLWLHV
L
OQ3 ±3 ± RU3
LL
OQ3 ±3
RU3
LLL
OQ3 ±3
RU3
3 UHSUHVHQWVWKHSUREDELOLW\RIµFRPSOHWH¶UHVSRQVH3 LVWKHSUREDELOLW\RI µPDUNHG¶ RU µFRPSOHWH¶ UHVSRQVH DQG 3 LV WKH SUREDELOLW\ RI DQ\UHVSRQVH µVRPH¶ µPDUNHG¶ RU µFRPSOHWH¶
7$%/((VWLPDWHG3UREDELOLWLHVIRU([DPSOH
+LVWRU\ ;
0RGHO
L LL LLL L LL LLL L LL LLL
*URXS; $ %
7KH; FRYDULDWHLQ([DPSOHKLVWRU\RIVHYHUHJDVWURSDUHVLV LV XVHG DV DQ RUGLQDO FRYDULDWH ZLWK OHYHOV µQR SULRU HSLVRGHV¶ µRQH SULRU HSLVRGH¶ DQG µPRUH WKDQ RQH SULRU HSLVRGH¶ 7KH RGGV UDWLR VHHQ LQ 2XWSXW LV 17 ZKLFK PHDQV WKDW WKH RGGV RI LPSURYLQJ UHVSRQVH GHFUHDVHVE\IRUHDFKLQFUHPHQWDOLQFUHDVHLQWKHKLVWRU\FDWHJRU\
7KLVDQDO\VLVDVVXPHVWKDWWKHORJRGGVLVUHODWHGWRLQFUHPHQWDOFKDQJHVLQ; LQ D OLQHDU ZD\ ,I WKHUH LV UHDVRQ WR EHOLHYH WKDW WKH ORJRGGV RI µRQH SULRU HSLVRGH¶YVµQRSULRUHSLVRGHV¶VKRXOGEHGLIIHUHQWIURPWKHORJRGGVRIµPRUH WKDQ RQH SULRU HSLVRGH¶ YVµRQHSULRUHSLVRGH¶WKHVORSHFDQQRWEHDVVXPHG WR EH OLQHDU ,Q WKLV FDVH D TXDGUDWLF WHUP IRU SULRU KLVWRU\ +,67 +,67 FDQ EH LQFOXGHG LQ WKH PRGHO RU DOWHUQDWLYHO\ WKH +,67
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
YDULDEOHFDQEHLQFOXGHGLQWKHFODVVVWDWHPHQW:KHQXVHGDVDFODVVYDULDEOH 352& /2*,67,& HVWLPDWHV ¶V DQG RGGV UDWLRV VHSDUDWHO\ IRU HDFK OHYHO RI +,67UHODWLYHWRWKHUHIHUHQFHYDOXH -XVW DV DQDO\VLV RI FRYDULDQFH DVVXPHV HTXDO VORSHV DPRQJ JURXSVORJLVWLFUHJUHVVLRQDOVRDVVXPHVWKDWWKHVORSHRIWKHORJLWIXQFWLRQLV VLPLODUDPRQJJURXSV7KLVDVVXPSWLRQFDQEHFKHFNHGE\DSUHOLPLQDU\SORW RI WKH GDWD DORQJ ZLWK WKH LQFOXVLRQ RI WKH LQWHUDFWLRQ WHUPV EHWZHHQ WKH DSSURSULDWH;YDULDEOHVLQWKHORJLVWLFUHJUHVVLRQPRGHO&DXWLRQPXVWEHXVHG LQ WKH LQWHUSUHWDWLRQ RI WUHDWPHQW JURXS GLIIHUHQFHV LQ WKH SUHVHQFH RI LQWHUDFWLRQV 7KH PD[LPXPOLNHOLKRRG HTXDWLRQV PLJKW QRW DOZD\V JLYH D VROXWLRQ ,Q FDVHV ZLWK VPDOO VDPSOH VL]HV FRQYHUJHQFH WR D XQLTXH VROXWLRQ EDVHG RQ WKH LWHUDWLYH FDOFXODWLRQV PLJKW QRW EH DWWDLQDEOH $OWKRXJK FRQVHUYDWLYHDJHQHUDOJXLGHOLQHLVWKDWWKHVDPSOHVL]H1VKRXOGEHDWOHDVW WLPHV WKH QXPEHU RI FRYDULDWHV ZKHQ ILWWLQJ D ORJLVWLF UHJUHVVLRQ PRGHO %HJLQQLQJ ZLWK 6$6 9HUVLRQ ZKHQVDPSOHVL]HVDUHVPDOOH[DFWPHWKRGV IRUILWWLQJORJLVWLFUHJUHVVLRQPRGHOVDUHDYDLODEOH+RZHYHURWKHUFDWHJRULFDO PHWKRGVPLJKWEHPRUHDSSURSULDWHLQFOLQLFDOWULDOVDSSOLFDWLRQVZKHQVDPSOH VL]HVDUHYHU\VPDOO
&++$$3377((55 7KH/RJ5DQN7HVW ,QWURGXFWLRQ 6\QRSVLV ([DPSOHV 'HWDLOV 1RWHV
,QWURGXFWLRQ
7KHORJUDQNWHVWLVDVWDWLVWLFDOPHWKRGIRUFRPSDULQJGLVWULEXWLRQVRIWLPHXQWLO WKH RFFXUUHQFH RI DQ HYHQW RI LQWHUHVW DPRQJ LQGHSHQGHQW JURXSV 7KH HYHQW LV RIWHQGHDWKGXHWRGLVHDVHEXWHYHQWPLJKWEHDQ\ELQRPLDORXWFRPHVXFKDVFXUH UHVSRQVHUHODSVHRUIDLOXUH7KHHODSVHGWLPHIURPLQLWLDOWUHDWPHQWRUREVHUYDWLRQ XQWLOWKHHYHQWLVWKHHYHQWWLPHRIWHQUHIHUUHGWRDVµVXUYLYDOWLPH¶HYHQZKHQWKH HYHQWLVQRWµGHDWK¶ 7KHORJUDQNWHVWSURYLGHVDPHWKRGIRUFRPSDULQJµULVNDGMXVWHG¶HYHQWUDWHV7KLV LV XVHIXO ZKHQ SDWLHQWV LQ D FOLQLFDO VWXG\ DUH VXEMHFW WR YDU\LQJ GHJUHHV RI RSSRUWXQLW\ WR H[SHULHQFH WKH HYHQW 6XFK VLWXDWLRQV DULVH IUHTXHQWO\ LQ FOLQLFDO WULDOVGXHWRWKHILQLWHGXUDWLRQRIWKHVWXG\HDUO\WHUPLQDWLRQRIWKHSDWLHQWIURP WKHVWXG\RULQWHUUXSWLRQRIWUHDWPHQWEHIRUHWKHHYHQWRFFXUV
([DPSOHVZKHUHXVHRIWKHORJUDQNWHVWPLJKWEHDSSURSULDWHLQFOXGHFRPSDULQJ VXUYLYDOWLPHVLQFDQFHUSDWLHQWVZKRDUHJLYHQDQHZWUHDWPHQWZLWKSDWLHQWVZKR UHFHLYHVWDQGDUGFKHPRWKHUDS\RUFRPSDULQJWLPHVWRFXUHDPRQJVHYHUDOGRVHVRI DWRSLFDODQWLIXQJDOSUHSDUDWLRQZKHUHWKHSDWLHQWLVWUHDWHGIRUZHHNVRUXQWLO FXUHGZKLFKHYHUFRPHVILUVW 8VXDOO\HYHQWWLPHVDUHQRWZHOOPRGHOHGXVLQJWKHQRUPDOGLVWULEXWLRQ0RGHOLQJ WHFKQLTXHV XVLQJ VSHFLILF GLVWULEXWLRQDO DVVXPSWLRQV DUH GLVFXVVHG LQ&KDSWHU 7KH ORJUDQN WHVW LV D QRQSDUDPHWULF WHVW DQG DV VXFK GRHV QRW UHTXLUH DQ\ GLVWULEXWLRQDO DVVXPSWLRQV DERXW WKH HYHQW WLPHV ,I HYHU\ SDWLHQW ZHUH IROORZHG XQWLOWKHHYHQWRFFXUUHQFHWKHHYHQWWLPHVFRXOGEHFRPSDUHGEHWZHHQWZRJURXSV XVLQJWKH:LOFR[RQUDQNVXPWHVW&KDSWHU +RZHYHUVRPHSDWLHQWVPLJKWGURS RXWRUFRPSOHWHWKHVWXG\EHIRUHWKHHYHQWRFFXUV,QVXFKFDVHVWKHDFWXDOWLPHWR HYHQWLVXQNQRZQEHFDXVHWKHHYHQWGRHVQRWRFFXUZKLOHXQGHUVWXG\REVHUYDWLRQ
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH HYHQW WLPHV IRU WKHVH SDWLHQWV DUH EDVHG RQ WKH ODVW NQRZQ WLPH RI VWXG\ REVHUYDWLRQ DQG DUH FDOOHG µFHQVRUHG¶ REVHUYDWLRQV EHFDXVH WKH\ UHSUHVHQW WKH ORZHUERXQGRIWKHWUXHXQNQRZQHYHQWWLPHV7KH:LOFR[RQUDQNVXPWHVWFDQEH KLJKO\ ELDVHG LQ WKH SUHVHQFH RI FHQVRUHG GDWD 7KH ORJUDQN WHVW DGMXVWV IRU FHQVRULQJ WKHUHE\ DFFRXQWLQJ IRU WKH SDWLHQW¶V RSSRUWXQLW\ WR H[SHULHQFH WKH HYHQW
6\QRSVLV
7KH QXOO K\SRWKHVLV WHVWHG E\ WKH ORJUDQN WHVW LV WKDW RI HTXDO HYHQW WLPH GLVWULEXWLRQV DPRQJ JURXSV (TXDOLW\ RI WKH GLVWULEXWLRQV RI HYHQW WLPHV LPSOLHV VLPLODU ULVNDGMXVWHG HYHQW UDWHV DPRQJ JURXSV QRW RQO\ IRU WKH FOLQLFDO WULDODVD ZKROHEXWDOVRIRUDQ\DUELWUDU\WLPHSRLQWGXULQJWKHWULDO5HMHFWLRQRIWKHQXOO K\SRWKHVLVLQGLFDWHVWKDWWKHHYHQWUDWHVGLIIHUDPRQJJURXSVDWRQHRUPRUHWLPH SRLQWVGXULQJWKHVWXG\ +HUH \RX ZLOO H[DPLQH WKH FDVH RI WZR LQGHSHQGHQW JURXSV DOWKRXJK WKH VDPH PHWKRGLVHDVLO\H[WHQGHGWRPRUHWKDQWZRJURXSV$W\SLFDOGDWDVHWOD\RXWKDV WKHIROORZLQJIRUPZKHUH<UHSUHVHQWVWKHWLPHIURPLQLWLDOWUHDWPHQWWRWKHHYHQW RFFXUUHQFHDQGDLQGLFDWHVDFHQVRUHGYDOXH 7$%/(/D\RXWIRUWKH/RJ5DQN7HVW *5283 3DWLHQW (YHQW7LPH 1XPEHU
*5283 3DWLHQW (YHQW7LPH 1XPEHU
<
<
<
<
<
<
<1
1
1
<1
LQGLFDWHVFHQVRUHGWLPH
'LYLGHWKHVWXG\LQWRNGLVWLQFWWLPHSHULRGVW W W ZKHUHW M N UHSUHVHQWV WKHM WLPHSRLQWZKHQRQHRUPRUHSDWLHQWVLQWKHFRPELQHGVDPSOHV EHFRPHVHYHQWSRVLWLYH/HWG UHSUHVHQWWKHQXPEHURISDWLHQWVLQ*URXSLL ZKRILUVWH[SHULHQFHWKHHYHQWDWWLPHSHULRGW DQGOHWQ UHSUHVHQWWKHQXPEHURI SDWLHQWV LQ *URXS L ZKR DUH DW ULVN DW WKH EHJLQQLQJ RI WLPH SHULRG W $W ULVN GHVFULEHV WKH SDWLHQWV ZKR DUH µHYHQWQHJDWLYH¶ DQG VWLOO LQ WKH VWXG\ /HW G G G DQGOHWQ Q Q )RUM NFRPSXWH
N
M
WK
LM
M
LM
M
M
M
M
M
QM × G M HM = DQG QM
M
YM
M
=
QM × Q M × G M × Q M QM
×
Q M
-
-
GM
&KDSWHU7KH/RJ5DQN7HVW
)LQDOO\\RXREWDLQ N
2 Ê GM
M
N ( Y M M
å
DQG
9 å Y N
M
M
'HQRWHE\< DUDQGRPYDULDEOHWKDWUHSUHVHQWVWKHHYHQWWLPHIRU*URXSLL DQGOHW6 W 3URE< W 7KHWHVWVXPPDU\IRUWKHORJUDQNWHVWLVDVIROORZV L
L
L
QXOOK\SRWKHVLV DOWK\SRWKHVLV
+ 6 W 6 W IRUDOOWLPHVW + 6 W 6 W IRUDWOHDVWRQHWLPHW
$
2 - ( 9
WHVWVWDWLVWLF
GHFLVLRQUXOH
UHMHFW+ LI !
ZKHUH LV WKH FULWLFDO FKLVTXDUH YDOXH
([DPSOHV
ZLWK VLJQLILFDQFH OHYHO DQG GHJUHH RI IUHHGRP
p ([DPSOH+69(SLVRGHVIROORZLQJJ'9DFFLQH
)RUW\HLJKWSDWLHQWVZLWKJHQLWDOKHUSHV+69 ZHUHHQUROOHGLQDVWXG\RID QHZUHFRPELQDQWKHUSHVYDFFLQHEDVHGRQWKHDQWLJHQJO\FRSURWHLQJ' 3DWLHQWVZHUHUHTXLUHGWRKDYHDKLVWRU\RIDWOHDVW+69HSLVRGDO RXWEUHDNVLQWKHPRQWKVSULRUWRHQUROOPHQWLQWKHVWXG\DQGEHLQ UHPLVVLRQDWWKHWLPHRIYDFFLQDWLRQ3DWLHQWVZHUHUDQGRPO\DVVLJQHGWR UHFHLYHHLWKHUDVLQJOHJ'YDFFLQHLQMHFWLRQQ RUDSODFHERQ DQG WKHLUFRQGLWLRQVZHUHIROORZHGIRU\HDU7KHGDWDLQ7DEOHVKRZWKH WLPHLQZHHNV WRILUVWUHFXUUHQFHRI+69IROORZLQJLPPXQL]DWLRQIRUHDFK SDWLHQW< DORQJZLWKWKHQXPEHURIHSLVRGHVGXULQJWKHPRQWKSHULRG SULRUWRHQUROOPHQW; ,VWKHUHHYLGHQFHWKDWWKHGLVWULEXWLRQVRIWKHWLPHVWR UHFXUUHQFHGLIIHUEHWZHHQWKHJURXSV"
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(5DZ'DWDIRU([DPSOH J'9DFFLQH*URXS 3DWLHQW 1XPEHU
;
<
3DWLHQW 1XPEHU
;
<
3ODFHER*URXS 3DWLHQW 1XPEHU
;
<
3DWLHQW 1XPEHU
;
<
LQGLFDWHVFHQVRUHGWLPH 1RWH3DWLHQWVZKRKDGFHQVRUHGWLPHVRIOHVVWKDQZHHNVZHUHVWXG\GURSRXWV
&KDSWHU7KH/RJ5DQN7HVW
6ROXWLRQ
7KLVDQDO\VLVZLOOLJQRUHWKHFRYDULDWH;7KLVH[DPSOHLVDOVRXVHGLQ&KDSWHU ZKLFK GLVFXVVHV WKH FRPSDULVRQ RI VXUYLYDO GLVWULEXWLRQV XVLQJ FRYDULDWH DGMXVWPHQWVZLWKWKH&R[SURSRUWLRQDOKD]DUGVPRGHO 7KHILUVWVWHSLQPDQXDOFRPSXWDWLRQRIWKHORJUDQNWHVWLVWRFRQVWUXFWDWDEOHIRU HDFK HYHQW WLPH DV VKRZQ LQ 7DEOH $W HDFK WLPH SHULRG W WKH QXPEHU RI SDWLHQWVDWULVNQ LVUHGXFHGE\WKHQXPEHURIHYHQWSRVLWLYHVG DQGFHQVRUHG REVHUYDWLRQVZ IURPWKHSUHYLRXVWLPHSHULRGW M
LM
LM
LM
M±
/HW 6 W DQG6 W UHSUHVHQWWKHFXPXODWLYHSUREDELOLW\GLVWULEXWLRQVRIWKHWLPHV IURP YDFFLQH DGPLQLVWUDWLRQ XQWLO WKH ILUVW +69 UHFXUUHQFH IRU WKH J' DQG SODFHERJURXSVUHVSHFWLYHO\7KHK\SRWKHVLVWHVWVXPPDU\LV
QXOOK\SRWKHVLV
+ 6 W 6 W IRUDOOWLPHVW
DOWK\SRWKHVLV
+ 6 W 6 W IRUDWOHDVWRQHWLPHW
WHVWVWDWLVWLF
$
2 - ( 9 -
GHFLVLRQUXOH
UHMHFW+ LI !
FRQFOXVLRQ
%HFDXVH ! UHMHFW + DQG FRQFOXGH WKDW WKH UHFXUUHQFH WLPHV DUH VLJQLILFDQWO\ GLIIHUHQW EHWZHHQ WKH J' DQG SODFHER YDFFLQH JURXSVDWDOHYHORIVLJQLILFDQFH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(3UHOLPLQDU\6XPPDU\7DEOHIRU/RJ5DQN&RPSXWDWLRQV IRU([DPSOH :((.
J'9$&&,1( *5283
3/$&(%2 *5283
G
Z
Q
G
Z
Q
G
Q
H
Y
2
(
9
M
M
M
M
M
M
M
M
1XPEHUVLQVKDGHGURZVGRQRWDIIHFWWKHFDOFXODWLRQVRI(DQG97KHVHDUHVKRZQRQO\WR DFFRXQWIRUDOOFHQVRUHGREVHUYDWLRQV
&20387$7,216
W M
727$/
&KDSWHU7KH/RJ5DQN7HVW
6$6$QDO\VLVRI([DPSOH 2QHZD\WRLGHQWLI\FHQVRUHGYDOXHVLQ6$6LVWRDVVLJQWKHPDQHJDWLYHYDOXHDV \RXUHDGWKHHYHQWWLPHV:.6 LQWRWKH6$6GDWDVHW+69 DVVKRZQLQWKH6$6 FRGHIRU([DPSOH7KHYDULDEOH&(16LVGHILQHGDVDQLQGLFDWRUYDULDEOHWKDW WDNHVWKHYDOXHLIWKHREVHUYDWLRQLVFHQVRUHG:.6 DQGLIWKHREVHUYDWLRQ LV QRW FHQVRUHG å $ SDUWLDO OLVWLQJ RI WKH GDWD LV GLVSOD\HG LQ WKH RXWSXW IURP 352&35,17 7KH6$6SURFHGXUH/,)(7(67LVXVHGWRSHUIRUPWKHORJUDQNWHVWê7KH7,0( VWDWHPHQWZKLFKIROORZVWKH352&/,)(7(67VWDWHPHQWGHVLJQDWHVWKHYDULDEOH :.6 ZKRVH YDOXHV DUH WKH HYHQW WLPHV IROORZHG E\ WKH YDULDEOH &(16 WKDW LGHQWLILHV ZKLFK REVHUYDWLRQV DUH FHQVRUHG LH &(16 7KH 675$7$ VWDWHPHQW LGHQWLILHV WKH JURXSLQJ RU VWUDWLILFDWLRQ YDULDEOH LQ WKLV FDVH YDFFLQH JURXS9$& 2XWSXWVKRZVWKHORJUDQNWHVWUHVXOWLQJLQDFKLVTXDUHYDOXHRI ñ ZKLFKFRQILUPVWKHPDQXDOFDOFXODWLRQV7KLVLVDVLJQLILFDQWUHVXOWZLWKDSYDOXH RI ò$OVRVKRZQLQWKHRXWSXWDUHWKHYDOXHVRI2 ±( ± ôDQG 9 õ
6$6&RGHIRU([DPSOH DATA HSV; INPUT VAC $ PAT WKS X CENS = (WKS < 1); WKS = ABS(WKS); TRT = (VAC = ’GD2’); DATALINES; GD2 1 8 12 GD2 3 -12 GD2 7 28 10 GD2 8 44 GD2 12 3 8 GD2 14 -52 GD2 18 6 13 GD2 20 12 GD2 24 -52 9 GD2 26 -52 GD2 31 -52 8 GD2 33 9 GD2 36 -52 6 GD2 39 15 GD2 42 21 13 GD2 44 -24 GD2 48 28 9 PBO 2 15 PBO 5 -2 12 PBO 9 8 PBO 13 -52 7 PBO 16 21 PBO 19 6 16 PBO 21 10 PBO 25 4 15 PBO 27 -9 PBO 30 1 17 PBO 32 12 PBO 37 -32 8 PBO 38 15 PBO 43 35 13 PBO 45 28 ;
@@;
å
/* used in Example 22.1 */ 10 6 9 7 12 10 14 16 9 7 7 16 9 8 8 9
GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 PBO PBO PBO PBO PBO PBO PBO PBO
6 10 15 23 28 34 40 46 4 11 17 22 29 35 41 47
-52 14 35 -7 36 -11 13 -52 -44 12 19 -15 27 20 5 6
7 8 11 13 13 16 13 13 10 7 11 6 10 8 14 15
PROC SORT DATA = HSV; BY VAC PAT; PROC PRINT DATA = HSV; VAR VAC PAT WKS CENS X; TITLE1 ’The Log-Rank Test’; TITLE2 ’Example 21.1: HSV-2 Episodes with gD2 Vaccine’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
ODS EXCLUDE ProductLimitEstimates; PROC LIFETEST PLOTS=(S) LINEPRINTER DATA = HSV; TIME WKS*CENS(1); STRATA VAC; RUN;
ê
2873876$62XWSXWIRU([DPSOH
The Log-Rank Test Example 21.1: HSV-2 Episodes with gD2 Vaccine Obs
VAC
PAT
WKS
CENS
X
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 . . . 43 44 45 46 47 48
GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2 GD2
1 3 6 7 8 10 12 14 15 18 20 23 24 26 28 31 33
8 12 52 28 44 14 3 52 35 6 12 7 52 52 36 52 9
0 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 0
12 10 7 10 6 8 8 9 11 13 7 13 9 12 13 8 10
PBO PBO PBO PBO PBO PBO
37 38 41 43 45 47
32 15 5 35 28 6
1 0 0 0 0 0
8 8 14 13 9 15
The LIFETEST Procedure Stratum 1: VAC = GD2 Summary Statistics for Time Variable WKS Quartile Estimates
Percent
Point Estimate
75 50 25
. 35.0000 13.0000
95% Confidence Interval (Lower Upper) 35.0000 15.0000 8.0000
. . 13 28.0000
&KDSWHU7KH/RJ5DQN7HVW
2873876$62XWSXWIRU([DPSOHFRQWLQXHG The Log-Rank Test Example 21.1: HSV-2 Episodes with gD2 Vaccine
Mean
Standard Error 3.2823 14
28.7214
NOTE: The mean survival time and its standard error were underestimated because the largest observation was censored and the estimation was restricted to the largest event time. Stratum 2: VAC = PBO Summary Statistics for Time Variable WKS Quartile Estimates
Percent
Point Estimate
95% Confidence Interval (Lower Upper)
75 50 25
28.0000 15.0000 8.0000
19.0000 10.0000 5.0000
Mean
Standard Error
. 27.0000 13 15.0000
2.5387 14
18.2397
NOTE: The mean survival time and its standard error were underestimated because the largest observation was censored and the estimation was restricted to the largest event time.
Summary of the Number of Censored and Uncensored Values
Stratum
VAC
Total
Failed
Censored
Percent Censored
1 GD2 25 14 11 44.00 2 PBO 23 17 6 26.09 --------------------------------------------------------------Total 48 31 17 35.42
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
2873876$62XWSXWIRU([DPSOHFRQWLQXHG The Log-Rank Test Example 21.1: HSV-2 Episodes with gD2 Vaccine The LIFETEST Procedure
S u r v i v a l D i s t r i b u t i o n G
SDF | | 1.0 + | | | | | | 0.8 + | | | | | | 0.6 + | | | | | | 0.4 + | |
Survival Function Estimates **--G | G---G P----P | PPG--G | G-G 12 PP | | G---G | GG P--P | P--P G-G | | P--P GG | G-------G | | | G---------G P---P | | | | G---------G P-----P | | GG PP | | G-----G PP | |
| P--------P F | | u | PP n | P---------P c 0.2 + | t | | i | P o | n | | | 0.0 + ----+-----+-----+-----+-----+-----+-----+------+----+-----+-0 5 10 15 20 25 30 35 40 45 50 WKS Censored Observations Strata P | P P P P P P G | G G G G * ----+-----+-----+-----+-----+-----+-----+------+----+-----+-0 5 10 15 20 25 30 35 40 45 50 WKS Legend for Strata Stratum Symbol VAC 1 G GD2 2 P PBO
&KDSWHU7KH/RJ5DQN7HVW
2873876$62XWSXWIRU([DPSOHFRQWLQXHG The Log-Rank Test Example 21.1: HSV-2 Episodes with gD2 Vaccine The LIFETEST Procedure
Testing Homogeneity of Survival Curves for WKS over Strata Rank Statistics VAC
Log-Rank
GD2 PBO
-5.1556 5.1556
Wilcoxon
ô
-160.00 160.00
Covariance Matrix for the Log-Rank Statistics VAC
GD2
GD2 PBO
6.81973 -6.81973
PBO
õ
-6.81973 6.81973
Covariance Matrix for the Wilcoxon Statistics
Test
VAC
GD2
PBO
GD2 PBO
7136.46 -7136.46
-7136.46 7136.46
Test of Equality over Strata Pr > Chi-Square DF Chi-Square
Log-Rank Wilcoxon -2Log(LR)
3.8976 3.5872 4.2589
ñ
1 1 1
0.0484 0.0582 0.0390
ò ù 11
'HWDLOV 1RWHV
&OLQLFDO VWXG\ RXWFRPHV RIWHQ KDYH WKH JRDO RI HVWLPDWLQJ UHVSRQVH RU HYHQW UDWHV DQG FRPSDULQJ WKHP DPRQJ UDQGRPL]HG JURXSV +RZHYHU WKH VHOHFWLRQ RI D WLPH SRLQW GXULQJ WKH FOLQLFDO VWXG\ DW ZKLFK WR FRPSDUHHYHQWUDWHVLVQRWDOZD\VFOHDUFXW7KHODVWGRXEOHEOLQGYLVLWLVRIWHQ GHVLJQDWHGIRUVXFKXVHDOWKRXJKGUDZEDFNVPLJKWH[LVWIRUWKLVFKRLFH)LUVW WKHGXUDWLRQRIWKHVWXG\PLJKWEHVRPHZKDWDUELWUDU\RUFRQVWUDLQHGE\VDIHW\ LVVXHV RU RWKHU FRQVLGHUDWLRQV XQUHODWHG WR UHVSRQVH 0RUH LPSRUWDQWO\ WKH DQDO\VLVLJQRUHVLQIRUPDWLRQSULRUWRWKLVHYDOXDWLRQWLPHSRLQWWKDWPLJKWKDYH DQLPSRUWDQWLPSDFWRQWKHUHVXOWV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,I WKH GDWD LQ ([DPSOH ZHUH DQDO\]HG ZLWK WKH FKLVTXDUH WHVW &KDSWHU IRU FRPSDULQJ HYHQW UDWHV DW WKH ILQDO YLVLW XVLQJ D µODVW REVHUYDWLRQFDUULHGIRUZDUG¶PHWKRG\RXREWDLQ 7$%/(8QDGMXVWHG5HFXUUHQFH5DWHVIRU([DPSOH 5HVSRQVH
J' *URXS
3ODFHER *URXS
727$/
+695HFXUUHQFH
1R5HFXUUHQFH
7RWDO(QUROOHG
7KH +69 UHFXUUHQFH UDWH FRPSDULVRQ YV LV QRW VLJQLILFDQW S EDVHG RQ WKH FKLVTXDUH WHVW &KDSWHU ZKLFK LJQRUHV WKH WLPH WKDW SDWLHQWV ZHUH DW ULVN %\ DQDO\]LQJ WLPHWRHYHQW RFFXUUHQFH DQG DFFRXQWLQJ IRU FHQVRULQJ WKH ORJUDQN WHVW XVHV HYHQW UDWHV RYHUWKHZKROHWULDOUDWKHUWKDQUHO\LQJRQDVLQJOHWLPHSRLQW
)RU WKH M WLPH SHULRG W \RX FDQ FRQVWUXFW D FRQWLQJHQF\ WDEOHDVIROORZV WK
M
*5283
*5283
727$/
(YHQW3RVLWLYH
GM
GM
GM
(YHQW1HJDWLYH
QM±GM
QM±GM
QM±GM
7RWDO
QM
QM
QM
7KHORJUDQNWHVWLVHTXLYDOHQWWRDSSO\LQJWKH&RFKUDQ0DQWHO+DHQV]HOWHVW &KDSWHU XVLQJWKHVHWDEOHVDWHDFKWLPHSRLQWDVWKHVWUDWD 0DQ\ YDULDWLRQV RI WKH ORJUDQN WHVW IRU FRPSDULQJ VXUYLYDO GLVWULEXWLRQVH[LVW7KHPRVWFRPPRQYDULDQWKDVWKHIRUP
2 - ( (
2 - ( (
ZKHUH 2 DQG ( DUH FRPSXWHG IRU HDFK JURXS DV LQ WKH IRUPXODV JLYHQ SUHYLRXVO\7KLVVWDWLVWLFDOVRKDVDQDSSUR[LPDWHFKLVTXDUHGLVWULEXWLRQZLWK GHJUHHRIIUHHGRPXQGHU+ L
L
&KDSWHU7KH/RJ5DQN7HVW
,Q ([DPSOH \RX KDYH 2 2 DQG ( 8VLQJ WKH UHODWLRQVKLSWKDW2 2 ( ( \RXREWDLQ( ZKLFKUHVXOWVLQ
- -
ZLWKDSYDOXHRI$OWKRXJKFRPSXWDWLRQDOO\HDVLHUWKLVFKLVTXDUHWHVW JLYHVDPRUHFRQVHUYDWLYHUHVXOWWKDQWKHYHUVLRQSUHVHQWHG $ FRQWLQXLW\ FRUUHFWLRQ FDQ DOVR EH XVHG E\ UHGXFLQJ WKH QXPHUDWRUV E\ EHIRUH VTXDULQJ 8VLQJ WKLV W\SH RI FRUUHFWLRQ OHDGV WR HYHQ IXUWKHU FRQVHUYDWLVPDQGPD\EHRPLWWHGZKHQVDPSOHVL]HVDUHPRGHUDWHRUODUJH 3UHYLRXVO\ LW ZDV QRWHG WKDW WKH :LOFR[RQ UDQNVXP WHVW FDQ EH XVHG WR DQDO\]H WKH HYHQW WLPHV LQ WKH DEVHQFH RI FHQVRULQJ $ JHQHUDOL]HG :LOFR[RQWHVWVRPHWLPHVFDOOHGWKH*HKDQWHVW EDVHGRQDQDSSUR[LPDWHFKL VTXDUH GLVWULEXWLRQ KDV EHHQ GHYHORSHG IRU XVH LQ WKH SUHVHQFH RI FHQVRUHG REVHUYDWLRQV0DQXDOFRPSXWDWLRQVIRUWKLVPHWKRGFDQEHWHGLRXV7KHUHDGHU LVUHIHUUHGWR'DZVRQ6DXQGHUV 7UDSSS IRUDQLOOXVWUDWLRQRID ZRUNHG H[DPSOH 5HVXOWV RI WKLV WHVW DUH DYDLODEOH LQ 2XWSXW 7KH JHQHUDOL]HG:LOFR[RQFKLVTXDUHIRU([DPSOHLVZLWKDSYDOXHRI ùZKLFKLVFORVHWRWKHUHVXOWVRIWKHORJUDQNWHVWIRUWKLVH[DPSOH EXWQRWVLJQLILFDQW %RWKWKHORJUDQNDQGWKHJHQHUDOL]HG:LOFR[RQWHVWVDUHQRQSDUDPHWULFWHVWV DQGUHTXLUHQRDVVXPSWLRQVUHJDUGLQJWKHGLVWULEXWLRQRIHYHQWWLPHV:KHQD JUHDWHU HYHQW UDWH LV VHHQ HDUO\ LQ WKH WULDO UDWKHU WKDQ WRZDUG WKH HQG RI WKH WULDOWKHJHQHUDOL]HG:LOFR[RQWHVWLVWKHPRUHDSSURSULDWHWHVWEHFDXVHLWJLYHV JUHDWHUZHLJKWWRWKHHDUOLHUGLIIHUHQFHV 6XUYLYDO DQG IDLOXUH WLPHV RIWHQ IROORZ WKH H[SRQHQWLDO GLVWULEXWLRQ ,I VXFK DPRGHOFDQEHDVVXPHGDPRUHSRZHUIXODOWHUQDWLYHWR WKH ORJUDQN WHVW LV WKH OLNHOLKRRG UDWLR WHVW WKH UHVXOWV RI ZKLFK DUH DOVR SULQWHGLQWKH6$6RXWSXWDV±/RJ/5 2XWSXWVKRZVDOLNHOLKRRGUDWLR FKLVTXDUHRIZLWKDSYDOXHRI 11 7KLV SDUDPHWULF WHVW DVVXPHV WKDW HYHQW SUREDELOLWLHV DUH FRQVWDQW RYHU WLPH 7KDWLVWKHFKDQFHWKDWDSDWLHQWEHFRPHVHYHQWSRVLWLYHDWWLPHWJLYHQWKDWKH LVHYHQWQHJDWLYHXSWRWLPHWGRHVQRWGHSHQGRQW$SORWRIWKHQHJDWLYHORJ RIWKHHYHQWWLPHVGLVWULEXWLRQWKDWVKRZVDOLQHDUWUHQGWKURXJKWKHRULJLQLV FRQVLVWHQW ZLWK H[SRQHQWLDO HYHQW WLPHV ,Q 6$6 WKH RSWLRQ 3/276 /6 LQ 352&/,)(7(67FDQEHVSHFLILHGWRREWDLQVXFKDSORW /LIH WDEOHV FDQ EH FRQVWUXFWHG WR SURYLGH HVWLPDWHV RI WKH HYHQW WLPHGLVWULEXWLRQV(VWLPDWHVFRPPRQO\XVHGIRUGDWDVHWVVLPLODUWRWKDWIRU ([DPSOH DUH NQRZQ DV WKH .DSODQ0HLHU HVWLPDWHV 7KHVH DUH WKH GHIDXOW HVWLPDWHV JLYHQ E\ 6$6 DOWKRXJK WKH RXWSXW FDQ EH VXSSUHVVHG E\ XVLQJ WKH 2'6 VWDWHPHQW $ SORW RI WKH GLVWULEXWLRQ IXQFWLRQ YV WLPH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
SURYLGHV D JRRG YLVXDO RI WKH GLVWULEXWLRQV 6$6 ZLOO SURYLGH VXFK D SORW LI \RX XVH WKH 3/276 6 RSWLRQ LQ 352& /,)(7(67 DV LOOXVWUDWHG LQ 2XWSXW12 $GGLWLRQDOVXPPDU\VWDWLVWLFVSULQWHGE\6$6WKDWPLJKWEHXVHIXO DUH WKH HVWLPDWHG PHGLDQ DQG PHDQ HYHQW WLPHV IRU HDFK JURXS 7KH PHGLDQ HYHQWWLPHVDUHVKRZQLQ2XWSXWDVWKHTXDUWLOHDQGLISRVVLEOHWR FRPSXWHWKH\DUHDFFRPSDQLHGE\FRQILGHQFHLQWHUYDOV,Q([DPSOH WKHPHGLDQWLPHVXQWLOILUVWUHFXUUHQFHDUHDQGZHHNVUHVSHFWLYHO\IRU WKHJ'DQGSODFHERYDFFLQHJURXSV 13 ZKLOHWKHHVWLPDWHVRIWKHPHDQVDUH DQGZHHNVUHVSHFWLYHO\ 14 7KHHVWLPDWHVRIWKHPHDQVZLOODOZD\V EH ELDVHG XQOHVV DOO SDWLHQWV EHFRPH HYHQWSRVLWLYH E\ VWXG\ FRPSOHWLRQ $V QRWHG LQ WKH 6$6 RXWSXW WKHVH HVWLPDWHV DUH ELDVHG GXH WR FHQVRUHG REVHUYDWLRQV 7KHORJUDQNWHVWLVSUHVHQWHGKHUHDVDPHDQVIRUFRPSDULQJWZR WUHDWPHQWJURXSV7KHVDPHPHWKRGVFDQEHXVHGZLWKRQHWUHDWPHQWJURXSWR FRPSDUHWZRVWUDWD)RUH[DPSOHDFRPSDULVRQRIHYHQWWLPHVEHWZHHQPDOHV DQGIHPDOHVZKRDUHWUHDWHGZLWKWKHVDPHWHVWSUHSDUDWLRQFDQEHPDGHZLWK WKHORJUDQNWHVW 7KHORJUDQNWHVWFDQHDVLO\EHH[WHQGHGWRPRUHWKDQWZRJURXSV :LWKJJURXSVRUVWUDWDWKHK\SRWKHVLVRIHTXDOHYHQWWLPHGLVWULEXWLRQVFDQEH WHVWHGZLWK
å J
L
2 L - ( L (L
ZKLFKKDVDQDSSUR[LPDWHFKLVTXDUHGLVWULEXWLRQZLWKJ±GHJUHHVRIIUHHGRP XQGHU+ ,WLVHDV\WRVHHWKDWWKHWHFKQLTXHVRIWKHORJUDQNWHVWFDQEHXVHG WRFRPSDUHHYHQWWLPHVDPRQJWKHOHYHOVRIDQ\FDWHJRULFDOIDFWRU
&RYDULDWHV FDQ EH FRQVLGHUHG ZKHQ FRPSDULQJ HYHQW WLPHV E\ XVLQJ DQ H[WHQVLRQ RI WKH ORJUDQN WHVW )RU ([DPSOH WKH FRYDULDWH ; QXPEHURIHSLVRGHVLQSDVW\HDU FDQEHLQFOXGHGLQWKHORJUDQNWHVWWRREWDLQ FRYDULDWHDGMXVWHGFRPSDULVRQRIILUVWUHFXUUHQFHWLPHVEHWZHHQJURXSV 2QH PHWKRG XVLQJ 6$6 LV WR LQFOXGH WKH 9DFFLQH *URXS HIIHFW LQ D 7(67 VWDWHPHQW DQG WKH FRYDULDWH ; LQ WKH 675$7$ VWDWHPHQW %HFDXVH ; LV FRQVLGHUHGDFRQWLQXRXVQXPHULFYDULDEOHLQWHUYDOVRI;QHHGWREHVSHFLILHG IRU GHWDLOV VHH WKH 6$6 +HOS DQG 'RFXPHQWDWLRQ IRU 352& /,)(7(67 $GGLWLRQDOO\DQ\YDULDEOHVLQFOXGHGLQWKH7(67VWDWHPHQWPXVWEHQXPHULF VRWKH*URXSHIIHFW9$&ZLWKYDOXHV*'DQG3%2 PXVWILUVWEHFRQYHUWHG IURP FKDUDFWHU WR QXPHULF YDOXHV %HFDXVH UHVXOWV DUH GHSHQGHQW RQ WKH LQWHUYDOV FKRVHQ IRU ; WKLV LV QRW UHFRPPHQGHG XQOHVV WKH FRQWLQXRXV FRYDULDWHEUHDNVLQWRQDWXUDOLQWHUYDOV
&KDSWHU7KH/RJ5DQN7HVW
6$6 FDQ DOVR DFFRPPRGDWH PXOWLSOH FRYDULDWHV LQ WKH 7(67 VWDWHPHQW LQ 352&/,)(7(676$6ZLOOILUVWGHWHUPLQHWKHVLJQLILFDQFHRIHDFKLQGLYLGXDO FRYDULDWH¶VDVVRFLDWLRQZLWKHYHQWWLPH7KHQE\LQFOXGLQJWKHPRVWVLJQLILFDQW FRYDULDWHVRQHDWDWLPHXVLQJDVWHSZLVHPHWKRG6$6GHWHUPLQHVWKHOHYHORI DVVRFLDWLRQ RI HDFK FRYDULDWH WR HYHQW WLPH DGMXVWHG IRU DOO RWKHU FRYDULDWHV DOUHDG\LQFOXGHG7KH675$7$VWDWHPHQWLVQRWXVHG%HFDXVHRIWKHQDWXUHRI DVWHSZLVHPHWKRGWKHUHLVQRZD\WRSUHGLFWEHIRUHFRQGXFWLQJWKHDQDO\VLV ZKHWKHUWKH*URXSHIIHFWLQ([DPSOHZRXOGEHDGMXVWHGIRUWKHFRYDULDWH ;ZKHQXVLQJWKLVDSSURDFK 352& 3+5(* LQ 6$6 LV RIWHQ PRUH XVHIXO WKDQ 352& /,)(7(67 ZKHQ FRPSDULQJ VXUYLYDO WLPHVLQWKHSUHVHQFHRIFDWHJRULFDOFRIDFWRUVRUQXPHULF FRYDULDWHV7KLVSURFHGXUHLVSUHVHQWHGLQ&KDSWHU
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
CHHAAPPTTEERR 22 The Cox Proportional Hazards Model 22.1 22.2 22.3 22.4
Introduction...................................................................................................... 365 Synopsis ............................................................................................................ 366 Examples........................................................................................................... 367 Details & Notes................................................................................................. 376
22.1 Introduction Like the log-rank test discussed in the previous chapter, the Cox proportional hazards model (sometimes referred to as ‘Cox regression’) is used to analyze event or ‘survival’ times. This is a modeling procedure that adjusts for censoring, may include numeric covariates, one or more of which may be time-dependent, and requires no assumption about the distribution of event times. An example of where the Cox proportional hazards model might be useful in clinical trials is a comparison of time to death among treatment groups with adjustments for age, duration of disease, and family disease history. Informally, the inverse of the time to event occurrence is called the ‘hazard’, given a suitable time unit. For example, if an event is expected to occur in 6 months, the hazard is 1/6 (‘events per month’), assuming the unit of time is ‘months’. Of chief concern when using the Cox proportional hazards model is the way in which the hazard changes over time. The hazard function is discussed in more detail later in this chapter (see Section 22.4.2). The hazard of some events might be likely to increase with the passage of time, such as death due to certain types of cancer. Other types of events might become less frequent over longer periods of time. For example, reinfarction following coronary artery bypass can have the greatest risk in the weeks immediately following surgery, then risk might decrease as time passes. Still other types of events might be subject to fluctuating risks, both increasing and decreasing, over time.
366
Common Statistical Methods for Clinical Research with SAS Examples
Theoretically, survival analysis would be pretty easy if you could assume that the hazard remains constant over time. Recognizing the impracticalities of such an assumption, the Cox proportional hazards model adopts the more reasonable assumption that the event hazard rate does change over time, but that the ratio of event hazards between two individuals is constant. This is known as the ‘proportional hazards’ assumption, which requires that the ratio of hazards between any two values of a covariate not vary with time. If Age is a covariate used in the analysis, the proportional hazards assumption would require, for example, that the ratio of the hazard for 60-year-old patients to the hazard for 35-year-old patients be constant over time.
22.2 Synopsis The Cox proportional hazards model is used to test the effect of a specified set of k covariates, X1, X2, ..., Xk on the event times. The Xi’s can be numeric-valued covariates or numerically coded categorical responses that have some natural ordering. The Xi’s can also be dummy variables that represent treatment groups in a comparative trial or levels of a nominal categorical factor as described for logistic regression (see Section 20.4.6). The model for the hazard function of the Cox proportional hazards method has the form
h(t) = λ(t) × e (β1X1 + β 2X 2 + ... + β k X k ) where h(t) represents the hazard function, the Xi’s represent the covariates, the βi’s represent the parameter coefficients of the Xi’s, and λ(t) represents an unspecified initial hazard function. As with regression analysis, the magnitudes of the βi’s reflect the importance of the covariates in the model, and inference about these parameters is the focus here. D. R. Cox, after whom the model is named, developed a method for estimating βi (i=1,2,...,k) by bi based on a ‘maximum partial likelihood’ approach, which is a modification of the maximum likelihood method mentioned in Chapter 20. Manual computations are impractical, and solutions generally require numerical techniques using high-speed computers. When using SAS, the PHREG procedure is recommended for fitting the Cox proportional hazards model.
Chapter 22: The Cox Proportional Hazards Model 367
For large samples, these estimates (bi) have an approximate normal distribution. If sb represents the standard error of the estimate bi, then bi/sb has an approximate standard normal distribution under the null hypothesis that βi = 0, and its square has the chi-square distribution with 1 degree of freedom. The test summary for each model parameter, βi, can be summarized as follows:
null hypothesis: alternative hypothesis:
H0: βi = 0 HA: βi =/ 0
test statistic:
æ bi ö 2 χ w = çç ÷÷ è sb ø
decision rule:
reject H0 if χ
2
2 w
2
> χ (α) 1
The magnitude of the effect of a covariate is often expressed as a hazard ratio, similar to the odds ratio discussed in Chapter 20. Because the hazard at any time, t, for a specified set of covariate values, x1, x2, …, xk, is given by
h(t) = λ(t) × e (β1x 1 + β 2 x 2 + ... + β k x k ) the ratio of hazards for a 1-unit increase in Xi (all other covariates held constant) is β e i. The estimated hazard ratio associated with the covariate Xi is found by simply exponentiating the parameter estimate, bi . Example 22.1, which follows, illustrates the Cox proportional hazards model with a simple example using one dichotomous group variable and one numeric covariate.
22.3 Examples p Example 22.1 -- HSV-2 Episodes with gD2 Vaccine (continued)
(See Example 21.1) The data in Example 21.1 include a history of HSV-2 lesional episodes during the year prior to the study (X). Does the covariate X have any impact on the comparison of HSV-2 recurrence times between treatment groups?
368
Common Statistical Methods for Clinical Research with SAS Examples
Solution Although covariates can be included in the TEST or STRATA statements when using PROC LIFETEST in SAS (Chapter 21), it is often easier to use PROC PHREG to model survival distributions that include one or more covariates. The SAS analysis below shows how to use PROC PHREG to fit the Cox proportional hazards model for this example.
SAS Analysis of Example 22.1 PROC PHREG requires all covariates included in the model to be numeric. The number of episodes in the prior 12 months (X) is a numeric variable. The vaccine group (VAC), however, must be converted. You can do this conversion with the new variable, TRT, which takes the value 1 for the gD2 vaccine group and 0 for the placebo group (see the SAS code for Example 21.1). With PROC PHREG, the MODEL statement is used with the time variable and censoring indicator on the left side of the equal sign (=) and the covariates on the right side of the equal sign (=) . Recall from Example 21.1 that the indicator variable CENS takes a value of 1 if the observation is censored, 0 if it is not censored. The left side of the MODEL equality is specified in the same format as the TIME statement in PROC LIFETEST for identifying censored values. You specify the vaccine group factor, TRT, and the HSV-2 lesion frequency during the prior year, X, as the two covariates. The TIES option is used in the event that more than one observation has tied event times. This is discussed further in Section 22.4.9. The results are shown in Output 22.1. The estimate of the parameter associated with the covariate X is 0.176 , which is significant (p = 0.0073) based on the chi-square value of 7.1896. This means that prior lesional frequency is an important consideration when analyzing the event times. The test for the group effect (TRT) is also significant (p = 0.0160) and indicates a difference in the distributions of event times between vaccine groups after adjusting for the covariate, X. The hazard ratio, 0.404 , indicates that the covariate-adjusted ‘hazard’ of HSV-2 recurrence for the active (gD2) group is only about 40% of that for the placebo group. The log-rank test using the same data set (ignoring the covariate, X) resulted in a significant finding with a p-value close to 0.05 (Example 21.1). The Cox proportional hazards model resulted in an even greater significance (p = 0.016) after adjusting for the important covariate, X.
Chapter 22: The Cox Proportional Hazards Model 369
SAS Code for Example 22.1 . . . (continued from Example 21.1) PROC PHREG DATA = HSV; MODEL WKS*CENS(1) = TRT X / TIES = EXACT; TITLE1 'Cox Proportional Hazards Model'; TITLE2 “Example 22.1: HSV-2 Episodes with gD2 Vaccine cont'd.” ; RUN;
OUTPUT 22.1. SAS Output for Example 22.1 Cox Proportional Hazards Model Example 22.1: HSV-2 Episodes with gD2 Vaccine - cont'd. The PHREG Procedure Model Information Data Set Dependent Variable Censoring Variable Censoring Value(s) Ties Handling
WORK.HSV WKS CENS 1 EXACT
Summary of the Number of Event and Censored Values
Total
Event
Censored
Percent Censored
48
31
17
35.42
Convergence Status Convergence criterion (GCONV=1E-8) satisfied. Model Fit Statistics
Criterion -2 LOG L AIC SBC
Without Covariates
With Covariates
183.628 183.628 183.628
172.792 176.792 179.660
Testing Global Null Hypothesis: BETA=0 Test Likelihood Ratio Score Wald
Chi-Square 10.8364 11.0159 10.7236
DF 2 2 2
Pr > ChiSq 0.0044 0.0041 0.0047
Analysis of Maximum Likelihood Estimates
Variable TRT X
DF 1 1
Parameter Standard Estimate Error Chi-Square Pr > ChiSq -0.90578 0.37600 5.8033 0.0160 0.17627 0.06574 7.1896 0.0073
Hazard Ratio 0.404 1.193
370
Common Statistical Methods for Clinical Research with SAS Examples
The next example illustrates the use of stratification in Cox regression and introduces the concept of time-dependent covariates. p Example 22.2 -- Hyalurise in Vitreous Hemorrhage
A new treatment, Hyalurise, is being tested to facilitate the clearance of vitreous hemorrhage, a condition that results in severe vision impairment. Patients were randomly assigned to receive a single injection of either Hyalurise (n =83) or saline (n = 60) in the affected eye and were followed for 1 year. Time, in weeks, to ‘response’ is shown in Table 22.1. (‘Response’ is defined as sufficient clearance of the hemorrhage to permit diagnosis of the underlying cause and appropriate intervention.) The time for patients who discontinued or completed the trial before achieving ‘response’ is considered censored (†). Covariates include a measure of baseline hemorrhage density (Grade 3 or 4), study center (A, B, or C), and whether the patient developed certain infectious complications (ocular infection, inflammation, or herpetic lesion) during the study (based on ‘Infect. Time’ in Table 22.1). Does the time to response differ between the patients treated with Hyalurise and the control group treated with saline?
Solution The response variable is time, in weeks, from randomization to ‘response’, some of whose values are censored. Because this analysis clearly calls for ‘survival’ methodology that incorporates covariates, you use PROC PHREG in SAS to fit a Cox regression model under the assumption of proportional hazards. In addition to the treatment group effect, you want to include as covariates: baseline hemorrhage density, study center, and the occurrence of infectious complications. Notice that the last covariate is a bit unusual. The occurrence of complications can take the values ‘yes’ or ‘no’, which can be coded 1 and 0, respectively. However, these values can change during the study and depend on the time of the response. Clearly, the degree to which complications affect ‘response’ is only relevant if the complication develops before ‘response’ occurs. For this reason, this is called a ‘time-dependent’ covariate. Such time-dependent covariates can be included in the Cox proportional hazards model, as illustrated in the SAS analysis that follows.
SAS Analysis of Example 22.2 In the SAS program for Example 22.2, the variables are coded as follows: RSPTIM TRT DENS CENTER INFTIM
= time (weeks) from randomization to ‘response’ = treatment group (0=saline, 1=Hyalurise) = baseline hemorrhage density (3 or 4) = Study Center (A, B, or C) = time (weeks) from randomization to onset of complication.
Chapter 22: The Cox Proportional Hazards Model 371
TABLE 22.1. Raw Data for Example 22.2 -------------------------------- Hyalurise Injection --------------------------------Patient Number
Density
Infect. Time
Resp. Time
Patient Number
Density
Infect. Time
Resp. Time
A
101 102 105 106 108 110 112 114 116 117 119 121 123 125 126 128 130
3 4 4 3 4 3 4 3 3 4 3 4 3 4 3 3 4
. 10 52 . . . . 36 . . . . 28 . . . .
32 20 24 41 32 10 † 32 45 6 52 † 13 6 44 23 30 2 26
132 133 134 136 139 140 141 144 146 148 150 151 153 155 156 159 160
4 4 3 3 3 4 3 3 4 4 4 3 3 3 3 4 4
18 . . . . 20 12 . . 24 . 4 . . . . .
46 21 14 24 † 19 24 31 52 † 40 7 16 36 28 38 † 9 30 6
B
202 203 206 207 208 210 212 214 215 217 219
3 4 3 4 3 4 3 4 4 3 3
. . . . . . . 30 . 33 .
4† 10 11 52 † 12 46 10 31 4 20 2
220 223 224 225 226 228 230 231 233 234 237
4 4 3 3 3 3 4 4 3 4 3
. . 24 . . . . . 10 . .
25 20 9 22 23 13 8 37 24 52 † 27
C
301 303 304 306 309 311 312 313 315 317 318 320 321 323
3 4 4 4 3 3 3 4 3 4 3 3 3 3
. . 20 18 . . . . . . . . 16 .
10 32 14 14 7 5 9 12 10 52 † 10 5 34 6
324 327 329 330 332 334 337 338 340 341 342 344 346
4 3 3 3 3 4 3 3 3 4 4 4 3
. 30 . 30 . . . 20 . 18 . . .
22 15 20 52 † 10 2† 16 33 12 27 2 20 † 3
Center
† indicates censored observation
372
Common Statistical Methods for Clinical Research with SAS Examples
TABLE 22.1. Data Set for Example 22.2 (continued) ----------------------------------- Saline Injection ------------------------------------Patient Number
Density
Infect. Time
Resp. Time
Patient Number
Density
Infect. Time
A
103 104 107 109 111 113 115 118 120 122 124 127 129
3 4 4 3 3 4 4 4 4 4 3 3 3
. . 12 20 . 38 . 28 . . . . 6
52 † 4 32 25 8 52 † 14 † 9 36 42 † 52 † 16 21
131 135 137 138 142 143 145 147 149 152 154 157 158
3 3 3 3 4 4 3 3 4 3 3 3 4
. . . . . 34 . . . . 42 . .
8 52 † 22 † 35 † 28 52 † 35 4 2† 12 48 11 20
B
201 204 205 209 211 213 216 218
3 4 3 3 4 3 3 3
. 9 . . 44 . . .
15 † 10 8 20 32 42 24 † 16
221 222 227 229 232 235 236
4 4 3 4 4 4 4
. 40 . . 26 . .
11 52 † 11 7† 32 20 34
C
302 305 307 308 310 314 316 319 322 325
4 3 3 3 4 3 4 4 4 4
8 . . . . . 36 12 . 12
10 † 24 36 52 † 6 21 26 27 14 45
326 328 331 333 335 336 339 343 345
3 4 4 3 4 4 3 4 3
. 22 . 15 . . 11 . .
16 33 5 16 17 16 24 6† 14
Center
† indicates censored observation
In this example, the variable CENS is created to identify censored observations associated with a value of 0 . Thus, the left side in the MODEL statement in PROC PHREG is ‘RSPTIM*CENS(0) =’ . The baseline hemorrhage density (DENS) is easily included in the MODEL statement as a dichotomous numeric covariate (Grade 4 indicates the most severe condition). Notice that Study Center is a nominal categorical factor that cannot be incorporated directly as a covariate because PROC PHREG can only use numeric valued covariates. You can create two new 0 and 1 dummy variables as described in Chapter 20. However, because differences among centers are not of direct interest, it is easier to include CENTER in a STRATA statement . Essentially,
Resp. Time
Chapter 22: The Cox Proportional Hazards Model 373
this tells SAS to treat the centers as three distinct clusters and combine the results across centers. CENTER effects are not estimated, and use of CENTER in the STRATA statement does not require the proportional hazards assumption with respect to Study Center. To include the possible effect of occurrence of complications in the analysis, you must create a new, time-dependent covariate (INFCTN) using a special feature in PROC PHREG that allows SAS DATA step type programming statements to be used in the procedure. This ensures that the covariate takes the correct values (0 or 1) at the appropriate time points. INFCTN is defined as 1 if a complication occurs prior to response, and 0, if a complication occurs after response or not at all . In Output 22.2, you see the analysis of the maximum likelihood estimates of the model parameters, including p-values based on the Wald chi-square tests. The pvalue for treatment effect (TRT), which controls for the covariates, is 0.0780 11 . The hazard ratio of 1.412 12 indicates that the ‘hazard’ of responding at any arbitrary time point is 41.2% higher with Hyalurise injection compared with saline. Neither the numeric covariate, baseline hemorrhage density (DENS), nor the timedependent covariate (INFCTN) is significant. Notice that no tests are performed for the Study Center effect because Study Center is a categorical factor included in the STRATA statement. SAS Code for Example 22.2 DATA VITCLEAR; INPUT PAT RSPTIM TRT CENTER $ DENS INFTIM @@; CENS = (RSPTIM GE 0); RSPTIM = ABS(RSPTIM); /* RSPTIM = time (wks) from randomization to response (RSPTIM is censored if negative) */ /* TRT = 1 for Hyalurise, TRT = 0 for Saline */ /* CENTER = study center (A, B, or C) */ /* DENS = 3, 4 for Grade 3 or 4, respectively */ /* INFTIM = time (wks) from randomization to onset of infection or other complications */ DATALINES; 101 32 1 A 3 . 102 20 1 A 4 10 103 -52 0 A 3 . 104 4 0 A 4 . 105 24 1 A 4 52 106 41 1 A 3 . 107 32 0 A 4 12 108 32 1 A 4 . 109 25 0 A 3 . 110 -10 1 A 3 . 111 8 0 A 3 . 112 32 1 A 4 . 113 -52 0 A 4 38 114 45 1 A 3 36 115 -14 0 A 4 . 116 6 1 A 3 . 117 -52 1 A 4 . 118 9 0 A 4 28 119 13 1 A 3 . 120 36 0 A 4 . 121 6 1 A 4 . 122 -42 0 A 4 . 123 44 1 A 3 28 124 -52 0 A 3 . 125 23 1 A 4 . 126 30 1 A 3 . 127 16 0 A 3 . 128 2 1 A 3 . 129 21 0 A 3 6 130 26 1 A 4 . 131 8 0 A 3 . 132 46 1 A 4 18 133 21 1 A 4 .
374
Common Statistical Methods for Clinical Research with SAS Examples
SAS Code for Example 22.2 (continued) 134 137 140 143 146 149 152 155 158 201 204 207 210 213 216 219 222 225 228 231 234 237 303 306 309 312 315 318 321 324 327 330 333 336 339 342 345 ;
14 -22 24 -52 40 -2 12 -38 20 -15 10 -52 46 42 -24 2 -52 22 13 37 -52 27 32 14 7 9 10 10 34 22 15 -52 16 16 24 2 14
1 0 1 0 1 0 0 1 0 0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 0
A A A A A A A A A B B B B B B B B B B B B B C C C C C C C C C C C C C C C
3 3 4 4 4 4 3 3 4 3 4 4 4 3 3 3 4 3 3 4 4 3 4 4 3 3 3 3 3 4 3 3 3 4 3 4 3
. . 20 34 . . . . . . . . . . . . . . . . . . . . . . . . 16 . 30 30 . . 11 . .
135 -52 0 A 3 . 138 -35 0 A 3 . 141 31 1 A 3 12 144 -52 1 A 3 . 147 4 0 A 3 . 150 16 1 A 4 . 153 28 1 A 3 . 156 9 1 A 3 . 159 30 1 A 4 . 202 -4 1 B 3 . 205 8 0 B 3 . 208 12 1 B 3 . 211 32 0 B 4 44 214 31 1 B 4 20 217 20 1 B 3 33 220 25 1 B 4 . 223 20 1 B 4 . 226 23 1 B 3 . 229 -7 0 B 4 . 232 32 0 B 4 26 235 20 0 B 4 . 301 10 1 C 3 . 304 14 1 C 4 20 307 36 0 C 3 . 310 6 0 C 4 . 313 12 1 C 4 . 316 26 0 C 4 36 319 27 0 C 4 12 322 14 0 C 4 . 325 45 0 C 4 12 328 33 0 C 4 22 331 5 0 C 4 . 334 -2 1 C 4 . 337 16 1 C 3 . 340 12 1 C 3 . 343 -6 0 C 4 . 346 3 1 C 3 .
136 139 142 145 148 151 154 157 160 203 206 209 212 215 218 221 224 227 230 233 236 302 305 308 311 314 317 320 323 326 329 332 335 338 341 344
-24 19 28 35 7 36 48 11 6 10 11 20 10 4 16 11 9 11 8 24 34 -10 24 -52 5 21 -52 5 6 16 20 10 17 33 27 -20
1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 1 0 1 1 0 0 0 0 1 0 1 1 1 0 1 1 0 1 1 1
A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C C C C C C C
3 3 4 3 4 3 3 3 4 4 3 3 3 4 3 4 3 3 4 3 4 4 3 3 3 3 4 3 3 3 3 3 4 3 4 4
. . . . 24 4 42 . . . . . . . . . . . . 10 . 8 . . . . . . . . . . . 20 18 .
PROC PHREG DATA = VITCLEAR; MODEL RSPTIM*CENS(0) = TRT DENS INFCTN / TIES = EXACT; IF INFTIM GT RSPTIM OR INFTIM = . THEN INFCTN = 0; ELSE INFCTN = 1; STRATA CENTER; TITLE1 'Cox Proportional Hazards Model'; TITLE2 'Example 22.2: Hyalurise in Vitreous Hemorrhage' ; RUN;
Chapter 22: The Cox Proportional Hazards Model 375
OUTPUT 22.2. SAS Output for Example 22.2 Cox Proportional Hazards Model Example 22.2: Hyalurise in Vitreous Hemorrhage The PHREG Procedure Model Information Data Set Dependent Variable Censoring Variable Censoring Value(s) Ties Handling
WORK.VITCLEAR RSPTIM CENS 0 EXACT
Summary of the Number of Event and Censored Values
Stratum
CENTER
Total
Event
Censored
Percent Censored
1 A 60 45 15 25.00 2 B 37 30 7 18.92 3 C 46 39 7 15.22 ------------------------------------------------------------------Total 143 114 29 20.28 Convergence Status Convergence criterion (GCONV=1E-8) satisfied. Model Fit Statistics
Criterion -2 LOG L AIC SBC
Without Covariates
With Covariates
641.460 641.460 641.460
636.479 642.479 650.688
Testing Global Null Hypothesis: BETA=0 Test Likelihood Ratio Score Wald
Chi-Square
DF
4.9809 4.9703 4.9251
3 3 3
Pr > ChiSq 13 0.1732 0.1740 0.1774
Analysis of Maximum Likelihood Estimates
Variable
DF
Parameter Estimate
Standard Error
Chi-Square
TRT DENS INFCTN
1 1 1
0.34500 -0.23884 0.12835
0.19577 0.19598 0.27797
3.1057 1.4852 0.2132
Pr>ChiSq 0.0780 11 0.2230 0.6443
Hazard Ratio 1.412 12 0.788 1.137
376
Common Statistical Methods for Clinical Research with SAS Examples
22.4 Details & Notes 22.4.1. The Cox proportional hazards model is a ‘semi-parametric’ approach that does not require the assumption of any specific hazard function or distribution of survival times. In this sense, it takes the character of a nonparametric test. However, as in regression analysis, this method models the survival times as a function of the covariates with the goal of obtaining estimates of the model parameters (βi’s). Various types of functions can be used and different models can be specified for the distribution of the error variation. In this sense, the Cox proportional hazards model is parametric. 22.4.2. The hazard is a function of time, t. Formally, the hazard represents the probability that the event of interest occurs at a specified time (t) given that it has not occurred prior to time (t). If the hazard is constant over time, event times can be modeled with the exponential distribution. The natural log of the hazard function is ln h(t) = λ(t) + β1X1 + β2X2 +…+ βkXk If the event times have an exponential distribution, λ(t) is a constant, independent of t, such as α, and the model becomes ln h(t) = α + β1X1 + β2X2 +…+ βkXk which means the hazard is constant with respect to time and is a simple linear function of the covariates, Xi’s, on the log-scale. Section 21.4.5 describes how to test for exponential event times by checking whether a plot of the log of the survival probabilities is linear through the origin. This test is a direct result of the fact that the exponential distribution is associated with a constant hazard function (i.e., risk is independent of time). In general, if a model for the hazard function can be assumed, then the survival distributions can be specified. In fact, the cumulative survival distribution is a function of the hazard function integrated over time. Therefore, the survival distribution can be described by the hazard function, and, conversely, the survival distribution uniquely defines a hazard function. When the survival distribution is known or can be accurately assumed, a parametric analysis can be applied, often with greater efficiency than the nonparametric log-rank test or the ‘semi-parametric’ Cox proportional hazards model. The parametric approach with a known survival distribution can easily incorporate covariates into the analysis. In addition to exponential event times, widely used models include the log-normal, Weibull, gamma, and log-logistic distributions. The SAS procedure LIFEREG can be used to fit any of these parametric models with any number of covariates.
Chapter 22: The Cox Proportional Hazards Model 377
The main disadvantage of the parametric approach is that, in many cases, model assumptions might be incorrect and there might not be enough data to adequately test the assumptions or, perhaps, the actual hazard function is not well represented by any of those in common usage. Often, the results are highly dependent on the appropriate distributional assumptions. 22.4.3. The strength of the Cox proportional hazards model is its ability to incorporate background factors as covariates. When many covariates are being considered, this method can be used in an exploratory fashion to build a model that includes significant covariates and excludes the covariates that have little apparent effect on the event times. This stepwise procedure is demonstrated in the SAS OnlineDoc for PROC PHREG. 22.4.4. PROC PHREG prints the results of a global test that determines whether any of the covariates included in the model are significant after adjusting for all the other covariates. As shown in the output for Example 22.1 , three different methods are used: "Likelihood Ratio", "Score" and "Wald". Each of these is a goodness-of-fit test with an approximate chi-square distribution with 2 degrees of freedom under the global null hypothesis of ‘no covariate effects’. With the significant results of p < 0.005 for all three methods in the example, you conclude that fluctuations in event times are associated with either vaccine group (TRT) or prior lesional frequency (X) or both. A poor fit is indicated by the global tests in Example 22.2, all of which show non-significant chi-square tests with p-values of approximately 0.17 13 . 22.4.5. A positive parameter estimate indicates that the hazard increases with increasing values of the covariate. A negative value indicates that the hazard decreases with increasing values of the covariate. In Example 22.1, the negative parameter estimate for TRT (–0.906) indicates that the hazard of HSV-2 recurrence decreases with active treatment (TRT=1) compared with a placebo (TRT=0). The hazard increases with larger prior lesional frequencies (X), as seen by the positive parameter estimate for X (0.176). The term ‘hazard’ is often associated with the event of death or disease. However, in survival analysis, ‘hazard’ does not always connote a danger or an undesirable event. In Example 22.2, the event of interest is ‘response,’ which is a desirable result. Although it sounds awkward, an increasing ‘hazard’ in the desirable response with active treatment compared with saline injection is seen. Notice that the parameter estimate for TRT is positive, and the hazard ratio for the treatment effect is greater than 1 (1.412 12 ), just the opposite of Example 22.1. 22.4.6. The model for the hazard function using Cox proportional hazards with only one covariate (k=1) is
h(t) = λ(t) ⋅ e βX
378
Common Statistical Methods for Clinical Research with SAS Examples
For two patients with different values of X, such as Xu and Xv, the ratio of hazards at time t is
λ(t) ⋅ eβ X u h u (t) = = e β( X u − X v ) λ(t) ⋅ e β X v h v (t) which is constant with respect to time, t. This is the proportional hazards assumption. If X is a dummy variable with Xu = 1 (the active group) and Xv = 0 (the β placebo group), Xu – Xv = 1 so that e represents the ratio of the hazard that’s associated with the active treatment to the hazard that’s associated with the placebo. This is the hazard ratio in the SAS printout for Example 22.1 and Example 22.2 12 . Note: In SAS releases prior to Version 7, the hazard ratio is labeled ‘Risk Ratio’ in SAS output. –0.906
= 0.404. This can be In Example 22.1, the hazard ratio for TRT is e interpreted as the hazard at any time t for the active group is only 40.4% of the placebo group’s hazard. The hazard ratio for X is 1.193. Because X is a numeric variable (rather than a numerically coded categorical variable), you interpret this as a 19.3% increase in the hazard of an HSV-2 recurrence for each additional prior lesion that was experienced. 22.4.7. Example 22.2 illustrates the inclusion of a dichotomous timedependent covariate. Continuous numeric covariates can also be timedependent. For example, time to a certain response in a nutritional study might depend on a subject’s protein intake, which might change during the course of the study. This type of continuous numeric time-dependent covariates can also be included in PROC PHREG. Details and an excellent discussion of timedependent covariates using PROC PHREG can be found in Survival Analysis Using the SAS System, by Paul Allison (1995), which was written under the SAS Books By Users program. 22.4.8. ‘Non-proportional hazards’ implies an increasing or decreasing hazard ratio over time. In PROC PHREG, this can be represented by a covariate-by-time ‘interaction’. The proportional hazards assumption for any covariate can be checked by including the interaction of that covariate with time as a new time-dependent covariate in the MODEL statement in PROC PHREG. For example, to check the proportional hazards assumption for the covariate DENS in Example 22.2, you can include the time-dependent covariate, TDENS, by specifying the following SAS statement: TDENS = DENS*RSPTIM;
after the MODEL statement. A significant test for the parameter that’s associated with TDENS would indicate that the hazard changes over time.
Chapter 22: The Cox Proportional Hazards Model 379
This test is also illustrated in the SAS Help and Documentation for PROC PHREG, which recommends using the natural logarithm of the response times decreased by the average of the log response times to obtain greater stability. Use of this stabilizing transformation in Example 22.2 would require the statement TDENS = DENS*(LOG(RSPTIM ) – 2.85);
in place of the preceding statement (2.85 is the mean of the ln(RSPTIM) over all observations). A visual aid in determining conformance with the assumption of proportional hazards is a plot of the log(–log(S(t))) versus time, where S(t) is the estimated survival distribution. This can be done in SAS using the PLOTS=(LLS) option in the LIFETEST procedure. The graphs of these functions for each group should be parallel under the proportional hazards assumption. Lack of parallelism suggests deviations from the proportional hazards assumption. When the proportional hazards assumption does not seem to hold for a specific covariate, the Cox proportional hazards model might still be used by performing the analysis within each of a series of sub-intervals of the range of the covariate’s values, then combining the results. This is done in SAS by using the STRATA statement, which is illustrated in the SAS code for Example 22.2. If the covariate has continuous numeric values, the strata can be defined by numeric intervals in the STRATA statement (for details, see the SAS Help and Documentation for PROC PHREG). 22.4.9. An adjustment might be necessary when ties occur in event times. SAS has 4 adjustment options for handling ties: BRESLOW, EFRON, EXACT, and DISCRETE. BRESLOW is the default, although it might not be the best option to use. With small data sets or a small number of tied values, the EXACT option is usually preferable. Because of computational resources, the EXACT option is not efficient for larger data sets. The EFRON option provides a good approximation and is computationally faster when large amounts of data are being analyzed.
380
Common Statistical Methods for Clinical Research with SAS Examples
CHHAAPPTTEERR 23 Exercises 23.1 23.2 23.3 23.4
Introduction...................................................................................................... 381 Exercises ........................................................................................................... 381 Appropriateness of the Methods .................................................................... 383 Summary .......................................................................................................... 387
23.1 Introduction The data set, TRIAL, in Chapter 3, is based on a contrived, simplified clinical study that measured three types of response variables for 100 patients. The exercises in this section provide an opportunity to practice applying some of the methods discussed in this book using this data set. Section 23.2 lists the exercises and requests an analysis that uses a specific method referenced in an earlier chapter. The SAS code for performing these exercises is shown in Appendix G. Section 23.3 provides a critique of the methods used and a discussion of their appropriateness, based on the assumptions the analyst is willing to make.
23.2 Exercises Write a SAS program to perform the following analyses using the ‘TRIAL’ data set in Chapter 3. Analyses 1 to 11 use the continuous numeric response SCORE, Analyses 12 to 17 use the dichotomous response RESP, and Analyses 18 to 27 use the ordinal categorical response SEV. 23.2.1.
Using the SCORE response variable:
Analysis 1:
Compare mean responses between treatment groups by using the two-sample t-test (Chapter 5).
Analysis 2:
Apply the Wilcoxon rank-sum test to test for any shifts in the distribution between Treatment Groups A and B (Chapter 13).
Analysis 3:
Using a one-way ANOVA model (Chapter 6), test for a difference in mean response between treatment groups.
382
Common Statistical Methods for Clinical Research with SAS Examples
Analysis 4:
Determine if there is an interaction between treatment group and study center by using the two-way ANOVA (Chapter 7).
Analysis 5:
Re-run the two-way ANOVA without the interaction effect to determine the significance of treatment group differences.
Analysis 6:
Compare treatment group responses by using ANOVA accounting for any effects due to Study Center and Sex. Include all interactions.
Analysis 7:
Check for a significant treatment group effect adjusted for Study Center and Sex by using a main effects ANOVA model.
Analysis 8:
Ignoring all group factors, determine if the response variable SCORE is linearly related to Age (Chapter 10). Plot the data.
Analysis 9:
Determine if the slope of the linear regression in Analysis 8 differs between treatment groups (Chapter 11).
Analysis 10:
Estimate the mean response for each treatment group using Age as a covariate. Assuming the answer to 9 is ‘no’, determine the pvalue for comparing these adjusted means (Chapter 11). Plot the data by treatment group.
Analysis 11:
Compare treatment group means adjusted for Age and Study Center (Chapter 11). Assume there are no interactions.
23.2.2.
Using the RESP response variable:
Analysis 12:
Compare response rates between treatment groups by using the chi-square test (Chapter 16).
Analysis 13:
Compare response rates between treatment groups by using Fisher’s exact test (Chapter 17).
Analysis 14:
Compare response rates between treatment groups controlling for Study Center by using the Cochran-Mantel-Haenszel test (Chapter 19).
Analysis 15:
Compare response rates between treatment groups by using logistic regression (Chapter 20).
Analysis 16:
Compare response rates between treatment groups adjusted for Age assuming a logistic model (Chapter 20).
Analysis 17:
Using a logistic regression model, test for a significant treatment group effect adjusted for Age, Sex, Study Center, and Treatmentby-Center interaction (Chapter 20).
Chapter 23: Exercises 383
23.2.3.
Using the SEV response variable:
Analysis 18:
Test for general association between severity of response and treatment group by using the chi-square test (Chapter 16).
Analysis 19:
Apply the generalized Fisher’s exact test to determine if the distributions of response by severity differ between treatment groups (Chapter 17).
Analysis 20:
Use the chi-square test to determine if the row mean scores differ between treatment groups (Chapter 16). Use ‘table’ scores for coding severity.
Analysis 21:
Repeat Analysis 20 using modified ridit scores in place of ‘table’ scores.
Analysis 22:
Test for a shift in the distribution of severity between treatment groups by using the Wilcoxon rank-sum test (Chapter 13).
Analysis 23:
Test for a difference in mean severity scores between treatment groups controlling for Study Center (Chapter 19). Use ‘table’ scores.
Analysis 24:
Repeat Analysis 23 using modified ridit scores in place of ‘table’ scores.
Analysis 25:
Determine the significance of the treatment group effect by using a logistic regression model under the assumption of proportional odds (Chapter 20).
Analysis 26:
Using a proportional odds model, compare response rates between treatment groups adjusted for Age (Chapter 20).
Analysis 27:
Adjusting for Age, Sex, and Study Center, determine the significance of the treatment group effect by using the proportional odds model (Chapter 20).
23.3 Appropriateness of the Methods SAS code for conducting the analyses in the previous section is given in Appendix G. Table 23.1 summarizes the p-values for the treatment group effect for each of these analyses. Although the exercises are assigned for practice in using SAS to perform commonly used statistical analyses, some of these methods would clearly be inappropriate to use in practice. Let’s take a look at the results.
384
Common Statistical Methods for Clinical Research with SAS Examples
TABLE 23.1. Summary of Significance of Treatment Group Effects from Exercises
Analysis
Method
p-Value for Treatment Group effect
1 2
Analyses Using the SCORE Variable Two-sample t-test Wilcoxon rank-sum test
0.0422 0.0464
3 4 5 6 7 8 9 10 11
One-way ANOVA Two-way ANOVA with interaction Two-way ANOVA without interaction Three-way ANOVA with all interactions Three-way ANOVA main effects model Linear regression ANCOVA / unequal slopes ANCOVA / equal slopes Stratified ANCOVA / equal slopes
0.0422 0.0702 0.0378 0.1036 0.0493 -0.5663 0.0257 0.0252
12 13 14 15 16 17
Analyses Using the RESP Variable Chi-square test Fisher’s exact test Cochran-Mantel-Haenszel test Logistic regression, TRT effect only Logistic regression, TRT and AGE effects Logistic regression, TRT, AGE, SEX, CENTER, TRTxCENTER effects
0.4510 0.4794 0.4113 0.4523 0.3576 0.6984
18 19 20 21 22 23 24 25 26 27
Analyses Using the SEV Variable Chi-square test Generalized Fisher’s exact test Mantel-Haenszel, row mean scores differ / table scores Mantel-Haenszel, row mean scores differ / mod. ridit scores Wilcoxon rank-sum test CMH, row mean scores differ / table scores CMH, row mean scores differ / mod. ridit scores Proportional odds model, TRT effect only Proportional odds model, TRT and AGE effects Proportional odds model, TRT, AGE, SEX, CENTER
0.1216 0.1277 0.0514 0.0468 0.0472 0.0445 0.0471 0.0447 0.0244 0.0313
The SCORE variable is considered a continuous numeric response. As such, you can apply a variety of ANOVA, ANCOVA, and non-parametric methods to test for equality of treatment groups.
Chapter 23: Exercises 385
23.3.1. The two-sample t-test and one-way ANOVA produce the same pvalues (Analyses 1 and 3). This will always be the case when there are only two groups because the ANOVA F-test is functionally equivalent to the t-test in this situation. When using SAS, PROC TTEST would usually be preferable because it automatically provides a test for the assumption of equal variances and computes Satterthwaite’s adjustment for the unequal variance case. 23.3.2. The p-values differ considerably for the two-way ANOVA with (Analysis 4) and without the interaction (Analysis 5). In an unbalanced design, the inclusion of an interaction term in the model, when, in fact, no interaction exists, can produce conservative main effects comparisons based on the SAS Type III sums of squares in the ANOVA. For this reason, many analysts prefer to use the interaction model only as a preliminary test, rather than as a final model. 23.3.3. Notice that inclusion of the Study Center effect using the two-way main effects model (Analysis 5) provides sufficient blocking effect to bring out a slightly more significant treatment effect than the one-way ANOVA (Analysis 3), even though it is not a significant effect. When a third main effect (Sex) is added (Analysis 7), the treatment comparison actually becomes less significant than either the one-way or two-way main effects models. Although such a scenario can occur for a number of reasons, in this case, it appears to be an inconsequential difference based on the sample size imbalance, which affects the SAS Type III sums of squares. The treatment effect can be masked, as in Analysis 6, when exploring interaction effects. 23.3.4. ANCOVA (Analysis 10) reveals that Age might be an important covariate, and the adjusted Treatment Group differences appear more significant than those unadjusted for Age. This comes at the expense of having to assume equal slopes between the two groups, an assumption that is supported by Analysis 9. As you’ve seen previously, the Treatment Group effect is masked when testing for the interaction (equal slope assumption) as in Analysis 9. 23.3.5. Addition of the Study Center effect to the ANCOVA (Analysis 11) does not appreciably alter the significance of the Treatment effect noted in Analysis 10. Overall, it is clear that the inclusion of the Age effect in the model removes more background noise than the inclusion of either Study Center or Sex. 23.3.6. Although the t-test and ANOVA methods assume a normal distribution of the response data, these methods are fairly robust against departures from this assumption, especially with larger sample sizes. A histogram of the data (Output 3.5) does not support the normality assumption in this case. However, treatment comparisons based on ranked data (Analysis 2) result in similar p-values with this size sample (N=100). Therefore, any of the following analyses: 1, 2, 3, 5, 7, 10, or 11, would be
386
Common Statistical Methods for Clinical Research with SAS Examples
appropriate. The analyses intended to explore interactions (4, 6, and 9) would not be the best tests to use for treatment differences when interactions do not exist. In the presence of an interaction that involves Treatment Group, alternative approaches, such as subgroup analyses, might be more appropriate. Notice that the treatment group comparisons adjusted for Age result in the greatest significance. This might be due to Age differences for this sample, or it might be a repeatable effect. In any case, to select appropriate statistical methods for pivotal studies prior to examination of the data, you must rely on trends found from previous studies. If a specific covariate (such as Age) was consistently found to be an important factor in Phase II studies, for example, the pre-specified statistical model to be used for the Phase III program should include that covariate. Usually, the simplest models that contain all factors known to have an important effect on the treatment comparisons should be used in pivotal studies. The RESP variable is a dichotomous response. For this, you can apply multiple categorical methods to analyze differences between treatment groups. 23.3.7. Comparisons of response rates between treatment groups are not significant. Similar p-values for the treatment group comparisons, which range between 0.41 and 0.48, are found by using the chi-square test (Analysis 12), Fisher’s exact test (Analysis 13), the CMH test stratified by Study Center (Analysis 14), and a logistic regression model with no covariates (Analysis 15) (a finding that might be expected in studies of moderate sample sizes, such as this one). With very small sample sizes or response rates close to 0 or 1, Fisher’s exact test would be more appropriate than the other methods. Exact methods, useful in the case of small samples, are also available in logistic regression analysis. 23.3.8. The inclusion of Age as a covariate in the logistic regression model results in a smaller p-value for the treatment effect (Analysis 16). As shown in the ANCOVA (Analysis 10), Age appears to be an important factor when modeling treatment differences. 23.3.9. The inclusion of Study Center, Sex, and the Treatment-by-Center interaction in the logistic regression model results in even less significant treatment comparisons (Analysis 17), further supporting the recommendation to use parsimonious models. This can often be done by using logistic regression in a stepwise fashion to eliminate non-significant effects. Stepwise procedures are most often used in an exploratory role or in earlier phase studies. The SEV variable is an ordinal categorical response with 4 levels. Here, you can apply a rank test, a logistic regression analysis using the proportional odds model, or a number of other categorical procedures.
Chapter 23: Exercises 387
23.3.10. Both the chi-square test (Analysis 18) and the generalized Fisher’s exact test (Analysis 19) are tests for general association. Notice that the p-values for treatment comparisons are much larger than those of the other tests (Analyses 20 to 27) for the SEV response variable. As explained in Chapter 16, these methods are testing a less-specific hypothesis. 23.3.11. The Mantel-Haenszel chi-square test (Analysis 20) for ‘row mean scores differ’ is more appropriate for ordinal data than the test for ‘general association’. The Mantel-Haenszel chi-square tests for equality of mean scores between treatment groups, when the scores are assigned to the categorical levels in integer increments (‘table scores’). In this case, greater significance is found by using the modified ridit scores (Analysis 21) compared with the ‘table scores’ (Chapter 16). This will not always be the case, as seen in Analyses 23 and 24. In these analyses, the CMH test controlling for Study Center is used for comparing mean scores between treatment groups. The p-value using the ‘table scores’ (Analysis 23) is slightly smaller than that based on the modified ridit scores (Analysis 24). 23.3.12. The Wilcoxon rank-sum test (Analysis 22) is actually a special version of the Kruskal-Wallis test. Recall that the Mantel-Haenszel chi-square using modified ridit scores is identical to the Kruskal-Wallis test (Chapter 18). The difference in p-values (0.0472 in Analysis 22 vs. 0.0468 in Analysis 21) is due only to a correction factor. 23.3.13. Logistic regression modeling, similar to the ANCOVA method, shows the most significant treatment comparison (Analysis 26) when adjusting only for Age. Interestingly, this p-value (0.0244) is slightly less than that obtained by using the ANCOVA method (Analysis 10), perhaps due to the deviation from normality of the SCORE variable.
23.4 Summary When you can quantify degree of response, it is usually easier to discriminate between treatment groups because more powerful statistical methods are available. With only the knowledge of whether or not a patient responded, you see that treatment group differences are non-significant for all of the applicable methods (Analyses 12 to 17). When the response is categorized by severity, you obtain significant treatment group discrimination. Notably, when using a more precise response (continuous numeric representation of severity), the significance of treatment comparisons was not improved. This might be due, in part, to the nonnormality of the SCORE results, but it also points out the power of categorical analyses when using ordinal response categories relative to ANOVA methods. Comparison of these results points out the need to know how closely the underlying assumptions are satisfied before selecting an appropriate statistical method.
388
Common Statistical Methods for Clinical Research with SAS Examples
Appendix A Probability Tables A.1 Probabilities of the Standard Normal Distribution .............................................. 390 A.2 Critical Values of the Student t-Distribution.................................................... 391 A.3 Critical Values of the Chi-Square Distribution................................................ 392
390
Common Statistical Methods for Clinical Research with SAS Examples
P 0
z
APPENDIX A.1. Probabilities of the Standard Normal Distribution
z
P
z
P
z
P
z
P
z
P
z
P
0.00
0.5000
0.01 0.02 0.03 0.04 0.05
0.4960 0.4920 0.4880 0.4840 0.4801
0.51 0.52 0.53 0.54 0.55
0.3050 0.3015 0.2981 0.2946 0.2912
1.01 1.02 1.03 1.04 1.05
0.1562 0.1539 0.1515 0.1492 0.1469
1.51 1.52 1.53 1.54 1.55
0.0655 0.0643 0.0630 0.0618 0.0606
2.01 2.02 2.03 2.04 2.05
0.0222 0.0217 0.0212 0.0207 0.0202
2.51 2.52 2.53 2.54 2.55
0.0060 0.0059 0.0057 0.0055 0.0054
0.06 0.07 0.08 0.09 0.10
0.4761 0.4721 0.4681 0.4641 0.4602
0.56 0.57 0.58 0.59 0.60
0.2877 0.2843 0.2810 0.2776 0.2743
1.06 1.07 1.08 1.09 1.10
0.1446 0.1423 0.1401 0.1379 0.1357
1.56 1.57 1.58 1.59 1.60
0.0594 0.0582 0.0571 0.0559 0.0548
2.06 2.07 2.08 2.09 2.10
0.0197 0.0192 0.0188 0.0183 0.0179
2.56 2.57 2.58 2.59 2.60
0.0052 0.0051 0.0049 0.0048 0.0047
0.11 0.12 0.13 0.14 0.15
0.4562 0.4522 0.4483 0.4443 0.4404
0.61 0.62 0.63 0.64 0.65
0.2709 0.2676 0.2643 0.2611 0.2578
1.11 1.12 1.13 1.14 1.15
0.1335 0.1314 0.1292 0.1271 0.1251
1.61 1.62 1.63 1.64 1.65
0.0537 0.0526 0.0516 0.0505 0.0495
2.11 2.12 2.13 2.14 2.15
0.0174 0.0170 0.0166 0.0162 0.0158
2.61 2.62 2.63 2.64 2.65
0.0045 0.0044 0.0043 0.0041 0.0040
0.16 0.17 0.18 0.19 0.20
0.4364 0.4325 0.4286 0.4247 0.4207
0.66 0.67 0.68 0.69 0.70
0.2546 0.2514 0.2483 0.2451 0.2420
1.16 1.17 1.18 1.19 1.20
0.1230 0.1210 0.1190 0.1170 0.1151
1.66 1.67 1.68 1.69 1.70
0.0485 0.0475 0.0465 0.0455 0.0446
2.16 2.17 2.18 2.19 2.20
0.0154 0.0150 0.0146 0.0143 0.0139
2.66 2.67 2.68 2.69 2.70
0.0039 0.0038 0.0037 0.0036 0.0035
0.21 0.22 0.23 0.24 0.25
0.4168 0.4129 0.4090 0.4052 0.4013
0.71 0.72 0.73 0.74 0.75
0.2389 0.2358 0.2327 0.2296 0.2266
1.21 1.22 1.23 1.24 1.25
0.1131 0.1112 0.1093 0.1075 0.1056
1.71 1.72 1.73 1.74 1.75
0.0436 0.0427 0.0418 0.0409 0.0401
2.21 2.22 2.23 2.24 2.25
0.0136 0.0132 0.0129 0.0125 0.0122
2.71 2.72 2.73 2.74 2.75
0.0034 0.0033 0.0032 0.0031 0.0030
0.26 0.27 0.28 0.29 0.30
0.3974 0.3936 0.3897 0.3859 0.3821
0.76 0.77 0.78 0.79 0.80
0.2236 0.2206 0.2177 0.2148 0.2119
1.26 1.27 1.28 1.29 1.30
0.1038 0.1020 0.1003 0.0985 0.0968
1.76 1.77 1.78 1.79 1.80
0.0392 0.0384 0.0375 0.0367 0.0359
2.26 2.27 2.28 2.29 2.30
0.0119 0.0116 0.0113 0.0110 0.0107
2.76 2.77 2.78 2.79 2.80
0.0029 0.0028 0.0027 0.0026 0.0026
0.31 0.32 0.33 0.34 0.35
0.3783 0.3745 0.3707 0.3669 0.3632
0.81 0.82 0.83 0.84 0.85
0.2090 0.2061 0.2033 0.2005 0.1977
1.31 1.32 1.33 1.34 1.35
0.0951 0.0934 0.0918 0.0901 0.0885
1.81 1.82 1.83 1.84 1.85
0.0351 0.0344 0.0336 0.0329 0.0322
2.31 2.32 2.33 2.34 2.35
0.0104 0.0102 0.0099 0.0096 0.0094
2.81 2.82 2.83 2.84 2.85
0.0025 0.0024 0.0023 0.0023 0.0022
0.36 0.37 0.38 0.39 0.40
0.3594 0.3557 0.3520 0.3483 0.3446
0.86 0.87 0.88 0.89 0.90
0.1949 0.1922 0.1894 0.1867 0.1841
1.36 1.37 1.38 1.39 1.40
0.0869 0.0853 0.0838 0.0823 0.0808
1.86 1.87 1.88 1.89 1.90
0.0314 0.0307 0.0301 0.0294 0.0287
2.36 2.37 2.38 2.39 2.40
0.0091 0.0089 0.0087 0.0084 0.0082
2.86 2.87 2.88 2.89 2.90
0.0021 0.0021 0.0020 0.0019 0.0019
0.41 0.42 0.43 0.44 0.45
0.3409 0.3372 0.3336 0.3300 0.3264
0.91 0.92 0.93 0.94 0.95
0.1814 0.1788 0.1762 0.1736 0.1711
1.41 1.42 1.43 1.44 1.45
0.0793 0.0778 0.0764 0.0749 0.0735
1.91 1.92 1.93 1.94 1.95
0.0281 0.0274 0.0268 0.0262 0.0256
2.41 2.42 2.43 2.44 2.45
0.0080 0.0078 0.0075 0.0073 0.0071
2.91 2.92 2.93 2.94 2.95
0.0018 0.0018 0.0017 0.0016 0.0016
0.46 0.47 0.48 0.49 0.50
0.3228 0.3192 0.3156 0.3121 0.3085
0.96 0.97 0.98 0.99 1.00
0.1685 0.1660 0.1635 0.1611 0.1587
1.46 1.47 1.48 1.49 1.50
0.0721 0.0708 0.0694 0.0681 0.0668
1.96 1.97 1.98 1.99 2.00
0.0250 0.0244 0.0239 0.0233 0.0228
2.46 2.47 2.48 2.49 2.50
0.0069 0.0068 0.0066 0.0064 0.0062
2.96 2.97 2.98 2.99 3.00
0.0015 0.0015 0.0014 0.0014 0.0013
Appendix A: Probability Tables
p 0
t(p)
APPENDIX A.2. Critical Values of the Student t-Distribution
df
t(0.100)
t(0.050)
t(0.025)
t(0.010)
1 2 3 4 5
3.078 1.886 1.638 1.533 1.476
6.314 2.920 2.353 2.132 2.015
12.706 4.303 3.182 2.776 2.571
31.821 6.965 4.541 3.747 3.365
6 7 8 9 10
1.440 1.415 1.397 1.383 1.372
1.943 1.895 1.860 1.833 1.812
2.447 2.365 2.306 2.262 2.228
3.143 2.998 2.896 2.821 2.764
11 12 13 14 15
1.363 1.356 1.350 1.345 1.341
1.796 1.782 1.771 1.761 1.753
2.201 2.179 2.160 2.145 2.131
2.718 2.681 2.650 2.624 2.602
16 17 18 19 20
1.337 1.333 1.330 1.328 1.325
1.746 1.740 1.734 1.729 1.725
2.120 2.110 2.101 2.093 2.086
2.583 2.567 2.552 2.539 2.528
21 22 23 24 25
1.323 1.321 1.319 1.318 1.316
1.721 1.717 1.714 1.711 1.708
2.080 2.074 2.069 2.064 2.060
2.518 2.508 2.500 2.492 2.485
26 27 28 29 30
1.315 1.314 1.313 1.311 1.310
1.706 1.703 1.701 1.699 1.697
2.056 2.052 2.048 2.045 2.042
2.479 2.473 2.467 2.462 2.457
31 32 33 34 35
1.309 1.309 1.308 1.307 1.306
1.696 1.694 1.692 1.691 1.690
2.040 2.037 2.035 2.032 2.030
2.453 2.449 2.445 2.441 2.438
391
392
Common Statistical Methods for Clinical Research with SAS Examples
p 0
X(p)
APPENDIX A.3. Critical Values of the Chi-Square Distribution
df
X(0.100)
X(0.050)
X(0.025)
X(0.010)
1 2 3 4 5
2.706 4.605 6.251 7.779 9.236
3.841 5.991 7.815 9.488 11.070
5.024 7.378 9.348 11.143 12.832
6.635 9.210 11.345 13.277 15.086
6 7 8 9 10
10.645 12.017 13.362 14.684 15.987
12.592 14.067 15.507 16.919 18.307
14.449 16.013 17.535 19.023 20.483
16.812 18.475 20.090 21.666 23.209
11 12 13 14 15
17.275 18.549 19.812 21.064 22.307
19.675 21.026 22.362 23.685 24.996
21.920 23.337 24.736 26.119 27.488
24.725 26.217 27.688 29.141 30.578
16 17 18 19 20
23.542 24.769 25.989 27.204 28.412
26.296 27.587 28.869 30.144 31.410
28.845 30.191 31.526 32.852 34.170
32.000 33.409 34.805 36.191 37.566
21 22 23 24 25
29.615 30.813 32.007 33.196 34.382
32.671 33.924 35.172 36.415 37.652
35.479 36.781 38.076 39.364 40.646
38.932 40.289 41.638 42.980 44.314
26 27 28 29 30
35.563 36.741 37.916 39.087 40.256
38.885 40.113 41.337 42.557 43.773
41.923 43.195 44.461 45.722 46.979
45.642 46.963 48.278 49.588 50.892
31 32 33 34 35
41.422 42.585 43.745 44.903 46.059
44.985 46.194 47.400 48.602 49.802
48.232 49.480 50.725 51.966 53.203
52.191 53.486 54.775 56.061 57.342
$SSHQGL[% &RPPRQ'LVWULEXWLRQV8VHGLQ 6WDWLVWLFDO,QIHUHQFH
% % % %
%
1RWDWLRQ 3URSHUWLHV 5HVXOWV 'LVWULEXWLRQDO6KDSHV
1RWDWLRQ ;a1 PHDQV WKDW WKH UDQGRP YDULDEOH ; LV QRUPDOO\ GLVWULEXWHG ZLWK PHDQ DQGYDULDQFH
;a
PHDQV WKDW WKH UDQGRP YDULDEOH ; KDV WKH FKLVTXDUH GLVWULEXWLRQ ZLWK GHJUHHVRIIUHHGRP
;a)
PHDQV WKDW WKH UDQGRP YDULDEOH ; KDV WKH )GLVWULEXWLRQ ZLWK XSSHUDQG ORZHUGHJUHHVRIIUHHGRP
;aW
PHDQV WKDW WKH UDQGRP YDULDEOH ; KDV WKH 6WXGHQWW GLVWULEXWLRQ ZLWK GHJUHHVRIIUHHGRP
;a8 PHDQVWKDWWKHUDQGRPYDULDEOH;KDVWKH8QLIRUPGLVWULEXWLRQRYHU WKHLQWHUYDO± >±@
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
%
3URSHUWLHV ,I;a1 DQG= ;± WKHQ=a1 =LVFDOOHGDVWDQGDUGQRUPDOUDQGRPYDULDEOH ,I; a1 DQG; a1 ; DQG; DUHLQGHSHQGHQWDQG < ; ; WKHQ<a1 0RUH JHQHUDOO\ IRU DQ\ FRQVWDQWV D D D LI ; a 1 LQGHSHQGHQWO\IRUL NDQG< D ; D ; D ; WKHQ
N
(
N
N
<a1 ÊD Ê D L
L
L
L
L
L
L
N
N
)
,I=a1 WKHQ= a ,I= a1 LQGHSHQGHQWO\IRUL QDQG < = = «= WKHQ<a ,I;a DQG<a LQGHSHQGHQWO\DQG: ;<WKHQ:a
L
L
L
Q
Q
P
,I=a1 DQG<a LQGHSHQGHQWO\DQG7 =< WKHQ7aW ,I;a DQG<a LQGHSHQGHQWO\DQG) QP ;< WKHQ)a)
,I;a8
P
>@
P
Q
Q
WKHQ±ÂOQ; aÂ
PQ
Q
$SSHQGL[%&RPPRQ'LVWULEXWLRQV8VHGLQ6WDWLVWLFDO,QIHUHQFH
%
5HVXOWV
/HW; a1 LQGHSHQGHQWO\IRUL QDQGOHW
L
Q
;
Ê; L
Q
L
UHSUHVHQWWKHVDPSOHPHDQ 7KHQ3URSHUW\LPSOLHVWKDW
aL ; 1 ( Q )
3URSHUW\DQG(TXDWLRQL LPSO\WKDW
= =
Q ; - a1 LL
3URSHUW\DQG(TXDWLRQLL LPSO\WKDW
Q; - = = a
LLL
$OVR3URSHUW\LPSOLHVWKDW
;± a1 LY L
3URSHUW\DQG(TXDWLRQLY LPSO\WKDW
å Q
L
; L ±
a
Y
Q
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,WFDQEHVKRZQWKDW
å ; ± å ; ± ; Q; ± L
L
DQGWKDWWKHVXPPDWLRQVRQWKHULJKWVLGHRIWKHHTXDOVLJQDUHLQGHSHQGHQWVRWKDW
Q ± V å; L ± ± Q; ± YL
ZKHUHVLVWKHVDPSOHYDULDQFHJLYHQE\
V
å ; ± ; L
Q ±
3URSHUW\DQG(TXDWLRQVLLL Y DQGYL LPSO\WKDW
Q ± × V
a
YLL
Q±
3URSHUW\DQG(TXDWLRQVLL DQGYLL LPSO\WKDW
W
æ ; ± ç Q è
ö ÷ ø
; ±
aW æ Vö V Q ç è
÷ ø
Q±
YLLL
3URSHUW\DQG(TXDWLRQVLLL DQGYLL LPSO\WKDW
) W a) L[
Q
$SSHQGL[%&RPPRQ'LVWULEXWLRQV8VHGLQ6WDWLVWLFDO,QIHUHQFH
:LWKDSSOLFDWLRQWR$129$OHW06DQG06UHSUHVHQWWZRLQGHSHQGHQWXQELDVHG HVWLPDWHV RI WKH HUURU YDULDQFH EDVHG RQ DQG GHJUHHV RI IUHHGRP UHVSHFWLYHO\WKHQE\YLL
06 a
06 DQGa
[
ZKLFKWRJHWKHUZLWK3URSHUW\LPSO\WKDW
06 [L a) 06
:LWK DSSOLFDWLRQ WR PHWDDQDO\VLV OHW SL UHSUHVHQW WKH SYDOXH IRU WKH WUHDWPHQW FRPSDULVRQ IRU WKH LWK RI N VWXGLHV 8QGHU WKH QXOO K\SRWKHVLV RI QR WUHDWPHQW GLIIHUHQFHV SL a 8± %\ 3URSHUW\ ±  OQSL a DQG WRJHWKHU ZLWK
3URSHUW\
%
: ± Â OQS OQS «OQS aÂ
N
N
[LL
'LVWULEXWLRQDO6KDSHV
)LJXUH % LOOXVWUDWHV WKH VKDSHV RI VRPH FRPPRQ SUREDELOLW\ GLVWULEXWLRQV 7KH YDOXHRIWKHUDQGRPYDULDEOHLVVKRZQRQWKHKRUL]RQWDOD[LVDQGWKHSUREDELOLW\LV VKRZQ RQ WKH YHUWLFDO D[LV 7KHVH VKDSHV YDU\ ZLWK FKDQJHV LQ WKH GLVWULEXWLRQDO SDUDPHWHUVVXFKDVWKHYDULDQFHRUGHJUHHVRIIUHHGRP1RWLFHWKHV\PPHWU\RIWKH QRUPDODQGWKHWGLVWULEXWLRQVDQGWKHVNHZDVVRFLDWHGZLWKWKHFKLVTXDUHDQG) GLVWULEXWLRQV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
),*85(%'LVWULEXWLRQDO6KDSHV
7KH 1250$/ 'LVWULEXWLRQ
7KH &+,648$5( 'LVWULEXWLRQ
7KH ) 'LVWULEXWLRQ
7KH 678'(17W 'LVWULEXWLRQ
$SSHQGL[& %DVLF$129$&RQFHSWV
& & &
&
:LWKLQYV%HWZHHQ*URXS9DULDWLRQ 1RLVH5HGXFWLRQE\%ORFNLQJ /HDVW6TXDUHV0HDQ/6PHDQ
:LWKLQYV%HWZHHQ*URXS9DULDWLRQ 6XSSRVHUHFHQWFROOHJHJUDGXDWHVDUHDVVLJQHGWRWKUHHJURXSVVXEMHFWVWRDQ H[HUFLVH JURXS , VXEMHFWV WR D GUXJ WUHDWPHQW JURXS ,, DQG VXEMHFWV WR D FRQWURO JURXS ,,, $JHV SXOVH UDWHV GLDVWROLF EORRG SUHVVXUHV DQG WULJO\FHULGH PHDVXUHPHQWVDUHWDNHQDIWHUZHHNVLQWKHVWXG\ZLWKWKHIROORZLQJUHVXOWV
*5283
$JH\HDUV
,
,,
,,,
,
,,
,,,
7KH DJH PHDVXUHPHQWV DUH WKH VDPH IRU DOO VXEMHFWV &OHDUO\ WKHUH LV QR YDULDWLRQ DPRQJ WKH JURXS PHDQV DQG WKHUHIRUH QR GLIIHUHQFHDPRQJJURXSVZLWKUHVSHFWWRDJH
3XOVH5DWHVEHDWVPLQ
7KH SXOVH UDWH PHDVXUHPHQWV DUH WKH VDPH IRU DOO VXEMHFWV ZLWKLQ HDFK JURXS EXW WKH PHDVXUHPHQWV YDU\ DFURVV JURXSV 2EVHUYDWLRQ RI WKHVH GDWD ZLWKRXW IXUWKHU DQDO\VLV ZRXOG OLNHO\ OHDG WR WKH FRQFOXVLRQ RIDGLIIHUHQFHLQPHDQSXOVHUDWHVDPRQJWKH WKUHHJURXSV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
'LDVWROLF%3PP+J 'LDVWROLF EORRG SUHVVXUH PHDVXUHPHQWV VKRZ D YDULDWLRQ ERWK ZLWKLQ DQG DPRQJ JURXSV :LWKLQJURXS PHDVXUHPHQWV YDU\ E\ QR PRUH WKDQ PP +J ZKLOH WKH YDULDWLRQ DPRQJ WKH JURXS PHDQV LV DV PXFKDVPP%HFDXVHWKHDPRQJJURXS YDULDWLRQ LV ODUJH FRPSDUHG WR WKH ZLWKLQ JURXSYDULDWLRQ\RXZRXOGOLNHO\WKLQNWKDW WKHUHLVDGLIIHUHQFHDPRQJJURXSVLQPHDQ GLDVWROLFEORRGSUHVVXUH
7ULJO\FHULGHVPJGO 7KH WULJO\FHULGH PHDVXUHPHQWV VKRZ FRQVLGHUDEOH YDULDWLRQ ERWK ZLWKLQ DQG DPRQJJURXSV%HFDXVHWKHYDULDWLRQZLWKLQ JURXSV QR ORQJHU DOORZV RXU LQWXLWLRQ WR GLVWLQJXLVK DPRQJ JURXSV LW LV QRW FOHDU ZKHWKHUDJURXSHIIHFWH[LVWV
,
,,
,,,
,
,,
,,,
)RUWKHWULJO\FHULGHUHVXOWVWKHUHPLJKWQRWEHDUHDOGLIIHUHQFHWKDWFDQEHDWWULEXWHG WR WKH JURXSV EXW \RX QHHG DQ DQDO\WLF PHWKRG WR GHWHUPLQH WKLV 7KH $129$ PHWKRGV DUH XVHG IRU H[DFWO\ WKLV SXUSRVH LH WR DQDO\]H WKH YDULDELOLW\ DPRQJ JURXSV UHODWLYH WR WKH YDULDELOLW\ ZLWKLQ JURXSV WR GHWHUPLQH LI GLIIHUHQFHV DPRQJ JURXSVDUHPHDQLQJIXORUVLJQLILFDQW 7KH$129$PHWKRGVVHHNWRLGHQWLI\VRXUFHVRIYDULDWLRQIRUHDFKPHDVXUHPHQW DQGVHSDUDWHWKHWRWDOYDULDELOLW\LQWRFRPSRQHQWVDVVRFLDWHGZLWKHDFKVRXUFH7KH WRWDOYDULDELOLW\FDQJHQHUDOO\EHGHVFULEHGDVWKHVXPRIVTXDUHGGHYLDWLRQVRIHDFK PHDVXUHPHQWIURPWKHRYHUDOOPHDQ7KLVLVFRPSULVHGRIDVXPRIVTXDUHVGXHWR VXVSHFWHGVRXUFHVRIYDULDWLRQRIWHQFDOOHGWKHµPRGHOVXPRIVTXDUHV¶ DQGDVXP RIVTXDUHV66 GXHWRHUURU66(
667RWDO 660RGHO 66(UURU
7KH HUURU YDULDELOLW\ LQFOXGHV VXFK WKLQJV DV PHDVXUHPHQW HUURU QRUPDO YDULDWLRQ GXHWRUHSHDWHGVDPSOLQJDQGYDULDWLRQDPRQJµOLNH¶H[SHULPHQWDOXQLWVIURPRWKHU XQNQRZQ IDFWRUV 6XFK VRXUFHV PLJKW EH GLIILFXOW WR FRQWURO HYHQ LQ D ZHOOGHVLJQHGVWXG\7KHVRXUFHVWKDWFDQEHLGHQWLILHGDQGFRQWUROOHGDUHLQFOXGHG LQWKHPRGHO66 $Q $129$ LV FRQGXFWHG XVLQJ )WHVWV WKDW DUH FRQVWUXFWHG IURP WKH UDWLR RI EHWZHHQJURXS WR ZLWKLQJURXS YDULDQFH HVWLPDWHV $VVXPSWLRQV XVXDOO\ HQWDLO LQGHSHQGHQWVDPSOHVIURPQRUPDOO\GLVWULEXWHGSRSXODWLRQVZLWKHTXDOYDULDQFHV
$SSHQGL[&%DVLF$129$&RQFHSWV
,QWKHH[DPSOHJLYHQLQWKHEHJLQQLQJRIWKLV$SSHQGL[WKHWKUHHJURXSV,,,,,, UHSUHVHQWOHYHOVRIDIDFWRUQDPHG*5283$µVLJQLILFDQW*5283HIIHFW¶LQGLFDWHV WKDW WKHUH DUH VLJQLILFDQW GLIIHUHQFHV LQ PHDQ UHVSRQVHV DPRQJ WKH OHYHOV RI WKH IDFWRU*5283 7KH RQHZD\ $129$ KDV RQH PDLQ HIIHFW RU JURXSLQJ IDFWRU ZLWK WZR RU PRUH OHYHOV,QDQDO\]LQJFOLQLFDOWULDOVWKHPDLQHIIHFWZLOORIWHQEHDWUHDWPHQWHIIHFW 7KHOHYHOVRIWKHIDFWRU7UHDWPHQWPLJKWEHµORZGRVH¶µPLGGOHGRVH¶µKLJKGRVH¶ DQG µSODFHER¶ 7KH WZRZD\ $129$ KDV WZR PDLQ HIIHFWV XVXDOO\ D JURXSLQJ RU WUHDWPHQW IDFWRU DQG D EORFNLQJ IDFWRU VXFK DV *HQGHU 6WXG\ &HQWHU 'LDJQRVLV *URXSHWF 6WDWLVWLFDOLQWHUDFWLRQVDPRQJPDLQHIIHFWVFDQDOVREHDQDO\]HGZKHQWZRRUPRUH IDFWRUVDUH XVHG $Q LQWHUDFWLRQ RFFXUV ZKHQGLIIHUHQFHV DPRQJ WKHOHYHOV RI RQH IDFWRUFKDQJHZLWKGLIIHUHQWOHYHOVRIDQRWKHUIDFWRU$WZRZD\$129$FDQKDYH RQO\ RQH WZRZD\ LQWHUDFWLRQ $ WKUHHZD\ $129$ KDV WKUHH PDLQ HIIHFWV IRU H[DPSOH$%DQG& WZRZD\LQWHUDFWLRQV$E\%$E\&DQG%E\& DQG WKUHHZD\ LQWHUDFWLRQ $E\%E\& 0XOWLZD\ $129$ VHWXSV FDQ KDYH DQ\ QXPEHURIIDFWRUVHDFKDWDQ\QXPEHURIOHYHOV7KHWZRZD\$129$&KDSWHU LVRQHRIWKHPRVWFRPPRQO\XVHGDQDO\VHVIRUPXOWLFHQWHUFOLQLFDOVWXGLHVXVXDOO\ ZLWK7UHDWPHQWRU'RVHJURXS DQG6WXG\&HQWHUDVWKHWZRPDLQHIIHFWV 2WKHU W\SHVRIFRPPRQO\ XVHG $129$VLQFOXGHWKH UHSHDWHGPHDVXUHV $129$ &KDSWHU QHVWHG RU UDQGRPHIIHFWV $129$ DQG DQ $129$ RI PL[HG PRGHOV $QDO\VLV RI &RYDULDQFH &KDSWHU LV D VSHFLDO W\SH RI $129$ WKDW LQFOXGHV DGMXVWPHQWVRIWUHDWPHQWHIIHFWVIRUQXPHULFFRYDULDWHV ,QPRVWW\SHVRI$129$XVHGLQFOLQLFDOWULDOVWKHSULPDU\TXHVWLRQWKHUHVHDUFKHU ZDQWVWRDQVZHULVZKHWKHUWKHUHDUHDQ\GLIIHUHQFHVDPRQJWKHJURXSSRSXODWLRQ PHDQV EDVHG RQ WKH VDPSOH GDWD 7KH QXOO K\SRWKHVLV WR EH WHVWHG LV µWKHUH LV QR *URXSHIIHFW¶RUHTXLYDOHQWO\µWKHPHDQUHVSRQVHVDUHWKHVDPHIRUDOOJURXSV¶7KH DOWHUQDWLYH K\SRWKHVLVLVWKDW µWKH *URXSHIIHFWLVLPSRUWDQW¶ RUHTXLYDOHQWO\ µWKH *URXSPHDQVGLIIHUIRUDWOHDVWRQHSDLURIJURXSV¶ 7KHRQHZD\$129$LQYROYHVDVWUDLJKWIRUZDUGFRPSDULVRQRIWKHEHWZHHQJURXS YDULDWLRQWRWKHZLWKLQJURXSYDULDWLRQ$Q$129$WKDWLQYROYHVEORFNLQJIDFWRUV RU FRYDULDWHV VHHNV WR UHILQH WUHDWPHQW FRPSDULVRQV E\ IDFWRULQJ RXW H[WUDQHRXV YDULDWLRQGXHWRNQRZQVRXUFHV
&
1RLVH5HGXFWLRQE\%ORFNLQJ
7KHPRUHH[WUDQHRXVYDULDELOLW\RUµEDFNJURXQGQRLVH¶WKDW\RXFDQDFFRXQWIRUWKH PRUH SUHFLVH WKH *URXS FRPSDULVRQV EHFRPH $FFRXQWLQJ IRU NQRZQ EORFNLQJ IDFWRUVLVRQHPHDQVRIUHGXFLQJEDFNJURXQGQRLVH 6XSSRVH\RXFROOHFWERG\ZHLJKWGDWDLQSRXQGV IRUSDWLHQWVUDQGRPL]HGWRWZR JURXSV DV VKRZQ LQ 7DEOH &
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7$%/(&%RG\:HLJKW'DWDIRU7ZR7UHDWPHQW*URXSV
7UHDWPHQW*URXS$
7UHDWPHQW*URXS%
3DWLHQW1XPEHU
:HLJKW
3DWLHQW1XPEHU
:HLJKW
0HDQZHLJKWVDUHFRPSXWHGWREHDQGIRU7UHDWPHQW*URXSV$DQG% UHVSHFWLYHO\2QHZD\$129$ZLWKN JURXSV SURGXFHVDQ )WHVWVWDWLVWLFRI ZLWKXSSHUDQGORZHUGHJUHHVRIIUHHGRPDVVKRZQLQ7DEOH&7KLV UHVXOWV LQ D SYDOXH RI OHDGLQJ WR WKH FRQFOXVLRQ WKDW WKHUH¶V QR VLJQLILFDQW GLIIHUHQFHLQPHDQZHLJKWVEHWZHHQWKHJURXSV 7$%/(&2QH:D\$129$6XPPDU\
6RXUFH
GI
66
7UHDWPHQW
)
(UURU
7RWDO
)
06
)
LVQRWVLJQLILFDQWS
1RZVXSSRVH\RXNQRZWKDWSDWLHQWVZLWKQXPEHUVOHVVWKDQDUHIHPDOHVDQG SDWLHQWVZLWKQXPEHUVRIRUDERYHDUHPDOHV,WLVFOHDUIURPVFDQQLQJWKHGDWD WKDW WKH PDOHV ZHLJK PRUH WKDQ WKH IHPDOHV %HFDXVH WKH HUURU YDULDQFH LV D ZLWKLQJURXS SDWLHQWWRSDWLHQW YDULDELOLW\ WKH 06( RI RYHUHVWLPDWHV EHFDXVH LW LQFOXGHV D FRPSRQHQW GXH WR YDULDWLRQ EHWZHHQ JHQGHUV 7KH LQIODWHG HVWLPDWHXVLQJWKLV06(OHDGVWRDQ)YDOXHIRUWUHDWPHQWJURXSGLIIHUHQFHVWKDWLV VPDOOHU DQG OHVV VLJQLILFDQW WKDQ LI D PRUH SUHFLVH HVWLPDWH RI ZHUH REWDLQHG 5HPRYLQJ WKH JHQGHU YDULDWLRQ IURP WKH 06( PLJKW UHVXOW LQ D PRUH SUHFLVH FRPSDULVRQRIWKHJURXSV
$SSHQGL[&%DVLF$129$&RQFHSWV
$IWHU*HQGHULVLGHQWLILHGDVDVRXUFHRIYDULDWLRQ\RXFDQLQFOXGHWKDWIDFWRULQWKH $129$E\FRPSXWLQJDVXPRIVTXDUHVDPHDQVTXDUHDQGDQ)YDOXHIRUWKLV IDFWRU VLPLODU WR WKH PHWKRGV XVHG IRU WKH *URXS HIIHFW LQ D RQHZD\ $129$ &KDSWHU ,QWKHWZRZD\$129$&KDSWHU \RXFDQWHVWZKHWKHU*HQGHULVD VLJQLILFDQWIDFWRUDQGUHPRYHLWVYDULDWLRQIURPWKH06(WRJHWDPRUHSUHFLVHWHVW IRUWKH7UHDWPHQWHIIHFW $V VHHQ LQ 7DEOH & *HQGHU LV QRW RQO\ VLJQLILFDQW EXW WKH HVWLPDWH RI HUURU YDULDQFH 06( LV VXEVWDQWLDOO\ UHGXFHG IURP WKDW RI WKH RQHZD\ $129$ 06( 7KLVOHDGVWRJUHDWHUSUHFLVLRQLQWHVWLQJIRUWKH7UHDWPHQWJURXS HIIHFWZKLFKLVQRZVHHQWREHVLJQLILFDQW 7$%/(&7ZR:D\$129$6XPPDU\
6RXUFH
GI
7UHDWPHQW
*HQGHU
(UURU
7RWDO
6LJQLILFDQWS
66
06
)
) W
)
E
6LJQLILFDQWS
7KHJHQHUDOVHWXSRIDWZRZD\OD\RXWZLWKJOHYHOVRIWKHµWUHDWPHQW¶IDFWRU*URXS DQG E OHYHOV RI WKH µEORFNLQJ¶ IDFWRU %ORFN LV VKRZQ LQ 7DEOH & )URP HDFK *URXSE\%ORFN FRPELQDWLRQ RU µFHOO¶ \RX LQGHSHQGHQWO\ VDPSOH D QXPEHU RI REVHUYDWLRQVOHWWLQJ\LMNUHSUHVHQWWKHNWKGDWDYDOXHIURP*URXSLDQG%ORFNM7KH QXPEHURIGDWDYDOXHVZLWKLQ&HOOLMLVUHSUHVHQWHGE\QLM
7$%/(&7ZR:D\$129$/D\RXW5DQGRPL]HG%ORFN'HVLJQ
%ORFN
*URXS
E
Q
Q
QE
Q
Q
QE
J
QJ
QJ
QJE
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
,QWKHZHLJKWH[DPSOHGLVFXVVHGHDUOLHU\RXKDYHJ E DQGQLMDVIROORZV
*HQGHU%ORFN
7UHDWPHQW *URXS
)HPDOHV
0DOHV
$
Q
Q
%
Q
Q
,I HDFK FHOO LQ WKH WZRZD\ OD\RXW KDV WKH VDPH QXPEHU RI REVHUYDWLRQV QLM QIRUDOOLM \RXKDYHDµEDODQFHG¶GHVLJQ&RPSXWLQJIRUPXODVIRUEDODQFHG GHVLJQV DUH VWUDLJKWIRUZDUG DQG VLPLODU WR WKRVH JLYHQ IRU WKH RQHZD\ OD\RXW &KDSWHU :LWKLPEDODQFHWKHUHDUHDQXPEHURIGLIIHUHQWZD\VWRFRPSXWHWKH VXPV RI VTXDUHV ,Q 6$6 WKHUH DUH IRXU W\SHV RI VXPV RI VTXDUHV FRPSXWDWLRQV DYDLODEOH 7\SH , 7\SH ,, 7\SH ,,, DQG 7\SH ,9 7KH GLIIHUHQFHV DPRQJ WKHVH W\SHV DUH GLVFXVVHG LQ $SSHQGL[ ' $QDO\VLV RI FOLQLFDO UHVHDUFK GDWD XVLQJ DQ $129$RIWHQIRFXVHVRQWKH7\SH,,,VXPVRIVTXDUHVIURP352&*/0LQ6$6 ,Q WKH SUHYLRXV ZHLJKW H[DPSOH QRWLFH WKH SYDOXH IRU *HQGHU RI ZKLFK PHDQV WKDW WKHUH LV D KLJKO\ VLJQLILFDQW GLIIHUHQFH LQ PHDQ ZHLJKWV EHWZHHQ WKH PDOHV DQGWKHIHPDOHV )XUWKHUPRUHWKH SYDOXH RI IRU7UHDWPHQW WHOOV \RX WKDW D VLJQLILFDQW GLIIHUHQFH LQ PHDQ ZHLJKWV GRHV H[LVW EHWZHHQ WKH WZR WUHDWPHQW JURXSV $ DQG % 7KLV GLIIHUHQFH ZDV PDVNHG E\ WKH ODUJH YDULDELOLW\ ZKHQWKH*HQGHUIDFWRUZDVLJQRUHG 7KH $129$ UHVXOWV JLYHQ LQ 7DEOH & GR QRW LQFOXGH WKH LQWHUDFWLRQ EHWZHHQ WUHDWPHQW JURXS DQG JHQGHU DV D VRXUFH RI YDULDWLRQ $ VWDWLVWLFDO LQWHUDFWLRQ EHWZHHQWZRHIIHFWVVXJJHVWVWKDWGLIIHUHQFHVLQUHVSRQVHPHDQVDPRQJWKHOHYHOVRI RQH HIIHFW DUH QRW FRQVLVWHQW DFURVV WKH OHYHOV RI WKH RWKHU HIIHFW $ VLJQLILFDQW 7UHDWPHQWE\*HQGHU LQWHUDFWLRQ LQ WKH ZHLJKW H[DPSOH ZRXOG LQGLFDWH WKDW WKH GLIIHUHQFH LQ PHDQ ZHLJKW EHWZHHQ 7UHDWPHQW $ DQG 7UHDWPHQW % GHSHQGV RQ ZKHWKHUZH¶UHWDONLQJDERXWPDOHVRUIHPDOHV 'LIIHUHQW UHVSRQVH SDWWHUQV ZKLFK LQGLFDWH LQWHUDFWLRQV DQG QR LQWHUDFWLRQV DUH VKRZQLQ&KDSWHU&HOOPHDQVIURPWKHZHLJKWH[DPSOHZRXOGGHSLFWDSDWWHUQDV VKRZQLQ)LJXUH&ZKLFKVXJJHVWVQRLQWHUDFWLRQ7KH$129$WKDWLQFOXGHVWKH LQWHUDFWLRQHIIHFWFRQILUPVWKLVDVVKRZQLQ7DEOH& 7$%/(&$129$6XPPDU\±,QWHUDFWLRQ(IIHFW
6RXUFH
GI
7UHDWPHQW
*HQGHU
7UHDWPHQWE\*HQGHU
)
(UURU
7RWDO
6LJQLILFDQWS
66
06
)
) W
)
E
WJ
6LJQLILFDQWS
$SSHQGL[&%DVLF$129$&RQFHSWV
)LJXUH&3ORWRI0HDQ:HLJKWV6KRZV 1R7UHDWPHQWE\*HQGHU,QWHUDFWLRQ
0DOHV
:HLJKWOE
&
)HPDOHV
$
7UHDWPHQW
%
/HDVW6TXDUHV0HDQ/6PHDQ ,QWKHGHYHORSPHQWRIDQHZUHFXPEHQWH[HUFLVHELF\FOHWKHGHVLJQHUVQHHGHGWR HVWLPDWHWKHDYHUDJHKHLJKWRISRWHQWLDOXVHUV7KH\WRRNWKHKHLJKWRIPHPEHUV RIDORFDOKHDOWKFOXEFKDLQDQGFRPSXWHGWKHPHDQKHLJKWDV
;
;
å[
L
L
ZKHUH[LUHSUHVHQWVWKHKHLJKWRIWKHLWKPHPEHU,VWKLVDJRRGHVWLPDWH"6XSSRVHLW LVNQRZQWKDWWKHFOXEVHOHFWHGKDVPRVWO\PDOHPHPEHUVDQGWKDWRIWKH PHPEHUVVHOHFWHGZHUHPDOH7KHVDPSOHPHDQDERYHFDQWKHQEHUHSUHVHQWHGDV
× ; P × ; I
× ; P × ; I
ZKHUH; DQG; UHSUHVHQW WKH PHDQ KHLJKWV RI WKH PDOHV DQG IHPDOHV UHVSHFWLYHO\ 7KLV HVWLPDWH LV D JRRG RQH LI WKH SRSXODWLRQ LV FRPSRVHG RI DSSUR[LPDWHO\ PDOHV LH LI RI WKH SRWHQWLDO XVHUV RI WKH QHZ H[HUFLVH HTXLSPHQWDUHPDOH+RZHYHULIWKHVDPSOHWKDWZDVWDNHQLVQRWLQSURSRUWLRQWR WKH WUXH FRPSRVLWLRQ RI WKH SRSXODWLRQ WKLV VLPSOH DYHUDJH PLJKW EH D SRRU HVWLPDWH P
I
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
6XSSRVH LW LV NQRZQ WKDW XVHUV RI WKH UHFXPEHQW F\FOH DUH LQ JHQHUDO HTXDOO\ GLVWULEXWHG EHWZHHQ PDOHVDQGIHPDOHV 8QGHU VXFKDQDVVXPSWLRQWKH PHDQWKDW ZDV FRPSXWHG IRU WKLV H[DPSOH ZRXOG WHQG WR RYHUHVWLPDWH WKH SRSXODWLRQ PHDQ EHFDXVHLWLVKLJKO\ZHLJKWHGE\WKHPDOHVZKRDUHNQRZQWREHWDOOHULQWKHJHQHUDO SRSXODWLRQWKDQIHPDOHV $PHDQKHLJKWIRUPDOHVRILQFKHVDQGDPHDQKHLJKWIRUIHPDOHVRILQFKHV ZRXOG SURGXFH DQ RYHUDOO PHDQ KHLJKW HVWLPDWH RI LQFKHV ZKHQ XVLQJ WKH ZHLJKWHG DSSURDFK MXVW SUHVHQWHG $ EHWWHU HVWLPDWH PLJKW EH LQFKHVRULQJHQHUDO ; ; P ; I × ; P × ; I
7KLV HVWLPDWH LV FDOOHG D OHDVW VTXDUHV PHDQ RU /6PHDQ 7KLV LV WKH HVWLPDWH FRPSXWHGE\ 6$6 ZKHQ XVLQJ DµVDWXUDWHG¶ PRGHO LH LQFOXVLRQRI PDLQ HIIHFWV DQGDOOSRVVLEOHLQWHUDFWLRQV ,Q FOLQLFDOWULDOV WKH /6PHDQLV RIWHQ XVHGIRUHVWLPDWLQJWUHDWPHQWHIIHFWV ZKHQ EORFNLQJE\VWXG\FHQWHU$W\SLFDOWULDOPLJKWLQYROYHDODUJHUQXPEHURISDWLHQWV IURP RQH VWXG\ FHQWHU WKDQ IURP DQRWKHU 7KH /6PHDQ RI WKH WUHDWPHQW HIIHFW ZRXOG EH HTXLYDOHQW WR D FRPELQHG DYHUDJH RYHU VWXG\ FHQWHUV DV LI WKH VDPH QXPEHURISDWLHQWVZHUHVWXGLHGZLWKLQHDFKFHQWHU ,IWKHVDPSOHVL]HVIURPHDFKVWXG\FHQWHUDUHSURSRUWLRQDOWRWKHSRSXODWLRQVL]HV WKH XVXDO DULWKPHWLF PHDQ ZRXOG EH DSSURSULDWH :KHQ WKH SRSXODWLRQ VL]HV DUH FRQVLGHUHGµLQILQLWH¶WKH/6PHDQLVXVHGWRJLYHHTXDOZHLJKWLQJWRWKHHIIHFWVL]HV IURPHDFKVWXG\FHQWHU(VWLPDWHVEDVHGRQWKH/6PHDQFDQGLIIHUPDUNHGO\IURP WKDW RIWKH XVXDODULWKPHWLF PHDQ LIHLWKHUWKH FHOO PHDQV RUWKH FHOOVDPSOHVL]HV ZLWKLQDQ\GRVHJURXSGLIIHUVXEVWDQWLDOO\DPRQJVWXG\FHQWHUV,QVXFKLQVWDQFHV WKH DQDO\VW PXVW H[DPLQH WKH DVVXPSWLRQV RI SRSXODWLRQ VL]HV DQG VDPSOH UHSUHVHQWDWLRQWRGHFLGHZKLFKHVWLPDWHLVPRUHDSSURSULDWH ,Q ([DPSOH &KDSWHU WKH PHDQV DQG /6PHDQV RI WKH QXPEHU RI LWHPV FRUUHFWO\ UHPHPEHUHG E\ SDWLHQWV LQ WKH 'RVH JURXSV LQ WKH PHPRU\ VWXG\ DUH VKRZQLQ7DEOH&
7$%/(&$ULWKPHWLFDQG/HDVW6TXDUHV0HDQIRU([DPSOH
'26(*5283
&(17(5
3ODFHER
PJ
PJ
'U$EHO
0HDQ 1
'U%HVW
0HDQ 1
&RPELQHG
0HDQ /60HDQ
$SSHQGL[&%DVLF$129$&RQFHSWV
1RWLFHWKDWWKHWZRHVWLPDWHVDUHYHU\FORVHWRHDFKRWKHU7KLVPLJKWQRWEHWKHFDVH LI IRU H[DPSOH WKHUH DUH SDWLHQWV LQ RQH VWXG\ FHQWHU DQG RQO\ SDWLHQWV LQ DQRWKHU:KHQVXFKDGHJUHHRILPEDODQFHRFFXUVLQWKHVDPSOLQJLWLVLPSRUWDQWWR FRQVLGHU WKH VDPSOLQJ DVVXPSWLRQV EHIRUH GHFLGLQJ ZKLFK HVWLPDWH RI WUHDWPHQW HIIHFWVLVPRUHDSSURSULDWH
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$SSHQGL[' 667\SHV,,,,,,DQG,90HWKRGVIRU DQ8QEDODQFHG7ZR:D\/D\RXW
' ' ' ' '
'
667\SHV&RPSXWHGE\6$6 +RZWR'HWHUPLQHWKH+\SRWKHVHV7HVWHG (PSW\&HOOV 0RUH7KDQ7ZR7UHDWPHQW*URXSV 6XPPDU\
667\SHV&RPSXWHGE\6$6
7KLVVHFWLRQ XVHVWKH UDQGRPL]HG EORFN GHVLJQWRLOOXVWUDWHWKHGLIIHUHQFHVDPRQJ WKH YDULRXV $129$ W\SHV RI VXPV RI VTXDUHV 66 FRPSXWHG E\ 6$6 7\SH , 7\SH,,7\SH,,,DQG7\SH,9 &RQVLGHUWKHXQEDODQFHGWZRZD\OD\RXWWKDWKDVWZRWUHDWPHQWJURXSV$DQG% DQGVWXG\FHQWHUVDQG /HW DQGQ UHSUHVHQWWKHPHDQDQGVDPSOHVL]H UHVSHFWLYHO\ IRU 7UHDWPHQW L &HQWHU M )RU WKLV H[DPSOH SDWLHQWV UHFHLYH 7UHDWPHQW$DQGSDWLHQWVUHFHLYH7UHDWPHQW%7KHSDWLHQWVDUHGLVWULEXWHGDPRQJ VWXG\FHQWHUVDVVKRZQLQ7DEOH' LM
LM
7$%/('6DPSOH&RQILJXUDWLRQIRUDQ8QEDODQFHG7ZR:D\/D\RXW &HQWHU RYHUDOO
7UHDWPHQW $ %
Q
Q
Q
Q
Q
Q
Q$
Q%
$
%
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
'HQRWH HDFK WUHDWPHQW PHDQ DV D OLQHDU FRPELQDWLRQ RI WKH FHOO PHDQV IRU WKDW WUHDWPHQW
D DQG E
D
$
%
E
D
E
7KHJRDOLVWRWHVWWKHK\SRWKHVLVRIHTXDOLW\RIWUHDWPHQWPHDQV
+
$
%
7KH PHWKRG XVHG WR SHUIRUP WKLV WHVW GHSHQGV RQ KRZ \RX GHILQH WKH WUHDWPHQW PHDQVLQWHUPVRIWKHFHOOPHDQV&RQVLGHUWKHIROORZLQJWKUHHFDVHV
&DVHL /HWD Q Q E Q Q M M
M
$
M
M
%
7KHK\SRWKHVLVRIHTXDOWUHDWPHQWPHDQVEHFRPHV
+
7KHWUHDWPHQWPHDQVDUHWKHZHLJKWHGDYHUDJHVRIWKHFHOOPHDQVIRUWKDWWUHDWPHQW ZHLJKWHGE\WKHFHOOVDPSOHVL]HV7KLVLVWKHK\SRWKHVLVWHVWHGE\WKH6$67\SH, VXP RI VTXDUHV IRU 7UHDWPHQW LI WKH 7UHDWPHQW IDFWRU LV VSHFLILHG ILUVW LQ WKH 02'(/VWDWHPHQW
&DVHLL /HWD ZKHUHX X X DQG8 M
E X 8M Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q X X X M
M
7KHK\SRWKHVLVRIHTXDOWUHDWPHQWPHDQVEHFRPHV
+
7KHWUHDWPHQWPHDQLVWKHZHLJKWHGDYHUDJHRIWKHFHOOPHDQVZHLJKWHGE\ZKDW VHHPV WR EH D FRPSOH[ IXQFWLRQ RI WKH FHOO VDPSOH VL]HV 7KLV LV WKH K\SRWKHVLV WHVWHGE\WKH6$67\SH,,VXPRIVTXDUHVIRU7UHDWPHQW
$SSHQGL['667\SHV,,,,,,,9LQ6$6
7KH ZHLJKWV DUH DFWXDOO\ LQYHUVHO\ UHODWHG WR WKH YDULDQFH RI WKH HVWLPDWHV RI WKH PHDQVIRUHDFKFHQWHU7KLVFDQEHVHHQZKHQWKHFRHIILFLHQWVX 8LQ&DVHLL DUH UHZULWWHQLQDPRUHIDPLOLDUIRUP M
)RU&HQWHUMM RU OHWZ ÂQ Q 8QGHUWKHXVXDO$129$ DVVXPSWLRQVRILQGHSHQGHQW JURXSV DQG YDULDQFH KRPRJHQHLW\ WKLVUHSUHVHQWV WKH YDULDQFH RI ;M ; M /HWWLQJ : Z Z Z WKH ZHLJKWV Z : DUH
M
M
M
±
±
M
SURSRUWLRQDO WR WKH X 8 VKRZQ LQ &DVH LL :ULWWHQ WKLV ZD\ \RX VHH WKDW HDFK WUHDWPHQWPHDQLVDZHLJKWHGDYHUDJHRIWKHFHQWHUPHDQVIRUWKDWWUHDWPHQWZKHUH WKHZHLJKWVDUHSURSRUWLRQDOWRWKHLQYHUVHRIWKHYDULDQFHVZLWKLQWKHFHQWHU M
&DVHLLL /HWD D D E E E
7KHK\SRWKHVLVRIHTXDOWUHDWPHQWPHDQVEHFRPHV
+
ZKLFKGRHVQRWGHSHQGRQWKHVDPSOHVL]HV7KLVLVWKHK\SRWKHVLVWHVWHGE\WKH6$6 7\SH,,,DQG7\SH,9VXPVRIVTXDUHVIRU7UHDWPHQW 1RWH7KH6$67\SH,,,DQG7\SH,9WHVWVDUHLGHQWLFDOLQDWZRZD\ OD\RXWZLWKQRHPSW\FHOOV,IQ IRUDWOHDVWRQHFHOOWKH7\SH,,, DQG ,9 VXPV RI VTXDUHV IRU 7UHDWPHQW ZLOO GLIIHU LI WKHUH DUH PRUH WKDQWZROHYHOVRIWKH7UHDWPHQWIDFWRU,IWKHUHDUHHPSW\FHOOVLWLV UHFRPPHQGHGWKDWWKH7\SH,9WHVWVEHXVHGVHH6HFWLRQ' LM
'
+RZWR'HWHUPLQHWKH+\SRWKHVHV7HVWHG 1RWH7KLVVHFWLRQDVVXPHVWKDWWKHUHDGHULVIDPLOLDUZLWKYHFWRUQRWDWLRQ
LM
L
M
LM
ZKHUH UHSUHVHQWV WKH RYHUDOO PHDQ UHVSRQVH UHSUHVHQWV WKH DYHUDJH DGGLWLYH HIIHFW RI 7UHDWPHQW L LV WKH DYHUDJH DGGLWLYH HIIHFW RI &HQWHU M DQG LV WKH DYHUDJH DGGLWLYH HIIHFW RI WKH 7UHDWPHQW LE\&HQWHU M FRPELQDWLRQ LQWHUDFWLRQ 6$6 XVHV YHFWRUV RI WKH SDUDPHWHU FRHIILFLHQWV DQG PDWULFHV WR FRPSXWH VXPV RI VTXDUHVDQGLWLVDOVRFRQYHQLHQWWRH[SUHVVK\SRWKHVHVLQWHUPVRIPDWUL[DOJHEUD :LWKWZRWUHDWPHQWJURXSVWKHK\SRWKHVHVWHVWHGE\6$6DUHRIWKHIRUP L
M
+ D¶
LM
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
ZKHUH LVWKHYHFWRURISDUDPHWHUVDQGDLVDYHFWRURISDUDPHWHUFRHIILFLHQWV,Q WKHH[DPSOHZLWK7UHDWPHQWVDQG&HQWHUVWKHSDUDPHWHUYHFWRULV
¶
8VHWKHIROORZLQJ6$6VWDWHPHQWV PROC GLM; CLASS TRT CENTER; MODEL Y = TRT CENTER TRT*CENTER / E1 E2 E3 E4;
WRJHQHUDWHWKH$129$IRUWKHFRQILJXUDWLRQLQ7DEOH'DQGSULQWWKHIRUPRIWKH SDUDPHWHUFRHIILFLHQWYHFWRUVIRUHDFKRIWKHIRXUW\SHVRIVXPVRIVTXDUHV7KHQH[W VHFWLRQ VKRZV KRZ WKHVH YHFWRUV FDQ EH XVHG WR FRQYHUW WKH K\SRWKHVLV IURP D SDUDPHWULFK\SRWKHVLVD¶ WRDK\SRWKHVLVEDVHGRQWKHFHOOPHDQV
7\SH,+\SRWKHVHV
7KH RXWSXW RI WKH YHFWRU RI SDUDPHWHU FRHIILFLHQWV DB IRU WKH 7\SH , HVWLPDEOH IXQFWLRQVIRU7UHDWPHQWLVVKRZQLQ2XWSXW' /UHIHUVWRDQ\FRQVWDQWWKDWLVFKRVHQE\WKHXVHU6HWWLQJ/ WKHYHFWRUD¶LV ±±±±± DQGWKHK\SRWKHVLVEHFRPHV
D¶ ± ± ± ± ±
$OJHEUDLFPDQLSXODWLRQRIWKLVHTXDWLRQ\LHOGV
ZKLFKFDQEHH[SUHVVHGLQWHUPVRIWKHFHOOPHDQVDVWKH6$67\SH,K\SRWKHVLV
RU
$SSHQGL['667\SHV,,,,,,,9LQ6$6
287387')RUPRIWKH6$67\SH,(VWLPDEOH)XQFWLRQV
Type I Estimable Functions for: TRT Effect
Coefficients
INTERCEPT
0
TRT
A B
L2 -L2
CNTR
1 2 3
0.0833*L2 -0.25*L2 0.1667*L2
TRT*CNTR
A A A B B B
1 2 3 1 2 3
0.25*L2 0.25*L2 0.5*L2 -0.1667*L2 -0.5*L2 -0.3333*L2
7KHIRUPRIWKH7\SH,K\SRWKHVLVIRU7UHDWPHQWGHSHQGVRQWKHRUGHULQZKLFKWKH IDFWRUVDUHOLVWHGLQWKH02'(/VWDWHPHQW,I7UHDWPHQWLVOLVWHGEHIRUH&HQWHUWKH 7\SH , K\SRWKHVLVLV DV JLYHQLQWKHSUHFHGLQJ H[DPSOH ,I &HQWHULV OLVWHG EHIRUH 7UHDWPHQW WKH 7\SH , K\SRWKHVLV IRU 7UHDWPHQW LV WKH VDPH DV WKH 7\SH ,, K\SRWKHVLVIRU7UHDWPHQW7\SHV,,,,,DQG,9GRQRWGHSHQGRQWKHRUGHULQZKLFK WKHIDFWRUVDUHOLVWHGLQWKH02'(/VWDWHPHQW
7\SH,,+\SRWKHVHV
7KH RXWSXW RI WKH YHFWRU RI SDUDPHWHU FRHIILFLHQWV IRU WKH 7\SH ,, HVWLPDEOH IXQFWLRQVIRU7UHDWPHQWLVVKRZQLQ2XWSXW' %\VXEVWLWXWLQJIRU/WKHK\SRWKHVLVEHFRPHV
D¶ ± ± ± ±
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
$OJHEUDLFPDQLSXODWLRQRIWKLVHTXDWLRQUHVXOWVLQ
ZKLFKLVH[SUHVVHGLQWHUPVRIWKHFHOOPHDQVDVWKH6$67\SH,,K\SRWKHVLV
287387')RUPRIWKH6$67\SH,,(VWLPDEOH)XQFWLRQV
Type II Estimable Functions for: TRT
Effect
Coefficients
INTERCEPT
0
TRT
A B
L2 -L2
CNTR
1 2 3
0 0 0
TRT*CNTR
A A A B B B
1 2 3 1 2 3
0.2083*L2 0.375*L2 0.4167*L2 -0.2083*L2 -0.375*L2 -0.4167*L2
7\SH,,,+\SRWKHVHV
7KH IRUP RI WKH 7\SH ,,, K\SRWKHVHV IRU 7UHDWPHQW LV D YHFWRU RI SDUDPHWHU FRHIILFLHQWVVKRZQLQ2XWSXW' %\XVLQJ/ WKHK\SRWKHVLVEHFRPHV
D¶ ± ± ± ± ±
$SSHQGL['667\SHV,,,,,,,9LQ6$6
$OJHEUDLFPDQLSXODWLRQRIWKLVHTXDWLRQ\LHOGV
ZKLFK FDQ EH UHH[SUHVVHG LQ WHUPV RI WKH FHOO PHDQV DV WKH 6$6 7\SH ,,, K\SRWKHVLV
287387')RUPRIWKH6$67\SH,,,(VWLPDEOH)XQFWLRQV
Type III Estimable Functions for: TRT Effect
Coefficients
INTERCEPT
0
TRT
A B
L2 -L2
CNTR
1 2 3
0 0 0
TRT*CNTR
A A A B B B
7\SH,9+\SRWKHVHV
1 2 3 1 2 3
0.3333*L2 0.3333*L2 0.3333*L2 -0.3333*L2 -0.3333*L2 -0.3333*L2
$VSUHYLRXVO\PHQWLRQHGWKH7\SH,9K\SRWKHVHVIRU7UHDWPHQWDUHLGHQWLFDOWRWKH 7\SH,,,K\SRWKHVHVZKHQWKHUHDUHQRHPSW\FHOOV
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
'
(PSW\&HOOV
7KH 6$67\SH ,,, DQG7\SH ,9VXPV RIVTXDUHV 66 IRU7UHDWPHQWDUHLGHQWLFDO ZKHQ HDFK FHOO LQ WKH WZRZD\ OD\RXW KDV DW OHDVW RQH REVHUYDWLRQ ,I WKHUH LV DQ HPSW\FHOOQ IRUVRPHLMFRPELQDWLRQ 66IRU7\SHV,,,DQG,9ZLOOEHWKH VDPH LI WKHUH DUH WZR WUHDWPHQW JURXSV EXW GLIIHUHQW LI WKHUH DUH PRUH WKDQ WZR WUHDWPHQWJURXSVVHHQH[WVHFWLRQ %\SULQWLQJWKHIRUPRIWKH7\SH,9SDUDPHWHU FRHIILFLHQW YHFWRU IRU 7UHDWPHQW ZKHQ WKHUH LV RQH HPSW\ FHOO \RX VHH WKDW WKH WUHDWPHQWPHDQVDUHXQZHLJKWHGDYHUDJHVRIWKHFHOOPHDQVH[FOXGLQJWKHEORFNWKDW FRQWDLQVWKHHPSW\FHOO LM
)RUH[DPSOHLQWKH H H[DPSOHLQWURGXFHGLQ6HFWLRQ'VXSSRVHWKDWWKHL M FHOO KDV QR REVHUYDWLRQV Q 7KH 6$6 7\SH ,9 VXP RI VTXDUHV IRU 7UHDWPHQWWHVWVWKHFHOOPHDQVK\SRWKHVLV+ 1RWLFHWKDW &HQWHUGRHVQRWHQWHULQWRWKHK\SRWKHVLVDWDOOEHFDXVHRILWVLQFRPSOHWHGDWD
:KHQ WKHUH DUH PDQ\ HPSW\ FHOOV WKH WZRZD\ $129$ FDQ UHVXOW LQ WHVWLQJ K\SRWKHVHV WKDW H[FOXGH PDQ\ RI WKH FHOO PHDQV 6XFK DQ DQDO\VLV PLJKW QRW EH EHQHILFLDO GXH WR WKH GLIIHUHQFHV LQ WKH K\SRWKHVHV DFWXDOO\ WHVWHG YHUVXV WKRVH SODQQHGLQWKHGHVLJQSKDVHRIWKHVWXG\ 2QH PHWKRG RI FLUFXPYHQWLQJ WKHVH GLIILFXOWLHV LV WR UHPRYH WKH 7UHDWPHQWE\ &HQWHU LQWHUDFWLRQ HIIHFW DV D VRXUFH RI YDULDWLRQ LI LW FDQ EH DVVXPHG WKDW QR LQWHUDFWLRQH[LVWVRULISUHOLPLQDU\WHVWVVXJJHVWWKDWVXFKDQDVVXPSWLRQLVFUHGLEOH 7KH 6$6 7\SH , ,, ,,, DQG ,9 WHVWV IRU WKH LQWHUDFWLRQ HIIHFW DUH WKH VDPH UHJDUGOHVVRIWKHHPSW\FHOOSDWWHUQ7KLVWHVWIRULQWHUDFWLRQZKLOHQRWLQFOXGLQJDOO FHOOVFDQSURYLGHDQLQGLFDWLRQRIZKHWKHUDQLQWHUDFWLRQLVSUHVHQW,IWKLVWHVWLVQRW VLJQLILFDQWWKHDQDO\VLVPLJKWEHUHUXQXVLQJMXVWWKHPDLQHIIHFWV7UHDWPHQWDQG &HQWHU:LWKQRLQWHUDFWLRQLQWKH02'(/VWDWHPHQWWKH6$67\SHV,,,,,DQG,9 VXPV RI VTXDUHV DUH LGHQWLFDO DQG HDFK 7\SH WHVWV WKH K\SRWKHVLV RI HTXDOLW\ RI WUHDWPHQW PHDQV ZKHQ WKH WUHDWPHQW PHDQV DUH UHSUHVHQWHG E\ DQ XQZHLJKWHG DYHUDJHRIWKHFHOOPHDQVIRUWKDWWUHDWPHQW 6RPHWLPHVFUHDWLYHZD\VFDQEHIRXQGWRFRPELQHEORFNVWRHOLPLQDWHHPSW\FHOOV ZKLOHUHWDLQLQJWKHDGYDQWDJHVRIEORFNLQJDQGHDVHRILQWHUSUHWDWLRQ)RUH[DPSOH VXSSRVHDVWXG\WKDWLQFOXGHVVWXG\FHQWHUVLVFRQGXFWHGXVLQJWUHDWPHQWJURXSV DQGWDUJHWVSDWLHQWVSHUFHOO$WWKHHQGRIWKHVWXG\SHUKDSVDQXPEHURIHPSW\ FHOOVH[LVWGXHWRDWWULWLRQRISDWLHQWVRUSURWRFROYLRODWLRQVOHDGLQJWRH[FOXVLRQRI GDWD ,W PLJKW EH SRVVLEOH WR FRPELQH FHQWHUV E\ VSHFLDOW\ UHJLRQ RU VRPH RWKHU FRPPRQ IDFWRU %\ FRPELQLQJ FHQWHUV IURP WKH VDPH JHRJUDSKLF UHJLRQV WKH DQDO\VW FDQ LJQRUH &HQWHU DQG XVH *HRJUDSKLF 5HJLRQ HJ 1RUWK 6RXWK :HVW 0LGZHVW DVDQHZEORFNLQJIDFWRU7KLVWHFKQLTXHFDQKHOSHOLPLQDWHWKHSUREOHP RIHPSW\FHOOVE\FRPELQLQJEORFNVWKDWDUHDOLNH ,IDSUHOLPLQDU\WHVWIRULQWHUDFWLRQLVVLJQLILFDQW\RXPLJKWSURFHHGLPPHGLDWHO\WR WUHDWPHQW FRPSDULVRQV ZLWKLQ HDFK OHYHO RI WKH EORFNLQJ IDFWRU HJ &HQWHU RU *HRJUDSKLF 5HJLRQ XVLQJ D WZRVDPSOH WWHVW &KDSWHU RU D RQHZD\ $129$ &KDSWHU 7KLVFDQRIWHQEHKHOSIXOLQUHYHDOLQJVXEJURXSVRIVWXG\FHQWHUVRU UHJLRQV WKDW KDYH VLPLODU UHVSRQVH SDWWHUQV ZLWKLQ HDFK VXEJURXS ,Q JHQHUDO FDXWLRQ PXVW EH XVHG LQ WKH DQDO\VLV DQG LQWHUSUHWDWLRQ RI UHVXOWV ZKHQ WKHUH DUH HPSW\FHOOV
$SSHQGL['667\SHV,,,,,,,9LQ6$6
'
0RUH7KDQ7ZR7UHDWPHQW*URXSV
7KH SUHFHGLQJ H[DPSOHV DSSO\ WR WKH RIWHQXVHG FDVH RI WZR WUHDWPHQW JURXSV :KHQWKHUHDUHJJ ! WUHDWPHQWJURXSVWKHK\SRWKHVLVRIµQR7UHDWPHQWHIIHFW¶LV WHVWHGE\6$6XVLQJVLPXOWDQHRXVVWDWHPHQWVRIWKHIRUP
+ D
D
DJ
%\UHTXHVWLQJWKHIRUPIRUWKHSDUDPHWHUFRHIILFLHQWYHFWRUVXVLQJWKH(((RU (RSWLRQVLQWKH02'(/VWDWHPHQWLQ352&*/0YDOXHVGHQRWHGE\// /J DSSHDU LQ WKH 6$6 RXWSXW 7KH DQDO\VW PD\ VHOHFW J VHWV RI YDOXHV IRU / WKURXJK /J WR REWDLQ WKH DFWXDO FRHIILFLHQWV WKDW GHWHUPLQH WKH VLPXOWDQHRXV VWDWHPHQWVWKDWFRPSULVHWKHK\SRWKHVLV2QHVHWRIVHOHFWLRQVIRUWKH///J YDOXHVLV )RUH[DPSOHVXSSRVHJ LQWKHOD\RXWVKRZQLQ7DEOH' 7$%/('6DPSOH&RQILJXUDWLRQIRU7ZR:D\/D\RXWZLWK! 7UHDWPHQWV &HQWHU RYHUDOO
7UHDWPHQW $ % &
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
$
Q$
%
Q%
&
Q&
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH SDUDPHWHU FRHIILFLHQW YHFWRUV IRU WKH 7\SH ,,, K\SRWKHVHV IRU 7UHDWPHQW JHQHUDWHGE\6$6DUHVKRZQLQ2XWSXW' 287387')RUPRIWKH7\SH,,,(VWLPDEOH)XQFWLRQVLQ6$6IRU! 7UHDWPHQW*URXSV
Type III Estimable Functions for: TRT Effect Coefficients INTERCEPT
0
TRT
A B C
L2 L3 -L2-L3
CNTR
1 2 3 4 5
0 0 0 0 0
TRT*CNTR
A A A A A B B B B B C C C C C
1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
0.2*L2 0.2*L2 0.2*L2 0.2*L2 0.2*L2 0.2*L3 0.2*L3 0.2*L3 0.2*L3 0.2*L3 -0.2*L2-0.2*L3 -0.2*L2-0.2*L3 -0.2*L2-0.2*L3 -0.2*L2-0.2*L3 -0.2*L2-0.2*L3
7KLV SULQWRXW FDQ EH XVHG WR VWDWH WKH K\SRWKHVHV LQ WHUPV RI FHOO PHDQV XVLQJ D VLPLODUSURFHGXUHDVVKRZQLQ6HFWLRQ'8VLQJWKHVHOHFWLRQVHW IRU WKH// YDOXHVUHVXOWVLQWKHYHFWRUVDDQGDDVIROORZV
+ D¶ Ã +
+
+ D¶ Ã
$SSHQGL['667\SHV,,,,,,,9LQ6$6
7DNHQWRJHWKHUWKHK\SRWKHVLVWHVWHGE\WKH7\SH,,,VXPRIVTXDUHVIRU7UHDWPHQW LVWKHHTXDOLW\RIWUHDWPHQWPHDQV+ ZKHQHDFKWUHDWPHQWPHDQLV WKHXQZHLJKWHGDYHUDJHRIWKHFHOOPHDQVIRUWKDWWUHDWPHQW
$
%
&
:KHQWKHUHDUHQRHPSW\FHOOVWKH7\SH,9K\SRWKHVLVIRU7UHDWPHQWLVWKHVDPHDV WKH7\SH,,,+RZHYHUVXSSRVHDQXPEHURIHPSW\FHOOVH[LVWIRUH[DPSOHFHOOV DQGWKDWLVQ Q Q DVVKRZQLQ7DEOH'
7$%/('6DPSOH/D\RXWZLWK(PSW\&HOOV
&HQWHU
7UHDWPHQW %
$
&
7KH 6$6 7\SH ,9 VXP RI VTXDUHV VLPXOWDQHRXVO\ WHVWV WKH HTXDOLW\ RI SDLUV RI WUHDWPHQW PHDQV ZKHQ HDFK WUHDWPHQW PHDQ LV DQ XQZHLJKWHG DYHUDJH RI WKH FHOO PHDQVIRUWKDWWUHDWPHQWH[FOXGLQJDQ\FHOOLQDEORFNWKDWKDVRQHRUPRUHHPSW\ FHOOV7KHK\SRWKHVLVLVIUDPHGLQWHUPVRIVWDWHPHQWVDERXWWKHFHOOPHDQVDV
DQG
+
7KH 7\SH ,,, K\SRWKHVLV IRU WKH VDPH FRQILJXUDWLRQ GHILQHV D WUHDWPHQW PHDQ LQ WHUPV RI LWV FHOO PHDQV SOXV H[WUDQHRXV HIIHFWV IURP RWKHU WUHDWPHQWV PDNLQJ LQWHUSUHWDWLRQYHU\GLIILFXOW7KHUHIRUHWKH7\SH,9UHVXOWVVHHPWREHPRUHXVHIXO LQ WKH HPSW\ FHOO FDVH ZLWK PRUH WKDQ WZR WUHDWPHQW JURXSV $V SUHYLRXVO\ GLVFXVVHGLIWKHLQWHUDFWLRQLVRPLWWHGDVDVRXUFHRIYDULDWLRQIURPWKH$129$ WKH7\SHV,,,,,DQG,9UHVXOWVDUHWKHVDPHHDFKWHVWLQJIRUSXUH7UHDWPHQWHIIHFWV WKDWDUHIUHHRIEORFNHIIHFWVDQGDUHQRWDIXQFWLRQRIWKHFHOOVL]HV
'
6XPPDU\
.QRZLQJWKHVDPSOLQJSODQDQGWKHUHDVRQVIRUGHVLJQLPEDODQFHKHOSWKHDQDO\VW GHWHUPLQH WKH PRVW DSSURSULDWH 6$6 66 W\SH WR XVH ZKHQ SHUIRUPLQJ DQDO\VLV RI YDULDQFH7KHILUVWVWHSLQWKHDQDO\VLVLVWRUHTXHVWDSULQWRXWIURP6$6RIWKHIRUP RI WKH HVWLPDEOH IXQFWLRQV E\ XVLQJ WKH RSWLRQ ( ( ( ( LQ WKH 02'(/ VWDWHPHQW LQ 352& */0 ,W LV WKHQ SRVVLEOH WR VSHFLI\ WKH K\SRWKHVHV WHVWHG E\ HDFK66W\SH6HOHFWLRQRIWKHPRVWDSSURSULDWHW\SHFDQEHPDGHE\PRVWFORVHO\ PDWFKLQJ WKH K\SRWKHVLV WKDW LV DFWXDOO\ WHVWHG ZLWK WKH LQWHQGHG K\SRWKHVLV 7KH 7\SH,,,VXPRIVTXDUHVZKLFKLJQRUHVWKHFHOOVDPSOHVL]HVLVRIWHQXVHGLQWKH DQDO\VLV RI FOLQLFDO VWXG\ GDWD XQGHU WKH DVVXPSWLRQ RI µLQILQLWH¶ RU YHU\ ODUJH SRSXODWLRQVL]HVZLWKLQHDFK%ORFNOHYHO
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
7KH7\SH,WHVWVZHLJKWWKHFHOOPHDQVLQSURSRUWLRQWRWKHDPRXQWRILQIRUPDWLRQ WKH\FRQWULEXWHLHWKHFHOOVDPSOHVL]HV7KH7\SH,,WHVWVDOVRZHLJKWWKHDPRXQW RILQIRUPDWLRQFRQWULEXWHGWRWKHWUHDWPHQWPHDQVEXWWKH\GRWKLVE\EORFNDQGFHOO UDWKHU WKDQ MXVW LQGLYLGXDO FHOOV 7KH 7\SH ,, WHVWV IRU 7UHDWPHQW DUH DSSHDOLQJ EHFDXVH WKH\ ZLOO DOZD\V EH IUHH RI EORFN HIIHFWV UHJDUGOHVV RI WKH GHJUHH RI LPEDODQFH 7KLV FDQ EH VHHQ LQ WKH H[DPSOH VKRZQ XQGHU ³7\SH ,, +\SRWKHVHV´ ZKHUHWKHFRHIILFLHQWVRIWKHEORFNHIIHFWV DUH7KHVDPHLVWUXHRIWKH 7\SH ,,, PHWKRGV ZKLFK DUH DOVR VLPSOH WR LQWHUSUHW DQG GR QRW GHSHQG RQ WKH VDPSOHVL]HVLQHDFKFHOO$VPHQWLRQHGLQ&KDSWHUWKH7\SH,,,WHVWVDUHJHQHUDOO\ WKHFKRLFHRIFOLQLFDOGDWDDQDO\VWVDQGVKRXOGEHXVHGZKHQWKH/6PHDQVDUHXVHG IRUHVWLPDWLRQ/6PHDQVDUHGLVFXVVHGLQ$SSHQGL[& +RZHYHU7\SH,RU,,WHVWV DUHSUHIHUDEOHLILWLVNQRZQWKDWWKHVDPSOHVL]HVDUHSURSRUWLRQDOWRWKHSRSXODWLRQ VL]HV ,Q HDFK DQDO\VLV WKH DVVXPSWLRQV VKRXOG EH UHYLHZHG EHIRUH DXWRPDWLFDOO\ VHOHFWLQJRQHRIWKH6$67\SHVRIVXPRIVTXDUHV
Appendix E Multiple Comparison Methods E.1 E.2 E.3
Multiple Comparisons of Means..................................................................... 421 Multiple Comparisons of Binomial Proportions ........................................... 432 Summary........................................................................................................... 436
E.1
Multiple Comparisons of Means The Treatment effect is of primary concern in comparative clinical trials. Many of the statistical methods described in this book are illustrated with examples comparing two treatment groups. When there are more than two treatment groups, say K (K > 2), an ‘omnibus’ approach, such as analysis of variance, can be applied to test the overall null hypothesis of ‘no difference among the treatment groups’. If rejected, the next step is to determine which pairs of groups differ, resulting in as many as K· (K–1) / 2 possible pairwise comparisons. First, consider the situation in which it is appropriate to compare mean responses among groups, such as ANOVA. This is one of the most common situations involving pairwise treatment comparisons. A number of approaches to multiple comparisons are shown using the following example.
422
Common Statistical Methods for Clinical Research with SAS Examples
p Example E.1
Consider a parallel study that includes a placebo group, a reference group, and 3 different dose levels of an experimental treatment. The response, Y, is measured for each patient in each of the 5 groups shown in Table E.1. TABLE E.1. Raw Data for Example E.1 A Lo-Dose
B Mid-Dose
21 13 25 18 17 23
27 28 31 24 20
16 12
Treatment Group C D Hi-Dose Reference
19 18 24 27 29
E Placebo
25 19 22 24 34 26
23 18 22 19 24 14
14 17 21 20 13 23
29 28 32
20 29
27 12 16
Treatment group summaries are shown in Table E.2, with a pooled standard deviation of 4.671 (see Chapter 6 for calculating formula). TABLE E.2. Summary Statistics by Treatment Group for Example E.1 A
B
C
D
E
Mean
18.125
24.700
26.556
21.125
18.111
SD
4.612
4.473
4.746
4.486
5.011
N
8
10
9
8
9
With 5 treatment groups, there are (5) · (4) / 2 = 10 possible pairwise comparisons. You can proceed by conducting a series of 10 two-sample t-tests (Chapter 5) to obtain the ‘raw’ (unadjusted) p-values for these comparisons. With the assumption of homogeneous variability across groups, the estimate of the standard deviation pooled over the 5 groups (4.671) is used in the calculation of the t-tests, as demonstrated in Chapter 5. Alternatively, you could use the one-way ANOVA, as shown next. In SAS, pairwise t-tests can be performed as part of the ANOVA by using PROC GLM and specifying the PDIFF option in the LSMEANS statement as shown in the SAS code and output that follow. You can omit the ADJUST=T option , which requests pairwise t-tests, because that is the default.
Appendix E: Multiple Comparison Methods
SAS Code for Example E.1 DATA MC; INPUT TRT $ Y @@; DATALINES; A 21 A 13 A 25 A 18 A 17 A 23 A 16 A 12 B B 31 B 24 B 20 B 19 B 18 B 24 B 27 B 29 C C 22 C 24 C 34 C 26 C 29 C 28 C 32 D 23 D D 19 D 24 D 14 D 20 D 29 E 14 E 17 E 21 E E 23 E 27 E 12 E 16 ; PROC GLM DATA = MC; CLASS TRT; MODEL Y = TRT / SS3; LSMEANS TRT / PDIFF ADJUST=T; TITLE1 'EXAMPLE E.1: Multiple Comparisons'; TITLE2 'All-Pairwise Comparisons of Means'; RUN;
27 25 18 20
B C D E
28 19 22 13
OUTPUT E.1. SAS Output with Pairwise t-Tests for Example E.1 EXAMPLE E.1: Multiple Comparisons All-Pairwise Comparisons of Means The GLM Procedure Class Level Information Class Levels Values TRT 5 A B C D E Number of observations 44 Dependent Variable: Y DF
Sum of Squares
Model 4 Error 39 Corrected Tota1 43
521.470707 850.961111 1372.431818
Source
Source TRT
Mean Square
F Value
130.367677 21.819516
5.97
Pr > F 0.0008
R-Square
Coeff Var
Root MSE
Y Mean
0.379961
21.34268
4.671136
21.88636
DF
Type III SS
Mean Square
F Value
Pr > F
4
521.4707071
130.3676768
5.97
0.0008
TRT A B C D E
Least Squares Means LSMEAN Y LSMEAN Number
18.1250000 24.7000000 26.5555556 21.1250000 18.1111111
1 2 3 4 5
423
424
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT E.1. (continued) Least Squares Means for effect TRT Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j 1 2 3 4 5
1 0.0051 0.0006 0.2066 0.9951
2 0.0051 0.3926 0.1147 0.0039
3 0.0006 0.3926 0.0216 0.0004
4 0.2066 0.1147 0.0216
5 0.9951 0.0039 0.0004 0.1919
0.1919
NOTE: To ensure overall protection level, only probabilities associated with pre-planned comparisons should be used.
The p-values associated with the 10 pairwise comparisons are summarized in Table E.3 based on the results shown in Output E.1. (Note: The LS-means are the same as the arithmetic means for the one-way ANOVA). The values that are flagged with an asterisk (*) indicate a value less than 0.05, the nominal significance level of each test. Because the ‘raw’ p-values are not adjusted for multiple testing, the overall ‘experiment-wise’ error rate is not controlled. To maintain an overall significance level at the 0.05 level, you can use one of the methods described in the sections that follow Table E.3. TABLE E.3. p-Values from t-Tests for Pairwise Treatment Group Comparisons in Example E.1 Comparison
|Mean Difference|
‘Raw’ p-value
C vs. E A vs. C B vs. E A vs. B C vs. D B vs. D D vs. E A vs. D B vs. C A vs. E
8.444 8.431 6.589 6.575 5.431 3.575 3.014 3.000 1.856 0.014
0.0004* 0.0006* 0.0039* 0.0051* 0.0216* 0.1147 0.1919 0.2066 0.3926 0.9951
The Tukey-Kramer Method Tukey developed a test statistic for simultaneous comparisons of means, which is related to the maximum mean difference over all pairwise comparisons. This statistic has the so-called ‘studentized range’ distribution when sample sizes are the same for all groups. Probabilities associated with this distribution can be found by using numerical techniques and are tabulated in many statistical texts.
Appendix E: Multiple Comparison Methods
425
The Tukey-Kramer method is based on an approximation to the studentized range distribution when sample sizes differ among groups, as is usually the case in clinical trials. The SAS function PROBMC with the “RANGE” distribution can be used to obtain the adjusted p-values for all pairwise comparisons associated with the TukeyKramer approach. Let Tij represent the two-sample t-statistic for comparing the means of Groups i and j.
Tij =
xi − x j s⋅
1 1 + ni n j
where s is the square root of the ANOVA mean square error (MSE). The TukeyKramer p-value can be found by using a SAS statement in the form Pij = 1 – PROBMC(“RANGE”, ABS(SQRT(2)*Tij), . , df, K) where df is the number of ANOVA error degrees of freedom, and K is the number of treatment groups. For example, to compute the Tukey-Kramer adjusted p-value for comparing Groups A (Lo-Dose) and B (Mid-Dose) in Example E.1, find TAB = (18.125 – 24.700) / 4.671((1/8) + (1/10))½ = –2.9675 then use SAS to obtain the adjusted p-value PAB, P_AB = 1 – PROBMC(“RANGE”, SQRT(2)*2.9675, . , 39, 5);
for which SAS returns a value of 0.0386. These probabilities can also be obtained directly by using PROC GLM for all comparisons. Simply specify the ADJUST=TUKEY option in the LSMEANS statement, as follows. LSMEANS TRT / PDIFF ADJUST=TUKEY;
The results from this statement are shown in Output E.2.
426
Common Statistical Methods for Clinical Research with SAS Examples
OUTPUT E.2. Tukey-Kramer Results for Example E.1 EXAMPLE E.1: Multiple Comparisons All-Pairwise Comparisons of Means The GLM Procedure Least Squares Means Adjustment for Multiple Comparisons: Tukey-Kramer
TRT A B C D E
Y LSMEAN
LSMEAN Number
18.1250000 24.7000000 26.5555556 21.1250000 18.1111111
1 2 3 4 5
Least Squares Means for effect TRT Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: Y i/j
1
1 2 3 4 5
0.0386 0.0054 0.7021 1.0000
2
3
4
0.0386
0.0054 0.9080
0.7021 0.4980 0.1389
0.9080 0.4980 0.0300
0.1389 0.0039
5 1.0000 0.0300 0.0039 0.6759
0.6759
Dunnett’s Test Although the Tukey-Kramer method has wide appeal for all pairwise comparisons, Dunnett’s test is the preferred method if the goal is to maintain the overall significance level when performing multiple tests to compare a set of treatment means with a control group. With K groups, you now have only K–1 comparisons. Similar to the Tukey-Kramer p-values, the adjusted p-values based on a two-sided Dunnett’s test can be obtained by using the SAS function PROBMC with the distribution “DUNNETT2”, as shown below. Let Ti represent the two-sample t-statistic (Chapter 5) for comparing the mean of Group i with the control Group mean.
xi − x0
Ti = s⋅
1 1 + ni n0
Appendix E: Multiple Comparison Methods
427
where s is the square root of the ANOVA mean square error (MSE) and the subscript 0 refers to the control group. The Dunnett p-value can be found by using a SAS statement of the form Pi = 1 – PROBMC(“DUNNETT2”, ABS(Ti), . , df, K–1, L1, L2,…, LK–1) where df is the number of ANOVA error degrees of freedom, K is the number of treatment groups (including the control group), and the Li’s are distributional parameters used to adjust for unequal sample sizes, defined as
Li =
ni ni + n0
For example, to compute Dunnett’s adjusted p-value for comparing Group B (Middle-Dose) with Group E (Placebo) in Example E.1, find TB = (24.700 – 18.111) / 4.671((1/10) + (1/9))½ = 3.0701 and L1 = (8 / (8+9))½ = 0.6860 L2 = (10/ (10+9))½ = 0.7255 L3 = (9/ (9+9))½
= 0.7071
L4 = (8 / (8+9))½
= 0.6860
then, evaluate P_B=1–PROBMC(“DUNNETT2”,3.0701, . ,39,4,0.6860,0.7255,0.7071,0.6860);
to obtain the adjusted p-value, PB, for which SAS returns a value of 0.0138. These probabilities can be easily obtained for all comparisons by using PROC GLM. You must use the CONTROL option to identify Group E as the control group, and specify the ADJUST=DUNNETT option in the LSMEANS statement, as follows: LSMEANS TRT / PDIFF=CONTROL(‘E’) ADJUST=DUNNETT;
428
Common Statistical Methods for Clinical Research with SAS Examples
The results from this statement are shown in Output E.3. OUTPUT E.3. Dunnett’s Results for Example E.1 EXAMPLE E.1: Multiple Comparisons Treatment Comparisons vs. Control Group (TRT=E) The GLM Procedure Least Squares Means Adjustment for Multiple Comparisons: Dunnett
TRT A B C D E
Y LSMEAN 18.1250000 24.7000000 26.5555556 21.1250000 18.1111111
H0:LSMean= Control Pr > |t| 1.0000 0.0138 0.0017 0.4901
p-Value Adjustment Methods Both the Tukey-Kramer and Dunnett methods rely on numerical integration to evaluate complex probability distributions and require the assumption of normality. They are robust statistical methods that have good power and enjoy widespread usage in statistical programs, including SAS. Let’s look at the Bonferroni, the Sidak, the step-down Bonferroni (Holm), and the step-up Bonferroni (Hochberg) methods, which are useful in multiple comparisons. These methods attempt to control the overall significance level by adjusting p-values without requiring the evaluation of complex distributions. They can be used in making pairwise comparisons of means but also have a much wider application. Although versatile and very easy to compute, these methods are generally very conservative and might fail to detect important differences. The adjusted p-values for the 10 pairwise comparisons of Example E.1 are shown in Table E.4 based on the methods listed in the preceding paragraph, in addition to the results previously obtained by using the Tukey-Kramer and Dunnett methods.
Appendix E: Multiple Comparison Methods
429
TABLE E.4. Adjusted p-Values for Example E.1
(i) Comparison
t-Test (‘Raw’)
TukeyKramer
Dunnett
Bonferroni
Sidak
Step-Down Bonferroni
Step-Up Bonferroni
pRAW
pTUK
pDUN
pBON
pSID
pSTPDWN
pSTPUP
0.0017*
0.004*
0.004*
0.004*
0.004*
0.006*
0.006*
0.005*
0.005*
0.039*
0.038*
0.031*
0.031*
(1)
C v. E
0.0004*
0.0039*
(2)
A v. C
0.0006*
0.0054*
(3)
B v. E
0.0039*
0.0300*
(4)
A v. B
0.0051*
0.0386*
0.051
0.050*
0.036*
0.036*
(5)
C v. D
0.0216*
0.1389
0.216
0.196
0.130
0.130
(6)
B v. D
0.1147
0.4980
1.000
0.704
0.574
0.574
(7)
D v. E
0.1919
0.6759
1.000
0.881
0.768
0.620
(8)
A v. D
0.2066
0.7021
1.000
0.901
0.768
0.620
(9)
B v. C
0.3926
0.9080
1.000
0.993
0.785
0.785
(10) A v. E
0.9951
1.0000
1.000
1.000
0.995
0.995
0.0138*
0.4901
1.0000
The adjusted p-values using the Bonferroni and Sidak methods, are found from the raw p-values by using the formulas,
pBON = m · pRAW and
pSID = 1 – (1 – pRAW)m where m is the number of comparisons and pRAW is the raw (unadjusted) p-value. Values greater than 1.0 are truncated to 1.0. For example, adjusted p-values for Comparison (5) in Table E.4 are 10
pBON = 10(0.0216) = 0.216 and pSID = 1– (1–0.0216)
= 0.1962
The Bonferroni step-down method adjusts the raw p-values in a stepwise fashion starting with the smallest. Comparison (1) is adjusted as pSTPDWN(1) = m·pRAW(1) or 10(0.0004) = 0.004. The next smallest raw p-value, Comparison (2), is adjusted as pSTPDWN(2) = (m–1)pRAW(2) = 9(0.0006) = 0.0054. Comparison (3) is adjusted as pSTPDWN(3) = (m–2)pRAW = 8(0.0039) = 0.0312, and so on. Any adjusted p-value calculated to be greater than 1 is set to 1.0. Also, all comparisons that follow the first non-significant comparison must be nonsignificant, and an adjusted p-value is set equal to its predecessor if its calculated value is less. Therefore for Comparison (8), the adjusted p-value is calculated to be 3(0.2066) = 0.6198, which is less than the preceding adjusted p-value (0.7676), so the Comparison 8 p-value is set to 0.7676. The Bonferroni step-down method is sometimes referred to as Holm’s method.
430
Common Statistical Methods for Clinical Research with SAS Examples
The Bonferroni step-up-method adjusts the raw p-values in a stepwise fashion, but it starts with the largest value. For Comparison (10) in Table E.4., you compute pSTPUP(10) = 1 · pRAW(10) = 0.9951. The next largest value, Comparison (9), is adjusted as pSTPUP(9) = 2·pRAW(9) = 2(0.3926) = 0.7852. Comparison (8) is adjusted as pSTPUP(8) = 3·pRAW(8) = 3(0.2066) = 0.6198, and so on. Each adjusted p-value must be no greater than its predecessor. Those values that are greater are set equal to the preceding adjusted p-value. The Bonferroni step-up method is also known as Hochberg’s method. Adjusted p-values using the Bonferroni, Sidak, and Bonferroni stepwise methods can be found by using PROC MULTTEST in SAS as shown below with the MC data set of Example E.1. The BONFERRONI, SIDAK, HOLM, and HOC options are specified in the PROC MULTTEST statement, and the results are written to the data set NEWMC . The treatment group (TRT) is specified as the grouping variable in a CLASS statement , and the MEAN option in the TEST statement identifies the response variable (Y) for which comparisons of means are to be performed. One CONTRAST statement is used for each pairwise comparison. You see very similar results between the Bonferroni step-down and step-up methods for Example E.1 (stpbon_p and hoc_p, respectively, in Output E.4). However, the step-up method is actually more powerful and, in many cases, will find more significant comparisons, but it must be used with caution because the step-up method assumes that the raw p-values are mutually independent. Notice also that the Bonferroni step-down and step-up methods produce results that are very comparable to the Tukey-Kramer method for Example E.1. Although the Bonferroni and Sidak methods are very easy to use, they are not recommended for the primary analysis in confirmatory or pivotal clinical trials because of their overly conservative nature. SAS Code for Example E.1 (continued) PROC MULTTEST BONFERRONI SIDAK HOLM HOC NOPRINT OUT=NEWMC DATA=MC; CLASS TRT; TEST MEAN(Y); CONTRAST 'A v B' -1 1 0 0 0; CONTRAST 'A v C' -1 0 1 0 0; CONTRAST 'A v D' -1 0 0 1 0; CONTRAST 'A v E' -1 0 0 0 1; CONTRAST 'B v C' 0 -1 1 0 0; CONTRAST 'B v D' 0 -1 0 1 0; CONTRAST 'B v E' 0 -1 0 0 1; CONTRAST 'C v D' 0 0 -1 1 0; CONTRAST 'C v E' 0 0 -1 0 1; CONTRAST 'D v E' 0 0 0 -1 1; RUN; PROC PRINT DATA=NEWMC; TITLE1 'EXAMPLE E.1: Multiple Comparisons'; TITLE2 'p-Value Adjustment Procedures'; RUN;
Appendix E: Multiple Comparison Methods
431
OUTPUT E.4. p-Values for Pairwise Treatment Comparisons EXAMPLE E.1: Multiple Comparisons p-Value Adjustment Procedures
O b s
_ t e s t _
_ v a r _
1 2 3 4 5 6 7 8 9 10
MEAN MEAN MEAN MEAN MEAN MEAN MEAN MEAN MEAN MEAN
Y Y Y Y Y Y Y Y Y Y
_ c o n t r a s t _ A A A A B B B C C D
v v v v v v v v v v
_ v a l u e _ B C D E C D E D E E
_ s e _
_ n v a l _
r a w _ p
b o n _ p
s t p b o n _ p
s i d _ p
h o c _ p
289.300 97.491 39 0.00511 0.05109 0.03577 0.04993 0.03577 370.944 99.870 39 0.00064 0.00637 0.00573 0.00635 0.00573 132.000 102.765 39 0.20655 1.00000 0.76780 0.90110 0.61965 -0.611 99.870 39 0.99515 1.00000 0.99515 1.00000 0.99515 81.644 94.435 39 0.39257 1.00000 0.78513 0.99316 0.78513 -157.300 97.491 39 0.11470 1.00000 0.57352 0.70428 0.57352 -289.911 94.435 39 0.00389 0.03887 0.03109 0.03820 0.03109 -238.944 99.870 39 0.02165 0.21645 0.12987 0.19654 0.12987 -371.556 96.888 39 0.00045 0.00447 0.00447 0.00446 0.00447 -132.611 99.870 39 0.19195 1.00000 0.76780 0.88132 0.61965
The Dunnett’s p-values are not directly comparable to the other p-values in Table E.4 because they are testing different hypotheses. The Dunnett p-values are included to show how the values change when the alternative hypothesis is changed from “at least one pair of means differs” to “at least one treatment mean differs from the control mean”. Adjusted p-values can be found for the latter hypothesis by using the Bonferroni and Sidak methods in the same way that is described for the all-pairwise comparisons method. For example, when restricting attention only to the treatment versus control comparisons, the Sidak adjusted p-value for the B v. E comparison would be 1–(1–0.0039)4 = 0.0155.
‘Closed’ Testing Multiple testing of individual null hypotheses that control for the overall significance level can be performed by using a principle called ‘closure’. This concept is discussed in depth with good examples in Chi (1998) and Bauer (1991). Here, the concept is introduced for the case of K=3 (3 groups) and m=3 pairwise comparisons. Under the principle of closure, each of the 3 pairwise comparisons can be conducted at the full α (0.05) level with no p-value adjustments, but only if the omnibus hypothesis (i.e., ‘all groups equal’) is first rejected at the α level. This method has great utility in the analysis of clinical trials data because many studies include 3 treatment groups. Pairwise comparisons of the mean responses among the 3 groups can be performed with t-tests for example, each at the 0.05 level of significance following a significant F-test when analysis of variance is used. If the Treatment effect is not significant based on the ANOVA F-test, each pairwise comparison must be declared non-significant under closed testing in order to maintain the experimentwise error rate at 0.05.
432
E.2
Common Statistical Methods for Clinical Research with SAS Examples
Multiple Comparisons of Binomial Proportions Because the p-value adjustment methods discussed in the previous section are based only on the raw p-values and not the underlying distribution of response data, you can use these same methods for a wide range of applications, including those with non-normal data (e.g., categorical responses and time-to-event data). When responses are binomial, you can use a set of raw p-values from a series of chisquare tests, Fisher’s exact tests, Cochran-Mantel-Haenszel tests, and many other methods to adjust for multiple comparisons of group proportions, such as response rates among treatments. In fact, PROC MULTTEST in SAS can be executed from an input list of raw p-values, as shown next. The comparison identification must be in the variable named TEST, and the raw (input) p-values must be values of the variable named RAW_P. The output shows the same adjusted p-values as previously obtained. Notice that in this program, no TEST or CLASS statements are used with PROC MULTTEST. Also, these adjusted p-values are not as accurate as those shown in Output E.4 because the raw p-values are entered using only 4-decimal place accuracy. DATA ADJST; INPUT TEST RAW_P @@; DATALINES; 1 0.0004 2 0.0006 3 0.0039 6 0.1147 7 0.1919 8 0.2066 ;
4 0.0051 5 0.0216 9 0.3926 10 0.9951
PROC MULTTEST BONFERRONI SIDAK HOLM HOC PDATA = ADJST; TITLE 'DATA SET ADJST'; RUN;
OUTPUT E.5. p-Value Adjustments Using PROC MULTTEST for Example E.1 DATA SET ADJST The Multtest Procedure p-Values
Test
Raw
Bonferroni
Stepdown Bonferroni
Sidak
Hochberg
1 2 3 4 5 6 7 8 9 10
0.0004 0.0006 0.0039 0.0051 0.0216 0.1147 0.1919 0.2066 0.3926 0.9951
0.0040 0.0060 0.0390 0.0510 0.2160 1.0000 1.0000 1.0000 1.0000 1.0000
0.0040 0.0054 0.0312 0.0357 0.1296 0.5735 0.7676 0.7676 0.7852 0.9951
0.0040 0.0060 0.0383 0.0498 0.1962 0.7043 0.8812 0.9012 0.9932 1.0000
0.0040 0.0054 0.0312 0.0357 0.1296 0.5735 0.6198 0.6198 0.7852 0.9951
Appendix E: Multiple Comparison Methods
433
Resampling Methods ‘Resampling’ is another method that can be used for performing multiple comparisons with binomial data. This method uses the same randomization principle on which Fisher’s Exact test (Chapter 17) is based and requires the use of a computer program. In SAS, you can use PROC MULTTEST to perform ‘permutation’ resampling, as shown in Example E.2. In essence, the algorithm randomly selects, with replacement, a sample of all possible permutations of outcomes assuming the null hypothesis is true. The adjusted p-value is the proportion of those samples that have outcomes more extreme than the observed outcome. p Example E.2
Suppose a binomial response (‘response’ or ‘non-response’) is measured on a new sample of 168 patients randomized to the 5-group parallel study given in Example E.1, with summary results as follows. TABLE E.5. Response Rates for Example E.2 A
B
C
D
E
Lo-Dose
Mid-Dose
Hi-Dose
Reference
Placebo
N Responders
33 9
31 14
34 23
35 10
35 5
Non-Responders
24
17
11
25
30
Response Rate
27.3%
45.2%
67.6%
28.6%
14.3%
Use PROC MULTTEST to perform permutation resampling and compare the results with other p-value adjustment methods. SAS Code for Example E.2 DATA RR; INPUT TRT $ RESP COUNT @@; /* RESP=0 is “non-response”, RESP=1 is “response” */ DATALINES; A 0 24 A 1 9 B 0 17 B 1 14 C 0 11 C 1 23 D 0 25 D 1 10 E 0 30 E 1 5 ;
434
Common Statistical Methods for Clinical Research with SAS Examples
ODS SELECT Multtest.pValues; PROC MULTTEST ORDER = DATA PERMUTATION NSAMPLE=20000 SEED=28375 DATA = RR; CLASS TRT; TEST FISHER(RESP); FREQ COUNT; CONTRAST 'A vs B' -1 1 0 0 0; CONTRAST 'A vs C' -1 0 1 0 0; CONTRAST 'A vs D' -1 0 0 1 0; CONTRAST 'A vs E' -1 0 0 0 1; CONTRAST 'B vs C' 0 -1 1 0 0; CONTRAST 'B vs D' 0 -1 0 1 0; CONTRAST 'B vs E' 0 -1 0 0 1; CONTRAST 'C vs D' 0 0 -1 1 0; CONTRAST 'C vs E' 0 0 -1 0 1; CONTRAST 'D vs E' 0 0 0 -1 1; TITLE1 'EXAMPLE E.2: Multiple Comparisons of Proportions'; TITLE2 'Permutation Resampling Method'; RUN; /* (prespecify the seed value so results can be duplicated) */;
OUTPUT E.6. Resampling Method Using PROC MULTTEST for Example E.2 EXAMPLE E.2: Multiple Comparisons of Proportions Permutation Resampling Method The Multtest Procedure p-Values Variable
Contrast
RESP RESP RESP RESP RESP RESP RESP RESP RESP RESP
A A A A B B B C C D
vs vs vs vs vs vs vs vs vs vs
B C D E C D E D E E
Raw
Permutation
0.1932 0.0014 1.0000 0.2365 0.0831 0.2038 0.0072 0.0017 <.0001 0.2436
0.5530 0.0080 1.0000 0.7095 0.3666 0.6039 0.0397 0.0093 <.0001 0.7095
The ‘raw’ p-values are the values that result from unadjusted pairwise comparisons using two-tailed Fisher’s exact tests (Chapter 17). The values are found directly by using PROC MULTTEST with the FISHER option in the TEST statement , which eliminates the need to run PROC FREQ. A resampling size of 20,000 is requested, and a random seeding value is provided . If no value is specified for SEED, SAS randomly selects a value, which results in slightly different adjusted pvalues each time the program is run.
Appendix E: Multiple Comparison Methods
435
For comparison, compute adjusted p-values by using the Bonferroni, Sidak, stepdown Bonferroni and step-up Bonferroni methods from the raw p-values based on Fisher’s exact test (Output E.6).
DATA BINP; INPUT TEST RAW_P @@; DATALINES; 1 0.1932 2 0.0014 3 1.0000 6 0.2038 7 0.0072 8 0.0017 ;
4 0.2365 5 0.0831 9 0.0001 10 0.2436
PROC MULTTEST BONFERRONI SIDAK HOLM HOC PDATA = BINP; RUN;
The p-values from the SAS output are reproduced in Table E.6, which compares the adjusted p-values for the permutation resampling method with the methods previously described. Notice that permutation resampling results in 4 significant comparisons (p < 0.05), while each of the other methods find only 3 that are significant. Resampling techniques are preferred, when possible, because they generally have greater power than the other methods previously discussed. In the past, application of resampling methods has been limited by the huge computing resources required, but they are gaining in use with the increasing power of desktop computing. One disadvantage of resampling is that the adjusted p-values will differ slightly with each implementation. When using PROC MULTTEST, results can be duplicated by specifying the same value for SEED each time you run the program. TABLE E.6. Adjusted p-Values for Example E.2
Comparison
Fisher’s (‘Raw’)
Permutation Resampling
Bonferroni
Sidak
Stepdown Bonferroni
Stepup Bonferroni
C v. E
0.0001*
0.0001*
0.001*
0.001*
0.001*
0.001*
A v. C
0.0014*
0.0080*
0.014*
0.014*
0.013*
0.013*
C v. D
0.0017*
0.0093*
0.017*
0.0170*
0.014*
0.014*
B v. E
0.0072*
0.0397*
0.072
0.070
0.050
0.050
B v. C
0.0831
0.3666
0.831
0.580
0.499
0.487
A v. B
0.1932
0.5530
1.000
0.883
0.966
0.487
B v. D
0.2038
0.6039
1.000
0.898
0.966
0.487
A v. E
0.2365
0.7095
1.000
0.933
0.966
0.487
D v. E
0.2436
0.7095
1.000
0.939
0.966
0.487
A v. D
1.0000
1.0000
1.000
1.000
1.000
1.000
436
Common Statistical Methods for Clinical Research with SAS Examples
The adjusted p-values found by PROC MULTTEST are based on the entire set of CONTRAST statements used. To compare each treatment group with the control, use the same SAS statements as shown in the SAS Code for Example 4.2, and replace the ten CONTRAST statements with the following set of four statements: CONTRAST CONTRAST CONTRAST CONTRAST
E.3
'A 'B 'C 'D
vs vs vs vs
E' -1 0 0 0 E' 0 -1 0 0 E' 0 0 -1 0 E' 0 0 0 -1
1; 1; 1; 1;
Summary This introduction to multiple comparisons illustrates just some of the many methods available for conducting multiple comparisons for hypothesis testing. Many of the same methods can be used when addressing the multiplicity problem caused by simultaneous analysis of multiple response variables, such as that which occurs with multiple primary endpoints or study objectives. Adjustments based on methods discussed here can also be implemented for simultaneous confidence intervals. The Tukey-Kramer and Dunnett’s methods presented in this section using one-way ANOVA (Chapter 6) are widely used when response data have a normal distribution. These methods can also be used with more complex models, such as two-way ANOVA (Chapter 7), repeated measures ANOVA (Chapter 8), and analysis of covariance (Chapter 11), although these methods can be somewhat more conservative with these models. You have also seen how to adjust the raw pairwise p-values to maintain a prespecified, overall α-level without requiring ANOVA assumptions. The p-value adjustment methods discussed here can be used for pairwise treatment comparisons when response data do not follow the normal distribution. This approach can be applied to most of the methods discussed in this book that involve comparisons of more than two treatment groups. In addition, resampling techniques and simulation methods are gaining greater attention with the increasing power of low-cost computing. The field of multiple comparisons is a broad and rapidly evolving field. Entire books have been written and professional conferences are held, about this one subject. Standard tests of the past, such as Duncan’s test, have been replaced by more powerful methods. ‘Stepwise’ methods can be used with the Tukey, Dunnett, and Sidak methods, re-sampling methods, and tests under the closure principle. Improvement in power often results by implementing stepwise versions of the methods discussed in this chapter. Note: For a more detailed discussion of multiple comparisons with SAS examples, refer to the excellent book, “Multiple Comparisons and Multiple Tests Using the SAS System,” by Westfall, Tobias, Rom, and Wolfinger, which was written under the SAS Books By Users program.
Appendix F Data Transformations F.1 F.2
Introduction...................................................................................................... 437 The Log Transformation................................................................................. 438
F.1
Introduction Many statistical tests, including the two-sample t-test (Chapter 5) and ANOVA (Chapters 6, 7, and 8) are performed under the assumption of normally distributed data with variance homogeneity among groups. Although such tests are known to be robust against mild deviations to these assumptions, especially with larger sample sizes, clinical data are often encountered that appear to have skewed distributions or larger variability within certain groups. The underlying distribution in such tests might actually be non-normal, or there might be a small proportion of the data called ‘outliers’ that occur outside the range of usual expectation under the normality assumption, for known or unknown reasons. Data transformations can often be used effectively in normalizing the data and/or stabilizing the variability. The logarithmic, square root, arcsine, and rank transformations are among the most frequently used data transformations in applied statistics. The square root and arcsine transformations are infrequently used in clinical statistics, however. Converting the observations to their ranks before analysis is one of the most common methods of transforming the data (see Chapters 12, 13, and 14). The logarithmic (‘log’) transformation is also frequently used, and its application is discussed in the next section. Log transformations are commonly used in analyzing many types of clinical research data. One example is in vaccine and immunogenicity studies in which antibody titer values are measured. These values (xi’s) are usually modeled with the log-normal distribution, and results are summarized in terms of the geometric mean titer and the geometric mean ratio (or ‘n-fold increase’). Another common application of the log-transformation is in the analysis of area-under-the-curve (AUC) data based on blood levels from bioavailability and pharmacokinetic studies.
438
F.2
Common Statistical Methods for Clinical Research with SAS Examples
The Log Transformation The Log-Normal Distribution If the logarithm of a response, Y, denoted as log(Y), has a normal distribution, the random variable Y is said to come from a ‘log-normal’ distribution. The lognormal distribution looks like a normal distribution (see Chapter 1 or Figure B.1 in Appendix B) with its left tail compressed and its right tail stretched out due to a small number of very large values. (This is also called a ‘skewed’ distribution). It is appropriate to perform the log transformation on clinical data prior to analysis if the data come from a log-normal distribution. More generally, the log transformation is often used to ‘improve’ the normality of data sets that have any type of skewed distribution as described above. What Is a Logarithm? A logarithm has an associated base, b. The base b logarithm of X is expressed as logbX. A logarithm is the inverse function of exponentiation. That is, if Y = logbX, then X = bY. When not specified, b is usually 10. Most often, b is selected to be (“Euler’s constant”) e. In fact, this selection is so common, the base e logarithm has a special name, the ‘natural logarithm’, which is abbreviated as ‘ln’. Thus, ln(X) = loge(X), which implies X = eY (also denoted, X = exp(Y)). Only the natural logs are referred to in the discussion that follows. Analyzing Data Using the Log Transformation Following transformation, analyses are carried out using an ANOVA on the transformed data, and summary statistics are computed in the usual way. These statistics can be expressed in terms of the original data by using the relationships discussed next. Observe the response measures x1, x2, …, xn. Apply the log transformation as yi = ln(xi), and conduct analyses as usual on the yi’s. Let Y and sY denote the mean and standard deviation of the yi’s, and let LY and UY represent the lower and upper limits of the 100(1–α)% confidence interval, found by Y ± tα/2 ּ s y
that is,
LY = Y – tα/2 ּ s y and LU = Y + tα/2 ּ s y A 100(1–α)% confidence interval based on the original data (xi’s) is found by exponentiating LY and UY. L
U
(eY –e
Y
)
$SSHQGL[)'DWD7UDQVIRUPDWLRQV
%HFDXVH \L OQ[ L OQ [ L Q < Q Q \RX VHH WKDW H[S < LV VLPSO\ WKH JHRPHWULF PHDQ RI WKH [ ¶V GHQRWHG E\ ; J 7KLV UHSUHVHQWV WKH µJHRPHWULF PHDQ WLWHU¶ RU *07 XVHG LQ LPPXQRJHQLFLW\ VWXGLHV
L
\
6(H =
H\ × V\ Q
)LQDOO\OHW UHSUHVHQWWKHGLIIHUHQFHEHWZHHQWZRWUHDWPHQWPHDQVVD\$DQG%
<$ - <%
7KLVGLIIHUHQFHLVHDVLO\FRQYHUWHGWRWKH³JHRPHWULFPHDQUDWLR´E\H[SRQHQWLDWLQJ J ; H = H<$ - <% = $J ;% 1RWLFHWKDWWKHWUDQVIRUPDWLRQGRHVQRWVXEVWDQWLYHO\FKDQJHWKHK\SRWKHVLVWKDW¶V EHLQJ WHVWHG 7KH K\SRWKHVLV RI QR WUHDWPHQW GLIIHUHQFH LV HTXLYDOHQW WR WKH K\SRWKHVLVWKDWWKHUDWLRRIWUHDWPHQWPHDQVLV
6LPLODUO\ FKDQJHV IURP D EDVHOLQH YDOXH H[SUHVVHG DV < ± < FDQ EH
FRQYHUWHGWRDSHUFHQWFKDQJHLQJHRPHWULFPHDQVDV H ±
440
Common Statistical Methods for Clinical Research with SAS Examples
$SSHQGL[* 6$6&RGHIRU([HUFLVHVLQ &KDSWHU 6$6&RGHIRU([HUFLVHVVHH&KDSWHU LIBNAME EXAMP ’c:\bookfiles\examples\sas’; OPTIONS NODATE NONUMBER;
*======================================================* | STATISTICAL ANALYSES for Response Variable = "SCORE"| | (continuous numeric response variable) | *=======================================================* ; /* ANALYSIS #1: Two-Sample t-Test */ PROC TTEST DATA = EXAMP.TRIAL; CLASS TRT; VAR SCORE; TITLE ’ANALYSIS #1: Two-Sample t-Test’; RUN;
/* ANALYSIS #2: Wilcoxon Rank-Sum Test */ PROC NPAR1WAY WILCOXON DATA = EXAMP.TRIAL; CLASS TRT; VAR SCORE; TITLE ’ANALYSIS #2: Wilcoxon Rank-Sum Test’; RUN;
/* ANALYSIS #3: One-Way ANOVA */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT; MODEL SCORE = TRT; MEANS TRT; TITLE ’ANALYSIS #3: One-Way ANOVA’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
/* ANALYSIS #4: Two-Way ANOVA with Interaction */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT CENTER; MODEL SCORE = TRT CENTER TRT*CENTER; LSMEANS TRT / STDERR PDIFF; TITLE ’ANALYSIS #4: Two-Way ANOVA With Interaction’; RUN;
/* ANALYSIS #5: Two-Way ANOVA without Interaction */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT CENTER; MODEL SCORE = TRT CENTER; TITLE ’ANALYSIS #5: Two-Way ANOVA Omitting Interaction’; RUN;
/* ANALYSIS #6: Three-Way ANOVA, with All Interactions */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT CENTER SEX; MODEL SCORE = TRT | CENTER | SEX; LSMEANS TRT / STDERR PDIFF; TITLE ’ANALYSIS #6: Three-Way ANOVA - all Interactions’; RUN;
/* ANALYSIS #7: Three-Way Main Effects ANOVA */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT CENTER SEX; MODEL SCORE = TRT CENTER SEX; TITLE ’ANALYSIS #7: Three-Way Main Effects ANOVA’; RUN;
/* ANALYSIS #8: Regression of Score on Age */ PROC GLM DATA = EXAMP.TRIAL; MODEL SCORE = AGE / SOLUTION; TITLE ’Linear Regression of Score on Age’; RUN; PROC PLOT DATA = EXAMP.TRIAL; PLOT SCORE*AGE; RUN;
/* ANALYSIS #9: Analysis of Covariance - Test for Equal Slopes */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT; MODEL SCORE = TRT AGE TRT*AGE; /* use interaction to check for equal slopes */ TITLE1 ’ANALYSIS #9: ANCOVA Using Age as Covariate,’; TITLE2 ’Test for Equal Slopes’; RUN;
$SSHQGL[*6$6&RGHIRU([HUFLVHVLQ&KDSWHU
/* ANALYSIS #10: Analysis of Covariance - Assuming Equal Slopes */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT; MODEL SCORE = TRT AGE / SOLUTION; LSMEANS TRT / STDERR PDIFF; TITLE ’ANALYSIS #10: ANCOVA Using Age as Covariate’; RUN; PROC PLOT DATA = EXAMP.TRIAL; PLOT SCORE*AGE=TRT; RUN;
/* ANALYSIS #11: Stratified ANCOVA */ PROC GLM DATA = EXAMP.TRIAL; CLASS TRT CENTER; MODEL SCORE = TRT CENTER AGE / SOLUTION; TITLE1 ’ANALYSIS #11: ANCOVA -- using Age as covariate,’; TITLE2 ’Stratified by Center’; RUN;
*=======================================================* | STATISTICAL ANALYSES for Response Variable = "RESP" | | (dichotomous response variable) | *=======================================================*
;
PROC FORMAT; VALUE RSPFMT 0 = ’NO ’ 1 = ’YES’; RUN;
/* ANALYSES #12 and #13: Chi-Square and Fishers Exact Test -- Dichotomous Response */ PROC FREQ DATA = EXAMP.TRIAL; TABLES TRT*RESP / CHISQ NOCOL NOPCT; FORMAT RESP RSPFMT.; TITLE1 ’ANALYSIS #12: Chi-Square Test’; TITLE2 ’ANALYSIS #13: Fishers Exact Test’; RUN;
/* ANALYSIS #14: Stratified CMH Test for Response Rate Analysis */ PROC FREQ DATA = EXAMP.TRIAL; TABLES CENTER*TRT*RESP / CMH NOCOL NOPCT; FORMAT RESP RSPFMT.; TITLE ’ANALYSIS #14: CMH Test Controlling for Center’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
/* ANALYSIS #15: Logistic Regression - Dichotomous Response */ PROC LOGISTIC DATA = EXAMP.TRIAL; CLASS TRT / PARAM = REF; MODEL RESP = TRT; TITLE1 ’ANALYSIS #15: Logistic Regression Analysis for’; TITLE2 ’Treatment Group Differences Using Dichotomized Response’; RUN;
/* ANALYSIS #16: Logistic Regression - Dichotomous Response, Adjusted for Age */ PROC LOGISTIC DATA = EXAMP.TRIAL; CLASS TRT / PARAM = REF; MODEL RESP = TRT AGE; TITLE1 ’ANALYSIS #16: Logistic Regression Analysis for’; TITLE2 ’Treatment Group Differences Using Dichotomized Response’; TITLE3 ’Adjusted for Age’; RUN;
/* ANALYSIS #17: Logistic Regression - Dichotomous Response, Adjusted for Age, Sex, Center */ PROC LOGISTIC DATA = EXAMP.TRIAL; CLASS TRT SEX CENTER / PARAM = REF; MODEL RESP = TRT AGE SEX CENTER TRT*CENTER; TITLE1 ’ANALYSIS #17: Logistic Regression Analysis for’; TITLE2 ’Treatment Group Differences Using Dichotomized Response’; TITLE3 ’Adjusted for Age, Sex, Center & Interactions’; RUN;
*=======================================================* | STATISTICAL ANALYSES for Response Variable = "SEV" | | (ordinal categorical response variable) | *=======================================================*
;
\* Obtain Severity Distributions based on the Response = "SEV" *\ PROC FORMAT; VALUE SEV1FMT
0 = ’0 = None’ 2 = ’2 = Mod ’
1 = ’1 = Mild’ 3 = ’3 = Sev ’
VALUE SEV2FMT
0 = ’0 = None ’ 2 = ’2.185 = Mod ’
1 = ’1.036 = Mild’ 3 = ’3 = Sev ’ ;
VALUE SEV3FMT
0 = ’None’ 2 = ’Mod.’
1 = ’Mild’ 3 = ’Sev.’
;
;
RUN;
/* ANALYSES #18 and #19: Chi-Square / Fishers Exact Test -Multinomial Response */ PROC FREQ DATA = EXAMP.TRIAL; TABLES TRT*SEV / CHISQ EXACT NOCOL NOPCT; FORMAT SEV SEV1FMT.;
$SSHQGL[*6$6&RGHIRU([HUFLVHVLQ&KDSWHU
TITLE1 ’ANALYSIS #18: Chi-Square Test’; TITLE2 ’ANALYSIS #19: Generalized Fishers Exact Test’; RUN;
/* ANALYSIS #20: CMH Test for Comparing Severity Distributions Using Table Scores */ PROC FREQ DATA = EXAMP.TRIAL; TABLES TRT*SEV / CMH NOCOL NOPCT; FORMAT SEV SEV1FMT.; TITLE1 ’ANALYSIS #20: Mantel-Haenszel Test on Severity’; TITLE2 ’(Using Table Scores)’; RUN;
/* ANALYSIS #21: CMH Test for Comparing Severity Distributions - Using Modified Ridit Scores */ PROC FREQ DATA = EXAMP.TRIAL; TABLES TRT*SEV / CMH SCORES = MODRIDIT NOCOL NOPCT; FORMAT SEV SEV2FMT.; TITLE1 ’ANALYSIS #21: Mantel-Haenszel Test on Severity’; TITLE2 ’(Using Modified Ridit Scores)’; RUN;
/* ANALYSES #22: Wilcoxon Rank-Sum Test on Response = "SEV" */ PROC NPAR1WAY WILCOXON DATA = EXAMP.TRIAL; CLASS TRT; VAR SEV; TITLE ’ANALYSIS #22: Wilcoxon Rank-Sum Test on "SEV" Response’; RUN;
/* ANALYSIS #23: Stratified CMH Test for Comparing Severity Distributions - Using Table Scores */ PROC FREQ DATA = EXAMP.TRIAL; TABLES CENTER*TRT*SEV / CMH NOCOL NOPCT; FORMAT SEV SEV3FMT.; TITLE1 ’ANALYSIS #23: CMH Test on Severity Distributions’; TITLE2 ’(Using Table Scores), Controlling for Center’; RUN;
/* ANALYSIS #24: Stratified CMH Test for Comparing Severity Distributions - Using Modified Ridit Scores */ PROC FREQ DATA = EXAMP.TRIAL; TABLES CENTER*TRT*SEV / CMH SCORES = MODRIDIT NOCOL NOPCT; FORMAT SEV SEV3FMT.; TITLE1 ’ANALYSIS #24: CMH Test on Severity Distributions’; TITLE2 ’(Using Modified Ridit Scores), Controlling for Center’; RUN;
&RPPRQ6WDWLVWLFDO0HWKRGVIRU&OLQLFDO5HVHDUFKZLWK6$6([DPSOHV
/* ANALYSIS #25: Logistic Regression: Proportional Odds Model */ PROC LOGISTIC DATA = EXAMP.TRIAL; CLASS TRT / PARAM = REF; MODEL SEV = TRT; TITLE1 ’ANALYSIS #25: Proportional Odds Model for Treatment’; TITLE2 ’Differences Using Ordinal Response Categories’; RUN;
/* ANALYSIS #26: Logistic Regression: Proportional Odds Model, Adjusted for Age */ PROC LOGISTIC DATA = EXAMP.TRIAL; CLASS TRT / PARAM = REF; MODEL SEV = TRT AGE; TITLE1 ’ANALYSIS #26: Proportional Odds Model for Treatment’; TITLE2 ’Differences Using Ordinal Response Categories’; TITLE3 ’Adjusted for Age’; RUN;
/* ANALYSIS #27: Logistic Regression: Proportional Odds Model, Adjusted for Age, Sex, Center */ PROC LOGISTIC DATA = EXAMP.TRIAL; CLASS TRT SEX CENTER; MODEL SEV = TRT AGE SEX CENTER; TITLE1 ’ANALYSIS #27: Proportional Odds Model for Treatment’; TITLE2 ’Group Differences Using Ordinal Response Categories’; TITLE3 ’Adjusted for Age, Sex & Center’; RUN;
REFERENCES and Additional Reading
Allison, P.D. (1995). Survival Analysis Using the SAS System: A Practical Guide. Cary, NC: SAS Institute Inc. Allison, P.D. (1999). Logistic Regression Using the SAS System: Theory and Application. Cary, NC: SAS Institute Inc. Altman, D.G. (1991). Practical Statistics for Medical Research. New York: Chapman & Hall. Anderson, S., Auquier, A., Hauck, W.W., Oakes, D., Vandaele, W., and Weisberg, H.I. (1980). Statistical Methods for Comparative Studies. New York: John Wiley & Sons. Armitage, P. and Berry, G. (1987). Statistical Methods in Medical Research. 2d ed. Palo Alto, CA: Blackwell Scientific Publications. Bancroft, T.A. (1968). Topics in Intermediate Statistical Methods, Volume One. Ames, IA: The Iowa University Press. Bauer, P. (1991). Multiple Testing in Clinical Trials. Statistics in Medicine 10: 871-890. Cantor, A. (1997). Extending SAS Survival Analysis Techniques for Medical Research. Cary, NC: SAS Institute Inc. Chi, G.Y.H. (1998). Multiple Testings: Multiple Comparisons and Multiple Endpoints. Drug Information Journal 32: 1347S-1362S. Cochran, W.G. and Cox, G.M. (1957). Experimental Designs. 2d ed. New York: John Wiley & Sons. Conover, W.J. and Iman, R.L. (1981). Rank Transformations as a Bridge between Parametric and Nonparametric Statistics. The American Statistician 35: 124-129.
448
Common Statistical Methods for Clinical Research with SAS Examples
Cox, D.R. and Oakes, D. (1984). Analysis of Survival Data. New York: Chapman & Hall. D’Agostino, R.B., Massaro, J., Kwan, H., and Cabral, H. (1993). “Strategies for Dealing with Multiple Treatment Comparisons in Confirmatory Clinical Trials.” Drug Information Journal 27: 625641. Dawson-Saunders, B. and Trapp, R.G. (1990). Basic and Clinical Biostatistics. San Mateo, CA: Appleton & Lange. Delwiche, L.D. and Slaughter, S.J. (1998). The Little SAS Book: A Primer. 2d ed. Cary, NC: SAS Institute Inc. DeMets, D.L. and Lan, K.K.G. (1994). “Interim Analysis: The Alpha Spending Function Approach.” Statistics in Medicine 13: 13411352. Elashoff, J.D. and Reedy, T.J. (1984). “Two-Stage Clinical Trial Stopping Rules.” Biometrics 40: 791-795. Elliott, R.J. (2000). Learning SAS in the Computer Lab. 2d Ed. Pacific Grove, CA: Brooks/Cole. Fleiss, J.L. (1981). Statistical Methods for Rates and Proportions. 2d ed. New York: John Wiley & Sons. Freund, R.J. and Littell, R.C. (1991). SAS System for Regression. 2d ed. Cary, NC: SAS Institute Inc. Geller, N.L. and Pocock, S.J. (1987). “The Consultant’s Forum: Interim Analyses in Randomized Clinical Trials: Ramifications and Guidelines for Practitioners.” Biometrics 43: 213-223. Gillings, D. and Koch, G. (1991). “The Application of the Principle of Intention-to-Treat to the Analysis of Clinical Trials.” Drug Information Journal 25: 411-424. Hand, D.J. and Taylor, C.C. (1987). Multivariate Analysis of Variance and Repeated-Measures: A Practical Approach to Behavioural Scientists. New York: Chapman & Hall. Hatcher, L. and Stepanski, E.J. (1994). A Step-by-Step Approach to Using the SAS System for Univariate and Multivariate Statistics. Cary, NC: SAS Institute Inc.
References and Additional Reading
Hollander, M. and Wolfe, D.A. (1973). Nonparametric Statistical Methods. New York: John Wiley & Sons. Hosmer, D.W. and Lemeshow, S. (1989). Applied Logistic Regression. New York: John Wiley & Sons. Johnson, M.F. (1990). “Issues in Planning Interim Analyses.” Drug Information Journal 24: 361-370. Jones, B. and Kenward, M.G. (1989). Design and Analysis of Crossover Trials. New York: Chapman & Hall. Kalbfleisch, J.D. and Prentice, R.L. (1980). The Statistical Analysis of Failure Time Data. New York: John Wiley & Sons. Koch, G.G. and Gansky, S.A. (1996). “Statistical Considerations for Multiplicity in Confirmatory Protocols.” Drug Information Journal 30: 523-534. Lachin, J.M. (1981). “Introduction to Sample Size Determination and Power Analysis for Clinical Trials.” Controlled Clinical Trials 2: 93-113. Lachin, J.M. (2000). “Statistical Considerations in the Intent-to-Treat Principle.” Controlled Clinical Trials 21: 167-189. Lan, K.K.G., and DeMets, D.L. (1983). “Discrete Sequential Boundaries for Clinical Trials.” Biometrika 70: 659-663. Lee, J. (1981). “Covariance Adjustment of Rates Based on the Multiple Logistic Regression Model.” Journal of Chronic Diseases 34: 415426. Lehman, E.L. (1988). Theory of Point Estimation. New York: John Wiley & Sons. Littell, R.C., Freund, R.J. and Spector, P.C. (1991). SAS System for Linear Models. 3rd ed. Cary, NC: SAS Institute Inc. Littell, R.C., Milliken, G.A., Stroup, W.W., and Wolfinger, R.D. (1996). SAS System for Mixed Models. Cary, NC: SAS Institute Inc. Mendenhall, W. (1975). Introduction to Probability and Statistics. 4th ed. North Scituate, MA: Duxbury Press.
449
450
Common Statistical Methods for Clinical Research with SAS Examples
Munro, B.H. and Page, E.B. (1993). Statistical Methods for Health Care Research. 2d ed. Philadelphia: JB Lippincott Co. O'Brien, P.C. and Fleming, T.R. (1979). “A Multiple Testing Procedure for Clinical Trials.” Biometrics 35: 549-556. PMA Biostatistics and Medical Ad Hoc Committee on Interim Analysis. (1993). “Interim Analysis in the Pharmaceutical Industry.” Controlled Clinical Trials 14: 160-173. Pocock, S.J. (1977). Group Sequential Methods in the Design and Analysis of Clinical Trials.” Biometrika 64, 191-199. Pocock, S.J., Geller, N.L., and Tsiatis, A.A. (1987). “The Analysis of Multiple Endpoints in Clinical Trials.” Biometrics 43: 487-498. Reboussin, D.M., DeMets, D.L., Kim, K.M., and Lan, K.K.G. (2000). “Computations for Group Sequential Boundaries Using the LanDeMets Spending Function Method. Controlled Clinical Trials 21:190-207. Sankoh, A.J. (1999). “Interim Analyses: An Update of an FDA Reviewer’s Experience and Perspective.” Drug Information Journal 33: 165-176. SAS OnlineDoc, Version 8. 1999. CD-ROM. SAS Institute Inc., Cary, NC. SAS Institute Inc. (1996). SAS/STAT Software: Changes and Enhancements through Release 6.11. Cary, NC: SAS Institute Inc. SAS Institute Inc. (1998). SAS/STAT User’s Guide, Version 6, 4th ed. 2 Vols. Cary, NC: SAS Institute Inc. Stokes, M.E., Davis, C.S., and Koch, G.G. (2000). Categorical Data Analysis Using the SAS System. 2d ed. Cary, NC: SAS Institute Inc. Tibshirani, R. (1982). “A Plain Man's Guide to the Proportional Hazards Model.” Clinical and Investigative Medicine 5: 63-68. Walker, G.A. (1999). Considerations in the Development of Statistical Analysis Plans for Clinical Studies. San Diego, CA: Collins-Wellesley Publishing.
References and Additional Reading
Westfall, P.H., Tobias, R.D., Rom, D., Wolfinger, R.D., and Hochberg, Y. (1999). Multiple Comparisons and Multiple Tests Using the SAS System. Cary, NC: SAS Institute Inc. Winer, B.J. (1971). Statistical Principles in Experimental Designs, 2d ed. New York: McGraw-Hill. Zeger, S.L., Liang, K.Y., and Albert, P.S. (1988). “Models for Longitudinal Data: A Generalized Estimating Equation Approach.” Biometrics 44: 1049-1060.
451
452
Common Statistical Methods for Clinical Research with SAS Examples
Index A a-spending function 36–37 active vs. placebo comparisons of degree of response (example) 277–278 ADJUST= option, LSMEANS statement (GLM) 425, 427 ADR frequency with antibiotics (example) 267–269 aerosol inhalation, antibiotic blood levels following (example) 165–170 AGGREGATE option, MODEL statement (LOGISTIC) 341 AGREE option, TABLES statement (FREQ) 297, 304 Akaike’s Information Criterion (AIC) 147, 340 Alzheimer’s progression (example) 137–148 AML remission, relapse rate adjusted for (example) 322–329 analysis of covariance See ANCOVA (analysis of covariance) analysis of variance, one-way See ANOVA (analysis of variance), one-way analysis of variance, three-way See ANOVA (analysis of variance), three-way analysis of variance, two-way See ANOVA (analysis of variance), two-way ANCOVA (analysis of covariance) 199–226 coefficient of determination 223 confidence intervals 222 GLM procedure 215, 224 GLM procedure (example), 209–220 least squares criterion 200, 216, 217 mean square 202, 205 mean square error 206, 216 MEANS procedure (example) 209–213, 215–220 multi-center hypertension study (example) 214–220 sum of squares 201, 202, 205 sum of squares for error 201, 205 sum of squares for group 201 triglyceride changes for glycemic control (example) 203–213 Type I sum of squares 216, 224 Type III sum of squares 216, 224 anemia, hemoglobin changes in (example) 94–102
ANOVA (analysis of variance), See also repeated measures analysis basic concepts of 399–407 four-way crossover (example) 165–170 random effects in 109–110 two-way crossover 155, 157–173 two-way crossover (example) 160–164 ANOVA (analysis of variance), one-way 77–90 confidence intervals 89–90 Dunnett’s test 88 F-distribution 78, 86, 90 F-statistic 78, 86, 90 F-test 78, 86, 87, 89–90 F-values 86, 89 GLM procedure (example) 82–86 HAM-A scores in GAD (example) 80–86 Kolmogorov-Smirnov test (example) 87 Kruskal-Wallis test 87, 250, 254 least squares mean 405–407 mean square error 78, 79, 81, 90 mean square group 78, 79, 81 MEANS procedure (example) 82–86 multiple comparison of treatment effects 87, 89 noise reduction by blocking 401–405 non-parametric analog (Kruskal-Wallis test) 247–256 on ranked data 254 pairwise t-tests 87–88 p-value (example) 90 Shapiro-Wilk test 87 sum of squares for error 79, 81 sum of squares for group 79, 81 two-sample t-test 87, 90, 237–245 Types I-IV sum of squares 90 within- vs. between-group variation 399–401 ANOVA (analysis of variance), repeated measures See repeated measures analysis ANOVA (analysis of variance), summary table ANCOVA (analysis of covariance) 200, 205 crossover design 159, 162 for two-way ANOVA 92 multiple regression analysis 193 one-way ANOVA (example) 86 repeated measures analysis 113 repeated measures analysis, univariate approach 118 ANOVA (analysis of variance), three-way 109
454
Index
ANOVA (analysis of variance), two-way 91–110 F-test 93, 96–97, 110 F-statistic 96–97 F-value 92, 97, 106 Fixed effects 109–110 Friedman’s test 256 GLM procedure 97, 98, 108 handling empty cells in SS calculations 416 hemoglobin changes in anemia (example) 94–102 interaction effects 102, 103 mean square 92 mean square error 92–93, 107, 108, 109 MEANS procedure (example) 97–99 memory function (example) 103–107 noise reduction by blocking 401–405 non-parametric analog (Friedman’s test) 256 pairwise t-tests 97 random effects 109–110 sum of squares 92–93, 95–96 sum of squares for error 96, 109 sums of squares, SAS types 97, 104, 108, 109 within- vs. between-group variation 399–401 anti-anginal response vs. disease history (example) 178–184 antibiotic blood levels following aerosol inhalation (example) 165–170 applied statistics, defined 1 arcsine transformation 197, 437 arthritic discomfort (example) multivariate approach to repeated measures analysis 122–127 univariate approach to repeated measures analysis 115–121 autocorrelation in time series analysis 197
B balanced residuals 172 Bartlett’s test 87 between- vs. within-group variation 399–401 bias in clinical research studies 10–12 bilirubin abnormalities (example) 294–299 binomial distribution 5 binomial data, multiple comparisons of 432–436 binomial response for more than two groups 274–276 binomial test 257–263 genital wart cure rate (example) 259–261 McNemar’s test 293–304
one-tailed hypothesis tests 263 two-tailed hypothesis tests 263 blinded randomization 11–12 blocking for noise reduction 401–405 BMI (body-mass index), example 56–60 Bonferroni adjustment 33, 428–431, 435 BONFERRONI option, MULTTEST procedure 430 Bowker’s test 304 BRESLOW option 379 Breslow-Day test 315
C CABG, CHF incidence in (example) 286–289 cardiac medication, diaphoresis following (example) 160–164 carryover effect See crossover design case report forms (CRFs), examples 40–41 CATMOD procedure logistic regression analysis 344 repeated measures ANOVA 155, 304 Central Limit Theorem 6–9, 235, 261 central tendency 14 characterizing populations 3 CHART procedure, TRIAL data set 46, 47, 50, 53 CHF incidence in CABG after ARA (example) 286–289 chi-square distribution critical values (table) 392 notation 393 shape of 397, 398 chi-square test 265–284 ADR frequency with antibiotics (example) 267–269 binomial response for more than two groups 274–276 confidence intervals 280–281 continuity correction 280 for contingency table analysis 270–273 for non-zero correlation 281 FREQ procedure 268, 275, 288 Kruskal-Wallis test 284 log-rank test 360 Mantel-Haenszel 275, 282 multinomial responses 276–278 one-tailed hypothesis tests 281 sample size determination 29 two-tailed hypothesis tests 281
Index Z-test statistic 279 CHISQ option, TABLES statement (FREQ) 268, 275, 288 CI= option, TTEST procedure 70 CINV function 294 CLASS statement GLM procedure 97 MIXED procedure 143 CLM option, MODEL statement (GLM) 182, 194 closed testing 431 clustered binomial data in logistic regression analysis 336–340, 344 CMH option, TABLES statement (FREQ) 272–278, 316 Cochran-Armitage test 31, 281–282 Cochran-Mantel-Haenszel test 305–316 chi-square test vs. 313 correlated binary responses and 336 Dermotel response in diabetic users (example) 307–312 FREQ procedure 272–278, 316 log-rank test 360 relapse rate adjusted for AML remission (example) 324 coefficient of determination ANCOVA (analysis of covariance) 223 linear regression analysis 187 multiple regression analysis 194 collinearity 188, 197, 224 COLLINOINT option, MODEL statement (REG) 197 compound symmetry 112, 122, 136 concatenating data 345 confidence intervals 16–17 ANCOVA (analysis of covariance) 222–223 chi-square test 280–281 odds ratios 341–342 one-way ANOVA (example) 89–90 confidence limits 70 contingency table analysis 270–272 continuity correction chi-square test 280 McNemar’s test 301 continuous probability distributions 5 CONTRAST statement GLM procedure 88 MULTTEST procedure 434, 436 CONTRAST(1) option, REPEATED statement (GLM) 130, 131 controlled studies 10
455
CORR option, REG procedure 187 correlated binary responses 336 covariate effect, linear regression analysis 193 Cox proportional hazards model 353, 365–379 HSV-2 episodes (example) 367–369 Hyalurise in vitreous hemorrhage (example) 370–375 log-rank test vs 368 Cox regression See Cox proportional hazards model CRFs (case report forms), examples 40–41 crossover design 155, 157–173 See also repeated measures analysis antibiotic blood levels following aerosol inhalation (example) 165–170 CLASS statement, GLM procedure 163 diaphoresis following cardiac medication (example) 160–164 GLM procedure (example) 163–164, 165–170 mean square 159 repeated measures ANOVA 155, 159 sums of squares 159, 161–162, 172 sum of squares for error 162 2-period 171–172 3-period 171 4-period 165–170
D data collection forms, examples 40–41 DATA step 45 data transformations 437–439 dehydration, symptomatic recovery in (example) 185–192 Dermotel response in diabetic ulcers (example) 307–312 DESCENDING option, LOGISTIC procedure 325, 332 descriptive statistics 3, 13–15 designing the study plan 9–12 diabetes Dermotel response in diabetic ulcers (example) 307–312 gastroparesis symptom relief (example) 331–335 Hyalurise in vitreous hemorrhage (example) 370–375 triglyceride changes for glycemic control (example) 203–214 diaphoresis following cardiac medication (example) 160–164
456
Index
DISCRETE option 379 discrete probability distribution 5 dispersion 14 DIST option, MODEL statement (GENMOD) 147 distributions used in statistical inference 393–398 double-blind studies 11 dropout rates for four dose groups (example) 274–277 DUNNETT option, MEANS statement (GLM) 88 Dunnett’s test 82, 88, 426–428 Durbin-Watson test 197 DW option, MODEL statement (REG) 197
FREQ procedure See also TABLES statement, FREQ procedure Bowker’s test 304 chi-square test 268, 275, 288 Cochran-Armitage test 282 Cochran-Mantel-Haenszel test 272, 274, 277, 316 EXACT statement 256 Fisher’s exact test 288, 291 McNemar’s test 297 TRIAL data set 46, 47, 51–52 WEIGHT statement 268, 288 Friedman’s test 256
E EDF option, NPAR1WAY procedure 245 EFRON option 379 empty cells 416 erectile dysfunction after surgery (example) 336–340 estimable functions SAS Type I 413 SAS Type II 414 SAS Type III 415 events, defined 5 EXACT option 279, 291 EXACT statement FREQ procedure 256 NPAR1WAY procedure 253 exploratory analyses, defined 3–4
F F-distribution notation 393 shape of 397, 398 F-tests 73–75 See also crossover design one-way ANOVA 78, 86, 87, 89–90 repeated measures analysis, univariate approach (example) 117 TEST statement, GLM procedure 110 two-way ANOVA 93, 96–97 factorial, defined 258 FEV1 changes (example) 68–73 FINV function 74 FISHER option, TABLES statement (FREQ) 288, 291 Fisher’s exact test 280, 285–291 fixed effects in ANOVA models 109–110 Freeman-Halton test 291
G GAD (general anxiety disorder), HAM-A scores (example) 80–86 gastroparesis symptom relief (example) 331–335 GEE analysis 147–148, 154, 155 Gehan test 361 generalized estimating equations (GEE analysis) 147–148, 154, 155 generalized Fisher’s exact text 291 generalized Wilcoxon test 361 genital wart cure rate (example) 259–261 GENMOD procedure 147–148, 155, 344 MODEL statement 147 repeated measures ANOVA, multivariate approach 147–148 TYPE3 option 153 WALD option 153 GLM procedure See also LSMEANS statement, GLM procedure See also MODEL statement, GLM procedure See also REPEATED statement, GLM procedure ANCOVA (analysis of covariance), examples 209–221 CLASS statement 97 CONTRAST statement 88 crossover design (example) 163–164, 165–170 linear regression analysis 181 one-way ANOVA (example) 82–83, 85–86 RANDOM statement 110, 149 repeated measures analysis, missing data (example) 140–143 repeated measures analysis, multivariate approach (example) 122–136 repeated measures analysis, univariate approach (example) 119–121
Index TEST statement 110 three-way ANOVA 109 two-way ANOVA (example) 97 global evaluations of Seroxatene (example) 239-243 glycemic control, triglyceride changes for (example) 203–214 Greenhouse-Geiser values 141, 150 group sequential methods 33 group variance, homogeneity of 78
H HAM-A scores in GAD (example) 80–86 hazard functions See Cox proportional hazards model hemoglobin changes in anemia (example) 94–102 histogram 13 HOC option, MULTTEST procedure 429 HOLM option, MULTTEST procedure 429 homogeneity of group variance 78 Hotelling-Lawley Trace test 130, 140, 150 HSV-2 episodes (example) 351–359, 367–370 Hyalurise in vitreous hemorrhage (example) 370–375 hypertension study (example) 214–220 hypothesis testing 18–21, 23–38 See also one-tailed hypothesis tests See also two-tailed hypothesis tests multiple hypotheses testing 30–37 p-value 27 sample size 27–29 significance levels 21, 23–24 test power 25–26
I inferences 2–3, 9–12 inferential methods 3 inferential statistics 16–21 interaction effects, two-way ANOVA 102 intercourse success rates after surgery (example) 336–340 interim analyses 33–37
J Jonckheere-Terpstra test 31, 255–256 JT option, TABLES statement (FREQ) 256
457
K Kaplan-Meier estimates 361–362 Kolmogorov-Smirnov test 76, 87, 245 Kruskal-Wallis test 247–256 chi-square test vs. 284 one-way ANOVA 87 psoriasis evaluation (example) 249–253
L LACKFIT option, MODEL statement (LOGISTIC) 341 Lan-DeMets a-spending function 36–37 last-observation-carried-forward (LOCF) 152 least squares criterion 177, 199, 217 least squares mean 405–407 LIFEREG procedure 376 LIFETEST procedure 355–359 PHREG procedure vs. 363 PLOTS= option 361, 362, 379 STRATA statement 355, 363 TEST statement 363 likelihood ratio test 361 limiting conditions for populations 2 linear contrasts 88 linear regression analysis 175–198 anti-anginal response vs. disease history (example) 178–184 coefficient of determination 187, 194 correlation coefficient 193–195 covariate effect 193 GLM procedure 181 mean square 193 sum of squares 193, 195 sum of squares for error 180, 195 zero regression slope 193 LOCF (last-observation-carried-forward) 152 log-rank test 349–363 chi-square test vs. 360, 361 Cochran-Mantel-Haenszel test vs. 360 Cox proportional hazards model vs. 368 generalized Wilcoxon test vs. 361 HSV-2 episodes (example) 351–359 likelihood ratio test vs. 361 Wilcoxon rank-sum test vs. 349–350 log transformations 197, 437–439 LOGISTIC procedure 322 See also MODEL statement, LOGISTIC procedure clustered binomial data 336–340
458
Index
LOGISTIC procedure (continued) DESCENDING option 325, 332 gastroparesis symptom relief (example) 331–335 OUTPUT statement 345–346, 347 PARAM= option 332 relapse rate adjusted for AML remission (example) 322–329 logistic regression 317–348 CATMOD procedure for 344 clustered binomial data 336–340 gastroparesis symptom relief (example) 331–335 intercourse success rates after surgery (example) 336–340 logit model, multiple covariates 321–322 logit model, one covariate 317–319 model estimation 320 multinomial responses 330–335 odds ratio 319 relapse rate adjusted for AML remission (example) 322–329 weighted-least squares 344 with nominal covariates 343 logit model multiple covariates 321–322 one covariate 317–319 longitudinal data 111 LS-mean 405–407 LSMEANS statement, GLM procedure ADJUST= option 425, 427 ANCOVA (analysis of covariance) 209 PDIFF option 98, 108 two-way ANOVA 98, 108
M Mann-Whitney U-test 244 Mantel-Haenszel chi-square test 275, 282 MAR (missing-at-random) values 152 marginal homogeneity (Stuart-Maxwell test) 298–300, 303 masked randomization 11 Mauchley’s Criterion 122–123 maximum likelihood method 320, 366 MCAR (missing values completely at random) 152 McNemar’s test 293–304 binomial test vs. 301–302 continuity correction 301 FREQ procedure 297–298 sign test and 302 trinomial response 298–300
mean square error See MSE mean square for group effect (MSG) 79 repeated measures analysis 113 means, multiple comparisons of 421–431 MEANS procedure one-sample t-test 59, 62 TRIAL data set 45, 47 two-sample t-test 70 MEANS statement (GLM) DUNNETT option 88 one-way ANOVA 82 two-way ANOVA 97 memory function (example) 103–107 methodology bias 12 mid-ranks 283 missing-at-random (MAR) values 152 missing data, repeated measures analysis with 136–147, 151–153 GENMOD procedure 147–148 GLM procedure 140–142 MIXED procedure 136, 143–147, 152, 153–154 sphericity test 122–123, 140, 150 missing values completely at random (MCAR) 152 MIXED procedure 152, 153–154 ANCOVA (analysis of covariance), example 224 CLASS statement 143 Handling missing data 136, 152 MODEL statement 143 repeated measures analysis (examples) 137–140, 143–147 REPEATED statement 143, 153 RCORR option 153 SUBJECT option 143 TYPE= option 143, 153 unbalanced mixed models 110 model parameters 14 MODEL statement, GENMOD procedure 147 MODEL statement, GLM procedure 181–182 ANCOVA (analysis of covariance), example 209–213, 215–220, 224 CLM option 182, 196 crossover design 163, 165–170, 171–172 NOUNI option 129 one-way ANOVA 82 P option 182, 196 repeated measures analysis, multivariate approach 122, 129 repeated measures analysis, univariate approach 119
Index SOLUTION option 209, 216 two-way ANOVA 97 MODEL statement, LOGISTIC procedure AGGREGATE option 341 LACKFIT option 341 RISKLIMITS option 342 SCALE option 338, 340 MODEL statement, MIXED procedure 143 MODEL statement, PHREG procedure 368, 378 MODEL statement, REG procedure 185, 187 COLLINOINT option 197 DW option 197 SS1 and SS2 options 195 VIF option 197–198 modified ridit scores 282–283, 383–384, 387 MS (mean square) ANCOVA (analysis of covariance) 202, 206 crossover design 159 two-way ANOVA 92 MSE (mean square error) ANCOVA (analysis of covariance), 206, 216 one-way ANOVA 79, 81, 85 repeated measures analysis 114, 151 repeated measures analysis, univariate approach 119 two-way ANOVA 92–93, 107, 108, 109 MSG (mean square for group effect) one-way ANOVA 79 repeated measures analysis 113 multi-center hypertension study (example) 214–220 multinomial responses chi-square test 276–278 logistic regression analysis 330–335 multiple comparisons 31–32, 421–436 one-way ANOVA (example) 87–89 pairwise t-tests for 97 multiple hypotheses testing 30–37 interim analyses 33–37 multiple linear regression 184–192, 193, 195, 197 coefficient of determination 194 REG procedure 185, 187 sum of squares for error 195 symptomatic recovery in pediatric dehydration (example) 185–192 Types I, II, III sum of squares 194 multiple response variables 32–33 multivariate approach, repeated measures ANOVA 122–136 Alzheimer’s progression (example) 140–143
459
missing data 136–147, 151–153 missing data, GENMOD procedure 147–148 missing data, GLM procedure 140–143 missing data, MIXED procedure 144–147 treadmill walking distance (example) 127–136 MULTTEST procedure BONFERRONI option 430 HOC option 430 HOLM option 430 p-value adjustment methods 430 permutation resampling 433–436 SIDAK option 430
N NOFREQ option, TABLES statement (FREQ) 309 noise reduction by blocking 401–405 nominal variables 270 non-parametric analog Friedman’s test 256 Kruskal-Wallis test 247–256 Wilcoxon rank-sum test 237–245 Wilcoxon signed-rank test 227–235 non-zero correlation, chi-square test for 281 non-zero slope 221 NOPERCENT option, TABLES statement (FREQ) 309 normal approximation 260–263 normal distribution 6 notation 393 probabilities of (table) 390 shape of 397, 398 NORMAL option, UNIVARIATE procedure 76, 231 normalizing data 437–439 NOROW option, TABLES statement (FREQ) 309 NOUNI option, MODEL statement (GLM) 129 NPAR1WAY procedure EDF option 245 EXACT statement 253 WILCOXON option 242, 245, 252
O O’Brien-Fleming method 35 odds ratio (OR) for covariate 319, 341 confidence intervals 341 logistic regression analysis 319 relapse rate adjusted for AML remission (example) 333 ODS statement 155–156
460
Index
one-sample t-test 55–65 body-mass index (example) 56–60 MEANS procedure 59, 62 non-parametric analog (Wilcoxon signed-rank test) 227–235 paired-difference in weight loss (example) 60–63 sample size determination 28 TTEST procedure 58–60, 62–63 one-sample Z-test 28 one-tailed tests 26 binomial test 263 chi-square test 281 two-tailed tests vs. 64 one-way ANOVA 77–90 confidence intervals 89–90 Dunnett’s test 88 F-distribution 78, 79, 81, 86, 90 F-test 82, 87, 89, 400 GLM procedure (example) 82–86 HAM-A scores in GAD (example) 80–86 Kolmogorov-Smirnov test 87 Kruskal-Wallis test 87, 247–256 mean square error (MSE) 78, 79 mean square group (MSG) 78, 79 MEANS procedure 82 MEANS statement (GLM) 82 MODEL statement (GLM) 82 multiple comparison of treatment effects 87, 89 noise reduction by blocking 401–405 non-parametric analog (Kruskal-Wallis test) 247–256 on ranked data 254 p-value 90 SAS Types I-IV sum of squares 90 Shapiro-Wilk test 87 within- vs. between-group variation 399–401 ordinal variables 270 outliers 47 OUTPUT statement, LOGISTIC procedure 345–346 overdispersion, correcting for 336–340
P P option, MODEL statement (GLM) 182, 196 p-values 27 adjustment methods 428–431 one-way ANOVA 90 paired-difference in weight loss (example) 60–63 paired-difference t-test 56
weight loss example 60–63 Wilcoxon signed-rank test vs. 229 PAIRED statement, TTEST procedure 62 pairwise t-tests 87–88, 422–424 for multiple comparisons 97–98 two-way ANOVA 99 parallel-group study 10 PARAM= option, LOGISTIC procedure 332 partial correlation coefficients 195 PDIFF option, LSMEANS statement (GLM) 98, 108 Pearson-product-moment correlation coefficient 193 pediatric dehydration, symptomatic recovery in (example) 185–192 permutation resampling (example) 433–436 PHREG procedure 366, 377, 378–379 HSV-2 episodes (example) 367–369 Hyalurise in Vitreons Hemorrhage (example) 370–375 LIFETEST procedure vs. 363 MODEL statement 368, 378 STRATA statement 372–373, 379 Pillai’s Trace test 130, 140, 150 placebo effect 11 placebo vs. active comparisons of degree of response (example) 277–278 PLOTS= option, LIFETEST procedure 361–362, 379 Pocock’s method 34 point estimates 17 pooled variance 68, 69 populations characterizing 3 defined 2 limiting conditions for 2 power (Type II error) 25–26 power curve, for Z-test 26 principal component analysis 198 PRINTE option, REPEATED statement 122, 129, 140 probability distributions 4–9, 390 probability tables 390–392 PROBBNML function 258, 263 PROBCHI function 249, 252, 268 PROBF function 74 probit transformation 342 PROBMC function 425 PROBNORM function 263
Index PROFILE option, REPEATED statement (GLM) 150 proportional hazards See Cox proportional hazards model proportional odds assumption 330, 344 PRT option, MEANS procedure 59, 62, 70 psoriasis evaluation (example) 249–253
R random effects in ANOVA models 109–110 RANDOM statement, GLM procedure 110, 149 random variables 4–5 notation 393 properties 394 results 395–396 randomization 10–12 randomized block design 91–92 See also two-way ANOVA RANK procedure 244 ranked data one-way ANOVA on 254 two-sample t-test on 244 RCORR option, REPEATED statement (MIXED) 153 REG procedure 185–192, 198 CORR option 187 MODEL statement 185, 187, 196, 197 multiple linear regression analysis 185 regression analysis See linear regression analysis See multiple linear regression relapse rate adjusted for AML remission (example) 322–329 relative risk ratio 341 repeated measures analysis 111–156 arthritic discomfort (example) 115–127 CATMOD procedure 304 crossover design 155 disease progression in Alzheimer’s (example) 137–148 GEE analysis 147–148, 155 mean square error 151 mean square for group effect 113 missing data 136–147, 151–153 multivariate approach 122–136, 150, 152 sum of squares for error 149 treadmill walking distance (example) 127–136 univariate approach 112–121, 150, 151 weighted-least-squares 155
461
within- vs. between-group variation 399–401 repeated measures analysis, with missing data 136–147, 151–153 GENMOD procedure 147–148 GLM procedure 136, 140–142 MIXED procedure 136–139, 143–147 sphericity test 140 REPEATED statement, GLM procedure CONTRAST(1) option 131, 150 PROFILE option 150 repeated measures analysis, multivariate approach 122 repeated measures analysis, missing data 140 SUMMARY option 150, 151 REPEATED statement, MIXED procedure RCORR option 153 SUBJECT option 143 repeated measures, missing data (example) 143–147 resampling methods 433–436 residuals, balanced 172 response variables, multiple 32–33 ridit See modified ridit scores RISKLIMITS option, MODEL statement (LOGISTIC) 342 risk ratio 341, 378 Rose-Bengal staining in KCS (example) 228–233 Roy’s Greatest Root test 130, 150
S samples characterizing populations from 3 defined 2 resampling methods 433–436 sample size 27–29 sampling theory 3 stratified sampling 3 Satterthwaite adjustment 74–75 SC (Schwartz Criterion) 340 SCALE= option, MODEL statement (LOGISTIC) 338, 340 Schwartz Criterion (SC) 340 SCORES= option, TABLES statement (FREQ) 283 Seroxatene evaluations (example) 239–243 Shapiro-Wilk test 76, 231 one-way ANOVA 87 Sidak method 428–431
462
Index
SIDAK option, MULTTEST procedure 430, 432 sign test 263, 302 significance levels 21, 23–24 simple linear regression analysis See linear regression analysis simple random sample, defined 2 single-blind studies 11 SOLUTION option, MODEL statement (GLM) 209, 216 sphericity test 122–123, 129, 150 repeated measures analysis 140 square root transformation 197, 347 SS (sum of squares) ANCOVA (analysis of covariance) 201, 205 crossover design 159, 161–162, 172 empty cells in SS calculations 416, 419 linear regression analysis 195 repeated measures analysis, univariate approach 116–117 SAS Types I, II, III, IV for ANOVA 97, 409–420 two-way ANOVA 92–93, 96 SS1 and SS2 options, MODEL statement (REG) 195 SS3 option, MODEL statement (GLM) crossover design 163 repeated measures analysis, univariate approach (example) 119 two-way ANOVA 97 SSE (sum of squares for error) 79, 177 ANCOVA (analysis of covariance) 200, 205 crossover design 162 linear regression analysis 180 multiple regression analysis 195 repeated measures analysis 117, 149 two-way ANOVA 96, 109 SSG (sum of squares for groups) 79 standard normal distribution, probabilities (table) 390 statistical design consideration 9–12 statistical inferences 10–12 distributions used in 393–398 statistical methods, selecting 12 statistics, defined 1 step-down Bonferroni method 428, 429 step-up Bonferroni method 428, 430 STRATA statement LIFETEST procedure 355 PHREG procedure 372–373, 379 stratified sampling 3
Stuart-Maxwell test 298–300, 303 See also McNemar’s test Student t-distribution critical values (table) 391 one-sample t-test 55–56 notation 393 shape of 398 two-sample t-test 67 study plan design 9–12 SUBJECT option, REPEATED statement (MIXED) 143 sum of squares See SS sum of squares for error See SSE sum of squares for groups (SSG) 79 SUMMARY option, REPEATED statement (GLM) 150 survival analysis 36, 350, 366 Cox proportional hazards model 365 log-rank test 349 symptom frequency, before and after treatment (example) 300 symptom relief, in gastroparesis (example) 331–335 symptomatic recovery in pediatric dehydration (example) 185–192
T T option, MEANS procedure 59, 62, 70 T option, MEANS statement (GLM) 88, 97 t-tests multiple response variables 32 paired difference 56, 62 pairwise 87–88, 97, 422–424 sample size determination 28–29 Satterthwaite adjustment 74–75 Wilcoxon signed-rank test vs. 229, 234, 235 Z-test vs. 64 t-tests, one-sample 55–65 body-mass index (example) 56–60 MEANS procedure 59 non-parametric analog (Wilcoxon signed-rank test) 227–235 paired-difference in weight loss (example) 60–63 sample size determination 28 Student t-distribution 55–56 TTEST procedure 58–60, 62 t-tests, two-sample 67–76
Index FEV1 changes (example) 68–73 MEANS procedure 70 non-parametric analog (Wilcoxon rank-sum test) 237–245 on ranked data 244 one-way ANOVA 87–88 sample size determination 28–29 Student t-distritbution 67 TTEST procedure 70, 244 Wilcoxon rank-sum test vs. 244–245 within-group tests vs. 76 TABLES statement, FREQ procedure AGREE option 297, 304 CHISQ option 268, 275, 288 CMH option 272, 274, 277, 316 EXACT option 291 FISHER option 288, 291 JT option 256 NOFREQ option 309 NOPERCENT option 309 NOROW option 309 SCORES option 278, 283 TREND option 282 test criterion 20–21 test power 25–26 TEST statement GLM procedure 110 LIFETEST procedure 362–363 test statistic 20–21 three-way ANOVA 109 time series analysis, autocorrelation in 197 treadmill walking distance (example) 127–136 treatment, symptom frequency before and after (example) 300 treatment effects See multiple comparisons TREND option, TABLES statement (FREQ) 282 TRIAL data set 39–54 statistical summary 45–53 triglyceride changes for glycemic control (example) 203–213 trinomial response 298–299 triple-blind studies 12 TTEST procedure 58–63 CI= option 70 one-sample t-test 58–60, 62, 70 PAIRED statement 62 two-sample t-test 70 two-sample t-test on ranked data 244 using with SAS 6.12 and earlier 59, 62
463
VAR statement 62 Tukey-Kramer method 424–426 two-sample comparison of proportions 29 two-sample t-test See t-test, two-sample two-sample Z-tests 28–29 two-tailed Z- tests 27 two-tailed tests 26–27, 235, 262 chi-square test 281 one-tailed tests vs. 64 two-tailed Z-test 27 two-way ANOVA 91–110 F-test 93, 96–97 F-values 92, 97 fixed effects 109–110 GLM procedure 97, 98, 109 handling empty cells 416 hemoglobin changes in anemia (example) 94–102 interaction effects (graphs) 102 mean square 92 mean square error 92–93, 107, 108, 109, 403 MEANS procedure 97 memory function (example) 103–107 noise reduction by blocking 403 non-parametric analog (Friedman’s test) 256 random effects 109–110 SAS Types I, II, III, IV sum of squares 97, 104, 108 sum of squares 93, 96 sum of squares for error 96, 109 within- vs. between-group variation 399–401 two-way crossover design 155, 157–173 See also repeated measures analysis antibiotic blood levels following aerosol inhalation (example) 165–170 diaphoresis following cardiac medication (example) 160–164 Type I error 23 Type I sum of squares 409–420 ANCOVA (analysis of covariance), 216, 224 estimable functions 412–413 multiple regression analysis 195 one-way ANOVA 90 two-way ANOVA 97, 108 Type II error 25 Type II sum of squares 409–420 estimable functions 413–414 one-way ANOVA 90 two-way ANOVA 97, 108
464
Index
Type III sum of squares 409–420 ANCOVA (analysis of covariance) 216, 224, 225 estimable functions 414–415, 418 handling empty cells 416, 419 more than two treatment groups 417–419 multiple regression analysis 195 one-way ANOVA 90 repeated measures analysis, 121 two-way ANOVA 97, 104, 108 Type IV sum of squares 409–420 estimable functions 415 handling empty cells 416, 419 more than two treatment groups 419 one-way ANOVA 90 two-way ANOVA 97 TYPE= option, MIXED procedure 143, 153 TYPE3 option, MODEL statement (GENMOD) 147, 153 REPEATED statement 143
U unbalanced layout (example) 103–107 unbiased comparisons 10–12 uniform distribution 7–8 univariate approach to repeated measures analysis 112–121 arthritic discomfort (example) 115–121 UNIVARIATE procedure NORMAL option 76, 231 sign test 263 test for normality 64 TRIAL data set 45, 48–49 Wilcoxon signed-rank test 231
V VAR statement, TTEST procedure 62 variables multiple response variables 32–33 nominal variables 270 ordinal variables 270–271 random variables 4, 5 variance homogeneity 78 variance inflation factor (VIF) 197–198 VIF option, MODEL statement (REG) 197 vitreous hemorrhage, Hyalurise in (example) 370–375
W WALD chi-square test 373 WALD option, GENMOD procedure 153 walking distance on treadmill (example) 127–136 weight loss, paired-difference in (example) 60–63 WEIGHT statement, FREQ procedure 268, 288 weighted-least-squares (WLS) 155, 344 WILCOXON option, NPAR1WAY procedure 242, 252 Wilcoxon rank-sum test 237–245 log-rank test vs. 349–350 Wilcoxon signed-rank test 64, 227–235 paired-difference t-test 229, 231 Rose-Bengal staining in KCS (example) 228–233 Seroxatene evaluations (example) 239–243 UNIVARIATE procedure 231 Z-test vs. 234 Wilcoxon test, generalized 361 Wilks’ Lambda test 130, 140, 150 Williams’ method 338 within-group tests instead of two-sample t-tests 76 within- vs. between-group variation 399–401 within-patient control study 10 WLS (weighted-least-squares) 155, 344
Z Z-test 24 comparing binomial proportions 279, 280 multiple response variables 32 one-tailed hypothesis test 26, 27 one-sample 28 power curve for 26 sample size determination 27–29 t-test vs. 64 two-sample 28–29 two-tailed hypothesis test 27 Wilcoxon signed-rank test vs. 234 zero regression slope 193
Numbers 2-period crossover design 171–172 3-period crossover design 171 4-period crossover design 165–170
Call your local SAS office to order these books from
Books by Users Press
Annotate: Simply the Basics by Art Carpenter .......................................Order No. A57320
Applied Multivariate Statistics with SAS® Software, Second Edition by Ravindra Khattree and Dayanand N. Naik..............................Order No. A56903
Applied Statistics and the SAS Programming Language, Fourth Edition ®
by Ronald P. Cody and Jeffrey K. Smith.................................Order No. A55984
An Array of Challenges — Test Your SAS ® Skills by Robert Virgile.......................................Order No. A55625
Beyond the Obvious with SAS Screen Control Language ®
by Don Stanley .........................................Order No. A55073
Carpenter’s Complete Guide to the SAS® Macro Language
A Handbook of Statistical Analyses Using SAS®, Second Edition by B.S. Everitt and G. Der .................................................Order No. A58679
Health Care Data and the SAS® System by Marge Scerbo, Craig Dickstein, and Alan Wilson .......................................Order No. A57638
The How-To Book for SAS/GRAPH ® Software by Thomas Miron .....................................Order No. A55203
In the Know ... SAS ® Tips and Techniques From Around the Globe by Phil Mason ..........................................Order No. A55513 Integrating Results through Meta-Analytic Review Using SAS® Software by Morgan C. Wang and Brad J. Bushman .............................Order No. A55810
by Art Carpenter .......................................Order No. A56100
Learning SAS ® in the Computer Lab, Second Edition The Cartoon Guide to Statistics by Larry Gonick and Woollcott Smith.................................Order No. A55153
Categorical Data Analysis Using the SAS ® System, Second Edition by Maura E. Stokes, Charles S. Davis, and Gary G. Koch .....................................Order No. A57998
Client/Server Survival Guide, Third Edition by Robert Orfali, Dan Harkey, and Jeri Edwards......................................Order No. A58099
Cody’s Data Cleaning Techniques Using SAS® Software by Ron Cody........................................Order No. A57198 Common Statistical Methods for Clinical Research with SAS ® Examples, Second Edition by Glenn A. Walker...................................Order No. A58086
Concepts and Case Studies in Data Management by William S. Calvert and J. Meimei Ma......................................Order No. A55220
Debugging SAS ® Programs: A Handbook of Tools and Techniques by Michele M. Burlew ...............................Order No. A57743
Efficiency: Improving the Performance of Your SAS ® Applications by Robert Virgile.......................................Order No. A55960
Extending SAS ® Survival Analysis Techniques for Medical Research by Alan Cantor..........................................Order No. A55504
by Rebecca J. Elliott ................................Order No. A57739 The Little SAS ® Book: A Primer by Lora D. Delwiche and Susan J. Slaughter ...........................Order No. A55200 The Little SAS ® Book: A Primer, Second Edition by Lora D. Delwiche and Susan J. Slaughter ...........................Order No. A56649 (updated to include Version 7 features) Logistic Regression Using the SAS® System: Theory and Application by Paul D. Allison ....................................Order No. A55770 Longitudinal Data and SAS®: A Programmer’s Guide by Ron Cody ............................................Order No. A58176 Maps Made Easy Using SAS® by Mike Zdeb ............................................Order No. A57495 Models for Discrete Data by Daniel Zelterman ................................Order No. A57521
Multiple Comparisons and Multiple Tests Text and Workbook Set (books in this set also sold separately) by Peter H. Westfall, Randall D. Tobias, Dror Rom, Russell D. Wolfinger, and Yosef Hochberg ................................Order No. A58274 Multiple-Plot Displays: Simplified with Macros by Perry Watts .........................................Order No. A58314
Multivariate Data Reduction and Discrimination with SAS ® Software by Ravindra Khattree and Dayanand N. Naik..............................Order No. A56902
www.sas.com/pubs
The Next Step: Integrating the Software Life Cycle with SAS ® Programming
SAS ® Software Roadmaps: Your Guide to Discovering the SAS ® System
by Paul Gill ...............................................Order No. A55697
by Laurie Burch and SherriJoyce King ..............................Order No. A56195
Output Delivery System: The Basics by Lauren E. Haworth ..............................Order No. A58087
SAS ® Software Solutions: Basic Data Processing by Thomas Miron......................................Order No. A56196
Painless Windows: A Handbook for SAS ® Users by Jodie Gilmore ......................................Order No. A55769 (for Windows NT and Windows 95)
Painless Windows: A Handbook for SAS ® Users, Second Edition by Jodie Gilmore ......................................Order No. A56647 (updated to include Version 7 features)
PROC TABULATE by Example by Lauren E. Haworth ..............................Order No. A56514
Professional SAS ® Programmers Pocket Reference, Second Edition by Rick Aster ............................................Order No. A56646
Professional SAS ® Programmers Pocket Reference, Third Edition by Rick Aster ............................................Order No. A58128
Programming Techniques for Object-Based Statistical Analysis with SAS® Software
SAS ® System for Elementary Statistical Analysis, Second Edition by Sandra D. Schlotzhauer and Ramon C. Littell.................................Order No. A55172
SAS ® System for Forecasting Time Series, 1986 Edition by John C. Brocklebank and David A. Dickey ...................................Order No. A5612
SAS ® System for Linear Models, Third Edition by Ramon C. Littell, Rudolf J. Freund, and Philip C. Spector ...............................Order No. A56140
SAS ® System for Mixed Models by Ramon C. Littell, George A. Milliken, Walter W. Stroup, and Russell D. Wolfinger .........................Order No. A55235
SAS ® System for Regression, Third Edition by Rudolf J. Freund and Ramon C. Littell.................................Order No. A57313
by Tanya Kolosova and Samuel Berestizhevsky ....................Order No. A55869
SAS ® System for Statistical Graphics, First Edition
Quick Results with SAS/GRAPH ® Software
The SAS ® Workbook and Solutions Set (books in this set also sold separately)
by Arthur L. Carpenter and Charles E. Shipp ...............................Order No. A55127
Quick Start to Data Analysis with SAS ® by Frank C. Dilorio and Kenneth A. Hardy..............................Order No. A55550
Regression and ANOVA: An Integrated Approach Using SAS ® Software
by Michael Friendly ..................................Order No. A56143
by Ron Cody .............................................Order No. A55594
Selecting Statistical Techniques for Social Science Data: A Guide for SAS® Users by Frank M. Andrews, Laura Klem, Patrick M. O’Malley, Willard L. Rodgers, Kathleen B. Welch, and Terrence N. Davidson .......................Order No. A55854
by Keith E. Muller and Bethel A. Fetterman ..........................Order No. A57559
Solutions for Your GUI Applications Development Using SAS/AF ® FRAME Technology
Reporting from the Field: SAS ® Software Experts Present Real-World Report-Writing
Statistical Quality Control Using the SAS ® System
Applications .............................................Order No. A55135
by Dennis W. King....................................Order No. A55232
SAS ®Applications Programming: A Gentle Introduction
A Step-by-Step Approach to Using the SAS ® System for Factor Analysis and Structural Equation Modeling
by Frank C. Dilorio ...................................Order No. A56193
by Don Stanley .........................................Order No. A55811
by Larry Hatcher.......................................Order No. A55129
SAS ® for Linear Models, Fourth Edition by Ramon C. Littell, Walter W. Stroup, and Rudolf J. Freund ...............................Order No. A56655
A Step-by-Step Approach to Using the SAS ® System for Univariate and Multivariate Statistics
SAS ® Macro Programming Made Easy
by Larry Hatcher and Edward Stepanski .............................Order No. A55072
by Michele M. Burlew ...............................Order No. A56516
SAS ® Programming by Example by Ron Cody and Ray Pass ............................................Order No. A55126
SAS ® Programming for Researchers and Social Scientists, Second Edition by Paul E. Spector....................................Order No. A58784
www.sas.com/pubs
Strategic Data Warehousing Principles Using SAS ® Software by Peter R. Welbrock ...............................Order No. A56278
Survival Analysis Using the SAS ® System: A Practical Guide by Paul D. Allison .....................................Order No. A55233
Table-Driven Strategies for Rapid SAS ® Applications Development by Tanya Kolosova and Samuel Berestizhevsky ....................Order No. A55198
Tuning SAS ® Applications in the MVS Environment by Michael A. Raithel ...............................Order No. A55231
Univariate and Multivariate General Linear Models: Theory and Applications Using SAS ® Software by Neil H. Timm and Tammy A. Mieczkowski ....................Order No. A55809
Using SAS ® in Financial Research by Ekkehart Boehmer, John Paul Broussard, and Juha-Pekka Kallunki .........................Order No. A57601
Using the SAS ® Windowing Environment: A Quick Tutorial by Larry Hatcher.......................................Order No. A57201
Visualizing Categorical Data by Michael Friendly ..................................Order No. A56571
Working with the SAS ® System by Erik W. Tilanus ....................................Order No. A55190
Your Guide to Survey Research Using the SAS® System by Archer Gravely ....................................Order No. A55688
JMP® Books Basic Business Statistics: A Casebook by Dean P. Foster, Robert A. Stine, and Richard P. Waterman........................Order No. A56813
Business Analysis Using Regression: A Casebook by Dean P. Foster, Robert A. Stine, and Richard P. Waterman........................Order No. A56818
JMP® Start Statistics, Version 3 by John Sall, Ann Lehman, and Leigh Creighton ................................Order No. A55626
www.sas.com/pubs
Welcome * Bienvenue * Willkommen * Yohkoso * Bienvenido
SAS Publishing Is Easy to Reach Visit our Web page located at www.sas.com/pubs You will find product and service details, including •
companion Web sites
•
sample chapters
•
tables of contents
•
author biographies
•
book reviews
Learn about • • •
regional user-group conferences trade-show sites and dates authoring opportunities
•
e-books
Explore all the services that SAS Publishing has to offer! Your Listserv Subscription Automatically Brings the News to You Do you want to be among the first to learn about the latest books and services available from SAS Publishing? Subscribe to our listserv newdocnews-l and, once each month, you will automatically receive a description of the newest books and which environments or operating systems and SAS ® release(s) each book addresses. To subscribe,
1.
Send an e-mail message to
[email protected].
2.
Leave the “Subject” line blank.
3.
Use the following text for your message: subscribe NEWDOCNEWS-L your-first-name your-last-name For example: subscribe NEWDOCNEWS-L John Doe
You’re Invited to Publish with SAS Institute’s Books by Users Press If you enjoy writing about SAS software and how to use it, the Books by Users program at SAS Institute offers a variety of publishing options. We are actively recruiting authors to publish books and sample code. If you find the idea of writing a book by yourself a little intimidating, consider writing with a co-author. Keep in mind that you will receive complete editorial and publishing support, access to our users, technical advice and assistance, and competitive royalties. Please ask us for an author packet at
[email protected] or call 919-531-7447. See the Books by Users Web page at www.sas.com/bbu for complete information.
Book Discount Offered at SAS Public Training Courses! When you attend one of our SAS Public Training Courses at any of our regional Training Centers in the U.S., you will receive a 20% discount on book orders that you place during the course.Take advantage of this offer at the next course you attend!
SAS Institute Inc. SAS Campus Drive Cary, NC 27513-2414 Fax 919-677-4444
E-mail:
[email protected] Web page: www.sas.com/pubs To order books, call SAS Publishing Sales at 800-727-3228* For other SAS business, call 919-677-8000*
* Note: Customers outside the U.S. should contact their local SAS office.