A General Model for Multivariate Analysis
INTERNATIONAL SERIES IN DECISION PROCESSES
INGRAM OLKIN, Consulting Editor

Statistical Theory of Reliability and Life Testing: Probability Models, R. E. Barlow and F. Proschan
Probability Theory and Elements of Measure Theory, H. Bauer
Time Series, R. Brillinger
Decision Analysis for Business, R. Brown
Probability and Statistics for Decision Making, Ya-lun Chou
A Guide to Probability Theory and Application, C. Derman, L. J. Gleser, and I. Olkin
Introduction to Statistics and Probability, E. Dudewicz
A General Model for Multivariate Analysis, J. D. Finn
Statistics: Probability, Inference, and Decision, 2d ed., W. L. Hays and R. L. Winkler
Statistics: Probability, Inference, and Decision, Volumes I and II and Combined ed., W. L. Hays and R. L. Winkler
Introduction to Statistics, R. A. Hultquist
Introductory Statistics with FORTRAN, A. Kirch
Reliability Handbook, B. A. Kozlov and I. A. Ushakov (edited by J. T. Rosenblatt and L. H. Koopmans)
An Introduction to Probability, Decision, and Inference, I. H. LaValle
Elements of Probability and Statistics, S. A. Lippman
Modern Mathematical Methods for Economics and Business, R. E. Miller
Applied Multivariate Analysis, S. J. Press
Fundamental Research Statistics for the Behavioral Sciences, J. T. Roscoe
Applied Probability, W. A. Thompson, Jr.
Quantitative Methods and Operations Research for Business, R. E. Trueman
Elementary Statistical Methods, 3d ed., H. M. Walker and J. Lev
An Introduction to Bayesian Inference and Decision, R. L. Winkler

FORTHCOMING TITLES
A Basic Course in Statistics with Sociological Applications, 3d ed., T. R. Anderson and M. Zelditch, Jr.
Fundamentals of Decision Analysis, I. H. LaValle
A General Model for Multivariate Analysis

JEREMY D. FINN
State University of New York at Buffalo

HOLT, RINEHART AND WINSTON, INC.
New York Chicago San Francisco Atlanta Dallas Montreal Toronto London Sydney
Library of Congress Cataloging in Publication Data

Finn, Jeremy D.
A general model for multivariate analysis.
(International series in decision processes)
Bibliography: p. 410
1. Multivariate analysis. 2. Analysis of variance. I. Title.
QA278.F56   519.5'3   74-8629
ISBN 0-03-083239-X

Copyright © 1974 by Holt, Rinehart and Winston, Inc.
All rights reserved
Printed in the United States of America
K, and L
Preface

Scientists faced with the task of analyzing and understanding human behavior are in constant need of models by which they may test hypotheses involving a greater quantity and complexity of behavioral variables. In response to this need, multivariate models for the application of such techniques as analysis of variance, regression analysis, and analysis of covariance are gaining in accessibility and usage.

Multivariate analysis may be conceptualized in two ways. First, it is a means of analyzing behavioral phenomena. It is based upon the realization that hardly any form of human behavior worthy of study has only a single facet; that behind any measurable trait are components that covary only partially; that a "better" scientific description of any behavior is derived through some degree of finer analysis. Further, no observable behavior results from a single antecedent. The "principle of multiple causes" is one we confront in all except the smallest analytic units (for example, the "one-gene, one-enzyme theory" of genetic action).

The second conceptualization of multivariate analysis is fitting a set of algebraic models to situations with multiple random variables, usually criterion or outcome variables, which are measures of the same sample(s) of subjects. Behavioral data are often of this form. Intelligence is measured in terms of at least a quantitative and a verbal ability. Following Guilford (1959), creativity is measured by the administration of at least six separate scales. Often two achievement scores, one for speed and one for power, are assigned to a test respondent. The class of experimental designs known as "repeated measures" designs denotes a multivariate situation. In this case each subject is measured on a given scale at two or more points in time or under differing experimental conditions. The evaluation of the attainment of course objectives in either experimental or traditional instructional settings is likely to require multivariate analysis procedures. With multivariate models, the simultaneous consideration of the attainment of several cognitive levels or of both cognitive and noncognitive instructional outcomes is facilitated.

In each instance, the analysis of a single summary measure (for example, a total or average score) will result in the loss of the information conveyed by the individual scales. Statistical analysis of each of a series of measures separately will result in redundancy, which in turn will threaten the validity of the interpretations drawn from the data. Use of the appropriate multivariate model will allow the researcher to retain the multiple scores and to treat them simultaneously, giving appropriate consideration to the correlations among them.
Multivariate techniques comprise two related methodologies, each with its own objectives. The first of these is concerned with the discovery of an underlying structure of response data that have been collected, or of the behaviors they represent. Factor analysis is employed in the attempt to locate and isolate sets of measures with the properties that the tests or scales within a set have relatively high intercorrelations with each other, and that scales of one set have small or zero intercorrelations with those of another set. If sets of variables that contribute very little to discrimination among subjects can be identified, they are often ignored or eliminated. In this sense, factor analysis or a simpler technique, component analysis, is used as a data-reduction device. Recent contributions by Bock and Bargmann (1966) and by Jöreskog (1969) allow the researcher to hypothesize, from psychological theory or from prior analysis, a given latent structure underlying a set of measures, and to apply statistical criteria to test the fit of the observed outcomes to the hypothesis.

A second set of multivariate procedures, which constitutes the primary focus of this book, includes multivariate extensions of such commonly used estimation and hypothesis-testing procedures as analysis of variance, analysis of covariance, and regression analysis. Through these methods questions may be answered about the contribution of structured and identifiable independent variables to the explanation of between-individual or between-group variation in one or more criterion measures. Examples of such questions in multivariate form might include: "Does intelligence predict these four achievement measures?" "Are there significant differences between control and experimental groups on speed as well as accuracy of learning, or on four body dimensions?" "Does the mean growth curve of the group administered an experimental drug differ from that of a placebo group?" The dependent or criterion variables generally have nonzero intercorrelations. The model implied by each question is of a form familiar to most behavioral scientists. A primary purpose of this book is to describe the multiple-criterion form of these models, and to provide the computational tools which facilitate data analysis under that form.

A General Model for Multivariate Analysis describes the analysis of quantitative data through application of a "general linear model." Linear estimation and tests of hypotheses are discussed in univariate and multivariate forms for the following techniques:

The summary of raw and transformed multivariable data
Multiple correlation and regression
Canonical correlation
Principal components
Analysis of variance, with equal or unequal subclass frequencies
Analysis of covariance
Discriminant analysis
Step-down analysis

To provide sequence with other statistical materials, the univariate multiple regression model is introduced first, and is discussed in greatest mathematical detail. Multivariate regression and univariate and multivariate analysis of variance are presented as extensions of that basic model. The analysis-of-variance presentation is less detailed and contains more exemplary material. The remaining techniques are viewed as by-products of the formulation of the regression and variance analysis models. Particular emphasis is given to topics that have been inadequately described in current journals and texts in the social sciences. For example, lengthy discussion is devoted to reparameterization in the analysis of variance and to the estimation of parameters in linear models.

Five sample problems are introduced in the first chapter and are described throughout the text as the appropriate analysis techniques are encountered. These are relatively large problems. Hand computation in multivariate analysis is not feasible for any but the most trivial examples. The sample problems were selected instead to exemplify a variety of real design and analysis problems. The analyses for the five samples were originally performed with the MULTIVARIANCE program (Finn, 1972d). The computer input-output listings are provided as an appendix to the text (separately numbered C.1 through C.166), along with a brief version of the MULTIVARIANCE user's manual. The statistical results are transcribed to the earlier chapters as they are discussed.

This book is addressed to users and potential users of multivariate statistical techniques. Readers should have familiarity with univariate statistical theory, to a degree provided by, say, a good one-year course in applied statistical methods. Topics that are especially requisite are estimation and significance testing in fixed-effects analysis of variance; the design and estimation of planned contrasts in analysis-of-variance models; simple univariate regression models and analysis; and the basic concepts of covariance and correlation. The book relies entirely on the statement and formulation of linear models, and some facility with these skills is essential. Knowledge of the algebra of matrices is desirable but not necessary. Those aspects of linear algebra employed in the book are discussed briefly in Chapter 2.

This book may be read in several ways. As a text in multivariate analysis, the organization provides sequence for detailed study of the general linear model and its applications. Supplementary material on matrix algebra plus computer routines for class exercises are recommended. As a reference, the examples may be studied by themselves as illustrations of (a) the data for which multivariate models are appropriate and (b) the presentation and interpretation of the outcomes. Toward this end, study of the computer runs and the respective problem discussions is likely to be especially useful.

This book has been in preparation a long time. In that time there have been many people who have helped in one way or another. I wish to thank them all. In particular, I owe a great deal to Professor Darrell Bock of The University of Chicago. Without his teachings this book, and more, would not have been possible. I wish to thank Professor Ingram Olkin, who has been continually supportive. His reviews and comments have had a major impact on the form of the book. Two students, Kathleen VanEvery and Nancy Breland, have provided useful reviews and suggestions for improvements. Also there are those who have lent their data as examples and are acknowledged in the first chapter, those who have helped in the development of the MULTIVARIANCE program, and many who have made individual suggestions
which are incorporated in the book. Thank you.

Computer time and assistance in running the examples were provided by the Computing Center of the State University of New York at Buffalo. In the preparation of the manuscript, Jeanette Ninas Johnson and the staff at Holt, Rinehart and Winston spent many difficult hours with the material. The manuscript was typed by Jacqueline Rance and Diana Webster. I am glad it is they who have their jobs. I am especially grateful to my wife, Joyce, who sat up many evenings reading and re-reading galley proofs with me. Although her knowledge of statistics increased only a little, her knowledge of Greek has grown immensely.

Stockholm, Sweden
June 1974
Jeremy D. Finn
Contents

Preface

Section I: Introduction

Chapter 1. Multivariate Analysis
  1.1 Perspective
  1.2 The multivariate general linear model
      Application
  1.3 Approach to the general multivariate model
  1.4 Five sample problems
      Sample Problem 1 - Creativity and achievement
      Sample Problem 2 - Word memory experiment
      Sample Problem 3 - Dental calculus reduction
      Sample Problem 4 - Essay grading study
      Sample Problem 5 - Programmed instruction effects

Chapter 2. The Algebra of Matrices
  2.1 Notation
  2.2 Simple matrix operations
      Transposition; Addition, subtraction; Multiplication
  2.3 Scalar functions of matrices
      Rank; Determinant; Trace
  2.4 Matrix factoring and inversion
      Triangular factorization; Inversion; Orthonormalization
  2.5 Matrix derivatives
  2.6 Characteristic roots and vectors
  2.7 Exercises
      Matrices; Problems

Section II: Method

Chapter 3. Summary of Multivariate Data
  3.1 Vector expectations
      Standardization
  3.2 The multivariate normal distribution
  3.3 Samples of multivariate data
      One sample; More than one sample; Note on within-group variances and correlations; Linear combinations of variables
  3.4 Sample problems
      Sample Problem 1 - Creativity and achievement
      Sample Problem 3 - Dental calculus reduction

Chapter 4. Multiple Regression Analysis: Estimation
  4.1 Univariate multiple regression model
  4.2 Estimation of parameters: Univariate model
      Conditions for the estimability of β; Properties of β̂; Estimating dispersions; Some simple cases; Prediction; Summary
  4.3 Multivariate multiple regression model
  4.4 Estimation of parameters: Multivariate model
      Properties of B̂; Estimating dispersions; Some simple cases; Prediction; Summary
  4.5 Computational forms
  4.6 Sample Problem 1 - Creativity and achievement

Chapter 5. Multiple Regression Analysis: Tests of Significance
  5.1 Separating the sources of variation
      Model and error; Subsets of predictor variables; Order of predictors
  5.2 Test criteria
      Hypotheses; Likelihood ratio criterion; Hotelling's T²; Univariate statistics; Step-down analysis; Multiple hypotheses
  5.3 Reestimation
  5.4 Sample Problem 1 - Creativity and achievement

Chapter 6. Correlation
  6.1 Simple correlation
  6.2 Partial correlation
  6.3 Multiple correlation
  6.4 Multiple criteria: Canonical correlation
      Sample Problem 1 - Creativity and achievement
  6.5 Condensing the variates: Principal components
  6.6 Sample problems
      Sample Problem 1 - Creativity and achievement
      Sample Problem 3 - Dental calculus reduction

Chapter 7. Analysis of Variance: Models
  7.1 Constructing the model
      Univariate case; Multivariate case
  7.2 Least-squares estimation for analysis-of-variance models
      Reparameterization; Conditions for the selection of contrasts; Some simple cases
  7.3 The selection of contrasts
      One-way designs; Bases for one-way designs; Higher-order designs
  7.4 Interpretation of contrast weights

Chapter 8. Analysis of Variance: Estimation
  8.1 Point estimation
      Properties of θ̂; Conditions for the estimation of θ
  8.2 Estimating dispersions
  8.3 Predicted means and residuals
      A simple case
  8.4 Sample problems
      Sample Problem 2 - Word memory experiment
      Sample Problem 3 - Dental calculus reduction
      Sample Problem 4 - Essay grading study

Chapter 9. Analysis of Variance: Tests of Significance
  9.1 Separating the sources of variation
      Some simple cases
  9.2 Test criteria
      Hypotheses; Likelihood ratio criterion; Hotelling's T²; Univariate F tests; Step-down analysis; Multiple hypotheses; Notes on estimation and significance testing
  9.3 Sample problems
      Sample Problem 2 - Word memory experiment
      Sample Problem 3 - Dental calculus reduction
      Sample Problem 4 - Essay grading study
      Sample Problem 5 - Programmed instruction effects

Chapter 10. Analysis of Variance: Additional Topics
  10.1 Discriminant analysis
      Sample Problem 3 - Dental calculus reduction
  10.2 Analysis of covariance
      The models; Estimating θ and B; Estimating dispersions; Prediction; Tests of hypotheses; Sample Problem 2 - Word memory experiment

Appendix A. Answers to matrix algebra exercises (Section 2.7)
Appendix B. Program user's guide
Appendix C. Input-output listings (computer printout C.1-C.166)
References
Index
Section I: Introduction

Chapter 1. Multivariate Analysis

1.1 PERSPECTIVE
This book describes the application of one general statistical model to behavioral data. The model frequently has high appeal. It is general; it is simple; computer programs are available; and almost any behavioral data can be analyzed according to one form or another of the model. This same appeal also requires that we be cautious, for we must ask whether this model is the correct one for our particular research assumptions and hypotheses.

It is not always clear whether mathematical models that contain estimable parameters, and that reflect the behavioral models we assume, actually exist. Frequently they do not, and models must be constructed for specific cases. Often enough, these endeavors culminate in formulations that have a more general applicability. Statistical journals publicize large numbers of such cases. In the discipline of psychology, no instance is more outstanding than Thurstone's attempts to discover processes basic to the then-held concept of general intelligence. The by-product of these endeavors was the development and dissemination of a widely used technique, multiple factor analysis.

Multivariate linear models are not always applicable to the specific problems at hand. And indeed the subset of multivariate procedures presented in this book represents only a small portion of those conceivable. Yet the procedures discussed in this book under the rubric "multivariate analysis techniques" share a resemblance to models of behavior in their very representation of behavior as having multiple antecedents and experimental outcomes as having multiple facets.

Students of human behavior, with frames of reference from the extremes of atomism to those of holism, find themselves with multiple observations of each subject. For the atomist, this may involve tracing the development of a specific trait over time, or of its variants with specific imposed or natural stimulation. Bloom (1964) has summarized more than a thousand longitudinal studies of the development of physical characteristics, cognitive achievement, interests and attitudes, and personality measures through the childhood years. Each characteristic is represented by responses to the same or parallel tests, at different ages, by the same individuals. The data are of a naturally multivariate form, and various multivariate growth models are useful for describing the trends over time. Similarly, traits from a variety of disciplines are studied over time or under varying experimental conditions: the effectiveness of drugs with given diseases, over time or after repeated administrations; the change in value of certain preferred stocks over time or with modifications in the company's and competitors' products; the gradual consumption of the nation's natural resources; changes in national birth rates; and so on.

For the holist there are problems of a different multivariate nature. Here the outcome of an experiment or comparative study, at a single point in time, is completely represented only by multiple measurement scales. This may occur when the construct of interest is composed of well-defined but conceptually smaller units, or when there is some lack of certainty about the definition of the construct, and a subsequent need to measure it in several ways. Generally the multiple measures are moderately to highly intercorrelated, as aspects of the same behavioral phenomena.

Examples of such cases abound. All useful theories of personality attribute behavior to a multiplicity of underlying components. Murray (1938) has postulated a series of idiosyncratic "needs" as the driving forces in observed human behavior. For example, with respect to the seeking, giving, or withholding of affection, the individual will respond in a manner determined largely by his needs for affiliation, rejection, nurturance, and succorance. Thus the individual's capacity to exhibit a given degree of affection is reflected in four measurements on these partial constructs. They may in turn be analyzed simultaneously for between-individual or between-group variation.

Academic achievement is best described in terms of behaviors of progressively greater complexity. Both The Conditions of Learning (Gagné, 1966) and The Taxonomy of Educational Objectives (Bloom, 1956; Krathwohl, Bloom, and Masia, 1964) define categories of intellectual achievement. They are ordered according to the ability of an individual to achieve a given level of content mastery only after having mastered the behaviors of lower or simpler levels. The Taxonomy pertaining to cognitive achievements lists six general levels of content mastery: knowledge, comprehension, application, analysis, synthesis, and evaluation. Each level is defined further in terms of subcategories. For example, comprehension includes the abilities to translate materials from one form of communication to another; to interpret, explain, or summarize a communication; and to extrapolate trends or sequences beyond given data to determine implications and consequences. Although an individual may achieve at one level in the hierarchy only after having mastered prior levels, the progression is far from absolute. As a result, a person's achievement with respect to any curriculum is adequately described only by providing estimates of achievement at every level. For analysis, the resulting data are both multivariate and naturally ordered by complexity.

There are additional situations in which multivariate analysis is particularly useful. A tester may be interested in the simultaneous reliability of a series of items or tests, which may not be independent or equally intercorrelated. Variable-reducing analyses, such as component or discriminant analysis, are of practical value for placing individuals into homogeneous groupings, from multiple behavioral measures. Multivariate procedures may be applied to sets of measures that have been identified through cluster or factor analysis to have
common components. This may be accomplished without the necessity of forming arbitrary linear composites of the measures, such as the summation of scores on scales having high intercorrelations with a particular "factor." Finally, since most computer programs for multivariate methods also provide results for each of the criterion measures separately, their use facilitates the simultaneous performance of a number of univariate analyses. This feature is of value, for example, in the comparison of results from raw and transformed data (see Pruzek and Kleinke, 1967).

In every case, it is critical that the variables of any set share a common conceptual meaning in order for the multivariate results to be valid. It is an easy matter to abuse, say, an extensive computer program to perform analyses on sets of variables which bear no "real-life" counterpart as a group. Likewise, an extensive program, such as MULTIVARIANCE, may be used to produce tests of significance that are quantitatively correct but do not conform to assumed probability statements. This may be due either to the quantity of nonindependent results, or to their exploratory nature. When research yields multiple response measures, the employment of rigorous scientific methodology resting on strong design formulations is more important than ever.

One of the most concise treatments of the design and conduct of quantitative evaluation of behavioral data is provided by Federer in the introductory chapter of Experimental Design (Macmillan, 1955). Federer's brief but important chapter is recommended reading for anyone concerned with problems in the behavioral sciences. The evaluation process may be conceptualized as having six aspects:

1. Discovery of a behavioral problem.
2. Searching for existing solutions to the problem.
3. Selection of an approach to the study of the unknowns and the statement of expectations.
4. Formulation of the technical methodology to be employed in the evaluation.
5. Execution of the technical formulations.
6. Interpretation of the research outcomes.

Quantitative analysis forms only a small portion of evaluation methodology, and we might be chagrined at the disproportionate quantity of reference material we have for this one aspect. However, quantitative thinking modes can form a basis for formulation of all phases of the evaluation process.

Unresolved research problems are subjected to evaluation through the principles of scientific method. The hypothetico-deductive approach to empirical investigation has been well described by Ellis (1952). The primary assumption is the existence of a research hypothesis, or expected solution to the problem, prior to the collection of quantitative data. Although unexpected findings have often been generated for verification by hypothesis-seeking approaches to data analysis, the validation of such findings through replication is essential. Testing hypotheses drawn from earlier studies and from behavioral theory has the advantage of providing two sets of confirmatory data, one logical and one empirical. When the two agree, the conclusions form a firm base and are likely to replicate. If one has a large set of data, dividing it into two parts (one for generating hypotheses and the other for confirmation) can maintain this confirmatory power.

Researchers often disavow any prior knowledge from which to draw hypotheses. Yet in informal discussion, the same individuals may admit that they really believe the new approach to be superior to the old, or that their results will be essentially the same as another investigator's in a different situation. These are hypotheses. That is, they are informed best guesses as to the experimental outcomes. Frequently a problem yields competing hypotheses, each of which would suggest a different outcome. These too should be stated and tested, as competing explanations. Nothing is lost, for no amount of exploration or estimation is precluded by testing prior beliefs.
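The two-part division just described can be sketched in a few lines of code. The following Python fragment is illustrative only and not part of the original text; the data array and its dimensions are invented for the example.

    import numpy as np

    rng = np.random.default_rng(1)
    data = rng.normal(size=(100, 4))     # hypothetical: 100 subjects, 4 measures

    # Randomly divide the observations into two halves:
    # one for generating hypotheses, one held out for confirmation.
    idx = rng.permutation(len(data))
    explore = data[idx[: len(data) // 2]]
    confirm = data[idx[len(data) // 2 :]]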
1.2 THE MULTIVARIATE GENERAL LINEAR MODEL*

*For a more extensive discussion of linear statistical models, Chapter 5 of Introduction to Linear Statistical Models, Volume 1 (Graybill, 1961) is highly recommended.

A model of an object or event is any attempt at representing that object or event other than the original occurrence or representation. The general linear model is a very specific sort of model, involving the algebraic representation of relationships among observable human characteristics. In most studies the symbolic representation of such relationships is the second model applied to the observed behavior. The first is the modeling of behavioral constructs through algebraic or quantitative representation. This occurs in the process of measurement. In contrast, we will restrict ourselves here to the analysis of the already quantified responses. If the measurements are objective, reliable, and valid, we will be willing to assume that such quantitative indicators correspond in important ways to the constructs of interest and will yield insight into their behavior.

Let us denote as y_i the quantified response of subject i to a single outcome or criterion measure y. Subject i may have been assigned to, or selected from, a population of observations identified by sharing common attributes on one or more exactly observable traits. In addition to y_i, then, subject i is identified by having values on a set of antecedent or independent variables x_j, hypothesized to be related in some way to y. The x_j may be categorical variables defining the population, or measured variables having ordinal, interval, or ratio scales, or both. x_{ji} is the value on variable x_j for subject i. The process of fitting a linear model to data is one of determining a set of coefficients a_j that multiply the x_{ji} in order to reproduce y_i as closely as possible for a set of observations. The model may be written as
    y_i = \sum_j a_j x_{ji} + e_i = a_1 x_{1i} + a_2 x_{2i} + \cdots + a_J x_{Ji} + e_i        (1.2.1)
e_i represents the extent to which y_i cannot be reproduced by the weighted function for the particular subject. If y is a random variable and x_{ji} represents a fixed
value of x_j, then e_i will also be a random variable. It represents both the extent to which the model is incompletely specified and the measurement error in y. To state that "the model fits the data" implies that the e_i are small, and that y_i can be known from knowledge of the x_{ji}.

Researchers in the behavioral sciences are perhaps more accustomed to asking, "Is there a significant difference between means?" or "Is a significant amount of variation in the dependent variable attributable to the predictors?" than "Does the model fit the data?" Yet when the components of the linear model are clearly specified and understood, it will be seen that the two sets of questions are in fact the same.

Equation 1.2.1 is a model in two senses. First, the equation specifies the components into which the observation is partitioned; that is, some function of the particular x_{ji}, and all else. The original event y_i is represented as the sum of two components, one being itself a function of j = 1, 2, ..., J additional events chosen in advance by the researcher. Second, the relationship between the additional events x_j and the rest of the model is linear; that is, all weights a_j are to unit power only.

In still another sense, however, Eq. 1.2.1 is not truly a model. The sum of the right-hand components of 1.2.1 is exactly y_i, and not some other representation of it. Thus, for consistency, we will refer to the portion of 1.2.1 exclusive of e_i as the linear model. Indeed, the most important modeling in behavioral research is the representation of the outcome y_i by other selected and weighted measures x_j. e_i is commonly relegated the function of depicting unknown factors, hypothesized to be of a random and/or trivial nature in influencing y_i, at least when compared to the purposefully selected antecedent measures.

The variables x_j may be of several types. When the x_j are entirely categorical measurements and have values 0 and 1, the linear model is usually referred to as the analysis-of-variance model. The question of fit to sample data, that is, of the relative contributions of \sum_j a_j x_{ji} and e_i in yielding information about y_i, is most commonly phrased as "Are there significant differences among means of the J populations represented by the samples?" When the x_j are scores on J measured variables, the model is that of regression. The question asked most often is one of the percentage of variation in y attributable to one or more of the x_j. Finally, the analysis-of-covariance model is the form of Eq. 1.2.1 when some of the x_j are categorical and others are measured.

The variables x_j may themselves be nonunit powers or cross products without destroying the linearity of the model. The variables x_j in unpowered form comprise the additive portion of the model. Any x_j that are nonunit powers of a measure, or the cross products of two or more other x_j, comprise the nonadditive or interactive portion of the model. Thus the general linear model as represented by 1.2.1 will suffice for a variety of polynomial analyses, as well as all linear analyses of variance, including interaction terms.

The model is multivariate when y is a vector variable having more than a single outcome measure. A separate set of weights a_j is necessary for each outcome measure.
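As a numerical illustration of the fitting process, the following sketch (not part of the original text; the data are invented) obtains least-squares weights a_j of Eq. 1.2.1 for a single criterion:

    import numpy as np

    # Hypothetical data: N = 5 subjects, J = 2 antecedent variables.
    # Row i of X holds x_{1i}, x_{2i}; y holds the criterion scores y_i.
    X = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [3.0, 4.0],
                  [4.0, 3.0],
                  [5.0, 5.0]])
    y = np.array([3.1, 3.9, 7.2, 6.8, 10.1])

    # Weights a_j chosen to reproduce y as closely as possible,
    # in the sense of minimizing the sum of squared residuals e_i.
    a, *_ = np.linalg.lstsq(X, y, rcond=None)

    e = y - X @ a    # residuals: the part of y the model cannot reproduce
    print(a, e)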
Application
Significant research in education assumes multiple sources of influence and multiple outcomes of the educational process. An essential condition in such research is the explicitly utilized assumption that the educational enterprise has both a manifest and a latent curriculum, and yields both intended outcomes as well as others that co-occur and may not be anticipated. The multiple-input, multiple-outcome assumption is frequently formulated as a model depicting an educational setting.

For example, antecedent measures may be home and school variables (Coleman, 1966), time needed and time spent in learning (Carroll, 1963), environmental process variables (Wolf, 1965), educational opportunity and educational press (Finn, 1972c), or pupil entry behavior, affective entry characteristics, and instructional quality (Bloom, 1971). Outcome measures may be classified as cognitive and affective (Bloom, 1956; Krathwohl, Bloom, and Masia, 1964), higher- and lower-level cognitive processes (Bloom, 1956), or in other categories (Finn, 1972a).

When a linear model in the form of Eq. 1.2.1 is employed with educational data, the independent variables are usually assumed fixed and known. The outcome measures comprise random variables or variates. Thus multivariate models are usually appropriate when a study contains multiple outcome, or dependent, or criterion measures. They frequently constitute the most realistic statistical models for behavioral data, especially when the research evolves from a multiple-input, multiple-outcome paradigm.

The multivariate techniques described in this book are generalizations of well-known analysis-of-variance and regression procedures to the case of multiple dependent variables. For example, with several groups of subjects we may wish to test that group means are equal, not on a single outcome variable but on two or more intercorrelated variables simultaneously. For this a multivariate extension of the usual F test may be employed. Or we may wish to predict achievement in a literature course from one or more instructional variables. Achievement may be measured by both an indicator of cognitive performance and an index of the pupils' attitudes toward reading literary material. For this, a multivariate extension (here bivariate) of the multiple linear regression model is appropriate. We may test for the prediction of the two intercorrelated outcomes, which jointly describe the results of the course or unit.

In each case the multivariate approach attends to the data as a whole, rather than to a few isolated or transient aspects. The analysis of a single summary measure (such as a total or average score) will result in the loss of information conveyed by the individual scales. The results will have dubitable meaning. Analysis of each of the measures separately results in redundancy to the extent that the measures are nonindependent. Statistical error rates may be multiplied manyfold, and the replicability of the study is reduced. The appropriate multivariate model retains the multiple scores as a set of interrelated traits. It is essential that the set be conceptually meaningful!

The number of dependent variables determines whether the form of the model is univariate (1), bivariate (2), or multivariate (generally). In every instance, the multivariate results simplify to the familiar univariate form when the number of criteria is reduced to one. The dependent variables are assumed to be measured; that is, to have a numerical scale with at least interval properties.

There is often some confusion of terms in referring to linear models, especially when the number of independent variables is considered. We shall assume the following conventions. When the independent variables are measured (regression model), the number of independent variables or predictors determines whether the model is simple regression (one predictor) or multiple regression (generally). Thus the regression model with one predictor and one criterion is the univariate simple regression model; with one predictor and multiple criteria, the multivariate simple regression model; with many predictors and many outcomes, the multivariate multiple regression model; and so on.

When the independent variables are categorical and subjects are classified by group membership, the model is that of analysis of variance. A one-factor design with each subject having many outcome measures requires a one-way multivariate analysis-of-variance model; a two-way or many-way design with multiple outcome variables requires a many-way multivariate analysis-of-variance model; and so on. Whenever each subject has only a single outcome measure, the model is univariate, regardless of the complexity of the sampling design. (The only exception, of course, is in random- or mixed-effects models, which include random independent variables.)

The independent variables in analysis-of-variance designs may reflect
another specialization. Data-manipulation practices such as the dichotomization of measured variables are obviated when the general model can be fit to data as they occur. Multivariate models yield test statistics that are simple extensions of their univariate counterparts; for example, a single F ratio to test H_0: \mu_1 = \mu_2, but for \mu_1 and \mu_2 being vectors composed of several means.

In addition, several uniquely multivariate functions may be obtained as part of an analysis under the general model. These are referred to as canonical analyses, two of which are canonical correlation analysis and discriminant function analysis. Both are procedures for identifying linear combinations of criterion measures that have specified optimum properties. In the multivariate multiple regression model, the canonical correlations are measures of association between the criteria and the measured predictors. Linear functions of both sets of variables are identified which are themselves maximally intercorrelated. In the multivariate analysis-of-variance model, linear functions of the criteria are identified that have maximal between-group variation. In both instances, weighted combinations of measured variables replace the original measures for analysis purposes. This substitution introduces complexities that are not usually offset by the gain in parsimony.
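The vector-valued test just mentioned can be sketched as follows for the two-group case, in which Hotelling's T² and its exact F transform play the role of the single F ratio for H_0: \mu_1 = \mu_2. This is a minimal illustration with invented data; the function is not taken from the text or from the MULTIVARIANCE program.

    import numpy as np
    from scipy import stats

    def hotelling_t2(Y1, Y2):
        """Two-sample test of H0: mu1 = mu2, where mu1 and mu2 are p-vectors."""
        n1, p = Y1.shape
        n2 = Y2.shape[0]
        d = Y1.mean(axis=0) - Y2.mean(axis=0)
        # Pooled within-group covariance matrix
        S = ((n1 - 1) * np.cov(Y1, rowvar=False)
             + (n2 - 1) * np.cov(Y2, rowvar=False)) / (n1 + n2 - 2)
        t2 = n1 * n2 / (n1 + n2) * d @ np.linalg.solve(S, d)
        # Exact transformation of T^2 to an F statistic
        f = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
        return t2, f, stats.f.sf(f, p, n1 + n2 - p - 1)

    rng = np.random.default_rng(0)
    t2, f, p_value = hotelling_t2(rng.normal(size=(20, 3)),
                                  rng.normal(size=(25, 3)))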
1.3 APPROACH TO THE GENERAL MULTIVARIATE MODEL

There are several emphases in this book that reflect a philosophy of social science research and should be explicated. First, it is assumed that a workable paradigm involves the testing of major research hypotheses for decisions about essential acceptance. When the model consists of multiple outcome variables, multivariate test statistics provide the primary decision-making information.

Emphasis is placed upon the statement, interpretation, and testing of planned contrasts in analysis-of-variance models. The majority of elementary statistics texts first presents the partitioning of sums of squares for main effects and interactions; that is, for "omnibus" hypotheses of the sort H_0: \mu_1 = \mu_2 = \cdots = \mu_J. Discussion of planned or post hoc contrasts is reserved for later sections. The approach here is the reverse. Both because planned comparisons are usually of interest to researchers and because of their superior inferential power, the estimation of single-degree-of-freedom effects is presented first and in greatest detail. Sums of squares and tests for omnibus effects are viewed as optional "pooled" functions of the more detailed partition.

Specific estimates and univariate test statistics are stressed for interpretation beyond the acceptance decision. These include the simple, partial, and multiple correlation coefficients, the point and interval estimates of regression coefficients, mean differences, predicted means and residuals, and so on. The magnitudes of multiple univariate test statistics may be compared. Although they are not independent for any one hypothesis, univariate statistics aid in identifying the variates most and least affected by their antecedent(s). In contrast, canonical correlation and discriminant function analyses, which depend upon complex functions of the original measures, yield only minimal interpretive data, and are given minor emphasis. The purposes of social science are
best served by the simplest and most straightforward techniques that yield valid research answers.

Discussion of each model is separated into "estimation" and "hypothesis testing." Estimation in regression models involves finding "best" estimates of the partial regression coefficients and their standard errors, the correlations between criteria and predictors, scores predicted by the regression model, and residuals. Hypothesis testing involves decisions about the nullity of the regression or correlation coefficients, or about variation in the criteria attributable to the independent variables. Estimation in analysis of variance involves determining sample values for means, mean differences, their directions and standard errors, means predicted under the model, and residuals. Hypothesis testing involves decisions about the nullity of population mean differences, in specific or general patterns.

In a sense, the presentation of estimation and hypothesis testing is in reverse order. Obtaining estimates of effects is dependent upon the knowledge that the effects are meaningful and not random; that is, dependent upon the results of the significance tests. The reader is asked to excuse the use of this reverse order, since the discussion of estimation provides for a better initial description of the effects themselves. Reestimation of terms in a reduced model is presented for the situation where tests indicate that not all of the original estimates are necessary.

Finally, the emphasis in this book is computational. The similarities and distinctions of specific models become most obvious at this level. To understand the computations being applied to behavioral data involves comprehension of where they are useful and where they are likely to misrepresent the information content. Five studies, which require analysis through a variety of multivariate linear models, have been selected. Each is briefly described in Section 1.4. The studies are introduced into each chapter as the appropriate analysis techniques are encountered. In addition, an attempt has been made to exemplify the minimal complete and clear presentation of the outcomes. It is each researcher's responsibility to strive toward these criteria in his own presentations.

The MULTIVARIANCE program has been used extensively for the examples. However, computing flow and algorithms are not given here. The program may be obtained from National Educational Resources, Inc., 215 Kenwood Avenue, Ann Arbor, Michigan 48103. Flow diagrams and algorithms are given in Enslein, Ralston, and Wilf (in press).
1.4 FIVE SAMPLE PROBLEMS

Sample Problem 1 - Creativity and Achievement*

*The data for this problem are selected from a larger set collected by Dr. I. Leon Smith, Yeshiva University, New York, N.Y.

In recent years the concept of creativity as an ability to exhibit new and unique idiosyncratic behaviors has grown in importance in the minds of educators. Although creativity is conceptually different from general intelligence both in definition and in its effects upon various cognitive and noncognitive achievements, independently developed measures of creativity have shown little unique reliable variance. It is the purpose of this study to determine whether creativity can be shown to contribute to a class of cognitive achievements that require a high level of functioning; that is, those known as divergent achievements.

Divergent achievement within an educational setting involves situations in which the individual is expected to create new and unique responses or organizations, where such responses were lacking in the stimulus situation, and where there is no single correct response. This definition necessitates additional criteria for evaluation. The levels of divergent achievement chosen for the study include the processes of synthesis and evaluation, as defined by Handbook I of the Taxonomy of Educational Objectives (Bloom, 1956). It is the assumption of the Taxonomy that achievement at lower levels (knowledge, comprehension, application, analysis) is necessary but not sufficient for achievement at the higher levels to be manifested. Synthesis is defined by the Taxonomy as "the putting together of elements and parts so as to form a whole. This involves the process of working with pieces, parts, elements, etc., and arranging and combining them in such a way as to constitute a pattern or structure not clearly there before" (p. 206). Evaluation involves "quantitative and qualitative judgments about the extent to which material and methods satisfy criteria" (p. 207).**

**© 1956 by David McKay Co., Inc. Reprinted by permission of the publisher.

The major research question is whether an individual's level of creativity is a determinant of divergent achievement and, further, whether this contribution represents an effect that cannot be more parsimoniously attributed to general intelligence. To answer these questions, 60 eleventh-grade students in a western New York metropolitan school were administered achievement tests developed by Kropp, Stoker, and Bashaw (1966), designed to provide a quantitative indicator of achievement at each level of the Taxonomy. The test items require the subject to read selected passages at varying levels of ease and familiarity and to respond to a variety of types of questions on each. Data on creativity levels were obtained through administration of three subtests, designed by Guilford (1967) to measure levels of symbolic and semantic divergent production abilities. These tests are consequences obvious, which involves the ability of the subject to list direct consequences of a given hypothetical event; consequences remote, which involves identifying more remote or original consequences of similar situations; and possible jobs, which involves the ability to list a quantity of occupations that might be represented by a given emblem or symbol. Intelligence scores on the Lorge-Thorndike Multi-Level Intelligence Test, Level G, Form 1, were drawn from school records (Lorge, Thorndike, and Hagen, 1966).

Two major hypotheses are involved in the study. First, it is expected that the three tests of creativity and two of divergent achievement represent a homogeneous set of underlying thought processes. Thus, particular linear combinations of the five measures should account for large proportions of variation in the scales. Principal components of the correlation matrix among the measures are employed to provide evidence here.

Second, it is expected that levels of creativity do determine, to some extent, the individual's divergent achievement functioning. To test this hypothesis, multivariate multiple regression analysis is employed, with the two measures of divergent achievement as criteria. Independent (predictor) variables are intelligence and the creativity scores. Lorge-Thorndike scores are included as the first independent variable. Thus we may test whether the three creativity measures contribute to achievement above and beyond the established effect of general intelligence.

Torrance has postulated that creativity has a greater effect on the achievement of individuals having high intelligence than on individuals of low intelligence. Thus interactions of creativity and intelligence may function to determine an individual's level of divergent achievement. This postulate is evaluated by adding three predictors to the model, which are cross products of standardized intelligence scores and the three standardized creativity measures. The complexity of the interaction terms suggests that they be placed last in the order of predictors. If they do not contribute to criterion variation above and beyond the simpler intelligence and creativity variables, the criterion of parsimony dictates that they be eliminated. All statistical tests are conducted through stepwise multivariate multiple regression techniques, with a fixed order of predictor variables.
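The construction of the predictor set for this analysis can be sketched as follows. The fragment is illustrative only: the data are simulated, and the variable names are invented rather than taken from the study.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 60
    iq = rng.normal(100.0, 15.0, n)           # intelligence (first predictor)
    creat = rng.normal(50.0, 10.0, (n, 3))    # three creativity subtests

    def standardize(x):
        return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

    # Cross products of standardized intelligence and creativity scores
    inter = standardize(creat) * standardize(iq)[:, None]

    # Fixed order of predictors: intelligence, creativity, interactions last
    X = np.column_stack([np.ones(n), iq, creat, inter])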
Sample Problem 2 - Word Memory Experiment*

*The data for this problem have been collected by Dr. Thomas J. Shuell, Department of Educational Psychology, State University of New York at Buffalo.

It has long been held by psychologists that materials to be learned will be more easily internalized if their organization bears a resemblance to the internal organization imposed by human thought processes. In particular, Mandler and Stephens (1967) have suggested that if word lists to be memorized are presented to subjects arranged in common word-meaning categories, then memorization of the words will be more accurately accomplished. Further, the facilitation is accentuated if the word categories follow a natural hierarchical ordering.

To test Mandler's hypotheses, four word lists having fifty words each were created. The first list (list A) contains ten words in each of five major groupings: recreation, inanimate materials, edible materials, plants, and animals. Each grouping consists of five words in each of two subcategories. For example, recreation is considered to consist of the subcategories sports and dances. The words included in category sports are baseball, basketball, golf, hockey, and skiing, and those in dances are waltz, tango, mambo, cha-cha, and jitterbug. The four remaining categories are similarly subdivided, as follows:

    inanimate materials   into   metals       and   precious stones
    edible materials      into   vegetables   and   food flavorings
    plants                into   flowers      and   trees
    animals               into   insects      and   other animals
A second list (list B) having the same category structure and of equal overall difficulty was created to assure generalizability of results across a variety of specific words. The third list (list C) contains all ten words from lists A and B in each of the subcategories sports, metals, vegetables, flowers, and insects. It does not contain any built-in hierarchy of subcategories. A fourth list (list D) contains all ten words from lists A and B in the subcategories dances, precious stones, food flavorings, trees, and other animals.

Each word contained in these lists was printed on an otherwise blank card. The cards for each list were shuffled. The lists were presented to a total of 48 college seniors, in four different manners.

1. Twelve subjects were told of the ten-category hierarchical structure, but not of the actual names of the categories. They were instructed to sort the cards into such a structure, using any groupings that they felt were meaningful. Six of the subjects were given list A and six list B.

2. Twelve subjects were presented with the lists (six each list A and B), and were instructed to sort the words into ten unique categories, without being informed of the hierarchical arrangement.

3. Twelve subjects were presented with the lists (six each list A and B), and were instructed to sort the words into five unique categories, without being informed of the hierarchical arrangement.

4. Twelve subjects were presented with the third and fourth lists (six each list C and D), and were instructed to sort the words into five unique categories.

At the end of the sorting, measurements were taken of the time the individual had spent in sorting, the number of words he could recall of the fifty, and the number of the original categories designed by the experimenter that were re-created by the subject. The latter was transformed to a proportion, based on the number of categories into which he was instructed to sort the words (ten for groups 1 and 2, five for groups 3 and 4). The entire procedure was repeated six times for each subject.

In order to test the hypotheses of the study, two achievement measures from the sixth trial are used as criterion measures in a one-factor (four-level) fixed-effects variance analysis. These are the number of words recalled and the proportion of the experimenter's categories that were re-created. Specific contrasts are established to provide data on subhypotheses. For the main hypothesis to be supported, the number of words recalled should decrease with an increase in the group's index number (that is, group 1 should have the highest recall score). The contrast of group 1 with 2 will be used to test the expectation that knowledge of a hierarchical structure of material will increase internalization. The comparison of groups 2 and 3 will provide an estimate of the extent to which smaller word categories improve internalization. That is, group 2 requires ten categories of five words each, and group 3 the reverse. Finally, the comparison of groups 3 and 4 will provide information on the extent to which any hierarchical structure of material, whether or not the subject is aware of its existence, will facilitate learning.

It is a plausible alternative hypothesis for any of the results that time spent in sorting the words affects the degree of memorization. As a second analysis, the same outcome measures are considered, this time with the time measures
14
Introduction
from the second, fourth, and sixth trials as covariates. These three measures are assumed to represent the six time measures adequately. The exception may be time at trial 1, which is subject to additional extraneous sources of variation in beginning the experiment. A prior regression analysis is employed to determine the extent to which time does affect individual differences on the experimental outcomes.
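The planned contrasts described above may be written as rows of a contrast matrix. A minimal sketch follows; the group means are invented for illustration and are not the study's results.

    import numpy as np

    # One row per planned contrast among the four group means;
    # the weights in each row sum to zero.
    L = np.array([[1, -1,  0,  0],    # group 1 vs 2: knowledge of the hierarchy
                  [0,  1, -1,  0],    # group 2 vs 3: ten vs five categories
                  [0,  0,  1, -1]])   # group 3 vs 4: hierarchy present vs absent

    means = np.array([38.2, 35.0, 33.1, 32.7])   # hypothetical recall means
    estimates = L @ means                        # estimated contrast values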
Sample Problem 3 - Dental Calculus Reduction*

*The data for this problem are selected from a larger set collected by Dr. Stuart L. Fischman, School of Dentistry, State University of New York at Buffalo.

Producers of salable merchandise intended for consumer edification, be it educational material, such as textbooks or teaching programs, or consumable products, such as tobacco or petroleum products, foodstuffs, and medications, are in constant need of techniques for assessing their products' effects. Often the effects are measured simply in terms of sales volume. At other times, the more crucial effects are those of physiological or psychological changes in the consumers. Such is the case in the present example. The data represent a small portion of an ongoing program for the evaluation and improvement of agents for reducing the formation of dental calculus.

Subjects for the study are random samples of male inmates in a large state prison. The procedure for any experimental or control subject is as follows. The subject is randomly selected from prisoners who will be inmates for at least twelve months, and who are not missing any of the six anterior teeth of the lower mandible. The subjects are asked if they are willing to volunteer for the study. The majority of responses are positive. Demographic data are recorded for each volunteer, and all teeth in the mouth are given a complete scaling and prophylaxis. Experimentation is conducted over the following four three-month periods.

During the first period, every subject is given toothpaste and mouth rinse lacking the experimental agents. He is asked to use the products regularly, recording every use on a summary sheet. At the end of the three-month period, measurements are taken of the amount of calculus formation on each of the six lower front teeth. These early data provide an indication of normal calculus formation rates for the subjects, as well as an opportunity for some standardization of all subjects to a common beginning point. The teeth are again cleansed, and subjects are randomly assigned to control and experimental groups and provided with the same paste and rinse, or with the same products containing an anticalculus ingredient, respectively. For two three-month periods, the subjects use the paste and rinse provided, under a double-blind condition. At the end of each period, the calculus formation is measured and a prophylaxis given. All subjects return to the no-active-agent state for a final three-month period to determine the nature of any carryover effects that may exist. The first and final control periods are useful for eliminating the alternative explanation of observed effects as simply dental care or tooth-brushing effects. Many of the prisoners would otherwise ignore all dental care, and simple brushing is often the most effective dental health agent.

Three measures of calculus formation are taken from each of the six anterior mandibular teeth in the manner described by Volpe, Manhold, and Hazen (1965). The three measures are obtained by placing a small rod, calibrated in millimeters, against the rear of the tooth in three positions, as illustrated in Figure 1.4.1. The experimenter compares the height of the calculus formation in each position to the calibrations and records his readings. The three measures are summed for statistical analysis, providing a more reliable indicator than any one reading. The measurements from the six teeth at any one time period are considered as multiple dependent variables.

[Figure 1.4.1: Placements of calibrated rod for measuring dental calculus formation.]

Experimental design considerations for product evaluation are often complex. Frequently, the numbers of subjects under the various experimental conditions are unequal, due to problems in subject accessibility or experimental mortality. Second, the expense factors involved demand that ineffective treatments be discontinued at the earliest possible time and that promising treatments be tested more extensively or in new quantities and forms. These problems are reflected in the present example, which draws from two consecutive years of testing the anticalculus agents.

During the first year, four treatment groups with 29 subjects were involved. Group 52 used only the unaltered paste and rinse for all periods. Group 54 used paste and rinse thought to contain an anticalculus agent. Group 56 used paste and rinse containing an active agent, and group 58 a second active ingredient. The experimental additive for group 54 was later discovered to be chemically inert and was not continued for a second year. For analysis purposes, the subjects involved are considered as control subjects, as long as their mean calculus scores are not significantly different from the other controls. The group 58 agent was discontinued due to high expense without a simultaneous large drop in calculus formation.

During the second year of experimentation, three larger experimental groups were used, with a total sample size of 78. Group 67 received no additive, and group 93 repeated the test of the first additive (used the previous year by group 56). Group 75 used paste and rinse containing a third additive, not tested the previous year but based upon the same principles as the more expensive additive of group 58. The arrangement of treatments is diagrammed in Table 1.4.1, with experimental group numbers.

Mean differences in calculus formation are tested through fitting a two-way fixed-effects analysis-of-variance model to the data.
16
Introduction
Table 1.4.1
Sampling Diagram for Dental Calculus Study Control
Control
ACtive 1
Active 2
First Year, group
52
54
56
58
Second Year, group
67
93
Active 3
75
taken at the end of the third experimental period are simultaneous criterion variables. Planned contrasts are employed for information on the separate active agents and on the comparability of the two control groups. Some interactions are not estimable because of the irregular pattern of null subclasses. Principal components of the covariance matrix among tooth measures are extracted and a discriminant analysis is performed to obtain further information on the development and retardation of dental calculus. Step-down analysis is employed to determine whether the experimental effects are concentrated in a subset of teeth having higher calculus formation rates.
Sample Problem 4- Essay Grading Study* Recent research on expectations (Rosenthal and Jacobson, 1968; Elashoff and Snow, 1971) suggests that teachers establish expectancies for pupil performance that are translated into teaching behaviors, andultimately affect the learning achieved. Increases in learning as a function of expectancies can only be applauded. However, low or negative expectations can be harmful, especially if they are reinforced over several years' time (Finn, 1972b). It is important to identify the pupil and situational characteristics upon which teachers base their expectancies and to take educational and preventative measures where necessary. Pupils' sex, race, and ability levels may be among the nonperformance determinants of teachers' expectations. If so, these factors would be reflected in teachers' evaluations of pupils, especially if the evaluations are subjer.Jive in nature. To test this hypothesis, a group of fifth-grade pupils was asked to 1rite two essays each, on the topics "What I think about" and "My favoritE' school subject." Four essays were selected on each topic that were at about the center of the distribution of overall quality of writing. The essays chosen were typed, and averaged a little over half a page each in length. Each selected "What I think about" essay was paired with a single "My favorite school subject" essay, yielding four pairs. Though not randomly chosen, the four essays did represeni differing writing styles and qualities. Each pair of essays was sent to a fifth-grade teacher, accompanied by a cover letter providing him with systematic false information about the pupil who supposedly wrote both of the essays. The pupil was identified in terms of race (Negro-white), sex (male-female), and an achievement-ability rating that consisted of an IQ score and a report of the pupil's past achievement. The IQ scores were either high (115-120) or low (87-90), and the achievement rating ·The results of this research were reported in Finn (1970).
Multivariate Ana lysis
17
was "above average" for the high-IQ pupils and "below average" for the low-IQ pupils. Both the intelligence and achievement data were included to reinforce the concept that the child was generally bright or dull. Each essay was accompanied by a rating form consisting of a series of ten-point scales from "very poor" to "very good" for the categories "spelling and punctuation, grammar, sentence structure, organization, neatness, relevance of ideas, appropriate word usage, clarity, creativity and imagination," and "completeness of thought." The scales were not defined further, and no definition was provided for the ten scale points, other than to label the end points to clarify direction. The teachers were asked to evaluate the two essays on the rating scales, and were told to put any comments on the essays or rating forms that they cared to. Subjects were 112 volunteer fifth-grade teachers in a large integrated urban school district in the northeastern part of the United States. The subjects represented every subclass of the 32-group experimental design, obtained by the crossing of pupil sex, pupil race, pupil ability, and the four different essay pairs. About half of the teachers did not respond to the "neatness" scale, since the essays were in typewritten form. The other nine scales are summed to provide a total score for each essay, yielding two totals for every teacher-subject. The possible range of scores on each scale is .from 9 to 90 points. The two totals are treated as simultaneous criterion variables. Subclass means are compared through a four-way fixed-effects unequai-Nvariance analysis, with two criterion measures. For comparison purposes, the same analysis is repeated with the sum of the two totals as a single criterion measure. The expectation is that mean ratings will vary systematically as a function of the assumed characteristics of the pupi 1-authors.
Sample Problem 5- Programmed Instruction Effects* High absenteeism rates in schools attended by underprivileged pupils can counteract educational benefits that might otherwise accrue. In order to attempt to reduce the effects of absenteeism, a large eastern city in the United States has developed a set of programmed materials to parallel normal instruction in seventh-grade mathematics. The materials can be presented to the pupils, upon return to school, via individual computer consoles. If the students are able to learn the material by themselves, they are thus given the opportunity to compensate for learning not attained during their absence. To test the remedial program, a random sample of nineteen seventh-grade classes was drawn from a large school district in the city's core area. The program consoles were installed at the rear of each room. The teachers were instructed to allow the students who had been absent to utilize the consoles for the appropriate missed lessons, and to provide instructions for the consoles' use. A second random sample of 18 classes was used as control with no "These data were collected under the direction of Dr. Chester Kiser and Dr. Austin Swanson, Professors of Education at the State University of New York at Buffalo. The study was financed by a grant under Title I of the Elementary and Secondary Education Act during the fiscal year 1968, to Baltimore City Public Schools, Dr. Orlando F. Furno, Assistant Superintendant for Research and Development.
18
Introduction
particular compensation for absences. All students were tested at the end of the school year on: 1. Cooperative Mathematics Test, Form A (Educational Testing Service) 2. Stanford Modern Mathematics Concepts Test (Harcourt, 1965) 3. City Junior High School Mathematics Test, Grade 7 In addition, the number of days of the school year each student was absent was recorded. Since students may be assigned to classes in nonrandom fashion, and since students within classes do not function independently, class means are the unit of analysis. Prior studies have indicated sex differences both in achievement levels in mathematics and in absenteeism rates. Thus the data for statistical analysis are means on the four measures, for members of one sex group in each class. The sampling arrangement is diagrammed in Figure 1.4.2, with each of the subclasses containing a single mean observation. Experimental
Control
Males
Females
Figure 1.4.2
Sampling design for programmed instruction study.
To test for higher mean achievement in the experimental classes, a mixedeffects variance analysis is employed. Sex and experimental condition are fixed effects; classes are random, nested within experimental conditions and crossed with sex. Outcome measures are the three achievement scores. To control for differential absenteeism rates among classes within experimental conditions, a covariance analysis is performed. Total class absenteeism is the concomitant variable (covariate).
CHAPTER
!
The Algebra of Matrices Rectangular arrays of numbers are the basic data and algebraic representations for statistical analysis. Even in univariate problems, the data form a list of scores, which may be ordered by subject index number or sorted into groups. The list of scores for all subjects o'n a single measure is an observational vector. Or for one subject we may have multiple outcome measures. The list of scores for a single observation is termed a vector observation. Most commonly data from an experiment are ordered irl a table that has a row for each subject of the study, and a column for each test administered. For example·, we might place scores for N subjects on three tests in the form of Table 2.0.1. In this table Xii
i
Table 2.0.1
Subject
Datfl Array for N observations, Each Having 3 Test Scores Test 1
Test2
Test3
1 2
xll -'!21
x,2 X22
X,a -'!23
N
XNl
XN2
XN3
represents the test score for subjection test j. The complete array of scores is the data matrix, having three observational vectors as columns and N vector observations as rows. · Consider summarizing the data of Table 2.0.1. Minimal summary information would consist of the means fpr the three tests, which themselves may be placed in an array. If there are multiple groups of subjects, several arrays of means are necessary. The variances of the tests and the covariances of pairs of tests can be naturally ordered intd a table that has as many rows and columns as tests, as in Table 2.0.2. Sjk is the covariance of test j with test k. The covariance of a test with itself, Sjj, is the variance of the test, and is represented sF The variances of the three tests ~ppear as the diagonal entries in the table. Without such a table the representation and ordering of variances and covariances becomes complex, especially if the number of tests is large. The distributional assumptions of analysis of variance require a table similar to Table 2.0.2 to be completely dep'icted. The usual assertion is that errors Eij are 19
20
Introduction Tabl~
2.0.2
Test 1 Test 2 Test3
Variances and Covariances of Three Tests Test 1
Test 2
sl" s,l
S1z
Ss1
S32
s22
Test3 S1a
s,a so"
normal and independent, with common variance u 2 . Restated, this stipulates that the variances of errors for, say, three observations can be displayed in a table such as Table 2.0.2 with s 1 2 =S 22 =S 3 2 and all si;=O (i oF j). With any reasonable number of observations, constructing the entire table is prohibitive and unnecessary. Just as in scalar algebra, we may use a condensed matrix form of the assumptions, to represent the entire array given in Table 2.0.2. In univariate analysis, we can sometimes circumvent operations on arrays of numbers, since the basic datum for computations (for example, sums or sums of squares) is an individual score. Even then, in complex models array representations are a necessity as well as a convenience. In multivariate models the basic datum is the vector observation. Representations of multivariate models, as well as the simplest statistical operations, require an ability to operate with arrays. Otherwise the computations are intractable. The notation of matrices forms a system for data analysis that is not only convenient, but often necessary. Matrix operations parallel the algebra employed in statistical analysis. Simple matrix algebra is used extensively throughout this book. The following sections provide a brief introduction to those aspects that are employed in later chapters. More extensive presentations may be found in Hahn (1964), Noble (1969), Searle (1966), and Graybill (1969). The exercises of Section 2.7 illustrate the application of these operations to simple problems.
2.1
NOTATION
A matrix is any rectanguJntrast, a single constantor symbolls'lermecr·a·sca/ar. For exa'mple, the cciristaiil andthe variable representation y, which mayfake.on a variety of values, are scalars. Matrices have more than a single element, either as additional rows, or additional columns, or both. For example, a matrix is
6:
y =
[!!
1~~]
81
80
Y has 3 rows and 2 columns. Matrices are symbolically represented by boldface uppercase Greek or Arabic letters, and are surrounded by brackets. The elements of a matrix are represented by the same symbol in lowercase form, followed by a row and column subscript. The element y 11 is the element in the first row and first column of Y, or the number 96; y 12 is 110, and so on. The size of a matrix is its order. If matrix Y has 3 rows and 2 columns, we
I I
'
The Algebra of Matrices
x
21
may denote the matrix as Y (3 2). In general, a matrix with _m _.LQ."Y§~.aod c6TUmnsmay be.writt"EmasY(mxn) and dePictedas _______ ~.
--- ·-· ------·--·· ~~~,·-::·:·::··:a
A general element of Yin row i and column j is Yu or [y;;]. A matrix having only a singl~ row or a single column is a are denoted by a T:ioiClface lowercase such.as· cotu7ni'illf!Jctorvis-· -----------
symi:i-oT:
1-
[1.7] 2.2
'
-.8 .2
V=
' =
vector._~rs
a, p:, y:-'For example,
- - - - - - - - - - - - --·- ......... --- -·---------------·--·---·
-
I
v has scalar components [v 1 ]
n
1.7, [v
2]
= 2.2,
[v3 ] = - .8, and [v4 ] =
.2.~~c
tQ!__[ej:>r_~ted horiz~~!~IJy_ is -~_,!2~- ~~c~_r. ~!ric;es....~~- S(_)~~!~lll~~~pre
--ser:u_~_
by their vector components. For example, matrix V
ha~ tw~u::olun:m
vectors,
and
Thus Y may also be represented as Y = [y1 , y2 ]. Y also consists of three row vectors. , Certain matrices arise in st~tistical applications which have particular unique forms. These include the.fJUII m..Etrix or_n,f.!./1 ~ectorg~ro~, We represent these by 0 and 0, respectively. ihe unit vector, represented 1' c_ontains all unities. For example, \ ... ·----~---·-
The subscript on 1 is used to denote its order. ~g_(J~?_.~~--~~quare ,matrix containing all zeros to one side of the principal diagonal, and general elements elsewhere. For example, ...
0
J
0 tnn
A 4x4 triangular matrix is ._ .. - .. ~~...--~··- "~---- ~··- -~"' ..
~-----~_. ,
. [1 0 0
4· T= 6 7
3 0 6 -2 1
0
3
(Zero)]
6 -2 1 0
4
22
Introduction
T is a lower triangular matrix, with nonzero elements on or below the diagonal. An#J!p$F'·triangular ma(rix has nonzero elements on or above the diagonal. L~~diag_o_'!_f!!_!!Jatrix ils square with zeros in all positions except the principal diagonal; for example, J
D = [1
~ 1~ ~l
0
0 11
has diagonal elements [ du] # 0, and off-diagonal elements [ d;;] = 0 (i # j). For simplicity, a diagonal matrix may be written in terms of only the diagonal elements; for example, D = diag (16, 15, 11) or D = diag (du, d22• ... , dnn) A diagonal matrixwith unities as the diagonal elements is the !identity matrix,
andls-aenotecfFTor exa0ple, .
14
·
=
· - · . . - . '"-~
1 0 0 OJ [00 01 01 00 0
0
1
0
The s~scrir;>Lon Us Y_s~d to denote its row and column order. A[ symmetric matrixh any square array in which the elements above and -J below the-principal diagonal, element for element, are identical. That is, the first row is identical to the first column, the second row is identical to the second column, and so on. For example,
s=
13 [ 102
-6 4
4]
102 -6 12 17 -4 17 0 0
-4
0 -8
Each element [s;;] is identical to corresponding element [s;;]. For simplicity, only the lower symmetric half of a symmetric matrix is written explicitly. S may be represented as 13
s=
[ 102
-6
4
(Symmetric] 12 0 17
-4
0 -8
At times, two matrices are juxtaposed and treated as a single matrix. For example, let matrix X be
X=[;3 ;0
~] 2
If we extend matrix Y by juxtaposing X to it, the result is a matrix having 3 rows
The Algebra of Matrices
23
and 5 columns. Vis matrix Y augmented by X, or
V= [Y, X]
=[:~
ii i ~ ;]
1
is
Y'=[ 11096
44 81] 58 80
Transposition is rewriting every element of A, as the [ ji] element of A'. It is easy to see that the transpose of any symmetric matrix is the matrix itself. The transpose of an upper triangular matrix is lower triangular, and vice versa. The lower triangular form of such a matrix will be considered the normal or untransposed form. The transpose of an nx 1 column vector y is the 1 xn row vector y'. Similarly the transpose of a row vector is a column vector having the same elements. The column form of a vector will always be considered as the normal or untransposed form. That is, v' is always the row vector form of v. The transpose of a transpose, such as (A')', is the original matrix, A. The placement of the transpose symbol is sometimes important. For example if matrix Y has two columns, y 1 and y2 , then Y' has two rows y; and y~. By comparison the rows of Y are two-element row vectors, denoted (Y'h, (y')z, and (y')s. The vectors represented are named and described in cases where ' this may be confusing.
Addition, Subtraction Two matrices are conformable for addition or subtraction if they are of the same order. The sum or difference of two mxn matrices is the mxn matrix of sums or differences of each of the elements. That is C =A± B implies that, for each element, [ c;;] = [aij] ± [b;;]. For example,
[3~
;
~] + [;
0
4
;
6 5
~] = [~ 1~ 1~] 7
9
5
11
The operation of matrix addition is commutative (A+B) = (B+A), and associative
[A+(B+C)
=
(A+B)+C].
24
Introduction
It can be seen that sums or differences of two or more symmetric matrices are also symmetric. A row and a column vector are by definition not conformable for addition unless one is transposed. If y; (i= 1, 2, ... , N) are N column vectors, then the sum of the N vectors is the vector of element-by-element sums. For example, let
Then 3
r
16-10+2 1 + 3-1]
~ Y;= 4+ 2+1 = 10+ 8-6
8 l3J 7 12
Multiplication
/"'~T'rhere are several forms of matrix products we shall consider. The simPiest(s the scalar prodJ.Jct of a- -"scalar-- and -·-a matrix. The product of the scalar c and the matrix A, written cA, is the matrix formed by multiplying every element of A by the scalar c. That is, each [ca;;] = c[a;J, for all i and j. For example, let c= 1/2 and y' = [ 110 58 80] ~-----"---·
-
---~--
-
''
--··~---~-
Then
cy'=1/2y'=[55
29 40]
m<:>.?LCC>!!!!!l~C>'l_vectorproduct the[/nner~~;;;~~;Jof rowand
G.JThf2 is a a CO}(Jrr!J?.Vf?QtQr, respectiyely:Go-mpufatfon of the inn-er· product of two vecTors, V' and w, requires that the vectors be of the same order; that is, the two vectors must have the same number of elements to be conformable. The inner product of two n-element vectors, v' and w, is the scalar that results from summing the cross products of each element in v and the corresponding element in w. That is, if c=v'w, then n
C= ~ V;W; i=l
For example, let
y'=[1/2
1/6 2/3]
and
z=[j]
Then
c=y'z= (1/2x4)+(1/6x0)+(2/3x12) = 10 It is obvious that multiplication of vectors in this manner is commutative; that is,
'
I
The Algebra of Matrices
25
v'w= w'v. Two vectors whose in.~Jer product is. zero are sai..d toJ~.EtPrH7g9.ona/. If vectors are.plottea·geometrically,' orthogonal veCtors are at right angles to one another. Just as v'w is the sum of cross products of the elements of vectors v and w, the product of a row vector and its own transpose is the sum of squares of its elements. That is, I
n
V'V=
2
n
V1V1
i=l
=2
V12
i=l
With z from above,
z'z= [4 0 12][
~l = (4X4)+(0X0)+(12X12)
.
1~J
=160 T~~-inn~~uct oL"!. ye<:;Jo~ v ar'ldi!~.9YY1'1tr~nspose, v'v, is the sqyare leng!!J of v, writteniVfThe square root of the prodljct, or lvl, is the length qf vecto~r:cifi~ngth"unityk'said'to' be~n-ormalized:..bny vector v can be transformed to a normalized vector, v*, by scalar multiplication,
(
/.~~") ~ ______ v· Ivi _........--/;'~""'\ vv
&IYY.,." \ )
!
r..
:'AJ
For example, let
.
The length of y is
vYY = v'4 = 2. The product 1/2y is
··~112y~[J The length of y* is 1, and y• is the normalized vector. If v and w are both normalized and are orthogonal, the two vectors are ortfro'mmrrai; thatts·;-if lVI = lwl ,;Ta-nd.v1w=;lf.Tivtfs'nof orlliQgonal to-·v:it'riJ}iy
be orTFiogo;ia/TierTby'"-----c:---·~_;:::-::-.:~~:=::.···-··--·· "'
"·-··-·--·......,.._,.
For example, let
w.L 1= w-(v'w)v ---. . ) - - - · - ...---...--
and
-
{\
I '
..
0'1"~' . ~\,} 1111"'\".., ~--<....
26
Introduction
Thus v'w = .5 and the two are not orthogonal (both are normalized). Then
W"{~J -5m =
.25] .25 [ .25 -.75
Vector wJ. is orthogonal to v. It represents a residual vector, or just that po(tjon of wwhich i"s at rignfarlglesto v: wJ. is not necessarily "norinallzed,anclrenormal~q:IL"r'ilaTb-e r~q·urrea~Ortfi"og"onallzatTonis 6rder-!=ll5~ciHc; vJ. and·w arealso orthogonal vectors, but vJ. is the residual v-vector from w. The two are not the same as v and wJ., although both pairs are at right angles on the same dimension graph. Vector products can produce results that are usually represented in scalar algebra. The reader may wish to demonstrate for himself that the product of the 1 xn unit vector 1' and a conformable column vector y is simply the sum of the elements of y. The square length of ann-element unit vector is n. Thus the mean of the elements of vector y, is
1 1' Y v.=n The vector of mean deviation scores is
=
y-y.1
The variance of the elements of y is 1 n-1
n
S 2--~ y
-
~
(y1 -y)Z •
i=l
Substituting for Ya, multiplying and combining like terms, Syz
1 =--1 (y'y-ny.z)
n-
Since y'y = L1Y1 2 , this is the common computational form for the sample variance, but derived through vector operations.
The Algebra of Matrices
27
,("3': The product of two matri~es, A and B, is the matrix of inner products of e\rct(row vector of A and each column vector of B. The result has as many rows as A and as many columns as B. For A and B to be conformable for multiplication, the row vectors of A must be conformable for multiplication with the column vectors of B; that is, matrix A must have as many columns as B has rows. If A is of the order mxn and B is nxp, then the product C=AB 'is mxp. Each [ci.i] is the inner product of the ith row of A and the jth column of B. That is, n
[c;j] =
:L
a;kbk;
k~!
As an example, let
A=[!
~
6
~]
and
Then C=AB
= [(2X1 +2X0+1 X4+3X2) (2X2+2x0+1 X1 +3X1) (2X1 +2X3+1 X1 +3X3)] (4x1 +Ox0+6X4+0X2) (4X2+0X0+6X1 +Ox1) (4x1 +Ox3+6X1 +0X3) [ 12
8
18]
= 28 14 10
~ ~ tiYf?itix_muUiplicatio~-~:~t g~:o_:~~~~;m~~~ln the exampi~!_~~J?.!_Od:_. \
uct !;)A is not .QJ~fine_g, ..since B ha~ tlJJ:§.€L9-9J!:J.!!Lil§.IlncLAJwo ro~.s>~ si6le, however; the result is C'.ln this case the same vectors are being multiplied asTirforming AB, burTflatl=ansposed order. This is a general rule of matrix ·multiplication: Thetranspo~.!~Qf.1ltVO or '1l9!.S-~ic~~!:!&!o the product.oL11Rm .separa!.!Llransposes, multiplied in reverse order. Matrix multiplicat"i9rL.j§_ E~s§Q.ci~tive [(Ae)'c,:;A(BC)}"a:naaisfiiEutlve[A(B±c)~_AB~i-ACJ. -----·--·· b(<:;rv:;;;c· --- . . - - . ... - -· .
Certain matrix•. prqc:tlJ91§ J~q,tJ_r~~-q ue_r)tly in_ -~tl~.t!~O.!:!:l£~.!~~~:: multipligati2_t:J_gL~--by a,diagonalr::r~tri~7bA, has. tl1.!:l. . E;lff~pJ of rotJI.tiplying every element in the_ i1b...Io.!1:'...Q.L~- by ~?;;]. f9~!mu!_!ipli9.~!iOI}_by -·~~goJ:l.~LmfitrJ2'
[email protected] aff~c!~J~'?in,?:2_f!~e ·.original ~atr~~Pre- or po_:_~multiplica.:--· tron by the rde!:'_!!!ym~t~r)<~~~ tb~_orrgrnal ms:!.tms. una~=-An m-element column vector v and an n-element row vector w' are always confarmable-for multiplication (unliR'e the situatiml'~h~)§..a row and w is a ~-,The cofumn vectorhas"""necoium·n~ the row vector one row, assurrng conformability of the two as general matrices. The product vw' is the mxn matrix of scalar products of each element in v and every element of w'. As an example, let · y' = [1
2 3]
28
Introduction
Then
[i] [ =[i : ~]
yy'
1
=
2 3]
yy' is the symmetric matrix of squares and cross products of the elements of y. The product of any matrix A and its own transpose, taken in either direction (A' A or AA') is symmetric. Let A haven columns, a1 (i= 1, 2, ... , n). Then A' has row vectors
a;. The product is
l
a,J
a~
S =A' A=
az
:' [at
an]
an
Multiplying each row of A' by each column of A, we have
a; a1 a; a2
· · ·
a; a"]
a~a 1
· · ·
a~an
a~a 2
S= [ ..
..
.
.
a~a 1
a~a 2
· · ·
a~an
Sis symmetric, since a; a;= a;ai for any two vectors As an example, let
B=
a; and ai.
[10 20 1]3 4 1 1 2 1 3
= [bl bz b3] Then
B'B
=[~ ~ ~ ~l[~ ~ 1
=
21 [ 8 11
3
1
3
4
;J
2
3
1
8 11] 6
6
6 20
The diagonal elements are the square lengths of the column vectors of B; for example, lb 1 l2 =21. The off-diagonal elements are the cross products of every pair of vectors; for example, b;b 2 = b~b 1 = 8. B' B is the matrix of sums of squares
The Algebra of Matrices
29
4X4
and cross products of the columns of B. Similarly, BB' is the matrix of sums of squares and cross products of the rows of B. When B is a "subjectsx tests" data matrix, B'B is the first result in the computation of the matrix of variances and covariances or of correlations. Let b'i be the ith row vector of B; that is, b' 1 = [1 2 1], and so on. It should be noted that B'B, the sum of squares and cross products of columns of B, is equivalently ~ibib'i· The result is obtained by computing a squares and products matrix for each row vector and summing. The same operations are performed, but in different sequence. That is,
B'B=[~}1
2 1J+[~}o o 3]+[~}4 1 1]+[!][2 1 3]
If A' A is diagonal, A~L<:LLQ.....b.e ..c.o1umnwise orthogonal; th_e.__liJ.~!" prg~~ts of ~~-~~!iP~[.Qicofumns otA is zero~~.l§.gjggonal, f!.J§!QJY.:.'t!.i$JlOrlflggg!J2.1. !£~1D...§.9.9J!L9n. the diagonal elements. of A' A. OLA.A.'.§r~.._unlt_x, }\..)~~~ to be orthonormal; each vector is normalized to unit length. For example, let --·~- ·--~
--
.5
.5 .5] -.5 -.5 .5 -.5 .5 .5 .5 -.5
A= [ .5
A is columnwise orthonormal. Every column ai has zero inner product with a; (j T" i), and unit length, a;ai = 1. As a result, A' A= I. However, it is not the case that AA' =I, since the rows of A in the example are neither orthogonal nor normalized. T~QQ!!Q.ts.of the.Jorm A'QA or AQA' ..are.C!I§Q.~YDJ.r:t:.~trJc. if a is symmetric. If A'QA or AQA' is diagonal, A is said to be orthogonal with respect to themetric Q.Theprodt.Jcts A' A, AA', A.'QA~ a~dAa.i''§reterrl"]edthe gramia-ns'
·~~"'''
•'• ••···"
•
·•
••·--·.~~>W•
•
•·
~'-••
,....._
(~the final matrix product we shall consider is the K!_C:'!~~-~-~ pro.d!.lc.t,_of two matri_g_e§. C= A®B is the matrix formed by juxtaposing scalar products of everyeiemenforA-withthe en.ffr·e~rraylrThaJ is, if A··rs m anci B is kxJ;C is theiTikXri/ilia.irix m·atrices.as.elements, of the form [ ai;] B. That is,
xn
ha.vTn-g
all
f
a21
C=A®B= :
aml
30
Introduction
Let B=
and
[-1-1 OJ2
Then
C
~A@ J~i-'~~~~i---i ti ::~ --PJ i B
[
0
0 -9
18
2 -4
There is no restriction upon the sizes of the factors. It can be seen that A Q9 Band B Q9 A contain the same elements but in differing orders. The definition defines the order of elements, with the first matrix containing the scalar factors. In a product of the form C(AQS>B)D where C and Dare conformable for multiplication with A, B becomes the posttactor; that is, C(A®B)D=(CAD)QS>B. The Kronecker product is distributive; that is, (AQS>B)(C®B) = AC®B. Kroneckerproducts are used in con~truc;ting contrastmatrice~JprJactorlal anarysrs=-o1~i?rfance·Ci;si-gns:·Assu.me·a ~ain-ettectconirasfvect~~ is _____ -'"" .......-~_,.
,...,..,..,--"~"-
a=
Y·t· [ 01] Y·z· -1 Y·3·
and a contrast vector tor a second, crossed factor, is
b=[
1]Y··t
-1 Y··z
The weights for the six subclass means for the interaction of the two contrasts is the Kronecker product:
Y·11 -1 Y·tz 0 Y·zt a®b= 0 Y·zz -1 Y·31
Y·:lz This application is discussed extensively in later chapters.
2.3
SCALAR FUNCTIONS OF MATRICES
Rank The rank of an m x n matrix A, is the number of columns (or rows) of the matrix that cannot be exactly obtained as linear composites of other columns (or rows). A column vector a;, is expressible as a linear composite of other columns a; (j ¥- i), it n
·i-1
a;= 2:
J=l
c;a;
+ 2:
C;a;
j=i+l
The C; are any real constants. The rank by columns is always identical to the rank by rows.
\ \S
The Algebra of Matrices
31
~ The_@~ a matrix can never exceed the smaller of its two dimensions.
That is, let r(A) belnefankormafrix A.Then____ - -----------------------·-·--·-··-··-r(A),;; min (m, n)
!(""'--~----------------~~--------~------------···--·-·--:!
~-=.min..(m, n),.the.r1..AJlL§£1LQ.._!.() ~-_9_!_Lu_~.!._a~lf r(A) <min (m, n), then~~ is of deficient ran)i._ ----· · --~-~-~-As an example, let
A=[~ 22 121 3
0
The rank of A is 3 since no column can be expressed as a linear combination of other columns. By comparison, let
A*=[~ 322 21]1
B=
[~
2 2]
3 0
the rank of B cannot exceed m=2; the third column is necessarily linearly dependent upon the first two (can you find the cj?). The matrix is of full rank 2, as long as the second column (or row) is not a linear function of the first. ~.i.."!DJ!...QLCU?~~~-~~r.l.f.§LO.ever exce§d.§..1b.~.. .§!!!.~~ei r .. separate ranks:
....,.....--·,-=--'""'-"""""''"'.......,~~ .... ~..
- ·---............._ ~
r(AB),;; min [r(A), r(B)] Further, if A is mxn and B i5'7ix/, and both··areoTr;~k n, then r(AB)=n. If n is less than m and /,the product is of deficient rank; otherwise it is of full rank. If B is square and full rank, the rank of AB is always equal to the rank of A. The rank of gramians. ~·~~~D.~--~-~'.i!!§.equaj to.. the fi'!.llk of. A Thus if A is square and of full rank, A' A and AA' are of full rank. If A is rectangular with m
-f"'
,_.,...___,_, •. ~ ,_....-«'>"<'"..,.,.""'"'""- ,.,..,
_"_,.,,..,_._..,'-"'''~•~-
,.- .. ·=..-,,_,_,.._..,.,,.,__....""
>~-.,~
"'"-"""•"-W-'>4"""""""'"""'""-"""""fl•""""'" •
-•~·•'
"·'"""~""'"~·"'>•-=:z•.;:c,'..,.'""'~"'-
.,.,. .,,,
32
Introduction
often_tactored. To preserve full rank, two conditions must met: (a) can . . . ·.-· .-.. ..be ,.,,,._..... .......,, no test··-be linearly dependent.l!pon o!bette~t~-~9.Jb~the number of observations must ext'eed tne-number of scores per subject. --···· ·· · · ~-·--·· ·-- ··
~~~
----~~"·
•
-
'~--.-~
......~"'-·'-··· ···- _..... ,._~~-<'"
-~
,,,
.,~
,_~--.~~
·~----.
.....-·--····" - ..
De~erminant
The determinant oLasquare matrix .A, wrill!;!f'!J~l,J§J!...Y.ni~~larJl.~ !l.!.~Jtb A.t....vy_hJ..c)l,§l~§2§.~.§.l:!'!!~-~ry_measure with re~ct t9 i!~.YE3~~s.t The determinant of a matrix_repres~nts th.t3\rolurne of th'e parallel piped generated'6yTtscolitmn'vecto'rs:ln two dimensions:ffi·e determin.ant is the.area ofthe ·pararlelo~cfraryi··generated by 'hie two vectors. In a ~sing,le-dimensfOn':~·Lsjhe IE~.ngth of the vector described . .. Co~side·~ th~ 1 1 matrix'A = [a 1 ]. A may be depicted as a single vector, a, of length Ia I in a single dimension, or drawn on a single axis, as follows:
x
------------~·----------
~--..........- . .-.--"""'·~--~-~-·~··- . . . . . ._ . . . _.... ""~···""'"".,....,......
al
.l
The length of ~ ..is.Ya'a. or a1 its13lf; aiJ>o, then, IAI~·' ··.-·~·For two dimensions, fi~st cons(der A diagonal. For example, let
A=[~ ~] = [al a2]
~'('A
!\ JJQ ~
1'"-0 ......
{i"< ·~·~" ....\ '14 ~ ,:J ad In a unit Cartesian coordinate system of two dimensions; we may draw the column vectors of A as in Figure 2.3.1. The.a~~i!!h~~~gle)ormt:l9 ~Y2.rrrP~: i~-~-~~.~~9_9~~~jrQ.01 the end points.of th~. v,~c!c:»r~.J~ la~llazl, or 5:~__!5. Th1s result can also be obtained directly by multiplying the diagonal elements of
A.
I I I I I
1-.1-1-l~,jf---.;0)
'>
Figure 2.3.1 tThe vertical lines are used in two ways. When the enclosed array is a matrix, such as IAI, they denote the determinant. When the array is a vector, such as Ivi, they denote its length. In the case of a 1 x 1 matrix or a 1-element vector, the two are equal.
The Algebra of Matrices
33
In the nondiagonal case, the angle between a1 and a 2 is of consequence. Consider
A=[~ ~] which may be represented as a parallelogram, completed by drawing the additional parallel sides; see Figure 2.3.2.
__ ....,
(2,3) _ _ _ _ .......
I
/
r
I
~
I
I
I
I
I
I
I ~(5,1)
and sin 9=sin [i-(a+,B)] =cos (a+,B) =cos a cos ,a-sin a sin ,B Further, 5 cos a= \1'26
3 cos,B=Vf3
.
1
Slna=\1'26
.
2
sm,B= Vf3
and 5 3 -1 2 ) IAI=V26Vf3 ( \1'26 Vf3 \1'26 v'13 =15-2=13 Again the result may be obtained directly from A, as the difference of products a11 a22 -a 12a21 • Higher-order determinants are more difficult to depict graphically,
34
Introduction
although their conceptual basis is the same as the simpler cases presented here. Generally, the determinant of any n xn matrix A is defined by
\AI= 2;;(-1 )18u8zk · · · Bn,. summed over all permutations of the second subscript from the natural order 1, 2, 3, 4, and so on. The total sum is across n! terms, each term being the product of n elements. The resulting products form certain patterns. The determinant when n = 2 is a 11 a 22 -a 12 8 21 , or the product of the diagonal elements minus the product of the off-diagonal elements:
Jll \ J a/y''a,~
',+
/
For n= 3, the products are across one element from each row; that is, Bu8zz833+a1zaz3831 +al;;azla3z-a13azz831-a23aazau-a33alzazl or
+ 1
-~h,~!:J~ n:::.~1:1 Jt:w, pattern .is
+
+
.more. cornple)(,_et,~ci.J~I- m..~Y. ~t3~e':'~ll!e!e£L~Y:
Le)(_ea,n_~i_£f?_by_ (T")in_orE;, The minor of any element [au], represented IMul, is the
determinant of the (n-1) x (n-1) matrix that remains when column i and row j of A are deleted. The determinant of A is found from its minors by
IAI =
2;;aij(-1 )i+; IMi;\
This is.the product of elements in any one row of A, each times 1 or -1, times the corresponding minor. For example, let
-3 -1 2 1
The Algebra of Matrices
35
Summing across the first row,
/A/= 2(-1)21[-~
+2(-1)41[~
2 -3 -1 2
_!JI-3(-1) 3 1[~
_rll +S(- ) 5 1[~ 1
1
1 2 -3 _!JI -1 2 1
-ill
Each of the smaller determinants may be evaluated by minor expansion, or by summing diagonal products as in the 3x3 example. Then . r;N' ["] Q -rr,o..
/A/ =2(-14)+3(-17)+2(-5)-5(-1 = 1
~
h~.JV-,.,.~-
t -· (}J"'"' M ultiply!!].9..r:lJ~~_E_L~--~-~~~-~~~..~-~.!~52.!.}2luJtiplying t.h~.,2~_r minant by en. That is,
/eA/ =en/A/
~-~----~----~-~·
The determinant of a product of matrices is the product of their separate detefffiinants:..------·-----..------·· · ---------·--·--··-----· ····-·-··---·- ...... -
·--------.,
/AB/ = /A//8/
F~m th~)Lf.QJ!Q1Y2..th.91JJ1.lt?I£h9DQing !IJVOLC>.IJV§LQ.r_columns of A mu[tipli_es tb~ .... determinant by -1. For example, let A be 3 X 3, with coTCi'mns'Ta";~a;:a~]' and . I
-~..,_.--~--~--..-•~-1
B=[~ ~ ~l 0
Then
0
1J
AB has columns [a 2 , a,, aa] and determinant /AB/ = /A//8/ =-1/A/
....
~" -11+~-b~/ = 1 +b'a
From_tbe expansion by minors, it can be seen Jb.9t the determinant of a triai1Q-U§'!..•IJ1Citrix i§Jbe product of its,djagonal elernents~ ..AII.coeHicien.. a 11 are zero, and all terms but the first vanish from the summation. Thus, if Tis (upper or lower) triangular,
ts··a;;··
except
n
/T/=fltii ·i=1
\.-
II . 1
36
Introduction
The matrix of variances and covariancesis afunction ofa gramian ofthe data matnx:-ForliTo-~be..nonsfngufiir"ffiere must be ~tleast asm;:~ny subjects as t~-sts~andn~-~ariabie""canJ;>§ exac::tly a linear combination of other test scores. the Q§ermiri<mf ()(the ·variance-covari?DCe ·calT€i~U5~--ge{i~~fi.?:ed variance-oTthetests,-shice-if-compr-ises a p-dimensionalvolume or dispersion measun3-t6-rt11E:;set --~-- -- -----'"- --------- · . - .
matrix' is
Trace The tr?_Q<;:l_Qfa.square matrix A, written tr (A), is the sum of the diagonal elements-of the IT!.<~Jr:_ix;-thatTs;·r;a;~·TnefraceTsencoLIQt~red iY1statTstiC-al_appJl: cations·whe~-=only diagonal matrix elements are of concern:-ancTtlie remai[iing niarrrx elements are ignored. "f:or exam-ple-, if . . . . ... ... .. ... . ""--·- "'""' ... --- •. --. ,_ ""
-,.
_,_~·-•·•
-h~"'·>--
''•'-"""'"'"~''x""~'--.---'
,.,...r_•••••
A=[~3 ~6 -21~] then tr (A)
= 1 +4-2 = 3
Although the trace is a simple matrix function, it has several properties that al\mportant. ·.
(J,,
If cis a scalar, then tr (cA) =
·c~~tr(A±B) =tr(A)±tr(B)
c · tr (A)
(); tr (CB) =tr (BC) ~~The trace of gramians C'C and CC' are both the sum of the squares of all of the elements of C; that is, tr (C'C) = tr (CC') = L;L;C;/. It follows from property 3 that if b is a vector, tr (b'Cb) = tr (Cbb'). Since b'Cb is a scalar, its trace is identically the same scalar value. If further b'; (i = 1, 2, ... , n) is the ith row vector of matrix B, then
Li b';Cb; = L; tr (Cb;b';) =tr Li (Cb;b';) =tr (CB'B) The result follows since
2.4
L; (Cb;b';) =
Cl:;b;b'; = CB'B.
MATRIX FACTORING AND INVERSION
There are two !Ypes of factorizations of matrices that have repeatec! applicationTrj-stat1stical methods::fhe firsTfs fac;toring.of gramfan matrix i.Qto Jria.n9!Jl<~:r.J~.9Jors, which ·a.re'thetransposes of on~anott"l~r.,Ir_!_?._ngyJ9LfilQ!9.rlng is present~tc:l here as a first step in matrix inv~rsiQQ. The second and closely -related 'technique involves factoring a rectangular matrix into components, of which one is an orthonormal rectangular matrix of the same order. For further information on both procedures, see Golub (1969).
the
a
38
Introduction
For the first column of triangular factor T,
tu=~
ti t-ail - tu For the jth column ofT
· 2' I=
3 , ... 'n +1
(j = 2, 3, ... , n)
i=1,2, ... ,j-1
tij=O
(Pivotal elements) ,1-1
au- :k tu,tj,,
i=j+1,j+2, .... n+1
tu=--•--:-=--'--1- -
t;;
With the check row included, columns of Twill sum to zero, within rounding error. The diagonal elements of T (that is, tjj) are the pivotal elements of the factorization. As an example, let B be the first three columns of the matrix of powers,
The gram ian B'B is symmetric and of full rank 3. That is,
A = BIB =
l
4 10 30
10 30 100
30l 100 354 :__
___________ __ _ -44 -140 -484
The check row is affixed. For the first column (j= 1)
tu
= V4= 2
10 t21 =-=5 2
30 2
tat=-=
15
-44 t41=-=-22 2
For j=2, t12 =
o
t22 =
= 5" ~5 t32=- 100-(15)(5) Vs YO
V30-5 2 =
V5
- -140-(-22)(5). ~ yl5 --6v5
t42-
The Algebra of Matrices
37
Triangular Factorization f.ll.11XLLsymOJ.fltri.c matrix-A can be factored into.the productof a triapgl!.!.l!r matrix and its transpose. We will represent this decomposition as
A=TT' The form of the decomposition may be represented pictorially as
The condition for factoring A is that IA[k~-~:~_er than zero: where
. [au ··· a,kl
A[ k -] - .: ·
\_
,<.-\..;> f) 1~.._Jv' '"' ,,.,..,,-,..- \J
:
ak,
···
akk
for k = 1, 2, ... , n. Most sum-of-cross-product and variance-covariance matrices meet this condition. --~-----·--·----·-·-· ·--~--------'"~--· ~ti10d0Tdecomposition is that of Cholesky or, equivalently, squar.~~~ root taqtoring. Othe·r-tact-oring.te"Chniql:ies-are'"avalfabie; "although .comparisons pres9n'ted by Fox (1965) indicate that the Cholesky method is both efficient and accurate. The Cholesky factors of a diagonal matrix 0, are easily ob,ained. Let the matrix to be factored be
du
(Zero] d22
D- [
(Zero)
dnn
The Cholesky factor is the diagonal matrix of the square roots of the [dii]. That iS;.... .......~- . . ·=- -~. ,- ............ · ··~»··· --"~-~........ ~· ·...., ... ........-.-....;......,. ..... ......... ,v . -- ............. - ..........~~.,...~-~---~···~~ ... -J-··~ ..
-
T=
l
vCI:"t ·
(Zero)]
(Zero)
\rd,:;,
VCJ;;_
0
It can easily be seen that TT' =D. .., Fora general nxn symmetric matrix A, the computations are more complex. We sh~ll begin by allowing for a check on computational accuracy. To accomp_sh th1s, extend each column of A by adding an element - , n
an+l,j
=- L
au
i=l
to the column. 1\Jl c.g_mputations upon column j are extended to operate upon this additional element. '" · · ··· ·· -- - .. ·--··---· ... . .... · · " , .•.-....
_..------------
The Algebra of Matrices
39
For j=3,
t33 = Y354-(15 2 +{5VS}2)=2 t43
=
-484-(-22) (15)-(6V5) (5V5) . 2
=
-2
The following array is formed:
T= [
o ol
v5
52 15
5Vs
0 2
--------------22
-6Vs -2
It is easily verified that the columns ofT sum to zero, and that TT' =A, for the first three rows of both matrices. Had any column of B been linearly dependent upon other columns, or had B contained more columns that rows, both Band A would be of deficient rank. The effect upon T is that a pivotal element f;; becomes zero when the corresponding column is encountered. Further computation is not possible unless the dependency is eliminated. Some computer algorithms, such as those in MULTIVARIANCE (Finn, 1972d), will ignore the dependent column so that any further dependen€:ies may also be discovered. ""> , The Cholesky factorization can facilitate computing the determinant of ) ( symmetric matrices, especially if t. hey are of high order. According to the rule . for determinants of products, IAI = ITIITI = IT/ 2 . The determinant ofT is simply \ the product of its diagonal elements; that is, IAI=IT/ 2 n
=IT til j~t
Conveniently, t;/ is computed prior to f;; in the Cholesky algorithm. The log determinant, frequently of use in multivariate analysis, is log IAI =log
TIt;/
/
~j
0. 1.. .;:;
:l 1~' t)
th.
j= 1
= 2L; log [t;;] The log provides further accuracy for large matrices. For the example, IAI = (2XVsX2} 2 =80 loge /AI = 1.3863+ 1.6094+ 1.3863 = 4.3820 If A had been of deficient rank, at least one of the requiring that IAI = 0 as well.
f;;
would have been zero,
40
Introduction
Assume that A is a matrix of variances and covariancesfortwgte..sts._, Y1 and is the covariance of"y~·-a.nd Yi; while s;;=-.s; 2 and Sz2=Sz 2 are the respective variances.
Y2·
Sz1=s12
The resulting Cholesky factor is
We recognize t 22 as the conditional standard deviation of y2 given, or holding constant, y 1 ; t 22 2 is the conditional variance s 211 2 • This property holds for all variance-covariance matrices. The Cholesky factor contains the conditional standard deviations, holding constant a// prior variables, on the diagonal. The off-diagonal elements are the conditional covariances, given preceding variables. Thus triangular factorization is of significant use in all stepwise or ordered analyses, when we wish to examine the effect of some variables independent of those preceding. For example, with three variates the variance-covariance matrix is
The Cholesky factor is
0
s~J
s211 and S 3211 are the standard deviation of y2 and covariance of y2 and y3 , respectively, holding constant y1 ; s 3112 is the standard deviation of y3 , holding constant both Y1 and Yz·
Inversion The inverse of square nonsingular matrix A is the matrix A-\ such that ·-·---·"-'' "'-••• •- "'"'"" oO ---------·-Y·--~~·~---·-·-~~·~>0,,~
AA- 1 = I=A- 1A We need not discuss matrix inversion generally. For in most statistical computations we are concerned with inverses of symmetric or gram ian matrices; their inverses are simpler to find than for general nonsymmetric matrices. Inversion for diagqnal matrices is ?_traightforvvard. o- 1 is the diagonal matrix for n~ciprocafsotitie nonzero elements--of D; that is~·-· ···--~"-~---··· ·· · · · ~-
......
...-""~'
.....'"""'"'~····~..... '.
For example, let '\"
~>1"{'•
f ·~ ., ,;.
,p'
_;_ ~,.~r:c,~
"'
<•<":
.t,""' ~·
D=G 1~ 5~]
The Algebra of Matrices
41
Then
and D- 10 = DD- 1 = I. The inverse of a sml![Lsquare matri)( (for:_t;!~g_mple, ~of order 2, 3, or4) may be founa-trom-TfideTer~minant and minors. For n xn matrix A, the inverse is _,__ . .-. - .. --·- ~-- ·-=~p _ , . , . ..=
"~0--
~---".,.,...,_,.........__,_~,,._,_
B has elements [bi;] = (-1 )i+j
and
IMul
IMjjl
is the minor of [bij].
Let
4 A= [ 10 30 with determinant
10 30] 30 100 100 354
IAI = 80. The inverse elements are 1/80 times B', where
bll=(-1)211~~
1001 =620 354
C~c 'r.ti ..
b12 = bz1 = (-1 )31 1O 100 I=-540 . 30 354 ... 10 b13 = bs1 = (-1 )4 1 30 bzz= (-1) 4 1
b23 = bsz = (-1) 5 1
~.
301 100 = 100
- .:r "1 ~ J"l~
'I
Q
o
"'1 Q i)
4 301 30 354 =516 4 30
101 =-100 100 101 =20 30
-6.75 1.25] 6.45 -1.25 -1.25 .25 It is easily verified that A- 1A= I. In the case of a symmetric matrix, IMi; I = IM;; I and the inverse is also symmetric. For larger matrices, or matrices with fractional elements, inversion by determinants is neither. accurate nor practical. Thus we shall make use of two inverse relationships. First, for any matrix A, which may be expressed as a
~
0
0
..,110
;lo.
42
Introduction
product BC, the inverse of A is equal to the product of the inverses of Band C in the reverse order; that is, if A= BC, then A- 1 = c- 1B- 1. Second, the inverse of a triangular matrix is easily determined. Thus, for symmetric A we may utilize the Cholesky factorization, A= TT', and determine A- 1 as (T-1)'T- 1. In the previous section, we discussed the triangular factorization of A to the product TT'. T- 1 is computed as follows. Maintain the nxn matrix T, plus the check row as obtained from factoring A, with elements
For the /th diagonal element of T- 1
(i = 1, 2, ... , n)
1
(t-1);;=-t.. 11
For the ith row of
r-
1
(i = 2, 3, .... n)
j=1,2, ... ,i-1
W1)u=O
j=i+1. i+2, ... , n
For the (n+1)th row,
,· n
(t- 1)n+t,J= L
tn+t.h·(t- 1),..;
;
,.
.f
j=1, 2 .... , n
k=.i
Elements of the check row should equal unity, within rounding error. As an example, let us recall T, the Cholesky factor of A,...__ -.....~~-----~,~-~..,
4 A= [ 10 30
10 30] 30 100 100 354
Including the check row,
T=
2 V50 OJ 5 0 15 5\15 2 [--------------22 -6\15 -2
The diagonal elements of T- 1 are (t- 1}11 =1 /2, (t- 1b= \15/5, and (t- 1}33 =1/2. For the first row (i = 1)
The Algebra of Matrices
43
Fori=2,
For i=3,
~ (t-1) 3 1-- - 15(1/2)+(5Vs)(-Vs/2)2
(t-lhz = -
(5Vs) (Vs/5)
2
2
5
= --
2
For the check row,
W1)41 =- [ -22(1/2)+(-6v's)(-Vs/2)+(-2)(5/2)] =
1
(t- 1)42 =- [(-6v5) (Vs/5)+(-2) (-5/2)] = 1
u-
1)43
=-(-2)(1/2) = 1
Forming the array, 1/2 T- 1 = [ -Vs/2 5/2
0
0]
v5/5
-5/2
0
1/2
It is easily verified that TT- 1 = T- 1T =I. Returning to A,
r;--::(T-~~~ ~--·---h.._,....,,_M~>-'"
3114 = [ -27/4 5/4
-27/4 5/4] 129/20 -5/4 -5/4 1/4
It may now also be verified that AA- 1 = A- 1A= I. It can be seen that the inverse of A agrees with that obtained by the method of determinants. If matrix A had been singular, one of the diagonal elements [f;;] would be
zero. Tnthis"caseTFiein'verse.diag·on·arelemerilcann"oTi)e founifandtileTnverse -matrix Ts'il·arcrerrnea~-Agenera·rr.zea Searle~ T966: cliapter e) for A be ignoring rows that have dependencies. \Y~.:>h~!Lr(3str..l~t our~Ei!.Y.E?2Jl~I~J9 _symmetric matrices with nonzero determinant$, .. - YJ& .have noted that the inverse of a symmetric matrix is itselfsymmetric. AIS():. if A="ac, 'fhen c~1 B~1 ~A~t:"''"f:or allsguare"'lllatrices~ "the fhe. transposeTst.lie transpose"'artiiei"ll\~erse·miirrl~; thai is,····· ·
may
rrrverse..(see
o15talnea;--6y
Inverse of
"-··~!-- .. -,_,___.,.._,_,.-.~---·
~--~"':\.,._._._.y·>~~-·;_~
-. ,,
---~.,.,.
:;·(A.c1)'=(A')-1
This follows since
(" ...
)
·····m·-· · · · · --"" and
(A- 1)' A'= I'
Since I'= I, (A- 1)' is the inverse of A' and must be identical to (A')- 1 . The left inverse matrix is also the right inverse, A- 1A=I=AA- 1 . Thus if K is square and
44
Introduction
orthonormal by columns (K'K = 1), it must also be orthonormal by rows (KK' = 1), since K' = K- 1. The determinant of the inverse matrix is the inverse of the original determina~·-----" ·~---"'"··~··,·· "·--", -----"·· ...-;_···~-'-""'"""""-->-'"'--It''·"·'"
In the preceding example, IAI =80. Then IA- 11 = IT- 112
=Gx ;x~r =a~ Let an nxn matrix A be expressed in partitioned form, with sections having n 1 and n2rows and columns, respectively, with n1+n 2=n.
A=
[
Au
l
A12] n1 rows
-~z~-r-~2~-
n1 columns
n2 rows
nz columns
The determinant of A can be expressed as a function of the determinants of the separate portions. Specifically, IAI = IAuiiAzz-Az1Au- 1Atzl = IAzziiAu-A1zAzz - 1Az1l It is necessary that Au or A22 be nonsingular. Note that in utilizing these relationships, it is normally preferable to invert the smaller of the two matrices, Au or Azz.
The computation of the Cholesky factor or inverse matrix requires computer routines for both speed and accuracy. Good subroutines for these functions are contained in the Chicago package (Bock and Repp, 1970). The MU LTIVARIANCE program (Finn, 1972d), contains a single routine with three entry and exit points, respectively, for one, two, or all steps: (1)
A=TT'
(2)
T ::;> T- 1
(3)
(T- 1)'T- 1 =A- 1
Orthonormalization SupposJUhat X is an m x n rectangular matrixof full rank n. Then X'X is of fulirankand is symmetr(c. Facf<5"ri ngK"r>C bylhemetfioa of Cholesky,
weil1ave--
X'X=TT' and
The Algebra of Matrices
45
Let us examine the product
First, X* has the same order as X but is columnwise orthonormal; that is, [X*]'X* = T- 1 (X'X)(T- 1)' =T- 1TT'(T- 1)' =I Second, columns of X* are linear composites of columns of X, as defined by
(T- 1 )'. Specifically, x; is simply x1 normalized; x; is a linear function of x1 and x2 such that x; is orthogonal to x 1 and is normalized; x; is a linear function of x1 , x2 , and x3 such that x; is orthogonal to x1 and x2 and is normalized; and so on. Th e_Qctb_ong£J17J!l!:?.?.ti91L..QLX.~-~-*j~ I:)_;J_l:li,V.~~ nt to facto~i n2 -~--~~~~ :~~e
product of X* . c:!D9 t:~l)lsl_sup~r§.~DPt (*)91LK!§.L1.§~a tQ_cg~_Q()tEtJJ:i.~l([ig_fl1_atnx IS "QithQD-ormar"thafis, (X*)'X* = I. In generaL any m.x n L!l£!§l.ngyl~r:na!r.~:fiifyJr re1r_1_k '2·..-lll~e.f~_gJgE~<:!.).Dt.Q_Jttepr.gdu~t-~f__CI_n__ p_xtz_g.QI!J.n1n.'AflS.ElQrJb.QOPrm<'l.l matrix and an nxn upper triangular matrix. The decompositiqns;.?D.l:J()repre-
se'rlfeCl as···-·-···-~-------~-····· ;:::· t::.> .•.- ... ·...~""-.) . ... . This may be depicted as
·? 1\.
X~X*T'
······ · ··
·
· ···
..-·· .f""'
,,._.,.,,r'---_,.,t-<', ....
--··--
.--------,*
The method of decomposition is the Gram-Schmidt technique, which does not requirecompLita.tion ofX'X ftieCnOiesky factor from X' X. A modification of the ma:m.::Schmidtpr6ceau're(l3fork;T967rancfotfi'er'meffioos, such as that of Householder (1964), are more accurate and should be employed for large problems. The Gram-Schmidt method operates by successively computing columns of X that are orthogonal to preceding columns, at each stage normalizing the newly computed vector. Bjork's modification postpones normalization until all orthogonal vectors are computed. It is also possible to produce a factorization which is orthogonal with respect to an mxm diagonal matrix metric D. The outcome is a matrix X*, such that (X*)'DX* =I. The computations involved in using D reduce to those of the simpler case if D is an identity matrix. Therefore the general algorithm is presented here. Assume that we wish to factor matrix X with columns X; (i = 1, 2, ... , n).
or
46
Introduction
For the first column of X,
1
x; = -t xl
(Normalization)
11
For the jth column of X (j = 2, 3, ... , n) [t']u=x;ox~
i=1,2, ... ,j-1
i-1
(Intermediate orthogonal result)
xt=xi-L [t'lux; i=l
(Normalizing constant)
[t']u= Y(XT)'Dxt * 1 _L X·=--X·
'
[t']u
(Normalization)
.~
Let 8 be the first three columns of the order-four matrix of powers, as in the Cholesky example; defineD to be an order-four identity matrix.
21
1~
4
3 9 4 16 For the first column,
[t']ll = v'4=2 Forj=2,
1l
l:~l
[-1.5j
b~= [;J-5l~J ~:~ =
Forj=3,
15 [ 5Vs -Vs/10 0~ [-11] [ 1~ -15 l5l...555 -5\15 [-3Vs/1 ·Vs/10 3Vs/10 [ t']13 =
b_L = 3
4
9 6
t']23 =
=
-1 1
The Algebra of Matrices
47
b*=[-:;] 3
-.5
.5
l
Forming the entireprrays,
.5 -3Vs/10 B* = .5 -Vs/10 -.5 .5 Vs/10 -.5 .5 3Vs/10 .5
.5J
and
It is easily verified that (B*)'B* =I and that T' is identically the Cholesky factor of B'B, obtained earlier. B* is termed an orthonormal basis for the columns of B~ Again, if the rank of B is less than n, one or more of the diagonal elements ofT will be null. These are the normalizing constants (lengths) for the orthogonal vectors. The column of B that is dependent upon preceding columns will go to zero in s·\ and would have zero length. This can occur either by columns of B being direct linear combinations of one another or by B having more columns than rows. While the Cholesky factor contains the conditional standard deviations and covarfances lofeacfl-column- "giveri'"i)"rece~diri9--coTu mils, ·arcontaTos~th~ ccfhditiorialvarra.·61esfileniselves. That is b; is the part o(t>;iilat"istridei)endent ----········-··--~~----·-··•:J:''""' . . . ............... ~~--------~--~-~--- . --~------··"····--------··"'·~~·-·· ....,.••~"-~giy~~lEL:....b."-J§..Jb!U.esidui;!J b3 that Lsifldependent of the varial;>)§.§. LfL~ 1 §nd b 2 ; ar1d so on. Readers rnay recognize B* as containing theJirst three normaUzed" ortliogo.nar,Joiynom.iar contrast~of coT.~C>!l.~~:: - - .. --. -'"~~,-=-----M--"""""-~~,.,,..,.,.,..,.P.· '•'".»>.<'~-"'-""""~"·" ,•,_, ,. ,,,, ,,
2.5
MATRIX DERIVATIVES
The calculus of matrices is a complex topic, but fortunately we do not require more than a few operations. The needed rules are presented here. The reader should be able to see a close resemblance of these rules to those of scalar algebra. ~~!!l
48
Introduction
Rule3: ~~; -a~' Av · · ----
~--J!..Y_:.~~v
if A is ann xn symmetric matrix.
Corollary 3.1: If A= I, then v' Av = v'v and av'v/ av = 2v. 2.6
CHARACTERISTIC ROOTS AND VECTORS Vo_ ,/\ (~ '"" t.J..~~
r·'
Freg\,J§.DJJy in statistical applications.. it ibecomes necessary to find.a.v.ecto r J! Ln orderJo c:lefllle a linear combination of variables y; that has ma~l!J1um vari____ ---·r -----·_ance._Ihat is, if we let y be the vector variabl_e with elements Y; (i = 1, 2, . :.: ,_n), tbe Pr.QbleJn~~-q-~e-off!ri-afhg )rsu'cn.fha.f:V1(x'yfi_s maximal:. It will be shown the next chapter th-at ifthe var-Tance~covarfarice matrix of the y; is n X n matrix A, then the variance of the linear combination x'y is the scalar x' Ax. E\@ll.Jt{ithout this understanding, thematrix problem is.on.e.of general interest: tollnd x th.aim.axfmizes~x,Ax,-·wfie·r·e .A is any symmetric matrix of grder n and rank m ,=---~----,.,..,-·--·~-"---,-".
•••~
•",o.•,,-<
.. ,,.,_"""''-•"Y''-i•
·-~-·<•~--
in..
(':;;;(:;y:··
-·-·· ......................
. .. . .. . . .
.. -.........
Since the maximum value of x' Ax can become infinite, xis frequently restricted to having unit length; that is, x'x = 1. Let us introduce A. to represent the maximum value of the variance: (2.6.1) The maximum value A. may be obtained by maximizing x' Ax or, more conveniently, by maximizing the equivalent expression
Setting the first derivative with respect to x at zero
ag =2Ax-2A.x
ax and
AX= AX This may be reexpressed as the set of n homogeneous equations (2.6.2)
(A-A.I)x = 0
The maximum A. and corresponding vector x are the non-null solutions of these equations. I is an identity matrix of order n. In order for there to be a non-nu II x, it is necessary that (2.6.3)
IA-A.II =0
If IA-A.II were not equal to zero, we could invert A-A.I and solve for x by premultiplying both sides of Eq. 2.6.2 by (A-A.I)- 1 The only solution for x would then be x= 0. This is not the only solution when A-A.I is singular and cannot be inverted. The s~[.l:!.!l2.rLA of Eq. 2.6 .. 2.J~the characteristic root, or eigenvalue, o!~.:..!'LS the a5s0ciated charagteristl9. ve'QIP-r:· c£elgenvecfoT solve-·'Ec(-2.6.3 we· --- ---~-~-·-~"'". ,,
lo
The Algebra of Matrices
49
subtract A-AI and write the expression for the determinant in terms of the resulting elements. The result is a polynomial in A of degree n. This characteristic equation of A has as many roots above zero as the rank of A. The remaining n-m roots and vectors, if they exist, will be null. For example, let 4
10]
A= [ 10 30 Then 4-A A-AI= [ 10
10 ] 30-A
The determinant of A-AI is lA-AII =A2_34A+20 Setting the determinant to zero, we find that the characteristic equation is A2 -34A+20=0 We note that the polynomial is of degree two, and has two real roots. Solving for the two roots, At= 33.40
and
Associated with each root or eigenvalue is ann-element eigenvector x. The vector corresponding to A1 is Xr. with elements Xu and x21 . Substituting A1 in Eq. 2.6.2, A-A 1 I = [-29.4 10.0] 10.0 -3.4 From either row, the ratio of Xu to X12 is .34 to 1. Taking these values as the initial vector, we may normalize so that x;x 1 = 1. That is,
I[ :~~J 1
I= VD156 = 1.06
1 [ .34] [.32] XI= 1.06 1.00 = .94
x1 is the first normalized eigenvector. Its components are the weights that produce the maximum value A1 = x;Ax 1 , subject only to the restriction of unit length. There is a second solution to Eq. 2.6.2, with A2 = .60. Substituting in 2.6.2 and solving for x 2 , the normalized vector is .94] Xz= [ -.32 x 2 is the second eigenvector of A, and maximizes Eq. 2.6.1 to give Az.ln addition to unit length, x2 is subject to the condition that it is orthogonal to x1. In the example, the characteristic equation is of degree two, and there are no further solutions. It is convenient to form an nxn diagonal matrix of characteristic roots, A,
Introduction
50
and an nxm matrix X, having the characteristic vectors as columns. A and X have the following properties:
1. X'X = lm; that is, nonzero eigenvectors are orthon_ormal if A is symmetric. 2. AX=XA; X'AX=A; XAX'=A. 3. IAI=IXAX'I=IAI=fp,u. i=l
n
4. tr [A] =tr [XAX'] =tr [A]=
2:
Aii·
i=l
It can be seen from property 3 that at least one of the eigenvalues of a singular matrix will be zero. A matrix having aiJ.J1igeny€,],!,!~§ gr~at13L tb,,an zero of Jhi3.Yalyes a,re null and O!J:l_~($~_are is said to be positiye definite. 9re~tert11an zil"ro,tile. matri~'iS§aicftobe positiveserriideHniie. -·. - . - · AttimesTrl''stati.sticaYappi'ication~ the p-roblem is'more'complex and maxima are required of the ratio of two quadratic expressions. Here,
TfSQfue
x'Ax} A= max { x'Bx If B is positive definite and symmetric, it may be factored into triangular components, B = TT', by the Cholesky algorithm. Letting v = T'x and x = (T- 1)'v,
x'Ax } { A= max v'(T- 1)TT'(T- 1)'v =max {
v' (T- 1}A(T- 1) 'v} v'v
If we now restrict v to unit length so that v'v = 1, the problem reduces to the simpler one of computing eigenvalues and vectors, A and v, of the single matrix (T- 1}A(T- 1)'. The X; may be obtained as a second step, through x= (T- 1)'v. The computation of eigenvalues and vectors, even for small matrices, is formidable. Algorithms have been forwarded by Householder (1964), Ortega (1960), and Wilkinson (1965), and have been summarized for programming in Ralston and Wilt (1965, 1967). Since eigen solutions constitute a minor role in this book, these methods are not discussed here. The MULTIVARIANCE program utilizes a subroutine written according to the Householder specifications (Bock and Repp, 1970).
2.7
EXERCISES
Understanding the components of the matrix expressions in the following chapters is necessary to understanding the statistical methodology. The following matrix exercises provide examples of the important aspects of matrix computation, and are recommended to readers without matrix algebra preparation. Answers to the problems are given in the Appendix.
The Algebra of Matrices
51
Matrices
A~[~ -~ 2 -1 2 0
o~[~ H~
B=[-1~
o2 o0 0ol 0 4 0 0 0 2
4 -1
2 0
T=[:
1 -1
~]
-i]
K= [V:l/3 v'313 v'3t3
U= [ -1.51
0 1
2
~]
0 30 32
-3] 32 41
·~[~1]
0] f=[~~l e= ['1.5 1.0 2.5
4 -2 8 -2 [: : -2 -2 17
-1
C= [23 0 -3
~]
2 0
Vs/6]
-v212 0
-V6/3
v2t2
a~[~
v6t6 0 0 0 1
0 0 1 0
~]
Problems 1. 2. 3. 4. 5. 6.
Elements: What is the value of a22 ? of c2 ? Definitions: Write Is; 1s. Transposition: Write A'; e'; H'. Addition: Compute C+T; (e+v)'; 0-H. Scalar multiplication: Compute (1 /1 O)A; 4v. Vector multiplication: Compute e'v; v' e. What property do v and e have with respect to one another? Compute ef'; e'f; 14v. Compute 141.; e'e; v'v. What is this function of the respective vectors? Find lei, the length of e; Ivi, the length of v. Normalize v; that is, find v* = (1/lvl)v. What is lv*l, the length of v*? 7. Matrix multiplication: Compute AI; lA. The transpose of a matrix product is the product of their transposes multiplied in the reverse order. Find A8; (A8)'; A'8; 8'A; 8' A'. Compute OA, 80. What are the effects of pre- and postmultiplication by diagonal matrices? Compute T+U; TU. In what way do these two results resemble T and U? Compute KK'; K'K. What is the nature of K? Compute A' A; 8'8; AA'; 88'. Note that A' A and 8'8 are the sums of squares and cross products of the columns of A and B, respectively. AA' and 88' are the sums of squares and cross products of the rows of A and B, respectively. Compute A'OA; BHB'. Note that the gram ian symmetry is preserved even when the product is taken with a third symmetric matrix as a metric. Compute O®T; f®v. Compute e.= (1 /{141 4 } )e'1; e- e.1. What are these? Compute GA. How is the product related to A?
52
Introduction
8. Factorization: Find the Cholesky factor of C; that is, find W, where C= WW' and W is lower triangular. Verify that C= WW'. What is the determinant of W? of C? Find the inverse of W. What is its determinant? Verify that W- 1W = WW- 1 = 1. Find c- 1 = (W- 1)'W- 1. What is the determinant of c- 1 ? Verify that c- 1C=CC- 1 =I. Find A*, orthonormalized by columns with respect to the metric D. Call the corresponding triangular factor R. What is A(R- 1)'? Verify that (A*)'DA* =I. 9. Rank and determinants: What is the rank of A? of A' A? of AA'? of D? What islA' AI? !AA'!? IDI? !AA'DI? !G!? !GDI? 10. Trace: What is tr (TT')?
Section~
Method
CHAPTER
I
Summary of Multivariate Data 3.1
VECTOR EXPECTATIONS
A basic concept of multivariate statistics is a vector variable. When each member of a population is represented by more than a single outcome measure, the measures may be juxtaposed to yield a vector; to summarize data in this form, we may use matrix operations and develop conventions for the simultaneous treatment of multiple variables. A vector random variable or random vector is a vector comprised of p (? 1) distinct random variables; each variable or element may be described by its own univariate density function. Let
be a vector random variable. Each x 1 is itself a random variable with expectation (3.1.1)
and variance (3.1.2)
The variance u;1 is represented as u 12 , to agree with elementary statistics presentations. The standard deviation is the square root u 1. Let O";j be the covariance of random variables x1 and x;; that is, V(X;, X;)= :if(X;- f.L;){Xj- f.L;) (3.1.3)
The covariance of
X;
and itself is the variance of x1. That is, when i = j,
u;; =
u 11 =
u; 2 . Further the covariance is always symmetric so that u;.;= u 51 •
From univariate theory, we have the following results. For c constant,
54
:if(CX;) = Cf.L;
(3.1.4)
:if(X;+C) = f.L;+c
(3.1 .5)
Summary of Multivariate Data
r(cx;) = c2a"l r(x;+c) = r(x;) = ai
55 (3.1.6) (3.1. 7)
For two variables x 1 and x1, /
lt(x1+x1) = p.1+ p.1 r(Xt+XJ) = ai+a}+2CTij
(3.1.8) (3.1.9)
Like the variance, adding a constant to x1 or x1 does not alter the covariance. Multiplication of either variable by a constant multiplies the covariance by the same quantity. That is, with c and d constants, r(x;+C, X1+d) =r(x;,
XJ) =
CTij
r (CX;, dxJ) =Cdr (X;. XJ) =' CdCTu
(3.1.10) (3.1.11)
If we standardize x 1 and x 1to obtain z1 and z1, respectively, we have and
(3.1.12)
Then (3.1.13)
and r(zt) = r(z1) = 1
(3.1.14)
The covariance of z1 and z1 is
(3.1.15)
=pij
Pu is the correlation of X; and x1and has Iimits -1 and 1. Let us proceed to the entire vector x. The expectation of a vector variable is the vector of separate expectations .
.%'(x)=.%'[~~] =[;~f~~]=[~:l Xp
ft(Xp)
=p.
(3.1.16)
fJ-p
The variance of any vector variable is the matrix of variances of each of the variates, and covariances of each pair of variables. That is,
r(x) =lt[x-lt(x)][x-.%'(x)]' =lt(x-p.)(x-p.)' = lt(xx') -p.p.' =l:
(3.1.17)
56
Method
l: is the pxp variance-covariance matrix of the variables:
l:=
(]"12
(]"12
0"1p
CTz1
O"z2
(J'21J
(]"31
(]"32
0"3p
(3.1.17a)
The expectation and variance- covariance matrix of any vector variable (observed scores, errors, linear combination of scores) are the basic summary measures which parallel f.L and u 2 in univariate theory. In fact, when p= 1, p= f.L and l: = u 2 . When p > 1, l: has the separate variances of the p variates as diagonal elements, and the covariance of xi and xj as the ij off-diagonal element. Since uu= u;i, l: is symmetric. The covariance of each variable with itself is the = ui. variance, or diagonal element, uii = ul; the standard deviations are The algebra of expectations and variance for vector variables parallels that for scalars. Let c be any given scalar and c' be a vector of constants; that is, c' = [c1, C2 , .•• , Cp]. We have the following results:
va:;;
iif(cx)= cp
(3.1.18)
iif(x+c) = p+c
(3.1.19)
r(cx)=c 2l:
(3.1.20)
r(x+c) =r(x) = l:
(3.1.21)
As an example, assume that we have two random variables x 1 and x 2 . Let f.L 1= 10, f.Lz= 11, u 12 = 100, u 2 2 = 64, and 0"1z=O"z1 = 40. Then
p'=[10
11)
l:= [100 40
and
40] 64
Suppose further that c = 1/2. Then iif(cx') = (1/2)p' = [5.0
5.5]
and
Thus if every score on both variables is divided in half, the resulting scores have 1/4 the original variances and covariance. Assume instead that 3 is added to every score on x 1, and 2 subtracted from x 2 . Then c'=[3
-2]
iif(x+c)'=p'+c'=[13 100 r(x+c) = [ 40
9)
40] 64
The expectation of a sum or any linear combination of the random variables
I
Summary of Multivariate Data
57
of x can be formulated in matrix notation. For example, with p=2 and 1'= [1 1 ], we may apply Eq. 3.1.8: Ef(X1 + X2) = Ef(1 'x) = p, 1+p,2= 1'p, Generally, if cis any vector defining a weighted linear combination of the X;, the expectation of the linear function y= c1x1+c2x2+ · · ·+cvxv is Ef(y) = Ef(c'x) =C'p,
(3.1.22)
The variance of the sum x1+x2, according to Eq. 3.1.9, is CT1 2+CT22+2CT12 • This is also obtained through vector manipulation. Note that r(x1+x2)=r{1'x). Expression 3.1.9 is equivalent to 1 'I1, since r(1'x)=1'I1
= [1 1] [:::
::~J [~]
Further, if I is diagonal with CT 12 = 0, then 1 'I1 = CT12+CT22. That is, the variance of the sum of independent random variables is the sum of their separate variances. For a general set of weights c, the variance of the linear function y= c'x is r(y) =Ef(c'x -c' p,)2
= L; LJ
C;CjG'ij
=c'r{x)c (3.1.23)
=c'Ic If y1 = c'x and y2 = d'x are both linear combinations of the Y1 and Y2 is
X;,
the covariance of
r(y 1 , y2) = Ef[(c'x-c' p,) (d'x-d' p.)] i=l J=l
=c'r(x)d
=c'l:d
(3.1.24)
Results 3.1.22 to 3.1.24 can be formulated in a more general way. Suppose that we have a vector variable x with p elements. Let p, and I be the expectation and variance-covariance matrix of x. To form q ("" 1) linear combinations of x, we may construct a qxp transformation matrix C, with each row containing the weights for a single linear function. The vector of transformed variables is
y=Cx
(3.1.25)
xis the p x 1 untransformed vector; y is the qx 1 vector of linear combinations.
58
Method
The vector of expectations for they-variables is q x 1; that is, ~-tu=lf(Cx)
(3.1.26)
=C~-t
The variances and covariances of they-variables form the qx q matrix:
"l:Y = lf(Y-ILY) (Y-1-tu)' = 8"(Cxx' C '-C~-t~-t' C') =C;I/"(x)C'
= Cl:C'
(3.1.27)
For example, let
with
8"(x)
=It=[~~]
and
0] 100 40 r(x)=l:= [ 40 64 -24 0 -24 144 Suppose we wish to form two new variables y1 = L;X; and y2 = x 3 -(1/2)x1 . The transformation matrix is
r
1 C= L-1/2
1 1] 0 1
The two linear functions are
The expectation of y is
[ 10+11+9]
#Ly= 9-1/2(10) =
[30] 4
The covariance matrix of y1 and y2 is
l: = Cl:C' = [340 y 50
50] 169
These are identically the results from 3.1.23 and 3.1.24 for the two linear combinations simultaneously. That is, if we let
c' = [1
1
1]
and
d'=[-1/2
0 1]
Summary of Multivariate Data
59
then
I = CIC' = [c'Ic c'Id] u d'Ic d'Id These results for linear functions of measured variables are particularly useful in statistical analysis. The sample results are identical to those given here except that sample values y. and V substitute for p. and I, respectively. Let us consider some cases of special interest. First, in general, two linear combinations of correlated variables have nonzero covariance. This is the case even when the weight vectors are orthogonal. For example, consider the orthogonal vectors
c=[n
and
The covariance of c'x and d'x is
This sum will only be null under a particular equality of variance components. Should the original variates be uncorrelated, with a 21 = a 23 = 0, the covariance of the linear composites may still be nonzero, if a 1 2 =F a 3 2 • Thus, two linear combinations of variables having differing variances will generally be intercorrelated. Third, nonorthogonal linear combinations of variables will generally have nonzero intercorrelations. For example, consider
c=[-il
and
c and d have unit inner product. The covariance of c'x and d'x is
This expression is not generally null, even when all au= 0 (i =F J). In each of the cases above, it is possible to construct weight vectors for either correlated or uncorrelated variables such that the linear composites do have zero intercorrelation.
Standardization The random variables of x may be simultaneously standardized through matrix operations. Represent as a the diagonal matrix consisting of only the diagonal elements of I. That is, let ..i=diag (I) = diag (at 2 ,
Then a- 112 is the matrix of inverse standard deviations, with diagonal elements (al)-tlz= 1/a;.
60
Method
Then
(3.1.28) z is the vector of standardized random variables, expressed as linear functions of the original x 1:
Z=
x1-JL1 0"1
z1
x2-JL2 0"2
z2
The expectation and covariance matrix of z are i?'(z) = i?'[..1-li2(x-t-t)]
= a-li2[ifx-i?'x] = 0
(3.1.29)
and r(z) = a-112r(x-t-t)..1-1'2 =
a-112r(x)a-1'2
=
4 -lt2:ta-1'2
=~
(3.1.30)
The expectation of each standardized variable is zero. The covariance matrix of standardized variables is the pxp symmetric correlation matrix of x. The elements are the correlations of each pair of measures x 1 and xJ; that is,
(3.1.30a) Pu has limits (-1 .;;;pu.;;; 1l: The correlation of each variable with itself is unity, since pii=aN(a1a 1) = 1. !Ji has the form
P32
P13 P23 1
PP2
...
P12 P21 Pa1
1
Ptr> P2P PaP
!Ji=
ppt
pp,p-1
The covariance matrix :t is transformed to correlational form by dividing each element in row i by [a1], and each element of column j by [aJ]. This is accomplished as the pre- and postmultiplication of :t by the diagonal matrix in Eq. 3.1.30. As an example, let x be the three-element vector with J.t 1 =[10
11
9]
Summary of Multivariate Data
61
and 100 40 0] :I= [ 40 64 -24 0 -24 144 The diagonal matrix of variances is .1=<~1iag
(100, 64, 144)
and the standard deviations are .1112 =diag (10, 8, 12)
Letz=.1- 112 (x-p.). Then g'(z)' = [0
0
0]
and 100 10·10 r(z) = ,1-112:1;.1-1'2 =
40 10·8
0 10·12
64 8·8
-24 8 ·12
-24 12·8
144 12 ·12
40 10.8 0 12·10
1.00 .50 0 ] = [ .50 1.00 -.25 0 -.25 1.00 The correlation of x1 and x2 is P12 =.50.
3.2
THE MULTIVARIATE NORMAL DISTRIBUTION
The probability distribution assumed for statistical tests in this book is a multivariate extension of the no~mal distribution. The d<;msity function of a pvariate normal vector x is
-(x-n)':l;- 1(x-n)]
(x) = (27T)-v'21 :Ij-112 exp [ ,.. 2 ,..
(3.2.1)
For standardized vector z= .1- 112(x-p.), the density becomes (z) = (27T)-v'212Jtl-112 exp
1 z] [-z'ffi2
(3.2.2)
Expressions 3.2.1 and 3.2.2 are complex, but they can be seen to parallel the usual univariate normal density, with x, p., and :I replacing x, p.., and a 2 , respectively. For exemplary purposes, Figure 3.2.1 presents an overhead view of the twovariate or bivariate normal distripution. We may specify that x 1 and X2 have a
62
Method
Figure 3.2.1
Bivariate normal distribution of x, and X 2 .
joint normal distribution, with g'(x) = fL and r·(x) = l:, by writing
or X-
ff2(/A.,l;)
The subscript on A" (that is, 2) gives the dimensionality of x and of the corresponding normal distribution. Perhaps the most useful aspect of multivariate normality of p variables xi is that all linear combinations of the xi also have multivariate normal distributions (Dempster, 1969, p. 277). Let random vector x have a p-variate normal distribution with expectation fL and covariance matrix l:; that is, (3.2.3)
x and fL are p x 1 ; l: is p x p. Then any q (~ 1) linear functions y= Cx+d. where dis a qx 1 vector of constants, have distribution (3.2.4)
Cis the q x p transformation matrix defining q linear functions of the xi· y is the
Summary of Multivariate Data
qx1 vector of transformed variables;
C~-t+d
63
is its qx1 vector expectation, and
Cl:C' its qx q variance-covariance matrix. The result for ?(y) follows from 3.1.19 and 3.1.26; r(y) follows from 3.1.21 and 3.1.27. From this theorem, we may observe the properties of subsets of variables forming the p-variate set. That is we shall examine the marginal distribution of some of the variates out of context, or ignoring the others. For example, assume that x is a p-variate normal vector composed of p, variates x, (p, ~ 1) and p2 variates x2 (p 2 ~ 1); that is, x' = [x;, x~]. The moments of the total distribution may be represented as (3.2.5) respectively, where
p=p,+pz is the p, x 1 vector mean of the vector variable x, is the p 2 x 1 vector mean of X 2 l: 11 is the p, xp, covariance mat~ix of x 1 l:22 is the P2 x p2 covariance matrix of x 2 l: 12 = l:~, is the p 1 Xp 2 matrix of covariances of each variable in x1 and each variate in X 2 ~-t 1 ~-t 2
We may inspect the distribution of x, alone by applying 3.2.4 with d= 0, and C= [
lv1 l 0 p, xp, identity p, Xp2 null matrix matrix
The effect of this transformation is only to delete the X 2 variables. y = Cx+d retains only the x, set. According to 3.2.4, they (or x,) variables have a p,-variate normal distribution, with and By letting C = [0'
lv2 ], we may ignore x, and observe that Xz
~ ff v2 (~-tz, l:zz)
The marginal distribution of any subset of multivariate normal variates is (univariate or multivariate) normal, and retains the same expectation, variances, and covariances as in the higher-order multivariate distribution. In Figure 3.2.1, x, alone is univariate .ff(p, 11 a 12 ) and x 2 is univariate .ff(p,2 , a 2 2 ). In each case, other variables in the set are simply ignored. The conditional distribution of x 2 given x, is the distribution of X 2 values at any particular x, value. This is often termed the distribution of Xz given x,, removing the effects of x, or holding x, constant. The conditional distributions for x 2 are indicated by the horizontal cross-cut lines in Figure 3.2.1. The conditional variate is represented x 2 jx,. In the bivariate normal distribution the conditional distributions of x 2 have the following properties: (1) All of the conditional distributions are normal. (2)
64
Method
The center (expectation) of each lies on the straight regression line, a, of x 2 on x1• (3) The variance of each conditional x2 distribution is the same, regardless of the x1 value. These results are most easily represented if we first standardize x1 and x2 . Let and Then
The covariance of z1 and z2 is p 12 • In Figure 3.2.1 there is a concentration of points in the first and third quadrants, reflecting a positive covariance. The conditional variate after standardizing is z2 given z 1 or z2 lz 1 . The mean of z 2 at a particular z 1 value depends upon z1 and the angle, o:, which the regression line makes with the z 2 axis. The equation relating the mean to the Z 1 value is
These results may be extended to the unstandardized xi. The equation of the line connecting the means of the conditional distributions is given above. Substituting raw-score forms for p 12 , z 1 and z 2 , we obtain
Solving for X 2 , (3.2.6) Expression 3.2.6 is the mean of the conditional distribution of x 2 at particular values of X 1 . The variance of X 2 is
=
(3.2.7) The variance does not depend on the value of x 1 . Summarizing then, the distribution of x 2 given X1 is (3.2.8) These expressions may be extended to the general (p 1 +p 2 )-variate form. If p variates are partitioned into sets of p 1 variables x 1 , and p 2 variables x 2 , as in
Summary of Multivariate Data
65
3.2.5, the expectation of x2 given X1 is the p 2x1 vector, W(x2l Xt) = I2tiu-1(Xt-#-tt) +11-2
(3.2.9)
The variance-covariance matrix of x2at any given set of values x 1is p 2 xp 2 • r(x2lx1) = I22-I21Iu-1I12 =I2211
(3.2.10)
Like its univariate counterpart, I22 11does not depend on the particular x 1values. The covariance matrix of the conditional distribution remains unchanged regardless of where the x1 variables are fixed. It is easily seen that when p 1=p 2 = 1, Eqs. 3.2.9 and 3.2.10 simplify to yield 3.2.6 and 3.2.7, respectively. When p 1 is greater than unity, the inversion of Iu is complex, and algebraic parallels to the operation are difficult to follow. If x1 arid x2 are jointly (p1+P2)-variate normal, then x2 is p 2-variate normal at all values of x1. Also, x2 given x1 is uncorrelated with x1 . Assuming normality, the two sets are statistically independent; that is, r(xto x2lx1)=0
(3.2.11)
0 is a P1XP2 null matrix. I2211 may be standardized to yield the P2XP2 correlation matrix among the
x2 variables, eliminating x1. Let ,:12211 = diag (I2211) be the diagonal matrix of conditional variances . .:12211-1'2 is the matrix of inverse standard deviations. Then ~2211 = .:12211- 112I2211.:12211-1/2 ~22 1 1
is the P2XP2 matrix of partial correlations among the x2 variables, holding constant the x1 measures. We may also examine the moments of successive univariate conditional distributions. For example, we may require the variance of y2 given y1, of y 3 givenyt and Y2• of Y4 given Y1• Y2• and Ya· and so on to yp, eliminating all others. These results are obtained directly from the Cholesky factor of the total variance-covariance matrix. For example, letp =2 and
Operating upon the elements of I yields the Cholesky factor
The second diagonal element is the square root of the variance of x 2 given x 1 -that is, the conditional standard deviation, or vr(x2lx1). Let us represent this as
66
Method
conditional covariances and standard deviation of each variable, given all preceding. That is, (Zero)
(]'1
T=
(]'21,(]'1
(]'211
(]'31,(]'1
(]'3211
(]'3112
(]' p1,(]'1
(]'P3112
(3.2.12)
is the conditional covariance of x 1 and Xj, given xk and x1.
<TiJikl
3.3 SAMPLES OF MULTIVARIATE DATA One Sample Let us represent the observed scores for one observation i on p variates, as the 1 xp row vector y;:
Y! = [yil
Y12
· · · YiP]
y1 is a single vector observation. Its elements may consist of raw scores on p measured variates, or may have resulted from linear or nonlinear transformations of a prior set of p' original measures. For example, the creativity-intelligence problem (Sample Problem 1) requires that the product of standardized variates be formed, to yield interaction terms for regression analysis. The category-reproduction measure of Sample Problem 2 must be converted to a proportion in order to make results comparable across experimental conditions. Or transformations may be employed to cause the variates under study to approximate more nearly the properties of multivariate normal variates (see Kirk, 1968, pp. 63-67). The vector observations for N subjects may be juxtaposed to produce the complete Nxp data matrix for the group:
Yn Y21
V= [ .
Ym
~[J~l
~~: ::: ~~:] YN2
•• .
YNP
(3.3.1)
Each column, consisting of scores for all subjects on a single variate, is the observational vector. The data matrix contains all the information necessary for
Summary of Multivariate Data
67
describing the sample. If the sample is representative of a defined population', Vis also sufficient for estimating the parameters of the population from which it was drawn. If the data are dra'A'n from a normal distribution, then sufficient summary statistics are the vector of means for the p variables, and the pxp variance-covariance matrix. The sum of the N observations' scores on each of the variates may be presented in vector form, as N
y~=:L
y;
i=l
=1'V
(3.3.2)
1. is an N-element unit vector. T~e number of subjects in the sample may be computed from the unit vector asthe scalar product 1'1. From these results, the vector mean of the p variates is a 1 xp row vector 1
N
y: ='N 2. y; i=l
= -1 -1'V
(3.3.3)
1'1
Example Fifteen students were randomly sampled from the freshman class at a large midwestern university. Each student yielded five measures: y1 , grade average for required courses taken; y2 , grade average tor elective courses taken; x1. high-school general knowledge test, taken previous year; x2. lQ score from previous year; X3, educational motivation score from previous year. The scores are reported in Table 3.3.1. Table3.3.1
Scores for Fifteen College Freshmen on Five Educational Measures y,
Observation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
rl 2.2.8 ,1.6 1 2.6 I 2.7 2.1 3.1 I 3.0 3.2 2.6 I 2.7 3.0 I 1.6 I .9 I I 1.9
l l l l l
Y2
x,
x.
2.0 2.2 2.0 3.7 3.2 3.2 3.7 3.1 2.6 3.2 2.8 2.4 1.4 1.0 1.2
72 78 84 95 88 83 92 86 88 80 87 94 73 80 83
117 117 120 117 123 118 114 114 115 114 112 115 111 112
114
Xa
17.31 17.6 1 15.0 I 18.0 18.7 17.9 1 17.3 1 18.1 I 16.o 16.4 17.6 19.5 12.1 1 17.0 1 16.1 I
l l
I
I l
L--------------------------~
68
Method
Each row of Table 3.3.1 is a single vector observation. The portion of the table enclosed in dashed lines is the 15x5 data matrix Y. If we let 1 be a 15-element unit vector, then the sums for the five measures are y~=
[34.0
37.7
1263
1733
255.3]
The means are y:=
1
15 y~=[2.27 2.51
84.2
115.5
17.02]
In most behavioral science applications the origins of the measurement scales are arbitrary. Thus, we may wish to express each vector observation as a row vector of mean deviations, y{ -y:. Each element is the deviation of th~ score for subject i on variate j from the variable mean, or Yu-Y·J· The data matrix of mean deviations is the Nxp matrix having mean-deviation vectors as rows:
y;-y.j [ y'-y'
Y-1y!=
2
:
•
y;,-y: 1 isanNx1 unitvector. The data of Table 3.3.1 can be expressed in this form. The first row of Y-1y: is [.8 2.0 72 114 17.3]- [2.27 2.51 84.2 115.5 17.02] = [-1.47 -.51
-12.2 -1.5
.28]
It can be seen that the sum or mean of the mean-deviation vectors is null, since
1 '(Y-1y!)=1 'Y-1 '1y! =1'Y-1'1(-1-)1'Y 1'1 =0' The basic data for the variance-covariance matrix are the sum of squared scores for each variable, and the sums of cross products for pairs of variables, across all subjects. The matrix product of each vector observation y; and its transpose is the pxp symmetric matrix of squares and cross products of each of the elements:
YtYI =
[~::~il ~::r~2 ::: : .
YtPYil
YtPYi2
~::~:PJ :
.
. . . Ytl
Summing these matrices across all N subjects yields the total sum-of-squares
Summary of Multivariate Data
69
' cross-products matrix and N
Sr= i=l I YtYl =Y'Y
(3.3.4)
The matrix, abbreviated total sum of products, contains ~1y1l, the sum of squared scores on the jth variate in the jj diagonal position. ~1Y;;Y1k, the sum of cross products of variables y3 and Yk is in the jk off-diagonal position, for all N subjects. Sr is also symmetric, and of order p xp. For a single variable YJ· the sum of squared mean deviations is
I; (y;;-Y·J)2= I; Y;; 2 -Ny./
(3.3.5)
The sum of cross products of mean deviations on Yi and yk is (3.3.6) In the multiple-variable case we may adjust all sums of squares and cross products simultaneously. Let Sw represent the pxp sum of squares and cross products of mean deviations; that is,
N
=I YtYi-Ny.y: i=l (3.3.7) since ~y1 y: = ~y.y[ = Ny.y!. In Eq. 3.3.7, NY·JY·k is subtracted from the jk element of Sr. The reader may wish to demonstrate for himself that the diagonal elements of Sw contain exactly the mean-adjusted sums of squares that would be obtained for that variable alone by Eq. 3.3.5. The off-diagonal elements identically reproduce Eq. 3.3.6 for each pair of measures. Sw is the sum of products adjusted to the group vector mean, or the within-group sum of products. The variances and covariances of the p variables may be obtained from Sw by dividing each element by the associated degrees of freedom, N-1. The sample variance-covariance matrix is
1
= N-1 '
N
L (y;-y.)(y;-y.)'
(3.3.8)
i=l
Vw has the sample variance (sl) of each of the variables on the diagonal, and the sample covariance (s3k) of variates Yi and Yk in the jk off-diagonal position.
70
Method
That is,
for (j, k= 1, 2, ... , p). This is the common form, 1/(N-1) times Eq. 3.3.5 or 3.3.6. The standard deviation of variable Y; is the square root of the jj diagonal element of vW• [Vw];/ 12 = S;. Example
Represent the data of Table 3.3.1 as the 15x 5 matrix Y. The total sum of products for the five measures is (Symmetric)] 85.38 92.21 105.31 Sr=Y'Y= [ 2918.70 3225.10 107009 3934.1 0 4380.20 145976 200363.00 585.70 651.79 21585 29503.90 4383.13 The sum of squared scores are on the diagonal-for example, .82 +2.2 2 + · ·+1.9 2 =85.38. The sums of cross products are the off-diagonal elements-for example, .8X2.0+2.2X2.2+ · ·+1.9X1.2=92.21. The within-group sum of products is Sw = Sr-15y.y:; that is,
77.07 (Symmetric)] 85.45 94.75 [ 15y.y.' = 2862.80 3174.34 106344.60 3928.13 4355.61 145918.60 200219.27 578.68 641.65 21496.26 29495.66 4345.21 and therefore
8.31 6.76 [ Sw= 55.90 5.97 7.02
(Symmetric)] 10.56 50.76 664.40 57.40 143.73 24.59 10.14 88.74 8.24 37.92
l
Tbe variance-covariance matrix is
(Symmetr;c] y,
1 .48 Yz 59 .75 Vw= 14 Sw= 3.99 3.63 47.46 Xt x2 .43 1.76 4.10 10.27 .50 .72 6.34 .59 2.71 X 3 Yt
Y2
X1
x2
Xa
Summary of Multivariate Data
71
The standard deviations of the five measures are:
Sz=
Y.59=
.77
\175=
.87
[ Y1J [Yz]
v47.46 = 6.89 vT027 = 3.20 s5 = V2.71 = 1.65 S3 =
s4 =
The reader may wish to verify that Vw contains the usual (marginal) variances and covariances of the five measures. These results would have been obtained by scalar algebra for any one or two of the measures removed from the set. The matrix operations facilitate computing multiple dispersion measures simultaneously. The sample intercorrelations among the measures are obtained by substituting sample values in Eqs. 3.1.28 and 3.1.30. Let Dw be an order-p diagonal matrix of variances Dw= diag (V10 )
= diag (s1 2,
s2 2 ,
••• , Sp 2 )
(3.3.9)
Dw 1' 2 is the diagonal matrix of standard deviations, and Dw - 112 is the matrix of inverse standard deviations, Dw - 112 = diag (1 /s 1 , 1lsz, ... , 1/sp) The vector of standardized scores for observation i is
(3.3.1 0)
The mean of all standardized vectors in the sample is 0. The variance-covariance matrix of standard scores is the sample correlation matrix among the variables.
(3.3.11) The diagonal elements of Rw are the variances of the standardized variables, or unity. The jk off-diagonal element is r;k> the sample correlation of Y; ang Yk· The pre- and postmultiplication of Vw by the diagonal matrix yields elements
72
Method
The maximum possible absolute value of s;k is s;sk. Thus rjk has absolute limit 1, with the sign depending only upon the sign of the covariance S;k· Expression 3.3.11 is the matrix of common Pearson correlations among the p measures. The elements agree with the results which would be obtained if the variables were correlated two at a time, in isolation. However, if data consist of distinct subgroups of subjects having different means, separate covariance matrices, or else the pooled within-group covariance matrix must be used in place of V, (3.3.8). The groups may be either natural (for example, sex, grade) or experimentally formed (for example, control, treatment 1, 2). Details are given in the next section and the "Note on within-group variances and correlations," p.81. The general form of Eqs. 3.3.8 and 3.3.11 may be followed in reducing any sum of products of mean deviations S to covariance form V and to correlational form R. If we let v be the degrees of freedom for S, then
v=!s v
(3.3.12)
o- 112 vD- 112
(3.3.13)
Letting D= diag (V), then R=
R is the symmetric matrix of intercorrelations, having unities on the principal diagonal. S may also be easily reconstructed from R, asS= vD 112 RD 112 . The value
of v is a function of the number of observations, N, and the number of subgroups, J, into which they are combined. In the preceding discussion there is only one group of observations and v= N-J= N-1.1f some subjects are missing scores on one or more tests, determination of Sand vis complex, with a number of possible solutions.
Example
The five variables of Table 3.3.1 give us variance-covariance matrix
l
.59 .48 Vw= 3.99 .43 .50
.75 3.63 1.76 .72
(Symmetric)J 47.46 4.10 10.27 6.34 .59
2.71
The variances alone are Dw=diag (.59,
.75,
47.46,
10.27,
2.71)
The inverse standard deviations are Dw- 112 =diag (1/.77,
1/.87,
1/6.89,
1/3.20,
1/1.65)
Summary of Multivariate Data
73
The correlations are
.59
(Symmetric)
.77(.77)
.48 .77(.87)
.75
.87(.87)
3.99 3.63 47.46 .77(6.89) .87(6.89) 6.89(6.89) .43 .77(3.20)
1.76 4.10 10.27 .87(3.20) 6.89(3.20) 3.20(3.20)
.50 .77(1.65)
.72 6.34 .59 .87(1.65) 6.89(1.65) 3.20(1.65)
2.71 1.65(1.65)
1.00 (Symmetric] Y1 .72 1.00 Y2 [ = .75 .61 1.00 x1 .17 .63 .19 1.00 x2 .40 .51 .56 .11 1.00 X3
The variable having highest intercorrelation with y1 is x 1 ; x2 and Xa have very little interrelationship. All measures have positive intercorrelations and so seem to be measuring aspects of the same general phenomenon, perhaps a general verbal ability.
More than One Sample Often subjects have been sampled from or assigned to J > 1 populations. The number of populations may be a function of the crossing or nesting of a number of classification variables. In a completely crossed AxBxC sampling design having a levels of factor A, b levels of B, and c levels of C, J is the product abc. Or J may be a function of the sample size, as in cases where the sampling design includes effects that are functions of the sampling unit, as in most randomized block designs. A nested sampling design having b 1 levels of B within A1 and b2 levels of B within A2 has a maximum of J= b1+b2 subclasses. It is not necessary that all J groups or subclasses of subjects have observations. Some subclasses may be null either because of design considerations or such external factors as subject mortality. For general applications, we shall represent the number of subclasses having at least one analysis unit as Jo, so J 0 ,;;;;J. It will be seen that the J-J 0 null subclasses do not contribute in any way to the summary data and may be eliminated from the computations.
74
Method
The dimensionality of sampling designs may be too great to allow us to represent the data in multidimensional tables. Thus we will assign a unique number to each subclass of subjects so that the data array may be represented in a subjects-by-variates or groups-by-variates table. A convenient convention is to assign numerals from 1 to J according to the natural order of subclass identification numbers. For example, in the AxBxC design, the group at the first level of A, B, and C, or group 1, 1, 1 is coded simply 1. Group 2 is at the first level of A and B, and the second level of C (that is, group 1, 1, 2), and so on. In a 2X2X3 arrangement, the numerals 1 through 12 may be assigned to the groups, in the natural ascending order of the group identifications:
Group 1 , 1, 1 1, 1, 2 1, 1, 3 1, 2, 1 1, 2, 2 1, 2, 3 2, 1, 1 2, 1,2 2, 1,3 2,2, 1 2,2,2 2,2,3
Assigned 1 2 3
4 5 6 7 8 9 10 11 12
We shall assume this numbering system for designs considered in this text. The result is a single subclass identification code or subscript in all cases. Hierarchial arrangements with equal or unequal numbers of levels of nested factors may be represented notationally as crossed but possibly incomplete designs. The programmed instruction problem (Sample Problem 5) involves 19 experimental classes and 18 control classes, constituting a random nested effect. This may be considered a treatmentXclassesxsex (2X19X2) incomplete design, with J=76 and J 0 =74. The nested effects of course must be identified in the computation of summary data. For multiple groups of subjects the data matrix may be represented in partitioned form. Let N; represent the number of subjects in subclass j {j= 1, 2, ... , J). The N; are not restricted to being equal or proportional to other N;'s. The total number of observations is N= ':i;N;. Let Yu be the px 1 vector observation for subjecti in group j. The Nxp array of data is
Summary of Multivariate Data
75
N 1 observations group 1
Y=
N 2 observations group 2
N.1 observations group J
The Jxp matrix of subclass sums is 1.V1
2: y;l i=l
Y~1
lv'2
2: Yiz Y+= i=l
Y~z
.vJ
L Yi.J
Y~J
i=l
Each row contains simply the sum of all vector observations for the respective group or subclass. The matrix of subclass means is formed by dividing the elements of each row in Y+by the number of subjects in the subclass. That is 1
Nl Y~l 1
y:l
Y.= fiTY~z 2
y:2
1 ' N.1 Y+.1
Y-.1
Y·; is the p-element vector mean for subclass j.
'
76
Method
Y. may be derived from Y+ as a matrix product. Represent the numbers of subjects in the J subclasses as a diagonal matrix D, D= diag (N1 , N 2 ,
•.• ,
NJ)
The subclass means are the product
Y.=D- 1V+
(3.3.14)
diagonal matrix with elements [d- 1 h= 1/N;. Should any of theN; be zero, computation of o- 1 for that element may simply be bypassed. As an alternative, Y+ may be reduced in size to (J 0 Xp) by omitting any rows corresponding to null subclasses. The analogous operation upon D will result in a J0 XJ 0 diagonal matrix with all positive diagonal elements. · For interpretive purposes, combined observed means and frequencies are often required. Consider the 2X2X3 (AXBXC) arrangement introduced earlier. It may be desired, for example, to table means for each of the two levels of the A classification variable. The vector mean for all subjects sharing A 1 is the weighted average of the means for the first six groups; that is,
o- 1 is the
6
Y~t+Y~z+Y~3+Y~4+Y~5+Y~6
N1+N2+N3+N4+N5+N6
i~IN;y:;
-6--
2.
N;
J=l 6
The associated total frequency is ;~1 N;. In like fashion, the rows of Y. may be combined to yield means for the second or third classification variables, 8 or C. Further breakdowns to the means of combinations of effects may also be obtained. The mean of all subjects sharing attributes 8 2 and C2 , for example, is N5y:5 +NuY:u
N5 +Nu This is the mean of N 5 +N 11 observations.
Example Random samples of students were drawn from the student body at a large midwestern university at the time of continued violent clashes between students and police. Students were classified according to (A) sex, and (B) whether or not they participated in confrontations with the police. Each student subject was scored on y1 , his attitude toward the university and its administration, prior to the violence; and y2 , his attitudes following the violent confrontations. A higher score on y 1 or y2 indicates a higher negative attitude (dislike) toward the university. Due to difficulties in locating students after the confrontations, unequal numbers of subjects resulted in the four subgroups. The data are presented in Table 3.3.2.
Summary of Multivariate Data
Table 3.3.2
Sex (A) Male(A,)
Female(A2 )
Attitude Scores for College Students before and after Violent Confrontations with City Police Participation (B)
Group Number (J)
Scores Observation (i)
y,
Yz
1 2 3 4
0 0 2 0
0 0 2 1
1 2 3 4 5 6
2 4 3 4 4 2
3 4 4 4 4 3
1
2 1 3 4 1
2
2 3 4 5
3 4 2
1 2 3 4 5 6 7
3 3 4 3 3 4 4
4 3 4 3 3 4 4
No(B1 )
Yes (B2 )
2
No(B,)
3
Yes(B2 )
4
1
The sums for the four groups are the J (=4) rows of V+·
2.0 3.0] Y~t [ 19.0 22.0 Y~2 V+= 11.0 12.0 Y~a 24.0 25.0 Y~4 The cell frequencies are D= diag (4, 6, 5, 7)
The total N is 22. The means for the four groups are V. = o-ty+where
o-
1
=diag (1/4, 1/6,1/5, 1/7)
Then
[ .50 .75] Y·sy:,Y;2
y = 3.17 3.67 ·
2.20 3.43
2.40 3.57 y:4
77
78
Method
The mean attitude scores for nonparticipants (Bt) are (1.44
1.67]
with 4+5 = 9 observations. Mean scores for participants (B2) are Y~z+Y~4
[3.31
N2+N4
3.62]
with 13 observations. It appears that participants have more negative attitudes toward the university.
Whenever there is more than one group of subjects, variances and covariances must be expressed in terms of deviations from the separate group mean vectors. Let Sri (j = 1, 2, ... , J) represent the total sum of products for observations in group j alone. That is, Nj
Sr. = 'L YuYu J
i=l
(3.3.15)
The p x p symmetric total sum of squares and cross products for all subjects is
Sr= 'LJSrJ
=Y'Y
(3.3.16)
This form is equivalent to Eq. 3.3.4, but summed across multiple groups. Since subclasses generally have different mean vectors, within-group sums of products are computed separately for each group. They may then be pooled for a common estimate. The sum of products of mean deviation scores in a single subclassj is identical to the matrix for one group in Eq. 3.3.7. Nj
Swi= L (Yu-Y·JHYu-Y·J)' i=l
= Sri -NiY·iY:i
(3.3.17)
The additional subscript indicates the particular group of observations, with Ni-1 degrees of freedom. The variance-covariance matrix for the group is
1 Vw1 = N1_ 1 Swi
(3.3.18)
The variances and correlations for the group are
DwJ= diag (Vw1)
(3.3.19)
and (3.3.20) respectively, following Eq. 3.3.11. To obtain a single common estimate of the variance-covariance or correla-
Summary of Multivariate Data
79
tion matrix, we may pool the S,r The pooled within-group sum of products is
Sw= 2:; S,; =
2:1 (S1J -N;Y·;Y:;)
=
Sr-"2-;N;Y·;Y:;
=Y'Y-v:ov.
(3.3.21)
Sw is pxp symmetric, and has the.usual within-group sum of squares for Yi in the ii diagonal position. The ij off-diagonal element of Sw is the sum of cross products for Yi andy;, adjusted to the J variable means. Each S,; has N;-1 degrees of freedom; Sw has "L;(N;-1) = N-J degrees of freedom. From expression 3.3.12, the pooled within-group variance-covariance matrix is 1 (3.3.22) Vw= N-JSw Let (3.3.23)
Dw= diag (Vw)
be the diagonal matrix of variances. Then the pooled within-group standard deviations and correlations are Dw 1i 2 and (3.3.24) respectively.
Example The total sum of products for group 1 in Table 3.3.2 is
Adjusting to the group mean vector, S.,1 N y y' =4 [.50] [ 50 I 'I ' l .75 •
=
Sr1 -N1 y.,y:,, where
75) = [1.00 . 1.50
Then
s u;!
=
[3.00 2.50
2.50] 2.75
The variance-covariance matrix is V
= (ii 3)S w!
= [1.00 WI
.83
.83] .92
The inverse standard deviations are D
-1/2 "'!
=
[1 Jyf[{)Q 0
0
J
1/Y.92
1.50] 2.25
80
Method
and the correlations are
=
R WI
c.OO .87 Yt
.87] Yt 1.00 Yz Yz
In like fashion,
s "'2
[4.83 2.33] = 2.33 1.33
s "'3 -
s "'4 -
R
wz
= [1.00
.92] .92 1.00
[6.80 5.60] 5.60 5.20
R. = [1 .00
[1.71 1.29] 1.29 1.71
R. = [1.00
"3
u4
.94
.94] 1.00
.75] .75 1.00
The correlations for all groups are similar, and we may pool to obtain the common value. The total sum of products for all the data of Table 3.3.2 is Y'Y
= [184 195] 195 212
The mean corrections form the product Y 'DY = [167.65
183.18] 183.18 201.00
..
The difference is
Sw=[16.35 11.72]y1 11.72 11.00 Yz Yt Y2 The reader may wish to verify that this is equivalently ~
Sw= j=l 2: SwJ The within-group degrees of freedom are N-J= 22-4= 18; the variancecovariance matrix is
v ·= 1/18 [16.35 11.72 fl
_[.91 .65
-
Yt
11.72] 11.00
.65] y1
.61 Yz
Yz
The standard deviations are s 1 = V.91 = .95 and s2 = \/.61 = .78, respectively. The correlation matrix from Vw is
Rw= [1.00 .87] Y1 .87 1.00 Yz Y1 Yz
Summary of Multivariate Data
81
All summary statistics necessary for representing data from multivariate normal populations can be obtai'ned from the subclass sums, frequencies, and the total sum of products, V+• D, and Sr. In general the complete data matrix V is not required. Vector observations may be entered into computation one at a time, as when the analysis is performed by computer. The sums and frequencies may be sequentially accumulated, as may the sum of products
Sr =
L; Sri= LJ L1 YuYu
At times researchers wish to reconstruct analyses from published sources. Usually provided are the means, frequencies, standard deviations, and withingroup correlation coefficients. The missing matrix is the total sum of products, Sr. However, Sr may be reconstructed by
Sr=Sw+Y:ov. =(N-J)Vw+Y:ov. 112 RwDw 112 + L. NJY·;Y:; = (N -J)Dw ' 3
(3.3.25)
is a diagonal matrix of within-group standard deviations. The matrices of sums and ~requencies may be reduced in size if J 0 subclasses have observations, and the remaining J-J0 do not. If the null rows are left intact in D and V., the effect Jpon Sw can be seen to be nil. The groups without subjects do not make a contribution to either V'V, or to v:ov., since Y+; is also null. Thus collapsing rows of V. and D does not affect the correct computation of Sw. Since the number of subclass mean adjustments is J 0 instead of J, the degrees of freedom associated with Sw is N-:-J 0 rather than N-J. The covariance matrix is 1 (3.3.26) Vw= N-Jo Sw Dw 112
Note on Within-group Variances and Correlations The pooled within-group variances, covariances, and correlations will generally differ from those computed without adjusting to separate subgroup means. Common formulas for P;earson product-moment correlations contain only a single mean adjustment. These correlations are not correct when there are subgroups of subjects in the data that have different means. Suppose that we have two independent groups of observations with N1 = N 2 = 11. Each subject is measured on two variables y1 and y2 • The mean vectors for the two groups are
y:l = [20 20]
y:2 = [40
40]
Assume further that both groups have identical covariance matrices. The variances of the two measures in either group are s1 2 =s2 2 = 100 and the covariance is s12 =50. Then
82
Method
Reducing Vw1 or Vw2 to correlational form, the correlation is 50/ V1 00 ·1 00 and R Wt
=
R w2
=
[1.00 .50
.50] 1.00
· The actual correlation of y1 and y2 for any group of subjects is .50. The pooled within-group matrix will give this result correctly. 1
Vw= 22 _ 2 [10Vw1 +10Vw2 ] = [100 50] 50 100 Reducing Vw to correlational form, r 12 is again .50. The common within-group matrix is a weighted average of the variances and covariances for the separate groups of observations. If subgroup matrices differ slightly, the pooled within-group matrix correctly averages the differences. Suppose however that instead of computing the within-group matrix, we had simply computed the correlation for a//22 observations, ignoring subclass membership. The overall mean vector is
y:. = [30 30] . The all-subjects covariance matrix is
1-["'. ""· YiJYi1 -Ny.. y:.] V=-.22-2 £..., £..., We can find
~~~JYiJYu
from the subclass means and covariance matrix: ~~ ~; Yt1Yi1
= ~1 [(Nj-1)Vw1+NJY·JY:;] = [24,000 23,000] 23,000 24,000
Then
v=
[200.00 152.38] 152.38 200.00
From V the correlation is r12 = 152.38/Y2002 =. 76. The correlation is clearly discrepant from the known .50. It results from our having doubled the value of the variance, while more than tripling the covariance, in going from pooled subclasses (Vw) to total-group results (V). This is because the mean deviations from the common y .. are larger than those from subgroup means y. 1. The variances in V are artificially inflated by the fact that the two group means y. 1 and'y. 2 are not the same, and not equal toy .. ; in other situations the covariances may be inflated to a greater extent than the variances, resulting in unduly low correlation measures. Whenever a data set contains results for several distinct groups of observations, valid dispersion measures cannot be obtained by treating the sample as a single group. Either separate subclass variances and covariances or the pooled within-group measures are necessary. Caution must be exercised in using computer programs to produce summary statistics which do not correctly reflect the subgroup structure in the data. The degree of bias in ignoring sub-
Summary of Multivariate Data
83
group structure is a function of the differences among subgroup means, and should not be introduced into the summary statistics.
Linear Combinations of Variables There are numerous situations in which linear combinations of an original set of p measures are required. We may wish to compute subtest and total scores from item responses, to create factor scores from test results, to select subsets of the measures, or to take differences and contrasts among the scores that reflect experimental effects of interest. This may be accomplished through the application of a transformation matrix to each vector observation. The sample results for linear transformations follow directly from population results, as given in Eqs. 3.1.25-3.1.27. Assume that we wish to create q linear combinations of the p variables in vector y. Let each set of weights define one row of the qxp transformation matrix C. Represent the row as c'i (i= 1, 2, ... , q). The transformed variable is xi= c'iY· The complete transformation matrix is
The vector observation of q transformed scores for subject kin group j is (3.3.27) where Y"'i is the untransformed vector for the same subject. Let Y. be the Jxp matrix of means for the J (~ 1) groups on the original measures; let Vw be the variance-covariance matrix. Then the means and variance-covariance matrix of the transformed variables are
X.=Y.C'
(3.3.28)
and (3.3.29)
X. is Jxq;
Vw(x) is qxq symmetric. The same pre- and postmultiplication as in Eq. 3.3.29 may be applied to Sr, Sw, or a matrix for any specific subgroup (such as Sw;l· Transformation of a correlation matrix (Rw) will not generally result in standardized measures, and requires restandardization by Eq. 3.3.13. For analyses that require both the transformed measures and the original variables, the matrix of weights may be augmented by a pxp identity matrix; that is,
_[;: l
q-p rows
C- ·, c q-p lv
Altogether, C has q rows and p columns.
prows
84
Method
Example
The data of Table 3.3.2 contain pre~ and postattitude scores. For additional analyses, the change or difference (post minus pre) is useful. The means for four groups and within-group variance-covariance matrix are
.50 .75~ y- [ 3.17 3.67 .- 2.20 2.40 3.43 3.57 Yt Y2
V
[.91
w= .65
Yt
.65] Yt .61 Y2 Y2
The change score (x 1 = y2 -y1) requires weight vector
c't= [-1
1]
In addition, let us preserve the original measures by adding an identity matrix to C:
C=
[-16 1]~
The three measures may be scored for each vector observation. Or the means and covariance matrix are found by Eqs. 3.3.28 and 3.3.29.
25
.50 .75] 3.17 3.67 20 2.20 2.40 14 3.43 3.57 x1 Yt Y2
X_ [ .50
.-
[ .22 -.26 -.04r~ .91 .65 Yt -.04 .65 .61 Y2 Xt Yt Y2
Vw<x>= -.26
Vw<.r> may be reduced to correlational form in the usual manner:
1.00 -.58 -.11] 1.00 .87 -.11 .87 1.00
Rw<x>= [ -.58
The change score has a negative correlation with both pre- and postmeasures. The more negative the original attitude, the less likely is change in a positive direction. The effect of the identity matrix in preserving they variables among the x can be seen in X., Vw<:r>, and Rw<x>.
Summary of Multivariate Data
85
3.4 SAMPLE PROBLEMS . Sample Problem 1 -Creativity and Achievement The complete data set for each of the sample problems is listed in Appendix C. TheN= 60 subjects of the example were measured on two tests of divergent achievement, three of creativity, and on an intelligence scale.* Three additional variables yvere created, the products of standardized scores on each of the creativity measures and standardized values on the intelligence measure. For standardization, the sample means and standard deviations of the four measures ll}tere determined by prior analyses. Standardizing the four measures does not lilffect their relationships with the criteria. This adjustment is necessary however, for computation of the cross-product or interaction terms. In this manner, the dominance of the interaction by one or another variable due to scaling 'is avoided. The interaction terms themselves need not be standardized. Each six-element vector observation y1 is transformed to a nine-element vector, in the following sequence of operations:
Transformation
Yu left intact
Variable Synthesis Evaluation
Y12 left intact Yta replaced by
Yta-102.02 14_83
Intelligence
Yi4 replaced by
¥!4-18.43 6_80
Consequences obvious)
Y;s replaced by
Yts-4.12 3 _27
Consequences remote
Y16 replaced by
Yi6-14.52 5.42
Possible jobs
Yn formed by Y14Y13 Y;s formed by Y;sY;a y19 formed by Yi6Yta
Creativity
Consequences obvious X intelligence Consequences remote X intelligence Possible jobs X intelligence
For the first observation with observed scores of 5, 1, 106, 20, 5, and 13, respectively, the nine-element transformed vector observation is y~ = [ 5.0
1.0 .27 ' .23
.27 -.28
.06
.07 -.08]
The complete 60X9 data matrix is formed by juxtaposing the yi (i=1, 2, ... , •Intelligence scores are considered the first independent variable. On the punched and listed data cards, intelligence scores follow those for the three creativity measures. Thus the summary matrices presented here and in the following sections constitute a simple reordering of the elements of the same matrices as produced by the MULTIVARIANCE program.
86
Method
60). The sample vector mean is 1
60
y:= 60 LYI i=l
= [2.55 1.38 0.0 0.0 0:0 0.0 .09 .40 .46] We note that the four variables that have been standardized in the sample do indeed have mean zero. The mean intelligence score before standardization is about 102, with a standard deviation of approximately 15 points. The range of intelligence scores is from 67 to 143. Together, these findings indicate that the sample has a notatypical mean and a wide range of values. The heterogeneity is desirable to avoid biasing effects of a truncated range in a situation that demands considering a large spectrum of score values. The total sum of products for the p = 9 variates is
569.00 322.00 301.00 65.79 56.69 59.00 21.29 17.66 5.55 59.00 40.49 24.29 3.38 59.00 Sr= 35.23 40.74 33.58 27.47 31.98 25.13 59.00 18.73 27.90 5.56 -3.06 18.56 1.19 43.86 107.31 73.77 28.65 18.56 38.66 24.42 24.52 101.83 67.77 17.56 1.19 24.42 14.98 21.66
(Symmetric)
114.53 68.91 77.36
The reader may wish to verify that [s1] 21 = 322, for example, is the total sum of cross products of scores on the first two variables, which are left untransformed. The sum of products of all nine variates adjusted to the vector mean is 60
Sw=
~
(y;-y.)(yi-y.)'
i=l
=
Sr-60y.y:
178.85 (Symmetric) 110.35 186.18 65.79 56.69 59.00 21.29 17.66 5.55 59.00 35.23 40.49 24.29 3.38 59.00 40.74 33.58 27.47 31.98 25.13 59.00 4.56 20.22 5.56 -3.06 18.56 1.19 43.35 45.38 40.17 28.65 18.56 38.66 24.42 22.27 104.70 31.79 29.77 17.56 1.19 24.42 14.98 19.12 57.79 64.79 Sw has 59 degrees of freedom. We note that columns three through six of Sr and Sw are identical, since the corresponding variates have been standardized.
Summary of Multivariate Data
87
The sample covariance matrix is 3.03 1.87 3.16 1.12 .96 .30 .36 1 .69 Vw= 59 Sw= .60 .69 .57 .08 .34 .77 .68 .54 .50
(Symmetricl 1.00 .09 1.00 .06 .41 .47 .54 .09 -.05 .49 .31 .02 .30
1.00 .43 .31 .66 .41
1.00 .02 .41 .25
.73 .38 .32
1.77 .98
1.10
The diagonal elements of Vw are the variances of the nine variates. Those for the four standardized variables are, of course, unity. The off-diagonal elements are the covariances; for example, the sample covariance of the first two measures, evaluation and synthesis, is 1.87. The diagonal matrix of standard deviations is
Dw 112 =diag(1.74, 1.78, 1.00, 1.00, 1.00, 1.00, .86, 1.33, 1.05) Each nonzero element is the square root of the corresponding element of VwTaking the reciprocal of each principal element of Dw 112 to obtain Dw - 112 , and multiplying, the matrix of intercorrelations is
Ru, = Dw -112VwDw -1/2 1.00 .60 .64 .21 .34 .40 .05 .33 .30
(Symmetric) 1.00 .54 .17 .39 .32 .23 .29 .27
1.00 .09 1.00 .41 .06 .47 .54 .11 -.06 .36 .24 .28 .02
1.00 .43 .37 .49 .39
1.00 .02 .31 .24
1.00 .33 .36
1.00 .70
1.00
Synthesis Evaluation Intelligence Cons. obvious Cons. remote Possible jobs Cons. ob. x intell. Cons. rem.xintell. Poss. jobs X intell.
Rw presents a number of noteworthy patterns. The entire matrix appears to display a manifold of positive correlations, indicating that to some extent all tests are measuring a common (general ability) trait. The correlation of the two cognitive achievement measures is high (r12 = .60), as would be expected. Intelligence displays the expected high relationship with the two achievement measures. By contrast, the correlations of the creativity measures with achievement are lower (.17 to .40). With no further evidence, one might adopt a pessimistic attitude toward the possibility of finding creativity to be a factor in achievement,
88
Method
beyond the role played by intelligence. Creativity by itself may play a role in determining achievement, if the confounding with intelligence is ignored.
Sample Problem 3-Dental Calculus Reduction The subjects of the calculus study have been assigned to one of five dentifrice treatment groups, over a two-year period. The first year, treatment group five had no subjects. The second year of the study, treatments two and four were discontinu~d. The resulting subclass membership is as follows: Year
Treatment Group
Subclass Index (j)
First (1) First (1) First (1) First (1) First (1) Second Second Second Second Second
1 2
1 2
8
3
3
7
4
4
5 0
1 2
5
28
3
6
24
7
0 26
Frequency (N3)
9
5 (2) (2) (2) (2) (2)
0
4
5
The total N for the problem is 7
N=2.
Nj=107
j=l
Although the number of subclasses implied by the design is J=2(5)=1 0, three have no observations, leaving Jo=7 for computational purposes. It is only necessary to maintain the 7X7 matrix of frequencies, and a seven-row matrix of means. The six measures involved in the problem are calculus accumulation measures for six anterior teeth of the lower mandible. The matrix of sample means for the seven groups is
.75 2.25 3.75 1.33 1.78 3.11 .43 .86 1.29 Y.= 1.00 .80 2.00 .68 1.57 2.71 .54 .79 2.08 .23 .42 .77
4.13 3.33 1.57 1.20 2.75 1.71 1.31
2.25 .88 2.56 1.56 1.00 .43 .60 .00 1.57 .71 .96 .67 .65 .19
-9<91) ~BI) ~BI)
<eFt <eFt <eFt tt to. oe /Clt. oCl "Cll)l~ Clte,.Cl/ ~l)tt: 'l)t,.Cl/ . e,.Cl/ . l)li}e e II) Cl!l II) . II) . o1~ 1)0l o1~ o1~
to.
o,.
~o,.
o,.
o,.
Reading across the columns of Y., it becomes apparent that the greatest cal cui us development is on the central teeth. In contrast, the teeth farther from the center
Summary of Multivariate Data
89
in either direction show less calculus formation. Fewer than six tooth measurements may be adequate for product testing. The hypothesis that the end teeth do not contribute to between-group variability may be statistically tested through "step-down analysis." It is more difficult to interpret the rows of Y., both because of missing rows and the multiplicity of rows for some treatments. The observed means, combined across years for each level of the treatment factor, are more lucid. These are given in Table 3.4.1. Active agents 2 and 3 appear to produce the greatest reduction in dental calculus, although all three seem beneficial. For further detail, Table 3.4.2 presents the distribution of raw calculus scores for each tooth. The table has results for all groups of subjects combined, and separately for group 1 (the primary control group) and group 5 (using active agent 3). We may examine the effectiveness of the experimental dentifrice. One effect is the virtual elimination of extreme calculus scores and the concentration of low calculus levels in each of the teeth. Although the entire range of scores was observed in the control group, the range has been severely curtailed at the upper end in the experimental group. Further, the absence of any measurable calculus from the teeth is more prominent with the experimental subjects. There are also intertooth differences of interest. The predominance of zero scores for the canines, and secondarily for the lateral incisors, reflects overall calculus formation differences in all groups. It would appear that the lateral and central incisors are more affected by the anticalculus agents. That is, there appears to be a regression effect, with teeth having higher calculus levels also showing the greatest reduction with treatment. We note also that the positive skew of the distributions may violate the assumption of normality necessary for statistical testing.
Table 3.4.1
Combined Observed Means for Treatment Factor of Anticalculus Agent Example Means Vectors
N
R.C.
R.L.I.
R.C.I.
L.C.I.
L.L.I.
L.C.
1. Control
8y:, +28y: 5 36
36
.69
1.72
2.94
3.06
1.72
.75
2. Control
y:2
9
1.33
1.78
3.11
3.33
2.56
1.56
31
.52
.81
1.90
1.68
.97
.61
Level
3. Anti calculus agent 1 4. Anticalculus agent2
7y:3 +24y: 6 31 y:.
5
1.00
.80
2.00
1.20
.60
.00
5. Anticalculus agent3
y:,
26
.23
.42
.77
1.31
.65
.19
90
Method
Table3.4.2
Distribution of Raw Calculus Scores for all ~ubjects (N = 107) and for Experimental Groups 1 (N = 36) and 5 (N = 26) Score
Tooth
Occurrence
Right Canine
Frequency in total sample Percent control1 Percent ex perimental5
Right Lateral Incisor
Frequency in total sample Percent control1 Percent experimental5
Right Central Incisor
Frequency in total sample Percent control1 Percent experimental5
Left Central Incisor
Frequency in total sample Percent control1 Percent experimental5
0
Left Lateral Incisor
Frequency in total sample Percent control1 Percent experimental5
Left Canine
Frequency in total sample Percent control1 Percent experimental5
2
3 4
4
5
6
7
8
9
10
11
1 2.8
69 63.9
25 22.2
7 11.1
84.6
11.5
3.8
51 44.4
33 19.4
9 13.9
5 2.8
.69.2
23.1
3.8
3.8
33 19.4
23 22.2
15 8.3
57.7
19.2
7.7
43 30.6
14 11.1
53.8
1 2.8
4 5.6
1 2.8
2 5.6
1 2.8
13 11.1
4 8.3
8 11.1
4 5.6
5 11.1
2 2.8
11 11.1
11 5.6
9 11.1
7 11.1
4 5.6
7.7
7.7
19.2
7.7
3.8
52 44.4
24 13.9
14 22.2
4
3 2.8
4 5.6
69.2
7.7
11.5
11.5
74 66.7
20 19.4
5 8.3
3
92.3
3.8
3 2.8
6 8.3
1 2.8
1 2.8
3 8.3
2 5.6
3 3.8
To test hypotheses about mean effectiveness, information on the variances and covariances of the measures is also required. The pooled within-group sum of products is 7
Sw= L Sw.= Y'V-v:ov. J~l
=
J
["6
186 238 265 170 140
(Symmetric)
428 518 981 570 995 351 577 190 277
1206 697 519 299 226
213
(Symmetric) 48.10 84.09 166.13 156.96 300.47 557.02 157.77 309.65 563.80 586.86 96.41 184.33 334.48 348.04 210.38 48.59 89.42 165.13 169.13 1 04.56 55.1 0
Summary of Multivariate Data
91
(Symmetric) 137.90 101.91 261.87 81.04 217.53 423.98 107.23 260.35 431.20 619.14 73.59 166.67 242.52 348.96 308.62 91.41 100.58 111.87 129.87 121.44 157.90 The variance-covariance matrix is 1.38 1.02 1 Vw= 107-7Sw= .81 1.07 .74 .91
(Symmetric) 2.62 2.18 4.24 2.60 4.31 1.67 2.43 1.01 1.12
6.19 3.49 3.09 1.30 1.21
1.58
Vw has 100 degrees of freedom. The within-group standard deviations and correlations are Dw 112 =diag (1.17, 1.62, 2.06, 2.49, 1.76, f26) and
Rw=
1.00 .54 .34 .37 .36 .62
(Symmetric) 1.00 .65 .65 .59 .49
1.00 .84 .67 .43
1.00 .80 .42
1.00 .55
1.00
Right canine Right lateral incisor Right central incisor Left central incisor Left lateral incisor Left canine
Like the achievement measures of Sample Problem 1, the calculus measures exhibit a strong positive manifold of correlations. The structure underlying the present correlations is probably better determined, however. There appears to be two nonindependent components underlying the tooth intercorrelations. The first is a spatial pattern, with close teeth bearing a stronger relationship than more disparate teeth. Second, because of similar proximities to the salivary glands and to similar use in eating and brushing, the left and right canines tend to react alike, as do the left and right lateral incisors and the left and right central incisors. Thus we see high intercorrelations on the diagonal of Rw. from the lower leftto the upper right. Both structural components influence the correlation of measures taken from the two central incisors, which is the highest of the set (.84).
CHAPTER
q
Multiple Regression Analysis: Estimation Multivariate multiple linear regression analysis is presented in this and the next chapter. Construction of the regression model, the point and interval estimation of regression coefficients, and the prediction of scores and residuals are the topics of Chapter 4. Chapter 5 presents the partitioning of sums of squares and cross products according to specified sources of variation, and tests of signifiyance for individual and multiple independent variables. Mea. sures of association or correlation between the independent and dependent variables are described separately in Chapter 6.
4.1
UNIVARIATE MULTIPLE REGRESSION MODEL
A regression model is applicable to data having one or more measured independent or predictor variables xj· The xJ generally have scales with ordinal properties, although they may also comprise 0-1 dichotomous nominal measures. Let X;; be the value of variable xJ for obs~rvation i (i = 1, 2, ... , N). The X;j are assumed to be exactly known constants. rhey may be, for example, specific values of random variable Xj, or they may represent specific values of experimental variables determined by the researcher. Let Y; be the value of a random outcome y for observation i. The linear regression model relating the dependent variable y to q antecedents xJ is
Y; = a+ !31 xi!+ J32X;2 + · · · + J3qX;q + e; q
=a+ L f3Jx;;+e1
(4.1.1)
J=l
The J3J are partial regression coefficients, or weights applied to the xi! in the attempt to optimally predict y1• The J3J are common to all N observations and are population constants to be estimated from the data. The regression model is linear as long as all J3J are to unit power, that is, are not multiplied, raised to powers, transformed to logarithms, and so on. The constant a is included to assure equality of the left- and right-hand portions of Eq. 4.1.1; the term serves to absorb sealing differences in they- and 92
Multiple Regression Analysis: Estimation
93
x-variables. E; is the error, or diff~rence between y; and that portion of it predicted by the model. That is, E; = Y;-
(a+ i
/3;X;;)
J=l
E; encompasses both measurement error in Y; and errors in the selection and weighted summing of predictors. When there is only a single predictor variable ( q= 1), the model is the simple regression model; when q > 1, the model is that of multiple regression. Since the multiple regression model sub~umes the simple model as a specific instance, only the more general form is discussed in the following sections. Models for N randomly chosen subjects may be written in array form: Y1 =(1)a+f31Xu+f3zXlz+ · · ·+f3qX1q+E1 Yz = (1 )a+ f31X21 + f3zXzz + · · · + f3qXzq+Ez
.
.
Y:v=(1)a+f3 1 X,\'J+f3zXNz+ · +{3qXNq+E:v
The models may be represented in vector notation as the sum and product of matrices: Let y be the N-element vector of outcomes Y; and e the vector of errors. Then
[}[j
Xn
X12
Xz1
Xzz
XNl
XN2 ...
...
X']["]~q [~N Lq X2 q
/31
Ez
+
or
y = XP+E
(4.1.2)
X is the NX(q+1) regression model matrix consisting of a vector of unities corresponding to the constant term a, and of the N values on the q predictor variables. pis the ( q+1) x 1 vector of partial regression coefficients. It can be seen that the model for any one observation comprises a single row of Eq. 4.1.2. Let xj be the vedor of predictor values for observation i; that is, row i of the model matrix;
x; =
[1
xil
X;z
···
X;q]
Then Eq. 4.1.1 is identically' (4.1.3) The analysis of data under Eqs. 4.1.1-4.1.3 rests upon determining "best" estimates of the regression weights. The weights are used to obtain predicted or estimated outcome scores unqer the model, which can be compared with the observed y;. Hypotheses may be tested about the nullity of all or portions of p. The results of the hypothesis tests determine whether the model fits the data or does not. The model is said to fit the data if variation of theE; in the sample is small relative to variation in y. When this is the case, the difference of the two variances is attributable to the remaining term Xp; knowledge of an individual's
94
Method
x-scores does in fact give some knowledge of the outcome Yt· When variation in y and e are tl)e same, the x-variables do not aid in knowing y, and elements of fJ are null. Correlation measures reflect the extent to which variation in y is attributable to the x-variables. Frequently, the regression model is employed to provide estimates for some predictor variables and tests of significance for others. An initial model may include predictor variables that are known to be related to the criterion measure; for example, verbal and quantitative Scholastic Aptitude Test scores and highschool grades will certainly be included in any regression model predicting college freshman achievement. If we are employed by the college admissions office to derive a selection equation, our first and perhaps only purpose will be to obtain best estimates of the regression weights for these variables. The number of predictor variables for which regression coefficients alone are sought is the rank of the model for estimation. In addition, however, there may be other variables that are hypothesized or suspected to contribute to criterion variation. We may believe that a new test of motivation will make an important contribution to predicting college success. Thus we may include the motivation variable with our predictors. Significance tests can be employed to determine whether motivation accounts for additional criterion variation beyond that attributable to the aptitude and achievement scores. If we decide from this that the motivation score is worthy of inclusion in the model, we shall then want best estimates of the regression weights for all four measures. For maximum external validity (Campbell and Stanley, 1966) these estimates should be derived from a separate sample of observations. If motivation does not contribute significantly to criterion variation, it is excluded from the final model. The total number of predictor variables for estimation or tests of significance is q, the rank of the model for significance testing (q+1 if the constant term is counted). A variation of the regression model is the polynomial model of the form q
Yt =a+ ~ {33x/+e;. j=l
The model matrix for N observations is
°
X= [ ~2°
1 x1 X2 1
2 x 1 X2 2
XN°
XN 1
XN 2
x 1
· · ·
x 1• ]
· · ·
x2•
""•
XNq
with parameters a and {33 (j= 1, 2, ... , q). X; 1 is the value of the independent variable for subject i. Here interest is in the degree of polynomial in x that will optimally predict y. The powers of x differ from the xu of the Eq. 4.1.1 model in that the squares and higher powers of x are not directly observable. The purposes and modes of analysis directly follow those of the multiple regression model. For example, a child's standing height should be predictable from his age. Since height has a decreasing acceleration with age, we might postulate the
Multiple Regression Analysis: Estimation
95
model (4.1.4) where y; is measured height for subject i and X; is his age. Although squared age is not observable, its inclusion in the model accurately predicts the deceleration, as long as {3 2 is negative. To estimate a, {3 1 , and {3 2 , we might select a sample of children at various ages between birth and five years, and record the heights of each. The model matrix would .consist of a vector of unities, a vector of ages, and a vector of squared ages. If we wish to test the fit of the model, we might also include the third power, age cubed, as an additional predictor variable. The corresponding /3 3 will be nonzero if more complex growth trends occur. If the correct model is the simpler one of Eq. 4.1.4, tests of significance will reveal that /3 3 is zero. A further generalization of the regression model that fits the same analysis pattern is the response surface model. Here the independent variables are expressed both in their powers and cross products. For example, a complete model for a cubic "surface" in two predictors (x 1 and x 2 ) is Yi = a+f31Xil + /32X;2+/3aXi1 2+ f34X;z 2+/35Xi1Xiz+ f36Xi1Xiz 2+ f37Xi1 2Xiz+ /38Xi1 3+ /39X;23+E;
The model matrix is
x~ [j
Xu
X12
Xu
X21
Xzz
X21
XN!
Xm
X.v1
2 2
2
X12 2 2 Xzz
Xsz
2
XuX12
XuX12
'"]
2
Xu 2X12
Xu 3
2
x2lzxzz
x21 3 ?"~
2
XN1 2XN2
XNl
Xz1X22
X21X22
XN!XNZ
XN!XN2
3
Xm 3
with parameters a and /3i (j = 1, 2, ... , 9). As an example, consider Sample Problem 1, the creativity-achievement study. It is certain that general verbal intelligence plays a role in cognitive achievements such as "synthesis" and "evaluation." Thus the intelligence variable is included in the model as the first predictor x1 (in addition to the constant term). Creativity is hypothesized to be related to divergent achievement in ways not attributable to intelligence. Three creativity variables have been measured, and these become predictors X2, X3 , and X4. Finally it is hypothesized that the effect of cre~tivity upon divergent achievement is accentuated for individuals of high intelligence. That is, there is an interaction of creativity and intelligence in determining achievement scores. The higher the intelligence level, the more important creativity becomes. To create predictor variables to reflect the hypothesized interaction, cross products of standardized intelligence scores and the three (standardized) creativity measures are taken. The four variables are standardized so that scaling factors in one measure do not obscure the contribution of the other measure to the product. If we assume that x 1 through x4 are standardized, the cross-product terms are x 5 = x 1x 2, x6 = x 1x 3, and x1 = X1X4. The complete response model is
96
Method
The model matrix has a column of unities, four columns with intelligence and creativity scores, and three columns of their respective cross products. Tests of significance will reveal that {3 1 through {34 are nonzero to the extent that intelligence and creativity are related to the outcome. {35through /37 will be nonzero to the extent that the dependence of acbievement upon creativity is greater at high intelligence levels. The rank of the model is eight.
Exercise Examine the effects of interaction terms in a regression model, by assuming two standardized predictors Zt and z2. Let Zt and Zz have all interval values from -4 to +4. Obtain "predicted" y-scores by substituting in Yi = 6+.5zu+1.25z;~+.125ZaZi2 Graph y as height against z1 and z2 • Note that height increases directly with both Zt and z2, but that it increases to a greater degree with z2 at higher z 1 values.
4.2
ESTIMATION OF PARAMETERS: UNIVARIATE MODEL
The univariate regression model is given by Eq. 4.1.2 as y=X{J+e. We shall want to sample observations and use the observed values of the vector variable y at the particular levels of the xJ to estimate {J. We shall assume that the elements of y constitute N independent observations drawn at random from a specified population. Let {J represent the (q+1)X1 vector estimate of (J, to be computed from the sample. Then Eq. 4.1.2 can be rewritten in terms of the partition of effects in the sample. The observational equation is
,Y=X{J+e
(4.2.1)
y is theN-element vector of sample values on random variable y. e= € is theNelement random vector of sample errors, or residuals. e will differ from e for the sample to the extent that {J is not equal to the population value /l We shall employ the least-squares criterion to provide an estimate of {J. That is, {J is chosen in such a way as to minimize the sum of squared sample errors N
c=L el i=l
Equivalently, c=e'e
= (y-X{J)'(y-X{J) = y'y-y'X{J-{J'X'y+ P'X'X{J = y'y-2{J'X'y+ {J'X'X{J
(4.2.2)
Multiple Regression Analysis: Estimation
97
c may be a minimum when the ~artial derivatives with respect to the /3; are zero. The vector derivative is
ac'
A
---;}"=-2X'y+2X'XP Bpi
Setting the derivative equal to the null vector, we obtain the set of normal equations,
·
X'X{J=X'y To solve for the vector estimate of plied by (X'X)- 1 . Then /
p,
(4.2.3)
both sides of Eq. 4.2.3 may be premulti-
1
•
•
(X'X)~ 1 X'X{J= (X'X)- 1X'y !
and (4.2.4)
{J is the "best" estimate of p in the sense that it yields the minimum sum of squared errors (c) in the sample. This can be seen without calculus. Let some other estimate of P be fj•. Since is not equal to {J but has the same number of elements, we may write 1
p•
/J*=fJ+d I
The sum of squared errors w~th [J• in place of pis c*= (y-X/J*)'(y-X{J*) = [(y-X{J)-Xd]'[(y-X/J)-Xd] =(y-Xp)'(y-X/J)-2d'X'(y-Xp)+d'X'Xd. The first term is c; the second term is zero since A
I
d'X'(y-Xp) = d'X'y-d'X'X(X'X}-1 X'y= 0 I
The final term, d'X'Xd, is positivd since it is the sum of squared elements of the vector Xd. It wi II inflate c* to be !luger than c unless d = 0. That is, any estimate other than {J (Eq. 4.2.4) will result in larger residual values. I
Conditions for the Estima~ility of fJ One component in Eq. 4.2.4 is the inverse of the sum of squares and cross products of the columns or variables in X. In order for a unique inverse to exist, X'X must be of full rank (q+1). this condition is met only if N;:;. q+1, and no column of X can be exactly expressed as weighted linear combinations of other columns. That is, there must be more subjects than predictor variables. Also, the inclusion of both subtest and total test scores, or a set of scores that sum to the same constant for all subjects, will violate this requisite. Techniques have been developed for the computer to overlook the dependencies. However, the researcher should reconsider the meaning of his analysis if he finds himself in such a situation. : No further statistical conditi~ns need be met to estimate p. In particular, '
98
Method
the patterns of intercorrelations among the predictors or between predictors and criterion is arbitrary. In practice the number of subjects N is usually larger than q+1. When the two are equal, the regression model is trivial. This can be seen since X will be square and can itself be inverted. Then
/3 = (X'Xt 1X'y = X- 1(X')- 1 X'y =X-ty When substituted in Eq. 4.2.1, the equations simplify to
y=X/3+e =XX- 1 y+e =y That is, the "modeling" of N observations through an equal number of predictor variables will not lead to parsimony or to understanding the trends in the data. The original N outcome scores are undoubtedly better understood by the researcher, than a linear combination of an equal number of predictor variables.
Properties of fj Examination of Eq. 4.2.4 will reveal some of the relationships among the elements estimating the regression weights. Altering the order of the independent variables will not affect the estimates, other than to reorder them in the same way. Addition or deletion of independent variables will affect all of the remaining estimates. This is the case since each estimate is a function of all predictor variables. The nature of the interrelationships among the {3; is complex and depends upon the interrelationships of the predictor variables. The set of partial regression coefficients is determined so as to maximize prediction under this particular model only. Each coefficient may be read as a coefficient of regression of yon X;, "given" the values of the other predictors in the equation. To estimate {3 no assumptions about thedistribution of random variables yore are necessary. Let us now assume that over repeated samplings each of the ei is normally distributed with expectation zero and variance
r(e)
= <J'2 1=
diag
(<J'2 , <J'2 , , , ,
I
<J'2 )
From these assumptions, it follows that the distribution of y is also Nvariate normal, since X and {3 are both matrices of constants. The expectation
Multiple Regression Analysis: Estimation
99
of y is 8"(y) = 8"(X/He) = 8"(XJl)+8"(e) =
X{3
(4.2.6)
For any one observation 8"(y;) =xi {3
(4.2.6a)
x; is the 1 x (q+ 1) vector of x-val ues. The variance of y for the particular set of values in X is
'Po (y) =
g' [y-8" (y)] [y-8" (y)] /
=8"(y-X{3)(y-X(3)' = 8"(ee') =r(e) =
(4.2.7)
They;, like thee;, are independent and have common variance u 2 , at the particular values of the independent variables. This value is the conditional variance of y given the values of the independent variables X;. From Eqs. 4.2.6 and 4.2.7, we can deduce further properties of {1 as an estimator of (3. First, it can be seen that {1 is an unbiased estimator. The expectation of {1 is
8"({1) = 8"[ (X'X)- 1X'y] Since X is a matrix of constants, X'X is likewise fixed. (X'X)- 1 X' is a (q+1)XN matrix with rows defining linear combinations of the elements of y. Thus 8"({1) = (X'X)- 1X'8"(y) = (X'X)- 1X'X(3
=(3
(4.2.8)
{1 is also an efficient estimator; each element is the minimum-variance estimate of the corresponding population weight {3;. Over many samples we would obtain a range of estimates of each /3;. The degree of variation is described by the variance of each and covariance of each pair of estimated weights. Together these form the variance-covariance matrix of the vector [1. The matrix is (q+1)X(q+1) symmetric, and may be derived through the rules of expectation of Chapter 3:
/3;
r({J) = :r[ (X'X)- 1X'y] = (X'X)- 1X'r(y)X(X'X)- 1
using the rule given by Eq. 3.1.27. Substituting r(y) = u 2 1, 'f/ ({1) = (X'X)- 1X' u 2 1X(X'X)- 1 0
= u 2 (X'X)- 1 =u2 G
(4.2.9)
100
Method
The partial result G= (X'X)- 1 is termed the matrix of variance-covariance factors of the estimates. Each diagonal element is proportional (by a factor of tT 2 ) to the variance of one of the /3;; each off-diagonal element is proportional to the covariance of the two respective regression coefficients. The standard error of regression weight /3; is the square root of the jj diagonal element of tT 2G-that is, the square root of the variance of the element. The variance is (4.2.10) The standard error is
(]"/3·.=0"~ J
(4.2.11)
The variances and covariances of the /3; may be expressed in correlational form. The covariance matrix is tT 2 G. Following the usual procedures for correlations, let DG= diag
(tT 2 G)
= diag
(tT2
tT2gz2, ... , tT2gq+t,q+t)
gn,
DG 112 is the diagonal matrix of square roots of these elements, or the standard errors of the /3;. The inverse oG- 112 is diagonal and has nonzero elements 11tT~. The matrix of intercorrelations is (4.2.12) Elements of RG are of the form
[ rg ]··= 1)
A
(]" v
tTz[gu] ;r::::-1 [gii](J" y [g;;]
;r::::-1
A
All multiplications by tT 2 may be omitted, so long as tT 2 is assumed common to all observations. The unstandardized covariance matrix tT 2G is generally of use in determining the precision of estimation. Correlational form 4.2.12 is easier to interpret and provides some indication of the interdependence of the estimates. The information is useful in describing the effect of deleting predictors from the regression equation.
Estimating Dispersions From the N-observation sample, we may estimate tT2 , which in turn can be used for interval estimates and significance tests on {3. /3 is obtained under the condition that the error sum of squares, e'e=~e?, is minimal. This provides the conditions for estimating tT 2 . e'e=(y-Xfj)'(y-X/3) is the residual sum of squares from the model. y has N independent observations; q+1 terms, which are linear functions of y, have been subtracted. Thus e'e is the sum of squares of N-(q+1) independent random variables; e'e has N-(q+1) residual degrees of freedom. An unbiased estimate of tT 2 is &2 =
e'e N-(q+1)
Multiple Regression Analysis: Estimation =
101
(y-X/J)'(y-X{J) N-q-1
= y'y-/J'X'y N-q-1
(4.2.13)
The reader will also recognize this as being the variance of the conditional distribution of y at particular values of X;, when y and the X; have a multivariate normal distribution. The estimate of a 2 is the same whether the X; are fixed or random variables. The estimated conditional standard deviation of y is W=&. a may be substituted in Eq. 4.2.11 to estimate the standard error of each regression coefficient. The estimated standard error of /3; is
IT.B;
= &\l'[gJ
(4.2.14)
If y is normally distributed, then /3;, a linear function of y, is also distributed normally (with expectation {3; and variance a .B/- Then
t = /3;-{3;
(4.2.15)
&fJ;
follows a t distribution with N-q-1 degrees of freedom. The 1-a confidence interval on {3; is
/3;- k&fJ;:;;; /3;:;;; /3; + k&.B;
(4.2.16)
Alternately, expression 4.2.16 may be written {3;: /3;±kftfJ;
(4.2.16a)
k is the 1OOa/2 upper percentage point of the t distribution with N-q-1 degrees of freedom (tN-q-t,at2l· We may test that {3; is equal to any fixed value {3j by using Eq. 4.2.15. The hypothesis is Ho: /3; = {3j for the two-tailed test. Ho is rejected if ltl > k; otherwise maintain H0 • For a one-tail test, the critical value is k = tN-q-t,a and the sign on t; must be in the proper direction.
Some Simple Cases To understand the estimation of p, it is useful to consider some simple cases algebraically. The simplest regression model is y1 = a+e1• The model matrix is anN-element unit vector, X= 1. The estimate of pis
IJ=a =(X'X)- 1X'y = (1 '1)- 11'y 1 N
=-,;;2;Y; l=l
=y. With no predictor variables, the pop!,Jiation constant is the sample mean. The standard error is a 11 =a2 (1'1)-1 =a2 /N.
102
Method
It can be seen that the function of the constant in the model, and the unit vector in X, is to preserve the equality y= X{3+e. By absorbing scaling factors, the estimation of other terms in {J is not dependent upon the origins of the measurement scales. For this model the observational equation is Y; = y.+e;. y1 is a simple function of the origin of the measurements in the sample. If there is a single independent variable in addition to the unit vector, the model is y; = a+{3x;+E;. The vectors of X are x; = [1 x;]. The model matrix is
and
Then
and
=[~] =
J
y.-(3x. [ L:(xi-x.)(y;-y.) L:(xi-x.)z
Again we can see that the estimate of a preserves the model equality. The observational equation is y; = xi/1+ei or and
y;-y. = (3(x;- x.)+ei (3 describes the relationship between y and x when both are expressed in mean deviation units. The estimate of f3 obtained through matrix operations is identical to the usual simple regression coefficient of y on x. When there is more than a single predictor, however, the expressions become complex and even intractable in scalar algebra.
Multiple Regression Analysis: Estimation
103
Example Employing the data of Table 3.3.1, assume that y 1 is a criterion measure we wish to predict from a linear combination of x 1 , x 2 , and x3. They observational vector has N= 15 elements (.8, 2.2, ... , 1.9). The regression model matrix has four columns: a unit vector, X 1 (72, 78, ... , 83), X2 (114, 117, ... , 112), and x3 (17.3, 17.6, ... , 16.1). The corresponding q+1 =4 parameters are P' =[a {31 f3z {33] The model to be fit to the data is
The conditions for estimability of pare met: N exceeds q and no predictor is a linear function of other x-variables. The matrix of cross products for predictors is 15.00
(Symmetric] Constant xl X2 145,976.00 200,363.00 29,503.90 4383.13 x 3 21 ,585.00
X'X= [ 1263.00 107,009.00 1733.00 255.30 Constant
XI
x3
Xz
Then
G=(X'X)- =(10- x 1
3)
[96,279.46 -32.12 -779.26 -204.32
(Symmet,;cro ostant 2.24 -.60 -5.11
7.21 -.16
38.36 x 3
X1
Xz
X3
Constant
l
X1 X2
34.00l Constant 2918.70 x 1 X'y= 3934.10 X 2 585.70 x 3 and
[ {J=
-5.613~Constant
,
.086 X1 .008 Xz -.017 X3
The estimated variance is
1
~
fr 2 = 15 _ 4 (y'y-p'X'y)
1 =TI (85.38-81.79)= .33
The standard deviation is fr =
\!33 =
.57
104
Method
The standard errors of the regression coefficients are the square roots of the diagonal elements of 8-2G:
&a = .57v'96.2s = 5.608 (r ilJ
= .57-v:o522 =
.027
&rtt = .57Y.0072 = .049
&13:\ = .57Y.0384 = .112
The .95 confidence interval on {3 1 requires tu .. o25 = 2.201. The interval is
f3t
~
.086+2.201 (.027)
.027,; f3t
~
.145
.086-2.201 (.027)
~
or We reject Ho: /31 = 0, and maintain predictor x1in the equation. The equation with all three predictors is
Y; =-5.613+.086x 11 +.008x 12 - .017x 1:J+e1
Prediction Having estimated p, we may obtain the vector of scores on the outcome variable as predicted by the model. The vector of predicted or estimated scores is (4.2.17)
y=X/3 or
(4.2.17a) Substituting Eq. 4.2.17 in the observational equation (4.2.1 ), we have
y=y+e
(4.2.18)
With the inclusion of the unit vector in X, the mean of the predicted scores is the mean of the observed scores (y.). The expectation of y is 8'(y) = 8'(X/j)
=XP
(4.2.19)
The covariance matrix of the predicted scores is of use in applying Eq. 4.2.17 and in deriving interval estimates. That is, V(y) =V(X/j) =XV(/j)X' =
X0' 2 (X'X)- 1X'
= ( T2
XGX'
(4.2.20)
Predicted mean y at a particular set of x-values is given by Eq. 4.2.17a. In
Multiple Regression Analysis: Estimation
105
vector notation, (4.2.21)
x; is the vector of values on the x-variables, or one row of X. The variances of theN estimates are on the diagonal of Eq. 4.2.20, and the covariances are in the off-diagonal positions. The standard error of Yi is the square root of the variance, or the square root of the ii diagonal element of Eq. 4.2.20: (4.2.22) The sample value frfi. may be obtained by substituting fr 2 for cr2 . Unlike the original observations.' the predicted scores generally have nonzero covariances. Let us inspect the variance for a one-predictor situation. Let xj = [ 1 X;] be the jth row of X for one observation. Then Y; = xj{J and
=
cr2[1
X;][gn 9zt
9tz][1 J 9zz
Xi
It can be seen that the variance of the estimated mean of y is minimal (cr2 /N) when estimated at mean x (x;=x.). The variance increases as X; moves away from x., and the precision of estimation decreases. When X;~ x., the variance contains components relating to the precision of both & and /3 in the regression equation; when x;= x., only & need be considered. The standard error may be used to obtain an interval estimate of mean y at the values x;. The 1-a interval is (4.2.23) k is the 100a/2 upper percentage point of the t distribution with N-q-1 degrees of freedom ((v-q-Lalz). The prediction of values other than mean y tor given xi involves additional variation of the scores about their mean. The variance of y-scores about the mean is cr2 , which is the same at all sets of x-values. Thus if Y; is an estimate of a particular y-score instead of a mean, the error variance is increased by the addition of the cr2 component. In this case crfi/ = cr2 (1 +x{Gxi)
which may be estimated by substituting fr 2 for cr2 •
(4.2.24)
106
Method
The vector of sample residuals is
e=e =y-y =y-X/l
(4.2.25)
The mean of the ei is zero; the sample variance is &- 2 , with N-q-1 degrees of freedom. The residuals are useful in locating observations that do not follow the model and observations that do, and in isolating unmodeled trends in the data (Anscombe and Tukey, 1963; Draper and Smith, 1966). The expectation of e is Ef(e) =Ef(y-X/l) (4.2.26)
=0 The variance-covariance matrix is r(e)=r(y-X/l) ='r'(y-XGX'y) =r[(I-XGX')y] = (1-XGX')r(y)(I-XGX')'
(4.2.27)
= u 2 (1-XGX') The standard error of the ith residual, element of Eq. 4.2.27. That is,
ue;•
is the square root of the ii diagonal (4.2.28)
Substituting the estimated iT, e/&e. follows at distribution with N-q-1 degrees of freedom. The 1-a interval estim~te of Ei is (4.2.29) k is the 100a/2 percentage point of the t distribution with N-q-1 degrees of freedom (tN-q-J.aJz).
Example The predicted score for variable y, of Table 3.3.1 is obtained by
Y; =
-5.613+ .086x11 + .008x;2 - .017x;3
For the first set of predictor values,
x;=[1 and
114 17.3]
72
y, =x;/l= 1.20
The standard error of mean
y, at x, is
&Vx1 (X'X) 'x,
=
.57V:S0 = .403
Multiple Regression Analysis: Estimation
The .95 confidence interval, with t11 .. o25
1.20-2.201 (.403)
107
= 2.201, is
~ /.LytiXt ~
1.20+2.201 (.403)
or
The residual for x1 is
The standard error is
G-e 1 = .p7Y1-.50
= .40
The model overpredicts y1 at x;. If the first subject had followed the model exactly, he would have attained a score of 1.20; instead we observe only .80. The difference is the extent to which we cannot know the outcome from knowledge of the antecedent (xi) variables. The residuals for all observations are given in Table 4.2.1. The mean (within rounding error) is zero. Only one residual is outstanding in its value, e 14 = -.97. Although x; 4 is close to the mean vector, Yt4 = 90 is considerably below the group mean, leading to the large difference between the predicted and observed outcomes. Table 4.2.1 Observation (i)
1 2 3 4 5 6 7 8 9 10 11 12 13
14 15
Predicted Scores and Residuals Observed (y;)
Predicted ()/;)
Residual (e;)
3.00
1.20 1.73 2.29 3.21 2.57 2.20 2.95 2.39 2.60 1.91 2.48 3.04
1.60 .90 1.90
1.37
.23
1.87 2.15
-.97 -.25
.80
2.20 1.60 2.60 2.70 2.10 3.10 3.00 3.20 2.60 2.70
-.40 .47 -.69 -.61 .13 -.10 .15 .61 .60 .69 .22 -.04
Summary We may now inspect portions of the model, y=XP+e, in juxtaposition. The covariance matrix of e and yare bot,h u 2 1 (see Eq. 4.2. 7). The product Xp has no variance in the population, since pis a vector constant that multiplies the model matrix of fixed values. '
108
Method
The estimate fj varies over repeated samplings, as do vectors y and €= e. The covariance matrices of y, Xfj, and e are CT2 1, CT 2XGX', and CT 2(1-XGX'), respectively. We may note the following relationships: 1. r(y) = r(Xfj)+r(e). The exact partition of variation and covariation of the observations follows from the normal equations, requiring X'y-X'Xfj=O. Then fj'X'y-fj'X'Xfj = 0 and (Xfl)'(y-Xfj) = (Xfl)'e = 0. That is, Xfj and e are always orthogonal partitions of the observational vector, y. It follows that Xfl and e are statistically independent, since g>[(Xfl)'(y-Xfl)) =0=g'(Xfl)'g'(y-Xfj). 2. In general, as predictors are added to the model, CT 2 will decrease and the diagonal elements of XGX' will increase. At some value of q between zero and N there will be an optimum selection of predictor variables, beyond which the decrease in CT 2 by the addition of predictors will be exceeded by the increases in the diagonal of XGX'. It is beyond this point that additional predictors are not contributing to predictive power, but are instead compounding error. At the same time, it can be seen that decreases in the variance terms of CT 2XGX' are accompanied by parallel increases in the variances of e. 3. Individual observations y; and errors Ei are both independently distributed. This is seen through their diagonal covariance matrices. The predicted scores and sample residuals are not independent except in the (rare) situation in which X is columnwise orthogonal. In general this correlation presents little problem in the plotting and interpretation of the residuals, so long as the number of predictor variables is small relative to the sample size (Anscombe and Tukey, 1968).
4.3
MULTIVARIATE MULTIPLE REGRESSION MODEL
When more than a single outcome measure is to be considered, a multivariate form of the linear regression model is appropriate. Multivariate techniques are useful for obtaining multiple univariate results from a given sample. In addition, simultaneous confidence intervals and significance tests may be obtained. However, these are only as meaningful as the common trait being measured by the outcome variables! Assume that observation i has been measured on p (~ 1) random variables Yk· In addition, scores are obtained on q antecedents xJ, as for the univariate case. The linear model relating the two sets contains p separate univariate equations: av] [Yil Ytz ... Y;v] =[a! lXz + xil[f3u +X;z[f3z!
/312
f3Ip)
f3zz
f3zp]
+xiq[f3qJ
f3qz
+[Eil
Ei2
...
... E;p]
f3qp]
(4.3.1)
Ytk is the response of the subject on criterion yk; xij is his value on predictor x;.
Multiple Regression Analysis: Estimation
109
qf
Any one term of the vectors Eq. 4.3.1 reproduces univariate equation 4.1.1 exactly. A second subscript is added to Yi and ei to indicate the respective criterion measure. The constants ~ and {3; differ for each criterion measure. Thus these too have an additional subscript to correspond to they-variate. ak is the scaling constant for criterion y/c; {3;k is the partial regression coefficient relating Yk to predictor X;. The same antecedents appear in the model for every outcome measure. Their differing importance is reflected in different {3;k values. Expression 4.3.1 may be represented more succinctly in matrix form. Let y[ be the p-element outcome vector 1
[yi~
Yi =
Yiz
· · · Yiv]
Then Eq. 4.3.1 is represented as
y[=x;B+ei
(4.3.2) i As in the univariate model, xi contains the values of the predictor variables for observation i; that is,
x; =
[1
Xiz
Xu
Xiq]
B is the (q+1)Xp matrix of partial ~egression coefficients, for predicting each outcome measure from the independent variables: I
~:2 ::: ~:PJ
[;:,1.
B=
f3ql
f3oz
···
f3qv
···
f3v]
I [{31 . f3z
=
Each column f3k contains exac~ly the coefficients for predicting yk alone, from the xj. Thus ak and {3;k for any outcome variable are not affected by the addition of other criterion measu~es. This is in contrast to the addition or deletion of independent variables, which will generally alter all the values. We may also represent the mod~ls for N subjects in matrix form. Let us first juxtapose theN models: ·
y; ,k x;B+e; I
Y~ ~ xfB+e~
i : Y~7x,~B+e~ !
Since B is common to all observations, the models may be written in matrix form, as
y; [y;] .
~,~
=
[x;J [e;J X~ .
x:~
E~
B+ .
.~~
110
Method
or
I~:~ ~~:
l~Nl
~::] [~ ~:: ~~: ::: ~::]~{3:1 .
. •
Y.vz
YNP
=
·
·
•
0
.
.
1
XNl En
E12
Ez1
Ezz
ENl
E.vz
+ . [
X.vz
···
X.vq
f3zl
. •
. ql
·..··. Ew] Ezp E,vp
Let V be the Nxp observed matrix with elements Y;k, and E the Nxp matrix of errors Eik· Then the equations may be represented as
V=XB+E
(4.3.3)
X is the Nx(q+1) model matrix, exactly as in Eq. 4.1.2. Any of the model matrices described in Section 4.1 can also be substituted in the multivariate model. Each row of V, X, and E corresponds to a single observation; each column of VandE to a single criterion measure. B hasp columns of weights for predicting each variable Yk from the q antecedents xj. The primary goals in the multivariate case are the same as for a single response measure. We shall draw a sample of N observations at random and use the observed data to derive point and interval estimates of the entire matrix B. Tests of significance can be obtained for the regression weights for any one criterion measure, or jointly for all outcomes (that is, for one or more rows of B). In this manner we determine whether the x-variables, alone or together, predict the multiple-measure response.
4.4
ESTIMATION OF PARAMETERS: MULTIVARIATE MODEL
Let B be the (q+1) Xp estimate of B to be obtained from sample data. The observational equation for the multivariate situation is
Y=XB+E
(4.4.1)
E is the Nxp matrix of sample residuals or errors, with elements f;.k= e;k· In order to estimate B, we will minimize the squared sample residuals for all of the outcome measures. The sum of squared residuals for one outcome measure is one diagonal element of E'E. Their sum is the trace of E'E. Let c= ~
eik 2
Lk
=tr (E'E) =tr [(V-XB)'(V-XB)]
Multiple Regression Analysis: Estimation
111
To minimize c we set the partial derivatives with respect to the elements of B to zero and solve. The resulting normal equations are :X'XB=X'Y
(4.4.2)
Premultiplying by (X'X)-1, the esti~ate of B is the (q+1) xp matrix
B=
(X'X)-l X'Y
I
(4.4.3)
=GX'Y
Expression 4.4.3 shows ob~ious similarities to the univariate solution (4.2.4). In fact, each column of B consists of exactly that set of regression coefficients for predicting a single oJtcome measure. The same result would have been obtained for the particular variate had it been considered alone for univariate estimates. The estimability criterion is also the same. ForB to have a unique solution, the number of subjects must exceed the number of predictor variables and no predictor can be expressible as an exact linear combination of other antecedents. When the con~itions are met, X'X is nonsingular and can be inverted. There is no restriction on the intercorrelations of they-variables.
Properties of B The effects of order in the matrix estimate are the same as in the univariate· model. Interchanging either predictors or criterion measures will not alter the value of the regression weights,, but will only interchange their order in the same manner. The addition or deletion of y-variables will not affect the remaining estimates, since the columns pf Bare multiple univariate results. The addition or deletion of predictor variables, however, will generally affect the values of all coefficients. Each row of B is the set of regression coefficients of vector y on predictor X;, "given" the other predictor variables, or at the particular ' values of the other predictors. Let us assume that the errors for observation i (elements of e[) have a p-variate normal distribution with expectation 0' and variance-covariance matrix l:, at any set of values xi: (4.4.4) Further, we shall assume that the e'rrors are independent across observations. Since e[ is a row of E, the e~pectation of E is an Nxp null matrix. The assumption that Y(e[) =I, simply asserts that there is a general interrelationship of the criterion measures. That is, I, is the pxp covariance matrix' among variates, or among elements of a row of E: '
CT1 z
I,= [ ~z1 ;PI
CT1z
· · •
p]
CT1
crl ·.· · ~zv CTpz
.• •••
CTp2
Since observations are independ~nt, the matrix of covariances between the p elements in one row of E and ~he p terms of any other row is a pxp null matrix, 'f'(ef, ei.) = 0 fori""' i'.
112
Method
The assumptions may be represented in several ways. A column of E contains the errors for all observations on one outcome measure. The variancecovariance matrix of one column (ek) is crk 21. That is, the variance of each eik is uk 2 , the variance of the criterion measure; the covariances of errors across observations are all zero. Thus, the distribution of a single column is (4.4.5) Since the column contains all the errors for one criterion measure, expression 4.4.5 is similar to the univariate form (4.2.5). The only distinction is that cr2 requires a subscript to indicate which criterion measure the column represents. Expressions 4.4.4 and 4.4.5 summarize the distributional information about E.* To facilitate algebra involving E, we may write the total covariance matrix of all elements. The result is an NpxNp matrix which consists of blocks, each block being a pxp matrix. On the diagonal of the large matrix is the matrix of variances and covariances of any one row of E-that is, ~- The off-diagonal matrices are covariance matrices of pairs of rows of E, that is, 0 (pxp). The entire covariance matrix has the form, :V(E) = diag (~. ~ •... , ~)
I 0 I
~
I 0
Row 1 of E
I 0
Row 2 of E
~~~~-~---~~~~-~--~-4---
0 I
~
I
· ··
---t-----~---------1--1 I I
I I I
I I I
---t-----~---------1--0 I 0 ! I ~
-s>o
tz, )'
-s>o
Row N of E
-s>o
tz,<2
tz,""'/,;
The total covariance matrix may be found as the Kronecker product of an order-N identity matrix, with~- That is,
(4.4.6) The distribution of matrix E is (4.4.7)
0 is an Nxp null matrix. The Kronecker product form (4.4.7) can be used to find the distribution of matrix V. Y is distributed in multivariate normal form with expectation iif(Y) = iif(XB +E)
=XB
(4.4.8)
For any one observation (one row of V), iif(y;) = x;B
(4.4.8a)
•For readers not concerned with the derivation of the various variance-covariance matrices, this discussion may be bypassed, through expression 4.4.9.
Multiple Regression Analysis: Estimation
113
The variance-covariance matrix of Y given X, is f/"(V) = iif
[Y -iif{Y)] [Y -iif(Y)]
I
= iif(Y- XB) (V-XB) I =iif(E E 1 ) 1
=Y(E) I
=I®I
(4.4.9)
The yf vectors for observations are independent of one another. The variates within the vector observa~ions are generally related, with variancecovariance matrix I. This is the qovariance matrix of the y-variables at any particular X; value, and is termed the conditional variance-covariance matrix. The variance of criterion Y~r at partkular x-values is the kk diagonal element, ak 2 · In summary, the distribution of vector observations is (4.4.1 0)
I may also be expressed in sta~dardized or correlational form in the usual · manner. Let
a=:= d iag (I) I
be the diagonal matrix of variances from I. a 112 is the matrix of standard deviations, at the particular values df the predictors. The matrix of partial correlations among the criterion variables is I
(4.4.11)
with elements -1 ,;; p;;,;; 1. Adjusting the variances or correlations to particular x-values is sometimes termed "holding the x; constant" or "removing the effects of X." Both the covariance and the variances invol'ved in P;; have been adjusted for the xvariables. Thus p;; is termed a partial correlation. The estimate B, like its univariate counterpart, is unbiased and minimum variance. The expectation and varianfe-covariance matrix of Bare iif(B) = iif [(X 1 X)- 1 X 1 V] =
~x~x)- 1 X 1 iif(Y)
=
(X'X)- 1X'XB
=~
(4.4.12)
and A
I
Y(B) = Y[(X'Xt 1 X'Y] =
(X'X)+ 1X'Y(Y)X(X 1 X)- 1
=
(X'X)- 1X' (I ® I)X(X'X)- 1
=(X'X)-' 1 ®I
=G®I
(4.4.13)
114
Method
The Kronecker product is a p(q+1)xp(q+1) symmetric covariance matrix of all the /3;k· G®l: may be d,rawn as the matrix of scalar products of l:; that is, ~
I
~
~
I
I
gu._, 1 g12..:. I •• • I gl,q+l._, -----~------~----~------1 I I g21l: I 9zzl: l · ·· l gz, q+ll: -----i------~----,------1 I I
I I I
·.
·
I I I
1
· · ·
1
-----i------~----t------gq+l,ll:
1
gq+1.2l:
gq+l .•+ll:
Each pxp diagonal block [g;;]l: is the covariance matrix of one row of the estimated regression coefficients (relating predictor X; to the criterion measures). We can see that elements of a row of Bare interdependent to the extent that y-variables have nonzero covariances in l:. The off-diagonal blocks contain the covariances of the estimated regression coefficients for different predictor variables. The variance-covariance matrix of a single column fjk, is
Estimating dispersions To estimate the precision of B and the interrelationships among variates and coefficients requires a sample value for !. The estimate is provided by residual variation about the model. The residual sum-of-squares and crossproducts matrix is
SE=E'E = (Y-XB)'(Y-XB)
(4.4.16)
/
Multiple Regression Analysis: Estimation
115
Each element has N-(q+1) degrees of freedom. The maximum likelihood estirnate of the pxp covariance matrix is
1
A
I= N -q- 1 SE =
(Y-XB)'(Y-Xll) N-q-1
Y'Y-B'X'Y N-q-1
(4.4.17)
We note that i is also identically the covariance matrix of the p outcome measures, given the q+1 additional variates, if all p+q+1 variates are distributed in multivariate normal form. It is the unbiased estimate of l22!t in expression 3.2.1 0. The diagonal elements of l; are the variances for each of the p separate measures, ftk 2 , as would be obtained for that variable alone by 4.2.13. The standard deviation of yk is v'ii;1=a-k. These elements may be used for reducing l: to the matrix of partial correlations and for obtaining interval estimates of the elements in B. The partial correlations are estimated by substituting i in Eq. 4.4.11. Let !=diag (:i) be the matrix of variances of y at the particular x-values. Then the correlations among they-variables, holding X constant, is
(4.4.18) Each element fJu = &ul ft1ft; is the correlation of y 1 and y1, removing any variance in the two measures. that is shared by the x-variables. The standard error of any one regression weight is given by 4.4.15. Substituting tTl for crk 2 will provide the estimate
(4.4.19) This term is the same as in Eq. 4.2.14, with an additional subscript to denote a particular y-variate from the set. Assuming that yk is normally distributed, then
t = ~Jk-f3Jk a-!3jk
(4.4.20)
follows at distribution with N-q-1 degrees of freedom. t may be used to test one- or two-tailed hypotheses about [3;k or to construct interval estimates. The 1-a confidence interval is the same as in expression 4.2.16; that is,
(4.2.21) or
/3;k: ~Jk ± kft131k
(4.4.21 a)
k is the 100a/2 upper percentage point of the tdistribution with N-q-1 degrees of freedom.
116
Method
It is convenient to represent the standard errors of all the elements of B as a (q+1)xp matrix H, such that the standard error of (J;"' is element [h,k]. H can be constructed element-by-element from Eq. 4.4.19 or as a matrix product. Each standard error has two components. Let d' be a 1 xp row vector of variable standard deviations: (4.4.22) Let g be the (q+1)x1 vector of square roots of the diagonal of G:
g=
ll ~
\1'% v~q+l.q+l 0
(4.4.23)
Then the matrix of standard errors is
l
H=gd' h'l ]
~ ::: ..
(4.4.24)
Each row of H contains the standard errors for the same row of :8, for predicting all criteria from a single x-variable; that is, (4.4.24a) Intervals may be drawn on an entire row of B, as a set of p simultaneous intervals for one predictor. Let {J'; be the jth row vector of B, and fj'; the same row from the estimate. Then the p intervals are formed by adding and subtracting a multiple of h'; from the estimate fj' 1. tJ'.· P.1D'.+kh'·1 PJ·
(4.4.25)
where
k=
-J_(N-q-1)p · N-q-p Fp,,\-q-p,a
(4.4.26)
is the upper 100a percentage point of the F-distribution with p and N-q-p degrees of freedom. The multiplier k assures that the confidence level for every one of the p separate intervals is at least 1-a. When p = 1, expression 4.4.25 is identical to 4.4.21 for a single coefficient. As p increases so does k, yielding a wider interval for each separate coefficient in the vector. We may also draw confidence intervals on a vector that is a linear combination of the rows of B. Let v' be a 1 x(q+1) vector defining a new vector v'B, which is a weighted sum of the {J' 1. The covariance matrix of the linear combina-
Fp,N-q-p,a
Multiple Regression Analysis: Estimation
tion is
117
r (v'B) = v' r (B)v =v'(G®I)v =v'Gv®I
=wi
(4.4.27)
The scalar w= v'Gv = v'(X'X)- 1v. If we substitute I for I, the variances of the linear combination of coefficients, tor p y-variates, are the diagonal elements of wi. The estimated standard errors are the square roots, and may be put in vector form:
h' =
vw[a-1
&z
· · · &vJ
(4.4.28)
=Vwd' d' is the same as in Eq. 4.4.22. The point estimate of v'B is v'B. The 1-a interval is v'B:
v'B±kh'
(4.4.29)
where k is defined by Eq. 4.4.26. The confidence level is at least 1-a tor each of the p intervals in the vector. An obvious special case of 4.4.29 is with v a column of an identity matrix (a unity and all other elements zero). In this instance, v'B is simply one row of B, and expression 4.4.29 simplifies to 4.4.25. In other instances, we may wish to draw intervals on differences and weighted differences of the p'j when we are concerned with the comparative impact of multiple predictors upon the criteria.
Some Simple Cases Let us examine some simple cases algebraically. In the univariate case (p=1), expressions 4.4.3 and those following reduce to the forms presented in Section 4.2. Consider the bivariate case with one independent variable (p = 2; q= 1). The Nx2 data matrix is
The model matrix is
l?
X= :
1
and
X] XN
118
Method
as in Section 4.2. The matrix of estimated regression coefficients is
B= (X'X)- 1X'y
The first row, or a', preserves the equality of the two sides of the model for both variates simultaneously. sl and Sz are the simple regression coefficients of Yl on x and y2 on x, respectively. Let e; represent the two-element row of E, corresponding to observation i, and x; = [1 x;]. Then including the expressions for B in the observational equation,
=
[Y·~J-(xi-x.)[~~]+[eil] Y·z f3z e;z
and
The elements of B yield two simple prediction equations for y 1 and y2 , respectively. The regression weights relate the x-variable to each of they-measures, when all three are expressed in mean deviation units.
Example
Using the data of Table 3.3.1, assume that y1 and y2 are two dependent variables, to be predicted from a linear combination of three antecedents, X~o x2. and x,l· The regression model is
[ Yil
Y;z] =[at <Xz] + X;1( f3a f312J + x;z( /321 f3zz] + X;a[ /331 /33z]+[ Eil Eiz]
Multiple Regression Analysis: Estimation
119
When models for the N = 15 subjects are juxtaposed, the matrix of ob· served outcome variables Y has 15 rows and 2 columns, for Y1 and Y2. re· spectively. The model matrix X is 15X4, having the unit vector and X1 , x2 , and X3 , respectively, as columns. X is identical to the model matrix for predicting y1 alone, as in Section 4.2. The complete parameter matrix is
E is 15 x 2, with a column for each y-variate. The univariate equation for y1 alone is obtained simply by extracting the first column of Y, B, and E. Following Eq. 4.4.3,
(Symmetric] Constant
[96,279.46 G= (X'X)-1 = 1Q-3x
2.24 X1 -~~~:~~ -.60 7.21 x2 -204.32 -5.11 -.16 38.36 X3 Constant Xz X3 X1
~ ~.00
37.70~Constant
3225.10 X1 - 3934.10 4380.20 x2 585.70 651.79 X3 Yz Y!
X'Y- 2918.70
and
:8-
[-5.6t3 -20.35~ Constant .086 .008 -.017 Y!
.047 Xt .145 x2 .126 x3 Yz
The first column of B is identically /l for predicting Y1 alone in Section 4.2. The estimated variance-covariance matrix of Y given X is 1-(Y'Y-B'X'Y) i=15-4
=n1 [85.38 92.21 =
92.21] [81.79 89.83] 105.31 - a9.83 1o1.9a
[.33 .22] Yt .22 .30 Yz Y1 Yz
120
Method
The standard deviations of Yt and Y2 are
o-1 = Y.33=.57 fr2
= Y.3o = .55
The partial correlation of y 1 and y 2 is
Ptz =
.22tv'.33(.30) = .69
Eliminating the three x-variables, Y1 and y2 have a high positive correlation. (We may also wish to compare this with the unconditional correlation of Yt and y2 , without adjusting for the x-values.) G and i contain all the information about the dispersion of B. The entire variance-covariance matrix of all p(q+i) elements is the Kronecker product GG?Ji. Extracting the variances and covariances for just the first column, the variance-covariance matrix of /3 1 is .33G; for just /32, it is .30G (see 4.4.14). The square roots of the diagonal elements are the standard errors. Let
.55] Y2
d' =[.57
Yt and g=
r~l [9:~!~] ~~onstant \f.0072 .085 =
X2
Y.0384
.196 X3
The standard errors are element-by-element products ofg and d':
5.608 H- [ .027 .049 .112
Yt
5.395~ Constant .026 .047 .108
Xt
X2
X3
Yz
The first column reproduces the univariate results for y 1 alone in Section 4.2. The standard errors for predicting Y2 alone are fr 2 times the same G multipliers. Confidence intervals may be drawn on single elements according to Eq. 4.2.16 or 4.4.21. A multivariate confidence interval on the vector [/Jn fJt 2 ] may be constructed by 4.4.25 or 4.4.29. Using 4.4.29, let v' = [0 1 0 0]. Then
w=v'Gv= .0022 or simply g 22 • The vector of standard errors is
\1.0022[.57
.55]= (.027 .026]
= h'2
Multiple Regression Analysis: Estimation
121
the second row of H. The .05 F-value is F2,1 0 ,. 05 = 4.10, and
k=
~11(2\~.10) = 3.00
The .95 interval estimate is .086-3(.027)
~
{311 ~ .086+3(.027)
.047-3(.026)
~
{3 12
~
.047+3(.026)
or .005
~
{311
~
.167
.:.._ .031 ~ {312 ~ .125 The interval on f3u alone is somewhat wider than the univariate interval for !31 in Section 4.2. Here, the confidence level for each coefficient is at least .95. A two-tailed test with a=.05 would not allow us to reject H0 : {312 =0.
Prediction Scores on the p criterion measures as predicted by the model, are the "best" linear combination of x-values (4.4.30) The expectation of Yis iif(V)=i?(XB)
=XB
(4.4.31)
The variance-covariance matrix is 'r(Y) =r(XB) =Xr(B)X'
=XGX'@I
(4.4.32)
If x; is a particular (q+1 )-element vector of values on the independent variables (one row of X), the estimated mean vector at that point is one row of Y; that is,
y;=x!B =
[~iP1
x;p2 · · · xi/lvJ
(4.4.33)
I
Each element is the prediction equation for one criterion, as in Eq. 4.2.17a. The standard errors of the p predicted means are a simple extension of the univariate case (4.2.22). The variance-covariance matrix of the vector is (4.4.34) The square roots of the diago'nal elements are the standard errors for the I
122
Method
separate variables (J"·
#tk
=(Jdx;Gx;)1 12
(4.4.35)
(Jk is the standard deviation of variable Yk· Substituting the sample value for (Jk yields the estimate UfJu,· Confidence intervals may be constructed on single elements by expression 4.2.23, using the t distribution with N-q-1 degrees of freedom. Simultaneous intervals for the elements of y; are obtained by juxtaposing the p standard errors:
(4.4.36) The 1-a interval estimate of the vector mean is obtained by adding and subtracting a multiple of h' from the estimate y{:
y;:
y; ±kh'
(4.4.37)
where k=
1(N-q-1)p N-q-p Fv,,v-q-v,a
(4.4.38)
p and N-q-p degrees of freedom. The confidence level tor every individual estimate in the vector is at /east 1-a. The residuals in the multivariate case are the Nxp matrix
Fp,N-q-v,a is the upper 1OOa percentage point of the F distribution, with
E=Y-Y =Y-XB
(4.4.39)
The expectation of E is
lt(E)=It(Y-XB)
=0
0 is of order Nxp.
(4.4.40)
The variance-covariance matrix is
r(E) = r(Y- XB) =7/'(Y-XGX'Y) =:V'[(I-XGX')Y] =
(1-XGX')(I®l:)(I-XGX')'
=(1-XGX')®l:
(4.4.41)
The standard errors of the residuals are the square roots of the elements on the main diagonal of 4.4.41. The standard error of the residual tor observation i on outcome k is the same as Eq. 4.2.28 for one variable yk: (Jeik
= (J":Y1-x{Gx;
(4.4.42)
Substituting &" for (Jk provides the sample value. Assuming normal y, e;k/Cteik follows at distribution with N-q-1 degrees of freedom. Expression 4.2.29 may be employed to construct interval estimates.
Multiple Regression Analysis: Estimation
!
123
Summary In the multivariate case, the original observations V are partitioned into orthogonal components XB and E, having additive covariance matrices
I® l
1
= XGX,' ® l+(I-XGX') ® l
XB is determined after the selection
of predictors which are hypothesized to be related to a set of outcome measures. The predictors may include measures known to be related to y, for which regression weights are required, as well as variables whose contribution toy is to be tested from the data.
4.5
COMPUTATIONAL FORMS
Inspection of expressions 4.4.3 and 4.4.17 reveals that both Band I may be obtained from sum-of-product matrjces X'X, X'V, and V'V. X'X and X'V involve the q+1 independent variables, and X'V and V'V the p outcome measures. When N is large, sum-of-products matrices are generally smaller than the data matrices and are easier to utilize in cpmputations. Let v; be the (p+q)-element veQtor containing scores for observation ion all outcome measures and all predictor variables; that is,
v; = [y.f, x;]
= [yil
Y12
Y;p
· · ·
I xil
X;2
· · ·
X;q]
(4.5.1)
V is the NX (p+q) matrix having row vectors v; (i = 1, 2, ... , N). V may be written V = [V, X], or matrix V (NXp) augmented by X (Nxq). I
v=
(V I X)
(4.5.2)
The total sum of products for the p+q variables, is N
Sr= L V;v; i~i
=v!v =
[V, X]'[V, X]
(4.5.3)
Sr has p+q rows and columns and q.n be partitioned into the sums of products for the two parts V and X. I
[ V'V
Sr =
I
V'X ]Prows
C-X'vtx'x-J q rows
p columns q columns I
_ [Sr' 11"J I S/""")]P rows - Ls-;Tr-;;-Ts)·x-:;0j q rows I
(4.5.4)
S/""J is the total sum-of-products matrix for the y-variables alone; Src.rx) is the total sum of products of the -}'-variables; Sr'"xJ, the transpose of Srcxy>,
1.24
Method
contains the sums of cross products of each y-variable and every x-measure. When p or q is unity, the corresponding element in Sr
v: = [y:, 1
x:] N
=-r:('i v; i=·l
= -1 -1'V 1'1
(4.5.5)
1 is an Nx 1 unit vector. The vector of mean deviation scores for observation i is v,-v. and the sum of products of the deviation vectors is Sw =
L; (v;-v.)(v 1-v.)'
= L 1viv;-Nv.v: =Sr-Nv.v:
(4.5.6)
Since both Sr and v. may be obtained through accumulation of the vector observations, matrix Vis not necessary to computation. Sw is the within-group sum of products, and may be considered partitioned as Sr: Y'Y-Ny.y: I Y'X-Ny.x:] prows
-x~v-=N~~;tx'X-_=-N;~x:-
[
Sw =
p columns
- [Su;.·(yy)
q rows
q columns
I Sw(yx)l prows
- -----t----SwCry) i Sw<xx) q rows
(4.5.7)
It can be seen that Sw contains the X' X, Y'Y, and X'Y components necessary to regression analysis, each adjusted to the overall mean. Sw may be reduced to the sample variance-covariance matrix of all the measures. Assuming N independent observations, the degrees of freedom for Sw is N-1. The sample covariance matrix is 1
Vw= N- 1 Sw
(4.5.8)
Vw contains the marginal or overall variances and covariances, ignoring the other measures in the set. Vw may be partitioned like Sr and Sw into Vw
Multiple Regression Analysis: Estimation
125
Vw may be reduced to correlatlonal form by dividing each covariance by the product of the respective standard deviations. Let Dw be the (p+q)-element diagonal matrix of variances from Vw. That is, Dw=diag (Vw) I I
(Zero)
II V
wPP
I I I
----------~--r---------------1
I
Vw P+1.p+1
I
II
(Zero)
- [ Dw \ (Zero)'] prows
-
---- +----1 (Zero) ! Dw<xx> q rows
p columns
(4.5.9)
q columns
Dw 112 is the diagonal matrix of standard deviations of all variables. Dw 112 , and the matrix of reciprocals, Dw -1!2, may be, likewise partitioned. The complete matrix of intercorrelations is
Rw =
b -v2v·· c -1/2 'w w w
(4.5.10)
Rw may also be partitioned like Sw, into Rw, R,}Yxl = [Rw<xYl]', and Rw<xxJ. Sw contains all the informatio("l necessary for the estimation of B and I. From 4.4.3, B is the (q+ 1) xp matrix B= (X'X)- 1X'Y. Substituting portions of Sw,
:B becomes the qxp estimate:
B= [Sw<xxlJ-1 Sw(xy)
(4.5.11a)
I
= [Vw(x.rlJ-1 Vw(xy)
(4.5.11 b)
Equation 4.5.11 b follows from 4.5.1.1 a since the scalar 1/(N-1) multiplies each element of Sw<xy) and divides each element of [Sw<xxl]- 1, having no effect on the product. B has no row for the cons~ant term; a has become identically zero by expressing variables as mean deviations, and is not a necessary term. Comparing the magnitudes of elements of B is hazardous. Not only do the coefficients reflect contribution to' regression, but the values are also direct functions of the units of measurement and of the other independent variables in the equation. The confounding' of scaling effects may be eliminated by standardizing all variables in the equations. Each element of vi-v. is divided by the corresponding sample standard deviation. The vector of standard scores for subject i is · zi = Dw - 112 (v 1-v.) (4.5.12) The covariance matrix of the standardized variables is 1 1 z z' ~ - - D -112S w Dw-112 N- 1 "" ~~ i i - N- 1 w 1
~Rw
(4.5.13)
126
Method
As in Eq. 4.5.1 0, Rw is the sample correlation matrix for all measures. unit diagonal elements and off-diagonal elements,
Rw
has
(4.5.13a) and [ri;] has limits -.1""' [rij] ""'1. The standardized regression coefficients are the qxp matrix of weights for standard-score variables. The standard score covariance matrix is Rw, and substitutes for V"' in 4.5.11 b. The standardized weights are
B=
[ Rw(xxl]- 1 Rw(;ry)
(4.5.14)
Bmay also be obtained directly from B by noting the following: Rw (.rx) = [
Dw(.r.r)J -112Vw(.rxl[ Dw(xxlJ -1/2
and Thus
B=
[Rwlxx)J-1Rw(.ry)
= [ Dw(xxlJ!I2[VwCrx)J-1[ D,}x.r)J 1/2[ Dw(xxlJ-112Vw(xyl[ Dw(yy)J-112 = [Dw(xxlJ112B[Dw(yy)J-112
(4.5.15)
Each standardized weight bJk is the raw coefficient /3;k multiplied by the standard deviation of predictor xi and divided by the standard deviation of criterion Yk· The elements of B are more easily interpreted as reflecting the relative contribution of the predictor variables. However, they are still interdependent; removal or addition of x-variables may affect all the weights, dramatically, in either direction. The relative contribution of predictors to the regression equation is best determined by inspecting the simple and multiple correlations of they- and x-variables, and the partial correlation of yk with x1, holding constant the other x-measures. The information necessary for estimating ! is also contained in Sw. From Eq. 4.4.16 the residual sum of products is SE=V'V-B'X'V =
V'V-V'X(X'X)-'X'V
Using Sw. this is equivalently (4.5.16) The conditional covariance matrix of they-variables, given X, is ' 1 !=N -q- 1 SE (4.5.17) SE and
i
l: are pxp symmetric. The degrees of freedom are N-q-1 = ne. may be standardized to the matrix of partial correlations among the
Multiple Regression Analysis: Estimation
127
criterion measures. Let .i=diag (l) I
DE is the pxp diagonal matrix of sample variances from VE. The partial correlations are
1
:fft=RE
= DE-1/2vEDE -1/2
(4.5.18)
The standard errors of the regression weights may be computed as in Eqs. 4.4.22-4.4.24. [Sw(xx>]- 1 is identical to the qxq lower-right-hand submatrix of G, ignoring the row and coluflln for the constant term. The resulting matrix H isqxp. Thus, for estimation of regression coefficients and covariance matrices, only the all-variable within-group sum-of-products matrix is necessary. For predicted scores and residuals for each subject, the total data matrix Vis needed. Computation of Y and E is straightforward. We must note that in obtaining Sw and B we have expressed all yariables in mean deviation units. Thus to obtain predicted means in the original Y metric, we must add the appropriate constants to XB; that is,
Y~ 1y:+XB-1x::B
(4.5.19)
I
If B is (q+1)Xp, including the constant term, the additional y. and X. terms are unnecessary.
4.6
SAMPLE PROBLEM 1 .!._CREATIVITY AND ACHIEVEMENT
Using the procedures of the preceding sections, let us construct the model matrix and estimate par,ameters for the achievement-creativity-intelligence problem. Sixty subjects have each been measured on two criterion measures, synthesis (y1 ) and evaluation (y2 ). The independent variables are intelligence (xJ); three creativity measures-consequences obvious (x2 ), consequences remote {x3 ), possible jobs {x4 ); and three interactive terms-x5 = X1X2, X6 =X1X3, and X7 =x1x 4. Terms X 1 ,X2 , x 3, and x 4 are standardized prior to forming the cross products. The regression model for ob~ervation ion the first outcome measure is Yil
=
a1 +f3uxil +f3z1Xi2+f3a1Xta+f341xi4 +f3s1Xi1Xi2 +{361Xi1Xia+f371xilxi4 +Eil
For both measures and all observations
Y=XB+E Y is 60X2 and contains the 60 scores on the two random outcome variables; X has 60 rows and 8 columns. Each row corresponds to one subject, with a unit element and his scores on the seyen predictor variables. B is 8X2 and contains
128
Method
the regression weights for the two outcomes:
B=
a1
a2
Constant
f3u
/312
x1
/321
/322
X2
f3s1
/332
Xs
/341
/342
X4
/351
/352
x5
/361
/362
x6
f3n
/372
x1
Y1
Yz
E is 60X 2, with the residual vectors for all 60 subjects. Using the computational forms of Section 4.5, the first column of X and first row of B may be eliminated. Remaining two-element rows of B are (3'; (j = 1, 2, ... , 7) for the seven predictors. From Section 3.4, the vector mean after standardizing and forming cross products, for both criteria and predictors, is v: = [y: I x:] = [2.55 Y1
1.38 \ 0.0
Yz
X1
0.0
0.0
Xz
X3
0.0
.09
.40
X4
X5
x6
.46] X7
The sum of products adjusted to the sample mean is Sw(yy) I Sw (yx)] [
s'" = ~=(~~r~::.:
178.85 110.35
(Symmetric)
1 186.18 \
------------4---------------------------------------65.79 21.29 35.23 40.74 4.56 45.38 31.79
s6.69 I 59.oo 17.66 I s.ss 59.00 40.49 24.29 3.38 33.58 \ 27.47 31.98 20.22 \ 5.56 -3.06 40.17 I 28.6s 18.56 29.77 117.56 1.19
I
59.00 25.13 18.56 38.66 24.42
59.00 1.19 43.35 24.42 22.27 14.98 19.12
104.70 57.79
64.79
Sw has 60-1 =59 degrees of freedom. The variance-covariance matrix is
as given in Section 3.4. The variances are Dw=diag [3.03 Y1
3.16
Yz
I 1.00
1.00
1.00
1.00
.73
1.77
1.10]
x1
The matrix of simple correlations among all nine measures is Rw= Dw - 112VwDw -1/2,
Multiple Regression Analysis: Estimation
129
Estimation of Band its precision requires the inverse sum of products for predictors. The 7x7 inverse matrix is 251 56 296 -33 72 302 G= [Sw<xxl]-l~ 1Q-4x -117 ;-199 -122 379 -84 46 9 0+ -46 -92 -78 50 81 16 -64 9
(Symmetric)
296 -14 -56
245 -183
339
The matrix of raw estimated regression coefficients is
B=
[Sw<xxl]-lSw(xy)
1.00 .86 Intelligence .28 .33 Consequences obvious .16 .32 Consequences remote -.04 -.14 Possiblejobs -.16 .24 Cons. obviousxintelligence -.04 -.16 Cons. remotexintelligence .25 .20 Possible jobs X intelligence
&J.-1)~, ~v
/,s
(;<Tii
01)
The prediction equation for synthesis alone is Ya-2.55 = 1.QQ(xa)+ .28(X;2)+ .16(Xi3)- .Q4(Xi4) -.16(xHX; 2 - .09)-.04(xitxi3- .40)+ .25(xaxi4- .46) All variables are expressed as mean deviations. For raw scores, using Eq. 4.5.19, the constant is
y. 1 -
.2:. x.;{JH = J
2.55+.16(.09)+.04(.40)-.25(.46)=2.47
The prediction equation is YH = 2.47 +1.00(xH)+ .28(xiz)+ .16(X;s)- .04(xi4)-.16(xilxi2)-.04(xilxi3)+ .25(xilxi4)
130
Method
Matrix B may be substituted in the observational equation to provide estimates for both criterion measures. Four of the measures have been standardized in advance of the analysis. Regression coefficients for all nine measures in standardized form may be computed from B. The standard deviations of the nine measures are the square roots of Dw:
[Dw
The matrix of standardized regression coefficients is
B=
[
Dw <xx)] li2Jl[ Dw (yy)] -1/2
.58 .48 Intelligence .16 .19 Consequences obvious .09 .18 Consequences remote -.02 -.08 Possible jobs -.08 .12 Cons. obviousxintelligence -.03 -.12 Cons. remoteXintelligence .15 .12 Possible jobs X intelligence
&;...Qt.?
~~--
&&.
/6'
tl
OQ
Prediction of both achievement measures appears to be dominated by intelligence. Although the magnitudes of the coefficients are difficult to interpret, their signs tend to have greater stability. Both the synthesis and evaluation measures are predicted by a construct that differentiates intelligence and event consequences from the possible-jobs measure. Although we have no overall strength-of-association measure, Rw
= [178.85 110.35
11 0.35] -[81.18 69.41 186.18 69.41 67.22
= [97.67 40.94
40.94] 118.97
J
SE has ne= 60-1-7= 52 degrees of freedom. The conditional covariance
Multiple Regression Analysis: Estimation
131
matrix, VE = i is 1
VE= 52 SE
= [1.88 .79
.79] Yt 2.29 Y2
Yz
Yt
The matrix of adjusted standard deviations is OEt/2
= [1.g7
1.~1]
The partial correlations are RE = DE -1/2VEDE -1/2
= [1.00 .38
.38] Yt 1.00 Y2
Yz
Yt
The unconditional variances from Vw are 3.03 and 3.16, respectively. By comparison to VE, we can see that roughly 40 percent of the variation in the dependent variables has been lost through removing the effects of the independent variables. However, after "holding constant" the predictor measures, the variances of y 1 and y 2 still appear large enough to be noteworthy, and must be attributed to other factors. One of these is undoubtedly common to the two measures, as the partial correlation is strong positive (.38). The estimate of I is VE. The matrix of standard errors of the ~;k may be formed by multiplying as vectors, DE 112 and g, the square roots of1the diagonal elements of G. For the example, these are
d' = [1.37
1.51] .17 .17
g'=[.16
.19
.17 .16
.18]
The standard errors are
H = gd' =
.24 Intelligence .26 Consequences obvious .26 Consequences remote .29 Possible jobs .26 Cons. obviousxintelligence .24 Cons. remotexintelligence .28 Possible jobs xi ntelligence ~t "0 ~/(,(
.22 .24 .24 .27 .24 .21 .25 ~
&u-. "o-
o,
~(/.·
H may be employed in drawing intervals on elements and rows of B. The matrices of estimated scores and residuals, Yand V-,V, respectively, provide useful interpretive information. As an example, subject 1 has predictor
132
Method
scores of
x; =
[1.00
.27
.23
.27 -.28
.06
.07 -.08]
The unit element corresponding to the constant is included. The constant for synthesis is 2.47. For evaluation, Y-2- L. X-;S;2 = 1.38-.05 = 1.33 j
If we include these as the first row of B, then predicted scores are
y; =
x;fi = [2.83
~
1.75]
"0
~ ~
&IS}·
oS'
~.
0?
Observed values for the criteria are
The residuals are
Figure 4.6.1 Predicted scores (y) and residuals (y-y) for synthesis outcome variable, for Sample Problem 1.
Multiple Regression Analysis: Estimation
133
All sixty residuals have been computed for the synthesis criterion variable, and plotted against predicted scores in Figure 4.6.1. Although the distribution of residuals across ordinate values cannot be clearly seen, the decreasing concentration of points as values move away from zero gives an impression of normality. The frequencies of residuals in unit intervals from -3 to +3 are 1, 4, 8, 15, 19, 10, 3, and 0, respectively. With s = 1.37, we should expect about 95 percent of the values to be within 1.96s, or in the range ±2.69. In fact, four of the residuals, or about 7 percent, have absolute value 2.69 or greater. Thus we may rest comfortably with the assumption of normality. The scatter of points in Figure 4.6.1 suggests a uniform variability of residuals across the entire range of predicted scores, in accordance with the assumption about E;. The range of synthesis scores is from zero to seven, with only one individual (subject 137), obtaining the highest score. The individual with the highest predicted score is not that person, however, but instead is s134 , with standard scores on intelligence and the three creativity measures of 2.76; .53, 1.80, and 1.56, respectively. His observed synthesis score is 6, and the predicted score is 6.24. The resulting prediction for s134 is quite accurate. By contrast, s 137 has standard scores on the predictors of 1.21, .97, .27, and 1.56; the lower predicted score of 4.21, and residual of 2.79 can be largely attributed to the lower intelligence score, carrying a high regression weight. The raw intelligence scores for S 134 and s137 are 143 and 120. Both are in a high range. Within this range Smith hypothesized that differences in intelligence alone will not adequately account for divergent achievement. The regression weight derived from a sample representing a broad intelligence range is probably larger than that applicable to individuals in the high-IQ range alone. Three of the remaining residuals have absolute value above 2.69-those for subjects 23, 66, and 149. Spurious criterion scores may provide a partial explanation. For example, the intelligence and three creativity standard scores for s23 are -.34, .38, -.65, and -1.02, respectively. The predicted synthesis score of 2.26 is far below the observed outcome of 5. The evaluation score for the same individual is zero, further supporting a hypothesis of measurement error for synthesis. For s66 , outcomes on both synthesis and evaluation are very low. Scores on the predictors are low, but not sufficiently to predict such extreme divergent achievement results. The finding would suggest an intelligence threshold below which high divergent achievement is not possible. Thus we have a hypothesis for further study.
CHAPTER
li
Multiple Regression Analysis: Tests of Significance The preceding chapter defines the multiple linear regression model and the estimates of unknown parameters. In this chapter we are concerned with testing whether some or all of the parameters are null or are equal to other specified nonzero values. Significance testing proceeds in two stages. The first is partitioning variation (and covariation) in the dependent variables into components for each predictor variable or set of predictors. For example, we may wish to determine the sum of squares (and cross products) in college achievement which can be attributed (a) to high-school grades, (b) to abilities as measured by the Scholastic Aptitude Tests, and (c) to our own measure of motivation. Each effect has a hypothesis sum of squares (and cross products), which is a measure of criterion variation attributable to that predictor or set. The hypothesis measures are determined so that they are independent of one another, and can be summed to give the total criterion variation attributable to all predictor variables. The residual or error sum of squares (and cross products) provides a measure of the extent to which criterion variation is not attributable to any of the predictor variables. The second stage of significance testing involves comparing each of the hypothesis measures to the error measure, with one or more test statistics. If the test statistic exceeds the tabled critical value, it is because of the large magnitude of the hypothesis sum of squares relative to the error sum of squares. That is, criterion variation that can be attributed to the predictor(s) is large relative to the variation that cannot. If the test statistic is small, the predictor variables do not contribute to criterion variation, and can be omitted from the model. It may be necessary, then, to reestimate regression weights for predictors that remain in the model after significance testing. When there is more than a single criterion variable, the hypothesis and error results are matrices of sums of squares and cross products. Multivariate forms of the test statistics are required. Since the multivariate matrices and statistics reduce to the univariate form when p = 1, the general multivariate case is presented first, and then is specialized for exemplary purposes. 134
Multiple Regression Analysis: Tests of Significance
5.1
135
SEPARATING THE SOURCES OF VARIATION Model and Error The multivariate multiple regression model is (5.1.1)
Y=XB+E where Y is the Nxp matrix of observed outcomes X is the Nx(q+1) matrix of predictor values B is the (q+ 1) xp matrix of regression weights E is the Nxp matrix of random errors N is the number of subjects pis the number of criterion (dependent) variables q is the number of predictor (independent) variables
The least-squares estimate of B is the (q+1)Xp matrix
B= (X'X)- 1X'Y
(5.1.2)
We assume that each vector observation y{ (i = 1, 2, ... , N) has the distribution (5.1.3)
xi is the vector of predictor values for observation i. From expression 5.1.3 it follows that the distr,ibution of the kth column of one criterion, yk) is
:8 (the prediction equation for
Pk-.Jim(fJk, O"iG)
(5.1.4)
with G = (X'X)- 1. The unbiased maximum likelihood estimate of I is
1 i = N-q-1
(Y'Y- B'X'XB)
(5.1.5)
Computationally, :8 (without the constant) and i may be derived from the mean-adjusted sum of squares and cross products for all p+q measures. Let
v{ = [yj, xn
(5.1.6)
V= [Y, X]
(5.1.7)
and as in Eq. 4.5.2. Then
1 =-""' N .L..; v';
(5.1.8)
= V'V-Nv.v:
(5.1.9)
V •,
and Sw
Partitioning Sw into Sw (pxp), Sw= [Sw<xu>]' (pxq); and Sw<xx> (qxq), Eqs. 5.1.2 and 5.1.5 become
(5.1.10)
136
Method
and
i
1 [S
Bin expression 5.1.10 is qxp, and does not contain an estimate of the constant a, which is identically zero for mean-deviation scores. · The observational equation is Y=XB+E
(5.1.12)
The least-squares estimation of B yields E such that L;Lk[e;k)2 is minimal. Let us inspect the sum of squares and cross products of the p variates in V. From Eq. 5.1.12, the matrix is Y'Y=(XB+E)'(XB+E) = B'X'XB+B'X'E + E'XB + E'E = B'X'XB+E'E
(5.1.13)
For the final step, note that B'X'E=B'X'(Y-XB) = lJ'X'Y'-B'X'X(X'X)- 1 X'Y =0=E'XB In summary, the pxp sum of products of the sample observations, Y'Y, can be partitioned into additive components, the sum of products due to the model (B'X'XB) and the sum of products of the residuals (E'E). Let us continue to denote the total sum of products Y'Y, as Sr
(5.1.14b)
Sr
Multiple Regression Analysis: Tests of Significance
137
may be applied. Since the scaling constant is often superfluous, mean-adjusted Sw and Sw[Sw<xx>]- 1 S~<xv> (see Eq. 4.5.16) may be substituted for Sr and SR, respectively, for significance tests. Sr is the sum of squares and cross products of N independent vector observations; the total has N degrees offreedom. SR is the sum of products of q+1 linear combinations of the observed values, and has nr= q+1 degrees of freedom (1 for the constant and 1 for each predictor variable). nr is also termed the rank of the rewession model, since it is the rank of model matrix X. The remaining terms in E are N-(q+1) independent linear functions of the observations; the degrees of freedom for error are ne= N-q-1. When the number of criterion measures is one (univariate multiple regression), they, {J, and e equivalents of Eq. 5.1.1 are vectors. The variance of y given the X; variables (5.1.5) is a scalar, fr 2 =
1 (y'y-{J'X'X{J) N-q-1 A
A
The partition of the sum of squares (5.1.14b) is y'y= y'y+E'E
or
L 1 Yl=L 1 Yl+ L 1 el These are exactly the values of the diagonal elements of Sr, SR, and SE in Eq. 5.1.14a, for each criterion measure.
Subsets of Predictor Variables Tests of significance concerning all predictors simultaneously may be made directly from SR and SE in Eq. 5.1.14a. In many instances, however, theresearcher is concerned with not one but a sequence of hypotheses about the relationship of subsets of predictors to the criteria. Each hypothesis (except the first) involves the contribution of one or more predictors to regression, given that prior sets have already been entered into the regression equations. Independent tests for predictors are not directly obtainable from rows of B; in general, the coefficients for pairs of predictors are correlated. This is evidenced by the covariance factor G in expression 5.1.4, which is rarely diagonal. A series of independent tests is facilitated by transforming the predictor variables to a new set of uncorrelated measures, in a specified order. We shall substitute for predictor x; in X only the linear function or portion of Xj that is uncorrelated with preceding predictors x 1 , Xz, ... , X;-1· That is, we shall find the X; values that are obtained if we "partial out" or "hold constant" the effects of earlier predictors in the set. Using the transformed variables, we may reestimate the regression coefficients; these will reflect the extent to which each predictor X; contributes to regression, above and beyond the effects of x 1 through X;- 1 • The transformation may be accomplished through the Gram-Schmidt orthonormalization of X in Eq. 5.1.1. The Gram-Schmidt process will create columns of X that at each stage are orthogonal to (and uncorrelated with) all preceding columns. Formally, the operation is one of factoring X into the
138
Method
product of a columnwise orthonormal matrix X* (with the new set of uncorrelated predictors) and a triangular matrix T' (relating X to X*). That is, X=X*T'
(5.1.15)
X and X* are Nx(q+1); T' is (q+1)-square with zeros below the main diagonal. Equation 5.1.15 also satisfies (X*)'X*=I and X(T- 1)'=X*. Substituting in the regression model, Y = XB+E becomes Y=X*T'B+E
(5.1.16)
T'B is the (q+1)xp matrix of regression weights for X having orthogonal columns, or for predictors that are uncorrelated in the specified order. The least-squares estimate of T'B is obtained exactly as was B. by minimizing tr ("E'E). The estimate is
'fB =
[(X*)'X*]- 1(X*)'Y
=
I(X*)'Y
=
(X*)'Y
=U
(5.1.17)
U is the (q+1)xp matrix of orthogonal estimates or semipartial regression coefficients. Like B, each column of U contains exactly the regression coefficients for one criterion variable; that is, the q+1 elements of any one column form one univariate multiple regression equation. Unlike B, each row of U contains the regression weights for the corresponding predictor variable, eliminating all preceding predictors. The values of the conditional predictor variables themselves are the columns of X*. The conditions for estimating U are identical to those forB; namely, X must be of rank q+1, so [X'X[ # 0. This requires that N~q+1 and that no column of X is exactly a linear combination of other columns. The estimation of the effect of one predictor eliminating those preceding, is termed stepwise elimination. Since X is a matrix of fixed constants, so are X* and T'. Then the estimate U = fB is a simple transformation of B; that is, U = T'B. The expectation of U is
ff(U) =ff(T'B) =T'B
(5.1.18)
The variance-covariance matrix of U is 'JV(U) =r(T'B) =T'~'"(B)T =
T'(G®:t)T
=
T'(X'X)- 1T®:i
=
l®:t
(5.1.19)
Eq. 5.1.19 follows since X' X= T(X*)'X*T' = TT' and (X'X}- 1 = (T- 1)'T- 1 . U is an unbiased estimate of the population orthogonal coefficients T'B. More important, rows of U (the coefficients for multiple predictor variables) are
Multiple Regression Analysis: Tests of Significance
139
uncorrelated. From Eq. 5.1.19, the variance-covariance matrix of column k of U, for predicting yk from the orthogonal predictors, is (5.1.20)
Each coefficient has variance ak 2 ; each pair of coefficients has zero covariance. The regression weights for orthogonal predictor vectors, unlike those for the original X; measures, are independent. Let us obtain regression sums of products for the first predictor only, for the second predictor eliminating (holding constant) the first, for the third eliminating the first two, and so on. Let u'; be the jth row of the orthogonal regression coefficients U. u'; relates the p criteria to one predictor, eliminating those preceding. That is,
U=
u'1
Constant
u'2
x 1 , eliminating constant
u'3
x 2 , eliminating constant and X 1
u'q+ 1
Xq,
eliminating constant,x1.
(5.1.21)
x2 •... , Xq-1
p columns The squares and products for regression for the first predictor alone, is the pxp matrix The squares and products for just the second predictor, eliminating the first, is
The last predictor variable has squares and products
This term measures the contribution to regression of just the final predictor variable, removing the effects of the constant and X 1 through Xq- 1. Each matrix U;U'; has a single degree of freedom. The sum of the independent matrices u;u'; is exactly the overall regression sum of squares and cross products, as in Eq. 5.1.14; that is, L;U;U'; =
=
U'U
fi'TT'B
=B'X'XB
=SR
(5.1.22)
SR has one degree of freedom for each u;u';; that is, q+1. Predictable variation has not been increased or decreased by transforming to orthogonal predictors, but has been redistributed among the x;'s in a particular order. The order of predictors is fixed by the researcher, prior to the stepwise, elimination and
140
Method
the computing of squares and products. The error sum of cross products is (5.1.23)
SE.= Y'V- U'U
with ne = N-q-1 degrees of freedom. The complete partition of sums of squares and cross products is given in Table 5.1.1. Assessing the expected squares and cross products is facilitated by considering the triangular factor in U = T'B to be partitioned by rows. Lett'; be the jth row ofT'. That is,
T' =
tu
t12
0 [ 0
tzz
t1a tza
··· ···
tz,q+l
tt,q+l
.. .
0
t33
· · ·
ta,q+l
0
0
0
...
tq+!,q+l
(5.1.24)
.. .
t' q+l
Then rows of U are linear functions of rows of B, and may be diagrammed as: U=T'B
[
=
l
[u'U'zt t' .~ 1 B = u' ~+t
1 t' 2 B
]
(5.1.25)
u;u';
Each simple matrix is equivalently B't;t';B, with expectation B't;t';B+:t, over repeated samplings. The orthogonal regression weights yield squares and product matrices that are additive; they can be summed to provide tests of the joint contribution of two or more predictors. In general, we shall let qh (1 ~qh~q+1) represent the num-
Table 5.1.1
Partition of Sums of Products for Multivariate Regression Analysis
Source of Variation
Degrees of Freedom
Sum ofSquares and Cross Products
Expected Sum of Products
B't1t' 1B +:I B't 2t' 2B+:I B't3 t' 3 B +:I
Constant x 1 eliminating constant Xz eliminating constant and x1
x. eliminating constant and x,,
Xz, ... ,
All regression Residual
Total
x._,
B'X'XB+(q+ 1):I
q+1
ne = N-q-1
SE=Y'Y-U'U
N
Sr= Y'Y
(N-q-1):I
I I
Multiple ~egression Anafysis: Tests of Significance
141
I !
.
ber of rows of U (number of predictor variables) being combined for one statistical test. The sum of the squares and cross products is the sum of products for hypothesis, SH. qh represents the degrees of freedom for the hypothesis matrix. In Table 5.1.1, each SH is simply one 'u;u';; all corresponding qh values are unity. At times the predictors form logical groupings, such as items or subtests of the predictor variables, or several ~easures of the same construct. To test for the joint contribution of x,, x 2 , and Xa in Table 5.1.1, we can obtain hypothesis matrix SH = U2 U' 2 +u 3 u' 3 +u 4 u' 4 , with qh = 3 degrees of freedom. The expected value is i 4
&f(SH)::: ~ B't;t';B+3I j=2
I
Only adjacent U;U'; may be combined in this manner. Sample Problem 1 (creal tivity and intelligence) utilizes such groupings to test for the joint effects of three creativity measures, and then ~or three interaction terms. The p diagonal elements of each SHare the squares or sums of squares tor individual criterion variables, attributable to the qh predic'tors. That is, they are exactly the univariate regression sums of squares tor each y-measure. Computationally, some steps may be bypassed. The orthonormalization of a large X-matrix to estimate U is a formidable task. Instead we may utiliz~ the tact that T is also the triangular Cholesky factor of X'X. Further, let Sw be the (p+q)-squ~re sum of products of me~n deviations as in Eq. 5.1.9. Then Sw<.r.r> is the qxq sum of products of meari deviations for the predictors. The qxq Cholesky factor without the constant'term may be obtained from Sw<xx>. That is,
Sw<~x> = TT'
(5.1.26)
[Sw(xx>]i- 1 = (T-1)'T-I
(5.1.27)
and I
The qxp matrix of orthogonal regression coefficients, with no constant term, is
U=T'BI =
T' [SwCxx>]-ISw(xy)
=T-'Sw<xY>
(5.1.28)
j
Even for orthogonal estimation, Sw contains all the necessary summary data. The regression sum of products, ~xcludingthe constant, is
SR=U'U
j
= Sw~Sw<xx>] -ISw<xu> and SE=
Sw(YY>-s~[Sw<xx>]-ISw<xv>
(5.1.29) (5.1.30)
as in Eq. 5.1.11. The "total" sum of Pj'Oducts without the constant is Sw<w> with N-1 degrees of freedom. ' The diagonal elements of SR and SE are the univariate sums of squares tor regression and error, respectively, tbr each of the p criterion variables. Ottdiagonal elements are the sums of ~ross products, which are useful in multivariate tests and confidence limits. 1
142
Method
When p = 1, u is a vector, and each u3u'3 is the scalar u/. The sum of squares for a set of predictors is simply the sum of two or more scalars l:3u/. These terms are exactly the diagonal elements of Sn when there are multiple criterion measures; their sum is a scalar SR in Eq. 5.1.22.
Order of Predictors The orthogonal regression weights U are a patterned function of the intercorrelated weights B. For simplicity, let us assume a univariate model, with fj' = [a {3 1 {32 .. • {3q]. Then u in Eq. 5.1.25 estimates T' fj, with tua+t1zf3t+ · · · +ttqf3q-t +t1,q+1{3q
Constant (ut)
0 +t22 {3 1+ · · · +tzqf3q-t+t2,q+1{3q
X1 eliminating constant (uz)
0 + 0 + · · · +tqqf3q-t+tq,q+1{3q
Xq- 1 eliminating all above (uq}
T'fj=
0 + 0 + .. ·+0
+tq+t,q+tf3q Xqeliminating all above (uq+l)
(5.1.31)
With p > 1, each column of U is the same function of the corresponding column of B. Inspection of T' fj reveals that the first element is a weighted combination of all elements of fj. The second element involves all of fj except a. The third element involves all of fj except a and {3 1, and so on. The last element of the product is a scalar multiple of only the last element of fj. It is the triangular nature of T' that produces this pattern and has also produced the orthogonal X* from X. Interchanging the order of predictor variables will alter both the corresponding U vectors and the hypothesis sums of products. The last row of U contains the only terms that are not composites of two or more regression weights. That is, Uq+t reflects the effect of Xq, eliminating (and unbiased by) any other predictors. Only if we decide (through significance tests) that {3q is zero will all other rows of U be unconfounded with Xq effects. Thus, variables that make doubtful contribution to regression or those reflecting complex antecedents (for example, interactions) should be placed last in the order of ,elimination. If they do not add to predicting the criterion measures above and beyond simpler or better-known predictors, they may be quickly deleted from the model. The earlier terms may then be tested without confounding from these predictors. Example
In the previous chapter we estimated regression weights for the data of Table 3.3.1. Yt and Y2 are criterion measures; Xto X2 , and x3 are predictors. The raw coefficients are
B= [
-5.613 -20.353] Constant .047 Xt .086 .008 .145 X2 -.017 .126 Xa
Yt
Yz
Multiple Regression Analysis: Tests of Significance
143
The total sums of products for predictors, criteria, and the cross products of the two are
X'X=
~ 2~;:~g
(Symmetric)] Constant 107,009.00 Xt 1733.00 145,976.00 200,363.00 x2 255.30 29,503.90 4383.13 x2 21,585.00 Constant x't Xa .
Y'Y=[85.38 92.21
92.21y1 105.31 Yz
Yz
Yt
Y'X = [34.00 ,2918. 70 3934.10 37.70 3225.10 4380.20 Constant Xt x2
585.70yt 651.79 Y2 Xs
Y'Y is Sr
S =B'X'XB=[81:?9 R
89.83
.
89.83]y,
101.98 Y2
Yz
Yt The error sum of products is
s
= E
s
(yy)
T
-s
= [3.59
R
2.38
Yt
2.38] y, 3.33 Y2
Yz
Sr(yy) has 15 degrees of freedom, SR has 4 degrees of freedom, SE has ne = 11 degrees of freedom. As in Qhapter 4, I may be estimated by SEine, with error variances3.59/11 and 3.33/11, respectively. To obtain sums of products for the constant and for each predictor alone, we transform to orthogonal predictors. Let us maintain the order of predictors x 1 through x3 to test whether x3 contributes to criterion varia· tion above and beyond x, and X2 • The full orthonormalization is X= X*T' where T' is the Cholesky factor of X'X. Rather than obtain the orthogonal predictors themselves (X*) let us obtain the regression weights for them (U). For this we only require the triangular factor T'. Factoring X'X, we obtain
T'= [
3.87 ;326.07 25.77 (Zero)
447.27 65.89] 2.22 3.44 11.78 .05 5.10
144
Method
and U=
T'B =
8.779 9.734~ Constant [ 2.169 1.969 x1 eliminating constant .097 1.715 x2 eliminating constant and x, -.088 .641 x3 eliminating constant, x 1 and x2 Yt Yz
It is easily verified that U' U =SR. The squares and cross products for the constant are formed from the first row alone. We probably do not wish to test hypotheses about a.
u,u',
=
[~:;~~] [8.779
9.734)
77.07 85.45] y,
= [ 85.45 94.75 Yz Yt
Yz
To test for the joint contribution of x 1 and the second and third rows: SH1 = UzU' 2
=
X2
to criterion variation, we use
+ U3U' 3
[4.4.4471 4.44] Y1 6.82 Yz Y1
Yz
For X3, removing the effects of the constant,
= [
x1,
and X 2 ,
.01 -.06]y, -.06 .41 Yz Yt Yz
The subscript 1 or 2 is added to SH so that matrices for separate hypotheses may be distinguished. SH1 has 2 degrees of freedom, and SH2 has 1 degree of freedom. The sum of the SHj matrices plus U 1 u' 1 is SR. We now have all necessary results to describe the partition of variation in a table like 5.1.1. In testing the significance of the predictor variables, each of the S 11j is compared with SE. The diagonal elements of S 11j are the sums of squares of the separate criteria that can be attributed to the particular predictor variables. These are the results that would be obtained for the individual dependent measures. Univariate test criteria apply only to the diagonal elements of S 11j and SE; multivariate criteria apply to the entire matrices. Had Sw been used in place of S/w>, the constant term would not appear in B or U. U1 U' 1 would not be computed. SR is then Su1 +SJ12 , with 3 degrees of freedom.
Multiple Regression Analysis: Tests of Significance
145
5.2 TEST CRITERIA Statistical test criteria may be applied to hypothesis and error matrices to determine whether the corresponding predictor variables accountfor significant variation in the criterion measures. It is through test criteria and their associated distributions that we judge whe'ther the elements of the hypothesis matrix represent only random effects in the data or also represent fixed population effects that are nonzero. Multivariate test statistics provide a single hypothesis test for more than one criterion measure. (For example, does this predictor variable contribute to variation in theset of outcomes?) In particular, we shall employ a general likelihood ratio statistic through which a variety of multivariate hypotheses may be tested. A special case of the likelihood ratio is Hotelling's P statistic, appropriate when the number of predictors for the hypothesis is unity. When the number of criteria i,s one, the general forms reduce to the usual univariate Ftests for multiple regression. Also multiple univariate results may be obtained by considering only the diagonal elements of the sum-of-products matrices. The univariate results are primarily of use in identifying the criteria least and most affected by the independent variable(s). Step-down analysis provides tests of the hypothesis of' no effect when there is a particular specified order of criteria (for example, by ti'me or complexity). When no logical ordering can be supported, step-down results are of dubitable value. Other tests of significance, associated with the simple, multiple, and canonical correlations of the predictors and criteria, are ptesented in the next chapter. In most cases these are shown to be identical tq the corresponding test statistics presented here.
Hypotheses The general hypothesis we shall consider is that the last qh row(s) of Bare null. This is the hypothesis that the final qh x-variables do not contribute to criterion variation, above and beydnd earlier measures. qh may be 1 or greater, up to all q predictors. If we have q predictor variabl!ils and q+1 rows in B, then B may be considered partitioned into two parts. The first (B 0 ) contains weights for the predictors not being tested. The second (Bh) contains the weights for the qh predictors whose contribution to regression is being tested. That is, I
B = [~~] q+1-qh rows [Constant and predictors not tested] (5.2.1) Bh qh rows [Predictors being tested] pcolumns The null hypothesis is (5.2.2) where Bh and 0 are both qhxp matrices. The alternate hypothesis is that at least one element of Bh is not zero. Hypothesis 5.2.2 asserts that the regression coefficients for the final qh
146
Method
predictors are null; that eliminating or holding constant the preceding independent variables, Xq+l-qh through Xq do not contribute to variation in the set of criteria; and that there is no significant relationship of these predictors to the p dependent variables, above and beyond earlier predictors. Corresponding to Bh there are qh rows of the orthogonal coefficients u'j· If we are testing the contribution of a single variable to regression, qh is unity and there is one associated row of U (the final row). If we wish to test for the joint contribution of multiple predictors, then qh > 1 and there are multiple rows of B and U. The sum of squares and cross products of the one or more rows of U form the matrix of hypothesis sums of squares and cross products (SH), as in Section 5.1. We may also obtain SH by partitioning U exactly as we did B,
U= [~~] q+1-qh rows
(5.2.3)
Uh qh rows p columns
Then SH = u;, Uh, with qh degrees of freedom. The test criteria we shall consider are the likelihood ratio criterion for a single decision regarding 5.2.2 and the univariate F ratios for testing H0 one column at a time. Also, step-down test statistics provide multiple tests of H0 for the first dependent variable, for the second dependent variable eliminating the first, and so on until the final dependent variable yp, eliminating all others. All of these criteria yield the usual univariate F ratio for regression when p = 1, regardless of whether we are testing the effect of one or more predictors. In most research employing regression methods, we are concerned with multiple hypotheses of the form of Eq. 5.2.2. For example we may wish to test whether the motivation variable adds to our knowledge of college achievement, above and beyond ability and past achievement. Upon completing the test of this hypothesis we may wish to test whether the other two predictor variables are also significantly related to achievement. The conditions for conducting this test are determined by whether or not the first hypothesis is supported. in general the testing procedures of this section may be applied more than once, for different subsets of predictor variables. Each may be tested eliminating all other significant predictors. We shall discuss testing a single hypothesis first, and then consider the effects of variable order and multiple significance tests.
Likelihood Ratio Criterion The likelihood ratio criterion (Wilks, 1932) is a general statistic that may be used to test Eq. 5.2.2 with any values of qh and p. The null and alternate hypotheses are regarded as representing two models. Let ~ be the predictor values for observation i. The vector may be partitioned like Bin Eq. 5.2.1. That is,
x;
=
[1
=[
xil · · ·
X;,q-qh
l
xi,q+l-qh
~o
x;h
First q+1-qh predictors
Final qh predictors
(5.2.4)
Multiple Regression Analysis: Tests of Significance
147
If the null hypothesis is true, then the correct model for the data is (5.2.5) The terms of B" are null; x;"Bh reflects only random variation and may be subsumed in e;o. If the alternate hypothesis is true, a better prediction model is Eq. 5.1.1, or
Yi = x{B+e;
(5.2.6)
The outcome is expressed as a weighted sum of all q+1 terms. The model given in Eq. 5.2.6 will yield better predictions of Yi than 5.2.5, and the variance of the e;'s will be notably smaller than the variance of the Ew's. To the extent that the variances of Ei are smaller than those of Ew in a given sample, we may say that Eq. 5.2.6 is a more likely representation of the population relationship of y and x. If, after fitting both models to sample data, the alternate model does not appear to be a more likely representation, the principle of scientific parsimony dictates that the simpler model (Eq. 5.2.5) is maintained, and H0 is supported. The likelihood ratio criterion is the ratio of a measure of the likelihood of Eq. 5.2.5, compared with the likelihood for Eq. 5.2.6. The smaller the ratio in the sample, the more inclined we shall be to reject H 0 and to conclude that the additional predictor(s) are necessary to the model. The likelihood measure for each model is obtained by evaluating the joint density function (Eq. 3.2.1) of the p variates, for the sample outcomes Yi (i = 1, 2, ... , N). The resulting index is proportional to the probability that these sample values would be observed, if the population model is correct. Thus we may evaluate Eq. 3.2.1 assuming 5.2.5 and again assuming 5.2.6, and compare the results. Anderson (1958) has derived the likelihood ratio in detail for this and other hypotheses. The results are both mathematically and intuitively meaningful. Let L 0 represent the likelihood of the data under the null hypothesis, or model 5.2.5, in which B" = 0 and the additional predictors are excluded. If this model is correct then the maximum likelihood estimate of the variance-covariance matrix of errors (Eio) is A
1
A
A
!o = N(Y- XoBo) '(Y-X0B0 ) =_!._ (Y'Y-B~XbXoBo)
(5.2.7)
N
X 0 contains the leading q+1-q, columns of X, with row vectors xio. Similarly, let L represent the value of the likelihood of the data under the
alternate hypothesis, or model 5.2.6 with the additional predictors included. If this model is correct then the estimate of the variance-covariance matrix of errors (ei) is
i=_!._ (Y-XB)'(Y-XB) N
=..!.. (Y'Y-B'X'XB) N
(5.2.8)
148
Method
The likelihood ratio criterion A is a function of the ratio L0 /L. After evaluation and simplification, A is (5.2.9) The range of A is from 0 to 1. If A is sufficiently small in the sample, we reject H0 anq ~onclude that the final predictor(s) contribute to criterion variation. If A is close to unity, the alternate model is not noticeably more likely. We maintain H0 and delete the final predictors from the model. If we think of the determinant of the covariance matrix asap-dimensional or generalized variance of the residuals, we can see how Eq. 5.2.9 corresponds to comparing the fit of the two models. If the alternate model holds, then the additional predictors do account for some criterion variation. We would expect Iii to be noticeably smaller than liol. as additional sources of variation have been eliminated from E. If the null hypothesis is true, the residual variances Iii and liol estimate the same random variation and will be similar in value; A will be close to unity. With appropriate distributional assumptions, we may choose critical values of A at which to reject H0 • Because of the complexity of the likelihood ratio, complete tables of the distribution of A have not been constructed or widely disseminated. Several functions of A have been developed, however, that have probability distributions very close to the well-known x_ 2 (chi-square) and F distributions. We may test H0 employing one of the transformations of A as a test statistic. Bartlett (1938) has shown that the statistic x_2 =-mlog.A
(5.2.10)
has approximately a x_2 distribution with qhp degrees of freedom. The multiplier is m = n.- (p + 1 - q h )/2 (5.2.11) where qh is the number of predictor variables being tested and n. is the degrees of freedom fori; that is, n.= N -q-1. H0 is rejected with confidence 1-a if x_ 2 exceeds the 1OOa upper percentage point of the x_ 2 distribution with qhp degrees of freedom. Schatzoff {1966) has shown that the x_2 approximation is reasonably accurate as long as the number of observations exceeds the total number of variables by 30 or more, even in large models (say, pqh = 70). Rao (1952) has given a more accurate approximation for the distribution of A. The test statistic is F = _1-_A_1_1• • _m_s_+_1_-___,q-':!hp_l_2 Alfs qhp
(5.2.12)
where (5.2.13) and m is defined by Eq. 5.2.11. In Eqs. 5.2.10 and 5.2.12 small values of A result in large test statistics. The F statistic has approximately an F distribution with
J
Multiple Regression Analysis: Tests of Significance
149
qhp and ms + 1 '- qhp/2 degrees of freedom. H0 is rejected with confidence 1 -a ifF exceeds the 1OOa percent critical value of the corresponding F distribution. Note that F is undefined when pqh = 2, since s vanishes. However, settings to unity provides an appropriate test statistic in either the univariate (p = 1, qh = 2) or bivariate (p = 2, qh = 1) case. The F approximation is generally more accurate than the x2 approximation, although with large N they will yield the same results. When either p or qh has values of 1 or 2, the F statistic has exactly the corresponding F distribution. With other values the inaccuracy of the F statistic is of the order of m-4, which is less than .01 form-values as low as 4. It is possible for the denominator degrees of freedom to be fractional. Rounding to the next lower integer provides a conservative test if computing routines for fractional values are not available. If H0 is rejected, inspection of the univariate F ratios (see the following section) may identify the particular variates affected by the predictors. Computationally, A can be obtained from the error sum of products SE and the hypotheses matrix SH = u;,uh. Note that
A=
1~1
IIo I INil INiol
(5.2.14)
From Eq. 5.2.8,
NI=Y'Y-B'X'XB =5E
(5.2.15)
From Eq. 5.2.7,
Nio =
Y' Y- B,; Xh X0B0
= (Y'Y-B'X'XB) + (B'X'XB -B~XhX0 B 0 ) = SE+(U'U-U~U 0 )
=5E+5H
(5.2.16)
Then (5.2.17) This form may be applied directly to any of the matrices obtained in partitioning criterion variation. When testing the hypothesis that the entire matrix B is null, SH= U'U= SR is the entire sum of products for regression. The condition for conducting the test of significance on A is that ISEI be greater than zero; that is, SE must be of full rank. This requires that N-q-1 be at least p or that there be more subjects than the total number of variables, p+q. Also, no criterion variable can be exactly a linear combination of other measures (for example, subtests and total test score, or percentages that sum to 100 for every subject will violate this requisite). When the condition is not met, one or more criteria must be deleted or tested separately; all of the information for the test is contained in the linearly independent subset of criterion variates.
150
Method
Example Using the data of Table 3.3.1, let us test for the contribution of Xa to the prediction of both y1 and y2 • The hypothesis is Ho: {J' 4 = 0'. where {J' 4 is the final row of B. The hypothesis and error matrices from Section 5.1 are
s Hz= [-.06 .01
-.06] .41
s. =
2.38] 3.33
and E
[3.59 2.38
respectively. The degrees of freedom for the two matrices are
q,, = 1 and
ne = 11. The likelihood ratio is AISEI -jSE+SH2 1 = 6.30= 78
8.07
.
The multipliers are
m = 11 -(2+ 1 -1)/2 = 10 and
s = 1 (by substitution) The F transformation is
F= 1-.78. 10(1)+1-(1)(2)/2 .78 (1 )(2) 10 =.28·2=1.41 with 2 and 10 degrees of freedom. The critical Fvalue is not exceeded and Ho is accepted. X:; does not contribute to criterion variation, above and beyond x 1 and X2.
Hotelling's T2 The P statistic (Hotel ling, 1931) is a special instance of the likelihood ratio criterion when the degrees of freedom for hypothesis are unity. When there is a single predictor variable (and multiple criteria) being tested, the general likelihood ratio of the preceding section may be employed; the results will be identical to those from P. Several other hypotheses having one degree of freedom are more conveniently tested through the P statistic.
Hypothesis 1: As a special case of the likelihood ratio criterion, we may use P to test that a row of regression weights for predicting multiple criteria from
Multiple Regression Analysis: Tests of Significance
151
one predictor is equal to a vector of constants. The hypothesis is
Ho: /3'
=
/36
(5.2.18)
/3~ is any 1 xp vector of constants, including the null vector as it is in Eq. 5.2.2. The regression weights may comprise one row of B in Eq. 5.1.2 or may be weights for the only predictor, when q = 1. If the constants are other than zero, the P statistic is usually more convenient. The alternate hypothesis is that the two vectors are not the same, for all elements. To construct the P statistic, let the corresponding row estimate be fj'. The expectation of fj' is f3' and the variance-covariance matrix is
𝒱(β̂') = (x'x)^-1 Σ    (5.2.19)

x is the vector of all N scores on the corresponding x predictor variable. The estimate of Σ is given by Eq. 5.1.5. If observations yᵢ are distributed in p-variate normal fashion, then so is β̂'. Consider a transformation of the elements of β̂'. Let us factor Σ into the product of the Cholesky triangular factor T and its transpose; that is,

Σ = TT'    (5.2.20)

and

Σ^-1 = (T^-1)'T^-1    (5.2.21)

Let us transform β̂' to γ' by

γ' = √(x'x) (β̂-β₀)'(T^-1)'    (5.2.22)

The distribution of γ' is p-variate normal, being a linear function of the normally distributed elements [β̂ₖ]. The expectation of γ' is

ℰ(γ) = √(x'x) T^-1(β-β₀)    (5.2.23)

The variance-covariance matrix is

𝒱(γ) = x'x 𝒱[T^-1(β̂-β₀)]
     = x'x [T^-1 𝒱(β̂)(T^-1)']
     = x'x [T^-1 (x'x)^-1 Σ (T^-1)']
     = I    (5.2.24)

I is a p×p identity matrix. The elements of γ' are independent and normal, each with unit variance. Under the null hypothesis that β = β₀, the expectation of γ is zero. Each [γₖ] is an independent standard normal random variable. The product γ'γ is the sum of squares of p such variates, and follows a χ² distribution with p degrees of freedom. That is,

γ'γ = (x'x)[β̂-β₀]'(T^-1)'T^-1[β̂-β₀]
    = (x'x)[β̂-β₀]'Σ^-1[β̂-β₀] ~ χ²_p    (5.2.25)

As β̂ departs from β₀, the χ² value will increase, and we shall be more inclined to
reject H₀. If H₀ is true, β̂ will generally be closer to β₀ and the χ² value will be small. When Σ is estimated from a sample, the test statistic is Hotelling's T²,

T² = (x'x)[β̂-β₀]'Σ̂^-1[β̂-β₀]    (5.2.26)
Tables of T² are not widely distributed, although a partial set has been published by Jensen and Howe (1968). However, a simple transformation of T² can be compared directly to tables of the F distribution. The transformation is

F = (n_e-p+1)T² / (p·n_e)    (5.2.27)

Expression 5.2.27 exactly follows an F distribution with p and n_e-p+1 degrees of freedom, where n_e is the degrees of freedom for the error matrix Σ̂. In the regression model n_e = N-q-1. Thus we reject H₀ with confidence 1-α, if F exceeds the critical F value, or if

T² ≥ [p(N-q-1)/(N-p-q)] F_{p,N-p-q,α}

F_{p,N-p-q,α} is the 100α upper percentage point of the F distribution with p and N-p-q degrees of freedom. Let us consider several special cases:
1. If T² is being used to test the nullity of regression weights for one predictor out of several, a row of the orthogonal estimates U replaces β̂' in Eq. 5.2.26. The vector of regression weights is the jth row of U (u'ⱼ), and the x-values are the corresponding vector (x*ⱼ) of orthonormal predictors from X* in Eq. 5.1.15. Then

T² = (x*ⱼ)'x*ⱼ [uⱼ-0]'Σ̂^-1[uⱼ-0]
   = u'ⱼΣ̂^-1uⱼ    (5.2.28)
since (X*)'X* = I. For the same hypothesis, the likelihood ratio criterion is

Λ = |S_E| / |S_E + uⱼu'ⱼ|

with S_H = uⱼu'ⱼ. Simplifying,

Λ = |S_E| / (|S_E| · |I + S_E^-1 uⱼu'ⱼ|)

Let S_E^-1 uⱼ = a and u'ⱼ = b'. Applying the rule of determinants from Section 2.3,

Λ = 1 / (1 + u'ⱼ S_E^-1 uⱼ)
Since Σ̂ = (1/n_e)S_E, it follows that S_E^-1 = (1/n_e)Σ̂^-1, and

Λ = 1 / (1 + u'ⱼΣ̂^-1uⱼ/n_e)    (5.2.29)

Then

T² = n_e · (1-Λ)/Λ    (5.2.29a)
That is, when q_h = 1, the likelihood ratio criterion and T² statistic for β₀ = 0 are monotonically related. The F transformation of Λ is the same as the exact F statistic for T² by Eq. 5.2.27. If H₀ is rejected, we conclude that predictor xⱼ is significantly related to the p criterion scores.

2. When the regression model is a trivial one of the form

Y = 1α' + E    (5.2.30)
T² provides a test that the population mean vector (for p variates) is equal to a vector of constants; that is,

H₀: α = α₀    (5.2.31a)

or

H₀: μ = α₀    (5.2.31b)

Under Eq. 5.2.30, α̂ is the sample mean vector for the y-measures. That is,

α̂' = (1'1)^-1 1'Y = ȳ'.    (5.2.32)

1 is an N-element unit vector. Σ̂ is the simple variance-covariance matrix

Σ̂ = [1/(N-1)] Σᵢ (yᵢ-ȳ.)(yᵢ-ȳ.)'    (5.2.33)

Substituting in Eq. 5.2.26,

T² = N[ȳ.-α₀]'Σ̂^-1[ȳ.-α₀]    (5.2.34)

T² may be converted to F by Eq. 5.2.27, with n_e = N-1. H₀ is rejected if F exceeds the critical value of the F distribution with p and N-p degrees of freedom. T² is the multivariate form of the ordinary t test that a population mean μ is equal to a specified constant.
Example The sample means of y₁ and y₂ of Table 3.3.1 are

ȳ. = [2.27]
     [2.51]

for required and elective courses, respectively. The variance-covariance matrix is

Σ̂ = [.59  .48]
    [.48  .75]

The sample size is 15. The university mean freshman grade point average for all courses is 2.40. To test whether this vector score deviates from the overall mean, we test

H₀: [μ₁] = [2.40]
    [μ₂]   [2.40]

against the alternate that the two are not equal. Hotelling's T² is

T² = 15 [(2.27 2.51)-(2.40 2.40)] [.59  .48]^-1 [2.27-2.40]
                                  [.48  .75]    [2.51-2.40]

   = 15 [-.13  .11] [ 3.51  -2.25] [-.13]
                    [-2.25   2.76] [ .11]

   = 2.49

The F transformation is

F = (14-2+1)(2.49) / (2·14) = 1.15

The .05 critical F value with 2 and 13 degrees of freedom is 3.81. H₀ is maintained; students do not score significantly above or below average in courses classified by requirement.
3. When there is only a single criterion measure, T² is the square of the simple t test on one regression coefficient. When p = 1, Σ̂ in Eq. 5.2.26 is the variance of the single measure, σ̂²; β̂ is a single estimate and β₀ is a scalar. Then

T² = (β̂-β₀)² / (σ̂²/Σᵢxᵢ²)    (5.2.35)

The variance of a single weight β̂ in Chapter 4 is identically the denominator of 5.2.35 (σ̂_β²). Further, when p is unity in 5.2.27, T² is also the F value. The univariate t statistic is

t = (β̂-β₀)/σ̂_β    (5.2.36)
t follows a t distribution with n_e degrees of freedom; t² is distributed as F with 1 and n_e degrees of freedom. H₀ is rejected if either statistic exceeds the respective critical value.

Hypothesis 2: The T² statistic can be employed to test for values of any linear combination of regression coefficients. Let B be the (q+1)×p matrix of regression coefficients, and v a (q+1)-element vector of weights. The null hypothesis is

H₀: v'B = (v'B)₀    (5.2.37)

The alternate hypothesis is that the two products are not the same. v'B is a 1×p vector, and v'B̂ the corresponding vector estimate. The expectation of the vector of linear combinations is

ℰ(v'B̂) = v'B    (5.2.38)

The p×p variance-covariance matrix is

𝒱(v'B̂) = [v'(X'X)^-1v]Σ    (5.2.39)

Substituting in Eq. 5.2.26,

T² = [1/(v'(X'X)^-1v)] [v'B̂-(v'B)₀] Σ̂^-1 [v'B̂-(v'B)₀]'    (5.2.40)

T² may be transformed to F by 5.2.27. H₀ is rejected in favor of H_A if F exceeds the critical F value, with p and n_e-p+1 degrees of freedom.
Univariate Statistics  Summary matrices for multivariate analysis are efficient devices for obtaining p separate univariate results simultaneously. In the case of regression analysis, elements of S_H and S_E may be employed to provide tests of the contribution of q_h (≥ 1) predictor variables to each of the criteria. Let S_E and S_H represent p×p sums of products for error and for hypothesis, having n_e and q_h degrees of freedom, respectively. S_H is the matrix product of q_h row vectors of U. The univariate F ratio for one criterion variable yₖ is the ratio of two mean squares, one for hypothesis and one for error. Each mean square is a sum of squares divided by its degrees of freedom. The mean square for regression for yₖ is the kk diagonal element of S_H divided by its degrees of freedom. Dividing the entire matrix S_H by q_h produces the p×p matrix of mean squares and cross products

M_H = (1/q_h)S_H    (5.2.41)

The diagonal element is

[m_h]ₖₖ = [s_h]ₖₖ/q_h    (5.2.41a)
Similarly, the error mean squares and cross products are

M_E = (1/n_e)S_E    (5.2.42)

The error mean square for yₖ is

[m_e]ₖₖ = [s_e]ₖₖ/n_e    (5.2.42a)

Expression 5.2.42a is identically the kk diagonal element of Σ̂, that is, σ̂ₖ². A univariate test statistic for the effect of the predictor(s) on yₖ is

F = [m_h]ₖₖ/[m_e]ₖₖ    (5.2.43)

F may be referred to critical values of the F distribution, with q_h and n_e degrees of freedom.
Example Using the data of Table 3.3.1, let us obtain test statistics for the contribution of x 3 to variation in y1 and y2 separately. The hypothesis matrix is
s =[
.01 - .06] -.06 .41
Hz
with 1 degree of freedom. Thus MH2 = SH2 • The error matrix is
s = [3.59
2.38
E
2.38] 3.33
with 11 degrees of freedom. The mean-squares-and-products matrix is
M
1 S
E=rr E=
[.33 .22
.22] .30
The F ratios for y1 and y2 alone are .01
F, = .33 = .03
(y1)
1
(Yz )
36 fz= ..4 30 =1.
Neither exceeds the .05 critical value of F with 1 and 11 degrees of freedom.
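A short sketch of these univariate ratios (assuming NumPy and SciPy; the matrices are those of the example):

```python
import numpy as np
from scipy.stats import f

# Hypothesis and error sum-of-products matrices for x3, from the example
SH = np.array([[.01, -.06], [-.06, .41]])
SE = np.array([[3.59, 2.38], [2.38, 3.33]])
qh, ne = 1, 11

MH = SH / qh        # mean squares and products for hypothesis (Eq. 5.2.41)
ME = SE / ne        # mean squares and products for error (Eq. 5.2.42)

# Univariate F ratio for each criterion: ratio of diagonal elements (Eq. 5.2.43)
F = np.diag(MH) / np.diag(ME)
print("F ratios:", F.round(2))                         # approx. [0.03, 1.36]
print(f"critical F(.05) = {f.ppf(.95, qh, ne):.2f}")
```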
Multiple univariate F ratios for any one hypothesis (one or more predictors) are not statistically independent. When the criterion variables form a meaningful set, a multivariate test statistic should be used for a single decision about H₀ (Eq. 5.2.2). Both clarity and statistical validity are thus maintained. Hummel and Sligo (1971) have recommended a two-stage significance-testing procedure. At the first stage, a decision is made about H₀ from a multivariate result (such as Wilks' Λ). If H₀ is rejected, the separate univariate ratios may be inspected to determine where the significant effects are located.
The largest univariate F ratio is obtained for the variable most affected by the q_h predictors, the smallest for the variable least affected, and so on. However, there is no necessary relationship between the significance of univariate and multivariate tests for one hypothesis. For example, one or more univariate F's may be significant and not the multivariate statistic, or vice versa. In the regression model, inspection of simple and multiple correlations is likely to be more useful for interpretation. Matrices S_H and S_w(yy) can provide a univariate measure of the percentage of variation in yₖ attributable to the predictors. The regression sum of squares is [s_h]ₖₖ; the sum of squares for yₖ before eliminating the effects of the predictors is [s_w(yy)]ₖₖ. The percent of variation in yₖ due to the predictors, above and beyond preceding independent variables, is

100[s_h]ₖₖ / [s_w(yy)]ₖₖ    (5.2.44)
When there is only a single criterion measure, the F transformation from Wilks' Λ is identically the univariate F ratio for y. From Eq. 5.2.17,

Λ = |S_E| / |S_E+S_H|

When p = 1, both matrices are scalars and Λ = s_E/(s_E+s_H). The multipliers for F (Eq. 5.2.12) are

s = [(q_h²-4)/(q_h²-4)]^{1/2} = 1    and    m = n_e - 1 + q_h/2

Then

F = (1-Λ)/Λ · (n_e-1+q_h/2+1-q_h/2)/q_h = (s_H/s_E)·(n_e/q_h)

F is identically the univariate ratio, as in Eqs. 5.2.41-5.2.43.
Step-down Analysis  Step-down analysis, described by Roy (1958) and by Roy and Bargmann (1958), provides p univariate test statistics to test H₀ (Eq. 5.2.2), which are statistically independent but depend upon an a priori ordering of criterion measures. Logical orderings of criterion measures arise when subjects have been tested repeatedly over time, when measures involve progressively more complex behaviors, or whenever there exists a systematic progression from one outcome measure to another. If no logical or theoretical ordering can be justified, step-down tests will be of little scientific value, and a general test like Wilks' Λ may be employed.
Step-down analysis is a stepwise procedure in that variables are considered in a predetermined order; at each stage only the unique contribution of the additional variable is estimated and tested. The term "step-down" is used to indicate that it is the criterion variables, as opposed to the predictors, that are being considered in an elimination process. The distinction is somewhat arbitrary, however; the step-down statistic for the criterion variable yₖ is identical to the univariate test that would have been obtained if preceding criteria were listed in X as predictors, ahead of those actually under consideration. Step-down test statistics are computed from hypothesis and error matrices S_H and S_E, with q_h and n_e degrees of freedom, respectively. Step-down analysis enables the researcher to test the relationship between the q_h predictors and y₁, then y₂ eliminating y₁, y₃ eliminating y₁ and y₂, and so on. The test statistics depend upon the conditional variance of each measure, given preceding criteria. Let us factor S_E into the product of the triangular Cholesky factor and its transpose,

S_E = T_E T'_E    (5.2.45)

The p diagonal elements of T_E are the square roots of the error sums of squares for variable yₖ, given y₁ through yₖ₋₁. The square [t_e]ₖₖ² is the sum of squares for the conditional yₖ, and has degrees of freedom n_e-k+1; one degree of freedom has been attributed to each preceding variable eliminated. The conditional error variance, or mean square for yₖ given preceding measures, is

[t_e]ₖₖ² / (n_e-k+1)    (5.2.46)

Let us now compute a "total" matrix for this hypothesis:

S = S_E + S_H    (5.2.47)

S may be factored like S_E to obtain conditional "total" sums of squares for each variate, eliminating preceding measures. By the Cholesky method,†

S = T*[T*]'    (5.2.48)

The "total" sum of squares for yₖ given y₁ through yₖ₋₁ is [t*]ₖₖ². Thus the hypothesis sum of squares for yₖ given y₁ through yₖ₋₁ is [t*]ₖₖ² - [t_e]ₖₖ², with q_h degrees of freedom. The hypothesis mean square for yₖ, eliminating preceding criteria, is

([t*]ₖₖ² - [t_e]ₖₖ²) / q_h    (5.2.49)

The step-down F statistic is a ratio of the conditional mean squares as given in Eqs. 5.2.49 and 5.2.46. It is the univariate F ratio for yₖ, eliminating y₁ through yₖ₋₁. The kth F statistic is

F*ₖ = ([t*]ₖₖ² - [t_e]ₖₖ²)/q_h · (n_e-k+1)/[t_e]ₖₖ²    (5.2.50)

†Note that the Λ criterion and step-down statistics are easily computed at the same time. Λ requires |S_E| and |S|, which are efficiently computed through Cholesky factors, as described in Chapter 2.
Any matrix S_H will yield p F* statistics. F*ₖ is referred to critical values of the F distribution, with q_h and n_e-k+1 degrees of freedom. H₀ is accepted if and only if none of the F* statistics exceeds its critical value. Since the first F* statistic has no prior variates eliminated, it is equal to the simple univariate F ratio for y₁:

F*₁ = ([t*]₁₁² - [t_e]₁₁²)/[t_e]₁₁² · n_e/q_h
    = ({[s_e]₁₁+[s_h]₁₁} - [s_e]₁₁)/[s_e]₁₁ · n_e/q_h
    = [s_h]₁₁/[s_e]₁₁ · n_e/q_h
    = F₁

F*₂ through F*_p, however, are tests of the effect of the q_h predictors upon response variable yₖ, eliminating any portion of the effect that can be attributed to preceding dependent variables. Under the null hypothesis for variable yₖ, F*ₖ is statistically independent of F*₁ through F*ₖ₋₁. Under the alternative hypothesis that the effect on yₖ is not zero, F*₁ through F*ₖ₋₁ are not independent of F*ₖ. The appropriate technique for interpreting the step-down statistics is to begin with F*_p and proceed backward toward F*₁. At each stage a hypothesis is tested concerning prediction of a specific criterion, eliminating earlier measures in S_E and S_H. If F*ₖ is not significant, the corresponding variable is not significantly related to the predictors above and beyond y₁ through yₖ₋₁. We proceed with inspection of F*ₖ₋₁. Testing stops if a significant F* statistic is encountered, and H₀ is rejected. Variables earlier in the order of elimination cannot be validly tested; F*₁ through F*ₖ₋₁ are all confounded with significant variation due to yₖ. If no F* statistic is significant, H₀ is accepted.

It has been suggested by Bock (1966, p. 828) that criterion variables known to be important be ordered first, and the "more dubious" or more complex variables later. In this manner the value of the latter variables in contributing to the regression upon the predictors may be tested first. In the search for parsimonious explanations of behavior, these complex or doubtful contributors may be eliminated, whereas the earlier and simpler explanatory variables are retained. The logical ordering of measures is critical, since the step-down results will change with a permutation of the criterion variables. Step-down statistics are the only tests that are influenced by the order of the criteria. It is possible to assign differing type-I error rates to the step-down tests, maintaining a constant overall probability of falsely rejecting at least one of the p null hypotheses. If we assign probability αₖ to statistic F*ₖ, then the overall probability is

α = 1 - ∏ₖ₌₁^p (1-αₖ)    (5.2.51)

Further consideration of step-down decision rules has been given by Das Gupta (1970).
Example Using the data of Table 3.3.1, let us test H₀: β'₄ = 0', employing S_H2 and S_E:

S_H2 = [ .01  -.06]    (q_h = 1)
       [-.06   .41]

S_E = [3.59  2.38]    (n_e = 11)
      [2.38  3.33]

To use the step-down method, we must assume a natural order to y₁ and y₂. There is no inherent order in the measures as they appear in this example. Therefore the following results, though mathematically correct, are less informative than the likelihood ratio test. Factoring S_E, the Cholesky factor is

T_E = [1.89  0.00]
      [1.25  1.32]

The "total" matrix is

S = [3.60  2.32]
    [2.32  3.74]

The Cholesky factor of S is

T* = [1.90  0.00]
     [1.22  1.50]

The F* ratios are

F*₁ = (1.90²-1.89²)/1.89² · (11-1+1)/1 = .03    (1, 11 d.f.)

F*₂ = (1.50²-1.32²)/1.32² · (11-2+1)/1 = 2.78    (1, 10 d.f.)

Finding F*₂ not significant at α = .05, we inspect F*₁. F*₁ is also not significant and H₀ is accepted. Had F*₂ been significant, we would have no separate test for y₁, but H₀ would be rejected. If F*₁ were significant but not F*₂, we should conclude that H₀ is rejected but that y₂ does not relate to x₃ above and beyond y₁. If both tests are conducted with α₁ = α₂ = .05, the α for H₀ is 1-.95² = .10.
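A computational sketch of the step-down ratios (assuming NumPy and SciPy; small differences from the printed values reflect rounding in the displayed matrices):

```python
import numpy as np
from scipy.stats import f

SH = np.array([[.01, -.06], [-.06, .41]])
SE = np.array([[3.59, 2.38], [2.38, 3.33]])
qh, ne = 1, 11
p = SE.shape[0]

TE = np.linalg.cholesky(SE)        # lower triangular factor of S_E (Eq. 5.2.45)
TS = np.linalg.cholesky(SE + SH)   # factor of the "total" matrix  (Eq. 5.2.48)

# Step-down F* for criterion k, eliminating y_1 ... y_{k-1} (Eq. 5.2.50).
# Python's k is 0-based, so n_e - k here equals n_e - (k+1) + 1 in the text.
for k in range(p):
    te2, ts2 = TE[k, k]**2, TS[k, k]**2
    Fk = (ts2 - te2) / te2 * (ne - k) / qh
    print(f"F*_{k+1} = {Fk:.2f} on ({qh}, {ne - k}) d.f.;"
          f" critical F(.05) = {f.ppf(.95, qh, ne - k):.2f}")
```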
5.3
MULTIPLE HYPOTHESES
Frequently we wish to test multiple hypotheses about rows or sections of the matrix B. For example, in Table 5.1.1 we have partitioned predictable variation into components for each of the predictor variables, eliminating preceding measures. Regression variation in the accompanying example, based on the data of Table 3.3.1, is partitioned into sums of products for the constant, the first two predictors jointly, and the final predictor eliminating the constant and the first two predictors. In each case we may wish to test several hypotheses about subsets of predictor variables.

We assume a fixed order of predictor variables. "Stepwise" procedures, which attempt all possible orderings or search for the best single prediction equation, do not generally yield valid test statistics, and must be interpreted with caution. With a predetermined order of predictor variables, valid sequential test statistics are obtained. Using a fixed order, it is also possible to test important combinations or sets of variables.

Notationally, we may designate the multiple hypotheses by adding a subscript to the terms of the preceding section. Hypothesis j is

H₀ⱼ: B_hⱼ = 0    (5.3.1)

B_hⱼ is section j of the entire matrix B, having q_hⱼ rows and p columns. The regression sum of products for H₀ⱼ is

S_Hⱼ = U'_hⱼ U_hⱼ    (5.3.2)

S_Hⱼ is the sum of products of the corresponding q_hⱼ rows of the orthogonal estimates U. The total number of hypotheses is j = 1, 2, ..., J. For example, in testing q hypotheses about the contribution of each predictor to criterion variation, eliminating preceding predictors, B_h1 through B_hq are each a single row of B, with all q_hⱼ = 1. Each matrix S_Hⱼ is a squares-and-cross-products matrix of one row of U, as in Table 5.1.1. For the example based on Table 3.3.1,

B = [α'  ]            U = [u'₁ ]
    [B_h1]    and         [U_h1]    (5.3.3)
    [B_h2]                [U_h2]

H₀₁ is B_h1 = 0; the sum of products for regression is S_H1 = U'_h1 U_h1 = u₂u'₂+u₃u'₃, with q_h1 = 2 degrees of freedom. H₀₂ is B_h2 = 0; the sum of products for x₃, eliminating the constant, x₁, and x₂, is S_H2 = u_h2 u'_h2, with q_h2 = 1 degree of freedom (see p. 144).

For purposes of significance testing, we proceed in an order opposite to the order in which we have eliminated predictors. The effect of the last predictor or set, eliminating all others, is tested for significance first, using S_HJ. If the null hypothesis is accepted that B_hJ is zero, then we proceed to test H₀J₋₁ using S_HJ₋₁. If H₀J₋₁ is accepted, we proceed to test H₀J₋₂, and so on. If any H₀ⱼ is rejected, then predictors earlier in the elimination cannot be validly tested in this order.

The rationale for the order of testing can be seen algebraically by examining U in Eq. 5.1.31. Let us assume that we are testing the effect of each predictor separately. The last element u_{q+1} is a function of only β_q. If the null hypothesis is accepted, then β_q is zero and disappears from all preceding u-terms. In this
situation we may test the contribution of x_{q-1} using u_q, and the results will not be confounded with x_q variation. Similarly, if β_{q-1} is judged to be null, no preceding terms are confounded with x_{q-1} variation; they may be validly tested. The same logic applies to sets of predictors. For example, if the last two regression weights are simultaneously judged to be null, then both coefficients disappear from all preceding u-terms. If, however, H₀ is rejected from u_{q+1}, and β_q is not zero, then all preceding u-terms contain nonzero functions of x_q, in addition to the other β-weights they reflect. A test employing u_q, for example, would not reveal whether β_{q-1} alone was null (note also that the elements t_{ij} can be either positive or negative). In order to obtain valid test statistics for any predictors preceding x_q, it is necessary to order predictor x_q ahead of the other predictors (earlier in the order of elimination), and to reorthogonalize. The resulting orthogonal coefficients will reflect the corresponding predictors eliminating nonzero x_q effects. The same procedure is used with x_{q-1}, or with any predictors found to contribute to criterion variation.

Tests made in various orders of predictors are not independent of one another. The number of orders of predictors should be kept to a minimum, to avoid multiplying statistical error rates. The error rates associated with regression procedures attempting all possible orderings (or finding the best predictor at each stage) are generally so high as to render the tests invalid. Also, a variable may appear insignificant because it is entered into regression following other predictors. If entered first, however, its contribution may be larger. This does not imply that the variable is unimportant, but only that it is correlated with other measures. In general it is best to determine an initial ordering of predictors, with the most crucial or complex variables last. Their contributions to regression are tested first, and with the least confounding. Control variables or measures known to be associated with the criteria are ordered first. In this manner, variables preliminary or central to our hypotheses are validly tested for their contribution to regression, above and beyond those about which there is little doubt or concern.

For each hypothesis, a decision about H₀ is made from an appropriate test statistic. When p > 1, a multivariate testing procedure should be employed (Wilks' Λ, step-down analysis, or the canonical correlation tests of the next chapter). The multiple univariate F statistics, as well as simple and multiple correlations, provide information about the criteria least and most affected by the antecedent(s). These should not be used for a decision regarding H₀ so long as the criteria form a conceptually meaningful set (see Hummel and Sligo, 1971). If the last hypothesis H₀J is accepted, then S_HJ contains only random variation and may be pooled with S_E to provide a better estimate of error dispersion; q_hJ is pooled with n_e. Testing and pooling continue in backward order with S_HJ₋₁, S_HJ₋₂, and so on. At each stage, we test the next hypothesis with the maximum statistical power (Anderson, 1962). If any H₀ⱼ is rejected, hypothesis testing must stop, in this order. If earlier variables require significance tests, they must be reordered to follow the significant predictors. U is recomputed and repartitioned in the alternate order. Under this pooling procedure, we may derive an
exactly determined type-I error rate for one order of predictors. Differing α-levels may be assigned to each of the J hypotheses (αⱼ). The overall probability of falsely rejecting at least one null hypothesis out of J is

α = 1 - ∏ⱼ₌₁^J (1-αⱼ)    (5.3.4)

αⱼ may be chosen so as to fix the overall α at a particular value, say .05 or .01.
Example In Section 5.1 we partitioned criterion variation in y₁ and y₂ of Table 3.3.1 into sums of products for error, plus regression effects for the constant, x₁ and x₂ simultaneously, and x₃ eliminating the constant, x₁, and x₂. The error and hypothesis matrices are

S_E = [3.59  2.38]    n_e = 11
      [2.38  3.33]

S_H1 = [4.71  4.44]    q_h1 = 2 (x₁ and x₂, eliminating constant)
       [4.44  6.82]

S_H2 = [ .01  -.06]    q_h2 = 1 (x₃ eliminating all else)
       [-.06   .41]

S_H1 and S_H2 correspond to the partitions of B and U in Eq. 5.3.3. In Section 5.2 we tested H₀₂: B_h2 = 0 using univariate and multivariate test criteria. H₀ was accepted in all cases. Thus we may pool S_H2 with S_E. The revised error matrix is

S'_E = [3.60  2.32]
       [2.32  3.74]

with n'_e = 12 degrees of freedom. Using S'_E and S_H1 we may test H₀₁: B_h1 = 0 for predictors x₁ and x₂ jointly. The likelihood ratio is

Λ = |S'_E| / |S'_E+S_H1| = 8.07/42.11 = .192

The multipliers are

m = 12-(2+1-2)/2 = 11.5

s = [(2²(2²)-4)/(2²+2²-5)]^{1/2} = 2

The F transformation is

F = (1-√.192)/√.192 · [11.5(2)+1-(2)(2)/2] / [(2)(2)] = 1.28 · 5.5 = 7.06
F does exceed the .05 critical value of 2.82, with 4 and 22 degrees of freedom. H₀₁ is rejected; x₁ and x₂ do contribute significantly to criterion variation. (Had H₀₂ been rejected, it would have been necessary to order x₃ ahead of x₁ and x₂, and to recalculate U to test H₀₁ for x₁ and x₂, eliminating x₃.)
To inspect the effects of x₁ and x₂, let us compute univariate statistics for y₁ and y₂ separately:

F₁ = (4.71/2) / (3.60/12) = 7.85    (y₁)

F₂ = (6.82/2) / (3.74/12) = 10.95    (y₂)

Each F ratio has 2 and 12 degrees of freedom. Both criteria are significantly related to the two predictors, although y₂ is more affected by x₁ and x₂ than is y₁.
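The pooling-and-retesting sequence can be sketched computationally as follows (assuming NumPy and SciPy; the matrices are those of the example):

```python
import numpy as np
from scipy.stats import f

# Partitioned sums of products from the example (Table 3.3.1 data)
SE  = np.array([[3.59, 2.38], [2.38, 3.33]]);  ne = 11
SH1 = np.array([[4.71, 4.44], [4.44, 6.82]]);  qh1 = 2   # x1, x2 jointly
SH2 = np.array([[.01, -.06], [-.06, .41]]);    qh2 = 1   # x3, eliminating x1, x2
p = 2

def wilks_F(SE, SH, ne, qh, p):
    """Likelihood ratio criterion and its F approximation (Eq. 5.2.12)."""
    lam = np.linalg.det(SE) / np.linalg.det(SE + SH)
    s = 1.0 if min(p, qh) == 1 else np.sqrt((p**2*qh**2 - 4) / (p**2 + qh**2 - 5))
    m = ne - (p - qh + 1) / 2
    df1, df2 = p*qh, m*s + 1 - p*qh/2
    F = (1 - lam**(1/s)) / lam**(1/s) * df2 / df1
    return lam, F, df1, df2

# H_02 (last in the order of elimination) was accepted, so pool SH2 with SE...
SEp, nep = SE + SH2, ne + qh2
# ...and test H_01 against the pooled error, with greater power
lam, F, df1, df2 = wilks_F(SEp, SH1, nep, qh1, p)
print(f"lambda = {lam:.3f}, F = {F:.2f} on ({df1:.0f}, {df2:.0f}) d.f.;"
      f" critical F(.05) = {f.ppf(.95, df1, df2):.2f}")
```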
Reestimation  The final predictors that do not contribute to criterion variation are ultimately discarded from the model. Remaining are early predictors that were not tested but were assumed necessary to the model (for example, the constant, known predictors, control variables), plus other tested variables that make significant additional contributions to regression. Altogether c (≤ q) predictors remain; c is the rank of the model for estimation. The original estimate B̂ was a "best fit" for the q simultaneous predictor variables. The rows of B̂ are interdependent and change with the deletion of variables. Thus we shall reestimate B to obtain best estimates for only the leading c predictors. This may be accomplished from X_c, the leading c columns of X, by

B̂_c = (X'_c X_c)^-1 X'_c Y    (5.3.5)

Or, let T_c be the leading c rows and columns of the Cholesky factor of X'X. T is also the triangular factor from the orthonormalization of X in Eq. 5.1.15. Then

B̂_c = (T_c^-1)' U_c    (5.3.6)

U_c represents the leading c rows of U. Corresponding variances and standard errors for B̂_c are obtained by pooling the nonsignificant effects with error variation. The variance-covariance matrix of column k of B̂_c is [σ̂ₖ²](X'_c X_c)^-1, with

Σ̂ = [1/(N-c)] S_E    (5.3.7)
Multiple Regression Analysis: Tests of Significance
165
Under ideal circumstances, these final estimates should be obtained from a sample other than the one used for significance tests.
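A minimal sketch of the reestimation step, with hypothetical data (the matrices X and Y and the seed below are illustrative; the constant is retained and counted separately from the c predictors):

```python
import numpy as np

# Hypothetical model: constant + q = 4 predictors, of which the leading c = 2 survive
rng = np.random.default_rng(1)
N, q, p, c = 30, 4, 2, 2
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, q))])
Y = rng.normal(size=(N, p))

Xc = X[:, :c + 1]                              # constant plus leading c predictors
Bc = np.linalg.solve(Xc.T @ Xc, Xc.T @ Y)      # reestimated weights (Eq. 5.3.5)

# Error covariance with the discarded effects pooled into the residual
E = Y - Xc @ Bc
Sig = E.T @ E / (N - c - 1)
print(Bc.round(2), Sig.round(2), sep="\n")
```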
5.4
SAMPLE PROBLEM 1 - CREATIVITY AND ACHIEVEMENT
Three hypotheses are of concern in the creativity-intelligence-achievement example. It is asserted by the researcher that both intelligence and creativity play significant roles in determining an individual's level of divergent achievement. In addition, individuals high in both intelligence and creativity are expected to perform particularly well, whereas a low degree of intelligence or creativity, or both, will result in low levels of divergent achievement. Thus the third hypothesis asserts that an interaction of the two predictors is a determinant of an individual's achievement level. The two criterion variables are measures of the divergent achievements synthesis (y₁) and evaluation (y₂). The seven predictors are intelligence (x₁), three creativity measures (x₂, x₃, x₄), and three interaction or cross-product variables (x₅, x₆, x₇). The sum-of-products matrix of mean deviations for all nine measures (S_w) is given in Section 3.4, along with means, standard deviations, and correlations. The partitions of S_w for y, x, and their cross products are
S_w(yy) = [178.85  110.35]  y₁
          [110.35  186.18]  y₂

             x₁      x₂      x₃      x₄      x₅      x₆      x₇
S_w(xx) = [59.00                          (Symmetric)       ]  x₁
          [ 5.55   59.00                                    ]  x₂
          [24.29    3.38   59.00                            ]  x₃
          [27.47   31.98   25.13   59.00                    ]  x₄
          [ 5.56   -3.06   18.56    1.19   43.35            ]  x₅
          [28.65   18.56   38.66   24.42   22.27  104.70    ]  x₆
          [17.56    1.19   24.42   14.98   19.12   57.79   64.79]  x₇

             x₁      x₂      x₃      x₄      x₅      x₆      x₇
S_w(yx) = [65.79   21.29   35.23   40.74    4.56   45.38   31.79]  y₁   = [S_w(xy)]'
          [56.69   17.66   40.49   33.58   20.22   40.17   29.77]  y₂

The estimated regression coefficients are given in Chapter 4. The matrix (without the constant) is

                               y₁     y₂
B̂ = [S_w(xx)]^-1 S_w(xy) = [ 1.00    .86]  x₁
                            [  .28    .33]  x₂
                            [  .16    .32]  x₃
                            [ -.04   -.14]  x₄
                            [ -.16    .24]  x₅
                            [ -.04   -.16]  x₆
                            [  .25    .20]  x₇
The three hypotheses may be represented as tests on portions of B:

         y₁  y₂
B = [ B_h1 ]  x₁             Intelligence
    [------]
    [ B_h2 ]  x₂, x₃, x₄     Creativity
    [------]
    [ B_h3 ]  x₅, x₆, x₇     Interactions
The three null hypotheses are H₀ⱼ: B_hⱼ = 0 (j = 1, 2, 3). To test the hypotheses, the orthogonal estimates are computed and partitioned like B. The lower triangular Cholesky factor of S_w(xx) is

T = [7.68                                    (Zero)]
    [ .72   7.65                                   ]
    [3.16    .14   7.00                            ]
    [3.58   3.84   1.90   5.28                     ]
    [ .72   -.47   2.33   -.76   6.05              ]
    [3.73   2.07   3.80   -.78   1.83   8.25       ]
    [2.29   -.06   2.46    .45   1.99   4.45   5.43]

The orthogonal estimates are given by

U = T'B̂ = [ 8.56   7.38]  u'₁   Intelligence
          [ 1.97   1.61]  u'₂
          [ 1.12   2.42]  u'₃   Creativity, eliminating intelligence
          [  .07   -.68]  u'₄
          [ -.54   1.57]  u'₅
          [  .74   -.40]  u'₆   Interactions, eliminating intelligence and creativity
          [ 1.35   1.11]  u'₇
The squares and products of the rows of U are combined for tests of hypotheses. The first in the order of elimination relates to intelligence, and has q_h1 = 1 degree of freedom. That is,

S_H1 = u₁u'₁ = [73.36  63.22]  y₁
               [63.22  54.48]  y₂
The second hypothesis matrix is for the three creativity measures, with q_h2 = 3. That is,

S_H2 = Σⱼ₌₂⁴ uⱼu'ⱼ = U'_h2 U_h2 = [5.17  5.85]  y₁
                                  [5.85  8.91]  y₂

The third hypothesis matrix, involving the interaction terms, has q_h3 = 3 degrees of freedom. That is,

S_H3 = Σⱼ₌₅⁷ uⱼu'ⱼ = U'_h3 U_h3 = [2.65   .34]  y₁
                                  [ .34  3.83]  y₂

The error sum of products is

S_E = S_w(yy) - U'U = [178.85  110.35] - [81.18  69.41]
                      [110.35  186.18]   [69.41  67.22]

    = [97.67   40.94]  y₁
      [40.94  118.97]  y₂
Error degrees of freedom are n_e = 60-7-1 = 52. The hypothesis tested first is last in the order of elimination, that is, the interaction terms, or H₀₃: B_h3 = 0. The univariate F ratios for synthesis and evaluation are ratios of hypothesis and error mean squares. The interaction sum of products is S_H3. Then

F₁ = (2.65/3) / (97.67/52) = .47

F₂ = (3.83/3) / (118.97/52) = .56
Each has 3 and 52 degrees of freedom; neither exceeds the .05 critical value. The proportions of variation in y₁ and y₂ attributable to interaction, above and beyond intelligence and creativity, are 2.65/178.85 = .015 and 3.83/186.18 = .021, respectively. For the multivariate and step-down tests of the same hypothesis, we form

S = S_E + S_H3 = [100.32   41.28]
                 [ 41.28  122.80]

The Cholesky factors of S_E and S may be used to find |S_E| and |S_E+S_H3|, as well as to provide the conditional variances for step-down analysis. The factors are

T_E = [9.88   0.00]        T* = [10.02   0.00]
      [4.14  10.09]             [ 4.12  10.29]
The determinant of a symmetric matrix is the squared product of the diagonal elements of the Cholesky factor (see Chapter 2). Thus

Λ = |S_E| / |S| = (9.88×10.09)² / (10.02×10.29)² = .94

within rounding error. For computing the multivariate F statistic, p = 2, s = 2, and m = 52. The resulting value is F = .5649, with 6 and 102 degrees of freedom. The critical F value is not exceeded.

The diagonal elements of T_E and T* are employed in the step-down analysis. The step-down statistic for synthesis is

F*₁ = (10.02²-9.88²)/9.88² · (52-1+1)/3 = .47

F*₁ has 3 and 52 degrees of freedom. For evaluation, eliminating synthesis,

F*₂ = (10.29²-10.09²)/10.09² · (52-2+1)/3 = .67

F*₂ has 3 and 51 degrees of freedom. Neither step-down F exceeds the corresponding critical value. From all results, H₀₃ is accepted; B_h3 is null. Interactions
of creativity and intelligence, above and beyond their individual effects, do not account for divergent achievement levels.

Finding no significant interaction, we may pool S_H3 with the error sum of products. The new estimate is

S'_E = S_E + S_H3 = [100.32   41.28]
                    [ 41.28  122.80]

S'_E has 52+3 = 55 degrees of freedom. (The procedure is also computationally efficient, since the Cholesky factor of [S_E+S_H3] was computed for testing the interaction hypothesis. T* may now be employed as the factor of S'_E for the test of the next-to-last, or creativity, hypothesis.) Should the interaction terms have been significant, tests of the creativity effects would not be valid (a) because the sums of products for creativity are confounded with nonzero interaction components, and (b) because interaction in particular indicates that any main effect of intelligence does not apply equally across creativity levels, and vice versa.

Having accepted H₀₃, we repeat the testing procedure for creativity, eliminating intelligence (H₀₂). The hypothesis matrix is S_H2; the error matrix is S'_E. The univariate tests for the two criterion measures are

F₁ = (5.17/3) / (100.32/55) = .94

F₂ = (8.91/3) / (122.80/55) = 1.33
Each ratio has 3 and 55 degrees of freedom. Neither exceeds the .05 critical F value of 2.8. The percentages of variation accounted for by creativity, beyond that accounted for by intelligence, are 100×(5.17/178.85) = 2.89 for synthesis, and 100×(8.91/186.18) = 4.78 for evaluation. Neither is appreciably larger than the variation attributable to the interaction terms. For the multivariate tests, the matrix to be factored is

S = S'_E + S_H2 = [105.49   47.13]
                  [ 47.13  131.71]

The Cholesky factor is

T* = [10.27   0.00]
     [ 4.59  10.52]

Using the diagonal elements of T* and the factor of S'_E,

Λ = |S'_E| / |S| = (10.02×10.29)² / (10.27×10.52)² = .91

The multivariate F statistic is computed with m = 55 (assuming n_e = 55). The value is F = .8749, with 6 and 108 degrees of freedom. We have insufficient data to reject H₀₂. The step-down statistic for synthesis is

F*₁ = (10.27²-10.02²)/10.02² · (55-1+1)/3 = .94

F*₁ has 3 and 55 degrees of freedom. For evaluation, eliminating synthesis,

F*₂ = (10.52²-10.29²)/10.29² · (55-2+1)/3 = .82

F*₂ has 3 and 54 degrees of freedom. No test statistic is significant. Creativity does not play a role in divergent achievement outcomes, above and beyond intelligence. H₀₂ is accepted.

Finding creativity nonsignificant, we proceed with the test of the intelligence effect. (Had the creativity effect been significant, intelligence could not be tested with this order of predictors.) S_H2 may be pooled with the sum of products for error to obtain S''_E = [S'_E + S_H2], with 55+3 = 58 degrees of freedom. The hypothesis matrix is S_H1 = u₁u'₁. The univariate F statistics are
F₁ = (73.36/1) / (105.49/58) = 40.33

F₂ = (54.48/1) / (131.71/58) = 23.99

Each has 1 and 58 degrees of freedom. Both exceed the .05 critical F value.
For multivariate tests, the "total" matrix is S = S''_E + S_H1. The Cholesky factor of S is

T* = [13.37   0.00]
     [ 8.25  10.87]

From the diagonal elements,

Λ = |S''_E| / |S| = (10.27×10.52)² / (13.37×10.87)² = .55

The corresponding F statistic has the multiplier m = 57. The F value is 23.0717, with 2 and 57 degrees of freedom. H₀₁ is rejected at α = .05. The step-down test statistic for synthesis is

F*₁ = (13.37²-10.27²)/10.27² · (58-1+1)/1 = 40.33

F*₁ has 1 and 58 degrees of freedom. The step-down statistic for evaluation, eliminating synthesis, is

F*₂ = (10.87²-10.52²)/10.52² · (58-2+1)/1 = 3.84

F*₂ has 1 and 57 degrees of freedom. F*₂ does not exceed the .05 critical value, while F*₁ is significant. We conclude that the relationship between the criterion and predictor variables is concentrated in the synthesis measure. If we assume that evaluation is the more complex process, it appears that the simpler process of synthesis is sufficient to account for the significant relationship with intelligence. The percentages of variation in synthesis and evaluation accounted for by intelligence are 100×(73.36/178.85) = 41.02, and 100×(54.48/186.18) = 29.26, respectively. Since the hypothesis involves only the first predictor variable in the order of elimination, these values are simply the squared correlations between achievement and intelligence, as contained in R_w.

The findings are summarized in Table 5.4.1. Mean squares and cross products [M_Hⱼ = S_Hⱼ/q_hⱼ] are presented in place of sums of squares, for ease of interpretation (especially for the error matrix). The table is read from the bottom upward. Hypotheses are maintained or rejected by the multivariate test statistic. In Table 5.4.1, no significant Λ is encountered until intelligence. Finding this, we inspect the univariate results to isolate the source of the effect. When criterion measures have a natural order (as they may have here, by complexity), the step-down results may be used in place of Λ for the decision about H₀.
Table 5.4.1  Summary of Results for Creativity and Achievement Significance Tests

                                 Mean Squares and Products    Degrees of   Multivariate       Univariate F (percent of variation)
Source of Variation              (Synthesis, Evaluation)      freedom      Λ        F         Synthesis         Evaluation
Intelligence                     [73.36  63.22]               q_h1 = 1     .55      23.07*    40.33* (41.02)    23.99* (29.26)
                                 [63.22  54.48]
Creativity, eliminating          [1.72  1.95]                 q_h2 = 3     .91      .87       .94 (2.89)        1.33 (4.78)
  intelligence                   [1.95  2.97]
Interaction, eliminating         [.88   .11]                  q_h3 = 3     .94      .56       .47 (1.48)        .56 (2.06)
  creativity and intelligence    [.11  1.28]
Residual                         Σ̂ = [1.88  .79]              n_e = 52
                                     [ .79  2.29]
Total (Sum of products)          S_w(yy) = [178.85  110.35]   N-1 = 59
                                           [110.35  186.18]

*Significant at p < .05.
In the data under consideration, only general intelligence plays a role in divergent achievement. There does not appear to be a differential creativity effect, either across all intelligence levels (there is a wide range of intelligence scores) or at particular levels of intelligence. The univariate results indicate that synthesis is more affected by intelligence than is evaluation. The step-down statistics indicate that in fact the more complex trait, evaluation, does not contribute to the association with the predictors. The relationship between the two sets of measures is parsimoniously summarized in the correlation of the two simplest constructs, intelligence and synthesis.

Bargmann (1967) has recommended that, prior to testing individual components of a linear model, the researcher should consider the overall relationship of the independent and dependent variables. At times we are concerned only with such a relationship, such as in deciding whether to employ covariates in analysis-of-variance designs. The test is obtained by repeating the test procedure with the overall regression sum of products, S_R = U'U. The test criterion is

Λ = |S_E| / |S_w(yy)|
For the example, the Cholesky factor of S_w(yy) is

T* = [13.37   0.00]
     [ 8.25  10.87]

and

Λ = (9.88×10.09)² / (13.37×10.87)² = .47

Applying the F approximation,

m = 52 - (2+1-7)/2 = 54

and F = 3.3329. F has 14 and 102 degrees of freedom. With α = .05, we reject the null hypothesis that the entire q×p matrix B is null.

Having decided that only intelligence is related to the criterion measures, we will reestimate B and Σ with only the single predictor. The final estimates are
B̂₁ = (T₁^-1)'U₁ = (√59)^-1 [8.56  7.38] = [1.11  .96]

and

Σ̂ = [1/(N-1-1)] (S_w(yy) - U'₁U₁) = [1.82   .81]  Synthesis
                                    [ .81  2.27]  Evaluation

With the other independent variables eliminated from the model, the regression coefficients for intelligence have gone from 1.00 to 1.11, and from .86 to .96, for the two criterion variables. Σ is now more efficiently estimated, with 58 degrees of freedom. Thus the "best" prediction model is

[yᵢ₁ - ȳ.₁] = [1.11] (xᵢ - x̄.) + [eᵢ₁]
[yᵢ₂ - ȳ.₂]   [ .96]             [eᵢ₂]

y₁ is the synthesis random variable, y₂ evaluation, and x intelligence.
CHAPTER 6

Correlation

The social scientist often requires a measure of the strength of association of one or more criterion measures with one or more antecedents. For this purpose we have the correlation coefficient, a measure of the degree of interrelation of two random variables. We shall denote the correlation of random variables x and y as ρ_xy. The squared value, ρ_xy², is the proportion of variation in y that can be attributed to x. Equivalently, ρ_xy² is the proportion of decrease in y-variation if the effects of x are removed. Let us represent the variance of y by 𝒱(y), and the conditional variance of y given x, or of y for all subjects having the same x-value, as 𝒱(y|x). The squared correlation coefficient is

ρ_xy² = [𝒱(y) - 𝒱(y|x)] / 𝒱(y)    (6.0.1)

Since 𝒱(y) ≥ 𝒱(y|x), it follows that 0 ≤ ρ_xy² ≤ 1; for the coefficient itself, -1 ≤ ρ_xy ≤ 1. If much of the variation in y can be attributed to the x-measure, 𝒱(y|x) will be small and ρ_xy large. If 𝒱(y|x) is close to 𝒱(y), x does not account for the y-variance, and ρ_xy is close to zero. When ρ_xy = 0, x and y are said to be uncorrelated. The correlation coefficient is symmetric (ρ_xy = ρ_yx), by interchanging y and x in Eq. 6.0.1. Thus when we use measures of correlation, the designation of independent and dependent variables is arbitrary. For tests of significance on correlations, we shall assume that x and y have a bivariate normal distribution. In earlier chapters we assumed instead that the values of x (the predictor variables) are fixed constants and known without error. However, it was shown that the conditional distribution of y given x has the same variance in either case. If the x values are regarded as fixed constants, ρ_xy can still be defined as in 6.0.1, and the appropriate estimator remains unchanged, although the derivation is different. Under the normality assumption, uncorrelated variables are also statistically independent. In this section we discuss only linear measures of correlation. Thus, positive ρ_xy will indicate the extent to which high values of x are associated with proportionally high values of y, across the entire range of both measures. Negative ρ_xy is indicative of the extent to which low y is associated with high x and high y with low x, across the entire range of both.
Correlational measures may be categorized according to the number of variates being related. In the case of simple correlation, ρ_xy is an indicator of the association of the two individual measured variables. The partial correlation is an index of association of two variables holding constant, or eliminating, q (≥ 1) additional variables, or covariables. The multiple correlation is a measure of association of a single variate with a weighted linear function of q (≥ 1) additional measures. The canonical correlation reflects the association of two variates, of which each is itself a linear function of two or more original measures.

The simple correlation may be considered the association of one y-measure with one x-measure. (The partial correlation is a special case of simple correlation.) In contrast, the multiple correlation is the association of one y-measure with two or more x-measures. The canonical correlation is the correlation of two or more y-measures with two or more x-measures. Unlike the simpler coefficients, more than a single canonical correlation may be necessary to describe the relationship between two sets of variables. In each case the more general measure subsumes the simpler. A canonical correlation involving one y- or one x-measure is identically the multiple correlation of that measure with the other set. The multiple correlation of one y-measure and one x-variable is identically the simple correlation of the two.

Let y be a p×1 vector variable, with component measures yⱼ (j = 1, 2, ..., p); let x be a q×1 vector variable with component measures xₖ (k = 1, 2, ..., q). v is the (p+q)×1 vector of measures for all y- and x-variables. That is,

v' = [y', x']

The expectation of v is

ℰ(v) = μ = [μ_y]    (6.0.2)
           [μ_x]

The variance-covariance matrix of all measures is (p+q)-symmetric:

𝒱(v) = Σ = [Σ(yy) | Σ(yx)]  p rows
           [Σ(xy) | Σ(xx)]  q rows
            p cols  q cols    (6.0.3)
The simple correlation is the correlation of y and x when there is only one variable in each set (p = q = 1). Partial correlation is the association of y₁ and y₂ (p = 2) on the conditional distribution of q (≥ 1) x-variables. Multiple correlation is the correlation of y and x with p = 1 and q ≥ 1. Canonical correlation is the association measure for y and x when p ≥ 1 and q ≥ 1.
To estimate Σ, let yᵢ be the vector of y-scores for observation i, and xᵢ be the vector of x-scores for the same subject. The augmented vector is

v'ᵢ = [y'ᵢ, x'ᵢ]    (6.0.4)

The estimate of μ is

v̄'. = [ȳ'., x̄'.]    (6.0.5)

where N is the total number of observations. The sum of products of mean deviations is given by Eq. 3.3.7:

S_w = Σᵢ₌₁^N (vᵢ-v̄.)(vᵢ-v̄.)'    (6.0.6)
S_w may be partitioned for y and x in the same way as Σ in Eq. 6.0.3. That is,

S_w = [S_w(yy) | S_w(yx)]  p rows
      [S_w(xy) | S_w(xx)]  q rows
       p cols     q cols    (6.0.6a)
The unbiased sample covariance matrix is

Σ̂ = V_w = [1/(N-1)] S_w

  = [V_w(yy) | V_w(yx)]  p rows
    [V_w(xy) | V_w(xx)]  q rows
     p cols     q cols    (6.0.7)

V_w may be partitioned for y and x in the same way as Σ and S_w. Sample correlation results may be derived from these matrices. When there is more than a single subgroup of observations, the pooled S_w from Eq. 3.3.21 must replace Eq. 6.0.6, to avoid biasing variances and correlations by mean differences.
6.1
SIMPLE CORRELATION
The simple correlation of y and x is obtained by substituting in Eq. 6.0.1, with p = q = 1. The terms of Σ are then the scalars σ_y², σ_x², and σ_yx, and the conditional variance of y given x is

𝒱(y|x) = σ_y² - σ_yx²/σ_x²    (6.1.1)
The squared correlation is

ρ_yx² = σ_yx² / (σ_y²σ_x²)    (6.1.2)

The correlation is

ρ_yx = σ_yx / (σ_y σ_x)    (6.1.3)

ρ_yx is positive or negative depending upon the sign of the covariance σ_yx. By the Cauchy-Schwarz inequality, -1 ≤ ρ_yx ≤ 1. Since σ_yx = σ_xy, then ρ_yx = ρ_xy. It is demonstrated in Chapter 3 that Eq. 6.1.3 is the covariance of variables y and x after both have been standardized. We may simultaneously obtain the matrix of simple correlations between every pair of p+q measures in y and x by standardizing the entire vector v. Let

Δ = diag(Σ)    (6.1.4)

where Δ is the diagonal matrix of variances. Then the vector of standard scores is

z = Δ^{-1/2}(v-μ)    (6.1.5)

The variance-covariance matrix of z is the (p+q)-square correlation matrix

ℛ = Δ^{-1/2} Σ Δ^{-1/2}    (6.1.6)

The diagonal elements of ℛ are unity; the jk off-diagonal element is ρⱼₖ, the correlation of variates vⱼ and vₖ. The estimate of ℛ is obtained by performing the same operations on Σ̂ in Eq. 6.0.7. The estimated matrix is

R_w = D_w^{-1/2} V_w D_w^{-1/2}    (6.1.7)

with

D_w = diag(V_w)    (6.1.8)
D_w is the diagonal matrix of sample variances; D_w^{-1/2} contains the inverse standard deviations. The jk element of R_w is the sample correlation of vⱼ and vₖ. Let us represent the element as [r_w]ⱼₖ, or rⱼₖ. Then

rⱼₖ = [v_w]ⱼₖ / √([v_w]ⱼⱼ[v_w]ₖₖ)    (6.1.9)

[v_w]ⱼₖ is the sample covariance, and [v_w]ⱼⱼ and [v_w]ₖₖ are the variances from V_w. Eq. 6.1.9 may also be expressed in raw-score form. Let vᵢⱼ be the score for observation i on variable vⱼ; let vᵢₖ be the score on variable vₖ. Then Eq. 6.1.9 is
equivalently

rⱼₖ = Σᵢ(vᵢⱼ-v̄.ⱼ)(vᵢₖ-v̄.ₖ) / √[Σᵢ(vᵢⱼ-v̄.ⱼ)² Σᵢ(vᵢₖ-v̄.ₖ)²]    (6.1.9a)

v̄.ⱼ and v̄.ₖ are the means on vⱼ and vₖ, respectively, from v̄. (see Eq. 6.0.5). Expressions 6.1.9 and 6.1.9a are the simple Pearsonian correlation of vⱼ and vₖ, and agree with common elementary statistical presentations. The estimate of ρⱼₖ is not unbiased. Olkin and Pratt (1958) have provided a minimum-variance unbiased estimator for each coefficient, as a function of the sample value. The actual estimation is quite complex. However, for N of 9 or greater, a less-biased estimate is obtained by multiplying rⱼₖ by a simple correction factor (see Olkin, 1966); the adjusted estimate is given by Eq. 6.1.10.
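A short sketch of the estimator, with hypothetical data (assuming NumPy; the data matrix and seed below are illustrative), verifies that Eqs. 6.1.7 and 6.1.9a agree:

```python
import numpy as np

# Hypothetical N = 15 observations on p + q = 5 measures
rng = np.random.default_rng(2)
V = rng.normal(size=(15, 5))
Sw = (V - V.mean(0)).T @ (V - V.mean(0))   # sum of products of mean deviations (Eq. 6.0.6)
Vw = Sw / (V.shape[0] - 1)                 # unbiased covariance matrix (Eq. 6.0.7)

D = np.diag(1 / np.sqrt(np.diag(Vw)))      # D_w^{-1/2}: inverse standard deviations
Rw = D @ Vw @ D                            # correlation matrix (Eq. 6.1.7)
print(np.allclose(Rw, np.corrcoef(V, rowvar=False)))   # True: matches Eq. 6.1.9a
```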
Rw =
t-----+------
I1
Rw
prows
Rw(xy)
I
Rw(xx)
q rows
(6.1 .11)
q columns
p columns
The intercorrelations of predictors and criteria comprise the p xq submatrix Rw(yx).
The square of any element [rⱼₖ]² of R_w(yx) is the proportion of variance of yⱼ attributable to the particular predictor xₖ. No other variable is considered in evaluating rⱼₖ. Thus rⱼₖ is perhaps the simplest and most useful measure of association for interpreting the regression results. The x-variable having the highest simple correlation with yⱼ is the single best predictor; the x-variable having the lowest correlation with yⱼ is the poorest predictor. Further, the sign on rⱼₖ indicates the direction in which yⱼ and xₖ covary, positively or negatively. In addition to the sign and magnitude of the correlations, we may wish to test hypotheses about the population value ρⱼₖ. In particular, we shall consider three hypotheses:

1. We may wish to test whether there is any nonzero correlation of yⱼ and xₖ in the population. The null hypothesis is

H₀: ρⱼₖ = 0    (6.1.12)

To test H₀ against the alternative that ρⱼₖ ≠ 0, the test statistic is
t = rⱼₖ√(N-2) / √(1-rⱼₖ²)    (6.1.13)
where t follows a t distribution with N-2 degrees of freedom. H₀ is rejected against the two-tailed alternative if |t| exceeds the upper 100α/2 percentage point of t with N-2 degrees of freedom. One-tailed alternatives are considered in the usual manner. Also, tables of the critical values of |rⱼₖ|, above which H₀ is rejected, are provided in numerous books (e.g., Fisher and Yates, 1942; Edwards, 1960; Glass and Stanley, 1970). The test that ρⱼₖ = 0 is equivalent to a simultaneous test that βⱼₖ = βₖⱼ = 0, where βⱼₖ and βₖⱼ are the simple regression weights for yⱼ and xₖ.

2. It is possible to test whether ρⱼₖ is equal to a value other than zero. The null hypothesis is

H₀: ρⱼₖ = ρ*    (6.1.14)

where ρ* is any constant between -1 and 1. When the population ρⱼₖ is nonzero, the distribution of rⱼₖ is significantly more complex. For small samples, and ρⱼₖ to tenths of units, David (1938) has provided tables of the distribution of rⱼₖ in intervals of .05. To test H₀, rⱼₖ is computed and compared to David's tables, with parameters N and ρ*. From the table we read the probability of observing rⱼₖ, assuming ρ* to be the population value. If the probability is less than α, H₀ is rejected with confidence 1-α; otherwise H₀ is maintained. As N increases, rⱼₖ assumes a normal-like form regardless of ρⱼₖ. However, the closer |ρⱼₖ| is to unity, the higher N must be to attain normality. Fisher (1921) has provided a transformation of rⱼₖ that takes a normal form with smaller N than is required for rⱼₖ itself. Fisher's transformation is
z' = (1/2) log_e [(1+rⱼₖ)/(1-rⱼₖ)]    (6.1.15)
z' tends toward a normal distribution with large N. The expectation and variance of z' are approximately

ℰ(z') = (1/2) log_e [(1+ρⱼₖ)/(1-ρⱼₖ)] + ρⱼₖ/[2(N-1)]    (6.1.16)

and

𝒱(z') = 1/(N-1) + (4-ρⱼₖ²)/[2(N-1)²]    (6.1.17)
For even moderate N, the second term in Eq. 6.1.16 is small, and 𝒱(z') is close to 1/(N-3). These simpler forms are generally those seen in textbook presentations. To use z' to test H₀: ρⱼₖ = ρ*, let

ζ = (1/2) log_e [(1+ρ*)/(1-ρ*)]    (6.1.18)

The test statistic is

Z = (z'-ζ)√(N-3)    (6.1.19)
Z has approximately a standard normal distribution. H₀ is rejected with confidence 1-α if |Z| exceeds the 100α/2 upper percentage point of the standard normal distribution. Tables of the transformation from rⱼₖ to z' are provided in most elementary textbooks.

Fisher's transformation may also be used to obtain an interval estimate of ρⱼₖ. The interval is first computed for the transformed coefficient, ζ. The 1-α interval is

z' - Z_{α/2}/√(N-3) ≤ ζ ≤ z' + Z_{α/2}/√(N-3)    (6.1.20)

Both limits are referred to the z' tables. The corresponding rⱼₖ values are the limits for the 1-α interval on the population coefficient ρⱼₖ.
To compare the correlations between variates yⱼ and xₖ across two or more independent groups, the z' transforms may be treated as the dependent variable in analysis-of-variance designs. Each z' is assumed to be distributed normally with variance 1/(N-3). Nonorthogonal analysis models are required if the correlations are based upon different numbers of observations. In using z' transforms with programs designed for tests on means, N-3 is substituted for the number of observations in subclasses. That is, the z' score is treated as a mean score for a particular group, having N-3 independent observations. In this manner, the homogeneity of correlations may be tested across one or more sampling factors, just as we conduct tests on means.

3. We may test whether two variables have the same correlation with a third. This test is of particular value in comparing the simple predictive power of two x-variables, or in comparing the effects of the same xₖ on two criterion measures. The null hypothesis is

H₀: ρⱼₖ = ρⱼₗ    (6.1.21)

H₀ asserts that the correlation of yⱼ and xₖ is the same as that of yⱼ and xₗ. Since the two correlations are from a common population, they are not independent. The standard error of the difference, d = rⱼₖ - rⱼₗ, must take into account the correlation of xₖ and xₗ, that is, rₖₗ. It is shown in Olkin and Siotani (1964) that an appropriate test statistic is

Z = (rⱼₖ - rⱼₗ) / σ̂_d    (6.1.22)

where

σ̂_d² = [1/(N-1)] [(1-rⱼₖ²)² + (1-rⱼₗ²)² - 2rₖₗ³ - (2rₖₗ - rⱼₖrⱼₗ)(1 - rⱼₖ² - rⱼₗ² - rₖₗ²)]    (6.1.23)

Z is distributed in standard normal form. H₀ is rejected in favor of the two-sided alternative if |Z| exceeds the 100α/2 upper percentage point of the unit normal distribution.
Example The intercorrelations of the five measures of Table 3.3.1 are given in Chapter 3:

       y₁    y₂     x₁    x₂    x₃
R_w = [1.00               (Symmetric)]  y₁
      [ .72  1.00                    ]  y₂
      [ .75   .61 | 1.00             ]  x₁
      [ .17   .63 |  .19  1.00       ]  x₂
      [ .40   .51 |  .56   .11  1.00 ]  x₃

The total sample size is N = 15. All of the correlations are positive. Let us test to see whether there is any significant correlation of y₁ and x₂. The correlation is r₁₂ = .17. The test statistic is

t = .17√(15-2) / √(1-.17²) = .62

t does not exceed the .05 critical t value with 13 degrees of freedom. The correlation is not significant.

To determine whether y₁ is predicted better than y₂ from x₁ alone, we may test

H₀: ρ₁₁ = ρ₂₁

against the alternative that ρ₁₁ > ρ₂₁. The difference is d = .75-.61 = .14. The variance is

σ̂_d² = (1/14) {(1-.75²)² + (1-.61²)² - 2(.72³) - [2(.72)-.75(.61)](1-.75²-.61²-.72²)}
     = .0204

and σ̂_d = .14. The test statistic is

Z = d/σ̂_d = .14/.14 = 1.00

Z does not exceed the .05 critical value of the standard normal distribution. Although the correlations are in the direction of H_A, they are not significantly different.
Tests on one or a few correlations are useful in providing information about particular hypothesized relationships. Correlations among many variables for the same sample of observations are not independent, however. Care should be taken to minimize the number of individual tests that are conducted. When many tests are made from the same data and are not independent, statistical error rates may be inflated sufficiently to invalidate all of them.
6.2
PARTIAL CORRELATION
The partial correlation of variates y₁ and y₂ is their simple correlation evaluated on the conditional distribution of q (≥ 1) additional variates, xₖ. Restricting the correlation to specific values of the x-variables is termed "holding the xₖ's constant," or "removing the effects of the xₖ." The partial correlation is the estimate of ρ₁₂ that would be obtained if all observations had identical scores on the xₖ measures. If all variables are normal, the conditional distributions of y₁ and y₂ have the same covariance matrix, regardless of the values of the xₖ assumed. Thus the partial correlation does not depend on which values of xₖ are observed.

Let v' = [y', x'] be the (p+q)-element normal vector random variable, with expectation and covariance matrices given by Eqs. 6.0.2 and 6.0.3, respectively. The conditional distribution of y, given x, is p-variate normal, with expectation given by Eq. 3.2.9. That is,

ℰ(y|x) = μ_y + Σ(yx)[Σ(xx)]^-1(x-μ_x)    (6.2.1)

The covariance matrix of y given x, regardless of the x-value, is given by Eq. 3.2.10. That is,

𝒱(y|x) = Σ(yy) - Σ(yx)[Σ(xx)]^-1Σ(xy) = Σ_{yy·x}    (6.2.2)
The estimate of Σ_{yy·x} is obtained from the partition of S_w in expression 6.0.6a. The sum of products for y given x is

S_E = S_w(yy) - S_w(yx)[S_w(xx)]^-1S_w(xy)    (6.2.3)

Note that Eq. 6.2.3 is the error sum of products, or sum of products of y given x, under the linear regression model, as in expression 5.1.11. Then

Σ̂_{yy·x} = [1/(N-q-1)] S_E = V_E    (6.2.4)

V_E is p×p symmetric. The diagonal elements of V_E are the variances of the p y-variates, eliminating the x-measures. The off-diagonal elements are the "adjusted" covariances. The diagonal elements of S_E are always smaller than the corresponding elements of S_w(yy). The sample partial correlations are obtained by standardizing S_E (or V_E); the jk element is

[r_e]ⱼₖ = [s_e]ⱼₖ / √([s_e]ⱼⱼ[s_e]ₖₖ)    (6.2.5)
Note that these are also the correlations of the errors obtained under the linear regression model in Chapter 4. The distribution of the partial correlation coefficient is the same as that of the simple correlation, but with q fewer degrees of freedom. Thus, to test the hypothesis that a partial correlation ρⱼₖ·x is equal to a specified value ρ*, we may apply Fisher's z' transformation (Eq. 6.1.15). Element [r_e]ⱼₖ and ρ* are transformed to z' and ζ, respectively. The test statistic is

Z = (z'-ζ)√(N-q-3)    (6.2.6)
H₀ is rejected if |Z| exceeds the critical Z-value from the standard normal distribution. Partial correlations for the data of Table 3.3.1 are given in Section 4.4.
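A minimal sketch of Eqs. 6.2.3-6.2.5, with hypothetical data (the matrices Y and X and the seed below are illustrative):

```python
import numpy as np

# Hypothetical data: p = 2 criteria, q = 3 covariables, N = 20
rng = np.random.default_rng(3)
Y, X = rng.normal(size=(20, 2)), rng.normal(size=(20, 3))
Yd, Xd = Y - Y.mean(0), X - X.mean(0)

Syy, Syx, Sxx = Yd.T @ Yd, Yd.T @ Xd, Xd.T @ Xd
SE = Syy - Syx @ np.linalg.solve(Sxx, Syx.T)   # sum of products of y given x (Eq. 6.2.3)

D = np.diag(1 / np.sqrt(np.diag(SE)))
RE = D @ SE @ D                                # partial correlation matrix (Eq. 6.2.5)
print(RE.round(2))
```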
6.3
MULTIPLE CORRELATION
The multiple correlation coefficient, like the simple or partial correlation, is a measure of association of two random variables. Unlike the former, the multiple correlation is the correlation of one random variable y with a linear combination of q (≥ 1) variates xₖ. Weights are chosen for the xₖ so as to maximize the relationship of the linear function with y. In the special case with q = 1, the multiple correlation is the ordinary simple correlation of y and x. Individual x-variables may have both positive and negative correlations with y. The multiple correlation is defined to be only positive, with limits 0 and 1.

The weights that maximize the correlation are the vector of regression coefficients of y on x. Let β be the q×1 vector, assuming that all variables are expressed as mean deviations. The multiple correlation is the simple correlation of the vector of y-values and the linear combinations Xβ. Under the univariate multiple regression model y = Xβ + ε, the vector Xβ is the set of scores predicted from the x-variables; ŷ = Xβ̂ is the sample value (Eq. 4.2.17). Thus, the multiple correlation is an index of association of observed and predicted values; the better the prediction, the higher the correlational measure.

The multiple correlation may be found by computing the predicted scores and their correlation with the observed values. Or we may apply Eq. 6.0.1 directly. Let the squared correlation be R². Then
R2= 'Y'(y)-'Y'(yJ x) r(y)
(6.3.1)
The terms of Eq. 6.3.1 may be obtained from~ (Eq. 6.0.3) with p= 1. The variance of y is ~(uu>=u2 • The variance of ygiven x (from Eq. 3.2.1 0) is the scalar 'Y'(yJx) =
u2-~(yxl[~(xx>J-1~(xyJ
{6.3.2)
Substituting in Eq. 6.3.1, we have (6.3.3)
Correlation
183
An estimate of R2 is obtained by substituting sections of Vw for I in Eq. 6.3.3. The sample value is A
R2 =
V (!JX)[V <xx)J-lV (xy) w
vww(yy)
w
(6.3.4)
Since Sw=(N-1)Vw, Eq. 6.3.4 may also be expressed in terms of the sum-ofproducts matrix, as follows: , S (yx)[S (xx)J-lS (xy) R2 = w (6.3.4a) sw (yy) w w
Eq. 6.3.4a can be analyzed to inspect the properties of R2 • In addition to being the squared correlation of predicted and observed y-scores, R2 is a ratio of their respective sums of squares. The sum of squared mean deviations for y is Sw
and with all variables as mean deviations,
y=XtJ The regression sum of squares is the sum of squared predicted scores. That is, SR=
y'y= fj'X'Xfj= Sw(yx![Sw<xx!]-lSw(xy)
Eq. 6.3.4a is the ratio of SR to Sw
(6.3.5)
The orthogonal regression weights are
u=T'P
(6.3.6)
184
Method
The sum of squares tor u is the regression sum of squares,
u'u=SR
(6.3.7)
Each uk 2 reflects the contribution to yvariation tor just variable X1,, eliminating all preceding variables x 1 , x 2 , . . . , xk-t· The total of these terms is the numerator of Eq. 6.3.4a. Thus (6.3.8) is the proportion of criterion variation attributable to just xk, eliminating X 1 , x 2 , •.• , xk-t· Pk may be multiplied by 100 to convert to a percentage measure. The sum of the P,. terms is the squared multiple correlation of y with all q predictors (6.3.9) Each component P" is termed the increment in R2 tor predictor xk. R may be tested tor departure from zero by means of a likelihood ratio test. The null hypothesis asserts that there is no correlation of y and all q x-variates; (6.3.1 0) The alternative is that the population correlation is greater than zero. The test statistic is R2 N-q-1 F=--,- . ---'-(6.3.11)
1-R2
q
H0 is rejected with confidence 1-a if F exceeds the 1OOa upper percentage point of the F distribution with q and N-q-1 degrees of freedom. Otherwise Ho is maintained. Testing Eq. 6.3.10 is equivalent to testing that the entire vector p is null, under the linear regression model in Chapter 5. Although we accept H0 that R is zero, the sample value may still be nonzero. The expected value, or average over many samples, is (6.3.12) Even if R is zero, sample values R tend to approach unity as q approaches N. To avoid obtaining a false image of strong predictive power, N must be large relative to q. If N-1 is equal to the number of predictors, the sample correlation is always unity. This can be seen since X is square and can itself be inverted. Then the predicted scores are
y=XP =
X(X'X)- 1X'y
=
xx- 1(X')- 1X'y
=y
(6.3.13)
Correlation
185
The sum of the squared predicted scores is equal to the sum of squared observed outcomes. There is little point to the prediction of N outcomes from an equal number of antecedents. No parsimony is gained but the complexities in interpreting linear functions X/3 are added.
Multiple Criteria When there is more than a single dependent variable, the multiple correlation of each criterion measure with all of the predictors may be obtained simultaneously, from summary matrices. If p > 1, then X (see Eq. 6.0.3) and Sw(yy) (see Eq. 6.0.6a) are pxp matrices. The p diagonal elements are the variances and observed sums of squares, respectively, for each outcome measure. ~(yx> and Sw(yx> are not vectors but pxq matrices. The regression weights form a qxp matrix
B= [S
10
(xx!]-1S.,}xy)
SR is the pxp sum of squares and cross products for regression: SR = Sw(yx>[Sw<xx!]-lSw<xy!
Like S,}YY>, SR has the sum of squares for each outcome measure in the diagonal positions. In both matrices these elements are determined without reference to the other variables in the set. That is, the results are the same as those that would have been obtained in univariate analyses of the particular measures alone. The squared multiple correlation of variate Y; with all q x-measures may be obtained from the jj diagonal elements of SR and Sw(yyJ. flz =
[s,.];;
(6.3.14)
[Sw(yy)];;
.I
The subscript j is necessary to distinguish the criterion variable; j has values from 1 top. The contribution of individual predictors to R/, eliminating earlier predictors, is obtained as in Eq. 6.3.8. The matrix of orthogonal weights is
U=T'B If u' k is the kth row of U, then the squares and products for xk, eliminating x1 through xk-l is the pxp matrix
The diagonal elements of SHare the uk 2 of Eq. 6.3.8 for separate measures. Any diagonal element may be divided by the corresponding element of Sw to estimate the increment in R/. That is, p. = Jk
[sh]n
(6.3.15)
[Sw(yy)];;
The total of the P;k across predictors (rows of U) is of criterion Y; with all x's (see Eq. 6.3.9).
R/, the squared
correlation
186
Method
A univariate test of the multiple correlation between one Y; and all x's may be conducted using Eq. 6.3.11. Multiple univariate tests are not independent, however, since the criterion variables are themselves intercorrelated. We may turn instead to one of several multivariate procedures. A simultaneous test of all the multiple correlations (all relationships between y and x) is equivalent to testing that the entire matrix B or !,
S (yy)=[8.31 6.76 Yt
w
6.76] Yt 10.56 Yz Yz
s,< yyl has N-1 = 14 degrees of freedom. The regression sum of products of y, and Yz on q = 3 predictors is found from the computational forms of Section 4.5. SR =
[4.4.3872 Y1
4.38] Yt 7.23 Yz
Yz
The squared correlations of y1 with the three x-variables, and y2 with the three x-variables are
Rt z = 8.31 4. 72 = 568 ( ) . Yt
R22 =
170 ~;6 = .685
(Yz)
The correlations are R, = .75 and R2 = .83, respectively. We may test Ho: R; = 0 for the two correlations separately. For y"
.568 15-3-1 ft = 1-.568. 3
4.82
F, has 3 and 11 degrees of freedom. H0 is rejected at a= .05 (but not at .01).
Correlation
.685 Fz= 1-.685.
15-3-1 3
187
7.97
Ho is rejected at a:=.01. The two F tests are not independent. Either the likelihood ratio criterion or step-down techniques should be employed for a single decision about the relationship of they- and x-variables. The orthogonal regression coefficients for these data are obtained in Section 5.1. Without the constant term, U= [
1.97]x1 1.72 x2 , eliminatingx1 .64 X3 , eliminating X1 and x2
2.17 .10
-.09
Yz
Y1
The proportion of variation in y 1 attributable to Pn =
x1 is
2.172
8 _31 = .566
For x 2 , eliminating xi> pl2
For x3 , eliminating X 1 and x2 ,
=
.102
8.31
=.001
(-.09)2 P, 3 =8,31=.001 After attributing all possible variation to x 1 , little remains to be attributed additionally to Xz or x3 • The sum of the P1 k is .568, identically R12 • The same procedure may be followed for y2 using the second column of U.
6.4
CANONICAL CORRELATION
The canonical correlation coefficient is the simple correlation of two random variables, each of which is a linear function of two or more original variates. Canonical correlation analysis can provide proportion-of-shared-variation measures to describe the relationships of two sets of random variables, each set consisting of multiple measures. The technique is useful for understanding the overlap of information content in two batteries of tests. To describe completely the relationships between p measures y1 and q measures xk, s = min (p, q) correlations are necessary. Each is a productmoment correlation of one linear combination of the y1 and a separate compound of the xk. The weights defining the linear functions are chosen so as to maximize the association measure. Tests of significance may be employed to determine whether the relationship between the two sets of variables can be described by a subset of the s compounds.
188
Method
When either p or q is unity, the specific form is the multiple correlation of the one measure with the other set. The weights are the partial regression coefficients. When both p and q are unity, the measure is the simple correlation of yand x. Let us define the desirable properties of a measure of association between two linear composites. Represent the ith canonical correlation between y and x as R 1 (i = 1, 2, ... , s). R 1 is the simple correlation of linear compounds (6.4.1 a) and (6.4.1b) a;'u) and a;'x) are weight vectors having p and q elements, respectively. y* and x* are p- and q-element mean deviation vectors, y* = y-p.y and x* = X-fl-x· The vectors a; are determined so as to maximize R 1. v 1 and w 1 are termed the canonical variates for y and x. a/u) and a/x) may be multiplied by any arbitrary constants without affecting R;. The usual convention of restricting the weight vectors to unit variance is employed. The variances of the composites are r(v;) = [a_/Y)] 'r(y*)a;(y)
(6.4.2a) and
r(w.J
=
[a;<xl] 'r(x*)a/x)
=
[a/x)]'l;<xxla/xl
(6.4.2b)
When Eqs. 6.4.2a and b are restricted to unit value, both composites have unit variance. The expectation of both composites is zero. That is, :?(v;) = :5'( [a/Yl] 'y*) =
[a;
=0 =:?(w;)
The correlation between
V;
(6.4.3)
and w; across all observations is
R;= 8'[ V;-8'(v;)] [ W;-8'(w;)] =
:?[({a/ul}'y*)({a/xl}'x*)']
=
[a;
(6.4.4) :?(y*[x*]') = l;
Correlation
189
respect to the a;. R 1 is the maximum correlation of any linear composites of the two sets of variables. The second canonical correlation (R 2 ) is the maximum correlation between composites and
(6.4.5)
is subject to the additional constraint that v2 and w2 are uncorrelated with and w1 , respectively. That is,
R2
if(VrV2)
=
Vr
[a/YlJ'l;
=0 =if(W1 W2 )
(6.4.6)
The third correlation (R 3 ) is the maximum correlation of composites defined by a 3
[A
(6.4.7a)
[A<xlJ 'l;<xx) A<x) = 1
(6.4.7b)
and (6.4.8) It is shown in Anderson (1958, pp. 289ft.), that maximum values for R; and associated vectors a; and a/x> may be obtained as the solutions of the homogeneous equations (6.4.9) where R 1 satisfies (6.4.1 0) Estimates of R;, a/Y>, and a;<xJ are obtained by substituting sample values for l;, and solving the eigenequations. That is, let the estimate of I be Vw as in Eq. 6.0.7, partitioned for y- and x-measures. In the sample, Eq. 6.4.9 is (6.4.11) where R/ satisfies (6.4.12) The sum of products Sw may be used in place of Vw in Eqs. 6.4.11 and 6.4.12. The eigenvalues are identical; the vectors from Sw must be multiplied by~ to maintain identities 6.4.7 and 6.4.8. The sample value a; is represented as a;.
190
Method
Solutions to Eq. 6.4.11 may be found through the expansion of 6.4.12 in the sample. Computer routines discussed in Chapter 2 are essential for most realdata problems. The two-matrix problem of Eq. 6.4.11 may be transformed to a single-matrix equation (see Chapter 2) by factoring Vw(uul by the Cholesky method. That is, let (6.4.13) and (6.4.14) Then Eq. 6.4.11 becomes {Vw(yxl[V,/xxlJ-1Vw(xyl_RlTwT:C} (Tw -1)'9i = 0
(6.4.15)
and or
(C-R/1)9;=0
(6.4.15a)
R;2 and 9; are the eigenvalues and eigenvectors of the symmetric matrix C. a/"l is obtai ned from 9; by Eq. 6.4.14. Let _kul be the p xs matrix of sample coefficients for they-variables; A(uJ has columns a/ul. The sample coefficients for the x-variables comprise A(x) with columns a/xl. A(xJ may be obtained directly by solving the reverse eigenequations: (6.4.16) Without recomputing, however, A(x) may be obtained by A(xl = [Vw(xxl] -1v,}xYl ,\
(6.4.17)
B is the qxp matrix of regression coefficients of they-variates on the xk measures. R. is the diagonal matrix of sample canonical correlations Note, however, that the canonical correlations are symmetric; the solution of Eq. 6.4.16 yields identical roots to those of 6.4.11. A'Yl and Atxl may be estimated in a form for standardized measures Y; and xk. This is accomplished by substitution of the correlation matrix Rw for the covariance matrix Vw in Eq. 6.4.11. Or the raw weights may be multiplied by appropriate vectors of variable standard deviations. Let
v7f1.
(6.4.18a) and Dw(xx) =
diag {Vw(xxl}
(6.4.18b)
Dw(uul and Dwcxxl are the pxp and qxq diagonal matrices of variances for the y- and x-variates, respectively. The matrices of standardized coefficients are (6.4.19a)
Correlation
191
and (6.4.19b) Each weight is multiplied by the respective variable standard deviation. Like regression coefficients, the canonical weights are dependent upon the selection of variables as well as their scales. The relative or absolute magnitudes of the weights should be interpreted with extreme caution. Addition or deletion of variables in either set may produce major alterations in the remaining coefficients. Although standardization to A
[D,/x.rlJ-li2V1/.rxJA<xJ, Rw<xx)A(x)
(6.4.20b)
Equations 6.4.20a and b are pxs and qxs matrices of correlations between each of the original variates and each of the canonical or composite variates. The correlations are more stable than either the raw or standardized weights under the addition and deletion of variables. They more accurately reflect the contribution of each measure to canonical variates vi and wi. Meredith (1964) has provided an important addition to the theory of canonical correlation, for the case when there is a correction for unreliability in some or all of the measures. Assume that 8 is a pxp diagonal matrix of variances of the errors of measurement in y. The reliability-adjusted covariance matrix is l:-8, which may be substituted in Eq. 6.4.11 in place of l:""Vw. Meredith indicates that the following likelihood ratio tests may be employed even when 8 is estimated from the sample, if N is sufficiently large. The canonical correlations may be used to estimate the percentage of variation in they- or x-measures attributable to each linear combination of the other set. The squared canonical correlation is divided by the total of the standardized variances of the variables in the opposing set. Since standardized variables have unit variance, the denominators are simply p and q, respectively. R/fp is the proportion of variation in the set of y-measures accounted for by W;"" [a/xl] 'x*; Rlfq is the proportion of variation in the set of x-mesures accounted for by V;"" [a/Yl]'y*. The overall percentage of variance accounted for in the p yvariates by the q x-measures is
(6.4.21) Each of the R; 2 represents an orthogonal component. Simply, the percentage of variance of the x-variates attributable to they set, is
(6.4.22)
192
Method
Other measures of shared variance of two sets of variables are given by Stewart and Love (1968) and by Miller and Farr (1971 ). As with the other correlation measures, the canonical correlations may be tested for departure from zero. A simultaneous test of all s correlations will reveal whether or not there is any correlation between the sets of variables in the population. We may then test whether all correlations minus the largest, all minus the largest two, and so on, are jointly significant. In this manner we may be able to isolate one or a small number of linear composites of the measures that describe all significant relationships between the two sets. For each significant dimension, the correlation of the linear composite with the original variables may be inspected to determine which measures contribute to the correlation, and in which direction. The joint test of nullity of all s canonical correlations is made using Wilk's A criterion. The test of H0 : R;=O, for all i, is equivalent to the test of the nullity of :I,(uxl or the entire qxp matrix B. The test statistic in terms of correlations is s
AI=
Il (1-R i=l
l
(6.4.23)
2)
Bartlett's transformation is (6.4.24)
where
m= [N-1-(p+q+1)/2]
(6.4.25)
The test statistic follows a chi-square distribution, with pq degrees of freedom. H0 is rejected with confidence 1- a, if x2 exceeds the 1OOa upper percentage point of Xpq 2 • If H0 is rejected, we conclude that at least one (and perhaps more) of the correlations is nonzero. It can be seen that this test is equivalent to the test of the contribution of all q predictor variables to regression (Eq. 5.2.17), since s
AI=
Il (1-R}) 1=1
=11-RRI ISw -Sw(y.r) [Sw(x.rl] -lSw (yy)
(xy)
I
!Su(yy)l !SRI
Thus predictable criterion variation is neither increased nor decreased in transforming the data to the canonical variables. Instead, it is reallocated so that the best prediction can be made in terms of the fewest measures. Sequential tests of all canonical correlations, all except the first (largest), all but the first two, and so on, are also possible. The test of the joint nullity of correlations j through sis made with s
Aj =
Il (1-R; i=J
2)
(6.4.26)
Correlation
193
The test statistic is (6.4.26a) where m is defined by Eq. 6.4.25. ;x2 has (p-j+1)(q-j+1) degrees of freedom. The null hypothesis is rejected if ;x2 exceeds the 1OOa upper percentage point of xzp-j+IJ
6.5
SAMPLE PROBLEM 1-CREATIVITY AND ACHIEVEMENT
The sum of products for the creativity-intelligence-achievement example, with N = 60, is given in Section 3.4, and partitioned in Section 4.6 for predictors and criteria. The matrix is
178.85 110.35
(Symmetric)
1 186.18
1
------------;---------------------------------------65.79 21.29 35.23 40.74 4.56 45.38 31.79
56.69 159.00 17.66 1 5.55 59.oo 40.49 24.29 3.38 33.58 1 27.47 31.98 20.22 I 5.56 -3.06 40.17 I 28.65 18.56 1.19 29.77 117.56 1
59.oo 25.13 18.56 38.66 24.42
59.00 1.19 24.42 14.98
43.35 22.27 19.12
104.70 57.79
64.79
194
Method
Reducing Sw to correlation form, and extracting Rw
.21 .17
R ,Cyxl=[·64 u; .54
.40 .32
.34 .39
.05 .23
.33 .29
.30] Synthesis .27 Evaluation
~
~-::::·
~ '? 'J,
All of the correlations are positive. Intelligence has the highest relationship with both outcomes. The t-value required to reject H0 : P;k,;;; 0, with 60-2 =58 degrees of freedom and a= .05, is 1.67. Substituting in Eq. 6.1.13, we find that the corresponding value of r;k must be .214 or greater. Thus intelligence, consequences remote, and possible jobs have significant relationships with the criterion measures, while consequences obvious is not significantly related to either outcome. This does not imply, however, that the multivariate tests will be significant. Most of the interaction terms are individually related to the criteria, although they may not explain criterion variance that is not attributable to the simpler measures. The error sum of products, or conditional sum of products of y given x, is
=[178.85 110.35]-[81.18 69.41] 110.35 186,18 69.41 67.22 = [97.67 40.94 Synthesis
40.94] 118.97
Synthesis Evaluation
Evaluation
Extracting the diagonal elements of SE and Sw
97 ·67 178.85
R1z_ 178 ·85 -
.45
Synthesis
R2 2 = 186 ' 18 -
.36
Evaluation
-
and
118· 97 186.18
The univariate tests of the two correlations yield F statistics of F1 =6.17 and F2 = 4.20. Both exceed the .05 critical F value, with 7 and 52 degrees of freedom; R1 and Rz are greater than zero. Taken individually, both criteria are significantly explained by scores on the seven predictor variables. To determine which measures are the best predictors, we may inspect RwcyxJ, or examine orthogonal variance components in a par-
Correlation
195
ticular order of antecedents. The regression coefficients are .86 1.00 .33 .28 .16 .32 -.04 -.14 -.16 .24 -.04 .:....16 .25 .20
Intelligence (x,) Consequencesob~ous(~)
Consequences remote (x 3) B= Possible jobs (x4) Cons. obvious x intelligence (xs) Cons. remote X intelligence (x6 ) Possible jobs x intelligence (x 7 ) Synthesis Evaluation The triangular Cholesky factor of estimates are
u=
8.56 1.97 1.12 T'B = .07 -.54 .74 1.35 Synthesis
Swcxx)
is T (from Section 5.4). The orthogonal
7.38 x, 1.61 x 2 , eliminating x, 2.42 x 3 , eliminating x, and x 2 -.68 x 4 , eliminating x~> x 2 , and X3 1.57 Xs, eliminating x,, X2, X3 , and x4 -.40 x 6 , eliminating x~> x 2 , X 3 , X 4 , and x 5 1.11 X 7 , eliminating all others Evaluation
The proportions of variation attributable to intelligence are Pu
=
8.56 2 178 _85 = .41
Synthesis
P2 ,
=
7.38 2 186 _18 = .29
Evaluation
The additional proportion of variation due to creativity (three measures) is 1.972 +1.12 2 +.072 178 _85
.03
Synthesis
1.612+2.42 2 +.68 2 186 _18
.05
Evaluation
The proportions of additional variation attributable to interactions, above and beyond all other measures, are .54 2 +.74 2 +1.35 2 178 _85
.01
Synthesis
1.572 +.40 2 +1.11 2 186 _18
.02
Evaluation
The numerators of the measures are the diagonal elements of matrices SHp used for testing regression hypotheses in Chapter 5. The sum of the proportion measures for either criterion is the respective squared multiple correlation (for example, .41 +.03+ .01 = .45).
196
Method
Step-down tests of R1 and R 2 are obtained from Cholesky factors of SE and Sw
TE = [9.88 4.14
0.00] 10.09
T*= [13.37 8.25
0.00] 10.87
The step-down statistic for syntheses (R 1 ) is equivalent to the first univariate test. F{=
13.372 -9.88 2 52-1+1 9.882 7
6.17
F{ exceeds the .05 critical F-value with 7 and 52 degrees of freedom. The step-
down statistic for evaluation, eliminating synthesis (R 2 ) is
1Q.87L1Q.Q92 52-2+1 10.092 7
F;=
1.17
F; has 7 and 51 degrees of freedom. F; does not exceed the critical F-value. Thus we may conclude that the correlation of divergent achievement with intelligence, creativity, and their interactions is concentrated in the variable synthesis. Synthesis in turn is presumably basic to the more complex behaviors of evaluation. When scores on synthesis are accounted for, evaluation does not add any further predictable variation, at least with these seven antecedent measures. The canonical correlations between the two sets of variates are the (s= 2) unique solutions of or I
I
-97.67R? -40.94 -40.94 -118.97Rl =O
Expanding the determinant and solving for the roots, R 1 2 = .50 and These values are substituted in
R22 = .06.
The nontrivial solutions for a; are a~= [.40 .23) and a;= [.60 -.67], respectively (after multiplying by the constant V59). These vectors are juxtaposed as the columns of A
Acrl = BA'u):R-' 1.00 .28 .16 -.04 -.16 -.04
.25
.86 .33 .32 .60] [1 ;Y.SQ 0 -.14 [ .40 0 1;Y.Q6 .24 .23 -.67 -.16 .20
J
Correlation
197
.85 .10 .27 -.23 .19 -.47 -.06 .28 -.01 -1.04 -.08 .32 .21 .05
canoni~al
variates may be standardized to assist in The weights for the interpretation or may be transformed to correlations with the original variates. The matrices of intercorrelations are found from Eqs. 6.4.20aand b. The diag~ anal matrices of variances are Dw(yu>= diag (Sw("">/59) and Dw(.r.r>= diag (Sw(.r.r>/59). Then 1
.32] -.56
Synthesis Evaluation
--+---Vadate 1 ,95
:so
[Dw(.xx>]-li2Vw<.x.x>A.<x>
=
.56 ,58 .18 .49 .45
Variate 2 .09 .06 -.42 .13 -.87 .01 -.06
Intelligence Cons. obvious Cons. remote Possible jobs Cons. obv.xintell. Cons. remotexintell. Pass. jobsXintell.
Although the raw weights are useful for scoring subjects on the canonical variates, the intercorrelations are r;nore stable and should be used for interpretation. The differences between the weights and the correlations are often large. The first canonical correlatioh is the correlation of a linear combination of predictors, heavily loaded with intblligence, with a composite of the two criteria. Both criterion measures have si~ilar positive weights. This suggests that the two outcome measures share corr)mon behaviors, which in turn resemble those of intelligence. The first canonical correlation explains .50/2 = .25 of all variation in the criterion measures and .'50/7 = .07 of variation in the predictors. In contrast, the second corr;elation accounts for only .06/2 = .03 of the variation in the criterion measures. Thus the total percentage of variation in the criteria attributable to the seven 'independent variables is 1oox (.25+.03) = 28 percent. The variation in indep~ndent variables attributable to criteria is 100X(.07+.01) = 8 percent. As witM the other correlation measures, the designation of independent and depende;nt variables is arbitrary, and does not affect the index of assoCiation. · The test of the nullity of both canonical correlations is equivalent to the overall test of the relationship between the two sets of measures under theregression model. From the correlations, the x 2 approximation is x 2 =-5l(log •. 50+1og •. 94) I =40.98
I i
198
Method
x2
has 14 degrees of freedom. The null hypothesis of no relationship between the sets of variates is rejected at a= .05. Test of all correlations excluding the largest (which in this case is the test of only the smaller) is made with A2 = (1-.06) = .94. The x2 statistic is
x2 =-541og •. 94 =3.38 This x2 has (2-1)(7-1)=6 degrees of freedom. H0 is not rejected; R 2 2 is zero in the population. All-linear relationship between the two sets is concentrated in the correlation of the first pair of composites of the variables.
6.6 CONDENSING THE VARIATES: PRINCIPAL COMPONENTS When confronted with a large number of variables measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all or most of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variates. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions which are mutually orthogonal. A set of p measures is transformed into p composites or principal components. A subset of the components may account for a large portion of variation of the original measures. Let y represent the p-element vector random variable consisting of p outcome measures, with expectation lif(y)= p.
(6.6.1)
V(y)=l:
(6.6.2)
and covariance matrix
Let a 1 be the p-element vector defining the first component or first composite of the outcome measures. The principal component transformation is
(6.6.3) v1 has expectation
(6.6.4) The variance of
V1
is
(6.6.5) To provide a single maximum value for V(v1), it is necessary to restrict the values that a1 can have. The convenient and usual restriction is to set the length
Correlation
199
!
of a 1 to unity; that is, a;a 1 = 1. With this limitation, we may rewrite the expression for~'(vd and differentiate with respect to a 1 . Thus
'Y(vl)=a;Ial = a; Ia1- A. 1(a; a 1-1) 1
Setting the first derivatives with respect to a 1 to zero, we have
I
2!a1-2A.1a1= 0 and
(6.6.6)
(1-A.ll)al = 0 Equation 6.6.6 has nontrivial solutions if and only if
II-A.lll =
o
(6.6.7)
i
A.1 and a 1 are the largest characteristic root and vector of I, with A.1 also being the variance of the composite a;y. · The second principal compon~nt is the maximal solution (6.6.8)
a 2 is restricted additionally to be orthogonal to a1. Since eigenvectors of symmetric matrices are orthogonal, a 2 is the second eigenvector of I. The associated root is A. 2 . a 2 summariz~s the maximal amount of criterion variation under the additional restriction, and is also confined to unit length. The process of defining components continues in the manner of Eqs. 6.6.6 and 6.6.7 until all p roots and vectors are extracted from I. The relationships among the components may be summarized as follows. Let A be the pxp diagonal matrix of characteristic roots; let A be the pxp matrix having ai as the ith column. Then 1
A'A=I
(6.6.9)
A'IA=A
(6 ..6.1 0)
p
(6.6.11)
II,I = IAI =II A; i=l p
tr (I)= tr (A)=
2:
A.i
(6.6.12)
i=l
io
We may use property 6.6.12 obtain a measure of the degree to which criterion variation is summarized by any one of the components. The proportion of variation of the original measures attributable to the ith principal component is
(6.6.13) i=l
The sum of all the Pi is, of course, 1.00.
200
Method
The component weights may provide insight into the basic structure of the measures. For example, we may wish to know what single linear function summarizes the maximum share of variation of the p measures. For this we might inspect the first principal component and the associated weights. The principal component solution is only meaningful when (1) the original measures are all in the same units and (2) components are extracted from the covariance matrix. Standardizing dissimilar measures to provide an arbitrarily similar unit does not yield components that account for the structure of the original measures. There is no simple relationship between the components of a covariance matrix and the respective correlation matrix, except when all the variances are equal. There are times, however, when the components of a correlation matrix yield useful interpretive information. The coefficients in A may be examined in an attempt to understand the structure of the set of variates. Care should be exercised in their interpretation. The magnitudes are dependent upon the particular set of variables and may change considerably under the addition or deletion of measures. The signs of the coefficients tend to be more stable. Thus the components are best interpreted in terms of similarities and differences of the original variables that explain "total" variation. To aid in the interpretation of component weights, subsets of the weights may be used to estimate the original covariance or correlation matrix. The difference between the original matrix and the estimated matrix can be used for information about the omitted components. First it is necessary to scale the vectors ai to length equal to the variance of the component A;. This is accomplished by multiplying each element of a; by v7-:;. The set of p rescaled weight vectors is contained in the pxp matrix A*. That is, A*=AA112
(6.6.14)
where Al 12 is the diagonal matrix of square roots of the p eigenvalues. The length of each column of A* (that is, at) is A;, since [A*] I A*= A112AI AA112
=A
(6.6.15)
The original covariance matrix can be reproduced directly from A*: A*[A*] I= AA112(AA112)1 =AAA 1 =~
(6.6.16)
Equation 6.6.16 follows from 6.6.10, since a square matrix orthonormal by columns is of necessity also orthonormal by rows. The product A*[A*] 1 is the sum of matrix products of the p vectors of adjusted weights (a;)'. That is, p
A*[A*] 1 =
2: a;*(at} f=l
1
(6.6.17)
I \
Correlation
201
To understand the variation and cbvariation attributable to the first component, :1; may be reproduced omitting a;. That is, p
l:.*=}:; a;(a;)' (6.6.18) I !=2 I Either l:*, or :l;-:1;*= a;( a;)', can be used for interpretation. In similar fashion, l:* may be obtained by omitting lmy of the other components or by omitting particular subsets, such as those with smallest variance. If components have originally been extracted from a cbrrelation matrix ffi., the product A*[A*]' will also be the correlation matrix. : To further facilitate understa?ding the principal component variates, the weights may be converted to standardized form. Each element of the resulting matrix is the product-moment correlation of an original variate Yi and a component variable v1• Let .1 = diag (i) be a diagonal matrix of variances from :1;. Then the variables X components matrix of correlations (termed "factor loadings" in factor analysis models) is · 1
a-1/2~Al/2 = a-li2A*
(6.6.19)
!
The intercorrelations are somewhat more stable than the raw weights under addition or deletion of variables, 'and have limits ± 1. When components are extracted from the correlation matrix, the adjustment (Eq. 6.6.19) is unnecessary, since A= I. The correlations of components and variates is always the scaled matrix necessary to reproduce ffi. by Eq. 6.6.16, si nee 1
I
a-tt2AA112(.1-1t2AA112)' = a-tt2:1;.1-tt2 1
= ffi,
(6.6.20)
Estimates of the principal components are obtained by substituting the estimate of i for l: in eigenequatiOns 6.6.6 and 6.6.8, and solving for >..1 and &1. The estimate of l: may be either the variance-covariance matrix of mean deviations (3.3.8), the pooled within-groups matrix (3.3.22), or the residual matrix in a regression model (4.4.17). Test of the hypothesis that a principal component does not account for any of the variation in the outcome measures is not particularly meaningful. Even the smallest root may be increased in importance by the addition of measures of the specific behaviors represented by the component. If it is judged that sufficient variation is account~d for by a subset of components, however, scores for individuals on just these variables may be obtained. Thejth component score for observation i is ·vu=v;&i (6.6.21) 1
I
If the number of components required is much less than p, an economy of representation is achieved. Otherwise, (he simple? original measures provide more information for quantitative analyst
Sample Problem 1- Creativity and Achievement In the creativity-intelligence-achievement example, principal components may be extracted from the corre!lation matrix for the conceptually similar creativity and divergent achievrment measures (synthesis, evaluation,
I I
202
Method
Table 6.6.1
Principal Components of Divergent Achievement and Creativity I ntercorrel ations Component Weights
Component
Variance (eigenvalue)
Percentage of Variation Accounted for
Synthesis
Evaluation
Cons. Obv.
Cons. Rem.
1 2 3 4 5
2.42 1.10 .76 .41 .32
48.3 22.0 15.1 8.2 6.4
-.76 .29 -.37 .40 -.18
-.74 .39 -.33 -.40 .18
-.52 -.77 -.16 -.20 -.26
-.64 .33 .65 -.09 -.22
Pass. Jobs
-.77 -.41 .25 .20 .37
consequences remote the obvious, and possible jobs). The adjusted sample coefficients A:, and the corresponding component variances are given in Table 6.6.1. The largest component accounts for 100(2.42/5) = 48.3 percent of between-individual variation. The compound is an effect common to all measures. This is indicated by the similar negative weights for the five variables in constituting v1 . All of the remaining components also account for large portions of variation and cannot be ignored. The second, accounting for an additional 22 percent of test variation, appears to be a contrast between the "consequence obvious" and "possible jobs" measures and the remaining measures. These two tests in particular do not require the degree of abstraction required by the three other measures. The operations in listing obvious event consequences and jobs are noticeably more concrete. Numerous interpretations of the remaining weights may be forwarded. However, in view of the finding that a much smaller number of components does not summarize a large portion of the variation of the original five measures, such interpretation is unnecessary. A return to the original measures rather than linear combinations of them is both simpler and more lucid.
Sample Problem 3- Dental Calculus Reduction From the toothpaste-additive study, the covariance matrix of the calculus measures for the six anterior mandibular teeth is
i=
1.38 1.02 .81 1.07 .74 .91
(Symmetric) 2.62 2.18 2.60 1.67 1.01
Jt,.Q:~.
'If
4.24 4.31 2.43 1.12
It~~. c<91') . II)&
'If;.
6.19 3.49 1'.30
It~~.
'If
<..e~'t
3.09 1.21
1.58
<..&l'f
Right canine Right lateral incisor Right central incisor Left central incisor Left lateral incisor Leftcanine
<..&l'f
C /q Cq c&l') &l')tr. te,.<9 1');1} .,.<9/. 1'r<9/ . <9!; '11~ . & II)Ci II)Ci I')Ci Ct,s
<9t&
so,.
so,.
so,.
o,.
Correlation
203
Solving for the roots of 11-5..111 =0, the six eigenvalues in descending order are the diagonal elements of A. Th~t is,
I
A= diag (13.82, 2.02, 1.28, .88, .71, .38)
Their sum is 19.09, equal to the su~I of the diagonal elements of 1. The matrix of coefficients, with vectors normalized to the corresponding eigenvalues, is I I
Component
3
4 5 6 I -.57 .83 -.08 -.14 .32 Right canine .49 -1.26 .58 -.55 .59 -.22 -.09 Right lateral incisor -1.87 -.35 -.52! -.51 -.28 .17 Right central incisor A*=AA 112 = -2.38 -.50 .13 .12 .46 -.23 Left central incisor -1.51 .04 .77 .21 -.31 .26 Left lateral incisor -.72 .79 .30 -.44 -.16 -.35 Left canine I The inverse matrix of variable stand~rd deviations is I .i- 1' 2 =diag(1/v'1:38, 1/V2.62, 1/v'4.24, 1tv'6.19, 1;V3.Q9, 1tvT.58} 1
2
I
1
A
~
A
Multiplying a- 112A* yields the matrix of intercorrelations between the components and the original measures. These are presented in Table 6.6.2. The first component accounts for about 72 percent of between-individual variation in the calculus measures. The effect appears to be common to all teeth, with the contribution to the cbmponent increasing as the teeth approach the front of the mouth. Probably thib component is simply reflecting the greater mean and variance of calculus accumulation of the anterior teeth. The second component, accounting for an additional 10.6 percent of the variation, is largely a comparison ~f the frontmost teeth with the canines. The component probably reflects both differential usage in eating and brushing, as well as proximity to the salivary glands. i I
Table 6.6.2
Correlations of Prindpal Components with Calculus Measures for Six Anterior Teeth ·
Component
Variance (eigenvalue)
Percentage of Variation Right Accounted for Canine
1 2 3 4 5 6
13.82 2.02 1.28 .88 .71 .38
72.4 10.6 6.7 4.6 3.7 2.0
Sum
19.09
100.0
Right Lateral Incisor
Right Central Incisor
Left Central Incisor
Left Lateral Incisor
Left Canine
.~7
-.78 .36 -.34 .36 - . 14 -.05
-.91 -.17 -.25 -.25 -.13 .08
-.95 -.20 .05 .05 .19 -.09
-.86 .02 .44 .12 -.18 .15
-.57 .63 .24 -.35 -.13 -.28
1.~0
1.00
1.00
1.00
1.00
1.00
-.39
:lo
-.Q7 -.12 .42
(Sum of squares)
204
Method
The third component accounts for an additional 6.7 percent of score variation. This component reflects differential calculus formation on the two sides of the mouth. It appears that there exists a tendency of individuals to have a greater calculus formation on one or the other side of the mouth. This may reflect lateral favoritism in biting and/or brushing. The remaining three components account for a total of about 10 percent of between-individual variation. As such, they depict only minor trends in the teeth data.
CHAPTER
7
Analysis of Variance: Models 7.1
CONSTRUCTING THE MODEL Univariate Case
Through the analysis-of-variance model, attention is centered on subpopulation means. The model may be formulated in a fashion similar to that for linear regression. Consider that observations have been sampled from, or assigned to, J distinct populations. The groups are distinguishable by a single sampling characteristic or factor. The sample of observations from one population comprises a group or subclass of observations. These are sometimes referred to as the observations of a single cell in the sampling design. The additive linear model for an observation in one subclass is (7.1.1) where Y;; is the random outcome for observation i in thejth subclass (j= 1, 2, ... , J). f.L is a fixed parameter representing the mean of all observations. a; is the (fixed) mean deviation from f.L, of the observations in subclassj. Eij is the random
deviation of observation i in subclass j from the subclass mean; that is, the unique deviation from f.L+a;. The analysis-of-variance model may be appropriate whenever the observations are subdivided into identifiable groups, whether these are naturally or experimentally formed. The purposes of fitting the model to sample data are three: testing the fit of the model, estimating the fixed parameters, and providing interpretations of the random outcomes in terms of the fixed group-membership variable(s). Testing the fit of the model usually involves deciding whether or not the entire model "explains" the observations. Is a large or significant portion of variation in YiJ nonrandom and attributable to an overall population constant f.L, plus systematic group deviations from f.L? Is variation attributable to f.L+a;, as compared with that attributable to E, a large proportion of the variation of the Yu? Also we test the model to decide whether variation in the outcome may be more economically attributed to a subset of terms. That is, do the data indicate that the a;'s are not zero? Are there nonzero differences among the a.i (or 205
206
Method
among subpopulation means) that make the term worthy of inclusion in the model? After deciding that a subset of terms in the model is nonzero, we may wish to obtain "best" estimates from the sample data, of only those parameters and their dispersions. Ultimately we face the question of how the results Yii can best be explained in terms of the constructs underlying the measures and subgroup definitions. Here we must combine forces with the theory of the discipline from which the data are drawn. Suppose that J = 4 and that there is a single observation per group. The models for all subjects may be written as Yu =~+a1 Y1z=~
+az
Y1s= JL y14=~
+Eu +E12 +a3 +E13 +a4+E14
(7.1.2)
The set of models may be formulated in vector notation. First, let us fill in zeroone coefficients. Eq. 7.1.2 is, equivalently, Yu = 1J-t+1a1+0az+Oa3+0a4+Eu Y12 = 1~+0a1 +1a2+0as+Oa4+E1z Y1s= 1J-t+Oa1+0a2+1a3+0a4+E13 y14= 1~+0a1+0a2 +0a3+1a4+E14
(7.1.2a)
Any observation can be expressed as the product of a vector of ones and zeros and a vector of fixed effects. For example,
(7.1.3)
Yu = [1
or Yu = [1
0
0
0)8*+E11
(7.1.3a)
8* is an m x 1 vector of analysis-of-variance effects or parameters. In Eq. 7.1.3, m = 5. Testing the fit of the model involves deciding whether some or all of the terms of 9* are null or are equal to each other. For example, testing that a1 =a2=a3=a4 is equivalent to testing that linear combinations of the rows of 9* are null. Estimation in analysis of variance involves obtaining best estimates of 9* or linear combinations of the elements. These objectives are identical to those encountered relative to the vector of regression weights {J in the linear regression model (Chapter 4). Models for all N observations (Eq. 7.1.2a) may also be displayed in matrix form. That is,
Y12 = [11 [Yll] Y13 1 Y14
(7.1.4)
1
0
Analysis of Variance: Models
207
or
y=AO*+e
(7. 1.4a)
where y and e are N x 1 random vectors, and A is the N x m analysis-of-variance mode/ matrix. A has a row for each observation and a column corresponding to each of them parameters. The entry is 1 if the subclass has the corresponding parameter in its model and 0 if it does not. The unknown coefficients or effects to be estimated are contained in vector 8*. The same vector multiples each row of A to produce the set of models given in Eq. 7.1 .2. The terms in Eq. 7.1.4a parallel those of the linear regression model y=X{l +e. However, A contains only ones and zeros, designating whether the observation has the corresponding effect or not. The regression matrix X contains measured scores indicating the extent to which the parameter is contained in each equation. That is, the measured variables reflect varying degrees of presence of the independent variables; in Eq. 7.1.4 the observation either belongs to a group having a particular parameter or to a qualitatively distinct group lacking the parameter. The analysis-of-variance distributional assumptions are two in number. First, each Eij is assumed to be normally distributed with expectation zero, and common variance CT2 ; that is, (7.1 .5) Further, pairs Eu and ei'j' (ij""" i' j') are assumed independent of each other. Under normality, this implies zero covariance of pairs of errors. The entire distributional assumption may be written as
(7. 1 .6)
or (7.1.6a) If a cell in the design contains responses for more than a single observation, all observations in the group will have the same model-that is, the same row of A. It is convenient to represent the models only in terms of subclass means. Thus, for the one-way design, we may write ~-t+at = 1L +az
Y·t =
Y·z
(7.1.7)
or
y. = AO*+e.
(7.1.7a)
Let N; be the number of observations in subgroup j. Then Y·;=~iyij!N; and The distributional assumptions are that each E.; follows a normal distribution, with expectation zero and variance CT2! N;. That is,
E-;=~iEi/NJ.
E.; ~
JV (0,
U" 21N;)
208
Method
Further, every pair e.; and e.k (j -,f k) is independently distributed. This assumption may be represented in matrix terms by letting the N; be the nonzero elements of a diagonal matrix D: {7.1.8) The complete distributional assumption is
{7.1.9) or {7.1.9a) There is no restriction on the frequencies N;. However the variance u 2 is the same for all groups. The matrix model for higher-order designs may be constructed in the same fashion. Consider a situation in which subjects have been sampled from six populations, representing the crossing of two experimental factors. Let factors A and B have two and three levels, respectively. The linear model for the mean in subclass jk is (7.1.1 0) where JL is a population effect common to all groups; IX; and f3k are- the deviations from JL due to the mean having been drawn from level j of the first sampling factor and level k of the second; {IX/3);1, is the nonadditive, or interaction, effect specific to the mean of subclass jk. {1X{3);k represents the extent to which treatment effects IX; are not equally effective across all levels of the other factor(s) in the design; E·;k is the random deviation of the observed mean from the model, JL+IX;+{3k+(1X{3);k; e.;k is assumed to have expectation zero, and variance u 2 1N;k· In terms of the larger model, Eq. 7.1.1 0, the purposes of the analysis include deciding whether there are nonzero differences among population meansthat is, whether the model JL+1X;+{3k+{1X{3);k· fits the data. In particular, we must decide whether variation among means is confined to a subset of terms in the model. For example, are some treatment effects significant, but not others, or are perhaps, the interactions null? Ultimately, we will want to obtain "best" estimates of those terms deemed nonzero, to aid in interpreting the observed outcomes. Models for all six subclass means may be written as + {1X/3) 11 Y·n = JL+1X1 +{1X{3)12 Y·12 = JL+1X1 +{1X{3)13 Y·13 = JL+IX1 +{1X{3b Y·21 = JL +IXz+/31 +(1X{3b Y·22 = JL +IX2 +f3z +(1X{3}zs Y·23 = JL +IX2 +f3a
+e.n +E-12 +E-13 +E-21 +E-22 +E.za
{7.1.11)
Analysis of Variance: Models
2lJ9
Equivalently, we may write f.L
Y·u Y·12 Y·1a
Y·z1 Y·zz Y·z3
1 1 1
0 1 0 0 0 0 1 0 0 0 0 1 0 1 1 0 0 0
0
1 0 1 0
1 0
0
0 1 0
0
0
0 0 1 0
0 0 0 1 0
0
a1 az (31 f3z (33
0
0 0 0 0 0 0
(a(J)u
1 0 0 0 0 1 0 0 1 0 0 0 0 0 1
(a(3)1z (a(3)t3 (a{3)z1 (afJ)zz (a(J)n
€.11
+
E-iz E-13 E-21 E.zz €.23
(7.1.12)
or
y. = AO*+E.
(7.1.12a)
As in the one-way example, y. is the J-element mean vector. J is the total number of subclasses of observations, formed by the crossing of the two design factors (J = 6 in the example). A is the Jxm analysis-of-variance model matrix. Thus 6* is the mx1 vector of unknown coefficients or effects. E. is the JX1 vector of mean errors. Should any of the J subclasses contain no observations, corresponding rows of y., A, and E. may be deleted, leaving J 0 rows for that number of subclasses with subjects. The vector of parameters 6* is unaffected by missing cells. Constructing mean and error vectors for designs of any dimensionality follows the same pattern. So, for example, for a 2x2x2 factorial arrangement, the model matrix for main effects a, (3, andy alone is 1 0 1 0 0 1 0 0 1 0 0 1 1 0 0
A=
0 0
0 0 0
0 1 1
1 0
1
(7.1.13a)
0 1 0 0
0
1
0 1 1 0 0 1 0 1
with
[6*]' = [f.L
a1
az
(31
f3z
/'1
Yz]
(7.1.13b)
Model matrices for nested designs may also be constructed in this manner. For example, in a situation where school classes have been randomly assigned to treatment conditions, and pupils randomly assigned to classes, we may have the following setup: Treatment2
Treatment 1 Class 1
1
Class 2
Class 3
Class 4
I
Class 5
210
Method
Here classes 1 and 2 are subjected to the first experimental condition and classes 3, 4, and 5 are subjected to the second. In addition, there may be further effects crossed with treatments and classes (such as ability or sex groups) that have the same definition for all treatment conditions. The mean model for class i under condition j, assuming no other factors, is (7.1.14) where a; is the (fixed) treatment effect and biw is the random class effect under treatment condition j. For all groups, the models are +bw> +E.u Y·u = 1L+a1 +b2(1) +E•21 Y·z1 =IL+a1 +b1(2) +€·12 Y·12 = !L +az +b2(2) +E•22 Y·22 = fL +a2 +ba(2)+E·32 Y·32 =f.L
(7.1.14a)
In Eq. 7.1.14 the b-effects are not the same for a 1 as they are for a 2 since experimental groups are not crossed with conditions. Specifically, class i for treatment 1 is not the same as class i for treatment 2. We say that classes are nested within experimental conditions and represent the effect of the ith class within the jth experimental condition only as bi(j)· Since the subjects (pupils) are also different from class to class, subjects are said to be nested within classes. If the entire design were crossed with pupil sex, subjects would be nested within sex X class combinations or interactions. In this sense, all crossed designs with replications have subjects nested within the smallest cell formed by the crossing of the design factors.
Multivariate Case When each subject has been measured on more than a single outcome measure (dependent variable), a multivariate form of the analysis-of-variance model is appropriate. The correlations among the measures may form any arbitrary pattern. For joint confidence intervals or tests of significance to be interpretable, the measures must form a conceptually meaningful set. Assume that each subject in the one-way design with J=4, has been measured on p (;31) outcome variables. Each outcome variable is a random variable that comprises a portion of the total response. For the p measures, the one-way model for observation i in subclass j is
y;/Pl)
[y;/1) yij(2)
= [!L
(1)
!L (2)
f.Lcr'l]
+ [a;w
aPl
···
a/v>] + kuw
E;}2>
···
Eurv>]
(7.1.15) The model is formed by juxtaposing the p separate univariate models. A superscript is added to designate the outcome variable; Yuck) is the score for subject ij on dependent variable yk; !L'k) and ar) are. the parameter values for y" alone. It is necessary to estimate parameters !L and a; for each variate. Confidence intervals and tests of hypotheses are made tor the total set of measures. In
Analysis of Variance: Models
211
this we must take into account the nonzero intercorrelations among the variates. Equation 7.1.15 is the sum of p-element row vectors (7.1.15a) where YI; is a single vector observation, or one row of the entire data matrix. With J = 4 and one observation per cell, the complete set of models is
y;1 = p./ +a; y;z =!A-' +a~ y;3=1A-' +a;
+e;1 +e;z . +e; 3
(7.1.16)
+a~+e;4
y;4 =!A-' or
y12 (1) y13 (1) y14 (1)
l""'
y11 (2) y12(2) y13(2) y14 (2)
y,'"'ll'
Y12 Y13(p)
Y14
=
JL(2)
Of"
0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 1
a1(1)
JL(p)l
a1l2J
a1(p)
a4l21
~4(p)
:
~4(1)
Eu(pl E
(p)
12
E E
13
(pi
(7.1.16a)
(p)
14
Expression 7.1.16a may be represented as
Y=A0*+E
(7.1.17)
where Y is the Nxp data matrix having row vectors y;;. E is the Nxp matrix of residuals for all observations with rows e;;. A is the analysis-of-variance model matrix, and is identical to A for the univariate case (7.1.4); the structure underlying group membership is the same for each variate. 8* is the mxp matrix of effects having a column for each of the p variates,
(7.1.18)
8*, like the regression matrix B, contains the parameters to be estimated from sample data and upon which we conduct tests of significance. The distributional assumptions concerning E are: (1) The errors for the p measures for one subject follow a multivariate normal distribution with expectation 0, and common pxp covariance matrix'$. That is,
e[;
~
ff(O',l:)
(7.1.19)
212
Method
The elements of e[; represent multiple errors for the same observation (1 row of E), which are generally intercorrelated. (2) The errors for any pair of subjects are independent. That is, (7.1.20)
(ij # i'j')
where 0 is the pxp null matrix. If E has N rows, we may represent the covariance matrix of all Np elements in Kronecker product form (see Chapter 4). The covariance matrix of elements of E is r(E) =I®~= diag (~, ~, ... , ~)
r~OOOJ
= 0 ~ 0 0
0 0 ~ 0 0 0 0 ~
(7.1.21)
All submatrices ~ and 0 are pxp symmetric. The matrices on the diagonal are ~,the covariance matrix of any one row of E; the rs off-diagonal matrix 0 is the matrix of covariances of elements in the rth and sth row of E (r # s). Thus the distribution of the entire matrix E is (7.1.22) where 0 is an Nxp null matrix. If there are multiple observations in each subclass, it is convenient to represent the model and distributional assumptions in terms of vector means. For the one-way case, the mean model is (7.1.23) where y:;, p.!, aj, and e:; are 1 xp vectors. The means are y.5 =2. 1y1;1N; and e.;= 2.;e1;1 N;. Juxtaposing the models for J = 4 groups, Y·1(p)] Y·2(p) Y·a(pl = Y·4(p)
[11
... M(vlJ ~l(p)
1 1
...
(X4(p)
E.l(p)J E E
(p)
•2 (p)
•3
(7.1.24)
E •4 (p)
or
Y.=A@*+E.
(7.1.24a)
where A and@* are the same as given in Eq. 7.1.16. Y. and E. are both Jxp. Rows of E. are distributed independently'and with expectation zero. The covariance matrix of a row of means is inversely proportional to the number of
Analysis of Variance: Models
213
observations in the group. N; is the number of subjects in subclass j. The covariance matrix of row vector e:,; is (7.1.25) I
The square roots of the diagonal elements of Eq. 7.1.25 are the standard errors of the means for variates Yk (that is, u-,JVFi;). Let D be the diagonal matrix of subclass frequences as in Eq. 7.1.8. Then the covariance matrix of E. is · I
JV(E.)=D- 1 ®l:~diag(~,I, ~2 !, ~3 !, ~ 4 I)
(7.1.26)
The complete distribution of E. is . (7.1.27) Again, under the general-mode) analysis, the N; elements are not restricted to being equal or proportional to qne another. The variance-covariance matrix of observations, t, is the same for 1;1ll groups. In parallel fashion, we may write the model for the multivariate two-way or many-way situation. For example, in the toothpaste evaluation (Sample Problem 3) there are two years of experimentation, five experimental conditions, and six outcome measures taken from the anterior mandibular teeth. The six-variate model for means is (7.1.28) with j=1. 2. and k=1, 2, ... , 5. Juxtaposing the vector means for the ten groups, we have Eq. 7.1.24a withY. and~- as 10x6 matrices. The model matrix is
A=
0 1 0 1 0 1 0 1 0 0 0 0 0 0
1 0 0 0 0 1 0 0 0 0
0 1 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0
0 0
1 0 0 0 0
0 1 0 0 0 0
o· o
0
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 (7.1.29)
The final 10 columns comprise ah order-ten identity matrix that multiplies the interaction effects. 8* is an 18x6 matrix:
[ 8*]' = [~-t
a,
a2
p,
P2
p3 I
134
Ps
(a/3)11
(a/3),2
... (ap)zs]. (7.1.30)
Each vector has the corresponding effect for all criterion variables, for example ... ,8,(6)]. p~ = [,8, (1),8, (2)
214
Method
In the example, three treatment-year combinations have no observations. The diagonal matrix of subclass frequencies is
D = diag (8, 9, 7, 5, 0, 28, 0, 24, 0, 26).
(7.1.31)
The J=10 rows of V., A, and E. may be reduced to J 0 =7 by eliminating rows corresponding to the null subclasses. The resulting model matrix is
A=
0 0 1 0 1 0 0 1 0 0
0 0 0 1 0 0
0 1 0 0 0 0 0
0 0 1 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 0 1 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 1 0
0 0 0 0 0 0 0
0 0 0 0 0 0 (7.1.32)
It can be seen that three interactions cannot be estimated, since three columns of the reduced A have no nonzero elements. This is an effect of the null subclasses. Parameters which are unique to the empty groups cannot be estimated or tested. In this example, there are three fewer degrees of freedom for interaction. The covariance matrix of the 7 X6 matrix E. is given by Eq. 7.1.26.
1 1 1 1 1 1 ) 1 r(E.)=D- 1 C>
(7.1.33)
X is the 6x6 matrix of variances and covariances of the six tooth measures. Zero elements have been eliminated from D, resulting in a 7x7 diagonal matrix. Error variances and standard errors may be drawn from the diagonal of o- 1 ®X. For example, the standard error of the mean for the first tooth, in group (1, 2), is v0'1 2 /9. For further illustration, consider the remedial instruction experiment (Sample Problem 5). Three cognitive achievement measures are hypothesized to increase with a televised remedial program. With class means as the unit of analysis, 18 randomly selected classes conducted the "usual" instructional program, without special consideration for absenteeism, and 19 experimental classes utilized machine-programmed curriculum materials. These materials were assigned to students who had been absent, upon their return. Sex of the student is considered as an additional design factor, crossed with classes and experimental conditions. The model for the mean vector in class k, within treatment group j, for sex I, is (7.1.34) with j=1, 2; /=1, 2; and k=1, 2, ... , 19 for j=1, and k=1, 2, ... , 18 for j=2. All terms are three-element vectors; p., a;, and {j 1 are fixed effects; and ckw is a random class effect. The model matrix for all 74 class-sex combinations is extensive, and so is represented here for only the first two classes within each treatment condition.
Analysis of Variance: Models
1 0
0
1 0 0 1 0 1 0 0 0 0
0 0 0 0
1 0 0 1 1 0 0
0 0 1 0 0 0 0 1 0 0 1 0 0 0 0
0
0
Q A=
1 0 0 1 1 0
0
0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0
215
0
0 0 1 0 0 1 0 0 0 0 0 0
(7.1.35)
0 0 1 1
The parameters are
[W]' = [t-t cl(l)
at
a2
C2(1J
Pt p2 (ap)11 c1(2) c~(2)]
(aPJ12
(aPJ21
(aPb (7.1.36)
The mean matrix and residuals Y. and E., are 74 x 3. The covariance matrix of any row of E. is the 3 x 3 matrix :t. The design will require further consideration for tests of significance, since both class and residual effects are random. A model of this sort, with one or more fixed effects and at least two sources of random variation, is termed the analysis-of-variance mixed model. Univariate model7.1.14 is another example.
7.2
LEAST-SQUARES ESTIMATION FOR ANALYSIS-OFVARIANCE MODELS
Let us assume a general analysis-of-variance model for means, as in the preceding section (Eq. 7.1.24a). Thqt is, I
Y.,= A0*+E.
(7.2.1)
where Y. is the Jxp matrix mean; A is the Jxm model matrix; 0* is the mxp matrix of fixed parameters; and E. is the J x p matrix of mean errors. J is the number of subclasses of observations (or J 0 , should some subclasses be empty); m is the total number of parameters in the analysis-of-variance model; and p is the number of criterion measures. In addition, let D be the J x J diagonal matrix of subclass frequencies, N;, and the total number of observations N=:t; N;. Multivariate point estimates of the effects in 0* consist of p separate sets of univariate estimates. Thus we ~hall consider estimation only for the more general multivariate case. It is in th.e construction of joint confidence intervals and test statistics for the p outcoflles that the multivariate results depart from the univariate. The distributional assumption for E. is given by Eq. 7.1.27. Rows of E. are independently distributed in p-va~iate normal fashion, with expectation 0' and pxp covariance matrix (1 /N;):t. For the entire matrix, 1
E. ~ JV(O,
o-l ® :t)
(7.2.2)
216
Method
We shall seek the estimate of 8* that yields the minimal sum of squared residuals in the sample. Minimizing the sum of squared errors for each variable separately will satisfy this condition. However, each row of the sample E. must be weighted by the number of subjects in the corresponding group of observations. Let [e.];k be the jk element of sample matrix E. and let e.k be the kth column. The quantity to be minimized to obtain the estimate~* is
q = Lk L 1 N;[ e.];k 2 = Lk e:koe.k
=tr (E:DE.) = tr (Y.-A~*)'D(Y.-A~*)
(7.2.3)
In the univariate case, Eq. 7.2.3 reduces to the scalar q = e'.De .; in the multivariate case, q is the sum of p sums of squares, one for each criterion measure. Minimization of q yields the estimates which give minimal sum of squared residuals for every outcome variable. We may examine the distribution of errors for any one variate to see the role of matrix D in Eq. 7.2.3 (which does not appear in the regression model). The distribution of one column of E. (mean errors for J groups on one variable) is given by univariate expression 7.1.9. That is,
(7.2.4) We append the subscript k to indicate that yk is one of several criterion measures. e.k is the sample value for E-k·
The elements of E.k do not generally have equal variance, since the subclass frequencies may be unequal. To minimize e:ke·k would yield an estimate of 0* that relies more heavily upon smaller groups of subjects than upon larger groups. If the subclass frequencies are arbitrary and unrelated to finite population sizes, then q should be the sum of squared equal-variance components of E-k· We may obtain these components as follows. First, factor D- 1 by the Cholesky method. The Cholesky factor Tis a diagonal matrix with elements 1/VN;. Second, we may define a new vector
(7.2.5) The variance-covariance matrix of 'Y is Y(y) = Y(T- 1 E.k) = r-tak20-t(T-t)'
= ak 2T- 1TT'(T- 1)' =
ak2
1
(7.2.6)
The elements of 'Yare uncorrelated; under normality, they are also independent. Each 'Y; has variance ai. Thus y, as defined by Eq. 7.2.5, is a transformation of E.k to an equal number of independent, equal-variance random variables.
Analysis of Variance: Models
217
The sum of squares of the ,elements of y is
= e:dT- 1)'T- 1e.k =e:koe.k
(7.2.7)
Expression 7.2.7 pertains onlyto variate Yk· In the multivariate case, the sum of squares is computed for each variate and their total is minimized in the sample (Eq. 7.2.3) to provide the least-squares estimate of 0*. Let us minimize Eq. 7.2.3, first assuming that p = 1, with y. and 0* as vectors. The sum to be minimizfd is
q = (y.-AO*)'D(y.-AlJ*) =
y:Dy.~2[0*]'A'Dy.+ [O*]'A'DAO*
(7.2.8)
Setting the first derivative of q with respect to {)* to zero can yield a minimum value. The derivative is I
aq
~
(7.2.9)
a[O*]' =-2A'Dy.+2A'DAO* Setting the derivative to zero, the normal equations are I
(7.2.10)
(A'DA)O*=A'Dy.
In the general p-variate case, the expression to be minimized is the sum of terms like Eq. 7.2.8, summed across all measures (see Eq. 7.2.3). This value has a minimum when the least-squares criterion is individually met for every yk-variable. The multivariate normal equations are similar to Eq. 7.2.1 0, but with a column of 0* andY. for each criterion variable: (7.2.11)
(A'DA)@*=A'DY. I
~
I
Normal equations 7.2.1 0 or 7.2.11 are easily solved for 0* if we premultiply both sides by (A'DA)- 1 . We note, however, that A'DA is of deficient rank, and cannot be inverted. Since there may be more columns (parameters) in A than subclasses, A'DA is of order m, but of rank no greater than the smaller dimension J. Further, columns of A ar'e simple linear combinations of one another. For example, in the one-way matrix
0 0 0] 1
0
0
0 0
1 0
0 1
the first column is equal to the simple sum of all the others. In the model matrix for the 2x3 crossed design (Eq. 7.1.12) the difference of the two dimensions is still greater. Column 1 is equal to the sum of columns 2 and 3; to the sum of columns 4, 5, and 6; and to the sum of columns 7 through 12.
218
Method
In common terminology, we should like to estimate five parameters in the one-way model, but have only four degrees of freedom among means (1 for f.L plus 3 for the a,;). In the two-way model, we would like to estimate 12 parameters, but have only six degrees of freedom among means (1 for f.L, 1 for the a,;, 2 for the f3k, and 2 for the [af3h). Thus the analysis-of-variance model is termed the model of deficient rank. The maximum degrees of freedom available among means is J, the number of groups with observations, also the number of rows of A. This is the maximal number of effects that may be estimated, or the maximum rank of (A'DA). If J-J0 subclasses have no observations, then the corresponding diagonal element of Dis zero, and the rank is restricted still further to J0 • That is, we may only estimate as many effects as the total number of subclasses with at least one observation. Three classes of solutions have been proposed for the situation. The first of these is the definition of a generalized inverse for A'DA, which does notrestrict the rank of the matrix. This solution basically involves rearranging columns of A, or ignoring columns, such that only a JXJ submatrix of A' DA, of rank J, is inverted. The second and most common solution involves bringing A up to rank m by adding m-J rows, which destroy the linear dependencies among columns. These rows form additional equations, which are usually interpreted as restrictions upon the parameters. For example, the one-way univariate model is
The dependency among the columns may be broken by adding an equation · restricting 4
2: a 1 =0 j=l The model with the restriction is
or
y:
=
A*fJ*+E:
(A*)'DA* is of full rank and columns of A* are not multiples of one another.
The normal equations may be solved through usual inversion procedures, using A* in place of A. In parallel fashion in the two-way design, we may restrict
Analysis of Variance: Models
219
the sum of the a;'s, the sum of the f3k's, and the sums of interactions across each main effect to zero. These restrictions are equations added to the model for the purpose of bringing A to the rank m, and making the parameters estimable. The restrictions in the multivariate case are the same but apply to the vector effects; for example, Lj aj= 0. A third solution for the model of deficient rank is to select and estimate I (=;s; J) linear combinations of the parameters that are of scientific interest. These combinations are expressed as c'ontrasts among subpopulation means and can be explicitly chosen in accordance with the experimental design and procedures. This solution has the advantage of providing direct results concerning the experimental outcomes. It is usually differences among group means that are of concern in analysis-of-variance models. If we do not restrict the sum of the parameters, the connotation is avoided that experimental effects somehow nullify one another. (Are the summative effects of three and six hours of sensory deprivation null?) The linear combinations of m parameters are I alternate parameters, which are selected so that all may be estimated. The alternate parameters replace the original set in the matrix 8*. To maintain the equality of the two sides of the model equation, a modified mod.el matrix is also required. The substitution of alternate parameters and performing the necessary model alterations is termed reparameterization.
7.3
REPARAMETERIZATION
1
In the analysis-of-variance mqdel (Eq. 7.2.1 ), the matrix A has rank less than its column order m. Thus all parameters are not estimable. We may choose instead I linear combinations of the parameters, which are all uniquely estimable. I is the rank of the model for significance testing, and must satisfy I =;s;J. The weights defining /linear combinations of rows of 8* are constructed as the rows of an /xm contrast matrix L, such that its rank is also/. That is, I parameters in L8* = 8 are to be estimated, as an alternative to the estimation of them parameters in 8*. The "new" parameters for all variates are contained in the lxp matrix 8. , Since the multivariate formulation reduces to the univariate when p= 1 (8* and 8 are vectors), the more'. inclusive multivariate model is emphasized here. Reparameterizing the independent variables is not affected by the number of criteria. The alternate paramete'rs in 8, like those of 8*, have the same form for every outcome measure, although their numerical values will differ. For the alternate set of pararheters, 8, an alternate model matrix is also necessary. Represent the model matrix for the I alternate parameters as a Jxl matrix K. K is termed the basis for the design. The model as originally formulated is
Y:=A8*+E.
(7.3.1)
220
Method
Eq. 7.3.1 is reparameterized to the alternate model, Y. = K(L8*) + E.
=
(7.3.2)
K8+E.
Through the careful selection of contrasts in L, K has full rank /. Eq. 7.3.2 is then identical in form to the regression model (Eq. 4.3.3), with Kin place of X and 8 in place of B. To estimate 8, we utilize the multivariate least-squares criterion and minimize tr (:E:OE.). The normal equations for the reparameterized model are the same as Eq. 7.2.11 but with K and 8 in place of A and 0*. That is,
(K'DK)S
K'DY.
=
(7.3.3)
K'DK has full rank and can be inverted. The least-squares solution for the estimate of 8 is obtained by premultiplying both sides of the equations by (K' DK)- 1. Then (K'DK)- 1 (K'DK)0= (K'DK)- 1 K'DY. and
0= (K'DK)- 1 K'DY.
(7.3.4)
The usual research procedure involves (1) constructing A and 8*, (2) selecting Land evaluating the basis K, and (3) the estimation of 0 according to Eq. 7.3.4. Comparison of Eqs. 7.3.1 and 7.3.2 reveals that A is replaced by the product KL in the reparameterized model. This sets the conditions for evaluating K once Lis selected. That is,
A= KL
(7.3.5)
and AL' = KLL'
Then AL' (LL' )- 1 = KLL' (LL')- 1
and K
=
AL'(LL')- 1
(7.3.6)
K is the Jx/ model matrix for the alternate parameters; L is the /xm contrast matrix, so that LL' is /x /. Consider the one-way model with four subclasses of observations. The mean models are
~l~
1
0
0 y .- 1 0 1 0
1
Y.=
A
0 0
0 0 1
0
~ [~J+E. 1
(7.3.7)
a,l
a'4
or
W+E.
(7.3.7a)
Analysis of Variance: Models
221
The model matrix A has only four rows but m = 5 columns. The final column is exactly the first column minus the sum of the remaining three. Thus the rank of A is J = 4. Given the limitation, we can select at most four linear combinations of the elements in 0* that are uniquely estimable. For exemplary purposes, let us assume that the fourth experimental group is a control group and that the other three groups represent three experimental conditions. A useful set of parameters might be an overall population constant, plus the contrasts of a 1 , a 2 , and a 3 respectively, with a 4 • For this situation, the contrast matrix is
1 1/4 1/4 1/4
l
0 L= 0
1 0
0
0
0 0
0 1 0
1
1/4] -1 -1 -1
(7.3.8)
The alternate set of parameters is L0*=0
(7.3.9)
0, like 0*, has a column for each criterion measure. Rows of 0 are contrasts among the parameters in 0*. Once Lis selected, K may be computed by Eq. 7.3.6. K=AL'(LL')- 1
[25 1 -
[' 25 -
~J[r
0 1 1.25 0 1.25 0 0 1.25 -1 -1 -1
1 0 1 1.25 0 1.25 0 0 1.25 -1 -1 -1
0
mr 0
0 0 2 1 1 2 1 1
rr -~5]
0 0 .75 -.25 -.25 .75 -.25 -.25 -.25 .75
-25]
.75 -.2.5 ['1.00 00 -.25 .75 -.25 - 1.00 -.25 -.25 .75 1.00 -.25 -.25 -.25
(7.3.10)
K may be employed in the estimation of 0 by Eq. 7.3.4. Let us examine how this reparameterization has altered the original models.
222
Method
For simplicity, let p = 1. The four mean models are
Y·1 =~-t+al· Y·2 = 1-t +a2 Y·3=~-t Y·4=~-t
+e.l +e.2 +a3 +e.3 +a4+e.4
In reparameterizing, we have substituted parameters a 1-a4 for the a 1. In order to maintain the equalities, terms have been added to or subtracted from I-t· The reformulated models are y. = K9+e. or, in extended form, ·
Y·1 = 1 (~-t+1/4 LJ a 1)+.75(a1-a4)-.25(a2-a4)-.25(a3-a4)+e.1 Y·2= 1 (~-t+1/4 LJ aJ)-.25(a1-a4)+.75(a2-a4}-.25(aca4)+e.2
y. 3= 1 (~-t+1/4
LJ a 1)-.25(a1-a4)-.25(a2-a4)+.75(a3-a4)+e.3 Y·4 = 1 (~-t+1 /4 LJ a 1) -.25(a1-a4)-.25(a2-a4}-.25(a3-a4)+e.4
The reader may wish to verify that these models constitute a simple regrouping of terms in the original models. The total value is not changed. For example, the first model is ~-t+1/4
L; a 1+.75a1-.75a4-.25a2+.25a4-.25as+.25a4 = ~-t+al
The first term in the alternate models absorbs the scaling effects of the reparameterization. As a result, it estimates a somewhat complex constant term. The contrasts among parameters replace the original parameters themselves. Some manipulation of coefficients is required in order to assure that the same parameters are estimated in all four equations. The effects in 9 are estimated from the constants inK, rather than from the zero-one coefficients of the original model matrix. The vectors of K appear similar to contrast vectors, but should not be mistaken for them. Other than as an intermediate step in the computations, the basis assumes no further importance in the analysis.
Conditions for the Selection of Contrasts There are two conditions which the contrast matrix L must satisfy. First, since A= KL (Eq. 7.3.5), rows of L must be linear combinations of the rows of A. Only if this is the case can the relationship between the two matrices be accounted for by matrix multiplication with K. That is, each row of K defines a linear combination of the rows of L to yield a row of A. If rows of L are not linear functions of the rows of A, then matrix multiplication cannot transform one into the other. The rows of L in the one-way example (Eq. 7.3.8) are linear combinations of the rows of A, as required. The first row of Lis the average of the rows of A, 1/4 ~1 a' 1 ; the remaining three rows of L are the differences of rows one through three of A, respectively, and row four, a' 1-a' 4 • Note that an estimate of p. alone is not possible in the presence of main effects a 1, since the vector [1 0 0 0 0] cannot be expressed as a function of the rows of A. Should tests of significance
Analysis of Variance: Models
223
indicate that all a; are equal, however, the first term in 0 will estimate the population mean, p,+a. common to all groups. Condition one is not difficult to meet. Almost all contrasts among parameters a; (or the other parameters in higher-order designs) can obviously and simply be expressed as functions of the rows of the model matrix. Thus it is easiest to construct the weight vectors first by considering the parameter matrix and then to check that the contrasts selected satisfy the dependency criterion. The second condition is that the /x m matrix L must be of rank /. This is necessary to assure the inversion of LL' in Eq. 7.3.6. It requires (1) that we do not choose more contrasts than the rank of A. Rows of L are functions of rows of A, and the rank of A is maximally J. Any more than J linear functions will necessarily be dependent upon the first J functions. That is, we have no more than J degrees of freedom among means and can estimate no more than J alternate parameters (although we may estimate fewer). It is also required that (2) the rows of L, or contrasts, cannot be exact linear functions of one another. For example, in the one-way model, the contrast vector for comparing a 1 -a2 , or [0 1 -1 0 0], could not be included in L as the fourth vector in place of a 3 -a4 • This vector is exactly the difference of the weights for a 1 -a4 and a 2 -a4 , the second and third rows of L. On the other hand, the rows of L need not be orthogonal.
Some Simple Cases Let us consider the reparameterization of a two-way model (univariate). Consider a main-effects model for a crossed design having two and three levels of the design factors, respectively. The mean models are +(31 +e.11 Y·u =,u+a1 +f3z Y·tz=,u+at +E.tz Y·ts=,u+at +f3s+E.t3 +az+f3t +E.zt Y·21 = ,U +az +f3z +E.zz Y·zz = ,U +az +f3s+E.zs Y·zs=.u
(7.3.11)
or
y.=
1 0 1 0 1 0 0 1 0 1 0 1
1 0 0 0 1 0 0 0 1 1 0 0 0 1 0 0
A
0
1
,u al az
(31
+e.
(7.3.11a)
f3z
(33 0* +e.
Again we note the dependencies among the columns of A. Specifically, columns two and three sum to column one, as do four, five, and six. The rank of A is four. We shall reparameterize to four linear combinations of the effects which are of interest in the research. In addition to a constant term, we will have one alternate effect in the a factor and two in (3. These quantities are the usual "between groups degrees of freedom" for the model.
224
Method
Four useful alternative parameters for the model might be the constant term, the difference a 1 -a., and the differences {3 1 -{3. and f3z-f3., where a.= (a 1+a 2)/2 and [3. = ({31+{3 2+{33 )/3. This reparameterization is equivalent to altering the original models, by appropriately adding and subtracting identical terms. That is, Y·u = 1(p.+a.+{3.) + 1(a 1-a.) + 1({3 1 -{3.) +0({32-{3.) + e.u Y·12 = 1 (p.+a.+{3.) + 1(a1-a.) + 0({31-{3.) + 1({32-{3.) + e. 12 Y·1a = 1(p.+a.+{3.) + 1(al-a.) -1 (/31-{3.) -1 (f3z-f3.) + e.1a Y·z1 = 1(p.+a.+{3.) -1 (a1-a.) + 1({31-{3.) + 0({32-{3.) + e. 21
(7.3.12)
Y·22 = 1(p.+a.+{3.) -1 (al-a.)+ 0({31-{3.) + 1(f3z-f3.) + E-22
Y·za = 1(p.+a.+{3.) -1 (a1-a.) -1 ({31-{3.) -1 (f3z-f3.) + e.2a In matrix form, these are K
y.=
8
+e.
(7.3.13)
1 1 0 1 0 1 1 -1 -1 a 1-a. 1 -1 1 0 {31-/3· 1 -1 0 1 f3z-{3. 1 -1 -1 -1
+e.
(7.3.13a)
1
r·-+~J
1 1
6 in Eq. 7.3.13a is the product of the contrast matrix L and the original vector of effects 8*.
~[~
1/2 1/2 1/3 1/3 1/2 -1/2 0 0 0 0 2/3 -1/3 \ 0 0 -1/3 2/3 -1/3
""]
-1~3
p. al a2 {31 f3z
(7.3.14)
/3a L
8*
(7.3.14a)
If there is more than one criterion variable, matrix @ has a similar column for each outcome measure.
Analysis of Variance: Models
225
The contrast matrix L is of rank 4. It meets the condition that its rows can be constructed as linear functions of the rows of A. For example, the constant term is the simple average of the rows of A,
!±a'. 6 j~1 J
The second row of L is the contrast in o:. It is constructed by subtracting the average of all rows of A from the average of the three first rows having o:1 ; that is,
1
1
3
6
- 2: a'. - 6- j~1 2: a'. 3 i~l J
J
The contrasts in {3 are also estimable, since their weights are obtainable from therowsofA.Forexample,thevector[O 0 0 2/3 -1/3 -1/3]inlcanbe obtained by subtracting the average of all rows of A from the average ofthose containing a nonzero {3 1 coefficient. That is,
1~ 21(, a1+a4') - 6 ..:.,.a'j j=l
In altering the models, a more complex constant becomes the first parameter to be estimated. Only in the absence of main effects does the term estimate p.. alone. If interactions are included in the model, then main effects are similarly confounded with interactions. Only if the interactions are null do the main-effect contrasts estimate simple differences of the cx;'s and f3k's. The coefficients in K can be seen clearly in the reparameterized models of Eq. 7.3.12. They can also be found algebraically once L is defined. For the two-way example, the basis matri.x in Eq. 7 .3.13a is identically the product K=AL'(LL')- 1 , where 11/6 1/2 11/6 1/2 11/6 1/2 AL'= 11/6 -1/2 11/6 -1/2 11/6 -1/2
l1116 LL'- O -
0
0
0 1/2 0 0
and
(LL')- 1 =
l'11 ~
0 2 0 0
2/3 -1/3 -1/3 2/3 -1/3 -1/3 2/3 -1/3 -1/3 2/3 -1/3 -1/3 0 0 2/3 -1/3
0 0 2
~J
_J~ 2/3
226
Method
The reader may wish to complete the demonstration. K may be used in Eq.
7.3.4 for the estimation of e.
Example The data of Table 3.3.2 comprise four groups of the design formed by crossing two levels of sampling factor A with two levels of B. The matrix mean for the p = 2 variates is
y
=
·
.50 .75] Group 11 [ 3.17 3.67 Group 12 2.20 2.40 Group 21 3.43 3.57 Group 22 Y1
Y2
The subclass frequencies, in diagonal matrix form, are
D= diag (4, 6, 5, 7) The fixed-effects analysis-of-variance model with interaction is
y:;k
= p,' +a) +Jli..+'Y!k+e:;k
All terms are two-element vectors. The matrix model is
Y.= A0*+E. with
A~[j
0 1 0 1 0 0 1 0 1 1 0 0 1 0 1 0
1 1 0 0
0 0 1 0 0
o]Gmup11
0 0 Group 12 1 0 Group 21 0 1 Group 22
and
[ 0*)' = [ P,
0:1
O:z
/J1
/Jz
'Yu
1'12
1'21
'YlJ
The columns of [0*]' have two elements for variables y1 and y2 • The model matrix A is of rank four. The linear dependencies can be easily seen. Thus we shall choose four linear combinations of the parameters that are of interest in the research. The first alternate parameter is a population constant, necessary to equate the two sides of the model. The constant is the average of all four groups' observations. To obtain the row of L for the constant then, we average the four rows of A:
1'1 = [1
.5 .5
.5
.5
.25 .25
.25
.25]
The population constant is
1' 1 0* = p! + 1/2(o:1 +o:z)' +1/2(/31+/Jz)' +1/4(')111+1'1 2 +')121 +'}'22 )' =
p,'+a:+!J:+'Y;·
Analysis of Variance: Models
227
A second parameter of interest is the comparison of A 1 groups (11 and 12) with A 2 groups (21 and 22). The contrast weights are the average of the first two rows of A minus the fast two rows. That is,
0 The second alternate pE
·~meter
l'z0*
0
.5
.5 -.5 -.5]
is
= (al-az)'+('Yt·-1'z·l'
If the interactions for A 1 and A 2 are equal, then the second parameter estimates the difference at-a2 . If not, then the A main-effect contrast is confounded with interactions. A third parameter is the comparison of 8 1 with 8 2 (groups 11 and 21 with 12 and 22). The contrast weights are the averages of the first and third rows of A minus the average of the second and fourth. That is, 1' 3
=
[O
0
0
1 -1
.5
-.5
.5 -.5]
The third alternate parameter is
I' 30* = (JJ~- Pd' +(y.t-'Y·zl' Again, if the mean interactions are equal, then the third parameter estimates the difference of the two treatment effects. If the interactions are not equal, then the difference of the means for 8 1 and 8 2 is not the simple difference of treatment effects, but is confounded with interaction as well. A final parameter is the interaction contrast, with 0
0
0
0
-1
-1
1]
I' 4 is the sum of the first .and fast rows of A minus the sum of the second and third. The parameter is I' 40*
(yn-yd' -('Yzt-'Yzz)'
=
If the difference of interactions y 11 -y 12 is the same as the difference y 2 ,-y22 , then we shall say that there is no interaction, for the cell means can be "explained" by a simpler main-effects model. In this case, interaction does not confound the A and 8 main-effect contrasts. The complete contrast matrix is
1 L= [ 0
0 0 and
.5
.5
.5
.25
1 -1
0
0
.5
0 0
1 -1 0 0
.5 1
.5
0 0
.251
.25 .25 .5 -.5 -.5 -.5 .5 -.5 -1 -1 1
228
Method
The reparameterized model matrix is
K
~
AL'(LL')-'
~ [l
.5
.5
.5 -.5 -.5 .5 -.5 -.5
.25] -.25 -.25 .25
The feast-squares estimate of 8 is S=(K'DK)- 1K'DV.
2.60J Constant 2.32 [ -.98 -.78 A main effect = -1.95 -2.04 8 main effect -1.44 -1.75 A8 interaction Yt
Y2
The mean for A1 is lower than the mean for A2 for both variates (.98 on Yt· .78 on y2). The 8 differences are larger, with mean 8 1 about 2 points lower than mean 8 2 on both variates. The differences among interactions also appear large by comparison. We may test these statistically to see if they contribute to between-group variation on the criteria. In particular, the low means of group 11 alone seem to contribute to the interaction contrast.
7.4 THE SELECTION OF CONTRASTS Once an experiment has been designed and the analysis model and hypotheses are known, the particular contrasts to be estimated and/or tested can be constructed. The contrast weights form the rows of a contrast matrix l, which defines linear functions of the analysis-of-variance effects. The particular contrasts are chosen according to the design of the study and to the questions and hypotheses that are posed. Therefore, it is impossible to impose statistical rules for their selection. Several examples of contrasts are given in Section 7.3. More are provided at the end of this chapter in the context of actual research studies. It often happens that the contrasts of interest to researchers can be categorized into general contrast "types." A number of these are discussed here. It is convenient to describe contr~st vectors through a symbolic notation that conveys both the form of the reparameterization as well as the particular effects involved. The symbolic representation of contrasts obviates constructing long and perhaps complex weight vectors, and can be used to improve the convenience and accuracy of computer analyses. One such notation has been employed by Bock (1963), which is based upon the formulation of Kurkjian and Zefen (1962). The notation is discussed here, first for designs with only one classification factor. Contrast vectors for two-way or many-way designs, and for designs with nested effects, can be represented as combinations of effects for the separate classification factors.
i
Analysis of Variance: Models
229
One-way Designs Assume a one-way analysis-bf-variance model with a levels of the classification factor. The model is Y. = A8.+E., with
and
(7.4.1)
I where 1a is an a-element unit vec;:tor; Ia is the axa identity matrix. Each row of 8* has p elements, corresponding to the effect for the p separate outcome measures. (For simplicity, it may hJelp to assume p = 1). Represent the contrast matrix for the factor having a levels as La. La premultiplies 8* to create no more than a linear combinations of the effects in the alternate parameter matrix 8= la8*. In the usual one-way analysis-of-variance model, there are a+1 parameters, so La is of the order ax(a+1). Each row of La consists of a set of weights that multiply the parameters in 8* to produce one of the alternate parameters. Almost universally in behavioral research, the first parameter is a population constant, which reflects bot~ the scale of the measures and the overall response level of the population(s) sampled. As a function of the rows of the model matrix, the corresponding vector of La is the mean of all rows of A. Inspection of A reveals that the mean of its rows is the vector
[1
! ! . . . !] a a
a
(7.4.2)
We shall denote the constant ter111 as the "zeroth" effect, and symbolize' the corresponding row of the contrast fllatrix as LO. The remaining (a-1) vectors. are denoted L1, L2, ... , L(a-1 ). Thus the contrast matrix La has the form · L1 LO
]
La= [ L2
(7.4.3)
~(a-1) L1 through L(a-1 ), like LO, each re~resent an entire vector, with a+1 elements. Except for the first vector, the rows of La define contrasts among the parameters and all have zero as the first ~lement. The reader may note that this was true for the examples of the preceding section. We shall primarily concern ourselves with the (a-1)xa final sybmatrix of La defining the contrasts among the aj. The sum of the elements of each row of the submatrix is zero, by the definition of a contrast. For example, a contrast matrix tpr the four-level design is given by Eq. 7.3.8.
230
Method
That is,
1
L4 =
~
1I 4
0 0 0
1I 4 1I 4
1 0 0
0 1 0
0 0 1
1I 4l LO -1 L1 -1 L2 -1
(7.4.4)
L3
The submatrix that describes the contrasts among the parameters consists of the vectors L 1, L2, and L3, omitting the first element. That is, 1 0 [0 1
0
0
0 -1]L1 0 -1 L2 1 -1 L3
(7.4.5)
The first contrast is a 1 -a4 . If the mean vector in subclassj is fL;= fL+a;, then L 1 is equivalently fL 1 -p. 4 , or the comparison of the respective population means. Various sorts of contrasts among means are employed in behavioral research. Many of those most frequently used form selective subsets of those possible. One such subset is the set of deviation contrasts, whereby each of a-1 parameters, a;, is contrasted to a., the mean of all a;. The parameters estimated are of the form a;-a. or p..;-p... The contrast submatrix for deviation contrasts is of the form 1 1--
a
Lva=
a a
1
a
a
a
a
a
a
a
a
a
1 1-a
a
a 1 1--
1
D1 D2
(7.4.6)
D(a-1)
The subscript D denotes that the contrasts are of the deviation type; a indicates the number of levels of the design factor. The contrast vectors themselves may be represented by the codes D1, D2, ... , D(a-1 ). DO, not seen in the contrast submatrix, is again the mean of all rows of the model matrix, and is identical to LO of Eq. 7.4.2. As an example, let a= 4 for a one-way design having four levels. The alternate parameters to be estimated are a 1 -a., a 2 -a., and a 3 -a .. In addition to the constant term, or LO, the matrix has the contrast submatrix Lv 4 • That is, 3/4
Lv4 = [ -1/4
-1/4
-1/4 3/4 -1/4
-1/4 -1/4 3/4
-1/4]01 -1/4 D2 -1/4 D3
(7.4.7)
The row vectors are represented by D1, D2, and D3, respectively. The complete
Analysis of Variance: Models
231
contrast matrix is 1 rJ!i __"!fj-__1_/_4:___:!_-:._1 DO L _ 0 1 D1 4- l0 I Ln 4 D2 o D3
(7.4.8)
1
With 0* defined by Eq. 7.4.1, then
[p,'+aj.
1
a~-a:
(7.4.9)
a~-a:
=
a;-a: The set of deviation contrasts cqrresponds most closely to the "traditional" terms in the analysis-of-variance model, assuming a;= p,;-p,. The final or missing estimate, jl 4 -jl, may be obtained as minus the sum of the other three estimates. Deviation contrasts may be employed when there is no particular order to the groups in the design and when it is useful to estimate the simple terms in the analysis-of-variance model (f.t, a;, (3k, Y;k, and so on). A second set of contrasts is appropriate when a-1 group effects are compared with a control or comparison group. These comprise the set of simple contrasts, and will be represented by the code letter C. The contrast submatrix of La, for simple contrasts, is
1 0 0 1
0 0
· · · -1JC1 · · · -1 C2
00
0
· ·1
Lea= [ .
(7.4.1 0)
. ·
-1
C(a-1)
In Eq. 7.4.10 all subclass effects are compared with the mean of the last group. For the one-way situation with four levels of the design factor, Lc is I
4
1 0 0 -1] C1
Lc4 = [ 0
0
1 0 -1 C2 0 1 -1 C3
(7.4.11)
CO, representing the constant term, is the same as LO or DO in Eq. 7.4.2. The reparameterized model has parameter matrix 0, given by
0=
p,'+a:J [ a'-a' 1
4
a~-a~
(7.4.12)
a~-a~
Letting a;= p,;-p,, it can be seen that the three contrasts are simple comparisons of population means, p,;-p,4 . The convention for symbolic coding will
232
Method
be that the contrast number omitted from the set will be the index of the common comparison group. Thus, in the four-level example, the codes C1, C3, C4, will represent the comparison of ab a 3, and a 4, with a 2, respectively. That is, Lc4 =
[1 -1 0 0] 0 -1
0 -1
C1 1 0 C3 0 1 C4
(7.4.13)
Two additional types of contrasts have the property of orthogonality. Their employment in balanced, or equai-N, designs lends simplicity to the analysis. These are the set of Helmert contrasts and the orthogonal polynomial contrasts. Through Helmert contrasts, each group effect, a;, is contrasted with the mean of succeeding group effects, in a given order. The contrast submatrix of La has the form
a-1
LHa= 0 0
0
a-1 1 a-2
a-1 1 a-2
H1 a-1 1 H2 a-2 -1
0
(7.4.14)
H(a-1)
Helmert contrasts for a five-level factor would be H1, H2, H3, and H4. 1 -1/4 _ [0 1 Hs0 0 0 0
L
-1/4 -1/3 1 0
-1/4 -1/3 -1/2 1
-1/4] H1 -1/3 H2 -1/2 H3 -1 H4
(7.4.15)
Although Helmert contrasts may be applied to any design, they are of particular value when there is an order underlying the subgroup definition. As can be seen from Eq. 7.4.15, the Helmert contrasts may be used to compare sequentially ordered group means in a regular fashion. It can also be seen that all Helmert contrasts are orthogonal to one another. When experimental groups are defined by an underlying quantitative metric (for example, age, dosage, time elapsed) it may be of value to determine whether group means differ as a function of the values of the underlying independent variable. For example, physical growth may increase proportionally to a simple polynomial function of age over, say, a five-year period. Or recall of simple learned material may decrease in a manner directly proportional to time elapsed since learning, in 15-minute intervals. In these and similar instances, experimental groups represent discrete conditions on a measured independent variable (for example, 1, 2, 3, 4, 5 years of age; 5, 10, 15 cc drug dosage; 0, 15, 30, 45-minute delay to recall). To determine whether group means differ according to a polynomial function of these numeric values, orthogonal polynomial contrasts may be employed. The contrast weights for orthogonal polynomials are determined through the row-wise orthonormalization of a polynomial matrix in the original metric.
Analysis of Variance: Models
That is, let the scale underlying the group variable have values x 1, x 2, x 3, ... Then the contrast matrix is the ro~-wise orthogonal factor of X, where
x1° X1 1' [ X= :I
0
Xa ·· ·· ·· xi
X a-1
J
233 , Xa.
(7.4.16)
Xaa-1
1 j !
! The first row of X has all unities; the second has the original values of the metric, or the scale underlying the group-membership variable. The following rows define polynomials of increa?ing degrees of complexity. If group j has mean outcomE! score Y·J> then the reparameterized, or polynomial, model is
a-1
= ~ Ok+1x/' +e.;
(7.4.17)
k=O
I
is the value of the independentvariable common to observations in group j. Its powers comprise one column of X in 7.4.16. The 01 are the coefficients of the powers of X;, in the reparameterized model. These are the elements of 8; (h is a constant term which absorbs scaling factors in y. To test for overarfgroup-mean differences, all a-1 terms in (J excluding 81 may be tested for nullity (as with any of the contrast reparameterizations). To test for differences proportional to particular polynomials in x, indi,vidual coefficients oi may be tested. When p is greater than one, 8 has a colu'Tin with the polynomial coefficients for each outcome measure. : The symbolic representation of orthogonal polynomials has contrast code P. The set of weights for the con,trast submatrix is symbolically represented P1, ... , P(a-1), X;
,
i,
Lp =
a\
[P1 P2 .
J
~(a-1)
(7.4.18)
'
DeLury (1950) has tabled LPa from a= 2 to a= 26, for the case where values of the metric are evenly spaced (proportional to 1, 2, 3, ... , a-1, a). For example, if there are two groups, we can only test for a simple linear difference of group means. The contrast weights are (7.4.19a) The vector is represented symbolically as P1. The test of significance is the same as any contrast between Y·1 and Y•2·
234
Method
If there are three groups, the contrast weights for linear and quadratic differences among the means are 0 -2
1] P1 1 P2
(7.4.19b)
It can be seen that the P1 weights increase in even intervals, for three groups having means Y·1o y. 2 , and y. 3 • This contrast is significant if the group means increase (or decrease) in a monotonic fashion, proportionally to the metric, or to the numbers 1, 2, 3. The P2 weights describe a second-degree or parabolic curve. This contrast is significant if y. 2 is significantly off a straight line connecting points y. 1 and y. 3 • To test for overall mean differences (H 0 : J.L 1 = f.L 2 = J.L 3 ), the two-degree-offreedom test of both contrasts is conducted. This yields the same result as the test of any two contrasts among the three groups. However, to test the complexity of the curve separating the means, separate tests are conducted of linear P1, and of additional variation attributable to quadratic P2. If P2 is significant, both linear and quadratic effects are maintained in the model. Not only do group means describe a parabola, but the slope of the line upon which the parabola rests may be non-zero (see Figures 7.4.1 0, 7.4.11 ). When there are four groups, the weights for linear, quadratic, and cubic differences among the means are
r-~ =~ -~ ~l =~
Lp4 =
l:-1
(7.4.19c)
3 -3 -1J P3
Again, each successive contrast vector describes a more complex curve than the one preceding. Since each vector has been orthogonalized from those preceding it, rejection of H0 for any contrast requires that all simpler terms be included in the model as well. Frequently, significant terms which describe complex curves are difficult to interpret. This implies only that the mean outcomes are not easily explained as multiples of the metric values. Some other contrasts may be better to describe differences among the groups. It can be seen that the rows of LPa are orthogonal. They may also be normalized to unit length. The entire contrast matrix is
L4 =
l
4l P1PO
1 1 _1j~__!_!.±__:!_t'_1.__ _l::_ 0I 0\ Lp 4 0I 1
P2
(7.4.20)
P3
Tabled orthogonal polynomial weights are based on an underlying independent variable with equal intervals. That is, the treatment difference between groups one and two must be the same as that between groups two and three, and so on. In practice, the treatment (time, dosage, etc.) is not always alloted in equal intervals. General computational algorithms such as those in the MULTIVARIANCE program may be used instead of tables to generate weights for any metric. The same symbolic representation is still employed.
Analysis of Variance: Models
Bases for One-way
235
Design~
After a set of alternate parameters is chosen, the alternate model matrix may be determined, and used for further analysis. The construction of the alternate model matrix, or basis, is given, by Eqs. 7.3.5 and 7.3.6. That is, (7.4.21) Since K multiplies the contrast matrix L, there is one column of the basis for each alternate parameter-that is, a column for each row of L. We may represent the columns of K symbolically using the same notation as for the contrast vectors. For example, the simple contrast matrix for the one-way design (Eq. 7.3.8), is 1
L= [ 0 0 0
1I 4 1 0 0
1I 4 0 1 0
1I 4 1I 4l 0 -1 0 -1 1 -1
CO C1 C2 C3
(7.4.22)
The resulting basis is given by Eq. 7.3.10. Its columns may be referred to as basis vectors CO, C1, C2, and C3.
r
K = 1.00
1.00 1.00
co
-25]
.75 -.25 -.25 .75 -.25 -.25 -.25 .75 -.25 -.25 -.25 C2 C3 C1
(7.4.23)
Any real problem will undoubtedly require evaluation of the basis by computer. However, there are several aspects of constructing K that are useful to us. First, the basis may be constructe,d without reference to the model matrix A. Bock (1963) has shown that for any single design factor, the basis may be alternately constructed by
Ka =;' [1, KcJ
(7.4.24)
where (7.4.25) (using simple contrasts as an example). Lea is the (a-1)xa contrast submatrix of La; 1 is an a-element unit vector. For example, Lc 0 , from Eq. 7.4.22, ,is
236
Method
and
Kc4 = Lb4 (Lc4 Lb4)-1
-
[-:~;
-:~; =:;;~
-.25 -.25 .75 -.25 -.25 -.25
where K of Eq. 7.4.23 is Kc4 preceded by a four-element unit vector. Further, the basis for any given contrast type follows a regular pattern. It may be constructed without Eq. 7.4.21, knowing only the type of contrast, and number of levels, a. Thus, for most regular designs, a basis matrix necessary for estimation may be generated from the symbolic contrast codes, without the development of A or L. The MULTIVARIANCE program utilizes these regularities to avoid constructing unnecessary and potentially large matrices. For arbitrary contrasts that do not conform to the regular patterns, K is found from Eqs. 7.4.24 and 7.4.25. The vectors comprising the basis are an intermediate computation in the analysis-of-variance model. It is not essential that the researcher be able to compute them. The basis vectors are not themselves contrast vectors, but fairly complex functions of the contrasts. Thus the reader will want to be aware of the function of the basis in the analysis-of-variance model. Beyond that it is necessary only to understand and be able to interpret the symbolic contrast codes. The following extensions plus the examples in Chapter 8 exemplify a variety of applications of the symbolic codes.
Higher-order Designs For sampling designs of more than one factor, the model matrices and bases can be constructed from the corresponding matrices for the separate dimensions of classification. For example, inspection of the model matrix for the 2x3 crossed design in Eq. 7.1.12 reveals that A may be generated by means of Kronecker products of model matrices for the separate design factors. Let the model matrix for the two-level factor alone be A2 . That is,
A2 = [1 1 0) 1 0 1 = [12, lz]
(A factor)
(7.4.26)
where 12 denotes a two-element unit vector; 12 is the 2x2 identity matrix. The parameters are p., a 1 , and a 2 . For the three-level factor alone, the model matrix is
A3 =
[~ ~
! ~]
(B factor)
(7.4.27) for parameters p., /31> {32, and {33. The 6x 12 model matrix A for the two-way design, including interactions, is a matrix of Kronecker products of the sections of A2 and A3. For example, the
Analysis of Variance: Models
237
vector of all unities in A is the product 1 2®1 3 ; the two vectors for the a effects are the products 12 ® 13 ; the columns for {3 effects are the products 1 2 ® 13 ; the interaction vectors are 12 ® 13 . Juxtaposing these results, we have
A= [12®1a i 12®1.3 i 12®1 3 i 12®1s] I
1 1 1 1 1 1
1 0 1 0 1 0 0 0 0
1 0 0 1 0 0
0 1 0 0 1 0
1 0 0 0 0 0
0 0 1 0 0 1
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 0 0 0 1 0 0 1 0 0
0 0 0 0 0 1
(7.4.28)
The columns of A are the columns of the Kronecker product A2 ®A3 , except for their order. Construction of model matrices for three-way and higher-order designs may be accomplished through extended application of the Kronecker product operator. The main-effects model m'atrix for a three-factor crossed design, having a levels of factor A, b levels of 8, 13-nd c levels of C, is
A= [1a®1b®1ci 1a®1b®1c i 1a®lb®1c i 1a®1b®lc] The matrix of Eq. 7.1.13a is an example, with a= b = c = 2. In a similar fashion, the model matrix after reparameterization tor a multifactor design may be constructed from bases for the individual design factors. For example, the basis for the 2x3 crossed design of Eq. 7.3.13a can be obtained from one-way bases. Deviatiop contrasts a;- a. and f3~r- {3. were selected for both factors. The one-way contra~t matrices are L = [1 2 0
1/2 1/2] DO 1/2 -1/2 D1
(A factor)
(7.4.29)
and 1 1/3 1/3 1/3] DO L3 = [ 0 2/3 -1/3 -1/3 D1 0 -1/3 2/3 -1/3 D2
(B factor).
(7.4.30)
From these and the separate model matrices (or by Eq. 7.4.24) we obtain bases Kz=
[~ -~J DO
(A factor)
(7.4.31)
D1
and
K3 =
[~ ~ ~] 1 -1
DO
-1
(B factor)
(7.4.32)
D1. D2 I
The equal-element vector corresponding to the constant term for each factor is the "zeroth" effect, DO (or CO, HO, and so on). The equal-element
238
Method
vector in the total 6x4 basis K is the product 12 ®1 3 or, in symbolic notation, DO®DO (one vector from K2 , one from Kal· . The basis vector for the A main effect is the product of vector D1 for the A factor and DO for the 8; that is, 1 1 1 D1 ®DO= _ 1 -1
-1 The 8 effect has two degrees of freedom. The basis vectors corresponding to the two 8-effect contrasts are the Kronecker products of the DO vector for factor A, and the two effects (D1 and D2) for B. The entire basis in Eq. 7.3.13a is formed by juxtaposing these products. That is, 1
0 K=
DO®DO
-1 1
0 1 -1
1 -1 -1 -1
-1
1 -1
D1®DO
DO®D1
DO®D2
0
0
(7.4.33)
Should interaction terms be included in the model, the corresponding basis vectors for the two degrees of freedom would be the products D1 ®D1 and D1 ®D2. 1 0 -1 D1 ®D1 is -1 0 1
0 D1 ®D2 is
1 -1
0 -1
In every case it is necessary to maintain the order of factors (that is, a column of K2 is consistently the prefactor). If some subclasses have no observations, the corresponding rows of Y., E., and the complete K may be deleted, leaving Jo < J rows. The following example is adapted from Bock (1965; pp. 77-78).* Consider a 2X3X3 (AXBXC) completely crossed design. Linear and quadratic polynomial effects are desired across levels of 8, and simple contrasts among the levels of A and C. The separate contrast matrices are: L. = [1 2
0
1/2 1/2] CO (A factor) 1 -1 C1
'Material reprinted by permission, from Proceedings of IBM Scientific Symposium on Statistics, October, 1963. ©1965 by International Business Machines Corporation.
Analysis of Variance: Models
L3=[~
1/3 1/3 -1 0 1 -2
l3=[~
113 1
1/3
0
1
0
1/3] PO 1 P1 1 P2
co
1/3] -1 C1 -1 C2
239
(8 factor)
(Cfactor)
The bases that result are
K2=[~
1/2] -1/2 C1
co
K3=[~
~1/2
1/6] -1/3 1/2 1/6 P1 PO P2 (8 factor)
(A factor)
0
K3=U
I
co
2/3 -1/3] -1/3 2/3 -1/3 -1/3 C1 C2 (Cfactor)
The complete model is rank 18. The basis for the entire design is constructed as Kronecker products of the vectors of the separate matrices. The value of the products themselves is not of major importance. However, the reader should be familiar with the symbolic notation from which they are generated. The effects and symbolic vectors denoting the products are given in Table 7.4.1.
Table 7.4.1
Source Listing and Symbolic Vectors for 2X3X3 Factorial Design
Source
Degrees of Freedom
Constant
CO®PO®CO C1 ®PO®CO CO®P1®CO CO®P2®CO
A 1 -A2 8: Linear Quadratic C:Cl-C3
CO® PO®C1 CO®PO®C2 C1 ® P1 ®CO C1 ®P2®CO
c2-c3 AB: Ax linear B Ax quadratic B AC
2
BC
4
ABC
4
Between groups
Symbolic Vector
/=2X3X3=18
C1 ®PO®C1 C1®PO®C2 CO® P1 ®C1 CO® P1 ®C2 CO®P2®C1 CO®P2® C2 C1 ® P1 ® C1 C1 ® P1 ®C2 C1 ® P2®C1 C1 ®P2®C2
240
Method
Each vector of K is an 18-element Kronecker product of one vector from each of the separate one-way bases. For example, the Ax linear 8 basis vector is
C1 0P10 =
[-1/4
CO~[~~;~]@ [-:~J m 0
-1/4
-1/4
0 0
0
1/4
1/4
1/4 1/4 1/4 1/4 0 0 0 -1/4 -1/4
-1/4]'
The 18x 18 complete basis may be substituted in Eq. 7.3.4 to obtain estimates of each effect in the reparameterized model. The number of vectors in each Kronecker product is equal to the number of factors in the sampling design. That is, for a four-factor design, each basis vector is the product of four smaller vectors from the one-way matrices. The total number of K vectors corresponds to the "degrees of freedom between groups," if we include one for the overall population constant. The degrees of freedom, in turn, are restricted to being no greater than the number of groups in the design with at least one observation (I ~J). _Sample Problem 4, the essay grading study, yields an Ax8xCxO (2x2x 2X4) factorial arrangement. The four-level factor is to be reparameterized to deviation-type contrasts. The symbolic representation of the basis vectors is given in Table 7.4.2. Each vector in the basis has 32 elements, corresponding to the 32 subclasses in the design, and is a function of four smaller vectors. As before, each column vector of the basis represents a single effect, or single degree of freedom between groups. Bases for nested designs may be constructed by employing the identity matrix as a factor in the Kronecker products. For example, consider the twotreatment situation, with two classes assigned to treatment one, and three to treatment two. The model is given by Eq. 7.1.14. That is,
The treatment factor, A, has two levels; classes, 8, has at most three levels for any treatment. The model matrix is
A{
1 0 1 0 0 1 0 1 0 1
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0 1 0 0
0 0
~]
(7.4.34)
with parameters
[0*]'= [J.t
at
az
bl(l)
bzm
b1(2)
bz(2)
b3(2)]
A may be constructed as the Kronecker product of matrices for the twolevel treatment factor, and a three-level class factor, treating the design as crossed. The third row of the model matrix for the complete crossed design is
Analysis of Variance: Models
Table 7.4.2
241
Source Listing al'td Symbolic Vectors for 2X2X2X4 Factorial Design
Source
Degrees of Freedom
Symbolic Vector
CO® CO® CO® DO C1 ®CO®CO®DO CO®C1 ®CO® DO
Constant A B
c
CO®CO®C1 ®DO 3
CO® CO® CO® D1 co 0 co 0 co 0 D2 co 0 co 0 co 0 D3 C1 ®C1 ®CO® DO C1 ®CO®C1 ®DO
AD
3
BC BD
1
C1 0 CO® CO® D1 C1 ®CO®CO®D2 C1 ®CO®CO®D3 CO®C1 ®C1 ®DO
CD
3
D
AB AC
CO®C1 ®CO®D1 CO®C1 ®CO®D2 CO® C1 ®CO® D3 ' CO®CO®C1 ®D1 CO®CO®C1 ®D2 CO 0 CO 0 C1 0 D3 ·. C1 ®C1 ®C1 ®DO
3
ABC ABO
3
AGO
3
BCD
3
ABCD
3
Between groups
C1 ®C1 ®C0®01 C1 ®C1 ®C0®02 C1 ®C1 ®C0®03 C1 ®CO®C1 ®01 C1 ®CO®C1 ®02 C1 ®CO®C1 ®03 CO®C1 ®C1 001 CO®C1 ®C1 ®02 ' CO®C1 ®C1 ®03 C1 ®C1 ®C1 ®01 C1 ®C1 ®C1 ®02 ,C1 ®C1 ®C1 ®03
I= 2X2X2X4
=
32
deleted, as there is no third class under the first treatment condition. The rank of the model is five. Bases for the two separate factors, assuming simple contrasts for both, are
1/2] [ 1 -1/2
K2 = 1
CO
C1
(A factor)
242
Method
and
2/3 -1/3] -1/3 1 -1/3 2/3 CO C1 C3 1
K3 = [ 1 -1/3
(B factor)
The reparameterized model matrix for an entire six-group crossed design has CO®CO for the constant term, and C10CO for the treatment contrast. Both products assume that the K 2 vector is the prefactor. In order to contrast levels of B for only the first treatment (that is, classes in treatment one), we may introduce an alternate basis for factor A
12=[~ ~] 11
12
The columns of the identity matrix are symbolically represented as 11 and 12, respectively. The product 110C1 is
[2/3 -1/3 -1/3 0 0 O] 110C1 has nonzero elements only in the first three positions, or corresponding to the classes under treatment one. 110C3 cannot be estimated since C3 denotes the comparison of group 3 with group 2, and there is no class 3 under treatment one. We may estimate two contrasts among classes under treatment two, however. These are the column products 120C1, and 120C3. The complete basis is
K=
1 1 1 CO®CO
1/2 1/2 1/2 -1/2 -1/2 -1/2 C10CO
2/3 -1/3 -1/3 0 0 0
0 0 0 0 0 0 -1/3 2/3 -1/3 -1/3 -1/3 2/3 110C1 120C1 120C3
(7.4.35)
The third row of K may be deleted, since there is no third class under the first experimental condition. The resulting matrix K is 5x5 and of rank five, since the product 110C3 was omitted. Note especially that the omission of the effect was made by considering the contrast weights, and the fact that the particular contrast would involve a nonexistent group mean. Inspection of the basis vectors would not accomplish the same purpose. The basis elements do not have an easily seen correspondence to the contrasts. We may extend the use of Kronecker products for nested designs, by considering Sample Problem 5, the programmed instruction experiment. The experiment involves a sex-by-experimental-groups fixed design, with classes randomly assigned to experimental conditions. There are 19 classes nested within the experimental group and 18 within the control group. Sex is crossed with experimental groups and classes. The measures gathered from each class are mean scores, for boys and for girls separately.
Analysis of Variance: Models
243
The model for the mean outcome of one sex group in one class is given by 7.1.34. For reparameterizin~ to contrasts, we may consider the design as if it were an incomplete 2X19X2 crossed design (conditionsxclassesxsex), with only 18 classes under the control condition. This will result in two empty groups, since there will be no observations for males or females in the missing class. The contrast matrices for both two-level factors, experimental groups and sex, may be the usuai2X3 matri~es. That is, ~q.
L ='[1 2 0
1/2 1/2] CO 1 -1 C1
In addition, we shall employ the columns of a 2x 2 identity matrix to allow us to estimate between-class effects 'separately for each experimental condition. Since classes comprise a random effect, particular group comparisons are not of interest. Any arbitrary contrast~ may be selected. For exemplary purposes, we shall use deviation contrast parameters for classes, DO, D1, ... , D18. Each column of the reparameterized model matrix, K, has 76 elements. The last two elements, corresponding to the two missing groups, may be deleted. Assuming design factors in the order: conditions-classes-sex, the symbolic representation for the constant term is CO®DO®CO. The single degrees of freedom for "experimental groups," for "sex," and for the "groupsxsex interaction" are represented by C1 ® DO® CO, CO® DO® C1, and C1 ® DO® C1, respectively. These constitute all the fixe~ effects in the model. To obtain tests or variance estimates for the "classes" and "sexxclasses" random effects, we may wish to include additional columns in the basis for one or both sources of variation. There is no "experimental groupsx classes" or "sexx groups x classes" interaction, since the same class is not observed under both treatment conditions. That' is, classes are nested within experimental groups. The Kronecker products for the 18 degrees of freedom, or 18 contrasts among 19 classes within the first experimental condition, are the products 11 ®D1 ®CO, 11 ®D2®CO, ... , 11 ®D18®CO. Those for the 17 degrees of freedom among 18 classes within the second experimental (control) condition are 12 ® D1 ®CO, 12 ® D2 ®CO, ... , 12 ® D17 ®CO. Together these constitute the 35 degrees of freedom for the classes~within-conditions random effect. Group or class means should comprise the unit of analysis in most studies of class teaching methods, couns,eling groups, family interaction, and so on. (See Glass, 1968; Raths, 1967.) It may not be necessary to estimate variation among subjects within groups. Subjects in these instances are not responding independently of one another, nor are they responding under varying experimental conditions, times of day, settings, and so forth. Further, variation among subjects is not the appropriate error term or denominator mean square for any of the fixed effects in the model. These effects are usually of greatest concern. Thus, in the example, we vyill not estimate the variance among students of a particular sex-class group. We may consider instead that each cell in the design has only a single vector observation, the vector mean for the sex group within the particular class.
244
Method
With no within-group variance to estimate, the only remaining source of variation in the example is the random "sexxclasses within conditions" interaction. This may be obtained like the "classes" random effect-that is, by estimating specific interaction contrasts. Or the interaction sum of products may be found by subtracting variation due to all other sources from the total sum of squares and cross products. Should we choose to code the interaction effects, the Kronecker products for sex-by-classes are 11 ®01 ®C1, 11 ®02®C1, , ... , 11 ® 018 ® C1, for the experimental group, and 12 ® 01 ® C1, 12 ® 02 ® C1, ... , 12 ® 017 ® C1, for the control. However, this approach would entail much unnecessary additional computation. Further examples of the use of the symbolic representation and the use of the Kronecker products may be found in Bock (1965), in the examples of this chapter, and in the instructions for the use of the MULTIVARIANCE program. The symbolic conventions for the program are identical to those here, with the exception that commas (,) replace the Kronecker operator (®).Additionally, the program allows for deleting the letter codes from all contrasts but the first, and for generating multiple vectors from a single symbolic code.
Interpretation of Contrast Weights The reparameterized analysis-of-variance model is y.=KO+e., where 0= LO*. Each term in the sample matrix {j estimates a particular contrast among the original parameters. The magnitude, direction, and standard error of each term conveys the information necessary to the interpretation of group-mean differences. That is, from the elements of 0, we have the actual number of score points separating population means, and the direction of the difference (which group is highest, lowest}. The standard error also reveals the precision of estimation, as well as the relative size of the group-mean differences. A contrast estimate will tend to be large if the parameters differ as specified by the contrast weights, and small if there are no differences among parameters or if the differences are not in the direction specified by the contrast weights. For example, if a contrast vector is I' = [0 1 -1/2 -1 /2], for parameters [O*]'=[f.L a1 <Xz a 3 ],then
1'0* is maximal (in absolute value} when a 1 is very different from (a 2 +a3 )/2, regardless of <Xz or a 3 . nS the parameters depart from that pattern, 1'0* will diminish, and estimates of 1'0* will tend to be smaller in magnitude. Note that
the sign reflects only the direction of difference-that is, whether a 1 is above or below (a 2 +a 3)/2. Thus a difference of -k points is as revealing as one of +k points. For any common one-way design, contrasts among parameters are also contrasts among subclass means. For example, the one-way fixed model is Y;;=f.L+a;+Ei; = f.L;+Ejj
(7.4.36)
Analysis of Variance: Models
Any contrast among ference a 1 -a 2 is
a;
245
is also. a contrast among f.LJ· As an example, the dif-
(7.4.37) Generally, for any J weights 1, that multiply the
a, and sum to zero,
~J /;a;=~; l;(p.,;-p.,) =~j/jf.Lj-J.L~Jj
I
(7.4.38)
=~;,if.Li
Thus each contrast may be interpreted as a set of weights applied to group means. The resulting effects are simple weighted mean differences. Let us assume a one-way five-group design, with simple contrasts CO, C2, C3, C4, and C5. The contrast matrix is
1 0 [ L= 0 0
1/5 -1 -1 -1
1/5 1 0 0
0 -1
1/5 0 1 0 0
0
1/5 0 0 1 0
1/51 0 0 0 1
co
C2 C3 C4 C5
(7.4.39)
The final five columns of L multiply the a; parameters, and are thus the weights for subclass means. If we represent the design by means of a block diagram, the design and contrasts are as shown in Figure 7.4.1. From such a diagram it is easily seen that the estimate in 0 porresponding to CO, is the average (weighted by N,) of the subclass means. The estimate corresponding to C1 is the simple weighted difference of the meal'] for group 2 and the mean for group 1. The estimate will be large if the two group means are very different, and small if their values are close. Block diagrams such as Figure 7.4.1, with any set of contrast weights, provide a guide and simple method for examining exactly the mean differences involved in any term of 8. The technique can also be employed to clarify the meaning of single interaction contrasts. In most statistical treatises, interaction effects are not clearly ' Group I
1/5 -1 -1 -1 -1
2,
3
4
1/5
1/5
1/5
1 0 0 0
0
0 0
0 0
0
Figure 7.4.1
5 1/51 co 0 I C2 0 IC3 0 IC4 I C5
246
Method
explicated, and only "omnibus" tests of significance are presented. Yet interaction effects may also be understood in terms of comparisons among means, in two-way or higher-order designs. Consider a 2x2 (AX8) factorial arrangement, with effects CO and C1 for each dimension of classification. The block diagram can be represented as in Figure 7.4.2. Here the contrast weights are written for both factors. The A main8 factor 81
82
A factor AI Az
1 1/2 -1 I co C1 [CO [C1
~---~
1/2 -1
1/2
I Figure 7.4.2
effect contrast is C1 ®CO (assuming factor A is first). Multiplying the C1 weights for A, element by element, by the CO weights for 8, and inserting the products in the corresponding cells, we obtain Figure 7.4.3. The A effect is the compari8 factor 81
82
Figure 7.4.3
son of the average of all subclasses under Al> with the average of all under A2 . In terms of means, the contrast to be estimated is
1
1
2 (iLll +iLd- 2(iLzl +iLzz) =
iLl·-iLz·
Similarly, for the 8 main effect the weights are formed from the product CO®C1. These provide an estimate of
1
1
2 (iLll +iLz1l -2 (iLJz+iLzz) =
iL·J-iL·z
Interactions may also be explicated in this manner. The interaction contrast is C1 ®C1 (see Figure 7.4.4). It can be seen that the interaction contrast estimates (iL11 -iL 12 )-(iL21 -iL 22 ). If the estimate is large, it indicates that the difference of the means of 8 1 and 8 2 for A 1 is not equal to the difference of the means
Analysis of Variance: Models
247
8 factor 8, A factor
8z
-1
A, Az
-1
--I C1 IC1 I
-1 Figure 7.4.4
of 8 1 and 8 2 for A2 • The comparison of differences of levels of one factor across levels of a second factor is the interaction effect of the two design factors. Interactions of factors having more than two levels may be depicted in similar fashion. Consider the 2x3x3 (Ax8xC) design having the effects listed in Table 7.4.1. Any one of the ma!n-effect contrast vectors may be seen by collapsing across other factors of classification. For example, the C main effect CO®PO®C1 may be seen from a diagram of just this factor alone, as shown in Figure 7.4.5. The contrast is 1p,.. 1 -1p, .. 3 . Or, in terms of individual subclasses, 1 1 6(/Lm +~.t121 +p,131 +~.t211 +~.t221 +~.t23,)- 6 (p,113 +~.tm+~.t133 + IL213+ /L223+ /L233)
C factor
c,
c2
c,.
0
-1
I I
1
C1
I
Figure 7.4.5
The AX linear 8 interaction effect, C1 ®P1 ®CO, may be diagrammed by collapsing across C, as shown in Figure 7.4.6. The weight differences of A, and A2 , across levels of 8, form a regular pattern. The effect will estimate the extent to which the differences of the A, and A2 means are large for 8, smaller 8 factor
8,
-1
0
1
1
0
-1
0
1
-1 I C1 IP1
A factor
1 ;
-1
Figure 7.4.6
248
Method
for B 2 , and smaller still for B3 ; that is, the extent to which the differences of mean A 1 and mean A 2 fall on a straight line across levels of B. Similarly, C1 ® P2 ®CO provides a set of weights for determining the extent to which the differences of the A 1 and A2 means fall in parabolic fashion across levels of B. The Ax C interaction has two degrees of freedom. The second contrast is C1 ®PO®C2, yielding weights as shown in Figure 7.4.7. This interaction contrast is a comparison of the means !-L 1 .2 and !-Lz. 3 , with the means !-L 1 .3 and 1-Lz·z· The corresponding term in ()is an estimate of the extent to which the difference of A 1 and A 2 is larger for C2 than for C3 . C factor
A factor
0
1
-1
0
-1
1
-1
1
0
1 -1
-
I C1 lc2
Figure 7.4.7
Any main-effect contrast may also be viewed as graphed points to be fit to sample data. For example, in a one-way three-level design with contrasts C1 and C2, the contrast weights may be depicted as in Figure 7.4.8. The weights for C1 are [1 0 -1] and for C2 are [0 1 -1]. C1 may be viewed as a "hypothesis" set of points across subclass means. To the extent that ~-L 1 and /La differ in the manner specified by the points for C1, the corresponding estimate
·,, ·. ' Weight
0
'
C1'
•• C2
.•,
' ' .·. '·.-...:.
'
-1
f..Lt
/-L3
Means Figure 7.4.8
Analysis of Variance: Models
in
6 will
249
be maximal. If the two means are equal in magnitude, the estimate of
M1-p,3, or fit to the line, will be reduced.
For a four-level factor witt) Helmert contrasts, the weight vectors for H1, H2, and H3 are [1 -1/3 -1/3 ·. -1/3], [0 1 -1/2 -1/2], and [0 0 1 -1]. These may be graphed as in Figure 7.4.9. The estimate of each effect can be
., Weight
0
•·.
''
H1 '\.
H2• ',
'
-1/2
.·..
.. ..... ...
', I ~---~eo--
•-.
-~
-1
Means . Figure 7.4.9
viewed both as a weighted comparison of means, and as reflecting the extent to which the means differ in accordance with the respective graph points. So, for example, for H2, sample mean.s of P,2 = 8, P,3 = 2, and (1., 4 = 2 will yield a maximum fit for the contrast with the t:JStimated value of 6. Deviations from the hypothesized pattern may either increase or decrease the estimate, but will affect other contrasts to a greater extent. For example, if the mean for group 4 were to be P,4 = 1 instead of P, 4 = 2, the estimate for H2 would become 6.5, while that for H3 would go from 0 to 1. Orthogonal polynomial contrasts describe curves of successively increasing complexity. For a four-level factor, the weights P1, P2, and P3 may be graphed as in Figure 7.4.10. The ~stimate corresponding to P1 will be maximal when all four means are proportionately larger (or smaller), in the specified order; that is, when they fall on the P1 line, with either positive or negative slope. If p,1 and p, 4 tend to differ from p,2 and p,3, while within these pairs the means are equal, the estimate for P2 will be m~ximal. If, instead, the four means do not conform to any single pattern, they may be described by nonzero functions of several of the weight vectors. For example, a learning curve with time-point means of 5, 12, 16, and 17, will be reflected in both the P1 contrast (for the monotonic increasing trend) and the estimate for P2. The P2 reflects the parabolic trend of decreasing acceleration over time. That is, the entire curve may be viewed as a parabola resting upon the line of weights P1. This may be graphed as in Figure 7.4.11. I
250
Method
3
/. P1
/
/ /. Weight
/
/
0
/
/
-1
·....... / .
/
/ -3
./
Mean
Figure 7.4.10
20
P2
.,..,. (17)
15 10 5
/1>2
Means
Figure 7.4.11
CHAPTER
I
Analysis of Variance: Estimation The multivariate analysis-of-variance model is Y.=A<">*+E.
(8.0.1)
Y. is the Jxp matrix of means for J groups of observations on p outcome variables. A is the Jxm model matrix. 8* is the mxp matrix of unknown analysisof-variance parameters, or effects, with one column of effects for each criterion variable. E. is the Jxp matrix of mean errors, with distribution (8.0.2)
D is the JxJ diagonal matrix of subclass frequencies . .1: is the pxp matrix of variances and covariances of the criterion measures. J is the total number of groups in the design and may result from crossing two or more sampling factors; for example, in an axb two-way model, J is the product ab. When J-J 0 subclasses have no observations, rows of Y., A, E., and D may be eliminated corresponding to the null groups. Unique estimates of all terms in 8* cannot be obtained, since the columns of A are linear functions of one another. The maximum number of estimable parameters is J, the number of groups or the total degrees of freedom among means. Equation 8.0.1 can be reparameterized to full rank by defining I(~ J) linear combinations of the parameters that are estimable. I is the rank of the model for significance testing. Since we usually define as many parameters as we have degrees of freedom, in most situations I= J. The weights defining the linear combinations form the rows of an /xm contrast matrix L, of rank/. The alternate parameters are
8=L8*
(8.0.3)
8 is Jxp and contains the alternate parameters (contrasts) to be estimated. Each column of 8 is the set of contrasts for one criterion measure. When 8 replaces 8* in Eq. 8.0.1, a model matrix for the new parameters must also replace A. The reparameterized model matrix, or basis for the design, is the Jx I matrix K, which satisfies A= KL. The model is reparameterized from 251
252
Method
Eq. 8.0.1 to Y. = KL8* +E.
= K8+E.
(8.0.4)
Lis chosen from the hypotheses of the research. K is constructed from L by K = AL' (LL')- 1
(8.0.5)
Alternatively, K may be constructed directly from L or by Kronecker products (for crossed designs), as described in Section 7.4. Note only that K defines the independent variables for analysis of variance and, like A, is unaffected by the number of criterion measures. Although the basis provides no useful interpretive information, its construction is essential for the estimation of 8. Once the model is reparameterized to Eq. 8.0.4, its form is identical to the full-rank regression model, and may be solved by least-squares regression procedures. 8 in Eq. 8.0.4 is identical in function to the regression weights B. K is the full-rank model matrix, like the matrix of regression predictors X. In fact, the entire analysis of variance may be performed through regression formulas and programs. Rows of K are employed as values of the predictor variables and are appended to the outcome vector for each subject. The procedure is referred to as coding "dummy variables" for regression analysis. The estimated regression weights will be identically the contrast values for 8.
8.1
POINT ESTIMATION
The least-squares estimation of 8 follows directly from the minimization of tr (:E:DE.) in the sample, as presented in Section 7.2. When the model is reparameterized to full-rank form, as in Eq. 8.0.4, the normal equations are obtained from Eq. 7.2.11, with K and 8 in place of A and 8*, respectively. The leastsquares estimate is obtained by solving (K'DK)@= K'DY.
(8.1.1)
Premultiplying both sides of Eq. 8.1.1 by (K'DK)-1, we obtain (K' DK)- 1 K' OKS= (K' DK)- 1 K' DY. and
(8.1.2)
8 contains I /east-squares estimates of effects for each of the criterion measures. For example, in Chapter 7 we reparameterized a one-way fourgroup model to simple contrasts of each group with the last (Eq. 7.3.9). Let k= ~-t+114l:3 a;. The estimated 4xp matrix is: (8.1.3)
Analysis of Variance: Estimation
253
In the 2X3 crossed design (Eq. 7.3.13a), with effects k= JL+a.+f3., a 1 -a., f3r-f3., and {32 -{3., the form of 8 is
(8.1.4)
Each column contains exactly the estimated effects for a single outcome variable; that is, it contains those that would be obtained for the single measure if it were the only criterion. The terms of 8 are the estimated mean differences for each of the measures. Together with an estimate of their precision, they provide all the descriptive data necessary for interpreting the analysis-of-variance outcomes. The magnitude of the estimate, plus the sign, reveals the degree to which the groups differ and in which direction. The only requisite to their interpretation is knowledge of the specific contrasts estimated by each term. By comparison, "strength of effect" or proportion of explained variation measures are not generally valid in fixed-effects models. Each row of the matrix 0 represents a single contrast or single degree of freedom between groups. There are at most as many contrasts as the total between-groups degrees of freedom J, also the total number of groups (or Jo if some groups have no observations). For purposes of significance testing, we may test the nullity of any one contrast (one row of@), or of multiple contrasts. For example, a simultaneous test of the last three rows of@ for the one-way model (Eq. 8.1.3) is a test that all four group mean vectors are equal (Ho: ar= a 2 =a3 =a4 ). Test of the two {3-effects in the two-way model (Eq. 8.1.4) is the test that the three means for factor B are equal (H0: Pr = fJz = (33). These simultaneous tests yield the common "omnibus" results for multiple-degree-of freedom main-effect and interaction hypotheses. Under the general model, tests can also be conducted on individual contrasts or on subsets of the contrasts among means.
Example The estimation of terms for the two-way data of Table 3.3.2 is presented in Section 7.3. The matrices of means and frequencies are
Y ·
=
.50 [ 3.17 2.20 3.43
Yr
.75] Group 11 3.67 Group 12 2.40 Group 21 3.57 Group 22
4
D- [
6 Zero)
Yz
The analysis-of-variance model is
with j = 1, 2 and k
=
1, 2. There are four groups of subjects; the rank of the
254
Method
The basis is
.5 .5 .25~ .5 -.5 -.25 .5 -.25 1 -.5 1 -.5 -.5 .25
1
K= [ 1
The estimate of 8 is
(K'DK)- 1
fl=
047 (Symmetric] [ 009 .190 - 018 .007 .190 007 .070 .037 .760
l
K'DY.
56.0 62.0~ -7.0 -6.0 -15.0 -16.0 -1.0 -1.5
2.32 2.60] Constant [ -.98 -.78 A main effect - -1.95 -2.04 B main effect -1.44 -1.75 AB interaction Yt Yz The elements estimate effects for separate criterion measures. For example, [(&~-~)+('Yt·-Y2·)]
(Yt)
[(&t-&z)+(Yt·-Yz•)]< l=-.78
(Yz)
2
If interactions are all equal, then these are the simple &I-&2 estimates.
Properties of 0
e
The elements of are estimated under a simultaneous least-squares procedure. That is, they provide best estimates given the particular number and selection of effects to be estimated. The order of effects is of no consequence. However, the addition or deletion of terms wiH generally affect the estimates of all remaining parameters. Each parameter is estimated given, or eliminating, all others. If subclass frequencies are unequal, or if nonorthogonal contrasts are chosen, then the number of parameters estimated will affect all of their numerical values. Under completely orthogonal conditions (equai-N;, orthogonal contrasts), the estimates are independent and will not change with the inclusion or removal of other independent effects.
Analysis of Variance: Estimation
255
We may examine a simple orthogonal case. Consider a one-way design with four levels, and twelve observations per subclass. With Helmert contrasts, the orthogonality is maintained and the estimation of effects through matrix operations may be expressed through scalars. If either orthogonality condition were violated, however, the demonstration would be more complex. The model matrix, contrast matrix, and basis are
1/4 1/4 1/4 1/4~ HO 1 -1/3 -1/3 -1/3 H1 0 . 1 -1/2 -112 H2 -1 H3 0 0 1 and K=AL'(LL')- 1 3/4 0 -1/4 2/3 -1/4 -1/3 -1/4 -1/3 HO
H1
H2
OJ
0 1/2 -1/2
H3
The diagonal matrix of subclass frequencies is D=diag (12, 12, 12, 12) The estimates for one outcome measure are
=
[~~1:_~/:(~.z+y.3+Y·4] ~~ Y·z-1/2(y.a+Y-4) Y·3-Y·4
H2 H3
(8.1.5)
Were the number of variates to be greater than one, 0 would have the same function in each row but a separate column for each additional measure. The elements of {) are simple and obvious combinations of the observed group means. If nonorthogonal contrasts were selected, or if the elements of D were not equal, K'DK would not be diagonal, and the linear functions of the
256
Method
means would be significantly more complex. In particular, the diagonality of K'DK assures that only the single scaling constant (1/48, 1/9,1/8, 1/6) multiplies each effect in K'Dy .. With nondiagonal K'DK, elements would also multiply K'Dy., which are functions of effects other than the single one being estimated. Under nonorthogonal conditions, each effect is a linear function of all of the means or of all other effects as well. Let us examine the distribution of the estimate 0. Since elements are linear functions of the rows of Y., we shall first examine the distribution of the mean matrix. According to the assumption of Eq. 8.0.2, E. ~JV(O, o- 1 ® !,). From this we may obtain the expectation and covariance matrix of Y.; the expectation is i3(Y.)=i3(K€HE.)
=K0
(8.1.6)
and the covariance matrix is r(Y.) = i3[Y.-i3(Y.)] [Y.-i3(Y.)]'
= i3[(Y.-K0)(Y.-K0)'] =i3(E.E:) =r(E.) = o-1®!,
(8.1.7)
The mean observations Y~;. like the mean errors, are independently distributed with variance-covariance matrix (1 IN;)!.. These expressions may be used to determine the properties of 0. 0 is an unbiased estimate of 0, since 13(0) = i3[(K'DK( 1 K'DY.] =
(K'DK)- 1 K'Di3(Y.)
=(K'DK)- 1K'DK0 =0
(8.1.8)
The variance-covariance matrix of the elements of 0 is //(0) = :?/'[(K'DK)- 1 K'DY.]
= (K'DK)- 1 K'Dr(Y.)D'K(K'DK)- 1 =
(K'DK)- 1 K'D[D- 1 ®!.]D'K(K'DK)- 1
=
(K'DK)- 1 ®!,
=G®!,
(8.1.9)
The covariance matrix of the ith row of 0 (that is, 0';) is the pxp matrix [g;;!.]. [g;;] is the ii diagonal element of (K'DK)- 1. That is, estimates of a single contrast for different variables are interdependent and to an extent proportional to the covariance of the respective measures. Generally, mean contrasts for multiple variables are not independent, and multivariate test criteria should be employed.
Analysis of Variance: Estimation
The variance-covariance matrix of the kth column of elements in Ok) is
257
0 (that is, (8.1.1 0)
is the variance of criterion measure Yk· The diagonal elements of Eq. 8.1.1 0 are the variances of the I contrasts for one criterion measure; the square roots are the standard errors.
a-k 2
G"a 17
ik
= G"k v'"g;;
(8.1.11)
where {Jik is the estimate of contrast i for criterion measure Yk· G or (K'DK)- 1 may be nondiagonal (nonzero covariances), indicating that the contrasts in {jk are not independent. That is in general mean contrasts, even for a single criterion, are interdependent. This complicates significance testing, since tests on various effects are also not independent. If G is not diagonal, the terms in c3 change if contrasts are added or deleted or if the selection of contrasts is altered. Nondiagonal G may result from one or both of two conditions: nonorthogonal contrasts in Lor unequal elements in D (that is, unequal subclass frequencies). Contrasts may be rendered independent by selecting only orthogonal contrasts, and by restricting cell frequencies to being equal. This situation can be solved with scalar algebra, as presented in most texts. Solutions under the general linear model, while more complex, do not place the stringent requirements upon research design. If rows of the contrast matrix L are orthogonal, the orthogonality of K= AL'(LL'}- 1 is maintained. LL' is diagonal, and multiplication by (LL')- 1 constitutes only a rescaling of the contrast vectors. Since construction of K involves appending a unit vector to a subset of the scaled contrast vectors (see Eq. 7.4.24), K is likewise orthogonal. Contrariwise, the nonorthogonality of L will in general destroy the orthogonality of the basis. If elements of D are not equal, G will generally be nondiagonal, even with orthogonal K. For example, assume a two-group design with two observations per group. Then
K=
[11 -11]
D=[~ ~r
and
The variance-covariance matrix of the estimated grand mean and mean difference has (K'DK) = [ 4
0] 0 4
and a-2(K'DK)-1=a-2
[1~4 1~4]
With equal NJ, the estimated constantterm and mean difference are uncorrelated. If we introduce a single additional observation into the second group,
D= [~
~]
258
Method
Then (K'DK)
= [-15
- 1]
5
and (J"
1]
2(K)DK)-1=u-2[5 24 1 ,5
The two effects are interdependent, with covariance a-2/24. The reason for the nonzero covariance can be more plainly seen if we write basis vectors for all observations, rather than only for means. In the two-group example with N1 = N2 = 2, the basis vectors are orthogonal. That is,
K=~1 ~~
.
1 -1 1 -1
With an additional observation in group two, each vector takes on an additional element, to destroy the orthogonality. That is,
r~ -~1
- K= 1 -1 1 -1
The inner product is -1. Since one or both sources of nonorthogonality may be present in a particular research design, a general orthogonal solution is necessary. Then the source of nonorthogonality is of little consequence. The need to discard observations or to restrict contrasts to those that are orthogonal is obviated. We will perform tests of significance on elements of 8 by transforming 8 to a matrix with independent rows. In a completely orthogonal arrangement (equal NJ> orthogonal contrasts), the rows of 8 are independent, and the transformation involves only a simple rescaling. The matrix G=(K'DK)- 1 in Eq. 8.1.10 is the matrix of variance-covariance factors among the estimates. Its elements are proportional to the variances and covariances by the constant factor u-k 2• G or a-k2G may be reduced to correlational form in the usual manner. Let De= diag (G), then
Re = De-112GDe -112
(8.1.12)
Re are the correlations among the estimates for any one criterion measure. The correlations provide a standardized measure of the extent to which the contrasts are interdependent and would change with the addition or deletion of other effects. That is, the elements are measures of the degree of nonorthogonality of the design.
Analysis of Variance: Estimation
259
Example
The variance-covariance factors among the estimates for the data of Table 3.3.2 are .
[.047 (Symmetric)] Constant .009 .190 A main effect G = (K'DK)-l = .018 .007 .190 B main effect .007 .070 .037 .760 AB interaction Constant A B AB The variance-covariance matrix is (K'DK)- 1 times the constant multiplier uk 2 for variable Yk alone. When uk2 is ·estimated, we may estimate the covariance matrix and the precision of The correlations among the estimates are the same with or without ui. The constant multiplier drops out in the computations. The diagonal matrix of variances is
e.
De= diag (.047,
.190,
.190,
.760)
The correlations are .000 (Symmetric)] Constant A main effect R _ [ .097 ·1.000 c- .185 B main effect .034 1.000 .034 .185 .097 1.000 AB interaction B AB Constant A The inequality among subclass frequencies has introduced small positive correlations among all the estimates. If the interactions were not estimated, we would expect the constant estimate to change slightly, the B contrast to a somewhat larger extent, and the A contrast still more. Never. theless, none of the intercorrelations is particularly large.
Conditions for the Estimation of 0 The conditions for estimating 0 depend upon the conditions for the selection of contrasts in Section 7,3. In order for 0 to. be estimable by Eq. 8.1.2, the product K'DK must be of full rank so that it may be inverted. K is Jx/ and the gram ian K'DK is /x/. The first condition is that I cannot exceed J. The maximum number of estimable contrasts, including one for the constant term, is the number of groups of observations, or the total degrees of freedom among means. Fewer than J effects may be estimated. Further, no column of K can be exactly a linear function of other columns, This condition restricts the contrasts estimated to those values not completely determined by others in 0. For example, having estimated a 1 -a2 and a 2 -a3 , we could not include a vector for a 1 -a 3 in the contrast matrix. The corresponding
260
Method
vector of K would be a simple function of the preceding two vectors. Contrasts need not be orthogonal, however. If an element of Dis zero, then the rank of (K'DK) will be restricted further. One between-group degree of freedom is "lost" for each empty subclass in the sampling design. If the complete design has J groups of observations but only J0 (<J) groups have one or more subjects, the maximum degrees of freedom among means is restricted to J 0• The particular contrasts which may not be estimated are determined by examination of the sampling design (see Sample Problem 3, the dental calculus study). , ~is not restricted in any manner by the correlations among criterion measures. Each column of 0 comprises a single set of univariate estimates for the corresponding y-variate, and does not depend upon other columns.
8.2
ESTIMATING DISPERSIONS The variance-covariance matrix of 0 is given by Eq. 8.1.9. That is, Y(@) = (K'DK)- 1 ®~ =G®~
(8.2.1)
For any one criterion measure, we may extract a diagonal element of~ to obtain the variance-covariance matrix of the corresponding column of ~. For yk, (8.2.2)
K'DK is determined by the analysis-of-variance model.~ may be estimated from sample data. The estimate is used to describe the intercorrelations among the criterion measures and to provide interval estimates and tests on elements or sections of 0. The estimate of the pxp covariance matrix is provided by the sums of squares and cross products of the residuals, or discrepancies of the observations and the model. The total sum of products of the observed scores is
Sr= L;
L; Y;;Y{;
=Y'Y
(8.2.3)
where Y is the Nxp data matrix; Sr has N degrees of freedom. The sum of products due to the model (between groups) is
Sn
= (Ke)'DK0 =0'K'DK0
(8.2.4)
The center matrix D is necessary since we have defined K as the basis for the means, while the total and residual sums of products include variation among subjects. That is, each mean is weighted by the number of subjects it represents. S8 is the weighted sum of products of /linear functions of the observations and has I degrees of freedom.
Analysis of Variance: Estimation
261
The residual sum of products is the difference
SE = Sr-SB =Y'V-S'K'DKS
(8.2.5)
SE is the error sum of squares and cross products. The error degrees of freedom aren.=N-1. The unbiased maximum likelihood estimate of the variance-covariance matrix is 1
:I= -sE n. A
=-1-(Y'V-S'K'DKS) N-1 ·
(8.2.6)
Each diagonal element of :i is ftk 2 , the sample variance of one criterion measure, given the model parameters. Each off-diagonal element is the sample covariance of two outcome measures. In the most common situation, I is equal to J and :i has a simpler form. This occurs whenever all between-group degrees of freedom are specified in the model (the maximum J or J 0 if there are empty groups). In this case all main effect and interactions are included in 8. The only residual variation is then the within-group sum of squares and cross products, or the sum of products of the individual scores deviated from their separate subclass means. Algebraicatly, when I=J the basis is square and can itself be inverted. Then
SB=S'K'DKS =
v:DK(K'DK)- 1 K'DK(K' DK)- 1K'DY.
=
V:DKK- 1D-1(K')- 1 K'DV.
=Y:DV. and
SE=Y'v-v:ov.
(8.2.7)
In terms of data vectors, SE is
J
N;
=L L
(y;.;-y.;)(Yu-Y·J)'
(8.2.7a)
j=l i=l
where Yu is the p x 1 vector observation for subject i in subclass j and Y·J is the vector mean for subclass j. That is,
262
Method
When SE has the form of Eq. 8.2.7, it is identically the sum of J sum-ofproducts matrices for the separate groups, each adjusted to the subclass mean vector. These are pooled to obtain the common within-group sum of products, under the assumption that I is the same for all subgroups of observations. Let Sw; be the sum of products of mean deviations for subclassj. That is, ·"'j
Sw;= ~1 (Yi;- y.;)(Y;;-y.;)'
(8.2.8)
Then Eq. 8.2.7a becomes J
SE=L Sw;=Sw
(8.2.9)
j=l
The degrees of freedom for SE are the within-group degrees of freedom, J
L
(N;-1)=N-J
j=l
Jo replaces J if there are empty subclasses. Whenever there are real or hypothesized differences among the mean vectors y.;. the within-group matrices Sw; and SE provide the only estimates of random variation that are not confounded with fixed mean differences. The diagonal elements of SE are the usual analysis-of-variance within-group sums of squares for each of the p measures. The within-group variance-covariance matrix is (8.2.10) The variances or diagonal elements of 1 are the within-group mean squares for the p outcome measures. i may be reduced to correlational form in the usual manner. This provides correlations among the y-variables which are not inflated, or deflated, by mean differences (see Section 3.3). Regardless of the form of 1, the population correlations among the criteria may be estimated from random variation about the model. Let a be a diagonal matrix of only the variances from I. That is, a=diag (I) (8.2.11) The standard deviations are (8.2.11 a) The pxp matrix of correlations among the criteria are
m= a -112Ia -112
(8.2.12)
Substituting 1 for I in Eqs. 8.2.11 and 8.2.12 yields the sample matrix ci. A correction for bias in ri is given by Olkin (1966). If the sample consists of subgroups with different vector means, the within-group correlations should be interpreted instead of correlations obtained by treating all subjects as members of a single population.
Analysis of Variance: Estimation
263
In the univariate case, i is the scalar residual variance of the outcome measure. That is, when p = 1,
i=&2 1 =-(y'y-ti'K'DKfJ) N-1
(8.2.13)
where y is the Nx 1 observational vector and!') is the single column of estimated parameters. Further, if I=J, then "' . ..:;... ""' (Yi; - Y·; )2 a 2- N 1 J ..:;.. A
-
J
(8.2.14)
'
This is the usual within-group variance or mean square, and comprises one diagonal element of l; in Eq. 8.2.10.
Example The total sum of cross products for the 22 observations of Table 3.3.2 is given in Section 3.3 as
S =[184 195]Yt T 195 212 Yz Yt Yz In the two-way analysis-of-variance model, we have I=J=4. K is square and S
V'DY
B=
•
.=
[167.65 183.18] Yt 183.18 201.00 Yz
Yz
Yt SE is the within-group sum of products:
SE=Sw=[16.35 11.72
11.72]Yt 11.00 Yz
The error degrees of freedom are ne=22-4=18. The error covariance matrix is
i=J_S£=[·91 .65]Yt 18 - .65 .61 Yz Yt Yz The within-group mean squares for y 1 and y2 are .91 and .61, respectively. The correlation of Yt and Yz is A
P12=
.65 v'.91 (.61)
·
85
The two variates have a high positive association.
264
Method
Given the estimate of X, we may substitute in Eq. 8.2.2 to obtain the standard errors of@. The variances of the elements in one column of@ are the diagonal elements of Eq. 8.2.2. For contrast i in column k, the sample variance of the single estimate is (8.2.15) where g;; is the ii diagonal element of G = (K'DK)- 1 . fh 2 is the variance of y"" from i. The standard error of O;k is (8.2.16) The standard errors may be expressed as an /xp matrix, having the sample standard error of 0;" in the ik position. Let d be a p-element vector of standard deviations from i. That is, (8.2.17) Define g as the /-element vector of square roots of the diagonal elements of G = (K'DK)- 1 . That is,
g'
=
[\/% vg;;
Vg';;].
(8.2.18)
The matrix of sample standard errors is
H=gd'
lh'tj
-l:::
(8.2.19)
The ith row of H, h';, contains the standard errors for the p elements of the same row of 0. That is,
The elements of H may be employed in the construction of confidence intervals about the elements of 8. Under the assumption of normally distributed Y., [0;k-eik]/&0ik follows at distribution with N-1 degrees of freedom. The 1-a interval estimate of O;k is (8.2.20) or (8.2.20a) where c = tN-r,a12 is the upper 1OOa/2 percentage point oft with N-1 degrees of freedom, and h 1k is the standard error & 81 k. Intervals may also be drawn on entire rows of 8, in the form of p simultaneous intervals for one effect or contrast, on all measures. Let 6'; be the (1 Xp) ith row of 8 and 0'; be the corresponding estimate. The p intervals in vector form, are obtained by adding and subtracting a multiple of h'; from the estimate 0' 1•
Analysis of Variance: Estimation
6';: o'.;±kh';
265
(8.2.21)
where k=
..JN-f-p+1 (N-I)p
Fp,N-l-p+l,a
(8.2.21 a)
k is a function of the 1OOa upper percentage point of the F distribution, with p and N-l-p+1 degrees of freedom. The multiplier k assures that the confidence level for every one of the p separate intervals in the vector is at least 1-a. It can be easily seen that when p = 1, the interval is the same as that specified by expression 8.2.20. With an increased number of variables k also increases, yielding wider intervals for each measure. That is, the intervals for individual variates must be wider than Eq. 8.2.20 to assure 1 -a confidence for the entire set. Intervals on rows of e are a special case of intervals on linear combinations of the rows. Linear combinations of rows of 8 are v'8, with v being any arbitrary weight vector. In expression 8.2.21, v is a column of an identity matrix, and v'8 = 6';. If other linear compounds are of interest, it is necessary to obtain the covariance matrix of the composite, v'S.
r(v'S) = v'r(S)v
= v' [(K'DK)- 1 ® I]v [v'(K'DK)- 1v]I
=
=wi
(8.2.22)
Substituting the sample value for I, the estimated variances of the linear composite for each of the p measures are the diagonal elements of The standard errors may be put in vector form. Express the square roots of the diagonal of i as a vector d', as in Eq. 8.2.17. The standard errors comprise the 1 xp vector,
wi.
h' =Vwd'
(8.2.23)
The 1-a interval estimate of v'8 is v'8: v'S±kh'
(8.2.24)
where k is defined by Eq. 8.2.21a. The confidence level is at least 1-a for each of the p intervals in the vector. This procedure for interval estimation directly parallels the procedure for regression, which is presented in Section 4.4. Example The estimate @tor the 2x2 bivariate crossed design is given in Section 8.1. The variance-covariance factors among the estimates are (Symmetric)] Constant 047 G= [ 009 .190 Amaineffect 018 .007 .190 B main effect 007 .070 .037 .760 AB interaction B AB Constant A
266
Method
The estimated variances are 6-, 2 = .91 and 6-l = .61. Each is estimated with ne = 18 degrees of freedom. The vector of standard deviations is
d'
= [v:9i58 VT11] =
[.953
.782]
The square roots of the diagonal elements of G, in vector form, are g' = [.218
.436
.436 .871]
The standard errors are
21 .17~ Constant [ 42 .34 A main effect H= 42 .34 B main effect 83 .68 AB interaction Y~
Let us extract the A main-effect contrasts from 0. The means for the first level of the A factor, minus the means for the second level are
{)•2 = [-.98 -.78] The corresponding standard errors are
A .95 confidence interval on 821 alone requires t 18 •. 023 = 2.1 01. The interval is
-.98-2.101 (.42),;; 821 ,;;- .98+2.1 01 (.42) or
-1.85,;;
821,;;
-.11
We are convinced (with .95 confidence) that the population mean for A, is below that for A2 , on y, alone. To draw a simultaneous interval on the two elements of 8' 2 we require F2 , 11,.m; = 3.59. The .95 bivariate interval has constant k-
..J22-4-2+1 (22-4)2
3.59-2.76
The lower limit to the interval is
[-.98 -.78]-2.76[.42
.34] = [-2.13 -1.72]
The upper limit is
[-.98 -.78] +2.76[.42
.34] = [.16
.16]
The interval for 821 alone is wider, to assure at least .95 confidence for both terms. For both variables together we cannot be certain that the popula-
Analysis of Variance: Estimation
267
tion mean for A1 is below the mean for A2 • Both separate intervals contain zero. A multivariate test of significance is necessary to decide if there are real group-mean differences.
8.3
PREDICTED MEANS AND RESIDUALS
The estimates 0 and their standard errors provide all the data necessary to interpret the fixed effects in analysis-of-variance models. ~ also contains the information necessary to test hypotheses about between-group differences, although the tests may be complicated by the non independence oft he rows of 0. As an interpretive device, subclass means as predicted or estimated through the model, may provide direct insight into the effects in the data. It is often useful to estimate subsets of terms in the model and to use those terms alone to predict the mean outcomes. For example, consider an Ax B two-way fixed-effects factorial arrangement, with a levels of factor A and b levels of B. The p-variate mean model is (8.3.1) All terms are p x 1 vectors. It all terms including interactions are non-null, then the only estimate of Y·Jk is the sum of all the effects. Represent the estimated vector mean by Y·;k· Then Y·jk = ft+a;+/Jk+Y;k (8.3.2) The predicted Y·;k is of course equal to the observed sample mean Y·Jk· The terms in Eq. 8.3.2 exhaust all possible sources of variation in the vector mean. Should some terms be taken to be zero, by assumption or by hypothesis tests, it may be desirable to omit them from the model and to predict means from those that remain. Two purposes are served. One, the means predicted under the smaller model are generally easier to interpret. This occurs by the elimination of random sources of variation distinguishing one vector from another. The resulting means are estimated with smaller standard error, since fewer components are summed than in Eq. 8.3.2. Two, comparison of means estimated under models of alternate sizes, or of estimated and observed means, can generally lend insight into the effects omitted from the larger model. The examination of residuals is a useful device for the discovery of unusual treatment effects, or of subclasses of observations which conform particularly well, or particularly poorly, to the model underlying the data. To complete the example, assume that we discover that neither the j'Jk nor the Y;k differ significantly from zero. We may remove these terms from the model, to obtain best estimates of those that remain and to predict the mean outcomes from them. The predicted means are (8.3.3) Predicted Y·Jk tor any variate will differ from one level of A to another. They do not differ across levels of B, however, nor will A mean differences vary across
268
Method
B. In this manner, the predicted means provide a clearer illustration of specific effects of importance than do the observed means. The number of degrees of freedom in the prediction model is c, the rank of the model for estimation. In Eq. S.3.2, c is equal to the product J = ab [1 for p,, plus a-1 for a 3, plus b-1 for f3k, plus (a-1)(b-1) for 'Y;k]. It is also equal to I, the rank of the complete model or the rank of the model for significance testing. In Eq. 8.3.3, the rank of the model for estimation is c= a (1 for p,, plus a-1 for a 3). The rank of the model for estimation can be no greater than the rank of the full model I, and is usually less. When c = I= J, as in Eq. 8.3.2, the model is a trivial one since J subclass means are predicted through an equal number of more complex parameters. The differences between the observed and estimated means are the mean residuals. For the model of Eq. 8.3.3 these are
[P,+aj+fik+Yjk]- [jHaj] = fik+.Yjk
Y·;k-Y·Jk=
(8.3.4)
The residuals, estimated in separate components (fi, y) or together, can indicate factors operating in the data that are not described by the model. In more complex models, the interpretive facilitation may be even greater, as the number of extraneous sources of variation excluded from the model increases .. Single interactions or complex terms in the model that are nonzero may be understood by the comparison of predicted means under models including and excluding the specific term of interest. In designs having subclasses without observations, the reduced-rank model can be used to predict the missing means (see Sample Problem 3, dental calculus reduction). Suppose there were no subjects in the second B group at the first A level. The mean predicted by Eq. 8.3.3 would still be the mean for all groups at the first level of A; that is, y. 12 =P,+a1 . The expression does notrequire either a fJ or 'Y estimate. Under a "main-effect" model, y. 12 = P,+a 1 +fi2 , the estimates can still be obtained as long as there are any observations at the second B level. These observations make the estimation of fJ 2 , and then y. 12 , possible. However, y 12 would not be estimable. For the estimates y. 12 to be valid, it is necessary to assume that the group mean y. 12 follows the same model as vectors for other subclasses, with no unique interaction. Residuals for null subclasses cannot be obtained. The predicted means for all subclasses can be obtained in matrix form. The degrees of freedom in the model for estimation is c; this is also the number of columns of the basis or rows of 8 in the estimation model. .Let Kc be leading c columns of K, corresponding to the terms to be included in the estimation model. Then the cxp matrix of estimated effects is (8.3.5)
e
ec
Since rows of are generally interdependent, may differ from the leading rows of the txp matrix 0. The variance-covariance matrix of Sc is 'Y(Sc)= (K~DKc)- 1 ®I. '
Analysis of Variance: Estimation
269
The matrix of means predicted through the rank-c model is
Y.= KeEle
(8.3.6)
When c=I=J, Kc is square and Eq. 8.3.6 reduces to the trivial form Y.=Y.. When c < J, the estimated and observed means differ by the terms that are omitted from Kc and Elc. Y. is of order Jxp, even when some subclasses have no observations. Under the general arbitrary-N model, each predicted mean equally represents a particular subpopulation. The rows of Y. may be averaged without respect toN; to obtain estimates of row and column means. These are not biased by disproportionate sample sizes. For example, in a 2X3 (Ax B) crossed design, the matrix of predicted means has row vectors y;jk· The estimate of the population mean for the first level of A is the simple average, (8.3.7)
Y·t· does not depend on the number of observations in the groups, even if there are one or more null subclasses. The same unweighted averaging may be employed across any dimensions of classification, to obtain the combifled estimated means for row, column, or interaction effects. From Eqs. 8.1.6 and 8.1.7, the expectation and covariance matrix of Y. are i.if(Y.) = K8 and r(Y.) = D- 1 ®!, respectively. The expectation of the predicted means is (8.3.8) The covariance matrix is
r(Y.) = //"(Keflc) = Kc?/'(Elc)K~
= Kc[(K~DKe)- 1 ®!] K~ = Kc(K;.DKc)- 1 K~®!
=Q®!
(8.3.9)
Although the observed means are independent across subclasses, the predicted means are not. The covariance matrix of the kth column of Y. for one criterion measure is (8.3.10) In general, the matrix is nondiagonal and predicted means are interrelated. The diagonal elements of Eq. 8.3.1 0 are the variances of the predicted means; the square roots are the standard errors. The standard error of the predicted mean in group jon variate k is a-k \fei;j. Substituting the sample standard deviation for a-k provides an estimate of the standard error, which may be used to draw confidence intervals on particular predicted values. The standard error of the mean predicted from a model of rank c (< J) is always smaller than the standard error of the corresponding observed mean a-h.;VN;.
270
Method
The estimated mean residuals are the Jxp matrix
E.=v.-v.
(8.3.11)
The residuals may be standardized to a common metric by dividing each element by the standard deviation of the corresponding variable. The standardized residuals are
:E:=.E..i-1/2
(8.3.12) \
.i is the pxp diagonal matrix of variances from i, or .i=diag (i) . is the diagonal matrix of inverse standard deviations. The conversion to :E: facilitates comparing the residuals across variates. The residuals may be further standardized to mean zero and unit variance, to assure comparability across subclasses with different N/s. This requires the expectation and covariance matrix of E .. The expected value of each residual is zero. The covariance matrix is .i-112
r(E.)=r(Y.-Y.) =r(Y.-Kc0c) =r(Y.-Kc[K~DKc]- 1 K~DY.)
=r(I-QD)Y. = (1-QD)r(Y.)(I-QD)'
= (1-QD)D- 1(1-QD)' 0l (8.3.13)
= (O-LQ)0l
Like the estimated means, the estimated residuals are not independent across subclasses, to the extent that Q = Kc(K~DKc)- 1 K~ is nondiagonal. The standard errors of the elements of E. are the square roots of the diagonal elements of Eq. 8.3.13. The standard error of e.;k for group j and variate k is (8.3.14) q;; is the jj diagonal element of Q and is never greater than 1IN;. (I"k is the stan-
dard deviation of the variate yk from l. Substituting the sample value for l in Eqs. 8.3.13 and 8.3.14 yields the estimate of the standard error(&, ). . . Under the assumption of normal e.Jk• the 1-a confidence interval on e.;k is ~
E•jk:
E•;k±tN-t,od6-€,;)
(8.3.15)
tN-l,a/2 is the 100a/2 upper percentage point of the t distribution with N-1 degrees of freedom. The null hypothesis may be tested that e.;k is zero, by direct comparison of the ratio E·;kl(&, ) to the critical t-value. H0 is rejected if the critical value is exceeded. ·;k
A Simple Case Consider the situation in which no between-group differences have been found, and the model best fitting the data has only a population scale constant.
·i.
Analysis of Variance: Estimation
271
That is, assume that Y·Jk =fl. The basis K1 is the J-element unit vector, 1, and the predicted means are
(8.3.16)
The variance-covariance factors are
... ... 1] 1 ...
(8.3.17)
1
The variances for variate k are all a-k 2/N, and are smaller than the variance of the observed mean a-k 21N1 since N = ~1 N1 • The estimated residuals are the differences
E.=Y.-Y.=Y.-1,1'
(8.3.18)
Each residual is the simple difference of an observed mean and the grand mean of all observations for the particular variate. The JxJ matrix of sample variances and covariances of residuals for all groups on one outcome measure is &k2(Q-!_Q)
= &k2( o-1-~ 11 ')
r~l-1J -1J
1
N
(8.3.19)
The standard error for the mean residual in group j is (8.3.19a)
The term can be employed to test the departure of the particular residual from zero. The standard error of the mean residual will tend to be .smaller as N1 or q" (subtracted from 11N1) becomes larger. The residuals have maximum precision
272
Method
when the number of subjects in the group and the number of terms in the model are large.
Example The observed means and frequencies for the data of Table 3.3.2 are .50
.75]Group 11
D= [4 6 (Z:ro)j
Y.= [ 3.17 3.67 Group 12 2.20 3.43
Yt
2.40 Group 21 3.57 Group 22
(Zero)
7
Yz
The within-group standard deviations are & 1 = .953 and & 2 = .782. Assume that both A and B main effects are significant to the model but the interaction is not. We shall estimate means based on a rank-three model,
Y·jk = fl+aj+fi,. The first three columns of the basis, for the constant and the A and B main effects, are
K,~[j
.5 .5 -.5 -.5 .5 -.5 -.5
51
B
Constant A
The least-squares estimates of the first three effects alone are @3 = (K;DKs}- 1 K~DY.
=[
2.34 -.85 -1.88
Yt
2.61] Constant -.62 A main effect -1.96 Bmaineffect
Yz
These values differ slightly from the corresponding elements of @ in Section 8.1, since the interactions have been deleted. The predicted means are .97 1.32~ Group 11 [ 2.85 3.28 Group 12 Y. = K,e3 = 1.82 1.94 Group 21 3.70 3.90 Group 22 ,
,
Yt
Yz
The estimated mean for all subjects at the first level of the A factor is (.97+2.85)/2=1.91 for Y~> and (1.32+3.28)/2=2.30 for y2 • The means in
Analysis of Variance: Estimation
273
Y. do not appear to be close to those of Y .. The residuals are -.47 -.57lGroup 11 , , .38 Group 12 [ .32 E.= Y.-Y. = .38 .46 Group 21 -.27 -.33 Group 22
Yt
Yz
The largest residuals appear for group 11, having mean observed scores very different from the other groups. Since E. contains only the interaction effects, it appears that there may be a statistically significant interaction. The standard error of i.tz =-.57 is
&2
~N1
-q!l
11
The (1, 1) diagonal element of Q = K3 (K;DK 3 )- 1 K~ is q 11 = .168. The standard error is .782
~~ -.168= .224
The predicted and observed means differ by about 2 1/2 standard errors. The large difference suggests that the mean for group 11 is being affected by important factors not included in the main-effect model.
8.4
SAMPLE PROBLEMS Sample Problem 2-Word Memory Experiment
The data for the four-treatment word memory problem may be analyzed by comparing mean recall scores across experimental conditions. For this, we may apply a one-way bivariate analysis-of-variance model. The number of words recalled and the proportion of word categories reconstructed are the outcome measures. The mean model is All terms are 1 x2 row vectors. The matrix model for all groups is
Y.=A0*+E.
Y. and E. are each 4 x2 matrices. The model matrix and parameters are defined as in the preceding sections. 1
1
A= [ 1 0 1 1
0 0
~ ~ ~J
0 0
1 0 0 1
274
Method
and
[0*]'= [t-t
a1
a2
aa
a4]
The data for all 48 observations are listed in the Appendix. The first observation has scores on the dependent variables, of 50 and 1, respectively. The second of these scores is the ratio of categories reconstructed to categories possiblethat is, 10/10. In addition, each observation has six time measures, to be employed for covariance analysis. The basic descriptive data for the problem are the matrix mean, the diagonal matrix of subclass frequencies, and the pooled within-group variancecovariance and correlation matrices. The means and frequencies are
y = .
[~;:~; 40.00
.97J Condition 1 .94 Condition 2 1.00 Condition 3 .97 Condition 4 Categories
36.25 Words and
D=diag (12, 12, 12, 12) The sum-of-product matrices for the four groups, adjusted to the group vector means, are Swl =
s "'3
[ 547.67 2.94
= [222.00 0.0
2.94] .02
s
J
s
0.0 0.0
'"2
= [505.67 6.62
6.62] .13
w4
= [21 0.25 2.30
2.30] .07
On inspection, the within-group matrices may appear not to be homogeneous. To maintain the example, we will ignore possible differences and pool to obtain a common estimate of "I. The pooled within-group sum-of-products matrix is 4
Sw=
~
Sw.
j=l
J
11.85 .22
=[1485.58 11.85 Words
J Words
Categories
Categories
Sw may equivalently be obtained from the total sum of products and the subclass means. That is,
Sw=
Y'Y-v:ov.
= [76831.00 1850.90
1850.90]- [75345.42 45.29 1839.05
The within-group covariance matrix is A
1
Vw="I =--Sw N-J
1839.05] 45.07
Analysis of Variance: Estimation
275
= _1_ [1485.58 11.85] 48-4
11.85
.22
J
= [33.76
.27 Words .005 Categories
.27
Categories
Words
The diagonal elements, s12 = 33.76, and s2 2 = .005, are the within-group variances or mean squares for the two measures. The within-group standard deviations are
s, = S2
=
v33.76 = 5.81
Words
Y.005
Categories
= .07
These may be included as the elements of a diagonal matrix, COw 112 , which is inverted to obtain the within-group intercorrelations. That is,
=
=
[1/~81
0 ] [33.76 1/.07 .27
.27 ] [1/5.81 .005 0
0 ] 1/.07
.65] Words 1.00 Categories
[1.00 .65 Words
Categories
The correlation of the two measures is high positive, r 12 = .65. In general, mean differences on the two measures do not appear large. There is a noticeable ceiling effect for the second outcome variable, with all subjects in treatment group 3 obtaining a perfect score of 5, or 100 percent. Between-group differences are not consistent. Although more words were recalled in group 2, fewer categories were reconstructed than in the other groups; however, differences on one or both variates may be nonsignificant. To test the hypotheses of the effect of structure upon word recall, three specific mean contrasts are of interest. These are the comparisons of groups 1 and 2, groups 2 and 3, and groups 3 and 4, respectively. The contrast matrix is
l
1 .25 .25 0 1 -1
.25 0
L= 0
0
1
-1
0
0
0
1
.25] LO
0 0 -1
L1 L2 L3
The row vectors LO, L 1, L2, and L3 are not orthogonal. The alternate parameters are
276
Method
Each difference is a two-element vector, having the contrast value for words and categories, respectively. The basis, necessary for estimating 0, is K = AL' (LL')- 1
[.00 -
.25]
.75 .50 1.00 -.25 .50 .25 1.00 -.25 -.50 .. 25 1.00 -.25 -.50 -.75 LO
L1
L3
L2
The parameter estimates are
8 = (K' DK)- 1K' DY. .97]~-t'+1/4 LJ aj .03 a;-a~ -.06 a~-a~ .03 a~-a~
39.56
= [ -2.33 2.17 3.75 Words
Categories
The standard errors of the estimated effects are the square roots of the diagonal elements of (K'DK)- 1 ®:I. These may be expressed as a column and royv vector, respectively. The variance-covariance factors of the estimates are
G
~
(K'DK)-'
~ lT"
(Symmetric)] LO L1 .1667 -.0833 .1667 L2 -.0833 .1667 L3 0
LO
L2
L1
L3
Since all N;'s are equal, the negative off-diagonal covariance factors are due exclusively to the nonorthogonality of contrast vectors. The diagonal elements of (K'DK)- 1 are proportional to the variances of the contrasts. In this particular instance, the three contrasts are simple mean differences. For example, 021 = y. 1 <1>-y. 2°)=-2.33 for the first criterion measure. From univariate theory, the estimated variance of a difference of means is
s 12 (_:!__ + _:!__) N1
N2
Since N1 = N2 = 12, we can see that this is exactly the value s 12 times the second diagonal element of (K'DK)- 1. That is, 51
2(1 1)- 2(1 1) N1 + N2 - 51 12 + 12 =s1 2
(.1667)
The standard error of the difference is the square root, s 1 "1/.1667.
Analysis of Variance: Estimation
277
All standard errors may be expressed in matrix form by defining
g= and
l
v% yyg;; g;;
llV0208J = [.144~
\1.1667 = \1.1667 ~ \1.1667
d' =
[s1
s2J
=
[5.81
.408 .408 .408
.07]
The matrix of standard errors is
H=gd'
~
.0~ ~-t'+1/4 L; aj
.84
= 2.37
a~-a;
2.37 2.37
.03 .03 .03
Words
Categories
a~-a~
a~-a~
To draw a confidence interval on any single element, we require
fv.a~ 2 •
With
a= .05, and v= N-1=44, the tabled t value is 2.02. The .95 interval estimate of ll21
is
-2.33-2.02(2.37)
~ (}21 ~
-2.33+2.02(2.37)
or
-7.12
~ (121 ~
2.46
The point estimate is negative. Our "single best guess" is that a 2 exceeds a 1 for the first outcome measure. The estimate of the difference is -2.33. That is, on the average, treatment 2 increased word recall by about 21/3 words over treatment condition 1. The standard error and interval estimate provide further information. In particular, since the interval includes zero, we may not conclude that there is a "significant difference" in treatment means, in favor of treatment 2. We may not conclude that information regarding a hierarchical structure was more beneficial to recall than was a lesser degree of information. At this level of specificity, the major hypothesis of the study is not supported. Inspection of the remaining estimates and standard errors shows few effects which, in isolation, appear significant. Contrasts other than those chosen may be significant, however. For example, &2<1>-&3<1'=2.17 and &3<1>-&P' = 3.75. The difference &2 <1'-&P' of 5.92 points is larger, and in the expected direction. A joint test of the three contrasts, for one or both variates, may support the major hypothesis of the study. This is discussed in the following sections. For comparison purposes, let us draw an interval on the second row of 8, (a~-a~;). as in expression 8.2.21 or more generally in expression 8.2.24. Let v'= [O 1 0 OJ. The estimate is identically the second row of the matrix,
v'S = [-2.33 Words
.03] Categories
278
Method
The estimated covariance matrix of the differences is
=
.045] Words .001 Categories
[5.630 .045 Words
The standard errors are
V5.63 and v':001, h' = [2.37 Words
Categories
respectively, or
.03] Categories
h' is equivalent to the multiplier~ times the vector of standard deviations, d' = [5.81 .07]. Also, h' is identical to the second row of H, due to the simple form of v. The vector interval requires Fv,v~l~p+l.cx· With p=2, N=48, 1=4, and a=.05, F2,4s .. o5 = 3.22. Then
k=
1 (48-4)2
"V 48-4-2+13.22
= 2.57
For just the number of words recalled,
-2.33- 2.57(2.37) ""'021 ""'- 2.33 + 2.57(2.37) or
-8.42""'
821
""'3.76
This is wider than the previous interval drawn on 821 , as the confidence is now at least .95 for the second variate, categories, as well as for the words measure. The rank of the model for estimation has been assumed to be c = 4. Since c = J, the predicted and observed means are equal, and the mean residuals are null. Should tests of significance indicate that some of the terms in@ are null, we may wish to reduce the model rank and predict means from the remaining nonzero effects. The estimate of@ for those effects remaining will be altered, si nee the three contrast vectors are not orthogonal. From inspection of the data, it appears that the major hypothesis of the study is not supported. No effect in@ exceeds twice its standard error, and the direction of the effects is not consistent. In particular, the "categories" variable shows little variation among subjects or groups. We suspect that the measure does not add any discrimination among experimental conditions.
Sample Problem 3- Dental Calculus Reduction The dentifrice study has J= 10 groups, representing the crossing of two years of experimentation with five experimental conditions. Experimental conditions one and two are controls, whereas three, four, and five represent three different experimental agents. The mean model for all ten subclasses is
Y.=A@*+E.
Analysis of Variance: Estimation
279
Y. and E. are each 10X6, with a column for each of the six tooth measures. The complete model matrix and parameters are given by Eqs. 7.1.29 and 7.1.30. The 10X10 diagonal matrix of subclass frequencies is given in Eq. 7.1.31: Since groups (1,5), (2,2), and (2,4) have no observations, rows of V., A, D, and E. corresponding to the three null subclasses are omitted as in Eqs. 7.1.32 and 7.1.33. The maximal rank of A is reduced from J= 10 to J0 = 7, the number of subclasses with observations. The first (or a) main effect is years; the second (or {3) effect is · treatments. The observed means for the ten groups are 2.25 . .88 2.56 1.56 1.00 .43 .60 .00
.75 2.25 3.75 4.13 1.33 1.78 3.11 3.33 .43 .86 1.29 1.57 1.00 .80 2.00 1.20 Y.=
.68
1.57
2.71
2.75
1.57
.71
.54
.79
2.08
1.71
.96
.67
rntrol1
-< Control2
!ll
Agent 1 :.. Agent2 Agent3
rntrol1
-< Control2
!ll
~
.23
.42
~·
~· ~
i%
.77
1.31
.65
.19
Agent 1 Agent2 Agent3
~· ~ ~ ~ ~ o<.'i! "/. ~'@ ~ Q)<1 otl> ~~ ~ ~· .,tl> ll3.. ~· ~ "~Q)/. ~r. r.~ .,tl> Q)/ •
"'
"'
~
~·
%,...
~
~·
%,...
~ ~·
il"o,...
~·
il"o,...
The frequencies are D=diag (8, 9, 7, 5, 0, 28, 0, 24, 0, 26) The within-groups variance-covariance matrix, given in Section 3.4, is 1.38 1.02 .81 Vw=I= 1.07 .74 .91
(Symmetric) 2.62 2.18 4.24 2.60 4.31 6.19 1.67 2.43 3.49 3.09 1.01 1.12 1.30 1.21 1.58
Right canine Right lateral incisor Right central incisor Left central incisor Left lateral incisor Left canine
~ OQ!
~·
.,IS>
The within-group variances, or mean squares, are the diagonal elements.
280
Method
The rank of the entire model matrix, or "rank of the model for significance testing" is /=J0 =7. Thus we will reparameterize and estimate seven linear combinations of the 18 effects in 8*. These may be obtained by first considering each main effect in isolation. For the two-level year factor, the model matrix is
1 1 OJ
A2 = [ 1 0
=
1
[12,
lz]
The parameters are IL· a 1 and a 2 • The contrast matrix is l2
= [~
.5
.5] CO -1 C1
Constant Years
The basis for just this factor is
Kz = Az~(Lz~)- 1
~ -:~]
= [
CO
C1
For the experimental conditions factor alone, the model matrix is
As= [1s,
Is]
The parameters, in addition to /L. are {J 1 and {J 2 for the control groups and {J 3 , {J4 , and {J 5 for the three experimental agents. To compare treatments for the unusual design structure, unique contrasts must be constructed. The first three contrasts are the mean comparisons for agents 1, 2, and 3, respectively, with the mean of the two control conditions. The final contrast is the comparison of the two control-group vector means (expected to prove nonsignificant). The contrast matrix is
~~[~
.2 .2 -.5 -.5 -.5 -.5 -.5 -.5 1 -1
.2 1 0 0 0
.2 0 1 0 0
·r 0 0 1 0
L1 L2 L3 L4
Constant Agent1 Agent2 Agent3 Controls
The basis is
~{ LO
-.2 -.2 .8 -.2 -.2
-.2 -.2 -.2 .8 -.2
L1
L2
-.2 -.2 -.5 -.2 0 -.2 0 .8 0
.5]
L3
L4
The complete basis for the population constant, the three active-agent contrasts, the contrast of the two controls, the year contrast, and the single estimable interaction, may be formed as Kronecker products of the columns of
Analysis of Variance: Estimation
281
K2 and K5 . Assuming that the products are computed with the K2 (year) vector as the prefactor, the 10 x 7 basis is -.2 -.2 .8 -.2 -.2 -.2 -.2 .8 -.2 -.2
K=
CO®LO
CO®L1
-.2 -.2 -.2 .8 -.2 -.2 -.2 -.2 .8 -.2
-.2 -.2 -.2 -.2 .8 -.2 -.2 -.2 -.2 .8
CO®L2
.5 -.5 0 0 0
.5 -.5 0 0 0
CO®L3
CO®L4
-.1 -.1 .4 -.1
.5 .5 .5 .5 .5 -.5 -.5 -.5 -.5 -.5
-.1
.1 .1
-.4 .1 .1
C1®LO
C1®L1
Parallel to the reduction of the matrix of means, the fifth, seventh and ninth rows of K may be eliminated, leaving the basis for extant groups as a square 7X7 matrix. We note that C1 ®L 1 is the only estimable interaction. The control and experimental agent 1 are the only two experimental conditions having observations in both years. The sampling diagram is given in Figure 8.4.1. Cells with X's Treatment
Figure 8.4.1
have no observations. If we write the weights for the interaction C1 ®L 1 as in Section 7.4, we obtain the diagram of Figure 8.4.2. No estimate that depends Treatment
B, A,
Ba
B.
-.5
-.5
1
0
0
1
.5
.5
-1
0
0
-1
-.5
-.5
1
0
0
F1gure 8.4.2
I
--------
~ L1 I
C1
282
Method
upon a null subclass is involved in the contrast. Although cell (2,2) has no observations, the mean for cell (2,1) and (2,2) may be estimated from cell (2,1) alone. Any other interaction of L2 and L5 contrasts is dependent upon other combinations of null-group means and cannot be obtained. Other sampling arrangements create other inestimable effects. For example, if the design had no observations in either B2 group, the diagram would be as shown in Figure 8.4.3. Two degrees of freedom between groups are "lost" for the two empty groups. However, one of the inestimable effects is a B maineffect contrast. No comparison with B2 is possible; the B degrees of freedom are three rather than four. In addition, one interaction contrast cannot be estimated. Treatment
a.
B,
Ba
84
Bs
A, Year
A. Figure 8.4.3
The rank of the full model for significance testing is /= 7. All effects that may contribute to criterion variation must be considered. However, it is hypothesized (and later verified) that only the first four effects are essential to the model. That is, the control contrast and the years and interaction contrasts are all null, and may be deleted from the model. We will obtain "best" estimates of only the remaining terms; the rank of the model for estimation is c = 4 (one degree of freedom to estimate the constant term, plus three for the activeagent contrasts). The terms to be estimated comprise the 4 x6 matrix 8 4 •
k'
J
CO®LO CO® L 1 [ 84 = {J4-1/2(P~+Pn CO® L2 P~-1/2(Pi+Pn CO® L3 p~-1/2(P~+P~)
The least-squares estimate of 8 4 is
Kt is the first four columns of K. Then
Constant Agent 1 Agent2 Agent3 I
Analysis of Variance: Estimation
283
Let g be the 4X1 vector of square roots of the diagonal of (K~D~)- 1 and d' the 1 X6 vector of standard deviations from i. Then the standard errors are
Inspection of S 4 reveals that, with only one exception, all comparisons with the controls are negative. For all three experimental agents, mean calculus scores were lower than in the two control groups. For example, agent 3 ({J 5), which appears the most effective, has contrast values ranging from -.59 to -2.21, for the right canine ahd the right central incisor, respectively. A test of any one contrast may be obtained by dividing the estimate by the respective standard error. The resulting statistic is referred to a table of the t distribution, with 100 degrees of freedom (N-J = 107-7). For example, with a= .01, the critical t-value for the one-sided alternative is -2.36. For the comparison of agent 3 with the control, this value is exceeded for all teeth but the canines (that is, -.59/.29 > -2.36 and -.72/.31 >-2.36). We might expect that a multivariate test of the agent effect would show significance, with the canines contributing no useful between-group variation. It appears that the only teeth to demonstrate effectiveness are those with a higher degree of calculus formation. Although the standard errors of the contrasts for agents 1 and 3 are similar, the mean differences are larger for agent 3, probably the most effective in reducing calculus formation.
284
Method
Means estimated from the rank-four model may be obtained for all groups, by employing the first four columns of K. These are
v.=~
1.73 1.73 .81 .80 .42 1.73 1.73 .81 .80 .42
~· 1>~ i9~... ~...
~·
i9~
"'
"
2.98 2.98 1.90 2.00 .77 2.98 2.98 1.90 2.00 .77
~
~· ..,@
~?.:
3.11 3.11 1.68 1.20 1.31 3.11 3.11 1.68 1.20 1.31
1.89 1.89 .97 .60 .65 1.89 1.89 .97 .60 .65
(@
(@
~"
"@
<9....,
~
9>/. /
@.., (':!
,.
~ ~·
(@
~
~~
,. ..,(> . "..-.·
~ 9>/
"'"'\2. o,...
~
.91 ] Contml1 .91 -< Control2 .61 l8 Agent 1 .00 :.. Agent 2 .19 Agent 3 .91 ] Contml t .91 -< Control2 .61 l8 Agent 1 .00 ~ Agent2 .19 Agent 3
"
~/.'
<9....,
~@
9>/.
'\2o,... iSla,••
1>.
"..-.·
iS'o,...
These means have a patterned simplicity, since all rows have been generated as linear functions of four rows of 0 4 • The elements of Y. are obtained by summing only a subset of all possible components, eliminating those hypothesized or known to be null. In particular, the control contrast {J 1 2 reflects only random variation and is among those eliminated. The resulting estimates of population means for the two control groups are identical. Since year effects are also judged to be null, the difference reflects only random variation and is not included in the estimates. As a result, the estimates for years 1 and 2 are identical. Finally, elimination of nonsignificant interactions produces a particularly simple pattern of means, free from random components that add complexity to the comparison of specific groups. The predicted means are more efficiently estimated than the observed means, which contain all components. The estimated variance-covariance matrix of Y. is o- 1 ®±. For example, the mean calculus score for the right canine in the first control group for year 1, y. 11 m, has variance
-P
~1 &1 2 = i(1.38) = .172 The standard error is v':i72= .415. The variance-covariance matrix of Y. is l<.t(K~DK4)- 1 K~®i = Q®I. The variance of y.11 w is the product of & 1 2 and the leading element of Q. This is approximately (1145)1.38 = .031. The standard error is v':031 = .175. Under the general model with arbitrary N, the estimated means may be combined across subclasses without weighting by the Nj. Let us obtain means for the five treatment conditions, combined across years of experimentation.
Analysis of Variance: Estimation
285
Averaging means for the two years in this example will yield the same means as we have for each year, since the two are identical. The reader may wish to compare the "combined estimated means" from one half ofY. with the "combined observed means" of Table 3.4.1. The observed values are confounded by the inclusion of all effects, of which some are nonsignificant, and by undue weighting for the larger groups (for example, control1, year 2; agent 1, year 2). In Y. vector means have been estimated for null subclasses. These groups are assumed to follow the same model as those with data. For example, the estimated means for agent 2 in the second year of experimentation, y:24 are valid as long as the observation of that treatment combination would not alter the mean effectiveness of agent 2, of agents in year 2, or introduce a nonzero interaction. Mean residuals, Y.-Y., may be obtained for the J 0 =7 groups having observations. These include the mean differences of the two controls, year differences, and interactions. It is useful to inspect these terms to identify unmodeled effects of possible interest. For the example, these are
E.=Y.-v. -.07 .52 .77 1.01 .36 -.04 Control1 .51 .04 .13 .22 .67 .64 ~ Control2 .03 -.18 :.. Agent 1 -.09 .05 -.62 - .11 .00 .00 .00 .00 .00 .00 Agent2 -.14 -.16 -.26 -.36 -.32 -.20 ~J Control1 .03 -.01 .18 .03 -.01 .05 ~ Agent1 .00 .00 .00 t\) Agent 3 .00 .00 .00
J
The residuals may be put in standard metric by dividing by variable standard deviations. A more useful metric is to divide each by its own standard error, yielding zero-mean, unit-variance statistics, which individually follow a t distribution. Residual degrees of freedom are N-J 0 = 107-7 = 100. The estimated variance-covariance matrix of E. is [D-LQ] ®I. The standard errors are the square roots of the diagonal elements. For example, the standard error of the mean residual for the right canine in the first control, for year 1, e. 11 m, is V1.38(1/8-1/45) = .376 The original residual is -.07. The t-value is -.07/.376 = -.19, and does not exceed the critical value of t100 at oc=.05. The effects eliminated for this tooth alone are seen to reflect only random variation (t10o •. o25 = 1.98).
286
Method
Dividing all the residuals by their respective standard errors, we have: -.09 -.19 1.00 1.27 .64 1.17 1.72 1.27 .22 1.46 .30 .09 -.44 -.13 -.22 -.90 .06 .09 .00 .00 .00 .00 .00 .00 -1.05 -.86 -1.10 -1.25 -1.56 -1.35 -.06 .13 .44 .90 .22 -.09 .00 .00 .00 .00 .00 .00
~
Control1
~ Control2 ~ Agent 1 ...... Agent 2
~J Control1 m Agent 1 ;;:, Agent3
None exceeds the critical t-value, although the rows corresponding to the controls exhibit some residuals of moderate magnitude. Together, these may throw some doubt upon the assumption that the control-2 agent was truly nonactive, or that control vector means are identical. Unfortunately, residuals for behavioral data are seldom so small or regular as those demonstrated here. Although they tend to confirm the fit of the model in this instance, in other situations they are useful to identify a lack of fit instead.
Sample Problem 4- Essay Grading Study The essay study has 32 experimental conditions; these represent all combinations of information provided teachers, concerning characteristics of the pupil-author. Eight combinations result from subjects being informed that pupil-authors were Negro or white, male or female, and of high or low ability. Four different essay pairs were employed as stimuli; each pair was scored by some teachers under each information condition. Each teacher-subject rated two essays (on two topics) supposedly written by the same pupil-author. The subclass frequencies, and mean teacher ratings under all 32 conditions, for both essay topics, are given in Table 8.4.1. The number of observations per group ranges from 1 to 6, with a total N of 112. Each subject has two outcome scores, which are themselves the sums of nine item responses. For comparative purposes in significance testing, we may wish to create a third score, as the sum of these two. Since the means for the sum are the sums of the separate mean ratings, they are not included here. The data matrix for the entire sample, Y, is of order 112 x 2. The total sum of products for the two essay topics is S"=Y'Y=[251969 1 262323
262323] 300707
Analysis of Variance: Estimation
Table 8.4.1
Group (j)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
287
Frequencies and Mean Essay Ratings for Teachers under Different Information Conditions* Information Given Race
Sex
Ability
Negro
Male
High
Essay Pair
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 .2 3 4 1 2 3 4 1 2 3 4 1 2
I
Low
Female
I
Hr Low
White
I
Male
High
I
Low
Female
I
High
I
T
:3 4
Number of Teachers (N;)
4 3 3 3 3 4 3 3 4 3 4 3 6 3 4 4 5 3 3 3 5 3 4 5 1 4 3 2 4 4 3 3
Mean Ratings Essay Essay Topic It Topic"*
44.25 53.33 47.00 40.33 37.33 39.50 47.00 34.33 61.50 46.00 61.25 49.67 41.17 35.00 43.75 47.75 56.20 39.33 59.67 52.33 45.80 43.00 41.25 48.00 10.00 40.00 59.00 39.00 37.25 30.75 48.00 41.67
34.50 59.67 67.67 47.67 28.33 58.50 70.00 35.67 44.75 68.67 56.75 49.00 33.33 41.00 47.75 46.50 56.80 53.00 70.67 54.00 45.00 63.67 56.00 44.80 10.00 39.25 61.33 43.00 29.25 47.25 60.00 33.67
*Range of possible ratings is from 9 to 90 points. tMy favorite school subject. *What I think about.
Adjusting to separate subclass mean vectors, the within-groups sum of products is
Sw=Sr-Y:DY.
= [13877.68 8475.37 Essay topic I
8475.37] Essay topic I 17429.10 Essaytopicll Essay topic II
Y. is the 32X2 matrix mean, and D the order-32 diagonal matrix of subclass frequencies, both from Table 8.4.1.
288
Method
The estimate of error variation is obtained by dividing by the degrees of freedom, N-J= 112-32=80. A
1
Vw=I= 80 sw = [173.47 105.94] Essay topic I 105.94 217.86 Essay topic II Essay Essay topic I topic II The variances, or within-group mean squares, are 173.47 and 217.86 for the two essay topics. It there are group-mean differences, these mean squares are valid estimates of the U"k 2 , whereas estimates not adjusting to separate vector means will be biased. The sample standard deviations for the two outcome measures are s1 = V173.47 = 13.17, and s2 = V217.86= 14.76, respectively. We may obtain the correlation matrix of the outcome measures by expressing the standard deviations in diagonal matrix form Dw 112 • Then
= [1/103.17
0 ][173.47 1/14.76 105.94
105.94][1/13.17. 0 ] 217.86 0 1/14.76
.54 ] Essay topic I = [ 1.00 .54 1.00 Essay topic II Essay Essay topic I topic II The two essay topics share a moderate amount of variation, but are tar from perfectly intercorrelated (r12 = .54). Maintaining the two separate measures for analysis appears necessary. The complete four-way analysis-of-variance mean model is Y·Jklm
= ~i-+a;+Pk+'Yl+Bm+(aP};k+(ay);t+(aB);m+(P'Y)kt +(P8hm+(y8)1m +(apy);kl+(ap8);km+(ay8);lm+(py8)ktm+(apy8);klm+E.Jklm
All terms are 2x1 vectors. Terms of the form (ap);k, (py8)k 1m, or (apyB);ktm do not indicate the product of main effects, but rather the interactions of the respective treatment conditions. a is the race effect; p, sex; y, ability; and 8, essay pair. The entire model matrix A has 32 rows for the 32 subclasses, and 135 columns for all main-effect and interaction parameters. The rank of A tor significance testing is I= J = 32. This is the sum of all main-effect and interaction degrees of freedom, including one for the constant. The residual or withingroup covariance matrix has 112-32 = 80 degrees of freedom. For exemplary purposes, let us concern ourselves with only the constant, the tour main effects, and the racexsex interaction. These terms have proven through further analysis to be statistically significant. The model matrix tor just these terms is the 32 x 15 matrix,
Analysis of Variance: Estimation
1 1 1 1 1 1 1 1 1 1 1 1 1 A= 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0
0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0
0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0
0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0
0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1
1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
289
0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 .1 1
The parameters are
[8*] I= [I-t a1 a2 /J1 (J2 'Yt 1'2 Bt B2 Ba B4 (a(J)u (a(J)t2 (a(Jb (a(Jb] The rank of the model for estimation, including just these terms, is c = 8. This is the between-group degrees of freedom for this subset of effects (1 for estimating I-t• plus 1 for a, 1 for (J, 1 for.,, 3 forB, and 1 for a(J). Let us develop the contrast matrix for all the effects in A simultaneously. The row of L corresponding to the constant term is the average of all rows of A. That is, l't = [1
.5
.5
.5
.5
.5
.5
.25
.25
.25
.25
.25
.25
.25
.25]
For the contrast a 1-a2, we can subtract the average of rows 17 through 32, from the average of rows 1 through 16. That is,
1'2 = [0
1 -1
0
0
0
0
0 0
0
0
.5
.5 -.5 -.5]
290
Method
We note that the a main effect is confounded with (a{J) interaction terms. Only if the interaction proves to be nonsignificant will the a main-effect estimates be unconfounded. Similarly, the row of L for {l 1-{l2 is the average of rows 1 through 8 plus 17 through 24, minus the average of the remaining sixteen. That is,
1'3 = [0
0 0
1 -1
0 0
0 0
0 0
.5 -.5
.5 -.5]
The fl main-effect estimates also involve (afl) interaction terms. The row of L for 'Y1 -'Y2 is the average of A rows 1 through 4, 9 through 12, 17 through 20, and 25 through 28, minus the average of the other sixteen. That is,
,,4 = [0 0 0 0 0 1 -1
0 0 0 0 0 0 0 0]
The 'Y main-effect estimates are not confounded with (a{J) interactions. However, the unequal cell frequencies may introduce interdependencies between a, {J, and 'Y estimates. For the different essay pairs, used primarily as a control variable, the type of contrast employed is not of much significance. For deviation contrasts, Bm-B. (m = 1, 2, 3), the rows of the contrast matrix are the average of the eight rows having Bm minus the average of all32 rows. The three vectors are
['1'65] = ,,7
[0 0 0 0 0 0 0 .75 -.25 -.25 -.25 0 0 0 0 0 0 0 -.25 .75 -.25 -.25 0 0 0 0 0 0 0 -.25 -.25 .75 -.25
0 0 0 0] 0 0 0 0 0 0 0 0
Finally, the interaction contrast (a{J) 11 -(a{J)t2-[(aflht-(aflb] may be obtained by including
=
[0
0 0 0
0 0
0 0 0 0 0
1 -1
-1
1]
Although the a and fl main effects are cqnfounded with the interaction, the interaction term is not confounded with main effects. Again, however, unequal subclass frequencies may induce interdependencies among all of the estimates. The basis for the eight effects may be obtained from L by juxtaposing the eight row vectors, and solving for K by Eq. 8.0.5. Or the basis vectors may be generated as Kronecker products of basis vectors for the individual main effects. For example, for all three two-level factors (A, B, C), the one-way basis is
K= [11 -.5.5] 2
CO
C1
For the essay factor (D), the basis is
~~[j DO
1 0 1 0 0 0 -1 -1 01
02
jJ 03
Analysis of Variance: Estimation
291
The 32X8 basis has vectors that are the Kronecker products of three K2 vectors and one K4 vector, in that order. The products are the first eight effects listed in Table 7.4.2; namely, 1. 2. 3. 4. 5. 6. 7. 8.
CO®CO®CO®DO C1 ®CO®CO®DO CO®C1 ®CO® DO CO®CO®C1 ®DO CO®CO®CO®D1 CO®CO®CO®D2 CO®CO®CO®D3 C1 ®C1 ®CO® DO
(Constant) (Race) (Sex) (Ability) (Essay pair 1) (Essay pair 2) (Essay pair 3) (Racexsex)
The leading c = 8 vectors of the reparametrized model matrix are
.5 .5
Ks=
.5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 .5 -.5 -.5 -.5 1 -.5 1 -.5 1 -.5 -.5 -.5 -.5 -.5 -.5 -.5 -.5 -.5 -.5 -.5
.5 .5 .5 .5 .5 .5 .5 .5 -.5 -.5 -.5 -.5 -.5 -.5 -.5 -.5 .5 .5 .5 .5 .5 .5 .5 .5 -.5 -.5 -.5 -.5 -.5 -.5 -.5 -.5
.5 .5 .5 .5 -.5 -.5 -.5 -.5 .5 .5 .5 .5 -.5 -.5 -.5 -.5 .5 .5 .5
1 0 0 -1 1 0 0 -1 1 0 0 -1 1 0 0 -1 1 0 0 .5 -1 -.5 1 -.5 0 -.5 0 -.5 -1 .5 1 .5 0 .5 0 .5 -1 -.5 1 -.5 0 -.5 0 -.5 -1
0 1 0 -1 0 1 0 -1 0 1 0 -1 0 1 0 -1 0 1 0 -1 0 1
0 0 1 -1 0 0 1 -1 0 0 1 -1 0 0 1 -1 0 0 1 -1 0 0 0 1 -1 -1 0 0 1 0 0 1 -1 -1 0 0 1 0
0 -1 -1
.25 .25 .25 .25 .25 .25 .25 .25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 -.25 .25 .25 .25 .25 .25 .25 .25 .25
292
Method
The parameters are
08 = [k Constant
Sex
Race
81-8.
82-8.
83-8.
Essay pair 1
Essay pair2
Essay pair3
Ability
(a/J)u- (afJ)12- {(a~b- (afJ)22}] · Racexsex
The variance-covariance factors of the estimates are
G = (K8DKs)- 1 9.16 -.44 -.45 1.62 = 10-3 X -.98 .01 .23 3.75
(Symmetric) 36.30 4.03 -1.55 -1.05 1.21 -.25 -2.02
36.36 -1.62 36.22 -1.04 .74 24.88 1.23 -1.01 -8.46 28.02 .96 -.95 -8.11 -9.51 -2.07 -.10 5.58 -7.92
27.56 .34
147.38
The estimates of effects are 49.09 45.31 .74 .22 .99 7.90 5.74 8.12 S8 =GK8DY.= -.31 -11.04 -4.37 4.75 11.24 5.12 -12.75 -11.80 Essay topic I
Constant Race Sex Ability Essay pair 1 Essay pair 2 Essay pair 3 Racexsex
Essay topic II
The standard errors of the elements are the products of the standard deviations for the two essays and the square roots of the diagonal elements of G. As a product of vectors, these are \1.00916 \1.03630 v.o3636
H =gd' =
\1.03622 v.o2488 v.o2802 v.o2756 v.14738
[13.17
14.76]
Analysis of Variance: Estimation
1 .26 2.51 2.51 2.51 2.08 2.20 2.19 5.06 Essay topic I
1.41 2.81 2.81 2.81 2.33 2.47 2.45 5.67
293
Constant Race Sex Ability Essay 1 Essay 2 Essay 3 Racexsex
Essay topic II
The contrast estimates and standard errors provide direct information on the effects in the data. First, it is clear that the individual essays are quite different. The estimates of essay effects vary from +11.24 to -11.04, with the two topics and four essay pairs showing few consistencies. For the first topic, "my favorite school subject," the first (almost randomly chosen) composition received scores averaging .31 below the mean of all essays (that is, 81°>-8.w= -.31). The second and third compositions received ratings averaging 4.36 points below and 5.12 points above the mean of all four stimuli, respectively. Though not estimated, 8p-8.w is -(-.31-4.36+5.12)=-.45 point. Thus the four compositions on topic I, and even more so on topic II, represent differing writing styles and qualities. Across essay pairs, this is an expected and desired experimental con.dition. Yet, for any one pair of compositions, it is possible that the credibility of the stimulus is threatened if a teacher notes that two essays supposedly written by the same pu pi I are of very different quality. The ability effect is consistent. When the teachers believed that the essays were written by high-potential students, they rated them 8.12 and 5.74 points, on the average, above the ratings given low-potential students. The standard errors are 2.51 and 2.81 for the two ability contrasts. Both ratios of contrast to standard error exceed the .05 critical tvalue with 80 degrees of freedom. Before concluding that this effect is significant and without exception, a multivariate test criterion should be applied to the pair of differences. Also tests of the interaction terms should be conducted. These will assure that the ability differenc.es are not an artifact of one or more exceptional essays or experimental conditions. These tests are discussed in Chapter 9. The contrasts of essay scores for black and white pupils are both positive (blacks having a higher average), but small. The mean differences of .74 and .22 point for the two essay topics, respectively, do not appear to be significantly different from zero. The standard errors are relatively large (2.51 and 2.81 ). Both sex contrasts are positive, but only one is large. On the average, males were rated .99 point higher than females on essay topic I, with a standard error of 2.51. For essay topic II, males received an average of 7.90 points more than females. The standard error is 2.81. This effect in isolation exceeds the .05 critical t value with 80 degrees of freedom; that is, 7.90/2.81 > 1.99. Since there is not consistency for both essays, we shall seek further information including the results of multivariate significance tests, before deciding whether the effect is significant.
294
Method
The interaction effect of race and sex is large with respect to the standard errors and is in the same direction for both essay topics. The difference of male and female means for Negroes is much smaller than the difference for whites. To clarify the nature of the interaction, we can predict the matrix of means from the rank-8 model. These are combined across ability and essay levels to yield means for each combination of race and sex. The complete matrix of predicted means is 32 x2, and is
Y.= Ks@s The residuals are the differences of observed and predicted means (Y.-V.). In this case the residuals consist of all estimable interaction terms, with the exception of racexsex. The residuals have been converted to tstatistics by dividing each by its standard error. These are presented, together with the predicted means, in Table 8.4.2. The predicted means may be combined across subclasses, without respect to the Ni, to yield means for combinations of effects. For example, the mean rating given male Negro authors on the first essay topic is 42.99, or (46.74+ 42.68+52.17+46.60+38.62+34.56+44.05+38.48)/8. The means for all racesex combinations are Male
Female
Negro r---:-:::..:...::...:=--+___:_:::..:...::...::-~ White L..:..:::..:.="---'---'-'c.:.=.:::_j Essay topic I
Male
Female
Negro ~---=-=-+--:-:---::-::--i Wh ite ~.:..::..::--'----'-=-=--=--' Essay topic II
It may be of interest, or necessary, to compute sex differences and standard errors for males and females separately. However, at least one trend is apparent from the tables. For pupil-authors thought to be black, the average ratings of males and females are 5.38 and 2.00 points apart, but in opposite directions for the two essay topics. By imprecise judgmental criteria, these differences appear comparatively small. Further, if the differences were random, they could well be expected to be in opposite directions. For pupils thought to be white, however, large rating differences are observed for both essay topics (7.36 and 13.80 mean points), consistently in favor of males. The teachers appear to hold very different expectations for white girls and boys, but do not distinguish when rat~ ing blacks. Girls, from whom better products are expected, are punished for average work. Questions are raised concerning additional factors, such as the teachers' own race and sex-group membership, as well as the influence of their expectations upon the pupils' actual achievement. A number of specific hypotheses may be drawn from inspection of the residuals. Only one value exceeds the 1.99 critical value, however. This is the mean residual for essay topic I in experimental subclass 25. The raw residual for the group is 10.00-45.01 = -35.01 with a standard error of 12.61. The .95 interval estimate of e. 2211 is -35.01 ±1.99 (12.61 ), or from -60.11 to -9.91. The interval does not include zero, and probably reflects an effect other than random variation. Paralleling this, the group shows a large but nonsignificant residual for the second essay topic. Inspection of the raw data and subclass means re-
Analysis of Variance: Estimation
Table 8.4.2
Predicted Mean Essay Ratings, and Residuals as for Teachers under Different Information Conditions Information Given
Group
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20' 21 22 23 24 25 26 27 28 29 30 31 32
Essay
Essay Topic I*
295
t Statistics,
Essay Topic lit
Race
Sex
Ability
Pair
Mean
Residual
Mean
Residual
Negro
Male
High
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
46.74 42.68 52.17 46.60 38.62 34.56 44.05 38.48 52.13 48.07 57.56 51.99 44.00 39.95 49.43 43.87 52.37 48.31 57.80 52.23 44.25 40.19 49.68 44.11 45.01 40.95 50.44 44.88 36.89 32.83 42.32 36.75
-.45 1.59 -.77 -.94 -.19 .90 .44 -.62 1.65 -.31 .66 -.35 -.66 -.74 -1.01 .69 .79 -1.34 .28 .01 .31 .42 -1.51 .81 -2.78 -.17 1.29 -.69 .07 -.37 .85 .74
42.04 57.83 64.31 48.13 36.29 52.08 58.57 42.38 40.04 55.82 62.31 46.13 34.29 50.08 56.56 40.38 47.72 63.51 69.99 53.81 41.98 57.76 64.25 48.07 33.92 49.71 56.19 40.01 28.17 43.96 50.45 34.26
-1.21 .24 .45 -.06 -1.05 1.04 1.53 -.90 .74 1.71 -.89 .38 -.20 -1.21 -1.40 .97 1.67 -1.40 .09 .03 .55 .78 -1.32 -.61 -1.69 -1.69 .69 .32 .17 .53 1.28 -.08
I
Low
Female
I
High
I
Low
White
Male
I
High
I
Low
Female
I
High
I
Low
*My favorite school subject. tWhat I think about.
veals that only one observation was recorded for the subclass. More specifically, subject number 88 (see data listing in the Appendix), responded with a rating of 1 on eight of nine scales for both essays, and did not provide any of the additional information requested. It would appear that the subject was not a serious participant in the study. His scores may be omitted from further analyses. All effects may be reestimated with one fewer observation. Failure to omit the observation may result in other spurious results, such as a single significant high-order interaction.
CHAPTER
I
Analysis of Variance: Tests of Significance Variation and covariation in the criterion variables may be partitioned into components attributable to the analysis-of-variance model and a residual. Or the variation may be attributed to components representing particular mean differences. Test criteria are applied to determine whether individual contrasts or "omnibus" main effects and interactions explain observed variation in the outcome measures. When there is more than a single criterion score, a multivariate test statistic is appropriate. The estimation of individual mean contrasts is discussed in the preceding chapters. Those estimates provide the starting point for tests of significance. Unlike most statistical treatises, planned contrasts are discussed prior to "overall" significance tests. Results for omnibus hypotheses of the sort H0 : p., 1 = p., 2 = · · · = J.LJ are obtained by pooling sums of products for J-1 independent contrasts among the group means. The analysis-of-variance mean model is Y.=A0*+E.
(9.0.1)
Y.=K0+E.
(9.0.2)
The reparameterized model is
Y. and E. are each Jxp; K is the Jx/ basis for the parameters in 0. 0= L0* is the /xp matrix of parameters to be estimated from the sample./ is the number of mean contrasts in the complete model or the rank of the model for significance testing, satisfying I~ J. In this chapter we are concerned with testing the nullity of some or all of the contrasts in 0. After the model has been reparameterized to full rank, the form is mathematically identical to the full-rank regression model, Y = XB+E. The distributional assumptions made under the two models are also identical. Specifically, the model matrices, X and K, are fixed and error-free, and are of rank equal to the number of columns. Rows of E (observations), are assumed to be independent with expectation zero and identical covariance matrices. The common covariance matrix I is arbitrary and may exhibit any pattern of intercorrelations among the criterion measures. For interval estimation and tests of significance, 296
Analysis of Variance: Tests of Significance
297
we assume further that the random errors are drawn from a multivariate normal population. Algebraically the models are the same. The distinction of the two is primarily one of convenience. The analysis-of-variance model explicitly defines subclass membership of the observations through the use of subscripts. It is usually employed in group-comparison research, whenever the independent variables are defined by membership in experimental or naturally occurring subpopulations. By comparison, if the independent variables consist of a range of scores on measured variables, the regression model is perhaps more convenient to apply. Either analysis may be applied to either sort of data (using "dummy variables") with identical results. The distinction of the analysis-of-variance and regression models is not the ability to interpret causal relationships. These may be implied when the experimental setting involves the random assignment of subjects to conditions, and when the manipulation of the experimental treatment is controlled. Either statistical model may be applied. Without the two essential experimental considerations, both models will provide only measures of association, or correlation, of the independent and dependent variables. Analysis of variance is valuable in comparing means of naturally occurring or intact groups, even when causal relationships cannot be determined. The significance-testing procedures of this chapter parallel those in Chapter 5 for the regression model. The reader may wish to follow the material in the earlier chapter, which is presented in greater detail. The role of a single predictor variable in the regression model is supplanted by that of a particular contrast among means in analysis of variance.
9.1
SEPARATING THE SOURCES OF VARIATION The least-squares estimate of 8 in Eq. 9.0.2 is given by Eq. 8.1.2. That is, (9.1.1)
Dis the diagonal matrix of subclass frequencies D= diag [Nb N 2 , . . . , NJ]. Each row of 0 is a single contrast among means for the p criterion measures, and corresponds to one degree of freedom between groups. In total there are I rows, each of which is the product of one row of the contrast matrix Land the original parameters 8*. A hypothesis about one contrast involves a single row of 8 and is tested from the corresponding row of 0. A hypothesis involving multiple effects or multiple degrees of freedom among groups involves two or more rows of 8, and is tested from the corresponding rows of 0. Usually we have a series of tests involving sections of 8. For example, in an axb crossed design, we may wish to test the nullity of a-1 rows of 8 containing the A contrasts, b-1 rows with the B contrasts, and (a-1 )(b-1) rows with interaction contrasts. These provide tests of overall A and B mean differences and of interaction. Hypothesis tests about 8 require sum-of-squares and cross-products matrices that reflect the relative importance of the effects to the model. The
298
Method
partition of sums of products into components for the model and residual is given by Eqs. 8.2.3 to 8.2.5. The total sum of products of observed scores is the pxp matrix Sr = Y'Y, with N degrees of freedom. The sum of products due to the model is Ss = 0'K'DK0 with I degrees of freedom. The error sum of products is SE = Sr-Ss, with ne = N-1 degrees of freedom. When I= J, and all estimable between-group effects are included in the model, SE is the pooled within-group sum of products (SE = Sw). with ?!-J degrees of freedom. The population variance-covariance matrix is estimated by i =SEine. In general, multiple tests using rows of 0 are not valid. The sum of products of the rows of0 do not exhaust all the between-group variation. That is,
El'0'i'Sn The matrix estimate may be employed only to conduct tests or draw intervals upon one or a small number of elements. Further, rows of 0 are not independent. Statistical error rates for tests upon multiple rows may be increased manyfold. This can be seen by inspecting the covariance matrix of one column of 0. Let (jk be the kth column of estimates, for criterion h· Then the covariance matrix of the estimates is given by Eq. 8.2.2. That is, (9.1.2)
Estimates are interdependent to the extent that the covariances, or off-diagonal terms of (K'DK)- 1 are nonzero. Should L be row-wise orthogonal (so that K will be orthogonal by columns), and D have equal elements, the rows of 0 will be independent. Sums of squares and products may then be obtai ned directly from these rows. If either orthogonality condition is violated, the sums of products will not be independent and will not sum to the correct total between-group sum of squares and cross products. The orthogonality condition is highly artificial. It does not often conform either to the experimental situation or to the research goals. A general nonorthogonal solution provides an analysis tool consistent with the research design. If the conditions of orthogonality are met, the general model may still be applied. In the special case of orthogonal contrasts and equal Nj the computations are simplified. The general model will produce the solution more commonly derived through scalars or vectors, rather than matrix manipulation. The general solution involves a second reparameterization of the analysisof-variance model to a basis matrix with orthogonal columns. This is precisely the reparameterization to orthogonality in the regression model (see Section 5.1). That is, rather than estimating 0 we estimate a function of 0that would be obtained if K'DK were diagonal. LetT be the triangular Cholesky factor of K'DK, such that K'DK=TT'
(9.1.3)
Then the transformed effects to be estimated are T'0. K'DK and T' are both /x /,and T'0 is txp. Like 0, there is one row for each effect in the model and one column for each criterion variable. Since 0 contains p separate sets of contrasts for separate measures Y~c, T'0 also contains p separate sets of transformed effects.
Analysis of Variance: Tests of Significance
299
The estimate of T'8 is the /xp matrix /"--._
U=T'8
(9.1.4) Let us inspect the properties of the transformed estimates and then observe the effects of the reparameterization upon the original model. The expectation of U is iif(U)=iif(T'0) =T'iir(0)
=T'E)
(9.1.5)
U is an unbiased estimate of the population matrix T'8, since K'DK and thus T are matrices of fixed constants. The variance-covariance matrix of U is
r(U) = T''V(0)T =T'[(K'DK)- 1 ®~]T
= T'(TT')- 1 T®~ =I®~
(9.1.6)
The variance-covariance matrix of one column of U is (9.1.6a) If Eq. 9.1.6a is compared with Eq. 9.1.2 it can be seen that unlike 0, rows of U are independent of one another. U is the matrix of orthogonal estimates or semipartial regression coefficients. Although (K'DK)- 1 may be nondiagonal, the off-diagonal covariance terms in I are identically zero. U is the same order as 0 and estimates the same functions. However, the contrast estimates in U have the additional property of independence in a predetermined order. Rows of U, like those of 0, have an arbitrary variance-covariance matrix, ~-Values of the same contrast for p measures are still intercorrelated. If the design is completely orthogonal, with diagonal K'DK, Twill also be diagonal and U will be a simple rescaling of the rows of 0. In the general case, T is a nondiagonal matrix and performs a more complex transformation. The estimate U is
u=r0 =T'(K'DK)- 1 K'DY. =T'(TT')- 1 K'DY. =T- 1 K'DY.
(9.1.7)
T- 1 is the Cholesky factor of (K'DK)- 1 . Thus it is the specific transformation that premultiplies K' to yield an orthonormal matrix with respect to the metric D. That is, U=T- 1 K'DY.
= [K*]'DY.
(9.1.8)
300
, Method
K* is columnwise orthonormal, satisfying [K*] 'OK*= I. The computations for the MULTIVARIANCE program are performed according to Eq. 9.1.8, with K orthonormalized by the modified Gram-Schmidt procedure. Equation 9.1.8 may also be written as
(9.1.8a)
U= [(K*)'OK*]- 1 (K*)'OV.
The term in brackets is simply an identity product. Comparison of Eq. 9.1.8a with 9.1.1 shows us that U is exactly the least-squares estimate of effects, computed from orthonormal K rather than from the original basis. U serves the same function as 8, if the basis is first orthonormalized by columns. Just as 8 is the estimate of 0 in V. = K0+E., U is the estimate of T'0 in (9.1.9)
V.= K*T'0+E.
The reparameterization involves factoring the basis into the orthonormal basis and an upper triangular matrix. That is, (9.1.10)
K= K*T'
The orthonormalization is performed in a particular order of columns or effects. Thus, the order of contrasts in K and 0 plays a role in testing multiple consecutive hypotheses. Each column of K* is computed to be orthogonal to preceding columns. Each row of U is the estimate of the corresponding term in 0, eliminating or holding constant all preceding effects. The sums of squares and cross products of rows of U are the sums of products for particular contrasts, eliminating all preceding contrasts. Represent the ith row of U as u' i• where u'i is the 1 xp vector containing the estimates of the ith contrast, independent of contrasts 1 through i-1. The orthogonal estimates may be represented by rows. That is,
(9.1.11)
The sum of squares and cross products for the ith contrast, eliminating preceding effects, is the pxp matrix uiu'i· To examine uiu'i and its expectation we need to partition T, the Cholesky factor of K' OK. Let the rows ofT' bet' i and the rows of 0 be O'i· Then the orthogonal effects T'8 estimate population matrixT'0,
T'0=
t' 1
(J'l
tn
t'2 t'a
(J' 2 (J' 3
0 0
t',
6',
0
(J'l
faa
tll t21 tal
0
tu
(J',
t12
t13
t22
t23
0 0
(J' 2 (J' 3
Analysis of Variance: Tests of Significance
tn(J' 1+t1zfJ' z+t136' 3+ · · · +t116' 1 0' +tzzfJ'z+tz36'3+· · · +tz16'l 0' + 0' +t33fJ' 3+' ' '+t31fJ' I
301
(9.1.12)
0' + 0' + 0' +· · ·+tufJ'1 Each row of T'8 is the product of a row of T' and the entire matrix 8. The ith row of U is u' 1 = t' 10, with expectation t' 18 and variance-covariance matrix :1. The expectation of the sum of products of u' 1 is (9.1.13) The partition of sums of squares and cross products is presented in Table 9.1.1. Each product u1u' 1 is a pxp matrix of squares and products for the ith row of 8, eliminating preceding effects. Tests of significance involve determining whether some or all of the u;u' 1 with fixed components 8 are large, relative to the residual lacking the fixed components. The sum of the u1u' 1 for all effects is l
LUtU';=
U'U
i~l
=0'TT'0 =0'K'DK0 (9.1.14)
=SB
The sums of products of rows of U, unlike rows of 0, are a complete partition of the variation due to the model, or between groups. In the transformation to orthogonal parameters the total sum of products has not been altered. Instead, variation has been attributed to the sources in the model, in a specified stepwise order. Table 9.1.1
Partition of Sums of Products for Rank-/ Multivariate Analysis of Variance
Source
Degrees of Freedom
Squares and Cross Products
0'1
Expected Squares and Products
0'tlt'I0+I E>'t2t' 2e+ I E>'t3t' 30+ I
8' 2 , eliminating 8' 1 8' 3 , eliminating 8' 1 and 8' 2
O'~o
eliminating 8' I• 8' 2• ... , O'z-1 l
~ U1U' 1 =
U'U
0'K'DK0+/I
i=l
Residual Total
ne = N-1 N
SE=Y'Y-U'U Sr=Y'Y
(N-/)I
302
Method
Sums of products and degrees of freedom for adjacent contrasts may be combined to test hypotheses about overall main effects and interactions, or about two or more contrasts simultaneously. The complete sum of products for any main effect, in a particular order, has the same value regardless of the type of contrasts chosen. The sum of products of one or more rows of U, employed to test a particular hypothesis, is the sum of products for hypothesis, SH, with degrees of freedom nh. If there is more than one such matrix, an additional subscript is added to distinguish one matrix and degrees of freedom from another-that is, SHj and nhr For example, consider a 2x3 crossed design, with the model reparameterized to deviation contrasts for main effects and interactions. In the complete model and U each has six rows, with effects for the constant, A, 8, and interaction, in that order. The contrast parameters are represented symbolically as
e
DO®DO D1®DO DO®D1 DO®D2 D1®D1 D1®D2
Constant A main effect B main effect B main effect AB interaction AB interaction
The orthogonal estimates comprise a 6xp matrix, with one row for each column of K. That is,
U= [K*]'DY. u' 1
Constant A main effect, eliminating constant B main effects, eliminating constant and A
(9.1.15)
u' 5 } AB interactions, eliminating constant, A, and B u'6 The partition of sums of squares and cross products for main effects and interaction is given in Table 9.1.2. Each pxp sum of products for hypothesis, SH;• contains the p univariate sums of squares in the diagonal positions. Each is exactly the sum of squares that would be obtained if performing the analysis for that particular criterion variable alone. In addition, the off-diagonal positions contain the between-group sums of cross products of every pair of criterion variables. These reflect the intercorrelations among the measures and need be -considered in conducting multivariate significance tests. Tests of significance are obtained by determining whether the terms of SHj are large compared with those of the residual SB· One test criterion is based upon the comparison of the "generalized variance" or determinant of SB with that of SB+SH;· If the two are about the same, we conclude that the t';0 components do not have a noticeable effect and are null. If the generalized variance of the residuals in SB is much smaller than that of S],+SHp we conclude the reverse, that t' ;0 are non null and of importance. In this case we reject the null hypothesis.
Analysis of Variance: Tests of Significance
Table 9.1.2
303
Partition of Sums of Products for 2X3 Fixed-effects Crossed Design
Source
Constant
Degrees of Freedom
Sum of Products
Mean Products
Expected Sum of Products
nh1 = 1
SH, = u,u' 1
MH, =SH/1
0't,t',0+l:
nh2 = 1
SH2 = u2u' 2
MH2 =SH/1
0't2t' 20+ l:
and constant
nhs=2
SH3 = U3 U' 3 +u4 u' 4
MH3 = SH/2
0't3 t' 3 0+0't.t' 4 0+2l:
A8, eliminating A, 8, and constant
nh• = 2
SH4 = U5U'5+UsU's
MH4 =SH/2
0't5 t' 5 0+0't,t' 6 0+2l:
SB = SH1 +SH2+SH3 +SH4 = U'U
MB=SB/6
0'TT'0+6l:
SE=Sr-SB
ME=~,Jne
(N-6)l:
A, eliminating
constant 8, eliminating A
Between groups Within groups
n0 =6 ne=N-6
=Y'Y-U'U
Total
N
=I
Sr=Y'Y
In Table 9.1.2, u3 u' 3 and u4 u' 4 have been pooled to obtain tests of significance of the overall two-degree-of-freedom 8 main effect. It is also possible to obtain independent tests of the two separate contrasts. This is accomplished by not summing the two matrices and considering each separately. Overall or "omnibus" test statistics can obscure specific results that are hypothesized to exist in the data. Specific planned contrasts will yield different results depending upon the contrast type and order. However, the sum u3 u' 3 +u 4 u' 4 is the total of all orthogonal 8 sources of variation, eliminating the constant and the A main effect. This total is constant, regardless of the type of contrasts chosen. Each sum-of-products matrix may be divided by its degrees of freedom to obtain the pxp matrix of mean squares and cross products (M). MH; contains the univariate mean squares between groups for each of the criterion variables in the diagonal positions. The error mean-squares-and-products matrix in the example is simply the within-group variance-covariance matrix, with the variance estimates on the diagonal. The covariance matrix is not restriced and may reveal any pattern of association among criterion variables. It may also be reduced to standardized form to estimate the variable intercorrelations. The error term for analysis of variance depends upon the particular model and effects being tested. The most common is the pooled within-group variance-covariance matrix employed in fixed-effects factorial designs. The matrix isSE=Sw=Y'Y-Y:DY. (see Eq. 8.2.7). The degrees of freedom are ne=N-J, where J is the total number of subclasses. Should some groups have no observations, J 0 , the number of groups with at least one subject, replaces J. A second class of error terms is the residual, after fitting a given model to the data. This has the form SE= Y'Y-U'U where U contains the contribution of I defined sources of variation, to the model. The residual is estimated with ne = N-1 degrees of freedom. For example, in a one-observation-per-subclass design, a basis may be constructed for only main effects. The "residual" is then the interaction sum of products. However, if there are replications within
304
Method
cells, and all possible between-cell variation is not included in the basis, the residual is all remaining between-cell variation, plus within-cell variation. Should all J sources of between-group variation be contained in the basis, the "residual" is identically the within-group sum of products (see Eqs. 8.2.78.2.9). For example, in Table 9.1.2 the maximum rank of the model is J = 6. Since six contrasts are estimated, the residual SE is identically the within-groups matrix SE = Sw. A third class of error terms shall be designated special effects. Here the sum of products for error is the sum of one or more products u1 u' 1, as estimated through the model. For example, in Sample Problem 5, on programmed instruction effects, experimental conditions and sex are fixed effects. Classes are random, nested within conditions but crossed with sex. There is no within-cell replication since there is only one testing of each sex group in each class. The interaction of sex and classes may be omitted from the basis and designated "residual," providing the error term for the sex and sex x conditions effects. The error term for the experimental conditions main effect is classes nested within conditions. This effect is included in the basis. Thus we may designate the final 35 rows of U (the 35 degrees of freedom for classes effects) as a "special effects" error. Then ne = 35 and UE is the final 35Xp submatrix of U. The error sum of products is (9.1.16) ,_I
It is always possible to obtain one sum-of-products matrix by subtraction. This might be, for example, SE in Table 9.1.1 or 3
SH/=Sr-SE-~ SH. J=l
J
in Table 9.1.2. If nh is large, subtracting to obtain sums of products can obviate orthonormalizing a potentially large basis. However, no specific contrasts within the effect can be estimated. Tests of significance are conducted in the order opposite from the order of elimination in orthonormalizing K. The significance of the last contrast, or set of contrasts, using the last rows of U is determined first. If these are not significant, the next-to-last hypothesis is tested, using earlier rows of U. The procedure continues until a significant effect is encountered. Once significant effects are found, all preceding terms of U are confounded with those effects. This can be seen in expression 9.1.12. For example, if we find that ()' 1 is zero (null hypothesis accepted), then the()' 1 term disappears from all earlier rows of U. If 6' 1 is nonzero, then all earlier terms are confounded with some nonzero 6' 1 effect. When this is the case, ()' 1 must be ordered ahead of the other terms in K and 0. K is reorthonormalized to obtain unconfounded tests of the other terms, eliminating 6' 1• Further consideration is given to the order of effects, in the following sections.
Some Simple Cases When subclass frequencies are equal and contrasts are orthogonal, the sums of products between groups are simple functions of rows of 0. Consider a 2x2 fixed-effects crossed design, with two observations per group; that is, D=diag (2,2,2,2).
Analysis of Variance: Tests of Significance
305
The contrasts and basis for main effects and interaction are symbolically represented as CO® CO (constant), C1 ®CO (A main-effect contrast), CO® C1 (8 main-effect contrast), and C1 ® C1 (interaction). These are
L~[j
.5 1 0 0
.5 -1 0 0
.5 0 1 0
.5 0 -1 0
.25 .5 .5 1
.25 .5 -.5 -1
.25 -.5 .5 -1
.25] Constant
-.5 A main effect -.5 8 main effect 1 A8 interaction (9.1.17)
and .5 .5 1 K= [ 1 .5 -.5 1 -.5 .5 1 -.5 -.5 Constant
A
.25] -.25 -.25 .25
8
(9.1.18)
A8
The orthogonal basis and triangular factor (Cholesky factor of K'DK) are
Va/8
v'214 v'214 v2!4] v2!4 -v'2!4 -v214 -v2/4 v'2!4 -v214 Vs/8 -v2/4 -v'214 v2!4 Constant A 8 A8
K *- [ Vs/8 - Vs/8
(9.1.19)
and
T'=
[
Va
0
0
v'2
0
0
0 0
0
v'2
0
0
Since T' is diagonal, U=T'0 is the same as Equivalently,
(9.1 .20)
0
but with rows multiplied by (t;;].
U= [K*]'DY. Vs/8 [ v'214 =
Vs/8
Vs/8
Vs/81 [2 2 V2!4 -v214 -v'214 v2!4 -v'214 v'214 -v'214 v214 (Zero) V2!4 -\1'214 -v214
v:: 2:i 2:k y:ik
u' t
Constant
') 2v22:' "(Y·t"-y"'2"
u'2
A main effect (9.1.21)
' -Y·i2 ' ) 2v2 2:; (Y·it
u' 3 8 main effect
' + Y·22-Y·t2-Y·2t ' ' ' ) 2v2( Y·11
u' 4
A8 interaction
306
Method
The p xp squares and products for the A main effect are
UzU'z = 112[2:" (y.,k-Y·2h·l] =
[2:k (YoJk-Y·zJ.·l']
2:k 4(y .. ,,-y ... )(y .. "-y... )'
(9.1.22)
where the number of observations in each level of factor A is four. In the orthogonal case, sums of products for all main effects and interactions may be expressed as simple deviations of level and subclassmeans, from the overall vector mean y.... The reader may wish to verify this for the 8 contrast and AB interaction. The orthogonal model is the case most frequently presented in texts (for example, Morrison, 1967; Cooley and Lohnes, 1971). Although the computations under the general linear model are more complex, many of the complexities disappear when subclass frequencies are equal. Use of the general model, on the other hand, facilitates the estimation of contrasts that are not orthogonal, even in equai-N designs. The transformation to T'0 assures that only sums of squares and products of independent components will be obtained. In the univariate case, 0 is an /-element column vector. Also each row of U has only a single element. If we assume that the 2X2 example is univariate, the A sum of squares is
(9.1.23) This is the usual A sum of squares as obtained through scalar algebra for univariate analysis of variance.
Example The data of Table 3.3.2 comprise a bivariate 2X2 crossed design with unequal cell frequencies. Thus the orthonormal basis and triangular factor have a more complex form than in Eqs. 9.1.19 and 9.1.20. The basis is K as given by Eq. 9.1.18; the matrix of subclass frequencies is D = diag (4, 6, 5, 7). The Cholesky factor of K'DK is
T' = [ 4.69  -.21  -.43   0.00 ]
     [  0    2.34  -.04   -.21 ]
     [  0     0    2.31   -.11 ]
     [  0     0     0     1.15 ]
The Gram-Schmidt orthonormal basis satisfying [K*]'DK* = I is

K* = [ .21   .23   .26   .29 ]
     [ .21   .23  -.17  -.19 ]
     [ .21  -.19   .25  -.23 ]
     [ .21  -.19  -.18   .16 ]
The matrix of cell means is

Ȳ. = [  .50   .75 ]  Group 11
     [ 3.17  3.67 ]  Group 12
     [ 2.20  2.40 ]  Group 21
     [ 3.43  3.57 ]  Group 22
        y1    y2
The orthogonal estimates are

U = [K*]'DȲ.

  = [ 11.94  13.22 ]  Constant
    [ -1.91  -1.36 ]  A, eliminating constant
    [ -4.33  -4.52 ]  B, eliminating constant and A
    [ -1.65  -2.00 ]  AB, eliminating constant, A, and B
        y1     y2

These estimates are mean differences computed to be statistically independent in the order constant, A, B, AB, and to have equal variance-covariance matrices over repeated samplings. Multiplication of T' and Θ̂ from Section 8.1 will yield identical results and may be inspected to see the relationship of Θ̂ with U. The sum of products for the constant term is
S_H1 = u_1 u'_1 = [ 11.94 ] [ 11.94  13.22 ] = [ 142.55  157.82 ]
                  [ 13.22 ]                    [ 157.82  174.73 ]
The sum of products for A eliminating the constant is
S_H2 = u_2 u'_2 = [ 3.64  2.60 ]
                  [ 2.60  1.86 ]
The diagonal elements of S_H2 are the sums of squares for the A main effect for y1 and y2 alone. The sum of products for B, eliminating A and the constant, is

S_H3 = u_3 u'_3 = [ 18.75  19.56 ]
                  [ 19.56  20.41 ]
The sum of products for interaction, eliminating all other effects, is

S_H4 = u_4 u'_4 = [ 2.72  3.30 ]
                  [ 3.30  4.01 ]

All degrees of freedom are unity (n_h1 = n_h2 = n_h3 = n_h4 = 1). Each matrix is the squares and products of a single row of U. The mean products matrices are identical to the sums of products.
The error sum of products is the within-groups matrix, computed in Section 8.2. That is,

S_E = S_W = [ 16.35  11.72 ]
            [ 11.72  11.00 ]

S_E has n_e = 18 degrees of freedom. The elements of S_E are compared with those of the S_H matrices to decide whether each effect contributes to criterion variation. The total sum of squares and products of the original data is

S_T = [ 184  195 ]
      [ 195  212 ]

It is easily verified that the total has been exactly partitioned. That is,

S_T = S_E + Σ_{j=1}^{4} S_Hj = S_E + U'U
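The partition above can be reproduced numerically. The following sketch assumes the cell means and frequencies printed for the Table 3.3.2 example; the Gram-Schmidt step is carried out through the Cholesky factor rather than column by column.

```python
import numpy as np

# Orthonormalize K with respect to D, then U = K*' D Ybar gives the
# orthogonal estimates for the unequal-N 2x2 example.
K = np.array([[1.0,  0.5,  0.5,  0.25],
              [1.0,  0.5, -0.5, -0.25],
              [1.0, -0.5,  0.5, -0.25],
              [1.0, -0.5, -0.5,  0.25]])
D = np.diag([4.0, 6.0, 5.0, 7.0])
Ybar = np.array([[0.50, 0.75],
                 [3.17, 3.67],
                 [2.20, 2.40],
                 [3.43, 3.57]])

# K'DK = T T' with T lower triangular; K* = K (T')^{-1} satisfies K*'DK* = I.
T = np.linalg.cholesky(K.T @ D @ K)
Kstar = K @ np.linalg.inv(T.T)

U = Kstar.T @ D @ Ybar
print(np.round(U, 2))       # rows: constant, A, B, AB estimates, as printed above

# Each S_Hj is the outer product of one row of U.
S_H4 = np.outer(U[3], U[3])
print(np.round(S_H4, 2))    # interaction sum of products, [2.72 3.30; 3.30 4.01]
```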
9.2
TEST CRITERIA
In this section we use the sums of products developed in Section 9.1 to test the nullity of rows of Θ. The significance of effects may be tested for all p variates jointly, for each of the criterion measures, and for each measure eliminating preceding measures in a specified order. The test statistics are the multivariate likelihood ratio criterion, univariate F statistics, and "step-down" F statistics, respectively. In addition, Hotelling's T² is employed, as a special case of the likelihood criterion, to test multivariate hypotheses concerning a single contrast or single linear combination of effects in Θ. All of the test statistics presume the multivariate normality of mean vectors. Subjects are assumed to respond independently of one another, with a common variance-covariance matrix of scores, Σ. There is no necessary relationship between the results of multivariate and univariate tests of the same hypothesis. For example, consider vector means for two groups on two variates having their end points at ȳ·1 and ȳ·2, as depicted in Figure 9.2.1a. The points around ȳ·1 and ȳ·2 represent individual vector observations on the two variates y1 and y2. Assuming a zero or negative correlation between variables, the mean difference on y1 or y2 alone may not be significant. This would be evidenced by the difference on either variable separately (the length of a or b) being small, relative to within-group dispersion of the variate. By comparison, a multivariate test statistic incorporating both dimensions of variation may yield significant effects. This is evidenced by the length of segment c being longer than either a or b. In addition, within-group variation in the direction of segment c may be smaller than for either y1 or y2 alone.

[Figure 9.2.1a  Multivariate difference significant, not univariate.]

[Figure 9.2.1b  Univariate differences significant, not multivariate.]

Altering the within-group correlation of y1 and y2 may produce the reverse result. Let us assume that group means differ significantly on both y1 and y2 alone. Because of the positive correlation of the two, the largest within-group
dispersion is on the same dimension as the bivariate maximally discriminating line (c). This segment, in turn, may not be of sufficient relative length to produce a significant multivariate test statistic. It is easy to imagine situations in which one or more of a number of univariate results is significant and not the multivariate, or the reverse. Not only may correlations differ, but also the locations of vector means and the variances within groups, as well as the number of groups and measures. The choice of test statistic must follow the design of the research! Whenever variables consist of multiple measures of the same construct, a multivariate test statistic is appropriate for deciding whether or not the hypothesis is supported. Univariate F ratios for a single effect are not independent, and will tend to compound sampling and decision errors.
If the criterion measures have a theoretical or actual ordering (for example, in terms of complexity, time, or ordered dimensions), then step-down analysis may be appropriate. If it is hypothesized that groups will vary on specifically defined dimensions that are linear combinations of the original measures (for example, factors and discriminant variables), it is more appropriate to transform the data to those dimensions before hypothesis testing. This is accomplished by explicit prior definition of the transformation or, empirically, through "discriminant analysis" techniques. Other test criteria, discussed in Chapter 10, may then be employed.
Hypotheses  The general multivariate analysis-of-variance hypothesis is that one or more rows of Θ are null for all p measures. Let Θh be a section of Θ having nh rows. Each row is a particular mean contrast with p elements, and corresponds to a single degree of freedom between groups. Then the null hypothesis is that the particular contrasts are zero for all measures, or

H0: Θh = 0      (9.2.1)

The alternate is that the contrasts are not zero for one or more measures. For example, in the one-way four-group design of the word memory experiment (Sample Problem 2), the rows of Θ are two-element vectors. The four rows are the constant k' and the three differences α'1−α'2, α'2−α'3, and α'3−α'4, respectively. We may test the nullity of one, two, or all three of these contrasts. For example, the test of

H0: α'3 − α'4 = 0'      (9.2.2)

is equivalent to testing that α3 = α4, or that the last row of Θ is null. Also, Eq. 9.2.2 becomes H0: μ3 = μ4 if we let μj = μ + αj. Equation 9.2.2 is not equivalent to H0: α3 = α4 = 0, however. Under the reparameterization to simple contrasts, the αj only need be equal, not necessarily zero, to accept H0. The test of 9.2.2 requires only the estimates from the corresponding last row of Θ̂ or U. We may also test the nullity of all three final rows of Θ. The hypothesis is

H0: α'1 − α'2 = α'2 − α'3 = α'3 − α'4 = 0'      (9.2.3)

The alternative is that one or more of the vectors is not equal to the others. Equivalently, Eq. 9.2.3 becomes H0: μ1 = μ2 = μ3 = μ4 if we let αj = μj − μ. The test of Eq. 9.2.3 is the three-degree-of-freedom test of equality of the four vector means, or the nullity of all contrasts among the four groups. The hypothesis is of the identical form as Eq. 9.2.1 if Θh is the last nh = 3 rows of Θ. Each hypothesis is an extension of its univariate counterpart. In order for the multivariate hypothesis to be accepted, however, the data must be supportive for all p measures. Consider the 2×3 main-effects crossed design with rows k' = μ' + ᾱ' + β̄', α'1 − α'2, β'1 − β'3, and β'2 − β'3, respectively. The hypothesis that all β-effects are equal is

H0: β'1 − β'3 = β'2 − β'3 = 0'      (9.2.4)

or that μ·1 = μ·2 = μ·3. The hypothesis is that the final nh = 2 rows of Θ are zero.
H0 is tested from the final two rows of Θ̂ or U. Again, only under the unnecessary assumption that Σ_k β_k = 0 is H0 equivalent to the test that β_k = 0. We may separately test only the second row of Θ for the hypothesis that

H0: α1 = α2      (9.2.5)

or that μ1· = μ2· for all outcome variables. The one-degree-of-freedom test requires the second row of the estimates Θ̂ or U. If both hypotheses 9.2.4 and 9.2.5 are accepted, we may wish to test that the overall population constant is zero, or

H0: k = 0      (9.2.6)

Generally, in social science research the origins of measurement scales are arbitrary, and hypothesis 9.2.6 is not of concern. However, in selected instances, such as when the data are change scores or growth scores, departure from zero mean is meaningful. The hypothesis is tested using only the first row of Θ̂ or U. Equation 9.2.6 is identical to the simple hypothesis H0: μ = 0 if there are no α or β effects (one group of observations). In like fashion, interaction contrasts may comprise further rows of Θ and may individually or jointly be tested for departure from zero. Unlike main effects, interactions are departures from the simpler (main-effect) model. If the interactions are accepted as equal, they are also identically zero. Specific research problems usually involve testing a sequence of hypotheses about main effects and interactions, or about multiple specific contrasts. The following sections present test criteria for a single hypothesis. The effects of order and testing multiple hypotheses are discussed following the test criteria; the sample problems discussed in Section 9.3 provide further illustration. The multivariate null hypothesis is given by Eq. 9.2.1. Θ may be assumed partitioned into the final nh rows to be tested and the leading l−nh rows, not under scrutiny. That is,

Θ = [ Θ0 ]  l−nh rows
    [ Θh ]  nh rows      (9.2.7a)

with p columns, where nh represents the degrees of freedom for hypothesis. Θ is estimated by Eq. 9.1.1. The final nh rows of the estimate are Θ̂h. The orthogonal estimates U (Eq. 9.1.4) may be likewise partitioned. Uh is the final nh×p submatrix of U corresponding to the nh effects, eliminating effects 1 through l−nh. That is,

U = [ U0 ]  l−nh rows
    [ Uh ]  nh rows      (9.2.7b)

with p columns. The sum of products for hypothesis is the sum of squares and cross products of the nh rows of U,

S_H = Σ_{i=l−nh+1}^{l} u_i u'_i      (9.2.8)
The error sum of squares and cross products may have any of the forms discussed in Section 9.1. The matrix is represented as S_E with n_e degrees of freedom. Mean squares and cross products for hypothesis and error, respectively, are

M_H = S_H / n_h      (9.2.9a)

and

Σ̂ = M_E = S_E / n_e      (9.2.9b)

Equations 9.2.8 through 9.2.9b are all p×p symmetric matrices.
Likelihood Ratio Criterion  The likelihood ratio criterion provides a multivariate statistic for testing H0 with any values of p and n_h. The same criterion for testing H0 for the regression model is presented in Chapter 5. Substitution of Θ for B and K for X yields parallel results for the analysis-of-variance model. The assumption for the use of the likelihood ratio criterion is that the mean errors are distributed independently in multivariate normal form, with expectation 0 and variance-covariance matrix Σ/N_j. Σ may have any arbitrary form as long as it is nonsingular, with |Σ| > 0. This restriction requires that (a) there must be at least as many degrees of freedom for error as criterion measures, that is, n_e ≥ p; and (b) no dependent variable can be exactly expressible as a linear combination of other variates. If the degrees of freedom are too few, either the number of observations must be increased or the number of variates decreased. Variables that are linearly dependent occur, for example, when both subtest and total test scores are included in a single analysis or when the variables are percentage scores that sum to 100 for each subject. In either case, at least one of the variables may be omitted from the analysis; its inclusion yields only redundant information. Or variables may be analyzed in subsets that do not individually contain the dependencies. All other aspects of the analysis (that is, estimation of effects) may be completed with the entire set of measures. The likelihood ratio is a measure of the extent to which the data are less likely to have arisen from a population described by the model when H0 is true than when H0 is false. It is derived by evaluating the likelihood function for the multivariate normal distribution under two models. One model is Ȳ. = KΘ + E., as in Eq. 9.0.2. The other model is Ȳ. = K0Θ0 + E.0, where K0 and Θ0 are the leading l−n_h effects in K and Θ, respectively. The effects in Θh are omitted. If the likelihood of the first model is sufficiently greater in the sample, then we conclude that the terms in Θh are important and we reject H0. The first model with Θh provides a better description of the data. If the likelihoods of the two models are similar, then the smaller model is maintained and H0 is supported. The likelihood ratio statistic is a comparison of residual variation under the two models. Under the first model, Σ is the variance-covariance matrix of the residuals and |Σ| is a multivariate or generalized dispersion measure. Under the second model, the variance-covariance matrix of the residuals is Σ0 and |Σ0| is the generalized dispersion measure. If the terms omitted in the second model are significant, then |Σ0| will be much larger than |Σ|, since important sources
of variation have been attributed to the residual. If the terms omitted in the second model are not significant, then their estimates only reflect additional random variation, and |Σ̂0| will not be much different from |Σ̂|. The likelihood ratio statistic is

Λ = |Σ̂| / |Σ̂0|      (9.2.10)
The smaller the value of Λ, the more inclined we are to reject H0. Simple transformations of Λ follow well-known distributional forms. For large N, Bartlett (1947) has shown that

χ² = −m log_e Λ      (9.2.11)

follows a chi-square distribution, with n_h p degrees of freedom. The multiplier is

m = n_e − (p + 1 − n_h)/2      (9.2.11a)

where n_e is the error degrees of freedom, usually N−J. H0 is rejected with 1−α confidence if χ² exceeds the 100α upper percentage point of the χ² distribution with n_h p degrees of freedom. Otherwise H0 is maintained. Note that smaller values of Λ yield larger χ² results. A more accurate approximation for the distribution of Λ is an F transformation given by Rao (1952). The test statistic is

F = [(1 − Λ^(1/s)) / Λ^(1/s)] · [(ms + 1 − n_h p/2) / (n_h p)]      (9.2.12)

where

s = √[(n_h²p² − 4) / (n_h² + p² − 5)]      (9.2.12a)

m is the same as defined for the chi-square test (Eq. 9.2.11a). Using Rao's transformation, H0 is rejected with confidence 1−α if F exceeds the 100α upper percentage point of the F distribution with n_h p and ms + 1 − n_h p/2 degrees of freedom. Like the χ², a smaller value of Λ results in a larger F statistic. Since s becomes indeterminate when n_h p = 2, setting s to unity in such situations will provide a correct test statistic. For accuracy, especially in smaller problems, the F transformation (Eq. 9.2.12) should be employed in preference to the χ² transform (Eq. 9.2.11). In the special case when either p or n_h is 1 or 2, the F statistic of Eq. 9.2.12 exactly follows the corresponding F distribution. When the degrees of freedom are fractional, rounding to the next lowest integer value will provide the conservative test. Λ may be computed for any sum of products for hypothesis S_H, with n_h ≥ 1 degrees of freedom, and error sum of products S_E. Expression 9.2.10 is equivalently

Λ = |S_E| / |S_E + S_H|      (9.2.13)

The likelihood ratio can be obtained directly from the sum-of-products partitions. Both |S_E| and |S_E+S_H| may be found by factoring the matrices into triangular factors by the method of Cholesky.
The determinant is obtained from the factors. For example, let S_E = T_E T'_E, such that T_E is the Cholesky factor. The determinant is
|S_E| = |T_E| |T'_E| = Π_{k=1}^{p} [t_e]²_kk      (9.2.14)

Then

log |S_E| = 2 Σ_{k=1}^{p} log [t_e]_kk      (9.2.15)
Parallel forms may be followed for S_E + S_H. Hummel and Sligo (1971) have shown that a realistic error rate is maintained if the multivariate test statistic is used for a single decision about H0. A significant multivariate result may be followed by a small number of univariate tests to determine which variates show the greatest and/or smallest mean differences.
Example We shall use the data of Table 3.3.2 to test the significance of the interaction effect. The null hypothesis is
H0: θ'_4 = 0'

or

H0: γ'_11 − γ'_12 − γ'_21 + γ'_22 = 0'

The sum of products for interaction, eliminating the constant and main effects, is given in Section 9.1. That is,

S_H4 = u_4 u'_4 = [ 2.72  3.30 ]  y1
                  [ 3.30  4.01 ]  y2

There is n_h4 = 1 degree of freedom for interaction. The error sum of products is

S_E = [ 16.35  11.72 ]  y1
      [ 11.72  11.00 ]  y2

having n_e = 18 degrees of freedom. The determinants are

|S_E| = 16.35(11.00) − 11.72² = 42.45
|S_E + S_H4| = 19.07(15.01) − 15.02² = 60.50

The likelihood ratio statistic is

Λ = 42.45/60.50 = .702

The multiplier for the F transformation is

m = 18 − (2 + 1 − 1)/2 = 17
Since n_h p = 2, s is not computed but set to unity by substitution. The F transformation is

F = [(1 − .702)/.702] · [17(1) + 1 − (2)/2]/2 = 3.61

with 2 and 17 degrees of freedom. The .01 critical F value is not exceeded (the .05 value is exceeded) and H0 is maintained. The final row of Θ is null for both variates. Since the hypothesis is the only possible interaction contrast, we shall consider the individual interaction terms γ_jk to be zero.
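The computations of this example follow directly from Eqs. 9.2.11a through 9.2.13. Below is a sketch assuming the printed S_E and S_H4; the function name rao_F is illustrative, not from the text.

```python
import numpy as np
from scipy import stats

def rao_F(S_E, S_H, n_e, n_h):
    """Likelihood ratio criterion (Eq. 9.2.13) with Rao's F transform (Eq. 9.2.12)."""
    p = S_E.shape[0]
    lam = np.linalg.det(S_E) / np.linalg.det(S_E + S_H)
    m = n_e - (p + 1 - n_h) / 2                        # Eq. 9.2.11a
    denom = n_h**2 + p**2 - 5
    s = np.sqrt((n_h**2 * p**2 - 4) / denom) if denom > 0 else 1.0  # s = 1 when n_h*p = 2
    df1 = n_h * p
    df2 = m * s + 1 - n_h * p / 2
    F = (1 - lam**(1/s)) / lam**(1/s) * df2 / df1
    return lam, F, df1, df2

S_E = np.array([[16.35, 11.72], [11.72, 11.00]])
S_H4 = np.array([[2.72, 3.30], [3.30, 4.01]])
lam, F, df1, df2 = rao_F(S_E, S_H4, n_e=18, n_h=1)
print(round(lam, 3), round(F, 2), df1, df2)   # about .70 and 3.61 with (2, 17) d.f.
print(stats.f.sf(F, df1, df2))                # p-value; greater than .01, so H0 is maintained
```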
Hotelling's T²  The T² statistic (Hotelling, 1931) is a specialization of the likelihood ratio statistic to a single vector mean, or a single linear composite of mean vectors. T² can be employed in place of the likelihood ratio test whenever the degrees of freedom for hypothesis is n_h = 1. The test of significance from Λ or T² will yield identical results. The T² statistic is a generalization of Student's t to multiple criterion measures; when p = 1, T² is identically the square of t, or F with one degree of freedom in the numerator. For the most general form of T², let M be a J×p matrix mean for J populations of observations on p variates. Assume that observations in each group are independently normally distributed, with expectation μ'_j (the jth row of M) and p×p covariance matrix Σ. The null hypothesis is

H0: v'M = τ'      (9.2.16)

Equation 9.2.16 asserts that a particular linear combination of the rows of M is equal to a specified 1×p vector τ'. The alternative hypothesis is that the two vectors are not equal, for one or more elements. To test Eq. 9.2.16, we draw a random sample of N_j observations from population j (j = 1, 2, ..., J). Let y_ij be the p×1 vector observation for subject i in group j. The total number of observations is N = Σ_j N_j. The sample vector mean for group j is

ȳ'·j = (1/N_j) Σ_i y'_ij      (9.2.17)
Row vectors ȳ'·j are juxtaposed in the J×p sample mean matrix Ȳ. to estimate M. Over repeated samplings, Ȳ. is an unbiased estimate, with ℰ(Ȳ.) = M. Let D be a J×J diagonal matrix of cell frequencies, D = diag(N_1, N_2, ..., N_J). Then the covariance matrix of Ȳ. over repeated samplings is

𝒱(Ȳ.) = D⁻¹ ⊗ Σ      (9.2.18)

Rows of Ȳ. are independent. The variance-covariance matrix of a row of means ȳ'·j is (1/N_j)Σ. The form of the covariance matrix is arbitrary; the p variates may have any pattern of intercorrelations. The estimate of Σ is provided by the pooled within-
group variation of observations about the subclass means. Let Y be the entire N×p observation matrix. Then

Σ̂ = S_W/(N−J)
  = [1/(N−J)] Σ_j Σ_i (y_ij − ȳ·j)(y_ij − ȳ·j)'
  = [1/(N−J)] Σ_j (Σ_i y_ij y'_ij − N_j ȳ·j ȳ'·j)
  = [1/(N−J)] (Y'Y − Ȳ.'DȲ.)      (9.2.19)
To test H0 (Eq. 9.2.16) we require an estimate of v'M and of the covariance matrix of the estimate. The estimate of v'M is the same linear function of the sample means,

v'M̂ = v'Ȳ.      (9.2.20)

The covariance matrix of v'Ȳ. is

𝒱(v'Ȳ.) = v'𝒱(Ȳ.)v = v'(D⁻¹ ⊗ Σ)v = [v'D⁻¹v]Σ      (9.2.21)

The covariance matrix of a linear combination of means is a scalar function of Σ. Hotelling's statistic is

T² = [1/(v'D⁻¹v)] (v'Ȳ. − τ') Σ̂⁻¹ (v'Ȳ. − τ')'      (9.2.22)

T² follows the T² distribution, with parameters p and N−J−p+1. Tables of T² are given in Jensen and Howe (1968). H0 is rejected with confidence 1−α if T² exceeds the 100α upper percentage point of T²_{p,N−J−p+1}. T² may also be referred to tables of the F distribution, using the transformation given by expression 5.2.27. n_e is replaced by N−J for analysis-of-variance applications.

One-group case: The T² statistic is appropriate in a number of familiar situations. If there is only a single group of observations, we may use T² to test that the population vector mean is equal to a vector of constants. The null hypothesis is

H0: μ = τ      (9.2.23)
This has the form of Eq. 9.2.16, with μ' a 1×p vector, and v the unit scalar, v = 1. The general form of T² is given by Eq. 9.2.22. ȳ'. is the 1×p sample vector mean and D a scalar equal to the number of subjects, N. The test statistic is

T² = N (ȳ. − τ)' Σ̂⁻¹ (ȳ. − τ)      (9.2.24)

The sample covariance matrix (Eq. 9.2.19) for one group is

Σ̂ = [1/(N−1)] Σ_{i=1}^{N} (y_i − ȳ.)(y_i − ȳ.)'      (9.2.25)
H0 is rejected if

T² ≥ [p(N−1)/(N−p)] F_{p,N−p,α}

It can be seen that when p = 1, T² is the square of the t statistic for H0: μ = τ. Equation 9.2.24 simplifies to

t² = N (ȳ. − τ)² / σ̂²
Example  Two multiple-choice tests were administered to N = 11 students. Test 1 consisted of 10 four-choice items, with the chance response level of 2.5 items correct. Test 2 consisted of 20 five-choice items, with chance response level of 4 correct. The researcher wished to determine whether actual responses differed from the chance level. The observed means on the two tests are ȳ'. = [3.2  8], and the sample covariance matrix is

Σ̂ = [ 1.44   .30 ]
    [  .30  1.69 ]

Substituting in Eq. 9.2.24,

T² = 11 [3.2−2.5  8−4] Σ̂⁻¹ [ 3.2−2.5 ] = 104.14
                            [  8−4    ]

The .05 critical F value, with 2 and 9 degrees of freedom, is 4.26. The critical value of T² is

[2(10)/9](4.26) = 9.47

H0 is rejected. The subjects responded above chance level.
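A short sketch of the one-group test, assuming the means and covariance matrix printed above:

```python
import numpy as np
from scipy import stats

# One-sample Hotelling T^2 (Eq. 9.2.24) for the chance-level example:
# N = 11, observed means [3.2, 8], hypothesized tau = [2.5, 4].
N = 11
ybar = np.array([3.2, 8.0])
tau = np.array([2.5, 4.0])
Sigma_hat = np.array([[1.44, 0.30],
                      [0.30, 1.69]])

d = ybar - tau
T2 = N * d @ np.linalg.solve(Sigma_hat, d)
p = len(ybar)
T2_crit = p * (N - 1) / (N - p) * stats.f.ppf(0.95, p, N - p)
print(round(T2, 2), round(T2_crit, 2))   # about 104.14 and 9.47; H0 is rejected
```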
Two-group case: When there are J = 2 groups of subjects, we may wish to test that the vector means are equal. The null hypothesis is

H0: μ1 = μ2

or

H0: μ1 − μ2 = 0      (9.2.26)

Equation 9.2.26 has the form of 9.2.16, where M is 2×p, v' = [1  −1], and τ' = 0'.
Assume that ȳ·1 and ȳ·2 are sample vector means for the two groups, based on N_1 and N_2 observations, respectively. Ȳ. is 2×p, with row vectors ȳ'·1 and ȳ'·2. D is 2×2 with diagonal elements N_1 and N_2. The sample covariance matrix for two groups is

Σ̂ = [1/(N−2)] [Σ_i (y_i1 − ȳ·1)(y_i1 − ȳ·1)' + Σ_i (y_i2 − ȳ·2)(y_i2 − ȳ·2)']      (9.2.27)

where N is the total N_1 + N_2. If the general form (Eq. 9.2.22) is applied, the test statistic is

T² = {[1  −1] [ 1/N_1    0   ] [  1 ]}⁻¹ (ȳ·1 − ȳ·2)' Σ̂⁻¹ (ȳ·1 − ȳ·2)
              [   0    1/N_2 ] [ −1 ]

   = [N_1 N_2/(N_1+N_2)] (ȳ·1 − ȳ·2)' Σ̂⁻¹ (ȳ·1 − ȳ·2)      (9.2.28)

We may recognize the univariate form of Eq. 9.2.28 as the square of the t statistic for H0: μ_1 = μ_2. That is, if p = 1, Σ̂ is the scalar σ̂², and

t² = [N_1 N_2/(N_1+N_2)] (ȳ·1 − ȳ·2)² / σ̂²

where σ̂² is the pooled within-group variance.
Row of Θ: T² may be used to test the nullity of a single row or linear combination of rows of Θ. Since a single row is a particular linear combination of rows, with v having one unit element and the remaining elements zeros, we discuss only the general case here. Suppose we wish to test

H0: v'Θ = τ'      (9.2.29)

where Θ is an l×p matrix of contrast parameters and τ' is a 1×p vector of constants, selected according to the research questions. The estimate of Θ is Θ̂ = (K'DK)⁻¹K'DȲ., with covariance matrix (K'DK)⁻¹ ⊗ Σ. The covariance matrix of the linear combination v'Θ̂ is

𝒱(v'Θ̂) = v'𝒱(Θ̂)v = v'[(K'DK)⁻¹ ⊗ Σ]v = [v'(K'DK)⁻¹v]Σ      (9.2.30)

Let Σ̂ be an estimate of Σ, derived under the analysis-of-variance model. The test statistic is

T² = [1/(v'(K'DK)⁻¹v)] (v'Θ̂ − τ') Σ̂⁻¹ (v'Θ̂ − τ')'      (9.2.31)

Note that K'DK, the covariance factors of Θ̂, replaces D in the T² denominator. Let n_e be the degrees of freedom for estimating Σ. Then H0 is rejected if

T² ≥ [p n_e/(n_e − p + 1)] F_{p,n_e−p+1,α}      (9.2.32)
F_{p,n_e−p+1,α} is the upper 100α percentage point of the F distribution, with p and n_e−p+1 degrees of freedom.
We often test the significance of a vector of contrasts, eliminating preceding effects. The same statistic (Eq. 9.2.31) may be employed with u'_l, one row of U, in place of v'Θ̂. I replaces the covariance factors (K'DK)⁻¹, and the premultiplication factor disappears. To test for zero effect, τ is null. The test statistic is

T² = u'_l Σ̂⁻¹ u_l      (9.2.33)

There is a simple correspondence between Λ when n_h = 1 and Hotelling's T² statistic. The relationship of the two is given in Chapter 5 for the regression model. That is,

Λ = 1/(1 + T²/n_e)      (9.2.34)

Both Λ and T² yield the same (exact) F value for one hypothesis degree of freedom.
Example  Let us use Hotelling's T² statistic to test whether the B main effect is significant for the 2×2 crossed design of Table 3.3.2. The null hypothesis is H0: β1 = β2. The row of U for the B contrast, eliminating the constant and A, is given in Section 9.1. That is,

u'_3 = [ -4.33  -4.52 ]
          y1     y2

The within-group variance-covariance matrix is

Σ̂ = [ .91  .65 ]  y1
    [ .65  .61 ]  y2
       y1   y2

Σ̂ has n_e = 18 degrees of freedom. The T² statistic is

T² = u'_3 Σ̂⁻¹ u_3 = 34.49

The .01 critical F value, with 2 and 17 degrees of freedom, is 6.11. Transforming T² to F, we have

F = 17(34.49)/[2(18)] = 16.29

F does exceed the critical value, and H0 is rejected with .99 confidence. There is a significant difference between the vector means for levels of the B main effect. For both variates, the population mean for B1 is lower than for B2.
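A numerical check of Eq. 9.2.33 and of the Λ-T² relationship (Eq. 9.2.34), assuming the printed u'_3 and Σ̂:

```python
import numpy as np

# T^2 for one row of U (Eq. 9.2.33), with the B main-effect example values.
u3 = np.array([-4.33, -4.52])
Sigma_hat = np.array([[0.91, 0.65],
                      [0.65, 0.61]])
n_e, p = 18, 2

T2 = u3 @ np.linalg.solve(Sigma_hat, u3)
F = (n_e - p + 1) / (p * n_e) * T2       # exact F with (p, n_e - p + 1) d.f.
lam = 1 / (1 + T2 / n_e)                 # Eq. 9.2.34: the equivalent likelihood ratio
print(round(T2, 2), round(F, 2), round(lam, 3))   # about 34.5, 16.3, .34
```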
Univariate F tests  Multivariate test statistics should form the basis for decision making whenever criterion variables are aspects of the same behavioral construct. However, separate F statistics for each of the outcome measures provide useful descriptive data. For any one hypothesis, the largest single-variate F ratio is obtained for the variable having the largest between-group difference, relative to within-group variation. Likewise, the smallest single F statistic is obtained for the measure least affected by the group-membership variables; and so on. Multiple univariate F ratios provide relative strength-of-effect estimates for the outcome measures. They may also facilitate comparing mean differences for transformed and untransformed data, for subtest scores computed in different manners, or for multiple factors or principal components of the same test battery. Separate F statistics for variables that are correlated are not independent of one another and should not be used as partial tests of multivariate hypotheses. Hummel and Sligo (1971) compared several interpretive devices for multivariate outcomes. They conclude that the most useful approach is to use a multivariate criterion for the global test. If H0 is rejected, a small number of univariate F statistics may be inspected or tested individually, to determine which variables have important group-mean differences. The general form of the multivariate null hypothesis is given by Eq. 9.2.1. A section of Θ comprising nh rows or contrasts, and p columns or measures, is hypothesized to be null. Specific instances of H0 may be

H01: α_1 = α_2 = α_3      (9.2.35a)

or

H02: β_1 = β_2 = β_3      (9.2.35b)

or involve only a single comparison,

H03: α_1 − α_2 = 0'      (9.2.35c)

Each vector contains the effect for the p separate measures. Univariate results will provide p separate but correlated test statistics, for each element of the vectors, respectively. For example, in place of H01 we may have p univariate hypotheses

H0^(k): α_1^(k) = α_2^(k) = α_3^(k)      (9.2.36)

for outcomes y_k (k = 1, 2, ..., p). The univariate results are simple by-products of the multivariate sum-of-products matrices. Each hypothesis sum of products, S_H, is the sum of n_h (≥ 1) vector products of specified rows of U. S_H is the sum of products between groups, and has n_h degrees of freedom. Dividing, S_H/n_h yields M_H, the p×p matrix of mean squares and products. M_H has the mean squares between groups, for each variable separately, in the diagonal positions. The mean squares and products for error are obtained from the sum of products for error, S_E, and the error degrees of freedom, n_e. The p×p matrix is M_E = S_E/n_e = Σ̂. M_E has the error mean squares for each variate in the diagonal positions. Ratios of the diagonal elements of M_H to those of M_E follow F distributions,
with n_h and n_e degrees of freedom. For any pair of matrices, there are p such ratios. That is,

F_k = [m_h]_kk / [m_e]_kk      (9.2.37)

for k = 1, 2, ..., p. Each F statistic is a test that θ_h^(k) = 0, or that the n_h effects are null for a particular variate. The univariate F statistics are invariant under permutation of criterion measures. Each is exactly the F statistic that would have been obtained if only a single measure were included in the analysis. The univariate F ratios for various contrasts, like their multivariate counterparts, are subject to the conditions of ordering of effects. In a nonorthogonal design, for tests to be independent across main effects and interactions, an ordered elimination procedure is necessary, whether the number of criterion variates is one or many.

Example  We obtain the multivariate test of the interaction effect for the data of Table 3.3.2, on page 314. The sum of products for interaction, eliminating all else, is
S_H4 = [ 2.72  3.30 ]
       [ 3.30  4.01 ]

The sum of products for error is

S_E = [ 16.35  11.72 ]
      [ 11.72  11.00 ]

The mean squares between groups for y1 and y2 are identically the sums of squares, since there is only a single degree of freedom. That is,

[m_h]_11 = 2.72      [m_h]_22 = 4.01

The mean squares within groups are

[m_e]_11 = 16.35/18 = .91  (y1)      [m_e]_22 = 11.00/18 = .61  (y2)

The F ratios for the two variates for interaction, eliminating all else, are

F_1 = 2.72/.91 = 3.00      F_2 = 4.01/.61 = 6.56

Each ratio has 1 and 18 degrees of freedom. Neither exceeds the .01 critical F value. The value for y2 does exceed the .05 critical value however, and we may suspect that there is some interaction for a small portion of the data.
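The univariate ratios are elementwise divisions of the diagonals of M_H by those of M_E; a sketch, with scipy used only to attach the (non-independent) p-values:

```python
import numpy as np
from scipy import stats

# Univariate F ratios (Eq. 9.2.37) for the interaction example, n_h = 1, n_e = 18.
S_H4 = np.array([[2.72, 3.30], [3.30, 4.01]])
S_E = np.array([[16.35, 11.72], [11.72, 11.00]])
n_h, n_e = 1, 18

F = np.diag(S_H4 / n_h) / np.diag(S_E / n_e)
print(np.round(F, 2))                          # about [3.00, 6.56]
print(np.round(stats.f.sf(F, n_h, n_e), 3))    # separate, correlated p-values
```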
Step-down Analysis  Step-down tests may be conducted for one or more analysis-of-variance effects, as for regression effects in Section 5.2. The step-down statistic for the criterion variable y_k is identical to the univariate test that is obtained if preceding criteria are eliminated as predictors in a regression model. Step-down analysis is the same as p−1 ordered analyses of covariance. The step-down procedure enables the researcher to determine the between-group effect for the first criterion measure; for the second criterion measure, eliminating the first; for the third, eliminating the first two; and so on. In this manner it can be determined whether all between-group differences are concentrated in the leading criteria or whether later criteria contribute to group differences, above and beyond earlier measures. Like the other criteria, the step-down statistics depend upon the p×p sum of products for hypothesis S_H and the error sum of products S_E. Assume that S_H is the sum of squares and products of n_h rows of orthogonal estimates, with n_h (≥ 1) degrees of freedom. S_E is the residual or error sum of products, estimated with n_e degrees of freedom. Step-down test statistics are F ratios based upon the conditional variance of each criterion, given preceding measures. These are obtained through the triangular Cholesky factorization of S_E and of the sum S = S_E + S_H. Let us represent these factorizations as
and
S
=
T*[T*]'
(9.2.38b)
[te]kk and [t*hk are the diagonal elements ofTE and T* respectively.
It is shown in Chapter 2 that the Cholesky factor of a covariance matrix contains the standard deviations of the conditional distributions of the variates, eliminating preceding measures. [t_e]²_11 is the error sum of squares for y_1. Each following [t_e]²_kk is the error sum of squares for y_k, given y_1, y_2, ..., y_{k−1}. Similarly, [t*]²_kk is the sum of squares between groups plus error for y_k, given the preceding variates. And [t*]²_kk − [t_e]²_kk is the sum of squares between groups for y_k, given y_1, y_2, ..., y_{k−1}. Dividing the between-group value by its degrees of freedom, n_h, yields the step-down hypothesis mean square. Dividing [t_e]²_kk by its degrees of freedom yields the step-down error mean square. For each criterion eliminated (as with predictors in regression), error degrees of freedom are reduced by one. Thus the degrees of freedom for [t_e]²_kk are n_e − k + 1. This is n_e for [t_e]²_11, and one fewer for each subsequent term. There are p step-down F statistics for any hypothesis of the form H0: Θh = 0. The kth statistic is the ratio of the two conditional mean squares. That is,
F*_k = {([t*]²_kk − [t_e]²_kk)/n_h} / {[t_e]²_kk/(n_e − k + 1)}
     = {([t*]²_kk − [t_e]²_kk)/[t_e]²_kk} · (n_e − k + 1)/n_h      (9.2.39)
The univariate and step-down F statistics for the first variate are identical. F*_2 through F*_p, however, are tests of the between-group effect for y_k, eliminating any portion of the effect that can be attributed to variables preceding y_k in S_E and S_H. F*_k is referred to upper percentage points of the F distribution, with n_h and n_e − k + 1 degrees of freedom. H0 is rejected if, and only if, at least one step-down F statistic exceeds its critical value. Step-down analysis is a multivariate procedure that may be used instead of the likelihood ratio criterion. It provides the only test criteria, of those discussed here, that are dependent upon the order of the dependent variables. If there is no logical order (in terms of complexity, time, and so on) inherent in the outcome measures, step-down procedures are of little value, and may yield misleading results. Important criteria should be ordered first, and the variables that are more dubious, more complex, or occurring at later times, ordered last. In this manner it may be determined whether all significant variation between groups is concentrated in simpler, or earlier, measures. The final F*_p is interpreted first, then F*_{p−1}, F*_{p−2}, and so on. If a significant F*_k is encountered, we have no valid test of preceding terms. y_1 through y_k are deemed necessary to the model and H0 is rejected. Variables may be reordered to test preceding terms, eliminating those already found significant. Assuming a prior logical ordering, however, this practice is questionable. Step-down tests of a given hypothesis under alternative orders of variates are not independent, and would tend to inflate decision error rates. Further discussion of step-down tests is given in Section 5.2. It is possible to assign differing type-1 error rates to the p step-down tests. Let α_k be the probability of a type-1 error, assigned to the variable y_k alone. The overall probability of falsely rejecting at least one of the p subhypotheses is

α = 1 − Π_{k=1}^{p} (1 − α_k)      (9.2.40)
Differential α_k levels can be selected so that a fixed α for the overall hypothesis is maintained (for example, .05 or .01).

Example  The criteria of Table 3.3.2 are attitude scores before and after an intervening stimulus situation. Let us test for the interaction of the A and B main effects using step-down tests. This provides an alternative to the likelihood ratio test, illustrated for the same hypothesis. The error sum of products and its Cholesky factor are
S_E = [ 16.35  11.72 ]      T_E = [ 4.04   0   ]
      [ 11.72  11.00 ]            [ 2.90  1.61 ]

The sum S_E + S_H4 and its Cholesky factor are

S = S_E + S_H4 = [ 19.07  15.02 ]      T* = [ 4.37   0   ]
                 [ 15.02  15.01 ]           [ 3.44  1.78 ]
The step-down ratios are

F*_1 = [(4.37² − 4.04²)/4.04²] · (18 − 1 + 1)/1 = 3.00      (y1)

F*_2 = [(1.78² − 1.61²)/1.61²] · (18 − 2 + 1)/1 = 3.77      (y2, eliminating y1)

F*_1 is identical to the univariate ratio for y1. Interpreting F*_2 first, the value does not exceed the .01 critical F value, with 1 and 17 degrees of freedom. Thus we proceed to F*_1. This value is also not significant (d.f. = 1, 18), and H0 is accepted. Note that in computing the likelihood ratio statistic, |S_E| and |S_E+S_H| are required. Both may be accurately obtained through the Cholesky factorization (see Chapter 2). It is a simple matter in programming to obtain both multivariate test criteria simultaneously.
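Both step-down ratios follow from the diagonals of the two Cholesky factors; a sketch assuming the printed S_E and S_H4:

```python
import numpy as np

# Step-down F statistics (Eq. 9.2.39) for the interaction example.
S_E = np.array([[16.35, 11.72], [11.72, 11.00]])
S_H = np.array([[2.72, 3.30], [3.30, 4.01]])
n_h, n_e = 1, 18

t_e = np.diag(np.linalg.cholesky(S_E))          # conditional error standard deviations
t_star = np.diag(np.linalg.cholesky(S_E + S_H))
p = len(t_e)

for k in range(p):                              # k = 0 corresponds to y1
    F_k = (t_star[k]**2 - t_e[k]**2) / t_e[k]**2 * (n_e - k) / n_h
    print(k + 1, round(F_k, 2))                 # F*_1 ~ 3.0 (1, 18 d.f.); F*_2 ~ 3.8 (1, 17 d.f.)
```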
Multiple Hypotheses  In most cases we are concerned with testing a series of consecutive hypotheses. These may involve the various main effects and interactions in factorial designs. Or they may involve several orthogonal planned contrasts among levels of a single factor. The result is that there may be two or more sections of Θ to be tested, each involving a hypothesis such as Eq. 9.2.1. For example, in a 2×3 crossed design, Θ consists of one row for the constant, one row for the A contrast, two rows for B contrasts, and two for AB interactions. Each row has p elements. The 6×p matrix may be diagrammed in partitioned form as follows:

Θ = [ Θ_k  ]  (1×p)
    [ Θ_A  ]  (1×p)
    [ Θ_B  ]  (2×p)
    [ Θ_AB ]  (2×p)      (9.2.41)
For each section of Θ there is a corresponding section of the matrix of orthogonal estimates U. For the 2×3 example, the sections of U are

U = [ U_k  ]   u'_1        Constant
    [ U_A  ]   u'_2        A, eliminating constant
    [ U_B  ]   u'_3, u'_4  B, eliminating constant and A
    [ U_AB ]   u'_5, u'_6  AB, eliminating constant, A, and B      (9.2.42)
The partition is the same as that given by Eq. 9.1.15. The hypothesis sum-of-products matrices are formed from the rows of U. If we were to test a hypothesis about the constant, the sum-of-products matrix is

S_H1 = U'_k U_k = u_1 u'_1      (9.2.43a)

The degrees of freedom are n_h1 = 1. For the A main effect, the sum of products is

S_H2 = U'_A U_A = u_2 u'_2      (9.2.43b)

with degrees of freedom n_h2 = 1. For B, eliminating A,

S_H3 = U'_B U_B = u_3 u'_3 + u_4 u'_4      (9.2.43c)

S_H3 has n_h3 = 2 degrees of freedom. For interaction,

S_H4 = U'_AB U_AB = u_5 u'_5 + u_6 u'_6      (9.2.43d)

with n_h4 = 2. To test the specific B contrasts, the third and fourth matrices would be u_3 u'_3 and u_4 u'_4, each with one degree of freedom, instead of S_H3. The additional subscript on S_H and n_h is necessary to distinguish one effect from another. When there is more than one hypothesis, the subscripted matrix and degrees of freedom are substituted for S_H and n_h in applying all the test criteria. For example, to test the four multivariate hypotheses, S_H1 through S_H4 are separately employed in place of S_H in Eq. 9.2.13. The respective n_hj replaces n_h in Eqs. 9.2.11 and 9.2.12.
with nh4 = 2. To test the specific B contrasts, the third and fourth matrices would be u3u' 3 and u4u'4, each with one degree of freedom, instead of SH3 • The additional subscript on SHand nh is necessary to distinguish one effect from another. When there is more than one hypothesis, the subscripted matrix and degrees of freedom are substituted for SHand nh in applying all the test criteria. For example, to test the four multivariate hypotheses, SH 1 through SH4 are separately employed in place of SHin Eq. 9.2.13. The respective nh; replaces nh in Eqs. 9.2.11 and 9.2.12. In the general nonorthogonal model (unequal N; or nonorthogonal contrasts), the orthogonal estimates U represent a reparameterization of 8 to orthogonal independent variables in K*. Each effect is tested, eliminating all preceding effects in the model. Both the order of placement of effects in 8 (the "order of elimination") and the order in which the results are interpreted are of consequence. In general, simple effects and those known to be of importance are placed first. The constant term is among these, as are any control variables, blocking variables, and so forth. We test the contribution of other factors, above and beyond those that are deemed necessary in advance of analysis. Complex effects and effects that are the focus of the research are placed in later positions. They are tested for their unique contribution to criterion variation, above and beyond the others. In this manner we avoid attributing effectiveness to experimental variables that may be explained just as well by simpler or better-known factors. Thus in the two-way design, B is the major factor of concern in the study. Interactions are complex functions and are ordered last, usually by degree (two-way interactions, then three-way, then four-way, and so on). If interactions do not contribute to criterion variation above and beyond the main effects, they are omitted from the model. The principle of scientific parsimony dictates that the simpler (main-effect) explanation is preferred. Interactions also indicate the interpretability of main effects. If an interaction is significant, it is likely that simpler main-effect explanations will not adequately describe the outcomes. It is particular combinations of experimental conditions that are effective. Interactions cannot be meaningfully ordered to precede the main effects they involve; for example, the AB interaction cannot precede main effects A orB in the order of elimination.
The first hypothesis test to be interpreted is the last set of effects in Θ, or the last in the order of elimination. In the two-way example, this is H0: Θ_AB = 0. The last terms in the model are the more complex terms (high-order interactions, complex between-group effects). Elimination of these effects from the model is consistent with scientific parsimony. Maintenance of complex effects is an early warning that simpler explanations of the data may not suffice. Only if H0 is accepted for the last effect can a valid test be obtained of preceding terms in U. This may be seen in the diagrams of Eq. 9.1.12. If H0 is accepted for θ'_l, then the row of Θ is null in the population. The last row of Θ̂ reflects only random variation about expectation 0'. All terms t_il θ'_l are zero and disappear from the prior rows of T'Θ. The preceding row is then t_{l−1,l−1} θ'_{l−1}. Tests of significance applied to this row are tests of only θ'_{l−1}. Similarly, if H0 is accepted for θ'_{l−1}, then θ'_{l−1} is null and disappears from rows one through l−2 of the product matrix. Only in this case are prior terms in the ordering not confounded with θ'_{l−1} effects; and so on, through θ'_1. The same situation occurs when multiple rows of U are tested simultaneously (multiple-degree-of-freedom tests). For example, in the 2×3 design, only if the AB interaction is nonsignificant do we have a valid test of the A or B main effects. Here we jointly test the two rows of Θ_AB, that is, θ'_5 and θ'_6. If B is nonsignificant, then A may also be tested in this order. If H0 is rejected for θ'_l or θ'_{l−1}, all preceding rows of T'Θ and U contain fixed components due to these final rows. Preceding effects are confounded and may not be validly tested in this order. For example, if Θ_AB rows are nonzero, the rows of U for A and B main effects have nonzero AB components. Tests on Θ_A or Θ_B may prove significant or nonsignificant, due only to the inclusion of the AB terms. Should tests of preceding terms be of importance, they must be reordered and placed in a position in the basis following those that are significant. For example, if the B main effect is significant and we wish to test the A contrast, we must reorder the B contrasts to precede A in K and Θ. U is reestimated in the new order, and A may be tested, eliminating the significant B effects. In a three-way design (A×B×C) we may find the C main effect to be significant. To test either the A or B main effect, the C contrast(s) must be ordered ahead of those for A and B. If B is significant and we wish to test A, we may require still another ordering, with both B and C effects ahead of A. Interactions should never precede the main effects they involve (for example, the AC interaction should not precede either the A or C main effect, although it may precede B). The number of alternative orders should be kept to a minimum, to avoid compounding statistical error rates. The importance of each main effect should be given careful consideration in establishing an initial order. Anderson (1962) has shown that under a single ordering of effects, differing α levels may be applied to each test of significance and combined to obtain an overall or experimentwise type-1 error rate. If we assign error rate α_j to the hypothesis for S_Hj (j = 1, 2, ..., q), then the probability of falsely rejecting at least one null hypothesis, out of q, is
α = 1 − Π_{j=1}^{q} (1 − α_j)      (9.2.44)
Suppose, for example, in the 2×3 crossed design, that we wish to test the three hypotheses H01: Θ_A = 0, H02: Θ_B = 0, and H03: Θ_AB = 0. Our decision rule is that only if an effect is nonsignificant will we test preceding terms. That is, only if we accept H03 will we test H01 or H02, and only if we accept H02 will we test H01. We assign α_3 = .01 for H03, and α_2 = .05 for H02. Wishing to keep the experimentwise α at no greater than, say, .08, we decide that α_1 for H01 must be no greater than .0218, since

.08 = 1 − (1 − .0218)(1 − .05)(1 − .01)

Any value larger than α_1 = .0218 will increase the product to greater than .08.
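The allocation can be checked by inverting Eq. 9.2.44 for α_1; a small arithmetic sketch:

```python
# Solve Eq. 9.2.44 for the alpha_1 that keeps the experimentwise rate at .08
# when alpha_2 = .05 and alpha_3 = .01.
alpha, alpha2, alpha3 = 0.08, 0.05, 0.01
alpha1 = 1 - (1 - alpha) / ((1 - alpha2) * (1 - alpha3))
print(round(alpha1, 4))   # 0.0218
overall = 1 - (1 - alpha1) * (1 - alpha2) * (1 - alpha3)
print(round(overall, 4))  # 0.08
```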
Notes on Estimation and Significance Testing  Under the general linear model, the magnitudes of the estimates in Θ̂ vary according to the number and type of contrasts selected. Only effects that are nonzero in the population should be maintained in the model. The common research procedure is to test the significance of all terms in the model that may contribute to criterion variation (all main effects and interactions that the design permits). Those that are not significant may be omitted, and best estimates may be obtained of those remaining. This procedure usually requires two passes with most computer programs, one for significance testing and another for estimation. The material of this chapter concerns significance testing. It is assumed that the entire analysis-of-variance model has rank l, or l degrees of freedom (including 1 for the constant). A contrast is estimated corresponding to each degree of freedom. The contribution of every term or set is tested by partitioning the total sum of products into independent components that exhaust all sources of variation. In practice, this testing forms the first stage of the analysis. Once significance testing has been completed, terms that are not significant may be omitted from the model. The smaller number of terms remaining may be reestimated and used in interpreting trends in the data. The number of degrees of freedom that are maintained is c, the rank of the model for estimation. These c contrasts may be combined to estimate cell means and mean residuals (Section 8.3). The maximum value of c is l, the rank of the model for significance testing. Θ̂_c is the c×p matrix of estimates for the reduced model, after significance testing. Computational forms may follow the testing-estimation procedure. The orthogonal estimates U may be obtained first for all l effects by U = [K*]'DȲ. (Eq. 9.1.8). The Cholesky factor of K'DK is constructed at the same time as K is orthonormalized. Θ̂_c is obtained by the inverse relationship from Eq. 9.1.4. Let T_c be the leading c×c submatrix of the Cholesky factor, and U_c the leading c rows of U. Then

Θ̂_c = [T'_c]⁻¹ U_c      (9.2.45)

The MULTIVARIANCE program proceeds from l orthogonal estimates to the c estimates in Θ̂_c. Since both parameters are required prior to run time, c may be initially set to zero, to l, or to some other arbitrary value until the tests of significance can be inspected. A second run, with c corrected, will provide the most useful data for interpretation.
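Equation 9.2.45 is a triangular solve. A minimal sketch, reusing the T' and U printed for the Table 3.3.2 example and assuming, for illustration, that the nonsignificant interaction row is dropped (c = 3):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Reduced-model estimates from Eq. 9.2.45: theta-hat_c = (T_c')^{-1} U_c.
Tprime = np.array([[4.69, -0.21, -0.43,  0.00],
                   [0.00,  2.34, -0.04, -0.21],
                   [0.00,  0.00,  2.31, -0.11],
                   [0.00,  0.00,  0.00,  1.15]])
U = np.array([[11.94, 13.22],
              [-1.91, -1.36],
              [-4.33, -4.52],
              [-1.65, -2.00]])

c = 3                                   # retain constant, A, and B only
Tc = Tprime[:c, :c]                     # leading c x c block of the Cholesky factor
Uc = U[:c, :]                           # leading c rows of U
theta_c = solve_triangular(Tc, Uc)      # upper-triangular solve
print(np.round(theta_c, 2))
```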
9.3

SAMPLE PROBLEMS

Sample Problem 2: Word Memory Experiment
To test the significance of mean differences in the one-way, four-level word memory example, we must first obtain the orthogonal estimates U. All N_j in the problem are equal. Should the contrast vectors be orthogonal, U would consist of standardized row vectors of Θ̂. However, the design of the experiment dictates the selection of contrasts α_1 − α_2, α_2 − α_3, and α_3 − α_4. These are not orthogonal, and a matrix transformation of the estimates to independence is necessary. The original model for the data is

y'_ij = μ' + α'_j + ε'_ij

All terms are 1×2 vectors, with j = 1, 2, 3, 4. The rank of the model matrix A, and reparameterized model matrix K, is l = 4 (one degree of freedom for the constant, plus three between groups). The matrix of contrast parameters is
Θ = [ μ' + (1/4) Σ_j α'_j ]  Constant
    [ α'_1 − α'_2         ]
    [ α'_2 − α'_3         ]  Θ_A (3×2)
    [ α'_3 − α'_4         ]

The single null hypothesis for both variates is H0: Θ_A = 0. Both Θ_A and 0 are 3×2 matrices. H0 is equivalent to H0: α_1 = α_2 = α_3 = α_4. We shall orthonormalize the basis by columns to obtain U. The matrix of subclass frequencies is D = diag(12, 12, 12, 12). K is given in Section 8.4. The factoring is K = K*T', or
[ 1.00   .75   .50   .25 ]   [ .144   .250    0      0   ] [ 6.93    0     0     0   ]
[ 1.00  -.25   .50   .25 ] = [ .144  -.083   .236    0   ] [  0    3.00  2.00  1.00 ]
[ 1.00  -.25  -.50   .25 ]   [ .144  -.083  -.118   .204 ] [  0     0    2.83  1.41 ]
[ 1.00  -.25  -.50  -.75 ]   [ .144  -.083  -.118  -.204 ] [  0     0     0    2.45 ]

The reader may verify that [K*]'DK* = I. The orthogonal estimates are

U = [K*]'DȲ.

  = [K*]' diag(12, 12, 12, 12) [ 39.83   .97 ]
                               [ 42.17   .94 ]
                               [ 40.00  1.00 ]
                               [ 36.25   .97 ]

  = [ 274.10   6.71 ]  u'_1
    [   1.08   -.01 ]  u'_2
    [  11.43   -.12 ]  u'_3
    [   9.19    .08 ]  u'_4
If we wished to test that the constant term is equal to a specified vector (that is, H0: μ + (1/4) Σ_j α_j = τ), we would employ u'_1 and obtain the 2×2 matrix of hypothesis squares and products, S_H = u_1 u'_1. Since the scale of the measures is not of major concern to the study, we shall proceed to test the hypothesis of mean differences. The vectors of U containing the independent estimates for the three-degree-of-freedom test are u'_2, u'_3, and u'_4. Each yields a squares-and-products matrix for one contrast, eliminating preceding effects. For α_1 − α_2, eliminating the constant, the matrix is

u_2 u'_2 = [ 1.17  -.01 ]
           [ -.01   .00 ]
For α_2 − α_3, eliminating the constant and α_1 − α_2, we have

u_3 u'_3 = [ 130.68  -1.35 ]
           [  -1.35    .01 ]

And for α_3 − α_4, eliminating all others, the squares and products are

u_4 u'_4 = [ 84.38  .75 ]
           [   .75  .01 ]
Each matrix has the contrast sums of squares for the two dependent variables in the diagonal positions. The interdependency of the two measures induces a nonzero sum-of-products off-diagonal element. To obtain the three-degree-of-freedom test, the hypothesis matrices are pooled to obtain S_H.† That is,

S_H = Σ_{i=2}^{4} u_i u'_i

    = [ 216.23  -.61 ]  Words
      [   -.61   .02 ]  Categories
†Since the contrasts are not orthogonal, only the last of the set may be tested separately. This is α_3 − α_4 in the present order, with matrix u_4 u'_4. To test the other contrasts separately, they must be reordered and placed in the last position.
In one-way designs and equal-N crossed designs, the between-group sum of products may be obtained directly from vector means for the subclasses and the overall vector mean. Since the present example meets both conditions, we observe that the leading element of S_H, the between-group sum of squares for words, is equivalent to the univariate value:

SS_B = Σ_{j=1}^{4} N_j (ȳ·j − ȳ··)²

In the example, N_j = 12 and

ȳ·· = (39.83 + 42.17 + 40.00 + 36.25)/4 = 39.56

Then

SS_B = 12[(39.83−39.56)² + (42.17−39.56)² + (40.00−39.56)² + (36.25−39.56)²]
     = 216.23

This is precisely the value obtained through matrix operations. By construction of the contrast matrix and basis, however, the result is derived through the estimation of specific contrasts of experimental concern. By expanding the scalars to vectors, sums of cross products as well as sums of squares are obtained simultaneously for p measures (see Morrison, 1967). If subclass frequencies are unequal in crossed designs, then the results may be obtained only through matrix operations. The estimated effects and sums of products are not simple functions of observed vector means. The between-group mean squares and products is S_H divided by its degrees of freedom, n_h = 3. The mean products are

M_H = (1/3) S_H = [ 72.08  -.20 ]  Words
                  [  -.20   .01 ]  Categories
The between-group mean squares for the two measures separately are 72.08 and .01. Because of the relationship between the two measures, we obtain a multivariate test of significance, utilizing both mean squares as well as the mean cross product, −.20. The error sum of products is the within-group matrix, from Section 8.4,

S_E = Y'Y − Ȳ.'DȲ. = [ 1485.58  11.85 ]  Words
                     [   11.85    .22 ]  Categories

The degrees of freedom are n_e = N − J = 44.
The error mean squares and cross products are

M_E = (1/44) S_E = [ 33.76   .27  ]  Words
                   [   .27   .005 ]  Categories
To test H0, we utilize the likelihood ratio criterion,

Λ = |S_E| / |S_E + S_H|

To assess |S_E|, we factor S_E according to the Cholesky method, obtaining S_E = T_E T'_E (although in a simple case such as this we may evaluate |S_E| directly). The Cholesky factor is

T_E = [ 38.54   0  ]
      [   .31  .36 ]

and

|S_E| = |T_E|² = 38.54²(.36)² = 190.12

Similarly, factoring S_E + S_H = T*[T*]', we have

S_E + S_H = [ 1701.81  11.24 ]
            [   11.24    .24 ]

and

T* = [ 41.25   0  ]
     [   .27  .41 ]

Then |S_E + S_H| = 287.33, and the likelihood ratio is

Λ = 190.12/287.33 = .662

The value is noticeably below unity, and may show a significant effect. Λ may be transformed to an F statistic, with multipliers m and s. That is,

m = n_e − (p + 1 − n_h)/2 = 44 − (2 + 1 − 3)/2 = 44

and

s = √[(3²·2² − 4)/(3² + 2² − 5)] = 2
The F transform is

F = [(1 − Λ^(1/s))/Λ^(1/s)] · [(ms + 1 − n_h p/2)/(n_h p)]
  = [(1 − √.662)/√.662] · [(44(2) + 1 − (3·2)/2)/(3(2))]
  = .23 · (86/6) = 3.29

The .05 critical value of the F distribution with 6 and 86 degrees of freedom is 2.2. The observed F value is larger, and H0 is rejected. Before we conclude that the research hypothesis is supported, however, we must inspect the direction of mean differences. According to the original hypothesis, group means are expected to be highest when a maximal amount of structural information is provided the subject. That is, the means were expected to be highest for group 1, and sequentially lower for groups 2, 3, and 4. Inspection of the means Ȳ. reveals that there is no such trend for variable 2, proportion of categories reconstructed. For the first variate, number of words recalled, the trend exists for groups 2, 3, and 4. Inspection of Θ̂ (Section 8.4) reveals that the group-1 mean of 39.83 is not significantly higher or lower than the group-2 mean. The estimated difference α̂_1(1) − α̂_2(1) is −2.33, with standard error 2.37. The t value, −2.33/2.37, is not significant. We may conclude that there is some support, although not complete, for the major hypothesis of the study. It may be of some interest to inspect the F ratios for the separate outcome measures. These are ratios of between-group mean squares to the respective error mean squares. The univariate F statistics are

F_1 = [m_h]_11/[m_e]_11 = 72.08/33.76 = 2.13  (Words)

and

F_2 = [m_h]_22/[m_e]_22 = .007/.005 = 1.36  (Categories)
Each ratio is referred to an F distribution, with n_h = 3 and n_e = 44 degrees of freedom. Although the two are not independent, neither individually exceeds the .05 critical value of F_{3,44}. Presentation of the analysis for the example should include descriptive matrices Ȳ., D, and R_W, as well as Θ̂ and the corresponding standard errors. If predicted means are obtained from a subset of effects, these should be presented in as simple a manner as possible, for each hypothesis. Since all four effects are necessary to the one-way model, predicted means are identical to the observed means and are not obtained here (that is, c = l = 4). The analysis-of-variance summary is presented in Table 9.3.1.
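The likelihood ratio machinery of Section 9.2 verifies the whole test; a sketch assuming the printed S_E and S_H (here s is well-defined, so no substitution is needed):

```python
import numpy as np

# Word memory test: p = 2, n_h = 3, n_e = 44, using the printed matrices.
S_E = np.array([[1485.58, 11.85], [11.85, 0.22]])
S_H = np.array([[216.23, -0.61], [-0.61, 0.02]])
n_e, n_h, p = 44, 3, 2

lam = np.linalg.det(S_E) / np.linalg.det(S_E + S_H)
m = n_e - (p + 1 - n_h) / 2
s = np.sqrt((n_h**2 * p**2 - 4) / (n_h**2 + p**2 - 5))
F = (1 - lam**(1/s)) / lam**(1/s) * (m*s + 1 - n_h*p/2) / (n_h*p)
print(round(lam, 3), round(F, 2))   # about .66 and 3.3 with (6, 86) d.f.

# Univariate F ratios from the diagonals, as in the text
print(np.round(np.diag(S_H/n_h) / np.diag(S_E/n_e), 2))   # about [2.13, 1.36]
```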
Table 9.3.1  Analysis of Variance for Word Memory Experiment

Source                                  d.f.   Mean Products                   Multivariate F (d.f.)   Univariate F
                                               (Words, Categories)                                     Words   Categories
Constant                                  1    [ 75129.19  1839.66 ]                    -                -         -
                                               [  1839.66    45.05 ]
Between groups, eliminating constant      3    [ 72.08  -.20 ]                    3.29* (6, 86)         2.13      1.36
                                               [  -.20   .01 ]
Within groups                            44    [ 33.76   .27  ]
                                               [   .27   .005 ]
Total                                    48

*Significant at p < .05.

The experimental effect is more strongly seen in the number-of-words variable. However, only when the two responses are considered jointly do we
see significantly different outcomes with differing degrees of structural information. It does not appear that information regarding the organization of word lists universally increases memory of the words. The directional hypothesis, as forwarded by Mandler and Stephens (1967), has received little support. Further analysis may determine whether the mean differences are altered in direction or magnitude when adjustments are made for varying periods of time utilized by the subjects in memorizing the word lists (see analysis of covariance).
Sample Problem 3: Dental Calculus Reduction  The data for the dental calculus study are measured calculus scores on the six anterior teeth of the lower mandible. The results of the study are analyzed by fitting a two-way fixed-effects analysis-of-variance model to the data. The sampling factors are years of experimentation (A), having two levels, and five experimental conditions (B). The total number of cells is J = 10. The experimental conditions are two control agents (groups 1 and 2) and three different anticalculus agents (groups 3, 4, and 5). Three subclasses have no observations, those for condition 5 in year 1, and for conditions 2 and 4 in year 2. The sampling design is diagrammed on page 16. The number of cells having one or more observations is J_0 = 7. The full model has rank l = J_0 = 7. The degrees of freedom are one for the population constant, four for contrasts among experimental conditions, one for the years effect, and one for the conditions-by-years interaction. Three additional interaction degrees of freedom are lost due to the missing cells. The effect of different anticalculus agents is the most critical to the study. However, there are aspects of the design that require inspection prior to interpreting agent differences. One, it is necessary to see whether there is an interaction of agents and years, with agents having different effects if administered at different times. If not, it is likely that both the A effect and the A×B interaction may be deleted from the model. These are ordered last for significance testing so that they may be inspected for significance first, eliminating agent effects. (Note by comparison that if the years effect were assumed or known to be
significant, it would have been ordered ahead of the agents effect. Then agents could be tested, eliminating year differences.) Further, it is necessary to test differences between the two "controls" (experimental conditions 1 and 2) to see whether they are equivalent. Only if they are can they be used together as a single comparison group. The difference between the two controls is the "experimental conditions" contrast ordered last. Its test can be interpreted individually to determine whether control differences add to criterion variation above and beyond other agent effects. If not, the control effect may be deleted and the other agent differences tested without reordering. Θ may be considered as ordered and partitioned into sections corresponding to the various effects:

        [ Θ_k   (1×6) ]   Constant
        [ Θ_B   (3×6) ]   Active agents
  Θ  =  [ Θ_B*  (1×6) ]   Controls
        [ Θ_A   (1×6) ]   Years
        [ Θ_AB  (1×6) ]   Interaction
Each column of Θ contains the measured calculus contrasts for one of the six mandibular teeth. The model and basis for the design are constructed in Chapter 8. The years effect is the simple comparison of year 1 with year 2, or symbolically C1. The agents effects are the comparison of agent 1 (group 3) with the average of the two controls (groups 1 and 2), the comparison of agent 2 with the controls, the comparison of agent 3 with the controls, and the difference of the two control means. Since these contrasts do not fit the regular pattern of other contrast types, the rows of L are represented as L1, L2, L3, and L4 for the four effects, respectively. Contrast matrices for the two factors (L_2 and L_5) are constructed and transformed to one-way bases. Kronecker products of one column from K_2 and one from K_5 produce the entire basis for the 10-group design (see Section 8.4). Deleting the fifth, seventh, and ninth rows for empty groups, we have

         C0⊗L0  C0⊗L1  C0⊗L2  C0⊗L3  C0⊗L4  C1⊗L0  C1⊗L1
        [  1     -.2    -.2    -.2     .5     .5    -.1  ]   Control 1, year 1
        [  1     -.2    -.2    -.2    -.5     .5    -.1  ]   Control 2, year 1
  K  =  [  1      .8    -.2    -.2     0      .5     .4  ]   Agent 1, year 1
        [  1     -.2     .8    -.2     0      .5    -.1  ]   Agent 2, year 1
        [  1     -.2    -.2    -.2     .5    -.5     .1  ]   Control 1, year 2
        [  1      .8    -.2    -.2     0     -.5    -.4  ]   Agent 1, year 2
        [  1     -.2    -.2     .8     0     -.5     .1  ]   Agent 3, year 2
Columns do not now sum to zero. However, orthogonalizing from the leading unit vector will assure that the column weights multiplied by the respective N_j will have zero sum. To obtain the orthogonal estimates, we first orthogonalize K with respect to the diagonal matrix of subclass frequencies. The frequencies, eliminating diagonal zeros, are D = diag (8, 9, 7, 5, 28, 24, 26). The total sample size is N = 107. The Gram-Schmidt factorization is K = K*T', with [K*]'DK* = I and T' upper triangular; the leading diagonal element of T' is 10.34 = √N. The first column of K* is constant, with elements .10; the second is [-.06  -.06  .15  -.06  -.06  .15  -.06]'.
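The factorization K = K*T' is a Gram-Schmidt process in the inner product defined by D. A minimal sketch under our naming, reusing K from the previous sketch:

```python
import numpy as np

def weighted_gram_schmidt(K, d):
    """Factor K = Ks @ T with [Ks]' D Ks = I, D = diag(d), T upper triangular."""
    n, m = K.shape
    Ks = np.zeros((n, m))
    T = np.zeros((m, m))
    for j in range(m):
        v = K[:, j].astype(float)
        for i in range(j):
            T[i, j] = Ks[:, i] @ (d * v)   # projection in the D inner product
            v = v - T[i, j] * Ks[:, i]
        T[j, j] = np.sqrt(v @ (d * v))     # D-norm of the residual
        Ks[:, j] = v / T[j, j]
    return Ks, T

d = np.array([8, 9, 7, 5, 28, 24, 26], dtype=float)  # subclass frequencies
Ks, T = weighted_gram_schmidt(K, d)
assert np.allclose(K, Ks @ T)
print(T[0, 0])   # ~10.34, the square root of N = 107
```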
The orthogonal estimates are U = [K*]'DȲ.. The matrix of observed means for the seven subclasses having observations is

         [  .75  2.25  3.75  4.13  2.25   .88 ]   Control 1, year 1
         [ 1.33  1.78  3.11  3.33  2.56  1.56 ]   Control 2, year 1
  Ȳ.  =  [  .43   .86  1.29  1.57  1.00   .43 ]   Agent 1, year 1
         [ 1.00   .80  2.00  1.20   .60   .00 ]   Agent 2, year 1
         [  .68  1.57  2.71  2.75  1.57   .71 ]   Control 1, year 2
         [  .54   .79  2.08  1.71   .96   .67 ]   Agent 1, year 2
         [  .23   .42   .77  1.31   .65   .19 ]   Agent 3, year 2

The columns are the six teeth, in the order right canine, right lateral incisor, right central incisor, left central incisor, left lateral incisor, left canine.
The orthogonal estimates are

         [ 6.28  11.41  21.56  22.43  13.05   6.19 ]   u'_1  Constant
         [-1.96  -1.20  -3.24  -1.94    .04   -.54 ]   u'_2  Agents
  U  =   [ -.98   -.37  -2.70  -1.81  -1.40    .85 ]   u'_3
         [-5.32  -8.97  -7.32  -5.01  -2.92  -2.40 ]   u'_4
         [ -.75  -2.24  -2.16   -.45   -.15  -1.71 ]   u'_5  Controls
         [ -.09   2.29   1.30   1.34    .62   -.05 ]   u'_6  Years
         [ -.68  -1.04  -3.12  -2.57  -1.08   -.31 ]   u'_7  Interaction
From U we are able to derive the sums of squares and cross products for the hypotheses of interest. We are not concerned with the nullity of the constant term, since all subjects show some predisposition to calculus formation. To compare the three active agents with the controls (eliminating the constant), the sum-of-products matrix is
          [  6.78                                  (Symmetric) ]   Right canine
          [ 13.00  33.08                                       ]   Right lateral incisor
  S_H1 =  [ 21.86  50.38  81.94                                ]   Right central incisor
          [ 17.03  47.94  70.50  71.41                         ]   Left central incisor
          [ 11.55  32.24  47.93  47.89  32.18                  ]   Left lateral incisor
          [  5.79  16.82  26.63  25.03  17.09  10.48           ]   Left canine
S_H1 has n_h1 = 3 degrees of freedom. The sum of squares and cross products for comparing the two control groups, eliminating the constant and active-agent effects, is

  S_H2 = u_5 u'_5 =

          [ 2.94                               (Symmetric) ]
          [  .26    .02                                    ]
          [  .77    .07    .20                             ]
          [ 1.28    .11    .33    .56                      ]
          [ 3.71    .32    .97   1.61   4.67               ]
          [ 3.83    .33   1.00   1.67   4.83   5.00        ]

S_H2 has n_h2 = 1 degree of freedom.
The matrices for years (eliminating the constant and all conditions) and for the years-by-agents interaction (eliminating all main effects) are S_H3 = u_6 u'_6 and S_H4 = u_7 u'_7, respectively. These are

          [  .00                               (Symmetric) ]
          [ -.07   1.80                                    ]
  S_H3 =  [ -.03    .83    .39                             ]
          [ -.11   3.07   1.42   5.24                      ]
          [ -.06   1.75    .81   2.99   1.70               ]
          [  .00   -.11   -.05   -.19   -.11   .01         ]

and

          [  .10                               (Symmetric) ]
          [  .33   1.09                                    ]
  S_H4 =  [  .98   3.26   9.74                             ]
          [  .81   2.68   8.03   6.62                      ]
          [  .34   1.13   3.38   2.79   1.17               ]
          [  .21    .71   2.12   1.75    .74   .46         ]
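Each hypothesis matrix is a product of rows of U. A sketch continuing the previous ones (Ks and d as defined there; the subclass means are entered from the display of Ȳ. above, columns in the tooth order given there):

```python
import numpy as np

ybar = np.array([
    [0.75, 2.25, 3.75, 4.13, 2.25, 0.88],   # control 1, year 1
    [1.33, 1.78, 3.11, 3.33, 2.56, 1.56],   # control 2, year 1
    [0.43, 0.86, 1.29, 1.57, 1.00, 0.43],   # agent 1, year 1
    [1.00, 0.80, 2.00, 1.20, 0.60, 0.00],   # agent 2, year 1
    [0.68, 1.57, 2.71, 2.75, 1.57, 0.71],   # control 1, year 2
    [0.54, 0.79, 2.08, 1.71, 0.96, 0.67],   # agent 1, year 2
    [0.23, 0.42, 0.77, 1.31, 0.65, 0.19],   # agent 3, year 2
])

U = Ks.T @ (d[:, None] * ybar)     # orthogonal estimates, one row per effect

SH1 = U[1:4].T @ U[1:4]            # active agents, 3 d.f.
SH2 = np.outer(U[4], U[4])         # controls, 1 d.f.
SH3 = np.outer(U[5], U[5])         # years, 1 d.f.
SH4 = np.outer(U[6], U[6])         # years-by-agents interaction, 1 d.f.
```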
Both n_h3 and n_h4 are 1. The error sum of products is given in Section 3.4. S_E may be obtained either as a residual, subtracting U'U from the total Y'Y, or directly as the within-groups matrix, since all seven between-group effects are included in K. That is,
          [ 137.90                                        (Symmetric) ]   Right canine
          [ 101.91  261.87                                            ]   Right lateral incisor
  S_E  =  [  81.04  217.53  423.98                                    ]   Right central incisor
          [ 107.23  260.35  431.20  619.14                            ]   Left central incisor
          [  73.59  166.67  242.52  348.96  308.62                    ]   Left lateral incisor
          [  91.41  100.58  111.87  129.87  121.44  157.90            ]   Left canine
The error degrees of freedom are n_e = N - J_0 = 100. The error mean squares and cross products are M_E = S_E/100. The six error mean squares are on the diagonal of M_E, that is, σ̂_k² (k = 1, 2, ..., 6). The first hypothesis to be tested is that ordered last in the Gram-Schmidt elimination. This is the test of interaction, or H_0: Θ_AB = 0. Equivalently, H_0 is γ_jk = 0 for all (j, k). Only for interactions, however, does equivalence of terms imply that the individual effects are null. The likelihood ratio for interaction is

  Λ = |S_E| / |S_E + S_H4|
To obtain |S_E|, we factor S_E by the Cholesky procedure. The factor is

          [ 11.74                                      (Zero) ]
          [  8.68  13.66                                      ]
  T_E  =  [  6.90  11.54  15.59                               ]
          [  9.13  13.26  13.80  13.02                        ]
          [  6.27   8.22   6.69   6.94  10.43                 ]
          [  7.78   2.42   1.94  -.00   3.82   8.55           ]
The determinant of S_E is the squared product of the diagonal elements of T_E:

  |S_E| = prod_{k=1}^{6} [t_E]_kk²

Since this product is unwieldy, we may instead use natural logarithms. That is,

  log_e |S_E| = 2 Σ_{k=1}^{6} log_e [t_E]_kk
              = 2(log_e 11.74 + log_e 13.66 + log_e 15.59 + log_e 13.02 + log_e 10.43 + log_e 8.55)
              = 29.764

Factoring S_E + S_H4, the Cholesky factor is

          [ 11.75                                      (Zero) ]
          [  8.70  13.68                                      ]
  T*   =  [  6.98  11.70  15.75                               ]
          [  9.20  13.37  13.88  13.03                        ]
          [  6.29   8.26   6.69   6.95  10.43                 ]
          [  7.80   2.44   1.97  -.00   3.81   8.55           ]
From T*,

  log_e |S_E + S_H4| = 2 Σ_{k=1}^{6} log_e [t*]_kk = 29.791

From the two logs,

  log_e |S_E| - log_e |S_E + S_H4| = log_e (|S_E| / |S_E + S_H4|)
                                   = 29.764 - 29.791 = -.027

The likelihood criterion is

  Λ = e^{-.027} = .973

Λ is close to 1 and we suspect that the interaction is not significant.
The F transformation is

  F = [(1 - Λ^{1/s}) / Λ^{1/s}] × [ms + 1 - n_h4 p/2] / (n_h4 p)

The multipliers are

  m = n_e - (p + 1 - n_h4)/2 = 100 - (6+1-1)/2 = 97

and

  s = [(p² n_h4² - 4) / (p² + n_h4² - 5)]^{1/2} = [(6²(1²) - 4) / (6² + 1² - 5)]^{1/2} = 1

The F statistic is

  F = [(1 - .973)/.973] × [97(1) + 1 - (1)(6)/2] / [(1)(6)] = .028 × 95/6 = .43
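The full likelihood-ratio computation, Cholesky log determinants followed by Rao's F transformation, packages into a few lines. A sketch under our naming, with SH4 from the earlier sketch and SE keyed in from its display above:

```python
import numpy as np

def wilks_F(SH, SE, nh, ne):
    """Wilks' likelihood ratio via Cholesky log determinants, with Rao's F."""
    p = SE.shape[0]
    logdet = lambda S: 2.0 * np.log(np.diag(np.linalg.cholesky(S))).sum()
    lam = np.exp(logdet(SE) - logdet(SE + SH))
    m = ne - (p + 1 - nh) / 2.0
    den = p * p + nh * nh - 5
    s = 1.0 if den <= 0 else np.sqrt((p * p * nh * nh - 4.0) / den)
    F = (1 - lam**(1 / s)) / lam**(1 / s) * (m * s + 1 - nh * p / 2.0) / (nh * p)
    return lam, F, nh * p, m * s + 1 - nh * p / 2.0

lam, F, df1, df2 = wilks_F(SH4, SE, nh=1, ne=100)
print(round(lam, 3), round(F, 2), df1, df2)   # ~.973, ~.43, 6, 95
```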
The test statistic is referred to the F distribution, with 6 and 95 degrees of freedom. F does not exceed the critical value at α = .05, and H_0 is accepted. The six univariate test statistics provide descriptive data concerning the effect upon particular teeth. The univariate statistics are simple ratios of hypothesis to error mean squares. That is,

  F_k = ([s_H4]_kk / n_h4) / ([s_E]_kk / n_e)
For the interaction effect with matrix S_H4,

  F_1 = (.10/1) / (137.90/100)  = .07     Right canine
  F_2 = (1.09/1) / (261.87/100) = .42     Right lateral incisor
  F_3 = (9.74/1) / (423.98/100) = 2.30    Right central incisor
  F_4 = (6.62/1) / (619.14/100) = 1.07    Left central incisor
  F_5 = (1.17/1) / (308.62/100) = .38     Left lateral incisor
  F_6 = (.46/1) / (157.90/100)  = .29     Left canine
Each F ratio has 1 and 100 degrees of freedom. Although the tests are not independent, no single F statistic exceeds the .05 critical value. The decision to accept H_0 is further supported. There appears to be some trend for the F ratio to be higher for the middle teeth, or those with higher mean calculus scores, and to decrease symmetrically toward the two sides. Since Θ_AB = 0, we may proceed to test H_0: Θ_A = 0 in the same order of effects. The hypothesis is equivalently that the vector means for the two years are equal, or H_0: α_1 = α_2. The hypothesis matrix is S_H3, having one degree of freedom. The likelihood ratio is
  Λ = |S_E| / |S_E + S_H3| = .976

The F approximation is F = .38, with 6 and 95 degrees of freedom. F does not exceed the critical value, and H_0 is accepted. The agents show no significant difference in effectiveness over years of experimentation, for the six teeth jointly. Since the years effect is not significant, we may proceed to test the equality of the two control groups, and between active agents and the controls. If years were significant, we would not have valid tests of the preceding effects in this order of elimination. Instead, we would reorder the effects so that "years" is ahead of "agents," and reorthogonalize. We could then obtain tests of the controls and agents, eliminating significant year differences. Instead, we proceed to test equality of the vector means for the two control groups. The null hypothesis is H_0: Θ_B* = 0, or H_0: β_1 = β_2. The hypothesis matrix is S_H2. The likelihood ratio is
  Λ = |S_E| / |S_E + S_H2| = .943

The F statistic for all six variates is F = .96, with 6 and 95 degrees of freedom. The critical F value is not exceeded; H_0 is supported. The two control groups both appear to have only the base product or unaltered control dentifrice. Since years, controls, and interaction effects are nonsignificant, we may obtain a valid test of the active-agent effects, in this order. The hypothesis is H_0: Θ_B = 0. The hypothesis matrix is S_H1, with n_h1 = 3 degrees of freedom. The likelihood ratio is

  Λ = |S_E| / |S_E + S_H1|

We have determined that log_e |S_E| = 29.764. To obtain log_e |S_E + S_H1|, we sum the two matrices and factor according to Cholesky, (S_E + S_H1) = T*[T*]'. The triangular factor is

          [ 12.03                                      (Zero) ]
          [  9.55  14.27                                      ]
  T*   =  [  8.55  13.05  16.20                               ]
          [ 10.33  14.69  13.69  13.45                        ]
          [  7.08   9.20   6.78   7.12  10.46                 ]
          [  8.08   2.82   2.01    .19   3.87   8.72          ]
Then

  log_e |S_E + S_H1| = 2 Σ_{k=1}^{6} log_e [t*]_kk = 30.086

The log of the likelihood ratio is

  log_e (|S_E| / |S_E + S_H1|) = 29.764 - 30.086 = -.322

and

  Λ = e^{-.322} = .725

The multipliers for F are

  m = 100 - (6+1-3)/2 = 98

and

  s = [(6²(3²) - 4) / (6² + 3² - 5)]^{1/2} = 2.8284

The test statistic requires 1/s = .354. Then

  F = [(1 - .725^.354) / .725^.354] × [98(2.8284) + 1 - (3)(6)/2] / [3(6)]
    = .121 × 269.186/18 = 1.80
F has 18 and 269.186 degrees of freedom. The distribution may be evaluated by mathematical approximation, as in the MULTIVARIANCE program, or by rounding the degrees of freedom down to the nearest whole number (269). F exceeds the tabled critical value of 1.6, and H_0 is rejected at α = .05. For further information about the teeth most strongly affected, we may inspect the six univariate F ratios. The ratios of hypothesis to error mean squares are

  F_1 = (6.78/3) / (137.90/100)  = 1.64    Right canine
  F_2 = (33.08/3) / (261.87/100) = 4.21    Right lateral incisor
  F_3 = (81.94/3) / (423.98/100) = 6.44    Right central incisor
  F_4 = (71.41/3) / (619.14/100) = 3.84    Left central incisor
  F_5 = (32.18/3) / (308.62/100) = 3.48    Left lateral incisor
  F_6 = (10.48/3) / (157.90/100) = 2.21    Left canine
The effect appears strongest for the middle teeth, or those having the greatest overall calculus formation. The four central teeth (incisors) have univariate F ratios that exceed the .05 critical F value, with 3 and 100 degrees of freedom. These, however, are not independent and should not be employed as partial tests of the multivariate hypothesis. The constant term and three active-agent contrasts are all necessary to the model. The appropriate rank of the model for estimation is c = 4. Best estimates of nonzero effects are obtained by deleting all terms for which H_0 was accepted and determining Θ̂ for those remaining. This is the matrix estimate given on page 282, along with the standard errors in H. Inspection of Θ̂_4 reveals that virtually all contrasts of agents with controls are negative; all agents have produced lower mean calculus scores than the control dentifrice. For comparability across elements of Θ̂_4, each estimated contrast is divided by its respective standard error. The resulting matrix has one row for the constant and one for each of agents 1, 2, and 3.
The most effective agent is consistently agent 3. The effect is accentuated still further on those teeth showing the highest univariate F ratios, that is, the right central and lateral incisors. By comparison, agent 2 appears to have the smallest effect, and by itself may not differ significantly from the controls. The standardized contrasts for agent 2 may be compared to critical values of the t distribution with 100 degrees of freedom. No single tooth shows a significant improvement. However, a multivariate test of the contrast for the six teeth together may suggest otherwise. The central teeth are known to develop higher calculus concentrations than those farther to the sides. Step-down analysis may be considered as an alternative to Λ for testing agent effectiveness. All significant differences among agents may be "concentrated" in the four central teeth, or in the two central incisors alone. Peripheral teeth may not contribute any additional significant between-group variation. Let us employ the step-down technique to test H_0: Θ_B = 0, although we might have used the same procedure for all hypotheses. For the step-down analysis, we may reorder the variates according to centrality: right and left central incisors, followed by right and left lateral incisors, and finally the right and left canines. This constitutes a simple reordering of the elements of S_E and S_H1, but does not affect the values of the elements or the determinants. Λ is unaltered by the reordering. The step-down statistics are specific to the predetermined order of variates, however.
The Cholesky factors of S_E and S_E + S_H1, with the modified order of elements, are

                 [ 20.59                                     (Zero) ]
                 [ 20.94  13.44                                     ]
  T_E (reord.) = [ 10.56   2.91  11.91                              ]
                 [ 11.78   7.61   1.69  10.44                       ]
                 [  3.94   1.85   4.62    .52   9.87                ]
                 [  5.43   1.20   3.33   4.09   5.10   8.55         ]

and

                 [ 22.49                                     (Zero) ]
                 [ 22.31  13.89                                     ]
  T* (reord.)  = [ 11.91   3.07  11.99                              ]
                 [ 12.91   7.83   1.76  10.47                       ]
                 [  4.57   1.60   4.63    .52   9.97                ]
                 [  6.16   1.26   3.35   4.13   4.94   8.72         ]
Each diagonal element is the square root of the conditional sum of squares for one variate, eliminating those preceding. The step-down F ratios are ratios of conditional hypothesis mean squares to conditional error mean squares. That is,

  F*_k = {([t*]_kk² - [t_E]_kk²) / n_h} / {[t_E]_kk² / (n_e - k + 1)}
The six ratios are

  F*_1 = [(22.49² - 20.59²)/3] / [20.59²/(100-1+1)] = 27.31/4.24 = 6.44    Right central incisor
  F*_2 = [(13.89² - 13.44²)/3] / [13.44²/(100-2+1)] = 4.15/1.82 = 2.28     Left central incisor
  F*_3 = [(11.99² - 11.91²)/3] / [11.91²/(100-3+1)] = .63/1.45 = .43       Right lateral incisor
  F*_4 = [(10.47² - 10.44²)/3] / [10.44²/(100-4+1)] = .17/1.12 = .15       Left lateral incisor
  F*_5 = [(9.97² - 9.87²)/3] / [9.87²/(100-5+1)]   = .68/1.01 = .67        Right canine
  F*_6 = [(8.72² - 8.55²)/3] / [8.55²/(100-6+1)]   = .99/.77 = 1.28        Left canine
The first step-down F* statistic is identical to the univariate F ratio for the right central incisor. Each subsequent ratio tests the effect of agents on the particular tooth, eliminating variation attributable to prior teeth. Each has one fewer degree of freedom for error. F*_1 is referred to the F distribution with 3 and 100 degrees of freedom, F*_2 with 3 and 99, and so on.
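The step-down ratios are a one-line function of the two Cholesky diagonals. A sketch using the reordered diagonals printed above (names ours):

```python
import numpy as np

te    = np.array([20.59, 13.44, 11.91, 10.44, 9.87, 8.55])  # diag of TE (reord.)
tstar = np.array([22.49, 13.89, 11.99, 10.47, 9.97, 8.72])  # diag of T* (reord.)
nh, ne = 3, 100

k = np.arange(1, 7)
F_step = ((tstar**2 - te**2) / nh) / (te**2 / (ne - k + 1))
print(np.round(F_step, 2))   # ~[6.44 2.28 .43 .15 .67 1.28], d.f. (3, ne-k+1)
```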
Table 9.3.2  Analysis of Variance for Anticalculus Agents

                                      Degrees of   Simultaneous Test        Step-down F Statistics*
Source                                Freedom      Λ      F (d.f.)          R.C.I.  L.C.I.  R.L.I.  L.L.I.  R.C.   L.C.
Constant                              1            -      -                 -       -       -       -       -      -
Agents, eliminating constant          n_h1 = 3     .725   1.80† (18, 269)   6.44†   2.28    .43     .15     .67    1.28
Controls, eliminating constant
  and agents                          n_h2 = 1     .943   .96 (6, 95)       .05     .05     .01     2.99    2.30   .35
Years, eliminating constant,
  agents, and controls                n_h3 = 1     .976   .38 (6, 95)       .09     1.50    .30     .01     .41    .06
Agents × years, eliminating
  all above                           n_h4 = 1     .973   .43 (6, 95)       2.30    .19     .12     .08     .00    .03

                                                          Univariate Mean Squares
Residual (within)                     n_e = 100           4.24    6.19    2.62    3.09    1.38    1.58
Total                                 N = 107

*R.C.I., right central incisor; L.C.I., left central incisor; R.L.I., right lateral incisor; L.L.I., left lateral incisor; R.C., right canine; L.C., left canine. Step-down tests conducted in this order.
†Significant at p < .05.
Beginning with F*_6 and proceeding backward, there is no significant effect for the left canine, eliminating all others; no significant effect for the right canine, eliminating all incisors; no significant effect for the left lateral incisor, eliminating the central and right lateral incisors; no significant effect for the right lateral incisor, eliminating central incisors; and, at α = .05, no significant effect for the left central incisor, eliminating the right. However, F*_1 does exceed the .05 critical F value with 3 and 100 degrees of freedom, and H_0 is rejected. Should some other F* ratio exceed the critical value, the same hypothesis would be rejected sooner, and testing would stop at that point. However, because of strong interdependencies of the six teeth, it appears that the effectiveness of anticalculus agents may be assessed from only a single tooth, and at most from the two central incisors. We conclude that the agents under study (especially agent 3) are effective in reducing calculus formation. The descriptive data for the study are the means, variances, correlations, and contrast estimates and their standard errors. In addition, an analysis-of-variance summary table may be useful in conveying the results of the hypothesis testing. A suggested format is given by Table 9.3.2. Especially in the nonorthogonal design, notations to indicate the order of effects are crucial. Should alternate orderings be necessary, the table may become more complex and present only selected results from among several orderings.
Sample Problem 4 - Essay Grading Study

The essay grading data are analyzed by fitting a four-way fixed-effects analysis-of-variance model to the data. The factors of classification are pupil race (A), having two levels; pupil sex (B), with two levels; pupil ability (C), with two levels; and essay pairs (D), having four levels. The total number of groups is J = 32, with N = 112 observations. The cell frequencies vary from 1 to 6 per subclass. All effects are experimental in the sense of assigned and manipulated
by the researcher. However, the same model may be employed in any comparative or nonexperimental situation as well. The two dependent variables are scores on essay topics I and II, respectively. Sample means and N's are given in Table 8.4.1. The complete model has 32 degrees of freedom between groups; this is l, the rank of the model for significance testing. The complete reparameterization is represented symbolically in Table 7.4.2. Simple contrasts are used for the race, sex, and ability factors, and deviation contrasts for essay pairs. The first eight basis vectors are presented following Table 8.4.1. Further vectors of K are found in the same manner, or may be obtained as term-by-term multiples of the leading columns. To obtain the orthogonal estimates for the full model, we orthogonalize the complete 32×32 basis, and multiply by the cell sums [DȲ.]; the resulting estimates form the 32×2 matrix U. Each effect is estimated eliminating all preceding effects. The error sum of products for the two essay topics is the within-group matrix, S_E = Y'Y - Ȳ.'DȲ. = S_W. This is given in Section 8.4. That is,
  S_E = [ 13877.68   8475.37 ]   Essay topic I
        [  8475.37  17429.10 ]   Essay topic II

S_E has N - J = 112 - 32 = 80 degrees of freedom. The matrix of mean squares and cross products is the within-group variance-covariance matrix

  M_E = Σ̂ = (1/80) S_E = [ 173.47  105.94 ]   Essay topic I
                         [ 105.94  217.86 ]   Essay topic II

The error mean squares are σ̂_1² = 173.47 and σ̂_2² = 217.86 for the two topics separately. To test the fit of the model, we compute the F transform of Λ for each main effect and interaction, in reverse order. There are a total of fifteen hypothesis sums of products, one for each main effect, and one for each interaction of two or more design factors. We shall assume that there is no particular interest in the constant term. The last effect in the order of elimination is the four-way interaction, eliminating all other effects. The sum of products is
  S_H15 = Σ_{i=30}^{32} u_i u'_i = [ 978.54  329.45 ]
                                   [ 329.45  449.22 ]

S_H15 has n_h15 = 3 degrees of freedom. The hypothesis is H_015: Θ_ABCD = 0, where Θ_ABCD is the final three rows of Θ for the four-way interaction; or else H_015: (αβγδ)_jklm = 0. The likelihood ratio is
  Λ = |S_E| / |S_E + S_H15| = .904

The F transformation, with p = 2, m = 80, and s = 2, is

  F = [(1 - √.904)/√.904] × [80(2) + 1 - (3)(2)/2] / [3(2)] = .052 × 158/6 = 1.36
F does not exceed the .05 critical value of the F distribution, with 6 and 158 degrees of freedom; H_015 is accepted. The univariate F statistics for the two essay topics separately, for the four-way interaction, eliminating all other effects, are

  F_1 = (978.54/3) / 173.47 = 1.88    Essay topic I

and

  F_2 = (449.22/3) / 217.86 = .69     Essay topic II
Neither exceeds the critical F value with 3 and 80 degrees of freedom. Having found the four-way interaction nonsignificant, we proceed with the test of the three-degree-of-freedom sex-by-ability-by-essays interaction, with S_H14 = u_27 u'_27 + u_28 u'_28 + u_29 u'_29. The results for this and all other hypotheses, in the specified order of elimination, are given in Table 9.3.3. Reading from the last hypothesis backward, we see that no significant multivariate test statistic is encountered until we reach the race-by-sex interaction. A single significant univariate F ratio is found for the race-by-sex-by-essays three-way interaction, but the decision to accept H_012 is made from the simultaneous test only. We may later inspect the data to locate the source of the single significant univariate ratio. Similarly, one univariate F ratio for the essay-pair main effect is not significant. H_04 is rejected, however, based upon the multivariate likelihood ratio criterion of .57. Λ provides information on both stimuli simultaneously, and can be tested independently of the other multivariate tests, with a prespecified type-I error rate. However, before investigating the essay-pair main effect, we must inspect the significant race-by-sex interaction. Under this ordering of effects, all preceding terms (all main effects) are confounded with race-sex interactions, and the resulting test criteria are not valid. To understand the interaction, we may inspect means predicted by a rank-8 model, which includes all main effects plus the race-by-sex interaction. These are obtained by letting the rank of the model for estimation be c = 8, and estimating only the leading 8 rows of Θ. The estimates are combined with the leading columns of K to predict Ȳ. = K_8 Θ̂_8. Table 8.4.2 on page 295 has predicted means for all 32 groups. To investigate the race-by-sex interaction, the means are combined to obtain means for each combination of pupil sex and race. The significant interaction may be interpreted in terms of the teachers having very different reactions to white boys and girls, and about the same expectations for all black children regardless of sex. Or it may be that Negro females are given a score advantage relative to white girls, while Negro males are punished. Let us consider the other main effects. A significant interaction confounds main effects in two manners. First, although the test of interaction is made eliminating main effects, main-effect tests are confounded with interaction sums of products. Second, for interpretation, the existence of interaction suggests that tests of simple main effects may not be valid.
Table 9.3.3  Analysis of Variance for Four-factor Essay-grading Study

                                            Degrees of   Simultaneous Test       Univariate F Statistic
Source                                      Freedom      Λ      F (d.f.)         Essay Topic I   Essay Topic II
Constant                                    1            -      -                -               -
Race, eliminating constant                  n_h1 = 1     .99    .25 (2, 79)      .15             .22
Sex, eliminating constant and race          n_h2 = 1     .91    3.79* (2, 79)    .08             6.46*
Ability, eliminating constant, race,
  and sex                                   n_h3 = 1     .88    5.41* (2, 79)    10.50*          5.43*
Essay pairs, eliminating constant,
  race, sex, and ability                    n_h4 = 3     .57    8.61* (6, 158)   2.70            12.25*
Race × sex, eliminating constant and
  all main effects                          n_h5 = 1     .92    3.49* (2, 79)    6.36*           4.34*
Race × ability, eliminating all above       n_h6 = 1     .99    .36 (2, 79)      .60             .52
Race × essays, eliminating all above        n_h7 = 3     .95    .71 (6, 158)     .48             .80
Sex × ability, eliminating all above        n_h8 = 1     1.00   .08 (2, 79)      .08             .01
Sex × essays, eliminating all above         n_h9 = 3     .95    .71 (6, 158)     .79             .38
Ability × essays, eliminating all above     n_h10 = 3    .95    .65 (6, 158)     .63             .24
Race × sex × ability, eliminating all
  above                                     n_h11 = 1    .98    .97 (2, 79)      1.20            1.77
Race × sex × essays, eliminating all
  above                                     n_h12 = 3    .86    2.00 (6, 158)    1.94            3.18*
Race × ability × essays, eliminating
  all above                                 n_h13 = 3    .94    .86 (6, 158)     .72             1.10
Sex × ability × essays, eliminating
  all above                                 n_h14 = 3    .96    .54 (6, 158)     .46             .88
Race × sex × ability × essays,
  eliminating all else                      n_h15 = 3    .90    1.36 (6, 158)    1.88            .69

                                                                Mean Squares
Within                                      n_e = 80            173.47          217.86
Total                                       N = 112

*Significant at p < .05.
For example, overall mean differences across sex groups appear small or inconsistent. When subjects are divided by race, however, sex differences for whites are large and consistent, while those for black pupils are not. We may wish to estimate or test sex differences separately for each racial group. Or, we may wish to compare the ratings for black and white males separately from those for black and white females. Either approach consists of an alternate reparameterization of the original model. A nested design reparameterization replaces the crossed design, for those factors of concern. For example, to test sex differences separately for the two racial groups, the sex and race-by-sex contrasts are replaced by a sex contrast for Negro pupils and one for whites. That is, Kronecker products C0⊗C1⊗C0⊗D0 (sex) and C1⊗C1⊗C0⊗D0 (race-by-sex) are replaced by 1_1⊗C1⊗C0⊗D0 (sex within race 1) and 1_2⊗C1⊗C0⊗D0 (sex within race 2).

Logically, the race-by-sex interaction does not confound either the ability or essay-pairs main effect. However, because of the nonorthogonality of the design, there remains a mathematical interdependence. The essay-pairs effect is not of particular concern. Mean differences are expected. Since there is no vested interest in these particular stimuli, we shall not be concerned with testing the magnitude or direction of differences. However, the ability effect is of concern. We may order the race-by-sex interaction ahead of ability and reorthogonalize. First, let us evaluate the degree to which ability is confounded with the race-by-sex interaction. The variance-covariance factors of the estimates are assessed in Section 8.4, as matrix G. The covariance of the ability and race-by-sex (fourth and eighth) effects is -.0001, with variances of .03622 and .14738, respectively. The correlation of the two effects is

  -.0001 / √(.03622 × .14738) = -.0013

Making the subjective decision that the confounding is negligible, we proceed directly with a test of the ability effect, without concern for the race-by-sex interaction. The essay main effect is significant and must be eliminated from ability. We must reorthogonalize the basis with the ability column following all the other main effects. That is, index the leading columns of K in the original order, as vectors k_1, k_2, ..., k_7; k_4 is the ability-effect column. The vectors are reordered to k_1, k_2, k_3, k_5, k_6, k_7, k_4. We may reorthogonalize the leading seven columns of K by Gram-Schmidt; all following columns will remain unaltered. The seventh row of U now estimates the ability effect, eliminating the constant and the three other main effects. The revised hypothesis sum of products for H_0: γ_1 = γ_2, or H_0: Θ_C = 0, is
  S_HC = [ 1816.86  1284.61 ]
         [ 1284.61   908.28 ]

S_E is defined as previously. Then

  Λ = |S_E| / |S_E + S_HC| = .883
The multivariate test statistic is F = 5.23, with 2 and 79 degrees of freedom. H_0 is rejected at α = .05. The univariate results for ability, eliminating all other main effects, are

  F_1 = (1816.86/1) / 173.47 = 10.47    Essay topic I

and

  F_2 = (908.28/1) / 217.86 = 4.17      Essay topic II
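With p = 2 the determinants are 2×2 and the transformation to F is exact, so the revised ability test is easily verified. A sketch with the printed matrices (names ours):

```python
import numpy as np

SE  = np.array([[13877.68,  8475.37],
                [ 8475.37, 17429.10]])
SHC = np.array([[ 1816.86,  1284.61],
                [ 1284.61,   908.28]])
p, nh, ne = 2, 1, 80

lam = np.linalg.det(SE) / np.linalg.det(SE + SHC)
m = ne - (p + 1 - nh) / 2.0          # s = 1 for a 1-d.f. hypothesis
F = (1 - lam) / lam * (m + 1 - nh * p / 2.0) / (nh * p)
print(round(lam, 3), round(F, 2))    # ~.883 and ~5.2, with (2, 79) d.f.
```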
Table 9.3.4  Estimated Means and Mean Differences for Ratings of Essays When Teachers Have Been Told the Authors Were High-ability and Low-ability Fifth-grade Pupils*

Student Group                     Essay Topic I†    Essay Topic II‡
High ability                      49.37             51.96
Low ability                       41.25             46.22
Estimated difference              8.12              5.74
Standard error of difference      2.51              2.81

*Range of possible ratings is from 9 to 90 points.
†My favorite school subject.
‡What I think about.
Both exceed the .05 critical F value with 1 and 80 degrees of freedom. Although none of these results differs dramatically from those of Table 9.3.3, in other situations the reordering may produce large alterations. These multivariate and univariate results should replace those in the table, with appropriate labeling, rather than requiring a second table for the alternate order of effects. The data may be inspected to determine the source of the significant ability effect. The predicted means of Table 8.4.2 may be combined across groups, to obtain unweighted averages for high- and low-ability pupils. These results, together with the estimated differences from Θ̂_8 and their standard errors, are given in Table 9.3.4. When teachers perceived that the essays were written by high-intelligence, high-achievement pupils, their mean ratings were invariably higher than if they thought the authors were of low ability and achievement. The differences are even more pronounced for individual essays (see Table 8.4.1). The expectations held by the teachers for pupils of differing ability and achievement records may be so strong as to pervade their evaluations of the pupils' actual performance. The evaluations, in turn, may play a significant role in shaping the child's own behavior, both in and out of the classroom. From Table 9.3.3, we see that the eight leading effects in K and Θ̂ describe factors that affect the teachers' ratings. These are all main-effect degrees of freedom, plus the race-by-sex interaction. The other terms do not contribute, and may be eliminated. The model of rank c = 8 is confirmed for these data. This is the rank assumed for estimation in Section 8.4. Let us turn briefly to the large univariate race-by-sex-by-essay-pair F ratio. The observed means for race-sex combinations, for the second essay topic, have a single mean out of the range of the others. The mean score on topic II, for teachers scoring essay pair 1 who thought the pupils were white females, is low, with value [10×1 + 29.25×4]/5 = 25.4. It is observed in Chapter 8 that subject S88, with the lowest score, did not truly participate in the study. Reanalysis without S88 reduces the single three-way interaction to nonsignificance.
Sample Problem 5 - Programmed Instruction Effects

The final example is the conditions-by-classes-by-sex design for testing classroom procedures for absenteeism compensation. The fixed effects in the model are experimental conditions (remediation versus no remediation; that is, experimental versus control) and sex (male-female). The units of analysis are classroom means; mean achievement has been separately determined for males and females in each class. Classes are considered a random effect, nested within experimental conditions and crossed with sex. There are 19 seventh-grade teachers employing the remediation program, and 18 giving pupils no special remediation when mathematics material is missed. The dependent variables are three measures of cognitive achievement, administered at the end of the school year: the Cooperative Mathematics Test, the Stanford Modern Mathematics Concepts Test, and the (City) Junior High School Mathematics Test. The mean model is given by Eq. 7.1.34, in which
ȳ._jkl is the vector mean for class k, nested within treatment j, for sex group l. c_k(j) is a random class vector, with (k = 1, 2, ..., 19) for the experimental condition, and (k = 1, 2, ..., 18) for the control. ε._jkl is assumed to follow a trivariate normal distribution, with covariance matrix Σ; c_k(j) is assumed normal with covariance matrix Σ_c. For programming purposes, the nested design may be treated as a 2×19×2 crossed design (this is conditions×classes×sex, or A×B×C; it is not the order in which factors appear in the model). In reparameterization to contrasts, effects may be chosen so that the variance among classes is estimated separately for each experimental condition. The symbolic representation of the reparameterization is presented in Section 7.4. There is only one vector observation per subclass (per sex-class combination). Thus we may obtain the sex-by-classes-in-groups interaction as a residual, obtaining orthogonal estimates for only the main effects and conditions-by-sex interaction. The symbolic contrasts and effects in the parameter matrix are
        [ Θ_k   (1×3)    C0⊗D0⊗C0    Constant                ]
        [ Θ_A   (1×3)    C1⊗D0⊗C0    Experimental conditions ]
        [ Θ_C   (1×3)    C0⊗D0⊗C1    Sex                     ]
        [ Θ_AC  (1×3)    C1⊗D0⊗C1    Conditions×sex          ]
  Θ  =  [                1_1⊗D1⊗C0                           ]
        [                1_1⊗D2⊗C0    Classes in             ]
        [ Θ_B  (35×3)      ...        experimental group     ]
        [                1_1⊗D18⊗C0                          ]
        [                1_2⊗D1⊗C0                           ]
        [                1_2⊗D2⊗C0    Classes in             ]
        [                  ...        control group          ]
        [                1_2⊗D17⊗C0                          ]
Θ and U have l = 39 rows (the rank of the model for significance testing). For
obtaining the Kronecker products, the vectors from one-way bases are multiplied in the order conditions-classes-sex. The complete K is 76×39; the last two rows are deleted since there are only 18 control classes. Because of the nested factor and multiple sources of random variation (Θ_B contains random effects), we must consider the sources of variation and the expected sums of squares and cross products, prior to significance testing. Let us partition the triangular factor of K'DK, as we did Θ. The upper triangular Cholesky factor is

         [ T'_k   (1×39) ]
         [ T'_A   (1×39) ]
  T'  =  [ T'_C   (1×39) ]
         [ T'_AC  (1×39) ]
         [ T'_B  (35×39) ]

T satisfies K'DK = TT'. The entire matrix of orthogonal estimates is U = T'Θ̂ = [K*]'DȲ., of order 39×3. The portion of U corresponding to any effect is the corresponding section of T' times the entire estimate Θ̂. For example, U for the experimental-groups effect is the 1×3 submatrix U_A = T'_A Θ̂. Because of the upper triangular nature of T', rows of U_B are weighted sums of only the rows of Θ̂_B; U_AC is a weighted sum of rows of Θ̂_AC and Θ̂_B; U_C is a weighted sum of rows of Θ̂_C, Θ̂_AC, and Θ̂_B; and so on.
Table 9.3.5  Partition of Sums of Products for Mixed Model, Nested Design

Source                                    d.f.        Sum of Products       Expected Sum of Products
Fixed effects:
  Constant                                1           U'_k U_k
  Experimental conditions,
    eliminating constant                  n_h1 = 1    S_H1 = U'_A U_A       m_1 Σ + m_2 Σ_c + Θ'T_A T'_A Θ
  Sex, eliminating conditions
    and constant                          n_h2 = 1    S_H2 = U'_C U_C       m_3 Σ + Θ'T_C T'_C Θ
  Conditions×sex, eliminating constant,
    conditions, and sex                   n_h3 = 1    S_H3 = U'_AC U_AC     m_3 Σ + Θ'T_AC T'_AC Θ
Random effects:
  Classes within conditions,
    eliminating fixed effects             n_e1 = 35   S_E1 = U'_B U_B       m_4 Σ + m_5 Σ_c
  Sex×classes within conditions,
    eliminating all above (residual)      n_e2 = 35   S_E2 = Y'Y - U'U      m_6 Σ
Total                                     N = 74      S_T = Y'Y
The partition of sums of products is given in Table 9.3.5. m_1 through m_6 are multipliers which depend upon the number of observations per subclass and the number of levels of the experimental factors. This partition of sums of products may be employed for any such design, with design factors A and C fixed, and levels of B random, nested within A and crossed with C. There is one unit of analysis per subclass, so that there is no within-group variation. The design arises commonly in educational experimentation when class means are the units of analysis (Glass, 1968; Raths, 1967). To determine the appropriate tests of hypotheses, we compare the expected sums of products. S_E1 contains all components in S_H1 except the A fixed effect. Thus S_E1 will form the correct error sum of products for experimental conditions. S_E2 has only a random Σ component, and is an appropriate error term for S_H2 and S_H3. Each has the Σ component, plus a single fixed effect of concern. Finally, should we wish to test a hypothesis concerning Σ_c, a comparison of S_E1 and S_E2 would be appropriate. Since variation among classes is expected, we shall omit a test of this hypothesis. The sample partition of effects is obtained from the orthogonal estimates. U has row u'_1 for the constant, u'_2 for experimental conditions, u'_3 for sex, u'_4 for conditions-by-sex, and u'_5, u'_6, ..., u'_39 for classes within conditions. The sums of products are

  S_H1 = u_2 u'_2 = [   .57          (Symm.) ]   Experimental conditions,
                    [ -1.94    6.57          ]   eliminating constant
                    [  2.87   -9.74   14.45  ]

  S_H2 = u_3 u'_3 = [ 12.32          (Symm.) ]   Sex, eliminating
                    [ -2.92     .69          ]   conditions and constant
                    [  1.65    -.39     .22  ]

  S_H3 = u_4 u'_4 = [  1.73          (Symm.) ]   Conditions×sex, eliminating
                    [   .98     .56          ]   constant, conditions, and sex
                    [ -1.04    -.59     .62  ]

  S_E1 = Σ_{i=5}^{39} u_i u'_i = [ 865.95            (Symm.) ]   Classes within conditions,
                                 [ 575.90   514.97           ]   eliminating fixed effects
                                 [ 677.25   479.62   621.88  ]
The residual sum of products is

  S_E2 = Y'Y - U'U

       = [ 20332.73                      (Symm.) ]   [ 20268.15                      (Symm.) ]
         [ 22444.51  25275.56                    ] - [ 22438.02  25188.27                    ]
         [ 16193.22  17947.80  13054.33          ]   [ 16172.53  17943.84  13016.10         ]

       = [ 64.58          (Symm.) ]   Sex×classes within conditions,
         [  6.49   87.29          ]   eliminating all else
         [ 20.69    3.96   38.23  ]
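The interaction test is then a single determinant ratio against the residual. A sketch with the matrices as displayed (names ours):

```python
import numpy as np

SH3 = np.array([[ 1.73,  0.98, -1.04],    # conditions x sex, u4 u4'
                [ 0.98,  0.56, -0.59],
                [-1.04, -0.59,  0.62]])
SE2 = np.array([[64.58,  6.49, 20.69],    # residual, 35 d.f.
                [ 6.49, 87.29,  3.96],
                [20.69,  3.96, 38.23]])

# Multivariate test of conditions x sex against the residual error term
lam = np.linalg.det(SE2) / np.linalg.det(SE2 + SH3)
print(round(lam, 3))                                    # ~.926

# Univariate F ratios: 1-d.f. hypothesis mean squares / residual mean squares
print(np.round(np.diag(SH3) / (np.diag(SE2) / 35), 2))  # ~[.94 .22 .57]
```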
For all matrices the three variables are in the following order: Cooperative Mathematics Test, Stanford Modern Mathematics Test, and (City) Junior High School Mathematics Test. The first hypothesis to be tested involves the last fixed effect in the order of elimination, the conditions-by-sex interaction. The hypothesis is H_0: Θ_AC = 0. The hypothesis matrix is S_H3 and the error matrix is S_E2. From these,

  Λ = |S_E2| / |S_E2 + S_H3| = .926

The F statistic, with p = 3, n_h3 = 1, and n_e2 = 35, is F = .88, with 3 and 33 degrees of freedom. F does not exceed the .05 critical value. The univariate F statistics are

  F_1 = (1.73/1) / (64.58/35) = .94    Cooperative test
  F_2 = (.56/1) / (87.29/35) = .23     Stanford test
  F_3 = (.62/1) / (38.23/35) = .56     City test
None exceeds the .05 critical F value, with 1 and 35 degrees of freedom. We accept the null hypothesis. The second null hypothesis to be tested is H_0: Θ_C = 0, for the equality of sex-group means. The hypothesis matrix is S_H2 and the error matrix is S_E2, with

  Λ = |S_E2| / |S_E2 + S_H2|

The corresponding F approximation is F = 2.41, with 3 and 33 degrees of freedom. Although F does not exceed the .05 critical value, it does approach significance. The three univariate test statistics are F_1 = 6.67, F_2 = .28, and F_3 = .20, respectively. Each is referred to the F distribution with 1 and 35 degrees of freedom. It appears that there is some sex differentiation on the Cooperative Mathematics Test. However, based upon the simultaneous test statistic, we continue to maintain H_0. Since neither sex nor conditions-by-sex is significant, we may test experimental conditions in this order. The hypothesis is H_0: Θ_A = 0, or H_0: α_1 = α_2. The hypothesis and error matrices are S_H1 and S_E1, respectively. S_E1 is a "special effects" error term, comprised of variation due to the last 35 contrasts coded in Θ. The likelihood ratio criterion is

  Λ = |S_E1| / |S_E1 + S_H1|
The corresponding F statistic is F = 2.74. F comes very close to, but does not exceed, the .05 critical value with 3 and 33 degrees of freedom. The three univariate F statistics are F_1 = .02, F_2 = .45, and F_3 = .81. No one measure displays any significant between-group variation. We must again maintain H_0. Finding none of the experimental effects to be significant, we conclude that the treatment was ineffective. The children were not able to compensate for absenteeism through utilization of the individually programmed materials. However, it is possible that the error sums of products may be inflated by systematic class differences in mean absenteeism. That is, in classes with the highest absenteeism rate it may have been impractical to have the large numbers of returning children operate the technical machinery. Thus we may still employ the technique of analysis of covariance, to remove any confounding with the additional measured antecedent, absenteeism rate. There is little point in our estimating means or mean differences other than the mean of all subjects. Note, however, that in estimating standard errors we must utilize the appropriate error variances for the respective effect. The estimate of Σ for standard errors of experimental-condition effects is S_E1/35, while that for the sex and sex-by-conditions effects is S_E2/35. For example, let the rank of the model for estimation be c = 2, to estimate the constant and experimental-groups effects. Then
K_2 contains the first two columns of the basis. These estimates are

  Θ̂_2 = (K'_2 D K_2)^{-1} K'_2 D Ȳ.

       = 10^{-4} × [ 135.23   -7.31 ] K'_2 D Ȳ.
                   [  -7.31  540.94 ]

       = [ 16.18  18.27  12.92 ]   Constant
         [   .17   -.60    .88 ]   Experimental - control

The columns are the Cooperative, Stanford, and City tests, respectively.
The mean of the 19 experimental classes on the Cooperative test is 16.27 and the control mean is 16.10. The estimates show that the mean of all classes is 16.18, and the experimental-control difference is +.17 point. The corresponding standard errors are H = gd', where g is a two-element column vector of square roots of the diagonals of (K'_2 D K_2)^{-1} and d' is a three-element row vector of variable standard deviations. The first of these is
  g = [ √.0135 ]  =  [ .116 ]
      [ √.0541 ]     [ .233 ]
For d we require the variances of the three measures within experimental groups, that is, across classes. The variance-covariance matrix (mean squares and products) is

  M_E1 = S_E1/35 = [ 24.74          (Symm.) ]   Cooperative test
                   [ 16.45   14.71          ]   Stanford test
                   [ 19.35   13.70   17.77  ]   City test
M_E1 may be reduced to correlational form in the usual manner to estimate the test intercorrelations. These are correlations among class means, however, rather than across individual pupils. The results for individuals and classes may be very dissimilar. d', the vector of variable standard deviations, is

  d' = [ √24.74  √14.71  √17.77 ] = [ 4.97  3.84  4.22 ]
The standard errors are

  H = g d' = [ .116 ] [ 4.97  3.84  4.22 ]
             [ .233 ]

           = [  .58   .45   .49 ]   Constant
             [ 1.16   .89   .98 ]   Experimental - control
None of the separate experimental differences is even as large as one standard error, and therefore none is significant.
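The closing estimates-and-standard-errors step reduces to two small products. A sketch from the printed quantities (names ours):

```python
import numpy as np

G2 = 1e-4 * np.array([[135.23,  -7.31],       # (K2'DK2)^-1, as printed
                      [ -7.31, 540.94]])

g = np.sqrt(np.diag(G2))[:, None]             # [.116, .233]': root diagonals
d = np.sqrt(np.array([24.74, 14.71, 17.77]))  # SDs among class means, from ME1

H = g * d[None, :]                            # 2 x 3 matrix of standard errors
print(np.round(H, 2))                         # rows: constant, exp.-control
```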
CHAPTER 10

Analysis of Variance: Additional Topics

10.1 DISCRIMINANT ANALYSIS
Through the method of discriminant analysis, a set of p variates may be linearly transformed to a new set of s (≤ p) measures, with properties facilitating the interpretation and analysis of group-mean differences. Namely, the transformed variables, or discriminant variables, are constructed in such a way that the univariate F ratios (mean square between groups/mean square within) for these variables are maximal. In some instances more than one discriminant variable is computed for a given set of data. A second variable is constructed that has maximum between-group discrimination on a dimension orthogonal to the first. If a third is computed, it describes between-group variation on a dimension orthogonal to the first two, and so on. In total, the number of discriminant variables necessary to describe all between-group variation is the minimum of the between-group degrees of freedom and the number of variates. That is,

  s = min (n_h, p)                                        (10.1.1)

Suppose, for example, that we have two groups of observations, with each observation measured on three criterion variates, y_1, y_2, and y_3. We may graph the group vector means as points in a three-dimensional Euclidean space. Let the means on the three measures for group one be ȳ'._1 = (ȳ._1(1), ȳ._1(2), ȳ._1(3)), and for group two be ȳ'._2 = (ȳ._2(1), ȳ._2(2), ȳ._2(3)). If we utilize the means for each group to give us coordinates on the three axes, we locate the vector means as in Figure 10.1.1. Regardless of the dimensionality of the representation, all between-group variation is described by a single variable (z), which, unlike the y variables, exactly separates points ȳ._1 and ȳ._2. The discriminant variable z has greater between-group dispersion than any of the y_k. In turn, z may be expressed as a linear compound of y_1, y_2, and y_3. With z, the location of ȳ._1 or ȳ._2 may be expressed in terms of only a single coordinate, that is, the distance from the origin on z. For this situation, s = min (2-1, 3) = 1. It is possible that the number of groups could exceed two, in which case the number of dimensions of between-group variation would increase, up to a maximum of p = 3.
Figure 10.1.1  Two group vector means represented in three-space.
As a second example, consider four groups of observations, each measured on two variates. The group vector means may be represented in terms of y_1 and y_2, as in Figure 10.1.2. The original vector means may all be located in Figure 10.1.2 by specifying the y_1- and y_2-coordinates. However, a variable different from either of these (z_1) has maximal between-group distance of any line passing through the origin, including the two y_k. Thus z_1 is the first discriminant variable. The points ȳ._j do not fall exactly on z_1, however. Thus we require a second discriminant variable, to describe between-group variation in a direction orthogonal to z_1. In two-dimensional space, there is only one such additional dimension, represented by z_2. In higher-order spaces z_2 is selected from all dimensions orthogonal to z_1 to maximize group differences on that dimension; z_3 is selected from all dimensions orthogonal to z_1 and z_2, and so on up to z_s. In Figure 10.1.2, z_1 and z_2 may be expressed as linear combinations of y_1 and y_2. Then ȳ._1 and ȳ._2 may be located in terms of their coordinates with respect to the new z axes. Similarly, the estimated mean differences may be expressed in terms of the z rather than y. We refer to the z, or discriminant, variates as canonical variates, since they express mean differences in a standard and perhaps simpler form. The least-squares estimates of group mean contrasts may be expressed for the z_i as the canonical form of the estimated differences.
Figure 10.1.2  Four group vector means represented in two-space.
In Figure 10.1.2, s = min (4-1, 2) = 2. Whenever s is greater than unity, we may test whether between-group differences on one, two, or all of the discriminant measures are significant. It is possible, for example, that all significant between-group variation is described by the single dimension z_1. Deviations from the z_1 line on z_2 may represent not population mean deviations, but only random error. Intuitively, we can see that a test of between-group differences on all s canonical variates must yield identical results to the overall test in terms of the y_k, that is, the test of H_0: μ_1 = μ_2 = μ_3 = μ_4. This is exactly the case; group mean differences are neither increased nor decreased in transforming to canonical variates, but merely expressed in terms of new axes. Technically, the discriminant problem is one of determining weights to apply to the y_k to form z variates that have the properties illustrated above. If we let a_i (i = 1, 2, ..., s) be a vector of weights, then

  z_i = a'_i y                                            (10.1.2)

is the ith discriminant variable. We would like to choose a_i so as to maximize the between-group-to-within-group variance ratio for z_i. Elements a_ki are the discriminant coefficients. Let us assume that we have a p×p sum-of-products matrix S_H for any between-group hypothesis, with n_h degrees of freedom (n_h ≥ 1). Also, S_E is the error sum of products, with n_e degrees of freedom. The between-group sum of squares for z_i is a'_i S_H a_i. Likewise, the within-group sum of squares for z_i is a'_i S_E a_i.
The variance ratio to be maximized to obtain the first discriminant function is

  λ_1 = a'_1 S_H a_1 / [(1/n_e) a'_1 S_E a_1]             (10.1.3a)

It is necessary to restrict Eq. 10.1.3a so that it will have only a single solution. A convenient side condition is to require the error variance of z_i to be unity. That is,

  (1/n_e) a'_i S_E a_i = 1                                (10.1.4)

As a result, λ_i is a measure of between-group variation on z_i, and is called the canonical variance. Larger values of λ_i indicate more disparate group mean vectors on the discriminant variable. If a second discriminant function is required, λ_2 is maximized, where

  λ_2 = a'_2 S_H a_2 / [(1/n_e) a'_2 S_E a_2]             (10.1.3b)

The condition of Eq. 10.1.4 is maintained for a_2. In addition, a_2 is constrained to be orthogonal to a_1 in the error metric. That is,

  a'_2 S_E a_1 = 0                                        (10.1.5)

As a result, sample values for z_1 = a'_1 y and z_2 = a'_2 y are always uncorrelated. If further discriminant functions are necessary (that is, s > 2), each λ_i maximizes an expression such as Eq. 10.1.3a or b, subject to the constraints of unit within-group variances (Eq. 10.1.4) and orthogonality with all prior functions (Eq. 10.1.5). These properties can be described in terms of matrices. Let A be a p×s matrix having column vectors a_i. Let Λ be an s×s diagonal matrix of canonical variances λ_i, ordered from largest (λ_1) to smallest (λ_s). The conditions of Eqs. 10.1.3-10.1.5 are

  A' S_H A = Λ                                            (10.1.6)

and

  (1/n_e) A' S_E A = I                                    (10.1.7)

Maximum values for λ_i and associated vectors a_i are the solutions of the homogeneous equations

  [S_H - λ_i (1/n_e) S_E] a_i = 0                         (10.1.8)

where λ_i satisfies

  |S_H - λ_i (1/n_e) S_E| = 0                             (10.1.9)

Estimates of λ_i and a_i are obtained by substituting sample hypothesis and error matrices in Eq. 10.1.8. The solutions are the characteristic roots and vectors of S_H, in the metric (1/n_e)S_E. The characteristic roots and vectors may be computed in a number of ways (for example, Jacobi and Householder-Ortega-Wilkinson). The computations are heavy even for small problems, however, and computer routines are essential (Bock and Repp, 1970). For some computational ease, the equations may be transformed to a one-matrix characteristic
equation problem, as in Chapter 2, and in Chapter 6 for the estimation of canonical correlations. Let λ̂_i and â_i be the estimates of λ_i and a_i, respectively, comprising matrix estimates Λ̂ and Â. Â may be estimated in raw form or in a form applicable to standardized measures y_k. This is accomplished by multiplying each estimated weight â_ki by the corresponding y standard deviation σ̂_k. In matrix representation, let

  D_E = diag (S_E/n_e) = diag (Σ̂)                         (10.1.10)

where D_E is the p×p diagonal matrix of variances of the y variates. The matrix of standardized coefficients is

  Â* = D_E^{1/2} Â                                        (10.1.11)

Care must be exercised in the interpretation of the weight values, for they do not merely reflect the relative contribution of the y variates to between-group discrimination. The magnitudes of the weights are functions of the scaling of the criterion measures as well as their intercorrelations. That is, addition or deletion of a single measure may have a major effect upon coefficients for remaining variables. Standardization, as in Eq. 10.1.11, removes scaling effects. The interdependencies are not so easily removed, however. Instead of the examination of discriminant weights, either the univariate or step-down test statistics should be used for locating the sources of group discrimination. For example, the largest of p univariate test statistics is obtained for the maximally discriminating measure. The estimates of group-mean differences and their standard errors for particular y variables reveal the magnitude, direction, and precision of the effect. These results are directly interpretable in terms of the original and better-understood outcome measures. Results from the discriminant functions are likely to be both more complex and more tenuous. Means and mean contrasts may be transformed to the discriminant metric to aid in interpretation. If group j in the analysis-of-variance design has mean vector ȳ._j, then the mean for the group on z_i is

  z̄._j(i) = â'_i ȳ._j                                     (10.1.12)

The n_h contrasts for the between-group effect in S_H form an n_h × p submatrix of Θ̂, that is, Θ̂_h. Values of the same contrasts for the discriminant variables z_i (i = 1, 2, ..., s) are given by the n_h × s matrix

  Θ̂_z = Θ̂_h Â                                             (10.1.13)

Contrasts are all expressed in within-group standard deviation units, because of the restriction of Eq. 10.1.4. That is, each mean difference is the number of within-group standard deviations separating the means on the discriminant variable. The n_h discriminant contrasts may be represented graphically to illustrate the separation of groups. The proportion of between-group variation attributable to each discriminant function is estimated by

  P_i = λ̂_i / Σ_{j=1}^{s} λ̂_j                             (10.1.14)
The measure is converted to a percentage by multiplying by 100. Multiple proportions may be inspected to determine if all between-group variability is concentrated in one or several discriminant dimensions. Several methods are available for testing the nullity of true between-group variation on one or more of the discriminant functions. The overall hypothesis is the same as H_0: Θ_h = 0 of Eq. 9.2.1. In terms of the discriminant measures, H_0 is

  H_0: Θ_z = 0                                            (10.1.15)
Θ_z can be null if and only if Θ_h is null, since the a_i are non-null solutions of Eq. 10.1.8. An equivalent form of hypothesis 10.1.15 is H_0: λ_i = 0 (i = 1, 2, ..., s), or that there is no between-group variation on the discriminant measures. Roy's largest-root criterion for testing Eq. 10.1.15 is sensitive to departure from H_0 in a single dimension. The test statistic is

  φ = λ̂_1 / (λ̂_1 + 1)                                     (10.1.16)

φ is referred to the tables in Heck (1960) and Pillai (1960, 1964, 1965, 1967). The arguments for φ are s and

  m = (|p - n_h| - 1)/2                                   (10.1.17)

  n = (n_e - p - 1)/2                                     (10.1.18)
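Equations 10.1.8 and 10.1.9 define a generalized symmetric eigenproblem that modern library routines solve directly. A sketch under our naming, for any hypothesis matrix SH, error matrix SE, error degrees of freedom ne, and s = min(n_h, p):

```python
import numpy as np
from scipy.linalg import eigh

def discriminant(SH, SE, ne, s):
    """Roots and vectors of (SH - lambda (1/ne) SE) a = 0, normalized
    so that (1/ne) a' SE a = 1 (Eq. 10.1.4)."""
    lam, A = eigh(SH, SE / ne)         # generalized symmetric eigenproblem
    order = np.argsort(lam)[::-1]      # canonical variances, largest first
    lam, A = lam[order], A[:, order]
    return lam[:s], A[:, :s]

# Roy's largest-root statistic for the leading function:
#   phi = lam[0] / (1 + lam[0]), referred to the Heck/Pillai tables
```

Applied to the matrices of the dental study below, this should reproduce the roots and (up to sign) the weight vectors reported there.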
Roy's criterion may be applied to smaller roots λ̂_2, λ̂_3, ..., λ̂_s, reducing n_e and n_h by one for each prior (larger) function removed. Hotelling's trace criterion provides a test of H_0 for all dimensions simultaneously. The test statistic is

  τ = Σ_{i=1}^{s} λ̂_i                                     (10.1.19)
Hotelling's criterion is referred to the Heck (1960) and Pillai (1960, 1964, 1965, 1967) tables, with arguments identical to those for φ (that is, s, m, and n). Tests of H_0 may be made for all discriminant functions simultaneously, all minus the largest, all minus the largest two, and so on to just the smallest dimension. Through successive tests, it may be found that a subset of discriminant functions depicts all population mean differences, and remaining smaller functions reflect only random variation. To test H_0 for all functions jointly, we may employ the likelihood ratio statistic, as in the analysis-of-variance model. The likelihood ratio is

  Λ_1 = Π_{i=1}^{s} 1/(1 + λ̂_i)                           (10.1.20a)

Λ_1 may be transformed to either the χ² or F statistic. The test statistic for all
discriminant functions is identical to the multivariate test of equality of mean vectors, in Chapter 9. A test of H_0, removing λ̂_1, is obtained through likelihood ratio procedures, with ratio

  Λ_2 = Π_{i=2}^{s} 1/(1 + λ̂_i)                           (10.1.20b)

For computing degrees of freedom for the χ² statistic for Λ_2, p-1 and n_h-1 substitute for p and n_h, respectively. In general, the test of H_0 for roots j through s has the likelihood ratio

  Λ_j = Π_{i=j}^{s} 1/(1 + λ̂_i)                           (10.1.21)

with

  χ² = -[n_e + 1 - (p + n_h + 1)/2] log_e Λ_j             (10.1.22)

χ² is referred to tables of the χ² distribution with (p-j+1)(n_h-j+1) degrees of freedom. If the test statistic drops to below the critical value after removing the largest j-1 functions, it may be concluded that between-group variation resides in only the first j-1 z-dimensions. If the first discriminant function is found to contain all significant between-group variation, λ̂_1 may be used to construct p-variate confidence bounds on Θ_h. Let H_h be the n_h × p submatrix of sample standard errors of Θ̂_h, from H in Eq. 8.2.19. From the Heck or Pillai tables, we can obtain φ_α, the 100α percent critical value of φ. Then the 1-α confidence bounds on Θ_h are

  Θ̂_h - √λ_α H_h ≤ Θ_h ≤ Θ̂_h + √λ_α H_h                   (10.1.23)

where λ_α = φ_α/(1-φ_α). Hummel and Sligo (1971) suggest that this approach yields intervals for particular effects that are too conservative for most behavioral science purposes. Thus, specialization to a single contrast and/or single criterion measure, as in Eqs. 8.2.20 and 8.2.21, is recommended.
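The sequence of tests in Eqs. 10.1.21 and 10.1.22 is a running product over the remaining roots. A sketch continuing the function above; applying the stated substitution of p-j+1 and n_h-j+1 to the multiplier as well as to the degrees of freedom is our reading:

```python
import numpy as np

def root_tests(lam, p, nh, ne):
    """Chi-square tests of roots j, ..., s for j = 1, 2, ..., s."""
    tests = []
    for j in range(1, len(lam) + 1):
        pj, nhj = p - j + 1, nh - j + 1           # adjusted for removed roots
        Lj = np.prod(1.0 / (1.0 + lam[j - 1:]))   # Eq. 10.1.21
        chi2 = -(ne + 1 - (pj + nhj + 1) / 2.0) * np.log(Lj)   # Eq. 10.1.22
        tests.append((Lj, chi2, pj * nhj))        # statistic and its d.f.
    return tests
```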
Sample Problem 3- Dental Calculus Reduction The toothpaste data of Sample Problem 3 comprise a 2 x 5 factorial arrangement, with six measures for each observation. The effects in the model are given in Section 9.3. These are . 1. 2. 3. 4.
8, differences between active agents and the control groups (3d. f.) 8*, differences between the two control groups (1 d. f.) A, differences between the two years of experimentation (1 d.f.) A8, interaction of years and treatments (3-2 = 1 d. f.)
The hypothesis cross-products matrices are Snt through Sn4 , on pages 336-337, and error matrix SE, on page 337. Assuming no interaction, let us examine differences between the two years of experimentation through a discriminant approach. The hypothesis matrix is Sn3 , with nh3 = 1 degree of freedom. The number of discriminant functions for the hypothesis iss= min (1, 6) = 1.
364
Method
Fotthis hypothesis, Eq.10.1.8 becomes
(sH 3 - 1 ~ 0 sE)a=o Solving the determinantal equation (10.1.9) for).. and then substituting to find
a, we have A=.0242 and
a'= [-.33
.48
-.59
.53
.04
-.19]
Before interpreting the function, we may wish to test H0: @A= 0', or H0 : 0. Since there is only a single discriminant function, A accounts for 100 percent of variation between year 1 and year 2 results. The need to test multiple discriminant functions is obviated. The likelihood ratio is @A a=
1 A= 1 +.0242
·98
This is identical to A for the analysis-of-variance test. The A is
x
2
transformation of
x 2 =-(1o0+1- 1 +~+ 1 )1oge.98 =2.32
x2 has 6(1) =6 degrees of freedom. H0 is not rejected. The discriminant function appears to reflect only random variation among group means, and is not worthy of interpretation. Having accepted H0 : @A= 0', we may move on to test Ho: 8 8 *= 0', with no reordering of effects. Solving Eq. 10.1.8 for the single root and vector, with SH 2 and SE, we find
~=.0607
a'= [-.49
.39
-.o2
.25 -.54
-.30]
The likelihood ratio is
1 A= 1 +.0607 = ·94 The X2 transformation is
x2 =- ( 100+1- 1 +6+1) 2 loge .94 =5.72 Again, H0 is accepted. There is no significant variation between vector means for the two control groups. Finding both @A and 8 8 ., to be null, we may proceed to test H0 : 8 8 = 0, with the same order of effects. Should either 8,1 or 8B* have been nonzero, the significant effect(s) would need to be reordered to precede the 8 effects, and the
Analysis of Variance: Additional Topics
365
design basis reorthogonalized before testing 8s (due to the lack of orthogonality). SH1 is the three-degree-of-freedom sum of products for comparing the three experimental agents with the two control groups. The number of discriminant functions is s =min (3, 6) = 3. Substituting SH1 and SE in Eq. 10.1.8, with nh, = 3, the three characteristic roots are \ 1 = .2273, \ 2 = .0920, and A3 = .0294. These are the between-group variances of the three discriminant variables. The corresponding raw weight vectors are .20 .74 .09 -.31 A= .66 .51 -.37 -.53 .17 .10 -.11 -.57
-.56 -.27 .33 -.21 -.24 .91
Right canine Right lateral incisor Right central incisor Left central incisor Left lateral incisor Left canine
Before attempting to interpret the discriminant functions, let us test H 0 : Bs = 0 or, equivalently, H0 : A.; =0 (i= 1, 2, 3). The sum of the Ai is tr(A) = .3487. The percent of between-group variation attributable to each of the functions is 100 (· 2273 ) P1
65.18
Firstfunction
P - 100(.0 920) 2 .3487
26.38
Second function
P - 100 (· 0294 ) 3.3487
8.44
-
.3487
Third function
It appears that most between-group variance is attributable to a single linear combination of the tooth measwres. A second function may also account for some of the variation of the experimental agents from the control group. We shall apply formal test crit~;Jria to determine if any of this variation is significant. Roy's largest-root statistic may be used to test H 0 : A.1 = 0. We find
~
.2273 1+.2273 =· 1852
with m = (16-31-1)/2 =1 and
n = (1 00-6-1 )/2 =46.5
366
Method
Comparison to the Pillai tables, with s=3, reveals that cp just exceeds the .05 critical value of .18 (obtained by interpolation). For H0 : A.2 = 0, nh and n. are each reduced by one. The test statistic is
.0920
cp 1 +.0920 = •0843 with
m = (16-21-1}/2 =1.5 and
n = (99-6-1)/2 =46
cp does not exceed the .05 critical value and H0 is accepted. All significant variation between the experimental and control groups can be attributed to a single linear function of the six variates. Although it is no longer necessary to test H0 : A. 3 =0, we would find that the hypothesis is not rejected. Here c/J=.0286, with m=2 and n=45.5. · If we chose to test H 0 : 8 8 = 0 by means of Hotelling's trace criterion, the test statistic is tr (A}= .3487 The .05 critical value for s= 3, m = 1, and n = 46.5, is .32. H0 is again rejected at a= .05 (but not at a= .01 ). Finally let us test H0 through successive likelihood ratio tests. For all three roots, the ratio is
A1 = ( 1 +.;273)( 1
+.~920)(1 +.~294)
=.72 This is exactly the likelihood ratio obtained in Chapter 9 for the test of equality of mean vectors. The x2 transformation is )( 2
=- ( 100+3- 3+6+1) 2 loge .72 =-98(-.322) =31.54
x 2 has 6(3) = 18 degrees of freedom. The tabled .05 critical value of x 182 is 28.87. This value is exceeded, and H 0 is rejected. We can also obtain a likelihood ratio for the two smaller roots alone. That is, A2 =
c+.~920) c+.~294)
=.89 The test statistic is
x= 2
=
-98 (log •. 89) 11.47
Analysis of Variance: Additional Topics
367
x2 from A2 has (6-1) (3-1) = 10 degrees of freedom. x10 •. 052 = 18.31 is not exceeded. When the largest canonical variate is excluded, no significant variation between experimental and control groups remains. All significant betweengroup variation is concentrated in the first discriminant dimension. The third function alone has 1
Al = 1+.0294 =.97 The test statistic is
x
2
= -98(1oge .97) =2.84
x2 from A 3 has (6-2) (3-2) = 4 degrees of freedom. Since A 2 did not exceed the critical value, A3 will also show no significant effect. In this case, all three test criteria have yielded the same conclusion. Let us examine the first discriminant function more closely. The discriminant weights are the first column of A. That is, .20 .09 .66 -.37 .17 -.11 Since the variances of the six teeth measures are disparate, we may wish to standardize & 1 for inspection. The diagonal matrix of within-group standard deviations is given in previous chapters.
DE 112 = diag (1.17,
1.62,
2.06,
2.49,
1. 76,
1.26)
The standardized weights are al
=
oEl/2&1
.23 .15 1.36 -.91 .30 -.14
Right canine Right lateral incisor Right central incisor Left central incisor Left lateral incisor Left canine
Although the weights are interdependent, the function is largely influenced by calculus formation on the two central teeth. These teeth have the highest mean calculus formation of the six. The difference between them appears to be the strongest discriminator of the control groups from those utilizing the experimental additives. The differences favor (lower means) the right side in the control groups and the left side in the experimental groups. To evaluate the effectiveness of the experimental agents, let us transform
368
Method
the three contrasts to the metric of the first discriminant function. The estimate is given in Section 8.4. The contrasts of the three experimental groups with the controls (groups 1 and 2), are
0
SB=
-.93 -1.07 -1.43 -.92 -.30r~;-1/2(P~+p;) [-.31 -.93 -.98 -1.91 -1.29 -.91 p~-112(P~+p;) .18 -.59 -1.31 -2.21 -1.80 -1.24 -.72 p~-1/2(p;+p;)
1;..
1;..
~·
"'~~· "'/.15>,... "' .,<"!: '?15>
'?(\.
'Is>
o,. . .
(}.·
.s>o,...
~
~·
'?15>
'"~ "'·
'"~ "'·
'"~ "'·
<15> ~
~ ~
.,<"!: ~
~
~~.
~
(Ill>
(Ill>
~~
<15> ~
<15> ~
i9~
'%
'%
(}.·
ll'o......
Agent 1 Agent2 Agent3
(}.·
.s>o
......
For the first discriminant variable the differences are
=[
-.451]Agent 1 -.112 Agent2 -1.161 Agent3
The mean of the first experimental group is about half a standard deviation below the controls, on the discriminant variable; the second experimental group is one tenth of a within-group standard deviation below the controls; the third experimental group is more than one standard deviation below the mean of the controls. The thir.d active agent appears to be the most effective of the three in reducing dental calculus. This effect may be detected, but with less clarity, by inspecting the subclass means or mean differences for the six separate teeth. The contrasts may be graphed to depict the effect. The control-group mean forms the origin of the graph. Graphing procedures are likely to be more useful, however, when more than a single discriminant variable is significant. The results can be plotted with two or three discriminant variables as orthogonal axes. In this example, the discriminant analysis effectively isolates and simplifies the effect of major interest in the study.
10.2
ANALYSIS OF COVARIANCE
The Models Analysis of covariance was introduced by Fisher (1932) as a means for reducing error variation and increasing the sensitivity of an experimental analysis to mean differences. The method depends upon identifying one or more measured independent variables, which are related to the response measures but not to the experimental treatment conditions. In a multiple regression analysis, these variables are termed predictor variables. When the same variables are .used for additional control in an experiment, they are termed concomitant vari-
Analysis of Variance: Additional Topics
369
ables or covariates. The corresponding statistical model is the analysis-ofcovariance model. By including concomitant measures in the model, residual variation can be reduced to the extent that it is attributable to the covariatesthat is, to the extent that the criteria and covariates are linearly intercorrelated. The analysis-of-covariance model may also be viewed from a different perspective. When the independent variables are defined by membership in experimental or naturally occurring groups, we compare means through the analysisof-variance model. When the independent variables are measured predictor variables, we test their association to the criteria through the multiple regression model. Both of these cases are different applications of the same general linear model- identical in computation and in statistical tests, but for differently scaled independent variables. The analysis-of-covariance model is the same general linear model. It is appropriate when we have both measured and group-membership independent variables, and one or more criteria. We may test hypotheses about group differences on the criteria, holding constant or eliminating the covariates. These tests alone are usually referred to as analysis-of-covariance tests. We may also test the regression of the criteria upon the covariates, removing group mean differences. These tests are a simple modification of the regression tests of Chapter 5, and are also conducted under the analysis-of-covariance model. A third statistical test is also of interest-the test of homogeneity of regression. Before adjusting error variation for covariates, it is necessary to determine whether a single adjustment will suffice or whether the covariate adjustment must be different for each group of observations. This is equivalent to testing that the regression weights are the same for all groups in the experimental design. The homogeneity test may also be of experimental interest in itself-for example; in the comparison of teaching approaches. A researcher may wish to test whether the dependence of achievement on verbal capacity is different (lower) when nonverbal media are emphasized in teaching. Verbal and nonverbal approaches are tested with independent groups of subjects. The regression weights of final achievement upon prior verbal 10 scores may be compared across groups, even with multiple measures of each trait. The inclusion of covariance controls in an experimental design is an efficient procedure. As in regression analysis, each measured independent variable requires only a single degree of freedom. However, no reasonable number of covariates can equate experimental groups in the sense that randomization can. An unjudicious choice of covariates can introduce sufficient error into the estimation of between-group effects to render them uninterpretable. The careful selection of a small number of covariates is best accomplished through reference to an explicit psychological model of the behavior under investigation. As an example, assume that we have a two-way fixed analysis-of-variance model, with p dependent measures taken from each subject. The analysis-ofvariance model for observation i in subclass jk is (10.2.1)
All terms are 1 xp vectors, with (1 0.2.1 a)
370
Method
Assume further that we postulate that q additional variables are related to the criteria. Scores on these covariates are collected prior to the introduction of the experimental conditions. Let the q scores for observation i in subclass jk comprise the 1 x q vector, (1 0.2.2) Adding these independent variables to Eq. 10.2.1, we have the covariance model, (10.2.3)
B is the qxp matrix of partial regression coefficients of yon x. x ... is the vector mean for the concomitant variables for all N = 'LN;k observations, (1 0.2.4)
et;" is the p-element vector of reduced or adjusted random errors. Error vectors are assumed identically and independently distributed in p-variate normal fashion, with expectation 0 and covariance matrix I*. That is, (1 0.2.5)
The analysis-of-covariance model may also be written for subclass means. Let N;k be the number of observations in cell jk of the sampling design. Then subclass mean vectors are
1 Y·;k=-N L·1 yiji,· jk
(1 0.2.6a)
and (10.2.6b) The mean model is (10.2.7) with
Error vectors are independently normally distributed with expectation 0 and covariance matrix (1 I N;")l:*. That is, (10.2.8) Mean models for all J cells of the sampling design may be juxtaposed and written as the sum of matrix products. For a 2x2 design, the total number of
Analysis of Variance: Additional Topics
groups is J
=
371
4. The models are
[~~::] Y~21 y:22
-
[~ ~ ~ ~ ~ ~ ~ ~ ~~ :~ 1 0 1 1 0 0 0 1 0 1 0 1 0 1 0 0 0 1
a~
p; p~
(1 0.2.9)
or
Y. = A€>*+(X.-1x: .. )B+E:
(1 0.2.9a)
Y. is the J x p matrix with rows y:;k. X. is the J x q matrix of subclass means for the covariates, having rows x:;k· 1 is J-element unit vector, so the matrix [1x: .. ] has identical rows [x: .. ]. A is the Jxm analysis-of-variance model matrix, and~· the mxp matrix of fixed parameters, as in Chapter 7. B is the qxp matrix of regression coefficients of yon x. Unlike B for the regression model, it does not contain a constant term; they scaiing factor is absorbed by the first column of model matrix A. The x scaling factor is zero for mean-deviation scores. E: is Jxp and contains the mean residuals for all J groups. Rows of E: have independent p-variate normal distributions as in Eq. 10.2.8. Together their distribution is
a
(1 0.2.1 0) Dis the diagonal matrix of subclass frequencies (Nn. N12. N21. N22l· 0 is a Jxp null matrix. The steps in covariance analysis begin with the reparameterization from 8-' to contrasts 8, as in the analysis-of-variance model. Estimates of the terms in 8 are obtained, eliminating the covarlates in X. The estimate ofB is found, eliminating group effects in A. 8 and B may be estimated simultaneously in a single least-squares solution, or separately. The latter approach is taken here; the estimation of B is presented first. Then 8 is estimated for they measures and adjusted for the effects of the covariates. Finally I* is estimated, eliminating both regression and group-membership effects. · The test for parallelism of regression planes (homogeneity of regression) indicates whether a single estimate of B is satisfactory for all groups, or whether the regressions differ from one group of subjects to another. Tests of significance are made on sections or all pf B, to decide whether there is nonzero regression of yon x. Finally tests may be conducted on sections or all of covariateadjusted 8. These provide tests of means and mean differences, after the effects of the covariates are eliminated. The tests of significance all follow directly from regression and analysis-of-variance tests, as described in the preceding chapters. We may begin by reparameterizing A8* to K8 in the two-way example. A 2x2 fixed model is described in detail and reparameterized to contrasts in
372
Method
Chapters 7 and 8. The rank of A in Eq. 10.2.9 is I= 4, while the column order is m = 9. The four effects in the reparameterized model are the constant term, the contrast between a/s, the contrast between fl"'s, and an interaction contrast. The rows of the contrast matrix L must be linear combinations of rows of A (a',.). Thus Lis 1 L= [ 0 0 0
.5 1 0 0
.5 -1 0 0
.5 .5 0 0 1 -1 0 0
.25 .5 .5 1
.25 .5 -.5 -1
.25 -.5 .5 -1
.25J -.5 -.5 1
1/4 2:,. a',. 1/2(a' 1+a' ,)-1 /2(a' 3 +a',) 1/2(a',+a'a)-1/2(a' 2 +a'.) (a'1+a'.)-(a'z+a'3) (10.2.11)
l
The substitute parameter matrix is
0=L0*
t-t'+a:+p:+y:. _ [ (a 1-a 2 )' +(y 1.-y2 .)' - (/31-/3 2)' +(y. 1-y. 2 )' (y"-yd'-(y 21 -yd'
Constant A main effect B main effect AB interaction
(1 0.2.12)
The basis, or reparameterized model matrix, is the Jx/ matrix,
K=AL'(LL')- 1
-
[
.5 1 1 .5 1 -.5
.5 -.5 .5
1 -.5
-.5
Constant
A
B
.25J -.25 -.25 .25
(1 0.2.13)
AB
The product K0 substitutes for A0* in Eq. 10.2.9a. Assuming X to be of full rank q, the entire model is now of full rank l+q; least-squares estimates of 0 and B may be derived. The rank condition requires that there be as many subjects as degrees of freedom in the complete model, N ~ l+q. Additional observations are necessary to estimate residual variation and to conduct tests of significance on 0 and B. For X to be of full rank, no covariate can be a direct linear combination of other concomitant measures.
Estimating 0 and B The reparameterized form of the model of Eq. 10.2.9a is
Y. = K0+(X.-1x: .. )B+ E:
(10.2.14)
0 and B may be estimated so that the sum of squared weighted residuals, tr ([E:] 'DE:), will be minimal in the sample. Two sets of normal equations are obtained, each containing both 0 and B. The two may be solved simultaneously to obtain the respective least-squares estimates. An alternative approach to estimation follows from regression analysis in Chapter 4. Let vij be the (p+q)-element vector for observation i in subclass
Analysis of Variance: Additional Topics
373
j, having all p y-measures and all q x-measures. That is,
(10.2.15)
We use only a single group index (j) although the groups may be cells in a two-way or higher-order design. Vis the Nx (p+q) total data matrix for all observations, with rows vi;. Vis formed by augmenting the Nxp outcome matrix V, by the Nxq matrix of covariate values X. That is,
V= [V, X]
(1 0.2.16)
The matrix of subclass means for all groups on p+q measures is
V.= [V., X.]
(10.2.17)
V. has J rows and p+q columns. As a first step, between-group effects are estimated for both the criterion measures and the covariates. Assume that 0 is of order /x(p+q), and is represented as 0v· The matrix may be partitioned for y and x variables. That is, (1 0.2.18)
0v (/Xp} are the contrasts among groups on the criterion variables, and will ultimately be "adjusted" for the covariates. 0x (/Xq} are mean differences for the concomitant measures. We may estimate 0v for all measures, assuming analysis-of-variance model V.=K0v+E.
(10.2.19)
The least-squares estimate is the usual one, according to Eq. 8.1.2. That is,
Sv = (K'DK}- 1 K'DV. ={K'DK}- 1 K'D[V., X.]
= [eu.
ex]
(10.2.20)
Since columns of 0v are estimated separately from one another, inclusion of variables in the matrix does not effect Sy. From Sv we may obtain the error sums of squares and products for all variables, eliminating group-mean effects. This is the matrix SE of the analysisof-variance model, but including the two sets of variates. The residual sum of products is (p+q)-square and symmetric.
X
SE = (V'V-S~K'DKSv)
{1 0.2.21)
SE has ne= N-1 degrees of freedom. If the model includes all possible between-group effects (/=J), K is square and SE is the within-groups sum of products for all p+q measures. In this case,
sE =
(V'V- v:ov.)
(10.2.22) =Sw The degrees of freedom are ne = N-J. This is the form in all common factorial analysis-of-variance designs with replications (see Eqs. 8.2.7-8.2.1 0).
374
Method
In either case, since V and V. have y and x components, SE may be partitioned like the residual matrix (SIC) in the regression model. Let SE be partitioned into SE(yyJ for the criterion measures, SE(xxJ for the covariates, and SE(y.rJ = [SE(:ryl]' for the cross products of the two. That is,
SE=[~~:~~-~~:~lprows
(1 0.2.23)
SE(xoJ : SE(xxJ q rows
p columns
q columns
The estimate of B may be obtained directly from the partitions of SE, as in the regression model. Unlike the regression model, B is estimated after eliminating group-mean differences (in Eq. 10.2.21 ). The estimate is (1 0.2.24)
The expectation of :B is B, and the covariance matrix is [SE(xxJ]- 1 ®I*. Once the estimate of I* is found, standardized regression coefficients, confidence intervals, and test statistics may be computed for B exactly as in the multiple regression model. B is used to adjust the mean differences in Ely for covariates. The estimate of 0 in Eq. 10.2.14, eliminating the covariates, is the /Xp matrix, (1 0.2.25)
The resulting estimates will be very different from. the unadjusted 0y if theregression is strong (B elements large). 0 and 0u are generally similar if there is little or no regression of yon x. 0 is an unbiased estimate of 0 in Eq. 10.2.14.
Estimating Dispersions To draw intervals and conduct tests on the adjusted mean differences, we require the variance-covariance matrix of 0. Two properties of the covariance model are necessary: (1) X and thus 0", are fixed constants and (2) and 0,..8 are linearly independent. Property (1) is true by definition; property (2) is easily demonstrated. Then
ey
]'/"(0) = f/"(0y-0"JJ) = 'V(0y) +7'/"(f>xB)
= (K'DK)- 1 ®I*+ Elx'V(B)0~ = [(K'DK}- 1 +0.r(SE(x.rl)- 1 0~] ®I*
=G®I*
(1 0.2.26)
The standard errors of the [O;,J are the square roots of the diagonal elements of Eq. 10.2.26. The standard error of adjusted contrast i for variate y"' is (1 0.2.27)
To estimate ui!ik and to draw intervals on 8;~<, we require an estimate of I*. The residual variance-covariance matrix is estimated after eliminating both group membership and regression effects. SE is the sum-of-products matrix removing the group-mean differences. The effects of the predictors may be
·Analysis of Variance: Additional Topics
375
removed as in regression analysis. Assume SE to be partitioned as in Eq. 10.2.23. The residual sum of products, given both sets of independent variables, is
s e= SE(yy)_sE<~x>:B = SE-sE[sE<xx)]-1SE<xv)
(10.2.28)
Se has degrees of freedom n;=ne-q =N-1-q
(10.2.29)
Thus the estimate of!* is
(1 0.2.30) The diagonal elements of V~ are the reduced error variances or reduced error mean squares; their square roots are the reduced variable standard deviations, The estimates may be used in drawing intervals on the elements of B, and substituted in Eqs. 10.2.26 and 10.2.27 to obtain &01k; (Oik-(}ik}l& 01k follows at distribution, with n; degrees of freedom.
Prediction Frequently social science research poses the conditional question, "What would the group means have been if all subjects had scored alike on the covariates?" The covariate-adjusted means for all groups are a function of the estimate Sv. The predicted means for all groups on both the criterion and covariate meaures are found by substitution in Eq. 10.2.19. The predicted matrix is V. of order J x (p + q). That is,
V.=KSv =K[S11 ,
0x]
= [Y., iJ
(1 0:2.31)
V. comprises two sets of predicted means, for they and x variables, respectively. Y. is Jxp, and i is Jxq. We note that if all between-group degrees of freedom are included in the model (/= J), these are equivalent to the observed means V .. The submatrix Y. may be adjusted for covariates in the following manner. Let x!; be the vector of covariate means for subclass j. x!; is one row of X.. The vector of grand means, or the mean of rows of X., is •'
X .•
A' =J1 2:;X.;
=}1'X. where 1' is a 1XJ unit vector.
(10.2.32)
376
Method
The Jxp matrix of adjusted means for all groups is
v: = v.- o<. -1x:.rs
(10.2.33)
Rows of v: may be averaged, without differential weighting, to provide adjusted row, column, and interaction means for combined subgroups of observations. If I= J and K is square, Y. is equal to the observed means Y .. Then v: is the matrix of covariate-adjusted observed means. These are commonly referred to as adjusted treatment means. It may be useful to obtain predicted means under the covariance model 10.2.14. It is especially informative to compare predicted means with and without covariates, to examine the improvement in prediction when they are added to the model. The adjusted predicted means are
v:
=
KS+(X.-1x:.)B
=Y:+(X.-1x:.)B
X. contains the observed covariate values and of X.. That is, 1
(10.2.34)
x:. is the simple average of rows
x.. -J 1'X . I-
(10.2.35)
The mean residuals are the differences between observed criterion means
Y. and those predicted under the covariance model. That is,
:E:= v.-v:
(1 0.2.36)
These may be inspected for outliers and for trends in the data not reflected in the model. The comparison of :E: and the difference Y.- Y. will give an indication of the predictive power gained through covariate adjustment.
Tests of Hypotheses Parallelism: The primary statistical tests conducted under Eq. 10.2.14 are the tests of whether there is nonzero regression of the criteria upon the covariates and whether there are significant group mean differences after making the covariate adjustment. Prior to conducting these tests, it is necessary to decide whether a common regression matrix is appropriate to all groups of observations or whether the matrices differ significantly from one group to another. Let B1 be the qxp matrix of regression coefficients for group j in the sampling design. The total number of groups in the complete design is J. The null hypothesis for the test of homogeneity, or parallelism of regression planes is
(10.2.37) If H0 is rejected, it may be necessary to make separate and distinct covariate adjustments for each group of subjects. The discussion on preceding pages assumes that the matrices are homogeneous, so that only a single estimate is necessary. To test for regression parallelism, the covariate adjustment is made to the sum of products for each group of observations. These results are pooled and then compared with the adjusted common within-group sum of products. Com-
Analysis of Variance: Additional Topics
377
parison of the two results reveal the extent to which the J separate regression estimates reduce criterion variation more than a single common estimate. To make the covariance adjustment for a single group of observations, let vu be the vector observation for subject i in subclass j. Vu has the p criterion scores plus q predictor variable values for the individual, (10.2.38) The vector mean of all observations in subclass j is (10.2.39) where N1 is the number of observations in the group. The sum of products of mean deviations for one group is
S.,"J.. ="'. (Vi;-V-;)(Vii-V.;)' ~1 . . . (10.2.40)
Swj may be partitioned into sections, as in Eq. 10.2.23. Sw/yy) is the criterion sum of products, Sw/.rxJ the covariate sum of products, and SwJ
= trj
[-S~"l:u~l~~{lJ.r.~]p rows I
(1 0.2.41)
Swj(xy) I Sw.ixx) q rows
p columns q columns
To make the covariate adjustment for one group, we follow Eq. 10.2.28. The pxp adjusted matrix for one group is s~.j
= S,,/yy)- Sw/yxl[Sw/XX)]-1S,/·XY)
= S w1(yy)- S1uj cuxrjJ.
(10.2.42)
J
s;:; has N;- 1 - q degrees of freedom. S~; is computed for all J groups and the results are summed to provide the error matrix for the parallelism test. That is,
(1 0.2.43)
S has degrees of freedom
n*= ~J (N;-1-q) =N-J-qJ
(10.2.44)
The adjusted common within-group sum of products is the hypothesisplus-error matrix for the test. Let
=V'v-v:ov.
(1 0.2.45)
378
Method
Then the adjusted matrix isS~, as defined by Eq. 10.2.28, with N-J-q degrees of freedom. (1 0.2.46) The difference of S~ and S represents the additional reduction attributable to making separate covariate adjustments. The hypothesis matrix for the parallelism test is (10.2.46a) Su =Si-S The degrees of freedom for Su are
nh = (N-J-q)-(N-J,qJ) (10.2.47)
=q(J-1)
All of the univariate and multivariate test criteria may be applied. The p univariate F ratios are (1 0.2.48)
F1, may be referred to the F distribution, with n" and n* degrees of freedom. Stepdown test statistics may be computed, with Su and S as the hypothesis and error matrices, respectively. The likelihood ratio may be used to provide a simultaneous parallelism test for the p criterion measures. The ratio is
lSI
A=IS~I
(1 0.2.49)
A can be transformed either to the x2 or F approximation, with n" degrees of freedom for hypothesis. The F transform is (10.2.50) The multipliers are
m = [n*-(p+1-n")/2]
(1 0.2.51)
and (10.2.52)
n"p and ms+1nhp/2 degrees of freedom. H0 is rejected if the critical F value is exceeded; a
F is referred to percentage points of the F distribution, with
single common covariate adjustment will not suffice. If H0 is maintained, a common B is assumed for the remainder of the analysis-of-variance testing. The set of regression weights is not significantly different from one group to another, and the regression planes are parallel. Covariance tests: The first set of covariance tests is for determining the extent to which the criterion variables depend upon the values of the covariates. These are the tests of regression of they on the x measures. The estimate of the
Analysis of Variance: Additional Topics
379
regression coefficient matrix depends upon the (p+q)-square matrix SE, and is given by Eq. 10.2.24. The error sum of products isS~, given by Eq. 10.2.28, with n; = N-1-q degrees of freedom. The sum of products for regression for all covariates is SR=SE:B = SE
n;
Then Eq.10.2.19 becomes V.= K*T'8v+E.
(10.2.56)
The least-squares estimates of orthogonal effects T'8v are Uv=T'flv = ( [K*] 'DK*)-l [K*] I DV.
= [K*]'DV.
(10.2.57) (10.2.57a)
Uv is the /X(p+q) matrix of semi partial regression coefficients. Uv has a single row for each effect in the analysis-of-variance model, eliminating all preceding effects. Uv may be determined either from Eq. 10.2.57 or 10.2:57a, since T' is also the upper triangular Cholesky factor of (K'DK); that is, (K'DK) =T[K*]'DK*T' =TT'
(10.2.58)
380
Method
In completely orthogonal designs (equal cell frequencies, orthogonal contrast parameters), K'DK and Twill be diagonal. The resulting orthogonal estimates are then a simple rescaling of rows of In any case Uv estimates the same effects as except that the basis has been columnwise orthonormalized before estimation. Thus each estimate obtained is the effect that is not confounded with preceding terms in the model. Let us represent the ith row of U as u'i· For the 2X2 design, the matrix has four rows:
e,.
ev,
U "
ll u' 1
Constant
u' 3 u' 4
8, eliminating constant and A A8, eliminating constant, A, and 8
= u'" A, eliminating constant
(1 0.2.59)
Each row has p+q elements. In designs with more than two levels of any factor, U will have more than a single row for the corresponding main effect and interactions. From each row of U,, a between-group sum-of-products matrix of order (p+q)x(p+q) is constructed. For the constant term, the matrix is Su 1 = U 1 U' 1 ; for A, eliminating the constant, S//2 = u2 u' 2 ; for 8, eliminating A and the constant, Su, = u3 u' 3 ; and for interaction, eliminating all else, S 84 = u4 u' 4 . Consecutive matrices u;u'i may be pooled to provide multiple-degree-of-freedom tests as necessary (see the example that follows). Each row of U represents a single degree of freedom. Each matrix S 8 is adjusted for covariates in the following manner. First, add the (p+q)-squa~e unadjusted error matrix SE (Eq. 10.2.21) to the hypothesis matrix SHr The result is the sum (1 0.2.60) This "total" matrix for the hypothesis may be partitioned like SE in Eq. 10.2.23. That is,
ST
= .1
sT.(yy) I ST(yx)] prows [ - - -I - - ,j - - -.I- - Sr(xu) [ .I
p columns
Sr(xx)
(1 0.2.61)
q rows
.I
q columns
Second, adjust Srj for covariates. Represent the matrix of adjusted sums of products for the criteria as s;j. Then (1 0.2.62) Having an adjusted hypothesis-plus-error matrix, and the adjusted error . matrix S~, the adjusted hypothesis matrix S;1 . is found by subtraction. That is, j
(10.2.63) S71 . has the same degrees of freedom as Su -that is, the number of rows of U from which it was formed (nhj). In the 2X2 example all S 8j have a single degree of freedom (all nh; = 1). The sums of products for the 2 x 2 crossed analysis-ofcovariance design are given in Table 10.2.1. j
j
Analysis of Variance: Additional Topics
Table 10.2.1
381
Sums of Cross Products for a 2 x 2 Analysis-of-covariance Model Degrees of Freedom
Source Constant
n", =
A, eliminating constant and X
n"z
B, eliminating constant, A, and X
n"3 = 1 n"4 = 1
AB, eliminating constant A, B, and X
=
1 1
Sum of Cross Products
s;,, = s,;., - s;; s;,z = s;2 - s~ s;,3 = s;, - s~ s;14= s;.-! -
s~.
-----------------------------------;-----------------------------------------------q
CovariatesX, eliminating design effects
n; = ne-q
Residual
=
N-4-q
All analysis-of-variance test criteria may be employed in analysis of covariance, with hypothesis matrix S7£.i and error matrix S~. The null hypothesis for any one matrix is (1 0.2.64) where 8" is an n,;xp submatrix of 8, containing n,; rows, or degrees of freedom. 8 is estimated under the covariar)ce model in Eq. 10.2.25 and has been "adjusted" for covariate effects. . For example, to test for interaction in the 2x2 design, the null hypothesis is Ho: eAB = 0', where eAB is the last 1 xp row of 8. The hypothesis sum-ofproducts matrix is S7£4 , with n,4 =(a -1 )(b- 1) = 1 degree of freedom. The error matrix isS~, with n; = ne- q degrees of freedom. If interaction is nonsignificant, we may test for the 8 main effect, with H0 : 0s = 0' or H0 : JJ-. 1 = JJ-. 2 . 8B is the third 1 xp row of 0, and the hypothesis matrix is s;13 . If H0 is not rejected, we may test H0 : SA= 0', or H0 : 11-t· = JJ-2 ., with S7£ 2 and S~. If 0B is nonzero (Ho rejected), the second and third columns of K must be interchanged and the basis reorthonormalized to estimate the A contrast, eliminating B. The A ro'll( of U will then be the third row, and the adjusted hypothesis matrix for A must be totally reconstructed by Eqs, 10,2.60-10.2.63 before the test can be conducted. Any of the hypotheses may be tested for all p variates jointly, for each of the p criterion measures, and for each measure holding constant preceding measures in a specified order. To test H0 for all criterion variables, eliminating the effects of the covariate(s), the likelihood ratio criterion is
ISil I "' = IS~+S7£; ,A
(10
6 ) .2. 5
A may be referred to either the chi-square or F distribution. If N is large, the chisquare approximation will have sufficient accuracy. (10.2.66) with
m = [n;-(p+1-n,;)/2]
(1 0,2,67)
x2 is referred to a table of the x2 distribution, with n,;p degrees of freedom,[!" is rejected with confidence 1-a if the 100a upper percentage point is exceeded.
382
Method
A more accurate approximation is the F transform, with
F=
1_
AJis
Ails
.
ms + 1 - n".p/2 ' n,;p
(10.2.68)
and (10.2.69) where m is defined by Eq. 10.2.67. F is compared to critical values of the F distribution, with n,;p and ms+1-n 11;pl2 degrees of freedom. In the case of s indeterminate (n,p=2), settings to unity will provide an appropriate statistic. Discriminant functions may be computed for covariance-adjusted effects, substituting s~j and s; for SH; and SE in Section 10.1. Roy's largest-root criterion, Hotelling's trace criterion, and likelihood ratio tests of successive canonical variates provide additional multivariate test criteria for H0 . All reflect departure from H0 in terms of the maximally discriminating canonical or discriminant variables. The simultaneous test of all discriminant functions is identical to the results of testing A in Eq. 10.2.65. Univariate covariance tests for the dependent variables are obtained from the diagonal elements of S~; and s;. The ratio for criterion measure y" (k = 1, 2, ... ,p) is F =
"
[s~]~c~cln,. '
..
(10.2.70)
J
[sa,";n;
n;
F" may be referred to tables of the F distribution, with n,; and degrees of freedom. Multiple univariate F ratios for one hypothesis are not independent of one another and should not be used to decide whether or not H0 is supported. They do provide useful descriptive data, however, to compare the relative effects of the experimental treatments upon the various outcome measures. Step-down tests for H0 , which depend upon an ordering of they variables, may be conducted exactly as in analysis-of-variance models. Successive conditional standard deviations are the diagonal elements of the Cholesky factors of Si and s;;, as defined by Eq. 10.2.62. Represent the factorizations as (10.2.71a) and (1 0.2.71 b) [teh~c and [t*h" are the diagonal elements of TE and T* respectively. The stepdown statistic for criterion y", eliminating y 1 through Y~c~ 1 , is
F~ = [t*h~c 2 - UeJk~c 2 [tehk2
•
n;- k + 1 n,j
0 7 ) (1 .2. 2
F!; is referred to tables of the F distribution, with n,j and n;- k + 1 degrees of freedom. Unlike all other test criteria, the step-down statistics are not invariant under permutation of the criterion variables. Testing begins with F; and proceeds in a backward direction. If a significant F* ratio is encountered, no earlier statistic may be validly interpreted under this ordering of variables. Hypothesis 10.2.64 is accepted if and only if none of the step-down F* ratios exceeds its
Analysis of Variance: Additional Topics
383
respective critical value. If ·no logical ordering of variates exists, the step-down tests are of little value and H 0 should be tested using the other multivariate criteria.
Sample Problem 2- Word Memory Experiment The model fit to the word me'mory example is a one-way bivariate analysisof-variance. model, with four experimental conditions. The matrix representation is Y. = A8*+E., with
10 01 00 0OJ 0010 0001
0
.= [11-:] a; a2
a~
a'4
Y. and E. are 4x2 matrices. The reparameterization of the model has contrast matrix
.L-[~
.Z5 1 0 0
.25 .25 .25] -1 0 0 1 -1 0 0 1 -1
LO L1 L2 L3
The matrix of contrast parameters is
. 11-:+1~4 ~;a]] LO L1 0 = [ a 1 -a2 a~-a~ L2 a~- a~ L3 The basis matrix for 0 is
[""
.75
.50 .25 .50 1.00 -.25 -.50 .25 1.00 -.25 -.50 -.75 L1 L2 L3 LO
K- 1.00 -.25
-
25]
In addition to the two outcome measures, three recordings were made of the time taken by the subjects in completing the task, on each of three earlier trials. Since individual differences in time may be related to task proficiency, an analysis of covariance, with time measures as the covariates, is performed. The analysis-of-covariance model is given by Eq. 10.2.14. X. is 4x3, with subclass means on the three cor)comitant measures; x:. is 1 x3. The purposes of the analysis are: 1. To estimate the regression weights of learning on time for each of the four groups, and to test whether they are the same. 2. To test whether there is any regression of the two outcome variables on the three time measures, eliminating between-group differences. 3. To test whether there are significant differences among groups' means on the two outcomes, eliminating the effects of time.
384
Method
The complete matrix of means for criteria and covariates (time in seconds) is
V.= [V., X.] 39 83 [ 42:17 - 40.00 36.25
.97 .94 1.00 .97
"v-o/1
O'.s-
155.58 151.75 117.92 100.00
116.00 153.67 106.00 110.42
106.92] Condition 1 130.75 Condition2 95.25 Condition 3 103.92 Condition 4
°<~te'{)Or:"~e f~t;,., "~e f~t;,.,"~e tlf'q, ...., <
'e.s-
. . ,
,e
There are twelve subjects under each experimental condition. The within-group sum of products for one subclass is
Li ViiV!;-N;v.;v:;
Sw; =
v:,; is row j of V.. Sw; is partitioned into sections for criteria, covariates, and the cross products of the two sets. That is,
[ Sw;=
Sw(yy)
I
Sw.
;:,~x~;~~--;~<:: J
3 time measures
,1
2 criteria
3 time measures
The 2x2 matrices for criteria only, for the four groups, are presented in Chapter 8. For the covariance analysis these previous matrices become Sw/uul of the extended sums of products. The complete 5x5 matrices are 547.67 2.94
(Symmetric)
I
.o2
II I 46522.92 I 9513.oo
--------------~-----------------------
-1939.83 -555.01 -180.17
-11.o1 -4.7o -1.63
1
6232.58
9336.oo 4o21.oo
I
505.67 6.62
.13
6128.91
(Symmetric)
1
-------------r-----------------------239.50 8.72 I 43322.25 -334.33 3.27 I 38758.oo -359.5o -3.87 II 17178.25 222.00 0
i
o II
39642.67 16516.oo
1o714.25
(Symmetric)
---------,------------------------180.00 2o5.oo 261.oo
o o o
I 7478.92 I 9399.oo I 6637.25 I
17o3o.oo 12683.oo
1oo18.25
Analysis of Variance: Additional Topics
210.25 2.30
i
.07
lI
385
(Symmetric)
------------,---------------------464.00 -.20 l 4662.00 238.75 3.57 l 2850.00 9168.92 -73.75 -6.03 : 3162.00 4325.42
6176.92
The pooled within-group sum of products is the unadjusted error matrix
Sg = 2:;1 Sw;
I .22 l
1485.58 11.85
-
-
(Symmetric)]
1---------------}-------------------------1416.33 -8.54 I 101986.08 -445.58 2.13 l 60520.00 75177.58 -352.42 -11.54 l 38210.08 37545.42
33038.33
Prior to adjusting means and mean differences for covariates, we shall test the hypothesis Ho: B 1 = B2 = B3 = B 4 = B that the 3X2 matrices of coefficients do not differ from one group to another, or that regression planes are parallel. Let us first make a single common regression adjustment to SE. The common estimate B is
B=
[SE<x.r!] -lsE<xy!
101986.08
(Symmetric)]- 1 [-1416.33
= [ 60520.00 75177.58 33210.08 =
37545.42
-195.75 1o- 4 x [ 123.29 -50.01 Words
33038.33
-8.54] -445.58 2.13 -352.42 -11.54
-1.45] Time 2 5.71 Time 4 -8.52 Time 6 Categories
The regression sum of products is SR = sE
= SE(yx!fi =
10-4 [-1416.33 -455.58 -352 42] [- 195 ·75 - 1 .4 5] X -8.54 2.13 -11:54 123.29 5.71 -50.01 -8.52
= [23.99 .25] .25
.01
386
Method
The covariate-adjusted error sum of products is
= [1485.58 11.85
11.85]- [23.99 .22 .25
=[1461.59 11.60
11.60] Words .21 Categories
War. tts
.25] .01
0 EJte
gories
n;
The degrees of freedom for SZ are = 48-4-3 = 41. The adjusted variance-covariance matrix is
i*=_!_ S* 41 E =
[35.65 .28
.280] Words .005 Categories
Woro: CCJteg s aries The adjusted error mean squares are the diagonal elements of i*. The standard deviations of the criteria, holding constant the three time measures, are the respective square roots: 6' 1 *= 6'2*
v'35.65 =
5.97
Words
= v:Oo5 = .07 Categories
(The unadjusted value of 6' 1 = 5.81 is smaller than the adjusted value of 5.97. In this instance the adjustment is negligible and the sum-of-square reduction is relatively smaller than the three-degree-of-freedom loss.) In contrast to s;, we may make separate covariate adjustments to each group's matrix, and pool the results. For each group the adjusted matrix is s;:,i' given by Eq. 10.2.42. For example, for the first group of observations,
S* = [547.67 w, 2.94 -[-
X
2.94] .02
~9n·g~ -~~.·~J -1!9:~~] [
-1939.83 -17.07]- [458.90 [ -555.01 -4.70 2 16 -180.17 -1.63 .
46522.92 9513.00 6232.58
9513.00 6232.58]-l 9336.00 4021.00 4021.00 6128.91
2.16] ·02
For the other groups,
S* "'2
=
[414.23 5.50 5.50 .11
J
S* "'3
=
[174.59 .00
.00] .00
S* = [120.91 "'4 1.53
1.53 .05
J
Analysis of Variance: Additional Topics
387
The pooled adjusted matrices are S=2:;S~,
=[1168.63 9.19
9.19] .18
S has n* = 48-4-12 = 32 degrees of freedom. The hypothesis matrix for the parallelism test is SH=[1461.59 11.60
11.60]-[1168.63 .21 9.19
= [292.96 2.41
9.19] .18
2.41] .03
Su has nh = 41-32 = 9 degrees of freedom. The likelihood ratio criterion for H0 is
A=
lSI ISil 127.88 172.40
= .741 The multiplier ism= [32-(2+1-9)/2] = 35, and
s
= (22(92)-4)1/2 22+92-5 =2
The F transformation is thus
F= 1-\f741. 35(2)+1-(9)(2)/2 ~ 9(2) .14 62 .86 18
=-·-
=.56
F does not exceed the .05 critical. F value, with 18 and 62 degrees of freedom. We accept H0 ; the regression planes are parallel and a single common regression estimate will suffice. The dependence of the learning measures upon time is the same for all four groups of subjects. We may also conduct univari.ate parallelism tests for each dependent variable. These reproduce the results that would have been obtained if only one criterion measure had been included in the model. The univariate F statistics are 292.96/9 F1 = 1168.63132 = .89
Words
388
Method
and
.03/9
F2 = _18132 =.56
Categories
Neither exceeds the critical F value, with 9 and 32 degrees of freedom. Although F 1 and F2 are not independent test statistics, H0 does receive support from either measure individually. Accepting that all regression planes are parallel, we may next test for nonzero slope, or nonzero regression of two y variates on the three covariates. The hypothesis is H0 :B=0 Complete tests of H0 are discussed in the earlier regression chapters. Here we shall obtain the single likelihood ratio statistic for all three predictors. The hypothesis matrix for regression is 5 11 , with q degrees of freedom; the error matrix iss;;, with n~ degrees of freedom. The likelihood ratio is
A=
IS~I
ISR+S~I
172.40 190.12 = .908 The F transformation has m = [ 41-(2+ 1 -3)/2] = 41 and
s
= (22(32)-4)112 22+32-5 =2
Then
F= 1--Y.908. 41 (2)+1-(3)(2)/2 ~ 3(2) .05 80 .95 6 =.66 F does not exceed the .05 critical value, with 6 and 80 degrees of freedom, and H0 is maintained. We must conclude that there is no significant relationship of learning scores with time taken to complete the task. We may decide to exclude the covariates on this basis, for residual variation is not significantly reduced by continuing to carry them in the model. Or we may decide that the covariates are of logical importance and continue to maintain them on theoretical grounds (although the theory appears to be challenged). For exemplary purposes, we shall take the latter approach and maintain the covariates. In nonexperimental studies we may wish to see if significant mean differences become smaller or disappear when the covariate adjustment is made. That is, even if error variation is not reduced, the covariates may account for observed group-mean differences on the criteria.
Analysis of Variance: Additional Topics
389
The' model matrix after reparameterization K, and the means V. provide the estimate of mean contrasts for all five measures. The covariance factors (also in Section 8.4) are
0208 (K'DK)- 1 = [
g
(Symmetric)]
.1667 -.0833 0
.1667 -.0833
.1667
Then
0v = (K' DK)- 1 K' DV. = [0y, 0xJ 39.56 .97 - [ -2.33 .03 2.17 -.06 3.75 .03
%
131.31 121.52 109.21] p.' +1/4 ~j aj 3.83 -37.67 -23.83 a~-a; 3:3.83 47.67 35.50 a~-~ 1Z.92 -4.42 -8.67 a~-a~ ~-
,..~
0(9
)-.. /?;&
.,
6'
0y
is reproduced from the analysis-of-variance estimates of Chapter 8. The estimate of 0, eliminating the thr~e time covariates, is
0 = 8y-8.,:B 121.52 109.21][-195.74 -1.45] 39.56 .97] [131.31 123.29 5. 71 .03 _ 10_4 x • 3.83 -37.67 -23.83 [ -2.33 2.17 -.06 33.83 35.50 -50.01 -8.52 47.67 .03 17.92 -4.42 -8.67 3.75
The adjusted variance-covariance factors are G = (K' DK)- 1+0x [SE<xx>] - 1 0~
.3924 (SymmetricJ - [-.0623, .2108 .1137 -.1160 .2074 -.0190 .0197 -.0944 .1797 Rather than computing the enti.re variance-covariance matrix of the eight
390
Method
elements in @, we shall obtain just the standard errors of each. The adjusted standard deviations comprise vector d, where
d' = [6-~
6-~]
= [5.97 .07]
The factors that multiply d to provide the standard errors are the square roots of the variance factors, or diagonal elements of G. Let
.J~]=[::iiJ l Y.1797
.424
The matrix of adjusted standard errors is
H=gd' 3.74
.04J~-t,'+1/42:;aj
2.72 2.53
.03 .03
= [ 2.74 .03 a 1 -az a~-a~ a~-a~
Woro.s Cete 9ories Confidence intervals may be drawn on any element or row of 0 using the corresponding standard errors [h;k]. Under the assumption of normal y, (fJu,-e;k)l = 41 degrees of freedom. The critical t value h;k follows at distribution with may be used to test that one contrast is null for a single variable (words or categories), holding constant the three time covariates. Adjusted treatment means are obtained from the predicted means V. by Eq. 10.2.31. In the word memory experiment we assume a model including the constant and all three mean contrasts. Thus f=J=4, and the matrix of predicted means [Y., X.] is the same as those observed, or [Y., X.]. The adjusted treatment means are given by Eq. 10.2.33; they are
n;
40.37 Y* = [ 42.28 · 39.86 35.75
.971 .94 1.00.96
Condition 1 Condition 2 Condition 3 Condition 4
Words Cetego . rtes In this case they are not systematically or greatly different from the observed matrix Y .. Predicted means under the analysis-of-covariance model are here identical toY. since Y.=Y. and X.= X. (see Eqs. 10.2.33 and 10.2.34). Adjusted mean residuals are identically zero. To test equality of the four experimental effects under the covariance model, the hypothesis is Orthogonal estimates are required for all five measures. H 0 is equivalently H0 : 0A = 0, where 0A is the last three rows of 0-that is, the three betweengroup contrasts for both criterion variables. 0 is a3X2 null matrix.
Analysis of Variance: Additional Topics
391
The orthogonal estimates are Uv= [K*]'DV.. K* is the orthonormalized basis matrix, and was obtained in· Chapter 9 for the analysis of variance. That is,
K*=
[:~:: -:~~~ .2~6 .144 -.083 -.118
gJ
.204 .144 -.083 -.118 -.204
and D=diiag (12, 12, 12, 12) Then
Uv= [Uu, Ux]
u>
i
=
[
274.10 6.71. 909.76 841.92 756.62] 1.08 - .01· 1 97.08 -22.08 -9.17 U 2 11.43 -.12 I 121.03 128.58 88.15 u' 3 9.19 .08 I 43.89 -10.82 -21.23 u' 4
lt.o
""o-.s-
oqt.
619 0 ...,.
J;~
&
J;~
J;~
& <7
&
e
'&.so
Rows of Uv estimate the same effects as those of Sv, but in a stepwise fashion. Each contrast is estimated after' eliminating any correlation with preceding effects. There is no particular interest in the constant term, and a corresponding sum of products u1 u' 1 is unnecessary. To test H0 : eA = 0, we pool the sums of products for the three independent effects. The hypothesis matrix is SH
==
u2u'2+U 3U' 3 +U4U'4
216.23 -.61
i
(Symmetric)
I -------------~-----------------------1891.90 -11.49 I 26ooo.23 1346.52 -15.85 I 12943.19 17136.4o .o2
802.79 -12.05! 8847.79
11766.37 8305.58
This is the same as SH for analysis of variance, but with the three additional covariate measures. sH is identical to SHin Section 9.3. Hypothesis degrees of freedom are nh=3. (Since there is only a single hypothesis in the particular study, the subscript on SH, Sr, and nh is omitted.) To adjust SH for covariates, form Sr= SH+ SE
I Sr] sT<.xy) I Sr<.xx)
S/vv>
= [-----1-----1 1701.81 11.24
1 .24 1
(Symmetric)
-------------~-------------------------
475.57 -20.03 1127986.31 9oo.94 -13.72 I 73463.19 450.37 -23.59 I 42057.87
92313.98 49311.79 41343.91
392
Method
The adjusted hypothesis-plus-error matrix is
8*=[1701.81 T 11.24
11.24] .24
-[ 475.57 900.94 450.37] [127986.31 73463.19 -20.03 -13.72 -23.59 73463.19 92313.98 42057.87 49311.79 X
475.57 -20.03] = [1692.15 [ 900.94 -13.72 11.32 450.37 -23.59
42057.87]- 1 49311.79 41343.91
11.32] .22
The adjusted hypothesis matrix is
Sfi=
s;-s;;
=[1692.15 11.32
=
11.32]- [ 1461.59 .22 11.60
11.60] .21
[ 230.56 -.28] Words -.28 .01 Categories Vltoro:
CCifeg
s
ories
Had there been more than a single hypothesis matrix (SH or SH)• each would have been separately pooled with SE and adjusted to obtain S~r Let us use the likelihood ratio criterion to test H0 . That is, IS~I
A=
1Si+S7II 172.40 249.14
=.694 The F transform has multipliers m = [41-(2+1-3)/2] = 41 and
s
= ( 22(32)- 4 22+32-5
)1/2
=2 Then
F = 1- v:694. 41 (2)+1-(3)(2)/2 ~ 3(2)
.17 80
= .83'6 = 2.68 F has 6 and 80 degrees of freedom. The .05 critical value is exceeded, and H0 is rejected. We note that H0 was rejected also without the covariates (analysis-ofvariance F = 3.29 with 6 and 86 degrees of freedom). Consistent with the
Analysis of Variance: Additional Topics
Table 10.2.2
393
Analysis of Covariance for Word Memory Experiment
Source
d.f.
Constant Between groups, eliminating covariates Covariates, eliminating design effects
Mean Products Words Categories
3
Within groups, eliminating covariates
41
Total
48
-
-
-
2.68* (6, 80)
2.16
.83
.66 (6,80)
.22
.80
u,u't
1
3
Multivariate Univariate F F (d.f.) Words Categories
[76.85 - ..09 [ 8.00 .08
-.090] .004 .080] .003
[35.65 .28
.280] .005
I
'Significant at p < .05.
findings of the regression analysis, the covariates do not account for any sizable portion of error variation in the outcome measures. Had the covariates been more influential in the learning outcomes, we might have discovered that mean differences were significant only with the covariates in the model. Univariate F ratios may also be computed for the separate criterion measures, for the adjusted mean differences. The ratios are '
F, =
230.56/3 1461.59/41
76.85
= 35.65 = 2·16
Words
and
F2
.01/3
= .21141
I
:==
.004 .00 5 =. 83
Categories
The numerators and denominat,ors are adjusted hypothesis and error mean squares respectively. Each F ratio has 3 and 41 degrees of freedom; neither exceeds the .05 critical F value. The covariance results may be summarized as in Table 10.2.2, and may be compared with the analysis-of-variance results of Table 9.3.1 .
•
Appendix A ANSWERS TO MATRIX ALGEBRA EXERCISES (SECTION 2.7)
1.
c,=[~~J
a,,=-1
2. Is=[~ ! ~J 3.
13
A'= [ 21 -13 02 010] 1 -1
4.
=rn
0 l26 31 ~ 31 -4 -4 -6 D-H= 2 2 -2 -4 C+T=
f'
[
6.
e'v = v' e = 0
o
8
f '= 42
28 70
::
2.0
3.5]
··{~]
e and v are orthogonal
~0~
e'f not defined (not conformable)
1 1 1]v=-1
1~14=4
e'e=10.5
lei =V10.5=3.24
394
.5
-2]
44 10 110 25
v'v=7
Thesumofsquaredelements
lvl=vY=2.65
756] ~ [ -2] -~ = [--:;;~ 1
H'=H
2 2 -4 -13 1 1 -1
1'v=[1
v*=
1.5 1.0 2.5]
(e+v)'=[-1.0
.2 ']
1 15A=
~
-3] 32 43
3 -.1 -.1 .2 .3 .1 0 0
5.
e
e'=[1.0
3
.378
lv* I =1
.
Appendix A
7.
395
AI=IA=A
AB~~:
2 3 -3 -1 1 1
B'A'~
5 1 4 1
l]
A'B, B' A not defined (not conformable)
[l l] 4 -3 3 -1 1' 4 2 2
DA~~ -~ 6 -2 8 0
BD
12 0
=[ ~ ;
~]
4 8 0
-3 -2
Premultiplication by D rescales ro:vvs of A. Postmultiplication rescales columns of B.
[1.5 0 OJ 0 TU = 3 0 1.0 3.0 4.0
[3.5 0 0] T+U = 5.0 2.0 0 5.0 1.0 4.0
0
Both results are (lower) triangular !llatrices.
KK'=K'K=[~
01 0OJ 0 1
AA'~~
-~]
[ 11 -1 A'A= -1 9 -2 9
11
r~
1 B'B= 2 3 3 1 3 5 1 2 3
A'DA=
K is orthonormal by rows and columns
4 -1 6 -2] BB'=[ : -2 -1 2
0 -3] [ 23 0 30 32 -3 32 41 9
0
~
0 11 -5 3 7 0 -5 13 0 1 3
46 -22] [ 42 46 75 -10 -22 -10 20
BHB' =
01
i
I I
12 3 0 18 -3 61
I
I
-------j--------t--------r--------1
D®T=
6
I8 i 12
0 0 2
-2 4J
--------~------1
i i
I
I
i i
0I
--------+--------
I 12
0
0
4 124 -4
0 8
116
I
i i
-------~--------+--------~--------
1 I I
I
i I
I
i I
(All missing elements zero)
6 8
o o 2 o
12 -2 4
-56 -28 28 28 -88 -44 f®v= 44 44 -20 -10 10 10
396
Appendix A
e-e. 1 =
[.0~
[.5~ ~-.5]
1.5 1.5 0 1 .0 - 1.5 = -.5
2.5
Mean of elements
1
8.
[1 GA= 0
2 0 2
3
-1
5.~77
-.626
5.843
Vector of mean deviations
~ l J
2.543
Iw I= 4. 796 (5.477)(2.543) = 66.80
c··l
=
[.209 o .051
r
.046 -.o22 .020
[209 A*= 626
0 .183 -420 -.022 .210 -.165
.365 -.183 .365 0 209 0
9.
10.
.3~J 39~
.179 .341 .051
ICI =66.80'=4462.08
1 IW- 11= 66 _80 = .0150
.020J -.165 .154
r(A)=r(A'A)=r(AA')=3 IA'AI=187 IAA'I=O IGI = -1 IGDI = -48 tr(TT')=67
1.0
Rows of A rearranged
w =[ 4 -~96
w- 1=
1.5
1 IC-11 = 4462.08 = .0002
R=
r
4.796 0 -626
r(D)=4 IDI =48
0 5477 5.843
IAA'DI =0
A(R- 1)' =A* 2IJ
Appendix _8 PROGRAM USER'S GUIDE The following pages describe the input data to the computer program MUL TIVARIANCE: Univariate and multivariate an'alysis of variance, covariance, and regressionVersion V (March, 1972) (Finn, 1972d). MULTIVARIANCE is a FORTRAN IV program which performs the analyses discussed in this text. The MULTIVARIANCE program package and User's Guide are distributed by National Educational Resources, Inc., 215 Kenwood Avenue, Ann Arbor, Michigan 48103. The five sample problems describe'd in the text have been run with MULTIVARIANCE Version V. The input/output listings are contained in Appendix C. The description that follows is intended as a guide for rei!ding the data listings. Parameters and options not directly germane to the analysis are omitted (for example, spacing, data screening, and punching codes). Control cards and parameters that are necessary to all runs are starred (*). For analysis-of-variance designs, coded "Symbolic Contrast Vectors" substitute for the Kronecker product codes of Chapt~r 8. These utilize the same conventions as given in Chapter 8, but the Kronecker operator 0 is replaced by commas (,) for keypunching. Several additional examples of contrast coding are provided. Key: b =blank column 0 =numeric zero 0 =alphabetic "oh"
Phase I - Input†

1. Title Cards*
Card 1* Columns 1-60 Alphameric problem title
Card 2* Columns 1-60 Alphameric problem title (continued)

2. Input Description Card*
Cols. 1-4* Number of measured variables in input set. No distinction is made between dependent variables and covariates.

†Adapted from Chapters 2, 3, and 5 of User's Guide - MULTIVARIANCE: Univariate and multivariate analysis of variance, covariance and regression - Version V (March, 1972); ©1972 by National Educational Resources, Inc. Used by permission of National Educational Resources, Inc., Ann Arbor, Mich.
Cols. 7-8* Number of factors (ways of classification) in the analysis-of-variance design. For regression only, set to 1.
Col. 12* Data form code: 1 = raw unsorted data, each observation with its own cell identification information; 2 = raw data sorted by cells, each cell with its own header card; 3 = raw data sorted by cells, no header cards; 4 = within-group variance-covariance matrix and mean-frequency summary data; 5 = raw unsorted data to be read from an independently prepared binary tape; 6 = raw data grouped by subclasses to be read from an independently prepared binary tape; 7 = within-group correlation matrix and mean-frequency summary data
Col. 16* Number of Variable Format Cards
Col. 20 Transformation code: 1 = transformations; b or 0 = no transformations
Cols. 21-24 Number of variables remaining after input variable transformations have been performed (if different from columns 1-4)
Col. 28 Transformation matrix code: 1 = read transformation matrix; 2 = generate transformation matrix with Symbolic Contrast Vectors; 3 = generate transformation matrix and orthonormalize leading rows; 4 = read transformation matrix and orthonormalize leading rows; b or 0 = no transformation matrix
Cols. 29-32 Number of variables remaining after the transformation matrix has been applied (if different from columns 21-24)
Col. 40 Punched output code: 1 = punch summary data (summary data will be printed whether or not this option is chosen); 2 = punch all scores after transformations (see Variable Format Cards); 3 = punch summary data and all scores after transformations; b or 0 = do not punch summary data or scores after transformations
Col. 52 Data list code: 1 = list all data before and after transformations; b or 0 = list only first observation
Col. 64 Optional output code: 1 = optional printed output requested throughout problem run; b or 0 = normal printed output throughout problem run

3. Factor Identification Card(s)*
Six-character factor names followed by four digits giving the number of levels for that factor. For regression analysis only, punch 1 in column 10.

4. Comments Cards
Cols. 1-80 Comments, maximum of 300 cards

5. End-of-Comments Card*
Cols. 1-6* FINISH

6. Variable Format Card(s)*
Input: For data form 1, include the format for the observation and its cell identification. For data form 2 or 3, include the format for dependent variables and covariates only.
Output: If column 40 of the Input Description Card is 2 or 3, then the user must include a Variable Format Card describing the punched transformed scores.
7. Transformation Cards
Cols. 1-4 Location of resultant variable
Cols. 7-8 Transformation code (list in complete manual)
Cols. 9-12 First variable to be transformed
Cols. 13-16 Second variable to be transformed
Cols. 17-20 Third variable to be transformed
Cols. 21-30 Constant to be used in transformation

8. End-of-transformation Card
Cols. 1-80 Blank; to be used only if Transformation Cards are used

9. Variable Label Cards*
Six-column alphameric labels, 13 to a card, as many labels as variables after all transformations

10. Data*
The data form code is punched in column 12 of the Input Description Card. For regression only, all observations are entered into cell "1."

Data Form 1: Raw unsorted data, each observation with its own cell identification information
Each observation is contained on one or more cards. The dependent variables and covariates are preceded by numbers identifying to which level the observation belongs on each factor of the design. For a 2x3 design (say sex by social class) a data card might be punched as follows:
010306.211.5
The identification vector (0103) indicates that this observation is of the first sex and the third social class. The scores on the two dependent variables of this observation are 6.2 and 11.5. The observations do not have to be sorted, and cell frequencies are accumulated automatically. The user does not have to account for missing cells. The Variable Format Card must contain Fixed Point fields describing the level-identifying numbers and Floating Point fields for the other variables to be read. The last card(s) is one completely blank observation (i.e., the data are ended with as many blank cards as there are cards in any one observation).

Data Form 2: Raw data sorted by cells, each cell with its own header card
For each cell,
(a) Header Card
Cols. 1-6 Number of observations in the cell (N0BS)
Cols. 7-10 Level on first factor named on Factor Identification Card
Cols. 11-14 Level on second factor
Cols. 15-18 Level on third factor
etc.
(b) Observations for that cell (exactly N0BS observations)
Last data card - one blank card following the observations of the last cell

Data Form 3: Raw data sorted by cells, no header cards
(a) Vector of cell frequencies
(b) All data cards grouped by cells, in the same order as the vector of frequencies, no header cards

Data Form 4: Within-group variance-covariance matrix and mean-frequency summary data
(a) Matrix of cell means, one row at a time
(b) Vector of cell frequencies
(c) One Variable Format Card describing the within-cell variance-covariance matrix
(d) Pooled within-cell variance-covariance matrix

Data Form 5: Raw unsorted data to be read from an independently prepared binary tape
Data Form 6: Raw data grouped by subclasses to be read from an independently prepared binary tape
Data Form 7: Within-group correlation matrix and mean-frequency summary data. Same as Data Form 4, except that the within-cell correlation matrix, with standard deviations in the diagonal positions, replaces the variance-covariance matrix (d).
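For concreteness, the Data Form 1 card shown above could be decoded as in the following sketch (a minimal illustration in Fortran, the language of the program; the unit number, field widths, and variable names are assumptions for this 2x3 example and are not part of MULTIVARIANCE):

      INTEGER LEV(2)
      REAL Y(2)
C     READ ONE DATA FORM 1 CARD, E.G.  010306.211.5
C     TWO FIXED POINT LEVEL-IDENTIFYING FIELDS, THEN TWO
C     FLOATING POINT FIELDS FOR THE DEPENDENT VARIABLES
      READ (5,100) LEV, Y
  100 FORMAT (2I2,2F4.1)
C     FOR THE CARD ABOVE, LEV(1)=1, LEV(2)=3, Y(1)=6.2, Y(2)=11.5
      WRITE (6,200) LEV, Y
  200 FORMAT (1X,2I3,2F8.1)
      END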
11. Transformation Matrix
If column 28 of the Input Description Card is 1 or 4:
(a) Variable Format Card describing one row of the matrix
(b) Matrix, one row at a time
If column 28 of the Input Description Card is 2 or 3:
(a) Design Card
Cols. 1-4 Number of factors in design on dependent variables (within-subject factors)
Col. 8 Number of factors for arbitrary contrasts
Col. 12 Number of factors for orthogonal polynomials
Col. 16 1 = construct transformation matrix from contrast vectors; b or 0 = construct transformation matrix from basis vectors
(b) Factor Card for within-subject factors: six-character within-subject factor names, followed by four digits giving the number of levels for that factor
(c) Arbitrary contrasts for transformation matrix
(d) Orthogonal polynomials for transformation matrix
(e) As many Symbolic Contrast Vectors as the number of variables after the transformation matrix
Phase II - Estimation

12. Estimation Specification Card*
Cols. 1-4* Rank of the analysis-of-variance model for significance testing (l in Chapter 7). Total degrees of freedom for all between-cell hypotheses, including one for the grand mean. Set to zero for regression analysis only. May not exceed the number of nonempty cells in the design. Symbolic Contrast Vectors, each symbolizing a single-degree-of-freedom source of variation, are entered at a later point in the input data deck.
Cols. 5-8 Rank of the model for estimation: the number of effects of which least-squares estimates and their standard errors will be calculated and printed. Also the number of effects to be included in the model for predicting means and obtaining mean residuals (c in Chapter 8). May not exceed the rank of the model for significance testing. It may be set to zero if contrast estimates are not required. Estimated subclass and row and column means will be based on fitting a model of rank c. The residuals, if calculated, will be the remainder after a model of this rank has been fit to the data. The rank of the model for significance testing (l) corresponds to between-cell contrasts that will be tested for significant contribution to criterion variation. At this point, the magnitudes of the c leading contrasts and their standard errors may be calculated and printed.
Col. 12 Error term code (error term to be used in the analysis of variance): b or 0 = pooled within-group variance-covariance matrix; 1 = residual variance-covariance matrix, after variation corresponding to all degrees of freedom in the model for significance testing has been removed from the total variance-covariance matrix; 2 = special effects: contrasts included in the model are used to obtain an error estimate (see columns 13-16).
Cols. 13-16 Degrees of freedom for error if special effects are used (if column 12 = 2). The last effects in the first order of the model for significance testing (l) will be summed to form the error sum of products and variance-covariance matrix.
Cols. 17-20 Number of alternative orders of effects (contrasts) to be established, other than the first
Cols. 23-24 Number of factors in the analysis-of-variance design for which arbitrary contrasts will be used
Cols. 27-28 Number of factors in the design for which orthogonal polynomials will be used
Col. 32 Cell means and residuals code: 1 = calculate and print estimated cell means and residuals; b or 0 = do not calculate and print estimated cell means and residuals
Col. 44 Combined means code: 1 = print combined observed means and N's; 2 = print combined estimated means; 3 = print both observed and estimated combined means; b or 0 = no combined means. (If option 1, 2, or 3 is selected, the following card must be a Means Key.)
Col. 48 Orthogonal analysis code: 1 = completely orthogonal analysis; b or 0 = general nonorthogonal analysis

13. Means Key
If column 44 of the Estimation Specification Card is 1, 2, or 3, a Means Key is entered at this point, to determine which observed and/or estimated combinations of subclass means are to be computed. The Means Key treats the factors in the order in which they appear on the Factor Identification Card. For example, if that card were
SEXbbbbbb2MAJ0Rbbbb5TRTMTbbbb3
then SEX would be considered factor 1, MAJ0R factor 2, and TRTMT factor 3.
To obtain means on all criterion variables for both levels of the sex factor, across all other factors, the Means Key is
1.        MEANS KEY
This would produce a table of means for all males and another for all females. To obtain means for each combination of levels of certain factors, an asterisk (*) is interposed between the numbers of the factors. For example, the Means Key to obtain means for each combination of the levels of sex and treatment would be
1*3.      MEANS KEY
The Means Key to print overall means for all subjects, for each treatment level, for each treatment level of each sex group, and for each treatment level of each major, is
0, 3, 1*3, 2*3.     MEANS KEY
Requests are separated by commas, and ended with a period. Comments may fill the remainder of the card.

14. Arbitrary Contrast Matrices
If the program's "standard" contrasts are not applicable:
(a) Factor Name Card; columns 1-6 contain the factor name as it appears on the Factor Identification Card
(b) Variable Format Card describing one row of the contrast matrix
(c) Contrast matrix, one row at a time

15. Orthogonal Polynomials
(a) Factor Name Card; columns 1-6 contain the factor name exactly as it appears on the Factor Identification Card; column 10 equals 1 for a user-supplied metric, blank or 0 for an evenly spaced metric
(b) Arbitrary metric for orthogonal polynomials, if col. 10 of the preceding card is 1

16. Symbolic Contrast Vectors (SCV's)*
There are as many Symbolic Contrast Vectors as indicated in columns 1-4 of the Estimation Specification Card (rank of the model for significance testing). Each SCV defines either the general mean or one contrast to be made among the cells of the design. Each SCV, which corresponds to one between-cell degree of freedom, is punched on one Symbolic Contrast Vector card. Each SCV causes the program's basis generator to produce one column of a basis for the model. The order in which these cards are entered is the original order of effects in the model for significance testing. Four types of contrasts are allowed by the program, plus a nesting vector and an option for using arbitrary contrasts of the user's own construction:
(a) "C" or simple contrasts, in which all levels but one of a given factor are contrasted with the one omitted. For a three-level factor, the contrasts might compare the first level with the second level, and the third level with the second level, respectively.
(b) "D" or deviation contrasts, in which all but the last level of a given factor are compared to the mean of all levels of the factor.
(c) "H" or Helmert contrasts, in which cells 1 through a-1 are each compared with the cells following, where a is the total number of levels of the factor. In a four-level factor, the H contrasts would compare cell 1 to the average of cells 2 through 4, cell 2 to the average of cells 3 and 4, and cell 3 with cell 4, respectively.
(d) "I" or a column of an identity matrix, used in constructing a basis for nested designs.
(e) "L" or optional contrasts, entered by the user (f) "P" or orthogonal polynomials Accompanying each contrast code is a number indicating which of the contrasts is desired. Zero indicates grand mean, or constant term, of the model. For example: H1, would indicate the first possible Helmer! contrast, which would be cell 1 contrasted with the average of all the remaining levels of that factor. D3, would indicate cell3 contrasted with the mean of all cells for that factor. CO, would indicate grand mean and that simple contrasts will be used. Rules of thumb: (a) If indicating a list of "C" contrasts, say C1, C2, and C4, the number omitted will be the cell to which the others will be compared (in this case, the third). (b) In "H" or "D" contrasts, the number to be omitted must be the last; e.g. for a fourlevel factor, the D contrasts would be D1, D2, D3. (c) In "L" or "P" contrasts, the number indicates the row of the respective contrast matrix to be used. (d) In all cases, the letter-number combination signifies an entire contrast vector for one factor of the design. (Thus it represents more than a single group number or weight.) (e) To estimate common anova parameters (J.t, aj, f3k, etc.), deviation (D) contrasts are required. Every indicated contrast is followed by a comma (as shown in the examples). The /th comma indicates the end of that part of the contrast pertaining to the /th factor of the design. The comma replaces the Kronecker operator 0 of Chapter 8. Comments may fill unused card columns. Example 1 For a one-way (one factor) design with five levels, for which it is desired to compare the first (say, control group) to all others, the SCV's would be: CO, GRAND MEAN 2, LEVEL 2-LEVEL 1 (C0NTR0L) 3,
LEVEL 3-LEVEL 1 (C0NTR0L)
4, LEVEL 4-LEVEL 1 (C0NTR0L) 5, LEVEL 5-LEVEL 1 (C0NTR0L) Unless it is desired to change contrast types (for a given factor), the alphabetic contrast code can be omitted from all but the first SCV. Example2 Consider a 2x3 (Ax8) crossed design in which it is desired to contrast the two A groups with each other and to contrast the first with the second and third, and the second with the third level, on the three-level factor. The SCV's would read: CO, HO, GRAND MEAN 0F 80TH FACT0RS (I) 1, 0, A1-A2, GRAND MEAN 0F B (II) [levels of 8 all weighted equally for this contrast] 0, 1, GRAND MEAN 0F A, 81-(82+83)/2 (Ill) 0, 2, GRAND MEAN 0F A, 82-83 (IV) 1, 1, INTERACTI0N 0F II AND Ill (V) 1, 2, INTERACTI0N 0F II AND IV (VI) The contrast to the left of the first comma pertains to the first factor on the Factor Identification Card, that to the left of the second comma to the second factor, etc.
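As a worked illustration (added here; not in the original guide), the three contrast types of (a)-(c) yield the following contrast matrices for a four-level factor, each row being one contrast vector, with the fourth level as the omitted level:

$$\mathbf{C} = \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{bmatrix}, \qquad \mathbf{D} = \begin{bmatrix} \tfrac{3}{4} & -\tfrac{1}{4} & -\tfrac{1}{4} & -\tfrac{1}{4} \\ -\tfrac{1}{4} & \tfrac{3}{4} & -\tfrac{1}{4} & -\tfrac{1}{4} \\ -\tfrac{1}{4} & -\tfrac{1}{4} & \tfrac{3}{4} & -\tfrac{1}{4} \end{bmatrix}, \qquad \mathbf{H} = \begin{bmatrix} 1 & -\tfrac{1}{3} & -\tfrac{1}{3} & -\tfrac{1}{3} \\ 0 & 1 & -\tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & 0 & 1 & -1 \end{bmatrix}.$$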
Note how this compares to a "traditional" anova table:

Source       Degrees of Freedom     Corresponding SCV's
Grand Mean   1                      I
A            a-1 (=1)               II
B            b-1 (=2)               III, IV
AxB          (a-1)(b-1) (=2)        V, VI
Example 3
Consider a 2x3x3 three-way design for which orthogonal polynomials have been entered to find the linear and quadratic trends of the second factor. This is the example given on page 238, and in Table 7.4.1. The SCV's are:

Symbolic Vector     Comment
CO, PO, CO,         GENERAL MEAN
1, 0, 0,            A1-A2
0, 1, 0,            LINEAR B
0, 2, 0,            QUADRATIC B
0, 0, 1,            C1-C3
0, 0, 2,            C2-C3
1, 1, 0,            LINEAR AB INTERACTI0N
1, 2, 0,            QUADRATIC AB INTERACTI0N
1, 0, 1,            AxC INTERACTI0N
1, 0, 2,            AxC INTERACTI0N
0, 1, 1,            BxC INTERACTI0N
0, 1, 2,            BxC INTERACTI0N
0, 2, 1,            BxC INTERACTI0N
0, 2, 2,            BxC INTERACTI0N
1, 1, 1,            AxBxC INTERACTI0N
1, 1, 2,            AxBxC INTERACTI0N
1, 2, 1,            AxBxC INTERACTI0N
1, 2, 2,            AxBxC INTERACTI0N
Example 4
The basis of a nested sampling design with the structure

A1:  B1 (C1 C2 C3),  B2 (C4 C5)
A2:  B3 (C6 C7 C8),  B4 (C9 C10),  B5 (C11 C12 C13 C14)

could be generated symbolically by:

Symbolic Vector     Comment
CO, DO, DO,         GENERAL MEAN
1, 0, 0,            A1-A2
I1, 1, 0,           B C0NTRAST IN A1
2, 1, 0,            B C0NTRAST IN A2 (1)
2, 2, 0,            B C0NTRAST IN A2 (2)
1, I1, 1,           C C0NTRAST IN B1 IN A1 (1)
1, 1, 2,            C C0NTRAST IN B1 IN A1 (2)
1, 2, 1,            C C0NTRAST IN B2 IN A1
2, 1, 1,            C C0NTRAST IN B3 IN A2 (1)
2, 1, 2,            C C0NTRAST IN B3 IN A2 (2)
2, 2, 1,            C C0NTRAST IN B4 IN A2
2, 3, 1,            C C0NTRAST IN B5 IN A2 (1)
2, 3, 2,            C C0NTRAST IN B5 IN A2 (2)
2, 3, 3,            C C0NTRAST IN B5 IN A2 (3)

This design is regarded as an incomplete 2x3x4 design with non-empty cells as follows:

       (B within A) 1    (B within A) 2    (B within A) 3
A1:    C1 C2 C3 -        C4 C5 - -         - - - -
A2:    C6 C7 C8 -        C9 C10 - -        C11 C12 C13 C14
The grand mean is almost always included as a first SCV, unlike the tables shown in most statistics texts. There are always as many symbolic codes (grand mean or otherwise) indicated on each SCV as there are factors in the design. When indicating a main-effect contrast for one factor, the other factors may be set to zero, indicating "grand mean" for those factors, i.e., "not involved" in the contrast.

Spacing. The SCV's may be punched one to a card in columns 1 through 74. Any number of embedded blanks is allowed. The columns following the last comma should be used to provide a description of the contrast, as in the examples.

Repeat Code. MULTIVARIANCE provides a "repeat code" which reduces the number of SCV cards that must be prepared. The convention which has been adopted is a code of the form nCm, which represents the Symbolic Vectors Cm, C(m+1), C(m+2), ..., C(m+n-1). For example, 6I2, would represent I2, I3, I4, I5, I6, I7. The number preceding the alphabetic contrast-type code determines the number of vectors to be generated. The first such vector will contain the contrast of that card, i.e., Cm. Each of the n-1 remaining vectors assumes the same letter code and a number code increased by one. The repeat code may be used for designs with more than a single factor. For example,
CO, 3CO, H1,
would generate the Symbolic Vectors
CO, CO, H1,
CO, C1, H1,
CO, C2, H1,
17. Contrast Reordering Keys
As many keys as indicated in columns 17-20 of the Estimation Specification Card. Each key may contain no more elements than the rank of the model for significance testing (number of contrasts), and begins on a new card. The key contains the desired new order of the basis vectors (i.e., the vector to be ordered first has its subscript first, the second vector in the new order has its subscript second, etc.). The original order is the order in which the SCV cards are entered in the input data deck. The reordering numbers are separated by commas and ended with a period. The format of the card is completely free. For example, a key to reorder the B factor first in the two-way SCV example would read:
1, 3, 4, 2, 5, 6.     RE0RDER
Since the interaction terms will not be altered by interchanging the preceding effects, they may be ignored in the alternate orders. Therefore, a Contrast Reordering Key containing all effects of interest in the second order would be
1, 3, 4, 2.     RE0RDER
Thus fewer columns are orthogonalized in the alternate order.
Phase III - Analysis
The following set of cards (Analysis Selection Card, Variable Select Key, Covariate Grouping Key, and Hypothesis Test Cards) may be repeated any number of times.

18. Analysis Selection Card*
Cols. 1-4* Number of dependent variables to be selected from the input set (p)
Cols. 5-8 Number of covariates for analysis of covariance, or predictor variables for regression (q). Rank of the regression model.
Col. 12 Variable selection code: 1 = variables are to be deleted or rearranged to obtain the desired sets; b or 0 = variables are in the correct sets and order as entered in Phase I. (Covariates must follow dependent variables in Phase III.)
Cols. 13-16 Number of alternative contrast orders to be run with this set of variables, other than the first
Col. 20 Principal components code: 1 = principal components of correlation matrix; 2 = principal components of covariance matrix; b or 0 = no principal components
Col. 24 Discriminant function analysis code: 1 = perform discriminant analysis; b or 0 = do not perform discriminant analysis
Col. 32 Canonical correlation code: 1 = perform canonical correlation analysis if there are covariates (predictors); b or 0 = do not perform canonical correlation analysis
Col. 36 Covariate grouping code: 1 = covariates (predictors) are to be entered into the regression by the user's key; b or 0 = covariates (predictors) are to be entered one at a time (all qhj = 1). Ignore if no covariates.
Col. 40 Parallelism code: 1 = perform test of parallelism of regression planes; b or 0 = do not perform test of parallelism of regression planes

19. Variable Select Key
If column 12 of the Analysis Selection Card is 1, then this key is prepared with p+q elements indicating the variable order, and placed in the input data deck at this point. Elements are separated by commas and ended with a period. For example, if it is desired to select the tenth, eleventh, and fourteenth of the original input variables as dependent and the second as a covariate, the key would read:
10, 11, 14, 2.     VAR SELECT

20. Covariate (Predictor) Grouping Key
If column 36 of the Analysis Selection Card is 1, then this key is prepared with elements (qhj) indicating the numbers of covariates to be added to the regression equation, and placed in the input data deck at this point. Covariates are taken successively in the order in which they have been selected by the Variable Select Key or by their order when input to the program. Elements are separated by commas, ended with a period. For example, if five predictors were chosen, and we wish to test the joint contribution of the first four, and then the additional contribution of the fifth, the key would read
4, 1.
21. Hypothesis Test Card(s)*
At this point, the contrasts (corresponding to degrees of freedom) may be grouped and tested by analysis of variance or covariance. There should be one Hypothesis Test Card for the original contrast order, plus one for each alternative order indicated in columns 13-16 of the Analysis Selection Card. These numbers are the $n_h$ values of Chapter 9, and determine the number of rows of U which form the hypothesis matrix. The grouping of effects will begin with the first Symbolic Contrast Vector entered into the program and will proceed in the order in which the SCV's follow one another. Effects may be tested for significance individually or may be combined to test the significance of main effects, interactions, or parts of them. Each group of degrees of freedom to be tested is separated by commas and ended with a period. If the "degrees of freedom" number is negative, tests of those contrasts will be skipped. If no degrees of freedom are given and only a period is punched, only the regression analysis of the dependent variables on the covariates will be performed.

Example
In the 2x3 crossed design the Hypothesis Test Card might read:
-1, 1, 2, 2.     FIRST 0RDER
The -1 indicates skipping the significance test for the general mean (corresponding to the first SCV). The 1 indicates a one-degree-of-freedom test of the two-level factor (corresponding to the second SCV). The first 2 indicates a two-degree-of-freedom test of the three-level factor (corresponding to the third and fourth SCV's), and the second 2 indicates the test of the interaction. Note that the numbers in this case correspond to the degrees of freedom in the traditional analysis-of-variance table. If, however, one-degree-of-freedom tests of the two B contrasts are desired, the card would read:
-1, 1, 1, 1, 2.     1-DF TESTS
If effects B and A had been reversed in order, a second Hypothesis Test Card corresponding to testing the effects in the second order might read:
-1, 2, 1, 2.     RE0RDER A AND B
22. End-of-job Card*
This is the last card of the problem deck and is blank except for columns 60 ff. The format of the End-of-job Card is as follows:
Cols. 60-67  C0NTINUE  if data and program control cards for another problem run follow immediately
or
Cols. 60-63  ST0P      if this is the end of the last (or only) problem run
In case an error is encountered during the run, MULTIVARIANCE will ignore all cards until the End-of-job Card is encountered. If C0NTINUE has been punched, the next problem will be attempted. Any number of problem decks may be stacked.
Appendix C
INPUT-OUTPUT LISTINGS FOR FIVE SAMPLE PROBLEMS*
The following pages contain printouts of the input decks, including all data cards, for the five sample problems discussed in the text. Each deck has been prepared for the MULTIVARIANCE program, as specified in Appendix B. The design and control-card setup is described in the Comments section of the data deck. Following the input listing, the output which results from running the data with the MULTIVARIANCE program is reproduced. The analysis and results correspond to the presentations of Chapters 1 through 10.
                                        Beginning page
Sample Problem                          Input listing    Output listing
1. Creativity and achievement           C.1              C.5
2. Word memory experiment               C.22             C.26
3. Dental calculus reduction            C.56             C.61
4. Essay grading study                  C.107            C.112
5. Programmed instruction effects       C.147            C.151
*From Listing of Sample Problem Output-MULTIVARIANCE: Univariate and multivariate analysis of variance, covariance, and regression- Version V (March 1972). (Ann Arbor, Mich.: National Educational Resources, Inc., 1972). Used by permission of National Educational Resources, Inc.
PROBLEM 1--MULTIVARIATE MULTIPLE REGRESSION AND CORRELATION ANALYSIS
6   1   2  1  1   9  1                    1           1    INPUT DESC CARD
1                                                          FACTOR ID CARD
PROBLEM 1--MULTIVARIATE MULTIPLE REGRESSION AND CORRELATION ANALYSIS
THE DATA FOR THIS EXAMPLE HAVE BEEN SAMPLED FROM A LARGER SET COLLECTED BY MR. I. LEON SMITH OF YESHIVA UNIVERSITY, NEW YORK CITY. EACH OF THE 141 ELEVENTH GRADE STUDENTS FROM A WESTERN NEW YORK METROPOLITAN AREA, OF WHICH 60 WERE SELECTED FOR THIS EXAMPLE, WAS MEASURED ON
A. FOUR MEASURES OF CONVERGENT ACHIEVEMENT, FROM THE TESTS OF READING COMPREHENSION DEVELOPED BY KROPP, STOKER, AND BASHAW (1966). THESE ARE BASED ON THE LEVELS OF THE TAXONOMY OF EDUCATIONAL OBJECTIVES (BLOOM, 1956) AS FOLLOWS
   1. KNOWLEDGE   2. COMPREHENSION   3. APPLICATION   4. ANALYSIS
B. TWO MEASURES OF DIVERGENT ACHIEVEMENT, FROM THE SAME BATTERY
   1. SYNTHESIS   2. EVALUATION
C. THREE MEASURES OF CREATIVITY (TESTS OF THE FACTORS OF SYMBOLIC AND SEMANTIC DIVERGENT-PRODUCTION ABILITIES) FROM GUILFORD (1967)
   1. CONSEQUENCES - OBVIOUS   2. CONSEQUENCES - REMOTE   3. POSSIBLE JOBS
D. ONE MEASURE OF GENERAL INTELLIGENCE AS PROVIDED BY THE LORGE-THORNDIKE MULTI-LEVEL INTELLIGENCE TEST, LEVEL G, FORM 1.
THE HYPOTHESES OF THE STUDY CONCERN THE EXTENT TO WHICH INTELLIGENCE, CREATIVITY, AND THE INTERACTIONS OF THE TWO RELATE TO THE TWO SETS OF ACHIEVEMENT MEASURES. THIS EXAMPLE WILL EMPLOY ONLY THE DIVERGENT ACHIEVEMENT MEASURES AS CRITERION VARIABLES, ALTHOUGH THE DATA CARDS CONTAIN ALL SCORES. THE VARIABLES READ BY PHASE I OF THE PROGRAM ARE THE TWO DIVERGENT ACHIEVEMENT MEASURES, THE CREATIVITY MEASURES, AND INTELLIGENCE. THE CREATIVITY AND INTELLIGENCE MEASURES ARE STANDARDIZED AND THEIR CROSS-PRODUCTS FORMED TO YIELD THE ADDITIONAL INTERACTION MEASURES, THROUGH THE USE OF THE TRANSFORMATION FEATURES. REGRESSION TECHNIQUES ARE USED TO DETERMINE THE EXTENT TO WHICH CREATIVITY, INTELLIGENCE, AND FINALLY THEIR INTERACTIONS CONTRIBUTE TO VARIATION IN THE CRITERION MEASURES.
THE DATA CARDS ARE PUNCHED AS FOLLOWS
   COLUMN  1-3    SUBJECT IDENTIFICATION NUMBER
           5      SEX (M,F)
           7-10   KNOWLEDGE
           11-14  COMPREHENSION
           15-18  APPLICATION
           19-22  ANALYSIS
           23-26  SYNTHESIS (SYNTH)
           27-30  EVALUATION (EVAL)
           36-39  CONSEQUENCES-OBVIOUS (CONOBV)
           40-43  CONSEQUENCES-REMOTE (CONRMT)
           44-47  POSSIBLE JOBS (JOBS)
           48-57  INTELLIGENCE (INTEL)
THIS RUN USES --
   DATA FORM 2
   ALL DATA LISTED
   OPTIONAL PRINTED OUTPUT
   TRANSFORMATIONS (MORE VARIABLES AFTER TRANSFORMATIONS)
   REGRESSION FEATURES ONLY
   COVARIATE (PREDICTOR) GROUPING KEY--NOTE INTELLIGENCE IS FIRST PREDICTOR IN REGRESSION ANALYSES
   CANONICAL CORRELATION
   PRINCIPAL COMPONENTS OF CORRELATION MATRIX OF ALL DIVERGENT MEASURES INCLUDING CREATIVITY (FIRST ANALYSIS RUN)
THERE IS ONLY A SINGLE GROUP OF OBSERVATIONS, FOR REGRESSION ANALYSIS. THE DESIGN IS SPECIFIED AS A ONE-WAY, ONE-LEVEL DESIGN ON THE INPUT DESCRIPTION AND FACTOR IDENTIFICATION CARDS. THE ESTIMATION SPECIFICATION CARD IS BLANK, AS THERE IS NO ANALYSIS-OF-VARIANCE MODEL. NO MEAN CONTRASTS ARE CODED. THE RANK OF THE REGRESSION MODEL IS SPECIFIED ONLY BY INDICATING THE NUMBER OF PREDICTORS ON THE VARIABLE SELECT KEY.
THE FIRST ANALYSIS HAS NO PREDICTORS, BUT ONLY SYNTHESIS, EVALUATION, AND CREATIVITY MEASURES AS CRITERIA, SO THAT PRINCIPAL COMPONENTS CAN BE EXTRACTED FROM THE CORRELATION MATRIX. THE SECOND ANALYSIS HAS SYNTHESIS AND EVALUATION AS CRITERIA, PLUS SEVEN PREDICTORS FOR REGRESSION, IN THE ORDER INTELLIGENCE-CREATIVITY-INTERACTIONS. A (COVARIATE) GROUPING KEY IS ENTERED SO THAT INSTEAD OF TESTING THE CONTRIBUTION OF EACH PREDICTOR TO REGRESSION, THEY ARE TESTED IN SETS. THAT IS, THE FIRST PREDICTOR (INTELLIGENCE) IS TESTED ALONE, THE NEXT THREE (CREATIVITY) JOINTLY, AND THE ADDITIONAL REGRESSION OF THE LAST THREE (INTERACTIONS) JOINTLY. THE THIRD ANALYSIS SELECTS ONLY INTELLIGENCE AS A PREDICTOR, SINCE IT IS THE ONLY SIGNIFICANT INDEPENDENT VARIABLE. THE BEST ESTIMATES OF THE REGRESSION WEIGHTS AND STANDARD ERRORS ARE OBTAINED FOR THIS VARIABLE ALONE.
FOR EACH ANALYSIS, A HYPOTHESIS TEST CARD WITH A PERIOD (.) ONLY IS ENTERED, TO INDICATE THAT NO ANALYSIS-OF-VARIANCE EFFECTS ARE TO BE TESTED.
FINISH                                                     END OF COMMENTS
(22X2F4.0,5X3F4.0,F10.0)                                   VARIABLE FORMAT
3   2   3             -18.433333                           STANDARDIZE CONOBV
3   3   3             .14715126
4   2   4             -4.1166667                           STANDARDIZE CONRMT
4   3   4             .30597965
5   2   5             -14.516667                           STANDARDIZE JOBS
5   3   5             .18440740
6   2   6             -102.01667                           STANDARDIZE INTEL
6   3   6             .06744341
7   18  3   6                                              INTERACTION
8   18  4   6                                              INTERACTION
9   18  5   6                                              INTERACTION
(blank card)                                               END OF TRANSFORMATIONS
SYNTH EVAL  CONOBVCONRMT JOBS  INTEL CO X ICR X IJB X I    VARIABLE LABELS
60     1                                                   HEADER CARD
[The 60 data cards follow here, each carrying the subject identification, sex, the ten test scores, and intelligence in the columns listed above. The scanned values are too degraded to reproduce.]
(blank card)                                               BLANK--END OF DATA
(blank card)                                               EST SPEC CARD
[analysis selection card]                                  ANALY SELECT-PRINC COMPS ONLY
1,2,3,4,5.                                                 VARIABLE SELECT
.                                                          HYP TEST-NO ANOVA
[analysis selection card]                                  ANALY SELECT-GROUPING KEY
1,2,6,3,4,5,7,8,9.                                         VARIABLE SELECT-INTEL FIRST PREDICTOR
1,3,3.                                                     GROUPING KEY FOR INDEPENDENT VARIABLES
.                                                          HYP TEST-NO ANOVA
CONTINUE
[The numeric fields of the Estimation Specification and Analysis Selection Cards are not legible in this copy; the third analysis, which selects intelligence as the single predictor, follows the same pattern of cards.]
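The pairs of STANDARDIZE cards in the deck above carry out the standardization in two steps. Reading each pair's constants as the negative of the sample mean and the reciprocal of the sample standard deviation (an interpretive note; the means and standard deviations are implied by the card constants), the two transformations effect

$$z_i = (x_i - \bar{x}) \times \frac{1}{s} = \frac{x_i - \bar{x}}{s},$$

so that, for example, CONOBV has $\bar{x} = 18.4333$ and $s = 1/.14715126 = 6.796$.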
* * * * * * *   M U L T I V A R I A N C E   * * * * * * *
UNIVARIATE AND MULTIVARIATE ANALYSIS OF VARIANCE, COVARIANCE AND REGRESSION
VERSION 5                MARCH 1972
PROBLEM 1••HULTIVARIATE MUlTIPlE REGRESSION. AND CORRElATION ANALYSIS
PAGE
PROBlEM 1··MUlTIVARIATE MULTIPLE REGRESSION AND CORRElATION ANAlYSIS THE DATA FOR THIS EXAMPLE HAVE BEEN SAMPlED FROM A lARGER SET COlLECTED BY MR. I. LFON SMITH OF YESHIVA. UNIVERSITY, NEW YORK CITY. EACH OF THE 141 ELEVENTH GRADE STUDENTS FROM A WESTERN NEW YORK METROPOLI• TAN AREA OF WHICH 60 WERE SELECTED FOR THIS EXAMPLE, WAS MEASURED ON A.
FOUR MEASURES OF CONVERGENT ACHIEVEMENT, FROM THE TESTS OF READING COMPREHENSION DEVELOPED BY KROPP, STOKER, AND BASHAW 119661. THESE ARE BASED ON THE lEVELS OF THE TAXONOMY OF EDUCATIONAL OBJECTIVES 1BlOOM,19561 AS FOllOWS 1. 2. 3. 4.
B.
TWO MEASURES OF DIVERGENT ACHIEVEMENT, FROM THE SAME BATTERY 1. 2.
c.
KNOWlEDGE COMPREHENSION APPLICATION ANAlYSIS
SYNTHESIS EVAlUATION
THREE MEASURES OF CREATIVITY !TESTS OF THE FACTORS OF SYMBOliC AND SEMANTIC DIVERGENT•PRODUCTION ABiliTIESl FROM GUILFOR0119671. 1. 2. 3.
CONSEQUENCES • OBVIOUS CONSEQUENCES • REMOTE POSSIBlE JOBS
1 n I
"'
D.
ONE MEASURE OF GENERAL INTELLIGENCE AS PROVIUED BY THE LORGE-THORNDIKE MULTI•LEVEL INTELLIGENCE TEST, LEVEL G, FORM 1.
THE HYPOTHESES OF THE STUDY CONCERN THE EXTENT TO WHICH INTELLIGENCE, CREATIVITY, AND THE INTERACTIONS OF THE TWO RELATE TO THE TWO SETS OF ACHIEVEMENT MEASURES. THIS EXAMPLE WILL EMPLOY ONLY THE DIVERGENT ACHIEVEMENT MEASURES AS CRITERION VARIABL~S, ALTHOUGH THE DATA CARDS CONTAIN All SCORES. THE VARIABLES READ BY PHASE I OF THE PROGRAM ARE THE TWO DIVERGENT ACHIEVEMENT MEASURES, THE CREATIVITY MEASURES, AND INTELLIGENCE. THE CREATIVITY AND INTELLIGENCE MEASURES ARE STANDARDIZED AND THEIR CROSS-PRODUCTS FDRMED TO YIELD THE ADDITIONAL INTERACTION MEASURES, THROUGH THE USE OF THE TRANSFORMATION FEATURES, REGRESSION TECHNIQUES ARE USED TO DETERMINE THE EXTENT TO WHICH CREATIVITY, INTELLIGENCE, AND FINAllY THEIR INTERACTIONS CONTRIBUTE TO VARIA· TION IN THE CRITERION MEASURES. THE DATA CARDS ARE PUNCHED AS FOLLOWS COLUI'N 1-3 5
7-10
SUBJECT IDENHFICATION NUMBER SEX (H,Fl
19-22 23-26 27-30
KNOWLEDGE COMPREHENSION APPLICATION ANALYSIS SYNTHESIS !SYNTHI EVALUATION !EVALI
36•39 40-43 ..4-47
CONSEQUENCES-OBVIOUS ICONOBVI CONSEQUENCES-REMOTE ICONRMTI POSSIBLE JOBS (JOBSI
48-57
INTELLIGENCE !INTEL!
11-14 15-U
THIS RUN USES •• OATA FORM II All DATA LISTED OPTIONAL PRINTED OUTPUT TRANSFORMATIONS !HORE VARIABLES AFTER TRANSFORMATIONS! REGRESSION FEATURES ONLY COVARIATE !PREDICTOR! GROUPING KEY--NOTE INTELLIGENCE IS FIRST PREDICTOR IN REGRESSION ANALYSES CANONICAL CORRELATION PRINCIPAL COMPONENTS OF CORRELATION MATRIX OF All DIVERGENT MEASURES INCLUDING CREATIVITY !FIRST ANALYSIS RUNI THERE IS ONLY A SINGLE GROUP OF OBSERVATIONS, FOR REGRESSION ANALYSIS. THE DESIGN IS SPECIFIED AS A ONE·WAY, ONE•LEVEL DESIGN ON THE INPUT DESCRIPTION AND FACTOR IDENTIFICATION CAROS. THE ESTIMATION SPECIFICATION CARD IS BlANKt AS THERE IS NO ANALYSIS-OF-VARIANCE HODEL. NO MEAN CONTRASTS ARE COOED. THE RANK OF THE REGRESSION HODEL IS SPECIFIED ONLY BY INDICATING THE NUMBER OF PREDICTORS ON THE VARIABLE SELECT KEY. THE FIRST ANALYSIS HAS NO PREDICTORS, BUT ONLY SYNTHESIS, EVALUATION. AND CREATIVITY MEASURES AS CRITERIA, SO THAT PRINCIPAL COMPONENTS CAN BE EXTRACTED FROM THE CORRELATION ~ATRIX, THE SECOND ANALYSIS HAS SYNTHESIS AND EVAlUATION AS CRITERIA, PlUS SEVEN
Q
I
"'
PREDICTORS FOR REGRESSIO~, IN THE ORDER INTELLIGENCE-CREATIVITY-INTERACTIONS. A {COVARIATE) GROUPING KEY IS ENTERED SO THAT INSTEAD OF TESTING THE CONTRIBUTION OF EACH PREDICTOR TO REGRESSION, THEY ARE TESTED IN SETS, THAT IS, THE FIRST PREDICTOR fiNTELLIGENCEl IS TESTED AlONE, THE NEXT TH.REE !CREATIVITY) .JOINTLY, AND THE AOOITIONAl REGRESSION OF THE lAST THREE !INTERACTIONS' JOINTLY. THE THIRD ANALYSIS SELECTS ONLY INTElliGENCE AS A PREDICTOR, SINCE IT IS THE O~LY SIGNIFICANT INDEPENDENT VARIABLE, THE BEST ESTIMATE OF THE REGRESSION WEIG~TS AND STANDARD ERRORS ARE OBTAINED FOR THIS VARIABLE ALONE. FOR EACH ANALYSIS, A HYPOTHESIS TEST CARD WITH A PERIOD 1.1 ONLY IS ENTERED, TO INCICATE THAT NO ANALYSIS-OF-VARIANCE EFFECTS ARE TO BE TESTED. INPUT PARAMETERS PAGE NUMBER OF VARIABLES IN INPUT VECTORS:
NUMBER OF VARIABLES IN INPUT VECTORS =   6
NUMBER OF FACTORS IN DESIGN =   1
NUMBER OF LEVELS OF FACTOR 1 =   1
NUMBER OF VARIABLES AFTER TRANSFORMATIONS =   9
INPUT IS FROM CARDS. DATA OPTION 2.
DATA WILL BE LISTED
MINIMAL PAGE SPACING WILL BE USED
ADDITIONAL OUTPUT WILL BE PRINTED
FORMAT OF DATA  (22X2F4.0,5X3F4.0,F10.0)          VARIABLE FORMAT

TRANSFORMATIONS                                                                      PAGE 3
[The twelve transformation cards are echoed: variables 3-6 are each transformed by (X+C), with C equal to the negative of the variable's mean, and then by (X*C), with C equal to the reciprocal of its standard deviation; variables 7-9 are then formed by (X*Y) as products of variables 3, 4, and 5, respectively, with variable 6.]
SUBJECT 1, CELL 1 ... SUBJECT 60, CELL 1
[The listing prints each of the 60 observations before and after the transformations. The scanned values are too degraded to reproduce here.]
GROUP 1
CORRELATION MATRIX
            1 SYNTH   2 EVAL    3 CONOBV  4 CONRMT  5 JOBS    6 INTEL   7 CO X I  8 CR X I  9 JB X I
1 SYNTH    1.000000
2 EVAL      .604724  1.000000
3 CONOBV    .207282   .168527  1.000000
4 CONRMT    .342994   .386288   .057210  1.000000
5 JOBS      .396645   .320429   .542056   .425944  1.000000
6 INTEL     .640434   .540930   .094125   .411636   .465543  1.000000
7 CO X I    .051836   .225035  -.060543   .366906   .023523   .110019  1.000000
8 CR X I    .331608   .287707   .236085   .491823   .310643   .364512   .330603  1.000000
9 JB X I    .295356   .271077   .019241   .394895   .242220   .284085   .360788   .701642  1.000000
CELL IDENTIFICATION AND FREQUENCIES
CELL    FACTOR LEVELS    N
1       1                60
TOTAL N = 60

TOTAL SUM OF CROSS-PRODUCTS
[The 9 x 9 matrix is only partly legible in this copy. Its leading entries are 569.0000 (SYNTH), 322.0000 (SYNTH, EVAL), and 301.0000 (EVAL); each standardized variable has a total sum of squares of 59.0000.]
OBSERVED CELL MEANS --- ROWS ARE CELLS-COLUMNS ARE VARIABLES
   1 SYNTH    2 EVAL    3 CONOBV   4 CONRMT   5 JOBS     6 INTEL    7 CO X I   8 CR X I   9 JB X I
   2.550000   1.383333  .000000    -.000000   -.000000   -.000000   .092556    .404776    .457784

OBSERVED CELL STD DEVS--ROWS ARE CELLS-COLUMNS VARIABLES
   1.741070   1.776415  1.000000   1.000000   1.000000   1.000000   .857159    1.332134   1.047919
ESTIMATION PARAMETERS
RANK OF THE BASIS = [value not legible]
RANK OF MODEL FOR SIGNIFICANCE TESTING = -0
RANK OF THE MODEL TO BE ESTIMATED IS -0
ERROR TERM TO BE USED IS (WITHIN CELLS)
VARIANCE-COVARIANCE FACTORS AND CORRELATIONS AMONG ESTIMATES WILL BE PRINTED

ERROR SUM OF CROSS-PRODUCTS
[9 x 9 within-cells matrix; only partly legible. Its first diagonal elements are 178.8500 (SYNTH) and 186.1833 (EVAL); each standardized variable has an error sum of squares of 59.0000.]

ERROR VARIANCE-COVARIANCE MATRIX and ERROR CORRELATION MATRIX
[The error correlation matrix is identical to the correlation matrix printed above, since the design has a single cell.]

VARIABLE     VARIANCE (ERROR MEAN SQUARES)   STANDARD DEVIATION
1 SYNTH      3.031356                        1.7411
2 EVAL       3.155650                        1.7764
3 CONOBV     1.000000                        1.0000
4 CONRMT     1.000000                        1.0000
5 JOBS       1.000000                        1.0000
6 INTEL      1.000000                        1.0000
7 CO X I     .734721                         .8572
8 CR X I     1.774582                        1.3321
9 JB X I     1.098134                        1.0479

D.F. = 59    ERROR TERM FOR ANALYSIS OF VARIANCE (WITHIN CELLS)

ANALYSIS OF VARIANCE                                                                 PAGE 6
DEPENDENT VARIABLE(S)   1 SYNTH   2 EVAL   3 CONOBV   4 CONRMT   5 JOBS
PRINCIPAL COMPONENTS OF CORRELATION MATRIX WILL BE PRINTED
LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS = 2.13087565E+01

PRINCIPAL COMPONENTS -- VARIABLES X COMPONENTS (ROWS X COLS)
            1          2          3          4          5
SYNTH    -.764736    .285066   -.371511    .403057   -.182671
EVAL     -.739229    .391512   -.328308   -.401261    .177377
CONOBV   -.522533   -.770283   -.157177   -.197274   -.264578
CONRMT   -.641226    .326256    .652841   -.093473   -.217816
JOBS     -.774647   -.405507    .245676    .195458    .370034

VECTOR                  1          2          3          4          5
EIGENVALUE              2.415573   1.098760   .757070    .409323    .319275
PER CENT OF VARIATION   48.3115    21.9752    15.1414    8.1865     6.3855
COMPUTED FROM CORRELATION MATRIX
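Each per cent of variation is the corresponding eigenvalue divided by the number of variables analyzed (a check on the printed values, added here):

$$100\,\frac{\lambda_1}{p} = 100\,\frac{2.415573}{5} = 48.31, \qquad \sum_{j=1}^{5}\lambda_j = 5.00.$$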
ANALYSIS OF VARIANCE                                                                 PAGE 7
DEPENDENT VARIABLE(S)   1 SYNTH   2 EVAL
INDEPENDENT VARIABLE(S) (PREDICTOR VARIABLES, COVARIATES)   6 INTEL   3 CONOBV   4 CONRMT   5 JOBS   7 CO X I   8 CR X I   9 JB X I
CANONICAL CORRELATION ANALYSIS WILL BE PERFORMED
PREDICTORS WILL BE ADDED TO THE STEP-WISE REGRESSION ACCORDING TO USERS KEY

REGRESSION ANALYSIS                                                                  PAGE 8
SUM OF PRODUCTS - CRITERIA
          SYNTH       EVAL
SYNTH    178.8500
EVAL     110.3500   186.1833

SUM OF PRODUCTS - PREDICTORS BY CRITERIA
         INTEL      CONOBV     CONRMT     JOBS       CO X I     CR X I     JB X I
SYNTH   65.78767   21.29279   35.23356   40.74482    4.56416   45.37776   31.79391
EVAL    56.59405   17.66306   40.48621   33.58366   20.21662   40.10938   29.77261

SUM OF PRODUCTS - PREDICTORS
           INTEL      CONOBV     CONRMT     JOBS       CO X I     CR X I     JB X I
INTEL     59.0000
CONOBV     5.5534    59.0000
CONRMT    24.2865     3.3754    59.0000
JOBS      27.4670    31.9813    25.1307    59.0000
CO X I     5.5639    -3.0618    18.5553     1.1896    43.3486
CR X I    28.6491    18.5553    38.6553    24.4153    22.2725   104.7003
JB X I    17.5642     1.1896    24.4153    14.9758    19.1202    57.7888    64.7899
INVERSE SUM OF PRODUCTS OF PREDICTORS CXtXIINV 1
2
INTEL
1 2
3 4
5 6 7
INTEL CONOBV CONRHT JOAS CO X I CR X I JB X I
CONOBV
4
GONRMT
JOBS
5
6
7
CO X
CR X
JB X
25. 0985291"-03 56.153643E-04 2q.5s1onE-o3 -33. 234177E-04 71.910 842E-04 30.161261E-03 -11. &52612E-03 -19.919207E-03 -12.2278?5E-03 37.655447E-03 89 .195231E-05 25.823260E-06 -83.67148&£-04 45.513385E-04 29.619949E-03 -45. 8&45q7E-04 -92.392157£-04 -77.667770E-04 49.626&78E-04 -14.241675E-04 24.547045E-03 86. &27742E-05 80.b272%E-04 H. 261669£-04 -63.870281E-04 -56.121396E-04 -18.281505E-03
33. 877380E-03
0 I
SUH OF PRODUCTS - REGRESSION
1 SYNTH 1 2
SYNTH EVAL
81.18066 69.41264
2 EVAL 67.21607
RAW R<'GRESSION COEFFICIENTS -
1 SYNTH 1 2 3
INTEL CONOBV CONRHT
4 5
JOBS CO X I CR X I J!l X I
&
7
2 EVAL
1.002351
.8~5820
• 277615
.331934 .316332 -.135000 .239640 -.158042 .203666
.160021 -.036251 -.15799q -.04'761 • 247632
INDEPENDENT X DEPENDENT VARS
STANDARDIZED REGRESSION COEFFICIENTS -
INDEP X DEPENDENT VAR
1-'
""
1
SYNTH 1 2 3 4 5 6
7
INTEL CONOBV CONRMT JOBS CO X I CR X I JR X I
.?75707
.159450 .09190'! -. 020821 -. 077785 -. 033483 .14'!044
2
EVAL .481768 .1868S6 .178073 -.075'!96 .1156l!2 -.118516 .120144
STANDARD ERRORS OF RAW REGRESSION COEFS - IND X DEP VARIABLES
            INTEL      CONOBV     CONRMT     JOBS       CO X I     CR X I     JB X I
1 SYNTH    .217121    .235594    .238014    .266650    .235868    .214722    .252250
2 EVAL     .239627    .260015    .262686    .294291    .260318    .236980    .278399
ERROR SUM OF PRODUCTS ADJUSTED FOR PREDICTORS
           1 SYNTH   2 EVAL
1 SYNTH   97.6691
2 EVAL    40.9374  118.9673
ERROR VAR-COV MATRIX ADJUSTED FOR PREDICTORS
           1 SYNTH    2 EVAL
1 SYNTH   1.878253
2 EVAL     .787257   2.287832
MATRIX OF CORRELATIONS WITH PREDICTORS ELIMINATED
           SYNTH      EVAL
SYNTH     1.000000
EVAL       .379776   1.000000
VARIABLE   VARIANCE (ERROR MEAN SQUARES)   STANDARD DEVIATION
1 SYNTH        1.878253                        1.3705
2 EVAL         2.287832                        1.5126
D.F. = 52
ERROR TERM FOR ANALYSIS OF COVARIANCE (WITHIN CELLS)
7 COVARIATE(S) HAVE BEEN ELIMINATED
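The adjustment carried out above follows the usual partitioning of the error sum of products: the criterion matrix is reduced by the sum of products due to regression on the covariates. A minimal numpy sketch of that step, offered as an illustration rather than the program's own code:

    import numpy as np

    def adjust_for_covariates(e_yy, e_yx, e_xx):
        # Error SSCP of the criteria with the predictors partialed out:
        #   E_adj = E_yy - E_yx @ inv(E_xx) @ E_xy
        return e_yy - e_yx @ np.linalg.solve(e_xx, e_yx.T)

For example, the SYNTH diagonal above is 178.8500 - 81.18066 = 97.6691, the criterion sum of products minus the "sum of products - regression" entry.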
LOG-DETERMINANT ERROR SUM OF PRODUCTS BEFORE ADJUSTMENT FOR PREDICTORS = 9.95805938E+00   AFTER = 9.20468068E+00
STATISTICS FOR REGRESSION ANALYSIS WITH 7 PREDICTOR VARIABLE(S)        PAGE 10
VARIABLE   SQUARE MULT R   MULT R      F     P LESS THAN   STEP DOWN F   P LESS THAN
1 SYNTH       .4539        .6737    6.1745      .0001         6.1745        .0001
2 EVAL        .3610        .6009    4.1971      .0010         1.1657        .3388
DEGREES OF FREEDOM FOR HYPOTHESIS = 7   DEGREES OF FREEDOM FOR ERROR = 52
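The step-down F's printed above test each criterion after eliminating the criteria ordered before it, so EVAL's step-down F (1.1657) is its regression F with SYNTH treated as an additional covariate. A hedged sketch of that computation, assuming an intercept in every regression:

    import numpy as np

    def rss(y, X):
        # Residual sum of squares of y regressed on an intercept plus X.
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        r = y - X1 @ beta
        return r @ r

    def step_down_f(Y, X):
        # Y: n x p criteria in the chosen order; X: n x q predictors.
        n, q = Y.shape[0], X.shape[1]
        for i in range(Y.shape[1]):
            prior = Y[:, :i]                      # criteria already eliminated
            full = rss(Y[:, i], np.hstack([X, prior]))
            reduced = rss(Y[:, i], prior)
            df2 = n - 1 - q - i
            yield ((reduced - full) / q) / (full / df2), (q, df2)

With N = 60, q = 7, the step-down error degrees of freedom are 52 for SYNTH and 51 for EVAL, matching the pattern of the table.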
F VALUE FOR TEST OF HYPOTHESIS OF NO ASSOCIATION BETWEEN DEPENDENT AND INDEPENDENT VARIABLES = 3.339
D.F. = 14 AND 102.0000   P LESS THAN .0003
(LIKELIHOOD RATIO = 4.70773260E-01   LOG = -7.53378702E-01)
CANONICAL CORRELATION ANALYSIS
CANONICAL CORRELATION 1 = .7063   SQUARE CORRELATION = .4989
CANONICAL CORRELATION 1 ACCOUNTS FOR 24.9426 PERCENT OF VARIATION IN DEPENDENT VARIABLES
COEFFICIENTS FOR DEPENDENT VARIABLES
VARIABLE        RAW       STANDARDIZED   CORRELATION WITH CANONICAL VARIATE
1 SYNTH       .404435       .7042            .9473
2 EVAL        .226368       .4021            .8279
COEFFICIENTS FOR INDEPENDENT VARIABLES
3 INTEL       .848249       .8482            .9465
4 CONOBV      .265350       .2654            .3026
5 CONRMT      .193014       .1930            .5619
6 JOBS       -.064025      -.0640            .5779
7 CO X I     -.013668      -.0117            .1798
8 CR X I     -.075711      -.1009            .4944
9 JB X I      .207072       .2170            .4486
CANONICAL CORRELATION 2 = .2462   SQUARE CORRELATION = .0606
CANONICAL CORRELATION 2 ACCOUNTS FOR 3.0301 PERCENT OF VARIATION IN DEPENDENT VARIABLES
COEFFICIENTS FOR DEPENDENT VARIABLES
VARIABLE        RAW       STANDARDIZED   CORRELATION WITH CANONICAL VARIATE
1 SYNTH      -.597079     -1.0396          -.3203
2 EVAL        .669585      1.1895           .5608
COEFFICIENTS FOR INDEPENDENT VARIABLES
3 INTEL      -.103336      -.1033          -.0908
4 CONOBV      .229512       .2295          -.0610
5 CONRMT      .472291       .4723           .4180
6 JOBS       -.279271      -.2793          -.1267
7 CO X I     1.035028       .8872           .6684
8 CR X I     -.323728      -.4312          -.0102
9 JB X I     -.046651      -.0489           .0625
TOTAL PERCENTAGE OF VARIATION IN DEPENDENT VARIABLES ACCOUNTED FOR = 27.9729
TEST OF SIGNIFICANCE OF CANONICAL CORRELATIONS
FOR CORRELATIONS 1 THROUGH 2, CHI SQUARE = 40.6825 WITH 14 DEGREES OF FREEDOM   P LESS THAN .0002
(LIKELIHOOD RATIO = 4.70773253E-01   LOG = -7.53378717E-01)
FOR CORRELATIONS 2 THROUGH 2, CHI SQUARE = 3.3759 WITH 6 DEGREES OF FREEDOM   P LESS THAN .7605
(LIKELIHOOD RATIO = 9.39398274E-01   LOG = -6.25157424E-02)
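The chi-square statistics above are Bartlett's tests of the canonical correlations. A hedged sketch of the computation, assuming N = 60 observations, p = 2 criteria, and q = 7 predictors as in this run:

    import numpy as np
    from scipy import stats

    N, p, q = 60, 2, 7
    r2 = np.array([0.4989, 0.0606])      # squared canonical correlations above
    for k in range(len(r2)):
        lam = np.prod(1.0 - r2[k:])      # likelihood ratio for roots k+1, ..., s
        chi2 = -(N - 1 - (p + q + 1) / 2.0) * np.log(lam)
        df = (p - k) * (q - k)
        print(k + 1, chi2, df, stats.chi2.sf(chi2, df))
    # Roots 1-2: chi-square about 40.7 on 14 d.f.; root 2 alone: about 3.38 on 6 d.f.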
STEP-WISE REGRESSION TO ANALYZE THE CONTRIBUTION OF EACH INDEPENDENT VARIABLE
ADDING VARIABLE 1 (INTEL) THROUGH 1 (INTEL) TO THE REGRESSION EQUATION
F = 23.0717 WITH 2 AND 57.0000 D.F.   P LESS THAN .0001
(LIKELIHOOD RATIO = 5.52628630E-01   LOG = -5.93069112E-01)
VARIABLE   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES   PERCENT OF ADDITIONAL VARIANCE ACCOUNTED FOR
1 SYNTH      40.3309         .0001        40.3309        .0001      (73.3563/ 1.8189)           41.0155
2 EVAL       23.9910         .0001         3.8386        .0550      ( 7.4514/ 1.9412)           29.2605
D.F. = 1 AND 58
ADDING VARIABLE 2 (CONOBV) THROUGH 4 (JOBS) TO THE REGRESSION EQUATION
F = .8749 WITH 6 AND 108.0000 D.F.   P LESS THAN .5160
(LIKELIHOOD RATIO = 9.09440300E-01   LOG = -9.49259241E-02)
VARIABLE   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES   PERCENT OF ADDITIONAL VARIANCE ACCOUNTED FOR
1 SYNTH       .9448         .4254         .9448         .4254      (1.7233/ 1.8241)             2.8907
2 EVAL       1.3296         .2741         .8224         .4873      (1.6115/ 1.9595)             4.7835
D.F. = 3 AND 55
ADDING VARIABLE 5 (CO X I) THROUGH 7 (JB X I) TO THE REGRESSION EQUATION
F = .5649 WITH 6 AND 102.0000 D.F.   P LESS THAN .7574
(LIKELIHOOD RATIO = 9.36708011E-01   LOG = -6.53836660E-02)
VARIABLE   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES   PERCENT OF ADDITIONAL VARIANCE ACCOUNTED FOR
1 SYNTH       .4711         .7038         .4711         .7038      ( .8849/ 1.8783)             1.4843
2 EVAL        .5583         .6450         .6684         .5754      (1.3344/ 1.9962)             2.0580
D.F. = 3 AND 52
CORE USED FOR DATA = 239 LOCATIONS OUT OF 3000 AVAILABLE
PROBLEM 2 -- ONE-WAY ANALYSIS OF VARIANCE, ANALYSIS OF COVARIANCE, EQUAL SAMPLE SIZES
9 1 3 1 1 1 2      INPUT DESC CARD
TASK 4             FACTOR ID CARD
PROBLEM 2--ONE WAY ANALYSIS OF VARIANCE AND COVARIANCE WITH EQUAL SAMPLE SIZES (12 OBSERVATIONS PER GROUP.) THE DATA FOR THIS EXAMPLE HAVE BEEN COLLECTED UNDER THE DIRECTION OF DR. THOMAS J. SHUELL, FACULTY OF EDUCATIONAL STUDIES, STATE UNIVERSITY OF NEW YORK AT BUFFALO. THE TASK FOR EACH OF THE FOUR EXPERIMENTAL GROUPS OF 12 COLLEGE SENIORS WAS TO SORT 50 CARDS, EACH PRINTED WITH A SINGLE WORD, INTO A GIVEN NUMBER OF WORD CATEGORIES (EITHER 5 OR 10.) THE FOUR EXPERIMENTAL GROUPS DIFFERED IN THE NUMBER AND STRUCTURE OF CATEGORIES OF WORDS AS ORIGINALLY BUILT INTO THE SET OF 50 WORDS BY THE EXPERIMENTER, AND IN THE INFORMATION ABOUT PRE-EXISTING STRUCTURE GIVEN THE SUBJECT. THE OUTCOME VARIABLES OF PRIMARY INTEREST ARE THE TOTAL NUMBER OF WORDS RECALLED BY THE SUBJECT AFTER SIX TRIALS AT SORTING THE WORDS, AND THE PERCENTAGE OF THE EXPERIMENTERS ORIGINALLY INTENDED CATEGORIES WHICH WERE RE-CREATED BY THE SUBJECTS, AFTER THE SAME NUMBER OF TRIALS. THE EXPERIMENTAL CONDITIONS WERE ARRANGED AS FOLLOWS--
1. SUBJECTS TOLD TO SORT WORDS INTO FIVE MAJOR CATEGORIES, OF WHICH EACH MAJOR CATEGORY CONTAINED TWO SUB-CATEGORIES OF WORD CONCEPTS.
2. SUBJECTS TOLD TO SORT WORDS INTO TEN CATEGORIES, BUT NOT TOLD OF SUB-CATEGORICAL STRUCTURE. (GIVEN SAME WORD LIST AS GROUP 1)
3. SUBJECTS TOLD TO SORT WORDS INTO FIVE CATEGORIES AND ARE NOT TOLD OF ANY HIERARCHICAL STRUCTURE. (GIVEN SAME WORD LISTS AS GROUPS 1 AND 2.)
4. SUBJECTS TOLD ONLY TO GROUP WORDS INTO FIVE WORD OR CONCEPT CATEGORIES. HERE A DIFFERENT ORIGINAL WORD LIST IS GIVEN, WHICH DOES NOT CONTAIN THE PREDETERMINED SUBCATEGORICAL STRUCTURE.
A NUMBER OF RECOGNIZED AND POSSIBLY CONFOUNDING FACTORS ARE CONTAINED IN THE STUDY. THE ASSUMPTION HAS BEEN MADE, BASED ON EARLIER STUDIES, THAT SEX OF SUBJECT (OR EXPERIMENTER) DOES NOT AFFECT THE RESULTS. IN EACH GROUP THERE ARE MORE FEMALE THAN MALE SUBJECTS. SECOND, TWO PARALLEL WORD LISTS WERE USED, OF WHICH EACH WAS GIVEN AS STIMULUS TO A RANDOM HALF OF EACH EXPERIMENTAL GROUP. IN THIS EXAMPLE, THE ANALYSIS HAS NOT BEEN DESIGNED TO INCLUDE A WORD LIST EFFECT, AS PRIOR STUDIES HAVE INDICATED THAT THE LISTS USED YIELD COMPARABLE RESULTS. THIRD, A FACTOR IN THE MEMORIZATION OF THE 50 WORDS AND REPRODUCTION OF THE EXPERIMENTERS WORD CATEGORIES IS THE TIME THE SUBJECT USES AT EACH OF THE SIX TRIALS. THESE HAVE BEEN RECORDED, AND FOLLOWING THE ANALYSIS OF VARIANCE FOR THE FOUR GROUPS, AN ANALYSIS OF COVARIANCE, USING TIMES AT THE SECOND, FOURTH, AND SIXTH TRIAL AS COVARIATES, IS PERFORMED. IT IS ASSUMED THAT THESE TIME MEASURES ADEQUATELY REPRESENT THE TIMES ON TRIALS TWO THROUGH SIX, A HYPOTHESIS WHICH MAY BE TESTED. TIME AT TRIAL ONE IS SUBJECT TO ADDITIONAL SOURCES OF EXTRANEOUS VARIATION, AND IS EXCLUDED.
THE DATA CARDS ARE PUNCHED AS FOLLOWS--
CARD COLUMN
1        EXPERIMENTAL CONDITION (1,2,3,4)
2-3      WORD LIST (1,2)
4-6      SUBJECT IDENTIFICATION
7-8      SEX (1=M) (2=F)
9-13     NUMBER OF WORDS RECALLED TRIAL 6
14-18    NUMBER OF EXPERIMENTERS CATEGORIES RECONSTRUCTED TRIAL 6
19-21    NUMBER OF CATEGORIES IN PROBLEM (10 FOR GROUPS 1 AND 2) (5 FOR GROUPS 3 AND 4)
22-27    TIME TRIAL 1 (SECONDS)
28-33    TIME TRIAL 2
34-39    TIME TRIAL 3
40-45    TIME TRIAL 4
46-51    TIME TRIAL 5
52-57    TIME TRIAL 6
THE NUMBER OF EXPERIMENTERS CATEGORIES RECONSTRUCTED BY THE SUBJECTS IS PUNCHED IN COLUMNS 14-18 OF THE DATA CARDS. HOWEVER SOME OF THE EXPERIMENTAL CONDITIONS WERE DEFINED TO HAVE 5 CATEGORIES, AND OTHERS 10. THE NUMBER OF CATEGORIES ORIGINALLY DEFINED IS PUNCHED IN COLUMNS 19-21. TO MAKE RESPONSES COMPARABLE ACROSS CONDITIONS, THE NUMBER OF CATEGORIES RECONSTRUCTED IS TRANSFORMED TO A PROPORTION, BY DIVIDING BY THE NUMBER IN THE PROBLEM. THIS RUN USES --
DATA FORM III
OPTIONAL PRINTED OUTPUT
TRANSFORMATIONS (TO CONVERT NUMBER OF CATEGORIES RECONSTRUCTED TO PROPORTION OF THOSE IN PROBLEM)
ARBITRARY CONTRAST MATRIX TO PROVIDE ESTIMATES OF PARTICULAR EFFECTS IN THE DESIGN
ANALYSIS OF COVARIANCE
ESTIMATED AND ESTIMATED COMBINED ADJUSTED MEANS
TEST OF REGRESSION PARALLELISM
ERROR TERM IS RESIDUAL (IDENTICAL TO WITHIN IN THIS CASE)
THE ANALYSIS-OF-VARIANCE MODEL FOR SIGNIFICANCE TESTING HAS RANK 4 (1 D.F. FOR THE GRAND MEAN, PLUS 3 BETWEEN GROUPS). ARBITRARY CONTRASTS ARE USED SO THAT ESTIMATES OF THE SPECIFIC MEAN DIFFERENCES MAY BE OBTAINED, WHICH DO NOT CONFORM TO THE C, D, H, P CONTRASTS AVAILABLE. THESE ARE THE CONSTANT AND THE COMPARISONS OF GROUPS 1 AND 2, 2 AND 3, AND 3 AND 4, RESPECTIVELY. THEY ARE CODED L0, L1, L2, AND L3. THE HYPOTHESIS TEST CARD IS -1, 3., INDICATING THAT THE TEST OF THE FIRST 1 DEGREE OF FREEDOM (CONSTANT OR L0) IS BYPASSED. THE SUBSEQUENT 3 DEGREES OF FREEDOM (BETWEEN GROUP CONTRASTS L1, L2, AND L3) ARE TESTED TOGETHER AS ONE OVERALL HYPOTHESIS. THE RANK OF THE MODEL FOR ESTIMATION IS ALSO SET TO 4, SO THAT LEAST-SQUARES ESTIMATES OF ALL FOUR CONTRAST VALUES MAY BE OBTAINED (BOTH BEFORE AND AFTER COVARIATE ADJUSTMENT). SINCE THIS EXHAUSTS ALL BETWEEN-GROUP DEGREES OF FREEDOM, THE OBSERVED AND PREDICTED MEANS ARE IDENTICAL, AND THE MEAN RESIDUALS ARE NULL. THE MEANS KEY IS 1, TO YIELD ESTIMATED AND COVARIATE-ADJUSTED ESTIMATED MEANS FOR EACH LEVEL OF THE FIRST (ONLY) DESIGN FACTOR. THE FIRST ANALYSIS COMPARES GROUP MEANS ON THE TWO MEASURES, NUMBER OF WORDS RECALLED AND PROPORTION OF CATEGORIES RECONSTRUCTED. THE COVARIATE RUN IS THE SECOND ANALYSIS. THE SAME DEPENDENT VARIABLES ARE SELECTED, WITH THREE TIME
ME~SURES AS COVARIATES. THE REGRESSION ANA,YSIS IS PERFORMED AUTOHATICAL'Y WHEN COVARIATES ARE INDICATED. THE SAHE ANALYSIS-OF-VARIANCE EFFECTS ARE ASSUMED, BUT ALL HATRICES.ARE ADJUSTED FOR THE CONCOMITANT MEASURES. FINISH 18X2F5.0,F3.0,6F6.0l VARIAB'E FORMAT TRANSFORMATION 2 19 z 3 BLANK-END OF TRANS NCAT TIME 1TIME 2TIHE 3TIME ~TIHE 5TIHE 6 WORDS CATS VARIABLE LABELS 12 12 12 FREQUENCIES 12 50 85 83 79 75 1 1 8 2 10 10 77 100 10 10 136 87 1 1 18 2 36 132 106 109 116 11t4 1 1 28 1 31 9 10 ZlB 115 111 109 115 35 9 10 1 1 ItO 2 137 133 10ft 102 10 0 91 1 1 50 1 lt3 10 10 188 201 216 125 137 147 1 1 58 2 49 10 10 160 161 16ft 129 150 167 148 1 ? 4 2 31 9 10 185 178 169 225 135 1 2 11t 2 33 9 10 285 338 330 129 125 103 1 2 24 1 36 10 10 132 112 182 91 151 86 153 1 2 31 2 Itt 10 10 126 11t9 133 7lt 90 110 1 2 45 1 45 10 10 162 12& 122 94 100 1 z 55 1 lt8 10 10 123 129 85 1 .. 9 118 129 2 1 5 2 50 10 10 130 130 1Zit 110 115 101 2 1 15 z 323 317 232 302 275 lt9 10 tO 212 2 1 25 2 lt4 10 10 146 litO 117 119 128 110 2 1 33 1 31 7 10 18~ 1"39 131 121 138 136 2 1 44 1 lt7 10 10 140 75 82 95 11t2 94 2 1 52 2 38 10 10 2Z8 228 230 13& 162 219 38 10 10 179 11t5 1&0 139 151 2 2 9 2 173 2 2 17 2 ~8 10 10 116 112 118 99 127 112 45 107 2 2 26 2 10 10 130 91 85 93 128 2 2 37 1 81 lt8 10 10 131 11t5 122 111 102 2 z lt8 2 8 10 142 35 123 96 133 208 129 161 2 2 59 2 810 156 162 224 33 1'+1 !92 3 1 7 2 44 5 5 170 113 104 123 98 1Z3 106 83 70 3 t 16 2 lt1 5 5 96 99 72 3 1 27 2 31t 121 122 107 111 97 93 5 5 3 1 39 1 35 137 108 5 5 112 124 113 95 3 1 47 2 9~ 97 80 78 8& 83 40 5 5 41t 185 184 3 1 60 2 5 5 170 210 190 172 3 2 1 2 39 195 115 125 150 130 5 5 110 3 2 12 2 39 5 5 litO 105 99 86 71 94 3 2 21 2 45 112 77 65 68 5 5 85 58 3 2 35 1 5 5 75 80 41 120 105 101 81 3 2 Itt 2 46 15ft 115 99 105 97 5 5 99 3 2 54 1 32 5 5 145 144 73 104 91 96 4 1 3 2 33 110 107 112 5 5 132 94 107 ... 1 11 t 116 36 5 5 H2 107 113 112 110 4 1 22 2 37 59 &0 58 65 5 5 58 '+2 4 1 32 2 lt2 308 132 1&6 1~7 5 5 12~ 111 33 ~ 1 42 1 5 5 132 85 110 159 126 10 0 .. 1 53 2 33 4 5 154 117 133 113 120 114 4 2 10 2 41 5 5 88 72 83 85 102 114 5 ~ 2 20 2 33 5 97 87 89 80 94 96 4 2 30 2 38 5 5 106 94 128 1Z6 108 125 .. 2 38 1 74 5 5 76 88 92 114 100 39 81t 4 2 49 2 28 5 82 8'+ 90 124 119 4 2 57 2 125 136 133 42 5 5 145 1~0 130 4 EST. SPEC. CARD 1 1 2 4 t. MEANS KEY TASK ARBITRARY CONTRAST MATRIX lltF~. Ol FORHAT FOR CONTRASTS
CONTRAST WEIGHTS -- L0 (CONST.), L1 (T1-T2, INFORMATION ON HIERARCHY),
L2 (T2-T3, MORE GLOBAL CATEGORIES), L3 (T3-T4, BUILT IN HIERARCHY)
1,2.          ANALYSIS SELECT CARD / VARIABLE SELECT KEY
-1,3.         HYP TEST CARD
1,2,5,7.,9.   ANALYSIS SELECT CARD--COVAR / VAR SELECT KEY--COVARIANCE
-1,3.         HYP TEST CARD--COVARIANCE
CONTINUE
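The variable format card (8X,2F5.0,F3.0,6F6.0) and the X/Y transformation described above can be mirrored in a few lines. A sketch only; the function name and the use of Python are illustrative assumptions, since the original deck is read by MULTIVARIANCE itself:

    def parse_card(card: str):
        # (8X,2F5.0,F3.0,6F6.0): skip 8 columns, read WORDS and CATS in
        # 5-column fields, NCAT in 3 columns, then six trial times in
        # 6-column fields.
        words = float(card[8:13])
        cats = float(card[13:18])
        ncat = float(card[18:21])
        times = [float(card[21 + 6 * i: 27 + 6 * i]) for i in range(6)]
        # Transformation (X/Y): variable 2 becomes the proportion of the
        # experimenter's categories reconstructed, CATS / NCAT.
        return [words, cats / ncat, ncat] + times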
• • • • • • •   M U L T I V A R I A N C E   • • • • • • •
UNIVARIATE AND MULTIVARIATE ANALYSIS OF VARIANCE, COVARIANCE AND REGRESSION
MARCH 1972 VERSION 5
PROBLEM 2 -- ONE-WAY ANALYSIS OF VARIANCE, ANALYSIS OF COVARIANCE, EQUAL SAMPLE SIZES
PAGE 1
PROBLEM 2--ONE WAY ANALYSIS OF VARIANCE AND COVARIANCE WITH EQUAL SAMPLE SIZES (12 OBSERVATIONS PER GROUP.) THE DATA FOR THIS EXAMPLE HAVE BEEN COLLECTED UNDER THE DIRECTION OF DR. THOMAS J. SHUELL, FACULTY OF EDUCATIONAL STUDIES, STATE UNIVERSITY OF NEW YORK AT BUFFALO. THE TASK FOR EACH OF THE FOUR EXPERIMENTAL GROUPS OF 12 COLLEGE SENIORS WAS TO SORT 50 CARDS, EACH PRINTED WITH A SINGLE WORD, INTO A GIVEN NUMBER OF WORD CATEGORIES (EITHER 5 OR 10.) THE FOUR EXPERIMENTAL GROUPS DIFFERED IN THE NUMBER AND STRUCTURE OF CATEGORIES OF WORDS AS ORIGINALLY BUILT INTO THE SET OF 50 WORDS BY THE EXPERIMENTER, AND IN THE INFORMATION ABOUT PRE-EXISTING STRUCTURE GIVEN THE SUBJECT. THE OUTCOME VARIABLES OF PRIMARY INTEREST ARE THE TOTAL NUMBER OF WORDS RECALLED BY THE SUBJECT AFTER SIX TRIALS AT SORTING THE WORDS, AND THE PERCENTAGE OF THE EXPERIMENTERS ORIGINALLY INTENDED CATEGORIES WHICH WERE RE-CREATED BY THE SUBJECTS, AFTER THE SAME NUMBER OF TRIALS. THE EXPERIMENTAL CONDITIONS WERE ARRANGED AS FOLLOWS--
1. SUBJECTS TOLD TO SORT WORDS INTO FIVE MAJOR CATEGORIES, OF WHICH EACH MAJOR CATEGORY CONTAINED TWO SUB-CATEGORIES OF WORD CONCEPTS.
2. SUBJECTS TOLD TO SORT WORDS INTO TEN CATEGORIES, BUT NOT TOLD OF SUB-CATEGORICAL STRUCTURE. (GIVEN SAME WORD LIST AS GROUP 1)
3. SUBJECTS TOLD TO SORT WORDS INTO FIVE CATEGORIES AND ARE NOT TOLD OF ANY HIERARCHICAL STRUCTURE. (GIVEN SAME WORD LISTS AS GROUPS 1 AND 2.)
4. SUBJECTS TOLD ONLY TO GROUP WORDS INTO FIVE WORD OR CONCEPT CATEGORIES. HERE A DIFFERENT ORIGINAL WORD LIST IS GIVEN, WHICH DOES NOT CONTAIN THE PREDETERMINED SUBCATEGORICAL STRUCTURE.
A NUMBER OF RECOGNIZED AND POSSIBLY CONFOUNDING FACTORS ARE CONTAINED IN THE STUDY. THE ASSUMPTION HAS BEEN MADE, BASED ON EARLIER STUDIES, THAT SEX OF SUBJECT (OR EXPERIMENTER) DOES NOT AFFECT THE RESULTS. IN EACH GROUP THERE ARE MORE FEMALE THAN MALE SUBJECTS. SECOND, TWO PARALLEL WORD LISTS WERE USED, OF WHICH EACH WAS GIVEN AS STIMULUS TO A RANDOM HALF OF EACH EXPERIMENTAL GROUP. IN THIS EXAMPLE, THE ANALYSIS HAS NOT BEEN DESIGNED TO INCLUDE A WORD LIST EFFECT, AS PRIOR STUDIES HAVE INDICATED THAT THE LISTS USED YIELD COMPARABLE RESULTS. THIRD, A FACTOR IN THE MEMORIZATION OF THE 50 WORDS AND REPRODUCTION OF THE EXPERIMENTERS WORD CATEGORIES IS THE TIME THE SUBJECT USES AT EACH OF THE SIX TRIALS. THESE HAVE BEEN RECORDED, AND FOLLOWING THE ANALYSIS OF VARIANCE FOR THE FOUR GROUPS, AN ANALYSIS OF COVARIANCE, USING TIMES AT THE SECOND, FOURTH, AND SIXTH TRIAL AS COVARIATES, IS PERFORMED. IT IS ASSUMED THAT THESE TIME MEASURES ADEQUATELY REPRESENT THE TIMES ON TRIALS TWO THROUGH SIX, A HYPOTHESIS WHICH MAY BE TESTED. TIME AT TRIAL ONE IS SUBJECT TO ADDITIONAL SOURCES OF EXTRANEOUS VARIATION, AND IS EXCLUDED.
THE DATA CARDS ARE PUNCHED AS FOLLOWS--
CARD COLUMN
1        EXPERIMENTAL CONDITION (1,2,3,4)
2-3      WORD LIST (1,2)
4-6      SUBJECT IDENTIFICATION
7-8      SEX (1=M) (2=F)
9-13     NUMBER OF WORDS RECALLED TRIAL 6
14-18    NUMBER OF EXPERIMENTERS CATEGORIES RECONSTRUCTED TRIAL 6
19-21    NUMBER OF CATEGORIES IN PROBLEM (10 FOR GROUPS 1 AND 2) (5 FOR GROUPS 3 AND 4)
22-27    TIME TRIAL 1 (SECONDS)
28-33    TIME TRIAL 2
34-39    TIME TRIAL 3
40-45    TIME TRIAL 4
46-51    TIME TRIAL 5
52-57    TIME TRIAL 6
THE NUMBER OF EXPERIMENTERS CATEGORIES RECONSTRUCTED BY THE SUBJECTS IS PUNCHED IN COLUMNS 14-18 OF THE DATA CARDS. HOWEVER SOME OF THE EXPERIMENTAL CONDITIONS WERE DEFINED TO HAVE 5 CATEGORIES, AND OTHERS 10. THE NUMBER OF CATEGORIES ORIGINALLY DEFINED IS PUNCHED IN COLUMNS 19-21. TO MAKE RESPONSES COMPARABLE ACROSS CONDITIONS, THE NUMBER OF CATEGORIES RECONSTRUCTED IS TRANSFORMED TO A PROPORTION, BY DIVIDING BY THE NUMBER IN THE PROBLEM. THIS RUN USES --
DATA FORM III
OPTIONAL PRINTED OUTPUT
TRANSFORMATIONS (TO CONVERT NUMBER OF CATEGORIES RECONSTRUCTED TO PROPORTION OF THOSE IN PROBLEM)
ARBITRARY CONTRAST MATRIX TO PROVIDE ESTIMATES OF PARTICULAR EFFECTS IN THE DESIGN
ANALYSIS OF COVARIANCE
ESTIMATED AND ESTIMATED COMBINED ADJUSTED MEANS
TEST OF REGRESSION PARALLELISM
ERROR TERM IS RESIDUAL (IDENTICAL TO WITHIN IN THIS CASE)
THE ANALYSIS-OF-VARIANCE MODEL FOR SIGNIFICANCE TESTING HAS RANK 4 (1 D.F. FOR THE GRAND MEAN, PLUS 3 BETWEEN GROUPS). ARBITRARY CONTRASTS ARE USED SO THAT ESTIMATES OF THE SPECIFIC MEAN DIFFERENCES MAY BE OBTAINED, WHICH DO NOT CONFORM TO THE C, D, H, P CONTRASTS AVAILABLE. THESE ARE THE CONSTANT AND THE COMPARISONS OF GROUPS 1 AND 2, 2 AND 3, AND 3 AND 4, RESPECTIVELY. THEY ARE CODED L0, L1, L2, AND L3. THE HYPOTHESIS TEST CARD IS -1, 3., INDICATING THAT THE TEST OF THE FIRST 1 DEGREE OF FREEDOM (CONSTANT OR L0) IS BYPASSED. THE SUBSEQUENT 3 DEGREES OF FREEDOM (BETWEEN GROUP CONTRASTS L1, L2, AND L3) ARE TESTED TOGETHER AS ONE OVERALL HYPOTHESIS. THE RANK OF THE MODEL FOR ESTIMATION IS ALSO SET TO 4, SO THAT LEAST-SQUARES ESTIMATES OF ALL FOUR CONTRAST VALUES MAY BE OBTAINED (BOTH BEFORE AND AFTER COVARIATE ADJUSTMENT). SINCE THIS EXHAUSTS ALL BETWEEN-GROUP DEGREES OF FREEDOM, THE OBSERVED AND PREDICTED MEANS ARE IDENTICAL, AND THE MEAN RESIDUALS ARE NULL. THE MEANS KEY IS 1, TO YIELD ESTIMATED AND COVARIATE-ADJUSTED ESTIMATED MEANS FOR EACH LEVEL OF THE FIRST (ONLY) DESIGN FACTOR. THE FIRST ANALYSIS COMPARES GROUP MEANS ON THE TWO MEASURES, NUMBER OF WORDS RECALLED AND PROPORTION OF CATEGORIES RECONSTRUCTED. THE COVARIATE RUN IS THE SECOND ANALYSIS. THE SAME DEPENDENT VARIABLES ARE SELECTED, WITH THREE TIME MEASURES AS COVARIATES. THE REGRESSION ANALYSIS IS PERFORMED AUTOMATICALLY WHEN COVARIATES ARE INDICATED. THE SAME ANALYSIS-OF-VARIANCE EFFECTS ARE ASSUMED, BUT ALL MATRICES ARE ADJUSTED FOR THE CONCOMITANT MEASURES.
INPUT PARAMETERS                                                       PAGE 2
NUMBER OF VARIABLES IN INPUT VECTORS = 9
NUMBER OF FACTORS IN DESIGN = 1
NUMBER OF LEVELS OF FACTOR 1 (TASK) = 4
NUMBER OF VARIABLES AFTER TRANSFORMATIONS = 9
INPUT IS FROM CARDS, DATA OPTION 3
MINIMAL PAGE SPACING WILL BE USED
ADDITIONAL OUTPUT WILL BE PRINTED
DEBUG OUTPUT WILL BE PRINTED
FORMAT OF DATA (8X,2F5.0,F3.0,6F6.0)   VARIABLE FORMAT
TRANSFORMATIONS                                                        PAGE 3
VARIABLE 2 WILL BE FORMED BY APPLYING TRANSFORMATION (X/Y) TO INPUT VARIABLES
X = V(2), Y = V(3), Z = V(-0), C = -0.00000
FIRST OBSERVATION, SUBJECT 1, CELL 1, GROUP 1
BEFORE TRANSFORMATIONS = 50.0000 10.0000 10.0000 85.0000 83.0000 79.0000 75.0000 77.0000 100.0000
AFTER TRANSFORMATIONS  = 50.0000  1.0000 10.0000 85.0000 83.0000 79.0000 75.0000 77.0000 100.0000
TASK 1
================================================
SUBCLASS COVARIANCE MATRIX
-----------------------------------------------------------1 WORDS 1 2 3 4 5
6 7
8 9
WORDS CATS NCAT TIME 1 TIME 2 TIME 3 TIME 4 TIM~' 5 TillE 6
49.788 .267 • 00 0 -196.788 -176.~48
-126.561 -50.455 -~7.545
-16. 37<J
2 CATS .002 .ooo -1.512 -1.552 -.-848 -.427 -.318 -.148
3 NCAT
.ooo • 0 00 .coo .ooo • 0 00 • 0 00 .ooo
4 TillE 1
2599.333 2992.576 2772.515 743.182 30~.727
420.424
5 TIME 2
6 TIME 3
4229.356 3967.325 864.818 509.091 566.598
4568.255 666.182 527.455 319.992
7 TIME 4
848.727 658.162 365o545
8 TIME 5
9 TIME 6
1449.091 564.182
557.174
SUBCLASS CORRElATION MATRIX
wo~ns
1 2 3 4 5 6 7
8 9
WORDS CATS NCAT TIME 1 TIME 2 TIME ~ TillE 4 TillE 5 TillE 6 GROUP
1.000000 • 767572 .oooooo -.547023 -.384~02
-. 262518 -.?45445 -.139781 -. 098339
2 CATS
3 NCAT
4 TIHE 1
1.000000 c.oooooo -.602376 -.484541 -.252219 -.297874 -.169762 -.127761
1.000000 .oooooo .oooooo .oooooo .oooooo .oooooo .oooooo
1.000000 .902562 .795912 .500357 .156497 .~49350
5 TillE 2
6 TIME 3
TIME 4
8 TillE 5
9 TIME 6
1.000000 • 8g2859 • 456461 .205641 .369099
1.000000 .334681 .202796 .198411
1.000000 .593491 • 531572
1.000000 .627879
1.oooooo
TASK 2
================================================
SUBCLASS COVARIANCE MATRIX
WORDS 1 2 3
4 5 6 7 8 9
WORDS CATS NCAT TIME 1 TIME 2 TIME 3 TIME 4 TIME 5 TIME 6
4?.970 .602 .no o .909 21.773 -71.788 -30.394 -135.773 -32.f>B?
CATS .012 • ono -.218 .793 .18n .297 -1.402 -.352
3 NCAT
.noo .a no .non .non .ooo .noo • n DO
TIME 1
5 TIME 2
6 TIME 3
TIHE It
TIHE 5
2996.364 2990.455 2095. 818 2810.364 2360.455 1532.182
3938.386 2848.591 3523.455 3238.977 1561.659
2719.061 2722.465 2556.173 1282.136
3603.879 3251.51t5 1501.455
lt000.386 1475.250
7
8
9 TIHE 6
971t.023
SUBCLASS CORRELATION MATRIX
------------------------------------------------------------
5 6 7 8
9
WORDS CATS NCAT TIME 1 TIME 2 TIM!' 3 TIME 4 TIME 5 TIME 6 GROUP
1 WORDS
2 CATS
NCAT
4 TIME 1
5 TIME 2
6 TIHE 3
7 TIME It
8 TIME 5
9 TIHE 6
1.000000 • 818713 .nooono .00?449 • n51170 -. 203051 -.074673 -.316611 -.154"'q
1.oonooo o.nnonoo -.036783 .116637 .031909 • 045651 -.204599 -.104164
1.nnnoon .noooon .oooooo .oooooo .oonnno .oonnon .oooooo
1.000000 .670524 • 734255 .655225 .661765 .696868
1.000000 • 870465 .935243 .B160i
1.000000 .8697n3 • 775232 • 787644
1.onoooo .856355 .801387
t.oooooo ·7473&0
1.000000
TASK 3
================================================
SUBCLASS COVARIANCE MATRIX
-----------------------------------------------------------1 WORDS 1
2 3 4 5 6 7 8
9
WORDS CATS NCAT TIMF 1
TIMl' TIME TIME TIME TIME
2 3 4 5 6
20.182 .ooo .nco ~1. 72 7 -1~.364
-6.545 18.636 21.545 23.727
2 CATS .ooo • 00 0 .oon • 00 0 .ooo .ooo .ooo • 000
3 NCAT
.ooo .ooo .ooo .ooo .ooo .ooo .ooo
4 TIME 1
1065.061 517.712 50&.409 956.182 721.909 743.682
5 TIH£ 2
& TIME 3
7 TIME 4
6 TIHE 5
TIME &
679.902 561.568 854.455 787.091 603.386
639.477 902.273 758.3&4 661.295
1546.182 1216.273 1153.000
112&.727 899.727
910.750
SUBCLASS CORRELATION MATRIX 4
6
7
9
1 2 3 4 5 6 7 8 9
WORDS cATS NCAT TIME 1 TIME 2 TIME 3 TIME 4 TIHE 5 TIME 6 GROUP
1.000000 o. 000000 .C00001 .216~04
-.13%94 -. 057617 .105431 .142878 .175012 TASK
VAl~
NCAT
TIME 1
TIME 2
TIME 3
TIME 4
TIME 5
TIME &
1.000000 0.000000 .oooooo .oooooo .oooooo .oooooo .oooooo .000000
1.000000 .000000 .000000 .oooooo .oooooo .oooooo .oooooo
1.000000 .608385 .613623 • 744633 .&59 001 .755093
1.000000 • 851661 • 832827 • 899275 .766784
1.000000 .906806 .893420 .865530
1.000000 .920896 .970999
1.000000 .888182
1.000000
TASK 4
================================================
SUBCLASS COVARIANCE MATRIX
-----------------------------------------------------------1 WORDS 1 2 3 4 5 6 7 8 9
WORDS CATS NCAT TIME 1 TIME 2 TIME 3 TIME 4 TIME S TIME 6
19.114 .209 • oo e 110.341 42.182 42.614 21.705 -8.636 -6.705
2 CATS
3 NCAT
.006 .oo 0 .294 -.018 .094 • 324 -.564 -.548
.ooo • 0 00 .ooo • 000 .ooo • 0 00 .ooo
4 TIME 1
4141.720 984.545 1582.356 1149.598 886.545 533.735
5 TIHE 2
6 TIME 3
7 TIHE 4
8 TIME 5
423.818 516.091 259.091 285.455 287.455
908.811 662.508 503.455 483.371
833.538 481.818 393.220
556.909 453.909
561.538
SUBCLASS CORRELATION MATRIX
-----------------------------------------------------------1 WDRDS
3
4 5 6 7 8 9
WORDS CATS NCAT TI~F
TIME TIME TIME TIMF TIME
1 2 1 4 5 6
1.000000 • 614335 .000001 • 392170 • 468666 • 3233n .171956 -.083708 -.064715
2 CATS
3 NCAT
4 TIME 1
1.000000 o.oooooo .058669 -.011345 .040027 .144261 -.306795 -.297315
1.000000 .oooooo • 000000 .000000 .oooooo .oooooo .oooooo
1.000000 .743114 .815600 .618719 .583739 .349982
5 TIME 2
1.000-000 .831572 .435913 .587564
.589237
6 TIME 3
7 TIHE 4
8 TIME 5
9 TIHE 6
1.000000 .761187 .707671 .676635
1.000000 • 707177 .574755
1.000000 .811684
1.000000
CELL SUMS - ALL GROUPS - BEFORE TRANSFORMATION MATRIX
               1          2          3          4
1 WORDS     478.000    506.000    480.000    435.000
2 CATS       11.600     11.300     12.000     11.600
3 NCAT      120.000    120.000     60.000     60.000
4 TIME 1   1952.000   1944.000   1654.000   1513.000
5 TIME 2   1867.000   1821.000   1415.000   1200.000
6 TIME 3   1811.000   1618.000   1311.000   1333.000
7 TIME 4   1392.000   1844.000   1272.000   1325.000
8 TIME 5   1572.000   1839.000   1200.000   1248.000
9 TIME 6   1283.000   1569.000   1143.000   1247.000
CELL IDENTIFICATION AND FREQUENCIES                                    PAGE 4
CELL    FACTOR LEVELS (TASK)    N
1              1                12
2              2                12
3              3                12
4              4                12
TOTAL N = 48
TOTAL SUM OF CROSS-PRODUCTS
1 2 3
4 5 6 7 8
9
WORDS GATS NCAT TIME 1 TIME 2 TIME 3 TIME 4 TIME 5 TIME 6
70.831000E+03 18.5090 OOE+02 14.415000£+0? 28. 014100F+04 24.98360CE+04 23.934000£+04 23.166900F+04 ?3.1fi~BOOE+04
20. 783700E+04
45.290000E+OO 7 4, 700000H01 68.1B3000E+OZ 60.860000F+02 58.675000E+02 56.370 00 OE+O 2 %.326000E•02 50.546000£+02
8
TIME TIME
B1.60150CE+04 68.983400£+04
5 TIME 2
4 TIME 1
7
6 TIME 3
TIME 4
30. 000000£>02 54.795000£+03 49.955000£+03 47.510000f+03 45. 345000£+03 46. 350000!'+03 40.470000E+03
11.700210E+05 10.273730H05 98.200200E+04 92.979700E+04 92 .338100£+04 81.317900E+04
95.564900£+04 90.201100£+04 63.940900E+04 8ft.Z97200E+04 73.039800E+04
86.106300E+04 79.934800£+04 80.Z73600E+04 69.876600E+04
60.114500E+Oit 79.163100E+04 68.&32400E+Oit
9 TIME 6   61.361400E+04
OBSERVED CELL MEANS --- ROWS ARE CELLS - COLUMNS ARE VARIABLES
       WORDS     CATS     NCAT      TIME 1     TIME 2     TIME 3     TIME 4     TIME 5     TIME 6
1     39.8333    .9667   10.0000   162.6667   155.5833   150.9167   116.0000   131.0000   106.9167
2     42.1667    .9417   10.0000   162.0000   151.7500   134.8333   153.6667   153.2500   130.7500
3     40.0000   1.0000    5.0000   137.8333   117.9167   109.2500   106.0000   100.0000    95.2500
4     36.2500    .9667    5.0000   126.0833   100.0000   111.0833   110.4167   104.0000   103.9167
OBSERVED CELL STD DEVS --- ROWS ARE CELLS - COLUMNS ARE VARIABLES
       WORDS     CATS      NCAT      TIME 1     TIME 2     TIME 3     TIME 4     TIME 5     TIME 6
1     7.05605   .04924   0.00000   50.98366   65.03350   68.32470   29.13292   38.06693   23.60454
2     6.78013   .10836   0.00000   54.73905   62.75656   52.14461   60.03231   63.24661   31.20934
3     4.49242  0.00000   0.00000   32.63527   26.07492   25.28789   39.34694   33.56676   30.17863
4     4.37191   .07785   0.00000   64.35619   20.58684   30.14649   28.87106   23.59892   23.69679
ESTIMATION PARAMETERS                                                  PAGE 5
=====================
RANK OF THE BASIS = 4
RANK OF MODEL FOR SIGNIFICANCE TESTING = 4
RANK OF THE MODEL TO BE ESTIMATED IS 4
ERROR TERM TO BE USED IS (RESIDUAL)
NUMBER OF FACTORS WITH ARBITRARY CONTRASTS IS 1
VARIANCE-COVARIANCE FACTORS AND CORRELATIONS AMONG ESTIMATES WILL BE PRINTED
ESTIMATED COMBINED MEANS WILL BE PRINTED
OPTIONAL CONTRAST MATRIX -- ROWS ARE CONTRASTS, COLUMNS SUBCLASSES
 1.000000  -1.000000  -0.000000  -0.000000
-0.000000   1.000000  -1.000000  -0.000000
-0.000000  -0.000000   1.000000  -1.000000
BASIS MATRIX FOR OPTIONAL CONTRASTS, BY COLUMNS
 .750000  -.250000  -.250000  -.250000
 .500000   .500000  -.500000  -.500000
 .250000   .250000   .250000  -.750000
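The basis matrix printed above is simply the inverse of the contrast matrix: if the rows of L hold the constant and the contrasts L1 = m1 - m2, L2 = m2 - m3, L3 = m3 - m4, the non-constant columns of L-inverse reproduce the printed basis. A numpy sketch, offered as an illustration rather than the program's own algorithm:

    import numpy as np

    L = np.array([[1.0, 1.0, 1.0, 1.0],    # L0, constant
                  [1.0, -1.0, 0.0, 0.0],   # L1:  group 1 - group 2
                  [0.0, 1.0, -1.0, 0.0],   # L2:  group 2 - group 3
                  [0.0, 0.0, 1.0, -1.0]])  # L3:  group 3 - group 4
    K = np.linalg.inv(L)                   # cell means = K @ (L0, L1, L2, L3)'
    print(K[:, 1:].T)
    # [[ 0.75 -0.25 -0.25 -0.25]
    #  [ 0.5   0.5  -0.5  -0.5 ]
    #  [ 0.25  0.25  0.25 -0.75]]   -- the basis columns printed above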
FACTOR (TASK)   SYMBOLIC CONTRAST VECTORS                              PAGE 6
1)  CONST, L0
BASIS VECTOR (NOT CONTRAST VECTOR) =
  1.00000000E+00  1.00000000E+00  1.00000000E+00  1.00000000E+00
VECTOR OF T - TRIANGULAR FACTOR OF BASIS, FROM GRAM-SCHMIDT
  6.92820323E+00
VECTOR OF ORTHONORMALIZED BASIS
  1.44337567E-01  1.44337567E-01  1.44337567E-01  1.44337567E-01
2)  INFORMATION ON HIERARCHY, L1, T1-T2
BASIS VECTOR (NOT CONTRAST VECTOR) =
  7.50000000E-01 -2.50000000E-01 -2.50000000E-01 -2.50000000E-01
VECTOR OF T - TRIANGULAR FACTOR OF BASIS, FROM GRAM-SCHMIDT
  0.  3.00000000E+00
VECTOR OF ORTHONORMALIZED BASIS
  2.50000000E-01 -8.33333333E-02 -8.33333333E-02 -8.33333333E-02
3)  MORE GLOBAL CATEGORIES, L2, T2-T3
BASIS VECTOR (NOT CONTRAST VECTOR) =
  5.00000000E-01  5.00000000E-01 -5.00000000E-01 -5.00000000E-01
VECTOR OF T - TRIANGULAR FACTOR OF BASIS, FROM GRAM-SCHMIDT
  0.  2.00000000E+00  2.82842712E+00
VECTOR OF ORTHONORMALIZED BASIS
 -8.37382645E-16  2.35702260E-01 -1.17851130E-01 -1.17851130E-01
4)  BUILT IN HIERARCHY, L3, T3-T4
BASIS VECTOR (NOT CONTRAST VECTOR) =
  2.50000000E-01  2.50000000E-01  2.50000000E-01 -7.50000000E-01
VECTOR OF T - TRIANGULAR FACTOR OF BASIS, FROM GRAM-SCHMIDT
  0.  1.00000000E+00  1.41421356E+00  2.44948974E+00
VECTOR OF ORTHONORMALIZED BASIS
 -7.25194643E-16 -4.83463095E-16  2.04124145E-01 -2.04124145E-01
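The "triangular factor of basis from Gram-Schmidt" and the orthonormalized basis above can be reproduced by a QR factorization in the metric of the cell frequencies (n = 12 per cell in this example). A sketch under those assumptions:

    import numpy as np

    n = 12.0
    K = np.array([[1.0, 0.75, 0.5, 0.25],    # basis vectors as columns:
                  [1.0, -0.25, 0.5, 0.25],   # L0 (constant), L1, L2, L3
                  [1.0, -0.25, -0.5, 0.25],
                  [1.0, -0.25, -0.5, -0.75]])
    Q, R = np.linalg.qr(np.sqrt(n) * K)      # K = Q* R with Q*' (n I) Q* = I
    s = np.sign(np.diag(R))                  # make the diagonal of R positive
    Q_star, R = (Q / np.sqrt(n)) * s, s[:, None] * R
    print(np.round(R, 8))       # rows give 6.92820323, 3.0, 2.0, 2.82842712, ...
    print(np.round(Q_star, 9))  # columns give .144337567, ..., as printed above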
================================================================
MATRIX OF ORTHOGONAL ESTIMATES (U) -- EFFECTS X VARIABLES
EFFECT 1   2.74097040E+02  6.71169688E+00  5.19615242E+01  1.01945624E+03  9.09759667E+02  8.76562046E+02  8.41921030E+02  8.45673807E+02  7.56617523E+02
EFFECT 2   1.08333333E+00 -8.33333333E-03  1.00000000E+01  6.20833333E+01  9.70833333E+01  9.75833333E+01 -2.20833333E+01  3.57500000E+01 -9.16666667E+00
EFFECT 3   1.14315596E+01 -1.17851130E-01  1.41421356E+01  8.49706649E+01  1.21033111E+02  6.97678691E+01  1.28575563E+02  1.44956690E+02  8.81526454E+01
EFFECT 4   9.18558654E+00  8.16496581E-02  1.48331870E-14  2.87815045E+01  4.38866912E+01 -4.49073120E+00 -1.06185797E+01 -9.79795897E+00 -2.12289111E+01
ERROR SUH OF CROSS-PRODUCTS
-----------------------------------------------------------1 WORDS 1 2 3 4 5 6 7 8 9
WORDS CATS NCAT TIHE 1 TIME 2 TIHE 3 TIHE 4 TIH~ 5 TIME 6
14. 655633f+D2 11. 85DDOOE+D D 23.540516E-11 -59.191667E+01 -H.16"n3E+02 -17o850833F+02 -44.558333£+01 -17.645000E+02 -35.241667E+01
8
TIHE 5 8 9
TIHE 5 TIME &
78.464250E+03 37.323750E+03
2 CATS 22.250000E-02 21.759727E-l.3 -15.8000DOE+OD ·-85.416667E-01 -&3.166667E-01 21o3333J3f.-01 -25.125000f+OO -11o5416&7E+OO
9 TIHE 6
33.038333E+03
3 NCAT
4 TIHE 1
45.947018E-12 10.089939E-10 79.7&4748E-11 82.215382£-11 80.29108110-11 72.903346E-11 77.440819£-11
11.882725E+04 82.H8167E+03 76.528083E+03 62.252583E+03 4&.999UOOE+03 35.530250E+03
5 TIHE 2
1Do198608E+04 86.829333E+03 60.520000E+03 53.026750E+03 33o210083E+03
& TIHE 3
7 TIME 4
98.291750E+D3 54o4H917E+03 47.806500E+03 30,2H750E+03
75.177583E+03 61.686000E+D3 37o545417E+03
ERROR VARIANCE -COVARIANCE MATRIX
-----------------------------------------------------------1
WORDS 1
2 3
.
5 6 7 8 9
WORDS CATS NCAT TIME 1 TIME 2 TIHE 3 TIME 4 TIME 5 TIME 6
33.763 .269 .ooo -13.453 -32.189 -40.570 -10.127 -40.102 -8.009
.
3
2
CATS • 005 .ooo -.359 -.194 -.144 .048 -.571
-.262
NCAT
5
TIME 1
.ooo .ooo .ooo .ooo .o 00 .ooo .ooo
2700.619 1871.322 1739.275 H14.8~1
1068.159 807.506
7
6
8
9
TIME 2
TIME 3
TIHE 4
TIHE 5
TIHE 6
2317.866 1973.394 1375.455 1205.153 754.775
2233.903 1238.362 1086.511 686.699
1708.581 1401.955 853.305
1783.278 848.267
750.871
ERROR CORRELATION MATRIX
------------------------
            WORDS     CATS      NCAT     TIME 1    TIME 2    TIME 3    TIME 4    TIME 5    TIME 6
1 WORDS   1.000000  .651786   .000001  -.044551  -.115066  -.147724  -.042163  -.163432  -.050304
2 CATS              1.000000 0.000000  -.097170  -.056704  -.042713   .016495  -.194154  -.134615
3 NCAT                        1.000000   .000000   .000000   .000000   .000000   .000000   .000001
4 TIME 1                                1.000000   .747950   .708116   .658651   .486737   .567063
5 TIME 2                                          1.000000   .867236   .691169   .592773   .572124
6 TIME 3                                                    1.000000   .633866   .544368   .530214
7 TIME 4                                                              1.000000   .803168   .753362
8 TIME 5                                                                        1.000000   .733061
9 TIME 6                                                                                  1.000000
VARIABLE   VARIANCE (ERROR MEAN SQUARES)   STANDARD DEVIATION
1 WORDS        33.763258                      5.8106
2 CATS           .005057                       .0711
3 NCAT           .000000                       .0000
4 TIME 1     2700.619318                     51.9675
5 TIME 2     2317.865530                     48.1442
6 TIME 3     2233.903409                     47.2642
7 TIME 4     1708.581439                     41.3350
8 TIME 5     1783.278409                     42.2289
9 TIME 6      750.871212                     27.4020
D.F. = 44
ERROR TERM FOR ANALYSIS OF VARIANCE (RESIDUAL)
INVERSE OF T - TRIANGULAR FACTOR OF BASIS FROM GRAM-SCHMIDT
            1 CONST.   2 T1-T2    3 T2-T3    4 T3-T4
1 CONST.    .144338   0.000000   0.000000   0.000000
2 T1-T2                .333333   -.235702   -.000000
3 T2-T3                           .353553   -.204124
4 T3-T4                                      .408248
LEAST SQUARE ESTIMATES OF EFFECTS -- EFFECTS X VARIABLES
            WORDS     CATS    NCAT     TIME 1     TIME 2     TIME 3     TIME 4     TIME 5     TIME 6
1 CONST.   39.5625   .9688   7.5000   147.1458   131.3125   126.5208   121.5208   122.0625   109.2083
2 T1-T2    -2.3333   .0250    .0000      .6667     3.8333    16.0833   -37.6667   -22.2500   -23.8333
3 T2-T3     2.1667  -.0583   5.0000    24.1667    33.8333    25.5833    47.6667    53.2500    35.5000
4 T3-T4     3.7500   .0333    .0000    11.7500    17.9167    -1.8333    -4.4167    -4.0000    -8.6667
ESTIMATES OF EFFECTS IN STANDARD DEVIATION UNITS -- EFFECTS X VARIABLES
            WORDS            CATS             NCAT             TIME 1           TIME 2           TIME 3           TIME 4           TIME 5           TIME 6
1 CONST.   68.086586E-01   13.623009E+00   73.393726E+05   28.314982E-01   27.274824E-01   26.768858E-01   29.399023E-01   28.904981E-01   39.854104E-01
2 T1-T2   -40.156385E-02   35.156152E-02   51.825922E-09   12.828535E-03   79.621887E-03   34.028583E-02  -91.125379E-02  -52.689059E-02  -86.976527E-02
3 T2-T3    37.288072E-02  -82.031022E-02   48.929151E+05   46.503439E-02   70.274970E-02   54.128368E-02   11.531796E-01   12.609851E-01   12.955245E-01
4 T3-T4    64.537048E-02   46.874870E-02   59.259300E-10   22.610293E-02   37.214578E-02  -38.789059E-03  -10.685055E-02  -94.721903E-03  -31.627828E-02
STANDARD ERRORS OF LEAST-SQUARES ESTIMATES -- EFFECTS BY VARIABLES
            WORDS     CATS     NCAT     TIME 1     TIME 2     TIME 3     TIME 4     TIME 5     TIME 6
1 CONST.    .83869   .01026   .00000    7.50086    6.94902    6.82200    5.96619    6.09521    3.95514
2 T1-T2    2.37217   .02903   .00000   21.21564   19.65479   19.29552   16.87494   17.23987   11.18683
3 T2-T3    2.37217   .02903   .00000   21.21564   19.65479   19.29552   16.87494   17.23987   11.18683
4 T3-T4    2.37217   .02903   .00000   21.21564   19.65479   19.29552   16.87494   17.23987   11.18683
9
VARIANCE-COVARIANCE FACTORS OF ESTIMATES
-----------------------------------------------------------1
CONST. 1 2
• 020833 o.ooooou
CONST • T1-T2 T2-T3 T3-T4
. 3
u.ooooo~
o. oo oono
2 T1-T2
3 T2-T3
~
T3-T4
.16~667
-.083333 -.000000
.166667 -.083333
.1666&7
INTERCORRELATIONS AHONG THE ESTIMATES
CONST. 1 2 3 4
CONST • Tt-T2 T2-T3 T3-T4
1.000000 o.oooooo o.oooooo u.ooooco
2 T1-T2
"
3
T2-T3
r
T3-T~
~ 1.000000 -.500000 -.oooooo
1.oooooo -.500000
1.oooooo PAGE
ESTIMATED COMBINED MEANS BASED ON FITTING A MODEL OF RANK 4
==================================================================
FACTOR (TASK)
LEVEL 1 MEANS:  WORDS = 39.83   CATS = .9667   NCAT = 10.000   TIME 1 = 162.7   TIME 2 = 155.6   TIME 3 = 150.9   TIME 4 = 116.0   TIME 5 = 131.0    TIME 6 = 106.9
LEVEL 2 MEANS:  WORDS = 42.17   CATS = .9417   NCAT = 10.000   TIME 1 = 162.0   TIME 2 = 151.7   TIME 3 = 134.8   TIME 4 = 153.7   TIME 5 = 153.2    TIME 6 = 130.7
LEVEL 3 MEANS:  WORDS = 40.00   CATS = 1.0000  NCAT =  5.000   TIME 1 = 137.8   TIME 2 = 117.9   TIME 3 = 109.2   TIME 4 = 106.0   TIME 5 = 100.00   TIME 6 = 95.25
LEVEL 4 MEANS:  WORDS = 36.25   CATS = .9667   NCAT =  5.000   TIME 1 = 126.1   TIME 2 = 100.00  TIME 3 = 111.1   TIME 4 = 110.4   TIME 5 = 104.0    TIME 6 = 103.9
ANALYSIS OF VARIANCE                                                   PAGE
DEPENDENT VARIABLE(S)   1 WORDS   2 CATS
ERROR SUM OF CROSS-PRODUCTS (SE)
           1 WORDS    2 CATS
1 WORDS   1485.583
2 CATS      11.850      .223
CHOLESKY FACTOR SE
           1 WORDS    2 CATS
1 WORDS   38.54327
2 CATS      .30745     .35774
LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS = 5.24765436E+00
HYPOTHESIS 1   1 DEGREE(S) OF FREEDOM
L0, CONST.
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
           1 WORDS    2 CATS
1 WORDS   75129.19
2 CATS     1839.66     45.05
TESTS OF HYPOTHESIS BEING SKIPPED                                      PAGE 9
HYPOTHESIS 2   3 DEGREE(S) OF FREEDOM                                  PAGE 10
1, INFORMATION ON HIERARCHY     T1-T2
2, MORE GLOBAL CATEGORIES       T2-T3
3, BUILT IN HIERARCHY           T3-T4
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
           1 WORDS    2 CATS
1 WORDS   216.2292
2 CATS      -.6062     .0206
(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (ST*)
           1 WORDS    2 CATS
1 WORDS   1701.813
2 CATS      11.244      .243
CHOLESKY FACTOR ST*
           1 WORDS    2 CATS
1 WORDS   41.25103
2 CATS      .27256     .41090
LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, = 5.66063573E+00
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS = 3.2874
D.F. = 6 AND 86.0000   P LESS THAN .0059
(LIKELIHOOD RATIO = 6.61674606E-01   LOG = -4.12981375E-01)
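The F-ratio above is Rao's approximation to the distribution of Wilks' likelihood ratio. A hedged sketch with this run's values (p = 2 variables, q = 3 hypothesis degrees of freedom, 44 error degrees of freedom):

    import math

    def rao_f(wilks_lambda, p, q, df_error):
        s = math.sqrt((p * p * q * q - 4.0) / (p * p + q * q - 5.0))
        df1 = p * q
        df2 = s * (df_error - (p - q + 1) / 2.0) - (p * q - 2) / 2.0
        root = wilks_lambda ** (1.0 / s)
        return (1.0 - root) / root * df2 / df1, df1, df2

    print(rao_f(0.661674606, 2, 3, 44))   # about (3.2874, 6, 86.0), as printed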
VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES
1 WORDS       72.0764              2.1346         .1095         2.1348        .1095      (72.0764/ 33.7633)
2 CATS          .0069              1.3596         .2676         4.5765        .0073      (  .0136/   .0030)
DEGREES OF FREEDOM FOR HYPOTHESIS = 3   DEGREES OF FREEDOM FOR ERROR = 44
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
           1 WORDS    2 CATS
1 WORDS   72.07639
2 CATS     -.20208     .00686
ANALYSIS OF VARIANCE                                                   PAGE 11
DEPENDENT VARIABLE(S)   1 WORDS   2 CATS
3 INDEPENDENT VARIABLE(S) (PREDICTOR VARIABLES, COVARIATES)   5 TIME 2   7 TIME 4   9 TIME 6
REGRESSION PARALLELISM TEST REQUESTED
ERROR SUM OF CROSS-PRODUCTS (SE)
            1 WORDS          2 CATS           3 TIME 2         4 TIME 4         5 TIME 6
1 WORDS    14.855833E+02
2 CATS     11.850000E+00   22.250000E-02
3 TIME 2  -14.163333E+02  -85.416667E-01   10.198608E+04
4 TIME 4  -44.558333E+01   21.333333E-01   60.520000E+03   75.177583E+03
5 TIME 6  -35.241667E+01  -11.541667E+00   33.210083E+03   37.545417E+03   33.038333E+03
REGRESSION ANALYSIS                                                    PAGE 12
SUM OF PRODUCTS CRITERIA
           1 WORDS    2 CATS
1 WORDS   1485.583
2 CATS      11.850      .223
SUM OF PRODUCTS - PREDICTORS BY CRITERIA
            1 WORDS     2 CATS
1 TIME 2  -1416.333     -8.542
2 TIME 4   -445.583      2.133
3 TIME 6   -352.417    -11.542
SUM OF PRODUCTS - PREDICTORS
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   10.198608E+04
2 TIME 4   60.520000E+03   75.177583E+03
3 TIME 6   33.210083E+03   37.545417E+03   33.038333E+03
INVERSE SUM OF PRODUCTS OF PREDICTORS (X'X)INV
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   18.996171E-06
2 TIME 4  -13.310317E-06   40.085879E-06
3 TIME 6  -39.684146E-07  -32.174877E-06   70.821485E-06
SUM OF PRODUCTS - REGRESSION
           1 WORDS    2 CATS
1 WORDS   23.99398
2 CATS      .25123     .01229
GROUP 1
INVERSE SUM OF PRODUCTS PREDICTORS FOR THIS SUBCLASS ONLY
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   27.938757E-06
2 TIME 4  -13.567876E-06   16.762113E-05
3 TIME 6  -86.963776E-06  -32.624752E-06   23.401269E-05
SUM OF PRODUCTS ADJUSTED FOR PREDICTORS, THIS SUBCLASS ONLY
           1 WORDS    2 CATS
1 WORDS   458.8981
2 CATS      2.1600     .0199
GROUP 2
INVERSE SUM OF PRODUCTS PREDICTORS FOR THIS SUBCLASS ONLY
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   19.410030E-05
2 TIME 4  -16.302291E-05   21.595459E-05
3 TIME 6  -42.195627E-06  -63.501076E-06   27.490616E-05
SUM OF PRODUCTS ADJUSTED FOR PREDICTORS, THIS SUBCLASS ONLY
           1 WORDS    2 CATS
1 WORDS   414.2334
2 CATS      5.5008     .1124
GROUP 3
INVERSE SUM OF PRODUCTS PREDICTORS FOR THIS SUBCLASS ONLY
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   45.498272E-05
2 TIME 4  -49.636998E-05   15.352942E-04
3 TIME 6   30.709045E-05  -16.148134E-04   19.407025E-04
SUM OF PRODUCTS ADJUSTED FOR PREDICTORS, THIS SUBCLASS ONLY
           1 WORDS    2 CATS
1 WORDS   174.5865
2 CATS       .0000     .0000
GROUP 4
INVERSE SUM OF PRODUCTS PREDICTORS FOR THIS SUBCLASS ONLY
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   33.585023E-05
2 TIME 4  -34.776988E-06   16.646699E-05
3 TIME 6  -14.757395E-05  -98.766794E-06   30.659745E-05
SUM OF PRODUCTS ADJUSTED FOR PREDICTORS, THIS SUBCLASS ONLY
           1 WORDS    2 CATS
1 WORDS   120.9136
2 CATS      1.5297     .0494
STATISTICS FOR PARALLELISM TEST WITH 4 SUBCLASSES OF OBSERVATIONS      PAGE 13
VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 WORDS       32.5509              .8913          .5437         .8913         .5437
2 CATS          .0032              .5567          .8216         .2734         .9773
DEGREES OF FREEDOM FOR HYPOTHESIS = 9   DEGREES OF FREEDOM FOR ERROR = 32
F-STATISTIC FOR TEST OF PARALLELISM OF REGRESSION HYPERPLANES = .9159
D.F. = 18 AND 62.0000   P LESS THAN .5576
(LIKELIHOOD RATIO = 7.40758900E-01   LOG = -3.00079997E-01)
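The parallelism statistic compares the error matrix pooled after a single within-groups regression with the error matrices obtained when each group receives its own regression plane, as the per-subclass tables above show. A hedged sketch of the likelihood ratio, assuming lists of per-group criterion and covariate matrices:

    import numpy as np

    def parallelism_wilks(Y_groups, X_groups):
        def adj(Y, X):
            # residual SSCP after a group's own regression plane
            Xc, Yc = X - X.mean(0), Y - Y.mean(0)
            B = np.linalg.lstsq(Xc, Yc, rcond=None)[0]
            R = Yc - Xc @ B
            return R.T @ R
        E_sep = sum(adj(Y, X) for Y, X in zip(Y_groups, X_groups))
        Yc = [Y - Y.mean(0) for Y in Y_groups]
        Xc = [X - X.mean(0) for X in X_groups]
        Sxx = sum(x.T @ x for x in Xc)
        Sxy = sum(x.T @ y for x, y in zip(Xc, Yc))
        Syy = sum(y.T @ y for y in Yc)
        E_pool = Syy - Sxy.T @ np.linalg.solve(Sxx, Sxy)   # common slopes
        return np.linalg.det(E_sep) / np.linalg.det(E_pool)

Summing the adjusted subclass matrices above and dividing determinants reproduces a likelihood ratio near the printed 7.41E-01.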
RAW REGRESSION COEFFICIENTS - INDEPENDENT X DEPENDENT VARS
            1 WORDS          2 CATS
1 TIME 2  -19.575378E-03  -14.484756E-05
2 TIME 4   12.329208E-03   57.056054E-05
3 TIME 6  -50.009183E-04  -85.213742E-05
STANDARDIZED REGRESSION COEFFICIENTS - INDEP X DEPENDENT VAR
            1 WORDS    2 CATS
1 TIME 2   -.162193   -.098066
2 TIME 4    .087706    .331651
3 TIME 6   -.023584   -.328363
STANDARD ERRORS OF RAW REGRESSION COEFS - IND X DEP VARIABLES
            1 WORDS          2 CATS
1 TIME 2   26.022786E-03   31.208168E-05
2 TIME 4   37.802147E-03   45.334722E-05
3 TIME 6   50.246206E-03   60.258424E-05
ERROR SUM OF PRODUCTS ADJUSTED FOR PREDICTORS
           1 WORDS    2 CATS
1 WORDS   1461.589
2 CATS      11.599      .210
ERROR VAR-COV MATRIX ADJUSTED FOR PREDICTORS
           1 WORDS    2 CATS
1 WORDS   35.64852
2 CATS      .28290     .00510
MATRIX OF CORRELATIONS WITH PREDICTORS ELIMINATED
           1 WORDS    2 CATS
1 WORDS   1.000000
2 CATS     .661717   1.000000
VARIABLE   VARIANCE (ERROR MEAN SQUARES)   STANDARD DEVIATION
1 WORDS        35.648521                      5.9706
2 CATS           .005127                       .0716
D.F. = 41
ERROR TERM FOR ANALYSIS OF COVARIANCE (RESIDUAL)
3 COVARIATE(S) HAVE BEEN ELIMINATED
LEAST SQUARE ESTIMATES ADJUSTED FOR COVARIATES -- EFFECTS X VARS
            1 WORDS    2 CATS
1 CONST.   41.18088   1.01150
2 T1-T2    -1.91308    .02674
3 T2-T3     2.41881   -.05038
4 T3-T4     4.11184    .03106
3 COVARIATE(S) ELIMINATED
ESTIMATES IN ADJUSTED STANDARD DEVIATION UNITS -- EFFECTS X VARS
            1 WORDS    2 CATS
1 CONST.    6.89723  14.12631
2 T1-T2     -.32042    .37340
3 T2-T3      .40512   -.70357
4 T3-T4      .68868    .43382
STANDARD ERRORS OF ADJUSTED ESTIMATES -- EFFECTS X VARIABLES
            1 WORDS    2 CATS
1 CONST.    3.74037   .044853
2 T1-T2     2.74160   .032879
3 T2-T3     2.71902   .032608
4 T3-T4     2.53131   .030357
VARIANCE-COVARIANCE FACTORS AMONG ADJUSTED ESTIMATES
            1 CONST.   2 T1-T2    3 T2-T3    4 T3-T4
1 CONST.    .392883
2 T1-T2    -.062325    .210848
3 T2-T3     .113666   -.115996    .207388
4 T3-T4    -.018958    .019746   -.094449    .179742
3 COVARIATE(S) ELIMINATED
CORRELATIONS AMONG ADJUSTED ESTIMATES
            1 CONST.   2 T1-T2    3 T2-T3    4 T3-T4
1 CONST.   1.000000
2 T1-T2    -.216682   1.000000
3 T2-T3     .398458   -.554712   1.000000
4 T3-T4    -.071384    .101433   -.489135   1.000000
3 COVARIATE(S) ELIMINATED
ESTIMATED COMBINED MEANS BASED ON FITTING A MODEL OF RANK 4            PAGE 14
==================================================================
FACTOR (TASK)
LEVEL 1 MEANS:  WORDS = 40.37   CATS = .9714
LEVEL 2 MEANS:  WORDS = 42.28   CATS = .9446
LEVEL 3 MEANS:  WORDS = 39.86   CATS = .9950
LEVEL 4 MEANS:  WORDS = 35.75   CATS = .9640
3 COVARIATE(S) ELIMINATED
ESTIMATED COMBINED MEANS INCLUDING COVARIATE TERM - MODEL OF RANK 4    PAGE 15
==============================================================================
FACTOR (TASK)
LEVEL 1 MEANS:  WORDS = 39.83   CATS = .9667
LEVEL 2 MEANS:  WORDS = 42.17   CATS = .9417
LEVEL 3 MEANS:  WORDS = 40.00   CATS = 1.0000
LEVEL 4 MEANS:  WORDS = 36.25   CATS = .9667
3 COVARIATE(S) ELIMINATED
CHOLESKY FACTOR ERROR SCP ADJUSTED FOR PREDICTORS
           1 WORDS    2 CATS
1 WORDS   38.23074
2 CATS      .30339     .34375
CHOLESKY FACTOR ERROR SCP BEFORE PREDICTOR ADJUSTMENT
           1 WORDS    2 CATS
1 WORDS   38.54327
2 CATS      .30745     .35774
LOG-DETERMINANT ERROR SUM OF PRODUCTS BEFORE ADJUSTMENT FOR PREDICTORS = 5.24765436E+00   AFTER = 5.15161317E+00
STATISTICS FOR REGRESSION ANALYSIS WITH 3 PREDICTOR VARIABLE(S)        PAGE 16
VARIABLE   SQUARE MULT R   MULT R      F     P LESS THAN   STEP DOWN F   P LESS THAN
1 WORDS       .0162        .1271     .2244      .8790         .2244         .8790
2 CATS        .0552        .2350     .7990      .5016        1.1070         .3576
DEGREES OF FREEDOM FOR HYPOTHESIS = 3   DEGREES OF FREEDOM FOR ERROR = 41
F VALUE FOR TEST OF HYPOTHESIS OF NO ASSOCIATION BETWEEN DEPENDENT AND INDEPENDENT VARIABLES = .6559
D.F. = 6 AND 80.0000   P LESS THAN .6853
(LIKELIHOOD RATIO = 9.08426604E-01   LOG = -9.60411829E-02)
INVERSE CHOLESKY FACTOR (X'X)
            1 TIME 2         2 TIME 4         3 TIME 6
1 TIME 2   31.313351E-04
2 TIME 4  -29.947454E-04   50.466351E-04
3 TIME 6  -47.160489E-05  -38.232648E-04   84.155502E-04
SEMI-PARTIAL REGRESSION COEFFICIENTS - REGRESSION ANALYSIS
            1 WORDS     2 CATS
1 TIME 2   -4.435014   -.026747
2 TIME 4    1.992851    .036346
3 TIME 6    -.594247   -.101257
STEP-WISE REGRESSION TO ANALYZE THE CONTRIBUTION OF EACH INDEPENDENT VARIABLE
SUM OF PRODUCTS ERROR - PARTIALLY ADJUSTED
           1 WORDS    2 CATS
1 WORDS   1465.914
2 CATS      11.731      .222
CHOLESKY FACTOR PARTIALLY ADJUSTED SCP
           1 WORDS    2 CATS
1 WORDS   38.28726
2 CATS      .30640     .35763
ADDING VARIABLE 1 (TIME 2) THROUGH 1 (TIME 2) TO THE REGRESSION EQUATION
F = .2943 WITH 2 AND 42.0000 D.F.   P LESS THAN .7466
(LIKELIHOOD RATIO = 9.86177916E-01   LOG = -1.39184986E-02)
VARIABLE   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES   PERCENT OF ADDITIONAL VARIANCE ACCOUNTED FOR
1 WORDS       .5770         .4517         .5770         .4517      (19.6694/ 34.0910)          1.3240
2 CATS        .1387         .7115         .0248         .8757      (  .0001/   .0030)           .3215
D.F. = 1 AND 43
SUM OF PRODUCTS ERROR - PARTIALLY ADJUSTED
           1 WORDS    2 CATS
1 WORDS   1461.942
2 CATS      11.659      .220
CHOLESKY FACTOR PARTIALLY ADJUSTED SCP
           1 WORDS    2 CATS
1 WORDS   38.23536
2 CATS      .30493     .35705
ADDING VARIABLE 2 (TIME 4) THROUGH 2 (TIME 4) TO THE REGRESSION EQUATION
F = .1230 WITH 2 AND 41.0000 D.F.   P LESS THAN .8847
(LIKELIHOOD RATIO = 9.94037691E-01   LOG = -5.98015499E-03)
VARIABLE   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES   PERCENT OF ADDITIONAL VARIANCE ACCOUNTED FOR
1 WORDS       .1141         .7373         .1141         .7373      ( 3.9715/ 34.8082)           .2673
2 CATS        .2517         .6186         .1342         .7161      (  .0004/   .0031)           .5937
D.F. = 1 AND 42
SUM OF PRODUCTS ERROR - PARTIALLY ADJUSTED
           1 WORDS    2 CATS
1 WORDS   1461.589
2 CATS      11.599      .210
CHOLESKY FACTOR PARTIALLY ADJUSTED SCP
           1 WORDS    2 CATS
1 WORDS   38.23074
2 CATS      .30339     .34375
ADDING VARIABLE 3 (TIME 6) THROUGH 3 (TIME 6) TO THE REGRESSION EQUATION
F = 1.5823 WITH 2 AND 40.0000 D.F.   P LESS THAN .2181
(LIKELIHOOD RATIO = 9.26684118E-01   LOG = -7.61425292E-02)
VARIABLE   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES   PERCENT OF ADDITIONAL VARIANCE ACCOUNTED FOR
1 WORDS       .0099         .9213         .0099         .9213      (  .3531/ 35.6485)           .0238
2 CATS       1.9998         .1649        3.1542         .0834      (  .0093/   .0030)          4.6081
D.F. = 1 AND 41
ANALYSIS OF COVARIANCE                                                 PAGE 17
HYPOTHESIS 1   1 DEGREE(S) OF FREEDOM                                  PAGE 18
L0, CONST.
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
            1 WORDS          2 CATS           3 TIME 2         4 TIME 4         5 TIME 6
1 WORDS    75.129187E+03
2 CATS     18.396562E+02   45.046875E+00
3 TIME 2   24.936244E+04   61.060312E+02   82.766269E+04
4 TIME 4   23.076806E+04   56.507187E+02   76.594581E+04   70.883102E+04
5 TIME 6   20.738662E+04   50.781875E+02   68.834012E+04   63.701221E+04   57.247008E+04
TESTS OF HYPOTHESIS BEING SKIPPED
HYPOTHESIS 2   3 DEGREE(S) OF FREEDOM                                  PAGE 19
1, INFORMATION ON HIERARCHY     T1-T2
2, MORE GLOBAL CATEGORIES       T2-T3
3, BUILT IN HIERARCHY           T3-T4
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
            1 WORDS    2 CATS    3 TIME 2    4 TIME 4    5 TIME 6
1 WORDS     216.23
2 CATS        -.61       .02
3 TIME 2    1891.90   -11.49    26000.23
4 TIME 4    1346.52   -15.85    12943.19    17136.40
5 TIME 6     802.79   -12.05     8847.79    11766.37     8305.58
SUM OF PRODUCTS FOR HYPOTHESIS, ADJUSTED FOR COVARIATES
           1 WORDS    2 CATS
1 WORDS   230.5623
2 CATS      -.2753     .0127
(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (ST*)
           1 WORDS    2 CATS
1 WORDS   1692.152
2 CATS      11.323      .223
CHOLESKY FACTOR ST*
           1 WORDS    2 CATS
1 WORDS   41.13577
2 CATS      .27527     .38361
LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, = 5.51751248E+00
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS = 2.6767
D.F. = 6 AND 80.0000   P LESS THAN .0204
(LIKELIHOOD RATIO = 6.93572639E-01   LOG = -3.65899302E-01)
VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN   STEP-DOWN MEAN SQUARES
1 WORDS       76.8541              2.1559         .1080         2.1559        .1080      (76.8541/ 35.6485)
2 CATS          .0042               .8271         .4866         3.2714        .0309      (  .0097/   .0030)
DEGREES OF FREEDOM FOR HYPOTHESIS = 3   DEGREES OF FREEDOM FOR ERROR = 41
3 COVARIATE(S) ELIMINATED
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
           1 WORDS    2 CATS
1 WORDS   76.85412
2 CATS     -.09177     .00424
3 COVARIATE(S) ELIMINATED
CORE USED FOR DATA = 229 LOCATIONS OUT OF 3000 AVAILABLE
PROBLEM 3 -- TWO-WAY FIXED EFFECTS MULTIVARIATE ANALYSIS OF VARIANCE WITH NULL SUBCLASSES
6 2 2 1 1 2        INPUT DESC. CARD
OLDNEW 2EXPGRP     FACTOR IDENT CARD
PROBLEM 3--TWO FACTOR FIXED EFFECTS MULTIVARIATE ANALYSIS OF VARIANCE WITH THREE NULL SUBCLASSES. THE DATA FOR THIS EXAMPLE WERE COLLECTED UNDER THE DIRECTION OF DR. STUART L. FISCHMAN, SCHOOL OF DENTISTRY, STATE UNIVERSITY OF NEW YORK AT BUFFALO. THE STUDY WAS DESIGNED TO TEST THE EFFECTIVENESS OF TOOTHPASTE AND MOUTH RINSE ADDITIVES IN REDUCING THE DEGREE OF DENTAL CALCULUS FORMATION. IN ITS ENTIRETY, THE STUDY WAS CONDUCTED OVER A TWO YEAR PERIOD. THE SUBJECTS WERE MALE PRISONERS IN THE N.Y. CORRECTIONAL FACILITY AT ATTICA. THE FIRST YEAR (SUBJECTS DESIGNATED -OLD-), FOUR TREATMENT GROUPS WERE INVOLVED.
GROUP 52--CONTROL, NO ANTI-CALCULUS AGENT
GROUP 54--ACTIVE AGENT SUPPOSED
GROUP 56--ANTI-CALCULUS AGENT 1 IN TOOTHPASTE AND RINSE
GROUP 58--ANTI-CALCULUS AGENT 2 IN TOOTHPASTE AND RINSE
IT WAS DISCOVERED AT THE END OF THE FIRST YEAR OF THE STUDY THAT THE AGENT INVOLVED FOR GROUP 54 WAS CHEMICALLY INERT, AND THE MATERIAL WAS NOT CONTINUED FOR A SECOND TRIAL. FOR ANALYSIS PURPOSES, GROUP 54 IS CONSIDERED TO BE AN ADDITIONAL CONTROL GROUP. AGENT 58 WAS ALSO DISCONTINUED.
DURING THE SECOND YEAR OF THE STUDY, THREE EXPERIMENTAL GROUPS WERE INVOLVED--
GROUP 67--CONTROL (SAME AS 52 FROM FIRST YEAR)
GROUP 75--ANTI-CALCULUS AGENT 3 IN TOOTHPASTE AND RINSE (NEW AGENT)
GROUP 93--ANTI-CALCULUS AGENT 1 IN TOOTHPASTE AND RINSE (SAME AS 56 FROM FIRST YEAR)
EACH SUBJECT WAS GIVEN A COMPLETE DENTAL PROPHYLAXIS AT THE BEGINNING OF THE STUDY. FOLLOWING THE CLEANSING, HE WAS INSTRUCTED TO BRUSH AND RINSE DAILY FOR A THREE MONTH PERIOD, KEEPING A RECORD OF WHENEVER HE HAD DONE SO. AT THE END OF THE THREE MONTH PERIOD, MEASUREMENTS WERE TAKEN OF THE AMOUNT OF CALCULUS FORMATION ON EACH OF THE SIX LOWER FRONT TEETH. AGAIN THE TEETH WERE COMPLETELY CLEANSED AND THE PROCEDURE WAS REPEATED FOR FOUR PERIODS IN ALL. THE DATA CONSIDERED IN THIS EXAMPLE ARE THE MEASURES OF CALCULUS FORMATION FOR EACH OF THE SIX TEETH, AT THE END OF THE THIRD PERIOD. ARBITRARY CONTRASTS ARE EMPLOYED TO COMPARE THE PRESUMED CONTROL GROUPS WITH EACH OTHER, AND TO COMPARE THE CONTROL GROUPS WITH THE ACTIVE AGENTS.
THE DATA CARDS ARE PUNCHED AS FOLLOWS--
CARD COLUMN
1-5      PRISONER NUMBER
7        1 = FIRST YEAR SUBJECT, 2 = SECOND YEAR SUBJECT
9-10     EXPERIMENTAL GROUP NUMBER. FOR ANALYSIS GROUPS WERE CODED AS 1=52,67  2=54  3=56,93  4=58  5=75. SINCE GROUPS ARE NOT NUMBERED SEQUENTIALLY OR THE SAME FROM ONE YEAR TO THE NEXT, DATA FORM 2 OR 3 IS NECESSARY.
11-12    CALCULUS SCORE RIGHT CANINE (R-CAN)
13-14    CALCULUS SCORE RIGHT LATERAL INCISOR (R-L.I.)
15-16    CALCULUS SCORE RIGHT CENTRAL INCISOR (R-C.I.)
17-18    CALCULUS SCORE LEFT CENTRAL INCISOR (L-C.I.)
19-20    CALCULUS SCORE LEFT LATERAL INCISOR (L-L.I.)
21-22    CALCULUS SCORE LEFT CANINE (L-CAN)
THIS RUN USES --
DATA FORM II
OPTIONAL PRINTED OUTPUT
MISSING CELLS (3 LESS D.F. FOR INTERACTION)
MEANS KEY FOR ESTIMATED AND OBSERVED MEANS, INCLUDING ESTIMATING MEANS FOR NULL SUBCLASSES
ALTERNATIVE CONTRAST ORDER
ESTIMATION OF EFFECTS (THE ONLY EXPECTED SIGNIFICANT EFFECTS ARE EXPERIMENTAL GROUP DIFFERENCES BETWEEN ACTIVE AGENTS AND CONTROLS. THUS THE RANK OF THE MODEL FOR ESTIMATION INCLUDES ONLY THE FIRST FOUR EFFECTS.)
ARBITRARY CONTRAST MATRIX
PRINCIPAL COMPONENTS OF COVARIANCE MATRIX
DISCRIMINANT ANALYSIS
LAST HYPOTHESIS BY SUBTRACTION (UNSPECIFIED INTERACTION)
THE COMPLETE DESIGN IS (2 X 5) WITH THREE EMPTY CELLS. THE RANK OF THE MODEL FOR SIGNIFICANCE TESTING IS 7 (1 D.F. FOR CONSTANT, 1 FOR YEARS OF EXPERIMENTATION, 4 FOR TREATMENTS, AND 1 FOR INTERACTION). RATHER THAN ANALYZING WHICH INTERACTION CONTRAST IS ESTIMABLE, THE RANK IS SPECIFIED AS 6 (EXCLUDING THE INTERACTION). THE INTERACTION MAY BE TESTED FOR SIGNIFICANCE IN ANY CASE, BY PUTTING A FINAL (1) ON THE HYPOTHESIS TEST CARD, JUST AS IF A CONTRAST HAD BEEN CODED. THE PROGRAM CAN COMPUTE THE SUM OF PRODUCTS FOR THIS LAST HYPOTHESIS BY SUBTRACTION OF ALL OTHER EFFECTS FROM THE TOTAL SUM-OF-PRODUCTS MATRIX. IN LARGER DESIGNS THIS AVOIDS LOCATING COMPLEX TERMS WHICH MAY NOT BE ESTIMABLE. IT IS ONLY NECESSARY TO SPECIFY THE CORRESPONDING FINAL DEGREES OF FREEDOM CORRECTLY (1 IN THIS EXAMPLE). THE CONTRASTS INDICATED FOR THE TREATMENT EFFECT DO NOT CONFORM TO THE C, D, H, OR P OPTIONS. AN ARBITRARY CONTRAST MATRIX IS ENTERED TO COMPARE EACH AGENT WITH THE MEAN OF THE TWO CONTROLS, AND FINALLY THE TWO CONTROLS WITH EACH OTHER. THE CONSTANT AND ARBITRARY CONTRASTS ARE REPRESENTED AS L0, L1, ..., L4. THE ORDER OF CONTRASTS IS THE CONSTANT, 3 ACTIVE-AGENT CONTRASTS, THE COMPARISON OF THE TWO CONTROLS (REALLY ANOTHER TREATMENT EFFECT, L4), AND THE COMPARISON OF THE TWO YEARS OF EXPERIMENTATION. IF AN INTERACTION EFFECT WERE CODED, IT WOULD COME LAST IN THE ORDER IN WHICH CONTRASTS ARE ENTERED. THE HYPOTHESIS TEST CARD IS -1, 3, 1, 1, 1. -1 BYPASSES TESTS OF THE CONSTANT TERM. THE THREE AGENT CONTRASTS ARE TESTED SIMULTANEOUSLY. THE NEXT TWO EFFECTS (CONTROLS AND YEARS) ARE EACH TESTED SEPARATELY. THE REMAINING ONE DEGREE OF FREEDOM BETWEEN GROUPS (INTERACTION) IS TESTED ALONE. (A SKETCH OF THE SUBTRACTION IDEA FOLLOWS.)
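Obtaining the last hypothesis by subtraction, as just described, requires no additional estimation: the interaction sum of products is whatever remains of the total variation after the tabled effects and the error are removed. A one-function sketch of the idea (numpy arrays assumed):

    def scp_by_subtraction(total_scp, effect_scps, error_scp):
        # SSCP for the final (here, interaction) hypothesis: subtract every
        # earlier effect's SSCP and the error SSCP from the total SSCP.
        # Only the final degrees of freedom (1 in this example) must be
        # supplied on the hypothesis test card.
        return total_scp - sum(effect_scps) - error_scp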
ONE ALTE~NATE ORDER OF EFFECTS IS INDICATED, WITH THE YEARS CONTRAST PRECEDING TREATMENT EFFECTS, THE OUTPUT WILL ONLY BE USED IF CYEARSI IS SIGNIFICANT, AND IT .IS NECESSARY TO TEST TREATMENT EFFECTS ELIMINATING YEAR DIFFERENCES. THE HYPOTHESIS TEST CARD IS -2, 3, 1. FOR THE ALTERNATE ORDER. -2 BYPASSES TESTS OF THE CONSTANT AND YEARS EFFECT IN THIS ORDER. THE THREE ACTIVE-AGENT CONTRASTS ARE TESTED TOGETHER, FOLLOWED BY THE SINGLE COMPARISON OF THE CONTROL GROUPS. THE RANK OF THE HODoL FOR ESTIMATION IS 4, SINCE ONLY THE FIRST FOUR EFFECTS (CONSTANT AND ACTIVE-AGENT CONTRASTS) PROVE SIGNIFICANT. MEANS ARE ESTIMATED FROM THE RANK-4 HODEL, AND RESIDUALS ARE COMPUTED. THESE ARE COHBINED ACROSS ALL SUBJECTS, ANO ACROSS YEARS OF EXPERIMENTATION TO YIELD MEANS FOR EACH TREATMENT GROUP. ALSO HEANS ARE ESTIHATEO FOR ALL 10 SUBCLASSES, INCLUDING THOSE WITH NO DATA. THE FIRST ANALYSIS YIElDS AlL HYPOTHESIS TESTS, PLUS PRINCIPAL COMPONENTS OF THE WITHIN-CELL VARIANCE-COVARIANCE HATRIX AHONG THE TOOTH HEASURESo IN THE SECOND ANALYSIS, THE TOOTH OATA ARE REORDERED FROM THE CE~TER OF THE HOUTH OUTWARD, THE CSTEP-DOWNI TESTS RFVEAL WHETHER THE NON-CENTRAL TEETH HAKE ANY CONTRIBUTION TO BETWEEN-GROUP VARIATION, OR WHETHER THE CENTRAL TEETH COMPRISE ALL SIGNIFICANT HEAN DIFFERENCES. FINISH (10X6F2.0 I VARIABLE FORHAT FOR DATA R-CAN R-L;I,R-C.I.L-C.I.L-L.I,L-CAN VARIABLE LABELS 8 1 1 20883 1 52 2 2 1 2 2 1 2173~ 1 52 0 Q 0 2 1 0 22021 1 52 0 0 4 .. 0 0 22242 1 52 2 2 2 3 Z 2 22314 1 52 2 7 7 6 5 2 22489 1 52 0 1 3 4 1 0 22828 1 52 0 1 6 4 2 0 2~852 1 52 .o 5 7 6 5 2 g 1 2 17114 1 54 3 3 5 4 2 1 19260 1 54 2 2 1 3 2 2 20625 1 54 0 0 0 1 1 1 20752 1 54 1 3 3 1 1 1 20849 1 54 4 5 5 6 0 4 20997 1 54 0 1 2 0 1 3 223 93 1 54 1 1 2 1 0 1 22735 1 54 1 1 3 6 5 1 22808 1 ~4 0 0 7 8 5 0 1 3 7 15900 1 56 1 1 1 3 4 1 18588 1 56 0 1 2 0 1 1 20401 1 56 1 1 3 5 0 1 20739 1 56 0 1 1 2 1 0 20741 1 56 0 1 0 0 0 0 22346 1 56 0 Q 0 0 u 0 22907 1 56 1 1 2 1 1 0 5 1 4 12063 1 58 1 2 1 1 1 0 19427 1 58 0 0 1 2 1 0 20844 1 58 2 0 2 1 1 0 21254 1 58 2 1 3 0 0 0 22080 1 58 0 1 3 2 0 0 28 2 1 17559 2 67 2 2 5 3 19293 2 67 0 0 1 0
00
19437 2 20519 2 20658 2 21049 2 21098 2 21372 2 21&55 2 21825 2 21910 2 22043 2 22069 2 22164 2 22319 2 22337 2 22426 2 22483 2 22700 2 22801 2 22835 2 22994 2 23166 2 23174 2 23230 2 23290 2 23351 2 23463 2 26 13753 2 17224 2 19426 2 20674 2 21066 ? 21357 2 21451 2 22041 2 22165 2 22166 2 22?14 2 22392 2 22540 2 22584 2 22622 2 22666 2 22855 2 22892 2 2292& 2 22991 2 23190 2 23449 2 23457 2 23508 2 23564 2 23615 2 24 14892 2 18821 2 1q982 2 2 0261 2 20264 2 205"7 2 20655 2
&7 &7 &7 &7
1 0 q 1 1 0 0 1
67 67 67 67 &7 o 67 1 &7 0 67 0 67 0 67 0 67 0 57 0 57 0 67 0 67 0 67 0 67 1 67 0 67 1 67 0 67 0 67 1 2._ 75 0 75 3 75 0 75 0 75 0 75 1 75 0 75 0 75 0 75 1 75 0 75 0 75 0 75 0 75 0 75 0 75 0 75 0 75 1 75 0 75 0 75 0 75 0 75 0 75 0 75 0 2 93 3 93 3 qJ 0 93 1 93 0 93 o 93 0
1 1 8 1 6 0 0
0 0 2 0 711 3 4 8 8 1 0 0 0 7 5 5
0 0 1 1
7 7 2 2 0 0 2
1 0 0 0 0
0 5 9 1 1 1 5 0 1 2 0 5 2 8 0 6 0
1 4 5 0 1 0 0 0 0 2 0 0 0 7 0
0 1 1 0 1 0 0 0 0 0 0 0 0 0 0
0 3 0 3 2 0 0 1 0 2 4 0 0 0 0 1 4 0 0 5 3 0 0 3 0 3
0 2 0 3 0 0 0 0 0 1 2
3 5 8 3 0
1 4 2 3 4 1 0 0
o oa o o 0 3 1 4 5 5 0 1 0 1 1 0 0 4 0 0 0 1 0 2 0 1 3 3 2 5 2 6 0 0 4 7 0 1 5_ 0 0 1 3 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 2 3 0 0 0 0 1 0 0 0 0 1 1 2 0 0 1 1 0 2 1 2 0 0 0 0 0 2 0 0 3 2 3 1 5 3 5 1 6 0 " 0 3
7 7 u 0
~
1 0 0 2 3 0 0 3 0 0 0 0 0 0
oooo
1 2 2
0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
?
"'"'
21044 218o9 22348 22433 22686 22763 22995 23058 23082 23119 23164 23184 23330 23478 23495 23594 23650 6
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
93 93 93 93 93 93 93 93 93 93 93 93 9~
93 93 93 93
1 0 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0
5 0 1 0 0 0 1 0 1 2 0 0 u 1 0 1 1
8 0 2 0 1 0 0 1 1 1 0 1 0 3 1 0 0
4
8 0 1 0 1 0 0 0 2 0 0 0 0 4 0 0 4
0 0 1 1 0 0 0 0 1 0 0 1 0 3 1 0 1 1
2 0 0 3 0 0 0 0 0 0 0 1 0 0 0 0 0 1
1
ll,2,1•2.
EXPGRP 15F4.0l -.5 -.5 -.5 -.5
1 1
-.s -.s
1
1 -1 CO,LO, CONST 0 ,1, ACTIVE o, 2t ACT! VE 0,3, ACTIVE o. t,., CONTRL t.o, VfAR 1,6,2, 3,4, 5. 0
-1,3 .. 1t1t1. -2, 3,1. 0
BLANK CARD EST SPEC CARD HEANS KEY FACTOR NAHE ARBITRARY CONTRASTS VARIABLE FORMAT-ARBITRARY CONTRASTS ARBITRARY CONTRASTS 3
1
3,4,2:,5,1,6. •t.,3t1t1t1.
ONE-CONTROLS TWO• CONTROLS THREE-CONTROLS THO CONTROL GROUPS FIRST-SECOND YEAR CONTRAST REORDERING KEY ANALYSIS SELECT CARD 2 1 1 HYPTEST•ORIGINAL ORDER HYPTtST•ALTERNATE ORDER ANALV SELECT•TEETH OR VARIABLE SELECT KEY • CENTER TO OUTSIDE TEETH HYPQTHESIS TEST CARD FOR STEP•OOWN ANALV CONTINUE
" "' I
0
* * * * * * * *  M U L T I V A R I A N C E  * * * * * * * *
UNIVARIATE AND MULTIVARIATE ANALYSIS OF VARIANCE, COVARIANCE AND REGRESSION
MARCH 1972          VERSION 5
PROBLEM
TWO-WAY FIXED EFFECTS MULTIVARIATE ANALYSIS OF VARIANCE WITH NULL SUBCLASSES          PAGE 1

PROBLEM 4--TWO FACTOR FIXED EFFECTS MULTIVARIATE ANALYSIS OF VARIANCE WITH THREE NULL SUBCLASSES

THE DATA FOR THIS EXAMPLE WERE COLLECTED UNDER THE DIRECTION OF DR. STUART L. FISCHMAN, SCHOOL OF DENTISTRY, STATE UNIVERSITY OF NEW YORK AT BUFFALO. THE STUDY WAS DESIGNED TO TEST THE EFFECTIVENESS OF TOOTHPASTE AND MOUTH RINSE ADDITIVES IN REDUCING THE DEGREE OF DENTAL CALCULUS FORMATION. IN ITS ENTIRETY, THE STUDY WAS CONDUCTED OVER A TWO YEAR PERIOD. THE SUBJECTS WERE MALE PRISONERS IN A NEW YORK STATE PENITENTIARY. DURING THE FIRST YEAR (SUBJECTS DESIGNATED -OLD-), FOUR TREATMENT GROUPS WERE INVOLVED.
   GROUP 52--CONTROL, NO ANTI-CALCULUS AGENT
   GROUP 54--SUPPOSED ACTIVE AGENT
   GROUP 56--ANTI-CALCULUS AGENT 1 IN TOOTHPASTE AND RINSE
   GROUP 58--ANTI-CALCULUS AGENT 2 IN TOOTHPASTE AND RINSE
IT WAS DISCOVERED AT THE END OF THE FIRST YEAR OF THE STUDY THAT THE AGENT INVOLVED FOR GROUP 54 WAS CHEMICALLY INERT, AND THE MATERIAL WAS NOT CONTINUED FOR A SECOND TRIAL. FOR ANALYSIS PURPOSES, GROUP 54 IS CONSIDERED TO BE AN ADDITIONAL CONTROL GROUP. AGENT 58 WAS ALSO DISCONTINUED. DURING THE SECOND YEAR OF THE STUDY, THREE EXPERIMENTAL GROUPS WERE INVOLVED.
   GROUP 67--CONTROL (SAME AS 52 FROM FIRST YEAR)
   GROUP 75--ANTI-CALCULUS AGENT 3 IN TOOTHPASTE AND RINSE (NEW AGENT)
   GROUP 93--ANTI-CALCULUS AGENT 1 IN TOOTHPASTE AND RINSE (SAME AS 56 FROM FIRST YEAR)
EACH SUBJECT WAS GIVEN A COMPLETE DENTAL PROPHYLAXIS AT THE BEGINNING OF THE STUDY. FOLLOWING THE CLEANSING, HE WAS INSTRUCTED TO BRUSH AND RINSE DAILY FOR A THREE MONTH PERIOD, KEEPING A RECORD OF WHENEVER HE HAD DONE SO. AT THE END OF THE THREE MONTH PERIOD, MEASUREMENTS WERE TAKEN OF THE AMOUNT OF CALCULUS FORMATION ON EACH OF THE SIX LOWER FRONT TEETH. AGAIN THE TEETH WERE COMPLETELY CLEANSED AND THE PROCEDURE WAS REPEATED, FOR FOUR PERIODS IN ALL. THE DATA CONSIDERED IN THIS EXAMPLE ARE THE MEASURES OF CALCULUS FORMATION FOR EACH OF THE SIX TEETH, AT THE END OF THE THIRD PERIOD. ARBITRARY CONTRASTS ARE EMPLOYED TO COMPARE THE PRESUMED CONTROL GROUPS WITH EACH OTHER, AND TO COMPARE THE CONTROL GROUPS WITH THE ACTIVE AGENTS.

THE DATA CARDS ARE PUNCHED AS FOLLOWS
   CARD COLUMN
   1-5     PRISONER NUMBER
   7       1=FIRST YEAR SUBJECT   2=SECOND YEAR SUBJECT
   9-10    EXPERIMENTAL GROUP NUMBER. FOR ANALYSIS, GROUPS WERE CODED AS
           1=52,67   2=54   3=56,93   4=58   5=75. SINCE GROUPS ARE NOT
           NUMBERED SEQUENTIALLY, OR THE SAME FROM ONE YEAR TO THE NEXT,
           DATA FORM 2 OR 3 IS NECESSARY.
   11-12   CALCULUS SCORE, RIGHT CANINE (R-CAN)
   13-14   CALCULUS SCORE, RIGHT LATERAL INCISOR (R-L.I.)
   15-16   CALCULUS SCORE, RIGHT CENTRAL INCISOR (R-C.I.)
   17-18   CALCULUS SCORE, LEFT CENTRAL INCISOR (L-C.I.)
   19-20   CALCULUS SCORE, LEFT LATERAL INCISOR (L-L.I.)
   21-22   CALCULUS SCORE, LEFT CANINE (L-CAN)

THIS RUN USES DATA FORM 2, OPTIONAL PRINTED OUTPUT, MISSING CELLS (3 LESS D.F. FOR INTERACTION), MEANS KEY FOR ESTIMATED AND OBSERVED MEANS (INCLUDING ESTIMATING MEANS FOR NULL SUBCLASSES), ALTERNATIVE CONTRAST ORDER, ESTIMATION OF EFFECTS (THE ONLY EXPECTED SIGNIFICANT EFFECTS ARE EXPERIMENTAL GROUP DIFFERENCES BETWEEN ACTIVE AGENTS AND CONTROLS. THUS THE RANK OF THE MODEL FOR ESTIMATION INCLUDES ONLY THE FIRST FOUR EFFECTS), ARBITRARY CONTRAST MATRIX, PRINCIPAL COMPONENTS OF COVARIANCE MATRIX, DISCRIMINANT ANALYSIS, AND LAST HYPOTHESIS BY SUBTRACTION (UNSPECIFIED INTERACTION). THE COMPLETE DESIGN IS (2 X 5) WITH THREE EMPTY CELLS. THE RANK OF THE MODEL FOR SIGNIFICANCE TESTING IS 7 (1 D.F. FOR THE CONSTANT, 1 FOR YEARS OF EXPERIMENTATION, 4 FOR TREATMENTS, AND 1 FOR INTERACTION). RATHER THAN
ANALYZING WHICH INTERACTION CONTRAST IS ESTIMABLE, THE RANK IS SPECIFIED AS 6 (EXCLUDING THE INTERACTION). THE INTERACTION MAY BE TESTED FOR SIGNIFICANCE IN ANY CASE, BY PUTTING A FINAL (1) ON THE HYPOTHESIS TEST CARD, JUST AS IF A CONTRAST HAD BEEN CODED. THE PROGRAM CAN COMPUTE THE SUM OF PRODUCTS FOR THIS LAST HYPOTHESIS BY SUBTRACTION OF ALL OTHER EFFECTS FROM THE TOTAL SUM-OF-PRODUCTS MATRIX. IN LARGER DESIGNS THIS AVOIDS LOCATING COMPLEX TERMS WHICH MAY NOT BE ESTIMABLE. IT IS ONLY NECESSARY TO SPECIFY THE CORRESPONDING FINAL DEGREES OF FREEDOM CORRECTLY (1 IN THIS EXAMPLE). THE CONTRASTS INDICATED FOR THE TREATMENT EFFECT DO NOT CONFORM TO THE C, D, H, P OPTIONS. AN ARBITRARY CONTRAST MATRIX IS ENTERED TO COMPARE EACH AGENT WITH THE MEAN OF THE TWO CONTROLS, AND FINALLY THE TWO CONTROLS WITH EACH OTHER. THE CONSTANT AND ARBITRARY CONTRASTS ARE REPRESENTED AS L0, L1, ..., L4.
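Written out against the five treatment-group means (level 1 = control 52/67, 2 = control 54, 3 = agent 1, 4 = agent 2, 5 = agent 3), the contrast cards shown above define (a reading of the deck, not program output):

$$L_1 = \mu_3 - \tfrac{1}{2}(\mu_1+\mu_2),\qquad L_2 = \mu_4 - \tfrac{1}{2}(\mu_1+\mu_2),\qquad L_3 = \mu_5 - \tfrac{1}{2}(\mu_1+\mu_2),\qquad L_4 = \mu_1 - \mu_2$$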
THE ORDER OF CONTRASTS IS THE CONSTANT, 3 ACTIVE-AGENT CONTRASTS, THE COMPARISON OF THE TWO CONTROLS (REALLY ANOTHER TREATMENT EFFECT, L4), AND THE COMPARISON OF THE TWO YEARS OF EXPERIMENTATION. IF AN INTERACTION EFFECT WERE CODED, IT WOULD COME LAST IN THE ORDER IN WHICH CONTRASTS ARE ENTERED. THE HYPOTHESIS TEST CARD IS -1, 3, 1, 1, 1. -1 BYPASSES TESTS OF THE CONSTANT TERM. THE THREE AGENT CONTRASTS ARE TESTED SIMULTANEOUSLY. THE NEXT TWO EFFECTS (CONTROLS AND YEARS) ARE EACH TESTED SEPARATELY. THE REMAINING ONE DEGREE OF FREEDOM BETWEEN GROUPS (INTERACTION) IS TESTED ALONE. ONE ALTERNATE ORDER OF EFFECTS IS INDICATED, WITH THE YEARS CONTRAST PRECEDING TREATMENT EFFECTS. THE OUTPUT WILL ONLY BE USED IF (YEARS) IS SIGNIFICANT, AND IT IS NECESSARY TO TEST TREATMENT EFFECTS ELIMINATING YEAR DIFFERENCES. THE HYPOTHESIS TEST CARD IS -2, 3, 1. FOR THE ALTERNATE ORDER. -2 BYPASSES TESTS OF THE CONSTANT AND YEARS EFFECT IN THIS ORDER. THE THREE ACTIVE-AGENT CONTRASTS ARE TESTED TOGETHER, FOLLOWED BY THE SINGLE COMPARISON OF THE CONTROL GROUPS.
THE RANK OF THE MODEL FOR ESTIMATION IS 4, SINCE ONLY THE FIRST FOUR EFFECTS (CONSTANT AND ACTIVE-AGENT CONTRASTS) PROVE SIGNIFICANT. MEANS ARE ESTIMATED FROM THE RANK-4 MODEL, AND RESIDUALS ARE COMPUTED. THESE ARE COMBINED ACROSS ALL SUBJECTS, AND ACROSS YEARS OF EXPERIMENTATION, TO YIELD MEANS FOR EACH TREATMENT GROUP. ALSO, MEANS ARE ESTIMATED FOR ALL 10 SUBCLASSES, INCLUDING THOSE WITH NO DATA.
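In matrix terms (a restatement for clarity, with notation assumed rather than taken from the listing), the rank-4 estimated means are

$$\hat{\mathbf{Y}} = \mathbf{K}_4\,\hat{\boldsymbol{\Theta}}_4,\qquad \hat{\boldsymbol{\Theta}}_4 = \bigl[\hat{L}_0,\ \hat{L}_1,\ \hat{L}_2,\ \hat{L}_3\bigr]'$$

where K4 holds the first four columns of the model basis. Because the YEAR and CONTROL effects are excluded, the two years and the two control groups necessarily receive identical estimated means, as the printed estimated-cell-means table below confirms.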
""
THE FIRST ANALYSIS YIELDS ALL HYPOTHESIS TESTS, PLUS PRINCIPAL COMPONENTS OF THE WITHIN-CELL VARIANCE-COVARIANCE MATRIX AMONG THE TOOTH MEASURES. IN THE SECOND ANALYSIS, THE TOOTH DATA ARE REORDERED FROM THE CENTER OF THE MOUTH OUTWARD. THE (STEP-DOWN) TESTS REVEAL WHETHER THE NON-CENTRAL TEETH MAKE ANY CONTRIBUTION TO BETWEEN-GROUP VARIATION, OR WHETHER THE CENTRAL TEETH COMPRISE ALL SIGNIFICANT MEAN DIFFERENCES.
INPUT PARAMETERS

NUMBER OF VARIABLES IN INPUT VECTORS =  6
NUMBER OF FACTORS IN DESIGN =  2
NUMBER OF LEVELS OF FACTOR 1 (OLDNEW) =  2
NUMBER OF LEVELS OF FACTOR 2 (EXPGRP) =  5
INPUT IS FROM CARDS. DATA OPTION 2.
MINIMAL PAGE SPACING WILL BE USED
ADDITIONAL OUTPUT WILL BE PRINTED
DEBUG OUTPUT WILL BE PRINTED
FORMAT OF DATA  (10X6F2.0)                VARIABLE FORMAT FOR DATA

FIRST OBSERVATION, SUBJECT 1, CELL (OLDNEW 1.0000, EXPGRP 1.0000)
   2.0000  2.0000  1.0000  2.0000  2.0000  1.0000
SUBCLASS COVARIANCE AND CORRELATION MATRICES
[FOR EACH OF THE SEVEN NON-EMPTY CELLS (OLDNEW X EXPGRP), THE 6 X 6 LOWER-TRIANGULAR VARIANCE-COVARIANCE MATRIX AND THE CORRESPONDING CORRELATION MATRIX AMONG R-CAN, R-L.I., R-C.I., L-C.I., L-L.I., AND L-CAN ARE PRINTED.]
CELL SUMS - ALL GROUPS - BEFORE TRANSFORMATION MATRIX
[SUMS OF THE SIX CALCULUS MEASURES FOR EACH OF THE 10 SUBCLASSES; THE THREE EMPTY CELLS (5, 7, AND 9) SHOW SUMS OF ZERO.]
CELL IDENTIFICATION AND FREQUENCIES
CELL    FACTOR LEVELS      N
        OLDNEW   EXPGRP
  1       1        1        8
  2       1        2        9
  3       1        3        7
  4       1        4        5
  5       1        5        0   EMPTY
  6       2        1       28
  7       2        2        0   EMPTY
  8       2        3       24
  9       2        4        0   EMPTY
 10       2        5       26
TOTAL N= 107        3 NULL SUBCLASS(ES).
TOTAL SUM OF CROSS-PRODUCTS
[6 X 6 SYMMETRIC MATRIX OF RAW SUMS OF SQUARES AND CROSS-PRODUCTS OF THE SIX MEASURES OVER ALL 107 OBSERVATIONS.]
OBSERVED CELL MEANS --- ROWS ARE CELLS - COLUMNS ARE VARIABLES
[MEANS OF THE SIX MEASURES FOR EACH OF THE SEVEN NON-EMPTY CELLS.]

OBSERVED CELL STD DEVS --- ROWS ARE CELLS - COLUMNS ARE VARIABLES
[STANDARD DEVIATIONS OF THE SIX MEASURES FOR EACH OF THE SEVEN NON-EMPTY CELLS.]
OBSERVED COMBINED MEANS

OVERALL, N = 107
   R-CAN  = .5981    R-L.I. = 1.103    R-C.I. = 2.084
   L-C.I. = 2.168    L-L.I. = 1.262    L-CAN  = .6075

FACTOR 2 (EXPGRP)
LEVEL 1, N = 36
   R-CAN  = .6944    R-L.I. = 1.722    R-C.I. = 2.944
   L-C.I. = 3.056    L-L.I. = 1.722    L-CAN  = .7500
LEVEL 2, N = 9
   R-CAN  = 1.333    R-L.I. = 1.778    R-C.I. = 3.111
   L-C.I. = 3.333    L-L.I. = 2.556    L-CAN  = 1.556
LEVEL 3, N = 31
   R-CAN  = .5161    R-L.I. = .8065    R-C.I. = 1.903
   L-C.I. = 1.677    L-L.I. = .9677    L-CAN  = .6129
LEVEL 4, N = 5
   R-CAN  = 1.000    R-L.I. = .8000    R-C.I. = 2.000
   L-C.I. = 1.200    L-L.I. = .6000    L-CAN  = 0.
LEVEL 5, N = 26
   R-CAN  = .2308    R-L.I. = .4231    R-C.I. = .7692
   L-C.I. = 1.308    L-L.I. = .6538    L-CAN  = .1923

MATRIX OF OBSERVED MEANS FOR ALL CELLS SUPPRESSED. PRINTED ON PRIOR PAGE
ESTIMATION PARAMETERS

RANK OF THE BASIS = RANK OF MODEL FOR SIGNIFICANCE TESTING =  6
RANK OF THE MODEL TO BE ESTIMATED IS  4
ERROR TERM TO BE USED IS (WITHIN CELLS)
NUMBER OF ORDERS OF THE BASIS VECTORS OTHER THAN THE FIRST IS  1
NUMBER OF FACTORS WITH ARBITRARY CONTRASTS IS  1
ESTIMATED CELL MEANS, RESIDUALS, AND RESIDUALS IN FORM OF T-STATISTICS WILL BE PRINTED
VARIANCE-COVARIANCE FACTORS AND CORRELATIONS AMONG ESTIMATES WILL BE PRINTED
ESTIMATED COMBINED MEANS WILL BE PRINTED
OPTIONAL CONTRAST MATRIX--ROWS ARE CONTRASTS, COLUMNS SUBCLASSES
        1          2          3          4          5
   -.500000   -.500000   1.000000  -0.000000  -0.000000
   -.500000   -.500000  -0.000000   1.000000  -0.000000
   -.500000   -.500000  -0.000000  -0.000000   1.000000
   1.000000  -1.000000   0.000000   0.000000   0.000000

BASIS MATRIX FOR OPTIONAL CONTRASTS, BY COLUMNS
        1          2          3          4
   -.200000   -.200000   -.200000    .500000
   -.200000   -.200000   -.200000   -.500000
    .800000   -.200000   -.200000    .000000
   -.200000    .800000   -.200000    .000000
   -.200000   -.200000    .800000    .000000
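The basis columns are not the contrast vectors themselves but their dual: collecting the contrast rows (together with the unit row for the constant) into a matrix L and the basis columns into K, the two satisfy (a property implied by the listing, stated here for clarity)

$$\mathbf{L}\,\mathbf{K} = \mathbf{I}$$

For example, the row L1 = (-.5, -.5, 1, 0, 0) against the first basis column (-.2, -.2, .8, -.2, -.2) gives .1 + .1 + .8 = 1, while every other contrast row gives 0 against that column.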
FACTOR (EXPGRP) SYMBOLIC CONTRAST VECTORS
[FOR EACH EFFECT IN TURN--C0,L0 CONST; 0,1 ACTIVE (ONE-CONTROLS); 0,2 ACTIVE (TWO-CONTROLS); 0,3 ACTIVE (THREE-CONTROLS); 0,4 CONTRL (TWO CONTROL GROUPS); 1,0 YEAR (FIRST-SECOND YEAR)--THE PROGRAM PRINTS THE BASIS VECTOR (NOT THE CONTRAST VECTOR), THE CORRESPONDING ROW OF THE TRIANGULAR FACTOR T OF THE BASIS FROM GRAM-SCHMIDT, AND THE ORTHONORMALIZED BASIS VECTOR.]

MATRIX OF ORTHOGONAL ESTIMATES (U) -- EFFECTS X VARIABLES
[6 X 6 MATRIX OF ORTHOGONAL ESTIMATES FOR THE SIX EFFECTS ON THE SIX MEASURES.]
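The Gram-Schmidt quantities printed above fit together as a QR-type factorization. In symbols (a restatement for clarity; the notation is assumed, not taken from the listing),

$$\mathbf{K} = \mathbf{Q}\,\mathbf{T},\qquad \mathbf{U} = \mathbf{Q}'\mathbf{Y}^{*}$$

where K is the model basis with one column per effect, Q is the orthonormalized basis, T is the triangular factor, and U is the matrix of orthogonal estimates. Because the columns of Q are orthonormal, each row of U carries the variation attributable to one effect eliminating all effects ordered before it, which is what makes the ordered hypothesis tests below possible.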
ERROR SUM OF CROSS-PRODUCTS
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN     137.8951  101.9060   81.0394  107.2312   73.5948   91.4057
R-L.I.              261.8743  217.5345  260.3451  166.6676  100.5791
R-C.I.                        423.9805  431.2033  242.5223  111.8721
L-C.I.                                  619.1361  348.9609  129.8723
L-L.I.                                            308.6223  121.4411
L-CAN                                                       157.8976

ERROR VARIANCE-COVARIANCE MATRIX
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN     1.378951  1.019060   .810394  1.072312   .735948   .914057
R-L.I.              2.618743  2.175345  2.603451  1.666676  1.005791
R-C.I.                        4.239805  4.312033  2.425223  1.118721
L-C.I.                                  6.191361  3.489609  1.298723
L-L.I.                                            3.086223  1.214411
L-CAN                                                       1.578976

ERROR CORRELATION MATRIX
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN     1.000000   .536275   .335157   .366990   .356746   .619457
R-L.I.              1.000000   .652843   .646562   .586261   .494622
R-C.I.                        1.000000   .841620   .670448   .432376
L-C.I.                                  1.000000   .798307   .415370
L-L.I.                                            1.000000   .550129
L-CAN                                                       1.000000

VARIABLE     VARIANCE (ERROR MEAN SQUARES)     STANDARD DEVIATION
1 R-CAN            1.378951                        1.1743
2 R-L.I.           2.618743                        1.6183
3 R-C.I.           4.239805                        2.0591
4 L-C.I.           6.191361                        2.4882
5 L-L.I.           3.086223                        1.7568
6 L-CAN            1.578976                        1.2566
D.F.= 100
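The error variance-covariance matrix is simply the within-cells sum of cross-products divided by its degrees of freedom; with 107 subjects spread over 7 non-empty subclasses,

$$\mathbf{V}_E = \frac{1}{n_e}\,\mathbf{S}_E,\qquad n_e = N - J = 107 - 7 = 100$$

which is why every entry of the second matrix above is 1/100 of the corresponding entry of the first.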
ERROR TERM FOR ANALYSIS OF VARIANCE (WITHIN CELLS)

INVERSE OF T - TRIANGULAR FACTOR OF BASIS FROM GRAM-SCHMIDT
[4 X 4 TRIANGULAR MATRIX FOR THE RETAINED EFFECTS (CONST AND THE THREE ACTIVE CONTRASTS); DIAGONAL ELEMENTS .096674, .213110, .462693, .246341.]
LEAST SQUARE ESTIMATES OF EFFECTS -- EFFECTS X VARIABLES
            CONST      ACTIVE     ACTIVE     ACTIVE
R-CAN      .678269   -.306093    .177778   -.591453
R-L.I.    1.099239   -.926882   -.933333  -1.310256
R-C.I.    2.125602  -1.074552   -.977778  -2.208547
L-C.I.    2.081467  -1.433692  -1.911111  -1.803419
L-L.I.    1.199873   -.921147  -1.288889  -1.235043
L-CAN      .525487   -.298208   -.911111   -.718803
ESTIMATES OF EFFECTS IN STANDARD DEVIATION UNITS -- EFFECTS X VARIABLES
            CONST      ACTIVE     ACTIVE     ACTIVE
R-CAN      .577600   -.260661    .151392   -.503670
R-L.I.     .679275   -.572767   -.576753   -.809673
R-C.I.    1.032308   -.521861   -.474862  -1.072590
L-C.I.     .836520   -.576186   -.768056   -.724776
L-L.I.     .683002   -.524343   -.733672   -.703021
L-CAN      .418190   -.237318   -.725076   -.572035
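Each standardized estimate is the least-squares estimate divided by the error standard deviation of its variable,

$$\tilde{\theta}_{jk} = \hat{\theta}_{jk}/s_k,\qquad \text{e.g.}\quad .678269/1.1743 = .5776$$

so effects on the six teeth can be compared despite their different score variances.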
STANDARD ERRORS OF LEAST-SQUARES ESTIMATES -- EFFECTS BY VARIABLES
            CONST      ACTIVE     ACTIVE     ACTIVE
R-CAN      .140838    .274091    .553565    .289275
R-L.I.     .194085    .377717    .762852    .398642
R-C.I.     .246955    .480610    .970659    .507235
L-C.I.     .298427    .580781   1.172970    .612956
L-L.I.     .210697    .410047    .828147    .432763
L-CAN      .150707    .293297    .592354    .309545
VARIANCE-COVARIANCE FACTORS OF ESTIMATES
[4 X 4 SYMMETRIC MATRIX FOR CONST AND THE THREE ACTIVE CONTRASTS; DIAGONAL ELEMENTS .014384, .054480, .222222, .060684. EACH STANDARD ERROR ABOVE IS THE SQUARE ROOT OF THE CORRESPONDING DIAGONAL FACTOR TIMES THE ERROR VARIANCE OF THE VARIABLE.]

INTERCORRELATIONS AMONG THE ESTIMATES
[4 X 4 CORRELATION MATRIX AMONG THE CONST AND ACTIVE-CONTRAST ESTIMATES.]
ESTIMATED CELL MEANS, ALL GROUPS - CELLS X VARIABLES
CELL:       1         2         3         4         5         6         7         8         9        10
R-CAN     .822222   .822222   .516129  1.000000   .230769   .822222   .822222   .516129  1.000000   .230769
R-L.I.   1.733333  1.733333   .806452   .800000   .423077  1.733333  1.733333   .806452   .800000   .423077
R-C.I.   2.977778  2.977778  1.903226  2.000000   .769231  2.977778  2.977778  1.903226  2.000000   .769231
L-C.I.   3.111111  3.111111  1.677419  1.200000  1.307692  3.111111  3.111111  1.677419  1.200000  1.307692
L-L.I.   1.888889  1.888889   .967742   .600000   .653846  1.888889  1.888889   .967742   .600000   .653846
L-CAN     .911111   .911111   .612903  -.000000   .192308   .911111   .911111   .612903  -.000000   .192308
MEANS ESTIMATED BY FITTING MODEL OF RANK  4
ESTIMATED COMBINED MEANS BASED ON FITTING A MODEL OF RANK  4            PAGE 7

OVERALL MEANS
   R-CAN  = .6783    R-L.I. = 1.099    R-C.I. = 2.126
   L-C.I. = 2.081    L-L.I. = 1.200    L-CAN  = .5255

FACTOR 2 (EXPGRP)
LEVEL 1
   R-CAN  = .8222    R-L.I. = 1.733    R-C.I. = 2.978
   L-C.I. = 3.111    L-L.I. = 1.889    L-CAN  = .9111
LEVEL 2
   R-CAN  = .8222    R-L.I. = 1.733    R-C.I. = 2.978
   L-C.I. = 3.111    L-L.I. = 1.889    L-CAN  = .9111
LEVEL 3
   R-CAN  = .5161    R-L.I. = .8065    R-C.I. = 1.903
   L-C.I. = 1.677    L-L.I. = .9677    L-CAN  = .6129
LEVEL 4
   R-CAN  = 1.000    R-L.I. = .8000    R-C.I. = 2.000
   L-C.I. = 1.200    L-L.I. = .6000    L-CAN  = -.0000
LEVEL 5
   R-CAN  = .2308    R-L.I. = .4231    R-C.I. = .7692
   L-C.I. = 1.308    L-L.I. = .6538    L-CAN  = .1923
RESIDUALS - FULL CELLS X VARIABLES
[OBSERVED MINUS ESTIMATED MEANS FOR THE SEVEN NON-EMPTY CELLS ON THE SIX MEASURES.]

RESIDUALS IN STD. DEV. UNITS - FULL CELLS X VARIABLES
[EACH RESIDUAL DIVIDED BY THE ERROR STANDARD DEVIATION OF ITS VARIABLE.]

RESIDUALS AS T-STATISTICS - FULL CELLS X VARIABLES
[EACH RESIDUAL DIVIDED BY ITS ESTIMATED STANDARD ERROR.]

D.F.= 100   RESIDUALS ESTIMATED AFTER FITTING MODEL OF RANK  4
ANALYSIS OF VARIANCE                                                    PAGE 8

6 DEPENDENT VARIABLE(S)
   1 R-CAN   2 R-L.I.   3 R-C.I.   4 L-C.I.   5 L-L.I.   6 L-CAN
NUMBER OF ALTERNATE BASIS ORDERS =  1
PRINCIPAL COMPONENTS OF COVARIANCE MATRIX WILL BE PRINTED
DISCRIMINANT ANALYSIS WILL BE PERFORMED FOR EACH BETWEEN-CELL HYPOTHESIS
ERROR SUM OF CROSS-PRODUCTS (SE)
[THE SAME 6 X 6 WITHIN-CELLS MATRIX AS PRINTED ABOVE.]

CHOLESKY FACTOR SE
[6 X 6 TRIANGULAR FACTOR OF THE ERROR SUM OF CROSS-PRODUCTS; DIAGONAL ELEMENTS 11.74268, 13.65876, 15.59312, 13.02170, 10.43046, 8.55163.]

LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS =  2.97638536E+01
PRINCIPAL COMPONENTS -- VARIABLES X COMPONENTS (ROWS X COLS)
[6 X 6 MATRIX OF COMPONENT WEIGHTS.]

CORRELATIONS OF ORIGINAL MEASURES AND PRINCIPAL COMPONENTS
[6 X 6 MATRIX OF CORRELATIONS BETWEEN THE SIX MEASURES AND THE SIX COMPONENTS.]

VECTOR                     1          2          3          4          5          6
EIGENVALUE            13.824137   2.024215   1.281455    .877839    .707504    .378910
PER CENT OF VARIATION   72.4002    10.6013     6.7113     4.5974     3.7054     1.9844
COMPUTED FROM COVARIANCE MATRIX
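The eigenvalues sum to the trace of the within-cell covariance matrix, and the per cent of variation is each root's share of that sum. A minimal sketch of the same computation (this is an illustration, not the MULTIVARIANCE routine):

    import numpy as np

    # Within-cell variance-covariance matrix, entered from its lower triangle.
    tri = [
        [1.378951],
        [1.019060, 2.618743],
        [0.810394, 2.175345, 4.239805],
        [1.072312, 2.603451, 4.312033, 6.191361],
        [0.735948, 1.666676, 2.425223, 3.489609, 3.086223],
        [0.914057, 1.005791, 1.118721, 1.298723, 1.214411, 1.578976],
    ]
    V = np.zeros((6, 6))
    for i, row in enumerate(tri):
        V[i, : i + 1] = row
    V = V + np.tril(V, -1).T                 # symmetrize

    roots = np.linalg.eigh(V)[0][::-1]       # eigenvalues, largest first
    print(roots)                             # leading root approx. 13.82
    print(100 * roots / roots.sum())         # per cent of variation, approx. 72.4, ...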
HYPOTHESIS 1     1 DEGREE(S) OF FREEDOM                                 PAGE 9
C0,L0,   CONST

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
[6 X 6 SUM-OF-PRODUCTS MATRIX FOR THE CONSTANT TERM.]
TESTS OF HYPOTHESIS BEING SKIPPED
HYPOTHESIS 2     3 DEGREE(S) OF FREEDOM                                 PAGE 10
0,1,   ACTIVE    ONE-CONTROLS
0,2,   ACTIVE    TWO-CONTROLS
0,3,   ACTIVE    THREE-CONTROLS

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN      6.78453  12.99558  21.86160  17.02962  11.54817   5.79290
R-L.I.              33.08430  50.38362  47.93630  32.24068  16.82239
R-C.I.                        81.94015  70.49646  47.92967  26.62904
L-C.I.                                  71.41486  47.88751  25.03041
L-L.I.                                            32.17610  17.09143
L-CAN                                                       10.47627

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  3.00856870E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  1.8022
D.F.= 18 AND 269.1859                                     P LESS THAN  .0251
(LIKELIHOOD RATIO  7.24818916E-01    LOG = -3.21833427E-01)
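The multivariate F follows from the two printed log-determinants through Wilks' likelihood ratio and Rao's F approximation. A sketch of that arithmetic (standard formulas, shown for clarity; the listing does not display the program's internal computational route):

    import math

    log_det_E  = 29.7638536      # log |S_E|
    log_det_HE = 30.0856870      # log |S_H + S_E|
    p, q, ne   = 6, 3, 100       # variables, hypothesis d.f., error d.f.

    lam = math.exp(log_det_E - log_det_HE)           # Wilks' lambda = 0.724819
    s   = math.sqrt((p*p*q*q - 4) / (p*p + q*q - 5))
    df1 = p * q
    df2 = (ne + q - (p + q + 1) / 2) * s - p * q / 2 + 1
    F   = (1 - lam**(1/s)) / lam**(1/s) * df2 / df1
    print(round(lam, 6), df1, round(df2, 4), round(F, 4))
    # 0.724819 18 269.1859 1.8022 -- matching the values printed above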
VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-CAN          2.2615            1.6400         .1850         1.6400        .1850
2 R-L.I.        11.0281            4.2112         .0076         3.0320        .0329
3 R-C.I.        27.3134            6.4421         .0005         2.6060        .0561
4 L-C.I.        23.8050            3.6449         .0119         2.1595        .0978
5 L-L.I.        10.7254            3.4752         .0189          .1542        .9268
6 L-CAN          3.4921            2.2116         .0915         1.2821        .2851

STEP-DOWN MEAN SQUARES = (2.2615/1.3790), (5.7138/1.8845), (6.4657/2.4811), (3.7751/1.7481), (.1747/1.1333), (.9870/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  3   DEGREES OF FREEDOM FOR ERROR = 100
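Each step-down F is the ratio of conditional mean squares, the hypothesis and error mean squares for a variable after eliminating all variables ordered before it (a restatement of the printed step-down mean squares):

$$F_k^{\text{(step-down)}} = \frac{MS_{H,\,k\cdot 1\ldots k-1}}{MS_{E,\,k\cdot 1\ldots k-1}},\qquad \text{e.g.}\quad 5.7138/1.8845 = 3.0320$$

so only the first variable's step-down F equals its univariate F.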
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN      2.26151   4.33186   7.28720   5.67654   3.84939   1.93097
R-L.I.              11.02810  16.79454  15.97877  10.74689   5.60746
R-C.I.                        27.31338  23.49882  15.97656   8.87635
L-C.I.                                  23.80495  15.96250   8.34347
L-L.I.                                            10.72537   5.69714
L-CAN                                                        3.49209
DISCRIMINANT ANALYSIS FOR HYPOTHESIS 2

VARIANCE OF CANONICAL VARIATE 1 = .2273   PER CENT OF CANONICAL VARIATION = 65.18
ROY'S CRITERION = .1852   M= 1.0   N= 46.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
VARIABLE          1 R-CAN   2 R-L.I.  3 R-C.I.  4 L-C.I.  5 L-L.I.  6 L-CAN
RAW COEFFICIENT   .195962    .093560   .659359  -.367212   .169330  -.111778
STANDARDIZED      .2301      .1514    1.3577    -.9137     .2975    -.1405

VARIANCE OF CANONICAL VARIATE 2 = .0920   PER CENT OF CANONICAL VARIATION = 26.38
ROY'S CRITERION = .0843   M= 1.5   N= 46.0
--DISCRIMINANT FUNCTION COEFFICIENTS--
RAW COEFFICIENT   .742351   -.309825   .509238  -.526432   .096182  -.570696
STANDARDIZED      .8717     -.5014    1.0486   -1.3099     .1690    -.7172

VARIANCE OF CANONICAL VARIATE 3 = .0294   PER CENT OF CANONICAL VARIATION = 8.44
ROY'S CRITERION = .0286   M= 2.0   N= 45.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
RAW COEFFICIENT  -.555036   -.266784   .333655  -.210402  -.235526   .905761
STANDARDIZED     -.6518     -.4317     .6870    -.5235    -.4138    1.1382

HOTELLING'S TRACE CRITERION =  .3487

BARTLETT'S CHI SQUARE TEST FOR SIGNIFICANCE OF SUCCESSIVE CANONICAL VARIATES
FOR ROOTS 1 THROUGH 3   CHI SQUARE = 31.5397  WITH 18 DEGREES OF FREEDOM   P LESS THAN .0250
   (LIKELIHOOD RATIO  7.24818930E-01   LOG = -3.21833408E-01)
FOR ROOTS 2 THROUGH 3   CHI SQUARE = 11.4662  WITH 10 DEGREES OF FREEDOM   P LESS THAN .3223
   (LIKELIHOOD RATIO  8.89565494E-01   LOG = -1.17022145E-01)
FOR ROOTS 3 THROUGH 3   CHI SQUARE =  2.8425  WITH  4 DEGREES OF FREEDOM   P LESS THAN .5846
   (LIKELIHOOD RATIO  9.71411240E-01   LOG = -2.90053787E-02)

CANONICAL FORM OF LEAST SQUARE ESTIMATES - VARIATES X EFFECTS
   ACTIVE    -.451394    .351937    .307140
   ACTIVE    -.111615   1.329114   -.295497
   ACTIVE   -1.161263    .086623   -.039798
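One common form of Bartlett's statistic, consistent with the values printed above, multiplies the log of the likelihood ratio for the remaining roots by a corrected error degrees of freedom:

$$\chi^2 = -\Bigl[n_e - \tfrac{p-q+1}{2}\Bigr]\ln\Lambda = -(100 - 2)(-.321833) = 31.54$$

with pq = 18 degrees of freedom for the full set of roots; the successive tests drop the leading roots one at a time.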
HYPOTHESIS 3     1 DEGREE(S) OF FREEDOM                                 PAGE 11
0,4,   CONTRL    TWO CONTROL GROUPS

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN     2.938889   .255556   .766667  1.277778  3.833333  3.705556
R-L.I.               .022222   .066667   .111111   .333333   .322222
R-C.I.                         .200000   .333333  1.000000   .966667
L-C.I.                                   .555556  1.666667  1.611111
L-L.I.                                            5.000000  4.833333
L-CAN                                                       4.672222

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.98227769E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  .9610
D.F.=  6 AND 95.0000                                      P LESS THAN  .4560
(LIKELIHOOD RATIO  9.42779095E-01    LOG = -5.89232820E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-CAN          2.9389            2.1312         .1475         2.1312        .1475
2 R-L.I.          .0222             .0085         .9268          .6493        .4224
3 R-C.I.          .2000             .0472         .8286          .0579        .8104
4 L-C.I.          .5556             .0897         .7652          .0138        .9069
5 L-L.I.         5.0000            1.6201         .2061         2.6120        .1094
6 L-CAN          4.6722            2.9590         .0885          .3541        .5533

STEP-DOWN MEAN SQUARES = (2.9389/1.3790), (1.2235/1.8845), (.1437/2.4811), (.0241/1.7481), (2.9601/1.1333), (.2726/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  1   DEGREES OF FREEDOM FOR ERROR = 100

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
[FOR THIS 1-D.F. HYPOTHESIS, IDENTICAL TO THE SUM OF CROSS-PRODUCTS ABOVE.]

DISCRIMINANT ANALYSIS FOR HYPOTHESIS 3
VARIANCE OF CANONICAL VARIATE 1 = .0607   PER CENT OF CANONICAL VARIATION = 100.00
ROY'S CRITERION = .0572   M= 2.0   N= 46.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
VARIABLE          1 R-CAN   2 R-L.I.  3 R-C.I.  4 L-C.I.  5 L-L.I.  6 L-CAN
RAW COEFFICIENT  -.493410    .392416  -.016609   .251156  -.542064  -.297905
STANDARDIZED     -.5794      .6350    -.0342     .6249    -.9523    -.3743
HOTELLING'S TRACE CRITERION =  .0607
BARTLETT'S CHI SQUARE TEST FOR SIGNIFICANCE OF SUCCESSIVE CANONICAL VARIATES
FOR ROOTS 1 THROUGH 1   CHI SQUARE = 5.7156  WITH 6 DEGREES OF FREEDOM   P LESS THAN .4558
   (LIKELIHOOD RATIO  9.42779088E-01   LOG = -5.89232888E-02)
HYPOTHESIS 4     1 DEGREE(S) OF FREEDOM                                 PAGE 12
1,0,   YEAR    FIRST-SECOND YEAR

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN      .002438  -.066232  -.030704  -.113067  -.064365   .004201
R-L.I.              1.799540   .834243  3.072042  1.748809  -.114145
R-C.I.                         .386744  1.424157   .810724  -.052916
L-C.I.                                  5.244363  2.985438  -.194859
L-L.I.                                            1.699508  -.110927
L-CAN                                                        .007240

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.97878113E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  .3839
D.F.=  6 AND 95.0000                                      P LESS THAN  .8877
(LIKELIHOOD RATIO  9.76327065E-01    LOG = -2.39576415E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-CAN           .0024             .0018         .9666          .0018        .9666
2 R-L.I.         1.7995             .6872         .4092         1.0076        .3180
3 R-C.I.          .3867             .0912         .7633          .1052        .7464
4 L-C.I.         5.2444             .8470         .3597         1.1815        .2796
5 L-L.I.         1.6995             .5507         .4598          .0021        .9639
6 L-CAN           .0072             .0046         .9462          .0582        .8099
DEGREES OF FREEDOM FOR HYPOTHESIS =  1   DEGREES OF FREEDOM FOR ERROR = 100

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
[FOR THIS 1-D.F. HYPOTHESIS, IDENTICAL TO THE SUM OF CROSS-PRODUCTS ABOVE.]

DISCRIMINANT ANALYSIS FOR HYPOTHESIS 4
VARIANCE OF CANONICAL VARIATE 1 = .0242   PER CENT OF CANONICAL VARIATION = 100.00
ROY'S CRITERION = .0237   M= 2.0   N= 46.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
VARIABLE          1 R-CAN   2 R-L.I.  3 R-C.I.  4 L-C.I.  5 L-L.I.  6 L-CAN
RAW COEFFICIENT  -.330570    .475620  -.594846   .526041   .039923  -.188132
STANDARDIZED     -.3882      .7697   -1.2248    1.3089     .0701    -.2364
HOTELLING'S TRACE CRITERION =  .0242
BARTLETT'S CHI SQUARE TEST FOR SIGNIFICANCE OF SUCCESSIVE CANONICAL VARIATES
FOR ROOTS 1 THROUGH 1   CHI SQUARE = 2.3239  WITH 6 DEGREES OF FREEDOM   P LESS THAN .8877
   (LIKELIHOOD RATIO  9.76327069E-01   LOG = -2.39576374E-02)
HYPOTHESIS 5     1 DEGREE(S) OF FREEDOM                                 PAGE 13
SUM OF PRODUCTS OBTAINED BY SUBTRACTION

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN      .098625   .327689   .979885   .808087   .340415   .213157
R-L.I.              1.088773  3.255747  2.684934  1.131055   .708231
R-C.I.                        9.735632  8.028736  3.382184  2.117816
L-C.I.                                  6.621100  2.789204  1.746511
L-L.I.                                            1.174979   .735735
L-CAN                                                        .460694

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.97908944E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  .4340
D.F.=  6 AND 95.0000                                      P LESS THAN  .8546
(LIKELIHOOD RATIO  9.73321510E-01    LOG = -2.70408194E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-CAN           .0986             .0715         .7897          .0715        .7897
2 R-L.I.         1.0888             .4158         .5206          .3491        .5560
3 R-C.I.         9.7356            2.2962         .1329         2.0319        .1573
4 L-C.I.         6.6211            1.0694         .3036          .1129        .7377
5 L-L.I.         1.1750             .3807         .5387          .0765        .7827
6 L-CAN           .4607             .2918         .5903          .0308        .8610

STEP-DOWN MEAN SQUARES = (.0986/1.3790), (.6571/1.8845), (5.0413/2.4811), (.1973/1.7481), (.0867/1.1333), (.0237/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  1   DEGREES OF FREEDOM FOR ERROR = 100

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
[IDENTICAL TO THE SUM OF CROSS-PRODUCTS ABOVE.]

DISCRIMINANT ANALYSIS FOR HYPOTHESIS 5
VARIANCE OF CANONICAL VARIATE 1 = .0274   PER CENT OF CANONICAL VARIATION = 100.00
ROY'S CRITERION = .0267   M= 2.0   N= 46.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
VARIABLE          1 R-CAN   2 R-L.I.  3 R-C.I.  4 L-C.I.  5 L-L.I.  6 L-CAN
RAW COEFFICIENT  -.078587   -.157604   .675651  -.046739  -.212843   .128967
STANDARDIZED     -.0923     -.2550    1.3912    -.1163    -.3739     .1621
HOTELLING'S TRACE CRITERION =  .0274
BARTLETT'S CHI SQUARE TEST FOR SIGNIFICANCE OF SUCCESSIVE CANONICAL VARIATES
FOR ROOTS 1 THROUGH 1   CHI SQUARE = 2.6230  WITH 6 DEGREES OF FREEDOM   P LESS THAN .8545
   (LIKELIHOOD RATIO  9.73321508E-01   LOG = -2.70408223E-02)
ORDER OF BASIS VECTORS                                                  PAGE 14
   1  6  2  3  4  5

HYPOTHESIS 1     2 DEGREE(S) OF FREEDOM                                 PAGE 15
C0,L0,   CONST
1,0,     YEAR    FIRST-SECOND YEAR

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
[6 X 6 MATRIX FOR THE CONSTANT AND YEARS EFFECTS COMBINED.]
TESTS OF HYPOTHESIS BEING SKIPPED
HYPOTHESIS 2     3 DEGREE(S) OF FREEDOM                                 PAGE 16
0,1,   ACTIVE    ONE-CONTROLS
0,2,   ACTIVE    TWO-CONTROLS
0,3,   ACTIVE    THREE-CONTROLS

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS (ELIMINATING YEARS)
[6 X 6 MATRIX; EQUAL TO THREE TIMES THE MEAN-PRODUCT MATRIX BELOW.]

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  3.00612357E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  1.6580
D.F.= 18 AND 269.1859                                     P LESS THAN  .0469
(LIKELIHOOD RATIO  7.42760172E-01    LOG = -2.97382070E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-CAN          1.3228             .9593         .4152          .9593        .4152
2 R-L.I.         9.2483            3.5316         .0176         2.8597        .0408
3 R-C.I.        23.1834            5.4660         .0017         2.4067        .0719
4 L-C.I.        21.0116            3.3937         .0209         2.4958        .0644
5 L-L.I.         9.1816            2.9751         .0353          .1605        .9227
6 L-CAN          3.2208            2.0398         .1132         1.1429        .3359

STEP-DOWN MEAN SQUARES = (1.3228/1.3790), (5.3890/1.8845), (5.9713/2.4811), (4.3630/1.7481), (.1819/1.1333), (.8798/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  3   DEGREES OF FREEDOM FOR ERROR = 100
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN      1.32278   3.03908   5.24440   4.00577   2.90380   1.35578
R-L.I.               9.24835  13.96734  13.98841   6.94424   4.82771
R-C.I.                        23.18344  19.62441  12.79367   7.47824
L-C.I.                                  21.01181  13.85137   7.44423
L-L.I.                                             9.16176   5.04630
L-CAN                                                        3.22065
DISCRIMINANT ANALYSIS FOR HYPOTHESIS 2

VARIANCE OF CANONICAL VARIATE 1 = .1905   PER CENT OF CANONICAL VARIATION = 59.77
ROY'S CRITERION = .1600   M= 1.0   N= 46.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
VARIABLE          1 R-CAN   2 R-L.I.  3 R-C.I.  4 L-C.I.  5 L-L.I.  6 L-CAN
RAW COEFFICIENT  -.102532   -.097287  -.701081   .368307  -.093018   .043714
STANDARDIZED     -.1204     -.1574   -1.4436     .9164    -.1634     .0549

VARIANCE OF CANONICAL VARIATE 2 = .1029   PER CENT OF CANONICAL VARIATION = 32.28
ROY'S CRITERION = .0933   M= 1.5   N= 46.0
--DISCRIMINANT FUNCTION COEFFICIENTS--
RAW COEFFICIENT   .645728   -.295594   .535378  -.515225   .015802  -.514981
STANDARDIZED      .7583     -.4783    1.1024   -1.2620     .0278    -.6471

VARIANCE OF CANONICAL VARIATE 3 = .0253   PER CENT OF CANONICAL VARIATION = 7.95
ROY'S CRITERION = .0247   M= 2.0   N= 45.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
RAW COEFFICIENT  -.546633   -.293324   .224915  -.169926  -.175640   .955845
STANDARDIZED     -.6419     -.4747     .4631    -.4228    -.3086    1.2011

HOTELLING'S TRACE CRITERION =  .3188
BARTLETT'S CHI SQUARE TEST FOR SIGNIFICANCE OF SUCCESSIVE CANONICAL VARIATES
FOR ROOTS 1 THROUGH 3   CHI SQUARE = 29.1434  WITH 18 DEGREES OF FREEDOM   P LESS THAN .0467
   (LIKELIHOOD RATIO  7.42760185E-01   LOG = -2.97382053E-01)
FOR ROOTS 2 THROUGH 3   CHI SQUARE = 12.0527  WITH 10 DEGREES OF FREEDOM   P LESS THAN .2816
   (LIKELIHOOD RATIO  8.84275714E-01   LOG = -1.22986371E-01)
FOR ROOTS 3 THROUGH 3   CHI SQUARE =  2.4525  WITH  4 DEGREES OF FREEDOM   P LESS THAN .6532
   (LIKELIHOOD RATIO  9.75285209E-01   LOG = -2.50253291E-02)
HYPOTHESIS 3     1 DEGREE(S) OF FREEDOM                                 PAGE 17
0,4,   CONTRL    TWO CONTROL GROUPS

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS (ELIMINATING YEARS)
            R-CAN     R-L.I.    R-C.I.    L-C.I.    L-L.I.    L-CAN
R-CAN     2.214730  -.852447   .084837  -.832053  1.826439  2.810220
R-L.I.               .328106  -.032654   .320256  -.702994 -1.081650
R-C.I.                         .003250  -.031872   .069963   .107647
L-C.I.                                   .312595  -.686176 -1.055773
L-L.I.                                            1.506223  2.317525
L-CAN                                                       3.565822

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.98224014E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  .9547
D.F.=  6 AND 95.0000                                      P LESS THAN  .4603
(LIKELIHOOD RATIO  9.43133138E-01    LOG = -5.85478212E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-CAN          2.2147            1.6061         .2080         1.6061        .2080
2 R-L.I.          .3281             .1253         .7242         1.4611        .2297
3 R-C.I.          .0032             .0008         .9760          .1388        .7104
4 L-C.I.          .3126             .0505         .8227          .2129        .6456
5 L-L.I.         1.5062             .4880         .4865         1.9552        .1653
6 L-CAN          3.5658            2.2583         .1361          .3992        .5291

STEP-DOWN MEAN SQUARES = (2.2147/1.3790), (2.7534/1.8845), (.3443/2.4811), (.3722/1.7481), (2.2158/1.1333), (.3073/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  1   DEGREES OF FREEDOM FOR ERROR = 100

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
[IDENTICAL TO THE SUM OF CROSS-PRODUCTS ABOVE.]

DISCRIMINANT ANALYSIS FOR HYPOTHESIS 3
VARIANCE OF CANONICAL VARIATE 1 = .0603   PER CENT OF CANONICAL VARIATION = 100.00
ROY'S CRITERION = .0569   M= 2.0   N= 46.5
--DISCRIMINANT FUNCTION COEFFICIENTS--
VARIABLE          1 R-CAN   2 R-L.I.  3 R-C.I.  4 L-C.I.  5 L-L.I.  6 L-CAN
RAW COEFFICIENT  -.531684    .493243  -.211045   .389024  -.450733  -.317223
STANDARDIZED     -.6243      .7982    -.4346     .9680    -.7918    -.3986
HOTELLING'S TRACE CRITERION =  .0603
BARTLETT'S CHI SQUARE TEST FOR SIGNIFICANCE OF SUCCESSIVE CANONICAL VARIATES
FOR ROOTS 1 THROUGH 1   CHI SQUARE = 5.6791  WITH 6 DEGREES OF FREEDOM   P LESS THAN .4601
   (LIKELIHOOD RATIO  9.43133130E-01   LOG = -5.85478289E-02)
ANALYSIS OF VARIANCE                                                    PAGE 18

6 DEPENDENT VARIABLE(S)
   1 R-C.I.   2 L-C.I.   3 R-L.I.   4 L-L.I.   5 R-CAN   6 L-CAN

ERROR SUM OF CROSS-PRODUCTS (SE)
[THE SAME WITHIN-CELLS MATRIX AS IN THE FIRST ANALYSIS, WITH ROWS AND COLUMNS REARRANGED INTO THE NEW VARIABLE ORDER.]

CHOLESKY FACTOR SE
[6 X 6 TRIANGULAR FACTOR; DIAGONAL ELEMENTS 20.59079, 13.43826, 11.90775, 10.44470, 9.87065, 8.55163.]

LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS =  2.97638536E+01
HYPOTHESIS 1     1 DEGREE(S) OF FREEDOM                                 PAGE 19
C0,L0,   CONST
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
[6 X 6 MATRIX FOR THE CONSTANT TERM, IN THE NEW VARIABLE ORDER.]
TESTS OF HYPOTHESIS BEING SKIPPED

HYPOTHESIS 2     3 DEGREE(S) OF FREEDOM                                 PAGE 20
0,1,   ACTIVE    ONE-CONTROLS
0,2,   ACTIVE    TWO-CONTROLS
0,3,   ACTIVE    THREE-CONTROLS
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
[THE SAME HYPOTHESIS MATRIX AS IN THE FIRST ANALYSIS, IN THE NEW VARIABLE ORDER.]

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  3.00856870E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  1.8022
D.F.= 18 AND 269.1859                                     P LESS THAN  .0251
(LIKELIHOOD RATIO  7.24818916E-01    LOG = -3.21833427E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-C.I.        27.3134            6.4421         .0005         6.4421        .0005
2 L-C.I.        23.8050            3.6449         .0119         2.2751        .0846
3 R-L.I.        11.0281            4.2112         .0076          .4339        .7293
4 L-L.I.        10.7254            3.4752         .0189          .1535        .9273
5 R-CAN          2.2615            1.6400         .1850          .6743        .5699
6 L-CAN          3.4921            2.2116         .0915         1.2821        .2851

STEP-DOWN MEAN SQUARES = (27.3134/4.2398), (4.1500/1.8241), (.6278/1.4469), (.1726/1.1247), (.6843/1.0149), (.9870/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  3   DEGREES OF FREEDOM FOR ERROR = 100

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
[THE SAME MEAN-PRODUCT MATRIX AS IN THE FIRST ANALYSIS, IN THE NEW VARIABLE ORDER.]
HYPOTHESIS 3     1 DEGREE(S) OF FREEDOM                                 PAGE 21
0,4,   CONTRL    TWO CONTROL GROUPS

SUM OF CROSS-PRODUCTS FOR HYPOTHESIS
[THE SAME MATRIX AS IN THE FIRST ANALYSIS, IN THE NEW VARIABLE ORDER.]

(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (STI*) AND ITS CHOLESKY FACTOR
[6 X 6 MATRICES.]

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.98227769E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =  .9610
D.F.=  6 AND 95.0000                                      P LESS THAN  .4560
(LIKELIHOOD RATIO  9.42779095E-01    LOG = -5.89232820E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
1 R-C.I.          .2000             .0472         .8286          .0472        .8286
2 L-C.I.          .5556             .0897         .7652          .0462        .8302
3 R-L.I.          .0222             .0085         .9268          .0142        .9055
4 L-L.I.         5.0000            1.6201         .2061         2.9939        .0868
5 R-CAN          2.9389            2.1312         .1475         2.3045        .1323
6 L-CAN          4.6722            2.9590         .0885          .3541        .5533

STEP-DOWN MEAN SQUARES = (.2000/4.2398), (.0844/1.8241), (.0205/1.4469), (3.3671/1.1247), (2.3381/1.0149), (.2726/.7698)
DEGREES OF FREEDOM FOR HYPOTHESIS =  1   DEGREES OF FREEDOM FOR ERROR = 100

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
[IDENTICAL TO THE SUM OF CROSS-PRODUCTS ABOVE.]
1 DEGREE!Sl OF FREEDOM PAGE
1,
o,
FIRST·SFCOND YEAR
YEAR
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS 6
22
1 2
.
3
5
6
R•C,I. L-C.I. R•L.I. L-L.I• R·CAN L•CAN
R•C.I.
L•C.I.
R-L,I.•
L-L.I,
• 386744 1.424157 • 8342'>3 • 810724 -. 030704 -.052916
5.244363 3.072842 2.985'+38 -.113067 -.194859
1.799540 1o748809 -. 066232 -.111tH5
1.699508 -.o6i.365 -.110927
R·CAN
l·CAN
• 00.2438 .804201
.007240
ISCP HYP + SCP ERROR! - ADJUSTED FOR ANY COVARIATES ISTI•I
-----------------------------------------------------------1 2 3 4 5 6
R-C.I• t-c.r. R•Lol• L•L.I. R·CAN L•CAN
1 R-c.r.
2 L-c.r.
3 R•L.I.
4 L-L.I.
5 R•CAN
424.3672 432.6275 218,3687 243.3330 81.0087 111.8192
&24. 3804 263.4172 351.9463 107.1182 129.6774
263.&739 168,4H4 101.8'+17 tuO.It649
310.3218 73.5304 121.3302
137.8976 91.4099
6 L•CAN
157.9048
CHOLESKY FACTOR ST(*)
------------------------------------------------------------
                1          2          3          4          5          6
              R-C.I.     L-C.I.     R-L.I.     L-L.I.     R-CAN      L-CAN
1  R-C.I.   20.60017   21.00115   10.6003?   11.81218    3.93243    5.42807
2  L-C.I.              13.54001    3.01314    7.67185    1.81186    1.15817
3  R-L.I.                         11.92593    1.68431    4.58642    3.30673
4  L-L.I.                                    10.44510     .52221    4.09359
5  R-CAN                                                 9.89155    5.12176
6  L-CAN                                                            8.55425
LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.97878113E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .3839   D.F. =  6  AND  95.0000   P LESS THAN  .8877
(LIKELIHOOD RATIO  9.76327865E-01   LOG = -2.39576415E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
------------------------------------------------------------------------------------
R-C.I.           .3867               .0912         .7633          .0912        .7633
                 STEP-DOWN MEAN SQUARES = (    .3867/   4.2398)
L-C.I.          5.2444               .8470         .3597         1.5049        .2229
                 STEP-DOWN MEAN SQUARES = (   2.7451/   1.8241)
R-L.I.          1.7995               .6872         .4092          .2994        .5856
                 STEP-DOWN MEAN SQUARES = (    .4332/   1.4469)
L-L.I.          1.6995               .5507         .4598          .0063        .9369
                 STEP-DOWN MEAN SQUARES = (    .0071/   1.1247)
R-CAN            .0024               .0018         .9666          .4070        .5250
                 STEP-DOWN MEAN SQUARES = (    .4131/   1.0149)
L-CAN            .0072               .0046         .9462          .0582        .8099
                 STEP-DOWN MEAN SQUARES = (    .0448/    .7698)

DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR= 100
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
------------------------------------------------------------
                1          2          3          4          5          6
              R-C.I.     L-C.I.     R-L.I.     L-L.I.     R-CAN      L-CAN
1  R-C.I.    .386744   1.424157    .834243    .810724   -.030704   -.052916
2  L-C.I.              5.244363   3.072842   2.985438   -.113067   -.194859
3  R-L.I.                         1.799540   1.748809   -.066232   -.114145
4  L-L.I.                                   1.699508   -.064365   -.110927
5  R-CAN                                                .002438    .004201
6  L-CAN                                                           .007240

HYPOTHESIS        1 DEGREE(S) OF FREEDOM                               PAGE 23
SUM OF CROSS-PRODUCTS FOR HYPOTHESIS  (SUM OF PRODUCTS OBTAINED BY SUBTRACTION)
------------------------------------------------------------
                1          2          3          4          5          6
              R-C.I.     L-C.I.     R-L.I.     L-L.I.     R-CAN      L-CAN
1  R-C.I.   9.735632   8.026736   3.255747   3.382184    .979885   2.117816
2  L-C.I.              6.621100   2.684934   2.789204    .808087   1.746511
3  R-L.I.                         1.088773   1.131055    .327689    .708231
4  L-L.I.                                    1.174979    .340415    .735735
5  R-CAN                                                 .098625    .213157
6  L-CAN                                                            .460694
(SCP HYP + SCP ERROR) - ADJUSTED FOR ANY COVARIATES (ST(*))
------------------------------------------------------------
                1          2          3          4          5          6
              R-C.I.     L-C.I.     R-L.I.     L-L.I.     R-CAN      L-CAN
1  R-C.I.   433.7161   439.2320   220.7902   245.9045    82.0193   113.9899
2  L-C.I.              625.7572   263.0301   351.7501   108.0393   131.6188
3  R-L.I.                         262.9631   167.7987   102.2357   101.2873
4  L-L.I.                                    309.7973    73.9352   122.1768
5  R-CAN                                                137.9938    91.6188
6  L-CAN                                                           158.3583
CHOLESKY FACTOR ST(*)
------------------------------------------------------------
                1          2          3          4          5          6
              R-C.I.     L-C.I.     R-L.I.     L-L.I.     R-CAN      L-CAN
1  R-C.I.   20.82585   21.09071   10.60174   11.80765    3.93834    5.47348
2  L-C.I.              13.45136    2.93144    7.63627    1.85683    1.20279
3  R-L.I.                         11.91524    1.697?7    4.61923    3.33463
4  L-L.I.                                    10.44896     .51776    4.08662
5  R-CAN                                                 9.87067    5.09689
6  L-CAN                                                            8.55301
LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  2.97908944E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .4340   D.F. =  6  AND  95.0000   P LESS THAN  .8546
(LIKELIHOOD RATIO  9.73321510E-01   LOG = -2.70408194E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
------------------------------------------------------------------------------------
R-C.I.          9.7356              2.2962         .1329         2.2962        .1329
                 STEP-DOWN MEAN SQUARES = (   9.7356/   4.2398)
L-C.I.          6.6211              1.0694         .3036          .1931        .6614
                 STEP-DOWN MEAN SQUARES = (    .3521/   1.8241)
R-L.I.          1.0888               .4158         .5206          .1232        .7264
                 STEP-DOWN MEAN SQUARES = (    .1781/   1.4469)
L-L.I.          1.1750               .3807         .5387          .0781        .7805
                 STEP-DOWN MEAN SQUARES = (    .0878/   1.1247)
R-CAN            .0986               .0715         .7897          .0004        .9850
                 STEP-DOWN MEAN SQUARES = (    .0004/   1.0149)
L-CAN            .4607               .2918         .5903          .0306        .8610
                 STEP-DOWN MEAN SQUARES = (    .0237/    .7698)

DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR= 100
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
------------------------------------------------------------
                1          2          3          4          5          6
              R-C.I.     L-C.I.     R-L.I.     L-L.I.     R-CAN      L-CAN
1  R-C.I.   9.735632   8.026736   3.255747   3.382184    .979885   2.117816
2  L-C.I.              6.621100   2.684934   2.789204    .808087   1.746511
3  R-L.I.                         1.088773   1.131055    .327689    .708231
4  L-L.I.                                    1.174979    .340415    .735735
5  R-CAN                                                 .098625    .213157
6  L-CAN                                                            .460694

CORE USED FOR DATA=   253 LOCATIONS OUT OF 3000 AVAILABLE
PROBLEM 4 -- FOUR-WAY FIXED-EFFECTS BIVARIATE ANALYSIS OF VARIANCE -- UNEQUAL SUBCLASS SIZES
4  2  1  1  1  3  1  1                                 INPUT DESC.
RACE 2   SEX 2   ABILTY 2   ESSAY 4                    FACTOR IDENT CD.
PROBLEM 4 -- FOUR-WAY FIXED-EFFECTS BIVARIATE ANALYSIS OF VARIANCE -- UNEQUAL SUBCLASS SIZES
A RANDOM SAMPLE OF FIFTH-GRADE TEACHERS WERE EACH ASKED TO RATE TWO ESSAYS SUPPOSEDLY WRITTEN BY THE SAME FIFTH-GRADE PUPIL. THE TEACHERS WERE PROVIDED WITH INFORMATION, VIA COVER LETTER, AS TO THE PUPIL'S SEX, RACE, AND COGNITIVE ABILITY LEVEL. EACH ESSAY WAS TO BE SCORED ON A TEN-POINT SCALE, FROM VERY POOR TO VERY GOOD, FOR THE CATEGORIES SPELLING AND PUNCTUATION, GRAMMAR, SENTENCE STRUCTURE, ORGANIZATION, NEATNESS, RELEVANCE OF IDEAS, APPROPRIATE WORD USAGE, CLARITY, CREATIVITY AND IMAGINATION, AND COMPLETENESS OF THOUGHT. EACH SCALE RANGED FROM 1 POINT (MINIMUM) TO 10 POINTS (MAXIMUM). SINCE THE ESSAYS WERE TYPEWRITTEN, MOST OF THE RESPONDENTS OMITTED THE NEATNESS RATING. THE OTHER NINE SCALES WERE SUMMED FOR EACH ESSAY, YIELDING TWO TOTAL SCORES, THE DEPENDENT VARIABLES, FOR EACH TEACHER-SUBJECT.
THE EXPERIMENTAL CONDITIONS WERE DEFINED BY THE INFORMATION PROVIDED THE TEACHER ABOUT THE PUPIL-AUTHOR. FOUR DIFFERENT ESSAY PAIRS WERE USED AS STIMULI TO GIVE SOME INDICATION OF GENERALIZABILITY OF RESULTS ACROSS ESSAYS, AND TO INSURE THAT TWO TEACHERS IN THE SAME SCHOOL WERE NOT SCORING THE SAME ESSAY. EACH TEACHER RECEIVED ONE COMBINATION OF SEX-RACE-ABILITY INFORMATION FROM AMONG POSSIBLE COMBINATIONS OF THE CROSSING OF THE THREE VARIABLES, WITH ABILITY HAVING TWO LEVELS (HIGH - LOW). A TOTAL OF 112 TEACHERS RETURNED COMPLETED RESPONSES. EVERY COMBINATION OF THE EIGHT EXPERIMENTAL CONDITIONS (2 SEXES X 2 RACES X 2 ABILITY LEVELS) WAS REPRESENTED, FOR EACH OF THE FOUR ESSAY PAIRS.
IT WAS EXPECTED THAT THERE WOULD BE MEAN DIFFERENCES ON THE ESSAY PAIRS, BY PUPIL-AUTHOR RACE, SEX, AND ABILITY. ESSAY PAIRS ARE KNOWN TO BE OF DIFFERING QUALITIES, AND ARE CONSIDERED AS A CONTROL VARIABLE, RATHER THAN ONE OF DIRECT EXPERIMENTAL INTEREST.
THE DATA CARDS ARE PUNCHED AS FOLLOWS --
CARD COLUMN
  1 -  3   SUBJECT IDENTIFICATION NUMBER
       4   TEACHER SEX (1=M) (2=F)
  7 -  9   SCHOOL NUMBER IN CITY
 10 - 11   TEACHER EXPERIENCE, IN FULL YEARS (99 = NO RESPONSE)
 12 - 13   NUMBER OF POST-BACCALAUREATE COURSES TAKEN (99 = NO RESPONSE)
 14 - 17   STIMULUS INFORMATION
      14   PUPIL-AUTHOR RACE (1=NEGRO) (2=WHITE)
      15   PUPIL-AUTHOR SEX (1=MALE) (2=FEMALE)
      16   PUPIL-AUTHOR ABILITY (1=HIGH) (2=LOW)
      17   ESSAY PAIR (1-2-3-4)
 21 - 40   TWO-DIGIT FIELDS - TEN RESPONSES ESSAY I* (FAVORITE SCHOOL SUBJECT)
 43 - 44   TOTAL OF NINE SCORES - ESSAY I (RANGE 9 - 90)
 45 - 64   TWO-DIGIT FIELDS - TEN RESPONSES ESSAY II* (WHAT I THINK ABOUT)
 66 - 67   TOTAL OF NINE SCORES - ESSAY II (RANGE 9 - 90)
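The variable format card for this run, (13X4I1,25XF2.0,21XF2.0), picks off only the four condition digits and the two essay totals from each card. A minimal sketch of that field extraction in Python (an added illustration; the function name and any example card are hypothetical, not part of the deck):

    # Extract the fields that the format (13X,4I1,25X,F2.0,21X,F2.0) reads
    # from one 80-column card image, per the card layout above.
    def read_card(card: str):
        card = card.ljust(80)           # pad a short card image to 80 columns
        race     = int(card[13])        # col 14: pupil-author race    (1=Negro, 2=white)
        sex      = int(card[14])        # col 15: pupil-author sex     (1=male, 2=female)
        ability  = int(card[15])        # col 16: pupil-author ability (1=high, 2=low)
        essay    = int(card[16])        # col 17: essay pair (1-4)
        total_i  = float(card[42:44])   # cols 43-44: total of nine scores, essay I
        total_ii = float(card[65:67])   # cols 66-67: total of nine scores, essay II
        return (race, sex, ability, essay), (total_i, total_ii)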
*TEN RESPONSES   1 - SPELLING AND PUNCTUATION   2 - GRAMMAR   3 - SENTENCE STRUCTURE
                 4 - ORGANIZATION   5 - NEATNESS (NOT INCLUDED IN TOTALS)   6 - RELEVANCE OF IDEAS
                 7 - APPROPRIATE WORD USAGE   8 - CLARITY   9 - CREATIVITY AND IMAGINATION
                10 - COMPLETENESS OF THOUGHT
FOR FURTHER INFORMATION ON THE STUDY, SEE J. FINN, THE EDUCATIONAL ENVIRONMENT - EXPECTATIONS, PAPER PRESENTED AT THE ANNUAL MEETING OF THE AMERICAN EDUCATIONAL RESEARCH ASSOCIATION, MINNEAPOLIS, MARCH, 1970.
THIS RUN USES --
   DATA FORM 1 (DATA UNSORTED)
   OPTIONAL PRINTED OUTPUT
   FOUR-WAY ANALYSIS
   ALTERNATE CONTRAST ORDERINGS
   REPEAT CODE ON SYMBOLIC CONTRAST VECTORS
   MEANS KEY FOR OBSERVED AND ESTIMATED COMBINED MEANS
   ESTIMATION OF EFFECTS, INCLUDING INTERACTION, STANDARD ERRORS, INTERCORRELATIONS AMONG THE EFFECTS, MEANS AND RESIDUALS
   TRANSFORMATION
   UNIVARIATE UNEQUAL-N ANOVA
THE EXPERIMENTAL DESIGN IS A COMPLETE 2 X 2 X 2 X 4 (RACE X SEX X ABILITY X ESSAY) FACTORIAL ARRANGEMENT WITH TWO TOTAL SCORES AS CRITERION MEASURES. THERE ARE FROM 1 TO 6 OBSERVATIONS PER SUBCLASS. ALL MAIN EFFECT AND INTERACTION CONTRASTS ARE CODED, WITH THE RANK OF THE MODEL FOR SIGNIFICANCE TESTING BEING J = 32. SIMPLE CONTRASTS ARE USED FOR ALL TWO-LEVEL FACTORS, AND DEVIATION CONTRASTS FOR THE 4-LEVEL ESSAY EFFECT. THE (REPEAT CODE) IS EMPLOYED ON THE SCV'S TO REDUCE THE NUMBER OF CONTRAST CARDS PUNCHED.
IT IS DESIRABLE AND PERHAPS NECESSARY TO ESTABLISH THREE ALTERNATE ORDERS OF EFFECTS. THIS WILL ASSURE THAT EACH MAIN EFFECT CAN BE TESTED IN THE LAST POSITION, IN CASE OTHER MAIN EFFECTS ARE SIGNIFICANT. IN THE ORDER OF SCV CARDS, ESSAY PAIRS ARE THE LAST MAIN EFFECT. IN CASE ESSAYS ARE SIGNIFICANTLY DIFFERENT (AS EXPECTED), THE ALTERNATE ORDERS SHOULD ARRANGE FOR SEX, ABILITY, AND RACE EFFECTS, RESPECTIVELY, TO FOLLOW ESSAYS. HOWEVER, THE ANALYSIS REVEALS THAT THE SEX X RACE INTERACTION IS SIGNIFICANT AND OBSCURES BOTH THE SEX AND RACE MAIN EFFECTS. THUS, TO BE CONSISTENT WITH THE TEXT, THE ONLY MAIN EFFECT ORDERED AFTER ESSAYS IS ABILITY. ONE CONTRAST REORDER KEY IS ENTERED, AND EACH ANALYSIS HAS TWO HYPOTHESIS TEST CARDS. THE FIRST TESTS ALL MAIN EFFECTS AND INTERACTIONS IN THE ORIGINAL ORDER. THE NUMBERS CORRESPOND EXACTLY TO THE DEGREES OF FREEDOM IN THE COMPLETE 4-WAY ANOVA TABLE. THE SECOND HYPOTHESIS TEST CARD, FOR THE ALTERNATE ORDER, BYPASSES THE TEST OF THE FIRST SIX CONTRASTS IN THE ALTERNATE ORDER (THE CONSTANT, RACE, SEX, ESSAYS). THE SEVENTH CONTRAST IN THE ALTERNATE ORDER IS THE ABILITY EFFECT, ELIMINATING THE FIRST SIX, AND IS TESTED ALONE.
THE RANK OF THE MODEL FOR ESTIMATION INCLUDES THE FIRST (8) DEGREES OF FREEDOM (CONSTANT THROUGH RACE X SEX INTERACTION). THESE ARE THE EFFECTS WHICH PROVE SIGNIFICANT. THE MEANS ARE ESTIMATED FROM THE RANK-8 MODEL, AND ARE COMBINED TO YIELD MEANS FOR EACH MAIN EFFECT AND EACH SEX-RACE COMBINATION. THESE GREATLY SIMPLIFY THE INTERPRETATION OF EFFECTS IN THE 4-WAY DESIGN.
THE PROGRAM'S TRANSFORMATIONS ARE USED TO OBTAIN A THIRD CRITERION VARIABLE, AS THE SUM OF THE TWO ESSAY SCORES. A SECOND ANALYSIS IS PERFORMED USING THE TOTAL AS A SINGLE CRITERION VARIABLE. THE UNIVARIATE RESULTS MAY BE COMPARED WITH THE BIVARIATE FINDINGS, ALTHOUGH THE SUM SCORE MASKS EFFECTS IN THE SEPARATE ESSAY TOPICS. THE TOTAL SCORE CANNOT BE USED IN THE SAME ANALYSIS AS THE OTHER TWO MEASURES, SINCE IT IS EXACTLY A LINEAR COMBINATION OF THEM, AND CAUSES THE JOINT VARIANCE-COVARIANCE MATRIX TO BE SINGULAR. HOWEVER, THE THREE MEASURES CAN BE USED TOGETHER IN ALL OTHER PHASES OF THE PROBLEM RUN, AND SELECTED FOR SEPARATE TESTS OF SIGNIFICANCE IN PHASE 3.
FINISH                                                 END OF COMMENTS
(13X4I1,25XF2.0,21XF2.0)                               VARIABLE FORMAT
3  16  1  2                                            CREATE SUM SCORE   END TRANSFORMAT.
ESAY I   ESAYII   TOTAL
[Data listing: 112 cards, one per teacher-subject, punched as described above.]
1,2,3,4,1,2.
CO,CO,CO,OO,                                           CONST.
1,0,0,0,          NEGRO - WHITE                        RACE
0,1,0,0,          MALE - FEMALE                        SEX
0,0,1,0,          HIGH - LOW ABILITY                   ABILTY
0,0,0,301,                                             ESSAYS
1,1,0,0,          RACE X SEX                           R X S
1,0,1,0,          RACE X ABILITY                       R X A
1,0,0,301,        RACE X ESSAYS                        R X E
0,1,1,0,          SEX X ABILITY                        S X A
0,1,0,301,        SEX X ESSAYS                         S X E
0,0,1,301,        ABILITY X ESSAYS                     A X E
1,1,1,0,          RACE X SEX X ABILITY                 RXSXA
1,1,0,301,        RACE X SEX X ESSAYS                  RXSXE
1,0,1,301,        RACE X ABILITY X ESSAYS              RXAXE
0,1,1,301,        SEX X ABILITY X ESSAYS               SXAXE
1,1,1,301,        RACE X SEX X ABILITY X ESSAYS        RSAE
1,2,3,5,6,7,4.                                         ALT CONT. ORDER - ABILITY LAST MAIN EFFECT
(BLANK - END)
ESTIMATION
MEANS KEY         NEGRO - WHITE   MALE - FEMALE   HIGH - LOW ABILITY
2 1 1                                                  ANALY SELECT - BIV
1,2.                                                   VARIABLE SELECT KEY, BIVARIATE ANALYSIS
-1,1,1,1,3,1,1,3,1,3,3,1,3,3,3,3.                      HYPOTHESIS TEST CARD - ORIGINAL ORDER
-6,1.                                                  HYPOTHESIS TEST CARD - ABIL LAST MAIN EFF
1 1 1                                                  AN. SELECT - UNIV.
3.                                                     VARIABLE SELECT KEY, UNIVAR ANALY OF TOT.
-1,1,1,1,3,1,1,3,1,3,3,1,3,3,3,3.                      HYPOTHESIS TEST CARD - ORIGINAL ORDER
-6,1.                                                  HYPOTHESIS TEST CARD - ABIL LAST MAIN EFF
CONTINUE
* * * * * * * * *   M U L T I V A R I A N C E   * * * * * * * * *
*   UNIVARIATE AND MULTIVARIATE ANALYSIS OF                     *
*   VARIANCE, COVARIANCE AND REGRESSION        VERSION 5        *
*   MARCH 1972                                                  *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

PROBLEM
PROBLEM 4 -- FOUR-WAY FIXED-EFFECTS BIVARIATE ANALYSIS OF VARIANCE -- UNEQUAL SUBCLASS SIZES
------------------------------------------------------------           PAGE 1
[The first page of output repeats, verbatim, the problem comment cards listed above.]
INPUT PARAMETERS                                                        PAGE 2
================
NUMBER OF VARIABLES IN INPUT VECTORS =   2
NUMBER OF FACTORS IN DESIGN =   4
NUMBER OF LEVELS OF FACTOR  1 ( RACE ) =   2
NUMBER OF LEVELS OF FACTOR  2 ( SEX  ) =   2
NUMBER OF LEVELS OF FACTOR  3 (ABILTY) =   2
NUMBER OF LEVELS OF FACTOR  4 (ESSAY ) =   4
NUMBER OF VARIABLES AFTER TRANSFORMATIONS =   3
INPUT IS FROM CARDS.  DATA OPTION 1
MINIMAL PAGE SPACING WILL BE USED
ADDITIONAL OUTPUT WILL BE PRINTED
COMPUTATION OF COVARIANCE MATRIX FOR EACH GROUP IMPOSSIBLE DUE TO FORM OF DATA INPUT
FORMAT OF DATA   (13X4I1,25XF2.0,21XF2.0)   VARIABLE FORMAT

TRANSFORMATIONS                                                         PAGE 3
VARIABLE  3 WILL BE FORMED BY APPLYING TRANSFORMATION ( C = X+Y ) TO INPUT VARIABLES V(I):  X=V( 1), Y=V( 2)
FIRST OBSERVATION  SUBJECT  1 , CELL 1 2 2 1
   BEFORE TRANSFORMATIONS =   46.0000   32.0000
   AFTER TRANSFORMATIONS  =   46.0000   32.0000   78.0000
CELL IDENTIFICATION AND FREQUENCIES                                     PAGE 4
CELL     FACTOR LEVELS                N
         RACE   SEX   ABILTY   ESSAY
[The 32 cells of the 2 X 2 X 2 X 4 crossing are listed here with their factor levels and frequencies, from 1 to 6 observations per cell.]
TOTAL N =  112
TOTAL SUM OF CROSS-PRODUCTS
                   1               2               3
                 ESAY I          ESAYII          TOTAL
1  ESAY I   25.196900E+04   26.232300E+04   51.429200E+04
2  ESAYII                   30.070700E+04   56.303000E+04
3  TOTAL                                    10.773220E+05
OBSERVED CELL MEANS --- ROWS ARE CELLS - COLUMNS ARE VARIABLES (ESAY I, ESAYII, TOTAL)
[32 rows of observed cell means]

OBSERVED CELL STD DEVS --- ROWS ARE CELLS - COLUMNS ARE VARIABLES
[32 rows of observed cell standard deviations]
OBSERVED COMBINED MEANS                                                 PAGE 5
FACTORS   1 ( RACE )
  LEVEL 1    N =  57   MEANS   ESAY I= 45.77   ESAYII= 48.39   TOTAL= 94.16
  LEVEL 2    N =  55   MEANS   ESAY I= 44.80   ESAYII= 49.18   TOTAL= 93.98
FACTORS   2 ( SEX  )
  LEVEL 1    N =  57   MEANS   ESAY I= 45.81   ESAYII= 52.28   TOTAL= 98.09
  LEVEL 2    N =  55   MEANS   ESAY I= 44.76   ESAYII= 45.15   TOTAL= 89.91
FACTORS   3 (ABILTY)
  LEVEL 1    N =  51   MEANS   ESAY I= 49.75   ESAYII= 52.47   TOTAL= 102.2
  LEVEL 2    N =  61   MEANS   ESAY I= 41.57   ESAYII= 45.69   TOTAL= 87.26
FACTORS   4 (ESSAY )
  LEVEL 1    N =  32   MEANS   ESAY I= 45.34   ESAYII= 38.69   TOTAL= 84.03
  LEVEL 2    N =  27   MEANS   ESAY I= 40.41   ESAYII= 53.26   TOTAL= 93.67
  LEVEL 3    N =  27   MEANS   ESAY I= 50.63   ESAYII= 60.41   TOTAL= 111.0
  LEVEL 4    N =  26   MEANS   ESAY I= 44.77   ESAYII= 44.46   TOTAL= 89.23
FACTORS   1 ( RACE )   2 ( SEX )
  LEVEL 1,1  N =  26   MEANS   ESAY I= 42.81   ESAYII= 49.96   TOTAL= 92.77
  LEVEL 1,2  N =  31   MEANS   ESAY I= 48.26   ESAYII= 47.06   TOTAL= 95.32
  LEVEL 2,1  N =  31   MEANS   ESAY I= 48.32   ESAYII= 54.23   TOTAL= 102.5
  LEVEL 2,2  N =  24   MEANS   ESAY I= 40.25   ESAYII= 42.67   TOTAL= 82.92
ESTIMATION PARAMETERS                                                   PAGE 6
=====================
RANK OF THE BASIS =  32
RANK OF MODEL FOR SIGNIFICANCE TESTING =  32
RANK OF THE MODEL TO BE ESTIMATED IS   8
ERROR TERM TO BE USED IS (WITHIN CELLS)
NUMBER OF ORDERS OF THE BASIS VECTORS OTHER THAN THE FIRST IS   1
ESTIMATED CELL MEANS, RESIDUALS AND RESIDUALS IN FORM OF T-STATISTICS WILL BE PRINTED
VARIANCE-COVARIANCE FACTORS AND CORRELATIONS AMONG ESTIMATES WILL BE PRINTED
ESTIMATED COMBINED MEANS WILL BE PRINTED

SYMBOLIC CONTRAST VECTORS                                               PAGE 7
 (  1)   CO,CO,CO,OO,                                      CONST.
 (  2)   1,0,0,0,     NEGRO - WHITE                        RACE
 (  3)   0,1,0,0,     MALE - FEMALE                        SEX
 (  4)   0,0,1,0,     HIGH - LOW ABILITY                   ABILTY
 (  5)   0,0,0,0 1,                                        ESSAYS
 (  6)   0,0,0,0 2,                                        ESSAYS
 (  7)   0,0,0,0 3,                                        ESSAYS
 (  8)   1,1,0,0,     RACE X SEX                           R X S
 (  9)   1,0,1,0,     RACE X ABILITY                       R X A
 ( 10)   1,0,0,0 1,   RACE X ESSAYS                        R X E
 ( 11)   1,0,0,0 2,   RACE X ESSAYS                        R X E
 ( 12)   1,0,0,0 3,   RACE X ESSAYS                        R X E
 ( 13)   0,1,1,0,     SEX X ABILITY                        S X A
 ( 14)   0,1,0,0 1,   SEX X ESSAYS                         S X E
 ( 15)   0,1,0,0 2,   SEX X ESSAYS                         S X E
 ( 16)   0,1,0,0 3,   SEX X ESSAYS                         S X E
 ( 17)   0,0,1,0 1,   ABILITY X ESSAYS                     A X E
 ( 18)   0,0,1,0 2,   ABILITY X ESSAYS                     A X E
 ( 19)   0,0,1,0 3,   ABILITY X ESSAYS                     A X E
 ( 20)   1,1,1,0,     RACE X SEX X ABILITY                 RXSXA
 ( 21)   1,1,0,0 1,   RACE X SEX X ESSAYS                  RXSXE
 ( 22)   1,1,0,0 2,   RACE X SEX X ESSAYS                  RXSXE
 ( 23)   1,1,0,0 3,   RACE X SEX X ESSAYS                  RXSXE
 ( 24)   1,0,1,0 1,   RACE X ABILITY X ESSAYS              RXAXE
 ( 25)   1,0,1,0 2,   RACE X ABILITY X ESSAYS              RXAXE
 ( 26)   1,0,1,0 3,   RACE X ABILITY X ESSAYS              RXAXE
 ( 27)   0,1,1,0 1,   SEX X ABILITY X ESSAYS               SXAXE
 ( 28)   0,1,1,0 2,   SEX X ABILITY X ESSAYS               SXAXE
 ( 29)   0,1,1,0 3,   SEX X ABILITY X ESSAYS               SXAXE
 ( 30)   1,1,1,0 1,   RACE X SEX X ABILITY X ESSAYS        RSAE
 ( 31)   1,1,1,0 2,   RACE X SEX X ABILITY X ESSAYS        RSAE
 ( 32)   1,1,1,0 3,   RACE X SEX X ABILITY X ESSAYS        RSAE
ERROR SUM OF CROSS-PRODUCTS
                1          2          3
              ESAY I     ESAYII     TOTAL
1  ESAY I   13877.68    8475.37   22353.05
2  ESAYII              17429.10   25904.47
3  TOTAL                          48257.52

ERROR VARIANCE-COVARIANCE MATRIX
                1          2          3
              ESAY I     ESAYII     TOTAL
1  ESAY I   173.4710   105.9421   279.4131
2  ESAYII              217.8637   323.8058
3  TOTAL                          603.2190

ERROR CORRELATION MATRIX
                1          2          3
              ESAY I     ESAYII     TOTAL
1  ESAY I   1.000000    .544957    .863765
2  ESAYII              1.000000    .893212
3  TOTAL                          1.000000

VARIABLE     VARIANCE (ERROR MEAN SQUARES)   STANDARD DEVIATION
1  ESAY I          173.471042                     13.1708
2  ESAYII          217.863750                     14.7602
3  TOTAL           603.218958                     24.5605
D.F. =  80
ERROR TERM FOR ANALYSIS OF VARIANCE (WITHIN CELLS)
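These error matrices exhibit exactly the linear dependence the problem comments warn about: TOTAL is the sum of the two essay scores, so its error variance is fully determined by the other entries and the three-variable matrix is singular. Checking with the printed values:

$$s^2_{TOTAL}=s^2_{I}+s^2_{II}+2\,s_{I,II}=173.4710+217.8637+2(105.9421)=603.2190 .$$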
LEAST SQUARE ESTIMATES OF EFFECTS -- EFFECTS X VARIABLES
                 1           2           3
               ESAY I      ESAYII      TOTAL
1  CONST.    45.31015    49.09323    94.40337
2  RACE        .74217      .21836      .96053
3  SEX         .98620     7.90166     8.88786
4  ABILTY     8.12065     5.74348    13.86413
5  ESSAYS     -.30957   -11.03798   -11.34755
6  ESSAYS    -4.36626     4.75053      .38427
7  ESSAYS     5.12025    11.23506    16.35536
8  R X S    -12.74721   -11.79948   -24.54669

ESTIMATES OF EFFECTS IN STANDARD DEVIATION UNITS -- EFF X VARS
                 1           2           3
               ESAY I      ESAYII      TOTAL
1  CONST.    3.440187    3.326052    3.843705
2  RACE       .056350     .014793     .039109
3  SEX        .074877     .535336     .361876
4  ABILTY     .616563     .389120     .564489
5  ESSAYS    -.023504    -.747820    -.462024
6  ESSAYS    -.331510     .321847     .015646
7  ESSAYS     .388760     .761172     .665921
8  R X S     -.967836    -.799412    -.999437

STANDARD ERRORS OF LEAST-SQUARES ESTIMATES -- EFFECTS BY VARS
                 1           2           3
               ESAY I      ESAYII      TOTAL
1  CONST.    1.260541    1.412654    2.350612
2  RACE      2.509128    2.812136    4.679305
3  SEX       2.511398    2.814364    4.683012
4  ABILTY    2.506755    2.809212    4.674438
5  ESSAYS    2.077553    2.328269    3.874165
6  ESSAYS    2.204508    2.470533    4.110888
7  ESSAYS    2.186453    2.450317    4.077250
8  R X S     5.056269    5.666425    9.428750
VARIANCE-COVARIANCE FACTORS OF ESTIMATES
              1         2         3         4         5         6         7         8
            CONST.    RACE      SEX       ABILTY    ESSAYS    ESSAYS    ESSAYS    R X S
1  CONST.  .009160  -.000436  -.000452   .001621  -.000984   .000309   .000293   .003748
2  RACE              .036288   .004040  -.001554  -.001047   .001210  -.000246  -.002015
3  SEX                         .036356  -.001622  -.001040   .001228   .000963  -.002068
4  ABILTY                                .036223   .000736  -.001006  -.000954  -.000095
5  ESSAYS                                          .024882  -.008455  -.008106   .005584
6  ESSAYS                                                    .028015  -.009506  -.007920
7  ESSAYS                                                              .027559   .000343
8  R X S                                                                         .147378
INTERCORRELATIONS AMONG THE ESTIMATES
              1         2         3         4         5         6         7         8
            CONST.    RACE      SEX       ABILTY    ESSAYS    ESSAYS    ESSAYS    R X S
1  CONST. 1.000000  -.023901  -.024781   .088989  -.065179   .000555   .014682   .102018
2  RACE             1.000000   .110928  -.042794  -.034851   .037940  -.007793  -.027549
3  SEX                        1.000000  -.044696  -.034594   .038465   .030435  -.028254
4  ABILTY                               1.000000   .024523  -.031589  -.030197  -.001298
5  ESSAYS                                         1.000000  -.320256  -.309569   .092220
6  ESSAYS                                                   1.000000  -.342129  -.123250
7  ESSAYS                                                             1.000000   .005378
8  R X S                                                                        1.000000
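The standard errors printed two tables back are generated from the variance-covariance factors: the squared standard error of effect $j$ for variable $k$ is the $j$-th diagonal factor times the error variance of variable $k$. As an arithmetic check on the printed values,

$$SE=\sqrt{g_{11}\,s^2_{ESAY\,I}}=\sqrt{(.009160)(173.4710)}=1.2605 ,$$

the standard error shown for the constant on ESAY I.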
ESTIMATED CELL MEANS, ALL GROUPS - CELLS X VARIABLES
[32 rows of cell means (ESAY I, ESAYII, TOTAL), estimated by fitting model of rank  8]

ESTIMATED COMBINED MEANS BASED ON FITTING MODEL OF RANK  8              PAGE 8
FACTORS   1 ( RACE )
  LEVEL 1    MEANS   ESAY I= 45.68   ESAYII= 49.20   TOTAL= 94.88
  LEVEL 2    MEANS   ESAY I= 44.94   ESAYII= 48.98   TOTAL= 93.92
FACTORS   2 ( SEX  )
  LEVEL 1    MEANS   ESAY I= 45.80   ESAYII= 53.05   TOTAL= 98.85
  LEVEL 2    MEANS   ESAY I= 44.82   ESAYII= 45.14   TOTAL= 89.96
FACTORS   3 (ABILTY)
  LEVEL 1    MEANS   ESAY I= 49.37   ESAYII= 51.96   TOTAL= 101.3
  LEVEL 2    MEANS   ESAY I= 41.25   ESAYII= 46.22   TOTAL= 87.47
FACTORS   4 (ESSAY )
  LEVEL 1    MEANS   ESAY I= 45.00   ESAYII= 38.06   TOTAL= 83.06
  LEVEL 2    MEANS   ESAY I= 40.94   ESAYII= 53.84   TOTAL= 94.79
  LEVEL 3    MEANS   ESAY I= 50.43   ESAYII= 60.33   TOTAL= 110.8
  LEVEL 4    MEANS   ESAY I= 44.87   ESAYII= 44.15   TOTAL= 89.01
FACTORS   1 ( RACE )   2 ( SEX )
  LEVEL 1,1  MEANS   ESAY I= 42.99   ESAYII= 50.20   TOTAL= 93.19
  LEVEL 1,2  MEANS   ESAY I= 48.37   ESAYII= 48.20   TOTAL= 96.58
  LEVEL 2,1  MEANS   ESAY I= 48.62   ESAYII= 55.88   TOTAL= 104.5
  LEVEL 2,2  MEANS   ESAY I= 41.26   ESAYII= 42.08   TOTAL= 83.34
RAW RESIDUALS - ROWS ARE FULL CELLS - COLUMNS ARE VARIABLES
[32 rows of raw residuals (observed minus estimated cell means)]

RESIDUALS IN STD. DEV. UNITS - FULL CELLS X VARIABLES
[32 rows]

RESIDUALS AS T-STATISTICS - FULL CELLS X VARIABLES
[32 rows]

RESIDUALS ESTIMATED AFTER FITTING MODEL OF RANK  8      D.F.=  80

ANALYSIS OF VARIANCE                                                    PAGE 9
DEPENDENT VARIABLE(S)    1  ESAY I    2  ESAYII
NUMBER OF ALTERNATE BASIS ORDERS =   1
LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS =  1.89515660E+01

HYPOTHESIS  1     1 DEGREE(S) OF FREEDOM                               PAGE 10
CO,CO,CO,OO,                                       CONST.
TESTS OF HYPOTHESIS BEING SKIPPED

HYPOTHESIS  2     1 DEGREE(S) OF FREEDOM                               PAGE 11
1,0,0,0,          NEGRO - WHITE                    RACE
LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.89578616E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .2495   D.F. =  2  AND  79.0000   P LESS THAN  .7799
(LIKELIHOOD RATIO  9.93724190E-01   LOG = -6.29558579E-03)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I          26.4417             .1524         .6973          .1524        .6973
                STEP-DOWN MEAN SQUARES = (   26.4417/  173.4710)
ESAYII          17.7291             .0814         .7762          .3477        .5571
                STEP-DOWN MEAN SQUARES = (   53.9347/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    26.44170  -21.65147
2  ESAYII               17.72905
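Because this hypothesis carries a single degree of freedom, its mean product matrix has rank one, and the off-diagonal element is fixed, up to sign, by the diagonal. This gives a handy check when reading these tables (an added note):

$$|m_{12}|=\sqrt{m_{11}\,m_{22}}=\sqrt{(26.4417)(17.7291)}=21.6515 ,$$

agreeing with the printed -21.65147.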
1 DEGREEISI OF FREEDOM
3
PAGE
e,t,o,o,
"ALE - FEMALE
LOG-DETERMINANT SCF HYPOTHESES • SCP ERROR, ADJUSTED FOR ANY COVARIATES,
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VEcTORS=
D. F.:
2
AND
ILIKELIHOOD RATIO
HYPOTHESIS MEAN SQ
VARIABLE
12
SEX
79.0000
9.12541859E-01
UNIVARIATE F
P LESS THAN
LOG
=
1o90430873E•01
3.7857
.0270
-9.15213218E-021
P LE:SS lfHAN
STEP DOliN F
P LESS THAN
0 I
1-'
N
2
ESAY I
37.3117
.2151
.6441
ESAYII
11>07.3690
6.4599
.0130
o2151 .6441 STEP-DOilN HEAN SQUARES =I 37.3117/ 173.47101 7.3392 .0083 STEP-DOilN MEAN SQUARES =I 1138.3281/ 155.10171
DEGREES OF FREEDOM FOR HYPOTHESIS= 1 DEGREES OF FREEDOM FOR ERROR= 80
HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES 1 ESAY
1 2
ESAY I ESAYII
2 ESAYII
I
37.312 229.153
1407.36~
HYPOTHESIS  4     1 DEGREE(S) OF FREEDOM                               PAGE 13
0,0,1,0,          HIGH - LOW ABILITY               ABILTY

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90799109E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =   5.4093   D.F. =  2  AND  79.0000   P LESS THAN  .0063
(LIKELIHOOD RATIO  8.79549978E-01   LOG = -1.28344891E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I        1822.0538           10.5035         .0018        10.5035        .0018
                STEP-DOWN MEAN SQUARES = ( 1822.0538/  173.4710)
ESAYII        1182.6876            5.4286         .0224          .3946        .5317
                STEP-DOWN MEAN SQUARES = (   61.2087/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    1822.054   1467.965
2  ESAYII               1182.688
HYPOTHESIS  5     3 DEGREE(S) OF FREEDOM                               PAGE 14
0,0,0,0 1,                                         ESSAYS
0,0,0,0 2,                                         ESSAYS
0,0,0,0 3,                                         ESSAYS

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.95171500E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =   8.6065   D.F. =  6  AND  158.0000   P LESS THAN  .0001
(LIKELIHOOD RATIO  5.68028274E-01   LOG = -5.65584084E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         468.9599            2.7034         .0510         2.7034        .0510
                STEP-DOWN MEAN SQUARES = (  468.9599/  173.4710)
ESAYII        2669.3491           12.2524         .0001        15.7587        .0001
                STEP-DOWN MEAN SQUARES = ( 2444.1994/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I     468.960    327.?65
2  ESAYII               2669.349
HYPOTHESIS  6     1 DEGREE(S) OF FREEDOM                               PAGE 15
1,1,0,0,          RACE X SEX                       R X S

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90362491E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =   3.4907   D.F. =  2  AND  79.0000   P LESS THAN  .0353
(LIKELIHOOD RATIO  9.18803401E-01   LOG = -8.46831067E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I        1102.5474            6.3558         .0137         6.3558        .0137
                STEP-DOWN MEAN SQUARES = ( 1102.5474/  173.4710)
ESAYII         944.6970            4.3362         .0406          .6532        .4215
                STEP-DOWN MEAN SQUARES = (  101.3051/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    1102.547   1020.575
2  ESAYII                944.697

HYPOTHESIS  7     1 DEGREE(S) OF FREEDOM                               PAGE 16
1,0,1,0,          RACE X ABILITY                   R X A
LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.89605776E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .3576   D.F. =  2  AND  79.0000   P LESS THAN  .7006
(LIKELIHOOD RATIO  9.91028830E-01   LOG = -9.01165343E-03)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         103.9784             .5994         .4411          .5994        .4411
                STEP-DOWN MEAN SQUARES = (  103.9784/  173.4710)
ESAYII         112.3468             .5157         .4748          .1223        .7275
                STEP-DOWN MEAN SQUARES = (   18.9711/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    103.9784   108.0816
2  ESAYII               112.3468
HYPOTHESIS  8     3 DEGREE(S) OF FREEDOM                               PAGE 17
1,0,0,0 1,        RACE X ESSAYS                    R X E
1,0,0,0 2,        RACE X ESSAYS                    R X E
1,0,0,0 3,        RACE X ESSAYS                    R X E

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90049009E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .7117   D.F. =  6  AND  158.0000   P LESS THAN  .6407
(LIKELIHOOD RATIO  9.48062426E-01   LOG = -5.33349290E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I          83.3927             .4807         .6967          .4807        .6967
                STEP-DOWN MEAN SQUARES = (   83.3927/  173.4710)
ESAYII         174.3521             .8003         .4974          .9505        .4203
                STEP-DOWN MEAN SQUARES = (  147.4637/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I     83.3927    47.4764
2  ESAYII               174.3521
HYPOTHESIS  9     1 DEGREE(S) OF FREEDOM                               PAGE 18
0,1,1,0,          SEX X ABILITY                    S X A

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.89536276E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .0815   D.F. =  2  AND  79.0000   P LESS THAN  .9218
(LIKELIHOOD RATIO  9.97940489E-01   LOG = -2.06163426E-03)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I          14.5900             .0841         .7726          .0841        .7726
                STEP-DOWN MEAN SQUARES = (   14.5900/  173.4710)
ESAYII           1.4145             .0065         .9360          .0799        .7782
                STEP-DOWN MEAN SQUARES = (   12.3925/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    14.59003   -4.54300
2  ESAYII                1.41459
HYPOTHESIS 10     3 DEGREE(S) OF FREEDOM                               PAGE 19
0,1,0,0 1,        SEX X ESSAYS                     S X E
0,1,0,0 2,        SEX X ESSAYS                     S X E
0,1,0,0 3,        SEX X ESSAYS                     S X E

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90048960E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .7116   D.F. =  6  AND  158.0000   P LESS THAN  .6408
(LIKELIHOOD RATIO  9.48067081E-01   LOG = -5.33300182E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         136.1694             .7850         .5058          .7850        .5058
                STEP-DOWN MEAN SQUARES = (  136.1694/  173.4710)
ESAYII          81.8548             .3757         .7708          .6482        .5664
                STEP-DOWN MEAN SQUARES = (  100.5432/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    136.1694    25.7129
2  ESAYII                81.8548
HYPOTHESIS 11     3 DEGREE(S) OF FREEDOM                               PAGE 20
0,0,1,0 1,        ABILITY X ESSAYS                 A X E
0,0,1,0 2,        ABILITY X ESSAYS                 A X E
0,0,1,0 3,        ABILITY X ESSAYS                 A X E

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90002872E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .6494   D.F. =  6  AND  158.0000   P LESS THAN  .6906
(LIKELIHOOD RATIO  9.52446614E-01   LOG = -4.87212246E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         109.7881             .6329         .5960          .6329        .5960
                STEP-DOWN MEAN SQUARES = (  109.7881/  173.4710)
ESAYII          53.2273             .2443         .8652          .6738        .5707
                STEP-DOWN MEAN SQUARES = (  104.5060/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    109.7881    -9.4696
2  ESAYII                53.2273
HYPOTHESIS 12     1 DEGREE(S) OF FREEDOM                               PAGE 21
1,1,1,0,          RACE X SEX X ABILITY             RXSXA

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.89757674E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .9676   D.F. =  2  AND  79.0000   P LESS THAN  .3845
(LIKELIHOOD RATIO  9.76089089E-01   LOG = -2.42014166E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         207.3301            1.1952         .2776         1.1952        .2776
                STEP-DOWN MEAN SQUARES = (  207.3301/  173.4710)
ESAYII         384.7483            1.7660         .1877          .7439        .3911
                STEP-DOWN MEAN SQUARES = (  115.3765/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    207.3301   282.4357
2  ESAYII               384.7483
HYPOTHESIS 13     3 DEGREE(S) OF FREEDOM                               PAGE 22
1,1,0,0 1,        RACE X SEX X ESSAYS              RXSXE
1,1,0,0 2,        RACE X SEX X ESSAYS              RXSXE
1,1,0,0 3,        RACE X SEX X ESSAYS              RXSXE

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90977924E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =   1.9974   D.F. =  6  AND  158.0000   P LESS THAN  .0691
(LIKELIHOOD RATIO  8.63962028E-01   LOG = -1.46226460E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         336.0564            1.9372         .1303         1.9372        .1303
                STEP-DOWN MEAN SQUARES = (  336.0564/  173.4710)
ESAYII         693.6311            3.1838         .0283         2.0821        .1093
                STEP-DOWN MEAN SQUARES = (  322.9386/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    336.0564   399.8568
2  ESAYII               693.6311
HYPOTHESIS 14     3 DEGREE(S) OF FREEDOM                               PAGE 23
1,0,1,0 1,        RACE X ABILITY X ESSAYS          RXAXE
1,0,1,0 2,        RACE X ABILITY X ESSAYS          RXAXE
1,0,1,0 3,        RACE X ABILITY X ESSAYS          RXAXE

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90160656E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .8631   D.F. =  6  AND  158.0000   P LESS THAN  .5236
(LIKELIHOOD RATIO  9.37536414E-01   LOG = -6.44996748E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         125.0256             .7207         .5426          .7207        .5426
                STEP-DOWN MEAN SQUARES = (  125.0256/  173.4710)
ESAYII         239.6829            1.1002         .3541         1.0153        .3905
                STEP-DOWN MEAN SQUARES = (  157.4753/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    125.0256   105.3?69
2  ESAYII               239.6829
HYPOTHESIS 15     3 DEGREE(S) OF FREEDOM                               PAGE 24
0,1,1,0 1,        SEX X ABILITY X ESSAYS           SXAXE
0,1,1,0 2,        SEX X ABILITY X ESSAYS           SXAXE
0,1,1,0 3,        SEX X ABILITY X ESSAYS           SXAXE

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.89922871E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =    .5417   D.F. =  6  AND  158.0000   P LESS THAN  .7760
(LIKELIHOOD RATIO  9.60096638E-01   LOG = -4.07211267E-02)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I          80.2076             .4624         .7094          .4624        .7094
                STEP-DOWN MEAN SQUARES = (   80.2076/  173.4710)
ESAYII         191.2732             .8779         .4562          .6270        .5997
                STEP-DOWN MEAN SQUARES = (   97.2479/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I     80.2076   101.0006
2  ESAYII               191.2732
HYPOTHESIS 16     3 DEGREE(S) OF FREEDOM                               PAGE 25
1,1,1,0 1,        RACE X SEX X ABILITY X ESSAYS    RSAE
1,1,1,0 2,        RACE X SEX X ABILITY X ESSAYS    RSAE
1,1,1,0 3,        RACE X SEX X ABILITY X ESSAYS    RSAE

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90523754E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =   1.3613   D.F. =  6  AND  158.0000   P LESS THAN  .2335
(LIKELIHOOD RATIO  9.04105262E-01   LOG = -1.00809485E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I         326.1804            1.8803         .1396         1.8803        .1396
                STEP-DOWN MEAN SQUARES = (  326.1804/  173.4710)
ESAYII         149.7386             .6873         .5625          .8746        .4580
                STEP-DOWN MEAN SQUARES = (  135.6500/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  3   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    326.1804   109.8157
2  ESAYII               149.7386
ORDER OF BASIS VECTORS  2                                              PAGE 26
1, 2, 3, 5, 6, 7, 4

HYPOTHESIS  1     6 DEGREE(S) OF FREEDOM                               PAGE 27
CO,CO,CO,OO,      CONST.
1,0,0,0,          NEGRO - WHITE                    RACE
0,1,0,0,          MALE - FEMALE                    SEX
0,0,0,0 1,  0,0,0,0 2,  0,0,0,0 3,                 ESSAYS
TESTS OF HYPOTHESIS BEING SKIPPED

HYPOTHESIS  2     1 DEGREE(S) OF FREEDOM                               PAGE 28
0,0,1,0,          HIGH - LOW ABILITY               ABILTY

LOG-DETERMINANT SCP HYPOTHESES + SCP ERROR, ADJUSTED FOR ANY COVARIATES, =  1.90758127E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS =   5.2257   D.F. =  2  AND  79.0000   P LESS THAN  .0074
(LIKELIHOOD RATIO  8.83161912E-01   LOG = -1.24246729E-01)

VARIABLE   HYPOTHESIS MEAN SQ   UNIVARIATE F   P LESS THAN   STEP DOWN F   P LESS THAN
ESAY I        1816.8563           10.4735         .0018        10.4735        .0018
                STEP-DOWN MEAN SQUARES = ( 1816.8563/  173.4710)
ESAYII         908.2788            4.1690         .0445          .0961        .7574
                STEP-DOWN MEAN SQUARES = (   14.9076/  155.1017)
DEGREES OF FREEDOM FOR HYPOTHESIS=  1   DEGREES OF FREEDOM FOR ERROR=  80

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
                1          2
              ESAY I     ESAYII
1  ESAY I    1816.856   1284.606
2  ESAYII                908.279

ANALYSIS OF VARIANCE                                                   PAGE 29
DEPENDENT VARIABLE(S)    3  TOTAL
NUMBER OF ALTERNATE BASIS ORDERS =   1
LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS =  1.07843069E+01

HYPOTHESIS  1     1 DEGREE(S) OF FREEDOM                               PAGE 30
CO,CO,CO,OO,                                       CONST.
TESTS OF HYPOTHESIS BEING SKIPPED

HYPOTHESIS  2     1 DEGREE(S) OF FREEDOM
1,0,0,0,          NEGRO - WHITE                    RACE
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=      .8678   F=     .0014   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .9699

HYPOTHESIS  3     1 DEGREE(S) OF FREEDOM
0,1,0,0,          MALE - FEMALE                    SEX
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  1902.9876   F=    3.1547   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .0796

HYPOTHESIS  4     1 DEGREE(S) OF FREEDOM
0,0,1,0,          HIGH - LOW ABILITY               ABILTY
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  5940.6707   F=    9.8483   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .0024

HYPOTHESIS  5     3 DEGREE(S) OF FREEDOM
0,0,0,0 1,  0,0,0,0 2,  0,0,0,0 3,                 ESSAYS
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  3792.8383   F=    6.2877   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .0007

HYPOTHESIS  6     1 DEGREE(S) OF FREEDOM
1,1,0,0,          RACE X SEX                       R X S
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  4088.3942   F=    6.7776   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .0110

HYPOTHESIS  7     1 DEGREE(S) OF FREEDOM
1,0,1,0,          RACE X ABILITY                   R X A
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   432.4885   F=     .7170   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .3997

HYPOTHESIS  8     3 DEGREE(S) OF FREEDOM
1,0,0,0 1,  1,0,0,0 2,  1,0,0,0 3,                 RACE X ESSAYS    R X E
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   352.6975   F=     .5847   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .6268

HYPOTHESIS  9     1 DEGREE(S) OF FREEDOM
0,1,1,0,          SEX X ABILITY                    S X A
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=     6.9186   F=     .0115   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .9150

HYPOTHESIS 10     3 DEGREE(S) OF FREEDOM
0,1,0,0 1,  0,1,0,0 2,  0,1,0,0 3,                 SEX X ESSAYS     S X E
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   269.4500   F=     .4467   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .7204

HYPOTHESIS 11     3 DEGREE(S) OF FREEDOM
0,0,1,0 1,  0,0,1,0 2,  0,0,1,0 3,                 ABILITY X ESSAYS   A X E
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   144.0762   F=     .2388   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .8691

HYPOTHESIS 12     1 DEGREE(S) OF FREEDOM
1,1,1,0,          RACE X SEX X ABILITY             RXSXA
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  1156.9497   F=    1.9180   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .1700

HYPOTHESIS 13     3 DEGREE(S) OF FREEDOM
1,1,0,0 1,  1,1,0,0 2,  1,1,0,0 3,                 RACE X SEX X ESSAYS   RXSXE
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  1829.4012   F=    3.0327   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .0341

HYPOTHESIS 14     3 DEGREE(S) OF FREEDOM
1,0,1,0 1,  1,0,1,0 2,  1,0,1,0 3,                 RACE X ABILITY X ESSAYS   RXAXE
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   575.3823   F=     .9539   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .4188

HYPOTHESIS 15     3 DEGREE(S) OF FREEDOM
0,1,1,0 1,  0,1,1,0 2,  0,1,1,0 3,                 SEX X ABILITY X ESSAYS   SXAXE
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   473.4821   F=     .7849   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .5059

HYPOTHESIS 16     3 DEGREE(S) OF FREEDOM
1,1,1,0 1,  1,1,1,0 2,  1,1,1,0 3,                 RACE X SEX X ABILITY X ESSAYS   RSAE
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=   695.5505   F=    1.1531   WITH  3 AND  80 DEGREES OF FREEDOM   P LESS THAN  .3330

ORDER OF BASIS VECTORS  2                                              PAGE 31
1, 2, 3, 5, 6, 7, 4

HYPOTHESIS  1     6 DEGREE(S) OF FREEDOM                               PAGE 32
CO,CO,CO,OO,  1,0,0,0,  0,1,0,0,  0,0,0,0 1,  0,0,0,0 2,  0,0,0,0 3,
CONST.   NEGRO - WHITE (RACE)   MALE - FEMALE (SEX)   ESSAYS
TESTS OF HYPOTHESIS BEING SKIPPED

HYPOTHESIS  2     1 DEGREE(S) OF FREEDOM
0,0,1,0,          HIGH - LOW ABILITY               ABILTY
UNIVARIATE ANALYSIS OF VARIANCE FOR (TOTAL)
HYPOTHESIS MEAN SQUARE=  5294.3367   F=    8.7768   WITH  1 AND  80 DEGREES OF FREEDOM   P LESS THAN  .0041

CORE USED FOR DATA=   688 LOCATIONS OUT OF 3000 AVAILABLE
PROBLEM 5 -- NESTED DESIGN MIXED MODEL -- TEST OF INDIVIDUAL PROGRAMMED INSTRUCTION IN REDUCING EFFECTS OF ABSENTEEISM.
PROBLEM 5    NESTED DESIGN    MIXED MODEL ANALYSIS
[Input description and factor identification cards: 4 variables, 3 factors -- EXPCON (2 levels), CLASS (19 levels), SEX (2 levels).]
MATHEMATICS ACHIEVEMENT DATA WERE COLLECTED FROM STUDENTS COMPLETING SEVENTH GRADE MATHEMATICS IN A LARGE CITY IN THE EASTERN UNITED STATES. NINETEEN CLASSES WERE RANDOMLY ASSIGNED TO AN EXPERIMENTAL GROUP, IN WHICH ANY ABSENT STUDENTS COULD ATTEMPT TO LEARN THE MISSED MATERIAL ON INDIVIDUAL TELEVISED INSTRUCTION CONSOLES. EIGHTEEN ADDITIONAL CLASSES, HAVING ONLY REGULAR TEACHER INSTRUCTION, CONSTITUTE THE CONTROL GROUP. CLASS AVERAGES FOR BOYS AND FOR GIRLS SEPARATELY CONSTITUTE THE UNIT OF ANALYSIS. ALL STUDENTS WERE TESTED ON THREE ACHIEVEMENT MEASURES AT THE COMPLETION OF THE SCHOOL YEAR. THE MEASURES EMPLOYED ARE RAW SCORES ON THE
1. COOPERATIVE ARITHMETIC TEST, FORM A (COOP-A)
2. STANFORD ADVANCED MODERN MATHEMATICS CONCEPTS TEST (STANFD)
3. (CITY) JUNIOR HIGH SCHOOL MATHEMATICS TEST, GRADE 7 (CJHSMT)
IN ADDITION, DIFFERENTIAL ABSENTEEISM IS REFLECTED IN SCORES ON THE CRITERION MEASURES. THE AVERAGE NUMBER OF DAYS THE STUDENTS WERE ABSENT IS INCLUDED IN THE DATA AS A POSSIBLE COVARIATE. THIS IS VARIABLE FOUR ON THE PUNCHED CARDS (DAYSAB).
THE EFFECTS IN THE EXPERIMENTAL DESIGN ARE --
TREATMENT GROUP (1 = EXPERIMENTAL, 2 = CONTROL) -- FIXED EFFECT
CLASS (CODED 1 THROUGH 19 FOR GROUP 1, CODED 1 THROUGH 18 FOR GROUP 2) -- RANDOM EFFECT NESTED WITHIN EXPERIMENTAL GROUPS
SEX (1 = FEMALE, 2 = MALE)
THE DATA CARDS ARE PUNCHED AS FOLLOWS --
CARD COLUMN    CONTENT
1-2            EXPERIMENTAL GROUP
3-5            CLASS WITHIN EXPERIMENTAL GROUP
7              SEX
13-22          COOP-A  (F10.5)
23-31          STANFD  (F10.5)
32-40          CJHSMT  (F10.5)
41-49          DAYSAB  (F10.5)
THE SCORES ON EACH DATA CARD ARE THE MEANS FOR ALL MEMBERS OF THE PARTICULAR SEX GROUP IN A PARTICULAR CLASS IN A PARTICULAR EXPERIMENTAL GROUP. THIS RUN USES --
DATA FORM 1 (CONTRAST VECTOR FOR NESTED EFFECT, PLUS REPEAT CODE).
SUPPRESSION OF SUBCLASS STANDARD DEVIATIONS, SINCE ALL N(IJKL) = 1.
SPECIAL EFFECTS AS ERROR TERM (CLASSES WITHIN EXPERIMENTAL GROUPS).
PRINTING OF OPTIONAL OUTPUT.
OBTAINING SUM OF SQUARES AND PRODUCTS FOR THE LAST HYPOTHESIS BY SUBTRACTION (SEX X CLASSES IN EXPERIMENTAL GROUPS).
INCOMPLETE DESIGN (I.E., ONLY 18 CLASSES IN THE CONTROL GROUP).
MEANS KEY FOR OBSERVED MEANS ONLY, FOR FIXED EFFECTS SEX AND EXPERIMENTAL GROUPS.
************************************ NOTE ************************************
THIS MIXED MODEL REQUIRES THE USE OF TWO SEPARATE ERROR TERMS FOR THE COMPLETE ANALYSIS OF VARIANCE, AS FOLLOWS.
A. CLASSES WITHIN EXPERIMENTAL GROUPS PROVIDE A TEST OF SIGNIFICANCE OF THE EXPERIMENTAL GROUP MAIN EFFECT, WHICH IS THE PRIMARY EFFECT OF INTEREST.
B. SEX X CLASSES WITHIN EXPERIMENTAL GROUPS PROVIDES TESTS OF SIGNIFICANCE OF THE SEX AND SEX X GROUP EFFECTS.
THESE ANALYSES REQUIRE TWO SEPARATE RUNS OF THE PROGRAM, OF WHICH ONLY THE FIRST (A) IS GIVEN HERE. THE MEAN PRODUCTS FOR THE EFFECTS IN (B) ARE PRINTED, SO THAT THE ADDITIONAL COMPUTATIONS MAY BE DONE BY HAND, IF DESIRED. THE PROGRAM-COMPUTED TESTS FOR THE EFFECTS IN (B) ARE NOT VALID ON THIS RUN AND MUST BE IGNORED. TO OBTAIN PROGRAM-COMPUTED TESTS FOR (B), THE SAME DATA ARE ENTERED BUT WITH (RESIDUAL) CODED AS THE ERROR TERM (1 IN COLUMN 12 OF THE ESTIMATION SPECIFICATION CARD). THE RESIDUAL IS THE SEX X CLASSES IN EXPERIMENTAL GROUPS INTERACTION EFFECT. THE HYPOTHESIS TEST CARD FOR THE SECOND RUN MIGHT ALSO BE REPUNCHED AS -2,1,1. WHERE THE -2 WILL OMIT TESTS OF THE CONSTANT AND EXPERIMENTAL GROUP TERM, AND ELIMINATION OF THE FINAL 35 WILL OMIT THE TEST OF THE RANDOM CLASSES WITHIN EXPERIMENTAL GROUP EFFECT.
********************************* NOTE ABOVE *********************************
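A minimal sketch of the arithmetic this note prescribes, in Python; it is illustrative only, not part of the MULTIVARIANCE run, and the mean squares below are hypothetical placeholder values standing in for the printed mean products:

    from scipy import stats

    # Hypothetical mean squares (placeholders; read the real ones from the printout).
    ms_group        = 100.0   # experimental groups, 1 d.f.
    ms_class_in_grp =  25.0   # classes within groups, 35 d.f. -- error term for run (A)
    ms_sex          =  40.0   # sex, 1 d.f.
    ms_sex_x_class  =  10.0   # sex x classes within groups, 35 d.f. -- error term for run (B)

    # Run (A): group main effect tested against classes within groups.
    F_group = ms_group / ms_class_in_grp
    p_group = stats.f.sf(F_group, 1, 35)

    # Run (B): sex (and, identically, sex x group) tested against
    # sex x classes within groups.
    F_sex = ms_sex / ms_sex_x_class
    p_sex = stats.f.sf(F_sex, 1, 35)

    print(F_group, p_group, F_sex, p_sex)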
THE COMPLETE DESIGN CAN BE TREATED AS A (2 X 19 X 2) EXPERIMENTAL GROUPS X CLASSES X SEX CROSSED DESIGN. (I) CONTRAST CODES ARE USED TO ESTIMATE CLASS VARIATION SEPARATELY WITHIN EACH EXPERIMENTAL CONDITION. THE RANK OF THE MODEL FOR SIGNIFICANCE TESTING IS 39 (1 D.F. FOR CONSTANT, 1 FOR EXPERIMENTAL GROUPS, 1 FOR SEX, 1 FOR GROUPS X SEX, AND 35 FOR CLASSES WITHIN EXPERIMENTAL GROUPS). CODING SPECIAL EFFECTS AS THE ERROR CAUSES THE LAST 35 CONTRASTS TO BECOME ERROR VARIATION. THIS IS THE CLASSES EFFECT, APPROPRIATE AS AN ERROR TERM FOR EXPERIMENTAL GROUP DIFFERENCES. THE HYPOTHESIS TEST CARD IS -1, 1, 1, 1, -35, 35. TEST OF THE CONSTANT TERM IS BYPASSED. TEST OF THE 1-DEGREE-OF-FREEDOM EXPERIMENTAL GROUPS EFFECT IS MADE. RESULTS FOR THE 1-DEGREE-OF-FREEDOM SEX AND GROUP X SEX EFFECTS ARE PRINTED, BUT TEST STATISTICS ARE NOT CORRECT, SINCE THE ERROR TERM IS NOT THE APPROPRIATE ONE. THE 35 DEGREES OF FREEDOM FOR CLASSES ARE REDUNDANT, AND ARE BYPASSED. ADDING THE FINAL 35 TO THE HYPOTHESIS TEST CARD CAUSES THE REMAINING 35-DEGREE-OF-FREEDOM SEX X CLASSES MEAN PRODUCTS TO BE PRINTED. THIS EFFECT IS OBTAINED BY SUBTRACTING ALL OTHER SUMS OF PRODUCTS FROM THE TOTAL MATRIX. IT MAY BE OF USE FOR FURTHER HAND COMPUTATIONS.
THE RANK OF THE MODEL FOR ESTIMATION IS 2. ONLY THE CONSTANT AND THE EXPERIMENTAL GROUPS CONTRAST ARE ESTIMATED. THESE ARE THE ONLY EFFECTS WHOSE STANDARD ERRORS ARE CORRECTLY ESTIMATED UNDER THE (SPECIAL EFFECTS) ERROR TERM. CLASS MEANS (SEPARATELY FOR BOYS AND GIRLS) ARE THE APPROPRIATE UNITS OF ANALYSIS. THE NESTED CLASSES EFFECT IS NECESSARY SINCE THE MALE AND FEMALE RESULTS ARE TWO OBSERVATIONS TAKEN FROM A SINGLE CLASS. RESULTS FOR INDIVIDUAL PUPILS DO NOT APPEAR IN THIS ANALYSIS. THE MEANS WERE COMPUTED PRIOR TO THIS RUN. THUS THE COMPLETE (2 X 19 X 2) DESIGN HAS ONLY ONE OBSERVATION PER CELL, WITH FOUR MEAN SCORES PER OBSERVATION (THE FOUR TEST SCORES). STANDARD DEVIATIONS FOR EACH CELL MAY BE SUPPRESSED, SINCE THEY ARE IDENTICALLY ZERO. CORE SPACE AND COMPUTING TIME ARE SAVED. BECAUSE THERE IS ONLY ONE OBSERVATION FOR EACH SEX-CLASS COMBINATION, THE RESIDUAL SUM OF PRODUCTS (THE ERROR TERM FOR SUBSEQUENT RUN B) IS ONLY THE SEX X CLASSES INTERACTION. IT DOES NOT INCLUDE VARIATION AMONG REPLICATES WITHIN THE CELLS.
FINISH    END OF COMMENTS
(I2,I3,I2,5X,4F10.5)    VARIABLE FORMAT
COOP-ASTANFDCJHSMTDAYSAB    VARIABLE LABEL CARD
[Data deck: 74 cards, one per sex within class within experimental group, each punched with EXPCON, CLASS, SEX, and the four class means (COOP-A, STANFD, CJHSMT, DAYSAB); the card images are scrambled in this reproduction, but the values reappear in the OBSERVED CELL MEANS table of the printout.]
[Control cards, partially legible: estimation specification card; means key; symbolic contrast vector cards -- C0,C0,C0 (CONST); 1,0,0 (E-C, experimental groups); 0,0,1 (SEX -- ignore F values this run); 1,0,1 (SEX X GROUP -- ignore F values this run); (1,18)1,0 and (2,17)1,0 (classes in groups 1 and 2); analysis selection card; variable select key 1,2,3; hypothesis test card -1,1,1,1,-35,35. (omit CONST and class effects); blank card; STOP.]
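Before turning to the printout, the rank bookkeeping claimed in the notes above (4 fixed-effect degrees of freedom plus 18 + 17 nested class contrasts = 39) can be checked by constructing such a basis directly. The following sketch is one possible coding, not the basis MULTIVARIANCE itself generates; it uses numpy and deviation contrasts for classes within each group:

    import numpy as np

    # The 74 observed units: (group, class, sex).
    units = [(g, c, s)
             for g, nclass in ((1, 19), (2, 18))
             for c in range(1, nclass + 1)
             for s in (1, 2)]

    X = []
    for g, c, s in units:
        grp = 1.0 if g == 1 else -1.0      # experimental - control
        sex = 1.0 if s == 1 else -1.0      # female - male
        row = [1.0, grp, sex, grp * sex]
        # 18 class contrasts within group 1 and 17 within group 2
        # (deviation coding against the last class of each group).
        for gg, nclass in ((1, 19), (2, 18)):
            for k in range(1, nclass):
                if g != gg:
                    row.append(0.0)
                elif c == k:
                    row.append(1.0)
                elif c == nclass:
                    row.append(-1.0)
                else:
                    row.append(0.0)
        X.append(row)

    X = np.asarray(X)
    print(X.shape, np.linalg.matrix_rank(X))   # (74, 39) and rank 39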
* * * * * * *  M U L T I V A R I A N C E  * * * * * * *
UNIVARIATE AND MULTIVARIATE ANALYSIS OF VARIANCE, COVARIANCE AND REGRESSION
VERSION 5    MARCH 1972
PROBLEM    PAGE 1
PROBLEM 5 -- NESTED DESIGN MIXED MODEL -- TEST OF INDIVIDUAL PROGRAMMED INSTRUCTION IN REDUCING EFFECTS OF ABSENTEEISM.
PROBLEM 5    NESTED DESIGN    MIXED MODEL ANALYSIS
[The problem description, data layout, and NOTE paragraphs from the comment cards above are echoed verbatim at this point in the printout.]
INPUT PARAMETERS    PAGE 2
NUMBER OF VARIABLES IN INPUT VECTORS =  4
NUMBER OF FACTORS IN DESIGN =  3
NUMBER OF LEVELS OF FACTOR 1 (EXPCON) =  2
NUMBER OF LEVELS OF FACTOR 2 (CLASS)  = 19
NUMBER OF LEVELS OF FACTOR 3 (SEX)    =  2
INPUT IS FROM CARDS, DATA OPTION 1
MINIMAL PAGE SPACING WILL BE USED
ADDITIONAL OUTPUT WILL BE PRINTED
COMPUTATION OF ST. DEV'S AND COVARIANCE MATRIX FOR EACH GROUP SUPPRESSED
COMPUTATION OF COVARIANCE MATRIX FOR EACH GROUP IMPOSSIBLE DUE TO FORM OF DATA INPUT
FORMAT OF DATA  (I2,I3,I2,5X,4F10.5)  VARIABLE FORMAT
FIRST OBSERVATION    SUBJECT 1, CELL 1
16.4286    16.7857    12.5000    37.8261
CELL IDENTIFICATION AND FREQUENCIES    PAGE 3
[The 76-row table of cells (EXPCON x CLASS x SEX factor levels with cell frequencies) is scrambled in this reproduction. Every nonempty cell has N = 1; the two cells for class 19 of the control group are EMPTY.]
TOTAL N = 74
2 NULL SUBCLASS(ES).
TOTAL SUM OF CROSS-PRODUCTS
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
1 COOP-A    20332.73
2 STANFD    224??.51    25275.56
3 CJHSMT    16193.22    17947.60    13054.33
4 DAYSAB    35581.55    40873.21    26450.66    78548.09
OBSERVED CELL MEANS --- ROWS ARE CELLS, COLUMNS ARE VARIABLES (1 COOP-A, 2 STANFD, 3 CJHSMT, 4 DAYSAB)
[The 74 rows of cell means are scrambled in this reproduction; the values repeat those on the data cards listed with the input deck.]
OBSERVED COMBINED MEANS    PAGE 4
OVERALL, N = 74:          COOP-A = 16.19    STANFD = 18.26    CJHSMT = 12.93    DAYSAB = 30.92
FACTOR (EXPCON)
  LEVEL 1, N = 38:        COOP-A = 16.27    STANFD = 17.97    CJHSMT = 13.36    DAYSAB = 33.39
  LEVEL 2, N = 36:        COOP-A = 16.10    STANFD = 18.56    CJHSMT = 12.48    DAYSAB = 28.31
FACTOR (SEX)
  LEVEL 1, N = 37:        COOP-A = 15.78    STANFD = 18.35    CJHSMT = 12.88    DAYSAB = 32.15
  LEVEL 2, N = 37:        COOP-A = 16.59    STANFD = 18.16    CJHSMT = 12.99    DAYSAB = 29.69
FACTORS (EXPCON, SEX)
  LEVEL 1,1, N = 19:      COOP-A = 15.71    STANFD = 18.15    CJHSMT = 13.40    DAYSAB = 36.50
  LEVEL 1,2, N = 19:      COOP-A = 15.83    STANFD = 17.79    CJHSMT = 13.33    DAYSAB = 30.28
  LEVEL 2,1, N = 18:      COOP-A = 15.65    STANFD = 18.57    CJHSMT = 12.33    DAYSAB = 27.56
  LEVEL 2,2, N = 18:      COOP-A = 16.35    STANFD = 18.56    CJHSMT = 12.63    DAYSAB = 29.06
ESTIMATION PARAMETERS    PAGE 5
RANK OF THE BASIS = 39
RANK OF MODEL FOR SIGNIFICANCE TESTING = 39
RANK OF THE MODEL TO BE ESTIMATED IS 2
ERROR TERM TO BE USED IS (SPECIAL EFFECTS)
DEGREES OF FREEDOM FOR ERROR IS 35
VARIANCE-COVARIANCE FACTORS AND CORRELATIONS AMONG ESTIMATES WILL BE PRINTED

SYMBOLIC CONTRAST VECTORS    PAGE 6
 1)       C0,C0,C0,                        CONST
 2)       1,0,0,                           EXPERIMENTAL GROUPS    E-C
 3)       0,0,1,                           SEX -- IGNORE F VALUES THIS RUN
 4)       1,0,1,                           SEX X GROUP ---------- IGNORE F VALUES THIS RUN
 5)-22)   I1,0 k,0,  (k = 1, ..., 18)      CLASS -- CLASSES IN GROUP 1
23)-39)   I2,0 k,0,  (k = 1, ..., 17)      CLASS -- CLASSES IN GROUP 2
ERROR SUM OF CROSS-PRODUCTS
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
1 COOP-A     865.947
2 STANFD     575.901     514.966
3 CJHSMT     677.251     479.623     621.883
4 DAYSAB   -1368.137    -897.218   -1163.447   5231.447
ERROR VARIANCE-COVARIANCE MATRIX
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
1 COOP-A     24.7414
2 STANFD     16.4543     14.7133
3 CJHSMT     19.3500     13.7035     17.7681
4 DAYSAB    -39.0896    -25.6348    -33.2414   149.4699
ERROR CORRELATION MATRIX
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
1 COOP-A    1.000000
2 STANFD     .862408    1.000000
3 CJHSMT     .922890     .847533   1.000000
4 DAYSAB    -.642796    -.546635   -.645032   1.000000
VARIABLE       VARIANCE (ERROR MEAN SQUARES)    STANDARD DEVIATION
1 COOP-A              24.741351                       4.9741
2 STANFD              14.713420                       3.8358
3 CJHSMT              17.768076                       4.2152
4 DAYSAB             149.469920                      12.2258
D.F. = 35
ERROR TERM FOR ANALYSIS OF VARIANCE (SPECIAL EFFECTS)
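The variance-covariance and correlation matrices above are simple rescalings of the error sum of cross-products; the sketch below reproduces them in Python (the values are transcribed from this printout, so it is a check rather than an independent computation):

    import numpy as np

    # Error (classes within groups) sum of cross-products, 35 d.f.
    # Order: COOP-A, STANFD, CJHSMT, DAYSAB.
    S = np.array([[  865.947,  575.901,   677.251, -1368.137],
                  [  575.901,  514.966,   479.623,  -897.218],
                  [  677.251,  479.623,   621.883, -1163.447],
                  [-1368.137, -897.218, -1163.447,  5231.447]])

    V  = S / 35.0                    # error variance-covariance matrix
    sd = np.sqrt(np.diag(V))         # about 4.9741, 3.8358, 4.2152, 12.2258
    R  = V / np.outer(sd, sd)        # error correlation matrix

    print(np.round(V, 4))
    print(np.round(sd, 4))
    print(np.round(R, 6))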
LEAST-SQUARES ESTIMATES OF EFFECTS -- EFFECTS X VARIABLES
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
CONST       16.18395    18.26504    12.92184    30.85286
E-C           .17060     -.59615      .88407     5.08161

ESTIMATES OF EFFECTS IN STANDARD DEVIATION UNITS -- EFFECTS X VARIABLES
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
CONST       3.253665    4.761735    3.065520    2.523589
E-C          .034297    -.155417     .209732     .415646

STANDARD ERRORS OF LEAST-SQUARES ESTIMATES -- EFFECTS BY VARS
            1 COOP-A    2 STANFD    3 CJHSMT    4 DAYSAB
CONST        .578435     .446065     .490188    1.421738
E-C         1.156870     .892130     .980376    2.843477

VARIANCE-COVARIANCE FACTORS OF ESTIMATES
            1 CONST           2 E-C
CONST       13.523392E-03
E-C        -73.099415E-05     54.093567E-03

INTERCORRELATIONS AMONG THE ESTIMATES
            1 CONST     2 E-C
CONST       1.000000
E-C         -.027027    1.000000
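The standard errors above are the error standard deviations scaled by the square roots of the variance-covariance factors, and for this design the E-C factor is essentially 1/n1 + 1/n2 = 1/38 + 1/36 = .054094. A Python sketch of the scaling, with the factors and standard deviations transcribed from the printout:

    import numpy as np

    sd = np.array([4.9741, 3.8358, 4.2152, 12.2258])   # error standard deviations

    v_const = 13.523392e-03     # variance factor, CONST
    v_ec    = 54.093567e-03     # variance factor, E-C (= 1/38 + 1/36)
    cov_cf  = -73.099415e-05    # covariance factor, CONST with E-C

    se_const = sd * np.sqrt(v_const)              # about .5784  .4461  .4902  1.4217
    se_ec    = sd * np.sqrt(v_ec)                 # about 1.1569 .8921  .9804  2.8435
    r        = cov_cf / np.sqrt(v_const * v_ec)   # about -.027027

    print(np.round(se_const, 6), np.round(se_ec, 6), round(r, 6))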
ANALYSIS OF VARIANCE    PAGE 7
DEPENDENT VARIABLE(S):  1 COOP-A    2 STANFD    3 CJHSMT
LOG-DETERMINANT ERROR SUM OF CROSS-PRODUCTS = 1.60976698E+01

HYPOTHESIS 1    1 DEGREE(S) OF FREEDOM    PAGE 8
C0,C0,C0,    CONST
TESTS OF HYPOTHESIS BEING SKIPPED
HYPOTHESIS 2    1 DEGREE(S) OF FREEDOM    PAGE 9
1,0,0,    EXPERIMENTAL GROUPS    E-C
LOG-DETERMINANT SCP HYPOTHESIS + SCP ERROR, ADJUSTED FOR ANY COVARIATES, = 1.63198701E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS = 2.7370    D.F. = 3 AND 33.0000    P LESS THAN .0592
(LIKELIHOOD RATIO  8.00757993E-01    LOG  -2.22200318E-01)

VARIABLE    HYPOTHESIS MEAN SQ    UNIVARIATE F    P LESS THAN    STEP-DOWN F    P LESS THAN
1 COOP-A          .5380               .0217          .8837           .0217          .8837
2 STANFD         6.5700               .4465          .5084          2.3969          .1309
3 CJHSMT        14.4485               .8132          .3734          5.4733          .0256
STEP-DOWN MEAN SQUARES = ( .5380/24.7414)  ( 9.3028/ 3.8812)  (14.2209/ 2.5982)
DEGREES OF FREEDOM FOR HYPOTHESIS = 1
DEGREES OF FREEDOM FOR ERROR = 35

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
            1 COOP-A    2 STANFD    3 CJHSMT
1 COOP-A      .53800
2 STANFD    -1.88006     6.56996
3 CJHSMT     2.78806    -9.74299    14.44847
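For a 1-degree-of-freedom hypothesis the multivariate statistics on this page can be recovered from the two log-determinants alone: the likelihood ratio is the ratio of the generalized variances, and the F transformation is exact. A Python sketch with the printed values (p = 3 criterion measures, 35 error degrees of freedom):

    import math
    from scipy import stats

    log_det_error = 16.0976698    # log |SCP error|
    log_det_h_e   = 16.3198701    # log |SCP hypothesis + SCP error|
    p, dfe        = 3, 35

    lam = math.exp(log_det_error - log_det_h_e)   # Wilks' ratio, about .800758

    df2 = dfe - p + 1                             # 33
    F   = (1.0 - lam) / lam * df2 / p             # about 2.7370
    print(F, stats.f.sf(F, p, df2))               # p about .0592

The step-down F's are the same idea applied one variable at a time, each adjusted for the variables above it; for example, 9.3028/3.8812 = 2.3969 for STANFD eliminating COOP-A.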
HYPOTHESIS 3    1 DEGREE(S) OF FREEDOM    PAGE 10
0,0,1,    SEX -- IGNORE F VALUES THIS RUN
LOG-DETERMINANT SCP HYPOTHESIS + SCP ERROR, ADJUSTED FOR ANY COVARIATES, = 1.62100828E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS = 1.3087    D.F. = 3 AND 33.0000    P LESS THAN .2881
(LIKELIHOOD RATIO  8.93675109E-01    LOG  -1.12412983E-01)

VARIABLE    HYPOTHESIS MEAN SQ    UNIVARIATE F    P LESS THAN    STEP-DOWN F    P LESS THAN
1 COOP-A        12.3152               .4978          .4852           .4978          .4852
2 STANFD          .6900               .0469          .8299          2.5440          .1200
3 CJHSMT          .2217               .0125          .9118           .8738          .3567
STEP-DOWN MEAN SQUARES = (12.3152/24.7414)  ( 9.8739/ 3.8812)  ( 2.2703/ 2.5982)
DEGREES OF FREEDOM FOR HYPOTHESIS = 1
DEGREES OF FREEDOM FOR ERROR = 35

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
            1 COOP-A    2 STANFD    3 CJHSMT
1 COOP-A    12.31520
2 STANFD    -2.91508      .69002
3 CJHSMT     1.65223     -.39109      .22167
HYPOTHESIS 4    1 DEGREE(S) OF FREEDOM    PAGE 11
1,0,1,    SEX X GROUP ---------- IGNORE F VALUES THIS RUN
LOG-DETERMINANT SCP HYPOTHESIS + SCP ERROR, ADJUSTED FOR ANY COVARIATES, = 1.61432642E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS = .5131    D.F. = 3 AND 33.0000    P LESS THAN .6761
(LIKELIHOOD RATIO  9.55429453E-01    LOG  -4.55943501E-02)

VARIABLE    HYPOTHESIS MEAN SQ    UNIVARIATE F    P LESS THAN    STEP-DOWN F    P LESS THAN
1 COOP-A         1.7290               .0699          .7931           .0699          .7931
2 STANFD          .5638               .0383          .8460           .6793          .4156
3 CJHSMT          .6159               .0347          .8534           .7954          .3790
STEP-DOWN MEAN SQUARES = ( 1.7290/24.7414)  ( 2.6365/ 3.8812)  ( 2.0666/ 2.5982)
DEGREES OF FREEDOM FOR HYPOTHESIS = 1
DEGREES OF FREEDOM FOR ERROR = 35

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
            1 COOP-A    2 STANFD    3 CJHSMT
1 COOP-A    1.728998
2 STANFD    -.987333     .563817
3 CJHSMT   -1.031904     .589265     .615862
HYPOTHESIS 5    35 DEGREE(S) OF FREEDOM    PAGE 12
[The 35 class-within-group contrast vectors are listed again: I1,0 k,0, (k = 1, ..., 18), CLASS -- CLASSES IN GROUP 1, and I2,0 k,0, (k = 1, ..., 17), CLASS -- CLASSES IN GROUP 2.]
TESTS OF HYPOTHESIS BEING SKIPPED

HYPOTHESIS 6    35 DEGREE(S) OF FREEDOM
SUM OF PRODUCTS OBTAINED BY SUBTRACTION    PAGE 13
LOG-DETERMINANT SCP HYPOTHESIS + SCP ERROR, ADJUSTED FOR ANY COVARIATES, = 1.71544845E+01
F-RATIO FOR MULTIVARIATE TEST OF EQUALITY OF MEAN VECTORS = .4019    D.F. = 105 AND 99.7258    P LESS THAN 1.0000
(LIKELIHOOD RATIO  3.47561140E-01    LOG  -1.05681469E+00)

VARIABLE    HYPOTHESIS MEAN SQ    UNIVARIATE F    P LESS THAN    STEP-DOWN F    P LESS THAN
1 COOP-A         1.8452               .0746         1.0000           .0746         1.0000
2 STANFD         2.4939               .1695         1.0000           .7787          .7674
3 CJHSMT         1.0924               .0615         1.0000           .4584          .9875
STEP-DOWN MEAN SQUARES = ( 1.8452/24.7414)  ( 3.0224/ 3.8812)  ( 1.1910/ 2.5982)
DEGREES OF FREEDOM FOR HYPOTHESIS = 35
DEGREES OF FREEDOM FOR ERROR = 35

HYPOTHESIS MEAN PRODUCTS, ADJUSTED FOR ANY COVARIATES
            1 COOP-A    2 STANFD    3 CJHSMT
1 COOP-A    1.845160
2 STANFD     .185534    2.493859
3 CJHSMT     .591200     .113197    1.092361
CORE USED FOR DATA = 979 LOCATIONS OUT OF 3000 AVAILABLE
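The fractional error degrees of freedom (99.7258) in the 35-degree-of-freedom test above come from Rao's F approximation to the likelihood ratio criterion; a Python sketch with the values printed on that page:

    import math

    p, q, dfe = 3, 35, 35            # variables, hypothesis d.f., error d.f.
    lam = 0.34756114                 # Wilks' likelihood ratio from the printout

    s   = math.sqrt((p*p * q*q - 4) / (p*p + q*q - 5))
    m   = dfe + q - (p + q + 1) / 2.0
    df1 = p * q                      # 105
    df2 = m * s - p * q / 2.0 + 1.0  # about 99.7258

    root = lam ** (1.0 / s)
    F = (1.0 - root) / root * df2 / df1   # about .4019, matching the printout
    print(df1, round(df2, 4), round(F, 4))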
REFERENCES

Anderson, T. W. An introduction to multivariate statistical analysis. New York: Wiley, 1958.
Anderson, T. W. The choice of the degree of a polynomial regression as a multiple decision problem. Annals of Mathematical Statistics, 1962, 33(1), 255-265.
Anscombe, F. J., & Tukey, J. W. The examination and analysis of residuals. Technometrics, 1963, 5, 141-160.
Bargmann, R. E. Representative ordering and selection of variables. Part A. Virginia Polytechnic Institute, June 1962. Cooperative Research Project No. 1132, U.S. Office of Education.
Bargmann, R. E. A survey of appropriate methods of analysis of factorial designs. In W. L. Bashaw & W. G. Findley (Eds.), Symposium on general linear model approach to the analysis of experimental data in educational research. Athens, Ga.: University of Georgia, 1968, pp. 84-106.
Bartlett, M. S. Multivariate analysis. Journal of the Royal Statistical Society Supplement, 1947, 9(B), 176-197.
Björck, Å. Solving linear least squares problems by Gram-Schmidt orthonormalization. BIT, 1967, 7, 1-21.
Bloom, B. S. (Ed.) Taxonomy of educational objectives. Handbook I: Cognitive domain. New York: McKay, 1956.
Bloom, B. S. Stability and change in human characteristics. New York: Wiley, 1964.
Bock, R. D. Programming univariate and multivariate analysis of variance. Technometrics, 1963, 5(1), 95-117.
Bock, R. D. A computer program for univariate and multivariate analysis of variance. In Proceedings of IBM Scientific Symposium on Statistics, October 1963. White Plains, N.Y.: IBM Data Processing Division, 1965, pp. 69-111.
Bock, R. D. Contributions of multivariate experimental designs to educational research. In R. B. Cattell (Ed.), Handbook of multivariate experimental psychology. Skokie, Ill.: Rand McNally, 1966, pp. 820-840.
Bock, R. D., & Bargmann, R. E. Analysis of covariance structures. Psychometrika, 1966, 31, 507-534.
Bock, R. D., & Haggard, E. A. The use of multivariate analysis of variance in behavioral research. In D. K. Whitla (Ed.), Handbook of measurement and assessment in the behavioral sciences. Reading, Mass.: Addison-Wesley, 1968, pp. 100-142.
Bock, R. D., & Repp, B. H. ESL matrix operations subroutines for the IBM System/360 computers. Research Memorandum No. 11. Chicago: Statistical Laboratory, Department of Education, The University of Chicago, August 1970.
Campbell, D. T., & Stanley, J. C. Experimental and quasi-experimental designs for research. Skokie, Ill.: Rand McNally, 1966.
Carroll, J. B. A model of school learning. Teachers College Record, 1963, 64, 723-733.
Clyde, D. J., Cramer, E. M., & Sherin, R. J. Multivariate statistical programs. Coral Gables, Fla.: Biometric Laboratory, University of Miami, 1966.
Coleman, J. S., et al. Equality of educational opportunity. Washington: U.S. Government Printing Office, 1966.
Cooley, W. W., & Lohnes, P. R. Multivariate data analysis. New York: Wiley, 1971.
Das Gupta, S. Step-down multiple decision rules. In R. C. Bose et al. (Eds.), Essays in probability and statistics. Chapel Hill, N.C.: University of North Carolina Press, 1970, pp. 229-250.
David, F. N. Tables of the correlation coefficient. New York: Cambridge, 1938.
DeLury, D. B. Values and integrals of the orthogonal polynomials up to n = 26. Toronto: University of Toronto Press, 1950.
Dempster, A. P. Elements of continuous multivariate analysis. Reading, Mass.: Addison-Wesley, 1969.
Draper, N. R., & Smith, H. Applied regression analysis. New York: Wiley, 1966.
Educational Testing Service, Cooperative Test Division. Cooperative mathematics tests, Form A, 1965.
Edwards, A. L. Experimental design in psychological research (3rd ed.). New York: Holt, Rinehart and Winston, 1968.
Elashoff, J. D., & Snow, R. E. Pygmalion reconsidered. Worthington, Ohio: Charles A. Jones, 1971.
Ellis, A. A critique of systematic theoretical foundations in clinical psychology. Journal of Clinical Psychology, 1952, 8, 11-15.
Enslein, K., Ralston, A., & Wilf, H. S. (Eds.) Statistical methods for digital computers (Volume 3 of Mathematical methods for digital computers). New York: Wiley (in press).
Federer, W. T. Experimental design. New York: Macmillan, 1955.
Finn, J. D. The educational environment: Expectations. Paper presented at the meeting of the American Educational Research Association, Minneapolis, March 1970.
Finn, J. D. Evaluation of instructional outcomes: Extensions to meet current needs. Curriculum Theory Network, 1972, No. 8/9, 96-114. (a)
Finn, J. D. Expectations and the educational environment. Review of Educational Research, 1972, 42, 387-410. (b)
Finn, J. D. Measurement and evaluation. Paper presented at the Invitational Conference on Research Design, National Council of Teachers of English, Minneapolis, November 20-22, 1972. (c)
Finn, J. D. MULTIVARIANCE: Univariate and multivariate analysis of variance, covariance and regression. Ann Arbor, Mich.: National Educational Resources, Inc., 1972. (d)
Fisher, R. A. On the probable error of a coefficient of correlation deduced from a small sample. Metron, 1921, 1(4), 3-32.
Fisher, R. A. Statistical methods for research workers (4th ed.). Edinburgh: Oliver and Boyd, 1932.
Fisher, R. A., & Yates, F. Statistical tables for biological, agricultural and medical research (2nd ed.). Edinburgh: Oliver and Boyd, 1943.
Fox, L. An introduction to numerical linear algebra. New York: Oxford, 1965.
Gagné, R. M. The conditions of learning (2nd ed.). New York: Holt, Rinehart and Winston, 1970.
Glass, G. V. The experimental unit and the unit of statistical analysis: Comparative experiments with intact groups. Paper presented at the American Educational Research Association Training Presession on Research in Reading Instruction, Los Angeles, February 1968.
Glass, G. V., & Stanley, J. C. Statistical methods in education and psychology. Englewood Cliffs, N.J.: Prentice-Hall, 1970.
Golub, G. H. Matrix decompositions and statistical calculations. In R. C. Milton & J. A. Nelder (Eds.), Statistical computation. New York: Academic Press, 1969, pp. 365-397.
Graybill, F. A. An introduction to linear statistical models, Vol. 1. New York: McGraw-Hill, 1961.
Graybill, F. A. Introduction to matrices with applications in statistics. Belmont, Calif.: Wadsworth Publishing Co., 1969.
Guilford, J. P. The nature of intelligence. New York: McGraw-Hill, 1967.
Heck, D. L. Charts of some upper percentage points of the distribution of the largest characteristic root. Annals of Mathematical Statistics, 1960, 31, 625-642.
Hohn, F. E. Elementary matrix algebra (2nd ed.). New York: Macmillan, 1964.
Hotelling, H. The generalization of Student's ratio. Annals of Mathematical Statistics, 1931, 2, 360-378.
Householder, A. S. The theory of matrices in numerical analysis. New York: Blaisdell, 1964.
Hummel, T. J., & Sligo, J. R. Empirical comparison of univariate and multivariate analysis of variance procedures. Psychological Bulletin, 1971, 76(1), 49-57.
Jensen, D. R., & Howe, R. B. Tables of upper percentage points of Hotelling's T²-distribution, 1968. Partially published in Kramer, C. Y., & Jensen, D. R. Fundamentals of multivariate analysis, Part I: Inference about means. Journal of Quality Technology, 1969, 1, 120-133.
Jöreskog, K. G. A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 1969, 34, 183-202.
Kelley, T. L., Madden, R., Gardner, E. F., & Rudman, H. C. Stanford modern mathematics concepts test. New York: Harcourt, 1965.
Kirk, R. E. Experimental design: Procedures for the behavioral sciences. Belmont, Calif.: Brooks/Cole Publishing Co., 1968.
Krathwohl, D. R., Bloom, B. S., & Masia, B. B. Taxonomy of educational objectives. Handbook II: Affective domain. New York: McKay, 1964.
Kropp, R. P., Stoker, H. W., & Bashaw, W. L. The validation and construction of tests of the cognitive processes as described in the Taxonomy of Educational Objectives. Institute of Human Learning, and Department of Educational Research and Testing, Florida State University, 1966. Cooperative Research Project 2117, U.S. Office of Education.
Kurkjian, B., & Zelen, M. A calculus for factorial arrangements. Annals of Mathematical Statistics, 1962, 33, 600-619.
Lorge, I., Thorndike, R., & Hagen, E. P. Technical manual: Lorge-Thorndike multi-level intelligence tests. Boston: Houghton Mifflin, 1966.
Mandler, G., & Stephens, D. The development of free and constrained conceptualization and subsequent verbal memory. Journal of Experimental Child Psychology, 1967, 5, 86-93.
Meredith, W. Canonical correlations with fallible data. Psychometrika, 1964, 29, 55-65.
Miller, J. K., & Farr, S. D. Bimultivariate redundancy: A comprehensive measure of interbattery relationship. Multivariate Behavioral Research, 1971, 6(3), 313-324.
Miller, R. G., Jr. Simultaneous statistical inference. New York: McGraw-Hill, 1966.
Morrison, D. F. Multivariate statistical methods. New York: McGraw-Hill, 1967.
Noble, B. Applied linear algebra. Englewood Cliffs, N.J.: Prentice-Hall, 1969.
Olkin, I. Correlations revisited. In J. C. Stanley (Ed.), Improving experimental design and statistical analysis. Skokie, Ill.: Rand McNally, 1967.
Olkin, I., & Pratt, J. W. Unbiased estimation of certain correlation coefficients. Annals of Mathematical Statistics, 1958, 29, 201-211.
Olkin, I., & Siotani, M. Asymptotic distribution functions of a correlation matrix. Report No. 6. Stanford, Calif.: Stanford University Laboratory for Quantitative Research in Education, 1964.
Ortega, J. M. On Sturm sequences for tridiagonal matrices. Journal of the Association for Computing Machinery, 1960, 7, 260-263.
Peng, K. C. The design and analysis of scientific experiments. Reading, Mass.: Addison-Wesley, 1967.
Pillai, K. C. S. Statistical tables for tests of multivariate hypotheses. Manila: The Statistical Center, University of the Philippines, 1960.
Pillai, K. C. S. On the distribution of the largest seven roots of a matrix in multivariate analysis. Biometrika, 1964, 51, 270-275.
Pillai, K. C. S. On the distribution of the largest characteristic root of a matrix in multivariate analysis. Biometrika, 1965, 52, 405-414.
Pillai, K. C. S. Upper percentage points of the largest root of a matrix in multivariate analysis. Biometrika, 1967, 54, 189-193.
Press, S. J. Applied multivariate analysis. New York: Holt, Rinehart and Winston, 1972.
Pruzek, R. M., & Kleinke, D. J. Recent developments in educational research methodology. Paper presented at the research convention of the Educational Research Association of New York State, Albany, November 1967.
Ralston, A., & Wilf, H. S. (Eds.) Mathematical methods for digital computers. New York: Wiley, 1965.
Ralston, A., & Wilf, H. S. (Eds.) Mathematical methods for digital computers, Vol. 2. New York: Wiley, 1967.
Rao, C. R. An asymptotic expansion of the distribution of Wilks' criterion. Bulletin of the International Statistical Institute, 1951, 33(2), 177-180.
Raths, J. The appropriate experimental unit. Educational Leadership, 1967, 25, 263-266.
Rosenthal, R., & Jacobson, L. Pygmalion in the classroom. New York: Holt, Rinehart and Winston, 1968.
Roy, J. Step-down procedure in multivariate analysis. Annals of Mathematical Statistics, 1958, 29, 1177-1187.
Roy, J., & Bargmann, R. E. Tests of multiple independence and the associated confidence bounds. Annals of Mathematical Statistics, 1958, 29, 491-503.
Schatzoff, M. Exact distributions of Wilks' likelihood ratio criterion. Biometrika, 1966, 53, 347-358.
Searle, S. R. Matrix algebra for the biological sciences. New York: Wiley, 1966.
Stewart, D., & Love, W. A general canonical correlation index. Psychological Bulletin, 1968, 70, 160-163.
Tatsuoka, M. M. Multivariate analysis: Techniques for educational and psychological research. New York: Wiley, 1971.
Volpe, A., Manhold, J., & Hazen, S. In vivo calculus assessment: Part I. A method and its examiner reproducibility. Journal of Periodontics, 1965, 36, 292-298.
Wall, F. J. The generalized variance ratio of U-statistics. Albuquerque: The Dikewood Corporation, 1968.
Wilkinson, J. H. The algebraic eigenvalue problem. New York: Oxford, 1965.
Wilks, S. S. Certain generalizations in the analysis of variance. Biometrika, 1932, 24, 471-494.
Wolf, R. M. The measurement of environments. In Proceedings of the 1964 Invitational Conference on Testing Problems. Princeton, N.J.: Educational Testing Service, 1965, pp. 93-106.
Index
A
Addition, of matrices, 23-24
Adjusted predicted means, 376
Adjusted treatment means, 376
Analysis of covariance, 368-393: estimating θ and B, 372-374; hypothesis testing, 376-383; with large N, 381; models, 368-372; and nonexperimental studies, 388; prediction by, 375-376; steps in, 371; tests, 369; Word Memory Experiment, 383-393
Analysis of variance: assumptions, 207-208, 211-212; discriminant analysis, 357-368; effects, 206; error terms, 303-304; estimation, 251-295; models, 6, 8, 205-250; least-squares estimation for, 215-219; multivariate, 210-215; rank, 218-219; univariate, 205-210; significance tests, 296-356
Anderson, T. W., 147, 162, 326
Anscombe, F. J., 106, 108
Associative addition, 23

B
Bargmann, R. E., 157, 171, 191
Bartlett, M. S., 148, 192, 313
Bashaw, W. L., 11
Basis: for higher-order designs, 236-244; for one-way design, 235-236
Behavioral models, 2-3
Björck, Å., 45
Bloom, B. S., 2, 3, 7, 11
Bock, R. D., 44, 159, 228, 235, 238, 244, 360

C
Campbell, D. T., 94
Canonical analysis, 9
Canonical correlation, 9, 174, 187-193
Canonical variance, 360
Carroll, J. B., 7
Cauchy-Schwarz inequality, 176
Cell, 205; empty, 281
Characteristic root, 48-50
Characteristic vector, 48-50
Chi-square distribution, 148, 192, 313, 363, 381
Cholesky factors, 37, 39-40, 42-43, 44-47, 65-66, 151, 158, 166-170, 190, 195, 196, 216, 298, 299, 300, 306, 314, 322, 324, 327, 331, 338, 340, 343, 352, 379
Coleman, J. S., 7
Column vector, 21
Combined means, 76, 269
Commutative addition, 23
Comparative study, 8
Computer programs, 4; see also MULTIVARIANCE program
Concomitant variables, 368-369
Conditional distribution, 63-66
Conditional variance, 99
Conditional variance-covariance matrix, 65, 113, 126
The Conditions of Learning, 3
Confidence intervals, 115-116, 264-265
Contrasts: deviation, 230-231; Helmert, 232, 249, 255; in higher-order designs, 236-244; interaction, 246-250; interpretation of weights of, 244-250; in one-way designs, 229-235; orthogonal polynomial, 232-234, 249; selection of, 222-223, 228-250; simple, 231-232
Cooley, W. W., 306
Correlation, 173-204: among estimated effects, 100, 258; canonical, 187-193; in Creativity and Achievement Study, 193-198; multiple, 182-187; partial, 181-182; simple, 175-180; within-group, 81-83
Correlation coefficient, 173; partial, 182
Correlation matrix, 60, 71-72, 79, 176-177
Covariance tests, 378-383
Creativity and Achievement Study (Sample Problem 1), 10-12: correlation in, 193-198; data summary, 85-88; estimation in, 127-133; principal components in, 201-202; significance testing in, 165-172

D
Das Gupta, S., 159
Data matrix, 19
David, F. N., 178
Deficient rank, 31
DeLury, D. B., 233
Dempster, A. P., 62
Dental Calculus Reduction Study (Sample Problem 3), 14-16: data summary, 88-91; discriminant analysis of, 363-368; estimation in, 278-286; principal components in, 202-204; significance tests for, 333-344
Determinant, of matrix, 32-36
Deviation contrasts, 230-231
Diagonal matrix, 22
Directional hypothesis, 333
Discriminant function analysis, 9, 357-368; of Dental Calculus Reduction Study, 363-368
Discriminant function coefficient, 359
Dispersions, estimating of, 114-121, 260-267, 374-375
Distribution: conditional, 63-66; marginal, 63; multivariate normal, 61-66
Divergent achievement, 11
Draper, N. R., 106
Dummy variables, 252

E
Edwards, A. L., 178
Eigenvalue, 48-50, 190
Eigenvector, 48-50, 190
Elashoff, J. D., 16
Ellis, A., 4
Enslein, K., 10
Error sum of squares, 134
Error terms: classes of, 303-304; in regression model, 136-137
Essay Grading Study (Sample Problem 4), 16-17: estimation in, 286-295; significance tests for, 344-350
Estimation, 92-133: by analysis of covariance, 372-375; of β, 96-104; of B, 110-123; in Creativity and Achievement Study, 127-133; definition, 10; in Dental Calculus Reduction Study, 278-286; of dispersions, 100-101, 114-121, 260-267, 374-375; in Essay Grading Study, 286-295; least-squares, for analysis-of-variance models, 215-219; least-squares, of θ, 252-260; of means and residuals, 267-273; notes on, 327; point, 252-260; reestimation, 164-165, 327; and reparameterization, 219-222; univariate model, 96-108; in Word Memory Experiment, 273-278
Evaluation, 11
Expansion by minors, 34-35
Expectations, vector, 54-61
Experiment, 8
Experimental Design, 4

F
F statistic, 148-149, 320-321, 339
F transformation, 313, 331-332, 339
Factor, 205
Factoring, 36-47
Farr, S. D., 192
Federer, W. T., 4
Finn, J. D., 7, 16, 39, 44
Fisher, R. A., 178, 368
Fisher's z transformation, 178-179, 182
Fox, L., 37
Full rank, 31-32

G
Gagné, R. M., 3
Generalized inverse, 43
Glass, G. V., 178, 243, 353
Golub, G. H., 36
Gram-Schmidt orthonormalization, 45, 137-138, 300, 306, 335, 337, 349
Gramians, 29, 31
Graybill, F. A., 20
Group, of observations, 205
Guilford, J. P., 11

H
Hagen, E. P., 11
Hazen, S., 15
Heck, D. L., 362, 363
Helmert contrasts, 232, 249, 255
Hohn, F. E., 20
Homogeneity of regression test, 369, 376-378
Hotelling, H., 150, 315
Hotelling's T² statistic, 150-155, 315-319; transformation of, 152
Hotelling's trace criterion, 362, 366, 382
Householder, A. S., 45, 360
Howe, R. B., 152, 316
Hummel, T. J., 156, 162, 314, 320, 363
Hypotheses: directional, 333; multiple, 160-165, 324-327; multivariate analysis of variance, 310-312; regression, 145-146; testing, and analysis of covariance, 376-383
Hypothesis sum of squares, 134
Hypothesis testing, 10, 134, 296
Hypothetico-deductive research approach, 4-5
I
Identity matrix, 22
Inner product, 24
Interactions: contrasts, 238-244, 281; interpretation, 246-250, 311, 325; in regression, 6, 85
Interval estimates, 115-116, 264-265
Inversion, 40-44

J
Jacobson, L., 16
Jensen, D. R., 152, 316

K
Kirk, R. E., 66
Kleinke, D. J., 4
Krathwohl, D. R., 3, 7
Kronecker product, 29-30, 112, 114, 212, 236-244, 334, 348-349, 352
Kropp, R. P., 11
Kurkjian, B., 228

L
Least-squares estimation, 96-97, 110-111, 215-219
Likelihood ratio criterion, 145, 146-150, 312-315, 324, 331, 362-363, 381, 382
Lohnes, P. R., 306
Lorge, I., 11
Love, W., 192

M
Main effects, 311
Mandler, G., 12, 333
Manhold, J., 15
Marginal distribution, 63
Masia, B. B., 3, 7
Matrix: analysis-of-variance model, 207-210; conditional variance-covariance, 113; definition, 20; determinant of, 32-36; diagonal, 22; of estimated regression coefficients, 118; gramians of, 29; identity, 22; Kronecker product, 29-30; order of, 20-21; of orthogonal estimates, 138; rank of, 30-32; regression model, 93; of semipartial regression coefficients, 138; singular and nonsingular, 35; sum-of-products, 123-124; symmetric, 22; total sum of products, 68-69; trace of, 36; transformation, 83; triangular, 21-22; variance-covariance, 56, 68-69; of linear combinations of variables, 83; within groups, 78-79, 81-82, 261-262
Matrix algebra, 19-52: characteristic roots and vectors, 48-50; derivatives, 47-48; factoring and inversion, 36-47; notation of, 20-23; scalar functions, 30-36; simple operations, 23-30
Mean squares and cross products, 155-156, 262, 303, 312, 320-321; step-down, 158, 322
Means, predicted, 267-273
Measurement, 5
Meredith, W., 191
Miller, J. K., 192
Minor, 34
Model: additive and interactive portions, 6; analysis of covariance, 368-372; analysis of variance, 205-250; arbitrary-N, 269; constructing, 205-215; of deficient rank, 218-219; definition, 5; fitting to data, 5-6; linear, 6, 92; main-effect, 268; matrix, see specific type (e.g., Matrix, regression model); multivariate analysis of variance, 251; reduced-rank, 268; see also specific type (e.g., Analysis of variance, model)
Morrison, D. F., 306, 330
Multiple correlation, 174, 182-187: increment in, 184; multiple criteria, 185-186
Multiple regression analysis: computational forms, 123-127; significance tests, 134-172
Multiplication, of matrices, 24-30
MULTIVARIANCE program, 4, 10, 39, 44, 234, 236, 244, 300, 327, 341; Program User's Guide, 397-408
Multivariate analysis: and behavioral models, 2-3; overview, 2-18; and testing, 3-4
Multivariate data: linear combinations of variables, 83-84; more than one sample, 73-81; one sample, 66-73; within-group variances and correlations, 81-83
Multivariate general linear model, 5-10
Multivariate multiple regression model, 108-110
Multivariate normal distribution, 61-66
N
Nested effects, 209
Noble, B., 20
Nonsingular matrix, 35
Null vector, 21

O
Observational vector, 19, 66-67
Olkin, I., 177, 179, 262
One-way design: bases for, 235-236; contrasts in, 229-235
Order: of effects, 325-326; of predictor variables, 161-163; of predictors, in significance testing, 142-144; of significance tests, 304
Ortega, J. M., 360
Orthogonal estimates, 138, 299
Orthogonal polynomial contrasts, 232-234, 249
Orthogonal vectors, 25
Orthogonality, 298
Orthonormal basis, 47
Orthonormal vector, 25, 29
Orthonormalization, 44-47, 137-138, 300

P
Parallelism, of regression planes, 376-378, 387-388
Partial correlation, 174, 181-182
Partitioning, of sums of products, 140, 301-303, 352-353
Pearson product-moment correlation, 81, 177
Pillai, K. C. S., 362, 363, 366
Pivotal element, 38-39
Point estimation, 110-111, 252-260
Polynomial model, 94
Pratt, J. W., 177
Prediction, 121-123: by analysis of covariance, 375-376; of means and residuals, 267-273; of scores, 104-107
Predictor variables, 8, 92, 368
Principal components, 198-204: in Creativity and Achievement Study, 201-202; in Dental Calculus Reduction Study, 202-204
Programmed Instruction Effects Study (Sample Problem 5), 17-18; significance tests for, 350-356
Pruzek, R. M., 4

R
Ralston, A., 10
Random vector, 54
Rank: of analysis-of-variance model, 217-228, 327; for estimation, 268; for significance testing, 219, 251, 296; deficient, model of, 218-219; of matrices, 30-32; of regression model, 94-95, 137; for estimation, 96, 164; for significance testing, 94
Rao, C. R., 148, 313
Raths, J., 243, 353
Regression analysis, see Multiple regression analysis
Regression models, 8: multivariate, 108-110; univariate, 92-96
Reparameterization, 219-228: examples, 223-228; selection of contrasts, 222-223
Repp, B. H., 44, 360
Residual: estimated, 267-273; in regression models, 106, 122; standardized, 270; sum of products, 114, 134, 303
Residual vector, 26
Response surface model, 95-96
Rosenthal, R., 16
Row vector, 21
Roy, J., 157
Roy's largest root criterion, 362, 365, 382

S
Sample correlation matrix, 71
Sample problems, 10-18
Scalar, 20
Scalar functions, of matrices, 30-36
Scalar product, 24
Schatzoff, M., 148
Searle, S. R., 20, 43
Semipartial regression coefficients, 138, 299
Significance tests: analysis of variance, 296-356; in Creativity and Achievement Study, 165-172; criteria, 145-160, 308-327; in Dental Calculus Reduction Study, 333-344; in Essay Grading Study, 344-350; multiple hypotheses, 160-165; multiple regression analysis, 134-172; notes on, 327; order of, 304; order of predictors, 142-144; predictor variables, 137-142; in Programmed Instruction Effects Study, 350-356; reestimation, 164-165; sources of variation, 135-144, 297-308; two-stage, 156-157; in Word Memory Experiment, 328-333
Simple contrasts, 231-232
Simple correlation, 174, 175-180
Singular matrix, 35
Siotani, M., 179
Sligo, J. R., 156, 162, 314, 320, 363
Smith, H., 106
Snow, R. E., 16
Special effects, 304
Square-root factoring, 37
Standardization, of variables, 59-61
Stanley, J. C., 94, 178
Step-down analysis, 145, 157-160, 186, 196, 322-324, 342-343, 382
Stephens, D., 12, 333
Stepwise procedure, 66, 138, 161
Stewart, D., 192
Stoker, H. W., 11
Student's t, 315
Subclass, of observations, 205
Subtraction, of matrices, 23-24
Sum of cross products, 25, 69, 78: for error, see Error terms; for hypothesis, 141, 161-162, 302, 311, 325-326; partitioning of, 136-137, 301-303, 352-353; in regression, 123-124, 139-141
Sum of squares, see Sum of cross products
Symmetric matrix, 22
Synthesis, 11

T
The Taxonomy of Educational Objectives, 3, 11
Tests, significance, 134-172, 297-327; on correlations, 177-179
Thorndike, R., 11
Trace, of matrix, 36
Transformation matrix, 83
Transposition, of matrices, 23
Triangular factorization, 37-40
Triangular matrix, 21-22
Tukey, J. W., 106, 108

U
Unit vector, 21
Univariate multiple regression model, 92-96
Univariate statistics, 155-157, 320-321
V
Variables: concomitant, 368-369; discriminant, 357; linear combinations of, 83-84; predictor, 8, 92, 368
Variance: conditional, 99; within-group, 81-83
Variance-covariance matrix, see Matrix, variance-covariance
Variance-covariance factors of estimates, 100, 258; diagonality, 257-258
Variates, 7
Vector: definition of types, 21; expectations, 54-61; length of, 25; mean, 67; normalized, 25; observational, 66-67; orthogonal, 25; orthonormal, 25, 29; random, 54; residual, 26; square length of, 25
Vector observation, 19, 66
Vector variable, 54, 55
Volpe, A., 15

W
Wilf, H. S., 10
Wilkinson, J. H., 360
Wilks, S. S., 146
Wilks' Λ, 156, 157, 162, 192
Within-group matrices, see specific type (e.g., Matrix, variance-covariance)
Wolf, R. M., 7
Word Memory Experiment (Sample Problem 2), 12-14: analysis of covariance in, 383-393; estimation in, 273-278; significance tests for, 328-333

Y
Yates, F., 178

Z
z transformation, see Fisher's z transformation
Zelen, M., 228
"Zeroth" effect, 229, 237