STATISTICS
PRINCIPLES AND METHODS
Revised Printing
Richard Johnson and Gouri Bhattacharyya
University of Wisconsin at Madison
JOHN WILEY & SONS
New York Chichester Brisbane Toronto Singapore
Copyright © 1987, by John Wiley & Sons, Inc. All rights reserved. Published simultaneously in Canada.
Reproduction or translation of any part of this work beyond that permitted by Sections 107 and 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons.

Library of Congress Cataloging in Publication Data:

Johnson, Richard Arnold.
Statistics, principles and methods.
Includes indexes.
1. Statistics I. Bhattacharyya, Gouri K., 1940- II. Title.
QA276.12.J63 1984 519.5 84-sr07
ISBN 0-471-85075-6
Printed in the United States of America
10 9 8 7 6 5 4
PREFACE

THE NATURE OF THE BOOK
Statistics, the subject of data analysis and data-based reasoning, is playing an increasingly vital role in virtually all professions. Some familiarity with this subject is now an essential component of any college education. Yet pressures to accommodate a growing list of academic requirements often necessitate that this exposure be brief. Keeping these conditions in mind, we have written this book to provide students with a first exposure to the powerful ideas of modern statistics. It presents the key statistical concepts and the most commonly applied methods of statistical analysis. Moreover, to keep it accessible to freshmen and sophomores from a wide range of disciplines, we have avoided mathematical derivations. Such derivations usually pose a stumbling block to learning the essentials in a short period of time.

This book is intended for students who do not have a strong background in mathematics but seek to learn the basic ideas of statistics and their application in a variety of practical settings. The core material of this book is common to almost all first courses in statistics and is designed to be covered well within a one-semester or two-quarter course in introductory statistics for freshmen through juniors. It is supplemented with some additional special-topics chapters. These can be covered either by teaching the core material at a faster pace or with reduced emphasis on some of the earlier chapters.
ORIENTATION
The topics treated in this text are, by and large, the ones typically covered in an introductory statistics course. They span three major areas: (i) descriptive statistics, which deals with summarization and description of data; (ii) ideas of probability and an understanding of the manner in which sample-to-sample variation influences our conclusions; and (iii) a collection of statistical methods for analyzing the types of data that are of common occurrence. However, it is the treatment of these topics that makes the text distinctive. By means of good motivation, sound explanations, and an abundance of illustrations given in a real-world context, it emphasizes more than just a superficial understanding.

Each concept or technique is motivated by first setting out its goal and indicating its scope by an illustration of its application. The subsequent discussion is not limited to showing how a method works but includes an explanation of the why. Even without recourse to mathematics, we are able to make the reader aware of possible pitfalls in the statistical analysis. Students can gain a proper appreciation of statistics only when they are provided with a careful explanation of the underlying logic. Without this understanding, a learning of elementary statistics is bound to be rote and transient.

When describing the various methods of statistical analysis, we continually remind the reader that the validity of a statistical inference is contingent upon certain model assumptions. Misleading conclusions may result when these assumptions are violated. We feel that the teaching of statistics, even at an introductory level, should not be limited to the prescription of methods. Students should be encouraged to develop a critical attitude in applying the methods and to be cautious when interpreting the results. This attitude is especially important in the study of relationships among variables, which is perhaps the most widely used (and also abused) area of statistics. In addition to discussing inference procedures in this context, we have particularly stressed critical examination of the model assumptions and careful interpretation of the conclusions.
SPECIAL FEATURES
1. In the course of discussing the concepts and methods, the crucial elements are boxed for added emphasis. These boxes provide an ongoing summary of the important items essential for learning statistics. At the end of each chapter, all of its key ideas and formulas are summarized.

2. A rich collection of examples and exercises is included. These are drawn from a large variety of real-life settings. In fact, many data sets stem from genuine experiments, surveys, or reports.

3. Exercises are provided at the end of each major section. These provide a reader with the opportunity to practice the ideas just learned. Occasionally, they supplement some points raised in the text. A larger collection of exercises appears at the end of a chapter. The starred problems are relatively difficult and suited to the more mathematically competent student.

4. Since regression analysis is a primary statistical technique, we provide a more thorough coverage of the topic than is usual at this level. The basics of regression are introduced in Chapter 12, whereas Chapter 13 stretches the discussion to several issues of practical importance. These include methods of model checking, handling nonlinear relations, and multiple regression analysis. Complex formulas and calculations are judiciously replaced by computer output so the main ideas can be learned and appreciated with a minimum of stress.

5. Using the computer. Statistical calculations become increasingly tedious with larger data sets, or even with small data sets where several variables are involved. Modern computers, equipped with statistical software packages, have come a long way toward alleviating the drudgery of hand calculation. Yet some initial amount of hand calculation, perhaps on a calculator, is essential for an understanding of most statistical methods. Many of the examples are included for the express purpose of teaching and reinforcing this latter aspect. However, in real applications, we often have to handle larger data sets. In order to deal with these, calculations and plots must be done on a computer. To this end, computer-based exercises are included in all chapters where relevant. In each case, we explain, often with a specific output, what results the computer is apt to produce. Guided by these illustrations, one can then make effective use of the computer to analyze large data sets with relative ease.

6. Technical Appendix A presents a few statistical facts of a mathematical nature. These are separated from the main text so that they can be left out if the instructor so desires.
ORGANIZATION
This book is organized into sixteen chapters, an optional technical appendix (Appendix A), and a collection of tables (Appendix B). Although designed for a one-semester or a two-quarter course, it is enriched with additional material to allow the instructor some choice of topics. Beyond Chapter 1, which sets the theme of statistics, the subject matter could be classified as follows:

Topic                                              Chapter
Descriptive study of data                          2, 3
Probability and distributions                      4, 5, 6 (except Section 4), 7
Sampling variability                               8
Core ideas and methods of statistical inference    6 (Section 4), 9, 10, 11
Special topics of statistical inference            12, 13, 14, 15, 16
We regard Chapters 1 to 11 as constituting the core material of an introductory statistics course, with the exception of the starred sections in Chapter 7. Although this material is just about enough for a one-semester course, many instructors may wish to eliminate some sections in order to cover the basics of regression analysis in Chapter 12. This is most conveniently done by initially skipping Chapter 3 and then taking up only those portions that are linked to Chapter 12. Also, instead of the thorough coverage of probability that is provided in Chapter 4, the later sections of that chapter may receive a lighter coverage. The special-topics chapters can be taught in any sequence following the core material, except that Chapters 12 and 13 must be covered in the proper order. Several options are shown in the following diagram.
[Diagram of course options. Boxes shown: Chapters 1, 2, 4-11 (3 optional); Chapter 3 Section 2; Chapter 3 Sections 3-5; Chapter 12 Regression-I, followed by Chapter 13 Regression-II; Chapter 14 Analysis of Categorical Data; Chapter 15 Analysis of Variance; Chapter 16 Nonparametric Inference.]
ACKNOWLEDGMENTS
We would like to thank Tom and Barbara Ryan for allowing us to employ MINITAB commands and output to illustrate the use of the computer. Thanks also go to our colleagues, who contributed data sets that enliven our presentation of statistics. We are indebted to Mary Esser for her excellent typing of the manuscript. Finally, we would like to acknowledge the following reviewers for their suggestions and comments: Janet M. Begun, University of North Carolina; Leonard Haff, University of California at San Diego; W. Robert Stephenson, Iowa State University; Robert Brown, University of California at Los Angeles; Robert Reid, Glouchester Community College; Josephine Gervase, Manchester Community College; David Finkel, Bucknell University; James E. Gehrmann, California State University at Sacramento; Norton Starr, Amherst College; Judith Langer, Westchester Community College; Albert D. Pal[illegible], University of Cincinnati; Giles Warrack, University of Iowa; Tom Bohannon, Appalachian State University.
PHOTO CREDITS
Chapter 1, Page 5: United Nations.
Chapter 3 Opener: NASA.
Chapter 4 Opener: Michael Hayman/Stock Boston.
Chapter 5 Opener: Elaine Abrams/Design Conceptions. Page 146: Courtesy Sands Hotel & Casino.
Chapter 6 Opener: Inger McCabe/Photo Researchers.
Chapter 8 Opener: Donald C. Dietz/Stock Boston.
Chapter 9 Opener: Courtesy Readers Digest. Photo R. Michael Stucky/Photo File.
Chapter 10 Opener: M. E. Masely/Anthro Photo.
Chapter 11, Page 341: Elizabeth Crews/Stock Boston.
Chapter 12 Opener: Nancy Sefton/Photo Researchers.
Chapter 13 Opener: Jeff Rotman.
Chapter 15 Opener: F. B. Grunzweig/Photo Researchers.
CONTENTS

1 INTRODUCTION
  1 What Is Statistics?
  2 Statistics in Our Everyday Life
  3 Statistics in Aid of Scientific Inquiry
  4 Two Basic Concepts-Population and Sample
  5 Objectives of Statistics

2 ORGANIZATION AND DESCRIPTION OF DATA
  1 Introduction
  2 Main Types of Data
  3 Describing Data by Tables and Graphs
  4 Measures of Center
  5 Measures of Variation
  6 Concluding Remarks
  7 Exercises

3 DESCRIPTIVE STUDY OF BIVARIATE DATA
  1 Introduction
  2 Summarization of Bivariate Categorical Data
  3 Scatter Plot of Bivariate Measurement Data
  4 The Correlation Coefficient-A Measure of Linear Relation
  5 Prediction of One Variable from Another (Linear Regression)
  6 Exercises

4 PROBABILITY
  1 Introduction
  2 Probability of an Event
  3 Methods of Assigning Probability
  4 Event Relations and Two Laws of Probability
  5 Conditional Probability and Independence
  6 Random Sampling from a Finite Population
  7 Exercises

5 PROBABILITY DISTRIBUTIONS
  1 Introduction
  2 Random Variables
  3 Probability Distribution of a Discrete Random Variable
  4 Expectation (Mean) and Standard Deviation of a Probability Distribution
  5 Exercises

6 THE BINOMIAL DISTRIBUTION AND ITS APPLICATION IN TESTING HYPOTHESES
  1 Introduction
  2 Successes and Failures-Bernoulli Trials
  3 The Binomial Distribution
  4 Testing Hypotheses about a Population Proportion
  5 Exercises

7 THE NORMAL DISTRIBUTION
  1 Probability Model for a Continuous Random Variable
  2 The Normal Distribution-Its General Features
  3 Standard Normal Distribution
  4 Probability Calculations with Normal Distributions
  5 The Normal Approximation to the Binomial
  *6 Checking the Plausibility of a Normal Model
  *7 Transforming Observations to Attain Near Normality
  8 Exercises

8 VARIATION IN REPEATED SAMPLES-SAMPLING DISTRIBUTIONS
  1 Introduction
  2 The Sampling Distribution of a Statistic
  3 Distribution of the Sample Mean and the Central Limit Theorem
  4 Exercises

9 DRAWING INFERENCES FROM LARGE SAMPLES
  1 Introduction
  2 Point Estimation of a Population Mean
  3 Confidence Interval for μ
  4 Testing Hypotheses about μ
  5 Inferences about a Population Proportion
  6 Deciding on the Sample Size
  7 Exercises

10 SMALL-SAMPLE INFERENCES FOR NORMAL POPULATIONS
  1 Introduction
  2 Student's t Distribution
  3 Confidence Interval for μ-Small Sample Size
  4 Hypotheses Tests for μ
  5 Relationship between Tests and Confidence Intervals
  6 Inferences about the Standard Deviation σ (The Chi-Square Distribution)
  7 Robustness of Inference Procedures
  8 Exercises

11 COMPARING TWO TREATMENTS
  1 Introduction
  2 Independent Random Samples from Two Populations
  3 Randomization and Its Role in Inference
  4 Matched Pair Comparisons
  5 Choosing between Independent Samples and a Matched Pair Sample
  6 Exercises

12 REGRESSION ANALYSIS-I (Simple Linear Regression)
  1 Introduction
  2 Regression with a Single Predictor
  3 A Straight-Line Regression Model
  4 The Method of Least Squares
  5 The Sampling Variability of the Least Squares Estimators-Tools for Inference
  6 Important Inference Problems
  7 The Strength of a Linear Relation
  8 Remarks about the Straight-Line Model
  9 Exercises

13 REGRESSION ANALYSIS-II (Multiple Linear Regression and Other Topics)
  1 Introduction
  2 Nonlinear Relations and Linearizing Transformations
  3 Multiple Linear Regression
  4 Residual Plots to Check the Adequacy of a Statistical Model
  5 Exercises

14 ANALYSIS OF CATEGORICAL DATA
  1 Introduction
  2 Pearson's χ² Test for Goodness of Fit
  3 Contingency Tables with One Margin Fixed (Test of Homogeneity)
  4 Contingency Tables with Neither Margin Fixed (Test of Independence)
  5 Exercises

15 ANALYSIS OF VARIANCE (ANOVA)
  1 Introduction
  2 Comparison of Several Treatments: The Completely Randomized Design
  3 Population Model and Inferences for a Completely Randomized Design
  4 Simultaneous Confidence Intervals
  5 Graphical Diagnostics and Displays to Supplement ANOVA
  6 Exercises

16 NONPARAMETRIC INFERENCE
  1 Introduction
  2 The Wilcoxon Rank-Sum Test for Comparing Two Treatments
  3 Matched Pair Comparisons
  4 Measure of Correlation Based on Ranks
  5 Concluding Remarks
  6 Exercises

APPENDIX A1 SUMMATION NOTATION
APPENDIX A2 EXPECTATION AND STANDARD DEVIATION-PROPERTIES
APPENDIX A3 THE EXPECTED VALUE AND STANDARD DEVIATION OF X̄
APPENDIX B TABLES
  Table 1 The Number of Combinations $\binom{n}{r}$
  Table 2 Cumulative Binomial Probabilities
  Table 3 Standard Normal Probabilities
  Table 4 Percentage Points of t Distributions
  Table 5 Percentage Points of χ² Distributions
  Table 6 Percentage Points of F(ν₁, ν₂) Distributions
  Table 7 Selected Tail Probabilities for the Null Distribution of Wilcoxon's Rank-Sum Statistic
  Table 8 Selected Tail Probabilities for the Null Distribution of Wilcoxon's Signed-Rank Statistic

ANSWERS TO SELECTED ODD-NUMBERED EXERCISES
INDEX
CHAPTER 1

Introduction

1. WHAT IS STATISTICS?
2. STATISTICS IN OUR EVERYDAY LIFE
3. STATISTICS IN AID OF SCIENTIFIC INQUIRY
4. TWO BASIC CONCEPTS-POPULATION AND SAMPLE
5. OBJECTIVES OF STATISTICS
Gallup Opinion Index

Trend in the popularity of the big three sports. Football is the favorite spectator sport in the United States.

What is your favorite spectator sport?

              1981    1972    1960    1948
Football       38%     36%     21%     17%
Baseball       16%     21%     34%     39%
Basketball      9%      8%      9%     10%
(Other)       (37%)   (35%)   (36%)   (34%)

In general, do you approve or disapprove of labor unions?

[Figure: Approval of Labor Unions (43-year trend); horizontal axis marked in years from 1936 to '80.]
1. WHAT IS STATISTICS?
The word statistics originated from the Latin word "status," meaning "state." For a long time it was identified solely with the displays of data and charts pertaining to the economic, demographic, and political situations prevailing in a country. Even today, a major segment of the general public thinks of statistics as synonymous with forbidding arrays of numbers and myriads of graphs. This image is enhanced by numerous government reports that contain massive compilations of numbers and carry the word statistics in their titles: "Statistics of Farm Production," "Statistics of Trade and Shipping," "Labor Statistics," to name a few. However, gigantic advances during the twentieth century have enabled statistics to grow and assume its present importance as a discipline of data-based reasoning. Passive display of numbers and charts is now a minor aspect of statistics, and few, if any, of today's statisticians are engaged in the routine activities of tabulation and charting.

What then are the role and principal objectives of statistics as a scientific discipline? Stretching well beyond the confines of data display, statistics deals with collecting informative data, interpreting these data, and drawing conclusions about a phenomenon under study. The scope of this subject naturally extends to all processes of acquiring knowledge that involve fact finding through collection and examination of data. Opinion polls (surveys of households to study sociological, economic, or health-related issues), agricultural field experiments (with new seeds, pesticides, or farming equipment), clinical studies of vaccines, and cloud seeding for artificial rain production are just a few examples. The principles and methodology of statistics are useful in answering such questions as:

What kind and how much data need to be collected?
How should we organize and interpret the data?
How can we analyze the data and draw conclusions?
How do we assess the strength of the conclusions and gauge their uncertainty?
Statistics, as a subject, provides a body of principles and methodology for designing the process of data collection, summarizing and interpreting the data, and drawing conclusions or generalities.
2. STATISTICS IN OUR EVERYDAY LIFE
Fact finding through the collection and interpretation of data is not confined to professional researchers. In our attempts to understand issues of environmental protection, the state of unemployment, or the performance of competing football teams, numerical facts and figures need to be reviewed and interpreted. In our day-to-day life, learning takes place through an often implicit analysis of factual information. We are all familiar, to some extent, with reports in the news media on important statistics.

Employment: Monthly, as part of the Current Population Survey, the Bureau of Census collects information about employment status from a sample of about 55,000 households. Households are contacted on a rotating basis, with three-fourths of the sample remaining the same for any two consecutive months. The survey data are analyzed by the Bureau of Labor Statistics, which then reports monthly unemployment rates.

Cost of Living: The consumer price index (CPI) measures the cost of a fixed market basket of over 400 goods and services. Each month, prices are obtained from a sample of over 18,000 retail stores that are distributed over 85 metropolitan areas. These prices are then combined, taking into account the relative quantity of goods and services required, in 1967, by a hypothetical "urban wage earner." Let us not be concerned with the details of the sampling method and calculations, as these are quite intricate. They are, however, under close scrutiny because of their importance to the hundreds of thousands of Americans whose earnings or retirement benefits are tied to the CPI.

Election time brings the pollsters into the limelight.

Gallup Poll: This, the best known of the national polls, is based on interviews with a minimum of 1500 adults. Beginning several months before the presidential election, results are regularly published. These reports help predict winners and track changes in voter preferences. One of the most dramatic shifts ever recorded in a presidential election occurred in 1980.

                                  Reagan   Carter
Gallup Poll
  Oct. 25-26 (before debate)        42       45
  Oct. 29-30 (after debate)         44       43
  Final, Nov. 1                     47       44
Election Results
  Nov. 4                            51       41

The presidential debate seems to have had an influence.
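The size of the shift is easier to see as a margin. The short Python sketch below is only an illustration and is not part of the Gallup methodology: it takes the four pairs of figures quoted above and computes Reagan's lead over Carter, in percentage points, at each stage. The names `polls` and `reagan_lead` are our own.

```python
# Figures quoted in the text (percent), in chronological order.
# The data structure and names are illustrative assumptions.
polls = [
    ("Oct. 25-26 (before debate)", 42, 45),
    ("Oct. 29-30 (after debate)", 44, 43),
    ("Final, Nov. 1", 47, 44),
    ("Election results, Nov. 4", 51, 41),
]


def reagan_lead(reagan, carter):
    """Reagan's lead over Carter in percentage points (negative if trailing)."""
    return reagan - carter


leads = [(label, reagan_lead(reagan, carter)) for label, reagan, carter in polls]

for label, lead in leads:
    print(f"{label:<28} lead = {lead:+d}")
```

A negative lead before the debate and a positive one afterward is what the text means by a shift in voter preferences; the sketch says nothing about whether such a change exceeds ordinary sample-to-sample variation.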
Our sources of factual information range from individual experience to reports in news media, government records, and articles in professional journals. As consumers of these reports, citizens need some idea of statistical reasoning to properly interpret the data and evaluate the conclusions. Statistical reasoning provides criteria for determining which conclusions are supported by the data and which are not. The credibility of conclusions also depends greatly on the use of statistical methods at the data collection stage.
3. STATISTICS IN AID OF SCIENTIFIC INQUIRY
The phrase scientific inquiry refers to a systematic process of learning. A scientist sets the goal of an investigation, collects relevant factual information (or data), analyzes the data, draws conclusions, and decides further courses of action. We briefly outline a few illustrative scenarios.

Training programs: Training or teaching programs in many fields, designed for a specific type of clientele (college students, industrial workers, minority groups, physically handicapped people, retarded children, etc.), are continually monitored, evaluated, and modified to improve their usefulness to society. To learn about the comparative effectiveness of different programs, it is essential to collect data on the achievement or growth of skill of subjects at the completion of each program.

Monitoring advertising claims: The public is constantly bombarded with commercials that claim the superiority of one product brand in comparison to others. When such comparisons are founded on sound experimental evidence, they serve to educate the consumer. Not infrequently, however, misleading advertising claims are made due to insufficient experimentation, faulty analysis of data, or even blatant manipulation of experimental results. Government agencies and consumer groups must be prepared to verify the comparative quality of products by using adequate data collection procedures and proper methods of statistical analysis.

Plant breeding: To increase food production, agricultural scientists develop new hybrids by cross-fertilizing different plant species. Promising new strains need to be compared with the current best ones. Their relative productivity is assessed by planting some of each variety at a number of sites. Yields are recorded and then analyzed for apparent differences. The strains may also be compared on the basis of disease resistance or fertilizer requirements.

Building beams: Wooden beams that support roofs on houses and public buildings must be strong. Most beams are constructed by laminating several boards together. Wood scientists have collected data that show stiffer boards are generally stronger. This relation can be used to predict the strength of candidates for laminating on the basis of their stiffness measurements.
[Photo: Plant breeding for increased production.]
Factual information is crucial to any investigation. The branch of statistics called experimental design can guide the investigator in planning the manner and extent of data collection. After the data are collected, statistical methods are available that summarize and describe the prominent features of data. These are commonly known as descriptive statistics. Today, a major thrust of the subject is the evaluation of information present in data and the assessment of the new learning gained from this information. This is the area of inferential statistics, and its associated methods are known as the methods of statistical inference.

It must be realized that a scientific investigation is typically a process of trial and error. Rarely, if ever, can a phenomenon be completely understood or a theory perfected by means of a single definitive experiment. It is too much to expect to get it all right in one shot. Even after his first success with the electric light bulb, Thomas Edison had to continue to experiment with numerous materials for the filament before it was perfected. Data obtained from an experiment provide new knowledge. This knowledge often suggests a revision of an existing theory, and this itself may require further investigation through more experiments and analysis of data. Humorous as it may appear, the excerpt from a Woody Allen writing captures the vital point that a scientific process of learning is essentially iterative in nature.
Invention of the Sandwich by the Earl of Sandwich (According to Woody Allen, humorist*)

Each analysis leads, by way of a new conjecture, to the next experiment.

Experiment (first completed work): a slice of bread, a slice of bread and a slice of turkey on top of both.
Analysis: fails miserably.

Experiment: two slices of turkey with a slice of bread in the middle.
Analysis: rejected.

Experiment: three consecutive slices of ham stacked on one another.
Analysis: some interest, mostly in intellectual circles.

Experiment: three slices of bread.
Analysis: improved reputation.

Experiment: several strips of ham, enclosed top and bottom by two slices of bread.
Analysis: immediate success.

* Copyright © 1966 by Woody Allen. Adapted by permission of Random House, Inc. from Getting Even, by Woody Allen.
4. TWO BASIC CONCEPTS-POPULATION AND SAMPLE
In the preceding sections we cited a few examples of situations where evaluation of factual information is essential for acquiring new knowledge. Although these examples are drawn from widely differing fields and only sketchy descriptions of the scope and objectives of the studies are provided, a few common characteristics are readily discernible.

First, in order to acquire new knowledge, relevant data must be collected. Second, some amount of variability in the data is unavoidable even though observations are made under the same or closely similar conditions. For instance, the treatment for an allergy may provide long-lasting relief for some individuals while it may bring only transient relief or even none at all to others. Likewise, it is unrealistic to expect that college freshmen whose high school records were alike would perform equally well in college. Nature does not follow such a rigid law.

A third notable feature is that access to a complete set of data is either physically impossible or practically not feasible. When data are obtained from laboratory experiments or field trials, no matter how much experimentation has been performed, more can always be done. In public opinion or consumer expenditure studies, a complete body of information would emerge only if data were gathered from every individual in the nation, undoubtedly a monumental if not an impossible task. To collect an exhaustive set of data related to the damage sustained by all cars of a particular model under collision at a specified speed, every car of that model coming off the production lines would have to be subjected to a collision! Thus the limitations of time, resources, and facilities, and sometimes the destructive nature of the testing, mean that we must work with incomplete information: the data that are actually collected in the course of an experimental study.

The preceding discussions highlight a distinction between the data set that is actually acquired through the process of observation and the vast collection of all potential observations that can be conceived in a given context. The statistical name for the former is sample; for the latter it is population, or statistical population. To further elucidate these concepts, we observe that each measurement in a data set originates from a distinct source, which may be a patient, tree, farm, household, or some other entity depending on the object of a study. The source of each measurement is called a sampling unit, or simply a unit. A sample, or sample data set, then consists of measurements recorded for those units which are actually observed. The observed units constitute a part of a far larger collection about which we wish to make inferences. The set of measurements that would result if all the units in the larger collection could be observed is defined as the population.
A (statistical) population is the set of measurements (or record of some qualitative trait) corresponding to the entire collection of units for which inferences are to be made.
The population represents the target of an investigation. We learn about the population by sampling from the collection.
A sample from a statistical population is the set of measurements that are actually collected in the course of an investigation.
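The two boxed definitions can be mimicked on a computer. The following Python sketch is a toy illustration under stated assumptions: the "population" is a made-up list of 10,000 measurements (one per unit), and the "sample" is the subset of 100 measurements that would actually be collected. The sizes, the normal model used to generate values, and all names are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: one measurement for each of 10,000 units.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Sample: the measurements actually collected, here for 100 units
# chosen at random without replacement.
sample = random.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population: {len(population)} measurements, mean = {population_mean:.2f}")
print(f"sample:     {len(sample)} measurements, mean = {sample_mean:.2f}")
```

In practice the population mean is unknown, which is exactly why the sample is studied; here it can be computed only because the population was manufactured for the illustration.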
5. OBJECTIVES OF STATISTICS
The subject of statistics provides the methodology to make inferences about the population from the collection and analysis of sample data. These methods enable one to derive plausible generalizations and then to assess the extent of uncertainty underlying these generalizations. Statistical concepts are also essential during the planning stage of an investigation, when decisions must be made as to the mode and extent of the sampling process.
The major objectives of statistics are:
(a) To make inferences about a population from an analysis of information contained in sample data. This includes assessments of the extent of uncertainty involved in these inferences.
(b) To design the process and the extent of sampling so that the observations form a basis for drawing valid inferences.
The design of the sampling process is an important step. A good design for the process of data collection permits efficient inferences to be made, often with a straightforward analysis. Unfortunately, even the most sophisticated methods of data analysis cannot, in themselves, salvage much information from data that are produced by a poorly planned experiment or survey.

The early use of statistics in the compilation and passive presentation of data has been largely superseded by the modern role of providing analytical tools with which data can be efficiently gathered, understood, and interpreted. Statistical concepts and methods make it possible to draw valid conclusions about the population on the basis of a sample. Given its extended goal, the subject of statistics has penetrated all fields of human endeavor in which the evaluation of information must be grounded in data-based evidence.

The basic statistical concepts and methods described in this book form the core in all areas of application. We present examples drawn from a wide range of applications to help develop an appreciation of various statistical methods, their potential uses, and their vulnerabilities to misuse.
CHAPTER 2
Organization and Description of Data
1. INTRODUCTION
2. MAIN TYPES OF DATA
3. DESCRIBING DATA BY TABLES AND GRAPHS
4. MEASURES OF CENTER
5. MEASURES OF VARIATION
6. CONCLUDING REMARKS
Acid Rain Is Killing Our Lakes
Acid precipitation is linked to the disappearance of sport fish and other organisms from lakes. Sources of air pollution, including automobile emissions and the burning of fossil fuels, add to the natural acidity of precipitation. In 1979, the Wisconsin Department of Natural Resources initiated a precipitation monitoring program with the goal of developing appropriate air pollution controls to reduce the problem. The acidity of the first 50 rains monitored, measured on a pH scale from 1 (very acidic) to 7 (neutral), is summarized by the histogram.
[Histogram of acid rain data. Horizontal axis: pH, from 3.0 to 6.0.]
Notice that all of the rains are more acidic than normal rain, which has a pH of 5.6. (As a comparison, apples are about 3 and milk is about 6.) Researchers in Canada have established that lake water with a pH below 5.6 may severely affect the reproduction of game fish. More research will undoubtedly improve our understanding of the acid rain problem and hopefully lead to an improved environment.
1. INTRODUCTION
In Chapter 1 we cited several examples of situations where the collection of data by appropriate processes of experimentation or observation is essential to acquire new knowledge. A data set may range in complexity from a few entries to hundreds or even thousands of them. Each entry corresponds to the observation of a specified characteristic of a sampling unit. For example, a nutritionist may provide an experimental diet to 30 undernourished children and record their weight gains after two months. Here, children are the sampling units, and the data set would consist of 30 measurements of weight gains.
Having collected a set of data, a primary step is to organize the information and extract a descriptive summary that highlights its salient features. In this chapter we learn how to organize and describe a set of data by means of tables, graphs, and calculation of some numerical summary measures.
2. MAIN TYPES OF DATA
In discussing the methods of providing summary descriptions of data, it helps to distinguish between the two basic types: (i) qualitative or categorical data and (ii) numerical or measurement data. When the characteristic under study concerns a qualitative trait that is only classified in categories and not numerically measured, the resulting data are called categorical data. Some examples are: hair color (blond, brown, red, black), employment status (employed, unemployed), blood type (O, A, B, AB). If, on the other hand, the characteristic is measured on a numerical scale, the resulting data consist of a set of numbers and are called measurement data. We will use the term numerical-valued variable, or just variable, to refer to a characteristic that is measured on a numerical scale. The word "variable" signifies that the measurements vary over different sampling units. In this terminology, observations of a numerical-valued variable yield measurement data. A few examples of numerical-valued variables are: shoe size of an adult male, daily number of traffic fatalities in a state, intensity of an earthquake, height of a 1-year-old pine seedling, and the survival time of a cancer patient.
While in all these examples the stated characteristic can be numerically measured, a close scrutiny reveals two distinct types of underlying scale of measurement. Shoe sizes are numbers such as 6, 6½, 7, 7½, . . . , which proceed in steps of ½. The count of traffic fatalities can only be an integer, and so is the number of offspring in an animal litter. These are examples of discrete variables. The name discrete draws from the fact that the scale is made up of distinct numbers with gaps in between. On the other hand, some variables, such as height, weight, and survival time, can ideally take any value in an interval. Since the measurement scale does not have gaps, such variables are called continuous.
We must admit that a truly continuous scale of measurement is an idealization. Measurements actually recorded in a data set are always rounded, either for the sake of simplicity or because the measuring device has a limited accuracy. Still, even though weights may be recorded in the nearest pounds or time recorded in whole hours, their actual values occur on a continuous scale, so the data are referred to as continuous. Counts are inherently discrete, and are treated as such provided they take relatively few distinct values (for example, number of children in a family, number of traffic violations of a driver). But when a count spans a wide range of values, it is often treated as a continuous variable. For example, the count of white blood cells, number of insects in a colony, and number of shares of stock traded per day are strictly discrete but, for practical purposes, they are viewed as continuous.
Summary description of categorical data is discussed in Section 3.1. The remainder of this chapter is devoted to a descriptive study of measurement data, both discrete and continuous. As in the case of summarization and commentary on a long, wordy document, it is difficult to prescribe concrete steps for summary descriptions that work well for all types of measurement data. However, a few important aspects that deserve special attention are outlined below to provide general guidelines for this process.
Describing a Data Set of Measurements
(a) Summarization and description of the overall pattern.
    (i) Presentation of tables and graphs.
    (ii) Noting important features of the graphed data, including symmetry or departures from it.
    (iii) Scanning the graphed data to detect any observations that seem to stick far out from the major mass of the data: the outliers.
(b) Computation of numerical measures.
    (i) A typical or representative value that indicates the center of the data.
    (ii) The amount of spread or variation present in the data.
3. DESCRIBING DATA BY TABLES AND GRAPHS
3.1 CATEGORICAL DATA
When a qualitative trait is observed for a sample of units, each observation is recorded as a member of one of several categories. Such data are readily organized in the form of a frequency table that shows the counts (frequencies) of the individual categories. Our understanding of the data is further enhanced by calculation of the proportion (also called relative frequency) of observations in each category.
$$\text{Relative frequency of a category} = \frac{\text{Frequency in the category}}{\text{Total number of observations}}$$
EXAMPLE 1
A campus press polled a sample of 280 undergraduate students in order to study student attitude toward a proposed change in the dormitory regulations. Each student was to respond as either support, oppose, or neutral in regard to the issue. Table 1 records the frequencies in the second column, and the relative frequencies are calculated in the third column. The relative frequencies show that about 54% of the polled students supported the change, 18% opposed, and 28% were neutral.
Table 1 Summary Results of an Opinion Poll

Responses    Frequency    Relative Frequency
Support         152        152/280 = .543
Neutral          77         77/280 = .275
Oppose           51         51/280 = .182
Total           280                 1.000
Remark: The relative frequencies provide the most relevant information as to the pattern of the data. One should also state the sample size, which serves as an indicator of the credibility of the relative frequencies. (More on this in Chapter 9.)
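The tabulation described in this section can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` is our own, and the ten responses below are hypothetical, not the poll data of Example 1.

```python
from collections import Counter

def frequency_table(observations):
    """Return {category: (frequency, relative frequency)} for categorical data."""
    counts = Counter(observations)
    n = len(observations)  # total number of observations
    return {category: (count, count / n) for category, count in counts.items()}

# Hypothetical responses (an assumption for illustration only).
responses = ["support", "oppose", "support", "neutral", "support",
             "neutral", "support", "oppose", "support", "neutral"]

table = frequency_table(responses)
for category, (count, rel) in table.items():
    print(f"{category:8s} {count:3d} {rel:.3f}")
print("sample size:", len(responses))
```

Printing the sample size alongside the relative frequencies follows the Remark above: a proportion is only as credible as the number of observations behind it.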
3.2 DISCRETE DATA
We next consider summary descriptions of measurement data and begin our discussion with discrete measurement scales. As explained in Section 2, a data set is identified as discrete when the underlying scale is discrete and the distinct values observed are not too numerous. Similar to our description of categorical data, the information in a discrete data set can be summarized in a frequency table that includes a calculation of the relative frequencies. In place of the qualitative categories, we now list the distinct numerical measurements that appear in the data set and then count their frequencies.
EXAMPLE 2
The daily numbers of computer stoppages are observed over 30 days at a university computing center, and the data of Table 2 are obtained. The frequency distribution of this data set is presented in Table 3, where the last column shows the calculated relative frequencies.
Table 2 Daily Numbers of Computer Stoppages

1  3  1  1  0  1  0  1  1  0
2  2  0  0  0  1  2  1  2  0
0  1  6  4  3  3  1  2  4  0
Table 3 Frequency Distribution for Daily Number (x) of Computer Stoppages

Value x    Frequency    Relative Frequency
   0           9              .300
   1          10              .333
   2           5              .167
   3           3              .100
   4           2              .067
   5           0              .000
   6           1              .033
Total         30             1.000
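As a check on the tabulation in Example 2, the following Python sketch counts the values of Table 2 and rounds the relative frequencies to three decimals. The variable names are ours, and listing every value between the minimum and maximum (so that the unobserved value 5 appears with frequency 0) is a presentation choice that mirrors Table 3.

```python
from collections import Counter

# Daily numbers of computer stoppages from Table 2 (30 days).
stoppages = [1, 3, 1, 1, 0, 1, 0, 1, 1, 0,
             2, 2, 0, 0, 0, 1, 2, 1, 2, 0,
             0, 1, 6, 4, 3, 3, 1, 2, 4, 0]

counts = Counter(stoppages)
n = len(stoppages)

# One row per value from the minimum to the maximum, so that a value
# that never occurs still appears with frequency 0.
distribution = []
for x in range(min(stoppages), max(stoppages) + 1):
    distribution.append((x, counts[x], round(counts[x] / n, 3)))

for x, freq, rel in distribution:
    print(f"{x}  {freq:2d}  {rel:.3f}")
```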
The frequency distribution of a discrete variable can be presented pictorially by drawing either lines or rectangles to represent the relative frequencies. First, the distinct values of the variable are located on the horizontal axis. For a line diagram, we draw a vertical line at each value and make the height of the line equal to the relative frequency. A histogram employs vertical rectangles instead of lines. These rectangles are centered at the values and their areas represent relative frequencies. Typically, the values proceed in equal steps, so the rectangles are all of the same width and their heights are proportional to the relative frequencies as well as to the frequencies. Figure 1(a) shows the line diagram and 1(b) shows the histogram of the frequency distribution of Table 3.

Figure 1 Graphic display of data in Table 3. (a) Line diagram. (b) Histogram.

3.3 DATA ON A CONTINUOUS VARIABLE
We now consider tabular and graphical presentations of data sets that contain numerical measurements on a virtually continuous scale. Of course, the recorded measurements are always rounded. In contrast with the discrete case, a data set of measurements on a continuous variable may contain many distinct values. Then, a table or plot of all distinct values and their frequencies will not make a condensed or informative summary of the data.
The two main graphical methods used to display a data set of measurements are the dot diagram and the histogram. Dot diagrams are employed when there are relatively few observations (say, less than 20 or 25); histograms are used with a larger number of observations.
Dot Diagram
When the data consist of a small set of numbers, they can be graphically represented by drawing a line with a scale covering the range of values of the measurements. Individual measurements are plotted above this line as prominent dots. The resulting diagram is called a dot diagram.
Paying attention in class. Observations on 24 first-grade students.

Figure 2 Time not concentrating on the mathematics assignment (out of 20 minutes). [Dot diagram. Horizontal axis: minutes, from 0 to 13.]

First-grade teachers allot a portion of each day to mathematics. An educator, concerned about how students utilize this time, selected 24 students and observed them for a total of 20 minutes spread over several days. The number of minutes, out of 20, that the student was not on task was recorded (courtesy of T. Romberg). These lack-of-attention times are graphically portrayed in the dot diagram. The student with 13 out of 20 minutes off-task stands out enough to merit further consideration. Is this a student who finds the subject too difficult, or might it be a very bright child who is bored?
EXAMPLE 3
The numbers of days the first six heart transplant patients at Stanford survived after their operations were 15, 3, 46, 623, 126, 64. These survival times extended from 3 days to 623 days. Drawing a line segment from 0 to 700, we can plot the data as shown in Figure 3. This dot diagram shows a cluster of small survival times and a single rather large value.
Figure 3 Dot diagram for the heart transplant data. [Horizontal axis: survival time (days).]

Frequency Distribution on Intervals
When the data consist of a large number of measurements, a dot diagram may be quite tedious to construct. More seriously, overcrowding of the dots will cause them to smear and mar the clarity of the diagram. In such cases it is convenient to condense the data by grouping the observations according to intervals and recording the frequencies of the intervals. Unlike a discrete frequency distribution, where grouping naturally takes place on points, here we use intervals of values. The main steps in this process are outlined below.
Constructing a Frequency Distribution for a Continuous Variable
(a) Find the minimum and the maximum values in the data set.
(b) Choose intervals or cells of equal length that cover the range between the minimum and the maximum without overlapping. These are called class intervals, and their end points are called class boundaries.
(c) Count the number of observations in the data that belong to each class interval. The count in each class is the class frequency or cell frequency.
(d) Calculate the relative frequency of each class by dividing the class frequency by the total number of observations in the data:

$$\text{Relative frequency} = \frac{\text{Class frequency}}{\text{Total number of observations}}$$
The choice of the number and position of the class intervals is primarily a matter of judgment guided by the following considerations. The number of classes usually ranges from 5 to 15, depending on the number of observations in the data. Grouping the observations sacrifices information concerning how the observations are distributed within each cell. With too few cells, the loss of information is serious. On the other hand, if one chooses too many cells and the data set is relatively small, the frequencies from one cell to the next would jump up and down in a chaotic manner and no overall pattern would emerge. As an initial step, frequencies may be determined with a large number of intervals that can later be combined as desired in order to obtain a smooth pattern of the distribution. Computers conveniently order data from smallest to largest so that the observations in any cell can easily be counted. The construction of a frequency distribution is illustrated in Example 4.
EXAMPLE 4
Most students purchase their books during the registration period. University bookstore receipts from 40 students provided the sales data of Table 4, where the values have been ordered from smallest to largest. To construct a frequency distribution, we first notice that the minimum sale is 3.20 and the maximum sale is 124.27. We choose class intervals of length 25 as a matter of convenience.
Table 4 The Data of Forty Cash Register Receipts (Dollars) at a University Bookstore

 3.20  11.70  13.64  15.60  15.89   28.44   29.07   37.34
41.81  43.35  43.94  49.51  49.82   51.20   51.43   52.47
53.72  53.92  54.03  56.89  63.80   66.40   69.64   70.15
70.98  74.52  76.68  77.94  80.91   84.04   85.70   86.48
88.92  89.28  91.36  91.62  98.79  102.39  104.21  124.27
The selection of class boundaries is a bit of fussy work. Since the data have two decimal places, we could add a third decimal figure to avoid the possibility of any observation falling exactly on the boundary. For example, we could end the first class interval at 24.995. Alternatively, and more neatly, we could write 0-25 and make the end point convention that the left-hand limit is included but not the right. See Table 5. The first interval contains 5 observations, so its frequency is 5 and its relative frequency is 5/40 = .125.
Table 5 Frequency Distribution for Bookstore Sales Data (Left Endpoints Included, but Right Endpoints Excluded)

Class Interval    Frequency    Relative Frequency
$  0-25               5          5/40 = .125
  25-50               8          8/40 = .200
  50-75              13         13/40 = .325
  75-100             11         11/40 = .275
 100-125              3          3/40 = .075
Total                40                1.000
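The grouping steps of Example 4 can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not a prescribed method: the helper `class_frequencies` and its parameters are our own names, the receipts are transcribed from Table 4, and the floor operation implements the convention that the left endpoint of each class is included and the right endpoint excluded.

```python
import math

# Bookstore receipts (dollars) transcribed from Table 4, smallest to largest.
sales = [
    3.20, 11.70, 13.64, 15.60, 15.89, 28.44, 29.07, 37.34,
    41.81, 43.35, 43.94, 49.51, 49.82, 51.20, 51.43, 52.47,
    53.72, 53.92, 54.03, 56.89, 63.80, 66.40, 69.64, 70.15,
    70.98, 74.52, 76.68, 77.94, 80.91, 84.04, 85.70, 86.48,
    88.92, 89.28, 91.36, 91.62, 98.79, 102.39, 104.21, 124.27,
]

def class_frequencies(data, start, width, n_classes):
    """Count observations in [start + k*width, start + (k+1)*width), k = 0, 1, ..."""
    counts = [0] * n_classes
    for x in data:
        k = math.floor((x - start) / width)  # left endpoint included, right excluded
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} lies outside the class intervals")
        counts[k] += 1
    return counts

frequencies = class_frequencies(sales, start=0, width=25, n_classes=5)
relative = [f / len(sales) for f in frequencies]

for k, (f, r) in enumerate(zip(frequencies, relative)):
    print(f"{25 * k:3d}-{25 * (k + 1):<3d} {f:2d} {r:.3f}")
```

Changing `width` and `n_classes` makes it easy to experiment with the trade-off discussed above between too few and too many cells.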
The relative frequencies add to 1, as they should in any frequency distribution.
Remark: The rule requiring equal class intervals is inconvenient when the data are spread over a wide range but are highly concentrated in a small part of the range with relatively few numbers elsewhere. Using smaller intervals where the data are highly concentrated and larger intervals where the data are sparse helps to reduce the loss of information due to grouping. Tabulations of income, age, and other characteristics in official reports are often made with unequal class intervals.

Histogram
A frequency distribution can be graphically presented as a histogram. To draw a histogram we first mark the class intervals on the horizontal axis. On each interval we then draw a vertical rectangle whose area represents the relative frequency, that is, the proportion of the observations occurring in that class interval. The total area of all rectangles equals the sum of the relative frequencies, which is one.
The total area of a histogram is 1.
The histogram of Table 5 is shown in Figure 4. For example, the rectangle drawn on the class interval 0-25 has its area = .005 × 25 = .125, which is the relative frequency of this class. Actually, we determined the height .005 as

$$\text{Height} = \frac{\text{Relative frequency}}{\text{Width of interval}} = \frac{.125}{25} = .005$$

Figure 4 Histogram of the bookstore sales data of Tables 4 and 5. [Horizontal axis: dollar sales, with class boundaries at 0, 25, 50, 75, 100, 125; vertical axis marked at 0.005, 0.010, 0.015.]
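The height calculation can be sketched in Python as follows. The function name `histogram_heights` is hypothetical, and the second call regroups the last two classes of Table 5 into a single interval 75-125 purely as an illustration of unequal widths; that regrouping is our assumption, not part of the example.

```python
def histogram_heights(boundaries, frequencies):
    """Height of each rectangle = relative frequency / width of interval."""
    n = sum(frequencies)
    heights = []
    for left, right, f in zip(boundaries, boundaries[1:], frequencies):
        heights.append((f / n) / (right - left))
    return heights

# Class boundaries and frequencies from Table 5.
boundaries = [0, 25, 50, 75, 100, 125]
frequencies = [5, 8, 13, 11, 3]

heights = histogram_heights(boundaries, frequencies)
widths = [right - left for left, right in zip(boundaries, boundaries[1:])]
total_area = sum(h * w for h, w in zip(heights, widths))
print([round(h, 3) for h in heights], round(total_area, 3))

# Illustrative regrouping with unequal widths: classes 75-100 and 100-125 merged.
merged_heights = histogram_heights([0, 25, 50, 75, 125], [5, 8, 13, 14])
print([round(h, 3) for h in merged_heights])
```

With the merged class, the relative frequency .35 is spread over a width of 50, so the rectangle's height is .007 even though the class holds more observations than 50-75. Dividing by the width is what keeps the total area equal to 1.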
The units on the vertical axis can be viewed as relative frequencies per unit of the horizontal scale. For instance, .008 is the relative frequency per dollar for the interval $25-$50. Visually, we note that the tallest block, or most frequent class interval, is $50-$75. Also, a proportion .275 + .075 = .35 of the sales are for more than $75.
Remark: When all class intervals have equal widths, the heights of the rectangles are proportional to the relative frequencies that the areas represent. The formal calculation of height, as area divided by the width, is then redundant. Instead, one can mark the vertical scale according to the relative frequencies, that is, make the heights of the rectangles equal to the relative frequencies. The resulting picture also makes the areas represent the relative frequencies if we read the vertical scale as if it is in units of the class interval. This leeway when plotting the histogram is not permitted in the case of unequal class intervals.
Figure 5 shows one ingenious way of displaying two histograms for comparison. In spite of their complicated shapes, their back-to-back plot as a "tree" allows for easy visual comparison of the male and female age distributions.

Figure 5 Population tree (histograms) of the male and female age distributions in the United States in 1980. (Source: U.S. Bureau of the Census.) [Axis: age.]

Stem-and-Leaf Plot
A stem-and-leaf plot provides a more efficient variant of the histogram for displaying data, especially when the observations are two-digit numbers. This plot is obtained by sorting the observations into rows according to their leading digit. The stem-and-leaf plot for the data of Table 6 is shown in Table 7. To make this plot:
1. List the digits 0 through 9 in a column, and draw a vertical line. These correspond to the leading digit.
2. For each observation, record its second digit to the right of this vertical line in the row where the first digit appears.
3. Finally, arrange the second digits in each row so they are in increasing order.
Table 6 Examination Scores of 50 Students

75  86  68  49  93  84  98  78  57  92
85  64  42  37  95  83  70  73  75  99
55  71  62  48  84  66  79  78  80  72
87  90  88  53  74  65  79  76  81  69
59  80  60  77  90  63  89  77  58  62
Table 7 Stem-and-Leaf Diagram for the Examination Scores

0 |
1 |
2 |
3 | 7
4 | 289
5 | 35789
6 | 022345689
7 | 01234556778899
8 | 00134456789
9 | 0023589
In the stem-and-leaf plot, the column of first digits to the left of the vertical line is viewed as the stem, and the second digits as the leaves. Viewed sidewise, it looks like a histogram with a cell width equal to 10. However, it is more informative than a histogram because the actual data points are retained. In fact, every observation can be recovered exactly from the stem-and-leaf plot.
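The three steps for making a stem-and-leaf plot translate directly into code. The sketch below is one possible implementation for whole numbers from 0 to 99; the function name `stem_and_leaf` is our own, and the scores are those of Table 6.

```python
# Examination scores of 50 students from Table 6.
scores = [75, 86, 68, 49, 93, 84, 98, 78, 57, 92,
          85, 64, 42, 37, 95, 83, 70, 73, 75, 99,
          55, 71, 62, 48, 84, 66, 79, 78, 80, 72,
          87, 90, 88, 53, 74, 65, 79, 76, 81, 69,
          59, 80, 60, 77, 90, 63, 89, 77, 58, 62]

def stem_and_leaf(data):
    """Map each leading digit 0-9 to the ordered string of second digits."""
    rows = {stem: [] for stem in range(10)}      # step 1: stems 0 through 9
    for x in sorted(data):                       # step 3: sorting first orders the leaves
        stem, leaf = divmod(x, 10)               # step 2: split into first and second digit
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in rows.items()}

plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {leaves}")

# Every observation can be recovered from the plot.
recovered = sorted(10 * stem + int(leaf) for stem, leaves in plot.items() for leaf in leaves)
print(recovered == sorted(scores))
```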
EXERCISES
3.1 Recorded here are the blood types of 40 persons who have volunteered to donate blood at a plasma center. Summarize the data in a frequency table. Include calculations of the relative frequencies.
OOABAOAAAO
BOBOOAOOAA
AAABABAAOOA
O O A A A O A O OAB
3.2 In a study of the job hazards in the roofing industry in California, records of the disabling injuries were classified according to the accident types. Of the total number of 1182 injuries in 1970, B29 were due to falls, 256 from burns, 219 from overexertion, 202 from being struck, 40 from foreign substance in eye, and 86 from other miscellaneous reasons. (Source: Dept. of HEW Publication NIOSH Z5176.) Present these data in a frequency table and also give the relative frequencies.
3.3 The numbers of peas in 50 pea pods, randomly taken from a day's pick, are:
4 3 4 5 3
4 3 4 4 A 7 3 4 4 2
3 3 4 3 4
3 6 4 5 3
s 4 s 5 s
s 6 4 s 3 6 3 4 q 3,3,3 6 4 4
4 3 1
0
43 21 22 55 34
3
(a) Construct a frequency distribution, including the calculation of relative frequency.
(b) Display the data as a line diagram.
3.4 On five occasions, the amounts of suspended solids (parts per million) detected in the effluent of a municipal wastewater treatment plant were 14, 12, 21, 28, AO, 65, 26. Display the data in a dot diagram.
3.5 One of the major indicators of air pollution in large cities and industrial belts is the concentration of ozone in the atmosphere. From massive data collected by Los Angeles County authorities, 78 measurements of ozone concentration (in parts per hundred million) in the downtown Los Angeles area during the summers of 1966 and 1967 are recorded here (courtesy of G. Tiao). Each measurement is an average of hourly readings taken every fourth day.

3.5  6.8  2.4   6.8  5.5  6.2  5.7  9.4  6.8  6.6
1.4  2.5  3.0   1.7  1.1  7.5  5.8  3.4  3.1  4.4
6.6  5.4  5.6   5.3  5.1  6.2  3.1  5.8  4.7  5.7
6.0  4.4  4.7   4.7  5.6  5.0  5.8  7.6  3.8  4.5
4.2  5.4  6.5   7.4  5.5  5.8  1.6  1.4  5.9  3.7
4.4  4.7  3.0   6.0  1.4  2.8  2.5  3.7  3.3  9.4
5.3  3.5  4.1   6.7  3.9  6.1  8.1  2.0  6.2
5.6  4.0  3.4  11.7  6.6  4.1  6.6  3.7  7.6
(a) Construct a frequency distribution, including a calculation of relative frequency.
(b) Make a histogram using the class intervals 0-2, 2-4, and so on, with the end point convention that the left endpoint is included but not the right endpoint.
3.6 The following measurements of weight (in grams) have been recorded for a common strain of 70 31-day-old rats (courtesy of J. Holtzman).
(a) Choose appropriate class intervals, and group the data into a frequency distribution.
(b) Calculate the relative frequency of each class interval.
(c) Plot the relative frequency histogram.
180- 116 94 l>o 1\6 9& l\6 l\4
lb6 Isz I!4 106 lo9r.l4o .log r tro N6 1X2 N2 1\6 r IDB r\o lB2
llg N2 1n0 lz"t
N2 \s tba, )s2 t\8 lls No N2 llo llo
r28 \z rB8 t\S n6
lM l\2
tog l$ 129 t10
rq8 l\o 1b2 9U lx
1!o l3o llo lb4 98
1r0 N2 1tr4 NIo 1\O
N2 S4
\06 \4 Nz IOA l0a
3.7 The frequency distribution of the number of lives lost in major tornadoes in the United States between 1900 and 1973 appears below. (Source: U.S. Environmental Data Service.)
(a) Plot the relative frequency histogram. Use 0-25 for the first interval and 250-300 for the last.
(b) Comment on the shape of the distribution.

No. of Deaths    Frequency
≤ 24                  8
25-49                16
50-74                16
75-99                11
100-149               6
150-199               2
200-249               4
250 and over          1
Total                64
3.8 The following data represent the scores of 40 students on a college qualification test (courtesy of R. W. Johnson).

152  171  138  145  144  126  145  162  174  178
167   98  161  152  182  136  165  137  133  143
 95  190  119  144  176  135  184  166  115  115
194  147  160  158  178  162  131  106  157  154

Make a stem-and-leaf diagram.
4. MEASURES OF CENTER
The graphic procedures described in Section 3 help us to visualize the pattern of a data set of measurements. To obtain a more objective summary description and a comparison of data sets, we must go one step further and obtain numerical values for the location or center of the data and the amount of variability present. Because data are normally obtained by sampling from a large population, our discussion of numerical measures is restricted to data arising in this context. Moreover, when the population is finite and is completely sampled, the same arithmetic operations can be carried out to obtain numerical measures for the population.
To effectively present the ideas and associated calculations, it is convenient to represent a data set by symbols to prevent the discussion from becoming anchored to a specific set of numbers. A data set consists of a number of measurements symbolically represented by $x_1, x_2, \ldots, x_n$. The last subscript $n$ denotes the number of measurements in the data, and $x_1, x_2, \ldots$ represent the first observation, the second observation, and so on. For instance, a data set consisting of the five measurements 2.1, 3.2, 4.1, 5.6, and 3.7 is represented in symbols by $x_1, x_2, x_3, x_4, x_5$, where $x_1 = 2.1$, $x_2 = 3.2$, $x_3 = 4.1$, $x_4 = 5.6$, and $x_5 = 3.7$.
Perhaps the most important aspect of studying the distribution of a sample of measurements is locating the position of a central value about which the measurements are distributed. The two most commonly used indicators of center are the mean and the median.
The mean or average of a set of measurements is the sum of the measurements divided by their number. For instance, the mean of the five measurements 2.1, 3.2, 4.1, 5.6, and 3.7 is

$$\frac{2.1 + 3.2 + 4.1 + 5.6 + 3.7}{5} = \frac{18.7}{5} = 3.74$$
To state this idea in general terms we use symbols. If a sample consists of $n$ measurements $x_1, x_2, \ldots, x_n$, the mean of the sample is

$$\frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\text{Sum of the } n \text{ measurements}}{n}$$
The notation $\bar{x}$ will be used to represent a sample mean. To further simplify the writing of a sum, the Greek capital letter $\Sigma$ (sigma) is used as a statistical shorthand. With this symbol:

The sum $x_1 + x_2 + \cdots + x_n$ is denoted as $\sum_{i=1}^{n} x_i$.
Read this as "the sum of all $x_i$ with $i$ ranging from 1 to $n$."

For example, $\sum_{i=1}^{5} x_i$ represents the sum $x_1 + x_2 + x_3 + x_4 + x_5$.

Remark: When the number of terms being summed is understood from the context, we often simplify to $\sum x_i$ instead of $\sum_{i=1}^{n} x_i$. Some further operations with the $\Sigma$ notation are discussed in Appendix A1.
We are now ready to formally define the sample mean.
The sample mean of a set of $n$ measurements $x_1, x_2, \ldots, x_n$ is the sum of these measurements divided by $n$. The sample mean is denoted by $\bar{x}$. Expressed operationally,

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{or} \quad \frac{\sum x_i}{n}$$
According to the concept of "average," the mean represents a center of a data set. If we picture the dot diagram of a data set as a thin horizontal bar on which balls of equal size are placed at the positions of the data points, then the mean $\bar{x}$ represents the point on which the bar will balance. The computation of the sample mean and its physical interpretation are illustrated in Example 5.
EXAMPLE 5
The birth weights in pounds of five babies born in a hospital on a certain day are 9.2, 6.4, 10.5, 8.1, 7.8. The mean birth weight for these data is

$$\bar{x} = \frac{9.2 + 6.4 + 10.5 + 8.1 + 7.8}{5} = \frac{42.0}{5} = 8.4 \text{ pounds}$$
The dot diagram of the data appears in Figure 6, where the sample mean (marked by ▲) is the balancing point or center of the picture.

Figure 6 Dot diagram and the sample mean for the birth-weight data. [Horizontal axis: pounds, from 6 to 11; the sample mean $\bar{x} = 8.4$ pounds is marked.]
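The definition of the sample mean, and its interpretation as a balancing point, can be checked numerically. In the sketch below the function name `sample_mean` is our own; the balancing property is expressed by the fact that the deviations of the observations from the mean add to zero (apart from rounding in floating-point arithmetic).

```python
def sample_mean(xs):
    """Sum of the measurements divided by their number."""
    return sum(xs) / len(xs)

# Birth weights (pounds) from Example 5.
weights = [9.2, 6.4, 10.5, 8.1, 7.8]
x_bar = sample_mean(weights)
print(round(x_bar, 1))

# Deviations to the left and right of the mean cancel: the bar balances at x_bar.
deviations = [x - x_bar for x in weights]
print(round(sum(deviations), 10))
```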
The sample median of a set of $n$ measurements $x_1, \ldots, x_n$ is the middle value when the measurements are arranged from smallest to largest.
Roughly speaking, the median is the value that divides the data into two equal halves. In other words, 50% of the data lie below the median and 50% lie above it. If $n$ is an odd number, there is a unique middle value and it is the median. If $n$ is an even number, there are two middle values and the median is defined as their average. For instance, the data 3, 5, 7, 8 have two middle values 5 and 7, so median = (5 + 7)/2 = 6.
EXAMPLE 6
Find the median of the birth-weight data given in Example 5.
The measurements, ordered from smallest to largest, are

6.4, 7.8, 8.1, 9.2, 10.5

The middle value is 8.1, and the median is therefore 8.1 pounds.
EXAMPLE 7
Calculate the median of the survival times given in Example 3. Also calculate the mean and compare.
To find the median, first we order the data. The ordered values are

3, 15, 46, 64, 126, 623

There are two middle values, so

$$\text{median} = \frac{46 + 64}{2} = 55 \text{ days}$$

The sample mean is

$$\bar{x} = \frac{3 + 15 + 46 + 64 + 126 + 623}{6} = \frac{877}{6} = 146.2 \text{ days}$$

Note that the largest survival time greatly inflates the mean. Only 1 out of the 6 patients survived longer than $\bar{x} = 146.2$ days. Here the median of 55 days appears to be a better indicator of the center than the mean.
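The rule for the median, with its separate treatment of odd and even sample sizes, can be sketched as follows. The function name `sample_median` is our own. The last two lines are an added illustration, not part of Example 7: they drop the largest survival time to show how strongly it pulls the mean while the median moves comparatively little.

```python
def sample_median(xs):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # unique middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle values

# Survival times (days) from Example 3.
survival = [15, 3, 46, 623, 126, 64]
median_all = sample_median(survival)
mean_all = sum(survival) / len(survival)
print(median_all, round(mean_all, 1))

# Illustration: the same data without the single large value.
without_largest = sorted(survival)[:-1]
print(sample_median(without_largest), sum(without_largest) / len(without_largest))
```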
Example 7 demonstrates that the median is not affected by a few very small or very large observations, whereas the presence of such extremes will have a significant effect on the mean. For extremely asymmetrical distributions, the median is likely to be a more sensible measure of center than the mean. That is why government reports on income distribution quote the median income as a summary, rather than the mean. A relatively small number of very highly paid persons can have a great effect on the mean salary.
If the number of observations is quite large (greater than, say, 25 or 30), it is sometimes useful to extend the notion of the median and to divide the ordered data set into quarters. Just as the point for division into halves is called the median, the points for division into quarters are called quartiles. The points of division into more general fractions are called percentiles.
The sample 100p-th percentile is a value such that after the data are ordered from smallest to largest, at least 100p% of the observations are at or below this value and at least 100(1 - p)% are at or above this value.

The quartiles are simply the 25th, 50th, and 75th percentiles.

Sample Quartiles
Lower (first) quartile          $Q_1$ = 25th percentile
Second quartile (or median)     $Q_2$ = 50th percentile
Upper (third) quartile          $Q_3$ = 75th percentile
We adopt the convention of taking an observed value for the sample percentile except when two adjacent values satisfy the definition, in which case their average is taken as the percentile. This coincides with the way the median is defined when the sample size is even. When all values in an interval satisfy the definition of a percentile, the particular convention used to locate a point in the interval does not appreciably alter the results in large data sets, except perhaps for the determination of extreme percentiles (those before the 5th or after the 95th percentile).
EXAMPLE 8
The data from 50 measurements of the traffic noise level at an intersection are already ordered from smallest to largest in Table 8. Locate the quartiles and also compute the 10th percentile.
Table 8 Measurements of Traffic Noise Level in Decibels (ordered down each column)

52.0  55.9  56.7  59.4  60.2  61.0  62.1  63.8  65.7  67.9
54.4  55.9  56.8  59.4  60.3  61.4  62.6  64.0  66.2  68.2
54.5  56.2  57.2  59.5  60.5  61.7  62.7  64.6  66.8  68.9
55.7  56.4  57.6  59.8  60.6  61.8  63.1  64.8  67.0  69.4
55.8  56.4  58.9  60.0  60.8  62.0  63.6  64.9  67.1  77.1

Courtesy of J. Bollinger.
To determine the first quartile, we must count at least .25 × 50 = 12.5 observations from the smallest measurement and at least .75 × 50 = 37.5 from the largest. We can see that the 13th ordered observation is 57.2. This observation has 13 values at or below it and 38 observations at or above it. Consequently, 57.2 is the first quartile. Counting down 13 observations from the largest measurement, we find that 64.6 is the third quartile. The median is (60.8 + 61.0)/2 = 60.9.
For the 10th percentile, at least .10 × 50 = 5 points must lie at or below it and .90 × 50 = 45 points must lie at or above it. Both the 5th and 6th smallest observations satisfy this condition, so we take their average (55.8 + 55.9)/2 = 55.85 as the 10th percentile. Only 10% of the 50 measurements of noise level were quieter than 55.85 decibels.
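The counting argument of Example 8 can be sketched in Python. The helper `sample_percentile` is hypothetical and implements only the convention adopted in this section: when n × p is a whole number k, the k-th and (k+1)-th ordered values both satisfy the definition and are averaged; otherwise the next ordered value past position n × p is taken. As a simplifying assumption, the percent is restricted to whole numbers from 1 to 99 so that the arithmetic is exact. Statistical software often uses other interpolation rules and may give slightly different answers.

```python
# Traffic noise levels (decibels) from Table 8, ordered from smallest to largest.
noise = [52.0, 54.4, 54.5, 55.7, 55.8, 55.9, 55.9, 56.2, 56.4, 56.4,
         56.7, 56.8, 57.2, 57.6, 58.9, 59.4, 59.4, 59.5, 59.8, 60.0,
         60.2, 60.3, 60.5, 60.6, 60.8, 61.0, 61.4, 61.7, 61.8, 62.0,
         62.1, 62.6, 62.7, 63.1, 63.6, 63.8, 64.0, 64.6, 64.8, 64.9,
         65.7, 66.2, 66.8, 67.0, 67.1, 67.9, 68.2, 68.9, 69.4, 77.1]

def sample_percentile(data, percent):
    """Sample percentile for a whole-number percent between 1 and 99."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 1 and 99")
    ordered = sorted(data)
    n = len(ordered)
    k, remainder = divmod(n * percent, 100)  # n * p, split into whole part and remainder
    if remainder == 0:
        # n * p is a whole number k: average the k-th and (k+1)-th ordered values.
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise only the (k+1)-th ordered value satisfies the definition.
    return ordered[k]

q1 = sample_percentile(noise, 25)
q2 = sample_percentile(noise, 50)
q3 = sample_percentile(noise, 75)
p10 = sample_percentile(noise, 10)
print(q1, round(q2, 2), q3, round(p10, 2))
```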
EXERCISES
4.1 Calculate the mean and median for each of the following data sets:
(a) 4, 7, 3, 6, 5
(b) 24, 28, 36, 30, 24, 29
(c) −2, 1, −1, 0, 3, −2, 1
4.2 Eight participants in a bike race had the following finishing times in minutes: 28, 22, 26, 33, 21, 23, 37, 24. Find the mean and median for the finishing times.
4.3 The monthly incomes in dollars for seven staff members of an insurance office are 950, 775, 925, 2500, 1150, 850, 975.
(a) Calculate the mean and median salary.
(b) Which of the two is preferable as a measure of center, and why?
4.4 The following measurements of the diameters (in feet) of Indian mounds in southern Wisconsin were gathered by examining reports in the Wisconsin Archeologist (courtesy of J. Williams): 22, 24, 24, 30, 22, 20, 28, 30, 24, 34, 36, 15, 37.
(a) Plot a dot diagram.
(b) Calculate the mean and median and then mark these on the dot diagram.
(c) Calculate the quartiles.
4.5 Refer to the data of college qualification test scores given in Exercise 3.8.
(a) Find the median.
(b) Find Q1 and Q3.
4.6 In an epidemiological study, the total organochlorines and PCB's present in milk samples were recorded from 40 donors in Colorado. (Source: Pesticides Monitoring Journal, June 1973.) The measurements were ordered from lowest to highest. For the data set, find the: (a) median and quartiles, (b) 20th percentile and 70th percentile.

 27   43   52   53   53   53   58   61   63   63
 65   70   72   75   83   95   96   97  101  105
110  115  115  115  115  126  127  134  145  152
153  182  190  197  197  282  322  322  342  521
4.7 Some properties of the mean and median.
(i) If a fixed number c is added to all measurements in a data set, then the mean of the new measurements is (c + the original mean).
(ii) If all measurements in a data set are multiplied by a fixed number d, then the mean of the new measurements is d × (the original mean).
(a) Verify these properties for the data set 5, 9, 9, 8, 10, 7, taking c = 4 in (i) and d = 2 in (ii).
(b) The same properties also hold for the median. Verify these for the data set and the numbers c and d given in part (a).
4.8 On a day, the noon temperature measurements (in °F) reported by five weather stations in a state were 75, 82, 78, 78, 75.
(a) Find the mean and median temperature in °F.
(b) The Centigrade (C) scale is related to the Fahrenheit (F) scale by C = (5/9)(F − 32). What are the mean and median temperatures in °C? (Answer without converting each temperature measurement to °C. Use the properties stated in Exercise 4.7.)
4.9 Given here are the mean and median salaries of machinists employed by two competing companies A and B.

                      Company
                    A          B
Mean salary      $25000     $23500
Median salary    $22000     $24000

Assume that the salaries are set in accordance with job competence, and that the overall quality of workers is about the same in the two companies.
(a) Which company offers a better prospect to a machinist having superior ability? Explain your answer.
(b) Where can a medium quality machinist expect to earn more? Explain your answer.
5. MEASURES OF VARIATION
In addition to locating the center of the data, another important aspect of a descriptive study of data is numerically measuring the extent of variation around the center. Two data sets may exhibit similar positions of center but may be remarkably different with respect to variability. For example, the dots in Figure 7b are more scattered than the dots in Figure 7a.

Figure 7 Dot diagrams with similar center values but different variations. Panels (a) and (b).
Since the sample mean $\bar{x}$ is a measure of center, the variation of the individual data points about this center is reflected in their deviation from the mean:

    Deviation = Observation − (Sample mean) = $x - \bar{x}$

For instance, the data set 3, 5, 7, 7, 8 has mean $\bar{x}$ = (3 + 5 + 7 + 7 + 8)/5 = 30/5 = 6, so the deviations are calculated by subtracting 6 from each observation. See Table 9.

Table 9 Calculation of Deviations

Observation x    Deviation x − x̄
     3                −3
     5                −1
     7                 1
     7                 1
     8                 2

One might feel that the average of the deviations would provide a numerical measure of spread. However, some deviations are positive and some negative, and the total of the positive deviations exactly cancels the total of the negative ones. In the above example we see that the positive deviations add to 4 and the negative ones add to −4, so the total deviation is 0. With a little reflection on the definition of the sample mean, the reader will realize that this was not just an accident. For any data set, the total deviation is 0 (for a formal proof of this fact see Appendix A1).

$$ \sum (\text{deviations}) = \sum (x - \bar{x}) = 0 $$
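The reason the total deviation is always 0 can be sketched in one line. The indexed notation $x_1, \ldots, x_n$ for the n observations is introduced here only for this sketch; it uses nothing beyond the definition of the sample mean.

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0,
\qquad \text{since } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```

For the data 3, 5, 7, 7, 8 this reads 30 − 5(6) = 0, in agreement with Table 9.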
To obtain a measure of spread, we must eliminate the signs of the deviations before averaging. One way of removing the interference of signs is to square the numbers. A measure of spread, called the sample variance, is constructed by adding the squared deviations and dividing the total by the number of observations minus one.

Sample variance of n observations:

$$ s^2 = \frac{\text{sum of squared deviations}}{n - 1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$

EXAMPLE 9
Calculate the variance of the data: 3, 5, 7, 7, 8.

For this data set, n = 5. To find the variance, we first calculate the mean, then the deviations and the squared deviations. See Table 10.
Table 10 Calculation of Variance

           x     Deviation x − x̄    (Deviation)² (x − x̄)²
           3          −3                    9
           5          −1                    1
           7           1                    1
           7           1                    1
           8           2                    4
Total     30           0                   16

$$ \bar{x} = \frac{30}{5} = 6, \qquad \text{Sample variance } s^2 = \frac{16}{5 - 1} = 4 $$
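The steps of Table 10 translate directly into a few lines of code. The Python sketch below is only an illustration of the definition (the function name `sample_variance` is chosen here, not taken from the text); it prints one row of the table per observation and then the variance.

```python
def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]
    squared = [d ** 2 for d in deviations]
    return sum(squared) / (n - 1)


if __name__ == "__main__":
    data = [3, 5, 7, 7, 8]  # data of Example 9
    mean = sum(data) / len(data)
    for x in data:
        # One row of Table 10: observation, deviation, squared deviation.
        print(x, x - mean, (x - mean) ** 2)
    print("sample variance:", sample_variance(data))
```

Note that the divisor is n − 1, as in the definition above, rather than n.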
Remark: Although the sample variance is conceptualized as the average squared deviation, notice that the divisor is n − 1 rather than n. The divisor, n − 1, is called the degrees of freedom associated with $s^2$.¹

Because the variance involves a sum of squares, its unit is the square of the unit in which the measurements are expressed. For example, if the data pertain to measurements of weight in pounds, the variance is expressed in (pounds)². To obtain a measurement of variability in the same unit as the data, we take the square root of the variance, called the standard deviation. The standard deviation rather than the variance serves as a basic measure of variability.

Sample Standard Deviation

$$ s = +\sqrt{s^2} = +\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$

EXAMPLE 10
Calculate the standard deviation for the data of Example 9.

We already calculated the variance $s^2 = 4$, so the standard deviation is $s = \sqrt{4} = 2$.

To show that a larger spread of the data does indeed result in a larger numerical value of the standard deviation, we consider another data set in Example 11.
EXAMPLE 11
Calculate the standard deviation for the data: 1, 4, 5, 9, 11. Plot the dot diagram of this data set and also the data set of Example 9.

The standard deviation is calculated in Table 11. The dot diagrams, given in Figure 8, show that the data points of Example 9 have less spread than those of Example 11. This visual comparison is confirmed by a smaller value of s for the first data set.

Table 11 Calculation of s

           x     (x − x̄)    (x − x̄)²
           1       −5          25
           4       −2           4
           5       −1           1
           9        3           9
          11        5          25
Total     30        0          64

$$ \bar{x} = \frac{30}{5} = 6, \qquad s^2 = \frac{64}{4} = 16, \qquad s = \sqrt{16} = 4 $$

Figure 8 Dot diagrams of two data sets. Panels (a) and (b).

¹ The deviations add to 0, so a specification of any n − 1 deviations allows us to recover the one that is left out. For instance, the first four deviations in Example 9 add to −2 so, to make the total 0, the last one must be +2, as it really is. In the definition of $s^2$, the divisor n − 1 represents the number of deviations that can be viewed as free quantities.

An alternative formula for the sample variance is

$$ s^2 = \frac{1}{n - 1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right] $$

It does not require the calculation of the individual deviations. In hand calculation, use of this alternative formula often reduces the arithmetic work, especially when $\bar{x}$ turns out to be a number with many decimal places. The equivalence of the two formulas is shown in Appendix A1.
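The agreement of the two formulas can be checked numerically. The sketch below is a minimal Python illustration with function names chosen here for convenience; it evaluates both formulas on the data of Examples 9 and 11 and prints the resulting standard deviation.

```python
import math


def variance_by_deviations(data):
    """Definition: sum of squared deviations divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def variance_by_sums(data):
    """Alternative formula: needs only the sum of x and the sum of x squared."""
    n = len(data)
    total = sum(data)
    total_of_squares = sum(x * x for x in data)
    return (total_of_squares - total ** 2 / n) / (n - 1)


if __name__ == "__main__":
    examples = [("Example 9", [3, 5, 7, 7, 8]),
                ("Example 11", [1, 4, 5, 9, 11])]
    for label, data in examples:
        v1 = variance_by_deviations(data)
        v2 = variance_by_sums(data)
        # label, variance by each formula, standard deviation
        print(label, v1, v2, math.sqrt(v2))
```

The alternative formula saves work by hand, but on a computer it subtracts two large, nearly equal totals when the measurements are large and the spread is small, so the deviation form is usually the safer choice in floating-point arithmetic.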
EXAMPLE 12
In a psychological experiment, a stimulating signal of fixed intensity was used on six experimental subjects. Their reaction times, recorded in seconds, were 4, 2, 3, 3, 6, 3. Calculate the standard deviation for the data by using the alternative formula.

These calculations can be conveniently carried out in schematic form:

           x      x²
           4      16
           2       4
           3       9
           3       9
           6      36
           3       9
Total     21      83
         = Σx   = Σx²

$$ s^2 = \frac{1}{n - 1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right] = \frac{83 - (21)^2/6}{5} = \frac{83 - 73.5}{5} = \frac{9.5}{5} = 1.9 $$

$$ s = \sqrt{1.9} = 1.38 $$
The reader may do the calculations with the first formula, and verify that the same result is obtained.

In Example 11 we have seen that one data set with a visibly greater amount of variation yields a larger numerical value of s. The issue there surrounds a comparison between different data sets. In the context of a single data set, can we relate the numerical value of s to the physical closeness of the data points to the center $\bar{x}$? To this end we view one standard deviation as a benchmark distance from the mean $\bar{x}$. For bell-shaped distributions, an empirical rule relates the standard deviation to the proportion of the data that lie in an interval around $\bar{x}$.
Empirical Guideline for Symmetric Bell-Shaped Distributions
Approximately
    68% of the data lie within $\bar{x} \pm s$
    95% of the data lie within $\bar{x} \pm 2s$
    99.7% of the data lie within $\bar{x} \pm 3s$

EXAMPLE 13
Examine the 40 bookstore sales receipts in Table 4 in the context of the empirical guideline.
Using a computer (see, for instance, Exercise 7.23), we obtain

    $\bar{x}$ = $61.35    s = $28.558,    2s = 2(28.558) = $57.316

Going two standard deviations either side of $\bar{x}$ results in the interval

    $61.35 − 57.316 = $3.989    to    $118.566 = 61.35 + 57.316

By actual count, all of the observations except $3.20 and $124.27 fall in this interval. We find that 38/40 = .95 or 95% of the observations lie within two standard deviations of $\bar{x}$. This example is too good! Ordinarily we would not find exactly 95% but something close to it.
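The "actual count" in Example 13 is easy to automate. The Python sketch below, with a function name chosen here for illustration, counts the proportion of observations within k standard deviations of the mean. Because Table 4 is not reproduced in this section, the sketch is applied to the traffic noise data of Table 8 instead; no particular agreement with the guideline is claimed for those data.

```python
import math


def proportion_within(data, k):
    """Proportion of observations within k sample standard deviations of the mean."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    lower, upper = mean - k * s, mean + k * s
    inside = sum(1 for x in data if lower <= x <= upper)
    return inside / n


if __name__ == "__main__":
    # Traffic noise levels (decibels) from Table 8.
    noise = [
        52.0, 54.4, 54.5, 55.7, 55.8, 55.9, 55.9, 56.2, 56.4, 56.4,
        56.7, 56.8, 57.2, 57.6, 58.9, 59.4, 59.4, 59.5, 59.8, 60.0,
        60.2, 60.3, 60.5, 60.6, 60.8, 61.0, 61.4, 61.7, 61.8, 62.0,
        62.1, 62.6, 62.7, 63.1, 63.6, 63.8, 64.0, 64.6, 64.8, 64.9,
        65.7, 66.2, 66.8, 67.0, 67.1, 67.9, 68.2, 68.9, 69.4, 77.1,
    ]
    for k in (1, 2, 3):
        # Compare with the guideline values 68%, 95%, and 99.7%.
        print(k, proportion_within(noise, k))
```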
OTHER MEASURES OF VARIATION
Another measure of variation that is sometimes employed is:

    Sample range = Largest observation − Smallest observation

The range gives the length of the interval spanned by the observations.

EXAMPLE 14
The traffic noise data given in Table 8 contained

    Smallest observation = 52.0
    Largest observation = 77.1

Therefore, the length of the interval covered by these observations is

    Sample range = 77.1 − 52.0 = 25.1 decibels
As a measure of spread, the range has two attractive features: it is extremely simple to compute and to interpret. However, it suffers from the serious disadvantage that it is much too sensitive to the existence of a very large or very small observation in the data set. Also, it ignores the information present in the scatter of the intermediate points. To circumvent the problem of using a measure that may be thrown far off the mark by one or two wild or unusual observations, a compromise is made by measuring the interval between the first and third quartiles:

    Sample interquartile range = Third quartile − First quartile
The sample interquartile range represents the length of the interval covered by the center half of the observations. This measure of the amount of variation is not disturbed if a small fraction of the observations are very large or very small. The sample interquartile range is usually quoted in government reports on income and other distributions that have long tails in one direction, in preference to standard deviation as the measure of spread.
EXAMPLE 15
Calculate the interquartile range for the noise-level data given in Table 8.

In Example 8, the quartiles were found to be Q1 = 57.2 and Q3 = 64.6. Therefore,

    Sample interquartile range = 64.6 − 57.2 = 7.4 decibels
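The different sensitivity of the range and the interquartile range to a single wild observation can be seen in a small experiment. The Python sketch below uses two hypothetical sets of eight readings (not data from the text) that differ only in their largest value; the quartiles follow the percentile convention adopted earlier in this chapter, and all function names are chosen here for illustration.

```python
import math


def quartiles(data):
    """First and third quartiles under the text's percentile convention."""
    ordered = sorted(data)
    n = len(ordered)

    def percentile(pct):
        if (n * pct) % 100 == 0:
            # Two adjacent ordered values qualify: average them.
            k = n * pct // 100
            return (ordered[k - 1] + ordered[k]) / 2
        return ordered[math.ceil(n * pct / 100) - 1]

    return percentile(25), percentile(75)


def sample_range(data):
    return max(data) - min(data)


def interquartile_range(data):
    q1, q3 = quartiles(data)
    return q3 - q1


if __name__ == "__main__":
    typical = [52, 54, 55, 56, 57, 58, 60, 61]       # hypothetical readings
    with_outlier = [52, 54, 55, 56, 57, 58, 60, 90]  # largest value made wild
    for label, data in [("typical", typical), ("with outlier", with_outlier)]:
        print(label, sample_range(data), interquartile_range(data))
```

In this illustration the range jumps from 9 to 38 when the largest reading is replaced, while the interquartile range stays at 4.5.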
BOXPLOTS
A recently created graphic display, called a boxplot, highlights the summary information in the quartiles. The center half of the data, from the first to the third quartile, is represented by a rectangle (box) with the median indicated by a bar. A line extends from Q3 to the maximum value and another from Q1 to the minimum. Figure 9 gives the boxplot for the noise level data in Table 8. The long line to the right is a consequence of the single high noise level 77.1. The next largest is 69.4. Boxplots are particularly effective for displaying several samples alongside each other for the purpose of visual comparison.
Figure 9 Boxplot of the noise level data. Horizontal axis: noise level, marked from 50 to 70.
EXERCISES
5.1 For the data set: 10, 8, 16, 6
(a) Calculate the deviations $(x - \bar{x})$ and check that they add to 0.
(b) Calculate the variance and the standard deviation.
5.2 Repeat (a) and (b) of Exercise 5.1 for the data set: 2.2, 1.4, 1.8, 1.2, 1.4.
5.3 For the data of Exercise 5.1, calculate $s^2$ by using the alternative formula.
5.4 For each data set calculate $s^2$.
(a) 2, 5, 4, 3, 3
(b) −1, 2, 0, −2, 1, −1
(c) 12, 11, 11, 12, 11, 11, 12
5.5 Find the standard deviation of the measurements of diameters given in Exercise 4.4.
5.6 For the data set of Exercise 4.6, calculate the interquartile range.
5.7 Calculations with the test scores data of Exercise 3.8 give $\bar{x}$ = 150.125 and s = 24.677.
(a) Find the proportion of the observations in the interval $\bar{x} \pm 2s$ and in the interval $\bar{x} \pm 3s$.
(b) Compare your findings in part (a) with those suggested by the empirical guideline for bell-shaped distributions.
5.8 Refer to the data of ozone measurements in Exercise 3.5.
(a) Calculate $\bar{x}$ and s.
(b) Find the proportion of the observations that are in the interval $\bar{x} \pm s$, in $\bar{x} \pm 2s$, and in $\bar{x} \pm 3s$.
(c) Compare the results of (b) with the empirical guideline.
5.9 Some properties of the standard deviation.
(i) If a fixed number c is added to all measurements in a data set, the deviations $(x - \bar{x})$ remain unchanged (see Exercise 4.7). Consequently, $s^2$ and s remain unchanged.
(ii) If all measurements in a data set are multiplied by a fixed number d, the deviations $(x - \bar{x})$ get multiplied by d. Consequently, $s^2$ gets multiplied by $d^2$, and s by |d|. (Note: Standard deviation is never negative.)
Verify these properties for the data set 5, 9, 9, 8, 10, 7, taking c = 4 in (i) and d = 2 in (ii).
5.10 Two cities provided the following information on public school teachers' salaries.

           Minimum      Q1      Median      Q3      Maximum
City A      18,400    24,000    28,300    30,400     36,300
City B      19,500    26,500    31,200    35,700     41,800

(a) Construct a boxplot for the salaries in City A.
(b) Construct a boxplot, on the same graph, for the salaries in City B.
5.11 Make a boxplot of the sales data in Table 4.
Graphs can give a vivid overall picture.
6. CONCLUDING REMARKS
Several numerical measures have been introduced for use in the descriptive summary of data. The alternative measures of center (or variation) place varying emphasis on particular features of the distribution of observations. For descriptive purposes, it is prudent to provide more than one statistic. Including the mean, the median, and perhaps the quartiles or other suitable percentiles helps specify where the data are located. Also, the individual values of unusual observations should be reported separately.
KEY IDEAS AND FORMULAS
Qualitative data refer to frequency counts in categories. These are summarized by calculating the

$$ \text{relative frequency} = \frac{\text{frequency}}{\text{total number of observations}} $$

for the individual categories.

Data obtained as measurements on a numerical scale are either discrete or continuous.

A discrete data set is summarized by a frequency distribution that lists the distinct data points and the corresponding relative frequencies. Either a line diagram or a histogram can be used for a graphical display.

Continuous measurement data should be graphed as a dot diagram when the data set is small, say fewer than 20 or 25 observations. Larger data sets are summarized by grouping the observations in class intervals, preferably of equal lengths. A list of the class intervals along with the corresponding relative frequencies provides a frequency distribution, which can be graphically displayed as a histogram.

A stem-and-leaf diagram is another effective means of display when the data set is not too large. It is more informative than a histogram because it retains the individual observations in each class interval instead of lumping them into a frequency count.

A summary of measurement data (discrete or continuous) should also include numerical measures of center and spread.
Two important measures of center are

$$ \text{sample mean } \bar{x} = \frac{\sum x}{n} $$

sample median = middlemost value of the ordered data set.

The quartiles and, more generally, percentiles are other useful locators of the distribution of a data set. The second quartile is the same as the median.

The amount of variation or spread of a data set is measured by the sample standard deviation s. The sample variance $s^2$ is given by

$$ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} $$

Also,

$$ s^2 = \frac{1}{n - 1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right] \quad \text{(convenient for hand calculation)} $$

$$ \text{sample standard deviation } s = +\sqrt{s^2} $$

The standard deviation indicates the amount of spread of the data points around the mean $\bar{x}$. If the histogram appears symmetric and bell-shaped, then the interval

    $\bar{x} \pm s$ includes approximately 68% of the data
    $\bar{x} \pm 2s$ includes approximately 95% of the data
    $\bar{x} \pm 3s$ includes approximately 99.7% of the data

Two other measures of variation are:

    sample range = largest observation − smallest observation

and

    sample interquartile range = third quartile − first quartile.

The five quantities, namely, the median, the first and third quartiles, the smallest observation, and the largest observation, together serve as useful indicators of the distribution of a data set. These are displayed in a boxplot.
7. EXERCISES
7.1 Recorded here are the numbers of civilian employed persons in the United States by major occupation groups for the years 1975 and 1979. (Source: Statistical Abstract of the United States, 1980.)

                        Number of Workers in Thousands
                            1975        1979
White-collar worker        42,226      49,343
Blue-collar worker         27,962      32,065
Service worker             11,659      12,934
Farm worker                 2,936       2,703
Total                      84,782      96,945
(a) For each year, calculate the relative frequencies of the occupation groups.
(b) Comment on changes in the occupation pattern between 1975 and 1979.
7.2 Table 12 gives data collected from the students attending an elementary statistics course at the University of Wisconsin. These data include: sex, height, number of years in college, and the general area of intended major (Humanities (H), Social Science (S), Biological Science (B), Physical Science (P)).
(a) Summarize the data of "intended major" in a frequency table.
(b) Summarize the data of "year in college" in a frequency table, and draw either a line diagram or a histogram.
(c) Plot the dot diagrams of heights separately for the male and female students, and compare.
7.3 Based on a sample survey conducted in 1975, the following frequency distribution was obtained for the age of groom at first marriage in the state of Wisconsin. (Source: Vital Statistics of the United States, 1975, Vol. III, Marriage and Divorce.)
Age Interval (see note below)    Number of Grooms
40 4,550 I 6,930
Total
29,270
6,rgo I,r40 330 160 l0 20
"The intervals include the lower end points but not the upper. Take the first interval as 16-18, and the last as 65-75.
Table 12 Class Data

Student No.   Sex   Height in Inches   Year in College   Intended Major
     1         F          67                  3                S
     2         M          72                  3                P
     3         M          70                  4                S
     4         M          70                  1                B
     5         F          61                  4                P
     6         F          66                  3                B
     7         M          71                  3                H
     8         M          67                  4                B
     9         M          65                  3                S
    10         F          67                  3                B
    11         M          74                  3                H
    12         M          68                  3                S
    13         M          74                  2                P
    14         F          64                  4                P
    15         M          69                  3                S
    16         M          64                  3                B
    17         M          72                  4                P
    18         M          71                  3                B
    19         F          67                  2                S
    20         M          70                  4                S
    21         M          66                  4                S
    22         F          67                  2                B
    23         M          68                  4                S
    24         M          71                  3                H
    25         M          75                  1                S
    26         M          67                  1                B
    27         M          68                  3                P
    28         M          72                  4                B
    29         F          68                  3                P
    30         F          66                  2                B
    31         F          65                  2                B
    32         M          64                  4                B
    33         M          72                  1                H
    34         M          67                  4                B
    35         M          73                  3                S
    36         F          71                  4                B
    37         M          71                  3                B
    38         M          69                  2                S
    39         F          69                  4                P
    40         M          74                  4                S
    41         M          73                  3                B
    42         M          68                  3                B
    43         F          66                  2                S
    44         M          73                  2                P
    45         M          73                  2                S
    46         M          67                  4                S
    47         F          62                  3                S
    48         M          68                  2                B
    49         M          71                  3                S
(a) Calculate the relative frequency of each class interval.
(b) Plot the relative frequency histogram. (Hint: Since the intervals have unequal widths, make the height of each rectangle equal to the relative frequency divided by the width of the interval.)
(c) What proportion of the grooms marry before age 25? After age 30?
7.4 The stem-and-leaf diagram given here shows the final examination scores of students in a sociology course.
(a) Find the median score.
(b) Find the quartiles Q1 and Q3.
(c) What proportion of the students scored below 70? 80 and over?
Stem-and-Leaf Plot of Scores

2 | 48
3 | 155
4 | 002
5 | 03368
6 | 0124479
7 | 22355689
8 | 004577
9 | 0025
7.5 The following data show the age at inauguration of each U.S. President.

Name                  Age at Inauguration
 1. Washington               57
 2. J. Adams                 61
 3. Jefferson                57
 4. Madison                  57
 5. Monroe                   58
 6. J. Q. Adams              57
 7. Jackson                  61
 8. Van Buren                54
 9. W. H. Harrison           68
10. Tyler                    51
11. Polk                     49
12. Taylor                   64
13. Fillmore                 50
14. Pierce                   48
15. Buchanan                 65
16. Lincoln                  52
17. A. Johnson               56
18. Grant                    46
19. Hayes                    54
20. Garfield                 49
21. Arthur                   50
22. Cleveland                47
23. B. Harrison              55
24. Cleveland                55
25. McKinley                 54
26. T. Roosevelt             42
27. Taft                     51
28. Wilson                   56
29. Harding                  55
30. Coolidge                 51
31. Hoover                   54
32. F. D. Roosevelt          51
33. Truman                   60
34. Eisenhower               62
35. Kennedy                  43
36. Johnson                  55
37. Nixon                    56
38. Ford                     61
39. Carter                   52
40. Reagan                   69
(a) Make a stem-and-leaf diagram.
(b) Find the median, Q1, and Q3.
(Hint: You may wish to repeat digits in the stem. The first holds the digits 0-4, the second holds the digits 5-9.)
7.6 Calculate the mean, variance and standard deviation for each of the following data sets:
(a) 8, 9, 7, 9, 12
(b) 23, 28, 23, 26
(c) −1.1, 1.6, .8, −.2, 2.9
7.7 (a) Calculate $\bar{x}$ and s for the data 9, 11, 7, 12, 11.
(b) Consider the data set 109, 111, 107, 112, 111, which is obtained by adding 100 to each number given in part (a). Use your results of (a) and the properties stated in Exercises 4.7 and 5.9 to obtain the $\bar{x}$ and s for this modified data set. Verify your results by direct calculations with this new data set.
(c) Consider the data set −27, −33, −21, −36, −33, which is obtained by multiplying each number of part (a) by −3. Repeat the problem given in part (b) for this new data set.
7.8 A reconnaissance study of radioactive materials was conducted in Alaska to call attention to anomalous concentrations of uranium in plutonic rocks. The amounts of uranium in 13 locations under the Darby mountains are:

7.92, 10.29, 19.89, 17.73, 10.36, 13.50, 8.81, 8.33, 9.32, 14.61, 6.18, 7.02, 11.71

(Source: T. Miller and C. Bunker, Journal of Research, U.S. Geological Survey (1976), pp. 367-377.)
Find:
(a) Mean.
(b) Standard deviation.
(c) Median and the quartiles.
(d) Interquartile range.
7.9 Refer to the class data in Exercise 7.2. Calculate:
(a) $\bar{x}$ and s for the heights of males.
(b) $\bar{x}$ and s for the heights of females.
(c) Median and the quartiles for the heights of males.
(d) Median and the quartiles for the heights of females.
7.10 In a genetic study, a regular food was placed in each of 20 vials and the number of flies of a particular genotype feeding on each vial was recorded. The counts of flies were also recorded for another set of 20 vials that contained grape juice. The following data sets were obtained (courtesy of C. Denniston and J. Mitchell):
No. of Flies (Regular Food)
15 20 31 16 12 22 23 33 38 28 25 20 21 23 29 26 40 20 19 31
No. of Flies (Grape Juice)
1312 2 1lL2 5190 2713201819999
516
(a) Plot separate dot diagrams for the two data sets.
(b) Make a visual comparison of the two distributions with respect to their centers and spreads.
(c) Calculate $\bar{x}$ and s for each data set.
7.11 The data below were obtained from a detailed record of purchases over several years (courtesy of A. Banerjee). The usage times (in weeks) per ounce of toothpaste for a household taken from a consumer panel were:

.74  .45  .80  .95  .84  .82  .78  .82  .89  .75  .76  .81
.85  .75  .89  .75  .89  .99  .71  .77  .55  .85  .77  .87

(a) Plot a dot diagram of the data.
(b) Find the relative frequency of the usage times that do not exceed
(c) Calculate the mean and the standard deviation.
(d) Calculate the median and the quartiles.
7.12 To study how first-grade students utilize their time when assigned to a math task, a researcher observes 24 students and records their times off task out of 20 minutes (courtesy of T. Romberg).
Times off task (minutes) 40 46 54 100
2241 2 97 137710 s39
7 8
For this data set, find: (a) Mean and standard deviation. (b) Median. (c) Range.
7.13 Blood cholesterol levels were recorded for 43 persons sampled in a medical study group and the following data were obtained (courtesy of G. Metter):

249  227  218  310  191  330  226  223  151  195  233
249  284  245  174  154  196  299  210  301  199  258
205  195  227  355  234  195  179  357  282  265  286
286  239  212  233  256  244  176  195  163  297

(a) Group the data into a frequency distribution.
(b) Plot the histogram and comment on the shape of the distribution.
7.14 For the blood cholesterol data in Exercise 7.13, calculate:
(a) $\bar{x}$ and s.
(b) Median and the quartiles.
(c) Range and interquartile range.
7.15 Refer to Exercises 7.13 and 7.14.
(a) Determine the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$.
(b) Find the proportion of the measurements in Exercise 7.13 that lie in each of these intervals.
(c) Compare your findings with the empirical guideline for bell-shaped distributions.
7.16 The following summary statistics were obtained from a data set: $\bar{x}$ = 80.5, s = 10.5
median - 84.0 Qr Qs - 96'0
Approximately what proportion of the observations are:
(a) Below 96.0?
(b) Above 84.0?
(c) In the interval 59.5-101.5?
(d) In the interval 75.5-96.0?
(e) In the interval 49.0-112.0?
State which of your answers are based on the assumption of a bell-shaped distribution.
7.17 The 50 measurements on acid rain in Wisconsin, whose histogram is given on the cover page of the chapter, are:

3.58  3.80  4.01  4.01  4.05  4.05  4.12  4.18  4.20  4.21
4.27  4.28  4.30  4.32  4.33  4.35  4.35  4.41  4.42  4.45
4.45  4.50  4.50  4.50  4.50  4.51  4.52  4.52  4.52  4.57
4.58  4.60  4.61  4.61  4.62  4.62  4.65  4.70  4.70  4.70
4.70  4.72  4.78  4.78  4.80  5.07  5.20  5.26  5.41  5.48

(a) Calculate the median and quartiles.
(b) Find the 90th percentile.
(c) Determine the mean and standard deviation.
(d) Display the data in the form of a boxplot.
7.18 Refer to Exercise 7.17.
(a) Determine the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, and $\bar{x} \pm 3s$.
(b) What proportions of the measurements lie in those intervals?
(c) Compare your findings with the empirical guideline for bell-shaped distributions.
7.19 One job hazard of firefighters is their exposure to the poisonous gas carbon monoxide (CO). In a study concerning the design specifications for respiratory breathing devices, Boston firefighters wore personal air samplers that measured maximum concentration of CO during actual fire situations. The 51 observations where the maximum level of CO reached or exceeded the presumably safe level of .05% (500 parts per million) are:
 .07   .07   .12   .95   .35   .10   .13   .06   .72   .13   .17
 .15   .27   .38   .09   .06   .58   .31   .12   .85   .05   .06
 .20   .39   .12   .07   .14  1.13   .10   .15   .20   .05   .22
 .10   .10   .19  2.40   .57   .11   .40   .50   .14   .12   .08
 .29   .09  2.70   .12   .11   .05   .22

(Source: U.S. Dept. of HEW Report (NIOSH) 76-121.)
(a) Calculate the median and quartiles.
(b) Calculate the mean and standard deviation.
(c) Group the data into a frequency distribution (unequal intervals are appropriate here).
(d) Display the data in the form of a boxplot.
7.20 The z-scale (or standard scale) measures the position of a data point relative to the mean and in units of the standard deviation. Specifically,

$$ z\text{-value of a measurement} = \frac{\text{measurement} - \bar{x}}{s} $$

When two measurements originate from different sources, converting them to the z-scale helps to draw a sensible interpretation of their relative magnitudes. For instance, suppose a student scored 65 in a math course and 72 in a history course. These (raw) scores tell little about the student's performance. If the class averages and standard deviations were $\bar{x}$ = 60, s = 20 in math and $\bar{x}$ = 78, s = 10 in history, this student's

$$ z\text{-score in math} = \frac{65 - 60}{20} = .25, \qquad z\text{-score in history} = \frac{72 - 78}{10} = -.6 $$

Thus, the student was .25 standard deviations above the average in math, and .6 standard deviations below the average in history.
(a) If $\bar{x}$ = 490 and s = 120, find the z-values of 350 and 620.
(b) For a z-score of 2.4, what is the raw score if $\bar{x}$ = 210 and s = 50?
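The conversion to the z-scale, and back from it, is a one-line calculation each way. The Python sketch below uses the math and history scores discussed in Exercise 7.20, with class averages 60 and 78 and standard deviations 20 and 10; the function names `z_value` and `raw_score` are chosen here for illustration.

```python
def z_value(measurement, mean, sd):
    """Position of a measurement relative to the mean, in standard deviation units."""
    return (measurement - mean) / sd


def raw_score(z, mean, sd):
    """Inverse conversion: the measurement that corresponds to a given z-value."""
    return mean + z * sd


if __name__ == "__main__":
    print("math:   ", z_value(65, 60, 20))
    print("history:", z_value(72, 78, 10))
    # Converting the math z-value back recovers the raw score.
    print("raw math score:", raw_score(z_value(65, 60, 20), 60, 20))
```

A positive z-value places the measurement above the mean and a negative one below it, which is why the two raw scores can be compared even though they come from different classes.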
7.21 The winning times of the men's 400-meter freestyle swimming in the Olympics (1908-1980) appear below.
(a) Draw a dot diagram and label the points according to time order.
(b) Explain why it is not reasonable to group the data into a frequency distribution.

Winning Times in Minutes and Seconds

Year     Time         Year     Time
1908    5:35.8        1952    4:30.7
1912    5:24.4        1956    4:27.3
1920    5:26.8        1960    4:18.3
1924    5:04.2        1964    4:12.2
1928    5:01.6        1968    4:09.0
1932    4:48.4        1972    4:00.27
1936    4:44.5        1976    3:51.93
1948    4:41.0        1980    3:51.31
7.22 The mode of a collection of observations is defined as the observed value with largest relative frequency. The mode is sometimes used as a center value. There can be more than one mode in a data set. Find the mode for the data given in Exercise 3.3.
7.23 Computer-aided statistical calculations. Calculations of the descriptive statistics such as $\bar{x}$ and s are increasingly tedious with larger data sets. Modern computers have come a long way in alleviating the drudgery of hand calculations. MINITAB is one computing package that is easily accessible to students because its commands are in simple English. We illustrate the usefulness of the computer with a few commands of the MINITAB system. The command SET C1 (set data in column 1) places the data in the computer. The command DESCRIBE C1 produces the results, which include $\bar{x}$ and s along with a record of n, the sample size. We use these commands with a data set of the strength measurements of 61 specimens of southern pine (Source: U.S. Forest Products Laboratory). The output from the computer prints the data set and provides n, $\bar{x}$, and s.

    SET C1
    4001 4949 3927 3048 3530 3075 4027 4263 3271 3421
    3686 4103 5005 4387 3470 3571 3894 4315 3531 3332
    3401 3991 3510 3340 3738 4262 3078 3987 3285 3601
    2866 2884 3214 4298 4012 5157 4232 3607 4120 3739
    3717 3561 3819 3670 4000 3445 3797 3550 3598 3852
    3889 4349 3544 4846 4003 3173 3694 4749 4256 3147
    4071

    DESCRIBE C1
    C1    N = 61    MEAN = 3801.0    ST.DEV. = 514.

Use MINITAB (or some other package program) to find $\bar{x}$ and s for:
(a) Ozone data in Exercise 3.5.
(b) The acid rain data in Exercise 7.17.
7.24 (Further MINITAB manipulations and calculations) The basic operation of ordering data and calculation of $\bar{x}$, s, and median can be done by individual commands. Histograms can also be created. We illustrate with the strength data from the previous exercise.

    ORDER C1 PUT INTO C2
    PRINT C2
    C2
    2866  2884  3048  3075  3078
    3147  3173  3214  3271  3285
    3332  3340  3401  3421  3445
    3470  3510  3530  3531  3544
    3550  3561  3571  3598  3601
    3607  3670  3686  3694  3717
    3738  3739  3797  3819  3852
    3889  3894  3927  3987  3991
    4000  4001  4003  4012  4027
    4071  4103  4120  4232  4256
    4262  4269  4298  4315  4349
    4387  4749  4846  4949  5005
    5157

    AVERAGE C1
       MEAN = 3801.0
    MEDIAN C1
       MEDIAN = 3738.0
    STANDARD DEVIATION C1
       ST.DEV. = 513.54
    HISTOGRAM C1

    C1
    MIDDLE OF    NUMBER OF
    INTERVAL     OBSERVATIONS
      2800            2   **
      3000            3   ***
      3200            5   *****
      3400            6   ******
      3600           13   *************
      3800            8   ********
      4000            9   *********
      4200            7   *******
      4400            3   ***
      4600            0
      4800            2   **
      5000            2   **
      5200            1   *

From the ordered strength data:
(a) Obtain the quartiles.
(b) Construct a histogram with five cells. Locate the mean, median, Q1, and Q3 on your graph.
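Readers without access to MINITAB can obtain the same summaries with other software. The Python sketch below is one rough counterpart of the DESCRIBE, ORDER, and HISTOGRAM steps shown in Exercises 7.23 and 7.24, built on the standard `statistics` module. The function names are chosen here for illustration, and the rule that each interval includes its lower end point but not its upper one is an assumption about how the counts are formed, not a documented MINITAB convention.

```python
import statistics


def describe(data):
    """Return n, mean, median, and sample standard deviation (divisor n - 1)."""
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "st_dev": statistics.stdev(data),
    }


def text_histogram(data, midpoints, width):
    """One text row per interval [mid - width/2, mid + width/2)."""
    rows = []
    for mid in midpoints:
        low, high = mid - width / 2, mid + width / 2
        count = sum(1 for x in data if low <= x < high)
        rows.append(f"{mid:>6} {count:>3} " + "*" * count)
    return rows


if __name__ == "__main__":
    data = [3, 5, 7, 7, 8]          # data of Example 9, used as a small check
    print(sorted(data))             # counterpart of ORDER and PRINT
    print(describe(data))           # counterpart of DESCRIBE
    for row in text_histogram(data, midpoints=[4, 6, 8], width=2):
        print(row)                  # counterpart of HISTOGRAM
```

For the data of Example 9 the summary reports a mean of 6 and a standard deviation of 2, in agreement with Examples 9 and 10.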
CHAPTER 3
Descriptive Study of Bivariate Data

1. INTRODUCTION
2. SUMMARIZATION OF BIVARIATE CATEGORICAL DATA
3. SCATTER PLOT OF BIVARIATE MEASUREMENT DATA
4. THE CORRELATION COEFFICIENT - A MEASURE OF LINEAR RELATION
5. PREDICTION OF ONE VARIABLE FROM ANOTHER (LINEAR REGRESSION)
Chapter 3
The Apollo moon landings made it possible to study first hand the geology of the moon. In their quest for clues to the origin and composition of the moon, scientists performed chemical analyses of lunar rock specimens collected by the astronauts.
lts
I
c o -o L
850
50 75 (ppm) Hydrogen Contentof moonrocks
Apollo ostronout collecting moon rocks.
1. INTRODUCTION
In Chapter 2 we discussed the organization and summary description of data concerning a single variable. Observations on two or more variables are often recorded for the individual sampling units. By studying such bivariate or multivariate data one typically wishes to discover if any relationships exist between the variables, how strong the relationships appear to be, and whether one variable of primary interest can be effectively predicted from information on the values of the other variables. To illustrate the concepts, we restrict our attention to the simplest case where only two characteristics are observed on the individual sampling units. Some examples are:
Sex and the type of occupation of college graduates.
Smoking habit and lung capacity of adult males.
Average daily carbohydrate intake and protein intake of 10-year-old children.
The amount of fertilizer used and the yield per acre.
The two characteristics observed may both be qualitative traits, both numerical variables, or one of each kind. For brevity, we will only deal with situations where the characteristics observed are either both categorical or both numerical. Summarization of bivariate categorical data is discussed in Section 2. Sections 3, 4, and 5 are concerned with bivariate measurement data and treat such issues as graphical presentations, examination of relationship, and prediction of one variable from another.
2. SUMMARIZATION OF BIVARIATE CATEGORICAL DATA
When two traits are observed for the individual sampling units, and each trait is recorded in some qualitative categories, the resulting data can be summarized in the form of a two-way frequency table. The categories for one trait are marked along the left margin, those for the other along the upper margin, and the frequency counts recorded in the cells. Data in this summary form are commonly called cross-classified or cross-tabulated data. In statistical terminology they are also called contingency tables.
EXAMPLE 1
A survey was conducted by sampling 400 persons who were questioned regarding union membership and attitude toward decreased national spending on social welfare programs. The cross-tabulated frequency counts are presented in Table 1.
Table 1
             Support   Indifferent   Opposed   Total
Union          112          36          28      176
Non-Union       84          68          72      224
Total          196         104         100      400
The entries of this table are self-explanatory. For instance, of the 400 persons polled, there were 176 union members. Among these union members 112 expressed support, 36 were indifferent, and 28 opposed. A further understanding of how the responses are distributed can be gained by calculating the relative frequencies of the cells. For this purpose, we divide each cell frequency by the sample size 400. The relative frequencies (for instance 84/400 = .21) are shown in Table 2.
Table 2 Relative Frequencies for the Data of Table 1
             Support   Indifferent   Opposed   Total
Union          .28         .09         .07      .44
Non-Union      .21         .17         .18      .56
Total          .49         .26         .25     1.00
Depending on the specific context of a cross-tabulation one may also wish to examine the cell frequencies relative to a marginal total. In Example 1, you may wish to compare the attitude patterns of the union members with that of the non-members. This is accomplished by calculating the relative frequencies separately for the two groups (for instance, 84/224 = .375), as Table 3 shows.
Table 3
             Support   Indifferent   Opposed   Total
Union          .636       .205         .159    1.000
Non-Union      .375       .304         .321    1.000
From the calculations in Table 3, it appears that the attitude patterns are different between the two groups: support seems to be stronger among union members than among non-members. Now the pertinent question is: Can these observed differences be explained by chance or are there real differences of attitude between the populations of members and non-members? We will pursue this aspect of statistical inference in Chapter 14.
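The two kinds of relative frequency used in Example 1 (each cell divided by the grand total, and each cell divided by its own row total) can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original text. The cell counts are those of Table 1, and the variable names are hypothetical.

```python
# Cell counts from Table 1; columns are Support, Indifferent, Opposed.
counts = {
    "Union": [112, 36, 28],
    "Non-Union": [84, 68, 72],
}

grand_total = sum(sum(row) for row in counts.values())  # sample size, 400

# Relative to the whole sample, as in Table 2.
overall = {
    group: [round(cell / grand_total, 2) for cell in row]
    for group, row in counts.items()
}

# Relative to each row total, as in Table 3.
by_row = {
    group: [round(cell / sum(row), 3) for cell in row]
    for group, row in counts.items()
}

for group in counts:
    print(group, overall[group], by_row[group])
```

Dividing by the row total is what makes the two groups comparable, since the union and non-union samples are of different sizes.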
EXERCISES
2.1 Nausea from air sickness affects some travelers. A drug company, wanting to establish the effectiveness of its motion sickness pill, randomly gives either its pill or a look-alike sugar pill (placebo) to 200 passengers.
Degree of Nausea Moderate
Severe
Slight
None
4336183 33 19
12
36
Total
(a) Complete the marginal totals.
(b) Calculate the relative frequencies separately for each row.
(c) Comment on any apparent differences in response between the pill and the placebo.
2.2 Records of drivers with a major medical condition (diabetes, heart condition, or epilepsy) and also of a group of drivers with no known health conditions were retrieved from a Motor Vehicle Department. Drivers in each group were classified according to their driving record in the last year.
                      Traffic Violations
Medical Condition   One or More   None   Total
Diabetes                 41        119     160
Heart condition          39        121     160
Epilepsy                 78         72     150
None (control)           43        157     200
Compare each medical condition with the control group by calculating the appropriate relative frequencies.
2.3 A survey was conducted to study the attitudes of the faculty, academic staff, and students in regard to a proposed measure for reducing the heating and air-conditioning expenses on campus.
                  Favor   Indifferent   Opposed   Total
Faculty             36         42         122      200
Academic staff      44         77         129      250
Students           106        178         116      400
Compare the attitude patterns of the three groups by computing the relative frequencies.
2.4 Interviews with 185 persons engaged in a stressful occupation revealed that 76 were alcoholics, 81 were mentally depressed, and 54 were both.
(a) Based on these records, complete the following two-way frequency table.
(b) Calculate the relative frequencies.
                  Alcoholic   Not Alcoholic
Depressed
Not depressed
Total
2.5 Cross tabulate the "Class data" of Exercise 7.2 in Chapter 2 according to sex (M, F) and the general areas of intended major (H, S, B, P). Calculate the relative frequencies. 2.6 A psychologist interested in obese children gathered data on a group of children and their parents.
                              Child
Parent                 Obese   Not obese
At least one obese
Neither obese
(a) Calculate the marginal totals.
(b) Convert the frequencies to relative frequencies. (c) Calculate the relative frequencies separately for each row.
3. SCATTER PLOT OF BIVARIATE MEASUREMENT DATA
We now turn to a description of data sets concerning two variables, each measured on a numerical scale. For ease of reference we will label one variable x and the other y. Thus, two numerical observations (x, y) are recorded for each sampling unit. These observations are paired in the sense that an (x, y) pair arises from the same sampling unit. An x observation from one pair and an x or y from another are unrelated. For n sampling units, we can write the measurement pairs as
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
The set of x measurements alone, disregarding the y measurements, constitutes a data set for one variable. The methods of Chapter 2 can be employed for descriptive purposes including graphical presentation of the pattern of distribution of the measurements, calculation of the mean, standard deviation, and other quantities. Likewise, the y measurements can be studied disregarding the x measurements. However, a major purpose of collecting bivariate data is to answer such questions as:
Are the variables related?
What form of relationship is indicated by the data?
Can we quantify the strength of their relation?
Can we predict one variable from the other?
Studying either the x measurements by themselves or the y measurements by themselves would not help answer these questions.
An important first step in studying the relationship between two variables is to graph the data. To this end, the variable x is marked along the horizontal axis and y on the vertical axis on a graph paper. The pairs (x, y) of observations are then plotted as dots on the graph. The resulting diagram is called a scatter diagram. By looking at the scatter diagram, a visual impression can be formed about the relation between the variables. For instance, we can observe whether the points band around a line or a curve, or whether they form a patternless cluster.
EXAMPLE 2
Recorded in Table 4 are the data of
x = Undergraduate GPA
y = Score in the Graduate Management Aptitude Test (GMAT)
for applicants seeking admission to a Masters of Business Administration program. The scatter diagram is plotted in Figure 1. The southwest to northeast pattern of the points indicates a positive relation between x and y: that is, the applicants with high GPA tend to have high GMAT. Evidently, the relation is far from a perfect mathematical relation.
Table 4 Data of Undergraduate GPA (x) and GMAT Score (y)
   x      y        x      y        x      y
 3.63    447     2.36    399     2.80    444
 3.59    588     2.36    482     3.13    416
 3.30    563     2.66    420     3.01    471
 3.40    553     2.68    414     2.79    490
 3.50    572     2.48    533     2.89    431
 3.78    591     2.46    509     2.91    446
 3.44    692     2.63    504     2.75    546
 3.48    528     2.44    336     2.73    467
 3.47    552     2.13    408     3.12    463
 3.35    520     2.41    469     3.08    440
 3.39    543     2.55    538     3.03    419
                                 3.00    509
Figure 1 Scatter plot of applicants' scores.
4. THE CORRELATION COEFFICIENT: A MEASURE OF LINEAR RELATION
The scatter diagram provides a visual impression of the nature of relation between the x and y values in a bivariate data set. In a great many cases the points appear to band around a straight line; but, to varying extents, chance fluctuations rule out a strictly linear relation. Our visual impression of the closeness of the scatter to a linear relation can be quantified by calculating a numerical measure, called the correlation coefficient.
The correlation coefficient, denoted by r, is a measure of strength of the linear relation between the x and y variables. Before introducing its formula, we outline some important features of the correlation coefficient, and discuss the manner in which it serves to measure the strength of a linear relation.
(a) The value of r is always between -1 and +1.
(b) The magnitude of r indicates the strength of a linear relation while its sign indicates the direction. More specifically:
r > 0 if the pattern of (x, y) values is a band that runs from lower left to upper right.
r < 0 if the pattern of (x, y) values is a band that runs from upper left to lower right.
r = +1 if all (x, y) values lie exactly on a straight line with a positive slope (perfect positive linear relation).
r = -1 if all (x, y) values lie exactly on a straight line with a negative slope (perfect negative linear relation).
A high numerical value of r, that is, a value close to +1 or -1, represents a strong linear relation.
(c) A value of r close to zero means that the linear association is very weak.
The correlation coefficient is close to zero when there is no visible pattern of relation; that is, the y values do not change in any direction as the x values change. A value of r near zero could also happen because the points band around a curve that is far from linear. After all, r measures linear association, and a markedly bent curve is far from linear.
Figure 2 shows the correspondence between the appearance of a scatter diagram and the value of r. Observe that Figures 2e and 2f correspond to situations where r = 0. The zero correlation in Figure 2e is due to the absence of any relation between x and y, while in Figure 2f it is due to a relation following a curve that is far from linear.
Figure 2 Correspondence between the values of r and the amount of scatter.
CALCULATION OF r
The value of r is calculated from n pairs of observations (x, y) according to the following formula:
Correlation Coefficient
$$r = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}}$$
where
$$S_{xy} = \sum (x - \bar{x})(y - \bar{y}), \qquad S_{xx} = \sum (x - \bar{x})^2, \qquad S_{yy} = \sum (y - \bar{y})^2$$
The quantities $S_{xx}$ and $S_{yy}$ are the sums of squared deviations of the x observations and the y observations, respectively. $S_{xy}$ is the sum of cross products of the x deviations with the y deviations. This formula will be examined in more detail in Chapter 12.
EXAMPLE 3
Calculate r for the n = 4 pairs of observations
(2, 5), (1, 3), (5, 6), (0, 2)
We first determine the mean x̄ and the deviations x - x̄, and then the mean ȳ and the deviations y - ȳ. See Table 5.
Table 5 Calculation of r
          x     y    x - x̄   y - ȳ   (x - x̄)²   (y - ȳ)²   (x - x̄)(y - ȳ)
          2     5      0       1        0          1             0
          1     3     -1      -1        1          1             1
          5     6      3       2        9          4             6
          0     2     -2      -2        4          4             4
Total     8    16                      14         10            11
        x̄ = 2  ȳ = 4                  S_xx       S_yy          S_xy
Consequently,
$$r = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}} = \frac{11}{\sqrt{14}\,\sqrt{10}} = .930$$
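It is sometimes convenient, when using hand-held calculators, to evaluate r using the alternative formulas for $S_{xx}$, $S_{yy}$ and $S_{xy}$:
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}, \qquad S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}, \qquad S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$$
This calculation is illustrated in Table 6.
Table 6 Alternate Calculation of r
          x     y    x²    y²    xy
          2     5     4    25    10
          1     3     1     9     3
          5     6    25    36    30
          0     2     0     4     0
Total     8    16    30    74    43
         Σx    Σy   Σx²   Σy²   Σxy
$$r = \frac{43 - \frac{8 \times 16}{4}}{\sqrt{30 - \frac{(8)^2}{4}}\,\sqrt{74 - \frac{(16)^2}{4}}} = \frac{11}{\sqrt{14}\,\sqrt{10}} = .930$$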
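As a check on the two ways of computing r, the following Python sketch evaluates the correlation coefficient for the four pairs of Example 3, once from the deviations and once from the sums of squares and products. It is an illustration only; the function names are hypothetical and not taken from the text.

```python
import math


def correlation_deviation_form(xs, ys):
    """r from sums of squared deviations and cross products of deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


def correlation_shortcut_form(xs, ys):
    """r from the raw sums: sum x, sum y, sum x^2, sum y^2, sum xy."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


xs = [2, 1, 5, 0]
ys = [5, 3, 6, 2]
r_deviation = correlation_deviation_form(xs, ys)
r_shortcut = correlation_shortcut_form(xs, ys)
print(round(r_deviation, 3), round(r_shortcut, 3))
```

Both forms are algebraically identical. The shortcut form avoids computing the means first, which is why it suits a hand calculator.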
We remind the reader that r measures the closeness of the pattern of scatter to a line. Figure 2f presents a strong relationship between x and y, but one that is not linear. The small value of r for these data does not properly reflect the strength of the relation. Clearly, r is not an appropriate summary of a curved pattern. Another situation where the sample correlation coefficient, r, is not appropriate occurs when the scatter plot breaks into two clusters. Faced with separate clusters as depicted in Figure 3, it is best to try and determine the underlying cause. It may be that a part of the sample has come from one population and a part from another.
Figure 3 r is not appropriate: samples from two populations.
CORRELATION AND CAUSATION
Data analysts often jump to unjustified conclusions by mistaking an observed correlation for a cause-effect relationship. A high sample correlation coefficient does not necessarily signify a causal relation between two variables. A classic example concerns an observed high positive correlation between the number of storks sighted and the number of births in a European city. Hopefully, no one would use this evidence to conclude that storks bring babies, or, worse yet, that killing storks would control population growth.
The observation that two variables tend to simultaneously vary in a certain direction does not imply the presence of a direct relationship between them. If we record the monthly number of homicides x and the monthly number of religious meetings y for several cities of widely varying sizes, the data will probably indicate a high positive correlation. It is the fluctuation of a third variable (namely, the city population) that causes x and y to vary in the same direction, despite the fact that x and y may be unrelated or even negatively related.
Picturesquely, the third variable, which in this example is actually causing the observed correlation between crime and religious meetings, is referred to as a lurking variable. The false correlation that it produces is called spurious correlation. It is more a matter of common sense than of statistical reasoning to determine if an observed correlation can be practically interpreted or if it is spurious.
WARNING: An observed correlation between two variables may be spurious. That is, it may be caused by the influence of a third variable.
When using the correlation coefficient as a measure of relationship, we must be careful to avoid the possibility that a lurking variable is affecting any of the variables under consideration.
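A small numerical illustration may make the lurking-variable effect concrete. The Python sketch below uses invented figures for five hypothetical cities; none of these numbers come from the text. The monthly counts x and y are each obtained by multiplying the population by a per-capita rate, and the two sets of rates are chosen to be negatively related.

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient r = S_xy / (sqrt(S_xx) * sqrt(S_yy))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


# Invented illustration: populations in thousands for five hypothetical cities.
population = [10, 20, 50, 100, 200]
# Invented per-thousand rates, deliberately negatively related to each other.
rate_x = [1.0, 1.1, 0.9, 1.05, 0.95]
rate_y = [2.1, 1.9, 2.0, 1.95, 2.05]

# Observed counts are driven mainly by city size, the lurking variable.
count_x = [p * rx for p, rx in zip(population, rate_x)]
count_y = [p * ry for p, ry in zip(population, rate_y)]

r_counts = correlation(count_x, count_y)  # strongly positive
r_rates = correlation(rate_x, rate_y)     # negative once size is removed
print(round(r_counts, 3), round(r_rates, 3))
```

In this made-up example the counts are almost perfectly correlated even though the per-capita rates move in opposite directions, because the population drives both counts.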
EXERCISES
4.1 Plot the scatter diagram of the data and calculate the correlation coefficient.
4.2 (a) Construct scatter plots of the data sets:
(ii)
(b) Calculate r for the data set (i).
(c) Guess the value of r for the data set (ii) and then calculate r. (Note: the x and y values are the same for both sets, but they are paired differently in the two cases.)
4.3 Match the following values of r with the correct pictures: (i) r = -.3, (ii) r = .1, (iii) r = .9.
Figure 4 [scatter plot panels (a), (b)]
4.4 Calculations from a data set of n = 4g pairs of (x, y) values have provided the following results:
$$\sum (x - \bar{x})^2 = 260.2, \qquad \sum (y - \bar{y})^2 = 408.7, \qquad \sum (x - \bar{x})(y - \bar{y}) = 298.8$$
Obtain the correlation coefficient.
4.5 For a data set of (x, y) pairs one finds that
$$n = 26, \qquad \sum x = 1287, \qquad \sum y = 1207$$
$$\sum x^2 = 66831, \qquad \sum y^2 = 59059, \qquad \sum xy = 62262$$
Calculate the correlation coefficient.
4.6 The following height and weight measurements were recorded for several recent Miss Americas:
Height (inches)
Weight (pounds) Height (inches) Weight (pounds)
65
67
66 6s.5
Lt4 na rrc 69
66
67
66
118 115 r24 124 115 116
67 6s.s 68
135 r25
6s 66.s
110 tzl
67
6B
69
68
11B r20 r25 I 19
(a) Plot the scatter diagram. (b) Calculate r.
4.7 Heating and combustion analyses were performed in order to study the composition of moon rocks collected by Apollo 14 and 15 crews. Recorded here are the determinations of hydrogen (H) and carbon (C) in parts per million (p.p.m.) for eleven specimens. (Source: Journal of Research, U.S. Geological Survey, Vol. 2, 1974.)
Hydrogen (p.p.m.)
r20
Carbon (p.p.m.)
10s 110 99 22 50 s0 7.3 74 7.7 4s 5l
82 90
38 20 2.8 66 2.0 20 B 5
Calculate r.
4.8 Recorded here are the scores of 16 students at the midterm and final examinations of an intermediate statistics course.
Midterm | 81 75 7L 6T Final
96 s6 85 18 70 77 71 9r
80 82 83 s7 100 30 68 s6 40 87 5s 86 7 7 68
Final
82 s7 75 47
(a) Plot the scatter diagram and identify any unusual point. (b) Calculate r. 4.9
In each instance, would you expect a positive, negative, or zero correlation?
(a) Number of sales persons and total dollar sales for real estate firms.
(b) Total payroll and percent of wins of national league baseball teams.
(c) Number of master points earned in bridge tournaments and number of tournaments entered.
(d) Age of adults and their ability to maintain a strenuous exercise program.
4.10 In an experiment to study the relation between the dose of a stimulant and the time a stimulated subject takes to respond to an auditory signal, the following data were recorded:
Dose (milligrams)
Reaction time (seconds)
I2
13 L4
3.5 2.4 2.t 1.3 r.2 2.2 2.6 4.2
(a) Calculate the correlation coefficient.
(b) Plot the data and comment on the usefulness of r as a measure of relation.
4.11 A further property of r. Suppose all x measurements are changed to x' = ax + b, and all y measurements to y' = cy + d, where a, b, c, and d are fixed numbers. Then the correlation coefficient remains unchanged if a and c have the same signs; it changes sign but not numerical value if a and c are of opposite signs.
This property of r can be verified along the lines of Exercise 5.9 in Chapter 2. In particular, the deviations $(x - \bar{x})$ change to $a(x - \bar{x})$ and the deviations $(y - \bar{y})$ change to $c(y - \bar{y})$. Consequently, $\sqrt{S_{xx}}$, $\sqrt{S_{yy}}$ and $S_{xy}$ change to $|a|\sqrt{S_{xx}}$, $|c|\sqrt{S_{yy}}$ and $ac\,S_{xy}$, respectively (recall that we must take the positive square root of a sum of squares of the deviations). Therefore, r changes to:
$$\frac{ac}{|a|\,|c|}\, r = \begin{cases} r & \text{if } a \text{ and } c \text{ have the same signs} \\ -r & \text{if } a \text{ and } c \text{ have opposite signs} \end{cases}$$
(a) For a numerical verification of this property of r, consider the data of Exercise 4.1. Change the x and y measurements according to
x' = ...
y' = ...
Calculate r from the (x', y') measurements and compare with the result of Exercise 4.1.
(b) Suppose from a data set of height measurements in inches and weight measurements in pounds, the value of r is found to be .86. What would be the value of r if the heights were measured in centimeters and weights in kilograms?
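The invariance property of r under changes of scale and origin can also be checked numerically. The Python sketch below is an illustration under stated assumptions: it reuses the four pairs of Example 3 because the data of Exercise 4.1 are not reproduced here, and the constants chosen for the two transformations are arbitrary.

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient r = S_xy / (sqrt(S_xx) * sqrt(S_yy))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))


def transform(values, scale, shift):
    """Apply v' = scale * v + shift to every measurement."""
    return [scale * v + shift for v in values]


xs = [2, 1, 5, 0]  # data of Example 3, used here as a stand-in
ys = [5, 3, 6, 2]
r = correlation(xs, ys)

# Same signs (a = 2.54, c = 0.45359237): a pure change of units, so r is unchanged.
r_same_sign = correlation(transform(xs, 2.54, 0), transform(ys, 0.45359237, 0))

# Opposite signs (a = 2, c = -5): the numerical value is kept, the sign flips.
r_opposite_sign = correlation(transform(xs, 2, 3), transform(ys, -5, 1))

print(round(r, 3), round(r_same_sign, 3), round(r_opposite_sign, 3))
```

The first transformation uses the inch-to-centimeter and pound-to-kilogram factors, which are both positive, so it is an instance of the same-sign case.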
5. PREDICTION OF ONE VARIABLE FROM ANOTHER (LINEAR REGRESSION)
An experimental study of the relation between two variables is often motivated by a need to predict one from the other. The administrator of a job training program may wish to study the relation between the duration of training and the score of the trainee on a subsequent skill test. A forester may wish to estimate the timber volume of a tree from the measurement of the trunk diameter a few feet above the ground. A medical technologist may be interested in predicting the blood alcohol measurement from the readout of a newly devised breath analyzer.
In such contexts as these, the predictor or input variable is denoted by x, and the response or output variable is labeled y. The object is to find the nature of relation between x and y from experimental data, and use the relation to predict the response variable y from the input variable x. Naturally, the first step in such a study is to plot and examine the scatter diagram. If a linear relation emerges, calculation of the numerical value of r will confirm the strength of the linear relation. Its value indicates how effectively y can be predicted from x by fitting a straight line to the data.
A line is determined by two constants: its height above the origin (intercept), and the amount that y increases whenever x is increased by 1 unit (slope). See Figure 5. Chapter 12 explains an objective method of best fitting a straight line, called the method of least squares.
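Figure 5 The line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.
Equation of the best fitting line:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
where
$$\text{slope } \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad \text{intercept } \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
EXAMPLE 4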
A chemist wishes to study the relation between the drying time of a paint and the concentration of a base solvent that facilitates a smooth application. The data of concentration setting (x) and the observed drying times (y) are recorded in the first two columns of Table 7.
Table 7 Data of Concentration x and Drying Time y (in minutes) and the Basic Calculations
          x     y    x²    y²    xy
          0     1     0     1     0
          1     5     1    25     5
          2     3     4     9     6
          3     9     9    81    27
          4     7    16    49    28
Total    10    25    30   165    66
The scatter diagram in Figure 6 gives the appearance of a linear relation. To calculate r and to determine the equation of the fitted line, we first calculate the basic quantities $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{yy}$, and $S_{xy}$ in Table 7.
Figure 6 Scatter diagram.
$$\bar{x} = \frac{10}{5} = 2, \qquad \bar{y} = \frac{25}{5} = 5$$
$$S_{xx} = 30 - \frac{(10)^2}{5} = 10$$
$$S_{yy} = 165 - \frac{(25)^2}{5} = 40$$
$$S_{xy} = 66 - \frac{10 \times 25}{5} = 16$$
$$r = \frac{16}{\sqrt{40 \times 10}} = \frac{16}{20} = .8$$
$$\hat{\beta}_1 = \frac{16}{10} = 1.6, \qquad \hat{\beta}_0 = 5 - (1.6)(2) = 1.8$$
The equation of the fitted line is
$$\hat{y} = 1.8 + 1.6x$$
The line is shown on the scatter diagram in Figure 6. If we are to predict the drying time y corresponding to the concentration 2.5, we substitute x = 2.5 in our prediction equation and get the result
Predicted drying time at x = 2.5: 1.8 + 1.6(2.5) = 5.8 minutes
Graphically, this amounts to reading the ordinate of the fitted line at x = 2.5.
Here we have only outlined the basic ideas concerning the prediction of one variable from another in the context of a linear relation. Chapter 12 expands upon these ideas, and treats statistical inferences associated with the prediction equation.
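The arithmetic of Example 4 can be retraced with a short Python sketch. It is offered as an illustration only: the function name fit_line is hypothetical, and the code simply applies the slope and intercept formulas of this section to the concentration and drying-time data of Table 7.

```python
import math


def fit_line(xs, ys):
    """Return (intercept, slope, r) for the least squares line through (x, y) pairs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx                    # beta_1 hat = S_xy / S_xx
    intercept = y_bar - slope * x_bar      # beta_0 hat = y bar - beta_1 hat * x bar
    r = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))
    return intercept, slope, r


concentration = [0, 1, 2, 3, 4]   # x, from Table 7
drying_time = [1, 5, 3, 9, 7]     # y in minutes, from Table 7

intercept, slope, r = fit_line(concentration, drying_time)
predicted = intercept + slope * 2.5   # prediction at concentration 2.5

print(f"fitted line: y = {intercept:.1f} + {slope:.1f} x, r = {r:.1f}")
print(f"predicted drying time at x = 2.5: {predicted:.1f} minutes")
```

Computing the slope first and then the intercept follows the order of the formulas, since the intercept depends on the slope.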
EXERCISES
5.1 Plot the line y = 10 - 3x on graph paper by locating the points for x = 0 and x = 3. What is its intercept? What is its slope?
5.2 A store manager has determined that the monthly profit (y) realized from selling a particular brand of car battery is given by
, --''
Exercises 73 Y_10x
I
where x denotes the number of these batteries sold in a month.
(a) If 41 batteries were sold in a month, what was the profit?
(b) At least how many batteries must be sold in a month in order to make a profit?
5.3 Identify the predictor variable x and the response variable y in each of the following situations.
(a) A training director wishes to study the relationship between the duration of training for new recruits and their performance in a skilled job.
(b) The aim of a study is to relate the carbon monoxide level in blood samples from smokers with the average number of cigarettes they smoke per day.
(c) An agronomist wishes to investigate the growth rate of a fungus in relation to the level of humidity in the environment.
(d) A market analyst wishes to relate the expenditures incurred in promoting a product in test markets and the subsequent amount of product sales.
5.4 Given these five pairs of (x, y) values:
(a) Plot the points on graph paper.
(b) From a visual inspection, draw a straight line that appears to fit the data well.
(c) Compute the least squares estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ and draw the fitted line.
5.5 In an experiment designed to study the relation between the yield (y in grams) of a chemical process and the temperature setting (x in °F) for an important reaction phase of the process, the following summary statistics are recorded:
B, )x
Sr, _- 840,
:
1278, 2y -
396
S*
(a) Find the equation of the least squares regression line.
(b) Using the fitted line, predict the yield when the temperature is set at 170°F.
(c) Calculate the correlation coefficient.
5.6 Refer to Exercise 4.7. Suppose one wishes to use a linear relation to predict the carbon content (y) of a moon-rock specimen whose hydrogen content (x) is already determined.
(a) Eyeballing the scatter diagram, draw a line that appears to fit the data well. Use this to predict the y-value corresponding to x = 56.
(b) Determine the equation of the least squares regression line, and use it to predict y at x = 55.
KEY IDEAS
Cross-classified data can be described by calculating the relative frequencies.
The correlation coefficient, r, measures how closely the scatter approximates a straight line pattern.
A positive value of correlation indicates a tendency of large values of x to occur with large values of y, and also for small values of both to occur together.
A negative value of correlation indicates a tendency of large values of x to occur with small values of y and vice versa.
A high correlation does not necessarily imply a causal relation.
A least squares fit of a straight line helps describe the relation of the response y to the input variable x.
A y-value may be predicted for a known x-value by reading from the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.
KEY FORMULAS
For pairs of measurements (x, y)
Sample correlation:
$$r = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}}$$
where $S_{xx} = \sum (x - \bar{x})^2$, $S_{yy} = \sum (y - \bar{y})^2$ and $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$.
Fitted line:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
where $\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}}$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
6. EXERCISES
6.1 Applicants for welfare aid are allowed an appeals process when there is a feeling of being unfairly treated. At the hearing the applicant may choose self-representation or representation by an attorney. The appeal may result in an increase, decrease, or no change of the aid recommendation. Court records of 320 appeals cases provided the following data.
                                 Amount of Aid
Type of Representation   Increased   Unchanged   Decreased
Self                         59         108          17
Attorney                     70          63           3
Calculate the relative frequencies for each row and compare the patterns of the appeals decisions between the two types of representation.
6.2 Table 8 gives the numbers of civilian employed persons by major occupation group and sex for the years 1975 and 1979. (Source: Statistical Abstract of the United States, 1980.)
Table 8 Number of Civilian Employed Persons in the U.S. (in Thousands)
                              1975                 1979
                         Male     Female      Male     Female
White-collar worker     21,134    21,092     23,306    26,037
Blue-collar worker      23,220     4,742     26,154     5,911
Service worker           4,400     7,258      4,823     8,011
Farm worker              2,476       460      2,216       487
Total                   51,230    33,552     56,499    40,446
(a) Consider females only for each year separately. Calculate the percentages of females in the various occupation groups, and compare between 1975 and 1979.
(b) For each occupation group, calculate the percentage of employees who are female, and compare between 1975 and 1979.
6.3 To study the effect of soil condition on the growth of a new hybrid
plant, saplings were planted on three types of soil and their subsequent growth classified in three categories.
Crowth Poor Average Good
Clay
Soil Type Sand
168T4 3l 16 18 36
Loam
2r 25
Calculate the appropriate relative frequencies and compare the quality of growth for different soil types.
6.4 A car dealer's recent records on 50 sales provided the following frequency information:
(a) Determine the marginal totals.
(b) Obtain the table of relative frequencies.
(c) Calculate the relative frequencies separately for each row.
(d) Does there appear to be a difference in the choice of transmission between diesel and gasoline engine purchases?
6.5 A high risk group of 1,088 male volunteers were included in a major clinical trial for testing a new vaccine for type B hepatitis. The vaccine was given to 549 persons randomly selected from the group, and the others were injected with a neutral substance (placebo). Eleven of the vaccinated people and seventy of the non-vaccinated ones later got the disease (Source: Newsweek, October 13, 1980).
(a) Present these data in the following two-way frequency table.
(b) Compare the rates of incidence of hepatitis among the two groups.
                  Hepatitis   No Hepatitis   Total
Vaccinated
Not vaccinated
Total
6.6 Given the following (x, y) values:
(a) Plot the scatter diagram. (b) Calculate the correlation coefficient. 6.7 Calculate r for the data using both formulas:
6.8 Calculating from a data set of 2O pairs of (x, y) values one obtains
)x
-
156, 2 y
:
1178
Zxz - 1262, 2y' Find the correlation coefficient. 6.9 As part of a study of the psychobiological correlates of successin athletes, the following measurements (courtesy of W. Morgan) are obtained from members of the U.S. Olympic wrestling team:
Vigor (a) Plot the scatter diagram. (b) Calculate r. (c) Obtain the least squares line. (d) Predict the vigor score y when the anger score is x 6.10 Recorded here are the heights and weights of female students in an undergraduate class.
Chapter 3 (a) Make a scatter plot and guess the value of r. (b) Calculate r.
Height (inches) Weight (pounds)
66 66 66 68 69
62 66 66 62 65 62 65
l l s l l q 1 3 81 4 01 4 0 1 1 0 L 4 s 1 3 01 0 9 1 3 01 1 01 3 5
6.11 The tar yield of cigarettes is often assayed by the following method: A motorized smoking machine takes a 2-second puff once every minute until a fixed butt length remains. The total tar yield is determined by laboratory analysis of the pool of smoke taken by the machine. Of course the process is repeated on several cigarettes of a brand to determine the average tar yield. Given here are the data of average tar yield and the average number of puffs for seven brands of filter cigarettes.
Average tar (milligrams)
12.2
t4.3
15.7 12.6 13.5 r4.0
Average No. of puffs
8.s
9.9
r0.7 9.0
9.3
9.s
(a) Plot the scatter diagram.
(b) Calculate r. (Remark: Fewer puffs taken by the smoking machine means a faster burn time. The amount of tar inhaled by a human smoker depends largely on how often the smoker puffs.)
6.12 A director of student counseling is interested in the relationship between the numerical score x and the social science score y on college qualification tests. The following data (courtesy of R. W. Johnson) are recorded:
(a) Plot the scatter diagram.
(b) Calculate r. (Use of computer recommended; see Exercise 6.21.)
6.13 Would you expect a positive, negative, or nearly zero correlation for each of the following? Give reasons for your answers.
(a) The weight of an automobile and the average number of miles it gets per gallon of gasoline.
(b) Intelligence scores of husbands and wives.
(c) The age of an aircraft and the proportion of time it is available for flying (part of the time it is grounded for maintenance and repair).
(d) Stock prices for IBM and General Motors.
(e) The temperature at a baseball game and beer sales.
6.14 Examine each of the following situations and state whether you would expect to find a high correlation between the variables. Give reasons why an observed correlation cannot be interpreted as a direct relationship between the variables, and indicate possible lurking variables.
(a) Correlation between the data on the incidence rate x of cancer and per-capita consumption y of beer, collected from different states.
(b) Correlation between the police budget x and the number of crimes y recorded during the last 10 years in a city like Houston.
(c) Correlation between the gross national product x and the number of divorces y in the country recorded during the last 10 years.
(d) Correlation between the concentration x of air pollutants and the number of riders y on public transportation facilities when the data are collected from several cities that vary greatly in size.
(e) Correlation between the wholesale price index x and the average speed y of winning cars in the Indianapolis 500 during the last 10 years.
6.15 Given these five pairs of values:
(a) Plot the scatter diagram. (b) From a visual inspection, draw a straight line that appears to fit the data well.
(c) Compute the least squares estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ and draw the fitted line.
6.16 Given the six pairs of values:
(a) Obtain the least squares estimates $\hat{\beta}_0$, $\hat{\beta}_1$, and the fitted line. (b) Predict the y-value for x = 6.
6.17 Identify the predictor variable x and the response variable y in each of the following situations. (a) The state highway department wants to study the relationship between road roughness and a car's gas consumption. (b) A concession salesperson at football games wants to relate total fall sales to the number of games the home team wins. (c) A sociologist wants to investigate the number of weekends a college student goes home in relation to the trip distance.
6.18 Scientists in the Netherlands conducted a national inquiry on water quality. (Source: B. C. J. Zoeteman (1980), Sensory Assessment of Water Quality, Pergamon Press, Oxford.) For the eight communities with surface water supplies, they studied the effect of water hardness by obtaining y = taste rating and x = amount of magnesium (milligrams per liter). (a) Obtain $\hat{\beta}_0$, $\hat{\beta}_1$, and the fitted line. (b) Predict the taste rating for surface water having x = 15 milligrams per liter of magnesium.
6.19 From the information given in Exercise 6.9, determine the fitted line.
6.20 Use the expressions for $\hat{\beta}_0$ and $\hat{\beta}_1$ to show that the least squares line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ passes through the mean point $(\bar{x}, \bar{y})$.
6.21 Using the computer. The MINITAB commands for scatter plot, correlation coefficient, and regression equation are illustrated with the following computer output, which includes the commands, the data, and the results requested by these commands.

    READ X INTO C1 AND Y INTO C2
    2    2
    2    3
    4    3
    4    3.5
    5    4
    6.5  4
    PLOT C2 VS C1
    [Scatter plot of C2 (vertical axis, 1.80 to 4.50) versus C1 (horizontal axis, 2.00 to 7.00)]
    CORR Y IN C2 AND X IN C1
    CORRELATION OF C2 AND C1 = 0.851
    REGRESS Y IN C2 ON 1 PREDICTOR IN C1
    THE REGRESSION EQUATION IS
    C2 = 1.80 + 0.370 C1
Use MINITAB (or some other package program) to obtain the scatter plot, correlation coefficient, and the regression line for: (a) The GPA and GMAT scores data of Table 4 in Example 2. (b) The hydrogen (x) and carbon (y) data in Exercise 4.7.
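For readers using "some other package program," the following is a minimal sketch in Python that recomputes the correlation coefficient and the least squares line for the six pairs in the MINITAB listing of Exercise 6.21. The data values are our reading of the READ block in that listing, and the variable names are illustrative only; the formulas are the usual sums of squares about the means.

```python
from math import sqrt

# The six (x, y) pairs as read from the READ command in the listing
# (assumed reading of the printed data block).
x = [2, 2, 4, 4, 5, 6.5]
y = [2, 3, 3, 3.5, 4, 4]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = s_xy / sqrt(s_xx * s_yy)        # sample correlation coefficient
slope = s_xy / s_xx                 # least squares slope
intercept = y_bar - slope * x_bar   # least squares intercept

print(f"r = {r:.3f}")
print(f"C2 = {intercept:.2f} + {slope:.3f} C1")
```

With this reading of the data, the printed values agree with the correlation 0.851 and the equation C2 = 1.80 + 0.370 C1 shown in the listing.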
CHAPTER 4
Probability
1. INTRODUCTION
2. PROBABILITY OF AN EVENT
3. METHODS OF ASSIGNING PROBABILITY
4. EVENT RELATIONS AND TWO LAWS OF PROBABILITY
5. CONDITIONAL PROBABILITY AND INDEPENDENCE
6. RANDOM SAMPLING FROM A FINITE POPULATION
TODAY WILL BE PARTLY CLOUDY: 25% chance of rain. PROBABILITIES EXPRESS THE CHANCE OF EVENTS THAT CANNOT BE PREDICTED WITH CERTAINTY.
1. INTRODUCTION
In Chapter 1 we introduced the notions of sample and statistical population in the context of investigations where the outcomes exhibit variation. While complete knowledge of the statistical population remains the target of an investigation, we typically have available only the partial information contained in a sample. Chapter 2 focused on some methods for describing the salient features of a data set by graphical presentations and calculation of the mean, standard deviation, and other summary statistics. When the data set represents a sample from a statistical population, its description is only a preliminary part of a statistical analysis. Our major goal is to make generalizations or inferences about the target population on the basis of information obtained from the sample data. An acquaintance with the subject of probability is essential for understanding the reasoning that leads to such generalizations.
In everyday conversations, we all use expressions of the kind:
"Most likely our team will win this Saturday."
"It is unlikely that the weekend will be cold."
"I have a 50-50 chance of getting a summer job at the camp."
The phrases "most likely," "probable," "quite unlikely," and so on are used qualitatively to indicate the chance that an event will occur. Probability, as a subject, provides a means of quantifying uncertainty. In general terms, the probability of an event is a numerical value that gauges how likely it is that the event will occur. We let probability take values on a scale from 0 to 1, with very low values indicating extremely unlikely, values close to 1 indicating very likely, and the intermediate values interpreted accordingly. A full appreciation for the concept of a numerical measure of uncertainty and its role in statistical inference can be gained only after the concept has been pursued to a reasonable extent. We can, however, preview the role of probability in one kind of statistical reasoning.
Suppose it has long been observed that in 50% of the cases a certain type of muscular pain goes away by itself. A hypnotist claims that her method is effective in relieving the pain. For experimental evidence she hypnotizes 15 patients and 12 get relief from the pain. Does this demonstrate that hypnotism is effective in stopping the pain? Let us scrutinize the claim from a statistical point of view. If indeed the method had nothing to offer, there would still be a 50-50 chance that a patient is cured. Observing 12 cures out of 15 amounts to obtaining 12 heads in 15 tosses of a coin. We will see later that the probability of at least 12 heads in 15 tosses of a fair coin is .018, indicating that the event is not likely to happen. Thus, tentatively assuming the model (or hypothesis) that the method is ineffective, 12 or more cures are very unlikely. Rather than agree that an unlikely event has occurred, we conclude that the experimental evidence strongly supports the effectiveness of the method. This kind of reasoning, called testing a statistical hypothesis, will be explored in greater detail later. For now, we will be concerned with introducing the ideas that lead to assigned values for probabilities.
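The figure .018 quoted above can be checked with a few lines of Python. This is only a sketch: it relies on the binomial counting formula, which the text has not yet developed at this point, and the function name `prob_at_least` is ours.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """P(at least k successes in n independent trials, each with success probability p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# At least 12 heads in 15 tosses of a fair coin.
p_value = prob_at_least(12, 15)
print(round(p_value, 3))  # 0.018
```

The exact value is 576/32768, so an outcome this extreme would be expected in fewer than 2 of every 100 repetitions if the method were ineffective.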
2. PROBABILITY OF AN EVENT
The probability of an event is viewed as a numerical measure of the chance that the event will occur. The idea is naturally relevant to situations where the outcome of an experiment or observation exhibits variation.
Although we have already used the terms "experiment" and "event," a more specific explanation is now in order. In the present context, the term "experiment" is not limited to the studies conducted in a laboratory. Rather, it is used in a broad sense to include any operation of data collection or observation where the outcomes are subject to variation. Rolling a die, drawing a card from a shuffled deck, sampling a number of customers for an opinion survey, and quality inspection of items from a production line are just a few examples.
An experiment is the process of observing a phenomenon that has variation in its outcomes.
Before attempting to assign probabilities, it is essential to consider all the eventualities of the experiment. Pertinent to their description, we introduce the following terminology and explain it through examples.
The sample space associated with an experiment is the collection of all possible distinct outcomes of the experiment. Each outcome is called an elementary outcome, a simple event, or an element of the sample space. An event is the set of elementary outcomes possessing a designated feature.
The elementary outcomes, which together comprise the sample space, constitute the ultimate breakdown of the potential results of an experiment. For instance, in rolling a die, the elementary outcomes are the points 1, 2, 3, 4, 5, and 6, which together constitute the sample space. The outcome of a football game would be either a win, loss, or tie. Each time the experiment is performed, one and only one elementary outcome can occur. A sample space can be specified by either listing all the elementary outcomes, using convenient symbols to identify them, or making a descriptive statement that characterizes the entire collection. For general discussions, we denote the sample space by $\mathcal{S}$, the elementary outcomes by $e_1, e_2, e_3, \ldots$, and events by $A$, $B$, etc. In specific applications, the elementary outcomes may be given other labels that provide a more vivid identification.
We say that an event A occurs when any one of the elementary outcomes in A occurs.
EXAMPLE 1 Toss a coin twice and record the outcome head (H) or tail (T) for each toss. For two tosses of a coin, the outcomes can be conveniently listed by means of a tree diagram.

1st Toss    2nd Toss    List (designation)
H           H           HH ($e_1$)
H           T           HT ($e_2$)
T           H           TH ($e_3$)
T           T           TT ($e_4$)
The sample space can be listed as
$\mathcal{S} = \{e_1, e_2, e_3, e_4\}$
The order in which the elements of $\mathcal{S}$ are listed is inconsequential. It is the collection that matters. Consider the event of getting exactly one head and let us call it A. Scanning the above list, we see that only the elements HT ($e_2$) and TH ($e_3$) satisfy this requirement. Therefore, the event A has the composition
$A = \{e_2, e_3\}$
which is, of course, a subset of $\mathcal{S}$. The event B of getting no heads at all consists of the single element $e_4$, so $B = \{e_4\}$. That is, B is a simple event as well as an event. The term "event" is a general term that includes simple events.
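The listing in Example 1 can also be generated mechanically. The sketch below, with variable names of our own choosing, builds the sample space for two tosses and picks out the events A (exactly one head) and B (no heads) by filtering on the designated feature.

```python
from itertools import product

# Sample space for two tosses: every ordered pair of H/T, in the order e1, ..., e4.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

# An event is the subset of elementary outcomes with a designated feature.
A = [e for e in sample_space if e.count("H") == 1]  # exactly one head
B = [e for e in sample_space if e.count("H") == 0]  # no heads at all

print(sample_space)  # ['HH', 'HT', 'TH', 'TT']
print(A)             # ['HT', 'TH']
print(B)             # ['TT']
```

Changing `repeat=2` to `repeat=3` gives the eight outcomes for three tosses in the same way.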
EXAMPLE 2 Suppose a box containing 50 seeds has 42 seeds that are viable and 8 not viable. One seed will be picked at random and planted. Here the two distinct possibilities are either the chosen seed is viable or not viable. The sample space can therefore be structured as $\mathcal{S} = \{V, N\}$ with only two elements coded as V for viable and N for not viable. Alternatively, we may view the choice of each seed as a distinct elementary outcome, and "viable" or "not viable" as its properties. In other words, we can imagine tagging the viable seeds with the labels $e_1, e_2, \ldots, e_{42}$ and the non-viable ones with $e_{43}, \ldots, e_{50}$. Since any of the 50 seeds can be selected by the process of random choice, the sample space will have 50 elementary outcomes:
$\mathcal{S} = \{\underbrace{e_1, \ldots, e_{42}}_{\text{viable}},\ \underbrace{e_{43}, \ldots, e_{50}}_{\text{not viable}}\}$
Let us denote by V the event that the chosen seed is viable. Then V has the composition
$V = \{e_1, \ldots, e_{42}\}$
The second formulation of the sample space appears to be unnecessarily complex in view of the simplicity of the first. Its advantage will surface later when we discuss the assignment of probability to events.
Example 2 shows that, in a given problem, the formulation of the sample space is not necessarily unique. Convenience in the determination of probability is a more important criterion than simplicity or brevity of our list.
Both Example 1 and Example 2 illustrate sample spaces that have a finite number of elements. There are also sample spaces with infinitely many elements. For instance, suppose a gambler at a casino will continue pulling the handle of a slot machine until he hits the first jackpot. The conceivable number of attempts does not have a natural upper limit, so the list never terminates. That is, $\mathcal{S} = \{1, 2, 3, \ldots\}$ has an infinite number of elements. However, we notice that the elements could be arranged one after another in a sequence. An infinite sample space where such an arrangement is possible is called "countably infinite." Another type of infinite sample space is also important. Suppose a car with a full tank of gasoline is driven until its fuel runs out and the distance traveled recorded. Since distance is measured on a continuous scale, any nonnegative number is a possible outcome. Denoting the distance traveled by d, we can describe this sample space as $\mathcal{S} = \{d : d \ge 0\}$, that is, the set of all real numbers greater than or equal to zero. Here the elements of $\mathcal{S}$ form a continuum and cannot be arranged in a sequence. This is an example of a continuous sample space.
To avoid unnecessary complications, we will develop the basic principles of probability in the context of finite sample spaces. We first elaborate on the notion of the probability of an event as a numerical measure of the chance that it will occur. The most intuitive interpretation of this quantification is to consider the fraction of times the event would occur in many repeated trials of the experiment.
[Cartoon. © 1979 by Sidney Harris, American Scientist magazine.]
The probability of an event is a numerical value that represents the proportion of times the event is expected to occur when the experiment is repeated under identical conditions. The probability of event A is denoted by P(A).
Since a proportion must lie between 0 and 1, the probability of an event is a number between 0 and 1. To explore a few other important properties of probability, let us refer to the experiment in Example 1 of tossing a coin twice. The event A of getting exactly one head consists of the elementary outcomes HT ($e_2$) and TH ($e_3$). Consequently, A occurs if either of these occurs. Because
[Proportion of times A occurs] = [Proportion of times HT occurs] + [Proportion of times TH occurs]
the number that we assign as P(A) must be the sum of the two numbers P(HT) and P(TH). Guided by this example, we state some general properties of probability.
The probability of an event is the sum of the probabilities assigned to all the elementary outcomes contained in the event. Next, since the sample space $\mathcal{S}$ includes all conceivable outcomes, in every trial of the experiment some element of $\mathcal{S}$ must occur. Viewed as an event, $\mathcal{S}$ is certain to occur, and therefore its probability is 1. The sum of the probabilities of all the elements of $\mathcal{S}$ must be 1. In summary,
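Probability must satisfy:
(i) $0 \le P(A) \le 1$ for all events $A$
(ii) $P(A) = \sum_{\text{all } e \text{ in } A} P(e)$
(iii) $P(\mathcal{S}) = \sum_{\text{all } e \text{ in } \mathcal{S}} P(e) = 1$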
We have deduced these basic properties of probability by reasoning from the definition that the probability of an event is the proportion of times the event is expected to occur in many repeated trials of the experiment.
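The three properties translate directly into a small check. The following sketch stores an assignment of probabilities to elementary outcomes in a dictionary; the helper names `is_valid_model` and `prob_event` are hypothetical, and the equal weights used for the two coin tosses are assumed here purely for illustration.

```python
from fractions import Fraction

def is_valid_model(prob):
    """Properties (i) and (iii): each P(e) lies in [0, 1] and the P(e) sum to 1."""
    in_range = all(0 <= p <= 1 for p in prob.values())
    sums_to_one = sum(prob.values()) == 1
    return in_range and sums_to_one

def prob_event(event, prob):
    """Property (ii): P(A) is the sum of P(e) over the elementary outcomes e in A."""
    return sum(prob[e] for e in event)

# Illustrative assignment for two tosses of a coin (equal weights are an assumption).
prob = {e: Fraction(1, 4) for e in ["HH", "HT", "TH", "TT"]}

print(is_valid_model(prob))            # True
print(prob_event({"HT", "TH"}, prob))  # 1/2
```

Exact fractions are used so that the check that the probabilities sum to 1 is not disturbed by rounding.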
EXERCISES
2.1 Match the proposed probability of A with the correct verbal description. (The latter may be used more than once.)

Probability
(a) 0
(b) -.3
(c) .9
(d) .5
(e) 10.0
(f) .05
(g) .3

Verbal Description
(i) Very likely to happen
(ii) As much chance of occurring as not
(iii) May occur but by no means certain
(iv) An incorrect assignment
(v) Very little chance of happening
(vi) No chance of happening
2.2 Describe the sample space for each of the following experiments: (a) The record of your football team after its first game next season. (b) The number of students out of 20 who will pass beginning swimming and graduate to the intermediate class. (c) In an unemployment survey, 1000 persons will be asked to answer "yes" or "no" to the question "Are you employed?" Only the number answering "no" will be recorded. (d) A geophysicist wants to determine the natural gas reserve in a particular area. The volume will be given in cubic feet.
2.3 For the experiments in Exercise 2.2, which sample spaces are discrete and which are continuous?
2.4 Identify these events in Exercise 2.2: (a) {Don't lose} (b) {At least half the students pass} (c) {Less than or equal to 5.5% unemployment} (d) {Between 1 and 2 million cubic feet}
2.5 Bob, John, and Linda are the finalists in the spelling contest of a local school district. The winner and the first runner-up will be sent to a state-wide competition.
(a) List the sample space concerning the outcomes of the local contest. (b) Give the composition of each of the events: A B
2.6 There are four elementary outcomes in a sample space. If P(eJ P(e)
2.7 Suppose $\mathcal{S} = \{e_1, e_2, e_3\}$. If the simple events $e_1$, $e_2$, and $e_3$ are all equally likely, what are the numerical values of $P(e_1)$, $P(e_2)$, and $P(e_3)$?
2.8 The sample space for the response of a single person's attitude toward a political issue consists of the three elementary outcomes $e_1$ = {Unfavorable}, $e_2$ = {Favorable}, and $e_3$ ... Are the following assignments of probabilities permissible? (a) P(e) (b) P(e) (c) P(er)
2.9 Probability and odds. The probability of an event is often expressed in terms of odds. Specifically, when we say that the odds are k to m that an event will occur, we mean that the probability of the event is k/(k + m). For instance, "the odds are 4 to 1 that candidate Jones will win" means that P(Jones will win) = 4/5 = .8. Express the following statements in terms of probability: (a) The odds are 2 to 1 that there will be fair weather tomorrow. (b) The odds are 5 to 2 that the city council will delay the funding of a new sports arena.
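The conversion from odds to probability described in this exercise is a one-line formula. A minimal sketch follows; the function name `odds_to_probability` is ours, and it is applied only to the worked case of odds of 4 to 1 given in the exercise statement.

```python
from fractions import Fraction

def odds_to_probability(k, m):
    """Odds of k to m that an event will occur correspond to probability k / (k + m)."""
    return Fraction(k, k + m)

# The worked case from the exercise statement: odds of 4 to 1.
print(odds_to_probability(4, 1))         # 4/5
print(float(odds_to_probability(4, 1)))  # 0.8
```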
3. METHODS OF ASSIGNING PROBABILITY
An assignment of probabilities to all the events in a sample space determines a probability model. In order to be a valid probability model, the probability assignment must satisfy the properties (i), (ii), and (iii) stated in the previous section. Any assignment of numbers $P(e_i)$ to the elementary outcomes will satisfy the three conditions of probability provided these numbers are nonnegative, and their sum over all the outcomes $e_i$ in $\mathcal{S}$ is 1. However, to be of any practical import, the probability assigned to an event must also be in agreement with the concept of probability as the proportion of times the event is expected to occur. Here we discuss the implementation of this concept in two important situations.
3.1 EQUALLY LIKELY ELEMENTARY OUTCOMES: THE UNIFORM PROBABILITY MODEL
Often symmetry in the experiment ensures that each elementary outcome is as likely to occur as any other. For example, consider the experiment of rolling a perfect die and recording the top face. The sample space can be described as $\mathcal{S} = \{e_1, e_2, e_3, e_4, e_5, e_6\}$, where $e_1$ stands for the elementary outcome of getting the face 1, and similarly, $e_2, \ldots, e_6$. Without actually rolling a die, we can deduce the probabilities. Because a fair die is a symmetric cube, each of its six faces is as likely to appear as any other. In other words, each face is expected to occur one-sixth of the time. The probability assignments should therefore be
$P(e_1) = P(e_2) = \cdots = P(e_6) = \frac{1}{6}$
and any other assignment would contradict the symmetry assumption that the die is fair. We say that rolling a fair die conforms to a uniform probability model because the total probability 1 is evenly apportioned to all the elementary outcomes. What is the probability of getting a number higher than 4? Letting A denote this event, we have the composition $A = \{e_5, e_6\}$, so $P(A) = P(e_5) + P(e_6) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$.
When the elementary outcomes are modeled as equally likely, we have a uniform probability model. If there are k elementary outcomes in $\mathcal{S}$, each is assigned the probability of 1/k. An event A consisting of m elementary outcomes is then assigned
$P(A) = \dfrac{m}{k} = \dfrac{\#\, e \text{ in } A}{\#\, e \text{ in } \mathcal{S}}$

EXAMPLE 3 Find the probability of getting exactly one head in two tosses of a fair coin. As listed in Example 1, there are four elementary outcomes in the sample space: $\mathcal{S} = \{HH, HT, TH, TT\}$. The very concept of a fair coin implies that the four elementary outcomes in $\mathcal{S}$ are equally likely. We therefore assign the probability 1/4 to each of them. The event A = [One head] has two elementary outcomes, namely HT and TH. Hence, P(A) = 2/4 = .5.
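Under a uniform probability model, finding P(A) reduces to counting. The sketch below, whose function name `uniform_probability` is our own, applies the rule P(A) = m/k to the die and coin cases discussed above.

```python
from fractions import Fraction

def uniform_probability(event, sample_space):
    """P(A) = m / k: number of outcomes in A over number of outcomes in the sample space."""
    outcomes = set(sample_space)
    favorable = set(event) & outcomes  # ignore anything not in the sample space
    return Fraction(len(favorable), len(outcomes))

# Rolling a fair die: probability of a number higher than 4.
die = [1, 2, 3, 4, 5, 6]
higher_than_4 = [face for face in die if face > 4]
print(uniform_probability(higher_than_4, die))  # 1/3

# Two tosses of a fair coin: probability of exactly one head.
tosses = ["HH", "HT", "TH", "TT"]
one_head = [t for t in tosses if t.count("H") == 1]
print(uniform_probability(one_head, tosses))    # 1/2
```

The counting rule is valid only when the elementary outcomes are equally likely; the code does not check that assumption.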
Gregor Mendel, pioneer geneticist, perceived a pattern in the characteristics of generations of pea plants and conceived a theory of heredity to explain them. According to Mendel, inherited characteristics are transmitted from one generation to another by genes. Genes occur in pairs and the offspring obtain their pair by taking one gene from each parent. A simple uniform probability model lies at the heart of Mendel's explanation of the selection mechanism. One experiment that illustrated Mendel's theory consists of cross fertilizing a pure strain of red flowers with a pure strain of white flowers. This produces hybrids, having one gene of each type, that are pink flowered. Crossing these hybrids leads to one of four possible gene pairs. Under Mendel's laws, these four are equally likely. Consequently, P[Pink] = 1/2 and P[White] = P[Red] = 1/4. (Compare with the experiment of tossing two coins.)
An experiment carried out by Correns, one of Mendel's followers, resulted in the frequencies 141, 291, and 132 for the white, pink, and red flowers, respectively. These numbers are nearly in the ratio 1:2:1. (Source: W. Johannsen (1909), Elements of the Precise Theory of Heredity, G. Fischer, Jena.)
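The 1:2:1 pattern can be reproduced by enumerating the four equally likely gene pairs and then set beside the Correns frequencies quoted above. In this sketch the labels "R" and "W" for the two genes and the helper `color` are our own illustrative choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Each hybrid parent carries one red (R) and one white (W) gene and passes on
# one of them; the four resulting gene pairs are taken to be equally likely.
pairs = list(product("RW", repeat=2))  # (gene from parent 1, gene from parent 2)

def color(pair):
    """Flower color implied by a gene pair: two like genes give that color, else pink."""
    genes = set(pair)
    if genes == {"R"}:
        return "red"
    if genes == {"W"}:
        return "white"
    return "pink"

counts = Counter(color(p) for p in pairs)
model = {c: Fraction(n, len(pairs)) for c, n in counts.items()}

# Frequencies reported for the Correns experiment.
observed = {"white": 141, "pink": 291, "red": 132}
total = sum(observed.values())

for c in ("white", "pink", "red"):
    print(c, float(model[c]), round(observed[c] / total, 3))
```

The observed relative frequencies come out close to the model values .25, .5, and .25, which is what "nearly in the ratio 1:2:1" expresses.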
EXAMPLE 4 Refer to Example 2. If one seed is sampled at random from a box that contains 42 viable and 8 non-viable seeds, what is the probability that the selected seed is not viable? The intuitive notion of random selection is that no seed is given preference over any other. We deduce that each seed is as likely to be selected as any other. Identifying the selection of each seed as an elementary outcome, the sample space has the composition
$\mathcal{S} = \{\underbrace{e_1, \ldots, e_{42}}_{\text{viable}},\ \underbrace{e_{43}, \ldots, e_{50}}_{\text{not viable}}\}$
and this specification conforms to a uniform probability model. Each of the 50 elementary outcomes is to be assigned the probability 1/50. The event "not viable" has the composition
[Not viable] = $\{e_{43}, \ldots, e_{50}\}$
Consequently,
P(Not viable) = 8/50 = .16.
An alternative reasoning: In the above analysis the individual seeds were not, of course, actually tagged. The tagging was imagined for the purpose of calculating the probabilities. Alternatively, one may argue that since the selected seed must be either viable (V) or not viable (N), there are only two elementary outcomes in the sample space, $\mathcal{S} = \{V, N\}$. Since the numbers of viable and non-viable seeds are in the ratio 42:8 in the population, their probabilities must also be in the same ratio; that is,
$P(V) = \dfrac{42}{42 + 8} = \dfrac{42}{50} = .84, \qquad P(N) = \dfrac{8}{42 + 8} = \dfrac{8}{50} = .16$
While this second reasoning appears more attractive because of its simplicity, it runs into considerable complexities when the experiment involves sampling two or more seeds.

3.2 PROBABILITY AS THE LONG-RUN RELATIVE FREQUENCY
Symmetry among the outcomes, when it is present, leads to a uniform probability model. However, this simple model does not always apply. In many situations it is not possible to construct a sample space where the elementary outcomes are equally likely. If one corner of a die is cut off, it would be unreasonable to assume that the faces remain equally likely, and the assignments of probability to various faces can no longer be made by deductive reasoning. When speaking of the probability (or risk) that a man will die in his thirties, one may choose to identify the occurrence of death at each decade or even each year of age as an elementary outcome. However, no sound reasoning can be provided in favor of a uniform probability model. In fact, from extensive mortality studies, the demographers have found considerable disparity in the risk of death at different age groups.
[Figure: How Long Will a Baby Live? The probabilities for life length of a baby born in the United States. Obtained from the Life Table for the Total Population: United States 1969-71.]
When the assumption of equally likely elementary outcomes is not tenable, how do we assess the probability of an event? The only recourse is to repeat the experiment many times and observe the proportion of times the event occurs. Letting N denote the number of repetitions (or trials) of an experiment, we set
$r_N(A) = \text{relative frequency of event } A \text{ in } N \text{ trials} = \dfrac{\#\text{ times } A \text{ occurs in } N \text{ trials}}{N}$
For instance, let A be the event of getting a six when rolling a die. If the die is rolled 100 times and six comes up 23 times, the observed relative frequency of A would be $r_{100}(A) = \frac{23}{100} = .23$. In another set of 100 tosses, six may come up 18 times, resulting in $\frac{18}{100} = .18$ as the relative frequency. Collecting these two sets together, we have N = 200 trials with the observed relative frequency
$r_{200}(A) = \dfrac{23 + 18}{200} = \dfrac{41}{200} = .205$
Imagine that this process is continued by recording the results from more and more tosses of the die, and updating the calculations of relative frequency. Figure 1 shows a typical plot of the relative frequency $r_N(A)$ versus the number N of trials of the experiment. We see that $r_N(A)$ fluctuates as N changes, but the fluctuations become damped with increasing N, and $r_N(A)$ tends to stabilize as N gets very large. Two persons separately performing the same experiment N times are not going to get exactly the same graph of $r_N(A)$. However, the numerical value at which $r_N(A)$ stabilizes in the long run will be the same. Let us designate this feature of Figure 1 as the long-run stability of relative frequency.
We define P(A), the probability of an event A, as the value to which $r_N(A)$ stabilizes with increasing N. Although we will never know P(A) exactly, it can be estimated accurately by repeating the experiment many times and calculating $r_N(A)$.
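The stabilization of $r_N(A)$ can be watched in a simulation. The sketch below uses Python's pseudo-random generator as a stand-in for rolling a fair die, which is itself a modeling assumption; the function name `relative_frequency_path` and the seed value are arbitrary choices of ours.

```python
import random

def relative_frequency_path(n_trials, seed=0):
    """Simulate n_trials rolls of a fair die and return r_N(A) for N = 1, ..., n_trials,
    where A is the event of getting a six."""
    rng = random.Random(seed)
    sixes = 0
    path = []
    for n in range(1, n_trials + 1):
        if rng.randint(1, 6) == 6:
            sixes += 1
        path.append(sixes / n)
    return path

path = relative_frequency_path(100_000)
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(path[n - 1], 4))
```

A different seed gives a different path, just as two persons performing the experiment get different graphs, but for large N the values should settle near 1/6, or about .167.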
Figure 1 Stabilization of relative frequency. [Plot of the relative frequency $r_N(A)$ against the number of trials (N).]

The property of the long-run stabilization of relative frequencies is based on the findings of experimenters in many fields who have undertaken the strain of studying the behavior of $r_N(A)$ under prolonged repetitions of their experiments. French gamblers, who provided much of the early impetus for the study of probability, performed experiments tossing dice and coins, drawing cards, and playing other games of chance thousands and thousands of times. They observed the stabilization property of relative frequency and applied this knowledge to achieve an understanding of the uncertainty involved in these games. Demographers have compiled and studied volumes of mortality data to examine the relative frequency of the occurrence of such events as death in particular age groups. In each context, the relative frequencies were found to stabilize at specific numerical values as the number of cases studied increased. Life and accident insurance companies actually depend upon the stability property of relative frequencies.
We explained in Section 3.1 how a determination of probability can be based on logical deductions in the confines of a uniform probability model. When we assign equal probabilities to the six faces of a die, we essentially conceptualize the die as a perfectly symmetric and homogeneous cube. In the real world, we may never know if that is indeed the case. However, examination of the relative frequencies after a large number of tosses can help us assess whether the idealized model is tenable. Mendel conducted experiments over a period of years before deducing the validity of the uniform probability model for gene selection.
As another example of an idealized model, consider the assignment of probabilities to the day of the week a child will be born. We may tentatively assume the simple model that all seven days of the week are equally likely. Each day is then assigned the probability 1/7. If A denotes the event of a birth on the weekend (Saturday or Sunday), our model leads to P(A) = 2/7. The plausibility of the uniform probability model can only be ascertained from an extensive set of birth records. Table 1 shows the frequency distribution of the number of births by days of the week for all registered births in the USA in the year 1971.
Table 1 Number of Births (in 10,000) by Day of the Week, U.S. 1971

                      Mon     Tues    Wed     Thurs   Fri     Sat     Sun     All days
Number of births      52.09   54.46   52.68   51.68   53.83   47.21   44.36   356.31
Relative frequency    .146    .153    .148    .145    .151    .132    .124

Source: Rindfuss, R., et al. "Convenience and the Occurrence of Births: Induction of Labor in the United States and Canada," International Journal of Health Services, Vol. 9:8, 1979, p. 441, Table 1. Copyright © 1979, Baywood Publishing Company, Inc.
An examination of the relative frequencies in Table 1 reveals a sizable dip on Saturday and Sunday. Specifically, the relative frequency of births on the weekend is .256, as compared to the probability of 2/7 = .286 suggested by the uniform probability model. Because the number of trials is so large, the difference .286 - .256 = .03 appears real. A possible explanation may be the prevalence of elective induction of labor on weekdays.
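The relative frequencies in Table 1 and the weekend comparison can be recomputed as follows. This is a sketch with variable names of our own; it rounds each day's relative frequency to three decimals before adding, which is how the figure .256 in the text arises (the unrounded weekend share is about .257).

```python
# Number of births (in 10,000) by day of the week, from Table 1.
births = {
    "Mon": 52.09, "Tues": 54.46, "Wed": 52.68, "Thurs": 51.68,
    "Fri": 53.83, "Sat": 47.21, "Sun": 44.36,
}

total = sum(births.values())

# Relative frequency of each day, rounded to three decimals as in the table.
rel_freq = {day: round(n / total, 3) for day, n in births.items()}

weekend = round(rel_freq["Sat"] + rel_freq["Sun"], 3)  # sum of the rounded entries
uniform = round(2 / 7, 3)                              # uniform model value for a weekend birth

print(round(total, 2))
print(rel_freq)
print(weekend, uniform, round(uniform - weekend, 3))
```

The gap of about .03 between the observed weekend frequency and 2/7 is the difference discussed in the text.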
EXERCISES
3.1 Consider the experiment of tossing a coin three times. (a) List the sample space by drawing a tree diagram. (b) Assign probabilities to the elementary outcomes. (c) Find the probability of getting exactly one head.
3.2 Suppose you are eating at a pizza parlor with two friends. You have agreed to the following rule to decide who will pay the bill. Each person will toss a coin. The person who gets a result that is different from the other two will pay the bill. If all three tosses yield the same result, the bill will be shared by all. Find the probability that: (a) Only you will have to pay. (b) All three will share.
3.3 A white and a colored die are tossed. The possible outcomes are shown in the illustration.
[Illustration: the possible outcomes of tossing a white and a colored die.]
(a) Identify the events A = [Sum = 6], B = [...], C = [... even], D = [Same number on each die].
(b) If both dice are "fair," assign probability to each elementary outcome. (c) Obtain P(A), P(B), P(C), P(D).
3.4 A roulette wheel has 34 slots, two of which are green, 16 are red, and 16 are black. A successful bet on black or red doubles the money, while one on green fetches 30 times as much. If you play the game once by betting $2 on the black, what is the probability that: (a) You will lose your $2? (b) You will win $2?
3.5 Children joining a kindergarten class will be checked one after another to see if they have been inoculated for polio (I) or not (N). Suppose that the checking is to be continued until one noninoculated child is found or four children have been checked, whichever occurs first. List the sample space for this experiment.
3.6 (a) Consider the simplistic model that human births are evenly distributed over the twelve calendar months. If a person is randomly selected, say from a phone directory, what is the probability that his or her birthday would be in November or December? (b) The following record shows a classification of 41,208 births in Wisconsin (courtesy of Professor Jerome Klotz). Calculate the relative frequency of births for each month and comment on the plausibility of the uniform probability model.

Jan     3478        July    3476
Feb     3333        Aug     3495
Mar     3771        Sept    3490
Apr     3542        Oct     3331
May     3479        Nov     3188
June    3304        Dec     3321
                    Total   41,208
3.7 A government agency will randomly select one of the 15 paper mills in a state to investigate its compliance with federal safety standards. Suppose, unknown to the agency, 10 of these mills are in compliance, 3 are borderline cases, and 2 are in gross violation. (a) Formulate the sample space in such a way that a uniform probability model holds. (b) Find the probability that a gross violator will be detected.
3.8 Explain why the long-run relative frequency interpretation of probability does not apply to the following situations. (a) The proportion of days the Dow-Jones average of industrial stock prices exceeds 1200. (b) The proportion of income-tax returns containing improper deductions if the data are collected only in the slack season, say January. (c) The proportion of cars that do not meet emission standards, if the data are collected from service stations where the mechanics have been asked to check the emission while attending other requested services.
4. EVENT RELATIONS AND TWO LAWS OF PROBABILITY
Recall that the probability of an event A is the sum of the probabilities of all the elementary outcomes that are in A. It often turns out, however, that the event of interest has a complex structure that requires tedious enumeration of its elementary outcomes. On the other hand, this event may be related to other events that can be handled more easily. The purpose of this section is to first introduce the three most basic event relations: complement, union, and intersection. These event relations will then motivate some laws of probability.
The event operations are conveniently described in graphical terms. We first represent the sample space as a collection of points in a diagram, each identified with a specific elementary outcome. The geometric pattern of the plotted points is irrelevant. What is important is that each point be clearly tagged to indicate which elementary outcome it represents, and that no elementary outcome be missed or duplicated in the diagram. To represent an event A, identify the points that correspond to the elementary outcomes in A, enclose them in a boundary line, and attach the tag A. This representation, called a Venn diagram, is illustrated in Figure 2.
[Figure 2. Venn diagram of the events in Example 5.]
EXAMPLE 5
Make a Venn diagram for the experiment of tossing a coin twice, and indicate the following events:
A: Tail at the second toss.
B: At least one head.
Here the sample space is $\mathcal{S} = \{HH, HT, TH, TT\}$, and the two events have the compositions A = {HT, TT}, B = {HH, HT, TH}. Figure 2 shows the Venn diagram.
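The compositions in Example 5 can also be obtained mechanically. The following is a minimal Python sketch, not part of the original text, that builds the sample space for two tosses and picks out the two events as sets; the variable names are illustrative choices.

```python
from itertools import product

# Sample space for two tosses of a coin: every ordered pair of H/T.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=2)}

# Event A: tail at the second toss. Event B: at least one head.
A = {outcome for outcome in sample_space if outcome[1] == "T"}
B = {outcome for outcome in sample_space if "H" in outcome}

print(sorted(sample_space))
print(sorted(A))
print(sorted(B))
```

Representing events as sets of elementary outcomes mirrors the Venn diagram: each string is one tagged point, and each event is the collection of points inside its boundary.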
EXAMPLE 6
Listed below are the types and ages of four monkeys procured by a laboratory for a drug trial.

Monkey   Type     Age
1        Baboon   6
2        Baboon   8
3        Spider   6
4        Spider   6

Suppose two monkeys will be selected by lottery and assigned to an experimental drug. Considering all possible choices of two monkeys, make a Venn diagram and show the following events:
A: The selected monkeys are of the same type.
B: The selected monkeys are of the same age.
Here the elementary outcomes are the possible choices of a pair of numbers from {1, 2, 3, 4}. These pairs are listed and labeled as $e_1, e_2, e_3, e_4, e_5, e_6$ for ease of reference:
{1, 2} ($e_1$)   {1, 3} ($e_2$)   {1, 4} ($e_3$)
{2, 3} ($e_4$)   {2, 4} ($e_5$)   {3, 4} ($e_6$)
The pair {1, 2} has both monkeys of the same type, and so does the pair {3, 4}. Consequently, $A = \{e_1, e_6\}$. Those with the same ages are {1, 3}, {1, 4}, and {3, 4}, so $B = \{e_2, e_3, e_6\}$. Figure 3 shows the Venn diagram.
[Figure 3. Venn diagram of the events in Example 6.]
[Venn diagrams of the complement $\bar{A}$, the union $A \cup B$, the intersection AB, and incompatible events.]
We now proceed to define the three basic event operations and introduce the corresponding symbols.
The complement of an event A, denoted by $\bar{A}$, is the set of all elementary outcomes that are not in A. The occurrence of $\bar{A}$ means that A does not occur.
The union of two events A and B, denoted by $A \cup B$, is the set of all elementary outcomes that are in A, in B, or in both. The occurrence of $A \cup B$ means that either A or B or both occur.
The intersection of two events A and B, denoted by AB, is the set of all elementary outcomes that are in A and in B. The occurrence of AB means that both A and B occur.
Note that $A \cup B$ is a larger set containing A as well as B, while AB is the common part of the sets A and B. Also it is evident from the definitions that $A \cup B$ and $B \cup A$ represent the same event, while AB and BA are both expressions for the intersection of A and B. The operations of union and intersection can be extended to more than two events. For instance, $A \cup B \cup C$ stands for the set of all points that are in at least one of A, B, and C, while ABC represents the simultaneous occurrence of all three events.
Two events A and B are called incompatible or mutually exclusive if their intersection AB is empty. Because incompatible events have no elementary outcomes in common, they cannot occur simultaneously.
EXAMPLE 7
Refer to the experiment in Example 6 of selecting two monkeys out of four. Let A = [same type], B = [same age], and C = [different types]. Give the compositions of the events C, $\bar{A}$, $A \cup B$, AB, BC.
The pairs consisting of different types are {1, 3}, {1, 4}, {2, 3}, and {2, 4}, so $C = \{e_2, e_3, e_4, e_5\}$. The event $\bar{A}$ is the same as the event C. Employing the definitions of union and intersection, we obtain
$A \cup B = \{e_1, e_2, e_3, e_6\}$
$AB = \{e_6\}$
$BC = \{e_2, e_3\}$
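The event operations of Example 7 map directly onto set operations. The sketch below is an illustration rather than the book's own method: it rebuilds the six pairs of Example 6 from the monkey table and forms the complement, union, and intersection with Python set operators. The helper name `same` and the tuple layout are assumptions made for this example.

```python
from itertools import combinations

# Monkey number -> (type, age), copied from the table in Example 6.
monkeys = {1: ("Baboon", 6), 2: ("Baboon", 8), 3: ("Spider", 6), 4: ("Spider", 6)}

# Elementary outcomes: all unordered pairs of monkeys.
sample_space = set(combinations(sorted(monkeys), 2))

def same(attribute_index):
    """Pairs whose two monkeys agree on type (index 0) or age (index 1)."""
    return {
        pair for pair in sample_space
        if monkeys[pair[0]][attribute_index] == monkeys[pair[1]][attribute_index]
    }

A = same(0)            # same type
B = same(1)            # same age
C = sample_space - A   # different types, i.e. the complement of A

union_AB = A | B           # in A, in B, or in both
intersection_AB = A & B    # in both A and B
intersection_BC = B & C

print(sorted(union_AB))
print(sorted(intersection_AB))
print(sorted(intersection_BC))
```

Here the pair (1, 2) plays the role of $e_1$, (1, 3) of $e_2$, and so on through (3, 4) for $e_6$.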
Let us now examine how probabilities behave as the operations of complementation, union, and intersection are applied to events. It would be worthwhile for the reader to review the properties of probability listed in Section 3.2. In particular, recall that P(A) is the sum of probabilities of the elementary outcomes that are in A, and $P(\mathcal{S}) = 1$.
First let us examine how $P(\bar{A})$ is related to P(A). The sum $P(A) + P(\bar{A})$ is the sum of the probabilities of all elementary outcomes that are in A plus the sum of the probabilities of elementary outcomes not in A. Together, these two sets comprise $\mathcal{S}$ and we must have $P(\mathcal{S}) = 1$. Consequently, $P(A) + P(\bar{A}) = 1$, and we arrive at the following law:
Law of Complementation: $P(A) = 1 - P(\bar{A})$
This law or formula is useful in calculating P(A) when $\bar{A}$ is of a simpler form than A so that $P(\bar{A})$ is easier to calculate.
Turning to the operation of union, recall that $A \cup B$ is composed of points (or elementary outcomes) that are in A, in B, or in both A and B. Consequently, $P(A \cup B)$ is the sum of the probabilities assigned to these elementary outcomes, each probability taken just once. Now, the sum P(A) + P(B) includes contributions from all these points, but it double counts those in the region AB (see the figure of $A \cup B$). To adjust for this double counting, we must therefore subtract P(AB) from P(A) + P(B). This results in the following law:
Addition Law: $P(A \cup B) = P(A) + P(B) - P(AB)$
If the events A and B are incompatible, their intersection AB is empty, so P(AB) = 0, and we obtain:
For incompatible events: $P(A \cup B) = P(A) + P(B)$
The addition law expresses the probability of a larger event $A \cup B$ in terms of the probabilities of the smaller events A, B, and AB. Some applications of these two laws are given in the following examples.
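Both laws can be checked numerically on any small model. The sketch below assumes a hypothetical experiment that does not come from the text (one roll of a fair six-sided die, with two illustrative events) and uses exact fractions so that the two sides of each law can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical uniform model (an assumption for illustration):
# one roll of a fair six-sided die.
prob = {face: Fraction(1, 6) for face in range(1, 7)}
sample_space = set(prob)

def P(event):
    """Probability of an event: the sum of its elementary-outcome probabilities."""
    return sum((prob[e] for e in event), Fraction(0))

A = {2, 4, 6}   # illustrative event: an even face
B = {4, 5, 6}   # illustrative event: a face larger than 3

# Law of complementation: P(A) = 1 - P(complement of A)
complement_A = sample_space - A
lhs_complement = P(A)
rhs_complement = 1 - P(complement_A)

# Addition law: P(A u B) = P(A) + P(B) - P(AB)
lhs_addition = P(A | B)
rhs_addition = P(A) + P(B) - P(A & B)

print(lhs_complement, rhs_complement)
print(lhs_addition, rhs_addition)
```

The subtraction of `P(A & B)` is what removes the double counting of the faces 4 and 6, which belong to both events.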
EXAMPLE 8
A child is presented with three word-association problems. With each problem two answers are suggested: one is correct and the other wrong. If the child has no understanding of the words whatsoever, and answers the problems by guessing, what is the probability of getting at least one correct answer?
Let us denote a correct answer by C and a wrong answer by W. The elementary outcomes can be conveniently enumerated by means of a tree diagram.
[Tree diagram: branches C and W for Problem 1, Problem 2, and Problem 3, leading to the elementary outcomes listed below.]
CCC CCW CWC CWW WCC WCW WWC WWW
There are 8 elementary outcomes in the sample space and, since they are equally likely, each has the probability $\frac{1}{8}$. Let A denote the event of getting at least one correct answer. Scanning our list we see that A contains 7 elementary outcomes, all except WWW. Our direct calculation yields $P(A) = \frac{7}{8}$.
Now let us see how this probability calculation could be considerably simplified. First, making a complete list of the sample space is not necessary. Since the elementary outcomes are equally likely, we need only determine that there are a total of 8 elements in $\mathcal{S}$. How can we obtain this count without making a list? Note that an outcome is represented by three letters. There are 2 choices for each letter, namely C or W. We then have 2 x 2 x 2 = 8 ways of filling the three slots. The tree diagram explains this multiplication rule of counting. Evidently the event A contains many elementary outcomes. On the other hand, $\bar{A}$ is the event of getting all answers wrong. It consists of the single elementary outcome WWW, so $P(\bar{A}) = \frac{1}{8}$. According to the law of complementation, $P(A) = 1 - P(\bar{A}) = 1 - \frac{1}{8} = \frac{7}{8}$.
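The two routes to the answer in Example 8, direct counting and the law of complementation, can be compared in a few lines. This is a minimal sketch with illustrative variable names, in which `itertools.product` plays the role of the tree diagram.

```python
from fractions import Fraction
from itertools import product

# Each of the three problems is answered C (correct) or W (wrong).
outcomes = ["".join(answers) for answers in product("CW", repeat=3)]
n_outcomes = len(outcomes)  # 2 x 2 x 2

# Direct route: count the outcomes with at least one correct answer.
at_least_one_correct = [o for o in outcomes if "C" in o]
p_direct = Fraction(len(at_least_one_correct), n_outcomes)

# Complement route: only WWW has no correct answer.
p_all_wrong = Fraction(outcomes.count("WWW"), n_outcomes)
p_by_complement = 1 - p_all_wrong

print(outcomes)
print(p_direct, p_by_complement)
```

The complement route needs only one outcome to be identified, which is why it scales better when the number of problems grows.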
EXAMPLE 9
Refer to Example 6 where two monkeys are selected from four by lottery. What is the probability that the selected monkeys are either of the same type or the same age?
In Example 6, we already enumerated the 6 elementary outcomes that comprise the sample space. The lottery selection makes all choices equally likely and the uniform probability model applies. The two events of interest are
A = [same type] = $\{e_1, e_6\}$
B = [same age] = $\{e_2, e_3, e_6\}$
Because A consists of 2 elementary outcomes and B consists of 3,
$$P(A) = \frac{2}{6} \quad \text{and} \quad P(B) = \frac{3}{6}$$
Here we are to calculate $P(A \cup B)$. To employ the addition law, we also need to calculate P(AB). In Figure 3 we see $AB = \{e_6\}$, so $P(AB) = \frac{1}{6}$. Therefore,
$$P(A \cup B) = P(A) + P(B) - P(AB) = \frac{2}{6} + \frac{3}{6} - \frac{1}{6} = \frac{4}{6} = \frac{2}{3}$$
which is confirmed by the observation that $A \cup B = \{e_1, e_2, e_3, e_6\}$ indeed has four outcomes.
EXERCISES
4.1 Suppose the sample space of an experiment has 6 elementary outcomes $e_1, e_2, \ldots, e_6$. Two events are given as A = {er, er, eu} and $B = \{e_2, e_4, e_5\}$.
(a) Draw a Venn diagram and exhibit the events A and B.
(b) Determine the compositions of the following events:
(i) $\bar{A}$ (ii) AB (iii) $A \cup B$ (iv) $A\bar{B}$ (v) $\bar{A}B$
4.2 Refer to Exercise 3.3 of tossing two dice. Give the compositions of the following events:
(a\e (b)ruo (c) CD
4.3 Four applicants will be interviewed for a position with an oil company. They have the following characteristics:
1. Chemistry major, male, GPA 3.5
2. Geology major, female, GPA 3.8
3. Chemical engineering major, female, GPA 3.7
4. Petroleum engineering major, male, GPA 3.2
One of the candidates will be hired.
(a) Draw a Venn diagram and exhibit the events
A: An engineering major is hired.
B: The GPA of the selected candidate is higher than 3.6.
C: A male candidate is hired.
(b) Give the composition of the events $A \cup B$ and AB.
4.4 For the experiment of Exercise 4.3 give a verbal description of each of the following events and also state the composition of the event:
(a) $\bar{C}$ (b) $\bar{C}A$ (c) $A \cup \bar{C}$
4.5 Referring to a Venn diagram verify the following statements:
(a) The event $A \cup B$ includes the event AB.
(b) $(AB) \cup (A\bar{B}) = A$
(c) $A \cup \bar{A} = \mathcal{S}$
4.6 If $A \cup B = \mathcal{S}$, A and B are incompatible and P(B) = 2P(A), determine P(A), P(B), and P(B).
4.7 Refer to Exercise 4.1. Suppose the elementary outcomes are assigned the probabilities
$P(e_1) = P(e_2) = P(e_3) = .1$, $P(e_4) = P(e_5) = .2$, $P(e_6) = .3$
(a) Find P(A), P(B), and P(AB).
(b) Employing the laws of probability, and the results of part (a), calculate $P(\bar{A})$ and $P(A \cup B)$.
(c) Verify your answers to part (b) by adding the probabilities of the elementary outcomes in each of $\bar{A}$ and $A \cup B$.
4.8 A mail order firm offers a mystery gift box to customers who place a purchase order of $20 or more. Each box contains one of the following five assortments:
1. Key chain and utility knife.
2. Name tag and flashlight.
3. Letter opener and flashlight.
4. Utility knife and letter opener.
5. Memo pad and letter opener.
(a) If a customer places two separate orders of $20 and receives two gift boxes at random, and assign probabilities to the simple events.
(b) State the compositions of the following events and find their probabilities.
A: The customer gets a flashlight.
B: The customer gets a letter opener.
AB: The customer gets a flashlight and a letter opener.
4.9 Refer to Exercise 4.8. Let C denote the event that the customer gets either a flashlight or a letter opener or both.
(a) Relate C to the events A and B, and calculate P(C) by employing a law of probability.
(b) State the composition of C and calculate its probability by adding the probabilities of the simple events.
5. CONDITIONAL PROBABILITY AND INDEPENDENCE
The probability of an event A must often be modified after information is obtained as to whether or not a related event B has taken place. Information about some aspect of the experimental results may therefore necessitate a revision of the probability of an event concerning some other aspect of the results. The revised probability of A when it is known that B has occurred is called the conditional probability of A given B and is denoted by P(A|B). To illustrate how such a modification is made, we consider an example that will lead us to the formula for conditional probability.
EXAMPLE 10
A group of executives is classified according to the status of body weight and incidence of hypertension. The proportions in the various categories appear in Table 2.
(a) What is the probability that a person selected at random from this group will have hypertension?
(b) A person, selected at random from this group, is found to be overweight. What is the probability that this person is also hypertensive?
Let A denote the event that a person is hypertensive, and let B denote the event that a person is overweight.
(a) Since 20% of the group is hypertensive and the individual is selected at random from this group, we conclude that P(A) = .2. This is the unconditional probability of A.
Table 2 Body Weight and Hypertension

                   Overweight   Normal Weight   Underweight   Total
Hypertensive       .10          .08             .02           .20
Not hypertensive   .15          .45             .20           .80
Total              .25          .53             .22           1.00
(b) When we are given the information that the selected person is overweight, the categories in the second and third columns of Table 2 are not relevant to this person. The first column shows that among the subgroup of overweight persons, the proportion having hypertension is .10/.25. Therefore, given the information that the person is in this subgroup, the probability that he or she is hypertensive is
$$P(A|B) = \frac{.10}{.25} = .4$$
Noting that P(AB) = .10 and P(B) = .25, we have derived P(A|B) by taking the ratio P(AB)/P(B). In other words, P(A|B) is the proportion of the population having the characteristic A among all those having the characteristic B.
The conditional probability of A given B is denoted by P(A|B) and is defined by the formula
$$P(A|B) = \frac{P(AB)}{P(B)}$$
Equivalently, this formula can be written
$$P(AB) = P(B)P(A|B)$$
This latter version is called the multiplication law of probability.
Similarly, the conditional probability of B given A can be expressed
$$P(B|A) = \frac{P(AB)}{P(A)}$$
which gives the relation P(AB) = P(A)P(B|A). Thus the multiplication law of probability states that the conditional probability of an event multiplied by the probability of the conditioning event gives the probability of the intersection.
The multiplication law can be used in one of two ways, depending on convenience. When it is easy to compute P(B) and P(AB) directly, these values can be used to compute P(A|B), as in Example 10. On the other hand, if it is easy to calculate P(B) and P(A|B) directly, these values can be used to compute P(AB).
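The calculation in Example 10 can be reproduced from the cell proportions of Table 2. The sketch below is only an illustration: the dictionary layout, the helper `P`, and the variable names are assumptions made for this example, and the cell values are the six proportions printed in the table.

```python
from fractions import Fraction

# Joint proportions from Table 2: (hypertension status, weight class) -> proportion.
table = {
    ("hypertensive", "overweight"): Fraction(10, 100),
    ("hypertensive", "normal"): Fraction(8, 100),
    ("hypertensive", "underweight"): Fraction(2, 100),
    ("not hypertensive", "overweight"): Fraction(15, 100),
    ("not hypertensive", "normal"): Fraction(45, 100),
    ("not hypertensive", "underweight"): Fraction(20, 100),
}

def P(predicate):
    """Total proportion of the cells that satisfy the predicate."""
    return sum((p for cell, p in table.items() if predicate(cell)), Fraction(0))

p_A = P(lambda cell: cell[0] == "hypertensive")                   # unconditional P(A)
p_B = P(lambda cell: cell[1] == "overweight")                     # P(B)
p_AB = P(lambda cell: cell == ("hypertensive", "overweight"))     # P(AB)

p_A_given_B = p_AB / p_B   # definition of conditional probability

print(float(p_A), float(p_B), float(p_A_given_B))
```

Multiplying back, `p_B * p_A_given_B` recovers `p_AB`, which is the multiplication law read in the other direction.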
EXAMPLE 11
A list of important customers contains 25 names. Among them 20 persons have their accounts in good standing while 5 are delinquent. Two persons will be selected at random from this list and the status of their accounts checked. Calculate the probability that:
(a) Both accounts are delinquent.
(b) One account is delinquent and the other is in good standing.
We will use the symbols D for delinquent and G for good standing, and attach suffixes to identify the order of the selection. For instance, $G_1D_2$ will represent the event that the first account checked is in good standing, and the second delinquent.
(a) Here the problem is to calculate $P(D_1D_2)$. Evidently, $D_1D_2$ is the intersection of the two events $D_1$ and $D_2$. Using the multiplication law, we write
$$P(D_1D_2) = P(D_1)P(D_2|D_1)$$
In order to calculate $P(D_1)$ we need only consider selecting one account at random from 20 good and 5 delinquent accounts. Clearly, $P(D_1) = \frac{5}{25}$. The next step is to calculate $P(D_2|D_1)$. Given that $D_1$ has occurred, there will remain 20 good and 4 delinquent accounts at the time the second selection is made. Therefore, the conditional probability of $D_2$ given $D_1$ is $P(D_2|D_1) = \frac{4}{24}$. Multiplying these two probabilities we get
$$P(\text{both delinquent}) = P(D_1D_2) = \frac{5}{25} \times \frac{4}{24} = \frac{1}{30} = .033$$
(b) The event [exactly one delinquent] is the union of the two incompatible events $G_1D_2$ and $D_1G_2$. The probability of each of these can be calculated by the multiplication law as in part (a). Specifically,
$$P(G_1D_2) = P(G_1)P(D_2|G_1) = \frac{20}{25} \times \frac{5}{24} = \frac{1}{6}$$
$$P(D_1G_2) = P(D_1)P(G_2|D_1) = \frac{5}{25} \times \frac{20}{24} = \frac{1}{6}$$
The required probability is $P(G_1D_2) + P(D_1G_2) = \frac{2}{6} = .333$.
Remark: In solving the problems of Example 11, we have avoided listing the sample space corresponding to the selection of two accounts from a list of 25. A judicious use of the multiplication law has made it possible to focus attention on one draw at a time, thus simplifying the probability calculations.
A situation that merits special attention occurs when the conditional probability P(A|B) turns out to be the same as the unconditional probability P(A). Information about the occurrence of B then has no bearing on the assessment of the probability of A. Therefore, when we have the equality P(A|B) = P(A), we say the events A and B are independent.
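The draw-by-draw reasoning of Example 11 and the equality P(A|B) = P(A) just described can both be checked by brute force. The following sketch is an illustration under the assumptions of that example (20 good and 5 delinquent accounts, two drawn without replacement); it lists every ordered pair of draws, which is exactly the enumeration the multiplication law lets us avoid. The helper `P` and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# 25 accounts: 20 in good standing ("G") and 5 delinquent ("D").
accounts = ["G"] * 20 + ["D"] * 5

# Every ordered pair of distinct accounts is an equally likely (first, second) draw.
draws = list(permutations(range(len(accounts)), 2))

def P(event):
    """Share of ordered draws for which event(first_status, second_status) holds."""
    hits = sum(1 for i, j in draws if event(accounts[i], accounts[j]))
    return Fraction(hits, len(draws))

p_both_delinquent = P(lambda first, second: first == "D" and second == "D")
p_exactly_one = P(lambda first, second: {first, second} == {"G", "D"})

# Compare the conditional and unconditional probabilities of D2.
p_D1 = P(lambda first, second: first == "D")
p_D2 = P(lambda first, second: second == "D")
p_D2_given_D1 = p_both_delinquent / p_D1
independent = (p_both_delinquent == p_D1 * p_D2)

print(p_both_delinquent, p_exactly_one)
print(p_D2_given_D1, p_D2, independent)
```

Because the draws are made without replacement, the conditional probability of a delinquent second account differs from its unconditional probability, so the two draws fail the equality that defines independence.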
Two events A and B are independent if
$$P(A|B) = P(A)$$
Equivalent conditions are
$$P(B|A) = P(B)$$
or
$$P(AB) = P(A)P(B)$$
The last form follows by recalling that P(A|B) = P(AB)/P(B), so that the condition P(A|B) = P(A) is equivalent to P(AB) = P(A)P(B), which may be used as an alternative definition of independence. The other equivalent form is obtained from
$$P(B|A) = \frac{P(AB)}{P(A)} = \frac{P(A)P(B)}{P(A)} = P(B)$$
The form P(AB) = P(A)P(B) shows that the definition of independence is symmetric in A and B.
EXAMPLE 12
Are the two events A = [hypertensive] and B = [overweight] independent for the population in Example 10?
Referring to that example, we have
$$P(A) = .2$$
$$P(A|B) = \frac{P(AB)}{P(B)} = \frac{.10}{.25} = .4$$
so the two events A and B are not independent.
We introduced the condition of independence in the context of checking a given assignment of probability to see if P(A|B) = P(A). A second use of this condition is in the assignment of probability when the experiment consists of two physically unrelated parts. When events A and B refer to unrelated parts of an experiment, AB is assigned the probability P(AB) = P(A)P(B).
EXAMPLE 13
Engineers use the term "reliability" as an alternative name for the probability that a device does not fail. Suppose a mechanical system consists of two components that function independently. From extensive testing it is known that component 1 has reliability .98 and component 2 has reliability .95. If the system can function only if both components function, what is the reliability of the system?
Consider the events
$A_1$: component 1 functions
$A_2$: component 2 functions
S: system functions
Here we have the event relation $S = A_1A_2$. Given that the components operate independently, we take the events $A_1$ and $A_2$ to be independent. Consequently, the multiplication law assigns
$$P(S) = P(A_1)P(A_2) = .98 \times .95 = .931$$
and the system reliability is .931.
In this system, the components are said to be connected in series, and the system is called a series system. A two-battery flashlight is an example. The conventional diagram for a series system is shown in the illustration:
[Illustration: components 1 and 2 connected in series.]
EXAMPLE 14
Suppose a different system is constructed with the two components mentioned in Example 13. Functioning of any one component is sufficient for this system to function. What is the reliability of this system?
Here we have the event relation $S = A_1 \cup A_2$. Employing the addition law
$$P(S) = P(A_1) + P(A_2) - P(A_1A_2) = .98 + .95 - (.98 \times .95) = .999$$
Alternatively, we may wish to focus on the event $\bar{S}$ that the system fails. Since system failure occurs only when both components fail, we have the event relation $\bar{S} = \bar{A}_1\bar{A}_2$. By independence and the law of complementation we have
$$P(\bar{S}) = P(\bar{A}_1\bar{A}_2) = P(\bar{A}_1)P(\bar{A}_2) = [1 - P(A_1)][1 - P(A_2)] = .02 \times .05 = .001$$
$$P(S) = 1 - P(\bar{S}) = 1 - .001 = .999$$
Here the components are said to be connected in parallel, and the system is called a parallel system. One component is redundant but is included in the system in order to increase the reliability. A twin-engine airplane is an example. The conventional diagram for a parallel system is shown in the illustration:
[Illustration: components 1 and 2 connected in parallel.]
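The series and parallel calculations of Examples 13 and 14 extend naturally to any number of independent components. The sketch below is a hedged illustration: the function names are hypothetical, independence of the components is assumed throughout, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

def series_reliability(reliabilities):
    """Series system: functions only if every independent component functions."""
    return prod(reliabilities)

def parallel_reliability(reliabilities):
    """Parallel system: fails only if every independent component fails."""
    return 1 - prod(1 - r for r in reliabilities)

# Component reliabilities from Examples 13 and 14.
components = [0.98, 0.95]

print(round(series_reliability(components), 3))
print(round(parallel_reliability(components), 3))
```

The parallel formula is the complement route of Example 14: multiply the failure probabilities, then subtract the product from 1.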
Caution: Do not confuse the terms "incompatible events" and "independent events." We say A and B are incompatible when their intersection AB is empty, so P(AB) = 0. On the other hand, if A and B are independent, P(AB) = P(A)P(B). Both these properties cannot hold as long as A and B have non-zero probabilities.
EXERCISES
5.1 Suppose a fair die has its even-numbered faces painted red, and the odd-numbered faces are white. Consider the experiment of rolling the die once, and the events
A = [2 or 3 shows up]
B = [a red face shows up]
Find the following probabilities: (a) P(A) (b) P(B) (c) P(AB) (e) $P(A \cup B)$
5.2 For two events A and B, the following probabilities are given: P(A) = .5, P(B) = .25, P(A|B) = .8
Use the appropriate laws of probability to calculate: (a) $P(\bar{A})$ (b) P(AB) (c) $P(A \cup B)$.
5.3 Records of student patients at a dentist's office concerning fear of visiting the dentist suggest the following proportions:
School: Elementary   Middle   High
For a student selected at random, consider the events A = [Fear], M = [Middle school].
(a) Find the probabilities P(A), P(M), P(AM), $P(A \cup M)$
(b) Are A and M independent?
5.4 An urn contains 2 green balls and 3 red balls. Suppose two balls will be drawn at random one after another and without replacement (that is, the first ball drawn is not returned to the urn before the second one is drawn).
(a) List the elementary outcomes, and assign probabilities.
(b) Find the probabilities of the events
A = [a green ball appears in the first draw]
B = [a green ball appears in the second draw]
(c) Are the two events independent? Why or why not?
5.5 Refer to Exercise 5.4. Now suppose two balls will be drawn with replacement (that is, the first ball drawn will be returned to the urn before the second draw). Repeat parts (a), (b), and (c).
5.6 In a region, 15% of the adult population are smokers, 0.86% are smokers with emphysema, and 0.24% are non-smokers with emphysema.
(a) What is the probability that a person, selected at random, has emphysema?
(b) Given that the selected person is a smoker, what is the probability that this person has emphysema?
(c) Given that the selected person is not a smoker, what is the probability that this person has emphysema?
*5.7 Suppose A and B are independent. Show that the events in each pair are independent: (a) A and B, (b) A and B.
5.8 In a shipment of 15 room air conditioners, there are 4 with defective thermostats. Two air conditioners will be selected at random and inspected one after another. Find the probability that:
(a) The first is defective.
(b) The first is defective and the second good.
(c) Both are defective.
(d) The second one is defective.
(e) Exactly one is defective.
5.9 Refer to Exercise 5.8. Now suppose three air conditioners will be selected at random and checked one after another. Find the probability that:
(a) All three are good.
(b) The first two are good and the third defective.
(c) Two are good and one defective.
5.10 Approximately 40% of the Wisconsin population has type O blood. If 4 persons are selected at random to be donors, find P[at least one type O].
5.11 The primary cooling unit in a nuclear power plant has reliability .999. There is also a back-up cooling unit to substitute for the primary unit when it fails. The reliability of the back-up unit is .890. Find the reliability of the cooling system of the power plant. Assume independence.
5.12 An accountant screens large batches of bills according to the following sampling inspection plan. She inspects 4 bills chosen at random from each batch and passes the batch if, among the 4, none is irregular. Find the probability that a batch will be passed if, in fact:
(a) 5% of its bills are irregular.
(b) 20% of its bills are irregular.
5.13 Of the patients reporting to a clinic with the symptoms of sore throat and fever, 25% have strep-throat, 5lo/o have an allergy, and 10% have both.
(a) What is the probability that a patient selected at random has either strep-throat, an allergy, or both?
(b) Are the events "strep-throat" and "allergy" independent?
5.14 Suppose independent components having P[Function] = ,.g will be placed in parallel to make the probability that the system functions greater than or equal to .99. How many components are needed?
6. RANDOM SAMPLING FROM A FINITE POPULATION
In our earlier examples of probability calculations, we have used the phrase "randomly selected" to mean that all possible selections are equally likely. It usually is not difficult to enumerate all the elementary outcomes when both the population size and sample size are small numbers. With larger numbers, making a list of all the possible choices becomes a tedious job. However, a counting rule is available that enables us to solve many probability problems.
We begin with an example where the population size and the sample size are both small numbers so all possible samples can be conveniently listed.
EXAMPLE 15
There are five qualified applicants for two editorial positions on a college newspaper. Two of these applicants are men and three women. If the positions are filled by randomly selecting two of the five applicants, what is the probability that neither of the men is selected?
Suppose the three women applicants are identified as a, b, and c, and the two men as d and e. Two members are selected at random from the
Population: {a, b, c, d, e} (a, b, c: women; d, e: men)
The possible samples may be listed as
{a, b}   {b, c}   {c, d}   {d, e}
{a, c}   {b, d}   {c, e}
{a, d}   {b, e}
{a, e}
As the list shows, our sample space has 10 elementary outcomes. The notion of random selection entails that these are all equally likely, so each is assigned the probability $\frac{1}{10}$. Let A represent the event that two women are selected. Scanning our list, we see that A consists of three elementary outcomes:
{a, b}, {a, c}, {b, c}
Consequently,
$$P(A) = \frac{\text{\# elements in } A}{\text{\# elements in } \mathcal{S}} = \frac{3}{10} = .3$$
Note that our probability calculation in Example 15 only requires knowledge of the two counts: the number of elements in $\mathcal{S}$ and the number of elements in A. Can we arrive at these counts without formally listing the sample space? An important counting rule comes to our aid.
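For a population as small as the one in Example 15, the listing itself is easy to automate. The sketch below is an illustration with hypothetical variable names: it generates all samples of size 2 and counts those made up of women only.

```python
from fractions import Fraction
from itertools import combinations

women = {"a", "b", "c"}
men = {"d", "e"}
population = sorted(women | men)

# All unordered samples of size 2: the sample space of Example 15.
samples = list(combinations(population, 2))

# Event A: neither man is selected, i.e. both selected applicants are women.
no_men = [sample for sample in samples if set(sample) <= women]

p_no_men = Fraction(len(no_men), len(samples))

print(len(samples), no_men, p_no_men)
```

Only the two counts, `len(samples)` and `len(no_men)`, enter the final ratio, which is the observation that motivates a counting rule.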
The Rule of Combinations
The number of possible choices of r objects from a group of N distinct objects is denoted by the symbol $\binom{N}{r}$, which reads as "N choose r." The count is given by
$$\binom{N}{r} = \frac{N \times (N - 1) \times \cdots \times (N - r + 1)}{r \times (r - 1) \times \cdots \times 2 \times 1}$$
More specifically, the numerator of the formula $\binom{N}{r}$ is the product of r consecutive integers starting with N and proceeding downward. The denominator is also the product of r consecutive integers but starting with r and proceeding down to 1.
Although not immediately apparent, there is a certain symmetry in the counts $\binom{N}{r}$. The process of selecting r objects is the same as choosing (N - r) objects to leave behind. Because every choice of r objects corresponds to a choice of (N - r) objects,
$$\binom{N}{r} = \binom{N}{N - r}$$
where the left side counts the ways to choose r objects and the right side counts the ways to leave N - r objects.
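This relation often simplifies calculations. Table 1 in Appendix B provides the values of $\binom{N}{r}$ for N ranging from 2 to 20. Since $\binom{N}{N} = 1$, we take $\binom{N}{0} = 1$.
EXAMPLE 16
Calculate the values of $\binom{5}{2}$, $\binom{15}{4}$, $\binom{15}{11}$.
$$\binom{5}{2} = \frac{5 \times 4}{2 \times 1} = 10, \qquad \binom{15}{4} = \frac{15 \times 14 \times 13 \times 12}{4 \times 3 \times 2 \times 1} = 1365$$
Using the relation $\binom{N}{r} = \binom{N}{N - r}$ we have
$$\binom{15}{11} = \binom{15}{4} = 1365$$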
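The product formula of the rule of combinations translates into a short loop. The sketch below is an illustration: `choose` is a hypothetical helper written to mirror the formula, while `math.comb` is the standard-library function (Python 3.8 or later) that returns the same counts.

```python
from math import comb

def choose(N, r):
    """N choose r by the product formula: r descending factors over r x (r-1) x ... x 1."""
    numerator = 1
    denominator = 1
    for k in range(r):
        numerator *= N - k      # N, N-1, ..., N-r+1
        denominator *= r - k    # r, r-1, ..., 1
    return numerator // denominator

# The three values of Example 16.
print(choose(5, 2), choose(15, 4), choose(15, 11))

# The symmetry relation: choosing r objects is the same as leaving N - r behind.
print(choose(15, 11) == choose(15, 4))
```

Using the symmetry relation before computing by hand keeps the number of factors small: 4 factors for 15 choose 4 instead of 11 for 15 choose 11.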
EXAMPLE 17
Refer to Example 15 concerning a random selection of two persons from a group of two men and three women. Calculate the required probability without listing the sample space.
The number of ways 2 persons can be selected out of 5 is given by
$$\binom{5}{2} = \frac{5 \times 4}{2 \times 1} = 10$$
Random selection means that the 10 outcomes are equally likely. Next, we are to count the outcomes that are favorable to the event A that both selected persons are women. Two women can be selected out of three in
$$\binom{3}{2} = \frac{3 \times 2}{2 \times 1} = 3 \text{ ways}$$
Taking the ratio we have the result
$$P(A) = \frac{3}{10} = .3$$
EXAMPLE 18
The Dean's office has received 12 nominations from which to designate 4 student representatives to serve on a Campus Curriculum Committee. Among the nominees, 5 are liberal arts majors and 7 are science majors.
(a) How many ways can 4 students be selected from the group of 12?
(b) How many selections are possible that would include 1 science major and 3 liberal arts majors?
(c) If the selection process were random, what is the probability that 1 science major and 3 liberal arts majors would be selected?
(a) According to the counting rule $\binom{N}{r}$, the number of ways 4 students can be selected out of 12 is
$$\binom{12}{4} = \frac{12 \times 11 \times 10 \times 9}{4 \times 3 \times 2 \times 1} = 495$$
(b) One science major can be chosen from 7 in $\binom{7}{1} = 7$ ways. Also, 3 liberal arts majors can be chosen from 5 in
$$\binom{5}{3} = \frac{5 \times 4 \times 3}{3 \times 2 \times 1} = 10 \text{ ways}$$
Each of the 7 choices of a science major can accompany each of the 10 choices of three liberal arts majors. Reasoning from the tree diagram, we conclude that the number of possible samples with the stated composition is
$$\binom{7}{1} \times \binom{5}{3} = 7 \times 10 = 70$$
(c) Random sampling requires that the 495 possible samples are all equally likely. Of these, 70 are favorable to the event A = [1 science and 3 liberal arts majors]. Consequently,
$$P(A) = \frac{70}{495} = .141$$
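The counts in Example 18 can be obtained from the rule of combinations and then confirmed by listing all committees. The sketch below is illustrative only; the labels "S" and "L" and the variable names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Counting-rule route: 12 nominees, 7 science ("S") and 5 liberal arts ("L").
total_samples = comb(12, 4)
favorable = comb(7, 1) * comb(5, 3)
p_A = Fraction(favorable, total_samples)

# Brute-force check: list every committee of 4 and count those with exactly 1 science major.
nominees = ["S"] * 7 + ["L"] * 5
enumerated = sum(
    1
    for committee in combinations(range(len(nominees)), 4)
    if sum(nominees[i] == "S" for i in committee) == 1
)

print(total_samples, favorable, enumerated)
print(p_A, round(float(p_A), 3))
```

Multiplying `comb(7, 1)` by `comb(5, 3)` is the tree-diagram argument in code: every choice of a science major can be paired with every choice of three liberal arts majors.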
The notion of a random sample from a finite population is crucial to statistical inference. In order to generalize from a sample to the population, it is imperative that the sampling process be impartial. This criterion is evidently met if we allow the selection process to be such that all possible samples are given equal opportunity to be selected. This is precisely the idea behind the term random sampling, and a formal definition can be phrased as follows.
A sample of size n selected from a population of N distinct objects is said to be a random sample if each collection of size n has the same probability $1 / \binom{N}{n}$ of being selected.
Note that this is a conceptual rather than an operational definition of a random sample. On the surface it might seem that a haphazard selection by the experimenter would result in a random sample. Unfortunately, a seemingly haphazard selection may have hidden bias. For instance, when asked to name a random integer between 1 and 9, more persons respond with 7 than any other number. Also, odd numbers are more popular than even numbers. Therefore, the selection of objects must be entrusted to some device that cannot think; in other words, some sort of mechanization of the selection process is needed to make it truly haphazard!
To accomplish the goal of a random selection, one may make a card for each of the N members of the population, shuffle the cards, and then draw n cards. This method is easy to understand but rather awkward to apply to large size populations. Nowadays, random numbers are conveniently generated on a computer (see, for instance, Exercise 6.10).
At the beginning of this chapter, we stated that probability constitutes the major vehicle of statistical inference. In the context of random sampling from a population, the tools of probability enable us to gauge the likelihood of various potential outcomes of the sample process. Ingrained in our probability calculations lies the artificial assumption that the composition of the population is known. The route of statistical inference is exactly in the opposite direction, as depicted in Figure 4. It is the composition of the population that is unknown while we have at hand the observations (data) resulting from a random sample. Our object of inference is to ascertain what compositions (or models) of the population are compatible with the observed sample data. We view a model as plausible unless probability calculations based on this model make the sample outcome seem unlikely.
[Figure 4. Probability vs. statistical inference. A probability problem asks: "What is the probability that the sample will have ...?" Statistical inference asks: "What models of the population (black box) make the observed sample plausible?"]
EXERCISES
6.1 Evaluate. (a) (b)
('f) ('f) (T) (?) (lr) ff) (c)
(e)
(d)
(f)
6.2 List all the samples from {a, b, c, d, e} when (a) 2 out of 5 are selected, (b) 3 out of 5 are selected. Count the number of samples in each case.
6.3 Out of 12 people applying for an assembly job, 3 cannot do the work. Suppose two persons will be hired.
(a) How many distinct pairs are possible?
(b) In how many of the pairs will 0 or 1 people not be able to do the work?
(c) If two persons are chosen in a random manner, what is the probability that neither will be able to do the job?
6.4 After a preliminary screening, the list of qualified jurors consists of 10 males and 7 females. The 5 jurors the judge selects from this list are all males. Did the selection process seem to discriminate against females? Answer this by computing the probability of having no female members in the jury if the selection is random.
6.5 Suppose you participate in a lottery conducted by a local store to give away four prizes. Each customer is allowed to place two cards in the barrel. Suppose the barrel contains 5000 cards from which the four winning cards will be chosen at random. What is the probability that at least one of your cards will be drawn?
6.6 A batch of 18 items contains 4 defectives. If three items are sampled at random, find the probability of the event:
(a) A = [none of the defectives appear]
(b) B = [exactly two defectives appear]
6.7 Ordered sampling vs. unordered sampling. Refer to Exercise 6.6. Suppose the sampling of three items is done by randomly choosing one item after another and without replacement. The event A can then be described as $G_1G_2G_3$, where the suffixes refer to the order of the draws. Use the method of Example 11 to calculate P(A) and P(B). Verify that you get the same results as in Exercise 6.6. This illustrates the following fact: to arrive at a random sample, we may randomly draw one object at a time without replacement, and then disregard the order of the draws.
6.8 In a cloud-seeding experiment, 3 clouds will be randomly selected from 10 for seeding. If, in fact, 6 of these clouds have a high moisture content, what is the probability that only high-moisture clouds are seeded?
6.9 Are the following methods of selection likely to produce a random sample of 5 students from your school? Explain. (a) Pick 5 students throwing Frisbees on the mall. (b) Pick 5 students who are studying in the library on Friday night. (c) Select 5 students sitting near you in your statistics course.
6.10 Using the computer to generate a random sample. The MINITAB command IRANDOM will select random integers between specified limits that include the endpoints.
[MINITAB session, partly illegible in this copy: an IRANDOM command requesting 6 integers between 1 and an upper limit that cannot be read, followed by the output 57. 48. 17. 55. 38. 58.]
Since the sampling is without replacement, any duplicates in the list are to be discarded. Additional random numbers are required to replace them. Select 10 random integers between 1 and [upper limit illegible in this copy].
KEY IDEAS
The probability model of an experiment is described by:
(a) the sample space, a list or statement of all possible distinct outcomes;
(b) an assignment of probabilities to all the elementary outcomes, with $P(e) \ge 0$ and $\sum P(e) = 1$, where the sum extends over all e in $\mathcal{S}$.
The probability of an event A is the sum of the probabilities of all the elementary outcomes that are in A.
A uniform probability model holds when all the elementary outcomes in $\mathcal{S}$ are equiprobable. With a uniform probability model,
$$P(A) = \frac{\text{No. of } e \text{ in } A}{\text{No. of } e \text{ in } \mathcal{S}}$$
P(A), viewed as the long-run relative frequency of A, can be approximately determined by repeating the experiment a large number of times.
The three basic laws of probability:
Law of Complementation: $P(A) = 1 - P(\bar{A})$
Addition Law: $P(A \cup B) = P(A) + P(B) - P(AB)$
Multiplication Law: $P(AB) = P(B)\,P(A|B)$
These are useful in probability calculations when events are formed with the operations of complement, union, and intersection.
The concept of conditional probability is useful to determine how the probability of an event A must be revised when another event B has occurred. It forms the basis of the multiplication law of probability and the notion of independence of events.
Conditional probability of A given B: $P(A|B) = \dfrac{P(AB)}{P(B)}$
The notion of random sampling is formalized by requiring that all possible samples are equally likely to be selected. The rule of combinations facilitates the calculation of probabilities in the context of random sampling from N distinct units.
7. EXERCISES
7.1 Identify the statement that best describes each P(A).
(a) P(A) = .93    (i) P(A) is incorrect.
(b) P(A) = .30    (ii) A rarely occurs.
(c) P(A) = 3.0    (iii) A occurs moderately often.
7.2 Construct a sample space for each of the following experiments: (a) Someone claims to be able to taste the difference between the same brand of bottled, tap, and canned draft beer. A glass of each is poured and given to the subject in an unknown order. The subject is asked to identify the contents of each glass. The number of correct identifications will be recorded. (b) The number of traffic fatalities in a state next year. (c) The length of time a new video recorder will continue to work satisfactorily without service. Which of these sample spaces are discrete and which are continuous?
7.3 Identify these events in Exercise 7.2: (a) {Not more than one correct identification} (b) {Fewer accidents than last year} (Note: If you don't know last year's value, use 345.) (c) {Longer than the 90-day warranty but less than 425.4 days}
7.4 A driver is stopped for erratic driving, and the alcohol content of his blood is checked. Specify the sample space and the event A = {Level exceeds legal limit} if the legal limit is .10.
7.5 The Wimbledon men's tennis championship ends when one player wins three sets. (a) How many elementary outcomes end in 3 sets? In 4? (b) If the players are evenly matched, what is the probability that the tennis match ends in 4 sets?
7.6 Three jars contain different chemicals R, S, and T. Their identifying labels were accidentally dropped during transportation. Suppose a careless technician decides to put the labels on these jars at random without inspecting their contents. (a) List the sample space. (b) State the compositions of the events A = [there is exactly one match], B = [all jars receive wrong labels].
7.7 Refer to Exercise 7.6. (a) Assign probabilities to the elementary outcomes. (b) Find P(A) and P(B).
[Figure: Plot arrangement]
7.8 To compare two varieties of wheat, say a and b, a field trial will be conducted on four square plots located in two rows and two columns. Each variety will be planted on two of these plots. (a) List all possible assignments for variety a. (b) If the assignments are made completely at random, find the probability that the plots receiving variety a are: (i) In the same column. (ii) In different rows and different columns.
7.9 Refer to Exercise 7.8. Instead of a completely random choice, suppose a plot is chosen at random from each row and assigned to a. Find the probability that the plots receiving a are in the same column.
7.10 Chevalier de Méré, a French nobleman of the 17th century, reasoned that in a single throw of a fair die $P(1) = \frac{1}{6}$, so in two throws $P(1 \text{ appears at least once}) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$. What is wrong with the above reasoning? Use the sample space of Exercise 3.3 to obtain the correct answer.
7.11 A letter is chosen at random from the word STATISTICIAN.
(a) What is the probability that it is a vowel? (b) What is the probability that it is a T?
7.12 Does the uniform model apply to the following observations? Explain. (a) Day of week on which the maximum pollution reading for nitrous oxides occurs downtown in a large city. (b) Day of week on which the monthly maximum temperature occurs. (c) Week of year for peak retail sales of new cars.
7.13 A plant geneticist crosses two parent strains, each with gene pairs of type aA. An offspring receives one gene from each parent. (a) Construct the sample space for the genetic type of the offspring. (b) Assign probabilities assuming the selection of genes is random. (c) If A is dominant, so the aa offspring are short while all the others are tall, find P[short offspring].
7.14 A three-digit number is formed by arranging the digits 1, 5, and 6 in a random order. (a) List the sample space. (b) Find the probability of getting a number larger than 400. (c) What is the probability that an even number is obtained?
7.15 A late shopper for Valentine's flowers calls by phone to have a flower wrapped. The store has only 4 roses, of which 3 will open by the next day, and 6 tulips, of which 2 will open by the next day. (a) Construct a Venn diagram and show the events A = [rose] and B = [will open next day]. (b) If the store selects one flower at random, find the probability that it will not open by next day.
7.16 Express the following statements in the notation of the event operations. (a) A occurs and B does not. (b) Neither A nor B occurs. (c) Exactly one of the events A and B occurs.
7.17 Suppose each of the numbers .13, .47, and .68 represents the probability of one of the events A, AB, and A ∪ B. Connect the probabilities to the appropriate events.
7.18 From the probabilities exhibited in this Venn diagram, find P(A), P(AB), P(B ∪ C), P(BC).
7.19 The medical records of the male diabetic patients reporting to a clinic during one year provide the following percentages:
                 Light Case               Serious Case
                 Diabetes in Parents      Diabetes in Parents
Age              Yes        No            Yes        No
Below 40         15         10
Above 40         15         20
Suppose a patient is chosen at random from this group, and the events A, B, and C are defined as A: He has a serious case. B: He is below 40. C: His parents are diabetic. (a) Find the probabilities P(A), P(B), P(BC), P(ABC). (b) Describe the following events verbally and find their probabilities: (i) AE, (ii) A u e, (iii) Ane.
7.20 A sample space consists of 9 elementary outcomes $e_1, e_2, \ldots, e_9$ whose probabilities are
$P(e_1) = P(e_2) = .08$, $P(e_3) = P(e_4) = P(e_5) = .1$
$P(e_6) = P(e_7) = .2$, $P(e_8) = P(e_9) = .07$
Suppose A = {er, as, as}, B = {ez, er, e", en}. (a) Calculate P(A), P(B), P(AB). (b) Using the addition law of probability, calculate P(A ∪ B). (c) List the composition of the event A ∪ B, and calculate P(A ∪ B) by adding the probabilities of the elementary outcomes. (d) Calculate $P(\bar{B})$ from P(B), and also by listing the composition of $\bar{B}$.
7.21 Using event relations, express the following events in terms of the three events A, B, and C. (a) All three events occur. (b) At least one of the three events occurs. (c) A and B occur and C does not. (d) Only B occurs.
7.22 The following data relate to the proportions in a population of drivers: A [definition not legible in this copy], B = {Accident in current year}
The probabilities are given in the accompanying Venn diagram. Find P(B|A). Are A and B independent?
7.23 Given P(AB) = .4 and P(B) = .8, find P(A|B). If further P(A) = .5, are A and B independent?
7.24 Referring to Exercise 7.19, (a) Suppose a patient will be chosen at random from the group of patients who are below age 40. What is the probability that this patient will have a serious case of the disease? Explain how this can be interpreted as a conditional probability. (b) Calculate the following conditional probabilities and interpret them in light of your answer to (a): P(A|B), P(C|A).
7.25 The following probabilities are given for two events A and B:
P(A) : +, P(B) : +, P(A|B) : 34 (a) Using the appropriate laws of probability, calculate P(A), P(AB), and P(A ∪ B). (b) Draw a Venn diagram to determine P(Ab.
7.26 Of three events A, B, and C, suppose events A and B are independent and events B and C are mutually exclusive. Their probabilities are P(A) = .7, P(B) = .1, and P(C) = .3. Express the following events in set notation and calculate their probabilities: (a) Both B and C occur. (b) At least one of A and B occurs. (c) B does not occur. (d) All three events occur.
7.27 Mr. Hope, a character apprehended by Sherlock Holmes, was driven by revenge to commit two murders. He presented two seemingly identical pills, one containing a deadly poison, to an adversary who selected one while Mr. Hope took the other. The entire procedure was then to be repeated with the second victim. Mr. Hope felt that Providence would protect him, but what is the probability of the success of his endeavor?
7.28 Suppose P(A) = .4, P(B) = .8, and the events are independent. Calculate the probabilities of the four intersections needed to fill in the accompanying table:
7.29 An electronic scanner is successful in detecting flaws in a material in 70% of the cases. Three material specimens containing flaws will be tested with the scanner. Assume that the tests are independent. (a) List the sample space and assign probabilities to the simple events.
(b) Find the probability that the scanner is successful in at least two of the three cases.
7.30 In an optical sensory experiment, a subject shows a fast response (F), a delayed response (D), or no response at all (N). The experiment will be performed on two subjects. (a) Using a tree diagram, list the sample space. (b) Suppose, for each subject, P(F) = .4, P(D) = .3, P(N) = .3, and the responses of different subjects are independent. Assign probabilities to the elementary outcomes. (c) Find the probability that at least one of the subjects shows a fast response. (d) Find the probability that both of the subjects respond.
7.31 A salesman has probability L of making a sale. How many stops are necessary in order to have probability greater than .95 of making at least one sale (assume independence)?
7.32 Concerning two-children families, suppose the following probabilities are specified: $P(B_1)$, $P(B_2|B_1)$ = P(2nd child is boy | 1st is boy), $P(B_2|G_1)$
7 13 I
2
Calculate all the probabilities required to fill in this table.
7.33 For a safe flight of a twin-engine plane, at least one of the engines must be working. Suppose the engines function independently and the probability of failure of each engine is .008. Find the probability of a safe flight.
7.34 For a newborn child, the probability of survival beyond age 50 is .89, and the probability of survival beyond age 60 is .80. What is the probability that a 50-year-old person will live another 10 years? (Hint: The required probability is a conditional probability.)
7.35 A college senior is selected, at random, from each state. Next, one senior is selected at random from the group of 50. Does this procedure produce a senior selected at random from those in the United States?
7.36 An instructor will choose 3 problems from a set of 5 containing 3 hard and 3 easy problems. If the selection is made at random, what is the probability that only the hard problems are chosen?
*7.37 Birthdays. It is somewhat surprising to learn the probability that two persons in a class share the same birthday. As an approximation, assume that the 365 days are equally likely birthdays.
(a) What is the probability that, among 3 persons, at least two have the same birthday? (Hint: The reasoning associated with a tree diagram shows that there are 365 × 365 × 365 possible birthday outcomes. Of these, 365 × 364 × 363 correspond to no common birthday.)
(b) Generalize the above reasoning to N persons. Show that
$$P[\text{no common birthday}] = \frac{365 \times 364 \times \cdots \times (365 - N + 1)}{(365)^N}$$
(Some numerical values are

N                          5      9      18     22     23
P[no common birthday]    .973   .905   .653   .524   .493

We see that with N = 23 persons, the probability is greater than $\frac{1}{2}$ that at least two share a common birthday.)
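The formula in part (b) is easy to check numerically. The following is only a sketch, assuming the equally-likely-birthdays model stated in the exercise; the function name `prob_no_common_birthday` is ours and does not come from the text.

```python
def prob_no_common_birthday(n_people, days=365):
    """P[no common birthday] among n_people, assuming equally likely days.

    Multiplies the factors (365/365)(364/365)...((365 - N + 1)/365),
    which equals 365 x 364 x ... x (365 - N + 1) / 365**N.
    """
    prob = 1.0
    for k in range(n_people):
        prob *= (days - k) / days
    return prob


# Print the probability of no match and of at least one match.
for n in (5, 9, 18, 22, 23):
    p_none = prob_no_common_birthday(n)
    print(f"N = {n:2d}: P[no common birthday] = {p_none:.3f}, "
          f"P[at least one match] = {1 - p_none:.3f}")
```

Multiplying the ratios one factor at a time avoids forming the very large integers 365 × 364 × ... and 365^N separately.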
CHAPTER 5
Probability Distributions
1. INTRODUCTION
2. RANDOM VARIABLES
3. PROBABILITY DISTRIBUTION OF A DISCRETE RANDOM VARIABLE
4. EXPECTATION (MEAN) AND STANDARD DEVIATION OF A PROBABILITY DISTRIBUTION
[Figure: distribution of the number of rescues per day; horizontal axis: Number of rescues/day]
Student sailors and other boaters on Lake Mendota are serviced by a boating rescue service. A long record for summer days provides a distribution of the number of rescues per day.
1. INTRODUCTION
A prescription for the probability model of an experiment contains two basic ingredients: the sample space, and the assignment of probability to each elementary outcome. In Chapter 4 we encountered several examples where the elementary outcomes had only qualitative descriptions rather than numerical values. For instance, with two tosses of a coin, the outcomes HH, HT, TH, and TT are pairs of letters that identify the occurrences of heads or tails. If a new vaccine is studied for the possible side effects of nausea, the response of each subject may be severe, moderate, or no feeling of nausea. These are qualitative outcomes rather than measurements on a numerical scale.
Often, the outcomes of an experiment are numerical values. Examples are the daily number of burglaries in a city, the hourly wages of students on summer jobs, and scores on a college placement examination. Even in the former situation where the elementary outcomes are only qualitatively described, interest frequently centers on some related numerical aspects. If a new vaccine is tested on 100 individuals, the information relevant for an evaluation of the vaccine may be the numbers of responses in the categories: severe, moderate, or no nausea. The detailed record of 100 responses can be dispensed with once we have extracted this summary. Likewise, for an opinion poll conducted on 500 residents to determine support for a proposed city ordinance, the information of particular interest is how many residents are in favor of the ordinance, and how many opposed. In these examples, the individual observations are not numerical, yet a numerical summary of a collection of observations forms the natural basis for drawing inferences. In this chapter we concentrate on the numerical aspects of experimental outcomes.
2. RANDOM VARIABLES
Focusing our attention on the numerical features of the outcomes, we introduce the idea of a random variable.
A random variable X associates a numerical value to each outcome of an experiment.
Corresponding to every elementary outcome of an experiment, a random variable assumes a numerical value, determined from some characteristic pertaining to the outcome. (In mathematical language, we say that a random variable X is a real-valued function defined on a sample space.) The word "random" serves as a reminder of the fact that, beforehand, we do not know the outcome of an experiment nor its associated value of X.
EXAMPLE 1 Consider X to be the number of heads obtained in three tosses of a coin. First, X is a variable since the number of heads in three tosses of a coin can have any of the values 0, 1, 2, or 3. Second, this variable is random in the sense that the value that would occur in a given instance cannot be predicted with certainty. We can, though, make a list of the elementary outcomes and the associated values of X.
Outcome    Value of X
HHH        3
HHT        2
HTH        2
HTT        1
THH        2
THT        1
TTH        1
TTT        0
Note that, for each elementary outcome there is only one value of X. However, several elementary outcomes may yield the same value. Scanning our list, we now identify the events (that is, the collections of the elementary outcomes) that correspond to the distinct values of X.
Numerical value of X as an event    Composition of the event
[X = 0]                             {TTT}
[X = 1]                             {HTT, THT, TTH}
[X = 2]                             {HHT, HTH, THH}
[X = 3]                             {HHH}
Guided by this example, we observe the general facts:
The events corresponding to the distinct values of X are incompatible. The union of these events is the entire sample space.
Typically, the possible values of a random variable X can be determined directly from the description of the random variable without listing the sample space. However, to assign probabilities to these values, treated as events, it is sometimes helpful to refer to the sample space.
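The view of a random variable as a function on the sample space can be illustrated with a short program. This is a sketch for Example 1 only; the names `heads_count`, `sample_space`, and `events` are ours, chosen for illustration.

```python
from collections import defaultdict
from itertools import product


def heads_count(outcome):
    """The random variable X: number of heads in an outcome such as 'HTH'."""
    return outcome.count("H")


# Sample space for three tosses of a coin: HHH, HHT, ..., TTT.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Group the elementary outcomes by the value X assigns to them.
events = defaultdict(list)
for outcome in sample_space:
    events[heads_count(outcome)].append(outcome)

for value in sorted(events):
    print(f"[X = {value}]: {events[value]}")
```

Each outcome lands in exactly one group, which is the program's counterpart of the two general facts: the events are incompatible, and together they exhaust the sample space.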
EXAMPLE 2 A panel of 12 tasters is asked to compare the crispiness of a name brand of cornflakes with a cheaper generic brand. Let X denote the number of tasters who rate the generic brand at least as crisp as the name brand. Here X can take any of the values 0, 1, . . . , 12.

EXAMPLE 3 At an intersection, an observer will count the number X of cars passing by until a new Mercedes is spotted. The possible values of X are then 1, 2, 3, . . . , where the list never terminates.
A random variable is said to be discrete if it has either a finite number of values or infinitely many values that can be arranged in a sequence. All the preceding examples are of this type. On the other hand, if a random variable represents some measurement on a continuous scale and is therefore capable of assuming all values in an interval, it is called a continuous random variable. Of course, any measuring device has a limited accuracy and, therefore, a continuous scale must be interpreted as an abstraction. Some examples of continuous random variables are: the height of an adult male, the daily milk yield of a Holstein, the survival time of a patient following a heart attack.
Probability distributions of discrete random variables are explored in this chapter. As we shall see, the developments stem directly from the concepts of probability introduced in Chapter 4. A somewhat different outlook is involved in the process of conceptualizing the distribution of a continuous random variable. Details for the continuous case are postponed until Chapter 7.
EXERCISES
2.1 Identify the following as a discrete or continuous random variable. (a) Number of cars serviced at a garage during a day. (b) Amount of precipitation produced by a storm system. (c) Number of hunting accidents in a state during the deer hunting season. (d) Time a mechanic takes to replace a defective muffler. (e) Number of correct answers that a student gives on a quiz containing 20 problems. (f) Cumulative grade point average of a student at the time of graduation. (g) Number of cars ticketed for illegal parking on campus today.
2.2 Two of the integers {1, 3, 5, 7, 9} are chosen at random. Let X denote
(a) List all choices and the corresponding values of X. (b) List the distinct values of X.
2.3 Let the random variable X represent the sum of the points in two tosses of a die. (a) List the possible values of X. (b) For each value of X, list the corresponding elementary outcomes.
2.4 Three contestants A, B, and C are rated by two judges. Each judge assigns the ratings 1 for best, 2 for intermediate, and 3 for worst. Let X denote the total score for Contestant A (the sum of the ratings received from the two judges). (a) List all pairs of ratings that A can receive. (b) List the distinct values of X.
2.5 Refer to Exercise 2.4. Suppose instead there are two contestants A and B and four judges. Each judge gives the ratings 1 for the better and 2 for the worse of the two contestants.
(a) List all possible assignments of ratings to contestant A by the four judges. (b) List the distinct values of X, the total score on A.
2.6 Let X denote the difference (# heads - # tails) in three tosses of a coin. (a) List the possible values of X. (b) List the elementary outcomes associated with each value of X.
2.7 Suppose that a factory supervisor records whether the day or the night shift has a higher production rate for each of the next 3 days. List the possible outcomes and, for each, record the number of days X that the night shift has a higher production rate. (Assume there are no ties.)
3. PROBABILITY DISTRIBUTION OF A DISCRETE RANDOM VARIABLE
The list of possible values of a random variable X makes us aware of all the eventualities of an experiment as far as the realization of X is concerned. By employing the concepts of probability, we can ascertain the chances of observing the various values. To this end, we introduce the notion of a probability distribution.

The probability distribution or, simply, the distribution of a discrete random variable X is a list of the distinct numerical values of X along with their associated probabilities. Often a formula can be used in place of a detailed list.
EXAMPLE 4
If X represents the number of heads obtained in three tosses of a fair coin, find the probability distribution of X. In Example 1 we have already listed the 8 elementary outcomes and the associated values of X. The distinct values of X are 0, 1, 2, and 3. We now calculate their probabilities. The model of a fair coin entails that the 8 elementary outcomes are equally likely, so each is assigned the probability $\frac{1}{8}$. The event [X = 0] has the single outcome TTT, so its probability is $\frac{1}{8}$. Similarly, the probabilities of [X = 1], [X = 2], and [X = 3] are found to be $\frac{3}{8}$, $\frac{3}{8}$, and $\frac{1}{8}$, respectively. Collecting these results, the probability distribution of X is displayed in Table 1.
Table 1 The Probability Distribution of X, the Number of Heads in Three Tosses of a Coin
Value of X    Probability
0             1/8
1             3/8
2             3/8
3             1/8
Total         1
For general discussion, we will use the notation $x_1$, $x_2$, and so on to designate the distinct values of a random variable X. The probability that a particular value $x_i$ occurs will be denoted by $f(x_i)$. As in Example 4, if X can take k possible values $x_1, \ldots, x_k$ with the corresponding probabilities $f(x_1), \ldots, f(x_k)$, the probability distribution of X can be displayed in the format of Table 2. Since the quantities $f(x_i)$ represent probabilities, they must all be numbers between 0 and 1. Further, when summed over all possible values of X, these probabilities must add up to 1.
Table 2 Form of a Discrete Probability Distribution
Value x    Probability f(x)
x_1        f(x_1)
x_2        f(x_2)
. . .      . . .
x_k        f(x_k)
Total      1

The probability distribution of a discrete random variable X is described as the function
$$f(x_i) = P[X = x_i]$$
which satisfies:
(i) $0 \le f(x_i) \le 1$ for each value $x_i$ of X
(ii) $\sum_{i=1}^{k} f(x_i) = 1$

A probability distribution or the probability function describes the manner in which the total probability 1 gets apportioned to the individual values of the random variable. A graphical presentation of a probability distribution helps reveal any pattern in the distribution of probabilities. We consider a display similar in form to a relative frequency histogram discussed in Chapter 2. It will also facilitate the building of the concept of a continuous distribution. To draw a probability histogram, we first mark the values of X on the horizontal axis. With each value $x_i$ as center, a vertical rectangle is drawn whose area equals the probability $f(x_i)$. The probability histogram for the distribution of Example 4 is shown in Figure 1.
Figure 1 The probability histogram of X, the number of heads in three tosses of a coin. (Horizontal axis: value x.)
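The two requirements on a probability distribution (each probability lies between 0 and 1, and the probabilities add up to 1) can be checked mechanically. The sketch below stores a distribution as a Python dictionary from values to probabilities; the function name `is_probability_distribution` and the tolerance are our own choices, not part of the text.

```python
import math


def is_probability_distribution(f, tol=1e-9):
    """Check conditions (i) and (ii) for a table {value: probability}."""
    probs = list(f.values())
    in_range = all(0 <= p <= 1 for p in probs)                    # condition (i)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)      # condition (ii)
    return in_range and sums_to_one


# Distribution of the number of heads in three tosses of a fair coin (Table 1).
coin_heads = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

# A hypothetical table that fails: one entry is negative.
not_a_distribution = {0: 0.5, 1: 0.7, 2: -0.2}

print(is_probability_distribution(coin_heads))
print(is_probability_distribution(not_a_distribution))
```

A small tolerance is used for the sum because decimal probabilities are stored only approximately in floating point.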
EXAMPLE 5
Suppose 30% of the trees in a forest are infested with a parasite. Four trees are randomly sampled. Let X denote the number of the trees sampled that have the parasite. Obtain the probability distribution of X, and plot the probability histogram. Since each tree may be either infested (I) or not infested (N), the number of elementary outcomes concerning a sample of 4 trees is 2 × 2 × 2 × 2 = 16. These can be conveniently enumerated in the scheme of Example 8, Chapter 4 (called, coincidentally, a tree diagram). However, we list them here according to the count X.
X = 0    X = 1    X = 2    X = 3    X = 4
NNNN     NNNI     NNII     NIII     IIII
         NNIN     NINI     INII
         NINN     NIIN     IINI
         INNN     INNI     IIIN
                  ININ
                  IINN
Our object here is to calculate the probability of each value of X. To this end, we first reflect upon the assignment of probabilities to the elementary outcomes. For a single tree selected at random we obviously have P(I) = .3 and P(N) = .7 because 30% of the population of trees is infested. Moreover, as the population is vast while the sample size is very small, the observations on 4 trees can, for all practical purposes, be treated as independent. Invoking independence and the multiplication law of probability, we calculate P(NNNN) = .7 × .7 × .7 × .7 = .2401, so P[X = 0] = .2401. The event [X = 1] has 4 elementary outcomes, each of which contains three N's and one I. Since P(NNNI) = $(.7)^3 \times (.3)$ = .1029, and the same result holds for all these 4 elementary outcomes, we get P[X = 1] = 4 × .1029 = .4116. In the same manner,

$P[X = 2] = 6 \times (.7)^2 \times (.3)^2 = .2646$
$P[X = 3] = 4 \times (.7) \times (.3)^3 = .0756$
$P[X = 4] = (.3)^4 = .0081$

Collecting these results, the probability distribution of X is presented in Table 3, and the probability histogram plotted in Figure 2.
Table 3 The Probability Distribution of X in Example 5
x        f(x)
0        .2401
1        .4116
2        .2646
3        .0756
4        .0081
Total    1.0000

Figure 2 (Probability histogram of X in Example 5; vertical axis f(x), horizontal axis x.)
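The calculation in Example 5 can be retraced by enumerating the 16 elementary outcomes and multiplying probabilities under the independence assumption made in the text. The following is a sketch; the names `P_INFESTED`, `single_tree`, and `distribution` are ours.

```python
from collections import defaultdict
from itertools import product

P_INFESTED = 0.3  # incidence rate assumed in Example 5
single_tree = {"I": P_INFESTED, "N": 1 - P_INFESTED}

# Accumulate P[X = x], where X is the number of infested trees among 4.
distribution = defaultdict(float)
for outcome in product("NI", repeat=4):      # ('N','N','N','N'), ..., ('I','I','I','I')
    prob = 1.0
    for tree in outcome:                     # multiplication law under independence
        prob *= single_tree[tree]
    distribution[outcome.count("I")] += prob

for x in sorted(distribution):
    print(f"P[X = {x}] = {distribution[x]:.4f}")
```

Summing over the outcomes in each column of the listing reproduces the multipliers 1, 4, 6, 4, 1 used in the hand calculation.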
At this point we digress briefly for an explanation of the role of probability distributions in statistical inference. To calculate the probabilities associated with the values of a random variable, we require a full knowledge of the uncertainties of the experimental outcomes. For instance, when X represents some numerical characteristic of a random sample from a population, we assume a known composition of the population in order that the distribution of X can be calculated numerically. In Example 5, the chances of observing the various values of X were calculated under the assumption that the incidence rate of the parasite was .3 in the population of trees. Ordinarily, in practical applications, this population quantity would be unknown to us. Suppose the letter p stands for this unknown proportion of trees that have the parasite. Statistical inference attempts to determine the values of p that are deemed plausible in light of the value of X actually observed in a sample. To fix ideas, suppose all four of the sampled trees are found to be infested. Based on this observation, is .3 a plausible value of p? Table 3 shows that if p were indeed .3, the chance of observing the extreme value X = 4 is only .0081. This very low probability casts doubt on the hypothesis that p = .3. This kind of statistical reasoning will be explored further in later chapters.
The probability distributions in Examples 4 and 5 were obtained by first assigning probabilities to the elementary outcomes using a process of logical deduction. When this cannot be done, one must turn to an empirical determination of the distribution. This involves repeating the experiment a large number of times and using the relative frequencies of the various values of X as approximations of the corresponding probabilities.
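An empirical determination can be imitated on a computer by repeating a simulated experiment many times. The sketch below reuses the setting of Example 5 (4 trees, each infested with probability .3, treated as independent) so that the relative frequencies can be compared with the probabilities deduced there. The function name, the seed, and the number of repetitions are our assumptions.

```python
import random
from collections import Counter


def simulate_infested_counts(n_repeats, p=0.3, n_trees=4, seed=12345):
    """Relative frequencies of X = number of infested trees in a sample."""
    rng = random.Random(seed)        # fixed seed so the run is reproducible
    counts = Counter()
    for _ in range(n_repeats):
        x = sum(1 for _tree in range(n_trees) if rng.random() < p)
        counts[x] += 1
    return {x: counts[x] / n_repeats for x in range(n_trees + 1)}


relative_freq = simulate_infested_counts(100_000)
for x, freq in relative_freq.items():
    print(f"X = {x}: relative frequency {freq:.4f}")
```

With a large number of repetitions the relative frequencies settle close to .2401, .4116, .2646, .0756, and .0081, but they change slightly from one seed to another, which is the sampling variation the text goes on to discuss.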
EXAMPLE 6
Let X denote the number of magazines to which a college senior subscribes. From a survey of 400 college seniors, suppose the frequency distribution of Table 4 was observed.
Table 4 Frequency Distribution of the Number X of Magazine Subscriptions
Magazine Subscriptions (x)    Frequency    Relative Frequency^a
0                             61           .15
1                             153          .38
2                             106          .27
3                             56           .14
4                             24           .06
Total                         400          1.00
^a Rounded to second decimal.
Viewing the relative frequencies as empirical estimates of the probabilities, we have essentially obtained an approximate determination of the probability distribution of X. The true probability distribution would emerge if a vast number (ideally, the entire population) of seniors were surveyed.
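The relative frequencies of Table 4 follow from dividing each frequency by the total of 400. A brief sketch, with variable names of our choosing, is given below; exact decimal arithmetic with half-up rounding is used here so that 106/400 = .265 rounds to .27 as in the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Frequencies from the survey of 400 college seniors (Table 4).
frequencies = {0: 61, 1: 153, 2: 106, 3: 56, 4: 24}
n = sum(frequencies.values())

# Relative frequency = frequency / n, rounded to the second decimal.
relative = {
    x: (Decimal(count) / Decimal(n)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    for x, count in frequencies.items()
}

for x, r in relative.items():
    print(f"x = {x}: frequency {frequencies[x]:3d}, relative frequency {r}")
```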
The reader should bear in mind an important distinction between a relative frequency distribution and the probability distribution. The former is a sample-based entity and is therefore susceptible to variation on different occasions of sampling. By contrast, the probability distribution is a stable entity that refers to the entire population. It is a theoretical construct that serves as a model for describing the variation in the population.
The probability distribution of X can be used to calculate the probabilities of events defined in terms of X. To illustrate this, consider the probability distribution of Table 5. What is the probability that X is equal to or larger than 2?
Table 5
Value x    Probability f(x)
0          .02
1          .23
2          .40
3          .25
4          .10
The event [X ≥ 2] is composed of [X = 2], [X = 3], and [X = 4]. Thus,
P[X ≥ 2] = f(2) + f(3) + f(4) = .40 + .25 + .10 = .75
Similarly, we also calculate P[X < 2] = f(0) + f(1) = .02 + .23 = .25
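Adding the probabilities of the values that make up an event is a one-line computation once the distribution is stored as a table. The sketch below uses the probabilities of Table 5; the helper name `event_probability` is ours.

```python
# Probability distribution of Table 5: value x -> f(x).
table5 = {0: 0.02, 1: 0.23, 2: 0.40, 3: 0.25, 4: 0.10}


def event_probability(f, condition):
    """Sum f(x) over the values x that satisfy the event's condition."""
    return sum(p for x, p in f.items() if condition(x))


p_at_least_2 = event_probability(table5, lambda x: x >= 2)   # f(2) + f(3) + f(4)
p_below_2 = event_probability(table5, lambda x: x < 2)       # f(0) + f(1)

print(f"P[X >= 2] = {p_at_least_2:.2f}")
print(f"P[X < 2]  = {p_below_2:.2f}")
```

Because the two events are complementary, the two results must add up to 1, which gives a quick check on the arithmetic.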
EXERCISES
3.1 Listed below are the elementary outcomes of an experiment, their probabilities, and the value of a random variable X at each outcome.

Elementary Outcome    Probability    Value of X
e_1                   .12            2
e_2                   .29            0
e_3                   .05            0
e_4                   .08            3
e_5                   .16            1
e_6                   .11            1
e_7                   .09            2
e_8                   .10            3
Obtain the probability distribution of X.
3.2 Let X denote the sum of the points in two tosses of a fair die. (a) Obtain the probability distribution of X (refer to Exercise 2.3). (b) Draw the probability histogram.
3.3 Examine if the following are legitimate probability distributions: (a) (b) (c) (d)
f(x) 2 B 13
.4 .6 .2
f(x) -2 0 2
.25 .50 .25
3.4 Let the random variable X denote the proportion of times a head occurs in three tosses of a coin, that is, X = (# heads in 3 tosses)/3. (a) Obtain the probability distribution of X. (b) Draw the probability histogram.
3.5 A surprise quiz contains three multiple choice questions: Question 1 has three suggested answers, Question 2 has four, and Question 3 has two. A completely unprepared student decides to choose the answers at random. Let X denote the number of questions the student answers correctly. (a) List the possible values of X. (b) Find the probability distribution of X. (c) Find P[at least 1 correct] = P[X ≥ 1]. (d) Plot the probability histogram.
*3.6 Runs. In a row of 6 plants, two plants are infected with a leaf disease and four are healthy. Restricting attention to the portion of the sample space for exactly two infected plants, the model of randomness (or lack of contagion) assumes that any two positions for the infected plants in the row are as likely as any other. (a) Using the symbols I for infected and H for healthy, list all possible occurrences of two I's and four H's in a row of 6. (Note: There are $\binom{6}{2} = 15$ elementary outcomes.) (b) A random variable of interest is the number of runs (X), which is defined as the number of unbroken sequences of letters of the same kind. For example, the arrangement IHHHIH has 4 runs, IIHHHH has 2. Find the value of X associated with each outcome you listed in (a). (c) Obtain the probability distribution of X under the model of randomness.
3.7 Of 7 candidates seeking 3 positions at a counseling center, 4 have degrees in social science and 3 do not. If 3 candidates are selected at random, find the probability distribution of X, the number having social science degrees among the selected persons.
3.8 Suppose X denotes the number of telephone receivers in a single family residential dwelling. From an examination of the phone subscription records of 381 residences in a city, the following frequency distribution is obtained:

No. of Receivers (x)    No. of Residences (Frequency)
        0                          2
        1                         82
        2                        161
        3                         89
        4                         47
      Total                      381
(a) Based on these data, obtain an approximate determination of the probability distribution of X. (b) Why is this regarded as an approximation? (c) Plot the probability histogram.
3.9 Use the probability distribution given here to calculate:
(a) P[x < 3] (b) Plx > 2l 1
(c)P[l <X<3]
T-at 4 1(;
(;
l(; 4 T6 I T6
4. EXPECTATION (MEAN) AND STANDARD DEVIATION OF A PROBABILITY DISTRIBUTION
We will now introduce a numerical measure for the center of a probability distribution and another for its spread. In Chapter 2, we discussed the concepts of mean, as a measure of the center of a data set, and standard deviation, as a measure of spread. Because probability distributions are theoretical models in which the probabilities can be viewed as long-run relative frequencies, the sample measures of center and spread have their population counterparts. To motivate their definitions, we first refer to the calculation of the mean of a data set. Suppose a die is tossed 20 times and the following data obtained:
4, 3, 4, 2, 5, 1, 6, 6, 5, 2, 2, 6, 5, 4, 6, 2, 1, 6, 2, 4
The mean of these observations, called the sample mean, is calculated as
$\bar{x} = \dfrac{\text{sum of the observations}}{\text{sample size}} = \dfrac{76}{20} = 3.8$
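Alternatively, we can first count the frequency of each point, and use the relative frequencies to calculate the mean as
$\bar{x} = 1\left(\tfrac{2}{20}\right) + 2\left(\tfrac{5}{20}\right) + 3\left(\tfrac{1}{20}\right) + 4\left(\tfrac{4}{20}\right) + 5\left(\tfrac{3}{20}\right) + 6\left(\tfrac{5}{20}\right) = 3.8$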
This second calculation illustrates the formula
Sample mean $\bar{x} = \sum(\text{value} \times \text{relative frequency})$
Rather than stopping with 20 tosses, if we imagine a very large number of tosses of a die, the relative frequencies will approach the probabilities, each of which is $\tfrac{1}{6}$ for a fair die. The mean of the (infinite) collection of tosses of a fair die should then be calculated as
$1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + \cdots + 6\left(\tfrac{1}{6}\right) = \sum(\text{value} \times \text{probability})$
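The two ways of computing the sample mean, and the long-run value for a fair die, can be checked with a short sketch. The variable names are illustrative only; exact fractions are used so that no rounding enters the comparison.

```python
from collections import Counter
from fractions import Fraction

# The 20 die tosses recorded in the text.
tosses = [4, 3, 4, 2, 5, 1, 6, 6, 5, 2, 2, 6, 5, 4, 6, 2, 1, 6, 2, 4]

# Direct calculation: sum of the observations divided by the sample size.
mean_direct = Fraction(sum(tosses), len(tosses))

# Relative-frequency calculation: sum of value x (frequency / sample size).
counts = Counter(tosses)
mean_by_rel_freq = sum(
    value * Fraction(freq, len(tosses)) for value, freq in counts.items()
)

# Long-run analogue for a fair die: each relative frequency is replaced by 1/6.
fair_die_mean = sum(face * Fraction(1, 6) for face in range(1, 7))

print(float(mean_direct), float(mean_by_rel_freq), float(fair_die_mean))
```

Both sample calculations give 3.8, while the fair-die value is 3.5; the gap reflects chance variation in only 20 tosses.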
Motivated by this example and the stability of long-run relative frequency, it is then natural to define the mean of a random variable X or of its probability distribution as
$\sum(\text{value} \times \text{probability})$ or $\sum x_i f(x_i)$
where the $x_i$'s denote the distinct values of X. The mean of a probability distribution is also called the population mean for the variable X, and is denoted by the Greek letter $\mu$. The mean of a random variable X is also called its expected value and, alternatively, denoted by E(X). That is, the mean $\mu$ and the expected value E(X) are the same quantity and will be used interchangeably.
The mean of X or population mean:
$E(X) = \mu = \sum(\text{value} \times \text{probability}) = \sum x_i f(x_i)$
Here the sum extends over all the distinct values $x_i$ of X.
EXAMPLE 7
With X denoting the number of heads in three tosses of a fair coin, calculate the mean of X. The probability distribution of X was recorded in Table 1. From the calculations exhibited in Table 6, we find that the mean is 1.5.
Table 6 Mean of the Distribution of Table 1

   x      f(x)     x f(x)
   0      1/8      0
   1      3/8      3/8
   2      3/8      6/8
   3      1/8      3/8
 Total     1       12/8 = 1.5 = $\mu$
The mean of a probability distribution has a physical interpretation. If a metal block is cut in the shape of the probability histogram, then $\mu$ represents the point on the base at which the block will balance. For instance, the mean $\mu = 1.5$ calculated in Example 7 is exactly at the center of mass for the distribution depicted in Figure 1. Because the amount of probability corresponds to the amount of mass in a bar, we interpret the balance point, $\mu$, as the center of the probability distribution.
Like many concepts of probability, the idea of the mean or expectation originated from studies of gambling. When X refers to the financial gain in a game of chance, such as playing poker or participating in a state lottery, the name "expected gain" is more appealing than "mean gain." In the realm of statistics, both the names "mean" and "expected value" are widely used.
No Casino Game Has a Positive Expected Profit

Each year thousands of visitors come to casinos to gamble. While all count on being lucky and a few indeed return with a smiling face, most leave the casino with a light purse. But what should be a gambler's expectation? Consider a simple bet on the red of a roulette wheel which has 18 red, 18 black, and 2 green slots. This bet is at even money, so a $10 wager on red has expected profit
$E(\text{Profit}) = (10)\left(\tfrac{18}{38}\right) + (-10)\left(\tfrac{20}{38}\right) = -.526$
The negative expected profit says we expect to lose an average of 52.6 cents on every $10 bet. Over a long series of bets, the relative frequency of winning will approach the probability $\tfrac{18}{38}$ and that of losing will approach $\tfrac{20}{38}$, so a player will lose a substantial amount of money. Other bets against the house have a similar negative expected profit. How else could a casino stay in business?
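The expected profit of the even-money bet on red can be reproduced with a short sketch. The helper `expected_value` is a hypothetical name introduced here; it simply applies the definition, sum of (value x probability), to a table of values and probabilities.

```python
from fractions import Fraction


def expected_value(distribution):
    """Return sum(value * probability) for a dict {value: probability}."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * prob for value, prob in distribution.items())


# $10 on red: win $10 with probability 18/38, lose $10 with probability 20/38.
profit = {10: Fraction(18, 38), -10: Fraction(20, 38)}
e_profit = expected_value(profit)

print(e_profit, round(float(e_profit), 3))
```

The exact value is -10/19 dollars, about -.526, which is the 52.6 cent average loss per $10 bet quoted in the text.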
EXAMPLE 8
A trip-insurance policy pays $1000 to the customer in case of a loss due to theft or damage on a five-day trip. If the risk of such a loss is assessed to be 1 in 200, what is a fair premium for this policy? The probability that the company will be liable to pay $1000 to a customer is $\tfrac{1}{200} = .005$. Therefore, the probability distribution of X, the payment per customer, is as follows:
Payment (x)     Probability f(x)
    $0                .995
  $1000               .005

We calculate
$E(X) = 0 \times .995 + 1000 \times .005 = \$5.00$
The company's expected cost per customer is $5.00, and therefore, a premium equal to this amount is viewed as the fair premium. If this premium is charged, and no other costs are involved, then the company will neither make a profit nor lose money in the long run. In practice, the premium is set at a higher price because it must include administrative costs and intended profit.
The concept of expected value also leads to a numerical measure for the spread of a probability distribution, namely, the standard deviation. When defining the standard deviation of a probability distribution, the reasoning parallels that for the standard deviation of a frequency distribution discussed in Chapter 2. Since the mean $\mu$ is the center of the distribution of X, we express variation of X in terms of the deviation $(X - \mu)$. We define the variance of X as the expected value of the squared deviation $(X - \mu)^2$. To calculate this expected value, we note that
$(X - \mu)^2$ Takes Value     With Probability
   $(x_1 - \mu)^2$               $f(x_1)$
   $(x_2 - \mu)^2$               $f(x_2)$
        ...                        ...
   $(x_k - \mu)^2$               $f(x_k)$
The expected value of $(X - \mu)^2$ is obtained by multiplying each value $(x_i - \mu)^2$ by the probability $f(x_i)$ and then summing these products. This motivates the definition:
Variance of X $= \sum (x_i - \mu)^2 f(x_i)$
The variance of X is abbreviated as Var(X), and is also denoted by $\sigma^2$. The standard deviation of X is the positive square root of the variance, and is denoted by sd(X) or $\sigma$ (the lower-case Greek letter sigma).

Variance and standard deviation of X:
$\sigma^2 = \mathrm{Var}(X) = \sum (x_i - \mu)^2 f(x_i)$
$\sigma = \mathrm{sd}(X) = +\sqrt{\mathrm{Var}(X)}$

The variance of X is also called the population variance and $\sigma$ is called the population standard deviation.
EXAMPLE 9
Calculate the variance and the standard deviation of the distribution of X that appears in the left two columns of Table 7. We calculate the mean $\mu$, the deviations $(x - \mu)$, $(x - \mu)^2$, and finally $(x - \mu)^2 f(x)$. The details are shown in Table 7.

Table 7 Calculation of Variance and Standard Deviation

   x     f(x)    x f(x)    $(x - \mu)$    $(x - \mu)^2$    $(x - \mu)^2 f(x)$
   0     .1       0           -2               4                 .4
   1     .2       .2          -1               1                 .2
   2     .4       .8           0               0                 0
   3     .2       .6           1               1                 .2
   4     .1       .4           2               4                 .4
 Total   1.0     2.0 = $\mu$                                    1.2 = $\sigma^2$

$\mathrm{Var}(X) = \sigma^2 = 1.2$
$\mathrm{sd}(X) = \sigma = \sqrt{1.2} = 1.095$
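The column-by-column calculation of Example 9 can be mirrored in a few lines. This is a minimal sketch whose variable names are chosen for illustration; it follows the definition of the variance directly.

```python
from math import isclose, sqrt

# Probability distribution from the left two columns of Table 7: {x: f(x)}.
dist = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}

# Mean: sum of x * f(x).
mu = sum(x * f for x, f in dist.items())

# Variance: sum of (x - mu)^2 * f(x); standard deviation: positive square root.
var = sum((x - mu) ** 2 * f for x, f in dist.items())
sd = sqrt(var)

print(round(mu, 3), round(var, 3), round(sd, 3))
```

Rounded to three decimals, the script gives the same mean, variance, and standard deviation as the hand calculation: 2.0, 1.2, and 1.095.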
An alternative formula for $\sigma^2$ often simplifies the numerical work (see Appendix A2.2).

Alternative formula for hand calculation:
$\sigma^2 = \sum x_i^2 f(x_i) - \mu^2$
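For readers who want to see why the two expressions for the variance agree, the following short derivation may help. It is a sketch that uses only the facts $\sum x_i f(x_i) = \mu$ and $\sum f(x_i) = 1$; it is not reproduced from the appendix.

```latex
\begin{aligned}
\sigma^2 &= \sum (x_i - \mu)^2 f(x_i) \\
         &= \sum x_i^2 f(x_i) \;-\; 2\mu \sum x_i f(x_i) \;+\; \mu^2 \sum f(x_i) \\
         &= \sum x_i^2 f(x_i) \;-\; 2\mu \cdot \mu \;+\; \mu^2 \cdot 1 \\
         &= \sum x_i^2 f(x_i) \;-\; \mu^2 .
\end{aligned}
```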
EXAMPLE 10
We illustrate the alternative formula for $\sigma^2$ using the probability distribution in Example 9. See Table 8.

Table 8 Calculation of Variance by the Alternative Formula

   x     f(x)    x f(x)    $x^2 f(x)$
   0     .1       0          0
   1     .2       .2         .2
   2     .4       .8        1.6
   3     .2       .6        1.8
   4     .1       .4        1.6
 Total   1.0     2.0 = $\mu$    5.2 = $\sum x^2 f(x)$

$\sigma^2 = 5.2 - (2.0)^2 = 1.2$
$\sigma = \sqrt{1.2} = 1.095$
The standard deviation $\sigma$, rather than $\sigma^2$, is the appropriate measure of spread. Its unit is the same as that of X. For instance, if X refers to income in dollars, $\sigma$ will have the unit (dollar) whereas $\sigma^2$ has the rather artificial unit (dollar)$^2$.
EXERCISES
4.1 For the following probability distribution: (a) Calculate $\mu$. (b) Calculate $\sigma^2$ and $\sigma$. (c) Plot the probability histogram and locate $\mu$.
4.2 For the following probability distribution: (a) Calculate E(X). (b) Calculate sd(X). (c) Draw the probability histogram and locate the mean.
2 3 4 5
.l
.3 .3
6
4.3 Refer to Exercise 4.2. (a) List the x values that lie in the interval $\mu - \sigma$ to $\mu + \sigma$, and calculate $P[\mu - \sigma < X < \mu + \sigma]$. (b) List the x values that lie in the interval $\mu - 2\sigma$ to $\mu + 2\sigma$, and calculate $P[\mu - 2\sigma < X < \mu + 2\sigma]$.
4.4 Calculate the mean and standard deviation for the probability distribution of Example 5.
4.5 An insurance policy pays $400 for the loss due to theft of a canoe. If the probability of a theft is assessed to be .05, find the expected payment. If the insurance company charges $25 for the policy, what is the expected profit per policy?
*4.6 Definition: The median of a distribution is the value $m_0$ of the random variable such that $P[X \le m_0] \ge .5$ and $P[X \ge m_0] \ge .5$. In other words, the probability at or below $m_0$ is at least .5, and the probability at or above $m_0$ is at least .5. Find the median of the distribution given in Exercise 4.2.
KEY IDEAS AND FORMULA
The outcomes of an experiment are quantified by assigning each of them a numerical value related to a characteristic of interest. The rule for assigning the numerical value is called a random variable X.
The probability distribution of X describes the manner in which probability is distributed over the possible values of X. Specifically, it is a list or formula giving the pairs x and f(x) = P[X = x]. A probability distribution serves as a model for explaining variation in a population.
A probability distribution has mean $\mu = \sum(\text{value})(\text{probability}) = \sum x f(x)$, which is interpreted as the population mean. This quantity is also called the expected value, E(X). Although X is a variable, E(X) is a constant.
The population variance is $\sigma^2 = \sum (x - \mu)^2 f(x)$. The standard deviation $\sigma$ is the positive square root of the variance. The standard deviation is a measure of the spread or variation of the population.
5. EXERCISES
5.1 Let X denote the number of tails in 3 tosses of a fair coin. Obtain the probability distribution of X.
5.2 Each week a grocery shopper buys either canned (C) or bottled (B) soft drinks. The type of soft drink purchased in 3 consecutive weeks is to be recorded. (a) List the sample space. (b) If a different type of soft drink is purchased than in the previous week, we say that there is a switch. Let X denote the number of switches. Determine the value of X for each elementary outcome. (Example: For BBB, x = 0; for BCB, x = 2.) (c) Suppose that for each purchase P(B) : i and the decisions in different weeks are independent. Assign probabilities to the elementary outcomes and obtain the distribution of X.
5.3 On the average a baseball player gets a hit in 1 out of 3 times at bat. Determine the probability of no hits in 4 times at bat.
5.4 A child psychologist, interested in how friends are selected, studies groups of three children. For one group, Ann, Barb, and Carol each is asked which of the other two she likes best. (a) Make a list of the outcomes. (Use A, B, and C to denote the three children.) (b) Let X be the number of times Carol is chosen. List the values of X.
(c) Assuming each choice is equally likely, determine the probability distribution of X.
5.5 The probability function of a random variable X is given by the formula
f(x) _
, #(+) X
(a) Calculate the numerical value of f(x) for each x, and make a table of the probability distribution. (b) Plot the probability histogram.
5.6 Based on recent records, the manager of a car painting center has determined the following probability distribution for the number of customers per day: (a) If the center has the capacity to serve 2 customers per day, what is the probability that one or more customers will be turned away on a given day? (b) What is the probability that the center's capacity will not be fully utilized on a day? (c) By how much must the capacity be increased so the probability of turning a customer away is no more than .10?
f(x) 0 I 2 3 4 5
5.7 In an assortment of 11 light bulbs there are 4 with broken filaments. A customer takes 3 bulbs from the assortment without inspecting the filaments. Find the probability distribution of the number X of defective bulbs that the customer may get.
5.8 Given the following probability distribution: (a) Construct the probability histogram. (b) Find E(X), $\sigma^2$, and $\sigma$.
   x     f(x)
   0     .4
   1     .3
   2     .2
   3     .1

5.9 Find the mean and standard deviation of the following distribution:
   x     f(x)
   0     .3
   1     .5
   2     .1
   3     .1
5.10 A student buys a lottery ticket for $1. For every 1000 tickets sold, 2 bicycles are to be given away in a drawing. (a) What is the probability that the student will win a bicycle? (b) If each bicycle is worth $160, determine the student's expected gain.
5.11 In the finals of a tennis match, the winner will get $50,000 and the loser $15,000. Find the expected winnings of player B if (a) the two finalists are evenly matched and (b) player B has probability .9 of winning.
5.12 A lawyer feels that the probability is .3 that he can win a wage discrimination suit. If he wins the case, he will make $15,000 but if he loses he gets nothing. (a) What is the lawyer's expected gain? (b) If the lawyer has to spend $2,500 in preparing the case, what is his expected net gain?
5.13 The number of overnight emergency calls, X, to the answering service of a heating and air conditioning firm has the probabilities .05, .1, .15, .35, .20, .15 for 0, 1, 2, 3, 4, and 5 calls, respectively. (a) Find the probability of fewer than 3 calls. (b) Determine E(X) and sd(X).
5.14 A store conducts a lottery with 5000 cards. The prizes and the corresponding number of cards are as listed below in the table. Suppose you have received one of the cards (presumably, selected at random), and let X denote your prize. (a) Obtain the probability distribution of X. (b) Calculate the expected value of X. (c) If you have to pay $6 to get a card, find the probability that you will come out a loser.

 Prize     Number of Cards
 $4000            1
 $1000            3
 $100            95
 $5             425
 $0            4476
 Total         5000

5.15 A botany student is asked to match the popular names of three house plants with their obscure botanical names. Suppose the student never heard of these names and is trying to match by sheer guess. Let X denote the number of correct matches. (a) List all elementary outcomes and the corresponding values of X. (b) Obtain the probability distribution of X. (c) What is the expected number of matches?
5.16 A salesman of home computers will contact four customers during a week. Each contact can result in either a sale, with probability .2, or no sale with probability .8. Assume that customer contacts are independent. (a) List the elementary outcomes and assign probabilities. (b) If X denotes the number of computers sold during the week, obtain the probability distribution of X. (c) Calculate the expected value of X.
5.17 Refer to Exercise 5.16. Suppose these computers are priced at $2000, and let Y denote the salesman's total sales (in dollars) during a week. (a) Give the probability distribution of Y. (b) Calculate E(Y) and see that it is the same as 2000 x E(X). [This illustrates the general property of expectation: If c is a constant, then E(cX) = cE(X).]
5.18 Suppose the probability function of a random variable X is given by the formula
$f(x) = \dfrac{60}{77} \cdot \dfrac{1}{x}, \quad x = 2, 3, 4, 5$
Calculate the mean and standard deviation of this distribution.
5.19 A roulette wheel has 38 slots of which 18 are red, 18 black, and 2 green. A gambler will play three times, each time betting $5 on red. The gambler gets $10 if red occurs, and loses the bet otherwise. Let X denote the net gain of the gambler in 3 plays (for instance, if he loses all three times then X = -15). (a) Obtain the probability distribution of X. (b) Calculate the expected value of X. (c) Will the expected net gain be different if the gambler alternates his bets between red and black? Why or why not?
5.20 Let X = average number of dots resulting from two tosses of a fair die. For instance, if the faces 4 and 5 show, the corresponding value of X is (4 + 5)/2 = 4.5. Obtain the probability distribution of X.
5.21 Refer to Exercise 5.20. On the same graph, plot the probability histograms of
$X_1$ = # points on first toss of the die
X = average number of points in two tosses of a die.
5.22 The cumulative probabilities for a distribution. A probability distribution can also be described by a function that gives the accumulated probability at or below each value of X. Specifically,
$F(c) = P[X \le c]$ = sum of probabilities of all values $x \le c$
is the cumulative distribution function at c. For the probability distribution given here, we calculate
$F(1) = P[X \le 1] = f(1) = .07$
$F(2) = P[X \le 2]$
(a) Complete the F(x) column in this table. (b) Now cover the f(x) column with a strip of paper. From the F(x) values, reconstruct the probability function f(x). [Hint: f(x) = F(x) - F(x - 1).]
The Binomial Distribution and Its Application in Testing Hypotheses
1. INTRODUCTION
2. SUCCESSES AND FAILURES-BERNOULLI TRIALS
3. THE BINOMIAL DISTRIBUTION
4. TESTING HYPOTHESES ABOUT A POPULATION PROPORTION
158
Chapter 6
Boy or girl?
1. INTRODUCTION
This chapter deals with a basic distribution that models chance variation in many statistical investigations. The distribution pertains to experiments where the outcomes are only of two possible categories. The random variable of interest is the frequency count of one category, in a fixed number of repetitions of the experiment. Examples 1, 2, and 5 of Chapter 5 illustrate experiments having this structure. In each of these scenarios, we are concerned with sampling from a population consisting of two categories of elements. The probability distribution of X, the number of sampled elements in one category, is calculated under the assumption that the population proportion is known. For instance, the probability distribution of Table 3 in Chapter 5 resulted from the specification that 30% of the population of trees is infested with the parasite. In a practical situation, however, the population proportion is usually an unknown quantity. When this is so, the probability distribution of X cannot be numerically determined. Yet, we will see that it is possible to construct a model for the probability distribution of X that contains the unknown population proportion as a parameter. The probability model serves as the major vehicle of drawing inferences about the population from observations of the random variable X.
A probability model is an assumed form of the probability distribution that describes the chance behavior for a random variable X. Probabilities are expressed in terms of relevant population quantities, called the parameters.
In this chapter, we introduce a quite versatile probability model that finds applications in many sampling situations where the population can be thought of as a collection of elements of only two types. Sections 2 and 3 describe the genesis and important properties of this model. Its application in testing a conjecture about an unknown population proportion is then discussed. This technique of inference, called testing hypotheses, is introduced in Section 4.
2. SUCCESSES AND FAILURES-BERNOULLI TRIALS
Sampling situations where the elements of a population have a dichotomy abound in virtually all walks of life. A few examples are:
Inspect a specified number of items coming off a production line and count the number of defectives.
Survey a sample of voters and observe how many favor a reduction of public spending on welfare.
Analyze the blood specimens of a number of rodents and count how many carry a particular viral infection.
Examine the case histories of a number of births and count how many involved delivery by Caesarean section.
Selecting a single element of the population is envisioned as a trial of the (sampling) experiment, so that each trial can result in one of two possible outcomes. Our ultimate goal is to develop a probability model for the number of outcomes in one category when repeated trials are performed.
An organization of the key terminologies, concerning the successive repetitions of an experiment, is now in order. We call each repetition by the simpler name, a trial. Furthermore, the two possible outcomes of a trial are now assigned the technical names success (S) and failure (F) just to emphasize the point that they are the only two possible results. These names bear no connotation of success or failure in real life. Customarily, the outcome of primary interest in a study is labeled a success (even if it is a disastrous event). In a study of the rate of unemployment, the status of being unemployed may be attributed the statistical name success! Further conditions on the repeated trials are necessary in order to arrive at our intended probability distribution. Repeated trials that obey these conditions are called Bernoulli trials, after the Swiss mathematician Jacob Bernoulli.
Bernoulli Trials
(a) Each trial yields one of two outcomes, technically called success (S) and failure (F).
(b) For each trial, the probability of success P(S) is the same and is denoted by p = P(S). The probability of failure is then P(F) = 1 - p for each trial and is denoted by q, so that p + q = 1.
(c) Trials are independent. The probability of success in a trial does not change given any amount of information about the outcomes of other trials.
Perhaps the simplest example of Bernoulli trials is the prototype model of tossing a coin, where the occurrences head and tail can be labeled S and F, respectively. For a fair coin, we have $p = q = \tfrac{1}{2}$.
EXAMPLE 1 Sampling from a Population with Two Categories of Elements

Consider a lot (population) of items in which each item can be classified as either defective or nondefective.
(a) Sampling with replacement: Suppose that a lot consists of 15 items of which 5 are defective and 10 are nondefective. An item is drawn at random (i.e., in a manner that all items in the lot are equally likely to be selected). The quality of the item is recorded and it is returned to the lot before the next drawing. The conditions for Bernoulli trials are satisfied. If the occurrence of a defective is labeled S, we have $P(S) = \tfrac{5}{15}$.
(b) Sampling without replacement: In situation (a), suppose that 3 items are drawn one at a time but without replacement. Then the condition concerning the independence of trials is violated. For the first drawing, $P(S) = \tfrac{5}{15}$. If the first draw produces an S, the lot then consists of 14 items, 4 of which are defective. Given this information about the result of the first draw, the conditional probability of obtaining an S on the second draw is then $\tfrac{4}{14} \ne \tfrac{5}{15}$, which establishes the lack of independence.
This violation of the condition of independence loses its thrust when the population is vast and only a small fraction of it is sampled. Consider sampling 3 items without replacement from a lot of 1500 items, 500 of which are defective. With $S_1$ denoting the occurrence of an S in the first draw, and $S_2$ that in the second, we have
$P(S_1) = \tfrac{500}{1500} = \tfrac{1}{3}$ and $P(S_2 \mid S_1) = \tfrac{499}{1499}$
For most practical purposes, the latter fraction can be approximated by $\tfrac{1}{3}$. Strictly speaking, there has been a violation of the independence of trials, but it is to such a negligible extent that the model of Bernoulli trials can be assumed as a good approximation.
Example 1 illustrates the important points:
If elements are sampled from a dichotomous population at random and with replacement, the conditions for Bernoulli trials are satisfied. When the sampling is made without replacement, the condition of the independence of trials is violated. However, if the population is large and only a small fraction of it (less than 10%, as a rule of thumb) is sampled, the effect of this violation is negligible and the model of the Bernoulli trials can be taken as an approximation.
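To see how quickly the dependence fades as the lot grows, the two lots of Example 1 can be compared numerically. This is an illustrative sketch; the function name is hypothetical and only the first two draws are considered.

```python
from fractions import Fraction


def second_draw_given_first_defective(lot_size, defectives):
    """P(S2 | S1) when items are drawn without replacement."""
    return Fraction(defectives - 1, lot_size - 1)


# (lot size, number of defectives) for the small and the large lot of Example 1.
for lot_size, defectives in [(15, 5), (1500, 500)]:
    p_first = Fraction(defectives, lot_size)
    p_second = second_draw_given_first_defective(lot_size, defectives)
    # The gap between the two probabilities measures the departure from independence.
    print(lot_size, p_first, p_second, float(p_first - p_second))
```

For the lot of 15 the gap is about .048, while for the lot of 1500 it is below .0005, which is why the Bernoulli model is a reasonable approximation in the second case.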
Example 2 further illustrates the kinds of approximations that are sometimes employed when using the model of the Bernoulli trials.
EXAMPLE 2 Testing a New Antibiotic
Suppose that a newly developed antibiotic is to be tried on 10 patients who have a certain disease and that the possible outcomes in each case are cure (S) or no cure (F). Each patient has a distinct physical condition and genetic constitution that cannot be perfectly matched by any other patient. Therefore, strictly speaking, it may not be possible to regard the trials made on 10 different patients as 10 repetitions of an experiment under identical conditions, as the definition of Bernoulli trials demands. We must remember that the conditions of a probability model are abstractions that help to realistically simplify the complex mechanism governing the outcomes of an experiment. Identification with Bernoulli trials in such situations is to be viewed as an approximation of the real world, and its merit rests on how successfully the model explains chance variations in the outcomes.
EXERCISES
2.1 Is the model of Bernoulli trials plausible in each of the following situations? Identify any serious violations of the conditions. (a) A dentist records if each tooth in the lower jaw has a cavity or has none. (b) Persons applying for a driver's license will be recorded as writing left- or right-handed. (c) For each person taking a seat at a lunch counter, observe the time it takes them to be served. (d) Each day of the first week in April is recorded as being either clear or cloudy. (e) Cars selected at random will or will not pass state safety inspection.
2.2 Give an example (different from those appearing in Exercise 2.1) of repeated trials with two possible outcomes where: (a) The model of Bernoulli trials is reasonable. (b) The condition of independence is violated. (c) The condition of equal P(S) is violated.
2.3 From four agricultural plots, two will be selected at random for a pesticide treatment. The other two plots will serve as controls. For each plot, denote by S the event that it is treated with the pesticide. Consider the assignment of treatment or control to a single plot as a trial. (a) Is P(S) the same for all trials? If so, what is the numerical value of P(S)? (b) Are the trials independent? Why or why not?
2.4 Refer to Exercise 2.3. Now suppose, for each plot a fair coin will be tossed. If a head shows up, the plot will be treated, otherwise it will be a control. With this manner of treatment allocation, answer questions (a) and (b).
2.5 A market researcher intends to study the consumer preference between regular and decaffeinated coffee. Examine the plausibility of the model of Bernoulli trials in the following situations. (a) One hundred consumers are randomly selected and each is asked to report the types of coffee (regular or decaffeinated) purchased in the five most recent occasions. Considering each purchase as a trial, this inquiry deals with 500 trials. (b) Five hundred consumers are randomly selected and each is asked about the most recent purchase of coffee. Here again the inquiry deals with 500 trials.
2.6 Consider four Bernoulli trials with success probability p = .7 in each trial. Find the probability that: (a) All 4 trials result in successes. (b) All are failures. (c) There is at least one success.
2.7 An animal either dies (D) or survives (S) in the course of a surgical experiment. The experiment is to be performed first with two animals. If both survive, no further trials are to be made. If exactly one animal survives, one more animal is to undergo the experiment. Finally, if both animals die, two additional animals are to be tried. (a) List the sample space. (b) Assume that the trials are independent and that the probability of survival in each trial is {. Assign probabilities to the elementary outcomes. (c) Let X denote the number of survivors. Obtain the probability distribution of X.
3. THE BINOMIAL DISTRIBUTION
Consider a fixed number n of Bernoulli trials with the success probability p in each trial. The number of successes obtained in n trials is a random variable which we denote by X. The probability distribution of this random variable X is called a binomial distribution. The binomial distribution depends on the two quantities n and p. For instance, in Chapter 5, the distribution appearing in Table 1 is precisely the binomial distribution with n = 3 and p = .5, while that in Table 3 is the binomial distribution with n = 4 and p = .3.
The Binomial Distribution
Denote
n = a fixed number of Bernoulli trials
p = the probability of success in each trial
X = the (random) number of successes in n trials.
The random variable X is called a binomial random variable. Its distribution is called a binomial distribution.
A review of the developments in Example 5 of Chapter 5 will help motivate a formula for the general binomial distribution.
EXAMPLE 3 Example 5 of Chapter 5 Revisited
The random variable X represents the number of infested trees among a random sample of n = 4 trees from the forest. Instead of the numerical value .3, we now denote the population proportion of infested trees by the symbol p. Furthermore, we relabel the outcome "infested" as a success (S) and "not infested" as a failure (F). The elementary outcomes of sampling 4 trees, the associated probabilities, and the value of X are listed below:

Value of X                      0        1        2          3        4
Outcomes                        FFFF     SFFF     SSFF       SSSF     SSSS
                                         FSFF     SFSF       SSFS
                                         FFSF     SFFS       SFSS
                                         FFFS     FSSF       FSSS
                                                  FSFS
                                                  FFSS
Probability of each outcome     $q^4$    $pq^3$   $p^2q^2$   $p^3q$   $p^4$
Number of outcomes              1        4        6          4        1
Since the population is vast, the trials can be treated as independent. Also, for an individual trial, P(S) = p and P(F) = q = 1 - p. The event [X = 0] has one outcome FFFF whose probability is
$P[X = 0] = P(FFFF) = q \times q \times q \times q = q^4$
To arrive at an expression for P[X = 1], we consider the outcomes listed in the second column. The probability of SFFF is
$P(SFFF) = p \times q \times q \times q = pq^3$
and the same result holds for every outcome in this column. There are 4 outcomes so we obtain $P[X = 1] = 4pq^3$. The factor 4 is the number of outcomes with one S and three F's. Even without making a complete list of the outcomes, we can obtain this count. Every outcome has 4 places and the 1 place where S occurs can be selected from the total of 4 in $\binom{4}{1} = 4$ ways, while the remaining three places must be filled with an F. Continuing in the same line of reasoning, the value X = 2 occurs with $\binom{4}{2} = 6$ outcomes, each of which has a probability of $p^2q^2$. Therefore, $P[X = 2] = \binom{4}{2}p^2q^2$. After working out the remaining terms, the binomial distribution with n = 4 trials can be presented as in Table 1.
Table 1 Binomial Distribution with n = 4 Trials

Value x              0                     1                     2                     3                     4
Probability f(x)     $\binom{4}{0}p^0q^4$  $\binom{4}{1}p^1q^3$  $\binom{4}{2}p^2q^2$  $\binom{4}{3}p^3q^1$  $\binom{4}{4}p^4q^0$
It would be instructive for the reader to verify that the numerical probabilities appearing in Chapter 5, Table 3 are obtained by substituting p = .3 and q = .7 in the entries of the above Table 1.
Extending the reasoning of Example 3 to the case of a general number n of Bernoulli trials, we observe that there are $\binom{n}{x}$ outcomes that have exactly x successes and (n - x) failures. The probability of every such outcome is $p^x q^{n-x}$. Therefore,
$f(x) = \binom{n}{x} p^x q^{n-x}$ for $x = 0, 1, \ldots, n$
is the formula for the binomial probability distribution with n trials.
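The counting argument can be checked by brute force: list every sequence of S's and F's, add up the probabilities by number of successes, and compare with the formula. The sketch below assumes n = 4 and p = .3, the values used for the infested-trees example; the variable names are illustrative.

```python
from itertools import product
from math import comb, isclose

n, p = 4, 0.3
q = 1 - p

# Enumerate all 2**n outcomes and accumulate probability by the number of S's.
by_enumeration = [0.0] * (n + 1)
for outcome in product("SF", repeat=n):
    x = outcome.count("S")
    by_enumeration[x] += p**x * q ** (n - x)

# Binomial formula: (n choose x) * p^x * q^(n - x).
by_formula = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]

for x in range(n + 1):
    print(x, round(by_enumeration[x], 4), round(by_formula[x], 4))
```

The two columns agree for every x, and the probabilities add to 1, as they must for a probability distribution.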
The binomial distribution with $n$ trials and success probability $p$ is described by the formula

$f(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}$

for the possible values $x = 0, 1, \ldots, n$.

EXAMPLE 4
According to the Mendelian theory of inherited characteristics, a cross fertilization of related species of red and white flowered plants produces offspring of which 25% are red flowered plants. Suppose that a horticulturist wishes to cross 5 pairs of red and white flowered plants. Of the resulting 5 offspring, what is the probability that:
(a) There will be no red flowered plants?
(b) There will be 4 or more red flowered plants?
Because the trials are conducted on different parent plants, it is natural to assume that they are independent. Let the random variable $X$ denote the number of red flowered plants among the 5 offspring. If we identify the occurrence of a red as a success S, the Mendelian theory specifies that $P(S) = p = \frac{1}{4}$, and hence $X$ has a binomial distribution with $n = 5$ and $p = .25$. The required probabilities are therefore:
(a) $P[X = 0] = f(0) = (.75)^5 = .237$
(b) $P[X \ge 4] = f(4) + f(5) = \binom{5}{4}(.25)^4(.75)^1 + \binom{5}{5}(.25)^5(.75)^0 = .015 + .001 = .016$
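The binomial formula lends itself to a direct numerical check. The following is a minimal sketch, not part of the original text; the function name `binomial_pmf` is an illustrative choice, and the code simply evaluates $f(x) = \binom{n}{x} p^x q^{n-x}$ for the two probabilities requested in Example 4.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability f(x) = C(n, x) * p**x * q**(n - x), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)


# Example 4: n = 5 offspring, each red flowered with probability p = .25.
prob_no_red = binomial_pmf(0, 5, 0.25)
prob_four_or_more = binomial_pmf(4, 5, 0.25) + binomial_pmf(5, 5, 0.25)

print(f"P[X = 0]  = {prob_no_red:.3f}")
print(f"P[X >= 4] = {prob_four_or_more:.3f}")
```

Rounded to three decimals, the printed values should agree with the hand calculation in the example.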
To illustrate the manner in which the values of $p$ influence the shape of the binomial distribution, the probability histograms for three binomial distributions with $n = 6$ and $p = .5$, .3, and .7, respectively, are presented in Figure 1. When $p = .5$, the binomial distribution is symmetric with the highest probability occurring at the center (see Figure 1a). For values of $p$ smaller than .5, more probability is shifted toward the smaller values of $x$ and the distribution has a longer tail to the right. Figure 1b, where the binomial histogram for $p = .3$ is plotted, illustrates this tendency. On the other hand, Figure 1c with $p = .7$ illustrates the opposite tendency: when the value of $p$ is higher than .5, more probability mass is shifted toward higher values of $x$, and the distribution has a longer tail to the left. Considering the histograms in Figures 1b and 1c, we note that the value of $p$ in one histogram is the same as the value of $q$ in the other. The probabilities in one histogram are exactly the same as those in the other, but their order is reversed. This illustrates a general property of the binomial distribution: when $p$ and $q$ are interchanged, the distribution of probabilities is reversed.
Figure 1 Binomial distributions for $n = 6$: (a) $p = .5$, (b) $p = .3$, (c) $p = .7$.
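The reversal property described for the $p = .3$ and $p = .7$ histograms can be checked numerically. The sketch below is an illustration only; the helper `binomial_pmf` and the variable names are assumptions introduced here, not notation from the text.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability f(x) for n trials and success probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


n = 6
for p in (0.5, 0.3, 0.7):
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    print(f"p = {p}: " + "  ".join(f"{v:.3f}" for v in probs))

# Interchanging p and q should list the same probabilities in reverse order.
probs_03 = [binomial_pmf(x, n, 0.3) for x in range(n + 1)]
probs_07 = [binomial_pmf(x, n, 0.7) for x in range(n + 1)]
reversed_match = all(
    abs(a - b) < 1e-12 for a, b in zip(probs_03, reversed(probs_07))
)
print("p = .3 equals p = .7 in reverse order:", reversed_match)
```

A small tolerance is used in the comparison because floating-point arithmetic does not reproduce the two sets of probabilities bit for bit.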
How to Use the Binomial Table (Appendix B, Table 2)
Although the binomial distribution is easily evaluated on a computer and some hand calculators, for readers' convenience we provide a short table in Appendix B, Table 2. It covers selected sample sizes $n$ ranging from 1 to 25, and several values of $p$. For a given pair $(n, p)$, the table entry corresponding to each $c$ represents the cumulative probability $P[X \le c] = \sum_{x=0}^{c} f(x)$, as is explained in the following scheme.
The Binomial Distribution              Appendix B, Table 2 provides

Value $x$    Probability $f(x)$        $c$      Table Entry $\sum_{x=0}^{c} f(x)$
0            $f(0)$                    0        $f(0)$
1            $f(1)$                    1        $f(0) + f(1)$
2            $f(2)$                    2        $f(0) + f(1) + f(2)$
...          ...                       ...      ...
$n$          $f(n)$                    $n$      1.000
Total        1
The probability of an individual value $x$ can be obtained from this table by a subtraction of two consecutive entries. For example,

$P[X = 2] = f(2) = (\text{table entry at } c = 2) - (\text{table entry at } c = 1)$
EXAMPLE 5
Suppose it is known that a new treatment is successful in curing a muscular pain in 50% of the cases. If it is tried on 15 patients, find the probability that:
(a) At most 6 will be cured.
(b) The number cured will be no fewer than 6 and no more than 10.
(c) 12 or more will be cured.
Designating the cure of a patient by S and assuming that the results for individual patients are independent, we note that the binomial distribution with $n = 15$ and $p = .5$ is appropriate for $X$ = number of patients who are cured. To compute the required probabilities, we consult the binomial table for $n = 15$ and $p = .5$.
(a) $P[X \le 6] = .304$, which is directly obtained by reading from the row $c = 6$.
(b) We are to calculate

$P[6 \le X \le 10] = \sum_{x=6}^{10} f(x)$

The table entry corresponding to $c = 10$ gives

$P[X \le 10] = \sum_{x=0}^{10} f(x) = .941$

and the entry corresponding to $c = 5$ gives

$P[X \le 5] = \sum_{x=0}^{5} f(x) = .151$

Since their difference represents the sum $\sum_{x=6}^{10} f(x)$, we obtain

$P[6 \le X \le 10] = P[X \le 10] - P[X \le 5] = .941 - .151 = .790$

(c) To find $P[X \ge 12]$ we use the law of complement:

$P[X \ge 12] = 1 - P[X \le 11] = 1 - .982 = .018$

Note that $[X < 12]$ is the same event as $[X \le 11]$.
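The table lookups in Example 5 can be mimicked by computing cumulative probabilities directly. This is a hedged sketch: `binomial_cdf` is a hypothetical helper standing in for a table entry $P[X \le c]$, not a reproduction of Appendix B.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability f(x) for n trials and success probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def binomial_cdf(c, n, p):
    """Cumulative probability P[X <= c], the quantity a cumulative table lists."""
    return sum(binomial_pmf(x, n, p) for x in range(c + 1))


n, p = 15, 0.5
at_most_6 = binomial_cdf(6, n, p)                                # part (a)
between_6_and_10 = binomial_cdf(10, n, p) - binomial_cdf(5, n, p)  # part (b)
at_least_12 = 1.0 - binomial_cdf(11, n, p)                       # part (c)

print(f"P[X <= 6]       = {at_most_6:.3f}")
print(f"P[6 <= X <= 10] = {between_6_and_10:.3f}")
print(f"P[X >= 12]      = {at_least_12:.3f}")
```

The same subtraction idea recovers a single probability, since `binomial_cdf(2, n, p) - binomial_cdf(1, n, p)` equals `binomial_pmf(2, n, p)` up to rounding error.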
(An aside: Refer to our "muscular pain" example in Section 1 of Chapter 4. The mystery surrounding the numerical probability .018 is now resolved.)

The Mean and Standard Deviation of the Binomial Distribution
Although we already have a general formula that gives the binomial probabilities for any $n$ and $p$, in later chapters we will need to know the mean and the standard deviation of the binomial distribution. The expression $np$ for the mean is apparent from the following intuitive reasoning: If a fair coin is tossed 100 times, the expected number of heads is $100 \times \frac{1}{2} = 50$. Likewise, if the probability of an event is $p$, then in $n$ trials the event is expected to happen $np$ times. The formula for the standard deviation requires some mathematical derivation, which we omit.

The binomial distribution with $n$ trials and success probability $p$ has
Mean $= np$
Variance $= npq$
sd $= \sqrt{npq}$
EXAMPLE 6
For the binomial distribution with $n = 3$ and $p = .5$, calculate the mean and the standard deviation. Employing the formulas, we obtain

Mean $= np = 3 \times .5 = 1.5$
sd $= \sqrt{npq} = \sqrt{3 \times .5 \times .5} = \sqrt{.75} = .866$

The mean agrees with the results of Chapter 5, Example 7. The reader may wish to check the standard deviation by numerical calculations using the definition of $\sigma$.
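The check suggested at the end of Example 6 can be carried out in a few lines. The sketch below assumes the usual definitions of the mean, $\sum x f(x)$, and the variance, $\sum (x - \text{mean})^2 f(x)$; the function names are illustrative.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """Binomial probability f(x) for n trials and success probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def mean_and_sd_from_definition(n, p):
    """Mean and sd computed from the probability distribution itself."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * f for x, f in enumerate(probs))
    variance = sum((x - mean) ** 2 * f for x, f in enumerate(probs))
    return mean, sqrt(variance)


n, p = 3, 0.5
mean_def, sd_def = mean_and_sd_from_definition(n, p)
mean_formula = n * p
sd_formula = sqrt(n * p * (1.0 - p))

print(f"definition: mean = {mean_def:.3f}, sd = {sd_def:.3f}")
print(f"formulas:   mean = {mean_formula:.3f}, sd = {sd_formula:.3f}")
```

Both routes should give the same mean and standard deviation, which is the agreement the example asks the reader to verify.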
EXERCISES
3.1 Construct a tree diagram for three Bernoulli trials. Attach probabilities in terms of $p$ and $q$ to each outcome and then table the binomial distribution for $n = 3$.
3.2 (a) Plot the probability histograms for the binomial distribution for $n = 5$ and for $p = .2$, .5, and .8. (b) Locate the means. (c) Find $P[X \ge 4]$ for each of the three cases.
3.3 A stop light on the way to class is red 60% of the time. What is the probability of hitting a red light: (a) 2 days in a row? (b) 3 days in a row? (c) 2 out of 3 days?
3.4 A basketball team scores 40% of the times it gets the ball. Find the probability that the first basket occurs on its third possession. (Assume independence.)
3.5 Using the binomial table, find the probability of: (a) 3 successes in 8 trials when $p = .4$. (b) 7 failures in 16 trials when $p = .6$. (c) 3 or fewer successes in 9 trials when $p = .4$. (d) More than 12 successes in 16 trials when $p = .7$. (e) The number of successes between 8 and 13 (both inclusive), in 16 trials when $p = .6$.
3.6 Using the binomial table, find the probability of: (a) 3 or fewer successes for $p = .1$, .2, .3, .4, and .5 when $n = 12$. (b) 3 or fewer successes for $p = .1$, .2, .3, .4, and .5 when $n = 18$.
3.7 A sociologist feels that only half of the high-school seniors capable of graduating from college go to college. Of 17 high-school seniors who have the ability to graduate from college, find the probability that 10 or more will go to college if the sociologist is correct. Assume that the seniors will make their decisions independently. Also find the expected number.
3.8 Only 30% of the people in a large city feel that its mass transit system is adequate. If 20 persons are selected at random, find the probability that 5 or fewer will feel that the system is adequate. Find the probability that exactly 5 will feel that the system is adequate.
3.9 Calculate the mean and standard deviation of the binomial distribution with: (a) n (b) n (c) n
3.10 (a) For the binomial distribution with $n = 3$ and $p = .6$, list the probability distribution $(x, f(x))$ in a table. (b) Calculate, from this table, the mean and standard deviation by using the methods of Chapter 5, Section 4. (c) Check your results with the formulas: mean $= np$, sd $= \sqrt{npq}$.
Poor Linus. Chance did not even favor him with half correct.
4. TESTING HYPOTHESES ABOUT A POPULATION PROPORTION
4.1 BACKGROUND
On several occasions, we alerted the reader that probability plays a major role in drawing inferences from the sample data. Now that the binomial probability model has been introduced, we can discuss the role of probability in an important type of inference, called testing of statistical hypotheses. Our immediate concern is primarily with applications of the binomial model. However, the key ideas here carry over intact to later chapters where we treat inference problems involving other probability models. Broadly speaking, the goal of testing statistical hypotheses is to determine if a conjecture about some feature of a population is strongly supported by information obtained from the sample data.
A statistical hypothesis is a statement about the population. Its plausibility is to be evaluated on the basis of information obtained by sampling from the population.
A few examples of situations where tests of hypotheses are appropriate will help set our goal in clearer terms.
Sampling inspection. A contractor supplies some items, in large lots, to an assembly plant. A lot is considered to be of acceptable quality if less than 8% of the items are defective. Because screening all items in a lot is prohibitively expensive, management wants to inspect a sample of the items and then decide between purchase or return of the complete lot. A decision in favor of purchase will be made only if the sample strongly indicates that the lot is of acceptable quality.
Documentation of a genetic theory. Some tentative assumptions on the propagation of genes have led to the model that 25% of a progeny of plants will possess a certain dominant character. By experimenting with a sample of such plants, the geneticist wants to learn whether or not the model is seriously contradicted.
A new "improved" laundry soap is claimed to have better cleaning power than the "old" formula. Several different piles of dirty clothes will be washed, one half with each cleaner, and the number of cases where the new brand cleans better than the old will be observed. The intent here is to examine whether or not the data substantiate the claim.
The reader will notice a few common features of these examples: (a) each concerns two complementary statements (or hypotheses) about an unknown population quantity; (b) available information is in the form of a sample from the population; and (c) the investigator wants to determine if a particular statement (or hypothesis) is strongly borne out by the sample data. For instance, in the example of sampling inspection, a lot is the population whose fraction defective $p$ is unknown. The hypotheses of interest are that the lot is of acceptable quality ($p < .08$) or not ($p \ge .08$). By inspecting a sample of items, one wants to determine if the hypothesis $p < .08$ is strongly supported.
4.2 THE NULL AND THE ALTERNATIVE HYPOTHESES
A discussion of the formulation of a statistical hypothesis testing problem and the steps for solving it requires the introduction of a number of definitions and concepts. Instead of discussing this topic in its full generality, we develop its basic ideas in terms of a specific problem in which the chance behavior is governed by the binomial distribution.
Problem: Experience has shown that the cure rate for a given disease using a standard medication is 40%. The cure rate of a new drug is anticipated to be better than the standard medication. Suppose that the new drug is to be tried on a sample of 20 patients and that the number cured $X$ in the 20 is to be recorded. How should the experimental data be used to answer the question: "Is there substantial evidence that the new drug has a higher cure rate than the standard medication?"
The cure rate of the new drug is a proportion $p$ whose value can be correctly ascertained only if the drug were administered to a vast number of patients. In light of the question raised in the statement of the problem, the following two hypotheses are relevant:
The new drug is better than the standard medication: $p > .4$
The new drug is not better than the standard medication: $p \le .4$
The success rate $p$ of the new drug is unknown. Of the two statements concerning $p$, one is called the null hypothesis $H_0$ and the other is called the alternative hypothesis $H_1$. To determine which hypothesis should be labeled the null hypothesis, the difference between the roles and the implications of these two terms should be clearly understood.
Choice of $H_0$ and $H_1$
When our goal is to establish an assertion with substantive support obtained from the sample, the negation of the assertion is taken to be the null hypothesis $H_0$, and the assertion itself is taken to be the alternative hypothesis $H_1$.

The word "null" in this context can be interpreted to mean that the assertion we seek to establish is actually void. Before claiming that a statement is established statistically, adequate evidence from data must be produced to support it. A close analogy can be made to a court trial where the jury clings to the null hypothesis of "not guilty" unless there is convincing evidence of guilt. The intent of the hearings is to establish the assertion that the accused is guilty rather than to prove that he or she is innocent.
                                  Court Trial                    Testing Statistical Hypothesis
Require strong evidence of:       Guilt                          Conjecture
Null hypothesis ($H_0$):          Not guilty                     Conjecture is false
Alternative hypothesis ($H_1$):   Guilty                         Conjecture is true
Attitude:                         Uphold "not guilty" unless     Retain the null hypothesis unless
                                  there is strong evidence       the sample data testify strongly
                                  of guilt                       against it

False rejection of $H_0$ is a more serious error than failing to reject $H_0$ when $H_1$ is true.
In our drug experiment, we seek to establish statistically that the new drug is better. This statement should therefore be the alternative hypothesis, and the null hypothesis should be the statement that it is not better. A rejection of $H_0$ would amount to an endorsement of the new drug. A false rejection of $H_0$ is a more serious error because marketing a bad drug is more serious than a failure to publicize a potentially good drug. In view of these guidelines, the specification of the null and alternative hypotheses in our problem should be
$H_0$: $p \le .4$ (new drug is not better)
$H_1$: $p > .4$ (new drug is better)
The relevant information from trying the new drug on 20 patients will be in the form of $X$, the number of cures. Any of the values 0, 1, ..., 20 of $X$ is physically possible under both $H_0$ and $H_1$, so that none of these outcomes can absolutely prove that $H_0$ is true or that $H_1$ is true. We can only ask ourselves: What values of $X$ can be regarded as strong evidence for the superiority of the new drug (that is, strong evidence against the null hypothesis)? Intuition suggests that large values of $X$ would strongly indicate that $H_0$ may be false, whereas small values of $X$ would support $H_0$. For instance, 19 or 20 cures out of 20 patients suggests that $p$ is near 1. Furthermore, if $H_0$ is true, we would expect 8 or fewer cures, because the mean number of cures $np$ is at most $20 \times .4 = 8$; therefore somewhat more than 8 cures could be viewed as reasonable evidence for rejecting $H_0$. But where should we draw the line? For an objective decision procedure, we need to specify a course (or rule) of action. For example, we may adopt the

Decision rule: Reject $H_0$ if $X \ge 12$. Retain $H_0$ if $X \le 11$.

A convenient notation: $R: X \ge 12$ (Read as: reject $H_0$ if $X \ge 12$)

Such a specific course of action is called a test of the null hypothesis, and $X$ is called a test statistic. Obviously, many other rules could be considered.

The random variable $X$ whose value serves to determine the action is called the test statistic. A test of the null hypothesis is a course of action specifying the set of values of a test statistic $X$ for which $H_0$ is to be rejected. This set is called the rejection region of the test. A test is completely specified by a test statistic and the rejection region.
Figure 2 Display of the rejection region $R: X \ge 12$ among the possible values 0, 1, ..., 20 of the test statistic $X$.
4.3 THE TWO TYPES OF ERRORS
We continue our discussion of the problem concerning the cure rate of a new medicine. Common sense suggests that $H_0$: $p \le .4$ should be rejected for large values of $X$, although no reason has yet been forwarded for the choice of $X \ge 12$ as the rejection region. Other possible choices might be $X \ge 10$ or $X \ge 15$. When choosing a rejection region, we must consider our chances of making a wrong decision with any rule that we use. Considering the unknown state of nature and the possible results from applying a test, we see that one of the following situations will arise:

                         UNKNOWN TRUE STATE OF NATURE
TEST CONCLUDES:          $H_0$ True ($p \le .4$)     $H_0$ False ($p > .4$)
Do not reject $H_0$      Correct                     Wrong (Type II error)
Reject $H_0$             Wrong (Type I error)        Correct

The decision reached by using a test may be wrong in either of the two ways indicated here: (a) $H_0$ may be true, and the test may conclude that it should be rejected, or (b) the test may fail to reject $H_0$ when $H_1$ is true. These two types of errors are called the type I error and the type II error, respectively.

Two Types of Errors
Rejection of $H_0$ when $H_0$ is true: Type I error
Failure to reject $H_0$ when $H_1$ is true: Type II error
If $p$ belongs to $H_0$, only a type I error can occur. On the other hand, a type II error can occur only if $p$ belongs to $H_1$. The decision reached by a test depends on the observed value of $X$ = number of cures out of 20. Because $X$ is variable, there is a probability that an error will occur. Errors cannot be prevented, but a small error probability implies that the error is unlikely to occur.
EXAMPLE 7
Refer to the clinical study of a new drug where the rejection region $R: X \ge 12$ will be used to test $H_0$: $p \le .4$ vs. $H_1$: $p > .4$. Determine the type of error that can occur and calculate the error probability when (a) $p = .3$ and (b) $p = .7$.
(a) When $p = .3$, the null hypothesis $H_0$: $p \le .4$ is true. The only possible error in this case is to reject $H_0$ (type I error).

$P[\text{type I error given } p = .3] = P[X \ge 12 \text{ given } p = .3]$

Because responses of different patients are independent, the binomial distribution is appropriate for $X$. For $n = 20$ and $p = .3$, the binomial table provides

$P[X \ge 12] = 1 - P[X \le 11] = 1 - .995 = .005$

We conclude $P[\text{type I error given } p = .3] = .005$.
(b) When $p = .7$, the alternative hypothesis $H_1$: $p > .4$ is true. The only possible error is failure to reject $H_0$ (type II error). Observe that $H_0$ is not rejected only if $X \le 11$. From the binomial table with $n = 20$ and $p = .7$,

$P[\text{type II error given } p = .7] = P[X \le 11 \text{ given } p = .7] = .113$
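The two error probabilities in Example 7 can be reproduced without a table. The following is a minimal sketch under the example's setup ($n = 20$, rejection region $R: X \ge 12$); `rejection_probability` is an illustrative name introduced here.

```python
from math import comb


def binomial_cdf(c, n, p):
    """Cumulative binomial probability P[X <= c]."""
    return sum(comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(c + 1))


def rejection_probability(cutoff, n, p):
    """P[X >= cutoff], the chance of landing in the region R: X >= cutoff."""
    return 1.0 - binomial_cdf(cutoff - 1, n, p)


n, cutoff = 20, 12

# (a) p = .3 lies in H0, so rejecting H0 is a type I error.
type_1_at_03 = rejection_probability(cutoff, n, 0.3)

# (b) p = .7 lies in H1, so failing to reject H0 is a type II error.
type_2_at_07 = 1.0 - rejection_probability(cutoff, n, 0.7)

print(f"P[type I error | p = .3]  = {type_1_at_03:.3f}")
print(f"P[type II error | p = .7] = {type_2_at_07:.3f}")
```

Writing the type II error as one minus the rejection probability mirrors the observation that $H_0$ is not rejected only if $X \le 11$.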
Let us focus our attention on the type I error, which is after all the more serious of the two types of error. In Example 7 we found that the test $R: X \ge 12$ has type I error probability .005 when $p = .3$. Repeating that calculation for other values of $p$ in $H_0$, the following error probabilities can be obtained.

$p$ in $H_0$                 .2       .3      .4
Type I error probability     .000+    .005    .057
These illustrate the general fact that large values of $X$ are more likely when $p$ is increased. We see that, over all values of $p$ in $H_0$: $p \le .4$, the type I error probability is largest at $p = .4$, the boundary point between $H_0$ and $H_1$. Therefore when choosing a test, we need only be concerned with the magnitude of the error probability at this boundary point.

The maximum type I error probability of a test is called its level of significance and is denoted by $\alpha$.

The level of significance is determined from the value of the parameter on the boundary between $H_0$ and $H_1$. In our example, the test with rejection region $R: X \ge 12$ has level of significance $\alpha = P[X \ge 12 \text{ when } p = .4] = .057$.

The type II error probability is denoted by $\beta$.

$\beta = P[\text{Type II error}]$ depends on the value of $p$ in $H_1$.
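That the type I error probability peaks at the boundary value can be seen by evaluating the rejection probability across several values of $p$ in $H_0$. The sketch below is illustrative; the choice of $p$ values follows the small table in the text, and the function name is an assumption.

```python
from math import comb


def rejection_probability(cutoff, n, p):
    """P[X >= cutoff] for a binomial X with n trials and success probability p."""
    return sum(comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(cutoff, n + 1))


n, cutoff = 20, 12
type_1_by_p = {p: rejection_probability(cutoff, n, p) for p in (0.2, 0.3, 0.4)}

for p, prob in type_1_by_p.items():
    print(f"p = {p}: P[type I error] = {prob:.3f}")

# The level of significance is the largest of these, attained at the boundary p = .4.
alpha = max(type_1_by_p.values())
print(f"level of significance = {alpha:.3f}")
```

Because the rejection probability increases with $p$, only the boundary value needs to be examined when the level of significance is wanted.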
4.4 CHOOSING THE TEST
The conventional practice is to ensure that $\alpha$ is controlled below a predetermined level of tolerance. Suppose in our study of the new medication we require that the type I error probability of the test should not exceed .07. Scanning the binomial table for $n = 20$ and $p = .4$ we find that

$P[X \ge 11] = 1 - .872 = .128$
$P[X \ge 12] = 1 - .943 = .057$
The test $R: X \ge 11$ violates our requirement on $\alpha$, but the test $R: X \ge 12$ just meets the requirement. Therefore we use $R: X \ge 12$. It is not necessary to shrink the rejection region further because doing so will only increase $\beta = P[\text{Fail to reject } H_0 \text{ given } H_1 \text{ is true}]$. The specification of the tolerance level for the type I error probability is not a statistical problem. It must be ascertained from considerations of the strength of the evidence that is required to reject $H_0$. Traditionally, low values such as $\alpha = .01$, .05, or .10 are used to perform statistical tests. From the interpretation of probability, an $\alpha$ of .05 means that $H_0$ would be wrongly rejected in about 5 out of 100 independent tests in which $H_0$ is actually true.
EXAMPLE 8
For the problem of evaluating the new drug with 20 trials,
(a) Formulate the test with the level of significance approximately 2%.
(b) What error could be made if the true cure rate of the drug were .7? Evaluate this error probability for the test determined in (a).
(a) To formulate the test we again consider the type I error probability at $p = .4$, the boundary point between $H_0$: $p \le .4$ and $H_1$: $p > .4$. Consulting the binomial table for $n = 20$ and $p = .4$, we find $P[X \ge 12] = .057$, $P[X \ge 13] = .021$, and $P[X \ge 14] = .006$. Thus the test $R: X \ge 13$ provides $\alpha = .021$, which is close to the specification.
(b) At $p = .7$ the alternative is true, so the only possible error is not to reject $H_0$. The type II error probability $\beta$, at the alternative $p = .7$, is the probability of nonrejection $P[X \le 12]$. Consulting the binomial table for $n = 20$ and $p = .7$, we find

$\beta = P[X \le 12] = .228$

EXAMPLE 9
A large shipment of golf balls is acceptable to a purchaser if less than 20% of them are defective in the sense of not passing a bounce test. A random sample of 18 balls will be inspected, and the purchaser will accept the batch if the inspection turns up strong evidence of an acceptable batch quality.
(a) Formulate the hypotheses.
(b) Set the rejection region with $\alpha$ not exceeding .10.
(c) If the shipment contains only 5% defectives, what is the probability that the purchaser will decide against buying it?
(a) Denote by $p$ the unknown fraction defective of the shipment. Since strong evidence is sought in support of $p < .2$, we formulate the hypotheses
$H_0$: $p \ge .2$  vs.  $H_1$: $p < .2$

(b) The test statistic is $X$, the number of defectives in a random sample of 18 balls from the batch. The possible values of $X$ are 0, 1, ..., 18. Since extremely small values of $X$ should contradict the null hypothesis, the rejection region should be of the form $R: X \le c$. We will determine $c$ from the specification that $\alpha$ is not to exceed .10. Since the shipment (population) size is large, the binomial probability model for $X$ is appropriate. Consulting the binomial table with $n = 18$ and $p = .2$ we find $P[X \le 1] = .099$ and $P[X \le 2] = .271$. Therefore, we choose the rejection region $R: X \le 1$, which has $\alpha = .099$.
(c) We are to find the probability of not rejecting the null hypothesis when, in fact, $p = .05$. From the binomial table for $n = 18$ and $p = .05$ we obtain the probability

$P[X \ge 2 \text{ when } p = .05] = 1 - .774 = .226$
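The scan of the binomial table used in Section 4.4 and in Example 9 amounts to enlarging the rejection region one value at a time until the tolerance on $\alpha$ would be exceeded. The sketch below shows one way to do this; the function names and the convention of returning both the cutoff and the achieved $\alpha$ are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability f(x) for n trials and success probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def upper_tail_cutoff(n, p0, alpha_max):
    """Smallest c with P[X >= c | p0] <= alpha_max, for a region R: X >= c."""
    tail = 0.0
    best = (n + 1, 0.0)  # empty rejection region if no cutoff qualifies
    for c in range(n, -1, -1):
        tail += binomial_pmf(c, n, p0)
        if tail > alpha_max:
            break
        best = (c, tail)
    return best


def lower_tail_cutoff(n, p0, alpha_max):
    """Largest c with P[X <= c | p0] <= alpha_max, for a region R: X <= c."""
    tail = 0.0
    best = (-1, 0.0)  # empty rejection region if no cutoff qualifies
    for c in range(0, n + 1):
        tail += binomial_pmf(c, n, p0)
        if tail > alpha_max:
            break
        best = (c, tail)
    return best


# Drug study: H0: p <= .4, alpha not to exceed .07.
drug_cutoff, drug_alpha = upper_tail_cutoff(20, 0.4, 0.07)
# Golf balls: H0: p >= .2, alpha not to exceed .10.
golf_cutoff, golf_alpha = lower_tail_cutoff(18, 0.2, 0.10)

print(f"drug study: R: X >= {drug_cutoff}, alpha = {drug_alpha:.3f}")
print(f"golf balls: R: X <= {golf_cutoff}, alpha = {golf_alpha:.3f}")
```

In both cases the probabilities are evaluated at the boundary value of $p$, in line with the definition of the level of significance.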
4.5 DRAWING CONCLUSIONS FROM A TEST
Having determined the rejection region with a preset low level of significance $\alpha$, we are ready to implement the test. In our drug trial problem we wish to test $H_0$: $p \le .4$ vs. $H_1$: $p > .4$ on the basis of $X$, the number of cures out of 20 patients treated with the new drug. If $\alpha = .057$, the corresponding rejection region is $R: X \ge 12$. Once the experiment is performed, we observe a numerical value of the test statistic $X$, and decide for or against rejecting $H_0$. Suppose, out of the 20 patients, 14 are found to be cured. What conclusion would we draw from this observation? Since the observed value 14 falls in the rejection region, we would reject the null hypothesis at the 5.7% level of significance and conclude that there is substantial evidence that the new drug performs better than the standard medicine. If, on the other hand, the observed value is $x = 9$, $H_0$ is not rejected at the 5.7% level of significance and we would conclude that the data do not provide convincing evidence that the new drug is better than the standard medicine. Note in particular, the retention of $H_0$ is interpreted only as a lack of evidence to reject it. Rather than saying $H_0$ is accepted, we say that $H_0$ is not rejected.

With a specified level of significance $\alpha$, a test conclusion should be stated as
$H_0$ is rejected at the level of significance $\alpha$
or, to the contrary,
$H_0$ is not rejected at the level of significance $\alpha$
Choosing $\alpha = .057$ in our example, an observed value $x = 12$ leads to the rejection of $H_0$, as does an observed value of $x = 14$. Apparently, however, $x = 14$ constitutes stronger evidence in support of $H_1$ than the value $x = 12$ does. To pursue the idea of strength of evidence, we can determine how small $\alpha$ could have been and yet have $H_0$ be rejected on the basis of the observed value. For instance, if $x = 14$ is observed, the tests $X \ge 14$, $X \ge 13$, $X \ge 12$ will all lead to the rejection of $H_0$. Consulting the binomial table for $n = 20$ and $p = .4$, the corresponding $\alpha$'s are found to be .006, .021, .057. The smallest possible $\alpha$ that would permit the rejection of $H_0$, on the basis of the observed value $x = 14$, is therefore .006. This $\alpha$ value is called the significance probability (or P-value) of the observation $x = 14$.

The significance probability (or P-value) of an observed test statistic is the smallest $\alpha$ for which this observation leads to the rejection of $H_0$.

In other words, the P-value is the probability under $H_0$ of the occurrence of the particular observed value or more extreme values. The significance probability gauges the strength of evidence against $H_0$ on a numerical scale. A small P-value indicates a strong justification of rejection. In addition to performing a test of hypothesis with a predetermined $\alpha$, it is a good statistical practice to record the significance probability as well.
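For the drug study, where large values of $X$ are the extreme ones, the P-value is an upper-tail probability computed at the boundary value $p = .4$. The following sketch assumes that setting; `upper_tail_p_value` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability f(x) for n trials and success probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def upper_tail_p_value(x_observed, n, p0):
    """P[X >= x_observed] at the boundary value p0 of the null hypothesis."""
    return sum(binomial_pmf(x, n, p0) for x in range(x_observed, n + 1))


n, p0 = 20, 0.4
p_value_12 = upper_tail_p_value(12, n, p0)
p_value_14 = upper_tail_p_value(14, n, p0)

print(f"P-value for x = 12: {p_value_12:.3f}")
print(f"P-value for x = 14: {p_value_14:.3f}")
```

The smaller P-value for $x = 14$ expresses numerically that 14 cures is stronger evidence against $H_0$ than 12 cures.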
EXAMPLE 10
Refer to the sampling inspection situation of Example 9. Suppose we inspect a random sample of 18 balls from a shipment and find that none is defective. Find the significance probability of this observation and interpret the result.
For this testing problem, small values of $X$ go in the rejection region. Note that there are no values of $X$ that are more extreme than the observed value $x = 0$. Consequently, the significance probability is the probability of $[X = 0]$ calculated under $p = .2$. We get

P-value $= P[X = 0] = (.8)^{18} = .018$

Given the observed value $x = 0$, $H_0$ would be rejected with $\alpha$ as low as 1.8%, strongly contradicting the null hypothesis.

*Power of a Test
How well a test is likely to perform must be judged in light of the probabilities of the two types of error. Note that $P[H_0 \text{ not rejected}] = 1 - P[H_0 \text{ rejected}]$. Thus, a convenient means of tracking both the error probabilities is the calculation of $P[H_0 \text{ is rejected}]$ under various values of $p$ in the range covered by $H_0$ as well as those in the range covered by $H_1$.
EXAMPLE 11
Refer to the problem of clinical study with the new drug, and consider the test $R: X \ge 10$ for testing the null hypothesis $H_0$: $p \le .4$ vs. the alternative $H_1$: $p > .4$. Calculate and plot the probabilities of the two types of error.
Recall that the distribution of $X$ is binomial with $n = 20$ and $p$ denotes the unknown true cure rate of the new drug. The rejection probability $P[X \ge 10]$ can be obtained from the binomial table for $n = 20$ and for each specified value of $p$. For instance, when $p = .3$, we obtain from the binomial table $P[X \ge 10] = 1 - P[X \le 9] = 1 - .952 = .048$. The probabilities $P[X \ge 10]$ for various values of $p$ are listed in Table 2, and plotted in Figure 3. When $p$ is greater than .4, $H_1$ is true; therefore, it is not possible to make a type I error.

Table 2 The Probabilities of Rejection of $H_0$ for the Test $R: X \ge 10$

                          $H_0$                    $H_1$
$p$                       .2      .3      .4       .5      .6      .7      .8
$P[X \ge 10]$             .003    .048    .245     .588    .872    .983    .999
P[type I error]           .003    .048    .245     Cannot make type I error
P[type II error]          Cannot make type II error        .412    .128    .017    .001
r
k A
tl
ac) 0 . 5 P
Atp
c) d
(l
0.4 o.245
O
;o.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
HO
Figure3 The rejection probobility (power) curve for R:X ,> 70.
1'0 p
Under $H_0$, $p$ is restricted to the range $p \le .4$, which is to the left of the vertical line at $p = .4$. In this part of the graph, the rejection probability is, by definition, the same as the type I error probability. Under $H_1$, the range of $p$ is $p > .4$, which is to the right of the vertical line. In this range, $P[\text{type II error}] = 1 - P[H_0 \text{ is rejected}]$. Thus the graph of the rejection probability curve provides a means of assessing the performance of the test for all possible conditions regarding the true state of nature. The two types of error probabilities can be obtained from the two parts of this graph:

Type I error probability = height of the curve at a value $p \le .4$
Type II error probability $\beta$ = 1 - height of the curve at a value $p > .4$

A plot of the rejection probabilities versus the parameter values is called the power curve of the test.

Can we improve upon the test $R: X \ge 10$? The test $R: X \ge 11$ has a smaller rejection region, so $P[\text{Reject } H_0]$ is smaller for all $p$. That is, $\alpha$ is smaller but $\beta$ is larger. In an ideal test, $\alpha$ and $\beta$ are both as small as possible, but such a solution cannot be obtained with the sample size fixed. A compromise must be reached. Because the realization of a type I error is deemed more serious than a type II error, the conventional practice is to ensure that $\alpha$ is controlled below a predetermined level.
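The rejection probabilities behind a power curve are straightforward to tabulate. The sketch below, which is illustrative rather than part of the original text, computes them for the tests $R: X \ge 10$ and $R: X \ge 11$ so that the trade-off between $\alpha$ and $\beta$ can be seen side by side; the function name `power` and the grid of $p$ values are assumptions.

```python
from math import comb


def power(cutoff, n, p):
    """Rejection probability P[X >= cutoff] when the true success probability is p."""
    return sum(comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(cutoff, n + 1))


n = 20
p_grid = (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
power_10 = {p: power(10, n, p) for p in p_grid}
power_11 = {p: power(11, n, p) for p in p_grid}

print("  p    R: X>=10   R: X>=11")
for p in p_grid:
    print(f" {p:.1f}    {power_10[p]:.3f}      {power_11[p]:.3f}")

# For p in H1 (p > .4), the type II error probability is 1 - power.
beta_10_at_07 = 1.0 - power_10[0.7]
beta_11_at_07 = 1.0 - power_11[0.7]
print(f"beta at p = .7: {beta_10_at_07:.3f} (X>=10) vs {beta_11_at_07:.3f} (X>=11)")
```

Shrinking the rejection region lowers every entry in the second column relative to the first, which is the numerical counterpart of a smaller $\alpha$ paired with a larger $\beta$.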
EXERCISES
4.1 Identify the null and the alternative hypotheses in terms of descriptive statements. The type of answer required is illustrated in (a).
(a) A construction engineer wishes to determine if a new cement mix has a better bonding quality than the mix currently in use. The new mix is more expensive, so the engineer would not recommend it unless its better quality is supported by experimental evidence. The bonding quality is to be observed from several cement slabs prepared with the new mix.
Answer:
Null hypothesis: The new mix is not better.
Alternative hypothesis: The new mix is better.
(b) A state labor department wishes to determine if the current rate of unemployment in the state varies significantly from the forecast of 5% made two months ago.
(c) During the flu epidemic, 20% of a city population suffers from flu attacks. A physician theorizes that regular users of vitamin C are less susceptible to flu. She intends to sample 500 regular users of vitamin C, determine how many of them had the flu, and use the data to document her claim.
(d) A toothpaste manufacturer wants to establish that their sales amount to over 40% of the market in your town.
4.2 A market researcher will test H0: p ≤ .6 against H1: p > .6, where p = proportion of market captured by product A. Based on n = 25 customers, the market researcher will use the test: reject H0 if 19 or more are users of Product A.
(a) What error could be made if p
(b) What error could be made if p
4.3 In Exercise 4.2, identify: (a) parameter, (b) test statistic, (c) rejection region.
4.4 In Exercise 4.2, evaluate: (a) Level of significance. (b) β when p = .7.
4.5 In testing the null hypothesis H0: p = .5 vs. the alternative H1: p < .5 for a binomial model, the rejection region of a test has the structure X ≤ c, where X is the number of successes in n trials. For each of the following tests, determine the level of significance and the probability of a type II error at the alternative p = .3.
(a) n = 10, c = 2 (b) n = 10, c = 3 (c) n = 19, c = 7
4.6 On a typical day 70% of all flights depart on time from a major airport. A slowdown is suspected and you decide to test H0: p ≥ .7 against H1: p < .7. You will reject H0 if in a sample of 18 flights, 9 or fewer departures are on time.
(a) What error could be made if p = .4?
(b) What error could be made if p = .8?
4.7 In the context of Exercise 4.6, identify: (a) parameter, (b) test statistic, (c) rejection region.
4.8 In the context of Exercise 4.6, evaluate: (a) Level of significance. (b) β when p = .4. (c) P-value if X = 7.
4.9 Among 1-year-old flashlight batteries, only 70% possess a specified strength. Given a proposed new method of storing, it is anticipated that a higher percentage of batteries will possess this specified strength. Let p = P[Specified strength].
(a) State the null hypothesis of "no improvement" and the alternative hypothesis of "improvement" under the new method of storing.
(b) Given a sample size of 13, determine the rejection region so that α < .07.
(c) What is β for p = .95?
(d) If only 3 out of 13 of the batteries sampled do not possess the specified strength, what is your conclusion?
(e) Evaluate the P-value.
4.10 A psychiatrist believes that more than 50% of the users of sleeping pills sleep better simply because of the psychological effect of taking the pills. To substantiate this hypothesis with data, she selects a random sample of 20 insomniacs and gives each of them a box of pills to use. These are actually sugar pills but are otherwise identical to a popular brand of sleeping pills currently on the market. Subsequently, 15 of these patients report that the pills have been effective in inducing sleep. Assuming a level of significance close to .05, does this observation support the psychiatrist's conjecture?
4.11 A study of spiders leads an investigator to conjecture that less than 40% of all spider webs are built on the ground. The data to corroborate this observation consist of a count of the number of webs on the ground X among a sample of 20 webs.
(a) State the null and the alternative hypotheses.
(b) With α = .1, determine the rejection region.
(c) What conclusion would you draw from the observation X = 8?
*(d) Evaluate P[Reject H0] for p = .1, .2, .3, .4, .5 and graph the power curve of your test.
KEY IDEAS
Bernoulli trials are defined by the characteristics: (a) two possible outcomes, success (S) or failure (F), for each trial, (b) a constant probability of success, and (c) independence.
Sampling from a finite population without replacement violates the requirement of independence. If the population is large and the sample size is small, the trials can be treated as independent for all practical purposes.
The number of successes X, in a fixed number of Bernoulli trials, is called a binomial random variable. Its probability distribution, called the binomial distribution, is given by

$$f(x) = \binom{n}{x} p^x q^{n-x} \quad \text{for } x = 0, 1, \ldots, n$$

where n = number of trials, p = the probability of success in each trial, and q = 1 - p.
The binomial distribution has
Mean = np
Standard deviation = √(npq)
A statistical hypothesis is a statement about a population parameter.
A statement or claim, which is to be established with a strong support from the sample data, is formulated as the alternative hypothesis (H1). The null hypothesis (H0) says that the claim is void.
A test is a decision rule that tells us when to reject H0 and when not to reject H0. A test is specified by a test statistic and a rejection region.
A wrong decision may occur in one of the two ways:
A false rejection of H0 (type I error)
Failure to reject H0 when H1 is true (type II error)
Errors cannot always be prevented when making a decision based on a sample. It is their probabilities that we attempt to keep small.
A type I error is considered to be more serious. The maximum type I error probability of a test is called its level of significance and is denoted by α.
The main steps in testing statistical hypotheses are:
1. Identify the null hypothesis (H0) and the alternative hypothesis (H1). See Section 4.2 for the guidelines.
2. Choose the test statistic.
3. With a selected α, determine the rejection region.
4. Check whether or not the value of the test statistic, calculated from the observed sample data, falls in the rejection region. Draw a conclusion accordingly.
5. It is a good statistical practice also to state the significance probability or P-value of an observed test statistic. It is the smallest α for which this observation leads to the rejection of H0.
The type II error probability is denoted by β. For a given test, a graph of P[test rejects H0] vs. parameter value is called the power curve of the test. It gives complete information about the error probabilities of the test.
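The quantities summarized above (level of significance, type II error probability, P-value, and power curve) can all be computed from the binomial distribution. The following Python sketch is one way to do so. The test it uses (n = 20, reject H0 if X ≥ 15, for H0: p ≤ .5 against H1: p > .5) and the function names are hypothetical choices made for illustration; they are not taken from the text.

```python
from math import comb

def binom_pmf(x, n, p):
    """P[X = x] for a binomial count with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def upper_tail(c, n, p):
    """P[X >= c], the probability that the test 'reject H0 if X >= c' rejects."""
    return sum(binom_pmf(x, n, p) for x in range(c, n + 1))

# Hypothetical test (an assumption, not from the text):
# H0: p <= .5 vs. H1: p > .5, n = 20 trials, reject H0 if X >= 15.
n, c, p0 = 20, 15, 0.5

# Level of significance: the maximum type I error probability, attained at p = .5.
alpha = upper_tail(c, n, p0)

# Type II error probability at the alternative p = .7: P[fail to reject H0].
beta_at_07 = 1 - upper_tail(c, n, 0.7)

# Power curve: P[test rejects H0] evaluated at p = .1, .2, ..., .9.
power_curve = [(k / 10, upper_tail(c, n, k / 10)) for k in range(1, 10)]

# P-value of a hypothetical observation x = 16: the smallest alpha at which
# this observation would lead to rejection of H0.
p_value = upper_tail(16, n, p0)

print(f"alpha = {alpha:.4f}, beta at p = .7: {beta_at_07:.4f}, P-value = {p_value:.4f}")
for p, power in power_curve:
    print(f"p = {p:.1f}  P[reject H0] = {power:.4f}")
```

For a lower-tailed test (reject H0 if X ≤ c), the same idea applies with the sum taken over x = 0, ..., c.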
EXERCISES
5.1 Is the model of the Bernoulli trials plausible in each of the following situations? Discuss in what manner (if any) a serious violation of the assumptions can occur.
(a) Beetles of a common strain are sprayed with a given concentration of an insecticide and the occurrence of death or survival is recorded in each case.
(b) A word association test is given to 10 first-grade children and the amount of time each child takes to complete the test is recorded.
(c) Items coming off an assembly line are inspected and classified defective or nondefective.
(d) Going house by house down the block and recording if the newspaper was delivered on time.
5.2 If the probability of having a male child is .5, find the probability that the third child is the first son.
5.3 If the probability of getting caught copying someone else's exam is .2, find the probability of not getting caught in three attempts. Assume independence.
5.4 A backpacking party carries three emergency signal flares, each of which will light with a probability of .99. Assuming that the flares operate independently, find:
(a) The probability that at least one flare lights.
(b) The probability that exactly two flares light.
5.5 The proportion of people having the blood type O in a large southern city is .4. For two randomly selected donors,
(a) Find the probability of at least one type O.
(b) Find the expected number of type O.
(c) Repeat parts (a) and (b) if there are three donors.
5.6 A viral infection is spread by contact with an infected person. Let the probability that a healthy person gets the infection, in one contact, be p = .4.
(a) An infected person has contact with five healthy persons. Specify the distribution of X = No. of persons who contract the infection.
(b) Find P[X < 3], P[X = 0], and E(X).
5.7 The probability that a voter will believe a rumor about a politician is .2. If 20 voters are told individually, find the probability that:
(a) None of the 20 believes the rumor.
(b) Seven or more believe.
(c) Determine the mean and standard deviation of the number who believe.
5.8 A new driver, who did not take driver's education, has probability .8 of passing the driver's license exam. If tries are independent, find the probability that the driver (a) will not pass in two attempts, (b) will not pass in three attempts.
5.9 A fisherman has probability .12 of catching a legal sized muskie during each day of fishing. If the catches each day are independent, how many days must the fishing trip last to have an expected number caught greater than 1? (At most one catch is allowed per day.)
5.10 A school newspaper claims that 80% of the students support its view on a campus issue. A random sample of 20 students is taken, and 12 students agree with the newspaper. Find P[12 or less agree], if 80% support the view, and comment on the plausibility of the claim.
5.11
For the binomial distribution with n = 14 and p = .4, determine:
(a) P[4 ≤ X ≤ 9], (b) P[4 < X ≤ 9], (c) P[4 < X < 9], (d) E(X), (e) sd(X).
5.12 Using the binomial table:
(a) List the probability distribution for n = 5 and p = .4.
(b) Plot the probability histogram.
(c) Calculate E(X) and Var(X) from the entries in the list from part (a).
(d) Calculate E(X) = np and Var(X) = npq and compare your answer with part (c).
5.13 For fixed p, compare sd(X):
(a) for n = 9 with its value for n = 36,
(b) for sample size n with its value for sample size 4n.
5.14 Identify the null and the alternative hypotheses in terms of descriptive statements.
(a) An agronomist believes that plants grown from a new strain of seed are likely to be more resistant to a disease than an existing variety. He plans to expose both types of plants to the disease, count the number of incidences of the disease, and use the data to establish his conjecture.
(b) The research and development department of a cigarette company believes that the average tar content of a new blend of tobacco will be less than 5 milligrams per cigarette. This low-tar quality has a good market potential. Data collected from chemical analyses of several cigarettes are to be used to determine if there is strong support for this conjecture.
(c) Referring to (b), suppose that the cigarette company is now marketing its new brand with the claim "Average tar content: 5 milligrams per cigarette" printed on the packs. A consumer testing group suspects that the true average tar content of these cigarettes may be higher than the manufacturer's claim. The group intends to analyze several cigarettes and use the data it collects to challenge the company's claim.
(d) An inspector wants to establish that 2 x 4 lumber at a mill does not meet a specification that requires that at most 5% break under a standard load.
5.15 As of last year, only 20% of the employees in a large organization used public transportation to commute to and from work. To determine if a recent campaign encouraging the use of public transportation has been effective, a random sample of 25 employees is to be interviewed and the number of employees currently using public transportation X is to be recorded.
(a) Formulate the hypotheses in terms of p, the population proportion of employees currently using public transportation.
(b) What should the rejection region be if α is to be controlled below .1?
(c) For the rejection region chosen in (b), what is the level of significance?
5.16 Referring to the test in Exercise 5.15:
(a) If p = .4, what is the probability that the test will fail to reject H0?
(b) After interviewing 25 employees, suppose that 10 are currently using public transportation. What conclusion would you draw by using the test?
(c) What is the smallest α at which H0 could be rejected, given the data in (b)?
5.17 An advertisement manager for a radio station claims that over 30% of all young adults in the city listen to a weekend music program. To establish this conjecture, it is decided to test H0: p ≤ .3 against H1: p > .3. Let X = No. of young adults who listened last week out of a sample of size n = 16, and consider the test that rejects H0 if X > 8.
(a) Determine the level of significance.
(b) Evaluate β when p = .4.
(c) If 10 out of the 16 report they listened, what is the conclusion of your test?
(d) What is the smallest α at which H0 could be rejected, given the data in (c)?
5.18 In Exercise 5.17 what error could be made if p = .25? If p = .35?
5.19 Many psychological experiments are conducted to determine whether animals exhibit a preference between two rewards 1 and 2. In an experiment consisting of 7 trials with different animals, the results are
Trial
Preferred reward
Assume that the binomial model applies, and let X denote the number preferring Reward 2.
(a) State the null hypothesis of no preference.
(b) State the alternative hypothesis of preference for Reward 2.
(c) If the rejection region is X > 6, determine α and also the value of β at p = .8.
(d) Use this test to draw conclusions from the above data.
5.20 In testing the null hypothesis H0: p = .5 vs. the alternative H1: p ≠ .5 for a binomial model, the rejection region of a test has the structure: Reject H0 if X ≤ c1 or if X ≥ c2, where X is the number of successes in n trials. For each of the following tests, find the level of significance and the probability of a type II error at the alternative p = .3.
(a) n = ..., c1 = 1, c2 = ...
(b) n = ..., c1 = 2, c2 = ...
*5.21 Quality control. When the output of a production process is stable at an acceptable standard, it is said to be "in control." Suppose that a production process has been in control for some time and that the proportion of defectives has been .05. As a means of monitoring the process, the production staff will sample 15 items. Occurrence of ... or more defectives will be considered strong evidence for "out of control."
(a) Find α, the probability of signaling "out of control" when the process is at p = .05.
(b) Find β for p = .3, .4. Draw the power curve. (Note: In quality control operations, one usually plots the values of P[H0 is not rejected]. The resulting curve is called the operating characteristic (OC) curve.)
5.22 Referring to Exercise 5.21: Graph the power curve for the test with level α = .05, and compare with the power curve in Exercise 5.21(b).
*5.23 Again referring to the situation described in Exercise 5.21, determine a plan based on n = 25 with α < .04. Graph the power curve on the same paper you used to graph the power curve of the test in Exercise 5.21(b). What has increasing the sample size accomplished?
*5.24 Geometric distribution. Instead of performing a fixed number of Bernoulli trials, an experimenter performs trials until the first success occurs. The number of successes is now fixed at 1, but the number of trials Y is now random. It can assume any of the values 1, 2, 3, and so on with no upper limit.
(a) Show that f(y) = q^(y-1) p for y = 1, 2, ...
(b) Find the probability of 3 or fewer trials when p
*5.25 Poisson distribution for rare events. The Poisson distribution is often appropriate when the probability of an event (success) is small. It has served as a probability model for the number of plankton in a liter of water, the number of calls per hour to an answering service, and the number of earthquakes in a year. The Poisson distribution also approximates the binomial when the expected value np is small but n is large. The Poisson distribution with mean m has the form

$$f(x) = e^{-m} \frac{m^x}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$

where e is the exponential number or 2.718 (rounded) and x! is the number x(x - 1)(x - 2) ... 1 with 0! = 1. Given m = 3 and e^(-3) = .05, find: (a) P[X = 0], (b) P[X = 1].
5.26 Many computer packages produce binomial probabilities. The single MINITAB command
BINOMIAL FOR N = 5, P = .25
produces the output

BINOMIAL PROBABILITIES FOR N = 5 AND P = 0.250000
K    P(X = K)    P(X LESS OR = K)
0    0.2373      0.2373
1    0.3955      0.6328
2    0.2637      0.8965
3    0.0879      0.9844
4    0.0146      0.9990
5    0.0010      1.0000

Using the computer, calculate P[X
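For readers without access to MINITAB, the same table of individual and cumulative binomial probabilities can be produced in a few lines of Python. This is a minimal sketch rather than a reproduction of the MINITAB command; the function name `binomial_table` is made up for this illustration.

```python
from math import comb

def binomial_table(n, p):
    """Return rows (k, P[X = k], P[X <= k]) for k = 0, 1, ..., n."""
    rows = []
    cumulative = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p**k * (1 - p) ** (n - k)
        cumulative += prob  # running sum gives P[X <= k]
        rows.append((k, prob, cumulative))
    return rows

# Same setting as the MINITAB example in Exercise 5.26: n = 5, p = .25.
rows = binomial_table(5, 0.25)

print("K   P(X = K)   P(X LESS OR = K)")
for k, prob, cumulative in rows:
    print(f"{k}   {prob:.4f}     {cumulative:.4f}")
```

Changing the two arguments of `binomial_table` gives the table for any other n and p.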
CHAPTER 7
The Normal Distribution
1. PROBABILITY MODEL FOR A CONTINUOUS RANDOM VARIABLE
2. THE NORMAL DISTRIBUTION: ITS GENERAL FEATURES
3. STANDARD NORMAL DISTRIBUTION
4. PROBABILITY CALCULATIONS WITH NORMAL DISTRIBUTIONS
5. THE NORMAL APPROXIMATION TO THE BINOMIAL
*6. CHECKING THE PLAUSIBILITY OF A NORMAL MODEL
*7. TRANSFORMING OBSERVATIONS TO ATTAIN NEAR NORMALITY
The Normal Distribution
Graduate Record Examinations (GRE) are a main element used to predict an applicant's graduate school performance. These examinations, administered worldwide several times a year, provide graduate admissions committees with a common test score on which to base decisions. The GRE scores can also identify exceptional students deserving of fellowships or other special awards.

Verbal ability, GRE Aptitude Test
Mean = 447, Standard deviation = 120, based on 449,300 test scores
(Based on Table 1B, Guide to the Use of the Graduate Record Examinations 1981-1982. Educational Testing Service, Princeton, New Jersey.)
1. PROBABILITY MODEL FOR A CONTINUOUS RANDOM VARIABLE
Up to this point, we have limited our discussion to probability distributions of discrete random variables. Recall that a discrete random variable takes on only some isolated values, usually integers representing a count. We now turn our attention to the probability distribution of a continuous random variable, one that can ideally assume any value in an interval. Variables measured on an underlying continuous scale, such as weight, strength, life length, and temperature, have this feature.
Just as probability is conceived as the long-run relative frequency, the idea of a continuous probability distribution draws from the relative frequency histogram for a large number of measurements. The reader may wish to review Section 3.3 of Chapter 2 where grouping of data in class intervals and construction of a relative frequency histogram were discussed. We have remarked that with an increasing number of observations in a data set, histograms can be constructed with class intervals having smaller widths. We will now pursue this point in order to motivate the idea of a continuous probability distribution.
To focus the discussion let us consider that the weight (X) of a newborn baby is the continuous random variable of our interest. How do we conceptualize the probability distribution of X? Initially, suppose that the birth weights of 100 babies are recorded, the data grouped in class intervals of 1 pound, and the relative frequency histogram in Figure 1a is obtained. Recall that a relative frequency histogram has the properties:
(a) The total area under the histogram is 1.
(b) For two points a and b such that each is a boundary point of some class, the relative frequency of measurements in the interval a to b is the area under the histogram enclosed by this interval.
For example, Figure 1a shows that the interval 7.5-9.5 pounds contains a proportion .28 + .25 = .53 of the 100 measurements.
Next, we suppose that the number of measurements is increased to 5000 and they are grouped in class intervals of .25 pound. The resulting relative frequency histogram appears in Figure 1b. This is a refinement of the histogram 1a in that it is constructed from a larger set of observations and exhibits relative frequencies for finer class intervals. (Narrowing the class interval without increasing the number of observations would obscure the overall shape of the distribution.) The refined histogram 1b again has the properties (a) and (b) stated above. Proceeding in this manner, even further refinements of relative frequency histograms can be imagined with larger numbers of observations and smaller class intervals. In pursuing this conceptual argument, we ignore the difficulty that accuracy of the measuring device is limited.
In the course of refining the histograms, the jumps between consecutive rectangles tend to dampen out, and the top of the histogram approximates the shape of a smooth curve, as illustrated in Figure 1c. Because probability is interpreted as long-run relative frequency, the curve obtained as the limiting form of the relative frequency histograms represents the manner in which the total probability 1 is distributed over the interval of possible values of the random variable X. This curve is called the probability density curve of the continuous random variable X. The mathematical function f(x) whose graph produces this curve is called the probability density function of the continuous random variable X.
The properties (a) and (b) that we stated earlier for a relative frequency histogram are shared by a probability density curve that is, after all, conceived as a limiting smoothed form of a histogram. Also, since a histogram can never protrude below the x-axis, we have the further fact that f(x) is non-negative for all x.

Figure 1 Probability density curve viewed as a limiting form of relative frequency histograms.
(a) Relative frequency histogram of 100 birth weights with a class interval of 1 pound.
(b) Relative frequency histogram of 5000 birth weights with a class interval of .25 pound.
(c) Probability density curve for the continuous random variable X = birth weight.
The probability density function f(x) describes the distribution of probability for a continuous random variable. It has the properties:
(a) The total area under the probability density curve is 1.
(b) P[a ≤ X ≤ b] = area under the probability density curve between a and b.
(c) f(x) ≥ 0 for all x.
Unlike the description of a discrete probability distribution, the probability density f(x) for a continuous random variable does not represent the probability of [X = x]. Instead, a probability density function relates the probability of an interval [a, b] to the area under the curve in a strip over this interval. A single point x, being an interval with a width of 0, supports 0 area, so P[X = x] = 0.

With a continuous random variable, the probability that X = x is always 0. It is only meaningful to speak about the probability that X lies in an interval.
The deduction that the probability at every single point is zero needs some clarification. In the birth-weight example, the statement P[X = 8.5 pounds] = 0 seems shocking. Does this statement mean that no child can have a birth weight of 8.5 pounds? To resolve this paradox, we need to recognize that the accuracy of every measuring device is limited, so that here the number 8.5 is actually indistinguishable from all numbers in an interval surrounding it, say [8.495, 8.505]. Thus the question really concerns the probability of an interval surrounding 8.5, and the area under the curve is no longer 0.
When determining the probability of an interval a to b, we need not be concerned if either or both end points are included in the interval. Since the probabilities of X = a and X = b are both equal to 0,
P[a ≤ X ≤ b] = P[a < X ≤ b] = P[a ≤ X < b] = P[a < X < b]
In contrast, these probabilities may not be equal for a discrete distribution.
Fortunately, for important distributions, areas have been extensively tabulated. In most tables, the entire area to the left of each point is tabulated. To obtain the probabilities of other intervals, we must apply the following rules:
P[a < X < b] = (Area to left of b) - (Area to left of a)
P[b < X] = 1 - (Area to left of b)
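The two rules above translate directly into code once a function giving the area to the left of a point is available. The sketch below uses Python's `statistics.NormalDist` as a stand-in for a table of left areas. The birth-weight model (normal with mean 7.5 pounds and standard deviation 1.25 pounds) is purely an assumption for illustration and does not come from the text.

```python
from statistics import NormalDist

# Hypothetical model, assumed only for illustration: birth weight X in pounds
# is normal with mean 7.5 and sd 1.25. These numbers are not from the text.
birth_weight = NormalDist(mu=7.5, sigma=1.25)

def area_left(x):
    """Area under the density curve to the left of x (what most tables list)."""
    return birth_weight.cdf(x)

def prob_between(a, b):
    """P[a < X < b] = (area to left of b) - (area to left of a)."""
    return area_left(b) - area_left(a)

def prob_greater(b):
    """P[b < X] = 1 - (area to left of b)."""
    return 1 - area_left(b)

# A single point supports zero area ...
print(prob_between(8.5, 8.5))
# ... but the measurement interval [8.495, 8.505] around 8.5 does not.
print(prob_between(8.495, 8.505))
# Probability of a weight above the mean of this symmetric model.
print(prob_greater(7.5))
```

Because end points carry zero probability for a continuous random variable, the same functions serve for intervals with or without their end points.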
Specification of a Probability Model
A probability model for a continuous random variable is specified by giving the mathematical form of the probability density function. If a fairly large number of observations of a continuous random variable are available, we may try to approximate the top of the staircase silhouette of the relative frequency histogram by a mathematical curve. In the absence of a large data set, we may tentatively assume a reasonable model that may have been suggested by data from a similar source. Of course, any model obtained in this way must be closely scrutinized to verify that it conforms to the data at hand.
Figure 2 Different shapes of probability density curves. (a) Symmetry and deviations from symmetry. (b) Different peakedness. [Panel labels: Uniform (flat); Symmetric; Bell-shaped; Long tail (skewed).]

Features of a Continuous Distribution
As is true for relative frequency histograms, the probability density curves of continuous random variables could possess a wide variety of shapes. A few of these are illustrated in Figure 2. Many statisticians use the term skewed for a long tail in one direction.
A continuous random variable X also has a mean, or expected value E(X), as well as a variance and a standard deviation. Their interpretations are the same as in the case of discrete random variables, but their formal definitions involve integral calculus and are therefore not pursued here. However, it is instructive to see in Figure 3 that the mean μ = E(X) marks the balance point of the probability mass. The median, another measure of center, is the value of X that divides the area under the curve into halves.
Besides the median, we can also define the quartiles and other percentiles of a probability distribution. The quartiles for two distributions are shown in Figure 4.
The population 100pth percentile is an x value that supports area p to its left and 1 - p to its right.
Lower (first) quartile = 25th percentile
Second quartile (or median) = 50th percentile
Upper (third) quartile = 75th percentile
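Finding a percentile reverses the usual table lookup: we start from an area p and look for the x value with that area to its left. A short Python sketch of this idea follows, using the inverse of the left-area function from `statistics.NormalDist`. The model parameters (mean 7.5, sd 1.25) are again hypothetical and are chosen only so that there is a concrete curve to work with.

```python
from statistics import NormalDist

# Hypothetical continuous model (an assumption for illustration only).
model = NormalDist(mu=7.5, sigma=1.25)

def percentile(p):
    """The 100p-th percentile: the x value with area p to its left."""
    return model.inv_cdf(p)

lower_quartile = percentile(0.25)   # 25th percentile
median = percentile(0.50)           # 50th percentile
upper_quartile = percentile(0.75)   # 75th percentile

for name, x in [("lower quartile", lower_quartile),
                ("median", median),
                ("upper quartile", upper_quartile)]:
    # Check: the area to the left of each percentile is the stated p.
    print(f"{name}: x = {x:.3f}, area to left = {model.cdf(x):.2f}")
```

For this symmetric model the median coincides with the mean and the quartiles lie at equal distances on either side of it; for a skewed distribution neither statement need hold.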
Figure 3 Mean as the balance point and median as the point of equal division of the probability mass.

Figure 4 Quartiles of two continuous distributions.
Statisticians often find it convenient to convert random variables to a dimensionless scale. Suppose X, a real estate salesperson's commission for a month, has mean $4000 and standard deviation $500. Subtracting the mean produces the deviation X - 4000 measured in dollars. Then, dividing by the standard deviation, expressed in dollars, yields the dimensionless variable Z = (X - 4000)/500. Moreover, the standardized variable Z can be shown to have mean 0 and standard deviation 1. (See Appendix A.3 for details.)
The standardized variable

$$Z = \frac{X - \mu}{\sigma} = \frac{\text{variable} - \text{mean}}{\text{standard deviation}}$$

has mean 0 and sd 1.
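The standardization can be checked numerically. The sketch below first applies Z = (X - 4000)/500 to two commission amounts, and then standardizes every value of a small made-up population to confirm that the results have mean 0 and standard deviation 1. The population values and the function name `standardize` are hypothetical and serve only as an illustration.

```python
from statistics import fmean, pstdev

def standardize(x, mean, sd):
    """Z = (variable - mean) / (standard deviation)."""
    return (x - mean) / sd

# Commission example from the text: mean $4000, standard deviation $500.
z_high = standardize(4500, 4000, 500)   # one sd above the mean
z_low = standardize(3250, 4000, 500)    # one and a half sd below the mean
print(z_high, z_low)

# Hypothetical small population of monthly commissions (illustration only).
population = [3300, 3700, 4000, 4300, 4700]
mu = fmean(population)
sigma = pstdev(population)  # population standard deviation

z_values = [standardize(x, mu, sigma) for x in population]
print(fmean(z_values), pstdev(z_values))  # approximately 0 and 1
```

The z values carry no units: the dollars in the numerator and denominator cancel.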
EXERCISES
1.1 Which of the functions sketched in (a), (b), (c), and (d) could be a probability density function for a continuous random variable? Why or why not?
1.2 Determine the following probabilities from the curve f(x) diagramed in Exercise 1.1(a):
(a) P[0 < X < .5] (b) P[.5 < X < 1] (c) P[1.5 < X < 2] (d) P[X
1.3 For the curve f(x) graphed in Exercise 1.1(c), which of the two intervals [0
1.4 Determine the median and the quartiles for the probability distribution depicted in Exercise 1.1(a).
1.5 Determine the median and the quartiles for the curve depicted in Exercise 1.1(c).
1.6 Determine the 10th percentile of the curve in Exercise 1.1(a).
1.7 Find the standardized variable Z if X has:
(a) Mean 8 and sd 2.
(b) Mean 350 and sd 25.
(c) Mean 666 and variance 100.
2. THE NORMAL DISTRIBUTION: ITS GENERAL FEATURES
The normal distribution, which may already be familiar to some readers as the curve with the bell shape, is sometimes associated with the names of Pierre Laplace and Carl Gauss, who figured prominently in its historical development. Gauss derived the normal distribution mathematically as the probability distribution of the error of measurements, which he called the "normal law of errors." Subsequently, astronomers, physicists, and, somewhat later, data collectors in a wide variety of fields found that their histograms exhibited the common feature of first rising gradually in height to a maximum and then decreasing in a symmetric manner. Although the normal curve is not unique in exhibiting this form, it has been found to provide a reasonable approximation in a great many situations.
Unfortunately, at one time during the early stages of the development of statistics it had many overzealous admirers. Apparently, they felt that all real-life data must conform to the bell-shaped normal curve, or otherwise, the process of data collection should be suspect. It is in this context that the distribution became known as the normal distribution. However, scrutiny of data has often revealed inadequacies of the normal distribution. In fact, the universality of the normal distribution is only a myth, and examples of quite nonnormal distributions abound in virtually every field of study. Still, the normal distribution plays a central role in statistics, and inference procedures derived from it have wide applicability and form the backbone of current methods of statistical analysis.
Although we are speaking of the importance of the normal distribution, our remarks really apply to a whole class of distributions having bell-shaped densities. There is a normal distribution for each value of its mean μ and its standard deviation σ.
A normal distribution has a bell-shaped density† shown in Figure 5. It has
mean = μ
standard deviation = σ

Figure 5 Normal distribution. [Horizontal axis marked at μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ; area = .683 between μ - σ and μ + σ.]

The probability of the interval extending
one sd each side of mean: P[μ - σ < X < μ + σ] = .683
two sd each side of mean: P[μ - 2σ < X < μ + 2σ] = .954
three sd each side of mean: P[μ - 3σ < X < μ + 3σ] = .997
A few details of the normal curve merit special attention. The curve is symmetric about its mean μ, which locates the peak of the bell (see Figure 5). The interval running one standard deviation in each direction from μ has a probability of .683, the interval from μ - 2σ to μ + 2σ has a probability of .954, and the interval from μ - 3σ to μ + 3σ has a probability of .997. The curve never reaches 0 for any value of x, but because the tail areas outside (μ - 3σ, μ + 3σ) are very small, we usually terminate the graph at these points.

†The formula, which need not concern us, is

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad \text{for } -\infty < x < \infty$$

where π is the area of a circle having unit radius, or approximately 3.1416, and e is approximately 2.7183.

Figure 6 Two normal distributions with different means but with the same standard deviation.
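The three probabilities quoted for the one-, two-, and three-standard-deviation intervals do not depend on the particular values of μ and σ. The following Python sketch illustrates this by evaluating the intervals for two normal distributions, and it also codes the density formula from the footnote. The parameter pairs are arbitrary choices for illustration (the second borrows the GRE mean and standard deviation from the chapter opening, without claiming that GRE scores are exactly normal).

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_density(x, mu, sigma):
    """The normal density f(x) written out from the footnote formula."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sqrt(2 * pi) * sigma)

def prob_within(k, mu, sigma):
    """P[mu - k*sigma < X < mu + k*sigma] for a normal X."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Two arbitrary parameter choices (assumptions made for illustration).
for mu, sigma in [(0, 1), (447, 120)]:
    probs = [round(prob_within(k, mu, sigma), 3) for k in (1, 2, 3)]
    print(f"mu = {mu}, sigma = {sigma}: {probs}")

# The hand-coded density agrees with the library's density function,
# and it is highest at the mean, the peak of the bell.
print(normal_density(500, 447, 120), NormalDist(447, 120).pdf(500))
```

Both lines of output show the same three probabilities, which is why a single table for the standard normal distribution suffices.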
Notation
The normal distribution with a mean of μ and a standard deviation of σ is denoted by N(μ, σ).

Interpreting the parameters, we can see in Figure 6 that a change of mean from μ1 to a larger value μ2 merely slides the bell-shaped curve along the axis until a new center is established at μ2. There is no change in the shape of the curve. A different value for the standard deviation results in a different maximum height of the curve and changes the amount of the area in any fixed interval about μ (see Figure 7). The position of the center does not change if only σ is changed.

Figure 7 Decreasing
Figure 8 The standard normal curve. [Area = .683 between z = -1 and z = 1; area = .954 between z = -2 and z = 2.]
3. STANDARD NORMAL DISTRIBUTION
The particular normal distribution that has a mean of 0 and a standard deviation of 1 is called the standard normal distribution. It is customary to denote the standard normal variable by Z. The standard normal curve is illustrated in Figure 8.

The standard normal distribution has a bell-shaped density with
Mean μ = 0
Standard deviation σ = 1
The standard normal distribution is denoted by N(0, 1).
Use of the Standard Normal Table (Appendix B, Table 3)
The standard normal table in the appendix gives the area to the left of a specified value of z:
P[Z ≤ z] = Area under curve to the left of z
For the probability of an interval [a, b],
P[a ≤ Z ≤ b] = [Area to left of b] - [Area to left of a]
The following properties can be observed from the symmetry of the standard normal curve about 0 as exhibited in Figure 9:
(a) P[Z ≤ 0] = .5
(b) P[Z ≤ -z] = 1 - P[Z ≤ z] = P[Z ≥ z]

Figure 9 [Areas labeled P[Z ≤ -z] and 1 - P[Z ≤ z] on the standard normal curve.]
't Find PIZ < 1.371and PIZ > r.371. EXAMPLE From the normal table,we seethat the probability or areato the left of IIZ < I.371: .9147.More1.37is .9L47.(SeeTable1.) Consequently, > over,becauselZ 1.371is the complementof lZ < I.371, PIZ> 1.371: | - PIZ< 1.371- I - .9147: .0853 as we can seein Figure10.An altemative method is to use symmetry to show that PIZ > 1.371: PIZ < -L.371,which can be obtaineddirectly tl from the normal table.
Toble'l Howto Reodfrom Appendix B,Toble 3 for ='1.3 + .47 z = 11.37 .07 I I I I
Figure{0
EXAMPLE2
.9r47
Z .- 1.60], CalculateP[-.155 From Appendix B, Table 3 we see that
PIZ
1.601_ Area to left of 1.60 _ .9452
Chapter
-.155
Figure41
We interpolate2 between the entries for -.15 and -.16 to obtain
P IZ< -.l ssl Therefore,
4-.15s < Z < 1.501: .9452- .4384: .5068 which is the shadedarcain Figure I l.
n
EXAMPLE 3
Find $P[Z < -1.9 \text{ or } Z > 2.1]$.
The two events $[Z < -1.9]$ and $[Z > 2.1]$ are incompatible, so we add their probabilities:
$P[Z < -1.9 \text{ or } Z > 2.1] = P[Z < -1.9] + P[Z > 2.1]$
As indicated in Figure 12, $P[Z > 2.1]$ is the area to the right of 2.1, which is 1 - (Area to left of 2.1) = 1 - .9821 = .0179. The normal table gives $P[Z < -1.9] = .0287$ directly. Adding these two quantities,
$P[Z < -1.9 \text{ or } Z > 2.1] = .0287 + .0179 = .0466$
Figure 12
²Since z = -.155 is halfway between -.15 and -.16, the interpolated value is halfway between the table entries .4404 and .4364. The result is .4384.
EXAMPLE 4
Locate the value of z that satisfies $P[Z > z] = .025$.
Using the property that the total area is 1, the area to the left of z must be 1 - .0250 = .9750. The marginal value with the tabular entry .9750 is z = 1.96 (diagrammed in Figure 13).
Figure 13
EXAMPLE 5
Obtain the value of z for which $P[-z \le Z \le z] = .90$.
We observe from the symmetry of the curve that
$P[Z < -z] = P[Z > z] = .05$
From the normal table, we see that z = 1.65 gives $P[Z < -1.65] = .0495$ and z = 1.64 gives $P[Z < -1.64] = .0505$. Since .05 is halfway between these two probabilities, we interpolate between the two z-values to obtain z = 1.645 (see Figure 14).
Figure 14
Suggestion: The preceding examples illustrate the usefulness of a sketch to depict an area under the standard normal curve. A correct diagram shows how to combine the left side areas given in the normal table.
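Reading the table "backwards," from an area to a z-value, can be sketched in code as well. The snippet below is an illustration only; it assumes Python's standard `statistics.NormalDist` class, whose `inv_cdf` method returns the z-value with a given area to its left. The variable names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# P[Z > z] = .025 means the area to the left of z is 1 - .025 = .975.
z_upper = std_normal.inv_cdf(1 - 0.025)

# A right-tail area of .05 (and, by symmetry, a left-tail area of .05)
# means the area to the left of z is .95.
z_central = std_normal.inv_cdf(1 - 0.05)

print(round(z_upper, 2), round(z_central, 3))
```

The rounded results match the values 1.96 and 1.645 obtained from the table by lookup and interpolation.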
EXERCISES
3.1 Find the area under the standard normal curve to the left of
(a) z = 1.17 (b) z = .16 (c) z = 2.3 (d) z = 1.83
3.2 Find the area under the standard normal curve to the right of
(a) z = 1.17 (b) z = .60 (c) z = -1.13 (d) z
3.3 Find the area under the standard normal curve over the interval:
(a) z = -.65 to z = .65 (b) z = -1.04 to z = 1.04 (c) z = .32 to z = 2.65 (d) z = -.755 to z = 1.254 (interpolate)
3.4 Identify the z-values in the following diagrams of the standard normal distribution (interpolate, as needed).
[Diagrams for Exercise 3.4]
3.5 For a standard normal random variable Z
(a) Plz (c) Plz > I.5el (e) Pl- 1.2 (g) Pl- 1.62
find: (b) Plz < .421 - 1.6e1 (d) Plz (f) P[.0s Z (h) P|zl r.641
3.6 Find the z-value in each of the following cases:
(a) $P[Z < z] = .1735$ (b) $P[Z > z] = .10$ (c) $P[-z < Z < z] = .954$ (d) $P[-.6 < Z < z] = .50$
3.7 Find the quartiles of the standard normal distribution.
4. PROBABILITY CALCULATIONS WITH NORMAL DISTRIBUTIONS
Fortunately, no new tables are required for probability calculations regarding the general normal distribution. Any normal distribution can be set in correspondence to the standard normal by the following relation:
If X is distributed as $N(\mu, \sigma)$, then the standardized variable
$Z = \frac{X - \mu}{\sigma}$
has the standard normal distribution.
This property of the normal distribution allows us to cast a probability problem concerning X into one concerning Z. To find the probability that X lies in a given interval, convert the interval to the z-scale and then calculate the probability by using the normal table (Appendix B, Table 3).
EXAMPLE 6
Given that X has the normal distribution $N(60, 4)$, find $P[55 < X < 63]$.
Here the standardized variable is
$Z = \frac{X - 60}{4}$
The distribution of X is shown in Figure 15, where the z-scale is also displayed below the x-scale. In particular,
x = 55 gives $z = \frac{55 - 60}{4} = -1.25$
x = 63 gives $z = \frac{63 - 60}{4} = .75$
Figure 15
Therefore,
$P[55 < X < 63] = P[-1.25 < Z < .75]$
Using the normal table, we find $P[Z < .75] = .7734$ and $P[Z < -1.25] = .1056$, so the required probability is .7734 - .1056 = .6678.
The working steps employed in Example 6 can be formalized into the rule:
If X is distributed as $N(\mu, \sigma)$, then
$P[a \le X \le b] = P\left[\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right]$
where Z has the standard normal distribution.
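The rule can be traced step by step in a few lines of code. This is a sketch under stated assumptions: it uses Python's standard `statistics.NormalDist` in place of the printed table, the function name `interval_probability` is hypothetical, and the inputs are the mean 60 and standard deviation 4 that appear in the standardization of Example 6.

```python
from statistics import NormalDist

def interval_probability(a, b, mu, sigma):
    """P[a < X < b] for X distributed as N(mu, sigma), computed on the z-scale."""
    z_a = (a - mu) / sigma            # lower end converted to the z-scale
    z_b = (b - mu) / sigma            # upper end converted to the z-scale
    std_normal = NormalDist()         # standard normal, N(0, 1)
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

prob = interval_probability(55, 63, mu=60, sigma=4)
print(round(prob, 3))
```

The result agrees with the table-based answer up to the rounding of the table entries.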
EXAMPLE 7
The number of calories in a salad on the lunch menu is normally distributed with mean = 200 and sd = 5. Find the probability that the salad you select will contain:
(a) More than 208 calories.
(b) Between 190 and 200 calories.
Letting X denote the number of calories in the salad, we have the standardized variable
$Z = \frac{X - 200}{5}$
(a) The z-value corresponding to x = 208 is
$z = \frac{208 - 200}{5} = 1.6$
Therefore,
$P[X > 208] = P[Z > 1.6] = 1 - P[Z \le 1.6] = 1 - .9452 = .0548$
(b) The z-values corresponding to x = 190 and x = 200 are
$\frac{190 - 200}{5} = -2.0$ and $\frac{200 - 200}{5} = 0$
respectively. We calculate
$P[190 < X < 200] = P[-2.0 < Z < 0] = .5 - .0228 = .4772$
EXAMPLE 8
The raw scores in a national aptitude test are normally distributed with mean = 506 and sd = 81.
(a) What proportion of the candidates scored below 574?
(b) Find the 30th percentile of the scores.
Denoting the raw score by X, the standardized score
$Z = \frac{X - 506}{81}$
is distributed as $N(0, 1)$.
(a) The z-score corresponding to 574 is
$z = \frac{574 - 506}{81} = .8395$
So $P[X < 574] = P[Z < .8395] = .799$
Thus 79.9% or about 80% of the candidates scored below 574. In other words, the score 574 nearly locates the 80th percentile.
(b) We first find the 30th percentile in the z-scale and then convert it to the x-scale. From the standard normal table, we find
$P[Z < -.524] = .30$
The standardized score z = -.524 corresponds to
x = 506 + 81(-.524) = 463.56
Therefore, the 30th percentile score is about 463.6.
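Both parts of Example 8 can be reproduced without the table. The sketch below assumes Python's standard `statistics.NormalDist`; it is offered as a cross-check rather than as the text's procedure, and the variable names are hypothetical. Part (b) is computed two ways: directly on the x-scale, and by finding the percentile on the z-scale and converting with x = mean + sd × z.

```python
from statistics import NormalDist

scores = NormalDist(mu=506, sigma=81)   # raw-score distribution of Example 8

# (a) Proportion of candidates scoring below 574.
below_574 = scores.cdf(574)

# (b) 30th percentile, directly on the x-scale ...
percentile_30 = scores.inv_cdf(0.30)

# ... and via the z-scale, then converted back to the x-scale.
z_30 = NormalDist().inv_cdf(0.30)
manual = 506 + 81 * z_30

print(round(below_574, 3), round(z_30, 3), round(percentile_30, 1))
```

Because the code carries more decimal places for the z-value than the three-place value -.524, the percentile it reports can differ from 463.56 in the first decimal place.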
EXERCISES
4.1 If X is normally distributed with $\mu = 80$ and $\sigma = 4$, find:
(a) $P[X < 75]$ (b) $P[X < 87]$ (c) $P[X > 86]$ (d) $P[X > 71]$ (e) $P[73 < X < 89]$ (f) $P[81 < X < 84]$
4.2 If X has a normal distribution with $\mu = 150$ and $\sigma = 5$, find b such that:
(a) $P[X < b] = .975$ (b) $P[X > b] = .025$ (c) $P[X < b] = .305$
4.3 Scores on a certain nationwide college entrance examination follow a normal distribution with a mean of 500 and a standard deviation of 100. Find the probability that a student will score:
(a) Over 650. (b) Less than 250. (c) Between 325 and 675.
4.4 Refer to Exercise 4.3.
(a) If a school only admits students who score over 670, what proportion of the student pool would be eligible for admission?
(b) What limit would you set that makes 50% of the students eligible?
(c) What should be the limit if only the top 15% are to be eligible?
4.5 The error in measurements of blood sugar level by an instrument is normally distributed with a mean of .05 and a standard deviation of 1.5; that is, in repeated measurements, the distribution of the difference (recorded level - true level) is $N(.05, 1.5)$.
(a) What percentage of the measurements overestimate the true level?
(b) Suppose that an error is regarded as serious when the recorded value differs from the true value by more than 2.g. What percentage of the measurements will be in serious error?
(c) Find the 80th percentile of the error distribution.
4.6 The time for an emergency medical squad to arrive at the sports center at the edge of town is distributed as a normal variable with $\mu$
(a) Determine the probability that the time to arrive is:
(i) More than 22 minutes.
(ii) Between 13 and 21 minutes.
(iii) Between 15.5 and 18.5 minutes.
(b) Which arrival period of duration 1 minute is assigned the highest probability by the normal distribution?
5. THE NORMAL APPROXIMATION TO THE BINOMIAL
The binomial distribution, introduced in Chapter 6, pertains to the number of successes X in n independent trials of an experiment. When the success probability p is not too near 0 or 1, and the number of trials is large, the normal distribution serves as a good approximation to the binomial probabilities. Bypassing the mathematical proof, we concentrate on illustrating the manner in which this approximation works.
Figure 16 presents the binomial distribution for number of trials n = 5, 12, and 25 when p = .4. Notice how the distribution begins to assume the distinctive bell shape for increasing n. Even though the binomial distributions with p = .4 are not symmetric, the lack of symmetry becomes negligible for large n.
But how do we approximate the binomial probability
$P[X = x] = \binom{n}{x} p^x (1 - p)^{n - x}$
by a normal probability? The normal probability assigned to a single value x is zero. However, as shown in Figure 17, the probability assigned to the interval $x - \frac{1}{2}$ to $x + \frac{1}{2}$ is the appropriate comparison. The addition and subtraction of $\frac{1}{2}$ is called the continuity correction.
For n = 15 and p = .4, the binomial distribution assigns
$P[X = 7] = .177$
Recall from Chapter 6 that the binomial distribution has
mean = np = 15(.4) = 6
sd = $\sqrt{np(1 - p)} = \sqrt{15(.4)(.6)} = 1.897$
To obtain an approximation, we select the normal distribution with the same mean, $\mu = 6$, and same $\sigma = 1.897$. The normal approximation is then the probability assigned to the interval $7 - \frac{1}{2}$ to $7 + \frac{1}{2}$.
Figure 16 The binomial distributions for p = .4 and n = 5, 12, and 25.
$P[6.5 < X < 7.5] = P\left[\frac{6.5 - 6}{1.897} < \frac{X - 6}{1.897} < \frac{7.5 - 6}{1.897}\right] = P[.264 < Z < .791] = .1814$
Considering that n = 15 is small, the approximation .1814 is reasonable compared to the exact value .177. More importantly, the accuracy of the approximation increases with the number of trials n.
The normal approximation to the binomial applies when n is large and the success probability p is not too close to 0 or 1. The binomial probability of $[a \le X \le b]$ is approximated by the normal probability of $[a - \frac{1}{2} \le X \le b + \frac{1}{2}]$.
Figure 17 (The interval from $x - \frac{1}{2}$ to $x + \frac{1}{2}$ around the value x.)
The Normal Approximation to the Binomial
When np and n(1 - p) are both large, say greater than 15, the binomial distribution is well approximated by the normal distribution having mean = np and sd = $\sqrt{np(1 - p)}$. That is,
$Z = \frac{X - np}{\sqrt{np(1 - p)}}$ is approximately $N(0, 1)$.
EXAMPLE 9
Let X have a binomial distribution with p = .6 and n = 150. Approximate the probability that:
(a) X is between 82 and 101 inclusive.
(b) X is greater than 97.
(a) We calculate the mean and standard deviation of X.
mean = np = 150(.6) = 90
sd = $\sqrt{np(1 - p)} = \sqrt{150(.6)(.4)} = \sqrt{36} = 6$
The standardized variable is
$Z = \frac{X - 90}{6}$
The event $[82 \le X \le 101]$ includes both end points. The appropriate continuity correction is to subtract $\frac{1}{2}$ from the lower end and add $\frac{1}{2}$ to the upper end. We then approximate
$P[81.5 \le X \le 101.5] = P\left[\frac{81.5 - 90}{6} \le \frac{X - 90}{6} \le \frac{101.5 - 90}{6}\right] = P[-1.417 \le Z \le 1.917]$
which, from the normal table, is approximately .894.
(b) For $[X > 97]$ we reason that 97 is not included so that $[X \ge 97 + .5]$ or $[X \ge 97.5]$ is the event of interest.
$P[X \ge 97.5] = P\left[\frac{X - 90}{6} \ge \frac{97.5 - 90}{6}\right] = P[Z \ge 1.25] = 1 - .8944$
The normal approximation to the binomial gives $P[X > 97] \approx .1056$.
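The quality of the continuity-corrected approximation can be judged by setting it beside the exact binomial probability. The following sketch does this for the n = 15, p = .4 case discussed earlier in the section and for the right-tail event of Example 9. It assumes Python's standard `math.comb` and `statistics.NormalDist`; the helper names `binomial_pmf` and `normal_approx` are hypothetical.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Exact binomial probability P[X = k]."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_approx(lo, hi, n, p):
    """Approximate P[lo <= X <= hi] using the continuity correction."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    normal = NormalDist(mean, sd)
    return normal.cdf(hi + 0.5) - normal.cdf(lo - 0.5)

# n = 15, p = .4: P[X = 7], exact and approximate.
exact_7 = binomial_pmf(7, 15, 0.4)
approx_7 = normal_approx(7, 7, 15, 0.4)

# n = 150, p = .6: P[X > 97] = P[98 <= X <= 150], exact and approximate.
exact_tail = sum(binomial_pmf(k, 150, 0.6) for k in range(98, 151))
approx_tail = 1 - NormalDist(90, 6).cdf(97.5)

print(round(exact_7, 3), round(approx_7, 4))
print(round(exact_tail, 4), round(approx_tail, 4))
```

The first line reproduces the comparison of .177 with .1814; the second shows how much closer the two values are when n = 150.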
EXAMPLE 10
A large-scale survey conducted 5 years ago revealed that 30% of the adult population were regular users of alcoholic beverages. If this is still the current rate, what is the probability that in a random sample of 1000 adults the number of users of alcoholic beverages will be:
(a) less than 280?
(b) 316 or more?
Let X denote the number of users in a random sample of 1000 adults. Under the assumption that the proportion of the population who are users is .30, X has a binomial distribution with n = 1000 and p = .3. Since
np = 300, $\sqrt{np(1 - p)} = \sqrt{210} = 14.5$
the distribution of X is approximately $N(300, 14.5)$.
(a) Because X is a count, X < 280 is the same as X ≤ 279. Using the continuity correction, we have
$P[X \le 279] \approx P\left[Z \le \frac{279.5 - 300}{14.5}\right] = P[Z \le -1.414] = .0787$
(b)
$P[X \ge 316] \approx P\left[Z \ge \frac{315.5 - 300}{14.5}\right] = P[Z \ge 1.07] = 1 - .8577 = .1423$
Remark: If the object is to calculate binomial probabilities, today the best practice is to evaluate them directly using an established statistical computing package. The numerical details need not concern us. However, the fact that
$\frac{X - np}{\sqrt{np(1 - p)}}$ is approximately normal
when np and n(1 - p) are both large remains important. We will use it in later chapters when discussing inferences about proportions. Because the continuity correction will not be crucial, we will drop it for the sake of simplicity. Beyond this chapter, we will employ the normal approximation but without the continuity correction.
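As an illustration of direct evaluation, the sketch below computes the exact binomial probabilities for the survey setting of Example 10 (n = 1000, p = .30) and places them beside the continuity-corrected normal values. It is not a substitute for a statistical package: it assumes only Python's standard library, works on the log scale with `math.lgamma` so that the large binomial coefficients do not overflow, and uses the hypothetical helper name `binomial_pmf`.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Binomial probability P[X = k], computed on the log scale for large n."""
    log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_term)

n, p = 1000, 0.30

# Exact probabilities by summing the binomial distribution.
exact_a = sum(binomial_pmf(k, n, p) for k in range(0, 280))       # P[X <= 279]
exact_b = sum(binomial_pmf(k, n, p) for k in range(316, n + 1))   # P[X >= 316]

# Normal approximation with the continuity correction.
sd = math.sqrt(n * p * (1 - p))
approx_a = NormalDist(n * p, sd).cdf(279.5)
approx_b = 1 - NormalDist(n * p, sd).cdf(315.5)

print(round(exact_a, 4), round(approx_a, 4))
print(round(exact_b, 4), round(approx_b, 4))
```

The approximate values differ slightly from .0787 and .1423 because the code uses the unrounded standard deviation rather than 14.5.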
EXERCISES
5.1 Let the number of successes X have a binomial distribution with n = 25 and p = .5.
(a) Find the exact probabilities of each of the following: X = 17, 11 ≤ X ≤ 18, 11 < X < 18
(b) Apply the normal approximation to each situation in (a).
5.2 Let the number of successes X have a binomial distribution with p = .25 and n = 300. Approximate the probability of (a) X = 80, (b) X= 55, and (c) 68 < X < 89.
5.3 State whether or not the normal approximation to the binomial is appropriate in each of the following situations: (a) n = 500, p (c) n (e) n
5.4 Copy Figure 16 and add the standard score scale z underneath the x-axis for n = 5, 12, and 25. Notice how the distributions center on zero and most of the probability lies between z = -2 and z = 2.
5.5 In a large midwestern university, 30% of the students live in apartments. If 200 students are randomly selected, find the probability that the number of them living in apartments will be between 50 and 75 inclusive.
5.6 The unemployment rate in a city is 7.9%. A sample of 300 persons is selected from the labor force. Approximate the probability that:
(a) Less than 18 unemployed persons are in the sample.
(b) More than 30 unemployed persons are in the sample.
5.7 Suppose that 20% of the trees in a forest are infested with a certain type of parasite.
(a) What is the probability that, in a random sample of 300 trees, the number of trees having the parasite will be between 49 and 71 inclusive?
*(b) After sampling 300 trees, suppose that 72 trees are found to have the parasite. Does this contradict the hypothesis that the population proportion is 20%? Give reasons for your answer, using the terminologies of testing hypotheses.
*6. CHECKING THE PLAUSIBILITY OF A NORMAL MODEL
Does a normal distribution serve as a reasonable model for the population that produced the sample? One reason for our interest in this question is that many commonly used statistical procedures require the population be nearly normal. If a normal distribution is tentatively assumed to be a plausible model, the investigator must still check this assumption once the sample data are obtained.
Although they involve subjective judgment, graphical procedures prove most helpful in detecting serious departures from normality. Histograms can be inspected for lack of symmetry. The thickness of the tails can be checked for conformance with the normal by comparing the proportions of observations in the intervals $(\bar{x} - s, \bar{x} + s)$, $(\bar{x} - 2s, \bar{x} + 2s)$, and $(\bar{x} - 3s, \bar{x} + 3s)$ with those suggested by the empirical guidelines for the bell-shaped (normal) distribution.
A more effective way to check the plausibility of a normal model is to construct a special graph, called a normal-scores plot of the sample data. In order to describe this method, we will first explain the meaning of normal-scores, indicate how the plot is constructed, and then explain how to interpret the plot. For an easy explanation of the ideas, we work with a small sample size. In practical applications, at least 15 or 20 observations are needed to detect a meaningful pattern in the plot.
The term normal-scores refers to an idealized sample from the standard normal distribution, namely, the z-values that divide the standard normal distribution into equal-probability intervals. For purposes of discussion, suppose the sample size is n = 4. Figure 18 shows the standard normal distribution where four points are located on the z-axis so the distribution is divided into five segments of equal probability $\frac{1}{5} = .2$. These four points, denoted by $m_1$, $m_2$, $m_3$, and $m_4$, are precisely the normal-scores for a sample of size n = 4. Using Appendix B, Table 3, we find that
$m_1 = -.84$, $m_2 = -.25$, $m_3 = .25$, $m_4 = .84$
Figure 18 The standard normal distribution and the normal-scores $m_1$, $m_2$, $m_3$, $m_4$ for n = 4.
The steps in constructing a normal-scores plot are:
(a) Order the sample data from smallest to largest.
(b) Obtain the normal scores.
(c) Pair the ith largest observation with the ith largest normal-score, and plot the pairs in a graph.
EXAMPLE 11
To illustrate, suppose a random sample of size 4 has produced the observations 68, 82, 44, 75. The ordered observations and the normal-scores are shown in Table 2, and the normal-scores plot of the data is given in Figure 19.
Table 2
Normal Scores    Ordered Sample
$m_1 = -.84$     44
$m_2 = -.25$     68
$m_3 = .25$      75
$m_4 = .84$      82
Figure 19 Normal-scores plot of Table 2. (Horizontal axis: normal scores.)
Interpretation of the Plot
How does the normal-scores plot of a data set help in checking normality? To explain the main idea, we continue our discussion with the data of Example 11. Let $\mu$ and $\sigma$ denote the mean and standard deviation of the population from which the sample was obtained. The normal scores that are the idealized z-observations can then be converted to the x-scale by the usual relation $x = \mu + \sigma z$. The actual x-observations and the corresponding idealized observations are given in Table 3. If the population were indeed normal, we would expect the two columns of Table 3 to be close. In other words, a plot of the observed x-values versus the normal scores would produce a straight line pattern where the intercept of the line would indicate the value of $\mu$ and the slope of the line would indicate $\sigma$.
Table 3
Observed x-Values    Idealized x-Values
44                   $\mu + \sigma m_1$
68                   $\mu + \sigma m_2$
75                   $\mu + \sigma m_3$
82                   $\mu + \sigma m_4$
A straight line pattern in a normal-scores plot supports the plausibility of a normal model. A curved appearance indicates departure from normality.
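The construction steps for the plot can be sketched in code. The snippet below is an illustration under stated assumptions, not the book's software: the normal-scores are taken to be the z-values with left-hand areas i/(n + 1), which is the equal-probability definition given in this section, and they are computed with Python's standard `statistics.NormalDist`. The function name `normal_scores` is hypothetical; the four observations are those of Example 11.

```python
from statistics import NormalDist

def normal_scores(n):
    """z-values that divide the standard normal into n + 1 equal-probability segments."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]

sample = [68, 82, 44, 75]            # observations of Example 11
ordered = sorted(sample)             # step (a): order the data
scores = normal_scores(len(sample))  # step (b): obtain the normal scores
pairs = list(zip(scores, ordered))   # step (c): pair ith score with ith observation

for m, x in pairs:
    print(f"{m:6.2f}  {x}")
```

Plotting the second coordinate of each pair against the first gives the normal-scores plot; with only four points the pattern is hard to judge, which is why the text recommends at least 15 or 20 observations.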
The normal-scores plot of a data set is easily obtained using the computer. Use of the MINITAB package is illustrated here with the data set of Exercise 7.23, Chapter 2, concerning the strength measurements of southern pine. The command NSCORE is applied to Column 1, which contains the measurements, followed by the PLOT command. (See Exercise 8.23.) Notice that the plot in Figure 20 conforms quite well to the straight line pattern expected for normal observations (the numeral 2 indicates a double point). The largest five observations, however, give some evidence of being too large with respect to the others.
Figure 20 (MINITAB normal-scores plot: C1, from 2400 to 5400, on the vertical axis; normal scores, from -2.50 to 2.50, on the horizontal axis.)
*7. TRANSFORMING OBSERVATIONS TO ATTAIN NEAR NORMALITY
A valid application of many powerful techniques of statistical inference, especially those suited to small or moderate samples, requires that the population distribution be reasonably close to normal. When the sample measurements appear to have been taken from a population that departs drastically from normality, an appropriate conversion to a new variable may bring the distribution close to normal. Efficient techniques can then be safely applied to the converted data, whereas their application to the original data would have been questionable. Inferential methods requiring the assumption of normality are discussed in later chapters. The goal of our discussion here is to show how a transformation can improve the approximation to a normal distribution.
There is no rule for determining the best transformation in a given situation. For any data set that does not have a symmetric histogram, we consider a variety of transformations.
Some Useful Transformations
Make large values larger: $x^3$, $x^2$
Make large values smaller: $\sqrt{x}$, $\sqrt[4]{x}$, $\log_e x$, $\frac{1}{x}$
You may recall that $\log_e x$ is the natural logarithm. Fortunately, computers easily calculate and order the transformed values, so that several transformations in a list can be quickly tested. Note, however, that the observations must be positive if we intend to use $\sqrt{x}$, $\sqrt[4]{x}$, or $\log_e x$.
The selection of a good transformation is largely a matter of trial and error. If the data set contains a few numbers that appear to be detached far to the right, $\sqrt{x}$, $\sqrt[4]{x}$, $\log_e x$, or negative powers that would pull these stragglers closer to the other data points should be considered.
EXAMPLE 12
A forester records the volume of timber, measured in cords, for 49 plots selected in a large forest. The data are given in Table 4 and the corresponding histogram appears in Figure 21a. The histogram exhibits a long tail to the right, so it is reasonable to consider the transformations $\sqrt{x}$, $\sqrt[4]{x}$, $\log_e x$, and $1/x$. The most satisfactory result, obtained with
Transformed data = $\sqrt[4]{\text{Volume}}$
is illustrated in Table 5 and in Figure 21b. The latter histogram more nearly resembles a sample from a normal population.
Table 4 Volume of Timber in Cords
39.3   3.5   6.0   2.7   7.4   3.5  19.4  19.7   1.0   8.7
14.8   8.3  17.1  25.2   6.6   8.3  19.0  10.3   7.6  18.9
 6.3  10.0  16.8  24.3   5.2  44.8  14.1   3.4  28.3   3.4
  .9   1.3    .7  17.7   8.3   8.3   1.9  16.7  26.2  10.0
 6.5   7.1   7.9   3.2   5.9  13.4  12.0   4.3  31.7
Courtesy of Professor Alan Ek.
Table 5 The Transformed Data $\sqrt[4]{\text{Volume}}$
2.50  1.37  1.57  1.29  1.64  1.37  2.07  2.11  1.00  1.72
1.96  1.70  2.03  2.26  1.60  1.70  2.10  1.79  1.66  2.09
1.58  1.78  2.02  2.22  1.51  2.59  1.93  1.36  2.31  1.36
 .97  1.07   .91  2.05  1.70  1.70  1.17  2.02  2.26  1.78
1.60  1.63  1.68  1.34  1.56  1.91  1.86  1.44  2.37
Figure 21 An illustration of the transformation technique. (a) Histogram of timber volume (horizontal axis: volume in cords). (b) Histogram of $\sqrt[4]{\text{volume}}$.
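The fourth-root transformation of the timber data is easy to carry out by computer. The sketch below is illustrative only: it uses a handful of volumes taken from Table 4 rather than the full data set, the function name `fourth_root` is hypothetical, and the check for positive observations reflects the requirement noted in the text.

```python
volumes = [39.3, 3.5, 6.0, 1.0, 44.8]  # a few timber volumes from Table 4

def fourth_root(x):
    """Fourth root of a positive observation."""
    if x <= 0:
        raise ValueError("the fourth-root transformation needs positive observations")
    return x ** 0.25

transformed = [round(fourth_root(v), 2) for v in volumes]
print(transformed)

# The transformation pulls the large values in: the ratio of the largest
# to the smallest observation shrinks considerably after transforming.
ratio_before = max(volumes) / min(volumes)
ratio_after = max(transformed) / min(transformed)
print(round(ratio_before, 1), round(ratio_after, 2))
```

The transformed values can be compared with the corresponding entries of Table 5. Other transformations from the list, such as the square root or the natural logarithm, can be tried by swapping the function.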
KEY IDEAS
The probability distribution for a continuous random variable X is specified by a probability density curve.
The probability that X lies in an interval from a to b is determined by the area under the probability density curve between a and b. The total area under the curve is 1, and the curve is never negative.
The normal distribution has a symmetric bell-shaped curve centered at the mean. The intervals of one, two, and three standard deviations around the mean contain the probabilities .683, .954, and .997, respectively.
If X is normally distributed with mean = $\mu$ and sd = $\sigma$, then
$Z = \frac{X - \mu}{\sigma}$
has the standard normal distribution.
When the number of trials n is large, and the success probability p is not too near 0 or 1, the binomial distribution is well approximated by a normal distribution with mean = np and sd = $\sqrt{np(1 - p)}$. Specifically, the probabilities for a binomial variable X can be approximately calculated by treating
$Z = \frac{X - np}{\sqrt{np(1 - p)}}$
as standard normal.
The normal scores plot of a data set provides a diagnostic check for possible departure from a normal distribution.
Transformation of the measurement scale often helps to convert a long-tailed distribution to one that resembles a normal distribution.
8. EXERCISES
8.1 Determine the (a) median and (b) quartiles for the distribution shown in the following illustration.
[Illustration for Exercise 8.1: a probability density curve over x]
8.2 For X having the density in Exercise 8.1, find (a) $P[X > .8]$, (b) $P[.5 \le X \le .8]$, and (c) $P[.5 < X < .8]$.
8.3 In the context of scores on the Graduate Record Examination presented at the front of the chapter, describe the reasoning that leads from a histogram to the concept of a probability density curve. [Think of successive histograms based on 100 exam scores, 5000 scores, 449,800 scores, and then an unlimited number.]
8.4 For a standard normal random variable Z, find:
(a) $P[Z < 1.31]$ (b) $P[Z > 1.205]$ (c) $P[.67 < Z < 1.98]$ (d) $P[-1.82 < Z < 1.055]$
8.5 For the standard normal distribution, find the value z such that:
(a) Area to its left is .0969. (b) Area to its left is .12. (c) Area to its right is .2578. (d) Area to its right is .25.
8.6 Find the 20th, 40th, 60th, and 80th percentiles of the standard normal distribution.
8.7 If Z is a standard normal random variable, what is the probability that:
(a) Z exceeds -.72? (b) Z lies in the interval (-1.50, 1.50)? (c) |Z| exceeds 2.0? (d) |Z| is less than 1.0?
8.8 The distribution of raw scores in a college qualification test has mean = 582 and standard deviation = 75.
(a) If a student's raw score is 695, what is the corresponding standardized score?
(b) If the standardized score is -.8, what is the raw score?
(c) Find the interval of standardized scores corresponding to the raw scores of 380 to 560.
(d) Find the interval of the raw scores corresponding to the standardized scores of -1.2 to 1.2.
8.9 If X is normally distributed with $\mu = 100$ and $\sigma = 8$, find:
(a) $P[X < 107]$ (b) $P[X < 97]$ (c) $P[X > 110]$ (d) $P[X > 90]$ (e) $P[95 < X < 106]$ (f) $P[103 < X < 114]$ (g) $P[88 < X < 100]$ (h) $P[60 < X < 108]$
8.10 If X has a normal distribution with mean $\mu = 200$ and $\sigma = 5$, find b such that:
(a) P[X < b] (b) P[X > b] = .0110 (c) P[|X 200|
8.11 Suppose that a student's verbal score, X, from next year's Graduate Record Exam can be considered as an observation from a normal population having mean 497 and standard deviation 120. Find:
(a) $P[X > 500]$. (b) 90th percentile of the distribution. (c) Probability that the student scores below 400.
8.12 The lifting capacities of a class of industrial workers are normally distributed with mean = 65 pounds and sd = 10 pounds. What proportion of these workers can lift an 80-pound load?
8.13 The bonding strength of a drop of plastic glue is normally distributed with mean = 100 pounds and sd = 8 pounds. A broken plastic strip is repaired with a drop of this glue and then subjected to a test load of 98 pounds. What is the probability that the bonding will fail?
8.14 Grading on a curve. The scores on an examination are normally distributed with mean $\mu = 70$ and standard deviation $\sigma = 8$. Suppose that the instructor decides to assign letter grades according to the following scheme:
Scores          Grade
Less than 58    F
58 to 66        D
66 to 74        C
74 to 82        B
82 and above    A
Find the percentage of students in each grade category.
8.15 Suppose the duration of trouble-free operation of a new vacuum cleaner is normally distributed with mean = 530 days and sd = 100 days.
(a) What is the probability that the vacuum cleaner will work for at least 2 years without trouble?
(b) The company wishes to set the warranty period so that no more than 10% of the vacuum cleaners would need repair services while under warranty. How long a warranty period must be set?
8.16 An aptitude test administered to aircraft pilot trainees requires a series of operations to be performed in quick succession. Suppose that the time needed to complete the test is normally distributed with mean = 90 minutes and sd = 20 minutes.
(a) To pass the test, a candidate must complete it within 80 minutes. What percentage of the candidates will pass the test?
(b) If the top 5% of the candidates are to be given a certificate of commendation, how fast must a candidate complete the test to be eligible for a certificate?
*8.17 A property of the normal distribution. Suppose the random variable X is normally distributed with mean = $\mu$ and sd = $\sigma$. If Y is a linear function of X, that is, Y = a + bX, where a and b are constants, then Y is also normally distributed with
mean = $a + b\mu$
sd = $|b|\sigma$
For instance, if X is distributed as N(25, 2) and Y = 7 BX, then the distribution of Y is normal with mean sd
(a) At the "low" setting of a water heater, the temperature (X) of water is normally distributed with mean = 102°F and sd = 4°F. If Y refers to the temperature measurement in the centigrade scale, that is, $Y = \frac{5}{9}(X - 32)$, what is the distribution of Y?
(b) Referring to part (a), find the probability of [35 <
Remark: The relation between a general normal and the standard normal is only a special case of this property. Specifically, the standardized variable Z is the linear function
$Z = \frac{X - \mu}{\sigma} = -\frac{\mu}{\sigma} + \frac{1}{\sigma}X$
whose
mean = $-\frac{\mu}{\sigma} + \frac{1}{\sigma}\mu = 0$
sd = $\frac{1}{\sigma}\sigma = 1$
8.18 Let X denote the number of successes in n Bernoulli trials with a success probability of p.
(a) Find the exact probabilities of each of the following:
(i) X = 5 when n = 25, p = .4.
(ii) 11 < X < 17 when n = 20, p = .7.
(iii) X > 11 when n = 16, p = .5.
(b) Use a normal approximation for each situation in (a).
8.19 It is known from past experience that 9% of the tax bills are paid late. If 20,000 tax bills are sent out, find the probability that:
(a) Less than 1750 are paid late.
(b) 2000 or more are paid late.
8.20 A particular program, say program A, previously drew 30% of the television audience. To determine if a recent rescheduling of the programs on a competing channel has adversely affected the audience of program A, a random sample of 400 viewers is to be asked whether or not they currently watch this program.
(a) If the percentage of viewers watching program A has not changed, what is the probability that fewer than 105 out of a sample of 400 will be found to watch the program?
(b) If the number of viewers of the program is actually found to be less than 105, will this strongly support the suspicion that the population percentage has dropped?
8.21 The number of successes X has a binomial distribution. State whether or not the normal approximation is appropriate in each of the following situations: (a) n = 400, p = .28, (b) n = 20, p = .04, (c) n = 90, p = .99.
8.22 Because 10% of the reservation holders are "no-shows," a U.S. airline sells 400 tickets for a flight that can accommodate 370 passengers.
(a) Find the probability that one or more reservation holders will not be accommodated on the flight.
(b) Find the probability of fewer than 350 passengers on the flight.
*8.23 Normal Scores Plot. Use a computer program to make a normal scores plot for the timber data in Table 4. Comment on the departure from normality displayed by the normal scores plot. With the data set in Column 1, the normal scores plot is created by the commands
NSCORE C1 SET IN C2
PLOT C2 VS C1
*8.24 Transformations and normal scores plots. The MINITAB computer language makes it possible to easily transform data. With the data already set in Column 1, the commands
LOGE C1 PUT IN C4
SQRT C1 PUT IN C5
place the $\log_e$ in Column C4 and square root in C5, respectively. Take the square root of the timber data in Table 5 and then do a normal-scores plot.
Variation in Repeated Samples: Sampling Distributions
1. INTRODUCTION
2. THE SAMPLING DISTRIBUTION OF A STATISTIC
3. DISTRIBUTION OF THE SAMPLE MEAN AND THE CENTRAL LIMIT THEOREM
Bowlers are well aware that their three-game averages are less variable than their single-game scores.
1. INTRODUCTION
At the heart of statistics lie the ideas of inference. They enable the investigator to argue from the particular observations in a sample to the general case. These generalizations are founded on an understanding of the manner in which variation in the population is transmitted, by sampling, to variation in statistics like the sample mean. This key concept is the subject of this chapter.
Typically, we are interested in learning about some numerical feature of the population, such as the proportion possessing a stated characteristic, the mean and standard deviation of the population, or some other numerical measure of location or variability.
A numerical feature of a population is called a parameter.
The true value of a population parameter is an unknown constant. It can be correctly determined only by a complete study of the population. The concepts of statistical inference come into play whenever this is impossible or not practically feasible.
If we only have access to a sample from the population, our inferences about a parameter must then rest upon an appropriate sample-based quantity. While a parameter refers to some numerical characteristic of the population, a sample-based quantity is called a statistic.
A statistic is a numerical valued function of the sample observations.
For example, the sample mean
$\bar{X} = \frac{X_1 + \cdots + X_n}{n}$
is a statistic because its numerical value can be computed once the sample data, consisting of the values of $X_1, \ldots, X_n$, are available. Likewise, the sample median and the sample standard deviation are also sample-based quantities, so each is a statistic.
A sample-based quantity (statistic) must serve as our source of information about the value of a parameter. Two points are crucial:
Because a sample is only a part of the population, the numerical value of a statistic cannot be expected to give us the exact value of the parameter.
Moreover, the value of the statistic depends on the particular sample that happens to be selected. Over different occasions of sampling, there will be some variability in the values of the statistic.
A brief example will help illustrate these important points. Suppose an urban planner wishes to study the average commuting distance of workers from their home to principal place of business. Here the statistical population consists of the commuting distances of all the workers in the city. The mean of this finite but vast and unrecorded set of numbers is called the population mean, which we denote by $\mu$. We want to learn about the parameter $\mu$ by collecting data from a sample of workers. Suppose 80 workers are randomly selected and the (sample) mean of their commuting distances is found to be $\bar{x} = 8.3$ miles. Evidently, the population mean $\mu$ cannot be claimed to be exactly 8.3 miles. If one were to observe another random sample of 80 workers, would the sample mean again be 8.3 miles? Obviously, we do not expect the two results to be identical. Because the commuting distances do vary in the population of workers, the sample mean would also vary on different occasions of sampling. In practice, we observe only one sample and correspondingly a single value of the sample mean such as $\bar{x} = 8.3$.
However, it is the idea of the variability of the $\bar{X}$-values in repeated sampling that contains the clue to determining how precisely we can hope to determine $\mu$ from the information on $\bar{X}$.
Chapter 8
2. THE SAMPLING DISTRIBUTION OF A STATISTIC
The previous discussion leads to the important concept that the value of a statistic varies in repeated sampling. In other words, a statistic is itself a random variable and consequently has its own probability distribution. The variability of a statistic in repeated sampling is described by this probability distribution.
The probability distribution of a statistic is called its sampling distribution.
The qualifier "sampling" indicates that the distribution is conceived in the context of repeated sampling from a population. We often drop the qualifier and simply say the distribution of a statistic. Although in any given situation we are limited to one sample and the corresponding single value for a statistic, over different samples the statistic varies according to its sampling distribution. The sampling distribution of a statistic is determined from the distribution $f(x)$ that governs the population, and it also depends on the sample size $n$. Let us see how the distribution of $\bar{X}$ can be determined in a simple situation where the sample size is 2 and the population consists of 3 units.
EXAMPLE 1
A population consists of the three housing units, where the value of X, the number of rooms for rent in each unit, is shown in the illustration.
[Illustration: three housing units with 2, 3, and 4 rooms for rent.]
Consider drawing a random sample of size 2 with replacement. That is, we select a unit at random, put it back, and then select another unit at random. Denote by $X_1$ and $X_2$ the observation of $X$ obtained in the first and second drawing, respectively. We want to find the sampling distribution of $\bar{X} = (X_1 + X_2)/2$. The population distribution of $X$ is given in Table 1, which simply formalizes the fact that each of the $X$-values 2, 3, and 4 occurs in $\frac{1}{3}$ of the population of the housing units.
Table 1 The Population Distribution

x       2     3     4
f(x)    1/3   1/3   1/3
Because each unit is equally likely to be selected, the observation $X_1$ from the first drawing has the same distribution as given in Table 1. Since the sampling is with replacement, the second observation $X_2$ also has this same distribution.
The possible samples $(x_1, x_2)$ of size 2, and the corresponding values of $\bar{X}$ are

(x1, x2)         (2,2)  (2,3)  (2,4)  (3,2)  (3,3)  (3,4)  (4,2)  (4,3)  (4,4)
(x1 + x2)/2      2      2.5    3      2.5    3      3.5    3      3.5    4
Each of the nine possible samples is equally likely so, for instance, $P[\bar{X} = 2.5] = \frac{2}{9}$. Continuing in this manner, we obtain the distribution of $\bar{X}$, which is given in Table 2.
Table 2 The Probability Distribution of $\bar{X} = (X_1 + X_2)/2$

Value of X-bar    2     2.5   3     3.5   4
Probability       1/9   2/9   3/9   2/9   1/9
This sampling distribution pertains to repeated selection of random samples of size 2 with replacement. It tells us that if the random sampling is repeated a large number of times, then in about $\frac{1}{9}$ or 11% of the cases the sample mean would be 2, in $\frac{2}{9}$ or 22% of the cases it would be 2.5, and so on. Figure 1 shows the probability histograms of the distributions in Tables 1 and 2.

[Figure 1 Idea of a Sampling Distribution. (a) Population distribution. (b) Sampling distribution of $\bar{X} = (X_1 + X_2)/2$.]

In the context of Example 1, suppose instead the population consists of 300 housing units of which 100 units have 2 rooms, 100 units have 3 rooms, and 100 units have 4 rooms for rent. When sampling two units from this large population, it would make little difference whether or not we replace the unit after the first selection. Each observation would still have the same probability distribution, namely, $P[X = 2] = P[X = 3] = P[X = 4] = \frac{1}{3}$, which characterizes the population. When the population is very large, and the sample size relatively small, it is inconsequential whether or not a unit is replaced before the next unit is selected. Under these conditions, too, we refer to the observations as a random sample.
What are the key conditions required for a sample to be random? The observations $X_1, X_2, \ldots, X_n$ are a random sample of size $n$ from the population distribution if they result from independent selections and each has the same distribution as the population.
Because of variation in the population, random samples vary and so does the value of $\bar{X}$ or any other statistic. To illustrate the idea of a sampling distribution, with a minimum of calculation, we considered a small population with only three values of $X$ and a small sample size $n = 2$. The calculation gets more tedious and extensive when a population has many values of $X$ and $n$ is large. However, the procedure remains the same. Once the population and sample size are specified:
1. List all possible samples of size n.
2. Calculate the value of the statistic for each sample.
3. List the distinct values of the statistic obtained in step 2. Calculate the corresponding probabilities by identifying all the samples that yield the same value of the statistic.
We leave the more complicated cases to statisticians who can sometimes use additional mathematical methods to derive exact sampling distributions.
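The three-step procedure can be sketched in a few lines of Python for a small population. This is only an illustration, assuming equally likely units and sampling with replacement as in Example 1; the function names `sampling_distribution` and `sample_mean` are hypothetical helpers, not part of the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution(population, n, statistic):
    """Exact sampling distribution of `statistic` for samples of size n
    drawn with replacement from equally likely population units."""
    samples = list(product(population, repeat=n))        # step 1: list all samples
    counts = Counter(statistic(s) for s in samples)      # step 2: statistic per sample
    total = len(samples)
    # step 3: distinct values and the share of samples producing each one
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


def sample_mean(sample):
    """Sample mean as an exact fraction."""
    return Fraction(sum(sample), len(sample))


dist = sampling_distribution([2, 3, 4], 2, sample_mean)
for value, prob in dist.items():
    print(float(value), prob)
```

Fractions are printed in lowest terms, so the probability 3/9 appears as 1/3. Passing a different function in place of `sample_mean` (for example one returning the sample range) gives the sampling distribution of that statistic instead.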
Instead of a precise determination, one can turn to the computer in order to approximate a sampling distribution. The idea is to program the computer to actually draw a random sample and calculate the statistic. This procedure is then repeated a large number of times and a relative frequency histogram constructed from the values of the statistic. The resulting histogram will be an approximation to the sampling distribution. This approximation will be used in Example 3.
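A minimal sketch of this simulation idea, assuming the three-unit population of Example 1 and using Python's standard random number generator. The function name, the seed, and the choice of 10,000 repetitions are illustrative assumptions.

```python
import random
from collections import Counter


def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` random samples of size n (with replacement) and
    return the list of sample means."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    means = []
    for _ in range(reps):
        sample = [rng.choice(population) for _ in range(n)]
        means.append(sum(sample) / n)
    return means


means = simulate_sample_means([2, 3, 4], n=2, reps=10_000)
# Relative frequency of each observed value of the sample mean
rel_freq = {v: c / len(means) for v, c in sorted(Counter(means).items())}
for value, freq in rel_freq.items():
    print(f"{value:.1f}  {freq:.3f}")
```

The printed relative frequencies should lie close to the exact probabilities 1/9, 2/9, 3/9, 2/9, 1/9, and the agreement generally improves as the number of repetitions grows.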
EXERCISES
2.1 Identify each of the following as either a parameter or a statistic:
    sample standard deviation
    sample range
    population 10th percentile
    sample first quartile
    population median
2.2 From the set of numbers {1, 3, 5} a random sample of size 2 will be selected with replacement.
    (a) List all possible samples and evaluate $\bar{x}$ for each.
    (b) Determine the sampling distribution of $\bar{X}$.
2.3 A random sample of size 2 will be selected, with replacement, from the set of numbers {2, 4, 6}.
    (a) List all possible samples and evaluate $\bar{x}$ and $s^2$ for each.
    (b) Determine the sampling distribution of $\bar{X}$.
    (c) Determine the sampling distribution of $S^2$.
2.4 A consumer wants to study the size of strawberries for sale on the market. If a sample of size 4 is taken from the top of a basket, will the berry sizes $X_1, X_2, X_3, X_4$ be a random sample? Explain.
3. DISTRIBUTION OF THE SAMPLE MEAN AND THE CENTRAL LIMIT THEOREM
Insofar as the population mean represents the center of a population, statistical inference about this parameter is of prime practical importance. Not surprisingly, inference procedures are based on the sample mean
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
and its sampling distribution. Consequently, we now explore the basic properties of the sampling distribution of $\bar{X}$ and explain the role of the normal distribution as a useful approximation. In particular, we want to relate the sampling distribution of $\bar{X}$ to the population from which the random sample was selected. We denote:
Population mean = $\mu$
Population standard deviation = $\sigma$
The mean and standard deviation of the sampling distribution of $\bar{X}$ are then determined in terms of $\mu$ and $\sigma$. (The interested reader can consult Appendix A.3 for the details.)
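Mean and Standard Deviation of $\bar{X}$
The distribution of the sample mean, based on a random sample of size $n$, has
$$E(\bar{X}) = \mu \quad (= \text{population mean})$$
$$\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} \quad \left(= \frac{\text{population variance}}{\text{sample size}}\right)$$
$$\operatorname{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}} \quad \left(= \frac{\text{population standard deviation}}{\sqrt{\text{sample size}}}\right)$$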
The first result shows that the distribution of $\bar{X}$ is centered at the population mean $\mu$ in the sense that expectation serves as a measure of center of a distribution. Second, the standard deviation of $\bar{X}$ is the population standard deviation divided by the square root of the sample size. This latter result shows that the variability of the sample mean is governed by two factors: the population variability $\sigma$ and the sample size $n$. Large variability in the population induces large variability in $\bar{X}$, thus making the sample information about $\mu$ less dependable. However, this can be countered by choosing $n$ large. For instance, with $n = 100$, the standard deviation of $\bar{X}$ is $\sigma/\sqrt{100} = \sigma/10$, a tenth of the population standard deviation. With increasing sample size, the standard deviation $\sigma/\sqrt{n}$ decreases and the distribution of $\bar{X}$ tends to become more concentrated around the population mean $\mu$.
EXAMPLE 2
Calculate the mean and standard deviation for the population distribution given in Table 1 and for the distribution of $\bar{X}$ given in Table 2. Verify the relations $E(\bar{X}) = \mu$ and $\operatorname{sd}(\bar{X}) = \sigma/\sqrt{n}$. The calculations are performed in Table 3.
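Table 3

Population Distribution
x        f(x)    x f(x)    x^2 f(x)
2        1/3     2/3       4/3
3        1/3     3/3       9/3
4        1/3     4/3       16/3
Total    1       3         29/3

$$\mu = 3, \qquad \sigma^2 = \frac{29}{3} - (3)^2 = \frac{2}{3}$$

Distribution of $\bar{X} = (X_1 + X_2)/2$
x-bar    f(x-bar)    x-bar f(x-bar)    x-bar^2 f(x-bar)
2        1/9         2/9               4/9
2.5      2/9         5/9               12.5/9
3        3/9         9/9               27/9
3.5      2/9         7/9               24.5/9
4        1/9         4/9               16/9
Total    1           3                 84/9

$$E(\bar{X}) = 3 = \mu, \qquad \operatorname{Var}(\bar{X}) = \frac{84}{9} - (3)^2 = \frac{1}{3}$$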
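By direct calculation $\operatorname{sd}(\bar{X}) = 1/\sqrt{3}$. This is confirmed by the relation
$$\operatorname{sd}(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{\sqrt{2/3}}{\sqrt{2}} = \frac{1}{\sqrt{3}}$$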
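The arithmetic of Example 2 can be checked with exact fractions. The sketch below assumes the distributions of Tables 1 and 2 as inputs; `mean_and_variance` is a hypothetical helper name introduced only for this illustration.

```python
from fractions import Fraction
from math import isclose, sqrt


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    second_moment = sum(x * x * p for x, p in dist.items())
    return mean, second_moment - mean ** 2


third, ninth = Fraction(1, 3), Fraction(1, 9)

# Table 1: population distribution of X
population = {Fraction(2): third, Fraction(3): third, Fraction(4): third}
# Table 2: sampling distribution of the mean of two observations
xbar = {Fraction(2): ninth, Fraction(5, 2): 2 * ninth, Fraction(3): 3 * ninth,
        Fraction(7, 2): 2 * ninth, Fraction(4): ninth}

mu, sigma2 = mean_and_variance(population)
mean_xbar, var_xbar = mean_and_variance(xbar)

print("population:", mu, sigma2)
print("sample mean:", mean_xbar, var_xbar)
print("sd relation holds:", isclose(sqrt(var_xbar), sqrt(sigma2) / sqrt(2)))
```

Working in fractions avoids rounding, so the relations $E(\bar{X}) = \mu$ and $\operatorname{Var}(\bar{X}) = \sigma^2/n$ can be compared exactly rather than approximately.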
We now state two important results concerning the shape of the sampling distribution of $\bar{X}$. The first result gives the exact form of the distribution of $\bar{X}$ when the population distribution is normal:

$\bar{X}$ is normal when sampling from a normal population
In random sampling from a normal population with mean $\mu$ and standard deviation $\sigma$, the sample mean $\bar{X}$ has the normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

When sampling from a nonnormal population, the distribution of $\bar{X}$ depends on the particular form of the population distribution that prevails. A surprising result, known as the central limit theorem, states that when the sample size $n$ is large, the distribution of $\bar{X}$ is approximately normal, regardless of the shape of the population distribution. In practice, the normal approximation is usually adequate when $n$ is greater than 30.
Central Limit Theorem
Whatever the population, the distribution of $\bar{X}$ is approximately normal when $n$ is large
In random sampling from an arbitrary population with mean $\mu$ and standard deviation $\sigma$, when $n$ is large, the distribution of $\bar{X}$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Consequently,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \quad \text{is approximately } N(0, 1)$$
Whether the population distribution is continuous, discrete, symmetric, or asymmetric, the central limit theorem asserts that as long as the population variance is finite, the distribution of the sample mean $\bar{X}$ is nearly normal if the sample size is large. In this sense, the normal distribution plays a central role in the development of statistical procedures. A proof of the theorem requires higher mathematics. However, we can empirically demonstrate how this result works.
EXAMPLE 3 Demonstration of the Central Limit Theorem
Consider a population having a discrete uniform distribution that places a probability of .1 on each of the integers 0, 1, . . . , 9. This may be an appropriate model for the distribution of the last digit in telephone numbers or the first overflow digit in computer calculations. The line diagram of this distribution appears in Figure 2. By means of a computer, 100 random samples of size 5 were generated from this distribution, and $\bar{x}$ was computed for each sample. The results of this repeated random sampling are presented in Table 4. The relative frequency histogram in Figure 3 is constructed from the 100 observed values of $\bar{x}$. Although the population distribution (Figure 2) is far from normal, the top of the histogram of the $\bar{x}$ values (Figure 3) has the
[Figure 2 Uniform Distribution on the Integers 0, 1, . . . , 9.]
Table 4 Samples of Size 5 from a Discrete Uniform Distribution
Sample Number
1 2 3 4 5 6 7 8 9 10 11 T2 l3 I4 15 I6 T7 l8
r9 20 2I 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 4I 42 43 44 45 46 47 48 49 50
Observations
Sum
Mean ,
4,7,9,0,6 7,3,7,7,4 0,4,6,9,2 7,6, r,9,1 9,0,2,9,4 9,4,9,4,2 7,4,2, r, 6 4,4,7,7,9 9,7,6,0,5 7,9,r,0,6 r, 3, 6,5,7 3,7,5,3,2 5,6,6,5,0 9,9,6,4,1 0,0,9,5,7
26 28 2T 24 24 28 20 31 26 23 22 20 22 29 2L 2T r9 22 26 29 r6 T4 25 25 18 I6 27 27 r9 r9 T7 29 2l 26 22 2A t4 28 25 1B 28 25 23 23 T6 35 22 25 T7 T7
5.2 5.6 4.2 4.8 4.8 5.6 4.O 6.2 5.2 4.6 4.4 4.0 4.4 5.8 4.2 4.2 3.8 4.4 5.2 5.8 3.2 2.8 5.2 5.0 3.6 3.2 5.4 5.4 3.8 3.8 3.4 5.8 4.2 5.2 4.4 4.O 2.8 5.6 5.0 3.6 5.6 5.0 4.6 4.6 3.2 7.O 4.4 5.0 3.4 3.4
4 , 9 ,r , r , 6 9,4, l,l, 4 6,4,2,7,3 9,4,4, r,8 8,4,6,8,3 5,2, 2, 6, I 2 , 2 , 9 , 1 ,0 I, 4,5, 8,g 8, l, 6,3,7
r , 2 , o ,g ,6
8,5,3,0,0 9,5,8,5,0 8,9, l, r,8 9,0,7,4,o 6, 5, 5,3,0 4, 6,4,2, I 7,8,3,6,5 4,2,9, 5,2 7,1,9,0,9 5,8,4,1,4 6,4,4,5,1
4 , 2 ,r , r , 6 4,7,5,5,7 9,0,5,9,2 3, r, 5,4,5 9,8, 6,3,2 9,4,2,2,9 8,4,7,2,2 0, 7,3,4,9 0,2,7,5,2 7, r, g, g, g 4,0, 5, g, 4 5,8,6,3,3 4, 5,0, 5,3 7,7,2,O,1
Sample Number
5l 52 53 54 55 56 57 58
s9 60 6T 62 63 64 65 66 67 68 69 70 7I 72 73 74
7s 76 77 78 79 80 81 82 B3 B4 B5 86 87 88 B9 90 9T 92 93 94
9s 96 97 98 99 r00
Observations
Sum
4,7,3,9,9 2,0,3,3,2 4,4,2,6,3 r,6,4,0,6 2, 4, 5,8, 9 1,5,5,4,0 3,7, 5,4,3 3, 7, 0, 7, 6 4,9,9,5,9 6,7,9,2,9 7,3,6,3,6 7,4,6,0,1 7,9,9,7,5 8,0,6,2,7 6,5,3,6,2 5,0,5,2,9 2,9,4,9, I 9,5,2,2,6 0,1,4,4,4 5,4,0, 5,2 r, l, 4,2,o 9,5,4,5,9 7,r,6,6,9 3,5, 0,0, 5 3,7,7,3,5 7,4,7,6,2 8, r,0, g,l 6,4,7,9,3 7,7,6,9,7 9,4,2,9,9 3, 3,3,3, 3 g, 7, 7, 0, 3 5,3,2,L,1 4,4,5,2,6 3, 7, 5, 4, I 7,4,5,9,9 3,2,9,0,5 4,6,6,3,3 1 , 0 ,g , 3 , 7 2,9,6,9,5 4,9,0,7,6 5,6,7,6,3 3, 6,2,5,6 0, r, l, g, 4 3, 6, 6,4,5 9,2,9,9,6 2,0,0,6,9 o,4,5,0,5 0,3,7,3,9 2,5,0,0, 7
30 10 r9 r7 28 t5 22 23 35 32 25 l8 37 23 22 2l 25 24 13 L6 8 32 29 13 25 26 T9 29 36 33 l5 25 T2
r7 20 33 I9 22 20 30 25 27 22 t4 24 34 r6 I4 22 T4
Mean x
6.0 2.0 3.8 3.4 5.6 3.0 4.4 4.6 7.0 6.4
s.0 3.6 7.4 4.5 4.4 4.2
s.0 4.8 2.6 3.2 r.6 6.4 5.8 2.6 5.0 5.2 3.8 5.8 7.2 6.6 3.0 5.0 2.4 3.4 4.0 6.6 3.8 4.4 4.0 6.0 5.0 5.4 4.4 2.8 4.8 6.8 3.2 2.8 4.4 2.8
[Figure 3 Relative frequency histogram of the $\bar{x}$ values recorded in Table 4.]
appearance of a bell-shaped curve, even for the small sample size of 5. For larger sample sizes, the normal distribution would give an even closer approximation. It might be interesting for the reader to collect samples by reading the last digits of numbers from a telephone directory and then to construct a histogram of the $\bar{x}$ values.

Another graphic example of the central limit theorem appears in Figure 4, where the population distribution represented by the solid curve is a continuous asymmetric distribution with $\mu = 2$ and $\sigma = 1.41$. The distributions of the sample mean $\bar{X}$ for sample sizes $n = 3$ and $n = 10$ are plotted as dashed curves on the graph. These indicate that with increasing $n$, the distributions become more concentrated around $\mu$ and look more like the normal distribution.

[Figure 4 Distributions of $\bar{X}$ for $n = 3$ and $n = 10$ in sampling from a skewed population.]
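The behavior pictured for a skewed population can be explored by simulation. The sketch below is an assumption-laden illustration: the text does not name the skewed population, so a gamma distribution with shape 2 and scale 1 is used here only because it has mean 2 and standard deviation $\sqrt{2} \approx 1.41$. The helper names, seed, and number of repetitions are likewise hypothetical.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Sample skewness: average cubed standardized deviation (0 for a symmetric shape)."""
    m = fmean(values)
    s = pstdev(values, m)
    return fmean([((v - m) / s) ** 3 for v in values])


def sample_means(n, reps, rng):
    """Means of `reps` random samples of size n from the assumed skewed population
    (gamma with shape 2 and scale 1: mean 2, standard deviation about 1.41)."""
    return [fmean([rng.gammavariate(2.0, 1.0) for _ in range(n)]) for _ in range(reps)]


rng = random.Random(1)
results = {n: sample_means(n, 20_000, rng) for n in (1, 3, 10)}

for n, means in results.items():
    print(f"n={n:2d}  mean={fmean(means):.3f}  sd={pstdev(means):.3f}  "
          f"sigma/sqrt(n)={(2 / n) ** 0.5:.3f}  skewness={skewness(means):.3f}")
```

As $n$ increases, the simulated standard deviation should track $\sigma/\sqrt{n}$ and the skewness should shrink toward 0, which is the numerical counterpart of the dashed curves becoming narrower and more symmetric.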
EXAMPLE 4
Consider a population with mean 82 and standard deviation 12.
(a) If a random sample of size 64 is selected, what is the probability that the sample mean will lie between 80.8 and 83.2?
(b) With a random sample of size 100, what is the probability that the sample mean will lie between 80.8 and 83.2?

(a) We have $\mu = 82$ and $\sigma = 12$. Since $n = 64$ is large, the central limit theorem tells us that the distribution of $\bar{X}$ is approximately normal with
$$\text{mean} = \mu = 82$$
$$\text{standard deviation} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{64}} = 1.5$$
To calculate $P[80.8 < \bar{X} < 83.2]$, we convert to the standardized variable
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - 82}{1.5}$$
The z-values corresponding to 80.8 and 83.2 are
$$\frac{80.8 - 82}{1.5} = -.8 \quad \text{and} \quad \frac{83.2 - 82}{1.5} = .8$$
Consequently,
$$P[80.8 < \bar{X} < 83.2] = P[-.8 < Z < .8] = .7881 - .2119 = .5762$$
(b) We now have $n = 100$, so $\sigma/\sqrt{n} = 12/\sqrt{100} = 1.2$ and
$$Z = \frac{\bar{X} - 82}{1.2}$$
Therefore,
$$P[80.8 < \bar{X} < 83.2] = P\left[\frac{80.8 - 82}{1.2} < Z < \frac{83.2 - 82}{1.2}\right] = P[-1.0 < Z < 1.0] = .8413 - .1587 = .6826$$
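The standardization steps of Example 4 can be reproduced numerically. This is a sketch under the normal approximation; `normal_cdf` and `prob_mean_between` are hypothetical helper names, and the standard normal probabilities are computed from the error function rather than read from a table.

```python
from math import erf, sqrt


def normal_cdf(z):
    """P[Z <= z] for a standard normal variable Z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_mean_between(lo, hi, mu, sigma, n):
    """Normal-approximation probability that the sample mean lies between lo and hi."""
    se = sigma / sqrt(n)                      # standard deviation of the sample mean
    return normal_cdf((hi - mu) / se) - normal_cdf((lo - mu) / se)


p_a = prob_mean_between(80.8, 83.2, mu=82, sigma=12, n=64)    # part (a)
p_b = prob_mean_between(80.8, 83.2, mu=82, sigma=12, n=100)   # part (b)
print(round(p_a, 4), round(p_b, 4))
```

Values obtained this way can differ from table lookups in the last decimal place because a printed normal table rounds each entry before the subtraction. The larger probability for $n = 100$ reflects the smaller standard deviation of $\bar{X}$.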
EXAMPLE 5
Suppose that the population distribution of the gripping strengths of industrial workers is known to have a mean of 110 and standard deviation of 10. For a random sample of 75 workers, what is the probability that the sample mean gripping strength will be:
(a) Between 109 and 112?
(b) Greater than 111?

Here the population mean and the standard deviation are $\mu = 110$ and $\sigma = 10$, respectively. The sample size $n = 75$ is large, so the central limit theorem ensures that the distribution of $\bar{X}$ is approximately normal with
$$\text{Mean of } \bar{X} = 110$$
$$\text{Standard deviation of } \bar{X} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{75}} = 1.155$$
(a) To find $P[109 < \bar{X} < 112]$ we convert to the standardized variable
$$Z = \frac{\bar{X} - 110}{1.155}$$
and calculate the z-values
$$\frac{109 - 110}{1.155} = -.866, \qquad \frac{112 - 110}{1.155} = 1.732$$
The required probability is
$$P[109 < \bar{X} < 112] = P[-.866 < Z < 1.732] = .958 - .193 = .765$$
(b)
$$P[\bar{X} > 111] = P\left[Z > \frac{111 - 110}{1.155}\right] = P[Z > .866] = 1 - P[Z \le .866] = 1 - .807 = .193$$
A natural question that arises is how large $n$ should be for the normal approximation to be used for the distribution of $\bar{X}$. The nature of the approximation depends on the extent to which the population distribution deviates from a normal form. If the population distribution is normal, then $\bar{X}$ is exactly normally distributed for all $n$, small or large. As the population distribution increasingly departs from normality, larger values of $n$ are required for a good approximation. Ordinarily, $n > 30$ provides a satisfactory approximation.
EXERCISES
3.1 A population has mean 99 and standard deviation 7. Calculate $E(\bar{X})$ and $\operatorname{sd}(\bar{X})$ for (a) sample size 4 and (b) sample size 25.
3.2 A population has standard deviation 10. What is the standard deviation of $\bar{X}$ for a random sample of size (a) n = … 400?
3.3 Using the sampling distribution determined for $\bar{X} = (X_1 + X_2)/2$ in Exercise 2.2, verify that $E(\bar{X}) = \mu$ and $\operatorname{sd}(\bar{X}) = \sigma/\sqrt{2}$.
3.4 Using the sampling distribution determined for $\bar{X} = (X_1 + X_2)/2$ in Exercise 2.3, verify that $E(\bar{X}) = \mu$ and $\operatorname{sd}(\bar{X}) = \sigma/\sqrt{2}$.
3.5 A normal population has $\mu = 27$ and $\sigma$ = … determine the (a) mean of $\bar{X}$, (b) standard deviation of $\bar{X}$, and (c) distribution of $\bar{X}$.
3.6 Suppose that the moisture content per pound of a dehydrated protein concentrate is normally distributed with a mean of 8.5 and a standard deviation of .5. A random sample of 16 specimens, each consisting of one pound of this concentrate, is to be tested. Let $\bar{X}$ denote the sample mean of these measurements of moisture content.
    (a) What is the distribution of $\bar{X}$? Is it the exact or an approximate distribution?
    (b) What is the probability that:
        (i) $\bar{X}$ will exceed 8.7?
        (ii) $\bar{X}$ will be between 8.34 and 8.66?
3.7 The distribution of personal income, of persons working in a large Eastern city, has $\mu$ = \$31,000 and $\sigma$ = \$5000.
    (a) What is the approximate distribution for $\bar{X}$ based on a random sample of 100 persons?
    (b) Evaluate $P[\bar{X} > 31{,}500]$.
3.8 A random sample of size 100 is taken from a population having a mean of 20 and a standard deviation of 5. The shape of the population distribution is unknown.
    (a) What can you say about the probability distribution of the sample mean $\bar{X}$?
    (b) Find the probability that $\bar{X}$ will exceed 20.75.
3.9 How does the central limit theorem express the fact that the probability distribution of $\bar{X}$ concentrates more and more probability near $\mu$ as $n$ increases?
KEY IDEAS
A parameter is a numerical characteristic of the population. It is a constant although its value is typically unknown to us. The object of a statistical analysis of sample data is to learn about the parameter.
A numerical characteristic of a sample is called a statistic. The value of a statistic varies in repeated sampling.
Random sampling from a population refers to independent selections where each observation has the same distribution as the population.
When random sampling from a population, a statistic is a random variable. The probability distribution of a statistic is called its sampling distribution.
The sampling distribution of $\bar{X}$ has mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, where $\mu$ = population mean, $\sigma$ = population standard deviation, and $n$ = sample size.
With increasing $n$, the distribution of $\bar{X}$ is more concentrated around $\mu$.
If the population distribution is normal $N(\mu, \sigma)$, the distribution of $\bar{X}$ is $N(\mu, \sigma/\sqrt{n})$.
Regardless of the shape of the population distribution, the distribution of $\bar{X}$ is approximately $N(\mu, \sigma/\sqrt{n})$ provided $n$ is large. This result is called the central limit theorem.
4. EXERCISES
4.1 A population consists of the four numbers {2, 4, 6, 8}. Consider drawing a random sample of size 2 with replacement.
    (a) List all possible samples and evaluate $\bar{x}$ for each.
    (b) Determine the sampling distribution of $\bar{X}$.
    (c) Write down the population distribution and calculate its mean $\mu$ and standard deviation $\sigma$.
    (d) Calculate the mean and standard deviation of the sampling distribution of $\bar{X}$ obtained in part (b), and verify that these agree with $\mu$ and $\sigma/\sqrt{2}$, respectively.
4.2 Refer to Exercise 4.1, and instead of $\bar{X}$ consider the statistic
        sample range R = largest observation − smallest observation
    For instance, if the sample observations are (2, 6), the range is 6 − 2 = 4.
    (a) Calculate the sample range for all possible samples.
    (b) Determine the sampling distribution of R.
4.3 Consider random sampling from a population that has mean 550 and standard deviation 70. Find the mean and standard deviation of $\bar{X}$ for
    (a) Sample size 15.
    (b) Sample size 160.
4.4 What sample size is required in order that the standard deviation of $\bar{X}$ be:
    (a) … of the population standard deviation?
    (b) … of the population standard deviation?
    (c) 15% of the population standard deviation?
4.5 Suppose a population distribution is normal with mean 80 and standard deviation 10. For a random sample of size $n = 9$:
    (a) What are the mean and standard deviation of $\bar{X}$?
    (b) What is the distribution of $\bar{X}$? Is this distribution exact or approximate?
    (c) Find the probability that $\bar{X}$ lies between 76 and 84.
4.6 The weights of pears in an orchard are normally distributed with mean .32 pound and standard deviation .08 pound.
    (a) If one pear is selected at random, what is the probability that its weight will be between .28 and .34 pound?
    (b) If $\bar{X}$ denotes the average weight of a random sample of 4 pears, what is the probability that $\bar{X}$ will be between .28 and .34 pound?
4.7 Suppose that the size of pebbles in a river bed is normally distributed with a mean of 12.1 and standard deviation 3.2. A random sample of 9 pebbles will be measured. Let $\bar{X}$ denote the average size of the sampled pebbles.
    (a) What is the distribution of $\bar{X}$?
    (b) What is the probability that $\bar{X}$ is smaller than 10?
    (c) What percentage of the pebbles in the river bed are of size smaller than 10?
4.8 A random sample of size 150 is taken from a population that has a mean of 60 and a standard deviation of 8. The population distribution is not normal.
    (a) Is it reasonable to assume a normal distribution for the sample mean $\bar{X}$? Why or why not?
    (b) Find the probability that $\bar{X}$ lies between 59 and 61.
    (c) Find the probability that $\bar{X}$ exceeds 62.
4.9 The distribution for the time it takes a student to complete the Fall class registration has a mean of 94 minutes and a standard deviation of 10 minutes. For a random sample of 81 students:
    (a) Determine the mean and standard deviation of $\bar{X}$.
    (b) What can you say about the distribution of $\bar{X}$?
4.10 Refer to Exercise 4.9. Evaluate:
    (a) $P[\bar{X} > 96]$
    (b) $P[92.3 < \bar{X} < 96]$
    (c) $P[\bar{X} < 95]$.
4.11 The mean and standard deviation of the strength of a packaging material are 55 pounds and 2 pounds, respectively. If 40 specimens of this material are tested:
    (a) What is the probability that the sample mean strength $\bar{X}$ will be between 54 and 55 pounds?
    (b) Find the interval centered at 55, where $\bar{X}$ will lie with probability .95.
4.12 Consider a random sample of size $n = 100$ from a population that has a standard deviation of $\sigma = 20$.
    (a) Find the probability that the sample mean $\bar{X}$ will lie within 2 units of the population mean, that is, $P[-2 < \bar{X} - \mu < 2]$.
    (b) Find the number $k$ so that $P[-k \le \bar{X} - \mu \le k] = .90$.
    (c) What is the probability that $\bar{X}$ will differ from $\mu$ by more than 4 units?
CLASS PROJECTS
1. (a) Count the number of occupants $X$ including the driver in each of 20 passing cars. Calculate the mean $\bar{x}$ of your sample.
   (b) Repeat (a) 10 times.
   (c) Collect the data sets of the individual car counts $x$ from the entire class and construct a relative frequency histogram.
   (d) Collect the $\bar{x}$ values from the entire class (10 from each student) and construct a relative frequency histogram for $\bar{X}$ choosing appropriate class intervals.
   (e) Plot the two relative frequency histograms and comment on the closeness of their shapes to the normal distribution.
2. (a) Collect a sample of size 7 and compute $\bar{x}$ and the sample median.
   (b) Repeat (a) 30 times.
   (c) Plot dot diagrams for the values of the two statistics in (a). These plots reflect the individual sampling distributions.
   (d) Compare the amount of variation in $\bar{X}$ and the median.
   In this exercise, you might record weekly soft-drink consumptions, sentence lengths, or hours of sleep for different students.
COMPUTER PROJECT
1. Conduct a simulation experiment on the computer to verify the central limit theorem. Generate $n = 6$ observations from the continuous distribution that is uniform on 0 to 1. Calculate $\bar{X}$. Repeat 150 times. Make a histogram of the $\bar{X}$ values and a normal-scores plot. Does the distribution of $\bar{X}$ appear to be normal for $n = 6$? You may wish to repeat with $n = 20$.
If MINITAB is available, you could use NOPRINT followed by the commands

    URANDOM 150 C1
    URANDOM 150 C2
    URANDOM 150 C3
    URANDOM 150 C4
    URANDOM 150 C5
    URANDOM 150 C6
    NAME C11 'XBAR'
    ADD C1-C6 PUT IN C10
    LET C11=C10/6.0

The 150 values of $\bar{x}$ can then be described by the commands

    HISTOGRAM C11
    DESCRIBE C11
    NSCORE C11 PUT IN C12
    PLOT C11 VS C12

In addition, you may wish to replace the uniform URANDOM command by the normal

    NRANDOM 150 MU=2 SIGMA=5 C1

command, or the command

    IRANDOM 150 0 TO 9 C1

which generates values uniform on 0, 1, . . . , 9.
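For readers without MINITAB, the same experiment can be sketched in Python using only the standard library. This is an illustrative equivalent rather than a translation of the MINITAB session: the seed and variable names are assumptions, a crude text histogram stands in for HISTOGRAM, and the normal-scores plot is omitted.

```python
import random
from statistics import fmean, stdev

rng = random.Random(2024)     # fixed seed so the run is repeatable
n, reps = 6, 150              # sample size and number of repeated samples

# Each entry is the mean of n observations that are uniform on 0 to 1
xbar = [fmean([rng.random() for _ in range(n)]) for _ in range(reps)]

# For the uniform(0, 1) population: mean 1/2 and variance 1/12
theory_mean = 0.5
theory_sd = (1 / 12) ** 0.5 / n ** 0.5

print(f"mean of x-bar values: {fmean(xbar):.4f}  (theory {theory_mean:.4f})")
print(f"sd of x-bar values:   {stdev(xbar):.4f}  (theory {theory_sd:.4f})")

# Text histogram with class intervals of width 0.1
bins = [0] * 10
for v in xbar:
    bins[min(int(v * 10), 9)] += 1
for i, count in enumerate(bins):
    print(f"{i / 10:.1f}-{(i + 1) / 10:.1f} {'*' * count}")
```

Changing `n` to 20 repeats the experiment with the larger sample size suggested in the project, and replacing `rng.random()` with `rng.randint(0, 9)` gives values uniform on the integers 0 through 9.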
Drawing Inferences From Large Samples
1. INTRODUCTION
2. POINT ESTIMATION OF A POPULATION MEAN
3. CONFIDENCE INTERVAL FOR $\mu$
4. TESTING HYPOTHESES ABOUT $\mu$
5. INFERENCES ABOUT A POPULATION PROPORTION
Chapter 9
One of the major contributions of statistics to modern thinking is the understanding that information on single, highly variable observations can be combined in great numbers to obtain very precise information about a population.

73% are satisfied with their present job. 56% are satisfied with their standard of living. Courtesy Reader's Digest. Photo from Photofile.
1. INTRODUCTION
The problem of statistical inference arises when we wish to make generalizations about a population on the basis of a sample selected from it. Once a sample is observed, its main features can be determined by the methods of descriptive summary discussed in earlier chapters. However, more often than not, our principal concern is not just with the particular data set, but with what can be said about the population. Consider a study on the effectiveness of a diet program in which 30 participants report their weight loss. We will have on hand a sample of 30 measurements of weight loss. But is the goal of the study confined to this particular group of 30 persons? No, it is not. The more important aspect concerns the effectiveness of the diet program for the population of potential users. The sample measurements must, of course, provide the basis for these conclusions.
Statistical inference deals with drawing conclusions about population parameters from an analysis of the sample data.
Any inference about a population parameter will involve some uncertainty because it is based on a sample, rather than the entire population. To be meaningful, a statistical inference must include a specification of this uncertainty. The ideas of probability and the sampling distribution of a statistic play a fundamental role in this regard.
The nature of the inference to be considered depends on the intent of the investigation. The two most important types of inferences are (a) estimation of parameter(s) and (b) testing of statistical hypotheses. The true value of a parameter is an unknown constant that can be correctly ascertained only by an exhaustive study of the population, if indeed that were possible. Our objective may be to obtain a guess or an estimate of the unknown true value along with a determination of its accuracy. This type of inference is called estimation of parameters. An alternative objective may be to examine whether the sample data support or contradict the investigator's conjecture about the true value of the parameter. We have already introduced this latter type of inference, called testing of statistical hypotheses, in Chapter 6 in connection with testing proportions.
EXAMPLE 1
To study the growth of pine trees at an early stage, a nursery worker records 40 measurements of the heights of 1-year-old red pine seedlings. This set of measurements appears in Table 1.
Table 1 Heights of 1-Year-Old Red Pine Seedlings Measured in Centimeters

2.6   1.9   1.8   1.6   1.4   2.2   1.2   1.6
1.6   1.5   1.4   1.6   2.3   1.5   1.1   1.6
2.0   1.5   1.7   1.5   1.6   2.1   2.8   1.0
1.2   1.2   1.8   1.7   0.8   1.5   2.0   2.2
1.5   1.6   2.2   2.1   3.1   1.7   1.7   1.2

Courtesy of Professor Alan Ek.
Employing the ideas of Chapter 2 we can calculate some descriptive summary for this set of measurements:
    sample mean $\bar{x} = 1.715$,
    sample standard deviation $s = .475$,
    sample median = 1.6
However, the target of our investigation is not just the particular set of measurements recorded, but it concerns the vast (infinite) population of the heights of all possible 1-year-old pine seedlings. The population distribution of the heights is unknown to us and so are the parameters such as the population mean $\mu$ and the population standard deviation $\sigma$.
Taking the view that the 40 observations represent a random sample from the population distribution of heights, one goal of this study may be to "learn about $\mu$." More specifically, we may wish to:
(a) Estimate a single value for the unknown $\mu$ (point estimation).
(b) Determine an interval of plausible values for $\mu$ (interval estimation).
(c) Decide whether or not the mean height $\mu$ is 1.9 centimeters, which was previously found to be the mean height of a different stock of pine seedlings (testing a hypothesis).
EXAMPLE 2
A government agency wishes to assess the prevailing rate of unemployment in a particular county. It is correctly felt that this assessment could be made quickly and effectively by sampling a small fraction of the labor force in the county and counting the number of persons currently unemployed. Suppose that 500 randomly selected persons are interviewed and that 41 are found to be unemployed. A descriptive summary of this finding is provided by
$$\text{sample proportion of unemployed } \hat{p} = \frac{41}{500} = .082$$
Here the target of our investigation is the proportion unemployed, $p$, in the entire county population. The true value of the parameter $p$ is unknown to us. The sample quantity $\hat{p} = .082$ sheds some light on $p$, but it is subject to some error since it draws only from a part of the population. We would like to evaluate its margin of error and to provide an interval of plausible values of $p$. We may also wish to test the hypothesis that the unemployment rate in the county is not higher than the rate quoted in a federal report. This latter type of problem was discussed in Chapter 6 but only in the confines of a small sample size. A large sample size requires a normal approximation to the binomial distribution.
2. POINT ESTIMATION OF A POPULATION MEAN
The object of point estimation is to calculate, from the sample data, a single number that is likely to be close to the unknown value of the parameter. The available information is assumed to be in the form of a random sample $X_1, X_2, \ldots, X_n$ of size $n$ taken from the population. We wish to formulate a statistic such that its value computed from the sample data would reflect the value of the population parameter as closely as possible.
A statistic intended for estimating a parameter is called a point estimator or, simply, an estimator. The standard deviation of an estimator is called its standard error: S.E.
When estimating a population mean from a random sample, perhaps the most intuitive estimator is the sample mean:

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
For instance, to estimate the mean height of the population of pine seedlings in Example 1, we would naturally compute the mean of the sample measurements. Employing the estimator $\bar{X}$, with the data of Table 1, we get the result $\bar{x} = 1.715$ centimeters, which we call a point estimate, or simply an estimate of $\mu$.
Without an assessment of accuracy, a single number quoted as an estimate may not serve a very useful purpose. We must indicate the extent of variability in the distribution of the estimator. The standard deviation, alternatively called the standard error of the estimator, provides information about its variability.
In order to study the properties of the sample mean $\bar{X}$ as an estimator of the population mean $\mu$, let us review the results from Chapter 8:
(i) $E(\bar{X}) = \mu$.
(ii) $\mathrm{sd}(\bar{X}) = \sigma/\sqrt{n}$, so $\mathrm{S.E.}(\bar{X}) = \sigma/\sqrt{n}$.
(iii) With large $n$, $\bar{X}$ is nearly normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
The first two results show that the distribution of $\bar{X}$ is centered around the population mean $\mu$ and that its standard error is $\sigma/\sqrt{n}$, where $\sigma$ is the population standard deviation and $n$ is the sample size.
To understand how closely $\bar{X}$ is expected to estimate $\mu$, we now examine the third result, which is depicted in Figure 1. Recall that, in a normal distribution, the interval running two standard deviations on either side of the mean contains probability .954. Thus, prior to sampling, the probability is .954 that the estimator $\bar{X}$ will be within a distance $2\sigma/\sqrt{n}$ from the true parameter value $\mu$. This probability statement can be rephrased by saying that when we are estimating $\mu$ by $\bar{X}$, the 95.4% error margin is $2\sigma/\sqrt{n}$.
Figure 1 Approximate normal distribution of $\bar{X}$ (area .954 between $\mu - 2\sigma/\sqrt{n}$ and $\mu + 2\sigma/\sqrt{n}$).

A minor difficulty in computing the standard error of $\bar{X}$ remains, because the expression involves the unknown population standard deviation $\sigma$. We can estimate $\sigma$ by the sample standard deviation

$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

When $n$ is large, the effect of estimating the standard error $\sigma/\sqrt{n}$ by $s/\sqrt{n}$ can be neglected. We now summarize:
Point Estimation of the Mean
Parameter: population mean $\mu$
Data: $X_1, \ldots, X_n$ (a random sample of size $n$)
Estimator: $\bar{X}$ (sample mean)
$\mathrm{S.E.}(\bar{X}) = \sigma/\sqrt{n}$, estimated $\mathrm{S.E.}(\bar{X}) = s/\sqrt{n}$
For large $n$, an approximate 95.4% error margin is $2s/\sqrt{n}$.
EXAMPLE 3
From the data of Example 1, consisting of 40 measurements of the heights of 1-year-old red pine seedlings, give a point estimate of the population mean height and state a 95.4% error margin.
The sample mean and the standard deviation computed from the 40 measurements in Table 1 are

$$\bar{x} = \frac{\sum x_i}{n} = 1.715, \qquad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = .475$$

A point estimate of the population mean height is $\bar{x} = 1.715$ centimeters. Also,

$$\text{Estimated S.E.} = \frac{s}{\sqrt{40}} = \frac{.475}{\sqrt{40}} = .075$$

An approximate 95.4% error margin is then

$$\frac{2s}{\sqrt{40}} = .15 \text{ centimeters}$$
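The arithmetic of Example 3 can be retraced with a short Python sketch. The function name `mean_point_estimate` and its arguments are illustrative assumptions rather than anything defined in the text; only the summary values $n = 40$, $\bar{x} = 1.715$, and $s = .475$ come from the example.

```python
import math

def mean_point_estimate(n, xbar, s, multiplier=2.0):
    """Return (estimate, estimated S.E., error margin) for a population mean.

    Hypothetical helper: works from summary statistics only.
    """
    se = s / math.sqrt(n)       # estimated S.E. = s / sqrt(n)
    margin = multiplier * se    # 2 x S.E. is the approximate 95.4% error margin
    return xbar, se, margin

# Summary statistics quoted in Example 3
estimate, se, margin = mean_point_estimate(n=40, xbar=1.715, s=0.475)
print(f"estimate = {estimate:.3f}, S.E. = {se:.3f}, margin = {margin:.2f}")
```

The multiplier is left as a parameter so that the same helper can report estimate ± S.E. or estimate ± 2(S.E.), whichever form is being quoted.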
Caution: (a) Standard error should not be interpreted as the "typical" error in a problem of estimation as the word "standard" may suggest. For instance, when $\mathrm{S.E.}(\bar{X}) = .3$, we should not think that the error $(\bar{X} - \mu)$ is likely to be .3 but rather that, prior to observing the data, the probability is approximately .954 that the error will be within $\pm 2(\mathrm{S.E.}) = \pm .6$.
(b) An estimate and its variability are often reported in either of the forms: estimate ± S.E. or estimate ± 2(S.E.). In reporting a numerical result such as 53.4 ± 4.6, we must specify whether 4.6 represents S.E., 2(S.E.), or some other multiple of the standard error.
EXERCISES
2.1 Consider the problem of estimating $\mu$ based on a random sample of size $n$. Compute a point estimate of $\mu$, and the estimated standard error when:
(a) $n = 80$, $\sum x_i = 752$, $\sum (x_i - \bar{x})^2 = 345$
(b) $n = 169$, $\sum x_i = 1290$, $\sum (x_i - \bar{x})^2 = 842$
2.2 A population has mean $\mu = 15$ and $\sigma = 5$. For $n = 49$:
(a) Evaluate the S.E. of $\bar{X}$.
(b) Give the 95.4% error margin.
2.3 A random sample of 90 apples gives an average weight of $\bar{x} = 4.6$ ounces and $s = 1.8$ ounces. Provide:
(a) An estimate of $\mu$ = population mean weight.
(b) An approximate 95.4% error margin.
2.4 The time it takes for a taxi to drive from the office to the airport was recorded on 40 occasions. It was found that $\bar{x} = 47$ minutes and $s = 5$ minutes. Give:
(a) An estimate of $\mu$ = population mean time to drive.
(b) An approximate 95.4% error margin.
2.5 By what factor should the sample size be increased to reduce the standard error of $\bar{X}$ to:
(a) 1/2 its original value?
(b) 1/4 its original value?
3. CONFIDENCE INTERVAL FOR $\mu$
For point estimation, a single number lies in the forefront even though a standard error is attached. Instead, it is often more desirable to produce an interval of values that is likely to contain the true value of the parameter.
Ideally, we would like to be able to collect a sample and then use it to calculate an interval that would definitely contain the true value of the parameter. This goal, however, is not achievable because of sample-to-sample variation. Instead, we insist that the proposed interval will contain the true value with a specified high probability. This probability, called the level of confidence, is typically taken as .90, .95, or .99.
To develop this concept, we first confine our attention to the construction of a confidence interval for a population mean $\mu$ assuming that the population is normal and the standard deviation $\sigma$ is known. This restriction helps to simplify the initial presentation of the concept of a confidence interval. Later on we will treat the more realistic case where $\sigma$ is also unknown.
A probability statement about $\bar{X}$ based on the normal distribution provides the cornerstone for the development of a confidence interval. From Chapter 8 recall that when the population is normal, the distribution of $\bar{X}$ is also normal. It has mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Here $\mu$ is unknown but $\sigma/\sqrt{n}$ is a known number because the sample size $n$ is known and we have assumed that $\sigma$ is known.
The normal table shows that the probability is .95 that a normal random variable will lie within 1.96 standard deviations from its mean. For $\bar{X}$, we then have
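$$P\left[\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right] = .95$$

as shown in Figure 2.

Figure 2 Normal distribution of $\bar{X}$.

Now, the relation

$$\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X} \quad \text{is the same as} \quad \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$

and

$$\bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}} \quad \text{is the same as} \quad \bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu$$

as we can see by transposing $1.96\sigma/\sqrt{n}$ from one side of an inequality to the other. Therefore, the event

$$\left[\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right]$$

is equivalent to

$$\left[\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right]$$

In essence, both events state that the difference $(\bar{X} - \mu)$ lies between $-1.96\sigma/\sqrt{n}$ and $1.96\sigma/\sqrt{n}$. Thus, the probability statement

$$P\left[\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right] = .95$$

can also be expressed as

$$P\left[\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right] = .95$$

This second form tells us that, in repeated sampling, the random interval from $\bar{X} - 1.96\sigma/\sqrt{n}$ to $\bar{X} + 1.96\sigma/\sqrt{n}$ will include the unknown parameter $\mu$ with a probability of .95. Because $\sigma$ is assumed to be known, both the upper and lower end points can be computed as soon as the sample data are available. Guided by the above reasonings, we say: the interval

$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

is a 95% confidence interval for $\mu$ when the population is normal and $\sigma$ known.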
EXAMPLE 4
Given a random sample of 25 observations from a normal population for which $\mu$ is unknown and $\sigma = 8$, the sample mean is found to be $\bar{x} = 42.7$. Construct a 95% confidence interval for $\mu$.
Because the population is normal, $\bar{X}$ also has a normal distribution. Using the observed value $\bar{x} = 42.7$, a 95% confidence interval for $\mu$ becomes

$$\left(42.7 - 1.96\frac{8}{\sqrt{25}},\ 42.7 + 1.96\frac{8}{\sqrt{25}}\right) = (39.6, 45.8)$$
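As a check on Example 4, the following Python sketch evaluates the interval $\bar{x} \pm 1.96\sigma/\sqrt{n}$ for a normal population with known $\sigma$. The function name `mean_ci_known_sigma` is a hypothetical helper, and obtaining the multiplier from `statistics.NormalDist` rather than from the normal table is an assumption of this sketch.

```python
import math
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """Confidence interval for mu when the population is normal and sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when confidence = .95
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Values quoted in Example 4: n = 25, sigma = 8, sample mean 42.7
low, high = mean_ci_known_sigma(xbar=42.7, sigma=8, n=25)
print(f"({low:.1f}, {high:.1f})")
```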
Referring to the confidence interval obtained in Example 4, we must not speak of the probability of the fixed interval (39.6, 45.8) covering the true mean $\mu$. The particular interval (39.6, 45.8) either does or does not cover $\mu$, and we will never know which is the case.
We need not always tie our discussion of confidence intervals to the choice of a 95% level of confidence. An investigator may wish to specify a different high probability. We denote this probability by $1 - \alpha$ and speak of a $100(1 - \alpha)\%$ confidence interval. The only change is to replace 1.96 with $z_{\alpha/2}$, where $z_{\alpha/2}$ denotes the upper $\alpha/2$ point of the standard normal distribution (i.e., the area to the right of $z_{\alpha/2}$ is $\alpha/2$, as shown in Figure 3).
In summary, when the population is normal and $\sigma$ is known, a $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by
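$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

Figure 3 The notation $z_{\alpha/2}$.

A few values of $z_{\alpha/2}$ obtained from the normal table appear in Table 2 for easy reference.

Table 2 Values of $z_{\alpha/2}$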
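Values of $z_{\alpha/2}$ can also be computed directly instead of being read from the normal table. The sketch below uses Python's `statistics.NormalDist`; the helper name `z_upper` and the particular confidence levels looped over are illustrative choices, not a reproduction of the printed table.

```python
from statistics import NormalDist

def z_upper(tail_area):
    """Upper-tail point of the standard normal: the area to its right is tail_area."""
    return NormalDist().inv_cdf(1 - tail_area)

# Print z_{alpha/2} for a few commonly used confidence levels 1 - alpha
for confidence in (0.80, 0.90, 0.95, 0.99):
    alpha = 1 - confidence
    print(f"1 - alpha = {confidence:.2f}   z_(alpha/2) = {z_upper(alpha / 2):.3f}")
```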
Interpretation of Confidence Intervals
To better understand the meaning of a confidence statement, we use the computer to perform repeated samplings from a normal distribution with $\mu = 100$ and $\sigma = 10$. Ten samples of size 7 are selected, and a 95% confidence interval $\bar{x} \pm 1.96 \times 10/\sqrt{7}$ is computed from each. For the first sample, $\bar{x} = 104.3$ and the interval is $104.3 \pm 7.4$ or 96.9 to 111.7. This and the other intervals are illustrated in Figure 4, where each vertical line segment represents one confidence interval. The midpoint of a line is the observed value of $\bar{X}$ for that particular sample. Also note that all of the intervals are of the same length $2 \times 1.96\sigma/\sqrt{n} = 14.8$. Of the 10 intervals shown, 9 cover the true value of $\mu$. This is not surprising, because the specified probability .95 represents the long-run relative frequency of these intervals crossing the dotted line through $\mu$.
Figure 4 Interpretation of the confidence interval for $\mu$. (Ten intervals plotted against sample number, with a dotted line at $\mu = 100$. Yes: interval contains $\mu$. No: interval does not contain $\mu$.)
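The repeated-sampling experiment behind Figure 4 can be imitated in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `coverage_fraction`, the seed, and the number of repetitions are arbitrary choices of this sketch, and the pseudorandom samples will not match the ten samples drawn in the text.

```python
import math
import random

def coverage_fraction(mu=100.0, sigma=10.0, n=7, repetitions=10_000, seed=1):
    """Fraction of intervals xbar +/- 1.96*sigma/sqrt(n) that cover mu."""
    rng = random.Random(seed)                  # fixed seed so reruns agree
    half_width = 1.96 * sigma / math.sqrt(n)   # same for every sample (about 7.4 here)
    covered = 0
    for _ in range(repetitions):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            covered += 1
    return covered / repetitions

print(f"proportion of intervals covering mu: {coverage_fraction():.3f}")
```

With many repetitions the printed proportion should settle near .95, which is the long-run relative frequency interpretation described above.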
Because confidence interval statements are the most useful way to communicate information obtained from a sample, certain aspects of their formulation merit special emphasis. Stated in terms of a 95% confidence interval for $\mu$, these are:
(a) A confidence interval $(\bar{X} - 1.96\sigma/\sqrt{n},\ \bar{X} + 1.96\sigma/\sqrt{n})$ is a random interval that attempts to cover the true value of the parameter $\mu$.
(b) The probability

$$P\left[\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right] = .95$$

interpreted as the long-run relative frequency over many repetitions of sampling, asserts that about 95% of the intervals will cover $\mu$.
(c) Once $\bar{x}$ is calculated from an observed sample, the interval $(\bar{x} - 1.96\sigma/\sqrt{n},\ \bar{x} + 1.96\sigma/\sqrt{n})$, which is a realization of the random interval, is presented as a 95% confidence interval for $\mu$. Having determined a numerical interval, it is no longer sensible to speak about the probability of its covering a fixed quantity $\mu$.
At this point one might protest, "I have only one sample and I am not really interested in repeated sampling." But if the confidence estimation techniques presented in this text are mastered and followed each time a problem of interval estimation arises, then over a lifetime approximately 95% of the intervals will cover the true parameter. Of course, this is contingent on the validity of the assumptions underlying the techniques.

Large Sample Confidence Intervals for $\mu$
Having established the basic concepts underlying confidence interval statements, we now turn to the more realistic situation for which the population standard deviation $\sigma$ is unknown. We require the sample size $n$ to be large in order to dispense with the assumption of a normal population. The central limit theorem then tells us that $\bar{X}$ is nearly normal whatever the form of the population. Referring to the normal distribution of $\bar{X}$ in Figure 5 and the discussion accompanying Figure 2, we again have the probability statement

$$P\left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha$$
(Strictly speaking, this probability is approximately $1 - \alpha$ for a nonnormal population.) Even though the interval

$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

includes $\mu$ with the probability $1 - \alpha$, it does not serve as a confidence interval because it involves the unknown quantity $\sigma$. However, because $n$ is large, replacing $\sigma$ with its estimator $s$ does not appreciably affect the probability statement. Summarizing, the large sample confidence interval for $\mu$ has the form

Estimate ± (z-value)(estimated standard error)

Figure 5 Normal distribution of $\bar{X}$.

Large Sample Confidence Interval for $\mu$
When $n$ is large, a $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by

$$\left(\bar{X} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$

where $s$ is the sample standard deviation.
EXAMPLE 5
To estimate the average weekly income of restaurant waiters and waitresses in a large city, an investigator collects weekly income data from a random sample of 75 restaurant workers. The mean and the standard deviation are found to be \$227 and \$15, respectively. Compute (a) 90% and (b) 80% confidence intervals for the mean weekly income.
The sample size $n = 75$ is large, so a normal approximation for the distribution of the sample mean $\bar{X}$ is appropriate. From the sample data, we know that

$$\bar{x} = \$227 \quad \text{and} \quad s = \$15$$

(a) With $1 - \alpha = .90$ we have $\alpha/2 = .05$, $z_{.05} = 1.645$, and

$$1.645\frac{s}{\sqrt{n}} = \frac{1.645 \times 15}{\sqrt{75}} = 2.85$$

Hence, a 90% confidence interval for the population mean $\mu$ is

$$\left(\bar{x} - 1.645\frac{s}{\sqrt{n}},\ \bar{x} + 1.645\frac{s}{\sqrt{n}}\right) = (227 - 2.85,\ 227 + 2.85)$$

or approximately (224, 230).
This means that the investigator is 90% confident the mean income $\mu$ is in the interval of \$224 to \$230. That is, 90% of the time random samples of 75 waiters and waitresses would produce intervals $\bar{x} \pm 1.645\, s/\sqrt{75}$ that contain $\mu$.
(b) With $1 - \alpha = .80$, we have $\alpha/2 = .10$, $z_{.10} = 1.28$, and

$$1.28\frac{s}{\sqrt{n}} = \frac{1.28 \times 15}{\sqrt{75}} = 2.22$$

Hence, an 80% confidence interval for $\mu$ is
$$(227 - 2.22,\ 227 + 2.22)$$

or approximately (225, 229).
Comparing the two results, the 80% confidence interval is shorter than the 90% interval. A shorter interval seems to give a more precise location for $\mu$ but suffers from a lower long-run frequency of being correct.

Confidence Interval for a Parameter
The concept of a confidence interval applies to any parameter, not just the mean. It requires that a lower limit $L$ and an upper limit $U$ be computed from the sample data. Then the random interval from $L$ to $U$ must have the specified probability of covering the true value of the parameter. The large sample $100(1 - \alpha)\%$ confidence interval for $\mu$ has

$$L = \bar{X} - z_{\alpha/2}\frac{s}{\sqrt{n}}, \qquad U = \bar{X} + z_{\alpha/2}\frac{s}{\sqrt{n}}$$

Definition of a Confidence Interval for a Parameter
An interval $(L, U)$ is a $100(1 - \alpha)\%$ confidence interval for a parameter if

$$P[L < \text{parameter} < U] = 1 - \alpha$$

and the end points $L$ and $U$ are computable from the sample.
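The large sample limits $L$ and $U$ for $\mu$ lend themselves to a small Python helper. The sketch below is one possible arrangement; the name `large_sample_mean_ci` is hypothetical, and the inputs are the restaurant income figures of Example 5 ($n = 75$, $\bar{x} = 227$, $s = 15$).

```python
import math
from statistics import NormalDist

def large_sample_mean_ci(xbar, s, n, confidence):
    """Limits (L, U) of the large sample interval xbar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    half_width = z * s / math.sqrt(n)         # z-value times estimated standard error
    return xbar - half_width, xbar + half_width

# Example 5 data at the two confidence levels used there
ci_90 = large_sample_mean_ci(xbar=227, s=15, n=75, confidence=0.90)
ci_80 = large_sample_mean_ci(xbar=227, s=15, n=75, confidence=0.80)
print("90%:", tuple(round(v, 2) for v in ci_90))
print("80%:", tuple(round(v, 2) for v in ci_80))
```

Running both levels side by side shows the trade-off noted above: the 80% interval is the shorter of the two.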
EXERCISES
3.1 An experimenter always calculates 90% confidence intervals for a mean. After 200 applications, about how many would actually cover the respective means? Explain.
3.2 A forester measures 100 needles off a pine tree and finds $\bar{x} = 3.1$ centimeters and $s = .7$ centimeters. She reports that a 95% confidence interval for the mean needle length is

$$3.1 - 1.96\frac{.7}{\sqrt{100}} \quad \text{to} \quad 3.1 + 1.96\frac{.7}{\sqrt{100}} \quad \text{or} \quad (2.96, 3.24)$$

(a) Is the statement correct?
(b) Does the interval (2.96, 3.24) cover the true mean? Explain.
3.3 In a study to determine whether a certain stimulant produces hyperactivity, 55 mice were injected with 10 micrograms of the stimulant. Afterward, each mouse was given a hyperactivity rating score. The mean score was $\bar{x} = 14.9$ and $s = 2.8$. Give a 95% confidence interval for $\mu$ = population mean score.
3.4 An engineer wishes to estimate the mean setting time of a new gypsum cement mix used in highway spot repairs. From a record of the setting times for 100 spot repairs, the mean and standard deviation are found to be 32 minutes and 4 minutes, respectively. Use these values to determine a 99% confidence interval for the mean setting time.
3.5 Determine a 90% confidence interval for the mean setting time in Exercise 3.4.
3.6 Radiation measurements on a sample of 65 microwave ovens produced $\bar{x} = .11$ and $s = .06$. Determine a 95% confidence interval for the mean radiation.
3.7 Refer to the 40 height measurements given in Table 1 and their summary statistics reported in Example 1. Calculate a 99% confidence interval for the population mean height.
4. TESTING HYPOTHESES ABOUT $\mu$
Testing statistical hypotheses constitutes an alternative approach to inference when the primary goal is to determine whether a conjecture is supported or contradicted. In Section 4 of Chapter 6, we introduced the basic ideas of testing statistical hypotheses in the context of inferences about a population proportion. Because the same process of reasoning applies to testing hypotheses about a population mean, it would help the reader to review the key terminologies: the null and alternative hypotheses, type I and type II errors, level of significance, rejection (or critical) region, and the significance probability (P-value) of an observed result. The principal steps that are involved in testing statistical hypotheses can be summarized as follows:
1. Identify the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$) in terms of the population parameter(s) that are relevant to a given problem.
2. Choose a test statistic.
3. With a selected level of significance $\alpha$, determine the rejection region.
4. Calculate the observed value of the test statistic from the sample data. Check whether or not it falls in the rejection region, and draw a conclusion accordingly.
5. When possible, find the significance probability of the observed value and strengthen your conclusion.
Because the rejection region is determined from the alternative hypothesis, it is important that $H_1$ be correctly identified.
General Guideline
$H_1$ is the claim or conjecture we wish to establish on the basis of the data. $H_0$ is the opposite of $H_1$.
We now turn to the problem of testing hypotheses about a population mean $\mu$. It is natural to use $\bar{X}$ as a test statistic to decide upon the validity of hypotheses concerning $\mu$. When the sample size $n$ is large, $\bar{X}$ can be treated as approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ whatever the form of the underlying population. Initially we treat the population standard deviation $\sigma$ as known. Specifically, consider the problem of testing a one-sided hypothesis about $\mu$ of the following form (see Figure 6):

$$H_0: \mu \le \mu_0 \quad \text{versus} \quad H_1: \mu > \mu_0$$

Figure 6 $H_0: \mu \le \mu_0$ and $H_1: \mu > \mu_0$.
where $\mu_0$ is a specified number. The normal distribution $N(\mu, \sigma/\sqrt{n})$ of $\bar{X}$ is centered at $\mu$. Figure 7 illustrates this distribution for several values of $\mu$.

Figure 7 The normal distribution for $\bar{X}$ under different values of $\mu$. Shaded area = $P[\text{reject } H_0]$.

One expects $\bar{X}$ to be near $\mu$, so rejection of $H_0$ should be recommended if the observed value of $\bar{X}$ is too far above $\mu_0$. In other words, $H_0$ should be rejected in favor of $H_1$ if $\bar{X} - \mu_0$ is too many standard deviations above zero. Since the standard deviation of $\bar{X}$ is $\sigma/\sqrt{n}$, the rejection region has the form

$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \ge c \quad \text{or} \quad R: \bar{X} \ge \mu_0 + c\frac{\sigma}{\sqrt{n}}$$

The boundary $c$ of the rejection region must now be determined from a specified tolerance of the type I error probability $\alpha$. The type I error probability is highest when $\mu = \mu_0$, the boundary point between $H_0$ and $H_1$. It is enough to determine $c$ such that
$$P\left[\bar{X} \ge \mu_0 + c\frac{\sigma}{\sqrt{n}}\right] = \alpha \quad \text{when } \mu = \mu_0$$

When $\mu = \mu_0$,

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

has the $N(0, 1)$ distribution. Because $P[Z \ge z_\alpha] = \alpha$, the requirement for the type I error probability is satisfied by choosing $c = z_\alpha$. (See Figure 8.)

Figure 8 Choice of $c = z_\alpha$.

When the sample size is larger than 30, the normal approximation for $\bar{X}$ remains valid even if $\sigma$ is estimated by the sample standard deviation $s$. Therefore, for testing $H_0: \mu \le \mu_0$ versus $H_1: \mu > \mu_0$ with a large sample, we employ the test statistic

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

and set the rejection region $R$ as $Z \ge z_\alpha$. This test is commonly called a normal test or a Z-test.
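The statement that the type I error probability is highest at the boundary $\mu = \mu_0$ can be checked numerically. The following Python sketch evaluates $P[\bar{X} \ge \mu_0 + z_\alpha \sigma/\sqrt{n}]$ for several true means, treating $\sigma$ as known. The function name `rejection_probability` and the values $\mu_0 = 20$, $\sigma = 4$, $n = 100$ are illustrative assumptions chosen for this sketch.

```python
import math
from statistics import NormalDist

def rejection_probability(mu, mu0, sigma, n, alpha):
    """P[Xbar >= mu0 + z_alpha * sigma / sqrt(n)] when the true mean is mu."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # boundary c = z_alpha
    se = sigma / math.sqrt(n)                   # standard deviation of Xbar
    cutoff = mu0 + z_alpha * se                 # rejection region: Xbar >= cutoff
    return 1 - NormalDist(mu, se).cdf(cutoff)   # Xbar is N(mu, se)

# Illustrative (assumed) setting: mu0 = 20, sigma = 4, n = 100, alpha = .05
for mu in (19.0, 19.5, 20.0, 20.5, 21.0):
    prob = rejection_probability(mu, mu0=20, sigma=4, n=100, alpha=0.05)
    print(f"mu = {mu:4.1f}   P[reject H0] = {prob:.4f}")
```

For true means below $\mu_0$ the rejection probability falls under $\alpha$, it equals $\alpha$ at $\mu = \mu_0$, and it rises above $\alpha$ once $\mu$ moves into $H_1$.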
EXAMPLE 6
Suppose you are to test $H_0: \mu \le 20$ versus $H_1: \mu > 20$ using a sample of size 100.
(a) With $\alpha = .05$ determine the rejection region.
(b) If the sample gives $\bar{x} = 21$ and $s = 4$, what does the test conclude?
As stated in the problem, we have $H_0: \mu \le 20$, $H_1: \mu > 20$, and $n = 100$. The population standard deviation $\sigma$ is unknown but, $n$ being large, we will employ the Z-test. Since the boundary between $H_0$ and $H_1$ is $\mu_0 = 20$, our test statistic is

$$Z = \frac{\bar{X} - 20}{s/\sqrt{100}}$$

(a) Because the alternative is right-sided, the rejection region should consist of large values of $Z$. With $\alpha = .05$, the normal table gives $z_{.05} = 1.645$ so the rejection region is

$$R: Z \ge 1.645$$

(b) The observed value of the test statistic is

$$z = \frac{21 - 20}{4/\sqrt{100}} = 2.5$$

which lies in the rejection region $R: Z \ge 1.645$ (see Figure 9). We therefore conclude that $H_0: \mu \le 20$ is rejected in favor of $H_1: \mu > 20$ at the level of significance $\alpha = .05$.

Figure 9 Distribution of $Z$ under $H_0$.

Let us proceed to calculate the significance probability or the P-value of the observed result $z = 2.5$. Recall that a P-value is the probability of getting the observed or more extreme results. In this case $z = 2.5$ so

$$\text{P-value} = P[Z \ge 2.5] = .0062 \quad \text{(from the normal table)}$$

A P-value of .0062 means that $H_0$ would still be rejected with $\alpha$ as low as .0062. This extremely small P-value lends strong support for $H_1$.
Remark: The direction of the alternative hypothesis $H_1: \mu > \mu_0$ leads us to set the right-sided rejection region $R: Z \ge z_\alpha$. It makes no difference whether we state the null hypothesis as $H_0: \mu \le \mu_0$ or as $H_0: \mu = \mu_0$. With either formulation, $\mu_0$ marks the boundary between $H_0$ and $H_1$, and the place where the type I error probability needs to be controlled.
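The two parts of Example 6, the rejection-region decision and the P-value, can be traced step by step in Python. This is a minimal sketch: the function name `right_sided_z_test` and its return layout are assumptions made here, while the inputs $\bar{x} = 21$, $s = 4$, $n = 100$, $\mu_0 = 20$, and $\alpha = .05$ are those of the example.

```python
import math
from statistics import NormalDist

def right_sided_z_test(xbar, s, n, mu0, alpha):
    """Large sample test of H0: mu <= mu0 vs. H1: mu > mu0."""
    z = (xbar - mu0) / (s / math.sqrt(n))       # observed test statistic
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # rejection region is Z >= z_alpha
    p_value = 1 - NormalDist().cdf(z)           # P[Z >= z] under H0
    return z, z_alpha, p_value, z >= z_alpha

z, z_alpha, p_value, reject = right_sided_z_test(xbar=21, s=4, n=100, mu0=20, alpha=0.05)
print(f"z = {z:.2f}, cutoff = {z_alpha:.3f}, P-value = {p_value:.4f}, reject H0: {reject}")
```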
What if a problem concerns the null hypothesis $H_0: \mu \ge \mu_0$ vs. the alternative hypothesis $H_1: \mu < \mu_0$? Small values of $\bar{X}$, as judged by small values of the standardized test statistic $Z$, should then comprise the rejection region. Since $P[Z \le -z_\alpha] = \alpha$, a level $\alpha$ test has the rejection region $R: Z \le -z_\alpha$.
EXAMPLE 7
From extensive records it is known that the duration of treating a disease by a standard therapy has a mean of 15 days. It is claimed that a new therapy can reduce the treatment time. To test this claim, the new therapy is to be tried on 70 patients and their times to recovery are to be recorded.
(a) Formulate the hypotheses and determine the rejection region of the test with level of significance $\alpha = .025$.
(b) If $\bar{x} = 14.6$ and $s = 3.0$, what does the test conclude?
(a) Let $\mu$ denote the population mean time to recovery for the new therapy. Because the aim of the investigation is to substantiate the assertion that $\mu < 15$, we formulate the hypotheses

$$H_0: \mu \ge 15 \quad \text{vs.} \quad H_1: \mu < 15$$

The sample size is $n = 70$. Denoting the sample mean recovery time of 70 patients by $\bar{X}$, and the standard deviation by $s$, our test statistic is

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{\bar{X} - 15}{s/\sqrt{70}}$$

The rejection region will consist of small values of $Z$ because $H_1$ is left-sided. Since $z_{.025} = 1.96$, the test with level of significance .025 has the rejection region (see Figure 10)

$$R: Z \le -1.96$$

Figure 10

(b) With the observed values $\bar{x} = 14.6$ and $s = 3$, we calculate

$$z = \frac{14.6 - 15}{3/\sqrt{70}} = -1.12$$

Since $-1.12$ is not in $R$, we do not reject $H_0: \mu \ge 15$ at the level of significance $\alpha = .025$. We conclude that, based on the choice of $\alpha = .025$, the stated claim is not substantiated.
Since our observed $z$ is $-1.12$ and smaller values are more extreme, the significance probability of this result is

$$\text{P-value} = P[Z \le -1.12] = .1314 \quad \text{(from the normal table)}$$

Thus .1314 is the smallest $\alpha$ at which $H_0$ can be rejected.
Example 7 illustrates the step-by-step application of a statistical hypothesis test. The technical terminology should not be allowed to obscure a practical issue. After all, the role of statistics ought to be to enhance our commonsense reasoning, not to cloud it. Let us review the analysis of Example 7.
The claim is that the population mean recovery time is less than 15 days. At hand is a data set of 70 patients giving us the mean recovery time $\bar{x} = 14.6$ days and standard deviation $s = 3.0$ days. The observed sample mean 14.6 is below 15, and, on the surface, it appears to support the claim. However, we must note that the result $\bar{x} = 14.6$ has come only from a sample. Another sample of 70 patients is not likely to produce exactly the same result. Bearing in mind the sample-to-sample variability of $\bar{X}$, the pertinent question is: Could the observed result $\bar{x} = 14.6$ days be explained by chance fluctuation of $\bar{X}$ even though the true population mean may be $\mu = 15$?
The answer draws from the normal distribution of $\bar{X}$, which is shown in Figure 11. It is centered at $\mu = 15$ and its standard deviation is approximated as $s/\sqrt{n} = 3.0/\sqrt{70} = .36$.

Figure 11 P-value (area .1314 to the left of 14.6 under the distribution centered at 15).
We would like to know whether or not the observed value $\bar{x} = 14.6$ is in a tail of this distribution where the probability is negligibly small. As computed in Example 7, the P-value is .1314 (see Figure 11). This is not ordinarily considered a negligible chance. We conclude that the observed result does not really contradict the possibility that the population mean is still 15 days.
The preceding hypotheses are called one-sided hypotheses, because the values of the parameter $\mu$ under the alternative hypothesis lie on one side of those under the null hypothesis. The corresponding tests are called one-sided tests or one-tailed tests. By contrast, we can have a problem of testing the null hypothesis

$$H_0: \mu = \mu_0$$

versus the two-sided alternative

$$H_1: \mu \ne \mu_0$$

Here $H_0$ is to be rejected if $\bar{X}$ is too far away from $\mu_0$ in either direction, that is, if $Z$ is too small or too large. For a level $\alpha$ test we divide the rejection probability $\alpha$ equally between the two tails and construct the rejection region

$$R: Z \le -z_{\alpha/2} \quad \text{or} \quad Z \ge z_{\alpha/2}$$

which can be expressed in the more compact notation

$$R: |Z| \ge z_{\alpha/2}$$

EXAMPLE 8
Consider the data of Table 1 concerning the height measurements of 40 pine seedlings. Do these data indicate that the population mean height is different from 1.9 centimeters?
We are seeking evidence in support of $\mu \ne 1.9$ so the hypotheses should be formulated as

$$H_0: \mu = 1.9 \quad \text{vs.} \quad H_1: \mu \ne 1.9$$

The sample size $n = 40$ being large, we will employ the test statistic

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{\bar{X} - 1.9}{s/\sqrt{40}}$$

The two-sided form of $H_1$ dictates that the rejection region must also be two-sided. Let us choose $\alpha = .05$, then $\alpha/2 = .025$ and $z_{.025} = 1.96$. Consequently, for $\alpha = .05$, the rejection region is

$$R: |Z| \ge 1.96$$

From Example 1, $\bar{x} = 1.715$ and $s = .475$ so the observed value of the test statistic is

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.715 - 1.9}{.475/\sqrt{40}} = -2.46$$

Since $|z| = 2.46$ is larger than 1.96, we reject $H_0$ at $\alpha = .05$.
In fact, the large value $|z| = 2.46$ seems to indicate a much stronger rejection of $H_0$ than that arising from the choice of $\alpha = .05$. How small an $\alpha$ can we set and still reject $H_0$? This is precisely the idea underlying the significance probability or the P-value. We calculate

$$\text{P-value} = P[|Z| \ge 2.46] = P[Z \le -2.46] + P[Z \ge 2.46] = 2 \times .0069 = .0138$$

With $\alpha$ as small as .0138, $H_0$ is still rejected. In other words, this very small P-value gives strong support for $H_1$.
In summary:
Large Sample Tests for $\mu$
When the sample size is large, tests concerning $\mu$ are based on the normal statistic

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

The rejection region is one-sided or two-sided depending upon the alternative hypothesis. Specifically,
$H_1: \mu > \mu_0$ requires $R: Z \ge z_\alpha$
$H_1: \mu < \mu_0$ requires $R: Z \le -z_\alpha$
$H_1: \mu \ne \mu_0$ requires $R: |Z| \ge z_{\alpha/2}$

Since the central limit theorem prevails for large $n$, no assumption is required as to the shape of the population distribution.
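The three forms of the large sample test differ only in which tail area is reported, so they can share one routine. The Python sketch below is an illustration under assumptions: the function name `large_sample_z_test` and the `alternative` labels "greater", "less", and "two-sided" are conventions invented here. It is applied to the data of Examples 7 and 8; because the code does not round $z$ to two decimals before looking up the tail area, its P-values can differ slightly from the table-based figures in the text.

```python
import math
from statistics import NormalDist

def large_sample_z_test(xbar, s, n, mu0, alternative):
    """Return (z, P-value) for a large sample test about mu.

    alternative: "greater" (H1: mu > mu0), "less" (H1: mu < mu0),
    or "two-sided" (H1: mu != mu0). These labels are assumed names.
    """
    z = (xbar - mu0) / (s / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Example 7: recovery times, H1: mu < 15
z7, p7 = large_sample_z_test(xbar=14.6, s=3.0, n=70, mu0=15, alternative="less")
# Example 8: pine seedling heights, H1: mu != 1.9
z8, p8 = large_sample_z_test(xbar=1.715, s=0.475, n=40, mu0=1.9, alternative="two-sided")
print(f"Example 7: z = {z7:.2f}, P-value = {p7:.4f}")
print(f"Example 8: z = {z8:.2f}, P-value = {p8:.4f}")
```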
EXERCISES
4.1 A company claims their pens will write for over 100 hours. If we take this statement to apply to the mean $\mu$, show how to state $H_0$ and $H_1$ in a test designed to establish the claim.
4.2 (a) What is the form of the rejection region when testing $H_0: \mu \le 102$ against $H_1: \mu > 102$ based on a sample size $n = 50$?
(b) If … what is the conclusion of your test with $\alpha = .05$? Also find the significance probability and interpret the result.
4.3 It is claimed that a new treatment is more effective than the standard treatment for prolonging the lives of terminal cancer patients. The standard treatment has been in use for a long time, and from records in medical journals the mean survival period is known to be 4.2 years. The new treatment is administered to 80 patients, and their duration of survival recorded. The sample mean and the standard deviation are found to be 4.5 years and 1.1 years, respectively. Is the claim supported by these results? Test at $\alpha = .05$. Also calculate the P-value.
4.4 A sample of 40 sales receipts from a university bookstore has $\bar{x} = \$121$ and $s = \$10.2$. Use these values to perform a test of $H_0: \mu \ge 125$ against $H_1: \mu < 125$ with $\alpha = .05$.
4.5 Use the values in Exercise 4.4 to test $H_0: \mu = 125$ vs. $H_1: \mu \ne 125$ with $\alpha = .05$.
4.6 A sample of 70 cans of cola was analyzed for amount of caffeine. The results yielded $\bar{x}$ … $s = 7.3$. Test $H_0: \mu$ … with $\alpha = .10$.
4.7 Use the information in Exercise 4.6 to test … $H_1: \mu \ne 60$ with $\alpha = .10$.
4.8 A random sample of 50 video tape rental club members were questioned about the number of movie tapes rented last month. It was found that $\bar{x}$ … that the mean is greater than 8.6? Use $\alpha = .05$.
5. INFERENCES ABOUT A POPULATION PROPORTION
The reasoning leading to estimation of a mean also applies to the problem of estimation of a population proportion. Example 2 considers sampling $n = 500$ persons to infer about the proportion of the population that is unemployed. When $n$ elements are randomly sampled from the population, the data will consist of the count $X$ of the number of sampled elements possessing the characteristic. Common sense suggests the sample proportion

$$\hat{p} = \frac{X}{n}$$

as an estimator of $p$. When the sample size $n$ is only a small fraction of the population size, the sample count $X$ has the binomial distribution with mean $np$ and standard deviation $\sqrt{npq}$, where $q = 1 - p$. Recall from Chapter 7 that, when $n$ is large, the binomial variable $X$ is well approximated by a normal with mean $np$ and standard deviation $\sqrt{npq}$. That is,

$$Z = \frac{X - np}{\sqrt{npq}}$$

is approximately standard normal. This statement can be converted into a statement about proportions by dividing the numerator and the denominator by $n$. In particular,

$$Z = \frac{(X - np)/n}{\sqrt{npq}/n} = \frac{\hat{p} - p}{\sqrt{pq/n}}$$
This last form, illustrated in Figure 12, is crucial to all inferences about a population proportion $p$. It shows that $\hat{p}$ is approximately normally distributed with mean $p$ and standard deviation $\sqrt{pq/n}$.

Figure 12 Approximate normal distribution for $\hat{p}$.

Point Estimation of $p$
Intuitively, the sample proportion $\hat{p}$ is a reasonable estimator of the population proportion $p$. When the count $X$ has a binomial distribution,

$$E(X) = np, \qquad \mathrm{sd}(X) = \sqrt{npq}$$

Since $\hat{p} = X/n$, the properties of expectation give

$$E(\hat{p}) = p, \qquad \mathrm{sd}(\hat{p}) = \sqrt{pq/n}$$

In other words, the sampling distribution of $\hat{p}$ has mean equal to the population proportion. The second result shows that the standard error of the estimator $\hat{p}$ is

$$\mathrm{S.E.}(\hat{p}) = \sqrt{\frac{pq}{n}}$$

The estimated standard error can be obtained by substituting the sample estimate $\hat{p}$ for $p$ and $\hat{q} = 1 - \hat{p}$ for $q$ in the formula, or

$$\text{Estimated S.E.}(\hat{p}) = \sqrt{\frac{\hat{p}\hat{q}}{n}}$$

When $n$ is large, prior to sampling, the probability is approximately .954 that the error of estimation $|\hat{p} - p|$ will be less than 2 × (estimated S.E.).
Point Estimation of a Population Proportion
Parameter: population proportion $p$
Data: $X$ = number having the characteristic in a random sample of size $n$
Estimator: $\hat{p} = X/n$
$\mathrm{S.E.}(\hat{p}) = \sqrt{pq/n}$, estimated $\mathrm{S.E.}(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$
For large $n$, an approximate 95.4% error margin is $2\sqrt{\hat{p}\hat{q}/n}$.
EXAMPLE 9
A large mail-order club that offers monthly specials wishes to try out a new item. A trial mailing is sent to a random sample of 250 members selected from the list of over 9000 subscribers. Based on this sample mailing, 70 of the members decide to purchase the item. Give a point estimate of the proportion of club members that could be expected to purchase the item and attach a 95.4% error margin.
The number in the sample represents only a small fraction of the total membership, so the count can be treated as if it were a binomial variable. Here $n = 250$ and $X = 70$, so the estimate of the population proportion is

$$\hat{p} = \frac{70}{250} = .28$$

$$\text{Estimated S.E.}(\hat{p}) = \sqrt{\frac{.28 \times .72}{250}} = .028$$

$$95.4\% \text{ error margin} = 2 \times .028 = .056$$

Therefore, the estimated proportion is $\hat{p} = .28$, with a 95.4% error margin of .06 (rounded to two decimals).
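The computation in Example 9 can be reproduced with a short Python sketch. The function name `proportion_point_estimate` is hypothetical; the counts $X = 70$ and $n = 250$ are those of the example. The code carries the unrounded standard error through to the error margin, so its margin (about .057) differs in the third decimal from the text's $2 \times .028 = .056$.

```python
import math

def proportion_point_estimate(x, n):
    """Return (p_hat, estimated S.E., approximate 95.4% error margin)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    se = math.sqrt(p_hat * q_hat / n)   # estimated S.E. = sqrt(p_hat * q_hat / n)
    return p_hat, se, 2 * se

# Example 9: 70 purchasers in a sample of 250 members
p_hat, se, margin = proportion_point_estimate(x=70, n=250)
print(f"p_hat = {p_hat:.2f}, S.E. = {se:.3f}, margin = {margin:.3f}")
```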
Confidence Interval for p

A large sample confidence interval for a population proportion p follows from the approximate normality of the sample proportion $\hat{p}$. Since $\hat{p}$ is nearly normal with mean p and standard deviation $\sqrt{pq/n}$, the random interval $\hat{p} \pm z_{\alpha/2}\sqrt{pq/n}$ is a candidate. However, the standard deviation involves the unknown parameter p, so we use the estimated standard deviation $\sqrt{\hat{p}\hat{q}/n}$ to set the end points of the confidence interval. Notice again the common form of the confidence interval:

Estimate ± (z-value)(estimated standard error)

Large Sample Confidence Interval for p

For large n, a 100(1 − α)% confidence interval for p is given by

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},\;\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$
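As an illustration of the large sample interval, here is a short Python sketch, offered under stated assumptions rather than as the book's own procedure. The helper name `proportion_confidence_interval` is hypothetical, the z-value is taken from the standard library's `statistics.NormalDist` instead of a printed normal table, and the counts 41 out of 500 are the unemployment data the text quotes from Example 2.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Large sample 100(1 - alpha)% confidence interval for a proportion.

    Hypothetical helper; assumes n is large enough for the normal
    approximation to the sample proportion.
    """
    p_hat = x / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # upper alpha/2 point of N(0, 1)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

lower, upper = proportion_confidence_interval(41, 500)
print(round(lower, 3), round(upper, 3))
```

The end points agree with a hand calculation based on the table value 1.96 to three decimals.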
EXAMPLE 10
Consider the data of Example 2, where 41 people were found unemployed out of a random sample of 500 persons from a county labor force. Compute a 95% confidence interval for the rate of unemployment in the county.

The sample size n = 500 is large, so a normal approximation to the distribution of the sample proportion is justified. Since (1 − α) = .95, we have α/2 = .025 and $z_{.025} = 1.96$. The observed $\hat{p} = 41/500 = .082$, and $\hat{q} = 1 - .082 = .918$. We calculate

$$z_{.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96\sqrt{\frac{.082 \times .918}{500}} = .024$$

Therefore, a 95% confidence interval for the rate of unemployment in the county is .082 ± .024 = (.058, .106), or (5.8%, 10.6%) in percentages. Because our procedure will produce true statements 95% of the time, we can be 95% confident that the rate of unemployment is between 5.8% and 10.6%.

Large Sample Tests about p

The details for testing hypotheses about a binomial proportion were presented in Chapter 6. Here we adapt these procedures to a large sample situation.
We consider testing $H_0$: $p = p_0$ vs. $H_1$: $p \ne p_0$. With a large number of trials n, the sample proportion

$$\hat{p} = \frac{X}{n}$$

is approximately normally distributed. Under the null hypothesis, p has the specified value $p_0$ and the distribution of $\hat{p}$ is approximately $N(p_0, \sqrt{p_0 q_0/n})$. Consequently, the standardized statistic

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0/n}}$$

has approximately the N(0, 1) distribution. Since the alternative hypothesis is two-sided, the rejection region of a level α test is given by

$$R: |Z| \ge z_{\alpha/2}$$

For one-sided alternatives, we use a one-tailed test in exactly the same way we discussed in Section 4 in connection with tests about μ.
EXAMPLE 11
A 5-year-old census recorded that 20% of the families in a large community lived below the poverty level. To determine if this percentage has changed, a random sample of 400 families is studied and 70 are found to be living below the poverty level. Does this finding indicate that the current percentage of families earning incomes below the poverty level has changed from what it was 5 years ago?

Let p denote the current population proportion of families living below the poverty level. Because we are seeking evidence to determine whether p is different from .20, we wish to test

$$H_0: p = .20 \quad \text{vs.} \quad H_1: p \ne .20$$

The sample size n = 400 being large, the Z-test is appropriate. The test statistic is

$$Z = \frac{\hat{p} - .2}{\sqrt{.2 \times .8/400}}$$

Setting α = .05, the rejection region is R: |Z| ≥ 1.96. The computed value of Z from the sample data is

$$z = \frac{(70/400) - .2}{.020} = \frac{.175 - .2}{.020} = -1.25$$

Because |z| = 1.25 is smaller than 1.96, the null hypothesis is not rejected at α = .05. We conclude that the data do not provide strong evidence that a change in the percentage of families living below the poverty level has occurred.

The significance probability of the observed value of Z is

$$P\text{-value} = P[|Z| \ge 1.25] = P[Z \le -1.25] + P[Z \ge 1.25] = 2 \times .1056 = .2112$$

We would have to inflate α to more than .21 in order to reject the null hypothesis. Thus, the evidence against $H_0$ is really weak.
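The two-sided large sample test can be sketched in a few lines of Python. This is an illustrative sketch only: `proportion_z_test` is a hypothetical helper name, and the normal probabilities come from `statistics.NormalDist` rather than the printed table, so the P-value may differ from the table-based figure in the fourth decimal. The inputs are the poverty-level counts (70 of 400, with $p_0 = .20$).

```python
import math
from statistics import NormalDist

def proportion_z_test(x, n, p0):
    """Two-sided large sample Z-test of H0: p = p0 vs. H1: p != p0.

    Hypothetical helper returning the observed z and the P-value.
    """
    p_hat = x / n
    # Under H0 the standard deviation of p_hat uses p0, not p_hat.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

z, p_value = proportion_z_test(70, 400, 0.20)
reject_at_05 = abs(z) >= 1.96
print(round(z, 2), round(p_value, 3), reject_at_05)
```

Note the design choice: the test statistic uses $\sqrt{p_0 q_0/n}$ in the denominator, while the confidence interval uses $\sqrt{\hat{p}\hat{q}/n}$, matching the two formulas in the text.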
EXERCISES
5.1 A sample of 78 university students revealed that 49 carried their books and notes in a backpack.
(a) Estimate the population proportion of students who carry their books and notes in a backpack.
(b) Obtain the estimated S.E.
(c) Give an approximate 95.4% error margin.
5.2 An analyst wishes to estimate the market share captured by Brand X detergent, that is, the proportion of Brand X sales compared to the total sales of all detergents. From data supplied by several stores, the analyst finds that out of a total of 425 boxes of detergent sold, LZ} were Brand X.
(a) Estimate the market share captured by Brand X.
(b) Estimate the S.E.
5.3 Use the data in Exercise 5.1 to obtain an approximate 95% confidence interval for the proportion of students who use backpacks.
5.4 To estimate the percentage of a species of rodent that carries a viral infection, 128 rodents are histologically examined and 72 of them are found to be infected. Compute a 95% confidence interval for the population proportion.
5.5 Referring to Exercise 5.2, determine a 95% confidence interval for the market share captured by Brand X detergent.
5.6 Assuming that n is large, write the test statistic and determine the rejection region in each case.
(a) $H_0$: $p \le .4$ vs. $H_1$: $p > .4$, α = .10.
(b) $H_0$: $p = .7$ vs. $H_1$: $p \ne .7$, α = .10.
5.7 An educator wishes to test $H_0$: p
proportion of college football players who graduate in 4 years.
(a) State the test statistic and the rejection region for a large sample test having α = .05.
(b) If 19 out of a random sample of 48 players graduated in four years, what does the test conclude? Calculate the P-value and interpret the result.
5.8 A concerned group of citizens wants to show that less than half of the voters support the president's handling of a recent crisis. Let p = proportion of voters who support the handling of the crisis.
(a) Determine $H_0$ and $H_1$.
(b) If a random sample of 500 voters give 228 in support, what does the test conclude? Use α = .05. Also evaluate the P-value.
5.9 A buyer for the outing center wants to test $H_0$: p
$H_1$: $p > .5$, where p is the proportion of students who use backpacks to carry books. Referring to Exercise 5.1, perform a test with α
6. DECIDING ON THE SAMPLE SIZE

During the planning stage of an investigation it is important to address the question of sample size. Because sampling is costly and time consuming, the investigator needs to know, beforehand, the sample size required to give the desired precision.

Sample Size for Estimation of μ

A formula for a 100(1 − α)% error margin for the estimation of μ by $\bar{X}$ is given by

$$z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

To be 100(1 − α)% sure that the error does not exceed a specified amount d, the investigator must then have

$$z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = d$$

This gives an equation in which n is unknown. Solving for n, we obtain

$$n = \left[\frac{z_{\alpha/2}\,\sigma}{d}\right]^2$$

which determines the required sample size. Of course, the solution is rounded to the next higher integer, because a sample size cannot be fractional. This determination of sample size is valid provided n > 30 so that the normal approximation to $\bar{X}$ is satisfactory.

To be 100(1 − α)% sure that the error of estimation $|\bar{X} - \mu|$ does not exceed d, the required sample size is

$$n = \left[\frac{z_{\alpha/2}\,\sigma}{d}\right]^2$$

If σ is completely unknown, a small-scale preliminary sampling is necessary to obtain an estimate of σ to be used in the formula to compute n.
EXAMPLE 12
A limnologist wishes to estimate the mean phosphate content per unit volume of a lake water. It is known from studies in previous years that the standard deviation has a fairly stable value σ = 4. How many water samples must the limnologist analyze to be 90% certain that the error of estimation does not exceed 0.8?

Here σ = 4 and 1 − α = .90, so α/2 = .05. The upper .05 point of the N(0, 1) distribution is $z_{.05} = 1.645$. The tolerable error is d = .8. Computing

$$n = \left[\frac{1.645 \times 4}{.8}\right]^2 = 67.65$$

the required sample size is n = 68.
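The sample size rule for estimating a mean is easy to mechanize. The sketch below is illustrative only; `sample_size_for_mean` is a hypothetical helper name, and the z-value is passed in explicitly so that a table value such as 1.645 can be used exactly as in the hand calculation for the limnologist's problem (σ = 4, d = .8).

```python
import math

def sample_size_for_mean(sigma, d, z):
    """Smallest integer n with z * sigma / sqrt(n) <= d.

    Hypothetical helper: sigma is the (assumed known) population
    standard deviation, d the tolerable error, z the upper alpha/2
    point of N(0, 1) read from a normal table.
    """
    return math.ceil((z * sigma / d) ** 2)  # round up to the next integer

n_required = sample_size_for_mean(sigma=4, d=0.8, z=1.645)
print(n_required)
```

The rounding step uses `math.ceil` because the text rounds the solution to the next higher integer, never down.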
Sample Size for the Estimation of p

A 100(1 − α)% error margin for the estimation of p is given by $z_{\alpha/2}\sqrt{pq/n}$, and the required sample size is obtained by equating $z_{\alpha/2}\sqrt{pq/n} = d$, where d is the specified error margin. We then obtain

$$n = pq\left[\frac{z_{\alpha/2}}{d}\right]^2$$

If the value of p is known to be roughly in the neighborhood of a value p*, then n can be determined from

$$n = p^*(1 - p^*)\left[\frac{z_{\alpha/2}}{d}\right]^2$$

Without prior knowledge of p, pq can be replaced by its maximum possible value 1/4 and n determined from the relation

$$n = \frac{1}{4}\left[\frac{z_{\alpha/2}}{d}\right]^2$$

EXAMPLE 13
A public health survey is to be designed to estimate the proportion p of a population having defective vision. How many persons should be examined if the public health commissioner wishes to be 98% certain that the error of estimation is below .05 when:
(a) There is no knowledge about the value of p?
(b) p is known to be about .3?

The tolerable error is d = .05. Also (1 − α) = .98, so α/2 = .01. From the normal table, we know that $z_{.01} = 2.33$.

(a) Since p is unknown, the conservative bound on n yields

$$n = \frac{1}{4}\left[\frac{2.33}{.05}\right]^2 = 542.9$$

A sample of size 543 would suffice.

(b) If p* = .3, the required sample size is

$$n = (.3 \times .7)\left[\frac{2.33}{.05}\right]^2 = 456.03$$

which rounds up to 457.
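A small Python sketch of the sample size rule for a proportion follows, under stated assumptions: `sample_size_for_proportion` is a hypothetical helper, the z-value is supplied by the caller (here the table value 2.33 for 98% confidence), and leaving `p_star` unset triggers the conservative bound pq ≤ 1/4.

```python
import math

def sample_size_for_proportion(d, z, p_star=None):
    """Sample size so that z * sqrt(pq / n) <= d.

    Hypothetical helper: with no prior guess p_star, pq is replaced
    by its maximum possible value 1/4 (the conservative choice).
    """
    pq = 0.25 if p_star is None else p_star * (1 - p_star)
    return math.ceil(pq * (z / d) ** 2)  # round up to the next integer

n_conservative = sample_size_for_proportion(d=0.05, z=2.33)
n_with_guess = sample_size_for_proportion(d=0.05, z=2.33, p_star=0.3)
print(n_conservative, n_with_guess)
```

Passing a more precise z-value (for example from `statistics.NormalDist().inv_cdf(0.99)`) would give a slightly smaller n than the rounded table value 2.33, which is why the z-value is left as an explicit argument.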
EXERCISES
6.1 An investigator, interested in estimating a population mean, wants to be 95% certain that the error of estimation does not exceed 2.5. What sample size should she use if σ = 18?
6.2 A food service manager wants to be 95% certain that the error in the estimate of the mean number of sandwiches dispensed over the lunch hour is 10 or less. What sample size should be selected if a preliminary sample suggests σ = 40?
6.3 What sample size is required if σ = 80 in Exercise 6.2?
6.4 How large a sample should be taken to be 95% sure that the error of estimation does not exceed .02 when estimating a population proportion? Answer if (a) p is unknown and (b) p is about .7.
6.5 In a psychological experiment, individuals are permitted to react to a stimulus in one of two ways, say A or B. The experimenter wishes to estimate the proportion, p, of persons exhibiting reaction A. How many persons should be included in the experiment to be 90% confident that the error of estimation is within .04 if the experimenter:
(a) Knows that p is about .2?
(b) Has no idea about the value of p?
6.6 A national safety council wishes to estimate the proportion of automobile accidents that involve pedestrians. How large a sample of accident records must be examined to be 98% certain that the estimate does not differ from the true proportion by more than .04? (The council believes that the true proportion is below .25.)
KEY IDEAS

The concepts of statistical inference are useful when we wish to make generalizations about a population on the basis of sample data.

Two basic forms of inference are: (a) estimation of a population parameter, and (b) testing statistical hypotheses.

A parameter can be estimated in two ways: by quoting (i) a single numerical value (point estimation) or (ii) an interval of plausible values (interval estimation).

The standard deviation of a point estimator is also called its standard error. To be meaningful, a point estimate must be accompanied by an evaluation of its error margin.

A 100(1 − α)% confidence interval is an interval that will cover the true value of the parameter with probability (1 − α). The interval must be computable from the sample data.

If random samples are repeatedly drawn from a population, and a 100(1 − α)% confidence interval is calculated from each, about 100(1 − α)% of those intervals will include the true value of the parameter. We never know what happens in a single application. Our confidence draws from the success rate of 100(1 − α)% in many applications.
Inferences about a Population Mean When n is Large

When n is large, we need not be concerned about the shape of the population distribution. The central limit theorem tells us that the sample mean $\bar{X}$ is nearly normally distributed with mean μ and standard deviation $\sigma/\sqrt{n}$. Moreover, $\sigma/\sqrt{n}$ can be estimated by $s/\sqrt{n}$.

Parameter of interest: μ = population mean
Inferences are based on $\bar{X}$ = sample mean.

(i) A point estimator of μ is the sample mean $\bar{X}$.
Estimated standard error = $s/\sqrt{n}$
Approximate 100(1 − α)% error margin = $z_{\alpha/2}\, s/\sqrt{n}$.

(ii) A 100(1 − α)% confidence interval for μ is

$$\left(\bar{X} - z_{\alpha/2}\,\frac{s}{\sqrt{n}},\;\; \bar{X} + z_{\alpha/2}\,\frac{s}{\sqrt{n}}\right)$$

(iii) To test hypotheses about μ, the test statistic is

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

where $\mu_0$ is the value of μ that marks the boundary between $H_0$ and $H_1$. Given a level of significance α,

reject $H_0$: $\mu \le \mu_0$ in favor of $H_1$: $\mu > \mu_0$ if $Z \ge z_\alpha$
reject $H_0$: $\mu \ge \mu_0$ in favor of $H_1$: $\mu < \mu_0$ if $Z \le -z_\alpha$
reject $H_0$: $\mu = \mu_0$ in favor of $H_1$: $\mu \ne \mu_0$ if $|Z| \ge z_{\alpha/2}$
Inferences about a Population Proportion When n is Large

Parameter of interest: p, the population proportion of individuals possessing a stated characteristic
Inferences are based on $\hat{p} = X/n$, the sample proportion.

(i) A point estimator of p is $\hat{p}$.
Estimated standard error = $\sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$
100(1 − α)% error margin = $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$

(ii) A 100(1 − α)% confidence interval for p is

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},\;\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$

(iii) To test hypotheses about p, the test statistic is

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0/n}}$$

where $p_0$ is the value of p that marks the boundary between $H_0$ and $H_1$. The rejection region is right-sided, left-sided, or two-sided according to $H_1$: $p > p_0$, $H_1$: $p < p_0$, or $H_1$: $p \ne p_0$, respectively.
7. EXERCISES
7.1 Consider the problem of estimating a population mean μ based on a random sample of size n from the population. Compute a point estimate of μ and the estimated standard error in each of the following cases:
(a) n = 70, $\sum x_i = 852$, $\sum (x_i - \bar{x})^2 = 215$
(b) n = 140, $\sum x_i = 1653$, $\sum (x_i - \bar{x})^2 = 464$
(c) n = 150, $\sum x_i = 1985$, $\sum (x_i - \bar{x})^2 = 415$
7.2 Compute an approximate 95.4% error margin for the estimation of μ in each of the three cases in Exercise 7.1.
7.3 A zoologist wishes to estimate the mean blood sugar level of a species of animal when injected with a specified dosage of adrenaline. A sample of 55 animals of a common breed are injected with adrenaline, and their blood-sugar measurements are recorded in units of milligrams per 100 milliliters of blood. The mean and the standard deviation of these measurements are found to be 126.9 and 10.5, respectively.
(a) Give a point estimate of the population mean and find a 95.4% error margin.
(b) Determine a 90% confidence interval for the population mean.
7.4 Determine a 100(1 a)% confidence interval for the population mean in each of the following cases: (a) n )rr _ 725, Xxi x)2
(b) n
)rr _ 2562, Xxi
V)z
7.5 A sample of 64 measurementsprovide the sample mean x _ 8.76 and the sample standard deviation s the population mean/ construct a:
(a) 90% confidence interval. (b) 99T" confidence interval. 7.6 After feeding a special diet to g0 mice, the scientist measuresthe weight gains and obtains x ands_4grams.Hestates that a 9o% confidence interval is given by I
(rr
r . 6 4 s* , , 3'5 ) + rr'o45 . 6 4 s 4 \ or (34'26' 35'74) V8o \m)
(a) Was the confidence interval calculated correc tly? If not, provide the correct result. (b) Does the interval (84.26, 85.74)cover the true mean? Explain your answer.
*7'7 compu-tingfrom .a rygg sample, one finds the 95o/oconfidence interval for p to^be 1s.s,t+.zj.Iiased on this i"ror-n,io" alone, determine the 90% confidenceinterval for p. A 100(1 lHint: and has half-width
a)% confidence interval for p is centered atV l-r, _ zo/zs/{n.l
7.8 In each case identify the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$) using the appropriate symbol for the parameter of interest.
(a) A consumer group plans to test drive several cars of a new model in order to document that its average highway mileage is less than 50 miles per gallon.
(b) subsoil water specimenswill be analyzed in order to determine if there is a convincing evidence thai the mean concentration of a chemi cal agenthas exceeded.00g. (c) A chiropractic method will be tried on a number of persons sufferitg from persistent backachein order to demonstrate the claim that its successrate is higher than 50%. (d) The setting of an automatic dispenser needsadjustment when the mean filt differs from the intended amormi of 16 ounces. Severalfills will be accurately measuredin order to decide if there is a need for resetting.
(e) A researcher in agro-genetics wants to demonstrate that less than 30% of a new strain of corn plants die from a specified frost condition.
7.9 In a given situation, suppose $H_0$ was rejected at α = .05. Answer the following questions as "yes," "no," or "can't tell" as the case may be.
(a) Would $H_0$ also be rejected at α = .02?
(b) Would $H_0$ also be rejected at α = .10?
(c) Is the P-value larger than .05?
7.10 Assuming that n is large in each case, write the test statistic and determine the rejection region at the given level of significance.
(a) $H_0$: $\mu = 20$ vs. $H_1$: $\mu \ne 20$, α = .05
(b) $H_0$: $\mu \ge 30$ vs. $H_1$: $\mu < 30$, α = .02
7.11 In a problem of testing $H_0$: $\mu \le 75$ versus $H_1$: $\mu > 75$, the following sample quantities are recorded:
n = 56, $\bar{x} = 77.04$, s = 6.80
(a) State the test statistic and find the rejection region at α = .05.
(b) Calculate the test statistic and draw a conclusion at α = .05.
(c) Find the P-value and interpret the result.
7.12 A sample of 42 measurements was taken in order to test the null hypothesis that the population mean equals 8.5 against the alternative that it is different from 8.5. The sample mean and standard deviation were found to be 8.79 and 1.27, respectively.
(a) Perform the hypothesis test using .10 as the level of significance.
(b) Calculate the significance probability and interpret the result.
7.13 In a large-scale, cost-of-living survey undertaken last January, weekly grocery expenses for families with 1 or 2 children were found to have a mean of $98 and a standard deviation of $15. To investigate the current situation, a random sample of families with 1 or 2 children is to be chosen and their last week's grocery expenses are to be recorded.
(a) How large a sample should be taken if one wants to be 95% sure that the error of estimation of the population mean grocery expenses per week for families with 1 or 2 children does not exceed $2? (Use the previous s as an estimate of the current σ.)
(b) A random sample of 100 families is actually chosen, and from the data of their last week's grocery bills, the mean and the standard deviation are found to be $103 and $12, respectively. Construct a 98% confidence interval for the current mean grocery expense per week for the population of families with 1 or 2 children.
7'14 calculate a r00(r - a)% confidenceintervar for the population proportion in eachof the following cases. ( a )n : r c O O , p : . 3 6 , 1- o : . 9 g (b) n
7.15 A random sample of 2000 persons from the labor force of a large city are interviewed, and 165 of them are found to be unemployed.
(a) Estimate the rate of unemployment based on the data.
(b) Establish a 95% error margin for your estim ate. 7 . 1 6Referring to Exercise 7.rs compute a gg% confidence interval for the rate of unemployment.
7 . r 7Let p - proportion of adults in a crty who required a lawyer in the past yeat. (a) Determine the rejection region for an o _ .0s level test of Ho: P (b) If 65 persons in a random sample of 200 required law yer services, what does the test concllde?
7 . r 8Referring to Exercise7 -17, obtain a 90% confidence interv al forp. 7 . r 9specimens of a new insulation for cables will be
tested under a hish-temperature stress condition in order to estimate the chance of failure under this srress. (a) How many specimens should be tested if the investigator wants to be 95"/o sure that the error of estimatio' do"a ,ro, exceed '08? (From experience with a .i-ir"r -"i"riar he expects a failure probability somewhere near .6.) (b) After testing 100 specimens it was found that 6g had failed. Give a 95% confidence interval for the a*. of f"ilrrr". "t "i"" 7 '2o A marketing manager wishes to determine if lemon fravor and almond flavor in a dishwashing liquid ril..Jiy consumers' out of 25o consumers interviewed,"."r4s "q""iiy ."pi"r."i it"it preference for the lemon flavor and the remaining 105 preferred the almond flavor. (a) Do these dataprovide strong evidence that there is a difference inpopularitybetweenthetwoflavors?(Testwitha
(b) Construct a 95% confidence interval for the population proportion of consumers who prefer the almond flavor.
7.21 Researchers in cancer therapy often report only the number of patients who survive for a specified period of time after treatment rather than the patients' actual survival times. Suppose that 30% of the patients who undergo the standard treatment are known to survive 5 years. A new treatment is administered to 100 patients, and 38 of them are still alive after a period of 5 years.
(a) Formulate the hypotheses for testing the validity of the claim that the new treatment is more effective than the standard therapy.
(b) Test with α = .05 and state your conclusion.
7.22 A genetic model suggests that 80% of the plants grown from a cross between two given strains of seeds will be of the dwarf variety. After breeding 200 of these plants, 136 were of the dwarf variety.
(a) Does this observation strongly contradict the genetic model?
(b) Construct a 95% confidence interval for the true proportion of dwarf plants obtained from the given cross.
7.23 An electronic scanner is believed to be more efficient in detecting flaws in a material than a mechanical testing method which detects 60% of the flawed specimens. Specimens with flaws are tested by the electronic scanner to determine its success rate.
(a) Suppose the scanner detects flaws in 58 cases out of 80 flawed specimens tested. Does this observation constitute strong evidence that the scanner has a higher success rate than the mechanical testing method? State the P-value.
(b) Based on the data of part (a), give a 95% confidence interval for the true success rate of the scanner.
7.24 A researcher in a heart and lung center wishes to estimate the rate of incidence of respiratory disorders among middle-aged males who have been smoking more than two packs of cigarettes per day during the last 5 years. How large a sample should the researcher select to be 95% confident that the error of estimation of the proportion of the population afflicted with respiratory disorders does not exceed .03? (The true value of p is expected to be near .15.)
7.25 One wishes to estimate the proportion of car owners who purchase more than $500,000 of liability coverage in their automobile insurance policies.
(a) How large a sample should be chosen to estimate the proportion with a 95% error margin of .008? (Use p* = .15.)
(b) A random sample of 400 car owners is taken, and 56 of them are found to have chosen this extent of coverage. Construct a 95% confidence interval for the population proportion.
*7.26 Finding the power of a test. Consider the problem of testing $H_0$: $\mu = 10$ vs. $H_1$: $\mu > 10$. The rejection region of this test is given by

$$\frac{\bar{X} - 10}{2/\sqrt{64}} \ge 1.96 \quad \text{or} \quad \bar{X} \ge 10 + 1.96\,\frac{2}{\sqrt{64}} = 10.49$$

Suppose we wish to calculate the power of this test at the alternative $\mu_1 = 11$. Recall that power = the probability of rejecting the null hypothesis when the alternative is true. Since our test rejects the null hypothesis when $\bar{X} \ge 10.49$, its power at $\mu_1 = 11$ is the probability

$$P[\bar{X} \ge 10.49 \text{ when the true mean } \mu_1 = 11]$$

If the population mean is 11, we know that $\bar{X}$ has the normal distribution with mean = 11 and sd = $\sigma/\sqrt{n} = 2/\sqrt{64} = .25$. The standardized variable is

$$Z = \frac{\bar{X} - 11}{.25}$$

and we calculate

$$\text{Power} = P[\bar{X} \ge 10.49] = P\left[Z \ge \frac{10.49 - 11}{.25}\right] = P[Z \ge -2.04]$$

Following the above steps, calculate the power of this test at the following alternatives:
(a) $\mu_1$
(b) $\mu_1$
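The power calculation described in this exercise can be traced step by step in code. The following Python sketch is illustrative only: `power_upper_tailed` is a hypothetical helper, and the values σ = 2, n = 64, and the cutoff multiplier 1.96 are read off the rejection region quoted in the exercise. It evaluates the final normal probability with `statistics.NormalDist` rather than a printed table.

```python
from statistics import NormalDist

def power_upper_tailed(mu0, mu1, sigma, n, z_cut):
    """Power of the test that rejects H0 when X-bar >= mu0 + z_cut * sigma / sqrt(n).

    Hypothetical helper: mu1 is the alternative at which power is
    evaluated; X-bar is treated as normal with sd sigma / sqrt(n).
    """
    se = sigma / n ** 0.5             # sd of the sample mean
    cutoff = mu0 + z_cut * se         # rejection boundary for X-bar
    # Probability that X-bar falls in the rejection region when the mean is mu1.
    return 1 - NormalDist(mu=mu1, sigma=se).cdf(cutoff)

power_at_11 = power_upper_tailed(mu0=10, mu1=11, sigma=2, n=64, z_cut=1.96)
print(round(power_at_11, 4))
```

Calling the same function with other values of `mu1` gives the power at other alternatives; power increases as the alternative moves further above 10.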
CHAPTER 10

Small-Sample Inferences for Normal Populations

1. INTRODUCTION
2. STUDENT'S t DISTRIBUTION
3. CONFIDENCE INTERVAL FOR μ - SMALL SAMPLE SIZE
4. HYPOTHESES TESTS FOR μ
5. RELATIONSHIP BETWEEN TESTS AND CONFIDENCE INTERVALS
6. INFERENCES ABOUT THE STANDARD DEVIATION σ (THE CHI-SQUARE DISTRIBUTION)
7. ROBUSTNESS OF INFERENCE PROCEDURES
Excavated objects are often carbon dated.

Scientists use radioactive methods for dating very old objects. Anything that was alive at one time can be dated by an internal radioactive clock that is triggered when it dies. Radioactive carbon-14 then decays to nitrogen and the amount of carbon-14 still present indicates the age. Excavations at Hierakonpolis, in the late 1970s, unearthed the earliest substantial architecture in Egypt. Ten samples of wood were carbon dated, and their ages were estimated as

4900 4750 4820 4710 4760 4300 4570 4680 4800 4670

A 95% confidence interval for the population mean μ is shown on an axis labeled Age (years).

(Courtesy R. Steventon of the University of Wisconsin Radioactive Carbon Dating Laboratory, Center for Climatic Research.)
1. INTRODUCTION

In Chapter 9 we discussed inferences about a population mean when a large sample is available. Those methods are deeply rooted in the central limit theorem, which guarantees that the distribution of $\bar{X}$ is approximately normal. By the versatility of the central limit theorem we did not need to know the specific form of the population distribution.

Many investigations, especially those involving costly experiments, require statistical inferences to be drawn from small samples (n < 30, as a rule of thumb). Since the sample mean $\bar{X}$ will still be used for inferences about μ, we must address the question "What is the sampling distribution of $\bar{X}$ when n is not large?" Unlike the large sample situation, here we do not have an unqualified answer. In fact, when n is small, the distribution of $\bar{X}$ does depend to a considerable extent on the form of the population distribution. With the central limit theorem no longer applicable, more information concerning the population is required for the development of statistical procedures. In other words, the appropriate methods of inference depend upon the restrictions met by the population distribution.

In this chapter, we describe how to set confidence intervals and test hypotheses when it is reasonable to assume that the population distribution is normal.

We begin with inferences about the mean μ of a normal population. Guided by the development in Chapter 9, it is again natural to focus on the ratio

$$\frac{\bar{X} - \mu}{s/\sqrt{n}}$$

when σ is also unknown. The sampling distribution of this ratio, called Student's t distribution, is introduced next.
2. STUDENT'S t DISTRIBUTION

When $\bar{X}$ is based on a random sample of size n from a normal N(μ, σ) population, we know that $\bar{X}$ is exactly distributed as $N(\mu, \sigma/\sqrt{n})$. Consequently, the standardized variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

has the standard normal distribution.

Because σ is typically unknown, an intuitive approach is to estimate σ by the sample standard deviation s. Just as we did in the large sample situation, we consider the ratio

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

Although estimating σ with s does not appreciably alter the distribution in large samples, it does make a substantial difference if the sample is small. The new notation t is required in order to distinguish it from the standard normal variable Z. In fact, this ratio is no longer standardized. Replacing σ by the sample quantity s introduces more variability in the ratio, making its standard deviation larger than 1.

The distribution of the ratio t is known in statistical literature as "Student's t distribution." This distribution was first studied by a British chemist W. S. Gosset, who published his work in 1908 under the pseudonym "Student." The brewery for which he worked apparently did not want the competition to know that they were using statistical techniques to better understand and improve their fermentation process.
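Student's t Distribution

If $X_1, \ldots, X_n$ is a random sample from a normal population N(μ, σ) and

$$\bar{X} = \frac{1}{n}\sum X_i, \qquad s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$

then the distribution of

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

is called Student's t distribution with n − 1 degrees of freedom.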
The qualification "with n − 1 degrees of freedom" is necessary, because with each different sample size or value of n − 1, there is a different t distribution. The choice n − 1 coincides with the divisor for the estimator $s^2$, which is based on n − 1 degrees of freedom.

The t distributions are all symmetric around 0 but have tails that are more spread out than the N(0, 1) distribution. However, with increasing degrees of freedom the t distributions tend to look more like the N(0, 1) distribution. This agrees with our previous remark that for large n the ratio

$$\frac{\bar{X} - \mu}{s/\sqrt{n}}$$

is approximately standard normal. The density curves for t with 2 and 5 degrees of freedom are plotted in Figure 1 along with the N(0, 1) curve.

Appendix B, Table 4 gives the upper α points $t_\alpha$ for some selected values of α and the degrees of freedom (abbreviated d.f.).
Figure 1 Comparison of the N(0, 1) density curve with the t density curves for d.f. = 5 and d.f. = 2.
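The heavier tails of the t distribution can be seen numerically by evaluating the densities directly. The sketch below is an illustration under stated assumptions: the closed-form t density used here is the standard textbook formula, which is not given in this chapter, and the function names `t_density` and `normal_density` are hypothetical helpers.

```python
import math

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom.

    Uses the standard formula
    Gamma((df+1)/2) / (sqrt(df*pi) * Gamma(df/2)) * (1 + t^2/df)^(-(df+1)/2),
    an assumption brought in from outside the chapter.
    """
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def normal_density(z):
    """Density of the standard normal N(0, 1) distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Compare curve heights at the center and far out in the tail.
for point in (0.0, 3.0):
    print(point, normal_density(point), t_density(point, 5), t_density(point, 2))
```

At the center the normal curve is the tallest, while at 3 the ordering reverses, with the d.f. = 2 curve highest, which is what "tails that are more spread out" means.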
The curve is centered at zero so the lower α point is simply $-t_\alpha$. The entries in the last row marked "d.f. = infinity" in Appendix B, Table 4 are exactly the percentage points of the N(0, 1) distribution.

EXAMPLE 1
Using Appendix B, Table 4, determine the upper .10 point of the t distribution with 5 degrees of freedom. Also find the lower .10 point.

With d.f. = 5, the upper .10 point of the t distribution is found from Appendix B, Table 4 to be $t_{.10} = 1.476$. Since the curve is centered at 0, the lower .10 point is simply $-t_{.10} = -1.476$. See Figure 2.

Figure 2 The upper and lower .10 points of the t distribution with d.f. = 5.
EXAMPLE2
For the t distribution with d.f. : 9, find the number b such that Pl-b
Figure3
EXERCISES 2 . 1 Using the table for the t distributions find the: (a) Upper .05 point when d.f. (b) Lower .025 point when d.f. - 12. (c) Lower .01 point when d.f. (d) Upper .10 point when d.f.
2.2 Name the t-percentilesshown and find their valuesfrom Appendix B, Table 4.
2.3 In each case,find the numb er b so that: (a) Plt 6. (b) Pl- b t d . f. - 1 5 . (c) Plt B.
(d) Plt
11.
2.4 Record the $t_{.05}$ values for d.f. = 5, 10, 15, 20, and 29. Does this percentile increase or decrease with increasing degrees of freedom?
3. CONFIDENCE INTERVAL FOR μ - SMALL SAMPLE SIZE

The t distribution for

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

provides the key for determining a confidence interval for the mean of a normal population. For a 100(1 − α)% confidence interval, we consult the t table (Appendix B, Table 4) and find $t_{\alpha/2}$, the upper α/2 point of the t distribution with n − 1 degrees of freedom (see Figure 4).

Figure 4

Since $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ has the t distribution with d.f. = n − 1, we have

$$P\left[-t_{\alpha/2} < \frac{\bar{X} - \mu}{s/\sqrt{n}} < t_{\alpha/2}\right] = 1 - \alpha$$

In order to obtain a confidence interval, let us rearrange the terms inside the brackets so that only the parameter μ remains in the center. The above probability statement then becomes

$$P\left[\bar{X} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}\right] = 1 - \alpha$$

which is precisely in the form required for a confidence statement about μ. The probability is 1 − α that the random interval $\bar{X} - t_{\alpha/2}\, s/\sqrt{n}$ to $\bar{X} + t_{\alpha/2}\, s/\sqrt{n}$ will cover the true population mean μ. This argument is virtually the same as in Section 3 of Chapter 9. Only now the unknown σ is replaced by the sample standard deviation s, and the t percentage point is used instead of the standard normal percentage point.
A 100(1 − α)% Confidence Interval for a Normal Population Mean is given by

$$\left(\bar{X} - t_{\alpha/2}\,\frac{s}{\sqrt{n}},\;\; \bar{X} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}\right)$$

where $t_{\alpha/2}$ is the upper α/2 point of the t distribution with d.f. = n − 1.
Let us review the meaning of a confidence interval in the present context. Imagine that random samples of size $n$ are repeatedly drawn from a normal population and the interval $(\bar{x} - t_{\alpha/2}s/\sqrt{n},\ \bar{x} + t_{\alpha/2}s/\sqrt{n})$ is calculated in each case. The interval is centered at $\bar{x}$, so the center varies from sample to sample. The length of an interval, $2t_{\alpha/2}s/\sqrt{n}$, also varies from sample to sample because it is a multiple of the sample standard deviation $s$. (This is unlike the fixed-length situation illustrated in Figure 4 of Chapter 9, which was concerned with a known $\sigma$.) Thus, in repeated sampling, the intervals have variable centers and variable lengths. However, our confidence statement means that if the sampling is repeated many times, about $100(1 - \alpha)\%$ of the resulting intervals would cover the true population mean $\mu$.

Figure 5 shows the results of drawing 10 samples of size $n = 7$ from the normal population with $\mu = 100$ and $\sigma = 10$. Selecting $\alpha = .05$, we find that the value of $t_{.025}$ with 6 d.f. is 2.447, so the 95% confidence interval is $\bar{x} \pm 2.447s/\sqrt{7}$. In the first sample $\bar{x} = 103.88$ and $s = 7.96$, so that the interval is (96.52, 111.24). The 95% confidence intervals are shown by the vertical line segments.

[Figure 5: ten 95% confidence intervals drawn as vertical line segments against sample number (1 to 10), with a horizontal reference line at $\mu = 100$; each interval is marked according to whether or not it contains $\mu$.]

Figure 5 Behavior of confidence intervals based on the t distribution.
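The repeated-sampling interpretation described above can be checked by simulation. The following is a minimal sketch, not part of the original text: it assumes Python's standard library, a hypothetical helper named `coverage`, and an arbitrary random seed, and it reuses the setting of Figure 5 ($n = 7$, $\mu = 100$, $\sigma = 10$, $t_{.025} = 2.447$ with 6 d.f.).

```python
import math
import random
import statistics

def coverage(n_samples, n=7, mu=100.0, sigma=10.0, t_crit=2.447, seed=1):
    """Fraction of simulated t intervals that cover the true mean mu.

    t_crit = 2.447 is the upper .025 point of t with 6 d.f. (from the text).
    The seed is an arbitrary choice made only for reproducibility.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)            # sample standard deviation
        half = t_crit * s / math.sqrt(n)        # half-length varies with s
        if xbar - half < mu < xbar + half:
            hits += 1
    return hits / n_samples

rate = coverage(10000)
print(f"fraction of intervals covering mu: {rate:.3f}")
```

With many repetitions the printed fraction should settle near .95, even though the individual intervals differ in both center and length.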
EXAMPLE 3

A new alloy has been devised for use in a space vehicle. Tensile strength measurements are made on 15 pieces of the alloy, and the mean and the standard deviation of these measurements are found to be 39.3 and 2.6, respectively.
(a) Find a 90% confidence interval for the mean tensile strength of the alloy.
(b) Is $\mu$ included in this interval?

(a) Assuming that strength measurements are normally distributed, a 90% confidence interval for the mean $\mu$ is given by

$$\bar{X} \pm t_{.05}\frac{s}{\sqrt{n}}$$

where $n = 15$ and consequently d.f. $= 14$. Consulting the $t$ table, we find $t_{.05} = 1.761$. Hence a 90% confidence interval for $\mu$ computed from the observed sample is

$$39.3 \pm 1.761 \times \frac{2.6}{\sqrt{15}} = 39.3 \pm 1.18, \quad \text{or} \quad (38.12,\ 40.48)$$

We are 90% confident that the mean tensile strength is between 38.12 and 40.48, because 90% of the intervals calculated in this manner will contain the true mean tensile strength.

(b) We will never know if a single realization of the confidence interval, such as (38.12, 40.48), covers the unknown $\mu$. Our confidence in the method is based on the high percentage of times $\mu$ is covered by intervals in repeated samplings.
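The arithmetic of Example 3 can be reproduced with a few lines of code. This is an illustrative sketch only: the function name `t_confidence_interval` is a hypothetical helper, and the percentage point 1.761 is taken from the $t$ table as quoted in the example rather than computed.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n).

    t_crit must be looked up in the t table at d.f. = n - 1.
    """
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Example 3: n = 15, xbar = 39.3, s = 2.6, t.05 with 14 d.f. = 1.761
lower, upper = t_confidence_interval(39.3, 2.6, 15, 1.761)
print(f"90% CI: ({lower:.2f}, {upper:.2f})")  # prints 90% CI: (38.12, 40.48)
```

Passing the table value in as an argument keeps the sketch free of any statistical library; only the half-width $t_{\alpha/2}s/\sqrt{n}$ is computed.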
Remark: When repeated independent measurements are made on the same material and any variation in the measurements is basically due to experimental error (possibly compounded by nonhomogeneous materials), the normal model is often found to be appropriate. It is still necessary to graph the individual data points (not given here) in a dot diagram and normal scores plot to reveal any wild observations or serious departures from normality. In all small-sample situations, it is important to remember that the validity of a confidence interval rests on the reasonableness of the model assumed for the population.
Recall from the previous chapter that the length of a $100(1 - \alpha)\%$ confidence interval for a normal mean $\mu$ is $2z_{\alpha/2}\sigma/\sqrt{n}$ when $\sigma$ is known, whereas it is $2t_{\alpha/2}s/\sqrt{n}$ when $\sigma$ is unknown. Given a small sample size $n$ and consequently a small number of degrees of freedom $(n - 1)$, the extra variability caused by estimating $\sigma$ with $s$ makes the $t$ percentage point $t_{\alpha/2}$ much larger than the normal percentage point $z_{\alpha/2}$. For instance, with d.f. $= 4$, we have $t_{.025} = 2.776$, which is considerably larger than $z_{.025} = 1.96$. Thus when $\sigma$ is unknown, the confidence estimation of $\mu$ based on a very small sample size is expected to produce a much less precise inference (namely, a wider confidence interval) compared to the situation when $\sigma$ is known. With increasing $n$, $\sigma$ can be more closely estimated by $s$, and the difference between $t_{\alpha/2}$ and $z_{\alpha/2}$ tends to diminish.
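To put a number on the comparison above, the sketch below computes the normal percentage point with Python's standard library and compares it with the tabled $t$ value. It is only an illustration: the value 2.776 is copied from the text, and the ratio describes the two interval lengths only in the hypothetical case where $s$ happens to equal $\sigma$.

```python
from statistics import NormalDist

# Upper .025 point of the standard normal distribution
z_025 = NormalDist().inv_cdf(0.975)

# Upper .025 point of t with 4 d.f., as quoted from the t table in the text
t_025_df4 = 2.776

# Ratio of the multipliers in 2*t*s/sqrt(n) and 2*z*sigma/sqrt(n)
ratio = t_025_df4 / z_025
print(f"z.025 = {z_025:.2f}, t.025 / z.025 = {ratio:.2f}")
```

With only 4 degrees of freedom, the $t$ multiplier is roughly 40% larger than the normal one.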
EXERCISES

3.1 Measurement of the amount of suspended solids in river water, on 15 Monday mornings, yields $\bar{x} = 47$ and $s = 9.4$. Obtain a 95% confidence interval for the mean amount of suspended solids. State any assumption you make about the population.

3.2 Determine a 99% confidence interval for $\mu$ using the data in Exercise 3.1.

3.3 In an investigation on toxins produced by molds that infect corn crops, a biochemist prepares extracts of the mold culture with organic solvents and then measures the amount of the toxic substance per gram of solution. From 9 preparations of the mold culture, the following measurements of the toxic substance (in milligrams) are obtained: 1.2, .8, .6, 1.1, 1.2, .9, 1.5, .9, 1.0.
(a) Calculate the mean $\bar{x}$ and the standard deviation $s$ from the data.
(b) Compute a 98% confidence interval for the mean weight of toxic substance per gram of mold culture. State the assumption you make about the population.

3.4 The time to blossom of 21 plants has $\bar{x} = 89$ days and $s = 5.1$ days. Give a 95% confidence interval for the mean time to blossom.

3.5 In a lake pollution study, the concentration of lead in the upper sedimentary layer of a lake bottom is measured from 25 sediment samples of 1000 cubic centimeters each. The sample mean and the standard deviation of the measurements are found to be .38 and .06, respectively. Compute a 99% confidence interval for the mean concentration of lead per 1000 cubic centimeters of sediment in the lake bottom.

*3.6 From a random sample of size 12 one has calculated the 95% confidence interval for $\mu$ and obtained the result (19.6, 26.2).
(a) What were the $\bar{x}$ and $s$ for that sample?
(b) Calculate the 98% confidence interval for $\mu$.
4. HYPOTHESES TESTS FOR $\mu$

The steps for conducting a test of hypotheses concerning a population mean were presented in the previous chapter. If the sample size is small, basically the same procedure can be followed provided it is reasonable to assume that the population distribution is normal. However, in the small-sample situation, our test statistic $\dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$ has Student's $t$ distribution with $n - 1$ degrees of freedom:

$$\text{Test statistic:}\quad t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}, \qquad \text{d.f.} = n - 1$$

The $t$ table (Appendix B, Table 4) is used to determine the rejection region.
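Hypotheses Tests for $\mu$: Small Samples

To test hypotheses concerning the mean of a normal population, the test statistic is

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

which has Student's $t$ distribution with $n - 1$ degrees of freedom.

$H_1\!: \mu > \mu_0 \qquad R\!: t \ge t_{\alpha}$
$H_1\!: \mu < \mu_0 \qquad R\!: t \le -t_{\alpha}$
$H_1\!: \mu \ne \mu_0 \qquad R\!: |t| \ge t_{\alpha/2}$

The test is called a Student's $t$-test, or simply a $t$-test.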
EXAMPLE 4

A city health department wishes to determine if the mean bacteria count per unit volume of water at a lake beach is within the safety level of 200. A researcher collected 10 water samples of unit volume and found the bacteria counts to be:

175, 190, 215, 198, 184, 207, 210, 193, 196, 180

Do the data strongly indicate that there is no cause for concern?

Let $\mu$ denote the current (population) mean bacteria count per unit volume of water. Then the statement "no cause for concern" translates to $\mu < 200$, and the researcher is seeking strong evidence in support of this hypothesis. So the formulation of the null and alternative hypotheses should be

$$H_0\!: \mu \ge 200 \quad \text{vs.} \quad H_1\!: \mu < 200$$

Since the counts are spread over a wide range, an approximation by a continuous distribution is not unrealistic for inference about the mean. Assuming further that the measurements constitute a sample from a normal population, we employ the $t$-test with

$$t = \frac{\bar{X} - 200}{s/\sqrt{10}}, \qquad \text{d.f.} = 9$$

Let us perform the test at the level of significance $\alpha = .01$. Since $H_1$ is left-sided, we set the rejection region $t < -t_{.01}$. From the $t$ table we find that $t_{.01}$ with d.f. $= 9$ is 2.821, so our rejection region is $R\!: t < -2.821$. Computations from the sample data yield $\bar{x} = 194.8$ and $s = 13.14$, so

$$t = \frac{194.8 - 200}{13.14/\sqrt{10}} = \frac{-5.2}{4.156} = -1.25$$

Because the observed value $t = -1.25$ is larger than $-2.821$, the null hypothesis is not rejected at $\alpha = .01$. On the basis of the data obtained from these 10 measurements, there does not seem to be strong evidence that the true mean is within the safety level.

Remark: The conclusion of a $t$-test can also be strengthened by reporting the significance probability ($P$-value) of the observed statistic. When using a normal test, the $P$-value could be readily calculated because the standard normal table is quite elaborate. Since the $t$ table provides only a few selected percentage points, we can get an idea about the $P$-value but not its exact determination. For instance, the data of Example 4 gave an observed $t = -1.25$ with d.f. $= 9$. Scanning the $t$ table for d.f. $= 9$, we notice that 1.25 lies between $t_{.10} = 1.383$ and $t_{.25} = .703$, so the $P$-value is greater than .10 but not as great as .25. Fortunately, we also have recourse to some computer package programs when exact $P$-values are needed.
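The test in Example 4 can be traced step by step from the summary statistics quoted there. The following is a sketch under stated assumptions: `t_statistic` is a hypothetical helper name, and the critical value 2.821 is taken from the $t$ table as given in the example, not computed.

```python
import math

def t_statistic(xbar, mu0, s, n):
    """Student's t statistic (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Example 4: n = 10, xbar = 194.8, s = 13.14, H0 boundary mu0 = 200
t_obs = t_statistic(194.8, 200, 13.14, 10)

# Left-sided test at alpha = .01: reject when t < -t.01 (t.01 = 2.821, 9 d.f.)
t_01 = 2.821
reject = t_obs < -t_01
print(f"t = {t_obs:.2f}, reject H0: {reject}")  # prints t = -1.25, reject H0: False
```

The observed value does not fall in the rejection region, in agreement with the conclusion of the example.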
EXERCISES

4.1 A random sample of size 20, from a normal population, has $\bar{x} = 182$ and $s = 2.3$. Test $H_0\!: \mu \le 181$ against $H_1\!: \mu > 181$ with $\alpha = .05$.

4.2 Refer to Exercise 3.1. The water quality is acceptable if the mean amount of suspended solids is less than 49. Construct an $\alpha = .05$ test to establish that the quality is acceptable.
(a) Specify $H_0$ and $H_1$.
(b) State the test statistic.
(c) What does the test conclude?

4.3 Refer to Exercise 3.4. Do these data provide strong evidence that the mean time to blossom is less than 42 days? Test with $\alpha = .01$.

4.4 Referring to Exercise 3.5, test the hypotheses $H_0\!: \mu = .34$ vs. $H_1\!: \mu \ne .34$ at $\alpha = .01$, where $\mu$ denotes the population mean concentration of lead per 1000 cubic centimeters of sediment.

4.5 Repeat Exercise 4.1 with $\alpha = .01$.

4.6 An accounting firm wishes to set a standard time, $\mu$, required by employees to complete a certain audit operation. Times from 18 employees yield $\bar{x} = 4.1$ hours and $s = 1.6$. Test $H_0\!: \mu = 3.5$ vs. $H_1\!: \mu > 3.5$ using $\alpha = .05$.

4.7 Referring to Exercise 4.6, test $H_0\!: \mu = 3.5$ vs. $H_1\!: \mu \ne 3.5$ using $\alpha = .02$.
5. RELATIONSHIP BETWEEN TESTS AND CONFIDENCE INTERVALS

By now the careful reader should have observed a similarity between the formulas we use in testing hypotheses and in estimation by a confidence interval. To clarify the link between these two concepts, let us consider again the inferences about the mean $\mu$ of a normal population.
A $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by

$$\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$

because the probability of

$$\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$

is $1 - \alpha$. On the other hand, the rejection region of a level $\alpha$ test for $H_0\!: \mu = \mu_0$ vs. the two-sided alternative $H_1\!: \mu \ne \mu_0$ is

$$R\!: \left|\frac{\bar{X} - \mu_0}{s/\sqrt{n}}\right| \ge t_{\alpha/2}$$

Let us use the name "acceptance region" to mean the opposite (or complement) of the rejection region. Reversing the inequality in $R$, we obtain

$$\text{Acceptance region:}\quad -t_{\alpha/2} < \frac{\bar{X} - \mu_0}{s/\sqrt{n}} < t_{\alpha/2}$$

which can also be written

$$\text{Acceptance region:}\quad \bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu_0 < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$

The latter expression shows that any given null hypothesis $\mu_0$ will be accepted (more precisely, will not be rejected) at level $\alpha$ if $\mu_0$ lies within the $100(1 - \alpha)\%$ confidence interval. Thus, having established a $100(1 - \alpha)\%$ confidence interval for $\mu$, we know at once that all possible null hypotheses $\mu_0$ lying outside this interval will be rejected at level of significance $\alpha$ and that all those lying inside will not be rejected.
EXAMPLE 5

A random sample of size $n = 9$ from a normal population has the mean $\bar{x} = 8.3$ and the standard deviation $s = 1.2$. Obtain a 95% confidence interval for $\mu$ and also test $H_0\!: \mu = 8.5$ vs. $H_1\!: \mu \ne 8.5$ with $\alpha = .05$.

A 95% confidence interval has the form

$$\left(\bar{X} - t_{.025}\frac{s}{\sqrt{n}},\ \ \bar{X} + t_{.025}\frac{s}{\sqrt{n}}\right)$$

where $t_{.025} = 2.306$ corresponds to $n - 1 = 8$ degrees of freedom. Using $\bar{x} = 8.3$ and $s = 1.2$, the interval then becomes

$$\left(8.3 - 2.306\frac{1.2}{\sqrt{9}},\ \ 8.3 + 2.306\frac{1.2}{\sqrt{9}}\right) = (7.38,\ 9.22)$$

Turning now to the problem of testing $H_0\!: \mu = 8.5$, we observe that the value 8.5 lies in the 95% confidence interval we have just calculated. Using the correspondence between confidence interval and acceptance region, we can at once conclude that $H_0\!: \mu = 8.5$ should not be rejected at $\alpha = .05$. Alternatively, a formal step-by-step solution can be based on the test statistic

$$t = \frac{\bar{X} - 8.5}{s/\sqrt{n}}$$

The rejection region consists of both large and small values:

$$\text{Rejection region:}\quad \left|\frac{\bar{X} - 8.5}{s/\sqrt{n}}\right| \ge t_{.025} = 2.306$$

Now the observed value $|t| = \sqrt{9}\,|8.3 - 8.5|/1.2 = .5$ does not fall in the rejection region, so the null hypothesis $H_0\!: \mu = 8.5$ is not rejected at $\alpha = .05$. This conclusion agrees with the one we arrived at from the confidence interval.

This relationship indicates how confidence estimation and tests of hypotheses with two-sided alternatives are really integrated in a common framework. A confidence interval statement is regarded as a more comprehensive inference procedure than testing a single null hypothesis, because a confidence interval statement in effect tests many null hypotheses at the same time.
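The correspondence between the confidence interval and the two-sided test can be verified numerically with the figures of Example 5. This is a minimal sketch: the helper name `check` and the extra trial values 7.0 and 9.5 for $\mu_0$ are assumptions made for illustration, and $t_{.025} = 2.306$ is the table value quoted in the example.

```python
import math

# Example 5: n = 9, xbar = 8.3, s = 1.2, t.025 with 8 d.f. = 2.306
n, xbar, s, t_025 = 9, 8.3, 1.2, 2.306
std_err = s / math.sqrt(n)

def check(mu0):
    """Return (mu0 inside the 95% CI, two-sided test rejects at alpha = .05)."""
    inside = xbar - t_025 * std_err < mu0 < xbar + t_025 * std_err
    rejected = abs(xbar - mu0) / std_err >= t_025
    return inside, rejected

for mu0 in (7.0, 8.5, 9.5):   # 7.0 and 9.5 are illustrative values only
    inside, rejected = check(mu0)
    print(f"mu0 = {mu0}: inside CI = {inside}, rejected = {rejected}")
```

For every trial value the two answers are opposite: a null value inside the interval is not rejected, and a null value outside it is rejected.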
EXERCISES

5.1 Based on a random sample of size 18 from a normal distribution, an investigator finds that the 95% confidence interval $\bar{x} - 2.11s/\sqrt{18}$ to $\bar{x} + 2.11s/\sqrt{18}$ equals 17.1 to 29.3.
(a) What is the conclusion of the $t$-test for $H_0\!: \mu = 19$ versus $H_1\!: \mu \ne 19$ at level $\alpha = .05$?
(b) What is the conclusion if $H_0\!: \mu = 29.8$?

5.2 In Example 3, the 90% confidence interval for the mean tensile strength of an alloy was found to be (38.12, 40.48).
(a) What is the conclusion of testing $H_0\!: \mu = 39$ versus $H_1\!: \mu \ne 39$ at level $\alpha = .10$?
(b) What is the conclusion if $H_0\!: \mu$ …

5.3 Establish the connection between the large sample $Z$-test, which rejects $H_0\!: \mu = \mu_0$ in favor of $H_1\!: \mu \ne \mu_0$, at $\alpha = .05$, if

$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \le -1.96 \quad \text{or} \quad Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \ge 1.96$$

and the 95% confidence interval

$$\bar{X} - 1.96\frac{s}{\sqrt{n}} \quad \text{to} \quad \bar{X} + 1.96\frac{s}{\sqrt{n}}$$
6. INFERENCES ABOUT THE STANDARD DEVIATION $\sigma$ (The CHI-SQUARE Distribution)

Aside from inferences about the population mean, the population variability may also be of interest. Apart from the record of a baseball player's batting average, information on the variability of the player's performance from one game to the next may be an indicator of reliability. Uniformity is often a criterion of production quality for a manufacturing process. The quality control engineer must ensure that the variability of the measurements does not exceed a specified limit. It may also be important to ensure sufficient uniformity of the input raw material for trouble-free operation of the machines. In this section, we consider inferences for the standard deviation $\sigma$ of a population under the assumption that the population distribution is normal. In contrast to the inference procedures concerning the population mean $\mu$, the usefulness of the methods to be presented here is extremely limited when this assumption is violated.

To make inferences about $\sigma^2$, the natural choice of a statistic is its sample analog, which is the sample variance

$$s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$$

We take $s^2$ as the point estimator of $\sigma^2$ and its square root, $s$, as the point estimator of $\sigma$. To estimate by confidence intervals and to test hypotheses, we must consider the sampling distribution of $s^2$. To do this, we introduce a new distribution, called the $\chi^2$ distribution (read "chi-square distribution"), whose form depends on $n - 1$.
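$\chi^2$ Distribution

Let $X_1, \ldots, X_n$ be a random sample from a normal population, $N(\mu, \sigma)$. Then the distribution of

$$\chi^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma^2} = \frac{(n - 1)s^2}{\sigma^2}$$

is called the $\chi^2$ distribution with $n - 1$ degrees of freedom.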
Unlike the normal or $t$ distribution, the probability density curve of a $\chi^2$ distribution is an asymmetric curve stretching over the positive side of the line and having a long right tail. The form of the curve depends on the value of the degrees of freedom. A typical $\chi^2$ curve is illustrated in Figure 6.

[Figure 6: a right-skewed density curve starting at 0 on the $\chi^2$ axis.]

Figure 6 Probability density curve of a $\chi^2$ distribution.
Appendix B, Table 5 provides the upper $\alpha$ points of $\chi^2$ distributions for various values of $\alpha$ and the degrees of freedom. As in both the cases of the $t$ and the normal distributions, the upper $\alpha$ point $\chi^2_{\alpha}$ denotes the $\chi^2$ value such that the area to the right is $\alpha$. The lower $\alpha$ point or $100\alpha$th percentile, read from the column $\chi^2_{1-\alpha}$ in the table, has an area $1 - \alpha$ to the right. For example, the lower .05 point is obtained from the table by reading the $\chi^2_{.95}$ column, whereas the upper .05 point is obtained by reading the column $\chi^2_{.05}$.
EXAMPLE 6

Find the upper .05 point of the $\chi^2$ distribution with 17 degrees of freedom. Also find the lower .05 point.

Percentage Points of the $\chi^2$ Distribution (Appendix B, Table 5)

    d.f.    α = .95    ...    α = .05
    17      8.67       ...    27.59

The upper .05 point is read from the column labeled $\alpha = .05$. We find $\chi^2_{.05} = 27.59$ for 17 d.f. The lower .05 point is read from the column $\alpha = .95$. We find $\chi^2_{.95} = 8.67$, which is the lower .05 point.

The $\chi^2$ is the basic distribution for constructing confidence intervals for $\sigma^2$ or $\sigma$. We outline the steps in terms of a 95% confidence interval for $\sigma^2$. Dividing the probability $\alpha = .05$ equally between the two tails of the $\chi^2$ distribution and using the notation just explained, we have

$$P\left[\chi^2_{.975} < \frac{(n - 1)s^2}{\sigma^2} < \chi^2_{.025}\right] = .95$$

where the percentage points are read from the $\chi^2$ table at d.f. $= n - 1$. Because $(n - 1)s^2/\sigma^2 < \chi^2_{.025}$ is equivalent to $(n - 1)s^2/\chi^2_{.025} < \sigma^2$, and $\chi^2_{.975} < (n - 1)s^2/\sigma^2$ is equivalent to $\sigma^2 < (n - 1)s^2/\chi^2_{.975}$, we have

$$P\left[\frac{(n - 1)s^2}{\chi^2_{.025}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{.975}}\right] = .95$$

This last statement, concerning a random interval covering $\sigma^2$, provides a 95% confidence interval for $\sigma^2$. A confidence interval for $\sigma$ can be obtained by taking the square root of the end points of the interval. For a confidence level .95, the interval for $\sigma$ becomes

$$\left(s\sqrt{\frac{n - 1}{\chi^2_{.025}}},\ \ s\sqrt{\frac{n - 1}{\chi^2_{.975}}}\right)$$
EXAMPLE 7

A precision watchmaker wishes to learn about the variability of his product. To do this, he decides to obtain a confidence interval for $\sigma$ based on a random sample of 10 watches selected from a much larger number of watches that have passed the final quality check. The deviations of these 10 watches from a standard clock are recorded at the end of one month and the following statistics calculated:
v
$s = .4$ second

Assuming that the distribution of the measurements can be modeled as a normal distribution, find a 90% confidence interval for $\sigma$.

Here $n = 10$, so d.f. $= n - 1 = 9$. The $\chi^2$ table gives $\chi^2_{.95} = 3.33$ and $\chi^2_{.05} = 16.92$. Using the preceding formula, a 90% confidence interval for $\sigma^2$ is

$$\left(\frac{9 \times (.4)^2}{16.92},\ \ \frac{9 \times (.4)^2}{3.33}\right) = (.085,\ .432)$$

and the corresponding interval for $\sigma$ is $(\sqrt{.085},\ \sqrt{.432}) = (.29,\ .66)$. The watchmaker can be 90% confident that $\sigma$ is between .29 and .66 second, because 90% of the intervals calculated by this procedure in repeated samples will cover the true $\sigma$.

It is instructive to note that the midpoint of the confidence interval for $\sigma^2$ in Example 7 is not $s^2 = .16$, which is the best point estimate. This is in sharp contrast to the confidence intervals for $\mu$, and it serves to accent the difference in logic between interval and point estimation.

For a test of the null hypothesis $H_0\!: \sigma^2 = \sigma_0^2$ it is natural to employ the statistic $s^2$. If the alternative hypothesis is one-sided, say $H_1\!: \sigma^2 > \sigma_0^2$, then the rejection region should consist of large values of $s^2$, or alternatively large values of the

$$\text{Test statistic:}\quad \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}, \qquad \text{d.f.} = n - 1$$

The rejection region of a level $\alpha$ test is therefore

$$R\!: \frac{(n - 1)s^2}{\sigma_0^2} \ge \chi^2_{\alpha}$$

For a two-sided alternative $H_1\!: \sigma^2 \ne \sigma_0^2$, a level $\alpha$ rejection region is

$$R\!: \frac{(n - 1)s^2}{\sigma_0^2} \le \chi^2_{1-\alpha/2} \quad \text{or} \quad \frac{(n - 1)s^2}{\sigma_0^2} \ge \chi^2_{\alpha/2}$$

Once again we remind the reader that the inference procedures for $\sigma$ presented in this section are extremely sensitive to departures from a normal population.
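The interval computed in Example 7 can be reproduced as follows. This is an illustrative sketch: `sigma_confidence_interval` is a hypothetical helper name, and the percentage points 3.33 and 16.92 are the $\chi^2$ table values for 9 d.f. quoted in the example, passed in by hand rather than computed.

```python
import math

def sigma_confidence_interval(s, n, chi2_lower, chi2_upper):
    """Confidence intervals for sigma^2 and sigma from the chi-square points.

    chi2_lower and chi2_upper are the lower and upper percentage points
    read from the chi-square table at d.f. = n - 1.
    """
    df = n - 1
    var_lo = df * s**2 / chi2_upper   # dividing by the larger point gives the lower limit
    var_hi = df * s**2 / chi2_lower
    return (var_lo, var_hi), (math.sqrt(var_lo), math.sqrt(var_hi))

# Example 7: n = 10, s = .4, chi-square .95 point = 3.33, .05 point = 16.92
var_ci, sd_ci = sigma_confidence_interval(0.4, 10, 3.33, 16.92)
print(f"90% CI for sigma^2: ({var_ci[0]:.3f}, {var_ci[1]:.3f})")
print(f"90% CI for sigma:   ({sd_ci[0]:.2f}, {sd_ci[1]:.2f})")
```

Note that the upper percentage point produces the lower confidence limit, which is why the interval is not centered at $s^2$.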
EXERCISES

6.1 Using the table for the $\chi^2$ distribution, find:
(a) The upper 5% point when d.f. …
(b) The upper 1% point when d.f. $= 9$.
(c) The lower 2.5% point when d.f. $= 16$.
(d) The lower 1% point when d.f. $= 10$.

6.2 Name the $\chi^2$ percentiles shown and find their values from Appendix B, Table 5.
(a), (b) [Figures: $\chi^2$ density curves with shaded areas; one area is labeled 0.05.]
(c) Find the percentile in part (a) if d.f. …
(d) Find the percentile in part (b) if d.f. …

6.3 Find a 90% confidence interval for $\sigma$ based on the $n = 40$ measurements of heights of red pine seedlings given in Table 1 of Chapter 9. (Note: $s = .475$ for this data set. State any assumption you make about the population.)

6.4 Refer to Exercise 6.3. A related species has population standard deviation $\sigma = .6$. Do the data provide strong evidence that the red pine population standard deviation is smaller than .6? Test with $\alpha = .05$.

6.5 Plastic sheets produced by a machine are periodically monitored for possible fluctuations in thickness. Uncontrollable heterogeneity in the viscosity of the liquid mold makes some variation in thickness measurements unavoidable. However, if the true standard deviation of thickness exceeds 1.5 millimeters, there is cause to be concerned about the product quality. Thickness measurements (in millimeters) of 10 specimens produced on a particular shift resulted in the following data:

2 2 6 , 2 2 8 , 2 2 6 , 2 2 5 , z B 2 , z 2 B ,z z \ , z z g , z z s , zB0

Do the data substantiate the suspicion that the process variability exceeded the stated level on this particular shift? (Test at $\alpha = .05$.) State the assumption you make about the population distribution.

6.6 Refer to Exercise 6.5. Construct a 95% confidence interval for the true standard deviation of the thickness of sheets produced on this shift.
7. ROBUSTNESS OF INFERENCE PROCEDURES

The small-sample methods for both confidence interval estimation and hypothesis testing presuppose that the sample is obtained from a normal population. Users of these methods would naturally ask:
(a) What method can be used to determine if the population distribution is nearly normal?
(b) What can go wrong if the population distribution is non-normal?
(c) What procedures should be used if it is non-normal?
(d) If the observations are not independent, is this serious?

(a) To answer the first question, we could construct the dot diagram or normal scores plot. These may indicate a wild observation or a striking departure from normality. If none of these is visible, the investigator would feel more secure using the preceding inference procedures. However, a small-sample plot cannot provide convincing justification for normality. Lacking sufficient observations to justify or to refute the normal assumption, we are led to a consideration of the second question.

(b) Confidence intervals and tests of hypotheses concerning $\mu$ are based on Student's $t$ distribution. If the population is non-normal, the actual percentage points may differ substantially from the tabulated values. When we say that $\bar{X} \pm t_{.025}s/\sqrt{n}$ is a 95% confidence interval for $\mu$, the true probability that this random interval will contain $\mu$ may be, say, 85% or 99%. Fortunately, the effects on inferences about $\mu$, using the $t$ statistic, are not too serious if the sample size is at least moderately large (say 15). In larger samples such disturbances tend to disappear due to the central limit theorem. We express this fact by saying that inferences about $\mu$ using the $t$ statistic are reasonably "robust." However, this qualitative discussion should not be considered a blanket endorsement for $t$. When the sample size is small, a wild observation or a distribution with long tails can produce misleading results.

Unfortunately, inferences about $\sigma$ using the $\chi^2$ distribution may be seriously affected by non-normality even with large samples. We express this by saying that inferences about $\sigma$ using the $\chi^2$ distribution are not "robust" against departures of the population distribution from normality.

(c) We cannot give a specific answer to the third question without knowing something about the nature of the non-normality. Dot diagrams or histograms of the original data may suggest some transformations that will bring the shape of the distribution closer to normality. If it is possible to obtain a transformation that leads to reasonably normal data plots, the problem can then be recast in terms of the transformed data. Otherwise, one can benefit from consulting with a statistician.

(d) A basic assumption throughout Chapters 9 and 10 is that the sample is drawn at random, so that the observations are independent of one another. If the sampling is made in such a manner that the observations are dependent, however, all the inferential procedures we discussed here for small as well as large samples may be seriously in error. This applies to both the level of significance of a test and a stated confidence level. Concerned with the possible effect of a drug on blood pressure, suppose an investigator includes 5 patients in an experiment and makes 4 successive measurements on each. This does not yield a random sample of size $5 \times 4 = 20$, because the 4 measurements made on each person are likely to be dependent. This type of sampling requires a more sophisticated method of analysis. An investigator who is sampling opinions about a political issue may choose 100 families at random and record the opinions of both the husband and wife in each family. This also does not provide a random sample of size $100 \times 2 = 200$, although it may be a convenient sampling method. When measurements are made closely together in time or distance, there is a danger of losing independence because adjacent observations are more likely to be similar than observations that are made farther apart. Because independence is the most crucial assumption, we must be constantly alert to detect such violations. Prior to a formal analysis of the data, a close scrutiny of the sampling process is imperative.
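The sensitivity of the $\chi^2$ procedures to non-normality, noted in part (b), can be illustrated by a small simulation. The sketch below is not from the text and rests on several assumptions: the exponential distribution is chosen arbitrarily as an example of a long-tailed population, the sample size $n = 10$ and the table values 3.33 and 16.92 (9 d.f.) are borrowed from Example 7, and the helper name `sigma_coverage` and the seed are hypothetical.

```python
import math
import random
import statistics

CHI2_95_DF9 = 3.33    # lower .05 point of chi-square, 9 d.f. (table value)
CHI2_05_DF9 = 16.92   # upper .05 point of chi-square, 9 d.f. (table value)

def sigma_coverage(draw, true_sigma, reps=10000, n=10, seed=1):
    """Fraction of nominal 90% chi-square intervals for sigma that cover true_sigma.

    draw(rng) must return one observation from the population under study.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        s = statistics.stdev([draw(rng) for _ in range(n)])
        lower = s * math.sqrt((n - 1) / CHI2_05_DF9)
        upper = s * math.sqrt((n - 1) / CHI2_95_DF9)
        if lower < true_sigma < upper:
            hits += 1
    return hits / reps

# Normal population with sigma = 1: coverage should be close to the nominal .90
normal_rate = sigma_coverage(lambda rng: rng.gauss(0.0, 1.0), 1.0)
# Exponential population with rate 1 (so sigma = 1): an assumed long-tailed example
skewed_rate = sigma_coverage(lambda rng: rng.expovariate(1.0), 1.0)
print(f"normal: {normal_rate:.3f}, exponential: {skewed_rate:.3f}")
```

In runs of this kind the coverage under the normal population stays near .90, while under the long-tailed population it falls well below the stated confidence level.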
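KEY IDEAS

When the sample size is small, additional conditions need to be imposed on the population. In this chapter, we assume normal populations.

Inferences about the mean of a normal population are based on

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

which is distributed as Student's $t$ with $n - 1$ degrees of freedom.

Inferences about the standard deviation of a normal population are based on $(n - 1)s^2/\sigma^2$, which has a $\chi^2$ distribution with $n - 1$ degrees of freedom.

Moderate departures from a normal population distribution do not seriously affect inferences based on $t$. These procedures are robust. Non-normality can seriously affect inferences about $\sigma$.

Inferences about a Normal Population Mean

When $n$ is small, we assume that the population is approximately normal. Inference procedures are derived from the sampling distribution of Student's $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$.

(i) A $100(1 - \alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$

(ii) To test hypotheses about $\mu$, the test statistic is

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

Given a level of significance $\alpha$,

Reject $H_0\!: \mu = \mu_0$ in favor of $H_1\!: \mu > \mu_0$ if $t \ge t_{\alpha}$
Reject $H_0\!: \mu = \mu_0$ in favor of $H_1\!: \mu < \mu_0$ if $t \le -t_{\alpha}$
Reject $H_0\!: \mu = \mu_0$ in favor of $H_1\!: \mu \ne \mu_0$ if $|t| \ge t_{\alpha/2}$

Inferences about a Normal Population Standard Deviation

Inferences are derived from the $\chi^2$ distribution for $(n - 1)s^2/\sigma^2$.

(i) Point estimator of $\sigma$ is the sample standard deviation $s$.

(ii) A 95% confidence interval for $\sigma$ is

$$\left(s\sqrt{\frac{n - 1}{\chi^2_{.025}}},\ \ s\sqrt{\frac{n - 1}{\chi^2_{.975}}}\right)$$

(iii) To test hypotheses about $\sigma$, the test statistic is

$$\frac{(n - 1)s^2}{\sigma_0^2}$$

Given a level of significance $\alpha$,

Reject $H_0\!: \sigma = \sigma_0$ in favor of $H_1\!: \sigma < \sigma_0$ if $\dfrac{(n - 1)s^2}{\sigma_0^2} \le \chi^2_{1-\alpha}$
Reject $H_0\!: \sigma = \sigma_0$ in favor of $H_1\!: \sigma > \sigma_0$ if $\dfrac{(n - 1)s^2}{\sigma_0^2} \ge \chi^2_{\alpha}$
Reject $H_0\!: \sigma = \sigma_0$ in favor of $H_1\!: \sigma \ne \sigma_0$ if $\dfrac{(n - 1)s^2}{\sigma_0^2} \le \chi^2_{1-\alpha/2}$ or $\dfrac{(n - 1)s^2}{\sigma_0^2} \ge \chi^2_{\alpha/2}$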
8. EXERCISES

8.1 Using the table of percentage points for the $t$ distributions, find:
(a) $t_{.05}$ when d.f. $= 14$.
(b) $t_{.025}$ when d.f. …
(c) The lower .05 point when d.f. $= 14$.
(d) … .05 point when d.f. $= 19$.

8.2 A $t$ distribution assigns more probability to large values than does the standard normal.
(a) Find $t_{.05}$ for d.f. $= 15$ and then evaluate $P[Z > t_{.05}]$. Verify that $P[Z > t_{.05}]$ is smaller than $P[t > t_{.05}]$.
(b) Examine the relation for d.f. $= 5$ and 20, and comment.

8.3 A random sample of $n = 20$ from a normal population gives $\bar{x} = 140$ and $s$ …
(a) Construct a 98% confidence interval for the population mean.
(b) What is the length of this confidence interval? What is its center?
(c) If a 98% confidence interval were calculated from another random sample of $n = 20$, would it have the same length as that found in part (b)? Why or why not?

8.4 In a reconnaissance study, the concentration of uranium was measured at 13 locations under the Darby mountains in Alaska. The measurements are:

7.92, 10.29, rg.gg, 17.73, 10.36, 13.50, g.gl, 5.19, 7.02, 11.71, 9.33, 9.32, 14.61

(Source: T. Miller and C. Bunker, Journal of Research, U.S. Geological Survey (1976), p. 367-377.) Determine a 95% confidence interval for the mean concentration of uranium in that region.

8.5 An experimenter studying the feasibility of extracting protein from seaweed to use in animal feed makes lg determinations of the protein extract, each based on a different 50-kilogram sample of seaweed. The sample mean and the standard deviation are found to be 3.6 kilograms and .8 kilogram, respectively. Determine a 95% confidence interval for the mean yield of protein extract per 50 kilograms of seaweed.

8.6 Henry Cavendish (1731-1810) provided direct experimental evidence of Newton's law of universal gravitation, which specifies the force of attraction between two masses. In an experiment with known masses determined by weighing, the measured force can also be used to calculate a value for the density of the earth. The values of the earth's density from Cavendish's renowned experiment, in time order by column, are:

    5.36  5.29  5.58  5.65  5.57  5.53
    5.62  5.29  5.44  5.34  5.79  5.10
    5.27  5.39  5.42  5.47  5.63  5.34
    5.46  5.30  5.75  5.68  5.85

(These data were published in Philosophical Transactions, Vol. 17, 1798, p. 459.) Find a 99% confidence interval for the density of the earth.

8.7 District court records provided data on sentencing for 19 criminals convicted of negligent homicide. The mean and standard deviation of the sentences were found to be 72.7 months and 10.2 months, respectively. Determine a 95% confidence interval for the mean sentence for this crime.

8.8 Measurements of the acidity (pH) of rain samples were recorded at 13 sites in an industrial region:

3.5, 5.1, 5.0, 3.6, 4.8, 3.6, 4.7, 4.3, 4.2, 4.5, 4.9, 4.7, 4.8

Determine a 95% confidence interval for the mean acidity of rain in that region.

8.9 A random sample of 16 observations provided $\bar{x} = 182$ and $s = 12$.
(a) Test $H_0\!: \mu = 190$ vs. $H_1\!: \mu < 190$ at $\alpha = .05$. State your assumption about the population distribution.
(b) What can you say about the $P$-value of the test statistic calculated in part (a)?

8.10 The supplier of a particular brand of vitamin pills claims that the average potency of these pills after a certain exposure to heat and humidity is at least 65. Before buying these pills, a distributor wants to check if the supplier's claim is valid. To this end the distributor will choose a random sample of 9 pills from a batch and measure their potency after the specified exposure.
(a) Formulate the hypotheses about the mean potency $\mu$.
(b) Determine the rejection region of the test with $\alpha = .05$. State any assumption you make about the population.
(c) The data are 63, 72, 64, 69, 59, 65, 66, 64, 65. Apply the test and state your conclusion.

8.11 A weight loss program advertises "LOSE 40 POUNDS IN 4 MONTHS." A random sample of $n = 25$ customers has $\bar{x} = 32$ pounds lost and $s = 12$. Test $H_0\!: \mu = 40$ against $H_1\!: \mu < 40$ with $\alpha = .05$.

8.12 A car advertisement asserts that with the new collapsible bumper system, the average body repair cost for the damages sustained in a collision impact of 10 miles per hour does not exceed $800. To test the validity of this claim, 5 cars are crashed into a stone barrier at an impact force of 10 miles per hour and their subsequent body repair costs are recorded. The mean and the standard deviation are found to be $858 and $45, respectively. Do these data strongly contradict the advertiser's claim?
8 .1 3 Combustion efficiency measurements were recorded for 10 home heating furnaces of a new model. The sample mean and standard deviation were found to be 73.2 and 2.74, respectively. Do these results provide strong evidence that the average efficiency of the new model is higher than 7Ol (Test at a the P-value.)
8 . 1 4A physical model suggests that the mean temperature increase in the water used as coolant in a compressor chamber should not be more than 5'C. Temperature increases in the coolant measured on 8 independent runs of the compressing unit revealed the following d a t a :6 . 4 , 4 . 3 , 5 . 7 , 4 . 9 , 6 . 5 , 5 . 9 , 6 . 4 , 5 . 1 . (a) Do the data contradict the assertion of the physical model? (Test at ct assumption you make about the population.
(b) Determine a 95o/"confidence interval for the mean increaseof the temperature in the coolant. 8.15 Five years ago, the averagesize of farms in a state was 160 acres. From a recent survey of 27 farms, the mean and standard deviation were found to be 180 acres and 3d acres,respectively. (a) Is there strong evidence that the averagefarm size is rarger than what it was 5 years ago? (b) Give a 98% confidence interval for the current average size. 8.15 The mean drying time of a brand of spray paint is known to be 90 seconds. The research division of the company that produces this paint contemplates that adding a new chemical ingredient to the paint will acceleratethe drying process.To investigate this conjecture, the paint with the chemical additions is sprayed on 15 surfaces and the drying times are recorded. The mean and the standard deviation computed from these measurements are g6 seconds and 4.5 seconds,respectively. (a) Do these data provide strong evidence that the mean drying time is reduced by the addition of the new chemical? (b) Construct a98oh confidence interval for the mean drying time of the paint with the chemical additive. 8.17 Rock specimensare excavatedfrom a particular geologicalformation and are subiected to a chemical analysis to determine their percentagecontent of cadmium. After anaLyzing25 specimens, the mean and the standard deviation are found to be 10.2 and 3.1, respectively. Supposethat a commercial extraction of this minerar will be economically feasible if the mean percentage content is at least 8.
(a) Do the data strongly support the feasibility of commercial extraction? (Test at $\alpha = .01$.)
(b) Construct a 99% confidence interval for the mean percentage content of cadmium in the geological formation.

8.18 The number of days to maturity was recorded for 25 plants grown from seeds of a single stock. The mean and standard deviation were $\bar{x} = 58.4$ days and $s = 5.5$ days.
(a) Do these results contradict the claim that the average maturity time is 65 days for this stock?
(b) Construct a 95% confidence interval for the mean maturity time.
(c) Construct a 90% confidence interval for $\sigma$.

8.19 In a metropolitan area, the concentrations of cadmium (Cd) and zinc (Zn) in leaf lettuce were measured at six representative gardens where sewage sludge was used for fertilizer. The following measurements (in milligrams/kilogram of dry weight) were obtained:
Cd:
21, 38, 12, 15, 14
Zn: 140, 190, 130, 150, 160, 140
(a) Obtain 95% confidence intervals for the mean concentrations of Cd and Zn.
(b) Is there strong evidence that the mean concentration of Cd is higher than 12?

8.20 A few years ago, noon bicycle traffic past a busy section of campus had a mean of $\mu = 300$. To see if any change in traffic has occurred, a count was taken for a sample of 19 week days. It was found that $\bar{x} = 340$ and $s = 30$.
(a) Construct an $\alpha = .05$ test of $H_0\colon \mu = 300$ against the alternative that some change has occurred.
(b) Obtain a 95% confidence interval for $\mu$.
(c) Is the confidence interval consistent with your conclusion in part (a)?
(d) What would the $\alpha = .05$ test of …

8.21 Calculating from a random sample, suppose you determine that the 95% confidence interval for $\mu$ is (42.6, 51.5).
(a) If with the same data you were to test $H_0\colon \mu = 50$ vs. $H_1\colon \mu \ne 50$ at $\alpha = .05$, what would be your conclusion? Explain.
*(b) If with the same data you were to test $H_0\colon \mu = 50$ vs. $H_1\colon \mu \ne 50$ at $\alpha = .01$, what would be your conclusion? Explain.
8.22 Using the table of percentage points of the $\chi^2$ distribution, find:
(a) $\chi^2_{.05}$ with d.f. = 4.
(b) $\chi^2_{.025}$ with d.f. = 25.
(c) Lower .05 point with d.f. = 4.
(d) Lower .025 point with d.f. = 25.

8.23 Test $H_0\colon \sigma = 10$ vs. $H_1\colon \sigma > 10$ with $\alpha = .05$ in each case:
(a) $n = 25$, $\sum (x_i - \bar{x})^2 = 4016$.
(b) $n = 15$, $s = 12$.
(c) $n$ … 110, 126, 131, 14g, 156, 165.

8.24 Given the sample data 12, 19, g, 15, 14
(a) Obtain a point estimate of the population standard deviation $\sigma$.
(b) Construct a 95% confidence interval for $\sigma$.
(c) Examine whether or not your point estimate is located at the center of the confidence interval.

8.25 Refer to the data of Exercise 8.13. Is there strong evidence that the standard deviation for the efficiency of the ioa.i is below 3.0?

8.26 Referring to Exercise 8.17, one other indicator of the quality of an ore is the uniformity of its mineral content. Suppose that the ore quality is considered satisfactory if the true standard deviation $\sigma$ of the percentage content of cadmium does not exceed 4. Using the data given in Exercise 8.17:
(a) Construct a test to determine if there is strong evidence that $\sigma$ is smaller than 4.
(b) Construct a 98% confidence interval for $\sigma$.
8.27 From a random sample, a 90% confidence interval for the population standard deviation $\sigma$ was found to be (g.6, 15.3). With the same data, what would be the conclusion of testing $H_0\colon \sigma = 7$ vs. $H_1\colon \sigma \ne 7$ at $\alpha = .10$?

*8.28 From a data set of $n = 10$ observations, one has calculated the 95% confidence interval for $\sigma$ and obtained the result (.g1, 8.22).
(a) What was the standard deviation $s$ for that sample? (Hint: Examine how $s$ enters into the formula of a confidence interval.)
(b) Calculate a 90% confidence interval for $\sigma$.

8.29 Computer exercise. The calculation of confidence intervals and tests about $\mu$ can be conveniently done by computer. In MINITAB the commands

    SET IN C1
    2,7,3,6
    TINTERVAL 95 PERCENT C1

produce the output

          N     MEAN    STDEV   SE MEAN    95.0 PERCENT C.I.
    C1    4     4.50     2.38       1.2    (  0.7,   8.3)
The additional command

    TTEST MU=2.5 C1

produces output for a two-sided test of $H_0\colon \mu = 2.5$ vs. $H_1\colon \mu \ne 2.5$.

    TEST OF MU = 2.50 VS MU N.E. 2.50

          N     MEAN    STDEV   SE MEAN       T    P VALUE
    C1    4     4.50     2.38       1.2    1.68       0.19
(a) What is the conclusion at $\alpha$ …
(b) Locate the significance probability in the output.
(c) Using the data in Exercise 8.6, find a 95% confidence interval for the density of the earth.

8.30 Computer exercise. Conduct a simulation experiment to verify the long-run coverage property of the confidence intervals $\bar{x} - t_{\alpha/2}\, s/\sqrt{n}$ to $\bar{x} + t_{\alpha/2}\, s/\sqrt{n}$. Generate $n = 7$ normal observations having mean 100 and standard deviation 8. Calculate the 95% confidence interval $\bar{x} - t_{.025}\, s/\sqrt{7}$ to $\bar{x} + t_{.025}\, s/\sqrt{7}$. Repeat a large number of times. Students may combine their results. Graph the intervals as in Figure 5 and find the proportion of intervals that cover the true mean $\mu = 100$. Notice how the lengths vary. In MINITAB, the command NOPRINT followed by

    NRANDOM 7 MU=100 SIGMA=8 PUT IN C1
    TINTERVAL 95 PERCENT C1

provides the output

          N     MEAN    STDEV   SE MEAN    95.0 PERCENT C.I.
    C1    7   104.99     7.07       2.7    ( 98.5, 111.5)

These last two commands can be repeated to obtain another realization of the confidence interval.
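The simulation of Exercise 8.30 can also be sketched outside MINITAB. The Python fragment below is one possible version and is not part of the exercise itself: the function names `t_interval` and `coverage`, the fixed seed, and the number of repetitions are our own choices, and 2.447 is the tabled upper .025 point of the t distribution with 6 degrees of freedom.

```python
import math
import random
import statistics

T_025_DF6 = 2.447  # upper .025 point of t with 6 d.f., read from a t table


def t_interval(sample, t_crit):
    """Return (lower, upper) for x_bar -/+ t_crit * s / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divisor n - 1
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


def coverage(repeats, n=7, mu=100.0, sigma=8.0, seed=1):
    """Proportion of simulated 95% intervals that cover the true mean mu."""
    rng = random.Random(seed)  # assumed seed, only to make runs repeatable
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = t_interval(sample, T_025_DF6)
        if lower <= mu <= upper:
            hits += 1
    return hits / repeats


if __name__ == "__main__":
    print(coverage(10_000))
```

With many repetitions the printed proportion should settle near .95, while the lengths of the individual intervals vary from sample to sample.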
CHAPTER

Comparing Two Treatments

1. INTRODUCTION
2. INDEPENDENT RANDOM SAMPLES FROM TWO POPULATIONS
3. RANDOMIZATION AND ITS ROLE IN INFERENCE
4. MATCHED PAIR COMPARISONS
5. CHOOSING BETWEEN INDEPENDENT SAMPLES AND A MATCHED PAIR SAMPLE

Scientists Isolate Natural Sleep Potion

Boston (AP) - Scientists studying the process of sleep have isolated a natural human chemical they believe is nature's own sleeping potion. The chemical, called factor S, was discovered by Harvard Medical School researchers, who found that it puts animals into deep but normal sleep. They said the chemical may someday be used to treat human insomnia, but they cautioned that many years of testing will be necessary before it could be given to people. The isolation and analysis of factor S culminates 15 years of work by Harvard professors John Pappenheimer and Manfred Karnovsky.

In one series of verification trials (Source: Journal of Biological Chemistry, Vol. 257, pp. 1664-1659, 1982), 21 rabbits were administered a compound containing factor S and 21 others received the same compound without factor S. The percentage of time asleep was recorded.

Percentage of Time Asleep
                      Mean              Standard error
Without factor S      $\bar{x} = 43$    $s_1/\sqrt{21} = 4$
With factor S         $\bar{y} = 63$    $s_2/\sqrt{21} = 4$
1. INTRODUCTION

In virtually every area of human activity, new procedures are invented and existing techniques are revised. Advances occur whenever a new technique proves to be better than the old. To compare them we conduct experiments, collect data about their performance, and then draw conclusions from statistical analyses. The manner in which sample data are collected, called an experimental design or sampling design, is crucial to an investigation. In this chapter, we introduce two experimental designs that are most fundamental to a comparative study. As we shall see, the methods of analyzing the data are quite different for these two processes of sampling. First, we outline a few illustrative situations where a comparison of two methods requires statistical analysis of data.
EXAMPLE 1

Agricultural Field Trials

To ascertain if a new strain of seeds actually produces a higher yield per acre compared to a current major variety, field trials must be performed by planting each variety under appropriate farming conditions. A record of crop yields from the field trials will form the data base for making a comparison between the two varieties. It may also be desirable to compare the varieties at several geographic locations representing different climate or soil conditions.

EXAMPLE 2
Drug Evaluation

Pharmaceutical researchers strive to synthesize chemicals to improve their efficiency in curing diseases. New chemicals may result from educated guesses concerning potential biological reactions, but evaluations must be based on their effects on diseased animals or human beings. To compare the effectiveness of two drugs in controlling tumors in mice, several mice of an identical breed may be taken as experimental subjects. After infecting them with cancer cells, some will be subsequently treated with drug 1 and others with drug 2. The data of tumor sizes for the two groups will then provide a basis for comparing the drugs.

When testing the drugs on human subjects, the experiment takes a different form. Artificially infecting them with cancer cells is absurd! In fact, it would be criminal. Instead, the drugs will be administered to cancer patients who are available for the study. In contrast with a pool of mice of an "identical breed," here the available subjects may be of varying conditions of general health, prognosis of the disease, and other factors.

When discussing a comparative study, the common statistical term treatment is used to refer to the things that are being compared. The basic units that are exposed to one treatment or another are called experimental units or experimental subjects, and the characteristic that is recorded after the application of a treatment to a subject is called the response. For instance, the two treatments in Example 1 are the two varieties of seeds, the experimental subjects are the agricultural plots, and the response is yield.

The term experimental design refers to the manner in which subjects are chosen and assigned to treatments. For comparing two treatments, the two basic types of design are:

I. Independent samples
II. Matched pair sample

The case of independent samples arises when the subjects are randomly divided into two groups, one group is assigned to treatment 1 and the other to treatment 2.
The response measurements for the two treatments are then unrelated because they arise from separate and unrelated groups of subjects. Consequently, each set of response measurements can be considered a sample from a population, and we can speak in terms of a comparison between two population distributions. With the matched pair design, the experimental subjects are chosen in pairs so that the members in each pair are alike while those in different pairs may be substantially dissimilar. One member of each pair receives treatment 1 and the other receives treatment 2. Example 3 illustrates these ideas.

Figure 1a Independent samples, each of size 4. (Subjects are split at random into a drug 1 group and a drug 2 group.)

Figure 1b Matched pair design with four pairs of subjects. (Within each of pairs 1 to 4, one subject receives drug 1 and the other drug 2.)
EXAMPLE 3

To compare the effectiveness of two drugs in curing a disease, suppose 8 patients are included in a clinical study. Here, the time to cure is the response of interest. Figure 1a portrays a design of independent samples where the 8 patients are randomly split into groups of 4, one group is treated with drug 1 and the other with drug 2. The observations for drug 1 will have no relation to those for drug 2 because the selection of patients in the two groups is left completely to chance.

To conduct a matched pair design, one would first select the patients in pairs. The two patients in each pair should be as alike as possible in regard to their physiological conditions; for instance, they should be of the same sex and age group and have about the same severity of the disease. These preexisting conditions may be different from one pair to another. Having paired the subjects, one member is randomly selected from each pair to be treated with drug 1 and the other with drug 2. Figure 1b shows this matched pair design. In contrast with the situation of Figure 1a, here we would expect the responses of each pair to be dependent for the reason that they are governed by the same preexisting conditions of the subjects.

In summary, a carefully planned experimental design is crucial to a successful comparative study. The design determines the structure of the data. In turn, the design provides the key to selecting an appropriate analysis.
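The two designs of Example 3 differ only in how chance is used, and a short Python sketch may make the contrast concrete. Everything named here is an assumption made for illustration: the patient labels P1 to P8, the particular pairing, the seed, and the helper names `independent_samples` and `matched_pair_assignment` do not come from the text.

```python
import random


def independent_samples(subjects, n1, rng):
    """Split subjects at random: n1 receive drug 1, the rest drug 2."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return shuffled[:n1], shuffled[n1:]


def matched_pair_assignment(pairs, rng):
    """Within each pair, choose at random which member receives drug 1."""
    assignment = []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        assignment.append((first, second))  # (drug 1, drug 2)
    return assignment


rng = random.Random(2024)  # fixed seed so the sketch is repeatable
patients = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]

drug1_group, drug2_group = independent_samples(patients, 4, rng)
print("Independent samples:", drug1_group, drug2_group)

# Hypothetical pairing of patients judged alike in sex, age, and severity.
pairs = [("P1", "P2"), ("P3", "P4"), ("P5", "P6"), ("P7", "P8")]
for drug1, drug2 in matched_pair_assignment(pairs, rng):
    print("drug 1:", drug1, " drug 2:", drug2)
```

In the first design chance decides the whole grouping; in the second the pairs are fixed in advance and chance decides only the assignment within each pair.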
EXERCISES

1.1 Al, Bob, Carol, Dennis, and Ellen are available as subjects. Make a list of all possible ways to split them into two groups with the first group having two subjects and the second three subjects.

1.2 Six mice, alpha, tau, omega, pi, beta, and phi, are to serve as subjects. List all possible ways to split them into two groups with the first having four mice and the second two mice.
1.3 Six students in a psychology course have volunteered to serve as subjects in a matched pair experiment.

Name     Age    Sex
Tom      18     M
Sue      20     F
Erik     18     M
Grace    20     F
Chris    18     F
Roger    18     M

(a) List all possible sets of pairings if subjects are paired by age.
(b) If subjects are paired by sex, how many pairs are available for the experiment?

1.4 Identify the following as either matched pair or independent samples. Also identify the experimental units, treatments, and response in each case.
(a) Twelve persons are given a high potency vitamin C capsule once a day. Another twelve do not take extra vitamin C. Investigators will record the number of colds in 5 winter months.
(b) One self-fertilized plant and one cross-fertilized plant are grown in each of 7 pots. Their heights will be measured after 3 months.
(c) Ten newly married couples will be interviewed. Both the husband and wife will respond to the question "How many children would you like to have?"
(d) Learning times will be recorded for 5 dogs trained by a reward method and 3 dogs trained by a reward-punishment method.
2. INDEPENDENT RANDOM SAMPLES FROM TWO POPULATIONS

Here we discuss the methods of statistical inference for comparing two treatments or two populations on the basis of independent samples. Recall that with the design of independent samples, a collection of $n_1 + n_2$ subjects is randomly divided into two groups and the responses are recorded. We conceptualize Population 1 as the collection of responses that would result if a vast number of subjects were given treatment 1. Similarly, Population 2 refers to the population of responses under treatment 2. The design of independent samples can then be viewed as one that produces unrelated random samples from two populations (see Figure 2).

Figure 2

In other situations, the populations to be compared may be quite real entities. For instance, one may wish to compare the residential property values in the east suburb of a city to those in the west suburb. Here the issue of assigning experimental subjects to treatments does not arise. The collection of all residential properties in each suburb constitutes a population from which a sample will be drawn at random. With the design of independent samples we obtain:
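Sample: $X_1, X_2, \ldots, X_{n_1}$ from population 1
Summary statistics:
$$\bar{X} = \frac{1}{n_1}\sum_{i=1}^{n_1} X_i, \qquad S_1^2 = \frac{\sum_{i=1}^{n_1} (X_i - \bar{X})^2}{n_1 - 1}$$

Sample: $Y_1, Y_2, \ldots, Y_{n_2}$ from population 2
Summary statistics:
$$\bar{Y} = \frac{1}{n_2}\sum_{i=1}^{n_2} Y_i, \qquad S_2^2 = \frac{\sum_{i=1}^{n_2} (Y_i - \bar{Y})^2}{n_2 - 1}$$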
To make confidence statements or to test hypotheses, we specify a statistical model for the data.
Statistical Model: Independent Random Samples
(a) $X_1, X_2, \ldots, X_{n_1}$ is a random sample of size $n_1$ from population 1 whose mean is denoted by $\mu_1$ and whose standard deviation is denoted by $\sigma_1$.
(b) $Y_1, Y_2, \ldots, Y_{n_2}$ is a random sample of size $n_2$ from population 2 whose mean is denoted by $\mu_2$ and whose standard deviation is denoted by $\sigma_2$.
(c) The samples are independent. In other words, the response measurements under one treatment are unrelated to the response measurements under the other treatment.
We now set our goal toward drawing a comparison between the mean responses of the two treatments or populations. In statistical language, we are interested in making inferences about the parameter

$$\mu_1 - \mu_2 = (\text{mean of population 1}) - (\text{mean of population 2})$$
INFERENCES FROM LARGE SAMPLES

Inferences about the difference $\mu_1 - \mu_2$ are naturally based on its estimate $\bar{X} - \bar{Y}$, the difference between the sample means. When both sample sizes $n_1$ and $n_2$ are large (say, greater than 30), $\bar{X}$ and $\bar{Y}$ are each approximately normal and their difference $\bar{X} - \bar{Y}$ is approximately normal with

$$\text{Mean: } E(\bar{X} - \bar{Y}) = \mu_1 - \mu_2$$
$$\text{Variance: } \operatorname{Var}(\bar{X} - \bar{Y}) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
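Because $n_1$ and $n_2$ are both large, the approximation remains valid if $\sigma_1^2$ and $\sigma_2^2$ are replaced by their estimators

$$S_1^2 = \frac{\sum_{i=1}^{n_1} (X_i - \bar{X})^2}{n_1 - 1} \quad \text{and} \quad S_2^2 = \frac{\sum_{i=1}^{n_2} (Y_i - \bar{Y})^2}{n_2 - 1}$$

We conclude that, when the sample sizes $n_1$ and $n_2$ are large,

$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \quad \text{is approximately } N(0, 1)$$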
A confidence interval for $\mu_1 - \mu_2$ is constructed from this sampling distribution. As we did for the single sample problem, we obtain a confidence interval of the form

Estimate of parameter ± (z-value)(estimated standard error)

Large Sample Confidence Interval for $\mu_1 - \mu_2$
When $n_1$ and $n_2$ are greater than 30, an approximate $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$\left( \bar{X} - \bar{Y} - z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}, \;\; \bar{X} - \bar{Y} + z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \right)$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ point of $N(0, 1)$.
EXAMPLE 4

To compare the age at first marriage of females in two ethnic groups, A and B, a random sample of 100 ever-married females is taken from each group and the ages at first marriage are recorded. The means and the standard deviations are found to be:

            A       B
Mean      20.7    18.5
Sd         6.3     5.8

Construct a 95% confidence interval for $\mu_A - \mu_B$.

We have

$$n_1 = 100, \quad \bar{x} = 20.7, \quad s_1 = 6.3$$
$$n_2 = 100, \quad \bar{y} = 18.5, \quad s_2 = 5.8$$
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{(6.3)^2}{100} + \frac{(5.8)^2}{100}} = .8563$$

For a 95% confidence interval we use $z_{.025} = 1.96$ and calculate

$$\bar{x} - \bar{y} = 20.7 - 18.5 = 2.2$$
$$z_{.025}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1.96 \times .8563 = 1.68$$

Therefore, a 95% confidence interval for $\mu_A - \mu_B$ is given by

2.2 ± 1.68 or (.52, 3.88)
Females in ethnic group B tend, on the average, to marry .52 to 3.88 years younger than those in ethnic group A.
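The arithmetic of Example 4 can be checked with a few lines of Python. This is only a sketch: the function name `large_sample_ci` and its argument names are our own, and the upper $\alpha/2$ point of $N(0, 1)$ is taken from the standard library's `statistics.NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist


def large_sample_ci(mean1, sd1, n1, mean2, sd2, n2, conf=0.95):
    """Approximate confidence interval for mu1 - mu2 when n1 and n2 are large."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    std_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    diff = mean1 - mean2
    return diff - z * std_error, diff + z * std_error


# Summary statistics of Example 4 (group A first, group B second).
lower, upper = large_sample_ci(20.7, 6.3, 100, 18.5, 5.8, 100)
print(round(lower, 2), round(upper, 2))
```

The same function can be reused for any large-sample comparison by supplying the two means, standard deviations, and sample sizes.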
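A test of the null hypothesis that the two population means are the same, $H_0\colon \mu_1 - \mu_2 = 0$, employs the test statistic

$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

which is approximately $N(0, 1)$ when $\mu_1 - \mu_2 = 0$.

EXAMPLE 5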
Test for equality of mean age at first marriage for the two groups in Example 4. Take $\alpha = .05$.

We wish to test $H_0\colon \mu_1 = \mu_2$ against $H_1\colon \mu_1 \ne \mu_2$. With $\alpha = .05$, $z_{\alpha/2} = 1.96$. The rejection region is $Z < -1.96$ or $Z > 1.96$. The data are

$$n_1 = 100, \quad \bar{x} = 20.7, \quad s_1 = 6.3$$
$$n_2 = 100, \quad \bar{y} = 18.5, \quad s_2 = 5.8$$

Since

$$z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{20.7 - 18.5}{\sqrt{\dfrac{(6.3)^2}{100} + \dfrac{(5.8)^2}{100}}} = \frac{2.2}{.8563} = 2.569$$

which is larger than 1.96, we reject $H_0$ at the 5% level. The mean ages at first marriage are significantly different.
EXAMPLE 6

In June 1980, chemical analyses were made of 85 water samples (each of unit volume) taken from various parts of a city lake, and the measurements of chlorine content were recorded. During the next two winters, the use of road salt was substantially reduced in the catchment areas of the lake. In June 1982, 110 water samples were analyzed and their chlorine contents recorded. Calculations of the mean and the standard deviation for the two sets of data give:

                        Chlorine Content
                        1980      1982
Mean                    18.3      17.8
Standard deviation       1.2       1.8

Do the data provide strong evidence that there is a reduction of average chlorine level in the lake water in 1982 compared to the level in 1980? Test with $\alpha = .05$.

The alternative hypothesis is $H_1\colon \mu_1 > \mu_2$, where $\mu_1$ = population mean in 1980 and $\mu_2$ = population mean in 1982. For $\alpha = .05$, $z_{.05} = 1.645$, so the rejection region is $Z > 1.645$. We have

$$n_1 = 85, \quad \bar{x} = 18.3, \quad s_1 = 1.2$$
$$n_2 = 110, \quad \bar{y} = 17.8, \quad s_2 = 1.8$$
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{(1.2)^2}{85} + \frac{(1.8)^2}{110}} = .2154$$

and

$$z = \frac{18.3 - 17.8}{.2154} = \frac{.5}{.2154} = 2.32$$

Therefore, at the 5% level, we conclude that there has been a reduction in the mean chlorine level. Note that since $z_{.01} = 2.33$, the test would reject $H_0$ even for $\alpha$ near .01.

We summarize the procedure for testing $\mu_1 - \mu_2 = \delta_0$, where $\delta_0$ is specified under the null hypothesis. The case $\mu_1 = \mu_2$ corresponds to $\delta_0 = 0$.
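Testing $H_0\colon \mu_1 - \mu_2 = \delta_0$, Large Samples
Test statistic:

$$Z = \frac{\bar{X} - \bar{Y} - \delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

Alternative hypothesis:                    Level $\alpha$ rejection region:
$H_1\colon \mu_1 - \mu_2 > \delta_0$       $R\colon Z \ge z_{\alpha}$
$H_1\colon \mu_1 - \mu_2 < \delta_0$       $R\colon Z \le -z_{\alpha}$
$H_1\colon \mu_1 - \mu_2 \ne \delta_0$     $R\colon |Z| \ge z_{\alpha/2}$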
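The three rejection regions of the large-sample test can be collected in one small routine. The sketch below is an illustration under our own naming: `two_sample_z_test`, its arguments, and the labels "greater", "less", and "two-sided" are assumptions, not notation from the text. It is applied to the summary statistics of Examples 5 and 6.

```python
import math
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2,
                      delta0=0.0, alternative="two-sided", alpha=0.05):
    """Return (z, reject) for H0: mu1 - mu2 = delta0 with large samples."""
    std_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2 - delta0) / std_error
    std_normal = NormalDist()
    if alternative == "greater":        # H1: mu1 - mu2 > delta0
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # H1: mu1 - mu2 < delta0
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":    # H1: mu1 - mu2 != delta0
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject


# Example 5: ages at first marriage, two-sided test at alpha = .05.
z5, reject5 = two_sample_z_test(20.7, 6.3, 100, 18.5, 5.8, 100)

# Example 6: chlorine content, 1980 versus 1982, H1: mu1 > mu2.
z6, reject6 = two_sample_z_test(18.3, 1.2, 85, 17.8, 1.8, 110,
                                alternative="greater")
print(round(z5, 3), reject5, round(z6, 2), reject6)
```

Returning the observed z together with the decision keeps the calculation visible, so the result can be compared with the tabled value by hand.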
INFERENCES FROM SMALL SAMPLES

Not surprisingly, more distributional structure is required to formulate appropriate inference procedures for small samples. Here we introduce the small-sample inference procedures that are valid under the following assumptions about the population distributions. Naturally, the usefulness of such procedures depends on how closely these assumptions are realized.

Additional Assumptions When the Sample Sizes Are Small
(a) Both populations are normal.
(b) The population standard deviations $\sigma_1$ and $\sigma_2$ are equal.

A restriction to normal populations is not new. It was previously introduced for inferences about the mean of a single population. The second assumption, requiring equal variability of the populations, is somewhat artificial but we reserve comment until later. Letting $\sigma$ denote the common standard deviation, we summarize.

Small-Sample Assumptions
(a) $X_1, X_2, \ldots, X_{n_1}$ is a random sample from $N(\mu_1, \sigma)$.
(b) $Y_1, Y_2, \ldots, Y_{n_2}$ is a random sample from $N(\mu_2, \sigma)$. (Note: $\sigma$ is the same for both distributions.)
(c) $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ are independent.
Again $\bar{X} - \bar{Y}$ is our choice for a statistic:

$$\text{Mean of } (\bar{X} - \bar{Y}) = E(\bar{X} - \bar{Y}) = \mu_1 - \mu_2$$
$$\operatorname{Var}(\bar{X} - \bar{Y}) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

The common variance $\sigma^2$ can be estimated by combining information provided by both samples. Specifically, the sum $\sum_{i=1}^{n_1} (X_i - \bar{X})^2$ incorporates $n_1 - 1$ pieces of information about $\sigma^2$, in view of the constraint that the deviations $X_i - \bar{X}$ sum to zero. Independently of this, $\sum_{i=1}^{n_2} (Y_i - \bar{Y})^2$ contains $n_2 - 1$ pieces of information about $\sigma^2$. These two quantities can then be combined,

$$\sum (X_i - \bar{X})^2 + \sum (Y_i - \bar{Y})^2$$

to obtain a pooled estimate of the common $\sigma^2$. The proper divisor is the sum of the component degrees of freedom, or $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$.
Pooled Estimator of the Common $\sigma^2$

$$S^2_{\text{pooled}} = \frac{\sum_{i=1}^{n_1} (X_i - \bar{X})^2 + \sum_{i=1}^{n_2} (Y_i - \bar{Y})^2}{n_1 + n_2 - 2} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
EXAMPLE 7

To illustrate the calculation of pooled sample variance, consider these two samples:

Sample from population 1: 8, 5, 7, 6, 9, 7
Sample from population 2: 2, 6, 4, 7, 6

The sample means are

$$\bar{x} = \frac{\sum x_i}{6} = \frac{42}{6} = 7, \qquad \bar{y} = \frac{\sum y_i}{5} = \frac{25}{5} = 5$$

Further,

$$(6 - 1)s_1^2 = \sum (x_i - \bar{x})^2 = (8 - 7)^2 + (5 - 7)^2 + (7 - 7)^2 + (6 - 7)^2 + (9 - 7)^2 + (7 - 7)^2 = 10$$
$$(5 - 1)s_2^2 = \sum (y_i - \bar{y})^2 = (2 - 5)^2 + (6 - 5)^2 + (4 - 5)^2 + (7 - 5)^2 + (6 - 5)^2 = 16$$

Thus $s_1^2 = 2$, $s_2^2 = 4$, and the pooled variance is

$$s^2_{\text{pooled}} = \frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2} = \frac{10 + 16}{6 + 5 - 2} = 2.89$$

The pooled variance is closer to 2 than 4 because the first sample size is larger.
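For readers who wish to verify the pooled variance of Example 7 by machine, a short Python sketch follows. The helper names `sum_sq_dev` and `pooled_variance` are our own and are used only for illustration.

```python
def sum_sq_dev(sample):
    """Sum of squared deviations of the observations from their mean."""
    mean = sum(sample) / len(sample)
    return sum((value - mean) ** 2 for value in sample)


def pooled_variance(sample1, sample2):
    """Pooled estimate of a common variance, divisor n1 + n2 - 2."""
    degrees_of_freedom = len(sample1) + len(sample2) - 2
    return (sum_sq_dev(sample1) + sum_sq_dev(sample2)) / degrees_of_freedom


sample1 = [8, 5, 7, 6, 9, 7]
sample2 = [2, 6, 4, 7, 6]
print(sum_sq_dev(sample1), sum_sq_dev(sample2))
print(round(pooled_variance(sample1, sample2), 2))
```

The two sums of squares are printed first so that the intermediate values 10 and 16 of the example can be compared directly.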
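Employing the pooled estimator $S_{\text{pooled}} = \sqrt{S^2_{\text{pooled}}}$ for the common $\sigma$, we obtain a Student's t variable that is basic to inferences about $\mu_1 - \mu_2$.

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_{\text{pooled}}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

has Student's t distribution, with $n_1 + n_2 - 2$ degrees of freedom.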
We can now obtain confidence intervals for $\mu_1 - \mu_2$, which are of the form

Estimate of parameter ± (t-value) × (estimated standard error)

Confidence Interval for $\mu_1 - \mu_2$, Small Samples
A $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$\bar{X} - \bar{Y} \pm t_{\alpha/2}\, S_{\text{pooled}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where

$$S^2_{\text{pooled}} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

and $t_{\alpha/2}$ is the upper $\alpha/2$ point of the t distribution with d.f. = $n_1 + n_2 - 2$.

EXAMPLE 8
A feeding test is conducted on a herd of 25 milking cows to compare two diets, one of dewatered alfalfa and the other of field-wilted alfalfa. A sample of 12 cows randomly selected from the herd are fed dewatered alfalfa; the remaining 13 cows are fed field-wilted alfalfa. From observations made over a three-week period, the average daily milk production is recorded for each cow. The data are given in Table 1. Obtain a 95% confidence interval for the difference in mean daily milk yield per cow between the two diets. The dot diagrams of these data, plotted in Figure 3, give the appearance of approximately equal amounts of variation.

Table 1 Milk Yield (in Pounds)
Field-wilted alfalfa    44, 44, 56, 46, 47, 38, 58, 53, 49, 35, 46, 30, 41
Dewatered alfalfa       35, 47, 55, 29, 40, 39, 32, 41, 42, 57, 51, 39

We assume that the milk-yield data for both field-wilted and dewatered alfalfa are random samples from normal populations with means of $\mu_1$ and $\mu_2$, respectively, and with a common standard deviation of $\sigma$. Computations from these data provide the summary statistics

Field-wilted alfalfa: $\bar{x} = 45.15$, $\sum (x_i - \bar{x})^2 = 767.69$, $s_1^2 = 63.97$
Dewatered alfalfa: $\bar{y} = 42.25$, $\sum (y_i - \bar{y})^2 = 840.25$, $s_2^2 = 76.39$

The sample sizes are $n_1 = 13$ and $n_2 = 12$. The pooled sample variance is

$$s^2_{\text{pooled}} = \frac{\sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2}{n_1 + n_2 - 2} = \frac{767.69 + 840.25}{23} = 69.91$$
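With a 95% confidence level, $\alpha/2 = .025$, and consulting the t table we find that $t_{.025} = 2.069$ with d.f. = $n_1 + n_2 - 2 = 23$. Thus a 95% confidence interval for $\mu_1 - \mu_2$ is

$$\bar{x} - \bar{y} \pm t_{.025}\, s_{\text{pooled}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 2.90 \pm 2.069 \times 8.36\sqrt{\frac{1}{13} + \frac{1}{12}} = 2.90 \pm 6.92 \quad \text{or} \quad (-4.02,\ 9.82)$$

We can be 95% confident that the mean yield from field-wilted alfalfa can be anywhere from 4.02 pounds lower to 9.82 pounds higher than for dewatered alfalfa.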
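The interval of Example 8 can be reproduced from the summary statistics alone. In the Python sketch below the function name `pooled_t_ci` and its arguments are our own labels, and the value 2.069 is the tabled $t_{.025}$ for 23 degrees of freedom quoted in the example; it is passed in by hand because the Python standard library has no t table.

```python
import math


def pooled_t_ci(mean1, ss1, n1, mean2, ss2, n2, t_crit):
    """Small-sample interval for mu1 - mu2 from means and sums of squares.

    ss1 and ss2 are the sums of squared deviations about each sample mean;
    t_crit is the upper alpha/2 point of t with n1 + n2 - 2 d.f.
    """
    degrees_of_freedom = n1 + n2 - 2
    s_pooled = math.sqrt((ss1 + ss2) / degrees_of_freedom)
    std_error = s_pooled * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - t_crit * std_error, diff + t_crit * std_error


# Field-wilted alfalfa first, dewatered alfalfa second.
lower, upper = pooled_t_ci(45.15, 767.69, 13, 42.25, 840.25, 12, t_crit=2.069)
print(round(lower, 2), round(upper, 2))
```

Small differences in the last decimal place from the hand calculation are due only to rounding of intermediate values.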
Figure 3 Dot diagrams of milk-yield data in Example 8 (one diagram for field-wilted alfalfa and one for dewatered alfalfa).
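Testing $H_0\colon \mu_1 - \mu_2 = \delta_0$, Small Samples
Test statistic:

$$T = \frac{\bar{X} - \bar{Y} - \delta_0}{S_{\text{pooled}}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad \text{d.f.} = n_1 + n_2 - 2$$

Alternative hypothesis:                    Level $\alpha$ rejection region:
$H_1\colon \mu_1 - \mu_2 > \delta_0$       $R\colon T \ge t_{\alpha}$
$H_1\colon \mu_1 - \mu_2 < \delta_0$       $R\colon T \le -t_{\alpha}$
$H_1\colon \mu_1 - \mu_2 \ne \delta_0$     $R\colon |T| \ge t_{\alpha/2}$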
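EXAMPLE 9

Refer to the feeding test on cows described in Example 8. Do these data strongly indicate that the milk yield is less with dewatered alfalfa than with field-wilted alfalfa? Test at $\alpha = .05$.

We previously calculated

Field-wilted alfalfa: $\bar{x} = 45.15$
Dewatered alfalfa: $\bar{y} = 42.25$
$s^2_{\text{pooled}} = 69.91$

To determine the alternative hypothesis, we recall:

$\mu_1$ = population mean milk yield with field-wilted alfalfa
$\mu_2$ = population mean milk yield with dewatered alfalfa

Then the alternative hypothesis is $\mu_1 > \mu_2$. To test $H_0\colon \mu_1 = \mu_2$ vs. $H_1\colon \mu_1 > \mu_2$ we use the test statistic

$$T = \frac{\bar{X} - \bar{Y}}{S_{\text{pooled}}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad \text{d.f.} = n_1 + n_2 - 2$$

For $\alpha = .05$, the one-sided rejection region is $T > t_{.05}$. The observed value of $t$ is

$$t = \frac{45.15 - 42.25}{8.36\sqrt{\dfrac{1}{13} + \dfrac{1}{12}}} = \frac{2.90}{3.35} = .87$$

For d.f. = 23, $t_{.05} = 1.714$. Hence the null hypothesis is not rejected with $\alpha = .05$.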
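As a check on Example 9, the observed t can be recomputed from the summary statistics of Example 8. The sketch below uses our own function name `pooled_t_statistic`; the critical value 1.714 is the tabled $t_{.05}$ for 23 degrees of freedom quoted in the example and is supplied by hand.

```python
import math


def pooled_t_statistic(mean1, ss1, n1, mean2, ss2, n2, delta0=0.0):
    """Return (t, d.f.) for H0: mu1 - mu2 = delta0 under equal variances.

    ss1 and ss2 are the sums of squared deviations about each sample mean.
    """
    degrees_of_freedom = n1 + n2 - 2
    s_pooled = math.sqrt((ss1 + ss2) / degrees_of_freedom)
    std_error = s_pooled * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2 - delta0) / std_error, degrees_of_freedom


T_05_DF23 = 1.714  # upper .05 point of t with 23 d.f., from the t table

t_obs, df = pooled_t_statistic(45.15, 767.69, 13, 42.25, 840.25, 12)
reject = t_obs > T_05_DF23  # one-sided rejection region T > t_.05
print(round(t_obs, 2), df, reject)
```

Because the observed t falls well short of 1.714, the decision is the same as in the hand calculation: the null hypothesis is not rejected.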
Deciding Whether or Not to Pool

Our preceding discussion of large- and small-sample inferences raises a few questions:

For small-sample inference, why do we assume the population standard deviations to be equal when no such assumption was needed in the large-sample case?

When should we be wary about this assumption, and what alternative procedures are available?

Learning statistics would be a step simpler if the ratio

$$\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

had a t distribution for small samples from normal populations. Unfortunately, statistical theory proves it otherwise. The distribution of this ratio is not a t and, worse yet, it depends on the unknown quantity $\sigma_1^2/\sigma_2^2$. The assumption $\sigma_1 = \sigma_2$ and the change of the denominator to $S_{\text{pooled}}\sqrt{1/n_1 + 1/n_2}$ allow the t-based inferences to be valid. However, the restriction $\sigma_1 = \sigma_2$ and the accompanying pooling are not needed in large samples where a $Z$ approximation holds.

With regard to the second question, the relative magnitude of the two sample variances $s_1^2$ and $s_2^2$ would of course be the prime consideration. As a working rule, the range of values $1/4 \le s_1^2/s_2^2 \le 4$ may be taken as reasonable cases for pooling. If $s_1^2/s_2^2$ is very much different from 1, the assumption $\sigma_1 = \sigma_2$ would be suspect. In that case, the appropriate methods of inference about $\mu_1 - \mu_2$ are more complex and will not be discussed in this text. One simple but conservative procedure is outlined in Exercise 2.15.
EXERCISES

2.1 Independent random samples from two populations have provided the summary statistics:

Sample 1               Sample 2
$n_1 = 40$             $n_2 = 45$
$\bar{x} = 93$         $\bar{y} = 85$
$s_1^2 = 132$          $s_2^2 = 157$

(a) Obtain a point estimate of $\mu_1 - \mu_2$, and calculate the estimated standard error.
(b) Construct a 95% confidence interval for $\mu_1 - \mu_2$.

2.2 A group of 141 subjects is used in an experiment to compare two treatments. Treatment 1 is given to 79 subjects selected at random, and the remaining 62 are given treatment 2. The means and standard deviations of the responses are
                        Treatment 1    Treatment 2
Mean                       109            128
Standard deviation          46.2           53.4

Determine a 98% confidence interval for the mean difference of the treatment effects.

2.3 Refer to the data in Exercise 2.2. Suppose the investigator wishes to establish that treatment 2 has a higher mean response than treatment 1.
(a) Formulate $H_0$ and $H_1$.
(b) State the test statistic and the rejection region with $\alpha = .05$.
(c) Perform the test at $\alpha = .05$. Also, find the p-value and comment.

2.4 A national equal employment opportunities committee is conducting an investigation to determine if women employees are as well paid as their male counterparts in comparable jobs. Random samples of 75 males and 64 females in junior academic positions are selected, and the following calculations are obtained from their salary data:

                        Male        Female
Mean                    $24,530     $23,620
Standard deviation      780         750

Construct a 95% confidence interval for the difference between the mean salaries of males and females in junior academic positions.
2.5 Refer to the confidence interval obtained in Exercise 2.4. If you were to test the null hypothesis that the mean salaries are equal versus the two-sided alternative, what would be the conclusion of your test with $\alpha = .05$?

2.6 Given the two samples
7, 9, 8 and 4, 6
calculate (a) $s^2_{\text{pooled}}$ and (b) the t-value when $\mu_1 = \mu_2$.

2.7 Given the two samples
3, 5, 4 and 9, 5, 7
calculate (a) $s^2_{\text{pooled}}$ and (b) the t-value when $\mu_1 = \mu_2$.

2.8 Given $n_1$ … and $n_2$ … $\sum (y_i - \bar{y})^2$ …
2.9 Use the data in Exercise 2.7 to test $H_0\colon \mu_1 = \mu_2$ against $H_1\colon \mu_1$ … with $\alpha = .05$.

2.10 Use the data in Exercise 2.8 to determine a 95% confidence interval for $\mu_1 - \mu_2$.

2.11 The following summary statistics are recorded for two independent random samples from two populations.

Sample 1               Sample 2
$n_1 = 11$             $n_2 = 13$
$\bar{x} = 10.7$       $\bar{y} = 9.6$
$s_1^2 = 1.36$         $s_2^2 = 2.17$

(a) Construct a 95% confidence interval for $\mu_1 - \mu_2$.
(b) State the assumptions about the population distributions.

2.12 Refer to the data in Exercise 2.11. Is there strong evidence that Population 1 has a higher mean than Population 2?

2.13 The peak oxygen intake per unit of body weight, called the "aerobic capacity," of an individual performing a strenuous activity is a measure of work capacity. For a comparative study, measurements of aerobic capacities are recorded for a group of 20 Peruvian Highland natives and for a group of 10 U.S. lowlanders acclimatized as adults in high altitudes. The following summary statistics were obtained from the data [source: A. R. Frisancho, Science, Vol. 187 (1975), 317]:
                        Peruvian Natives    U.S. Subjects Acclimatized
Mean                        46.3                 38.5
Standard deviation           5.0                  5.8

Construct a 98% confidence interval for the mean difference in aerobic capacity between the two groups.

2.14 Do the data in Exercise 2.13 provide a strong indication of a difference in mean aerobic capacity between the highland natives and the acclimatized lowlanders? Test with $\alpha = .01$.

*2.15
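A conservative confidence procedure for $\mu_1 - \mu_2$ when the populations are normal but $\sigma_1$ and $\sigma_2$ are not assumed to be equal. A $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$\bar{X} - \bar{Y} \pm t^*_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

where $t^*_{\alpha/2}$ denotes the upper $\alpha/2$ point of the t distribution with d.f. = (smaller sample size) − 1. The interval is conservative in the sense that the actual confidence probability is at least $(1 - \alpha)$. We illustrate this formula with the data of Example 8, where

$$n_1 = 13, \quad n_2 = 12, \quad s_1^2 = 63.97, \quad s_2^2 = 76.39$$

We calculate

$$\bar{x} - \bar{y} = 2.90, \qquad \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{63.97}{13} + \frac{76.39}{12}} = 3.36$$

Based on d.f. = 11, $t^*_{.025} = 2.201$, so the 95% confidence interval is

$$2.90 \pm 2.201 \times 3.36 = 2.90 \pm 7.40 \quad \text{or} \quad (-4.50,\ 10.30)$$

Note that it is wider than the confidence interval calculated in Example 8 under the assumption $\sigma_1 = \sigma_2$. Apply this procedure to the data of Exercise 2.11 and compare the result with that of part (a) of that exercise.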
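The conservative procedure of Exercise 2.15 differs from the pooled interval only in its standard error and its degrees of freedom, as the following Python sketch shows. The function name `conservative_ci` is our own, and the critical value must be supplied from a t table with d.f. = (smaller sample size) − 1; here 2.201 is the tabled $t_{.025}$ for 11 degrees of freedom.

```python
import math


def conservative_ci(mean1, var1, n1, mean2, var2, n2, t_crit):
    """Interval for mu1 - mu2 without assuming equal standard deviations.

    var1 and var2 are the sample variances; t_crit is the upper alpha/2
    point of t with d.f. = min(n1, n2) - 1, read from a t table.
    """
    std_error = math.sqrt(var1 / n1 + var2 / n2)
    diff = mean1 - mean2
    return diff - t_crit * std_error, diff + t_crit * std_error


# Data of Example 8: field-wilted alfalfa first, dewatered alfalfa second.
lower, upper = conservative_ci(45.15, 63.97, 13, 42.25, 76.39, 12, t_crit=2.201)
print(round(lower, 2), round(upper, 2))
```

Calling the same function with the summary statistics of Exercise 2.11 and the appropriate tabled value gives the interval asked for at the end of the exercise.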
3. RANDOMIZATION AND ITS ROLE IN INFERENCE

We have presented the methods of drawing inferences about the difference between two population means. Let us now turn to some important questions regarding the design of the experiment or data collection procedure. The manner in which experimental subjects are chosen for the two treatment groups can be crucial. For example, suppose that a remedial-reading instructor has developed a new teaching technique and is permitted to use the new method to instruct half the pupils in the class. The instructor might choose the most alert or the students who are more promising in some other way, leaving the weaker students to be taught in the conventional manner. Clearly, a comparison between the reading achievements of these two groups would not just be a comparison of two teaching methods. A similar fallacy can result in comparing the nutritional quality of a new lunch package if the new diet is given to a group of children suffering from malnutrition and the conventional diet is given to a group of children who are already in good health.

When the assignment of treatments to experimental units is under our control, steps can be taken to ensure a valid comparison between the two treatments. At the core lies the principle of impartial selection, or randomization. The choice of the experimental units for one treatment or the other must be made by a chance mechanism that does not favor one particular selection over any other. It must not be left to the discretion of the experimenters because, even unconsciously, they may be partial to one treatment.
Suppose that a comparative experiment is to be run with $n$ experimental units, of which $n_1$ units are to be assigned to treatment 1 and the remaining $n_2 = n - n_1$ units are to be assigned to treatment 2. The principle of randomization tells us that the $n_1$ units for treatment 1 must be chosen at random from the available collection of $n$ units, that is, in a manner such that all $\binom{n}{n_1}$ possible choices are equally likely to be selected.

Randomization Procedure for Comparing Two Treatments
From the available $n = n_1 + n_2$ experimental units, choose $n_1$ units at random to receive treatment 1 and assign the remaining $n_2$ units to treatment 2. The random choice entails that all $\binom{n}{n_1}$ possible selections are equally likely to be chosen.
As a practical method of random selection, we can label the available units from 1 to $n$. Then $n$ identical cards, marked from 1 to $n$, can be shuffled thoroughly and $n_1$ cards can be drawn blindfolded. These $n_1$ experimental units receive treatment 1 and the remaining units receive treatment 2. For a quicker and more efficient means of random sampling, one can use the computer (see, for instance, Exercise 6.10, Chapter 4). Although randomization is not a difficult concept, it is one of the most fundamental principles of a good experimental design. It guarantees that uncontrolled sources of variation have the same chance of helping the response of treatment 1 as they do of helping the response of treatment 2. Any systematic effects of uncontrolled variables, such as age, strength, resistance, or intelligence, are chopped up or confused in their attempt to influence the treatment responses.

Randomization prevents uncontrolled sources of variation from influencing the responses in a systematic manner.
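The card-shuffling procedure can be imitated on a computer. The following is a minimal Python sketch, not a prescribed method: the function name `randomize_two_treatments` and the `seed` argument are illustrative assumptions, and the five names are simply the subjects listed in Exercise 3.1. It draws $n_1$ units so that every subset of that size is equally likely, and counts the $\binom{n}{n_1}$ possible selections.

```python
import math
import random

def randomize_two_treatments(units, n1, seed=None):
    """Choose n1 units at random for treatment 1; the rest receive treatment 2.

    `seed` is an illustrative convenience so the sketch can be repeated exactly.
    """
    rng = random.Random(seed)
    treatment1 = rng.sample(units, n1)   # every subset of size n1 is equally likely
    chosen = set(treatment1)
    treatment2 = [u for u in units if u not in chosen]
    return treatment1, treatment2

subjects = ["Al", "Bob", "Carol", "Dennis", "Ellen"]
control, treated = randomize_two_treatments(subjects, 2, seed=1)
n_choices = math.comb(len(subjects), 2)  # number of equally likely selections

print("control group:", control)
print("treatment group:", treated)
print("possible selections:", n_choices)
```

Drawing the sample without replacement plays the role of drawing cards blindfolded; fixing the seed is only useful for checking the sketch and would be omitted in a real allocation.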
Of course, in many cases, the investigator does not have the luxury of randomization. Consider comparing crime rates of cities before and after a new law. Aside from a package of criminal laws, other factors such as poverty, inflation, and unemployment play a significant role in the prevalence of crime. As long as these contingent factors cannot be regulated during the observation period, caution should be exercised in crediting the new law if a decline in the crime rate is observed or in discrediting the new law if an increase in the crime rate is observed. When randomization cannot be performed, extreme caution must be exercised in crediting an apparent difference in means to a difference in treatments. The differences may well be due to another factor.
EXERCISES
3.1 Randomly allocate two subjects from among Al, Bob, Carol, Dennis, Ellen to be in the control group. The others will receive a treatment.
3.2 Randomly allocate three subjects from among six mice alpha, tau, omega, pi, beta, phi to group 1.
3.3 Observations on 10 mothers who nursed their babies and 8 who did not, revealed that nursing mothers felt warmer toward their babies. Can we conclude that nursing affects a mother's feelings toward her child?
3.4 Early studies showed a disproportionate number of heavy smokers among lung cancer patients. One scientist theorized that the presence of a particular gene could tend to make a person want to smoke and be susceptible to lung cancer.
(a) How would randomization settle this question?
(b) Would such a randomization be ethical with human subjects?
Matched Pairs
Identical twins are the epitome of matched pair experimental subjects. They are matched not only with respect to age but also a multitude of genetic factors. Social scientists, trying to determine the influence of environment and heredity, have been especially interested in studying identical twins that were raised apart. Observed differences in IQ and behavior are then supposedly due to environmental factors. When the subjects are animals like mice, two from the same litter can be paired. Going one step further, genetic engineers can now provide two identical plants or small animals by cloning these subjects.
4. MATCHED PAIR COMPARISONS
In comparing two treatments, it is desirable that the experimental units or subjects be as alike as possible, so that a difference in responses between the two groups can be attributed to differences in treatments. If some identifiable conditions vary over the units in an uncontrolled manner, they could introduce a large variability in the measurements. In turn, this could obscure a real difference in treatment effects. On the other hand, the requirement that all subjects be alike may impose a severe limitation on the number of subjects available for a comparative experiment. To compare two analgesics, for example, it would be impractical to look for a sizable number of patients who are of the same sex, age, and general health condition and who have the same severity of pain. Aside from the question of practicality, we would rarely want to confine a comparison to such a narrow group. A broader scope of inference can be attained by applying the treatments on a variety of patients of both sexes and different age groups and health conditions.
The concept of matching or blocking is fundamental to providing a compromise between the two conflicting requirements that the experimental units be alike and also be of different kinds. The procedure consists of choosing units in pairs or blocks so that the units in each block are similar and the units in different blocks are dissimilar. One of the units in each block is assigned to treatment 1; the other to treatment 2. This process preserves the effectiveness of a comparison within each block and permits a diversity of conditions to exist in different blocks. Of course, the treatments must be allotted to each pair randomly to avoid selection bias. This design is called sampling by matched pairs. For example, in studying how two different environments influence the learning capacities of preschoolers, it is desirable to remove the effect of heredity: ideally this is accomplished by working with twins.

Matched Pair Design
[Figure: matched pairs 1, 2, 3, ..., n, each made up of two like experimental units.]
Units in each pair are alike, whereas units in different pairs may be dissimilar. In each pair, a unit is chosen at random to receive treatment 1; the other unit receives treatment 2.

In a matched pair design, the response of an experimental unit is influenced by:
(a) The conditions prevailing in the block.
(b) A treatment effect.
By taking the difference between the two observations in a block, we can filter out the common block effect. These differences then permit us to focus on the effects of treatments that are freed from undesirable sources of variation.
Pairing (or Blocking)
Pairing like experimental units according to some identifiable characteristic(s) serves to remove this source of variation from the experiment.

The structure of the observations in a paired comparison is given below, where $X$ and $Y$ denote the responses to treatment 1 and treatment 2, respectively. The difference between the responses in each pair is recorded in the last column, and the summary statistics are also presented.
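Structure of Data for a Matched Pair Comparison

Pair   Treatment 1   Treatment 2   Difference
1      $X_1$         $Y_1$         $D_1 = X_1 - Y_1$
2      $X_2$         $Y_2$         $D_2 = X_2 - Y_2$
.      .             .             .
.      .             .             .
n      $X_n$         $Y_n$         $D_n = X_n - Y_n$

The differences $D_1, D_2, \ldots, D_n$ are a random sample.
Summary statistics:
$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad S_D^2 = \frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n - 1}$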
Although the pairs $(X_i, Y_i)$ are independent of one another, $X_i$ and $Y_i$ within the $i$th pair will usually be dependent. In fact, if the pairing of experimental units is effective, we would expect $X_i$ and $Y_i$ to be relatively large or small together. Expressed in another way, we would expect $(X_i, Y_i)$ to have a high positive correlation. Because the differences $D_i = X_i - Y_i$, $i = 1, 2, \ldots, n$, are freed from the block effects, it is reasonable to assume that they constitute a random sample from a population with mean $= \delta$ and variance $= \sigma_D^2$, where $\delta$ represents the mean difference of the treatment effects. In other words,
$E(D_i) = \delta, \qquad \mathrm{Var}(D_i) = \sigma_D^2, \qquad i = 1, \ldots, n$
If the mean difference $\delta$ is zero, then the two treatments can be considered equivalent. A positive $\delta$ signifies that treatment 1 has a higher mean response than treatment 2. Considering $D_1, \ldots, D_n$ to be a single random sample from a population, we can immediately apply the techniques discussed in Chapters 9 and 10 to learn about the population mean $\delta$.
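Small Sample Inferences about the Mean Difference $\delta$
Assume that the differences $D_i = X_i - Y_i$ are a random sample from an $N(\delta, \sigma_D)$ distribution. Let
$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i$ and $S_D = \sqrt{\frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n - 1}}$
Then:
(a) A $100(1 - \alpha)\%$ confidence interval for $\delta$ is given by
$\left(\bar{D} - t_{\alpha/2}\, S_D/\sqrt{n},\ \bar{D} + t_{\alpha/2}\, S_D/\sqrt{n}\right)$
where $t_{\alpha/2}$ is based on $n - 1$ degrees of freedom.
(b) A test of $H_0\colon \delta = \delta_0$ is based on the test statistic
$t = \frac{\bar{D} - \delta_0}{S_D/\sqrt{n}}$, d.f. $= n - 1$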
As we learned in Chapter 9, the assumption of an underlying normal distribution can be relaxed when the sample size is large. The central limit theorem applied to the differences $D_1, \ldots, D_n$ suggests that when $n$ is large, say greater than 30,
$\frac{\bar{D} - \delta}{S_D/\sqrt{n}}$ is approximately $N(0,1)$
The inferences can then be based on the percentage points of the $N(0,1)$ distribution or, equivalently, on the percentage points of the $t$ distribution, with the degrees of freedom marked "infinity."
EXAMPLE 10 A medical researcher wishes to determine if a pill has the undesirable side effect of reducing the blood pressure of the user. The study involves recording the initial blood pressures of 15 college-age women. After they use the pill regularly for six months, their blood pressures are again recorded. The researcher wishes to draw inferences about the effect of the pill on blood pressure from the observations given in Table 2.
Table 2 Blood-Pressure Measurements Before and After Use of Pill

Subject       1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Before (x)   70  80  72  76  76  76  72  78  82  64  74  92  74  68  84
After (y)    68  72  62  70  58  66  68  52  64  72  74  60  74  72  74
d             2   8  10   6  18  10   4  26  18  -8   0  32   0  -4  10

Courtesy of a family planning clinic.
Here each subject represents a block generating a pair of measurements: one before using the pill and the other after using the pill. The paired differences $d_i = x_i - y_i$ are computed in the last row of Table 2, and the following summary statistics are calculated:
$\bar{d} = 8.80, \qquad s_D = 10.98$
Assuming that the paired differences constitute a random sample from a normal population $N(\delta, \sigma_D)$, a 95% confidence interval for the mean difference $\delta$ is given by
$\bar{D} \pm t_{.025}\, \frac{S_D}{\sqrt{15}}$
where $t_{.025}$ is based on d.f. = 14. From the $t$ table we find $t_{.025} = 2.145$. The 95% confidence interval is then computed as
$8.80 \pm 2.145 \times \frac{10.98}{\sqrt{15}} = 8.80 \pm 6.08$ or $(2.72, 14.88)$
This means that we are 95% confident the mean reduction of blood pressure is between 2.72 and 14.88.
Do the data of Table 2 substantiate the claim that use of the pill reduces blood pressure? Because the confidence interval includes only positive values of $\delta$, a reduction of blood pressure is strongly indicated. If we wish to formally test the null hypothesis $H_0\colon \delta = 0$ versus $H_1\colon \delta > 0$, we use a one-sided $t$ test with d.f. = 14. Assuming $\alpha = .05$, the rejection region is $t > t_{.05} = 1.761$. The observed value of the test statistic
$t = \frac{\bar{d}}{s_D/\sqrt{n}} = \frac{8.80}{10.98/\sqrt{15}} = \frac{8.80}{2.84} = 3.10$
falls in the rejection region. Consequently, $H_0$ is rejected in favor of $H_1$ at a 5% level of significance.
To be more convinced that the pill causes the reduction in blood pressure, it is advisable to measure the blood pressures of the same subjects once again after they have stopped using the pill for a period of time. This amounts to performing the experiment in reverse order to check the findings of the first stage.
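The arithmetic of Example 10 can be checked with a few lines of Python. This is only a sketch under stated assumptions: the two lists are the before and after readings of the 15 subjects in Table 2, and the critical values 2.145 and 1.761 are copied from the text (d.f. = 14) rather than computed, so no statistical library is needed. The variable names are illustrative.

```python
import math
import statistics

# Blood-pressure readings of the 15 subjects in Table 2.
before = [70, 80, 72, 76, 76, 76, 72, 78, 82, 64, 74, 92, 74, 68, 84]
after = [68, 72, 62, 70, 58, 66, 68, 52, 64, 72, 74, 60, 74, 72, 74]

# Critical values quoted in the text for d.f. = 14 (not computed here).
T_025_DF14 = 2.145
T_05_DF14 = 1.761

d = [x - y for x, y in zip(before, after)]   # paired differences d_i = x_i - y_i
n = len(d)
d_bar = statistics.mean(d)                   # sample mean of the differences
s_d = statistics.stdev(d)                    # sample standard deviation (divisor n - 1)
std_err = s_d / math.sqrt(n)                 # estimated standard error of d_bar

ci = (d_bar - T_025_DF14 * std_err, d_bar + T_025_DF14 * std_err)
t_obs = d_bar / std_err                      # test statistic for H0: delta = 0
reject_h0 = t_obs > T_05_DF14                # one-sided test at alpha = .05

print(f"d_bar = {d_bar:.2f}, s_D = {s_d:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"t = {t_obs:.3f}, reject H0: {reject_h0}")
```

The interval agrees with the one in the text; the test statistic agrees up to the rounding of the standard error to 2.84 in the hand calculation.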
Example 10 is a typical before-after situation. Data gathered to determine the effectiveness of a safety program or an exercise program would have the same structure. In such cases there is really no way to choose how to order the experiments within a pair. The before situation must precede the after situation. If something other than the institution of the program causes performance to improve, the improvement will be incorrectly credited to the program. However, when the order of the application of treatments can be determined by the investigator, something can be done about such systematic influences. Suppose that a coin is flipped to select the treatment for the first unit in each pair. Then the other treatment is applied to the second unit. Because the coin is flipped again for each new pair, any uncontrolled variable has an equal chance of helping either the performance of treatment 1 or of treatment 2. After eliminating an identified source of variation by pairing, we return to randomization in an attempt to reduce the systematic effects of any uncontrolled sources of variation.
Randomization with Pairing
After pairing, the assignment of treatments should be randomized for each pair.

Randomization within each pair chops up or diffuses any systematic influences that we are unable to control.
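The coin flip for each pair is easy to imitate on a computer. The sketch below is one possible way to do it, not a prescribed procedure: the function name `randomize_within_pairs`, the `seed` argument, and the unit labels are all hypothetical. One independent fair "coin" is used per pair, and the unit it selects receives treatment 1.

```python
import random

def randomize_within_pairs(pairs, seed=None):
    """For each (unit_a, unit_b) pair, flip a fair coin to decide which unit gets treatment 1."""
    rng = random.Random(seed)   # seed only makes the sketch repeatable
    plan = []
    for unit_a, unit_b in pairs:
        heads = rng.choice([True, False])   # one coin flip per pair
        if heads:
            plan.append({unit_a: 1, unit_b: 2})
        else:
            plan.append({unit_a: 2, unit_b: 1})
    return plan

# Hypothetical labels for five matched pairs of experimental units.
pairs = [(f"pair{i}-first", f"pair{i}-second") for i in range(1, 6)]
plan = randomize_within_pairs(pairs, seed=2024)

for assignment in plan:
    print(assignment)
```

Because a new flip is made for every pair, neither treatment is systematically given to the first unit of a pair.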
EXERCISES
4.1 Given the following paired sample data:
x    6   10    8
y    7    9   11
(a) Evaluate the t-statistic $t = \frac{\bar{d}}{s_D/\sqrt{n}}$.
(b) How many degrees of freedom does this $t$ have?
4.2 Given the following paired sample data:
(a) Evaluate $t = \frac{\bar{d}}{s_D/\sqrt{n}}$.
(b) How many degrees of freedom does this $t$ have?
4.3 A manufacturer claims his boot waterproofing is better than the major brand. Five pairs of shoes are available for a test.
(a) Explain how you would conduct a paired sample test.
(b) Write down your assignment of waterproofing to each shoe. How did you randomize?
4.4 A blanching process currently used in the canning industry consists of treating vegetables with a large volume of boiling water before canning. A newly developed method, called Steam Blanching Process (SBP), is expected to remove less vitamins and minerals from vegetables, because it is more of a steam wash than a flowing water wash. Ten batches of string beans from different farms are to be used to compare the SBP and the standard process. One-half of each batch of beans is treated with the standard process; the other half of each batch is treated with the SBP. Measurements of the vitamin content per pound of canned beans are:
Batches
Standard
Construct a 98% confidence interval for the difference between the mean vitamin contents per pound using the two methods of blanching.
4.5 Do the data in Exercise 4.4 provide strong evidence that SBP removes less vitamins in canned beans than the standard method of blanching? Test with $\alpha = .02$.
4.6 An experiment is conducted to determine if the use of a special chemical additive with a standard fertilizer accelerates plant growth. Ten locations are included in the study. At each location, two plants growing in close proximity are treated: one is given the standard fertilizer; the other is given the standard fertilizer with the chemical additive. Plant growth after four weeks is measured in centimeters, and the following data are obtained:
Location            1   2   3   4   5   6   7   8   9  10
Without additive   20  31  16  22  19  32  25  18  20  19
With additive      23  34  15  21  22  31  29  20  23  24
Do the data substantiate the claim that use of the chemical additive accelerates plant growth? State the assumptions that you make and devise an appropriate test of the hypothesis. Take $\alpha$ =
4.7 Obtain a 95% confidence interval for $\delta$ using the data in Exercise 4.6.
4.8 Measurements of the left- and right-hand gripping strengths of 10 left-handed writers are recorded.
Person
(a) Do the data provide strong evidence that people who write with the left hand have a greater gripping strength in the left hand than they do in the right hand?
(b) Construct a 90% confidence interval for the mean difference.
4.9 In an experiment conducted to see if electrical pricing policies can affect consumer behavior, 10 homeowners in Wisconsin had to pay a premium for power use during the peak hours. They were offered lower off-peak rates. For each home, the July on-peak usage (kilowatt hours) in 1982 was compared to the previous July usage.

Year
1981   200  180  240  425  120  333  418  380  340  516
1982   160  175  210  370  110  298  368  250  305  477

(a) Find a 95% confidence interval for the mean decrease.
(b) Test $H_0\colon \delta = 0$ with $\alpha = .05$.
(c) Comment on the feasibility of randomization of treatments.
(d) Without randomization, in what way could the results in (a) and (b) be misleading? (Hint: What if air conditioner use is a prime factor and July 1982 was cooler than July 1981?)
5. CHOOSING BETWEEN INDEPENDENT SAMPLES AND A MATCHED PAIR SAMPLE
When planning an experiment to compare two treatments, we often have the option of either designing two independent samples or designing a sample with paired observations. Therefore, some comments about the pros and cons of these two sampling methods are in order here. Because a paired sample with $n$ pairs of observations contains $2n$ measurements, a comparable situation would be two independent samples with $n$ observations in each. First, note that the sample mean difference is the same whether or not the samples are paired. This is because
$\bar{D} = \bar{X} - \bar{Y}$
Therefore, using either sampling design, the confidence intervals for the difference between treatment effects have the common form
$(\bar{X} - \bar{Y}) \pm t_{\alpha/2}\,(\text{estimated standard error})$
However, the estimated standard error as well as the degrees of freedom for $t$ are different between the two situations:
                           Independent Samples ($n_1 = n_2 = n$)         Paired Sample ($n$ pairs)
Estimated standard error   $S_{\text{pooled}}\sqrt{\frac{1}{n} + \frac{1}{n}}$   $\frac{S_D}{\sqrt{n}}$
d.f. of $t$                $2n - 2$                                      $n - 1$

Because the length of a confidence interval is determined by these two components, we now examine their behavior under the two competing sampling schemes.
Paired sampling results in a loss of degrees of freedom and consequently in a larger value of $t_{\alpha/2}$. For instance, with a paired sample of $n = 10$ we have $t_{.05} = 1.833$ with d.f. = 9. But the $t$-value associated with independent samples, each of size 10, is $t_{.05} = 1.734$ with d.f. = 18. Thus if the estimated standard error remains equal, then a loss of degrees of freedom tends to make confidence intervals larger for paired samples. Likewise, in testing hypotheses a loss of degrees of freedom for the $t$-test results in a loss of power to detect real differences in the population means.
The merit of paired sampling emerges when we turn our attention to the other component. If experimental units are paired so that an interfering factor is held nearly constant between members of each pair, the treatment responses $X$ and $Y$ within each pair will be equally affected by this factor. If the prevailing condition in a pair causes the $X$ measurement to be large, it will also cause the corresponding $Y$ measurement to be large and vice versa. As a result, the variance of the difference $X - Y$ will be smaller in the case of an effective pairing than it will be in the case of independent random variables. The estimated standard deviation will be typically smaller as well. With an effective pairing, the reduction in the standard deviation usually more than compensates for the loss of degrees of freedom.
In Example 10 concerning the effect of a pill in reducing blood pressure, we note that a number of important factors (age, weight, height, general health, etc.) affect a person's blood pressure. By measuring the blood pressure of the same person before and after use of the pill, these influencing factors can be held nearly constant for each pair of measurements. On the other hand, independent samples of one group of persons using the pill and a separate control group of persons not using the pill are apt to produce a greater variability in blood-pressure measurements if all the persons selected are not similar in age, weight, height, and general health.
In summary, paired sampling is preferable to independent sampling when an appreciable reduction in variability can be anticipated by means of pairing. When the experimental units are already alike or when their dissimilarities cannot be linked to identifiable factors, an arbitrary pairing may fail to achieve a reduction in variance. The loss of degrees of freedom will then make a paired comparison less precise.
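The trade-off between the two components can be made concrete with a small calculation. The sketch below rests on assumptions that are not part of the text: both responses are taken to have the same population standard deviation $\sigma$, the within-pair correlation is a known value `rho`, and population rather than estimated standard deviations are used. Under those assumptions $\mathrm{Var}(X - Y) = 2\sigma^2(1 - \rho)$ for paired units and $2\sigma^2$ for independent ones, so the ratio of confidence interval half-widths depends only on `rho` and the two $t$-values quoted above for $n = 10$. The function name `halfwidth_ratio` is illustrative.

```python
import math

# t-values quoted in the text for n = 10 (90% intervals).
T_PAIRED_DF9 = 1.833    # paired sample, d.f. = 9
T_INDEP_DF18 = 1.734    # two independent samples of size 10, d.f. = 18

def halfwidth_ratio(rho, t_paired=T_PAIRED_DF9, t_indep=T_INDEP_DF18):
    """Half-width of the paired interval divided by that of the independent-samples interval.

    Idealized: equal sigma for X and Y, within-pair correlation rho assumed known.
    Paired:      standard error = sqrt(2 * sigma^2 * (1 - rho) / n)
    Independent: standard error = sqrt(2 * sigma^2 / n)
    """
    return (t_paired / t_indep) * math.sqrt(1.0 - rho)

for rho in (0.0, 0.3, 0.6, 0.9):
    print(f"rho = {rho:.1f}: paired / independent half-width = {halfwidth_ratio(rho):.3f}")
```

A ratio above 1 means the paired interval is wider. With `rho = 0` (an ineffective pairing) only the loss of degrees of freedom is felt; once the correlation is appreciable, the reduction in the standard deviation outweighs that loss.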
KEY IDEAS
A carefully designed experiment is fundamental to the success of a comparative study. The appropriate methods of statistical inference depend on the sampling scheme used to collect data. The most basic experimental designs to compare two treatments are: independent samples and matched pair sample.
The design of independent samples requires the subjects to be randomly selected for assignment to each treatment. Randomization prevents uncontrolled factors from systematically favoring one treatment over the other.
With a matched pair design, subjects in each pair are alike while those in different pairs may be dissimilar. For each pair the two treatments should be randomly allocated to the members. Pairing subjects according to some feature prevents that source of variation from interfering with treatment comparisons. By contrast, random allocation of subjects according to the independent random sampling design spreads these variations between the two treatments.
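KEY FORMULAS
Inferences with Two Independent Random Samples
(a) Large samples. When $n_1$ and $n_2$ are both greater than 30, inferences about $\mu_1 - \mu_2$ are based on the fact that
$\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$ is approximately $N(0, 1)$
A $100(1 - \alpha)\%$ confidence interval for $(\mu_1 - \mu_2)$ is
$(\bar{X} - \bar{Y}) \pm z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
To test $H_0\colon \mu_1 - \mu_2 = \delta_0$, we use the normal test statistic
$Z = \frac{(\bar{X} - \bar{Y}) - \delta_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
No assumptions are needed in regard to the shape of the population distributions.
(b) Small samples. When $n_1$ and $n_2$ are small, inferences using the $t$ distribution require the assumptions:
(i) Both populations are normal.
(ii) $\sigma_1 = \sigma_2$
The common $\sigma^2$ is estimated by
$S^2_{\text{pooled}} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
Inferences about $\mu_1 - \mu_2$ are based on
$t = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_{\text{pooled}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, d.f. $= n_1 + n_2 - 2$
A $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is
$(\bar{X} - \bar{Y}) \pm t_{\alpha/2}\, S_{\text{pooled}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
To test $H_0\colon \mu_1 - \mu_2 = \delta_0$, the test statistic is
$t = \frac{(\bar{X} - \bar{Y}) - \delta_0}{S_{\text{pooled}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, d.f. $= n_1 + n_2 - 2$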
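The small-sample formulas for two independent samples can be traced step by step in code. The following Python sketch uses two small samples that are hypothetical and chosen only so the arithmetic is easy to follow; the function name `pooled_t` is likewise illustrative. It computes $S^2_{\text{pooled}}$, the $t$ statistic for $H_0\colon \mu_1 - \mu_2 = \delta_0$, and the degrees of freedom. A critical value would still have to be looked up in the $t$ table.

```python
import math
import statistics

def pooled_t(x, y, delta0=0.0):
    """Pooled variance, t statistic, and d.f. for two independent samples.

    Relies on the assumptions listed above: normal populations with sigma_1 = sigma_2.
    """
    n1, n2 = len(x), len(y)
    s1_sq = statistics.variance(x)   # sample variances, divisor n - 1
    s2_sq = statistics.variance(y)
    df = n1 + n2 - 2
    s2_pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    std_err = math.sqrt(s2_pooled) * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(x) - statistics.mean(y) - delta0) / std_err
    return s2_pooled, t, df

# Hypothetical samples, not taken from the text.
sample_x = [12, 15, 11, 14]
sample_y = [9, 10, 8, 13]
s2_pooled, t_stat, df = pooled_t(sample_x, sample_y)

print(f"pooled variance = {s2_pooled:.3f}, t = {t_stat:.3f}, d.f. = {df}")
```

For these numbers the sample means are 13 and 10 and the pooled variance is 4, so the statistic is $3/(2\sqrt{1/4 + 1/4})$, which is about 2.12 with 6 degrees of freedom.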
Inferences with a Matched Pair Sample
With a paired sample $(X_1, Y_1), \ldots, (X_n, Y_n)$, the first step is to calculate the differences $D_i = X_i - Y_i$, their mean $\bar{D}$, and standard deviation $S_D$. If $n$ is small, we assume that the $D_i$'s are normally distributed $N(\delta, \sigma_D)$. Inferences about $\delta$ are based on
$t = \frac{\bar{D} - \delta}{S_D/\sqrt{n}}$, d.f. $= n - 1$
A $100(1 - \alpha)\%$ confidence interval for $\delta$ is
$\bar{D} \pm t_{\alpha/2}\, S_D/\sqrt{n}$
The test of $H_0\colon \delta = \delta_0$ is performed with the test statistic
$t = \frac{\bar{D} - \delta_0}{S_D/\sqrt{n}}$, d.f. $= n - 1$
If $n$ is large, the assumption of normal distribution for the $D_i$'s is not needed. Inferences are based on the fact that
$Z = \frac{\bar{D} - \delta}{S_D/\sqrt{n}}$ is approximately $N(0, 1)$
6. EXERCISES
6.1 The following summary is recorded for independent samples from two populations:

Sample 1            Sample 2
$n_1 = 60$          $n_2 = 60$
$\bar{x} = 136$     $\bar{y} = 112$
$s_1^2 = 86$        $s_2^2 = 137$

(a) Construct a 98% confidence interval for $\mu_1 - \mu_2$.
(b) Test $H_0\colon \mu_1 - \mu_2 = 20$ vs. $H_1\colon \mu_1 - \mu_2 \neq 20$ with $\alpha = .02$.
(c) Test $H_0\colon \mu_1 - \mu_2 = 20$ vs. $H_1\colon \mu_1 - \mu_2 > 20$ with $\alpha = .05$.
6.2 Rural and urban students are to be compared on the basis of their scores on a nationwide musical aptitude test. Two random samples of sizes 90 and 100 are selected from rural and urban seventh grade students. The summary statistics from the test scores are:

                     Rural   Urban
Sample size          90      100
Mean                 76.4    81.2
Standard deviation   8.2     7.6

Establish a 98% confidence interval for the difference in population mean scores between urban and rural students.
6.3 Construct a test to determine if there is a significant difference between the population mean scores in Exercise 6.2. Use $\alpha = .05$.
6.4 A study of postoperative pain relief is conducted to determine if drug A has a significantly longer duration of pain relief than drug B. Observations of the hours of pain relief are recorded for 55 patients given drug A and 58 patients given drug B. The summary statistics are
                     A       B
Mean                 5.64    5.03
Standard deviation   1.25    1.82

(a) Formulate $H_0$ and $H_1$.
(b) State the test statistic and the rejection region with $\alpha$ =
(c) State the conclusion of your test with $\alpha$ =
P-value and comment.
6.5 Consider the data of Exercise 6.4.
(a) Construct a 90% confidence interval for $\mu_A - \mu_B$.
(b) Give a 95% confidence interval for $\mu_A$ using the data of drug A alone. (Note: Refer to Chapter 9.)
6.6 Obtain $s^2_{\text{pooled}}$ for the data on sleeping rabbits given on the chapter frontpiece.
6.7 Given the following two samples:
7, 9, 8 and 6, 2, 4
obtain (a) $s^2_{\text{pooled}}$ and (b) the value of the $t$ statistic for testing $H_0\colon \mu_1 - \mu_2 = 0$.
6.8 Two work designs are being considered for possible adoption in an assembly plant. A time study is conducted with 10 workers using design 1 and 12 workers using design 2. The means and standard deviations of their assembly times (in minutes) are:

                     Design 1   Design 2
Mean                 78.3       85.6
Standard deviation   4.8        6.5

Is the mean assembly time significantly higher for design 2?
6.9 Refer to the data of Exercise 6.8. Give 95% confidence intervals for the mean assembly times for design 1 and design 2 individually (see Chapter 10).
6.10 An investigation is conducted to determine if the mean age of welfare recipients differs between two cities A and B. Random samples of 75 and 100 welfare recipients are selected from city A and city B, respectively, and the following computations are made:
                     City A   City B
Mean                 38       43
Standard deviation   6.8      7.5

(a) Do the data provide strong evidence that the mean ages are different in city A and city B? (Test at $\alpha = .02$.)
(b) Construct a 98% confidence interval for the difference in mean ages between A and B.
(c) Construct a 98% confidence interval for the mean age for city A and city B individually. (Note: Refer to Chapter 9.)
6.11 To compare two programs for training industrial workers to perform a skilled job, 20 workers are included in an experiment. Of these, 10 are selected at random to be trained by method 1; the remaining 10 workers are to be trained by method 2. After completion of training, all the workers are subjected to a time-and-motion test that records the speed of performance of a skilled job. The following data are obtained:
Time (in minutes)
(a) Can you conclude from the data that the mean job time is significantly less after training with method 1 than after training with method 2? (Test with $\alpha = .05$.)
(b) State the assumptions you make for the population distributions.
(c) Construct a 95% confidence interval for the population mean difference in job times between the two methods.
6.12 An investigation is undertaken to determine how the administration of a growth hormone affects the weight gain of pregnant rats. Weight gains during gestation are recorded for 6 control rats and 6 rats receiving the growth hormone. The following summary statistics are obtained [Source: V. R. Sara et al., Science, Vol. 186 (1974), 446]:
                     Control rats   Hormone rats
Mean                 41.8           60.8
Standard deviation   7.6            16.4

(a) State the assumptions about the populations and test to determine if the mean weight gain is significantly higher for the rats receiving the hormone than for the rats in the control group.
(b) Do the data indicate that you should be concerned about the possible violation of any assumption? If so, which one?
6.13 Referring to Exercise 6.12, suppose that you are asked to design an experiment to study the effect of a hormone injection on the weight gain of pregnant rats during gestation. You have decided to inject 6 of the 12 rats available for the experiment, and to retain the other 6 as controls.
(a) Briefly explain why it is important to randomly divide the rats into the two groups. What might be wrong with the experimental results if you choose to give the hormone treatment to 6 rats that are easy to grab from their cages?
(b) Suppose that the 12 rats are tagged with serial numbers from 1 through 12 and that 12 marbles identical in appearance are also numbered from 1 through 12. How can you use these marbles to randomly select the rats in the treatment and control groups?
6.14 To compare the effectiveness of isometric and isotonic exercise methods, 20 potbellied business executives are included in an experiment: 10 are selected at random and assigned to one exercise method; the remaining 10 are assigned to the other exercise method. After five weeks, the reductions in abdomen measurements are recorded in centimeters, and the following results are obtained:

                     Isometric Method A   Isotonic Method B
Mean                 2.5                  3.1
Standard deviation   0.8                  1.0
(a) Do these data support a claim that the isotonic method is more effective?
(b) Construct a 95% confidence interval for $\mu_B - \mu_A$.
6.15 Refer to Exercise 6.14.
(a) Aside from the type of exercise method, identify a few other factors that are likely to have an important effect on the amount of reduction accomplished in a five-week period.
(b) What role does randomization play in achieving a valid comparison between the two exercise methods?
(c) If you were to design this experiment, describe how you would divide the 20 business executives into two groups.
6.16 In the early 1970s, students started a phenomenon called "streaking." Within a two-week period following the first streaking sighted on campus, a standard psychological test was given to a group of 19 males who were admitted streakers and to a control group of 19 males who were nonstreakers. S. Stoner and M. Watman reported the following data [Psychology, Vol. 11, No. 4 (1975), 14-16], regarding scores on a test designed to determine extroversion:

Streakers             Nonstreakers
$\bar{x} = 15.26$     $\bar{y} = 13.90$
$s_1 = 2.62$          $s_2 = 4.11$
(a) Construct a 95% confidence interval for the difference in population means. Does there appear to be a difference between the two groups?
(b) It may be true that those who admit to streaking differ from those who do not admit to streaking. In light of this possibility, what criticism can be made of your analytical conclusion?
6.17 It is claimed that an industrial safety program is effective in reducing the loss of working hours due to factory accidents. The following data are collected concerning the weekly loss of working hours due to accidents in 5 plants both before and after the safety program is initiated.
Do the data substantiate the claim? Use $\alpha$ =
6.18 Two methods of memorizing difficult material are being tested to determine if one produces better retention. Nine pairs of students are included in the study. The students in each pair are matched according to I.Q. and academic background and then are assigned to the two methods at random. A memorization test is given to all the students, and the following scores are obtained:
Method
At $\alpha = .05$, test to determine if there is a significant difference in the effectiveness of the two methods.
6.19 In each of the following cases, how would you select the experimental units and conduct the experiment?
(a) Compare the mileage obtained from two gasolines; 16 cars are available for the experiment.
(b) Test two varnishes; 12 birch boards are available for the experiment.
(c) Compare two methods of teaching basic ice skating; 40 seven-year-old boys are available for the experiment.
6.20 Referring to Exercise 4.6, suppose that the two plants at each location are chosen from a row stretching in the East-West direction. In designing this experiment, you must decide which of the two plants at each location, the one in the East or the one in the West, is to be given the chemical additive.
(a) Explain how by repeatedly tossing a coin you can randomly allocate the treatment to the plants at the 10 locations.
(b) Perform the randomization by actually tossing a coin 10 times.
6.21 A trucking firm wishes to choose between two alternate routes for transporting merchandise from one depot to another. One major concern is the travel time. In a study, 5 drivers were randomly selected from a group of 10 and assigned to route A; the other 5 were assigned to route B. The following data were obtained.
Travel Time (hours)
Route A
Route B
(a) Is there a significant difference between the mean travel times between the two routes? State the assumptions you have made in performing the test.
(b) Suggest an alternative design for this study that would make a comparison more effective.
6.22 A study is to be made of the relative effectiveness of two kinds of cough medicines in increasing sleep. Six people with colds are given medicine A the first night and medicine B the second night. Their hours of sleep each night are recorded. The data are:
Subject
Medicine A
Medicine B
(a) Establish a 95% confidence interval for the mean increase in hours of sleep from medicine B to medicine A.
(b) How and what would you randomize in this study? Briefly explain your reason for randomization.

6.23 It is anticipated that a new instructional method will more effectively improve the reading ability of elementary-school children than the standard method currently in use. To test this conjecture, 16 children are divided at random into two groups of 8 each. One group is instructed using the standard method and the other group is instructed using the new method. The children's scores on a reading test are found to be:
Reading
Standard
Use both hypothesis test and confidence interval methods to draw inferences about the difference of the population mean scores. Take $\alpha = .05$.

6.24 Five pairs of tests are conducted to compare two methods of making rope. Each sample batch contains enough hemp to make two ropes. The tensile strength measurements are:
Method
(a) Treat the data as five paired observations, and calculate a 95% confidence interval for the mean difference in tensile strengths between ropes made by the two methods.
(b) Repeat the calculation of a 95% confidence interval treating the data as independent random samples.
(c) Briefly discuss the conditions under which each type of analysis would be appropriate.

6.25 Specimens of brain tissue are collected by performing autopsies on 9 schizophrenic patients and 9 control patients of comparable ages. A certain enzyme activity is measured for each subject in terms of the amount of a substance formed per gram of tissue per hour. The following means and standard deviations are calculated from the data [Source: R. J. Wyatt et al., Science, Vol. 187 (1975), 869]:

                        Control Subjects    Schizophrenic Subjects
  Mean                        39.8                  35.5
  Standard deviation           8.16                  6.93

(a) Test to determine if the mean activity is significantly lower for the schizophrenic subjects than for the control subjects. Use $\alpha = .05$.
(b) Construct a 95% confidence interval for the mean difference in enzyme activity between the two groups.

6.26 A two-sample analysis can be done using the MINITAB command POOLED T 95%, where 95% denotes the confidence level.

  SET C1
  4 6 2 7
  SET C2
  7 9 5
  POOLED T 95% FOR DATA C1 AND C2

  TWOSAMPLE T FOR C1 VS C2
        N    MEAN   STDEV   SE MEAN
  C1    4    4.75    2.22     1.1
  C2    3    7.00    2.00     1.2

  95 PCT CI FOR MU C1 - MU C2: (-6.4, 1.9)
  TTEST MU C1 = MU C2 (VS NE): T = -1.38  P = 0.23  DF = 5.0

Find a 97% confidence interval for the difference of means in Exercise 5.11.
Regression Analysis-I
Simple Linear Regression

1. INTRODUCTION
2. REGRESSION WITH A SINGLE PREDICTOR
3. A STRAIGHT-LINE REGRESSION MODEL
4. THE METHOD OF LEAST SQUARES
5. THE SAMPLING VARIABILITY OF THE LEAST SQUARES ESTIMATORS-TOOLS FOR INFERENCE
6. IMPORTANT INFERENCE PROBLEMS
7. THE STRENGTH OF A LINEAR RELATION
8. REMARKS ABOUT THE STRAIGHT-LINE MODEL
[Figure: Serial Female Sex Changes After Simultaneous Removal of Males from Social Groups of a Coral Reef Fish.]
Source: Science, Vol. 209, September 5, 1980.
1. INTRODUCTION
Until now we have discussed statistical inferences based on the sample measurements of a single variable. In many investigations, two or more variables are observed for each experimental unit in order to determine:
(i) Whether the variables are related.
(ii) How strong the relationships appear to be.
(iii) Whether one variable of primary interest can be predicted from observations on the others.

The subject of regression analysis concerns the study of relationships among variables, for the purpose of constructing models for prediction and making other inferences. It treats two-variable (bivariate) or several-variable (multivariate) data. Chapter 3 provided a glimpse of this subject, but it was limited to a descriptive study of bivariate data.

One may be curious about why the study of relationships of variables has been given the rather unusual name "regression." Historically, the word "regression" was first used in its present technical context by a British scientist, Sir Francis Galton, who analyzed the heights of sons and the average heights of their parents. From his observations, Galton concluded that sons of very tall (or short) parents were generally taller (shorter) than the average but not as tall (short) as their parents. This result was published in 1885 under the title "Regression Toward Mediocrity in Hereditary Stature." In this context, "regression toward mediocrity" meant that the sons' heights tended to revert toward the average rather than progress to more extremes. However, in the course of time, the word "regression" became synonymous with the statistical study of relation among variables.

[Cartoon by Johnny Hart: counting cricket chirps to estimate the outside temperature. By permission of Johnny Hart and Field Enterprises, Inc.]

Studies of relation among variables abound in virtually all disciplines of science and the humanities. We outline just a few illustrative situations in order to bring the object of regression analysis into sharp focus.

EXAMPLE 1
A factory manufactures items in batches and the production manager wishes to relate the production cost (y) of a batch to the batch size (x). Certain costs are practically constant, regardless of the batch size x. Building costs and administrative and supervisory salaries are some examples. Let us denote the fixed costs collectively by F. Certain other costs may be directly proportional to the number of units produced. For example, both the raw materials and labor required to produce the product are included in this category. Let C denote the cost of producing one item. In the absence of any other factors, we can then expect to have the relation

$$y = F + Cx$$

In reality other factors also affect the production cost, often in unpredictable ways. Machines occasionally break down and result in lost time and added expenses for repair. Variation of the quality of the raw materials may also cause occasional slowdown of the production process. Thus an ideal relation can be masked by random disturbances. Consequently, the relationship between y and x must be investigated by a statistical analysis of the cost and batch-size data.
EXAMPLE 2
Suppose that the yield (y) of tomato plants in an agricultural experiment is to be studied in relation to the dosage (x) of a certain fertilizer, while other contributing factors such as irrigation and soil dressing are to remain as constant as possible. The experiment consists of applying different dosages of the fertilizer, over the range of interest, in different plots and then recording the tomato yield from these plots. Different dosages of the fertilizer will typically produce different yields, but the relationship is not expected to follow a precise mathematical formula. Aside from unpredictable chance variations, the underlying form of the relation is not known.
EXAMPLE 3
The aptitude of a newly trained operator for performing a skilled job depends on both the duration of the training period and the nature of the training program. To evaluate the effectiveness of the training program, we must conduct an experimental study of the relation between growth in skill or learning (y) and duration (x) of the training. It is too much to expect a precise mathematical relation simply because no two human beings are exactly alike. Yet an analysis of the data of the two variables could help us to assess the nature of the relation and to utilize it in evaluating a training program.

These examples illustrate the simplest settings for regression analysis where one wishes to determine how one variable is related to one other variable. In more complex situations several variables may be interrelated, or one variable of major interest may depend on several influencing variables. Regression analysis extends to these multivariate problems. (See Section 3, Chapter 13.)
2. REGRESSION WITH A SINGLE PREDICTOR
A regression problem involving a single predictor (also called simple regression) arises when we wish to study the relation between two variables x and y and use it to predict y from x. The variable x acts as an independent variable whose values are controlled by the experimenter. The variable y depends on x and is also subject to unaccountable variations or errors.
Notation
x = independent variable, also called predictor variable, causal variable, or input variable
y = dependent or response variable
For clarity, we introduce the main ideas of regression in the context of a specific experiment. This experiment, described in Example 4, and the data set of Table 1 will be referred to throughout this chapter. By so doing we provide a flavor of the subject matter interpretation of the various inferences associated with a regression analysis.

EXAMPLE 4
In one stage of the development of a new drug for an allergy, an experiment is conducted to study how different dosages of the drug affect the duration of relief from the allergic symptoms. Ten patients are included in the experiment. Each patient receives a specified dosage of the drug and is asked to report back as soon as the protection of the drug seems to wear off. The observations are recorded in Table 1, which shows the dosage (x) and duration of relief (y) for the ten patients.
Table 1 Dosage (x) (in Milligrams) and the Number of Days of Relief (y) from Allergy for Ten Patients

  Dosage    Duration of Relief
    x               y
    3               9
    3               5
    4              12
    5               9
    6              14
    6              16
    7              22
    8              18
    8              24
    9              22
Seven different dosages are used in the experiment, and some of these are repeated for more than one patient. A glance at the table shows that y generally increases with x, but it is difficult to say much more about the form of the relation simply by looking at this tabular data.

For a generic experiment, we use n to denote the sample size or the number of runs of the experiment. Each run gives a pair of observations (x, y) in which x is the fixed setting of the independent variable and y denotes the corresponding response. See Table 2.
Table 2 Data Structure for a Simple Regression

  Setting of the Independent Variable    Response
              $x_1$                       $y_1$
              $x_2$                       $y_2$
              ...                         ...
              $x_n$                       $y_n$

We begin our analysis by graphing the data.
First Step in the Analysis
In investigating the relationship between two variables, plotting a scatter diagram is an important preliminary step prior to undertaking a formal statistical analysis.

The scatter diagram of the observations in Table 1 appears in Figure 1. This scatter diagram reveals that the relationship is approximately linear in nature; that is, the points seem to cluster around a straight line.

Figure 1 Scatter diagram of the data of Table 1.
Because a linear relation is the simplest relationship to handle mathematically, we present the details of the statistical regression analysis for this case. Other situations can often be reduced to this case by applying a suitable transformation to one or both variables.

3. A STRAIGHT-LINE REGRESSION MODEL
Recall that if the relation between y and x is exactly a straight line, then the variables are connected by the formula

$$y = \beta_0 + \beta_1 x$$

where $\beta_0$ indicates the intercept of the line with the y-axis and $\beta_1$ represents the slope of the line, or the change in y per unit change in x (see Figure 2).

Statistical ideas must be introduced into the study of relation when the points in a scatter diagram do not lie perfectly on a line, as in Figure 1. We think of these data as observations on an underlying linear relation that is being masked by random disturbances or experimental errors. Given this viewpoint, we formulate the following linear regression model as a tentative representation of the mode of relationship between y and x.

Statistical Model for a Straight-Line Regression
We assume that the response Y is a random variable that is related to the controlled variable x by

$$Y_i = \beta_0 + \beta_1 x_i + e_i, \qquad i = 1, \ldots, n$$

where:
(a) $Y_i$ denotes the response corresponding to the ith experimental run in which the input variable x is set at the value $x_i$.
(b) $e_1, \ldots, e_n$ are the unknown error components that are superimposed on the true linear relation. These are unobservable random variables, which we assume are independently and normally distributed with mean zero and an unknown standard deviation $\sigma$.
(c) The parameters $\beta_0$ and $\beta_1$, which together locate the straight line, are unknown.
Figure 2 Graph of straight line $y = \beta_0 + \beta_1 x$.

According to this model, the observation $Y_i$ corresponding to level $x_i$ of the controlled variable is one observation from the normal distribution with mean $= \beta_0 + \beta_1 x_i$ and standard deviation $= \sigma$. One interpretation of this is that as we attempt to observe the true value on the line, nature adds the random error $e_i$ to this quantity. This statistical model is illustrated in Figure 3, which shows a few normal distributions for the response variable Y. All these distributions have the same standard deviation and their means lie on the unknown true straight line $\beta_0 + \beta_1 x$. Aside from the fact that $\sigma$ is unknown, the line on which the means of these normal distributions are located is also unknown. In fact, an important objective of the statistical analysis is to estimate this line.

Figure 3 Normal distributions of Y with means on a straight line.
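The model can be made concrete with a small simulation. The sketch below is only an illustration under assumed values: the parameters $\beta_0 = 2$, $\beta_1 = 4$, $\sigma = 3$ are borrowed from Exercise 3.2, and the function name `simulate_responses` is hypothetical. It draws many responses at the single level x = 1 and compares their average and spread with the model mean $\beta_0 + \beta_1 x$ and with $\sigma$.

```python
import random

def simulate_responses(xs, beta0, beta1, sigma, rng):
    """Return one response Y = beta0 + beta1*x + e per x, with e normal(0, sigma)."""
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Assumed parameter values (taken from Exercise 3.2, for illustration only).
beta0, beta1, sigma = 2.0, 4.0, 3.0
rng = random.Random(1)  # fixed seed so that the run is repeatable

x_level = 1.0
draws = simulate_responses([x_level] * 20000, beta0, beta1, sigma, rng)

mean_y = sum(draws) / len(draws)
sd_y = (sum((y - mean_y) ** 2 for y in draws) / (len(draws) - 1)) ** 0.5

print("model mean at x = 1:", beta0 + beta1 * x_level)
print("simulated mean:", round(mean_y, 2), "simulated sd:", round(sd_y, 2))
```

The simulated mean should fall close to 6 and the simulated standard deviation close to 3, which is what the model asserts for every level of x: only the mean moves along the line, while the spread stays the same.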
EXERCISES
3.1 Identify the values of the parameters $\beta_0$, $\beta_1$, and $\sigma$ in the statistical model
$$Y = 8 - 6x + e$$
where e is a normal random variable with mean 0 and standard deviation 4.

3.2 Under the linear regression model:
(a) Determine the mean and standard deviation of Y, for x = 1, when $\beta_0 = 2$, $\beta_1 = 4$, and $\sigma = 3$.
(b) Repeat part (a) with x = 2.

3.3 Graph the straight line for the means of the linear regression model $Y = \beta_0 + \beta_1 x + e$ having $\beta_0 = 7$ and $\beta_1 = 2$.

3.4 Consider the linear regression model $Y = \beta_0 + \beta_1 x + e$, where $\beta_0 = 3$, $\beta_1 = 5$, and the normal random variable e has standard deviation 3.
(a) What is the mean of the response Y when x = 4? When x = 5?
(b) Will the response at x = 5 always be larger than a response at x = 4? Explain.
4. THE METHOD OF LEAST SQUARES
Let us tentatively assume that the preceding formulation of the model is correct. We can then proceed to estimate the regression line and to solve a few related inference problems. The problem of estimating the regression parameters $\beta_0$ and $\beta_1$ can be viewed as fitting the best straight line on the scatter diagram. One can draw a line by eyeballing the scatter diagram, but such a judgment may be open to dispute. Moreover, statistical inferences cannot be based on a line that is estimated subjectively. On the other hand, the method of least squares is an objective and efficient method of determining the best-fitting straight line. Moreover, this method is quite versatile because its application extends beyond the simple straight-line regression model.

Suppose that an arbitrary line $y = b_0 + b_1 x$ is drawn on the scatter diagram as it is in Figure 4. At the value $x_i$ of the independent variable, the y value predicted by this line is $b_0 + b_1 x_i$, whereas the observed value is $y_i$. The discrepancy between the observed and predicted y values is then $(y_i - b_0 - b_1 x_i) = d_i$, which is the vertical distance of the point from the line. Considering such discrepancies at all the n points, we take

$$D = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$

as an overall measure of the discrepancy of the observed points from the trial line $y = b_0 + b_1 x$. The magnitude of D obviously depends on the line that is drawn: in other words, it depends on $b_0$ and $b_1$, the two quantities that determine the trial line. A good fit will make D as small as possible. We now state the principle of least squares in general terms to indicate its usefulness to fitting many other models.

Figure 4 Deviations of the observations from a line $y = b_0 + b_1 x$.
The Principle of Least Squares
Determine the values for the parameters so that the overall discrepancy D is minimized. The parameter values thus determined are called the least squares estimates.

For the straight-line model, the least squares principle involves the determination of $b_0$ and $b_1$ to minimize

$$D = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$
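To see how D responds to the choice of line, one can evaluate it for a few trial lines on the data of Table 1. The sketch below is illustrative only: the four trial lines are arbitrary choices made for this example, and the function name `discrepancy` is hypothetical.

```python
# Data of Table 1: dosage x and days of relief y.
x = [3, 3, 4, 5, 6, 6, 7, 8, 8, 9]
y = [9, 5, 12, 9, 14, 16, 22, 18, 24, 22]

def discrepancy(b0, b1, xs, ys):
    """Overall discrepancy D: sum of squared vertical distances from y = b0 + b1*x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(xs, ys))

# Arbitrary trial lines (b0, b1), chosen only for illustration.
trial_lines = [(0.0, 2.0), (0.0, 2.5), (-1.0, 2.7), (-2.0, 3.0)]

d_values = {line: discrepancy(line[0], line[1], x, y) for line in trial_lines}
for (b0, b1), d in d_values.items():
    print(f"b0 = {b0:5.1f}, b1 = {b1:4.1f}, D = {d:7.2f}")

# The trial line with the smallest D among those tried.
best = min(d_values, key=d_values.get)
print("smallest D among the trial lines:", best)
```

Among these four candidates, the line with $b_0 = -1$ and $b_1 = 2.7$ gives the smallest D. The least squares principle replaces this kind of trial and error with formulas that locate the minimizing pair directly.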
The quantities $b_0$ and $b_1$ thus determined are denoted by $\hat{\beta}_0$ and $\hat{\beta}_1$, respectively, and are called the least squares estimates of the regression parameters $\beta_0$ and $\beta_1$. The best fitting straight line is then given by the equation

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

To describe the formulas for the least squares estimators we first introduce some basic notation.

Basic Notation

$$\bar{x} = \frac{1}{n}\sum x, \qquad \bar{y} = \frac{1}{n}\sum y$$

$$S_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$

$$S_{yy} = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$

$$S_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$$

The quantities $\bar{x}$ and $\bar{y}$ are the sample means of the x and y values; $S_{xx}$ and $S_{yy}$ are the sums of squared deviations from the means, and $S_{xy}$ is the sum of the cross products of deviations. These five summary statistics are the key ingredients for calculating the least squares estimates and handling the inference problems associated with the linear regression model. The reader may review Sections 4 and 5 of Chapter 3 where calculations of these statistics were illustrated.

The formulas for the least squares estimates are:

Least squares estimate of $\beta_0$:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Least squares estimate of $\beta_1$:
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$$
The estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ can then be used to locate the best fitting line:

Fitted (or estimated) regression line:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

As we have already explained, this line provides the best fit to the data in the sense that the sum of squares of the deviations, or

$$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$

is the smallest.

The individual deviations of the observations $y_i$ from the fitted values $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ are called the residuals, and we denote these by $\hat{e}_i$.

Residuals:
$$\hat{e}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i, \qquad i = 1, \ldots, n$$

While some residuals are positive and some negative, a property of the least squares fit is that the sum of the residuals is always zero. In Chapter 13 we will discuss how the residuals can be used to check the assumptions of a regression model. For now, the sum of squares of the residuals is a quantity of interest because it leads to an estimate of the variance $\sigma^2$ of the error distributions illustrated in Figure 3. The residual sum of squares is also called the sum of squares due to error and is abbreviated as SSE.

The residual sum of squares or the sum of squares due to error is
$$\text{SSE} = \sum_{i=1}^{n} \hat{e}_i^2 = S_{yy} - \frac{S_{xy}^2}{S_{xx}}$$

The second expression for SSE, which follows after some algebraic manipulations (see Exercise 4.7), is handy for directly calculating SSE. However, we stress the importance of determining the individual residuals for their role in model checking (see Section 4, Chapter 13).

An estimate of $\sigma^2$ is obtained by dividing SSE by (n - 2). The reduction by two is because two degrees of freedom are lost from estimating the two parameters $\beta_0$ and $\beta_1$.

The variance $\sigma^2$ is estimated by
$$s^2 = \frac{\text{SSE}}{n - 2}$$

In applying the least squares method to a given data set, we first compute the basic quantities $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{yy}$, and $S_{xy}$. Then the preceding formulas can be used to obtain the least squares regression line, the residuals, and the value of SSE. Computations for the data given in Table 1 are illustrated in Table 3.
Table 3 Computations for the Least Squares Line, SSE, and Residuals Using the Data of Table 1

           x      y     x^2     y^2      xy    b0 + b1 x   Residual
           3      9      9      81       27      7.15        1.85
           3      5      9      25       15      7.15       -2.15
           4     12     16     144       48      9.89        2.11
           5      9     25      81       45     12.63       -3.63
           6     14     36     196       84     15.37       -1.37
           6     16     36     256       96     15.37         .63
           7     22     49     484      154     18.11        3.89
           8     18     64     324      144     20.85       -2.85
           8     24     64     576      192     20.85        3.15
           9     22     81     484      198     23.59       -1.59
  Total   59    151    389    2651     1003                   .04 (rounding error)

(The column headed b0 + b1 x gives the fitted values $\hat{\beta}_0 + \hat{\beta}_1 x$; the last column gives the residuals $\hat{e}$.)

$$\bar{x} = 5.9, \qquad \bar{y} = 15.1$$
$$S_{xx} = 389 - \frac{(59)^2}{10} = 40.9$$
$$S_{yy} = 2651 - \frac{(151)^2}{10} = 370.9$$
$$S_{xy} = 1003 - \frac{59 \times 151}{10} = 112.1$$
$$\hat{\beta}_1 = \frac{112.1}{40.9} = 2.74$$
$$\hat{\beta}_0 = 15.1 - 2.74 \times 5.9 = -1.07$$
$$\text{SSE} = 370.9 - \frac{(112.1)^2}{40.9} = 63.6528$$
The equation of the line fitted by the least squares method is then

$$\hat{y} = -1.07 + 2.74x$$

Figure 5 shows a plot of the data along with the fitted regression line.

Figure 5 The least squares regression line for the data given in Table 1.

The residuals $\hat{e}_i = y_i - \hat{y}_i = y_i + 1.07 - 2.74x_i$ are computed in the last column of Table 3. The sum of squares of the residuals is

$$\sum \hat{e}_i^2 = (1.85)^2 + (-2.15)^2 + (2.11)^2 + \cdots + (-1.59)^2 = 63.653$$

which agrees with our previous calculations of SSE except for the error due to rounding. Theoretically, the sum of the residuals should be zero, and the difference between the sum .04 and zero is also due to rounding.

The estimate of the variance $\sigma^2$ is

$$s^2 = \frac{\text{SSE}}{n - 2} = \frac{63.6528}{8} = 7.96$$
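The hand computations for the Table 1 data can be checked with a few lines of code. The following is a minimal sketch that applies the formulas of this section directly; the variable names are our own, and because no intermediate rounding is done the results differ slightly in the last digits from the rounded values used in the text.

```python
# Data of Table 1: dosage x and days of relief y.
x = [3, 3, 4, 5, 6, 6, 7, 8, 8, 9]
y = [9, 5, 12, 9, 14, 16, 22, 18, 24, 22]
n = len(x)

# The five summary statistics.
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates of the slope and intercept.
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals, residual sum of squares, and the estimate of sigma^2.
residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s2 = sse / (n - 2)

print(f"Sxx = {s_xx:.1f}, Syy = {s_yy:.1f}, Sxy = {s_xy:.1f}")
print(f"fitted line: y = {beta0_hat:.2f} + {beta1_hat:.2f} x")
print(f"sum of residuals = {sum(residuals):.10f}")
print(f"SSE = {sse:.4f}, s^2 = {s2:.4f}")
```

Computed without rounding, the residuals sum to zero up to floating-point error, and the SSE obtained by squaring the residuals matches the shortcut $S_{yy} - S_{xy}^2/S_{xx}$. This supports the remark that the discrepancies in Table 3 are due to rounding alone.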
EXERCISES
4.1 Given these six pairs of (x, y) values:
x
I
2 3
3
45
6 3 3
(a) Plot the scatter diagram.
(b) Calculate $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{xy}$, and $S_{yy}$.
(c) Calculate the least squares estimates $\hat{\beta}_0$ and $\hat{\beta}_1$.
(d) Determine the fitted line and draw the line on the scatter diagram.

4.2 Refer to Exercise 4.1.
(a) Find the residuals and verify that they sum to zero.
(b) Calculate the residual sum of squares SSE by (i) adding the squares of the residuals, and (ii) using the formula $\text{SSE} = S_{yy} - S_{xy}^2/S_{xx}$.
(c) Obtain the estimate of $\sigma^2$.

4.3 Given the five pairs of (x, y) values:
(a) Calculate $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{xy}$, $S_{yy}$.
(b) Calculate the least squares estimates $\hat{\beta}_0$ and $\hat{\beta}_1$.
(c) Determine the fitted line.

4.4 Computing from a data set of (x, y) values, the following summary statistics were recorded:
(a) Obtain the equation of the best fitting straight line.
(b) Calculate the residual sum of squares.
(c) Estimate $\sigma^2$.

4.5 Using the formulas of $\hat{\beta}_1$ and SSE, show that SSE can also be expressed as:
(a) $\text{SSE} = S_{yy} - \hat{\beta}_1 S_{xy}$
(b) $\text{SSE} = S_{yy} - \hat{\beta}_1^2 S_{xx}$

4.6 Referring to the formulas of $\hat{\beta}_0$ and $\hat{\beta}_1$, show that the point $(\bar{x}, \bar{y})$ lies on the fitted regression line.

*4.7 To see why the residuals always sum to zero, refer to the formulas of $\hat{\beta}_0$ and $\hat{\beta}_1$, and verify that:
(a) The predicted values are $\hat{y}_i = \bar{y} + \hat{\beta}_1 (x_i - \bar{x})$.
(b) The residuals are
$$\hat{e}_i = y_i - \hat{y}_i = (y_i - \bar{y}) - \hat{\beta}_1 (x_i - \bar{x})$$
Then show that $\sum \hat{e}_i = 0$.
(c) Verify that $\sum \hat{e}_i^2 = S_{yy} + \hat{\beta}_1^2 S_{xx} - 2\hat{\beta}_1 S_{xy} = S_{yy} - S_{xy}^2/S_{xx}$.
5. THE SAMPLING VARIABILITY OF THE LEAST SQUARES ESTIMATORS-TOOLS FOR INFERENCE
It is important to remember that the line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ obtained by the principle of least squares is an estimate of the unknown true regression line $y = \beta_0 + \beta_1 x$. In our drug evaluation problem (Example 4), the estimated line is

$$\hat{y} = -1.07 + 2.74x$$

Its slope $\hat{\beta}_1 = 2.74$ suggests that the mean duration of relief increases by 2.74 days for each unit dosage of the drug. Also, if we were to estimate the expected duration of relief for a specified dosage $x^* = 4.5$ milligrams, we would naturally use the fitted regression line to calculate the estimate $-1.07 + 2.74 \times 4.5 = 11.26$ days. A few questions concerning these estimates naturally arise at this point:
(i) In light of the value 2.74 for $\hat{\beta}_1$, could the slope $\beta_1$ of the true regression line be as much as 4? Could it be zero so that the true regression line is $y = \beta_0$, which does not depend on x? What are the plausible values for $\beta_1$?
(ii) How much uncertainty should be attached to the estimated duration of 11.26 days corresponding to the given dosage $x^* = 4.5$?

To answer these and related questions, we must know something about the sampling distributions of the least squares estimators. Again the t distribution is relevant.
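(a) The standard deviations (also called standard errors) of the least squares estimators are:

$$\text{S.E.}(\hat{\beta}_1) = \frac{\sigma}{\sqrt{S_{xx}}}, \qquad \text{S.E.}(\hat{\beta}_0) = \sigma \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$$

To calculate the estimated standard error, use $s = \sqrt{\text{SSE}/(n-2)}$ in place of $\sigma$.

(b) Inferences about the slope $\beta_1$ are based on the t distribution

$$t = \frac{\hat{\beta}_1 - \beta_1}{s/\sqrt{S_{xx}}}, \qquad \text{d.f.} = n - 2$$

Inferences about the intercept $\beta_0$ are based on the t distribution

$$t = \frac{\hat{\beta}_0 - \beta_0}{s\sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}}, \qquad \text{d.f.} = n - 2$$

(c) At a specified value $x = x^*$, the expected response is $\beta_0 + \beta_1 x^*$. This is estimated by $\hat{\beta}_0 + \hat{\beta}_1 x^*$ with

Estimated standard error:
$$s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

Inferences about $\beta_0 + \beta_1 x^*$ are based on the t distribution

$$t = \frac{(\hat{\beta}_0 + \hat{\beta}_1 x^*) - (\beta_0 + \beta_1 x^*)}{s\sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}}, \qquad \text{d.f.} = n - 2$$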
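The meaning of a standard error can be illustrated by repeating the experiment many times on a computer. The sketch below is a simulation under assumed conditions, not part of the drug study: it keeps the ten dosages of Table 1 fixed, assumes hypothetical true values $\beta_0 = -1$, $\beta_1 = 2.7$, $\sigma = 3$, generates 5000 sets of responses from the model, and compares the spread of the 5000 slope estimates with $\sigma/\sqrt{S_{xx}}$.

```python
import math
import random

# Fixed settings of the independent variable (the dosages of Table 1).
x = [3, 3, 4, 5, 6, 6, 7, 8, 8, 9]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Hypothetical "true" parameters, assumed only for this simulation.
beta0, beta1, sigma = -1.0, 2.7, 3.0

def slope_estimate(ys):
    """Least squares slope Sxy / Sxx for responses ys observed at the fixed x values."""
    y_bar = sum(ys) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, ys))
    return s_xy / s_xx

rng = random.Random(2024)  # fixed seed so that the run is repeatable
slopes = []
for _ in range(5000):
    ys = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    slopes.append(slope_estimate(ys))

mean_slope = sum(slopes) / len(slopes)
sd_slope = math.sqrt(sum((b - mean_slope) ** 2 for b in slopes) / (len(slopes) - 1))
theoretical_sd = sigma / math.sqrt(s_xx)

print(f"average of the slope estimates: {mean_slope:.3f} (true slope {beta1})")
print(f"standard deviation of the slope estimates: {sd_slope:.3f}")
print(f"sigma / sqrt(Sxx): {theoretical_sd:.3f}")
```

The slope estimates center on the true slope, and their standard deviation should come out close to $\sigma/\sqrt{S_{xx}}$, in line with statement (a). In a real study only one data set is available and $\sigma$ is unknown, which is why s replaces $\sigma$ and the t distribution takes the place of the normal.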
6. IMPORTANT INFERENCE PROBLEMS
We are now prepared to test hypotheses, construct confidence intervals, and make predictions in the context of straight-line regression.

6.1 INFERENCE CONCERNING THE SLOPE $\beta_1$
In a regression analysis problem it may be of special interest to determine whether the expected response does or does not vary with the magnitude of the input variable x. According to the linear regression model,

Expected response $= \beta_0 + \beta_1 x$

This does not change with a change in x if and only if $\beta_1 = 0$. We can therefore test the null hypothesis $H_0\colon \beta_1 = 0$ against a one- or a two-sided alternative, depending on the nature of the relation that is anticipated.

Referring to statement (b) of Section 5, the null hypothesis $H_0\colon \beta_1 = 0$ is to be tested using the:

Test statistic
$$t = \frac{\hat{\beta}_1}{s/\sqrt{S_{xx}}}, \qquad \text{d.f.} = n - 2$$
EXAMPLE 5
Do the data given in Table 1 constitute strong evidence that the duration of relief increases with higher dosages of the drug?

For an increasing relation we must have $\beta_1 > 0$. Therefore, we are to test the null hypothesis $H_0\colon \beta_1 = 0$ versus the one-sided alternative $H_1\colon \beta_1 > 0$. Using the calculations that follow Table 3 we have

$$\hat{\beta}_1 = 2.74$$
$$s^2 = \frac{\text{SSE}}{n - 2} = \frac{63.6528}{8} = 7.9566, \qquad s = 2.8207$$
$$\text{Estimated S.E.}(\hat{\beta}_1) = \frac{s}{\sqrt{S_{xx}}} = \frac{2.8207}{\sqrt{40.9}} = .441$$
$$\text{Test statistic } t = \frac{2.74}{.441} = 6.213$$

With d.f. = 8, the upper 5% tabulated value of t is 1.860. The observed t value is therefore highly significant, and $H_0$ is rejected. There is strong evidence that larger dosages of the drug tend to increase the duration of relief over the range covered in the study.

A warning is in order here concerning the interpretation of the test of $H_0\colon \beta_1 = 0$. If $H_0$ is not rejected, we may be tempted to conclude that y does not depend on x. Such an unqualified statement may be erroneous. First, the absence of a linear relation has only been established over the range of the x values in the experiment. It may be that x was just not varied enough to influence y. Second, the interpretation of lack of dependence on x is valid only if our model formulation is correct. If the scatter diagram depicts a relation on a curve but we inadvertently formulate a linear model and test $H_0\colon \beta_1 = 0$, the conclusion that $H_0$ is not rejected should be interpreted to mean "no linear relation" rather than "no relation." We elaborate on this point further in Section 7. Our present viewpoint is to assume that the model is correctly formulated and to discuss the various inference problems associated with it.

More generally, we may test whether or not $\beta_1$ is equal to some specified value $\beta_{10}$, not necessarily zero.
The test for $H_0\colon \beta_1 = \beta_{10}$ is based on

$$t = \frac{\hat{\beta}_1 - \beta_{10}}{s/\sqrt{S_{xx}}}, \qquad \text{d.f.} = n - 2$$
In addition to testing hypotheses, we can provide a confidence interval for the parameter B1 using the t distribution.
A $100(1 - \alpha)\%$ confidence interval for $\beta_1$:

$$\hat{\beta}_1 \pm t_{\alpha/2} \frac{s}{\sqrt{S_{xx}}}$$

where $t_{\alpha/2}$ is the upper $\alpha/2$ point of the t distribution with d.f. = n - 2.
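The test statistic and the confidence interval for the slope use the same ingredients, so both can be computed together. The sketch below does this for the Table 1 data, starting from the summary statistics of Table 3. The function name `slope_t` is hypothetical, and the tabulated t value 2.306 (upper 2.5% point, d.f. = 8) is typed in from the t table rather than computed.

```python
import math

# Summary statistics for the Table 1 data (from Table 3).
n = 10
s_xx, s_yy, s_xy = 40.9, 370.9, 112.1

beta1_hat = s_xy / s_xx
sse = s_yy - s_xy ** 2 / s_xx
s = math.sqrt(sse / (n - 2))          # estimate of sigma
se_beta1 = s / math.sqrt(s_xx)        # estimated standard error of the slope

def slope_t(estimate, beta10, std_error):
    """t statistic for H0: beta1 = beta10, to be compared with t tables, d.f. = n - 2."""
    return (estimate - beta10) / std_error

t_zero = slope_t(beta1_hat, 0.0, se_beta1)   # test of H0: beta1 = 0
t_four = slope_t(beta1_hat, 4.0, se_beta1)   # test of H0: beta1 = 4

T_025 = 2.306  # upper 2.5% point of t with 8 d.f., taken from the t table
ci = (beta1_hat - T_025 * se_beta1, beta1_hat + T_025 * se_beta1)

print(f"estimated S.E. of slope = {se_beta1:.3f}")
print(f"t for H0: beta1 = 0 is {t_zero:.2f}; t for H0: beta1 = 4 is {t_four:.2f}")
print(f"95% confidence interval for beta1: ({ci[0]:.2f}, {ci[1]:.2f})")
```

A hypothesized slope is rejected by the two-sided 5% test exactly when it falls outside the 95% confidence interval, so the interval summarizes the outcome of the test for every value of $\beta_{10}$ at once.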
EXAMPLE 6
Construct a 95% confidence interval for the slope of the regression line in reference to the data of Table 1.

In Example 5 we found that $\hat{\beta}_1 = 2.74$ and $s/\sqrt{S_{xx}} = .441$. The required confidence interval is given by

$$2.74 \pm 2.306 \times .441 = 2.74 \pm 1.02 \quad \text{or} \quad (1.72,\ 3.76)$$

We are 95% confident that by adding one extra milligram to the dosage, the mean duration of relief would increase somewhere between 1.72 and 3.76 days.

6.2 INFERENCE ABOUT THE INTERCEPT $\beta_0$
Although somewhat less important in practice, inferences similar to the ones outlined in Section 6.1 can be provided for the parameter $\beta_0$. The procedures are again based on the t distribution with d.f. = n - 2, stated for $\beta_0$ in Section 5. In particular:
A $100(1 - \alpha)\%$ confidence interval for $\beta_0$:

$$\hat{\beta}_0 \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$$
To illustrate this formula let us consider the data of Table 1. In Table 3 we have found $\hat{\beta}_0 = -1.07$, $\bar{x} = 5.9$, and $S_{xx} = 40.9$. Therefore, a 95% confidence interval for $\beta_0$ is calculated as

$$-1.07 \pm 2.306 \times 2.8207 \sqrt{\frac{1}{10} + \frac{(5.9)^2}{40.9}} = -1.07 \pm 6.34 \quad \text{or} \quad (-7.41,\ 5.27)$$
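The arithmetic in this interval is easy to check by machine. The sketch below repeats it from the summary statistics of Table 3; the tabulated value 2.306 is again taken from the t table, and the variable names are our own.

```python
import math

# Summary statistics for the Table 1 data (from Table 3).
n = 10
x_bar, y_bar = 5.9, 15.1
s_xx, s_yy, s_xy = 40.9, 370.9, 112.1

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
s = math.sqrt((s_yy - s_xy ** 2 / s_xx) / (n - 2))

# Estimated standard error of the intercept estimate.
se_beta0 = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)

T_025 = 2.306  # upper 2.5% point of t with 8 d.f., taken from the t table
half_width = T_025 * se_beta0
ci_beta0 = (beta0_hat - half_width, beta0_hat + half_width)

print(f"intercept estimate = {beta0_hat:.2f}, estimated S.E. = {se_beta0:.3f}")
print(f"95% confidence interval for beta0: ({ci_beta0[0]:.2f}, {ci_beta0[1]:.2f})")
```

The estimated standard error of the intercept (about 2.75) is roughly six times that of the slope (.441), which reflects how far x = 0 lies from the dosages actually used in the experiment.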
Note that $\beta_0$ represents the mean response corresponding to the value 0 for the input variable x. In the drug evaluation problem of Example 4, the parameter $\beta_0$ is of little practical interest because the range of x values covered in the experiment was 3 to 9 and it would be unrealistic to extend the line to x = 0. In fact, the estimate $\hat{\beta}_0 = -1.07$ does not have an interpretation as a (time) duration of relief.

6.3 PREDICTION OF THE MEAN RESPONSE FOR A SPECIFIED x VALUE
The most important objective in a regression study may be to employ the fitted regression in estimating the expected response corresponding to a specified level of the input variable. For example, we may want to estimate the expected duration of relief for a specified dosage $x^*$ of the drug. According to the linear model described in Section 3, the expected response at a value $x^*$ of the input variable x is given by $\beta_0 + \beta_1 x^*$. The expected response is estimated by $\hat{\beta}_0 + \hat{\beta}_1 x^*$, which is the ordinate of the fitted regression line at $x = x^*$. Referring to statement (c) of Section 5, the t distribution can be used to construct confidence intervals or test hypotheses.
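A $100(1 - \alpha)\%$ confidence interval for the expected response $\beta_0 + \beta_1 x^*$ is

$$\hat{\beta}_0 + \hat{\beta}_1 x^* \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

To test the hypothesis that $\beta_0 + \beta_1 x^* = \mu_0$, some specified value, we use

$$t = \frac{\hat{\beta}_0 + \hat{\beta}_1 x^* - \mu_0}{s\sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}}, \qquad \text{d.f.} = n - 2$$

EXAMPLE 7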
Again consider the data given in Table 1 and the calculations for the regression analysis given in Table 3. The fitted regression line is

$$\hat{y} = -1.07 + 2.74x$$

The expected duration of relief corresponding to the dosage $x^* = 6$ milligrams of the drug is estimated as

$$\hat{\beta}_0 + \hat{\beta}_1 x^* = -1.07 + 2.74 \times 6 = 15.37 \text{ days}$$

Estimated standard error:
$$s\sqrt{\frac{1}{10} + \frac{(6 - 5.9)^2}{40.9}} = 2.8207 \times .3166 = .893$$

A 95% confidence interval for the mean duration of relief with the dosage $x^* = 6$ is therefore

$$15.37 \pm t_{.025} \times .893 = 15.37 \pm 2.306 \times .893 = 15.37 \pm 2.06 \quad \text{or} \quad (13.31,\ 17.43)$$

We are 95% confident that 6 milligrams of the drug produces an average duration of relief that is between about 13.3 and 17.4 days.

Suppose that we also wish to estimate the mean duration of relief under the dosage $x^* = 9.5$. Following the same steps, the point estimate is
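$$\hat{\beta}_0 + \hat{\beta}_1 x^* = -1.07 + 2.74 \times 9.5 = 24.96 \text{ days}$$

$$\text{Estimated standard error} = 2.8207 \sqrt{\frac{1}{10} + \frac{(9.5 - 5.9)^2}{40.9}} = 1.821$$

A 95% confidence interval is

$$24.96 \pm 2.306 \times 1.821 = 24.96 \pm 4.20 \quad \text{or} \quad (20.76,\ 29.16)$$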
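Because the same formula is applied at each dosage, it is convenient to wrap it in a function. The sketch below, with the hypothetical name `mean_response_ci`, reproduces the two intervals of Example 7 from the summary statistics of Table 3 and uses the tabulated value 2.306 from the t table.

```python
import math

# Summary statistics for the Table 1 data (from Table 3).
n = 10
x_bar, y_bar = 5.9, 15.1
s_xx, s_yy, s_xy = 40.9, 370.9, 112.1

beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
s = math.sqrt((s_yy - s_xy ** 2 / s_xx) / (n - 2))

T_025 = 2.306  # upper 2.5% point of t with 8 d.f., taken from the t table

def mean_response_ci(x_star):
    """Return (estimate, standard error, lower, upper) for the mean response at x_star."""
    estimate = beta0_hat + beta1_hat * x_star
    std_error = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
    return (estimate, std_error,
            estimate - T_025 * std_error, estimate + T_025 * std_error)

est_6, se_6, lo_6, hi_6 = mean_response_ci(6.0)
est_95, se_95, lo_95, hi_95 = mean_response_ci(9.5)

print(f"x* = 6.0: estimate {est_6:.2f}, S.E. {se_6:.3f}, CI ({lo_6:.2f}, {hi_6:.2f})")
print(f"x* = 9.5: estimate {est_95:.2f}, S.E. {se_95:.3f}, CI ({lo_95:.2f}, {hi_95:.2f})")
```

The output agrees with the hand calculation up to rounding and makes the contrast between the two dosages plain: the standard error at 9.5 milligrams is about twice the standard error at 6 milligrams.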
The formula for the standard error of prediction shows that when $x^*$ is close to $\bar{x}$, the standard error is smaller than it is when $x^*$ is far removed from $\bar{x}$. This is confirmed by Example 7, where the standard error of prediction at $x^* = 9.5$ can be seen to be more than twice as large as the value at $x^* = 6$. Consequently, the confidence interval for the former is also wider. In general, prediction is more precise near the mean $\bar{x}$ than it is for values of the x variable that lie far from the mean.

Caution: Extreme caution should be exercised in extending a fitted regression line to make long-range predictions far away from the range of x values covered in the experiment. Not only does the confidence interval become so wide that predictions based on it can be extremely unreliable, but an even greater danger exists. If the pattern of the relationship between the variables changes drastically at a distant value of x, the data provide no information with which to detect such a change. Figure 6 illustrates this situation. We would observe a good linear relationship if we experimented with x values in the 5-10 range, but if the fitted line were extended to estimate the response at $x^* = 20$, then our estimate would drastically miss the mark.
[Figure 6: Danger in long-range prediction. Labeled curve: true relation.]

6.4 PREDICTION OF A SINGLE RESPONSE FOR A SPECIFIED x VALUE
Suppose that we give a specified dosage $x^*$ of the drug to a single patient and we want to predict the duration of relief from the symptoms of allergy. This problem is different from the one considered in Section 6.3, where we were interested in estimating the mean duration of relief for the population of all patients given the dosage $x^*$. The prediction is still determined from the fitted line; that is, the predicted value of the response is $\hat\beta_0 + \hat\beta_1 x^*$, as it was in the preceding case. However, the standard error of the prediction here is larger, because a single observation is more uncertain than the mean of the population distribution. We now give the formula of the estimated standard error for this case.

The estimated standard error when predicting a single observation y at a given $x^*$ is

$$s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

The formula for the confidence interval must be modified accordingly.
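The difference between the two standard errors can be checked numerically. The following is a minimal sketch, not part of the original text, that uses the drug trial summary values quoted in this section ($n = 10$, $\bar{x} = 5.9$, $S_{xx} = 40.9$, $s = 2.8207$, fitted line $-1.07 + 2.74x$). The function names are illustrative assumptions.

```python
import math

# Summary values for the drug trial data, as used in the examples of this section.
n = 10                  # number of patients
x_bar = 5.9             # mean dosage
s_xx = 40.9             # sum of squared deviations of the dosages
s = 2.8207              # estimated error standard deviation
b0, b1 = -1.07, 2.74    # fitted intercept and slope


def se_mean_response(x_star):
    """Estimated S.E. of the estimated mean response at x_star."""
    return s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)


def se_single_response(x_star):
    """Estimated S.E. when predicting one new observation at x_star."""
    return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / s_xx)


x_star = 9.5
fit = b0 + b1 * x_star
print(f"fitted value at x* = {x_star}: {fit:.2f}")
print(f"S.E. for the mean response: {se_mean_response(x_star):.3f}")
print(f"S.E. for a single response: {se_single_response(x_star):.3f}")
```

The only change between the two functions is the extra 1 under the square root, which accounts for the variability of a single observation about its mean.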
EXAMPLE 8 Once again, consider the drug trial data given in Table 1. A new trial is to be made on a single patient with the dosage $x^* = 6.5$ milligrams. The predicted duration of relief is

$$\hat\beta_0 + \hat\beta_1 x^* = -1.07 + 2.74 \times 6.5 = 16.74 \text{ days}$$

and a 95% confidence interval for this prediction is

$$16.74 \pm 2.306 \times 2.8207\sqrt{1 + \frac{1}{10} + \frac{(6.5 - 5.9)^2}{40.9}} = 16.74 \pm 6.85 \quad\text{or}\quad (9.89,\ 23.59)$$

This means that we are 95% confident that this particular patient will have relief from symptoms of allergy for about 9.9 to 23.6 days.

In the preceding discussion we have used the data of Example 4 to illustrate the various inferences associated with a straight line regression model. Example 9 gives applications to a different data set.
EXAMPLE 9 In a study to determine how the skill in doing a complex assembly job is influenced by the amount of training, 15 new recruits were given varying amounts of training ranging between 3 and 12 hours. After the training their times to perform the job were recorded. Denoting x = duration of training (in hours) and y = time to do the job (in minutes), the following summary statistics were obtained:

$$\bar{x} = 7.2,\quad \bar{y} = 45.6,\quad S_{xx} = 33.6,\quad S_{yy} = 160.2,\quad S_{xy} = -57.2$$

(a) Determine the equation of the best fitting straight line.
(b) Do the data substantiate the claim that the job time decreases with more hours of training?
(c) Estimate the mean job time for 9 hours of training, and construct a 95% confidence interval.
(d) Find the predicted y for x = 35 hours, and comment on the result.

We find:

(a) The least squares estimates are

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{-57.2}{33.6} = -1.702$$

$$\hat\beta_0 = \bar{y} - \hat\beta_1\bar{x} = 45.6 - (-1.702) \times 7.2 = 57.85$$

So, the equation of the fitted line is

$$\hat{y} = 57.85 - 1.702x$$

(b) To answer this question we are to test $H_0\colon \beta_1 = 0$ versus $H_1\colon \beta_1 < 0$. The least squares estimate of $\beta_1$ is $\hat\beta_1 = -1.702$, and we calculate

$$\text{SSE} = S_{yy} - \frac{S_{xy}^2}{S_{xx}} = 160.2 - \frac{(-57.2)^2}{33.6} = 62.824$$

$$s = \sqrt{\frac{\text{SSE}}{n - 2}} = \sqrt{\frac{62.824}{13}} = 2.198$$

$$\text{Estimated S.E.}(\hat\beta_1) = \frac{s}{\sqrt{S_{xx}}} = \frac{2.198}{\sqrt{33.6}} = .379$$

The t statistic has the value

$$t = \frac{-1.702}{.379} = -4.49, \quad \text{d.f.} = 13$$

With d.f. = 13, the lower $\alpha = .01$ point of t is $-t_{.01} = -2.650$. Since the observed $t = -4.49$ is less than $-2.650$, $H_0$ is rejected with $\alpha = .01$. We conclude that increasing the duration of training significantly reduces the mean job time within the range covered in the experiment.

(c) The expected job time corresponding to $x^* = 9$ hours is estimated as

$$\hat\beta_0 + \hat\beta_1 x^* = 57.85 + (-1.702) \times 9 = 42.53 \text{ minutes}$$

and its

$$\text{Estimated S.E.} = s\sqrt{\frac{1}{15} + \frac{(9 - 7.2)^2}{33.6}} = .888$$

Since $t_{.025} = 2.160$ with d.f. = 13, the required confidence interval is

$$42.53 \pm 2.160 \times .888 = 42.53 \pm 1.92 \quad\text{or}\quad (40.6,\ 44.5) \text{ minutes}$$

(d) Since x = 35 hours is far beyond the experimental range of 3 to 12 hours, it is not sensible to predict y at x = 35 using the fitted regression line. Here, a formal calculation gives

$$\text{Predicted job time} = 57.85 - 1.702 \times 35 = -1.72 \text{ minutes},$$

a nonsensical result.
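The arithmetic of Example 9 can be reproduced from the summary statistics alone. The sketch below is an illustration rather than part of the original text. The tabled value $t_{.025} = 2.160$ for 13 d.f. is typed in by hand, and all variable names are assumptions. Small differences in the last digit arise because the text rounds intermediate results.

```python
import math

# Summary statistics reported in Example 9.
n = 15
x_bar, y_bar = 7.2, 45.6
s_xx, s_yy, s_xy = 33.6, 160.2, -57.2

# (a) Least squares estimates.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# (b) Test statistic for H0: beta1 = 0 against H1: beta1 < 0.
sse = s_yy - s_xy ** 2 / s_xx
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(s_xx)
t_stat = b1 / se_b1

# (c) 95% confidence interval for the mean job time at x* = 9 hours.
T_025_13DF = 2.160      # upper .025 point of t with 13 d.f., from a t table
x_star = 9
fit = b0 + b1 * x_star
se_fit = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
half_width = T_025_13DF * se_fit

# (d) Formal extrapolation to x = 35 hours, far outside the 3-12 hour range.
extrapolated = b0 + b1 * 35

print(f"slope = {b1:.3f}, intercept = {b0:.2f}")
print(f"SSE = {sse:.3f}, s = {s:.3f}, S.E.(slope) = {se_b1:.3f}, t = {t_stat:.2f}")
print(f"mean at x* = 9: {fit:.2f} +/- {half_width:.2f}")
print(f"formal prediction at x = 35: {extrapolated:.2f} minutes")
```

The negative value printed for x = 35 is the same nonsensical result noted in part (d): the formulas run without complaint, so the check on the experimental range has to be made by the analyst.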
EXERCISES
6.1 Given these five pairs of (x, y) values:

x    …    …    3    4    5
y   .9   2.1  2.4  3.3  3.8

(a) Calculate the least squares estimates $\hat\beta_0$ and $\hat\beta_1$. Also estimate the error variance $\sigma^2$.
(b) Test $H_0\colon \beta_1$ …
(c) Estimate the expected y value corresponding to x = 3.5, and give a 95% confidence interval.
(d) Construct a 90% confidence interval for the intercept $\beta_0$.
6.2 For a random sample of seven homes that are recently sold in a city suburb, the assessed values (x) and the selling prices (y) are:

x (thousand dollars)   83.5   90.0   70.5   100.8   110.2   94.6   120.0
y (thousand dollars)   88.0   91.2   76.2   107.0   111.0   99.0   118.0

(a) Plot the scatter diagram.
(b) Determine the equation of the least squares regression line, and draw this line on the scatter diagram.
(c) Construct a 95% confidence interval for the slope of the regression line.
6.3 Refer to the data in Exercise 6.2.
(a) Estimate the expected selling price of homes that were assessed at $90,000, and construct a 95% confidence interval.
(b) For a single home that was assessed at $90,000, give a 95% confidence interval for the selling price.
6.4 In an experiment designed to determine the relationship between the dose of a compost fertilizer x and the yield of a crop y, the following summary statistics are recorded:

n = 15,  x̄ = 10.8,  ȳ = 122.7,  S_xx = 70.6,  S_yy = …,  S_xy = …

Assume a linear relationship.
(a) Find the equation of the least squares regression line.
(b) Compute the error sum of squares and estimate $\sigma^2$.
(c) Do the data establish the experimenter's conjecture that, over the range of x values covered in the study, the average increase in yield per unit increase in the compost dose is more than .6?
6.5 Refer to Exercise 6.4.
(a) Construct a 95% confidence interval for the expected yield corresponding to x = 12.
(b) Construct a 95% confidence interval for the expected yield corresponding to x = 15.
6.6 To study the effect of water hardness on taste, the following data were obtained from specimens of drinking water from eight communities (see also Exercise 5.18 in Chapter 3):
x = amount of magnesium (milligrams per liter)
y = taste rating
(a) Considering a linear regression model, provide a 90% confidence interval for the slope of the regression line.
(b) Estimate the expected taste rating when x … confidence interval.
(c) For a single specimen that has x = 10, predict the taste rating and give a 95% confidence interval.
6.7 The atmospheric concentration of trace gas F-12 was measured in parts per trillion at the South Pole. The results were, with 1976 coded as time 0,

Time   0   1   2   3   4

Adapted from "Atmospheric Trace Gasses in Antarctica," Rasmussen, R. A., Science, Vol. 211, pp. 285-287, Fig. 1, 16 January 1981. Copyright © 1981 by AAAS.
(a) Obtain the least squares line and sketch the line on the scatter plot.
(b) Test $H_0\colon \beta_1 = 0$ versus $H_1\colon \beta_1$ …
(c) Predict the concentration for 1995. What added danger is there in this prediction?
7. THE STRENGTH OF A LINEAR RELATION
To arrive at a measure of adequacy of the straight-line model, we examine how much of the variation in the response variable is explained by the fitted regression line. To this end, we view an observed $y_i$ as consisting of these two components:

$$\underbrace{y_i}_{\text{Observed } y \text{ value}} = \underbrace{(\hat\beta_0 + \hat\beta_1 x_i)}_{\text{Explained by linear relation}} + \underbrace{(y_i - \hat\beta_0 - \hat\beta_1 x_i)}_{\text{Residual or deviation from linear relation}}$$

In an ideal situation where all the points lie exactly on the line, the residuals are all zero, and the y values are completely accounted for or explained by the linear dependence on x. We can consider the sum of squares of the residuals

$$\text{SSE} = \sum (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2 = S_{yy} - \frac{S_{xy}^2}{S_{xx}}$$

to be an overall measure of the discrepancy or departure from linearity.
The total variability of the y values is reflected in the sum of squares

$$S_{yy} = \sum (y_i - \bar{y})^2$$

of which SSE forms a part. The difference

$$S_{yy} - \text{SSE} = S_{yy} - \left(S_{yy} - \frac{S_{xy}^2}{S_{xx}}\right) = \frac{S_{xy}^2}{S_{xx}}$$

forms the other part. Motivated by the decomposition of the observation $y_i$ just given, we can now consider a decomposition of the variability of the y values:

$$\underbrace{S_{yy}}_{\text{Total variability of } y} = \underbrace{\frac{S_{xy}^2}{S_{xx}}}_{\text{Variability explained by linear relation}} + \underbrace{\text{SSE}}_{\text{Residual or unexplained variability}}$$
The first term on the right-hand side of this equality is called the sum of squares (SS) due to linear regression. Likewise, the total variability $S_{yy}$ is also called the total SS of y. In order for the straight-line model to be considered as providing a good fit to the data, the SS due to the linear regression should comprise a major portion of $S_{yy}$. In an ideal situation in which all points lie on the line, SSE is zero, so $S_{yy}$ is completely explained by the fact that the x values vary in the experiment. That is, the linear relationship between y and x is solely responsible for the variability in the y values.

As an index of how well the straight-line model fits, it is then reasonable to consider the proportion of the y-variability explained by the linear relation:

$$\frac{\text{SS due to linear regression}}{\text{Total SS of } y} = \frac{S_{xy}^2/S_{xx}}{S_{yy}} = \frac{S_{xy}^2}{S_{xx}S_{yy}}$$

From Section 4 of Chapter 3 recall that the quantity

$$r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$$

is named the sample correlation coefficient. Thus, the square of the sample correlation coefficient represents the proportion of the y-variability explained by the linear relation.

The strength of a linear relation is measured by

$$r^2 = \frac{S_{xy}^2}{S_{xx}S_{yy}}$$

which is the square of the sample correlation coefficient r.
EXAMPLE 10 Let us consider the drug trial data in Table 1. From the calculations provided in Table 3,

$$S_{xx} = 40.9,\quad S_{yy} = 370.9,\quad S_{xy} = 112.1$$

Fitted regression line: $\hat{y} = -1.07 + 2.74x$

How much of the variability in y is explained by the linear regression model? To answer this question, we calculate

$$r^2 = \frac{S_{xy}^2}{S_{xx}S_{yy}} = \frac{(112.1)^2}{40.9 \times 370.9} = .83$$

This means that 83% of the variability in y is explained by linear regression, and the linear model seems satisfactory in this respect.

When the value of $r^2$ is small, we can only conclude that a straight-line relation does not give a good fit to the data. Such a case may arise due to the following reasons:
(a) There is little relation between the variables in the sense that the scatter diagram fails to exhibit any pattern, as illustrated in Figure 7a. In this case, the use of a different regression model is not likely to reduce the SSE or to explain a substantial part of $S_{yy}$.
(b) There is a prominent relation but it is nonlinear in nature; that is, the scatter is banded around a curve rather than a line. The part of $S_{yy}$ that is explained by straight-line regression is small because the model is inappropriate. Some other relationship may improve the fit substantially. Figure 7b illustrates such a case, where the SSE can be reduced by fitting a suitable curve to the data.
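As a quick check of Example 10, the decomposition of $S_{yy}$ and the value of $r^2$ can be computed directly from the three sums of squares. This is a small sketch with illustrative variable names, not something taken from the original text.

```python
import math

# Sums of squares for the drug trial data, as quoted in Example 10.
s_xx, s_yy, s_xy = 40.9, 370.9, 112.1

ss_regression = s_xy ** 2 / s_xx          # variability explained by the line
sse = s_yy - ss_regression                # residual (unexplained) variability
r_squared = ss_regression / s_yy          # proportion of y-variability explained
r = s_xy / math.sqrt(s_xx * s_yy)         # sample correlation coefficient

print(f"SS due to regression = {ss_regression:.2f}")
print(f"SSE                  = {sse:.2f}")
print(f"r^2 = {r_squared:.2f}, r = {r:.2f}")
```

The sign of r follows the sign of $S_{xy}$, while $r^2$ carries no sign information, so a strong decreasing relation and a strong increasing relation give the same proportion explained.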
[Figure 7: Scatter diagram patterns: (a) No relation, (b) A nonlinear relation.]
EXERCISES
7.1 Given $S_{xx}$ … 9.3, determine the proportion of variation in y that is explained by linear regression.
7.2 A calculation shows that $S_{xx} = 92$, $S_{yy} = 457$, and $S_{xy} = 160$. Determine the proportion of variation in y that is explained by linear regression.
7.3 Referring to Exercise 6.2, determine the proportion of variation in the y values that is explained by the linear regression of y on x. What is the sample correlation coefficient between the x and y values?
7.4 Refer to the data of Exercise 6.6.
(a) What proportion of y-variability is explained by the linear regression on x?
(b) Find the sample correlation coefficient.
7.5 (a) Show that the sample correlation coefficient r and the slope $\hat\beta_1$ of the fitted regression line are related as

$$r = \frac{\hat\beta_1\sqrt{S_{xx}}}{\sqrt{S_{yy}}}$$

(b) Show that $\text{SSE} = (1 - r^2)S_{yy}$.
7.6 Show that the SS due to regression, $S_{xy}^2/S_{xx}$, can also be expressed as $\hat\beta_1^2 S_{xx}$.
8. REMARKS ABOUT THE STRAIGHT-LINE MODEL
A regression study is not conducted by performing a few routine hypothesis tests and constructing confidence intervals for parameters on the basis of the formulas given in Section 5. Such conclusions can be seriously misleading if the assumptions made in the model formulation are grossly incompatible with the data. It is therefore essential to check the data carefully for indications of any violation of the assumptions. To review, the assumptions involved in the formulation of our straight-line model are briefly stated again:
(a) The underlying relation is linear.
(b) Independence of errors.
(c) Constant variance.
(d) Normal distribution.

Of course, when the general nature of the relationship between y and x forms a curve rather than a straight line, the prediction obtained from fitting a straight-line model to the data may produce nonsensical results. Often a suitable transformation of the data reduces a nonlinear relation to one that is approximately linear in form. A few simple transformations are discussed in Chapter 13.

Violating the assumption of independence is perhaps the most serious matter, because this can drastically distort the conclusions drawn from the t-tests and the confidence statements associated with interval estimation.

The implications of assumptions (c) and (d) were illustrated earlier in Figure 3. If the scatter diagram shows different amounts of variability in the y values for different levels of x, then the assumption of constant variance may have been violated. Here again, an appropriate transformation of the data often helps to stabilize the variance. Lastly, using the t distribution in hypothesis testing and confidence interval estimation is valid as long as the errors are approximately normally distributed. A moderate departure from normality does not impair the conclusions, especially when the data set is large. In other words, a violation of assumption (d) alone is not as serious as a violation of any of the other assumptions. Methods of checking the residuals to detect any serious violation of the model assumptions are discussed in Chapter 13.
KEY IDEAS
In its simplest form, regression analysis deals with studying the manner in which a response variable (y) depends on a predictor variable (x).

The first important step in studying the relation between the variables y and x is to plot the scatter diagram of the data $(x_i, y_i)$, $i = 1, \ldots, n$. If this plot indicates an approximate linear relation, a straight-line regression model is formulated:

Response = (A straight line in x) + (Random error)

$$Y_i = \beta_0 + \beta_1 x_i + e_i$$

The random errors are assumed to be independent, normally distributed, and have means 0 and equal standard deviations $\sigma$.

The regression parameters $\beta_0$ and $\beta_1$ are estimated by the method of least squares, which minimizes the sum of squared deviations $\sum (y_i - \beta_0 - \beta_1 x_i)^2$. The least squares estimates $\hat\beta_0$ and $\hat\beta_1$ determine the best-fitting regression line $\hat{y} = \hat\beta_0 + \hat\beta_1 x$, which serves to predict y from x.

The differences $(y_i - \hat{y}_i)$ = (observed response - predicted response) are called the residuals.

The adequacy of a straight line fit is measured by $r^2$, which represents the proportion of y-variability that is explained by the linear relation between y and x. A low value of $r^2$ only indicates that a linear relation is not appropriate; there may still be a relation on a curve.
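KEY FORMULAS

Fitted regression line: $\hat{y} = \hat\beta_0 + \hat\beta_1 x$

Residuals: $\hat{e}_i = y_i - \hat{y}_i$

Least squares estimators: $\hat\beta_1 = \dfrac{S_{xy}}{S_{xx}}$, $\quad \hat\beta_0 = \bar{y} - \hat\beta_1\bar{x}$

Residual sum of squares: $\text{SSE} = \sum \hat{e}_i^2 = S_{yy} - \dfrac{S_{xy}^2}{S_{xx}}$

Estimate of $\sigma^2$: $s^2 = \dfrac{\text{SSE}}{n - 2}$

Inferences

(a) Inferences concerning the slope $\beta_1$ are based on the estimator $\hat\beta_1$, its

$$\text{estimated S.E.} = \frac{s}{\sqrt{S_{xx}}}$$

and the sampling distribution

$$t = \frac{\hat\beta_1 - \beta_1}{s/\sqrt{S_{xx}}}, \quad \text{d.f.} = n - 2$$

A $100(1 - \alpha)\%$ confidence interval for $\beta_1$ is

$$\hat\beta_1 \pm t_{\alpha/2}\frac{s}{\sqrt{S_{xx}}}$$

To test $H_0\colon \beta_1 = \beta_{10}$, the test statistic is

$$t = \frac{\hat\beta_1 - \beta_{10}}{s/\sqrt{S_{xx}}}$$

(b) Inferences concerning the intercept $\beta_0$ are based on the estimator $\hat\beta_0$, its

$$\text{estimated S.E.} = s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$$

and the sampling distribution

$$t = \frac{\hat\beta_0 - \beta_0}{s\sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}}, \quad \text{d.f.} = n - 2$$

A $100(1 - \alpha)\%$ confidence interval for $\beta_0$ is

$$\hat\beta_0 \pm t_{\alpha/2}\, s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$$

(c) At a specified $x = x^*$, the expected response is $\beta_0 + \beta_1 x^*$. Inferences about the expected response are based on the estimator $\hat\beta_0 + \hat\beta_1 x^*$, with

$$\text{estimated S.E.} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

A $100(1 - \alpha)\%$ confidence interval for the expected response at $x^*$ is given by

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2}\, s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

(d) A single response at a specified $x = x^*$ is predicted by $\hat\beta_0 + \hat\beta_1 x^*$ with

$$\text{estimated S.E.} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

A $100(1 - \alpha)\%$ confidence interval for predicting a single response is

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2}\, s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

Decomposition of Variability

Variability explained by the linear relation = $\dfrac{S_{xy}^2}{S_{xx}}$

Residual or unexplained variability = SSE

Total y-variability = $S_{yy}$

Proportion of y-variability explained by linear regression:

$$r^2 = \frac{S_{xy}^2}{S_{xx}S_{yy}}$$

Sample correlation coefficient:

$$r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$$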
9. EXERCISES
9.1 Given the nine pairs of (x, y) values:
3
v
34
r0 15 t2 19 24 2r
(a) Plot the scatter diagram.
(b) Calculate $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{yy}$, and $S_{xy}$.
(c) Determine the equation of the least squares fitted line and draw the line on the scatter diagram.
(d) Find the predicted y corresponding to x = 3.
9.2 Refer to Exercise 9.1.
(a) Find the residuals.
(b) Calculate SSE by (i) summing the squares of the residuals and also (ii) using the formula $\text{SSE} = S_{yy} - S_{xy}^2/S_{xx}$.
(c) Estimate the error variance.
9.3 Refer to Exercise 9.1.
(a) Construct a 95% confidence interval for the slope of the regression line.
(b) Obtain a 90% confidence interval for the expected y value corresponding to x = 4.
9.4 An experiment is conducted to determine how the strength (y) of plastic fiber depends on the size (x) of the droplets of a mixing polymer in suspension. Data of (x, y) values, obtained from 15 runs of the experiment, have yielded the following summary statistics:

x̄ = …,  S_xx = 5.6,  …

(a) Obtain the equation of the least squares regression line.
(b) Test the null hypothesis $H_0\colon \beta_1 = -2$ against the alternative $H_1\colon \beta_1 < -2$, with $\alpha = .05$.
(c) Estimate the expected fiber strength for droplet size x = 10, and set a 95% confidence interval.
9.5 Refer to Exercise 9.4.
(a) Obtain the decomposition of the total y-variability into the two parts: one explained by linear relation and one not explained.
(b) What proportion of the y-variability is explained by the straight-line regression?
(c) What is the sample correlation coefficient between x and y?
9.6 A forest scientist investigates the relationship between the terminal velocity of maple samara and a measure of its size and weight, known as the disk loading. (A samara is the winged fruit that falls to the ground with a helicopter-like motion. The disk loading is a quantity related to helicopter aerodynamics.) Thirteen samaras are randomly selected from a maple tree, and the disk loading (x) and the terminal velocity (y) of each samara is measured in the laboratory. (Courtesy of Rick Nordheim.)
Disk loading x        .257   .295   .284   .272   .277   .245   .255   .284
Terminal velocity y   1.35   1.50   1.33   1.39   1.28   1.06   1.32   1.23
(a) Plot the scatter diagram and find the least squares fit of a straight line.
(b) Do these data substantiate the claim that the terminal velocity is affected by disk loading? (Test with $\alpha = .05$.)
9.7 Refer to Exercise 9.6.
(a) Calculate the sample correlation coefficient.
(b) What proportion of the y-variability is explained by the fitted regression line?
9.8 A morning newspaper lists the following used-car prices for a foreign compact, with age x measured in years and selling price y measured in thousands of dollars:
(a) Plot the scatter diagram.
(b) Determine the equation of the least squares regression line and draw this line on the scatter diagram.
(c) Construct a 95% confidence interval for the slope of the regression line.
9.9 Refer to Exercise 9.8.
(a) From the fitted regression line, determine the predicted value for the average selling price of a 5-year-old compact and construct a 95% confidence interval.
(b) Determine the predicted value for a 5-year-old compact to be listed in next week's paper. Construct a 90% confidence interval.
(c) Is it justifiable to predict the selling price of a 20-year-old compact from the fitted regression line? Give reasons for your answer.
9.10 Again referring to Exercise 9.8, find the sample correlation coefficient between age and selling price. What proportion of the y-variability is explained by the fitted straight line? Comment on the adequacy of the straight-line fit.
9.11 Given

$$n = 20,\quad \sum x = 160,\quad \sum y = 240,\quad \sum x^2 = 1536,\quad \sum xy = 1932,\quad \sum y^2 = 2965$$
(a) Find the equation of the least squares regression line.
(b) Calculate the sample correlation coefficient between x and y.
(c) Comment on the adequacy of the straight-line fit.
9.12 Using the computer. The calculations involved in a regression analysis become increasingly tedious with larger data sets. Access to a computer proves to be of considerable advantage. Illustrated here is a computer-based analysis of linear regression using the data of Example 4 and the MINITAB package. With the MINITAB commands

SET C1
3 3 4 5 6 6 7 8 8 9
SET C2
9 5 12 9 14 16 22 18 24 22
REGRESS Y IN C2 USING 1 PREDICTOR IN C1

we obtain all the results that are basic to a linear regression analysis. The important pieces in the output are shown in Table 4.

Table 4 MINITAB Regression Analysis of the Data in Example 4

THE REGRESSION EQUATION IS
C2 = -1.07 + 2.74 C1

                          ST. DEV.     T-RATIO =
COLUMN    COEFFICIENT     OF COEF.     COEF/S.D.
            -1.071         2.751        -0.39
C1           2.7409        0.4411        6.21

THE ST. DEV. OF Y ABOUT REGRESSION LINE IS
S = 2.82
WITH (10 - 2) = 8 DEGREES OF FREEDOM

R-SQUARED = 82.8 PERCENT

ANALYSIS OF VARIANCE
DUE TO        DF       SS
REGRESSION     1     307.25
RESIDUAL       8      63.65
TOTAL          9     370.90
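For readers without access to MINITAB, the main entries of Table 4 can be recomputed from the ten (x, y) pairs listed in the SET commands. The following is a sketch in plain Python, offered as an illustration and not as a reproduction of MINITAB's own algorithm. The variable names are assumptions, and the last printed digit may differ slightly from the package output.

```python
import math

# Data of Example 4, as entered in the MINITAB SET commands.
x = [3, 3, 4, 5, 6, 6, 7, 8, 8, 9]
y = [9, 5, 12, 9, 14, 16, 22, 18, 24, 22]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Analysis of variance decomposition.
ss_regression = s_xy ** 2 / s_xx
sse = s_yy - ss_regression
s = math.sqrt(sse / (n - 2))

# Estimated standard errors and t-ratios of the coefficients.
se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)
se_b1 = s / math.sqrt(s_xx)

print(f"intercept {b0:8.3f}  st.dev. {se_b0:6.3f}  t-ratio {b0 / se_b0:6.2f}")
print(f"slope     {b1:8.4f}  st.dev. {se_b1:6.4f}  t-ratio {b1 / se_b1:6.2f}")
print(f"s = {s:.2f} with {n - 2} degrees of freedom")
print(f"R-squared = {100 * ss_regression / s_yy:.1f} percent")
print(f"SS: regression {ss_regression:.2f}, residual {sse:.2f}, total {s_yy:.2f}")
```

Each printed line corresponds to one block of Table 4, which makes it easier to see which formula from Sections 4-7 produces which entry.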
Compare Table 4 with the calculations illustrated in Sections 4-7. In particular, identify:
(a) The least squares estimates.
(b) SSE
(c) Estimated standard errors of $\hat\beta_0$ and $\hat\beta_1$.
(d) The t statistics for testing $H_0\colon \beta_0 = 0$ and $H_0\colon \beta_1 = 0$.
(e) $r^2$
(f) The decomposition of the total sum of squares into the sum of squares explained by the linear regression and the residual sum of squares.
9.13 Fit a straight line to the data in Exercise 9.8 using the computer.
9.14 Some chemists interested in a property of plutonium called solubility, which depends on temperature, reported the following measurements [J. C. Mailen et al., Journal of Chemical and Engineering Data (1971), Vol. 16, No. 1, 59] for a plutonium powder (PuF3) in a molten mixture (2LiF-BeF2), where y is $-\log_{10}$(solubility) and x is 1000/(temperature °C).
(a) Find the least squares fit of a straight line.
(b) Establish a 95% confidence interval for $\beta_1$. Does solubility depend on temperature over the range of the experiment?
(c) Predict the mean value of $-\log_{10}$(solubility) at 714°C or x = 1.40. Construct a 95% confidence interval.
(d) Construct a 95% confidence interval for a new measurement to be made at x = 1.8.
9.15 Many college students obtain college degree credits by demonstrating their proficiency on exams developed as part of the College Level Examination Program (CLEP). Based on their scores on the College Qualification Tests (CQT), it would be helpful if students could predict their scores on a corresponding portion of the CLEP exam. The following data (courtesy of R. W. Johnson) are for x = Total CQT score and y = mathematical CLEP:
(a) Find the least squares fit of a straight line.
(b) Construct a 95% confidence interval for the slope.
(c) Construct a 95% confidence interval for the CLEP score of a student who obtains a CQT score of 150.
(d) Repeat (c) with x = 175 and x = 195.
9.16 An engineer, interested in studying the thermal properties of ductile cast iron, measures its cooling rate y in °F per hour during a heat-treatment stage. The engineer also records the data of
$x_1$ = The number of graphite spheroids present in a square millimeter area of the cast iron.
$x_2$ = The percentage of pearlite present in the cast iron.
(Courtesy of A. Tosh.)
As a first step in the analysis, disregard the variable $x_2$ and study the suitability of the simple linear regression model

$$Y = \beta_0 + \beta_1 x_1 + e$$

In particular:
(a) Plot the scatter diagram of y vs. $x_1$.
(b) Determine the equation of the fitted straight line.
(c) Test to determine if the data provide strong evidence that the cooling rate increases as the count of graphite spheroids increases.
(d) Construct a 95% confidence interval for the average cooling rate corresponding to $x_1 = 300$.
(e) Give the decomposition of the total y-variability, and comment on the adequacy of the linear regression.
9.17 Crickets make a chirping sound with their wing covers. Scientists have recognized that there is a relationship between the frequency of chirps and the temperature. (There is some truth to the cartoon.) Use the 15 measurements for the striped ground cricket to:
(a) Fit a least squares line.
(b) Obtain a 95% confidence interval for the slope.
(c) Predict the temperature when x = 15 chirps per second.

Chirps (per second) (x)    Temperature (°F) (y)
20.0                       88.6
16.0                       71.5
19.8                       93.3
18.4                       84.3
17.1                       80.6
15.5                       75.2
14.7                       69.7
17.1                       82.0
15.4                       69.4
16.3                       83.3
15.0                       79.6
17.2                       82.6
16.0                       80.6
17.0                       83.5
14.4                       76.3

Source: C. Pierce (1949), The Songs of Insects, Harvard University Press, pp. 12-21.
CHAPTER 13
Regression Analysis II: Multiple Linear Regression and Other Topics

1. INTRODUCTION
2. NONLINEAR RELATIONS AND LINEARIZING TRANSFORMATIONS
3. MULTIPLE LINEAR REGRESSION
4. RESIDUAL PLOTS TO CHECK THE ADEQUACY OF A STATISTICAL MODEL
Micronutrients and Kelp Cultures: Evidence for Cobalt and Manganese Deficiency in Southern California Deep Seawater

Abstract. It has been suggested that naturally occurring copper and zinc concentrations in deep seawater are toxic to marine organisms when the free ion forms are overabundant. The effects of micronutrients on the growth of gametophytes of the ecologically and commercially significant giant kelp (Macrocystis pyrifera) were studied in defined media. The results indicate that toxic copper and zinc ion concentrations as well as cobalt and manganese deficiencies may be among the factors controlling the growth of marine organisms in nature.

A least squares fit of gametophytic growth data in the defined medium generated the expression
Y Txrnx"u L8x"o2
l\xrnz 6xauxzn2
27x-n2
I2x.u2
(1)
6x.uxMn2
where Y is mean gametophytic length in micrometers. The fit of the experimental data to Eq. 1 was considered excellent.

Source: Kuwabara, J. S. Micronutrients and Kelp Cultures: Evidence for Cobalt and Manganese Deficiency in Southern California Deep Sea Waters, Science, Vol. 216, pp. 1219-1221, 11 June, 1982. Copyright © 1982 by AAAS.
1. INTRODUCTION
The basic ideas of regression analysis have a much broader scope of application than the straight-line model of Chapter 12. In this chapter, our goal is to extend the ideas of regression analysis in two important directions:
(a) To handle nonlinear relations by means of appropriate transformations applied to one or both variables.
(b) To accommodate several predictor variables into a regression model.

These extensions enable the reader to appreciate the breadth of regression techniques that are applicable to real life problems. We then discuss some graphical procedures that are helpful in detecting any serious violation of the assumptions that underlie a regression analysis.
2. NONLINEAR RELATIONS AND LINEARIZING TRANSFORMATIONS
When studying the relation between two variables y and x, a scatter plot of the data often indicates that a relationship, although present, is far from linear. This can be established on a statistical basis by checking that the value of $r^2$ is small, so a straight-line fit is not adequate. Statistical procedures for handling nonlinear relationships are more complicated than those for handling linear relationships, with the exception of a specific type of model called the polynomial regression model, which is discussed in Section 3. In some situations, however, it may be possible to transform the variables x and/or y in such a way that the new relationship is close to being linear. A linear regression model can then be formulated in terms of the transformed variables, and the appropriate analysis can be based on the transformed data.

Transformations are often motivated by the pattern of data. Sometimes when the scatter diagram exhibits a relationship on a curve in which the y values increase too fast in comparison with the x values, a plot of $\sqrt{y}$ or some other fractional power of y can help to linearize the relation. This situation is illustrated in Example 1.
EXAMPLE 1 To determine the maximum stopping ability of cars when their brakes are fully applied, 10 cars are to be driven each at a specified speed and the distance each requires to come to a complete stop is to be measured. The various initial speeds selected for each of the 10 cars and the stopping distances recorded are given in Table 1.

Table 1 Data on Speed and Stopping Distance

Initial Speed x (mph)        20     20     30     30     30     40     40     50      50      60
Stopping Distance y (ft)   16.3   26.7   39.2   63.5   51.3   98.4   65.7   104.1   155.6   217.2
The scatter diagram for the data appears in Figure 1. The relation deviates from a straight line most markedly in that y increases at a much faster rate at large x than at small x. This suggests that we can try to linearize the relation by plotting $\sqrt{y}$ or some other fractional power of y with x. We try the transformed data, $y' = \sqrt{y}$, given in Table 2. The scatter diagram for these data, which exhibits an approximate linear relation, appears in Figure 2.

Table 2 Data on Speed and Square Root of Stopping Distance

x                 20      20      30      30      30      40      40      50       50       60
y' = √y         4.037   5.167   6.261   7.969   7.162   9.920   8.106   10.203   12.474   14.738

With the aid of a standard computer program for regression analysis (see Exercise 2.4), the following results are obtained:

$$\bar{x} = 37,\quad \bar{y}' = 8.604,\quad S_{xx} = 1610,\quad S_{y'y'} = 97.773,\quad S_{xy'} = 381.621$$

$$\hat\beta_0 = -.166,\quad \hat\beta_1 = .237$$

[Figure 1: Scatter diagram of the data given in Table 1.]
[Figure 2: Scatter diagram of the transformed data given in Table 2.]
Thus the equation of the fitted line is

$$\hat{y}' = -.166 + .237x$$

The proportion of the y' variation that is explained by the straight-line model is

$$r^2 = \frac{(381.621)^2}{(1610)(97.773)} = .92$$
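The effect of the square root transformation in Example 1 can be checked by fitting a straight line both to the original stopping distances and to their square roots. The sketch below uses the data of Table 1; the helper `fit_line` is a hypothetical name introduced only for this illustration, and the comparison with the untransformed fit is an addition not made in the text.

```python
import math

# Data of Table 1: initial speed (mph) and stopping distance (ft).
speed = [20, 20, 30, 30, 30, 40, 40, 50, 50, 60]
distance = [16.3, 26.7, 39.2, 63.5, 51.3, 98.4, 65.7, 104.1, 155.6, 217.2]


def fit_line(x, y):
    """Return (intercept, slope, r_squared) of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope, s_xy ** 2 / (s_xx * s_yy)


# Straight line fitted to the original response.
b0_raw, b1_raw, r2_raw = fit_line(speed, distance)

# Straight line fitted to the transformed response y' = sqrt(y).
sqrt_distance = [math.sqrt(d) for d in distance]
b0_t, b1_t, r2_t = fit_line(speed, sqrt_distance)

print(f"original y:    r^2 = {r2_raw:.3f}")
print(f"transformed y: fitted line y' = {b0_t:.3f} + {b1_t:.3f} x, r^2 = {r2_t:.3f}")
```

A prediction on the original scale is obtained by squaring the fitted value of y', since the model was fitted to the square root of the stopping distance.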
A few common nonlinear models and their corresponding linearizing transformations are given in Table 3.
Table 3 Some Nonlinear Models and Their Linearizing Transformations

Nonlinear Model             Transformation                      Transformed Model $y' = \beta_0 + \beta_1 x'$
(a) $y = ae^{bx}$           $y' = \log_e y$,  $x' = x$          $\beta_0 = \log_e a$,  $\beta_1 = b$
(b) $y = ax^b$              $y' = \log y$,  $x' = \log x$       $\beta_0 = \log a$,  $\beta_1 = b$
(c) $y = \dfrac{1}{a + bx}$   $y' = \dfrac{1}{y}$,  $x' = x$      $\beta_0 = a$,  $\beta_1 = b$
(d) $y = a + b\sqrt{x}$     $y' = y$,  $x' = \sqrt{x}$          $\beta_0 = a$,  $\beta_1 = b$

In some situations a specific nonlinear relation is strongly suggested either by the data or by a theoretical consideration. Even when initial information about the form is lacking, a study of the scatter diagram often indicates the appropriate linearizing transformation.

Once the data are entered on a computer, it is easy to obtain the transformed data $1/y$, $\log_e y$, $y^{1/2}$, and $y^{1/4}$. Note $y^{1/4}$ is obtained by taking the square root of $y^{1/2}$. A scatter plot of $\sqrt{y}$ vs. $\log_e x$ or any number of others can then be constructed and examined for a linear relation. Under relation (a) in Table 3, the graph of $\log_e y$ vs. x would be linear.

We must remember that all inferences about the transformed model are based on the assumptions of a linear relation and independent normal errors with constant variance. Before we can trust these inferences, this transformed model must be scrutinized to determine whether any serious violation of these assumptions may have occurred (see Section 4).

EXERCISES
2.1 Given the pairs of (x, y) values
(a) Plot the scatter diagram.
(b) Obtain the best fitting straight line and draw it on the scatter diagram.
(c) What proportion of the y-variability is explained by the fitted line?
2.2 Refer to the data of Exercise 2.1.
(a) Consider the reciprocal transformation $y' = 1/y$, and plot the scatter diagram of y' vs. x.
(b) Fit a straight-line regression to the transformed data.
(c) Calculate $r^2$ and comment on the adequacy of the fit.
2.3 Find a linearizing transformation in each case:
I (a) y (a + bx),
(b)-Lv
a r
',
b
(1 +x)
2.4 Using the computer. Refer to the data of speed (x) and stopping distance (y) given in Table 1. The MINITAB commands for fitting a straight-line regression to $y' = \sqrt{y}$ and x are:

SET C1
20 20 30 30 30 40 40 50 50 60
SET C2
16.3 26.7 39.2 63.5 51.3 98.4 65.7 104.1 155.6 217.2
SQRT C2 PUT IN C3
REGRESS Y IN C3 ON 1 PREDICTOR IN C1

(a) Obtain the computer output and identify the equation of the fitted line and the value of $r^2$ (cf. Example 1).
(b) Give a 95% confidence interval for the slope.
(c) Obtain a 95% confidence interval for the expected y' value at x = 45.
2.5 A forester seeking information on basic tree dimensions obtains the following measurements of the diameters 4.5 feet above the ground and the heights of 12 sugar maple trees (courtesy of A. Ek). The forester wishes to determine if the diameter measurements can be used to predict tree height.

Diameter x (inches)    .9   1.2   2.9   3.1   3.3    3.9    4.3    6.2    9.6    12.6   16.1   25.8
Height y (feet)        18   26    32    36    44.5   35.6   40.5   57.5   67.3   84     67     87.5

(a) Plot the scatter diagram and determine if a straight-line relation is appropriate.
(b) Determine an appropriate linearizing transformation. In particular, try $x' = \log x$, $y' = \log y$.
(c) Fit a straight-line regression to the transformed data.
(d) What proportion of variability is explained by the fitted model?
3. MULTIPLE LINEAR REGRESSION
A response variable y may depend on a predictor variable x but, after a straight-line fit, it may turn out that the unexplained variation is large, so $r^2$ is small and a poor fit is indicated. At the same time, an attempt to transform one or both of the variables may fail to dramatically improve the value of $r^2$. This difficulty may well be due to the fact that the response depends not just on x but on other factors as well. When used alone, x fails to be a good predictor of y because of the effects of those other influencing variables. For instance, the yield of a crop depends not only upon the amount of fertilizer but also on the rainfall and average temperature during the growing season. Cool weather and no rain could completely cancel the choice of a correct fertilizer.
To obtain a useful prediction model, one should record the observations of all variables that may significantly affect the response. These other variables may then be incorporated explicitly into the regression analysis. The name multiple regression refers to a model of relationship where the response depends on two or more predictor variables. Here we discuss the main ideas of a multiple regression analysis in the setting of two predictor variables.
Suppose that the response variable y in an experiment is expected to be influenced by two input variables $x_1$ and $x_2$, and that the data relevant to these input variables are recorded with the measurements of y. With n runs of an experiment, we would have a data set of the form shown in Table 4.
Table 4 Data Structure for Multiple Regression with Two Input Variables
Experimental Run    Input Variables           Response
                    $x_1$       $x_2$         $y$
1                   $x_{11}$    $x_{12}$      $y_1$
2                   $x_{21}$    $x_{22}$      $y_2$
.                   .           .             .
i                   $x_{i1}$    $x_{i2}$      $y_i$
.                   .           .             .
n                   $x_{n1}$    $x_{n2}$      $y_n$
By analogy with the simple linear regression model, we can then tentatively formulate:
A Multiple Regression Model
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + e_i, \quad i = 1, \ldots, n$$
where $x_{i1}$ and $x_{i2}$ are the values of the input variables for the $i$th experimental run and $Y_i$ is the corresponding response. The error components $e_i$ are assumed to be independent normal variables with mean $= 0$ and variance $= \sigma^2$. The regression parameters $\beta_0$, $\beta_1$, and $\beta_2$ are unknown and so is $\sigma^2$.
This model suggests that aside from the random error, the response varies linearly with each of the independent variables when the other remains fixed. The principle of least squares is again useful in estimating the regression parameters. For this model, we are required to vary $b_0$, $b_1$, and $b_2$ simultaneously to minimize the sum of squared deviations
$$\sum_{i=1}^{n} (y_i - b_0 - b_1 x_{i1} - b_2 x_{i2})^2$$
The least squares estimates $\hat\beta_0$, $\hat\beta_1$, and $\hat\beta_2$ are the solutions to the following equations, which are extensions of the corresponding equations for fitting the straight-line model (cf. Section 4 of Chapter 12).
$$\hat\beta_1 S_{11} + \hat\beta_2 S_{12} = S_{1y}$$
$$\hat\beta_1 S_{12} + \hat\beta_2 S_{22} = S_{2y}$$
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}_1 - \hat\beta_2 \bar{x}_2$$
where $S_{11}$, $S_{12}$, and so on are the sums of squares and cross products of the variables in the suffix. They are computed just as in a straight-line regression model.
Methods are available for interval estimation, hypotheses testing, and examining the adequacy of fit. In principle, these methods are similar to those used in the simple regression model, but the algebraic formulas are more complex and hand computations become more tedious. However, a multiple regression analysis is easily performed on a computer with the aid of the standard packages such as MINITAB, BMDP, or SAS. We illustrate the various aspects of a multiple regression analysis with the data of Example 2 and computer-based calculations.
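For readers without access to one of these packages, the three least squares equations can be solved directly. The sketch below is illustrative only: it uses the weight, age, and blood pressure values that Example 2 lists in Table 5, and the helper names `mean` and `cross` are our own choices, not part of any package.

```python
# Data of Example 2 (Table 5): weight x1, age x2, blood pressure y
x1 = [152, 183, 171, 165, 158, 161, 149, 158, 170, 153, 164, 190, 185]
x2 = [50, 20, 20, 30, 30, 50, 60, 50, 40, 55, 40, 40, 20]
y = [120, 141, 124, 126, 117, 129, 123, 125, 132, 123, 132, 155, 147]

def mean(values):
    return sum(values) / len(values)

def cross(u, v):
    """Sum of cross products of deviations from the means (S_uv)."""
    ubar, vbar = mean(u), mean(v)
    return sum((a - ubar) * (b - vbar) for a, b in zip(u, v))

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)

# Solve the two equations in b1 and b2 by Cramer's rule,
# then recover b0 from the third equation.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)

print(round(b0, 2), round(b1, 4), round(b2, 4))
```

The estimates agree with the MINITAB coefficients reported for Example 2 to the accuracy printed there. Cramer's rule is convenient for two predictors; with more predictors a general linear equation solver is the more practical choice.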
EXAMPLE 2
We are interested in studying the systolic blood pressure y in relation to weight $x_1$ and age $x_2$ in a class of males of approximately the same height. From 13 subjects preselected according to weight and age, the data set listed in Table 5 was obtained.
Table 5 The Data of $x_1$ = Weight in Pounds, $x_2$ = Age, and $y$ = Blood Pressure of 13 Males
$x_1$   $x_2$   $y$
152     50      120
183     20      141
171     20      124
165     30      126
158     30      117
161     50      129
149     60      123
158     50      125
170     40      132
153     55      123
164     40      132
190     40      155
185     20      147
To use MINITAB, we first enter the data of $x_1$, $x_2$, and $y$ in three different columns and then use the regression command:
READ INTO C1-C3
152 50 120
. . .
185 20 147
REGRESS Y IN C3 ON 2 PREDICTORS IN C1 C2
With the last command, the computer executes a multiple regression analysis. We focus our attention on the principal aspects of the output as shown in Table 6.
Table 6 Regression Analysis of the Data in Table 5: Selected MINITAB Output
THE REGRESSION EQUATION IS
Y = -65.1 + 1.08 X1 + 0.425 X2

                               ST. DEV.     T-RATIO =
      COLUMN    COEFFICIENT    OF COEF.     COEF/S.D.
                -65.09         14.94        -4.35
X1    C1        1.07710        0.07707      13.97
X2    C2        0.42541        0.07315      5.81

THE ST. DEV. OF Y ABOUT REGRESSION LINE IS
S = 2.508
WITH (13-3) = 10 DEGREES OF FREEDOM

R-SQUARED = 95.8 PERCENT

ANALYSIS OF VARIANCE
DUE TO        DF    SS          MS=SS/DF
REGRESSION    2     1423.837    711.918
RESIDUAL      10    62.931      6.293
TOTAL         12    1486.769
We now proceed to interpret the results in Table 6 and use them to make further statistical inferences.
(i) The equation of the fitted linear regression is
$$\hat{y} = -65.1 + 1.08x_1 + .425x_2$$
This means that the mean blood pressure increases by 1.08 if weight $x_1$ increases by one pound and age $x_2$ remains fixed. Similarly, a 1-year increase in age with weight held fixed will only increase the mean blood pressure by .425.
(ii) The estimated regression coefficients and the corresponding estimated standard errors are:
$\hat\beta_0 = -65.09$, estimated S.E.($\hat\beta_0$) = 14.94
$\hat\beta_1 = 1.07710$, estimated S.E.($\hat\beta_1$) = .07707
$\hat\beta_2 = .42541$, estimated S.E.($\hat\beta_2$) = .07315
Further, the error standard deviation $\sigma$ is estimated by $s = 2.508$ with
Degrees of freedom = n − (# input variables) − 1 = 13 − 2 − 1 = 10
These results are useful in interval estimation and hypotheses tests about the regression coefficients. In particular, a $100(1 - \alpha)\%$ confidence interval for a $\beta$-coefficient is given by
Estimated coefficient $\pm\ t_{\alpha/2}$ (estimated S.E.)
where $t_{\alpha/2}$ is the upper $\alpha/2$ point of the t distribution with d.f. = 10. For instance, a 95% confidence interval for $\beta_1$ is
$$1.07710 \pm 2.228 \times .07707 = 1.07710 \pm .17171 \quad \text{or} \quad (.905, 1.249)$$
To test the null hypothesis that a particular $\beta$-coefficient is zero, we employ the test statistic
$$t = \frac{\text{Estimated coefficient} - 0}{\text{Estimated S.E.}}$$
These t-ratios appear in Table 6. Suppose we wish to examine whether the mean blood pressure significantly increases with age. In the language of hypothesis testing, this problem translates to one of testing $H_0$: $\beta_2 = 0$ vs. $H_1$: $\beta_2 > 0$. The observed value of the test statistic is $t = 5.81$ with d.f. = 10. Because this exceeds the tabulated value $t_{.01} = 2.764$, the null hypothesis is rejected in favor of $H_1$, with $\alpha = .01$. In fact, it is rejected even with $\alpha = .005$.
(iii) In Table 6, the result "R-SQUARED = 95.8 PERCENT" or
$$R^2 = .958$$
tells us that 96% of the variability of y is explained by the fitted multiple regression of y on $x_1$ and $x_2$. The "analysis of variance" shows the decomposition of the total variability $\sum (y_i - \bar{y})^2 = 1486.769$ into the two components:
1486.769 (Total variability of y) = 1423.837 (Variability explained by the regression of y on $x_1$ and $x_2$) + 62.931 (Residual or unexplained variability)
Thus,
$$R^2 = \frac{1423.837}{1486.769} = .958$$
and $\sigma^2$ is estimated by $s^2 = 62.931/10 = 6.293$, so $s = 2.508$ (see (ii)).
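The arithmetic behind items (ii) and (iii) can be retraced from the summary figures alone. The following sketch assumes only the numbers quoted in the example (the sums of squares, the coefficients and standard errors, and the tabulated t values 2.228 and 2.764); the variable names are ours.

```python
import math

n, k = 13, 2                  # number of runs and of input variables
df = n - k - 1                # residual degrees of freedom

ss_reg, ss_res, ss_total = 1423.837, 62.931, 1486.769
r2 = ss_reg / ss_total        # proportion of y-variability explained
s = math.sqrt(ss_res / df)    # estimate of the error standard deviation

# 95% confidence interval for beta_1
t_025 = 2.228                 # upper .025 point of t with 10 d.f., as quoted in the text
b1, se1 = 1.07710, 0.07707
ci = (b1 - t_025 * se1, b1 + t_025 * se1)

# One-sided test of H0: beta_2 = 0 vs. H1: beta_2 > 0 at alpha = .01
b2, se2 = 0.42541, 0.07315
t_ratio = (b2 - 0) / se2
t_01 = 2.764                  # upper .01 point of t with 10 d.f., as quoted in the text
reject = t_ratio > t_01

print(df, round(r2, 3), round(s, 3))
print(tuple(round(c, 3) for c in ci), round(t_ratio, 2), reject)
```

The script only repeats the hand calculations, so it is a check on the arithmetic rather than a substitute for the regression output itself.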
Polynomial Regression
A scatter diagram may exhibit a relationship on a curve for which a suitable linearizing transformation cannot be constructed. Another method of handling such a nonlinear relation is to include terms with higher powers of x in the model $Y = \beta_0 + \beta_1 x + e$. In this instance, by including the second power of x, we obtain the model
$$Y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i, \quad i = 1, \ldots, n$$
which states that aside from the error components $e_i$, the response y is a quadratic function (or a second-degree polynomial) of the independent variable x. Such a model is called a polynomial regression model of y with x, and the highest power of x that occurs in the model is called the degree or the order of the polynomial regression. It is interesting to note that the analysis of a polynomial regression model does not require any special techniques other than those used in multiple regression analysis. By identifying $x$ and $x^2$ as the two variables $x_1$ and $x_2$, respectively, this second-degree polynomial model reduces to the form of a multiple regression model
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + e_i$$
where $x_{i1} = x_i$ and $x_{i2} = x_i^2$. In fact, both of these types of models and many more types are special cases of a general class called linear models [1].
General Linear Model
By virtue of its wide applicability, the multiple linear regression model plays a prominent role in the portfolio of a statistician. Although a complete analysis cannot be given here, the general structure of a multiple regression model merits further attention. We have already mentioned that most least squares analyses of multiple linear regression models are carried out with the aid of a computer. Programs for implementing the analysis all require the investigator to provide the values of the response $y_i$ and the $p$ input variables $x_{i1}, \ldots, x_{ip}$ for each run $i = 1, 2, \ldots, n$. In writing $1 \cdot \beta_0$, where 1 is the known value of an extra "dummy" input variable corresponding to $\beta_0$, the model is
$$\underbrace{Y_i}_{\text{Observation}} = \underbrace{1 \cdot \beta_0 + x_{i1}\beta_1 + \cdots + x_{ip}\beta_p}_{\text{Input variables}} + \underbrace{e_i}_{\text{Error}}$$
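The basic quantities can be arranged in the form of these arrays, which are denoted by boldface letters:
$$\underbrace{\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{bmatrix}}_{\text{Observation}} \qquad \underbrace{\mathbf{X} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{i1} & \cdots & x_{ip} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{bmatrix}}_{\text{Input variables}}$$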
Only the arrays $\mathbf{y}$ and $\mathbf{X}$ are required to obtain the least squares estimates of $\beta_0, \beta_1, \ldots, \beta_p$ that minimize
$$\sum_{i=1}^{n} (y_i - b_0 - x_{i1}b_1 - \cdots - x_{ip}b_p)^2$$
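The input array $\mathbf{X}$ is called the design matrix. In the same vein, setting
$$\mathbf{e} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} \quad \text{and} \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}$$
we can write the model in the suggestive form
$$\underbrace{\mathbf{y}}_{\text{Observation}} = \underbrace{\mathbf{X}}_{\text{Design matrix}} \underbrace{\boldsymbol{\beta}}_{\text{Parameter}} + \underbrace{\mathbf{e}}_{\text{Error}}$$
which forms the basis for a thorough but more advanced treatment of regression.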
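To make the role of the design matrix concrete, the sketch below builds $\mathbf{X}$ for a quadratic model and finds the least squares estimates by solving the normal equations $\mathbf{X}'\mathbf{X}\mathbf{b} = \mathbf{X}'\mathbf{y}$, the standard matrix form of the least squares conditions, which this chapter does not develop. The data are hypothetical values lying exactly on $y = 1 + 2x + 3x^2$, and the function names `design_matrix` and `least_squares` are illustrative assumptions.

```python
def design_matrix(rows):
    """Prefix each run's input values with the dummy input 1 for beta_0."""
    return [[1.0] + list(r) for r in rows]

def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[j] * row[k] for row in X) for k in range(p)]
         + [sum(row[j] * yi for row, yi in zip(X, y))]
         for j in range(p)]
    for c in range(p):
        # Partial pivoting: bring the largest entry in column c to the diagonal
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[j][p] / A[j][j] for j in range(p)]

# Hypothetical data (an assumption for illustration): y = 1 + 2x + 3x^2
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]

# Quadratic regression as multiple regression with x1 = x and x2 = x^2
X = design_matrix([(x, x * x) for x in xs])
beta = least_squares(X, ys)
print([round(b, 6) for b in beta])
```

Because the quadratic model only changes which columns enter $\mathbf{X}$, the same solver serves for polynomial regression and for regression on several distinct input variables.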
EXERCISES
3.1 The regression model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + e$ was fitted to a data set obtained from 20 runs of an experiment in which two predictors $x_1$ and $x_2$ were observed along with the response y. The least squares estimates were
$\hat\beta_0 = 4.21$, $\hat\beta_1 = 11.37$, $\hat\beta_2 = -.513$
Predict the response for:
(a) $x_1 = 8$, $x_2 = 30$
(b) $x_1 = 8$, $x_2 = 50$
(c) $x_1 = 3$, $x_2$
3.2 In Exercise 3.1, suppose the residual sum of squares (SSE) was 46.25 and the SS due to regression was 236.70.
(a) Estimate the error standard deviation $\sigma$. State the degrees of freedom.
(b) Find $R^2$ and interpret the result.
3.3 Refer again to Exercise 3.1. Suppose the estimated standard errors of $\hat\beta_0$, $\hat\beta_1$, and $\hat\beta_2$ were 2.25, 1.08, and .098, respectively.
(a) Determine 95% confidence intervals for $\beta_0$ and $\beta_2$.
(b) Test $H_0$: $\beta_1 = 10$ vs. $H_1$: $\beta_1 > 10$, with $\alpha = .05$.
3.4 Referring to Exercise 9.16 in Chapter 12,
(a) Find the least squares fit of the multiple regression model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + e$ to predict the cooling rate of cast iron, using the number of graphite spheroids $x_1$ and the pearlite content $x_2$ as predictors.
(b) What is the predicted cooling rate corresponding to $x_1 = 350$ and $x_2 = 3$?
(c) What proportion of the y-variability is explained by the fitted model?
(d) Obtain 95% confidence intervals for $\beta_0$, $\beta_1$, and $\beta_2$.
3.5 Given the pairs of (x, y) values:
(a) Fit a quadratic regression model $Y = \beta_0 + \beta_1 x + \beta_2 x^2 + e$ to these data. The MINITAB commands are
READ INTO C1-C2
3.2 130
3.8 156
. . .
g.a a)o
MULTIPLY C1 BY C1 PUT IN C3
REGRESS Y IN C2 ON 2 PREDICTORS IN C1 C3
(b) What proportion of the y-variability is explained by the quadratic regression model?
(c) Test Ho:9r
4. RESIDUAL PLOTS TO CHECK THE ADEQUACY OF A STATISTICAL MODEL
General Attitude toward a Statistical Model
A regression analysis is not completed by fitting a model by least squares, by providing confidence intervals, and by testing various hypotheses. These steps tell only half the story: the statistical inferences that can be made when the postulated model is adequate. In most studies, we cannot be sure that a particular model is correct. Therefore, we should adopt the following strategy:
(a) Tentatively entertain a model.
(b) Obtain least squares estimates and compute the residuals.
(c) Review the model by examining the residuals.
Step (c) often suggests methods of appropriately modifying the model. Returning to step (a), the modified model is then entertained, and this iteration is continued until a model is obtained for which the data do not seem to contradict the assumptions made about the model.
Once a model is fitted by least squares, all the information on variation that cannot be explained by the model is contained in the residuals
$$\hat{e}_i = y_i - \hat{y}_i, \quad i = 1, 2, \ldots, n$$
where $y_i$ is the observed value and $\hat{y}_i$ denotes the corresponding value predicted by the fitted model. For example, in the case of a simple linear regression model, $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$.
Recall from our discussion of the straight-line model in Chapter 12 that we have made the assumptions of independence, constant variance, and a normal distribution for the error components $e_i$. The inference procedures are based on these assumptions. When the model is correct, the residuals can be considered as estimates of the errors $e_i$, which are distributed as $N(0, \sigma)$.
To determine the merits of the tentatively entertained model, we can examine the residuals by plotting them on graph paper. Then if we recognize any systematic pattern formed by the plotted residuals, we would suspect that some assumptions regarding the model are invalid. There are many ways to plot the residuals, depending on what aspect is to be examined. We mention a few of these here to illustrate the techniques.
A more comprehensive discussion can be found in Chapter 3 of Draper and Smith [1].
Histogram or Dot Diagram of Residuals
To picture the overall behavior of the residuals, we can plot a histogram for a large number of observations or a dot diagram for fewer observations. For example, in a dot diagram like the one in Figure 3a, the residuals seem to behave like a sample from a normal population and there do not appear to be any "wild" observations. In contrast, Figure 3b illustrates a situation in which the distribution appears to be quite normal except for a single residual that lies far to the right of the others. The circumstances that produced the associated observation demand a close scrutiny.
Figure 3 Dot diagram of residuals, panels (a) and (b).
Plot of Residual vs. Predicted Value
A plot of the residuals $\hat{e}_i$ vs. the predicted values $\hat{y}_i$ often helps to detect the inadequacies of an assumed relation or a violation of the assumption of constant error variance. Figure 4 illustrates some typical phenomena. If the points form a horizontal band around zero, as in Figure 4a, then no abnormality is indicated. In Figure 4b, the width of the band increases noticeably with increasing values of $\hat{y}$. This indicates that the error variance $\sigma^2$ tends to increase with an increasing level of response. We would then suspect the validity of the assumption of constant variance in the model. Figure 4c shows residuals that form a systematic pattern. Instead of being randomly distributed around the $\hat{y}$-axis, they tend first to increase steadily and then to decrease. This would lead us to suspect that the model is inadequate and that a squared term or some other nonlinear x term should be considered.
Figure 4 Plot of residual vs. predicted value.
Plot of Residual vs. Time Order
The most crucial assumption in a regression analysis is that the errors $e_i$ are independent. Lack of independence frequently occurs in business and economic applications, where the observations are collected in a time sequence with the intention of using regression techniques to predict future trends. In many other experiments, trials are conducted successively in time. In any event, a plot of the residuals vs. time order often detects a violation of the assumption of independence. For example, the plot in Figure 5 exhibits a systematic pattern in that a string of high values is followed by a string of low values. This indicates that consecutive residuals are (positively) correlated, and we would suspect a violation of the independence assumption. Independence can also be checked by plotting the successive pairs $(\hat{e}_i, \hat{e}_{i-1})$, where $\hat{e}_1$ indicates the residual from the first y value observed, $\hat{e}_2$ indicates the second, and so on. Independence is suggested if the scatter diagram is a patternless cluster, whereas points clustered along a line suggest a lack of independence between adjacent observations.
Figure 5 Plot of residual vs. time order.
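The successive-pairs check can also be summarized numerically. The sketch below forms the pairs $(\hat{e}_i, \hat{e}_{i-1})$ and computes their sample correlation; the residuals are hypothetical values chosen to mimic a string of high values followed by a string of low values, and the function names are our own.

```python
import math

def lag_pairs(residuals):
    """Successive pairs (e_i, e_{i-1}) of residuals listed in time order."""
    return [(residuals[i], residuals[i - 1]) for i in range(1, len(residuals))]

def correlation(pairs):
    """Sample correlation coefficient of a list of (u, v) pairs."""
    n = len(pairs)
    ubar = sum(u for u, _ in pairs) / n
    vbar = sum(v for _, v in pairs) / n
    suv = sum((u - ubar) * (v - vbar) for u, v in pairs)
    suu = sum((u - ubar) ** 2 for u, _ in pairs)
    svv = sum((v - vbar) ** 2 for _, v in pairs)
    return suv / math.sqrt(suu * svv)

# Hypothetical residuals in time order (an assumption for illustration)
residuals = [2.1, 1.8, 2.5, 1.2, 0.9, -1.1, -1.9, -2.4, -1.6, -1.5]

pairs = lag_pairs(residuals)
r_lag = correlation(pairs)
print(len(pairs), round(r_lag, 2))
```

A correlation near zero is what a patternless cluster would give; the clearly positive value obtained for these made-up residuals corresponds to points clustered along a line. The number is a supplement to the plot, not a replacement for looking at it.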
It is important to remember that our confidence in statistical inference procedures is related to the validity of the assumptions about them. A mechanically made inference may be misleading if some model assumption is grossly violated. An examination of the residuals is an important part of regression analysis, because it helps to detect any inconsistency between the data and the postulated model.
If no serious violation of the assumptions is exposed in the process of examining residuals, we consider the model adequate and proceed with the relevant inferences. Otherwise, we must search for a more appropriate model.
References
1. Draper, N. R., and Smith, H. Applied Regression Analysis, 2nd ed., John Wiley & Sons, New York, 1981.
KEY IDEAS
When a scatter diagram shows relationship on a curve, it may be possible to choose a transformation of one or both variables such that the transformed data exhibit a linear relation. A simple linear regression analysis can then be performed on the transformed data.
Multiple regression analysis is a versatile technique of building a prediction model with several input variables. In addition to obtaining the least squares fit, we can construct confidence intervals and test hypotheses about the influence of each input variable.
A polynomial regression model is a special case of multiple regression where the powers $x$, $x^2$, $x^3$, and so on of a single predictor x play the role of the individual predictors.
The measure $R^2$, called the square of the multiple correlation coefficient, represents the proportion of y-variability that is explained by the fitted multiple regression model.
To safeguard against a misuse of regression analysis, we must scrutinize the data for agreement with the model assumptions. An examination of the residuals, especially by graphical plots, is essential for detecting possible violations of the assumptions and also in identifying the appropriate modifications of an initial model.
5. EXERCISES
5.1 Given the pairs of (x, y) values:
(a) Transform the x values to $x' = \log_{10} x$ and plot the scatter diagram of y vs. $x'$.
(b) Fit a straight-line regression to the transformed data.
(c) Obtain a 90% confidence interval for the slope of the regression line.
(d) Estimate the expected y-value corresponding to $x = 300$ and give a 95% confidence interval.
*5.2 Obtain a linearizing transformation in each case:
(a) $y = \dfrac{1}{(1 + ae^{bx})^2}$
(b) y
5.3 An experiment was conducted for the purpose of studying the effect of temperature on the life-length of an electrical insulation. Specimens of the insulation were tested under fixed temperatures, and their times to failure recorded.
Temperature x (°C)    Failure Time y (Thousand Hours)
180                   7.3, 7.9, 8.5, 9.6, 10.3
210                   1.7, 2.5, 2.6, 3.1
230                   1.2, 1.4, 1.6, 1.9
250                   .6, .7, 1.0, 1.1, 1.2
(a) Fit a straight-line regression to the transformed data
$x' = 1/x$, $y' = \log y$
(b) Is there strong evidence that an increase in temperature reduces the life of the insulation?
(c) Comment on the adequacy of the fitted line.
5.4 In an experiment (courtesy of W. Burkholder) involving stored-product beetles (Trogoderma glabrum) and their sex-attractant pheromone, the pheromone is placed in a pit-trap in the centers of identical square arenas. Marked beetles are then released along the diagonals of each square at various distances from the pheromone source. After 48 hours, the pit-traps are inspected. Control pit-traps containing no pheromone captured no beetles.
Release Distance (centimeters)    No. of Beetles Captured out of 8
6.25                              5, 3, 4, 6
12.5                              5, 2, 5, 4
24                                4, 5, 3, 0
50                                3, 4, 2, 2
100                               1, 2, 2, 3
(a) Plot the original data with y = number of beetles captured. Repeat with $x = \log_e$(distance).
(b) Fit a straight line by least squares to the appropriate graph in (a).
(c) Construct a 95% confidence interval for $\beta_1$.
(d) Establish a 95% confidence interval for the mean at a release distance of 18 cm.
5.5 A genetic experiment is undertaken to study the competition between two types of female Drosophila melanogaster in cages with one male genotype acting as a substrate. The independent variable x is the time spent in cages, and the dependent variable y is the ratio of the numbers of type 1 to type 2 females. The following data (courtesy of C. Denniston) are recorded:
Time x (Days)    No. Type 1    No. Type 2    y = No. Type 1 / No. Type 2
17               137           586           .23
31               278           479           .58
45               331           167           1.98
59               769           227           3.39
73               976           75            13.01
(a) Plot the scatter diagram of y vs. x and determine if a linear model of relation is appropriate.
(b) Determine if a linear relation is plausible for the transformed data $y' = \log_{10} y$.
(c) Fit a straight-line regression to the transformed data.
5.6 A multiple linear regression was fitted to a data set obtained from 27 runs of an experiment in which four predictors $x_1$, $x_2$, $x_3$, and $x_4$ were observed along with the response y. The following results were obtained:
0o
9r-
9,
0,
2.37
SS due to regression = 925.50
SSE = 82.86
(a) Predict the response for:
(i) $x_1 = 15$, $x_2 = .5$, $x_3 = 5$, $x_4 = 4.6$
(ii) $x_1 = 25$, $x_2 = .8$, $x_3 = 1$, $x_4 = 2.3$
(b) Estimate the error standard deviation $\sigma$, and state the degrees of freedom.
(c) What proportion of the y-variability is explained by the fitted regression?
5.7 Refer to Exercise 5.6. The estimated standard errors of $\hat\beta_1$ and $\hat\beta_2$ were .052 and 2.51, respectively.
(a) Obtain a 90% confidence interval for $\beta_1$.
(b) Test $H_0$: $\beta_2 = 25$ vs. $H_1$: $\beta_2 < 25$ with $\alpha = .05$.
5.8 Listed below are the price quotations of used cars along with their age and odometer mileage.
Age $x_1$ (years)                 10
Mileage $x_2$ (thousand miles)    8.1   17.0  12.6  18.4  19.5  29.2  40.4  51.6  62.6  80.1
Price y (thousand dollars)        5.45  4.80  5.00  4.00  3.70  3.20  3.15  2.69  1.90  1.47
Perform a multiple regression analysis of these data. In particular:
(a) Determine the equation for predicting the price from age and mileage. Interpret the meaning of the coefficients $\hat\beta_1$ and $\hat\beta_2$.
(b) Give 95% confidence intervals for $\beta_1$ and $\beta_2$.
(c) Obtain $R^2$ and interpret the result.
5.9 Recorded here are the scores ($x_1$ and $x_2$) in two midterm examinations, the GPA ($x_3$), and the final examination score (y) for 20 students in a statistics class.
(a) Ignoring the data of GPA and the first midterm score, fit a simple linear regression of y on $x_2$. Compute $r^2$.
(b) Fit a multiple linear regression to predict the final examination score from the GPA and the scores in the midterms. Compute $R^2$.
(c) Interpret the values of $r^2$ and $R^2$ obtained in parts (a) and (b).
$x_1$   $x_2$   $x_3$   $y$        $x_1$   $x_2$   $x_3$   $y$
87      25      2.9     50         93      60      3.2     44
100     84      3.3     80         92      69      3.1     53
91      52      3.5     73         100     86      3.6     86
85      60      3.7     83         80      67      3.5     59
55      76      2.8     33         100     95      3.8     81
81      28      3.1     65         69      51      2.8     20
85      67      3.1     53         80      75      3.6     64
96      83      3.0     68         74      70      3.1     38
79      60      3.7     88         79      66      2.9     77
95      69      3.7     89         95      83      3.3     47
5.10 Refer to Exercise 9.15 in Chapter 12.
(a) Fit a quadratic model $y = \beta_0 + \beta_1 x + \beta_2 x^2 + e$ to the data for CLEP scores y and CQT scores x.
(b) Use the fitted regression to predict the expected CLEP score when $x = 150$.
(c) Compute $r^2$ for fitting a line, and the square of the multiple correlation for fitting a quadratic expression. Interpret these values and comment on the improvement of fit.
*5.11 Write the design matrix $\mathbf{X}$ for fitting a multiple regression model to the data of Exercise 5.8.
*5.12 Write the design matrix $\mathbf{X}$ for fitting a quadratic regression model to the data of Exercise 3.5.
5.13 A least squares fit of a straight line produces the following residuals:
Plot these residuals against the predicted values and determine if any assumption appears to be violated.
5.14 A second-degree polynomial $\hat{y} = \hat\beta_0 + \hat\beta_1 x + \hat\beta_2 x^2$ is fitted to a response y, and the following predicted values and residuals are obtained:
Do the assumptions appear to be violated?
5.15 The following predicted values and residuals are obtained in an experiment conducted to determine the degree to which the yield of an important chemical in the manufacture of penicillin is dependent on sugar concentration (the time order of the experiments is given in parentheses):
Predicted    2.2(9)   3.1(6)   2.5(13)   3.3(1)    2.3(7)   3.6(14)   2.6(8)
Residual     -3   -1   5   0
Predicted    2.5(3)   3.0(12)  3.2(4)    2.9(11)   3.3(2)   2.7(10)   3.2(5)
(a) Plot the residuals against the predicted values and also against the time order.
(b) Do the basic assumptions appear to be violated?
5.16 An experimenter obtains the following residuals after fitting a quadratic expression in x:
x-2
x_1
-.1 0 -.2 .6 -.1
x_4
x:3 -.1 -.3
1.3 -.2 -.1 -.3 .1
X_5
-2 0 -.2 -.2 -.3 -.1
0 .2 -.1 0 -.2
.1 .4 -.1 .l
Do the basic assumptions appear to be violated?
5.17 An interested student used the method of least squares to fit the straight line $\hat{y} = 264.3 + 18.77x$ to gross national product y in real dollars. The results for 26 recent years, $x = 1, 2, \ldots, 26$, are given below. Which assumption(s) for a linear regression model appear to be seriously violated by the data? (Note: Regression methods are usually not appropriate for this type of data.)
Year         1       2       3       4       5       6       7
y            309.9   323.7   324.1   355.3   383.4   395.1   412.8
$\hat{y}$    283.1   301.9   320.6   339.4   358.2   376.9   395.7
Residual     26.8    21.8    3.5     15.9    25.2    18.2    17.1

Year         8       9       10      11      12      13      14
y            407.0   438.0   446.1   452.5   447.3   475.9   487.7
$\hat{y}$    414.5   433.2   452.0   470.8   489.5   508.3   527.1
Residual     -7.5    4.8     -5.9    -18.3   -42.2   -32.4   -39.4

Year         15      16      17      18      19      20      21
y            497.2   529.8   551.0   581.1   617.8   658.1   675.2
$\hat{y}$    545.8   564.6   583.4   602.1   620.9   639.7   658.4
Residual     -48.6   -34.8   -32.4   -21.0   -3.1    18.4    16.8

Year         22      23      24      25      26
y            706.6   725.6   722.5   745.4   790.7
$\hat{y}$    677.2   696.0   714.7   733.5   752.3
Residual     29.4    29.6    7.8     11.9    38.4
CHAPTER 14
Analysis of Categorical Data
1. INTRODUCTION
2. PEARSON'S $\chi^2$ TEST FOR GOODNESS OF FIT
3. CONTINGENCY TABLE WITH ONE MARGIN FIXED (TEST OF HOMOGENEITY)
4. CONTINGENCY TABLE WITH NEITHER MARGIN FIXED (TEST OF INDEPENDENCE)
Does Growing up in a Right-Handed World Exert an Overwhelming Bias Toward Right-handedness?
Data to help answer this question were gathered by L. Carter-Saltzman.
Biological and Sociocultural Effects on Handedness: Comparison Between Biological and Adoptive Families
Abstract. Data from adoption studies on handedness indicate that the effects of shared biological heritage are more powerful determinants of hand preference than sociocultural factors. Biological offspring were found to show nonrandom distributions of right- and non-right-handedness as a function of parental handedness. In contrast, the handedness distribution of adopted children as a function of parental handedness was essentially random.
Table 3. Offspring Handedness Distributions As a Function of Parental Handedness in Biological and Adoptive Families
                              Biological Offspring                          Adopted Offspring
Parental handedness           Total   Right-handed     Left-handed          Total   Right-handed     Left-handed
(Father x Mother)             (No.)   Number  Percent  Number  Percent      (No.)   Number  Percent  Number  Percent
Right versus left (total)*    400     348     87       52      13           408     354     87       54      13
Right x right                 340     303     89       37      11           355     307     86       48      14
Right x left                  38      29      76       9       24           16      12      75       4       25
Left x right                  22      16      73       6       27           37      35      95       2       5
*Left-handedness is defined as all Edinburgh Inventory scores ≤ 0 (−100 through 0) and right-handedness as all scores > 0 (.01 through +100).
Source: Carter-Saltzman, L. Biological and Sociocultural Effects on Handedness: Comparison Between Biological and Adoptive Families, Science, Vol. 209, pp. 1263-1265, Table 3, 12 September, 1980. Copyright © 1980 by AAAS.
1. INTRODUCTION
The name categorical data refers to observations that are only classified into categories so that the data set consists of frequency counts for the categories. Such data occur abundantly in almost all fields of quantitative study, particularly in the social sciences. In a study of religious affiliations, people may be classified as Catholic, Protestant, Jewish, or other; in a survey of job compatibility, employed persons may be classified as being satisfied, neutral, or dissatisfied with their jobs; in plant breeding the offspring of a cross-fertilization may be grouped into several genotypes; manufactured items may be sorted into such categories as "free of defects," "slightly blemished," and "rejects." In all of these examples each category is defined by a qualitative trait. Categories can also be defined by specifying ranges of values on an original numerical measurement scale, such as income that is categorized high, medium, or low, and rainfall that is classified heavy, moderate, or light.
EXAMPLE 1
One Sample Classified in Several Categories. The offspring produced by a cross between two given types of plants can be any of the three genotypes denoted by A, B, and C. A theoretical model of gene inheritance suggests that the offspring of types A, B, and C should be in the ratio 1:2:1. For experimental verification, 100 plants are bred by crossing the two given types. Their genetic classifications are recorded in Table 1. Do these data contradict the genetic model?
Table 1 Classification of Crossbred Plants
Genotype
Observed frequency
Let us denote the population proportions or the probabilities of the genotypes A, B, and C by $p_A$, $p_B$, and $p_C$, respectively. Since the genetic model states that these probabilities are in the ratio 1:2:1, our object is to test the null hypothesis
$$H_0: p_A = \tfrac{1}{4}, \quad p_B = \tfrac{1}{2}, \quad p_C = \tfrac{1}{4}$$
Here the data consist of frequency counts of a random sample classified in three categories or cells, the null hypothesis specifies the numerical values of the cell probabilities, and we wish to examine if the observed frequencies contradict the null hypothesis.
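The step from the stated ratio to the cell probabilities, and to the frequencies one would expect in a sample of 100 if the null hypothesis held, is simple arithmetic. The sketch below is illustrative only; the function names are ours, and the expected frequencies are a by-product of $H_0$ rather than part of the observed data.

```python
from fractions import Fraction

def cell_probabilities(ratio):
    """Convert a ratio such as 1:2:1 into probabilities that sum to 1."""
    total = sum(ratio)
    return [Fraction(r, total) for r in ratio]

def expected_frequencies(n, probs):
    """Frequencies implied by the null hypothesis for a sample of size n."""
    return [n * p for p in probs]

probs = cell_probabilities([1, 2, 1])         # genotypes A, B, C
expected = expected_frequencies(100, probs)   # 100 crossbred plants

print([str(p) for p in probs])
print([int(e) for e in expected])
```

Exact fractions are used so that the probabilities add to 1 without rounding error.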
EXAMPLE 2
Independent Samples Classified in Several Categories. To compare the effectiveness of two diets A and B, 150 infants were included in a study. Diet A was given to 80 randomly selected infants and diet B was given to the other 70 infants. At a later time, the health of each infant was observed and classified into one of the three categories "excellent," "average," and "poor." From the frequency counts recorded in Table 2, we wish to test the null hypothesis that there is no difference between the quality of the two diets.
Table 2 Health under Two Different Diets
          Excellent    Average    Poor
Diet A    37           24         19
Diet B    17           33         20
The two rows of Table 2 have resulted from independent samples. For a descriptive summary of these data, it is proper to compute the relative frequencies for each row. These are given in Table 2(a). The (unknown) population proportions or probabilities are entered in Table 2(b). They allow us to describe the null hypothesis more clearly.
Table 2(a) Relative Frequencies (from Table 2)
Table 2(b) Population Proportions or Probabilities
The null hypothesis of "no difference" is equivalent to the statement that, for each response category, the probability is the same for diet A and diet B. Consequently, we formulate

$$H_0:\ p_{A1} = p_{B1},\quad p_{A2} = p_{B2},\quad p_{A3} = p_{B3}$$
Note that, although $H_0$ specifies a structure for the cell probabilities, it does not give the numerical value of the common probability in each column.
EXAMPLE 3

One Sample Simultaneously Classified According to Two Characteristics. A random sample of 500 persons is questioned regarding political affiliation and attitude toward a tax reform program. From the observed frequency table given in Table 3, we wish to answer the following question: Do the data indicate that the pattern of opinion is different between the two political groups?
Table 3 Political Affiliation and Opinion

              Favor   Indifferent   Opposed   Total
Democrat       138        83           64      285
Republican      64        67           84      215
Total          202       150          148      500
Unlike Example 2, here we have a single random sample but each sampled individual elicits two types of responses: political affiliation and attitude. In the present context, the null hypothesis of "no difference" amounts to saying that the two types of responses are independent. In other words, attitude to the program is unrelated to or independent of a person's political affiliation. A formal specification of this null hypothesis, in terms of the cell probabilities, is deferred until Section 4.

Frequency count data that arise from a classification of the sample observations according to two or more characteristics are called cross-tabulated data or a contingency table. If only two characteristics are observed, and the contingency table has $r$ rows and $c$ columns, it is designated as an $r \times c$ table. Thus Tables 2 and 3 are both $2 \times 3$ contingency tables.

Although Tables 2 and 3 have the same appearance, there is a fundamental difference in regard to the method of sampling. The row totals 80 and 70 in Table 2 are the predetermined sample sizes; these are not outcomes of random sampling as are the column totals. By contrast, both sets of marginal totals in Table 3 are outcomes of random sampling; none were fixed beforehand. To draw the distinction, one often refers to Table 2 as a $2 \times 3$ contingency table with fixed row totals. In Sections 3 and 4 we will see that the formulation of the null hypothesis is different for the two situations.
2. PEARSON'S $\chi^2$ TEST FOR GOODNESS OF FIT

We first consider the type of problem illustrated in Example 1 where the data consist of frequency counts observed from a random sample, and the null hypothesis specifies the unknown cell probabilities. Our primary goal is to test if the model given by the null hypothesis fits the data, and this is appropriately called testing for goodness of fit.

For general discussion, suppose a random sample of size $n$ is classified into $k$ categories or cells labeled $1, 2, \ldots, k$, and let $n_1, n_2, \ldots, n_k$ denote the respective cell frequencies. Denoting the cell probabilities by $p_1, p_2, \ldots, p_k$, a null hypothesis that completely specifies the cell probabilities is of the form

$$H_0:\ p_1 = p_{10},\ \ldots,\ p_k = p_{k0}$$

where $p_{10}, \ldots, p_{k0}$ are given numerical values that satisfy $p_{10} + \cdots + p_{k0} = 1$.

From Chapter 6 recall that if the probability of an event is $p$, then the expected number of occurrences of the event in $n$ trials is $np$. Therefore, once the cell probabilities are specified, the expected cell frequencies can be readily computed by multiplying these probabilities by the sample size $n$. A goodness of fit test attempts to determine if a conspicuous discrepancy exists between the observed cell frequencies and those expected under $H_0$. (See Table 4.)
Table 4 The Basis of a Goodness of Fit Test

Cells                                   1            2          ...       k           Total
Observed frequency (O)               $n_1$        $n_2$        ...     $n_k$          $n$
Probability under $H_0$              $p_{10}$     $p_{20}$     ...     $p_{k0}$        1
Expected frequency (E) under $H_0$   $np_{10}$    $np_{20}$    ...     $np_{k0}$      $n$
A useful measure for the overall discrepancy between the observed and expected frequencies is given by

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - np_{i0})^2}{np_{i0}} = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$
where $O$ and $E$ symbolize an observed frequency and the corresponding expected frequency. The discrepancy in each cell is measured by the squared difference between the observed and the expected frequencies divided by the expected frequency. The $\chi^2$ measure is the sum of these quantities for all cells.
The $\chi^2$ statistic was originally proposed by Karl Pearson (1857-1936), who found the distribution for large $n$ to be approximately a $\chi^2$ distribution with d.f. $= k - 1$. Due to this distribution, the statistic is denoted by $\chi^2$ and is called Pearson's $\chi^2$ statistic for goodness of fit. Because a large value of the overall discrepancy indicates a disagreement between the data and the null hypothesis, the upper tail of the $\chi^2$ distribution constitutes the rejection region.
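Pearson's $\chi^2$ Test for Goodness of Fit (Based on Large $n$)

Null hypothesis: $H_0:\ p_1 = p_{10},\ \ldots,\ p_k = p_{k0}$

Test statistic: $$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - np_{i0})^2}{np_{i0}} = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$

Rejection region: $\chi^2 \ge \chi^2_\alpha$, where $\chi^2_\alpha$ is the upper $\alpha$ point of the $\chi^2$ distribution with d.f. $= k - 1 =$ (number of cells) $- 1$.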
It should be remembered that Pearson's $\chi^2$ test is an approximate test that is valid only for large samples. As a rule of thumb, $n$ should be large enough so that the expected frequency of each cell is at least 5.
EXAMPLE 4

Referring to Example 1, test the goodness of fit of the genetic model to the data in Table 1. Take $\alpha = .05$. Following the structure of Table 4, the computations for the $\chi^2$ statistic are exhibited in Table 5.
Table 5 The $\chi^2$ Goodness of Fit Test for the Data in Table 1

Cell                            A       B       C      Total
Observed frequency (O)         18      55      27       100
Probability under $H_0$       .25     .50     .25       1.0
Expected frequency (E)         25      50      25       100
$(O - E)^2 / E$              1.96     .50     .16      2.62 = $\chi^2$
                                                       d.f. = 2
The upper 5% point of the $\chi^2$ distribution with d.f. $= 2$ is 5.99 (Appendix B, Table 5). Because the observed $\chi^2 = 2.62$ is smaller than this value, the null hypothesis is not rejected at $\alpha = .05$. We conclude that the data in Table 1 do not contradict the genetic model.
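The arithmetic of Example 4 can be checked by machine. The following is a minimal Python sketch and is not part of the original text: the function name `pearson_goodness_of_fit` is hypothetical, and the closed-form tail area used for the P-value is an assumption that holds only when d.f. = 2, where the upper tail area of the $\chi^2$ distribution beyond $x$ equals $e^{-x/2}$.

```python
import math

def pearson_goodness_of_fit(observed, null_probs):
    """Return (chi_square, d.f., expected) when H0 fully specifies the cell probabilities."""
    n = sum(observed)
    expected = [n * p for p in null_probs]          # E = n * p_i0 for each cell
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, len(observed) - 1, expected

observed = [18, 55, 27]            # genotypes A, B, C from Table 5
null_probs = [0.25, 0.50, 0.25]    # the 1:2:1 genetic model
chi_square, df, expected = pearson_goodness_of_fit(observed, null_probs)

# Valid for d.f. = 2 only: P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)
critical_value = -2 * math.log(0.05)   # upper 5% point when d.f. = 2

print(round(chi_square, 2), df, round(critical_value, 2), round(p_value, 3))
```

The statistic falls below the critical value, in agreement with the conclusion of Example 4. For other degrees of freedom the tabulated percentage points of Appendix B would be needed in place of the closed form.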
The $\chi^2$ statistic measures the overall discrepancy between the observed frequencies and those expected under a given null hypothesis. Example 4 demonstrates its application when the frequency counts arise from a single random sample and the categories refer to only one characteristic, namely, the genotype of the offspring. Basically, the same principle extends to testing hypotheses with more complex types of categorical data such as the contingency tables illustrated in Examples 2 and 3. In preparation for these developments, we state two fundamental properties of the $\chi^2$ statistic:
Properties of Pearson's $\chi^2$ Statistic

(a) Additivity: If $\chi^2$ statistics are computed from independent samples, then their sum is also a $\chi^2$ statistic whose d.f. equals the sum of the d.f.'s of the components.

(b) Loss of d.f. due to estimation of parameters: If $H_0$ does not completely specify the cell probabilities, then some parameters have to be estimated in order to obtain the expected cell frequencies. In that case, the d.f. of $\chi^2$ is reduced by the number of parameters estimated.

d.f. of $\chi^2$ = (No. of cells) $- 1 -$ (No. of parameters estimated)
EXERCISES

2.1 Given below are the frequencies observed from 300 tosses of a die. Do these data cast doubt on the fairness of the die?
Frequency
2.2 A market researcher wishes to assess consumers' preference among four different colors available on a name-brand household washing machine. The following frequencies were observed from a random sample of 300 recent sales:
Color
Avocado Tan
Frequency
91
White
Blue
Total
60
66
300
83
Test the null hypothesis, at $\alpha = .05$, that all four colors are equally popular.

2.3 Cross-fertilizing a pure strain of red flowers with a pure strain of white flowers produces pink hybrids that have one gene of each type. Crossing these hybrids can lead to any one of four possible gene pairs. Under Mendel's theory, these four are equally likely so P(white) $= \tfrac{1}{4}$, P(pink) $= \tfrac{1}{2}$, P(red) $= \tfrac{1}{4}$. An experiment carried out by Correns, one of Mendel's followers, resulted in the frequencies 141, 291, and 132 for the white, pink, and red flowers, respectively. (Source: W. Johannsen, 1909, Elements of the Precise Theory of Heredity, G. Fischer, Jena.) Do these observations appear to contradict the probabilities suggested by Mendel's theory?

2.4 The following table records the observed number of births at a hospital in four consecutive quarterly periods:

Quarters            Jan.-Mar.   Apr.-Jun.   Jul.-Sep.   Oct.-Dec.
Number of births       110          53          57          80
It is conjectured that twice as many babies are born during the Jan.-Mar. quarter than are born in any of the other three quarters. At $\alpha = .10$, test if these data strongly contradict the stated conjecture.
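*2.5 An alternative expression for Pearson's $\chi^2$. By expanding the square on the right-hand side of

$$\chi^2 = \sum_{\text{cells}} \frac{(n_i - np_{i0})^2}{np_{i0}}$$

show that the $\chi^2$ statistic can also be expressed as

$$\chi^2 = \sum_{\text{cells}} \frac{n_i^2}{np_{i0}} - n = \sum_{\text{cells}} \frac{O^2}{E} - n$$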
3. CONTINGENCY TABLE WITH ONE MARGIN FIXED (TEST OF HOMOGENEITY)

From each population, we draw a random sample of a predetermined sample size and classify each response in categories. These data form a two-way contingency table where one classification refers to the populations and the other refers to the response under study. Our objective is to test whether the populations are alike or homogeneous with respect to cell probabilities. To do so, we will determine if the observed proportions in each response category are nearly the same for all populations. Let us pursue our analysis with the data of Table 2.
EXAMPLE 5

Continuation of Example 2. For ease of reference, the data in Table 2 are reproduced in Table 6. Here the populations correspond to the two diets and the response is recorded in three categories. The row totals 80 and 70 are the fixed sample sizes.
Table 6 A $2 \times 3$ Contingency Table with Fixed Row Totals

           Excellent   Average   Poor   Total
Diet A        37          24      19      80
Diet B        17          33      20      70
Total         54          57      39     150
We have already formulated the null hypothesis of 'homogeneity' or 'no difference between the diets' as [see Table 2(b)]

$$H_0:\ p_{A1} = p_{B1},\quad p_{A2} = p_{B2},\quad p_{A3} = p_{B3}$$
If we denote these common probabilities under $H_0$ by $p_1$, $p_2$, and $p_3$, respectively, the expected cell frequencies in each row would be obtained by multiplying these probabilities by the sample size. In particular, the expected frequencies in the first row are $80p_1$, $80p_2$, and $80p_3$, and those in the second row are $70p_1$, $70p_2$, and $70p_3$. However, the $p_i$'s are not specified by $H_0$. Therefore, we have to estimate these parameters in order to obtain the numerical values of the expected frequencies.

The column totals, 54, 57, and 39, in Table 6 are the frequency counts of the three response categories in the combined sample of size 150. Under $H_0$, the estimated probabilities are

$$\hat{p}_1 = \frac{54}{150},\quad \hat{p}_2 = \frac{57}{150},\quad \hat{p}_3 = \frac{39}{150}$$

Using these estimates, the expected frequencies in the first row become

$$\frac{80 \times 54}{150},\quad \frac{80 \times 57}{150},\quad \text{and}\quad \frac{80 \times 39}{150}$$
and similarly for the second row. Referring to Table 6, notice the interesting pattern in these calculations:

$$\text{Expected cell frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Table 7(a) presents the observed frequencies (O) along with the expected frequencies (E). The latter are given in parentheses. Table 7(b) computes the discrepancy measure $(O - E)^2/E$ for the individual cells. Adding these over all the cells we obtain the value of the $\chi^2$ statistic.
Table 7(a) The Observed and Expected Frequencies of the Data in Table 6

           Excellent     Average       Poor
Diet A     37 (28.8)     24 (30.4)     19 (20.8)
Diet B     17 (25.2)     33 (26.6)     20 (18.2)
Table 7(b) The Values of $(O - E)^2/E$

           Excellent   Average   Poor
Diet A       2.335      1.347    .156
Diet B       2.668      1.540    .178
                                          8.224 = $\chi^2$
In order to determine the degrees of freedom, we employ the properties of the $\chi^2$ statistic stated in Section 2. Our $\chi^2$ has been computed from two independent samples, each of which contributes $(3 - 1) = 2$ d.f. because there are three categories. The added d.f. $= 2 + 2 = 4$ must now be reduced by the number of parameters we have estimated. Since $p_1$, $p_2$, and $p_3$ satisfy the relation $p_1 + p_2 + p_3 = 1$, there are really two undetermined parameters among them. Therefore, our $\chi^2$ statistic has d.f. $= 4 - 2 = 2$.

With d.f. $= 2$, the tabulated upper 5% point of $\chi^2$ is 5.99 (Appendix B, Table 5). Since the observed $\chi^2 = 8.224$ is larger, the null hypothesis is rejected at $\alpha = .05$. It would also be rejected at $\alpha = .025$. Therefore, a significant difference between the quality of the two diets is indicated by the data.

Having obtained a significant $\chi^2$, we should now examine Tables 7(a) and 7(b) and try to locate the source of the significance. We find that large contributions to $\chi^2$ come from the "excellent" category, where the relative frequency is 37/80 or 46% for diet A, and 17/70 or 24% for diet B. These data indicate that A is superior.

Motivated by Example 5 we are now ready to describe the $\chi^2$ test procedure for an $r \times c$ contingency table that has independent samples from $r$ populations which are classified in $c$ response categories. As we have seen before, the expected frequency of a cell is given by (row total $\times$ column total)/grand total. With regard to the d.f. of the $\chi^2$ for an $r \times c$ table, we note that each of the $r$ rows contributes $(c - 1)$ d.f.'s so the total contribution is $r(c - 1)$. Since $(c - 1)$ parameters have to be estimated,

$$\text{d.f. of } \chi^2 = r(c - 1) - (c - 1) = (r - 1)(c - 1) = (\text{No. of rows} - 1) \times (\text{No. of columns} - 1)$$
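The $\chi^2$ Test of Homogeneity in a Contingency Table

Null hypothesis: In each response category, the probabilities are equal for all the populations.

Test statistic: $$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}, \qquad \text{d.f.} = (\text{No. of rows} - 1)(\text{No. of columns} - 1)$$

Rejection region: $\chi^2 \ge \chi^2_\alpha$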
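The procedure in the box lends itself to a short program. What follows is a minimal Python sketch, not taken from the original text; the helper name `chi_square_table` is hypothetical. It forms each expected frequency as (row total $\times$ column total)/grand total and applies it to the diet data of Table 6.

```python
def chi_square_table(observed):
    """Return (chi_square, d.f., expected) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E = (row total x column total) / grand total for every cell
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_square = sum((o - e) ** 2 / e
                     for o_row, e_row in zip(observed, expected)
                     for o, e in zip(o_row, e_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df, expected

diet_table = [[37, 24, 19],   # Diet A: excellent, average, poor
              [17, 33, 20]]   # Diet B
chi_square, df, expected = chi_square_table(diet_table)

print(round(chi_square, 3), df)
print([[round(e, 1) for e in row] for row in expected])
```

The same function accepts any number of rows and columns, so it could also be tried on the larger tables in the exercises; the comparison with the tabulated percentage point is left to the reader.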
EXAMPLE 6

A survey is undertaken to determine the incidence of alcoholism in different professional groups. Random samples of the clergy, educators, executives, and merchants are interviewed, and the observed frequency counts are given in Table 8. Construct a test to determine if the incidence rate of alcoholism appears to be the same in all four groups.
Table 8 Contingency Table of Alcoholism vs. Profession

              Alcoholic       Nonalcoholic     Total
Clergy        32 (58.25)      268 (241.75)      300
Educators     51 (48.54)      199 (201.46)      250
Executives    67 (58.25)      233 (241.75)      300
Merchants     83 (67.96)      267 (282.04)      350
Total           233              967           1200
Let us denote the proportions of alcoholics in the populations of the clergy, educators, executives, and merchants by $p_1$, $p_2$, $p_3$, and $p_4$, respectively. Based on independent random samples from four binomial populations, we want to test the null hypothesis

$$H_0:\ p_1 = p_2 = p_3 = p_4$$

The expected cell frequencies, shown in parentheses in Table 8, are computed by multiplying the row and column totals and dividing the results by 1200. The $\chi^2$ statistic is computed in Table 9.
Table 9 The Values of $(O - E)^2/E$ for the Data in Table 8

              Alcoholic   Nonalcoholic
Clergy          11.83         2.85
Educators         .12          .03
Executives       1.31          .32
Merchants        3.33          .80
                                          20.59 = $\chi^2$

d.f. of $\chi^2$ $= (4 - 1)(2 - 1) = 3$
With d.f. $= 3$, the tabulated upper 5% point of $\chi^2$ is 7.81 so that the null hypothesis is rejected at $\alpha = .05$. It would be rejected also at $\alpha = .01$ and so the P-value is less than .01. Examining Table 9 we notice that a large contribution to the $\chi^2$ statistic has come from the first row. This is because the relative frequency of alcoholics among the clergy is quite low in comparison to the others, as one can see from Table 8.
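Locating the cells that contribute most to a significant $\chi^2$ can also be done by sorting the individual terms $(O - E)^2/E$. The Python sketch below is an illustration only and is not part of the original text; the variable names are hypothetical. Because it keeps unrounded expected frequencies, its total can differ slightly in the second decimal place from the sum of the rounded entries in Table 9.

```python
rows = ["Clergy", "Educators", "Executives", "Merchants"]
cols = ["Alcoholic", "Nonalcoholic"]
observed = [[32, 268], [51, 199], [67, 233], [83, 267]]   # Table 8

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
grand_total = sum(row_totals)

# Contribution of each cell: (O - E)^2 / E with E = row total * column total / grand total
contributions = {}
for i, row_name in enumerate(rows):
    for j, col_name in enumerate(cols):
        e = row_totals[i] * col_totals[j] / grand_total
        contributions[(row_name, col_name)] = (observed[i][j] - e) ** 2 / e

chi_square = sum(contributions.values())
largest_cell = max(contributions, key=contributions.get)

# List the cells from the largest contribution to the smallest
for cell, value in sorted(contributions.items(), key=lambda item: -item[1]):
    print(cell, round(value, 2))
print("largest contribution:", largest_cell)
```

The ordering makes the pattern noted in the example visible at a glance: the first row dominates the statistic.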
EXAMPLE 7

A $2 \times 2$ Contingency Table. To determine the possible effect of a chemical treatment on the rate of seed germination, 100 chemically treated seeds and 150 untreated seeds are sown. The numbers of seeds that germinate are recorded in Table 10. Do the data provide strong evidence that the rate of germination is different for the treated and untreated seeds?
Table 10

             Germinated       Not Germinated   Total
Treated       84 (86.40)       16 (13.60)       100
Untreated    132 (129.60)      18 (20.40)       150
Total        216               34               250
Letting $p_1$ and $p_2$ denote the probabilities of germination for the chemically treated seeds and the untreated seeds, respectively, we wish to test the null hypothesis $H_0: p_1 = p_2$. For the $\chi^2$ test, we calculate the expected frequencies in the usual way. These are given in parentheses in Table 10. The computed value of $\chi^2$ is

$$\chi^2 = .067 + .424 + .044 + .282 = .817, \qquad \text{d.f.} = (2 - 1)(2 - 1) = 1$$

The tabulated 5% value of $\chi^2$ with d.f. $= 1$ is 3.84. Because the observed $\chi^2 = .817$ is smaller, the null hypothesis is not rejected at $\alpha = .05$. The rate of germination is not significantly different between the treated and untreated seeds.
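A reader with a computer at hand may wish to reproduce the value of $\chi^2$ in Example 7 and attach an approximate P-value to it. The Python sketch below is not part of the original text. It relies on one assumption that is stated loudly here and again in the code: with d.f. = 1, $\chi^2$ is the square of a standard normal variable, so the upper tail area beyond $x$ equals $\operatorname{erfc}(\sqrt{x/2})$.

```python
import math

# Table 10: rows are treated / untreated, columns are germinated / not germinated
observed = [[84, 16], [132, 18]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected frequency
        chi_square += (o - e) ** 2 / e

# Assumption (d.f. = 1 only): P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 3), round(p_value, 2))
```

A P-value well above .05 is consistent with the decision not to reject $H_0$.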
Another Method of Analyzing a $2 \times 2$ Contingency Table

In light of Example 7, we note that a $2 \times 2$ contingency table, with one margin fixed, is essentially a display of independent random samples from two dichotomous (that is, two-category) populations. This structure is shown in Table 11 where we have labeled the two categories "success" and "failure." Here $X$ and $Y$ denote the numbers of successes in independent random samples of sizes $n_1$ and $n_2$ taken from Population 1 and Population 2, respectively.
Table 11 Independent Samples from Two Dichotomous Populations

                No. of Successes   No. of Failures   Sample Size
Population 1          $X$            $n_1 - X$          $n_1$
Population 2          $Y$            $n_2 - Y$          $n_2$
Letting $p_1$ and $p_2$ denote the probabilities of success for Populations 1 and 2, respectively, our object is to test the null hypothesis $H_0: p_1 = p_2$. The sample proportions

$$\hat{p}_1 = \frac{X}{n_1} \quad \text{and} \quad \hat{p}_2 = \frac{Y}{n_2}$$

provide estimates of $p_1$ and $p_2$. When the sample sizes are large, a test of $H_0: p_1 = p_2$ can be based on the test statistic

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\text{Estimated standard error}}$$

which is approximately distributed as $N(0, 1)$ under $H_0$.
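If we denote by $p$ the common probability of success under $H_0$, the standard error of $\hat{p}_1 - \hat{p}_2$ is given by the expression

$$\text{S.E.}(\hat{p}_1 - \hat{p}_2) = \sqrt{p(1 - p)}\,\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

The unknown parameter $p$ is estimated by pooling information from the two samples. The proportion of successes in the combined sample provides

$$\text{Pooled estimate } \hat{p} = \frac{X + Y}{n_1 + n_2}, \qquad \text{Estimated S.E.}(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}(1 - \hat{p})}\,\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

In summary: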
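Testing $H_0: p_1 = p_2$ with Large Samples

Test statistic: $$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad \text{where } \hat{p} = \frac{X + Y}{n_1 + n_2}$$

The level $\alpha$ rejection region is $|Z| \ge z_{\alpha/2}$, $Z \le -z_\alpha$, or $Z \ge z_\alpha$ according as the alternative hypothesis is $p_1 \ne p_2$, $p_1 < p_2$, or $p_1 > p_2$. Here $z_\alpha$ denotes the upper $\alpha$ point of the $N(0, 1)$ distribution.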
Although the test statistics $Z$ and

$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$

appear to have quite different forms, there is an exact relation between them, namely,

$$Z^2 = \chi^2 \quad (\text{for a } 2 \times 2 \text{ contingency table})$$

Also, $z_{\alpha/2}^2$ is the same as the upper $\alpha$ point of $\chi^2$ with d.f. $= 1$. For instance, with $\alpha = .05$, $z_{.025}^2 = (1.96)^2 = 3.8416$, which is also the upper 5% point of $\chi^2$ with d.f. $= 1$ (see Appendix B, Table 5). Thus the two test procedures are equivalent when the alternative hypothesis is two-sided. However, if the alternative hypothesis is one-sided, such as $H_1: p_1 > p_2$, only the $Z$-test is appropriate.
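The exact relation $Z^2 = \chi^2$ can be verified numerically. The sketch below, in Python, is an illustration added under stated assumptions and is not from the original text: the function names `z_statistic` and `chi_square_2x2` are hypothetical, and the inputs are the germination counts of Table 10.

```python
import math

def z_statistic(x, n1, y, n2):
    """Pooled two-sample Z statistic for H0: p1 = p2."""
    pooled = (x + y) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled)) * math.sqrt(1 / n1 + 1 / n2)
    return (x / n1 - y / n2) / std_error

def chi_square_2x2(x, n1, y, n2):
    """Pearson chi-square for the 2 x 2 table of successes and failures."""
    observed = [[x, n1 - x], [y, n2 - y]]
    col_totals = [x + y, (n1 - x) + (n2 - y)]
    total = n1 + n2
    return sum((observed[i][j] - n_i * col_totals[j] / total) ** 2
               / (n_i * col_totals[j] / total)
               for i, n_i in enumerate((n1, n2)) for j in range(2))

z = z_statistic(84, 100, 132, 150)          # treated vs. untreated seeds
chi_square = chi_square_2x2(84, 100, 132, 150)

print(round(z, 3), round(z ** 2, 3), round(chi_square, 3))
```

The two squared quantities agree up to floating-point rounding, while the sign of $Z$ retains the direction of the difference, which is why only the $Z$-test suits a one-sided alternative.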
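EXAMPLE 8

Use the $Z$-test with the data of Example 7. We calculate

$$\hat{p}_1 = \frac{84}{100} = .84, \qquad \hat{p}_2 = \frac{132}{150} = .88$$

$$\text{pooled estimate } \hat{p} = \frac{84 + 132}{100 + 150} = .864$$

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{-.04}{\sqrt{.864 \times .136}\,\sqrt{\dfrac{1}{100} + \dfrac{1}{150}}} = -.904$$

Because $|Z|$ is smaller than $z_{.025} = 1.96$, the null hypothesis is not rejected at $\alpha = .05$. Note that $Z^2 = (-.904)^2 = .817$ agrees with the result $\chi^2 = .817$ found in Example 7.

The approximate normal distribution of $(\hat{p}_1 - \hat{p}_2)$ allows us to construct a confidence interval for the difference $(p_1 - p_2)$.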
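Large Sample Confidence Interval for $p_1 - p_2$

An approximate $100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

provided the sample sizes $n_1$ and $n_2$ are large.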
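To illustrate, we refer once again to the data of Example 7 and calculate

$$\hat{p}_1 = \frac{84}{100} = .84, \qquad \hat{p}_2 = \frac{132}{150} = .88, \qquad \hat{p}_1 - \hat{p}_2 = -.04$$

$$\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} = \sqrt{\frac{.84 \times .16}{100} + \frac{.88 \times .12}{150}} = .045$$

An approximate 95% confidence interval for $(p_1 - p_2)$ is

$$-.04 \pm 1.96 \times .045 = -.04 \pm .09 \quad \text{or} \quad (-.13,\ .05)$$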
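The interval just computed can be reproduced with a few lines of code. This is a minimal Python sketch, not part of the original text; the function name `diff_proportion_ci` and its default multiplier `z=1.96` (the upper .025 point of the standard normal, giving 95% confidence) are choices made for the illustration.

```python
import math

def diff_proportion_ci(x, n1, y, n2, z=1.96):
    """Approximate large-sample confidence interval for p1 - p2 (unpooled S.E.)."""
    p1_hat, p2_hat = x / n1, y / n2
    std_error = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * std_error, diff + z * std_error

# Data of Example 7: 84 of 100 treated and 132 of 150 untreated seeds germinated
low, high = diff_proportion_ci(84, 100, 132, 150)
print(round(low, 2), round(high, 2))
```

Note that the interval uses the separate sample proportions in the standard error, whereas the $Z$-test uses the pooled estimate, since pooling is justified only under $H_0$.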
EXERCISES

3.1 Many industrial air pollutants adversely affect plants. Sulphur dioxide causes leaf damage in the form of intraveinal bleaching in many sensitive plants. In a study of the effect of a given concentration of sulphur dioxide in the air on three types of garden vegetables, 40 plants of each type are exposed to the pollutant under controlled greenhouse conditions. The frequencies of severe leaf damage are recorded in the following table:

                 Leaf Damage
            Severe   Moderate or None   Total
Lettuce       32            8             40
Spinach       28           12             40
Tomato        19           21             40

Analyze these data to determine if the incidence of severe leaf damage is alike for the three types of plants. In particular:
(a) Formulate the null hypothesis.
(b) Test the null hypothesis with
(c) Construct three individual 95% confidence intervals and plot them on graph paper.

3.2 When a new product is introduced in a market, it is important for the manufacturer to evaluate its performance during the critical months after its distribution. A study of market penetration, as it is called, involves sampling consumers and assessing their exposure to the product. Suppose that the marketing division of a company selects random samples of 200, 150, and 300 consumers from three cities and obtains the following data from them:

          Never Heard      Heard About It      Bought It at
          of the Product   but Did Not Buy     Least Once      Total
City 1         36                55                109          200
City 2         45                56                 49          150
City 3         54                78                168          300
Total         135               189                326          650
Do these data indicate that the extent of market penetration differs in the three cities?

3.3 Random samples of 250 persons in the 30-40 year age group and 250 persons in the 60-70 year age group are asked about the average number of hours they sleep per night, and the following summary data are recorded:

            Hours of Sleep
Age         ≤ 8      > 8     Total
30-40       172       78      250
60-70       120      130      250
Total       292      208      500
By using the $\chi^2$ test determine if the sleep needs appear to be different for the two age groups.

3.4 Referring to Exercise 3.3, denote by $p_1$ and $p_2$ the population proportions in the two groups who have ≤ 8 hours of sleep per night.
(a) Use the $Z$-test for testing $H_0: p_1 = p_2$ vs. $H_1: p_1 \ne p_2$.
(b) Construct a 95% confidence interval for $p_1 - p_2$.

3.5 Given the data $n_1 =$
100, $n_2 = 200$, $\hat{p}_1 = \dfrac{50}{100}$, $\hat{p}_2 = \dfrac{140}{200}$
(a) Find the 95% confidence interval for $p_1 - p_2$.
(b) Perform the $Z$-test for the null hypothesis $H_0: p_1 = p_2$ vs. $H_1: p_1 \ne p_2$.
3.6 A study (courtesy of R. Golubjatnikov) is undertaken to compare the rates of prevalence of CF antibody to Parainfluenzal virus among boys and girls in the age group 5-9 years. Among 113 boys tested, 34 are found to have the antibody; among 139 girls tested, 54 have the antibody. Do the data provide strong evidence that the rate of prevalence of the antibody is significantly higher in girls than in boys? Use a $Z$-test with $\alpha = .05$.

3.7 Refer to Exercise 3.6.
(a) Write the data in the form of a $2 \times 2$ contingency table.
(b) Perform the $\chi^2$ test at $\alpha = .05$.
(c) How does the $\chi^2$ test relate to the test you used in Exercise 3.6?
3.8 To study the effect of soil condition on the growth of a new hybrid plant, saplings were planted on three types of soil and their subsequent growth classified in three categories.
Clay
Sand
Loam
Does the quality of growth appear to be different for different soil types?
4. CONTINGENCY TABLE WITH NEITHER MARGIN FIXED (TEST OF INDEPENDENCE)

When two traits are observed for each element of a random sample, the data can be simultaneously classified with respect to these traits. We then obtain a two-way contingency table in which both sets of marginal totals are random. An illustration was already provided in Example 3. To cite a few other examples: a random sample of employed persons may be classified according to educational attainment and type of occupation; college students may be classified according to the year in college and attitude toward a dormitory regulation; flowering plants may be classified with respect to type of foliage and size of flower.

A typical inferential aspect of cross-tabulation is the study of whether the two characteristics appear to be manifested independently or whether certain levels of one characteristic tend to be associated or contingent with some levels of another.
EXAMPLE 9

Continuation of Example 3. The $2 \times 3$ contingency table of Example 3 is given in Table 12. Here a single random sample of 500 persons is classified into six cells.

Table 12 Contingency Table for Political Affiliation and Opinion

              Favor   Indifferent   Opposed   Total
Democrat       138        83           64      285
Republican      64        67           84      215
Total          202       150          148      500

Dividing the cell frequencies by the sample size 500 we obtain the relative frequencies shown in Table 12(a). Its row marginal totals .570 and .430 represent the sample proportions of Democrats and Republicans, respectively. Likewise, the column marginal totals show the sample proportions in the three categories of attitude.
Table 12(a) Proportion of Observations in Each Cell

              Favor   Indifferent   Opposed   Total
Democrat      .276       .166         .128     .570
Republican    .128       .134         .168     .430
Total         .404       .300         .296    1
Imagine a classification of the entire population. The unknown population proportions (i.e., the probabilities of the cells) are represented by the entries in Table 12(b), where the suffixes D and R stand for Democrat and Republican, and 1, 2, and 3 refer to the "favor," "indifferent," and "opposed" categories.
Table 12(b) Cell Probabilities

                                                                    Row Marginal
                               Favor      Indifferent   Opposed     Probability
Democrat                       $p_{D1}$    $p_{D2}$     $p_{D3}$      $p_D$
Republican                     $p_{R1}$    $p_{R2}$     $p_{R3}$      $p_R$
Column marginal probability    $p_1$       $p_2$        $p_3$          1
Table 12(b) is the population analogue of Table 12(a), which shows the sample proportions. For instance,

Cell probability              $p_{D1}$ = P(Democrat and in favor)
Row marginal probability      $p_D$ = P(Democrat)
Column marginal probability   $p_1$ = P(in favor)
We are concerned with testing the null hypothesis that the two classifications are independent. Recall from Chapter 4 that the probability of the intersection of independent events is the product of their probabilities. Thus, independence of the two classifications means that $p_{D1} = p_D p_1$, $p_{D2} = p_D p_2$, and so on. Therefore, the null hypothesis of independence can be formalized as

$H_0$: Each cell probability is the product of the corresponding pair of marginal probabilities.

To construct a $\chi^2$ test, we need to determine the expected frequencies. Under $H_0$, the expected cell frequencies are

$$500\,p_D p_1, \quad 500\,p_D p_2, \quad 500\,p_D p_3$$
$$500\,p_R p_1, \quad 500\,p_R p_2, \quad 500\,p_R p_3$$
These involve the unknown marginal probabilities that must be estimated from the data. Referring to Table 12, the estimates are

$$\hat{p}_D = \frac{285}{500}, \qquad \hat{p}_R = \frac{215}{500}$$

$$\hat{p}_1 = \frac{202}{500}, \qquad \hat{p}_2 = \frac{150}{500}, \qquad \hat{p}_3 = \frac{148}{500}$$

Using these, the expected frequency for each cell of Table 12 is seen to be

$$\frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

For instance, in the first cell, we have

$$500\,\hat{p}_D \hat{p}_1 = 500 \times \frac{285}{500} \times \frac{202}{500} = \frac{285 \times 202}{500} = 115.14$$

Table 13(a) presents the observed cell frequencies along with the expected frequencies which are shown in parentheses. The quantities $(O - E)^2/E$ and the $\chi^2$ statistic are then calculated in Table 13(b).
Table 13(a) The Observed and Expected Cell Frequencies for the Data in Table 12

              Favor           Indifferent     Opposed
Democrat      138 (115.14)    83 (85.50)      64 (84.36)
Republican     64 (86.86)     67 (64.50)      84 (63.64)
Table 13(b) The Values of $(O - E)^2/E$

              Favor    Indifferent   Opposed
Democrat      4.539       .073        4.914
Republican    6.016       .097        6.514
                                               Total 22.153 = $\chi^2$
Having calculated the $\chi^2$ statistic, it now remains to determine its d.f. by invoking the properties we stated in Section 2. Because we have a single random sample, only property (b) is relevant to this problem. Since $p_D + p_R = 1$ and $p_1 + p_2 + p_3 = 1$, we have really estimated $1 + 2 = 3$ parameters. Hence,

d.f. of $\chi^2$ = (No. of cells) $- 1 -$ (No. of parameters estimated) $= 6 - 1 - 3 = 2$

Using the level of significance $\alpha = .05$, the tabulated upper 5% point of $\chi^2$ with d.f. $= 2$ is 5.99. Because the observed $\chi^2$ is larger than the tabulated value, the null hypothesis of independence is rejected at $\alpha = .05$. In fact, it would be rejected even for $\alpha = .01$.

An inspection of Table 13(b) reveals that large contributions to the value of $\chi^2$ have come from the corner cells. Moreover, comparing the observed and expected frequencies in Table 13(a) we see that support for the program draws more from the Democrats than Republicans.

From our analysis of the contingency table in Example 9, the procedure for testing independence in a general $r \times c$ contingency table is readily apparent. In fact, it is much the same as the test of homogeneity described in Section 3. The expected cell frequencies are determined in the same way, namely,
$$\text{Expected cell frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

and the test statistic is again

$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}$$

With regard to the d.f. of $\chi^2$ in the present case, we initially have $(rc - 1)$ d.f. because there are $rc$ cells into which a single random sample is classified. From this we must subtract the number of estimated parameters. This number is $(r - 1) + (c - 1)$ because there are $(r - 1)$ parameters among the row marginal probabilities and $(c - 1)$ parameters among the column marginal probabilities. Therefore,

$$\text{d.f. of } \chi^2 = (rc - 1) - (r - 1) - (c - 1) = (r - 1)(c - 1) = (\text{No. of rows} - 1) \times (\text{No. of columns} - 1)$$

which is identical with the d.f. of $\chi^2$ for testing homogeneity. In summary, the $\chi^2$ test statistic, its d.f., and the rejection region for testing independence are the same as when testing homogeneity. It is only the statement of the null hypothesis that is different between the two situations.
The Null Hypothesis of Independence

$H_0$: Each cell probability equals the product of the corresponding row and column marginal probabilities.
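To see the null hypothesis of independence at work, one can build each expected frequency directly as $n$ times the product of the estimated marginal probabilities and then count degrees of freedom as (No. of cells) $- 1 -$ (No. of parameters estimated). The Python sketch below does this for the data of Table 12. It is an illustration only, not part of the original text, and all variable names are hypothetical.

```python
observed = {"Democrat": [138, 83, 64],      # favor, indifferent, opposed
            "Republican": [64, 67, 84]}
n = sum(sum(counts) for counts in observed.values())

# Estimated marginal probabilities (sample proportions)
row_prob = {party: sum(counts) / n for party, counts in observed.items()}
col_prob = [sum(col) / n for col in zip(*observed.values())]

chi_square = 0.0
for party, counts in observed.items():
    for o, q in zip(counts, col_prob):
        e = n * row_prob[party] * q     # equals row total * column total / n
        chi_square += (o - e) ** 2 / e

# d.f. = (No. of cells) - 1 - (No. of parameters estimated)
cells = len(row_prob) * len(col_prob)
estimated = (len(row_prob) - 1) + (len(col_prob) - 1)
df = cells - 1 - estimated

print(round(chi_square, 2), df)
```

Writing the expected frequency as a product of marginal proportions makes the role of $H_0$ explicit, even though the arithmetic is identical to that used for the test of homogeneity.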
Spurious Dependence

When the $\chi^2$ test leads to a rejection of the null hypothesis of independence, we conclude that the data provide evidence of a statistical association between the two characteristics. However, we must refrain from making the hasty interpretation that these characteristics are directly related. A claim of causal relationship must draw from common sense, which statistical evidence must not be allowed to supersede.

Two characteristics may appear to be strongly related due to the common influence of a third factor that is not included in the study. In such cases, the dependence is called a spurious dependence. For instance, if a sample of individuals is classified in a $2 \times 2$ contingency table according to whether or not they are heavy drinkers and whether or not they suffer from respiratory trouble, we would probably find a high value for $\chi^2$ and would conclude that a strong statistical association exists between drinking habit and lung condition. But the reason for the association may be that most heavy drinkers are also heavy smokers and the smoking habit is a direct cause of respiratory trouble. This discussion should remind the reader of a similar warning given in Chapter 3 regarding the interpretation of a correlation coefficient between the two sets of measurements.
EXERCISES

4.1 Refer to the study of handedness reported on the cover page of this chapter. A sample of 408 children, who were adopted in early infancy, was classified according to their own handedness and the handedness of their adoptive parents.

                             Handedness of Adopted Offspring
Adoptive Parents             Right    Left    Total
Both right handed             307      48      355
At least one left handed       47       6       53
Total                         354      54      408
Test the null hypothesis of independence. Take $\alpha = .05$.

4.2 A survey was conducted to study people's attitude toward television programs which show violence. A random sample of 1250 adults was selected and classified according to sex and response to the question: Do you think there is a link between violence on TV and crime?
Response Not sure
Do the survey data show a significant difference between the attitudes of males and females?

4.3 The food supplies department circulates a questionnaire among U.S. armed forces personnel to gather information about preferences for various types of foods. One question is "Do you prefer black olives to green olives on the lunch menu?" A random sample of 435 respondents is classified according to both olive preference and geographical area of home, and the following data are recorded:
             Number Preferring    Total No. of
Region       Black Olives         Respondents
West              65                  118
East              59                  135
South             48                   90
Midwest           43                   92
Is there a significant variation of preference over the different geographical areas?

4.4 In a study of factors that regulate behavior, three kinds of subjects are identified: overcontrollers, average controllers, and undercontrollers, with the first group being most inhibited. Each subject is given the routine task of filling a box with buttons and all subjects are told they can stop whenever they wish. Whenever a subject indicates he or she wishes to stop, the experimenter asks "Don't you really want to continue?" The number of subjects in each group who stop and the number who continue are given in the following table:
Continue
Stop
Total
99 812 314
Do the three groups differ in their response to the experimenter's influence?
4.5 Using the computer. The analysis of a contingency table becomes tedious, especially when the table is large. Using a computer makes the task quite easy. We illustrate this by using MINITAB to analyze the data in Table 12, Example 9.
READ THE TABLE INTO C1 C2 C3
138 83 64
64 67 84
CHISQUARE ANALYSIS ON TABLE C1 C2 C3

EXPECTED FREQUENCIES ARE PRINTED BELOW OBSERVED FREQUENCIES

            C1       C2       C3    TOTALS
   1       138       83       64      285
         115.1     85.5     84.4
   2        64       67       84      215
          86.9     64.5     63.6
TOTALS     202      150      148      500

TOTAL CHI SQUARE =
      4.54 + 0.07 + 4.91 +
      6.02 + 0.10 + 6.51
    = 22.15

DEGREES OF FREEDOM = (2-1) X (3-1) = 2

(a) Compare this output with the calculations presented in Example 9. (b) Do Exercise 4.2 on a computer. (c) Do Exercise 4.3 on a computer.
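For readers without MINITAB, the same analysis can be sketched in a few lines of Python. This is only an illustration, not the book's procedure: the function name `chi_square_contingency` is made up for this sketch, and the counts are the two rows read into C1, C2, C3 above.

```python
def chi_square_contingency(table):
    """Return (expected, chi2, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected frequency for each cell: row total x column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Pearson's statistic: sum over all cells of (O - E)^2 / E.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df


observed = [[138, 83, 64], [64, 67, 84]]
expected, chi2, df = chi_square_contingency(observed)
print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 2), df)
```

The printed statistic and degrees of freedom can be checked against the MINITAB output (22.15 with 2 d.f.).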
KEY IDEAS AND FORMULAS
1. Pearson's $\chi^2$ Test for Goodness of Fit
Data: Observed cell frequencies $n_1, \ldots, n_k$ from a random sample of size $n$ classified into $k$ cells.
Null hypothesis specifies the cell probabilities
$$H_0\colon p_1 = p_{10}, \ldots, p_k = p_{k0}$$
Test statistic:
$$\chi^2 = \sum_{\text{cells}} \frac{(n_i - n p_{i0})^2}{n p_{i0}}, \qquad \text{d.f.} = k - 1$$
Rejection region:
$$\chi^2 \ge \chi^2_\alpha$$
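The goodness-of-fit statistic is easy to compute directly from its definition. The sketch below is illustrative only: the function name `goodness_of_fit` and the counts are hypothetical (they are not taken from any exercise), and the result would still have to be compared with a tabled $\chi^2$ point.

```python
def goodness_of_fit(counts, null_probs):
    """Pearson chi-square statistic and d.f. for observed cell counts."""
    n = sum(counts)
    chi2 = sum((obs - n * p) ** 2 / (n * p) for obs, p in zip(counts, null_probs))
    return chi2, len(counts) - 1


# Hypothetical data: 100 observations in 4 cells, H0 says all cells equally likely.
counts = [18, 22, 30, 30]
null_probs = [0.25, 0.25, 0.25, 0.25]
chi2, df = goodness_of_fit(counts, null_probs)
print(round(chi2, 2), df)
```

With these made-up counts each expected frequency is 25, so the statistic is (49 + 9 + 25 + 25)/25 = 4.32 with 3 d.f.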
2. Testing Homogeneity in an $r \times c$ Contingency Table
Data: Independent random samples from $r$ populations, each sample classified in $c$ response categories.
Null hypothesis: In each response category, the probabilities are the same for all the populations.
Test statistic:
$$\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}, \qquad \text{d.f.} = (r - 1)(c - 1)$$
where for each cell
$$O = \text{observed cell frequency}, \qquad E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Rejection region:
$$\chi^2 \ge \chi^2_\alpha$$
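3. Comparing Two Binomial Proportions: $2 \times 2$ Contingency Table
Data: $X$ = # successes in $n_1$ trials with $P(S) = p_1$
      $Y$ = # successes in $n_2$ trials with $P(S) = p_2$
(a) To test $H_0\colon p_1 = p_2$ vs. $H_1\colon p_1 \ne p_2$, either use the $\chi^2$ test given in 2 or use the Z-test:
$$Z = \frac{\hat p_1 - \hat p_2}{\sqrt{\hat p (1 - \hat p)}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \quad \text{with} \quad R\colon |Z| \ge z_{\alpha/2}$$
where
$$\hat p_1 = \frac{X}{n_1}, \qquad \hat p_2 = \frac{Y}{n_2}, \qquad \hat p = \frac{X + Y}{n_1 + n_2}$$
The two tests are equivalent: $Z^2 = \chi^2$.
(b) To test $H_0\colon p_1 = p_2$ vs. $H_1\colon p_1 > p_2$, use the Z-test with $R\colon Z \ge z_\alpha$. The $\chi^2$ test is not appropriate.
(c) A $100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$ is
$$(\hat p_1 - \hat p_2) \pm z_{\alpha/2} \sqrt{\frac{\hat p_1 (1 - \hat p_1)}{n_1} + \frac{\hat p_2 (1 - \hat p_2)}{n_2}}$$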
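The equivalence $Z^2 = \chi^2$ for a $2 \times 2$ table can be checked numerically. The following sketch uses hypothetical counts (60 successes in 100 trials versus 45 in 100) and illustrative function names; it is meant only to show the two formulas side by side.

```python
import math


def two_proportion_z(x, n1, y, n2):
    """Z statistic with the pooled estimate of the common proportion."""
    p1, p2 = x / n1, y / n2
    pooled = (x + y) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_square_2x2(x, n1, y, n2):
    """Pearson chi-square for the 2 x 2 table of successes and failures."""
    observed = [[x, n1 - x], [y, n2 - y]]
    row_totals = [n1, n2]
    col_totals = [x + y, (n1 - x) + (n2 - y)]
    grand_total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts, chosen only for illustration.
z = two_proportion_z(60, 100, 45, 100)
chi2 = chi_square_2x2(60, 100, 45, 100)
print(round(z, 3), round(z ** 2, 3), round(chi2, 3))
```

The squared Z value and the chi-square value agree up to floating-point rounding, which is the relation stated in (a).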
4. Testing Independence in an $r \times c$ Contingency Table
Data: A random sample of size $n$ is simultaneously classified with respect to two characteristics; one has $r$ categories and the other $c$ categories.
Null hypothesis: The two classifications are independent; that is, each cell probability is the product of the row and column marginal probabilities.
Test statistic and rejection region: Same as in 2.
5. Limitation
All inference procedures of this chapter require large samples. The $\chi^2$ tests are appropriate if no expected cell frequency is too small ($\ge 5$ is normally required).
5. EXERCISES 5.1 To examine the quality of a random number generator, frequency counts of the individual integers are recorded from an output of 500 integers. The concept of randomness implies that the integers 0, 1, . . . , 9 are equally likely. Based on the observed frequency counts, would you suspect any bias of the random number generator? Answer by performing the $\chi^2$ test.
Integer
Frequency
5.2 A large-scale nationwide poll is conducted to determine the public's attitude toward the abolition of capital punishment. The percentages in the various response categories are:

Strongly favor    Favor    Indifferent    Oppose    Strongly oppose
     20%           30%         20%          20%          10%

From a random sample of 100 law enforcement officers in a metropolitan area, the following frequency distribution is observed: Strongly favor 14
Favor 18
Indifferent 18
Oppose 26
Strongly oppose 24
Total 100
Do these data provide evidence that the attitude pattern of law enforcement officers differs significantly from the attitude pattern observed in the large-scale national poll?
5.3 Observations of 80 litters, each containing 3 rabbits, reveal the following frequency distribution of the number of male rabbits per litter:

Number of males in litter     0     1     2     3     Total
Number of litters            19    32    22     7       80

Under the model of Bernoulli trials for the sex of rabbits, the probability distribution of the number of males per litter should be binomial with 3 trials and $p$ = probability of a male birth. From these data, the parameter $p$ is estimated as
$$\hat p = \frac{\text{Total number of males in 80 litters}}{\text{Total number of rabbits in 80 litters}} = \frac{97}{240} \approx .4$$
(a) Using the binomial table for three trials and $p = .4$, determine the cell probabilities. (b) Perform the $\chi^2$ test for goodness of fit. (In determining the d.f., note that one parameter has been estimated from the data.) 5.4 Osteoporosis, or a loss of bone minerals, is a common cause of broken bones in the elderly. A researcher on aging conjectures that bone mineral loss can be reduced by regular physical therapy or by certain kinds of physical activity. A study is conducted on 200 elderly subjects of approximately the same age divided into control, physical therapy, and physical activity groups. After a suitable period of time, the nature of change in bone mineral content is observed.

                     Change in Bone Mineral
             Appreciable    Little    Appreciable
             Loss           Change    Increase       Total
Control          38           15           7           60
Therapy          22           32          16           70
Activity         15           30          25           70
Total            75           77          48          200

Do these data indicate that the change in bone mineral varies for the different groups? 5.5 An investigation is conducted to determine if the exposure of women to atomic fallout influences the rate of birth defects. A sample of 500 children born to mothers who were exposed to the atomic explosion in Hiroshima is to be studied. A sample of 400 children from a Japanese island far removed from the site of the atomic explosion forms the control group. Suppose that the following data for the incidence of birth defects are obtained:

                           Birth Defect
                      Present    Absent    Total
Mother exposed           84        416      500
Mother not exposed       43        357      400
Total                   127        773      900

Do these data indicate that there is a different rate of incidence of birth defects for the two groups of mothers? Use the $\chi^2$ test for homogeneity in a contingency table. 5.6 Refer to the data in Exercise 5.5. (a) Use the Z-test for testing the equality of two population proportions with a two-sided alternative. Verify the relation $\chi^2 = Z^2$ by comparing their numerical values. (b) If the alternative is that the incidence rate is higher for exposed mothers, which of the two tests should be used? 5.7 Given the data
$$n_1 = 200, \qquad \hat p_1 = \frac{160}{200} = .8$$
$$n_2 = 200, \qquad \hat p_2 = \frac{80}{200} = .4$$
(a) Find the 95% confidence interval for $p_1 - p_2$. (b) Calculate the Z for testing $H_0\colon p_1 = p_2$.
5.8 Referring to the data in Exercise 5.7, test $H_0\colon p_1 = p_2$ against $H_1\colon p_1 \ne p_2$ with $\alpha = .05$. 5.9 A sample of 100 females was collected from ethnic group A and a sample of 100 was collected from ethnic group B. Each female was asked: "Did you get married before you were 19?" The following counts were obtained:
(a) Test for equality of two proportions against a two-sided alternative. Take $\alpha = .05$. (b) Establish a 95% confidence interval for the difference $p_A - p_B$. 5.10 An antibiotic for pneumonia was injected into 100 patients with kidney malfunctions (called uremic patients) and into 100 patients with no kidney malfunctions (called normal patients). Some allergic reaction developed in 38 of the uremic patients and in 21 of the normal patients. (a) Do the data provide strong evidence that the rate of incidence of allergic reaction to the antibiotic is higher in uremic patients than it is in normal patients? (b) Construct a 95% confidence interval for the difference between the population proportions. 5.11 An experiment is conducted to compare the viability of seeds with and without a cathodic protection, which consists of subjecting the seeds to a negatively charged conductor. Seeds of a common type are randomly divided into two batches of 250 each. One batch is given cathodic protection, and the other is retained as the control group. Both batches are then subjected to a common high temperature to induce artificial aging. Subsequently, all the seeds are soaked in water and left to germinate. It is found that }S% of the control seeds and 10% of the cathodically protected seeds fail to germinate. (a) Do the data provide strong evidence that the cathodic protection permits a higher germination rate in seeds subjected to artificial aging? (b) Construct a 98% confidence interval for the difference $p_T - p_C$, where $p_T$ and $p_C$ represent the germination rates of cathodically treated seeds and control seeds, respectively. 5.12 A medical researcher conjectures that heavy smoking can result in wrinkled skin around the eyes. The smoking habits as well as the presence of prominent wrinkles around the eyes are recorded for a random sample of 500 persons. The following frequency table is obtained:
Prominent Wrinkles
Wrinkles Not Prominent
Heavy smoker Light or nonsmoker
(a) Do these data substantiate an association between wrinkle formation and smoking habit?
(b) If the null hypothesis that wrinkle formation is independent of smoking habit is rejected after an analysis of the data, can the researcher readily conclude that heavy smoking causes wrinkles around the eyes? Why or why not? 5.13 Based on interviews of couples seeking divorces, a social worker compiles the following data related to the period of acquaintanceship before marriage and the duration of marriage:

Acquaintanceship            Duration of Marriage
before marriage          ≤ 4 years    > 4 years    Total
Under ½ year                11            8          19
½–1½ years                  28           24          52
Over 1½ years               21           19          40
Total                       60           51         111

Perform a test to determine if the data substantiate an association between the stability of a marriage and the period of acquaintanceship prior to marriage. 5.14 By polling a random sample of 350 undergraduate students, a campus press obtains the following frequency counts regarding student attitude toward a proposed change in dormitory regulations:

           Favor    Indifferent    Oppose    Total
Male         93         72           21       186
Female       55         79           30       164
Total       148        151           51       350

Does the proposal seem to appeal differently to male and female students? 5.15 In a study of possible genetic influence of parental hand preference (refer to the cover page of this chapter), a sample of 400 children was classified according to their own handedness and the handedness of their biological parents.

Parents' Handedness        Handedness of Biological Offspring
(Father x Mother)          Right      Left     Total
Right x Right               303        37       340
Right x Left                 29         9        38
Left x Right                 16         6        22
Total                       348        52       400

Do these findings demonstrate an association between the handedness of parents and their biological offspring?
5.16 A major clinical trial of a new vaccine for type B hepatitis was conducted with a high risk group of 1083 male volunteers. From this group, 549 men were injected with a placebo. A follow-up of all these individuals yielded the data:
Follow-up
Got Hepatitis
Did Not Get Hepatitis
Vaccine Placebo
(a) Do these observations testify that the vaccine is effective? (b) Construct a 95% confidence interval for the difference between the incidence rates of hepatitis among the vaccinated and nonvaccinated individuals in the high risk group. 5.17 The popular disinfectant Listerine is named after Joseph Lister, a British physician who pioneered the use of antiseptics. Lister conjectured that human infections might have an organic origin and thus could be prevented by using a disinfectant. Over a period of several years he performed 75 amputations: 40 using carbolic acid as a disinfectant and 35 without any disinfectant. The following results were obtained.
Patient Survived With carbolic acid Without carbolic acid
(a) Using the $\chi^2$ test, determine if the proportion that survive is significantly different between the two groups. (b) Use the Z statistic to test the null hypothesis of no difference against the one-sided alternative that the proportion that survive is higher for operations done with carbolic acid. 5.18 Pooling contingency tables can produce spurious association. A large organization is being investigated to determine if their recruitment is sex biased. Table (i) and Table (ii), respectively, show the classification of applicants for secretarial and for sales positions according to sex and result of interview. Table (iii) is an aggregation of the corresponding entries of Table (i) and Table (ii).
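
Table (i) Secretarial Positions
           Offered    Denied    Total
Male          25         50       75
Female        75        150      225
Total        100        200      300

Table (ii) Sales Positions
           Offered    Denied    Total
Male         150         50      200
Female        75         25      100
Total        225         75      300

Table (iii) Secretarial and Sales Positions
           Offered    Denied    Total
Male         175        100      275
Female       150        175      325
Total        325        275      600
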
(a) Verify that the $\chi^2$ statistic for testing independence is zero for each of the data sets given in Table (i) and Table (ii). (b) For the pooled data given in Table (iii), compute the value of the $\chi^2$ statistic and test the null hypothesis of independence. (c) Explain the paradoxical result that there is no sex bias in any job category but the combined data indicate sex discrimination.
CHAPTER 15
Analysis of Variance (ANOVA)
1. INTRODUCTION
2. COMPARISON OF SEVERAL TREATMENTS: THE COMPLETELY RANDOMIZED DESIGN
3. POPULATION MODEL AND INFERENCES FOR A COMPLETELY RANDOMIZED DESIGN
4. SIMULTANEOUS CONFIDENCE INTERVALS
5. GRAPHICAL DIAGNOSTICS AND DISPLAYS TO SUPPLEMENT ANOVA
Which brand has the clearest picture? The analysis of variance allows us to compare several brands.
1. INTRODUCTION
In Chapter 11, we introduced methods for comparing two population means. When several means must be compared, more general methods are required. We now become acquainted with the powerful technique called analysis of variance (ANOVA) that allows us to analyze and interpret observations from several populations. This versatile statistical tool partitions the total variation in a data set according to the sources of variation that are present. In the context of comparing k population means, the two sources of variation are (1) differences between means and (2) within-population variation (error). We restrict our discussion to this case, although ANOVA techniques apply to much more complex situations.
2. COMPARISON OF SEVERAL TREATMENTS: THE COMPLETELY RANDOMIZED DESIGN
It is usually more expedient, in terms of both time and expense, to compare several treatments simultaneously than it is to conduct several comparative trials two at a time. The term completely randomized design is synonymous with independent random sampling from several populations when each population is identified as the population of responses under a particular treatment. Let treatment 1 be applied to $n_1$ experimental units, treatment 2 to $n_2$ units, . . . , treatment k to $n_k$ units. In a completely randomized design, $n_1$ experimental units selected at random from the available collection of $n = n_1 + n_2 + \cdots + n_k$ units are to receive treatment 1, $n_2$ units randomly selected from the remaining lot are to receive treatment 2, and proceeding in this manner, treatment k is to be applied to the remaining $n_k$ units. The special case of this design for a comparison of k = 2 treatments has already been discussed in Section 2 of Chapter 11. The data structure for the response measurements can be represented by the format shown in Table 1, where $y_{ij}$ is the jth observation on treatment i. The summary statistics appear in the last two columns.
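
Table 1 Data Structure for the Completely Randomized Design with k Treatments

               Observations                          Mean          Sum of Squares
Treatment 1    $y_{11}, y_{12}, \ldots, y_{1n_1}$    $\bar y_1$    $\sum_{j=1}^{n_1} (y_{1j} - \bar y_1)^2$
Treatment 2    $y_{21}, y_{22}, \ldots, y_{2n_2}$    $\bar y_2$    $\sum_{j=1}^{n_2} (y_{2j} - \bar y_2)^2$
  . . .
Treatment k    $y_{k1}, y_{k2}, \ldots, y_{kn_k}$    $\bar y_k$    $\sum_{j=1}^{n_k} (y_{kj} - \bar y_k)^2$

$$\text{Grand mean } \bar y = \frac{\text{sum of all observations}}{n_1 + n_2 + \cdots + n_k} = \frac{n_1 \bar y_1 + \cdots + n_k \bar y_k}{n_1 + \cdots + n_k}$$
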
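The random allocation of units to treatments in a completely randomized design can be sketched as follows. This is an illustration under stated assumptions, not a procedure from the text: the function name, the unit labels 1 to 22, and the group sizes are hypothetical. Shuffling the units once and cutting the shuffled list into consecutive blocks is equivalent to drawing each treatment group at random from the units that remain.

```python
import random


def completely_randomized_design(units, sizes, seed=None):
    """Randomly assign units to treatments 1..k with the given group sizes."""
    if sum(sizes) != len(units):
        raise ValueError("group sizes must add up to the number of units")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)

    assignment = {}
    start = 0
    for treatment, size in enumerate(sizes, start=1):
        # Treatment i receives the next n_i units of the shuffled list.
        assignment[treatment] = shuffled[start:start + size]
        start += size
    return assignment


# Hypothetical experiment: 22 units split into groups of 5, 4, 7, and 6.
units = list(range(1, 23))
assignment = completely_randomized_design(units, [5, 4, 7, 6], seed=1)
for treatment, group in assignment.items():
    print(treatment, sorted(group))
```

Fixing `seed` only makes the illustration reproducible; in an actual experiment the allocation would be left to chance.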
Before proceeding with the general case of k treatments, it would be instructive to explain the reasoning behind the analysis of variance and the associated calculations in terms of a numerical example.
EXAMPLE 1
In an effort to improve the quality of recording tapes, the effects of four kinds of coatings A, B, C, D on the reproducing quality of sound are compared. Suppose that the measurements of sound distortion given in Table 2 are obtained from tapes treated with the four coatings.

Table 2 Sound Distortions Obtained with Four Types of Coating

Coating    Observations                     Mean               Sum of Squares
A          10, 15, 8, 12, 15                $\bar y_1 = 12$    $\sum_{j=1}^{5} (y_{1j} - \bar y_1)^2 = 38$
B          14, 18, 21, 15                   $\bar y_2 = 17$    $\sum_{j=1}^{4} (y_{2j} - \bar y_2)^2 = 30$
C          17, 16, 14, 15, 17, 15, 18       $\bar y_3 = 16$    $\sum_{j=1}^{7} (y_{3j} - \bar y_3)^2 = 12$
D          12, 15, 17, 15, 16, 15           $\bar y_4 = 15$    $\sum_{j=1}^{6} (y_{4j} - \bar y_4)^2 = 14$

Grand mean $\bar y = 15$

Two questions immediately come to mind. Does any significant difference exist among the mean distortions obtained using the four coatings? Can we establish confidence intervals for the mean differences between coatings?
An analysis of the results essentially consists of decomposing the observations into contributions from different sources. Reasoning that the deviation of an individual observation from the grand mean, $y_{ij} - \bar y$, is partly due to differences among the mean qualities of the coatings and partly due to random variation in measurements within the same group, the following decomposition is suggested:

Observation = (Grand mean) + (Deviation due to treatment) + (Residual)
$$y_{ij} = \bar y + (\bar y_i - \bar y) + (y_{ij} - \bar y_i)$$
For the data given in Table 2, the decomposition of all the observations can be presented in the form of the following arrays:

Observations $y_{ij}$
   10  15   8  12  15
   14  18  21  15
   17  16  14  15  17  15  18
   12  15  17  15  16  15

= Grand Mean $\bar y$
   15  15  15  15  15
   15  15  15  15
   15  15  15  15  15  15  15
   15  15  15  15  15  15

+ Treatment Effects $(\bar y_i - \bar y)$
   -3  -3  -3  -3  -3
    2   2   2   2
    1   1   1   1   1   1   1
    0   0   0   0   0   0

+ Residuals $(y_{ij} - \bar y_i)$
   -2   3  -4   0   3
   -3   1   4  -2
    1   0  -2  -1   1  -1   2
   -3   0   2   0   1   0

For instance, the upper left-hand entries of the arrays show that
$$10 = 15 + (-3) + (-2)$$
$$y_{11} = \bar y + (\bar y_1 - \bar y) + (y_{11} - \bar y_1)$$
If there is really no difference in the mean distortions obtained using the four tape coatings, we can expect the entries of the second array on the right-hand side of the equation, whose terms are $(\bar y_i - \bar y)$, to be close to zero. As an overall measure of the amount of variation due to differences in the treatment means, we calculate the sum of squares of all the entries in this array, or
$$\underbrace{(-3)^2 + \cdots + (-3)^2}_{n_1 = 5} + \underbrace{2^2 + \cdots + 2^2}_{n_2 = 4} + \underbrace{1^2 + \cdots + 1^2}_{n_3 = 7} + \underbrace{0^2 + \cdots + 0^2}_{n_4 = 6} = 5(-3)^2 + 4(2)^2 + 7(1)^2 + 6(0)^2 = 68$$
Thus the sum of squares due to differences in the treatment means, also called the treatment sum of squares, is given by
$$\text{Treatment Sum of Squares} = \sum_{i=1}^{4} n_i (\bar y_i - \bar y)^2 = 68$$
The last array consists of the entries $(y_{ij} - \bar y_i)$ that are the deviations of individual observations from the corresponding treatment mean. These deviations reflect inherent variabilities in the experimental material and the measuring device and are called the residuals. The overall variation due to random errors is measured by the sum of squares of all these residuals
$$(-2)^2 + 3^2 + (-4)^2 + \cdots + 1^2 + 0^2 = 94$$
Thus we obtain
$$\text{Residual Sum of Squares} = \sum_{i=1}^{4} \sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2 = 94$$
The double summation indicates that the elements are summed within each row and then over different rows. Alternatively, referring to the last column in Table 2 we obtain
$$\text{Residual Sum of Squares} = \sum_{j=1}^{5} (y_{1j} - \bar y_1)^2 + \sum_{j=1}^{4} (y_{2j} - \bar y_2)^2 + \sum_{j=1}^{7} (y_{3j} - \bar y_3)^2 + \sum_{j=1}^{6} (y_{4j} - \bar y_4)^2 = 38 + 30 + 12 + 14 = 94$$
Finally, the deviations of individual observations from the grand mean $(y_{ij} - \bar y)$ are given by the array

   -5   0  -7  -3   0
   -1   3   6   0
    2   1  -1   0   2   0   3
   -3   0   2   0   1   0

The total variation present in the data is measured by the sum of squares of all these deviations
$$\text{Total Sum of Squares} = \sum_{i=1}^{4} \sum_{j=1}^{n_i} (y_{ij} - \bar y)^2 = (-5)^2 + 0^2 + (-7)^2 + \cdots + 0^2 = 162$$
Note that the total sum of squares is the sum of the treatment sum of squares and the residual sum of squares. It is time to turn our attention to another property of this decomposition, the degrees of freedom associated with the sums of squares. In general terms:

(Degrees of freedom associated with a sum of squares) = (Number of elements whose squares are summed) - (Number of linear constraints satisfied by the elements)

In our present example, the treatment sum of squares is the sum of four terms $n_1(\bar y_1 - \bar y)^2 + n_2(\bar y_2 - \bar y)^2 + n_3(\bar y_3 - \bar y)^2 + n_4(\bar y_4 - \bar y)^2$, where the elements satisfy the single constraint
$$n_1(\bar y_1 - \bar y) + n_2(\bar y_2 - \bar y) + n_3(\bar y_3 - \bar y) + n_4(\bar y_4 - \bar y) = 0$$
This equality holds because the grand mean $\bar y$ is a weighted average of the treatment means, or
$$\bar y = \frac{n_1 \bar y_1 + n_2 \bar y_2 + n_3 \bar y_3 + n_4 \bar y_4}{n_1 + n_2 + n_3 + n_4}$$
Consequently, the number of degrees of freedom associated with the treatment sum of squares is 4 - 1 = 3. To determine the degrees of freedom for the residual sum of squares, we note that the entries $(y_{ij} - \bar y_i)$ in each row of the residual array sum to zero and that there are 4 rows. The number of degrees of freedom for the residual sum of squares is then $(n_1 + n_2 + n_3 + n_4) - 4 = 22 - 4 = 18$. Finally, the number of degrees of freedom for the total sum of squares is $(n_1 + n_2 + n_3 + n_4) - 1 = 22 - 1 = 21$, because the 22 entries $(y_{ij} - \bar y)$ whose squares are summed satisfy the single constraint that their total is zero. Note that the degrees of freedom for the total sum of squares is the sum of the degrees of freedom for treatment and error. We summarize the calculations thus far in Table 3.

Table 3
Source       Sum of Squares    d.f.
Treatment          68             3
Error              94            18
Total             162            21

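The sums of squares and degrees of freedom of Example 1 can be reproduced by following the decomposition literally. The sketch below is illustrative; the function name `anova_decomposition` and the dictionary layout are choices made here, while the data are the coating measurements of Table 2.

```python
def anova_decomposition(groups):
    """Return {source: (sum of squares, d.f.)} for a one-way layout."""
    all_obs = [y for obs in groups.values() for y in obs]
    n, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    means = {name: sum(obs) / len(obs) for name, obs in groups.items()}

    # Treatment SS: n_i times the squared treatment effect, summed over groups.
    treatment_ss = sum(
        len(obs) * (means[name] - grand_mean) ** 2 for name, obs in groups.items()
    )
    # Residual SS: squared deviations from each observation's own group mean.
    residual_ss = sum(
        (y - means[name]) ** 2 for name, obs in groups.items() for y in obs
    )
    # Total SS: squared deviations from the grand mean.
    total_ss = sum((y - grand_mean) ** 2 for y in all_obs)

    return {
        "Treatment": (treatment_ss, k - 1),
        "Error": (residual_ss, n - k),
        "Total": (total_ss, n - 1),
    }


coatings = {
    "A": [10, 15, 8, 12, 15],
    "B": [14, 18, 21, 15],
    "C": [17, 16, 14, 15, 17, 15, 18],
    "D": [12, 15, 17, 15, 16, 15],
}
table = anova_decomposition(coatings)
for source, (ss, df) in table.items():
    print(source, ss, df)
```

The three computed sums of squares are 68, 94, and 162 with 3, 18, and 21 d.f., and the treatment and error rows add up to the total row in both columns.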
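Guided by this numerical example, we now present the general formulas for the analysis of variance for a comparison of k treatments, using the data structure given in Table 1. Beginning with the basic decomposition
$$(y_{ij} - \bar y) = (\bar y_i - \bar y) + (y_{ij} - \bar y_i)$$
and squaring each side of the equation, we obtain
$$(y_{ij} - \bar y)^2 = (\bar y_i - \bar y)^2 + (y_{ij} - \bar y_i)^2 + 2(\bar y_i - \bar y)(y_{ij} - \bar y_i)$$
When summed over $j = 1, \ldots, n_i$, the last term on the right-hand side of this equation reduces to zero due to the relation $\sum_{j=1}^{n_i} (y_{ij} - \bar y_i) = 0$. Therefore, summing each side of the preceding relation over $j = 1, \ldots, n_i$ and $i = 1, \ldots, k$ provides the decomposition
$$\underbrace{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar y)^2}_{\text{Total SS}} = \underbrace{\sum_{i=1}^{k} n_i (\bar y_i - \bar y)^2}_{\text{Treatment SS}} + \underbrace{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2}_{\text{Residual SS or error SS}}$$
$$\text{d.f.} = \sum_{i=1}^{k} n_i - 1 \qquad\qquad \text{d.f.} = k - 1 \qquad\qquad \text{d.f.} = \sum_{i=1}^{k} n_i - k$$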
It is customary to present the decomposition of the sum of squares and the degrees of freedom in a tabular form called the analysis of variance table, abbreviated as the ANOVA table. This table contains the additional column for the mean square associated with a component, which is defined as
$$\text{Mean square} = \frac{\text{Sum of squares}}{\text{d.f.}}$$
The ANOVA table for comparing k treatments appears in Table 4.

Table 4 ANOVA Table for Comparing k Treatments

Source       Sum of Squares                                                              d.f.                        Mean Square
Treatment    $SS_T = \sum_{i=1}^{k} n_i (\bar y_i - \bar y)^2$                           $k - 1$                     $MS_T = SS_T / (k - 1)$
Error        $SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2$               $\sum_{i=1}^{k} n_i - k$    $MSE = SSE / (\sum_{i=1}^{k} n_i - k)$
Total        $\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar y)^2$                       $\sum_{i=1}^{k} n_i - 1$

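Guide to Hand Calculation
When performing an ANOVA on a calculator, it is convenient to express the sums of squares in an alternative form. These employ the treatment totals
$$T_i = \sum_{j=1}^{n_i} y_{ij} = \text{sum of all responses under treatment } i$$
$$T = \sum_{i=1}^{k} T_i = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}$$
to calculate the sums of squares
$$\text{Total SS} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{T^2}{n} \quad \text{where } n = \sum_{i=1}^{k} n_i$$
$$SS_T = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - \frac{T^2}{n}$$
$$SSE = \text{Total SS} - SS_T$$
Notice that the SSE can be obtained by subtraction.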
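EXAMPLE 2
To illustrate this alternative calculation with the data of Example 1, we first calculate
$$T_1 = 10 + 15 + 8 + 12 + 15 = 60, \qquad n_1 = 5$$
$$T_2 = 14 + 18 + 21 + 15 = 68, \qquad n_2 = 4$$
$$T_3 = 17 + 16 + 14 + 15 + 17 + 15 + 18 = 112, \qquad n_3 = 7$$
$$T_4 = 12 + 15 + 17 + 15 + 16 + 15 = 90, \qquad n_4 = 6$$
and
$$T = T_1 + T_2 + T_3 + T_4 = 60 + 68 + 112 + 90 = 330$$
$$n = n_1 + n_2 + n_3 + n_4 = 5 + 4 + 7 + 6 = 22$$
Since
$$\sum_{i=1}^{4} \sum_{j=1}^{n_i} y_{ij}^2 = (10)^2 + (15)^2 + \cdots + (16)^2 + (15)^2 = 5112$$
we obtain
$$\text{Total SS} = 5112 - \frac{(330)^2}{22} = 5112 - 4950 = 162$$
$$SS_T = \frac{(60)^2}{5} + \frac{(68)^2}{4} + \frac{(112)^2}{7} + \frac{(90)^2}{6} - \frac{(330)^2}{22} = 5018 - 4950 = 68$$
$$SSE = \text{Total SS} - SS_T = 162 - 68 = 94$$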
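The hand-calculation formulas translate directly into code. The sketch below is an illustration (the function name `sums_of_squares_from_totals` is hypothetical) that applies the treatment-total form to the coating data of Example 1 and obtains the error sum of squares by subtraction.

```python
def sums_of_squares_from_totals(groups):
    """Total SS, treatment SS, and SSE computed from treatment totals."""
    n = sum(len(obs) for obs in groups)
    totals = [sum(obs) for obs in groups]          # T_i
    grand_total = sum(totals)                      # T
    correction = grand_total ** 2 / n              # T^2 / n

    total_ss = sum(y ** 2 for obs in groups for y in obs) - correction
    treatment_ss = sum(t ** 2 / len(obs) for t, obs in zip(totals, groups)) - correction
    error_ss = total_ss - treatment_ss             # obtained by subtraction
    return total_ss, treatment_ss, error_ss


coatings = [
    [10, 15, 8, 12, 15],
    [14, 18, 21, 15],
    [17, 16, 14, 15, 17, 15, 18],
    [12, 15, 17, 15, 16, 15],
]
total_ss, treatment_ss, error_ss = sums_of_squares_from_totals(coatings)
print(total_ss, treatment_ss, error_ss)
```

The results, 162, 68, and 94, agree with the values found earlier from the arrays of deviations.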
EXERCISES 2.1 (a) Obtain the arrays that show a decomposition for the following observations. (b) Find the sum of squares for each array. (c) Determine the degrees of freedom for each sum of squares. (d) Summarize by an ANOVA table.

Treatment    Observations
1            35, 24, 29, 21
2            19, 14, 14, 13
3            21, 16, 21, 14

2.2 Repeat Exercise 2.1 for the data

Treatment    Observations
1            5, 3, 2, 2
2            5, 0, 1
3            2, 1, 0, 1

2.3 Use the relations for sums of squares and d.f. to complete the following ANOVA table.
Source
Sum of Squares
d.f.
Treatment Error
35
Total
92
30
2.4 Provide a decomposition of the following observations and obtain the ANOVA table.
Treatment
Observations L, 1,3 l, 5 9,5,5,4 3,4,5
I 2 3 4
2 . 5 Given the summ aty statistics from three samples: 10
1)s?:
nr:
1 0 , Yr _ 5
(nr
n2
f,
_2
(nz
: 1)s?
n3
Vs : 7
(na
t)s3 : )gri
-30
-r ':ut
v)'
Z(vri 9
-
_16
Vr), _ 2 5
i:l
Create th e ANOVA table.
3. POPULATION MODEL AND INFERENCES FOR A COMPLETELY RANDOMIZED DESIGN
To implement a formal statistical test for no difference among treatment effects, we need to have a population model for the experiment. To this end, we assume that the response measurements with the ith treatment constitute a random sample from a normal population with a mean of $\mu_i$ and a common variance of $\sigma^2$. The samples are assumed to be mutually independent.
Population Model for Comparing k Treatments
$$Y_{ij} = \mu_i + e_{ij}, \qquad j = 1, \ldots, n_i \text{ and } i = 1, \ldots, k$$
where $\mu_i$ = ith treatment mean and the $e_{ij}$ are all independently distributed as $N(0, \sigma)$.
F Distribution
The null hypothesis that no difference exists among the k population means can now be phrased
$$H_0\colon \mu_1 = \mu_2 = \cdots = \mu_k$$
The alternative hypothesis is that not all the $\mu_i$'s are equal. Seeking a criterion to test the null hypothesis, we observe that when the population means are all equal, $(\bar y_i - \bar y)$ is expected to be small, and consequently, the treatment mean square $\sum n_i (\bar y_i - \bar y)^2 / (k - 1)$ is expected to be small. On the other hand, it is likely to be large when the means differ markedly. The residual mean square, which provides an estimate of $\sigma^2$, can be used as a yardstick for determining how large a treatment mean square should be before it indicates significant differences. Statistical distribution theory tells us that under $H_0$ the ratio
$$F = \frac{\text{Treatment mean square}}{\text{Residual mean square}} = \frac{\text{Treatment SS}/(k - 1)}{\text{Residual SS}/(\sum n_i - k)}$$
has an F distribution with d.f. = $(k - 1, n - k)$, where $n = \sum n_i$. Notice that an F distribution is specified in terms of its numerator degrees of freedom $\nu_1 = k - 1$ and denominator degrees of freedom $\nu_2 = n - k$. We denote
$$F_\alpha(\nu_1, \nu_2) = \text{upper } \alpha \text{ point of the } F \text{ with } (\nu_1, \nu_2) \text{ d.f.}$$
The upper $\alpha = .05$ and $\alpha = .10$ points are given in Appendix B, Table 5 for several pairs of d.f. With $\nu_1 = 7$ and $\nu_2 = 15$, for $\alpha = .05$, we read from column $\nu_1 = 7$ and row $\nu_2 = 15$ to obtain $F_{.05}(7, 15) = 2.71$ (see Table 5).
Table 5 Percentage Points of $F(\nu_1, \nu_2)$ Distributions

We summarize the F test introduced above.

F Test for Equality of Means
Reject $H_0\colon \mu_1 = \mu_2 = \cdots = \mu_k$ if
$$F = \frac{\text{Treatment SS}/(k - 1)}{\text{SSE}/(n - k)} \ge F_\alpha(k - 1, n - k)$$
where $n = \sum_{i=1}^{k} n_i$ and $F_\alpha(k - 1, n - k)$ is the upper $\alpha$ point of the F distribution with d.f. = $(k - 1, n - k)$.

The computed value of the F ratio is usually presented in the last column of the ANOVA table.
EXAMPLE 3
Construct the ANOVA table for the data given in Example 1 concerning a comparison of four tape coatings. Test the null hypothesis that the means are equal. Use $\alpha = .05$.
Using our earlier calculations for the component sums of squares, we construct the ANOVA table that appears in Table 6. A test of the null hypothesis $H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_4$ is performed by comparing the observed F value 4.34 with the tabulated value of F with d.f. = (3, 18). At a .05 level of significance, the tabulated value is found to be 3.16. Because this is exceeded by the observed value, we conclude that there is a significant difference among the four tape qualities.

Table 6 ANOVA Table for the Data Given in Example 1

Source       Sum of Squares    d.f.    Mean Square    F-ratio
Treatment          68            3        22.67       22.67/5.22 = 4.34
Error              94           18         5.22
Total             162           21

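The mean squares and the F ratio of Table 6 follow from the sums of squares and degrees of freedom by simple division. In this sketch the function name `f_ratio` is hypothetical, and the critical value 3.16 is the tabulated $F_{.05}(3, 18)$ quoted in Example 3 rather than something the code computes.

```python
def f_ratio(treatment_ss, treatment_df, error_ss, error_df):
    """Return (treatment mean square, error mean square, F ratio)."""
    ms_treatment = treatment_ss / treatment_df
    ms_error = error_ss / error_df
    return ms_treatment, ms_error, ms_treatment / ms_error


ms_treatment, ms_error, f_value = f_ratio(68, 3, 94, 18)
critical_value = 3.16  # tabulated F_.05(3, 18), taken from the text

print(round(ms_treatment, 2), round(ms_error, 2), round(f_value, 2))
print("reject H0" if f_value >= critical_value else "do not reject H0")
```

The rounded values are 22.67, 5.22, and 4.34, and the observed F exceeds the tabulated point, so the null hypothesis of equal means is rejected at the .05 level.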
EXERCISES 3.1 Using the table of percentage points for the F distribution, find: (a) The upper 5% point when $\nu_1$ (b) The upper 5% point when $\nu_1 = 10$ and $\nu_2 = 5$. 3.2 Using Appendix B, Table 6, find the upper 10% point of F for: (a) d.f. = (3, 5) (c) d.f. = (3, 15) (d) d.f. = (3, 80). (e) What effect does increasing the denominator d.f. have? 3.3 Given the following ANOVA table:

Source       Sum of Squares    d.f.
Treatment         104             5
Error             109            20
Total             213            25

Carry out the F test for equality of means taking $\alpha = .10$. 3.4 Given the following ANOVA table:

Source       Sum of Squares    d.f.
Treatment          24             5
Error              57            35
Total              81            40

Carry out the F test for equality of means taking $\alpha = .05$. 3.5 Using the data from Exercise 2.1, test for equality of means using $\alpha = .05$. 3.6 Test for equality of means based on the data in Exercise 2.2. Take $\alpha = .05$. 3.7 Three bread recipes are to be compared with respect to density of the loaf. Five loaves will be baked using each recipe. (a) If one loaf is made and baked at a time, how would you select the order? (b) Given the following data, conduct an $\alpha = .05$ F test for equality of means.

Recipe    Observations
1         .95, .86, .71, .72, .74
2         .71, .85, .52, .72, .64
3         .69, .68, .51, .73, .44

4. SIMULTANEOUS CONFIDENCE INTERVALS
The ANOVA F test is only the initial step in our analysis. It determines if significant differences exist among the treatment means. Our goal should be more than to merely conclude that treatment differences are indicated by the data. Rather, we must detect likenesses and differences among the treatments. Thus the problem of estimating differences in treatment means is of even greater importance than the overall F test. Referring to the comparison of k treatments using the data structure given in Table 1, let us examine how a confidence interval can be established for $\mu_1 - \mu_2$, the mean difference between treatment 1 and treatment 2. The statistic
$$T = \frac{(\bar y_1 - \bar y_2) - (\mu_1 - \mu_2)}{s \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
has a t distribution with d.f. = $n - k$, and this can be employed to construct a confidence interval for $\mu_1 - \mu_2$. More generally:
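Confidence Interval for a Single Difference
A $100(1 - \alpha)\%$ confidence interval for $(\mu_i - \mu_{i'})$, the mean difference between treatment $i$ and treatment $i'$, is given by
$$(\bar y_i - \bar y_{i'}) \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_{i'}}}$$
where
$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - k}}$$
and $t_{\alpha/2}$ is the upper $\alpha/2$ point of t with d.f. = $n - k$.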
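As a sketch of how the interval is evaluated, the code below applies the formula to coatings A and B of Example 1 (means 12 and 17, sample sizes 5 and 4, SSE = 94 with 18 d.f.). The function name is hypothetical, and the percentage point $t_{.025} = 2.101$ for 18 d.f. is an assumption taken from a standard t table; it is not quoted in this section.

```python
import math


def single_difference_interval(mean_i, mean_j, n_i, n_j, sse, error_df, t_point):
    """Confidence interval for mu_i - mu_j using the pooled estimate s = sqrt(MSE)."""
    s = math.sqrt(sse / error_df)
    half_width = t_point * s * math.sqrt(1 / n_i + 1 / n_j)
    diff = mean_i - mean_j
    return diff - half_width, diff + half_width


# Coatings A and B from Example 1; t_point = 2.101 is read from a t table (18 d.f.).
lower, upper = single_difference_interval(12, 17, 5, 4, sse=94, error_df=18, t_point=2.101)
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the 95% interval for $\mu_1 - \mu_2$ is roughly (-8.22, -1.78), which does not contain zero.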
If the F test first shows a significant difference in means, then some statisticians feel that it is reasonable to compare means pairwise according to the preceding intervals. However, many statisticians prefer a more conservative procedure based on the following reasoning. Without the proviso that the F test is significant, the preceding method provides individual confidence intervals for pairwise differences. For instance, with k = 4 treatments there are $\binom{4}{2} = 6$ pairwise differences $(\mu_i - \mu_{i'})$, and this procedure applied to all pairs yields six confidence statements, each having a $100(1 - \alpha)\%$ level of confidence. It is difficult to determine what level of confidence will be achieved for claiming that all six of these statements are correct. To overcome this dilemma, procedures have been developed for several confidence intervals to be constructed in such a manner that the joint probability that all the statements are true is guaranteed not to fall below a predetermined level. Such intervals are called multiple confidence intervals or simultaneous confidence intervals. Numerous methods proposed in the statistical literature have achieved varying degrees of success. We present one that can be used simply and conveniently in general applications. The procedure, called the multiple-t confidence intervals, consists of setting confidence intervals for the differences $(\mu_i - \mu_{i'})$ in much the same way we just did for the individual differences, except that a different percentage point is read from the t table.
Multiple-t Confidence Intervals
A set of $100(1 - \alpha)\%$ simultaneous confidence intervals for $m$ number of pairwise differences $(\mu_i - \mu_{i'})$ is given by
$$(\bar y_i - \bar y_{i'}) \pm t_{\alpha/2m}\, s \sqrt{\frac{1}{n_i} + \frac{1}{n_{i'}}}$$
where $s = \sqrt{MSE}$, $m$ = the number of confidence statements, and $t_{\alpha/2m}$ = the upper $\alpha/(2m)$ point of t with d.f. = $n - k$. Using this procedure, the probability of all the $m$ statements being correct is at least $(1 - \alpha)$.
Operationally, the construction of these confidence intervals does not require any new concepts or calculations, but it usually involves some nonstandard percentage point of t. For example, with k = 3 and $(1 - \alpha) = .95$, if we want to set simultaneous intervals for all $m = \binom{3}{2} = 3$ pairwise differences, we require the upper $\alpha/(2m) = .05/6 = .0083$ point of a t distribution.
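The bookkeeping for the nonstandard percentage point is simple enough to sketch. The helper below (a hypothetical name) counts the pairwise differences among k treatments and returns the upper-tail area $\alpha/(2m)$ at which the t table must be entered; it does not look up the t point itself.

```python
from math import comb


def multiple_t_tail_area(k, alpha):
    """Number of pairwise comparisons m and the tail area alpha / (2m)."""
    m = comb(k, 2)  # all pairwise differences among k treatments
    return m, alpha / (2 * m)


m3, area3 = multiple_t_tail_area(3, 0.05)
m4, area4 = multiple_t_tail_area(4, 0.05)
print(m3, round(area3, 4))
print(m4, round(area4, 4))
```

For k = 3 this gives m = 3 and a tail area of .0083, as in the text; for k = 4 the six comparisons require the upper .0042 point.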
EXAMPLE 4
An experiment is conducted to determine the soil moisture deficit resulting from varying amounts of residual timber left after cutting trees in a forest. The three treatments are treatment 1: no timber left; treatment 2: 2000 bd ft left; treatment 3: 8000 bd ft left. (Board feet is a particular unit of measurement of timber volume.) The measurements of moisture deficit are given in Table 7. Perform the ANOVA test and construct confidence intervals for the treatment differences.

Table 7 Moisture Deficit in Soil

Treatment   Observations                          Total            Mean
1           1.52  1.38  1.29  1.48  1.63          T1 = 7.30        1.460
2           1.63  1.82  1.35  1.03  2.30  1.45    T2 = 9.58        1.597
3           2.55  3.32  2.76  2.63  2.12  2.78    T3 = 16.17       2.695
                                                  Grand total      Grand mean
                                                  T = 33.05        1.944

Our analysis employs convenient alternative forms of the expressions for sums of squares involving totals. The total number of observations $n = 5 + 6 + 6 = 17$.
$$\text{Total SS} = \sum_{i=1}^{3}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 = \sum_{i=1}^{3}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{T^2}{n} = 71.3047 - 64.2531 = 7.0516$$
$$\text{Treatment SS} = \sum_{i=1}^{3} n_i(\bar{y}_i - \bar{y})^2 = \sum_{i=1}^{3} \frac{T_i^2}{n_i} - \frac{T^2}{n} = 69.5322 - 64.2531 = 5.2791$$
$$\text{Residual SS} = \text{Total SS} - \text{Treatment SS} = 1.7725$$
The ANOVA table appears in Table 8.

Table 8 ANOVA Table for Comparison of Moisture Deficit

Source      Sum of Squares   d.f.   Mean Square   F-ratio
Treatment   5.2791           2      2.640         20.79
Error       1.7725           14     .127
Total       7.0516           16
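The shortcut formulas based on totals can be checked with a few lines of Python. This is a minimal sketch, not part of the original text: the function name `anova_from_totals` is hypothetical, and the inputs are the summary figures quoted in Example 4 (treatment totals, sample sizes, and the sum of squared observations 71.3047) rather than the raw data.

```python
# One-way ANOVA sums of squares computed from treatment totals.
# Inputs are summary figures quoted in the text, not the raw observations.

def anova_from_totals(totals, sizes, sum_of_squares):
    """Return (treatment SS, error SS, total SS, F ratio)."""
    n = sum(sizes)                      # total number of observations
    k = len(totals)                     # number of treatments
    correction = sum(totals) ** 2 / n   # T^2 / n
    total_ss = sum_of_squares - correction
    treatment_ss = sum(t ** 2 / m for t, m in zip(totals, sizes)) - correction
    error_ss = total_ss - treatment_ss
    f_ratio = (treatment_ss / (k - 1)) / (error_ss / (n - k))
    return treatment_ss, error_ss, total_ss, f_ratio

treatment_ss, error_ss, total_ss, f_ratio = anova_from_totals(
    totals=[7.30, 9.58, 16.17], sizes=[5, 6, 6], sum_of_squares=71.3047
)
print(round(treatment_ss, 4), round(error_ss, 4), round(total_ss, 4))
print(round(f_ratio, 2))
```

The three sums of squares agree with Table 8 to four decimals. The F ratio computed from unrounded mean squares comes out slightly above the tabled 20.79, which is obtained from the rounded mean squares 2.640 and .127.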
Because the observed value of F is larger than the tabulated value $F_{.05}(2, 14) = 3.74$, the null hypothesis of no difference in the treatment effects is rejected at $\alpha = .05$. In fact, this would be true at almost any significance level. In constructing a set of 95% multiple-t confidence intervals for pairwise differences, note that there are $\binom{3}{2} = 3$ pairs:
$$m = 3, \qquad \frac{\alpha}{2m} = \frac{.05}{6} = .00833$$
From Appendix B, Table 4 the upper .0083 point of t with d.f. = 14 is 2.746. The simultaneous confidence intervals are calculated as follows.

$\mu_2 - \mu_1$: $(1.597 - 1.460) \pm 2.746 \times .356 \times \sqrt{\tfrac{1}{6} + \tfrac{1}{5}} = (-.47, .74)$
$\mu_3 - \mu_2$: $(2.695 - 1.597) \pm 2.746 \times .356 \times \sqrt{\tfrac{1}{6} + \tfrac{1}{6}} = (.53, 1.66)$
$\mu_3 - \mu_1$: $(2.695 - 1.460) \pm 2.746 \times .356 \times \sqrt{\tfrac{1}{6} + \tfrac{1}{5}} = (.63, 1.84)$

These confidence intervals indicate that treatments 1 and 2 do not differ appreciably but that the mean of treatment 3 is considerably different from the means of 1 and 2.
EXERCISES
4.1 Taking $\alpha = .05$ and $n - k = 26$, determine the appropriate percentile of the t distribution when calculating the multiple-t confidence intervals with (a) $m = 3$ and (b) $m = 5$.
4.2 Construct the 95% multiple-t confidence intervals using the sound distortion data in Example 1.
4.3 Given the following summary statistics:
$n_1 = 30$
$\bar{y}_1 = 10.2$
$n_2 =$
$\bar{y}_2 = 8.1$
$n_3 = 24$
$\bar{y}_3 = 9.7$
$n_4 =$
$\bar{y}_4 = 6.2$
18
8
s
Use $\alpha = .05$ and determine:
(a) t intervals for each of the six differences of means.
(b) The six multiple-t intervals.
4.4 Determine the expression for the length of the t-interval for $\mu_i - \mu_{i'}$ and the multiple-t interval for $\mu_i - \mu_{i'}$ when $m = 10$. The ratio of lengths does not depend on the data. Evaluate this ratio for $\alpha = .10$ and $n - k = 15$.
5. GRAPHICAL DIAGNOSTICS AND DISPLAYS TO SUPPLEMENT ANOVA
In addition to testing hypotheses and setting confidence intervals, an analysis of data must include a critical examination of the assumptions involved in a model. As in regression analysis, of which analysis of variance is a special case, the residuals must be examined for evidence of serious violations of the assumptions. This aspect of the analysis is ignored in the ANOVA table summary.
EXAMPLE 5
Determine the residuals for the moisture data given in Example 4 (see Table 9) and graphically examine them for possible violations of the assumptions.
Table 9 Residuals $y_{ij} - \bar{y}_i$ for the Data Given in Table 7

Treatment   Residuals
1            .06   -.08   -.17    .02    .17
2            .03    .22   -.25   -.57    .70   -.15
3           -.14    .63    .07   -.07   -.58    .09
The residual plots of these data are shown in Figure 1, where the combined dot diagram is presented in (a) and the dot diagrams of residuals corresponding to individual treatments appear in (b).
Figure 1 Residual plots for the data given in Example 4: (a) combined residual plot; (b) residuals with individual treatments.
From an examination of the dot diagrams, the variability in the points for treatment 1 appears to be somewhat smaller than the variabilities in the points for treatments 2 and 3. However, given so few observations it is difficult to determine if this has occurred by chance or if treatment 1 actually has a smaller variance. A few more observations are usually necessary to obtain a meaningful pattern for the individual treatment plots. Fortunately, the ANOVA testing procedure is robust in the sense that small or moderate departures from normality and constant variance do not seriously affect its performance.
In addition to the ANOVA analysis of means, a graphical portrayal of the data, as a box plot for each treatment, conveys important information available for making comparisons of populations.
EXAMPLE 6
The sepal width was measured on 50 iris flowers for each of three varieties: Iris setosa, Iris versicolor, and Iris virginica. (Source: Annals of Eugenics, 1936, Vol. 7, 179-188.) A computer calculation produced the summary shown in the following ANOVA table:

Source      SS       d.f.   F
Treatment   11.345   2      49.16
Error       16.962   147
Total       28.307   149

Since $F_{.05}(2, 147) = 3.05$, we reject the null hypothesis of equal sepal width means at the 5% level of significance.

Treatment         Sample Mean
Iris setosa       3.428
Iris versicolor   2.770
Iris virginica    2.974
A calculation of multiple-t confidence intervals shows that all population means differ from one another (see Exercise 6.7). Box plots graphically display the variation in the sepal width measurements. From Figure 2, we see that Iris setosa typically has larger sepal width. R. A. Fisher, who developed analysis of variance, used these data along with other lengths and widths to introduce a statistical technique for identifying varieties.

Figure 2 Box plots for the three iris samples (horizontal axis: width).
KEY IDEAS
Several populations can be compared using the Analysis of Variance (ANOVA). The $j$th observation on the $i$th treatment is $y_{ij}$. The total variation in the observations $y_{ij}$ is expressed as the
$$\text{Total Sum of Squares} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2$$
The Total Sum of Squares is partitioned into the two components
(i) Treatment Sum of Squares
$$SS_T = \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2$$
and (ii) Error Sum of Squares
$$SSE = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$$
$$\text{Mean Square} = \frac{\text{Sum of Squares}}{\text{Degrees of Freedom}}$$
The F-statistic = (Treatment Mean Square)/(Error Mean Square)
When calculating confidence intervals for population mean differences $\mu_i - \mu_{i'}$, it is desirable to use multiple-t intervals to ensure an overall confidence level for all statements.
EXERCISES
6.1 Provide a decomposition for the following observations from a completely randomized design with three treatments:
Treatment 1
Treatment 2
Treatment 3
19
16 11 13 14 11
13 16 18 11 15 11
18 21 18
6.2 Compute the sums of squares and construct the ANOVA table for the data given in Exercise 6.1.
6.3 Using the table of percentage points for the F distribution, find:
(a) The upper 5% point when d.f. = (8, 12)
(b) The upper 5% point when d.f. = (8, 20)
(c) The upper 10% point when d.f. = (8, 12)
6.4 As part of the multilab study, four fabrics are tested for flammability at the National Bureau of Standards. The following burn times in seconds are recorded after a paper tab is ignited on the hem of a dress made of each fabric:

Fabric 1   17.8   16.2   17.5   17.4   15.0
Fabric 2   11.2   11.4   15.8   10.0   10.4
Fabric 3   11.8   11.0   10.0    9.2    9.2
Fabric 4   14.9   10.8   12.8   10.7   10.7
(a) State the statistical model and present the ANOVA table. With $\alpha = .05$, test the null hypothesis of no difference in degree of flammability for the four fabrics.
(b) If the null hypothesis is rejected, construct confidence intervals to determine the fabric(s) with the lowest mean burn time.
(c) Plot the residuals and comment on the plausibility of the assumptions.
(d) If the tests had been conducted one at a time on a single mannequin, how would you have randomized the fabrics in this experiment?
6.5 Subjects in a study of the impact of child abuse on IQ [Sandgrund et al., American Journal of Mental Deficiency, Vol. 79, No. 3 (1974), 327-30] are selected from families receiving public assistance. The criterion for "abuse" is that an ongoing situation be confirmed by investigation, the "neglect" category is based on legal findings with regard to lack of adequate care, and the "nonabuse control" group is selected from out-patients at a clinic. The summary statistics for boys are:

Verbal IQ
Abuse                 Neglect               Nonabuse Control
$\bar{y}_1 = 81.06$   $\bar{y}_2 = 78.56$   $\bar{y}_3 = 87.81$
$s_1 = 17.05$         $s_2 = 15.43$         $s_3 = 14.36$
$n_1 = 32$            $n_2 = 16$            $n_3 = 16$

where
$$s_i^2 = \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 / (n_i - 1)$$
(a) Present the ANOVA table for these data.
(b) Calculate simultaneous confidence intervals for the differences.
(c) Because randomization of treatments is impossible, what question regarding abuse and low IQ is still unanswered?
6.6 Using the computer. MINITAB can be used for ANOVA. The SET command has already been used to set the data in Example 1 into columns 1, 2, 3, and 4, respectively. The command

AOVONEWAY ON C1-C4

produces the following output:

ANALYSIS OF VARIANCE
SOURCE    DF       SS       MS      F
FACTOR     3    68.00    22.67   4.34
ERROR     18    94.00     5.22
TOTAL     21   162.00

LEVEL      N     MEAN    STDEV
C1         5    12.00     3.08
C2         4    17.00     3.16
C3         7    16.00     1.41
C4         6    15.00     1.67

[Interval plot for the four level means; axis ticks at 12.0, 15.0, 18.0]
Use the computer to analyze the moisture data in Table 7.
6.7 The MINITAB output for the analysis of the iris data described in Example 6 is given below.

ANALYSIS OF VARIANCE
SOURCE    DF       SS       MS       F
FACTOR     2   11.345    5.672   49.16
ERROR    147   16.962    0.115
TOTAL    149   28.307

LEVEL      N     MEAN    STDEV
C4        50    3.428    0.379
C5        50    2.770    0.314
C6        50    2.974    0.322

POOLED STDEV = 0.340
INDIVIDUAL 95 PCT CI'S FOR MEAN BASED ON POOLED STDEV
[Character plot of the three intervals (---*---); axis ticks at 2.88, 3.12, 3.36]

(a) Identify SSE and its degrees of freedom. Also locate s.
(b) Check the calculation of F from the given sums of squares and d.f.
(c) Is there one population with highest mean or are two or more alike? Use multiple-t confidence intervals with $\alpha = .05$.
CHAPTER 16
Nonparametric Inference
1. INTRODUCTION
2. THE WILCOXON RANK-SUM TEST FOR COMPARING TWO TREATMENTS
3. MATCHED PAIR COMPARISONS
4. MEASURES OF CORRELATION BASED ON RANKS
5. CONCLUDING REMARKS
6. EXERCISES
1. INTRODUCTION
Nonparametric refers to inference procedures that do not require the population distribution to be normal or some other form specified in terms of parameters. Nonparametric procedures continue to gain popularity because they apply to a very wide variety of population distributions. Typically, they utilize simple aspects of the sample data, such as the signs of the measurements, order relationships, or category frequencies. Stretching or compressing the scale of measurement does not alter them. As a consequence, the null distribution of a nonparametric test statistic can be determined without regard to the shape of the underlying population distribution. For this reason, these tests are also called distribution-free tests. This distribution-free property is their strongest advantage.
What type of observations are especially suited to a nonparametric analysis? Characteristics like degree of apathy, taste preference, and surface gloss cannot be evaluated on an objective numerical scale, and an assignment of numbers is, therefore, bound to be arbitrary. Also, when people are asked to express their views on a 5-point rating scale,

Strongly Disagree   Disagree   Indifferent   Agree   Strongly Agree

the numbers have little physical meaning beyond the fact that higher scores indicate greater agreement. Data of this type are called ordinal data, because only the order of the numbers is meaningful and the distance between two numbers does not lend itself to practical interpretation. Nonparametric procedures that utilize information only on order or rank are particularly suited to measurements on an ordinal scale.
2. THE WILCOXON RANK-SUM TEST FOR COMPARING TWO TREATMENTS
The problem of comparing two populations based on independent random samples has already been discussed in Section 2 of Chapter 11. Under the assumption of normality and equal standard deviations the parametric inference procedures were based on Student's t statistic. Here we describe a useful nonparametric procedure named after its proposer F. Wilcoxon (1945). An equivalent alternative version was independently proposed by H. Mann and D. Whitney (1947).
For a comparative study of two treatments A and B, a set of $n = n_A + n_B$ experimental units are randomly divided into two groups of sizes $n_A$ and $n_B$, respectively. Treatment A is applied to the $n_A$ units, and treatment B is applied to the $n_B$ units. The response measurements, recorded in a slightly different notation than before, are

Treatment A   $X_{11}, X_{12}, \ldots, X_{1n_A}$
Treatment B   $X_{21}, X_{22}, \ldots, X_{2n_B}$
These data constitute independent random samples from two populations. Assuming that larger responses indicate a better treatment, we wish to test the null hypothesis that there is no difference between the two treatment effects vs. the one-sided alternative that treatment A is more effective than treatment B. In the present nonparametric setting, we only assume that the distributions are continuous.
Model: Both population distributions are continuous.
Hypotheses:
$H_0$: The two population distributions are identical.
$H_1$: The distribution of population A is shifted to the right of the distribution of population B.
Note that no assumption is made regarding the shape of the population distribution. This is in sharp contrast to our t-test in Chapter 11 where we assumed that the population distributions were normal with equal standard deviations. Figure 1 illustrates the above hypotheses $H_0$ and $H_1$.

Figure 1 Representation of $H_0$ and $H_1$ in terms of the amount of shift $\Delta$ (panels: identical distribution of both populations; shift of amount $\Delta$).

The basic concept underlying the rank-sum test can now be explained by the following intuitive line of reasoning. Suppose that the two sets of observations are plotted on the same diagram, using different markings A and B to identify their sources. Under $H_0$, the samples come from the same population, so that the two sets of points should be well mixed. However, if the larger observations are more often associated with the first sample, for example, we can infer that population A is possibly shifted to the right of population B. These two situations are diagrammed in Figure 2 where the combined set of points in each case is serially numbered from left to right. These numbers are called the combined sample ranks. In Figure 2a large as well as small ranks are associated with each sample, whereas in Figure 2b most of the larger ranks are associated with the first sample. Therefore, considering the sum of the ranks associated with the first sample as a test statistic, a large value of this statistic should reflect that the first population is located to the right of the second.
Figure 2 Combined plot of the two samples and the combined sample ranks.

To establish a rejection region with a specified level of significance, we must consider the distribution of the rank-sum statistic under the null hypothesis. This concept is explored in Example 1, where small sample sizes are investigated for easy enumeration.
EXAMPLE 1
To determine if a new hybrid seedling produces a bushier flowering plant than a currently popular variety, a horticulturist plants 2 new hybrid seedlings and 3 currently popular seedlings in a garden plot. After the plants mature, the following measurements of shrub girth in inches are recorded:

                                SHRUB GIRTH (IN INCHES)
Treatment A (New hybrid)        31.8   39.1
Treatment B (Current variety)   35.5   27.6   21.3
Do these data strongly indicate that the new hybrid produces larger shrubs than the current variety? We wish to test the null hypothesis
$H_0$: A and B populations are identical.
vs. the alternative hypothesis
$H_1$: Population A is shifted from B toward larger values.
For the rank-sum test, the two samples are placed together and ranked from smallest to largest:

Combined sample ordered observations   21.3   27.6   31.8   35.5   39.1
Ranks                                  1      2      3      4      5
Treatment                              B      B      A      B      A

Rank sum for A: $W_A = 3 + 5 = 8$
Rank sum for B: $W_B = 1 + 2 + 4 = 7$

Because larger measurements and therefore higher ranks for treatment A tend to support $H_1$, the rejection region of our test should consist of large values for $W_A$:
Reject $H_0$ if $W_A \ge c$
To determine the critical value $c$ so that the type I error probability is controlled at a specified level $\alpha$, we evaluate the probability distribution of $W_A$ under $H_0$. When the two samples come from the same population, every pair of integers out of {1, 2, 3, 4, 5} is equally likely to be the ranks for the two A measurements. There are $\binom{5}{2} = 10$ potential pairs, so that each collection of possible ranks has a probability of $\frac{1}{10} = .1$ under $H_0$. These rank collections are listed in Table 1 with their corresponding $W_A$ values.
Table 1 Rank Collections for Treatment A with Sample Sizes $n_A = 2$, $n_B = 3$

Ranks of A   Rank Sum $W_A$   Probability
1,2          3                .1
1,3          4                .1
1,4          5                .1
1,5          6                .1
2,3          5                .1
2,4          6                .1
2,5          7                .1
3,4          7                .1
3,5          8                .1
4,5          9                .1
Total                         1.0
The null distribution of $W_A$ can be obtained immediately from Table 1 by collecting the probabilities of identical values (see Table 2).

Table 2 Distribution of the Rank Sum $W_A$ for Sample Sizes $n_A = 2$, $n_B = 3$

Value of $W_A$   3    4    5    6    7    8    9
Probability      .1   .1   .2   .2   .2   .1   .1
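The enumeration behind Tables 1 and 2 can be reproduced mechanically. The following is a minimal sketch added for illustration (the function name `rank_sum_null_distribution` is hypothetical): it lists every equally likely set of ranks for sample A and tallies the rank sums.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def rank_sum_null_distribution(n_a, n_b):
    """Exact null distribution of W_A: every set of n_a ranks is equally likely."""
    n = n_a + n_b
    counts = Counter(sum(ranks) for ranks in combinations(range(1, n + 1), n_a))
    total = sum(counts.values())          # number of rank collections
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}

dist = rank_sum_null_distribution(2, 3)
for w, p in dist.items():
    print(w, float(p))
```

For sample sizes 2 and 3 this prints the seven values 3 through 9 with the probabilities shown in Table 2.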
The observed value $W_A = 8$ has the significance probability $P_{H_0}(W_A \ge 8) = .1 + .1 = .2$. In other words, we must tolerate a type I error probability of .2 in order to reject $H_0$. The rank-sum test leads us to conclude that the evidence is not sufficiently strong to reject $H_0$. Note that even if the A measurements did receive the highest ranks of 4 and 5, a significance level of $\alpha = .1$ would be required to reject $H_0$.
Guided by Example 1, we now state the rank-sum test procedure in a general setting.
Wilcoxon Rank-Sum Test
Let $X_{11}, \ldots, X_{1n_A}$ and $X_{21}, \ldots, X_{2n_B}$ be independent random samples from continuous populations A and B, respectively. To test $H_0$: The populations are identical:
(a) Rank the combined sample of $n = n_A + n_B$ observations in increasing order of magnitude.
(b) Find the rank sum $W_A$ of the first sample.
(c) (i) For $H_1$: Population A is shifted to the right of population B; set the rejection region at the upper tail of $W_A$.
    (ii) For $H_1$: Population A is shifted to the left of population B; set the rejection region at the lower tail of $W_A$.
    (iii) For $H_1$: Populations are different; set the rejection region at both tails of $W_A$ having equal probabilities.
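Steps (a) through (c) can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's own procedure: the helper names `rank_sum` and `tail_probabilities` are hypothetical, no tied observations are allowed, and the tail probabilities are found by brute-force enumeration, which is practical only for small samples.

```python
from itertools import combinations

def rank_sum(sample_a, sample_b):
    """Steps (a)-(b): rank the combined sample, then add the ranks of sample A.
    Assumes all observations are distinct (no ties)."""
    combined = sorted(sample_a + sample_b)
    rank_of = {value: rank for rank, value in enumerate(combined, start=1)}
    return sum(rank_of[x] for x in sample_a)

def tail_probabilities(w_obs, n_a, n_b):
    """Step (c): exact P[W_A >= w_obs] and P[W_A <= w_obs] under H0."""
    sums = [sum(c) for c in combinations(range(1, n_a + n_b + 1), n_a)]
    upper = sum(s >= w_obs for s in sums) / len(sums)
    lower = sum(s <= w_obs for s in sums) / len(sums)
    return upper, lower

hybrid = [31.8, 39.1]           # treatment A in Example 1
current = [35.5, 27.6, 21.3]    # treatment B in Example 1
w_a = rank_sum(hybrid, current)
upper, lower = tail_probabilities(w_a, len(hybrid), len(current))
print(w_a, upper, lower)
```

For a right-shift alternative the upper-tail value is the relevant one, for a left-shift alternative the lower-tail value, and for a two-sided alternative each tail is compared with $\alpha/2$.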
A determination of the null distribution of the rank-sum statistic by direct enumeration becomes more tedious as the sample sizes increase. However, tables for the null distribution of this statistic have been prepared for small samples, and an approximation is available for large samples. To explain the use of Appendix Table 7, first we note some features of the rank sums $W_A$ and $W_B$. The total of the two rank sums $W_A + W_B$ is a constant, which is the sum of the integers $1, 2, \ldots, n$, where $n$ is the combined sample size. For instance, in Example 1,
$$W_A + W_B = (3 + 5) + (1 + 2 + 4) = 1 + 2 + 3 + 4 + 5 = 15$$
Therefore, a test that rejects $H_0$ for large values of $W_A$ is equivalent to a test that rejects $H_0$ for small values of $W_B$. We can just as easily designate $W_B$ the test statistic and set the rejection region at the lower tail. Consequently, we can always concentrate on the rank sum of the smaller sample and set the rejection region at the lower (or upper) tail, depending on whether the alternative hypothesis states that the corresponding population distribution is shifted to the left (or right). Second, the distribution of each of the rank-sum statistics $W_A$ and $W_B$ is symmetric. In fact, $W_A$ is symmetric about $n_A(n_A + n_B + 1)/2$ and $W_B$ is symmetric about $n_B(n_A + n_B + 1)/2$. Table 2 illustrates the symmetry of the $W_A$ distribution for the case $n_A = 2$, $n_B = 3$. This symmetry also holds for the test statistic calculated from the larger sample size.
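The two features just described can be written out for the sample sizes of Example 1. The lines below are a worked check added for illustration; the offset $d$ is notation introduced here.

```latex
\begin{align*}
W_A + W_B &= 1 + 2 + \cdots + n = \frac{n(n+1)}{2} = \frac{5 \cdot 6}{2} = 15
  && (n = n_A + n_B = 5) \\
\text{center of } W_A &= \frac{n_A(n_A + n_B + 1)}{2} = \frac{2 \cdot 6}{2} = 6,
  \qquad
\text{center of } W_B = \frac{n_B(n_A + n_B + 1)}{2} = \frac{3 \cdot 6}{2} = 9 \\
P[W_A \ge x] &= P[W_B \le 15 - x],
  \qquad
P[W_A = 6 - d] = P[W_A = 6 + d]
\end{align*}
```

The two centers add to 15, as they must, and the probabilities in Table 2 are indeed mirrored about 6.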
The Use of Appendix Table 7
The Wilcoxon rank-sum test statistic is taken as
$W_s$ = sum of ranks of the smaller sample in the combined sample ranking.
When the sample sizes are equal, take the sum of ranks for either of the samples. The Appendix Table 7 gives the upper- as well as the lower-tail probabilities:
$P[W_s \ge x]$ = Upper-tail probability
$P[W_s \le x^*]$ = Lower-tail probability
By the symmetry of the distribution, these probabilities are equal when $x$ and $x^*$ are at equal distances from the center. The table includes the $x^*$ values corresponding to the $x$'s at the upper tail.
EXAMPLE 2
Find $P[W_s \ge 25]$ and $P[W_s \le 8]$ when smaller sample size = 3 and larger sample size = 7.
From Table 7, we read $P = .033$ opposite $x = 25$, so $.033 = P[W_s \ge 25]$. The lower tail entry $P[W_s \le 8]$ is obtained by reading $P[W_s \le x^*]$ opposite $x^* = 8$. We find $P[W_s \le 8] = .033$, illustrating the symmetry of $W_s$.

$P = P[W_s \ge x] = P[W_s \le x^*]$
Smaller Sample Size = 3
Larger Sample Size = 7

     x     P      x*
     22           11
     23           10
     24           9
->   25    .033   8
     26           7
     27           6
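When the printed table is not at hand, the same tail entries can be computed by enumeration. The sketch below is an illustration only (the function name `rank_sum_tails` is hypothetical); it builds both tail columns for given sample sizes and looks up the two entries read in Example 2.

```python
from itertools import combinations

def rank_sum_tails(n_small, n_large):
    """Return two dicts: upper[x] = P[W_s >= x] and lower[x] = P[W_s <= x]."""
    n = n_small + n_large
    sums = [sum(c) for c in combinations(range(1, n + 1), n_small)]
    lo, hi = min(sums), max(sums)
    upper = {x: sum(s >= x for s in sums) / len(sums) for x in range(lo, hi + 1)}
    lower = {x: sum(s <= x for s in sums) / len(sums) for x in range(lo, hi + 1)}
    return upper, lower

upper, lower = rank_sum_tails(3, 7)
# The two entries read from the table in Example 2 (x = 25 and x* = 8).
print(round(upper[25], 3), round(lower[8], 3))
```

Both lookups give the same value because 25 and 8 lie at equal distances from the center $3(3 + 7 + 1)/2 = 16.5$.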
The steps to follow when using Appendix Table 7 in performing a rank-sum test are:
Use the rank sum $W_s$ of the smaller sample as the test statistic. (If the sample sizes are equal, take either rank sum as $W_s$.)
(a) If $H_1$ states that the population corresponding to $W_s$ is shifted to the right of the other population, set a rejection region of the form $W_s \ge c$ and take $c$ as the smallest $x$ value for which $P \le \alpha$.
(b) If $H_1$ states that the population corresponding to $W_s$ is shifted to the left, set a rejection region of the form $W_s \le c$ and take $c$ as the largest $x^*$ value for which $P \le \alpha$.
(c) If $H_1$ states that the population corresponding to $W_s$ is shifted in either direction, set a rejection region of the form [$W_s \le c_1$ or $W_s \ge c_2$] and read $c_1$ from the $x^*$ column and $c_2$ from the $x$ column, so that $P \le \alpha/2$.
EXAMPLE 3
Two geological formations are compared with respect to richness of mineral content. The mineral contents of 7 specimens of ore collected from formation 1 and 5 specimens collected from formation 2 are measured by chemical analysis. The following data are obtained:

              Mineral Content
Formation 1   7.6   11.1   6.8   9.8   4.9   6.1   15.1
Formation 2   4.7   6.4    4.1   3.7   3.9
Do the data provide strong evidence that formation 1 has a higher mineral content than formation 2? Test with $\alpha$ near .05.
To use the rank-sum test, first we rank the combined sample and determine the sum of ranks for the second sample, which has the smaller size. The observations from the second sample and their ranks are marked with an asterisk here for quick identification:

Combined ordered values   3.7*  3.9*  4.1*  4.7*  4.9   6.1   6.4*  6.8   7.6   9.8   11.1  15.1
Ranks                     1*    2*    3*    4*    5     6     7*    8     9     10    11    12

The observed value of the rank-sum statistic is
$$W_s = 1 + 2 + 3 + 4 + 7 = 17$$
We wish to test the null hypothesis that the two population distributions are identical vs. the alternative hypothesis that the second population, corresponding to $W_s$, lies to the left of the first. The rejection region is therefore at the lower tail of $W_s$. Reading Appendix Table 7 with smaller sample size = 5 and larger sample size = 7, we find $P[W_s \le 21] = .037$ and $P[W_s \le 22] = .053$. Hence the rejection region with $\alpha = .053$ is established as $W_s \le 22$. Because the observed value falls in this region, the null hypothesis is rejected at $\alpha = .053$. In fact, it would be rejected if $\alpha$ were as low as $P[W_s \le 17] = .005$.
EXAMPLE 4
Flame-retardant materials are tested by igniting a paper tab on the hem of a dress worn by a mannequin. One response is the vertical length of damage to the fabric measured in inches. The following data (courtesy B. Joiner) for 5 samples, each taken from two fabrics, are obtained by researchers at the National Bureau of Standards as part of a larger cooperative study:
Fabric
Fabric
Do the data provide strong evidence that a difference in flammability exists between the two fabrics? Test with $\alpha$ near .05.
The sample sizes are equal, so that we can take the rank sum of either sample as the test statistic. We compute the rank sum for the second sample:
$$W_s = 1 + 2 + 3 + 6 + 9 = 21$$
Because the alternative hypothesis is two-sided, the rejection region includes both tails of $W_s$. From Appendix Table 7 we find that $P[W_s \ge 37] = .028 = P[W_s \le 18]$. Thus with $\alpha = .056$ the rejection region is $W_s \ge 37$ or $W_s \le 18$. The observed value does not fall in the rejection region so the null hypothesis is not rejected at $\alpha = .056$.
Large-Sample Approximation

When the sample sizes are large, the null distribution of the rank-sum statistic is approximately normal and the test can therefore be performed using the normal table. Specifically, with $W_A$ denoting the rank sum of the sample of size $n_A$, suppose that both $n_A$ and $n_B$ are large. Then $W_A$ is approximately normally distributed. Under $H_0$, the distribution of $W_A$ has
$$\text{mean} = \frac{n_A(n_A + n_B + 1)}{2}, \qquad \text{variance} = \frac{n_A n_B (n_A + n_B + 1)}{12}$$

Large Sample Approximation to the Rank-Sum Statistic
$$Z = \frac{W_A - n_A(n_A + n_B + 1)/2}{\sqrt{n_A n_B (n_A + n_B + 1)/12}}$$
is approximately $N(0, 1)$ when $H_0$ is true.
The rejection region for the Z statistic can be determined by using the standard normal table.
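The standardization and the resulting cutoff on the $W_A$ scale can be sketched with the Python standard library. The helper names `rank_sum_z` and `upper_tail_cutoff` are hypothetical, and the sample sizes 9 and 10 are chosen to match the illustration in the text that follows.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_z(w_a, n_a, n_b):
    """Standardized rank sum: (W_A - mean) / sd under H0."""
    mean = n_a * (n_a + n_b + 1) / 2
    variance = n_a * n_b * (n_a + n_b + 1) / 12
    return (w_a - mean) / sqrt(variance)

def upper_tail_cutoff(n_a, n_b, alpha):
    """Value c such that the normal approximation gives P[W_A >= c] = alpha."""
    mean = n_a * (n_a + n_b + 1) / 2
    sd = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return mean + NormalDist().inv_cdf(1 - alpha) * sd

cutoff = upper_tail_cutoff(9, 10, 0.05)
print(round(cutoff, 1))
```

Because rank sums are whole numbers, a cutoff such as this one would in practice be rounded up to the next integer.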
To illustrate the amount of error involved in this approximation, consider the case $n_A = 9$, $n_B = 10$. With $\alpha = .05$, the approximate one-sided rejection region is
$$R: \frac{W_A - 9(20)/2}{\sqrt{9 \times 10 \times 20/12}} = \frac{W_A - 90}{12.247} \ge 1.64$$
which simplifies to $R: W_A \ge 110.1$. From Appendix Table 7 we find $P[W_s \ge 110] = .056$ and $P[W_s \ge 111] = .047$, which are quite close to $\alpha = .05$. The error decreases with increasing sample sizes.

Handling Tied Observations
In the preceding examples observations in the combined sample are all distinct and therefore the ranks are determined without any ambiguity. Often, however, due to imprecision in the measuring scale or a basic discreteness of the scale, such as a 5-point preference rating scale, observed values may be repeated in one or both samples. For example, consider the data sets
Sample
Sample
The ordered combined sample is

18   20   22   24   24   26   26   26   28   30
               (tie)     (tie)
Here two ties are present; the first has 2 elements, and the second has 3. The two positions occupied by 24 are eligible for the ranks 4 and 5, and we assign the average rank $(4 + 5)/2 = 4.5$ to each of these observations. Similarly, the three tied observations 26, eligible for the ranks 6, 7, and 8, are each assigned the average rank $(6 + 7 + 8)/3 = 7$. After assigning average ranks to the tied observations and usual ranks to the other observations, the rank-sum statistic can then be calculated. When ties are present in small samples, the distribution in Appendix Table 7 no longer holds exactly. It is best to calculate the null distribution of $W_s$ under the tie structure or at least to modify the variance in the standardized statistic for use in large samples. See Lehmann [1] for details.
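The averaging rule for tied observations can be sketched as follows. The function name `midranks` is hypothetical and the code is an illustration only; it is applied to the ordered combined sample shown above.

```python
def midranks(values):
    """Rank values in increasing order, giving tied values their average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j past every observation equal to ordered[i].
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The tied group occupies ranks i+1 through j; assign their average.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

combined = [18, 20, 22, 24, 24, 26, 26, 26, 28, 30]
print(midranks(combined))
# [1.0, 2.0, 3.0, 4.5, 4.5, 7.0, 7.0, 7.0, 9.0, 10.0]
```

The rank sum for either sample is then the sum of these average ranks over that sample's observations.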
EXERCISES
2.1 Independent random samples of sizes $n_A$ from two continuous populations.
(a) Enumerate all possible collections of ranks associated with the smaller sample in the combined sample ranking. Attach probabilities to these rank collections under the null hypothesis that the populations are identical.
(b) Obtain the null distribution of $W_s$ = sum of ranks of the smaller sample. Verify that the tail probabilities agree with the tabulated values.
2.2 Independent samples of sizes $n_A = 2$ and $n_B = 2$ are taken from two continuous populations.
(a) Enumerate all possible collections of ranks associated with population A. Also attach probabilities to these rank collections assuming the populations are identical.
(b) Obtain the null distribution of $W_A$.
2.3 Using Appendix Table 7, find:
(a) $P[W_s \ge 39]$ when $n_A = 5$, $n_B = 6$
(b) $P[W_s \le 15]$ when $n_A = 6$, $n_B = 4$
(c) the point $c$ such that $P[W_s \ge c]$ is close to .05 when $n_A = 7$, $n_B = 7$
2.4 Using Appendix Table 7, find: PlW" > 571when nA : 6, ns : B .(?) (b) PtW" < 3tl when n; : g, : 6 "; (c) P[W" > 38 or W" - 2}fwhen na : S and n" : 5 (d) the points c sirch that p[142,< c] is close to .05 when na : 4, ns:7 (e) th^epoints c, and c, such that p[W" = : pfw" > cr] is about "r] .025 when fia : 7, nn : 9 2.5
Given the data population A
2.r
5.3
population B (a) Evaluate WA. (b) Evaluate W r.
2.7
3.2
3.7
2.6 A mixture of compounds called phenolics occurs in wood waste products. It has been found that when phenolics are present in large quantities, the waste becomes unsuitable for use as livestock feed. To compare two species of wood, a dairy scientist measures the percentage content of phenolics from 6 batches of waste of species A and 7 batches of waste of species B. The following data are obtained:

            Percentage of phenolics
Species A   2.38   4.19   1.39   3.73   2.86   1.21
Species B   4.67   5.38   3.89   4.67   3.58   4.96   3.98

Use the Wilcoxon rank-sum test to determine if the phenolics content of species B is significantly higher than that of species A. Use $\alpha$ close to .05.
2.7 A project (courtesy of Howard Garber) is constructed to prevent the decline of intellectual performance in children who have a high risk of the most common type of mental retardation, called cultural-familial. It is believed that this can be accomplished by a comprehensive family intervention program. Seventeen children in the high-risk category are chosen in early childhood and given special schooling until the age of 4½. Another 17 children in the same high-risk category form the control group. Measurements of the psycholinguistic quotient (PLQ) are recorded for the control and for the experimental groups at the age of 4½ years. Do the data strongly indicate improved PLQs for the children who received special schooling? Use the Wilcoxon rank-sum test with a large-sample approximation: use $\alpha = .05$.
PLQ at age 4½ years
Experimental group
Control group
79.6
Experimental group Control group
105.4 118.1 127.2 87.3
79.6
110.9 109.3 121.8 112.7 76.9
79.6
98.2
88.9
120.3 70.9
110.9 120.0 100.0 122.8 121.8 112.9 107.0 113.7 103.6
87.0
77.0 96.4 100.0 103.7
61.2
91.1
87.0 76.4
2.8 The possible synergetic effect of insecticides and herbicides is a matter of concern to many environmentalists. It is feared that farmers who apply both herbicides and insecticides to a crop may enhance the toxicity of the insecticide beyond the desired level. An experiment is conducted with a particular insecticide and herbicide to determine the toxicity of the treatments:
Treatment 1: A concentration of .25 μg per gram of soil of insecticide with no herbicide.
Treatment 2: Same dosage of insecticide used in treatment 1 plus 100 μg of herbicide per gram of soil.
Several batches of fruit flies are exposed to each treatment, and the mortality percent is recorded as a measure of toxicity. The following data are obtained:

Treatment 1   40   28   31   38   43   46   29   18
Treatment 2   36   49   56   25   37   30   41
Determine if the data strongly indicate different toxicity levels among the treatments.
2.9 Morphologic measurements of a particular type of fossil excavated from two geological sites provide the following data:

Site A   1.49   1.32   2.01   1.59   1.76
Site B   1.31   1.46   1.86   1.58   1.64
Do the data strongly indicate that fossils at the sites differ with respect to the particular morphology measured?
2.10 If $n_A = 1$ and $n_B = 9$, find
(a) the rank configuration that most strongly supports $H_1$: population A is shifted to the right of population B.
(b) the null probability of $W_A = 10$.
(c) Is it possible to have $\alpha = .05$ with these sample sizes?
3. MATCHED PAIR COMPARISONS
In the presence of extensive dissimilarity in the experimental units, two treatments can be compared more efficiently if alike units are paired and the two treatments applied one to each member of the pair. In this section we discuss two nonparametric tests, the sign test and the Wilcoxon signed-rank test, that can be safely applied to paired differences when the assumption of normality is suspect. The data structure of a matched pair experiment is given in Table 3, where the observations on the $i$th pair are denoted by $(X_{1i}, X_{2i})$. The null hypothesis of primary interest is that there is no difference, or
$H_0$: no difference in the treatment effects.
The Sign Test
This nonparametric test is notable for its intuitive appeal and ease of application. As its name suggests, the sign test is based on the signs of the response differences $D_i$. The test statistic is
S = Number of pairs in which treatment A has a higher response than treatment B.
  = Number of positive signs among the differences $D_1, \ldots, D_n$.
Table 3 Data Structure of Paired Sampling
Pair   Treatment A   Treatment B   Difference
1      $X_{11}$      $X_{21}$      $D_1$
2      $X_{12}$      $X_{22}$      $D_2$
.      .             .             .
n      $X_{1n}$      $X_{2n}$      $D_n$
When the two treatment effects are actually alike, the response difference $D_i$ in each pair is as likely to be positive as it is to be negative. Moreover, if measurements are made on a continuous scale, the possibility of identical responses in a pair can be neglected. The null hypothesis is then formulated as
$H_0$: $P[+] = .5 = P[-]$
Identifying a plus sign as a success, the test statistic S is simply the number of successes in n trials and therefore has a binomial distribution with p = .5 under $H_0$. If the alternative hypothesis states that treatment A has higher responses than treatment B, which is translated $P[+] > .5$, then large values of S should be in the rejection region. For two-sided alternatives $H_1$: $P[+] \ne .5$, a two-tailed test should be employed.
EXAMPLE 6
Mileage tests are conducted to compare an innovative vs. a conventional spark plug. A sample of 12 cars ranging from subcompacts to large station wagons is included in the study. The gasoline mileage for each car is recorded, once with the conventional plug and once with the new plug. The results are given in Table 4. Looking at the differences (A - B), we can see that there are 8 plus signs in the sample of size n = 12. Thus the observed value of the sign test statistic is S = 8. In testing
$H_0$: No difference between A and B, or $P[+] = .5$
vs. $H_1$: A is better than B, or $P[+] > .5$
we will reject $H_0$ for large values of S. Consulting the binomial table for n = 12 and p = .5, we find $P[S \ge 9] = .073$ and $P[S \ge 10] = .019$. If we wish to control $\alpha$ below .05, the rejection region should be established at $S \ge 10$. The observed value S = 8 is too low to be in the rejection region, so that at the level of significance $\alpha = .019$ the data do not sustain the claim of mileage improvement. The significance probability of the observed value is $P[S \ge 8] = .194$.

Table 4
Car Number         1     2     3     4     5     6     7     8     9     10    11    12
New A              26.4  10.3  15.8  16.5  32.5  8.3   22.1  30.1  12.9  12.6  27.3  9.4
Conventional B     24.3  9.8   16.9  17.2  30.5  7.9   22.4  28.6  13.1  11.6  25.5  8.6
Difference A - B   +2.1  +.5   -1.1  -.7   +2.0  +.4   -.3   +1.5  -.2   +1.0  +1.8  +.8
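The binomial probabilities quoted in Example 6 can be checked without a table. The following is a minimal sketch, assuming only the mileage figures of Table 4; the function name `sign_test_upper_tail` is an illustrative choice, not something defined in the text.

```python
from math import comb

def sign_test_upper_tail(s, n):
    """P[S >= s] when S has a binomial distribution with n trials and p = .5."""
    return sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n

# Mileage data from Table 4 (A = new plug, B = conventional plug).
new = [26.4, 10.3, 15.8, 16.5, 32.5, 8.3, 22.1, 30.1, 12.9, 12.6, 27.3, 9.4]
conventional = [24.3, 9.8, 16.9, 17.2, 30.5, 7.9, 22.4, 28.6, 13.1, 11.6, 25.5, 8.6]

differences = [a - b for a, b in zip(new, conventional)]
n = len(differences)
s = sum(d > 0 for d in differences)      # number of plus signs
p_value = sign_test_upper_tail(s, n)     # significance probability P[S >= s]

print(s, round(p_value, 3))
print(round(sign_test_upper_tail(9, n), 3), round(sign_test_upper_tail(10, n), 3))
```

Only the signs of the differences enter the calculation, which is why the same function would serve for ordinal data where the differences themselves cannot be computed.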
An application of the sign test does not require the numerical values of the differences to be calculated. The number of positive signs can be obtained by glancing at the data. Even when a response cannot be measured on a well-defined numerical scale, we can often determine which of the two responses in a pair is better. This is the only information that is required to conduct a sign test. For large samples, the sign test can be performed by using the normal approximation to the binomial distribution. With large n, the binomial distribution with p = .5 is close to the normal distribution with mean $n/2$ and standard deviation $\sqrt{n/4}$.
Large Sample Approximation to the Sign Test Statistic
Under $H_0$,
$$Z = \frac{S - n/2}{\sqrt{n/4}}$$
is approximately distributed as N(0, 1).
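As a rough numerical companion to the large sample approximation, the sketch below computes Z and its upper-tail probability for S = 57 and n = 100, the values used in Example 7. The helper names are assumptions made for illustration, and the normal tail is obtained from the error function in Python's standard library rather than from a printed table.

```python
from math import erf, sqrt

def normal_upper_tail(z):
    """P[Z >= z] for a standard normal Z, computed from the error function."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

def sign_test_z(s, n):
    """Standardized sign test statistic (S - n/2) / sqrt(n/4)."""
    return (s - n / 2) / sqrt(n / 4)

z = sign_test_z(57, 100)
p_value = normal_upper_tail(z)
print(round(z, 2), round(p_value, 4))
```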
EXAMPLE 7
In a TV commercial, filmed live, 100 persons tasted two beers A and B and each selected their favorite. A total of S = 57 preferred beer A. Does this provide strong evidence that A is more popular? According to the large sample approximation,
$$Z = \frac{S - n/2}{\sqrt{n/4}} = \frac{57 - 50}{\sqrt{25}} = 1.4$$
The significance probability $P[Z > 1.4] = .0808$ is not small enough to provide strong support to the claim that beer A is more popular.

Handling Ties
When the two responses in a pair are exactly equal, we say that there is a tie. Because a tied pair has zero difference, it does not have a positive or a negative sign. In the presence of ties, the sign test is performed by discarding the tied pairs, thereby reducing the sample size. For instance, when a sample of n = 20 pairs has 10 plus signs, 6 minus signs, and 4 ties, the sign test is performed with the effective sample size n = 20 - 4 = 16 and with S = 10.

The Wilcoxon Signed-Rank Test
We have already noted that the sign test extends to ordinal data for which the responses in a pair can be compared without being measured on a numerical scale. However, when numerical measurements are available, the sign test may result in a considerable loss of information because it includes only the signs of the differences and disregards their magnitudes. Compare the two sets of paired differences plotted in the dot diagrams in Figure 3. In both cases there are n = 6 data points with 4 positive signs, so that the sign test will lead to identical conclusions. However, the plot in Figure 3b exhibits more of a shift toward the positive side, because the positive differences are farther away from zero than the negative differences. Instead of attaching equal weights to all the positive signs, as is done in the sign test, we should attach larger weights to the plus signs that are farther away from zero. This is precisely the concept underlying the signed-rank test.

Figure 3 Two plots of paired differences, (a) and (b), with the same number of + signs but with different locations for the distributions.
In the signed-rank test, the paired differences are ordered according to their numerical values without regard to signs, and then the ranks associated with the positive observations are added to form the test statistic. To illustrate, we refer to the mileage data given in Example 6, where the paired differences appear in the last row of Table 4. We attach ranks by arranging these differences in increasing order of their absolute values and record the corresponding signs:

Paired differences        2.1  .5  -1.1  -.7  2.0  .4  -.3  1.5  -.2  1.0  1.8  .8
Ordered absolute values   .2   .3   .4   .5   .7   .8   1.0  1.1  1.5  1.8  2.0  2.1
Ranks                     1    2    3    4    5    6    7    8    9    10   11   12
Signs                     -    -    +    +    -    +    +    -    +    +    +    +

The signed-rank statistic $T^+$ is then calculated as
$T^+$ = sum of the ranks associated with positive observations
      = 3 + 4 + 6 + 7 + 9 + 10 + 11 + 12
      = 62
If the null hypothesis of no difference in treatment effects is true, then the paired differences $D_1, D_2, \ldots, D_n$ constitute a random sample from a population that is symmetric about zero. On the other hand, the alternative hypothesis that treatment A is better asserts that the distribution is shifted from zero toward positive values. Under $H_1$, not only are more plus signs anticipated, but the positive signs are also likely to be associated with larger ranks. Consequently, $T^+$ is expected to be large under the one-sided alternative, and we select a rejection region in the upper tail of $T^+$.
Steps in the Signed-Rank Test
(a) Calculate the differences $D_i = X_{1i} - X_{2i}$, i = 1, ..., n.
(b) Assign ranks by arranging the absolute values of the $D_i$ in increasing order; also record the corresponding signs.
(c) Calculate the signed-rank statistic $T^+$ = Sum of ranks of positive differences $D_i$.
(d) Set the rejection region at the upper tail, lower tail, or at both tails of $T^+$, according to whether treatment A is stated to have a higher, lower, or different response than treatment B under the alternative hypothesis.
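The steps of the signed-rank test can be sketched in a few lines. The code below is an illustration only: it assumes no zero differences and no tied absolute values, and it obtains the exact null distribution by counting, for each possible total, the subsets of the ranks 1, ..., n that could carry a plus sign (each of the $2^n$ sign assignments being equally likely under the null hypothesis). The function names are hypothetical.

```python
def signed_rank_statistic(differences):
    """T+ = sum of the ranks of the positive differences (no zeros, no ties)."""
    ordered = sorted(differences, key=abs)           # rank by absolute value
    return sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)

def signed_rank_null_counts(n):
    """counts[t] = number of the 2**n sign assignments that give T+ = t."""
    max_total = n * (n + 1) // 2
    counts = [1] + [0] * max_total
    for rank in range(1, n + 1):                     # add one rank at a time
        for t in range(max_total, rank - 1, -1):
            counts[t] += counts[t - rank]
    return counts

def signed_rank_upper_tail(t, n):
    """Exact null probability P[T+ >= t]."""
    return sum(signed_rank_null_counts(n)[t:]) / 2 ** n

# Paired differences (A - B) for the mileage data of Example 6.
differences = [2.1, 0.5, -1.1, -0.7, 2.0, 0.4, -0.3, 1.5, -0.2, 1.0, 1.8, 0.8]
t_plus = signed_rank_statistic(differences)
print(t_plus, round(signed_rank_upper_tail(61, 12), 3))
```

Counting subsets rather than listing all sign patterns keeps the work proportional to $n \cdot n(n+1)/2$, which is ample for the sample sizes covered by a table such as Appendix Table 8.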
Selected tail probabilities of the null distribution of $T^+$ are given in Appendix Table 8 for n = 3 to n = 15.

Using Appendix Table 8
By symmetry of the distribution around $n(n + 1)/4$, we obtain
$$P[T^+ \ge x] = P[T^+ \le x^*] \quad \text{when } x^* = n(n + 1)/2 - x$$
The x and x* values in Appendix Table 8 satisfy this relation. To illustrate the use of this table, we refer once again to the mileage data given in Example 6. There n = 12 and the observed value of $T^+$ is found to be 62. From the table, we find $P[T^+ \ge 61] = .046$. Thus the null hypothesis is rejected at the level of significance $\alpha = .046$, and a significant mileage improvement using the new type of spark plug is indicated.

$P = P[T^+ \ge x] = P[T^+ \le x^*]$
n = 12
x     P      x*
.     .      .
61    .046   17
.     .      .
With increasing sample size n, the null distribution of $T^+$ is closely approximated by a normal curve, with mean $= n(n + 1)/4$ and variance $= n(n + 1)(2n + 1)/24$.

Large Sample Approximation to Signed-Rank Statistic
Under the null hypothesis,
$$Z = \frac{T^+ - n(n + 1)/4}{\sqrt{n(n + 1)(2n + 1)/24}}$$
is approximately distributed as N(0, 1).

This result can be used to perform the signed-rank test with large samples.
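EXAMPLE 8
For the mileage data in Example 6, $T^+ = 62$ with n = 12, and the exact observed significance probability is $P[T^+ \ge 62] = .039$. The normal approximation to this probability uses
$$z = \frac{62 - 12(13)/4}{\sqrt{12(13)(25)/24}} = \frac{23}{12.75} = 1.804$$
From the normal table, we approximate $P[T^+ \ge 62]$ by $P[Z \ge 1.804] = .036$.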
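The arithmetic of the normal approximation for the signed-rank statistic can be reproduced as follows. This is a sketch under the assumption that the error function is an acceptable stand-in for the normal table; `signed_rank_z` and `normal_upper_tail` are illustrative names.

```python
from math import erf, sqrt

def normal_upper_tail(z):
    """P[Z >= z] for a standard normal Z."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

def signed_rank_z(t_plus, n):
    """Standardize T+ with null mean n(n+1)/4 and variance n(n+1)(2n+1)/24."""
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    return (t_plus - mean) / sqrt(variance)

z = signed_rank_z(62, 12)            # mileage data: T+ = 62, n = 12
p_approx = normal_upper_tail(z)
print(round(z, 3), round(p_approx, 3))
```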
The normal approximation improves with increasing sample size.

*Handling ties
In computing the signed-rank statistic, ties may occur in two ways: Some of the differences $D_i$ may be zero, or some nonzero differences $D_i$ may have the same absolute value. The first type of tie is handled by discarding the zero values and simultaneously modifying the sample size downward to
n' = n - No. of zeros
The second type of tie is handled by assigning the average rank to each observation in a group of tied observations with nonzero differences $D_i$. See Lehmann [1] for instructions on how to modify the critical values to adjust for ties.
EXERCISES
3.1 In a taste test of two chocolate chip cookie recipes, 13 out of 18 subjects favored recipe A. Using the sign test, find the significance probability when $H_1$ states that recipe A is preferable.
3.2 Two critics rate the service at six award-winning restaurants on a continuous 0 to 5 scale. Is there a difference between the critics' ratings? (a) Use the sign test with $\alpha$ below .05. (b) Find the significance probability.

Service Rating
Restaurant   1     2     3     4     5     6
Critic 1     6.1   5.2   8.9   7.4   4.3   9.7
Critic 2     7.3   5.5   9.1   7.0   5.1   9.8
3.3 A social researcher interviews 25 newly married couples. Each husband and wife are independently asked the question: "How many children would you like to have?" The following data are obtained:

Couple    1  2  3  4  5  6  7  8  9  10  11  12  13
Husband   3  1  2  2  5  0  0  1  2  3   4   1   3
Wife      2  1  1  3  1  1  2  3  2  1   2   2   3

Couple    14  15  16  17  18  19  20  21  22  23  24  25
Husband   2   3   2   0   1   2   3   4   3   0   1   1
Wife      1   2   2   0   2   1   2   3   1   0   2   1

Do the data show a significant difference of opinion between husbands and wives regarding an ideal family? Use the sign test with $\alpha$ close to .05.
3.4 Two computer specialists estimated the amount of computer memory (in megabits) required by five different offices:
Office
Specialist A
Specialist B
I 2 3 4 5
5.2 6.4 3.2 2.O 8.1
6.8 3.1 2.9 12.3
7.r
Apply the sign test, with $\alpha$ under .10, to determine if specialist B estimates higher than specialist A.
3.5 Use Appendix Table 8 to find (a) $P[T^+ \ge 54]$ when n = 11. (b) $P[T^+ \le 32]$ when n = 15. (c) The value of c so that $P[T^+ \ge c]$ is nearly .05 when n = 14.
3.6 Using Appendix Table 8, find: (a) $P[T^+ \ge 65]$ when n = 12. (b) $P[T^+ \le 10]$ when n = 10. (c) The value c such that $P[T^+ \ge c] = .039$ when n = 8. (d) The values $c_1$ and $c_2$ such that $P[T^+ \le c_1] = P[T^+ \ge c_2] = .027$ when n = 11.
3.7 Referring to Exercise 3.2, use the Wilcoxon signed-rank test with $\alpha$ near .05.
3.8 Referring to Exercise 3.4, apply the Wilcoxon signed-rank test with $\alpha$ under .10.
3.9 The null distribution of the Wilcoxon signed-rank statistic $T^+$ is determined from the fact that under the null hypothesis of a symmetric distribution about zero, each of the ranks 1, 2, ..., n is equally likely to be associated with a positive sign or a negative sign. Moreover, the signs are independent of the ranks. (a) Considering the case n = 3, identify all $2^3 = 8$ possible associations of signs with the ranks 1, 2, and 3, and determine the value of $T^+$ for each association. (b) Assigning the equal probability of 1/8 to each case, obtain the distribution of $T^+$ and verify that the tail probabilities agree with the tabulated values.
3.10 A sample of size n = 30 yielded the Wilcoxon signed-rank statistic $T^+ = 325$. What is the significance probability if the alternative is two-sided?
3.11 In Example 10 of Chapter 11, we presented data on the blood pressure of persons before and after they took a pill.

Before       70  80  72  76  76  75  72  78  82  64  74  92  74  68  84
After        68  72  62  70  58  65  68  52  64  72  74  60  74  72  74
Difference   2   8   10  6   18  10  4   26  18  -8  0   32  0   -4  10
(a) Perform a sign test, with $\alpha$ near .05, to determine if blood pressure has increased after taking the pill.
3.12 Charles Darwin performed an experiment to determine if self-fertilized and cross-fertilized plants have different growth rates. Pairs of Zea mays plants, one self- and the other cross-fertilized, were planted in pots, and their heights were measured after a specified period of time. The data Darwin obtained were:

Plant height (in 1/8 inches)
Pair     1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
Cross-   188  96   168  176  153  172  177  163  146  173  186  168  177  184  96
Self-    139  163  150  160  147  149  149  122  132  144  130  144  102  124  144

Source: Darwin, C., "The Effects of Cross- and Self-Fertilization in the Vegetable Kingdom," D. Appleton and Co., New York, 1902.

(a) Calculate the paired differences and plot a dot diagram for the data. Does the assumption of normality seem plausible? (b) Perform the Wilcoxon signed-rank test to determine if crossed plants have a higher growth rate than self-fertilized plants.
4. MEASURE OF CORRELATION BASED ON RANKS
Ranks may also be employed to determine the degree of association between two random variables. These two variables could be mathematical ability and musical aptitude or the aggressiveness scores of first- and second-born sons on a psychological test. We encountered this same general problem in Chapter 3, where we introduced Pearson's product moment correlation coefficient
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum_{i=1}^{n} (X_i - \bar X)^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar Y)^2}}$$
as a measure of association between X and Y. Serving as a descriptive statistic, r provides a numerical value for the amount of linear dependence between X and Y.
Structure of the Observations
The n pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ are independent, and each pair has the same continuous bivariate distribution. The $X_1, \ldots, X_n$ are then ranked among themselves, and the $Y_1, \ldots, Y_n$ are ranked among themselves:
Pair no.         1       2       ...   n
Ranks of $X_i$   $R_1$   $R_2$   ...   $R_n$
Ranks of $Y_i$   $S_1$   $S_2$   ...   $S_n$
Before we present a measure of association, we note a few simplifying properties. Because each of the ranks 1, 2, ..., n must occur exactly once in the set $R_1, R_2, \ldots, R_n$, it can be shown that
$$\bar R = \frac{1 + 2 + \cdots + n}{n} = \frac{n + 1}{2} \quad \text{and} \quad \sum_{i=1}^{n} (R_i - \bar R)^2 = \frac{n(n^2 - 1)}{12}$$
for all possible outcomes. Similarly,
$$\bar S = \frac{n + 1}{2} \quad \text{and} \quad \sum_{i=1}^{n} (S_i - \bar S)^2 = \frac{n(n^2 - 1)}{12}$$
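A measure of correlation is defined by C. Spearman that is analogous to Pearson's correlation, r, except that Spearman replaces the observations with their ranks. Spearman's rank correlation $r_{sp}$ is defined by
$$r_{sp} = \frac{\sum_{i=1}^{n} (R_i - \bar R)(S_i - \bar S)}{n(n^2 - 1)/12} = \frac{\sum_{i=1}^{n} \left(R_i - \frac{n+1}{2}\right)\left(S_i - \frac{n+1}{2}\right)}{n(n^2 - 1)/12}$$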
This rank correlation shares the properties of r that $-1 \le r_{sp} \le 1$ and that values near +1 indicate a tendency for the larger values of X to be paired with the larger values of Y. However, the rank correlation is more meaningful, because its interpretation does not require the relationship to be linear.
Spearman's Rank Correlation
$$r_{sp} = \frac{\sum_{i=1}^{n} \left(R_i - \frac{n+1}{2}\right)\left(S_i - \frac{n+1}{2}\right)}{n(n^2 - 1)/12}$$
(a) $-1 \le r_{sp} \le 1$.
(b) $r_{sp}$ near +1 indicates a tendency for the larger values of X to be associated with the larger values of Y. Values near -1 indicate the opposite relationship.
(c) The association need not be linear; only an increasing/decreasing relationship is required.
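Property (c) can be illustrated with a small computation. The sketch below implements the definition of $r_{sp}$ directly, assuming untied observations; the data are hypothetical, chosen so that y is an increasing but nonlinear function of x, and the function names are illustrative.

```python
def ranks(values):
    """Ranks 1..n of untied values; the smallest value receives rank 1."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman's rank correlation computed from its definition."""
    n = len(x)
    r, s = ranks(x), ranks(y)
    center = (n + 1) / 2                  # common mean of both sets of ranks
    numerator = sum((ri - center) * (si - center) for ri, si in zip(r, s))
    return numerator / (n * (n * n - 1) / 12)

# Hypothetical data: y = x**3 increases with x but is not linear in x.
x = [-2.0, -1.0, 0.5, 1.0, 3.0]
y = [v ** 3 for v in x]

print(spearman(x, y))                     # increasing relationship
print(spearman(x, [-v for v in y]))       # decreasing relationship
```

Because only the ranks enter the formula, any strictly increasing transformation of x or y leaves $r_{sp}$ unchanged, which is the sense in which the association need not be linear.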
EXAMPLE 9
An interviewer in charge of hiring large numbers of typists wishes to determine the strength of the relationship between ranks given on the basis of an interview and scores on an aptitude test. The data for 6 applicants are
Interview rank
Aptitude score
Calculate $r_{sp}$. There are 6 ranks, so that $\bar R = \bar S = (n + 1)/2 = 7/2 = 3.5$ and $n(n^2 - 1)/12 = 35/2 = 17.5$. Ranking the aptitude scores, we obtain
Interview $R_i$   5  2  3  ...
Aptitude $S_i$    5  3  ...
Thus
$$\sum_{i=1}^{6} (R_i - \bar R)(S_i - \bar S) = (5 - 3.5)(5 - 3.5) + \cdots = 1.5(1.5) + \cdots = 16.5$$
and
$$r_{sp} = \frac{16.5}{17.5} = .943$$
The relationship between interview rank and aptitude score appears to be quite strong.

Figure 4 helps to stress the point that $r_{sp}$ is a measure of any monotone relationship, not merely a linear relation. A large sample size approximation to the distribution of $r_{sp}$ leads to a convenient form of a test for independence.
If X and Y are independent, $\sqrt{n - 1}\, r_{sp}$ is approximately distributed as N(0, 1), provided the sample size is large.

Reject $H_0$: X and Y are independent in favor of $H_1$: Large values of X and Y tend to occur together if
$$\sqrt{n - 1}\, r_{sp} \ge z_\alpha$$
Recall that $z_\alpha$ is the upper $\alpha$ point of a standard normal distribution. Two-tailed tests can also be conducted.

Figure 4 $r_{sp}$ is a measure of any monotone relationship.
EXAMPLE 10
The grade point average (GPA) and Scholastic Achievement Test (SAT) scores for 40 applicants yielded $r_{sp} = .4$. Do large values of GPA and SAT tend to occur together?
$$\sqrt{n - 1}\, r_{sp} = \sqrt{39}\,(.4) = 2.498$$
Since $z_{.05} = 1.645$, we reject $H_0$: X and Y are independent at level $\alpha = .05$.
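The test statistic of Example 10 and its one-sided significance probability can be computed as below. This is a minimal sketch; the helper names are assumptions, and the normal tail comes from the standard library's error function rather than a table.

```python
from math import erf, sqrt

def normal_upper_tail(z):
    """P[Z >= z] for a standard normal Z."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

def spearman_z(r_sp, n):
    """Large sample test statistic sqrt(n - 1) * r_sp."""
    return sqrt(n - 1) * r_sp

z = spearman_z(0.4, 40)
p_value = normal_upper_tail(z)       # one-sided significance probability
print(round(z, 3), round(p_value, 4))
```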
EXERCISES 4.I Calculate Spearman'srank correlation from the data x
I 3.1
5.4
4.7 4.6
4.2 Determine Spearman/s rank correlation from the data
x
| 13.2
18.7
19.4
6.3
9.2
8.5
4.3 The following scores are obtained on a test of dexterity and aggression administered to a random sample of 10 high-school seniors: Student
r0
Dexterity
23 29
45
36
49
4I
30
1 5 42
38
Aggression
45 48
15
28
38
21
36
31
37
Evaluate Spearman's statistic. Does the value of $r_{sp}$ indicate that dexterity is high when aggression is low and vice versa?
4.4 Referring to Example 10, determine the significance probability of $r_{sp} = .4$, using the one-sided test, when n = 40.
4.5 If $r_{sp} = .3$ and n = 80, determine the significance probability of the test that rejects for large values of $r_{sp}$.
5. CONCLUDING REMARKS
In contrast to nonparametric procedures, Student's t and the chi-square statistic $(n - 1)s^2/\sigma^2$ were developed to make inferences about the parameters $\mu$ and $\sigma$ of a normal population. These normal-theory parametric procedures can be seriously affected when the sample size is small and the underlying distribution is not normal. Drastic departures from normality can occur in the forms of conspicuous asymmetry, sharp peaks, or heavy tails. For instance, a t-test with an intended level of significance of $\alpha = .05$ may have an actual type I error probability far in excess of this value. These effects are most pronounced for small or moderate sample sizes, precisely when it is most difficult to assess the shape of the population. The selection of a parametric procedure leaves the data analyst with the question: Does my normality assumption make sense in the present situation? To avoid this risk, a nonparametric method could be used in which inferences rest on the safe ground of distribution-free properties.
When the data constitute measurements on a meaningful numerical scale and the assumption of normality holds, parametric procedures are certainly more efficient in the sense that tests have higher power than their nonparametric counterparts. This brings to mind the old adage "You get what you pay for." A willingness to assume more about the population form leads to improved inference procedures. However, trying to get too much for your money by assuming more about the population than is reasonable can lead to the "purchase" of invalid conclusions. A choice between the parametric and nonparametric approach should be guided by a consideration of loss of efficiency and the degree of protection desired against possible violations of the assumptions.
Tests are judged by two criteria: control of the type I error probability and the power to detect alternatives. Nonparametric tests guarantee the desired control of the type I error probability $\alpha$, whatever the form of the distribution. However, a parametric test established at $\alpha = .05$ for a normal distribution may suffer a much larger $\alpha$ when a departure from normality occurs. This is particularly true with small sample sizes. To achieve universal protection, nonparametric tests, quite expectedly, must forfeit some power to detect alternatives when normality actually prevails. As plausible as this argument sounds, it is rather surprising that the loss in power is often marginal with such simple procedures as the Wilcoxon rank-sum test and the signed-rank test.
Finally, the presence of dependence among the observations affects the usefulness of nonparametric and parametric methods in much the same manner. Using either method, the level of significance of the test may seriously differ from the nominal value selected by the analyst.

Caution: When successive observations are dependent, nonparametric test procedures lose their distribution-free property, and conclusions drawn from them can be seriously misleading.
Reference
1. Lehmann, E. L., Nonparametrics: Statistical Methods Based on Ranks, Holden-Day, San Francisco, 1975.
KEY IDEAS
1. Nonparametric tests obtain their distribution-free character because rank orders of the observations do not depend on the shape of the population distribution.
2. The Wilcoxon rank-sum test statistic $W_A$ = sum of ranks of the $n_A$ observations, from population A, among all $n_A + n_B$ observations applies to the comparison of two populations.
3. In the paired sample situation, equality of treatments can be tested using either the sign test statistic S or the Wilcoxon signed-rank statistic $T^+$ = sum, over positive differences, of the ranks of their absolute values.
4. The level of a nonparametric test holds whatever the form of the (continuous) population distribution.

EXERCISES
5.1 Evaluate $W_A$ for the data
Treatment A
EXERCISES 5 . 1 Evaluate W A for the data Treatment A
90
32
81
Treatment B
67
99
43
5.2 Using Appendix Table 7, find: (a) PlW" = 42] when nr : 5, nz: 7. (b) PtW" = 25] when nr : 6, nz : 6. (c) P[W" > 8l or W" < 45] when n, : lO, n, : 7. (d) The point c such that P[W" > c] : .036 when n, : 8, n, - 4. (e) The points c, and c, such that P[W" > cz] : plW" = cr] : .05 when n, : 3, n, : 9. 5.3
(a) Evaluate all possible rank configurations associatedwith Treatment A when no : 3 and ns : 2. (b) Determine the null distribution of W'r.
5.4 Five finalists in a figure-skating contest are rated by two judges on a l0-point scale as follows: Contestants
fudge I |udge 2
9 10
Calculate the Spearinan/srank correlation /sp between the two ratings. 5.5 Using Appendix Table 8, find: (a) Plr+ (b) Pl7+ (c) The point c such that PIT* n
5.6 Referring to Exercise 5.4, calculate (a) the sign test statistic and (b) the significance probability when the alternative is that Judge 2 gives higher scores than Judge 1.
5.7 In a study of the cognitive capacities of nonhuman primates, 19 monkeys of the same age are randomly divided into two groups of 10 and 9. The groups are trained by two different teaching methods to recollect an acoustic stimulus. The monkeys' scores on a subsequent test are:

Memory Scores
Method 1   167  149  137  178  179  155  164  104  151  150
Method 2   98   127  140  103  116  105  100  95   131

Do the data strongly indicate a difference in the recollection abilities of monkeys trained by the two methods? Use the Wilcoxon rank-sum test with $\alpha$ close to .10.
5.8 The following data pertain to the serum calcium measurements in units of IU/L and the serum alkaline phosphate measurements in units of μg/ml for two breeds of pigs, Chester White and Hampshire:

Chester White
Calcium     115  112  82   63   117  69   79   87
Phosphate   47   48   57   75   65   99   97   110

Hampshire
Calcium     62   59   80   105  60   71   103  100
Phosphate   230  182  162  78   220  172  79   58
Using the Wilcoxon rank-sum procedure, test if the serum calcium level is different for the two breeds.
5.9 Referring to the data in Exercise 5.8, is there strong evidence of a difference in the serum phosphate level between the two breeds?
5.10 (a) Calculate Spearman's rank correlation for the data in Exercise 5.8. (b) Test for independence of calcium and phosphate levels using the rejection region
\ffi
/sp
(c) What is the approximate level of this test?
5.1I Given the following
data on the pairs (*,y),
l0 7 8
15 L2 9
evaluate rsp.
5.12 Refer to Exercise5.11. Evaluate (a) sign test statistic (b) signed-rank statistic. 5.13 one aspectof a study of sex differencesinvolves the play behavior of monkeys during the first yeat of life (courtesy of H. Harlow, U. W. Primate Laboratory). six male and six female monkeys are observed in groups of four families during several ten-minute test sessions. The mean total number of times each monkey initiates play with another age mate is recorded: Males Females (a) Plot the observations. (b) Test for equality using the wilcoxon rank-sum test with ct approximately .05. (c) Determine the significance probability. 5.14 Confidence intetval for median using the sign test. Let X1, . . . , Xn be a random sample from a continuous population whose median ii denoted by M. For testing Ho: M : Mo we can use the sign test statisticS : No. ofXj > Mo, i : 1,. . .,n. Hoisrejectedatlevelctin favorofHr:M + MoIf.Sn - r * lwhere>l: ob(xin,.S) : s./2. Repeating this test procedure for all possible values of Mo, a 100(1 - ct)% conlidence intervalfor M is then the range of values Mo so that S is in the acceptance region. Ordering the observations from smallest to largest, verify that this confidence interval becomes (r + l)st smallest to (r + l)st largest (a) Refer to Example 6. Using the sign test, constru ct a confidence interval for the population median of the differences (A B), with level of confidence close to 95%. (b) Repeat Part (a) using Darwin/s data given in Exercise 3.I2.
APPENDIX A1
Summation Notation

A1.1 SUMMATION AND ITS PROPERTIES
The addition of numbers is basic to our study of statistics. To avoid a detailed and repeated writing of this operation, the symbol $\Sigma$ (the Greek capital letter sigma) is used as mathematical shorthand for the operation of addition.

Summation Notation
The notation $\sum_{i=1}^{n} x_i$ represents the sum $x_1 + x_2 + \cdots + x_n$ and is read as the sum of all $x_i$ with i ranging from 1 to n.

The term following the sign $\Sigma$ indicates the quantities that are being summed, and the notations on the bottom and the top of the $\Sigma$ specify the range of the terms being added. For instance,
$$\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4$$
EXAMPLE
Suppose that the four measurements in a data set are given as $x_1 = 2$, $x_2 = 5$, $x_3 = 3$, $x_4 = 4$. Compute the numerical values of:
(a) $\sum_{i=1}^{4} x_i$   (b) $\sum_{i=1}^{4}$ ...   (c) $\sum_{i=1}^{4} 2x_i$   (d) $\sum_{i=1}^{4} (x_i - 3)$   (e) $\sum_{i=1}^{4} x_i^2$   (f) $\sum_{i=1}^{4} (x_i - 3)^2$

Solution:
(a) $\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4 = 2 + 5 + 3 + 4 = 14$
(b) $\sum_{i=1}^{4}$ ...
(c) $\sum_{i=1}^{4} 2x_i = 2\left(\sum_{i=1}^{4} x_i\right) = 2(14) = 28$
(d) $\sum_{i=1}^{4} (x_i - 3) = (x_1 - 3) + (x_2 - 3) + (x_3 - 3) + (x_4 - 3) = \sum_{i=1}^{4} x_i - 4(3) = 14 - 12 = 2$
(e) $\sum_{i=1}^{4} x_i^2 = 2^2 + 5^2 + 3^2 + 4^2 = 54$
(f) $\sum_{i=1}^{4} (x_i - 3)^2 = (x_1 - 3)^2 + (x_2 - 3)^2 + (x_3 - 3)^2 + (x_4 - 3)^2 = (2 - 3)^2 + (5 - 3)^2 + (3 - 3)^2 + (4 - 3)^2 = 1 + 4 + 0 + 1 = 6$

Alternatively, noting that $(x_i - 3)^2 = x_i^2 - 6x_i + 9$, we can write
$$\sum_{i=1}^{4} (x_i - 3)^2 = (x_1^2 - 6x_1 + 9) + (x_2^2 - 6x_2 + 9) + (x_3^2 - 6x_3 + 9) + (x_4^2 - 6x_4 + 9) = \sum_{i=1}^{4} x_i^2 - 6\sum_{i=1}^{4} x_i + 4(9) = 6$$
A few basic properties of the summation operation are apparent from the numerical demonstration in the example.
Some Basic Properties of Summation
If a and b are fixed numbers,
$$\sum_{i=1}^{n} b x_i = b \sum_{i=1}^{n} x_i$$
$$\sum_{i=1}^{n} (b x_i + a) = b \sum_{i=1}^{n} x_i + na$$
$$\sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} x_i^2 - 2a \sum_{i=1}^{n} x_i + na^2$$
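The three properties of summation can be checked numerically on any small data set. The sketch below uses the four measurements 2, 5, 3, 4 with a = 3 and b = 2; the variable names are illustrative, and a numerical check of this kind supports the identities for one data set rather than proving them.

```python
x = [2, 5, 3, 4]      # four measurements
a, b = 3, 2           # fixed numbers
n = len(x)

# Property 1: a constant multiplier can be taken outside the sum.
lhs1 = sum(b * xi for xi in x)
rhs1 = b * sum(x)

# Property 2: summing a constant n times contributes n * a.
lhs2 = sum(b * xi + a for xi in x)
rhs2 = b * sum(x) + n * a

# Property 3: expand the square before summing.
lhs3 = sum((xi - a) ** 2 for xi in x)
rhs3 = sum(xi ** 2 for xi in x) - 2 * a * sum(x) + n * a ** 2

print(lhs1, rhs1)
print(lhs2, rhs2)
print(lhs3, rhs3)
```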
EXERCISES I Demonstrate your familiarity with the summation notation by evaluating the following expressionswhen Xr : 4, Xz : -2, x" : I.
(d) l:l
i:l
(d)4
l:l
( h )> @ , - B ) 2( i ) > @ l - 6 x , + e ) i:l i:l
i:l
2 Five measurements in a data set are Xr : 4, xz : 3, xB : 6, x4 : 5, Xs : 7. Determine:
A1.2 SOME BASIC USES OF $\Sigma$ IN STATISTICS
Let us use the summation notation and its properties to verify some computational facts about the sample mean and variance.

$\sum (x_i - \bar x) = 0$
The total of the deviations about the sample mean is always zero. Since $\bar x = (x_1 + x_2 + \cdots + x_n)/n$, we can write
$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n = n\bar x$$
Consequently, whatever the observations,
$$\sum_{i=1}^{n} (x_i - \bar x) = (x_1 - \bar x) + (x_2 - \bar x) + \cdots + (x_n - \bar x) = x_1 + x_2 + \cdots + x_n - n\bar x = n\bar x - n\bar x = 0$$
We also verify this directly from the second property for summation in A1.1, when b = 1 and $a = -\bar x$.

Alternative Formula for $s^2$
By the quadratic rule of algebra,
$$(x_i - \bar x)^2 = x_i^2 - 2\bar x x_i + \bar x^2$$
Therefore,
$$\sum (x_i - \bar x)^2 = \sum x_i^2 - \sum 2\bar x x_i + \sum \bar x^2 = \sum x_i^2 - 2\bar x \sum x_i + n\bar x^2$$
Using $(\sum x_i)/n$ in place of $\bar x$, we get
$$\sum (x_i - \bar x)^2 = \sum x_i^2 - \frac{2(\sum x_i)^2}{n} + \frac{n(\sum x_i)^2}{n^2} = \sum x_i^2 - \frac{2(\sum x_i)^2}{n} + \frac{(\sum x_i)^2}{n} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$
We could also verify this directly from the third property for summation, in A1.1, with $a = \bar x$. This result establishes that
$$s^2 = \frac{\sum (x_i - \bar x)^2}{n - 1} = \frac{\sum x_i^2 - (\sum x_i)^2/n}{n - 1}$$
so the two forms of $s^2$ are equivalent.

Sample Correlation Coefficient
The sample correlation coefficient and slope of the fitted regression line contain a term
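$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)$$
which is a sum of the products of the deviations. To obtain the alternative form, first note that
$$(x_i - \bar x)(y_i - \bar y) = x_i y_i - x_i \bar y - \bar x y_i + \bar x \bar y$$
We treat $x_i y_i$ as a single number, with index i, and conclude that
$$\sum (x_i - \bar x)(y_i - \bar y) = \sum x_i y_i - \sum x_i \bar y - \sum \bar x y_i + \sum \bar x \bar y = \sum x_i y_i - \bar y \sum x_i - \bar x \sum y_i + n \bar x \bar y$$
Since $\bar x = (\sum x_i)/n$ and $\bar y = (\sum y_i)/n$,
$$\sum (x_i - \bar x)(y_i - \bar y) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$$
Consequently, either $\sum (x_i - \bar x)(y_i - \bar y)$ or $\sum x_i y_i - (\sum x_i)(\sum y_i)/n$ can be used for the calculation of $S_{xy}$.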
APPENDIX A2
Expectation and Standard Deviation Properties

The expected value of a discrete random variable is a summation of the products (value × probability). The key properties of expectations are then all inherited from the properties of summation. In this appendix, we indicate this development for some of the most useful properties of expectation and variance. The interested reader can consult [1] for more details.
A2.1 EXPECTED VALUE AND STANDARD DEVIATION OF cX + b
The units of the random variable X may be changed by multiplying by a constant, for example,
X = height in feet, 12X = height in inches
or by adding a constant, for example,
X = temperature (°F), X - 32 = degrees above freezing (°F)
The mean and standard deviation of the new random variables are related to $\mu = E(X)$ and $\sigma = \mathrm{sd}(X)$.

If X is multiplied by a constant, c:
Random Variable   Mean      Sd
cX                $c\mu$    $|c|\sigma$

If a constant, b, is added to X:
Random Variable   Mean         Sd
X + b             $\mu + b$    $\sigma$ (unchanged)
Notice that adding a constant to a random variable leaves the standard deviation unchanged.
EXAMPLE
Let X have mean = 3 = $\mu$ and standard deviation = 5 = $\sigma$. Find the mean and sd of (a) X + 4, (b) 2X, (c) -X, and (d) $\frac{1}{5}(X - 3)$.

By the properties above:
Random Variable   Mean            Sd
X + 4             3 + 4 = 7       5
2X                2(3) = 6        2(5) = 10
-X = (-1)X        (-1)3 = -3      |-1|5 = 5
Finally, (X - 3) has mean 3 - 3 = 0 and sd = 5, so $\frac{1}{5}(X - 3)$ has mean = $\frac{1}{5}(0) = 0$ and sd = $\frac{1}{5}(5) = 1$.

Any random variable having $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$ can be converted to a
Standardized variable: $Z = \dfrac{X - \mu}{\sigma}$
The standardized variable Z has mean = 0 and variance = 1. This was checked for $Z = \dfrac{X - 3}{5}$ in the example above.

[1] Bhattacharyya, G. K., and Johnson, Richard A., Statistical Concepts and Methods, Wiley, New York, 1978.
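*Verification of the Mean and sd Expressions for cX + b
Consider the random variable cX + b, which includes the two cases above. The choice b = 0 gives cX and the choice c = 1 gives X + b. We restrict our verification to discrete random variables where probability $f(x_i)$ is attached to $x_i$. Because cX + b takes value $cx_i + b$ with probability $f(x_i)$,
$$(\text{value} \times \text{probability}) = (cx_i + b) f(x_i) = c x_i f(x_i) + b f(x_i)$$
$$\text{mean} = \sum (\text{value} \times \text{probability}) = \sum c x_i f(x_i) + \sum b f(x_i) = c \sum x_i f(x_i) + b \sum f(x_i) = c\mu + b \cdot 1$$
Next,
$$\text{deviation} = (\text{random variable}) - (\text{mean}) = (cX + b) - (c\mu + b) = c(X - \mu)$$
$$\text{variance} = \sum (\text{deviation})^2 \times \text{probability} = \sum c^2 (x_i - \mu)^2 f(x_i) = c^2 \sum (x_i - \mu)^2 f(x_i) = c^2 \sigma^2$$
Taking the positive square root yields $\mathrm{sd}(cX + b) = |c|\sigma$. Lastly, taking $c = 1/\sigma$ and $b = -\mu/\sigma$, we obtain
$$cX + b = \frac{X - \mu}{\sigma} = Z$$
so the standardized variable Z has
$$\text{Mean: } c\mu + b = \frac{\mu}{\sigma} - \frac{\mu}{\sigma} = 0 \qquad \text{sd: } |c|\sigma = \frac{\sigma}{\sigma} = 1$$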
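The expressions for the mean and sd of cX + b can be checked by direct summation for any discrete distribution. The sketch below uses the distribution with values 1, 2, 3, 4 and probabilities .4, .3, .2, .1 that appears in the example of A2.2; the constants c = -2 and b = 5 are hypothetical, with c chosen negative to show why the absolute value is needed.

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Mean and sd of a discrete random variable by direct summation."""
    mu = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mu) ** 2 * p for v, p in zip(values, probs))
    return mu, sqrt(variance)

values = [1, 2, 3, 4]
probs = [0.4, 0.3, 0.2, 0.1]
c, b = -2.0, 5.0                      # hypothetical constants

mu, sigma = mean_and_sd(values, probs)
new_mu, new_sigma = mean_and_sd([c * v + b for v in values], probs)

print(mu, sigma)
print(new_mu, c * mu + b)             # mean of cX + b equals c*mu + b
print(new_sigma, abs(c) * sigma)      # sd of cX + b equals |c|*sigma
```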
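A2.2 ALTERNATIVE FORMULA FOR $\sigma^2$
An alternative formula for $\sigma^2$ often simplifies the numerical calculations. By definition,
$$\sigma^2 = \sum (\text{deviation})^2 \times \text{probability} = \sum (x_i - \mu)^2 f(x_i)$$
but $\sigma^2$ can also be expressed as
$$\sigma^2 = \sum x_i^2 f(x_i) - \mu^2$$
To deduce the second form, we first expand the square of the deviation:
$$(x_i - \mu)^2 = x_i^2 - 2\mu x_i + \mu^2$$
Then, multiply each term on the right-hand side by $f(x_i)$ and sum:
First term: $\sum x_i^2 f(x_i)$
Second term: $-2\mu \sum x_i f(x_i) = -2\mu^2$, since $\sum x_i f(x_i) = \mu$
Third term: $+\mu^2 \sum f(x_i) = \mu^2$, since $\sum f(x_i) = 1$
Result: $\sigma^2 = \sum x_i^2 f(x_i) - \mu^2$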
EXAMPLE Calculate $\sigma^2$ by both formulas.

Calculation $\sum (x - \mu)^2 f(x)$

x        f(x)     x f(x)       (x − 2)²     (x − 2)² f(x)
1        .4       .4           1            .4
2        .3       .6           0            0
3        .2       .6           1            .2
4        .1       .4           4            .4
Total    1.0      2.0 = μ                   1.0 = σ²

Calculation $\sum x^2 f(x) - \mu^2$

x        f(x)     x f(x)       x² f(x)
1        .4       .4           .4
2        .3       .6           1.2
3        .2       .6           1.8
4        .1       .4           1.6
Total    1.0      2.0 = μ      5.0

$\sigma^2 = 5 - (2)^2 = 1$
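The two calculations in the example above can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative choices, and the values x = 1, 2, 3, 4 with probabilities .4, .3, .2, .1 are taken from the example.

```python
from math import isclose

# Distribution from the example: values x and probabilities f(x).
xs = [1, 2, 3, 4]
fs = [0.4, 0.3, 0.2, 0.1]

# Mean: sum of x * f(x).
mu = sum(x * f for x, f in zip(xs, fs))

# Definition: sum of (x - mu)^2 * f(x).
var_definition = sum((x - mu) ** 2 * f for x, f in zip(xs, fs))

# Alternative formula: sum of x^2 * f(x), minus mu^2.
var_alternative = sum(x**2 * f for x, f in zip(xs, fs)) - mu**2

# The two variances agree up to floating-point rounding.
print(mu, var_definition, var_alternative)
print(isclose(var_definition, var_alternative))
```

The alternative form needs only one pass over the table once the mean is known, which is why it tends to be the more convenient one for hand calculation.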
A2.3 PROPERTIES OF EXPECTED VALUE FOR TWO RANDOM VARIABLES

The concept of expectation extends to two or more variables. With two random variables:

(a) $E(X + Y) = E(X) + E(Y)$ (additivity or sum law of expectation).
(b) If X and Y are independent, then $E(XY) = E(X)E(Y)$.

Remark: (a) holds quite generally; independence is not required.
*Demonstration

We verify both (a) and (b) assuming independence. Independence implies that $P[X = x, Y = y] = P[X = x]\,P[Y = y]$ for all outcomes (x, y). That is, the distribution of probability over the pairs of possible values (x, y) is specified by the product $f_X(x) f_Y(y) = P[X = x, Y = y]$. The expected value, E(X + Y), is obtained by multiplying each possible value (x + y) by the probability $f_X(x) f_Y(y)$ and summing:

$$
\begin{aligned}
E(X + Y) &= \sum_x \sum_y (x + y)\, f_X(x) f_Y(y) \\
&= \sum_x \sum_y x\, f_X(x) f_Y(y) + \sum_x \sum_y y\, f_X(x) f_Y(y) \\
&= \Big(\sum_x x f_X(x)\Big)\Big(\sum_y f_Y(y)\Big) + \Big(\sum_x f_X(x)\Big)\Big(\sum_y y f_Y(y)\Big) \\
&= E(X) + E(Y)
\end{aligned}
$$

Next,

$$E(XY) = \sum_x \sum_y xy\, f_X(x) f_Y(y) = \Big(\sum_x x f_X(x)\Big)\Big(\sum_y y f_Y(y)\Big) = E(X)E(Y)$$
Under the proviso that the random variables are independent, variances also add.

(c) If X and Y are independent, $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$

*Verification

We set $\mu_1 = E(X)$ and $\mu_2 = E(Y)$, so by (a), $E(X + Y) = \mu_1 + \mu_2$. Then, since variance is the expected value of (variable − mean)²,

$$
\begin{aligned}
\mathrm{Var}(X + Y) &= E(X + Y - \mu_1 - \mu_2)^2 \\
&= E[(X - \mu_1)^2 + (Y - \mu_2)^2 + 2(X - \mu_1)(Y - \mu_2)] \\
&= E(X - \mu_1)^2 + E(Y - \mu_2)^2 + 2E(X - \mu_1)(Y - \mu_2) \quad \text{(by the sum law of expectation (a))} \\
&= \mathrm{Var}(X) + \mathrm{Var}(Y)
\end{aligned}
$$

This last step follows since, by (b),

$$E(X - \mu_1)(Y - \mu_2) = E(X - \mu_1)\,E(Y - \mu_2) = 0 \times 0 = 0$$
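Properties (a), (b), and (c) can be checked by direct enumeration over the pairs (x, y). The sketch below assumes two small marginal distributions invented for illustration; the names `f_x`, `f_y`, and `expect` are hypothetical helpers, and independence is imposed by using the product of the marginals as the joint probability.

```python
from itertools import product
from math import isclose

# Hypothetical marginal distributions (assumptions made for this sketch).
f_x = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
f_y = {0: 0.5, 10: 0.5}


def expect(g):
    """E[g(X, Y)] under independence: the joint probability is f_x(x) * f_y(y)."""
    return sum(g(x, y) * f_x[x] * f_y[y] for x, y in product(f_x, f_y))


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - e_x) ** 2)
var_y = expect(lambda x, y: (y - e_y) ** 2)

e_sum = expect(lambda x, y: x + y)                   # property (a)
e_prod = expect(lambda x, y: x * y)                  # property (b)
var_sum = expect(lambda x, y: (x + y - e_sum) ** 2)  # property (c)

# Each comparison should print True, up to floating-point rounding.
print(isclose(e_sum, e_x + e_y))
print(isclose(e_prod, e_x * e_y))
print(isclose(var_sum, var_x + var_y))
```

Replacing the product `f_x[x] * f_y[y]` with a joint table that does not factor would, in general, leave (a) intact while breaking (b) and (c), in line with the remark that only (a) holds without independence.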
APPENDIX A3
The Expected Value and Standard Deviation of $\bar{X}$
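Some basic properties of the sampling distribution of $\bar{X}$ can be expressed in terms of the population mean and variance when the observations form a random sample. Let

$\mu$ = population mean
$\sigma^2$ = population variance

In a random sample, the random variables $X_1, \ldots, X_n$ are independent, and each has the distribution of the population. Consequently, each observation has mean $\mu$ and variance $\sigma^2$, or

$$E(X_1) = \cdots = E(X_n) = \mu, \qquad \mathrm{Var}(X_1) = \cdots = \mathrm{Var}(X_n) = \sigma^2$$

Next,

$$\bar{X} = \frac{1}{n}(X_1 + \cdots + X_n)$$

and n is a constant. Using the additivity properties of expectation and variance discussed in Appendix A2, we obtain

$$
\begin{aligned}
E(\bar{X}) &= \frac{1}{n} E(X_1 + \cdots + X_n) \\
&= \frac{1}{n}[E(X_1) + \cdots + E(X_n)] \quad \text{(mean of sum = sum of means)} \\
&= \frac{1}{n}[\mu + \cdots + \mu] = \frac{n\mu}{n} = \mu
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{Var}(\bar{X}) &= \frac{1}{n^2} \mathrm{Var}(X_1 + \cdots + X_n) \\
&= \frac{1}{n^2}[\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)] \quad \text{(variances add due to independence)} \\
&= \frac{1}{n^2}[\sigma^2 + \cdots + \sigma^2] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}
\end{aligned}
$$

Further, taking the square root,

$$\mathrm{sd}(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}$$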
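The results $E(\bar{X}) = \mu$ and $\mathrm{sd}(\bar{X}) = \sigma/\sqrt{n}$ can be confirmed exactly for a small case by listing every possible sample. The following is a sketch under stated assumptions: the population distribution `f` and the sample size `n = 3` are illustrative choices, and independence is imposed by multiplying the individual probabilities.

```python
from itertools import product
from math import isclose, prod, sqrt

# Hypothetical population distribution (an assumption made for this sketch).
f = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
n = 3  # illustrative sample size

mu = sum(x * p for x, p in f.items())
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in f.items()))

# Enumerate every sample (x1, ..., xn). Independence means the probability
# of a sample is the product of the individual probabilities.
mean_xbar = 0.0
second_moment_xbar = 0.0
for sample in product(f, repeat=n):
    prob = prod(f[x] for x in sample)
    xbar = sum(sample) / n
    mean_xbar += xbar * prob
    second_moment_xbar += xbar**2 * prob

# Variance via the alternative formula: E(Xbar^2) - [E(Xbar)]^2.
sd_xbar = sqrt(second_moment_xbar - mean_xbar**2)

# Both comparisons should print True, up to floating-point rounding.
print(isclose(mean_xbar, mu))
print(isclose(sd_xbar, sigma / sqrt(n)))
```

Exhaustive enumeration is practical only for tiny cases (here 4³ = 64 samples); for larger n a simulation would give an approximate check instead.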
APPENDIX B
Tables

Table 1 The Number of Combinations $\binom{n}{r}$

 n    r = 2      3       4        5        6        7         8         9        10
 2        1
 3        3      1
 4        6      4       1
 5       10     10       5        1
 6       15     20      15        6        1
 7       21     35      35       21        7        1
 8       28     56      70       56       28        8         1
 9       36     84     126      126       84       36         9         1
10       45    120     210      252      210      120        45        10         1
11       55    165     330      462      462      330       165        55        11
12       66    220     495      792      924      792       495       220        66
13       78    286     715    1,287    1,716    1,716     1,287       715       286
14       91    364   1,001    2,002    3,003    3,432     3,003     2,002     1,001
15      105    455   1,365    3,003    5,005    6,435     6,435     5,005     3,003
16      120    560   1,820    4,368    8,008   11,440    12,870    11,440     8,008
17      136    680   2,380    6,188   12,376   19,448    24,310    24,310    19,448
18      153    816   3,060    8,568   18,564   31,824    43,758    48,620    43,758
19      171    969   3,876   11,628   27,132   50,388    75,582    92,378    92,378
20      190  1,140   4,845   15,504   38,760   77,520   125,970   167,960   184,756

Source: Hoel, Paul G. Elementary Statistics, John Wiley & Sons, New York, 1971.
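Entries of Table 1 can be regenerated with the Python standard library, which is a convenient way to check a value or extend the table. This is a minimal sketch; the function name `combinations_row` and its default range of r are illustrative assumptions.

```python
from math import comb


def combinations_row(n, r_values=range(2, 11)):
    """Return the number of combinations C(n, r) for each r in r_values with r <= n."""
    return [comb(n, r) for r in r_values if r <= n]


# Print the rows for n = 2, ..., 20 in the layout of Table 1.
for n in range(2, 21):
    print(n, combinations_row(n))
```

`math.comb(n, r)` evaluates $n!/(r!\,(n-r)!)$ with exact integer arithmetic, so the values are not subject to rounding.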
Table 2 Cumulative Binomial Probabilities
$P[X \le c] = \sum_{x=0}^{c} \binom{n}{x} p^x (1 - p)^{n - x}$

c    p = .05   .10   .20   .30   .40   .50   .60   .70   .80   .90   .95
n:
n:
10 I
.9s0 .900 .800 .700 .600 .500 .400 .300 .200 .100 .050 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
I 2
.902 .810 .640 .490 .360 .2s0 .160 .090 .040 .010 .002 .997 .990 .960 . 9 r 0 .840 .7so .64A .s10 .360 .190 .097 1.000 1.000 1.000 1 . 0 0 0 1.000 1.000 1.000 1.000 1.000 1.000 1.000
I 2 3
.857 .729 .sr2 .343 .2t6 .r25 .064 .027 .008 .001 .000 .993 .972 .896 . 7 8 4 .648 .500 .352 .216 .ro4 .028 .007 r.000 .999 .992 . 9 7 3 .936 .875 .784 .6s7 .488 .27r .r43 r.000 1.000 1.000 1 . 0 0 0 r.000 1.000 1.000 1.000 1.000 1.000 1.000
20
30
n -- 4 0 I 2 3 4
. 6 5 6 . 4 1 0 .240 .130 .063 .026 .008 .002 .000 .000 . 9 4 8 . 8 1 9 .652 . 4 75 . 3 1 3 .r79 .084 .027 .004 .000 r.000 .996 .973 . 9 1 6 .82r .688 . s z s . 3 4 8 .l8l .052 .0r4 1.000 1.000 .998 .992 .974 .938 .870 .760 .s90 . 3 4 4 . 1 8 5 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
n-5
0 I 2 3 4 5
. 7 74 .590 .328 . 1 6 8 .078 .031 .010 .oo2 .000 .000 .000 . 9 7 7 .9r9 .737 .528 . 3 3 7 . 1 8 8 .087 .031 .007 .000 .000 .999 .99r .942 .837 .683 .500 .3r7 .r"63 .0s8 .009 .001 r.000 1.000 .993 .969 . 9 1 3 . 8 1 3 .663 .472 .263 .081 .023 1.000 1.000 1.000 .998 .990 .969 .922 .832 .672 .410 .226 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
I 2 3 4 5 6
.73s .531 .262 . 1l 8 .047 .016 .004 .001 .000 .000 .000 .967 . 8 8 6 . 6 5 s .420 .233 .r09 .041 .011 .aaz .000 .000 .998 .984 .901 .744 .544 .344 .r79 .070 .0r7 .001 .000 1.000 .999 .983 .930 . 8 2 1 . 6 5 6 .456 .256 .099 .016 .002 1.000 1.000 .998 .989 .959 .891 .767 .s80 .34s .rr4 .033 1.000 1.000 1.000 .999 .996 .984 .9s3 .882 .738 .459 .265 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
n
n_7
0 I 2 3 4 5 6 7
.815 .986
.698 .478 .2ro .9s6 .850 .s77 .996 .974 .8s2 1.000 .997 .967 1.000 1.000 .99s 1.000 1.000 1.000
.028 .008 .oo2 .000 .000 .000 .000 .159 .063 .019 .004 .000 .000 .000 .420 .227 .096 .029 .00s .000 .000 .7ro .s00 .290 .126 .033 .003 .000 .904 .773 .580 .353 .148 .026 .004 .981 .938 .841 .67| .423 r.000 r.000 1.000 r.000 .998 .992 . 9 7 2 . 9 1 8 . 7 9 0 .150 .044 .522 1.000 1.000 1.000 r.000 1.000 1.000 r.000 1.000 1.000 1.000 .302 1.000 .082 .329 .647 .874 . 9 7r .996
.05
.10
C
.20
.30
.40
.50
.60
.70
.80
.90
.95
I 2 3 4 5 6 7 8
.663 .943 .994 r.000 1.000 1.000 1.000 1.000 1.000
.430 .813 .962 .99s 1.000 1.000 1.000 1.000 1.000
.168 .058 .or7 .503 .255 .106 .797 .5s2 . 3 1 5 .944 .806 .s94 .990 .942 .826 .999 .989 .9s0 1.000 .999 .99r 1.000 1.000 .999 1.000 1.000 1.000
.000 .035 .009 .001 .r45 .050 .011 .058 .363 . r 74 .637 .406 .r94 .85s .58s .448 .965 .894 .74s .996 .983 .942 1.000 1.000 1.000
.000 .000 .000 .000 . 0 0 1 .000 . 0 1 0 .000 .056 .005 .203 .038 .497 . 1 8 7 .832 .s70 1.000 1.000
.000 .000 .000 .000 .000 .006 .0s7 .337 1.000
I 2 3 4 5 6 7 8 9
.630 .929 .992 .999 1.000 1.000 1.000 r.000 1.000 1.000
.387 .775 .947 .992 .999 1.000 1.000 1.000 r.000 1.000
.040 .0r0 .002 .000 .000 .r34 .020 .004 .000 .436 .196 .07r .004 .738 .463 .232 .090 .ozs .730 .483 .254 .099 .025 .9r4 .980 .901 .733 .500 .267 .O99 .270 .sr7 .901 .746 .997 .97s .s37 . 9 1 0 .768 1.000 .996 . 9 7s 1.000 1.000 .996 .980 .929 .804 1.000 1.000 1.000 .998 .990 .950 1.000 1.000 1.000 1.000 1.000 1.000
.000 .000 .000 .000 .000 .000 .003 .000 .020 .001 .086 .008 .262 .0s3 .564 .225 .866 .613 1.000 1.000
.000 .000 .000 .000 .000 .001 .008 .A7r .370 1.000
80
n
.004 .001
n-
l0 0 I 2 3 4 5 6 7 8 9 10
.s99 .914 .988 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.349 .736 .930 .987 .998 1.000 1.000 1.000 1.000 1.000 1.000
.r07
.028 .L,49 .376 . 5 7 8 t83 .650 .879 .850 .967 .9s3 .994 .989 .999 .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.000 .000 .002 .000 . 0 1 2 .002 .r72 .05s . 0 1I . r 6 6 .o47 .377 .367 . 1 5 0 .623 . 6 1 8 .3s0 .828 .833 .617 .945 .954 . 8 5 1 .989 r.000 .999 .994 .972 1.000 1.000 1.000 1.000
.000 .000 .000 .000 .000 .000 .001 .000 .006 .000 .033 .002 .T2T . 0 1 3 .322 .070 .624 .264 .893 . 6 5 1 1.000 1.000
.000 .000 .000 .000 .000 .000 .001 .o12 .086 .401 r.000
n-
ll
.s69 .898 .98s .998 1.000
.3r4 .697 .910 .981 .997 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.020 .086 .113 .322 .313 .617 .570 .839 .790 .9s0 .922 .988 .978 .998 .996 1.000 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.000 .004 .006 .030 .033 .rr9 . 11 3 .296 .274 .533 .s00 .7s3 .726 .901 .887 . 9 7r .967 .994 .994 .999 1.000 1.000 1.000 1.000
.000 .000 .000 .00r .006 .001 .029 .004 .o99 .022 .247 .078 .457 .210 .704 .430 .881 .687 .970 .887 .996 .980 1.000 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .002 .000 .Or2 .000 .0s0 .003 .16r .019 .383 .090 .678 .303 .686 .9r4 1.000 1.000
.000 .000 .000 .000 .000 .000 .000 .002 .015 .ro2 .431 1.000
0 I 2 3 4 5 6 7 8 9 l0 11
r.000 1.000 1.000 1.000 1.000 r.000 1.000
.005 .o46 .r67 .382 .633 .834 .945 .988 .998
.001 .011 .055
n-
12 0 I 2 3 4 5 6 7 8 9 10 ll I2 130 I 2 "3 4 5 6 7 8 9 l0
n t2 l3 n-
14 0 I
2 3 4 5 6 7 8 9 l0 1l T2 l3 t4
.05
.10
.20
.s40 .882 .980 .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.282 .659 .889 . 9 74 .996 .999 1.000 1.000 1.000 r.000 1.000 1.000 1.000
.069 .27s .s58 .79s .927 .981 .996 .999 1.000 1.000 1.000 1.000 1.000
.513 .86s . 9 7s .997 1.000 1.000 1.000 r.000 r.000 1.000 r.000 r.000 1.000 1.000
.254 .621 .866 .966 .994 .999 1.000 1.000 1.000 1.000 1.000 r.000 r.000 1.000
.488 .229 .847 .585 .970 .842 .996 .9s6 1 . 0 0 0 .991 1.000 .999 1 . 0 0 0 1.000
r.000 1.000 r.000 1.000 1.000 r.000 1.000 1.000
1.000 r.000 r.000 1.000 1.000 1.000 1.000 1.000
.30
.40
.50
.60
.70
.014 .oo2 .000 .000 .000 .08s .020 .003 .000 .000 .2s3 .083 .019 .003 .000 .493 .225 .073 .0ls .002 .724 .438 .r94 .os7 .009 .882 .66s .387 .158 .039 .96r .842 .613 . 3 3 s . l 1 8 .99r .943 .806 .s62 .276 .998 .98s .927 .77 s .so7 1.000 .997 .981 .9r7 .747 1.000 1.000 .977 . 9 8 0 . 9 1 5 1.000 1.000 1.000 .998 .986 1.000 1.000 1.000 1.000 1.000
.80
.90
.95
.000 .000 .000 .000 .001 .004 .019 .073
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .000 .004 .000 .z}s .026 .002 .442 .l l l .020 .72s .341 .l l8 .93r . 7 1 8 . 4 6 0 1.000 1.000 1.000
.05s .234 .502 . 74 7 .901 .970 .993 .999 1.000 1.000 1.000 1.000 1.000 1.000
.010 .001 .000 .000 .000 .000 .000 .000 .064 .013 .002 .000 .000 .000 .000 .000 .202 .058 .011 .001 .000 .000 .000 .000 .42r .169 .046 .008 .001 .000 .000 .000 .654 .353 .133 .032 .004 .000 .000 .000 .835 .s74 .29r . 0 9 8 . 0 1 8 . 0 0 1 .000 .000 .938 .77r .500 .229 .062 .007 .000 .000 .982 .902 .709 .426 .l6s .030 .001 .000 .996 .968 .867 .647 .346 .099 .006 .000 .999 .992 .9s4 . 8 3 1 . s 7 9 . 2 5 3 .034 .003 1.000 .999 .989 .942 .798 .498 .t34 .025 r.000 1.000 .998 .987 .936 .766 .379 _135 1.000 1.000 r.000 . 9 9 9 . 9 g o . 9 4 5 .746 .487 1.000 1.000 1.000 1.000 1.000 1.000 r.000 1.000
.044 .198 .448 .698 .870 .9s6 .988 .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.007 .001 .000 .000 .000 .000 .000 .000 .047 .008 .001 .000 .000 .000 .000 .000 .16l .040 .006 .001 .000 .000 .000 .000 .35s .r24 .029 .004 .000 .000 .000 .000 .584 .279 .090 .018 .002 .000 .000 .000 .781 .486 .2r2 .0s8 .008 .000 .000 .000 .907 .692 .39s .150 .031 .002 .000 .000 .969 .8s0 .60s .308 .093 .or2 .000 .000 .992 .942 .788 .514 .2r9 .044 .001 .000 .998 .982 .910 . 7 2 r . 4 1 6 . 1 3 0 .009 .000 l .000 .996 .97| .876 .64s .302 .o44 .004 r.000 .999 .994 .960 .839 .552 .158 .030 1.000 1.000 .999 .992 .953 .802 .415 .I 53 1.000 1.000 1.000 .999 .993 .9s6 . 7 7| .5r2 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Toble 2 lContinuedl .05
.10
.24
.30
.40
.50
.70
.80
.90
.95
.000 .000 .000 .000 .001 .004 .015 .050 .131 .278 .485 .703 .873 .96s .995 1.000
.000 .000 .000 .000 .000 .000 .001 .004 .018 .061 .164 .3s2 .602 .833 .965 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .002 .013 .0s6 .184 .451 .794 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .005 .036 . r 7| .537 1.000
.64
C
n-15
n_
0 I 2 3 4 5 6 7 8 9 l0 1l t2 13 T4 l5 L5 0 I 2 3 4 5 6 7 8 9 10 11 T2 l3 T4 l5 T6
.463 .829 .964 .99s .999 1.000 1.000 r.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.206 .549 .8l6 .944 .987 .998 1.000 1.000 1.000 r.000 r.000 1.000 1.000 1.000 1.000 1.000
.03s .167 .398 .648 .836 .939 .982 .996 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.440 . 8 1I
.185 .515 .789 .932 .983 .997 .999 1.000 1.000
.028 .141 .352 .s98 .798 .918 .973 .993 .999
r.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
r.000 1.000 r.000 1.000 1.000 1.000 1.000 1.000
.9s7 .993 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.000 .000 .000 .002 .009 .034 .095 .2r3 .390 .s97 .783 .909 .973 .99s r.000 1.000
.00s .035 .r27 .297 .sls .722 .869 .9s0 .985 .996 .999 1.000 1.000 1.000 1.000 1.000
.000 .005 .027 .091 .2r7 .403 .610 .787 .90s .966 .991 .998 1.000 1.000 1.000 1.000
.000 .000 .004 .018 .059 .15I 304 .s00 .596 .849 ,941 .982 .996 1.000 1.000 1.000
.003 .o26 .099 .246 .450 .660 .825 .926 . 9 74 .993 .998
.000 .003 .018 .065 .r67 .329 .527 . 71 6 .858 .942 .981 .99s .999 1.000 1.000 1.000 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .002 . 0 1 1 .001 .000 .005 , .000 .038 . 0 1 9 .002 .105 .058 .007 .227 .026 .t42 .402 .074 .598 --*IE4' ,473 . r 75 .773 .340 .67r .89s .833 , .550 .962 .93s \ . 7 5 4 .989 .901 .998 1.000 .997 . 9 74 1.000 1.000 .997 1 . 0 0 0 1.000 1.000
r.000 r.000 1.000 1.000 1.000 1.000
.2sz
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .000 .000 .007 .000 .000 .o27 .001 .000 .082 .003 .000 .202 .017 .001 .402 .068 .007 .648 .2TL .043 .859 .485 . 1 8 9 .972 . 81 5 .560 1.000 1.000 1.000
Toble 2 lContinuedl .05 n-
c 17 0 I 2 3 4 5 6 7 8 9 10 ll l2 l3 t4 l5
.418 .792 .950 .99r .999 1.000 1.000 1.000 r.000 1.000 1.000 r.000 1.000 1.000 1.000 1.000 r 6 1.000 T 7 1.000
n _ 18 0 'l 2 3 4 5 6 7 8 9 l0
1r t2 r3 t4 l5
r6 L7 l8
.397 .774 .942 .989 .998 1.000 1.000 r.000 1.000 1.000 r.000 1.000 1.000 1.000 r.000 r.000 1.000 1.000 r.000
.10
.20
.30
.40
.167 .482 .762 .9r7 .978 .995 .999 1.000 1.000 1.000 r.000 1.000 r.000 1.000 1.000 1.000 1.000 r.000
.023 .l 18 .310 .549 .7s8 .894 .962 .ggg .gg7 ! 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.002 .ot9 .077 .202 .389 .597 .77s ,.995 .960 .987 .997 .999 1.000 1.000 1.000 1.000 1.000 1.000
.000 .002 .or2 .046 .126 .264 .448 .64r .801 .908 .965 .989 .997 1.000 1.000 1.000 1.000 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .000 .000 .000 .000 .000 .006 .000 .000 .000 .000 .000 .025 .003 .000 .000 .000 .000 .072 .01I .001 .000 .000 .000 .166 .03s .003 .000 .000 .000 . 3 1 5 .092 .013 .000 .000 .000 .s00 .r99 .040 .003 .000 .000 .685 .359 .105 .01I .000 .000 .834 .552 .225 .038 .001 .000 .928 .736 .403 .106 .00s .000 .97s .874 .61I .242 .o22 .001 .994 . 9 s 4 . 7 9 8 . 4 s r .083 .009 .999 .988 .923 .690 .238 .050 1.000 .998 .981 .882 . 5 1 8 . 2 0 8 1.000 1.000 .998 .977 .833 .582 1.000 1.000 1.000 1.000 1.000 1.000
.150 .450 .734 .902 .972 .994 .999 1.000 1.000 r.000 r.000 r.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.0lg_ .002 ', .O9g .014 .27L .060 .501 .165 .716 .333 .867 .s34 .949 .722 .984 .8s9 .996 .940 .999 .979 1.000 .994 1.000 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.000 .001 .008 .033 .094 .209 .374 .563 .737 .865 .942 . .980 .994 .999 1.000 r.000 1.000 1.000 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .000 .000 .000 .000 .000 .004 .000 .000 .000 .000 .000 .015 .001 .000 .000 .000 .000 .048 .006 .000 .000 .000 .000 . 1 1 9 .020 .001 .000 .000 .000 .240 0s8 .006 .000 .000 .000 .407 .l3s .02r .001 .000 .000 .593 .263 .060 .004 .000 .000 .760 . 4 3 7 .l 4 l . 0 1 6 .000 .000 ', .626 .278 .0sl .88'ltr .001 .000 .952 .79r .466 .133 .006 .000 .98,5 .906 .667 .284 .028 .002 .996 .967 .83s .499 . 0 9 8 . 0 1 1 .999 .992 .940 .729 .266 .058 1.000 .999 .986 .901 - .550 .226 1.000 1.000 .998 .982 .850 .603 1.000 1.000 1.000 1.000 1.000 1.000
.50
.60
.70
.80
.90
.95
.10
.20
.30
.377 .75s .933 .987 .998 r.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000 r 7 1.000 1 8 1.000 T 9 1.000
.135 .420 .705 .885 .96s .99r .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000 1.000 1.000
.014 .083 .237 .4s5 .673 .837 .932 .977 .993 .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.001 .010 .046 .133 .282 .474 .666 .818 .916 .967 .989 .997 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.358 .736 .92s .984 .997 r.000 r.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
.r22 .392 .677 .867 .9s7 .989 .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000 1.000 1.000 1.000 1.000 1.000 1.000
.012 .069 .206
.001 .008 .035
.4 rr
.ro 7
.630 .804 .9r3 .96g .990 .?97 .999 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000
.238 .416 .608 . *772 .887 .9s2 .983 .99s -"3qg-
.05
.50
.60
.70
.000 .001 .005 .023 .070 .163 .308 .488 .667 .814 .9r2 .96s .988 .997 .999 1.000 1.000 1"000 1.000 1.000
.000 .000 .000 .002 .010 .O32 .084 .180 .324 .500 .676 .820 .916 .968 .990 .998 1.000 1.000 1.000 1.000
.000 .000 .000 .000 .001 .003 .0r2 .03s .088 .186 .333 .512 .692 .837 .930 .977 .995 .999 1.000 1.000
.000 .000 .000 .000 .000 .000 .001 .003 .011 .033 .084 .182 .334 .526 .718 .867 .9s4 .990 .999 1.000
.000 .001 .004 .016 .051 .126 .250 .4t6 .596 .755 .872 -944 .979 .994 .998, r.000 1.000 1.000 1.000 1.000 1.000
.000 .000 .000 .001 .006 .021 .0s8 .r32 .252 .4r2 .s88 .748 .858 .942 .97,9 .994 .999 1.000 1.000 1.000 l.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .002 .000 .006 .000 .001 .02r .0s7 .005 .0r7 .r28 .245 .048, .404 *..-+.lA .584 .228 .392 .7so .s84 .874 .949 .762 .984 .893 .996 .96s .999 992 1.000 .999 1.000 r.000
.40
.80
C
n:19
0 I 2 3 4 5 6 7 8 9 10 11 l2 13 l4 15 T6
n
0 I 2 3 4 5 6 7 8 9 10 ll T2 l3 T4, 15
r6 r7 18 T9 20
r.000 1.000 1.000 1.000 1.000 r.000 1.000 1.000
.90
'.9s
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .002 .009 .035 .r 1 5 .29s .580 .86s 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .002 .013 .067 .245 .623 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .000 .003 .000 .010 .000 .O32 .000 .087 .002 .196 .011 .370 .043 .589 . . 1 3 3 .794 .323 .931 .608 .988 .878 1.000 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .003 .016 .075 .264 .642 1.000
.000 .000 .000 .000 .000 .000 .000 .000 .000 .AO2 .007 .023 .068 .163 .327 .545 .763 .9r7 .986 1.000
n
0 I 2 3 4 5 6 7 8 9 l0 ll L2 13 T4 15
r6 r7 18 T9 20 2l 22 23 24 25
.05
.10
.20
.277 .642 .873 .966 .993 .999
.072 .27r .s37 .764 .902 .967 .99r .998 r.000 r.000 1.000 1.000 r.000 1.000 r.000 1.000 1.000 r.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000 1.000
.004 .027 .098 .234 .42r .617 .780 .891 .9s3 .983 .994 .998 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 r.000 1.000 1.000 1.000
r.000 1.000 r.000 1.000 1.000 r.000 r.000 1.000 1.000 r.000 1.000 1.000 1.000 r.000 r.000 1.000 1.000 1.000 r.000 1.000
.30
.40
.50
.000 .000 .000 .002 .000 .000 .009 .000 .000 .033 .002 .000 .090 .009 .000 .193 .029 .002 .34r .074 .007 .512 .t54 .022 .677 .274 .054 .8tl .425 .l 15 .902 .s86 .2r2 .9s6 .732 .34s .983 .846 .500 .994 .922 .655 .998 .966 .788 1 . 0 0 0 .987 .88s 1 . 0 0 0 .996 .946 1 . 0 0 0 .999 .978 1 . 0 0 0 1.000 .993 1.000 1.000 .998 1 . 0 0 0 1.000 1.000
r.000 1.000 1.000 r.000 1.000
r.000 r.000 1.000 1.000 1.000
1.000 1.000 1.000 1.000 1.000
.60
.70
.80
.90
.9s
.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .001 .000 .000 .000 .000 .004 .000 .000 .000 .000 .013 .000 .000 .000 .000 .034 .002 .000 .000 .000 .078 .006 .000 .000 .000 .154 .017 .000 .000 .000 .268 .044 .002 .000 .000 .4r4 .098 .006 .000 .000 . 5 7 s . 1 8 9 . O r 7 .000 .000 .726 .323 .047 .000 .000 .846 .488 .109 .oo2 .000 .926 .659 .220 .009 .000 .97| .807 .383 .033 .001 . 9 9 r . 9 1 0 . s 7 9 .098 .007 .998 .967 .766 .236 .034 r.000 .99r .902 .463 .r27 1.000 .998 .973 .729 .358 1.000 1.000 .996 928 .723 r.000 1.000 1.000 1.000 1.000
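Any entry of Table 2 can be recomputed from the defining sum $P[X \le c] = \sum_{x=0}^{c} \binom{n}{x} p^x (1-p)^{n-x}$. The following is a minimal sketch using only the standard library; the function name `binomial_cdf` is an illustrative assumption.

```python
from math import comb


def binomial_cdf(c, n, p):
    """Return P[X <= c] for a binomial distribution with n trials and success probability p."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(c + 1))


# Recompute a few entries and round to three decimals, as in Table 2.
for c, n, p in [(2, 5, 0.30), (3, 10, 0.50), (1, 2, 0.50)]:
    print(n, p, c, round(binomial_cdf(c, n, p), 3))
```

Rounding to three decimals matches the precision of the printed table; the unrounded value is available for work that needs more digits.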
Table 3 Standard Normal Probabilities
-
3.5 3.4 3.3 3.2 3.1 3.0
- 2.9 -2.8 - 2.7 - 2.6 -2.5 - 2.4 - 2.3 -2.2
- 2.r - 2.0
- r.9
- 1.8
- r.7 - r.6 - 1.5
- r.4 - 1.3 - r.2 - l.l - 1.0
-.9 -.8 -.7 -.6 -.5 -.4 -.3 -.2 -.1 -.0
5*
p,)o2
I
.00
.01
.02
.03
.o4
.05
.06
.a7
.08
.09
.0002 .0003 .000s .0007 .0010 .0013
.0002 .0003 .0005 .0007 .0009 .0013
.0002 .0003 .0005 .0006 .0009 .0013
.0002 .0003 .0004 .0006 .0009 .0012
.0002 .0003 .0004 .0006 .0008 .0012
.0002 .0003 .0004 .0006 .0008 .0011
.0002 .0003 .0004 .0006 .0008 . 0 0 r1
.0002 .0003 .0004 .0005 .0008 .0011
.0002 .0003 .0004 .0005 .0007 .0010
.0002 .0002 .0003 .0005 .0007 .0010
.0019 .0026 .0035 .0047 .00601 .0082 .0107 .0139 .0179 .0228
.0018 .0025 .0034 .004s .0060 .0080 .0104 .0136 .0r74 .0222
.0018 .0017 .0024 .0023 .0033 .oo32 .0044 .0043 .0059 .00s7 .0078 .0075 .0102 .0099 .0132 .0129 .0170 .0166 .0217 .0212
.0016 .0023 .0031 .0041 .0055 .0073 .0096 .0125 .0162 .0207
.0016 .0022 .0030 .0040 .0054 .0071 .0094 .0122 .0158 .0202
.0015 .0021 .oo29 .0039 .00s2 .0069 .009r . 0 11 9 . 0 15 4 .0197
.0015 .0021 .0028 .0038 .0051 .0068 .0089 . 0 11 6 .0150 .0192
.0014 .0020 .0027 .0037 .0049 .0066 .0087 . 0 11 3 .0146 . 0 18 8
.0014 .0019 .0026 .0036 .0048 .0064 .0084 . 0 11 0 .0143 .0183
.02s0 .0314 .0392 .0485 .0594 .o72r .0869 .1038 .1230 .r446
.0244 .0307 .0384 .0475 .0582 .0708 .0853 .1020 .1210 .r423
.o239 .0301 .o375 .0465 .0571 .0694 .0838 .1003 .r190 .1401
.0233 .0294 .0367 .0455 .0559 .0681 .0823 .0985 . 11 7 0 .r379
.1660 .r922 .2205 .25t4 .2843 .3r92
.163s .1894 .2177 .2483 .2810
.l6l I .1867 .2148 .245r .2776
.3r 56 .3520 .3897 .4286 .468r
.3r2r .3483 .38s9 .4247 .464r
.0287 .03s9 .o446 .0548 .0668 .0808 .0968 .1151
.0281 .0351 .0436
.0274 .0344 .0427 .0s37 .0526 .0655 .0643 .o793 .o77 8 .09s1 .0934 . 11 3 1 . 1 1 1 2 . r 3 s 7 . 1 3 3 5 . 1 31 4 . 1 5 8 7 .1562 .1539
.0268 .0336 .0418 .05r6 .0630 .o764 .09r8 .1093 .r292 .1515
.o262 .0329 .0409 .0s05 .06r8 .o749 .0901 .1075 .1277 .r492
.0256 .0322 .040r .049s .0606 .0735 .0885 .1056 .1251 .r469
.1841 .2tr9 .2420 .2743 .308s .3446 .3821 .4207 .4602 .5000
.1814 .2090 .2389 .2709 .3050 .3409 .3783 .4168 .4562 .4960
.1762 .2033 .2327 .2643 .298r .3336 .3707 .4090 .4483 .4880
.r736 .2005 .2297 .26rr .2946 .3300 .3669 .4052 .4443 .4840
. 1 7 1l .r977
.1788 .206r .2358 . 2 6 76 .3015 .3372 .374s .4129 .4522 .4920
.2578 .2912 .3264 .3632 .4013 .44O+ .480r
.2546 .2877 .3228 .3594 .3ss7 .3e7 .3936 + .4354,., .4325 .476r .472r
Toble 3 lContinuedl .00
.01
.02
.03
.04
.05
.06
.07
.08
.5000 .s398 .s793 .6179 .6554 .6915 .7257 .7s80 .7881 . 81 5 9
.s040 .s438 .s832 .6217 .659r .69s0 .729r .76rl .7910 . 81 8 6
.s080 .5478 .5871 .6255 .6628 .598s .7324 .7642 .7939 .82t2
.5120 .5s17 .5910 .6293 .6664
.70r9 .73s7 .7673 .7967 .8238
.5160 . 5 5 s7 .5948 .6331 .6700 .70s4 .7389 .7703 .7995 .8264
.5r99 .5596 .s987 .6368 .6736 .2088 .7422 .7734 .8023 .8289
.5239 .s636 .6026 .6406 .6772
.s279 . 5 6 7s .6064 .6443 .6808
.5359 . s 75 3 .614r .6s17 .6879 .7r90 .7224 .7sr7 .7s49 .7823 .7852 .8106 .8133 .836s .8389
r.9
.8413 .8643 .8849 .9032 .9192 .9332 .9452 .9s54 .964r . 9 71 3
.8438 .8665 .8869 .9049 .9207 .934s .9463 .9564 .9649 . 9 71 9
.8451 .8686 .8888 .9066 .9222 .93s7 .9474 .9s73 .96s6 .9726
.8485 .8708 .8907 .9082 .9236 .9370 .9484 .9s82 .9664 .9732
.8508 .8729 '8e2s .9099 .925r .9382 .9495 .959r .967r
.8531 .8749 @ gTts .926s .9394 .9505 .9s99 .9678
2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9
.9772 .9821 .9861 .9893 .9918 .9938 .99s3 996s .9974 .998r
.9778 .9826 .9864 .9896 .9920 .9940 .99ss .9966 .997s .9982
.9783 .9830 .9868 .9898 .9922 .994r .99s6 .9967 .9976 .9982
3.0 3.1 3.2 3.3 3.4 3.5
.9987 .9990 .9993 .999s .9997 .9998
.9987 .999r .9993 .999s .9997 .9998
.9987 .999r .9994 .999s .9997 .9998
1.0 l.l
t.2 1.3 t.4 1.5 1.6
r.7 1.8
.7r23 .74s4 . 7 76 4 .8051 .8315
.8ss4 .8770
. 7r s 7 .7486 .7794 .8078 .8340
.09
.5319 . s 71 4 .6103 .6480 .6844
.s7ls .e744ftryq
.8s77 .8790 .8e80 .9147 .9292 .9418 .9s2s .9616 .9693
.s7s6
.8599 .8810 .8997 .9162 .9306 .9429 .9535 .9625 .9699 .976r
.862r .8830 .901s .9177 .9319 .944r .954s .9633 .9706 .9767
.9788 .9834 .987| .990r .9925 .9943 .99s7 .9968 .9977 .9983
.9793 .9838 .9875 .99A4 .9927 .994s .99s9 .9969 .9977 .9984
.9798 .9842 .9878 .9906 .9929 .9946 .9960 .9970 .9975 .9984
.9803 .9846 .9881 .9909 .9931 .9948 .996r .997| .9979 .998s
.9808 .98s0 .9884 .99rr .9932 .9949 .9962 .9972 .9979 .998s
.9812 .9854 .9887 .9913 .9934
.9817 .9857 .9890 .9916 .9936
.9963 .9973 .9980 .9986
.9964 .9974 .9981 .9986
.9988 .999r .9994 .9996 .9997 .9998
.9988 .9992 .9994 .99e6 .9997 .9998
.9989 .9992 .9994 .9996 .9997 .9998
.9989 .9992 .9994 .9996 .9997 .9998
.9989 .9992 .999s .9996 .9997 .9998
.9990 .9993 .999s .9996 .9997 .9998
.9990 .9993 .999s .9997 .9998 .9998
9 t3l .9279 .9406 .9st s .9608 .9686
.99sr .99s2
Table 4 Percentage Points of t Distributions

d.f.   α = .25    .10     .05     .025     .01    .00833   .00625    .005
  1      1.000   3.078   6.314  12.706  31.821   38.190   50.923   63.657
  2       .816   1.886   2.920   4.303   6.965    7.649    8.860    9.925
  3       .765   1.638   2.353   3.182   4.541    4.857    5.392    5.841
  4       .741   1.533   2.132   2.776   3.747    3.961    4.315    4.604
  5       .727   1.476   2.015   2.571   3.365    3.534    3.810    4.032
  6       .718   1.440   1.943   2.447   3.143    3.287    3.521    3.707
  7       .711   1.415   1.895   2.365   2.998    3.128    3.335    3.499
  8       .706   1.397   1.860   2.306   2.896    3.015    3.206    3.355
  9       .703   1.383   1.833   2.262   2.821    2.933    3.111    3.250
 10       .700   1.372   1.812   2.228   2.764    2.870    3.038    3.169
 11       .697   1.363   1.796   2.201   2.718    2.820    2.981    3.106
 12       .695   1.356   1.782   2.179   2.681    2.779    2.934    3.055
 13       .694   1.350   1.771   2.160   2.650    2.746    2.895    3.012
 14       .692   1.345   1.761   2.145   2.624    2.718    2.864    2.977
 15       .691   1.341   1.753   2.131   2.602    2.694    2.837    2.947
 16       .690   1.337   1.746   2.120   2.583    2.673    2.813    2.921
 17       .689   1.333   1.740   2.110   2.567    2.655    2.793    2.898
 18       .688   1.330   1.734   2.101   2.552    2.639    2.775    2.878
 19       .688   1.328   1.729   2.093   2.539    2.625    2.759    2.861
 20       .687   1.325   1.725   2.086   2.528    2.613    2.744    2.845
 21       .686   1.323   1.721   2.080   2.518    2.601    2.732    2.831
 22       .686   1.321   1.717   2.074   2.508    2.591    2.720    2.819
 23       .685   1.319   1.714   2.069   2.500    2.582    2.710    2.807
 24       .685   1.318   1.711   2.064   2.492    2.574    2.700    2.797
 25       .684   1.316   1.708   2.060   2.485    2.565    2.692    2.787
 26       .684   1.315   1.706   2.056   2.479    2.559    2.684    2.779
 27       .684   1.314   1.703   2.052   2.473    2.552    2.676    2.771
 28       .683   1.313   1.701   2.048   2.467    2.546    2.669    2.763
 29       .683   1.311   1.699   2.045   2.462    2.541    2.663    2.756
 30       .683   1.310   1.697   2.042   2.457    2.536    2.657    2.750
 40       .681   1.303   1.684   2.021   2.423    2.499    2.616    2.704
 60       .679   1.296   1.671   2.000   2.390    2.463    2.575    2.660
120       .677   1.289   1.658   1.980   2.358    2.428    2.536    2.617
  ∞       .674   1.282   1.645   1.960   2.326    2.394    2.498    2.576
Table 5 Percentage Points of $\chi^2$ Distributions
I 2 3 4 5 6 7 8 9 l0 ll t2 l3 t4 l5 l5 T7 l8 t9 20
2r 22 23 24 25 26 27 28 29 30 40 50 60 70 80 90 100
.99
.97s
.9s
.0002 .02 .11 .30 .55 .87 1.24 1.65 2.O9
.001 .05 .22 .48 .83 r.24 r.59 2.18 2.70
.004 .10 .35 .71 t.l5 1.64 2.r7 2.73 3.33
2.s6 3.05 3.57 4.rI 4.65 5.23 5.81 6.4r 7.Or 7.63
3.24 3.81 4.40 s.Ol 5.62 6.26 6.90 7.s6 8.23 8.90
3.94 4.57 5.23 5.89 6.s7 7.26 7.96 8.67 9.39 10.12
8.26 8.90 9.54 10.20 10.86 l 1.52 12.20 12.88 13.56 14.26
9.59 10.28 10.98 rr.69 12.40 13.11 13.84 14.57 1s.30 16.04
14.95 22.16
16.78 24.42 32.3s 40.47 48.7 5 57.rs 65.64 74.22
29.7r 37.48 45.44 53.54 6r.75 70.06
.90
.50
.02
.45
.2r .s8
r.39 2.37
.10 2.7L
.05
.025
.01
3.36 4.35 5.35 6.35 7.34 8.34
7. 7 8 9.24 10.64 r2.o2 13.36 14.68
3.84 s.99 7.81 9.49 11.07 12.59 14.07 ls.5l 15.92
5.58 6.30 7.04 7.79 8.55 9.31 10.09 10.86 11 . 6 5
9.34 10.34 rr.34 12.34 13.34 14.34 15.34 15.34 17.34 18.34
15.99 17.28 18.55 19.81 2r.06 22.3r 23.54 24.77 25.99 27.20
18.31 19.68 21.03 22.36 23.68 2s.00 26.30 2 7. 5 9 28.87 30.14
20.48 2r.92 23.34 24.74 26.12 27.49 28.8s 30.19 31.53 32.85
23.2r 24.72 26.22 27.69 29.t4 30.58 32.00 33.41 34.8I 36.19
10.85 I l.s9 12.34 13.09 13.85 14.6r 15.38 16.15 16.93 1 7. 7|
t2.44 t3.24 14.04 14.85 rs.66 15.47 17.29 l 8 . lI 18.94 19.77
19.34 20.34 21.34 22.34 23.34 24.34 25.34 26.34 27.34 28.34
28.4r 29.52 30.81 32.01 33.20 34.38 35.56 36.74 37.92 39.09
31.41 32.67 33.92 35.17 35.42 37.65 38.89 40.1I 4r.34 42.s6
34.17 3s.48 36.78 38.08 39.36 40.55 4r.92 43.19 44.45 45.72
3 7. 5 7 38.93 40.29 4r.64 42.98 44.3r 45.64 46.96 48.28 49.59
18.49 26.sr 34.76 43.19 5r .74 60.39 69.r3 77.93
20.60 29.05 37.69 46.46 s5.33 64.28 73.29 82.35
29.34 40.26 43.77 45.98 s0.89 3 9 . 3 4 5 1 . 8 1 ss.76 s9.34 63.69 49.33 63.17 6 7. s O 7r.42 75.15 s9.33 74.40 79.08 83.30 88.38 69.33 8s.53 90.53 9s.o2 100.43 79.33 96.s8 r01.88 t05.63 1t2.33 89.33 r07.57 l 1 3 . l 5 l l 8 . l 4 124.t2 99.33 I 18.50 124.34 t29.55 135.81
1.06 1.61 2.20 2.83 3.49 4.17 4.87
4.6r 6.25
5.02 7.38 9.35 I l.l4 12.83 t4.45 16.01 17.s3 19.02
6.63 9.2r I 1.34 13.28 15.09 16.81 18.48 20.09 2r.67
Table 6 Percentage Points of F Distributions
yasfr3tHre5835ehEX8sp€h cE€qEqqid\qff op oi qj d +.'j e.j e.i oi c.i e.i e.i c.i e.i e.i c.i e.i e.i c.i c.i I I I X::::' -i: -i -;
c.l
"'j
!f,-
N - ca I
\Q 9l S
i
"I
q
e-'l =
$
gl A \9 a
Lo d
N cO Q € Lo rO d -.:
v ro F-l - - 6 b . d c5 o c.t o\ ooF\ \o
= f \ t Y q q el q oC\ I q n i "C n?"t ..] i
r.o
q\ Lr \O st eO i
c . i c . i c . io i e . i c . i - j - j . j . i S9-tn$cococococrerc'rotoioic.ie.ic.ie.ie.ic.ie.ic.ie.i N - I r € O F\ @ N - $ q a ca og el € $ r \o tn eo et c) a a a Q € qn q qi i n r
i - -
d 6 \ - 6 ,6 r
+qqqAfr
C 6\qq 88 d SR fi a @ tn !+ $ co co ca e.l c.t e.l ol e.r oi e.i c.i e.i c.i e.i oi c.i c.i e.i e.i c.i e.i e.i c.i c.i e.i -..i -j -.i sI -
cl
el
qSEE*€qqIq€SqqqqEiEs$3\s*sRe=:8Res - ro erj
F(
d $ o
T I
aj e.i oi oi c.i c^ie.i e.i c.i e.i e.i 6i c.j -i J..i
Gi
".i..i.i
c.i e.i e.i -::
]
A\iESlqEEEq s8F\sN N= s 88s Affqv$ Jab$ "?EE c.ie.ie.ici e.ic.i.i.i J J J;.1;i .i:i..i ;.1.{.i I:: "'j
oi.d vi + +.'j,"j a v-
"i "i
N
rI{Hqi[iI8{q\EqqqqqEqqD€sqEsfr \Risx
H I el
-
\o v $ co co co ro ot c{ e.i e.i e.i c.i e.i c.i 6i c.i oi e.i e.i e.i c.i e.i c.i c.i c.i e.i e.i ci -
-.;
q aDEq qs i e E fi EEetjs Ie.j E 6 qXi q q qs qqc.i-ic.i_iQ "eq I vtaq + +.tj d c.i c.i e.i e.i ..i J J J e.i e.i c.i e.i c.i oi el ct et N e.i "'j
g $.e
"l ".1
6l
cqdIqfr EqDSeeqqEfl R€€eh33a$5S$s$$s== - \o !+ $ eo co eo eo co co e.i e.i c.i e.i e.i e.i ei e.i e.i e.i oi e.i oi e.i e.i ei oi e.i J J .l x
$ I
6l
rqiS€a\EqqEiEEE€6\rEE€ss€nh333sbsF
B 9 ^ r o o d e r i e . S c r i " . j e . j e o e r j c i c . i o i o i c . i o i e . i o i c . i e . i d a J J J J . i. . i " i ^ i I cl
Sh YFi QiqiSqqESqi€EEEEE$$8RR*RRRSGfr X9^sLod+..jcri.rj.rjc.j.rje.j.rjerje.ie.ie.ie.ic.ic.ie.ie.icie.ie.ic.ie.iJJJ.iJ cl
}:FqEXqEEiqQqSSNE:iiE€BEAXESEdSP€G !96\oIo$$$coeoeococrioe.jerjerjerico".j.rj"rj.o.rjc.ioie.ic.ie.ie.ioioic.ie.i
6'
ct
sc
6 9r cr
g
.o
\J
!FE{BiXYXiqAE{Eqqqqq\{qqsDs$s$Rs58 or \O ro tn \t
st -f
\f, eO cO crj cO erj erj e.j crj .A
".j
qq G o v?- - = d vj er r \e s ro r- o $ o\ ro i id N
.rj e.j co .tj
".j
,"j
"rj ".j
..j
".j
co erj ..j ,.j
€ r
d; \q e u?a e e a,€ $ I K F X R F R = 5 I 8 S * - \ e "q + ": + + \od cj'v!d d d d + + + + + + + + + + J + + + + + + + +=J ;j; -j
;
o o o l550
-
ci fD v
tn \o F- co o\ o
d
ct (Q t
ro \o F_ co 6 i
Q a: e! a $ a g .t 6l cit ct ct 6l rt
\ co o\ o ot c.{ 6i 6
o <
o 6
o I
^ g
Table 7 Selected Tail Probabilities for the Null Distribution of Wilcoxon's Rank-Sum Statistic

P = P[W_s ≥ x] = P[W_s ≤ x*]

Smaller Sample Size = 2

Larger Sample Size = 3
x       8     9    10
P    .200  .100     0
x*      4     3     2

Larger Sample Size = 4
x      10    11    12
P    .133  .067     0
x*      4     3     2

Larger Sample Size = 5
x      11    12    13    14
P    .190  .095  .048     0
x*      5     4     3     2

Larger Sample Size = 6
x      13    14    15    16
P    .143  .071  .036     0
x*      5     4     3     2

Larger Sample Size = 7
x      15    16    17    18
P    .111  .056  .028     0
x*      5     4     3     2

Larger Sample Size = 8
x      16    17    18    19    20
P    .133  .089  .044  .022     0
x*      6     5     4     3     2

Larger Sample Size = 9
x      18    19    20    21    22
P    .109  .073  .036  .018     0
x*      6     5     4     3     2

Larger Sample Size = 10
x      19    20    21    22    23
P    .136  .091  .061  .030  .015
x*      7     6     5     4     3

Smaller Sample Size = 3

Larger Sample Size = 3
x      13    14    15    16
P    .200  .100  .050     0
x*      8     7     6     5

Larger Sample Size = 4
x      16    17    18    19
P    .114  .057  .029     0
x*      8     7     6     5

Larger Sample Size = 5
x      18    19    20    21    22
P    .125  .071  .036  .018     0
x*      9     8     7     6     5

Larger Sample Size = 6
x      20    21    22    23    24    25
P    .131  .083  .048  .024  .012     0
x*     10     9     8     7     6     5

Larger Sample Size = 7
x      22    23    24    25    26    27    28
P    .133  .092  .058  .033  .017  .008     0
x*     11    10     9     8     7     6     5

Larger Sample Size = 8
x      24    25    26    27    28    29    30    31
P    .139  .097  .067  .042  .024  .012  .006     0
x*     12    11    10     9     8     7     6     5

Larger Sample Size = 9
x      27    28    29    30    31    32
P    .105  .073  .050  .032  .018  .009
x*     12    11    10     9     8     7

Larger Sample Size = 10
x      29    30    31    32    33    34    35
P    .108  .080  .056  .038  .024  .014  .007
x*     13    12    11    10     9     8     7

Smaller Sample Size = 4

Larger Sample Size = 4
x      22    23    24    25    26    27
P    .171  .100  .057  .029  .014     0
x*     14    13    12    11    10     9

Larger Sample Size = 5
x      25    26    27    28    29    30    31
P    .143  .095  .056  .032  .016  .008     0
x*     15    14    13    12    11    10     9

Larger Sample Size = 6
x      28    29    30    31    32    33
P    .129  .086  .057  .033  .019  .010
x*     16    15    14    13    12    11

Larger Sample Size = 7
x      31    32    33    34    35    36    37
P    .115  .082  .055  .036  .021  .012  .006
x*     17    16    15    14    13    12    11

Larger Sample Size = 8
x      34    35    36    37    38    39    40
P    .107  .077  .055  .036  .024  .014  .008
x*     18    17    16    15    14    13    12

Larger Sample Size = 9
x      36    37    38    39    40    41    42    43
P    .130  .099  .074  .053  .038  .025  .017  .010
x*     20    19    18    17    16    15    14    13

Larger Sample Size = 10
x      39    40    41    42    43    44    45    46    47
P    .120  .094  .071  .053  .038  .027  .018  .012  .007
x*     21    20    19    18    17    16    15    14    13

Smaller Sample Size = 5

Larger Sample Size = 5
x      34    35    36    37    38    39
P    .111  .075  .048  .028  .016  .008
x*     21    20    19    18    17    16

Larger Sample Size = 6
x      37    38    39    40    41    42    43
P    .123  .089  .063  .041  .026  .015  .009
x*     23    22    21    20    19    18    17

Larger Sample Size = 7
x      41    42    43    44    45    46    47
P    .101  .074  .053  .037  .024  .015  .009
x*     24    23    22    21    20    19    18

Larger Sample Size = 8
x      44    45    46    47    48    49    50    51
P    .111  .085  .064  .047  .033  .023  .015  .009
x*     26    25    24    23    22    21    20    19

Larger Sample Size = 9
x      47    48    49    50    51    52    53    54    55
P    .120  .095  .073  .056  .041  .030  .021  .014  .009
x*     28    27    26    25    24    23    22    21    20

Larger Sample Size = 10
x      51    52    53    54    55    56    57    58    59
P    .103  .082  .065  .050  .038  .028  .020  .014  .010
x*     29    28    27    26    25    24    23    22    21

Smaller Sample Size = 6

Larger Sample Size = 6
x      47    48    49    50    51    52    53    54
P    .120  .090  .066  .047  .032  .021  .013  .008
x*     31    30    29    28    27    26    25    24

Larger Sample Size = 7
x      51    52    53    54    55    56    57    58    59
P    .117  .090  .069  .051  .037  .026  .017  .011  .007
x*     33    32    31    30    29    28    27    26    25

Larger Sample Size = 8
x      55    56    57    58    59    60    61    62    63
P    .114  .091  .071  .054  .041  .030  .021  .015  .010
x*     35    34    33    32    31    30    29    28    27

Larger Sample Size = 9
x      59    60    61    62    63    64    65    66    67    68
P    .112  .091  .072  .057  .044  .033  .025  .018  .013  .009
x*     37    36    35    34    33    32    31    30    29    28

Larger Sample Size = 10
x      63    64    65    66    67    68    69    70    71    72    73
P    .110  .090  .074  .059  .047  .036  .028  .021  .015  .011  .008
x*     39    38    37    36    35    34    33    32    31    30    29

Smaller Sample Size = 7

Larger Sample Size = 7
x      63    64    65    66    67    68    69    70    71
P    .104  .082  .064  .049  .036  .027  .019  .013  .009
x*     42    41    40    39    38    37    36    35    34

Larger Sample Size = 8
x      67    68    69    70    71    72    73    74    75    76
P    .116  .095  .076  .060  .047  .036  .027  .020  .014  .010
x*     45    44    43    42    41    40    39    38    37    36

Larger Sample Size = 9
x      72    73    74    75    76    77    78    79    80    81    82
P    .105  .087  .071  .057  .045  .036  .027  .021  .016  .011  .008
x*     47    46    45    44    43    42    41    40    39    38    37

Larger Sample Size = 10
x      76    77    78    79    80    81    82    83    84    85    86    87
P    .115  .097  .081  .067  .054  .044  .035  .028  .022  .017  .012  .009
x*     50    49    48    47    46    45    44    43    42    41    40    39

Smaller Sample Size = 8

Larger Sample Size = 8
x      80    81    82    83    84    85    86    87    88    89    90
P    .117  .097  .080  .065  .052  .041  .032  .025  .019  .014  .010
x*     56    55    54    53    52    51    50    49    48    47    46

Larger Sample Size = 9
x      86    87    88    89    90    91    92    93    94    95    96
P    .100  .084  .069  .057  .046  .037  .030  .023  .018  .014  .010
x*     58    57    56    55    54    53    52    51    50    49    48

Larger Sample Size = 10
x      91    92    93    94    95    96    97    98    99   100   101   102
P    .102  .086  .073  .061  .051  .042  .034  .027  .022  .017  .013  .010
x*     61    60    59    58    57    56    55    54    53    52    51    50

Smaller Sample Size = 9

Larger Sample Size = 9
x     100   101   102   103   104   105   106   107   108   109   110   111   112
P    .111  .095  .081  .068  .057  .047  .039  .031  .025  .020  .016  .012  .009
x*     71    70    69    68    67    66    65    64    63    62    61    60    59

Larger Sample Size = 10
x     106   107   108   109   110   111   112   113   114   115   116   117   118   119
P    .106  .091  .078  .067  .056  .047  .039  .033  .027  .022  .017  .014  .011  .009
x*     74    73    72    71    70    69    68    67    66    65    64    63    62    61

Smaller Sample Size = 10

Larger Sample Size = 10
x     122   123   124   125   126   127   128   129   130   131   132   133   134   135   136
P    .109  .095  .083  .072  .062  .053  .045  .038  .032  .026  .022  .018  .014  .012  .009
x*     88    87    86    85    84    83    82    81    80    79    78    77    76    75    74

Adapted from: Kraft, C., and van Eeden, C., A Nonparametric Introduction to Statistics, The Macmillan Company, New York, 1968.
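The entries of Table 7 can be checked by brute-force enumeration, since under the null hypothesis every choice of ranks for the smaller sample is equally likely. The following is a minimal sketch under that assumption; the function names `rank_sum_upper_tail` and `mirror_point` are illustrative and do not come from the text.

```python
from itertools import combinations
from math import comb


def rank_sum_upper_tail(n_small, n_large, x):
    """Exact P[W_s >= x], assuming all rank assignments are equally likely.

    W_s is the sum of the ranks received by the smaller sample when the
    n_small + n_large observations are ranked together.
    """
    all_ranks = range(1, n_small + n_large + 1)
    favorable = sum(
        1 for ranks in combinations(all_ranks, n_small) if sum(ranks) >= x
    )
    return favorable / comb(n_small + n_large, n_small)


def mirror_point(n_small, n_large, x):
    """Return x*, the lower-tail point with P[W_s <= x*] = P[W_s >= x].

    The null distribution is symmetric about n_small * (N + 1) / 2,
    where N = n_small + n_large, so x + x* = n_small * (N + 1).
    """
    return n_small * (n_small + n_large + 1) - x


if __name__ == "__main__":
    # Smaller sample size 2, larger sample size 3, x = 8.
    print(round(rank_sum_upper_tail(2, 3, 8), 3), mirror_point(2, 3, 8))
```

Enumeration is practical for the sample sizes covered by the table (at most 184,756 rank subsets when both sizes are 10); larger samples would call for the normal approximation instead.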
Table 8 Selected Tail Probabilities for the Null Distribution of Wilcoxon's Signed-Rank Statistic

P = P[T+ ≥ x] = P[T+ ≤ x*]

n = 3
x       5     6     7
P    .250  .125     0
x*      1     0

n = 4
x       8     9    10    11
P    .188  .125  .062     0
x*      2     1     0

n = 5
x      12    13    14    15    16
P    .156  .094  .062  .031     0
x*      3     2     1     0

n = 6
x      17    18    19    20    21    22
P    .109  .078  .047  .031  .016     0
x*      4     3     2     1     0

n = 7
x      22    23    24    25    26    27    28
P    .109  .078  .055  .039  .023  .016  .008
x*      6     5     4     3     2     1     0

n = 8
x      27    28    29    30    31    32    33    34    35
P    .125  .098  .074  .055  .039  .027  .020  .012  .008
x*      9     8     7     6     5     4     3     2     1

n = 9
x      34    35    36    37    38    39    40    41    42
P    .102  .082  .064  .049  .037  .027  .020  .014  .010
x*     11    10     9     8     7     6     5     4     3

n = 10
x      40    41    42    43    44    45    46    47    48    49    50
P    .116  .097  .080  .065  .053  .042  .032  .024  .019  .014  .010
x*     15    14    13    12    11    10     9     8     7     6     5

n = 11
x      48    49    50    51    52    53    54    55    56    57    58    59
P    .103  .087  .074  .062  .051  .042  .034  .027  .021  .016  .012  .009
x*     18    17    16    15    14    13    12    11    10     9     8     7

n = 12
x      56    57    58    59    60    61    62    63    64    65    66    67    68
P    .102  .088  .076  .065  .055  .046  .039  .032  .026  .021  .017  .013  .010
x*     22    21    20    19    18    17    16    15    14    13    12    11    10

n = 13
x      64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79
P    .108  .095  .084  .073  .064  .055  .047  .040  .034  .029  .024  .020  .016  .013  .011  .009
x*     27    26    25    24    23    22    21    20    19    18    17    16    15    14    13    12

n = 14
x      73    74    75    76    77    78    79    80    81    82    83    84    85    86    87    88    89
P    .108  .097  .086  .077  .068  .059  .052  .045  .039  .034  .029  .025  .021  .018  .015  .012  .010
x*     32    31    30    29    28    27    26    25    24    23    22    21    20    19    18    17    16

n = 15
x      83    84    85    86    87    88    89    90    91    92    93    94    95    96    97    98    99   100   101
P    .104  .094  .084  .076  .068  .060  .053  .047  .042  .036  .032  .028  .024  .021  .018  .015  .013  .011  .009
x*     37    36    35    34    33    32    31    30    29    28    27    26    25    24    23    22    21    20    19

Adapted from: Kraft, C., and van Eeden, C., A Nonparametric Introduction to Statistics, The Macmillan Company, New York, 1968.
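The tail probabilities of the signed-rank statistic in Table 8 can be reproduced by counting: under the null hypothesis each rank 1, ..., n carries a plus sign with probability 1/2 independently, so T+ is the sum of a uniformly chosen subset of the ranks. The sketch below counts subsets by their sum with a small dynamic program; the function name `signed_rank_upper_tail` is illustrative and not taken from the text.

```python
def signed_rank_upper_tail(n, x):
    """Exact P[T+ >= x] for sample size n, assuming no ties or zeros.

    counts[t] ends up holding the number of subsets of {1, ..., n}
    whose elements sum to t; each subset has probability 1 / 2**n.
    """
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the empty subset has sum 0
    for rank in range(1, n + 1):
        # Go downward so each rank is used at most once per subset.
        for total in range(max_sum, rank - 1, -1):
            counts[total] += counts[total - rank]
    return sum(counts[max(x, 0):]) / 2 ** n


if __name__ == "__main__":
    # n = 3, x = 5: the subsets {2, 3} and {1, 2, 3} qualify, so P = 2/8.
    print(signed_rank_upper_tail(3, 5))
```

By symmetry about n(n + 1)/2, the matching lower-tail point is x* = n(n + 1)/2 - x.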
Answers to Selected Odd-Numbered Exercises

CHAPTER 2

3.1 FREQUENCY TABLE FOR BLOOD TYPE

Blood Type   Frequency   Relative Frequency
O               16         .40 = 16/40
A               18         .45 = 18/40
B                4         .10 = 4/40
AB               2         .05 = 2/40
Total           40        1.00
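The relative frequencies in the blood type table are each count divided by the total of 40. A minimal sketch of that arithmetic follows; the counts are the ones given in the answer, while the function name `relative_frequencies` is only illustrative.

```python
def relative_frequencies(counts):
    """Divide each category count by the total number of observations."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Frequencies from the answer to Exercise 3.1.
blood_type_counts = {"O": 16, "A": 18, "B": 4, "AB": 2}
blood_type_relative = relative_frequencies(blood_type_counts)

if __name__ == "__main__":
    for category, value in blood_type_relative.items():
        print(f"{category:>2}  {blood_type_counts[category]:>2}  {value:.2f}")
```

The relative frequencies always sum to 1, which is a quick check on any table of this kind.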
3.7 (a) Height = (relative frequency)/width = .0050 for 0-25, = .0019 for 100-150
    (b) The distribution has a long right-hand tail.
4.1 (a) x̄ = 5, median = 5
    (b) x̄ = 28.5, median = 28.5
4.3 (a) x̄ = 1160.7, median = 950
    (b) For a typical salary, the median is best. Only one person makes more than x̄.
4.5 (a) Median = 153
    (b) Q1 = 135.5, Q3 = 166.5
4.9 (a) Company A. The average is highest and a superior machinist would make above the median.
5.1 (a) x̄ = 10, s = √18.67 = 4.32
    (c) s = √.286 = .535
5.5 (a) s = √1.30 = 1.14
5.7 (a) The interval x̄ ± 2s = (100.771, 199.479) contains proportion 38/40 = .95 of the observations.
5.11 min = 3.20, Q1 = 43.645, median = 60.345, Q3 = 84.87, max = 124.27
7.3 (c) Proportion .732 marry before 25 and proportion .057 marry after 30.
7.5 (b) Q1 = 51, median = 55, Q3 = 57.5
7.9 (a) x̄ = 69.91, s = 2.97, n = 35
    (d) Q1 = 65, median = 66.5, Q3 = 67
7.11 (b) 12/24 = .5
     (c) x̄ = .794 and s = .115
7.15 (a) Interval      188.61-290.59   137.61-341.58   86.69-392.57
     (b) Proportions   .G9g            .958            1.000
     (c) Guideline     .68             .95             .997
7.17 (a) Median = 4.505, Q1 = 4.30, Q3 = 4.70
     (b) 90th percentile = 4.985
7.21 (b) A frequency distribution would not show the systematic decrease over time.
7.23 (a) x̄ = 4.97, s = 2.00
     (b) x̄ = 4.507, s = .368
CHAPTER 3

2.1 (c) The pill seems to reduce the proportion of severe and the proportion of moderate cases of nausea.
2.5
                   Major
           H      S      B      P     Total
Male     .082   .286   .245   .102    .715
Female   0      .082   .122   .082    .286
Total    .082   .368   .367   .184   1.001

4.1 r = .820
4.5 r = .818
4.9 (c) Positive, you expect to gain additional points for each tournament entered.
5.1 Intercept = 10, slope = -3
5.3 (b) x = average No. of cigarettes smoked, y = carbon monoxide level
5.5 (b) ŷ = 6o.9r
6.1 Having an attorney improves the chances of getting an increase.
6.5 (b)
                   Hepatitis   No Hepatitis   Total
Vaccinated           .020         .980        1.000
Not vaccinated       .131         .869        1.000

6.7 r = .955
6.9 (c) ŷ = 25.9 .232x
6.11 r = .988
6.13 (d) Positive association; both depend on the economic climate.
6.19 ŷ = 56.38 + .323x
6.21 (b) r = .563
CHAPTER 4

2.1 (a) vi, (d) ii, iii, (f) v, iii
2.5 (a) S = {BJ, BL, JB, JL, LB, LJ}
    (b) A = {LB, LJ}, B = {JL, LJ}
2.7 Each 1/3.
3.3 (a) A = {(1,5), (2,4), (3,3), (4,2), (5,1)}
    (c) P(A) = 5/36, P(B) = %', P(C) = 1/2, P(D) = Vo
3.5 {N, IN, IIN, IIIN, IIII}
3.7 (a) Denoting "compliance", "borderline case", and "violation" by c, b, and v, respectively.
    (b) Trs
4.1 (b) (i) A
    (iii) A ∪ B = {e1, e2, e4, e5, e6}
4.3 (b) A ∪ B = {e1, e2, e4}, AB = {e1}
4.7 (a) P(A) = .6, P(B)
    (b) P(A) .4, P(A ∪ B) = .g
5.1 (a) rh, (b) 1/2, (c) 1/6, (d) rh, (e) z/e
5.3 (a) P(A)
    (b) Not independent
5.9 (a) .368, (b) .161, (c) .484
5.11 .99999
6.1 (a) 120, (e) 6545
6.3 (a) 66, (b) 63, (c) 1/22
6.5 .002
7.3 (a) {0, 1}, (b) {0, 1, . . . , 844}, (c) {t: gO
7.5 (a) 2, 6, (b) 6ho
7.9 1/2
7.11 (a) 'Ar, (b) 1/4
7.15 (b) 1/2
7.17 P(A) = .47, P(AB) = .13, P(A ∪ B) = .6g
7.21 (a) ABC, (b) A ∪ B ∪ C, (c) ABq tal Anc
7.23 P(A|B)
7.27 1/4
7.29 (b) .784
7.33 .999991
7.35 No
CHAPTER 5

2.1 (a) Discrete, (b) continuous, (c) discrete
2.7 Outcome      DDD   DDN   DND   NDD   DNN   NDN   NND   NNN
    Value of X    0     1     1
3.3 (a) No, (b) yes, (c) yes, (d) no
3.5 (a) 0, 1, 2, 3
    (b) x       0      1      2      3
        f(x)   6/24  11/24   6/24   1/24
    (c) 3/4
3.7 f(x): 12/35, 18/35, 4/35, 1/35
4 . 1 (a) Fr _ l, (b) 4.5
$s
5.1
X
.6,o
c2:
0
f(x\ 5.3
r6/sr
s.7 x .2r2
f(x)
s.9Mean
sd:
.024
.255
.509 .894
5.13 (a) .30, (b) E(X) = 3, sd(X) = 1.342
5.15 (b) x
f(x) 5 . r 9(a) x f(x) (b) E(x)
0 2/e
13 %
- 15 .1458
,E(X)-1 V6 -5
.3935
.3543
(c) No
CHAPTER 6

2.3 (a) Yes, P(S) = .5
    (b) Not independent
2.5 (a) Bernoulli model not appropriate
    (b) Appropriate
2.7 (c) .1063
3.3 (a) .36, (b) .216, (c) .432
3.5 (a) .279, (c) .483, (e) .840
3.7 P[X
3.9 (b) Mean = 2.5, sd
4.1 (b) Null hypothesis: Current rate = 6%
    Alternative hypothesis: Current rate ≠ 6%
4.5 (a) α
.012,p
(c)
.035,g
CI
4.7 (a) p
(b)x
No. of departures on time
    (c) R:
4.9 (a) H0: p
    (b) R: X > 12, α
    (c) P = .135
    (d) H0 not rejected
    (e) .421
5.1 (a) Plausible
    (d) Assumption of independence may be violated
5.3 .512
5.7 (a) .012, (b) .087, (c) mean
5.9 n
5.11 (a) .858, (c) .663, (e) 1.833
5.15 (a) H0: p
.2, Hl
p
( b )R : x > 9 (c) .,
s.le (a)Ho: (c)
.5
P
CI.
(d) H0 not rejected at α. P value = .5 is high so evidence against H0 is very weak.
5.21 (a) α
     (b) Power
5.23 (a) .05, (b) .15
.451
CHAPTER 7

1.3 The second interval has a higher probability.
1.5 Median = √2 = 1.414
    First quartile = 1, third quartile
1.7 (c) Z = (X - 656)/10
3.1 (a) .8790, (c) .0336
3.3 (b) .7016, (d) .6700
3.7 First quartile = -.675, third quartile = .675
4.1 (a) .1056, (c) .0658, (e) .9477
4.3 (a) .0668, (b) .0062, (c) .9198
5.3 (a) Appropriate, (c) not appropriate
5.5 .9390
5.7 (a) .903
8.1 (a) .5, (b) .25, .75
8.5 (a) -1.36, (c) .65
8.7 (b) .8664, (d) .5862
8.9 (a) .8092, (c) .1056, (e) .9394, (g) .4332
8.11 (a) .1955, (b) 650.6, (c) .2096
8.15 (a) .0228, (b) 4AZ days
*8.17 (a) N(38.89, 2.222), (b) .8791
8.19 (a) .1084, (b) : 0
CHAPTER 8

2.1 (a) Statistic, (c) parameter
2.3 (b) Probability
    (c) s²  Probability
    2 Vs  028  2/g  3/g  2/q
3.1 (a) E(X̄) = gg, sd(X̄) = 3.
    (b) E(X̄)
3.5 (a) E(X̄) = 27, (b) sd(X̄)
    (c) N(27, 1.5)
3.7 (a) N(31000, 500)
    (b) .1587
4.1 (b) Probability: 1/16, 2/16, 3/16, 4/16, 3/16, 2/16, 1/16
    (d) E(X̄) = s, sd(X̄) = \E
4.5 (a) Mean = 80, sd
    (b) N(80, 'o/t), Exact ro/e
    (c) .7698
4.7 (a) N(12.1, +)
    (b) .0244
    (c) About 26%
4.11 (a) .634, (b) (52.83, 57.17)
CHAPTER 9

2.1 (a) x̄ = 9.40, estimated SE = .ZB4
2.3 (a) x̄ = 4.6 oz, (b) .88 oz
3.3 (14.16, 15.64)
3.5 (31.34, 32.66)
3.7 (1.521, 1.909) cm
4.1 H0: μ
4.5 Observed z = -2.48 is in R: |z|
4.7 Observed z
    .05.
    is rejected at α = .10.
5.3 (.52, .74)
5.5 (.24, .32)
5.7 (a) Test statistic Z = (p̂ - .3)/√(.3 × .7/n)
    (b) R: Z > 1.645
    is rejected at α =
6.1 n = 200
6.5 (a) n = 271, (b) n
7.1 (a) x̄ = 12.17, estimated SE = .211
7.3 (a) x̄ = 126.9, error margin = 2.8
    (b) (124.6, 129.2)
*7.7 (7.4, 13.6)
    Z = (X̄ - 75)/(s/√n), R: Z > 1.645
    (b) Observed z = 2.24. H0 is rejected at α = .05.
    (c) .0125
7.13 (a) n = 217
     (b) (100.20, 105.80)
7.15 (a) 8.25%
     (b) 1.2%
7.21 (a) H0: p
     (b) Observed z in R: Z
     H0 is rejected at α = .05.
7.23 (b) (.63, .82)
7.25 (b) (.11, .17)
CHAPTER 10

2.1 (a) 1.895, (b) -2.179
2.3 (b) b = 2.131, (d) b = -2.718
3.1 (41.8, 52.2)
3.3 (a) x̄ = 1.022, s = .2635
    (b) (.77, 1.28)
4.3 Observed t = -2.70 is in R: t < -2.528. H0 is rejected at α = .10.
4.5 Observed t = 1.94 is not in R: t
    α = .01.
4.7 Observed t = 1.59 is not in R: |t|
    α = .02.
5.1 (a) H0 is not rejected.
    (b) H0 is rejected.
6.1 (a) 11.07, (c) 6.91
6.3 (.402, .585)
8.1 (c) -1.761, (d) -1.729
8.3 (a) (135.5, 144.5)
    (b) Center = 140, length = 9.0
8.7 (67.8, 77.6) months
8.11 Observed t = -3.38, H0 rejected at α = .05
8.13 Observed t = 3.69, H0 rejected at α = .05, P value = .003
8.17 (a) Observed t = 3.548, H0 rejected at α = .01
     (b) (8.47, 11.93)
8.19 (a) (129, 174)
     (b) Observed t = 1.38, H0 not rejected at α = .05
8.23 (a) χ²
     (c) χ² = 21.375, H0 rejected
8.25 χ² = 7.51, H0 not rejected
8.27 H0 not rejected
CHAPTER 11

1.3 (a) {(T, E), (C, R), (S, G)}
        {(T, C), (E, R), (S, G)}
        {(T, R), (C, E), (S, G)}
    (b) There are nine sets each consisting of two pairs.
2.1 (b) (2.9, 13.1)
2.3 (a) H0: μ1 μ2
    (c) Z = -2.224 and P value = .0131
2.7 (a) s²pooled = 2.5
    (b) t
2.9 t = -2.23 so we reject H0.
2.11 (a) (-.04, 2.24)
     (b) Normal populations with equal variances
2.15 (-.12, 2.32)
4.1 (a) t = -.866
    (b) 2 d.f.
4.3 (a) One shoe of each pair gets the major brand.
4.5 t = 1.616, fail to reject H0
4.7 (.46, 3.54) for the mean of D = (with - without)
6.1 (a) (19.52, 28.48)
    (b) Z = 2.105, fail to reject H0 at level α = .02
6.5 (a) (.13, 1.09)
    (b) (5.31, 5.97)
6.7 (a) s²pooled = 2.5
    (b) t = 3.098
6.9 p"o: (Q4.87, 81.73)
    p.u: (81.47, 89.73)
6.11 (a) Yes, t = -1.805 so we reject H0: μ1 - μ2 ≥ 0 at level α = .05.
     (c) (-9.09, .69)
6.15 (a) Age, eating habits, etc.
     δ > 0, where δ = mean of D = (before - after)
6.17 t = 1.464, d.f.
6.21 (a) t
6.25 (a) t = 1.205 with d.f. = 16, fail to reject H0
     (b) (-3.27, 11.87)
CHAPTER 12

3.1 β0 = 8, β1 = -6, σ 1.8x
4.1 (c) ŷ = 9.9
4.3 (c) β̂0 = .8, β̂1
6.1 (a) β̂1 = .7, β̂0 = -.4, s² = .0533
    (b) t
    (c) (2.50, 3.20)
    (d) (-.170, .970)
6.5 (a) (122.9, 124.9)
    (b) (125.8, 128.8)
6.7 (a) ŷ = 195.4 + 22.2x
    (b) t = 14.2
    (c) 395.2, model may not hold
7.1 r² = .519
7.3 r = .989
9.1 (c) ŷ
9.3 (a) (2.86, 4.45)
    (b) (17.10, 19.61)
9.5 (a) 38.7 = 27.46 + 11.24
    (b) r² = .71
    (c) r = .84
9.7 (a) r
9.9 (a) (3.09, 8.64)
    (b) (2.634, 4.098)
9.11 (a) ŷ
9.15 (a) ŷ
9.17 (a) ŷ
     (b) (2.02, 4.50)
     (c) ŷ = 74.55
CHAPTER 13

2.1 (b) ŷ
    (c) r²
    straight line.
2.3 (a) y'
2.5 log_e(height) = B.O7 + .465 log_e(diameter)
3.1 (b) ŷ = 6e.s2
3.3 (a) β0: (G.56, 8.98)
        β1: (-.360, .054)
    (b) t = 1.269
3.5 (a) ŷ = llg 6.05x + B.7Ox²
    (b) R² = .996
5.1 (b) ŷ = 5O.BT tg.06 log10(x)
    (c) (-19.87, -6.75)
    (d) (14.84, 21.17)
5.3 (a) ŷ' = log(time) = -5.g4g + 1453.9 temp
    (b) t = 15.83 with 16 d.f.
    (c) R² = 94.7
5.7 (a) (2.26, 2.48)
    (b) t = -1.912, reject H0: β2 = 25
5.9 (a) ŷ = 50.4 + .191x1, r² = .03
    (b) ŷ = -92.3 + .583x1 .149x2 + 35.1x3, R² = 58.6
    (c) Even three variables don't do well.
5.13 Violation of constant variance assumption
CHAPTER 14

2.1 Observed χ² = 16.60, d.f. = 5, χ².01 = 15.09. Fairness of the die is contradicted (with α = .01).
2.3 Observed χ²
    χ².10, χ².01 =
3.3 Observed χ²
    6-53. Difference is highly significant.
3.5 (a) (-.32, -.08)
    (b) Observed z = -3.39, H0 rejected at α = .05. P value = .0006.
4.1 Observed χ² = .194, d.f. = 1, χ².10 = 2.71, independence not contradicted.
4.3 Observed χ² = 4.095, d.f. = 3, χ².05 = 7.81. The null hypothesis of independence is not rejected.
5.3 (a) x             0      1      2      3
        Probability  .216   .432   .288   .064
    (b) Observed χ² = 1.098, d.f. = 2, χ².10 = 4.61. The model is not contradicted.
5.5 Observed χ² = 6.707, d.f. = 1, χ².01 = 6.63. The null hypothesis of homogeneity is rejected at α = .01.
5.9 (a) Observed χ² = 21.96, d.f. = 1, χ².01 = 6.63. H0 is strongly rejected.
    (b) (.20, .46)
5.11 Observed z = 4.41 is in R: Z > 2.33 with α = .01. Strong evidence of a higher rate.
5.15 Observed χ² = 9.146, d.f. = 2. H0 rejected at α = .025 but not at α = .01.
5.17 (a) Observed χ²
         H0 rejected at α = .01.
     (b) Observed z = 2.91, P value = .0018. Strong evidence of a higher survival rate.
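The probabilities .216, .432 and .064 listed in answer 5.3(a) of Chapter 14 agree with a binomial distribution with 3 trials and success probability .4. That reading is an assumption, since the exercise itself is not reproduced here. Under it, the full set of model probabilities can be sketched as follows; `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for a binomial(n, p) count X."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


# Assumed model: 3 trials, success probability .4 (not stated in the answer).
model_probabilities = binomial_pmf(3, 0.4)

if __name__ == "__main__":
    for k, prob in enumerate(model_probabilities):
        print(k, round(prob, 3))
```

Multiplying these probabilities by the number of observations gives the expected cell frequencies used in the goodness-of-fit statistic.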
CHAPTER 15

2.1 (b)-(d)
Source       Sum of Squares   d.f.
Treatment         312           2
Error             170           9
Total             482          11

2.5
Source       Sum of Squares   d.f.
Treatment          90           2
Error              71          22
Total             161          24
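An ANOVA table such as the one for Exercise 2.5 (treatment sum of squares 90 with 2 d.f., error sum of squares 71 with 22 d.f.) leads to the F ratio by dividing each sum of squares by its degrees of freedom and taking the ratio of the two mean squares. The answer itself does not report this F value; the sketch below only carries out that arithmetic, and the function name `f_ratio` is illustrative.

```python
def f_ratio(ss_treatment, df_treatment, ss_error, df_error):
    """Return (MS treatment, MS error, F) from sums of squares and d.f."""
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return ms_treatment, ms_error, ms_treatment / ms_error


# Values taken from the ANOVA table for Exercise 2.5.
ms_treatment, ms_error, f_value = f_ratio(90, 2, 71, 22)

if __name__ == "__main__":
    print(f"MS treatment = {ms_treatment:.2f}")
    print(f"MS error     = {ms_error:.3f}")
    print(f"F            = {f_value:.2f} with (2, 22) d.f.")
```

The resulting F would then be compared with the tabled upper percentage point of the F distribution with (2, 22) degrees of freedom.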
3.1 (b) F.05(10, 5) = 4.74
3.5 F = 8.26. Since F.05(2, 9) = 4.26 we reject the equality of means at α = .05.
3.7 (b) F = 3.675, and we fail to reject H0.
4.1 (a) m = 3, t.00833 = Z.SS1
4.3 For μ1 - μ4, t interval 4.0 ± 2.53, multiple-t interval 4.0 ±
5.3 (b) F.or(9, 20) = 2.45
6.5 (a) F = 1.47
6.7 (b) F
    (c) μ1 - μ2: (.494, .822)
        μ1 - μ3: (.290, .618)
CHAPTER 16

2.1 (b) Value  Probability
    (c) c = 66
2.3 (a) P[W_s
2.5 (a) W_A =
    (b) n_B = 2 is smaller size, W_s = 2 + 3
    10,
2.7 W_s = 62, we fail to reject H0 at α
3.1 Significance probability = .048
3.5 (a) P[T+
    (b) P[T+
3.7 (a) T+
3.11 S = 13, reject H0. The significance probability = .004
4.1 r_sp = -5
4.3 r_sp
    2.667, significance probability
4.5
5.1 W_A = 1 + 4 + 5 = 10
5.3 (b) Value of W_s  Probability
5.5 (a) .098, (c) c = 21
5.7 W_s = 51, reject H0.
5.11 r_sp = .5
5.13 (b) W_s = 21, reject H0 at level α = 2 × .008 = .016.