PROBABILITY AND EXPERIMENTAL ERRORS IN SCIENCE
An elementary survey

LYMAN G. PARRATT
Professor of Physics
Chairman of the Department of Physics, Cornell University, Ithaca, New York

Science Editions®
JOHN WILEY and SONS, INC., NEW YORK
COPYRIGHT © 1961 BY JOHN WILEY & SONS, INC.

All Rights Reserved. This book or any part thereof must not be reproduced in any form without the written permission of the publisher.

Library of Congress Catalog Card Number: 61-15406
Printed in the United States of America
First Science Editions printing 1966
Science Editions Trademark Reg. U.S. Pat. Off.
DEDICATION

This book is dedicated to those timeless intellectuals who have so shaped our cultural pattern that experimental science can live as a part of it, a science that seriously tampers with the plaguing and hallowed uncertainty in man's comprehension of his gods and of the universe.
"He
that
is
unaware of
his
ignorance will be
misled by his knowledge."
WHATELY
Preface
Although the concepts of probability and statistics underlie practically everything in science, the student is too often left to acquire these concepts in haphazard fashion. His first contact with quantitative probability may be in extracurricular gambling games, and his second in some requirement of a laboratory instructor who insists arbitrarily that a ± number should follow a reported measurement. In his undergraduate training, he may be introduced in a social science to sampling procedures for obtaining data, in a biological science to formulas for the transmission of genes in inheritance characteristics, and in a physical science to Heisenberg's uncertainty principle, which he intuitively concludes is essentially nonsense. Such experiences, good as far as they go (except, of course, any excess in gambling and the conclusion as to nonsense), are left woefully disconnected and do not prepare him adequately to understand the intrinsic "open-ended" feature of every measurement and concept in science.

Probability is the lens that brings science into philosophic focus. Without a fairly clear comprehension of this fact, the scientist cannot be really "at home" in his own field. And, at a time when as never before the results of science rush on to overwhelm society, the scientist, for lack of focus, is a poor ambassador. Not only is he spiritually uncomfortable in his own field, but science itself, as he portrays it, cannot fit comfortably in the society of other human activities and knowledge.

In a very humble way, we are attempting at Cornell University to introduce the undergraduate student to the unifying concepts of probability
and statistics as they apply in science. This is a difficult task at this level of the student's development. The subject in its broad scope is no doubt the most mature and sophisticated one with which man has ever struggled. But it is believed that the best time to instill a general attitude in a student is when he is young — he will have less trouble later in maturing properly.

This is admittedly the objective of a teacher rather than of a book. But experience shows the impracticality of trying to teach undergraduate students without a book. The present volume has been assembled to fill this pedagogical need, at least to fill it in part. The book is patterned to be a base from which the teacher may go on to discuss further aspects, especially those aspects that deepen and broaden the understanding of science. A few suggestions are given of such excursions in the understanding of science, particularly in the first chapter.
The book begins with brief comments on the different meanings of probability, then goes into the classical games of chance as examples of the classical or a priori meaning. Although these games have almost nothing to do with science, they provide a convenient framework for the teaching of basic principles, for example, of combinatorial analysis, which is fundamental in all probability reasoning, and of the sampling process inherent in all scientific measurements. The games are also remarkably successful in arousing the undergraduate's interest in the subject, and in providing numerous problems to help him develop his feelings for probability into quantitative concepts. Once the basic principles are well established, we turn to their applications in problems more serious than gambling games.

In the bulk of the book, emphasis is placed on the experimental definition of probability rather than on the classical definition. After the ideal games, and after comments on the role in science played by both kinds of probability, namely, classical and experimental, the discussion shifts to measurements and to the general statistical concepts. These concepts include maximum likelihood, rules for the propagation of errors, curve fitting, several applications of the least-squares method, consistency tests, a little on the analysis of variance, a little on correlation, and so on. Then, the normal (Gauss) and the Poisson models of mathematical probability are explored both analytically and with typical problems. The normal and the Poisson models are given about equal weight, this weighting being roughly commensurate with their invocations in modern scientific measurements. Especially in statistics our discussion of the subject is very elementary — just the essentials for an undergraduate student in an experimental science. But in both statistics and probability, the point of view taken in this book is somewhat different from that of the professional statistician or mathematician.
Numerous problems are given in each of the five chapters. Many of these problems are intended to provoke discussion, and the instructor should look them over carefully before he assigns them to the student. The most commonly used equations in statistics and probability are gathered together for convenience and placed at the end of the book, just before the index.
I am pleased to express my indebtedness and thanks to Professor K. I. Greisen of Cornell University for reading the manuscript, for checking the problems, and for making numerous helpful general suggestions. And, needless to say, practically all that I know about the subject I have learned from others.
When 'Omer smote his bloomin' lyre, 'E'd 'eard men sing by land and sea; An' what he thought 'e might require, 'E went an' took the same as me!
— KIPLING

In partial acknowledgment, the following books are listed, and I recommend them for collateral reading.

1. T. C. Fry, Probability and Its Engineering Uses (D. Van Nostrand Co., New York, 1928).
2. P. G. Hoel, Introduction to Mathematical Statistics (John Wiley & Sons, New York, 1954), 2nd ed.
3. A. G. Worthing and J. Geffner, Treatment of Experimental Data (John Wiley & Sons, New York, 1943).
4. H. Cramer, The Elements of Probability Theory and Some of Its Applications (John Wiley & Sons, New York, 1955).
5. A. M. Mood, Introduction to Theory of Statistics (McGraw-Hill Book Co., New York, 1950).
6. B. W. Lindgren and G. W. McElrath, Introduction to Probability and Statistics (Macmillan Co., New York, 1959).
7. E. B. Wilson, Jr., An Introduction to Scientific Research (McGraw-Hill Book Co., New York, 1952).
8. R. B. Lindsay and H. Margenau, Foundations of Physics (John Wiley & Sons, New York, 1943).
9. William Feller, An Introduction to Probability Theory and Its Applications (John Wiley & Sons, New York, 1957), 2nd ed.
10. R. D. Evans, The Atomic Nucleus (McGraw-Hill Book Co., New York, 1955), Chapters 26, 27, and 28.
11. Emanuel Parzen, Modern Probability Theory and Its Applications (John Wiley & Sons, New York, 1960).

Ithaca, New York
May 1961
LYMAN G. PARRATT
Contents

Chapter 1  EARLY DEVELOPMENTS: IDEAL GAMES OF CHANCE, 1
A. INTRODUCTION, 1
  1-1. "Three" Meanings of Probability, 2
  1-2. Historical Perspective, 6
B. CLASSICAL (A PRIORI) PROBABILITY, 8
  1-3. Definition of Classical Probability, 8
  1-4. Probability Combinations, 9
       Mutually exclusive events, 10
       Independent events, 10
       Compound events: general addition theorems, 11
       Conditional probability: multiplication theorem, 12
  1-5. Inferred Knowledge, 17
  1-6. Problems, 20
  1-7. Combinatorial Analysis, 23
       Permutations, 23
       Stirling's formula, 24
       Sampling without replacement, 26
       Sampling with replacement, 26
       Combinations: binomial coefficients, 27
       Binomial distribution formula, 30
       Multinomial coefficients, 35
       Multinomial distribution formula, 37
       Sampling from subdivided populations without replacement: lottery problem and bridge hands, 39
  1-8. Classical Probability and Progress in Experimental Science, 41
       Applications in statistical mechanics, 42
       Classical statistics, 43
       Quantum statistics: bosons and fermions, 43
  1-9. Problems, 45
C. EXPERIMENTAL (A POSTERIORI) PROBABILITY, 49
  1-10. Definition of Experimental Probability, 49
       Number of "equally probable outcomes" meaningless, 51
  1-11. Example: Quality Control, 51
  1-12. Example: Direct Measurements in Science, 52

Chapter 2  DIRECT MEASUREMENTS: SIMPLE STATISTICS, 55
A. MEASUREMENTS IN SCIENCE: ORIENTATION, 55
  2-1. The Nature of a Scientific Fact, 55
  2-2. Trial Measurements and Statistics, 56
  2-3. Random Variation, 59
  2-4. Probability Theory in Statistics, 61
  2-5. Computed Measurements, 62
  2-6. Conclusions, 63
B. BASIC DEFINITIONS: ERRORS, SIGNIFICANT FIGURES, ETC., 63
  2-7. Types of Errors, 64
       Random (or accidental) error, 64
       Systematic error, 67
       Precision and accuracy, 68
       Discrepancy, 69
       Blunders, 69
  2-8. Significant Figures and Rounding of Numbers, 69
C. FREQUENCY DISTRIBUTIONS AND PRECISION INDICES, 71
  2-9. Typical Distributions, 72
       Terminology: types of distributions, 72
  2-10. Location Indices, 76
       Median, 76
       Mode (most probable value), 76
       Mean (arithmetic average) m and μ, 76
  2-11. Dispersion Indices, 79
       Range, 79
       Quantile, 79
       Deviation (statistical fluctuation), 79
       Mean (average) deviation, 80
       Experimental standard deviation s, 82
       Moments, 84
       Variance σ²: "universe" or "parent" standard deviation σ, 86
       Degrees of freedom, 89
       Variance: binomial model distribution, 91
       Standard deviation in the mean (standard error) s_m, 92
       Skewness, 94
       Other dispersion indices, 95
       Conclusions, 97
  2-12. Problems, 98

Chapter 3  STATISTICS OF MEASUREMENTS IN FUNCTIONAL RELATIONSHIPS, 101
  3-1. Method of Maximum Likelihood, 103
       p in the binomial distribution, 105
       μ and σ in the normal distribution, 106
       μ in the Poisson distribution, 107
       Instrumental parameter, 107
       Precision in the maximum likelihood estimate, 108
       Standard error σ_m, 109
  3-2. Propagation of Errors, 109
       Nonindependent errors: systematic errors, 110
       Random errors, 111
       Mean (and fractional mean) deviation, 113
       Standard (and fractional standard) deviation, 114
       Sum or difference, 115
       Product or quotient: factors raised to various powers, 116
       Other functions, 118
  3-3. Different Means, 118
       Weighted mean, 119
       Weighted dispersion indices, 120
       Consistency of two means: the t test, 120
       Comparison of precisions in two sets of measurements: the F test, 123
  3-4. Curve Fitting: Least-Squares Method, 126
       Best fit of a straight line, 127
       Straight line through origin, 131
       Best fit of parabola, 132
       Best fit of a sine curve, 133
       Criterion for choice of functional relation, 133
  3-5. Justification of Least-Squares Method from Maximum Likelihood, 135
  3-6. Data Smoothing, 139
  3-7. Correlation, 140
       Correlation coefficient, 140
       Covariance, 143
       Interpretation, 144
  3-8. Inefficient Statistics, 146
       Location index, 147
       Dispersion indices, 147
       Standard deviation in the mean (standard error), 148
       Examples, 148
  3-9. Conclusions and Design of Experiments, 148
  3-10. Summary, 150
  3-11. Problems, 150

Chapter 4  NORMAL PROBABILITY DISTRIBUTION, 156
  4-1. Derivation of the Normal (Gauss) Probability Density Function, 157
       Shape of the normal frequency curve, 161
       Normalization, 161
  4-2. Errors in the Normal Approximation, 163
  4-3. Significance of the Bernoulli Trials in Actual Measurements, 164
       Elementary errors, 164
       Mechanical analog for Bernoulli-type elementary errors in continuous sample space, 166
       Characteristics of elementary errors, 168
  4-4. The Error Function, 169
  4-5. Precision Indices, 170
       Standardized variables, 170
       Mean deviation, 170
       Standard deviation, 172
       Probable error, 173
       Confidence limits in general, 173
  4-6. Probability for Large Deviations, 175
       Rejection of a "bad" measurement, 175
       Chauvenet's criterion for rejection, 176
  4-7. Test of a Statistical Hypothesis: Example, 178
  4-8. Test of Goodness of Fit of a Mathematical Model, 180
       Graphical comparison of frequency curves, 181
       Graphical comparison of cumulative distribution functions: probability paper, 181
       Skewness and kurtosis, 183
       The χ² test, 184
  4-9. Conclusions, 191
  4-10. Problems, 192

Chapter 5  POISSON PROBABILITY DISTRIBUTION, 195
  5-1. Introduction, 195
       Rare events, 196
  5-2. Derivation of the Poisson Frequency Distribution Function, 197
       Shapes of Poisson frequency distributions, 198
       Poisson to normal distribution, 199
       Normalization, 201
  5-3. Errors in the Poisson Approximation, 201
  5-4. Precision Indices, 202
       Standard deviation, 202
       Fractional standard deviation, 203
       Standard deviation in a single measurement, 204
       Probable error, 206
       Skewness, 207
  5-5. Significance of the Basic Bernoulli Trials, 207
       Two mechanical analogs, 209
  5-6. Goodness of Fit, 211
       Spatial distribution, 212
  5-7. Examples of Poisson Problems, 213
       Deaths from the kick of a mule, 213
       Radioactive decay, 215
       Counts per unit time: precision, 218
       More examples, 221
  5-8. Composite of Poisson Distributions, 224
       Measuring with a background, 224
       Precision, 225
  5-9. Interval Distribution, 227
       Dispersion indices, 229
       Resolving time: lost counts, 229
       Coincidence counting, 230
       Conclusions, 231
  5-10. Problems, 231

SUMMARY, 239
GLOSSARY, 241
INDEX, 249
"For there was never yet philosopher That could endure the toothache patiently,
However they have writ the style of gods And make a pish at chance and sufferance." WILLIAM SHAKESPEARE
"Life
is
a school of probability."
I
Developments:
Early Ideal
A.
WALTER BAGEHOT
Games
of
Chance
INTRODUCTION
Every fact in science, every law of nature as devised from observations, is intrinsically "open-ended," i.e., contains some uncertainty and is subject to future improvement. This may sound harsh but it is simply the way of things, and this book is written to help the student of science to understand it.

The subject of probability has many facets and needs many different introductions. This book has three introductions: (1) the Preface, (2) Sections 1-1 and 1-2 of the present chapter, and (3) Section 2-1 of Chapter 2. The student is advised to read them all in the order named before he progresses into Section 1-3.

Although the primary objective of the book is to acquaint the student with the modern philosophic focus in science, viz., through the lens of probability, the elementary tools (methods and formulas) of statistics and of the popular mathematical models of probability are also discussed to about the extent commonly needed in the practice of an experimental science. Actually, most of the pages are devoted to these more or less conventional practical topics.

The treatment of the subject presumes that the reader has studied some experimental science for about one year. Elementary and intermediate algebra suffice as the mathematics background for the bulk of the book, but full comprehension of a few of the formulas requires just a little knowledge of calculus.
"Three" Meanings of Probability
l-l.
The concept of probability is of great antiquity. Its first development was no doubt intuitional. Indeed, one of the "three" general meanings of probability in acceptable use today is a subjective or an intuitional one. This meaning refers to a qualitative state of mind or intensity of conviction, a meaning that is not intended to be, or else cannot be, quantitatively measured. Examples of this meaning are found in the statements "She probably would have won the beauty contest if she had entered," "Shakespeare's plays were probably written by Bacon," and "Life probably exists on Mars." In many examples of intuitional probability, some cogent arguments, even some quantitative arguments, may be marshaled in support of the intensity of the conviction. When an appreciable amount of quantitative support is arrayed, this meaning of probability emerges as one of the two quantitative meanings discussed in this book. Actually, of course, there exists a whole gamut of partially quantitative meanings. The spread is according to the degree of quantitativeness. Let us illustrate one end of this gamut. Suppose the doctor tells his patient that he has a mild attack of what appears to be appendicitis and that the probability is good that he will recover without immediate surgery. The doctor's appraisal of the situation is based on his experience, both direct and indirect, which involves a large number of more or less similar cases. If pressed,
he may give a quantitative value of the recovery probability, such as 0.6, and have in mind an uncertainty of say ±0.3 or so, but he would be very reluctant (and properly so) to state any such quantitative values. He knows that there is still a large amount of nonquantitative "art" in his meaning of probability. At the other end of the gamut, a mathematician, when he speaks of probability, refers (unless he says otherwise) to a quantity having an exact value, an axiomatic or "classical" value. In real-life situations, the exact numerical value may not be known, but it is presumed to exist, and, anyway, obtaining the numerical knowledge is not the mathematician's problem. On the other hand, a statistician works with a large amount of real-life data and deduces therefrom a rather precise quantitative value of probability. He presumes that the data are equally good (or have known weightings) and, concerned with an "internally consistent" analysis, he injects none of the nonquantitative or subjective evaluation that is inherent, for example, in the practice of the physician. The statistician is generally rather close to the mathematician in the high degree of quantitative intent in his probability, but he knows that his value is not 100% precise (as is the mathematician's) because his data, numerous to be sure, are only a limited sample of all possible data.
Let us turn next to the scientist. He is in the business of both making measurements and interpreting them in terms of the laws of nature. In a sense, he is more concerned with a sort of "external consistency" than is the statistician. This is because the scientist's view of the laws of nature returns to plague him when more measurements are made. In reporting a measurement, he states the numerical value and then gives a precise numerical ± value to indicate (implicitly) its quantitative reliability or the probability that the measurement is "correct." The ± value is typically deduced from a combination of (a) a more or less careful statistical analysis of his trials and (b) a guess based on his experience in setting up and performing measurements of this sort, and perhaps from a few unrecorded test measurements made in the process of adjusting the apparatus. Also, he indulges in the guess because he suspects that his measurement contains some undetermined amount of systematic error. His statistical analysis is often a bit careless for another reason, viz., the interpretation he has in mind for the measurement may not require greater care. But note well that if, by the design of the experiment, he judges the systematic error to be relatively small and/or the interpretation to demand a high degree of reliability, he must then make a very careful statistical analysis. Hence, depending upon the particular measurement and its interpretation, the experimental scientist works in the range of quantitative probability somewhere between the physician and the statistician. But the scientist also, when he is conjecturing about nature in the absence of pertinent measurements, resorts perforce to the classical type of probability of the mathematician. This he does in desperation, as discussed later in this chapter. The tentative description of nature in terms of classical probability is improved as soon as actual pertinent measurements become available.

To make clearer the distinction between the probability of the mathematician (or of the desperate theoretical scientist) and that of the statistician (or of the scientist in his eventual description of nature), let us amplify briefly one of the points made above. This is beyond doubt the most subtle point in the concept of quantitative probability.

A quantitative measure of anything always implies an unambiguous or operational definition of the thing being measured. But the definition (hence meaning) of probability in a real-life or scientific situation always contains a certain amount of inherent arbitrariness, an arbitrariness that exists in addition to the effect of the subjective guess in evaluating the systematic error just mentioned. This stems from the fact that, in a given situation, we can never imagine or evaluate all the things that might happen. As a matter of practical experience, we are usually content to bound the situation with a closed group of separately and individually evaluated possibilities. If the group is reasonably comprehensive, the amount of arbitrariness is often small. It is primarily this feature of inherent arbitrariness, small but always partially unknown, that infuses philosophical (and theological) fascination into the subject. In our elementary treatment, we shall for the most part side-step the conventional philosophical (and theological) implications.* But we point out now that it is really in an attempt to reduce the residual arbitrariness in the definition of probability in real-life situations that we progress from the first to the second of the two so-called quantitative meanings. We shall return to different aspects of this arbitrariness in Sections 1-5 and 1-10, and generally throughout the book.

The first of the two quantitative meanings is called the classical or a priori probability. The classical meaning, discussed with examples in Part B of this chapter, is based on the presumption that the probability (for the occurrence of a specified event) can be determined by an a priori analysis in which we can recognize all of the "equally possible" events. The usefulness of this meaning is limited to a rather specialized class of "mathematical" probability situations, such as those encountered in the ideal gambling games (e.g., dice, cards, coin tossing, etc.), and to real-life situations where we are desperate for lack of reliable objective knowledge (measurements).

The second of the two quantitative meanings is called the experimental or a posteriori probability. This probability concept, having by far the greater profundity in the description and understanding of nature, has to do with real-life situations in which the number of "equally possible" events has admittedly not been or cannot be determined. This is the case in all real-life probability situations. It is of little consequence in the limited discussion in this book whether our failure to make the a priori determination of probability in real-life situations is due to a limitation (temporary or inherent) of our knowledge or analytic ability, or is due to an inherent characteristic of the events under consideration.† In either case, the causal factors in the individual events in such probability situations are not understood in sufficient detail to allow us to make reliable individual-event predictions; each such event may be said to be at the whim or caprice of nature. But it is especially noteworthy that in these probability situations an amazing degree of regularity becomes apparent when a large number of events are considered by the methods of statistics — the capriciousness of nature is limited. With an experimental or a posteriori definition of probability, predictions of specified future events can be made, and, more, a significant quantitative degree of reliability can be assigned to each prediction.

* An introduction to probability from the philosophical point of view is given, e.g., by E. Nagel, Principles of the Theory of Probability (International Encyclopedia of Unified Science, vol. 1, no. 6, Chicago, 1939).
† We may, if we like, include as part of the event characteristic the perturbation introduced by the observation itself (Heisenberg's uncertainty principle).
The term "random mass-phenomena" is often applied to a class of events that are amenable to analysis by statistics and experimental-probability theory, i.e., events that are unpredictable in detail but are predictable as a whole.* Practically every intellectual discipline whose content deals with quantitative relationships — economics, sociology, political science, war science, biology, chemistry, physics, engineering, commercial business, medicine, to mention a few — is replete with phenomena that are effectively random. Statistics and probability theory are rapidly being extended to more and more facets of all of these subjects. In particular, the substance of any experimental science is measurement, and practically all measurements are treated as in the class of essentially random mass-phenomena. Measurements are so classed because of one or both of the following reasons: (a) each measurement is merely a sample of a large number (essentially infinite in some instances) of possible measurements that differ more or less slightly among themselves, i.e., the experimenter is unable to control completely all the experimental factors involved, and (b) the property being measured may contain a degree of randomness as an inherent characteristic. And, equally important, an increasing fraction of modern scientific theories (i.e., generalizations drawn from measurements) are based on a view of the statistical behavior of nature. Examples of these features of measurements and of theories are discussed later.

For the student of any quantitative science, early recognition and understanding of random mass-phenomena are imperative. Obviously, such recognition and understanding require an early grasp of the fundamentals of statistics and probability theory. This book attempts to impart these fundamentals.

With this objective, the book is largely concerned with the third of the "three" meanings of probability as set forth above. However, for historical and pedagogical reasons, and because of the use of axiomatic probability by desperate theoretical scientists (desperate in the sense mentioned above), the next few sections of this chapter are devoted to a review of selected concepts and arguments that were developed in connection with the classical or a priori probability. This a priori meaning is much the easier one to grasp and to use to the extent that it is applicable. It should be kept in mind that all of these concepts and arguments (excepting, of course, the definition of probability itself) are also applicable to random mass-phenomena in which the experimental meaning of probability is the appropriate one.

* This is our first use of the term "random," a term or concept that is inextricably present in any discussion of statistics and probability. Further discussion of the concept of random is given in Section 2-3.
1-2. Historical Perspective

The subject of statistics and probability as we know it today is dually rooted in (a) ideal games of chance and (b) accumulated records such as those of human life and death. Both of these roots, in really recognizable form, date from about the middle of the seventeenth century. The concepts of classical (a priori) probability grew mainly from the first root (ideal games of chance), and the experimental (a posteriori) concepts, based on statistics, grew mainly from the second. In most of this chapter, the development from the first root is reviewed, but, as was just mentioned, most of these aspects have general applicability.

Before we launch into the classical part of our study, a few more comments are in order on historical perspective. Around 1650, gambling was very popular in fashionable circles of French society. Games of dice, cards, coin tossing, roulette, etc., were being rather highly developed. As personal honor and increasing amounts of money were involved, the need was felt for some formulas with which gambling chances could be calculated. Some of the influential gamblers, like de Méré, sought the help of the leading mathematicians of the time, such as Pascal, Fermat, and, later, d'Alembert and de Moivre.* Fortunately, the mathematicians accepted the problems as their own, and soon the subject of classical probability took shape. The a priori definition of probability was formulated in correspondence between Pascal and Fermat in 1654. Huygens published the first treatise on the subject in 1657. The famous Bernoulli theorem and the binomial distribution were introduced in 1713. The general theorem known as the probability multiplication rule was proposed by de Moivre in 1718, and de Moivre also published the first indication of the normal probability distribution (and the special case of the powerful "central limit theorem") in 1733 to 1738. Further development of the normal distribution was made later by Gauss, whose name is often attached to it, and it was soon used by Gauss and by Laplace independently in the analysis of errors of measurements in physical and astronomical observations. The important principle of least squares was formulated by Legendre at about this time, and the "theory of errors" was well on its way.

Laplace in his classical treatise of 1812,* a treatise in which the a priori type of probability holds supreme sway,† gives a rather complete summary of the mathematical theory of games of chance. Soon after 1812, contact with classical mathematicians was almost lost. Continued development of the subject was made by statisticians in various fields such as in actuarial work, in certain branches of social, biological, and physical sciences in the treatment of errors in measurements, and in theoretical physics in what is called statistical mechanics.

In its present form, the subject of statistics and the root of experimental probability apparently began in the life and death records published by Graunt in England in 1662.‡ Such records and interpretations were significantly extended by Halley a few years later, and Halley is sometimes called the father of statistics. Statistics flourished for some 200 years without much further progress in probability theory, except for the new definition.

Vigorous contact with mathematicians was re-established in the 1920's, and basic development along the lines of mathematics is today continuing apace with the multitude of new applications. The work of the mathematicians is again closely akin to the classical (a priori) concepts in the sense that probability is taken as axiomatic. The new mathematical theorems are valid, of course, irrespective of the method whereby the numerical value of probability is obtained.

* See, e.g., Todhunter's History of the Mathematical Theory of Probability from the Time of Pascal to that of Laplace (Macmillan, Cambridge and London, 1865).
* Pierre S. de Laplace, Théorie analytique des probabilités (1812), and the companion treatise Essai philosophique sur les probabilités (1814). For selections from several contributors to probability theory, including Laplace, see The World of Mathematics (Simon and Schuster, New York, 1956), ed. James R. Newman.
† Why was Laplace — along with most of the leading mathematicians and scientists of his time — so uncritical of the a priori definition of probability? Such questions are interesting but of course difficult to answer. Perhaps it was because of the intellectual fashion in those days to believe that complete knowledge was available to man, that all of the "equally possible" outcomes in any situation were knowable. This was related to the philosophy of sufficient reason, and to what later became the principle of determinism. This philosophical fashion was dominant in intellectual circles throughout the nineteenth century, and did not yield until the advent of quantum physics early in the twentieth century. With Heisenberg's uncertainty principle, the whole of science was recognized as ultimately based philosophically on the concepts of experimental probability.
‡ There exist very early records of a type of insurance business in the protection of Greek merchants from the maritime hazards of ships at sea, but these records do not indicate a very good statistical basis on which the insurance "premiums" were determined. To marine insurance, the Romans added health and burial insurance, again with rather inadequate statistical records. In the year 1609, Count Anton Günther in Germany refused the request of his people for financial protection against the hazard of fire; he refused for fear of "tempting the gods." As a matter of history, it appears that the English started the first fire insurance company in 1680.
Philosophical aspects of probability continued to develop as scholars in general devoted themselves to the subject. Activity in philosophical probability has been particularly intense since the widespread dissemination of Heisenberg's uncertainty principle of 1927. This principle is based in quantum physics.*

The general subject of probability at the present time is a combination of (a) mathematics, (b) measurements or statistical data, (c) theory of nature, and (d) theory of knowledge itself. Typically, the student begins the study from some one rather specialized approach, but he is soon obliged to broaden his view to include all of them. This, perhaps more than any other, is a humbling subject, because it is so all-inclusive.

* W. Heisenberg, Z. Physik, 43, 172 (1927).
B. CLASSICAL (A PRIORI) PROBABILITY

1-3. Definition of Classical Probability
In a simple ideal game, the chance of winning is easily deduced. For example, if an ideal coin is honestly tossed into the air, it will settle either heads or tails. There are two possible outcomes of the toss, and the chance is the same for either outcome. The chance for either a head or a tail is one out of two, i.e., the probability is 1/2. By similar argument, if an ordinary (six-sided) ideal die is honestly cast, there are six possible outcomes, and the chance for a particular face number (a specified event) is one out of six, i.e., the probability is 1/6. Also, we may readily see that the probability of drawing the ace of spades from an ordinary deck of 52 cards is 1/52, and the probability of drawing a spade is 13/52 = 1/4.

The underlying conditions for such simple calculations of probability are that (1) every single trial must lead to one of a definite known number of outcomes or events, and (2) every possible outcome must have an equal chance.
Let w be defined as the number of events recognized as "win," and let n be the total number of equally possible events. Then, the probability of winning, p, is given simply by the ratio

    p = w/n                                                  (1-1)

and the probability of losing, q, is given by

    q = (n - w)/n                                            (1-2)
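Equations 1-1 and 1-2 amount to nothing more than counting equally possible events. A minimal Python sketch (the deck encoding and the function name are merely illustrative, not from the text) reproduces the die and card examples above.

```python
from fractions import Fraction

def classical_probability(win_events, all_events):
    """Eq. 1-1: p = w/n for equally possible events; Eq. 1-2 is then q = 1 - p."""
    w, n = len(win_events), len(all_events)
    p = Fraction(w, n)
    return p, 1 - p

# Ideal die: six equally possible faces, one winning face.
faces = range(1, 7)
p, q = classical_probability([6], faces)                                 # 1/6 and 5/6

# Ordinary deck of 52 cards (illustrative two-character encoding).
ranks, suits = "A23456789TJQK", "SHDC"
deck = [r + s for r in ranks for s in suits]
p_ace_spades, _ = classical_probability(["AS"], deck)                    # 1/52
p_spade, _ = classical_probability([c for c in deck if c[1] == "S"], deck)  # 13/52 = 1/4
print(p, q, p_ace_spades, p_spade)
```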
Equation 1-1 constitutes the definition of classical or a priori probability, a quantitative definition that was no doubt generally understood and used very early in our history. Nowadays, this definition is sometimes called the Laplace definition because it was so well formulated by him (in 1812). It is an a priori definition since it allows determination of p before the game has been played.

Application of the classical definition required increasingly critical thought as games of somewhat greater complexity were considered. The definition does not give a criterion for determining the values of w and of n, especially of n, and the proper values become rather obscure as the complexity of the game increases. As an example, consider a game in which a penny is tossed twice in succession and the game is won if a head appears at least once. Denote heads by H and tails by T. One argument says that there are four equally possible events, viz., HH, HT, TH, and TT, and the game is won in each of the first three events. Hence, p = 3/4. But an alternative argument says that the events HH and HT are winners on the first toss and that the second toss in these two cases is not made. Accordingly, there are only three equally possible events, viz., H, TH, and TT, and the first two events are winners. The proponents of this argument concluded that p = 2/3 instead of 3/4. Which is correct?* The student should recognize that in the second argument the three outcomes are not equally probable.

It was paradoxes such as this (and this is a very simple example) that caused the gamblers to go to the famous mathematicians of the time. The mathematicians were soon able to resolve the ambiguities in the numbers w and n in Eq. 1-1 for the ideal games of chance by the methods known as probability combinations and combinatorial analysis, subjects that we shall take up briefly now.

* Curiously, the early mathematicians were divided on this question. The argument leading to p = 3/4 is the correct one; the paradox is resolved if the event H in the second argument is given twice the weight of either TH or TT.
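A direct enumeration makes the resolution of the paradox concrete. The following brief Python sketch (an illustration only) lists the four equally probable two-toss outcomes, recovers p = 3/4, and shows that the three events of the second argument carry unequal weights.

```python
from fractions import Fraction
from itertools import product

# The four equally possible outcomes of two honest tosses.
outcomes = ["".join(t) for t in product("HT", repeat=2)]    # HH, HT, TH, TT
wins = [o for o in outcomes if "H" in o]
print(Fraction(len(wins), len(outcomes)))                    # 3/4

# The second argument's three events are not equally probable:
# the event "H" (game decided on the first toss) absorbs both HH and HT.
weights = {"H": Fraction(1, 2), "TH": Fraction(1, 4), "TT": Fraction(1, 4)}
print(weights["H"] + weights["TH"])                          # again 3/4
```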
1-4. Probability Combinations

For convenience in terminology, let A, B, C, ... stand respectively for the various possible specified outcomes or events in a probability situation. A may be heads in a penny toss, or it may be red in the draw of a card from an ordinary deck; B may be tails in the penny toss, or it may be a face card in the card draw; C may not even exist, or it may be the deuce of spades; etc. Or one of the events may be more complex, e.g., it may be a 7 in the cast of two dice, or it may be either a 7 or an 11, etc. Compound events are very commonly specified, and the determination of their probabilities requires in general very careful analysis indeed. Two rather simple combinations involve either mutually exclusive or independent component events.
Mutually exclusive events. Two events A and B are mutually exclusive if only one of them can occur in a single trial. For example, a head and a tail are mutually exclusive in a penny toss. The probability of occurrence of an unspecified member of a set of mutually exclusive events is the sum of the component probabilities. This statement is conveniently written as

    p(either A or B) = p(A) + p(B)                           (1-3)

It follows that the sum of the probabilities of all possible mutually exclusive events is unity, i.e.,

    p(A) + p(B) + p(C) + ... = 1                             (1-4)

Independent events. Events A and B are independent if the occurrence or nonoccurrence of A in no way affects the probability of occurrence of B. Examples of independent events are found in successive tosses of a penny, or in successive samplings of differently colored balls from a jar if each sample is replaced and the jar shaken before the next sample is taken. The probability of occurrence of two or more independent events is the product of the component probabilities,

    p(A and B and C ...) = p(A) · p(B) · p(C) ...            (1-5)

The probability of tossing a penny three heads in a row is 1/2 · 1/2 · 1/2 = (1/2)^3, the three specified events being independent. The probability of tossing two heads in a row and then a tail is also (1/2)^3. But the probability of tossing two heads and a tail in three tosses, if the particular sequence of the tosses is not specified, is (1/2)^3 + (1/2)^3 + (1/2)^3 = 3(1/2)^3. In the latter case, we have a combination of mutually exclusive and independent events.

Similarly, if two pennies are tossed together three times, the probability of seeing two matches and one mismatch in any (unspecified) sequence is 3(1/2)^3, since the outcome of the first penny is never specified but in each of the three tosses the outcome of the second penny is specified (with p = 1/2), and there are three independent ways (different possible sequences) in which the winning outcomes may appear. The three independent ways refer, of course, to the fact that the mismatch may follow the two matches, come between them, or precede them.*

* Another view of this problem is to consider that the outcome of each penny toss is specified. Then, the probability of each outcome is (1/2)^6, and there are 24 different ways of having two matches and one mismatch. In this view, p = 24(1/2)^6 = 3(1/2)^3. The 24 ways may be enumerated directly.
who
describes
it
II
to another person
who
in turn transmits
people are in the chain before the incident
is
it
related to you,
on.
20
If
and
if
the
component probability for truth is 0.9 per person, what is the probability that you are told the truth? This combined probability is (0.9) 20 an 0.1, and you should not put much credence in the story as you hear it. We might inquire as to the number N of such people in the chain before the combined probability has dropped to 0.5. The answer is given implicitly
=
in the equation (0.9)^
0.5.f
made commonly
In connection with the independence of events, a remark must be in regard to the
quoted
in
popular "law of averages." This so-called law
support of a fallacious belief that a run of bad luck presages a
run of good luck such that the two runs course, merely wishful nonsense
the
is
component
probabilities
if
will
remain unchanged.
or nonfacetious interpretation of this "law of averages"
number of trials
is
if
is
the one implied
By
this
determined only after an extremely large
identical trials (an infinite
either kind of luck
of
is
of
The only acceptable
in the experimental definition of probability as discussed later. definition, the probability
is,
and
average or cancel. This
the events are truly independent
number
in the limit),
and a run of
number
eventually of negligible consequence as the
continues to increase (assuming that the probability so defined
unique, an assumption also discussed
Compound
events:
probability of a general
general addition theorems.
compound
is
later).
event
when
the
Now
consider the
component events are
overlapping or have what are called overlapping probabilities.
Over-
lapping component events are not entirely mutually exclusive. For example, if
one card
is
drawn from each of two ordinary decks of 52
ways of having two matches and one mismatch. In The 24 ways are seen from the table :
HH
this view,
p =
24(£)
6
cards,
=
3(J)
3 .
what is the probability that at least one of them (i.e., either one or both) is the ace of spades? This probability is not just the sum of the component event probabilities, viz., 1/52 + 1/52, because this sum includes twice the probability that both cards are the ace of spades. The correct answer is the sum of the probabilities for (a) an ace on the first draw and anything on the second and (b) no ace on the first draw and an ace on the second, viz.,

    (1/52) · 1 + (51/52) · (1/52) = 1/52 + 1/52 - (1/52 · 1/52)

where the term in parentheses corrects for the probability of simultaneous appearance (or overlap) of both component events. This and other examples of probabilities of compound events made up of overlapping independent component (nonmutually exclusive) events are illustrated by the following equations:

    p(neither A nor B) = [1 - p(A)] · [1 - p(B)]
                       = 1 - p(A) - p(B) + p(A) · p(B)                    (1-6)

    p(either A or B, not both) = p(A) · [1 - p(B)] + p(B) · [1 - p(A)]
                               = p(A) + p(B) - 2 p(A) · p(B)              (1-7)

    p(either A or B or both) = p(A) · p(B) + p(either A or B, not both)
                             = 1 - p(neither A nor B)
                             = p(A) + p(B) - p(A) · p(B)                  (1-8)

Equations 1-6, 1-7, and 1-8 are commonly known as the general addition theorems. Equation 1-3 is the special case for mutually exclusive independent events.
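The addition theorems are easy to check numerically. The sketch below (illustrative, using the two-deck example from the text) verifies Eq. 1-8 against a direct count over the 52 × 52 equally likely pairs of draws.

```python
from fractions import Fraction

pA = pB = Fraction(1, 52)     # ace of spades from the first deck and from the second

# Eq. 1-8: p(either A or B or both) = p(A) + p(B) - p(A)*p(B)
p_addition = pA + pB - pA * pB

# Direct count over the 52*52 equally likely ordered pairs of draws
# (card index 0 stands for the ace of spades in each deck).
favorable = sum(1 for c1 in range(52) for c2 in range(52) if c1 == 0 or c2 == 0)
p_count = Fraction(favorable, 52 * 52)

print(p_addition, p_count, p_addition == p_count)   # 103/2704 both ways
```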
The concept of "sample space"
is
a very convenient one.
In a given
probability situation all possible outcomes comprise sample space.
For
example, in the drawing of one card from a deck of 52 cards, there are 52 points in sample space. (Sample space
may be
visualized as points appro-
on a sheet of paper.) The convenience of the concept is readily seen in the idea of overlapping component events. Consider the 52-point space just cited: of the four points representing aces, two also are found among the 26 points representing red cards. Other examples are given in the problems of Sections 1-6 and 1-9. priately arranged
Conditional probability: multiplication theorem. This leads to a branch of the subject known as conditional probability. The general multiplication theorem, of which Eq. 1-5 is the special case for independent events, involves the ideas of partially dependent events. If event B cannot occur unless some condition is imposed, say unless event A has occurred, then the probability for B must include the probability that the condition is satisfied. In this simple example, the probability for the compound event (A and B) may be written

    p(A and B) = p(A) p_A(B)                                 (1-9)

where p_A(B) is to be read as the probability, on the assumption that A has already occurred, that B will occur. Often, p_A(B) is written as p(B | A). Equation 1-9, in its general form in which B depends upon more than one condition, is known as Bayes' theorem. (Bayes' theorem is usually stated a little differently, viz., as the probability that B was preceded by the specified events A_1, A_2, ...; this is also known as inverse probability.)

Incidentally, the definition for the independence of events A and B is

    p(B | A) = p(B)   and   p(A | B) = p(A)
are placed in a jar identical jar.
If
withdrawn from
We
one of the two jars it,
what
selected at
is
argue that on the condition that the
f ; and that the white probability is |.
probability
is
on the condition Either jar
Hence, the probability that a white
and from the second white ball
is
the
jar, \
random and one
ball
the probability that this ball will be white?
is
•
The
\.
first
jar
chosen the white
is chosen chosen with a probability of \.
is
drawn from
is
is
that the second jar
the
first
over-all probability for
jar
is
\
•
f
drawing the
sum A)(^l)
= 2*4+2'3 =
24
The
subscript on p indicates that this is the probability based on our knowledge before any ball has been drawn; we make use of the subscript
notation
later.
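This weighting of each jar's white probability by the 1/2 chance of choosing that jar is easy to confirm. The sketch below (illustrative only) computes p_0(W_1) exactly and also checks it by random sampling.

```python
import random
from fractions import Fraction

jars = {"first": ["W", "W", "W", "B"], "second": ["W", "B", "B"]}

# Exact: weight each jar's white probability by the 1/2 chance of choosing it.
p_white = sum(Fraction(1, 2) * Fraction(jar.count("W"), len(jar)) for jar in jars.values())
print(p_white)                                   # 13/24

# Monte Carlo check of the same quantity.
random.seed(1)
trials = 100_000
hits = sum(random.choice(random.choice(list(jars.values()))) == "W" for _ in range(trials))
print(hits / trials)                             # close to 13/24, about 0.542
```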
Another example of conditional probability is the following. Suppose that in a jar are two balls of which either is black (B) or white (W), and suppose that we have no additional a priori information about the particular color complex of the balls in the jar. What is the probability that the first ball drawn will be white? In the absence of any additional a priori information, it is customary to presume that there is an equal probability p that each of the possible hypotheses is the correct one. We might further presume that there are three hypotheses, viz., two whites in the jar, a white and a black, and two blacks. With these two presumptions, we would write

    p_0(Hyp WW) = p_0(Hyp WB) = p_0(Hyp BB) = 1/3

Accordingly, the over-all probability for a white on the first draw is given by the sum

    p_0(W_1) = 1/3 · 1 + 1/3 · 1/2 + 1/3 · 0 = 1/2

since the three hypotheses are mutually exclusive. In this problem, the three hypotheses are similar in the argument to the two jars, one of which is chosen at random, in the preceding problem.

It is to be emphasized that the presumption of equal probabilities for the three different hypotheses is really made in desperation since we have no information on which to form a better judgment. All we know in fact are (1) that the probability for each possible hypothesis is somewhere in the range 0 to 1 and (2) that the sum of the probabilities for all the possible hypotheses is unity.
Depending upon our view
This example has a further complication.
of the conditions under which the balls were placed or the jar,
we have no
know
we might presume
somehow
got into
that there are four, instead of three, equally
probable hypotheses. According to this view we would write
WW) =
p (Hyp
For our purpose now, no
Hyp BW,
made between
=\ Hyp WB and
would be assigned a probability of
\ instead of \
p (Hyp
but as a unit
it
WB)
=
p (Hyp
BW) =
distinction need be
p (Hyp BB)
of being the correct one. Accordingly, the over-all probability for a white
on
the
first
draw
is
a probability that
given by the
is
the
equally likely hypotheses. the ball replaced
sum
same as But
if
that determined
the color of the
on the basis of three ball drawn is noted,
first
and the jar shaken, and then a second
ball
is
to be drawn,
it can be easily shown that the numerical value of the white probability number of equally likely hypotheses assumed 2 ) is dependent upon the Pi( at the time of the first draw. (As an exercise, the student should show this
W
dependence.) as to the "proper" number of a priori equally probable an inherent part of problems of this sort. This is one non-
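The dependence mentioned in the exercise can be made concrete with a short Bayesian-update sketch (illustrative only; the function and variable names are not the author's notation). It computes p_1(W_2) after a white first draw, once for the three-hypothesis prior and once for the four-hypothesis prior.

```python
from fractions import Fraction

def p_white_second_draw(prior):
    """prior maps each hypothesis (its white-ball fraction) to its a priori probability."""
    # Posterior after drawing one white ball, in the spirit of Eq. 1-9 run in reverse.
    evidence = sum(p0 * f for f, p0 in prior.items())
    posterior = {f: p0 * f / evidence for f, p0 in prior.items()}
    # The ball is replaced and the jar shaken, so each hypothesis again gives a white
    # with probability f on the second draw.
    return sum(p1 * f for f, p1 in posterior.items())

half = Fraction(1, 2)
three_hypotheses = {Fraction(1): Fraction(1, 3), half: Fraction(1, 3), Fraction(0): Fraction(1, 3)}
four_hypotheses = {Fraction(1): Fraction(1, 4), half: Fraction(1, 2), Fraction(0): Fraction(1, 4)}

print(p_white_second_draw(three_hypotheses))   # 5/6
print(p_white_second_draw(four_hypotheses))    # 3/4, a different value
```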
The ambiguity hypotheses trivial
is
aspect of the arbitrariness inherent in the concept of probability as
mentioned in Section 1-1. For practice, let us rephrase the problem and extend it, knowing however that the numerical values from here on depend upon the initial number of equally probable hypotheses assumed. Consider a thin metal disk which we propose to toss as a true coin. Suppose that there are only three hypotheses: (a) that the disk has a mark on both sides that we may call heads, Hyp HH; (b) that it has a different mark on both sides that we may call tails, Hyp 7T; and (c) that it has heads on one side and tails on the other, Hyp HT.* With no a priori information as to which of these *
Again,
owing likely
if
Hyp
in any way from Hyp HT, e.g., we would start with four equally
77/ were recognized as different
to the process of manufacture of the disk,
hypotheses instead of three.
Classical (a priori) Probability
three hypotheses
is
the correct one, suppose that
an equal probability,
we
write
p H h(H\) as trie Hyp HH is
condition that toss
first
is
correct, then toss
first
is
Po(tfi)
/>
(Hyp HT)
we may
=
p (Hyp TT)
probability that the correct,
Pht(^i)
heads on the condition that
as the probability that the is
we assume
in
desperation
viz.,
HH) -
p (Hyp If
IS
first
toss
is
Hyp
first
=
toss
}
is
heads on the
as tne probability that the
HT
is
correct,
and p TT {H^)
heads on the condition that
Hyp TT
write the expression for the probability that the
heads as
=
Po(Hyp
HH) p HB(HJ + p (Hyp HT)
+
Wi)
p (Hyp TT) pTT (H x )
where the subscript on p refers to the probability before the outcome of any toss is known. Substituting the known quantities as discussed above, /><>(# i)
= i-i +
as expected.
Next,
we
Now we
toss the thin metal disk
W + i-o-i
and observe
that the
outcome
is
heads.
have some information about the relative probabilities of the
three hypotheses; viz.,
we know
that
A(Hyp TT)
=
and, furthermore, that* Pl
(HypHH)>p (UypHT)
The subscript 1 onp refers to the probability after the outcome of one toss is known. Again let us point out that all we really knew before noting the outcome of the first toss was that each hypothesis had a probability somewhere in the range to 1, and it was only in desperation that we guessed the same probability, J, that each hypothesis was correct. The quantitative evaluation of each of the remaining two hypotheses is made by the same *
We
might
arguing that
if
fail
outcome of the were
in
fact
to recognize the second feature of the additional information,
the other side of the disk were in fact tails first toss.
tails,
This argument would be valid
it
if
would not be altered by the we knew that the other side
but this we do not know; we are concerned merely with the
Only the Omniscient knows for sure that it is tails, if it is. two remaining hypotheses before and after the toss, i.e., that the relative probabilities are unaltered by the outcome of the first toss, implies belief in unaltered probabilities after many tosses regardless of their outcomes. That this belief is untenable, unless revealed by the Omniscient, is easily seen. Suppose that a long sequence of heads were to appear; even a novice gambler would suspect that the disk were HH. And an expert gambler or logician would be a little bit suspicious even probability that
it is tails.
Belief in equal probabilities for the
after the first toss.
Probability and Experimental Errors in Science
16
type of reasoning as before, using respective probabilities instead of events in Eq. ft (Hyp
Thus,
1-1.
Po(HypHH)- PHH (lH)
HH) = p (Hyp
HH)
p HH (lH)
+
HT) p HT (W)
p (Hyp
•
and Pl (Hyp
is
the heads observed in the
^^
HH) =
x
3
Now,
•
p (Hyp HH) p HH (\H) + p (Hyp HT)
\H
where the event
p,(Hyp
^PoiHypHT) p HT (lH)
HT) =
^(Hyp HT)
^
= 3
= Pl (Hyp HH) p HH (H + ^(Hyp HT) —-2.1L 3 _ — 56 •
1
I
3
We
and
-'
2
Hence,
toss.*
=
l^32
x
the probability that the second toss will be heads, event
Pl (H 2)
The
2)
•
J
H
2,
is
p HT (H 2 )
1
.
2
i
second time and observe that
toss the disk a
heads.
=3
first
p HT {\H)
•
HH hypothesis
is
again comes out
it
further strengthened at the expense of the
HT hypothesis. We have observed event 2H, viz., two heads in a row, and we
write
p 2 (Hyp
HH) Phh(2H) (Hyp HT) p HT (2H) p (Hyp HH) p HH (2H) + p (Hyp
HH)
•
/>
i-l p 2 (Hyp
+
i-4
5
Po(Hyp
HT) = p (Hyp
l
3
l
HH) 4
3
•
HT)
p HH (2H)
+
p HT (2H)
•
pQ (Hyp
HT)
p HT (2H)
•>
These expressions may be "derived" with the following type of argument. Let us make tosses, using an disk Np (Hyp HH) times and an HT disk Np (Hyp HT) times. This arrangement ensures that the probability is/? (Hyp HH) that a disk chosen at random is an disk. In the ./V tosses, the number of heads that we obtain, on the *
HH
N
HH
average,
is
A^ (Hyp HH)
-p HB {\H)
+ Np
(Hyp HT) p BT (\H)
Np (Hyp HT)-
BT (\H) are with the
HT
probability that any one toss, chosen at
random from among
the
Of
these observed heads,
outcome
is
heads with an
/>
HT disk
(Hyp
the
first
Then the whose
tosses,
is
p (HypHT)p HT (\H) HH) PlIB {\H) + p (Hyp HT) p BT (\H)
This can be stated alternatively as the probability that e.g.,
disk.
N
toss in a series of tosses, the disk will be
in
an
any one toss giving a head, i.e.^^Hyp HT).
HT disk,
17
Classical (a priori) Probability
and the probability that the
third toss will be heads
p 2 (H 3 )
The outcomes of
We may
all
generalize
JH
)
|
+
1
•
i
•
i
=A
tosses are, of course, assumed to be independent. and write the expression for the probability that the
wth toss will be heads, p
=
is
n
if all
—
1
=
tosses were heads, as
1
—^
+
•
-
=
^J-
(1-10)
(remember, for n = 1, we had three hypotheses instead of two).* After observing n heads in a row (n > 0), the probability
for
>
any integer «
for the
1
HH hypothesis, viz., Pn
(HypHH)=^-
(1-11)
i
rapidly approaches unity as n increases, but the hypothesis never
completely certain
—after n
tosses, the (n
+
l)th toss
may be
tails
becomes and the
HH probability would drop abruptly to zero. 1-5.
1-5. Inferred Knowledge

The above examples of conditional probability also illustrate the basic feature of inferred knowledge. This is the real substance of science. In the last example, as only H's appeared in n trials, we became rather certain by inference that the disk consisted of two heads. But, even though the probability is small for tails on the (n + 1)th toss, it may indeed be tails, and, in this event, the reliability of the HH hypothesis would drop abruptly to zero. Such is the possible fate of any inferred knowledge, i.e., of any knowledge based on a limited number of observations. Any such knowledge is actually a hypothesis which, as our confidence in it increases with experience, may be dignified by being called a theory or a generalization.†

Any and all knowledge in an experimental science is inferred from a limited number of observations. Usually, however, the evidence on which a scientific generalization rests is so complex that the outcome of more than one experiment is needed to topple it completely. Rather than being toppled, a well-based theory, when confronted with an unexpected outcome, is usually altered (i.e., further developed) to include the new information as "expected." Such is the progress of inferred knowledge of any sort; and such is the central feature of experimental probability as discussed later.

† C. S. Peirce, The Collected Papers of Charles Sanders Peirce (Cambridge, Mass., 1931-35), ed. Hartshorne and Weiss, said in effect, "All beliefs and all conclusions, however arrived at, are subject to error. The methods of science are more useful than old wives' gossip for achieving stable and reliable conclusions, but science offers no access to perfect certitude or exactitude. We can never be absolutely sure of anything." Then, to the objection that the proposition "There is no absolute certainty" is itself inconsistent, Peirce answered, "If I must make any exception, let it be that the assertion 'Every assertion but this is fallible' is the only one that is absolutely infallible."
It was said earlier that deciding upon the "proper" number of equally probable a priori hypotheses is an inherent part of each problem of inferred knowledge. Let us explore this a little further and ask the question, What is the probability that the sun will rise tomorrow? One assumption is that there are only two hypotheses to be considered, viz., the sun will rise or it will not rise, analogous to the two outcomes in the toss of a coin. These two hypotheses are presumed in desperation to be equally probable, each probability being 1/2 at the start of things, i.e., before the first sunrise. Some people argue that the probability that the sun will rise again, after having risen n days in a row, is (1/2)^(n+1), but this is obviously erroneous (notice how small it is!) because it does not allow for an increase in the sunrise probability as experience accumulates. So, other people argue that, after the first and subsequent sunrise observations, the probability decreases that the hypothesis "the sun will not rise" is the correct one. This argument is identical to the one in the thin metal disk problem as discussed above, and the desired probability is (2ⁿ + 1)/(2ⁿ + 2). As a third and last argument, we might consider that at the dawn of history, or at whatever time n = 1, all hypotheses in the entire range of probabilities from 0 to 1 are equally probable. This assumption of equal probabilities for each of an infinite number of hypotheses is again a desperation-in-ignorance type of assumption. It is to the effect that our universe was chosen at random from a collection of universes in which all conceivable universes in regard to the sunrise probability were equally probable. On this argument we would conclude that the desired probability is (n + 1)/(n + 2).* Laplace advanced this last argument in 1812, and the expression (n + 1)/(n + 2) is called the Laplace law of succession. Laplace offered publicly to bet anyone 1,826,214 to 1 that the sun would rise tomorrow (he reckoned n as 5000 years†).

These three arguments conclude with quite different numerical values of the desired probability, aside from the question of the proper value of n, and serve to illustrate the inherent difficulty in the development and test of the reliability of knowledge. The problem is, of course, most acute when a bit of new knowledge is just being conceived. At this time, what are the equally likely hypotheses, or what are the particular hypotheses even worth considering? Think of the plight of the observer, born during the night 5000 years ago, who has never seen or heard of the sun or of a tomorrow, contemplating the prospect that the sun will rise tomorrow, or, if he has seen it just once, contemplating the probability that it has regular habits. Of course, now, with our accumulated experience, confidence in our knowledge that the sun will rise tomorrow is great, and the difficulties in the origin of this knowledge may be amusing. But the alert student will see immediately many modern examples of such inherent difficulties in new hypotheses or theories, and of the inherent arbitrariness in the probability or reliability of a prediction in terms of a new theory, or, indeed, of any theory, old or new.

Further comment is in order in regard to the desperation assumption. This assumption is that, with no information whatsoever, each of only two possible outcomes should be assigned a probability of 1/2. On this assumption we would say that the probability of "life on Mars" is 1/2; we would also say that the probability of cats on Mars is 1/2, of elephants is 1/2, and, indeed, of every individual form of life is 1/2. If there are N different forms of life, the probability of at least one form is 1 − (1/2)^N, which, if N is large, is very near certainty and much greater than the first answer of 1/2. What is wrong? Nothing is wrong. As soon as we know or profess to know that there is a reasonable probability for more than one form of life on Mars, we are no longer in such complete desperation as to answer 1/2 to the first question. Nor is our ignorance quite so profound as for us to answer 1/2 for any of the additional questions, although we are admittedly rather disturbed when confronted with such questions. We must be very careful in making the complete desperation assumption. There are numerous classical "paradoxes" that have been expounded on this point. Additional knowledge always modifies a probability, except for the Omniscient for Whom the answer to any probability question is always either zero or unity.

* This result may be derived along lines of conditional probability without specifically evaluating hypotheses as such. Imagine N + 1 jars, each containing N balls, black and white, such that the ith jar contains i black and N − i white balls, i taking on integer values from 0 to N. A jar is chosen at random and n balls are drawn one by one with replacement after each draw. Suppose event (nB) has occurred, i.e., all n balls drawn are black. What is the probability that the next ball drawn from the jar will also be black? If we choose the ith jar, the probability for (nB) is pᵢ(nB) = (i/N)ⁿ. Therefore, since choices of jars are mutually exclusive events,

$$ p(nB) = \sum_{i=0}^{N}\frac{1}{N+1}\Bigl(\frac{i}{N}\Bigr)^{n} \approx \frac{N^{\,n+1}}{(n+1)\,N^{\,n}\,(N+1)} $$

and, likewise, the probability that n + 1 balls drawn in a row are all black is

$$ p\bigl((n+1)B\bigr) \approx \frac{N^{\,n+2}}{(n+2)\,N^{\,n+1}\,(N+1)} $$

Then the required probability, viz., that (n + 1)B occurs after we know that nB has occurred, is

$$ \lim_{N\to\infty}\frac{p\bigl((n+1)B\bigr)}{p(nB)} = \frac{n+1}{n+2} $$

[Dividing by p(nB) in the last step is equivalent to evaluating and using the appropriate hypotheses as in the example with the thin metal disk, but this may not be obvious.]

† The modern scientist would find numerous inferred "reasons" for believing that the sun had been rising regularly for many years before 5000 years ago. For example, we may invoke generalizations about planetary motion, interpretations of archeological and geological records of sun-dependent life preceding the earliest records of man, the time involved in the evolution of stars, etc.
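The jar argument in the footnote above can also be checked numerically. The sketch below is my own illustration, not part of the text; it evaluates the exact sums p(nB) and p((n + 1)B) for a large but finite N and confirms that their ratio approaches Laplace's (n + 1)/(n + 2).

```python
def p_all_black(n_draws, N):
    """Probability that n_draws draws (with replacement) from a randomly
    chosen jar are all black, the jars i = 0..N holding i black balls out of N."""
    return sum((i / N) ** n_draws for i in range(N + 1)) / (N + 1)

N = 20000          # number of jars minus one; a large N approximates the limit
for n in (1, 5, 100):
    ratio = p_all_black(n + 1, N) / p_all_black(n, N)
    laplace = (n + 1) / (n + 2)
    print(f"n = {n:3d}   ratio = {ratio:.6f}   (n+1)/(n+2) = {laplace:.6f}")
```

For n = 1 the ratio is already close to 2/3, and the agreement improves as N grows.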
1-6. Problems

Note: A numerical answer to a problem is not a complete answer; the student should justify the application of the equation(s) he uses by giving an analysis of the problem, pointing out how the problem meets satisfactorily the conditions on which each equation is based. To develop his "intuition," the student should contemplate the comparison of the correct answer with his a priori expectation. To this end, answers are given to most problems.

1. What is the probability of drawing at random each of the following from a jar containing 3 red and 5 black balls:
(a) 2 red balls simultaneously, (ans. 3/28)
(b) 3 red balls in successive draws with replacement after each draw, and (ans. 27/512)
(c) 2 reds and a black in a single draw of 3 balls? (ans. 15/56)

2. Ten people are arranged at random (a) in a row, and (b) in a ring. What is the probability that 2 given people will be (i) next to each other, and (ii) separated by 1 person between them? [ans. (a) (i) 1/5, (ii) 0.178; (b) (i) 2/9, (ii) 2/9]

3. Two cards are drawn simultaneously from a 52-card deck. What is the probability that
(a) at least one of them is a spade, (ans. 15/34)
(b) they are both red, and (ans. 25/102)
(c) one is an ace and the other a black ten? (ans. 4/663)

4. With two dice cast together, what is the probability for
(a) an even number on each die, (ans. 1/4)
(b) either a 7 or an 11, and (ans. 2/9)
(c) neither a 2, nor an 11, nor a 7 in the first cast, and a 7 in the second cast? (ans. 1/8)

5. What is the probability that, in tossing a penny,
(a) 5 heads appear in the first 5 tosses, (ans. 1/32)
(b) the second head appears on the fifth toss, and (ans. 1/8)
(c) in 5 tosses, a head appears exactly twice? (ans. 5/16)

6. In how many throws of 1 die is there a probability of less than 0.2 of
(a) seeing a 5, (ans. 1)
(b) seeing a 5 for the first time in the last throw, and
(c) not seeing a 5? (ans. 9)

7. Two dice are cast together. Let A be the event that the sum of the faces is odd, and B the event of at least 1 ace. Are these 2 events independent? Mutually exclusive? What are the probabilities that (a) both A and B occur, (b) either A or B or both occur, (c) A and not B occurs, and (d) B and not A occurs? Note that the sum of answers (a), (c), and (d) gives the answer to (b). [ans. (a) 1/6, (b) 23/36, (c) 1/3, (d) 5/36]

8. A coin is tossed until, for the first time, the same result appears twice in succession. What is the probability that
(a) the experiment ends before the sixth toss, and (ans. 15/16)
(b) an even number of tosses is required? (ans. 2/3)

9. A Cornell student parked illegally overnight on 100 randomly selected nights. He received a total of only 12 "tickets," all on either Monday or Friday nights. If nothing further is known about the police's selection of nights for checking parked cars, what is the probability of the student's getting a ticket if he parks
(a) next Monday night, (ans. 0.42)
(b) on Monday and Friday nights of next week, and (ans. 0.24)
(c) on no more than two unspecified nights of next month? (ans. 0.30)
If it is known that the police check on random nights, what is the probability that the student, parking on random nights, will get the next 12 tickets all
(d) on Mondays and Fridays, and (ans. (2/7)¹²)
(e) on no more than two unspecified nights of the week? (ans. 21(2/7)¹²)

10. On an empty chessboard of 64 squares, the white and black queens are placed at random except that they cannot occupy the same square. What is the probability that the 2 queens are in striking position, i.e., in the same row, column, or diagonal? (ans. 13/36)

11. A marksman hits a target on the average 4 times out of 5. Assume that the probability of a hit is constant. What is the probability that he hits the target
(a) exactly 4 times in the first 4 shots, (ans. 0.41)
(b) exactly 4 times in 5 shots, and (ans. 0.41)
(c) 4 times in the first 4 shots but misses on the fifth shot? (ans. 0.082)
In what sense is this an a priori probability problem? In what sense is it not?

12. What is the probability that the birthdays of 12 randomly selected people fall, assuming equal probabilities for all months,
(a) in the same calendar month, (ans. 1.34 × 10⁻¹²)
(b) in January, and (ans. 1.12 × 10⁻¹³)
(c) in 12 different calendar months? (ans. 5.37 × 10⁻⁵)

13. A man belongs to a club of 10 members. Every day he invites 5 members to dine with him, making a different party each day.
(a) For how many days can he do this? (ans. 126)
(b) How many parties will each man attend? (ans. 70, 126)

14. A closet contains 9 pairs of shoes with no 2 pairs alike. If 4 shoes are chosen at random, what is the probability that there will be no complete pair among them? Generalize the answer for the case of n pairs of shoes with 2r shoes chosen at random (2r < n). (ans. 56/85)

15. Suppose that 6 white and 9 black balls are in a jar.
(a) If the color of the first ball drawn at random is not known and this ball is not replaced, what is the probability that the next ball drawn will be white? (ans. 2/5)
(b) If the first ball is known to be white and is not replaced, what is the probability that the second ball drawn will be white? (ans. 5/14)
(c) If the first ball is known to be white but the color of the second ball drawn is not known, and neither is replaced, what is the probability that the third ball drawn will be white? (ans. 5/14)
(d) Why is the answer to part (a) the same as the probability for drawing a white ball on the first draw?

16. Suppose that the 15 balls of Problem 15 are divided between 2 jars, with 4 white and 1 black in one and with 2 white and 8 black in the other. If a jar is selected at random and a single ball drawn, what is the probability that
(a) it will be white? (ans. 1/2)
(b) If this ball is not replaced and its color is not known, what is the probability that the next ball drawn (from a randomly selected jar) will be white? Should this answer be expected to be the same as the answer to part (a)?
(c) If the first ball is not replaced but is known to be white, what is the probability that the second ball drawn (from a randomly selected jar) will be white? (ans. 0.47)
(d) What is the probability for the successive drawing of 2 white balls without replacement from randomly selected jars? (ans. 0.235)

17. Suppose that someone has a penny that you suspect has one chance in a million of being bad, i.e., of having heads on both sides. The penny is then tossed n times and every time it comes up heads.
(a) How large would n have to be before your suspicion that the penny is bad increases to 1 chance in 2? (ans. n = 20)
(b) How large must n be if you were initially certain that the penny was and remains good? (ans. In this case experimental evidence is not relevant)

18. Referring to Problem 9, calculate the ticket probability for next Wednesday night. Assume that only two hypotheses are reasonable, viz., the police check either (A) randomly among the 7 days of each week or (B) randomly on only Mondays and Fridays; and further assume, before the Monday-Friday 12-ticket experience, that the two hypotheses are equally probable (the usual guess in desperation before actual observation). (ans. 3.55 × 10⁻⁸)
1-7. Combinatorial Analysis

As stated in Sections 1-2 and 1-3, methods for determining the total number of equally possible events in somewhat complicated ideal games of chance were soon worked out by the mathematicians. These methods constitute the subject of combinatorial analysis: they give formulas for computing the number of permutations and the number of combinations. These formulas are discussed next.

Permutations. Consider a set of n objects having different colors or some characteristics that make each object different from any other. Arrange the n objects in some order, e.g., along a line. This arrangement is called a permutation of the n objects. If two objects are interchanged in their positions, a different permutation results. Now select k of the n objects and arrange them in order, e.g., along a line. How many permutations are possible as k objects are selected from the population of n objects?

Suppose for simplicity that the k objects are selected one by one. Any one of the n objects may be chosen to fill the first of the k positions; hence there are n possible choices in the selection of the first object. Then n − 1 objects are left, from which one is chosen for the second position. Since there are n possible choices in selecting the first object and n − 1 possible choices in selecting the second object, there are a total of n(n − 1) possible choices in selecting both objects. Continuing, there are n − 2 objects left from which the third is selected. In selecting all three objects, there are n(n − 1)(n − 2) possible choices. In general, the kth object is selected from (n − k + 1) objects, and the total number of possible choices in selecting all k objects is n(n − 1)(n − 2) ⋯ (n − k + 1). Since each possible choice represents one permutation, the total number of permutations is

$$ {}_nP_k = n(n-1)(n-2)\cdots(n-k+1) = \frac{n!}{(n-k)!} \qquad (1\text{-}12) $$

The symbol ${}_nP_k$ is commonly read as the number of permutations of n things taken k at a time.* In case k = n, the total number of permutations is

$$ {}_nP_n = n! \qquad (1\text{-}13) $$

* The phrase "n things taken k at a time," commonly used in the discussion of permutations, requires amplification. In a given sample, it is the k things that are "taken" rather than the n things, but the total number of permutations is equal to the total number of different ordered samples of size k that can be taken from the population n. In this sense, the n things are taken as one "boundary condition" and the sample size k as a second "boundary condition" in the sampling process.
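As a quick numerical illustration of Eq. 1-12, the following lines (a sketch of my own, not from the text) count the ordered samples directly and compare the count with n!/(n − k)!.

```python
from itertools import permutations
from math import factorial

def nPk(n, k):
    """Number of permutations of n things taken k at a time (Eq. 1-12)."""
    return factorial(n) // factorial(n - k)

# Brute-force check for a small case: 4 letters taken 2 at a time.
letters = "abcd"
ordered_samples = list(permutations(letters, 2))
print(len(ordered_samples), nPk(4, 2))   # both are 12
assert len(ordered_samples) == nPk(4, 2)
```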
Equations 1-12 and 1-13 are consistent with the convention, in the algebra of factorials, of taking 0! = 1.*

As implied above, a permutation is defined as one possible ordered sample of a set of nonidentical objects. The sample size may be equal to or less than the total population, i.e., the total number of objects in the set. Of course, k can be any number from 1 to n.

It is not necessary that the k objects be selected one by one; they may all be withdrawn simultaneously and subsequently placed in order. It is instructive to consider the problem from this view. Suppose that we have n bins, and that we wish to distribute k objects among these bins, no more than one object to a bin. In placing the first of the k objects, we have n possible choices; in placing the second object, we have n − 1 possible choices; etc. The total number of possible choices in distributing the k objects is the same as given by Eq. 1-12.

As a simple example, consider the number of permutations that arise when four different letters are taken two at a time. In this case,

$$ {}_4P_2 = \frac{4!}{(4-2)!} = 4\cdot 3 = 12 $$

and these permutations, if the letters are a, b, c, d, can be written as ab, ba, ac, ca, ad, da, bc, cb, bd, db, cd, dc.

Stirling's formula. Numerical evaluation of a factorial number is often inconvenient. Help is frequently found in the use of Stirling's formula,

$$ z! = \sqrt{2\pi z}\,\Bigl(\frac{z}{e}\Bigr)^{z}\Bigl(1 + \frac{1}{12z} + \frac{1}{288z^{2}} - \frac{139}{51840z^{3}} + \cdots\Bigr) \qquad (1\text{-}14) $$

where z is any integer. The first term of this expression is, as an approximation, good to 2% for z = 5 and to better than 1% for z > 9. For small z the straight factorial evaluation is usually the more convenient, and the practical use of Eq. 1-14 is usually limited to the first term.

The essential part of Stirling's formula can be derived as follows. The natural logarithm of z! is

$$ \log(z!) = \log 2 + \log 3 + \log 4 + \cdots + \log z $$

Consider a graph of y = log z, as in Fig. 1-1, and mark off ordinate values log 2, log 3, log 4, etc., and abscissa values 1, 2, 3, etc. Then log(z!) is clearly equal to the sum of the areas of the rectangles indicated in the figure, each rectangle having unit width and a height equal to log 2, or log 3, ⋯, or log z.

[Fig. 1-1. Graphical interpretation of Stirling's formula: unit-width rectangles of heights log 2, log 3, ⋯ drawn against the smooth curve y = log z, with the abscissa z running from 0 to 10.]

This area is approximately equal to the area under the smooth curve y = log z out to z, the approximation improving as z increases. Hence, for large z,

$$ \log(z!) \approx \int_0^{z}\log z\,dz $$

and, integrating by parts,

$$ \log(z!) \approx z\log z - z $$

Putting this in exponential form, we have

$$ z! \approx \Bigl(\frac{z}{e}\Bigr)^{z} $$

which, with the insertion of the factor $\sqrt{2\pi z}$, is the first term of Stirling's formula, Eq. 1-14. The additional terms enter in the complete derivation to take account of the excess area between the step curve and the smooth curve in Fig. 1-1.†

* This is often stated as follows: by the definition of a factorial number, (n − 1)! = n!/n; if n is taken as unity in this expression, it follows that (1 − 1)! = 0! = 1!/1 = 1.

† Mathematicians commonly define z! as

$$ z! = \int_0^{\infty} x^{z}e^{-x}\,dx \qquad (1\text{-}15) $$

which applies whether z is integral or not. Later, we shall use the factorial of a negative integer; it is in terms of Eq. 1-15 that such factorials can be shown to be infinite. Interpreted graphically, Eq. 1-15 refers to the area under the curve y = x^z e^(−x). The integration indicated in Eq. 1-15 can be carried out by parts,

$$ \int_0^{\infty} x^{z}e^{-x}\,dx = z\,(z-1)! \qquad (1\text{-}16) $$

Further discussion of this integration, as well as of the complete derivation of Stirling's formula, can be found in standard textbooks of mathematics. [Incidentally, (z − 1)! defined in this way is called the gamma function of z, written Γ(z), and Stirling's formula is often discussed under this title.]
Sampling without replacement. Either procedure just described in arriving at the number of permutations of n things taken k at a time, or, in other words, the number of different ways k objects can be placed in n bins, is also called sampling without replacement. With this terminology, ${}_nP_k$ is read as the number of different samples of size k that can be taken from a population of n elements. In this case, the term "sample" refers to a particular permutation of the k objects.*

The a priori probability of drawing, or of otherwise realizing, a particular permutation or properly specified sample of size k from a population of n elements is $1/{}_nP_k$, assuming each possible permutation or sample to have an equal probability.

Sampling with replacement. If the first object selected from the n objects is noted and then replaced before the next selection is made, we have sampling with replacement. A given object may be selected more than once; repetitions are permitted. It follows that in selecting two objects the total number of possible choices is n·n = n²; in selecting three objects, the total number is n³; and in selecting k objects, the total number of possible choices is n^k. We say that n^k is the number of ordered arrangements of size k, repetitions permitted, that can be taken from a population of n different objects. Note that in this type of sampling no restriction is placed on the magnitude of k; the sample size k may be smaller or larger than the population n.

Tossing coins, casting dice, etc., are examples of sampling with replacement. In tossing a coin, the population n is 2, viz., heads and tails, and the sample size is the number of tosses. When random sampling is done with replacement, the assumption made in a priori probability is that each of the possible samples is equally probable. Hence, for each specified ordered sample of size k, repetitions permitted, the probability is 1/n^k. In the tossing of a coin, if the sequential order of heads and tails is specified as, say, five heads in a row, the probability for this sample (five independent events as discussed earlier) is 1/2⁵.

Some interesting probability problems involve, in a sense, both kinds of sampling. For example, what is the probability that five randomly selected digits are all different? In this problem, although a selected digit is not "replaced," there are assumed to be an infinite number of each type of digit in the imaginary jar from which the selection is made; hence the problem is really one of sampling with replacement. It is immaterial whether the sampling is one by one or is a single group of five together. We note that, in this example, n = 10 and k = 5, and that there are therefore 10⁵ possible ordered arrangements, repetitions permitted, each assumed to be equally probable. There are ${}_{10}P_5$ ordered arrangements without repetitions. Hence, the answer to the question is

$$ p = \frac{{}_{10}P_5}{10^{5}} \approx 0.3024 $$

Consider another problem. An elevator has seven passengers and empties with stops at ten floors, and we ask, What is the probability that no two passengers leave at the same floor? Assuming equal probability for each of the 10⁷ ordered arrangements with repetitions permitted, we find the answer to be

$$ p = \frac{{}_{10}P_7}{10^{7}} \approx 0.06048 $$

* The term "sample" has various meanings: as a verb it refers to a process, and as a noun it refers to something selected. Here it means a permutation; in the next paragraph it means an ordered arrangement of k objects that are not necessarily all different; and in the "combination" problems discussed presently, sample refers to different objects but without regard to their particular sequential order. The term "arrangement" is also a general one, like "sample," and is also often used in special senses without the proper qualifying phrases. Some authors use the term "combination" in a similar general way, but usually it has the restricted meaning assigned presently.
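Both of the preceding answers can be verified with a few lines of Python (an illustrative sketch of my own, not part of the original text):

```python
from math import perm

# Probability that five randomly selected digits are all different (n = 10, k = 5):
p_digits = perm(10, 5) / 10**5
# Probability that no two of seven elevator passengers leave at the same floor
# when there are ten floors (n = 10, k = 7):
p_elevator = perm(10, 7) / 10**7
print(round(p_digits, 4), round(p_elevator, 5))   # 0.3024 and 0.06048
```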
Combinations: binomial coefficients. Suppose now that the k objects, instead of being ordered by sequential drawings from n different objects or by individual positions in an n-place bin, are considered to be a single unordered group. Such an unordered group is a type of sample called a combination. Thus, in the case of the four letters a, b, c, and d, the two-letter sample ab is the same combination as ba, but is different from ac, etc. The total number of combinations possible in selecting k objects from n different objects may be denoted by ${}_nC_k$ or, in more modern notation, by $\binom{n}{k}$. Either of these two symbols is read as the total number of combinations of n things taken k at a time, or as the total number of combinations of size k from a population of n elements. Again, 0 ≤ k ≤ n.

The expression for the total number of combinations, $\binom{n}{k}$, is obtained as follows. The k objects as an unordered group make one combination, whereas one particular order of the k objects makes one permutation. The total number of permutations in one combination of k objects is k!, and the total number of permutations in all $\binom{n}{k}$ combinations is $\binom{n}{k}k!$. Thus,

$$ {}_nP_k = \binom{n}{k}k! $$

from which, with Eq. 1-12,

$$ \binom{n}{k} = \frac{{}_nP_k}{k!} = \frac{n!}{k!\,(n-k)!} \qquad (1\text{-}17) $$

It is apparent from the denominator of the last part of this expression that k! and (n − k)! can be interchanged; it follows that

$$ \binom{n}{k} = \binom{n}{n-k} $$

The symbol $\binom{n}{k}$ is called the binomial coefficient because it appears in Newton's binomial expansion

$$ (a+b)^{n} = a^{n} + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^{2} + \cdots + \binom{n}{k}a^{n-k}b^{k} + \cdots + \binom{n}{n}b^{n} \qquad (1\text{-}19) $$

The coefficients $\binom{n}{k}$ are arranged in an interesting way in Pascal's triangle:

n = 0:                 1
n = 1:               1   1
n = 2:             1   2   1
n = 3:           1   3   3   1
n = 4:         1   4   6   4   1
n = 5:       1   5  10  10   5   1
n = 6:     1   6  15  20  15   6   1
n = 7:   1   7  21  35  35  21   7   1
                     etc.

Entries in the nth row are the respective values of $\binom{n}{k}$ given by successive values of k. Note that the basic idea of the triangle (except for the unused apex) is that each entry is the sum of the two immediately and diagonally above it, i.e.,

$$ \binom{n+1}{k} = \binom{n}{k} + \binom{n}{k-1} $$

(As an exercise, the student should prove this equality.)

Familiarity with combinations is best achieved from specific examples; let us consider a few. The number of two-letter combinations (sample size k = 2) that can be selected from the four letters a, b, c, and d (population n = 4) is given by

$$ \binom{4}{2} = \frac{4!}{2!\,(4-2)!} = 6 $$

and these are ab, ac, ad, bc, bd, and cd.
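The binomial coefficients and the Pascal-triangle recursion are equally easy to reproduce by machine; the fragment below is my own sketch, not from the text. It prints the first few rows and checks the addition rule.

```python
from math import comb

# Print rows n = 0..7 of Pascal's triangle using Eq. 1-17.
for n in range(8):
    print([comb(n, k) for k in range(n + 1)])

# Check the recursion C(n+1, k) = C(n, k) + C(n, k-1).
for n in range(1, 8):
    for k in range(1, n + 1):
        assert comb(n + 1, k) == comb(n, k) + comb(n, k - 1)
```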
Consider further the example about a random sample of five digits discussed in the previous section, where the problem specified five different digits in an ordered group or permutation. Now let us ask, In how many ways can five different digits be selected if the sequential order is of no consequence? The answer to this question is ${}_{10}P_5$ reduced by the factor 5!, i.e., $\binom{10}{5}$.

Another example is found in the problem, How many different single bridge hands can be dealt from a 52-card deck? A hand contains 13 cards, and the particular order of the cards in a hand is of no consequence. So the question can be rephrased, How many combinations of size 13 can be taken from a population of 52? The answer is $\binom{52}{13}$ = 635,013,559,600.* The number of permutations, i.e., with the sequential order in the deal specified, is even greater by the factor 13!.

Among all these possible bridge hands, how many have only red cards? Each all-red hand is a sample of size 13 cards from the population of 26 red cards. The probability that a hand contains only red cards is then

$$ p(\text{13 red}) = \frac{\binom{26}{13}}{\binom{52}{13}} $$

It is perhaps instructive to solve this problem by the long-hand method of writing the product of the probabilities of successive independent single-card deals. Thus, the probability that the first card is red is 26/52; that the second is red, 25/51; etc., and the probability of the event that all 13 cards are red is given by the product of the component events, viz.,

$$ \frac{26}{52}\cdot\frac{25}{51}\cdot\frac{24}{50}\cdots\frac{14}{40} = \frac{26!/(26-13)!}{52!/(52-13)!} = \frac{\dfrac{26!}{13!\,(26-13)!}}{\dfrac{52!}{13!\,(52-13)!}} = \frac{\binom{26}{13}}{\binom{52}{13}} $$

In general, it is simpler to use the shorthand method of the binomial coefficients, i.e., to think in terms of a sample of size k from a population of size n, although in this particular example the most apparent simplification is in the notation.†

Another example: What is the probability that a poker hand of five cards contains five different face values? The total number of different hands (combinations) is $\binom{52}{5}$ = 2,598,960. The number of different ways (combinations) in which five cards having different face values can be selected from a single suit of 13 cards is $\binom{13}{5}$. If a suit were specified in the problem, the answer would be $\binom{13}{5}\big/\binom{52}{5}$, but no suit is specified. The number of different ways (arrangements with repetitions permitted) in which the specified five cards can be arranged among the four suits is 4⁵. Hence, the answer to the problem, assuming all possible hands to be equally probable, is

$$ p = \frac{\binom{13}{5}\,4^{5}}{\binom{52}{5}} \approx 0.5071 $$

* A bridge player immediately after looking at his hand rose up and shot the dealer. At the trial he pleaded that the dealer had obviously stacked the deck, and he was acquitted when he proved that, if the deck had not been stacked, the probability of his getting that particular hand of cards would have been only 1 in 635,013,559,600. What should have been the counterargument of the prosecuting attorney?

† It is not valid to argue that the sample in this problem is 13 cards from a population of two colors, giving 2¹³ red hands, because this improperly presumes sampling with replacement.
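Each of the card-hand probabilities just worked out reduces to ratios of binomial coefficients, so they make convenient checks for a short script (again an illustrative sketch of my own, not from the text):

```python
from math import comb

# Probability that a 13-card bridge hand is all red (26 red cards in 52):
p_all_red = comb(26, 13) / comb(52, 13)
# Probability that a 5-card poker hand has five different face values:
p_distinct_faces = comb(13, 5) * 4**5 / comb(52, 5)
print(round(p_all_red, 6), round(p_distinct_faces, 4))   # about 1.6e-05 and 0.5071
```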
Note that in each of the last two examples, component parts of each problem are analyzed in terms of different population numbers, and, in the last example, one component sampling is done with replacement. Different types of populations in a given problem are common.

Binomial distribution formula. In probability situations in which there are only two possible outcomes (outcome population = 2), each observation or sampling with replacement is called a Bernoulli trial. Obvious examples of Bernoulli trials are the successive tossings of a coin, the casting of a die for a six vs. a nonsix, etc. The essential features of Bernoulli trials are:

(1) successive trials must be independent,
(2) the outcome of each trial must be determined entirely by chance, and
(3) the probability for any specified outcome (called "success") must be constant for all trials.

Let p be the probability for "success," and let q be the probability for "failure." What is the probability for exactly k successes in n trials? The problem posed by this question is known as the Bernoulli problem. If the sequential order of successes and failures were important, the probability of exactly k successes and of exactly n − k failures would be p^k q^(n−k), since successes and failures, being mutually exclusive, are necessarily independent events. But since the sequential order is of no consequence, we must multiply p^k q^(n−k) by the number of combinations of the trial population n taken k (or n − k) at a time. This multiplication is equivalent to adding up all the mutually exclusive combinations. Hence, the answer to Bernoulli's problem is given by

$$ B(k;\,n,p) = \binom{n}{k}p^{k}q^{\,n-k} \qquad (1\text{-}20) $$

and may be read as the Bernoulli (or the binomial, see below) probability for exactly k successes out of n trials with a success probability p. Since p + q = 1, it follows by the use of Eq. 1-19 that

$$ 1 = (p+q)^{n} = p^{n} + \binom{n}{1}p^{n-1}q + \binom{n}{2}p^{n-2}q^{2} + \cdots + \binom{n}{k}p^{k}q^{\,n-k} + \cdots + q^{n} \qquad (1\text{-}21) $$

$$ = \sum_{k=0}^{n}\binom{n}{k}p^{k}q^{\,n-k} = \sum_{k=0}^{n}B(k;\,n,p) \qquad (1\text{-}22) $$

The first term in the sum of Eq. 1-21 gives the probability for exactly n successes in n trials; the second term, the probability for exactly n − 1 successes; the third term, the probability for exactly n − 2 successes; etc.; and the last term, q^n, gives the probability for zero successes. The series up to and including the p^k term represents the probability that the event called success happens at least k times in n trials. The probability for at least one success is 1 − q^n.

Because of its general form, Eq. 1-21 or 1-22 is called the binomial formula, and a graph of B(k; n, p) vs. k is called the binomial distribution of probabilities. Such distribution graphs are shown in Fig. 1-2 for two particular pairs of values of n and p. The distribution is asymmetric except for the cases in which p = 1/2 (the symmetry in these special cases will be proved in the next chapter).

Examples of binomial probability are found in the tossing of a coin n times, seeking k heads (and n − k tails), for which the head probability p per trial is known, e.g., is 1/2; the casting of a die, for which p for a particular face value is 1/6, or for an even number is 1/2; the casting of two dice, for which p for a double ace is 1/36; the hitting of a target k out of n times if the hit probability is known; etc.

It follows from the definition of the success probability p, Eq. 1-1, that, in n Bernoulli trials, the a priori expected number of successes k is given by the product np. This product np is called the expectation value. The expectation value is equal to the arithmetic mean value of k when all possible values of k are weighted by their respective probabilities as given by Eq. 1-20 or 1-21.

[Fig. 1-2. Two binomial probability distributions, B(k; n, p) plotted against k (k from 0 to about 28, ordinate from 0.00 to about 0.08); one of them has expectation value np = 50 × 1/3 ≈ 16.7. The asymmetry decreases as the expectation value, np, increases.]

The most probable value of k, viz., k_max, is that value of k for which the binomial probability is a maximum. In some cases k_max is double-valued, the two values differing by unity [see Problem 14 of Section 1-9]. k_max never differs from np by more than unity.*

Before proceeding, let us recapitulate a little. In writing the expression for the binomial distribution formula we use n to represent the number of Bernoulli trials, whereas, in the discussion that led to the binomial coefficients, n refers to the number of unlike objects from which random selections of size k are made. It is important to realize that, although the terminology is different and the "unlikeness" of the "objects" is rather specialized, the two n's refer to the same thing. Each binomial coefficient is the number of combinations or ways in which an unordered sample of size k can be taken from the n unlike objects. The binomial probability for k successes is proportional to this coefficient, i.e., to the number of ways n trials can be taken k successes at a time. The k successes are of course in an unordered sequence with the n − k failures mixed in. The specialized unlikeness of the objects in this instance is that the population of n trials is made up of only two different kinds of things, viz., successes and failures, and we do not know which is success and which is failure until after the trial. But this does not impair the argument.

The binomial is a very important distribution since so many problems can be analyzed in terms of basic Bernoulli trials. Indeed, the normal (Gauss) and the Poisson mathematical models of probability, the two most important models in treating errors of measurement in any experimental science, may (but not necessarily) be considered to be special cases of the binomial model. The normal case is that for which n becomes very large (infinite in the limit) and p is sufficiently large that np ≫ 1 (infinite in the limit). In practice the normal approximation is fairly good as long as np > 5 when p ≤ 1/2, and nq > 5 when p > 1/2. The formula for what is called the probability density in this distribution is

$$ G(z;\,h) = \frac{h}{\sqrt{\pi}}\,e^{-h^{2}z^{2}} \qquad (1\text{-}25) $$

where $h = 1/\sqrt{2npq}$ and z = np − k. Equation 1-25 is derived in Chapter 4.

The Poisson special case obtains when n becomes very large (infinite in the limit) and p becomes very small (zero in the limit) but in such fashion that the product np remains moderate in magnitude, viz., np ≪ n. The Poisson formula (derived in Chapter 5) is

$$ P(k;\,\mu) = \frac{\mu^{k}e^{-\mu}}{k!} \qquad (1\text{-}26) $$

where μ = np.

The special conveniences of Eqs. 1-25 and 1-26, and their application to measurements, are discussed in detail in Chapters 4 and 5, but one immediately apparent convenience, when compared with the binomial formula, is that evaluation of large factorials is avoided.

* This can be shown as follows. By the definition of k_max,

$$ B\bigl((k_{\max}+1);\,n,p\bigr) \le B(k_{\max};\,n,p) \quad\text{and also}\quad B\bigl((k_{\max}-1);\,n,p\bigr) \le B(k_{\max};\,n,p) $$

By expressing each B as the appropriate term in Eq. 1-22, these inequalities can be combined and written as

$$ np - (1-p) \le k_{\max} \le np + p $$
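A direct numerical comparison makes these limiting cases concrete. The sketch below is my own (the values of n and p are arbitrarily chosen, not from the text); it evaluates Eq. 1-20 exactly and, alongside it, the normal form of Eq. 1-25 and the Poisson form of Eq. 1-26.

```python
from math import comb, exp, sqrt, pi, factorial

def binomial(k, n, p):
    """Exact Bernoulli/binomial probability, Eq. 1-20."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal(k, n, p):
    """Normal (Gauss) approximation in the form of Eq. 1-25."""
    h = 1 / sqrt(2 * n * p * (1 - p))
    z = n * p - k
    return h / sqrt(pi) * exp(-h**2 * z**2)

def poisson(k, n, p):
    """Poisson approximation, Eq. 1-26, with mu = np."""
    mu = n * p
    return mu**k * exp(-mu) / factorial(k)

n, p = 100, 0.05           # large n, small p: the Poisson form should do well
for k in (2, 5, 8):
    print(k, round(binomial(k, n, p), 5), round(normal(k, n, p), 5),
          round(poisson(k, n, p), 5))
```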
When n is small, Eq. 1-20 is used directly and the computation is not excessive. When n is large and p is not small, the normal approximation, Eq. 1-25, is used. And when n is large and p is small, the Poisson approximation, Eq. 1-26, is used. Thus, all possibilities have been considered.

Because of the importance of the binomial distribution in science, it is instructive for the student to perform a little experiment to check it. Toss five "honest" pennies. Pick up those that are not heads and toss them a second time. Finally, toss a third time any that are not heads after the second toss. Repeat the whole procedure until 32 such sets of tosses have been recorded, noting after each stage whether all five pennies have come up heads. Now work out an understanding of the values of p listed at the head of Table 1-1,* and then check a few of the binomial probabilities as listed in the table. Use Stirling's formula where it is accurate enough and convenient. The numbers n and np, at least in the third-toss case, are sufficiently large that fair accuracy can be achieved with the simpler normal approximation, Eq. 1-25; also, B(k; 32, p₁) may possibly be satisfactorily represented by the Poisson approximation, Eq. 1-26. Try these approximations in a few instances in each of the one-, two-, and three-toss cases to test their accuracy and their relative convenience. Then compare the observed numbers of successes with the calculated frequencies and discuss the discrepancies. Later, in Chapter 4, we discuss the so-called χ² test for determining whether or not the observed discrepancies are reasonably compatible with the assumption that the five pennies are "honest" (i.e., that each has a head probability of 1/2).

[Table 1-1. Values of B(k; 32, pᵢ), for p₁ = (1/2)⁵ = 1/32, p₂ = (1 − 1/4)⁵ = 243/1024 ≈ 0.237, and p₃ = (1 − 1/8)⁵ = 16807/32768 ≈ 0.513. The tabulated entries are not reproduced here.]

* Calculation of p₁ = (1/2)⁵ = 1/32 is easy. To calculate p₂, note that the chance of a penny failing to show heads twice in succession is (1/2)² = 1/4; hence the probability for heads on either the first or the second try is 1 − 1/4, and the probability that all five pennies will do this is p₂ = (1 − 1/4)⁵. By the same type of argument, the probability of success in three tries is p₃ = (1 − 1/8)⁵.

p₂ may also be calculated as follows. The probability for n heads and 5 − n tails on the first try is [5!/(n!(5 − n)!)](1/2⁵). Thereafter, the 5 − n pennies are tossed again; the chance that they are all heads is 1/2^(5−n). If these two factors are multiplied and the product summed over n from n = 0 to n = 5, we obtain p₂:

$$ p_2 = \sum_{n=0}^{5}\frac{5!}{n!\,(5-n)!}\,\frac{1}{2^{5}}\,\frac{1}{2^{5-n}} $$

This sort of argument also gives

$$ p_3 = \sum_{n=0}^{5}\;\sum_{m=0}^{5-n}\frac{5!}{n!\,(5-n)!}\,\frac{1}{2^{5}}\;\frac{(5-n)!}{m!\,(5-n-m)!}\,\frac{1}{2^{5-n}}\,\frac{1}{2^{5-n-m}} $$

Multinomial coefficients. We have discussed permutations and combinations of n different objects taken k at a time. In many problems the n objects are not all different from one another. Thus, in an ordinary 52-card deck, 4 cards are aces, 13 are spades, 26 are red. Suppose that there are r different kinds of objects; then n = k₁ + k₂ + k₃ + ⋯ + k_r, with r ≤ n. The expression for the number of permutations ${}_nP_{k_1,k_2,\ldots,k_r}$ can be arrived at by the same type of reasoning as was used in obtaining $\binom{n}{k}$ from ${}_nP_k$. The k₁ similar objects in a single combination represent k₁! permutations of the n! total number of permutations that would exist if all n objects were different. Likewise, the k₂ similar objects in a single combination represent k₂! permutations if all n objects were different. By continuation of this argument, the number of permutations ${}_nP_{k_1,k_2,\ldots,k_r}$, when multiplied by k₁! k₂! ⋯ k_r!, gives the total number of permutations that would be present if all n objects were different. Hence, we may write

$$ {}_nP_{k_1,k_2,\ldots,k_r} = \frac{n!}{k_1!\,k_2!\cdots k_r!} = \frac{n!}{\prod_{i=1}^{r}k_i!} \qquad (1\text{-}27) $$

where the symbol ∏ means the product. This is to be read as the total number of ways in which n objects can be divided into r groups of which the first contains k₁ objects, the second contains k₂ objects, etc., when the order of the r groups is preserved. Only the order within each k group is relaxed, making it a combination.

It is instructive to derive Eq. 1-27 by a different route, this time following the line of reasoning that led to the expression for ${}_nP_k$ directly rather than to the expression for $\binom{n}{k}$ as above. As a first step, consider n to be divided into just two groups of like objects. Then n = k₁ + k₂. In this case, after the k₁ like objects have been selected from the n objects, we have n − k₁ objects left. We can write ${}_nP_{k_1,k_2}$ as the product of the number of ways k₁ objects can be selected from n objects and the number of ways k₂ objects can be selected from the remaining n − k₁ objects. Thus,

$$ {}_nP_{k_1,k_2} = \binom{n}{k_1}\binom{n-k_1}{k_2} = \frac{n!}{k_1!\,(n-k_1)!}\cdot\frac{(n-k_1)!}{k_2!\,(n-k_1-k_2)!} = \frac{n!}{k_1!\,k_2!} \qquad (1\text{-}28) $$

since k₂ = n − k₁ and 0! = 1. Now consider the next step, in which there are three groups of like objects, n = k₁ + k₂ + k₃. By the same argument as in the first step,

$$ {}_nP_{k_1,k_2,k_3} = \binom{n}{k_1}\binom{n-k_1}{k_2}\binom{n-k_1-k_2}{k_3} = \frac{n!}{k_1!\,k_2!\,k_3!} $$

since (n − k₁ − k₂ − k₃)! = 0! = 1. By generalizing, we see that

$$ {}_nP_{k_1,k_2,\ldots,k_r} = \frac{n!}{\prod_{i=1}^{r}k_i!} \qquad (1\text{-}29) $$

It is perhaps clearer in this derivation than in the former that the order is preserved among all the r groups, each group being a single combination.
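Equation 1-29 is straightforward to evaluate directly. The helper below is an illustrative sketch of my own (the function name is an assumption, not from the text); it reproduces the two counts used in the examples that follow.

```python
from math import factorial

def multinomial(*ks):
    """Multinomial coefficient n!/(k1! k2! ... kr!) with n = sum(ks), Eq. 1-29."""
    n = sum(ks)
    out = factorial(n)
    for k in ks:
        out //= factorial(k)
    return out

print(multinomial(3, 2))              # 10 distinct arrangements of xxxyy
print(multinomial(13, 13, 13, 13))    # about 5.3645e28 ordered deals of four bridge hands
```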
The symbol ${}_nP_{k_1,k_2,\ldots,k_r}$ appears as the coefficient of each term in the algebraic expansion of $(a_1 + a_2 + \cdots + a_r)^n$, and it is for this reason that it is called a multinomial (or polynomial) coefficient.

Consider a few illustrative examples. First, how many different ways can five letters be arranged if three of the letters are x and two are y? The answer is

$$ {}_5P_{3,2} = \frac{5!}{3!\,2!} = 10 $$

and these ten arrangements are xxxyy, xxyxy, xyxxy, yxxxy, xxyyx, xyxyx, yxxyx, xyyxx, yxyxx, and yyxxx. Note again that the group sequential order is important; e.g., xxxyy is different from yyxxx, although no order is preserved within the x's or within the y's.

In an earlier example, we inquired as to how many different single bridge hands can be dealt from a 52-card deck. Now let us ask how many permutations are possible when the four hands are dealt in the normal way. We are concerned with permutations because, in the game of bridge, the sequential order of North, East, South, and West is important. There are four combinations, each of size 13 cards, to be selected from a population of 52 cards. Hence,

$$ {}_{52}P_{13,13,13,13} = \frac{52!}{13!\,13!\,13!\,13!} $$

which is a very large number, viz., (5.3645 ⋯) × 10²⁸.

Multinomial distribution formula. The binomial distribution formula can easily be generalized to the case of more than two possible outcomes, i.e., to the case in which the object population or outcome population is subdivided into more than two groups of like elements. For example, a die has six different sides, a jar may contain balls of more different colors than two, a deck of cards has four different suits, molecules in a gas have many different values of velocity, etc.

Consider a probability situation in which there are r mutually exclusive possible outcomes, viz., A₁, A₂, A₃, ⋯, A_r. Let pᵢ be the probability that outcome Aᵢ occurs at a trial, and let n independent trials be made. The probability that outcome A₁ occurs exactly k₁ times, that outcome A₂ occurs exactly k₂ times, etc., is calculated in a manner identical to that used in deducing the binomial formula. The probability of obtaining a particular sequence of outcomes is $p_1^{k_1}p_2^{k_2}p_3^{k_3}\cdots p_r^{k_r}$; and if we are not interested in the sequential order in which the Aᵢ outcome occurs in the kᵢ times it is observed, but do wish to preserve the sequential order in which the various groups of like outcomes occur, we must multiply by the multinomial coefficient ${}_nP_{k_1,k_2,\ldots,k_r}$ from Eq. 1-29. Thus,

$$ M\bigl[(k_1;\,n,p_1)(k_2;\,n,p_2)\cdots(k_r;\,n,p_r)\bigr] = \frac{n!}{\prod_{i=1}^{r}k_i!}\;p_1^{k_1}p_2^{k_2}\cdots p_r^{k_r} \qquad (1\text{-}30) $$

which may be read as the probability that in n independent trials A₁ occurs exactly k₁ times, A₂ occurs exactly k₂ times, etc., when the respective outcome probabilities are p₁, p₂, etc. Here, kᵢ is any integer from 0 to n, with the condition, of course, that Σᵢ kᵢ = n. Also, of course, Σᵢ pᵢ = 1. The symbol M in Eq. 1-30 stands for multinomial. It can be shown easily that the sum of Eq. 1-30 over all values of kᵢ gives the expression for the multinomial

$$ (p_1 + p_2 + \cdots + p_r)^{n} = 1 \qquad (1\text{-}31) $$

and Eq. 1-30 is known as the multinomial formula. Equation 1-30 may be put in graphical form if the graph has r + 1 dimensions; such a graph for all values of k represents the multinomial distribution of probabilities.

An understanding of the multinomial coefficients and distributions is imperative if the student seeks an understanding of the kinetic theory of gases or, indeed, of any physical theory involving statistical mechanics. Note well that such theories are of increasing importance in all the physical sciences.

We may point out, in the interests of general perspective, that later, in the analysis of errors of experimental measurements, we shall conceive of some probability distribution as being the subdivided population of "objects" from which a sample, i.e., a single measurement or a limited number of trial measurements, is taken. Each measurement or trial set is a sample, with replacement, from a rather specially subdivided population of the same sort as that described in our considerations of the multinomial coefficients, and the probability per outcome, e.g., the probability of a particular measurement, is given by the population distribution probability. This distribution, for which n is very large, infinite in some instances, may remain unknown or it may be assumed to be known; it is also called the "parent" distribution. Commonly assumed parent distributions in experimental science are the normal (Gauss) distribution and the Poisson distribution, both of which may be considered as special cases of the binomial distribution, as stated earlier in this section. The statistical problem in experimental measurements is generally to infer from a limited number of trial measurements (a) what is the most appropriate parent probability distribution, e.g., normal or Poisson, and (b) what are the quantitative values of its descriptive parameters. Help in the answer to (a) is usually afforded from a priori experience with the particular type of measurement or from statistical analysis of a rather large number of trials; obtaining the answer to (b) is often solely an a posteriori problem. The features of the measurement problem should become clear as the reader progresses in this book.
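Before leaving the multinomial formula, it may help to see Eq. 1-30 evaluated numerically. The sketch below is mine, with a made-up die example (the function name is an assumption); it also checks that the probabilities over all outcome partitions sum to unity, as Eq. 1-31 requires.

```python
from math import factorial

def multinomial_prob(ks, ps):
    """Eq. 1-30: probability of exactly k_i occurrences of outcome A_i in
    n = sum(ks) independent trials with outcome probabilities p_i."""
    n = sum(ks)
    coeff = factorial(n)
    for k in ks:
        coeff //= factorial(k)
    prob = coeff
    for k, p in zip(ks, ps):
        prob *= p**k
    return prob

# Example: 6 casts of an honest die; probability of two aces, one six, three other faces.
print(multinomial_prob((2, 1, 3), (1/6, 1/6, 4/6)))

# Check Eq. 1-31: probabilities over all (k1, k2, k3) with k1 + k2 + k3 = 6 sum to 1.
total = sum(multinomial_prob((k1, k2, 6 - k1 - k2), (1/6, 1/6, 4/6))
            for k1 in range(7) for k2 in range(7 - k1))
print(round(total, 12))   # 1.0
```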
Sampling from subdivided populations without replacement: lottery problem and bridge hands. A basic condition in the binomial distribution and in the multinomial distribution is that the component probability p be constant for all trials. This condition restricts applications to sampling with replacement. But the use of the binomial coefficient as giving the number of combinations can be further illustrated by, or extended to, a common type of problem in which sampling is done without replacement from a subdivided population. This type of problem is one in which we ask for the probability that a random sample of size j contains exactly i elements of a specified type k₁ from a population of n elements subdivided into n = k₁ + k₂ + ⋯ + k_r, with r ≤ n, when the sampling is done without regard to sequential order among the i elements.

Suppose, first, that there are only two different subdivisions in n, viz., n = k₁ + k₂. To make the problem concrete, let k₁ be the number of winning tickets in a lottery in which n tickets have been sold, j the number of tickets we have bought, and i the number of winning tickets that we hold. Then the desired probability is given by*

$$ p(\text{exactly } i) = \frac{\dbinom{k_1}{i}\dbinom{k_2}{\,j-i\,}}{\dbinom{n}{j}} \qquad (1\text{-}32) $$

Here, $\binom{k_1}{i}$ is the number of ways in which our winning tickets can be distributed among all the outstanding winning tickets, and $\binom{k_2}{j-i}$ is the number of ways our losing tickets can be distributed among all the outstanding losing tickets. Since any single combination of winning tickets can occur with any single combination of losing tickets, the product of the two numbers gives the total number of ways that our tickets, both winning and losing, can be arrived at from the total number of outstanding tickets. The denominator of Eq. 1-32 is simply the number of combinations of n tickets taken j at a time.

To make this example numerical, suppose that 400 tickets are sold, that we bought ten of them, and that there are four prizes. Then the probability that we will win exactly one prize is

$$ p(\text{exactly } 1) = \frac{\dbinom{4}{1}\dbinom{396}{9}}{\dbinom{400}{10}} \approx 0.0934 $$

Incidentally, it should be obvious that the probability that we will win one or more prizes is given by

$$ p(1\text{ or more}) = 1 - \frac{\dbinom{396}{10}}{\dbinom{400}{10}} \approx 0.0967 $$

since the events of winning exactly one prize, exactly two prizes, etc., are mutually exclusive. Of course, had we bought all 400 tickets, the probability of winning all four prizes would be given by

$$ p(\text{all }4) = \frac{\dbinom{4}{4}\dbinom{396}{396}}{\dbinom{400}{400}} = 1 $$

As another easy example with n = k₁ + k₂, consider the combination problem given earlier about the number of all-red hands possible in a single hand of bridge. This problem can be discussed in terms of n = 52 cards divided into k₁ = 26 red and k₂ = 26 black cards. An all-red hand corresponds to i = 13 and j = 13. The answer to the problem can be written as

$$ p(13\text{ red},\,0\text{ black}) = \frac{\dbinom{26}{13}\dbinom{26}{0}}{\dbinom{52}{13}} = \frac{\dbinom{26}{13}}{\dbinom{52}{13}} $$

Or, the probability that a single bridge hand will contain one ace is seen to be

$$ p(1\text{ ace},\,12\text{ nonace}) = \frac{\dbinom{4}{1}\dbinom{48}{12}}{\dbinom{52}{13}} $$

Next, consider an example in which n is subdivided into four groups of like elements. Let us ask, What is the probability that a bridge hand of 13 cards consists of s spades, h hearts, d diamonds, and c clubs? The total number of ways h hearts can be found in the hand is $\binom{13}{h}$, and similarly for the other suits. So the desired probability is

$$ p(\text{specified number in each specified suit}) = \frac{\dbinom{13}{s}\dbinom{13}{h}\dbinom{13}{d}\dbinom{13}{c}}{\dbinom{52}{13}} $$

Numerically, if s = 5, h = 4, d = 3, and c = 1, the answer is 0.00538 ⋯. If the problem did not specify the particular suit distribution, the probability that a hand consists of, say, five of one suit, four of another, three of a third, and one of the fourth suit is 4! (= 24) times greater, viz., 0.1293 ⋯.

* Equation 1-32 is also known as the hypergeometric probability formula; a graph of p(i) vs. i is called the hypergeometric distribution of probabilities.
groups and
example, consider a case in which n
is
subdivided into four
which the particular sequential order of the four groups is important. We may ask, What is the probability that each of the four bridge hands contains one ace? First, how many ways can four aces be arranged into four ordered groups each of size 1 ? The answer to this ace question
in
is
Second,
how many ways can
the four ordered hands?
4'
_ |- 441
—
p
48*1,1,1,1-
-
m!1!1
the remaining 48 cards be distributed
The answer
to this question
among
is
48! 48*12,12,12,12
12!12!12!12!
Then, since each permutation of the aces and each permutation of the
nonaces are mutually exclusive, we must add up all the separate ways in which they can occur together, i.e., we must multiply the two respective
numbers of permutations
Hence,
to obtain the required probability.
this probability is
=
4!48!/(12!) 52!/(13!)
4
=
24-48!-(13) 4
4
^
Q
m
52!
This, except for problems in Section 1-9,
is
as far in this
book
as
we
shall
pursue the arguments in ideal games of chance. These games have served well in helping us to get acquainted not only with classical or a priori
probability but with the basic principles of probability combinations
and
of combinatorial analysis. The principles are, of course, just as applicable
when 1-8.
the experimental or a posteriori definition of probability
Classical Probability
and Progress
in
is
used.
Experimental
Science
As implied above, upon
the progress of any experimental science
the constant repetition of three steps:
or a conception of the behavior of nature as best calculation of the a priori probabilities
is
based
(1) the invention of a model
we understand
it, (2) a such a of conception, on the basis
and, then, (3) a comparison of the a priori probabilities with actual
measurements,
i.e,
with the experimental or a posteriori probabilities.
42
and Experimental Errors
Probability
in
Science
Confidence in the a priori conception increases to the degree that the comparison is favorable; and the conception is modified to the degree that the comparison is unfavorable. Then, more a priori calculations are made with the new model, more measurements taken, etc. Measurements are always the final arbiters, and in this sense the experimental or a posteriori meaning of probability is the more significant one. But note well that both types of probability are essential in scientific progress. In fact, we have already seen in our discussions of conditional probability and of inferred knowledge some easily recognized examples of the modification of conceptions of nature (hypotheses or theories),
modification of our degree of confidence in them as actual observations
becomes
These are good,
available.
examples of the elements of progress
arguments
science,
in
us
let
reference to statistical mechanics.
albeit simple,
in science.
Applications in statistical mechanics. ability
and of the
new experience of
To
illustrate further the
prob-
amplify very briefly the earlier
The following
illustrations
of binomial
and multinomial probability are extremely cryptic; if the reader has had no previous introduction to the subject of statistical mechanics he may be well advised to skip directly from here to Section 1-9. In the subject of statistical mechanics we are generally interested in specifying the simultaneous values of six properties of each of a large
number of
interacting particles, viz., the three mutually perpendicular
components of position and of momentum. These properties are commonly expressed in a six-dimensional "phase space."
To
express the frequency
any given instant of time for N particles, we need a seven-dimensional graph, the seventh dimension being the number of particles (and if time is also a variable we need eight dimensions). (As a special case, in considering only the velocity distribu-
distribution of these six properties at
tion of molecules of a
components of
monatomic gas we are interested in onlv the three and then we deal with the conventional three-
velocity,
We imagine that phase
dimensional space and a four-dimensional graph.) space the
is
subdivided into numerous regions called
same shape and small numerical
frequency distribution of the being assigned to the distribution
may
be
rth cell.
made
is
W= where some
known
particles
cells,
The problem
among
all
If there are r cells, the
each is
cell
having
to specify the
the cells, k
t
particles
number of ways
this
given by
s Pkl *
2
,..* r
=-^—
(1-33)
W
is may have no particles and some more than one. "thermodynamic probability" of the system if the system
cells
as the
N
size.
Classical (a priori) Probability
can be defined particles
because
it is
manner, and the most probable distribution of which W^is a maximum. (W is not a true probability
in this
that for
is
greater than unity, but
incidentally, the logarithm of
Classical
43
The
statistics.
W
is
it is
proportional to probability; and,
proportional to the entropy.)
so-called
Maxwell-Boltzmann
based on the hypothesis that both the position and particle
can be simultaneously exactly specified.
This
is
is
of a
a reasonable
hypothesis in view of our experience in macroscopic mechanics.
initial
Each point
a
in
of position and is
statistics
momentum
cell in
phase space corresponds to a possible specification
momentum of any one particle. A second initial hypothesis
that every cell
is
equally probable for every particle, subject to the
boundary conditions imposed by the conservation of energy and of the total number of particles. In this case, placing dW\{dt) = 0, where t refers to time, and imposing the boundary conditions, we find (although we shall not prove it now) the Maxwell-Boltzmann distribution k = (NlZ)e~^Wi where Z and /? are constants of the system and w is the energy of a particle ,
t
{
in the /th cell.
Quantum
bosons and fermions.
statistics:
accumulates, the Heisenberg uncertainty principle
and momentum of a given
Then, tells
as
experience
us that the position
cannot be exactly specified at any given volume of a cell in phase space cannot be arbitrarily small. The smallest volume must now be taken as h 3 where h is Planck's constant. The cell of interest is much larger than h3 so we particle
time, and, as a consequence, the
,
,
make
are obliged to
a finer subdivision of each
cell;
now we
reckon n
compartments, each of volume h 3 in each cell. This greatly increases the magnitude of but does not alter the relative probabilities. ,
W
Another new feature of quantum
statistics is that
of the indistinguish-
of identical particles. In classical statistics, if two identical particles exchange cells, we count the new arrangement as a different permutation ability
but in quantum or
cells,
statistics, if
we count
the
reduces the magnitude of
A
two
exchange compartments same permutation. This both
identical particles
new arrangement
as the
W and alters the relative probabilities.
new development
in quantum statistics, a result of the further accumulation of experience and the modification of former hypotheses, is the new property of each type of particle called its spin. Spins are quantized angular momenta and the quantum numbers are of only two
third
kinds
:
integer values (0,
1
,
2,
•
•
•)
and
half-integer values (|,
f
,
•
•
•)•
Particles having integer spin are called bosons (photons, neutral atoms, a-particles, 7r-mesons, etc.);
particles
having half-integer spin are called
fermions (electrons, protons, neutrons, neutrinos, //-mesons, is
no
limit to the
number of bosons
that can
etc.) There occupy a given compartment
Probability and Experimental Errors in Science
44 in
phase space; but the number of fermions per compartment
limited by the Pauli exclusion principle, this limit being is
\,
the magnitude of the spin
quantum number. The
sq the occupancy by electrons
is
limited to
U+
is
severely
1,
where J
spin of an electron
is
(Incidentally, the Pauli
2.
exclusion principle also governs the arrangements of electrons in atoms
where the other quantum numbers enter into the picture; the coordinates of the h 3 volume in phase space correspond to the other quantum numbers.) In the case of bosons, since there are n compartments in the z'th cell, there — 1)! ways of distributing the A^ particles. But of these are n{n + { ways, many represent indistinguishable ways in which particles are merely interchanged between different compartments and between different cells.
N
The
number of
net
distinguishable distributions
w=
n(n
+
-
N,
1)!
In
=
nlNtl
and of the system as a whole,
\
all cells
l\
t
I
t
+ N;-
n
i
is
+N N
considered,
w=Uw = u( This
is
1
(1-35) )
the basic formula for the Bose-Einstein statistics.
In the case of fermions of spin
|,
=
the
maximum number of available
sites
3
where v is the volume of the cell. In the z'th cell, which contains In sites, k of the sites are occupied and 2/z — k are empty. The thermodynamic probability for the z'th cell is given by the number of distinguishable ways In sites can be divided into in
each
cell in
phase space
is 2/<
2r///
,
t
t
two groups,
viz.,
occupied and empty, and
this is
W,
:
=
I
,
,
much
simpler
J
than
in the
boson
case.
The general expression
for the system
is
2
W=f[( f] This
is
(1-36)
the basic formula for the Fermi-Dirac statistics.
In each of these kinds of statistics, the frequency or probability distribution
is
obtained by maximizing
conditions as stated above.
W
consistent with the
boundary
The respective distributions, although not
derived here, are k,
:
=
—— Be pu
>
and k
i
=
—— Be
where B and
ft
'
+ —
for fermions
(1-37)
for bosons
(1-38)
1
1
are constants of the system.
These expressions are to be
45
Classical (a priori) Probability
compared with that given earlier for the Maxwell-Boltzmann statistics; the classical Maxwell-Boltzmann expression is the special case of the boson
<
expression for which (kjn)
1.
mechanics provides an excellent example of the progression of hypotheses or theories from the desperation-in-ignorance type of guess to rather complex conceptions that are more consistent with nature's behavior Statistical
as our experimental knowledge of that behavior accumulates.
claim with impunity that our knowledge
is
No
at all yet complete.
one can Inferred
knowledge of phase space and of the occupancy probabilities is still way to new information and to a better theory
growing, each theory giving as the science develops.
Further pursuit of this subject, and indeed a proper understanding of the features lightly touched
upon
Problems
1-9.
Note the "instructions" preceding 1.
11(b)
beyond the scope of this book.*
here, are
Solve Problems 1(a) and
and
(c), 12(a), (b),
and
3(a)
(c),
(c),
the problems in Section 1-6.
13(a)
and and
and (c), 6(a), and 14 of Section
(b), 5(b)
(b),
(b),
1-6
and
(c),
by using
equations of combinatorial analysis. 2.
What
is
the probability that
among
9
random
digits the digit 7
appears
(a) exactly 3 times,
(b) 3 times or less, (c)
A
3.
and
more than once? horseshoe contains 8
(a) In
(b) If
how many
nails.
different orders
may
they be driven?
shoe were to be attached in a different way to each horse, and
1
horse were shoed in
own
its
6-ft stall,
how many
miles of
stalls
if
each
would be
required ?
How
4.
long must a series of random digits be so that there
is
a probability of
0.9 that the digit 7 appears (a) at least once,
and
(b) at least twice?
How many
5.
(a)
<5
(b)
>0.5 for
for
Make
6.
no
dice ace,
must be and
at least
1
cast together in order for the probability to be (ans. 7)
pair of aces?
the histogram,
(ans. 9)
and indicate the position of the expectation
value, for
the binomial distributions
B(k;
B(k;l,h)-
*
F.
and
(a)
(b)
6, £),
For an elementary treatment of these features of
W.
(Addison-Wesley Publishing Co.,
New
statistical
of Gases, and York, 1955), 2nd ed.
Sears, Thermodynamics, The Kinetic Theory
mechanics, see, Statistical
e.g.,
Mechanics
Probability and Experimental Errors in Science
46 7.
What
the probability that in 7 tosses of a coin the
is
odd
tosses will
show
heads and the even tosses tails? 8.
in
If
birthdays are
random among n people
same birthday? An answer 9.
The
in
10.
11.
The
of the
letters
is
"peep"?
A
set
What
in order.
word "pepper"
What
The cards
is
the probability that at least
now
What
10 years old.
of them will
(a) exactly 3
them none of them will
live to
(b) at least 3 of
will live to
(c)
live to
13.
is
this
living at the
age of
Each of
5
the probability that
be 21, be 21, and
be 21
?
How would you plot the frequency distribution of a trinomial distribution ?
14. (a)
Show
random
digits (this
if
(n
that
if
+
\)p
is
a decanomial distribution).
the binomial probability /(£)
/12
-k\
=
B{k;
12, \),
then
1
(XT!
) 2
/<*>•
the general expression for the binomial probability/^
is
is
equal to an integer.
f (k + » = What
is
that, in a binomial distribution, the most probable value k
Show
double-valued
(c)
of the 2
an a posteriori rather than an a priori probability problem?
Discuss the distribution of 800
(b)
is
1
the probability that neither of the
is
According to a table of mortality, of 100,000 persons is
are then
the probability that the
is
10 years, 91,900 on the average are living at the age of 21 years.
Why
are then
of dominoes runs from double blank to double N.
1
(b) If 2 dominoes are drawn, what numbers on either of them is /V?
children
n
the probability that the
is
are written on cards.
in order.
domino is drawn, what numbers on it is N? (a) If
12.
is
"oral"?
is
thoroughly shuffled, and 4 drawn result
large
The cards
of the word "tailor" are written on cards.
letters
how
0.5 that 2
;
thoroughly shuffled, and 4 drawn result
at a certain party,
and only 2 of the people have the the form of an equation suffices.
order that the probability be
+
1)
for
any n and for any p ? 15.
What
are the probabilities
(a) that (i)
(b) that
(i)
1
1
ace and
(ii)
at least
double ace and
(ii)
1
ace will appear in a cast of 4 dice, and
at least
1
double ace
will
appear
in
24 casts of
2 dice? [That the (a) probability in each case
Mere's paradox.
De Mere
argued
is
known as de number of number of possible
greater than the (b)
that, since
4
is
is
to 6 (the total
possible events in a cast of 1 die) as 24 is to 36 (the total events in a cast with 2 dice), the respective probabilities should be equal.] 16.
What
is
the probability that in a sample of 10
random
equal? Compare the answer with that obtained by using
digits
no 2 are
Stirling's formula.
47
Classical (a priori) Probability that the ratio of male to female children
Assuming
17.
find the probability
is 1,
that in a family of 6 children
be of the same sex,
(a) all children will
4 oldest children
(b) the
will
(ans.
be boys and the 2 youngest
will
be
3i>)
and
girls,
(ans. 6\)
the children will be boys.
(c) exactly half
(ans. re)
A
box contains 90 good and 10 defective screws. If 10 screws are selected as a sample from the box, what is the probability that none in the sample is 18.
defective
if
(ans. 0.330 ••
sampling is done without replacement, and (b) with replacement? (a)
How many
19. (a)
different
(ans. 0.348
outcomes (permutations) are possible
k
=
12,
appearing twice
)
(ans. 6*0
what
(i.e.,
the probability for the event of every face
is
number
(ans. 0.003438
2 aces, 2 deuces, 2 treys, etc.)?
How many
20. In a lottery of 10,000 tickets there are 100 prizes.
must a person buy so that the probability of his winning 50%? An approximate answer suffices.
at least
1
•
tickets
prize will
(ans: 69)
exceed 21.
•
••
in the cast
of k dice together ? (b) If
•
A certain professor always carries 2 match boxes, each initially containing
25 matches. Every time he wants a match he selects a box at random. Inevitably a moment occurs when, for the first time, he finds a box empty. Then, what is the probability that the other
box contains
r
=
0, 1, 2,
•
•
•
matches?
22. In a laboratory experiment, projectiles (small steel balls) are shot (at
random
The screen
times) through a screen (the spokes of a rotating wheel).
(spokes) deflects or scatters
some and allows
others to pass through undeflected.
Suppose 8 projectiles are shot. Suppose that the probability of each passing through undeflected is 0.8. Compute and plot the probability distribution for traversals without deflection. If the experiment were to be repeated many times, what proportion of the trials would yield results within ±2 of the mean value? This
is
a typical "scattering cross-section" experiment in which, usually, the is determined from the observed numbers of undeflected
basic event probability/?
When
projectiles. it is
the experiment
is
performed for the purpose of determining/?,
a typical a posteriori probability experiment.
23.
Among
(a) If
N
two
TV different keys,
probability that the lock will be (b)
What
will
open a certain
100 and half of the keys are selected at
is
is
lock.
random
to try,
opened?
what
is
the
(ans. 0.752)
the limiting value of this probability as
N increases
indefinitely? (ans. |)
(c) If TV is 100,
how many
keys should be selected to try in order that there
should be just more than an even chance of opening the lock? 24. (a) If the
on any
odds are k to
particular day,
show
1
(ans. 35)
against a machinist's meeting with an accident
that the
odds are
(1
+
\jk) n
—
1
to
1
against
escaping injury for n days. (b) If
k
=
1000,
escaping injury?
what
is
the greatest value of n giving favorable odds for (ans. 693)
48
Probability
The A gene
25.
Aa have
A
the
is
and Experimental Errors
dominant, the a recessive;
characteristic,
i.e.,
and of type aa the a
Science
in
A A and Assume {, i,
organisms of types
characteristic.
and \ to be the a priori probabilities for the gene types AA, Aa, and aa (Assume Aa = aA). (a) If both parents are A, what is the probability that an offspring will be a?
respectively.
(ans. £)
(b) If all
4 grandparents and both parents are A, what
the second generation will be 26. In testing
ESP
is
the probability that
A?
(ans. if)
(extrasensory perception), an experiment
is
conducted
with 4 red and 4 black cards. The cards are thoroughly shuffled and placed face
down on
the table.
black cards, but he
The person A to be tested is told that there are 4 red and 4 knows nothing as to their arrangement. Person B draws a
card and, without either looking at If
A
answers "red,"
B places
it
it
on one
himself or showing
it
side of the table
A
on the other side. This process drawn. Let us assume A has no ESP. places
(a)
it
What
is
is
;
if
repeated until
the probability that there will be just
1
to A, asks
its
color.
all
cards have been
black card in the "red"
pile?
A)
(ans.
(b) If the first card to
appear
is
black but
is
called red,
what
is
the probability
that there will be exactly 3 red cards in the "red" pile at the
experiment? the
(c) If
end of the 4
(ans. first
card
is
called correctly,
what
is
B
answers "black,"
3 5)
the probability of having
exactly 3 correct cards in each pile?
(ans. §§)
27. In the game of craps, the person casting 2 dice wins if he gets a 7 or an II on the first cast or, alternatively, if the first sum is a 4, 5, 6, 8, 9, or 10 and the same sum reappears before a 7 appears in any cast after the first. (a) What is the win probability when the game is defined in this way? (ans. 0.49293
•
•
)
Sometimes the game is defined so that the player does not automatically lose if he casts a 3 on the first throw, and 3 is then added to the winning sums for succesive throws. What is the win probability in this case? (ans. 0.50682 (b)
•
28.
A
poker hand of
5 cards
is
dealt
•
from an ordinary 52-card deck. What
•
is
the probability for each of the following: (ans. 0.422)
(a) a single pair,
(ans. 0.0476)
(b) 2 pairs, (c) 3
of a kind,
(ans. 0.0211)
(d) straight (5-card sequence, ace permitted at either end, including a flush), (ans. 0.00394) (e) flush (5 (f) full
(g)
cards in a single
suit,
including a straight),
4 of a kind,
(ans. 0.00024)
(h) straight flush (including a royal flush), (i)
royal flush, and
(j)
"opener"
(a pair
(ans. 0.00198) (ans. 0.00144)
house,
(ans. 0.0000155) (ans. 0.0000015)
of jacks or better)?
(ans. 0.206)
49
Experimental (a posteriori) Probability
29. Consider the "pros" and "cons" of the following system of betting: Suppose in successive games, in each of which the odds are 50-50, you bet SI. At any time that you win, you pocket the winnings and start betting again at SI At any time that you lose, you bet double the amount on the next game. No matter how long the series of consecutive losses, when you win you are $1 ahead as though the losses had not occurred. (a) If you were the owner of a gambling house, under what conditions would you allow a client to use this system? (b) How would you alter the system if the odds were known to be 75-25?
(In considering both parts of this problem, ignore the usual bias
the house in
C.
its
own
EXPERIMENTAL 1-1 0.
imposed by
interest.)
(A POSTERIORI) PROBABILITY
Definition of Experimental Probability
Suppose that for some reason we wish to check the
classical (a priori)
idea that the probability for observing a head with a tossed coin
The obvious
thing to do
keep a record of the
results.
We
moderately large number n obs of independent ratio
u' ot)S /tf obs is,
|.
We
trials.
say that the
for this value of « obs the best experimental value of the ,
probability for heads in any single toss,
e.g.,
in this value increases as n oX)S is increased.
experimental probability
is
fluctuate rather erratically
probability steadies
down
the next toss.
Indeed,
if
Our confidence
the value of this
plotted as a function of « obs
when n ohs
is
small, but, as
/?
,
it
is
Fig. 1-3.
By
the
definition, the experi-
mental probability (sometimes called the frequency probability)
becomes
seen to
obs increases,
to an apparently constant equilibrium value.
A typical graph of this sort is shown in this ratio as « obs
is
number of times and to observe heads u obs times after some
to toss the coin a large
is
is
simply
indefinitely large, viz.,
pobs
=
limit
^
(1-39)
nobs— 00 Hobs the outcome of each trial (toss) is (a) independent of all preceding trials, and (b) determined entirely by chance. There are four difficulties with this definition. First, how can we be sure that all the trials are independent? The practical problem here is that the coin may wear out asymmetrically or that the person (or device) tossing the coin gradually but inadvertently acquires a "system" which favors a particular outcome. It should be noted here that we do not require the absence of a "system," but merely that if it is present it must
if
Probability and Experimental Errors in Science
50 Heads
Tails
400
Fig. 1-3. Experimental probability (frequency ratio for "heads") steadies
apparently equilibrium constant value as n
bs increases.
down
to
an
(Note the logarithmic abscissa
scale.)
Second,
remain constant. trial
is
how can we
be sure that the outcome of each
determined entirely by chance?
related to the
The
practical
one for the independence of successive
problem here
trials.
is
Third, the
limit n obs -*> oo is obviously impractical. In this regard, we substitute a conceptual extension of the experiment after « obs has become "satis-
However, the value of p obs for any large but finite n obs it as a consequence of the fact that n obs not strictly converge mathematically no ratio does finite. Fourth, the is matter how large « obs becomes. This is because, after any specified n obs
factorily" large.
contains some small uncertainty in
,
there
The
is
a
finite
chance that a long run of heads (or of
tails) will
occur.
experimentalist points out that as n obs increases, such a run must be of
increasing length to have a given effect in the value of p ob9
,
and
that after
verv ' ar g e trie probability for having a significant effect of this sort >*obs This has been proved mathematically in is so small as to be negligible. terms of the so-called strong law of large numbers. It is important, i
s
nevertheless, that n obs be very large indeed if p obs is to be expressed with very high precision. Later we shall show that the standard deviation,
a measure of the
statistical uncertainty, in the
proportional to the square root of n oba
.
measure of p obs
is
inversely
Experimental (a posteriori) Probability
SI
Even with these difficulties, the experimental definition is the one that must be invoked to "prove" that the coin is "satisfactorily honest," i.e., that the a priori probability is reasonably valid, or sometimes even to prove that a very complex combinatorial analysis is indeed correct.
Number of "equally probable outcomes" meaningless. Outside the realm of ideal games numerous probability situations exist in which the number of equally probable outcomes is entirely meaningFor these situations the classical probability, Eq. 1-1, cannot be Examples are legion: A marksman shoots at a target; evaluated. what is the probability of a hit? What is the probability that a particular person of given age will die within one year? What is the probability that a given house will be ravaged by fire within a specified time? If a baby is to be born, what is the probability that it will be a boy? What is the probability that John Doe, a candidate for public office, will be elected? What is the probability that the next measurement of cosmic-ray intensity will differ by a given per cent from the immediately preceding measurement? What is the probability that two different measurements of the velocity question of the
less.
of light agree within the experimental errors? In such probability situations
we
are at a complete loss in trying to apply the classical definition for
the probability.
Rather than rely on "armchair" reasoning, or make a
basic desperation-in-ignorance guess,
we may experiment, make
actual
measurements, and use the experimental definition of probability. I-II.
Example: Quality Control
Determining the experimental probability of a specified outcome generally involves rather intricate statistical reasoning in order to achieve
satisfactory numerical value with a
minimum of
the heads probability in tossing a coin
is
effort.
very simple.
a
The example of
To
illustrate the
problem discussed problem, a random sample of limited
typical complexity, let us consider the lottery type of in the last part
sizey
was
of Section
selected
from a
1-7.
In this
large population n subdivided into n
We
=
kx
+
k2
how many elements of the kind k x may we expect to have in the sample j. Suppose now that we alter this problem as follows. The numerical value of n is known but the division of n between k 1 and k 2 is not known, and we wish, from
with
all
numerical values known.
inquired then as to
an observation of the number of k x elements in j, to determine the ratio kjn. This is a typical problem in what is called "quality control." It is instructive to consider this type of problem a little further because it illustrates one essential feature of the measurement problem in an experi/'
mental science.
A factory turns out a very large number n of supposedly identical items,
Probability and Experimental Errors in Science
52 but some
unknown
fraction are defective,
whether or not
infer
can be discussed
in
this fraction
and we wish, by sampling,
to
exceeds a specified value. The problem
terms of the equation for the probability for having
i
defectives in sample j, viz.,
" (M ( '~ M //l; /; n
KexactlyO='
(;)
As
a
first
approximation,
this expression for /^(exactly
"equal" to the observed ratio
i/j,
i)
may
be placed
and a value of the defective fraction kjn
deduced therefrom. The difficulty is that a different value of kjn is obtained from each different sample ratio i/j. Of course, the reliability of the deduced value of kjn increases as the sample size increases or as the
number of independent samplings increases to provide a more reliable mean value of i/j. The problem really is to determine, for preassigned reliability in k x jn, the optimum sample size and number of samples commensurate with a minimum of effort in examining the samples. There are various statistical arguments in treating quality control problems of this sort, and discussion of them is beyond the scope of this
But one approach to this problem, in case n is very much larger mentioned now because it ties together some of the concepts discussed earlier in this chapter. In this case, the problem can be approxibook.
than
y, is
mated by one of sampling with replacement. Then, the binomial equation can be used, Eq.
1-20, viz., ]-i
\\)l
and the problem becomes one of determining the parameter p (= kjri) from the observed ratio i/j. Suppose that a guess as to the true value of p puts it in the range 2 to 4%, and that it suffices to know/? to one significant figure.
hypotheses,
One procedure then is to make five mutually exclusive 2, 3, 4, or 5 % and to guess initially (in desperation) p = all equally likely, i.e., probability \. The binomial probability
viz.,
that they are
1
,
j, p) may be calculated for each value of p, and comparison with the outcomes of successive independent samples serves to
distributions B(i;
increase the probability that one of the hypotheses
is
to be favored over
the others.
1-12.
Example: Direct Measurements
Now
let
in
Science
us extend the quality control problem so as to
to a typical
measurement problem.
make
it
similar
This illustrates a most significant
Experimental (a posteriori) Probability
S3
application of the multinomial probability distribution in the science of
measurements. As a
first
step in this extension, suppose that the definition
of "defective" involves an upper and a lower limit of tolerance in the pertinent aspect of quality, e.g., in a linear dimension such as the diameter of ball bearings. With n
=
+
kx
k2
+
The problem,
k3
if
n
this extension, n r
i.e.,
,
=
3,
subdivided into three categories,
is
with category k 2 being "nondefective."
very large compared toy, becomes one of multinomial
is
p x p 2 pz unknown. In this optimum sample size and number of the compromise between reliability and
probabilities with the respective probabilities
determination of the
the
case,
samples, with consideration for
,
,
even more complicated than in the case in which n was divided
effort, is
two
into only
categories,
and we
shall
not attempt
it
further here.
Next, suppose that the n elements are subdivided into a
much
larger
number of different categories. Suppose that these categories are ordered in terms of some numerical characteristic of the elements, perhaps the diameter of the ball bearings. Our objective is to infer the average or arithmetic mean value of the entire population of n possible values from a sample of
In the determination of a length
size j.
(e.g.,
the balls as measured with a pair of calipers or with instrument),
we
take a
number j of independent
a sample of size/, from an essentially
From i
z
+
•
'
,
h
-,
i
r
'
K
=j
and
the arithmetic mean.
+
^1
+h+
^2
of k x
,
i
2
of k2 ,
large that adjacent numerical
limit
is
'
'
kr
<
n
mean
value, a valid inference r in
n
is
in this case so
we can read some other instrument indeed, if
k values are as
the vernier caliper scale or the scale of
no upper or lower
'
We infer that the length being measured
reasonably large. The number of subdivisions
infinite
ij
trials.
,
has a value reasonably close to the measured
be
i.e.,
population of possible
of k r where
+h+ "+
we calculate if / is
fine
measurements,
the variety of values observed in this sample, viz.,
of k3
h
infinite
trial
the diameter of
some other
closely spaced as
;
imposed on the
size
of the ball bearings,
r
may
even though the effective least count in the measurement scale
is
finite.
The
quality control problem in simple
form
is
also seen to be identical
to the problem of determining the experimental probability of heads in a
coin toss from a sample size « obs (=/) taken from an infinite population n. In this case, the number of subdivisions r of n is only two, viz., heads and
The problem in slightly extended form is also similar to the one of determining the experimental probability for a deuce in the cast of a sixtails.
= co and r = 6. However, as implied above, in a typical measurement problem, n is infinite but the number of subdivisions,
sided die, n scientific
although
infinite in principle,
may
be limited in practice by the effective
Probability and Experimental Errors in Science
54
count of the measurement scale and by the largest and smallest possible measurements.
least
As mentioned it
turns out
direct
in
earlier in this chapter
an experimental science
and again that, in
measurements, the very large population n
real-life
in
many
aspect of the
Chapters 4 and
5,
of the problems of
(infinite)
can be taken as
subdivided according to either the normal (Gauss) distribution or the
Poisson distribution.
This knowledge makes for great simplification
in
determining the experimental probability with reasonable precision by
invoking one of the mathematical models that
is
based on axiomatic or
classical probability concepts.
But before we discuss these features let us introduce in the next chapter some basic notions about measurements in general and about elementary statistics.
"We of
are
in
having
the ordinary position of scientists
we
with
piecemeal
make several but we cannot make anything
improvements: clearer,
content
be
to
can
things clear."
FRANK PLUMPTON RAMSEY
2
"Probability
is
a
measure of the importance
of our ignorance."
THORTON
C.
FRY
Direct Measurements:
Simple
Statistics
MEASUREMENTS
A.
2-1.
The Nature
Most people
IN SCIENCE:
ORIENTATION
of a Scientific Fact
are strongly inclined to the idea that a so-called "fact"
immutable, an absolute truth, and that science especially yields such truths. But as we study science and its philosophical implications, we is
is entirely foreign to science. It becomes necessary two kinds of facts: (1) those known by the Omniscient, and (2) those devised by man. Only the former are absolute.* Complete certainty is never the mark of a scientific fact, although it is the business of scientific endeavor to reduce the uncertainty as much as
find that absolute truth
to distinguish
In
possible.
many
instances, the residual uncertainty
and some people may be is
inclined to say that
not valid in principle. Scientific knowledge
it is
is
is
very small indeed,
negligible.
Such neglect
always inferred knowledge,
knowledge based on a limited number of observations. But it is to be emphasized that, as our experience accumulates and our interpretations
i.e.,
*
The
mean
choice.
Wendell Holmes wrote: "When I say a thing is true, I cannot help believing it. I am stating an experience as to which there is no
late Justice Oliver
that
I
But ...
inabilities
I
do not venture
of the universe.
and leave absolute truth Doubts,"
Illinois
Law
I
to
assume that
my
inabilities in the
way of thought are
therefore define the truth as the system of
for those
who
are better equipped."
Rev., 10 (1915).]
55
my
limitations,
[From "Ideals and
Probability and Experimental Errors in Science
56
become more critical and more and more reliable.*
Some
objective, scientific
knowledge becomes steadily
people have argued that science, more than any other subject,
responsible for the progressive emancipation of men's minds the philosophy of absolute truths.
Whether science
absolutes, toward absolutes, or neither,
or ethics, for
meaning.
it is
only
in
Science, per
concepts of science,
is
directs
is
away from away from
a matter of one's religious faith
such faith or cultural patterns that absolutes have se,
like the
is
necessarily silent
on
this question. f
The
concept of probability and for essentially the
same reason, are "open-ended" concepts. Whether or not these concepts may be the basis of a new type of religion or ethics is also a matter of opinion which we shall not discuss in this book. The feature that distinguishes scientific knowledge is not only that there is
a clear recognition of uncertainty but that the degree of uncertainty can
usually be rather well determined.
This determination
is
carried out by
and probability theory. As we accumulate scientific facts, including knowledge of their intrinsic uncertainty, our philosophical intuition grows, and we are no longer dismayed that scientific the
methods of
facts are if
statistics
"merely" probable; indeed, we
the probable error
by man) because
it
is
known,
is
most
realize that reliable
probable knowledge,
of all knowledge (devised
includes a realistic self-appraisal.
The fundamental
truth or fact in an experimental science, e.g., in physics,
chemistry, biology, engineering,
ment.
the
etc., is
always an observation, a measure-
Prediction, or.a generalized description of the behavior of nature,
an important goal of the science, but the degree of reliability of the is no better than the measurements upon which it is based. Careful analysis of the reliability of measurements therefore is necessarily an early step in achieving scientific maturity. is
prediction or of the distilled principle
Measurements and
2-2. Trial
As an example of
a measurement, consider the direct determination of
the length of a piece of wire.
measurements with a *
During the
last half
Statistics
Suppose that we have made 12 independent count (smallest scale division)
ruler having a least
of the nineteenth century, Newtonian mechanics and classical all the observations that
electromagnetic theory were able to "explain" just about
mechanics and electricity. Then, some measurements were made with some experiments were carried out in a new domain, viz., atomic physics. It was immediately necessary to modify the then-current theories to encompass the new observations. Examples of the need for, and the process of, increasing scientific reliability still continue in every active facet of science, and this
had been made
in
greater precision, and
situation will persist indefinitely so long as that facet of science remains active. t
This fact
is
in
no sense a belittlement of the
social
and
attending absolute truths; see the "Dedication" of this book.
intellectual
problems
Measurements
in Science:
Orientation
Table 2-1. Typical Set of Measurements
Measured Value Trial
1
(mm
units)
57
Probability
58
t
and Experimental Errors
in
Science
Measurements
in Science:
Orientation
The branch of applied mathematics or as nearly ideal
A
that treats
data as possible,
trial
fundamental postulate
is
and
interprets trial data,
called statistics.
in statistics is that the variations in a set
ideal (or near-ideal) trial data are strictly
chance.
59
(The concept of random
is
random,
i.e.,
of
are due entirely to
discussed in the next section.)
It is
assumed, of course, that the property being measured does not change during the measurements. The bulk of the problems in statistics deal with data for which the actual degree of approximation to random ideal trial data re-
and
quires careful thought
test (e.g.,
ages of people at the time of death, sex
distribution in a population, accidents
ravaged by
on the highway, number of houses
votes in an election, nutritive value of milk from cows on
fire,
etc.). But in a set of painstaking measurements in an experimental science, especially in laboratory physical science in which the subjects are inanimate, the random ideal trial nature of the measurements can often be safely assumed. Often, however, it is desirable
unintentionally different diets,
to carry out a test to check specifically for the presence or constancy of
systematic (nonrandom) errors in the measurements. It is well known that, in comparison with the physical scientist who works under the "controlled" conditions of a laboratory, either the biologist or the nonlaboratory physical scientist* must typically put up
with certain extraneous factors that controlled.
With
minimize the in
relative
approximating
ments.
The
diligent design
make
his
experiments
and performance of
less
well
his experiments to
importance of these factors, he also often succeeds
satisfactorily the conditions of
social scientist,
on the other hand,
is
random
trial
measure-
frequently confronted
with such extraneous factors that the majority of his measurements are
perhaps better described as investigations than as
scientific
experiments.
Certain methods in statistics have been developed to give special attention
and with these complications the subject of In this elementary book for the student of experimental science, we shall for the most part pass over the statistical treatments of extraneous factors; when the student needs them, he can to the extraneous factors,
statistics is necessarily intricate.
find these treatments in the literature.
2-3.
Random
Variation
The mathematician
defines
random
(or stochastic) as the adjective
modifying a variable whose value depends on the outcome of a random experiment.
A
random experiment
is
one whose possible outcomes are
all
equally probable, or, better, for which each possible outcome (each point *
Examples of nonlaboratory physical sciences are astronomy, meteorology, cosmic geomagnetism, cosmology, etc.
rays,
60
Probability and Experimental Errors in Science
in
sample space) has a fixed probability. Idealized penny tossing, drawing
a number from a
hat, etc., are often cited as such "experiments."
Also,
random numbers are digits arranged in a random manner. The phrase "random numbers" is short for "randomly generated numbers."* The experiment or the process of generation of random numbers in real life is left to the scientist to devise; and, confronted with
to the mathematician,
this nontrivial task,
even the scientist would prefer to be a mathematician
(or an armchair philosopher).f It is
The
impossible to give a rigorous operational definition of random.
subjective meaning, however,
may
statements
A
set
is
not
The following
difficult to grasp.
be helpful.
of generally nonidentical numbers has one essential feature of
randomness
if,
as the
numbers are
successively revealed, the next
number
in the series has an a priori equal chance of being larger or smaller than the median valued of the already revealed numbers. Another necessary
condition of randomness errors, the first
moment
absence of inconstant systematic
that, in the
is
of the
set is
zero
when taken about
the arithmetic
mean value, i.e., moments is discussed presently.) Identification of randomness in terms of the mean value is not really practical because inconstant systematic errors the random deviations must add up
are never completely absent;
because
it
the
A
(The concept of
attempt has significance, however,
emphasizes consideration of the
sented by the individual numbers
to zero.
sizes
of the deviations repre-
as well as the respective algebraic signs.
single event in a set of generally different events,
whether each event
numerical measurement or some other type of observation,
is
random
is
a
if it
has an a priori constant chance of occurring regardless of the position of this event in the
ordered
depend on the position
set
(although the magnitude of the chance
in the set).
The
adjective "a priori"
is
used
in
may
two of
these statements, and, strictly speaking, the a priori chance cannot be
determined
—
it
can merely be inferred.
The concept of random
an especially interesting one. In science, it It is properly defined in terms is intrinsically an "open-ended" concept: of the chance happening of a future event, always about the unknown. Hence, it does not properly apply in the description of a number or of a set of numbers (or of an event or of a set of events) that has already been is
* A book of "random" numbers has been published, A Million Random Digits, by The Free Press, Glencoe, 111. t And the mathematician said "Let there be random numbers," and lo and behold it came to pass: there were random numbers. Only a mathematician can get away
with X
this.
The median, discussed
later in this chapter,
is
defined as the middle value, or as the
interpolated middle value, of a set of ordered numbers. If the histogram of the is
symmetrical, the median and the arithmetic
mean have
the
same
value.
numbers
Measurements
Nor can any
revealed.
of a
Orientation
61
a posteriori test of the
randomness of a number or
of numbers (or events) be completely satisfactory. The best we can
set
do is
in Science:
from the already revealed numbers whether or not the next-tonumber may be expected to be a random member of the set.* This past-vs.-future aspect of random has philosophical fascination, and some people say (erroneously) that the inherent arbitrariness of any operational definition of random prevents the subject of probability from to infer
be-revealed
being properly a part of a science.
Actually, every operational definition
most cases) arbitrariness or unknowledge is complete, no measure-
in science has a residual (albeit small in
certainty in
ment It
inasmuch as no
it
scientific
exact.
should be mentioned that, as regards the experimental concept of
random, the terms "equal chance" and "constant chance" in the respective statements above have significant meaning only in terms of a very large set of observations. The set must be sufficiently large that the statistical pattern of the variation, including the median or the arithmetic mean value of the already revealed numbers, has taken on an essentially equilibrium value (see the definition of experimental probability, Eq. 1-39). It
apparent that the terms "trial" and "random" are somewhat
is
An
related. refers to
interesting distinction between them is the following: trial an experimental process (although an impractical one since an
actual process never has quite the perfection required in the strict definition of trial
trial),
whereas random
is
a mathematical condition.
implies a striving for a real-life perfection, whereas
a kind of perfection by acclamation;
this is
In a sense,
random
refers to
a characteristic difference
between a science and mathematics.
As was
stated before, simple statistical methods treat sets of trial data which the variations are assumed to be satisfactorily random. And, fortunately, in an experimental science, the assumption of random variations in successive trial measurements is often satisfactory per se. in
2-4.
Probability Theory in Statistics
In treating
random trial data, it is often possible to invoke a mathematical
model of the variations among the it
trials.
model
If this
enables the statistician or the experimenter to
limited
number of
trials,
is
not too complex,
quickly,
i.e.,
with a
very significant computations about such pro-
perties of the set as (a) the best value
and
its reliability, (b)
with which a particular result or measurement *
make
may
the frequency
be expected to occur
Comment on the following story The doctor shook his head as he finished examin"You have a very serious disease," he said. "Nine out of ten people :
ing the patient.
having this disease die of it. But you are lucky because you came to me. had nine patients all of whom died of it."
I
have already
Probability
62
and Experimental Errors
in
Science
when a certain number of trials are made, (c) the number of trials that need be made for a specified precision in the best value, etc. The branch of statistics that applies and/or develops mathematical models for random trial data is called probability theory. The simplest mathematical model is the binomial distribution. This was initially model, whose formula was derived and discussed in Chapter 1
devised for the simple ideal games of chance dice, dealing cards, etc.).
(e.g.,
,
tossing coins, casting
The two mathematical models of outstanding
an experimental science are the normal (or Gauss) distribution and the Poisson distribution, both of which, mentioned briefly in Chapter 1, may be considered as limiting cases of the binomial distribution. importance
It
in
turns out that one or the other of these two models very often satis-
measurements,* and only a rudimentary knowledge of the subject is necessary to enable the experimenter to decide which of these two models is the one of interest. Procedures for testing the degree of fit are discussed later. These models do not involve
factorily "fits" a set of actual direct
advanced mathematics (beyond elementary calculus), and tions are of
and then 2-5.
in
immense help in designing the experiment analyzing and interpreting the results.
their applica-
in the first place
Computed Measurements
The type of measurements discussed measurements.
in
the last section are direct
computed or
Usually, at least in the physical sciences,
derived quantities, also called "measurements," are
more
frequently the
An example of a computed measurement is that of which is obtained from the directly measured quantities of distance and time. Other examples are legion. After the direct measurements have been recorded and the best value and its reliability determined, we apply the appropriate statistical formula for the propagation of errors and determine the reliability of the computed result. The probability model, if it exists, for computed results is generally different from that applicable to the direct measurements, and usually no simple model applies.! Hence, with little or no prospect of finding a satisfactory model for computed measurements, we must be content with the more limited
focus of attention. velocity
No mathematical model distribution conforms any set of experimental measurements. Whether or not the actual degree of "misfit" and the consequences thereof are serious depends upon the care and pains that the measurements justify or that the experimenter is willing to take. *
"Fits" needs further comment.
strictly to
t
Note
that
if
the direct measurements are
made on
a scale that
respect to the errors themselves, then generally no simple in
such a case that a better
fitting
than for the direct measurements.
model may be found
for
is
nonlinear with
model applies; it is possible the computed measurements
63
Basic Definitions: Errors, Significant Figures, Etc. potentialities
of the
statistical precision indices as
discussed in Part
C
of
measurements
in
an
this chapter.
Conclusions
2-6.
With the
special characteristics of
experimental science,
we
most of the
direct
usually assume satisfactory compliance with the
two assumptions:
random
independent measurements carried and for which there is a constant characteristic chance that any particular possible measurement will occur as the next measurement, and (2) a simple mathematical model of the variations. (1)
trial
measurements,
i.e.,
out under identical conditions,
Then, the pertinent principles and
details of the generally
complicated
and probability theory are not very formidable even to the beginner. The reliability of computed measurements, and of any direct measurements that are not satisfactorily fitted by a simple mathematical model, may be obtained by the statistical precision indices without resort to any model. The general objectives in the statistical treatment of measurements are (1) the determination of the best (or sometimes the median or the most probable) value from a limited number of trials, (2) the specification of and
specialized subject of statistics
the reliability of the best value, (3) a statement of the probability that the
measurement would have a particular value, and (4) assistance and performance of the experiment so as to obtain a desired degree of reliability (expressed as an error) with a minimum of effort. next
trial
in the design
B.
BASIC DEFINITIONS: ERRORS, SIGNIFICANT FIGURES, ETC. There are
many
different specific aspects of the error concept.
discuss these aspects
second, in Part
ments.
C
first
of this chapter, as they apply to
sets
of
Errors in the latter case are especially interesting
tion observed
among
We
as they apply to individual measurements
the trials
is fitted
trial
when
shall
and
measurethe varia-
by a simple mathematical model,
but in the remaining two parts of the present chapter individual measure-
ments and sets of trial measurements are discussed without regard to any mathematical model. Discussion of the models and of the probability predictions will be delayed until later chapters.
Some
basic concepts
and
definitions have already
been given or implied
Probability and Experimental Errors
64 in Part
A
ments,
random
of
this chapter, viz., the
nature of a scientific fact,
in
trial
Science
measure-
variations in measurements, histogram, frequency distri-
bution, frequency curve, statistics as a general subject, probability theory, direct
and computed measurements,
etc.
We now explore some additional
terms, principally those dealing with errors, with elementary statistics,
and with precision of
measurements.
Types of Errors
2-7.
It
direct
is
convenient to subdivide the general concept of error into three
broad types, viz., random errors, systematic errors, and blunders. In our present discussion, blunders should be immediately dismissed with the appropriate embarrassment, but a few examples are mentioned briefly below. In general, the term experimental error is some additive function of all three.
Random
(or accidental) error.
concern in the
statistical analysis
of random error are in
(1)
A
Random
errors are of the greatest
of measurements. Four separate meanings
common
use as follows:
deviation or statistical fluctuation
(Eq.
2-5) is the
difference
between a single measured value and the "best" value of a set of measurements whose variation is apparently random. The "best" value is defined
mean of all the actual trial measurements. random it is necessary that the systematic errors
for this purpose as the arithmetic
[For the deviations to be
(mentioned presently) be either absent or not change as the trial set is obtained.] For a symmetrical frequency distribution that is unimodal (has only one
maximum
in
it),
the arithmetic
mean
is
obviously the "best"
value and also the most probable value; for an asymmetrical distribution, the
mean
somewhat arbitrary but is supported by shown later. For an asymmetrical distridepends upon the intended use, and sometimes the
as the "best" value
is
the principle of least squares as
bution, "best" really
median or the most probable value of a deviation the mean (2)
Random
is
is
preferable; but for the determination
conventionally used.
error sometimes refers to the difference between the arith-
mean as determined from a certain number of random trials and the mean that is determined from a larger number of trials. Often the latter mean is the hypothetical or theoretical "true" value that we believe would be obtained from an infinite number of trials it is often called the "parent" metic
;
or "universe" mean.
This error with respect to the hypothetical "true"
value does not have experimental significance but
more advanced
statistics
and
in
is
of great interest
in
any discussion with a mathematical model.
Basic Definitions: Errors, Significant Figures, Etc.
65
20
15
Time (sec) Fig. 2-3.
Record of "zero-reading"
deflections of a very sensitive torsion balance.
These irregular fluctuations are due to elementary errors, perhaps dominated by Brownian motion in this case.
(3) A more difficult concept of random error has it as some one of numerous so-called elementary errors that are merely imagined to exist.
According to the theory, these elementary errors conspire to be observed as a deviation or statistical fluctuation in certain types of measurements.
Examples of the imagined random elementary errors, and measurement process, are discussed in Chapter 4 in connection with the normal (Gauss) mathematical model of probability. In this theoretical interpretation of deviations in real-life measurements, an elementary error may indeed be either a random error or an inconstant systematic error. The term "inconstant" refers either to the magnitude of the elementary error or to its algebraic sign as it adds into the sum to give the deviation or it may refer to a time dependence in those measurements that are made in a time sequence, as most measurements are made. See Fig. 2-3.
their role in the
;
(4) Finally, reliability
mean This
random
error
value, determined
is
indices.
may
refer to a quantitative statement of the
of a single measurement or of a parameter, such as the arithmetic
from a number of random trial measurements. is one of the so-called precision
often called the statistical error and
The most commonly used
reliability
indices, usually in reference to the
of the mean, are the standard deviation, the standard error (also
called the standard deviation in the mean),
and the probable
error.
Precision indices are defined and discussed later.
The student may
well have difficulty in immediately perceiving
some of the
above distinctions, but they should become clearer later. It is important to note that the algebraic sign of a random error is either positive or negative with an equal probability when the error is measured with respect to the median value. In the fourth meaning listed, fine points in the
Probability and Experimental Errors in Science
66 the precision index
always measured with respect to the "best" value
is
of the parameter under consideration, and the best value, as stated above, is
the arithmetic
mean
rather than the
value (the mode). Also note that
asymmetrical
(as, e.g., in
distributions), the
when
median or the most probable
the distribution of measurements
the cases of the binomial
median, the mean, and the most probable values are
In this case, the precision index of reliability
all different.
is
and Poisson model is
properly
(although in practice only occasionally) expressed, not as plus or minus
some symmetrical error magnitude, but rather as plus one error magnitude and minus another error magnitude. Sometimes we distinguish between (a) the random errors that are introduced specifically as a part of the measurement process and (b) the
random phenomena
that are inherent in the statistical nature of the
property being measured.
one but
is
us elaborate
on
is perhaps not a fundamental measurements themselves. Let
This distinction
significant in certain types of this a little.
The former case
typically refers to the role
of the elementary errors as they conspire to produce an observed deviation in a
measurement
in a very fine-grained (perhaps continuous)
sample
space, whereas the latter very often (but not necessarily) refers to a
which the sample space is obviously discrete. Examples in measurements of a length or of a time; examples of the latter are found in so-called counting measurements (e.g., in counting nuclear disintegrations, in measuring with a counter the intensity of cosmic rays, or in the quality control problem of the last chapter in which "defec-
measurement
in
of the former are found
is unambiguous). A count is an integer with no uncertainty in it, but whether or not a count occurs in a specified sample (in a selected batch or in a selected time interval, etc.) is due to chance which is itself due to an unfathomed array of elementary errors. In a counting experiment, we measure the probability that a count will occur in the selected sample
tive"
with
all
the elementary errors
lumped
of answer, and the probability sought as
is
into a discrete "yes" or is
some
"no" type
characteristic of nature just
the length of a piece of wire.
The central feature of all statistical measurements, of whatever type, is that we seek to determine a measure, from a limited sample (number of trials),
of a property of a large (often
mean of
infinite)
population. In whatever type,
measurements increases in statistical fashion as the size of the sample increases. Only in the event that the "sample" is the entire population (often an infinite number of trials) are the random sampling errors reduced to zero. Distinction between the properties of a sample and of the entire ("parent" or "universe") population is basic in the second of the four meanings of random error as set forth the reliability of the
above, and
is
the
the subject of several later discussions.
67
Basic Definitions: Errors, Significant Figures, Etc.
Systematic error.
An
error that always has or tends to have the
same algebraic sign, either an additive or subtractive quantity introduced in the measurement process, is called a systematic error. If the magnitude of this error does not change with time,
appears as a constant error in
it
median and arithmetic mean values. If it changes in magnitude, it introduces some skew (asymmetry) in the observed histogram. If the changes in magnitude occur in some irregular fashion, it is especially a most unpleasant and insidious error contribution. In any case, since systematic errors are not generally amenable to statistical treatment, they impair the reliability of the mean to a degree which can only be estimated and often not very well.* The observed errors in every instance probably include both random and of the measurements and also
all
in the
systematic errors.f
Examples of systematic errors are those caused by: (1) Incorrect (or an unjustifiably assumed) calibration of an instrument, friction or wear in the moving parts of an instrument (as in a "sticky" meter), electrostatic charge on the glass front of a meter, failure to correct for the "zero" reading, etc. (2) Constructional faults in the apparatus, e.g., misaligned parts, thermal electromotive forces from poorly chosen materials, screw errors, etc. (3) Inadequate regard to constancy of experimental conditions and imperfect measurement techniques, e.g., changes in dimensions owing to thermal expansion, one-sided illumination of a scale, nonvertical position of a liquid manometer, alteration of the property being measured as in chemical contamination or spilling part of a liquid sample, etc. (4) Failure to make necessary corrections, e.g., for the effect of atmospheric pressure or of the variation of gravity with elevation or latitude in determinations of mass by weighing, meniscus corrections in a liquid barometer, "stem" correction in a common mercury-glass thermometer, etc. (5) Bias by the observer, e.g., more or less constant parallax, supposed improvement in technique in the midst of a set of measurements, desire for the "right" result, etc.

This list is not intended to be exhaustive, merely illustrative.

* Examples of classical situations involving large systematic errors are: (1) Prior to 1920, atomic weight determinations were, as was later shown, afflicted with unsuspected systematic errors that averaged fully ten times the stated experimental errors, and (2) in 1929 the accepted value of the electronic charge was 4.7700 x 10^-10 esu (note the number of significant figures), and later it was changed to 4.80294 x 10^-10 esu. (Significant figures are discussed in Section 2-8.) Examples of large unsuspected systematic errors are numerous in the scientific literature, far more numerous than is generally suspected. This is the "problem" in the apparent inconsistencies in the fundamental constants; e.g., see Cohen, DuMond, Layton, and Rollett, Rev. Mod. Phys., 27, 363 (1955), and Bearden and Thomsen, Nuovo Cimento, Suppl. (Ser. 10), 5, no. 2, 267 (1957).

† And the pseudo scientist said, "Let there be no systematic errors," and lo and behold it came to pass: there were no systematic errors. Only the pseudo scientist can be so sure.
To justify the application of a mathematical model of variability, as we often attempt to do, we assume randomness in the observed variability, an assumption whose validity may be jeopardized by the presence of systematic errors. It is generally imperative that every effort be made to detect and to eliminate systematic errors.

If a systematic error is not time-dependent, recognition of it may come from greater care or by comparison with the results from other apparatus or from some more ingenious method of measurement. If the error changes with time, a study of the time dependence of the mean value may reveal the error's presence. Tests of correlation and of consistency of means, discussed in Sections 3-3 and 3-7, are helpful.
Elimination of systematic errors often strains the ingenuity, judgment, and patience of the best of experimenters. After exhausting all methods, even though he does not believe them to be entirely absent, he resigns himself to their presence but assumes the variability to be random for the purpose of statistical treatment of the deviations. However, note in this regard that a distinction is made between precision and accuracy.

Precision and accuracy. Precision in a mean value is proportional to the reciprocal of the statistical error and is "high" if the statistical error is small; accuracy is "high" if the net systematic error is small.* Usually, but not necessarily, high accuracy implies a small statistical error as well.
* As stated above, objective numerical determination of the residual systematic errors is not practical. The error for accuracy is usually appreciably greater than the one for precision, but its numerical value is perforce left to the observer's best judgment. An arbitrary procedure, often used, is to estimate the equivalent statistical error caused by estimated component systematic errors. This estimate is usually deliberately very conservative, say about a 1 or a 5% chance that the "true" value lies outside of the limits given by the equivalent error. Some experimenters go further: with an assumed mathematical model of the "histogram" of the component systematic errors, e.g., the normal (or Gauss) distribution, such an estimated error is converted by the formulas of the model to the same "confidence limit" as is used in the statistical error; often this is the probable error. (See next chapter.) The assumption of a mathematical model for these component errors is admittedly highly ad hoc and is not to be much trusted; but there is apparently no better general procedure.
Precision and accuracy are not interchangeable terms.* Statistical methods give specifically a quantitative measure of precision, not of accuracy (however, see the discussion of the consistency of means, next chapter).
Discrepancy. The difference between two measured values, e.g., values reported by two different observers, or the difference between a value reported by an observer and an "accepted" value as listed in a handbook, is called a discrepancy. This difference is not an error, although it implies the need of a statement of error, both statistical and systematic, in each value to provide a basis for interpreting the discrepancy.
Blunders. These are outright mistakes. A measurement known to contain one or more blunders should be corrected or discarded. Blunder errors include the following effects: (1) Misunderstanding what one is doing, incorrect logic. (2) Misreading of an instrument. (3) Errors in transcription of data. (4) Confusion of units. (5) Arithmetical mistake, "slide-rule error." (6) Misplaced decimal point. (7) Listing of an improper number of significant figures, etc.

2-8. Significant Figures and Rounding of Numbers
Significant figures are the digit figures necessary to express a measurement so as to give immediately some idea of the accuracy of the measurement. There is no uniformly accepted rule for deciding the exact number of digits to use. One popular practice† is to drop all digits uncertain by more than 15 units. Accordingly, a measurement of, say, 63.7 cm indicates a "high" probability for a value between 63.55 and 63.85 cm but a "possible" range of 62.2 to 65.2 cm. Another popular practice is to retain the last digit that is uncertain by 10 units or less. In case either procedure is followed, it is recommended to include one more figure but to set it down slightly below the line of the significant figures. If this additional subfigure, with its uncertainty of more than 10 (or 15) units, were to be included as a significant figure, it would erroneously imply in some cases that the preceding figure was uncertain by less than 10 (or 15) units.

* According to legend, it was desired to determine as precisely as possible the height of the emperor of China. It was unthinkable to bother the emperor with a direct measurement. So the investigator, knowing about statistics and being of diligent mind, conducted an extensive poll. He selected at random a thousand, nay, a million Chinamen from all parts of the nation. Each was asked to give his opinion as to the height of the emperor, and was then sworn to secrecy. The average of all the numbers provided a very precise determination. But none of the Chinamen concerned had ever even seen the emperor!

† Recommended by The American Society for Testing Materials, Manual on Presentation of Data (ASTM, Philadelphia, 1937), 2nd printing, p. 44.

"Uncertain" in a single measurement refers to the observer's best guess as to the "sum" of the random and systematic errors. Sometimes this guess is taken simply as the effective measuremental least count, i.e., either as the smallest division on the scale of the measurement or as the observer's estimate of a meaningful interpolation. In the mean measurement, the precision part of the uncertainty is set by, say, the standard deviation or by the standard error (i.e., the standard deviation in the mean). Significant figures in the mean and in some precision indices are illustrated in Table 2-2, which is introduced and discussed in Section 2-11, and also in some of the problems and answers in Section 2-12.*

As a guide in determining the proper number of significant figures with which to express the precision of a mean determined from seven or more equally weighted measurements, the mean should have one more significant figure than has each measurement. In general, justification of this rule, and indeed the proper number of figures for the mean in any case, is indicated by the magnitude of, say, the standard deviation or, better, of the standard deviation in the mean.

As stated, the proper use of significant figures provides a rough method of expressing accuracy.† However, because it is only rough and involves an arbitrary criterion of uncertainty, it is by no means a good substitute for the assignment of the appropriate statistical error (e.g., standard deviation, standard error, or probable error), and also of a separate estimate of the net systematic error.
* A man named Babbage read Tennyson's The Vision of Sin, and wrote the following letter to the poet: "In your otherwise beautiful poem there is a verse which reads:

'Every moment dies a man,
Every moment one is born.'

It must be manifest that, were this true, the population of the world would be at a standstill. In truth, the rate of birth is slightly in excess of that of death. I would suggest that in the next edition of your poem you have it read:

'Every moment dies a man,
Every moment 1 1/16 is born.'

Strictly speaking this is not correct. The actual figure is a decimal so long that I cannot get it in the line, but I believe 1 1/16 will be sufficiently accurate for poetry. I am, etc."

† A physics student in a mathematics class received a grade of zero on an examination in which ten equally weighted questions had been asked. He inquired of the instructor whether the examination was graded on a basis of 10 as perfect or on a basis of 100. The instructor insisted that the question was pointless, saying that zero was zero regardless of the basis of scoring.
Rounding of numbers is the process of dropping one or more significant figures when the measurement is used in computations for a computed result. The rules for the rounding of numbers are rather well developed and are stated below. When these rules are followed consistently, the errors due to rounding largely cancel one another.

To round off to n figures, discard (i.e., replace with zeros or with powers of ten) all digits to the right of the nth place. If the discarded number is less than half a unit in the nth place, leave the nth digit unchanged; if it is greater than half a unit in the nth place, add 1 to the nth digit. If the discarded number is exactly half a unit in the nth place, then leave the nth digit unaltered if it is an even number but increase it by 1 if it is an odd number.

In multiplication and division (indeed, in all computations except addition and subtraction) with numbers of unequal accuracy and of equal weights in the final result, a generally safe rule is: retain from the beginning one more significant figure in the more accurate numbers than is contained in the least accurate number, then round off the final result to the same number of significant figures as are in the least accurate number. If unequal weights are involved, adjust the respective number of significant figures accordingly (weights are discussed later).

In the case of addition or subtraction, retain in the more accurate numbers one more decimal digit than is contained in the least accurate number. A decimal digit is a figure on the right of the decimal point regardless of the number of figures on the left.

The above rules presume that all the measurements are independent. If the measurements are at all correlated (see Section 3-7), the rules are not applicable and we must proceed with caution.
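The even-odd convention for a discarded part of exactly half a unit is commonly called round-half-to-even, and it is easy to try out by machine. The following sketch is not part of the original text; it uses Python's decimal module, with invented numerical values, so that "exactly half a unit" is represented exactly rather than as a binary floating-point approximation.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_even(value, places):
    """Round a number (given as a string) to `places` decimal places,
    sending an exact half to the even digit, as in the rule above."""
    quantum = Decimal(1).scaleb(-places)        # places=1 -> Decimal('0.1')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

print(round_even("27.35", 1))    # 27.4  (discarded part exactly half; 3 is odd, so raise it)
print(round_even("27.45", 1))    # 27.4  (discarded part exactly half; 4 is even, so leave it)
print(round_even("27.449", 1))   # 27.4  (less than half a unit: unchanged)
print(round_even("27.451", 1))   # 27.5  (more than half a unit: raised)
```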
C. FREQUENCY DISTRIBUTIONS AND PRECISION INDICES

The variations in successive trial measurements are completely represented by a detailed graph of the frequency distribution of the measurements. The idea of a frequency distribution was introduced and discussed very briefly in Chapter 1 and also in connection with Figs. 2-1 and 2-2. It was mentioned that some distributions are symmetrical in shape and that others are asymmetrical.

In the remainder of this chapter, we shall comment on a few of the qualitative features of some typical empirical frequency distributions, and then we shall discuss some easily obtained numerical measures of the shapes of or the types of variations in distributions. We shall treat general distributions, symmetrical or otherwise, and regardless of any possible fit of a mathematical model. These numerical measures, except for the location values (e.g., the arithmetic mean), are called precision indices of dispersion.
2-9. Typical Distributions

Most actual sets of observations or measurements have frequency distributions shaped somewhat like the bell shape shown in Figs. 2-1 and 2-2, or like a more or less drastically skewed bell shape. But it is possible that a distribution may have almost any conceivable shape. A variety of distributions are shown in Fig. 2-4(a) through (g). Each of the first three distributions shown in this figure has a rather large classification interval. If the classification interval in (c) were much smaller than it is, the distribution would undoubtedly drop to a low frequency value at very small ages and contain a narrow or strongly peaked maximum. As another example (not shown), a frequency distribution of the wealth of the average person in New York State according to age would probably be similar to (b) or (c) but with reversed skewness.

Comment is in order on a few additional features. In (a), the abscissa zero is displaced to the left, off the page. In the (a), (b), (c), and (g) examples, measurements are presumed to be possible that would have a nonuniform distribution within each indicated classification interval. The indicated interval size is imposed in (a) by perhaps the resolving power of the particular apparatus used, in (b) and (c) by perhaps some decision of convenience for the problem at hand. The problem in (c) evidently did not include finding the most probable age or the shape of the distribution for very small ages, or perhaps the investigator judged that a smaller classification interval would not be meaningful. Note in (g) the unequal classification intervals; the interval size is varied so as to avoid the vanishingly small frequencies per interval that would obtain in the higher intervals if a constant class size and a linear abscissa scale were used.
Terminology: types of distributions. Referring to Fig. 2-4, we have in (d) an example of a discrete distribution, "discrete" because only integer numbers of colonies of bacteria are possible by the nature of the observations. In (e), (f), and the fitted curve in (g), the distributions are continuous since they are calculated from the respective functional theoretical relations, which are continuous. The binomial and Poisson mathematical models of Fig. 1-2, and the normal distribution curves of Fig. 2-2, may well be included among those of Fig. 2-4. The binomial and Poisson distributions are discrete; the normal distribution is continuous. The shape and "smoothness" of a model distribution are always as though the number of observations were infinite.

Another general feature of interest is the range or bounds. For example, the normal distribution curve extends from -∞ to +∞, the distribution of (e) is bounded by 0 and +∞, and the distribution of (f) is bounded by ±A, etc.

The term histogram is often reserved for the block-area type of distribution such as in the (a), (b), or (c) type; frequency diagram for the (d) type; and frequency curve for the (e) or (f) type. In this terminology, the basic graph in (g) is a histogram; the fitted curve is a frequency curve. A histogram approaches a frequency curve as the classification interval goes to zero and the number of trials goes to infinity. In this book we use the term frequency distribution to refer to all of these types, either for actual measurements or for the mathematical models of probability distribution.
Another distinction in terminology is often made. A probability distribution has a unit sum if it is a discrete distribution, and a unit area if it is a continuous distribution, by definition of probability; hence no ordinate value is greater than unity.* An observed frequency distribution has ordinate values greater than unity unless the observed values are "normalized," i.e., divided by the total number of observations. If it is normalized, the distribution is sometimes called the relative distribution. If it is a mathematical model it is called the frequency function.

Finally, we often have occasion to refer to the sum or to the area under the distribution between specified abscissa limits. The sum of the frequencies, either observed or relative, from an abscissa value of 0 (or of -∞) to some specified value is called the cumulative distribution, or the distribution function, or the error function if the distribution is continuous and expressed analytically.

After a set of measurements has been classified, grouped into the appropriate class intervals, and plotted as a frequency distribution, the next task in statistics is to devise some simple numerical descriptions of the particular distribution. The features of primary interest are (1) a location index of the "center" of the distribution, and (2) a measure of the spread or dispersion. Of course, the simplest descriptions (simple in terms of the amount of arithmetic involved) give the least information. The distribution would be completely described if we had its analytical mathematical formulation and the necessary parameters. But in a typical experimental distribution, such a formulation is impractical. We proceed to discuss some descriptions of the distributions, location indices as well as measures of dispersion, that are applicable to any set of experimental data (or to any mathematical model). The measures of dispersion are called the precision indices.
* The ordinate probability in a continuous distribution is proportional to the abscissa interval (e.g., see Eq. 4-10 and discussion); if the probability is expressed in terms of an altered extended interval, the numerical value may exceed unity.
[Fig. 2-4. A variety of typical frequency distributions, (a) through (g).]
2-10. Location Indices

The three commonly used location indices are the median, the mode, and the mean. When the distribution has a single maximum and is symmetrical, these three location indices are all identical in value. Almost all practical cases are unimodal (have a single maximum) if the number of trial measurements is large enough for the meaningful application of statistical methods. But the condition of symmetry is less often realized in practice.

Median. The easiest location index to compute is the median. This is defined as the middle measurement of an odd number of measurements (all ordered as to magnitude), otherwise as the interpolated middle value. For some statistical problems the median is the most significant of the three location indices. This is apt to be the case when the distribution is strongly skewed. For example, we may be more interested in the median wage than in the mean wage of workers in a certain large industry for which the top executives' wages may give an undesirable distortion. However, even in a strongly skewed distribution of measurements in an experimental science, the mean is almost universally accepted as statistically the best location index.*
Mode (most probable value). Of especial interest in some asymmetrical distributions is the abscissa value at the peak position. This is the measurement, or the interpolated value, having the maximum frequency of occurrence. It is of much less interest in actual measurements than in mathematical model distributions because of the great difficulty of its accurate determination with real-life data; this difficulty is especially great when the classification interval is very small.

Mean m (arithmetic average). By far the most important location index is the mean. The experimental mean is denoted by m; the hypothetical ("parent" or "universe") mean for an infinite number of measurements, or the mathematical model mean, is denoted by μ. m is defined for n trial measurements, x_1, x_2, x_3, ..., x_i, ..., x_n, by the relation†

$$ m = \frac{x_1 + x_2 + x_3 + \cdots + x_i + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad \text{(2-1)} $$
* By the conventional criterion for statistical efficiency, the median as a location index is considerably less "efficient" than is the mean. In the normal distribution, for example, the median is only 64% as efficient as the mean (although both indices in this case have the same numerical "parent" values). The criterion for statistical efficiency is described in the discussion of the mean deviation in the next section and is also mentioned in Section 3-1. A few additional "inefficient statistics" are mentioned in Section 3-8.

† In this chapter, all measurements x_i are assumed to be individually equally weighted. The mean is defined in Section 3-3 in terms of unequally weighted measurements.
Fig. 2-5. Typical variation of the experimental mean as the number of trials increases. (Note the logarithmic abscissa scale.)
If one or more values of x are observed more than once, say x_i is observed f_i times, there will be n' (n' < n) different x values in the n trials. In this case, we may weight each x_i by its frequency of occurrence f_i, and write

$$ m = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_i x_i + \cdots + f_{n'} x_{n'}}{n} = \frac{\sum_{i=1}^{n'} f_i x_i}{n} \qquad \text{(2-2)} $$

$$ m = \sum_{i=1}^{n'} x_i \frac{f_i}{n} = \sum_{i=1}^{n'} x_i p_i \qquad \text{(2-3)} $$

where, of course, n = Σ_{i=1}^{n'} f_i, and p_i = f_i/n is the experimental probability for observing x_i.* The sums in Eqs. 2-2 and 2-3 could just as well be taken to infinity if it is understood that for some x_i's, f_i may be zero.†

† Often we are interested in a comparison of two means determined on different days, or with measurements taken with different apparatus, or by different observers. The statistics of such a comparison is discussed in Section 3-3.
The hypothetical or mathematical mean μ is defined by a relation similar to Eq. 2-3 that would obtain if n = ∞. Of course, if x is a point in continuous sample space, the summation in Eq. 2-3 must be replaced by an integral. μ is also called the universe or parent mean, and m the sample mean. (In slightly more advanced statistics, m is called an "estimator" of the parent parameter μ.) Of course, μ cannot be experimentally determined, but its value can be approached as n is made larger and larger. As illustrated in Fig. 2-5, m may fluctuate rather violently for n small, but it gradually settles down to a steady value as n becomes large.

The usual objective in the statistical interpretation of measurements with a limited number of trials is the best determination of μ, not of m, and this calls for some very interesting statistical arguments. Of course, in many sets of measurements the difference between m and μ may be believed to be very small. It turns out that for any given number of trials n the corresponding m from Eq. 2-1 is generally the best available value of μ. We shall presently define a precision index that gives the reliability of m for any given n.

In speaking of the mean of any mathematical model distribution, we necessarily refer to μ. If we are convinced that a model fits the experimental measurements, we approximate μ by m and proceed to take advantage of calculations and predictions by the model. By the term "mean" without further qualification we always refer to the arithmetic mean. (Other possible means are the geometric, root-mean-square, etc.) The mean is the abscissa value of the center of area of the frequency distribution. The phrase expectation value is often used to refer to the mean value m, or to μ in a model distribution.

* Just for practice, check that for the numbers 2, 3, 5, 6, 7, 8, 8, 9, 9, 9, 11, 11, 12 the mode is 9, the median is 8, and the mean is 7.69.
Calculation of the mean m is usually rather laborious when carried out directly, but it can generally be simplified by the device known as the working mean. To effect this simplification, write for each measurement x_i

$$ x_i = w + x_i' $$

where w, the working mean, is an arbitrary constant value of x. w is chosen to be somewhere near m in magnitude, chosen to include all the invariant significant figures in all the actual measurements, and, for convenience, chosen such that x_i' = 0 for some one value of i. A convenient value for w is readily selected after a visual inspection of the measurements at hand. Then

$$ \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} (w + x_i') = nw + \sum_{i=1}^{n} x_i' $$

and let

$$ \Delta = \frac{\sum_{i=1}^{n} x_i'}{n} $$

from which

$$ m = w + \Delta \qquad \text{(2-4)} $$

The convenience of the working mean is realized if w is so chosen as to make Δ small in magnitude; calculation of Δ is then appreciably easier than calculation of m. A simple example of a calculation with the working mean is given in Table 2-2, which is introduced later in Section 2-11.
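A minimal sketch of the working-mean device of Eq. 2-4 follows; it is not from the original text, and the measurement values in it are invented for illustration.

```python
x = [128.2, 127.9, 128.4, 128.0, 127.8, 128.3]   # invented measurements, cm
w = 128.0                                        # working mean, a convenient round number

delta = sum(xi - w for xi in x) / len(x)         # only small numbers are summed
m = w + delta                                    # Eq. 2-4
print(round(delta, 3), round(m, 3))              # 0.1 and 128.1
print(round(sum(x) / len(x), 3))                 # 128.1, the same mean computed directly
```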
2-11. Dispersion Indices

There are several indices of the spread (dispersion) of the measurements about the central value. The dispersion, as stated above, is a measure of precision. As with the location indices, the amount of information contained in each dispersion index increases roughly with the arithmetical labor involved.

Range. The simplest index of dispersion is the range. It is equal to the difference between the largest and the smallest measurements. For obvious reasons, the magnitude of the range generally increases with the number of trial measurements. Hence, whenever the range is specified, it should be accompanied by a statement of the number of trials.

Quantile. Suppose that all the n measurements are ordered as to magnitude and then divided into intervals, each interval having an equal number of measurements. If there are M intervals, each interval is called a quantile or an M-tile, or sometimes a fractile. The quartiles refer to the quarter divisions of the total number of measurements, there being two quartiles on each side of the median. Deciles refer to the division by tenths, etc. The dispersion information given by the quantiles increases as M increases, but it disregards the distribution of values within each quantile.

Deviation (statistical fluctuation). For the ith measurement in a set of n trials, the deviation z_i, denoted also as dx_i or δx_i, is defined as

$$ z_i = x_i - m \qquad \text{(2-5)} $$

From the definition of z_i it readily follows that

$$ \sum_{i=1}^{n} z_i = 0 \qquad \text{(2-6)} $$

The mean m is used as the reference value in this definition in order that the sum of the squares of all deviations shall be a minimum, as shown in the next paragraph. All the precision indices that have to do with the shape or dispersion of the frequency distribution are defined in terms of deviations.
The mean m is used as the reference value in this definition in order that the sum of the squares of all deviations shall be a minimum, as shown in the next paragraph. All the precision indices that have to do with the shape or dispersion of the frequency distribution are defined in terms of deviations.
Let z/
is
S be
the
sum of the squares of the deviations when each deviation some unspecified reference value m. Then,
defined in terms of
S=Iz/ = i
To we
2
l
=I i
=l
fo,
-
m')
2
= I x? =l
find the particular value of m', call differentiate
S
2m'
i
it
mm
',
that
|=
1
xt
+
nm' 2
1
makes S a minimum,
with respect to m', place the derivative equal to zero,
and Experimental Errors
Probability
80
and solve for
mm
'.
Since
all
t
n
X*
n
—dS- — — 2 J dm
which
t
+
2nm m
=
'
mental deviations
'
n
=i
point which
and as used
is a minimum.* we encounter over and over again
dealing with experimental deviations
such as
does not
fx,
deviations,
value
li
ments
i.e.,
in
Eq. 2-5. There-
in dispersion indices
any other reference value,
that
make S a minimum. Only when we
refer to all possible
minimum. But note that the objective of the measurewe really wish the precision dispersion index to m. The best we can do with a limited number of observa-
a
not m, and
not to
/u,
is
to the "universe" (infinity) of deviations, does the reference
make S
is /a,
refer to tions,
and
= m m = ^~
or
m is the reference value for which the sum of the squares of the experi-
fore,
A
xf
the mean, m, as defined in Eq. 2-1
is
Science
the x values are constants their derivatives
Then,
are zero.
in
however,
is
practical in
m and a dispersion index with respect to m. many
It is
possible
situations to introduce additional information
from more than one sample of n measurements each, or from samples of different sizes, or from a reasonably fitting mathematical model. Use of such additional information,
into the problem, as, for example,
introduces factors such as Vnj{n
—
1),
Vn(n
—
1), etc.,
in the quantitative
dispersion indices; these factors are discussed later.
When we deal S to be
value for
Mean
with a mathematical model distribution, the reference a
minimum
is /u.
Mean (average) deviation. This index is defined without regard to the algebraic signs of the individual deviations. The mean deviation z̄ is defined as

$$ \bar{z} = \frac{\sum_{i=1}^{n} |x_i - m|}{n} \qquad \text{(2-7)} $$

A small numerical value of z̄ means that the individual measurements are closely grouped, and that the distribution is rather sharply peaked. The value of z̄ also provides a sort of numerical guess as to the amount by which the next measurement is likely to differ from the mean value m.

The use of this measure of dispersion, z̄, is rather widespread in scientific work, but it is not much used by statisticians. It is what the statisticians call an "inefficient" measure of dispersion, and this for the following reason. Suppose that a large number of trial measurements are made from which the mean deviation is computed. Then suppose these measurements to be arbitrarily divided into many small subsets and the mean deviation z̄_j of each subset computed. (Each z̄_j is computed with respect to m_j, the mean of the jth subset.) The subset values z̄_j will show a rather large scatter about the value z̄ for the grand total of the measurements. An efficient measure of dispersion is one that shows small scatter, i.e., is one that allows a statistically reliable estimate of the precision based on just one subset. Efficiency refers specifically to the reciprocal of the square of the standard deviation of the z̄_j distribution about the central value z̄. (Standard deviation is defined in the next section.) In a practical sense, efficiency refers to the inverse of the number of measurements required for a given statistical precision: the smaller the number, the greater the efficiency. Gauss showed that, to have a given degree of precision in a set of measurements from a parent population having a normal distribution, 14% more measurements are required if z̄, rather than the standard deviation, is used as the precision index.

Any set of measurements in real life is just one subset of a much larger number of possible measurements. Judged by this statistical efficiency criterion, the mean deviation does not justify the widespread use to which it has been put by scientists. However, it is nonetheless a very useful index of dispersion if the larger deviations in the measurements are believed not to justify a higher weighting than the first power. (The standard deviation weights according to the second power.) Such measurement problems do arise in scientific work.

Another characteristic of the mean deviation is the following. Suppose that a large set of measurements is divided into subsets of two measurements in each subset, and suppose that the value z̄_j is computed for each subset. The average of these values of z̄_j is generally less than, statistically only 0.707 of, the mean deviation of the parent set. It is true that the larger the number of measurements in each subset, the closer the average of the subset values becomes to the value of the parent set. For subsets of three measurements each, the statistical average of the mean deviations z̄_j is 0.816 of the parent z̄; for subsets of five measurements each, it is 0.894; for subsets of ten each, it is 0.943. Most actual sets of measurements in real life are not very large, and the mean deviation gives an unduly optimistic measure of the precision. This bias is also found in the experimental standard deviation, and in each case it can be corrected by multiplying by the factor √(n/(n-1)), as discussed later.

The fractional mean deviation is defined as

$$ \text{fractional } \bar{z} = \frac{\bar{z}}{m} \qquad \text{(2-8)} $$

and is usually expressed in per cent.
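The subset factors quoted above (0.707 for pairs, and so on) can be illustrated by a small Monte Carlo experiment. The sketch below is not from the original text; it assumes a normal parent distribution and uses pseudo-random samples, so the printed ratio is only approximate.

```python
import random

random.seed(1)
parent = [random.gauss(0.0, 1.0) for _ in range(100000)]

def mean_dev(sample):
    m = sum(sample) / len(sample)
    return sum(abs(v - m) for v in sample) / len(sample)

z_parent = mean_dev(parent)
pairs = [parent[i:i + 2] for i in range(0, len(parent), 2)]
z_pairs = sum(mean_dev(p) for p in pairs) / len(pairs)
print(round(z_pairs / z_parent, 3))   # close to 0.707
```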
Although the value of the fractional z̄ is dimensionless, its numerical value depends upon the choice of zero of the x scale. For example, the fractional z̄ in a set of measurements of temperature is different if x is expressed in degrees centigrade or in degrees Kelvin. Hence, the fractional value is usually used only in those measurements for which the zero of the x scale is the physically significant zero. This restriction also applies to any fractional dispersion index, e.g., the fractional standard deviation mentioned presently.

In Eqs. 2-7 and 2-8 the value of z̄ is that obtained with the mean m, not μ, as the reference value.* Hence, z̄ and the fractional z̄ are experimental quantities. The corresponding indices for a model distribution are not used, because of their inefficiency and bias as stated above.

Experimental standard deviation s. Another measure of the dispersion of the frequency distribution is the standard deviation. With a limited number n of trial measurements, the experimental or sample standard deviation s is defined as†

$$ s = \left[ \frac{\sum_{i=1}^{n} (x_i - m)^2}{n} \right]^{1/2} \qquad \text{(2-9)} $$

This expression can also be written, in case x_i is observed f_i times, as

$$ s = \left[ \frac{\sum_{i=1}^{n'} (x_i - m)^2 f_i}{n} \right]^{1/2} = \left[ \sum_{i=1}^{n'} (x_i - m)^2 p_i \right]^{1/2} \qquad \text{(2-10)} $$

where n' < n and p_i = f_i/n is the probability of occurrence of x_i. (Again, the sums in Eq. 2-10 could just as well be taken to infinity.) The quantity s is also called the root-mean-square (or rms) deviation. Taking the square root of the sum makes the dimensions and units of s the same as those of x.

Note that the deviations in Eq. 2-9 are individually squared before the summation is made; this assigns more weight to the large deviations. Hence, as a measure of the dispersion, s is more sensitive to large deviations than is the mean deviation z̄. It follows that, of two frequency distributions having the same mean deviation, the distribution with the relatively higher tails has the greater standard deviation. In a series of measurements, a large deviation is always cause for concern; its appearance increases our a priori belief that another large deviation will soon appear.* The standard deviation is a more efficient measure of the precision, as the statistician reckons efficiency, than is the mean deviation.

Of all the precision dispersion indices, the standard deviation is the one in widest use in statistics. And it is also widely used in experimental science, but nowadays the probable error, which is based on the standard deviation, seems to be about equally popular. The probable error is discussed in Chapters 3, 4, and 5.

* Sometimes in statistics a "mean deviation" is reckoned with the median as the reference value. This, however, is very uncommon in scientific work.

† This definition presumes all individual measurements x_i to be equally weighted, a typical presumption for a set of random measurements. The standard deviation is defined in Section 3-3 in terms of unequally weighted measurements.

* It can be shown that, in any distribution, the probability p of observing a deviation z² > k²s² is p < 1/k², where k is any positive number. (This is known as Tchebycheff's theorem or inequality.)
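The point about the tails can be made concrete with two invented samples that have the same mean deviation but different tails. The sketch below is not part of the original text.

```python
def mean_dev(x):
    m = sum(x) / len(x)
    return sum(abs(v - m) for v in x) / len(x)

def std_dev(x):                       # Eq. 2-9
    m = sum(x) / len(x)
    return (sum((v - m) ** 2 for v in x) / len(x)) ** 0.5

a = [-1, -1, -1, -1, 1, 1, 1, 1]      # every deviation has magnitude 1
b = [-2, -2, 0, 0, 0, 0, 2, 2]        # same mean deviation, but heavier tails
print(mean_dev(a), mean_dev(b))       # 1.0 and 1.0
print(std_dev(a), round(std_dev(b), 3))   # 1.0 and 1.414
```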
Because of the squaring operations in Eq. 2-9, s does not allow distinction between the algebraic signs of the individual deviations and gives no indication of the degree of asymmetry of the distribution. For a symmetrical distribution, s can be indicated with meaning as ±s values on the graph of the distribution; for an asymmetrical distribution, it cannot be so indicated and it remains of mathematical significance only.

The standard deviation provides, as does any dispersion index, a numerical guess as to the likely range of values into which the next measurement may fall. With this interpretation, s is sometimes called the standard deviation of a single measurement rather than of the distribution itself. As we shall see in Chapter 4, if the normal model fits the experimental parent distribution of mean μ and standard deviation σ, the probability is about 0.683 that the next measurement will fall within μ ± σ, which is very nearly the same as m ± s.

Often we are interested in a statistical comparison of the precision of the means determined on different days, or with different apparatus, etc. This comparison is discussed in Chapter 3.

The fractional standard deviation is defined as

$$ \text{fractional } s = \frac{s}{m} \qquad \text{(2-11)} $$

This index is extensively used in problems dealing with the propagation of error, discussed in Chapter 3. However, as with the fractional mean deviation, the fractional standard deviation is not especially useful in cases when the zero of the x scale is not physically significant.

Computation of s by Eq. 2-9 or 2-10 is often tedious and, unless very carefully done, is apt to be somewhat inaccurate. One reason for the inaccuracy is a consequence of the fact that more significant figures must be used in the value of m than in each value x_i.
To reduce this difficulty in the calculations, a convenient expression for s is obtained by expanding (x_i - m)² in Eq. 2-9 and by using Eq. 2-1; thus,

$$ s = \left( \frac{\sum_i x_i^2 - 2m \sum_i x_i + nm^2}{n} \right)^{1/2} = \left( \frac{\sum_i x_i^2}{n} - m^2 \right)^{1/2} \qquad \text{(2-12)} $$

A further simplification is achieved by using the working mean introduced in Eq. 2-4, viz., w = m - Δ. From a visual inspection of the set of measurements at hand or, better, of the frequency distribution graph, a working mean w is chosen which may be a convenient round number, if the classification interval is bounded by round numbers, and which contains all the invariant significant figures in all the x_i's. For convenience, w is chosen to be close in value to m. Then, in terms of (x_i - w), which may be called an adjusted deviation,

$$ s = \left( \frac{\sum_{i=1}^{n} (x_i - w - \Delta)^2}{n} \right)^{1/2} = \left( \frac{\sum_{i=1}^{n} (x_i - w)^2}{n} - \Delta^2 \right)^{1/2} \qquad \text{(2-13)} $$

where, as for Eq. 2-4,

$$ \Delta = \frac{\sum_{i=1}^{n} (x_i - w)}{n} \qquad \text{(2-14)} $$

A check of the calculation for s² can be made relatively quickly by choosing a different value of the working mean and by performing an independent calculation.
An example of the shortened computational method for s [and, incidentally, for the mean and for the coefficient of skewness (discussed later)] is given in Table 2-2. In this table, f_i, the frequency of occurrence of the observed value x_i, is large for a number of x_i's and is used to shorten the tabulation. Note that in the table the computed value of m is expressed with one more significant figure than each value x_i. This is justified because the mean is obtained from such a large number of x_i's, and this justification is confirmed by the numerical value of the standard deviation in the mean, s_m, a precision index introduced presently.
Moments. The square of the standard deviation is also known as the second moment about the mean. Since moments higher than the second are also mentioned presently, it is worthwhile now to become acquainted with moments in general.
Table 2-2. Sample Calculations for Mean and Standard Deviation (Using the "Working Mean"): adjusted deviations and moments with working mean w = 128 cm.

Measured value x_i (cm) | Frequency of observation f_i | (x_i - w) (cm) | f_i(x_i - w) (cm) | f_i(x_i - w)² (cm²) | f_i(x_i - w)³ (cm³)

[The tabulated entries, beginning at x_i = 125 cm, are not legible in this copy.]
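A sketch of the shortened computation of Eqs. 2-12 to 2-14 follows. It is not part of the original text, and since the individual entries of Table 2-2 are not reproduced above, the measurement values here are invented; only the working mean w = 128 cm is taken from the table.

```python
x = [125.0, 127.0, 128.0, 128.0, 129.0, 130.0, 131.0]   # invented measurements, cm
w = 128.0                                               # working mean, as in Table 2-2

d = [xi - w for xi in x]                                # adjusted deviations
delta = sum(d) / len(d)                                 # Eq. 2-14
m = w + delta                                           # Eq. 2-4
s = (sum(di ** 2 for di in d) / len(d) - delta ** 2) ** 0.5   # Eq. 2-13

s_direct = (sum((xi - m) ** 2 for xi in x) / len(x)) ** 0.5   # Eq. 2-9, for comparison
print(round(m, 3), round(s, 3), round(s_direct, 3))     # the two values of s agree
```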
The kth moment about the mean m, denoted by θ_k^m, is

$$ \theta_k^m = \frac{\sum_{i=1}^{n} (x_i - m)^k f_i}{n} = \sum_{i=1}^{n'} (x_i - m)^k p_i \qquad \text{(2-17)} $$

from which, with Eq. 2-10,

$$ s^2 = \theta_2^m \qquad \text{(2-18)} $$

as stated above.

If the mean deviation were defined differently than it is in Eq. 2-7, i.e., if the absolute-value brackets were removed and regard were maintained for the algebraic signs of the individual deviations, then this z̄ would be the first moment about the mean. But the first moment about the mean, viz., θ_1^m, is equal to zero by the definition of the mean, and this is the reason that z̄ is defined without regard to algebraic signs.
A useful relation exists between the second moments about the origin and about the mean, respectively. This is shown by essentially the same expansion as was made in Eq. 2-12, viz.,

$$ \theta_2^m = \sum_{i=1}^{n'} (x_i - m)^2 p_i = \sum_{i=1}^{n'} x_i^2 p_i - 2m \sum_{i=1}^{n'} x_i p_i + m^2 \sum_{i=1}^{n'} p_i = \theta_2^0 - 2m\theta_1^0 + m^2 $$

and, by substituting Eq. 2-16 for θ_1^0, we have the desired relation

$$ \theta_2^m = \theta_2^0 - m^2 \qquad \text{(2-19)} $$

This expression is identical to the well-known formula relating the moments of inertia of a body of unit mass about two different parallel axes a distance m apart, with one axis through the center of mass.
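Equation 2-19 is easily checked by direct computation. The short sketch below is not part of the original text; the five values are invented for illustration.

```python
x = [2.0, 4.0, 4.0, 5.0, 10.0]       # invented values
n = len(x)
m = sum(x) / n                       # first moment about the origin
theta2_origin = sum(v ** 2 for v in x) / n
theta2_mean = sum((v - m) ** 2 for v in x) / n
print(round(theta2_mean, 6), round(theta2_origin - m ** 2, 6))   # both 7.2
```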
"universe" or "parent" standard deviation a. If all the infinite population of possible measurements in the "universe" were known (a purely hypothetical situation), would be known and we could use /n instead of m as the reference value in computing each deviation. The standard deviation of a set of n deviations, as n —> co, with ju as the Variance a 2 :
/li
reference value
is
t
is
known
=
denoted by
—
l
as the variance,
a,
= a
and
matical model.
it
is
square
= 2
*=*
is
standard deviation. The variance distribution whether
its
fo.
- n?Pi
(2
-
20 )
also called the "universe" or "parent" is
a parameter of the universe or parent
of the "imagined
Incasethe parent distribution
real-life" is
type or
is
a mathe-
continuous, thesummation
Frequency Distributions and Precision Indices
may
Eq. 2-20
in
87
be replaced by an integration.* Thus, using/as the con-
tinuous frequency function from
to oo,
fxffdx Jo
(2-21)
f
fdx
Jo
The
integral in the
denominator
that the frequency function
function
included in this expression to be sure
is
normalized;
is
when a model
used, the integral in the denominator
is
The variance a 2
,
i.e.,
the second
moment about //,
is
probability
unity by definition.
is statistically
the
most
important parameter in describing the dispersion of any universe or parent distribution, including any mathematical model distribution. With either
the
"real-life"
distribution,
imagined universe
we must assume
that
//,
is
The value of [x, hence of a 2 can never be ,
set
We
of measurements.
* In
distribution
or
the
model a2
known
in order to calculate
exactly
known from any
.
real-life
use the "best" estimate that can be obtained.
going from a discrete to a continuous frequency distribution,
we
use the basic
For some students, it may be helpful to review this argument. Consider the 12 measurements listed in Table 2-1 and graphed in the discrete distribution of Fig. 2-1. Or consider any list of a large number n of measurements. The range of the abscissa or x axis of interest can be arbitrarily divided into a large number N of equal increments Ax,. The number of measurements that fall into the ith interval is «,, and the normalized frequency (or the experimental probability) with which a measurement is observed within this interval is argument of
calculus.
N i
The normalized frequency distribution
the graph of/>, vs. x,, where x, is the coordinate taken as the average of the measurements within this interval. of course, discrete so long as Ax, is finite in size.
of the interval Ax, and This distribution
is,
n
I-, =i
is
is
We wish now to approximate the discrete frequency distribution/?,(x,) by a continuous function /(x), defined by
i.e.,
one for which Ax,
means of the
->-
in the limit as n
-*
oo.
This function can be
relation
which says that the value of/(x) at x = x, is to be made such that the product of this value and the width of the interval Ax, is equal to the normalized frequency of the observed measurements within this interval. Actual real-life measurements are far too few in number in any given situation to determine /"(x) in fine detail (i.e., Ax, small, zero in the limit) by direct use of this definition. We are usually content to guess the "way/"(x) should go" reasonably consistent with the actual measurements at hand. This gives a continuous function that approximates not only the actual discrete frequency distribution but also the presumed-to-becontinuous parent distribution.
For a set of n measurements, as stated above, the best value of μ is generally taken as the experimental value m; and the best approximation to σ² that can be deduced from a set of n measurements is generally taken as†

$$ \sigma^2 \approx \frac{\sum_{i=1}^{n} (x_i - m)^2}{n - 1} \qquad \text{(2-22)} $$

[The sample standard deviation s is one estimator of σ, but √(n/(n-1)) s is generally considered to be a better estimator of σ because the radical factor corrects for a bias inherently present in s. This bias was mentioned earlier, in connection with the mean deviation, and is discussed again below.]

Combining Eqs. 2-9 and 2-22, we note that

$$ \sigma \approx \left( \frac{n}{n-1} \right)^{1/2} s = \left[ \frac{\sum_{i=1}^{n} (x_i - m)^2}{n - 1} \right]^{1/2} \qquad \text{(2-23)} $$
* In going from a discrete to a continuous frequency distribution, we use the basic argument of the calculus. For some students it may be helpful to review this argument. Consider the 12 measurements listed in Table 2-1 and graphed in the discrete distribution of Fig. 2-1, or consider any list of a large number n of measurements. The range of the abscissa or x axis of interest can be arbitrarily divided into a large number N of equal increments Δx_i. The number of measurements that fall into the ith interval is n_i, and the normalized frequency (or the experimental probability) with which a measurement is observed within this interval is p_i = n_i / Σ_{i=1}^{N} n_i = n_i/n. The normalized frequency distribution is the graph of p_i vs. x_i, where x_i is the coordinate of the interval Δx_i, taken as the average of the measurements within this interval. This distribution is, of course, discrete so long as Δx_i is finite in size. We wish now to approximate the discrete frequency distribution p_i(x_i) by a continuous function f(x), i.e., one for which Δx_i → 0 in the limit as n → ∞. This function can be defined by means of the relation f(x_i) Δx_i = p_i(x_i), which says that the value of f(x) at x = x_i is to be made such that the product of this value and the width of the interval Δx_i is equal to the normalized frequency of the observed measurements within this interval. Actual real-life measurements are far too few in number in any given situation to determine f(x) in fine detail (i.e., Δx_i small, zero in the limit) by direct use of this definition. We are usually content to guess the "way f(x) should go" reasonably consistent with the actual measurements at hand. This gives a continuous function that approximates not only the actual discrete frequency distribution but also the presumed-to-be-continuous parent distribution. In practice, this is commonly put in the form of a guess that a particular known continuous function satisfactorily "fits" the finite set of actual measurements; in other words, the guess is made that more measurements, were they to be taken, would merely increase our satisfaction that the analytic function fits and describes the parent distribution. Then the common problem becomes one of determining the best guesses as to the important parameters of the continuous function. For example, for the means of the sample and of the parent distribution respectively, we write

$$ m = \frac{\sum_{i=1}^{N} n_i x_i}{n} = \sum_{i=1}^{N} x_i\, p_i(x_i), \qquad \mu = \int_0^\infty x f(x)\,dx $$

and for the sample and the continuous-parent kth moments about the mean (see Eq. 2-17),

$$ \theta_k^m = \sum_{i=1}^{N} (x_i - \mu)^k p_i(x_i) \quad \text{(experimental)}, \qquad \theta_k^\mu = \frac{\int_0^\infty (x - \mu)^k f(x)\,dx}{\int_0^\infty f(x)\,dx} \quad \text{(parent)} $$

† That these are the "best" values is discussed with more or less general statistical arguments in the appendix of the paper by R. H. Bacon, Am. J. Phys., 21, 428 (1953). Later, in Chapter 3, we show that taking these as the best estimators of μ and σ² in the normal distribution is consistent with the method of maximum likelihood; see Eqs. 3-12, 3-14, and 3-97 and the discussions attending these equations. In general, of course, other estimators exist for the parent parameters μ and σ, but they are of no great interest to us at our present level of discussion. But it is worth mentioning that there is no rigorous proof that m and √(n/(n-1)) s are in fact the best estimators. Such a proof requires the introduction of some condition or criterion in addition to the conventional theory of probability. Maximum likelihood is such a condition, and it leads to useful estimators that are generally taken as best for all practical purposes in experimental science.
and this is the best practical formula (or estimator) for the universe or parent standard deviation. Note that in this expression for σ in terms of m, the denominator is n - 1 instead of n. Often, no distinction need be made between the numerical values of s and σ, since neither is numerically very significant unless n is reasonably large, and then the differences between m and μ, and between n and n - 1, are relatively small. For small n, say less than 10 or 15, the factor √(n/(n-1)) must be applied to s whenever the standard deviation of the parent distribution is desired. And, regardless of the size of n, the difference in principle between s and σ (and between m and μ) is fundamental in statistical theory and, indeed, in the philosophy of science.

It is of importance that Eq. 2-22 or 2-23 not be interpreted in any sense as a definition of σ; σ refers to the parent population, and the equations here merely provide a means of estimating the value of σ. The "approximately equals" sign in Eqs. 2-22 and 2-23 approaches "equals" as n → ∞. On the average, even for n rather small (but of course greater than unity), the expression is very nearly exact. It is interesting to note that, for n = 1, Eq. 2-23 gives 0/0 and σ is indeterminate, as is proper in this case; and, for n = 1, Eq. 2-9 gives s = 0, as is also proper in this case although, here also, mathematically "indeterminate" expresses a more appropriate meaning.
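The practical effect of the factor n/(n - 1) can be illustrated by a Monte Carlo experiment. The sketch below is not from the original text; it assumes a normal parent distribution with σ = 1 and averages the two variance estimates over many small samples.

```python
import random

random.seed(2)
n, trials = 5, 20000
avg_n, avg_n1 = 0.0, 0.0
for _ in range(trials):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    m = sum(x) / n
    ss = sum((xi - m) ** 2 for xi in x)
    avg_n += ss / n / trials          # square of s, Eq. 2-9
    avg_n1 += ss / (n - 1) / trials   # estimate of sigma squared, Eq. 2-22
print(round(avg_n, 2))    # about 0.8, i.e. biased low (the true variance is 1)
print(round(avg_n1, 2))   # about 1.0
```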
Degrees of freedom. The factor n/(n - 1) enters in Eq. 2-22 by an argument in statistical theory, and we shall encounter it again in Chapter 3 in connection with various tests and in Chapter 4 in connection with the χ² test of the fit of a model distribution to an experimental distribution. So let us take a moment to see at least its plausibility. It has to do with the concept of the "degrees of freedom," which has the same meaning here that it has in geometry and in mechanics. The argument can be stated in terms of the number of "restraints" imposed on the universe or parent distribution.

We seek to know the parameter σ of the parent distribution of infinite population of which the n measurements are a sample. The only thing we know about σ, or indeed of the parent distribution at all, is that the experimental measurements are a sample of it. When we insist that the parent distribution have a characteristic, any characteristic, which is determined from the sample, we impose a restraint on the parent distribution. For example, insisting that the parent distribution have a mean value equal to the sample value m imposes one restraint, and the price we pay for this restraint is a sacrifice in s as being the best estimate of σ. In other words, the effective number of measurements useful in the determination of the best estimate of σ is reduced from n to (about) n - 1, as in Eq. 2-22; one view is that (about) one of the n measurements is used to determine the mean and the remaining n - 1 measurements are left to determine the errors or the dispersion. The best estimate of σ is a little greater than the value of s.

In another view, the reason that the best estimate of σ is greater than s is the following. The sum of the squares of the deviations for the universe population is a minimum only if μ, not m, is used as the reference value in calculating each deviation, whereas the sum for the n sample deviations is a minimum only if m, not μ, is the reference value. This argument is equivalent to the one expressed in the previous paragraph.

The restraint mentioned above, viz., that m = μ, is always unavoidably imposed whenever we try to match a parent distribution, either a model distribution or the one believed to exist in real life (even though it may not be fitted by any model). And it is fairly obvious that a second restraint is imposed if and when we insist that the parent parameter σ be given in terms of s. Then the factor n - 2 enters the statistical formulas as well as n - 1. We shall discuss situations of this sort in Chapter 3, and we shall see that the χ² test of model match in Chapter 4 involves just this type of argument.
way of looking at the restraint is in terms of the ideas of The solution of a single equation is said to be unrestrained, but when we require the simultaneous solution of an additional equation we impose a restraint; we impose two restraints with three simultaneous independent equations; etc. The attempt to determine a equivalent
simultaneous equations.
from the n experimental measurements, as
in Eq. 2-23, involves three
simultaneous equations n
a2
=
I
Urn n-*co
^
n
n
te
- /")
2
ii
,
=
fl
2
lim
x '-^—
n-*oo
n
i ,
m=
2
xi
^—
(2-24)
n
two of which are independent and from which we eliminate /z and solve known quantities m, n, and the x/s. This solution can be carried out statistically but it is not an easy one. The feature of the argument is readily seen when it is pointed out that for n = 1 we have no deviation; for n = 2 we have two deviations but they are identical (except for sign) and therefore are not independent, i.e., we have only 2 — independent deviations. We can generalize the argument to say that for n measurements we have n — independent deviations; and the proof of this conclusion lies in the fact that any one of the x values is given by Eq. 2-1 in terms of the other z values and the value of the mean m. The argument for the n — factor in the denominator of Eq. 2-23 is a for a in terms of the
1
1
t
t
1
little
more involved than
just
the
number of independent
deviations
Frequency Distributions and Precision Indices
9/
because, with the aj/s as only a sample of the parent distribution,
not possible from Eq. 2-24 to determine either
it
is
or a exactly.
/u,
Another example of the concept of degrees of freedom is found in curve A curve may be made to go through all the points if the equation of the curve has as many constants as there are points. With the curve so fitted, no point deviates at all from the curve, and the sum of the squares of the deviations (i.e., the numerator of the expression for a) is zero. This is also the same as saying that one degree of freedom is lost for each point fitted, and therefore the number of degrees of freedom left for the determination of error (i.e., the divisor) is zero. The standard deviation a there is no information concerning is thus 0/0 and is indeterminate: dispersion left in the set of points; there is no freedom left. Finally, the remark may be made that division in Eq. 2-23 by the number of degrees of freedom, instead of by n, insures that we have the property mentioned earlier, viz., that the standard deviations based on small subsets have statistically the same average value as that computed for the fitting.
parent
set.
binomial model distribution.
Variance:
The mean and
the variance
of the binomial model distribution are readily written as the respective
moments. Using Eq.
1-20 for
=
a
^ The k =
factor is
k
p
= V
0« l
from k nominator by
k,
to k = with k >
ii
Next, n and
p
—
k\(n-k)\
sum
1.
is
(2-25)
us that the term for
by changing the lower
0,
pY~ k
are factored out, giving
I,
and
7]
=
/"
n
=
(n
= i (k
—
1
nP
— ;
-
see that the
summation
fc
—
_! n _ fc
k)\
then K<
2
is
1)!
1)!(h
k=o k\{t\
and we
tells
not altered
= I
jt
k
k n ~k
pq
Then, by dividing both numerator and de-
*
=
find
numerator of the summation
in the
=
we
k
t=o
zero, so the value of the
limit
Let k
in Eq. 2-3,
t
—
^B(k;
P
f
K
k)\
r\,
p) which, by Eq. 1-22,
is
unity.
Hence, ju
=
np
(2-26)
The variance of the binomial distribution can be written, with the help of Eq. 2-19, as

$$ \sigma^2 = \sum_{k=0}^{n} k^2 B(k; n, p) - \mu^2 \qquad \text{(2-27)} $$

Now we substitute [k(k - 1) + k] for k² and write

$$ \sigma^2 = \sum_{k=0}^{n} k(k-1)\,\frac{n!}{k!\,(n-k)!}\,p^k q^{n-k} + \mu - \mu^2 $$

Because of the k(k - 1) factor in the summation, the summation may just as well begin with k = 2. We cancel the factor k(k - 1), factor out n(n - 1)p², and we have

$$ \sigma^2 = n(n-1)p^2 \sum_{k=2}^{n} \frac{(n-2)!}{(k-2)!\,(n-k)!}\,p^{k-2} q^{n-k} + \mu - \mu^2 $$

Let δ = k - 2 and ν = n - 2; then

$$ \sigma^2 = n(n-1)p^2 \sum_{\delta=0}^{\nu} \frac{\nu!}{\delta!\,(\nu-\delta)!}\,p^{\delta} q^{\nu-\delta} + \mu - \mu^2 = n(n-1)p^2 + \mu - \mu^2 $$

and, again, the summation Σ B(δ; ν, p) is unity. Substituting np for μ from Eq. 2-26, we have

$$ \sigma^2 = np(1 - p) = npq \qquad \text{(2-28)} $$

In the binomial distribution, σ² is always less than np (= μ).

The fractional standard deviation, also called the coefficient of variation, can be written for the binomial model distribution as

$$ \frac{\sigma}{\mu} = \frac{[np(1-p)]^{1/2}}{np} = \left( \frac{1-p}{np} \right)^{1/2} = \left( \frac{1}{\mu} - \frac{1}{n} \right)^{1/2} \qquad \text{(2-29)} $$

which is especially convenient in those cases for which n ≫ μ, as in the Poisson approximation. Since the normal and the Poisson model distributions are special cases of the binomial, Eqs. 2-26, 2-28, and 2-29 also apply to them. Equations 2-26 and 2-28 are derived again in Chapters 4 and 5 specifically for these distributions.
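Equations 2-26 and 2-28 are easy to verify numerically for any particular n and p. The sketch below is not part of the original text; the choice n = 12, p = 0.3 is arbitrary.

```python
from math import comb

n, p = 12, 0.3
B = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]   # B(k; n, p)

mu = sum(k * B[k] for k in range(n + 1))
var = sum(k ** 2 * B[k] for k in range(n + 1)) - mu ** 2    # Eq. 2-27
print(round(mu, 6), round(n * p, 6))                 # both 3.6
print(round(var, 6), round(n * p * (1 - p), 6))      # both 2.52
```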
Standard deviation in the mean (standard error) s_m. If we were to record a second set of measurements, the second value of the mean would in general differ from the first value. But the difference between the two means would be expected to be less than the standard deviation in either set. This expectation could be checked, of course, by a large number N of repetitions of the set of measurements and the frequency distribution of the N means analyzed. We would write for the experimental standard deviation in the mean

$$ s_m = \left[ \frac{\sum_{j=1}^{N} (m_j - \bar{m})^2}{N} \right]^{1/2} \qquad \text{(2-30)} $$

where m̄ is the grand mean of the N values. Very few experimenters are willing to record the Nn measurements required in order to obtain a value of s_m from Eq. 2-30. Fortunately, statistical theory provides a satisfactory formula for s_m from the n measurements of a single set. In Chapter 3 we show in simple fashion that

$$ s_m = \frac{s}{\sqrt{n}} \qquad \text{(2-31)} $$

In reference to the parent distribution, the formula, of course for σ_m rather than for s_m, is

$$ \sigma_m = \frac{\sigma}{\sqrt{n}} \qquad \text{(2-32)} $$

Combining Eq. 2-32 with Eqs. 2-22 and 2-9, we have

$$ \sigma_m \approx \frac{s}{\sqrt{n-1}} \qquad \text{(2-33)} $$

Either s_m or σ_m is often called the standard error in (or of) the mean, or, in experimental sciences, simply the standard error.* As a measure of the reliability of the mean, it has more direct significance than has the standard deviation because it includes more vigorously the effect of the number of measurements n.

In one theoretical derivation of the expression for σ_m (see Eq. 3-26) the approximation is made that the hypothetical distribution of the N means, with N very large, is almost a normal distribution irrespective of the shape of the particular parent distribution of which the Nn measurements are samples. This approximation is very good even when the parent distribution is significantly different from normal, and it improves as the parent distribution comes closer to being normal.

* Unfortunately, many investigators use the term "standard error" as synonymous with "standard deviation" without specifying the particular random variable involved. The phrase "standard deviation in the mean" is awkward; and if the ambiguity in "standard error" persists, a new term should be agreed upon or else the qualifying phrases "in the measurements" and/or "in the mean" respectively must be added.
One consequence of this approximation is that the chance that any one sample mean differs from μ by less than ±σ_m is about 0.683, since this numerical value is a characteristic of the normal distribution (see Table 4-5).

The fractional standard deviation in the mean, or the fractional standard error, is

$$\text{fractional } s_{m} = \frac{s_{m}}{m} = \frac{s}{m\sqrt{n}}, \qquad \text{fractional } \sigma_{m} = \frac{\sigma_{m}}{\mu} = \frac{\sigma}{\mu\sqrt{n}} \qquad (2\text{-}34)$$

Equations 2-31, 2-33, and 2-34 are formulas that should be in every scientist's working knowledge of statistics.
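A small illustration of Eqs. 2-31 and 2-34 may be useful; the measurements below are invented for the sketch (they are not the data of Table 2-1).

```python
import math

x = [31.8, 31.9, 32.0, 31.9, 31.7, 32.1, 31.9, 31.8]   # hypothetical measurements, mm
n = len(x)
m = sum(x) / n                                           # arithmetic mean
s = math.sqrt(sum((xi - m)**2 for xi in x) / n)          # sample standard deviation (divisor n)
s_m = s / math.sqrt(n)                                   # standard error, Eq. 2-31
print(m, s, s_m)
print(s_m / m)                                           # fractional standard error, Eq. 2-34
```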
Skewness. The first and second moments about the origin and about the mean have been discussed above. Moments about the mean are especially useful as dispersion indices. And it was pointed out that the higher the moment about the mean the greater is the relative weighting of the large deviations. None of the dispersion indices discussed so far gives a measure of the asymmetry of the distribution. Now we introduce the dispersion index called the coefficient of skewness, which is defined in terms of the third moment about the mean. This coefficient is

$$\text{skewness (experimental)} = \frac{\sum_{i=1}^{n}(x_{i}-m)^{3}}{n\,s^{3}} \qquad (2\text{-}35)$$

and

$$\text{skewness (universe)} = \frac{\sum_{i=1}^{n}(x_{i}-\mu)^{3}}{n\,\sigma^{3}} \qquad (2\text{-}36)$$

(For a continuous universe distribution, the summation should be replaced by an integral, etc., as in Eq. 2-21.) The factor s³ (or σ³) in the denominator in Eq. 2-36 makes skewness a dimensionless quantity independent of the scale used. The coefficient of skewness of the measurements in Table 2-2 is +0.05, but the coefficient of skewness of the distribution in Fig. 2-4(e) is about +21. Positive skewness means that more than half of the deviations are on the left (negative) side of the mean but that the majority of the large deviations are on the right (positive) side.

Because the skewness is so sensitive to large deviations, its numerical value varies rather widely as n increases, and a reasonably stable value is not generally realized until n is rather large. This sensitivity restricts its practical use, but when its experimental value can be relied upon it is a very powerful aid in the fitting of a model distribution. This is particularly true since it is so sensitive in the tail regions where the fit is generally most difficult to check, and since it does allow a comparison of asymmetry.
The expression for experimental skewness in terms of the working mean is

$$\text{skewness (experimental)} = \frac{\sum_{i=1}^{n}(x_{i}-w)^{3} - 3\Delta\sum_{i=1}^{n}(x_{i}-w)^{2} + 2n\Delta^{3}}{n\,s^{3}} \qquad (2\text{-}37)$$

where w and Δ are defined in Eqs. 2-4 and 2-14. The binomial model distribution has a third central moment, i.e., about the mean, given by

$$\sum_{k=0}^{n}(k-\mu)^{3}B(k;n,p) = np(1-p)(1-2p) \qquad (2\text{-}38)$$

which can be easily proved by an extension of the argument used in deriving Eq. 2-28. The binomial skewness is

$$\text{skewness (binomial)} = \frac{np(1-p)(1-2p)}{[np(1-p)]^{3/2}} \qquad (2\text{-}39)$$

$$= \frac{1-2p}{[np(1-p)]^{1/2}} \qquad (2\text{-}40)$$

Equation 2-38 or 2-39 shows that the binomial distribution is symmetrical only in case p = ½, as was mentioned in Chapter 1.
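A brief numerical check of Eqs. 2-35 and 2-39 (an illustrative sketch only, with arbitrarily chosen parameters): the experimental coefficient of skewness of a large random sample drawn from a binomial distribution should approach (1 − 2p)/[np(1 − p)]^{1/2}.

```python
import random, math

nb, p = 10, 0.2                      # binomial parameters, chosen arbitrarily
N = 200_000                          # number of trial "measurements"
ks = [sum(random.random() < p for _ in range(nb)) for _ in range(N)]

m = sum(ks) / N
s = math.sqrt(sum((k - m)**2 for k in ks) / N)
skew = sum((k - m)**3 for k in ks) / (N * s**3)          # Eq. 2-35

print(skew)                                              # experimental value
print((1 - 2*p) / math.sqrt(nb * p * (1 - p)))           # Eq. 2-39/2-40, about +0.47
```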
Other dispersion indices. The fourth central moment, divided by s⁴, is called the coefficient of peakedness or of kurtosis, and is written as

$$\text{peakedness (experimental)} = \frac{\sum_{i=1}^{n}(x_{i}-m)^{4}}{n\,s^{4}} \qquad (2\text{-}41)$$

and

$$\text{peakedness (universe)} = \frac{\sum_{i=1}^{n}(x_{i}-\mu)^{4}}{n\,\sigma^{4}} \qquad (2\text{-}42)$$

(Again, if Eq. 2-42 is to apply to a continuous universe distribution the summation should be replaced by an integral.) The peakedness, like skewness, is dimensionless and is independent of the scale. The fourth moment, even more so than the third, is restricted in its usefulness with actual measurements unless a very large number of trials have been made. If n is not large, the value of the peakedness is numerically unreliable because of its high sensitivity to fluctuations in the tail regions of the distribution.
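The following sketch (not part of the text) computes the experimental peakedness of Eq. 2-41 for a sample drawn from a normal universe; for large n it should be close to 3, and the combination index introduced in the next paragraph should then be close to zero.

```python
import random, math

N = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(N)]      # sample from a normal universe

m = sum(xs) / N
s = math.sqrt(sum((x - m)**2 for x in xs) / N)
skew = sum((x - m)**3 for x in xs) / (N * s**3)      # Eq. 2-35
peak = sum((x - m)**4 for x in xs) / (N * s**4)      # Eq. 2-41

print(peak)                                          # close to 3 for a normal distribution
print(3 * skew**2 - 2 * peak + 6)                    # combination index, close to zero
```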
Combinations of precision indices are sometimes useful. One such combination is

$$3(\text{skewness})^{2} - 2(\text{peakedness}) + 6$$

which is zero for a normal distribution, positive for a binomial or a Poisson distribution, and negative for distributions that are extremely peaked.

Additional indices of dispersion can be defined and evaluated but their practical significance is generally not very great. An exception may be the universe standard deviation in the sample standard deviation, viz., σ_s, which, if σ can be approximated by s, is useful as a guide in determining the number of significant figures with which to express the standard deviation s. This index may be written in general form as

$$\sigma_{s} \approx \left(\frac{\mu_{4}-\sigma^{4}}{4n\sigma^{2}}\right)^{1/2} \qquad (2\text{-}43)$$

where μ₄ is the fourth moment about the universe mean. In special cases, Eq. 2-43 simplifies: for a normal distribution

$$\sigma_{s} \approx \frac{\sigma}{\sqrt{2n}} \qquad (2\text{-}44)$$

and for a Poisson distribution

$$\sigma_{s} \approx \frac{1}{2\sigma}\left(\frac{\sigma^{2}+2\sigma^{4}}{n}\right)^{1/2}$$

where the effect of the particular shape of the distribution can be seen. It is evident that μ₄ for the normal distribution is 3σ⁴ and for the Poisson σ² + 3σ⁴ (expressed in terms of σ).

As an example of the use of Eq. 2-44 for a normal or approximately normal case, suppose that n = 8; then 1/√(2n) = 0.25, and, therefore, not more than two significant figures should be used in expressing the standard deviation s, and most likely only one.

It has been mentioned that the probable error is a popular precision index in scientific work, and it may be worthwhile to write the expression for the probable error in the probable error, viz.,

$$pe_{pe} = 0.675\,\frac{pe}{\sqrt{2n}} = 0.48\,\frac{pe}{\sqrt{n}} \qquad (2\text{-}45)$$

where the numerical coefficient applies for a normal distribution and is generally a little different for any other distribution (e.g., for a Poisson distribution see Table 5-2).
comment,
all
the central
moments
are useful
when we
are
Frequency Distributions and Precision Indices
97
dealing with mathematical or theoretical distributions,
i.e.,
with distribu-
whose shapes can be expressed exactly. For example, this is the case with the distributions (e) and (/) in Fig. 2-4. However, for some very interesting distributions the central moments do not exist unless the tails tions
of the distributions are arbitrarily cut
Cauchy
this is the case for the so-called
off;
distribution, viz.,
/(*)
= 7r[l -f-~7 + (* ~ yU)vn]
(2 " 46)
2
This expression appears in theoretical physics rather frequently, classical dispersion in physical optics, in the shapes of lines, etc.
In actual experimental work, distributions whose
as x2 or less rapidly than x2 (the square ,
moments
diverge), are not
As
Conclusions.
is
e.g., in
atomic spectral tails
drop off
the limiting rate for which the
uncommon.
we would
stated earlier, the information
to have in the description of a set of measurements
is
really like
the complete mathe-
matical equation for the distribution histogram or frequency curve. But, in real
life,
distribution
is
with a limited number n of actual measurements, the
at best defined with
some
finite
obtaining the exact mathematical formulation
vagueness. is
Then, since
we must Each index
impractical,
content ourselves with one or more of the precision indices.
has
its
own advantages and
disadvantages in providing the information
we
want.
The standard deviation, i.e., the square root of the second central moment, is the index in widest use and the one on which most statistical theory is based. The probable error, the evaluation of which first requires is rather widely used among experiThere are three excellent reasons for the popularity of
evaluation of the standard deviation,
mental
scientists.
the standard deviation: reliable; (2) the rules
(1) it is statistically efficient
and experimentally
of operation in the propagation of errors as based on
the standard deviation (rules discussed in the next chapter) are conveniently
simple; and (3) the tively unreliable.
moments higher than
the second are usually quantita-
Occasionally, a measurement situation
is
encountered
which the larger deviations are suspect for nonstatistical reasons, and then even the second moment overweights the larger deviations and the mean deviation is used as the dispersion index. The mean deviation, howin
ever,
is statistically less efficient
The general
than
is
the standard deviation.
and precision indices discussed in of measurements having any type of
definitions, concepts,
the present chapter apply to sets
frequency distribution.
We
shall continue with the treatment of empirical
data for another chapter, introducing a theory, before taking
little more advanced up the mathematical models.
statistical
2-12. Problems
Note: A numerical answer to a problem is not a complete answer; the student must justify the application of the equation(s) he uses by giving an analysis of the problem, pointing out how the problem meets satisfactorily the conditions on which the equation is based.

1. From the measurements of Table 2-1, determine
(a) the range, (ans. 31.7 to 32.3 mm)
(b) the median, (ans. 31.9 mm)
(c) the middle 2 quartiles, (ans. 31.8 to 31.9 mm; 31.9 to 32.0 mm)
(d) the mode, (ans. 31.9 mm)
(e) the arithmetic mean with and without the device of the working mean, (ans. 31.92 mm)
(f) the mean deviation, (ans. 0.11 mm)
(g) the fractional mean deviation, (ans. 0.0036 or 0.36%)
(h) the standard deviation with and without the device of the working mean, (ans. 0.15 mm)
(i) the fractional standard deviation, (ans. 0.0047 or 0.47%)
(j) the standard error, and (ans. 0.045 mm)
(k) the skewness with the working mean. (ans. 1.5)
Express each answer with the proper number of significant figures and indicate the proper dimensions.

2.
Make
the histogram for the following frequency distribution
x
=
5
83.304
Why do we prefer (i) the second central moment rather than the fourth central moment in describing experimental dispersion, (ii) the arithmetic mean rather than the rms mean as the best location value, and (iii) in some cases the median rather than the mean as a location value?

13. Discuss the relative advantages and disadvantages of the mean deviation vs. the standard deviation as a dispersion index in 8 measurements that approximate a normal distribution except that they are
(a) consistently somewhat higher in the tail regions,
(b) inconsistently higher in the tail regions owing to inconstant systematic errors, or
(c) made with a rather high "background" (as in the intensity measurements of a component line in emission spectra or in a nuclear cross-section measurement).
"The true
world
logic of this
is
the calculus
of probabilities."
JAMES CLERK There
numerus:
"Defendit
is
MAXWELL safety
in
numbers."
AUTHOR UNKNOWN
3
"Maturity
capacity
to
endure un-
JOHN
Statistics of in
the
is
certainty."
FINLEY
Measurements
Functional Relationships:
Maximum
Likelihood,
Propagation of Errors,
Consistency Tests,
Curve
The
sole
Fitting, Etc.
purpose of a measurement in an experimental science
is
to
influence a hypothesis.
Science progresses, as stated earlier, by the constant repetition of three steps: (1) conception of some aspect of nature, i.e., a hypothesis, based on all experience available, (2) calculation of the a priori probabilities for certain events based on this hypothesis, and (3) actual measurements, i.e., comparison of the a priori probabilities with actual probabilities. The comparison yields either a confirmation of the hypothesis or a rational basis for modifying it. The intrinsic difficulty with a hypothesis is that it is perforce formulated with only a limited amount of experience, and the intrinsic difficulty with measurements is that they are unavoidably veiled in errors. Each particular property of nature that we would define, and each particular quantity of nature that we would measure, are at best only probabilities. Our task as scientists is to increase the probability that our current hypotheses are correct, i.e.,
Elementary but fundamental examples of the function of measurements improving scientific hypotheses were pointed out in Sections 1-4 and 1-5.
These examples arose in the discussion of conditional probability and of the reliability of inferred knowledge. In Chapter 1, however, there was no ambiguity as to the outcome of each experiment; the veil of errors enshrouding each measurement was not involved. line
of argument begun in Chapter
with emphasis
1
Let us continue the
now on
the measure-
mental uncertainties. First,
we continue
this
argument with very general
specific line of
we introduce
equations for a page or two, and then in Section 3-1
maximum
powerful and useful method of the veil of errors.
some
the
likelihood for dealing with
These discussions are intended to give the reader though upon first read-
insight into or feeling for the subject even
ing he
may
in these
not understand every step in detail.
If
he has
much
trouble
pages he should not be discouraged at this time but should pro-
ceed in cursory fashion on to Section
Consider
first
3-2.
a discrete set of possible hypotheses; in particular, con-
two hypotheses A and B.
Suppose that pin(A) and pm(B) are the A and B, respectively, are correct. pm(A) + pin(B) = 1. Suppose that an experiment is performed •, x •, x Suppose also and a set of measurements xx x 2 it n is obtained. sider
initial
or a priori probabilities that hypotheses
•
,
that
pA (x,) is
•
•
,
•
the probability, on the assumption that
A
is
correct, that the
would be observed, and that /^(x ) is the probaThen, as the bility, if B is correct, that the set x would be observed. consequence of the experiment, the a priori probabilities p-m{A) and p m {B) that applied before the experiment are now modified; they become particular set of values x
i
t
Pmod(A)
=
(J-1)
Pin(A)p A (x n
l
These expressions were written "/?
heads
in
)
+
p in (B)p Ii (xi )
Pin(B)p l{ (x,)
P\n(A)p (x
"head" or
t
in
l
)
+
Chapter
a row" appeared
p in (B)p Ji (x I
i
where the
in the
)
single observation
case of the penny-toss
in place of x if and where "the sun rose" or "the sun rose n row" appeared in place of xt Insofar as the experiment yielding x is well chosen and well performed, our confidence in A (or B) is increased at the expense of our confidence in B{or A), the increase in confidence being
experiment times
in
a
.
i
proportional to, say, the difference /j„io.iM) — p\n(A). Of course, this value /?mo.iM) becomes p\ n {A) for the next experiment to be performed, etc.
of Measurements
Statistics It
in
Functional Relationships
often happens that the possible hypotheses
than a discrete
set rather
As mentioned
set.
in
make up
Chapter
1,
the problem of the sun's rising as involving a continuous
a continuous
Laplace viewed set,
whereas the
analogy to the coin-tossing experiment presumed a discrete In the continuous case, the experiments serve to sort the numerical
argument set.
103
in
values of each of the/Jmod's into a continuous distribution, and presumably this distribution
has in
a single
it
maximum which
article
of faith in experimental science.) The
corresponds to the
(This presumption
hypothesis deserving our greatest confidence.
maximum becomes
is
an
increas-
good additional experiments are performed and interpreted. Whether the set of possible hypotheses is discrete or continuous, we are confronted in the evaluation of each/? m0 d with the problem of what to do with the variety of x values yielded by the experiment. The calculation of pA (z ) requires that the hypothesis A be expressible in some analytic functional form. Sometimes the hypothesis refers to the functional form itself, and sometimes it involves the property being measured as either a ingly sharp as
i
t
In any case, suppose that the
variable or a parameter in the function.
functional form
observed
is
is
cj>(A,
Then
xt).
the probability
equal to the product of the n factors
a single experimental observation.
pjm) =
trials
are
all
II (A,
x)
is
t
independent (see Eq.
and
A and
B,
is
x { ), each factor being
n #a *d
(3-3)
=1
written with the assumption that the n 1-5).
A similar expression may be written To compare the
for each of the other possible hypotheses. different hypotheses
that set x {
pA (x^)
Thus,
1
where the product
cf>(A,
we may
write the ratio
reliability
from Eqs.
of two
3-1, 3-2,
3-3,
Equation 3-4
is
likelihood ratio,
pA (x t)
is
PmodM)
p in (A)
Pmoa{B)
pm(B)
U(f>(A,
x^
U<j>{B,
x)
(3-4) •
t
recognized as the "betting odds" and
and
Ucf>(A,
ar ) 2
is
a normalized probability as
known all
often called the
is
as the likelihood function
proper probabilities
cal evaluation of the likelihood function
is
straightforward
are].
if
[if
Numeri-
the functional
form of each hypothesis is known. And, as mentioned in Chapter 1, if we have no a priori knowledge favoring A or B, we usually resort to the desperation-in-ignorance guess that pm(A) = pin(B). 3-1.
Method
of
Maximum
Likelihood
As an example of the use of the likelihood ratio, Eq. we wish, from n independent trial measurements of x,
3-4,
suppose that
to find the
most
Probability and Experimental Errors
104
in
Science
g of a true parameter y in a known matheAssume that there is only one parameter x n) °f tne up some function g = g{%x, #2
likely estimate (or estimator)
matical functional form
We
to be determined.
y).
(f>(x;
set
>
'
'
'>
values of x from which the estimate
g is to be deduced. There are several methods for setting up such g functions, and each method gives a different degree of goodness of estimate ofg. The statisticians rate these methods in terms of their relative efficiencies. As stated in the discussion of the mean deviation in Section 2-11, the relative effitrial
ciency
is
defined as follows.
If
N sets of samples each
of size n are taken
from the parent population, N different values of g are obtained. These N values of g themselves form a distribution, and let us say that the standard deviation of this g distribution is noted. This process is repeated methods for estimating g, and the standard deviation ob-
for each of the
tained with the respective different
methods
deviations.
is
Of many
g
most
The
noted.
is
relative efficiency
method having
possible methods, that
standard deviation has said to be the
method
its
the smallest
values clustered most closely together and
g
Also, with any method,
efficient.
if
the
mean of
distribution for a sample size TV tends to a value different
estimate
is
of two
taken as the inverse ratio of the squares of the standard
said to be biased.
If the estimate
g converges
from
is
the
y, the
to the true value
y as N —* co, the estimate is said to be consistent, i.e., free of bias as the sample size increases without limit. (An example of bias is mentioned presently.) For scientific work, it is generally agreed that a good estimate must have zero (or at most small) bias as well as reasonably high efficiency.
For most parametric estimation problems, the method of estimation as the method of maximum likelihood is the most efficient, and, if n is large, the estimate is usually satisfactorily consistent. The likelihood
known
function, the product of
L(x x x2 ,
,
•,
all
n values of
x n y) ;
=
{x x
;
>(:r
y), is written
t ;
y)(x 2
;
y)
•
•
•
0(x„; y)
(3-5)
Especially in the case of a discrete population, certain values of x i are
observed with a frequency/, which
is
greater than unity.
actual frequency/, appears as an exponent total
number of
factors
is
r
with
r
<
on the factor
In this event, the <£(#,; y),
and the
n.
which there is a continuum of possible The relative is a continuous variable. probability of any two different values of g is given by the likelihood ratio, Eq. 3-4, in which the likelihood functions are of the form given in Eq. 3-5 with one value ofg in place of y for the numerator of the likelihood ratio and with the other value of g in place of y for the denominator. The ratio Consider the general case
values for g,
i.e.,
p-m{A)lp m {B),
if
in
a parameter that
nothing
is
otherwise
known about
to unity as the desperation-in-ignorance guess.
it,
may
We
be taken as equal imagine each of
N
Statistics
of Measurements
possible values of g, viz.,
and each of the
Functional Relationships
in
g lt g2
•
,
L
•
•,
gp
•
•
gN
•,
,
105
inserted in the
L
function,
computed. These TV values of Lj form a distribution which, as TV— oo, can be shownf to approach a normal distribution and whose mean value at the maximum of the distribution of
TV values
;
corresponds to the desired estimate g. To find the value of L j that makes
L a maximum, we differentiate L with respect to y and set the derivative equal to zero. Since Lisa maximum when log L is a maximum, we may use the logarithmic form when it is more sum than with
convenient to deal with a / aiogiA
=<>=£/< |- log #*<;$)
L
a Oy dv
\
,
ly=g
and we seek a solution of
Then,
a product.
(3-6)
og
=l
i
this expression for g.
This value for
g
is
the
of y (but as we shall see it is not always an unbiased estimate). Solution of Eq. 3-6 is often explicit and easy without multiple
most
likely estimate
roots;
in case of multiple roots the
The procedure can be there
is
most
significant root
generalized to treat
is
chosen.
more parameters than one;
one likelihood equation for each parameter.
The maximum
method^ is generally considered to be about approach to the majority of measuremental problems
likelihood
the best statistical
encountered in experimental science. This method uses all of the experimental information in the most direct and efficient fashion possible to
unambiguous
give an
estimate.
functional relationship must be
p
in
by a binomial
distribution,
and
let
that "success"
is
observed
likelihood function
is,
w
us find the
'-=
more
known
to be fitted
maximum
likelihood
Call/?* this estimate of/7.
times and "failure" n
from Eq.
that the
the above example
of n measurements that are
estimate of the success probability/?.
is
or assumed.
To make
the binomial distribution.
specific let us consider a set
principal disadvantage
Its
known
—
w
Suppose
times, so that the
1-20,
Ijpo
-
h
p)
Then,
— logL op t See, e.g.,
/p=p*
=0 = --p*
H. Cramer, Mathematical Methods of
1
—
(3-8)
p*
Statistics (Princeton University
Press, Princeton, 1946). % This method was used in special applications as early as 1880 by Gauss, but was developed for general applications by Fisher in 1912 to 1935. See, e.g., R. A. Fisher,
Statistical
Methods for Research Workers (Oliver and Boyd, Edinburgh,
1950), 11th ed.
Probability and Experimental Errors in Science
106
and the most
likely estimate
of the true value of p
is
=~
P*
(3-9)
n
which
is
the experimental probability as defined in Eq. 1-39.
This method also leads to a
and a
fi
Vnpq;
Problem 27
see
To
the normal distribution.
in
maximum
of
=
in Section 3-11.
method and the standard
illustrate further the
mean
likelihood, let us estimate the
fi
deviation a of a normal distribution from a sample set of size the estimates
m
and
n.
place of \/hV2, Eq. 4-8, the likelihood function
is
L=n^=exp(-^) 2a i
= i a^J27T
Following the procedure of the
Call
In this case, using Eq. 1-25 with a in
s respectively.
maximum
likelihood
method
first
0=|^i
=
(AlogL)
(3-.0)
1
\
for
ju,
(3 .U)
from which
m=
-fx n =
which agrees with Eq.
To
(3-12)
i
i
i
2-1.
estimate a,
(-U^)
=0 = 2
f.ogL)
(3-13)
from which s
=
2
-
I
- f*f
(*,
0-14)
n t=i which,
if
t In the if
we
is
replace
/u
with our best estimate,
method of maximum
retain
estimate
As
we
u
it
likelihood,
v
s'
a
is
s'
2
=
——
"
1
2, (x =
-
not necessary to "replace" s'
=
2
"
1
n
—
; 1
V
(x,
—
fi
2-9.
with m;
mY, and
this
>-,
s' is
not an unbiased estimate of
t
~m
2 )
'
s
an unbiased estimate of a 2
Vo = 2
- HM/i -
the parent distribution
is
normal.
—
We
,
1
a.
Rather, the unbiased estimate of
given by a/ v i"
if
m, agrees with Eq.
unbiased. See Eq. 3-98 and the attending discussion.
a fine point, although
=
is
can be shown that the estimate
i
2
it
viz.,
777-; r(j»)
l)]
s
shall generally ignore this fine point.
Statistics
of Measurements
Functional Relationships
in
In the case of the mean, the estimate
estimate of /.i; this
But
in the case
is
of the standard deviation, s
the fraction it is
V(n
—
it is
always
in this case, correction for
as n
—*
size
even though
fi,
in
an unbiased and consistent mean of any distribution.
a biased estimate of
is
than a;
less
it is
l)/n as stated in Eq. 2-22.
a consistent estimate since
5 is
is
true for the estimate of the
a negative bias because
as
m
107
it
When
the bias
is
s
has
less
by
known,
can be made. However, the estimate
it
converges on the true value asymptotically
oo (but s does not converge
N ->
a.
on the average
on a
if
we take
N samples each of small
oo).
Suppose that the variable k is known mean /u unknown. The m, can be obtained from the likelihood function
the Poisson distribution.
to have a Poisson distribution, Eq. 1-26, with the
estimate of
by
//,
viz.,
differentiating with respect to
[x
and
setting the derivative equal to zero,
=0=i4^-l)
flogl)
(3-16)
and solving for m, 1
-Ik f = -Xx n i=i n
™=
i
i
i
(3-17)
i=i
in
agreement with Eq.
Can mean is
2-1.
the statement that the
this
be interpreted in any sense to confirm
statistically the best location
value for an
asymmetric distribution ? Instrumental parameter.
method of maximum
As
a final example in this section of the
likelihood, suppose that
/ is
the time interval between
counts in a Geiger counter measurement of the intensity of cosmic rays,
and suppose that the frequency function for
where 6
is
=
6e-
t is
et
of the form (3-18)
some unknown instrumental parameter whose value we wish
Suppose that a sample set of n measurements of / has been made. The maximum likelihood estimate O e is given by the procedure of setting up the likelihood function
to determine.
L=0"exp -02'* i
=l
(3-19)
Probability and Experimental Errors in Science
108
and of finding the value of
=
(!£)
e
(Incidentally, in this
0, viz.,
°
6 e , that
makes L a maximum,
= K^-^H-^X, +
n)
(3-20)
=^-
(3-21)
example the estimate 6 e
is
mean
the reciprocal of the
of the recorded times.) Precision in the
maximum
lihood function L(g) in the set
As
vs.
g
of measurements xi that
graph of the
like-
pertinent to the possible values ofg.
is
discussed above, the estimate g* of the true value y corresponds in
the graph to the
maximum
depends upon the
details
may be As
A
likelihood estimate.
gives all the experimental information contained
value of L;
the precision in this estimate
of the spread of L(g) about g*.
This precision
stated as, say, the standard deviation of the L(g) distribution.
N—* oo, the L(g) distribution becomes normal in shape, and the standard with N limited and rather easily deduced but in real
deviation
is
life,
;
may be
usually rather small, L(g)
problem of finding the precision approximation, however,
it
may
For an assumed normal L(g)
greatly different
reliably
is
from normal, and the
more complicated. As a
be treated as though distribution,
we may
it
first
were normal.
write the spread of
L(g) about the best estimate g* as
L(g) oc e
where the standard deviation
4-8.
=
^
L= — h\g —
g*f
Then, log
(3-22)
is
"•
from Eq.
- h2(9 - 9 * )2
(3 " 23)
+
constant
2 ^logL=-2fc 2
(3-24)
dg
Combining Eqs. 3-23 and
3-24,
we
a °=
/
find the following very convenient
expression,
Standard error a m
.
~T2
1
As an example of
\* (3-25)
the use of Eq. 3-25,
we may
point out that differentiating the logarithmic form of Eq. 3-10 twice with respect to
\x
allows us to write immediately the expression for the standard
Statistics
of Measurements
Functional Relationships
in
error (standard deviation in the
mean) for the normal
g
distribution, viz.,
=^
*m-l-rr\ In this example,
of the
refers to a possible value
(3-26)
mean
is
an exact expression even
if
sample set of can be shown
in a
n measurements from a normal parent population, but that Eq. 3-26
109
it
the parent distribution
is
not
normal. In
many
L cannot
by trying several
successive value of
g
of g and by using Eq. 3-5 with each
different values
These several values of L(g) If it is normal,
written in place of y.
and the general shape of L(g) sketched in. the same everywhere if it is not normal but
are then plotted 2
(d /dg
ent
be determined analy-
In such cases the distribution L(g) can be approximated numeri-
tically.
cally
actual problems, (d 2 /dg 2 ) log
2 )
log
L is
;
from normal, the average value
The method of maximum
2
(d /dg
2
likelihood,
precision of the estimate so obtained,
is
)
log
and
L may be its
is
not far differ-
extension to give the
applicable in
problems in which the functional relationship
is
used in Eq. 3-25.
known.
many
We
statistical
shall
have
occasion to use this method in later discussions.
We
have in the above discussion referred several times to the normal
distribution
and
to
details; we shall continue to do so For a proper background, the reader is the first part of Chapter 4.
some of
its
throughout the present chapter. well advised to read at least 3-2.
Propagation of Errors
In Chapter
maximum
2,
and
in
most of the
specific
examples of the method of
likelihood of Section 3-1, the statistics of direct measurements
was considered. But interest in precision in experimental science extends, of course, beyond direct measurements. The best value (the mean) of each of several direct measurements is very often used in the computation of the value of another property. For example, velocity is derived from the direct measurements of a distance and of a time, a computation involving simple division. (A critical examination of the measurement of a distance or of a time shows it to be the really a difference between two location points in space or time observed "direct" fluctuations include the component fluctuations in each of the two location points.) Examples of derived properties from more
—
—products, quadratics, —are very easy to
complicated relationships metric functions,
etc.
exponentials, trigono-
find.
In this section, the rules are discussed for the determination of the precision or reliability of the
computed "measurement"
in
terms of the
Probability
110
measured property. This
precision of each directly
as
tfie
and Experimental Errors
Science
in
the subject
is
known
propagation, or compounding, of errors.
Suppose that the derived property u is related to the and y by the functional relation
directly
measured
properties x
= /(*,
"
where the bar
= f(x
ut
y),
it
mean
u
yt),
= fix,
(3-27)
y)
and where the function may be combination of assumed to be regular as regards continuity and
signifies the
value,
additive, multiplicative, exponential or otherwise, or a
The function
these.
is
derivability. First,
we must
decide whether the measurements of x are entirely
independent of those of
many
In
y.
answer
cases, the
is
For
obvious.
example, the distance and time measurements for velocity are independent.
The measurements of the two sides of a rectangular table for its area, if in each measurement the observer uses a ruler having a wrong calibration and/or makes the same systematic parallax error, are dependent to the extent that both contain the same systematic error. Such measurements and their errors are also said to be partially correlated (i.e., as discussed later, they have a correlation coefficient between and ±1). In this case, and in most or all actual cases, the independence is only partial the errors are of both the random and the systematic types. Both types are obviously present, for example, in the measurement of the area of the table when the parallax errors are random but the calibration error is
—
systematic.
In
many
actual problems, the functional relation involves parameters
that are not independent of each other.
involve two or
more of
For example, the
the "fundamental constants"
relation
(e.g.,
may
electronic
charge, electronic mass, velocity of light, Avogadro's number, Planck
Boltzmann constant, Faraday constant, Rydberg constant, Bohr magneton, fine structure constant, etc.), and the errors in these parameters from previous experiments propagate along with those in x
constant,
in the present experiment.
Nonindependent errors: dependent error
is
that
its
systematic errors.
systematic errors cause
i.e.,
otherwise
partially correlated as stated above.
propagate to yield the error
=
Au
in u
all
in u.
the component The symbol A
direct in
characteristic of a
independent deviations to be
(See Section 2-7.)
Dependent errors
according to the relation
— Ax + — A*/+ ••• dx
for
A
algebraic sign generally tends to be the same;
(3-28)
By
measurements
x, y,
-
•
that
may
Eq. 3-28, in contrast with d which
be involved is
used for
of Measurements
Statistics
in
Functional Relationships
supposedly independent or random errors,
we
are
now
III
intended to indicate that
is
dealing with a clearly recognized systematic type of error.
In practice the dependent errors are not clearly recognized;
usually inextricably mixed in with the
random
they are
and both types of
errors,
make up the observed frequency distribution of x or y Sometimes one can make a test for the presence of dependent
errors together
or
•
\
The
errors.
test uses the basic expression for correlation (discussed later).
This expression, for n measurements of each variable, with the variables
taken a pair at a time, say x and
y, is
i(te ay,)«0
(3-29)
(
4
=
where dx €
—
xt
dy {
x,
=
=1
yi
independent, each term in the and, as «
—»- oo,
sum
the
is
—
sum
is
not very large, the
may
nearly zero, and the deviations dx it dy t interpretation
Of course,
becomes a
if
will
it
not be detected by
sum may be
be independent;
still
the
statistical one.
a systematic error
a dependent error,
are
as likely to be positive as negative,
If n is
zero.
sum
deviations in the
If the
y.
cause a
is
present in x and/or
shift in the
this correlation test.
the inconsistency of different
mean we
Later,
means by which,
y, etc.,
x or y or discuss
in
some
•
•
is
not
and
will
but •,
some
tests for
instances, the
presence of those systematic errors that depend on the variable or on time
may
Usually, however, small systematic errors can only be
be detected.
suspected,
When
and the magnitude of each component only estimated.
dealing with systematic errors in general, the assumption that the
algebraic signs of
This
justified.
ponents
is
is
the
all
component
errors are the
especially so if the total
greater than, say, three.
If the
same
is
not
statistically
number of independent comnumber of components is suffi-
ciently large to justify treating the individual systematic errors as having a
own, with a meaningful mean and standard deviation, then these errors may be treated as though they were random. But if this is not the case, we are generally left with no satissignificant frequency distribution of their
factory procedure for reckoning the propagation of systematic errors.
Strenuous efforts should be large systematic error.
tinguishable from the in the
mean
made
to recognize
and to eliminate any
Residual systematic errors are perforce indis-
random
errors, but, if their net effect
can be estimated
value, this estimate should be stated as well as, say, the ob-
served standard deviation.
Random
errors.
treated as random.
values of x and n" W; is
Suppose that
all
errors are independent
In the general case, in using Eq. 3-27, trial
values of y.
It is
most
and may be
we have
likely that ri 7^ ri
not computed from an associated pair of xi and y t
.
ri trial
and
that
So we imagine an
Probability and Experimental Errors in Science
112
and n values of y such that each of the n computed from a corresponding imagined pair x{ yt The pairing may be done at random. The imagined xi or y is equivalent to the actual measured values of x or y if the frequency distribution of the imagined set is the same as of the actual set. With this equivalence, we choose n, the number of "trial" values of u, as large as we please. The argument is easily extended to include more than two independent equivalent set of n values of x
values of u
is
.
,
t
{
t
variables.
Next we assume that all the deviations 6x = xi — x and by Then, we define the deviation Su as t
are relatively small.
=
{
y
t
—
y
t
S Ui
=
Ui
- u w — dx + — dy t
ox
(3-30)
f
oy
when
which follows from the definition of the
partial differential
for small increments. Or, better, Eq. 3-30
may be obtained from the Taylor
written
expansion from elementary calculus; thus ui
=f([x
+
= /(*,
V)
+
dx l [y i
+
-^ & x ox
i
dyj)
+
-^ oy
fyi
and du t which,
if
— dy
= u -u = — 6x + f
t
ox
(3-31)
t
oy
the higher order terms are neglected,
is
the
same
as Eq. 3-30.
=
we have
here taken u a.sf(x, y) rather than as u (2?=i M i)/ w but these two definitions of u are essentially the same //all deviations are
Note
that
small.*
Note
that the partial derivatives are taken at x
=
x,
»
y
=
y,
hence
are constants. *
For example, suppose
we can show
=
u(x)
that this relation x, 2
=
is
x2
not true in general that x-
It is
.
true in the limit as dx t -> 0.
+ dx y =
(x
+
x2
t
2x dx
t
+ 6xi
We
=
z2
.
write
2
Then, by definition of the mean square, n
1
x2
i
<5x,
2
term.
But
it
i
=l
follows from the definition of the 1
"
y _ X2
as 6x t -* 0.
(5x,>
.*".
=\
-
Hence
2
*
t
n neglecting the
n
1
y x = -n y (x + 2x ~.
=-
<5x,
=
=
X2
o
mean
that
However,
of Measurements
Statistics
in
Functional Relationships
113
Equation 3-30 or 3-31 indicates the procedure by which individual deviations propagate to give an individual deviation in
We
u.
learn
from these equations that the effect in u of a deviation in x (or in y, etc.) is multiplied by dujdx (or by du\dy, etc.), hence that, if x (or y, etc.) appears in u with an exponent much greater than unity, the effect may be rather large. In planning an experiment and apportioning relative effort in the measurement of each of the various component quantities, it behooves us to note the exponents on the respective variables. As an example, consider the relation for the acceleration due to gravity in terms of the length / and the period P of a simple pendulum, viz., 2
A
/
for which
dg ^
=
d6 gi
<5/.
+
fractional effect
P2
dP
dl
The
^ dPi = — dg
dl {
-
~P
6P
t
3
due to the component fractional deviations dgi
= dk_
g
I
2
SP1
P
where we write / for / and P for P since the differentiation these values. Note the factor —2 in the last term.
We
are generally
when each
error
is
more
these errors
Mean (and
M
.
—
u
fractional
J independent
deviation, as a standard deviation, is
not useful in
\u,
ta =
+ dy<
—
this case
because
signs.
mean) deviation. The equations that govern and fractional mean deviations are
variables,
fractional
carried out at
deviations
du
=
=z„
mean
-\dx for
mean
Equation 3-31
do not have meaningful algebraic
the propagation of
is
interested in the rules of propagation of errors
expressed as a
or as a probable error.
is
u\
r J
idu
2 If
(3-32)
and
=
fractional z„
wa M V" M \
= \-j
= x\OX i]/ )
2
xK
(3-33)
Equation 3-33 usually simplifies to a very convenient expression because
The basic
of cancellation. squares,
is
deviations,
and
that the x, y,
same
type.
relation, the square root
of the
sum of
the
derived presently for the case of propagation of standard
•
this relation applies for •
•,
and u frequency
mean
deviations to the extent
distributions are all of essentially the
In this event, the simple numerical constant relating the
mean
deviation and the standard deviation cancels out. This numerical constant
Probability and Experimental Errors in Science
114
depends, of course, on the particular shape of the distribution; 0.80 for a normal distribution,
As was pointed out
Chapter
in
z
i.e.,
however, as mentioned in Chapter
some measurement
On
computational labor. standard deviation
the
2,
2,
it is
about
not as
statis-
0.80s.
mean
deviation
may
give too
situations,
is
The standard
standard deviation.
tically efficient as is the
deviations in
($t
and
much
deviation,
weight to large
also entails a
little
more
the basis of statistical arguments alone, the
preferable.
is
Standard (and fractional standard) deviation. square of the standard deviation in
u,
written s u
,
By
the
definition,
is
IW
2
_ =
2
s,r
t
=l
(3-34)
Squaring the binomial'in Eq. 3-30 gives
Then, placing
o
As
this expression in
Eq. 3-34, we have
+ 2 £ ^ + (f*)W \ax/ ox oy sum
(W
+
ox oy
\oxi
(fs) \oy/
as likely
. 2
it
(3-35)
a? and y are comany particular product dx dy { to be positive as negative. Then, since
n increases, the
2(<5z t 6y { ) goes to zero if
t
pletely independent (uncorrected) because is
W
2(<^.)
t
2
and
t
s
2
=
%r
follows that
[(duV
2
. (duY
J (3-36)
Note that n does not appear in Eq. 3-36 and, as may be expected, s u does not depend upon either the actual or the imaginary number of x's or the actual or the imaginary number of t/'s. Generalization of the derivation of Eq. 3-36 to cases where more than two variables are involved is obvious. The general expression for J independent variables
may s„
be written
=
(3-37) i
=
l\OX j /
Statistics It
of Measurements
Functional Relationships
in
IIS
from Eq. 3-37 and from arguments leading to Eq. 3-45 that
also follows
the standard deviation in the
mean
w.
or the standard error,
is
(3-38)
where
ss is the
standard deviation
in the
mean
value of they'th component
property.
The
fractional standard deviations in u,
j
=— =
fractional s u
I L
and then
in
it,
are written as
du
dx
(3-39)
ii
c-
=— =
fractional s«
(3-40)
ii
and are usually expressed tion,
in per cent.
As with
the fractional
mean
devia-
Eq. 3-39 or 3-40 should not be used unless the zero of each of the
x j and the u scales
is
Equations 3-39 and 3-40
physically significant.
generally simplify to very convenient expressions.
Equations 3-36 through 3-40 are the really fundamental equations in and are well worth memorizing.
the subject of propagation of errors It
was mentioned in Chapter 2
that scientists frequently use the "probable
For any given
error," pe, as a precision index. error, discussed in detail in
to the standard deviation.
pe
«a 0.6755;
different.
Chapters 4 and
distribution, the probable 5, is
linearly proportional
For a normal or near-normal
distribution,
for any other distribution the constant of proportionality
But irrespective of the numerical value of the constant,
it
is
follows
that -
Pe
j
/
5u
-VA
\2
u= I\T-)pe
2 ;
Xi
(3-41)
and
//all the
fractional pe u
=
fractional
=
pe,-,
(3-42)
u and the xi distributions are of the same type so that the constant
of proportionality cancels out.
Probability and Experimental Errors in Science
116
Sum
or difference.
Referring to Eq. 3-27, u
then
Bu and, by Eq. 3-37,
=
x
±
y
±
•
let
of Measurements
Statistics
in Functional Relationships
and, by Eq. 3-37, 5U
= VflV (a_
W
+ bvy<
b
-i
117
v
(3-48)
Because the partial differentiation in the Taylor expansion is carried out x and y, the values of x and y in this expression are taken as
at the values
the
mean
pe u for
values x and y. Likewise, the probable error
pex for
su ,
sx ,
pe y for
sy
,
and
may be written with
also the standard error s a
and the
meanpe a may be written with the appropriate changes assuming, of course, that whence is used the proportionality
probable error in the in notation,
constant cancels out.
The expression probable error)
is
for the fractional standard deviation (or fractional especially convenient,
fractional s u
=
fractional s a
=
—
b*s
+
f-
2
f
^+ ^x
\
2
y
(3-49)
2
and fractional
(^L + ^^,V!
^ = (I
fractional pe a
y
= [tj^L + x2
\
if,
&WY y
2
2
(3 . 50 )
I
in the formulas involving pe, all the frequency distributions are of the
same
type.
It is
obvious, since a and b are squared in Eqs. 3-49 and 3-50,
that the sign of a or b in Eq. 3-47
may
be either positive or negative
and Eqs. 3-49 and 3-50 are unaltered. Thus we have treated the case of either a product or a quotient.
As a
special case of Eq. 3-47, consider
u
where
A
is
su
Note
a constant. With Eq. 3-48,
=
xA4
that
aV (a_1 V = Aax
We
case in which
with
2
we have assumed
independent.
it
is
2
a
(3-51)
we
not
write
-\;
fractional s u
1
,
this
assumption
satisfied is the following one.
we
find su
=
^
(3-52)
components are
that the errors in all the
Eq. 3-43 we obtain su
.
=
x
must be careful that
= x From A = 2 and a =
where x x
= Axa
= V2sx
2s x which ,
is
.
is
satisfied.
Let u
=x + l
A x2
,
But from Eq. 3-52,
different
from the
vious value by the factor
pre-
V2. The latter calculation is the correct one; the former improperly presumes 6xx to be independent of 6x whereas in 2 ,
Probability and Experimental Errors in Science
118
same quantity. Inclusion of the correlation term
fact they are the
in
Eq.
3-35 corrects for the dependence and brings the results into agreement.
Other functions.
If the functional relation
u
we
=
write du/dx
x
(3-53)
B\x\ then, with Eq. 3-37, the standard deviation in u
= (~ s x
su
= B In
that of a logarithm,
is
2
\x 2
Y= ^ X
=A
u is
hot near nj2.
whose
first
terms
in the
It is
sin
(3-54)
xu
/
As another example, consider a trigonometric
where x
^
=
fractional s u
;
is
relation,
x
(3-55)
important to realize that for any function
derivative in the region of interest
very small, the higher
is
Taylor expansion cannot be neglected.
In the present trigo-
nometric example, su
=
As x cos
x;
fractional s u
=
sx
cot x
(3-56)
In expressing s x note that the units should be radians, not degrees, since ,
the trigonometric functions
3-3. It
and the derivatives are
in radians.
Means
Different
often happens that in a series of direct measurements the individual
values are not
than others
values should be given
determination of the mean; otherwise the
And
the "best" value.
means
Some
equally good.
all
in the
often
we
more weight mean is not
are interested in comparing two different
for consistency.
Weighted mean. weighted mean m"'
If
measurement x
w
assigned a weight
is
t
the
written as
is
n
mw
__
H' 3' 1
1
+
WX
W 2 X2
+
W2
+
+ +
W
;
X,.
+
5* •
•
+
•
W„-r„
N»
hwH
K'j is
s^
35?
,
=
l
i
w
t
is
replaced by
1/s,
2 ,
a relation
from the principle of least squares (if all the the grand weighted mean of N different means is
presently to follow
errors are small). desired,
'
of each measurement x known, the weight
properly expressed in terms of st ;
shown
'
yw i
With the standard deviation
X
\V
,T 1
=
we may
Or,
if
write the weight of the /th
weighted grand mean
=
xw
mean
=
as
^ 1
=
1
l/.s>
2 .
Then,
(3-58)
Statistics
of Measurements
To show
that
of least squares values of xt
h>,
in Functional Relationships
should be replaced by 1/s? according to the principle
(if all
the errors are small), consider for simplicity only
_ W& + w Wj + Write u = m w and, a;
+ wx 1 + w
xx
__
2
vv 2
=
w
w 2 jw 1
.
\
We
=
0;
This argument is
we
is
z
with Eq. 3-37,
a+ (1
w)2 J
proceed to solve for the particular value of w,
dsmwfdw
two
Then,
.
2
where
119
call it
wmln
for which
,
find
easily generalized to
inversely proportional to s?,
wlSl 2
=
w2 s22
show
that each of
many
weights
w
t
i.e.,
=
w3s32
=
•
•
w
•,
t
oc
—
(3-59)
2
s,-
The value of (/?e,-) 2 can
alternatively be used in place of sf.
One immediate consequence of Eq. 3-57 or Eq. 3-58, whether for the weighted mean of a simple set of n measurements or for the weighted grand mean of different component means, is that, if one or more of the standard deviations is much smaller than the others, the less precise values of xi or of x i can be neglected. This
is
a feature of great interest in
designing experiments in which several different direct measurements are
involved to yield a computed result. large deviations, x t
Note
—
x,
(This does not say, however, that
can be neglected;
see Section 4-6.)
from placing dsm w/dw equal to zero and this is equivalent to finding the value of w, for which the sum
that Eq. 3-59 followed
solving for
w
;
-iw^-m") n =
2
i
is
a minimum.
Thus, the weighting 1/5/
weighted
least squares.
principle
is
or
if all
i
Strictly speaking,
is
based on the principle of
we should point out
valid only if the frequency distribution in question
that this
is
normal
the deviations are sufficiently small that the higher powers in the
Taylor expansion are negligible. It is instructive to apply the method of maximum likelihood to the case of the normal distribution. The likelihood function is written as in Eq.
and Experimental Errors
Probability
120
now
3-10 except that different for
the standard deviation
each measurement xt
Science
considered as being possibly
is
Thus,
.
(-^-
2
L= l
1
ex H P
= a j2TT &aJ2n i
i
in
\
iy
2cr 2a.
)
2
/
and
=0=|^-— m
logL)
2
from which
m« =
2*W i±l
t
=
(3-60)
l
Comparison of Eq. 3-60 with Eq. 3-57 shows that for the normal case the weight
oc
h'j
I/a,-
The
.
may now be
Equation 3-45 ing.
2
reliability (in
ments, each of unit weight, (the
mean) having weight s
2
derived directly from an argument of weight-
terms of the standard deviation) with n measureis
n.
the
same as the
reliability
of one measurement
Thus,
= nsj
or
s
m
= -= v»
This derivation the condition
Weighted
wJCLw^
is
really equivalent to the
was made
one leading to Eq. 3-45
that all deviations
dispersion
indices.
It
is
must be
in
which
small.
more or
less
obvious
that
equivalent to the relative frequency, fjn, in the expression for the mean, Eq. 2-2. Also, with this change in notation, the weighted is
standard deviation
is
written
= / Swfo - m«ry = if
/ sw,fo
- mn
2
y
(3 _ 61)
Sw^ = n. pe w is given by the product of the appro-
the weights are so chosen that
The weighted probable
error
w the appropriate constant being 0.675 in the case
priate constant times sx
,
of a normal distribution, as stated
earlier.
Note that by allowing different weights to be assigned to individual measurements x we distort the condition of random trials, and if a t
,
weighted location index or precision index such.
However,
different
is
used
it
should be stated as
means may be weighted with impunity.
Consistency of two means:
the
existence of a systematic error in a
t
test.
mean
One important
value
is
test for the
afforded by comparison
Statistics
of Measurements
with another
mean
Functional Relationships
in
111
value determined on a different day, or by a different
This comparison
observer, or with modified or different apparatus.
is
an
involved statistical problem because each mean, even in the absence of systematic errors, possible means.
is
only a sample of the infinite parent population of
The two samples are expected
to disagree to a certain
extent simply because they are merely samples;
decide whether or not the observed disagreement basis of
and the problem is to "reasonable" on the
is
random sampling.
The simplest is that two means may be considered consistent (no "evidence" for a significant systematic error in either mean) if they agree within about the sum of their standard deviations. The criterion for "agreement" is entirely arbitrary; some conservative investigators take twice the sum of the standard Several criteria for consistency have been proposed.
deviations.
A if
better criterion for consistency
—
the difference between them, x x
in the difference.
two means are consistent
that the
is
x2
,
is less
This standard deviation
is
than the standard deviation
given with the aid of Eq. 3-43
as s
= ^5 ^ 2 +
(*i-* 2 )
2
s*
(3-62)
2
In case the two means have equal precision, s {f _ £ is
>
= V2s£
.
(This test
further developed in Section 5-8 as a quantitative test to determine
whether or not a measured signal is "really" above the background noise.) An even better criterion for consistency is the following one. Assume that the two sets of measurements, n x of x 1 and « 2 of x 2 are consistent and ,
are pooled. Then, the best estimate of the standard deviation of the parent
population, of which the n x
sample,
+
n % measurements are a
compound random
is
where n x + n z — 2 is the number of degrees of freedom. parameter in this test, which is called the t test, we write t
= h^I?
In the special case for which n x t
On
n 1 =n z
our assumption that the
=
=
sets
l^h!h^\
As a working
A (3 .64)
n z Eq. 3-64 becomes ,
1
^""
2
l~\
V2 n
x x and x% are consistent, the value of
as different pairs of sample sets are considered,
is
t,
expected to fluctuate
for the reason that they are sample sets each of finite size.
If
an
infinite
Probability and Experimental Errors in Science
122 0.4
_4
_5 Fig. 3- 1
.
/
-2
distribution curves for different numbers of degrees of
the distribution
number of /
-3
is
pairs of
constitute a
t
sample
sets are
imagined, the corresponding values of
x4 and y are
the
if
t
parent distribution, viz., as
f{t)
where
oo
This distribution can be expressed in rather
distribution.
simple analytic form (not easily derived, however)
from a normal
=
freedom v. For v
normal.
c is a constant
=
/
c[l
,2\-[^(v + l)]
+
(3-65)
'-)
chosen to make the integral of/(/) equal unity, and
number of degrees of freedom. This t distribution, illustrated in and, for v < oo, Fig. 3-1, is symmetrical in shape about the mean t = Knowing distribution. normal the is relatively higher in the tails than is probability the can calculate we the analytic form of the t distribution, v
is
the
that the value of
range,
e.g.,
is
of the next sample pair of sets
outside a range set by the values
made of ±t c to bound
tion
/
±t c
will fall outside a specified
this
range
the calculated probability
is
is
arbitrary.
0.05,
i.e.,
This calcula-
in Fig. 3-1.
The
by integrating Eq. 3-65 over the set range.
Commonly,
t
c
specification
chosen so that
is
that out of 100 sample values of
/
only 5 on the average will fall outside the bounds of ±t c Note that the calculated probability is based on the assumption that x\ and x2 are consistent. If it turns out that the magnitude of the experimental value of t as deduced from Eq. 3-64 is larger than the magnitude of t c this fact does not prove that the two means are inconsistent but it argues rather strongly in favor of a suspicion that they are inconsistent. The argument is even .
,
Statistics of
stronger
Measurements
if t c is set
Inconsistency
at
any
in
Functional Relationships
limit less
than the
5%
123
limit, e.g., the
would be caused by the presence of any
1
%
limit.
significant
systematic error affecting the observed deviations differently in one set
of measurements than in the other Values of if
no
/
set.
that are exceeded only
significant
1
(or 5) in 100 times
on the average,
nonconstant systematic errors are present, are
Table 3-1 for several different numbers of degrees of freedom n x Table n1
+
«2
—
2
3-1.
Values of
t
c
in
the
t
Test,
I
%
and
5%
Limits
+
listed in
n2
—
2.
Probability and Experimental Errors in Science
124
like to pool them so as to increase the total number of measurements and thereby increase the reliability of the mean according to Eq. 3-45. The pooling of such sets of measurements requires that the sets be
we would
consistent not only in regard to their
Or,
standard deviations.
we may wish
means but
also in regard to their
to test the internal consistency of
the precision of two sets of measurements (rather than of merely the means)
recorded on different days, with different apparatus, or by different observers. Again, the standard deviations of the various sets or samples are expected to
differ
somewhat among themselves, even
because of the fact that they are merely samples.
they are consistent,
if
We
seek a
test,
proba-
of course, for consistency of standard deviations.
bilistic
In the
test for
/
consistency of two different means, as just described,
that both sample sets of measurements are from (from a normal population if Eq. 3-65 is used). the same population of the validity of this assumption is tested. We shall Then the probability
the assumption
made
is
assumption again, but this time use the so-called F ratio as the working parameter which is defined in terms of the standard deviations of
make the
this
two
sets.
means.)
(The
t
parameter was defined in terms of the difference
Incidentally,
strictly
F
speaking,
(or
/)
in the
a "statistic" not a
is
"parameter."
Suppose that s x and sx are the respective sample standard deviations Then, in the nx measurements of x 1 and in the n 2 measurements of x 2 ox and ax are the best estimates of the standard deviations of the parent .
populations.
The
F ratio
is
defined as "i
_
a
F = -=;2 -
"2
n2
method of the
in the
sets
This
is
c
=
in the
e.g.,
infinite
number of
^~
2)
pairs of
sample
constitute an
V2)]
vx
/?
x
1
F
/
test,
F of the
we
Fig. 3-2.
in Fig. 3-2 for a typical pair
x
.
n2
—
1
are the
of values of
\\
and
v2
.
can calculate with Eq. 3-68 the probability that the
next sample pair of sets will
This calculation
F and F2
(3-68)
shape of the /"distribution
fall
outside a specified range,
outside the range set by the arbitrary values
limits
F
+ ri F)- [!i(Vl + = — and v2 =
(v 2
numbers of degrees of freedom. The
asymmetric, as shown
value of
cF H
a constant and where
is
respective
As
*X 2 1
a continuous distribution
f(F)
is
an
—
2
whose analytic form, if the of measurements are from the same normal parent distribution, is
distribution.
where
test,
2
imagined, and the corresponding values of
sets are
two
/
5 *i
(3-67)
a *2
As
j
is
F and F2 x
indicated in
an integration of Eq. 3-68 with the particular
of Measurements
Statistics
Fig. 3-2.
F distribution
Now, however, define
F
in
Functional Relationships
curves for different pairs of numbers of degrees of freedom
since f{F)
is
not symmetric,
as being greater than unity,
i.e.,
values of
F
is
the arbitrary value 0.05,
If
it
i.e.,
is
is
chosen so
that out of 100 sample
F2
.
Note
that the
based on the assumption that ax and ax are
comon parent population. Fas deduced from Eq. 3-67 is
estimates of the true value a x of a
turns out that the experimental value of
larger than
F2
Suppose
i.e.,
only 5 on the average will be larger than
calculated probability consistent,
convenient further to
it is
to have the larger standard
deviation always in the numerator of Eq. 3-67. that the probability
I2S
F2 we may ,
say that, statistically, the standard deviations are
not consistent.
Table 3-2 limit are 5
lists
%, for
the limits for F, different
if
the chances that
it
will
numbers of degrees of freedom
not exceed the in
determining
Probability and Experimental Errors in Science
126 Table
"l(=«l -1) Denominator
(for
in
Eq. 3-67)
3-2.
Limits for F
in
the F Test,
5%
Level
of Measurements
Statistics
Functional Relationships
in
127
between the dependent and the the problem often justifies considerable effort in the analysis of the measurements. If there are K constants to be determined and if there are K pairs of measurements, then there are K simultaneous equations which may be solved for each constant. This is a so-called "exact" determination with no degrees of freedom for the evaluation of the precision. But if more than K pairs of measurements are available, the constants are said to be "overdetermined"; the errors in the measured values of the variables prevent an "exact" determination but do provide a basis for evaluating the in a specified functional relationship
The importance of
independent variables.
precision.
The usual procedure
is
to
make a graph of the measured quantities and we can. If we rely chiefly upon the eye
to "fit" a curve to the data as best
making this fit, there is a strong tendency to give undue weights to the end points. As a matter of fact, the end points are often the least reliable
in
because of experimental factors in the extremes of the range. By the method of least squares, however, we can give either equal or unequal weights, as desired, to the various points of the graph.
The method of least squares does not functional relationship;
it
does
constants appearing in the equation.
between two Best
Also,
different functional relations, as
of a straight
fit
us in a practical
tell
line.
Many
way
the best
us precisely the best values of the
tell
it
does allow us to choose
is
seen presently.
functional forms can be expressed
as a linear relation,
y
The
=
a
+
photoelectric equation cited above
bx is
(3-69)
an example of
this
The
form.
constant that relates the variations in electrical resistivity p with the temperature Tis given by the expression a (T/p) (dp/dT), and this expres-
=
= A a log T. The Cauchy equation for the refractive index of a substance is n = a + 6/A 2 which is seen to be linear when x is written for 1/A2 The exponential decay law, / e -/iX can be rephrased to be log / / jux. log I sion can be put in the linear
+
form log p
,
.
=
It
=
,
usually turns out that, of the
3-69,
we
are
more
—
two general constants a and b
interested in b than in a, but this
graph and a the intercept. Consider the graph of measured values of x and
is
in Eq.
not always
so.
b
gives the slope of the
and a
y,
such as in Fig.
3-3,
straight line
yQ
=
a
+
bx
such that the sum of the squares of the deviations from
minimum.
(3-70) it
shall
be a
In what direction should the deviations be reckoned ? Ideally,
Probability and Experimental Errors in Science
128
(a)
Fig. 3-3.
fitted
by a curve:
(a)
by a
straight line,
by a parabola.
(b)
only
if
(b)
Graphs of experimental points to be
random
and
errors are present in both x
the deviations should be
y,
reckoned perpendicular to the straight line. But the arithmetic involved in the determination of the constants a and b is rather formidable in this case and, in general, the result depends upon the choice of the scale of each correction for even this effect
coordinate axis;
The usual procedure
laborious.
is
is
possible but
very
is
to choose either the x or the y direction
for the deviations, recognizing that the price paid for the simpler arith-
metic
is
a sacrifice, usually negligibly small, in the accuracy of the best
of the
fit
The choice between
line.
the x
and the y direction
is
favor of that direction in which the larger standard deviation
made
is
in
found;
comparison in the same dimensions and units, s y In almost all cases in experimental science, x is taken as the independent variable whose values are selected with practically negligible error, and in these cases the deviations are reckoned along the
make
this
compared with
bs x
in order to is
y
.
axis.
We
shall
along the y
assume
in the following text that all the deviations are taken 2
axis, i.e., that b sx
the exact value of y,
viz.,
<5&
Graphically, dy t
+
is
y
=
.
Vi
2
<s y 2
Then, a
.
+
bx
i
Accordingly, a deviation
-
=
y
Vt
-
(a
+
the length of a vertical line
is
always taken as
is
written as
bx ) t
(3-71)
drawn between y and t
^
bx ) at the x abscissa position. (Remember that if 6x 0, by is not an observed deviation in y, i.e., yt — y; rather, 6y t may be greater or less than the observed deviation because 6y also includes the observed
(=
y
a
t
t
t
t
{
deviation in x t .)
Let us assume
initially that all
dy/s are equally weighted.
with the principle of least squares,
we
In accord
seek those values of a and b that
Statistics
of Measurements
make
sum of
the
in
Functional Relationships
the squares of the n deviations dy t a
|(W=i(%-a-K) 2
129
minimum. Thus,
2
=1
-
- 22 y, +
:
IbUxi
=
(3-72)
da
2E(*t y,)
=
(3-73)
from which
a
=
2x,-y,
«
fe
=
2av
2x 2 2y ?
7:
2a:i2(a;t yi)
y2x 2 i
-
zSfoy,.)
Probability and Experimental Errors in Science
130
must be noted that when a nonlinear relation is put in linear form by in one or both variables, e.g., an exponential equation to logarithmic form, etc., such a change always involves an alteration of scale which is equivalent to a gradual change of weights along the scale. Such a change It
a change
of weights, unless properly corrected
for,
always results in a different value
(An example is given as Problem 12 in Section 3-11.) The "best" values of a and b depend on the "best" form of the relationship between x and y. It is important to note that, in case the y values for each of a and
b.
x are normally distributed, the proper correction is made automatically, irrespective of the functional form, //weight I Is? is consistently assigned to the ith point; this is shown by the method of maximum likelihood (and is also part of Problem 12, Section 3-11). The standard deviations in the values for a and b may be deduced as follows. Draw the n straight lines through the n points x y and the common point x, y. This gives n values of a and n values of b These n values, however, are not equally weighted; the standard deviation in a and in 6 is proportional to l/(x — x), and thus the weight is proportional to (x t — x) 2 The standard deviation in a and b is determined with Eq. 3-37, for which J = 1, viz., t ,
t
t
t
.
t
t
t
.
t
sh
,
da
{
dy
(
s„,
=
s„
=
dbi
=
x,-
t
—
x
Then, with the weighting factors
— 1
oc
we can
and
-
write, using Eq. 3-61, if all the sv
wh 's
oc
.
are the
same
(all
y/s equally
weighted),
Xwai ( ai - a?Y*
(I Y,(xt
-
xf(a t
Zte
-
-*« - afj (3-78)
xf
and
=
ti
a,
(b i
<
t
(
t
or explicitly by a.
(3-79)
%»»
and b are t
bt
2
\n
where
-bn
= n xixt-xftbi-by y l.(x, — x) Swv \n 2j(#j given implicitly by y = a x + b and y =
zw
= X;
—
and X
and a and b by Eqs. 3-74 and
3-75.
b
t
=
Vi
-
y
ax t
t
+
b
t,
Statistics
of Measurements
Functional Relationships
in
Or, each standard deviation
may be
131
expressed in terms of the standard
deviation s u from equally weighted y values, i
= ( *» -
s,
-
'
"*->
T
(3-80)
sy
(3-81)
*
<3 - 82)
viz.,
s-
Sx 2
=
\^-V^rnXx - (2^.)
2
2
and 5
These expressions
may
U;-(W
-
'
^
be derived by using Eq. 3-37,
viz.,
and
Substitute a
and £ from Eqs. 3-74 and 3-75 and take
all
j
's
to be the
same
For example, in the b case, db
nx
_
dy
riLzf
t
— Ear, - (Lx^2 •
{
and »
y (?£f=
riLxf-XLxtf
i=i\dyj
from which Eq. 3-82 readily
follows.
The argument
give a y
n were replaced by n
is
essentially the
same
for Eq. 3-81.
Note
that Eq. 3-80
would
ber of degrees of freedom
if
—
2; the
num-
here two less than n because of the two arbit-
is
rary parameters, a and b, in the definition of each deviation.
The probable
error in a or b
by the constant 0.675
if
is
given by multiplying Eq. 3-81 or 3-82
the distribution of the weighted values of a or b
is
normal. Straight line through origin. the straight
line,
Eq. 3-69,
is
If the intercept a in the equation for
known
to be zero,
differentiate with respect to a, as in Eq. 3-72.
for b
is
it is
not meaningful to
In this case, the expression
obtained directly from Eq. 3-73 with a put equal to zero.
b
J***h Ea-,
2
or
bW= ?WWi ? Sw,x, ii
Then,
( 3.83)
The expression on
and Experimental Errors
Probability
132
Science
in
for the correlation coefficient, defined presently,
is
based
Eq. 3-83.
Best
of a parabola. The equation of a parabola
fit
y Again,
+
bx
ex
(3-84)
the number n of pairs of measurements xt y is greater than three, number of constants now to be determined, the method of least ,
squares gives the best
The argument that
+
a
if
the
i.e.,
=
all
is
2
the error
is
is
fit
the in
t
of the parabola.
same as for the case of the straight line. Assume yit and write the deviation of each point from the
best parabolic curve as fyi
The
=
y,
—
a
-
best values of the constants a, b,
vp*wn =
apw
bx
and 2
=
]
t
-
(3-85)
c are those for
which
3PW] =
db
da
ex?
dc
The algebra for the simultaneous solution of a, b, and c is straightforward.
these three equations for
Likewise, if the x ( y measurements are not equally weighted, and if the weights are known, the weighted values of a"\ b w and c w can be calculated. ,
t
,
The expressions for the weighted precision indices also follow. Perhaps the most important aspect of the method of least squares curve stants,
fitting is that
and
this part
it
in
allows determination of the precision of the con-
of the problem must not be neglected; the procedures
are similar to those detailed above in the discussion of the best
fit
of a
straight line. It
should be pointed out that calculations of
simplified
if
a, b,
and
c are greatly
the values of x are chosen to have equal interval spacings
Ax and
to be symmetrical with respect to the median, and if the x variable changed so that the zero of the scale is at the median value and is measured in Ax units. In this case, all summations of odd powers of the new variable x are equal to zero. Denoting this case with primes also on a, b, and c, we write* is
y r '2
y-r' 4
- (Z*/ 2 2 - (Zx/ 2 f nS*,' 4 -y^ n^ nLx^xf* - (Lx/ 2
hS*/ 4
See, e.g.,
>
riLzt*
)
t
c
l
2
nZx* -
(£z/ 2 ) 2
- (Z*/2)
,
g6) )'
v Y*x' 2 Vi '
G. C. Cox and M. Matuschak,
/.
^ Xi nZ*/ 4
'
- (Lx/ 2
2
Zis Vi
)
Phys. Chem., 45, 362 (1941)
of Measurements
Statistics
In these equations, x'
ifV
integers
=
(x t
-
is
in
Functional Relationships
133
measured from the median value and x values are if a' = 2(x - xmedian)/Aa;
* median)/A:r for n even and
t
for n odd.
The expression for a forms the basis of the least-squares parabolic method for the smoothing of data, as discussed later. Best
of a sine curve. The function
fit
=
y
a sin
(
-
b)
(3-87)
cannot be converted to the form of a straight line or to a power series in which a and b appear as coefficients of the power terms. The fitting of this type of function illustrates additional interesting features.
From let 6
=
Then,
a graph of the measurements y t fa, estimate a value e for b and e, which must be rather small. e, and A Let 6 \\a. (f> ,
—
b
sin
(cf>
—
=
—
b)
=
sin (6
—
d),
=
and, after expanding sin (0
—
d),
Eq.
3-87 becomes
Ay — Assume
all
the error in the
sin 6
z'th
+
=
point to be S(Ay t) and write the deviation as
KAyd = Ay — t
sin Bt
+
6 cos
0,
approximation to the best values of A and d, hence for a and b, that for which '^{Ay i — sin d + d cos 6 ) 2 is a minimum, i.e., for which
The
is
d cos 6
first
t
AlLyf v4E?/t cos
=
f
t
- E& sin
— 2 cos
+ <5E& cos 6 = = sin fy + (3D cos = 0. If the approximation Bt
t
2
i
2
and then d/dd given by the solution of these two simultaneous equations is not satisfactory, a better estimate of e is then found by making d smaller and repeating the
by writing d/dA
process.
Likewise, the weighted values a w and b w and the precision indices sa , sb sa w and sb w may be computed with the same type of argument that ,
,
was given for the straight-line case. The cases of the straight line, the parabola, and the illustrative
of the curve-fitting aspect of
sine curve are
statistical precision;
the reader
is
referred to the literature for the fitting of functional relationships that
cannot be expressed
in
one of these three forms.*
Criterion for choice of functional relation. least squares in curve fitting does not
method of
As tell
stated earlier, the
us directly the best
functional relation y =f(x), only the best values of the constants. it
can be used to *
For example,
New
York, 1959).
tell
which of two relations
is
But
the better. Thus, for example,
see F. S. Acton, Analysis of Straight-Line
Data (John Wiley
& Sons'
Probability
134
we may choose between a
and Experimental Errors
straight-line relation
in
Science
and a parabolic
relation,
or between a parabolic relation and a power series which has terms higher
than the second power. Let the two relations to be compared be a and
t
jS
be exact,
let
Then, calculate
Q.
=
^ -" n
The
having a number of
ft,
For each value of x which is assumed to value of be the or y computed from the respective relation. y, y^
constants c\ and c respectively.
relation
y* f
and
,
2 q,, **-*?
n-
ca
which has the smaller value of
Q. is the
c
(3-89)
p
one that better
the measurements.
Table
x
xl
y
3-3.
A
Differences Table A/y
A 2y
A 3#
A4
*/
fits
of Measurements
Statistics
in Functional Relationships
their differences), a function consisting
mation
is
135
of an «th order power series approxi-
required.
The scheme works for all functional forms that can be converted to a power series by a change of variable, * even though the highest power is unity as in the case of the straight line.f 3-5. Justification
Method from Maximum
of Least-Squares
Likelihood
We
have encountered so far several applications of the principle of least The first was met in the definition of the mean with the deviations
squares.
equally or unequally weighted (whichever applies) as the location index for which the
sum of
the principle
was implied (but not
statistical efficiency
the squares of the deviations
on the sum of the squares of the deviations t,
of the least-squares values of constants in the principle
was used
it
;
was
minimum
;
then,
high
also implied (but not
F, and, later, the
method was used
tency; the least squares
finally,
a
of the standard deviation as a dispersion index based
discussed) in the high efficiency of
and,
is
specifically discussed) in the
x
2
tests for consis-
explicitly in the
known
determination
functional relationships
in the criterion for
choosing the better
of two different functional forms for the representation of a ments. *
A
set of measurepopular method of data smoothing (mentioned presently) is
Note that a change of variable implies a change of relative weights, as was discussed fit of a straight line; see Problem 12, Section 3-11. Incidentally, if we ignore errors and precision altogether, the average of the
in connection with the best
t
"constant" values of the wth-order difference allows a determination of the value of the intercept in the
power
with a
a second-order power
series,
trivial bit
y
we may
Furthermore,
series relation.
may be approixmated
=
if
Ax
is
small, the other constants
of computation. For example,
a
+
bx
+
ex 2
Ax)
+
c(x
if
the function
is
write
y
where, since
Ax
is
+
Ay
=
Ay
= b Ax + c Ax +
a
+ b(x +
2
constant, a'
and
Ay
+
Ax
2cx
Ax) 2
=
a'
+ b'x
Repetition of this procedure gives
b' are constants.
+A
2
y
=
a'
+
b'(x
A
2
y
=
b'
Ax
=
+
Ax)
b"
Hence, knowing Ax, we can determine b' from the second-order difference "constant"; then, knowing b', the average \(x + x i+1 ), and Ay we can compute a'; then, from a' and b', we can compute b and c; then, knowing y and xu we can compute a. The difficulty with these approximations is that we must assume that a A.y, and a pair x it y { are exactly known. Of course, what appears to be the best values are chosen. As stated in the first sentence of this footnote, this procedure ignores errors and all questions of precision. t
t ,
{
Probability
136 also based
on
Science
in
And, errors propagate according to convenviz., based on the simple
this principle.
based on the sum of the squares,
ient rules
addition rule for the variances, in the
and Experimental Errors
if all
the errors are sufficiently small that,
Taylor expansion, the terms of powers higher than the
may
first
be
neglected.
method
Strictly, the least-squares
valid only in case the errors are
is
normally distributed, but the method
is frequently invoked for nearnormal distributions and indeed for distributions in general in which the errors are all small. The validity in the case of normal errors is easily
shown by
method of maximum
the
likelihood.
Suppose that we have a single independent variable x and that we are concerned with values of y given by
= f(z, a,
y
where
a,
Suppose
•
ft,
•
•
•
•)
•
ft,
whose values we wish to determine.
are fixed parameters
that the deviations are essentially all in y, viz., that fyi
=
- f( x )
Vi
was discussed in connection with Eq. 3-71, and are normally distributed. and of x the probability of a measurement For given values of a, ft, yielding a result between y and y + dy is given by as
•
•
•
f,
t
t
t
^e p(-<M
2
Pto,
The y n in
yi
+ ^) =
)
probability of observing y x in the interval dy±, y 2 in dy 2 dy n is the likelihood function
L-ft if
(3-9.)
X
• ,
<«.«,(- 2«2)
'
•
-,
and
(3-92)
the deviations are independent. It is
convenient to think of "error space" which
space in which the
z'th
coordinate
is
zi
=
is
an ^-dimensional
yja^ Then, Eq. 3-92 can be
written as
L
-
-^
exp
[- |S(^) 2 ]
dv
(3-93)
(2tt)
where dv is the elementary volume in error space. The relative likelihood of any set of errors or deviations in dv depends only on 2(&z,.) 2 i.e., on the square of the radius vector R in error space. It is immediately apparent •) that corresponds to the maximum value of that the value of a (or of ft, ,
•
L
is
This
is
•
which the sum of the squares of the deviations is a minimum. a proof of the validity of the principle of least squares for normally
that for
distributed errors.
Statistics
Note
of Measurements
that
it
^[(dy,-)
is
in
2
2 /^,-
Functional Relationships that
]
is
137
minimized where a?
to be
is
the
weighting factor to be associated with each y (Such weighting was discussed earlier for the special case of the weighted mean in connection t
.
with Eq. 3-60.) The derivation of the principle of least squares, therefore, requires that, in application of the principle, each value 6y t be properly
weighted.
In
many measurement
problems,
that
all
known about a
is
t
same for all i; in such cases all deviations are properly equally weighted and we write simply a for a If many different samples each of size n are taken from the infinite is
that
its
expected value
is
the
t
.
parent y population, many different values of £(<3z,) 2 (= R 2 ) are obtained. Analysis of the R 2 distribution gives further interesting information
regarding the principle of least squares.
We
can rewrite Eq. 3-93 in terms of R, with
L(R) dR
C
where the constant
= RdR;
and
sum of all
dR
=
dv, as
= Ce^^R"- dR
(3-94)
can be evaluated as follows. Let
let
R
L(u) du Since the
_1
1
=
u
then du
i?"
n~2
=
=
Ce-
u
IR
(2w)^
=1=
(3-95)
- 2)
(n
;
then,
2^ n ~ 2)u 1A{n - 2)
probabilities L(u)
L(«) du
2
du
must equal
C[|(n
-
2)]! 2
unity,
H(n " 2)
J
and therefore
C=
1
2^»-2)[i(„_2)]!
Then,
=
L(u) du
e
-u
v*(n-2)
du
(3 _ 96)
[Kn-2]! Equation 3-96 forms the basis of the % 2 distribution and of the x 2 test for the goodness of fit of a mathematical model frequency distribution to a set of actual measurements.
of the probability that the fall in
an
limits
of integration. This
Specifically,
sum of the
Eq. 3-96 allows calculation
squares of n observed deviations will
arbitrarily selected range, a range that is
is
by the choice of the
set
discussed further in Chapter
The derivation of Eq. 3-96 has not
4.
specifically included the effect of the
•)• If there are q a, /S, in the function y = f(x, a, /?, parameters whose values are to be estimated from the n measurements at hand, then proper account of the effect of these parameters in the mini-
parameters
mization of/? 2 n
—
•
[i.e.,
•
•
of
•
(<5z ) 2
q degrees of freedom
2
in
left
Eq. 3-93] reduces n to n
•
—
q;
i.e.,
there are
for the estimation of the characteristics of
138
and Experimental Errors
Probability
R2
the
may
(a, /?,•••
distribution,
Science
in
refer to functional constants, instru-
mental parameters, distribution characteristics such as y and a v The pertinent error space then has n — q dimensions.
R2
In regard to the likely value
of R
maximum
of
2 ,
viz.,
distribution,
R* 2
Bm
R*
likely value is
=
R* 2 was
#
'/
\dR
It
most from Eq. 3-94 by the method
interesting to note that the
Thus,
likelihood.f
and the m©st
it is
readily obtained
is
,
n
-
=
R* 2
or, generally,
1,
n
-q -
(3-97)
1
normal distribution of errors, that the best estimate 2 is [«/(« — \)]s Equation 3-97 for the case in which no additional parameters a, /?, are
asserted, for a
of the variance of the parent distribution reiterates this
.
•
n-l
R 2 = ±-l (dy f = -s 2 = t
Also, the
mean
=
By
R2 is
R2
=
n,
viz.,
,
R 2 may mean
is
=
\U
=
u
e
\n.
known specifically about
= n
The most
likely value
=
u,
R2 =
n
-q
(3-99)
the proper weight to be assigned
R 2 [=
,
2(<5y 1 /o I )
2 ]
tell
us
-q
R can also be obtained from L(u). To do this we note that V2u dR, and we must use the function V2uL(u), not L(u)
of
L(u)
d
log
value of
Hence,with Eq. 3-95,
or, generally,*
o?'av
R 2 = R**
-u H(n-2) du M
to each y t value, Eq. 3-99 and the definition of that the proper average weight is
L(R) dR = L(u) du alone. Then,
•
be obtained from either
,
definition of the
\UL(U) du
which, upon integration,
In case nothing
i
value of
Eq. 3-94 or Eq. 3-96.
U
•
Write
to be estimated.
t
etc.)
,
V2uL(u)\
= 0=J--\+l(n-2)\
from which
R" = %
The function
involved.
2«*
=
n
-
1
L(u), rather than 2uL(u), suffices here because
no
differentiation
is
Statistics
3-6.
of Measurements
in
Functional Relationships
139
Data Smoothing
Let us discuss briefly another problem in the treatment of experimental data, a problem closely related to curve fitting.
This
is
known
as data
Data smoothing always presumes some knowledge of the analytic form of the best-fitted curve y = f(x), although this presumption is not always recognized by investigators who try to smooth their data. It must be emphasized at the beginning of this discussion that, in general, we should not indulge in data smoothing unless we have a clear a priori assurance of the order of magnitude of at least the first and second derivsmoothing.
= f(x) throughout the region of Furthermore, because of the inherent arbitrariness in any datasmoothing operation, it is practically impossible to treat the measurements atives of the appropriate function y
interest.
statistically after
they have been smoothed, and any quantitative inter-
is likewise open to some question. mention two popular methods of data smoothing. In the first method, smoothing is accomplished graphically, a portion of the data at a time, by arbitrarily drawing a smooth curve "through" the experimental points. This method is improved if the first-order differences between adjacent x values are smoothed, and is further improved if second-order differences are smoothed. (First- and second-order differences have been
pretation of the results
We
shall
described earlier in connection with Table 3-3.) this
method
is
By
successive applications,
capable of very satisfactory results (except for the end points)
but the procedure repetition that this
is usually slow and inefficient. The fact deserves and any method of data smoothing require that the
=
unknown
relation y f(x) be a slowly varying function over each small portion of the curve being treated otherwise, any short-range real "struc;
would be undesirably smoothed out. second smoothing method is based on the principle of
ture" in the curve
A
least squares.
This method, for convenience, presumes that the values of x are chosen with equal interval spacings Ax, that the error
is
entirely in the y values,
and that each small portion, e.g., four adjacent intervals each of size Ax, of the unknown/(x) curve agrees reasonably well with a small portion of a parabola. Consider the five measured values y_ 2 */_ x y y+i, y+i cor,
responding to the values x
We
2Ax, x
—
Ax, x
,
,
smoothed value, to replace y x be replaced by x', defined as
wish to find
First, let
—
y,
the
x'
x
,
+
Ax, x Q
+
2Ax.
.
= ^-—^°
(3-100)
Ax as used also in Eq. 3-86, so that x
is
the central value
and the
unit of x'
Probability and Experimental Errors in Science
140
Ax.
is
If the
parabolic relation
fits
y
=
a
'
+
unknown we seek is
the
well over the range 4Ax, the value of y b'x'
+
c'x'
2
(3-101)
and, because of the shifted zero of the x scale to x is
just
The value of
a'.
relation satisfactorily
,
this value
of y we seek
given by Eq. 3-86, which for five points
a' is
becomes
=
y*
a
'
=
+
™\- ll yo
This method of smoothing
is
12 (2/+i
+
y-i)
-
3 (y +2
+
y- 2 )]
used by starting at one end of the range of
%iS and working systematically and consecutively through the values of x to the other end. The method does not do well at the ends, and the end t
regions should be repeated, working toward the respective end.
important that the
unknown
It is
also
function have no real "structure," of a
comparable to the range 4Ax, which should not be smoothed
size
out.*
Correlation
3-7.
So
we have considered relations y = f(x) in which x Additional variables are often involved, and if so
far in this chapter
the only variable.
is
we
say that y
Or,
we
The
to x.
is
correlated to x
say that y
is
and correlated
to each of the other variables.
functionally related stochastically rather than exactly
cases of correlation
most frequently encountered in practice known; in fact,
are those for which the other variables are not specifically their existence
is
suspected only because the observed fluctuations in y
for a given value of x are too great to be attributed to experimental error alone.
Examples of correlated properties are
(1) the heights
of mothers and of
adult daughters in the United States, (2) the grades of college students in
mathematics courses and
in physics courses, (3) the longevity
age 60 years and their weight
age 50 years, and
of
men
past
mass of the atomic nucleus and the number of nuclear fragments into which the nucleus breaks in fission by slow-neutron bombardment. Correlation coefficient.
at
(4) the
Let the graph of Fig. 3-4 of a set of student
grades in mathematics and physics represent two correlated properties.
We
shall define
This coefficient * It
is
an index or coefficient to express the degree of correlation. is
such that
possible, in this
component
in
operation
is
like a filter
others.)
would be zero
if
the points were distributed
method, to accentuate rather than to smooth a periodic
the experimental "jitter" of the measurements
or less than about 4A.r.
some
it
if
this
period
suspected, try a different x interval.
is
equal to
(The smoothing that attenuates waves of some frequencies and resonates with If this is
Statistics
of Measurements 1UU
in
Functional Relationships
141
Probability and Experimental Errors in Science
142
Assume first that all the points xu y are equally weighted. assumption the equation of the straight line is t
With
this
n
V'si
=
=
bx>
i^—
(3-102)
x\
Introduce the quantity
us
S,
which
is
a
(m^lf
(3 . 10 3)
measure of the dispersion of the measured values
to the least-squares fitted straight line.
of estimate.) The standard deviation
(S^-
is
y\
with respect
called the standard error
in the y\ values is
V = (^) The
correlation coefficient
(3-104)
defined in terms of S,/
r is
rs
and
s„-
as follows:
U--HJ
(3-105)
s~
\
y
From Eq. 3-105, the value of r may be anywhere in the range 1 to — 1 by convention, the negative values correspond to an inverse correlation, i.e., to a situation in which the general trend of the points is such that the slope of the straight line
For
=
y' gl
and
=
r
0,
Sy
=
-
0, that (1)
(2) the
s
y
-,
is
and
negative (see also Eq. 3-107 and Fig. 3-5).
this
is
possible only
if y'sl
=
0.
It
follows, if
the least-squares straight line, Eq. 3-102, has zero slope;
£a^
sum
is
zero, which, as discussed in Section 3-1,
the
is
condition encountered in the propagation of errors that the deviations
bx
t
(= x
The
i
—
x
=
Sy
x'{)
and by (= y t
i
—
y
=
y\)
be entirely independent.
increases, and this ratio is y sometimes called the alienation coefficient. Computation of the correlation coefficient r is generally easier with a modified form of Eq. 3-105 obtained as follows. From Eqs. 3-102 and ratio
js
.
decreases in magnitude as
r
3-103,
n 2 x'}
hence, noting also that sy
2
-( using Eq. 3-83.
=
(Zx'. 2
)/n,
¥f- ^J-=b
5 -£
(3-107)
Or, better, in terms of the weighted means xw and r
=
*"<* ~ *"»* ~
//
.
>•> (3-108)
Statistics
of Measurements
in Functional Relationships
143
Fig. 3-5. Scatter diagrams illustrating different correlation coefficients
and regression
lines.
Covariance.
The term covariance
is
commonly
samples.
It
may
observations
be written oxy
This
used.
is
a
which x and y are individual and the best evaluation of it from n actual
characteristic of the parent population of
t
i
is
n
The covariance may be divided by
-
1
n
-
(3-109) 1
the product of ax
.
and ay to give the -
best determination of the parent or universe correlation coefficient.
Probability and Experimental Errors in Science
144 It is
often convenient in statistics to speak of the experimental covariance
which
s
from Eq.
H=
n
= "A'
(3-HO)
3-107.
The
Interpretation.
help
—
given by
is
usefulness of the correlation coefficient
provides in answering such questions as:
it
mathematics
what grade may he expect
75,
is
such a question can be worked out simply
The answer to and b are known.
in physics?"
if r, y, x,
The equation of
the least-squares fitted straight line
This
M indicated
sy , is
(3-111)
V'si=bx' is
the line
given value of
Sy
The expected value of y
in Fig. 3-4.
readily obtained
for a
from Eq. 3-111. Then, the value of
computed from Eq. 3-105 as
is
'
is
a;
in the
is
"If a student's grade in
SvU and the answer
to the question
y
= s Jl y
is
r
2
(3-112)
simply
+ M«- ±
syU
(3-H3)
where the plus or minus value is the standard deviation in the expected physics grade. If the x and y' frequency distributions are normal, the i.e., the chances are is 0.6755^ 50-50 that the student's physics grade would be y + [y's i\ x ± 0.6755^. and mark the 50-50 Lines A and B in Fig. 3-4 are parallel to the line
probable error in the expected value
,
-
M
limits;
calculation of lines
distribution
is
A and B
independent of
also
presumes that the parent y
x.
In the question as posed, the student's grade in mathematics
To make x
=
70, sv
9 if
the calculated answer numerical, suppose that r >
=
and b
10,
+ M*'= = 5
the reliability
V
is
80
=
y/x.
Then,
+
f(75
-
70)
±
10Vl
-
(0.60)
2
=
0.60,
=
85.7
y
±
is
75.
=
80,
8
expressed as the standard deviation, and as
+ M*<= 5 =
85.7
±
0.675
x
8
=
85.7
±
5.4
and y' distributions are normal. (If they are not normal, the factor 0.675 must be changed appropriately; and if the distributions are asymmetrical, lines such as A and B in Fig. 3-4 do not have the same graphical significance as
if
the reliability
is
expressed as the probable error and
if
the x
for symmetrical distributions.)
Another useful feature of the correlation If r
=
60%
0.60,
we may
of the net
effect
coefficient
infer that the effect of the
of
all
is
x variable
the following.
may be about
the correlated variables including x.
However,
of Measurements
Statistics
Fig. 3-6.
distributions for
r
different values of
hood
an
maximum
145
two
likeli-
r*.
interpretation of this sort
the
Functional Relationships
in
common
effect in
x and y
is
very risky because of the fact that
may
of
all
be due to other variables.
Each of these physical interpretations of the correlation coefficient presumes that the unspecified as well as specified correlated variable have distributions that are
somewhere near normal.
Calculation of the standard deviation in r can be impractical procedure of recording
many sample
ments, computing r t for each sample
and then
set,
made by
the rather
of x t y measureand the mean value of r, viz., sets
,
t
manner with an equaform of Eq. 2-9. The distribution of all the sample values of r is essentially normal about the mean r if r is small, but the distribution is increasingly skewed as r approaches ±1, i.e., as x and y become strongly r,
calculating sr in the usual experimental
tion of the
t
correlated either positively or negatively.
No
This
is
illustrated in Fig. 3-6.
simple formula, therefore, can be given for the standard deviation in a
measurement of
single
r for
any value of
r.
In the above discussion of correlation, a straight line least-squares fitted curve y
which t
all
the points
For example,
in the state of
would
is
taken as the
=f(z), the so-called regression^ curve, lie if
on
the correlation coefficient were unity.
and liquor consumption have quite a positive correlation as studied over the past 90
tuition fees for students at Cornell University
New York
years. t The line M, or in general the curve y =/(*), is often called the regression line. This term comes originally from studies of the correlation of heights of sons and heights of fathers: It was found that sons' heights, compared with fathers' heights, tended to regress toward the symmetrical or mean line M. Regression is another word to express
the fact that additional uncontrolled variables are involved.
The viz., y,
The
slope of the regression line
regression curve discussed here
the error in x.
M,
viz.,
the constant b,
and the mean value of
y,
are sometimes called the regression coefficients.
is
in y
;
is
based on the assumption that x is exact and all is obtained if error is allowed
a slightly different regression line or curve
Probability and Experimental Errors in Science
146
often happens that the straight line
It
but that a curve of a higher power
not the most reasonable curve,
is
or whatnot,
series,
is
example, the points on a graph of "distance in feet to stop"
random
miles per hour" for a
better. vs.
For
"speed in
selection of 100 automobiles are better
by a parabola by the argument attending Eq. 3-89, and it is preferable to use the squared relation rather than to change the variable to make the relation linear.* (This point is treated in Problem 29 of Section 3-11.) In this nonlinear event, the quantity Sy is, as before, the measure of the y spread about the curve in general, and a correlation coefficient (often called the correlation index p to distinguish it from r for a straight line), But the magnitude of p is different from r, and is defined by Eq. 3-105. fitted
-
the physical interpretation of the numerical value of correlation in the
nonlinear case
is
more complicated than
in the linear case.
It is,
of course,
important in speaking of correlation to specify the particular least-squares regression line involved.
Inefficient Statistics
3-8.
Throughout
this
book, emphasis
placed on the procedures that use
is
of the experimental information;
all
referred to as efficient statistics.
of
it is
often convenient to use
The quick-and-easy formulas them almost no attempt to use
desired feature of the measurements.
inefficient statistics frankly
much
However,
obtain quickly at least an order of magnitude of
inefficient statistics to
some
such procedures are often loosely
have
in
of the imformation actually available in the measurements.
In this
quick-and-easy interest, their discussion may logically have appeared earlier in this book, somewhere in the latter part of Chapter 2. But the
understanding of the concepts of efficiency considerable degree of
vs.
statistical sophistication.
inefficiency involves
We
a
have in the present
chapter at least touched upon the types of arguments basic to this under-
However,
standing.
rehearsed: (1) the
Suffice
in this brief section, these
to say that in each case
it
numerical efficiency of a location value
arguments
mentioned is
from a parent normal *
Each case of
this sort
parabolic relation e.g.,
is
and
(2)
now be belowf
relative to the efficiency
of the mean, and the efficiency of a dispersion index the standard deviation,
not
tersely
will
is
relative to that of
the measurements are presumed to be
distribution.
must be judged on
its
own
merits.
In the example cited, the
preferable because of the role of the other correlated variables,
variables such as the weight of the car, the size
time of different drivers,
and type of brakes and
tires,
reaction
etc.
W. Dixon and
F. Massey, Jr., Introduction to Statistical Analysis(McGrawYork, 1951), for more complete discussion of inefficient statistics. + For an asymmetrical parent distribution, the quantitative use of these particular inefficient statistics depends of course upon the degree of asymmetry. t See, e.g.,
Hill
Book
Co.,
New
of Measurements
Statistics
Functional Relationships
in
147
The median has been described as being 64% effimeasurements whose parent distribution is normal.
Location index.
cient in a large set of
36% of the information contained in the data is ignored if the median taken as an estimate of the value of the mean. But, when the number n
Thus, is
measurements is very large, this inefficiency may not be serious. median increases as n becomes smaller; for example, the efficiency is 69% for /? = 5, and 74% for n = 3. In general, however, the median is more efficient as a location value than is the mid-range which approaches 0% for very large n, and is 54% of
trial
The
efficiency of the
for n
~
10,
77%
for n
=
5,
92
%
=
for n
3,
two of the large number n of measurements are to be averaged to give an estimate of the mean, again if the parent distribution is approximately normal, the best two measurements for this purpose are those at the points 29% and 71 % along the range. This estimate of the mean is 81 % efficient. The 25% and 75% points are practically as good and are easier to remember. If three measurements are used, the average of those at the 20%, 50%, and 80% points in the range give an 88 % efficiency in estimating the mean value, and these three are also about the best three. When n is only about 7 to 10, the average of the third measurement from each end of the range gives an estimate of the mean with an efficiency of about 84%. If only
Dispersion indices.
normal
that, in a is
88% efficient. When the number
mean
deviation, as a dispersion index,
only
population is
Attention has already been called to the fact
distribution, the
n of measurements
is
very large and the parent
approximately normally distributed, the standard deviation
is
estimated with about 65 s
_
%
from
efficiency
(93%) point
- (7%
point) (3 114)
3
or with about
80%
(97%
_
efficiency
point
from
+ 85% point - 15% point - 3% point
These respective points are the best
if
(3-115)
only two or four points are to be
used.
When
n
is
small, 2 to 20 or so
the standard deviation
is
from a near-normal parent
distribution,
estimated simply from the range as s
^
—
range =-
V"
,
.
(3-116)
Probability and Experimental Errors in Science
148
with an efficiency that
As n
increases,
3.5 for n //
=
100.
=
falls
we should
The
efficiency,
Standard deviation
in
the simplest estimation
use, instead
=
n
15, 3.7 for
99%
from 20, 4.
1
for n
=
3 to about
85%
at n
=
10.
Vn in Eq. 3-116, the following: = 30, 4.5 for n = 50, and 5.0 for
of
for n
however, progressively decreases as n increases. the
mean (standard
error).
When
n
is
small,
is
s„~^
(3-117,
n
where, again, for n
y
numerical values given
Examples.
10,
n should be replaced with the respective
in the previous
As examples with
paragraph.
the measurements listed in Table 2-2,
the inefficient estimates of the mean, of the standard deviation,
and of the
standard error are compared with the
3-4.
efficient values in
Table
Table 3-4. Comparison of Inefficient Estimates with the Efficient Values from Measurements of Table 2-2
Statistics
of Measurements
in
Functional Relationships
149
the need of a quantitative analysis of precision as free as possible
ambiguity. This need
may come
of the precision of his
in the stating
from
own
measurements or in the interpretation of another's. This need is more frequently and more acutely felt as the pertinent facet of science progresses and the residual uncertainties in the pertinent scientific facts and concepts
become smaller. The continuing tests of a generalization or of a theory depend more and more upon small differences. After systematic errors have been eliminated as far as possible, the remaining inherent ambiguity in precision and in its interpretation is reduced to a
minimum
only by a statistical argument.
The design of an experiment, statistics, refers
as the phrase
is
used in the subject of
generally to the process of determining a priori the
most
economical sample size or sizes to give a specified precision, where economical refers to both time and financial cost. It usually does not involve any changes in the apparatus or in the general measurement operations; but it does specify the order of taking measurements and the groupings of measurements. One purpose of the order or grouping has to
do with
the possible detection of sources of constant or systematic error
measurements.
Design features that are obviously pertinent are broad general terms:* (1) The number of measurements must be sufficiently large to give the necessary number of degrees of freedom for the determination of the in the
easily stated in
desired precision. (2) If subsets
of measurements (including controls or background) are
involved, they should be of approximately equal size
a
way
and grouped
in
such
as to reveal inconstant effects (such as a zero drift in the apparatus).
(3) If
measurements or subsets are of unequal precisions, each should
be weighted inversely as the square of the standard deviation (or, alternatively, if the distributions allow, as the square of the probable error). (4)
Before subsets recorded at different times, with different apparatus,
or by different observers, are pooled to form a grand
set,
the subsets
should be tested for consistency. (5) If
one or more measurements or subsets has a high precision,
the measurements or subsets having low precision s as a consequence of weighting
if,
say, s
t
5*
4s h
t
may
sh ,
be neglected
.
measurements to be used in computing the value of a derived property should have precision roughly in accord with the relative magnitude of the final propagated error in the computed value it serves little purpose to spend time and effort improving the (6) All
component
direct
—
W. G. Cochran and G. M. Cox, Experimental York, 1950), and R. A. Fisher, The Design of Experiments (Oliver and Boyd, Edinburgh, 1949), 5th ed. *
For
fuller discussions, see, e.g.,
Designs (John Wiley
&
Sons,
New
Probability and Experimental Errors in Science
ISO precision of one
component
if
the precision of the derived
measurement
is
heavily dominated by another component.
And,
(7)
the fluctuations in the direct measurements, or in the
finally,
must be as nearly random as possible; this often on the part of the experimenter in order to avoid systematic errors and bias. subsets of measurements,
involves a rather rigid discipline
Summary
3-10.
The
topics discussed in
many
parts of this chapter
and
in
Chapter 2
serve the pedagogical purpose of emphasizing the difference between
the properties of a sample set of measurements and those of the parent
For example, the mean and
or universe frequency distribution.
cision are at best related stochastically to the corresponding
precision of the parent distribution. reliability
This
is
essentially the
its
pre-
mean and
problem
in the
of inferred knowledge of any sort, including measurements and
hypotheses, the basic
phenomena of any experimental
science.
This chapter has been concerned with direct or derived measurements that have parent distributions of
commonly found
unknown
shape.
This
is
the situation
for derived measurements, but, fortunately,
many
of
measurements in experimental science have parent distributions of one of two general types, and each of these types is fitted reasonably well by a simple mathematical model. With the model, once its paramthe direct
eters are satisfactorily
determined
(albeit
only stochastically since the
by means of the sample), predictions of future measurements can be made and the reliability of any characteristic of the parent distribution can be established with many fewer measurements than is Some possible when we must rely on the sample knowledge alone. determination
examples of
is
this
convenience, in the case of specialized types of derived
quantities (e.g., the fitted
means and
precisions of successive subsets) that are
reasonably well by the normal distribution, have been shown in the
many of the topics of this chapter. In Chapter 4, the normal model is explored in some detail, and then in the final chapter the Poisson model and typical Poisson measurements are discussed.
discussions of
3-11.
Problems
The student should
test his intuitional feeling
with the correct answer to each
problem. 1.
Given the measurements:
gm
10.01
±
10.00
±0.12
9.97
± ±
±
9.96
±0.15
9.99
0.25
0.05
9.98
0.04 0.06
gm
of Measurements
Statistics
Find the weighted mean
(a) (i)
values,
151
the weights are taken as inversely proportional
if
to the squares of the
(ii)
±
Find the standard error (standard deviation
(b) is
±
to the
Functional Relationships
in
values as they should be. in the
mean)
if
each
±
value
interpreted as the standard deviation.
The standard deviation
2.
in the reading of a given voltmeter
corresponding quantity for a given ammeter
is
is
What
0.015 amp.
0.20 v; the are the per-
centage standard deviations of single determinations of wattages of lamps
operated at approximately their rated wattages, obtained from readings on these instruments for the case of (a)
a 500-w, 115-v lamp,
(b)
a 60-w, 115-v lamp, a 60-w, 32-v lamp, and
(c)
(d) a 60-w, 8-v 3.
With what
0.39%) 2.9%) (ans. 1.0%) (ans. 2.5%)
(ans.
(ans.
lamp?
precision
may
the density of a 10-g steel ball bearing of approxi-
mate density 7.85 g/cm3 be obtained tion of its average radius 4.
What is
is
the standard deviation of the determina-
if
mm, and of its mass,
0.015
the standard deviation in
where u
u,
=
0.05
mg ?
(ans. 0.67
%)
3x, in terms of the standard
deviation in x ? 5.
One of the
radiation constants
is
given by the formula
2n*k*
"15^
"
where k h c
= = =
where the
1.38049 x 10~16 (1 ± 0.000,05) erg/(molecule °K), 6.6254 x 10-27 (1 ± 0.000,2) erg-sec,
2.997928 x
±
lO^l ±
0.000,004) cm/sec,
values are probable errors.
(a)
Solve for a and express
(b)
What is
its
it
with the proper number of significant figures.
probable error expressed with the proper number of significant
figures?
What is the standard deviation in u, where u = 3x + 5y2 from the measure-
6.
,
ments
x y (a)
(b)
= =
12
13
11
12
10
14
13
12
14
13
12
35
37
34
37
34
37
36
35
38
34
35
when x and y are assumed to be completely independent, and when they are recognized as being partially dependent ?
What What is
the correlation coefficient in Problem 6 ?
7. (a)
is
(b)
the equation, with constants evaluated, of the linear regression
line? 8.
p
is
the pull required to
lift
/>(lb)
w
a weight
w by means
made
following measurements are
(lb)
= =
12
15
21
25
50
70
100
120
(a)
Find a linear law of the form p
(b)
Compute/? when w
=
150
lb.
=
a
+
bw.
of a pulley block, and the
Probability and Experimental Errors in Science
152
Find the sum of the deviations. Find the sum of the squares of the deviations of the given values of from the corresponding computed values. Note significant figures in all parts of this problem. (c)
(d)
p
In a determination of h/e by the photoelectric method, the following stop-
9.
ping potentials were found, after correction for the contact potential difference,
corresponding to the various wavelengths of incident
A(A)= V(y) =
3126 -0.385
2535
+0.520
light:
4047 -1.295
3650 -0.915
Using the least-squares method, determine h/e and
Assume
V only,
errors in
a
R is the resistance to motion of a + b V 2 from the following data
weighted equally, and
(b)
weighted in proportion to the speed V:
K(mi/hr) = R (lb/ton) =
The
standard deviation.
car at speed V, find a law of the form
(a)
11.
its
5461
-2.045
a fractional standard deviation of 0.5 %.
10. If
R =
4339 -1.485
10
20
30
40
50
8
10
15
21
30
a-ray activity of a sample of radon, expressed in terms of
measured
its initial
each succeeding 24-hr interval to be: 0.835, 0.695, 0.580, 0.485, 0.405, 0.335, 0.280, and 0.235. On the assumption that the activity obeys an exponential decay law, find the equation that best represents activity as unity,
the activity,
is
after
and determine the decay constant and the (ans.
12.
What
is
y
=
Solve this problem
(ii)
(iii)
and
E = olT4 In E = In
a
=
a
£/(r 4 )
+
4
In
E=
oiT 4 ,
from n
pairs of
measurements?
in
each of the following forms
T
give a qualitative reason for the differences in the answers.
ans. (b) Solve the
for
half-life.
/day, 0.1815/day, 3.82 days)
without knowledge of the precision of the measure-
first
ments by writing the relation (i)
1815<
the expression for the best value of a in the blackbody law relating
radiant energy and temperature, (a)
1.000,36-°
all
XEiT+KXTf),
(i)
problem
in
(ii)
ln"1
^ In E
t
- 42 In T )jn, t
(iii)
-
2 -i
n
l
i
_
terms of the standard deviations sE and s T constant
pairs of measurements.
13. Calculate the value of the correlation coefficient r for the following data
on the heights x x
y
= =
(in inches)
and weights y
63
72
70
68
124
184
161
164
66 140
(in
pounds) of 12 college students:
69
74
70
63
72
65
71
154
210
164
126
172
133
150
Are the stars that are easily visible with the naked eye randomly distributed sky? Divide the entire sky into many equal small solid angles, and discuss a method for finding the answer in terms of sample means and standard devia14.
in the
tions.
of Measurements
Statistics
=
—
4x
error of
The
16.
What
2/x.
viscosity
mean
deviation in y corresponding to an
is
% when x
1
=
when x
large, oo
is
1/V2)
calculated using Poiseuille's formula for the
through a cylindrical tube of length
/
and of radius
deviation,
(c) the fractional
in terms
r\
mean
under a pressure difference p. Write the expression for
t
(a) the
of a liquid
?]
Q flowing mean
(b) the fractional
in
the percentage
depends upon x; about
quantity of liquid in time
is
% in x ?
1
(ans.
a
IS3
A quantity y is expressed in terms of a measured quantity x by the relation
15.
y
Functional Relationships
in
deviation,
standard deviation
of the errors in the measured quantities Q,
a,
I,
and p, where
npafit
17.
The
viscosity
so that
G —
=
ri
-
(\ I
is measured by a rotation viscometer. The and a torque G is applied to the rotating cylinder
of a liquid
->]
cylinders are of radii a
and
— — —\
b,
l
I
(a) the fractional
mean
,
where
to is
the angular velocity of rotation. Calculate
and
deviation,
(b) the fractional standard deviation r\ when a = 4 cm and b = 5 cm, and when the mean deviation in both a and b is 0.01 cm and the standard deviation in both a and b is 1.25 x 0.01 cm, assuming that the error in G/oj may be neglected.
in
18.
A
coil
point on
its
of n turns of radius r carries a current
axis at a distance
error in measuring x (a)
(b)
when when
x from
its
center
is
find the value of x for
is e,
e is the
standard deviation, and
e is the
mean
2
in
H
is
a
If the
greatest
(ans. r/2)
deviation.
The mean of 100 observations is 2.96 cm and 0.12 cm. The mean of a further 50 observations is
the standard deviation 2.93
cm
is
with a standard
Find
deviation of 0.16 cm.
mean,
(b) the standard deviation, (c)
field at
+ x 2 )~%.
2
which the error
19.
(a) the
The magnetic
/.
H — 2nr I(r
and
the standard error
for the
two
sets
of observations taken together as a single set of 1 50 observations.
20. Derive Eq. 3-36. 21.
Prove that the frequency function of the variable
the frequency function of the normal variable
degrees of freedom
->•
v
oo.
Assume
z,
/,
Eq. 3-65, approaches
Eq. 1-25, as the number of
that the constant approaches 1/
^2n.
accompanying data on the yield of corn in bushels per plot on 22 experimental plots of ground, half of which were treated with a new type 22. Consider the
of
fertilizer.
Does
the fertilizer increase the yield ?
Treated
6.2
5.7
6.5
6.0
6.3
5.8
5.7
6.0
6.0
5.8
Untreated
5.6
5.9
5.6
5.7
5.8
5.7
6.0
5.5
5.7
5.5
IS4
Statistics
Samples of and s 2 =
25. sx
=
of Measurements 10
sizes
12
18.
in
Functional Relationships
1
55
and 20 taken from two normal populations give
Test the hypothesis that the standard deviations are
internally consistent. 26. The curve to be fitted is known to be a parabofa. There are 4 experimental points at x = -0.6, -0.2, 0.2, and 0.6. The experimental y values are 5 ± 2, 3 ± 1, 5 ± 1, and 8 ± 2. Find the equation of the best fitted curve. [ans. y(x)
=
(3.685
±
0.815)
+
(3.27
±
1.96).r
27. Differentiate (d/dp) logZ., Eq. 3-8, with respect to
solve for (p
— p*) 2
bution
is
by the method of
equivalent to writing a w
is
ap
=
maximum
= ^ npq,
\ p*{\
±
(7.808
4.94).r 2 ]
and, using Eq. 3-9,
and then show that the standard deviation
reference value as the estimate p*, result obtained
p
+
in p,
— p*)jn. Show
with the that this
likelihood for the binomial distri-
Eq. 2-28, where
w
is
the
number of
"wins." 28. in air 29.
Smooth
the measurements given in Table 3-5 of the sparking potentials between spheres of 75-cm diameter as a function of sphere separation.
Measurements were made on the distance-to-stop as a function of speed
with a group of 50 different automobiles of various manufacture and with different drivers. The measurements are given in Table 3-6. The speed is presumed to have been very accurately known in each case, the parent y distribution is presumed to be independent of x, and all the distance measurements to
have equal weighting. Which of the following relations best represents the measurements and what are values of the constants: 2 (a) y = ax, (b) y = b + ex, (c) y = dx or (d) y = ex + fx2 l ,
(ans. e
=
1.24,/
=
0.082).
"Everybody believes of errors;
think
it
in
the exponential law
the experimenters because they
can be proved by mathematics; and
the mathematicians because they believe
4 Normal
it
has been established by observations." E.
T.
WITTAKER
Probability Errors
The normal (Gauss) probability distribution is the mathematical model most commonly invoked in statistics and in the analysis of errors. For example, as was pointed out in the last chapter, it can be demonstrated analytically that this model fits very well the distribution of each of certain special parameters of empirical distributions: (a) the likelihood functions L_j of Section 3-1 approach a normal distribution; (b) in the tests of consistency of Section 3-3, the t distribution is normal if the number of degrees of freedom is very large; and (c) the ratios s₁²/s₂² can be assumed to be normally distributed.
And it was pointed out that the analysis of the errors in direct measurements is greatly simplified if it can be assumed that the parent distribution is normal. Examples of such simplification are (a) the weight to be assigned to each measurement xᵢ is 1/sₓ² if the x's are normally distributed; (b) the method of least squares is strictly valid if the normal distribution applies; and (c) errors propagate according to convenient rules if the parent distributions are normal. Indeed, the "theory of errors" as developed during the early years of the subject, i.e., the theory of probability as applied to direct measurements, was based almost exclusively on the assumption that the normal distribution fits the measurements.

Nowadays, however, the Poisson distribution takes its place alongside the normal distribution in the analysis of errors in direct measurements, especially in view of the popularity of measurements made with Geiger counters, etc. (see Chapter 5). But even so, when the expectation value μ in the Poisson distribution (Eq. 1-26) is large, the simpler algebraic expression that describes the normal case is rather a convenient and satisfactory approximation to the Poisson expression.
4-1. Derivation of the Normal (Gauss) Probability Density Function

It is apparent in the binomial distribution, Eq. 1-21, that, if the "success" probability p is constant, the expectation or mean value, and the neighboring region of interest, shift to larger values of k as the number of trials n increases. This is shown in Fig. 4-1. When the mean value μ becomes very large, two significant features appear: (1) the mean value becomes of dominant importance as the reference value, i.e., we become much more interested in the deviations than in the values of k reckoned from the origin (at k = 0), and (2) the unit interval between adjacent k values becomes relatively very small, i.e., the distribution loses much of its practical discrete character.

Fig. 4-2. Binomial probability B(k; n, p) vs. k/n: p constant, n varied.
In regard to the second feature, if n → ∞, the ratio of the unit k interval to the standard deviation σ (= √(npq)) approaches zero, and the distribution becomes continuous in the limit. In other words, adjacent values of the deviation variable z, where

    z = k − μ = k − np = k − k̄    (4-1)

can be written, in the limit, as separated by the differential dz. In the limit, the most probable value of k is k̄, as is discussed in connection with Eq. 1-24; and, in the limit, k̄ − np = 0. As we shall see, the normal distribution is the special case of the binomial distribution when n becomes infinite and p remains of moderate value.

It is instructive to plot binomial distributions, as in the step curves of Fig. 4-2, with k/n (instead of k) as the abscissa values. In such plots, if p is constant, the mean value of k/n is constant, but as n increases the width of the curve decreases and the individual steps become smaller.

To derive the formula for the normal probability distribution, we use
the first term of Stirling's approximation, Eq. 1-14, to represent each factorial number in the binomial expression. Thus,

    B(k; n, p) = [n!/(k!(n − k)!)] p^k q^(n−k)
               ≈ [√(2πn) n^n e^(−n)] / [√(2πk) k^k e^(−k) √(2π(n − k)) (n − k)^(n−k) e^(−(n−k))] · p^k q^(n−k)    (4-2)
               = [n / (2πk(n − k))]^(1/2) [n^n / (k^k (n − k)^(n−k))] p^k q^(n−k)    (4-3)

and, writing n^n = n^k n^(n−k),

    B(k; n, p) = [n / (2πk(n − k))]^(1/2) (np/k)^k (nq/(n − k))^(n−k)    (4-4)

Let us now change the variable from k to z according to Eq. 4-1 and also write k = np + z and n − k = nq − z (the latter follows from Eq. 4-1 and from p + q = 1). Then, for n large, Eq. 4-4 becomes

    B(z; n, p) ≈ [2πnpq(1 + z/np)(1 − z/nq)]^(−1/2) (1 + z/np)^(−(np+z)) (1 − z/nq)^(−(nq−z))

Since, for n large, the quantities z/(np) and z/(nq) are small compared to unity, we may neglect them in the first factor; but they cannot be neglected in the parenthesized factors that are raised to high powers. In treating the parenthesized factors, it is convenient to rewrite the expression in logarithmic form so as to take advantage of the power series expansion of a logarithm. This expansion, for any x such that 2 ≥ x > 0, is

    log_e x = (x − 1) − ½(x − 1)² + ⅓(x − 1)³ − ...    (4-5)

Hence, for n large,

    log_e B(z; n, p) ≈ −½ log_e(2πnpq) − (np + z)[z/(np) − z²/(2n²p²) + z³/(3n³p³) − ...]
                       + (nq − z)[z/(nq) + z²/(2n²q²) + z³/(3n³q³) + ...]
                     = −½ log_e(2πnpq) − z²/(2npq) + (z³/6n²)(1/p² − 1/q²) − (z⁴/12n³)(1/p³ + 1/q³) + ...    (4-6)
For n small, several approximations have been made up to this point in the derivation; but for n large, the most serious approximation is the one to be made now, viz., that the remaining terms

    (z³/6n²)(1/p² − 1/q²) − (z⁴/12n³)(1/p³ + 1/q³)

or, since σ = √(npq) from Eq. 2-28,

    z³(q − p)/(6σ⁴) − z⁴(p³ + q³)/(12σ⁶)    (4-7)

may be neglected. Which of these two terms is the more important depends on the values of p and q. With this approximation, it is seen that, with p of moderate value, we neglect the net effect of all z terms of powers higher than the second in Eq. 4-6. Then, as an approximation, we change B(z; n, p) to G(z; n, p), with G symbolizing "Gauss," note that p + q = 1, and write

    log_e G(z; n, p) = −½ log_e(2πnpq) − z²/(2npq)

and

    G(z; n, p) = [1/√(2πnpq)] e^(−z²/(2npq))

An important feature of this expression is that the parameters n, p, and q of the binomial distribution appear always as a triple product. Hence, some simplification is afforded by writing

    h = 1/√(2npq) = 1/(σ√2)    (4-8)

and then

    G(z; h) = (h/√π) e^(−h²z²)    (4-9)

or

    G(z; h) Δz = (h/√π) e^(−h²z²) Δz    (4-10)
Equation 4-9 is the normal (Gauss) probability density function. It is also called the normal differential probability distribution, or the law of the normal frequency distribution. As seen in Eq. 4-9, G(z; h) is only one value in a continuum of values, and G(z; h) does not have the significance of probability until it is multiplied by Δz, as in Eq. 4-10. G(z; h) Δz is the probability of observing a deviation z within the small interval Δz.

The function

    (h/√π) ∫ from −∞ to z of e^(−h²z²) dz = Φ(z)    (4-11)

is called the normal (Gauss) probability distribution function or the cumulative probability distribution. (Note the difference between the terms "probability density function" and "distribution function.") Equation 4-11 gives the probability of observing a deviation z in the range −∞ to z.

Fig. 4-3. Normal (Gauss) density function (normalized frequency distribution) for each of three values of the parameter h.

Shape of the normal frequency curve. Three graphs of the normal density function, Eq. 4-9, are shown in Fig. 4-3. The normal (Gauss) curve is symmetrical, since z appears to the second power only, and the curve approaches the z axis asymptotically at both extremes. The curve is shown as continuous, as is appropriate for the normal distribution, which is based on the assumption that the limits n → ∞ and Δz → 0 have been reached. The maximum ordinate and the shape of the curve are determined by the single parameter h. The peak ordinate value is h/√π, and the relative width of the curve increases as h decreases; h is a precision index that varies inversely with the standard deviation σ (Eq. 4-8).

Normalization. In order that Eq. 4-10 or 4-11 give a probability, it is necessary that the probability function be normalized, i.e., that the sum of all possible outcomes equal unity:

    (h/√π) ∫ from −∞ to ∞ of e^(−h²z²) dz = 1    (4-12)
Probability and Experimental Errors in Science
162
=
n >P)
2*=o^(^
1»
Eq. 1-22. Since the area under the curve
ized, the inverse relation
the curve
between the
maximum
is
normal-
ordinate and the width of
obvious.
is
We shall have need later for the type of integration indicated in Eq. 4-12, so
us carry
let
=
same time check the normalization. is based on geometrical considerathe z, y plane, and a similar function
at the
for this integration
Consider y
tions.
y
now and
out
it
The usual method
=
G(z;
h) in
G(x; h) in the perpendicular
y plane from z y plane from x =
the curve in the
curve in the
x,
z,
A=
2
4=
f
x,
y plane.
=
to oo,
to oo.
also the area under the
Thus,
V*v dz = 2 ~= f Y» v dx
/ show that the coefficient 2(/7/v -n-) is such x and z variables are independent,
and we wish Since the
to
H = A-
f
7T
Jo
(4-13)
yJTT Jo
yj7T Jo
A*
Let A/2 be the area under
and
that the area
V»V * fV»V dz = 4 h- f" fVV** Jo Jo Jo
A = 1.
<**
IT
Evaluating the double integral corresponds to determining the volume of h) curve about the y axis. To convenient to change to polar coordinates
the solid obtained by rotating the G(z;
perform
this integration,
in the x, z plane.
becomes
r
dd
dr,
it is
So, place r 2
and
=
z2
+
this is (7r/2)r dr in
x 2 The element of area dz dx one quadrant. Hence, .
i
hr rdr
I JO
TT
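A quick numerical check of Eqs. 4-8 through 4-12 can be sketched as follows (Python; the crude trapezoid-style sum and the choice σ = 1 are our own illustrative devices):

```python
import math

sigma = 1.0
h = 1.0 / (sigma * math.sqrt(2.0))        # Eq. 4-8: h = 1/(sigma*sqrt(2))

def G(z):
    """Normal density G(z; h) of Eq. 4-9."""
    return (h / math.sqrt(math.pi)) * math.exp(-(h * z) ** 2)

dz = 0.001
# Area under G over a wide range: should be ~1 (Eq. 4-12)
area = sum(G(-8 + k * dz) * dz for k in range(int(16 / dz)))
print(f"area under G = {area:.6f}")

# Probability of a deviation within one standard deviation: ~0.6827
inside = sum(G(-sigma + k * dz) * dz for k in range(int(2 * sigma / dz)))
print(f"P(|z| < sigma) = {inside:.4f}")
```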
In a measurement problem, we are concerned with an experimental frequency distribution of n trial measurements. This experimental distribution may be normalized by dividing the ordinate (frequency) scale by n. An alternative procedure, if use of the normal frequency distribution is involved in the analysis or predictions of the experimental values, is to multiply G(z; h) Δz, or the integral of G(z; h) dz from z₁ to z₂, by n. The normal expression, multiplied by n, is properly called the normal frequency distribution; but this product is not a probability because it is no longer normalized. With the significance and procedure of normalization well understood, careful distinction is usually not explicitly made between the terms frequency distribution and probability distribution.

4-2. Errors in the Normal Approximation

The essential approximation made in the derivation of Eq. 4-9 was the neglect of the terms of powers higher than the second in the logarithmic expansion. This approximation, represented by Eq. 4-7, is valid only if n is very large or if all the deviations are very small. A quick inspection of Eq. 4-7 shows that the deviations should be at least smaller than about ±3σ if n ≈ 10, or about ±4σ if n ≈ 100. (Note that n is the number of Bernoulli trials.)

Table 4-1. Errors in G(z; h) Δz [columns: k, B(k; 10, 0.2), G(z; h) Δz; tabulated values not reproduced here]

Table 4-2. Errors in G(z; h) Δz [tabulated values not reproduced here]

Table 4-3. Errors in G(z; h) Δz Independent of Skewness (all odd-order terms in Eq. 4-6 are zero) [tabulated values not reproduced here]
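The kind of comparison tabulated in Table 4-1 can be reproduced numerically; the short Python sketch below (our own construction, using p = 0.2 and n = 10 as in the table heading) prints the binomial probability and its normal approximation side by side:

```python
import math

n, p = 10, 0.2
q = 1.0 - p
mu = n * p
h = 1.0 / math.sqrt(2.0 * n * p * q)      # h = 1/sqrt(2npq), Eq. 4-8

def binom(k):
    return math.comb(n, k) * p**k * q**(n - k)

def gauss(k):
    # G(z; h) * dz with dz = 1 (the unit k interval) and z = k - mu
    z = k - mu
    return (h / math.sqrt(math.pi)) * math.exp(-(h * z) ** 2)

for k in range(n + 1):
    b, g = binom(k), gauss(k)
    print(f"k={k:2d}  B={b:.4f}  G*dz={g:.4f}  error={(g - b):+.4f}")
```

The errors are largest far out in the tails, as the discussion above leads one to expect.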
…errors, all presumed now to be random and independent, is very large indeed.

With measurements in continuous sample space, each elementary error is to be identified with a basic Bernoulli trial. We assume that the effect of each elementary error is a very small increment, either positive or negative, in the actual measurement; and we assume, as in discrete sample space, that the elementary errors so conspire that the observed deviation is their algebraic sum. With counting experiments, the elementary errors are grouped into bundles that correspond to "yes, a count is observed" and "no, a count is not observed" in a specified sample. In this case, as discussed in Chapter 5, the basic Bernoulli trial is identified with "a count" vs. "no count."

Mechanical analog for Bernoulli-type elementary errors in continuous sample space. Referring to Fig. 4-4, suppose that many identical small spherical steel balls are dropped one at a time vertically from the nozzle at the top. Each ball filters down through the symmetrical array of steel pins (represented by solid circles in the figure) and comes to rest in one of the many identical bins at the bottom. Whether the ball, as it encounters a pin, goes to the right or to the left is presumed to be determined entirely by chance. As drawn, the pins are arranged in an array of quincunxes (one side of a face-centered cubic structure). If each ball is only slightly smaller than the clearance distance between pins, it falls practically head on to the next pin and has a constant chance, perhaps a 50-50 chance, of being deflected to the right or to the left. It is not necessary that this chance be 50-50, but it must be constant, the same for each pin encountered; a noneven chance might be realized in this model if pins of an asymmetrical shape were used. There are numerous possible paths through the array and some balls will find their way to each of many different bins.
In analogy, each operation of a ball filtering through to a bin is a measurement. The horizontal position of the nozzle represents the supposedly "true" value for the case of the 50-50 pin chance; each deflection caused by a pin encountered in the filtering process corresponds to a small elementary error; and the position of the bin into which the ball finally comes to rest represents the measured value.* If the chance is 50-50, then the central bins, directly under the nozzle, have the best chance of receiving the largest number of balls, and the frequency with which a ball enters a particular bin decreases with the distance of the bin from the central position. If the right-left deflection chance per pin is constant but is not 50-50, the shape of the histogram is essentially the same and symmetrical, but its maximum ordinate is shifted to one side; this corresponds to the effect of a systematic error. As mentioned in Chapter 3, there is no practical way of distinguishing between an undetected systematic error and one or more random elementary errors since, in real life, we never truly know the "position of the nozzle."

To be a good analog for the normal distribution, the number of horizontal rows of pins and the number of balls dropped must be increased indefinitely, and the geometrical size of the balls, the pin spacing, and the bin size must be reduced indefinitely (conditions that, in combination, correspond to the infinite number of Bernoulli trials and to continuous sample space). These extensions can be easily imagined.†

* The "true" value is merely supposed because, unfortunately, our best view of it is through the thick veil of elementary errors. Also, in some measurements, the property itself is altered by the very process of measurement, a complication enshrined in Heisenberg's uncertainty principle. In any case, the "true" experimental value is usually taken as the mean value resulting from the measurements.
† The ball filtering down through the array of pins is solving the popular random-walk problem in one dimension. If the deflection chance is 50-50, the ball performs a symmetric random walk. The physicist takes this as the simplest model for one-dimensional diffusion.
Characteristics of elementary errors. In the derivation of the normal distribution we assumed that the magnitude of the increment per elementary error is constant (only two possible outcomes of a Bernoulli trial, viz., positive and negative), and that the probability p that the magnitude is positive is constant for all elementary errors. In an actual measurement, it is most unlikely that all the elementary errors contribute in accord with these two assumptions. However, those elementary errors making extremely small incremental contributions are presumed to be less important than those making larger contributions. In essence, then, we assume the existence of a very large number n of important elementary errors, all of about the same incremental size and all of about the same positive-sign probability p. p may be reasonably presumed to be ½, but this value is not necessary.

In support of the just-mentioned relaxation of the rigid Bernoulli requirements of the elementary errors, we may point out that the normal distribution function can be derived on the basis of elementary errors having somewhat different characteristics from those of the Bernoulli trials in our derivation.* Two other sets of characteristics are as follows: (1) If p = ½, the incremental contributions need not be rigidly constant in magnitude for all elementary errors; if they are very small, they may be merely of the same order of magnitude. (2) If the sizes of all the possible increments of a given elementary error are themselves normally distributed, the number of elementary errors n need not be specified; n may be either large or small, and, furthermore, the standard deviation in the distribution due to any one elementary error may be large or small.

In conclusion, it is reasonable to suppose that numerous random elementary errors of the various imagined causes actually exist and are indeed responsible for the observed variations in the trial measurements in continuous sample space. And it is reasonable to suppose that these elementary errors conspire in such fashion as to cause the trial measurements to fit in more or less good approximation the normal (Gauss) distribution, even though we are not able to fix in detail their special characteristics. Also, it is reasonable that the region of greatest misfit of a set of trial measurements is in the tails, say |z| ≳ 2σ, for which a few elementary errors of relatively large contributions but of nonnormal shape would have the greatest effect.

It is significant that, at least to the author's knowledge, no experimental situation leads to a truly normal distribution, and that the deviations from normal are greatest in the tail regions beyond about ±2σ.

* All these derivations are special cases in the theory of probability of the proof of the so-called central limit theorem. See H. Cramér, Mathematical Methods of Statistics (Princeton University Press, Princeton, 1946), pp. 213–232.
In the application of the normal distribution function to actual measurements, the Bernoulli-trial parameters n, p, and q have no individual significance beyond the concepts of the elementary errors. These parameters always appear as the product npq, and this product, which does have practical significance, we refer to in terms of the precision index h or σ [σ = 1/(h√2) = √(npq)]. In the application of the normal distribution, we shall generally determine (best estimate) σ and h from the actual measurements themselves rather than from the Bernoulli parameters. Having no further need of n as a symbol of the number of Bernoulli trials, we use it as the symbol for the number of actual trial measurements. It is hoped that this double use of the symbol n will not be confusing.

4-4. The Error Function

The probability of observing a deviation z in the range from −z₁ to +z₂, where z₁ and z₂ are arbitrarily chosen, is found by integrating the normal density function, Eq. 4-9, between these particular limits. This integration is carried out by expanding the exponential term in a power series and by integrating term by term, but it is a tedious process. Fortunately, integral values for most problems can be found in reference tables.

In the tables, we generally find the integral value as integrated either from 0 to z or from −z to +z (where the numerical value of z is the parameter of the table), and it is necessary to make simple additions or subtractions to deduce the integral value between two arbitrary limits of integration such as z₁ and z₂. This is easily done with comprehension if we remember that the integral between any two limits is the area under that part of the normal density curve bounded by the two limits.

Standardized variables. The function Φ(z) given in Eq. 4-11 is not in a satisfactory form for general tabular listing because h has a different numerical value for each different specific set of measurements. It is convenient to standardize the variable, i.e., to use either hz or z/σ (= √2 hz) instead of just z; then, in terms of either of these forms of the variable, the error function is invariant to different values of h (or of σ). The two most popular forms of the invariant function for computational purposes are

    Φ(x) = (2/√π) ∫ from 0 to x of e^(−x²) dx    (4-14)

where x = hz, and

    erf(t) = (1/√(2π)) ∫ from −t to +t of e^(−t²/2) dt    (4-15)

where t = z/σ = √2 hz. The term "error function" is used specifically in reference to Eq. 4-15.

To aid in the ready use of the tables,* we point out that, in Eq. 4-14 where x = hz, Φ(x) = 0.8427 for x = 1; and in Eq. 4-15 where t = z/σ = √2 hz, erf(t) = 0.6827 for t = 1. Also, in Eq. 4-15, if the integration limits are 0 to t instead of from −t to +t, ½ erf(t) = 0.3413, viz., 0.6827/2, for t = 1. (Note that t here is not the value of a single measurement, as in Chapters 1, 2, and 3, but is a standardized deviation.) Table 4-4 lists some values of ½ erf(t) from 0 to t.

* B. O. Peirce, A Short Table of Integrals (Ginn & Co., Boston, 1956), 4th ed., p. 128, uses the form of Eq. 4-14. H. B. Dwight, Tables of Integrals (Macmillan Co., New York, 1957), 3rd ed., p. 275, uses the form of Eq. 4-15. The Handbook of Chemistry and Physics (Chemical Rubber Publishing Co., 1956), 38th ed., uses the form of Eq. 4-15 with the integration from −t to +t instead of from 0 to +t. Tables of Probability Functions, Vol. 1 (Federal Works Agency, Work Projects Administration, 1941, sponsored by the National Bureau of Standards) uses Eq. 4-14.

4-5. Precision Indices

To use Eq. 4-14 or 4-15 in a typical measurement problem, we must know two parameters. First, we must know the central location value, the value at which z = 0. This is usually taken as at the arithmetic mean of the set of n observed trial measurements. Then, we must know one or more of the dispersion precision indices. For example, if the standard deviation s is known from the n observed measurements, a satisfactory estimate of the universe standard deviation σ is obtained from the relation

    σ ≈ s √(n/(n − 1))

as discussed in connection with Eqs. 2-22 and 3-98. Knowing the numerical value of σ, we may proceed with the change of the variable z to the standard variable z/σ, or to zh since we know that σ = 1/(h√2), and make use of Eq. 4-15 or 4-14 respectively.

Dispersion indices other than σ and h are common, e.g., the mean deviation z̄ and the probable error (or some other confidence limit). For a mathematical model of the frequency distribution, such as the normal distribution, a simple numerical relation exists between each pair of the various dispersion indices.

Mean deviation. The mean deviation z̄ is taken without regard to the algebraic sign of the individual deviations, as discussed in Chapter 2, and for the normal distribution it is given by the integral of Eq. 4-16.

Table 4-4. Error Function ½ erf(t) from 0 to t, and Ordinate Values G(t) = (1/√(2π)) e^(−t²/2)
[tabulated values not reproduced here]
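Entries of the kind listed in Table 4-4 can be generated directly from the standard-library error function; a minimal Python sketch follows (math.erf uses the convention of Eq. 4-14, so the change of variable t/√2 is needed to obtain the ½ erf(t) of Eq. 4-15):

```python
import math

def half_erf(t):
    # (1/sqrt(2*pi)) * integral of exp(-u**2/2) du from 0 to t
    # = 0.5 * erf(t / sqrt(2)); math.erf follows the Eq. 4-14 convention.
    return 0.5 * math.erf(t / math.sqrt(2.0))

def ordinate(t):
    # G(t) = (1/sqrt(2*pi)) * exp(-t**2/2)
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

for t in [0.0, 0.5, 0.6745, 1.0, 1.5, 2.0, 2.5, 3.0]:
    print(f"t = {t:6.4f}   (1/2)erf(t) = {half_erf(t):.4f}   G(t) = {ordinate(t):.4f}")
```

For t = 1 this prints 0.3413, the value quoted in the text, and for t = 0.6745 it prints 0.2500, which anticipates the probable error discussed below.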
We need not be concerned about the normalization in this case because we already know that the probability distribution is properly normalized, Eq. 4-12. The remaining integral in Eq. 4-16 is an easy one to evaluate, and we find that

    z̄_th = 1/(h√π) = 0.564/h    (4-17)

Our best estimate as to the value of z̄_th from the experimental mean deviation z̄ is

    z̄_th ≈ z̄ √(n/(n − 1))    (4-18)

As mentioned in Chapter 2, the mean deviation z̄ is rather commonly used by experimenters, not only because it is an easy dispersion index to calculate. It was also stated in Chapter 2 that the mean deviation is an inefficient index. And, as mentioned before, in addition to its inefficiency, the mean deviation does not lead to a useful general rule for the propagation of errors in a derived measurement. The standard deviation is generally a preferable index.

Standard deviation. The square root of the mean squared deviation for the normal distribution is written as

    σ = [2(h/√π) ∫ from 0 to ∞ of z² e^(−h²z²) dz]^(1/2)    (4-19)

where, again, the indicated normalization is not actually necessary in this case. This expression may be altered slightly and the integration performed by parts. (Write u = z and dv = z e^(−h²z²) dz.) The first term on the right then vanishes at both limits; the second term is the definite integral we encountered in Eq. 4-13. After carrying out the integration, we have

    σ = 1/(h√2)    (4-20)

This expression we already knew, Eq. 4-8, from the derivation of the normal frequency distribution; its derivation here merely checks the various arguments and gives us practice in their use.

Fig. 4-5. The particular deviation known as the probable error, pe, divides the area ¼ : ¼ : ¼ : ¼.

Probable error. The probable error, pe, is defined as the particular deviation that divides the left (or right) half of the area under a frequency distribution curve into two equal parts. This is indicated in Fig. 4-5. Thus, the probability of observing a deviation within ±pe is ½. In other words, for a normal distribution, pe is the particular value of z for which erf(z) = ½, viz.,

    ½ = 2(h/√π) ∫ from 0 to pe of e^(−h²z²) dz = (2/√π) ∫ from 0 to x₁ of e^(−x²) dx,   with x₁ = h(pe)

This integral is easily evaluated from a table of values of error functions, and we find

    pe = 0.4769/h = 0.6745σ    (4-21)

The probable error, having equal positive and negative magnitudes, is a dispersion index that can be indicated on the graph of symmetrical distributions only. It is hardly the less useful, although not pictorial, as a dispersion index for asymmetrical distributions. Indeed, the probable error is an index rather commonly used by experimental scientists, although statisticians always prefer the standard deviation. It is important to note that the numerical relation between the probable error and any other index depends specifically upon the shape of the distribution; the numerical relation in Eq. 4-21 between pe and σ holds specifically for the normal distribution.
Confidence limits in general. The probable error is the 50% confidence limit by definition; see Fig. 4-5. Any other confidence limit, e.g., the 90% confidence limit, may be deduced in the same manner as that for the probable error. Thus, for the 90% limit,

    0.90 = (2/√π) ∫ from 0 to x₁ of e^(−x²) dx,   90% c.l. = x₁/h    (4-22)

In terms of confidence limits in the normal distribution, the precision indices correspond to the per cent limits as listed in Table 4-5.

Table 4-5. Numerical Relationships between Various Dispersion Indices and the Confidence Limits for the Normal Distribution
[tabulated values not reproduced here]
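The numerical relations collected in Table 4-5 follow from the error function alone; the Python sketch below (the bisection inversion is our own device, not a tabulated procedure from the text) recovers the familiar multipliers of σ:

```python
import math

def confidence(k):
    """Probability that |z| < k*sigma for the normal distribution."""
    return math.erf(k / math.sqrt(2.0))

def limit_for(conf, lo=0.0, hi=10.0):
    """Invert confidence(k) = conf by bisection."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if confidence(mid) < conf:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(f"mean deviation = {math.sqrt(2.0 / math.pi):.4f} sigma "
      f"-> {100 * confidence(math.sqrt(2.0 / math.pi)):.1f}% limit")
print(f"probable error = {limit_for(0.50):.4f} sigma  (50% limit)")
print(f"standard dev.  = 1.0000 sigma -> {100 * confidence(1.0):.2f}% limit")
print(f"90% limit      = {limit_for(0.90):.4f} sigma")
print(f"99% limit      = {limit_for(0.99):.4f} sigma")
```

The 50% inversion returns 0.6745σ, in agreement with Eq. 4-21.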
4-6. Probability for Large Deviations

In a normal distribution the probability that a deviation |z| will be observed equal to or greater than some particular deviation |z₁| is given by

    G(|z| ≥ |z₁|) = 2(h/√π) ∫ from |z₁| to ∞ of e^(−h²z²) dz

This probability may be more easily evaluated, depending on the particular tables available, if the limits of integration are changed and the expression written as

    G(|z| ≥ |z₁|) = 1 − 2(h/√π) ∫ from 0 to |z₁| of e^(−h²z²) dz    (4-23)

A few calculations with Eq. 4-23 are listed in Table 4-6. In this table, for convenience, the independent variable is listed as |z₁/σ|.

Table 4-6. Probability for Large Deviations
[columns: |z₁/σ|, G(|z| ≥ |z₁|) (%), odds against, to 1; tabulated values not reproduced here; the first row is for |z₁/σ| = 0.6745]
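Entries of the type listed in Table 4-6 follow directly from Eq. 4-23 and the complementary error function; a minimal Python sketch (our own):

```python
import math

def prob_exceeding(k):
    """G(|z| >= k*sigma) for the normal distribution, Eq. 4-23."""
    return math.erfc(k / math.sqrt(2.0))

for k in [0.6745, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]:
    p = prob_exceeding(k)
    print(f"|z1/sigma| = {k:6.4f}   P = {100 * p:8.4f}%   "
          f"odds against ~ {1 / p - 1:10.1f} to 1")
```

For |z₁/σ| = 0.6745 the probability is 50% (even odds), and it falls to about 0.27% (roughly 370 to 1 against) at 3σ.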
The experimenter is faced with the question: In order to do "justice" to his measurements as a whole, should the "bad" measurement be rejected?

Before seriously considering rejection, the experimenter should do the following. First, he should make additional measurements if at all possible, so as to lessen the relative influence of the divergent value or else to reveal it more convincingly as being "bad." Second, he should make every effort to find a blunder or a transient systematic error that might be responsible for the discordant value. Many important discoveries have been made in searches for possible valid reasons for a divergence beyond that owing to randomness. There is, for example, the famous case of the discovery of argon by Lord Rayleigh. He noted a discrepancy between the density of a sample of nitrogen prepared from air and that of a sample produced chemically. It would have been easy for him to reject immediately one of his results as having been caused by some unidentified mistake.

Sometimes, confronted with the question of what to do with a divergent value, the experimenter uses the median instead of the mean as the better location value and also as the reference value in computing the dispersion index, e.g., the "standard deviation from the median." However, a price is paid for the safety afforded by this scheme: the median is less efficient than the mean, i.e., more measurements are needed to obtain the same precision. Also, this procedure is very unconventional in experimental science, and if it is used the reported measurements may be misunderstood.

This problem of what to do with a large deviation is so common that, as a general policy, some investigators take the mean of all but the highest and lowest values in each set of trial measurements. To resort to this device is obviously less than honest and, in fact, it denies the fundamental basis of statistical interpretations of precision.

Chauvenet's criterion for rejection. Rejection on the basis of a hunch or of general fear is not at all satisfactory, and some sort of objective criterion is better than none. Many objective criteria have been proposed, all of them arbitrary. The one due to Chauvenet is old but may serve as an example. This criterion states that a measurement in a set of n trials shall be rejected if its deviation (reckoned from the mean) is such that the probability of occurrence of all deviations equally large or larger does not exceed 1/(2n). This is not a good criterion because the significance level in rejection is too sensitive to the sample size n.

If the parent distribution is normal, the critical rejection size z_ch (subscript "ch" for Chauvenet) can be computed for any value of n from the expression

    G(|z| ≥ |z_ch|) = 2(h/√π) ∫ from |z_ch| to ∞ of e^(−h²z²) dz = 1/(2n)    (4-24)

For this calculation, h is computed (from s) before the measurement in question is rejected. The need for the factor 2 in the coefficient of the integral in Eq. 4-24 is readily recognized if the rejection criterion is restated as follows: the deviation, to be acceptable, must fall within the range bounded by ±z_ch; if it falls outside this range, on either side, it is rejected. Note that as n increases the critical rejection size z_ch also increases, and, for very large n, rejection of any measurement becomes very improbable, as it should. The dependence of z_ch on n up to 500 is shown in Table 4-7.

Table 4-7. Dependence on n of Chauvenet's Limiting Values hz_ch, z_ch/σ, and z_ch/pe
[tabulated values not reproduced here]
178
for different types of measurements. arbitrary at best,
is
of measurements. |2.5cr| is
As
a consequence, the criterion,
generally arbitrary in a different
way
We should especially note that the
for different types
region beyond about
any a priori case, we lose confidence that an adequate description of the parent population. important that the experimenter who rejects one or more
just the region where, in
the normal distribution Finally,
it is
is
measurements, and intends
results
his
to
be
significant,
statistically
should report very carefully the detailed conditions of the measurements, the total number of trials, the particular measurement(s) rejected, and the criterion of rejection,
4-7.
all this
as part of the reported final results.
Test of a Statistical Hypothesis: Example
The type of arguments made in the objective test for rejection of a "bad" measurement is also involved in the test for rejection of a statistical hypothesis. An understanding of this type of argument is essential in the statistical interpretation of the significance of almost any type of observation or theory. For practice, let us discuss now a simple example of a test of a statistical hypothesis. Also, this example will better prepare us to understand the χ² test of the next section. Consider that a die has been cast 315,672 times and that either a 5 or a 6
appeared 106,602 times.
The hypothesis
to be tested
is
that the die
is
"true." In this example,
experiment
fit
we wish
to find out
satisfactorily a
whether or not the outcomes of this
binomial distribution where
//
=
315,672
and, according to the hypothesis, p = \. The binomial expectation value for success, i.e., for either a 5 or a 6, is np = 315,672 \ = 105,224. This •
is different from the one actually observed by a relatively small amount, viz., 1378 about \\ %. This difference does not seem to be very much, but the question is, Is it more than we should expect on the basis of purely random outcomes of each cast of a perfectly true die? We can answer this question with rather satisfactory reliability by the following
value
—
argument. If many experiments, an infinite number in the limit, were to be performed with a perfectly true die, each experiment consisting of 315,672 casts of the die, there would be many different numbers of successes; in fact, with an infinite number of such experiments, the frequency
distribution
Eq. 1-20.
a
is
just the binomial distribution B(k\
The standard deviation
= Jnpq =
Jnp(l
Now, we may compare
-
p)
//,/?)
in this distribution
=
x/3 15,672
X
i
x
=
B(k; 315,672,
\),
is
5
=
264.9
the deviation of the result obtained with the actual
Normal
Probability Errors
with the standard deviation with a perfectly true die,
die, viz., 1378,
264.9.
179
We may
=
experimental standardized deviation (1378/264.9)
Our next 5.20(7, is
task
is
a reasonable one owing to statistical fluctuations alone on the
=
£,
is
true.
We
of having a deviation
315,672 casts?
is
k
true.
number of
=
What
is
the binomial probability,
is
or larger in a single
very small,
we
shall
"unreasonably" large and that the die
not true; but, on the other hand,
"probably"
ask,
this large
If this probability
the deviation 5.20a
the
5.20c
to determine whether or not this observed deviation,
assumption that the die with/?
viz.,
express this comparison conveniently by writing the
To determine
if this
probability
this probability,
successes outside the limits
is
"probably"
not small, the die
is
we must sum over
±5.20(7,
i.e.,
distribution
is
write, with n
B(np-
a very
=
1378
sum
is
simplified
good approximation
315,672 and/?
< k<
np
+
=
is
all
greater than
The
105,224 and less than 103,846, in the binomial distribution.
arithmetic in performing this
of
trial set
conclude that
by noting that the normal
to the binomial.
Then we may
\,
1378;
n, p)
= -==
e~
t/2
dt
tJItt Jo.20
= 0.000,000,2 using Eq. 4- 1 5 in which the standardized variable
is zja.
Hence, the chance
that a true die will give a result of 106,602 (or more) successes 1
in 10,000,000,
and we conclude that
either the die
is
is
about
not true or
else a
most unexpected event has occurred.* Since, as we have just shown, it is not reasonable for p to be -], it is instructive to extend this example to include the question, What is the reasonable value and range of values for
p
as judged
from the 106,602
The most reasonable value for/? is simply the experimental value 106,602/315,672 = 0.3377. The proper number of significant figures successes?
We must decide is our next concern. what numerical deviations are reasonable. One commonly employed criterion, an easy one to use, is that reasonable deviations must not exceed ±o\ The value of a is not sensitive to the actual value of/? in this example, and we may take it as 265. Hence, the limiting expectation values are 106,602 ± 265 and the "reasonable" limiting range of values of/? is ±0.000,84 as deduced from the calculation (106,602 ± 265)/3 15,672 with which to write this value of/?
=
0.3377
±
0.000,84.
Another commonly used criterion is that the limiting deviations are those for which the probability exceeds 0.01 (or sometimes 0.05). With *
This example could be rephrased to demonstrate the law of large numbers
effective
in
convergence of the expression for the experimental probability, Eq. 1-39.
the
Probability and Experimental Errors
180
this criterion, the limiting deviation
-%= V 277
dt
>
0.01
value 0.01,
we
find
'
<2/2
is
x
\
critical
of the error function.
(or
It
This
0.0022.
sometimes 0.05)
(4-25)
\z
x
=
\
2.516a from a table of values
follows that the limiting range of values of p
±0.0022 from the calculation (106,602
±
given implicitly by the expression
J *io
"
Using the
\z
Science
in
±
265 x 2.576)/3 15,672
the "reasonable" limiting range
is
if
we
=
is
0.3377
say that on the
average 99 out of 100 random observations (each observation consisting
of 315,672
trials)
are reasonable,
picions to the extent that
The
we
and
declare
that
it
criterion mentioned, viz.,
first
\z
that about 68 out of 100 are reasonable
This
a
is
much more
1
out of 100 arouses our sus-
to be unreasonable. x
\
=
±o,
is
equivalent to saying
and 32 out of 100 are unreasonable.
stringent requirement.
Let us quickly review what we have just done. In this binomial example, the position of the actual deviation, viz., 1378,
was found
to
lie
in the
of the binomial distribution for p = \. For this reason, it is unreasonable to say that the deviation is due to random fluctuations alone,
remote
tails
and so the
statistical
hypothesis that p
=
\
was
rejected.
Then, assuming
that the binomial model, with the experimental value of p, does
observations, In this
we found
example as
it
fit
the actual
the "reasonable" range of values of p. is
given, the statistical hypothesis as to the value
test. Suppose now that, instead of having only two possible outcomes of each cast, viz., a 5 or a 6 on the one hand and a 1, 2, 3, or 4 on the other hand, there has been recorded the number of
of/?
is
the only one
we can
times each of the six different sides of the die appeared.
Now we
can
test
whether or not the multinomial distribution agrees reasonably well with the observations as well as test whether or not each of the six values of/?
The test for each value of/? would proceed in the same manner as above described, the problem being treated as a binomial one. But the test of the hypothesis as to the model parent distribution is more involved in just the same way that the multinomial distribution is more complex is J.
than
is
the binomial distribution.
The
test
of a hypothesis that a particular
model distribution fits a given experimental distribution test for "goodness of fit" and is discussed next. 4-8.
Test of Goodness of
The frequency
distribution
Fit of
is
known
as the
a Mathematical Model
of a small number n of measurements
generally provides us with only sketchy information as to the parent distribution
may
suggest
whose characteristics we seek. The experimental distribution more than one maximum, will generally show asymmetry,
Normal and
Probability Errors
may
it
If n
is
may
not.
181
suggest either higher or lower
increased, these
nonnormal
The problem now
is
tails
than the normal distribution.
characteristics
may
disappear or they
to decide, having only a small n, whether or
not the experimental distribution can be satisfactorily assumed to be a
sample from a normal parent distribution. We shall mention two qualitative graphical of the goodness of
tive tests
sion here
is
fit
tests
and then two quantita-
of the normal curve. Although the discus-
normal model, the general methods of the any model.
specific for the
tests are applicable to
Graphical comparison of frequency curves. The observed frequency measurements is plotted with the normal-
distribution curve of the n trial
i.e., with each observed frequency divided by n. Then and the standard deviation 5 are computed. The model value of// is taken equal to m. The value of the model index h is obtained from a which, in turn, is taken as given by Eq. 2-22; then,
ized ordinate scale,
the
mean
m
sVl With plotted
\
n
I
/a and h known, the normal frequency curve is calculated and on the same graph paper as the experimental curve. The normal
A visual is of course centered about the experimental mean m. comparison of the experimental points relative to the normal curve affords the first test of goodness of fit. Figure 4-7 shows a typical example of the graphical comparison of an experimental histogram and the fitted normal curve. This comparison is sometimes extended so as to express the discrepancy as the percentage of "excess" or "deficiency" of the normal curve at each curve
experimental value of x or at the center of each classification interval.
By
this
extension the test becomes, in a sense, quantitative.
age discrepancies are large, the but
if
fit
the discrepancies are small,
of the model curve
is
If the
percent-
obviously poor;
we need some further arguments to help may be merely the fluctuations
us decide whether or not these discrepancies to be expected in a sample size n
normal model the x 2
distribution.
from a parent population of the assumed
The additional arguments
are
made
later in
test.
Graphical comparison of cumulative distribution functions: probapaper. The second qualitative test compares summations of the observed values with corresponding integrals of the normal curve. bility
The observed deviations order of
size,
z
with respect to the
mean
m
the largest negative value at the top of the
positive value at the bottom.
The
are listed in the
list
and the
largest
entire range of observed deviations
is
Probability
182
Fig. 4-7.
Normal curve
and Experimental Errors
fitted to
in
Science
an experimental histogram.
Fig. 4-8. fit.
Ogive curve for goodness of
Normal
183
Probability Errors
divided into
M intervals, where M
sary that
intervals be of the
all
about 10 or 20 or
is
same
size;
in fact,
so.
It is
not neces-
make
usually best to
it is
extreme ends of the range of deviations.
relatively large the intervals at the
No interval should be so small as to contain only one or two observations. •••,/,•• •, M. •, j, These intervals are numbered consecutively 1, 2, They'th interval has (/obs); observed values in it, and, of course, 2j^i(/obs)i •
=
the
n,
plotted,
number of measurements
in the set.
where
= 2 = ;
z l is
the deviation reckoned
the /th interval, large
This
/.)
is
The
Now the
points y ohs vs. z l are
i
2/obs
and
•
(z, is
C/obs),1
from the mean
large negative for small
(at z /
and
= is
0) to the center
of
large positive for
plot consists of points such as are illustrated in Fig. 4-8.
called the ogive curve.
frequencies normalized,
It
convenient to have the observed
is
divided by n;
i.e.,
in this case the
normalized
to 1 ordinate scale (fohs)jln goes from Then, on the same graph paper, the corresponding quantities from the fitted
normal distribution curve are plotted. These quantities are Z
yth
= JL\ e~^
dz
yJTT J -oo
where z is a continuous variable reckoned from the experimental mean m, and the parameter /; is also determined from the experimental measurements as stated above. Comparison of the experimental points with the theoretical curve allows a qualitative test of the goodness of fit. This comparison attempts to minimize the rapid fluctuations and to
show trends of agreement or disagreement over extended regions of the distributions. But in the tail regions, where our concern is often most acute, the ordinate scale
is
too crowded, and,
not satisfactorily sensitive in these regions.
way
scale is stretched in such a nonlinear
a straight line, then the test stretched ordinate scale
about the y
=
=
0.5 line
much
consequence, this
However,
better.
is
test is
the ordinate
if
that the y xh vs. z curve
becomes
Graph paper having such a
called probability paper;
is
and
is
in
it
is
symmetrical
from The comparison
linear in units of z\a in both directions
Probability paper
Fig. 4-9.
is illustrated in y between the observed points and the normal curve on probability paper
0.5.
can be made readily
in the tails
of the curve.
Probability paper can also be used conveniently to determine the
experimental values of the
from the center and the
fit
Skewness and of the normal
mean
m
and the standard deviation
s directly
slope, respectively, of the best-fitted straight line.
kurtosis.
Useful quantitative tests of the goodness of
distribution can be
made by comparing
the numerical
Probability
184 99.99
and Experimental Errors 50
P 6d
o
in
Science 0.01
Normal
Probability Errors
185
of about 30 measurements. The χ² test gives a single numerical measure of the over-all goodness of fit for the entire range of deviations.
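Once the observed and model frequencies for the classification intervals are in hand, the single numerical measure defined by Eq. 4-26 below is a one-line computation; a minimal Python sketch (the frequencies shown are illustrative placeholders, not data from the text):

```python
# Observed and model (theoretical) frequencies per classification interval.
# These numbers are made-up placeholders for illustration only.
f_obs = [6, 11, 18, 22, 17, 10, 7]
f_th  = [5.2, 11.9, 19.6, 23.1, 18.0, 9.4, 5.8]

chi_sq = sum((fo - ft) ** 2 / ft for fo, ft in zip(f_obs, f_th))
print(f"chi-squared = {chi_sq:.2f} for {len(f_obs)} intervals")
# The value is then referred to a table such as Table 4-8 with the
# appropriate number of degrees of freedom (intervals minus constraints).
```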
minimum
The observed deviations intervals all be of the
that
it
least
same
but
size,
classified as in the
Again,
now no
it
is
procedure
not required that the
interval should be so small
contains less than about five measurements, and there should be at
about
(fobs)j
and
are ordered
for the ogive curve described above.
m
six
In this test, the observed frequency
or eight intervals.
the interval (Az) 3
compared with
is
the theoretical
model value
(/th ) ; (Az) corresponding to the center of the interval. (/th is given by the product of the model probability and the number of trials n.) If the interval 3
is
that/th cannot be assumed
so large
to be constant throughout (Az) 3
,
the frequency function should be integrated over the range of the interval,
but this
usually not necessary except in the
is
first
and
last intervals.
To maximize the statistical efficiency of the test, the square of the difference,
-
i-e-, [(/th),
quantity x
2
is
2
(/obs),]
>
is
and
taken,
2
x
this quantity is divided
by (/th),. The
sum
defined as the
= y 3
[(/°bs)j
=1
~
(f^)i]
(A-26)
(/th),-
Exact fit of the model or theoretical frequency curve to the experimental measurements would correspond to x 2 = 0, an extremely fortuitous event in any real-life situation because of statistical fluctuations even if the model fits perfectly the parent distribution of which the actual set of measurements Increasingly large values of x 2 involve the probability arguments concerning the question of what statistical fluctuations are reason-
is
a sample.
able even If the
may be
if
the
model
model does
distribution
fit
is
perfectly.
simply a uniform
2
X
= 2 = 3
in case
(flat)
distribution, Eq. 4-26
written as
no attempt
is
made
to
(Xj 1
~
m)2
m
group the n measurements into
(4-27)
intervals.
In a general view, the x 2 test determines the probability that a purely random sample set of measurements taken from the assumed model
show better agreement with the model than is shown by the actual set.* The type of arguments involved is essentially the same as was encountered in Section 3-3, in the / test and in the F test for consistency. Also, the parent distribution would
The x 2
test was introduced by Karl Pearson in 1900, and it is often called Pearson's However, in the form of Eq. 4-27 it was first used by Lexis in 1877, and is sometimes called the Lexis divergent coefficient when written in this form. *
X
2
test.
Probability
186
arguments are those involved of maximum likelihood, and
and Experimental Errors
in the derivation
in
Science
of Eq. 3-96 by the method
are very similar to those
made
in the rejection
of a "bad" measurement and in the test of a statistical hypothesis. As stated before, these arguments are fundamental in almost every statistical interpretation for significance of actual data.
But, although essentially
arguments as they are stated now the same difficult to follow through the first and more are a little more complex and review if necessary, the general time. The reader should keep in mind, examples. "philosophy" of the previous In order to obtain the theoretical frequency values/,,, devoid of random intervals, we imagine an infinite number of fluctuations, for each of the trials or sets of measurements, all known to be sample sets from a multias in the previous instances, the
M
In each trial the
nomial model parent distribution.
outcomes
very large; this
is
large but finite
if
distribution, or
normal
/-
number
is
r in Eq. 1-30.
number of
possible
taken as generally
r is
the y 2 test concerns a model having a discrete frequency may be infinite if the model is continuous such as the In
distribution.
any
case,
/•
is
subdivided into the same
M intervals
grouping of the experimentally observed frequencies. Then, from Eq. 1-30 with n very large, the multinomial frequency for the center of they'th interval, (/t)l ) ; is computed. This computation requires knowledge used
in the
,
of the
M different values of the probabilities p, in the multinomial distribu-
knowledge is obtained from the experimental measurements, from the mean m and from the standard deviation s. By coupling the respective observed and theoretical intervals, we determine the frequency difference (/obs — /„,), for each interval, and then the value tion,
and
this
specifically
The subscript "obs" is attached to this value of from the theoretical value discussed next, which is based
of £* hs from Eq. 4-26. 2 y to distinguish
exclusively
it
upon the model.
2 the exclusively theoretical value of y i.e., the effect of purely fluctuations alone, we look mere closely at the parent multi-
To deduce random
,
nomial frequency distribution during its analytical formation, e.g., as number of trial sets of hypothetical measurements builds up and
the
becomes infinite. The multinomial probability p i for the same value as was used above in determining hs
^
may be determined
value of this probability, as
of
trial sets
of hypothetical measurements.
values about this mean.
We
shall not
prove
they'th interval ;
this
is
is
mean number
the
with a very large
But there exists a spread of it
here, but this spread itself
has essentially a normal frequency distribution. We make a single random theoretical trial set of measurements, then determine the difference
between
this
and the mean and then finally determine random theoretical trial set of measurements. The
random frequency value
in they'th interval
theoretical frequency value in this interval,
the value of y l for this
Normal
187
Probability Errors
Fig. 4-10.
2 x distribution for various degrees of freedom
v.
value of x 2 is, of course, also one member of a frequency distribution that is spelled out and takes on its equilibrium shape as the number of random theoretical trial sets of
measurements becomes
The derivation of the % 2 frequency
variations in each of the possible outcomes,
multinomial distribution.
infinite.
distribution i.e.,
is
in
made from each
M
the
normal
interval, in the
The arithmetic involved becomes
and
tedious,
approximations are made similar to those in our derivation of the normal distribution. These approximations are reasonably good if a parameter v (defined
below)
is
greater than about
5,
and
if
the
number of measure-
5. These conimposed for similar reasons to those placed on the use of the normal approximation to the binomial distribution, viz., np and nq each greater than about 5. v is related to M, as is mentioned presently.
ments
in
each classification interval
is
greater than about
ditions are
The expression
for the x 2 frequency distribution is*
/(rw), =
(#^ 2
2
\\v
-
d(x
2
(4-28)
)
1)!
The form of this expression was derived in Section 3-5 where R 2 (= 2w) is written in place of x 2 and n in place of v; see Eqs. 3-95 and 3-96. The shapes of x 2 distributions for a few values of v are illustrated in Fig. 4-10. As stated above, the significance of the actual value of ;q, s is found in a
comparison with the theoretical value of x 2 f° r the appropriate * if
Note the close similarity of this expression to that for the Poisson distribution 1 and £t> ^> 1, the approximation to a Gaussian is also very good.
2 x ;>
This
v.
;
and
188
Probability and Experimental Errors
comparison is made with the particular value x under the x 2 frequency curve in a specified way, (or 95 to
5).
We
say that
^
bs is 2
for which the probability P(x
2
>
Science
which divides the area
e.g., in
"unreasonable"
in
the ratio 99 to
1
greater than x 2 '% (or 5%). By
if it is
2
) is less than "unreasonable" we mean of course that the mathematical model used in computing xlus probably does not "reasonably" fit the actual measure-
ments.
And
x
note that, for a given value of
a large probability
P
v,
a small value of
as thus defined.
Table 4-8. Values of Xc 2 where P
=
2
f( x
)
d{y})
^
bs
means
Normal Probability Errors
189
number of theoretical frequencies considered number of experimental values. This constraint, viz.,
the fact that the n,
the
i
fobs l
expressed as
= 2 = j
/th
=
n
1
A second constraint is introduced when model frequency curve. This constraint is in the model distribution be equal to the the condition that
inherent of course in the % 2 we locate the center of the is
limited to
M
M
2=
is
test.
/j,
mean
experimental
A
value m.
third constraint
is
introduced
when we
deduce and use the universe value of the standard deviation from the
These three constraints are usually all goodness of fit of a model distribution, and, if
experimental standard deviation. that are
made
in testing the
=
M—
3.
so, v
However, if the total number of measurements is not sometimes worthwhile to impose a fourth restraint, one in and the interval size are so chosen that the number of measureeach interval is constant and nearly as large as M. This condition
very large,
which ments
M in
it is
allows about the greatest validity of the % 2 approximations, viz., that v be greater than about 5 and that the number of measurements in each interval be greater than about
7.
The
interval sizes
however, without introducing a constraint
no
if
can be adjusted,
size is influenced
by any
equation or approximate relation involving the experimental values.
As an example of the % 2 of
Table 4-9
light.
test,
consider 233 determinations of the velocity
the frequency
lists
obs
the respective classified deviation interval.
measurement occurs
in
In this case the deviation
is
that a
reckoned for arithmetic convenience from an arbitrary reference value of 299,000 km/sec, although the origin of the normal distribution curve placed
deviation in
the
at
mean, 299,773.85 km/sec.
14.7 km/sec.
is
Table 4-8 to a
P
The .^ bs value of 29.10
for v
=
13 corresponds
probability of only about, 0.005, and, since this
we
is
The experimental standard is
less
normal distribution is not a good However, if those intervals containing a fit to the actual measurements. small number of measurements are grouped together (as indicated by the braces in Table 4-9), reducing v from 13 to 8, the #obs is 18.52 and the P than, say, 0.01 (or 0.05
probability
is
larger,
does not
fit.
prefer), the
about 0.018. This
of the approximations the normal curve
if
may
made
latter value
in the derivation
be said to
The formal fit may
fit
—
is
more
reliable in
view
of the/(;r) distribution, and
at least
we cannot be
very sure
possibly be even further improved by
it
more
This example emphasizes the arbitrariness of the an unambiguous "yes" or "no" answer is not possible. Finally, it must be pointed out that the x 2 test can be applied in the test of the goodness of fit of any type of mathematical model of probability. The/th of Eq. 4-26 must be calculated, of course, on the basis of the model
appropriate grouping. criterion of
fit;
Probability and Experimental Errors in Science
190
Application of the %z Test of Goodness of Fit of the Normal Distribution to Measurements of the Velocity of Light
Table 4-9.
Deviation
Normal
Probability Errors
191
arranged into different groups of intervals either by their relative size or by their total number. Indeed, illustration of this fact was just given in the velocity of light example.
This
the measurements themselves
1
partly because of the
random nature of
partly a consequence of the approxi-
the derivation of the x 2 frequency distribution. largely for such reasons that we set the "rejection ratio" so low, say
mations that are made It is
is
and
in
or 5 out of 100, instead of near the ratio corresponding to the standard
deviation, about 32 out of 100. Statisticians
of
fit,
4-9.
have developed additional quantitative
tests
of goodness
but they are seldom used by investigators in experimental science.
Conclusions
There are
many
direct trial
measurements
in
experimental science that
are fitted "rather well" by the normal (Gauss) distribution. These include
the host of measurements that differ one from another in an essentially
continuous manner as a consequence, apparently, of a large number of small elementary errors.
The
fit
is
typically less
good
in the tails of the
presumably due to the fact that the idealized set of conditions on which the normal distribution (central limit theorem) is based is not quite realized in real life. However, to the extent that the fit This
distribution.
is
satisfactory,
is
i.e,
that the parent distribution
form of the distribution function allows the probability that any measurement,
(a)
is
normal, the analytic
very convenient predictions of
either past or future, will have a
value within a specified range, (b) simple and convenient rules for the
propagation of errors
in a derived
or computed measurement,
(c) rules for
assigning weights to measurements, (d) convenient equations for curve fitting, etc.
Such predictions, calculations,
etc.,
are of such great conven-
ience in the statistical interpretation of measurements that there
is
a rather
strong tendency for scientists to accept uncritically the normal distribution as the ansver to their prayers.
A
loud note of caution must be sounded
typically not very
good
But
is
in
pointing out that the
fit
is
Almost any set of trial measurements is generally bell shaped in the central region, and if interest in the statistics of the set is not very quantitative the normal approximation suffices. if
the interest
in the tails.
precise or if the
tail
regions are of special concern (as
"bad" measurement"), a specific test of goodness of fit must be made and the reliability of the normal approximation judged accordin rejecting a
ingly.
In addition to its degree of quantitative its
use therein, the normal distribution
is
fit
to direct
measurements and and valid
the one of most general
use in statistical theory in dealing with certain parameters of empirical distributions.
This application of the normal distribution has been noted
Probability
192 in the first
now add 4-10. 1.
2.
this
in
chapter; to this earlier paragraph,
the x 2 test for the goodness of
Science
we must
fit.
Problems
Show
points at
paragraph of
and Experimental Errors
z
that the curve of the
—
normal distribution formula has
inflection
±a.
The times recorded by 37 observers of
to the nearest 0.1 sec as follows:
Observers
a certain
phenomenon
are classified
Normal Probability Errors
193
As an example in the normal curve approximation, suppose that the marksman will hit a target is ^ and that he takes 12 shots. Compare the binomial probability with the normal probability that he will score (ans. about J % discrepancy) (a) at best 6 hits, and 6.
probability that a
(b) exactly 6 hits. 7.
with h
—
0.447 reciprocal seconds, assume a normal distribution and find
(a) the probability
1.0
about 5 % discrepancy)
(ans.
In a series of observations of an angle taken to tenths of a second of arc
and
1.1 sec,
that the next observation will have an error between
and (ans.
depends on interpretation of "between", e.g., 0.0204) than ±3 sec.
(b) the probability that the error will not be greater
(ans. 0.9421) 8.
If \\h is
2ft and the least count
is
in.,
1
what
is
the probability that 3
randomly chosen measurements, regardless of the order of their taking, will have deviations of 8 in., 16 in., and —4 in.? What is the probability if the order is specified? Assume a normal distribution and assume that the mean is at the center of the least count interval.
[ans. P(8)
=
i>(_4)
Show
=
0.023,
0.022, i»(16)
P^Pa =
8.1
=
0.016,
x 10~ 6 ]
normal distribution is equal to 3. quoted for the rest mass of the electron m =9.1154 x 10" 28 (1 ± 0.00018) g, of which ±0.00018 has the significance of a fractional probable error. Determine the probability that the value quoted (a) is correct to within 0.0005 x 10~ 28 g, (ans. 0.162) 9.
10.
A
(b)
is
(c) is
that the kurtosis of the
value
is
correct to within 0.00010 x 10~ 28 g,
and
(ans. 0.0325)
not correct to within 0.001 x 10~ 28 g.
11. In a breeding experiment,
it
(ans. 0.682)
was expected that ducks would be hatched
in
duck with a white bib to each 3 ducks without bibs. Of 86 ducks hatched, 17 had white bibs. Are these data compatible with expectation? Do the observations prove that the expectation was correct? 12. Should the last recorded observation in the data listed in Problem 2 the ratio of
1
be rejected according to Chauvenet's criterion ? 13. (a)
Compare on
distributions for B(k; (b)
Why
14.
From
probability paper the binomial and normal probability 100, 0.3)
and G(z;
hi)
as listed in Table 4-2.
not practical for a bookstore to stock probability paper for other model distributions than the normal ? is it
past experience, a certain machine properly operating turns out
items of which 5
%
are defective.
On
a certain day, 400 items were turned out,
30 of which were defective. (a) If a
normal distribution
is
assumed
ordinates of the plot of the distribution (b)
What
is
in this
problem, what are the co-
?
the probability that the machine
was operating properly on
this
day? According to Mendelian inheritance theory, certain crosses of peas should and green peas in the ratio 3:1. In an experiment 176 yellow and 48 green peas were obtained. 15.
give yellow
Probability and Experimental Errors in Science
194 (a)
Do
these
conform
to theory?
peas conforms to the theory
if it is
Assume
that the observation of 176 yellow
within 2a of the expected value. (ans.
(b)
Show
that about 95
%
of the normal area
bounded by
is
conforms)
2a.
2 16. (a) Apply the x test to the fit of the normal curve to the following 500 observations of the width of a spectral band of light:
/obs =5 /th =5
12
43
61
105
103
89
54
19
7
2
14
36
71
102
109
85
50
21
7
2
2/obs 2/th
Here/th denotes the fitted normal curve frequencies obtained by mean and the standard deviation from the actual measurements, (b) What is the significance of the difference (2/ h — 2/obs )?
= =
500 502
estimating the
t
How
would you determine whether or not 100 given measurements (each measurement expressed, e.g., with 5 significant figures) (a) are approximately random, and (b) fit as members of a normal (Gauss) distribution ? 17.
"Lest
5
men
suspect your tale untrue,
Keep probability
in
view."
JOHN GAY
Poisson Probability Distribution
5-1.
Introduction
In the preceding chapter the
normal (Gauss)
distribution, Eq. 4-9,
was
discussed as an approximation to the exact binomial distribution, Eq. 1-20.
A more or less paralleling discussion is to be made for the Poisson distribution, so First,
we shall quickly review the line of argument involved. remember that the normal distribution plays a role of
fundamental importance application to
many
sets
in
statistical
theory in addition to
of direct measurements.
This
is
its
great direct
not so for the
Poisson distribution, and our discussion in this chapter exclusively with applications to those sets of direct satisfy
The
reasonably well the Poisson conditions. algebraic expression for the
the particular conditions that the
and
is concerned measurements that
normal distribution was derived under
number n of Bernoulli
trials is
very large
p in each trial remains constant. The first practical advantage of the normal approximation in dealing with direct measurements is in the greatly simplified arithmetic when n is large the factorial numbers, the fractions raised to high powers, and the tremendous algebraic summations are avoided. But the most significant advantage in that the success probability
—
dealing with direct measurements
and q appear
first
that the Bernoulli parameters n, p, in the product np, the location value, and then in
the triple product npq.
is
This triple product, generally considered to be the
only parameter in the normal expression, distribution,
i.e.,
npq
=
a
2 .
An
is
equal to the variance of the
estimate of the parameter a
from the standard deviation of the
set
significance of the individual Bernoulli parameters
unidentified
and
is
obtained
of actual measurements; and the is
then relegated to the
little-understood elementary errors that are believed to be 1
95
Probability and Experimental Errors in Science
196
unavoidably present with different net ments.
effects in successive trial
With a evaluated, the simple normal formula
measure-
of inestimable
is
aid to the investigator in designing his experiment, in allowing predictions
of the probability of future measurements, and in judging the "reasonableness" of past measurements.
The Poisson
distribution
may
an approximation to
also be derived as
Again, a single parameter
the binomial distribution.
is
involved whose
direct experimental evaluation, without regard to the values of the separate n, p, and q, allows very useful application in the measurements and in the design of experiments. In this case, however, we can often recognize the basic Bernoulli trials and evaluate n, p, and q from them; but often these basic trials remain hypothetical, as they are in the normal case. When the basic Bernoulli trials are recog-
binomial parameters
analysis of
nized, their characteristics
may justify
Poisson formulas; otherwise a
test
immediately the application of the
of goodness of
fit,
such as the x 2
test,
must be made.
Rare events.
The Poisson approximation holds when
the following
three conditions in the binomial distribution are satisfied: (1)
//
very large, infinite in the limit,
(2)
p
very small, zero in the limit, and
(3) the
product np moderate in magnitude,
Thus, on the average,
many
called success appears.
known
i.e.,
that np
< Vn.*
Bernoulli trials are required before the event
For
this
reason the Poisson distribution
is
often
as the formula for the probability of rare events.!
There are statistical
many examples
analysis,
of rare events for which we wish to
make
and arguments
as to
predictions of future events,
As illustrations, we may mention such classical problems as the fractional number of soldiers who die each year from the kick of a mule, the number of atoms that spontaneously decay per unit time in a reasonableness.
man of age 25 will die at uncommon noncommunicable disease
radioactive sample, the chance that an average
a specified age, the incidence of an (such as polio) and
its
response to large-scale vaccination treatment,
and the number of houses per thousand burned by fire per year. Typical rare-event problems are discussed in detail after we understand, first, the essential Poisson equations and, second, the order of magnitude of the errors involved in the approximation. *
k
A
better statement of the third condition
~ np, the
left
side need not be
much
less
is
that k 2
+
(np)-
<
n,
and then,
if
than the right side.
t The Poisson distribution is often improperly number of successes need not be small when n is
called the law of small numbers.
very large.
case of a "spatial distribution" of events, as pointed out
This
later.
is
The
generally so in the
197
Poisson Probability Distribution
5-2.
Derivation of the Poisson Frequency Distribution Function
The derivation of the Poisson function may probability equation, Eq. 1-20.
(Eq. 2-26),
we
B(k;
write Eq. 1-20 in
n,
P)
=
Noting that/? the form
start
+q
(jW- = "-f^M" -
2)
=
with the binomial 1
=
and that np
•••(«-
fc
+
[j,
1)
k>.
/ff(,
'H)H)-(<-^) k\
1
--
v
i
(-3(«-i)
11
-
-
(5 -°
-J Under
the Poisson conditions, viz., n very large,
product np of moderate magnitude, the unity,
and the
last
this exponential
power
first
p
very small, and the
fraction in Eq. 5-1
is
essentially
factor can be approximated as an exponential.
To show
approximation, write an exponential in the form of a
series,
^ = l+A + l + - + - + 2
0!
1!
2!
and write out the binomial expansion,
e
-- =1 nj
—J
+
3! e.g.,
4-i)er 2!
+
•••
: --
(5-2)
4!
from Eq.
4
1-21,
-;)(> 3!
± n'
-m
Probability and Experimental Errors in Science
198
where the sign of the last term depends on whether n By comparison, it is apparent that B
lim
(
1
«/f) nl
n-oo \
With
the
first
=1
2
3
2!
3!
even or odd.
is
_£ + AL _AL + ^---=e-" 1!
(5-3)
4!
and with the last factor becomes the Poisson prob-
fraction of Eq. 5-1 written as unity
as an exponential, the binomial probability ability, viz.,
P{k;
^—
t
x)=
f
(5-4)
k!
Equation 5-4
is
called the Poisson frequency distribution or "density"
Note that
function.
it
contains only one parameter,
The Poisson cumulative distribution function following sum at the desired value of k:
a-
=o
fc=o
shown
It is
!i
1
is
—
;
e~
for ll \
n\
presently that this sum, as written,
probability for observing fxe~
2!
1!
fc!
two
successes,
no "success" (/u
2
e~'')l2\;
is
is
simply e for
etc.;
forexactly k successes, Eq. 5-4;
/x.
given by stopping the
is
etc.
equal to unity. - '';
The
for one success,
more than one
Stirling's
success,
formula, Eq. 1-14,
when k is greater than about 9. number N of trials in an experiment may be
often a help in evaluating k\
In actual measurements, a
made, each trial involving n basic Bernoulli trials. Equations 5-4 and 5-5 give the normalized probability or probabilities, as is expected of a model, and comparison of the calculations with the observed frequency or frequencies requires that either the observed values be normalized by dividing by TV or that the calculated values be multiplied by N. But note the difference between the definitions of
N and
of
n.
Shapes of Poisson frequency distributions. The shapes of the frequency distributions (or histograms) represented by Eq. 5-4 are essentially the same as the shapes of the binomial distributions for /; large and p small. (Binomial distributions are illustrated in Figs. 1-2, 4-1, and 4-2, and
also in
Problem 6 of Section
but approaches symmetry as
ju
1-9.)
The shape has
increases.
When
fx
a positive skewness, is
rather large, the
between the Poisson and the normal distributions; in this region we may generally approximate the binomial by using either Eq. 5-4 or Eq. 4-10, whichever offers the simpler arithmetic. The most probable value of A:, viz., k is generally less than the expectation value /x but never differs from it by more than unity. This was proved
distribution
is
in the transition region
,
in
a footnote in Chapter
double-valued,
i.e.,
1,
p. 32.
If (n
+
\)p
adjacent k values at
/x
is
equal to an integer, k
and
at
/x
—
1
is
have equal
Poisson Probability Distribution
Problem 14 of Section from Eq. 5-4 that
probabilities [see It
199 1-9].
readily follows
p< + '*
* = -jl.
fc
+
k
P{k\fi)
(5-6)
l
which indicates conveniently the rate with which P(k; everywhere in the distribution. Poisson
to
normal
distribution.
It
fi)
instructive
is
varies with
derive
to
k
the
normal density distribution from the Poisson equation, Eq. 5-4. To do this we may define the normal variable z in terms of the Poisson variable as k,ifk 1 and ft > 1
>
,
-k-
z
k ** k
-
-
(fi
(5-7)
\)
and by Eq. 5-7 Note we match in location the most probable value of/: with the normal maximum. The means of the two distributions thus differ on the average by \. If k > 1 and fi 1, the term \ is of no practical consequence (except for k very close to ft). With Eq. 5-7 we may write that the Poisson distribution
is
intrinsically discrete,
>
Wo +
+ Zp-"
//"O
= ?,
*); /*)
log, P((k
(kQ
+
,,
=
.*
+ z)\
=
z); fi)
Pfro;
a«)
(*b
P(k
log,
+
fi)
;
+
+ 2)"-(*b +
lX*b
log,
z
z
-
) (f \kJ
log, (l \
»)
+ f) k n
-
z
1+-
-•••-log, \
A:
and fi are very large compared with unity; by expanding the logarithms in a power series, Eq. 4-5, neglecting terms 2 and higher powers, we find [(ft — k )lk ]
In the normal case, k, k
+ 2);
log, P((k
/.)
,
«
log,
P(k
;
fi)
+
Z{fi
~
K)
- ^-^-
}
2k
k
Then, with the definition of
+
P((/c
By
z); //)
z in
w
terms of
P(fe
the property of normalization,
C
2=
e"
?2/2A'°
;
as given by Eq. 5-7,
ft
u>- j2/2i
»
/
C may
=C
e~
= Ce^2*
be determined as follows, z2/2k0
dz
=
1
fc
and, since q ph
1,
and by Eqs. 1-24 and
4-8,
£
1
/i
yj2TTk Q
yJlT
C = -=L= = -4
«s* 77/N7
=
1/2//
2 ,
(5-8)
Probability
200
and Experimental Errors
in
0123456789 Fig. 5-1. Binomial
and Poisson distributions for n
=
12,
p
=
10 \.
123456789 Fig. 5-2. Binomial
and Poisson distributions
for n
=
96,
p
Science
=
10 1/24.
Poisson Probability Distribution
We
201
conclude that
+ z);
P((/c
p) t*
P((fx
+ 2);
//)
^
G(z; h)
=
-£= e-*
v which
is
when Az
the
=
same
v
(5-9)
77
normal density function, or as Eq. 4-10
as Eq. 4-9 for the
1.
Normalization.
That the Poisson distribution is normalized, i.e., outcomes in the Poisson distribution is unity as n —> co, is assured by the fact that it is derived from the normalized binomial distribution, but this assurance depends upon the validity of the approximations made in the derivation. However, it is easy to show directly that the
sum of all
possible
that this distribution
T
n
P(k; u)
=
rt%
As
indeed exactly normalized.
is
n i
e~>
J fc
k
=
£-
L0!
n -> oo, the factor in brackets
lim n-»oo
5-3.
Errors
When
n and
£ + £ + £+£ + ...+-£"
£?""
=ofc!
Write Eq. 5-5 as
is
'
1!
'
2!
'
3!
' '
of the same form as Eq.
]T P(k; n) =
= e'^ =
n!
5-2.
Hence, (5-10)
1
fc
in
the Poisson Approximation
p
are both
finite,
made in using the relatively cumbersome but accurate of course, when the Poisson conditions
the errors
simple Poisson expression, instead of the
binomial expression, are significant, are poorly approximated. Table 5-1.
k
B{k; 5,|)
For example,
Errors
in
B{k;
if
n
=
12 and/?
=
\,
the errors
the Poisson Approximation 10, to)
B(k; 100,
j^)
P(k;
1)
Probability and Experimental Errors in Science
202
=
with np
11
=
in all three cases.
1
the Poisson approximation the order of 100 or rule of
thumb
stated as k 2
—
as (k
To
evident from these examples that
more and
if/? is less
most applications
than about 0.05.
This
but, as noted earlier, the condition for small errors
+
(up) 2
n.
n
if
is
is
of
good
a
better
is
can be seen that the relative error increases
It
fx)fn increases.
Precision Indices
5-4.
must
It is
satisfactory for
is
predict probability first
know
from the Poisson
relations, Eqs. 5-4
We
the magnitude of the parameter p.
method of maximum
5-5,
we
likelihood that in the binomial distribution this
parameter, the arithmetic mean,
mean of
that in the Poisson case
show
for practice let us
k values
all
We
equal to the product np.
is
assumed above, with perfect validity, and is also the arithmetic mean. Just from Eq. 5-5 that p is the mean. Let the arithmetic
and
have shown by the
have np
=
p
directly
m th
Eq. 5-5 be denoted by
in
.
Then, by definition,
lk = ^" I— = *~> 2 — 2— — n
n
n
m th = 2
=
kP(k; p)
e -»
i=o
=o
a-
k=i
fc!
H*
n
k
i
k\
k
_l
= i(k
1)!
(5-11)
where the lower
limit
sum
of the
changed from
is
presence of the factor k in the numerator (the
first
to
1
because of the
term, for k
=
equal
0, is
and where, in the last step, we divided through by k (k > 0) and substituted p p'~ l for p k * Then, as n -> oo, the sum is equal to the exponential e as may be seen when e is written as a power series, Eq. 5-2. to zero),
1
1'
',
Hence, Wth
=
lim n -»
'
2=
/,-
kp ( k
'>
/")
= e'^p =
(5-12)
p
o
This conclusion was also reached by the method of
maximum
likelihood
Section 3-1, Eq. 3-17.
in
Standard deviation. distribution, a
since q «*
1,
=X
npq.
We know Hence,
*
Eq.
The argument
gamma 5-1
1
is
in
is
sometimes made
binomial
that, in the
the Poisson distribution, a t&
and, by Eq. 5-12 and the relation
o
the
from Eq. 2-28
p
=
= ^p
that, since
X np
np,
(5-13) (
— 1)! = x
function, see discussion attending Eq. 1-16), the
equal to zero and that, hence, the lower limit of the
changed from to 1. But this argument would have through by 0, which is not allowed.
(which can be shown by
first
term of the
last
us, for the
last
sum may k
=
sum
in
as well be
term, divide
Poisson Probability Distribution
where the
show By
it
=
sign
2
= I (* - vfP{k; *=o A-
—
>-
strict
Poisson conditions. But we shall
definition,
=| since
used for the
is
directly.
o
«
203
(A:
2
^
2
2
P(Ar;
)
p)
=
= f W(fc; =
Zkp
+
-
pt)
/r)P(/<;
A«)
2
(5-14)
/<
fc
^£% =Q kP(k; p) oo
-
= 2 (k t=o
ft)
=
by Eq. 5-10.
by Eq.
/u,
and
5-12,
since
2fc=o^°(^
(Incidentally, this expression
A*)
=
f° r
1
one form of the
is
general relation, irrespective of the type of distribution,
a2 where k
2
mean of
the arithmetic
is
= ie-^
arithmetic mean.
This expression
and the next time
as Eq. 2-19.)
(5-15)
the squares,
first
appeared
Equation 5-14 can also be written with (k(k
k
=o
k\
fc=2
—
Because of the factor k(k
1) in
and
ju
in this
2
is
—
1)
+
(k
—
2)!
the numerator of the
the square of
book
as Eq. 2-12,
k) for k 2
first
;
then
sum, the lower
sum may just as well be written as k = 2, as is done in the second sum. Then the sum is equal to unity (for n —> oo) by the property
limit of this
of normalization, Eq. 5-10.
a
=
2
Note that the
2
ft
+
In conclusion,
ju
—
2 /li
=
and
ju
a
=
yjfi
(5-16)
and negative values of a have no simple interasymmetry of the Poisson distribution, a is an rms deviation having an asymmetrical graphical interpretation. The rms of all positive deviations is greater than the rms positive
pretation as actual deviations because of the
of
all
negative deviations.
To show by
the
method of maximum
likelihood that a
posed as a problem for the reader.
(It is
= V npq
in the
27 in Section 3-11, that a
Fractional standard deviation. the most
commonly used
The
shown by
this
= A up = \ /u is method, Problem
binomial distribution.) fractional standard deviation
is
precision index in those "counting" measure-
ments for which the Poisson model is a satisfactory fit and for which is rather large by the design or analysis of the experiment. The fractional jli
standard deviation in per cent fractional a in
is
defined as
%=
- x 100
= -^
x 100
(5-17)
204 It is
Probability and Experimental Errors in Science the simple inverse relation with
V/u that makes the fractional a so
popular.
When single
/u is
moderately large, not
measurement k s
(fractional a) of, say,
much
1
error
is
introduced by writing a
In this case, in order to have precision
in place of//.
%, we must observe k s
=
10,000 "successes."!
Standard deviation in a single measurement. Even when the single measurement is not large the question often arises. What is the precision in this single measurement? This question may be stated another way: If a highly precise value of m (sa //) is known, as from many trial measurements, what is the probability that the single measurement k s and [x will differ by a given amount? To answer this question, we may consider m, instead of the single measurement k s to be the variable in P(k m). Note that m is a continuous variable although k is discrete. The likelihood function L(m) for a single measurement k s is ,
;
Urn)
=
^—
(5-18)
kg.
Following the procedures of the method of lined in
Chapter
we
3,
log
/dm
maximum
likelihood as out-
write
L(m)
[log L(m)-]
=
m — m —
k s log
=-m
log k s
(5-19)
\
s
(5-20)
1
and
^ [logL(m)]=--^m dm 21
To
(5-21)
2 l
m* from Eq. 5-20, measurement k s we write
find the best estimate
from a
single
i.e.,
the
most probable value
,
^-[logL(m)] m=m .
=
m*
=
(5-22)
dm This result is
is
the one
we would
ks
(5-23)
expect for a symmetrical distribution, but
not necessarily the expectation for an asymmetrical distribution;
remember, our only knowledge
To |
is
from a
find the standard deviation in a single
Such a large k or
/<
the Poisson distribution
but
measurement, A ,. measurement, a,. we combine
single
,
does not invalidate the Poisson condition that up be moderate; is
not one of small numbers, only one of rare events.
205
Poisson Probability Distribution
Eqs. 5-21 and 3-25 with the reasonable assumption that the values of L(m) are normally distributed about
a*
=
1
m* (= k from s
Eq. 5-23):
Probability and Experimental Errors in Science
206
Probable error.
By
definition, the
probable error, either positive or
marks the + and — limits in any distribution such that the number of measurements between the respective limit and the mean is equal to the number beyond the limit, i.e., the 50% confidence limit on either side of the mean. The median value is always enclosed within these ±pe limits, and, in most distributions of interest, the mean value m or fi also lies somewhere between +pe and —pe. In a symmetrical distribution, the positive and negative values ofpe are equal in negative, viz., ±/?e,
is
Table 5-2.
that error that
Numerical Coefficients pe/o for Various
Values of
/<
the Poisson Distribution
in ft
20
peja
207
Poisson Probability Distribution
We
can also write
-L
(5-30)
the appropriate numerical coefficient from Table 5-2. the fractional probable error fractional pe k tv
^^
«
ak
±0.6745
P
x
K
with a similar qualification as to the magnitude of the numerical coefficient.
Skewness. The Poisson distribution is intrinsically skewed. The asymmetry or skewness decreases as the expectation value p increases. It is interesting to see the relationship between the skewness and p. By definition, the third moment about the mean is
f
(k
- pfP{k;
= SCfc3 -
p)
3k 2p
+
3/c/r
- pz)P{k\
p)
k=
= £/c 3P(/c; By
-
3pLk P{k\ p)
we
—
separate the quantity k{k
1)
sum from the k property of normalization. Then the
change the
first
term
in the
p*
+
z 3p ZkP(k; p)
+
- p3
in deriving the expression for a,
same arguments as were used
the
Eq. 5-16, the
p)
2
3SA: 2 P{k- p)
-
—
(k
=
2) out
of the
first
term,
k = 3 term, and use term becomes
to the
first
2LkP(k;
p).
We make use of the relations that Hk P(k; p) = a2 + p2 from Eq. 5-15, that a 2 = p from Eq. 5-16, and that T,kP(k; p) = p from Eq. 5-12. Substi2
tuting these quantities in the expression for the third
moment about
the
mean, we get
|=
(k
- pfP(k;
p)
=p
(5-31)
A
Finally, the skewness,
divided by a 3
by
definition,
is
the third
moment about
the
mean
,
skewness
= l (K-M^k:,) = JL = ±_ = a k=o p
N
p
i a
(5 .32,
a very simple relation that holds for every Poisson distribution.
The derivation of
the expression for kurtosis
is
assigned to Problem
11, Section 5-10.
5-5.
Significance of the Basic Bernoulli Trials
Since P(k; p), Eq. 5-4, contains a single parameter, p, each of the p is known. In the derivation
Poisson probabilities can be calculated once
of P(k;
ju), p is taken as equal to the product np, where n and/? are the parameters in the binomial distribution and therefore are characteristics
of basic Bernoulli
trials.
However,
tional purposes whether or not
//
it is
of no consequence for computa-
and/? are separately known. (However,
208
Probability
note that knowledge of n and
and Experimental Errors
p may
in
Science
establish immediately the appli-
\i by m from a and then to use Eq. 5-4 or 5-5 in predictions and in calculations of precision and of reasonableness.* This, however, presumes that the Poisson model satisfactorily fits the actual measurements. The question of goodness of fit is discussed
cability of the Poisson model.)
rather large
number of
It suffices
to approximate
actual measurements by Eq. 2-2,
later.
In practice, Poisson problems can be divided into two general classes: (a)
sampling with replacement and
In the are
first
class there
presumed
limited.
to
is
come;
no known in the
(b)
sampling without replacement.
limit to the supply
second
class, the
from which events
supply
is
known
to be
In sampling with or without replacement, the experiment or
measurements must be such that the basic success probability, or the Poisson probability for a given value of k, must remain constant for
all
samplings.
An example of the first class is in the number k of defective
screws that a
machine turns out in a given batch of known size n if the average performance of the machine is known to be ju (^ m) defectives per batch of this size; p (= /////) is the probability, assumed to be constant, that any one screw will be defective. We do not inquire into the machine factors that make a screw defective it is presumed that the machine never changes in its performance in this regard. It is also presumed that our knowledge of n and p is such as to justify the assumption that the problem is indeed Poissonian. Another example of this class is in the number k of cosmic rays from outer space that appears in a specified solid angle in a specified time interval at noon of each day, or in the number k of X rays emitted per second from an X-ray tube under steady operation. In this example, the basic Bernoulli parameters n and p are not a priori known, but a special argument can be invoked to show that the parent distribution is satisfactorily Poissonian. This argument is the one of "spatial distribution" to be elaborated presently. A Poisson example of sampling without replacement is the classical one of radioactive decay the number k of atoms in a given radioactive specimen that decays in a specified time. To be Poissonian, this example must include the specification that the lifetime for decay must be long compared with the time interval of observation, and the number of atoms in the specimen must be rather large. In this example, the basic Bernoulli trial is whether or not a given atom decays during the interval of observation. certain
;
—
made in Chapter 4 regarding the normal Gauss distribution approximate the normal mean // (for the position at which 2 = 0) and the parameter h from the experimental measurements. The parameters /;, p, and q are not *
it
This argument was also
suffices to
separately evaluated.
209
Poisson Probability Distribution Clearly, the supply of possible events
of atoms in the specimen.
is
limited, viz.,
is
the total
number
These and other examples are discussed
in
detail later.
Two mechanical analogs. It was stated or implied in Sections 2-7 and 4-3 that the distinction between "success" and "failure" in a Poisson problem may be thought of as due to an unfathomable array of elementary errors. According to this view, a mechanical analog of "success" and "failure" is illustrated in Fig. 5-3. This mechanical model is a simplification of the one used for the normal distribution, Fig. 4-4, and the interpretation ball as
it
is
In both cases, however, the deflection of the
quite different.
encounters a pin corresponds to an elementary error.
By the relative size and position of the "success" bin, only a small number k of the total number n of balls dropped manages to reach this bin. The Poisson probability p refers to the chance that a given ball will do so. This probability is constant if the (hypothetical) pins and their array are appropriately designed. If the n balls are gathered up and again
•
• •
•
•
Fig. 5-3.
A
•
•
•
•
possible mechanical analog of "successes"
probability problem. after n balls
•
••••• ••••••
• •
• •
A
and "failures"
single trial consists in observing
have been dropped from the nozzle at the top.
k balls
in the
in
a Poisson
"success" bin
Probability
210
and Experimental Errors
in
Science
Lasassa 'Successes"
k
k<£n Another possible mechanical analog of "successes" and "failures" in a The angles of bounce of the dropped balls are random. A large number of ball drops gives only one Poisson measurement, viz., the number A' of balls Fig.
5-4.
Poisson problem.
in
the "success" bin.
Poisson Probability Distribution
211
dropped through the nozzle, as another Poisson trial, a generally different number k of successes is observed. A graph of the number of times each to n, as the number of Poisson possible value of k is observed, from
N
trials
increases indefinitely, gives the Poisson distribution.
geometry, k 2
+
[By the
(np) 2 << ».]
In this analog, each dropped ball, not each pin, corresponds to the basic
Bernoulli
trial.
And
a single
elementary error Bernoulli It is
trial,
(i.e.,
measurement
the
is
In the analog for the
position of a single ball.
number
normal
k,
not the final
distribution, each
the deflection at each pin) corresponds to a basic
but not so in this Poisson analog.
instructive to consider another possible mechanical analog for the
one
Poisson case.
This
into a
deflection
single
is
in
which
all
uncertainty.
the elementary errors are
lumped
Consider the model shown
in
Fig. 5-4.
dropped irregularly one by one from and bounce off the rotating rough central pin. The pin is rotating and rough so as to insure that the angles of bounce are essentially random. On one side of the cylindrical box is a hole, somewhat In this model, n identical balls are
the nozzle so as to strike
larger than a ball, leading to the "success" bin.
and the radius of the
cylindrical
box
By having
this hole small,
large, the "success" probability
is
small.
This model, with a large
ment
is
number
n of balls dropped, illustrates only one
relatively small number k number n of basic Bernoulli trials. If the experirepeated with the same number n of basic trials, the value of k
measurement, of "successes"
will in general
viz.,
the
measurement of the
in a large
be different;
if
the experiment
is
repeated
N times,
many
The Poisson frequency distribution is the graph of the number of times each k value is observed vs. the k values themselves as p -> and TV— oo. [Again by the geometry, k 2 + (npf <«.] different
5-6.
k values are observed.
Goodness of
Fit
Before the Poisson equations
may
be properly applied to calculate the
probability of the value of the next measurement, to assess the precision,
or to judge the reasonableness of past measurements, the investigator must
be assured that the measurements are members of a parent distribution that
is
satisfactorily Poissonian.
There are two general ways of determining the goodness of fit of the Poisson distribution, viz., by a priori analysis and by test. Analysis of the experiment usually allows the investigator to see whether or not the conditions of the basic Bernoulli
trials,
hence, of the Poisson
Probability and Experimental Errors in Science
212
approximations, are satisfactorily met. These conditions, to recapitulate, are: (1)
only two possible outcomes,
(2)
each outcome determined entirely by chance,
number of
(3) the
product
each basic
k, in
trial,
basic trials, n, very large,
(4) the success probability, p,
(5) the
k and not
viz.,
constant and very small, and
+
such that k 2
tip
(tip)
2
n.
In experiments in which the Bernoulli parameters are directly recognizable,
or are seen to be adjustable as
a spatial distribution (examples are
in
discussed presently), the analysis for goodness of priori that the parent distribution
infrequently indicates
it
is
case in which for any reason there
necessary to carry out a
The
qualitative
is
and
In the transition case, or in a
a serious question of goodness of
fit,
test.
and quantitative
tests
described in detail in Section 4-8
method
for the normal distribution are also applicable in
case or for any model distribution.
for the Poisson
Discussion of these tests will not be
Of them, the quantitative y 2 test is number of trial measurements.
repeated here.
often indicates a
to be in a transition region between binomial
Poisson or between normal and Poisson.
it is
fit
satisfactorily Poisson, but also not
generally the best but
requires the largest
Spatial
distribution.
problems,
let
argument, (infinite
Before
proceeding
examples
to
of Poisson
us elaborate the argument of spatial distribution.
many
By
this
experimental problems of sampling with replacement
supply of events) are recognized a priori as being satisfactorily
Poissonian.
Consider a problem in which events occur randomly along a time axis, and consider a unit time interval /. (In this case, "spatial" refers to time; in
other cases
it
may
terms of Bernoulli
time interval.
refer to space.)
trials,
each
trial
In each observation,
and "one or more events"
is
We may
first
analyze the problem
in
being an observation during the unit
"no event"
is
declared to be failure
declared to be success.
We may
repeat the
observation for TV successive unit intervals of time. The observed distribu-
numbers of successes is represented by the binomial distribution NB(k' ri p'), where p is the success probability and k' is the number of "one or more events" observed out of ri Bernoulli trials. In most problems, we are not satisfied with success defined as "one or more events"; we wish it to be only "one event." We may then imagine
tion of ;
',
the size of the time interval reduced until the probability for observing
more than one event Suppose
that, in
doing
the time interval in the
in the this,
we
reduced time interval
is
negligibly small.
divide the unit time interval by an integer n;
new Bernoulli
trial is
now At
=
l/n.
Since division
Poisson Probability Distribution
213
by n is purely imaginary, we may make n as large as we please. As n becomes very large (infinite in the limit) the probability p n that At contains even one event
is
very small (zero in the
limit).
We now have
n imaginary
Bernoulli trials that by definition in the limit satisfy exactly the Poisson
Then, for each of the
conditions.
TV actual observations, the probability
/ is B(k; n,p n ) = and the predicted frequency of observing exactly k events in N trials is NP(k; ju ). The point of special significance in the spatial distribution is that the basic Bernoulli trials on which the justification for the Poisson approximation is based are hypothetical trials having no actual
for observing exactly k events in the unit time interval
P(k;
fj, t ),
t
measuremental
significance.
t, we may start with any arbitrary time interand then reduce the Bernoulli time interval by dividing by nT instead of by n. The argument to be followed is the same as before except that now, to have the same success probability p n we have Tn basic Bernoulli trials. The Poisson predicted frequency distribution is written with T/u instead of ju viz., NP(k; T/u ). Then we may write /u for Tfi
Instead of the unit interval
val T,
,
t
t
.
,
t
important to realize that
It is
(x
t
is
t
a physical constant determining
oin or of measurement during the unit
the density of events along the time axis independent of the value
nT.
(x
is
t
the expectation value in an actual
time interval,
/u,
the expectation value during the time interval T,
determined from the viz.,
ph
fx
m=
N
Yufk k\N.
is
experimental measurements, each over time T,
For convenience
in
choosing an appropriate T. This adjustment
practice,
is
we
adjust
fj.
by
merely a k scale factor of
convenience. 5-7.
Examples of Poisson Problems
Perhaps actual examples provide the best way
(a) to
become acquainted
with the a priori analysis in determining whether or not a given problem Poissonian, and (b) to
become
is
better acquainted with the Poisson expres-
sions in general.
Deaths from the kick of a mule. tioned in
almost
all
This
is
a classical example men-
textbook discussions of
Poisson
probabilities.
Records of the number of soldiers dying per year from the kick of a mule were kept for a Prussian army unit of 200 men for a time of 20 years. It is assumed that this army unit was kept at full strength and at essentially the same routine activities throughout all of the 20 years. The records for one year are shown in Table 5-3. The mean or expectation value m (en fi) 200) observations is given by the ratio (Z/*A:)/(E/ ) = 122/200 and the Poisson probability can be readily computed. This probability, multiplied by N, is also listed in the table. Our first conclusion is that by direct comparison of the experimental and the model frequency
from
=
N(=
0.61,
fr
214
Probability Table
k
5-3.
and Experimental Errors
Deaths from the Kick of
fk
a
Mule
NP(k; 0.61)
in
Science
Poisson Probability Distribution
or a mule dies he or
it is
21
replaced immediately by another
insofar as soldier-mule interaction
is
who
is
equivalent
concerned, or else the army unit
is
large that one (or a few) deaths has a negligible effect in changing
so
pn
;
argument of spatial distribution we must have sampling with replacement (or a very good approximation thereto). Consider next as a basic Bernoulli trial the observation of one "statistically average" soldier, instead of the army as a unit, for a period of one year. In this case, the number of possible outcomes per trial is decreased effectively to only two, viz., one death or no death; the number of trials n is increased from one by a factor about equal to the number of average soldiers in the army; and p is reduced to a satisfactorily small value. p n is kept constant in this case by the mere assumption that all soldiers, in regard to mule contacts, are equivalent. Replacement of casualties need not be immediate but if not, to approximate the Poisson conditions, in order to use the
it is
required that the lifetime of a soldier be long,
army
unit be large.
This analysis
In discussing this problem,
is
we
i.e.,
that the size of the
the best of the three tried.
first
considered the simplest possible
Then we explored the and found that, along the time axis, the Bernoulli probability p n could not be assumed to be constant. Finally, our best analysis was made in terms of the basic Bernoulli trial as the
basic Bernoulli trial
argument of
and found
unsatisfactory.
it
spatial distribution
observation of a single soldier, with the assumption that identical.
In spite of the excellent
fit
all soldiers
are
of the Poisson calculations to the
reported measurements, Table 5-3, this problem barely qualifies as a
Poisson example.
Radioactive
Perhaps the best known
decay.
Poisson distribution in physics
is
a-particles in radioactive decay, counts of
counts of visible light photons,
application
in "counting" experiments:
etc.
And
cosmic rays, counts of X rays, commonest of these experi-
the
ments are those for which the source intensity constant over the time of measurement.
is
safely
Consider a radioactive substance emitting a-particles. is
placed behind a
set
of the
counts of
assumed to be This substance
of diaphragms which screens off all but a small solid
The unscreened rays fall upon a sensitive counting device, such as a Geiger counter tube combined with an amplifiersealer arrangement and a count register. The number of counts is recorded angle of the emitted rays.
for each of
N(=
2608) time intervals of
T (=
7.5) sec each.
Table 5-4
k counts were recorded in an actual fk experiment. The total number of counts is 2.fk k = 10,094, and the average counting rate is 10,094/2608 = 3.870 per 7.5 sec. The calculated Poisson
lists
the frequency
that exactly
distribution, multiplied
by
N to make
it
frequency instead of probability,
216 is
Probability
also listed in the table.
frequencies shows rather
The
fit
and Experimental Errors
in
Science
Direct comparison of the observed and calculated
good agreement.
of the Poisson model to the observed data
may
be judged
m
quantitatively by application of the x 2 test tms example we have enough measurements to use this test to advantage. This test tells us that 5
comparable cases would show worse agreement than appears owing to chance fluctuations alone. We conclude that the good Poisson fit establishes that the a-emission from this long-lived substance is not only governed by chance but is well represented by the Poisson model. 17 out of 100
in
Table
5-4,
Table 5-4.
k
Radioactive Disintegrations
fk
NP(k; 3.870)
Poisson Probability Distribution
217
example has many things in common with the one dealing with the kick of a mule, and it has two essential differences. The first difference is that, if the source intensity
is
satisfactorily constant during all of the
measurements
long atomic lifetime and large number of atoms in the specimen), may safely invoke the argument of spatial distribution. In this argu-
(i.e.,
we
ment, the time interval T (= 7.5 sec for convenience in the experiment) is imagined as subdivided into n equal subintervals of time, and n may be very large indeed. The second difference is that all atoms of a given species are strictly identical, as we know from the success of the principle of indistinguishability in
we may
quantum
statistical
alternatively analyze the
mechanics. With this knowledge
problem
in terms of the observation of each atom for 7.5 sec as the basic Bernoulli trial, the number n of such trials being equal to the number of atoms in the specimen. In terms
of either
The
set
of basic Bernoulli
trials,
the problem
is
clearly Poissonian.
analysis in terms of individual atoms, instead of the subintervals
of time,
is
in general the
more
because, often, the lifetime
logical
is
one
in
problems of radioactive decay
too short, or the number of atoms
is
too
small, to justify the assumption of constant source intensity
and thus of constant Bernoulli probability. This type of problem was mentioned earlier as an example of sampling without replacement, and only if the source intensity does remain constant can the spatial distribution argument be invoked, i.e., can the problem be considered as one of sampling with replacement. Note also that if the lifetime is short and the number of atoms is small, the problem is a binomial one in which the basic Bernoulli trial is an observation of each atom. A practical limit on the intensity of the source, or, rather on the maximum counting rate, is imposed by the measuring equipment. All detector devices have an inherent resolving time, i.e., a "dead time" immediately following a count. If one or more a-particles arrive at the detector during this dead time, it or they will not be counted. The Poisson distribution cannot "exactly" counts is
is
fit
the measurements, of course, unless the
negligible.
about 2-
sec,
The dead time of the eye
of a typical Geiger counter about 120
proportional counter about
1
/usee,
about 10 -7 sec as determined
number of "lost"
(as in counting scintillations) //sec,
of a typical
of a typical "fast" scintillation counter
now
(in the
year 1960) by the associated
electronic circuits.
Another
practical matter in regard to the source
"conditioned"
—in
the early runs
is
that
it
must have been
of data, loose molecular aggregates
become detached from the emitting substance atomic recoils from disintegrations within the
as a consequence of the
aggregates.
During the
conditioning process, the value of % 2 decreases successively as the source intensity becomes more nearly constant.
Probability and Experimental Errors
218
Counts per unit time: radioactive decay, typifies
many
precision.
The measurement of
in
Science
the rate of
the source intensity remains satisfactorily constant,
if
Poisson measurement problems
—most of the
problems
nuclear physics, cosmic rays, radioactive tracer work, carbon dating fact,
in
— in
any Poisson measurement amenable to analysis as a spatial distribuA feature of especial interest in such measurement problems
tion of events. is
precision, not yet discussed in these examples.
Consider the relative precision
in
counts per minute
among the
following
three measurements (a)
(b) (c)
one measurement 100 counts in lOmin one measurement: 1000 counts in 100 min ten successive measurements (N = 10) each for 10 min, the :
number of counts being The counting
total
1000.
assumed to vary randomly on a short-time scale on the average, to be constant, i.e., constant as measured on a long-time scale (e.g., days). The average counting rate is given experimentally by the mean m which is our best estimate of the desired parent mean /lc; m is the same in all of the three cases. Consider each of two dispersion indices of the parent (Poisson) distribution, viz., the standard deviation a and the fractional standard deviation a/ju in per cent; and consider also the standard deviation in the mean (standard error) a m These indices, for the three measurement situations,
(e.g.,
rate
is
minutes), but,
.
are:
(«)
Eq.
5-24
Eq.
5-28
a
=
-
= 4=
yjks
Jk
fi
=
^/lOO c/10 min
X 100
=
s
=*
=
-^2L V100
cpm
= cpm 1
10^
2-32 Eqs. 3-26
am
= —a=l =
Vl00 v
yjN
Vl
3-45
Eq.
5-24
Eq.
5-28
(b)
a
=
-
= JL
(i
V/c s
2-32 Eqs. 3-26
3-45
a
yJk s
=
,_
.
min
=
Vl00 cpm
=
.
1
cpm
10
x/l000c/100min
X 100
= —a— = JN
/1A c/10
=
-122= Viooo
w
..__ ^1000 ^ c/100min
^/i
'
= ^2°. C pm m 0.32 cpm 3.2% _ ._ ^ = ^1000 cpm ^0.32 cpm
100
Poisson Probability Distribution , u c Eq. 5-16
g
5-17
Eq.
/= Ju v
(7
(c)
->
y/ft
=
"1=
JN
~~F=
J 10
^
/1000
=
100
=
°"m m
-.^
/
V N
VlOOO/N
]
Eqs. 3-26
/1000
an
= lpO^
[x
t-
219
=
/
V
=
.
10 c/10
min
1
cpm
10
.
.
100
10
.
/b
71000/10
3.2 c/10
mm =
0.32
cpm
3-45
Or, better, the standard deviation a in case
Eq. 5-24 with
all
100 min. Then,
(O
may
(c)
be computed from
1000 counts considered as a single measurement over
we have
Eq.
5-24
a
= Jk =
Eq.
5-28
-
=
am
Eqs. 3-26
1
=
-J= x 100
—a
= —= JN
V1000
y.
10
=0.32 cpm
ion ~4= ^ 3.2%
71000
V/c s
ix
2-32
= ^ 100° c/10min
,/lOOb c/100 min
s
_ .
&
_
.
;1 3.2 c/10
min
=
. ,_ 0.32
cpm
Jl
3-45 First,
compare the precision of (a) with
that of (b).
It is
a characteristic
of Poisson problems that the precision in counts per unit time becomes greater as the mean m increases, but only {/"the effective number n of basic Bernoulli trials in each measurement also increases.
example, which involves hypothetical Bernoulli
T
of the observational interval
trials
must be stated
distribution argument, this characteristic
In the present
because of the spatial in
terms of the size
along the time axis instead of in terms of
number of hypothetical Bernoulli trials; the two statements are seen be equivalent when it is realized that the gain is in the product np, not
the to
merely in n alone.*
This
made
is
clear in a
penny-tossing example
presently.
Second, compare
remains the same, *
Note
that
if
(a)
with
observation time per measurement
If the
(c).
T=
viz.,
10 min, the values of a
the counts for each of the 10 trials (each for 10 min) are
use Eq. 2-23 for
o, viz.,
a
=
^(k —
I
t
/
in
Eq. 2-32 for a m
,
viz.,
am
=
I
of any mathematical model. But
m) 2 /9
|
10
^ (&i — w) /90
,
the Poisson
this
value of
\H .
These expressions apply regardless
J
model
is
known
to
fit
the parent distri-
bution, the Poisson equations lead to a greater precision with a given
measurements.
are the
known, we may
and then we may use
2
if
affx
VA
/ 10
a
and of
number of
220
Probability
same, respectively,
by
and Experimental Errors
Science
That these precision indices are unaltered measurements is expected from the statement in the
in (c) as in (a).
repetition of the
previous paragraph relating precision to the
having a given success probability.
trials
in
number of
basic Bernoulli
Note, however, that the addi-
tional nine trial measurements in (c) have improved the precision in the mean, the standard error a m On the other hand, if (c) is considered as a single measurement of 1000 counts with the observational time interval T = 100 min, the precision as measured by all three indices is improved; in this view, (b) and (c') are equivalent. .
Third, consider further (b) tially
case
N,
made above, but one
(c'), it is
i.e.,
This comparison has already been par-
= a/V N for
imperative that the value of a be consistent with the value of
the value of
We
measurements. the
vs. (c').
point remains. In the expression a m
same time take
or
must correspond
cannot take a
= Vk =
N=
would
this
\0;
deduced from the
to the value s
VlOOO
=
yield a m
N
cpm and at cpm but is not
en 0.32 0.1
valid.
Finally,
it
should be emphasized that, although we
may
at first
be
inclined to regard the subdivided data in (c) as thereby containing addi-
somehow yield an improvement in precision, no additional information (except in checking systematic errors as mentioned below, and in checking the applicability of the mathematical model). We have assumed that all the fluctuations in successive 10-min trials are random; and all of the information about the dispersion of the parent distribution is contained in the product np of the basic Bernoulli trials it does not matter how this information is obtained; it may be obtained in subdivided measurements or all at once in a single measurement. tional information that should
there
is
in fact
—
The
situation
is
somewhat
similar to that encountered in tossing pennies.
If five pennies are tossed together,
tions of heads
and
tails is
binomial probability
a
= Vnpq =
distribution
vf, and
some one of the possible five combinaThe five combinations form a parent
observed.
a/ju
=
B(k;
which
for
5, |)
(1/V5) x 100%.
If
/.i
=
np
=
experiment
the
f, is
repeated ten times, the experimental frequency distribution begins to take
shape and we have additional information for the estimate m of the mean i.e., a m is improved, but the parent distribution \0B(k; 5, £) is un-
/u;
altered (except for the normalization factor of 10);
and a Ip, are
just the same.
the values of
/*,
a,
But, instead of repeating the experiment ten
times, suppose that 10 x 5 = 50 pennies are tossed together; in this new experiment a single observation is made which is a member of the parent
distribution B(k;
pennies,
50, |), a different distribution
and one whose parameters are
/x
=
from that with
np
=
5 °, 2
a
=
\
just five \°,
and
Poisson Probability Distribution
221
a I[A = (1/v 50) x 100%. Of course, the new experiment could be performed by tossing the 50 pennies in 10 lots of 5 each, calling all 50 pennies r.
a single observation; this
is equivalent to subdividing the total number of The dispersion of B(k; 50, ^) is unaffected by any mere treatment of the data. The crucial point is in recognizing the parent distribution whose dispersion we would specify, i.e., the number of basic Bernoulli trials. It is apparent that a in the new experiment with 50
basic Bernoulli
trials.
is only VlO, instead of 10, times larger than a in the old experiment with 5 pennies, and, on a "per penny" basis, the new experiment is the more precise. (The "per penny" basis is somewhat artificial here but illustrates the point. Had we started with a very large number of pennies, say 10,000, instead of 5, the "per certain large number of pennies" would make more sense.*) A real advantage of subdivision of the data is that any significant
pennies
misbehavior (change
in
systematic errors) of the experiment
may
be
recognized, such as bursts of spurious background counts or a slow drift
magnitude of the background or perhaps of the average value of If enough measurements and subdivisions are made, the % % test can be used to check whether or not the systematic errors are constant, i.e., whether or not the observed fluctuations are purely random. Such a test in the radioactive decay example revealed that a changing systematic error was present until the radioactive specimen had in the
the quantity being measured.
become "conditioned." The above conclusions as to precision, equally valid in many Poisson measurement situations, illustrate the importance of applications of probability theory and statistics in the design and performance of experiments.
More examples. The
soldier-mule example and, strictly speaking,
the radioactivity example discussed above illustrate Poisson sampling
But, the army was presumed to be kept up to And, with the condition that the source intensity remain constant, the radioactivity example was also discussed as though the supply of events were unlimited. Let us discuss some more Poisson
without replacement. strength at
all
times.
examples.
Suppose that the volume of blood in an adult human is 6 liters, and that 3 volume C bacteria. A sample of size 0.6
there are distributed in this
mm
* The situation with the pennies does not allow a convenient analog to the argument of spatial distribution, but this argument is not really involved in the present discussion.
However, such an analog would require an adjustable reciprocal relation between the head probability/?' and the number /;' of pennies tossed for each observation with the product n'p' remaining constant.
222 of blood
What
taken for examination.
is
k bacteria
in the
sample?
This
imagine, as a basic Bernoulli
bacterium.
number of
is
in
Science
the probability that there are
is
a Poisson problem because
we may
the observation of one particular
trial,
bacterium in the sample? The success probability
Is this
the ratio of volumes, 0.6/(6
is
and Experimental Errors
Probability
C
10 6)
x
=
10~ 7 and the
number of trials n
,
p is
volume of blood. If the criterion answer to the problem is P(k; 10" 7 C). If many samples are taken from the same person we have another example of sampling without replacement; it is essentially the same as the radioactivity case in which the observation is whether or not a given atom the
+
k2
<
(np) 2
n
bacteria is
in the total
satisfied, the
many identical persons is we have sampling with replacement; but in this case identity of different persons may be suspect. It should be noted that if value of k we select in this problem is close to the expectation value
decays in a specified time interval. If each of used for each sample, the the
=
np
10
_7
C, the standard deviation in the value k
10~ 7/2 C-.
is
Also sampling with replacement: How many radio tubes in a new sample batch of 800 are reasonably expected to be defective if the manufacturing process is known to put out, on the average, 3% defectives? The basic Bernoulli trial is obvious here — the inspection of each of the 800 radio tubes, where each tube is either good or defective. The success probability p is known as part of the problem, viz., p = 0.03. The number of Bernoulli trials n is also given as part of the problem, viz., 800. If the defectives appear randomly, i.e., not in bunches, these Bernoulli trial characteristics are such that the parent distribution may be assumed to be Poisson. The first question is answered by the expectation value np. But this answer should be accompanied by a statement of precision, say, by the standard deviation ±√(np). Thus, a reasonably complete answer is 0.03 × 800 ± √(0.03 × 800) = 24 ± 4.9. Another question may be asked, What is the probability that, in the batch, less than 30 tubes are defective? The answer is \sum_{k=0}^{29} P(k; 24). Or the question may be, What is the probability that at least one tube will be defective? To this question the answer is

P(\ge 1;\, 24) = 1 - P(0;\, 24) = 1 - e^{-24} \approx 1

Consider another simple problem. Suppose that at a certain telephone switchboard 50,000 connections are made and that 267 of them are wrong. From a random sample of 25 connections, what is the probability that one will be wrong? Again, in this problem, if the wrong connections occur randomly, the basic Bernoulli trial is the observation of each connection. The best determination of the expectation value is 25 × 267/50,000 ≈ 0.1335. The answer to the question is P(1; 0.1335) = 0.1335 e^{−0.1335} ≈ 0.117.
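Both of these answers are easily checked by direct computation; a minimal sketch (Python, computing only the quantities quoted in the text) is:

    import math

    def poisson(k, mu):
        return mu ** k * math.exp(-mu) / math.factorial(k)

    # Radio tubes: n = 800, p = 0.03, so mu = np = 24.
    mu = 800 * 0.03
    print("expected defectives:", mu, "+/-", round(math.sqrt(mu), 1))
    print("P(fewer than 30 defective):", round(sum(poisson(k, mu) for k in range(30)), 3))
    print("P(at least one defective):", 1 - poisson(0, mu))

    # Switchboard: mu = 25 * 267 / 50000 = 0.1335.
    mu2 = 25 * 267 / 50000
    print("P(1; 0.1335) =", round(poisson(1, mu2), 3))    # about 0.117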
As a variation of this problem, suppose that at another switchboard two wrong connections are made in 500 trials. What is the probability that one or more wrong connections will occur in the next 35 trials? This is left as an exercise for the reader.

As a final example, suppose that one night the "enemy" dropped 537 bombs in an area of 576 square blocks in the central region of Washington, D.C. We study the array of bomb hits and ask the question, Were the hits distributed at random over the blocks?

Table 5-5. Bombs "Dropped" on Washington, D.C.
    k        f_k        NP(k; 0.9323)
    (table body not recoverable in this copy)

What is the probability that your house, occupying 1/100 of a block, will be directly hit? The subdivision of the pertinent area into blocks was for convenience in the problem dealing with the size of the White House or the Capitol. Since your house is now the interesting unit of area, we subdivide into 57,600 units, each an equally likely target. The answer to the question posed is approximately 537/57,600. Why "approximately"?
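The expected column of Table 5-5 can be regenerated even though the observed frequencies f_k are not legible here. The sketch below (Python) computes NP(k; 0.9323) for N = 576 blocks and 537 hits, together with the house probability quoted above; the range of k shown is an arbitrary choice:

    import math

    N_blocks, hits = 576, 537
    mu = hits / N_blocks                     # 0.9323 hits per block, on the average

    def poisson(k, mu):
        return mu ** k * math.exp(-mu) / math.factorial(k)

    for k in range(6):
        # expected number of blocks receiving exactly k hits
        print(k, round(N_blocks * poisson(k, mu), 1))

    print("chance a particular 1/100-block house is hit:", round(537 / 57600, 4))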
5-8. Composite of Poisson Distributions

The observed fluctuations in any set of trial measurements are generally due to more than one source. For example, as an extreme case, the normal density function, Eq. 4-9, was derived as the composite of numerous elementary distributions. Now we wish to examine briefly the effect of a small number of sources of fluctuations in a typical Poisson problem.

Measuring with a background. In almost any measurement involving the counting of individual photons, α-particles, etc., the most common second source of fluctuations is the ubiquitous background "noise." Whether or not the primary source under study is introduced, this background is present and is producing a measurement effect. A background is generally defined as those residual observations made when all deliberate sources are removed. Thus, a background is separately measurable.*

A typical background in a measurement with a Geiger counter or with an ionization chamber is caused by radioactive contaminations in the air or materials of the laboratory, by cosmic rays, by "spurious" counts inherent in the counter tube or ion chamber, or by the "noise" in the associated electronic circuits. The type of background we shall discuss is one that can be assumed to be constant on the average and random in detail. Since the background can be separately measured, its constancy on the average can be experimentally checked with the χ² test by assuming a flat distribution of the measured means and by using Eq. 4-27. Generally, the background is found to be not only constant on the average but Poissonian in distribution. The randomness and distribution can be checked experimentally with the χ² test by assuming a Poisson distribution of individual measurements of the background. It can be shown (and this is put as Problem 14, Section 5-10) that if both the background and the distribution of events we would measure are Poissonian, then the observed composite distribution is also Poissonian.

* In measuring the background in practice, the deliberate sources are preferably not physically removed — this would also remove any secondary effects (such as spurious scattering) in the background itself. It is often better to stop, by an appropriate shield, the direct source radiation from reaching the detector.
Thus,

\sum_{k_b=0}^{k} P(k_b;\, \mu_b)\, P(k - k_b;\, \mu_x) = P(k;\, \mu)    (5-33)

where k_b is the background count, μ_b is the mean background count, k is the observed composite count, μ is the mean composite count, and μ_x is the mean value of the primary source activity (the quantity we are usually trying to measure).
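Equation 5-33 (see also Problem 14, Section 5-10) is easily verified numerically. The sketch below (Python; the values chosen for μ_b and μ_x are arbitrary) convolves two Poisson distributions and compares the result with a single Poisson distribution of mean μ_b + μ_x:

    import math

    def poisson(k, mu):
        return mu ** k * math.exp(-mu) / math.factorial(k)

    mu_b, mu_x = 2.3, 5.1          # assumed background and source means
    for k in range(8):
        composite = sum(poisson(kb, mu_b) * poisson(k - kb, mu_x) for kb in range(k + 1))
        print(k, round(composite, 6), round(poisson(k, mu_b + mu_x), 6))

The two columns agree term by term, as Eq. 5-33 asserts.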
Precision. The observed composite count is

k = k_x + k_b \qquad \text{and} \qquad \mu = \mu_b + \mu_x    (5-34)

and the primary source count can be written k_x = k − k_b. By Eq. 3-43, the standard deviation in k_x is

s_x^2 = s^2 + s_b^2    (5-35)

(Note the sum, not a difference, in Eq. 5-35.) By Eq. 5-16, or by Eq. 5-24 if we are dealing with a single measurement in each case, Eq. 5-35 becomes

s_x^2 \approx (k_x + k_b) + k_b'    (5-36)

where k_b′ is a specific measurement of the background in the absence of k_x; k_b and k_b′ are generally different because of statistical fluctuations.
The first conclusion is obvious, viz., if the signal is relatively strong, precision in determining the background is not very important. But if the signal is relatively weak, the k_b and k_b′ terms in Eq. 5-36 are important, and considerable time must be spent in measuring the background.

Suppose that a time t_b is spent in measuring the background rate B. Then, Bt_b counts are recorded. Our best estimate of the mean background rate B and its standard deviation is

B = \frac{Bt_b}{t_b} \pm \frac{(Bt_b)^{1/2}}{t_b} = B \pm \left(\frac{B}{t_b}\right)^{1/2}    (5-37)

The precision in B is inversely proportional to the square root of the time spent in determining B.

In this case suppose that a time t_x is spent in measuring the rate (k_x + k_b′)/t_x = X + B′. Then, with the source under study introduced, (X + B′)t_x counts are recorded, and our best estimate of the mean rates is

X + B' = \frac{(X + B')t_x}{t_x} \pm \frac{[(X + B')t_x]^{1/2}}{t_x} = (X + B') \pm \left(\frac{X + B'}{t_x}\right)^{1/2}    (5-38)
By similar arguments, and using Eq. 5-35, we obtain

s_X \approx \left(\frac{X + B'}{t_x} + \frac{B}{t_b}\right)^{1/2}    (5-39)

The fractional standard deviation in per cent in X is approximately

\frac{s_X}{X} \times 100 \approx \frac{100}{X}\left(\frac{X + B'}{t_x} + \frac{B}{t_b}\right)^{1/2}    (5-40)

Equations 5-39 and 5-40 could have been written directly from Eqs. 5-36 and 5-34 if the counts k and k_b had been divided by the times involved in their respective measurements. But writing X and B as rates, independent of the particular choice of observational times, is general and shows clearly, by Eqs. 5-37 and 5-38, that the precision in each case is inversely proportional to the square root of the time of measurement.

A common practical problem in precision is the following one. If the time t_x + t_b is to be constant, what is the most efficient division of time? The most efficient division refers to that value of the ratio t_x/t_b such that the standard deviation s_X is a minimum. It can be shown (Problem 15, Section 5-10) that this ratio is

\left(\frac{t_x}{t_b}\right)_{s_X = \min} = \left(\frac{X + B'}{B}\right)^{1/2}    (5-41)
mean primary source X.
In such a case, great care
is
required in
measurement X + B'. The desired signal X, obtained by subtracting B from X + B', may be confused with a fluctuation peak in the background. Help in this interpretation is afforded by the tests for significance of the difference between two means, in this case between the interpreting the
means
X+
B' and B. Such
tests,
including the useful
t
test,
were discussed
in Section 3-3.
The type of argument involved in the / test may be applied a little more in two means as follows. Let m and m h be the means in question. Then our best estimate of the difference in the parent means is written, with the standard deviations, as
simply to the difference
P If
we imagine
—f*i>
('"
-
rn b )
±
s lm -
mb )
(5-42)
number of independent values of m — m b to be can be shown (by the central limit theorem) to be
a very large
obtained, these values
-
Poisson Probability Distribution essentially
227
normally distributed about the parent mean value
a standard deviation s {m_ m
By Eq.
>.
=
<:
<\
—
[x
/u b
with
3-43, 2
4-
'V
«>
2
and, for the Poisson case, Jc (m-
m b)
= -+
—
(5
"
43 )
by Eqs. 3-45 and 5-16, where n and n b are the numbers of trial measurements used in determining m and m b respectively. Again, if only a single measurement is used in either case, n or n h = 1, Eq. 5-24 should be used instead of Eq. 5-16. Hence, s {m _ m can be evaluated, and with the single parameter h of the normal distribution thus known by Eq. 4-8, calculation can be made by Eq. 4-10 of the probability that /;; — m b will differ from — ju b by any specified value. Now, if the two parent means /u and [x b are assumed for the moment to be the same, i.e., if there is no difference >
fji
X+
between
B' and B, there
cally greater than s (m _ m >. normal distribution curve,
A 32%
chance
is
a 32
B.
68%
Fig. 4-6, lies within
h
will
be numeri-
of the area under the
^(standard deviation).
usually considered to be too large to be a satisfactory
is
X
criterion that the desired signal
background
% chance that m — m
In other words,
It
is
be declared to be clearly above the
customary to take
5%
as the "significance level."
measurement m — m b is so far out on the tails that it is beyond the 5% limits, then the assumption that /n = fi h is thereby declared to be unreasonable, and consequently the desired signal is clearly above the background. One important general conclusion of the above discussion of the highbackground problem is that the useful sensitivity of a measuring instrument depends greatly upon the background. This dependence is not specified by the response of the instrument to a source of standard strength divided by the background, but the standard response must be divided by the magnitude of the fluctuations in the background. A very large background is perfectly acceptable if it is perfectly constant in magnitude. If the actual
5-9.
An
Interval Distribution interesting extension can be
that involve a spatial distribution. intervals
made
easily in those Poisson
problems
This refers to the sizes of individual
between adjacent success events.
The discussion
is
usually of
events that occur along a time axis, and the intervals are time intervals.
As
discussed so
as independent.
questions
—
it
far,
the Poisson distribution treats
all
these intervals
But the so-called interval distribution answers further
gives the probability for the occurrence of an interval of a
228
and Experimental Errors
Probability
specified
size.
This distribution
is
in
Science
of course, whenever the
realized,
Poisson conditions are satisfied for events that occur randomly along some spatial axis, such as time.
A
spatial-distribution Poisson
problem can be
considered as one of measuring intervals instead of one of measuring
We should be acquainted
events; the two views are obviously related.
both views and choose the more convenient one
From
with
problems.
in individual
Eq. 5-4, the probability that there will be no event per unit time
u)=£J— =
P(0;
e
-"
is
(5-44)
0!
where
fx is
the average
number of events per
time interval of interest be this interval
is /at,
unit time.
Let the size of the
then the average number of events during
/;
and P(0;
fxt)
=
£>-"'
(5-45)
The
probability that an event will occur in the interval dt
This
is
the probability for a single event
probability for no event during is
the product fie~'
lt
t
and
Equation 5-46
is
<
dt
1.
is
simply
/u
dt.
Then, the combined
for one event between
Designate this probability as
dt.
/(/;
/
and / + dt and write
ju) dt,
= fie'^dt
l{t; fi)dt
sizes
if [x
(5-46)
the probability density function for the distribution of
of intervals occurring between random rare events.
It is
immediately
evident that small intervals have a higher probability than large intervals;
hence, the interval distribution
is
asymmetrical.
measurement of k events in time T is accompanied by K intervals, where K = k ± depending upon whether we start and/or end the time T with a count or with an interval. These K intervals are of many different sizes. The number n, of intervals having sizes greater than t Y but smaller than t 2 is given by
A
1
t
n tuU
=
KI(t;
ju)
n hh In particular, with
t
x
=
0,
=K =
/ue'
Kie-"*
Eq. 5-47 or 5-48
-
1
is
11
dt
(5-47)
e~" h )
(5-48)
'
the cumulative probability
distribution function. If
t
2
>
taken as
T, the time of the actual observation for the infinite,
n t>h
The average
k events,
t2 ,
may
be
and then
interval
is
larger than the average
simply
= Ke~^
1/ju,
(5-49)
so the fraction of intervals that are
is
0.37
(5-50)
229
Poisson Probability Distribution
As a second interesting limiting case, suppose that t x = 0. Then, by Eq. 5-48, the fraction of intervals that are smaller than any specified interval
t
t is
=
- e~^
1
K Dispersion indices. is
(5-51)
In the interval distribution, the
mean
deviation
defined as
Jo =^ t-r _ /
dt
(5-52) 00
jue'^dt Jo
where t
is
mean we have
written for the
integration of Eq. 5-52,
t
t
interval,
=
\ffx.
After performing the
- t = — fv 0.7358t
(5-53)
e
Equation 2-21 defines the variance for any continuous distribution, such as the interval distribution, and, accordingly,
a2
P° (t = J-±-^
write
rffie-^ dt (5-54)
°°
11
dt
lie-'
|
we
Jo
The standard
deviation in the size of the intervals between randomly
distributed events
is,
from Eq.
5-54,
=
a
r
=
-
(5-55)
just equal to the average interval.
Resolving time:
lost counts.
It is
when
physically impossible to measure
the interval between
them becomes
two
spatially separated events
less
than the resolving ability of the measuring instrument.
axis, the limiting resolving ability
time, or the
of the instrument
"dead time." Because short
long intervals, Eq. 5-46, the
finite
is
intervals are
On
a time
called the resolving
more probable than
resolving time reduces artifically the
to become artifically smaller than unity, where n is the number of Poisson become less % measurements; see Problem 30(b), Section 5-10).
dispersion of Poisson events; (i.e.,
2
/n to
it
causes x
2,
230
and Experimental Errors
Probability
Science
in
which events are "received" by the instrument,
If R, the rate at
and a small fraction of the counts observed rate R c is given by small,
c
w
R(l
R
as
R
R
-R
c
rr)
(or intervals)
for
R
rr
<
1
for
R c tt
<
1
c
is
rather
then the
is lost,
(5-56)
from which where r r
is
c
+R
(l
c
tt)
In this case, r r
the resolving time.
R- R
=
RR and,
if
If
R
R
is
usually written as
c
C
known, r r can be measured. becomes rather large, we must distinguish between two is
types of counters.
Equation 5-56
is
satisfactory
counts do not themselves extend the dead time; then value \JT r
.
But
if
we must
=
when Rrr
1,
R
= R e - RT
c
*
—
c
e
declines.
(5-59) v '
er r
At the maximum counting
rate,
only \\e
= 37%
of
R
increases
instead of the events being
randomly
This type of counter,
the received events are counted.
It is
(5-58)
maximum
reaches a
c
R 'max = - =
indefinitely,
c
(lost)
approaches the
write
R
and then
R
each unrecorded count does extend the dead time, then,
instead of Eq. 5-56,
In this case,
different
unrecorded
if
if
becomes completely paralyzed.
interesting to note that
if,
spaced, they are uniformly spaced, as are "pips" from an oscillator, and
if
recovery were abruptly complete after the dead time interval, the second type of counter described in the preceding paragraph would follow faithfully until the rate equals l/r r
events.
An
oscillator
electronic circuit, e.g.,
is
and
,
this
is
just e times
/?
Cmax
for
random
often used to determine the resolving time of an
one which
some allowance must be made Coincidence counting.
is
part of a counter system, but usually
for incomplete recovery.
There are a great many interesting problems
that arise in practical applications of the Poisson statistics.
We
shall
mention just one more. Suppose that, in the radioactive decay of ^Al, we wish to show that the emission of the /?- and y-rays is very nearly simultaneous in the decay process,
Two in
i.e.,
simultaneous within the resolving time of the equipment.
counters are used and arranged so that both must respond together
order for the event to count. Since each counter has a
that
random coincidences
finite
will
This
is
called coincidence counting.
background, there
is
a certain probability
be counted that are unrelated to the 13AI
Poisson Probability Distribution
Furthermore, in
decay.
231
experiment, since each counter subtends only
this
a small fraction of the A-n solid angle of the
and y-emission, a
ft-
ft
(or a y)
may
be received by one counter and the related y (or ft) go in such a direction as to miss the other counter; perhaps it also enters the first
counter.
This situation greatly complicates the background for each
counter;
the
total
background includes the unrelated rays from the
We shall not work out We have discussed in
decaying 13AI. Conclusions.
examples of Poisson type problems
the details here. this
chapter only a few selected
— starting
with the. simple and pro-
ceeding to the more complex
—problems that are typical
problems are legion, but
us say that they lead us beyond the scope of
let
in science.
Other
book.
this
Problems
5-10.
A deck of 52 cards
1.
cards are turned up
The
1
shuffled
is
and placed face down on a
player calls each card without looking at
calls.
Show by
it
it is
Then
the
examined.
and promptly forgets what he and the Poisson
the parameters of the basic Bernoulli trials
conditions that the probability distribution for the
expect to call correctly
is
number of cards he may
essentially Poissonian.
Suppose that the weather records show
2.
table.
each card being discarded after
at a time,
that,
on the average,
5
out of the
30 days in November are snowy days. (a)
What
is
the binomial probability that next
November
will
have at most
4 snowy days ?
What is the Poisson probability for the same event? Make the histogram of the following numbers of seeds area on damp filter paper:
(b) 3.
unit
k = fk =
2
1
6
20
28
3456789
12
germinating per
10
6
8
and (b) a Poisson distribution to these measurements and plot two frequency distributions on the same graph as the experimental histo-
Fit (a) a binomial
these
gram. (Hint: In the Poisson calculation, use Eq.
5-6.)
Suppose the number of telephone calls an operator receives on Tuesday mornings from 9:00 to 9:10 is fitted by a Poisson distribution with p =3. (a) Find the probability that the operator will receive no calls in that time 4.
interval next Tuesday. (b)
Find the probability that
a total of (c)
1
call in that
in the
next 3 Tuesdays the operator will receive
time interval.
Find the probability that the
call
1
of part (b) will be in the
first
Tuesday. 5.
A
book of 600 pages
contains,
on
the average 200 misprints.
the chance that a page contains at least 3 misprints. estimate.
[ans.
p
(3
Estimate
Discuss the reliability of this or more) rs 0.29; a ^ 2.36 x 10~ 2 ] fl
Probability and Experimental Errors in Science
232
A
6.
company has From a mortality
life-insurance
people at age 25.
25, 88,314 are alive at age 26.
1000 policies, averaging S2000, on
lives of found that, of 89,032 alive at age Find upper and lower values for the amount
table
it is
which the company would reasonably be expected to pay out during the year on these policies. 7. (a)
What are the binomial and the Poisson probabilities that exactly 3 random sample of 500, have birthdays on Christmas? (Assume all
people, in a
days of each year to be equally probable as birthdays.)
What is What is
(b) (c)
the expectation value of the
number of birthdays on February 29?
the precision of the answer to part (b)?
number of random samples of 500 people per sample were
(d) If a large
in-
vestigated an experimental probability could be obtained that "exactly" 3 people
out of 500 have birthdays on Christmas. Mention a few factors that would it
unlikely that this experimental probability
binomial probability for
How
8. (a) girl
make
would agree with the calculated
this event.
would you determine the probability that an unspecified college
has red hair, assuming no ambiguity in color?
your determination.
Assume
in the
remainder of
Discuss the reliability of
this
problem that
this
proba-
bility is 0.05.
What
(b)
the probability that in a
is
random sample of 20
college girls 4 will
have red hair?
What
(c)
ment of
the probability that, of 4 girls in a physics class having an enroll-
is
30, only
How
1
has red hair?
must a random sample be it tne probability of its containing at least 1 red head is to be 0.95 or more? (e) List the Bernoulli and the Poisson conditions separately, and write "good," Si fair," or "poor" by each condition to indicate the degree to which it is "satisfied" in part (b), in part (c), and in part (d). (d)
9.
A
large
long-lived radioactive source emits particles at
(a)
What
is
(b)
What
is
the expectation
number of
an average
particles observed in 10
rate of 10/hr.
min? (ans. 1.67)
the probability that in a 10-min run
no
particles are
observed?
(ans. 0.188) (c) If 20 measurements are made, each for 10 min, what can you say about a measure of fluctuations relative to the mean value?
(d)
What
(e)
If
is
the precision of a single 10-min observation?
the average counting rate were 300/hr instead of 10/hr,
answers be for parts 10.
If
is
the difference great
11.
enough
in
one hour, then 265 counts
finally
i.e.,
concluding as to significance.
Derive the expression for kurtosis
in the
to indicate a significant time variation
cosmic-ray intensity? Treat the data three different ways,
arguments, before
what would the
and (d)?
246 cosmic-ray counts are observed
next hour, in the
(a), (b), (c),
in the
Poisson distribution.
by different
Poisson Probability Distribution
233
»
Consider the distribution B(k\ 100,0.05)
12.
P(k;
What
5).
the value of
is
mean, the most probable value,
(a) the
(b)
standard deviation,
(c) the
mean
(d) the standard deviation in the (e)
the skewness,
(f)
the kurtosis
The
13.
(standard error),
and
?
mean value in almost any probability Show that P{fi /u) = l/(27r,«)^, using Stirling's
probability of observing the
distribution
surprisingly small.
is
;
formula.
counting with a constant background, when the source
14. (a) Verify, for
rate
=
kjt
is
k\t
-
k b /t b
that
,
k
2
P(k b
fi b
;
)P{k -
kb
/i ) x
;
=
P(k;
/i
+
6
n x)
=
P{k;
/<)
(b) Reconcile this identity with the statement (implied in Section 4-3) that a
large
number of component Poisson
distributions gives a composite normal
distribution. 15. If
+
/
tb
is
constant, where
cluding the background) and
show and
16.
mean
most
that the
kit
such that
is
is
tb
/
is
X
the time spent in counting
rays (in-
the time spent in determining the background,
efficient division
of time between measurements of k b /t b
(klk b fK
t/t b *t
Show, using Eq. 5-4, that the probability of observing one less than the value is the same as the probability of observing the mean value in a
Poisson distribution. 17.
As an
inspector of an
enormous quantity of some manufactured gadget,
determine the sample size and the acceptance number so that
should be
(a) there
than
less
1
chance
in 10 that a lot with
5°
defectives
is
accepted,
is
(b) there
should be
rejected,
and
(c) that the
18.
less
than 5 chances in 100 that a
combination
(a)
and
lot
with only 2
% defectives
(b) obtains.
Consider the chances of a bomber pilot surviving a series of statistically which the chance of being shot down is always 5 %.
identical raids in (a)
From an
survive
1, 5,
original
group of 1000 such pilots, how many are expected to and 100 raids? Plot the survival curve.
10, 15, 20, 40, 80,
(b)
What
(c)
In a single raid of 100 planes, what are the chances that 0,
will
is
the
mean
life
of a pilot in
number of
(ans. 20 raids)
raids?
1, 5,
or 10 planes
be lost ?
19.
Ten cm3 of a
material
is
(a) that
Each of
liquid contain 30 bacteria.
inoculated with
only 1 of the
test
1
cm 3
of
this solution.
tubes shows growth,
(b) that the first test tube to be inoculated
i.e.,
10 test tubes of a nutrient
What
is
the probability
contains at least
shows growth,
1
bacterium,
and Experimental Errors
Probability
234 of the
(c) that all 10
test
is
show growth, and show growth ?
a multinomial problem, but can any part or parts of
How many
be conveniently
it
problem?
treated as a Poisson 20.
Science
tubes
(d) that exactly 7 test tubes
This
in
stars
must there be randomly distributed
in the sky all
around
the earth in order that there be a 50-50 chance of having (a) a "north" polar star, i.e., within, say, 2° of the axis,
both a "north" and a "south" star, one or both? (d) What would the answer be to part buted in the sky? (b)
(c) either
What
21.
additional information,
if
(a) if the stars
any,
were uniformly
do you need
distri-
in order to determine
each of the following parts whether or not it is a probability distribution, and, if it is, which of the 3 distributions emphasized in this book does it most
in
closely
approximate? Give as many a
each answer as you
priori reasons for
can.
New York
in
State according to financial
year.
Number
(b)
men
of adult
(a) Classification
income per
of defective lamp bulbs in each of
from a factory. (c) Repeated
measurements of
trial
(i)
many
large sample batches
very feeble and
(ii)
very intense light
intensity.
(d) One hundred measurements of the winning time at a horse race, each measurement by a different observer using his own stop watch. (e) Values of the height of the emperor of Japan from a poll of every tenth adult resident of Japan, the residents being randomly selected.
Number
(f)
vs. deflection
through a thin metallic
foil,
beam of protons
angle of a there being
process per scattered proton processes per scattered proton
on the average
single scattering),
(i.e.,
and
than
(ii)
1
scattering
100 scattering
(Scattering
multiple scattering).
(i.e.,
scattered in passing
(i) less
is
due to
the proton-nucleus electrical repulsion).
Suppose that the average number of
22.
of Ithaca
1.5/yr.
is
occurring in the city
within the range of numbers
(a) Is this
number
may be (b) What
considered reasonable on the basis of chance alone?
that
is
(4) reasonable,
the critical
prompt, say, an increase
owing
it
fall
fatal accidents in
of the police force
5%
any one year that should if
the criterion
that this or a greater
number
is
set that
will
occur
is
entirely possible in a multinomial probability
problem
to
have n
and np moderate, where p, is the basic Bernoulli probability p k successes and i is any one of the possible outcomes in Eq. 1-20. Show
large,
for
does
chance alone?
to
23. It
i.e.,
number of in the size
the probability shall be less than
t
small,
t
/-
(
= 3 the How would
that for r
multiple Poisson distribution
is
normalized.
you design the experiment to distinguish between randomness direction and randomness in time in the emission of a-particles from polonium?
24. in
fatal accidents
In a particular year 4 fatal accidents occur.
Poisson Probability Distribution
235
Discuss your choice of classification intervals and of the total number of measurements of counting rate. 25.
A
proportional counter
is
used in the measurement of
X rays
of constant
average intensity. (a)
A
count (source plus background) of 8000 is observed in 10 min. X rays removed, 10 min gives a total of 2000 background counts. the average X-ray intensity in counts per minute, and what is the standard total
Then, with the
What
is
deviation in this value? (b) if
What
is
the
optimum fraction of time to spend measuring make measurements is fixed?
the background
the total time to
26. In the measurement of y-rays, a counter is used with a measured average background of 120 cpm. If the y-rays enter the counter at an average rate of 240/min,what must be the duration of an observation of the y-rays if the measurement of the number of y's per minute is to have a probable error of 2 % ? 27.
Show by
the
method of maximum
likelihood that a
= Vnp =
/n
in the
Poisson distribution. (See Problem 27, Section 3-11.) 28. In expressing the dispersion of a set of
members of a Poisson distribution, monly than the mean deviation. (a) State as
many
(b) Outline in
deviation 29.
A
is
measurements believed to be
the standard deviation
is
used more com-
reasons for this as you can.
some
measurement situation in which the mean (Assume that no common probability model "fits").
detail a real-life
preferable.
college class of 80 students meets 267 times during the year.
number of class meetings fk with k absences Table 5-6. Absenteeism
k 0-2
fk
in
is
listed in
Table
5-6.
College Class Meetings
267P(k; 8.74)
The
Probability and Experimental Errors in Science
236
you arrange the data and proceed to determine whether or not the absenteeism on days of out-of-town football games was due to random fluctuations alone?
Make
30. (a)
the y} test of the goodness of
of the Poisson model to the
fit
observations in Table 5-4. (b)
Show
that, for a perfect
fit,
x
2
=
n,
where n
is
the
number of trial measure-
ments. 31. In successive
5-min intervals the background with a certain counter is 5. A radioactive source of long half-life is brought
310, 290, 280, 315,315, 275, 3
up
to the counter.
1
The increased counting
rate for successive
5-min intervals
is
720, 760, 770, 780, 710, 780, 740, 740. (a)
for
(i)
(b)
Calculate in counts per minute the average value and the probable error the background,
Show
ground can
(ii)
the background plus source,
and
(iii)
the source alone.
quantitatively whether or not the data with the source plus backsafely be considered to be
randomly
32. In counting a-particles, the average rate
distributed.
is
30
a's per hour.
What
is
the
fraction of the intervals between successive counts such that they are (a) longer
than 5 min,
(b) longer than 10 min, (c)
shorter than 30 sec.
a- and /J-rays are emitted from a certain radioactive sample. Assume and ^-emissions are independent, i.e., from different noninteracting atoms. The observed counts are A a's per minute and B /3's per minute. What is the combined probability that a particular interval between two successive a's will have a duration between t and t + dt and will also contain exactly x /Ts?
33.
Both
that the a-
34. (a)
and
Perform the integrations of Eqs. 5-52 and 5-54 for the mean deviation
for the standard deviation in the interval distribution.
(b)
Derive the expression for the skewness of the interval distribution.
35. In a certain experiment, a counter system gives counts per in
Table
5-7.
The parent Table
5-7.
distribution
is
Observed Counts Trial
1
minute as
listed
expected to be Poissonian. The internal in a
k
Certain Experiment
237
Poisson Probability Distribution
37. There are more positively charged cosmic rays at sea level than negatively charged ones. In a given experiment, 2740 positive ones and 2175 negative ones were detected during the same time interval. How should the ratio of positive to negative particles be reported ? 38.
One g
of radioactive material of atomic weight 200
which has a 5.00% registered in 30 min.
is
exposed to a counter 787 counts are
efficiency for detecting disintegrations;
What
is
the
mean
lifetime of the radioactivity?
What
is
the most probable lifetime of single radioactive nuclei? 39.
A
piece of metal
is
exposed to neutron irradiation and therafter
is
placed
near a counter than can detect the induced radioactivity. During the first minute after irradiation, 256 counts are recorded; during the second minute there are
49 counts. Ignore background. Assuming that only one kind of radioactivity was produced, determine the decay constant and the standard deviation in its determination. (Assume, of course, exponential decay). 40.
A cosmic-ray
of 0.02 steradian
is
"telescope" with a sensitive area of 100
pointed in a fixed direction.
different times the following
In
1
cm 2 and
numbers of counts are recorded
276
an aperture
hr intervals at various
Summary

Throughout this book we have devoted the majority of the pages to the practical mechanics of analyzing observations and measurements with an eye to the proper formulas for use in determining experimental errors and probability precision. But the discussion has been deliberately cast in the framework of the scientist rather than of the mathematician or of the statistician.

It is hoped that this book leaves the reader with a better understanding that science is a complex of observations, measurements, theoretical concepts, and predictions that are all essentially probabilistic, that all "facts" of science are probabilistic; that the "exact" or deterministic views of science of the 19th century, and indeed held by many pseudo scientists today, have given way to open-ended views. And thus can science continue to live and grow and be philosophically at home among the other intellectual endeavors of man.
Glossary
The equations
numbered
as in the
of equally probable
outcomes:
listed in the five parts of this glossary are
text; the pages in the text are also indicated.
I.
CLASSICAL PROBABILITY
Definition of p;
w = number
Independent events
A and
p(A and B) = p(A)-p(B)
=
p(B\A)
Compound
A
^(either
^
^(either
A
or B)
=
nor B)
1
-
or
B
or both)
=
p. 13
+ p{B)
p{A)
-
p(A)
p{B)
+ p(A)-p(B)
=
p(A)
+ p{B) -
=
£(4)
+ £(B) -
or B, not both)
A
10
component events; additive theorems:
^(either
^(neither
p.
p(A\B) = p{A)
and
p(B)
events; independent
(1-3)
(1-8)
p. 8
n
B:
(1-5)
(1-7)
= number
w =-
p
(1-1)
(1-6)
n
of "wins,"
2p(A)-p(B) />(;4)
-p{B)
.
p.
10
p.
12
p.
12
p.
12
Partially dependent events; conditional probability: (1-9)
/>04
Permutations, total number
of,
and 5) = £(4) £(B|;4)
n
n!
Pk = (n
Stirling's
formula
(for
any
of,
—
—
p.
23
«)!
n objects taken k at a time (the k objects unordered)
(n\
n Pfc
factorial
number ,
„-,«)
13
n objects taken k at a time (the & objects ordered)
(1-12)
Combinations, total number
p.
n\
z!):
>"V=(|) (i+s*s ? --) 247
242 II.
Probability
and Experimental
Errors in Science
MEASUREMENTS IN SCIENCE: SIMPLE STATISTICS (REGARDLESS OF 'TIT" OF ANY MATHEMATICAL MODEL)
Definition (experimental) of p;
w ba = number
of
'win" observations,
rc
bS
= number
of identical trials:
(1-39)
pobs
=
, .
Mean m
(sample,
real-life
data)
;
^obs
.
limit ra
p.
49
obs~*°° ^obs
r different
values of
x, xi
observed /, times,
in
n
trials:
n
,
.
(2-1)
(2-2)
(2-3)
m =
Xl
flXl m =
+x
Xn
2 -\
=
i= l
n
+/a*2
p.
76
p.
77
p.
77
n
-\
frXr
=
i=l
Glossary
243
Second moment; variance
s
2 :
(2-19)
s
Universe variance a
2 ,
E =
for
m =
-
(*<
2
°
- m2
p.
86
p.
86
p.
87
p.
88
p.
93
p.
94
discrete universe distribution:
ju,
2
m)
E
=
-
lim
n
n-»o
Same
62
with universe mean
a2
(2-20)
=
2
~
lim ( xi n—>0 i=l
tfpi
continuous universe distribution
-
(x
I
2
n)
Jo
px dx
(2-21) I
Jo
p x dx
Practical ("best") estimate of a:
E
°<^)
(2-22, 2-23)
Standard deviation
in the
mean sm
n
—
m) 2
1
data and universe:
real-life
,
'A
-
(xi
n
(2-31
2-32 2-33)
(2-34)
— V«
-
,,<
fractional sm
Standard deviation
in
=
;
m
E
-—
m =
=
Vn
\
-
(xi
-
1=L
\
w (w
fractional
mVn
m) 2
-
!)
m =
—= M
the standard deviation
for
/xV«
approximately normal
(bell-
shaped) distribution: (2-44)
Probable error
/>e,
50%
«
a-/\/2n
p.
96
confidence limits; for approximately normal (bell-shaped)
distribution
pe
(4-21)
Probable error
in
pep.
Skewncss, coefficient
~
0.65^/ a/2w
= 0.46£e/Vw
p.
of:
E
skewness
(*i
-
E
m) 3
i=l
=
Peakedness, coefficient
of:
n
peakedness
(
xi
lim sample
(2-41, 2-42)
pp. 173, 206
0.655
the probable error pe pc for approximately normal distribution:
(2-45)
(2-35, 2-36)
»
=
E(*»i=i
n—>»
7Z<x
~
M)
96
244
and Experimental
Probability
PROPAGATION OF ERRORS
III.
Propagation of random errors,
in function
(3-31)
dUi
= f(x,
u
du
du — dy
« —Sxi + dx
Mean
=
deviation zu in u ,
(3-33)
,
in
with
f(x, y)
fraCtl0naI
Standard deviation su
u
and
zx
%
(3-36, 3-39)
Same
112
2 -i
^
^=LU)^ U)7]
= /(x,
with S* and
y)
= [(£)
in
p.
'du\ 2
Sz
>dy/
sy
/3m\2
+ (y)
2
= /(x,
u
known:
^ pp. 114, 115
V*]
and
with £e z and
y)
y
—
=
su
for s„ with s^ written for s x
Probable error pe u (or pen),
same as
Syi
+
fractional
(3-38, 3-40)
y)
known:
zy
2
IV.
Errors in Science
Sy for s y
/>e v
(or £&g
p.
and
pe$)
115
known:
for s u (or for Sg) with £e replacing 5 throughout.
MORE
STATISTICS
Weighted mean
mw
,
each
x,-
with weight W{, n
trials:
n
ww =
(3-57)
-
x^
X —
wx i
i
p.
n
118
i=i
Weighted grand mean J" (i.e.,
wxi
oc
of iV
component means each weighted by inverse variance
\/sxi "):
J2
xw
(3-58)
Weighted standard deviation
sx
2
Xi/s±
=— t Us 1=1
p.
118
p.
120
'
Xl
w :
n
£
-
Wi{xi
x w) 2
i=i (3-61)
t
/
in
the
/
test for consistency of
means i\
(3-64)
/=ff
—
of
=l
two
ii /
-(
sets of
«i«2
Ul+l2) \«i
\
V-^-J n-i) +
measurements: P 121
245
Glossary
F
the
in
F
standard deviations of two sets of measurements:
test for consistency of
«i
2
2
F =
(3-67)
-\=
p.
n2
2
M X
2
in
the chi-square test for goodness of
and model subdivided into
zl
<
—
1
mathematical model; actual data
of a
fit
124
2
M intervals, / = frequency of measurements (actual,
obs, or universe, th) in thej'th interval:
..__,
(*-26)
If
x
the model distribution
2
£
=
(4-27)
X
„ 2
~
2
(/th),]
TT-\
?=1
(Jth)j
uniform
is
[(/obs)/
2_,
=
p.
n total measurements:
(flat),
E~ -m m)~ »
185
2
(Xi
P 185
j=l
Curve
fitting,
y
=
a
+ bx,
values of a and b:
a
(3-74)
2x<2 2yi
—
=
-= 2
nZxi (3-75)
b
=
- Sxi2(xiyi) -^ - (ZXf) 2
—
g
For weighted values, see Eqs. 3-76 and
3-77.
For standard deviations, see Eqs. 3-78 and In case a
Curve
=
fitting,
0,
y
\*
.
3-79.
see^Eq. 3-83.
=
a
+ bx + ex
2 ,
values of
see Eq. 3-86.
a, b, c:
p.
129
p.
129
p.
1
p.
130
p.
131
p.
132
29
Correlation coefficient r for straight-line regression curve through the origin at x> y: 2jXiy{ Sx
Sx = =bSy _
(3-107)
Covariance
r
s xy for straight-line
y—
iAj
p.
142
Sy
regression curve through the origin at x, y
and with
correlation coefficient r: Sxy
_ —
VXjyj
_ —
TSxSy
n
V.
MATHEMATICAL MODELS OF PROBABILITY
All the equations listed in Parts II, III
and ful
and IV above, except Eqs.
2-44, 4-21,
apply also to all the mathematical models. In some instances more powerexpressions apply specifically to the models; only the specifically more powerful 2-45,
expressions are listed below.
Binomial
Model
Probability (distribution, discrete) for k successes, n (1-20)
B(k; n, p)
trials,
= (") pk q n ~i
p
-f q
=
1
p. 31
246
and Experimental
Probability
Cumulative probability (from k =
£ B(k;
(1-22)
Expectation value
is
q
=
\
P- 31
1
n'=n
np.
p. 31
ko in
is>
(1-24)
-
np\
m
= np
|£
Mean
k n ~k
k=0
Jt=0
Most probable value
k'):
P £ (?) \«/
=
n, p)
=
to k
Errors in Science
^
1.
32
p.
n:
(2-26)
Standard deviation
variance a 1
a;
p. 91
:
= vnpq;
(2-28)
a
(2-29)
fractional a
=
2
=
=
(
Multinomial probability (distribution) for each , k r observations, n trials, pi k\, kn, p2
M[(h;
of
n,
;
(k r
p2)
;
more than two
—
+
n, pi)(k 2
-\
•
+ pr n
p r)\ =
n,
p.
92
p.
92
Model
Multinomial
(1-30)
)
«/
\m
M
•
rc£g
=
possible outcomes;
1:
\
,
,fti*W 2
£ r*'
P-
37
Normal (Gauss) Model
=
Probability (density or distribution, continuous), z
G(z;h)
(4-9)
=
deviation, h
—-r " ft
=
l/
160
p.
160
y/ir
Cumulative probability (from z (4-11)
= —»
With standardized
r
h
=
*(*)
=
to z
2
z')\
'
e~" 2z2 dz
y/H J —x
variable: /
x
(4-14)
=
hz\
=
(*)
\
e~
x2
dx
p.
169
p.
169
y/^ Jo 2
/*'
1
/=-;
(4-15)
erf(0=— 7=1
_/2/2 e
dt
V2)r J -f
Mean
deviation
5 (universe)
(4-16)
Standard deviation
1
=
Z
=
0.564 — —
p. 171
a:
(4-20)
1
=
<j
=
0.707 p.
172
p.
173
P-
173
h
hy/2 Probable error pc (50% confidence limits): (4-21)
pe
=
0.4769
=
0.6/4.->cr
h
90%
confidence limits:
(4-22)
90%
c.l.
=
—
247
Glossary
Model
Poisson
—
Probability (distribution, discrete) for k successes, n
value np
=
fx
moderate
[k
+
2
2
(np)
=
Cumulative probability (from k
=
to k
fc=0
variance
0,
expectation
p.
198
P-
198
P-
202
p.
203
p.
205
p.
206
p.
207
2 o-
:
a
(5-17)
fractional
in a single
>
«!
/c=0
(5-13)
Standard deviation
—
= Z^rr-
)
a;
p
Jcg—lt
k'
£ P(k; M
Standard deviation
trials,
k')
V
(5-5)
<x>
—
=
P(k;n)
(5-4)
»
=
Vm =
=
measurement,
:
s
= y/T,
aks
(5-24)
a/,
—
Probable error pe
«
pe
Skewness, coefficient
0.65
of:
=
skewness
(5-32)
<x
Optimum time moved)
B
for
superposed on a background, obobserved background rate (signal re-
ratio for counting a signal rate
served combined rate
time
X+B
time
for
tx ,
fa:
+B
JX
t
-x =
(5-41)
a
p.
B„
M
tb
226
Model
Interval
Probability (density or distribution, continuous) for size or duration of intervals
randomly distributed between rare events, mean number size
/ is fit,
mean
interval
is
r
(
=
(5-46) t
=
to
—— = fit
(5-51)
Mean
l//x)
/(/; n)
Cumulative probability (from
deviation
t
—
t
= =
M e-"
ne-*
I
1
e
t')
with
dt
=
ci
st'
K
/
of events in interval of
1
p.
K -
events observed
«-"*'
in
228
time T:
p.
229
p.
229
p.
229
Jq
t:
(5-53)
t
-
t
=
—2
~
0.7358
n
fie
Standard deviaton a: (5-55)
a
=
M
Ind ex
statistics, 44 Bridge hand probabilities, 29, 37, 40
Bose-Einstein
Accuracy, 68 see also Precision vs.
Acton,
accuracy
F., 133
Addition theorems, 11
Cauchy
Alienation coefficient, 142
Central limit theorem,
Analysis of variance,
F
test, 123,
e.g.,
t
test,
134
see also
see
Chance, see Probability Chavenet's criterion for rejection of a "bad" measurement, 176
Mean
Bacon, R., 88 Bayes' theorem, conditional probability,
Chi-square (x 2 ), 185 frequency distribution, 187
goodness of 184-191
13
test for
Bernoulli theorem, 6
fit
of a
model,
dependence on the chosen group
30
conditions
for,
binomial probability, 32
in
normal probability, 164
in Poisson probability,
in-
tervals, 190
30
in
probabilities
P(x 2
>
xc
2 )-
table of,
188
example of, 190 Choice of functional relation for least squares fit, 133 F test for, 134
207
Bernoulli probability, 31 see also Binomial probability "Best" (location) value, 64, 88 see also
"Best" value; Mean; Median;
Mode
2
Average (arithmetic),
168
6,
Central (location) value, 73, 79
120
X test, 184 measuring with a background, 227
trial,
distribution, 97
Mean
Class (classification) interval, 72, 183, 186 Classical probability, 2, 8
Betting odds, 103 Bias in estimate, 81, 88, 104, 107
conditions
for,
8
Binomial coefficients, 27 Binomial probability, 31 distribution, 30 mean, and expectation value in, 3 1 91 most probable value in, 32 skewness in, 95 standard deviation and variance in, 92 graphs of, 32, 157, 158 in normal approximation, 33 in Poisson approximation, 33
Cochran and Massey, 146
Birge, R., 123
Composite Poisson distributions, 224 see also Measuring with a background
Coefficient of variation, 92 see also Standard deviation Coincidence counting, 230
,
Combinations, 23, 27 Combinatorial analysis, 9, 23 combination (unordered group) in, 27 permutation (ordered group) in, 23 Comparison of precisions in two sample sets,
F
Blunders, 69
249
123-126
test for, 123
250
Index
Compounding
of errors, 109
Dispersion indices, 73, 79
Propagation of errors Conditional probability, 12 see also
knowledge
see also Inferred
Confidence
limits, 173
deviation; Peakedness;
Range; Skewness; Standard deviation; and Variance Distribution function, see Frequency disQuantile;
see also Levels of significance
tribution
Consistency of two means, 120-123 /
Mean
see also
test for, 120
Dixon and Massey, 146 Dwight, H., 170
Correlation, coefficient of, 142 inverse, 142
Elementary
of errors, 111
Equally probable outcomes (events), 50 Errors, 64 blunders, 64, 69
two or more variables stochastically related, 140
Covariance, 143
Cox and Matuschak, 132
errors, see Errors
examples
of,
Cramer, H., 105, 168 Craps, game of, 48
by
Curve
elementary, 65
fitting,
examples of, 165 in normal probability theory, 164 mechanical analog of, 166
choice of functional relation, 133 differences, table for, 134
parabola, 132
experimental, 64
sine curve, 133
nonindependent, 110 propagation of, 62, 109-118
straight line, 127 for,
129
standard deviation for, 130
in
random
parameters
(accidental, independent), 64,
111
straight line through origin, 131
systematic, 64, 67
examples d'Alembert,
J.,
69
significant figures, 69
dispersion, see Dispersion
126
parameters
theory
6
of,
of, 6,
67
156
Data smoothing, 139 by first-order differences, 139 to a parabola by least squares, 139
Error function, 169
Degrees of freedom, 89
Estimator, 77, 88, 89, 104
in
curve
in
dimensions of error space, 138 standard deviation, 89
in
in the chi-square (x in the
F test, t
test,
table of values of, 171
Error space, 138
127
fitting,
in the
bias in, 81, 88, 104, 107
2 )
test,
consistent, 104, 107
Events, compound,
189
independent,
124
Expectation value, 31, 78 in binomial distribution, 31, 91 in Poisson distribution, 198
error propagation, 113
149
see also
Mean
Experimental probability,
Deviation, 64, 79
Mean
11
mutually exclusive, 10 overlapping (nonindependent), 11 rare, 196
de Moivre, A., 6 Design of an experiment, 148 statistical features in,
9,
10, 13
individual, 4
121
de Mere, Chevalier, 6 de Mere's paradox, 46
see also
8,
Skewness; Standard and Variance Discrepancy, 69
4,
49
conditions for and difficulties
deviation; Peakedness;
in,
deviation;
F
test for consistency, in
standard deviations, 123-126
49
251
Index
F test for
consistency,
curves
of,
F distribution,
124
125
for choice of functional relation, 134
Factorials, see Stirling's formula
146-148
for dispersion indices, 147 for location index, 147
beginning progress
of,
in,
19
with experience,
41
18,
see also Science
Interval probability, 227
181
of,
10, 13
Inefficient statistics,
Inferred knowledge, 17, 55
Fermat, P., 6 Fermi-Dirac statistics, 44 Fit of a mathematical model, 62, 180 chi-square (x 2 ) test of, 184-191 graphical visual test
Independent events,
cumulative distribution function, 228
ogive curve, 183 probability paper, use
of,
density function, 228
183
skewness and kurtosis, in test
of,
183
distribution, 227
mean
Fisher, R., 105, 123, 126
Frequency distribution,
in,
229
standard deviation
58, 71
in,
229
resolving time, lost counts, 229
cumulative, 73
continuous type, 72
curve
of, 58,
diagram
of,
73
Kurtosis, see Peakedness
73
discrete type, 72 relative (normalized), 73
Mathematical model Frequency function, 73 see also
Laplace,
Galton's quincunxes, 166
Games
Gamma Gauss,
function, 25
C,
parent, 175
Law
of chance, 6, 8
P., 6, 7, 9, 18
law of succession of, 19 Large deviations, probability for, 175 rejection of a "bad" measurement, 175 table of probabilities for, with normal of averages, 11
Least squares, see Principle of least
6
probability density formula, 33, 160
squares Legendre, A., 6
probability model, 33, 161
Levels of significance, in
see also
Gossett,
W.
Normal probability
in
(Student), 123
in chi-square (x
Goodness of fit, test of, see Fit of a mathematical model Grand mean, 118, 123 weighted, 118
Gaunt, J. A., 7 Gunther, Count
F
t
test,
120
123
test,
2 )
184
test,
measuring with a background, 224 see also Confidence limits
in
Lexis divergent coefficient, 185 see also Chi-square (x
2 )
Likelihood function, 104 A., 7
Maximum
see also
likelihood
Likelihood ratio, 103 Halley, E., 7
Heisenberg, W., see Uncertainty principle
Histogram, 57, 73 see also
Frequency distribution
History, highlights
of,
of probability,
6
1,
P.,
Lottery problem, 39
6
of statistics, 6, 7
Hoel,
Location indices, 76 see also "Best" value; Mean; Median; Mode; and Most probable value
188
Holmes, Justice O., 55 Huygens, C, 6 Hypergeometric probability, 39
Mathematical model
of probability, 61,
63, 72 see also
Binomial; Multinomial; Inter-
val;
Normal; Poisson;
F,
t,
and
chi-square (x 2 ) probability distributions
252
Index
Maximum
of,
105-108
precision
in,
108
Maxwell-Boltzmann
Mean
Normal
likelihood, 103
examples
statistics,
Normal frequency
43
deviation, 80
see also Normal probability Normal cumulative probability
normal distribution, 170
distribu-
tion, 161
use in science, 81, 97
see also
Normal likelihood, 106, 107
Normal probability
probability, 156
density function, 157, 199
sample (experimental), 76 universe (parent, theoretical), 77, 78
weighted, 118
approximations 160, 163-165
binomial)
(to
of,
distribution, 6, 160
working, 78
dispersion indices
Measurements, 5, 56 as a multinomial probability problem,
in a graph,
mean
38
in,
in,
160,
170-174
174
by maximum
likelihood,
106
as a quality control problem, 52
computed
standard deviation likelihood, 106
(derived), 62
direct, 52, 62
elementary errors
in science, 3, 41, 52, 56, 101
signal,
Bernoulli
trials,
168 fit of,
to actual data, 180-191 of,
graphs, 161, 174
large deviations in,
"bad" measure-
frequency curves ments, 175
errors,
166, 209,
efficiency of, 76 of least squares, see Principle of
least squares
Mode, 76 Model of probability, see Mathematical model Moments, 84 about the mean (central moments), 86 about the
by maximum
different allowed characteristics of,
225
Mechanical analogs of 210 Median, 76
in,
in,
164
random, trial, 56, 58 Measuring with a background, 224 chi-square (x 2 ) test of flatness and randomness of background, 224 precision in measuring a superposed
Method
(or density function)
distribution, 160
fractional, 81
Mean, 64, 76 by maximum
160
Normal probability
see also
efficiency of, 81
in
differential probability distribu-
tion,
origin, 85
Most probable value (mode),
32,
76,
mechanical analog of, 166 wide use in statistical theory, 156
Observation, see Measurement
Ogive curve, 183 Ordered group, see Permutation Over-determined (vs. exact) constant, 127
Parent distribution, see Universe bution
distri-
Pascal, B., 6
Pascal's triangle, 28
198 Multiplication
theorem,
probabilities,
12
Multinomial
coefficients, 35
probability distribution, 37
Peakedness (experimental and universe), 95 Pearson, K., 185
Permutations, 23
Mutually exclusive events, 10
Phase space, 42 Philosophy of sufficient reason,
Nagel, E., 4
Pierce, B., 170
Newman,
Pierce,
J., 7
C,
17
7
253
Index Poisson frequency distribution (density
Principle of least squares, justification of,
Poisson cumulative distribution func-
quantitative, 2
198
a priori, axiomatic, exact, mathematical, 2, 4, 8
classical,
Poisson probability, 33, 156, 195 Bernoulli trials
in,
207
conditions
density function, 198
approximation
3, 4,
(to binomial) of,
201
Probability combinations, 9
by maximum
likelihood, 107
standard deviation
in,
distributions
202
to actual data, 211
models,
mathematical, 61,
63, 72 see also
measurement, 204
illustrating,
Probability distributions, see Frequency Probability
fractional, 203
problems
49
theological, 4, 56
202
probable error in, 206 skewness in, 207
fit of,
8
philosophical, 4
distribution, 198
in a single
for,
experimental, a posteriori, scientific,
conditions for, 196, 212
in,
Mathematical model
Probability paper, 183
Probability theory, 62
213
Probable error,
normal distribution, 173
bacteria in blood samples, 221
in
bombs "dropped" on Washington,
in Poisson distribution, 206
223
popularity in science, reasons
deaths from the kick of a mule, 213 defective units in manufacturing process, 222
radioactive decay, 215 precision,
2
of,
intuitional, non-quantitative, 2
see also Poisson probability
mean
likelihood, 135
Probability, several meanings
see also Poisson probability
tion,
maximum
from
function), 198
subgroups
in the
148
Propagation of errors in computed measurements, 109M18 in a sum or difference, a product or quotient, a logarithm, a sine func-
in total
counts, 218
shapes of frequency distribution,
for,
probable error, 96
of
198, 200
tion, 116-118 nonindependent (systematic) errors,
110
spatial distribution, 212
of
telephone switchboard, 222
two mechanical analogs of, 209, 210 Poker hand probabilities, 30, 48
random errors, mean deviation
111 in,
113
probable error in, 115 standard deviation in, 114
Polynomial, see Multinomial Precision indices, 65, 71, 170 see also
Location indices and Disper
Quality control, 51 Quantile, 79
sion .indices
Quantum
Precision vs. accuracy, 68 Prediction,
5,
in
curve
in
data smoothing, 139
fitting,
Random mass-phenomena, Random numbers, 60 Random variation, 59
126
in definition of deviation,
80
in regression curve, 145 in various applications, 135
weighting by standard deviation, 119
43
56
Principle of least squares
in
statistics,
5
mathematical meaning
of,
scientific (operational)
meaning
Range, 79 Rare events,
60 of,
see Poisson probability
Rejection, see Large deviations
60
254
Index Standard deviation, reasons for popularity
Regression curve, 145 Restraints, see Degrees of freedom
Root-mean-square deviation,
see
ard deviation
Rounding
of
97
of,
sample (experimental), 82
Stand-
fractional, 83
numbers, 69
universe (parent), 86
Standardized variable in error function, 169
Sample space, 12
Normal probability
see also
Sampling, 26
Statistical bias, see Bias
random, 26
Statistical efficiency, 76, 81, 104
with replacement, 26
Statistical fluctuation, 64
measurements, central feature
without replacement, 26 Scatter diagram, 143
Statistical
Science, progress in, 41, 55, 101
Statistical mechanics,
of,
purpose of a measurement in, 101 roles of classical and experimental probability in, 42 see also Inferred
knowledge and Meas-
urements Scientific fact,
history of, 6 definition of, 59 Stirling's formula, (t)
24
distribution, see
t
test
"Sun rise" problem, 18 System of betting, 49
55
see also Inferred
42-45
Statistics, 7, 56, 101, 146
Students
1,
66
knowledge
Systematic error, 64
Sears, F., 45
examples
Significance, see Levels of significance
of,
67
Significant figures, 69
Skewness, 72, 94 coefficient of, 94
t
experimental, 94
t
universe, 94 in
curves
Spread, see Dispersion indices
Standard error of estimate, 142 Standard error (standard deviation mean), 92, 93
in the
likelihood, 109
fractional,
94
inefficient,
148
Standard deviation,
82, 86
"best" estimate of (variance), 88, 89,
of,
122
ity, 83 Test of a hypothesis, e.g., fit of binomial model, 178 "true coin" example, 14
extended to include veil of errors in measurements, 102 "true" die, 178 see also Fit of a mathematical model
Theory
of errors, 6, 156
Thermodynamic
138
Todhunter,
calculation of, 85
measurement, 83, 204 mathematical models of probability,
in a single in
consistency of two means, 120
distribution, 122
Tchebycheff's (or Chebyshev's) inequal-
Poisson distribution, 207
by maximum
test,
I.,
probability, 42
6
"True" value, 64, 167 "True coin" hypothesis,
test of, 14
e.g.,
binomial, 92
normal, 172 Poisson, 203 in the
mean (standard
in
error), 92, 93
94 the standard deviation, 96, 120
fractional,
inefficient, 147
Uncertainty principle (Heisenberg), 8, 43 Universe distribution, 58
mean
in, 76,
78
in, 86-89 Mathematical models
standard deviation see also
4, 7,
255
Index
Weighted mean, weighted grand mean,
Variance, 86 best estimate in
of,
88
118
binomial model distribution, 91
see also
Standard deviation
Weighted standard deviation, 120 Working mean, 78 for skewness, 95
Weighted mean, 118 by standard deviation, 119 by maximum likelihood,
for standard deviation, 84
120, 138
X
2 ,
see
Chi-square
SCIENCE
Probability
and
Experimental Errors in Science Lyman G.
What is tion
is
Parratt
The answer to this quesmodern science and its relaThis book supplies the answer.
the nature of scientific "meaning"?
essential to
an understanding
of
tionship to other intellectual activities. It brings
home
the base of all
the significance of a conceptual revolution that lies at science — the replacement of the "exact" or
modern
"absolute" scientific meanings of the 19th century by the "probabilistic" meanings of the 20th. Written by a man who is an experienced teacher and a distinguished physical scientist, the book fully clarifies the relationship of statistics and probability theory to probin, as well as the philosophy of, modern physical science. At the same time, it teaches the mechanics involved in applying proper formulas to scientific situations.
lems
Both the experimental and the
classical definitions of probability are
covered, but primary emphasis
is
given to the experimental. Early in
the book, classical games of chance are introduced in order to arouse
him develop his feeling for probability The discussion then shifts to measurements in science and to general statistical concepts (maximum likelihood, curve fitting, consistency tests, etc.). The normal (Gauss) and the reader's interest and help into quantitative concepts.
the Poisson models of mathematical probability are explored both analytically and through typical problems; both types of models are given about equal weight. In line with the purpose of the book, the discussion is elementary and concentrates on essentials. Numerous prob-
lems are included, some with answers.
cover design:
mike mciver